1.1      Initiative Summary

Businesses looking to operationalize LLM-supported applications will benefit from using cloud-based (private or public) LLM “as a service” (LLMaaS) platforms for governance and scalability. Among many features, data governance (primarily for unstructured text) will be a critical offering of these platforms, including the platform from Blattner Technologies. This initiative will focus on contributing to the development of an extensible end-to-end data governance framework, including external data ingestion, parallelized data preparation and analytics, and versioning.

 


1.2      Desired Outcomes

-   Prototype innovative workflow-based capabilities for preparing unstructured text in a scalable, traceable, and intuitive manner for downstream LLM-related tasks, such as training and fine-tuning.

-   Presentation to the broader company highlighting the approach, challenges, solutions, and significant insights stemming from the effort.

 

1.3      Core Skills Required

-   Required skills:

o   Fundamental LLM knowledge (e.g., prompt engineering, fine-tuning)

o   NLP-based development (e.g., tokenization, embedding generation and operations, textification)

o   Python development

o   Experience with parallel distributed systems and/or parallel computation libraries such as Spark, Dask, or RAPIDS

-   Optional/preferable skills:

o   Kubeflow

o   Vector databases

o   Experience with NLP libraries such as spaCy and gensim

 


1.4      Estimated Effort

-   Full-time summer internship (40 hours/week)

-   Depending on progress, work may extend to part-time during the Fall semester (e.g., 10 hours/week)

 


1.5      Additional Information

This is a remote internship opportunity, working with summer mentors and reporting to the Chief Product Officer of BOSS AI. The group has a deep focus on implementing LLMs “as a service” (LLMaaS), and team members have a range of skills spanning enterprise software engineering, NLP, ML, and UX. You can expect to gain valuable experience in operationalizing LLMs and addressing critical security needs for all language models.



This job is currently not open for applications.