ABOUT US 


Zephyr is building an innovative AI platform to change the way we treat cancer, diabetes, and other chronic diseases. By aggregating massive data sets and harnessing advanced technologies and AI to increase our understanding of biology, Zephyr discovers insights that will transform how new therapies are developed and how we treat patients. We will use that knowledge to devise interventions that enable people to live longer and healthier lives. Working in close partnership with industry-leading institutions across academia, biopharma, and care delivery, Zephyr is advancing our understanding of how to characterize and treat chronic diseases. With an initial focus on cancer and diabetes, Zephyr is working to revolutionize drug development, reform clinical trials, and change healthcare to impact patient lives. Zephyr is based in Tysons Corner, VA, and currently operates as a remote-first organization.


WE ARE HIRING A

SENIOR MLOPS ENGINEER

We are looking for a talented and driven Sr. MLOps Engineer to join our growing team of innovators pushing the boundaries of precision medicine. As an MLOps Engineer, you will enable our team of researchers and data scientists to design, train, validate, and deploy the machine learning models that drive our cancer biology research and medical insights. We are excited for the perspectives and experience you will bring to our team. You will be responsible for contributing to, operating, and improving all aspects of our machine learning infrastructure. You are comfortable working with data engineers, data scientists, software engineers, and other roles across our team. The ideal candidate will have experience with Amazon Web Services, Python, data labeling and versioning, model training frameworks, model and feature registries, and production model deployment architecture.

ABOUT THE TEAM

The MLOps team is working to enable our engineers, researchers, and data scientists to design, train, validate, and deploy industry-leading machine learning models as rapidly, securely, and efficiently as possible. These initiatives are central to the business success of the Zephyr AI organization. We work quickly and largely in a self-directed capacity. We blend open source and off-the-shelf ML solutions with custom implementations where appropriate. We provide well-documented internal tools and interfaces that other teams can consume to accelerate their workflows. The team’s scope of work includes scaling, observability, and security of model training and deployment, feature management, data and model versioning and storage, cost engineering, and more.

ESSENTIAL DUTIES AND RESPONSIBILITIES 

  • Actively design, plan, and implement changes and advancements to ML automation and infrastructure.

  • Build and operate systems and automation to manage data and model versions.

  • Create and operate pipelines for batch inference against production models.

  • Support and monitor ad-hoc inference requests against production models.

  • Work with cross-functional team members to gather requirements and feedback about ML infrastructure.

  • Build, test, and document self-service tools enabling data scientists and engineers to accelerate their work with advanced ML infrastructure.

Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

SKILLS TO BE SUCCESSFUL IN THE ROLE 

  • Expertise with Python and machine learning frameworks in the Python ecosystem.

  • Expert knowledge of the machine learning lifecycle.

  • Experience implementing and operating model training pipelines (e.g. scikit-learn, Keras, PyTorch, TensorFlow).

  • Experience implementing and operating feature management systems (e.g. Feast, Tecton).

  • Experience implementing and operating model registries (e.g. MLflow, Comet, Neptune) and model inference runtimes (e.g. MLServer, KServe, Seldon).

  • Track record of building great developer experiences with custom tools and automation.

  • Proficiency with containers and container orchestration (e.g. Docker, Kubernetes, AWS ECS).

  • Proficiency with AWS.


You will be a step ahead if you have:

  • Experience working in a regulated environment.

  • Experience with data manipulation tools including Spark and Pandas.


WHAT WE OFFER 

This is a full-time, exempt position reporting to the Director of Engineering. This position requires the ability to work cross-functionally within the organization.


We offer competitive compensation as well as a comprehensive benefits package including:


  • 100% Company Paid Medical/Dental/Vision Insurance 

  • Generous paid time off

  • Paid holidays

  • 401(k) program 

  • Voluntary life and disability plans

  • Employee assistance program (EAP)

  • Opportunities for advancement



We are an equal opportunity employer


Zephyr AI provides equal employment opportunities (EEO) to all applicants without regard to race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, disability, genetic information, marital status, amnesty, or status as a covered veteran in accordance with applicable federal, state and local laws. Zephyr AI complies with applicable state and local laws governing non-discrimination in employment in every location in which the company operates. This policy applies to all terms and conditions of employment, including, but not limited to, hiring, placement, promotion, termination, layoff, recall, transfer, leaves of absence, compensation, and training.




This position has been filled.