PLEASE NOTE BEFORE APPLYING:

CODOXO IS NOT ABLE TO OFFER SPONSORSHIP OR ACCOMMODATE CANDIDATES WHO ARE CURRENTLY SPONSORED OR WHO WILL REQUIRE SPONSORSHIP IN THE FUTURE

This is a full-time role with Codoxo, NOT C2C

 

Of the $3.8T spent on healthcare in the United States annually, about a third is estimated to be lost to waste, fraud, and abuse. Codoxo is the premier provider of artificial intelligence-driven solutions and services that help healthcare companies and agencies proactively detect and reduce risks from fraud, waste, and abuse and ensure payment integrity. Codoxo helps clients manage costs across network management, clinical care, provider coding and billing, payment integrity, and special investigation units. Our software-as-a-service applications are built on our proven Forensic AI Engine, which uses patented AI-based technology to identify problems and suspicious behavior far faster and earlier than traditional techniques.

  

We are venture-backed by some of the top investors in the country, have strong financials, and remain one of the fastest-growing healthcare AI companies in the industry.

 
Position Summary:  

We have built a SaaS application that is supported by large volumes of healthcare data, which we must efficiently process, transform, and query at scale to deliver near real-time insights to users. This role will play a key part in building and optimizing distributed data pipelines using modern big data frameworks, with a strong emphasis on PySpark-based processing.

 

Key Responsibilities: 

We are seeking a junior to mid-level Data Engineer with experience in the following areas:

  • Ingestion and processing of large volumes of structured and unstructured data

  • Building and maintaining scalable data pipelines using PySpark and distributed data processing frameworks

  • Performing data validation, cleansing, and transformation at scale 

  • Designing and managing large-scale data storage solutions (data lakes / reservoirs) 

  • Writing efficient queries using SQL and optimizing performance across large datasets 

  • Supporting real-time or near real-time data processing use cases 

Note: Claims loading experience in the healthcare space is desired, but not required. 

 

Qualifications: 

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or related field 

  • At least 2 years of experience in the software or data engineering field 

  • Strong experience with Python and PySpark for large-scale data processing 

  • Experience working with relational databases (PostgreSQL preferred) 

  • Strong SQL skills, including analytical queries and performance optimization 

  • Experience with UNIX/Linux and shell scripting 

  • Familiarity with distributed data processing concepts and big data architectures 

 

Beneficial technical skills: 

  • Experience with AWS data ecosystem (Glue, S3, Aurora, Data Lake, SageMaker) 

  • Experience processing billions of records in distributed environments 

  • Experience with healthcare data, especially medical claims 

  • Experience with PL/SQL or PL/pgSQL 

Physical Requirements: Work is performed in an office environment (either in our office or working from home) and requires the ability to work on a computer, operate standard office equipment, and work at a desk.

 

Accessibility Notice: If you need reasonable accommodation for any part of the employment process due to a physical or mental disability, please send an email to careers@codoxo.com with the subject "Accommodation". Reasonable accommodation requests will be considered on a case-by-case basis. 

 

Benefits for You 

  • Health, Dental, and Vision insurance with 100% employee premium coverage (Starts Day 1)
  • Unlimited PTO
  • Annual Professional Development stipend
  • Annual home office stipend
  • 401K Match (after 90 days) 

 

We are an Equal Opportunity Employer: 

Codoxo provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.  This policy applies to all terms and conditions of employment. 
