Senior ML Engineer

Location: San Francisco (flexible/hybrid)

At Cleanlab you’ll get to

Pioneer novel software systems for data-centric AI. Our tools enable data scientists/engineers (across all industries) to effectively diagnose/fix issues in their datasets, thus improving the quality of their business’s core asset.

Discover how to best leverage new Generative AI advances for tools that automatically find & fix issues in datasets. The north star of our company is to use AI to increase the value of Data.

Work with text, structured/tabular, and image/video datasets from companies across diverse industries, and set up large-scale modeling infrastructure at a dynamic startup (0 to 1).

What we’re looking for

Strong software engineering skills and experience productionizing code/models in cloud environments while maintaining the necessary infrastructure. You should be comfortable with large software systems.

Experience in MLOps: processing data, deploying and monitoring models, setting up the necessary infrastructure for Data/AI projects.

Up to date on the latest advances in Generative AI (LLMs, LVMs).

Responsibilities

  • Spin up cloud infrastructure for ML projects spanning text, image, and structured/tabular datasets.
  • Work on large-scale Data/AI projects with Cleanlab enterprise customers, from inception to production.
  • Work as an individual-contributor tech lead on an ML team of around 5.
  • Innovate on new algorithmic techniques to improve a dataset, leveraging the newest Foundation model advances.

Qualifications

This is a senior role! Candidates must have at least 5 years of work experience in ML. Schoolwork or general data science work does not count; you should have tackled hard MLOps challenges at companies dealing with massive datasets, reliable model serving, cloud infrastructure, etc.

  • Python (pandas, scikit-learn, numpy, Jupyter)
  • PyTorch/PyTorch Lightning, Hugging Face
  • Relational databases
  • AWS
  • Docker
  • Git
  • CI/Testing, e.g. Jenkins

Bonus:

  • PhD in Machine Learning
  • Strong research publications or open-source contributions
  • Sagemaker, MLflow, Ray
  • ELT tools and data cleaning tools
  • Cleanlab or other data-centric AI tools
  • LangChain and LLMOps stack

Benefits

Working at Cleanlab is awesome! Beyond the opportunity to work at a well-funded AI startup with an incredible, friendly founding team of MIT graduates, all full-time employees receive the following:

  • Annual travel stipend
    • Travel enhances our empathy with different cultures and enables us to work together more effectively. It’s how we grow and learn: traveling is an essential part of what makes us human. At Cleanlab, every two months you will receive a reimbursable travel benefit. This is a unique benefit that lets you work from Paris for a week in February, then take a backpacking trip in the Andes for a weekend in March.
  • Premium health insurance (+ dental and vision)
    • We provide a fantastic health insurance option that costs you $4 (we cover the rest). We also provide a premium health care option with a $0 deductible and 100% coverage for those who prefer the best health insurance.
  • Stipend for attending conferences to keep up with the latest innovations in ML and software.
  • Competitive salary (+ equity offering for certain roles), with regular opportunities for a raise if things are going well.

The compensation range for this role is $180,000 to $220,000. The final offer details are determined by several factors including candidate experience/expertise and may vary from the pay range provided.

About Us

Prior to Cleanlab, our founders (3 ML PhDs from MIT) worked at OpenAI, Google, Microsoft, Amazon, AWS, Facebook AI Research (FAIR), Dropbox, Oculus, Palantir, NASA, General Electric, MIT Lincoln Laboratory, MIT, Harvard, and Stanford. At every place we worked, we repeatedly encountered the same issue: AI solutions failed to work reliably on real-world, human-centric data due to label errors and poor data quality. So we spent eight years of PhD research at MIT inventing a new field to solve this problem, and after successful pilots with world-leading organizations, Cleanlab emerged.

Everything we do at Cleanlab is guided by our north star: to improve the world’s ML data more easily and quickly than any other solution, enabling AI systems to train more reliably on real-world, messy, error-prone data. We develop next-generation data-centric AI and open-source algorithms, and we provide no-code SaaS enterprise solutions to help individuals and teams at companies (across all industries) diagnose/fix issues in their datasets and produce more reliable ML models by providing clean labels for training.

While many companies can help store/manage data or develop ML models, there exist few solutions today to improve the quality of existing data, which is the core asset of the modern enterprise. This is where you come in. At Cleanlab, you’ll be able to take ownership of critical projects that pioneer the future of data-centric AI.

We are a hybrid company, with over half of our team (and office) located in San Francisco.

  • Read about the Cleanlab team here.
  • Read how Cleanlab went from MIT PhD research to tech used by Amazon, Google, etc. here.
  • See what Google, Tencent, and other Cleanlab users think here.

How to Apply