Data Annotation & Crowdsourcing

Software to help you annotate your data efficiently and reliably. Accurately assess the quality of different annotators and data providers. Works for all data types: image, text, tabular, audio, etc.

Case Study: Google


Google used Cleanlab to find and fix label errors in millions of speech samples across different languages, quantify annotator accuracy, and provide clean data for training speech models.

Quote from Patrick Violette, Senior Software Engineer at Google:
Cleanlab is well-designed, scalable and theoretically grounded: it accurately finds data errors, even on well-known and established datasets.
Cleanlab is now one of my go-to libraries for dataset cleanup.

Case Study: Banco Bilbao Vizcaya Argentaria (BBVA)

More than 98% reduction in required number of labeled training transactions
28% improvement in ML model accuracy (with no change in existing modeling code)
Excerpt from an article by David M. Recuenco, Expert Data Scientist at BBVA:
[We used Cleanlab in] an update of one of the functionalities offered by the BBVA app: the categorization of financial transactions. These categories allow users to group their transactions to better control their income and expenses, and to understand the overall evolution of their finances. This service is available to all users in Spain, Mexico, Peru, Colombia, and Argentina.

We used AL [Active Learning] in combination with Cleanlab.

This was necessary because, although we had defined and unified annotation criteria for transactions, some could be linked to several subcategories depending on the annotator’s interpretation. To reduce the impact of having different subcategories for similar transactions, we used Cleanlab for discrepancy detection.

With the current model, we were able to improve accuracy by 28%, while reducing the number of labeled transactions required to train the model by more than 98%.

Cleanlab assimilates input from annotators and corrects any discrepancies between similar samples.

Cleanlab helped us reduce the uncertainty of noise in the tags. This process enabled us to train the model, update the training set, and optimize its performance. The goal was to reduce the number of labeled transactions and make the model more efficient, requiring less time and dedication. This allows data scientists to focus on tasks that generate greater value for customers and organizations.
BBVA is one of the largest financial institutions in the world. With a strong presence in multiple countries, BBVA offers a wide array of banking and financial services to individuals, businesses, and institutions.
Graph showing results achieved with Cleanlab on a real dataset

Case Study: Gavagai

Quote from Fredrik Olsson, Head of Data Science at Gavagai:
At Gavagai, we rely on labeled data to train our models, both publicly available datasets and data we have annotated ourselves. We know that the quality of the data is paramount when it comes to creating machine learning models that can produce business value for our customers.

Cleanlab Studio is a very effective solution to calm my nerves when it comes to label noise!

The tool allows me to upload a dataset and obtain a ranked list of all the potential label issues in the data in just a few clicks. The label issues can then be assessed and fixed right away in the GUI.

Cleanlab should be a go-to tool in every ML practitioner's toolbox!
Gavagai provides multilingual text analytics for customer insights. By analyzing reviews, surveys, call transcripts, support tickets, and social media, their platform helps companies discover, track, and act on customer data to improve Customer Experience.


Use our ActiveLab system (active learning with relabeling) to efficiently collect new data labels for training accurate models.
  • Obtain reliable data labels even with (multiple) imperfect annotators.
  • Only ask for labels that will significantly improve your model.
Learn more and see benchmarks of the effectiveness of ActiveLab on real data.
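ActiveLab's actual scoring combines model predictions with annotator information and is more sophisticated than plain uncertainty sampling (see the benchmarks linked above). Purely as a minimal illustrative sketch of the core active learning idea — spend the labeling budget on the examples where the current model is least confident — one might write:

```python
import numpy as np

def select_examples_to_label(pred_probs: np.ndarray, batch_size: int) -> np.ndarray:
    """Pick the examples whose predicted class probabilities are least
    confident, i.e. where collecting a new label helps the model most.

    pred_probs: (n_examples, n_classes) out-of-sample predicted probabilities.
    Returns indices of the `batch_size` examples to send to annotators.
    """
    # Confidence = probability of the most likely class; low confidence
    # means the model is unsure, so a fresh label is most informative.
    confidence = pred_probs.max(axis=1)
    return np.argsort(confidence)[:batch_size]

pred_probs = np.array([
    [0.90, 0.10],  # confident -> skip
    [0.55, 0.45],  # uncertain -> label this
    [0.20, 0.80],  # fairly confident
    [0.50, 0.50],  # most uncertain -> label this
])
to_label = select_examples_to_label(pred_probs, batch_size=2)
print(sorted(to_label.tolist()))  # -> [1, 3]
```

Note that `select_examples_to_label` is a hypothetical helper for illustration only; ActiveLab additionally decides when to *relabel* an already-labeled example versus label a new one.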
Use our CrowdLab system to analyze data labeled by multiple annotators and estimate:
  1. a consensus label for each example that aggregates the individual annotations.
  2. a quality score for each consensus label, gauging confidence that it is the correct choice.
  3. a quality score for each annotator, quantifying the overall correctness of their labels.
Learn more and see benchmarks of the effectiveness of CrowdLab on real multi-annotator data.
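CrowdLab itself also folds in a trained model's predictions; as a simplified, model-free sketch of the three estimates above (consensus via majority vote, label quality via annotator agreement, annotator quality via agreement with consensus — all hypothetical stand-ins for the real algorithm), consider:

```python
import numpy as np

def analyze_annotations(labels: np.ndarray) -> dict:
    """labels: (n_examples, n_annotators) integer class labels; -1 = not annotated.
    Returns majority-vote consensus labels, a per-example agreement score,
    and a per-annotator quality score (fraction agreeing with consensus).
    """
    n_examples, n_annotators = labels.shape
    n_classes = labels.max() + 1
    consensus = np.empty(n_examples, dtype=int)
    agreement = np.empty(n_examples)
    for i in range(n_examples):
        given = labels[i][labels[i] >= 0]          # labels actually provided
        counts = np.bincount(given, minlength=n_classes)
        consensus[i] = counts.argmax()             # 1. consensus label
        agreement[i] = counts.max() / len(given)   # 2. label quality score
    annotator_quality = np.array([                 # 3. annotator quality score
        (labels[:, a][labels[:, a] >= 0] == consensus[labels[:, a] >= 0]).mean()
        for a in range(n_annotators)
    ])
    return {"consensus": consensus, "label_quality": agreement,
            "annotator_quality": annotator_quality}

labels = np.array([  # 3 examples, 3 annotators
    [0, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
])
result = analyze_annotations(labels)
print(result["consensus"].tolist())  # -> [0, 1, 0]
```

In practice, weighting annotators by their estimated quality (and incorporating model predictions, as CrowdLab does) yields better consensus labels than the plain majority vote shown here.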
Videos on using Cleanlab Studio to find and fix incorrect values in:
Compare the quality of different data providers and data sources. Listen to a discussion on this topic in the Weights & Biases podcast.
Use the least expensive data provider to obtain noisy labels first, and then ask a more expensive provider (or in-house experts) to review select examples flagged by Cleanlab.
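This cheap-first, expert-review-second workflow can be sketched in a few lines. The quality scores here stand in for per-label quality estimates such as those Cleanlab produces; `flag_for_expert_review` is a hypothetical helper, not a Cleanlab API:

```python
def flag_for_expert_review(label_quality, review_budget):
    """Given per-example label quality scores for labels bought from an
    inexpensive provider, return indices of the `review_budget` lowest-quality
    labels to forward to an expensive reviewer (or in-house expert)."""
    ranked = sorted(range(len(label_quality)), key=lambda i: label_quality[i])
    return ranked[:review_budget]

quality = [0.95, 0.30, 0.80, 0.10, 0.60]
print(flag_for_expert_review(quality, review_budget=2))  # -> [3, 1]
```

Spending the expert budget only on the flagged examples keeps overall labeling cost low while concentrating expensive attention where labels are most likely wrong.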
AI algorithms invented by Cleanlab scientists provide automated quality assurance for data annotation teams working across diverse applications (speech transcription, autonomous vehicles, industrial quality control, image segmentation, object detection, intent recognition, entity recognition, content moderation, document intelligence, LLM evaluation and RLHF, ...).
Read about active learning to efficiently annotate image data.
Read about analyzing multi-annotator text data with CrowdLab.
Read about efficiently annotating text data for Transformers with active learning.
Cleanlab is used for automated quality assurance at leading data annotation companies.