Crowdsourcing is a technique that makes AI smarter by allowing organisations and individuals to gather information, often large volumes of data and annotations, from a large number of people and other sources. Typically, this is done over the internet. The data can be used to train and improve machine learning models, which in turn are used to build AI systems. When data is collected from a diverse group of people, AI systems are more likely to be representative of the real world and less likely to be biased or constrained by the conventional thinking of a company’s own employees.
One of the main advantages of crowdsourcing is that it allows organisations to collect large amounts of data quickly and relatively inexpensively. This can be especially useful for organisations that need to gather data from specific sub-groups of people, such as those with disabilities. For speech-based systems, special efforts may be needed to include under-represented speakers: people who speak a particular language, speak it with a certain regional accent, or speak it as a second language.
Crowdsourcing also makes AI smarter when it is used to gather feedback on the performance of AI systems, so that errors and biases can be identified and corrected. This can be done by asking people to evaluate the output of the AI system and provide feedback, which can then be used to improve the system. This helps to make AI systems more accurate and less biased, and to ensure that they meet the needs of the people who will be using them.
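As a rough illustration of how such feedback might be put to use, the sketch below groups crowd ratings by AI output and flags the outputs that evaluators scored poorly. The 1 to 5 rating scale, the threshold, the minimum number of ratings and the function name are all assumptions made for this example, not part of any particular platform.

```python
from collections import defaultdict
from statistics import mean


def flag_outputs_for_review(ratings, threshold=3.0, min_ratings=2):
    """Return the IDs of outputs whose mean crowd rating is below threshold.

    ratings is a list of (output_id, score) pairs. Outputs with fewer than
    min_ratings scores are skipped, since one opinion is weak evidence.
    """
    scores_by_output = defaultdict(list)
    for output_id, score in ratings:
        scores_by_output[output_id].append(score)

    flagged = [
        output_id
        for output_id, scores in scores_by_output.items()
        if len(scores) >= min_ratings and mean(scores) < threshold
    ]
    return sorted(flagged)


# Hypothetical ratings on an assumed 1 (poor) to 5 (good) scale.
ratings = [
    ("answer-1", 5), ("answer-1", 4),
    ("answer-2", 2), ("answer-2", 1), ("answer-2", 3),
    ("answer-3", 1),  # only one rating, so not flagged yet
]
flagged = flag_outputs_for_review(ratings)
print(flagged)
```

The flagged outputs could then be passed to a person for correction, or collected as examples of where the system needs further training.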
Examples of crowdsourcing that makes AI smarter
Amazon Mechanical Turk is a micro-tasking platform that allows businesses and researchers to gather data and input from a large number of people in a relatively short period of time. Data providers who use the platform can earn what are usually small sums of money, performing tasks as and when they have time around other responsibilities. The platform is commonly used to gather training data for machine learning models, for example through image and text annotation.
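When several people annotate the same item, their answers usually need to be combined into a single label before training. One common approach, shown here as a minimal sketch, is majority voting. The item IDs, labels and agreement threshold below are invented for illustration, and the code does not call any Mechanical Turk API.

```python
from collections import Counter


def majority_label(labels, min_agreement=0.5):
    """Return the most common label if more than min_agreement of the
    annotators chose it; otherwise return None (no consensus)."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > min_agreement else None


# Hypothetical annotations: each image was labelled by several workers.
annotations = {
    "img-001": ["cat", "cat", "dog"],
    "img-002": ["dog", "cat"],  # tie, so no consensus
    "img-003": ["bird", "bird", "bird"],
}
consensus = {item: majority_label(votes) for item, votes in annotations.items()}
print(consensus)
```

Items with no consensus can be sent out for further annotation rather than being added to the training set with an uncertain label.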
Google’s AI Platform allows developers to train machine learning models using their own data. The platform also allows for data annotation, which can be done by anyone with a Google account. There are tasks at every skill level.
Zooniverse is a citizen science platform that allows a network of over one million volunteers to assist with scientific research by helping to record, classify and analyse data. Its citizen scientists take on diverse scientific challenges, including the study of galaxy formation, climate change and wildlife conservation. Volunteers are able to put a personal passion to good use, and researchers have opportunities for dialogue with a wider group of people than their fellow researchers.
Mozilla’s Common Voice is a project that aims to improve the quality of voice recognition technology by crowdsourcing voice data. In essence, it helps teach machines how real people speak. The project encourages users to submit their own voice samples by reading supplied sentences aloud, and to validate the samples submitted by others. The speech samples are donations; contributors are not paid.
Organisations that use large speech datasets tend to have their own networks of data providers. Defined.ai is a leading provider of speech datasets, models and tools for training AI-driven voice systems. It is based in Seattle, USA, and through its Neevo platform it has a crowdsourced workforce of over 500,000 contributors from more than 70 countries who speak over 50 languages. Between them they have successfully completed over 200 million tasks, recording speech, annotating it, or checking work done by others. Owning its datasets gives the company a competitive advantage, and an added income stream from renting them to clients. It is a great example of how crowdsourcing makes AI smarter.
In the security sector, Sift Science uses machine learning and crowdsourcing to detect fraudulent behaviour in real time. The platform analyses crowdsourced data from multiple sources, including user behaviour, device fingerprinting, and network information, to identify patterns that may indicate fraud. It also allows companies to crowdsource input from human experts to improve the accuracy of the system. Other operators in this sector include iovation, Riskified, and Signifyd.
These are just a few examples of how organisations are using crowdsourcing to make AI smarter. Crowdsourcing is a powerful technique for gathering data and feedback from a large number of people, both of which can be used to train and improve AI systems. This helps to make AI systems more accurate and less biased, and to ensure that they meet the needs of the people who will be using them.