In January 2024 we set our theme for the year: unleashing collective intelligence through the fusion of AI and human intelligence (HI), particularly in creating datasets to train AI. We said we would explore the layers of HI + AI, where human and artificial intelligence converge to amplify our collective cognitive abilities. Before we race too far ahead, however, our partner ScaleHub consulted a cross-section of business leaders, mainly CEOs and COOs, to check where their current consensus lies on this topic, if they have one.
The ScaleHub portal is a crowdsourcing platform that takes traditional data extraction to the cloud and provides access to both public and private global crowd communities for purposes of document automation. This enables businesses to scale and automate on demand through faster, easier, super-accurate data extraction.
Benefits of AI and crowdsourced datasets
The generally recognised reasons for CEOs and COOs to consider introducing AI trained on crowdsourced data into their businesses are as follows:
1. High-Quality Data Training
AI systems are reliant on data for learning and improvement. Crowdsourcing provides access to a vast pool of people who can label data, identify objects in images, or transcribe audio. This human input helps AI learn from more diverse and nuanced information, leading to greater accuracy and generalisability.
Human input is necessary to differentiate between satire and hate speech, identify sarcasm or hidden meanings behind positive or negative phrasing, or apply ethical considerations.
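As a rough illustration of how several people's labels for the same item might be combined, here is a minimal sketch using simple majority voting. The function name `aggregate_labels`, the 60% agreement threshold, and the sample posts are assumptions made for this example. They do not describe how ScaleHub or any particular platform works.

```python
from collections import Counter

def aggregate_labels(votes, min_agreement=0.6):
    """Combine crowd labels for one item by majority vote.

    Returns (label, agreement). The label is None when the share of
    votes for the most common label falls below min_agreement, which
    signals that the item needs further review.
    """
    if not votes:
        return None, 0.0
    counts = Counter(votes)
    label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(votes)
    if agreement < min_agreement:
        return None, agreement
    return label, agreement

# Hypothetical labels from several contributors per post.
crowd_votes = {
    "post-1": ["satire", "satire", "hate_speech", "satire", "satire"],
    "post-2": ["satire", "hate_speech", "hate_speech", "satire"],
}

results = {item: aggregate_labels(votes) for item, votes in crowd_votes.items()}
for item, (label, agreement) in results.items():
    print(item, label, agreement)
```

Items where contributors disagree, such as the second post here, are returned without a label so they can be passed to more reviewers rather than being forced into a category.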
2. Enhanced AI Evaluation and Debugging
Identifying and fixing biases or errors in AI systems can be challenging. Crowdsourcing can be used to gather feedback on an AI’s performance. People can evaluate the AI’s outputs, highlighting areas where it struggles or produces incorrect results. This feedback loop allows for targeted improvements and helps ensure the AI is functioning as intended.
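One way to picture this feedback loop is to group reviewers' verdicts by type of task and see where the error rate is highest. The sketch below is illustrative only: the category names, the 25% threshold, and the function names are assumptions, not figures or methods from the debate.

```python
from collections import defaultdict

def error_rate_by_category(reviews):
    """reviews is a list of (category, is_correct) pairs from human reviewers."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for category, is_correct in reviews:
        totals[category] += 1
        if not is_correct:
            errors[category] += 1
    return {category: errors[category] / totals[category] for category in totals}

def weakest_categories(reviews, threshold=0.25):
    """Return categories whose error rate exceeds threshold, worst first."""
    rates = error_rate_by_category(reviews)
    flagged = [category for category, rate in rates.items() if rate > threshold]
    return sorted(flagged, key=lambda category: rates[category], reverse=True)

# Hypothetical reviewer verdicts on an AI system's outputs.
reviews = [
    ("invoice", True), ("invoice", True), ("invoice", True), ("invoice", False),
    ("handwritten_form", False), ("handwritten_form", True),
    ("receipt", True), ("receipt", True),
]

print(error_rate_by_category(reviews))
print(weakest_categories(reviews))
```

The flagged categories indicate where additional training data or closer human checking is likely to pay off first.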
3. Humans-in-the-Loop for Complex Tasks
Certain tasks require human judgment or common sense that AI struggles with. Crowdsourcing allows you to leverage human intelligence for specific steps within the AI workflow. This can be particularly useful for tasks like sentiment analysis or identifying complex patterns in data.
A medical diagnosis system might struggle with a rare or atypical case. Human doctors can use their experience and knowledge to consider less common possibilities.
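A common pattern for this kind of human-in-the-loop step is to let the AI handle cases it is confident about and pass the rest to a person. The following is a minimal sketch under that assumption. The `Prediction` class, the 0.9 threshold, and the case data are hypothetical, and a real system would need a carefully validated confidence measure.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    case_id: str
    label: str
    confidence: float  # the model's own score between 0 and 1 (assumed available)

def route(prediction, threshold=0.9):
    """Send low-confidence predictions to a human reviewer."""
    if prediction.confidence >= threshold:
        return "auto_accept"
    return "human_review"

# Hypothetical model outputs: one routine case and one atypical case.
predictions = [
    Prediction("case-001", "common_condition", 0.97),
    Prediction("case-002", "rare_condition", 0.55),
]

queues = {p.case_id: route(p) for p in predictions}
print(queues)
```

Raising the threshold sends more cases to people, trading cost and speed for a higher level of human oversight.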
4. Broader Innovation and Idea Generation
AI development can benefit from fresh perspectives. Crowdsourcing platforms can be used to solicit ideas for new AI applications or solutions to specific problems. This can lead to a wider range of creative solutions and accelerate innovation cycles.
In addition, AI trained on historical data might struggle to handle a completely new situation or a sudden shift in trends. Humans can use their understanding of cause and effect to adapt to changing circumstances.
5. Cost-Effectiveness
Compared to hiring a dedicated team, crowdsourcing tasks can be a more cost-effective way to access human intelligence for data training, evaluation, or specific steps within the AI development process.
The general AI and crowdsourcing background to the debate
The in-person debate about using artificial intelligence for business purposes was somewhat overshadowed by criticisms of the shortcomings of generative AI. Its hallucinations, the controversy over scraping copyrighted material, and possible breaches of contract law have all contributed to a perceived need for caution about anything that involves or uses AI.
Training AI with extensive and comprehensive datasets is the standard advice for ensuring a good AI-driven customer experience and reliable internal processes. Such datasets can be crowdsourced on demand. However, the chill in the room over the questionable reliability of AI-driven systems and tools spread to crowdsourcing. It was agreed that crowdsourcing democratises data through the greater diversity of contributors, but could this form of collective intelligence be trusted enough for a business to put AI trained on it at its core?
It was agreed that where AI has been introduced so far, in chatbots for example, it has been applied to the “low-hanging fruit”, the simplest tasks. Future use of AI to tackle more complex matters will demand even more from high-quality training datasets.
AI Challenges
The questions and issues CEOs and COOs want answers to include the following:
- How to use AI to allow people to work better, rather than having fewer people to produce the same overall level of work as before.
- How to create perceived value so customers pay more for better services that AI can actually make cheaper to deliver.
- How to outsource the introduction of AI into company processes to a reliable and trustworthy third party.
- How to fuse AI with HI so that the outcome is better than using just one of them.
Crowdsourcing Challenges
- Develop data protection rules and a feedback system that validates results and removes machine bias.
- Demonstrate how it can be used to upskill people.
- Establish best practice guides on the use of generally accepted guard rails.
- Work out how to select a crowd to address specific issues within collective intelligence.
- Find ways to combine data from different sources and time periods, which can be complicated. One finance company had the good idea of looking at data from the stock market crash of 1929 when considering 21st-century monetary crises.
- Is there a better way than using humans to sample and check what AI is doing?
Data Security
Training AI on crowdsourced data could bring significant benefits to the healthtech sector. However, the security of personal data is an issue that could lead many people to think it is too risky to let their records be used.
In the UK, for example, the Government has a poor track record on digitisation projects that were left incomplete (e.g. centralised health records) and on cybersecurity (e.g. ransomware attacks on the National Health Service, and scams targeting people who pay their television licence and vehicle tax online).
Key Takeaways
Such failings may not be at the heart of debates over using AI in business and training it on crowdsourced datasets, but for many people they are the only thing they have to add to the debate. CEOs and COOs may be wise to express caution about moving too fast, given concerns over the trustworthiness of how the data is created, managed and protected, and by whom. Will the fusion of AI and HI, and the collective intelligence it creates, actually be better for them than investing in AI or HI alone?
Diversity, a benefit of crowdsourcing, remains vital. There are numerous historical examples of developments and innovations that became mainstream before fundamental flaws, often rooted in unrepresentative data, became apparent:
- Facial recognition systems that perform poorly at distinguishing between people with darker skin tones are an often-quoted example.
- Early Covid treatment was largely based on analyses of how mostly white people responded to treatment, and Covid mortality rates were higher in other ethnic groups – at least initially.
- Going back further, car safety belts were designed using data drawn predominantly from male drivers, and female drivers suffer higher levels of injury.
The rollout of flawed “advances” caused by failings in the teams gathering, creating and interpreting the data raises the question: “Is AI+HI actually better than the sum of its parts if humans train AI?”
These examples confirm the need for diversity in data sources, and for sample sizes large enough to produce robust and accurate findings. Beyond this, what can, or should, service providers and platforms do to build confidence and encourage greater investment by businesses in using AI trained on crowdsourced datasets?