Speech-controlled apps and software, and voice recognition, are swiftly moving from being a novelty to an everyday user expectation. Demand for voice-activated systems and devices is forecast to grow from a $9 billion market in 2017 to almost $32 billion by 2025. What are the key drivers?
Prominent growth areas include virtual agents in banking, infotainment systems in automotive, telehealth and the management of vast amounts of data in healthcare, and education, which has seen a significant switch to online learning due to pandemic-related social distancing. And, of course, call centers.
Also, the number of IoT devices at home and in the workplace is set to more than triple, with the number of active devices increasing from 7.6 billion at the end of 2019 to an estimated 24.1 billion in 2030.
In the more immediate future, it’s estimated that by the end of 2020, 50% of all searches will be conducted via voice and 75% of U.S. homes will have at least one smart speaker. (Source: DefinedCrowd White Paper, “Training a Voice Assistant.”)
Tech providers thus have a growing requirement for vast amounts of speech data upon which to base reliable and comprehensive services. They have to protect against fraud and impersonation, recognize dialects and accents, even identify a user’s emotional mood to respond in the most appropriate manner. Templated answers that miss the mark, or phrases such as, “I’m sorry, could you repeat that,” simply don’t cut it.
To service this market, Daniela Braga founded DefinedCrowd in 2015 in Seattle, Washington State, to provide customized voice and text data for both speech technologies and Natural Language Processing (NLP), and image data for visual recognition purposes. In 2016, a round of seed funding received support from backers including Sony and the Amazon Alexa Fund.
Growth continued with work for prominent clients including Mastercard (in Spanish), the national electricity provider in Portugal, and a Fortune 500 tech company that wanted comprehensive speech training data in French. By the close of 2019, the DefinedCrowd team had doubled to over a hundred employees. A second round of funding in 2020 raised over $50 million, making it the largest-ever Series B round raised by a female founder in the U.S. It also marked the point at which Braga wanted the company to move away from being described as a startup.
The core speech data products
For companies that need high-quality AI training data quickly, DefinedData, DefinedCrowd’s off-the-shelf offering, provides a catalog of speech datasets in multiple languages and domains. For more customized data, the DefinedWorkflows product combines a crowd of thousands of certified contributors with the company’s own machine learning to deliver the exact datasets needed.
Either way, for a voice assistant to conduct fluent, near-human-like conversations and enable smooth, helpful interactions with users, it needs to be trained with data that is specific to its purpose. This data is sourced and structured through a combination of workflows that include speech collection, transcription, annotation and tagging, with various stages of validation along the way to ensure accuracy and saliency.
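As a rough illustration of how such staged workflows with validation gates could be arranged, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Sample` schema, the stand-in transcription lookup, the keyword tagging and the validation rules are hypothetical and do not describe DefinedCrowd’s actual systems.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Sample:
    """One speech sample moving through the workflow (hypothetical schema)."""
    audio_id: str
    duration_s: float
    transcript: Optional[str] = None
    tags: List[str] = field(default_factory=list)
    history: List[str] = field(default_factory=list)  # stages passed so far


# Stand-in for the output of human or automatic transcription (made-up data).
FAKE_TRANSCRIPTS = {
    "a1": "What is my account balance",
    "a2": "Call home",
    "a4": "Hello there",
}


def collect(sample: Sample) -> Sample:
    # The audio is assumed to be recorded already; nothing to do here.
    return sample


def transcribe(sample: Sample) -> Sample:
    sample.transcript = FAKE_TRANSCRIPTS.get(sample.audio_id, "")
    return sample


def annotate(sample: Sample) -> Sample:
    # Toy tagging rule: mark banking-related utterances by keyword.
    if "balance" in (sample.transcript or "").lower():
        sample.tags.append("banking")
    return sample


Stage = Tuple[str, Callable[[Sample], Sample], Callable[[Sample], bool]]

# Each stage pairs a processing step with a validation check (assumed rules).
STAGES: List[Stage] = [
    ("collection", collect, lambda s: 0.5 <= s.duration_s <= 30.0),
    ("transcription", transcribe, lambda s: bool(s.transcript)),
    ("annotation", annotate, lambda s: len(s.tags) > 0),
]


def run_pipeline(sample: Sample) -> Tuple[Sample, Optional[str]]:
    """Run every stage in order; return the sample and the stage that rejected it, if any."""
    for name, step, is_valid in STAGES:
        sample = step(sample)
        if not is_valid(sample):
            return sample, name
        sample.history.append(name)
    return sample, None
```

Calling `run_pipeline(Sample("a1", 3.2))` sends a sample through all three stages, while a clip that is too short or has no transcript is stopped at the first check it fails, so bad data never reaches the later, more expensive steps.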
Speech is set in the audio context in which devices will have to hear, understand and respond to it, such as a car, with all the possible sounds from inside and outside the vehicle.
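The article does not say whether such audio is recorded in place or assembled afterwards. One common way to approximate a noisy setting, shown here purely as an assumption and not as DefinedCrowd’s method, is to mix a clean recording with background noise at a chosen signal-to-noise ratio (SNR). The function names and the synthetic noise generator below are hypothetical stand-ins.

```python
import math
import random
from typing import List


def power(signal: List[float]) -> float:
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)


def synthetic_noise(n: int, seed: int = 0) -> List[float]:
    """Gaussian noise standing in for a real in-car background recording."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def mix_at_snr(speech: List[float], noise: List[float], snr_db: float) -> List[float]:
    """Add noise to speech, scaled so the mix has the requested SNR in decibels."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    noise_power = power(noise)
    if noise_power == 0:
        raise ValueError("noise is silent")
    # SNR(dB) = 10 * log10(P_speech / P_noise), so solve for the noise power needed.
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / noise_power)
    return [s + gain * n for s, n in zip(speech, noise)]


# Example: a one-second 220 Hz tone at 8 kHz stands in for clean speech.
speech = [math.sin(2 * math.pi * 220 * t / 8000) for t in range(8000)]
noisy = mix_at_snr(speech, synthetic_noise(len(speech)), snr_db=10.0)
```

Lower `snr_db` values give a harsher environment, so the same clean recording can be reused to cover a range of listening conditions.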
Having first understood what the user is saying, a voice assistant has to be trained to respond. After the speech data has been collected and transcribed into text, the validated transcriptions can then be annotated to reap greater value from the data by identifying the speakers’ intent. AI matches the instruction or question with the appropriate response, and at each stage either the dialogue continues or the instruction is carried out. Machine learning continually improves the quality of response through building on a larger base of previous examples where the user was happy with the outcome.
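To make the idea of intent annotation concrete, here is a small Python sketch in which transcriptions labeled with an intent are used to match a new utterance to a response. The example utterances, intent names, responses and the word-overlap matching rule are all hypothetical; a production assistant would use a trained model rather than this toy similarity measure.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical annotated transcriptions: (validated text, intent label).
ANNOTATED: List[Tuple[str, str]] = [
    ("what is my account balance", "check_balance"),
    ("how much money do i have", "check_balance"),
    ("turn up the volume", "volume_up"),
    ("make it louder", "volume_up"),
]

# Hypothetical responses, one per intent.
RESPONSES = {
    "check_balance": "Your balance is shown on screen.",
    "volume_up": "Turning the volume up.",
}

# Used when no intent is matched, so the dialogue continues with a question.
FALLBACK = "Would you like your balance, or a volume change?"


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def predict_intent(utterance: str, min_overlap: float = 0.5) -> Optional[str]:
    """Return the intent of the most similar annotated example (Jaccard overlap), or None."""
    query = tokens(utterance)
    best_intent: Optional[str] = None
    best_score = 0.0
    for text, intent in ANNOTATED:
        example = tokens(text)
        union = query | example
        score = len(query & example) / len(union) if union else 0.0
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent if best_score >= min_overlap else None


def respond(utterance: str) -> str:
    """Carry out the matched instruction, or keep the dialogue going."""
    intent = predict_intent(utterance)
    if intent is None:
        return FALLBACK
    return RESPONSES[intent]
```

In this sketch, adding more annotated examples per intent is what widens the range of phrasings the assistant can recognize, which is the role the annotated training data plays.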
DefinedCrowd’s growth highlights the value of its product offering – high-quality training data that fuels world-class AI models. In August 2020, Inc. magazine announced that DefinedCrowd was the 27th fastest-growing private company in the United States, and the fastest-growing private company in Washington State, based on an incredible 8550.28% growth in revenue from 2016 to 2019.
In response, Founder and CEO Daniela Braga said: “With our talent and resources, we are definitely on track to continue on our growth trajectory and become the number one trusted data company for Artificial Intelligence in the world.”