eTech Insights – Tiny AI Will Drive Improved Voice Recognition Capabilities

The Problem: Large Voice AI Platforms are Expensive and Require Outsized Data Models

As of 2019, 47% of voice-based applications reside in the healthcare sector. That is almost three times higher than in finance, e-commerce, manufacturing, and supply chain[1]. The benefits of voice recognition in healthcare include better productivity, fewer medical errors, greater mobility, improved clinical documentation, and better decision-making[2].

Two factors constrain the continued improvement of the sophisticated voice recognition engines that large technology companies are rapidly advancing. The first is the cloud connections used by the AI models. These connections put a strain on communications networks and will likely increase cloud-based storage costs. Cloud connections may also add latency between data input and data output for the AI models. The second factor is the tremendous amount of energy that may be required to train and run some of the advanced AI models behind voice recognition. For example, a single training run of BERT, Google’s language model, used enough energy to power a US household for 50 days[3]. BERT has 340 million parameters. Megatron-LM, the language model from NVIDIA, has 8.3 billion parameters. While models of this size are pushing voice recognition toward very high accuracy, they are also driving up research and delivery costs for these companies. A new approach to reducing these challenges with voice recognition and AI is called Tiny AI.
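To put the parameter counts above in perspective, the following back-of-the-envelope sketch estimates how much memory the model weights alone would occupy. It assumes each parameter is stored as a 32-bit float (4 bytes); that storage format is an assumption made for illustration, not something the cited sources state, and the function name is hypothetical.

```python
# Rough estimate of weight storage for the two models named in the text.
# Assumption (not from the source): every parameter is a 32-bit float.
BYTES_PER_PARAM = 4

MODEL_PARAMS = {
    "BERT": 340_000_000,           # 340 million parameters
    "Megatron-LM": 8_300_000_000,  # 8.3 billion parameters
}


def weight_memory_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Return the approximate size of the weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


footprints = {name: weight_memory_gb(n) for name, n in MODEL_PARAMS.items()}
size_ratio = MODEL_PARAMS["Megatron-LM"] / MODEL_PARAMS["BERT"]

for name, gb in footprints.items():
    print(f"{name}: about {gb:.2f} GB of weights")
print(f"Megatron-LM is about {size_ratio:.1f}x larger than BERT")
```

Under this assumption the weights alone come to roughly 1.4 GB and 33 GB, which helps explain why models of this size are typically served from the cloud instead of from a phone or smart speaker.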

The Solution: AI Training and Advancing Architectures

Tiny AI involves training a smaller AI model (the student) from a larger, more sophisticated AI model (the teacher). The training runs the same data inputs through both models over many iterations, comparing the outputs and tuning the student until it eventually produces the same outcome as the teacher. The result is a smaller AI engine with close to the same capabilities as the larger AI model. These AI engines can be more easily adapted to smartphones, smart home assistants, and bots.
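The teacher and student loop described above is commonly known as knowledge distillation. The sketch below is a toy illustration only, written with the Python standard library: the "teacher" is a small fixed two-layer network with random weights, the "student" is a single linear layer, and the student is tuned to match the teacher's softened output probabilities. All names, sizes, the temperature, and the learning rate are assumptions chosen for illustration; real voice models are far larger and are trained with dedicated frameworks.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a higher temperature softens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def make_teacher(rng, n_in=2, n_hidden=8, n_out=2):
    """Build a fixed two-layer 'teacher' (32 weights) with random values."""
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]

    def teacher_logits(x):
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return teacher_logits


def student_logits(weights, bias, x):
    """The 'student' is one linear layer (6 parameters in this toy setup)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def distill(teacher_logits, inputs, temperature=2.0, lr=0.5, epochs=200):
    """Fit the student to the teacher's soft outputs by gradient descent."""
    n_in = len(inputs[0])
    n_out = len(teacher_logits(inputs[0]))
    weights = [[0.0] * n_in for _ in range(n_out)]
    bias = [0.0] * n_out
    soft_targets = [softmax(teacher_logits(x), temperature) for x in inputs]
    n = len(inputs)
    losses = []
    for _ in range(epochs):
        grad_w = [[0.0] * n_in for _ in range(n_out)]
        grad_b = [0.0] * n_out
        loss = 0.0
        for x, target in zip(inputs, soft_targets):
            probs = softmax(student_logits(weights, bias, x), temperature)
            # Cross-entropy between teacher and student distributions.
            loss -= sum(t * math.log(p) for t, p in zip(target, probs))
            for k in range(n_out):
                err = (probs[k] - target[k]) / temperature
                grad_b[k] += err
                for j in range(n_in):
                    grad_w[k][j] += err * x[j]
        losses.append(loss / n)  # loss measured before this epoch's update
        for k in range(n_out):
            bias[k] -= lr * grad_b[k] / n
            for j in range(n_in):
                weights[k][j] -= lr * grad_w[k][j] / n
    return weights, bias, losses


def agreement(teacher_logits, weights, bias, inputs):
    """Fraction of inputs where student and teacher pick the same class."""
    same = 0
    for x in inputs:
        t = teacher_logits(x)
        s = student_logits(weights, bias, x)
        same += t.index(max(t)) == s.index(max(s))
    return same / len(inputs)


rng = random.Random(0)
teacher = make_teacher(rng)
inputs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
weights, bias, losses = distill(teacher, inputs)
match_rate = agreement(teacher, weights, bias, inputs)
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.4f}, agreement: {match_rate:.0%}")
```

The student here has far fewer parameters than the teacher, so it will not reproduce the teacher exactly; the printed agreement rate shows how close it gets on this toy data.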

As AI evolves to work with a myriad of IoT devices, new architectures are being proposed to reduce the data communications and power consumption overhead. One such evolving architecture is the Cognitive Edge Server Architecture (CESA)[4]. The CESA is designed to deliver lower latency (instant reactions become possible), increase privacy and security capabilities (personal data is not sent to the cloud), and improve robustness (autonomous nodes are immune to network vulnerabilities).

For this architecture to be successful, Cognitive Edge devices need to accommodate artificial intelligence with true plasticity, be capable of fast online and incremental learning, quickly adapt to new environments or circumstances, and have the capability for lifelong accumulation of knowledge. With this architecture, active edge nodes would be able to sense, interpret, and act on data with minimal interaction with the cloud. This would provide significant improvements for AI solutions that are implemented with IoT devices and healthcare applications. We will continue to monitor the AI industry to see how CESA evolves.
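One way to picture an edge node that acts "with minimal interaction with the cloud" is a local-first pattern: the node answers from its own small model when it is confident and contacts the cloud only otherwise. The sketch below is a hypothetical illustration of that pattern, not the CESA design from the cited work; the class, the toy models, the command vocabulary, and the confidence threshold are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Prediction = Tuple[str, float]  # (label, confidence between 0 and 1)


@dataclass
class EdgeNode:
    """Hypothetical edge node that prefers its local model over the cloud."""
    local_model: Callable[[str], Prediction]
    cloud_model: Callable[[str], Prediction]
    confidence_threshold: float = 0.8  # assumed cut-off for acting locally
    cloud_calls: int = 0

    def handle(self, utterance: str) -> str:
        label, confidence = self.local_model(utterance)
        if confidence >= self.confidence_threshold:
            return label  # handled on the device, nothing leaves the node
        self.cloud_calls += 1  # fall back to the larger cloud model
        label, _ = self.cloud_model(utterance)
        return label


# Toy stand-ins for real models (assumptions for illustration only).
KNOWN_COMMANDS = {"start dictation": "START", "stop dictation": "STOP"}


def toy_local_model(utterance: str) -> Prediction:
    key = utterance.strip().lower()
    if key in KNOWN_COMMANDS:
        return KNOWN_COMMANDS[key], 0.95
    return "UNKNOWN", 0.10


def toy_cloud_model(utterance: str) -> Prediction:
    return "FREE_TEXT", 0.99


node = EdgeNode(toy_local_model, toy_cloud_model)
results = [node.handle(u) for u in (
    "Start dictation",
    "stop dictation",
    "patient reports mild chest pain",
)]
print(results, "cloud calls:", node.cloud_calls)
```

In this toy run only the free-text utterance reaches the cloud; the two known commands are resolved on the device, which is the latency and privacy benefit the architecture is aiming for.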

The Justification: AI with Higher Accuracy Drives Increased Healthcare Adoption

Tiny AI will be important in supporting the continued advancement of voice recognition in healthcare. Research and development by the larger technology companies is needed to raise the capabilities of AI voice solutions in healthcare. Shrinking the footprint of the AI engine while delivering higher levels of language understanding will continue to drive adoption of AI voice and speech recognition for clinical documentation. This should reduce clinician burnout and improve clinical documentation, which in turn supports patient safety and clinical outcomes. It will also increase the use of AI voice recognition in healthcare consumer services on smart devices.

The Players: Big Tech Will Drive Next Generation AI Voice Recognition Capabilities

Large tech companies that are working on AI voice solutions across several industries will build engines that healthcare companies adopt to make their own AI solutions more accurate. Representative companies to watch in this environment are Google, NVIDIA, Huawei, Amazon, and Microsoft/OpenAI. Many of these companies will create software development kits (SDKs) that healthcare vendors can use to integrate the AI engines into their solutions.

Success Factors

  1. Evaluate how Tiny AI solutions are updated against “teacher” AI environments to ensure the highest levels of voice recognition.
  2. Identify healthcare services with more complex terminology in order to test Tiny AI solutions and provide voice recognition across all specialties of the enterprise.
  3. Treat Tiny AI as an emerging, high-risk voice recognition solution at this time; it is best positioned for testing in healthcare organizations with Innovation Centers.

Summary

Tiny AI is evolving to provide sophisticated voice recognition engines in smaller configurations that will be easier to support and maintain as integrated application components. Tiny AI will deliver higher levels of language understanding than currently exist in legacy healthcare voice recognition solutions. Nuance has announced a partnership with Microsoft for its AI-enabled voice recognition, which may indicate that Nuance is moving to incorporate Microsoft AI into its solutions. I expect that these types of partnerships will continue to evolve between legacy voice recognition vendors and the large tech companies. Voice recognition engines that incorporate large tech companies’ AI solutions will provide more efficient and accurate language understanding. Healthcare organizations should evaluate all voice recognition solutions to determine the provenance of the AI model being used and whether a Tiny AI engine is supported.

Consumer applications for Tiny AI will continue to grow with the increased use of smart devices and automobile voice recognition services. Imagine your car knowing that you are having a health issue due to changes in your voice pattern; that future is coming.