The season of voice-based personal assistants

Christine Jorgensen of Qualcomm

One of the hottest gift ideas this holiday season is the virtual assistant, specifically one with a voice user interface (VUI). These handy devices have become increasingly common in our daily lives since Siri was first introduced in 2011. Around 700 million people are using AI personal assistants, and the market is expected to grow to almost 2 billion by 2021. There are multiple solutions out there, from Siri to Google Assistant to Amazon Alexa and Microsoft Cortana. Samsung has recently launched its Bixby assistant, while Facebook is expected to bring its own virtual assistant, simply called “M”, to commercialisation next year, says Christine Jorgensen, director of product management at Qualcomm.

As a developer, it is important to understand how these devices work and how to take advantage of their capabilities. Internally, they are powered by Bluetooth and Wi-Fi modules such as the Qualcomm QCA9377-3 and processors such as the Qualcomm Snapdragon Mobile Platform. In this blog, we’re going to dive into how it all fits together.

Conversational and command-based interactions

A conversational interface is a user interface that mimics having a conversation with a human. Personal assistants come in two flavors: chatbots (text-based interactions) and voice user interfaces (voice-activated assistants) like the commercial products mentioned earlier. Voice-activated assistants are typically command-based AI interactions – you ‘wake it up’ and tell it what to do.

Voice-activated assistants are ideal for day-to-day tasks such as:

  • Fact finding: Internet searches to find information, time of day and weather queries.
  • Tasking: Setting alarms, sending messages, playing music & video, ordering things online, smart home coordination.
  • Information gathering: Call centers collecting user information, healthcare providing initial diagnosis.
  • Training: Learn a new language by conversing with an AI teacher.

Using a VUI bypasses the need for a keyboard, screen, and spellchecking, which makes it useful for hands-free communication as well as for accessibility needs.

The components

The hardware components of voice-based assistants include speakers and a microphone, Bluetooth and Wi-Fi modules, and standard computer architecture (CPU, RAM). Although there’s a lot of technology in the device, the real brains usually reside in the cloud.

The easiest way to start writing apps that take advantage of VUI is to use a platform such as Dialogflow, which has integrations for all of the major players. If you want to delve deeper into the brains, you can learn more about natural language processing and machine learning in general.
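To give a feel for what that looks like in practice, here is a minimal sketch of the logic behind a Dialogflow ES-style webhook, where the platform has already done the speech and language work and hands your code a matched intent plus its parameters. The `queryResult` and `fulfillmentText` fields follow the Dialogflow ES webhook format; the intent name `send_message` and the `recipient` parameter are hypothetical, and a real deployment would wrap this function in a web server.

```python
# Sketch of the decision logic behind a Dialogflow ES-style webhook.
# The intent name "send_message" and the parameter "recipient" are
# hypothetical; they would be defined by you in the agent console.

def handle_webhook(request_body: dict) -> dict:
    """Map a parsed webhook request to a fulfillment response."""
    query_result = request_body.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    parameters = query_result.get("parameters", {})

    if intent_name == "send_message":
        recipient = parameters.get("recipient") or "someone"
        reply = f"Okay, sending a message to {recipient}."
    else:
        reply = "Sorry, I can't help with that yet."

    # "fulfillmentText" is the text the assistant speaks or displays.
    return {"fulfillmentText": reply}


# A trimmed-down example of what the platform might send.
example_request = {
    "queryResult": {
        "queryText": "send mom a message",
        "intent": {"displayName": "send_message"},
        "parameters": {"recipient": "mom"},
    }
}
print(handle_webhook(example_request)["fulfillmentText"])
```

Keeping the handler as a plain function of a dictionary makes it easy to test without a running server or a live agent.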

The process

To be effective with this technology as a developer and a designer, it is important to understand the complete command interaction process, which is as follows:

  • The virtual assistant is “woken up” using a trigger word (“Ok Google”, “Hey Siri”) to ensure that it only starts acting upon your command.
  • Audio is recorded on the device, compressed, and streamed to the cloud over Wi-Fi. Noise reduction algorithms are often applied to the recorded audio so that the commands are more easily interpreted by the cloud processing.
  • The audio is turned into text commands using a proprietary voice-to-text platform. Analog sound waves are converted to digital data by sampling the analog signal at a specified frequency. The digital data is analysed to determine where the English phonemes (“bb”, “oo”, “sh”, etc.) occur. Once the phonemes are identified, a statistical modelling algorithm, such as a hidden Markov model, is used to determine the likelihood of a specific word.
  • The text is processed using Natural Language Processing (NLP) to determine the desired action. The algorithm first uses part-of-speech tagging to determine which words are adjectives, verbs, nouns, etc. It combines this tagging with statistical machine learning models to deduce the meaning of the sentence.
  • If the action requires further searches, then they are performed at this time. For example, “Hey Siri, what is the Snapdragon mobile platform?” would require an internet search to return the information. If the command is something like “Ok Google, send mom a message” then the command data (action: send message, recipient: mom) is sent back to the virtual assistant.
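To make the statistical modelling step in the voice-to-text stage a little more concrete, here is a toy sketch of Viterbi decoding, a standard way to find the most likely sequence of hidden states in a hidden Markov model. It is not how any particular assistant is implemented. The two phoneme states, the "hiss" and "hum" acoustic labels, and all of the probabilities are made up for illustration; real systems work on far richer acoustic features.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (most likely hidden-state path, probability of that path)."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               the state that path was in at time t - 1)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev, prob = max(
                ((p, best[-1][p][0] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)

    # Pick the best final state, then walk the back-pointers to the start.
    state = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][state][0]
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path, prob


# Toy model (all values are illustrative assumptions).
states = ["sh", "oo"]
start_p = {"sh": 0.6, "oo": 0.4}
trans_p = {"sh": {"sh": 0.7, "oo": 0.3}, "oo": {"sh": 0.2, "oo": 0.8}}
emit_p = {"sh": {"hiss": 0.9, "hum": 0.1}, "oo": {"hiss": 0.2, "hum": 0.8}}

frames = ["hiss", "hiss", "hum", "hum"]  # one label per audio frame
path, prob = viterbi(frames, states, start_p, trans_p, emit_p)
print(path)  # ['sh', 'sh', 'oo', 'oo']
```

Collapsing the repeated states gives the phoneme sequence "sh", "oo", which a pronunciation dictionary could then match against candidate words.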

A reply is constructed in the cloud and the desired output words are retrieved from a database of speech samples. These words are stitched together to form a sentence and returned to the hardware to broadcast to the user.
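The stitching step can be pictured with a small sketch. It assumes a hypothetical in-memory "database" that maps each word to a list of audio sample values, and joins them with a short run of silence between words. The sample values and the `stitch_reply` helper are invented for illustration; production systems use much more sophisticated speech synthesis.

```python
# Hypothetical speech-sample database: word -> recorded audio samples.
# Real samples would be thousands of values per word.
SAMPLE_DB = {
    "message": [3, 5, 2],
    "sent": [7, 1],
    "to": [4],
    "mom": [6, 6, 2],
}


def stitch_reply(words, sample_db, gap=2):
    """Concatenate per-word samples, with `gap` zero samples (silence) between words."""
    audio = []
    for i, word in enumerate(words):
        if i > 0:
            audio.extend([0] * gap)
        audio.extend(sample_db[word.lower()])
    return audio


reply = stitch_reply(["Message", "sent", "to", "mom"], SAMPLE_DB, gap=1)
print(reply)  # [3, 5, 2, 0, 7, 1, 0, 4, 0, 6, 6, 2]
```

The resulting list stands in for the audio buffer that would be returned to the device and played through its speaker.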

What’ll be the next talk of the town?

Now that you know how voice-activated assistants work, you can start building your own products. Why not try making a voice-powered RC car or maybe a Christmas tree that responds to your child’s commands? With the power of voice recognition and the latest Qualcomm Technologies, including our Bluetooth and Wi-Fi modules as well as our Qualcomm 3D Audio Tools, you can treat yourself to some fun new developer challenges over the holidays.

The author of this blog is Christine Jorgensen, director of product management at Qualcomm.
