“OK Google, tell me the quickest route to the office.”

“Alexa, tell me the news.”

“Siri, make a call to Leesa.”

Aren’t these the sentences we hear all the time these days? Isn’t this how most of us begin our day? Virtual assistants have become an integral part of our routines. They assist, entertain, and help us get through everyday tasks at home and at work. But have you ever wondered how these virtual assistants think? How do they know what to say?

Virtual assistants show how dependent we have become on technology. The more it provides, the more we ask for. There is no end to our demands, and no apparent limit to what technology will give us. The virtual assistant is one such gift of technology to mankind.

The brain behind the voice

Virtual assistants do not think on their own. A series of steps runs behind the scenes whenever a command is received. These behind-the-scenes steps are handled by Deep Neural Networks (DNN), the Hybrid Emotion Inference Model (HEIM), and Natural Language Processing (NLP) and Natural Language Generation (NLG).

The person using the virtual assistant gives a command. The system first converts the received voice command to text. This text undergoes analysis, and the system frames a reply. The reply is then converted back to voice for the person who gave the command.
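The flow described above can be sketched as a chain of three stages. This is only an illustration of the order of the steps: every function name here is hypothetical, and the speech stages are stand-ins that treat the audio as plain UTF-8 bytes rather than real sound.

```python
def speech_to_text(audio: bytes) -> str:
    """Stand-in for speech recognition: assumes the 'audio' is UTF-8 text."""
    return audio.decode("utf-8")


def analyze_and_reply(text: str) -> str:
    """Stand-in for the analysis step: frames a reply from a keyword check."""
    if "news" in text.lower():
        return "Here are today's headlines."
    return "Sorry, I did not understand that."


def text_to_speech(reply: str) -> bytes:
    """Stand-in for speech synthesis: returns the reply as UTF-8 bytes."""
    return reply.encode("utf-8")


def handle_command(audio: bytes) -> bytes:
    """Voice in, voice out: the three stages run in the order the text gives."""
    text = speech_to_text(audio)       # voice command -> text
    reply = analyze_and_reply(text)    # text -> framed reply
    return text_to_speech(reply)       # reply -> voice
```

In a real assistant each stand-in would be replaced by a trained model, but the hand-off between the stages would look much the same.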

This seemingly simple task is a complex process that involves the following components.

Deep Neural Networks (DNN)

Deep Neural Networks are neural networks with a high level of complexity: they generally have two or more hidden layers between the input and the output. They apply mathematical models layer by layer to process complex data. The data a virtual assistant receives cannot be neatly categorized in advance, and it needs to be processed quickly and accurately. DNNs are used for this purpose.
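To make the idea of stacked layers concrete, here is a minimal sketch of a forward pass through a network with two hidden layers, written in plain Python. The layer sizes, the random weights, and the four input numbers are all assumptions chosen for illustration; a real assistant would use trained weights and far larger layers.

```python
import math
import random


def relu(values):
    """Hidden-layer activation: negative values become zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """One fully connected layer; weights holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def random_layer(rng, n_in, n_out):
    """Untrained layer with random weights (illustration only)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(features, layers):
    """Pass the features through every hidden layer, then the output layer."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


rng = random.Random(0)
# 4 input features -> two hidden layers of 8 units -> 3 output classes
layers = [random_layer(rng, 4, 8), random_layer(rng, 8, 8), random_layer(rng, 8, 3)]
probabilities = forward([0.2, -0.5, 0.9, 0.1], layers)
```

The output is one probability per class. Training would adjust the weights so that the highest probability lands on the right class.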

Hybrid Emotion Inference Model (HEIM)

Humans can understand a person’s emotion from the tone of their voice. This ability makes human communication not only easy but effective too. The Hybrid Emotion Inference Model, a machine learning model, does this work of emotion recognition for machines. HEIM uses Latent Dirichlet Allocation (LDA) to extract text features and Long Short-Term Memory (LSTM) to model the features of the received sound. Together, these features reveal the emotion behind our voice.
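As a rough illustration of the LSTM part only, the sketch below runs the standard LSTM recurrence over a sequence of sound feature frames in plain Python. It is not the HEIM itself: the parameter layout, the function names, and the idea of feeding short lists of numbers as "frames" are assumptions made for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step on plain Python lists.

    params maps each gate name ("f", "i", "o", "g") to a tuple (W, U, b):
    W weights the input x, U weights the previous hidden state, b is the bias.
    """
    def gate(name, squash):
        W, U, b = params[name]
        out = []
        for row_w, row_u, bias in zip(W, U, b):
            z = sum(w * xi for w, xi in zip(row_w, x))
            z += sum(u * hi for u, hi in zip(row_u, h_prev))
            out.append(squash(z + bias))
        return out

    f = gate("f", sigmoid)    # forget gate: how much old memory to keep
    i = gate("i", sigmoid)    # input gate: how much new information to store
    o = gate("o", sigmoid)    # output gate: how much memory to expose
    g = gate("g", math.tanh)  # candidate values for the memory cell
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c


def run_lstm(frames, params, hidden_size):
    """Feed the frames in order and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for frame in frames:
        h, c = lstm_step(frame, h, c, params)
    return h
```

The final hidden state summarizes the whole sequence of frames, which is why an LSTM suits a signal such as tone of voice that unfolds over time.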


NLP and NLG

These are the Natural Language Processing and Natural Language Generation techniques. NLP is the machine’s ability to understand spoken and written human language. It is what interprets the command that the virtual assistant receives.

Once the system has decided what to reply, NLG is set to work. NLG is an AI technology that produces the wording of the reply, drawing on predefined data. A text-to-speech step then turns that text into the voice we hear.
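A very small sketch of the two halves, under heavy simplification: the "understanding" step here is just pattern matching, and the "generation" step fills predefined reply templates. The intent names, patterns, and templates are hypothetical; production assistants rely on trained language models instead of hand-written rules.

```python
import re

# Hypothetical intent patterns; named groups capture the details of a command.
INTENT_PATTERNS = {
    "call": re.compile(r"\b(?:make a call to|call)\s+(?P<contact>\w+)", re.IGNORECASE),
    "route": re.compile(r"\broute to (?P<place>[\w ]+)", re.IGNORECASE),
    "news": re.compile(r"\bnews\b", re.IGNORECASE),
}

# Predefined reply templates, one per intent.
REPLY_TEMPLATES = {
    "call": "Calling {contact}.",
    "route": "Finding the quickest route to {place}.",
    "news": "Here is the latest news.",
}


def understand(command):
    """NLP stand-in: return (intent, details) for the first matching pattern."""
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.search(command)
        if match:
            return intent, match.groupdict()
    return "unknown", {}


def generate_reply(intent, details):
    """NLG stand-in: fill the predefined template for the intent."""
    template = REPLY_TEMPLATES.get(intent)
    if template is None:
        return "Sorry, I did not understand that."
    return template.format(**details)


intent, details = understand("SIRI, make a call to Leesa.")
print(generate_reply(intent, details))  # prints: Calling Leesa.
```

Keeping the understanding and generation steps separate mirrors the split between NLP and NLG: one turns words into a structured request, the other turns a structured answer back into words.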

Privacy issues?

The virtual assistant technically has permission to access a great deal of data. When a person asks the assistant to read their emails, the assistant gains access to the email account and, in effect, to the credentials that protect it. Many such instances have eroded the trust that people should be able to place in it. The companies providing these assistants are working on building stronger privacy walls, but we can never be sure what an assistant might be doing with our data.

Though virtual assistants have changed the way we live, it is difficult to trust them completely with our data.


Virtual assistants have been in the picture for some time now, and they have definitely changed the way we operate. Researchers need to work on the privacy side of virtual assistants to make them truly reliable. Otherwise, the technology is a gem that makes human life easier.