Unlike Siri and Alexa, who work for millions of people, I work for just one client, and his name is Steve. My purpose in life is to listen to his requests, understand his goals, and then carry them out.

I understand each goal by having a conversation. My understanding of his goal at any point in the conversation is represented by my belief state.

In each turn of the conversation, I extract the meaning of his utterance, which I call the intent, and I use it to update my belief state.

Whenever I am uncertain or need more information, I will ask him a question.

When I am sure that I have understood his goal correctly, I execute it.
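
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not my actual implementation: the names `Intent`, `BeliefState`, `REQUIRED_SLOTS`, `next_action`, the example goals, and the 0.8 confidence threshold are all hypothetical, and a real belief state would normally be a probability distribution rather than a single best guess per slot.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Assumed cut-off above which a value counts as "understood correctly".
CONFIDENCE_THRESHOLD = 0.8

# Hypothetical goals and the pieces of information each one needs.
REQUIRED_SLOTS = {"set_alarm": ["time"], "play_music": ["artist"]}


@dataclass
class Intent:
    """Meaning extracted from a single utterance (illustrative structure)."""
    goal: Optional[str]                    # None if the utterance names no goal
    slots: Dict[str, Tuple[str, float]]    # slot name -> (value, confidence)


@dataclass
class BeliefState:
    """Current understanding of the goal at this point in the conversation."""
    goal: Optional[str] = None
    slots: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def update(self, intent: Intent) -> None:
        # A newly stated goal replaces the old one and clears its details.
        if intent.goal is not None and intent.goal != self.goal:
            self.goal = intent.goal
            self.slots = {}
        # The most recent evidence for a slot overwrites the previous value.
        self.slots.update(intent.slots)


def next_action(belief: BeliefState) -> Tuple[str, str]:
    """Ask when information is missing or uncertain, otherwise execute."""
    if belief.goal is None:
        return ("ask", "goal")
    for name in REQUIRED_SLOTS.get(belief.goal, []):
        if name not in belief.slots:
            return ("ask", name)           # need more information
        _value, confidence = belief.slots[name]
        if confidence < CONFIDENCE_THRESHOLD:
            return ("confirm", name)       # uncertain, so check with the user
    return ("execute", belief.goal)


# One short conversation, turn by turn.
belief = BeliefState()
belief.update(Intent("set_alarm", {}))
print(next_action(belief))
belief.update(Intent(None, {"time": ("7 am", 0.5)}))
print(next_action(belief))
belief.update(Intent(None, {"time": ("7 am", 0.95)}))
print(next_action(belief))
```

In this sketch, the three turns produce an "ask" for the missing time, then a "confirm" because the recognized time has low confidence, and finally an "execute" once the confidence clears the threshold.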

See an example conversation here.

Everything that I do depends on deep neural networks.

I convert Steve’s speech into words using speech recognition, and I verify his identity using speaker verification.

I extract the meaning from his words using spoken language understanding, and I decide what to say next using conversation management.

Both spoken language understanding and conversation management depend on information stored in my knowledge graph.

I can also chat like a chatbot, and I can translate between many different languages.

Finally, I convert my responses into speech using speech synthesis.

