Want to make a chatbot, but wonder whether you're doing it right? This short blog post presents a reference architecture showing how most of today's bots are internally organized. I've seen this architecture used in most text-based customer service, FAQ and other one-trick bots.

The Architecture

Making bots truly understand us or respond in a human way is still beyond our grasp. For practical reasons, most of today's bots are one-trick bots: they handle simple use cases like answering frequently asked questions (FAQ), responding to simple queries, or turning on the living room lights when requested.

Below is the reference architecture I've seen in most of today's bots. The components involved have been in production for a while, so they are quite reliable and hence popular in bot making. The architecture is simple, meaning it's not for AI math-magicians only: you can create your own one-trick bot with these building blocks on an acceptable time and budget. So, it's time to roll up your sleeves.

Image: Common bot architecture

This model is used in most of today's bots, which don't understand context. In short, the bot remembers nothing of its previous conversations with the user. While the approach feels quite brainless, it is usually quite acceptable: you ask, the bot answers. It also reduces the bot's complexity considerably, as handling context is one of the most daunting tasks in bot development.

The parts and the whole

The next chapters show you the components and how they interact to create a fully working chatbot. As I am quite familiar with Microsoft Azure technologies, I'll include links to the corresponding Azure services for a real-life implementation of this reference model. Azure is by no means a requirement – this reference model can be implemented with other technologies as well. You can even code your own language-understanding component if you like (I've done that too, with Kalle Kotipsykiatri)!

Bot Framework

You should always build your bot app on top of an existing framework. The framework's job is to relieve you of generic tasks: funneling messages from various channels into a single bot codebase, the protocols for passing text, images and "user is typing" notifications, securing the communications, and the web and REST interfaces for interacting with the bot. It just makes no sense to write these yourself.

Implement with Azure Services: Microsoft Bot Framework / Azure Bot Service.

Language understanding service(s)

The user's text input is first fed to a language-understanding service, which is responsible for translating the utterance into an intent. Briefly, the component's main responsibility looks like this:

  • User input: “Get me a cab” –> Intent out: ORDER_TAXI
  • User input: “When does the next train to Helsinki leave?” –> Intent out: TRAIN_SCHEDULE_QUERY
  • User input: “Hello, how RU?” –> Intent out: GREETING
  • User input: “Gotta run, bye!” –> Intent out: END_OF_CONVERSATION

With some language-understanding components you can train the system a little further and pick up something called entities from the utterance, like:

  • User input: “When does the next train to Helsinki leave?” –> Intent out: TRAIN_SCHEDULE_QUERY, DESTINATION = Helsinki
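The contract above can be sketched in a few lines of code. This is a toy stand-in, not a real NLU service: a trained service like LUIS learns from example utterances, whereas the keyword rules and the naive entity pickup below are purely illustrative.

```python
# Toy sketch of a language-understanding component's contract:
# utterance in, intent (plus any entities) out. The keyword rules
# are hypothetical stand-ins for a trained model.

def understand(utterance: str) -> dict:
    """Map an utterance to an intent and any entities found."""
    text = utterance.lower()
    if "train" in text:
        result = {"intent": "TRAIN_SCHEDULE_QUERY", "entities": {}}
        # Naive entity pickup: the word after "to" is the destination.
        words = text.rstrip("?!.").split()
        if "to" in words and words.index("to") + 1 < len(words):
            result["entities"]["DESTINATION"] = words[words.index("to") + 1].capitalize()
        return result
    if "cab" in text or "taxi" in text:
        return {"intent": "ORDER_TAXI", "entities": {}}
    if "hello" in text or "how ru" in text:
        return {"intent": "GREETING", "entities": {}}
    if "bye" in text:
        return {"intent": "END_OF_CONVERSATION", "entities": {}}
    return {"intent": "NONE", "entities": {}}  # hand over to fallback handlers

print(understand("When does the next train to Helsinki leave?"))
```

The important part is the shape of the output: an intent label plus a dictionary of entities, which the rest of the pipeline consumes.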

To improve accuracy, you can chain other cognitive services both before and after the main language-understanding component. A few examples: feed the user input into a spell-checking service that corrects typos – this "purified" input usually results in better intent identification. If your bot's users are multilingual, first detect the language used, then machine-translate the input, and only then feed the result to a language-understanding model for the correct language. Sometimes you may also want to run sentiment analysis on the user input: how happy or frustrated does the user seem to be? If the user is OK, proceed normally; if not, maybe you should redirect the discussion to a real person.
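Such a chain can be sketched as a simple pipeline. Every service function here is a hypothetical placeholder – in a real bot each one would be a call to an actual cognitive service (spell check, language detection, translation, sentiment analysis), and the threshold and word lists are made-up assumptions.

```python
# Hypothetical pre-processing chain around the language-understanding call.
# Each helper stands in for a real cognitive service.

def spell_check(text: str) -> str:
    fixes = {"trian": "train", "helsnki": "helsinki"}  # stand-in corrections
    return " ".join(fixes.get(w, w) for w in text.split())

def detect_language(text: str) -> str:
    return "fi" if "juna" in text else "en"  # placeholder heuristic

def translate_to_english(text: str, lang: str) -> str:
    return text if lang == "en" else text  # a real bot would call a translator here

def sentiment(text: str) -> float:
    return 0.1 if "useless" in text else 0.8  # 0 = angry, 1 = happy (assumed scale)

def understand_intent(text: str) -> str:
    return "TRAIN_SCHEDULE_QUERY" if "train" in text else "NONE"

def handle(utterance: str) -> str:
    cleaned = spell_check(utterance)               # 1) purify the input
    lang = detect_language(cleaned)                # 2) detect language
    english = translate_to_english(cleaned, lang)  # 3) translate if needed
    if sentiment(english) < 0.3:                   # 4) frustrated user: escalate
        return "HANDOFF_TO_HUMAN"
    return understand_intent(english)              # 5) main language understanding

print(handle("When does the trian leave?"))  # typo is corrected before intent detection
```

The ordering matters: spell checking and translation happen before intent detection so the model sees clean input, while the sentiment check can short-circuit the whole chain.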

Implement with Azure Services: Language Understanding (LUIS), Bing Spell Check, Translator Text, Text Analytics.

Fallback handler(s) – optional

When your language understanding fails to detect an intent, you'll need to decide what to do next. Here are the most common options I've encountered. You don't have to restrict yourself to a single fallback method per bot – the more the merrier.

Do nothing

Just let the user know the bot had no clue what they said. Not very elegant, but efficient. This is usually the last option in the chain of fallback handlers.

Google for answers

To be quite frank, most bots fall back to a custom search when intent detection fails. Here's the short recipe: first, create a search index to look for answers in. You probably have intranets, extranets, databases and file shares full of content – use any of these to populate the index with suitable material. When intent detection fails, use a text-analytics service to extract keywords from the user input, then feed those keywords into the search and hope for the best.
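The recipe above can be sketched as follows. In production the keyword extraction would be a text-analytics service and the index a real search engine; the stopword list, the two indexed documents and the overlap ranking here are illustrative assumptions only.

```python
# Minimal sketch of the search fallback: extract keywords from the
# failed utterance and rank indexed documents by keyword overlap.

STOPWORDS = {"the", "a", "an", "is", "are", "how", "do", "i", "my", "what"}

SEARCH_INDEX = {  # document -> its keywords, built from intranet/FAQ content
    "password-reset.html": {"password", "reset", "forgot"},
    "opening-hours.html": {"opening", "hours", "open"},
}

def extract_keywords(utterance: str) -> set:
    return {w for w in utterance.lower().rstrip("?!.").split() if w not in STOPWORDS}

def search_fallback(utterance: str):
    keywords = extract_keywords(utterance)
    # Rank documents by keyword overlap and return the best hit, if any.
    best = max(SEARCH_INDEX, key=lambda doc: len(SEARCH_INDEX[doc] & keywords))
    return best if SEARCH_INDEX[best] & keywords else None

print(search_fallback("How do I reset my password?"))
```

If no document shares a keyword with the utterance, the function returns `None`, which would hand the conversation over to the next fallback handler in the chain.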

Redirect the conversation to a real person

You also have the option of routing the bot conversation to a real person. The trigger for rerouting might be any or all of the following: 1) a sentiment analysis service detects that the user is frustrated (or angry) with the bot discussion, 2) the language-understanding module detects that the user wants the discussion rerouted, with utterances like "I want to talk to a real person", "Can I talk to someone real" or "Can I talk to customer service", or 3) you always reroute the discussion to a real person when language understanding fails to detect the intent.
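The three triggers can be combined into one decision function. The sentiment threshold and the phrase list below are illustrative assumptions, not fixed values from any real service.

```python
# Sketch of the three human-handoff triggers described above.

HANDOFF_PHRASES = ("real person", "someone real", "customer service")

def should_handoff(intent: str, utterance: str, sentiment_score: float) -> bool:
    if sentiment_score < 0.3:                                 # 1) user seems frustrated
        return True
    if any(p in utterance.lower() for p in HANDOFF_PHRASES):  # 2) user asked explicitly
        return True
    return intent == "NONE"                                   # 3) understanding failed

print(should_handoff("GREETING", "Can I talk to a real person?", 0.9))
```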

Implement with Azure Services: Azure Search, Text Analytics, QnA Maker.

Response generator

The last piece in the chain formulates the reply to the user. If no intent was found, this module may just pass through the response created by the fallback modules (or the reply from an actual person). Otherwise, the intent found is used to generate an answer. In its simplest form this module just queries a database containing intent–reply pairs.
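At its simplest, that lookup is a dictionary. The table below would live in a database in production, and the intent names and reply texts are made-up examples.

```python
# Simplest possible response generator: an intent -> reply lookup,
# with pass-through for replies produced by the fallback handlers.

REPLIES = {
    "GREETING": "Hi! How can I help you today?",
    "ORDER_TAXI": "Sure, ordering a taxi for you.",
    "END_OF_CONVERSATION": "Bye, have a nice day!",
}

def respond(intent: str, fallback_reply: str = None) -> str:
    if intent in REPLIES:
        return REPLIES[intent]
    # No intent found: pass through whatever the fallback handlers produced.
    return fallback_reply or "Sorry, I didn't understand that."

print(respond("GREETING"))
```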

On Azure: Azure Table Storage or Cosmos DB for storing the intent–reply pairs.


This is it. Commercial bots are usually no more complex than this, at least in Finland. If you follow this pattern, it's safe to say your bot architecture is OK, and you can implement the most sought-after simple bot scenarios with it.