Kalle Kotipsykiatri amazed C-64 home users in the mid-eighties with its witty comments and its ability to hold conversations in Finnish. I stumbled across the original BASIC source code, written by Jyrki J. J. Kasvi and published in Mikrobitti magazine. I decided to convert the original Kalle to a modern bot platform, with the intent of testing Finnish natural language understanding in real life. You can test the resulting modernized Kalle with the button below.

 

Natural language understanding

Conversational bots today are typically built using a separate component responsible for interpreting user utterances into intents, while picking up predefined entities from the input. An example of such a component is the Microsoft LUIS cloud service, which currently understands 11 different languages, but not Finnish.

Teaching the model

In modern implementations, the AI model responsible for interpreting user input into intents is taught by entering a number of example utterances, each of which is labeled with an intent. For example, “I want to order a cab” would correlate with the user intent “ORDER_TAXI”. You usually enter more than one example for each intent to make identifying the intent more reliable. Model size typically varies from a few hundred to several thousand examples.

Using the model

The user input is fed to the model, and the AI returns information on which example(s) the input text correlated best with. The model may also return certain variables (like location, time, product name etc.) picked up from the input line. Alternatively, the model may report that the input does not really match any of the examples.
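As a rough illustration of the teach-then-use cycle described above, here is a minimal Python sketch. It is not how LUIS or any real service works: the example utterances, the `classify` function, the word-overlap (Jaccard) score and the 0.3 threshold are all assumptions made for the example, and variable extraction is left out.

```python
import re

# Hypothetical training examples: (utterance, intent) pairs.
EXAMPLES = [
    ("I want to order a cab", "ORDER_TAXI"),
    ("please book me a taxi", "ORDER_TAXI"),
    ("what is the weather today", "GET_WEATHER"),
    ("will it rain tomorrow", "GET_WEATHER"),
]


def tokens(text):
    """Lowercase the text and return its words as a set."""
    return set(re.findall(r"\w+", text.lower()))


def classify(text, threshold=0.3):
    """Return (intent, score) for the best-matching example.

    The intent is None when no example scores at least `threshold`.
    """
    words = tokens(text)
    best_intent, best_score = None, 0.0
    for utterance, intent in EXAMPLES:
        example_words = tokens(utterance)
        union = words | example_words
        # Jaccard similarity: shared words divided by all distinct words.
        score = len(words & example_words) / len(union) if union else 0.0
        if score > best_score:
            best_intent, best_score = intent, score
    if best_score < threshold:
        return None, best_score
    return best_intent, best_score
```

With these examples, `classify("order a cab please")` picks `ORDER_TAXI`, while an input that shares no words with any example, such as `classify("hello there")`, comes back with `None` as the intent.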

Kalle and Finnish language understanding

The original Kalle’s language understanding was already based on the mechanism used today: take the user input, try to match it to the examples contained in the model, and return the most likely intent of the user. Kalle’s approach was completely algorithmic; no AI was involved, or needed for such a simple purpose.

Kalle’s engine tries to find either a phrase, a whole word or a morpheme (the smallest meaningful part of a word, which is very important in understanding Finnish) in the user input. As soon as a match is found, the engine returns the “intent” defined for it. Kalle can also pick up potential variables from the input, such as what the user said they fear.
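The matching idea can be sketched in a few lines of Python. This is not Kasvi's original BASIC logic. The rules, the intent names and the `match_intent` function are hypothetical, a word stem stands in for a morpheme, and treating the words after the match as the variable is my own simplification.

```python
import re

# Hypothetical rules, checked in order: (kind, pattern, intent).
RULES = [
    ("phrase", "mitä kuuluu", "GREETING"),  # "how are you"
    ("word", "äiti", "FAMILY"),             # "mother", whole word only
    ("morpheme", "pelkä", "FEAR"),          # stem of "pelätä", "to fear"
]


def match_intent(text):
    """Return (intent, variable) for the first matching rule, else (None, None)."""
    lowered = text.lower()
    words = re.findall(r"\w+", lowered)
    for kind, pattern, intent in RULES:
        if kind == "phrase" and pattern in lowered:
            return intent, None
        if kind == "word" and pattern in words:
            return intent, None
        if kind == "morpheme":
            for i, word in enumerate(words):
                if pattern in word:
                    # Assumption: whatever follows the matched word is the variable.
                    rest = " ".join(words[i + 1:])
                    return intent, rest or None
    return None, None
```

The stem rule is what makes inflected forms work: “pelkään”, “pelkäät” and “pelkäämme” all contain “pelkä”, so `match_intent("Pelkään hämähäkkejä")` yields the `FEAR` intent with “hämähäkkejä” (spiders) as the variable. The whole-word rule, in contrast, does not fire on the inflected form “äitini”.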

Producing Finnish language

Kalle’s engine searches the database for all phrases that match the intent found, and returns one of them at random. The text is then slightly corrected to produce better Finnish conjugation before it is returned to the user.
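A minimal Python sketch of this answer step might look like the following. The phrases, the `{x}` placeholder convention and the `reply` function are assumptions made for illustration. The actual conjugation corrections are not described here, so `tidy` is only a placeholder that cleans up spacing and capitalisation.

```python
import random

# Hypothetical answer phrases per intent; "{x}" marks where a captured variable goes.
PHRASES = {
    "FEAR": ["miksi pelkäät {x}?", "kauanko olet pelännyt {x}?"],
    "NO_COMPUTE": ["kerro lisää."],
}


def tidy(text):
    """Placeholder correction step: collapse spaces and capitalise the first letter."""
    text = " ".join(text.split())
    return text[:1].upper() + text[1:]


def reply(intent, variable=None, rng=random):
    """Pick one phrase for the intent at random and fill in the variable."""
    candidates = PHRASES.get(intent, PHRASES["NO_COMPUTE"])
    if variable is None:
        # Without a variable, skip phrases that need one.
        candidates = [p for p in candidates if "{x}" not in p] or PHRASES["NO_COMPUTE"]
    phrase = rng.choice(candidates)
    return tidy(phrase.replace("{x}", variable or ""))
```

Calling `reply("FEAR", "hämähäkkejä")` returns one of the two fear questions with the variable filled in. Passing a seeded `random.Random` instance as `rng` makes the choice repeatable, which is handy for testing.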

How to define Kalle’s intelligence?

The intents and matching phrases are defined in an Excel file. The “intent” rows contain keyed examples that the engine looks for in the user input. The matching “Phrase” rows determine the alternative answers. If nothing matches, “NO_COMPUTE” is returned. By altering this Excel file, Kalle can be modified for a totally different purpose.
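To make the idea of a data-driven definition concrete, here is a small Python sketch that reads such a table. The three-column layout (row type, key, text) is my assumption, not the actual layout of the Excel file, and a CSV string stands in for the spreadsheet to keep the example self-contained. The names `SHEET`, `load_rules` and `find_intent` are hypothetical.

```python
import csv
import io

# Hypothetical export of the sheet: row type, intent key, text.
SHEET = """\
intent,FEAR,pelkä
intent,GREETING,mitä kuuluu
Phrase,FEAR,Miksi pelkäät?
Phrase,GREETING,Kiitos hyvää.
Phrase,NO_COMPUTE,Kerro lisää.
"""


def load_rules(text):
    """Split the rows into match examples and answer phrases, keyed by intent."""
    examples, phrases = [], {}
    for row_type, key, value in csv.reader(io.StringIO(text)):
        kind = row_type.strip().lower()
        if kind == "intent":
            examples.append((value.lower(), key))
        elif kind == "phrase":
            phrases.setdefault(key, []).append(value)
    return examples, phrases


def find_intent(user_input, examples):
    """Return the key of the first example found in the input, else NO_COMPUTE."""
    lowered = user_input.lower()
    for example, key in examples:
        if example in lowered:
            return key
    return "NO_COMPUTE"
```

Because the engine only sees the loaded rows, swapping in a different table changes the bot's whole personality without touching the code.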

Kalle and mainstream

Finnish is globally a marginal language with only 6 million speakers, and some claim that it is one of the most difficult languages to learn. Unfortunately, this means that reasonably priced, ready-to-use language understanding solutions from the big players are not to be expected any time soon, if ever. Finnish, with its slightly challenging word conjugations, is also a difficult task for mainstream language AI solutions. To catch up with the mainstream, we need to develop our own language understanding modules tailored for our language.

Kalle: Lessons learned

While making the new Kalle I really tried to find an existing Finnish language understanding component to translate user input to intents. Can it be that no such component exists yet?

Thanks to this exercise and Kasvi’s original source code, I now have at least the beginning of such a component that I can continue to build on. It’s not much by any means (the bots wouldn’t pass the Turing test), but it can already be used to quickly produce Finnish-speaking Frequently Asked Questions (FAQ) bots. I’ll most likely continue working on this in my free time; maybe Wildcode will one day publish the first Finnish-understanding REST API service for other bot builders to use?

Another point that needs to be developed further is the bot’s understanding of context. Kalle remembers the user’s last input line, so it can nag about the user repeating themselves. Beyond that, every exchange is as if Kalle were meeting the person for the first time. The sense of conversation context (or, to be exact, the lack of it) is one of the most difficult concepts for any conversational bot. Siri, Cortana, Alexa, Xiaoice: all are affected by this problem. In general, no good solutions exist yet in this field of study.
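The single piece of context Kalle keeps is easy to picture in code. The Python sketch below is only an illustration: the `Session` class and the way it normalises input are assumptions, not a description of how the original or the new Kalle stores it.

```python
class Session:
    """Keeps the only context described above: the previous input line."""

    def __init__(self):
        self.last_input = None

    def is_repeat(self, user_input):
        """Return True if the input equals the previous one, then remember it."""
        # Ignore case and extra whitespace when comparing.
        normalized = " ".join(user_input.lower().split())
        repeated = normalized == self.last_input
        self.last_input = normalized
        return repeated
```

Anything richer than this, such as remembering what was discussed two turns ago, needs a real model of the conversation, and that is where the hard problems start.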

If you can speak (or type) Finnish, try Kalle out! The original C-64 Kalle was already an impressive example of using Finnish in a chatbot. The original code (or the idea behind the code) is still quite valid even today!

Links and further reading