AvBrand.com - Blog - Musings on Artificial Intelligence and chat bots

Musings on Artificial Intelligence and chat bots

11:55 PM, December 14th, 2011

Over on one of my sites, www.uer.ca, I have a chatroom for members to use. Although the site is dedicated to Urban Exploration, the chatroom often features a variety of topics and usually contains anywhere from 4 to 20 people at any time of day.

Around the holiday season, to make the chatroom more interesting, I switch on a bot that I wrote called the "Santa Bot". It's an automated chat bot that interacts with the participants in the room, and appears to them just like any other user. He responds to direct questions and to select statements, and also speaks at other times.

I'm not much for AI (Artificial Intelligence) -- I've never had much interest in trying to write something that pretends to be a person. Most chatbots out there are extremely complex, and they work by trying to figure out what you mean when you talk to them. Saying "How are you" usually results in a response like "I am fine", since the bot has either been programmed to understand that question or has learned the appropriate response.

Writing that kind of algorithm seemed much too complex for me -- I just wanted a simple bot that I could use to entertain the chat users. So, over the last few years, the Santa Bot has evolved into its current incarnation, which results in wonderful interactions such as this one:

[23:15:04] <Aleksandar> santa, you dont know what fus do rah is?
[23:15:07] <Santa> its better you dont know what that is

In each of the samples I'll be providing here, <Santa> is what the chat bot said, and the other names are other UER members (normal people). The time code on the left side shows the time the message was posted in the chat, to give an idea of the time between messages.

Here are a few other samples of interactions with Santa to whet your appetite:

[15:34:02] *** budda has joined #general
[15:34:06] <Santa> shit, don't show budda those naked pics I took
[15:34:12] <gr8fzy1> Woot, a person has come in!
[15:34:27] <gr8fzy1> Say Woot Santa.
[15:34:30] <Santa> Woot!
[15:34:32] <gr8fzy1> lol
[15:34:35] <Santa> LOL, I like!
[15:34:49] <gr8fzy1> Me too Santa, me too.

[15:39:54] * madcap takes joint from santa
[15:39:57] * Santa steals the joint from madcap
[15:40:14] * gr8fzy1 laughs* Pwnd

[00:04:45] <Astro> santa, quieres me en tu pantelones
[00:04:49] <Santa> Astro es en el pantelones de Av?

How Santa Learns
One important thing to know right away is that pretty much nothing Santa says is pre-programmed. He has only a few canned phrases used in specific instances, such as when greeting a new person joining the chat room. Everything else he says is a repetition of something a person said in the chat room at some point in the past: sometimes 30 minutes ago, sometimes 30 days ago.

So, if you say "hello there" to Santa, he will say it again to someone else, at some point in the future. To help keep the responses sounding real, any reference to Santa's name is replaced with the name of the person he is responding to. For example:

Suppose a user says "How are you, Santa?" to Santa. Later, user Bob says something to Santa, and Santa chooses to respond with the "How are you" line. He writes back "How are you, Bob?", having replaced his own name with Bob's.
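As a rough illustration, the name swap could be sketched in Python like this. The function name `personalize`, the `BOT_NAME` constant, and the choice of whole-word, case-insensitive matching are all assumptions made for the example, not a description of Santa's actual code.

```python
import re

BOT_NAME = "Santa"  # assumed nickname of the bot in the chat room


def personalize(phrase: str, recipient: str, bot_name: str = BOT_NAME) -> str:
    """Replace mentions of the bot's name with the recipient's name.

    Matching is whole-word and case-insensitive (an assumption), so
    "santa" and "SAnta" are replaced but "Santana" is left alone.
    """
    pattern = re.compile(r"\b" + re.escape(bot_name) + r"\b", re.IGNORECASE)
    # A function is used as the replacement so that backslashes in a
    # nickname are inserted literally instead of being read as escapes.
    return pattern.sub(lambda match: recipient, phrase)


print(personalize("How are you, Santa?", "Bob"))
```

With the example above, the stored line "How are you, Santa?" comes back out as "How are you, Bob?".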

Once Santa has used a particular phrase, he deletes it from his database. This keeps him sounding fresh and prevents him from using the same phrase over and over again. A used phrase can be re-learned if another chat user says it again. He also stores each phrase only once at a time, so if three different people each say "hello", he learns it only once.
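The learn-once, forget-after-use behaviour maps naturally onto a set. The sketch below is only one way to model it: the `PhraseStore` class, its method names, and the whitespace normalisation are hypothetical, and a real bot would presumably keep its phrases in a database rather than in memory.

```python
class PhraseStore:
    """In-memory stand-in for the phrase database (hypothetical)."""

    def __init__(self) -> None:
        self._phrases: set[str] = set()

    def learn(self, phrase: str) -> None:
        # Collapse runs of whitespace so trivially different copies of
        # the same line count as one phrase (an assumption).
        normalized = " ".join(phrase.split())
        if normalized:
            self._phrases.add(normalized)  # a set stores each phrase once

    def use(self, phrase: str) -> str:
        # Forget the phrase as soon as it has been said.
        self._phrases.discard(phrase)
        return phrase

    def __contains__(self, phrase: str) -> bool:
        return phrase in self._phrases

    def __len__(self) -> int:
        return len(self._phrases)


store = PhraseStore()
for _ in range(3):
    store.learn("hello")        # three people say the same thing
print(len(store))               # stored only once
store.use("hello")
print("hello" in store)         # gone after being used
store.learn("hello")            # re-learned when someone says it again
print("hello" in store)
```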

How Santa Chooses What To Say
Okay, so we've got a big database of everything that was said in the chat. How do we select what to say for a given input phrase?

This selection is done with largest-word matching. A score is given to each word found in the input phrase, with longer words receiving a larger score. Then, phrases in the database are examined and ranked, with the phrase that contains the largest number of high-ranking input words being given the top priority.

Example:
Input phrase: i love cheese
Possible outputs: "i love you", "you cheese"

Although both phrases match some of the words, the second possible phrase will be selected, since it matches a longer word, and thus has a higher score. This type of matching means that Santa is more likely to focus on the longer and more complex words, rather than focusing on words like "a", "the", or "to".

Alright, great, but how could this possibly work? There's no way the selected phrase has anything at all to do with the input phrase or question.

Human Nature
The next part works due to human nature and the structure of conversations in chat rooms. When reading an essay or an article, sentences are long and are grouped into paragraphs. But in a chat room, sentences are often on their own lines, and are often further split up into multiple lines.

Reading one line of chat on its own, without the preceding and following lines for context, results in confusion:

[23:27:37] <aurelie> the hearts make them so much creepier

So, in chat rooms, our brains are trained to infer the context by looking at the previous lines.

In addition, due to the multi-user nature of chat, a dozen conversations can be happening at once. If a line is confusing and seems out of place, our tendency is to simply ignore it, classifying it as intended for someone else.

Santa takes advantage of these shortcomings in chat rooms. First, everything he says is time-delayed, with the length of the delay controlled by the length of the phrase he is about to say, but never less than three seconds. This simulates a user typing on their keyboard.
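The delay rule can be sketched as a small function. The three-second floor comes from the description above; the typing speed of 10 characters per second and the name `typing_delay` are guesses chosen only to make the example concrete.

```python
MIN_DELAY_SECONDS = 3.0      # floor stated in the text
CHARS_PER_SECOND = 10.0      # assumed typing speed, purely illustrative


def typing_delay(phrase: str) -> float:
    """Seconds to wait before posting: longer phrases take longer to 'type'."""
    return max(MIN_DELAY_SECONDS, len(phrase) / CHARS_PER_SECOND)


for line in ["Woot!", "I bougtht all the stuff for it at the market."]:
    print(f"{typing_delay(line):.1f}s  {line}")
```

The bot would wait this long (for instance with a timer or `time.sleep`) between choosing a phrase and posting it.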

Second, although the things he says are often from unrelated conversations, our natural ability to put things in the appropriate context allows our brains to fit his responses into the current context of the conversation.

For example:

[23:33:55] <beefcurtainwin> did you finnaly find the good market basket?
[23:33:59] <Santa> I bougtht all the stuff for it at the market.

Here, the common word is "market", and while the response phrase was originally for a completely different conversation, it is easy to fit this into the context of the question. Did you find the good market basket? Yes, I bought things I need for the good market basket at the market.

Sometimes, for conversations on particular topics, Santa can be almost eerily topical, such as in this discussion about the recent CERN tests to find the Higgs boson (also sometimes called the God particle):

[14:11:41] <Price22> so uh
[14:11:48] <Price22> they find the god particle yet?
[14:11:52] <Santa> the observed what appears to be a mass producing particle
[14:11:59] <Price22> im at school and havnt googled yet
[14:12:11] <Price22> BUT MASS CANNOT BE CREATED
[14:12:17] <Tamara> i dont think it should be called the god particle
[14:12:21] <Santa> it could be the particle that gives mass to other partilces
[14:12:33] <MutantMandias> SAnta thinks it should. And Santa has the gun.
[14:12:37] <Santa> and everyone thinks it might be realted to the discovery of the HB
[14:12:40] <keti> no
[14:12:49] <keti> but theyre saying they're close to finding it

[14:15:08] <keti> santa, higgs boson?
[14:15:12] <Santa> the higgs boson is fundamentally flawed
[14:16:11] <Santa> its congruent with science

Other advantages
Since there is no real understanding of language besides word lengths, Santa can be used in any language, and often with multiple languages simultaneously. He also naturally matches the caliber of conversation that takes place in the room -- if all conversations are science-related, Santa will bring up science topics. If all conversations are sexual, Santa will have a very dirty mouth.

Shortcomings
Naturally, there are many shortcomings with a system like this. Probably the biggest one is Santa's inability to truly understand a question and respond to it. This means direct questions rarely result in an answer you'd expect. Sometimes the responses can be quite shocking:

[23:45:31] <Porcelain_Doll> santa do you like tuna?
[23:45:35] <Santa> Screw you jerk. You made me feel like shit.

However, nonsensical or non-sequitur responses are often ignored by the chat users, as their conversation has advanced and they subconsciously believe Santa to be speaking to someone else, or they believe they simply didn't understand the context.

Also, Santa will pick up on all of the bad habits of the chat users -- if foul language is used, he will use foul language too, often at inopportune times.

Final thoughts
It usually doesn't take very long for people to figure out that Santa is a bot, if they didn't know already. But this is true for most chat bots. At the very least, interacting with Santa can be extremely humorous.

The fact that Santa's responses make sense even some of the time is extremely interesting to me, and it shows that creating an AI need not necessarily involve any kind of actual intelligence programming. The human brain can do much of the work for us.
