chatterbots :: chapter 2

2. From Awkward Pauses to Cocktail Party Chaos

Computer-mediated communication sets new kinds of limits and possibilities for textual conversation. It also affects the language used.

2.1 Textual Communication

Awkward pauses can be as awkward in a virtual chat room as they are in physical space. Often the user cannot be sure what is causing the silence, which makes it difficult to recognize who should speak next. If the reason is technical, it may be a slow connection or server (lag). The other party may simply be a slow typist or be occupied with something other than the ongoing discussion. Additionally, most chat technologies allow whispering, in which one can target a message to only one of the participants without the others knowing. Experienced chat users can manage multiple conversations at the same time. They may be participating in a public conversation while whispering to one or more people, not to mention that they may be present in many different chats simultaneously and doing other tasks on their computers as well. Of course, there is always a chance that someone is actually thinking before she talks back. All of this slows the tempo of each conversation and can be confusing to newcomers, who expect faster transitions from one speaker to another and conventional turn-taking.

I once received an e-mail from someone I had met in MyCorner (MC). She wrote that she would like to get to know me better but did not want to come to MC because Meg was so rude to her.

Some people go to MC for weeks before they realize that Meg is not real. And if they are told about her true nature, they have difficulty believing it. Conventions of conversation are what finally reveal that she is a robot. And because Meg is not fluent in them, people may feel that she is rude. A person who arrives in a room where Meg is alone almost always says either "Hi" or "Hi Meg". If the person says "Hi Meg", which is quite common because at least native English speakers use names a lot in conversation (compared, for example, to us Finns), Meg responds to the greeting. And after that comes the trouble. Even though the most common opening of a conversation is "how are you", names are no longer used when there are only two persons in a virtual room. So Meg stops responding just after she has said hello. That feels rude. If the greeting is only "Hi", MegBot does not respond, and the other person, not realizing that she is a bot, might think that the "user Meg" is not near the computer and leave the virtual room. If the user stays in the room waiting for Meg to return, Meg suddenly says: "I have to run. Red keeps me busy. ;(" and then she moves to another room. Now how rude is that? Greeting each other is a normative ritual with the basic notion that when you greet someone, she will return your greeting. If she does not respond, it causes speculation about the reasons for this "odd" behavior (Heritage 1984: 111). Furthermore, if there are more people in a room, Meg responds to the ones who know to ask the right questions but remains silent to the questions of others. That may feel even ruder. But in the end that is the clue that makes them realize that she may be something other than what they first thought.

Some years ago an annoying answering machine joke was pretty widespread. On the recorded tape one imitates the usual course of a conversation, for example in the following way:

- Matt speaking. (Pause)
- Oh hi! How are you? (Pause)
- So, what's up?
...And so on. After the final pause:
- Well, whatever, I'm not home anyway, so please leave a message after the beep.

I implemented the same logic with Laban, the Rastafarian bot. When only one user is in the room and starts the dialogue by greeting him, he turns the "answering machine" on:

- heyyyjah bredda
- How's it rocking? (4 sec. Pause)
- Are deh fancy miss? (15 sec. Pause)
- weh deh live? (10 sec. Pause)

And that simple trick works amazingly well on those who do not know he is a bot. Usually it is enough to get a conversation going. Following the rules of turn-taking organization, Laban says (in this case asks) something that elicits a response and gives the next speaker a context for saying something in return (Suchman 1999: 73). Participants in social interaction are considered morally accountable if they break these practices (Jokinen 1999: 104), whether in meatspace or in cyberspace. Even though the counterpart speaks incomprehensible Rastafarian, one tries to communicate with him.
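The "answering machine" described above can be sketched in a few lines of Python. This is only an illustration, not Laban's actual script: the lines and pauses are the ones quoted above, while the greeting list, the function names and the choice to pause after each line are assumptions.

```python
import time

# Lines and pauses (in seconds) as quoted in the text above.
ANSWERING_MACHINE = [
    ("heyyyjah bredda", 0),
    ("How's it rocking?", 4),
    ("Are deh fancy miss?", 15),
    ("weh deh live?", 10),
]

# Assumed greeting words; the text does not list the real triggers.
GREETINGS = {"hi", "hello", "hey"}


def should_start(user_count, message):
    """Start only when a single user is present and opens with a greeting."""
    words = message.lower().split()
    return user_count == 1 and bool(words) and words[0].strip("!,.") in GREETINGS


def play_script(script, say, sleep=time.sleep):
    """Say each line, then wait, regardless of what the user answers."""
    for line, pause in script:
        say(line)
        sleep(pause)
```

Calling play_script(ANSWERING_MACHINE, print) would print the four lines with the pauses in between. Like the answering machine joke, the sketch does not listen at all; the pauses merely leave room for the other speaker's turn.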

If not total silence, there can be total chaos when everybody talks at the same time. Multiple messages, topics, and reactions appear in no apparent order, making it difficult to understand which comment responds to which preceding remark and whether it is targeted at some particular person in the room or at everybody. Sometimes lag causes the system to freeze for a moment, and as it recovers one gets dozens of messages running down the screen like an avalanche.

Usually overlapping and long pauses are something that we try to avoid in conversation (Jokinen 1999: 110). Some conventions and strategies of CMC work as means to avoid incoherence in turn-taking and to prevent possible misunderstandings. Despite cultural differences in tolerating pauses and using people's names in speech, I would claim that using a name to address a message to some particular person in a room is much more common than it is in meatspace. [6.] Sometimes the aliases that users choose for themselves can be long, hard to type or totally incomprehensible. For example, names like E=mc^[(e/e)+1], .§øª!°&Æ§XoÉ£e\\'z/-\°!&´?§. and &&LiLiaN&ColouRs¤¤%789x would be rather difficult to use. And, as elsewhere, friends in virtual space give nicknames to each other. Usually the nickname is related to the alias, but not always. The point often is that it is easy and fast to type.

If one wants to say something before others take their turn, one strategy is to divide a long sentence into smaller parts (and possibly indicate with an ellipsis that there is more to come). This is a challenge in the field of chatterbot design: if a bot is to make fairly rational responses to a complete thought rather than to one keyword, it should be able to recognize unfinished comments. Messages that are meant for some particular person in a room are another problem. For example, if somebody asks "How old are you?" in The Palace, my bots cannot know whether it is meant for them or for somebody else in the room. In situations where there is more than one user around, Laban is able to tell his age only when the question is "How old are you, Laban?". If the user targets the question at Laban by, for example, moving very close to him, he cannot recognize that it is meant for him [7.] (more about physical distance in chapter 3.1).
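The two problems above, unfinished comments and unclear addressing, can be illustrated with a minimal Python sketch. It is not taken from the bots' scripts: the function names are hypothetical, and the rule "alone with one user, or mentioned by name" is an assumption based on the description of Laban.

```python
import re


def is_addressed(message, bot_name, human_count):
    """Guess whether a message is meant for the bot.

    With a single human in the room the bot assumes it is being spoken to;
    with more people around it requires its own name in the message.
    """
    if human_count <= 1:
        return True
    pattern = rf"\b{re.escape(bot_name)}\b"
    return re.search(pattern, message, re.IGNORECASE) is not None


def is_unfinished(message):
    """Treat a trailing ellipsis as a sign that the speaker will continue."""
    return message.rstrip().endswith(("...", "\u2026"))
```

With three users present, is_addressed("How old are you?", "Laban", 3) is False while is_addressed("How old are you, Laban?", "Laban", 3) is True. Nothing in such a rule can see that a user has moved next to the bot, which is exactly the limitation described above.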

2.2 Language in Motion

A rule of thumb in communication studies is that 90 per cent of emotions are expressed non-verbally (Coleman 1999: 129). What happens to the communication of emotions in chats? The part of the language that is unique to virtual communication is the terminology that expresses the acts and reactions of the body hidden behind the computer. The additional vocabulary in avatar worlds consists mostly of textual expressions that are not needed in real-life discourse. If I leave my computer momentarily, I have to express it somehow because my representation, the avatar, remains visible to others. If I laugh, smile or grin, I have to say so because nobody would know otherwise. The most popular terms have established a strong position in virtual communication and are used so often that many of them have been shortened to abbreviations of one to four letters. The most frequently used of them relate to quickly communicating something that has to be communicated, such as being away from the computer (BRB - be right back, AFK - away from the keyboard, b - back), or to communicating a reaction (LOL - laugh out loud, ROFL - rolling on the floor laughing, g - grin). Some phrases have their own acronyms as well: BTDT - been there, done that; GMTA - great minds think alike. But acronyms are not just about communicating something fast. Knowing them is a status issue that tells something about the experience of the user.

"LOLs" and smileys are not enough to indicate what happens in the body. Often users say something like *smiles* or *cries*; the act of doing something is marked with asterisks or brackets. Often this act is expressed with humorous exaggeration:

*does victory dance*
*backs away slowly*
*growls at the delivery person*
*pokes tongue out*
*whistles while waiting for prog to open*

Among the people I have met on-line and have come to know well, I have noticed that those who are very emotional in "real life" express themselves in this "textual way" more often than those who are more temperate with their emotions. This indicates that the use of verbalized emotions is not just a chat convention adopted by every user but really serves as a means of expression for those who in general have strong emotional reactions. Then again, I have noticed that acronyms like "LOL" and "b" are used in web camera chats as well, even though everybody can see that the other person is smiling or has come back to her computer to join the chat.

On the other hand, incoherence, overlapping and misunderstandings do not seem to bother anyone; rather, people seem to enjoy communicating with each other with the opportunity to break the rules of face-to-face conversation. Susan Herring has observed turn-taking strategies in MUDs (Multi User Domains), and she points out that loosened norms of coherence can be liberating. Language and CMC itself can become objects of humorous play. Additionally, because one comment opens up many simultaneous responses, it enables greater intensity of interaction. And after all, since conversations can always be traced from the log files, every comment can be tracked afterwards (Herring 1999).

Sometimes when emotions are boiling in a chat room, everybody just furiously types out their personal and important opinions and reacts to others only after that. Sometimes a remark gets a response after a much longer period of time than would be possible in face-to-face conversation (in terms of memory or the 'narrative' of the discussion).

Below, "blaze" and "nothing" demonstrate a quite typical way of talking in virtual reality. It seems that this tendency to talk "economically" is getting more common all the time. More and more words are written with minimal effort, and there are more and more users who communicate this way. Partly this evolution of language is supported by the fact that in chats it is a matter of status to know the latest styles and abbreviations. Language changes slowly on a large scale, but chat rooms have generated elements of a new (written) language for themselves. Lately I have seen indications of chat talk spreading into other written media too. For example, people who use chats also write their e-mails using the rhetoric of chats.

b£@Ze: sup
nothing: nothin
nothing: how are you
b£@Ze: chillin
nothing: cool
b£@Ze: u
nothing: ok
b£@Ze: we both got green hair
nothing: yep
nothing: whats your av from
b£@Ze: kof
nothing: how old are you
b£@Ze: king of fighter
b£@Ze: 16
b£@Ze: u
nothing: me too
b£@Ze: oh
b£@Ze: where from
nothing: Pa
b£@Ze: oh
nothing: you?
b£@Ze: fl
nothing: oh
nothing: cool

Actually, the end of the previous conversation could be reduced to one simple question: ASL, which means Age/Sex/Location. In some situations it has been considered a bit awkward to ask everyone for such information, and the responses were often very reluctant. However, because it was a rather common thing to say anyway, especially amongst teenagers, a MyCorner palace god made a filter that manipulated the word ASL. Each time someone in MyCorner said ASL, others saw him/her saying "American Sign Language". The effect he hoped for was that people would stop asking the age, sex and location of others in his palace, but what happened was that understanding what "American Sign Language" really means in this context became a matter of status. Knowing the meaning of an abbreviation or other fashionable new term tells others that one knows all the fashionable tricks - never mind privacy or the once adopted rules of behavior. People kept responding to "American Sign Language" much more willingly than to ASL. To some extent the term also spread from MyCorner to other Palaces: people said American Sign Language in places where there is no filter on ASL.

In general, filters like the ASL filter in MyCorner are used to avert the use of offensive language. Xena: Warrior Palace took them into use after some of its members got upset and traumatized by trolls [8.] who came in just to harass others with very graphic, sexually oriented language. Their solution was to replace the commonly used "unwanted" words with absurd and silly Xena-like words. For example, if someone says "fuck this shit! you are all assholes", others in XWP would see him/her say "zug-zug this centaur poopie! you are all warlords". "Suck my dick bitch" turns into "Suck my battle sword bobo" and so on. The outcome is that offensive language turns humorous, which takes the power away from the harassers (Book 2000). Of course, if someone really wants to say "shit", she can type "sht" or "sh it" and others will know what it means.
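A word filter of this kind can be sketched as a whole-word, case-insensitive substitution. The Python below is an assumption about how such a filter might work, not the Palace server's actual mechanism; the table contains only the ASL example from the text, and the names FILTER and apply_filter are hypothetical.

```python
import re

# Replacement table; only the ASL entry comes from the text above.
FILTER = {"asl": "American Sign Language"}


def apply_filter(message, table=FILTER):
    """Replace every filtered word with its substitute.

    Matching is case-insensitive and limited to whole words, so a word
    that merely contains a filtered string is left alone.
    """
    alternatives = "|".join(re.escape(word) for word in table)
    pattern = re.compile(rf"\b({alternatives})\b", re.IGNORECASE)
    return pattern.sub(lambda match: table[match.group(0).lower()], message)
```

A message such as "asl anyone" would be shown to others as "American Sign Language anyone", while a spaced-out variant like "a s l" passes through unchanged. That is the same loophole as the "sh it" spelling mentioned above.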

Sticky Language

When my grandmother spent a lot of time with her eight grandchildren ranging from prepuberty to post-puberty, she suddenly started to copy our slang. It was all wrong but kind of cute and really funny.

The use of filters is an obvious example of controlling what can and cannot be said at the level of language. Sometimes they have effects beyond the technology in which they are used. Some regulars of XWP, for example, use "centaur poopie" elsewhere too. In this case it is about fun and about sharing a common language as an inside joke of a community.

Zuben: *hugz*
.::ÜPhoenixÜ::.: *hugz*
Zuben: watcha dooin?
.::ÜPhoenixÜ::.: les go to aother palace
Laban: Dem is a no good bunch
Zuben: lol
Laban: Agony!
Zuben: ok
Zuben: where
.::ÜPhoenixÜ::.: anywere
Zuben: I don't know as many as you
.::ÜPhoenixÜ::.: anywere
* Zuben * say what are you lagga-heads doin down dere, no boombtastic in here! [whispering]
Laban: what are you lagga-heads doin down dere, no boombtastic in here!
.::ÜPhoenixÜ::.: ok les leave him lol

Picture 2. Phoenix uses Laban to get Zuben to follow him to another Palace.

Users very quickly find out one of Laban's triggers: whenever somebody says "say", Laban repeats everything said after that word. Very quickly they also realize that if you whisper this trigger to Laban, he says it out loud and nobody knows who originated the message. Zuben in picture 2 was developing some kind of romantic attachment to Phoenix and wanted her to follow him to another palace. With an amazingly good imitation of Laban's language he commanded my bot to say "what are you lagga-heads doin down dere, no boombtastic in here!". If I had programmed my bot to say something like "what are you guys doing here, this is not a fun place", Zuben's words would probably have been very close to the ones I would have used.
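The "say" trigger described above amounts to an echo rule. The following Python sketch is a guess at its logic, with hypothetical function names; the real trigger lives in the bot's own script, which is not shown in this chapter.

```python
import re


def say_trigger(message):
    """Return the text following the word "say", or None if there is none."""
    match = re.search(r"\bsay\b\s*(.*)", message, re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    rest = match.group(1).strip()
    return rest or None


def public_line(bot_name, message):
    """Format what the room sees; the sender of the trigger is not shown."""
    text = say_trigger(message)
    return None if text is None else f"{bot_name}: {text}"
```

Because public_line never receives the sender's name, a whispered command comes out as the bot's own words, which is why nobody in the room can tell who originated the message.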

It often happens that when people see Laban talking Rastafarian they start to imitate him. For a bot to be functional, it has to understand the words that are in its own vocabulary. In the case of Laban, people who adopt his language can also create totally new meanings for the words. For example, Laban reacts to laughing (keywords: lol, lmao, rofl, heehee, hehe, haha) by randomly saying one of the following: Agony!, deestant!, carry on bredda!, ROCKERS! or roflmao. A group that got really enthusiastic about him and spent hours playing with him started to greet each other by saying deestant. Sometimes when users get carried away they are able to take the new language, combine it with, for example, their knowledge of reggae culture, and turn it into something creative and unique in terms of language and narrative, and the outcome is totally unpredictable. Most often people distinguish a chatterbot from humans when they notice that he replies much faster than anybody could type. Then many start to test and play with him. Laban understands plain English and common chat acronyms but only a little of his own language. This is certainly a flaw in his character. If his keywords included the words that he uses himself, he would be able to produce much more complex (and surreal) conversations with users who are receptive to Rastafarian language. Cupid's speech is sticky too, but in another way. When he recites poetry, users start to copy and paste love poems and songs to him.
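Laban's reaction to laughter can be sketched as a keyword lookup followed by a random choice. The keywords and replies below are the ones listed above; the function name, the punctuation stripping and the whole-word matching are assumptions made for the sake of illustration.

```python
import random

# Keywords and replies as listed in the text above.
LAUGH_KEYWORDS = {"lol", "lmao", "rofl", "heehee", "hehe", "haha"}
LAUGH_REPLIES = ["Agony!", "deestant!", "carry on bredda!", "ROCKERS!", "roflmao"]


def react_to_laughter(message, rng=random):
    """Return a random reply if the message contains a laugh keyword."""
    words = {word.strip(".,!?*").lower() for word in message.split()}
    if words & LAUGH_KEYWORDS:
        return rng.choice(LAUGH_REPLIES)
    return None
```

The sketch also shows the flaw mentioned above: react_to_laughter("deestant") returns None, because the bot's own expressions are not among its keywords.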

2.3 Challenges in Chatterbot Design

When a little child throws a ball on the floor and you pick it up and give it back to her, she throws it again. You can do this a hundred times in a row and the child never gets tired.

When people discover a new trigger for a bot, they can repeat it endlessly. It seems to be great fun for them, but it does not really take the conversation anywhere. Rather, the discussion sometimes narrows down to the few keywords users know the bot has, which leaves most of the bot's communicative capacity unused. That is not necessarily a bad thing. In the early stages of Laban's development I took him to 'Welcome Palace'. He was killed there for a period of two days because his language was not appropriate according to the community's rules of behavior. Before that, however, a few of the users there had gotten so enthusiastic about him that they followed him to MyCorner. For days they gathered around him playing and joking, and when I came in the morning to check his log files there were just pages and pages of laughing.

An intervention by a chatterbot can be something that breaks the peaceful, mundane existence of a community. Its arrival in a new place, and its recognition as something unusual, forces others to react (in many cases not reacting is a reaction as well). Cupid's premiere in Lady Luck was a chaotic one. The community knew that it would be getting a bot, but when it finally arrived the reactions varied from amusement to wrath. The conversation that cupid then had with those present was actually pretty smooth in terms of fluent speech, but it managed to demonstrate his then very straightforward, uninhibited approach to sexuality and flirting. (Later on I made cupid more romantic.) One of the regulars got so furious and offended that she threatened to leave the whole community for good. Her claim was that he was too rude and his language was too graphic. I think that there was more to this. In a few minutes cupid managed to make something of Lady Luck's discourse visible. This outside automaton came and showed them, in an exaggerated way, how they talk.

It seems that the communities that have my bots as residents (Lady Luck for cupid and MyCorner for Laban) have gotten a bit bored with them. Both bots have triggers that make them move to another room, and people nowadays chase them away rather quickly. After a while regulars learn their triggers and responses, and since I have not changed their personalities much during the time they have been online, both bots have ceased to be surprising and hence are less fun. At that point they end up as mere programs that interfere with conversations with their endless babbling. From this perspective a step towards MegBot's logic could be fruitful. She moves from room to room in MyCorner and stays only a limited time in each. With the exception of laughing with others, she speaks only when she is addressed by name. This, on the other hand, makes her just a harmless mascot. How does one make a non-annoying chatterbot that has its own will? I am calling for an interactive character that is not just a puppet but somebody who can act according to its own characteristics, which should not always be just reactions to others' commands but the actions of a more independent persona. In the case of interactive art, for example installations and slowly growing art pieces on the internet, users often expect some kind of logic according to which the piece reacts to their input. The artist's point of view, then again, is often that the piece or a character in it has its own will and that the logic of interaction is more complicated than first meets the eye, which often seems to cause frustration or discontent. When newcomers recognize a new bot in The Palace, a common reaction is to start guessing the keywords. Usually their success rate is quite low, but the minute they get bored and start to talk normally with each other, the bot starts talking. Sometimes the bots talk too much. Deciding which keywords to use is a challenging design problem. It is about balancing the amount of the chatterbot's speech in a bigger crowd against its ability to participate in the conversation in a somewhat rational way.
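MegBot's restrained logic, as described above, can be summarized in a small policy object. This Python sketch is purely illustrative: the class name, the laugh keywords and the time limit are assumptions, since the text gives neither Meg's keyword list nor the length of her stay in a room.

```python
from dataclasses import dataclass

# Assumed laugh keywords; the text does not say which ones Meg reacts to.
LAUGH_WORDS = {"lol", "haha", "hehe"}


@dataclass
class MegPolicy:
    name: str = "Meg"
    max_seconds_in_room: int = 120  # assumed value, not given in the text

    def should_speak(self, message):
        """Speak only when addressed by name or when others are laughing."""
        words = {word.strip(".,!?").lower() for word in message.split()}
        return self.name.lower() in words or bool(words & LAUGH_WORDS)

    def should_move(self, seconds_in_room):
        """Leave for another room once the time limit is reached."""
        return seconds_in_room >= self.max_seconds_in_room
```

Under this policy "Hi Meg" gets an answer and a bare "Hi" does not. Loosening should_speak makes the bot more talkative in a crowd; tightening it makes the bot quieter but also more of a mascot, which is the balance discussed above.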

Tamar Liebes and Elihu Katz have studied how people from different cultures watched the TV show Dallas. They noticed two different ways in which people reacted to the show and its characters: referential and critical. In referential readings viewers considered the characters to be real people and compared them to the real people in their environment. In critical readings the show was considered a fictive construction with its own aesthetic laws. Katz and Liebes divided critical readings into semantic and syntactic readings. In semantic readings the focus was on the themes of the show and the didactic goals of the authors, characters were recognized as archetypes, and the way the show reflects reality was analyzed. Syntactic readings were connected to the viewers' knowledge of the genre's conventions: commercial aspects were recognized, as were the dramatic functions of characters and plot, and one's own reactions were analyzed (Lehtonen 1998: 208-209).

Bots are read in the same way. There are differences, however, because the user does not always recognize the automaton but thinks that the bot is human. Those who mistake the bot for a human of course expect it to behave, communicate and react like a human. Sometimes, though, especially in the case of cupid, the bot has been read as a human who has created a role for himself. It is like a costume party. He is treated as someone playing cupid and allowed more space to "practice his profession" as the god of love.

When the bot is known to be a bot, some users expect the same kind of behavior from it as they do from humans (referential reading). All the same expectations of manners and proper language apply to it, and if it does not behave, it gets punished the same way as human "hooligans" in virtual reality. Some take a semantic approach to my bots. Both Laban and cupid can be read as archetypes or at least as very stereotypical caricatures; they are expected and interpreted to act in certain ways as a "sexist male", "drug user", "horny gay" etc. As their author, I am being analyzed too. The characteristics of my bots have been attached to me, and I have been defined as someone with a twisted sense of humor and a lot of time on his hands. Syntactic readers compare Laban and cupid to other bots. A bot's presence is sometimes ignored too, no matter how much it babbles - people can choose not to read it at all. Still, the bot is usually allowed into the ongoing discussions to some extent; its opinions are asked and its responses noted. Mostly the reading of the bot is selective. The bot is noticed only when its responses somehow support the user's interests in the ongoing discussion.

How should different kinds of users be taken into account in chatterbot design? My decision was easy with Laban and cupid. They are not humans. But neither are they pronouncedly chatterbots. If someone asks cupid "Are you a bot?" he replies "well, at least I am not human" or "I am the god of love". Is there an ideal user from the point of view of the bot? What should be the criteria? The quality of the conversation? Or the length of it? In the case of my bots, I am happy when they manage to provide entertainment, no matter for how long. When cupid is read as the god of love and Laban as a Rastafarian, they have managed to manifest their designed character. At best they can create a sense of togetherness amongst the people who share the space at some particular time. If the bot said out loud during the introductions something like "pardon my stupidity, I am just a bot", it would hardly help to prove that chatterbots could actually have some importance or a right to exist in cyberspace.
