Imagine this scenario: You hear the buzz of conversation as you near the local pub, so you walk in. Immediately, since you're a 7'9" 300-pound herculean adventurer, the locals gawk at you and temporarily hush up. Then, as they figure you're not going to start slaughtering anyone, you walk up to the bar and sit down. All around you, locals are making small talk about the price of wheat, the thief that was lynched the other night... typical backwater drudgery.
The bartender approaches and engages you in conversation. Since he is directly within your targeting range (imagine 2.5 ft radius) the chat also appears in your chat reader. You can clearly hear his words over the crowd.
Basically, the tech design here would require realistic 3D sound emitters, a text-to-speech engine, and a localized AIML set (dynamic, of course, so that the NPCs aren't totally stupid). Real-time phonetics emulation would be an option to turn on/off based on performance... but having realistic conversations with NPCs (not perfect, AIML is dumb sometimes) would be awesome.
It seems to me that running an AI server to handle all the 'Chat' AI functionality would offer one helluva lot to the immersion. All that's needed is a script that handles efficient conversation targeting and a "Text In/Sound Out" script that reads the chat from the AI server and tags the source. Everything else would be fairly standard for MMO NPCs.
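For what it's worth, the conversation targeting part could be sketched roughly like this. It's only a minimal sketch, assuming flat 2D positions measured in feet; the `Speaker` class, `route_chat` function, and channel names are made up for illustration and aren't from any existing engine.

```python
import math
from dataclasses import dataclass

TARGET_RADIUS_FT = 2.5  # the targeting radius imagined in the scenario above


@dataclass
class Speaker:
    """Hypothetical stand-in for a player or NPC with a 2D position in feet."""
    name: str
    x: float
    y: float


def route_chat(listener, speaker, text, radius=TARGET_RADIUS_FT):
    """Decide where one line of NPC chat goes for this listener."""
    distance = math.hypot(speaker.x - listener.x, speaker.y - listener.y)
    if distance <= radius:
        # Inside targeting range: tag the source and send the text to the chat reader.
        return {"channel": "chat_reader", "source": speaker.name, "text": text}
    # Outside targeting range: treat the line as crowd noise with no readable text.
    return {"channel": "ambient", "source": speaker.name, "text": None}
```

The returned dictionary is what a "Text In/Sound Out" script would consume: the `source` tag tells it which sound emitter to play the line from.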
I'm thinking of adding phonetics-based "Voice Styles" and different voice types, for accents, customization, and a real feeling of having A Person on the other side of the pixels.
I understand the limitations of AIML, but the overall structure makes it ideal for NPC chat, and I was wondering if anyone has looked into this at a serious level. Are there games out there using this approach, or does it eat too much bandwidth and CPU?
I'm still in a conceptual process, but the idea seems sound. Anyone see any glaring errors?
You could also allow players to submit ideas for additions to the AIML. Combine this with a relatively large community (5000+) and you're guaranteed to snag some obsessive-compulsive gamer who will devote dozens of hours to developing perfect conversational AIML sets for your NPCs.
Ok, that would be unwarrantedly taking advantage of a sick person for corporate gain. :whistle:
Anyway, what are some thoughts?
An XML format for defining automated answers to questions. (Like a chatbot)
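To make that concrete, here's a minimal sketch of how a `<category>` (a `<pattern>` plus a `<template>`) turns input into a reply. The three categories are invented for this example, and the matcher is deliberately simplified: it tries categories in file order and only understands the `*` wildcard, whereas real AIML interpreters have their own matching priorities and many more tags.

```python
import xml.etree.ElementTree as ET

# A tiny invented AIML set for a tavern NPC.
AIML_SAMPLE = """
<aiml version="1.0">
  <category>
    <pattern>HOW ARE YOU</pattern>
    <template>I'm great, how are you?</template>
  </category>
  <category>
    <pattern>WHAT IS THE PRICE OF *</pattern>
    <template>Too high, if you ask me.</template>
  </category>
  <category>
    <pattern>*</pattern>
    <template>Can't say I know much about that.</template>
  </category>
</aiml>
"""


def load_categories(aiml_text):
    """Return (pattern, template) pairs in the order they appear in the file."""
    root = ET.fromstring(aiml_text.strip())
    return [
        (cat.findtext("pattern").strip(), cat.findtext("template").strip())
        for cat in root.iter("category")
    ]


def normalize(text):
    """Uppercase the input, drop punctuation, and split it into words."""
    kept = "".join(ch for ch in text.upper() if ch.isalnum() or ch.isspace())
    return kept.split()


def matches(pattern_words, input_words):
    """True if the pattern covers the whole input; '*' stands for one or more words."""
    if not pattern_words:
        return not input_words
    head, rest = pattern_words[0], pattern_words[1:]
    if head == "*":
        return any(matches(rest, input_words[i:]) for i in range(1, len(input_words) + 1))
    return bool(input_words) and input_words[0] == head and matches(rest, input_words[1:])


def respond(categories, text):
    """Return the template of the first matching category, or '' if none match."""
    words = normalize(text)
    for pattern, template in categories:
        if matches(pattern.split(), words):
            return template
    return ""


categories = load_categories(AIML_SAMPLE)
print(respond(categories, "How are you?"))
print(respond(categories, "What is the price of wheat?"))
```

The catch-all `*` category at the end is what keeps the NPC from going silent on input it has no specific answer for.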
Embedding an Alice-like bot into the game sounds fun, as long as getting all the data for all the different NPCs is not a challenge. Text-to-speech really isn't that wonderful, so unless the theme is 'robot world' maybe leave that out (which would make development a lot easier, free up CPU, etc.).
Make a little avatar demo and post it
Text to speech just lacks polish. I'm not looking for a universal system, just a very broad, closed system approach to add depth to a gameworld where it was previously lacking. And yeah, I'll do up a simple avatar over the next few weeks as a kind of demo.
My goal is, at this point, a realistic virtual tavern. Something like John's, if you've ever read Feist's Serpentwar series.
Anyway, AIML seemed to me to be the easiest route to realistic conversation. On an MMO scale, it would require a huge database and a server all its own, but for my purposes the CPU usage shouldn't be bad at all.
Check out [www.Alicebot.org](www.Alicebot.org) to learn more of AIML.
Sounds like a cool idea. I'd like to see a demo of it too when you're done.
AT&T Natural Voices and some other company sell very high quality voice packs. I honestly thought they were very convincing from their demos. It might be worth investigating if you go along with this.
Blah. I was shown a chatbot demo the other day by a friend of mine, thinking that the text-to-speech was computer generated. It was a library of recorded responses, which is where the tone and inflection really came through and outshone the CG stuff (like the AT&T pack.)
I ran a basic bot voice into one of the Microsoft avatars just to test out the voices I could find (I'm not yet willing to put money into this if there are no good voice collections out there).
It's standard bot text-to-speech, and it r sux.
Hiring people and recording their recitals of AIML sets with proper inflection is the only way to really get this idea to work.
Word to the wise: 15 chat bots talking in a virtual room with text-to-speech sounds like crap. Especially if you are unfamiliar with sound programming.
A project like this requires more time and effort than is worthwhile to me at this point. I'm still keen on the idea of AIML-driven NPCs. Just not having them speak. Btw, AIML set conversations are really stupid, as well.
How are you?
I'm great, how are you?
I'm great, how are you?
The day is looking fine.
The day is.
Is the day is what?
It's probably my implementation that sucks, and I only used the standard ALICE brain, although if you've played with AIML at all, you'll have come to the same conclusion I have: ALICE is pretend-AI.
Maybe I should leave this idea in the attic to collect dust until such a time as we have real (or real enough) AI to deal with something of this nature.
Or until I have \\$20,000 to shell out to voice actors to record huge AIML sets.
Btw, I discovered CyN, which is an AIML integration with OpenCyc... a very interesting case for developing a so-called strong AI out of AIML.
@ Ooka: I'm really glad I found this thread. I've been dreaming of seeing this hugely important feature in a future MMOG. Unfortunately, sound never seems to get the attention it deserves, so I'm not gonna hold my breath.
Do you guys think we'll have talking NPCs/mobs/pets in MMOGs by 2010?
Actually, I think that it would take several dozen voice actors to make this a doable idea. You'd need to set up scripted responses to almost any type of question in-game, which would be a very large effort, and then you'd give each actor a recorder and have them speak each phrase numerous times, covering any of the contexts in which the phrase could be spoken. Different accents (Scottish accent for dwarves, British for elves, etc.) and different tones would need to be covered as well.
Once you had the library of responses built up, you could build up a believable conversational system. It would probably cost upwards of \\$15000 to complete it, but once you did, you could not only use it directly in a game, but you could apply learning algorithms to it and have the computer generate its own voices.
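A rough sketch of how such a recorded library might be indexed, assuming each take is filed under a response id, an accent, and a tone. The clip file names, the `CLIPS` table, and the `pick_clip` helper are all hypothetical, just to show the lookup with a fallback to a neutral take.

```python
# (response id, accent, tone) -> recorded clip. Every entry here is invented.
CLIPS = {
    ("greeting", "scottish", "friendly"): "dwarf_greeting_friendly.ogg",
    ("greeting", "scottish", "neutral"): "dwarf_greeting_neutral.ogg",
    ("greeting", "british", "neutral"): "elf_greeting_neutral.ogg",
}


def pick_clip(response_id, accent, tone, clips=CLIPS):
    """Return the recorded take for this context, or None if nothing was recorded."""
    exact = clips.get((response_id, accent, tone))
    if exact is not None:
        return exact
    # No take in the requested tone: fall back to the neutral take in the same accent.
    return clips.get((response_id, accent, "neutral"))


print(pick_clip("greeting", "scottish", "friendly"))
print(pick_clip("greeting", "british", "angry"))
```

The fallback matters for cost: every tone you don't record is one less session per actor, at the price of flatter delivery.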
It's an aspect of immersive game technology that hasn't yet been developed because of its complexity. The question comes down to this: is it worth the time and money to invest in the technology, or do we let it advance on its own until voice libraries are as developed and freely available as graphics libraries?
Linguistics is far more complex than graphics. In fact, I'd say that it's by far the most complex system short of strong AI that relates to gaming.
2010 might be possible, but I'd say we will definitely have it by 2015.
Check out Cyc, for example. Each piece of "common sense" knowledge is hand-fed into the knowledge base to verify its accuracy and pertinence. The system is very specifically designed to allow the engine to infer things about context. A similar system would have to be employed in order for a voice-chat program to be believable and accurate.
Anyway, the first company/person that develops it is going to make a killing, because once one game has it, every game will need it. Just like 3D graphics. And there will be lots of crappy spinoffs and imitations that will lead to further discoveries and improvements in the field.
This has given me another idea, which I'm going to develop in a separate thread.
You're right Ooka, whoever develops this thing will make a killing... the whole development team could retire!
Well, it's good to know that someone's moving on it.
Check out MegaHAL The Ultimate ChatBot.
Definitely some funny stuff, there. Hal can seem almost human, at times.
I prefer CyN, however, and have actually had some decent conversations with it, which it remembers. Hal always forgets me.
I mused about replacing the typical adventure game dialogue tree with code that actually simulates the conversation, taking into account factors such as the NPC's emotional state and knowledge base, and generates his or her sentences on the basis of the actual grammar, but then I realised that:
a) The algorithm would be too complex for such an amateur as myself to ever conceive, and
b) I'd have to create at least one full vocabulary of a given language for the thing to work. Given that a typical vocabulary for any given language runs to several thousand nouns at the very least, good luck typing them all into the computer.
As you might imagine, I no longer even consider this possibility.
That's why you use things like WordNet, or the OpenCyc database. Millions upon millions of terms that have been given a standardized format for use in computer/AI related apps.