The following letter fits right into this month’s topic on procedural generation. I’ve edited (just a little) for length:
Hi Emily! I read your chapter in the Procedural Generation in Game Design book, and was really impressed. I tried to follow up on some of the sources you mentioned (e.g. the Spy Feet game) but I wasn’t able to get a lot of details, and we have a pretty specific use case, so I’d love to beg a moment of your time to get me pointed in the right direction. Or, if answering my question properly takes more than a moment, I’d be happy to talk about a consulting fee…
We’re doing a bunch of what I’d call dynamic writing, which you can read more about here or on our wiki if you’re interested in the specifics. We have procedurally generated characters (heroes in a fantasy setting) with personality stats tied to their histories, and our system allows writers to take those personalities (and other details) into account in two main ways. The first is by picking who takes what role in any given story (e.g. the highest goofball stat in the party might be picked to be telling the joke in a particular story), and the second way is by inserting markup in the text to add variations for specific personality traits (or relationship status, class, age, etc.). For example, we can say things like, if the leader is more bookish, they’ll say something academic, but if they are more hothead, they’ll say something aggressive. This markup is also how we handle gendered words and attraction.
One of the things our game supports (due to the 2D art style and just the stories we want to tell) is really dramatic character transformations, like, to take a simple example, you might find a wolf shrine, and make a deal with the wolf god, and get your head replaced with a wolf head. Now you have a bite attack, cool. But it would be great if we could alter the character’s speech to reflect their condition. Likewise for other conditions or origin stories, or frankly (eventually, maybe) personality quirks.
Clearly, modifying all of our stories to deal with wolf-head characters is not going to be a winning approach. Or, we could do it for the wolf, but then what about the crow, the cat, the tree, the snake, etc.? So that leads us to procedural language. And that’s where I was, reading your chapter a couple weeks ago, where you talk about token replacement for things like gestures, exclamations, and hedges, and the “drunk filter” for modifying text after the fact.
I’d love to go the extra mile and integrate this level of personality into our game. I think it would feel really special to players. The filter approach sounds the most promising to me, for our game. We’re already fairly far along in production, and while we could talk about tokenizing our text, I’d be concerned that it would take a lot of training, be bug prone, and would cramp our writers’ style. On the filter side, I can imagine some simple approaches, find-and-replace schemes and such. For example the wolf might have a rule to 60% of the time, replace a single ‘r’ with a double ‘rr’.
So I suppose my question is this. Is there some existing research on this sort of thing? Can you recommend an approach, or best practices? On the other hand, is there an open data set (computational linguistics??) that you know of that would be of interest? I don’t really know where to start in this field, I just ended up here :-) So I am looking for some guide posts. It seems like a problem that’s in your wheelhouse?
— Nick Austin
In my own projects, I’ve most often used filter effects to reflect transient states like the character being cold, drunk, nervous, etc., so that I’m layering that over their other word choices at the very end of the text generation process. For instance:
- Slurring or lisping
- Adding disfluencies, pauses or hesitation words (“Uh, …”)
- Sprinkling in the occasional cough or bit of barked laughter
Those are all pretty easy to do, and in line with your r/rr replacement idea. For cases where characters might be communicating through social media, it’s also possible to sprinkle in expressive emoji that put their own twist on dialogue choices. It looks like that would be less suitable for your project, though.
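A character-level filter along the lines of the r/rr idea can be sketched in a few lines. This is a minimal Python sketch, not code from any of the projects mentioned here: the function name `growl_filter` and its parameters are made up for illustration, and the 60% default is simply the figure from the letter.

```python
import random
import re


def growl_filter(text, probability=0.6, rng=None):
    """Double each standalone 'r' with the given probability (hypothetical helper)."""
    if rng is None:
        rng = random.Random()

    def maybe_double(match):
        letter = match.group(0)
        # Each 'r' is rolled independently, so some get doubled and some do not.
        if rng.random() < probability:
            return letter + "r"
        return letter

    # Only match an r/R with no r on either side, so "hurry" is left alone.
    return re.sub(r"(?<![rR])[rR](?![rR])", maybe_double, text)


print(growl_filter("Run for the river, brother."))
```

Passing in a seeded `random.Random` makes the output repeatable, which helps when comparing settings side by side.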
Another effect that works decently well on written text is to swap out just the punctuation sometimes — having a punchier character upgrade instances of “!” to “!!”, or a more uncertain character turn “.” into “…”. Using those modifications some of the time can subtly inflect how the reader perceives your characters’ speech without actually changing any words.
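The punctuation swap can be handled the same way. The sketch below is an assumption about how one might do it in Python, with a hypothetical `punctuation_filter` function and a caller-supplied mapping. It only touches a lone "!" or "." at the end of a word, so existing "..." or "!!" runs stay as they are, though it will still be fooled by abbreviations like "e.g."

```python
import random
import re


def punctuation_filter(text, replacements, probability, rng=None):
    """Swap sentence-final marks per `replacements`, some of the time (hypothetical helper)."""
    if rng is None:
        rng = random.Random()

    def swap(match):
        mark = match.group(0)
        if mark in replacements and rng.random() < probability:
            return replacements[mark]
        return mark

    # A single "." or "!" not preceded by other punctuation,
    # and followed by whitespace or the end of the text.
    pattern = r"(?<![.!?])[.!](?=\s|$)"
    return re.sub(pattern, swap, text)


# A punchier character and a more uncertain one, using the same line.
line = "We leave at dawn. Bring the rope!"
print(punctuation_filter(line, {"!": "!!"}, probability=0.5))
print(punctuation_filter(line, {".": "…"}, probability=0.5))
```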
If you’re running filters, you could also swap out particular phrases even if your dialogue isn’t tokenized. If your characters mostly invoke a particular deity when upset, for instance, you could replacement-filter all the “Oh Thor!” invocations to a specific crow deity. I mostly don’t do that on my projects because I’m usually controlling whether “Oh Thor!” gets generated at all. But it’s an option for cosmetic retouching when the core writing has already been done.
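A phrase-level replacement filter might look like the sketch below. It is illustrative only: `replace_phrases` is a hypothetical helper, and "Great Crow" is a made-up deity name standing in for whatever the setting actually uses. Matching on word boundaries keeps "Thor" from being replaced inside a word like "Thorough".

```python
import re


def replace_phrases(text, substitutions):
    """Replace whole-word phrases using a per-character lookup table (hypothetical helper)."""
    if not substitutions:
        return text
    # Try longer phrases first so "Oh Thor" wins over "Thor" when both are listed.
    keys = sorted(substitutions, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda match: substitutions[match.group(0)], text)


# Assumed table for a crow-headed character; the names are invented.
crow_phrases = {"Thor": "Great Crow"}
print(replace_phrases("Oh Thor! That was thorough.", crow_phrases))
```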
Finally, I’ve also worked with instances where we were adding more complex phrases into dialogue — e.g.
- aggressiveness, like “If you must know…” before an assertion
- hedging, like “I think”, “I heard”, “as far as I know”, etc.
but this soon comes into territory where the effects are repetitive and annoying if done carelessly, and to do them well often requires more markup of the sentence grammar, so that you know where in the sentence you can safely insert these elements. So this sort of thing might not be a good fit for you in general.
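To make the difficulty concrete, here is a deliberately naive Python sketch of hedge insertion without any grammar markup. Everything in it is an assumption for illustration (the `add_hedges` name, the sentence splitter, the list of words to keep capitalized). It only prepends hedges to sentences ending in a period and never hedges two sentences in a row, which limits repetition but does nothing about sentences where a hedge makes no sense.

```python
import random
import re

DEFAULT_HEDGES = ["I think", "I heard", "As far as I know,"]
# Words that must stay capitalized when they stop being sentence-initial.
# A real game would need character and place names here too.
KEEP_CAPITALIZED = frozenset({"I", "I'm", "I'll", "I've", "I'd"})


def add_hedges(text, probability, hedges=DEFAULT_HEDGES, rng=None):
    """Prepend a hedge to some declarative sentences (naive, hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    # Crude sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    result = []
    hedged_previous = False
    for sentence in sentences:
        can_hedge = sentence.endswith(".") and not hedged_previous
        if can_hedge and rng.random() < probability:
            first_word = sentence.split(" ", 1)[0]
            if first_word not in KEEP_CAPITALIZED:
                sentence = sentence[0].lower() + sentence[1:]
            sentence = f"{rng.choice(hedges)} {sentence}"
            hedged_previous = True
        else:
            hedged_previous = False
        result.append(sentence)
    return " ".join(result)


print(add_hedges("The bridge is out. We should go around. Is it far?", 0.5))
```

Even this small version needs a list of words that must stay capitalized, which hints at how quickly the bookkeeping grows once hedges are allowed anywhere other than the start of a sentence.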
If you want to look at some academic research, this paper presents some of Lyn Walker’s research (the same researcher who did the SpyFeet project), associating different language tics with personality traits in the Big Five (OCEAN) personality model.
Then there’s an interesting (if not especially cheap) book called Computational Paralinguistics that looks at markers of speaker traits and status, such as gender, age, personality, intoxication, emotion, etc., in both spoken and written dialogue.
This contains many enlightening and even entertaining segments, like the portion where they describe how they collected enough corpus data to do computational studies on how speech changes as the speaker gets more drunk. It also suggests a number of corpora for specific areas, though some of these, like the promising-sounding “Speaker Personality Corpus,” focus on speaker audio, so it might be more challenging to extract conclusions about straight textual content.
Finally, a bit of advice about approaching this kind of procedure in general. This may be obvious, but: a little bit of filtering/personality intervention often goes a long way, and it’s easy to produce terrible results by overdoing it. Whatever implementation you go for, I’d recommend setting it up so that you can easily adjust the probability of firing for each effect separately (maybe wolf R replacements look fine if used 33% of the time but the crow’s CAW! interjections are annoying at anything more than 10%).
Expect to iterate a bunch, and layer in effects one at a time so you can get a sense of what difference they’re making separately.
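One way to keep each effect separately tunable is to describe effects as data and run them through a small pipeline. The Python sketch below is an assumption about how that could be organized, not a description of any shipped system: `Effect`, `run_filters`, `double_r` and `add_caw` are all hypothetical names, and the 33% and 10% figures are the illustrative numbers from above.

```python
import random
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Effect:
    name: str
    probability: float  # chance of firing at each opportunity
    apply: Callable[[str, float, random.Random], str]
    enabled: bool = True  # switch off to judge the other effects in isolation


def double_r(text, probability, rng):
    # Roll once per standalone r/R; double it when the roll succeeds.
    def roll(match):
        return match.group(0) + "r" if rng.random() < probability else match.group(0)
    return re.sub(r"(?<![rR])[rR](?![rR])", roll, text)


def add_caw(text, probability, rng):
    # Roll once per sentence; add an interjection after it when the roll succeeds.
    out = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        out.append(sentence)
        if sentence and rng.random() < probability:
            out.append("CAW!")
    return " ".join(out)


def run_filters(text, effects, seed=None):
    """Apply every enabled effect in order, sharing one random generator."""
    rng = random.Random(seed)
    for effect in effects:
        if effect.enabled:
            text = effect.apply(text, effect.probability, rng)
    return text


effects = [
    Effect("wolf_r", 0.33, double_r),
    Effect("crow_caw", 0.10, add_caw),
]
print(run_filters("Run to the river. They are right behind us!", effects, seed=7))
```

Keeping the probabilities in one list makes it easy to tune a single number, or flip `enabled` on one effect at a time, and rerun the same seed to see what changed.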
A few other related resources:
- Tanya Short’s talk from GDC 2018 about modular character design for procedural generation
- The Mary Jane of Tomorrow does not do this kind of text filtering, but it uses a grammar in which, for instance, her uses of the word “the” may be replaced by “ye” when she’s in her medieval speech mode, something that could just as well have been done via a filter after the main generation was complete
- Liza Daly’s A Physical Book project models alterations to how words are printed on a page. This one’s not about dialogue at all, but it is about procedural filtering to add an additional layer of meaning to an existing text