I’m curious: do you follow much research that happens in stories and dialog these days? In the world of machine learning research, there’s much less in dialog and stories than other areas (e.g. image generation/recognition or translation), but once in a while, you come across some interesting work, e.g. Hierarchical Neural Story Generation (by some folks in Facebook AI).
For some years now I’ve followed work coming out of the UCSC Expressive Intelligence Studio; work done at Georgia Tech around crowdsourced narrative generation; game industry applications introduced or covered at the GDC AI Summit (though it is rarer to see extensive story-generation work here). I’ve also served on the program committees for ICCC and ICIDS and a few FDG workshops; and am an associate editor on IEEE Transactions on Games focused on interactive storytelling applications. Here (1, 2, 3) is my multi-part post covering the book Interactive Digital Narrative in detail.
That’s not to say I see (or could see) everything that’s happening. I tend to focus on things that look most ready to be used in games, entertainment, or chatbot applications, especially those that are designed to support a partially human-authored experience. I also divide my available “research” time between academic work and hands-on experiments in areas that interest me.
So with that perspective in mind:
- I’m not attempting a comprehensive literature review here! That would be huge. This coverage cherrypicks items
- I will go pretty lightly on the technical detail since the typical readership of this blog may not be that interested, but I’ll try to provide summary and example information that explains why a given item is interesting in my opinion, and then link back to the original research for people who want the deeper dive
- I’ll actually start with a brief summary of the paper the questioner linked
- Even with cherrypicking, there is a lot to say here and I am breaking it out over multiple posts
That Initial Paper
For other readers: the linked article in this question is about using a large dataset pulled from Reddit’s WritingPrompts board and a machine learning model that draws on multiple techniques (convolutional seq2seq, gated self-attention). After training, the system is able to take short prompts and create a paragraph or so of story that relates to the prompt. Several of the sample output sections are quite cool:
But they are generating surface text rather than plot, and the evidence suggests that they would not be able to produce a coherent long-term plot. Just within this dialogue section, we’re talking about a tablet-virus-monster object, and we’ve got a couple of random scientist characters.
Inform 7 is used in a number of contexts that may be slightly surprising to its text adventure fans: in education, in prototyping game systems for commercial games, and lately even for machine learning research.
TextWorld: A Learning Environment for Text-Based Games documents how the researchers from Tilburg University, McGill University, and Microsoft Research built text adventure worlds with Inform 7 as part of an experiment in reinforcement learning.
Reinforcement learning is a machine learning strategy in which the ML agent gives inputs to a system (which might be a game that you’re training it to play well) and receives back a score indicating whether the input caused good or bad results. This score is the “reinforcement” part of the loop. Based on the cumulative scoring, the agent readjusts its approach. Over many attempts to play the same game, the agent is trained to play better and better: it develops a policy, a mapping between the current state and the action it should perform next.
With reinforcement learning, because you’re relying on the game (or other system) to provide the training feedback dynamically, you don’t need to start your machine learning process with a big stack of pre-labeled data, and you don’t need a human being to understand the system before beginning to train. Reinforcement learning has been used to good effect in training computer agents to play Atari 2600 games.
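For readers who want to see that loop in miniature, here is a rough sketch using tabular Q-learning, one simple reinforcement learning algorithm. It is not the method used in any of the papers discussed here, and the corridor "game", the function names, and the parameter values are all made up for illustration. The agent only ever sees a state and a reward, and ends up with a policy: a table saying which action to take in each state.

```python
import random

# Hypothetical toy game: a corridor of 5 cells. The agent starts in cell 0
# and receives a reward of 1 only when it reaches cell 4.
N_STATES = 5
ACTIONS = [-1, +1]  # step left or step right


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Play the game repeatedly, adjusting value estimates from the rewards."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Mostly take the action currently believed best; sometimes
            # explore at random (and break ties at random).
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # The "reinforcement": nudge the estimate toward the reward
            # plus the discounted value of the best follow-up action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def policy(q):
    """Map each non-terminal state to the action with the highest estimate."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(policy(train()))
```

Note that nothing in the training loop knows the corridor's layout in advance; everything the agent learns comes from the rewards the game hands back.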
Using this method with text adventures is dramatically more challenging, though, for a number of reasons:
- the set of valid inputs (the “action space”) is much larger than in the typical arcade game, and those actions are described in language (though the authors note the value of work such as that of BYU researchers Fulda et al. in figuring out what verbs could sensibly be applied to a given noun)
- world state is communicated back in language (the “observation space”), and may be incompletely conveyed to the player, with lots of hidden state
- goals often need to be inferred by the player (“oh, I guess I’m trying to get that useful object from Aunt Jemima”)
- many Atari 2600 games have frequent changes of score or frequent death, providing a constant signal of feedback, whereas not all progress in a text adventure is rewarded by a score change, and solving a puzzle may require many moves that are not individually scored
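To make those points a little more concrete, here is a minimal sketch of a two-room text game exposed through the same kind of step interface an agent would use. This is a hypothetical toy, not TextWorld's actual API: the class name, commands, and responses are all invented. Actions arrive as free text, observations come back as prose, some of the state is never described, and the only reward comes at the very end.

```python
class TinyTextGame:
    """Hypothetical two-room text game; not TextWorld's actual API."""

    def __init__(self):
        self.location = "kitchen"
        self.has_key = False    # hidden state: never reported in observations
        self.door_open = False  # hidden state: only hinted at through prose

    def step(self, command):
        """Take a text command; return (observation, reward, done)."""
        words = command.lower().split()
        if self.location != "kitchen":
            return "The game is over.", 0, True
        if words == ["take", "key"] and not self.has_key:
            self.has_key = True
            return "Taken.", 0, False  # real progress, but no score change
        if words == ["open", "door"]:
            if not self.has_key:
                return "The door is locked.", 0, False
            self.door_open = True
            return "You unlock and open the door.", 0, False
        if words == ["go", "east"]:
            if not self.door_open:
                return "The door is in the way.", 0, False
            self.location = "garden"
            return "You step into the garden. You have won.", 1, True
        # Almost every possible string of words lands here.
        return "That's not something you can do here.", 0, False


if __name__ == "__main__":
    game = TinyTextGame()
    for command in ["go east", "take key", "open door", "go east"]:
        observation, reward, done = game.step(command)
        print(f"> {command}\n{observation} (reward={reward})")
```

Even in a game this small, an agent trying word combinations at random gets no feedback until it has found three correct commands in the right order, and nothing tells it up front that the garden is the goal.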
TextWorld’s authors feel we’re not yet ready to train a machine agent to solve a hand-authored IF game like Zork, and they’ve documented the challenges here much more extensively than my rewording above. What they have done instead is to build a sandbox environment that implements a more predictable subset of text adventure behavior. TextWorld is able to automatically generate games containing a lot of the standard puzzles:
This Friday I had the pleasure of speaking to the AAAI workshop Knowledge Extraction from Games, which focused on gathering information from games and putting that information to use: for instance, studying level design in a platformer in order to find standard rules about platformer design or to propose alternative level designs that the creators might not have considered.
I was invited to talk about this topic from a designer’s perspective, looking particularly at how these techniques could be valuably applied to narrative games. And the problem, as I outlined it, was as follows:
Games that aspire to offer a lot of narrative agency often face the following challenge: they need a number of distinctive, hand-authored units of content (whether those are dialogue lines for Character Engine, storylets in a quality-based narrative system, or choice nodes in a ChoiceScript game) where each individual unit may both affect and be affected by the underlying world state.
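As a rough illustration of what “affect and be affected by the underlying world state” means in practice, here is a minimal sketch of a storylet-style content unit. It is a simplified assumption of how such a system might be structured, not the data model of any particular engine; the class, field names, and sample content are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

WorldState = Dict[str, int]  # quality name -> current value


@dataclass
class Storylet:
    """One hand-authored unit of content (hypothetical structure)."""
    name: str
    text: str
    requires: Callable[[WorldState], bool]  # how world state affects this unit
    effects: Dict[str, int] = field(default_factory=dict)  # how it affects state


def available(storylets: List[Storylet], state: WorldState) -> List[Storylet]:
    """Return the storylets whose preconditions hold in the current state."""
    return [s for s in storylets if s.requires(state)]


def play(storylet: Storylet, state: WorldState) -> WorldState:
    """Return a new state with the storylet's effects applied."""
    new_state = dict(state)
    for quality, delta in storylet.effects.items():
        new_state[quality] = new_state.get(quality, 0) + delta
    return new_state


# Invented sample content for illustration only.
STORYLETS = [
    Storylet("meet_informant", "A stranger waves you over.",
             lambda s: s.get("suspicion", 0) < 3,
             {"clues": 1, "suspicion": 1}),
    Storylet("accuse", "You name the culprit.",
             lambda s: s.get("clues", 0) >= 2,
             {"resolved": 1}),
]

if __name__ == "__main__":
    state: WorldState = {}
    for _ in range(2):
        state = play(STORYLETS[0], state)
    print([s.name for s in available(STORYLETS, state)])
```

With only two units the interactions are easy to hold in your head; with hundreds, each reading and writing shared qualities, it becomes hard to tell by inspection which content a player can actually reach.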