Mailbag: Breaking into Writing for Voice UI

Dear Emily,

[personal information redacted] I have been following your articles for a long time, and I decided to write to you because I am stuck: I would like to expand my skills in the game design field, but at the same time I have started to get interested in voice applications. I have read that writers and copywriters can make a great contribution to the development of VUI, so that they can understand the language and the context in which conversations take place. I saw that you managed to combine these two areas – game narrative and natural language programming – so I would like to ask you where I could start if I wanted to take this career path. What skills should I focus on? I’m looking for courses on platforms like Coursera and Udemy, but it’s not clear to me what criteria I should consider for the choice. Except for HTML, I don’t know the development tools that these courses propose; all I know is that I am interested in understanding how an editor, who deals with finding the best plot structure for a story or making characters believable, can contribute to the development of an Avatar or a dialogue flow, for example. I would also like to know whether companies are interested in this type of profile.

I definitely wouldn’t want to discourage you from looking into natural language processing if you think you’re interested in it: it’s a fascinating field, and currently in high demand.

I didn’t approach the subject with Coursera or Udemy myself, so I don’t know the offerings there very well, but I would imagine that there are introductory courses that would explain a lot of the basic ideas and tools. Another way in would be to look through the resources listed here.

It’s also possible to play with transformer-based language models using Google Colab notebooks. These models take a huge amount of skill, data, and computation time to build, but once they’ve been trained, they can be used in a range of applications. For instance, this notebook by Max Woolf will let you experiment with a trained GPT-2 model, which has applications both in generating text and in creating machine translations (among other things).

Then there are also sites where groups doing active research regularly post their progress, sample code, or trained models. You would need a good grounding in the basic concepts in order to make sense of these.

That said, you might not need all of those skills even if you were building your own voice-based system from scratch.

There are a number of commercial APIs that would allow you to do intent and entity recognition (very simply, what does the user want to do, and to what objects?) using their NLP tools: Google’s DialogFlow, Facebook’s, Microsoft’s LUIS, and others. CognitionX is a company that (among other things) tracks developments in the chatbot space, and they have a news feed with a steady stream of related information.
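To make the idea of intent and entity recognition a little more concrete, here is a toy sketch in Python. It is not how any of the commercial services above work internally (they rely on trained models rather than keyword lists), and the intents, trigger phrases, and entity names are all invented for illustration. It only shows the shape of the result: given what the user said, which action do they want, and which objects does it apply to?

```python
import re

# Hypothetical intents and trigger phrases, invented for this sketch.
INTENT_KEYWORDS = {
    "open": ["open", "unlock"],
    "take": ["take", "pick up", "grab"],
    "ask_about": ["tell me about", "ask about", "what is"],
}

# Hypothetical objects that this toy system knows about.
KNOWN_ENTITIES = ["door", "lamp", "key", "letter"]


def contains_phrase(text, phrase):
    """True if the phrase appears in the text as whole words."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def parse_utterance(utterance):
    """Return (intent, entities) using simple keyword matching."""
    text = utterance.lower()

    # Intent: the first intent with a trigger phrase in the utterance.
    intent = None
    for name, phrases in INTENT_KEYWORDS.items():
        if any(contains_phrase(text, phrase) for phrase in phrases):
            intent = name
            break

    # Entities: every known object mentioned in the utterance.
    entities = [e for e in KNOWN_ENTITIES if contains_phrase(text, e)]
    return intent, entities


print(parse_utterance("Please pick up the lamp and the key"))
# prints ('take', ['lamp', 'key'])
```

A writer working with one of these tools usually supplies the equivalent of the two tables at the top (example phrasings for each intent, and the things a user might refer to), while the service handles the matching.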

It’s when you want to expand beyond or improve upon commercially-available NLU functions that you start to need these skills. And learning them well is likely to take a few years of effort. Many people who do this professionally have come out of a graduate-level program that focuses on this subject matter, or at the very least have done some bootcamps or a great deal of self-directed learning. While I managed and collaborated with computational linguists and specialists in natural-language-focused machine learning, I am not fully trained in that area myself.

And you certainly don’t need all of that to accomplish what you’re asking about: “how an editor, who deals with finding the best plot structure for a story or making characters believable, can contribute to the development of an Avatar or a dialogue flow.”

Writing or editing content for a voice-driven system does not necessarily require you to understand the inner workings of speech-to-text methods or named entity recognition.

I have a vested interest because until a couple of months ago I was working at Spirit AI on Character Engine, which is designed to help people without extensive NLU understanding write character content. There are a range of studios working in this general space. Earplay publishes voice-based games. Tom Hewitson runs a London-based company that does development in this space.

You might also be interested in last year’s NarraScope talk on “designing games that listen” or any of these talks from an IF meetup on interactive audio. I’ve also written a little bit about the general challenge of adapting interactive narrative for different media.

So the way I’d recommend engaging with this is to try out some of the games that have been written this way, as well as looking at what creators have said about the tools and process of building those experiences. If that still seems like an area where you might be interested in working, you could look for work with one of these companies and learn the specific toolset that they use to create content. Almost certainly you will not be required to invent the tech stack yourself.

One thought on “Mailbag: Breaking into Writing for Voice UI”

  1. Thanks so much for the help Emily. I decided to start slowly, so for now I am dedicating myself to studying UX Writing, and then I will start with Conversational Design at
    I tried to reach Tom Hewitson, unfortunately without success, but I don’t give up :)
