Visualizing Procgen Text, Part Two

In a previous post (now several months ago — sorry!) I wrote about visualization strategies for looking at procedurally generated text from The Mary Jane of Tomorrow and determining whether the procedural generation was being used to best effect.

There, I proposed looking at salience (how many aspects of the world model are reflected by this text?), variety (how many options are there to fill this particular slot?), and distribution of varying sections (which parts of a sentence are being looked up elsewhere?).

It’s probably worth having a look back at that post for context if you’re interested in this one, because there’s quite a lot of explanation there which I won’t try to duplicate here. But I’ve taken many of the examples from that post and run them through a Processing script that does the following things:

Underline text that has been expanded from a grammar token.

Underlining is not the prettiest thing, but the intent here is to expose the template structure of the text. The phrase “diced spam pie” is the result of expanding four layers of grammar tokens; in the last iteration, the Diced Spam is generated by one token that generates meat types, and the Pie by another token that generates types of dish.
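To make that layering concrete, here is a minimal sketch of this kind of recursive token expansion in Processing. The grammar, the #token# syntax, and the options below are invented for illustration; they are not the game’s actual grammar:

    import java.util.HashMap;

    HashMap<String, String[]> grammar = new HashMap<String, String[]>();

    // Expand a token by picking one of its options at random and
    // recursively expanding any tokens inside that option.
    String expand(String token) {
      if (!grammar.containsKey(token)) return token;   // plain text: done
      String[] options = grammar.get(token);
      String choice = options[int(random(options.length))];
      String result = "";
      for (String word : split(choice, ' ')) {
        result += expand(word) + " ";
      }
      return trim(result);
    }

    void setup() {
      // Hypothetical tokens: one for meat types, one for dish types.
      grammar.put("#dish#", new String[] { "#meat# pie", "#meat# stew" });
      grammar.put("#meat#", new String[] { "diced spam", "braised canard" });
      println(expand("#dish#"));   // e.g. "diced spam pie"
    }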

This method also draws attention to cases where the chunks of composition are too large or are inconsistent in size, as in the case of the generated limericks for this game:

[screenshot: a generated limerick]

Though various factors (limerick topic, chosen rhyme scheme) have to be considered in selecting each line, the lines themselves don’t have room for a great deal of variation, and are likely to seem conspicuously same-y after a fairly short period of time. The first line of text locks in the choice of surrounding rhyme, which is part of why the later lines have to operate within a much smaller possibility space.

Increase the font size of text if it is more salient.

Here the words “canard” and “braised” appear only because we’re matching a number of different tags in the world model: the character is able to cook, and she’s acquainted with French. By contrast, the phrase “this week” is randomized and does not depend on any world model features in order to appear, so even though there are some variants that could have been slotted in, the particular choice of text is not especially a better fit than any other piece of text.

This particular example came out pretty ugly and looks like bad web-ad text even if you don’t read the actual content. I think that’s not coincidental.
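For concreteness, here is one way the font-size rule might look in Processing, assuming salience is just a count of matched world-model tags; the phrases and counts below are invented for illustration:

    void setup() {
      size(600, 120);
      background(255);
      fill(0);

      String[] phrases  = { "Braised", "Butterflied", "Canard", "this week" };
      int[]    salience = { 2, 1, 2, 0 };   // matched tags per phrase (assumed)

      float x = 10;
      for (int i = 0; i < phrases.length; i++) {
        // Map 0..3 matched tags onto a 14pt..34pt range.
        textSize(map(salience[i], 0, 3, 14, 34));
        text(phrases[i], x, 80);
        x += textWidth(phrases[i]) + 8;
      }
    }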

Color the font to reflect how much variation was possible.

Specifically, as the input value rises, this ramps the red component of the text color up to its maximum, then the green component, and then the blue component. The input is the log of the number of possible variant texts that were available to be slotted into that position.

This was the trickiest rule to get to where I wanted it. I wanted to suggest that both very high-variance and very low-variance phrases are less juicy than phrases with a moderate number of plausible substitutions. That meant picking a scheme in which low-variance phrases would be very dark red or black; the desirable medium-variance phrases would be brighter red or orange; and high-variance phrases would turn grey or white.

Here “wordless” and “lusty” are adjectives chosen randomly from a huge adjective list, with no tags connecting them to the model world. As a result, even though there are a lot of possibilities, they’re unlikely to resonate much with the reader; they’ll feel obviously random after a little while. (In the same way, in the Braised Butterflied Canard example above, the word “seraphic” is highly randomized.)
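A sketch of that color ramp, assuming a natural log and a scaling constant picked arbitrarily here (the actual script’s numbers aren’t given in this post):

    // Map a variant count to a color: red fills first, then green,
    // then blue, so low variance reads as dark red/black, medium as
    // bright red/orange, and very high variance washes toward white.
    color varianceColor(int numVariants) {
      float scaled = log(max(numVariants, 1)) * 255.0 / 3.0;  // assumed scale
      float r = constrain(scaled,       0, 255);
      float g = constrain(scaled - 255, 0, 255);
      float b = constrain(scaled - 510, 0, 255);
      return color(r, g, b);
    }
    // e.g. fill(varianceColor(2));    // dark red: few options
    //      fill(varianceColor(20));   // bright red: a moderate number
    //      fill(varianceColor(5000)); // washed out: a huge adjective list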

Finally, here’s the visualization result I got for the piece of generated text I liked best in that initial analysis:

[screenshot: visualization of the generated text I liked best]

We see that this text is more uniform in size and color than most of the others, that the whole thing has a fair degree of salience, and that special substitution words occur about as often as stressed words might in a poem.

*

There’s another evaluative criterion we don’t get from this strategy, namely the ability to visualize the whole expansion space implicit in a single grammar token.
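As a rough sketch of what that might mean, one could enumerate a token’s expansions exhaustively instead of sampling one; here is a toy version built on the invented grammar from the first sketch above:

    import java.util.ArrayList;

    // Every string the token can expand to, found by walking all
    // options instead of picking one at random.
    ArrayList<String> expandAll(String token) {
      ArrayList<String> results = new ArrayList<String>();
      if (!grammar.containsKey(token)) {
        results.add(token);                  // plain text: one result
        return results;
      }
      for (String option : grammar.get(token)) {
        ArrayList<String> partials = new ArrayList<String>();
        partials.add("");
        for (String word : split(option, ' ')) {
          ArrayList<String> next = new ArrayList<String>();
          for (String head : partials) {
            for (String tail : expandAll(word)) {
              next.add(trim(head + " " + tail));
            }
          }
          partials = next;
        }
        results.addAll(partials);
      }
      return results;
    }
    // expandAll("#dish#") -> [diced spam pie, braised canard pie,
    //                         diced spam stew, braised canard stew]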


ProcJam Entries, NaNoGenMo, and my Generated Generation Guidebook

ProcJam happened last month, pulling together lots of different awesome things:

A procedural château generator (or perhaps you’d prefer a procedural Palladian façade generator). A Twitter bot that tweets about odd clothing combos.

Ordovician generates strange sea-creatures that swim across your screen:

[screenshot: Ordovician sea-creatures]

But I was most fascinated by the pieces that do procedural work with words. K Chapelier’s Stochastèmes generates new words based on poetic corpora (such as thurweedlesoe and woulders when I picked Wordsworth).

[screenshot: Balade]

Balade is Windows-only, so I wasn’t able to play it, but the screenshots give a sense of the French cityscapes it generates through words: you can choose streets to travel and receive short descriptions of the places you pass.

Paradise Generator uses random text combinations to suggest a variety of possible paradises: a fairly light level of procedural generation, but using some well-selected components. At least, my first couple of paradises were quite interesting.

Servitude plays a bit more like a traditional game, though it claims there are various randomized elements.

Or there’s Mainframe, a procedurally generated horror game by Liz England and Jurie Horneman. I didn’t manage to win it (maybe I just didn’t persist long enough; I’m not quite sure), but it combines the body horror and malevolent AI themes I associate with a lot of Liz’s stuff. (Maybe unfairly? Yes, maybe unfairly.)

*

Meanwhile, ProcJam was not the only venue for procedural generation this month. NaNoGenMo was the month-long push to procedurally create 50,000 words – 50,000 words of anything, not necessarily guaranteed to make sense.

Carolyn VanEseltine used Markov techniques and a ChoiceScript grammar to make an interactive generated novel, complete with choices, chapters, and stats.

Nick Montfort created a generated poem about consumerist impulses, one that offers us 126 pages of possible purchases such as “a subtle indigo topcoat that is exclusively available here” or “an understated navy thong that is significantly reduced in price”.

Kevan.org created a work based on Around the World in 80 Days mashed up with information from Wikipedia, which produces many, many paragraphs like this:

Moving on, we arrived at London South Bank University. If I remembered correctly, this was founded in 1892 as the Borough Polytechnic Institute. Passepartout asked me if it was chosen to be clerk to the Governing Body, but I did not know. Passepartout examined the training and demonstrating Centre for Efficient and Renewable Energy in Buildings (CEREB). Passepartout explained how it had been designed to include two Thames barges set above a pentagon surrounded by five other pentagons. We moved on, disappointed by stricter student visa requirements in the United Kingdom.

The full repository of other creations can be found at the NaNoGenMo 2015 site, coordinated by Darius Kazemi.

*

Exploring the resources associated with NaNoGenMo and ProcJam brought me to this forum on generative text, and from there this video by Kenneth Goldsmith on conceptual writing, which gives an above-average explanation of what’s interesting about procedural writing in its own right.

*

Not really part of either of those things, Caelyn Sandel and Carolyn VanEseltine have a game idea generator that randomly combines concepts they’ve had for their works into new concepts. And of course Juhana Leinonen’s IF Name Generator is a classic, but it has been newly updated with name lists from IF Comp 2015 to remix those titles.

*

So. I thought about doing NaNoGenMo or ProcJam or somehow sort of doing both. ProcJam is so open-ended with its “make a thing that makes things” concept that almost anything could probably be construed to be a part of that project. And I’ve also got several procedural text game projects that have been knocking around unfinished for ages. What I ultimately wound up doing was sort of related but in fact none of the above.

The Annals of the Parrigues (warning! PDF!) is a (mostly) procedurally generated guidebook to a fictional pseudo-English kingdom, along with a making-of commentary on the process of generation. There are also some portions of the code (though I’m not releasing the whole source at this point, and indeed it wouldn’t really be meaningful to do so, as you’ll see if you look at the thing). It’s not an interactive piece of fiction at all, though it was built with various tools including Inform. Rather, it’s a story I wrote with the machine. If you want to know where to find the biggest library in the kingdom, what type of meal to avoid at the Fenugreek and Sponge, or why people keep trying to assassinate the Duchess of Inglefunt, this one’s for you.