Scraping IFDB

Working on updating some old articles for the IF theory book, and reading discussions on the intfiction forum, I found myself wondering about some of my preconceptions of IF history. So I decided to check some of my assumptions against IFDB, by searching on certain tags and then reducing the results to a list of dates.
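The "reduce the results to a list of dates" step can be sketched in a few lines of Python. This is only an illustration, not the partly manual process actually used for the charts: the function name `count_by_year` and the sample date strings are hypothetical, and it assumes each search result carries a publication date containing a four-digit year somewhere in the text.

```python
import re
from collections import Counter

# Assumption: every usable date string contains a four-digit year (19xx or 20xx).
YEAR_PATTERN = re.compile(r"\b(19|20)\d{2}\b")

def count_by_year(dates):
    """Tally how many date strings fall in each year; skip undated entries."""
    counts = Counter()
    for text in dates:
        match = YEAR_PATTERN.search(text)
        if match:
            counts[int(match.group(0))] += 1
    return counts

# Hypothetical sample input, not real IFDB data.
sample_dates = ["1998", "October 1, 1998", "2001-05-12", "unknown"]
tally = count_by_year(sample_dates)
for year in sorted(tally):
    print(year, tally[year])
```

Entries with no recognisable year are silently dropped here, which is one more way a count like this can undercount poorly documented games.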

My initial hypotheses were more or less as follows:

  1. Recent years have seen some experimental break-aways from the early convention that all IF must be second person.
  2. Female protagonists are much more common after the mid/late 90s.
  3. Single room games took off in the late 90s and have been relatively frequent ever since.
  4. Puzzleless games took off in the late 90s and have been relatively frequent ever since.

(1) To get my worst prediction out of the way first, here is the result I got on first/third person games:

Assuming that the tagging on these is remotely accurate, the 80s actually saw a lot of experimentation at least with the first person, which then died off; since then there has been just a modest trickle of both first and third person. The early first-person trend is even clearer when expressed as a percentage of total publications per year:

There are some serious issues with this methodology, I should point out. One, tagging on IFDB is not consistent; less-known games tend not to be covered, and many older games are likely to be less tagged than new ones. Many examples are likely to be omitted from these counts. (But if anything I would expect 80s games to be underrepresented in the tagging, rather than the reverse.)

Moreover, there is no distinction here between English-language games and those in other traditions. I know that some language traditions tend more towards the first person than others, so probably a more intelligent approach to the data would break out the Italian, Spanish, etc., games from the English ones. Even with those caveats, I’m kind of surprised by this one.
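For reference, the "percentage of total publications per year" figure is just the tagged count divided by the count of all games listed for that year. A minimal sketch, assuming both counts are already available as year-to-count mappings; the function name `percent_share` and the numbers below are made-up placeholders, not values from IFDB:

```python
def percent_share(tagged_by_year, total_by_year):
    """Return the tagged games as a percentage of all games, per year."""
    shares = {}
    for year, total in total_by_year.items():
        if total > 0:  # skip years with no listed games to avoid dividing by zero
            shares[year] = 100.0 * tagged_by_year.get(year, 0) / total
    return shares

# Placeholder numbers for illustration only.
tagged = {1985: 3, 1998: 2}
totals = {1985: 10, 1998: 40, 2005: 50}
shares = percent_share(tagged, totals)
for year in sorted(shares):
    print(year, round(shares[year], 1))
```

Note that this corrects only for the number of games listed in a year, not for how many of those games carry any tags at all, so undertagged years will still look artificially low.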

Assuming the data are worth anything, though, they suggest that language/library support is not actually the key issue determining whether IF authors use persons other than second person. TADS 3 (via built-in support, I believe) and Inform 7 (via extensions) both are better at letting the author select another person or tense than the previous generations of these languages. But their introduction hasn’t led to a boom in non-second-person games.

(2) On female protagonists, I was sort of right, but not in the way I expected.

I had vaguely assumed that there would be a lot of male protagonist games and then a gradual rise in female protagonist games to match it, more or less at the point around 1997 or so when the gender balance of the IF newsgroups seems to have started shifting to include more women. (A bit of anecdata: “Everybody Loves a Parade” (1997) has a moment that reveals the PC as female; at the time, reviewers hailed this as an amusing surprise twist. A couple of years later it no longer seemed all that surprising and twisty.)

What actually happens is that there’s a spike in female protagonists, but subsequently a rise in the count of explicitly male protagonists as well. I’d guess that reflects changing gender proportions in the writing and playing communities, but also a movement towards having specific player character personae at all:

(3) Here’s the chart for single room games. It more or less does do what I expected:

Perhaps the most notable thing about this chart is the way the number jumps suddenly at 1998. But 1998 was a year of very high production overall, and a lot of minicomps. Suddenly there were a lot more venues for bite-sized works. And once that trend started, it continued.

(4) Puzzleless games, I was sort of right about.

My first impulse was to think, hm, I wonder whether this phenomenon correlates with years when the IF Art Show was running, since that competition encouraged experiential and often puzzle-free work.

But there was an IF Art Show in 2007; it just seems that the entries were not as frequently puzzleless as the earlier ones, or aren’t labeled as such.

Reformatting the data to represent not absolute numbers but percentage-of-all-published-games for these periods shifts the effect even harder:

Over ten percent of the games published in 1998 are tagged as puzzleless. 1998 again! That 1998 was a big year for IF is obvious just from a glance at the XYZZY list: this was the year of Anchorhead and Photopia, Spider and Web and Losing Your Grip, Once and Future and Bad Machine. But the other numbers suggest it was also a year of massive innovation and change in the community and in the types of games that were being produced.

The drop-off is as striking as the pickup. It suggests either that the IF community has actively rejected the puzzleless experiments of 1998-2001 or so; or that we’ve stopped labeling as “puzzleless” games that would previously have come under that category.

My personal, fuzzy sense of this is that we had a boom in experimental puzzleless games, and that that gave way to a series of works that are more balanced between puzzle and story than what went before. “Make It Good”, “Blue Lacuna”, and “King of Shreds and Patches” are hardly puzzleless, but they tend to integrate their puzzles more deeply with stories than many older works.

So. I’m not sure what to make of all that. As mentioned, the data are pretty flawed, and there’s a lot I’d like to be able to look up (average play times, for instance) that isn’t covered by IFDB. I wouldn’t mind doing some comparisons on game genres as well (am I right that “slice of life” IF has become more prevalent since 2000 or so?), but there are just too many games in each category for my partly-manual counting process to cope with, so I’d need some other way to get the IFDB data on those.

And finally, what the charts don’t show at all is the relative influence of the games in question.

There are only nine games tagged “moral choice” in the whole database, but I feel like “Fate”, “The Baron”, “Tapestry”, “Slouching Towards Bedlam” and “Whom the Telling Changed” all raised significant discussion — enough so that I think of this as an important focus of mid-2000s IF even though, by IFDB standards, there are very few examples.

16 thoughts on “Scraping IFDB”

  1. I’m not sure I’d even say that the 80s saw “experimentation” with first person. It was more like part of the established norm. Scott Adams wrote his games in the first person, and those games were the first text adventures seen by a lot of the people who went on to write their own. I’d wager that, in games written before 1990, first-person narration correlates well with two-word parsers. Both would tend to go away as people stopped imitating Adams and started imitating Infocom.

  2. (On first person) Even with those caveats, I’m kind of surprised by this one.

    The Scott Adams convention was first person, and a lot of people copied that, so it isn’t too surprising.

    There also might be an “other” for that category; there’s been a couple odd ones. Michael Berlyn’s Cyborg, for instance, used “we” literally to consider the player-brain and machine-body a symbiosis, and Suspended had first-person reports from the various robots.

    People in general underestimate how much weird and experimental early IF is out there.

  3. What actually happens is that there’s a spike in female protagonists, but subsequently a rise in the count of explicitly male protagonists as well. I’d guess that reflects changing gender proportions in the writing and playing communities, but also a movement towards having specific player character personae at all[…]

    Another hypothesis is that, with the advent of more female protagonists, “male protagonist” gets tagged more often rather than being treated as a default. That is, if all protagonists are male, no one is going to use the “male protagonist” tag. This may not be compatible with the way that IFDB works, since obviously a lot of early games must have had their entries created well after the games were, but it’s a hypothesis. Also, if older games are undertagged, that might explain some of it.

    All that said, I suspect you’re right that this is at least partly due to a move away from the AFGNCAAP.

  4. Baf, Jason — those are excellent points. I had forgotten Scott Adams did this.

    On the other hand, I don’t recall Scott Adams games doing a lot with a defined protagonist. So, to the extent that I’m curious about IF games in which the first person is used to present a specific character, it may be true that that doesn’t happen a lot until the 90s at the earliest. Unfortunately there’s no tag that would let me search on that per se.

    Matt — Re. “This may not be compatible with the way that the ifdb works, since obviously a lot of early games must have had their entries created well after the games were.”

    That was my assumption, yeah — that things like labeling changes become an issue only after IFDB is introduced in the mid-2000s, and before that it’s hard to tell because games were presumably labeled with whatever seemed appropriate as they were first inducted into the system, rather than with the labels they would have been given at the time of their publication.

    • One reason Mystery Fun House is my favorite Scott Adams game is that the protagonist appears to be a motivationless treasure-grabber at first, until you realize what’s going on (which is not that hard necessarily, but for me only happened about an hour into play).

  5. Fascinating. Makes me wonder what the games of the most recent years do more of, or do better. Gaining a foothold outside of the I-F community, perhaps?

    • Some things that would be fun to search on, but aren’t currently tagged in any usable way:

      — games originally released outside the community (e.g., to TIGSource, or published in some other venue, or whatever, rather than being primarily presented to the IF community)
      — games organized by scene rather than by map
      — games with reactive NPCs who comment on your behavior as they see it (I don’t think there’s any tag for this, but I keep wishing to search for it apropos of intfiction forum discussions)
      — number of rooms implemented in the game (as opposed to just “single room” vs “not”)
      — novice-friendly features, such as hinting, quest journals or equivalent strategies for tracking what the player is currently working on, tutorial modes… actually I guess I could search on built-in hints, as those are tagged. But the other things would be interesting to know about too.

      The other thing that’s interesting of course is just the selection of what people think is worth tagging…

  6. Pingback: More IFDB data « Emily Short's Interactive Storytelling

  7. I’d be curious to see if there is any correlation between the characteristics found in award-winning IF and the overall trends you’ve discovered here. And/or if the awards lead the trends: once I see that a certain type of story tends to win, I alter my writing style.

    • If it says “percentage”, then it’s corrected for the total number of games listed for that year, but no, I haven’t corrected for the number of games that are tagged, period — I don’t have a way of getting that information out of the database that wouldn’t be insanely time-consuming.

  8. I would much rather know why 1998 was the banner year that some people feel the data suggest. What was it about that year? Did something happen? Did a new system come out?

    Maybe if you can figure out why people cared at that point, you can see what it might take to get them to care again.

    I would be curious about audience statistics from 1998 versus some years in the 2000s.

    • I’m not claiming that it was a banner year for IF *playership*, but I think it was an important watershed for the authoring community. Several things come into this: maturity of authoring systems (both Inform and TADS had been around long enough to build a critical mass); increased interest in the IF competition; and the organization of a number of small competitions, which raised both production and the idea among authors that they should experiment in nontraditional formats.

      Posting history reinforces this. If you look at the posting stats on rec.arts.intfiction since its inception (according to Google), you get the following chart:

      1987 2
      1988 3
      1989 25
      1990 126
      1991 377
      1992 986
      1993 1388
      1994 3226
      1995 3830
      1996 10835
      1997 10963
      1998 20724
      1999 14571
      2000 11268
      2001 14504
      2002 14166
      2003 10669
      2004 10202
      2005 8175
      2006 16847
      2007 12107
      2008 12898
      2009 9704
      2010 8052

      There’s a substantial bump at 1998. For some reason, authors were interested — in writing, and in talking about writing, and in doing experiments.
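One rough way to check the bump against the figures listed above is to look for the busiest year and the largest year-over-year increase. A small sketch using exactly those posting counts; the helper name `largest_jump` is made up for illustration:

```python
# Posting counts per year, copied from the list above.
posts_by_year = {
    1987: 2, 1988: 3, 1989: 25, 1990: 126, 1991: 377, 1992: 986,
    1993: 1388, 1994: 3226, 1995: 3830, 1996: 10835, 1997: 10963,
    1998: 20724, 1999: 14571, 2000: 11268, 2001: 14504, 2002: 14166,
    2003: 10669, 2004: 10202, 2005: 8175, 2006: 16847, 2007: 12107,
    2008: 12898, 2009: 9704, 2010: 8052,
}

def largest_jump(counts):
    """Return (year, increase) for the biggest absolute rise over the prior year."""
    years = sorted(counts)
    best_year, best_delta = None, None
    for prev, cur in zip(years, years[1:]):
        delta = counts[cur] - counts[prev]
        if best_delta is None or delta > best_delta:
            best_year, best_delta = cur, delta
    return best_year, best_delta

peak_year = max(posts_by_year, key=posts_by_year.get)
jump_year, jump_size = largest_jump(posts_by_year)
print("busiest year:", peak_year)
print("largest rise:", jump_year, jump_size)
```

Both measures use absolute counts; relative growth would instead favor the tiny early years, where going from 3 posts to 25 is a much larger multiple.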

      • (So, just to clarify if it wasn’t clear — I don’t think that reproducing whatever was magical about 1998 is something that would make IF superfamous among players now; I think in fact that hobbyist IF is much better known outside the community right now than it was then, thanks to the growth of indie gaming, the greater visibility afforded by the web, and the active outreach efforts in a bunch of peripheral communities. But I think the group of niche producers, *cut off from* the outside world, was at a bit of a local maximum just then.)

  9. The single-room aspect is easy to figure out: people are sick of those darn compass notations. It made sense for a cave simulation, not for indoors or urban landscapes. You don’t “go north” to go to the bathroom, you simply “go to the bathroom”. You simply go to Jane’s flat, not go north, go east, go east, open door, etc…

    Nightfall did a great job in this regard, with its named locations auto-navigation feature.

    When you think about it, many of the simulation, mechanic aspects of IF are plain boring, something Adam Cadre explored to great effect, though, in 9:05 and Nemean Lion.

    Perhaps in the near future IF development should concentrate less on simulation and builtin verbs and more on flexibility of action via pattern matching or something. More range of action for the player rather than lots of small boring steps via simple action/verbs to accomplish something.
