Putting together a play-test

I recently read an interesting article by some Microsoft playtesters that suggests running playtesting studies with 25-35 participants, each focused on a single hour of gameplay and followed up with a standardized survey. The idea is that this could be done repeatedly during the course of a game’s development, in order to drive gameplay improvements and then confirm that the changes have had the desired effect. This method contrasts with usability tests (one- to two-hour one-on-one interviews with testers, usually conducted with a group of eight or so) in that it is more statistically reliable, though not as in-depth.

This is something I’ve been thinking about a lot lately.

In Praise of the Glorious Practice of Beta-Testing

Juhana Leinonen has just announced a new site for IF authors seeking testers and vice versa. It takes a slightly different approach from the IF betatesters’ mailing list, in that you can subscribe to an RSS feed rather than getting email at times of year when you might not be in the mood for testing. There’s also a small selection of articles on the art of testing, and the opportunity to specify what kinds of games you’re willing to test.

I mention this not so much because I have a vested interest in pushing one site or another, but because it’s worth reminding prospective authors as we head into summer and the season of comp-game writing:

Please test your game and credit your testers.

It makes your game better, and it offers your players some kind of promise that you made an effort. Also, please give your testers enough time to work, and yourself enough time to fix what they find — ideally, get the game into testing a month or more ahead of the comp deadline. This especially applies if you’ve never written IF before: it takes a lot more testing time than you think.

(My personal plan for the coming competition is not to bother reviewing games that don’t credit any testers, just to spare myself the annoyance of writing the same dull rant ten or fifteen times. I realize this isn’t foolproof and that someone could stuff in the names of a half dozen imaginary friends, but still. Worth a try.)

Alabaster 30a

Having got rid of the annoying flaw in disambiguation (I hope), I’ve posted the latest build of Alabaster. The plan at the moment is to do a little more beta-testing to make sure that the conversation is sufficiently rounded out; then to remove the conversation-building machinery and do the last speed tests and refinements once that is gone. If you want to play along, transcripts are welcome.

Currently the biggest between-turn lags — sometimes very long indeed — continue to come in response to disambiguation questions, or when the parser can’t match a quip at all. I’m not sure why yet, but I suspect the quip-creation machinery may be interfering with the parser’s efficiency.

Then we’ll do some profiling.

The startup delay should be gone completely, though, and between-move delays reduced in most other cases.

Fit and finish issues: disambiguation

One of the fun (or, depending on how you look at it, annoying) aspects of polishing an IF game is working on the disambiguation. This usually involves combing through beta transcripts and seeing where the tester plainly meant X and the game instead understood Y. Sometimes the fixes are trivial things — you left out a synonym, you didn’t set a pronoun at the right time, etc., so the code didn’t have all the information it needed to make the right determination.

The more interesting cases, though, are the ones that challenge you to think more deeply about how language normally works, and to come up with sensible working rules for what the player probably means.
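
(In Inform 7, working rules like these usually end up as “does the player mean” rules. A minimal sketch, for the curious — the room and the two lamps below are invented for illustration, not anything from Alabaster:)

    The Workshop is a room.
    The brass lamp and the broken lamp are in the Workshop.

    [If the player just types TAKE LAMP, assume the intact one.]
    Does the player mean taking the brass lamp: it is likely.
    Does the player mean taking the broken lamp: it is unlikely.

With that in place, TAKE LAMP picks the brass one (the parser notes its choice in parentheses), while EXAMINE LAMP still asks which you mean, since no preference has been expressed for examining.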

Alabaster spoilers after the cut.

LICK TREE. PURCHASE ANTLERS.

This is mostly to put in a plug for Juhana Leinonen’s new extension “Object Response Tests” (linked from here). The extension implements an ANALYZE verb that systematically runs through the Standard Rules actions (and new actions, if you choose) on any object in your game, exposing infelicitous responses, incorrect plurals and articles, and all those other little flaws that can escape even a good human beta-tester.

I spent a lot of yesterday evening thus: extract a complete list of in-world objects from the world index of Alabaster; write a test script to run through and ANALYZE each of them in turn; revise responses I didn’t like; run the test script again…
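
(The test script itself is nothing exotic: just Inform’s ordinary “Test ... with ...” declarations pointed at the extension’s ANALYZE verb. Something along these lines, assuming the extension installs under the title given above, and with the room and object names standing in as placeholders for the real list pulled from the index:)

    Include Object Response Tests by Juhana Leinonen.

    The Clearing is a room. [A stand-in setting for the sketch.]
    The mirror, the casket and the hunting knife are in the Clearing.

    Test props with "analyze the mirror / analyze the casket / analyze the hunting knife".

Typing TEST PROPS at the prompt then runs the whole batch, and the resulting transcript is the thing to skim for embarrassing responses.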

Unsurprisingly, this is a lot more thorough than even (what tried to be) a fairly rigorous testing by hand. It’s also a lot more fun. (Though I find it has the perhaps-unhealthy effect of making me lavish a lot more time on commands no one is ever going to try, like DRINK UNDERGROWTH or STAND ON CHIN or TIE FACE TO SKY. There’s something about seeing the same library messages over and over that makes me want to spice them up with randomized elements.)
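
(The spicing-up in question is mostly Inform’s “[one of] ... [at random]” text alternatives. A throwaway sketch, borrowing DRINK UNDERGROWTH from above; the room and the responses are made up, not lifted from Alabaster:)

    The Forest Edge is a room. The undergrowth is scenery in the Forest Edge.

    Instead of drinking the undergrowth, say "[one of]You chew a leaf experimentally. It is not refreshing[or]The undergrowth is damp, but not in any drinkable way[or]No. Just no[at random]."

Each attempt then gets a different line, so the same command no longer produces exactly the same response every time.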

More seriously: the huntsman seems to be getting a bit more character as I go along. I picture him as a loner who has never been quite comfortable with the other villagers or with the court, who doesn’t feel at home anywhere, and who has not much left to lose. (This justifies a wider variety of endings for him, of course — he’s more likely to be willing to run away from his home kingdom if he hasn’t got a wife and kids he’s leaving behind, say.)

More About Beta-Testing

Reading through the authors’ forum from the recent comp, I ran across some discussion of how mean/unfair reviewers can be; this is no doubt true, though it is also true that people get irked about having their time wasted by a game whose author didn’t do (what they consider to be) due diligence in writing the thing. It’s a fine balance. (And no, the purpose of the comp is not to encourage newbies. Not as such. It may do that, which is fine, but that’s not the mission of the competition. I tend to think that it’s about producing more cool IF, and in a context where that IF gets noticed and talked about, both inside and outside the community.)

Anyway, the particular bit that caught my attention was this:

Certain reviewers seem to imagine that hordes of willing beta-testers drop out of the sky at the slightest mention that you’re working on a game.

I kinda wonder how much time these reviewers have spent beta-testing people’s games?

Harry Wilson aka Conrad Cook

Actually, lots of the reviewers have put in significant testing time; that’s part of what makes them so sensitive to all the ways in which a game can go wrong. Beta-testing is itself a lesson in game design and implementation: you get an up-close view of what can go wrong and of how someone else goes about fixing it.

And second: if you are having a hard time finding testers, ask people. You can post a request on RAIF or join the IF betatesting site, but if that is not producing enough feedback, go on ifMUD and find folks; or approach people individually by email and say, “Hello, would you please test my game?” It helps to include some basic information about what kind of game it is — and, perhaps, an explanation of why you think that person would be a good fit and/or might enjoy doing it. (This latter part is optional, but it might help if you’re addressing someone you haven’t worked with or talked to before.)

Sometimes people will be too busy, and will say no. But it’s not wrong to ask. People are much, much more likely to say yes if you ask them specifically and personally than if you put out a general call for volunteers. (Later on when you’ve written some games that people know are good, putting out a general call for volunteers may turn up more testers — but by that point you may already have a team of people you like working with, and not need to do that so much anyway.) Some authors have a rule of seeking as beta-testers the very reviewers that were most harsh on them last time, but that’s certainly not obligatory (and heaven knows you may find it’s more to the point to work with a beta-tester who is sympathetic to your basic vision but has high standards of craft).

Finally, the community expectation in general is that beta-testers will not diss your game in public, will not disclose the errors they found, and will not distribute a pre-release copy; and conversely that you will credit your beta-testers in some way in the finished game. (An ABOUT or CREDITS verb is usually used for this purpose.)
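
(If you have never wired one of those up, it amounts to a few lines of Inform 7. A sketch, with placeholder names throughout:)

    The Foyer is a room. [Stand-in; the real game supplies its own rooms.]

    Requesting the credits is an action out of world and applying to nothing.
    Understand "about" or "credits" as requesting the credits.
    Carry out requesting the credits: say "Beta-testing by A. Tester, B. Tester and C. Tester. Any remaining bugs are the author's own."

Either ABOUT or CREDITS then prints the list, which is all most players are looking for.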

But when all is said and done, it is not unreasonable for players to expect that the game you offer them has been through testing. If you haven’t tested it, you haven’t finished it. If it’s not finished, don’t submit it. It’s that simple. Seriously.