Rebalancing Bee

A couple of days ago I mentioned the rerelease of my game Bee, and promised a follow-up article about some of the technical aspects.

The Project

Varytale had a Twine-like diagram of the internal structure of each storylet.

Key things to know about the game:

  • It was a piece of storylet-based interactive fiction originally designed for the Varytale platform.
  • Each storylet describes a vignette in the life of the main character, ranging from a couple of paragraphs to (at the extreme) a couple of pages. There are usually additional choices to make within the storylet, so the storylet’s internal structure is like a short branching narrative.
  • The game cycles through the months of the year. Every half-month, the player can pick one storylet to play, out of three options. A standard playthrough covers three years, though there are circumstances in which you could spend less time, or more.
  • Storylets to populate the option list can have requirements (controlling whether they’re available at all) and also frequency (controlling how likely they are to be chosen). Some storylets are only available in particular seasons or even particular half-month slots.
  • A chosen storylet may change the player’s stats.
  • Within a storylet, the player may experience further choice points with further stat restrictions and effects.
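
The selection loop described in this list can be sketched roughly as follows. This is an illustration in Python, not Dendry or Varytale code: the `Storylet` class, the `draw_options` function, the storylet names, and the frequency numbers are all assumptions made up for the example.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict

Stats = Dict[str, int]


def always(stats: Stats, slot: int) -> bool:
    """Default requirement: available in any state and any half-month slot."""
    return True


@dataclass
class Storylet:
    name: str                      # hypothetical names, not taken from Bee
    frequency: float = 1.0         # relative weight in the draw (assumed scale)
    requirement: Callable[[Stats, int], bool] = always


def draw_options(storylets, stats, slot, rng, count=3):
    """Pick up to `count` distinct available storylets, weighted by frequency."""
    pool = [s for s in storylets if s.requirement(stats, slot)]
    chosen = []
    while pool and len(chosen) < count:
        weights = [s.frequency for s in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        chosen.append(pick)
        pool.remove(pick)          # no duplicates within one hand of options
    return chosen


# Illustrative data only: slots are numbered 0-23, two per month.
demo = [
    Storylet("chore", frequency=0.5),
    Storylet("spelling drill", frequency=1.0),
    Storylet("family dinner", frequency=1.0),
    Storylet("seasonal event", frequency=2.0,
             requirement=lambda stats, slot: slot in (6, 7)),
    Storylet("sibling quarrel", frequency=2.0,
             requirement=lambda stats, slot: stats["lettice"] >= 1),
]
options = draw_options(demo, {"lettice": 0}, slot=0, rng=random.Random(0))
```

Requirements filter the pool first, and frequency only matters among storylets that survive that filter, which is why a high-frequency storylet with strict preconditions can still be rare in practice.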

You can also play the new Dendry version here.

The Challenge

Bee was written for Varytale, but that platform went away a long time ago. I still had a dump of the content, and Ian Millington had partially ported it for Dendry, an open version of a Varytale-like system. But the Dendry version was unfinished and broken — the endings couldn’t be reached, the frequency of storylet selection was off, and there was no status readout to tell you how you were doing on various game metrics.

Autumn Chen did the work of making a status readout for Bee and importing to Dendry the content that hadn’t yet been ported over. (Autumn has also written a couple of Dendry projects of her own, so she was familiar already with what’s possible there – if you like Bee, I recommend checking out Autumn’s work.)

Once that was done, I did some testing, and asked Autumn about a few playability issues I was encountering.

For one thing, I kept being offered the same options, when I knew there were more interesting storylines locked in the game’s system somewhere.

For another, I was finding it hard, even playing deliberately and with knowledge of the system, to unlock a couple of the most interesting endings. And that was at odds with what I wanted the player experience to be.

Part of this was a straight-up bug. We had an issue where one-off narrative storylets weren’t given a higher chance of being chosen vs. repetitive storylets, though they had been in the Varytale version. That was a question of updating all the frequencies to match the Varytale frequencies.

But there were some problems that probably existed even in the initial release – problems that I felt better equipped to fix than I had been in 2012, because I had both a) more experience with this kind of system and b) a different set of tools at my disposal.

Standardising Frequencies

One challenge was that the frequency of storylets was too inconsistent.

In the very first version of the game, Bee was required to use an action meter like the one in Fallen London – that was part of the concept of how Varytale would monetise.

When you’re designing for a system where people will play your game in a number of short sessions, it becomes more forgivable (or even necessary) to let them see the same text again as a refresher. So the first version of Bee had a bunch of spelling-drill storylets that we expected the reader to see repeatedly but some hours apart in real time.

Once the action meter went away, that repetition became more annoying, so I addressed it one storylet at a time, making some spelling drill storylets richer (more things to try! the second time you reach it, you can do something different than you did the first time!) and making others lower-frequency.

The Varytale tools were designed around thinking about each storylet individually. There were cool reader analytics on what players were enjoying:

Varytale let you view which storylets were scored well or badly, on an individual basis.

…but, conversely, they didn’t make it easy to batch-edit things or even review certain kinds of patterns in the storylet requirements data. Editing the storylet system as a system wasn’t really part of the scheme.

By contrast, Dendry source files are raw text. Gone was the ability to look at a flow layout within each storylet. But having a text file directory did make it easier to search for key features (frequency settings for storylets, e.g.) and make sure that they were adhering to design goals.

So one of the things I did was to standardise frequencies for storylets of particular types – a low-medium frequency for chores, medium for spelling drills, high for narrative events with lots of preconditions.
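
As a rough sketch of that kind of audit, here is one way to check frequency settings against per-type targets in Python. The `frequency: N` line format, the file names, the type labels, and the target numbers are all assumptions for illustration, not values from Bee; a real run would read the text from the source directory instead of an in-memory dictionary.

```python
import re

# Hypothetical targets: low-medium for chores, medium for drills,
# high for narrative events. The numbers themselves are invented.
TARGET_FREQUENCY = {"chore": 50, "drill": 100, "narrative": 200}

# Assumed line format: "frequency: <number>" on a line of its own.
FREQ_LINE = re.compile(r"^\s*frequency\s*:\s*(\d+(?:\.\d+)?)\s*$", re.MULTILINE)


def audit_frequencies(sources):
    """sources maps filename -> (storylet type, file text).

    Returns one report line per file whose frequency is missing
    or differs from the target for its type.
    """
    problems = []
    for filename, (kind, text) in sorted(sources.items()):
        match = FREQ_LINE.search(text)
        if match is None:
            problems.append(f"{filename}: no frequency line")
            continue
        found = float(match.group(1))
        wanted = TARGET_FREQUENCY[kind]
        if found != wanted:
            problems.append(
                f"{filename}: {kind} has frequency {found:g}, expected {wanted}"
            )
    return problems


sample_sources = {
    "dishes.txt": ("chore", "title: Dishes\nfrequency: 50\n"),
    "drill_a.txt": ("drill", "title: Drill\nfrequency: 400\n"),
    "visit.txt": ("narrative", "title: Visit\n"),
}
report = audit_frequencies(sample_sources)
```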

That helped a good bit.

The Calendar

Image of a spreadsheet showing the storylets available in the first and second half of each month.
Elements in orange only happen if the month is right and other conditions are met.

Another element clarified by text-searching in Dendry: the calendar in the original crowded some elements and left too much empty space in other months. If you had a particular two-week period with two or three storylets that could only show up then, they were pretty bottlenecked, and the player was likely to miss some of them.

After adjustment, the calendar came out with a few season-specific chores that ran for long periods, but no two storylets were restricted to the same N with a “month = N” requirement.
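
A check for that last property is easy to automate once the month restrictions have been pulled out of the source files. The sketch below assumes they have already been collected into a dictionary from storylet name to N; the function name and sample names are hypothetical.

```python
from collections import defaultdict


def month_collisions(month_locked):
    """Group storylets by their "month = N" value.

    Returns only the values of N that two or more storylets share,
    mapped to the sorted names of the storylets competing for them.
    """
    by_month = defaultdict(list)
    for name, n in month_locked.items():
        by_month[n].append(name)
    return {n: sorted(names) for n, names in by_month.items() if len(names) > 1}


# Illustrative data: two storylets bottlenecked on the same N.
crowded = month_collisions({"egg hunt": 7, "spring fair": 7, "harvest": 19})
```

An empty result means no two storylets are restricted to the same value of N.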

Storylet Stat Balances

Where I have a sequence of storylets that heavily involve one particular stat, I find it useful to break out a spreadsheet that looks at

  • what are the minimum and maximum values at which each storylet can appear?
  • what further gain or loss does each storylet offer (to that particular stat)?
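
For anyone who would rather build that view in code than in a spreadsheet, a minimal Python sketch follows. The `Branch` fields mirror the two questions above; the branch names and numbers are invented for the example and are not Bee's actual data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Branch:
    name: str
    min_value: Optional[int]   # None means no lower bound
    max_value: Optional[int]   # None means no upper bound
    delta: int                 # change to the stat if the branch is chosen


def available_at(branches, value):
    """Names of the branches whose min/max window contains `value`."""
    return [
        b.name for b in branches
        if (b.min_value is None or value >= b.min_value)
        and (b.max_value is None or value <= b.max_value)
    ]


def stat_table(branches):
    """Text rows sorted by minimum value, with unbounded minimums first."""
    def sort_key(b):
        low = float("-inf") if b.min_value is None else b.min_value
        return (low, b.name)

    rows = [f"{'branch':<16}{'min':>5}{'max':>5}{'delta':>7}"]
    for b in sorted(branches, key=sort_key):
        low = "-" if b.min_value is None else str(b.min_value)
        high = "-" if b.max_value is None else str(b.max_value)
        rows.append(f"{b.name:<16}{low:>5}{high:>5}{b.delta:>+7}")
    return rows


# Hypothetical rows for one stat.
sample = [
    Branch("make peace", None, None, -1),
    Branch("tease", None, None, +1),
    Branch("share a secret", None, 3, -2),
    Branch("big fight", 5, None, +2),
]
table = stat_table(sample)
```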

(The content management system for Fallen London offers a content view that organises storylets by min/max quality levels for any quality the author might select – it’s an easy way to quickly organise storylet pieces with a relatively linear narrative, but also a good way to get a sense of more dynamic systems.)

So for example, here are storylets associated with having your Lettice stat at a given value. The Lettice stat is actually a menace – meaning it tracks how badly things are going with her.

Screenshot listing storylets to do with Lettice and showing the minimum and maximum stats required to show a particular branch, as well as how much the player gains or loses by picking that branch.

(There are actually more Lettice storylets than this, but I’ve hidden a number of rows that were functionally similar to the ones shown here, in order to fit a full range of possibilities into the screenshot.)

What we see here is that:

  • There are a number of branches we can see any time: some of them (light blue) make the relationship better, and others (light yellow) make the relationship worse. There are a decent number of these, and while some are restricted to just one point of the year (like “easter”), others can occur at any time and are likely to be offered to the player quite a bit. In practice, the player has a bunch of opportunities to electively make that relationship better or worse.
  • There are branches that we can choose only when our relationship with Lettice is good (those with Max values up to Lettice = 3). Some of those (the darker blue-purple ones) make that relationship even better.
  • There are also branches we can choose only when our relationship with Lettice is bad (those with Min values from 1 to 10). Many of those make the relationship worse.
  • The last (red) branch with Min 10 actually kicks off a situation where your relationship with Lettice becomes so bad that it leads to an ending of the story.

This isn’t necessarily the right pattern for all projects! In fact, some of the features here would be actively discouraged by others: for instance, Choice of Games projects often discourage “rich get richer” mechanics, where someone with a low stat has opportunities to make it even lower, or someone with a high stat gets to make it even higher.

But for this system, the player can pick which storylet to play out of several, and typically has access to at least something that will freely raise or lower the Lettice stat.

In that context, what we gain by adding the rich-get-richer arrangement is a sense of narrative acceleration – especially when the player is intentionally exploring a particular narrative direction. A player who pushes a main stat in a particular direction should see the stakes escalate and new possibilities come into play rapidly.

We see the effects of this when we look at randomised playthrough analytics:

Bar chart of how many randomised automatic playthroughs of the game end with a Lettice stat with a particular value. There are many in the range of -4 to 4; then a gap; then a few of 10 or greater.
In a hundred randomised playthroughs, what was the player’s Lettice stat when the game concluded?

Players rarely end the game with a Lettice stat in the 6-9 range. They usually either have a better relationship – 5 or less – or rapid acceleration has pushed them above 9, at which point the Lettice ending of the game becomes available. Our automated player still chose Lettice-baiting storylets enough to wind up with some very high Lettice stats occasionally – but the important thing is that a human player wouldn’t be locked into boredom here, because the Lettice ending storylet would be available, and they’d be free to see how that went.
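
The tallying behind a chart like this is simple to reproduce. The sketch below is a toy model, not the actual test harness or Bee's rules: the random step choices and the `escalate_from` threshold are assumptions, and only the run length of 72 half-month slots comes from the three-year playthrough described earlier.

```python
import random
from collections import Counter

TURNS = 72  # three years of half-month slots


def play_once(rng, escalate_from=5):
    """One toy playthrough: a random walk whose upward steps get bigger
    once the stat is already high (a rich-get-richer rule)."""
    stat = 0
    for _ in range(TURNS):
        step = rng.choice([-1, 0, 0, 1])   # assumed mix of branch effects
        if stat >= escalate_from and step > 0:
            step += 1                      # high values climb faster
        stat += step
    return stat


def ending_histogram(runs, seed=0):
    """Count how many playthroughs finish at each final stat value."""
    rng = random.Random(seed)
    return Counter(play_once(rng) for _ in range(runs))


histogram = ending_histogram(100)
```

Plotting `histogram` as a bar chart gives the same kind of view as the one above; whether a gap appears depends entirely on the step rules plugged in.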

This is how the balance looks after I adjusted it. Before, there were fewer ways to raise or lower Lettice’s stats outside the rich-get-richer situations – meaning that the player would often get stuck with whatever type of Lettice relationship was established in the early game, and it was hard to tell the story of a relationship that evolved.

I also lowered the ceiling for bad things to happen, and made the upper range increases happen faster – to make it more likely that the player would experience some actual response to a situation.

We could take other approaches if we wanted other narrative dynamics. For instance, if we wanted the gameplay to mainly shepherd the player towards a median value, we could have low-value storylets raise the stat while high-value storylets lowered it.

Then it would be an active effort to get the stat to one of the extremes – which in some cases might be the desired effect.
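
The difference between the two dynamics fits in a few lines of Python. This is a deliberately simplified sketch: each rule returns the change a stat-gated storylet would apply, and the `pivot` value and function names are hypothetical.

```python
def escalating_delta(stat, pivot=0):
    """Rich-get-richer: push the stat further from the pivot."""
    if stat > pivot:
        return 1
    if stat < pivot:
        return -1
    return 0


def reverting_delta(stat, pivot=0):
    """Median-seeking: pull the stat back toward the pivot."""
    return -escalating_delta(stat, pivot)


def run(delta_rule, start, turns):
    """Apply the same rule every turn and return the final stat."""
    stat = start
    for _ in range(turns):
        stat += delta_rule(stat)
    return stat


drifted = run(escalating_delta, start=1, turns=10)   # ends at 11
settled = run(reverting_delta, start=5, turns=10)    # ends at 0
```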

Multi-Stat Storylets

Many storylet games have dependencies that can’t be expressed in terms of just one stat, and we’d need some additional mapping to understand, say, the tradeoff between your Lettice stat and your Parents stat.

Bee is comparatively simple, and in practice those tradeoffs were pretty easy to understand: if you have a very high stat of one of the negative types, you can sometimes reduce that stat by raising a different stat – trading one problem for another. Getting upset with your siblings, your poverty, or your social isolation can turn into conflict with your parents, for instance. Since this was straightforward, I didn’t wind up making additional spreadsheet views or graphs to model multiple stat situations.

But there were still a few things I wound up unpicking here. In the original version of the game, if you were feeling isolated and cut off from the world but you’d made friends with Jerome, you could text Jerome. Doing so would raise your Jerome relationship and lower your isolation stat – which seems reasonable on its face.

But the lowering of your isolation stat would mean that you also couldn’t text Jerome again until you were again feeling upset. That felt both wrong as a bit of procedural rhetoric and frustrating for people who would like to maximise their Jerome relationship. So I made some tweaks to make it easier to text Jerome again after that initial chat.
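
To make the problem concrete, here is a Python sketch of the original gating alongside one possible fix. The stat names, the threshold, and the `texted_jerome` flag are all assumptions for illustration; the actual tweaks made to the game may have worked differently.

```python
ISOLATION_THRESHOLD = 3  # hypothetical value


def can_text_original(stats):
    """Original rule: only available while the player feels isolated."""
    return stats["jerome"] >= 1 and stats["isolation"] >= ISOLATION_THRESHOLD


def can_text_revised(stats):
    """One possible fix: once the first text has happened, drop the
    isolation requirement for later ones."""
    already_texting = stats.get("texted_jerome", 0) >= 1
    isolated = stats["isolation"] >= ISOLATION_THRESHOLD
    return stats["jerome"] >= 1 and (isolated or already_texting)


def text_jerome(stats):
    """Effects of the storylet: closer to Jerome, less isolated."""
    updated = dict(stats)
    updated["jerome"] += 1
    updated["isolation"] -= 1
    updated["texted_jerome"] = 1
    return updated


before = {"jerome": 1, "isolation": 3}
after = text_jerome(before)
```

With these numbers, `after` has an isolation value of 2, so the original rule locks the player out immediately after the first conversation while the revised rule does not.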

Validation

Autumn ran randomised playthrough tests, which gave us analytics on which storylets were appearing too frequently and which endings were (all else being equal) particularly hard to unlock.

We did a couple of passes with this and revised until most storylets were averaging 1-2 hits per playthrough, various late-game storylets appeared 0.2-0.4 times per playthrough, and a handful of hard-to-reach or story-ending storylets appeared more like 0.07-0.15 times per playthrough.
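
Those per-playthrough averages come from a straightforward count. As a sketch in Python, assuming each automated playthrough is logged as a list of the storylet names it visited (the log format and the names below are hypothetical, not the actual test harness):

```python
from collections import Counter


def hits_per_playthrough(logs):
    """Average number of times each storylet appears per playthrough."""
    if not logs:
        return {}
    totals = Counter()
    for log in logs:
        totals.update(log)       # counts repeat visits within one run too
    return {name: count / len(logs) for name, count in totals.items()}


def rarest(rates, limit=5):
    """The `limit` least common storylets, rarest first."""
    return sorted(rates.items(), key=lambda item: (item[1], item[0]))[:limit]


# Four tiny illustrative logs.
logs = [
    ["drill", "drill", "chore"],
    ["drill"],
    ["chore", "ending"],
    ["drill"],
]
rates = hits_per_playthrough(logs)
```

Storylets that never appear in any log will not show up in `rates` at all, so a complete audit would also compare the keys against the full list of storylets.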

Chart of the least common scenes in Bee based on a hundred automated random playthroughs.

At this point, I was satisfied: though it was possible for the player to get the game into extreme stat situations, doing so would now fairly quickly lead to the possibility of a related ending – rather than parking you for a long time in a situation where you can only replay cards related to your negative situation.

4 thoughts on “Rebalancing Bee”

  1. This is super interesting! Thank you for the insights. I’ve also been thinking a lot about how to QA large interactive stories lately, and this gave some great food for thought.

    If this data ever becomes available, I would be super interested to know how the distribution of actual player choices matches the random sampling. Random sampling definitely seems like the right place to start, but I suspect there will be some significant deviations when actual people start playing, and I’d be super curious to see what those are and whether you have any potential explanations for them (or suggested ways to sample that better match actual play).

  2. Hi Emily. I’m playing this right now. It’s really absorbing, a very fine piece of work. You particularly made me feel something for poor Lettice. I am trying to make all the choices that will develop our sibling relationship, although it will probably screw me up spelling-wise. Wonderful.
