One of the fun (or, depending on how you look at it, annoying) aspects of polishing an IF game is working on the disambiguation. This usually involves combing through beta transcripts and seeing where the tester plainly meant X and the game instead understood Y. Sometimes the fixes are trivial things — you left out a synonym, you didn’t set a pronoun at the right time, etc., so the code didn’t have all the information it needed to make the right determination.
The more interesting cases are the ones that challenge you to think more deeply about how language normally works, though, and come up with sensible working rules about what the player probably means.
Alabaster spoilers after the cut.
F’rinstance: in Alabaster, everyone has a bunch of body parts. (To start with, it was just Snow White who had them — because her description mentioned her eyes and hair — but then, if one NPC is implemented with that degree of depth, the player reasonably expects everyone to be. So: body parts all around!)
At first, my disambiguation rule was this: if the player isn’t specific, he probably means to look at the body part of the person he’s talking to. He’s only going to examine his own body parts if he specifically distinguishes them from those of his interlocutor. And he probably doesn’t mean the body parts of the dead animal on the ground, either.
That worked for a lot of cases, and it was certainly better than having a bunch of “Which do you mean, Snow White’s nose, your nose, or the decomposing hart corpse’s nose?” every time the player typed X NOSE.
But more transcripts have come in, and they make it obvious that that’s not such a good rule after all. Suppose I have this:
You peer more closely at it. The heart is gone, the chest cavity lies open, it is plainly dead. And yet it seems to be looking at you with those glassy eyes.
(Snow White’s eyes)
They are flawless, deep black, the pupil and the iris indistinguishable except in the brightest lamplight. (Or perhaps there is a color that sunlight would reveal, but you have never seen her so.)
Wrong! Wrong, wrong, wrong. Immediately, totally breaks any illusion that the computer is telling you a story in a sensible way.
So then my thought was, do some kind of special-case tracking so that if you have just looked at the corpse, your immediately following reference to the eyes will disambiguate as the corpse’s eyes. Often an IF game is quite specifically hinting at you to notice something, and maybe it is worth tracking that in some way and using it for disambiguation and possibly to guide prose generation in the future.
But the result of that also seemed wrong, because what if the player types >X EYES again on the next turn? Should that revert to looking at Snow White’s eyes? Probably not. And what if he types >X HEAD or >X FACE? He probably still means the corpse’s head, right?
The current rule is this: whenever someone or someone’s body part gets mentioned to the game, that someone is set to be the current “body context”. Barring other information, the game will assume that a bare body-part name refers to the body context’s part, that is, to the body of whoever you most recently referred to.
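To make the rule concrete, here is a minimal sketch in Python rather than Inform 7. It is not the code Alabaster uses; the class name, the method names, and the table of who has which parts are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class BodyContextTracker:
    """Remembers whose body the player most recently referred to."""
    # Hypothetical world model: owner name -> names of that owner's parts.
    parts: Dict[str, Set[str]] = field(default_factory=dict)
    context: Optional[str] = None

    def mention(self, owner: str) -> None:
        """Call when the player's command names this owner directly."""
        if owner in self.parts:
            self.context = owner

    def resolve(self, part: str) -> Optional[str]:
        """Choose an owner for a bare part name such as 'eyes'."""
        candidates = [o for o, ps in self.parts.items() if part in ps]
        if self.context in candidates:
            owner = self.context        # the body context wins
        elif len(candidates) == 1:
            owner = candidates[0]       # only one possible owner anyway
        else:
            return None                 # unknown or still ambiguous: ask
        self.context = owner            # naming a part also sets the context
        return owner


tracker = BodyContextTracker(parts={
    "Snow White": {"eyes", "hair", "nose"},
    "hart corpse": {"eyes", "head", "nose"},
    "you": {"eyes", "nose"},
})
tracker.mention("Snow White")     # the player talks to Snow White
first = tracker.resolve("eyes")   # Snow White
tracker.mention("hart corpse")    # the player examines the corpse
second = tracker.resolve("eyes")  # hart corpse
third = tracker.resolve("head")   # hart corpse again
```

Note that the context persists until the player names someone else, which is what keeps a second >X EYES from snapping back to Snow White.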
That may turn out to be not-quite-right too. We’ll just have to see.
And I’m still wrestling a bit with the possibility of generalizing my initial idea, viz., to have some systematic way of tracking how the game has directed the player’s attention most recently, and using that to craft more fluent exchanges.
6 thoughts on “Fit and finish issues: disambiguation”
This seems similar to the problem of appropriately tracking pronouns, so maybe you could hook into the rules for updating pronouns to update the “body context” as well.
Computational linguists doing dialogue modelling often use a ‘salience list’. It contains (roughly speaking) every object previously mentioned in the dialogue; auto disambiguation of a noun phrase “the X” works by taking the first object in the list matching the property X. You might have a whole bunch of complicated rules for updating where things sit in the list, but very roughly when something is mentioned it gets pushed to the top. (You’d want to say that mentioning the hart pushes all its body parts to the top also, most likely, not just the ones explicitly mentioned.) “Mention” can be by anyone, so if the player responds to the description of the hart by asking something specific about Snow White she (and her eyes) get pushed again to the top of the salience list.
Sad to say I haven’t had time to look at I7, so I don’t know if something like this is already being done. But perhaps the idea of having a generic salience ordering of everything is helpful?
Oops, should have mentioned that the salience list is also used for pronouns. (It gets tricky with things like “he hit him”, where you want the top /two/ most salient male people not the top one twice.) Linguistically speaking I think the claim would be that pronoun resolution is a special case of more general (partly salience-based) NP resolution. Admittedly that special case comes with its own special restrictions, so it might not make sense to code it that way. (And again, this is written from a position of woeful ignorance about I7. Excuse me if this is old hat.)
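For anyone who wants to see the shape of the salience-list idea described in the comments above, here is a rough Python sketch. It is only an illustration under my own assumptions: objects are plain strings, the parts table is hypothetical, and the only update rule is "mentioning something moves it and all its parts to the front".

```python
class SalienceList:
    """Keeps objects in order of salience, most recently mentioned first."""

    def __init__(self, parts_of=None):
        # parts_of: hypothetical map from an object to its component parts.
        self.parts_of = parts_of or {}
        self.order = []  # index 0 is the most salient object

    def mention(self, obj):
        """Move obj, together with all of its parts, to the front."""
        group = [obj] + list(self.parts_of.get(obj, []))
        self.order = group + [o for o in self.order if o not in group]

    def resolve(self, matches, count=1):
        """Return the `count` most salient distinct objects that match."""
        return [o for o in self.order if matches(o)][:count]


def is_eyes(obj):
    return obj.endswith("eyes")


salience = SalienceList(parts_of={
    "Snow White": ["Snow White's eyes", "Snow White's hair"],
    "hart": ["hart's eyes", "hart's head"],
})
salience.mention("Snow White")
salience.mention("hart")             # the hart is described
after_hart = salience.resolve(is_eyes)
salience.mention("Snow White")       # the player asks her something
after_snow = salience.resolve(is_eyes)
```

Asking for `count=2` with a predicate such as "is male" gives the two most salient distinct matches, which is the behaviour the "he hit him" case needs.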
“Sad to say I haven’t had time to look at I7, so I don’t know if something like this is already being done.”
It doesn’t do anything nearly that sophisticated, no. It does keep track of one probable antecedent apiece for “him”, “her”, “it”, and “them” — but that’s as far as it goes (and it would certainly mess up the >SHOW HIM TO HIM sort of command).
I could sort of imagine adding something like this to the parser (though it would be a fairly significant hack). I think one would want the salience list to have a less powerful effect than other considerations about which actions are plausible; “is this thing edible?” seems likely to be more important than “is this thing most recently mentioned?” when deciding the object of an EAT command, for instance. In the experiments you’re talking about, how/how much does interpretation depend on the computer’s knowledge of a model world (or facts about the real world)?
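One way to let plausibility outrank salience is to compare candidates on a pair of values, plausibility first and recency second, so that recency only ever breaks ties. The Python below is a sketch of that idea and not anything the Inform parser does; the function name, the `is_plausible` callback, and the example objects are hypothetical.

```python
def choose_object(candidates, is_plausible, salience_order):
    """Pick a candidate: plausibility first, recency only as a tie-breaker."""
    def score(obj):
        plausible = 1 if is_plausible(obj) else 0
        # Earlier in salience_order means more recently mentioned.
        if obj in salience_order:
            recency = -salience_order.index(obj)
        else:
            recency = -len(salience_order)  # never mentioned: least salient
        return (plausible, recency)         # tuples compare left to right
    return max(candidates, key=score)


# >EAT with a candle mentioned more recently than an apple.
edible = {"apple"}
picked = choose_object(
    candidates=["candle", "apple"],
    is_plausible=lambda obj: obj in edible,
    salience_order=["candle", "apple"],
)
# picked is "apple": edibility outweighs the candle's recency.
```

A weighted sum would work too, but the tuple makes it impossible for any amount of recency to override an implausible action.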
I also thought of ‘it’. I can imagine some sort of ranking scheme a la my old favorite ChooseObjects where body parts of the ‘him’, ‘her’, and ‘it’ objects got the highest ranking, followed by the body parts of the person the player is talking to, followed by the player. Though actually I’m not sure how to disambiguate the first two objects in that list. Is there a state ‘is in a conversation’? If so, the ranking would look like:
if (in a conversation):
    most likely: the person being spoken to
    next most likely: ‘him’, ‘her’, or ‘it’
    next most likely: oneself
otherwise:
    most likely: ‘him’, ‘her’, or ‘it’
    next most likely: the person last spoken to
    next most likely: oneself