Intro
It’s late at night, 1:35am. I was watching “How to Sell Drugs Online, Fast”, which, despite its many flaws, falls into the sweet spot between a documentary and complete fiction: the spot where it’s intellectually stimulating but also not boring. It’s not just about the single boring-to-interesting spectrum; many other things are needed to land in that sweet spot, but none of that is relevant here.
I then made the mistake of picking up a bar of 60% cocoa chocolate. Cocoa contains between 8% and 58% of the amount of caffeine coffee has, so the effect is that I am now in that dreamy but active state, where just enough of your brain has shut down so you can be both productive and creative.
A Fighter’s Festival
Now, onto the main topic of this post: I am making a game about fighting (it’s called A Fighter’s Festival), and in this game, you (the player) are sort of a “coach” for the fighters. Last week I had a crazy idea for a game mechanic, which I think is pretty cool and I am now coming up with algorithms to implement it.
The idea for the mechanic is pretty simple: I want you (the player/coach) to shout at the fighters with your very deep coach wisdom. I also want the fighters to actually react to what you say. And finally, I want it to be possible to say both positive and negative things, and you must learn the personality of each of your fighters so you can:
- say useful things;
- say those things in a way your fighters will listen to.
Now, the complex bit is that this is a kind of Natural Language Processing, which computers are pretty bad at. Even to get good-enough results you usually need a lot of computing power to train ML models and, depending on the case, quite a bit just to run them too.
There are two other constraints for this mechanic: one of them makes it harder, the other easier (maybe).
The first of those constraints is that I’m going to show this game at a local game development event, Glitch-tyba Mundo, at the end of this month. So I am low on time both to develop the solution and to train any ML models I may need.
The second constraint is that the platform where the game runs (Nibble) has no keyboard input. I don’t want to depend on touch input, and I want to create a fluid input system (as in, you can “type” what you think).
You may be wondering how this can work in our favor. Maybe I am crazy, but let me explain. Since we have no keyboard or similar character-based input method, we need to use simpler, faster, word-based input methods. The fact that we choose the words that get displayed to the user is the big advantage here: our search space is much smaller because the system has to handle far fewer inputs.
Another thing that makes the system easier to design than a general NLP system is that we have a very simple set of meanings we want to convey to the fighters with our coaching.
The Mechanic
So, let’s work bottom-up on this mechanic. The first thing we need to know is what the fighters can actually do in response to the coach.
I thought of two broad categories where those actions fit. The first is actions which are taken in direct response to the coach, and the second is actions which change the mood of the fighter.
- instant actions
  - on self
    - attack
    - defend
    - heal
  - on others
    - help other attack
    - defend other
    - heal other
- mood changes
  - get depressed (-strength, -agility)
  - get anxious (-agility)
  - get angry (+strength, -agility)
  - calm down (+agility)
So, our task is first one of classification: given any of the possible phrases, classify it into one of three buckets:
- instant action
- mood change
- none
After that, for instant actions we need to find which characters the action applies to, what needs to be done, and whether there is an object in the sentence.
For mood changes we need to find out which change happens and at what intensity.
The simplest way I can think of to find information in a phrase is to have phrase templates that mark the locations of the elements inside it. For example, we could have:
- move now, {name}!
- move your lazy ass, {name}!
- get off your feet, {name}!
- hey {name}, move!
And then we would replace {name} with whatever fighter name the player typed. We could also use this information in the input system: when the user is about to type a fighter name, we only show fighter names.
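As a rough sketch of how such templates could be handled in Lua, assuming they are stored as plain strings with a literal {name} slot. The helper names `fill` and `split_at_slot` and the fighter name "Ana" are made up for illustration:

```lua
-- Template strings taken from the examples above.
local templates = {
  "move now, {name}!",
  "move your lazy ass, {name}!",
  "get off your feet, {name}!",
  "hey {name}, move!",
}

-- Replace the {name} slot with a concrete fighter name.
-- A function replacement is used so that a "%" in a name is kept literally.
local function fill(template, name)
  return (template:gsub("{name}", function() return name end))
end

-- Split a template into the text before and after the slot, so the
-- input UI can tell when it should offer fighter names only.
local function split_at_slot(template)
  return template:match("^(.-){name}(.*)$")
end

print(fill(templates[4], "Ana"))
```

Here `fill` produces the final phrase, while `split_at_slot` gives the input system the point at which the word list should switch to fighter names.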
Notice, though, that just finding out some important parts of the sentences is not enough because of a thing I will call context-based semantics.
To illustrate, let’s use our second example above:
- move your lazy ass, {name}!
For a fighter with a set of traits such as:
- animosity
- confidence
- dominance
This phrase, instead of being an instant action, could be a mood change towards angry (notice this is mainly due to the tone of the phrase and the dominance trait; we may be able to use this later).
And with another fighter, which has the following traits:
- submission
- humorous
- confidence
The same phrase could be an instant action!
Also notice that there may be contexts and phrases where a reaction is both an instant action and a mood change, but the mood change should always be detected first, as it’s the tone, coupled with the context (personality + current mood), that tells us how a given fighter will react to a given phrase. So, we could say that for every tuple (phrase, personality, mood) there is a corresponding set of actions (a1, a2, …, an).
Managing Complexity
At this point I feel the need to simplify, or at least better display, this mechanic. The player must be able to know, with a reasonable degree of certainty and without it being too hard, which result a given action will have. And when a result the player didn’t expect does occur, the game must show as explicitly as possible why it occurred. Otherwise players cannot learn how to play the game.
The first step for this is to show the changes that happen to heroes and the actions they take. As mentioned, this is not enough; for the system to be readable, we must also show why they take a given action or make a given choice, as if we were showing the thought process that happens inside the mind of the character.
The thing is, there is really no way of representing thought processes concisely (that’s probably why philosophy books are so massive). The best we can do is hint at something and hope the player gets it. But how do we hint at something the character thought? Language can help us here, since we always use language not to give an exact thought process, but to give a summary of our thoughts and help others understand what we experienced.
Take the above example of a dominant character who gets angry when told to do something in an aggressive tone. It’s not easy to represent the entire thinking behind this reaction, but it’s really easy to communicate it: we are all a bit dominant (or very much so), we have all felt this way, and when we did we entered a state of mind in which some things happened: we acted without reason, we demonized people, we cursed, and many other things. This is the key here: we just need to make the character do something that immediately makes us remember the way we felt and then, through empathy, we will know what the character thought. In this case, it can be as simple as making the character say:
asshole!
All together now
So, what did we learn from this very, very long philosophical blabber?
- we do not want to run expensive (my time & computer time) algos;
- we need to represent input and output in such a way the player can infer the function in between them.
Let’s put this in a diagram, just because I love diagrams.
Now, working backwards through this diagram: first we have the character reaction block (on the far right), which is responsible for figuring out what a character will do. As discussed previously, to do that it needs a tuple (phrase, personality, mood). In the diagram, I broke this tuple down even further: personality became a set of traits, and phrase got divided into an action and a tone.
We could represent this in plain old Lua as:
function react(traits, mood, tone, action)
  local reaction
  -- ...
  return reaction
end
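To make this more concrete, here is a minimal sketch of what the body could look like. Everything inside it is an assumption: traits are a set-like table, mood and tone are plain strings, the reaction is a small table, and the single rule is only inferred from the two example fighters described earlier (a dominant fighter gets angry at an aggressive tone, anyone else just does what was asked).

```lua
-- Sketch only: the rule below is an assumption based on the two example
-- fighters in the text, not the game's actual heuristic.
local function react(traits, mood, tone, action)
  local reaction = { mood = mood, action = nil, say = nil }
  -- The mood change is checked first, because the tone decides how the
  -- phrase lands before any action is considered.
  if tone == "aggressive" and traits.dominance then
    reaction.mood = "angry"
    reaction.say = "asshole!"  -- the hint that explains the reaction to the player
  else
    reaction.action = action   -- the fighter simply does what was asked
  end
  return reaction
end

-- The two trait sets from the examples; "calm" and "attack" are placeholders.
local dominant   = { animosity = true, confidence = true, dominance = true }
local submissive = { submission = true, humorous = true, confidence = true }

local r1 = react(dominant, "calm", "aggressive", "attack")
local r2 = react(submissive, "calm", "aggressive", "attack")
print(r1.mood, r1.say, r2.mood, r2.action)
```

A real version would need more rules (and probably intensities), but keeping it as one plain function of four inputs is what makes it cheap to run and easy to tweak.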
Now let’s go up a bit: we have two libraries, the traits library and the mood library. The traits library maps a fighter to their traits, and the mood library maps a fighter to their current mood:
function traits(fighter)
  local fighter_traits
  -- ...
  return fighter_traits
end
and
function mood(fighter)
  local fighter_mood
  -- ...
  return fighter_mood
end
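One simple way to back these two libraries, assuming fighters are identified by name, is a pair of lookup tables. The fighter names, the starting moods and the `set_mood` helper below are hypothetical:

```lua
-- Hypothetical fighter data, made up for illustration.
local TRAITS = {
  Ana  = { animosity = true, confidence = true, dominance = true },
  Beto = { submission = true, humorous = true, confidence = true },
}
local MOODS = { Ana = "calm", Beto = "anxious" }

-- Traits library: fighter name -> set of traits (empty set if unknown).
local function traits(fighter)
  return TRAITS[fighter] or {}
end

-- Mood library: fighter name -> current mood (nil if unknown).
local function mood(fighter)
  return MOODS[fighter]
end

-- Moods change during a fight, so the mood library also needs a setter.
local function set_mood(fighter, new_mood)
  MOODS[fighter] = new_mood
end

print(mood("Beto"), traits("Ana").dominance)
```

Traits stay fixed for a fighter, while moods get rewritten every time a reaction says so.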
Now come three very important parts of our system that we will build in the simplest way possible: target detection, tone detection and action detection. For target detection, we use the template system described previously; for tone and action, we simply hardcode the values inside the template for each phrase (simple solutions, babe!).
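A possible shape for this, sketched under assumptions: each template is a table with its text plus a hardcoded tone and action, and detection is a pattern match against the typed phrase. The tone labels, the "move" action label (which is not in the action list above) and the helper names are placeholders.

```lua
-- Templates with hardcoded tone and action (the values are placeholders).
local templates = {
  { text = "move now, {name}!",           tone = "neutral",    action = "move" },
  { text = "move your lazy ass, {name}!", tone = "aggressive", action = "move" },
}

-- Turn a template into a Lua pattern: escape the magic characters,
-- then swap the {name} slot for a capture.
local function to_pattern(text)
  local escaped = text:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0")
  return "^" .. escaped:gsub("{name}", "(.+)") .. "$"
end

-- Target, tone and action detection in one pass over the templates.
local function detect(phrase)
  for _, t in ipairs(templates) do
    local name = phrase:match(to_pattern(t.text))
    if name then
      return { target = name, tone = t.tone, action = t.action }
    end
  end
  return nil  -- no template matched: the "none" bucket
end

local hit = detect("move your lazy ass, Ana!")
print(hit.target, hit.tone, hit.action)
```

Since the player can only build phrases out of the templates, the matching step could even be skipped by remembering which template was used while the phrase was being typed.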
And that’s it! Now we just need to input phrases, which can be trivially done by constructing a tree from all the possible phrases and traversing it in real-time as the user makes choices.
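A minimal sketch of that tree, assuming phrases are split on whitespace (so punctuation stays attached to its word) and each node simply lists the words that may come next. The names `build_tree` and `choices` are made up:

```lua
-- Build a word-level tree (a trie) from all the possible phrases.
local function build_tree(phrases)
  local root = { children = {}, is_end = false }
  for _, phrase in ipairs(phrases) do
    local node = root
    for word in phrase:gmatch("%S+") do
      node.children[word] = node.children[word]
        or { children = {}, is_end = false }
      node = node.children[word]
    end
    node.is_end = true  -- a complete phrase ends at this node
  end
  return root
end

-- List the words the player may pick next, sorted for a stable display order.
local function choices(node)
  local words = {}
  for word in pairs(node.children) do
    words[#words + 1] = word
  end
  table.sort(words)
  return words
end

local tree = build_tree({
  "move now, {name}!",
  "move your lazy ass, {name}!",
  "hey {name}, move!",
})
local first = choices(tree)                        -- options at the start
local after_move = choices(tree.children["move"])  -- options after "move"
print(table.concat(first, " | "), table.concat(after_move, " | "))
```

Each choice the player makes moves one node down the tree, and a node whose word contains {name} is where the UI would show fighter names instead of the raw word.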
There you have it: the fakest NLP system. Everything is pre-processed, with just one important function, which is actually just a simple heuristic modeling character psychology: the react function described above.