Friday, January 26, 2024

BattleTech AI [Part 3/??]: Extending the Behavior Interface to Support Ongoing/Asynchronous Work

In previous posts, I've talked about how a Behavior Tree is made up of Behavior nodes, assembled with the connective tissue of sequence and selector nodes. This is a great start, but it's useful to provide a means for suspending traversal through the tree, doing part of the work now and meaningfully coming back to it later, especially if some behaviors take a while.

We used Unity on PC for BattleTech, which gave us a certain structure to work with. You can set up "Game Objects" that receive periodic update calls, where you can run AI or other internal logic for each object. You have to finish each update fairly quickly so that other things, like rendering, or other programs on the computer outside the game, can continue to operate. There are certainly approaches you can take to make this less intrusive, so that you can pretend you have the whole computer to yourself, but even these have costs.

The way that I connected the AI system with the rest of the BattleTech game was that the game would inform a unit that it was time for that unit to move. In BattleTech, a human player had some freedom to move their units in any order, or to defer a movement. We thought a little about giving the AI the same freedom, but I felt the value of allowing the AI to choose its order would be small, and if one unit moved at the end of turn 1 and then again at the beginning of turn 2, it might feel like the unit was moving too often, even though the human player has the same option.

So, when the game tells us that it's time for one of the AI mechs to move, the AI begins going through the behavior tree. A lot of the nodes are "internal", having no effect on the outside world and no visuals. Examples of this are the observational nodes, which return success or failure for some question about the outside world or the mech itself. These usually execute very quickly. However, there are things that take longer, like pathfinding.

Our game was broken up into missions, some of which were a simple arrangement of a few different spawnpoints on the map: the human player's mechs starting at one spawnpoint, the AI mechs spawning elsewhere on the map, and the last player standing winning.

We also had other missions that were more carefully scripted. In these missions, the level designer would provide "orders" to give the AI mechs things to do. Often, this would be that the AI player units would start at one location, and go to a second location. Sometimes, it was left up to the AI to figure out a good path to the destination, sometimes a path was provided.

In other games, even waiting for a movement to complete is an important constraint. It's not so bad for BattleTech, because the Behavior Tree code generates orders and hands them to the underlying game code to act on, much the way human player input is acted on. Once the AI decides on the orders, it submits them, and the AI goes away until the next time it needs to move a unit.

Even so, generating these orders might take up to a second, and we want to go on drawing the screen, playing music, and doing other things, so we broke up that processing by "timeslicing", which just means doing a little bit, letting the computer do what it needs to do, then doing a little bit more, and so on, until completion.

This timeslicing approach meant that we would be working on traversing our tree, executing behavior nodes, and then the timer would tell us to pause our work. If you have ever played "Red Light/Green Light" or "Granny's Footsteps", it's similar.

The way I got this to work was to check the clock when I started working. I would do as much execution as I could until the clock told me I was out of time, and then I would pass a special return code, RUNNING, up the tree instead of success or failure. This value would propagate up to the root, which would itself return RUNNING, and I would know that the AI was still in progress.

Then, on my next update, I would follow the chain of RUNNING nodes down to the bottommost one, let it continue as before, and pass RUNNING up the next time we ran out of time. Eventually, we'd have orders to submit, and we wouldn't need to do any more evaluation.
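The resume-where-you-left-off mechanics can be sketched roughly like this. This is a minimal illustration in Python, not the shipped code (the game itself was built in Unity); all class and function names here are hypothetical:

```python
import time

SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

class Leaf:
    """A leaf behavior wrapping a function that returns SUCCESS or FAILURE."""
    def __init__(self, fn):
        self.fn = fn

    def tick(self, deadline):
        return self.fn()

class Sequence:
    """A sequence node that remembers where it left off between updates."""
    def __init__(self, children):
        self.children = children
        self.current = 0          # index of the child to resume at

    def tick(self, deadline):
        while self.current < len(self.children):
            if time.monotonic() >= deadline:
                return RUNNING    # out of time; resume here next update
            result = self.children[self.current].tick(deadline)
            if result == RUNNING:
                return RUNNING    # a child is still working; stay on it
            if result == FAILURE:
                self.current = 0  # reset so the next traversal starts fresh
                return FAILURE
            self.current += 1     # child succeeded; move to the next
        self.current = 0
        return SUCCESS
```

Each call to the root's tick() gets a deadline; when the deadline passes, RUNNING bubbles up, and the stored index lets the next update drop straight back down to the node that was mid-work.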

In a different game, where the behavior trees were being evaluated more frequently, or in a more ongoing fashion, you could imagine a "walk to <location>" behavior that might take several seconds to complete, depending on how far away the location is, and how fast or slow the AI walks. You would return RUNNING as long as the character was walking (don't get confused between the RUNNING return state, which refers to evaluation of the Behavior Tree and the physical character's movement animation).

If your game has this kind of frequent, ongoing AI processing, you would want to validate that the various sensors you used when deciding to do the ongoing action still support doing it. For example, in the grocery-shopping example, if you suddenly ran out of gas on your way to the store, you'd want the AI to notice right away and select a different behavior for getting food.

How far back do you go? It seems like you need to re-run the observational behaviors to figure out the context you're acting in, but that means treating observational behaviors differently from behaviors with physical effects, which changes the structure a little. For more discussion on that, you'll have to go elsewhere; we had the luxury of essentially one unit moving at a time.


Next up: Influence Maps for Spatial Awareness

BattleTech AI [Part 2/??]: Behavior Trees, an abstract discussion

In Part 1, I talked a little bit about diving into the BattleTech project and choosing Behavior Trees as an architecture. In this part, I'll talk about what Behavior Trees are and how we used them to issue orders. In a later part, I'll talk more about the actual workflow and implementation.

What is a Behavior Tree?

It's a tree of Behaviors. I suppose it doesn't really need to be a tree - if you had nodes with more than one parent, that'd no longer be a tree, and maybe that makes some sense, like if you wanted to reuse a chunk of your tree elsewhere. But for us, there's a top-level node, and then a bunch of nodes below. I'm basically pushing the important part down to the obvious next question:

What is a Behavior?

In related architectures, the idea of a "Behavior" might also be called a "Task" or an "Action". Let's use "Behavior" here, since the overall architecture uses that in its name. It can make sense to look at a big section of the tree as making up a higher-level "behavior", which is fine when you're talking about "attack behavior" or other emergent things that the AI does. But, importantly, those are emergent, observed phenomena, not anything you can point to in the code.

A Behavior, for our purposes, is an elementary bit of business for the AI to do. It's like a function call that returns SUCCESS or FAILURE (maybe some other values as we get deeper, but at first just a boolean indicating if we succeeded). Examples of this might be "go to <target location>" or "shoot weapon at <target enemy>" or "power off engines" or "stand up". If you imagine playing a game like BattleTech, you could imagine these things being actions you could take on your turn, possibly by clicking on the map, perhaps pressing a button in the UI or on your keyboard.

This feels like a pretty simple idea to start with, and it is. If you've done any programming, you might recognize it as basically a function call or subroutine. There are at least two important properties that starting this way provides:
  • Encapsulation - by putting together a behavior like "shoot weapon at <target>", we're establishing a small space to work in. In this case, we're passing in a target (how that works isn't super important right now). The important part is that there's a lot of stuff outside our little sandbox, and we are deliberately only paying attention to what we need to get our job done. This works on the way out, too: we're returning SUCCESS or FAILURE and not leaking extra information to the outside world. Again, in some cases, we will want a little more flexibility, but this is a useful rule to work with.
  • Success/Failure Results - every behavior outputs a success/failure value, even ones that seem like they would always succeed. This is not a heavy burden, and simple code often grows more complicated over time, so requiring this output from the start helps keep things predictable.
The combination of these two makes Behaviors like little Lego building blocks that you can put together, mixing and matching to build your tree, except we haven't talked about how those connections are made, which is the next bit.
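The interface described above can be sketched very compactly. This is an illustrative Python sketch, not the game's actual code; the ShootWeaponAt behavior and its target object are hypothetical names invented for the example:

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class Behavior:
    """Base interface: every behavior runs and reports SUCCESS or FAILURE."""
    def tick(self):
        raise NotImplementedError

class ShootWeaponAt(Behavior):
    """Encapsulation in action: we take in only the target we need,
    and we hand back only SUCCESS or FAILURE to the outside world."""
    def __init__(self, target):
        self.target = target

    def tick(self):
        if self.target.in_range:
            self.target.take_hit()
            return SUCCESS
        return FAILURE        # even a "simple" behavior reports a result
```

Because every node speaks this same tiny protocol, anything that ticks and returns a result can be snapped into the tree anywhere a behavior is expected.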

Connecting Behaviors

If we have a couple of different behaviors, like "shoot gun" and "eat sandwich", the first way we might think to connect them is to put them in a sequence - "shoot gun" AND THEN "eat sandwich". Implicitly, we want to do the first action, and then, once it has completed, do the other. Perhaps there's a connection between the two actions that gives us a reason to do things in a certain order, and maybe we don't want to do all of the actions all of the time. Using the result values from an action lets us express what we want.

Sequences

Let's now consider the action "shoot gun". Let's say that our gun needs ammunition, and will not work without being loaded. So, for now, let's also imagine an action "load gun". These two actions make sense to group together, so let's connect them as nodes under a parent node to capture the idea of "load gun" AND THEN "shoot gun". This is a good start, but recall that all of our behaviors return success or failure. If I fail to load the gun, I don't expect to be able to shoot the gun, so when I combine these nodes, I probably want the concept to be more like IF I "load gun" THEN I "shoot gun". We're still clear that the shooting happens after the loading, but we're also clear that the shooting is conditional on the success of loading.

The structure we will use to accomplish this is to begin building a tree. "load gun" and "shoot gun" are each nodes in the tree, and we'll group them together as children under a sequence node. The sequence node adheres to the same interface we built for behaviors - it runs, and it returns success or failure. Our sequence node succeeds if we execute "load gun" and it succeeds, and then execute "shoot gun" and it also succeeds. We walk through the behaviors under our sequence node, running them one at a time. When all of the behaviors have run and succeeded, the parent sequence node returns success. If any behavior fails, we stop execution and return failure.

So far, our example only has two behaviors, but it's easy to imagine, and easy for the tree to support, three or more behaviors under a sequence node. All have to succeed for the sequence node to return success; if any fail, the parent sequence node reports that the sequence has failed.
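A sequence node is a few lines of code. This is a sketch under the same assumptions as before (illustrative Python, hypothetical names, no RUNNING state yet):

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class SequenceNode:
    """Runs children in order; fails on the first failure,
    succeeds only if every child succeeds."""
    def __init__(self, *children):
        self.children = children

    def tick(self):
        for child in self.children:
            if child.tick() == FAILURE:
                return FAILURE    # one failed child fails the sequence
        return SUCCESS            # every child succeeded
```

With this in place, SequenceNode(load_gun, shoot_gun) captures exactly the IF "load gun" THEN "shoot gun" idea: "shoot gun" is only reached when "load gun" reports success.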

Selections

A different way we might want to group behaviors is the idea that we want one of the behaviors to run to accomplish some goal, and we have many possible ways of doing this. So long as one behavior succeeds, the entire grouping has done its job. The earlier example of "eat sandwich" or "eat old pizza" or "cook meal" is that idea: I'm hungry, and any of these behaviors would accomplish what I want, which is to feed myself somehow.

We'll use a very similar structure, grouping the different behaviors together under a selection node. This node will try running each of its children, and for each one, if the child succeeds, then we're done, and the parent selection node returns success, no further work needs to be done. If the child fails, we go on to the next child, try that, and repeat. The first child success is a success for the parent. If all children fail, then the parent returns failure.
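The selection node is the mirror image of the sequence node. Same caveats as the earlier sketches (illustrative Python, hypothetical names):

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class SelectorNode:
    """Tries children in order; succeeds on the first success,
    fails only if every child fails."""
    def __init__(self, *children):
        self.children = children

    def tick(self):
        for child in self.children:
            if child.tick() == SUCCESS:
                return SUCCESS    # first success wins; skip the rest
        return FAILURE            # every option failed
```

So SelectorNode(eat_sandwich, eat_old_pizza, cook_meal) tries each way of getting fed, stopping as soon as one works.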

This "early return" pattern might remind programmers of short-circuit evaluation of AND and OR clauses in many languages: expressions are evaluated one at a time, left to right, stopping when later evaluations are unnecessary. A sequence is like an AND of all its children - do this AND that AND the next, and so on. A selector is like an OR - do this OR that OR the next, one at a time, until one succeeds.

It's a neat way to think of a combination of behaviors as pieces of a big logical calculation, as if you're wiring up a circuit or assembling a flowchart. These analogies aren't bad, and they can inspire other nodes we might want.

Inverters

One of the simpler nodes: it has a single child and flips SUCCESS to FAILURE and vice versa. I'll hold off on examples here, as they seem contrived in isolation. For now, assume that you'll want to reuse nodes, and that sometimes the opposite result is the one you need.
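The inverter is about as small as a node gets. A sketch in the same illustrative style as the other snippets:

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class Inverter:
    """Decorator node: one child, result flipped."""
    def __init__(self, child):
        self.child = child

    def tick(self):
        # SUCCESS becomes FAILURE and vice versa
        return FAILURE if self.child.tick() == SUCCESS else SUCCESS
```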

Using Behaviors to Observe the World

Because we have this idea of a behavior being activated and returning success or failure, we can think of actions that don't have any external change in the world, that are essentially just observing the outer world, or even our inner self. Am I hungry? Is my gun loaded? Is there a bad guy nearby? Do I have money in my bank account? All of these questions can be answered "yes" or "no", and so fit well into our behavior definition.

Going back to the eating example, we might think about the trip to the grocery store, which we said would look like "go to store" AND THEN "buy ingredients" AND THEN "come home" AND THEN "prepare meal", but we might want to stick an observation at the beginning, a new node "has money", which seems like a good thing to think about before taking this trip. If I don't have money, I don't even want to go to the store, much less any of the other behaviors. The "has money" behavior node returns success if I do have money, and the rest of the sequence proceeds as expected. If I don't have money, it returns failure, and we can think about doing other things.

Thinking of observer nodes like this expands our idea of what a behavior tree can do - now we group more "active" behaviors together with these more passive observations to ensure that we do things in the right context. An important part of making an AI appear intelligent is to have it properly react to its environment, and the more observations you make, the better you'll be able to ensure that the actions make sense.

Indeed, let's add some more observations to our grocery trip, like "has car" which succeeds if you have a car, maybe we'd want to also have an observation like "car has gas", and it's probably good to have an observation like "the store is open". We can sequence all of these observations along with our sequence action nodes, and if any of the observations is false, we'll save a trip.
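The grocery trip above, with its leading observations, can be sketched end to end. Everything here is illustrative (hypothetical node names, a toy dictionary standing in for the world), but it shows how failed observations short-circuit the actions that follow them:

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class Condition:
    """Observation node: asks a yes/no question, changes nothing."""
    def __init__(self, predicate):
        self.predicate = predicate
    def tick(self, world):
        return SUCCESS if self.predicate(world) else FAILURE

class Action:
    """Behavior with a physical effect on the world."""
    def __init__(self, effect):
        self.effect = effect
    def tick(self, world):
        self.effect(world)
        return SUCCESS

class Sequence:
    def __init__(self, *children):
        self.children = children
    def tick(self, world):
        for child in self.children:
            if child.tick(world) == FAILURE:
                return FAILURE    # a failed observation saves the trip
        return SUCCESS

grocery_trip = Sequence(
    Condition(lambda w: w["money"] > 0),      # "has money"
    Condition(lambda w: w["store_open"]),     # "the store is open"
    Action(lambda w: w.update(location="store")),     # "go to store"
    Action(lambda w: w.update(ingredients=True)),     # "buy ingredients"
    Action(lambda w: w.update(location="home")),      # "come home"
    Action(lambda w: w.update(meal_ready=True)),      # "prepare meal"
)
```

If "has money" fails, none of the actions run at all; the observations act as guards on the whole trip.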

I'll loop back here and point out that with observation-only behaviors, the inverter node from before becomes more obviously useful. We might set up a sequence that happens when "car has gas" is true, and elsewhere in the tree have another "car has gas" node under an inverter node, to express a sequence of actions for when we've observed that we don't have gas in the car; maybe we go to the gas station.


Summary

These are the bare-bones ideas behind some of the nodes that are often used in Behavior Trees. Lots of people have implemented many versions of these ideas, and in a later post I'll get into the details of what we did to meet our requirements. The basic ideas feel foundational and generally useful, but there's nothing stopping you from coming up with variations or brand-new nodes. A variation on a sequence might randomize the order in which its children are run. You might build in "cooldown" logic with a node that fails if it's called within five seconds of the last time it was called.
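The cooldown idea mentioned above makes a nice example of a custom decorator node. This is a sketch, not anything from the game; the injectable clock and the five-second default are just illustrative choices:

```python
import time

SUCCESS, FAILURE = "SUCCESS", "FAILURE"

class Cooldown:
    """Fails if ticked again within `seconds` of its last run."""
    def __init__(self, child, seconds=5.0, clock=time.monotonic):
        self.child = child
        self.seconds = seconds
        self.clock = clock        # injectable clock makes testing easy
        self.last_run = None

    def tick(self):
        now = self.clock()
        if self.last_run is not None and now - self.last_run < self.seconds:
            return FAILURE        # still cooling down
        self.last_run = now
        return self.child.tick()
```

Because it honors the same tick-and-report interface, a Cooldown can wrap any behavior or subtree without the rest of the tree knowing it's there.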

With the simple behavior interface, you can build deep and broad trees easily, where nodes in a branch combine together to capture larger patterns.

Next Up: expanding the behavior interface

New Year, New Series, Old Tech - BattleTech AI [Part 1/??]: Introduction to Behavior Trees

Back in 2016, I got hired at Harebrained Schemes to be the AI engineer for their upcoming PC turn-based squad-based strategy mech combat game. I got pretty wide room to build whatever system I thought would work, but there were constraints, based on the game we were making:
  • Turn Based - a constraint or a liberty: I had a bunch of time (maybe a second?) to make a choice per unit per turn. I could take time to deliberate, but the expectation of decision quality was higher than in an action game.
  • Squad Based - the AI units had to feel like they were working together as a group of up to four units, typically with some military training.
  • Mech - the fiction was that these were big combat rigs, with human pilots inside. There was not freedom to write off bad decisions as robots being robots or alien brains.
  • Combat - the AI needed to focus on, and have competency in, fighting. Being aggressive, starting fights, taking the enemy out when possible.
  • Game - implicit, but worth calling out: the requirement was that the decisions be good, but "perfect" gameplay was an anti-requirement. We didn't want to keep the player from winning altogether.
In my initial conversations with Kevin, the designer that I would be working with, we talked about having a "playbook", where certain configurations of units (my units look like this, the enemy units are laid out like that) would lead to certain pre-scripted plans (from this configuration, we should try to flank to the left).

I pushed back on this. My gut feeling was that the number of configurations we'd have to set up, the number of scripted reactions, would be way too high. I was worried about matching the current state to a known input state in our playbook. I was worried that the scripted responses would feel too canned, and also that the response scripts would need to be interrupted as the player reacted.

Perhaps we could have made this work, but it didn't feel like a path to responsive, satisfying gameplay.

My mind went back to previous work I had done with Pattie Maes' Activation Nets for my thesis, which I knew had evolved into C4, though I hadn't done anything with that. I chatted with Damian Isla at GDC, who got me thinking about Behavior Trees; he had used them for the Halo 2 AI and given a GDC talk about them, so that felt promising.

My experience with most AI architectures is that the idea is simple, and the complexity comes in implementation. Consistent with that, Behavior Trees are an easy idea - decompose the big behavior space you want to work in down to a small set of actions, some of which might be alternative ways of accomplishing a goal. If I'm hungry, I have a goal to feed myself, which could be satisfied by making a sandwich, reheating a slice of pizza from last night, or getting ingredients from the store and cooking a meal. These actions might themselves break down into tasks (going to the store, buying a list of items, coming back home, preparing the meal). Importantly, Behavior Trees provide structural support for a task failing. If I wanted to reheat the pizza, but my roommate had finished it off, that task would not work, and I'd have to take a different option.

Lots of different people have made lots of different Behavior Tree implementations, and from my research, most of those implementations were targeting realtime, not turn-based use cases. This means that my implementation might better be called a "decision tree", a glorified flowchart that cascades down the tree, ultimately resulting in a set of orders that the AI hands to the game for the game to execute.

This isn't (just) a pedantic notational quibble - if you're doing realtime control, using Behavior Trees, you're going to want to be aware of ongoing actions and handle changing situations. Because BattleTech is turn-based, I had the luxury of doing a fresh calculation each turn for each unit.

At the time that I started, one of my friends and coworkers, Andy, had set up a small Unity project with existing Mech assets (thanks to an agreement to work with Catalyst who had a bunch of Mech models). Shortly after coming back from GDC, I hacked together a pretty tiny Behavior Tree as a proof of concept, where the mechs would walk toward player-controlled units, and fire some weapons. It wasn't a lot, but it showed all involved that this was a system that could work, and I could improve it to meet our requirements.

Next up: more depth on Behavior Trees