dib, a robot for the 2002 ICFP Programming Contest

Written by Team Solecist: Dan "dfan" Schmidt, Sean "buzzard" Barrett, Dan "inky" Shiovitz

How did they do it?

dib is written in Ocaml, a language in which we had a total of one man-week of experience before the contest started. But hey, good learning oppportunity, right? We further handicapped ourselves by being thousands of miles away from each other. But it ended up being fun regardless.

dfan began working on the task as soon as it was announced, starting by writing a server in Python. inky started playing around with the problem a few hours later and was persuaded to form a team, despite the fact that dfan had just made the unilateral decision to switch to Ocaml because he had recently started playing with it and it seemed like a good fit. buzzard was roped in moments later. After three hours of installing and configuring CVS, ssh, and Ocaml while communicating through chat windows, we were off!

'Team Solecist' was eventually chosen as our name after the realization that, as Ocaml neophytes, we were probably committing tons of solecisms. Also, we had used up the half hour of time allocated to choosing a team name and had to get back to coding.

Overview

dib ended up being just under 2000 lines of Ocaml code. It consisted of three major systems, which ran in series each turn.

  1. Determine the "optimal" move with the 'strategist'
  2. Override local navigation with the 'pattern matcher'
  3. Compute the amount to bid with the 'bidder'
Each of these systems worked totally independently; the strategist picks a goal, computes one or more moves to achieve that goal, and then reports just the first move as "what to do now". The pattern matcher looks at that one move, irrespective of what it is supposed to accomplish, and attempts to adjust it for local navigation and certain strategies. The bidder takes the suggested move, looks at the presence of other robots in the vicinity, and attempts to outbid those robots.

The strategist

The strategist evaluates six or seven possible things to try to do, in a particular order. The strategist is stateless; it makes plans from scratch every turn, rather than trying to remember a plan from turn to turn. This is mostly because we didn't have the time to have it remember plans, but also because this makes it resilient to changes, like being pushed. It performs the action associated with the first of the following checks that succeeds.

1. Is it in position to dunk another robot? If it is, and if it isn't the case that if it misses because the other bot moves first, the other bot will be in position to push IT into water--if that isn't the case, then it goes for the kill.

2. Does it have any packages that are at their destination? if so, it drops them.

3. Are there any packages here it can pick up? If there are, it clumps the packages into groups by their destination, and then finds the group with the best ratio of value to distance-from-here. (We do a BFS across the entire map from the robot's location to allow distance reasoning and pathfinding. A* or something more clever for pathfinding is going to be slower in mazes, since it's more expensive inner loop, and won't allow distance reasoning.) If that pickup will still leave space free, it evaluates other packages in the same way, but now reasoning using distances from the destination of the group it just decided to pick up, rather than from its current location.

4. Has the bot moved more than one square in the last five moves? (Back and forth in two squares is NOT moving more than a square.) If so, there is a 25% chance it doesn't move. This is to avoid "mexican standoffs", where bot A pushes bot B east, so B drops a package. bot A picks up the package, but bot B pushes A west, resulting in A picking up the package (or not, depending on bids) and then dropping it when pushed, and B ending up on top of it; the cycle repeats endlessly. It also avoids a problem with meeting in a two-way corridor and both bots try to sidestep to get around the other, and they repeat that endlessly.

5. Does it have any packages? If so, then pick the nearest delivery location, compute a route to it, and take the first step on that route.

6. If it got to this point, it has no packages, so it finds the nearest place-where-it-knows-there-are-packages or the nearest unexplored base. Then it takes a step to it.

7. Otherwise it sits in place.

The pattern matcher

The pattern matcher attempts to navigate around obstacles, set up kills, avoid putting itself in a place to get killed, and other stuff. It does this by using a collection of hand-built map templates that represent situations involving other bots and their predicted movements, relative to our robot trying to make a specific move. There's a template which represents our robot trying to go head-on into another robot, when there is also a row of empty spaces next to it--in which case we go around. There's a template which means 'this robot is waiting for us to move into a space between it and water; stand still and wait for it to move away'. There's a template which means "this robot is coming towards us down a 1-wide corridor that we're near the mouth of, so back out of the corridor to make way for it". (submitted patterns)

The bidder

The bidder says: are we in position to be dunked, and we're moving? then bid 5% of our money. is our move an attempt to dunk someone else? then bid 1% of our money. is our bid potentially going to conflict with other bots? then bid X where X is our estimated amount that they're bidding, based on past situations where they seem to have outbid us (bids are secret).

The unimplemented

There were infinite things we thought of but didn't have time to do. This is mainly because we were using ocaml despite the fact that nobody on our team knew it well—dfan had a week of experience with it, buzzard had pair-programmed for 4-8 hours with, but not in the driver's seat, and inky had done a little ML in a college class. The above is a very utilitarian strategy, since there's no long-term planning.

We don't go long distances to hunt someone down and dunk them; we only dunk opportunistically. (The pattern matcher encourages us to stop in opportune locations or even to move one square for them.)

We never dunk predictively: guess that a robot is going to go to square X, and move to square X with a large negative bid.

We didn't think to put in code to handle undeliverable packages; the existing code is moderately tolerant of them, but we could end up with an armful of undeliverables and sit in place the rest of the game.

We compute a prediction for which ways each bot might go, but we don't use that in the bidding system for guessing whether we'll be in conflict.

buzzard thought up an easy to implement scheme where, while going from point A to point B, we can basically for free evaluate any point on the map and say "can we hit this point while going shortest path from A to B" (becuase there are a lot of such shortest paths, since diagonal moves aren't allowed); we could have used this for investigating unexplored bases or going over supposed delivery locations while going from A to B; but he never implemented it.

We keep track of as much state of all the packages as we can. We fought over what to do when we hear about a drop. If we don't know the destination of the package, should we assume that it has been delivered? If we think it's sitting on the map, and we're wrong, we could go way out of our way to try to pick it up, only to find that it has disappeared. We ended up assuming that all dropped packages were delivered unless we can prove they weren't, which leaves us vulnerable to 'hiders.'

Very early on, we did think of the hider strategy, 'pick up a bunch of packages, drop them four squares away from the base, repeat,', with the idea being that other robots will assume they were delivered when they see them dropped, and we can thus go back and get them unmolested. Not only did we never implement this, but (see above) we didn't even have time to implement a defense against it.

buzzard didn't start writing the pattern matcher until nearly midnight, and didn't finish it until like 4am when the contest was at noon, so the patterns weren't as tuned as we would have liked. Bots clog horribly in 1-wide corridors, and one of the test maps on the public server was semi-maze-like with one-wide corridors everywhere; he had some ideas for implementing smart 1-wide corridor logic, computing 'zones' which are continuous 1-wide regions, and then detecting there are bots in them and seeing whether they're emerging or going in, but he didn't have time for it (a very short-lookahead version of this did get implemented in the pattern matcher).

We don't keep any state from turn to turn, so we recompute all pathfinds every turn, which wastes computation. They say they'll disqualify bots that use more than 1 CPU second per turn on average, and they allowed for boards up to 1000x1000 squares. We have no idea how we'll perform on such boards, but clearly saving plans and paths from turn to turn would have improved it.

We solve the integer packing and travelling salesmen problems in greedy fashion--pick the package group with the best score/distance ratio, then the next such group, repeat. But then when we pathfind, we just go to the nearest, not the best score/ distance.

The pattern matching stuff all only handles one bot in the vicinity. If there are two we just plow ahead with our plans. If we attempt to pickup or drop, we never check that we're in danger of being dunked, so we don't avoid being dunked. This can also occur if we select 'pickup nothing' as our 'don't do anything this turn' move, e.g. because our pathfind fails or because we decide we're stuck and do nothing due to the 25% chance of that.

Source

Excited to see what a mess of Ocaml code written by complete beginners in three days looks like? Our distribution is here. You were warned.