I recently put together an A* implementation for the first time. I actually did it twice: once in a naive way just to get a good grasp on how the algorithm works, and then a second time with an eye towards making it perform better. I’ve been meaning to write up my observations on it for a couple of months, so here they are.
Amit’s A* Pages served as my primary reference for how the algorithm works. That’s definitely the place to go if you want to learn about it.
For my first implementation of the algorithm, I did a very straightforward implementation: I had a 2D grid of locations with movements costs, a list of closed off locations, and a list of open locations being considered. The data structures looked like this (with some unimportant details elided):
There are three important details about this first implementation:
- The open and closed lists are kept in NSMutableArrays. The open list is kept sorted at insertion time. Checking for membership in one of the lists means scanning through the list.
- PathNodes are dynamically allocated.
- This implementation was dog slow.
On an iPad 2 with a moderately sized map, finding a decent length path took 200ms. Totally not acceptable. I expected the first pass to be bad; in particular I expected maintaining the sorted open list to perform terribly. So I fired up Instruments, captured a trace, and the results looked like this:
By far the thing taking the most time was looking through the open and closed lists
for a previously allocated PathNode (
indexOfCoordinate:inList:). Maintaining the
open list took almost no time next to that.
I was breaking other rules too: in particular, allocating and deallocating those PathNodes at runtime was a bad idea. So, armed with the profiler results, I took a second shot at the implementation.
- I keep a 2D array of PathfindingCells with the same dimensions as the input CollisionMap. This is allocated once when the Pathfinder object is created, and never again.
- Instead of using NSMutableArrays, I now keep list links directly in the PathfindingCell structs.
- As a result of those two properties, I can now:
- Directly look up any cell given an x,y location.
- Determine whether a cell is in the open or closed list without traversing the list.
- Only one allocation is made for each run, and that’s for the Path object
that gets returned from the
Importantly, the algorithm didn’t change, but the data layout did. With these changes, a profiler run for the same map and path now looks like this:
That’s better. I could probably make it faster; maybe a priority queue would work better than the insertion-sorted list I’m using now, and I could pre-allocate a fixed length Path object or something else, but it’s fast enough for me now, so I stopped at this point.
I posted the source code for the implementation as a Github Gist. It’s just the A* implementation without any drawing code or anything, but hopefully it’ll still be useful to learn from.