I found the light-bot game a while back, probably the way most people do: Reddit - you are simultaneously the best and worst thing to happen to me. Providing me with endless interestingness, but with such horrible timing - usually the middle of the afternoon when I have work to be doing. Like a crack dealer that shows up only when the kids are actually at the library trying to get their homework done. This particular game I discovered at about 11:30pm on a work-night. Great timing, as always.
I did work my way through most of it that night, and finished it off the next morning. Much kudos to the author of the game! Quite fun. It reminded me of a nicely simplified version of RoboRally - a game I love to play, but can't find enough people willing to play with me. I think I need to work on my salesmanship. Perhaps using "It's a game about programming! It's fun!" Isn't the best technique... especially when they know you're a professional programmer...
Anyway, having solved the puzzle, I got to wondering... was my solution optimal? Surely I could optimize the number of instructions. Especially when one of the designers at work claimed to have beaten my score by a good 20 instructions. I think he mis-typed... but I haven't yet proven it.
I've been looking for a good, simple program to try an evolutionary algorithm on, and this one jumped right out at me and said "Evolve ME". Never being one to ignore the voices in my head (for long), I got right to it.
It's a simple enough game to simulate - just a few moves and a very straight forward board. Working on the simulator was a great deal of fun, especially as I'd just been reading The Annotated Turing (highly recommend, if you feel the need to bust a serious geek move or two).
Once the simulator was done, generating some random programs was easy enough. But there are some gotchas hiding in there. Looping programs are easy to accidentally generate, but are no good - you won't advance your state any further than you are, and you send your computer into an infinite loop.
So, it's better to not do a purely random program - just make sure it won't loop. Easy enough, given the very straight forward set of instructions and 2 sub-routines you're allowed to use:
- main program can call either sub1 or sub2
- sub 1 can only call sub 2
- sub2 can't call anything
Ok... problem solved. Now, how to evolve the generated program? Well, we need a selection criteria - what makes a good program, and what makes a bad one? Simple enough - the more lights lit up, the better, right? But what if it doesn't light any of them up? Well, how about how far away it is from the nearest light when it's done? That way, if nothing lights up, we'll select for bots that get closer to the lights. And if it lights them all up? Well, then the number of instructions!
So, I have 3 criteria. Can we condense it a little? 2 comparisons is easier than 3. How about if we multiply the distance to the nearest square by the number of unlit blue squares? That should roll the two criteria for programs that don't win the board into one, easy to compare number. Once it hits 0 (there are no unlit blue squares), I can ignore it and go with the number of moves used.
Next up - need to actually create a population, run the generated programs on the board, and score them all. Then select the best N programs, and create a new generation that are permutations of the best programs! Most of that is relatively straight forward list manipulation. Permutation takes a little thinking about, but really, it's not hard:
- Select some number of random changes to make
- Pick a random location for each change
- Pick a random move
(implemented exactly that way, it could randomly pick the move that was already there - no biggie - consider it a benign mutation.)
Done and done.
Lastly - when to terminate? This one I couldn't decide on, so I implemented two alternatives:
- Duration
- Number of generations without improvement
- to select the cohort size
- number to promote
- injection of newly generated random members
- which board to use
- number of mutations per generation
- and to set the random seed
This program does suffer from the problem of getting trapped in local-maxima. Thus all the ways to introduce new randomness. I'm still tinkering, and welcome any feedback or input.
The source code is here. Just compile with g++. No options necessary, though -o3 might be a good idea. I wrote it on Win32 under Cygwin, but I don't think it does anything terribly cross-platform unfriendly.
Other TODOs:
- Optimize!
- trivial cohort sizes spend most of their time in allocating memory
- non-trivial cohort sizes spend most of their time sorting
- simple optimization - don't need to sort the entire list - just find the N best items
- Better optimization - switch over to proper STL lists and use the STL algorithms.
- (for shame - publishing code that does it's own lists!)
- Remove those news - allocate it all at the beginning and be done with it, eh?
- Any other options for getting out of local maxima?
No comments:
Post a Comment