Static Board Evaluation, Minimax, Alpha-Beta Pruning
If you are simply given a board position, how can you figure out what move is best? The basic approach is to assume that you can give a numeric value to any board position. E.g., if the game is over and you have won, it should receive the best possible value, let's say +1.0. If the game is over and you have lost, it should receive the lowest possible value, let's say -1.0. A draw would be in the middle, or 0.
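To make that concrete, here is a tiny Python sketch of the terminal scoring. The state format and the winner helper are hypothetical, purely for illustration:

    def terminal_value(state, me):
        """Score a finished game: +1.0 win, -1.0 loss, 0.0 draw.
        `winner` is a hypothetical helper returning the winning
        player's mark, or None for a draw."""
        w = winner(state)
        if w == me:
            return 1.0
        if w is not None:
            return -1.0
        return 0.0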
Can you come up with a way to assign a value for other board positions, when the game isn't done yet? This is the hard part.
One basic approach is to come up with some heuristics based on game strategy. E.g., for Connect 5, I might look at the number of 4-in-a-rows, 3-in-a-rows, and 2-in-a-rows I have and somehow add them up, and subtract the 4-in-a-rows, 3-in-a-rows, and 2-in-a-rows my opponent has, weighting the 4-in-a-rows most heavily. This valuing isn't necessarily 100% accurate, but it's easy to do. In my code, I would write a function that finds a move that maximizes the value. (I don't necessarily need a function that computes the value of the current board. Think about why this is the case.) This should find any immediately winning move, and do a reasonable job otherwise.
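Here is a rough Python sketch of such an evaluator. Everything in it is an assumption for illustration: the board is a hypothetical dict mapping (row, col) to a player's mark, and the weights are arbitrary numbers you would tune by experiment.

    # Sketch of a weighted-pattern evaluator for Connect 5.
    # Assumed board format: dict mapping (row, col) -> player mark.
    WEIGHTS = {4: 1000, 3: 50, 2: 5}                 # arbitrary; tune by experiment
    DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]   # across, down, two diagonals

    def count_runs(board, player, length, size=20):
        """Count runs of exactly `length` consecutive stones for `player`."""
        count = 0
        for r in range(size):
            for c in range(size):
                for dr, dc in DIRECTIONS:
                    # Skip if the previous cell extends this run
                    # (avoids counting the same run more than once).
                    if board.get((r - dr, c - dc)) == player:
                        continue
                    run = 0
                    while board.get((r + run * dr, c + run * dc)) == player:
                        run += 1
                    if run == length:
                        count += 1
        return count

    def evaluate(board, me, opponent):
        """Weighted difference between my runs and my opponent's."""
        return sum(w * (count_runs(board, me, n) - count_runs(board, opponent, n))
                   for n, w in WEIGHTS.items())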
Another basic approach just expands the previous one. In order to find out what move to make, I'll see what would happen if I made each of the possible moves. After I make a move, my opponent gets to move. After that, I can make a move, then my opponent, etc., until the game ends. This makes a (large) tree of game positions, with the children of a game position being all those obtained after a possible move.
Diagram 1. An example of the simple minimax algorithm on the game of tic-tac-toe.
I can easily assign values to these final won, lost, or drawn positions. Working backwards, I want to assign values to earlier positions, based on what happens later. If every move I have leads to a loss a move later, that position is as bad as having lost already. Along the way, I should always pick my best move, i.e., the one with the maximum value, whenever it is my turn. I should also assume that my opponent is just as smart and picks his/her best move, i.e., the one with the minimum value, whenever it is his/her turn.
The minimax algorithm describes perfect play for deterministic, perfect-information games. The goal of the algorithm is to achieve the best payoff against best play.
Minimax Properties:
    Complete: yes, if the game tree is finite
    Optimal: yes, against an optimal opponent
    Time: O(b^m) for branching factor b and maximum game depth m
    Space: O(bm) (the search is depth-first)
Minimax Algorithm:
Minimax(state, turn)
    if (final? state) then return (WIN, LOSS, or DRAW)
    otherwise {results} = (Minimax s (not turn)) for each s in {Successors state}
    if MAXIMIZING: return (maximum {results}) with the corresponding move
    if MINIMIZING: return (minimum {results}) with the corresponding move
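In Python, the same recursion might look like the sketch below. The final, score, and successors hooks are hypothetical stand-ins for whatever game representation you use; successors is assumed to yield (move, resulting-state) pairs.

    # Full-game minimax sketch. `final`, `score`, and `successors` are
    # hypothetical hooks; `score` returns WIN/LOSS/DRAW as +1.0/-1.0/0.0.
    def minimax(state, maximizing):
        if final(state):
            return score(state), None
        best_value = float('-inf') if maximizing else float('inf')
        best_move = None
        for move, child in successors(state):
            value, _ = minimax(child, not maximizing)
            # Max picks the largest child value; min picks the smallest.
            if (maximizing and value > best_value) or \
               (not maximizing and value < best_value):
                best_value, best_move = value, move
        return best_value, best_move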
Thinking ahead for the rest of the game is impractical except for very simple games (like tic-tac-toe). This is because there can be a very large number of moves and board positions. In practice, I can only think ahead a few moves. This still generates a tree of moves and board positions, but one where the leaves are not necessarily finished games. Instead, I need to use my heuristic board evaluation routine to estimate the values of these positions.
Modified Minimax Algorithm:
Minimax(state, lookahead, turn)
    if (final? state) or (Lookahead-Limit-Reached? lookahead)
        then return (static-board-evaluation state)
    otherwise {results} = (Minimax s (- lookahead 1) (not turn)) for each s in {Successors state}
    if MAXIMIZING: return (maximum {results}) with the corresponding move
    if MINIMIZING: return (minimum {results}) with the corresponding move
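The depth-limited variant, as a Python sketch under the same hypothetical hooks, just threads a lookahead counter through the recursion and falls back on the static evaluator at the frontier (static_evaluation is assumed to handle finished games too, e.g. by wrapping terminal_value above):

    # Depth-limited minimax sketch: estimate frontier positions heuristically.
    def minimax(state, lookahead, maximizing):
        if final(state) or lookahead == 0:
            return static_evaluation(state), None    # heuristic estimate
        best_value = float('-inf') if maximizing else float('inf')
        best_move = None
        for move, child in successors(state):
            value, _ = minimax(child, lookahead - 1, not maximizing)
            if (maximizing and value > best_value) or \
               (not maximizing and value < best_value):
                best_value, best_move = value, move
        return best_value, best_move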
You'll have a hard time searching more than 2 moves ahead in the time allotted. Compare that to the 8+ moves that good human chess players look ahead, or the 14+ moves that the Deep Blue chess program looks ahead.
We can improve the performance of the minimax algorithm through the use of alpha-beta pruning. The idea is that we can avoid exploring branches of the game tree that are not worth exploring (because they will have no effect on the final outcome). Consider the following situation:
The minimax algorithm provides us with the optimal path as expected. But did we really need to do all that work? In particular, look at nodes 5 and 4. Let's pretend that we are at the middle level, in the middle of finding the minimum value among nodes 6, 5, and 4. We have already calculated the value of node 6, so we are at least guaranteed that the minimum value among these 3 nodes is <= 6. But wait! On the level above (the top level), we are trying to maximize the values, and we've already calculated a value of 7 to the left. Something that is guaranteed to be 6 or less will never be greater than 7, so why go any further with the calculations?
Mentally trace through the steps to convince yourself of this. The same property holds true for nodes 1 and 2.
Modified Minimax Algorithm with Alpha-Beta Pruning:
Minimax(state, lookahead, turn, alpha, beta)
    inputs:
        state     - current state of the game
        lookahead - number of lookaheads remaining
        turn      - whose turn it is (also indicates whether we maximize or minimize)
        alpha     - the best score guaranteed for MAX along the path to state
        beta      - the best score guaranteed for MIN along the path to state
    (initial call: alpha = -infinity, beta = +infinity)

    if (final? state) or (Lookahead-Limit-Reached? lookahead)
        then return (static-board-evaluation state)
    if MAXIMIZING:
        for each s in {Successors state}
            alpha = (maximum alpha (Minimax s (- lookahead 1) (not turn) alpha beta))
            if (>= alpha beta) then return alpha    ; prune: MIN above will never allow this line
        return alpha
    if MINIMIZING:
        for each s in {Successors state}
            beta = (minimum beta (Minimax s (- lookahead 1) (not turn) alpha beta))
            if (<= beta alpha) then return beta     ; prune: MAX above will never allow this line
        return beta
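In Python, the pruned search might look like this sketch, using the same hypothetical hooks as before; calling it at the root with the default arguments gives the full (-infinity, +infinity) window:

    # Alpha-beta sketch. alpha: best score MAX can already force on this
    # path; beta: best score MIN can already force. Once alpha >= beta,
    # no remaining child of this node can affect the result.
    def minimax(state, lookahead, maximizing,
                alpha=float('-inf'), beta=float('inf')):
        if final(state) or lookahead == 0:
            return static_evaluation(state), None
        best_move = None
        for move, child in successors(state):
            value, _ = minimax(child, lookahead - 1, not maximizing, alpha, beta)
            if maximizing and value > alpha:
                alpha, best_move = value, move
            elif not maximizing and value < beta:
                beta, best_move = value, move
            if alpha >= beta:
                break                  # prune the remaining siblings
        return (alpha if maximizing else beta), best_move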
Start by thinking about strategies for playing the game. Try playing our sample programs to see what is good and bad about their strategies.
If you are going for the extra credit, you will quickly learn that a faster program is generally a better program. After you get your program to work, you should spend time figuring out how to make it faster and smarter.
Making your board evaluator smarter generally makes it slower. You look for more kinds of patterns to make more interesting distinctions. Making your game tree search smarter generally means making it search more of the tree. You can try to make the search faster so that you can look another move ahead. Or you can try to figure out what parts of the tree you don't even need to look at ("pruning" the tree) and what order to search the tree in (to maximize pruning).
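For example, one cheap version of the ordering idea, as a sketch over the same hypothetical hooks: rank each successor with the static evaluator before recursing, so the probably-best moves are searched first and alpha-beta cutoffs arrive sooner.

    # Order successors by a quick static estimate before searching them.
    # Better moves examined first means earlier (and more) pruning.
    def ordered_successors(state, maximizing):
        children = list(successors(state))
        children.sort(key=lambda mc: static_evaluation(mc[1]),
                      reverse=maximizing)   # best-looking first for MAX
        return children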
In the recent past, most of the better programs for this assignment haven't searched ahead at all. In the given time limits, your time is generally better spent on a good board evaluator.
A bit of an e-mail exchange between Ian and the student whose program placed 3rd in the initial tournament:
| >Interesting to note that about 6 of these 8 do *not* do
| >extensive minimax searching.
|
| By the way... I added a minimax function to my code, which wasn't difficult
| at all, but it extended my move time from an average of 18 seconds to 58
| seconds on a 20x20 board. I asked around some comp sci major upperclassmen,
| and other than alpha beta pruning (which would shave off maybe 10 seconds or
| so they said), they didn't have any other suggestions.

Ah -- i'd been thinking the 20sec included minimax! (The winning program took
about 1sec/move.) It's a bit of an art, but time-saving includes:
- revisiting data structures and basic functions, to see if you're doing
  unnecessary work or re-computation;
- including other heuristics, like only searching near currently-placed
  pieces??? [very heuristic, in that it's not guaranteed to be a lousy
  approach :-]
- trimming your static board evaluation function to be very lean, so that the
  search can go deep..
stuff like that. Very open-ended -- which is why i left it as a project!
Historical note: In the late 80's, Hitech (from Carnegie Mellon University) was generally the computer chess program to beat. It was designed by a former world champion of postal chess, and his goal was to make the program as smart as possible, both in board evaluation and searching. A rival program, Deep Thought (also from CMU; the predecessor of Deep Blue from IBM) was started with an entirely different goal. Rather than being smart, Deep Thought was mainly fast so that it could look ahead more moves than any other program. As both programs improved, they traded top honors among all chess programs back and forth, with Deep Blue currently the clear winner.