R E A D M E   f o r   C O N N E C T  5 . S S

All portions of this document are relevant to
connect5.ss v2.0 by Andrea Pound (ID# 477821); Dec. 19, 2001

Fewer portions are relevant to the hardcopy, v.1.0.
See below.

-------------
Administrivia
-------------

This is the required Readme file for connect5.ss,
a Scheme program that analyzes a Connect 5 board
and returns the next move for one player.

A full statement of the program specifications can
be found at 
http://www.owlnet.rice.edu/~comp210/Assignments/Connect5/connect5.html

To run the program from this directory, type "./go < [input file]"
where the input file is in the format specified at the
URL above.  Many input examples can be found in tests.txt,
but note that most of those examples are commented out.

The program can be made to play a game against another or
against itself by running either conn5-text or conn5-gui,
which may be found in 
/home/comp210/public_html/01fall/Assignments/Connect5/Bin

---------
Connect 5
---------

Connect 5 is also known as Go-Moku.  It was proven to be a win
for black (i.e. the first player) in the early 90s.  More on
Go-Moku and threat-space searching can be found here:
http://library.thinkquest.org/18242/data/resources/gomoku.pdf

Go-Moku is similar to the commercial game Pente, but Pente 
adds capture rules and permits a win by captures in addition to
five-in-a-row.  The author of this document has played quite
a lot of Pente ...

-----------------
connect5.ss v.2.0
-----------------

The connect5.ss (v.2.0) in this directory does not implement the perfect
strategy or any kind of depth search at all.  It is really very
simple.  It probably won the early tournament in Comp 210 for
two reasons: first, not all the other students had finished
their code for the early tournament, and second, it used a cumulative 
scoring heuristic that made sense to a player who had some casual 
experience winning against other humans without thinking "deeply" 
about moves.  Most of the other players in the tournament still seemed 
to be making very human moves, so the strategy worked.

-----------------------
Common Program Elements
-----------------------

The following is true for both versions of connect5.ss:

o  the gameboard is represented with a vector-of-vector, i.e. a matrix
o  there's a function, matrix-slices, that returns all possible segments
   of a certain length in a hypothetical matrix; not merely "lines" of
   rows, columns, and diagonals, but "slices" of every line ... every
   way that every line can be cut to the specified length 
o  there's a gamepiece-counter that looks at a slice on the gameboard
   and, if it isn't already blocked from being a win, returns the number 
   of pieces growing toward a win (the sign of this return value
   indicates the pieces' color: positive for black; negative for white)
o  there's a priority board for scheduling different possible moves
   according to their importance: highest priority for immediate wins,
   second highest priority for required blocks of the next player's wins,
   some very much smaller priority for other moves, and zero 
   priority for impossible moves (where there are already pieces played)
o  there's a function that sorts positions on the board according
   to their value on the priority board; the program ends by printing 
   the move with the highest priority (or equivalent)
o  the program assumes it will always be fed a playable board in the
   correct format, but that was a given in the problem specification

-----------------------------------
This Document and the Basic Program
-----------------------------------

o  v.1.0 of connect.ss (the hardcopy version) stops there.
   It assigns special priority to wins and forced blocks,
   but it assigns the same priority to every other playable space.
   In short, it has no heuristics for play.  It simply does
   what is required.
o  v.2.0 of connect.ss (Andrea's tournament winner) goes further.
   The remainder of this document describes primarily v.2.0.

----------------------------
Advanced Heuristics for Play
----------------------------

o  the priority board is initialized with zeroes in positions that
   are impossible to play and very small priorities (weighted more or less
   toward the center) in every other position; positions that start
   with zero priority will never be increased in priority
o  the point of the small positive priorities is to break ties in priority
   in favor of more centrist and theoretically more connectible moves;
   it doesn't really matter what these priorities are as long as they're
   small and tend to increase toward the center of the board

o  each slice of WIN-LENGTH on the gameboard is easily examined for
   how many pieces it contains
o  a slice of the board is only interesting if it's a potential win for 
   some player
o  most interesting slices are given scores equal to the square of
   the number of pieces they contain (to make it a little more
   aggressive, black's slices receive a tiny bonus)
o  these scores are cumulative, so overlapping slices of interest
   improve the score of their playable intersections
o  but some slices are instant wins or forced moves; they get special
   scores that never change (they're non-cumulative)
o  scores are easily sorted; then, it's possible to either return a
   move immediately, if the score is high enough or if that's the
   only available strategy, or investigate the high scores further
   with some strategy that looks ahead a few moves

o  but, after immediate wins and forced blocks, there's another
   pattern that's easy to find and worth acting on instantly
   without going any deeper: "double threats" in a single row
o  a double threat in a single row is a slice of length WIN-LENGTH+1
   that has blanks on both ends and WIN-LENGTH-2 pieces of one
   color at any positions in the middle; filling in the blank that's 
   not on the end will cause the opponent to block one end, but leave 
   the other open for an instant win right after that
o  currently, only double threats for black are prioritized; potential
   double threats for white should probably be prioritized only
   slightly less than double threats for black, and I can't recall
   why I didn't bother, but now there's no incentive to change it--
   the program is entirely functional, and I will be interested
   in seeing how many people beat the same version that won the
   early tournament (perhaps confirming my hypothesis that the
   early tournament would be somewhat easier to win thanks to the
   smaller number of entrants and the little time they had to finish)

------------------------
Example of Advanced Play
------------------------

Consider this board:

- - - - - -
- X X X - -
- - O O O X
- O - - - -
X X X - - -
O O O - - -

If WIN-LENGTH is 5, then there are 32 slices of WIN-LENGTH on this board:
+ 2 for each row (the 5 left-most and 5 right-most spaces in each line of 6)
+ 2 for each column (the 5 up-most and 5 down-most spaces in each line of 6)
+ 2 for each major diagonal (ditto)
+ 4 shorter diagonals (the 4 diagonals with exactly 5 spaces in them)

There are many plausible moves for X, but one obviously sets up
a win for X on X's next/following move: (4 1).  The O player can't
do anything about it except flail about on the edges of the four-in-a-row
that would set up.  A search for double-threats will discover
this fact.  But let's consider how the normal scoring heuristic would do.

Ordinarily, (4 1) would have a priority of around 20:
+ 9: the square of 3 Xs on row 1 in a slice that includes (0 1) 
+ 9: the square of those same 3 Xs in a slice that includes (5 1)
+ 1 from the O in column 4 in a slice that includes (4 0)
+ 1 from the O in column 4 in a slice that includes (4 5)
+ whatever small value was assigned to (4 1) based roughly 
  on its proximity to the center
+ some small bonus for being an aggressive move for black

The O in the major diagonal and the X in column 5 are adjacent
to (4 1), but they are disregarded here because they can never
be directly involved in a win connected to (4 1).  

So the normal scoring heuristic has correctly discovered the most
useful play for black.  The next highest score on this board would be 
around 13 to 15.  

However, searching the totality of slices WIN-LENGTH+1 in length, 
it becomes apparent that (4 1) is a "middle" blank in a WIN-LENGTH+1 slice
that contains (WIN-LENGTH - 2) black pieces and has blanks on both ends.
Accordingly, it should be assigned a much much higher priority than
any other space (in fact, for boards that are 20x20 or less,
it should receive a priority higher than the normal scoring heuristic could
possibly total up).  It's a perfect move for this board.

If there were any immediate wins for X or any necessary moves
to block O, then they would receive maximally high and extremely
high priorities, respectively ... values that would in every
case sort to the correct rank in the list of all possible moves.
But here (4 1) receives a moderately high special priority--not
as important as an immediate win or forced block, yet higher
than any other kind of threat.

----------------
To Do for v.3.0?
----------------

This scoring heuristic is certainly inadequate compared to a good
depth search (especially compared to the threat-space search that
always results in a win for black).  But most depth searches depend on 
a good evaluation function.  On reflection, it seems that the above scoring
heuristic could be made into a reasonably good evaluation function,
particularly if the -1 to +1 continuum recommended in the problem
statement were ignored.  The evaluation function could simply order
the best moves for a given player, just like the above scoring
heuristic does now for black.  Then, both the depth and breadth of the 
search could be easily controlled: only go so many levels deep and only 
check out the top X moves for the current player.  If the current player 
ever gets a win or a double-threat, return either +1 for black's 
inevitable win or -1 for white's inevitable win.  Otherwise, return 0.  
If the limits of depth and breadth searching are reached before black's 
inevitable win is found, the program could choose the best move, 
according to the scoring heuristic at the top level, that (if possible) 
did not lead to white's inevitable win.

That still may not be as good as an evaluation function that
rates entire boards on a continuum from -1 to +1 and does genuine
minimax or alpha-beta pruning, but it comes to mind as a way to
add depth to a moderately successful scoring heuristic.