Checkers AI Project

This is the repo for a checkers AI project.

Project Structure

  • src/
    • contains .cpp source files
  • include/
    • contains all .h header files
  • bin/
    • all object files are placed here (the clean command empties this directory)
  • build/runner
    • When the project is built, the compiled executable is written to this file

(1) Implementation Details

The project was implemented purely in C++. We chose C++ because bit manipulation in Java is unnecessarily difficult, as Java has no unsigned 32-bit integer primitive. C++, on the other hand, provides a full set of fixed-width integer types via the cstdint header. Two group members were also taking the networks course concurrently, so socket programming in C wasn't an issue.

Bitboard Implementation

Here is the bitboard layout we used, where each number corresponds to the bit index:

[Figure: bitboard layout with bit indices]

This orientation allows every move to be made with either a 7-bit or a 1-bit rotation. The entire board is then represented by three 32-bit integers: white pieces, black pieces, and kings. This is much more space-efficient than an array implementation.
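
As a minimal sketch of this representation (the names below are illustrative, not necessarily the project's actual declarations):

```cpp
#include <cstdint>

// Three 32-bit masks capture the entire game state, one bit per playable
// square, using the 0-31 numbering from the layout above.
struct Board {
    uint32_t white;  // squares occupied by white pieces
    uint32_t black;  // squares occupied by black pieces
    uint32_t kings;  // squares occupied by a king of either color
};

// Left-rotate a 32-bit mask (valid for n in 1..31). With this layout,
// every diagonal step is a rotation by 1 or 7 bits; the opposite
// direction is the inverse rotation, i.e. by 31 or 25 bits.
inline uint32_t rotl(uint32_t x, int n) {
    return (x << n) | (x >> (32 - n));
}
```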

Beyond the space savings, there were other advantages. Bit manipulation removes the need to repeatedly index into a data structure, as an array representation of the board requires, saving time. Another large advantage is that many of the heuristics can be computed with bit masks. That, combined with the inlining done by modern C++ compilers, boils many of these calculations down to a few machine instructions.
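
For instance, a single-mask feature like the hypothetical one below is, after inlining, essentially one AND and one popcount instruction per side (the mask value is a placeholder, since the real masks depend on the bit layout):

```cpp
#include <bit>  // std::popcount (C++20)

// Hypothetical center-control feature: positive when black occupies more
// of the masked squares, matching the black-positive sign convention.
constexpr uint32_t CENTER_MASK = 0x00666600;  // placeholder squares

inline int centerControl(const Board& b) {
    return std::popcount(b.black & CENTER_MASK)
         - std::popcount(b.white & CENTER_MASK);
}
```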

Search Implementation and Depth

For search we stuck to a recursive minimax with alpha-beta pruning; the implementation can be seen in the src/MinimaxSearch.cpp file. The function, identical to the description in the textbook, eliminates unnecessary search, cutting the effective branching factor by roughly a quarter on average.
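
In sketch form, the textbook algorithm looks like the following; the Move type, move generator, and evaluator below are stand-ins for the project's actual interfaces:

```cpp
#include <algorithm>
#include <limits>
#include <vector>

struct Move { uint32_t from, to; };                   // placeholder move type
std::vector<Move> generateMoves(const Board&, bool);  // assumed to exist
Board applyMove(const Board&, const Move&);           // assumed to exist
int evaluate(const Board&);                           // static heuristic value

int alphabeta(const Board& b, int depth, int alpha, int beta, bool maximizing) {
    std::vector<Move> moves = generateMoves(b, maximizing);
    if (depth == 0 || moves.empty())
        return evaluate(b);  // leaf node: apply the heuristic

    int best = maximizing ? std::numeric_limits<int>::min()
                          : std::numeric_limits<int>::max();
    for (const Move& m : moves) {
        int v = alphabeta(applyMove(b, m), depth - 1, alpha, beta, !maximizing);
        if (maximizing) { best = std::max(best, v); alpha = std::max(alpha, v); }
        else            { best = std::min(best, v); beta  = std::min(beta,  v); }
        if (beta <= alpha) break;  // prune the remaining siblings
    }
    return best;
}
```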

We used a simple variable-depth scheme: there is a minimum and a maximum search depth, the minimum is always met, and the search continues toward the maximum depth as long as jumps are available. With this we could reach a maximum ply of around 11 within a couple of seconds. The minimum depth of 5 ensured that pieces wouldn't hover around corners during the endgame, since five moves is often enough to encounter another piece; we combined this with evaluation values that reward encounters rather than avoidance.
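
The cutoff test for this scheme can be sketched as follows (the constants and the jump check are named for illustration):

```cpp
constexpr int kMinDepth = 5;   // always searched in full
constexpr int kMaxDepth = 11;  // hard ceiling

bool jumpsAvailable(const Board&);  // assumed helper: any jump in the position?

// Below the minimum depth we never stop; at the maximum we always stop;
// in between, we keep extending only while jumps remain on the board.
bool cutoff(const Board& b, int depth) {
    if (depth < kMinDepth) return false;
    if (depth >= kMaxDepth) return true;
    return !jumpsAvailable(b);
}
```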

(2) Features Used for State Evaluation

When evaluating the value of a state, we implemented various heuristics to assess its quality. We decided that feature evaluations would be positive for black and negative for white. All of the heuristics can be viewed in the src/Heuristic.cpp file.

Heuristic Function

The first heuristics implemented were the more basic state evaluations. Pawn count simply computes the difference between the numbers of pawns (black - white, which is negative when white has more pawns and positive when black does). We used the same formula for kings but added an extra weight, multiplying the difference between black and white kings by 1.5. We also implemented another pair of heuristics for kings and pawns by determining the number of 'safe' pieces: pieces on the edge of the board that cannot be jumped.
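
Using the Board sketch from earlier, these material features reduce to popcounts (a sketch, not the project's exact code):

```cpp
// Pawn material: black minus white, matching the black-positive convention.
inline int pawnCount(const Board& b) {
    return std::popcount(b.black & ~b.kings)
         - std::popcount(b.white & ~b.kings);
}

// King material, weighted 1.5x as described above.
inline double kingCount(const Board& b) {
    return 1.5 * (std::popcount(b.black & b.kings)
                - std::popcount(b.white & b.kings));
}
```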

With values comparing the number of pieces on each side in hand, we moved on to the positions of the pieces, giving value to those in certain squares. We created a heuristic for the number of defending pieces of each color, that is, the number of pieces in the first two rows of that color's starting side. We also created an attacking heuristic counting the number of pieces in the opponent's first three rows.
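
Each positional feature is again a mask intersection. In the sketch below the row masks are placeholders, since their true values depend on the bit numbering in the layout figure:

```cpp
// Placeholder masks for the first two rows on each side's home edge.
constexpr uint32_t BLACK_HOME_ROWS = 0x000000FF;  // assumed bits 0-7
constexpr uint32_t WHITE_HOME_ROWS = 0xFF000000;  // assumed bits 24-31

// Defenders: pieces still sitting in their own first two rows.
inline int defenderCount(const Board& b) {
    return std::popcount(b.black & BLACK_HOME_ROWS)
         - std::popcount(b.white & WHITE_HOME_ROWS);
}
```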

We also wanted to measure how close each pawn was to promotion. We achieved this by calculating each pawn's distance from the opponent's first row (the promotion row) and summing the distances. This raw number alone wasn't enough, since other pieces may block pawns from promoting, so we created another heuristic, openPromotion, which counts the number of open squares on the promotion row, i.e. the opponent's first row.
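
openPromotion then reduces to masking the opponent's back row against the empty squares (the back-row masks are again layout-dependent placeholders):

```cpp
// Placeholder masks for the promotion rows.
constexpr uint32_t WHITE_BACK_ROW = 0xF0000000;  // where black promotes
constexpr uint32_t BLACK_BACK_ROW = 0x0000000F;  // where white promotes

// Open squares on each promotion row, signed black-positive.
inline int openPromotion(const Board& b) {
    uint32_t empty = ~(b.white | b.black);
    return std::popcount(WHITE_BACK_ROW & empty)
         - std::popcount(BLACK_BACK_ROW & empty);
}
```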

Finally, we wanted values for which pieces could move and which pieces could jump. For moves, we counted how many moves each pawn and king could make and summed them; this only considers direct moves, not jumps, since it checks the squares around a piece for an open space. To detect a jump, we had to check whether an opponent sat in the direct path of a piece and then whether a jump over that opponent was possible.
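
In the rotation scheme, the jump test is two rotations: an opponent one diagonal step away and an empty square beyond it. A sketch for a single direction (a full generator repeats this for all four directions and must also mask off squares where the rotation wraps around the board edge):

```cpp
// Landing squares of black jumps in the +7 rotation direction (sketch only).
inline uint32_t blackJumpLandings(const Board& b) {
    uint32_t empty = ~(b.white | b.black);
    uint32_t adjacentEnemies = rotl(b.black, 7) & b.white;  // enemy one step away
    return rotl(adjacentEnemies, 7) & empty;                // square beyond is open
}
```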

(3) The Learning Method - Population Regeneration

The learning method was originally intended to be a genetic algorithm, but due to project time constraints it ended up as a hill-climbing-style approach with simple randomization of weight values. Had we implemented crossover between the specimens' weight sets during population regeneration, the approach would have qualified as more of a genetic algorithm. Our randomized weight sets allowed well-performing specimens to bubble up into the next round of the simulation.
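
One regeneration step under this scheme might be sketched as follows; the Specimen type and feature count are stand-ins, and the [-100, 100] range comes from the weight tuning described in the next section:

```cpp
#include <random>
#include <utility>
#include <vector>

struct Specimen {
    std::vector<double> weights;  // one coefficient per heuristic feature
};

// Refill the population to its original size with freshly randomized
// weight sets; survivors from the previous round are kept unchanged.
void regenerate(std::vector<Specimen>& population, std::size_t targetSize,
                std::size_t numFeatures, std::mt19937& rng) {
    std::uniform_real_distribution<double> dist(-100.0, 100.0);
    while (population.size() < targetSize) {
        Specimen s;
        for (std::size_t i = 0; i < numFeatures; ++i)
            s.weights.push_back(dist(rng));
        population.push_back(std::move(s));
    }
}
```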

(4) Feature Weights, Before and After

Initially, all of the heuristic feature functions returned positive values. For our very early testing, before any training, the AI used static coefficients of 1 for all of the weights. Once we modified the heuristic functions to return positive values for black and negative values for white, we also defined a distribution range of -2^28 to 2^28 from which coefficients were randomly assigned to the features.

After initial tests against our static weights, we realized that this range of random values was far too large and hurt our training. Several iterations converged on a range of -100 to 100; weight distributions of any larger magnitude performed worse against the clients than this range did.

When testing variable depth and implementing alpha-beta search, we initially limited the evaluation to the less computationally intensive features. The early goal was to beat the server AI using only the first four simple features (counts of pawns, kings, safe pawns, and safe kings). Once variable depth was implemented, the rest of the features (all 10) were included when training to find good sets of heuristic weights.

Because both features and weights could take positive and negative values, it was not necessary to exclude certain features from use later in the training process.

(5) Weight Training

A separate simulation loop, detached from the server, was implemented to play rounds of games in a single-elimination-tournament(ish) style. Several tests varied the number of rounds, the number of heuristic weight sets (specimens), and the minimum search depth. The simulations did not impose a time limit, but given the search depths used, game lengths were comparable to those on the server (games typically lasted no more than 120 seconds). For quick tests, such as simulating a large number of specimens (e.g. 500), a significantly shallower search depth (2 or 3) was used.

A simulation consists of a starting number of specimens (usually 100) and the number of rounds performed on them. In each round, a tournament of games filtered out the top 20% of the highest-performing specimens. Each specimen consisted of two heuristic weight sets: the initial set and an 'endgame' set. In each simulation game, both specimens switched to their 'endgame' weights once half of the total pieces (12 of 24) had been removed from the board. After every round, newly randomized specimens were generated and added to the population to restore the initial population count.

In the case of a draw, a second game was played between the two specimens with sides switched. If the second game also tied, both specimens were added back into the population (so they could play other specimens during the same round). A round of games ended either when the population dropped below 20% of the initial population (the top of the elimination bracket) or when all specimens in the population tied with one another.
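
Putting the round mechanics together, a single pass of pairings might look like this sketch (the pairing order, result type, and playGame are illustrative stand-ins for the actual loop):

```cpp
#include <vector>

enum class Result { FirstWins, SecondWins, Draw };
Result playGame(const Specimen& black, const Specimen& white);  // assumed to exist

// One pass of pairings: winners advance; a draw triggers a rematch with
// sides switched, and a second draw keeps both specimens in play.
std::vector<Specimen> playPairings(const std::vector<Specimen>& pop) {
    std::vector<Specimen> survivors;
    for (std::size_t i = 0; i + 1 < pop.size(); i += 2) {
        Result r = playGame(pop[i], pop[i + 1]);
        if (r == Result::Draw) {
            Result rematch = playGame(pop[i + 1], pop[i]);  // sides switched
            if (rematch == Result::FirstWins)       r = Result::SecondWins;
            else if (rematch == Result::SecondWins) r = Result::FirstWins;
        }
        if (r != Result::SecondWins) survivors.push_back(pop[i]);
        if (r != Result::FirstWins)  survivors.push_back(pop[i + 1]);
    }
    if (pop.size() % 2 == 1)
        survivors.push_back(pop.back());  // odd specimen advances unplayed
    return survivors;
}
```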

The number of rounds tested varied between 5 and 20 (with populations of 50 to 100). After each round, a tabulated statistics summary was printed. The statistics included the number of black and white wins between pairs of specimens, ties followed by a win, and two-game ties. Over the course of the rounds, better-weighted specimens visibly advanced, as the number of ties increased over subsequent rounds.

The longest simulation performed took roughly 40 minutes, with population = 100 and rounds = 20.

Citations

The vast majority of the project is our own work, but some inspiration and small tricks came from various sources. Any code copied directly from a source such as StackOverflow is credited inline in comments. An exhaustive list of all sources used is here:

Contributors

  • Mark Bluemer
  • David Jardim
  • Joseph Muhitch
  • Eric Van Heel