Performance before any optimisations

I created a few boards and timed how long it took the algorithm to run a depth-2 search on these boards, starting with both Player 0 and Player 1. An image showing the boards used is at the end of the post. The results are:

I also tried running a depth-3 search on Board 4 for Player 1 (because Player 1 can find a winning move at this depth), but it did not finish even after a couple of hours.

The optimisations

I made several optimisations. I describe them here chronologically, i.e., in the order I tried implementing them.

Alpha-beta pruning

There are situations where a board position does not need to be analysed because, based on the positions we have already analysed, it will definitely not be chosen by an optimal player. Adding this pruning resulted in the following times:

This was not a significant improvement in the times, which surprised me. I ran cProfile to try to determine why there was not a bigger saving. The profile suggested that creating new boards was taking up a lot of the time, and I was only pruning a board after it had been created. I needed to prune before the board was created.

To achieve this, I had to substantially re-structure the whole program, removing the frontier and instead writing the main function find_move recursively. The resulting times were a significant improvement:

Phew! It was satisfying to see the times drop so much, and this motivated me to keep going.
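
For illustration, here is a minimal sketch of the recursive structure with the pruning test done before any child board is created. This is not the project's actual code: the helper functions (game_over, evaluate, legal_moves, apply_move) are placeholders, and I assume here that Player 1 is the maximising player.

```python
def find_move(board, depth, player, alpha=float("-inf"), beta=float("inf")):
    # Leaf of the search: depth limit reached or game already over.
    if depth == 0 or game_over(board):
        return evaluate(board), None

    best_move = None
    if player == 1:  # assumed to be the maximising player
        value = float("-inf")
        for move in legal_moves(board):
            if alpha >= beta:
                break  # prune: no child board is created for the remaining moves
            child = apply_move(board, move, player)
            child_value, _ = find_move(child, depth - 1, 0, alpha, beta)
            if child_value > value:
                value, best_move = child_value, move
            alpha = max(alpha, value)
    else:  # Player 0 minimises
        value = float("inf")
        for move in legal_moves(board):
            if alpha >= beta:
                break
            child = apply_move(board, move, player)
            child_value, _ = find_move(child, depth - 1, 1, alpha, beta)
            if child_value < value:
                value, best_move = child_value, move
            beta = min(beta, value)
    return value, best_move
```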

Stop if winning move found

If a winning move is found in the current board position, there is no need to continue analysing that position, so the search can stop early. Pruning does not detect this (at least, not the way I implemented it. This could be a sign I did it wrong…), so I had to add this check manually. This resulted in further improvements:

I also tried running the depth-3 search on Board 4, and to my surprise it finished in under a minute!

Note this would not be representative of a generic depth-3 search, because the winning move is found about 1/6 of the way into the full search.
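
The change amounts to an early exit inside the move loop, roughly like the fragment below (again with placeholder names; is_win_for and WIN_VALUE are assumptions, not the project's actual identifiers):

```python
for move in legal_moves(board):
    child = apply_move(board, move, player)
    if is_win_for(child, player):
        # This move wins outright, so there is no point analysing any of
        # the remaining moves from this position.
        return WIN_VALUE, move
    # ... otherwise recurse into the child as usual ...
```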

Lists within lists

Running cProfile revealed that using nested lists to encode the board slowed things down considerably, so I re-wrote the program so that the board was represented by a single flat list. I was worried this would take a lot of effort, but fortunately it mostly consisted of replacing board[i][j] with board[6*i + j], and other similarly simple changes. This halved the times:
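
In other words, the same cell is just addressed differently (the board dimensions below are assumed purely for illustration):

```python
# Before: nested lists, 6 cells per row
nested_board = [[None] * 6 for _ in range(6)]

def get_cell_nested(board, i, j):
    return board[i][j]

# After: the same cells stored in one flat list
flat_board = [None] * 36

def get_cell_flat(board, i, j):
    return board[6 * i + j]
```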

Checking if the game is over

Running cProfile again showed that the new bottleneck was checking whether the game had ended. This involved looping through the set of all possible winning lines and checking whether Player 0 or Player 1 occupied every position in each line.

Originally, I had one for-loop to check whether Player 1 had won, then another to check whether Player 0 had won. I changed this to a single loop that checks, for each line, whether either player has won. This resulted in another big time saving.
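
A sketch of the combined check (not the project's actual function): here I assume empty cells are stored as None, occupied cells hold 0 or 1, and each winning line is a tuple of indices into the flat board.

```python
def game_winner(board, winning_lines):
    # One pass over all winning lines, checking both players at once.
    for line in winning_lines:
        first = board[line[0]]
        if first is None:
            continue  # an empty cell already rules this line out
        if all(board[k] == first for k in line):
            return first  # the player (0 or 1) who completed this line
    return None  # no winner yet
```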

Checking cProfile showed that I had halved the time taken to check whether the game had ended, but it was still the biggest bottleneck.

I then tried to re-design the program to cut this down further, but to no avail. For example, I tried to group the winning lines into sets that could be ruled out together, e.g. knowing that position (2,2) on the board is empty rules out 7 of the lines. It would be interesting to know if there is a more efficient way to check whether the game has ended!
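
For what it is worth, this is roughly how that grouping idea could be expressed, building on the assumptions in the sketch above (winning_lines as tuples of flat-board indices, empty cells as None). In my case it did not end up being faster.

```python
# Precompute, for every cell, the winning lines passing through it.
lines_through = {
    cell: [line for line in winning_lines if cell in line]
    for cell in range(len(board))
}

# A single empty cell then rules out all of its lines at once.
live_lines = set(winning_lines)
for cell, contents in enumerate(board):
    if contents is None:
        live_lines -= set(lines_through[cell])
```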

Tidying up and fixing a “bug”

My code was becoming untidy (I was not using version control properly, and was instead creating multiple versions of functions in the same file), so I tidied it all up. While doing this, I discovered that I had not correctly updated the prune function during the ‘Lists within lists’ step: I was only pruning boards at depth 2 or more, when it could also have been pruning boards at depth 1. I made the necessary tweak, resulting in the following times:

Woohoo! What big progress. What used to take hours now takes only 3 seconds.

Duplicate board positions

The last thing I wanted to try was dealing with duplicate positions. Previously I only skipped these if the same position occurred with the same parent. Now I wanted a way of skipping duplicate board positions regardless of where they appeared in the game tree. This took many hours to get right, because my first attempt caused the algorithm to produce sub-optimal moves, and I had no idea why.

The error was that when I pruned a board, I would finalise the board’s value even though the board had not been fully analysed. Then, when the same board occurred somewhere else in the tree, I would reuse this incomplete value and miss out on all the analysis that had been pruned away the first time around.
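
One way to avoid this, sketched below under the same placeholder names as the earlier sketches, is to store a position’s value only when its move loop ran to completion, i.e. when it was not cut short by pruning. (Only the maximising branch is shown for brevity; the minimising branch mirrors it.)

```python
cache = {}

def find_move(board, depth, player, alpha, beta):
    if depth == 0 or game_over(board):
        return evaluate(board), None

    key = (tuple(board), depth, player)  # key on depth so values are comparable
    if key in cache:
        return cache[key]

    value, best_move = float("-inf"), None
    pruned = False
    for move in legal_moves(board):
        if alpha >= beta:
            pruned = True  # the remaining moves were never analysed
            break
        child = apply_move(board, move, player)
        child_value, _ = find_move(child, depth - 1, 1 - player, alpha, beta)
        if child_value > value:
            value, best_move = child_value, move
        alpha = max(alpha, value)

    if not pruned:
        cache[key] = (value, best_move)  # only reuse fully analysed values
    return value, best_move
```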

After fixing the bug, the new times are:

The times are not always better; some are actually worse.

Next steps and final thoughts

The next step is to introduce neural networks. A brief Google search reveals that min-max is not appropriate and that I should have been using reinforcement learning. Doh! In the back of my mind, I was curious how the neural network could learn the heuristic function: what would the loss/error be that it minimises?

Though the optimisation of the min-max algorithm is incomplete (e.g. I do not understand why the latest version is not faster than the previous one), I will stop here. This is because I have already spent a couple of days on this, I have already learnt from it, and it is not necessary for the bigger goal of developing a neural network.

Some final takeaways:

The code

The code, at this stage of the project, can be found on GitHub.

The boards used for testing

[Image: the boards used for testing]