In last episode, we finished up minimax strategy for Connect-N games, including Tic-Tac-Toe and Gomoku. This episode, we will implement its GUI environment based on Pygame library for human vs. human, AI vs. AI or human vs. AI plays, which is essential for self-play AlphaGo Zero reinforcement learning. The environment is further embedded into OpenAI Gym as it’s the standard in game reinforcement learning. All code in this series is in ConnectNGym github .
This episode extends last one, where Minimax and Alpha Beta Pruning algorithms are introduced. We will solve several tic-tac-toe problems in leetcode, gathering intuition and building blocks for tic-tac-toe game logic, which can be naturally extended to Connect-N game or Gomoku (N=5). Then we solve tic-tac-toe using Minimax and Alpha Beta pruning for small N and analyze their state space. In the following episodes, based on building blocks here, we will implement a Connect-N Open Gym GUI Environment, where we can play against computer visually or compare different computer algorithms.
This series, we deal with zero-sum turn-based board game algorithm, a sub type of combinatorial games. We start off with small search space problem, introduce classic algorithms and corresponding combinatorial gaming theory and ultimately end with modern approximating Deep RL techniques. From there, after stepping stone is laid, we are able to learn and appreciate how AlphaGo works. In this first episode, we illustrate 3 classic gaming problems in leetcode and solve them from brute force version to DP version then finally rewrite them using classic gaming algorithms, minimax and alpha beta pruning.