Implementation of the Genetic Programming and Artificial Neural Network to Othello

 

1. Introduction

Nowadays, along with the booming of the Artificial Intelligence (AI) field, there are many implementation of the AI, such as autonomous planning and scheduling, game playing, autonomous control, diagnosis, logistics planning, robotics, language understanding and problem solving, etc. One of the first undertaken tasks in AI is a game playing. Board games like Othello, checkers and chess provide a simple yet interesting testbed of AI systems. Since computers can be programmed, it has been improved so much, so that the machine can beat the human champion in Checkers and Othello although not every time [Russel, 1995]. In this project, we will focus on the game of Othello. Like checkers, Othello is an interesting board game for studies. Othello is a deterministic, perfect information, zero-sum game of two players. For this, genetic algorithm and artificial neural network could be useful algorithms. Genetic programming can automatically create a computer program to solve a problem while we have to decide what we should do to solve it. On the other hand, artificial neural network can have the system to learn to play Othello without expert knowledge while most programs developed for strong Othello play depend on expert-knowledge-based evaluation functions and advanced search algorithms. In this project, one player will play using genetic programming against another player using artificial neural network. The purpose of this study is to compare the performance of genetic programming and artificial neural network to Othello. Results of this study may give a better understanding about genetic programming and artificial neural network to Othello.

2. Genetic Programming

Genetic Programming is very closely related to Genetic Algorithms, it also uses process such as crossover, mutation to produce new population. The difference between these two algorithms is in the representation of the solution. In genetic algorithms, the solution is represented as a string, but in genetic programming, it generates a computer program which is represented as a tree as the solution. For the Othello game, the population in the Genetic Programming consists of M individuals which are in tree representation. Each tree has terminals and functions for the Othello game where the terminals are for leafs and the functions are for the other nodes in the tree. The terminals are defined as “white”, “black”, “white_edges”, “black_edges”, “white_corners”, “black_corners”,  “white_near_corners”, “black_ near_corners” and “10” and the functions are “+”, “-”, “*” and “/”.

To assign the fitness value of the individuals in the population, each individual will compete with another individual. The more pieces an individual wins from the other individuals the higher the score will be given to this individual. To produce the new individual for the new generation, we choose the genetic operation such as mutation, crossover, or reproduction (copy the old individual as the new individual) probabilistically. Then, we check whether the number of individuals for the population is equal to M, if it is still less than M then we keep produce a new individual until it is equal to M. After it reaches the N-th generation then we stop generate a new population and return an individual with the best fitness value as the result. The player that represented the genetic programming will choose the best move based on this individual.

 

Genetic Programming Algorithm

 

Generate initial population randomly

While the termination criterion is unsatisfied

            For each individual in the population

                        Evaluate fitness value

            End for

            While individuals < M

Select a genetic operator (reproduction, crossover, mutation) probabilistically:

-         reproduction:

·        select one individual based on the fitness value

·        individuals = individuals + 1

-         crossover

·        select two individual based on the fitness value

·        do crossover

·        insert the two new individual to the new population

·        individuals = individuals + 1

-         mutation

·        select one individual based on the fitness value

·        do mutation

·        insert the new individual to the new population

·        individuals = individuals + 1

      End While

Generations = Generation + 1

End While

Designate result

End

 

Implementation

 

 Create initial population by generate tree randomly:

-         Decide the depth of the tree randomly à d       

-         Generate a tree with depth = d randomly

 Assign fitness value for each tree in the population:

                        Each individual in the population play against p random players

                        Fitness value for each individual = total this individual’s pieces – total this individual’s opponent’s pieces

 While number of generation < N

Create new population :

While individuals < M

            Select an operator to create a new child probabilistically

                        - Reproduction : create a new individual by copying from an individual in the old population based on the rank

- Crossover : select individual’s parents from two individuals in the old population probabilistically based on the fitness.

- Mutation : select an individual in the old population to be mutated probabilistically based on the fitness. After having an individual to be mutated, then we choose a node in this individual randomly and mutate the node.

            Add this child to the new population

Update number of individuals in the new population

End While

generation = generation+1

 End While

 

3. Artificial Neural Network

The architecture of the neural network

The neural network consists of 3 layers : input layer, hidden layer and output layer. Input layer has 62 nodes and it represents the state of the board in Othello game. Hidden layer has 32 nodes and 32 nodes are used there because too many hidden nodes can cause overfitting or too few nodes can cause underfitting.

For output layer, it has only one node. Its value represents the goodness of the position where the black piece can be put. We can get the goodness of some possible positions in the board by comparing each output value.

 

The train of the neural network

Othello game can be divided into three phases: opening game, mid game and end game. Each phase requires the different strategies. So, we divide the whole steps of Othello game into three phases according to the number of the pieces and make the three kinds of training data, the data for the first phase, the data for the second phase and the data for the third phase. To make the training data, we implement two players that used one of nine features. The nine features are the number of black pieces, the number of white pieces, the number of black pieces in edge, the number of white pieces in edge, the number of black pieces in corners, the number of white pieces in corners, the number of black pieces in near corners, the number of white pieces in near corners and no feature using one constant. After playing each game, we decide the goodness of each move of the two players according to winner. That is, we assign “1” to the each move of the winner and “0” to each move of the looser. For the training data, there are the three kinds of training data, the data for the first phase, the data for the second phase and the data for the third phase. For each training data set, there is one neural network. Each data set has 3000 games.

 

The training algorithm

We train the network by gradient descent based on some training data consisting of pairs (X, T). The vector X is an input pattern and the vector T is the desired output. The following is the formula for error back propagation Algorithm. 

 

For each training examples (X, T) where X is the input vector and T is the target vector, the algorithm can be described as follows.

 

1.      feed forward input X and compute the output O.

 

2.      for each output node k, compute

 

dk   = Ok(1-Ok)(Tk – Ok)

 

3.      for each hidden node h, compute

 

                                    dh   = Oh(1-Oh)åWkhdk  

 

4.      Update weight Wji

 

                                    Wji  =  Wji  +D Wji  

                                                where  D Wji(n)   = hdj Xji + a   D Wji(n-1)  

 

In the algorithm,  h is the learning rate and a  is the momentum, 0£  a < 1.

 

References

Eskin, E. and Siegel E. (1999). Genetic Programming Applied to Othello: Introducing Students to Machine Learning Research, ACM SIGCSE Bulletin, v.31 n.1, p242-246, Mar 1999

Russel, S. and Norvig, P. (1995). Artificial Intelligence: A Modern Approach Second Edition, p162, Prentice Hall

Schultz, M. (2000). GP Othello. Available at: http://www1.cs.columbia.edu/~evs/ml/OthelloStudProj/Schultz%20Matt/writeup.html

Koza, John. R. (1997). Genetic Programming. Dept. of Computer Science, Stanford Univ.

 

D.B. Fogel, Blondie24: playing at the edge of AI, San Mateo, CA: Morgan Kaufmann, 2002

Siang Y. Chong, Member, IEEE, Mei K. Tan, and Jonathon D. White (2005) Observing the Evolution of Neural Networks Learning to Play the game of Othello, IEEE transactions on evolutionary computation, Vol.9, No.3

Thimothy Anderson, Kenneth O. Stanley, and Risto Miikkulainen, Eeuro-Evolution Through Augmenting Topologies Applied To Evolving Neural Networks to Play Othello, department of computer sciences University of Texas at Austin

 

Source Code

 

Paper