This is the page for the Deep Learning Project of the IASD master's program. The goal is to train a network to play the game of Go. To keep training resources fair, every network you submit must have fewer than 1 000 000 parameters. Teams may have at most two students. The training data comes from self-play games of Facebook's ELF OpenGo program. The training set contains more than 98 000 000 different states in total. The input data consists of 8 planes of size 19x19 (color to play, ladders, current state on two planes, two previous states on four planes). The output targets are the policy (a vector of size 361 with 1.0 for the move played and 0.0 for all other moves), the value (1.0 if White won, 0.0 if Black won), and the state at the end of the game (two planes).
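To make the data layout concrete, here is a minimal sketch of the targets and input shape described above, in plain Python. The mapping from a board coordinate to a policy index (row * 19 + col) and the plane-major nesting of the input are assumptions made for illustration; the project files define the actual conventions.

```python
BOARD_SIZE = 19
NUM_MOVES = BOARD_SIZE * BOARD_SIZE  # 361 entries in the policy vector
NUM_PLANES = 8                       # input planes per state


def policy_target(row, col):
    """One-hot policy vector: 1.0 for the move played, 0.0 elsewhere.

    Assumption: a move at (row, col) maps to index row * 19 + col.
    """
    if not (0 <= row < BOARD_SIZE and 0 <= col < BOARD_SIZE):
        raise ValueError("move is off the board")
    target = [0.0] * NUM_MOVES
    target[row * BOARD_SIZE + col] = 1.0
    return target


def value_target(white_won):
    """Value target: 1.0 if White won, 0.0 if Black won."""
    return 1.0 if white_won else 0.0


def empty_input():
    """All-zero input of 8 planes of 19x19 (plane-major nesting assumed)."""
    return [
        [[0.0] * BOARD_SIZE for _ in range(BOARD_SIZE)]
        for _ in range(NUM_PLANES)
    ]
```

For example, `policy_target(3, 15)` returns a list of 361 floats whose only non-zero entry is at index 3 * 19 + 15.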
The project was written on, and runs on, Ubuntu 18.10. It uses TensorFlow 2.0 and Keras for the network. If you want to use dynamic batches of examples, you should also install Pybind. A set of 100 000 examples is available in the zip file if you want to start training without Pybind. An example of a small convolutional network with two heads is given in the file golois.py and saved in the file test.h5. The networks you design and train should also have the same policy and value heads and be saved in h5 format.
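Before training, it can help to estimate whether an architecture fits the parameter budget. The sketch below counts the parameters of a hypothetical two-headed convolutional network using the standard formulas for convolutional and dense layers. The layer sizes (`filters`, `trunk_layers`, `value_hidden`) and the head design are illustrative assumptions, not the architecture in golois.py. On a real Keras model, `model.count_params()` reports the same quantity.

```python
PARAM_LIMIT = 1_000_000  # submitted networks must stay below this


def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, use_bias=True):
    """Parameters of a 2D convolution: one kernel per (in, out) channel pair."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


def dense_params(in_units, out_units, use_bias=True):
    """Parameters of a fully connected layer."""
    return in_units * out_units + (out_units if use_bias else 0)


def two_head_net_params(filters=32, trunk_layers=5, value_hidden=50):
    """Parameter count of a hypothetical trunk with policy and value heads."""
    planes, board = 8, 19
    # Trunk: one 3x3 conv from the 8 input planes, then 3x3 convs of equal width.
    total = conv2d_params(3, 3, planes, filters)
    total += (trunk_layers - 1) * conv2d_params(3, 3, filters, filters)
    # Policy head (assumed): 1x1 conv to one plane, flattened to 361 logits.
    total += conv2d_params(1, 1, filters, 1)
    # Value head (assumed): 1x1 conv to one plane, then two dense layers.
    total += conv2d_params(1, 1, filters, 1)
    total += dense_params(board * board, value_hidden)
    total += dense_params(value_hidden, 1)
    return total


def fits_budget(num_params, limit=PARAM_LIMIT):
    """True if the network is strictly below the parameter limit."""
    return num_params < limit
```

Calling `fits_budget(two_head_net_params(filters=64, trunk_layers=10))` is a quick way to check a candidate width and depth before building the model.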
The files to use for the project are available here: DeepLearningProject.zip.
An example network and training run using the precomputed dataset of 100 000 states is given in the file golois.py. If you compile the golois library using compile.sh, you can get dynamic batches with the golois.getBatch call.
Each week or so I will organize a tournament between the networks you send me. One designated student in the class should send me a zip file containing all the networks trained by the students who wish to participate. Each network file should be named after the students who designed and trained it. The models should be saved in Keras h5 format. A Swiss tournament of 10 rounds or more will be organized and the results will be posted here. In the tournament, each network is used by a PUCT engine that makes 128 evaluations per move.
score models/golois.h5 ( 980949 ) = 12.0 , games = 12.0 , average = 1.0
score models/Kadoche.h5 ( 711544 ) = 10.0 , games = 12.0 , average = 0.8333333333333334
score models/Guerendel.h5 ( 998973 ) = 7.0 , games = 12.0 , average = 0.5833333333333334
score models/Dublineau_Montini.h5 ( 965558 ) = 6.0 , games = 12.0 , average = 0.5
score models/test.h5 ( 191100 ) = 3.0 , games = 12.0 , average = 0.25
score models/Petiteau.h5 ( 191100 ) = 2.0 , games = 12.0 , average = 0.16666666666666666
score models/Henric.h5 ( 191100 ) = 2.0 , games = 12.0 , average = 0.16666666666666666
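As background on how the tournament's PUCT engine uses a network, here is a minimal sketch of a PUCT selection step in the common AlphaZero-style form, where the policy head supplies the priors and the value head supplies the evaluations. The constant `c_puct`, the `1 + parent_visits` term under the square root, and the dictionary layout are assumptions for illustration; the engine used in the tournament may differ.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.0):
    """Mean value plus an exploration bonus weighted by the network prior.

    Assumption: 1 + parent_visits under the square root, so that priors
    still break ties when no child has been visited yet.
    """
    exploration = c_puct * prior * math.sqrt(1 + parent_visits) / (1 + child_visits)
    return q + exploration


def select_move(children, c_puct=1.0):
    """Pick the child with the highest PUCT score.

    children maps a move to a dict with keys 'prior', 'visits' and
    'value_sum' (sum of evaluations, assumed to be from the point of
    view of the player to move).
    """
    parent_visits = sum(child["visits"] for child in children.values())
    best_move, best_score = None, -math.inf
    for move, child in children.items():
        # Unvisited children get a neutral mean value of 0.0 (assumption).
        q = child["value_sum"] / child["visits"] if child["visits"] > 0 else 0.0
        score = puct_score(q, child["prior"], parent_visits, child["visits"], c_puct)
        if score > best_score:
            best_move, best_score = move, score
    return best_move
```

With a budget of 128 evaluations per move, an engine of this kind would repeat the selection step from the root, evaluate one new position with the network each time, and back the value up the visited path.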