Deep Learning Course

The slides of the course are available here: DeepLearningIASD.pdf.

References

"Deep Learning with Python", Francois Chollet. Manning, 2020. book.

Keras and TensorFlow

"Residual Networks for Computer Go", Tristan Cazenave. IEEE Transactions on Games, Vol. 10 (1), pp 107-110, March 2018. resnet.pdf.

"Mastering the game of Go without human knowledge", David Silver et al. 2017. AlphaGoZero.

"Spatial Average Pooling for Computer Go", Tristan Cazenave. CGW at IJCAI 2018. sap.pdf.

"A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play", David Silver et al. Science 2018. AlphaZero

"Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model", Julian Schrittwieser et al. 2019. muzero.pdf

"Accelerating Self-Play Learning in Go", David J. Wu. AAAI RLG 2020. accelerating.pdf

"Polygames: Improved Zero Learning", Tristan Cazenave et al. 2020. polygames.pdf

Deep Learning Project

Introduction

This is the page for the Deep Learning Project of the IASD master's program. The goal is to train a network to play the game of Go. To keep training resources fair, each network you submit must have fewer than 1 000 000 parameters. The maximum number of students per team is two.

The training data comes from self-play games of Facebook's ELF OpenGo program. There are more than 109 000 000 different states in total in the training set. The input data is composed of 8 planes of size 19x19: color to play, ladders, the current state on two planes, and the two previous states on four planes. The output targets are the policy (a vector of size 361 with 1.0 for the move played and 0.0 for the other moves), the value (1.0 if White won, 0.0 if Black won), and the state at the end of the game (two planes).
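To get a feel for the 1 000 000 parameter limit before training anything, the parameter count of a candidate architecture can be estimated by hand. The sketch below is only an illustration: the helper names and the architecture (a stack of 3x3 convolutions, a 1x1 convolution policy head, and a small dense value head, all with biases and without batch normalization) are assumptions, not the layout required by the project.

```python
# Hypothetical helpers; the architecture and sizes below are illustrative only.

PARAMETER_BUDGET = 1_000_000  # limit stated for the project
BOARD_POINTS = 19 * 19        # 361 intersections
INPUT_PLANES = 8              # input planes described above


def conv2d_params(kernel_size, in_channels, out_channels, use_bias=True):
    """Trainable parameters of a 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)


def dense_params(in_units, out_units, use_bias=True):
    """Trainable parameters of a fully connected layer."""
    return in_units * out_units + (out_units if use_bias else 0)


def estimate_params(filters, trunk_layers):
    """Estimate the parameter count of the assumed two-headed network."""
    # First 3x3 convolution maps the 8 input planes to `filters` channels.
    total = conv2d_params(3, INPUT_PLANES, filters)
    # Trunk: `trunk_layers` further 3x3 convolutions, filters -> filters.
    total += trunk_layers * conv2d_params(3, filters, filters)
    # Policy head: 1x1 convolution to one plane, flattened to 361 outputs.
    total += conv2d_params(1, filters, 1)
    # Value head: 1x1 convolution to one plane, then Dense(50) and Dense(1).
    total += conv2d_params(1, filters, 1)
    total += dense_params(BOARD_POINTS, 50)
    total += dense_params(50, 1)
    return total


for filters in (32, 64, 128):
    n = estimate_params(filters, trunk_layers=10)
    status = "within" if n < PARAMETER_BUDGET else "over"
    print(f"{filters} filters, 10 trunk layers: {n} parameters ({status} budget)")
```

Because the trunk cost grows with the square of the number of filters, doubling the width uses up the budget much faster than adding layers.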

Installing the Project

The project was written for and runs on Ubuntu 18.10. It uses TensorFlow 2.0 and Keras for the network. If you want to use dynamic batches of examples you should also install Pybind11. A set of 100 000 examples is available in the zip file if you want to start training without Pybind11. An example of a small convolutional network with two heads is given in file golois.py and saved in file test.h5. The networks you design and train should have the same policy and value heads and be saved in h5 format.
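For orientation, a two-headed network of this kind might look like the following Keras sketch. It is not the contents of golois.py: the channels-last input shape (19, 19, 8), the layer sizes, the head layout, and the output names "policy" and "value" are all assumptions made for illustration, so check them against the example file before submitting a network.

```python
from tensorflow import keras
from tensorflow.keras import layers

PLANES = 8   # input planes described for the project
MOVES = 361  # 19 x 19 possible moves


def build_model(filters=32, trunk_layers=4):
    """Build a small two-headed convolutional network (illustrative sizes)."""
    # Assumption: channels-last layout, one 19x19 board with 8 planes.
    board = keras.Input(shape=(19, 19, PLANES), name="board")
    x = board
    for _ in range(trunk_layers):
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    # Policy head: one plane flattened to 361 move probabilities.
    p = layers.Conv2D(1, 1, padding="same")(x)
    p = layers.Flatten()(p)
    policy = layers.Activation("softmax", name="policy")(p)

    # Value head: a single number in [0, 1], matching the 1.0 / 0.0 targets.
    v = layers.Conv2D(1, 1, padding="same", activation="relu")(x)
    v = layers.Flatten()(v)
    v = layers.Dense(50, activation="relu")(v)
    value = layers.Dense(1, activation="sigmoid", name="value")(v)

    return keras.Model(inputs=board, outputs=[policy, value])


model = build_model()
model.compile(
    optimizer="adam",
    loss={"policy": "categorical_crossentropy", "value": "binary_crossentropy"},
)
# Check the parameter count against the 1 000 000 limit before training.
print(model.count_params())
```

Once trained, calling model.save with a file name ending in .h5 writes the model in the h5 format.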

Source files

The files to use for the project are available here: DeepLearningProject.zip.

An example network and training episode using the precalculated dataset of 100 000 states is given in file golois.py. If you compile the golois library using compile.sh you can get dynamic batches with the golois.getBatch call.

Tournament

Roughly every week I will organize a tournament between the networks you send me. A student representative for the class should send me a zip file containing all the networks trained by the students who wish to participate. Each network should be named after the students who designed and trained it. The model should be saved in Keras h5 format. A Swiss tournament of 10 rounds or more will be organized and the results will be posted here. In the tournament, each network is used by a PUCT engine that makes 128 evaluations per move.
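To show how a PUCT engine uses both heads of a network, here is a minimal sketch of the selection step in the style of AlphaGo Zero: each candidate move is scored by its mean value plus an exploration bonus proportional to the policy prior. The exact formula and constant used by the tournament engine are not given on this page, so the function names, the data layout, and c_puct = 1.5 are assumptions.

```python
import math


def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Mean value plus an exploration bonus scaled by the policy prior."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration


def select_move(children, c_puct=1.5):
    """Return the move whose child maximizes the PUCT score.

    `children` maps a move to a dict with keys 'prior' (policy output),
    'visits' (visit count) and 'value_sum' (sum of value evaluations,
    seen from the point of view of the player to move).
    """
    parent_visits = sum(c["visits"] for c in children.values())
    best_move, best_score = None, -math.inf
    for move, c in children.items():
        # Unvisited moves get a neutral mean value of 0.0 (an assumption).
        q = c["value_sum"] / c["visits"] if c["visits"] else 0.0
        score = puct_score(q, c["prior"], parent_visits, c["visits"], c_puct)
        if score > best_score:
            best_move, best_score = move, score
    return best_move


# Hypothetical statistics for three candidate moves after 12 evaluations.
children = {
    "D4": {"prior": 0.6, "visits": 10, "value_sum": 5.0},
    "Q16": {"prior": 0.3, "visits": 2, "value_sum": 1.2},
    "K10": {"prior": 0.1, "visits": 0, "value_sum": 0.0},
}
print(select_move(children))
```

With only 128 evaluations per move the prior has a strong influence on which moves are explored at all, so the quality of the policy head matters as much as that of the value head.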

Results (23 June 2020)

golois.policy.weight.4.4 ( 951113 ) : 28 wins, 28 games played, winrate = 1.00

goloissym.0.0001 ( 997461 ) : 26 wins, 28 games played, winrate = 0.93

golois ( 980949 ) : 24 wins, 28 games played, winrate = 0.86

model ( 997813 ) : 21 wins, 28 games played, winrate = 0.75

Pan_Girerd-model5 ( 994729 ) : 21 wins, 28 games played, winrate = 0.75

BilKA_DeepLearning_Go__v3 ( 950407 ) : 16 wins, 28 games played, winrate = 0.57

goBot_omola_Resnet_V3_bis ( 754240 ) : 16 wins, 28 games played, winrate = 0.57

Master_IASD_GO_AurelGhioca ( 979794 ) : 12 wins, 28 games played, winrate = 0.43

u.v4_Simmat_Bertoldo ( 876238 ) : 11 wins, 28 games played, winrate = 0.39

chainsawMassacre3 ( 996907 ) : 10 wins, 28 games played, winrate = 0.36

dl_rattrapage_residual_vf_Simmat_Bertoldo ( 963425 ) : 10 wins, 28 games played, winrate = 0.36

MS_Go ( 978844 ) : 7 wins, 28 games played, winrate = 0.25

KlouviRiva ( 4207980 ) : 6 wins, 28 games played, winrate = 0.21

test ( 191100 ) : 2 wins, 28 games played, winrate = 0.07

JAM_HAO_Go ( 976939 ) : 0 wins, 28 games played, winrate = 0.00

Results (7 April 2020)

golois.policy.weight.4.4 ( 951113 ) : 32 wins, 32 games played, winrate = 1.00

goloissym.0.0001 ( 997461 ) : 30 wins, 32 games played, winrate = 0.94

AL_DO_7 ( 992564 ) : 25 wins, 32 games played, winrate = 0.78

goliath_ultimate ( 977052 ) : 24 wins, 32 games played, winrate = 0.75

RemGo8 ( 999985 ) : 24 wins, 32 games played, winrate = 0.75

Obelix14-3 ( 980635 ) : 21 wins, 32 games played, winrate = 0.66

golois ( 980949 ) : 20 wins, 32 games played, winrate = 0.62

theo ( 965463 ) : 20 wins, 32 games played, winrate = 0.62

octogone ( 941820 ) : 16 wins, 32 games played, winrate = 0.50

amiGo2 ( 345622 ) : 13 wins, 32 games played, winrate = 0.41

AZORIN_COHEN_Dumbo ( 999478 ) : 13 wins, 32 games played, winrate = 0.41

Lahrech_Gonzalez_2020-03-17-12h-29m-03s ( 404817 ) : 11 wins, 32 games played, winrate = 0.34

DHS ( 910169 ) : 8 wins, 32 games played, winrate = 0.25

model_MT_OR_v7 ( 928534 ) : 7 wins, 32 games played, winrate = 0.22

gogol1_2 ( 283604 ) : 4 wins, 32 games played, winrate = 0.12

cesar_dynamic_100e_34_645 ( 733052 ) : 4 wins, 32 games played, winrate = 0.12

test ( 191100 ) : 0 wins, 32 games played, winrate = 0.00

Results (17 January 2020)

goloissym.0.0001 ( 997461 ) : 42 wins, 42 games played, winrate = 1.00

golois ( 980949 ) : 40 wins, 42 games played, winrate = 0.95

AitAliBraham_Dang ( 996520 ) : 35 wins, 42 games played, winrate = 0.83

GUERENDEL_HENRIC_final ( 999815 ) : 35 wins, 42 games played, winrate = 0.83

LeMoing_Massiani_3 ( 979206 ) : 34 wins, 42 games played, winrate = 0.81

Gouteux_Philbert_Final6 ( 983132 ) : 32 wins, 42 games played, winrate = 0.76

Kadoche_resnet_model ( 986026 ) : 30 wins, 42 games played, winrate = 0.71

MonierL_model ( 999988 ) : 27 wins, 42 games played, winrate = 0.64

Delemazure_Bressan_v2_m6e ( 999396 ) : 26 wins, 42 games played, winrate = 0.62

Bodrito_Gallais ( 985714 ) : 25 wins, 42 games played, winrate = 0.60

Pluvinage_Ducos_bidule-aug-1-0.42-0.66-0.61 ( 976812 ) : 21 wins, 42 games played, winrate = 0.50

Kizardjian ( 993707 ) : 18 wins, 42 games played, winrate = 0.43

Minh_final ( 724064 ) : 16 wins, 42 games played, winrate = 0.38

Dublineau_Montini ( 997771 ) : 15 wins, 42 games played, winrate = 0.36

Dupin_Rynkiewicz ( 934375 ) : 15 wins, 42 games played, winrate = 0.36

fabiago ( 52866 ) : 13 wins, 42 games played, winrate = 0.31

MTIBAA_BENJELLOUN3 ( 936208 ) : 13 wins, 42 games played, winrate = 0.31

Petiteau ( 995683 ) : 13 wins, 42 games played, winrate = 0.31

test ( 191100 ) : 6 wins, 42 games played, winrate = 0.14

deLannoy_gollum ( 801754 ) : 4 wins, 42 games played, winrate = 0.10

Dikri ( 779444 ) : 2 wins, 42 games played, winrate = 0.05

Casagrande_Simmat_residual.final7 ( 982314 ) : 0 wins, 42 games played, winrate = 0.00

Trace of the games

The trace of the games is available here: Golois.trace.zip.