
ChineseChess-AlphaZero

About

Chinese Chess reinforcement learning by AlphaZero methods.

This project is based on these main resources:

  1. DeepMind's October 19, 2017 publication: Mastering the Game of Go without Human Knowledge.
  2. The Reversi, Chess, and Chinese chess implementations of DeepMind's ideas by @mokemokechicken, @Akababa, and @TDteach in their repos: https://github.com/mokemokechicken/reversi-alpha-zero, https://github.com/Akababa/Chess-Zero, https://github.com/TDteach/AlphaZero_ChineseChess
  3. A Chinese chess engine with a GUI: https://github.com/mm12432/MyChess

Note: The distributed version is on the distributed branch.

Note: This repo is still under construction. There is also a slower implementation of the MCTS player (the previous version); see the old branch.

Contributing

To build a strong Chinese chess AI with the same techniques as AlphaZero, we need to run this as a distributed project, since it requires a huge amount of computation.

If you want to join us in building the best Chinese chess AI in the world:

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8

Modules

Reinforcement Learning

This AlphaZero implementation consists of two workers: self and opt.

  • self is the Self-Play worker, which generates training data through self-play using the BestModel.
  • opt is the Trainer, which trains the model and produces new models.

For faster training (since I don't have 5,000 TPUs), another two workers are involved:

  • sl is the Supervised Learning worker, which trains on game data crawled from the Internet.
  • eval is the Evaluator, which evaluates the NextGenerationModel against the current BestModel.
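
Together these workers form a loop: self-play data feeds the trainer, and the evaluator gates promotion of new models. A minimal sketch of that cycle follows (illustrative toy code, not the repo's actual implementation; all function names here are stand-ins):

import random

def self_play(best_model, n_games=10):
    # Stand-in for the 'self' worker: play best_model against itself.
    return [f"game-{i}-by-{best_model}" for i in range(n_games)]

def train(model, data):
    # Stand-in for the 'opt' worker: fit on self-play data, emit a candidate.
    return f"{model}-trained-on-{len(data)}-games"

def evaluate(candidate, best_model, threshold=0.55):
    # Stand-in for the 'eval' worker: promote the candidate only if it
    # wins often enough against the current best.
    win_rate = random.random()  # placeholder for real evaluation games
    return candidate if win_rate >= threshold else best_model

best = "random-init"
for generation in range(3):
    data = self_play(best)
    candidate = train(best, data)
    best = evaluate(candidate, best)
    print(f"generation {generation}: best = {best}")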

GUI

Requirement: pygame

python cchess_alphazero/run.py play

Screenshots

[Screenshot: game board]

You can choose different board and piece styles and which side to play; see Play with human below.

How to use

Setup

Install the libraries:

pip install -r requirements.txt

If you want to use the CPU only, replace tensorflow-gpu with tensorflow in requirements.txt.

Make sure Keras is using the TensorFlow backend and that you have Python 3.6.3+.
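
To double-check which backend Keras is using, you can query it directly (keras.backend.backend() returns the active backend's name):

python -c "import keras; print(keras.backend.backend())"

This should print tensorflow.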

Configuration

PlayDataConfig

  • nb_game_in_file, max_file_num: the maximum number of games kept as training data is nb_game_in_file * max_file_num (see the illustrative config sketch below).

PlayConfig, PlayWithHumanConfig

  • simulation_num_per_move: number of MCTS simulations per move.
  • c_puct: parameter balancing the value network against the policy network in MCTS.
  • search_threads: parameter trading off speed against accuracy in MCTS.
  • dirichlet_alpha: Dirichlet noise parameter that adds randomness to self-play.
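
For illustration, a configuration with these fields might look like the sketch below. The values are placeholders, not the project's defaults; the real settings live in cchess_alphazero/configs/.

class PlayDataConfig:
    # Keep at most nb_game_in_file * max_file_num games of training data,
    # e.g. 5 * 300 = 1500 games with the placeholder values below.
    nb_game_in_file = 5
    max_file_num = 300

class PlayConfig:
    simulation_num_per_move = 800  # MCTS simulations per move
    c_puct = 1.5                   # value vs. policy balance in MCTS
    search_threads = 10            # speed vs. accuracy trade-off
    dirichlet_alpha = 0.2          # self-play root noise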

Basic Usage

Self-Play

python cchess_alphazero/run.py self --gpu '0' --type normal

When executed, Self-Play will start using the BestModel. If the BestModel does not exist, a new randomly initialized model will be created and become the BestModel.

options

  • --new: create a new BestModel
  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which GPU to use
  • --ucci: play against a UCCI engine instead of self-play (see cchess_alphazero/worker/play_with_ucci_engine.py)
  • --distributed: run self-play in distributed mode, uploading play data to the remote server and downloading the latest model from it
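
For example, to start a fresh mini-config self-play run on GPU 1:

python cchess_alphazero/run.py self --gpu '1' --type mini --new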

Trainer

python cchess_alphazero/run.py opt --gpu '0'

When executed, training will start. The current BestModel will be loaded, and the trained model will be saved every epoch as the new BestModel.

options

  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --total-step TOTAL_STEP: specify the total number of steps (mini-batches); the total step count affects the learning rate of training
  • --gpu '1': specify which GPU to use
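
For example, a mini-config training run (the step count here is only an illustration):

python cchess_alphazero/run.py opt --gpu '0' --type mini --total-step 100000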

View training log in Tensorboard

tensorboard --logdir logs/

Then open http://<The Machine IP>:6006/ in your browser.

Play with human

python cchess_alphazero/run.py play --gpu '0'

When executed, the BestModel will be loaded to play against a human.

options

  • --ai-move-first: if set, the AI moves first; otherwise the human moves first
  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which GPU to use
  • --piece-style WOOD: choose a piece style; the default is WOOD
  • --bg-style CANVAS: choose a board style; the default is CANVAS
  • --cli: if set, play against the AI in a CLI environment rather than the GUI
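
For example, to let the AI move first with the default styles made explicit:

python cchess_alphazero/run.py play --gpu '0' --ai-move-first --piece-style WOOD --bg-style CANVAS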

Note: Before you start, you need to download or find a font file (.ttc), rename it to PingFang.ttc, and put it into cchess_alphazero/play_games. I have removed the font file from this repo because it's too big, but you can download it from here.

Evaluator

python cchess_alphazero/run.py eval --gpu '0'

When executed, the worker evaluates the NextGenerationModel against the current BestModel. If the NextGenerationModel does not exist, the worker will wait for it, checking every 5 minutes.

options

  • --type mini: use mini config, (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which gpu to use
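
The waiting behavior amounts to a polling loop along these lines (a sketch only; the model path is a hypothetical stand-in, not the repo's actual layout):

import os
import time

# Hypothetical location of the candidate model; the real path differs.
NEXT_GEN_MODEL = "data/model/next_generation/model_weight.h5"

while not os.path.exists(NEXT_GEN_MODEL):
    time.sleep(300)  # check every 5 minutes, as described above
# ...evaluation games against the BestModel start here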

Supervised Learning

python cchess_alphazero/run.py sl --gpu '0'

When executed, training will start. The current SLBestModel will be loaded, and the trained model will be saved every epoch as the new SLBestModel.

About the data

I have two data sources: one is downloaded from https://wx.jcloud.com/market/packet/10479; the other is crawled from http://game.onegreen.net/chess/Index.html (with the --onegreen option).

options

  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which GPU to use
  • --onegreen: if set, the sl_onegreen worker trains on data crawled from game.onegreen.net
  • --skip SKIP: games whose index is less than SKIP are not used for training (only valid when --onegreen is set)
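
For example, to train on the onegreen data while skipping earlier games (the SKIP value here is only an illustration):

python cchess_alphazero/run.py sl --gpu '0' --onegreen --skip 1000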
