This repository is a branch of the original Mortal repository, transitioning from value-based methods to policy-based methods.
Initially developed in 2022 based on Mortal V2, it was migrated to Mortal V4 in 2024.
This branch features:
- A more stable performance optimization process
- Enhanced final performance
Note:
The performance results are based on a comparison with the baseline model. The baseline used for testing has been uploaded to RiichLab (mjai.app) and has maintained a stable rank across multiple evaluation batches.
The documentation is consistent with the original repository. Read the Documentation.
Requirement: PyTorch >= 2.4.0
Tested with: PyTorch 2.5.1 + CUDA 12.4 (install via pip)
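As a quick sanity check of the environment (an illustrative snippet, not part of the repository), you can print the installed PyTorch and CUDA versions:

```python
import torch

# Print the installed PyTorch version, the CUDA toolkit it was built against,
# and whether a GPU is visible to this process.
print(torch.__version__)          # e.g. "2.5.1+cu124"
print(torch.version.cuda)         # e.g. "12.4"
print(torch.cuda.is_available())  # True if a usable GPU is detected
```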
Mortal-Policy adopts an offline-to-online training approach:
- **Data Preparation**: Collect samples in `mjai` format. (A minimal log-reading sketch appears after this list.)
- **Configuration**: Rename `config.example.toml` to `config.toml` and set the hyperparameters.
- **Training Stages**:
  - **Offline Phase 1 (Advantage Weighted Regression)**: Run `train_offline_phase1.py`. (A loss sketch appears after this list.)
  - **Offline Phase 2 (Behavior Proximal Policy Optimization)**: Optional; only suitable when online training is unavailable. The code is coming soon.
  - **Online Phase (Policy Gradient with Importance Sampling and PPO-style Clipping)**: Run `train_online.py`. (A clipped-surrogate sketch appears after this list.)
    - While online-only training is possible, it is not recommended.
    - Advantage Weighted Regression (AWR) is not included in the original implementation based on Mortal V2. You can try the following alternative options: Behavior Cloning (BC) or distillation from the value-based Mortal.
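For reference, `mjai` logs are newline-delimited JSON events. The sketch below only illustrates iterating over such a log; the file name and the event fields shown are illustrative assumptions, not an interface defined by this repository:

```python
import json

# Minimal sketch: walk through the events of one mjai log
# (one JSON object per line). "game.mjai.json" is a placeholder path.
with open("game.mjai.json") as f:
    for line in f:
        event = json.loads(line)
        if event.get("type") == "dahai":  # a discard event
            print(event["actor"], event["pai"])
```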
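Offline Phase 1 uses Advantage Weighted Regression. As a rough illustration of the idea (not the loss code used by `train_offline_phase1.py`), AWR weights the log-likelihood of logged actions by an exponentiated advantage; `beta` and the weight clip below are assumed hyperparameters:

```python
import torch

def awr_loss(logp: torch.Tensor, advantages: torch.Tensor,
             beta: float = 1.0, max_weight: float = 20.0) -> torch.Tensor:
    """Advantage Weighted Regression: advantage-weighted negative log-likelihood."""
    # exp(A / beta), clipped to keep the regression weights bounded.
    weights = torch.exp(advantages / beta).clamp(max=max_weight).detach()
    return -(weights * logp).mean()
```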
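The online phase is described as policy gradient with importance sampling and PPO-style clipping. A generic clipped-surrogate loss looks like the following; this is a textbook sketch under those assumptions, not the code in `train_online.py`:

```python
import torch

def ppo_clip_loss(logp_new: torch.Tensor, logp_old: torch.Tensor,
                  advantages: torch.Tensor, clip_eps: float = 0.2) -> torch.Tensor:
    """Clipped surrogate objective (returned as a loss to be minimized)."""
    # Importance-sampling ratio between the current and the behavior policy.
    ratio = torch.exp(logp_new - logp_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # PPO takes the pessimistic minimum of the two surrogates.
    return -torch.min(unclipped, clipped).mean()
```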
Maintained alignment with the original Mortal repository. For details, see this post.
The weights, hyperparameters, and some online training features were removed from this branch when it was open-sourced.
Copyright (C) 2021-2022 Equim
Copyright (C) 2025 Nitasurin
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.