Branching Dueling Q-Network (BDQ) is a novel agent that incorporates the proposed action branching architecture into the Deep Q-Network (DQN) algorithm and adapts a selection of its extensions: Double Q-Learning, Dueling Network Architectures, and Prioritized Experience Replay.
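The dueling idea is adapted per branch: each branch's Q-values combine a shared state value with that branch's mean-centred advantages, one of the aggregation schemes described in the paper. Below is a minimal NumPy sketch of this branch-wise aggregation; the function name and array shapes are illustrative assumptions, not the repository's API:

```python
import numpy as np

def branch_q_values(state_value, advantages):
    """Branch-wise dueling aggregation:
    Q_d(s, a_d) = V(s) + (A_d(s, a_d) - mean_a' A_d(s, a')).

    state_value: scalar V(s) from the shared state-value stream.
    advantages:  array of shape [num_branches, num_sub_actions]
                 (an illustrative layout, not the repository's API).
    """
    centered = advantages - advantages.mean(axis=1, keepdims=True)
    return state_value + centered

# Example: 2 action branches with 3 sub-actions (bins) each.
q = branch_q_values(0.5, np.array([[1.0, 2.0, 3.0],
                                   [0.0, 0.0, 3.0]]))
```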
As we show in the paper, BDQ is able to solve numerous continuous control domains via discretization of the action space. Most remarkably, we have shown that BDQ performs well on the Humanoid-v1 domain with a total of approximately 6.5 × 10^25 discrete actions.
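This total follows from the combinatorics of discretization: Humanoid-v1 has a 17-dimensional action space, so n sub-actions per dimension yield n^17 joint actions. The snippet below assumes 33 sub-actions per dimension, a granularity chosen here only because it reproduces the reported total:

```python
# Size of the discretized joint-action space of Humanoid-v1
# (17 action dimensions). 33 sub-actions per dimension is an
# illustrative assumption that reproduces the reported total.
num_dims = 17
num_sub_actions = 33
total_actions = num_sub_actions ** num_dims
print(f"{total_actions:.1e}")  # ~6.5e+25
```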
Our TensorFlow code for BDQ builds on the DQN implementation from the initial release of OpenAI Baselines. However, it does not require installing Baselines, as we have made it a free-standing codebase with relative path imports.
You can clone this repository with:

```bash
git clone https://github.com/atavakol/action-branching-agents.git
```
To train a new model or evaluate a pre-trained one, change directory to the agent's main directory before running the corresponding scripts (this is required because the code uses relative paths).
You can readily train a new model for any continuous control domain from the OpenAI Gym collection, or for the custom reaching domains provided in this repository, by running the train_continuous.py script from the agent's main directory:
```bash
cd action-branching-agents/agents/bdq
python train_continuous.py
```
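Training on a continuous domain relies on discretizing each action dimension into a fixed number of sub-actions, with one branch per dimension. The sketch below illustrates one way such an index-to-action mapping can work; the function name, default bin count, and bounds handling are illustrative assumptions, not the script's actual interface:

```python
import numpy as np

def make_action_mapper(low, high, num_sub_actions=33):
    """Map per-branch sub-action indices back to a continuous action.

    low, high: per-dimension action bounds (e.g. env.action_space.low/high).
    num_sub_actions: bins per dimension (an illustrative default).
    """
    # One row of uniformly spaced values per action dimension.
    grids = np.stack([np.linspace(l, h, num_sub_actions)
                      for l, h in zip(low, high)])

    def to_continuous(bin_indices):
        # bin_indices: one chosen sub-action index per branch/dimension.
        return grids[np.arange(len(bin_indices)), bin_indices]

    return to_continuous

# Example: a 3-dimensional action space bounded in [-1, 1].
mapper = make_action_mapper(low=[-1, -1, -1], high=[1, 1, 1])
action = mapper([0, 16, 32])  # -> array([-1., 0., 1.])
```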
Alternatively, you can evaluate a pre-trained model saved in the agent's trained_models directory, by running the enjoy_continuous.py script from the agent's main directory. By default, evaluation uses a greedy policy.
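Under the branching architecture, a greedy policy decomposes into an independent argmax over each branch's Q-values, so the joint action never has to be enumerated. A minimal sketch, assuming Q-values laid out as [num_branches, num_sub_actions]:

```python
import numpy as np

def greedy_sub_actions(branch_q_values):
    """Pick the best sub-action independently for each branch.

    branch_q_values: array of shape [num_branches, num_sub_actions]
                     (an assumed layout, for illustration).
    Returns one sub-action index per branch; together these
    form the joint action.
    """
    return branch_q_values.argmax(axis=1)

# Example: 2 branches with 3 sub-actions each.
greedy_sub_actions(np.array([[0.1, 0.9, 0.2],
                             [0.4, 0.3, 0.8]]))  # -> array([1, 2])
```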
Currently, a set of pre-trained models is provided in the agent's trained_models directory for the following domains:
- MuJoCo (custom): Reacher3DOF-v0, Reacher4DOF-v0, Reacher5DOF-v0, Reacher6DOF-v0
- MuJoCo (standard): Reacher-v1, Hopper-v1, Walker2d-v1, Humanoid-v1
While training, you can start or stop rendering of the tasks on demand by pressing r (to render) or s (to stop rendering). Keep in mind that, for BDQ, this rendering shows an exploratory policy throughout training.
The current implementation keeps track of the model with the highest average evaluation score and saves it to file only at the end of training.
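In other words, the training loop tracks the best mean evaluation score seen so far and defers the single write-to-file until training completes. A minimal sketch of this pattern, with stand-in values where the actual agent, evaluation routine, and checkpointing would go:

```python
import random

best_mean_score = float("-inf")
best_snapshot = None

for step in range(10_000):
    # ... one training update of the agent would happen here ...
    if step % 1_000 == 0:
        mean_score = random.random()  # stand-in for the mean evaluation score
        if mean_score > best_mean_score:
            best_mean_score = mean_score
            best_snapshot = {"step": step}  # stand-in for a copy of the weights

# The best-scoring model is written to file only once, at the end of training.
print(best_mean_score, best_snapshot)
```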
If you find this open-source release useful, please cite our paper:
```
@inproceedings{tavakoli2018action,
  title={Action Branching Architectures for Deep Reinforcement Learning},
  author={Tavakoli, Arash and Pardo, Fabio and Kormushev, Petar},
  booktitle={AAAI Conference on Artificial Intelligence},
  pages={4131--4138},
  year={2018}
}
```