MABWiser: Parallelizable Contextual Multi-Armed Bandits

MABWiser (IJAIT 2021, ICTAI 2019) is a research library written in Python for rapid prototyping of multi-armed bandit algorithms. It supports context-free, parametric and non-parametric contextual bandit models and provides built-in parallelization for both training and testing components.

The library also provides a simulation utility for comparing different policies and performing hyper-parameter tuning. MABWiser follows a scikit-learn style public interface, adheres to PEP-8 standards, and is tested heavily.

MABWiser is developed by the Artificial Intelligence Center of Excellence at Fidelity Investments. Documentation is available at fidelity.github.io/mabwiser.

Bandit-based Recommender Systems

To solve personalized recommendation problems, MABWiser is integrated into our Mab2Rec library. Mab2Rec enables building content- and context-aware recommender systems, in which MABWiser helps select the next best item (arm).

Bandit-based Large-Neighborhood Search

To solve combinatorial optimization problems, MABWiser is integrated into the Adaptive Large Neighborhood Search (ALNS) library. ALNS enables building metaheuristics for complex optimization problems, in which MABWiser helps select the next best destroy or repair operation (arm).

Quick Start

# An example that shows how to use the UCB1 learning policy
# to choose between two arms based on their expected rewards.

# Import MABWiser Library
from mabwiser.mab import MAB, LearningPolicy, NeighborhoodPolicy

# Data
arms = ['Arm1', 'Arm2']
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]

# Model 
mab = MAB(arms, LearningPolicy.UCB1(alpha=1.25))

# Train
mab.fit(decisions, rewards)

# Test
mab.predict()
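
To make the Quick Start result easier to interpret, the arithmetic behind a UCB1-style choice can be worked by hand. The sketch below is plain Python and is not MABWiser's internal code; it assumes the textbook UCB1 score (mean reward plus `alpha * sqrt(2 * ln(N) / n)`), and the helper name `ucb1_scores` is hypothetical.

```python
import math
from collections import defaultdict

def ucb1_scores(decisions, rewards, alpha=1.0):
    """Return {arm: mean reward + alpha * sqrt(2 * ln(N) / n_arm)}.

    Assumption: textbook UCB1 scoring; MABWiser's exact formula may differ.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for arm, reward in zip(decisions, rewards):
        totals[arm] += reward
        counts[arm] += 1
    n_total = sum(counts.values())
    return {
        arm: totals[arm] / counts[arm]
        + alpha * math.sqrt(2 * math.log(n_total) / counts[arm])
        for arm in counts
    }

# Same data as the Quick Start above
decisions = ['Arm1', 'Arm1', 'Arm2', 'Arm1']
rewards = [20, 17, 25, 9]

scores = ucb1_scores(decisions, rewards, alpha=1.25)
best_arm = max(scores, key=scores.get)
print(scores, best_arm)
```

Under this assumption Arm2 scores highest: it has the larger mean reward (25 versus about 15.3) and also the larger exploration bonus, because it was tried only once.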

Available Bandit Policies

Available Learning Policies:

Epsilon Greedy [1, 2]
LinGreedy [1, 2]
LinTS [3]. See [11] for a formal treatment of reproducibility in LinTS
LinUCB [4]
Popularity [2]
Random [2]
Softmax [2]
Thompson Sampling (TS) [5]
Upper Confidence Bound (UCB1) [2]
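
As an illustration of how a learning policy trades exploration against exploitation, here is a minimal sketch of the epsilon-greedy idea in plain Python. It is not MABWiser's implementation; the function name `epsilon_greedy` and the sample mean rewards are assumptions made for the example.

```python
import random

def epsilon_greedy(mean_rewards, epsilon, rng):
    """Pick a random arm with probability epsilon, else the best-known arm."""
    arms = list(mean_rewards)
    if rng.random() < epsilon:
        return rng.choice(arms)               # explore
    return max(arms, key=mean_rewards.get)    # exploit

# Hypothetical mean rewards observed so far for each arm
mean_rewards = {'Arm1': 15.33, 'Arm2': 25.0}
rng = random.Random(7)  # seeded for repeatable runs

# With epsilon = 0 the policy is purely greedy.
always_exploit = epsilon_greedy(mean_rewards, epsilon=0.0, rng=rng)

# With epsilon = 0.25, most picks still go to the best-known arm.
picks = [epsilon_greedy(mean_rewards, epsilon=0.25, rng=rng) for _ in range(1000)]
share_arm2 = picks.count('Arm2') / len(picks)
print(always_exploit, share_arm2)
```

With two arms and epsilon = 0.25, the best-known arm is expected to be chosen about 87.5% of the time: 75% from exploitation plus half of the 25% exploration draws.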

Available Neighborhood Policies:

Clusters [6]
K-Nearest [7, 8]
LSH Nearest [9]
Radius [7, 8]
TreeBandit [10]
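
A neighborhood policy restricts learning to historical observations whose contexts resemble the new one. The sketch below illustrates the k-nearest idea in plain Python under simplifying assumptions: Euclidean distance, and a greedy mean-reward choice inside the neighborhood. The helper `k_nearest_predict` and the data are hypothetical and do not reproduce MABWiser's code.

```python
import math

def k_nearest_predict(contexts, decisions, rewards, new_context, k):
    """Choose the arm with the highest mean reward among the k nearest rows."""
    # Rank historical rows by Euclidean distance to the new context.
    order = sorted(range(len(contexts)),
                   key=lambda i: math.dist(contexts[i], new_context))
    totals, counts = {}, {}
    for i in order[:k]:
        arm = decisions[i]
        totals[arm] = totals.get(arm, 0.0) + rewards[i]
        counts[arm] = counts.get(arm, 0) + 1
    # Greedy choice over arms that appear in the neighborhood.
    means = {arm: totals[arm] / counts[arm] for arm in totals}
    return max(means, key=means.get)

# Hypothetical history: two-feature contexts, the arm played, the reward seen
contexts = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9], [0.2, 0.2]]
decisions = ['Arm1', 'Arm2', 'Arm1', 'Arm2', 'Arm1']
rewards = [10, 2, 1, 8, 9]

near_origin = k_nearest_predict(contexts, decisions, rewards, [0.0, 0.0], k=3)
near_corner = k_nearest_predict(contexts, decisions, rewards, [1.0, 1.0], k=2)
print(near_origin, near_corner)
```

The same history yields different recommendations for different contexts: Arm1 performed well near the origin, while Arm2 performed well near (1, 1).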

Installation

MABWiser requires Python 3.8+ and can be installed from PyPI using pip install mabwiser or by building from source, as shown in the installation instructions.
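
After installing, a quick check like the following can confirm that the interpreter meets the version requirement and report which mabwiser version, if any, is visible to it. This is a sketch using only the standard library; the helper name `mabwiser_status` is hypothetical.

```python
import sys
from importlib import metadata  # standard library in Python 3.8+

def mabwiser_status():
    """Return (meets Python 3.8+ requirement, installed mabwiser version or None)."""
    python_ok = sys.version_info >= (3, 8)
    try:
        version = metadata.version("mabwiser")
    except metadata.PackageNotFoundError:
        version = None  # package is not installed in this environment
    return python_ok, version

python_ok, version = mabwiser_status()
print(f"Python 3.8+: {python_ok}, mabwiser: {version or 'not installed'}")
```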

Support

Please submit bug reports and feature requests as Issues.

Citation

If you use MABWiser in a publication, please cite it as:

    @article{DBLP:journals/ijait/StrongKK21,
      author    = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
      title     = {{MABWiser:} Parallelizable Contextual Multi-armed Bandits},
      journal   = {Int. J. Artif. Intell. Tools},
      volume    = {30},
      number    = {4},
      pages     = {2150021:1--2150021:19},
      year      = {2021},
      url       = {https://doi.org/10.1142/S0218213021500214},
      doi       = {10.1142/S0218213021500214},
    }

    @inproceedings{DBLP:conf/ictai/StrongKK19,
      author    = {Emily Strong and Bernard Kleynhans and Serdar Kadioglu},
      title     = {MABWiser: {A} Parallelizable Contextual Multi-Armed Bandit Library for Python},
      booktitle = {31st {IEEE} International Conference on Tools with Artificial Intelligence, {ICTAI} 2019, Portland, OR, USA, November 4-6, 2019},
      pages     = {909--914},
      publisher = {{IEEE}},
      year      = {2019},
      url       = {https://doi.org/10.1109/ICTAI.2019.00129},
      doi       = {10.1109/ICTAI.2019.00129},
    }

License

MABWiser is licensed under the Apache License 2.0.

References

[1] John Langford and Tong Zhang. The epoch-greedy algorithm for contextual multi-armed bandits
[2] Volodymyr Kuleshov and Doina Precup. Algorithms for multi-armed bandit problems
[3] Shipra Agrawal and Navin Goyal. Thompson sampling for contextual bandits with linear payoffs
[4] Wei Chu, Lihong Li, Lev Reyzin, and Robert Schapire. Contextual bandits with linear payoff functions
[5] Ian Osband, Daniel Russo, and Benjamin Van Roy. More efficient reinforcement learning via posterior sampling
[6] Trong T. Nguyen and Hady W. Lauw. Dynamic clustering of contextual multi-armed bandits
[7] Melody Y. Guan and Heinrich Jiang. Nonparametric stochastic contextual bandits
[8] Philippe Rigollet and Assaf Zeevi. Nonparametric bandits with covariates
[9] Piotr Indyk, Rajeev Motwani, Prabhakar Raghavan, and Santosh Vempala. Locality-preserving hashing in multidimensional spaces
[10] Adam N. Elmachtoub, Ryan McNellis, Sechan Oh, and Marek Petrik. A practical method for solving contextual bandit problems using decision trees
[11] Doruk Kilitcioglu and Serdar Kadioglu. Non-deterministic behavior of Thompson sampling with linear payoffs and how to avoid it
