q-learning-delusion

A counterexample for Q-learning, discussed in the paper "Non-delusional Q-learning and value-iteration" (cited below).

Lu, Tyler, Dale Schuurmans, and Craig Boutilier. "Non-delusional Q-learning and value-iteration." Advances in Neural Information Processing Systems. 2018.