First attempts at training an agent to solve the maze without a DQN. repeated_compactState reflects the latest of these attempts. The state space had to be compacted significantly to stay tractable, so that the agent could visit every state often enough to learn a policy. The state was reduced to the agent's location and the quadrant containing the fewest traps.
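The compacted state described above can be sketched roughly as follows. This is only an illustration, not the code from repeated_compactState: the function names, the quadrant numbering, the tie-breaking rule, the trap layout, and the choice of tabular Q-learning as the non-DQN learner are all assumptions made for the example.

```python
from collections import defaultdict


def safest_quadrant(traps, n_rows, n_cols):
    """Return the index (0..3) of the quadrant with the fewest traps.

    Assumed numbering: 0 = top-left, 1 = top-right,
    2 = bottom-left, 3 = bottom-right. Ties go to the lowest index.
    """
    counts = [0, 0, 0, 0]
    for r, c in traps:
        # Booleans act as 0/1, so this maps (row half, column half) to 0..3.
        q = 2 * (r >= n_rows // 2) + (c >= n_cols // 2)
        counts[q] += 1
    return counts.index(min(counts))


def compact_state(agent_pos, traps, n_rows, n_cols):
    """Compact state: agent row, agent column, safest quadrant."""
    return (agent_pos[0], agent_pos[1], safest_quadrant(traps, n_rows, n_cols))


def q_update(q_table, state, action, reward, next_state,
             alpha=0.1, gamma=0.9, n_actions=4):
    """One tabular Q-learning update (assumed learner, hypothetical defaults)."""
    best_next = max(q_table[(next_state, a)] for a in range(n_actions))
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])


# Hypothetical 4x4 maze with five traps.
q_table = defaultdict(float)
traps = {(0, 1), (1, 0), (0, 3), (3, 0), (3, 1)}
s = compact_state((0, 0), traps, 4, 4)
s_next = compact_state((0, 1), traps, 4, 4)
q_update(q_table, s, action=1, reward=-1.0, next_state=s_next)
```

Under these assumptions a 4x4 maze has at most 16 x 4 = 64 distinct states, which is small enough for a lookup table to be visited repeatedly during training.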
The final version of the project, including the write-up, can be found here.