Commit 448dee8

Fix: PPO tutorial updated for gymnasium support and Mujoco rendering compatibility in Colab
1 parent 5f17335 commit 448dee8

1 file changed: +28 -1 lines changed

intermediate_source/reinforcement_ppo.py

Lines changed: 28 additions & 1 deletion
@@ -42,6 +42,7 @@
 # !pip3 install torchrl
 # !pip3 install gym[mujoco]
 # !pip3 install tqdm
+# !pip install torchrl gymnasium[mujoco] mujoco==3.1.1 (For Google Colab)
 #
 # Proximal Policy Optimization (PPO) is a policy-gradient algorithm where a
 # batch of data is being collected and directly consumed to train the policy to maximise
@@ -211,8 +212,34 @@
 # to a large panel of RL simulators, allowing you to easily swap one environment
 # with another. For example, creating a wrapped gym environment can be achieved with few characters:
 #
+# -----------------------------------------------------------------------------
+# ⚙️ Google Colab and gymnasium compatibility for Mujoco-based environments
+# -----------------------------------------------------------------------------
+
+# Try importing gymnasium (preferred), fallback to gym
+try:
+    import gymnasium as gym
+    USING_GYMNASIUM = True
+except ImportError:
+    import gym
+    USING_GYMNASIUM = False
+
+import os
+
+# In headless environments like Google Colab, Mujoco needs osmesa for rendering
+if "google.colab" in str(get_ipython()):
+    os.environ["MUJOCO_GL"] = "osmesa"
+
+# Use a newer environment name if gymnasium is available
+# (v5 environments are preferred; gym uses v4)
+env_version = "v5" if USING_GYMNASIUM else "v4"
+env_id = f"InvertedDoublePendulum-{env_version}"
+
+# Replace this later:
+
+#base_env = GymEnv("InvertedDoublePendulum-v4", device=device)
+base_env = GymEnv(env_id, device=device)
 
-base_env = GymEnv("InvertedDoublePendulum-v4", device=device)
 
 ######################################################################
 # There are a few things to notice in this code: first, we created
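
Review note: two details in the added block may deserve a follow-up. `get_ipython()` exists only inside IPython sessions, so the Colab check raises `NameError` when the tutorial file runs as a plain Python script. The `v5` environment ids are also, to my knowledge, only registered by Gymnasium 1.0 and later, so an older Gymnasium install would still need `v4`. The sketch below is one possible hardening, not the committed code. The helper names (`running_in_colab`, `configure_mujoco_gl`, `env_version_for`, `pick_env_id`) are hypothetical, the `sys.modules` test is a heuristic for detecting Colab, and the 1.0 cutoff is an assumption worth checking against the Gymnasium release notes.

```python
import os
import sys
from importlib import metadata
from typing import Optional


def running_in_colab() -> bool:
    """Heuristic: Colab kernels have the google.colab package loaded."""
    # Unlike calling get_ipython() directly, this lookup cannot raise
    # NameError when the file is executed outside IPython.
    return "google.colab" in sys.modules


def configure_mujoco_gl(backend: str = "osmesa") -> str:
    """Set MUJOCO_GL in Colab unless the user already chose a backend."""
    if running_in_colab():
        os.environ.setdefault("MUJOCO_GL", backend)
    return os.environ.get("MUJOCO_GL", "")


def installed_gymnasium_version() -> Optional[str]:
    """Return the installed gymnasium version string, or None if absent."""
    try:
        return metadata.version("gymnasium")
    except metadata.PackageNotFoundError:
        return None


def env_version_for(gymnasium_version: Optional[str]) -> str:
    """Map a gymnasium version string to an environment suffix.

    Assumption: v5 MuJoCo environments require gymnasium >= 1.0;
    anything older (or plain gym) falls back to v4.
    """
    if gymnasium_version is None:
        return "v4"
    major = int(gymnasium_version.split(".")[0])
    return "v5" if major >= 1 else "v4"


def pick_env_id(base_name: str = "InvertedDoublePendulum") -> str:
    """Build the environment id for whatever is installed locally."""
    return f"{base_name}-{env_version_for(installed_gymnasium_version())}"
```

With these helpers, the tutorial line could read `base_env = GymEnv(pick_env_id(), device=device)`, with `configure_mujoco_gl()` called near the top of the file. Setting the variable before the simulator is first imported is the conservative order, since rendering backends are typically chosen at import time.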
