Intro
Hi! I am a student, and I use MuJoCo for my research on RL.
My setup
MuJoCo version: 3.2.5
Python API
64-bit
Ubuntu 24.04.1 LTS
RTX 2060 Super, 8 GB @ 2010 MHz
What's happening? What did you expect?
Running the same code for the MuJoCo MJX Humanoid environment from the Colab tutorial, but with my own humanoid model, gives me the following error:
/home/mayur-kamat/anaconda3/envs/rl/lib/python3.12/site-packages/jax/_src/interpreters/xla.py:133: RuntimeWarning: overflow encountered in cast
return np.asarray(x, dtypes.canonicalize_dtype(x.dtype))
2024-11-20 20:37:20.877478: W external/xla/xla/hlo/transforms/simplifiers/hlo_rematerialization.cc:3020] Can't reduce memory use below 2.12GiB (2278089352 bytes) by rematerialization; only reduced to 5.65GiB (6063398400 bytes), down from 5.70GiB (6125699228 bytes) originally
E1120 20:37:33.901627 165263 hlo_lexer.cc:443] Failed to parse int literal: 894515288310727292233
/home/mayur-kamat/anaconda3/envs/rl/lib/python3.12/site-packages/jax/_src/interpreters/xla.py:133: RuntimeWarning: overflow encountered in cast
return np.asarray(x, dtypes.canonicalize_dtype(x.dtype))
/home/mayur-kamat/anaconda3/envs/rl/lib/python3.12/site-packages/jax/_src/interpreters/xla.py:133: RuntimeWarning: overflow encountered in cast
return np.asarray(x, dtypes.canonicalize_dtype(x.dtype))
/home/mayur-kamat/anaconda3/envs/rl/lib/python3.12/site-packages/jax/_src/interpreters/xla.py:133: RuntimeWarning: overflow encountered in cast
return np.asarray(x, dtypes.canonicalize_dtype(x.dtype))
2024-11-20 20:40:31.260133: W external/xla/xla/tsl/framework/bfc_allocator.cc:306] Allocator (GPU_0_bfc) ran out of memory trying to allocate 1015.14MiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
XlaRuntimeError: RESOURCE_EXHAUSTED: Out of memory while trying to allocate 1064452096 bytes.
This happens on my local machine as well as on the Colab T4 GPU. I would like to know what the issue is and how it can be resolved. Halving the environment count didn't solve it either. Besides, I manually measured the size of the batched environments, which was barely 300 MB of data.
Steps for reproduction
The humanoid model I use is given below. Please use this model in the humanoid environment from the Colab tutorial to replicate the results.
Alright, so the following changes seemed to make it work:
os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = ".5" # reduce to 50% of GPU from default of 75%
In the XML file: changing the options to use iterations="1" and ls_iterations="4", and disabling the eulerdamp flag.
Reducing iterations and ls_iterations was producing instability, so I additionally had to increase the armature value for all my joints from 0.1 to 0.3.
Disabling collisions on all geoms by setting contype="0" and conaffinity="0", and then explicitly listing the pairs of geoms that are allowed to collide using pair elements under the contact section (a trimmed-down sketch of these XML changes is given below).
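Putting the XML-side changes together, here is a rough, trimmed-down sketch of the relevant MJCF sections as a loadable string; the body and geom names are placeholders rather than my actual model, and the snippet only exists to show where each setting goes:

import mujoco

# Trimmed-down MJCF combining the edits above: cheaper solver settings,
# eulerdamp disabled, higher joint armature, collisions off by default,
# and a single explicitly allowed contact pair. All names are placeholders.
_SKETCH_XML = """
<mujoco>
  <option iterations="1" ls_iterations="4">
    <flag eulerdamp="disable"/>
  </option>

  <default>
    <joint armature="0.3"/>
    <geom contype="0" conaffinity="0"/>
  </default>

  <worldbody>
    <geom name="floor" type="plane" size="5 5 0.1"/>
    <body name="torso" pos="0 0 1">
      <freejoint/>
      <geom name="torso_geom" type="capsule" size="0.07" fromto="0 0 0 0 0 0.3"/>
      <body name="foot" pos="0 0 -0.9">
        <joint name="ankle" type="hinge" axis="0 1 0"/>
        <geom name="foot_geom" type="box" size="0.1 0.05 0.02"/>
      </body>
    </body>
  </worldbody>

  <contact>
    <pair geom1="foot_geom" geom2="floor"/>
  </contact>
</mujoco>
"""

# Loading it verifies that the snippet parses; the real model has far more detail.
model = mujoco.MjModel.from_xml_string(_SKETCH_XML)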
These changes have worked so far, although I still get one warning or error that I don't really understand: E1122 14:18:54.592463 24092 hlo_lexer.cc:443] Failed to parse int literal: 894515288310727292233. The code seems to run just fine, though.
Hi @KamatMayur, I loaded your model and created an mjx.Data; ncon is 406. You'll want to use max_geom_pairs and max_contact_points to reduce the memory, or use contype/conaffinity to reduce contact pairs. We're working on better support for broadphase, but with the impl at HEAD you'll have to manually specify a static ncon.
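For illustration, a minimal sketch of one way to inspect that contact count; the model path is a placeholder, and reading the static MJX count from the contact array shape is an assumption about the mjx.Data layout:

import mujoco
from mujoco import mjx

# Placeholder path: substitute the humanoid MJCF from this issue.
model = mujoco.MjModel.from_xml_path("humanoid.xml")

# Contacts actually active in the initial pose (CPU MuJoCo, dynamic count).
data = mujoco.MjData(model)
mujoco.mj_forward(model, data)
print("CPU MuJoCo contacts at qpos0:", data.ncon)

# MJX sizes its contact arrays statically from the candidate geom pairs, so a
# large count is paid in memory by every batched environment. Assumption: the
# static count can be read off the contact.dist array shape.
mjx_data = mjx.make_data(model)
print("MJX statically allocated contacts:", mjx_data.contact.dist.shape[0])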