Hi, I recently tried the MUJOCO PUSH dataset, but I cannot figure out the concrete meaning of the modalities. The paper says:
The multimodal inputs are gray-scaled images (1 × 32 × 32) from an RGB camera, forces (and binary contact information) from a force/torque sensor, and the 3D position of the robot end-effector.
I found that the modalities in the dataset are "control", "image", "sensor", and "pos". How do these correspond to the modalities described in the paper (i.e., what does each one mean)?
Someone else can confirm, but here's how I think of things:
-> The "image" modality refers to the gray-scale images.
-> The "pos" modality refers to the 3D position of the end-effector.
-> The "sensor" modality refers to the forces and binary contact information.
-> The "control" modality refers to what the controller is sending to the arm itself. (This is the one I'm least sure about.)
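If it helps to check this guess against the data, here is a minimal sketch that pairs each modality key with its shape and the tentative meaning listed above. It assumes one sample can be viewed as a dict of nested lists keyed by modality name; the names `MODALITY_GUESS`, `shape_of`, and `describe_sample` are hypothetical helpers, not part of the dataset's loader, and the sample values are placeholders rather than real data.

```python
# Tentative mapping from dataset keys to the paper's description, as guessed
# in the reply above. The "control" entry is the least certain.
MODALITY_GUESS = {
    "image": "gray-scale camera image (1 x 32 x 32 in the paper)",
    "pos": "3D position of the robot end-effector",
    "sensor": "force readings plus binary contact information",
    "control": "command sent by the controller to the arm (unconfirmed)",
}


def shape_of(value):
    """Return the nested-list shape of value, e.g. [[0, 0], [0, 0]] -> (2, 2)."""
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        if not value:
            break
        value = value[0]
    return tuple(shape)


def describe_sample(sample):
    """Pair each modality key in sample with its shape and guessed meaning."""
    report = {}
    for key, value in sample.items():
        meaning = MODALITY_GUESS.get(key, "unknown modality")
        report[key] = (shape_of(value), meaning)
    return report


if __name__ == "__main__":
    # Hypothetical sample: the "sensor" and "control" lengths are arbitrary
    # placeholders, and the real dataset's layout may differ.
    sample = {
        "image": [[[0.0] * 32 for _ in range(32)]],
        "pos": [0.0, 0.0, 0.0],
        "sensor": [0.0] * 4,
        "control": [0.0] * 4,
    }
    for key, (shape, meaning) in describe_sample(sample).items():
        print(f"{key}: shape={shape} -> {meaning}")
```

Comparing the printed shapes with the paper's description (for example, a 1 x 32 x 32 image and a 3-element position) is a quick way to confirm or rule out each correspondence.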