Extending SayCan Framework: Enhancing language-driven robotic task planning and execution with improved scoring, direct control, and pipeline optimization.
This project extends the original SayCan robotic task planning framework. Key updates include:
- Direct Control Mechanisms using PyBullet.
- Enhanced Scoring Systems for affordance and language evaluations.
- Batch Processing Optimization to improve throughput and reduce latency.
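As background for the scoring update, the original SayCan formulation ranks each candidate skill by multiplying a language-model score by an affordance score. The snippet below is a minimal sketch of that base combination only; it does not reproduce this project's enhanced scoring, and the function names, option strings, and numeric values are illustrative assumptions.

```python
import math


def normalize_llm_scores(log_likelihoods):
    """Softmax over per-option log-likelihoods so the scores sum to 1."""
    max_ll = max(log_likelihoods.values())  # subtract the max for numerical stability
    exp_scores = {opt: math.exp(ll - max_ll) for opt, ll in log_likelihoods.items()}
    total = sum(exp_scores.values())
    return {opt: s / total for opt, s in exp_scores.items()}


def combine_scores(log_likelihoods, affordances):
    """Multiply the normalized language score by the affordance score per option."""
    llm = normalize_llm_scores(log_likelihoods)
    # Options with no affordance entry are treated as infeasible (score 0).
    return {opt: llm[opt] * affordances.get(opt, 0.0) for opt in llm}


def select_option(log_likelihoods, affordances):
    """Return the option with the highest combined score."""
    combined = combine_scores(log_likelihoods, affordances)
    return max(combined, key=combined.get)


# Hypothetical values: the language model prefers the blue block,
# but the affordance score says it is not available in the scene.
example_log_likelihoods = {
    "pick up the red block": -1.0,
    "pick up the blue block": -0.5,
    "done": -3.0,
}
example_affordances = {
    "pick up the red block": 1.0,
    "pick up the blue block": 0.0,
    "done": 0.2,
}
best = select_option(example_log_likelihoods, example_affordances)
print(best)
```

In this example the blue block has the highest language score but zero affordance, so the combined score selects the red block instead.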
Follow these steps to set up the environment and run the experiments:
git clone https://github.com/csce585-mlsystems/SayCan-Extended.git
cd SayCan-Extended
conda create -n saycan python=3.8
conda activate saycan
Ensure all required packages are installed:
pip install --upgrade pip
pip install -r requirements.txt
The project downloads the required assets automatically on the first run, including:
- UR5e robot URDF files
- Robotiq 2F-85 gripper files
- Bowl assets
- ViLD pretrained model weights
The simulation environment is built using PyBullet and includes:
- UR5e robotic arm
- Robotiq 2F-85 gripper
- Manipulatable objects (blocks, bowls)
- Cameras for top-down and perspective views
- Vision Module: ViLD (Vision-Language Detection) for zero-shot object detection.
- Language Module: GPT-3.5-turbo-instruct for task planning and decomposition.
- Manipulation Module: PyBullet direct control for pick-and-place actions.
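To illustrate what a directly controlled pick-and-place action involves, the sketch below breaks one action into end-effector waypoints with a gripper state. It is an assumption about the general approach, not the notebook's implementation: the function name, the hover height, and the waypoint order are hypothetical. In PyBullet, each waypoint would typically be converted to joint targets with `pybullet.calculateInverseKinematics` and sent to the arm with `pybullet.setJointMotorControlArray`.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Waypoint = Tuple[Vec3, bool]  # (end-effector position, gripper_closed)


def pick_and_place_waypoints(
    pick_xyz: Vec3, place_xyz: Vec3, hover_height: float = 0.15
) -> List[Waypoint]:
    """Return an ordered list of waypoints for one pick-and-place action."""
    px, py, pz = pick_xyz
    qx, qy, qz = place_xyz
    # Travel at a single safe height above both the pick and place positions.
    hover_z = max(pz, qz) + hover_height
    return [
        ((px, py, hover_z), False),  # move above the object, gripper open
        ((px, py, pz), False),       # descend to the object
        ((px, py, pz), True),        # close the gripper
        ((px, py, hover_z), True),   # lift the object
        ((qx, qy, hover_z), True),   # move above the target
        ((qx, qy, qz), True),        # descend to the target
        ((qx, qy, qz), False),       # open the gripper to release
        ((qx, qy, hover_z), False),  # retreat upward
    ]
```

Keeping the waypoint generation separate from the simulator calls makes the motion plan easy to inspect and test without starting PyBullet.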
Run the main notebook, SayCanWithDirectControl.ipynb, to:
- Set up the PyBullet simulation environment.
- Perform object detection using ViLD.
- Execute tasks with optimized scoring and batch processing.
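One common way to batch scoring is to group candidate prompts and score each group with a single call instead of one call per option. The sketch below shows that pattern under stated assumptions: `score_batch_fn` is a hypothetical callable standing in for whatever model call the notebook uses, and the default batch size is arbitrary.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def score_in_batches(
    prompts: Sequence[str],
    score_batch_fn: Callable[[Sequence[str]], List[float]],
    batch_size: int = 20,
) -> List[float]:
    """Score all prompts using one call per batch, preserving input order."""
    scores: List[float] = []
    for batch in chunked(prompts, batch_size):
        batch_scores = score_batch_fn(batch)
        # Guard against a scorer that drops or duplicates entries.
        if len(batch_scores) != len(batch):
            raise ValueError("score_batch_fn must return one score per prompt")
        scores.extend(batch_scores)
    return scores
```

With 45 candidate prompts and a batch size of 20, this makes three scoring calls rather than 45.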
Launch Jupyter Notebook:
jupyter notebook
- CUDA-capable GPU recommended for running vision and language models.
- Tested on Python 3.8 with PyTorch and JAX.
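A quick check like the following can confirm that the active environment matches these requirements before opening the notebook. It is a sketch, not part of the repository: the function name is hypothetical, and the module list (`pybullet`, `torch`, `jax`) is an assumption based on the tools named in this README.

```python
import importlib.util
import sys


def check_environment(required=("pybullet", "torch", "jax"), expected_python=(3, 8)):
    """Report whether the Python version matches and each module can be found."""
    report = {"python_ok": sys.version_info[:2] == tuple(expected_python)}
    for name in required:
        # find_spec returns None when a top-level module is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, ok in check_environment().items():
        print(f"{key}: {'ok' if ok else 'missing or mismatched'}")
```

If PyTorch is installed, `torch.cuda.is_available()` reports whether a CUDA-capable GPU is visible to it.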
To regenerate requirements.txt from the packages installed in your active conda environment:
pip freeze > requirements.txt
If you use this work, please cite:
@misc{saycan2022,
title={Do As I Can, Not As I Say: Grounding Language in Robotic Affordances},
author={Michael Ahn and Anthony Brohan and Noah Brown and ... Andy Zeng},
year={2022}
}
This project is licensed under the Apache License 2.0. See the LICENSE file for details.
- Code: GitHub Repository (https://github.com/csce585-mlsystems/SayCan-Extended)
- Video Presentation: YouTube Video