How to perform inference on an MoE model with expert parallelism #6891

Guodanding · 2024-12-18T13:13:52Z

Hello, I want to perform inference on the HuggingFace MoE model Qwen1.5-MoE-A2.7B with expert parallelism using DeepSpeed in a multi-GPU environment. However, the official tutorials are not comprehensive enough, and despite reviewing the documentation, I still don't know how to proceed.

Could you please help me refine this request?

Guodanding · 2024-12-20T02:03:40Z

hello
