LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
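As a quick illustration of the inference side of LMDeploy, here is a minimal sketch of its offline `pipeline` API; the model ID is only a placeholder and any chat model LMDeploy supports can be substituted.

```python
# Minimal sketch of LMDeploy's offline inference pipeline.
# The model ID below is a placeholder; substitute any supported model.
from lmdeploy import pipeline

pipe = pipeline("internlm/internlm2_5-7b-chat")              # load the model
response = pipe(["Summarize DeepSpeed in one sentence."])    # batch of prompts
print(response)
```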
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
An Open-source, Knowledgeable Large Language Model Framework.
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Guide: Finetune GPT-2 XL (1.5 billion parameters) and GPT-Neo (2.7B) on a single GPU with Hugging Face Transformers using DeepSpeed
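A hedged sketch of the pattern such a guide covers: the Hugging Face `Trainer` with DeepSpeed enabled via a config path. Here `ds_config.json` is an assumed ZeRO-offload config file and `train.txt` is a placeholder dataset.

```python
# Sketch: finetune a large causal LM on one GPU with Transformers + DeepSpeed.
# "ds_config.json" (assumed) would hold e.g. ZeRO stage 2 with CPU offload.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "EleutherAI/gpt-neo-2.7B"            # or "gpt2-xl"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

ds = load_dataset("text", data_files={"train": "train.txt"})["train"]

def tokenize(batch):
    out = tokenizer(batch["text"], truncation=True, max_length=512,
                    padding="max_length")
    out["labels"] = out["input_ids"].copy()        # causal-LM labels
    return out

ds = ds.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    fp16=True,
    deepspeed="ds_config.json",                    # DeepSpeed hooks in here
)
Trainer(model=model, args=args, train_dataset=ds).train()
# launch with: deepspeed train.py
```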
Collaborative Training of Large Language Models in an Efficient Way
Best practices for training LLaMA models in Megatron-LM
Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {SFT/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-like MLLM on an RTX 3090/4090 with 24 GB.
Best practices & guides on how to write distributed PyTorch training code (see the sketch below)
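For context, the core pattern such guides build on is a DistributedDataParallel (DDP) training loop; the model and data below are toy placeholders.

```python
# Minimal DDP skeleton: one process per GPU.
# Launch with: torchrun --nproc_per_node=NUM_GPUS train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(128, 10).cuda(local_rank)   # toy model
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

    for _ in range(100):
        x = torch.randn(32, 128, device=local_rank)     # toy batch
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()                                  # grads all-reduced by DDP
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```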
llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, a RESTful API, auto-scaling, compute resource management, monitoring, and more.
Llama 2 finetuning with DeepSpeed and LoRA
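A hedged sketch of that combination using PEFT for the LoRA adapters and a DeepSpeed config passed to the Hugging Face `Trainer`; the model ID, LoRA hyperparameters, and `ds_config.json` are assumptions, and the training dataset is elided.

```python
# Sketch: LoRA adapters on a Llama-2 model, trained with Trainer + DeepSpeed.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)          # only LoRA matrices are trainable
model.print_trainable_parameters()

args = TrainingArguments(output_dir="out",
                         per_device_train_batch_size=2,
                         bf16=True,
                         deepspeed="ds_config.json")     # assumed ZeRO config
# Trainer(model=model, args=args, train_dataset=...).train()  # dataset elided
```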
Training & implementation of chatbots leveraging a GPT-like architecture with the aitextgen package to enable dynamic conversations.
A full pipeline to finetune the ChatGLM LLM with LoRA and RLHF on consumer hardware. Implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.
Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.
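For reference, a minimal hedged sketch of DeepSpeed's pipeline-parallel API: layers are wrapped in a `PipelineModule` and split across stages. The layer stack is a toy stand-in, and `ds_config.json` is an assumed config (it must define batch sizes such as `train_micro_batch_size_per_gpu`).

```python
# Sketch: DeepSpeed pipeline parallelism with a toy layer stack.
# Launch with: deepspeed --num_gpus 2 train_pipe.py
import torch
import deepspeed
from deepspeed.pipe import PipelineModule

deepspeed.init_distributed()

layers = [torch.nn.Linear(1024, 1024) for _ in range(8)]   # toy model stand-in
model = PipelineModule(layers=layers, num_stages=2,
                       loss_fn=torch.nn.MSELoss())

engine, _, _, _ = deepspeed.initialize(model=model,
                                       model_parameters=model.parameters(),
                                       config="ds_config.json")  # assumed config
# engine.train_batch(data_iter=...) runs one pipelined step over (input, label)
# micro-batches pulled from the iterator.
```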