Previously worked at ByteDance and Temu. Master's degree from NUS, bachelor's degree from SYSU.
Focused on parallel training for LLMs.
- colossalai
- Singapore
Popular repositories
- Finetune_llama2 (Public)
Build a Llama fine-tuning script from scratch using PyTorch and the transformers API. It needs to support 4 optional features: gradient checkpointing, mixed precision, data parallelism, tensor parallel…
Python 2
- ColossalAI (Public), forked from hpcaitech/ColossalAI
Making large AI models cheaper, faster and more accessible
Python 1
- BandWidth_Test (Public)
Test the GPU bandwidth of collective operators such as all-reduce, all-gather, broadcast, and all-to-all on single-node multi-GPU (2, 4, 8 cards) and multi-node multi-GPU (16 cards) setups,…
Python 1
- Pytorch-profile (Public)
Use the PyTorch profiler API to further analyze detailed training information, such as heaps and stacks and time consumption.
Python