InternLM / lmdeploy Public

Fork 242
Star 2.7k

Issues: InternLM/lmdeploy

[Benchmark] benchmarks on different cuda architecture with mo...

#815 opened Dec 11, 2023 by lvhan028

Open 6

Sign up for the InternLM (书生·浦语) LLM hands-on camp: master the full fine-tuning, deployment, and evaluation pipeline in two weeks

#890 opened Dec 26, 2023 by vansin

Open

Labels 32 Milestones 0


107 Open 725 Closed


Issues list

[Bug] Error when quantizing InternVL-Chat-V1-5

#1660 opened May 27, 2024 by BigWhiteFox

2 tasks done

Model output is empty awaiting response

#1659 opened May 27, 2024 by GZL11

1 of 2 tasks

[Bug] Error when quantizing the Qwen1.5-chat model with w4a16

#1656 opened May 27, 2024 by legends-7

2 tasks done

Adapt LLaVA models that use a different LLM base good first issue

#1655 opened May 24, 2024 by red-fox-yj

[Feature] A series of various optimization points

#1647 opened May 23, 2024 by zhyncs

[Bug] Error when setting --tp 4 with AWQ 4-bit quantization

#1646 opened May 23, 2024 by zeroleavebaoyang

2 tasks done

LMDeploy-0.4.1 running qwen1.5 110B: inference produces no result for a long time

#1639 opened May 22, 2024 by summerrain321

2 tasks done

[Feature]- Support for the microsoft/Phi-3-vision-128k-instruct Vision Model

#1637 opened May 22, 2024 by sabarish244

engine_config = TurbomindEngineConfig(tp=2, quant_policy=0, cache_max_entry_count=0.2, session_len=4096)# quant_policy=8, self.pipe = pipeline("InternVL-Chat-V1-5", backend_config=engine_config) With all other configuration parameters unchanged, why do GPU memory usage and inference speed not change at all when quant_policy is set to 8, 0, or 4?

#1635 opened May 22, 2024 by YangYangTx

Are there any plans to support CUDA 11.7? awaiting response

#1632 opened May 21, 2024 by dlin511

Does a service built with lmdeploy support controlling model output by passing stop_words? awaiting response

#1631 opened May 21, 2024 by qiuxuezhe123

2 tasks

[Bug] Repeated output when running inference on qwen1.5-14b-chat with turbomind

#1629 opened May 21, 2024 by qiuxuezhe123

2 tasks

GPU memory usage increases after applying KV cache quantization (int8 or int4) to internvl-v1.5

#1626 opened May 21, 2024 by qingchunlizhi

1 of 2 tasks

[Feature] Layer Wise Calibration and Quantization of Models (To quantize model on Low VRAM GPU)

#1625 opened May 21, 2024 by Tushar-ml

[Feature] specify gpus in pipeline

#1624 opened May 21, 2024 by kleinzcy

Can the GPTQ and AWQ inference kernels be used interchangeably?

#1623 opened May 21, 2024 by sleepwalker2017

[Feature] Implement COG-VLM2

#1622 opened May 20, 2024 by isidentical

[Bug] hang when many requests

#1619 opened May 20, 2024 by NiuBlibing

2 tasks done

[Feature] Grammar/structured output support

#1614 opened May 19, 2024 by nidhoggr-nil

[Bug] Deployed multimodal model produces abnormal output in multi-turn conversations

#1612 opened May 17, 2024 by wssywh

2 tasks done

[Feature] Throw exception when response error

#1610 opened May 17, 2024 by NiuBlibing

[Bug] Deployed llava-v1.6-34b keeps outputting repeated results

#1604 opened May 16, 2024 by wssywh

2 tasks

Support for Pali gemma

#1596 opened May 15, 2024 by bks5881

[Docs] Add docs to NVTX options

#1595 opened May 15, 2024 by yyccli

[Bug] llava, cuda out of memory

#1593 opened May 15, 2024 by AmazDeng

1 of 2 tasks
