When chatting with ChatGLM2 via vLLM, we only get a few truncated tokens per message, e.g.:
```python
>>> result = chat.completion(
...     messages=[
...         [
...             ChatMessage(role="user", content="中国共有多少人口?"),
...         ],
...         [
...             ChatMessage(role="user", content="中国首富是谁"),
...         ],
...         [
...             ChatMessage(role="user", content="如何在三年内成为中国首富"),
...         ],
...     ],
...     temperature=0.7,  # You can also overwrite the configurations in each conversation.
...     max_tokens=2048,
... )
Processed prompts: 100%|██████████| 3/3 [00:00<00:00, 17.31it/s]
>>> print(result)
[' 根据2021年中国国家统计局发布的数据,截至2020', ' 中国的首富目前的个人财富来自房地产和互联网行业。根据202', ' 成为首富是一个非常具有挑战性和难以预测的因素,而且这个目标并不是每个人']
```
The `max_tokens` setting does not seem to take effect: each response is cut off after only a few tokens even though `max_tokens=2048` was requested.
/kind bug
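For anyone triaging this: the symptom is consistent with the per-call `max_tokens` override never being merged into the sampling parameters that reach the engine, so a small default wins. Below is a minimal plain-Python sketch of the forwarding a chat wrapper would need; `FakeEngine`, `completion`, and `default_params` are hypothetical stand-ins, not vLLM's actual API.

```python
class FakeEngine:
    """Stand-in for the generation engine: truncates output to max_tokens 'tokens'."""

    def generate(self, prompt: str, max_tokens: int) -> str:
        # Treat whitespace-separated words as tokens for illustration.
        tokens = prompt.split()
        return " ".join(tokens[:max_tokens])


def completion(engine, prompt, default_params=None, **overrides):
    # Per-call overrides (e.g. max_tokens=2048) must be merged over the
    # defaults. If this merge is skipped, a small default silently wins,
    # which matches the truncation seen above.
    params = dict(default_params or {})
    params.update(overrides)
    return engine.generate(prompt, max_tokens=params.get("max_tokens", 16))


engine = FakeEngine()
prompt = "one two three four five six seven eight"

# Without an override, the small default limit truncates the output.
short = completion(engine, prompt, default_params={"max_tokens": 3})

# With the override forwarded correctly, the full output comes back.
full = completion(engine, prompt, default_params={"max_tokens": 3}, max_tokens=100)
```

If the wrapper drops `**overrides` (or builds its sampling params before applying them), every call behaves like `short` above regardless of the `max_tokens` the caller passes.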