Does InternVL support multi-image interleaved conversations? #153
Comments
model.chat only accepts new images when history is None.
You can wrap the generate method yourself, modeled on the chat method.
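I looked at your code. The way the prompt is assembled seems to be the same as in internvl-demo: the images are all placed in the first-round user turn, which is different from what I understand by "interleaved". My understanding of interleaved is the way you handle deepseek-vl, where the image tokens sit in each round's user turn instead of being gathered in the first-round user turn. So I would still like to confirm with the internvl authors what the correct way is for internvl to handle multi-turn conversations that contain images.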
@irexyc
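For internvl: For deepseek-vl: In the former case, if a new round of the conversation contains an image, the history prompt changes (the kv-cache cannot be reused and has to be recomputed). In the latter case it does not change. I think the two are not the same.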
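To make the kv-cache point concrete, here is a rough sketch of the two layouts. It does not use any InternVL or deepseek-vl code: the `USER:`/`ASSISTANT:` role tags, the `<image>` placeholder, and all function names are made up for illustration, and shared characters stand in for shared tokens. The idea is only that a cache can be reused for the part of the new prompt that is identical to the old one.

```python
from typing import List, Tuple

IMG = "<image>"  # hypothetical placeholder standing in for one image's tokens

# One turn: (user question, number of images added in this turn, assistant answer)
Turn = Tuple[str, int, str]


def first_turn_layout(turns: List[Turn]) -> str:
    """Put every image placeholder at the front of the first user turn."""
    total_images = sum(n for _, n, _ in turns)
    parts = []
    for i, (question, _, answer) in enumerate(turns):
        prefix = IMG * total_images if i == 0 else ""
        parts.append(f"USER: {prefix}{question}\nASSISTANT: {answer}\n")
    return "".join(parts)


def per_turn_layout(turns: List[Turn]) -> str:
    """Put each image placeholder in the user turn that introduced it."""
    parts = []
    for question, n, answer in turns:
        parts.append(f"USER: {IMG * n}{question}\nASSISTANT: {answer}\n")
    return "".join(parts)


def common_prefix_len(a: str, b: str) -> int:
    """Length of the shared prefix; a proxy for how much cache is reusable."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


history = [("What is in this picture?", 1, "A cat.")]
extended = history + [("And in this one?", 1, "A dog.")]

old_first, new_first = first_turn_layout(history), first_turn_layout(extended)
old_per, new_per = per_turn_layout(history), per_turn_layout(extended)

first_reuse = common_prefix_len(old_first, new_first)
per_reuse = common_prefix_len(old_per, new_per)

print("first-turn layout reuses", first_reuse, "of", len(old_first), "chars")
print("per-turn layout reuses", per_reuse, "of", len(old_per), "chars")
```

With the first-turn layout the old prompt stops matching as soon as the extra placeholder is inserted near the start, while with the per-turn layout the whole old prompt remains a prefix of the new one.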
I see. The main issue is how the image tokens from earlier turns are handled. I indeed have not seen an official way of handling this.
According to the demo code in the README, the images are placed in the first round of the chat and the image tokens are put in front of the question.
I want to know whether InternVL-chat supports interleaved image-and-text conversations like DeepSpeed-VisualChat. If so, where should the image tokens be inserted in each round of the conversation? An example would be appreciated.