Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check base64 image validation #1615

Closed
wants to merge 4 commits into from
Closed

Conversation

AllentDan
Copy link
Collaborator

@lvhan028
Copy link
Collaborator

Please check why the UT failed

@AllentDan
Copy link
Collaborator Author

Please check why the UT failed

Fixed in #1617

Comment on lines 37 to 41
img = load_image_from_base64(image_url.split(',')[1])
try:
img.load()
except Exception as e:
raise ValueError('invalid base64 image') from e
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

load_image_from_base64 这个函数可能会直接抛异常,比如文件头部信息就缺失了。
img = load_image_from_base64('/9j/')

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

一起捕获了

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里抛异常好呢,还是给个 error msg 然后返回一个空白的图片好呢?

另外这里在下面检查图片完整性比较好,img = Image.open(BytesIO(response.content)), img = Image.open(image_url) 都可能遇到不完整的图片,后面做 img.convert('RGB')的时候就会有问题。

Copy link
Collaborator Author

@AllentDan AllentDan May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

最理想是服务最开始就做 request 的合法性检验,但是图片的信息好像不太好提取(或者重复提取)。pipeline 感觉可以直接抛异常吧。
@lvhan028 有什么建议吗

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

抛异常不太好,会让服务crash。
建议返回错误码和错误信息

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

发现错误,不能直接返回么?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

直接返回 None 吗?现在这部分的调用栈比较深了,耦合在 generate 里面,generate 返回的是迭代器。

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

最理想是服务最开始就做 request 的合法性检验,但是图片的信息好像不太好提取(或者重复提取)。pipeline 感觉可以直接抛异常吧。 @lvhan028 有什么建议吗

不好在接口处中加判断是么?参数合法性校验,应该发生在收到输入的时候

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

不好在接口处中加判断是么?参数合法性校验,应该发生在收到输入的时候

接口处加需要走一遍读图的逻辑。要么就把处理图片 prompt 逻辑放到 generate 外面,不然要读图两次

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

抛异常不太好,会让服务crash。 建议返回错误码和错误信息

异常倒并不会让服务 crash。但是不抛异常,如果让它就继续跑,会让 pipeline 卡住。或者用 @irexyc 的想法,新建一个假图,然后报个 error

赞同新建个假图,报个warning,最起码不要影响其他的请求。
我在client这边有做完整性校验,但后来发现只要长时间运行,服务端一样总会出现"image file is truncated"的情况,之后整个pipeline就卡住了,无法处理新的请求。

@AllentDan AllentDan mentioned this pull request Sep 5, 2024
@AllentDan AllentDan closed this Sep 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants