New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
为插件“批量翻译PDF文档(NOUGAT)”添加图片匹配功能(高级参数开启) #1623
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* 适配 google gemini 优化为从用户input中提取文件 * 适配最新的智谱SDK、支持glm-4v * requirements.txt fix * pending history check --------- Co-authored-by: binary-husky <[email protected]>
…binary-husky#1520) * Update crazy_functional.py with new functionality deal with PDF * Update crazy_functional.py and Mermaid.py for plugin_kwargs * Update crazy_functional.py with new chart type: mind map * Update SELECT_PROMPT and i_say_show_user messages * Update ArgsReminder message in get_crazy_functions() function * Update with read md file and update PROMPTS * Return the PROMPTS as the test found that the initial version worked best * Update Mermaid chart generation function * version 3.71 * 解决issues binary-husky#1510 * Remove unnecessary text from sys_prompt in 解析历史输入 function * Remove sys_prompt message in 解析历史输入 function * Update bridge_all.py: supports gpt-4-turbo-preview (binary-husky#1517) * Update bridge_all.py: supports gpt-4-turbo-preview supports gpt-4-turbo-preview * Update bridge_all.py --------- Co-authored-by: binary-husky <[email protected]> * Update config.py: supports gpt-4-turbo-preview (binary-husky#1516) * Update config.py: supports gpt-4-turbo-preview supports gpt-4-turbo-preview * Update config.py --------- Co-authored-by: binary-husky <[email protected]> * Refactor 解析历史输入 function to handle file input * Update Mermaid chart generation functionality * rename files and functions --------- Co-authored-by: binary-husky <[email protected]> Co-authored-by: hongyi-zhao <[email protected]> Co-authored-by: binary-husky <[email protected]>
* Update Latex输出PDF结果.py 借助mathpix实现了PDF翻译中文并重新编译PDF * Update config.py add mathpix appid & appkey * Add 'PDF翻译中文并重新编译PDF' feature to plugins. --------- Co-authored-by: binary-husky <[email protected]>
function_plugins dictionary
* 支持mermaid 滚动放大缩小重置,鼠标滚动和拖拽 * 微调未果 先stage一下 * update --------- Co-authored-by: binary-husky <[email protected]> Co-authored-by: binary-husky <[email protected]>
* Add gemini_endpoint to API_URL_REDIRECT * Update gemini-pro and gemini-pro-vision model_info endpoints
* Add anthropic library and update claude models * 更新bridge_claude.py文件,添加了对图片输入的支持。修复了一些bug。 * 添加Claude_3_Models变量以限制图片数量
maintainability
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
通过高级参数可以尝试让程序对NOUGAT OCR后的结果匹配图片到文章中,对pdf图片周围的文本块与NOUGAT识别后的文本进行模糊匹配进行定位,将图片以md形式加回文章中。效果如下:
缺点:
一些模型会把图片参数吃了(即英文原文有图像参数,但翻译版本没有),例如GPT-3.5-turbo (甚至glm-3表现都比它表现好一点) ,提示词需要改进
对于一些pdf的图片识别效果并不是很好
NOUGAT识别的结果有问题导致图片位置不对