Integrate another open-source PDF OCR project to recognize scanned PDFs #634

Open
HonorWater opened this issue Jun 13, 2024 · 1 comment
@HonorWater

MaxKB version

v1.2.0

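Describe your requirement or suggested improvement

First of all, thank you to the developers for open-sourcing such a great project!
Many PDF documents are scans, and MaxKB cannot recognize them properly.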

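Describe your proposed implementation

I would like PDF OCR support to be added, so that an imported PDF is first run through OCR. The usual approach is to convert each page of the PDF to an image and then run recognition on it.
This open-source project could serve as a reference: https://github.com/hiroi-sora/Umi-OCR
Its OCR recognition quality is quite good.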
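
The page-to-image-then-OCR flow described above could look roughly like the sketch below. It is only an illustration under assumptions, not MaxKB's or Umi-OCR's actual code: it swaps in PyMuPDF (`fitz`) for page rendering and Tesseract (via `pytesseract`) for recognition, because Umi-OCR's interface is not described in this issue. The function names, the `min_chars` threshold, `dpi=200`, and `lang="chi_sim+eng"` are hypothetical choices for illustration.

```python
import io
from typing import Callable, Iterable, List, TypeVar

Page = TypeVar("Page")


def extract_text_with_ocr_fallback(
    pages: Iterable[Page],
    read_text_layer: Callable[[Page], str],
    ocr_page: Callable[[Page], str],
    min_chars: int = 20,  # assumption: fewer characters than this means "scanned page"
) -> List[str]:
    """Return one text string per page, using OCR only when the text layer is nearly empty."""
    results = []
    for page in pages:
        text = read_text_layer(page)
        if len(text.strip()) < min_chars:
            # No usable embedded text, so treat the page as a scan and OCR it.
            text = ocr_page(page)
        results.append(text)
    return results


def ocr_pdf(path: str, dpi: int = 200, lang: str = "chi_sim+eng") -> List[str]:
    """Hypothetical helper: render each scanned page to an image and run Tesseract on it.

    Assumes PyMuPDF, pytesseract, Pillow, and the Tesseract language data are installed.
    """
    # Third-party imports are kept local so the generic function above has no dependencies.
    import fitz  # PyMuPDF
    import pytesseract
    from PIL import Image

    def ocr_page(page) -> str:
        pix = page.get_pixmap(dpi=dpi)  # rasterize the page
        image = Image.open(io.BytesIO(pix.tobytes("png")))
        return pytesseract.image_to_string(image, lang=lang)

    with fitz.open(path) as doc:
        return extract_text_with_ocr_fallback(doc, lambda p: p.get_text(), ocr_page)
```

Falling back to OCR only for pages whose text layer is (almost) empty keeps ordinary PDFs fast and avoids replacing exact embedded text with less reliable OCR output. The recognition step is passed in as a callable, so a different engine could be substituted without changing the page loop.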

Additional information

No response

@baixin513
Contributor

Thanks for the feedback. We will look into it first.

@baixin513 modified the milestone: v1.5.0 on Aug 19, 2024