Feature Request: Add support for 16-bit quantized LSTM models #4331

Open
lackner-codefy opened this issue Oct 22, 2024 · 6 comments

@lackner-codefy

Your Feature Request

For the LSTM engine, there are currently the fast 8-bit integer models, as well as the "best" models, which probably use 32-bit floating point values.

While the fast models are indeed fast, they make a lot of errors in my specific use case (with tesseract 5.3.0 and 5.4.1, mostly German language). I tested with the best models and they don't have this problem. However, they are also much slower, increasing the processing time considerably.

I'd like to have a better compromise between performance and accuracy: something like a 16-bit integer model, which would (hopefully) still be pretty fast, but wouldn't suffer from these random quality issues.

Would it be possible to implement support for 16-bit integer models? I'm aware that it's not a trivial task, since int_mode() is checked all over the place, and it's also not trivial to write architecture-specific code to handle vector/matrix operations efficiently.

If it's not within the scope of this project, what other tricks could be used to speed up the "best" models?

@amitdo (Collaborator) commented Oct 22, 2024

> While the fast models are indeed fast, they make a lot of errors in my specific use case

The 'fast' models are not based on the 'best' models; they were trained with a smaller network and then converted to int8.

There is an option to convert a 'best' model to an int8 model. This will give you better accuracy than the 'fast' model.
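A minimal sketch of that conversion, assuming the -c (compact) option of combine_tessdata and the tessdata_best download URL; language and file names are placeholders:

```sh
# Fetch a float 'best' model and quantize its LSTM component to int8.
# combine_tessdata -c modifies the file in place, so work on a copy.
wget https://github.com/tesseract-ocr/tessdata_best/raw/main/deu.traineddata
cp deu.traineddata deu_int8.traineddata
combine_tessdata -c deu_int8.traineddata
```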

@amitdo (Collaborator) commented Oct 22, 2024

https://github.com/tesseract-ocr/tesseract/blob/main/doc/lstmtraining.1.asc
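For reference, the doc linked above also describes converting a float training checkpoint into an int8 model with lstmtraining; a sketch with placeholder file names:

```sh
# Finalize a float checkpoint and quantize it to an int8 traineddata file.
# --stop_training finishes the checkpoint; --convert_to_int quantizes it.
lstmtraining \
  --stop_training \
  --convert_to_int \
  --continue_from deu.lstm-checkpoint \
  --traineddata deu.traineddata \
  --model_output deu_int8.traineddata
```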

@stweil (Member) commented Oct 22, 2024

@lackner-codefy, did you also test with models from tessdata? Do they produce results similar to the "best" models?

And can you say more about your specific use case? For some use cases (especially for historic German prints) my models might be better than the official models: https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/.

@stweil (Member) commented Oct 22, 2024

> probably using 32-bit floating point values

Tesseract 4 used double precision (64-bit) values. The current "best" models therefore still contain 64-bit values, which are converted to float (32-bit) by Tesseract 5 (unless it was built to use double).
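If I remember the build options correctly, this is governed by the FAST_FLOAT CMake option (ON by default in Tesseract 5); a sketch of a double-precision build:

```sh
# Build Tesseract with double-precision LSTM inference instead of float.
cmake -B build -DFAST_FLOAT=OFF
cmake --build build
```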

@amitdo (Collaborator) commented Oct 22, 2024

About the tessdata repo stweil mentioned: the models there are a combination of two models, a model for the legacy OCR engine and an LSTM model based on the 'best' model that was converted to int8.

With such a model you can use the command line option --oem 1, which tells tesseract to use only the LSTM model.
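For example (input file and language here are placeholders):

```sh
# OEM 1 selects the LSTM engine only; -l deu loads the German model.
tesseract scan.png out --oem 1 -l deu
```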

@lackner-codefy (Author)

@amitdo @stweil Thanks for all of your suggestions. Really appreciated! 🙏

  • I'll do some experiments to see if converting a best model to int8 gives some improvements.
  • I remember we were using tessdata before and that it performed worse. But for good measure, I'll verify that with another experiment.
  • Most of the documents in the collection are scans of printed documents. However, the quality of the scans can sometimes be quite poor. Some documents have a lot of JPEG artefacts and/or contain handwritten notes.
  • I'll check https://ub-backup.bib.uni-mannheim.de/~stweil/tesstrain/. Since there seem to be multiple models, is there a particular one you had in mind? I'll make sure to use --oem ... to select the correct model.
