segment-line: annotate polygon or clipped image #127

Open
bertsky opened this issue May 12, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

bertsky (Collaborator) commented May 12, 2020

Currently all we get is bounding boxes, which for historic print often overlap heavily.
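To make the overlap problem concrete, here is a minimal sketch that measures how much two axis-aligned line boxes intersect. The coordinates are made up for illustration (two neighbouring, slightly skewed lines), and the (left, top, right, bottom) convention is an assumption.

```python
def bbox_area(box):
    """Area of a (left, top, right, bottom) box."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def bbox_overlap(a, b):
    """Intersection area of two (left, top, right, bottom) boxes."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    if right <= left or bottom <= top:
        return 0
    return (right - left) * (bottom - top)


def overlap_ratio(a, b):
    """Intersection area relative to the smaller of the two boxes."""
    smaller = min(bbox_area(a), bbox_area(b))
    return bbox_overlap(a, b) / smaller if smaller else 0.0


# Hypothetical boxes of two neighbouring lines whose skew makes them intersect
line1 = (100, 200, 900, 260)
line2 = (100, 245, 900, 305)
print(bbox_overlap(line1, line2), overlap_ratio(line1, line2))
```

Any pixel inside the intersection would end up in the crops of both lines, which is what a polygon or a clipped image is meant to avoid.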

Internally, Tesseract of course "knows" (it has already decided) which component belongs to which line, but how do we get that information via the API? There are two general paths:

  1. polygon coordinates via the baseline, either through the existing (old) API or through a new API that we would first have to get into Tesseract, cf. "Add RowAttributes getter to PageIterator", tesseract-ocr/tesseract#2971 (comment)
  2. retrieving a clipped line image for each line individually, perhaps via GetTextlines or GetComponentImages.
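For path 1, a rough sketch of how a line polygon could be derived from a baseline: shift the baseline perpendicular to itself by an ascent above and a descent below. The function name and the `ascent`/`descent` inputs are assumptions for illustration; where those values would come from (for example a row-attributes getter) is exactly the open question here.

```python
import math


def polygon_from_baseline(p1, p2, ascent, descent):
    """Build a 4-point polygon around a baseline.

    p1, p2: baseline end points (x, y) in image coordinates (y grows downward),
            given left to right
    ascent, descent: non-negative distances above / below the baseline
                     (hypothetical inputs, not taken from any real API here)
    """
    (x1, y1), (x2, y2) = p1, p2
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("baseline end points must differ")
    # Unit normal pointing towards the top of the page for a left-to-right baseline
    nx, ny = (y2 - y1) / length, -(x2 - x1) / length
    return [
        (x1 + nx * ascent, y1 + ny * ascent),    # top left
        (x2 + nx * ascent, y2 + ny * ascent),    # top right
        (x2 - nx * descent, y2 - ny * descent),  # bottom right
        (x1 - nx * descent, y1 - ny * descent),  # bottom left
    ]
```

For a skewed baseline the result is a rotated rectangle rather than an axis-aligned box, so neighbouring lines with the same skew would no longer intersect.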

@wrznr what do you think?

bertsky (Collaborator, Author) commented Feb 8, 2021

Although we now have shrink_polygons (#162) as an alternative solution (on all hierarchy levels), GetImage may still be useful in some circumstances:

  • if the hull polygon still overlaps neighbours (because it should be more concave)
  • if the precision, which is still that of the bboxes of the contained glyphs, is not enough (images carry the exact glyph polygon)

Here's an example of glyph images extracted by

  1. ocrd-tesserocr-segment as it is (with BoundingBox), combined with ocrd-segment-extract-glyphs:
    ſ cropped by bbox
  2. ocrd-tesserocr-segment modified by GetImage(RIL.SYMBOL, 0, None):
    ſ cropped by polygon
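A minimal sketch of what the second variant might look like as a loop. It is duck-typed so it runs without Tesseract: `positions` is assumed to be what `tesserocr.iterate_level(...)` yields, `level` would be `tesserocr.RIL.SYMBOL`, and `GetImage` is assumed to return an `(image, left, top)` tuple. The function name `extract_clips` is hypothetical and not part of ocrd_tesserocr.

```python
def extract_clips(positions, level, padding=0):
    """Collect (image, bbox) pairs, one per segment at the given level.

    positions: iterable of iterator states offering GetImage() and BoundingBox()
               (assumption: as yielded by tesserocr.iterate_level)
    level:     page iterator level, e.g. tesserocr.RIL.SYMBOL (assumption)
    """
    clips = []
    for pos in positions:
        # None as third argument: clip from Tesseract's internal image
        image, _left, _top = pos.GetImage(level, padding, None)
        # The bounding box is still needed to place the clip on the page
        bbox = pos.BoundingBox(level, padding)
        clips.append((image, bbox))
    return clips
```

With real objects, the call would presumably be along the lines of `extract_clips(iterate_level(api.GetIterator(), RIL.SYMBOL), RIL.SYMBOL)` after recognition has run.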

bertsky (Collaborator, Author) commented Feb 8, 2021

So how about the following parameters for an opt-in (each having the segment images annotated as derived images):

  • ocrd-tesserocr-segment and ocrd-tesserocr-recognize: array parameter add_alternativeimages with values region, line, word and/or glyph
  • ocrd-tesserocr-segment-region, ocrd-tesserocr-segment-line and ocrd-tesserocr-segment-word: boolean parameter add_alternativeimages
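As a sketch of the array variant, written as a Python dict that mirrors how a parameter entry in ocrd-tool.json is typically structured. The description text, the empty default (to keep it opt-in) and the `validate_levels` helper are assumptions for illustration, not the actual tool definition.

```python
# Hypothetical declaration of the proposed array parameter
ADD_ALTERNATIVEIMAGES = {
    "type": "array",
    "items": {"type": "string", "enum": ["region", "line", "word", "glyph"]},
    "default": [],  # opt-in: no derived images unless requested
    "description": "hierarchy levels for which to annotate clipped "
                   "segment images as derived images",
}


def validate_levels(value, spec=ADD_ALTERNATIVEIMAGES):
    """Return the list unchanged if every entry is an allowed level, else raise."""
    allowed = spec["items"]["enum"]
    unknown = [v for v in value if v not in allowed]
    if unknown:
        raise ValueError(f"unsupported level(s): {unknown}")
    return list(value)
```

The single-level processors would only need the boolean form, since their level is already implied by the processor itself.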

bertsky (Collaborator, Author) commented Feb 9, 2021

2. modified by GetImage(RIL.SYMBOL, 0, None):

Unfortunately, this only works with None as the third argument, which is equivalent to GetBinaryImage(RIL.SYMBOL). One can pass the raw image there, but in that case Tesseract only applies the polygon mask above the glyph level. So at the glyph level there is no way to get raw images clipped to white around the polygon.
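To spell out what "clipped to white around the polygon" means for a raw image, here is a tiny pure-Python sketch on nested lists. The mask is a hand-written stand-in: as noted above, Tesseract does not hand out such a result for raw images at the glyph level, so where the mask would come from is left open.

```python
def clip_to_white(raw, mask, white=255):
    """Keep raw pixels where mask is truthy, paint everything else white.

    raw:  rows of grayscale values
    mask: rows of 0/1 flags with the same shape (1 = inside the polygon)
    """
    if len(raw) != len(mask) or any(len(r) != len(m) for r, m in zip(raw, mask)):
        raise ValueError("raw and mask must have the same shape")
    return [
        [px if keep else white for px, keep in zip(row, mask_row)]
        for row, mask_row in zip(raw, mask)
    ]


# Made-up 2x3 example: two pixels lie outside the polygon
raw = [[10, 200, 30], [40, 50, 220]]
mask = [[1, 0, 1], [1, 1, 0]]
clipped = clip_to_white(raw, mask)
```

Inside the mask the original gray values survive, which is the difference from a binarized clip.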
