AWS Textract might be more accurate, for instance, but it is not nearly as cost-effective as spinning up an EC2 instance running Tesseract, EasyOCR, or PaddleOCR.<p>Or, from an accuracy-vs-cost standpoint, is it more sensible to just run a transformer model like TrOCR after locating the bounding boxes that contain text with a detector like CRAFT or EAST?
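To put rough numbers on the cost side of that trade-off, here is a minimal back-of-the-envelope sketch. Every figure in it is a placeholder, not real Textract or EC2 pricing, and the names (`SelfHosted`, `breakeven_pages_per_month`) are made up for illustration. It assumes an always-on instance and ignores storage, engineering time, and the cost of accuracy errors.

```python
from dataclasses import dataclass


@dataclass
class SelfHosted:
    """A self-hosted OCR box. Both fields are values you measure or look up yourself."""
    hourly_cost: float     # instance price per hour (placeholder)
    pages_per_hour: float  # measured OCR throughput on that instance (placeholder)

    def cost_per_page(self) -> float:
        # Cost per page when the instance is kept fully busy.
        return self.hourly_cost / self.pages_per_hour

    def monthly_capacity(self, hours_per_month: float = 730.0) -> float:
        # Maximum pages one instance can process in a month.
        return self.pages_per_hour * hours_per_month


def breakeven_pages_per_month(api_cost_per_page: float,
                              instance_hourly_cost: float,
                              hours_per_month: float = 730.0) -> float:
    """Monthly volume above which an always-on instance costs less than
    paying a managed API per page."""
    return instance_hourly_cost * hours_per_month / api_cost_per_page


if __name__ == "__main__":
    box = SelfHosted(hourly_cost=0.10, pages_per_hour=500)  # placeholder numbers
    print("self-hosted cost/page at full load:", box.cost_per_page())
    print("break-even pages/month:",
          breakeven_pages_per_month(api_cost_per_page=0.002,
                                    instance_hourly_cost=box.hourly_cost))
```

Below the break-even volume the per-page API is cheaper. Above it the instance wins, as long as the volume still fits within `monthly_capacity()`.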
It depends on several factors and criteria.<p>Short example list:<p>* fixed-font characters on a blank background; human handwriting set against a busy city-street background<p>* converting an image that is not an actual text font into a text description (a collage of images forming the illusion of a text font)
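Since the answer depends on the kind of input, one way to settle it is to run each candidate engine on a small labelled sample of your own images and compare character error rate (CER) per input category. The sketch below is only the scoring part, in plain Python. The category names and the `(category, reference, hypothesis)` tuple layout are assumptions for illustration, and the OCR output strings would come from whichever engine you are testing.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions
    needed to turn hyp into ref."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate of one OCR output against its ground truth."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def cer_by_category(samples):
    """samples: iterable of (category, reference, hypothesis) tuples.
    Returns {category: total edits / total reference characters}."""
    totals = {}
    for category, ref, hyp in samples:
        errs, chars = totals.get(category, (0, 0))
        totals[category] = (errs + edit_distance(ref, hyp), chars + len(ref))
    # Skip categories whose references were all empty to avoid dividing by zero.
    return {c: e / n for c, (e, n) in totals.items() if n}


if __name__ == "__main__":
    # Hypothetical outputs from one engine on two kinds of input.
    samples = [
        ("printed", "hello", "hello"),
        ("handwriting", "hello", "hallo"),
    ]
    print(cer_by_category(samples))
```

Running this once per engine gives a per-category accuracy figure to set against each engine's cost per page.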