I'm trying to get some text from images which look like this:
This example would actually be the best case scenario as most of them would have a colored and more complex background instead.
I don't need it to be 100% accurate since I know the possible outcomes and could try to do a partial match with them.
I tried Aspose OCR and Tess4j. Aspose gives me random characters and Tess4j gives nothing.
Is this doable with a free library?
Tesseract seems to be the best free library for this purpose.
I know that some projects using Tesseract pre-process the images before OCR'ing them: adjusting contrast, rotation, resolution, etc. They then run OCR on the same image several times, once per pre-processing variant, and compare the results.
More information here
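Since the question mentions that the possible outcomes are known, the "compare the results" step could be sketched as a fuzzy match against that list. This is only an illustration in plain Java: `OutcomeMatcher`, `editDistance` and `bestMatch` are hypothetical names, the OCR readings are assumed to come from whichever engine and pre-processing variants you use, and Levenshtein distance is just one possible similarity measure.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: maps noisy OCR readings onto a list of known outcomes. */
final class OutcomeMatcher {

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = prev[j - 1] + cost;
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the known outcome with the smallest edit distance to any of the
     * OCR readings (case-insensitive), or null if either list is empty.
     */
    static String bestMatch(List<String> ocrReadings, List<String> knownOutcomes) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String reading : ocrReadings) {
            String normalized = reading.trim().toLowerCase(Locale.ROOT);
            for (String outcome : knownOutcomes) {
                int d = editDistance(normalized, outcome.toLowerCase(Locale.ROOT));
                if (d < bestDistance) {
                    bestDistance = d;
                    best = outcome;
                }
            }
        }
        return best;
    }
}
```

You would call `OutcomeMatcher.bestMatch(readings, outcomes)` with one reading per pre-processing variant. In practice you may also want to reject the match when the best distance is too large relative to the string length.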
How reliable is ZXing's barcode localization for DataMatrix decoding compared to libdmtx?
I have a set of PNG image files of stickers (proprietary, so unfortunately I'm not able to share them) containing DataMatrix barcodes. These stickers sit on flat surfaces, have very nice quiet zones and are generally centered in the image, but suffer from uneven lighting conditions and slight dust, which are likely the largest obstacles to reliable decoding.
I'd like to use a modifiable Java library to decode them, and it seems that ZXing is the only open-source option (I'm open to other suggestions). However, when I run these images through the ZXing online decoder, I consistently get NO BARCODE FOUND, even on the cleanest images. In contrast, when I run the same images through proprietary online decoders, like Inlite's Free Online Barcode Reader, I get reliable decodes for all the images. My company has implemented a library in C that also reliably decodes the barcode images by processing them and calling libdmtx. Similarly, this online DataMatrix decoder built on libdmtx can also reliably read my image files.
Is the barcode localization in ZXing significantly inferior to libdmtx?
If I attempt the same preprocessing on the image files before I run them through ZXing, could I achieve similar results? I have a strong preference for a Java library (ZXing), but I may have no choice but to use libdmtx. Would appreciate any insight, thanks!
I had a similar problem to yours, but on the encoding side. In my experience ZXing is certainly inferior to libdmtx. We use both libraries in-house, in C++ and Java projects.
There is a case where ZXing breaks while generating a barcode; see my comments here:
https://github.com/zxing/zxing/issues/624
libdmtx, however, works flawlessly. The other free options you have in the Java world are (they are for encoding):
barcode4j
OkapiBarcode
Another alternative is the relatively new ZXing cpp port here: https://github.com/nu-book/zxing-cpp.
It contains a completely new DataMatrix detector that was meant to fix serious limitations of the Java upstream version. It was specifically designed to deal with low-resolution images (module size as low as around 2 pixels) and with symbols that have just the required 1-module quiet zone and a busy background.
The following comparison is certainly not 'fair', but I had the dmtxread utility of libdmtx try my test set of images: it missed 3 of 17 samples and took a whopping 300 times as long as my code :).
This question may be beyond the scope of a simple answer here on Stack Overflow, but my hope is that it will help me formulate several more specific questions to get where I need to be.
I want to write a program that searches a buffered image for text and returns it as a string. I don't want to write an entire OCR program, but would rather use an API that is freely available such as tesseract. Unfortunately I've been unable to find a Java API for tesseract.
I know that the font is Arial and I know its size. I am wondering if that will help.
I've already managed to capture the screen, but I'm not sure how to accomplish the next step of identifying the text found in the image.
the question
How can I implement a simple OCR function into my java program?
You can use the tesjeract or tess4j wrappers of the Tesseract API. Be sure to rescale your images to 300 DPI, since the resolution of screenshots (72 or 96 DPI) is in general not adequate for OCR.
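As a rough sketch of the rescaling step, a captured `BufferedImage` can be enlarged with plain `java.awt` before it is handed to the OCR wrapper. `DpiScaler` and `rescale` are hypothetical names, and the source DPI is an assumption you have to supply yourself (96 is typical for screenshots). Note that this only scales the pixel dimensions; it does not write any DPI metadata.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper: upscales a screenshot so its effective resolution is closer to 300 DPI. */
final class DpiScaler {

    /** Scales src by targetDpi / sourceDpi using bicubic interpolation. */
    static BufferedImage rescale(BufferedImage src, double sourceDpi, double targetDpi) {
        double factor = targetDpi / sourceDpi;
        int width = (int) Math.round(src.getWidth() * factor);
        int height = (int) Math.round(src.getHeight() * factor);

        BufferedImage dst = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        try {
            // Bicubic interpolation tends to keep glyph edges smoother than nearest-neighbour.
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return dst;
    }
}
```

For a 96 DPI screenshot, `DpiScaler.rescale(screenshot, 96, 300)` enlarges the image by a factor of about 3.1 in each direction.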
Implementing OCR yourself is complicated, but using an SDK like http://asprise.com/product/ocr/index.php?lang=java is simple.
I want to build an upload applet / desktop client that can resize - trim and then send an image to my amazon s3 bucket.
Trimming my images has proven to take the longest and to be the most CPU-intensive step. On my server, when a user uploads an image using an HTML form, I use the ImageMagick command-line tools to do the trim and resize.
What tools would you guys recommend using?
Edit: I need to be able to automatically crop images. Something similar to the imagemagick trim function.
I would recommend that you check out JAI.
Specifically check out the classes CropDescriptor, ScaleDescriptor and their static methods.
Also, take a look at the subclasses of OperationDescriptorImpl, they will give you an idea of the types of operations JAI is capable of.
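For the automatic crop specifically, the idea behind a trim can be sketched with plain `java.awt.image.BufferedImage` rather than JAI: find the bounding box of all pixels that differ from the background and cut the image down to it. This is a minimal illustration, not ImageMagick's actual algorithm. `AutoTrim` is a hypothetical name, the top-left pixel is assumed to be the background colour, and the comparison is exact (there is no tolerance comparable to ImageMagick's `-fuzz`).

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper: crops away a uniform border, similar in spirit to an automatic trim. */
final class AutoTrim {

    /** Returns the smallest sub-image containing every pixel that differs from the top-left pixel. */
    static BufferedImage trim(BufferedImage src) {
        int background = src.getRGB(0, 0); // assumption: the corner pixel is background
        int minX = src.getWidth();
        int minY = src.getHeight();
        int maxX = -1;
        int maxY = -1;

        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                if (src.getRGB(x, y) != background) {
                    minX = Math.min(minX, x);
                    maxX = Math.max(maxX, x);
                    minY = Math.min(minY, y);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        if (maxX < 0) {
            return src; // the image is a single colour, so there is nothing to trim
        }
        // getSubimage shares pixel data with src; copy it if src will be modified later.
        return src.getSubimage(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```

Calling `getRGB` per pixel is simple but slow on large images; reading the raster into an `int[]` once would likely be faster if trimming turns out to be the bottleneck.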
I am working on Google App Engine to create a tool for comparing image similarity.
I need to extract the pixel values of each image to perform this.
Unfortunately App Engine does not support the Java image libraries, so I am unable to proceed.
Is there any App Engine-safe image library in Java capable of extracting image data?
I saw some techniques in Python, but I don't want to switch to Python if I can do it in Java somehow...
GAEJ has its own graphics library with fairly limited features, and java.awt.image.BufferedImage is a restricted class (i.e., java.awt.Image is not supported and is still not present in the JRE Class White List).
There's an open issue here, that you might want to star.
EDIT:
Somebody has patched pngj to work with InputStream. (You could use it to read a PNG pixel by pixel.)
The new version of pngj now includes an alternative pngj-sandbox.jar that only references whitelisted classes; it should run on Google App Engine.
Could https://github.com/witwall/appengine-awt help you? I believe it will be enough to add this lib as a dependency to the project to make BufferedImage work (but I haven't tried this yet).
I have a PDF which contains a scanned document, and I need to read some parts of it. I already had this working with Google Cloud OCR, but I just noticed it might not be adequate, as I'll be exceeding the monthly quota (1k requests/month), so I'm switching to Tesseract instead.
The project is done in Windows and Java, but currently I'm doing some tests using linux.
I am not uploading any of my original images, as I am not sure whether they contain sensitive information; instead I am using some images from the internet which are VERY similar.
I have read that I can help Tesseract produce better results by doing some preliminary work on the original image (using TextCleaner?). I would like to know how to do that kind of thing in a Windows/Java environment and, most importantly, how to successfully eliminate the dark background of the table and, if possible, the horizontal and vertical lines of the table, as they don't help at all during OCR.
Yes, you are right, you can clean the image to get a better recognition, see https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality .
You can use ImageMagick to sharpen the image (high resolution). Tesseract works better on high-resolution images. If you are using Python (I think you aren't), Pillow (PIL, the Python Imaging Library) works great for enhancing the quality of images.
My text cleaner script will not help much with this image. It won't remove the dark background, especially since it is textured. For other images with large regions of nearly constant color, it can make that background white. But it runs only on Unix-like systems and not with Java. So for Windows you would need to use the Windows 10 built-in Unix or install Cygwin.
Here is one example from http://www.fmwconcepts.com/imagemagick/textcleaner/index.php
Input:
textcleaner -g -e stretch -f 25 -o 10 -s 1 twinkle.jpg twinkle_g_stretch_f25_o10_s1.jpg
Text Recognition depends on a variety of factors to produce a good quality output. OCR output highly depends on the quality of input image. This is why every OCR engine provides guidelines regarding the quality of input image and its size. These guidelines help OCR engine to produce accurate results.
Here Image Preprocessing comes into play to improve the quality of input image so that the OCR engine gives you an accurate output.
I have written a detailed article on image processing in Python. Kindly follow the link below for more explanation.
https://medium.com/cashify-engineering/improve-accuracy-of-ocr-using-image-preprocessing-8df29ec3a033
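Since the question asks about a Java environment, one common preprocessing step, binarization, might be sketched in plain Java as follows. This is not taken from the linked article: `Binarizer` is a hypothetical name and Otsu's global threshold is a swapped-in, generic technique. A single global threshold only helps when text and background differ clearly in brightness; a textured dark background or table lines may need local thresholding or other steps.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper: converts an image to black and white using Otsu's global threshold. */
final class Binarizer {

    /** Approximate luminance (0..255) of a packed RGB pixel. */
    static int luminance(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    /** Otsu's method: picks the threshold that maximises the between-class variance. */
    static int otsuThreshold(int[] histogram, int totalPixels) {
        long sumAll = 0;
        for (int i = 0; i < 256; i++) {
            sumAll += (long) i * histogram[i];
        }
        long sumDark = 0;
        int countDark = 0;
        double bestVariance = -1;
        int bestThreshold = 0;
        for (int t = 0; t < 256; t++) {
            countDark += histogram[t];
            if (countDark == 0) {
                continue; // no pixels at or below t yet
            }
            int countBright = totalPixels - countDark;
            if (countBright == 0) {
                break; // every pixel is at or below t
            }
            sumDark += (long) t * histogram[t];
            double meanDark = (double) sumDark / countDark;
            double meanBright = (double) (sumAll - sumDark) / countBright;
            double diff = meanDark - meanBright;
            double variance = (double) countDark * countBright * diff * diff;
            if (variance > bestVariance) {
                bestVariance = variance;
                bestThreshold = t;
            }
        }
        return bestThreshold;
    }

    /** Pixels brighter than the threshold become white; all others become black. */
    static BufferedImage binarize(BufferedImage src) {
        int width = src.getWidth();
        int height = src.getHeight();
        int[] gray = new int[width * height];
        int[] histogram = new int[256];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int value = luminance(src.getRGB(x, y));
                gray[y * width + x] = value;
                histogram[value]++;
            }
        }
        int threshold = otsuThreshold(histogram, width * height);
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                out.setRGB(x, y, gray[y * width + x] > threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

The result of `Binarizer.binarize(image)` can be passed to the OCR engine in place of the original; it is worth comparing the recognition output with and without this step, since it can also make things worse on some inputs.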