I was wondering if anyone knew of a way that I could feed an image file to a Python, C or Java program and get back the coordinates of where a specific string appears on the image?
What you're talking about is called Optical Character Recognition, or OCR.
OCR isn't easy to implement from scratch, but there are libraries out there. OpenCV can be used for OCR.
I looked into this a bit more. It turns out that Sikuli does exactly what I needed.
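For anyone who wants to see the basic idea without a library, here is a minimal sketch that locates a small reference image (for example, a cropped picture of the string) inside a larger image by exact pixel comparison. It is a naive illustration and not how Sikuli is implemented; `TemplateFinder` and `find` are hypothetical names, and real tools tolerate rendering differences that this code does not.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

/** Naive exact-match search for a small image inside a larger one. */
class TemplateFinder {

    /**
     * Returns the top-left corner of the first exact occurrence of
     * needle inside haystack (scanning row by row), or null if none exists.
     */
    static Point find(BufferedImage haystack, BufferedImage needle) {
        int maxX = haystack.getWidth() - needle.getWidth();
        int maxY = haystack.getHeight() - needle.getHeight();
        for (int y = 0; y <= maxY; y++) {
            for (int x = 0; x <= maxX; x++) {
                if (matchesAt(haystack, needle, x, y)) {
                    return new Point(x, y);
                }
            }
        }
        return null;
    }

    /** Compares every needle pixel with the haystack pixel at the given offset. */
    private static boolean matchesAt(BufferedImage haystack, BufferedImage needle,
                                     int offsetX, int offsetY) {
        for (int y = 0; y < needle.getHeight(); y++) {
            for (int x = 0; x < needle.getWidth(); x++) {
                if (haystack.getRGB(offsetX + x, offsetY + y) != needle.getRGB(x, y)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

This only works when the reference image matches pixel for pixel, so it is best suited to screenshots rendered with the same font and settings.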
Related
How to extract handwritten text from images, like bank form images, in Java?
I tried using Tesseract OCR and GOCR, but they didn't work for me. Are there any other ways to extract handwritten text from images in Java that work at least 80-90% of the time?
Question with links to libraries
I don't think Java supports this natively, so you have to use libraries.
There was a question that asked for working libraries, and the general consensus was: you won't get 80-90% recognition accuracy from a free and open source library.
Anyway, you can try this, as it is a wrapper for Tesseract.
I'm creating an Android app that uses basic image manipulation (open image, copy, crop, paste).
Basically, the program will take multiple input images and manipulate them (copy, crop, paste). Then an output image will be saved. I have done this in Python using the PIL library and would like to do the same in Java.
So, how do I do this in Java? Can someone point me to a good library that has all these features? Thanks.
Hope this makes your work on this easier. I was going through some links and found this:
Image Manipulation in Java
Let me know if you need anything else. Thanks.
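As a rough illustration of the copy, crop, and paste operations asked about, here is a minimal sketch using only `java.awt.image.BufferedImage` from desktop Java. The class name `ImageOps` and its methods are hypothetical, and note that Android does not ship `java.awt`, so on Android the same steps would have to be done with the platform's own bitmap classes.

```java
import java.awt.AlphaComposite;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical helpers for basic copy / crop / paste on BufferedImage. */
class ImageOps {

    /** Returns an independent copy of the given rectangle of src. */
    static BufferedImage crop(BufferedImage src, int x, int y, int w, int h) {
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = out.createGraphics();
        // Src composite copies pixels as-is instead of blending with the empty target.
        g.setComposite(AlphaComposite.Src);
        // getSubimage shares pixel data with src, so draw it into a fresh image.
        g.drawImage(src.getSubimage(x, y, w, h), 0, 0, null);
        g.dispose();
        return out;
    }

    /** Returns an independent copy of the whole image. */
    static BufferedImage copy(BufferedImage src) {
        return crop(src, 0, 0, src.getWidth(), src.getHeight());
    }

    /** Draws patch onto target with its top-left corner at (x, y). */
    static void paste(BufferedImage target, BufferedImage patch, int x, int y) {
        Graphics2D g = target.createGraphics();
        g.drawImage(patch, x, y, null);
        g.dispose();
    }
}
```

The result can then be saved with `ImageIO.write(image, "png", new File("out.png"))`, where the file name is just an example.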
Hi All,
As shown, I have an image with the fields Username, City, and Work. I just want to read the values of these text fields from the image with a Java program.
If anyone has any idea how to do this, please let me know.
thanks
You can search Google for "Java character recognition from image". There is also a good way to do this shown in this example, and you can use this jar for testing.
Tess4J, a Java wrapper of Tesseract engine, can recognize such images (after rescaling to 300 DPI).
You should start by looking into character recognition libraries like the ones shown here. Also, look at this question here.
To read the image you can use BufferedImage: http://docs.oracle.com/javase/tutorial/2d/images/loadimage.html
Once you have loaded the image you can run an OCR module to get its text. Here are some examples of OCR software sorted from best to worst: ABBYY (but it is not free), Tesseract, Java OCR, Asprise...
And that is all!
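The loading step might look like the following minimal sketch, which uses `ImageIO.read` from the standard library. The wrapper class `ImageLoader` is a hypothetical name; the OCR call itself is left out because it depends on which engine you pick.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

/** Hypothetical helper that loads an image file into a BufferedImage. */
class ImageLoader {

    /** Loads the file, failing loudly if it is not a readable image. */
    static BufferedImage load(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            // ImageIO.read returns null when no registered reader understands the data.
            throw new IOException("Unsupported or corrupt image: " + file);
        }
        return image;
    }
}
```

The returned `BufferedImage` is what most Java OCR wrappers accept as input, or it can be written back to a file in whatever format the engine expects.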
This question may be beyond the scope of a simple answer here at Stack Overflow, but my hope is that it will lead me to be able to formulate several more specific questions to get where I need to be.
I want to write a program that searches a buffered image for text and returns it as a string. I don't want to write an entire OCR program, but would rather use an API that is freely available, such as Tesseract. Unfortunately I've been unable to find a Java API for Tesseract.
I know that the font is Arial and I know its size. I am wondering if that will help.
I've already managed to capture the screen, but I'm not sure how to accomplish the next step of identifying the text found in the image.
the question
How can I implement a simple OCR function in my Java program?
You can use the tesjeract or Tess4J wrapper of the Tesseract API. Be sure to rescale your images to 300 DPI, since the resolution of screenshots (72 or 96 DPI) is generally not adequate for OCR.
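The rescaling step can be sketched with plain `Graphics2D`, as below. This is only one way to do it: the class name `DpiScaler` is hypothetical, the choice of bicubic interpolation is an assumption, and the code changes the pixel dimensions only (it does not write any DPI metadata).

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper that enlarges a screenshot before OCR. */
class DpiScaler {

    /**
     * Scales src by targetDpi / sourceDpi, e.g. 300 / 96 for a typical
     * screenshot, and returns the enlarged image.
     */
    static BufferedImage rescale(BufferedImage src, double sourceDpi, double targetDpi) {
        double factor = targetDpi / sourceDpi;
        int w = (int) Math.round(src.getWidth() * factor);
        int h = (int) Math.round(src.getHeight() * factor);
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        // Bicubic interpolation keeps glyph edges smoother than nearest-neighbour.
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return out;
    }
}
```

For a captured screen this would be called as `DpiScaler.rescale(screenshot, 96, 300)` before handing the image to the OCR wrapper.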
The OCR implementation is complicated, but using an SDK like http://asprise.com/product/ocr/index.php?lang=java is simple.
I am working on Google App Engine to create a tool for comparing image similarity.
I need to extract the pixel values of each image to do this.
Unfortunately App Engine does not support the Java image libraries, so I am unable to proceed.
Is there any App Engine-safe image library in Java capable of extracting image data?
I saw some techniques in Python, but I don't want to switch to Python if I can do it in Java somehow...
GAEJ has its own graphics library with fairly limited features, and java.awt.image.BufferedImage is a restricted class (i.e., java.awt.Image is not supported and is still not present in the JRE Class White List).
There's an open issue here, that you might want to star.
EDIT:
Somebody has patched pngj to work with InputStream. (You could use it to read a PNG pixel by pixel.)
The new version of pngj now includes an alternative pngj-sandbox.jar that only references whitelisted classes, so it should run on Google App Engine.
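Once the pixels have been decoded by whatever means, the comparison itself needs no AWT classes at all. The following is a minimal sketch that assumes each image has already been read into an `int[]` of packed `0xRRGGBB` values of equal length; the class `PixelSimilarity` and the choice of mean absolute difference as the metric are illustrative assumptions, and no pngj calls are shown.

```java
/** Hypothetical similarity measure over already-decoded pixel arrays. */
class PixelSimilarity {

    /**
     * Mean absolute per-channel difference between two images given as
     * packed RGB ints: 0.0 means identical, 255.0 means maximally different.
     */
    static double meanAbsoluteDifference(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Images must have the same number of pixels");
        }
        if (a.length == 0) {
            return 0.0;
        }
        long total = 0;
        for (int i = 0; i < a.length; i++) {
            total += Math.abs(red(a[i]) - red(b[i]));
            total += Math.abs(green(a[i]) - green(b[i]));
            total += Math.abs(blue(a[i]) - blue(b[i]));
        }
        // Three channels per pixel contribute to the total.
        return total / (3.0 * a.length);
    }

    private static int red(int rgb)   { return (rgb >> 16) & 0xFF; }
    private static int green(int rgb) { return (rgb >> 8) & 0xFF; }
    private static int blue(int rgb)  { return rgb & 0xFF; }
}
```

Because it uses only primitive arrays and `java.lang.Math`, this part should not be affected by the class whitelist; only the decoding step is.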
Can https://github.com/witwall/appengine-awt help you? I believe it will be enough to add this lib as a dependency to the project to make BufferedImage work (but I haven't tried this yet).