I am working on a Handwritten Form Recognition System. So far I have been able to detect text using Java with OpenCV, but now I want to read the text from each of these bounding boxes.
I have been researching how to do this using Java with OpenCV, but I was unable to find anything.
Please suggest some links, technologies, methods, or processes to perform this particular task with Java.
This answer is more general than the question, but I will try to stick as closely as possible to the problem statement.
Although there is a lot of ongoing research on recognition of handwritten text, there is no foolproof method that works for all possible inputs.
The sample image you posted here is relatively noisy, with extremely high variance between instances of the same letter. This is exactly where it gets tricky.
I would personally suggest that once you have the bounding boxes around the text (which you already do), you run contour extraction inside each of these bounding boxes to extract single letters (see the sketch below). Once you have them, you need to figure out relevant features that capture most of the variance (or at least a 95% confidence interval) of each particular letter.
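For illustration, here is a minimal Java/OpenCV sketch of that segmentation step. It assumes OpenCV 3.x Java bindings and a BGR page image; the minimum-area filter is an arbitrary placeholder you would tune for your scans:

```java
import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;
import java.util.ArrayList;
import java.util.List;

public class LetterSegmentation {
    static { System.loadLibrary(Core.NATIVE_LIBRARY_NAME); }

    // Extract candidate letter regions inside one detected bounding box.
    static List<Rect> extractLetters(Mat page, Rect box) {
        Mat roi = new Mat(page, box);                  // view into the box
        Mat gray = new Mat(), bin = new Mat();
        Imgproc.cvtColor(roi, gray, Imgproc.COLOR_BGR2GRAY);
        // Otsu threshold; assumes ink is darker than the background
        Imgproc.threshold(gray, bin, 0, 255,
                Imgproc.THRESH_BINARY_INV | Imgproc.THRESH_OTSU);

        List<MatOfPoint> contours = new ArrayList<>();
        Imgproc.findContours(bin, contours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);

        List<Rect> letters = new ArrayList<>();
        for (MatOfPoint c : contours) {
            Rect r = Imgproc.boundingRect(c);
            if (r.area() > 30) {                       // drop specks/noise; tune this
                letters.add(new Rect(box.x + r.x, box.y + r.y, r.width, r.height));
            }
        }
        return letters;
    }
}
```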
With these features, you then train a supervised algorithm, using the letters as training data and their corresponding values (e.g., the actual characters) as labels. Once you have that, give it some held-out data (both the easiest and the most difficult cases) to measure its accuracy.
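As an illustrative sketch of that training step with OpenCV's Java ml module, here is a k-nearest-neighbour classifier over raw resized pixels. The 20x20 cell size and k=3 are arbitrary assumptions, and raw pixels are the crudest possible feature; better features would likely help:

```java
import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;
import org.opencv.ml.KNearest;
import org.opencv.ml.Ml;

public class LetterClassifier {
    static { System.loadLibrary(Core.NATIVE_LIBRARY_NAME); }

    private final KNearest knn = KNearest.create();
    private static final Size CELL = new Size(20, 20);

    // Flatten a single-channel letter image into one float row (the "feature").
    static Mat toFeatureRow(Mat letter) {
        Mat resized = new Mat(), row = new Mat();
        Imgproc.resize(letter, resized, CELL);
        resized.convertTo(row, CvType.CV_32F);
        return row.reshape(1, 1);          // 1 x 400 row vector
    }

    // trainSamples: one feature row per example (CV_32F);
    // labels: a CV_32F column vector of numeric class codes, one per row.
    void train(Mat trainSamples, Mat labels) {
        knn.train(trainSamples, Ml.ROW_SAMPLE, labels);
    }

    int classify(Mat letter) {
        Mat results = new Mat();
        knn.findNearest(toFeatureRow(letter), 3, results);  // k = 3
        return (int) results.get(0, 0)[0];
    }
}
```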
These links can help you get started:
One of the first tools I use to check the accuracy of a feature set before I start coding: Weka
Go through basic tutorials on machine learning and how they work - Personal Favorite
You could try TensorFlow.
Simple Digit Recognition OCR in OpenCV-Python - Great for beginners.
Hope it helps!
Related
I know there are other posts about this, but I cannot seem to find one strictly for handwriting. I am going to have a form, and all I need to read is 8 squares in the left-hand corner that will have 3 letters followed by 5 numbers.
The problem with most posts is that people either post about software for writing on the screen or about software that doesn't recognize handwriting yet. I would prefer something in Java, but something simple in another language would work.
What would really work is if people could scan their documents and just type the job number for the document name, but apparently they can't do that right...
Can you change the form? This problem will get a lot simpler if you can change the form to something that is easier for a machine to read. Recognizing arbitrary handwriting is hard as well as error-prone.
What I have in mind is a form like this:
Form example: http://shareworldonline.com/w3/testprep/images/test%20form.jpg
However, if you have to have handwriting, check out the solutions in this thread.
If I understood you correctly, you are doing offline HWR (handwriting recognition).
When I was doing offline HWR, I found the most difficult part to be separating the characters in a word. It seems like you have them in squares, so all you need to do is find your boxes (e.g., by using a histogram)
and compare the content of each box with every element in your character database (I used Levenshtein distance for that); a sketch follows.
I know it may not be very helpful, but maybe it will push you onto the right track.
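For the comparison step mentioned above, here is the classic dynamic-programming Levenshtein distance in plain Java. It compares strings, so it assumes you have already mapped box contents to candidate strings in your pipeline:

```java
public class Levenshtein {
    // Classic dynamic-programming edit distance between two strings:
    // the minimum number of insertions, deletions, and substitutions.
    public static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1,   // deletion
                                            d[i][j - 1] + 1),  // insertion
                                   d[i - 1][j - 1] + cost);    // substitution
            }
        }
        return d[a.length()][b.length()];
    }
}
```

The closest database entry is then simply the one with the smallest distance to the recognized content.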
How can I take two images and compare them to see how similar they are?
I'm not talking about comparing two exact images using MD5. The two images that I am comparing will be completely different, and will likely be different sizes at times.
Using Pokemon cards as an example:
I'm going to have scanned HD images of each of the cards. I want the user to be able to take a picture of their Pokemon card with their phone and I want to be able to compare it against my scanned images and then determine which card it is that they took a picture of.
The processing does not have to be done directly on the phone; offloading to a web service is an option. Note, however, that my knowledge of programming languages is somewhat limited (pretty much to PHP/Java/Android). The server I'm using is my own Ubuntu server, so I do have access to the exec command from PHP if that would help.
At first I figured someone would have done something like this before (comparing two images). I tried PHP with Imagick, using an example I found that claimed to do what I was attempting (utilizing compareImages()), but it didn't work at all. There doesn't seem to be much (if any) documentation on doing something like this, which is why I'm so stuck. All I'm looking for is a push in the right direction.
My second thought was to use OCR to pull just the title of the card, which I would compare against a database of titles, displaying the images tied to that title. So far I've tried phpocr first, which didn't work at all, as it requires monochrome images to my understanding. Next I tried Tesseract directly from the console on my server, and while it did far better than phpocr, more than 80% of the characters were still wrong on a scanned image, so a lower-quality image coming from a smartphone would really struggle.
I also tried OpenCV for Android but couldn't get any of the samples working.
Has anyone done anything like this, or at least used something that can accomplish what I'm looking for?
There are two distinct tasks here: identifying the area of interest (which can be done with Haar cascades, the same technique used for face detection) and recognizing the identified image, which can be done with invariant-moment techniques (like Hu moments; it was good enough to count Soviet tanks on satellite images, so it should be good enough for Pokémon). A nice property of invariant moments is the soft degradation of results in the case of low quality: you get a list of probabilities per symbol, e.g. "80% Pikachu, 30% something else".
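For illustration, here is a minimal Java/OpenCV sketch of computing the 7 Hu moments of a binary shape. The log-scaling at the end is a common convention for making the values comparable, not something specific to javaocr; it assumes the OpenCV native library is already loaded:

```java
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;
import org.opencv.imgproc.Moments;

public class HuFeatures {
    // Compute the 7 Hu invariant moments of a binary shape image.
    static double[] huMoments(Mat binaryShape) {
        Moments m = Imgproc.moments(binaryShape, true);  // true: treat as binary
        Mat hu = new Mat();
        Imgproc.HuMoments(m, hu);
        double[] values = new double[7];
        for (int i = 0; i < 7; i++) {
            // Log-scale to tame the huge dynamic range before comparing shapes
            double h = hu.get(i, 0)[0];
            values[i] = -Math.signum(h) * Math.log10(Math.abs(h) + 1e-30);
        }
        return values;
    }
}
```

Two shapes can then be compared by the distance between their 7-element vectors, which is what gives the probability-like ranking described above.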
We are developing an OCR library based on invariant moments for use on Android here:
https://sourceforge.net/projects/javaocr/
(pure Java with reasonable speed, and there are Android samples in the demos subdirectory).
And here is an app based on javaocr; it will recognize a black-on-white phone number and dial it: https://play.google.com/store/apps/details?id=de.pribluda.android.ocrcall&feature=search_result#?t=W251bGwsMSwyLDEsImRlLnByaWJsdWRhLmFuZHJvaWQub2NyY2FsbCJd
You may also consider some aiming help, so the user positions the symbol to be matched properly (that way the first task is handled by real human intellect).
You should decide what kind of similarity comparison you need. There are geometric algorithms: they use edge detection and then try to match the detected edges in both images. They are probably useful when dealing with objects of the same shape but different colours. And there are algorithms based more on colour similarity: they compare which colours are in the image and how they are distributed.
If you are looking for a concrete algorithm, you should probably have a look at the Hough Transform.
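As a quick illustration, here is a minimal Java/OpenCV sketch of the probabilistic Hough transform for line segments. The Canny and Hough parameters are placeholder values you would tune, and it assumes the OpenCV native library is already loaded:

```java
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

public class HoughDemo {
    // Detect line segments with the probabilistic Hough transform.
    static Mat houghLines(Mat gray) {
        Mat edges = new Mat(), lines = new Mat();
        Imgproc.Canny(gray, edges, 50, 150);            // edge map first
        // rho = 1 px, theta = 1 degree, threshold = 80 votes,
        // minLineLength = 30 px, maxLineGap = 10 px
        Imgproc.HoughLinesP(edges, lines, 1, Math.PI / 180, 80, 30, 10);
        // each row of 'lines' is (x1, y1, x2, y2) of one detected segment
        return lines;
    }
}
```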
I would like to know how practical it would be to create a program which takes handwritten characters in some form, analyzes them, and offers corrections to the user. The inspiration for this idea is to have elementary school students in other countries or University students in America learn how to write in languages such as Japanese or Chinese where there are a lot of characters and even the slightest mistake can make a big difference.
I am unsure how the program will analyze the character. My current idea is to get a single pixel width line to represent the stroke, compare how far each pixel is from the corresponding pixel in the example character loaded from a database, and output which area needs the most work. Endpoints will also be useful to know. I would also like to tell the user if their character could be interpreted as another character similar to the one they wanted to write.
I imagine I will need a library of some sort to complete this project in any sort of timely manner, but I have been unable to locate one that meets the standards I need. I looked into OpenCV, but it appears to be meant more for vision than for image processing. I would also prefer the library/module to be in Python or Java, but I can learn a new language if absolutely necessary.
Thank you for any help with this project.
Character recognition is usually implemented using Artificial Neural Networks (ANNs). It is not a straightforward task to implement, seeing that there are usually many ways in which different people write the same character.
The good thing about neural networks is that they can be trained. So, to change from one language to another all you need to change are the weights between the neurons, and leave your network intact. Neural networks are also able to generalize to a certain extent, so they are usually able to cope with minor variances of the same letter.
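As a minimal sketch of what such a network can look like with OpenCV's ml module (the layer sizes, activation, and training parameters here are illustrative assumptions, not recommendations):

```java
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.ml.ANN_MLP;

public class NetworkSetup {
    // Minimal multilayer-perceptron setup:
    // 400 inputs (20x20 pixels), one hidden layer of 64 units, 26 outputs (A-Z).
    static ANN_MLP buildNetwork() {
        ANN_MLP net = ANN_MLP.create();
        Mat layers = new Mat(3, 1, CvType.CV_32S);
        layers.put(0, 0, 400);
        layers.put(1, 0, 64);
        layers.put(2, 0, 26);
        net.setLayerSizes(layers);
        net.setActivationFunction(ANN_MLP.SIGMOID_SYM, 1.0, 1.0);
        net.setTrainMethod(ANN_MLP.BACKPROP, 0.1, 0.1);
        return net;
    }
    // Training: net.train(samples, Ml.ROW_SAMPLE, oneHotLabels);
    // Retargeting another alphabet means retraining weights,
    // not changing the network structure, as described above.
}
```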
Tesseract is an open-source OCR engine that was developed in the mid-1990s. You might want to read about it to gain some pointers.
You can follow company links from this Wikipedia article:
http://en.wikipedia.org/wiki/Intelligent_character_recognition
I would not recommend that you attempt to implement a solution yourself, especially if you want to complete the task in less than a year or two of full-time work. It would be unfortunate if an incomplete solution provided poor guidance for students.
A word of caution: some companies that offer commercial ICR libraries may not wish to support you and/or may not provide a quote. That's their right. However, if you do not feel comfortable working with a particular vendor, either ask for a different sales contact or try a different vendor first.
My current idea is to get a single pixel width line to represent the stroke, compare how far each pixel is from the corresponding pixel in the example character loaded from a database, and output which area needs the most work.
The initial step of getting a stroke representation only a single pixel wide is much more difficult than you might guess. Although there are simple algorithms (e.g. Stentiford and Zhang-Suen) to perform thinning, stroke crossings and rough edges present serious problems. This is a classic (and unsolved) problem. Thinning works much of the time, but when it fails, it can fail miserably.
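For concreteness, here is a plain-Java sketch of the Zhang-Suen thinning algorithm mentioned above, operating on a boolean grid where true means ink (images are assumed at least 3x3). It illustrates the mechanics, but it inherits all the failure modes just described:

```java
import java.util.ArrayList;
import java.util.List;

public class Thinning {
    // Zhang-Suen thinning: delete boundary pixels in two alternating
    // sub-passes until the image stops changing.
    static void zhangSuen(boolean[][] img) {
        int h = img.length, w = img[0].length;
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int step = 0; step < 2; step++) {
                List<int[]> toClear = new ArrayList<>();
                for (int y = 1; y < h - 1; y++) {
                    for (int x = 1; x < w - 1; x++) {
                        if (!img[y][x]) continue;
                        // neighbors p2..p9, clockwise from north
                        boolean[] p = {
                            img[y-1][x], img[y-1][x+1], img[y][x+1], img[y+1][x+1],
                            img[y+1][x], img[y+1][x-1], img[y][x-1], img[y-1][x-1]
                        };
                        int black = 0, transitions = 0;
                        for (int i = 0; i < 8; i++) {
                            if (p[i]) black++;
                            if (!p[i] && p[(i + 1) % 8]) transitions++;  // 0 -> 1
                        }
                        // step 1: p2*p4*p6 == 0 and p4*p6*p8 == 0
                        // step 2: p2*p4*p8 == 0 and p2*p6*p8 == 0
                        boolean cond = (step == 0)
                            ? !(p[0] && p[2] && p[4]) && !(p[2] && p[4] && p[6])
                            : !(p[0] && p[2] && p[6]) && !(p[0] && p[4] && p[6]);
                        if (black >= 2 && black <= 6 && transitions == 1 && cond) {
                            toClear.add(new int[]{y, x});
                        }
                    }
                }
                for (int[] q : toClear) { img[q[0]][q[1]] = false; changed = true; }
            }
        }
    }
}
```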
You could work with an open source library, and although that will help you learn algorithms and their uses, to develop a good solution you will almost certainly need to dig into the algorithms themselves and understand how they work. That requires quite a bit of study.
Here are some books that are useful as introductory textbooks:
Digital Image Processing by Gonzalez and Woods
Character Recognition Systems by Cheriet, Kharma, Siu, and Suen
Reading in the Brain by Stanislas Dehaene
Gonzalez and Woods is a standard textbook in image processing. Without some background knowledge of image processing it will be difficult for you to make progress.
The book by Cheriet, et al., touches on the state of the art in optical character recognition (OCR) and also covers handwriting recognition. The sooner you read this book, the sooner you can learn about techniques that have already been attempted.
The Dehaene book is a readable presentation of the mental processes involved in human reading, and could inspire development of interesting new algorithms.
Have you seen http://www.skritter.com? They do this in combination with spaced-repetition scheduling.
I guess you want to classify features such as curves in your strokes (http://en.wikipedia.org/wiki/CJK_strokes), then as a next layer identify components, then estimate the most likely character, statistically weighting the candidates all the while. Where there are two likely matches, you will want to show them as likely to be confused. You will also need to create a database of probably 3,000 to 5,000 characters, or up to 10,000 for the ambitious.
See also http://www.tegaki.org/ for an open source program to do this.
I've been researching this off-and-on for a few months.
I'm looking for a library or working example code to detect the frequency in sound card audio input, or detect presence of a given set of frequencies. I'm leaning towards Java, but the real requirement is that it should be something higher-level/simpler than C, and preferably cross-platform. Linux will be the target platform but I want to leave options open for Mac or possibly even Windows. Python would be acceptable too, and if anyone knows of a language that would make this easier/has better pre-written libraries, I'd be willing to consider it.
Essentially, I have a defined set of frequency pairs that will appear in the sound card audio input, and I need to be able to detect such a pair and then... do something, for example record the following audio up to a maximum duration and then perform some action. A run could feature, say, 5-10 pairs, defined at runtime (they can't be compiled in): something like frequency 1 for ~1 second, a maximum delay of ~1 second, then frequency 2 for ~1 second.
I found suggestions to use either an FFT or the Goertzel algorithm, but I was unable to find anything more than the simplest example code, which seemed to give no useful results. I also found some limitations with Java audio, not being able to sample at a high enough rate to get the resolution I need.
Any suggestions for libraries to use or maybe working code? I'll admit that I'm not the most mathematically inclined, so I've been lost in some of the more technical descriptions of how the algorithms actually work.
If you are aiming at detecting frequency pairs then your job is very similar to a DTMF detector.
Try searching for DTMF in places like SourceForge; you'll find detectors in many programming languages. The frequency-pair placement along the spectrum seems to be even more stringent than your specs, so you should be fine adapting a DTMF detector to your input.
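As a starting point, here is a minimal Java sketch of the Goertzel algorithm mentioned in the question, which is what most DTMF detectors use internally. The 16-bit PCM sample format is an assumption:

```java
public class Goertzel {
    // Goertzel algorithm: power of one target frequency in a block of samples.
    // Run one instance per frequency of interest; a tone is "present" when its
    // power stands well above the powers at neighboring reference frequencies.
    static double power(short[] samples, double targetHz, double sampleRate) {
        double omega = 2.0 * Math.PI * targetHz / sampleRate;
        double coeff = 2.0 * Math.cos(omega);
        double s1 = 0, s2 = 0;
        for (short sample : samples) {
            double s0 = sample + coeff * s1 - s2;   // second-order IIR recurrence
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2; // squared magnitude
    }
}
```

Detecting a frequency pair then amounts to checking that both target frequencies exceed a threshold in the same block, which sidesteps the full-FFT resolution issues mentioned in the question.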
Check out SNDPeek; it's a cross-platform C++ application that extracts all kinds of information from live audio: https://github.com/RobQuistNL/sndpeek
Is this even possible? I have one huge image, 80 MB, with a lot of tiny pictures in it. They are tilted and rotated as well. How can I search for an image programmatically? I know how to use Java and C++. How would you go about this?
You might want to look up the Scale Invariant Feature Transform (SIFT) algorithm. Just for example, it's used in a fair number of programs for automatically generating panoramas, to recognize the parts of pictures that match up, despite differences in scaling, tilting, panning, and so on.
Edit: Quite true -- it is patented, and I probably should have mentioned that to start with. In case anybody cares, it's US patent #6,711,293.
One algorithm I've used before is SIFT. If you're interested in implementing the algorithm for yourself, you can see course notes for CPSC 425 at UBC, which describes in gentle detail how to implement SIFT in MATLAB. If you just want code that does this, take a look at VLFeat, a C library that does SIFT and a number of other algorithms.
Quotation from Jerry Coffin:
Edit: Quite true -- it is patented, and I probably should have mentioned that to start with. In case anybody cares, it's US patent #6,711,293.
How much do you know about the image? Exactly what it looks like? Do you have a copy of the image and you just need to figure out where in the large image it is?
Anyway, the branch of CS that deals with these kinds of questions is called Computer Vision.
OpenCV and TINA are two open-source libraries you might be able to use.
You should probably start out with the simplest ideas and see if they are sufficient for your needs. In the field of pattern matching, the simplest idea is template matching; there is an efficient implementation in OpenCV.
Note that template matching is rotation-variant: if the template you are trying to match can be rotated in the image you are searching, it won't work unless you pre-rotate the templates. A sketch follows.
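For reference, here is a minimal Java sketch of OpenCV's template matching, assuming OpenCV 3.x Java bindings and a native library that is already loaded; the 0.8 confidence threshold mentioned in the comment is an arbitrary placeholder:

```java
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Point;
import org.opencv.imgproc.Imgproc;

public class TemplateSearch {
    // Slide 'template' over 'scene' and return the best-match location.
    static Point findTemplate(Mat scene, Mat template) {
        Mat result = new Mat();
        Imgproc.matchTemplate(scene, template, result, Imgproc.TM_CCOEFF_NORMED);
        Core.MinMaxLocResult mm = Core.minMaxLoc(result);
        // For TM_CCOEFF_NORMED the best match is the maximum; check mm.maxVal
        // against a threshold (e.g. 0.8) before trusting the location.
        return mm.maxLoc;
    }
}
```

For the rotated sub-images in the question, you would call this once per pre-rotated copy of each template and keep the highest-scoring hit.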