I have the following picture, from which I want to crop out each letter into a new image.
http://imgur.com/N2JqmFi
The result of each letter should look like this for example:
http://imgur.com/LvjdZh1
I have had trouble achieving this. I have tried thresholding, findContours, and many other things, but I just cannot seem to cut out the letters since the image contains a lot of noise.
Can someone please provide some help and information?
If it did not find any contours in your image, rect_min will contain invalid negative numbers, and submat will crash.
You'll have to add some logic that handles that case, for example:
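A minimal sketch of such a guard, assuming the OpenCV Java API; the class, method, and variable names (source, binary, and so on) are placeholders, not the code being discussed:

import java.util.ArrayList;
import java.util.List;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Rect;
import org.opencv.imgproc.Imgproc;

public class LetterCropper {
    // Crops every detected contour out of 'source'; returns an empty list when
    // findContours finds nothing, instead of building an invalid rectangle.
    static List<Mat> cropLetters(Mat source, Mat binary) {
        List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
        // Note: findContours may modify 'binary' in older OpenCV versions
        Imgproc.findContours(binary, contours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);

        List<Mat> letters = new ArrayList<Mat>();
        if (contours.isEmpty()) {
            return letters;   // nothing was found: the caller must handle this case
        }
        for (MatOfPoint contour : contours) {
            Rect box = Imgproc.boundingRect(contour);  // always valid at this point
            letters.add(new Mat(source, box));         // submat of the source image
        }
        return letters;
    }
}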
I'll start by saying what I'm doing:
I'll take a photo with a webcam; in this photo there will be an object, always the same object, in a square format with letters inside it. I need to identify those letters. The step of identifying those letters is already done; the problem is the quality of the image coming from the webcam: it won't be the best, nor will the positioning be, and the API I'm using to identify those letters requires good positioning and quality.
The reason I have a square is to help identify where those letters are, so I can look for a square in the image and then do what I've already done to identify the letters. My question is: are there more things I have to do in order to achieve this? Or is it only 'reposition the image, look for the square, and then it's done'? If I need to study image processing, that's no problem; I'm here because I don't even know what I have to look for.
I'm developing in Java because of 'school things', so if there's already an API for this (I've heard of and tried OpenCV, but I don't know what to do with it), it would really help me.
Thanks in advance.
Edit 1: As asked by Springfield762, I took some photos and I'll explain them below.
First let me explain what the photos are: the 'square thing' that will contain the letters isn't done yet, another department is taking care of it, so I had to improvise something here with pens and batteries. The letters will all be made of wood in a nice shape; I had to replace them with some Magicka cards as I don't have them yet, but the cards fit the example well. I also made a mock-up of the square (which actually ended up as a rectangle) in Paint, so it is not beautiful at all.
I took 3 photos: one using the light coming from the window, the second using the light of my room, and the third using the flash of the webcam. (Sorry about the links; I can't post images or links yet. Although I'm always here, this is the first time I've posted a question...)
Window light:
Room light:
Flash:
Square (rectangle) example:
You can ignore the mock-up of the square; I made it only so that you can understand the images. The reason I took 3 different photos was just to show the different lighting conditions the webcam might face. Also, the quality of the Magicka cards isn't a problem, since each card represents one letter, so it'll be easy to 'see' them.
Well, I found answers to most of this question myself; I'll explain them below.
First, it's not a square but a rectangle, and it is still to be made. So I started testing the software using anything that was a rectangle. First I had to locate the rectangle in the frame captured by the camera and then show it in the original image seen by the user. I accomplished that with the steps below (a rough code sketch of them follows after the result image):
Capturing the current frame;
Converting that frame to HSV;
Applying some kind of threshold (using the Core.inRange function, so that I could find a specific color in the range specified in the function);
Applying Imgproc.findContours to find the contours of the rectangle;
Finally, drawing a rectangle using the points found by findContours.
How it ended: i.imgur.com/wmNVai0.jpg
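A rough sketch of those five steps with the OpenCV 3/4 Java bindings; the HSV range, the webcam index, and the idea of keeping only the largest contour are placeholders and assumptions, not the exact values I used:

import java.util.ArrayList;
import java.util.List;
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.Rect;
import org.opencv.core.Scalar;
import org.opencv.imgproc.Imgproc;
import org.opencv.videoio.VideoCapture;

public class RectangleFinder {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

        // 1. Capture the current frame
        VideoCapture camera = new VideoCapture(0);
        Mat frame = new Mat();
        camera.read(frame);

        // 2. Convert that frame to HSV
        Mat hsv = new Mat();
        Imgproc.cvtColor(frame, hsv, Imgproc.COLOR_BGR2HSV);

        // 3. Keep only the colour of the rectangle (this range is a placeholder)
        Mat mask = new Mat();
        Core.inRange(hsv, new Scalar(100, 100, 50), new Scalar(130, 255, 255), mask);

        // 4. Find the contours of the mask
        List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
        Imgproc.findContours(mask, contours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);

        // 5. Draw the bounding box of the largest contour on the original frame
        //    (taking the largest is one simple way to ignore small noise blobs)
        Rect best = null;
        for (MatOfPoint c : contours) {
            Rect r = Imgproc.boundingRect(c);
            if (best == null || r.area() > best.area()) {
                best = r;
            }
        }
        if (best != null) {
            Imgproc.rectangle(frame, best.tl(), best.br(), new Scalar(0, 255, 0), 2);
        }
        camera.release();
    }
}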
After that I knew that I could place the rectangle in a way that all the letters inside it would be in a straight line, so I didn't need to care about the positioning of the letters. Now I had to fight with the OCR.
I chose Tesseract, as it is open source and seems to be a strong tool (supported by Google, which certainly counts for something), and then I started to test some images.
In the beginning it was tough and I thought I'd have to train the OCR even more, but the thing is that it has a dictionary and tries to match words listed in that dictionary, and I didn't need that, as I was looking for characters that could appear in a completely random order. I had to turn off that dictionary by adding the following lines to a config file:
load_system_dawg F
load_freq_dawg F
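For anyone calling Tesseract from Java through the Tess4J wrapper (an assumption on my part; I used a config file instead), the same variables can be set in code, roughly like this:

import java.io.File;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

public class DictOff {
    public static void main(String[] args) throws TesseractException {
        Tesseract ocr = new Tesseract();
        ocr.setDatapath("/usr/share/tesseract-ocr/tessdata"); // adjust to your install
        // Disable the word dictionaries, same effect as the config lines above
        ocr.setTessVariable("load_system_dawg", "F");
        ocr.setTessVariable("load_freq_dawg", "F");
        System.out.println(ocr.doOCR(new File("letters.png")));
    }
}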
After that I had to change some things in the image as well (a rough sketch of these two steps follows the list):
Transform into Grayscale;
Resize it by ~80%;
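A minimal sketch of those two steps with the OpenCV Java bindings; the file names and the exact scale factor are placeholders:

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.Size;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

public class PrepareForOcr {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
        Mat src = Imgcodecs.imread("letters.jpg");

        // 1. Transform into grayscale
        Mat gray = new Mat();
        Imgproc.cvtColor(src, gray, Imgproc.COLOR_BGR2GRAY);

        // 2. Resize (the factor is a placeholder for the "~80%" mentioned above)
        double scale = 1.8;
        Mat resized = new Mat();
        Imgproc.resize(gray, resized, new Size(), scale, scale, Imgproc.INTER_CUBIC);

        Imgcodecs.imwrite("letters_for_ocr.jpg", resized);
    }
}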
Original images (I can't post links...):
i.imgur.com/DFqNSYB.jpg
i.imgur.com/2Ntfqy3.jpg
Grayscale:
imgur.com/XUZ9b1Z.jpg
i.imgur.com/yjXMH5Q.jpg
Resized:
i.imgur.com/zgX9bKF.jpg
i.imgur.com/CWPRU3I.jpg
(Sometimes I had problems with the resized images and at other times I didn't; that's something I still have to test more.)
Then I could get some good results, though I'm still concerned, as the ambient light makes a big difference. I still have to test it, and mainly I still need the actual base; I'll post it as an edit later.
If I did anything wrong or if anyone wants to correct me, please feel free to say it!
I've been working on a simple OCR project for a couple of days. The app is supposed to extract text from an image. The solution I've come up with is:
grayscaling, rotating, removing noise from the image, and isolating every single character in the image. Now I need some help with a simple algorithm that would let me recognise the characters. I only need to recognise the letters A, B, C, and D.
The method I'd use would be to make a mask of each letter, scale it to the rectangle you found when cutting out the character, and 'score' each letter, for example by comparing pixel values.
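A rough sketch of that idea, assuming both the candidate and the letter templates are single-channel binary images; all names here are made up for the example:

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

public class LetterScorer {

    // Score = fraction of pixels that agree between the candidate and the template.
    // Both images are expected to be single-channel binary Mats (0 or 255).
    static double score(Mat candidate, Mat template) {
        Mat scaled = new Mat();
        Imgproc.resize(template, scaled, candidate.size());

        Mat agreement = new Mat();
        // Pixels that are identical in both images become 0 after absdiff
        Core.absdiff(candidate, scaled, agreement);
        double differing = Core.countNonZero(agreement);
        double total = candidate.rows() * candidate.cols();
        return 1.0 - differing / total;   // 1.0 = perfect match
    }

    // Pick whichever of the A/B/C/D templates scores highest
    static char recognise(Mat candidate, Mat[] templates, char[] labels) {
        int bestIndex = 0;
        double bestScore = -1;
        for (int i = 0; i < templates.length; i++) {
            double s = score(candidate, templates[i]);
            if (s > bestScore) {
                bestScore = s;
                bestIndex = i;
            }
        }
        return labels[bestIndex];
    }
}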
I need to detect shapes and their colours on a taken image. These shapes are: a heart, a rectangle, a star and a circle. Each shape has one of 5 predefined colours. Colour recognition works fine, but shape recognition is a real problem.
After hours and hours of googling, trying and tweaking code, the best I have come up with is the following procedure:
First, read the image and convert it to grayscale.
Then, apply blur to reduce the noise from the background.
Median blur seems to work best here. A normal (box) blur has little effect, and Gaussian blur rounds the edges of the shapes, which causes trouble.
Next, apply a threshold.
Adaptive thresholding doesn't give the results I expected; the results vary widely from image to image. I now apply two thresholds: one uses Otsu's algorithm to pick up the light shapes, and the other uses an inverted binary threshold for the darker shapes.
Finally, find contours on the two thresholded images and merge them into one list.
Based on the number of points found in each contour, I decide which shape has been found. (A rough sketch of this pipeline is shown below.)
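For reference, a rough sketch of this pipeline in OpenCV Java; the blur kernel, threshold value, and vertex-count cut-offs are placeholders, and approxPolyDP is just one common way to reduce a contour to a comparable number of points:

import java.util.ArrayList;
import java.util.List;
import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfPoint;
import org.opencv.core.MatOfPoint2f;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

public class ShapeFinder {
    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
        Mat gray = Imgcodecs.imread("shapes.jpg", Imgcodecs.IMREAD_GRAYSCALE);

        // Blur to reduce background noise (kernel size is a placeholder)
        Mat blurred = new Mat();
        Imgproc.medianBlur(gray, blurred, 7);

        // Two thresholds: Otsu for the light shapes, inverted binary for the dark ones
        Mat light = new Mat();
        Imgproc.threshold(blurred, light, 0, 255,
                Imgproc.THRESH_BINARY + Imgproc.THRESH_OTSU);
        Mat dark = new Mat();
        Imgproc.threshold(blurred, dark, 90, 255, Imgproc.THRESH_BINARY_INV);

        // Find contours on both and merge them into one list
        List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
        Imgproc.findContours(light, contours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
        List<MatOfPoint> darkContours = new ArrayList<MatOfPoint>();
        Imgproc.findContours(dark, darkContours, new Mat(),
                Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
        contours.addAll(darkContours);

        // Classify each contour by the number of vertices of its polygon approximation
        for (MatOfPoint contour : contours) {
            MatOfPoint2f curve = new MatOfPoint2f(contour.toArray());
            MatOfPoint2f approx = new MatOfPoint2f();
            double epsilon = 0.02 * Imgproc.arcLength(curve, true);
            Imgproc.approxPolyDP(curve, approx, epsilon, true);
            long vertices = approx.total();
            if (vertices == 4) {
                System.out.println("rectangle?");
            } else if (vertices == 10) {
                System.out.println("star?");      // a 5-pointed star has 10 corners
            } else if (vertices > 10) {
                System.out.println("circle or heart?");
            }
        }
    }
}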
I have tried adding Canny edge detection, sharpening the image, different thresholds, convex hulls, Hough transforms, etc. I have probably tried every possible parameter value for each method as well. For now, I can get things working with the above procedure on a few images, but then it fails again on a new image: either it detects too many points in a contour, or it doesn't detect the shape at all.
I am really stuck and don't know what else to try. One thing I still have to work out is using masks, but I can't find much information on that and don't know if it would make any difference at all.
Any advice is more than welcome. If you would like to see my code, please ask. You can find sample images here: http://tinypic.com/a/34tbr/1
I have 5 default images in the program, and I will allow the user to choose an image from the desktop. The program will determine which of the 5 images is the closest to the user's image.
Can anyone give me a starting point for this idea?
You can try to use a feature extraction algorithm like SIFT, SURF, etc., then compare the extracted features with your database. You can select the best matching image based on the number of correct matches.
Generally SIFT works fine for 2D objects, like a picture of a label or an advertising board. Rotation in the 2D plane or a change of scale won't matter if you are using SIFT. SURF is supposed to be an improvement on SIFT, but I do not have much experience with it.
These algorithms are said to be a bit heavy. Anyway, if you are matching just 5 images it won't be much of a problem. (Or you can simply calculate the descriptors (features) of your images beforehand and store them; then at run time all you have to do is get the descriptor of the user image and compare it.) Still, if you are trying to match images of basic shapes like squares and circles, using square detection or circle detection might be more efficient performance-wise. A rough sketch of the descriptor-matching approach follows below.
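A rough sketch with the OpenCV Java bindings, using ORB instead of SIFT/SURF (an assumption on my part, since SIFT/SURF live in the nonfree/contrib modules in many OpenCV builds); the file names and the "good match" distance threshold are placeholders:

import org.opencv.core.Core;
import org.opencv.core.DMatch;
import org.opencv.core.Mat;
import org.opencv.core.MatOfDMatch;
import org.opencv.core.MatOfKeyPoint;
import org.opencv.features2d.DescriptorMatcher;
import org.opencv.features2d.ORB;
import org.opencv.imgcodecs.Imgcodecs;

public class ImageMatcher {

    // Extract ORB descriptors for one image (assumes the image is readable
    // and produces at least some keypoints)
    static Mat describe(ORB orb, Mat image) {
        MatOfKeyPoint keypoints = new MatOfKeyPoint();
        Mat descriptors = new Mat();
        orb.detectAndCompute(image, new Mat(), keypoints, descriptors);
        return descriptors;
    }

    public static void main(String[] args) {
        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
        ORB orb = ORB.create();
        DescriptorMatcher matcher =
                DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);

        Mat userDesc = describe(orb,
                Imgcodecs.imread("user.jpg", Imgcodecs.IMREAD_GRAYSCALE));

        String[] candidates = {"img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg", "img5.jpg"};
        String best = null;
        int bestGood = -1;
        for (String name : candidates) {
            Mat desc = describe(orb,
                    Imgcodecs.imread(name, Imgcodecs.IMREAD_GRAYSCALE));
            MatOfDMatch matches = new MatOfDMatch();
            matcher.match(userDesc, desc, matches);

            // Count only "good" matches, i.e. those with a small Hamming distance
            int good = 0;
            for (DMatch m : matches.toArray()) {
                if (m.distance < 40) {   // placeholder threshold
                    good++;
                }
            }
            if (good > bestGood) {
                bestGood = good;
                best = name;
            }
        }
        System.out.println("Closest image: " + best);
    }
}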
I don't have much experience doing image analysis so I thought I'd ask more enlightened individuals :)
Basically what I want to do is analyse an image and work out what the most common colours are (these will be averages obviously).
Does anybody know of an effective way to do this? If at all possible I'd like to avoid using any third party libraries, but everything will be considered at least.
Like I said, I don't have much experience with image analysis so please be patient with me if I don't understand your answers properly!
I've tried Google but there doesn't seem to be anything resembling what I'm after. Maybe my Google-Fu just isn't good enough today.
I'd really appreciate any pointers you guys could give!
Thanks,
Tom
A rough idea of how you might be able to do this:
You could use java.awt.image.PixelGrabber to grab a 2D array of RGB ints from the image, pixel by pixel.
When you have this array populated, you can go through and sort however you want (sounds like it would be memory-intensive), and perform simple functions to order them, count them, etc.
Then you could look at java.awt.Color and, using the Color(int, int, int) constructor, create boxes with those colors (as visual placeholders), with the number of occurrences appearing below each one.
To get the hex values for the color, you can use a String like so:
String rgb = Integer.toHexString(color.getRGB());
rgb = rgb.substring(2, rgb.length());
(The substring is necessary to drop the leading alpha component; otherwise you'd get 8 characters.)
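Putting those pieces together, a rough sketch using only the standard library; the file name and the exact counting strategy are my own choices, and it assumes fully opaque pixels, as in the snippet above:

import java.awt.image.BufferedImage;
import java.awt.image.PixelGrabber;
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import javax.imageio.ImageIO;

public class CommonColors {
    public static void main(String[] args) throws Exception {
        BufferedImage image = ImageIO.read(new File("photo.jpg"));

        // Grab every pixel as a packed ARGB int
        int w = image.getWidth(), h = image.getHeight();
        int[] pixels = new int[w * h];
        PixelGrabber grabber = new PixelGrabber(image, 0, 0, w, h, pixels, 0, w);
        grabber.grabPixels();

        // Count occurrences of each colour
        Map<Integer, Integer> counts = new HashMap<Integer, Integer>();
        for (int pixel : pixels) {
            Integer seen = counts.get(pixel);
            counts.put(pixel, seen == null ? 1 : seen + 1);
        }

        // Find the most frequent colour and print it as a hex string
        int bestColor = 0, bestCount = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() > bestCount) {
                bestCount = e.getValue();
                bestColor = e.getKey();
            }
        }
        String hex = Integer.toHexString(bestColor).substring(2); // drop the alpha "ff"
        System.out.println("#" + hex + " occurs " + bestCount + " times");
    }
}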
Hopefully this gets you on the right track!
Resources: Color Class, Image Class
Consider a "color cube" with RGB instead of XYZ. Split the cube into subcubes, but make them all overlap. Ensure they are all the same size. An easy to remember/calculate cube-pattern would be one that goes from 0-127, 64-191, 128-255 in all directions, making for a total of 27 cubes. For each cube, find what colors in the image fall into them.
As you make the cubes smaller and smaller, the results will change less and less and begin to converge on the most popular color ranges. Once you have that convergence, take the average of the range to get the "actual color".
That's about as much detail as I can go into with my boss hovering around the cubefarm :-)
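For what it's worth, a rough sketch of one level of that idea in Java; it only does the counting pass over the 27 overlapping sub-cubes, assumes packed-RGB int pixels (as from PixelGrabber), and leaves out shrinking the cubes and iterating:

public class ColorCubes {

    // The three overlapping ranges per channel: 0-127, 64-191, 128-255
    static final int[] STARTS = {0, 64, 128};
    static final int SIZE = 128;

    // counts[r][g][b] = how many pixels fall into that sub-cube
    // (a pixel can land in up to 8 cubes because the ranges overlap)
    static int[][][] countCubes(int[] pixels) {
        int[][][] counts = new int[3][3][3];
        for (int pixel : pixels) {
            int r = (pixel >> 16) & 0xff;
            int g = (pixel >> 8) & 0xff;
            int b = pixel & 0xff;
            for (int ri = 0; ri < 3; ri++) {
                if (r < STARTS[ri] || r >= STARTS[ri] + SIZE) continue;
                for (int gi = 0; gi < 3; gi++) {
                    if (g < STARTS[gi] || g >= STARTS[gi] + SIZE) continue;
                    for (int bi = 0; bi < 3; bi++) {
                        if (b < STARTS[bi] || b >= STARTS[bi] + SIZE) continue;
                        counts[ri][gi][bi]++;
                    }
                }
            }
        }
        return counts;
    }
}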
I know you're trying to avoid third-party libraries, but do take a look at OpenCV. It has some good stuff with regard to image manipulation and analysis. Maybe that can work for you.