I am trying to solve a template matching problem: given a pattern image, I have to find whether it exists as a sub-image inside a larger source (search) image.
I have implemented SAD (sum of absolute differences) in Java, following the solution given on Wikipedia.
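For reference, the SAD baseline mentioned above can be sketched roughly as follows. This is a minimal brute-force version that assumes both images fit in memory as `BufferedImage`s and compares plain channel-average grey levels; the class and method names (`SadMatcher`, `bestMatch`) are made up for illustration and are not taken from the Wikipedia code.

```java
import java.awt.image.BufferedImage;

/** Brute-force template matching with the sum of absolute differences (SAD). */
final class SadMatcher {

    /** Grey level (0..255) of one packed RGB pixel, using a plain channel average. */
    static int gray(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (r + g + b) / 3;
    }

    /** SAD between the template and the source window whose top-left corner is (x, y). */
    static long sadAt(BufferedImage source, BufferedImage template, int x, int y) {
        long sum = 0;
        for (int ty = 0; ty < template.getHeight(); ty++) {
            for (int tx = 0; tx < template.getWidth(); tx++) {
                int s = gray(source.getRGB(x + tx, y + ty));
                int t = gray(template.getRGB(tx, ty));
                sum += Math.abs(s - t);
            }
        }
        return sum;
    }

    /** Returns {x, y, sad} for the best window; sad is 0 for an exact match. */
    static long[] bestMatch(BufferedImage source, BufferedImage template) {
        long[] best = {-1, -1, Long.MAX_VALUE};
        for (int y = 0; y + template.getHeight() <= source.getHeight(); y++) {
            for (int x = 0; x + template.getWidth() <= source.getWidth(); x++) {
                long sad = sadAt(source, template, x, y);
                if (sad < best[2]) {
                    best = new long[] {x, y, sad};
                }
            }
        }
        return best;
    }
}
```

This only handles a template at the same scale as the source, which is exactly the limitation the rest of the question is about.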
Now, I want to implement a scale-invariant template matching solution for the given problem.
I don't care at which scale the pattern image is given: if there is a match at any scale, it should be reported.
Things I have tried so far:
1. I have looked at SIFT, but it is too complicated to implement.
2. Image fingerprinting, as used by TinEye, is not that effective here, because it is useful for finding similar images, not sub-images.
3. I cannot use the exhaustive approach I used earlier, as there could be many scales at which there is a match.
4. Custom approach: classify the image into several color groups (e.g. 255 colors), but I am not sure that would help with the scale-invariance problem.
I saw some related questions, but they deal with both rotation and scale invariance; I only need scale invariance.
This is a semester project that I am currently working on.
Can someone guide me about possible solutions and approaches that I can take?
Related
My aim is to let the user take a selfie from my app and then apply different image processing algorithms to convert it into a cartoon-style image. I followed the algorithm written here, and also used the method written just below the chosen answer to convert the black-and-white sketch to a colored image that should look like a cartoon. Everything is OK except that after applying Gaussian blur, the image becomes too hazy and unclear. Here is the image output:
Any advice on how I can make it clearer, like the one shown in this link? I know they used Photoshop, but I want to achieve it with Java on Android.
PS: I found all the image processing methods here. Also, for the method mentioned here (the one below the chosen answer), what would be the ideal values for the arguments?
cartoonifying image
If you have a basic knowledge of C++, you can port this app for your needs.
This application works in real time. If you don't need real-time processing, you can use a bilateral filter instead of the two medianBlur calls at the bottom of the function. That will give better results, but worse performance. You will need to use OpenCV in your application, though. If you want to build an application with more features, I would suggest doing that.
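To show what a median blur does to an image (it removes isolated noise while keeping hard edges sharp, which is why it blurs less "hazily" than a Gaussian), here is a small plain-Java sketch. It is not OpenCV's `medianBlur` and makes no attempt to match its border handling or speed; the class name `MedianBlur` and the fixed 3x3 window are assumptions made for illustration.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;

/** Plain-Java 3x3 median filter, shown only to illustrate the idea. */
final class MedianBlur {

    /** Applies a 3x3 median per colour channel; border pixels are copied unchanged. */
    static BufferedImage median3x3(BufferedImage in) {
        int w = in.getWidth();
        int h = in.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        int[] r = new int[9];
        int[] g = new int[9];
        int[] b = new int[9];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (x == 0 || y == 0 || x == w - 1 || y == h - 1) {
                    out.setRGB(x, y, in.getRGB(x, y));
                    continue;
                }
                // Collect the 3x3 neighbourhood of (x, y), one array per channel.
                int k = 0;
                for (int dy = -1; dy <= 1; dy++) {
                    for (int dx = -1; dx <= 1; dx++) {
                        int rgb = in.getRGB(x + dx, y + dy);
                        r[k] = (rgb >> 16) & 0xFF;
                        g[k] = (rgb >> 8) & 0xFF;
                        b[k] = rgb & 0xFF;
                        k++;
                    }
                }
                Arrays.sort(r);
                Arrays.sort(g);
                Arrays.sort(b);
                // Index 4 is the middle of nine sorted values, i.e. the median.
                out.setRGB(x, y, (r[4] << 16) | (g[4] << 8) | b[4]);
            }
        }
        return out;
    }
}
```

In a real Android app you would most likely call the OpenCV functions the answer names rather than roll your own filter.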
I'll start saying what I'm doing:
I'll take a photo with a webcam. In this photo there will be an object, always the same one, in a square format with letters inside it. I need to identify those letters. The letter identification step is already done; the problem is the quality of the image coming from the webcam: neither the quality nor the positioning will be the best, and the API I'm using to identify the letters requires both.
The reason I have a square is to help locate the letters, so I can 'look for a square' in the image and then do what I've already done to identify the letters. My question is: are there more things I have to do to achieve this? Or is it only 'reposition the image, look for the square, and it's done'? If I need to study image processing, that's no problem; I'm here because I don't even know what I have to look for.
I'm developing in Java because of 'school things', so if there's already an API for this (I've heard of and tried OpenCV, but I don't know what to do with it), it would really help me.
Thanks in advance.
Edit 1: As asked by Springfield762, I took some photos and I'll explain them below.
First let me explain what the photos are: the 'square thing' that will contain the letters isn't done yet (another department is taking care of it), so I had to improvise something here with pens and batteries. The letters will all be made of wood in a nice shape; I had to replace them with some Magicka cards as I don't have them yet, but the cards work well as an example. I also made an example of the square (which actually ended up as a rectangle) in Paint, so there is absolutely nothing beautiful about it.
I took 3 photos: one using the light coming from the window, the second using the light of my room and the third using the flash of the webcam. (Sorry about the links; I can't post images or links. Although I'm always here, this is the first time I've posted a question...)
Window light:
Room light:
Flash:
Square (rectangle) example:
You can ignore the 'project' of the square; I made it so that you can understand the images. The reason I took 3 different photos was just to show the different conditions the webcam might be in. Also, the quality of the Magicka cards isn't a problem, since each card represents one letter, so it'll be easy to 'see' them.
Well, I found most answers to this question, I'll explain them below.
First, it's not a square but a rectangle, and it is still to be made. So I started testing the software using anything that was a rectangle. First I had to 'locate' the rectangle in the frame captured by the camera and then show it in the original image seen by the user. I accomplished that by:
Capturing the current frame;
Converting that frame to HSV;
Applying a threshold (using the Core.inRange function, so that I could find a specific color within the range passed to the function);
Applying Imgproc.findContours to find the contours of the rectangle;
Finally, drawing a rectangle using the points found by findContours.
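The steps above rely on OpenCV. Just to make the idea concrete without that dependency, here is a rough plain-Java sketch of "threshold on a colour range, then take the bounding rectangle". Note that this is a swapped-in simplification: it uses `java.awt.Color.RGBtoHSB` instead of OpenCV's HSV conversion and a bounding box of all matching pixels instead of `Imgproc.findContours`, so it only works when the target colour appears in a single region. The class name `ColorBoxFinder` and the threshold parameters are made up for illustration.

```java
import java.awt.Color;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

/** Finds the bounding box of all pixels inside a hue/saturation/brightness range. */
final class ColorBoxFinder {

    /**
     * Hue, saturation and brightness are all in 0..1 (java.awt.Color convention).
     * Returns null when no pixel falls inside the range.
     */
    static Rectangle find(BufferedImage img, float hMin, float hMax, float sMin, float vMin) {
        int minX = Integer.MAX_VALUE;
        int minY = Integer.MAX_VALUE;
        int maxX = -1;
        int maxY = -1;
        float[] hsb = new float[3];
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                Color.RGBtoHSB((rgb >> 16) & 0xFF, (rgb >> 8) & 0xFF, rgb & 0xFF, hsb);
                boolean inRange = hsb[0] >= hMin && hsb[0] <= hMax
                        && hsb[1] >= sMin && hsb[2] >= vMin;
                if (inRange) {
                    minX = Math.min(minX, x);
                    minY = Math.min(minY, y);
                    maxX = Math.max(maxX, x);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        if (maxX < 0) {
            return null; // nothing matched the colour range
        }
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```

With a contour-based approach you can additionally reject small noisy blobs and keep only the largest region, which this sketch cannot do.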
How it ended: i.imgur.com/wmNVai0.jpg
After that I knew that I could place the rectangle in a way that all the letters inside it would be in a straight line, so I didn't need to care about the positioning of the letters. Now I had to fight with the OCR.
I chose Tesseract as it is open source and seems to be a strong tool (it is supported by Google, which surely counts for something), then I started to test some images.
In the beginning it was tough and I thought I'd have to train the OCR even more, but the thing is that it has a dictionary and tries to find words listed in it, and I didn't need that, as I was looking for characters that could appear in a completely random order. I had to turn off that dictionary by adding the following lines to a conf file:
load_system_dawg F
load_freq_dawg F
After that I had to change some things in the image as well:
Transform into Grayscale;
Resize it by ~80%;
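The two preprocessing steps above could look roughly like this in plain Java. I am assuming here that "resize by ~80%" means scaling both sides to 0.8 of their original length (the post does not say which way), so the factor is left as a parameter; the class name `OcrPreprocess` is made up for illustration.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Greyscale conversion plus scaling, as a preprocessing step before OCR. */
final class OcrPreprocess {

    /** Converts to 8-bit greyscale and scales both dimensions by the given factor. */
    static BufferedImage grayAndScale(BufferedImage src, double factor) {
        int w = Math.max(1, (int) Math.round(src.getWidth() * factor));
        int h = Math.max(1, (int) Math.round(src.getHeight() * factor));
        // Drawing into a TYPE_BYTE_GRAY image performs the colour-to-grey conversion.
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

For example, `OcrPreprocess.grayAndScale(image, 0.8)` turns a 100x50 colour image into an 80x40 greyscale one that can then be handed to the OCR.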
Original images (I can't post links...):
i.imgur.com/DFqNSYB.jpg
i.imgur.com/2Ntfqy3.jpg
Grayscale:
imgur.com/XUZ9b1Z.jpg
i.imgur.com/yjXMH5Q.jpg
Resized:
i.imgur.com/zgX9bKF.jpg
i.imgur.com/CWPRU3I.jpg
(Sometimes I had problems with resized images and at other times I didn't; that's something I have to test more.)
Then I could get some good results, though I'm still worried, as the light of the environment makes a big difference. I still have to test that, and mainly I still need the god da** base. I'll post it as an edit later.
If I did anything wrong or if anyone wants to correct me, please feel free to say it!
I built a simple OCR to detect text. I now need to identify and segment the text from the source image. I used the Canny edge detector to get something like this.
http://i.imgur.com/at4YTb2.png
Sorry, I don't have enough reputation to post images.
I can't figure out a way to separate the text part. I have read:
http://www.math.tau.ac.il/~turkel/imagepapers/text_detection.pdf, which uses the Stroke Width Transform
Robust Text Detection in Natural Images with Edge-Enhanced Maximally Stable Extremal Regions
The algorithms described are probably the best answers but are very difficult to implement in Java. Moreover, in my use cases the text would be prominent much like above. The above algorithms seem like overkill. I would be grateful if anyone can suggest any alternative to solve this problem. Thanks!
I have one method to combine 3 greyscale images into one colour image. It uses getRed(), getGreen() and getBlue() in Java on each individual input image and then applies the colour to the output image, which works quite well. However, I'm looking for other methods of doing this.
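For readers who have not seen it, the channel-per-image method described above might look something like this. It is only a guess at the approach, since the original code is not shown: each greyscale input supplies one channel of the output, and the grey level is read from the low byte of `getRGB()` rather than through `Color.getRed()` and friends. The class name `ChannelMerge` is made up for illustration.

```java
import java.awt.image.BufferedImage;

/** Builds one colour image from three greyscale images, one per RGB channel. */
final class ChannelMerge {

    /** The grey level of each input becomes the red, green or blue value of the output. */
    static BufferedImage merge(BufferedImage redSrc, BufferedImage greenSrc, BufferedImage blueSrc) {
        int w = redSrc.getWidth();
        int h = redSrc.getHeight();
        if (greenSrc.getWidth() != w || greenSrc.getHeight() != h
                || blueSrc.getWidth() != w || blueSrc.getHeight() != h) {
            throw new IllegalArgumentException("all three images must have the same size");
        }
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                // In a grey pixel all channels are equal, so the low byte is the grey level.
                int r = redSrc.getRGB(x, y) & 0xFF;
                int g = greenSrc.getRGB(x, y) & 0xFF;
                int b = blueSrc.getRGB(x, y) & 0xFF;
                out.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return out;
    }
}
```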
It doesn't have to be accurate in terms of the sea being blue, etc., but it needs to be coloured in a way that different areas of the 'map' can be differentiated.
I've been looking into ways of doing this but unfortunately haven't managed to find an alternative; I'm looking to use something other than the getRGB() values.
I'm not looking for anyone to code this for me, just to give me some pointers on what to look for.
Thanks!
Your comment here is critical: "but it needs to be coloured in a way that different areas of the 'map' can be differentiated. I've been looking into ways of doing this but unfortunately haven't managed to find an alternative"
What did you do with the other 4 images, or "channels"? Usually, when doing color space mapping, one has 3 channels and converts to another 3-channel color space. In your case you have 7 channels, and you want to put all that information into 3 channels? It all depends on what you are viewing. Hyperspectral imagery would be a good place to start looking for containers that store image data with more than 3 channels.
You can convert to a different color space, as others have suggested, or perform any other transformation. It sounds, though, as if you will need some post-processing to differentiate the different parts of the image. What that involves will depend on your transformation.
I am in the final year of my MCA and my topic is image matching using edge detection.
I have an image at hand where the subject is not smiling, and two other images:
A bigger image containing the image at hand as a part of it
The same as the image at hand, with some modification (like smiling)
Now I want to check for presence in the first case and for a match in the second case.
My approach:
I will find the edges in all the given images, to reduce the amount of data to check.
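As one possible way of doing the edge step, here is a minimal Sobel sketch in plain Java. The choice of Sobel is mine (the question does not name an edge detector), and the class name `SobelEdges` and the |gx| + |gy| magnitude approximation are assumptions made for illustration.

```java
import java.awt.image.BufferedImage;

/** Sobel gradient magnitude on the grey levels of an image. */
final class SobelEdges {

    /** Returns |gx| + |gy| per pixel as mag[y][x]; the one-pixel border is left at 0. */
    static int[][] magnitude(BufferedImage img) {
        int w = img.getWidth();
        int h = img.getHeight();
        int[][] gray = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = img.getRGB(x, y);
                gray[y][x] = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
            }
        }
        int[][] mag = new int[h][w];
        for (int y = 1; y < h - 1; y++) {
            for (int x = 1; x < w - 1; x++) {
                // Horizontal gradient: right column minus left column, centre row weighted 2.
                int gx = -gray[y - 1][x - 1] + gray[y - 1][x + 1]
                        - 2 * gray[y][x - 1] + 2 * gray[y][x + 1]
                        - gray[y + 1][x - 1] + gray[y + 1][x + 1];
                // Vertical gradient: bottom row minus top row, centre column weighted 2.
                int gy = -gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1]
                        + gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1];
                mag[y][x] = Math.abs(gx) + Math.abs(gy);
            }
        }
        return mag;
    }
}
```

Thresholding the returned magnitudes gives a binary edge map, which could then be compared between the images instead of the raw pixels.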
I'm stuck on how to proceed. Any suggestions are extremely appreciated.
I have been a Java user for years and I can do virtually anything with it, but since I discovered Mathematica about 2 years ago, I have really come to love it. This is the kind of problem I would use Mathematica to solve.
Just take a look at image processing reference.
Example of ImageCorrelate function:
CVOnline is a great source of computer vision algorithms. The section on edge detection can probably point you in the right direction.