I have a screenshot taken from java.awt.Robot as a java.awt.image.BufferedImage and know there will be a unique 10x10 solid red (same RGB) coloured square somewhere in that screenshot (more likely closer to the middle).
What's an efficient approach to finding its coordinates? Is JavaCV even the right library to use? I found a brute-force approach in .NET here: Bitmap Detection, but I'm wondering if there's a better way.
The first question is what it takes to recognize the color: is it an exact RGB value, so the color either matches exactly or it doesn't? And will that still be the case if the image is lossy-compressed, for example as a JPEG?
Assuming you can do that, you probably want a search that minimizes time spent in areas that won't be fruitful: test the value at each corner and at the midpoint of an imaginary line between two opposite corners; if there's no match, try the midpoints between that midpoint and the corners; if there's still no match, divide the space in half vertically or horizontally and repeat. Once you find a pixel of the right color, walk the pixels in each direction to determine whether it's really 10x10.
Really, any sort of search pattern will work; what you probably don't want is to start at (0,0) and go (0,1), (0,2), ..., (1,0), (1,1), (1,2), ..., since that makes matches in the upper left fast and matches in the bottom right slow (assuming a coordinate space that starts at the top left).
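To make that concrete, here is a rough Java sketch of a center-out search (a ring scan rather than the exact midpoint subdivision described above; RedSquareFinder and its check helper are made-up names, and an exact RGB match is assumed):

import java.awt.Point;
import java.awt.image.BufferedImage;

public class RedSquareFinder {

    // Scans outward from the center in growing square rings, since the target
    // is more likely to be near the middle. Assumes an exact RGB match.
    static Point find(BufferedImage img, int targetRgb) {
        int cx = img.getWidth() / 2, cy = img.getHeight() / 2;
        int maxR = Math.max(img.getWidth(), img.getHeight());
        for (int r = 0; r <= maxR; r++) {
            for (int x = cx - r; x <= cx + r; x++) {
                Point p = check(img, x, cy - r, targetRgb);
                if (p == null) p = check(img, x, cy + r, targetRgb);
                if (p != null) return p;
            }
            for (int y = cy - r + 1; y < cy + r; y++) {
                Point p = check(img, cx - r, y, targetRgb);
                if (p == null) p = check(img, cx + r, y, targetRgb);
                if (p != null) return p;
            }
        }
        return null;
    }

    // Bounds-check, test the pixel, then verify the full 10x10 block.
    static Point check(BufferedImage img, int x, int y, int targetRgb) {
        if (x < 0 || y < 0 || x >= img.getWidth() || y >= img.getHeight()) return null;
        if ((img.getRGB(x, y) & 0xFFFFFF) != targetRgb) return null;
        // Walk to the top-left corner of the red run, then confirm 10x10.
        while (x > 0 && (img.getRGB(x - 1, y) & 0xFFFFFF) == targetRgb) x--;
        while (y > 0 && (img.getRGB(x, y - 1) & 0xFFFFFF) == targetRgb) y--;
        if (x + 10 > img.getWidth() || y + 10 > img.getHeight()) return null;
        for (int i = 0; i < 10; i++)
            for (int j = 0; j < 10; j++)
                if ((img.getRGB(x + i, y + j) & 0xFFFFFF) != targetRgb) return null;
        return new Point(x, y);
    }
}

You would call it as find(screenshot, 0xFF0000) for pure red; if the source were JPEG-compressed, the exact comparison would have to become a tolerance.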
Related
I have one BufferedImage image1 and BufferedImage image2, and I want to know if they are equal.
image1 is made beforehand and stored in an image file, which I read back with ImageIO. image2, however, is made on the spot, so it is pretty much guaranteed that the two have different sizes. What I do know is that image2 will equal one of 9 different image1's.
So what I want to do is check whether they are the same image while ignoring all the white pixels on the edges; because the sizes differ, comparing every pixel would report a difference no matter what. If you're wondering why the edges are white: the images are numbers, so the remaining space is white.
If you want to make it simpler: the color of the real image will always be black. But I would like it better if you made it a generic solution (taking all colors into account) so I could use the concepts later.
private boolean equals(BufferedImage image1, BufferedImage image2) {
// This is what I want to fill out.
}
What I first tried to do was to find the first non-white pixel of image1 and the first non-white pixel of image2, and then check the rows after that to see if everything is equal. However, the images are pretty big, and this approach takes more than O(n^2). I need a faster way.
Most probably there is no much faster way using this approach. You could use edge detection, but those algorithms aren't really any faster either.
I would try to work with bounding boxes for each image (number).
If it is possible to save image1 at the size of the number, that would be the way to go: just shrink the image to the real size of the number and save that image to disk. You can then shrink image2 to its bounding box too, and the comparison becomes quite simple and fast.
If shrinking is not an option, calculating the bounding box is. Go through the image array and find the topmost and leftmost non-white pixel in both images. That gives you at least the top and left bounding edges, which is all you need to align the images for comparison. (If the images can differ in size, you need the whole bounding box.)
By the way, you don't need to run in O(n^2). If you find the topmost or leftmost non-white pixel in both images, you can set an offset to work from, and you only need to find a single difference to state that the numbers are different. You can also use simple logic to determine which number it must be. For example, take one (1) and zero (0): where the zero has white pixels in its middle part, the one must have black pixels, and vice versa. So testing areas where the numbers are definitely black or white can help you estimate the digit in the image with up to 9 area tests.
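As an illustration of the bounding-box idea, here is a minimal sketch, assuming "white" means exactly 0xFFFFFF (GlyphCompare and sameGlyph are illustrative names; real scans would need a color tolerance):

import java.awt.Rectangle;
import java.awt.image.BufferedImage;

public class GlyphCompare {

    // Smallest rectangle containing every non-white pixel, or null if blank.
    static Rectangle boundingBox(BufferedImage img) {
        int minX = img.getWidth(), minY = img.getHeight(), maxX = -1, maxY = -1;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                if ((img.getRGB(x, y) & 0xFFFFFF) != 0xFFFFFF) { // not pure white
                    if (x < minX) minX = x;
                    if (x > maxX) maxX = x;
                    if (y < minY) minY = y;
                    if (y > maxY) maxY = y;
                }
            }
        }
        return maxX < minX ? null : new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }

    // Compares the two images inside their bounding boxes, pixel by pixel.
    static boolean sameGlyph(BufferedImage a, BufferedImage b) {
        Rectangle ra = boundingBox(a), rb = boundingBox(b);
        if (ra == null || rb == null) return ra == rb; // equal only if both blank
        if (ra.width != rb.width || ra.height != rb.height) return false;
        for (int dy = 0; dy < ra.height; dy++)
            for (int dx = 0; dx < ra.width; dx++)
                if (a.getRGB(ra.x + dx, ra.y + dy) != b.getRGB(rb.x + dx, rb.y + dy))
                    return false;
        return true;
    }
}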
I bring you a perhaps complex question which I would love your help with. Allow me to go straight to the point:
I want an algorithm or logic in which I draw a shape with my mouse (for example a square) and it becomes a perfect square, with all four sides as straight lines and perfectly regular. A human-drawn square is hardly perfect, but I wish that after it goes through the "filter" of this algorithm, it becomes one.
A fine example of what I want is in the game Trine, where the Wizard works on a similar principle: you draw a shape on the screen and it becomes the closest matching shape. That is, if you draw something similar to a square it becomes a perfect square box, but if you draw a triangle it becomes a perfect triangular box. It's as if it detects what kind of shape it is and then draws a better version of it.
I want this for a game, just so you know what the goal of all this is.
Please help me figure out either the algorithm or the logic behind this, or at least tell me what this kind of operation is called (:
P.S. I added a simple image so it becomes even clearer what I intend =)
If I had to implement this task, I would store the recognizable patterns, and would try to make a match for them.
Take the minX, maxX, minY, maxY values from the user-drawn points; they will help you scale the pattern. Choose the scaling so that the pattern's aspect ratio is the average of the X and Y aspect ratios.
The patterns can consist of a certain number of straight lines. The pattern matches if:
there are no points outside the threshold;
there is at least one user-drawn point close to each key point in the pattern.
If the pattern matches, you will have the key points for your pattern (from which you can calculate the pattern's center and its size/aspect ratio). Then you can replace the user-drawn points with your image, which may be totally different from the pattern used for matching (imagine a circle).
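A small sketch of the scaling and key-point test might look like this (normalize, keyPointHit, and the unit-box convention are my own illustrative choices, not a definitive implementation):

import java.awt.geom.Point2D;

public class PatternMatch {

    // Scales the user-drawn points into a unit box using their min/max extents,
    // so they can be compared against stored patterns defined in the same box.
    static Point2D.Double[] normalize(Point2D.Double[] pts) {
        double minX = Double.MAX_VALUE, minY = Double.MAX_VALUE;
        double maxX = -Double.MAX_VALUE, maxY = -Double.MAX_VALUE;
        for (Point2D.Double p : pts) {
            minX = Math.min(minX, p.x); maxX = Math.max(maxX, p.x);
            minY = Math.min(minY, p.y); maxY = Math.max(maxY, p.y);
        }
        // Guard against degenerate strokes (a single point or a flat line).
        double w = Math.max(maxX - minX, 1e-9), h = Math.max(maxY - minY, 1e-9);
        Point2D.Double[] out = new Point2D.Double[pts.length];
        for (int i = 0; i < pts.length; i++)
            out[i] = new Point2D.Double((pts[i].x - minX) / w, (pts[i].y - minY) / h);
        return out;
    }

    // A pattern key point is "hit" if some user point lies within the threshold.
    static boolean keyPointHit(Point2D.Double key, Point2D.Double[] user, double threshold) {
        for (Point2D.Double p : user)
            if (key.distance(p) <= threshold) return true;
        return false;
    }
}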
There are many ways to do this. One way that you could do it is to create a neural net that recognizes these shapes. I would generate variations of circles, squares, lines, and triangles with random perturbations to replicate "hand-drawn" versions. Then you would want to represent this as a two-dimensional array (where locations that have been drawn on would be 1's and locations that haven't been drawn on, would contain 0's). You can then convert this two-dimensional array into an input vector of n x n elements. The output of the neural net would be a vector with four elements, each one representing either a line, circle, square, or triangle. You would then train this neural net using your randomly-perturbed images until you end up with a neural net that recognizes the input with an error that is under some error-threshold. This is actually quite similar to recognizing handwritten digits.
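As a rough illustration of the input encoding described there (the grid size n, the canvas dimensions, and toInputVector are made-up parameters):

import java.awt.Point;

public class NetInput {

    // Rasterizes the drawn stroke into an n x n grid of 0/1 values and
    // flattens it into the neural net's input vector.
    static double[] toInputVector(Point[] stroke, int n, int canvasW, int canvasH) {
        double[] v = new double[n * n];
        for (Point p : stroke) {
            int gx = Math.min(n - 1, p.x * n / canvasW);
            int gy = Math.min(n - 1, p.y * n / canvasH);
            v[gy * n + gx] = 1.0; // drawn cell
        }
        return v;
    }
}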
Other ways include:
Shape contexts
k-means clustering
Support vector machines
You don't have just an arbitrary shape, you also have the shape's path. So try counting corners. Decide on an angle threshold that will represent a corner. For each point, sample the next x consecutive points. Measure the angle between the first half and the second half. If the angle surpasses your threshold, consider it a corner. (Obviously, select the point that gives you the best angle with the least error, not just the first one that surpasses the threshold.) Mark the locations of the corners and draw your shape to match.
Ellipses and lines: if no corners are detected, sample a few segments and measure their orientations. If the orientations are very similar, it's a line; if very different, an ellipse. For an ellipse, find the bounding box and draw inside it.
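A minimal sketch of the corner-counting test, assuming the path is an ordered array of sampled mouse points (countCorners and the skip-ahead heuristic are illustrative; the "pick the best point, not the first" refinement is omitted):

import java.awt.Point;

public class CornerCounter {

    // Counts corners along the drawn path: at each point, compares the direction
    // of the k points before it with the direction of the k points after it.
    static int countCorners(Point[] path, int k, double thresholdDeg) {
        int corners = 0;
        for (int i = k; i + k < path.length; i++) {
            double before = Math.atan2(path[i].y - path[i - k].y, path[i].x - path[i - k].x);
            double after = Math.atan2(path[i + k].y - path[i].y, path[i + k].x - path[i].x);
            double diff = Math.abs(Math.toDegrees(after - before));
            if (diff > 180) diff = 360 - diff; // shortest angular difference
            if (diff > thresholdDeg) {
                corners++;
                i += k; // skip ahead so one physical corner isn't counted twice
            }
        }
        return corners;
    }
}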
I have an app where the user draws pictures, and these pictures are then converted to PDF. I need to be able to crop out the whitespace before conversion. Originally I kept track of the highest and lowest x and y values (http://stackoverflow.com/questions/13462088/cropping-out-whitespace-from-a-user-drawn-image). This worked for a while, but now I want to give the user the ability to erase. That is a problem because if, for example, the user erases the topmost point, the bounding box changes, but I wouldn't know the new dimensions of the box.
Right now I'm going through the entire image, pixel by pixel, to determine the bounding box. That isn't bad for one image, but I'm going to have ~70, and it's way too slow for that many. I also thought about keeping every pixel in an ArrayList, but I don't feel like that would work well at all.
Is there an algorithm that would help me solve this? Perhaps something already built in? Speed is more important to me than accuracy. If there is some whitespace left on each side it won't be a tragedy.
Thank you so much.
You mentioned that you are keeping track of the min and max values of the X and Y co-ordinates (that also seems to be the solution you chose in the earlier question).
In a similar way, you should be able to find the min and max X and Y co-ordinates of the erased area from the erase event.
When the user erases part of the image, you can then simply compare the co-ordinates of the erased part with those of the current bounding box to decide whether the final co-ordinates need updating.
There is a related problem of trying to see if 2 rectangles overlap:
Determine if two rectangles overlap each other?
You can use similar (though slightly different) logic to figure out the final min/max X and Y values.
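For illustration, a sketch of that test, assuming the erase stroke is available as a java.awt.Rectangle (needsRescan is a made-up name):

import java.awt.Rectangle;

public class CropTracker {

    // The bounding box can only shrink if the erase stroke removed pixels on
    // the box's border rows/columns, so only then is a rescan needed.
    static boolean needsRescan(Rectangle bounds, Rectangle erased) {
        if (!bounds.intersects(erased)) return false; // erased only whitespace
        boolean strictlyInside = erased.x > bounds.x
                && erased.y > bounds.y
                && erased.x + erased.width < bounds.x + bounds.width
                && erased.y + erased.height < bounds.y + bounds.height;
        return !strictlyInside; // touching an edge means an extreme may be gone
    }
}

When it returns true, you don't necessarily have to rescan the whole image; scanning inward from the affected edge until the first non-white row or column is enough to recompute that extreme.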
I am trying to solve a problem of compositing two images in Java. The program takes a part of the first image and pastes it onto the second image. The goal is to make the boundary between the two images less visible. The boundary must be chosen so that the difference between the two images along it is small.
My Tasks:
To write a method that chooses the boundary between the two images. The method receives the overlapping parts of the input images; these must first be transformed so that the boundary always runs from the top-left corner to the bottom-right corner.
Note: The returned image should not be the joined image, but should indicate which parts of the two images were used.
The pixels of the boundary line can be marked with a constant (SEAM). Pixels of the first image can be marked with the integer 0, pixels of the second image with the integer 1. After choosing the boundary line, the flood-fill algorithm can be used to fill the remaining pixels with 0 or 1.
Note: The image can be represented as a graph in which each pixel is connected to its left, right, top and bottom neighbors. Using flood fill is then like a depth-first search.
The "shortest path algorithm" must be used to choose the boundary in order to make it small.
Note: I cannot use any Java data structure except Arrays (not even ArrayList) or I can use my own defined data structure.
I am new to this area and am trying to solve it. What steps must I follow to solve this problem?
My main issue is, how do I represent the images as graphs in Java code (for instance with arrays or my own data structure)?
You can apply a varying opacity level from the boundary toward the center of the image, so that the edges cannot be identified.
See http://sreejithvs999.wordpress.com/2013/06/12/transparent-image-composition-in-java-fixing-an-image-over-another-with-changing-opacity-or-alpha-of-pixels/
where one image is composed over another with changing pixel opacity.
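A sketch of one way to do this per pixel, assuming the pasted patch and the covered background region have the same size (Feather, blend, and the linear ramp are my own illustrative choices, not the linked article's code):

import java.awt.image.BufferedImage;

public class Feather {

    // Blends the pasted patch over the background with alpha that ramps from 0
    // at the patch border to 1 at `feather` pixels inside, hiding the seam.
    static BufferedImage blend(BufferedImage patch, BufferedImage background, int feather) {
        int w = patch.getWidth(), h = patch.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int border = Math.min(Math.min(x, y), Math.min(w - 1 - x, h - 1 - y));
                double a = Math.min(1.0, border / (double) feather);
                out.setRGB(x, y, mix(patch.getRGB(x, y), background.getRGB(x, y), a));
            }
        }
        return out;
    }

    // Per-channel linear interpolation: a = 1 keeps the patch, a = 0 the background.
    static int mix(int c1, int c2, double a) {
        int r = (int) (((c1 >> 16) & 0xFF) * a + ((c2 >> 16) & 0xFF) * (1 - a));
        int g = (int) (((c1 >> 8) & 0xFF) * a + ((c2 >> 8) & 0xFF) * (1 - a));
        int b = (int) ((c1 & 0xFF) * a + (c2 & 0xFF) * (1 - a));
        return (r << 16) | (g << 8) | b;
    }
}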
Here is a possible answer to the "main issue" part of your question.
You have to represent the image's pixels as a graph, and you are restricted to Java arrays. If you look at the two-dimensional array you will need anyway to represent the pixels of the image, it can serve as the graph as well: each item in the array holds just its data value (the pixel color), and the nodes attached to the current node can be calculated with the formulas below:
Current pixel : [X,Y]
Top pixel : [X,Y-1]
Bottom pixel : [X,Y+1]
Left pixel : [X-1,Y]
Right pixel : [X+1,Y]
NOTE: X and Y are indices into the 2D array. Also, when incrementing or decrementing X or Y to calculate a neighbor pixel, make sure you don't overflow or underflow the boundaries of the image, i.e. decrementing must not cause X or Y to become < 0, and incrementing must not push X beyond the image width or Y beyond the image height.
Refer: http://docs.oracle.com/javase/tutorial/2d/images/index.html
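Under the arrays-only restriction, the neighbor lookup could be sketched like this (the int[][] pair representation is just one illustrative choice):

public class PixelGraph {

    // Returns the valid 4-neighbors of (x, y) as {x, y} pairs, using only
    // plain arrays to respect the "no collections" restriction.
    static int[][] neighbors(int x, int y, int width, int height) {
        int[][] candidates = { {x, y - 1}, {x, y + 1}, {x - 1, y}, {x + 1, y} }; // top, bottom, left, right
        int count = 0;
        for (int[] c : candidates)
            if (c[0] >= 0 && c[0] < width && c[1] >= 0 && c[1] < height) count++;
        int[][] result = new int[count][];
        int i = 0;
        for (int[] c : candidates)
            if (c[0] >= 0 && c[0] < width && c[1] >= 0 && c[1] < height) result[i++] = c;
        return result;
    }
}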
I have a grayscale image of 64x64. I found the dots of the contour with a simple algorithm:
find the brightest value (example: 100)
divide it by 2 (100 / 2 = 50)
define a band around the result (50 - 5 = 45 to 50 + 5 = 55)
mark all dots whose value lies in the band (between 45 and 55)
The question now is: how do I decide on the order of connecting the dots?
(All references are welcome: links, thoughts, etc.)
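For reference, the marking step above might look roughly like this (the int[][] gray array and markBand are just how I'd sketch it):

public class ContourDots {

    // Marks every pixel whose gray value lies within +/-5 of half the maximum.
    static boolean[][] markBand(int[][] gray) {
        int max = 0;
        for (int[] row : gray)
            for (int v : row)
                if (v > max) max = v;
        int mid = max / 2, tol = 5;
        boolean[][] marked = new boolean[gray.length][gray[0].length];
        for (int y = 0; y < gray.length; y++)
            for (int x = 0; x < gray[0].length; x++)
                marked[y][x] = Math.abs(gray[y][x] - mid) <= tol;
        return marked;
    }
}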
Your algorithm allows the entire image, save for one pixel, to be "the contour". I'm not sure that's exactly what you want; usually a contour is a border between two different regions. The problem with your method is that you can get huge blobs of pixels that have no particularly obvious traversal order. If you have a contour that is a single pixel thick, then the traversal order is much more obvious: clockwise or counterclockwise.
Consider the following image.
........
..%%%%..
.%%%%...
...%%%%.
....%...
........
Here, I've marked everything "dark" (<50, perhaps) as % and everything bright as a dot. Now you can pick any pixel that is on the border between the two regions. (I'll pick the dark side; you can also draw the contour on the light side, or, with a little more work, directly between the light and dark sides.)
........
..%%%%..
.*%%%...
...%%%%.
....%...
........
Now you try to travel along the outer edge of the dark region, one pixel at a time. First, you look in the direction of something bright (directly left, for example). Then you rotate around--counterclockwise, let's say--until you hit a dark pixel.
........
..%%%%..
1*5%%...
234%%%%.
....%...
........
Once you hit position 5, you see that it's dark. So you mark it as part of the contour, and then you try to find the next piece of the contour by sweeping around, starting from the pixel you just came from:
........
..%%%%..
.0*%%...
.123%%%.
....%...
........
Here 0 is where you came from--and you're not going back there--and then you try pixels 1 and 2 (both light, which is not okay), until you hit pixel 3, which is dark.
In this way, you can walk around the contour pixel by pixel, both identifying the contour and getting the order of its pixels, until you arrive back at the pixel you started from and leave it the same way you left it the first time; then the contour is closed. In our example, where we're making an 8-connected contour (i.e. we look at 8 neighbors, not 4), we'd get the following (where # denotes a contour point).
........
..####..
.##%#...
...#%##.
....#...
........
(You need this stopping criterion, hitting the start pixel the same way twice, because if you have a dark region a single pixel wide, you'll walk up it and then need to walk back down along the same pixels.)
At this point, you've covered one entire boundary. But there might be others. Keep looking for dark pixels next to light ones until you have drawn a contour on top of all of them. Now you've converted your two-level picture (dark & bright pixels) into a set of contours.
If the contours end up too noisy, consider blurring the image first. That will smooth the contours out. (Alternatively, you can find the contours first and then average the coordinates with a moving window.)
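Here is roughly how the walk described above might look in Java (a sketch of Moore-neighbor tracing with the hit-the-start-the-same-way stopping rule; dark[y][x] is the thresholded image, and the start pixel is assumed to have been found by scanning left to right, so its west neighbor is light):

import java.awt.Point;
import java.util.ArrayList;
import java.util.List;

public class ContourTracer {

    // Clockwise 8-neighborhood in screen coordinates (y grows downward); index 0 = west.
    static final int[] DX = {-1, -1, 0, 1, 1, 1, 0, -1};
    static final int[] DY = { 0, -1, -1, -1, 0, 1, 1, 1};

    // (sx, sy) must be a dark pixel whose west neighbor is known to be light.
    static List<Point> trace(boolean[][] dark, int sx, int sy) {
        List<Point> contour = new ArrayList<>();
        int px = sx, py = sy;
        int backtrack = 0;                 // direction of the light pixel we came from
        final int startBacktrack = backtrack;
        do {
            contour.add(new Point(px, py));
            int i = backtrack;
            for (int k = 0; k < 8; k++) {
                i = (i + 1) % 8;           // sweep clockwise, starting after the backtrack
                int nx = px + DX[i], ny = py + DY[i];
                if (ny >= 0 && ny < dark.length && nx >= 0 && nx < dark[0].length && dark[ny][nx]) {
                    // The cell checked just before this one is light; it becomes
                    // the new backtrack, re-expressed relative to the new pixel.
                    int lx = px + DX[(i + 7) % 8], ly = py + DY[(i + 7) % 8];
                    px = nx;
                    py = ny;
                    for (int j = 0; j < 8; j++)
                        if (px + DX[j] == lx && py + DY[j] == ly) { backtrack = j; break; }
                    break;
                }
            }
            // Stop when we re-enter the start pixel coming from the same direction.
        } while (px != sx || py != sy || backtrack != startBacktrack);
        return contour;
    }
}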
In general, a given set of points can be connected in multiple ways to make different shapes.
For example, consider a set of 5 points consisting of the corners of a square and its center. These points can be connected to form a square with one side "dented in" to the center. But which side? There is no unique answer.
Other shapes can be much more complicated, with no obvious way to connect the dots.
If you are allowed to reduce your set of points to a convex hull, then it would be much easier.
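If the convex hull is acceptable, one standard construction is Andrew's monotone chain; a sketch (the Hull class and convexHull name are illustrative):

import java.awt.Point;
import java.util.Arrays;

public class Hull {

    // Andrew's monotone chain: returns the hull points in order around the hull.
    static Point[] convexHull(Point[] pts) {
        Point[] p = pts.clone();
        if (p.length < 3) return p;
        Arrays.sort(p, (a, b) -> a.x != b.x ? Integer.compare(a.x, b.x)
                                            : Integer.compare(a.y, b.y));
        Point[] hull = new Point[2 * p.length];
        int k = 0;
        for (Point pt : p) {                                  // lower hull
            while (k >= 2 && cross(hull[k - 2], hull[k - 1], pt) <= 0) k--;
            hull[k++] = pt;
        }
        for (int i = p.length - 2, t = k + 1; i >= 0; i--) {  // upper hull
            while (k >= t && cross(hull[k - 2], hull[k - 1], p[i]) <= 0) k--;
            hull[k++] = p[i];
        }
        return Arrays.copyOf(hull, k - 1);                    // last point repeats the first
    }

    // Cross product of (o -> a) and (o -> b); positive for a counterclockwise turn.
    static long cross(Point o, Point a, Point b) {
        return (long) (a.x - o.x) * (b.y - o.y) - (long) (a.y - o.y) * (b.x - o.x);
    }
}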
I have also tried to create an algorithm that connects contour dots into a smooth curve. See my open-source project http://outliner.codeplex.com.
The idea is the same as the one proposed by FUZxxl, but I do not understand his worries about complexity: the processing time is proportional to the total length of all contour strokes.
I don't know if collecting those points will get you far. (I can come up with situations in which it's almost impossible to tell which order they should come in.)
How about going to the brightest point.
Then go to the brightest of, say, 360 points surrounding this point at a distance of, say, 5 pixels.
Continue from there, but make sure you don't go back where you came from :)
Maybe try:
Start at a
Find the nearest point b
Connect a with b
and so on.
Probably not good, as the complexity is something like O(n²). You can improve it by looking only at points near the current one, as aioobe suggests. This algorithm works well if the points are only 2-3 px apart, but it may create very strange results otherwise. A rough sketch is below.
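A sketch of that greedy chaining (chain and GreedyChain are made-up names):

import java.awt.Point;

public class GreedyChain {

    // Greedy nearest-neighbor chaining: start at pts[0], repeatedly jump to the
    // closest unused point. O(n^2) overall, as noted above.
    static Point[] chain(Point[] pts) {
        if (pts.length == 0) return pts;
        boolean[] used = new boolean[pts.length];
        Point[] order = new Point[pts.length];
        int cur = 0;
        used[0] = true;
        order[0] = pts[0];
        for (int step = 1; step < pts.length; step++) {
            int best = -1;
            long bestD = Long.MAX_VALUE;
            for (int j = 0; j < pts.length; j++) {
                if (used[j]) continue;
                long dx = pts[j].x - pts[cur].x, dy = pts[j].y - pts[cur].y;
                long d = dx * dx + dy * dy;
                if (d < bestD) { bestD = d; best = j; }
            }
            used[best] = true;
            order[step] = pts[best];
            cur = best;
        }
        return order;
    }
}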
See also Flood fill, and the lovely applet in the SO question mapping-a-branching-tile-path.