I have a photo of a paper that I hold up to my webcam, and I want to crop the photo down to just the paper. This way, my OCR program will potentially be more accurate, as well as conceivably faster.
I have taken a couple of steps so far to isolate the paper from the background.
First, I use Canny edge detection with high thresholds. This gives a two-color representation of the edges in my image. In it, I can see a rounded rectangle among some other background artifacts that happen to have sharp edges.
Next, I apply a Hough transform and draw, on a black background, every line whose polar-coordinate accumulator received over 100 point hits. The resulting image is as shown:
See that large (the largest), almost-rectangular figure right in the middle? That's the paper I'm holding. I need to isolate that trapezoid as a polygon, or otherwise get the coordinates of its vertices.
I can then use these coordinates on the original image to extract a PNG of the paper and nothing else.
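For reference, here is roughly what the pipeline so far looks like in code. This is an untested sketch using OpenCV's Java bindings; the file names and threshold values are placeholders.

    import org.opencv.core.*;
    import org.opencv.imgcodecs.Imgcodecs;
    import org.opencv.imgproc.Imgproc;

    public class PaperEdges {
        public static void main(String[] args) {
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME); // assumes OpenCV's native libs are on the path

            Mat src = Imgcodecs.imread("webcam_frame.png", Imgcodecs.IMREAD_GRAYSCALE); // placeholder file
            Mat edges = new Mat();
            Imgproc.Canny(src, edges, 150, 300); // high thresholds, as described above

            // Hough transform: keep lines with more than 100 accumulator votes.
            Mat lines = new Mat(); // each row holds (rho, theta) in polar coordinates
            Imgproc.HoughLines(edges, lines, 1, Math.PI / 180, 100);

            // Draw the surviving lines on a black background.
            Mat black = Mat.zeros(src.size(), CvType.CV_8UC1);
            for (int i = 0; i < lines.rows(); i++) {
                double rho = lines.get(i, 0)[0], theta = lines.get(i, 0)[1];
                double a = Math.cos(theta), b = Math.sin(theta);
                double x0 = a * rho, y0 = b * rho;
                Point p1 = new Point(x0 + 2000 * (-b), y0 + 2000 * a);
                Point p2 = new Point(x0 - 2000 * (-b), y0 - 2000 * a);
                Imgproc.line(black, p1, p2, new Scalar(255), 1);
            }
            Imgcodecs.imwrite("hough_lines.png", black);
        }
    }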
I would also greatly appreciate answers to any of these three sub-questions:
- How do you find the locations of the intersections of these lines on the image? (A sketch of my attempt follows the list.)
- How would I get rid of any lines that don't form the central trapezoidal polygon?
- With these points, is there anything better than a convex hull that would let me extract only the trapezoidal/rectangular region of the image?
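Since the Hough lines come back in (rho, theta) form, I assume an intersection means solving x*cos(theta) + y*sin(theta) = rho for a pair of lines. Here is my untested attempt at that:

    public class LineIntersections {
        /** Intersection of two Hough lines given in polar (rho, theta) form,
         *  or null if they are (nearly) parallel. */
        static double[] intersect(double rho1, double theta1, double rho2, double theta2) {
            // Each line satisfies: x*cos(theta) + y*sin(theta) = rho
            double a1 = Math.cos(theta1), b1 = Math.sin(theta1);
            double a2 = Math.cos(theta2), b2 = Math.sin(theta2);

            double det = a1 * b2 - a2 * b1;            // zero when the lines are parallel
            if (Math.abs(det) < 1e-9) return null;

            double x = (rho1 * b2 - rho2 * b1) / det;  // Cramer's rule
            double y = (a1 * rho2 - a2 * rho1) / det;
            return new double[] { x, y };
        }
    }

Intersections that fall outside the image bounds could then be discarded, which should already remove many of the spurious background lines.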
Here is another example, in which my program produced a better image:
I will start by explaining the problem in a real-world application: I have a camera pointed at a dark surface. A paper feeder (just a simple machine that spits out pages when told to by the controlling computer) feeds paper into the area the camera is looking at.
Using OpenCV, I am to process the image and determine whether the feeder fed correctly.
This means I have to be able to determine not just whether a page is present, but whether multiple pages are present. Sometimes multiple pages are fed because they stuck together, in which case the pages are so close to perfectly aligned with each other that they visually appear to be one page unless you look very closely.
The problem I am having is that Canny edge detection, combined with the Hough transform, does not give me the accuracy I need. Typical examples of using Canny to find a sheet of paper in an image return results where each page edge is represented by many lines (5-15), and findContours then determines that this is a rectangle.
These typical examples do not help, as I need to be able to detect that there is another line very close to the edge of the page.
I've been playing with the threshold values of the Hough transform, as well as how much blur I apply before Canny, and have gotten it fairly reliable. But I believe the sensitivity is now too low, and any pages that feed in as in the example above (on top of each other) will not be detected by this system.
Examples:
The above image has two visible pages, one that has just come out of the feeder. There is text on the page. I need to be able to identify the page's angle, and that there is in fact only one page.
The problem I'm having is that I need the line detection sensitive enough to tell whether there are two pages stuck together, but I also need the lines from the text on the page not to be detected.
I would suggest not looking for the faint line of one paper covering the other, but looking for the overall shape of the white area. If you know that all these pages are A4 size, you find the rectangle of white area and check whether the angles are right angles and whether the ratio of the sides is the square root of 2. And if the two pieces of paper are perfectly aligned, you would not detect it anyway, unless you measure the thickness of the paper somehow.
There is a corner detection filter which can help you find the corners of rectangle ABCD. Then you just check whether vector AB is perpendicular to BC and AD, and so on. Also, if more than 4 corners are detected with a reasonable threshold within the compact white area, it is a message that something's not OK.
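A minimal sketch of those checks, assuming the four corners have already been found and ordered as A, B, C, D (the tolerance values are guesses):

    import java.awt.geom.Point2D;

    public class A4Check {
        /** Rough test: does ABCD have right angles and an A4 side ratio (sqrt 2)? */
        static boolean looksLikeA4(Point2D a, Point2D b, Point2D c, Point2D d) {
            double abx = b.getX() - a.getX(), aby = b.getY() - a.getY(); // vector AB
            double bcx = c.getX() - b.getX(), bcy = c.getY() - b.getY(); // vector BC
            double adx = d.getX() - a.getX(), ady = d.getY() - a.getY(); // vector AD

            double ab = Math.hypot(abx, aby);
            double bc = Math.hypot(bcx, bcy);
            double ad = Math.hypot(adx, ady);

            // Right angles: normalized dot products should be near zero.
            boolean rightAngles =
                    Math.abs((abx * bcx + aby * bcy) / (ab * bc)) < 0.05
                 && Math.abs((abx * adx + aby * ady) / (ab * ad)) < 0.05;

            // A4 proportions: long side over short side should be near sqrt(2).
            double ratio = Math.max(ab, bc) / Math.min(ab, bc);
            return rightAngles && Math.abs(ratio - Math.sqrt(2)) < 0.1;
        }
    }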
As the title implies, I need an algorithm, code, or a library that would help me stretch a Bitmap (or a Path in Android) to an arbitrary polygon. The polygon is given as a list of x, y coordinates. Actually, I need to transform/stretch a Path object in Android, which is also given by x, y coordinates. I mention Bitmap because it is more likely that someone has had a similar problem, and I assume both would be transformed by a Matrix.
I tried to use Matrix.setPolyToPoly(...) but it doesn't seem to help, since it transforms to a quadrilateral (only 4 points), not to an arbitrary polygon.
For a better illustration of what I need, please check out the image below. It is not the exact transformation, but something close. Note that the whole image is stretched to the star-shaped polygon; it is not a mask and not a trim, just a transformation of the pixels.
I saw your question a few days ago, then yesterday I ran across this:
Canvas#drawBitmapMesh | Android Developers
It's kind of hard to grasp, but the way I understand it, you start with an imaginary elastic grid over your bitmap. The way you want to warp the bitmap can be expressed by moving the x, y points of the grid to alternate locations.
Here's an article with a diagram and here's an article with some sample code.
Obviously, the hard part now is to take your frame polygon and use it to generate the warped vertices in the mesh. That may take some fancy mathematics. But I thought this would be a step in the right direction.
This is what I was envisioning: I'm looking at the star polygon and I'm picturing a circle as the starting point (not the square). The star could be seen as taking the circle and stretching points on it toward and away from the center. Whichever way it was stretched would create some vectors, from zero at the center to strongest at the stretch point.
For a Path, you could then just apply the vectors to the points in the path, but the lines would also need to be bent, so this would be some pretty convoluted math with Bezier curves (convoluted at least for me; I'm not any sort of mathematician).
But if you drew the Path onto a Bitmap you might be in a better position. You could just alter the mesh vertices using the different vectors then use Canvas.drawBitmapMesh() to render the final result.
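To make that concrete, here is an untested sketch. It displaces each mesh vertex radially and hands the result to Canvas.drawBitmapMesh(); radiusAt() is a stand-in for whatever function maps an angle to your polygon's boundary radius.

    import android.graphics.Bitmap;
    import android.graphics.Canvas;

    public class MeshWarp {
        /** Draws the bitmap warped onto a star-like outline by displacing mesh
         *  vertices radially from the bitmap's center. */
        static void drawWarped(Canvas canvas, Bitmap bitmap) {
            final int w = 10, h = 10;  // mesh resolution: 10x10 cells
            float[] verts = new float[(w + 1) * (h + 1) * 2];
            float cx = bitmap.getWidth() / 2f, cy = bitmap.getHeight() / 2f;

            int i = 0;
            for (int y = 0; y <= h; y++) {
                for (int x = 0; x <= w; x++) {
                    float px = x * bitmap.getWidth() / (float) w;   // undistorted grid point
                    float py = y * bitmap.getHeight() / (float) h;
                    float dx = px - cx, dy = py - cy;               // vector from the center
                    double angle = Math.atan2(dy, dx);

                    // Stretch toward/away from the center so the bitmap's inscribed
                    // circle lands on the polygon's boundary at this angle.
                    double scale = radiusAt(angle) / Math.min(cx, cy);
                    verts[i++] = cx + (float) (dx * scale);
                    verts[i++] = cy + (float) (dy * scale);
                }
            }
            canvas.drawBitmapMesh(bitmap, w, h, verts, 0, null, 0, null);
        }

        /** Stand-in: distance from the polygon's center to its boundary at a given angle. */
        static double radiusAt(double angle) {
            return 100 + 40 * Math.cos(5 * angle);  // a five-pointed-star-ish profile
        }
    }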
I need to detect shapes and their colours on a taken image. These shapes are: a heart, a rectangle, a star and a circle. Each shape has one of 5 predefined colours. Colour recognition works fine, but shape recognition is a real problem.
After hours and hours of googling, trying, and tweaking code, the best I have come up with is the following procedure (sketched in code after the steps):
First, read the image and convert it to grayscale.
Then, apply blur to reduce the noise from the background.
Median blur seems to work best here. Normal blur has little effect, and Gaussian blur rounds the edges of the shapes, which causes trouble.
Next, apply a threshold.
AdaptiveThreshold doesn't give the results I expected; the results vary widely from image to image. I now apply two thresholds: one uses Otsu's algorithm to filter out the light shapes, and the other uses the binary inverted value for the darker shapes.
Finally, find contours on the two thresholded images and merge them into one list.
By the number of points found in each contour, I decide which shape has been found.
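Here is an untested sketch of that procedure using OpenCV's Java bindings; the file name, blur size, threshold value, and vertex-count cutoffs are placeholders:

    import org.opencv.core.*;
    import org.opencv.imgcodecs.Imgcodecs;
    import org.opencv.imgproc.Imgproc;
    import java.util.ArrayList;
    import java.util.List;

    public class ShapeFinder {
        public static void main(String[] args) {
            System.loadLibrary(Core.NATIVE_LIBRARY_NAME);

            Mat gray = Imgcodecs.imread("shapes.png", Imgcodecs.IMREAD_GRAYSCALE); // placeholder file
            Imgproc.medianBlur(gray, gray, 5);  // median blur, as described above

            // Two thresholds: Otsu for the light shapes, binary-inverted for the dark ones.
            Mat light = new Mat(), dark = new Mat();
            Imgproc.threshold(gray, light, 0, 255, Imgproc.THRESH_BINARY + Imgproc.THRESH_OTSU);
            Imgproc.threshold(gray, dark, 100, 255, Imgproc.THRESH_BINARY_INV);

            // Find contours on both thresholded images and merge them into one list.
            List<MatOfPoint> contours = new ArrayList<>();
            for (Mat bin : new Mat[] { light, dark }) {
                List<MatOfPoint> found = new ArrayList<>();
                Imgproc.findContours(bin, found, new Mat(),
                        Imgproc.RETR_EXTERNAL, Imgproc.CHAIN_APPROX_SIMPLE);
                contours.addAll(found);
            }

            // Approximate each contour and classify by the number of vertices.
            for (MatOfPoint c : contours) {
                MatOfPoint2f curve = new MatOfPoint2f(c.toArray());
                MatOfPoint2f approx = new MatOfPoint2f();
                double eps = 0.03 * Imgproc.arcLength(curve, true); // tolerance is a guess
                Imgproc.approxPolyDP(curve, approx, eps, true);
                long n = approx.total();
                if (n == 4) System.out.println("rectangle?");
                else if (n > 8) System.out.println("circle or heart?");
                else System.out.println("star? (" + n + " vertices)");
            }
        }
    }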
I have tried adding Canny, sharpening the image, different thresholds, convex hulls, Hough transforms, etc. I have probably tried every possible value for each method as well. For now, I can get things working with the above procedure on a few images, but then it fails again on a new image. Either it detects too many points in a contour, or it doesn't detect the shape at all.
I am really stuck and don't know what else to try. One thing I still have to work out is using masks, but I can't find much information on that and don't know whether it would make any difference at all.
Any advice is more than welcome. If you would like to see my code, please ask. You can find sample images here: http://tinypic.com/a/34tbr/1
I'm building an Android puzzle game where the user rotates and shifts pieces of a puzzle to form a final picture. It's a bit like a sliding block puzzle, but the shape and size of the pieces are not uniform: more like a sliding block version of Tetris.
At the moment I've got puzzle pieces as imageViews which can be selected and moved around a view to position them. I've got the vector forms of the shapes behind the scenes as ArrayLists of Points.
But... I'm stuck on how to snap-align the pieces together, i.e. when a piece is near another, shift one piece so that the nearby edges overlay each other (essentially sharing a boundary).
I'm sure this has been done plenty of times, but I can't find examples with code (in any language). It's similar to snapping to a grid, but not the same; it's the kind of functionality you get in a diagramming interface when you snap objects to each other.
Can anyone point me toward a tutorial (any language), code, or advice on how to implement it?
Yours is like the Tangram game. I think it cannot be done with pieces of an image to form a final picture. It can be done by creating geometry shapes (for both the final shape and the pieces/slices of the final picture) using the android.graphics package. It's quite easy to determine the final shape from the edges and vertices of the pieces/slices.
http://code.google.com/p/photogaffe/ is worth checking out. It is an open-source sliding puzzle consisting of 15 pieces, and it allows the user to choose an image from their gallery.
You would only have to figure out your various shapes and how to rotate them. And, if you are supplying your own images, how to load them.
Hope that helps.
What about drawing a box around each shape? Afterwards you define its middle. Then you can store a rotation value for each piece, and you would need to store the neighbours together with a vector to their middles.
Then you simply have to check that the vector is within a reasonable range and the rotation within ±X degrees. For example, if the vector is within ±10 pixels and the rotation within ±3°, you could rotate the piece and fit it into the puzzle.
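Something like this untested sketch, where Piece is a stand-in for however you model a puzzle piece:

    public class Snapper {
        /** Minimal piece model assumed by this sketch. */
        static class Piece {
            double cx, cy;                  // middle of the piece's bounding box
            double rotationDeg;             // stored rotation of the piece
            double expectedDx, expectedDy;  // stored vector from the neighbour's middle to this piece's middle
        }

        /** Snaps 'piece' onto 'neighbour' when both the vector and the rotation are in range. */
        static void maybeSnap(Piece piece, Piece neighbour) {
            double dx = piece.cx - neighbour.cx;
            double dy = piece.cy - neighbour.cy;

            boolean vectorOk = Math.hypot(dx - piece.expectedDx, dy - piece.expectedDy) <= 10; // +-10 px
            boolean rotationOk = Math.abs(piece.rotationDeg - neighbour.rotationDeg) <= 3;     // +-3 degrees

            if (vectorOk && rotationOk) {
                piece.rotationDeg = neighbour.rotationDeg;   // snap the rotation
                piece.cx = neighbour.cx + piece.expectedDx;  // snap the position
                piece.cy = neighbour.cy + piece.expectedDy;
            }
        }
    }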
I have a grayscale image of 64x64. I found the dots of the contour with a simple algorithm (sketched in code after the steps):
find brightest spot (Example: 100)
divide by 2 (100/2 = 50)
define a band around the result (50-5=45 to 50+5=55)
mark all dots that have value in the band (between 45 to 55)
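In code, those steps look roughly like this, assuming the image is held as a 64x64 array of brightness values:

    public class BandMarker {
        /** Marks every pixel whose value lies in a band around half the peak brightness. */
        static boolean[][] markBand(int[][] img) {
            int max = 0;
            for (int[] row : img)
                for (int v : row) max = Math.max(max, v);   // 1. brightest spot (e.g. 100)

            int mid = max / 2;                              // 2. divide by 2 (e.g. 50)
            int halfWidth = 5;                              // 3. band of +-5 (45 to 55)

            boolean[][] marked = new boolean[img.length][img[0].length];
            for (int y = 0; y < img.length; y++)
                for (int x = 0; x < img[0].length; x++)     // 4. mark dots inside the band
                    marked[y][x] = Math.abs(img[y][x] - mid) <= halfWidth;
            return marked;
        }
    }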
The question now is: how do I decide the order in which to connect the dots?
(All references will be accepted: links, thoughts, etc.)
Your algorithm allows the entire image, save for one pixel, to be "the contour". I'm not sure that's exactly what you want; usually a contour is a border between two different regions. The problem with your method is that you can get huge blobs of pixels that have no particularly obvious traversal order. If you have a contour that is a single pixel thick, then the traversal order is much more obvious: clockwise or counterclockwise.
Consider the following image.
........
..%%%%..
.%%%%...
...%%%%.
....%...
........
Here, I've marked everything "dark" (<50, perhaps) as % and everything bright as .. Now you can pick any pixel that is on the border between the two regions (I'll pick the dark side; you can draw the contour on the light side also, or with a little more work, directly between the light and dark sides.)
........
..%%%%..
.*%%%...
...%%%%.
....%...
........
Now you try to travel along the outer edge of the dark region, one pixel at a time. First, you look in the direction of something bright (directly left, for example). Then you rotate around--counterclockwise, let's say--until you hit a dark pixel.
........
..%%%%..
1*5%%...
234%%%%.
....%...
........
Once you hit position 5, you see that it's dark. So you mark it as part of the contour, and then you try to find the next piece of the contour by sweeping around, starting from the pixel you just came from:
........
..%%%%..
.0*%%...
.123%%%.
....%...
........
Here 0 is where you came from--and you're not going back there--and then you try pixels 1 and 2 (both light, which is not okay), until you hit pixel 3, which is dark.
In this way, you can walk around the contour pixel by pixel--both identifying the contour and getting the order of pixels--until you arrive back at the pixel you started from and leave it toward the same pixel you moved to the first time. Then the contour is closed. In our example, where we're making an 8-connected contour (i.e. we look at 8 neighbors, not 4), we'd get the following (where # denotes a contour point).
........
..####..
.##%#...
...#%##.
....#...
........
(You have to have this leave-the-same-way criterion: if you have a dark region a single pixel wide, you'll walk up it and then need to walk back down along the same pixels, passing through the starting pixel before the contour is actually closed.)
At this point, you've covered one entire boundary. But there might be others. Keep looking for dark pixels next to light ones until you have drawn a contour on top of all of them. Now you've converted your two-level picture (dark & bright pixels) into a set of contours.
If the contours end up too noisy, consider blurring the image first. That will smooth the contours out. (Alternatively, you can find the contours first and then average the coordinates with a moving window.)
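Here is a compact, untested sketch of the walk described above, assuming dark[y][x] holds the two-level picture, 8-connectivity, and a starting border pixel whose west neighbour is bright (as in the diagrams):

    import java.util.ArrayList;
    import java.util.List;

    public class ContourTracer {
        // The 8 neighbours, swept counterclockwise in screen coordinates (y grows down):
        // W, SW, S, SE, E, NE, N, NW -- the same order used in the walk above.
        static final int[] DX = { -1, -1,  0,  1, 1, 1,  0, -1 };
        static final int[] DY = {  0,  1,  1,  1, 0, -1, -1, -1 };

        /** Traces one closed 8-connected contour of the dark region, starting from a
         *  border pixel (sx, sy) whose west neighbour is known to be bright.
         *  The start pixel appears both first and last in the returned list. */
        static List<int[]> trace(boolean[][] dark, int sx, int sy) {
            List<int[]> contour = new ArrayList<>();
            contour.add(new int[] { sx, sy });

            int cx = sx, cy = sy;
            int backDir = 0;      // direction of a known bright pixel: west, as in the example
            int firstMove = -1;   // remember how we left the start pixel the first time

            while (true) {
                // Sweep counterclockwise from the direction we came from.
                int move = -1;
                for (int k = 1; k <= 8; k++) {
                    int d = (backDir + k) % 8;
                    int nx = cx + DX[d], ny = cy + DY[d];
                    if (ny >= 0 && ny < dark.length && nx >= 0 && nx < dark[0].length
                            && dark[ny][nx]) {
                        move = d;
                        break;
                    }
                }
                if (move < 0) break;  // isolated pixel: no dark neighbours at all

                // Stop when we would leave the start pixel the same way we left it first.
                if (cx == sx && cy == sy && firstMove >= 0 && move == firstMove) break;
                if (firstMove < 0) firstMove = move;

                cx += DX[move];
                cy += DY[move];
                contour.add(new int[] { cx, cy });
                backDir = (move + 4) % 8;  // on the new pixel, start from where we came from
            }
            return contour;
        }
    }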
In general, a given set of points can be connected in multiple ways to make different shapes.
For example, consider a set of 5 points consisting of the corners of a square and its center. These points can be connected to form a square with one side "dented in" to the center. But which side? There is no unique answer.
Other shapes can be much more complicated, with no obvious way to connect the dots.
If you are allowed to reduce your set of points to a convex hull, then it would be much easier.
I have also tried to create an algorithm that connects contour dots into a smooth curve. See my open-source project http://outliner.codeplex.com.
The idea is the same as the one proposed by FUZxxl, but I do not understand his worries about complexity: the processing time is proportional to the total length of all contour strokes.
I don't know if collecting those points will get you far. (I can come up with situations in which it's almost impossible to tell which order they should come in.)
How about starting at the brightest point?
Then go to the brightest of, say, 360 points surrounding this point at a distance of, say, 5 pixels.
Continue from there, but make sure you don't go back where you came from :)
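One step of that walk could look like this untested sketch, where img is a grayscale array and samples pointing back toward the previous point are skipped:

    public class RingWalk {
        /** One step: from (x, y), move to the brightest of 360 samples on a circle of
         *  radius 5, skipping samples that point back toward the previous point.
         *  For the first step, pass prevX = x and prevY = y (nothing is skipped then). */
        static int[] nextPoint(int[][] img, int x, int y, int prevX, int prevY) {
            int bestX = x, bestY = y, best = -1;
            for (int deg = 0; deg < 360; deg++) {
                double rad = Math.toRadians(deg);
                int nx = x + (int) Math.round(5 * Math.cos(rad));
                int ny = y + (int) Math.round(5 * Math.sin(rad));
                if (ny < 0 || ny >= img.length || nx < 0 || nx >= img[0].length) continue;

                // Don't go back where you came from: skip the half-plane facing the previous point.
                if ((nx - x) * (prevX - x) + (ny - y) * (prevY - y) > 0) continue;

                if (img[ny][nx] > best) { best = img[ny][nx]; bestX = nx; bestY = ny; }
            }
            return new int[] { bestX, bestY };
        }
    }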
Maybe try:
Start at a
Find the nearest point b
Connect a with b
and so on.
Probably not good, as the complexity is something like O(n²). You may simplify this by looking only for points near the start, as aioobe suggests. This algorithm is good if the points are just 2-3 px away from each other, but it may create very strange grids.
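For illustration, an untested sketch of that greedy chaining over a list of points (the O(n²) version):

    import java.util.ArrayList;
    import java.util.List;

    public class GreedyChain {
        /** Connects points greedily: start at the first point, then repeatedly hop
         *  to the nearest point not yet used. O(n^2) over n points. */
        static List<double[]> chain(List<double[]> points) {
            List<double[]> order = new ArrayList<>();
            List<double[]> left = new ArrayList<>(points);
            if (left.isEmpty()) return order;

            double[] current = left.remove(0);      // "Start at a"
            order.add(current);
            while (!left.isEmpty()) {               // "Find the nearest point b" ...
                int best = 0;
                double bestDist = Double.MAX_VALUE;
                for (int i = 0; i < left.size(); i++) {
                    double dx = left.get(i)[0] - current[0];
                    double dy = left.get(i)[1] - current[1];
                    double d = dx * dx + dy * dy;
                    if (d < bestDist) { bestDist = d; best = i; }
                }
                current = left.remove(best);        // ... "Connect a with b", and so on
                order.add(current);
            }
            return order;
        }
    }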
See also Flood fill, and the lovely applet under the SO question mapping-a-branching-tile-path.