I am trying to use the Java Robot class to create a bot that automates some tedious tasks for me. I have never used the Robot class before. I have looked up the class in the Java docs and its usage seems straightforward, but I am having trouble finding a certain image (by image, I mean a certain part of the screen) efficiently. Is there any way other than loading 'x' amount of pixels, checking them, checking the next amount, and so on until I find the image I am looking for? Also, is there a list of the Button and MouseButton identifiers needed for the Java Robot class? I cannot find one.
For the mouse button identifiers, use BUTTON1_MASK and the other button mask constants, which are defined in java.awt.event.InputEvent and inherited by java.awt.event.MouseEvent. For example, to click the mouse you would do something like:
Robot r = new Robot();
r.mousePress(MouseEvent.BUTTON1_MASK);
r.mouseRelease(MouseEvent.BUTTON1_MASK);
I believe BUTTON1_MASK is the left mouse button, BUTTON2_MASK is the middle mouse button, and BUTTON3_MASK is the right mouse button, but it has been a month or so since I have used Robot.
As for the checking for an image, I have no idea how that is normally done. But the way you specified in your question where you just check every group of pixels shouldn't be too computationally expensive because you can get the screen image as an array of primitives, then just access the desired pixel with a bit of math. So when checking the "rectangle" of pixels that you are searching for your image in, only keep checking the pixels as long as the pixels keep matching. The moment you find a pixel that does not match, move onto the next "rectangle" of pixels. The probability that you will find a bunch of pixels that match the image that end up not being the image is extremely low, meaning that each rectangle will only need to check about 5 or fewer pixels on average. Any software that performs this task would have to check every pixel on the screen at least once (unless it makes a few shortcuts/assumptions based on probabilities of image variations occurring), and the algorithm I described would check each pixel about 5 times, so it is not that bad to implement, unless you have a huge image to check.
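The approach described above can be sketched roughly as follows. This is a minimal illustration that assumes an exact pixel-for-pixel match; the class name ScreenSearch and the method find are made up for this example. Both images are copied into flat int arrays once, and each candidate rectangle is abandoned at the first mismatching pixel.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of the Robot API.
final class ScreenSearch {

    /** Returns the top-left corner of the first exact match of needle in haystack, or null. */
    static Point find(BufferedImage haystack, BufferedImage needle) {
        int hw = haystack.getWidth(), hh = haystack.getHeight();
        int nw = needle.getWidth(), nh = needle.getHeight();
        if (nw > hw || nh > hh) {
            return null;
        }
        // Copy both images into flat arrays once; pixel (x, y) lives at index y * width + x.
        int[] hay = haystack.getRGB(0, 0, hw, hh, null, 0, hw);
        int[] pin = needle.getRGB(0, 0, nw, nh, null, 0, nw);
        for (int y = 0; y <= hh - nh; y++) {
            for (int x = 0; x <= hw - nw; x++) {
                if (matchesAt(hay, hw, pin, nw, nh, x, y)) {
                    return new Point(x, y);
                }
            }
        }
        return null;
    }

    private static boolean matchesAt(int[] hay, int hw, int[] pin, int nw, int nh, int ox, int oy) {
        for (int y = 0; y < nh; y++) {
            int hayRow = (oy + y) * hw + ox;
            int pinRow = y * nw;
            for (int x = 0; x < nw; x++) {
                if (hay[hayRow + x] != pin[pinRow + x]) {
                    return false; // give up on this rectangle at the first mismatch
                }
            }
        }
        return true;
    }
}
```

The haystack could come from Robot.createScreenCapture(new Rectangle(Toolkit.getDefaultToolkit().getScreenSize())). Anti-aliasing or scaling on screen would defeat an exact comparison like this one.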
Hope this helps!
Related
I am making a game similar to Risk and struggling to find a way to implement the interaction with countries.
The basic idea is to create custom objects that are not rectangular and be able to change their colour by clicking them, highlight them with mouseover, or as the game progresses.
How would I go about having highlight-able countries that can be selected? The problem with sprites is their bounding boxes are rectangular, and if I define Box2D vertices and make polygons it gets really messy. Also, there are a lot of countries so a lot of the platformer style solutions don't fit.
How should I also change the colours of what is selected? Would it be best to have an individual sprite for every country and keep switching between them or is there a better way?
One way is to use polygons like you tried, but I wonder why it got messy and what you mean by that. There are tools out there that let you draw vertices over an image and export them. You will probably need to clean up the data a bit and import it into your app. It's also not very hard to make such a tool yourself: have it import your image, draw the vertices, and export to your favorite format. The more detail you draw into your polygons, the more detail you get.
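As a rough sketch of the polygon idea, the class below tests whether the mouse is inside a country outline. It uses java.awt.Polygon from the JDK as a stand-in for whatever polygon type your game framework offers; CountryShape is a made-up name, and the vertex arrays are assumed to come from an exported outline.

```java
import java.awt.Polygon;

// Hypothetical wrapper around one country's exported outline.
final class CountryShape {
    private final Polygon outline;

    CountryShape(int[] xs, int[] ys) {
        // xs[i], ys[i] is the i-th vertex of the outline, in map coordinates.
        outline = new Polygon(xs, ys, xs.length);
    }

    /** True if the point lies inside the outline, not merely inside its bounding box. */
    boolean contains(int mouseX, int mouseY) {
        return outline.contains(mouseX, mouseY);
    }
}
```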
Perhaps an easier solution would be to use the opacity of each country's image. Each country gets its own image, and you overlap the bounding rectangles to line them all up. When your mouse is hovering over one or more of these bounding boxes, you check whether the mouse is over a transparent pixel. If it is transparent, you are obviously not hovering over the actual country. Some things to consider:
I would create the game in a pixel-perfect manner, so each pixel of your images is translated to a single pixel of the screen you're outputting to.
To align your whole map, I would create one big world map in your drawing application. Then save each country separately but keep the canvas size of the complete map. When packing these images with the LibGDX TexturePacker, remove the whitespace (transparent pixels) and you will get an offset in your atlas. You can use this offset for each country to line them up, and you save precious texture space by removing all that whitespace.
Always check for a simple collision first before diving in deeper.
If you want to have "hover" functionality then don't do pixmap = texture.getTextureData().consumePixmap() each update since it's rather expensive. You might be better off creating your own 2D boolean array that represents the clickable area when you initialize the country object.
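The boolean-array idea from the last point could look roughly like this. The sketch uses a plain java.awt.image.BufferedImage instead of a LibGDX Pixmap so that it is self-contained; CountryMask and its fields are hypothetical names, and the offset is assumed to be the one stored for the country in the atlas.

```java
import java.awt.image.BufferedImage;

// Hypothetical per-country hit mask, built once when the country is created.
final class CountryMask {
    private final boolean[][] solid; // solid[y][x] is true where the image is not transparent
    private final int offsetX;       // where the image's top-left corner sits on the map
    private final int offsetY;

    CountryMask(BufferedImage image, int offsetX, int offsetY) {
        this.offsetX = offsetX;
        this.offsetY = offsetY;
        solid = new boolean[image.getHeight()][image.getWidth()];
        for (int y = 0; y < image.getHeight(); y++) {
            for (int x = 0; x < image.getWidth(); x++) {
                solid[y][x] = (image.getRGB(x, y) >>> 24) != 0; // alpha channel
            }
        }
    }

    /** Cheap bounding-box test first, then the per-pixel lookup. */
    boolean hit(int mapX, int mapY) {
        int x = mapX - offsetX;
        int y = mapY - offsetY;
        if (x < 0 || y < 0 || y >= solid.length || x >= solid[0].length) {
            return false;
        }
        return solid[y][x];
    }
}
```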
I have a screenshot of the screen where I need to find the coordinate of the button's center (approximately). Screenshot and sample buttons in *.png format. I'm assuming a method with this signature:
public Coordinate getBtnCoordinates(BufferedImage src, BufferedImage dst) {
    ...
}

@Data
class Coordinate {
    private int x;
    private int y;
}
In the future this will be used in this way:
Coordinate coordinate = getBtnCoordinates(...);
Robot robot = new Robot();
robot.mouseMove(coordinate.getX(), coordinate.getY());
robot.mousePress(InputEvent.BUTTON1_MASK);
robot.mouseRelease(InputEvent.BUTTON1_MASK); // release the button to complete the click
But after almost a week, my attempts to implement getBtnCoordinates have led nowhere. Please help me implement this method. I will be grateful for any help.
The implementation depends on whether the searched button matches one of the example images exactly or if you need to find "similar" buttons, which is probably a lot harder.
If you search for an exact match, you could scan the screenshot for a pixel that matches the pixel color at a given position in the button. Once you find a matching pixel, compare the other pixels until there is too much mismatch or the whole button matches.
If the screenshots have a single color background in the interesting area, you can use segmentation, i.e. find rectangular areas with content first, then compare only these areas to your example buttons.
If the buttons might vary in size, you probably want to segment the buttons themselves into content and top, left, right and bottom border to be able to do partial matches.
If the matches need to be really fuzzy, you may need more advanced techniques, e.g. use machine learning to train a model of the example buttons.
Without seeing typical example screenshots and buttons for your case, it's hard to provide more concrete advice for this problem.
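For the exact-match case, a minimal sketch might look like the following. It is only an illustration under stated assumptions: ButtonFinder, findButtonCenter and the maxMismatches budget are made-up names, java.awt.Point stands in for the Coordinate class from the question, and pixels are compared for strict equality.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

// Hypothetical helper; returns the approximate center of the button in the screenshot.
final class ButtonFinder {

    static Point findButtonCenter(BufferedImage screenshot, BufferedImage button, int maxMismatches) {
        int sw = screenshot.getWidth(), sh = screenshot.getHeight();
        int bw = button.getWidth(), bh = button.getHeight();
        if (bw > sw || bh > sh) {
            return null;
        }
        int[] s = screenshot.getRGB(0, 0, sw, sh, null, 0, sw);
        int[] b = button.getRGB(0, 0, bw, bh, null, 0, bw);
        for (int y = 0; y <= sh - bh; y++) {
            for (int x = 0; x <= sw - bw; x++) {
                int mismatches = 0;
                compare:
                for (int by = 0; by < bh; by++) {
                    for (int bx = 0; bx < bw; bx++) {
                        if (s[(y + by) * sw + x + bx] != b[by * bw + bx]
                                && ++mismatches > maxMismatches) {
                            break compare; // too much mismatch, try the next position
                        }
                    }
                }
                if (mismatches <= maxMismatches) {
                    return new Point(x + bw / 2, y + bh / 2);
                }
            }
        }
        return null; // no position stayed within the mismatch budget
    }
}
```

With maxMismatches set to 0 this is a strict exact match. A small positive budget tolerates a few differing pixels but does not make the search fuzzy in any deeper sense.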
I am doing my final-year project on speed calculation using a webcam. In this project we want to calculate the speed of an object by taking three sequential images whenever motion is detected. As shown here: Raser Abwehr SpeedCam 2012, three lines (red, blue and green) are drawn, and whenever a vehicle crosses one of them, a snapshot is taken.
My idea is this: suppose my camera resolution is 640*480. I can divide the X axis into three parts of roughly 210px each, which gives three rectangular screens of about 210*480. Whenever a vehicle enters Screen1, the program takes a picture and then starts the detector for the second screen. When the vehicle enters the second screen, it takes a second picture, and finally it detects the vehicle in the third screen and takes a third picture. We then have three pictures and can calculate the speed by the process given here: Calculating Speed using a Webcam
Presently, I am using JavaCV as the image processing library. It is just like running multiple instances of a single Java program to detect motion in different screens. Please suggest how I can do this. Can threads be useful here?
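The three-screen idea can be sketched without threads as a small state machine that watches one strip at a time. This is only an illustration under assumptions: frames are taken to be grayscale int arrays of length width * height (grabbing and converting them with JavaCV is left out), motion is detected by simple frame differencing, and ZoneTrigger with its thresholds is a made-up name.

```java
// Hypothetical detector: waits for motion in zone 0, then zone 1, then zone 2, and so on.
final class ZoneTrigger {
    private final int width, height, zones;
    private final int pixelThreshold;   // minimum brightness change for a pixel to count
    private final int minChangedPixels; // minimum changed pixels for a zone to fire
    private int nextZone = 0;

    ZoneTrigger(int width, int height, int zones, int pixelThreshold, int minChangedPixels) {
        this.width = width;
        this.height = height;
        this.zones = zones;
        this.pixelThreshold = pixelThreshold;
        this.minChangedPixels = minChangedPixels;
    }

    /** Counts pixels in one vertical strip that changed by more than pixelThreshold. */
    int changedPixels(int[] prev, int[] cur, int zone) {
        int x0 = zone * width / zones;
        int x1 = (zone + 1) * width / zones;
        int count = 0;
        for (int y = 0; y < height; y++) {
            for (int x = x0; x < x1; x++) {
                if (Math.abs(cur[y * width + x] - prev[y * width + x]) > pixelThreshold) {
                    count++;
                }
            }
        }
        return count;
    }

    /** Returns the zone that just fired (take a snapshot then), or -1 if nothing fired. */
    int onFrame(int[] prev, int[] cur) {
        if (nextZone >= zones) {
            return -1; // all snapshots have been taken
        }
        if (changedPixels(prev, cur, nextZone) < minChangedPixels) {
            return -1;
        }
        return nextZone++;
    }
}
```

Calling onFrame once per captured frame from a single capture loop would be enough for this sketch. Recording a timestamp each time a zone fires gives the intervals needed for the speed calculation.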
(More like a comment but it doesn't fit)
I'd suggest starting by making it work with three pictures taken at a fixed interval (which you guess).
Then, if you want to handle objects that move at quite different speeds, I'd suggest taking as many pictures as possible for a sufficiently long time once you detect any movement, and figuring out afterwards which ones to use for the analysis.
I can see what you are trying to do, but you should probably start with the simple things first. Just my two cents...
I have an app where the user draws pictures, and these pictures are then converted to PDF. I need to be able to crop out the whitespace before conversion. Originally I kept track of the highest and lowest x and y values (http://stackoverflow.com/questions/13462088/cropping-out-whitespace-from-a-user-drawn-image). This worked for a while, but now I want to give the user the ability to erase. This is a problem because if, for example, the user erases the topmost point, the bounding box would change, but I wouldn't know the new dimensions of the box.
Right now I'm going through the entire image, pixel by pixel, to determine the bounding box. This isn't bad for one image, but I'm going to have ~70, and it's way too slow for 70. I also thought about keeping every pixel in an ArrayList, but I don't feel like that would work well at all.
Is there an algorithm that would help me solve this? Perhaps something already built in? Speed is more important to me than accuracy. If there is some whitespace left on each side it won't be a tragedy.
Thank you so much.
You mentioned that you are keeping track of the min and max values for X and Y co-ordinates (that also seems the solution you have chosen in the earlier question).
In a similar way, you should be able to find the min and max X & Y co-ordinates of the erased area from the erase event...
When the user erases part of the image, you can simply compare the co-ordinates of the erased part with the actual image to find the final co-ordinates.
There is a related problem of trying to see if 2 rectangles overlap:
Determine if two rectangles overlap each other?
You can use similar logic (though slightly different) and figure out the final min/max X & Y values.
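One possible reading of this suggestion, sketched under assumptions: the drawing is available as a boolean grid, the erase event reports the erased rectangle, and the old bounding box lies within the grid. If the erased rectangle does not reach any edge of the current bounding box, the box cannot have changed; otherwise only the old box is rescanned instead of the whole image. InkBounds and afterErase are made-up names.

```java
import java.awt.Rectangle;

// Hypothetical helper; ink[y][x] is true where a pixel is drawn.
final class InkBounds {

    /** Returns the bounding box after an erase; call it after the pixels were cleared. */
    static Rectangle afterErase(boolean[][] ink, Rectangle bounds, Rectangle erased) {
        if (bounds.isEmpty() || !bounds.intersects(erased)) {
            return bounds; // nothing inside the box was erased
        }
        boolean touchesEdge = erased.x <= bounds.x
                || erased.y <= bounds.y
                || erased.x + erased.width >= bounds.x + bounds.width
                || erased.y + erased.height >= bounds.y + bounds.height;
        if (!touchesEdge) {
            return bounds; // an erase strictly inside the box cannot move an edge
        }
        // Rescan only the old bounding box, not the whole image.
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE;
        int maxX = Integer.MIN_VALUE, maxY = Integer.MIN_VALUE;
        for (int y = bounds.y; y < bounds.y + bounds.height; y++) {
            for (int x = bounds.x; x < bounds.x + bounds.width; x++) {
                if (ink[y][x]) {
                    minX = Math.min(minX, x);
                    maxX = Math.max(maxX, x);
                    minY = Math.min(minY, y);
                    maxY = Math.max(maxY, y);
                }
            }
        }
        if (maxX < minX) {
            return new Rectangle(); // everything was erased
        }
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```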
I am planning to develop a jigsaw puzzle game.
I already have the images and the image pieces, so we don't need an algorithm to cut the image into pieces.
On the UI side there would be two sections:
The first section contains the broken images in random order.
The second section contains the outline of the full image. The user needs to drag and drop the cut images onto the outline image.
I am not sure how the pieces can be matched on the outline image.
Any ideas about the algorithm, or pointers on where to start?
Allow the user to drag each piece into the outline area. Allow the piece to be rotated in 90 degree increments.
Option 1:
If a piece is in the correct location in the overall puzzle, and at the correct angle, AND connected to another piece, then snap it into place with some user feedback. The outside edge of the puzzle can count for a connection to edge pieces.
Option 2:
A neighbor is an adjacent puzzle piece when the puzzle is assembled. When the puzzle pieces are mixed up, they still have the same neighbors. Each puzzle piece (except the edge pieces) has four neighbors.
If a piece is near one of its neighbors at the correct angle relative to that neighbor, then snap it to the other piece. Then allow the two (or more) pieces to be dragged around as a unit, as is done with a single piece. This would allow the user to assemble subsections of the puzzle in any area, much like is done with a physical jigsaw puzzle, and connect the subsections with one another.
You can check the piece being moved to its four neighbors to see if they are close enough to snap together. If a piece has its proper edge close enough to the proper edge of its neighbor, at the same angle, then they match.
There are several ways to check relative locations. One way would be to temporarily rotate the coordinates of the piece you are testing so it is upright, then rotate the coordinates of all its desired neighbors, also temporarily, to the same angle. (Use the same center of rotation for all the rotations.) Then you can easily test to see if they are close enough to match. If the user is dragging a subassembly, then you will need to check each unmatched edge in the subassembly.
Option 2 is more complex and more realistic. Option 1 can be further simplified by omitting the rotation of pieces and giving every piece the proper angle initially.
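Both options can be sketched with a small piece class. This is an illustration under assumptions, not a full implementation: JigsawPiece and its fields are made-up names, positions refer to each piece's center, rotation is stored as a count of quarter turns, and the caller is assumed to know which pieces are neighbors. The Option 1 check here leaves out the "connected to another piece" condition for brevity.

```java
// Hypothetical piece model for the snapping checks described above.
final class JigsawPiece {
    final double targetX, targetY; // position of the piece in the solved puzzle
    double x, y;                   // current position on screen
    int quarterTurns;              // rotation in 90-degree steps, 0 = upright

    JigsawPiece(double targetX, double targetY) {
        this.targetX = targetX;
        this.targetY = targetY;
    }

    /** Option 1: snap into place if upright and close enough to the solved position. */
    boolean trySnapToBoard(double tolerance) {
        if (quarterTurns != 0 || Math.hypot(x - targetX, y - targetY) > tolerance) {
            return false;
        }
        x = targetX;
        y = targetY;
        return true;
    }

    /** Option 2: true if this piece sits next to a neighbor at the same angle. */
    boolean fitsNextTo(JigsawPiece neighbor, double tolerance) {
        if (quarterTurns != neighbor.quarterTurns) {
            return false;
        }
        // Offset between the two pieces in the solved puzzle...
        double dx = neighbor.targetX - targetX;
        double dy = neighbor.targetY - targetY;
        // ...rotated by the shared angle (the turn direction is an assumed convention).
        for (int i = 0; i < quarterTurns; i++) {
            double t = dx;
            dx = -dy;
            dy = t;
        }
        return Math.hypot((neighbor.x - x) - dx, (neighbor.y - y) - dy) <= tolerance;
    }
}
```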
For regular shapes you can go with a matrix. I recommend this as the first approach. Dividing the puzzle is as simple as defining the X,Y dimensions of the matrix. For each piece you then have a series of four values, one for each side, saying whether it is flat, pointing out, or pointing in. This will give you a very classic jigsaw puzzle setup.
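A minimal sketch of such a matrix, assuming the side order top, right, bottom, left and random tabs on interior edges (PuzzleGrid and its members are made-up names): border sides are flat, and every interior side is the opposite of the neighboring piece's matching side.

```java
import java.util.Random;

// Hypothetical generator for the per-piece edge values described above.
final class PuzzleGrid {
    enum Edge { FLAT, OUT, IN }

    /** edges[row][col] holds four values in the order top, right, bottom, left. */
    static Edge[][][] generate(int rows, int cols, Random rnd) {
        Edge[][][] edges = new Edge[rows][cols][4];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Top and left mirror the neighbors that were already generated.
                edges[r][c][0] = r == 0 ? Edge.FLAT : opposite(edges[r - 1][c][2]);
                edges[r][c][3] = c == 0 ? Edge.FLAT : opposite(edges[r][c - 1][1]);
                // Right and bottom are chosen freely unless they lie on the border.
                edges[r][c][1] = c == cols - 1 ? Edge.FLAT : randomTab(rnd);
                edges[r][c][2] = r == rows - 1 ? Edge.FLAT : randomTab(rnd);
            }
        }
        return edges;
    }

    static Edge opposite(Edge e) {
        return e == Edge.OUT ? Edge.IN : e == Edge.IN ? Edge.OUT : Edge.FLAT;
    }

    static Edge randomTab(Random rnd) {
        return rnd.nextBoolean() ? Edge.OUT : Edge.IN;
    }
}
```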
How the pieces actually look becomes a strict GUI thing. Now, for the first draft I recommend getting it working with perfectly square pieces. Taking rectangular bits of an image should be easy to do in any GUI framework.
To go to shaped pieces you'll need a series of templates. These will become masks that you apply to the image. Each mask clips out a tiny portion of the image to produce your piece. You'll probably need to dynamically create the masks in order to fit them to the puzzle. At first start with simply triangular connections. Once you have that working you can do the math to get nice bulbous connector shapes. Look up "clip" and "mask" in your GUI framework.
If you wish to do irregular polygon shapes that don't follow a general matrix layout, then you need to do a lot more work. This is why I recommend getting the square version working first as a good example. For irregular shapes you'll need to delve into graph theory and partitioning. Pick up some books on 3D programming, focusing on algorithms, as they do partitioning all the time. I wouldn't be surprised if there is a book covering this exact topic.
Have fun.
The data structure is simple, I guess: each piece will point to its neighbors and will hold the actual shape to display.
As for the MMI (UI) of the app: what is your development environment?
If it's Windows, I would go with C# and WinForms or, even better, WPF.
If it's Unix, you'll have to get someone else's advice, as I'm not an expert there.
1) How to break image into random polygons
It seems that you have figured out this part. (from : "Now I already have images and image pieces, so we don't need algorithm to cut the image in pieces.")
2) what kind of data structure can solve the problem
You can create a Class Piece like Scribble class in this example and your pieces would be array of objects of Piece class.
So, you will have two arrays,
(i) actual image pieces array
(ii) image piece outline array
So, whenever you drag and drop a piece onto the full outline of the image, check whether the image piece object intersects its outline by more than 80% and whether the ID (a member variable of the Piece object) of the actual image piece matches the ID of the image piece outline. If both hold, you have the right piece in the right place...
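The check above could be sketched like this, assuming each piece and each outline slot is approximated by its bounding rectangle (real piece shapes would need a finer test). PieceMatcher and isPlacedCorrectly are made-up names; the 0.8 comes from the 80% mentioned above.

```java
import java.awt.Rectangle;

// Hypothetical drop check: IDs must match and the overlap must exceed 80% of the slot.
final class PieceMatcher {

    static boolean isPlacedCorrectly(int pieceId, Rectangle pieceBounds,
                                     int slotId, Rectangle slotBounds) {
        if (pieceId != slotId) {
            return false;
        }
        Rectangle overlap = pieceBounds.intersection(slotBounds);
        if (overlap.isEmpty()) {
            return false;
        }
        double overlapArea = (double) overlap.width * overlap.height;
        double slotArea = (double) slotBounds.width * slotBounds.height;
        return overlapArea / slotArea > 0.8;
    }
}
```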
3) UI implementation
Check this out.
You could make an array of objects of the class "PuzzleTile"
Every such tile has an image and an integer
After every move, check if the integers are sorted correctly, means:
123
456
789
You could make a function for that which returns a bool.
Note: I'm currently developing in C#, so this concept is probably easiest for me to realize in C#, although other platforms should need little or no modification.
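The sorted-integers check could look roughly like this. It is written in Java here and the image field is left out; PuzzleTile.isSolved is a made-up name, and the tiles are assumed to be stored row by row and numbered from 1.

```java
// Hypothetical tile class; only the number matters for the solved check.
final class PuzzleTile {
    final int number; // position of the tile in the solved puzzle, starting at 1

    PuzzleTile(int number) {
        this.number = number;
    }

    /** True when the tiles, read row by row, are numbered 1, 2, 3, ... in order. */
    static boolean isSolved(PuzzleTile[] tiles) {
        for (int i = 0; i < tiles.length; i++) {
            if (tiles[i].number != i + 1) {
                return false;
            }
        }
        return true;
    }
}
```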