I have a processing algorithm which performs well if I process each color channel separately, but when I tried to process the whole pixel value, things got messed up and the results are not good. Now I want to isolate the 3 color channels from the pixel value (excluding alpha) and then work on the new value (the 3 channels).
How can I do that in C++? Note that I tried the RGB_565 bitmap format, which is not a good solution, and that I want to merge the RGB into a 24-bit variable.
You can access each channel separately. The exact way depends on the actual pixel format.
ANDROID_BITMAP_FORMAT_RGBA_8888: each pixel is 4 bytes long and the layout pattern is RGBARGBA..., i.e. the 1st byte of a pixel is the red component, the 2nd is green, the 3rd is blue and the 4th is the alpha component.
ANDROID_BITMAP_FORMAT_RGB_565: each pixel is 2 bytes long, stored in native endianness, so the color components can be extracted as follows:
red = (u16_pix >> 11) & 0x1f;
green = (u16_pix >> 5) & 0x3f;
blue = (u16_pix >> 0) & 0x1f;
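After processing, packing the components back is just the reverse, e.g. (a sketch; the masks simply guard against out-of-range values):
u16_pix = ((red & 0x1f) << 11) | ((green & 0x3f) << 5) | (blue & 0x1f);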
ANDROID_BITMAP_FORMAT_RGBA_4444:
is deprecated because of poor quality, you shouldn't even think about this one
ANDROID_BITMAP_FORMAT_A_8:
is 1 byte per pixel and designed for alpha-only or grayscale images. It is probably not what you are looking for.
Note that Android has no 24bpp format; you must choose a 32bpp or 16bpp one. About your algorithm: there are two alternatives - the code may access the individual components right inside the packed pixel value, or you may deinterleave the packed pixels into a few planes, i.e. arrays, each of which holds only one channel. After processing you can interleave them again into one of the supported formats, or transform them into some other format you are interested in.
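For illustration, here is a minimal sketch of the "work inside the packed value" alternative, written against the packed ARGB int layout that Bitmap.getPixels() returns on the Java side; the same shifts and masks carry over to C++, but with a native RGBA_8888 buffer the bytes are ordered R,G,B,A in memory, so adjust them to the layout you actually lock. myAlgo is just a placeholder for your own processing.

// Split a packed pixel, merge R/G/B into one 24-bit value, run the
// (hypothetical) per-pixel algorithm, then pack everything back.
int processPixel(int pixel) {
    int a = (pixel >> 24) & 0xFF;
    int r = (pixel >> 16) & 0xFF;
    int g = (pixel >>  8) & 0xFF;
    int b =  pixel        & 0xFF;

    int rgb24 = (r << 16) | (g << 8) | b;   // the 3 channels only, alpha excluded

    rgb24 = myAlgo(rgb24);                  // placeholder for your processing

    r = (rgb24 >> 16) & 0xFF;               // unpack the processed value
    g = (rgb24 >>  8) & 0xFF;
    b =  rgb24        & 0xFF;
    return (a << 24) | (r << 16) | (g << 8) | b;   // restore the original alpha
}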
The Java BufferedImage class has a long list of class constants known as image types, which can be used as an argument to the BufferedImage constructor.
However, the Java docs give only a minimal explanation of what these image types are used for and how they affect the BufferedImage to be created.
My question is:
How does the image type affect the BufferedImage to be created? Does it control the number of bits used to store the various colors (red, green, blue) and the transparency?
Which image type should we use if we just want to create
an opaque image
a transparent image
a translucent image
I read the description in the Java Doc many times, but just couldn't figure out how we should use it. For example, this one:
TYPE_INT_BGR
Represents an image with 8-bit RGB color components, corresponding to a Windows- or Solaris- style BGR color model, with the colors Blue, Green, and Red packed into integer pixels. There is no alpha. The image has a DirectColorModel. When data with non-opaque alpha is stored in an image of this type, the color data must be adjusted to a non-premultiplied form and the alpha discarded, as described in the AlphaComposite documentation.
Unless you have specific requirements (for example saving memory, saving computations, or a specific native pixel format), just go with the default TYPE_INT_ARGB, which has 8 bits per channel, 3 channels + alpha.
Skipping the alpha channel when working with 8 bits per channel won't reduce the total memory occupied by the image, since every pixel is packed into an int in any case, so 8 bits are simply left unused.
Basically you have:
TYPE_INT_ARGB, 4 bytes per pixel with alpha channel
TYPE_INT_ARGB_PRE, 4 bytes per pixel, same as before but colors are already multiplied by the alpha of the pixel to save computations
TYPE_INT_RGB, 4 bytes per pixel without alpha channel
TYPE_USHORT_555_RGB and TYPE_USHORT_565_RGB, 2 bytes per pixel, far fewer colors; no need to use them unless you have memory constraints
Then there are all the same kinds of formats with swapped channels (e.g. BGR instead of RGB). You should choose the one native to your platform so that fewer conversions are needed.
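As a quick illustration of the practical difference, a minimal sketch (class name and sizes are arbitrary): an opaque image can use TYPE_INT_RGB, while transparent or translucent drawing needs a type with an alpha channel such as TYPE_INT_ARGB. Regardless of the internal type, getRGB() always hands the pixel back as a packed ARGB int.

import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

public class ImageTypes {
    public static void main(String[] args) {
        // Opaque image: no alpha is stored, every pixel is fully opaque.
        BufferedImage opaque = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);

        // Transparent/translucent image: alpha is stored per pixel.
        BufferedImage translucent = new BufferedImage(100, 100, BufferedImage.TYPE_INT_ARGB);

        Graphics2D g = translucent.createGraphics();
        g.setColor(new Color(255, 0, 0, 128)); // 50% translucent red
        g.fillRect(0, 0, 100, 100);
        g.dispose();

        int argb = translucent.getRGB(10, 10);
        System.out.println("alpha = " + ((argb >>> 24) & 0xFF)); // the stored per-pixel alpha (about 128 here)
    }
}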
I'm trying to figure out a geo-hashing method for images. It is hard because the space of possible images is of much higher dimensionality than lat/lng. (geo-hashing converts a location to a string where the string progressively refines the location)
So, what I need is something that:
INPUT: A list of JPG or PNG images on disk
OUTPUT: For each image, a string such that the longer the prefix any two strings have in common, the higher the chance that the two images are the same.
It doesn't need to be perfect, and it doesn't need to handle extreme cases, like cropped images or heavily adjusted images. It is intended for multiple copies of the same image at different resolutions and compression levels.
I can't use:
File or image-data hashing, because even a teeny change between two images makes a completely different hash and you don't get any proximity
Image subtraction, because it won't be an N-to-N comparison.
I've read in other answers to try wavelet compression or a Laplacian/Gaussian pyramid, but I'm not sure how to implement those in Java or Python. However, I have made progress!
Resize to 32x32 using the technique from http://today.java.net/pub/a/today/2007/04/03/perils-of-image-getscaledinstance.html so as not to discard data. It's OK that everything gets turned into a square.
Create a pyramid of successively smaller thumbnails all the way down to 2x2.
In the 2x2, encode a string of "is the next pixel brighter than the current? If so, 1, else 0". (This throws away all hue and saturation; I may want to use hue somehow.)
Encode successive binary numbers the same way from the 8x8 and 32x32 pyramid levels.
Convert the big binary number to some higher radix representation, like Base62.
This seems to work well! Minor differences from compression or color balancing aren't enough to flip an "is the left side of this area brighter than the right side?" bit. However, I think I'm re-inventing the wheel; some sort of progressive encoding might be better? SIFT and other feature detection is overkill, since I don't need to handle cropping or rotation.
How about this: the hash string is made up of groups of three characters, representing red, green and blue:
{R0, G0, B0}, {R1, G1, B1}, {R2, G2, B2}, ...
For each group, the image is resized to a 2^N by 2^N square. Then, the value is the sum (mod, say, 255, or whatever your encoding is) of the differences in intensity of each of the colours over some walk through the pixels.
So as a small example, to compute e.g. group 1 (the 2x2 image) one might use the following code (I have only bothered with the red channel):
int rSum = 0;
int rLast = 0;
for (int i = 0; i < 2; i++) {
    for (int j = 0; j < 2; j++) {
        rSum += Math.abs(image[i][j].r - rLast);
        rLast = image[i][j].r;
    }
}
rSum %= 255;
I believe this has the property that similar images should be close to each other, both for each character in the hash and in terms of successive characters in the hash.
Although for higher values of N the chance of a collision gets higher (many images will have the same sum-of-difference values for R, G and B intensities across them), each successive iteration should reveal new information about the image that was not tested by the previous iteration.
Could be fairly computationally expensive, but you have the advantage (which I infer from your question you might desire) that you can end the computation of the hash as soon as a negative is detected within a certain threshold.
Just an idea, let me know if I wasn't clear!
What you're describing seems to me to be an example of Locality-Sensitive Hashing applied to the image similarity problem.
I'm not sure that the common prefix property is desirable for a good hash function. I would expect a good hash function to have two properties:
1) Good localization - for images I1 and I2, norm(Hash(I1) - Hash(I2)) should represent the visually perceived similarity of I1 and I2.
2) Good compression - the high-dimensional image data should be embedded in the low-dimensional space of hashes in the most discriminative way.
Getting good results from the following:
Scale down (using good scaling that doesn't discard information) to three images:
1x7
7x1
and a 6x6 image.
Convert all to grayscale.
For each image, do the "is the next pixel brighter? '1' : '0'" encoding, and output as base62.
Those outputs become the values for three columns. Nice successively refined differencing, packed into 2 chars, 2 chars, and 6 chars. True, discards all color, but still good!
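Something like this minimal sketch of that pipeline (bilinear scaling stands in for the higher-quality scaling from the linked article, base62 is done with a simple digit lookup, and the three parts are just joined with a separator):

import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.math.BigInteger;

public class TinyImageHash {
    static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // One part of the hash: scale, grayscale, "is the next pixel brighter?" bits, base 62.
    static String hashPart(BufferedImage src, int w, int h) {
        BufferedImage small = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = small.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();

        StringBuilder bits = new StringBuilder("1"); // leading 1 so leading zero bits survive
        int prev = -1;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = small.getRGB(x, y);
                // Perceptual luma: throws away hue and saturation.
                int luma = (299 * ((rgb >> 16) & 0xFF)
                          + 587 * ((rgb >> 8) & 0xFF)
                          + 114 * (rgb & 0xFF)) / 1000;
                if (prev >= 0) bits.append(luma > prev ? '1' : '0');
                prev = luma;
            }
        }

        // Re-encode the bit string in base 62.
        BigInteger n = new BigInteger(bits.toString(), 2);
        StringBuilder out = new StringBuilder();
        do {
            BigInteger[] qr = n.divideAndRemainder(BigInteger.valueOf(62));
            out.insert(0, ALPHABET.charAt(qr[1].intValue()));
            n = qr[0];
        } while (n.signum() > 0);
        return out.toString();
    }

    static String hash(BufferedImage src) {
        return hashPart(src, 1, 7) + "|" + hashPart(src, 7, 1) + "|" + hashPart(src, 6, 6);
    }
}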
I'm a complete beginner to programming and I've been trying to figure this out for a while but I'm lost. There are a few different versions of the question, but I think I can figure the rest out after I have one finished piece of code, so I'm just going to explain this one. The first part asks me to write a program using DrJava that will display an image, wait for a user response, and then reduce the image to have only 4 levels per color channel. It goes on to say this:
"What we want to do is reduce each color channel from the range 0-255 (8 bits) to the range 0-3 (2 bits). We can do this by dividing the color channel value by 64. However, since our actual display still uses 1 byte per color channel, a values 0-3 will all look very much like black (very low color intensity). To make it look right, we need to scale the values back up to the original range (multiply by 64). Note that, if integer division is used, this means that only 4 color channel values will occur: 0, 64, 128 and 192, imitating a 2-bit color palate."
I don't even get where I'm supposed to put the picture and get it to load from. Basically I need it explained like I'm five. Thanks in advance!
The Java API documentation will be your best resource.
You can read a BufferedImage via ImageIO.read(File).
A BufferedImage is an Image, so you can display it as part of a JLabel or JButton.
A BufferedImage can be created with different ColorModels: RGB, BGR, ARGB, one byte per colour, indexed colours and so on. Here you want to copy one BufferedImage to another with a different ColorModel.
Basically you can create a new BufferedImage with the desired ColorModel, then call:
Graphics g = otherImg.getGraphics();
g.drawImage(originalImg, ...);
ImageIO.write(otherImg, ...);
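For the specific exercise in the question above (4 levels per channel), the quantization itself is just integer-divide by 64 and multiply back by 64 on every channel of every pixel. A minimal sketch, leaving out the display/wait-for-keypress part (file names are placeholders, and plain ImageIO is used rather than any course-specific Picture class):

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class ReduceColors {
    public static void main(String[] args) throws Exception {
        BufferedImage img = ImageIO.read(new File("picture.jpg")); // load from disk

        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int argb = img.getRGB(x, y);
                int a = (argb >>> 24) & 0xFF;
                int r = (argb >>> 16) & 0xFF;
                int g = (argb >>>  8) & 0xFF;
                int b =  argb         & 0xFF;
                // Integer division maps 0-255 to 0-3; multiplying back gives 0, 64, 128 or 192.
                r = (r / 64) * 64;
                g = (g / 64) * 64;
                b = (b / 64) * 64;
                img.setRGB(x, y, (a << 24) | (r << 16) | (g << 8) | b);
            }
        }
        ImageIO.write(img, "png", new File("picture_4levels.png")); // save the result
    }
}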
I currently have a Java program that will get the RGB values for each of the pixels in an image. I also have a method to calculate a Haar wavelet on a 2D matrix of values. However, I don't know which values I should give to the method that calculates the Haar wavelet. Should I average each pixel's RGB value and compute a Haar wavelet on that? Or maybe just use one of R, G, B?
I am trying to create a unique fingerprint for an image. I read elsewhere that this was a good method as I can take the dot product of 2 wavelets to see how similar the images are to each other.
Please let me know what values I should be computing a Haar wavelet on.
Thanks
Jess
You should regard the R/G/B components as different images: create one matrix for R, G and B each, then apply the wavelet to each of those independently.
You then reconstruct the R/G/B images from the 3 wavelet-compressed channels and finally combine those into a 3-channel bitmap.
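If it helps, a minimal sketch of that split (the 2-D Haar routine itself is assumed to be the method you already have; you would call it on each returned plane independently):

// Split a BufferedImage into three h x w matrices, one per channel.
static double[][][] toPlanes(java.awt.image.BufferedImage img) {
    int w = img.getWidth(), h = img.getHeight();
    double[][] r = new double[h][w], g = new double[h][w], b = new double[h][w];
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            int rgb = img.getRGB(x, y);
            r[y][x] = (rgb >> 16) & 0xFF;
            g[y][x] = (rgb >>  8) & 0xFF;
            b[y][x] =  rgb        & 0xFF;
        }
    }
    return new double[][][] { r, g, b };
}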
Since eznme didn't answer your question (You want fingerprints, he explains compression and reconstruction), here's a method you'll often come across:
You separate color and brightness information (chrominance and luma), and weigh them differently. Sometimes you'll even throw away the chrominance and just use the luma part. This reduces the size of your fingerprint significantly (~factor three) and takes into account how we perceive an image - mainly by local brightness, not by absolute color. As a bonus you gain some robustness concerning color manipulation of the image.
The separation can be done in different ways, e.g. transforming your RGB image to YUV or YIQ color space. If you only want to keep the luma component, these two color spaces are equivalent. However, they encode the chrominance differently.
Here's the linear transformation for the luma Y from RGB:
Y = 0.299*R + 0.587*G + 0.114*B
When you take a look at the mathematics, you notice that we're doing nothing other than creating a grayscale image - taking into account that we perceive green as brighter than red, and red as brighter than blue, when they are all numerically equal.
In case you want to keep a bit of chrominance information in order to keep your fingerprint as concise as possible, you could reduce the resolution of the two U, V components (each actually 8 bits). You could join them both into one 8-bit value by reducing their information to 4 bits each and combining them with the shift operator (don't know how that works in Java). The chrominance should weigh less than the luma in the final fingerprint-distance calculation (the dot product you mentioned).
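A small sketch of both steps in Java (assuming U and V have already been computed by whatever RGB-to-YUV conversion you use and are offset into the 0-255 range):

// Perceptual luma from 8-bit R, G, B (integer form of Y = 0.299*R + 0.587*G + 0.114*B).
static int luma(int r, int g, int b) {
    return (299 * r + 587 * g + 114 * b) / 1000;
}

// Keep only the top 4 bits of each chroma component and join them into one byte:
// the high nibble holds U, the low nibble holds V.
static int packChroma(int u, int v) {
    return (u & 0xF0) | ((v & 0xF0) >> 4);
}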
I'm trying to replicate some image filtering software on the Android platform. The desktop version works with bmps but crashes out on png files.
When I come to XOR two pictures (the 32-bit ints of each corresponding pixel) I get very different results from the two pieces of software.
I'm sure my code isn't wrong, as it's such a simple task, but here it is:
const int aMask = 0xFF000000;
int xOrPixels(int p1, int p2) {
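// XOR the packed 32-bit pixels, then OR in aMask so the alpha byte stays fully opaque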
return (aMask | (p1 ^ p2) );
}
The definition for the JAI library used by the Java desktop software can be found here and states:
The destination pixel values are defined by the pseudocode:
dst[x][y][b] = srcs[0][x][y][b] ^ srcs[1][x][y][b];
Where the b is for band (i.e. R,G,B).
Any thoughts? I have a similar problem with AND and OR.
Here is an image with the two source images XOR'd at the bottom, on Android, using a PNG. The same file as a bitmap, XOR'd, gives me a bitmap filled with 0xFFFFFFFF (white) - no visible pixels at all. I checked the binary values in the Android app and they seem right to me....
Gav
NB: when I say (same 32-bit ARGB representation) I mean that Android allows you to decode a PNG file to this format. While this might leave room for some error (is PNG lossless?), I get completely different colours in the output.
I checked a couple of values from your screenshot.
The input pixels:
Upper left corners, 0xc3cbce^0x293029 = 0xeafbe7
Nape of the neck, 0xbdb221^0x424dd6 = 0xfffff7
are very similar to the corresponding output pixels.
Looks to me like you are XORing two images that are closely related (inverted in each color channel), so, necessarily, the output is near 0xffffff.
If you were to XOR two dissimilar images, perhaps you will get something more like what you expect.
The question is, why do you want to XOR pixel values?
The png could have the wrong gamma or color space, and it's getting converted on load, affecting the result. Some versions of Photoshop had a bug where they saved pngs with the wrong gamma.
What are you doing prior to the code posted?
PNG is a compressed format, using the deflate algorithm (See Section 5 of RFC2083), so if you're just doing binary reads, you're not looking at actual pixels.
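In other words, both inputs have to go through a real decoder before the XOR. A minimal desktop-side sketch (file names are placeholders, and both images are assumed to have the same dimensions): ImageIO decodes PNG and BMP alike, and getRGB() returns the same packed ARGB layout for both, so the XOR then operates on actual pixels.

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;

public class XorImages {
    public static void main(String[] args) throws Exception {
        // Decode to pixels first - never XOR the raw (deflate-compressed) file bytes.
        BufferedImage a = ImageIO.read(new File("first.png"));
        BufferedImage b = ImageIO.read(new File("second.png"));

        BufferedImage out = new BufferedImage(a.getWidth(), a.getHeight(),
                                              BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                // XOR the packed ARGB values of corresponding pixels.
                out.setRGB(x, y, a.getRGB(x, y) ^ b.getRGB(x, y));
            }
        }
        ImageIO.write(out, "png", new File("xor.png"));
    }
}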