ImageJ: stretchHistogram vs equalize - java

I am working on a project where I have to display some grayscale pictures, and I noticed that many of them are too dark to see properly.
Looking at the ImageJ API documentation, I found the class ij.plugin.ContrastEnhancer.
It has two methods whose conceptual difference I am having a hard time understanding: stretchHistogram() and equalize(). Both make the image brighter, but I still want to understand how they differ.
My question is: what is the conceptual difference between those methods?

A histogram stretch is where you have an image with a low dynamic range - all of the pixel intensities are concentrated in a smaller band than the 0 to 255 range of an 8-bit greyscale image, for example. So the darkest pixel in the image may be 84 and the brightest 153. Stretching simply takes this narrow range and performs a linear mapping onto the full 0 to 255 range.
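As a very rough sketch (not ImageJ's actual implementation), that linear mapping for 8-bit data could look like the following, where pixels is an assumed array holding one intensity value (0..255) per pixel:
// Minimal sketch of a histogram stretch on an 8-bit greyscale image.
int min = 255, max = 0;
for (int v : pixels) {                 // find the darkest and brightest values
    if (v < min) min = v;
    if (v > max) max = v;
}
if (max > min) {                       // avoid dividing by zero on a flat image
    for (int i = 0; i < pixels.length; i++) {
        // Linear mapping of [min, max] onto [0, 255].
        pixels[i] = (pixels[i] - min) * 255 / (max - min);
    }
}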
Histogram equalisation attempts to achieve a flat histogram, so that all possible pixel intensities are equally represented in the image. This means that where there are peaks in the histogram - concentrations of values in a certain range - these are expanded to cover a wider range so that the peak is flattened, and where there are troughs in the histogram, these are mapped to a narrower range so that the trough is levelled out.
For a uni-modal histogram with a low dynamic range, the two operations are roughly equivalent, but in cases where the histogram already covers the full range of intensities, histogram equalisation gives a useful visual improvement while stretching does nothing (because there's nothing to stretch). The mapping curve used to equalise a histogram is derived from the cumulative distribution (so imagine each histogram bar being the sum of all previous values), and theoretically it's possible to achieve a perfectly flat histogram. However, because we are (normally) dealing with discrete pixel intensities, histogram equalisation only gives an approximation to a flat histogram.
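To make the cumulative-distribution idea concrete, here is a minimal sketch of 8-bit histogram equalisation (again not ImageJ's actual code; pixels is the same assumed intensity array as above):
// Minimal sketch of histogram equalisation on an 8-bit greyscale image.
int[] hist = new int[256];
for (int v : pixels) hist[v]++;        // build the histogram

int[] cdf = new int[256];              // cumulative distribution function
int running = 0;
for (int i = 0; i < 256; i++) {
    running += hist[i];
    cdf[i] = running;
}

// Remap each intensity so the cumulative distribution becomes roughly linear,
// which flattens the histogram as far as discrete values allow.
for (int i = 0; i < pixels.length; i++) {
    pixels[i] = (int) ((long) cdf[pixels[i]] * 255 / pixels.length);
}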

Related

Removing lakes from diamond square map

I implemented the diamond square algorithm in Java, but I'm not entirely satisfied with the results as a height map. It forms a lot of "lakes": small areas of low height. The heights are generated using the diamond square algorithm and then normalized. In the example below, white = high, black = low, and blue is anything below height 15: a placeholder for oceans.
This image shows the uncolored height map
How can I smooth the terrain to reduce the number of lakes?
I've investigated a simple box blurring function (setting each pixel to the average of its neighbors), but this causes strange artifacts, possibly because of the square step of the diamond square.
Would a different (perhaps gaussian) blur be appropriate, or is this a problem with my implementation? This link says the diamond square has some inherent issues, but these don't seem to be regularly spaced artifacts, and my heightmap is seeded with 16 (not 4) values.
Your threshold algorithm needs to be more logical. You need to actually specify what is to be removed in terms of size, not just height. Basically the simple threshold sets a "sea level", and anything below this level will be water. The problem is that because the algorithm used to generate the terrain does so in a haphazard way, small areas can end up filled with water.
To fix this you need to essentially determine the size of regions of water and only allow larger areas.
One simple way to do this is to not allow single "pixels" to represent water. Either do not mark them as water (you could use a bitmap where each bit represents whether there is water or not) or simply raise their level. This should get most of the single pixels out of your image and clear it up quite a bit.
You can extend this to N pixels (essentially representing area). Basically you have to identify the size of each region of water by counting connected pixels. The problem with this is that it allows long, thin regions (which could represent rivers).
So it is better to take it one step further and count the width and length separately.
For example, the following check (with the elided terms covering the remaining neighbours) will detect isolated water pixels:
if map[i,j] < threshold && (map[i-1,j-1] > threshold && ... && map[i+1,j+1] > threshold) then Area = 1
You can modify this to detect larger groups and write a generic algorithm to measure any size of potential "oceans"... then it should be simple to generate a height map with any minimum (and maximum) ocean size you want. The next step is to "fix up" (or mark in a bitmap) the parts of the map that are below sea level but did not convert to actual water, since we generally expect things below sea level to contain water. By using a bitmap you can allow for water in water or water in land, etc.
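As a minimal sketch of that region-measuring step (the names map and threshold and the 4-connectivity are assumptions for illustration), a flood fill can return the size of each connected water region; regions smaller than your minimum can then be raised above sea level or cleared in the bitmap:
import java.util.ArrayDeque;

// Measures the size of the water region containing (startX, startY),
// marking visited cells so each region is only counted once.
static int regionSize(double[][] map, boolean[][] visited,
                      int startX, int startY, double threshold) {
    if (map[startX][startY] >= threshold || visited[startX][startY]) return 0;
    ArrayDeque<int[]> stack = new ArrayDeque<>();
    stack.push(new int[] {startX, startY});
    visited[startX][startY] = true;
    int size = 0;
    while (!stack.isEmpty()) {
        int[] cell = stack.pop();
        size++;
        int[][] neighbours = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        for (int[] d : neighbours) {
            int nx = cell[0] + d[0], ny = cell[1] + d[1];
            if (nx >= 0 && ny >= 0 && nx < map.length && ny < map[0].length
                    && !visited[nx][ny] && map[nx][ny] < threshold) {
                visited[nx][ny] = true;
                stack.push(new int[] {nx, ny});
            }
        }
    }
    return size;
}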
If you use smoothing it might work just as well, but you will still always run into such problems. Smoothing reduces the size of the "oceans", but a large ocean might turn into a small one, and a small one eventually into a single pixel. Depending on the overall average of the map, you might end up with all water or all land after enough iterations. Blurring also reduces the detail of the map.
The good news is that if you design your algorithm with controllable parameters, then you can control things like how many oceans are in the map, how large they are, how square (or how circular) they are, how much total water is used, and so on.
The more effort you put into this, the more accurately you can simulate reality. Ultimately, if you want to be infinitely complex, you can take into account how terrains are actually formed, and so on. But, of course, the whole point of these simple algorithms is that they can be computed in reasonable amounts of time.

Programmatically find shaky OR out-of-focus Images

Most modern mobile cameras have a family of techniques called image stabilization to reduce shaky effects in photographs due to the motion of the camera lens or associated hardware. But quite a number of mobile cameras still produce shaky photographs. Is there a reliable algorithm or method that can be implemented on mobile devices, specifically on Android, for finding whether a given input image is shaky or not? I do not expect the algorithm to stabilize the input image, but it should reliably return a definitive boolean indicating whether the image is shaky or not. It doesn't have to be Java; it can also be C/C++ so that one can build it through the native development kit and expose the APIs to the top layer. The following illustration describes the expected result. Also, this question deals with single-image problems, so solutions based on multiple frames won't work in this case. It is specifically about images, not videos.
Wouldn't out-of-focus images imply that:
a) edges are blurred, so any gradient-based operator will have low values compared to the luminance in the image;
b) edges are blurred, so any curvature-based operator will have low values;
c) for shaky pictures, the pixels will be correlated with other pixels in the direction of the shake (a translation or a rotation)?
I opened your picture in GIMP, applied Sobel for a) and Laplacian for b) (both available in OpenCV), and got images that are a lot darker in the upper portion.
Calibrating thresholds for general images would be quite difficult I guess.
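To make a) and b) measurable in code, one rough sketch (not a calibrated solution) is to threshold the variance of the Laplacian response using OpenCV's Java bindings; the threshold value here is purely an assumption that would need tuning per use case:
import org.opencv.core.Core;
import org.opencv.core.CvType;
import org.opencv.core.Mat;
import org.opencv.core.MatOfDouble;
import org.opencv.imgcodecs.Imgcodecs;
import org.opencv.imgproc.Imgproc;

public class BlurCheck {
    static { System.loadLibrary(Core.NATIVE_LIBRARY_NAME); }

    // Returns true if the image looks blurred: a low variance of the
    // Laplacian response means few strong edges survived.
    static boolean looksBlurred(String path, double threshold) {
        Mat gray = Imgcodecs.imread(path, Imgcodecs.IMREAD_GRAYSCALE);
        Mat lap = new Mat();
        Imgproc.Laplacian(gray, lap, CvType.CV_64F);
        MatOfDouble mean = new MatOfDouble();
        MatOfDouble stdDev = new MatOfDouble();
        Core.meanStdDev(lap, mean, stdDev);
        double variance = Math.pow(stdDev.get(0, 0)[0], 2);
        return variance < threshold; // e.g. ~100, but this needs tuning
    }
}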
Are you dealing with a video stream or a single image?
In the case of a video stream: the best way is to calculate the difference between each two adjacent frames and mark each pixel where the difference exceeds some threshold. When the amount of such pixels is low, you are in a non-shaky frame. Note that this method does not check whether the image is in focus; it is only designed to combat motion blur in the image.
Your implementation should include the following (a rough sketch follows these steps):
For each frame 'i' - normalize the image (work with gray level, when working with floating points normalize the mean to 0 and standard deviation to 1)
Save the previous video frame.
On each new video frame, calculate the pixel-wise difference between the images and count the amount of pixels for which the difference exceeds some threshold. If the amount of such pixels is too high (say > 5% of the image), that means the movement between the previous frame and the current frame is big and you should expect motion blur. When a person holds the phone firmly, you will see a sharp drop in the amount of pixels that changed.
If your images are represented not in floating point but in fixed point (say 0..255), then you can match the histograms of the images prior to subtraction in order to reduce noise.
As long as you are getting images with motion, just drop those frames and display a message to the user: "hold your phone firmly". Once you get a good stabilized image, process it, but keep the previous one and do the subtraction for each video frame.
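A minimal sketch of that frame-difference check, in plain Java on already-grayscaled, normalized frames (the per-pixel threshold and the 5% ratio are just the assumptions from the steps above):
// Returns true when too many pixels changed between two normalized grayscale
// frames, i.e. the camera probably moved and motion blur is likely.
static boolean isShaky(double[][] prevFrame, double[][] currFrame,
                       double pixelThreshold, double maxChangedRatio) {
    int changed = 0;
    int total = prevFrame.length * prevFrame[0].length;
    for (int y = 0; y < prevFrame.length; y++) {
        for (int x = 0; x < prevFrame[0].length; x++) {
            if (Math.abs(currFrame[y][x] - prevFrame[y][x]) > pixelThreshold) {
                changed++;
            }
        }
    }
    return (double) changed / total > maxChangedRatio; // e.g. maxChangedRatio = 0.05
}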
The algorithm above should be strong enough (I used it in one of my projects, and it worked like magic).
In the case of a single image: the algorithm above does not handle unfocused images and is irrelevant for a single image.
To solve the focus problem, I recommend calculating image edges and counting the amount of pixels that have strong edges (higher than a threshold). Once you get a high amount of pixels with edges (say > 5% of the image), you say that the image is in focus. This algorithm is far from perfect and may make many mistakes, depending on the texture of the image. I recommend using X, Y and diagonal edges, but smoothing the image before edge detection to reduce noise.
A stronger algorithm would be to take all the edges (derivatives) and calculate their histogram (how many pixels in the image had each specific edge intensity). This is done by first calculating an image of edges and then calculating a histogram of that edge image. Now you can analyse the shape of the histogram (the distribution of edge strengths). For example, take only the top 5% of pixels with the strongest edges and calculate the variance of their edge intensity.
Important fact: in unfocused images you expect the majority of pixels to have a very low edge response, a few to have a medium edge response, and almost none to have a strong edge response. In images with perfect focus you still have a majority of pixels with a low edge response, but the ratio between medium and strong responses changes. You can clearly see it in the histogram shape. That is why I recommend taking only the few per cent of pixels with the strongest edge response and working only with them; the rest are just noise. Even a simple algorithm that takes the ratio between the amount of pixels with a strong response and the amount of pixels with a medium response will be quite good.
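A plain-Java sketch of that last ratio idea (the gradient here is a simple horizontal/vertical difference rather than a full Sobel, and the medium/strong cut-offs are assumptions that would need tuning):
import java.awt.image.BufferedImage;

// Ratio of strong-edge pixels to medium-edge pixels, computed from a simple
// gradient magnitude on a grayscale BufferedImage.
static double strongToMediumEdgeRatio(BufferedImage gray,
                                      int mediumCutoff, int strongCutoff) {
    int medium = 0, strong = 0;
    for (int y = 1; y < gray.getHeight() - 1; y++) {
        for (int x = 1; x < gray.getWidth() - 1; x++) {
            int c = gray.getRaster().getSample(x, y, 0);
            int dx = gray.getRaster().getSample(x + 1, y, 0) - c;
            int dy = gray.getRaster().getSample(x, y + 1, 0) - c;
            int magnitude = Math.abs(dx) + Math.abs(dy);
            if (magnitude >= strongCutoff) strong++;
            else if (magnitude >= mediumCutoff) medium++;
        }
    }
    return medium == 0 ? 0 : (double) strong / medium;
}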
Focus problem in video:
If you have a video stream, then you can use the algorithms described above for detecting focus problems, but instead of using constant thresholds, just update them as the video runs. Eventually they will converge to better values than predefined constants.
Last note: the focus detection problem in a single image is a very tough one. There are a lot of academic papers (using Fourier transforms, wavelets and other "big algorithmic cannons"). But the problem remains very difficult, because when you are looking at a blurred image you cannot know whether it is the camera that generated the blur through wrong focus, or whether the original scene is itself blurry (for example, white walls are very blurry, pictures taken in the dark tend to be blurry even with perfect focus, and pictures of water or table surfaces tend to be blurry).
Anyway, there are a few threads on Stack Overflow regarding focus in an image, like this one. Please read them.
You can also compute the Fourier transform of the image; if there is little accumulation in the high-frequency bins, then the image is probably blurred. JTransforms is a reasonable library that provides FFTs if you wish to go down this route.
There is also a fairly extensive blog post here about different methods that could be used.
There is also another Stack Overflow question asking this with OpenCV; OpenCV has Java bindings and can be used in Android projects, so that answer could also be helpful.

Adding Noise to an Image

I am trying to add noise to a BufferedImage in Java, but I am more interested in the algorithm used to add noise to an image than in the Java or any other language-specific implementation.
I have searched the web and found out about Gaussian noise, but the tutorials/articles either show only code samples that are not very useful to me, or give complex mathematical explanations.
It's not clear what your question is, but here are some observations in case they help:
If the image is relatively unprocessed (it hasn't been scaled in size) then the noise in each pixel is roughly independent. So you can simulate that by looping over each pixel in turn, calculating a new noise value, and adding it.
Even when images have been processed the approach above is often a reasonable approximation.
The amount of noise in an image depends on a lot of factors. For typical images generated by digital sensors a common approximation is that the noise in each pixel is about the same. In other words you choose some standard deviation (SD) and then, in the loop above, select a value from a Gaussian distribution with that SD.
For astronomical images (and other low-noise electronic images), there is a component of the noise where the SD is proportional to the square root of the brightness of the pixel.
So what you likely want to do is the following (a short sketch follows these steps):
Pick a SD (how noisy you want the image to be)
In a loop, for each pixel:
Generate a random number from a Gaussian with the given SD (and mean of zero) and add it to the pixel (assuming a greyscale image). For a colour image generate three values and add them to red, green and blue, respectively.
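A minimal sketch for a greyscale BufferedImage (the clamping keeps values in the 0..255 range; for a colour image you would repeat this per channel):
import java.awt.image.BufferedImage;
import java.util.Random;

// Adds zero-mean Gaussian noise with the given standard deviation to a
// grayscale BufferedImage, clamping the result to the valid 0..255 range.
static void addGaussianNoise(BufferedImage image, double sd) {
    Random random = new Random();
    for (int y = 0; y < image.getHeight(); y++) {
        for (int x = 0; x < image.getWidth(); x++) {
            int value = image.getRaster().getSample(x, y, 0);
            int noisy = (int) Math.round(value + random.nextGaussian() * sd);
            noisy = Math.max(0, Math.min(255, noisy));
            image.getRaster().setSample(x, y, 0, noisy);
        }
    }
}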
Update: I imagine night vision is going to be something like astronomical imaging. In that case you might try varying the SD for each pixel so that it includes a constant plus something that depends on the square root of the brightness. So, say, if a pixel has brightness b then you might use 100 + 10 * sqrt(b) as the SD. You'll need to play with the values, but that might look more realistic.

Distance between N images: incrementally! (same crop, but re-compressed/adjusted)

I'm trying to figure out a geo-hashing method for images. It is hard because the space of possible images is of much higher dimensionality than lat/lng. (geo-hashing converts a location to a string where the string progressively refines the location)
So, what I need is something that:
INPUT: A list of JPG or PNG images on disk
OUTPUT: For each image, a string such that the longer the prefix two images have in common, the higher the chance that the two images are the same.
It doesn't need to be perfect, and it doesn't need to handle extreme cases, like cropped images or heavily adjusted images. It is intended for multiple copies of the same image at different resolutions and compression levels.
I can't use:
File or image-data hashing, because even a teeny change between two images makes a completely different hash and you don't get any proximity
Image subtraction, because it won't be a N-to-N comparison.
I've read in other answers to try wavelet compression or a laplacian/gaussian pyramid, but I'm not sure how to implement in Java or Python. However, I have made progress!
Resize to 32x32 using http://today.java.net/pub/a/today/2007/04/03/perils-of-image-getscaledinstance.html so as not to discard data. It's OK that everything gets turned into a square.
Create a pyramid of successively smaller thumbnails all the way down to 2x2.
In the 2x2, encode a string of "is the next pixel brighter than the current one? If so, 1, else 0". (This throws away all hue and saturation; I may want to use hue somehow.)
Encode successive binary numbers from the 8x8 and 32x32 pyramids
Convert the big binary number to some higher radix representation, like Base62.
This seems to work well! Minor differences from compression or color balancing aren't enough to change an "is the left side of this area brighter than the right side?" result. However, I think I'm re-inventing the wheel, and some sort of progressive encoding might be better. SIFT and other feature detection is overkill; I don't need to handle cropping or rotation.
How about this: the hash string is made up of groups of three characters, representing red, green and blue:
{R0, G0, B0}, {R1, G1, B1}, {R2, G2, B2}, ...
For each group, the image is resized to a 2^N by 2^N square. Then, the value is the sum (mod, say, 255, or whatever your encoding is) of the differences in intensity of each of the colours over some walk through the pixels.
So, as a small example, to compute e.g. group 1 (the 2x2 image) one might use the following code (I have only bothered with the red channel):
// Sum of absolute red-channel differences over a row-major walk of the
// 2x2 thumbnail (assuming image is a java.awt.image.BufferedImage).
int rSum = 0;
int rLast = 0;
for (int i = 0; i < 2; i++) {
    for (int j = 0; j < 2; j++) {
        int r = (image.getRGB(j, i) >> 16) & 0xFF; // extract the red component
        rSum += Math.abs(r - rLast);
        rLast = r;
    }
}
rSum %= 255;
I believe this has the property that similar images should be close to each other, both for each character in the hash and in terms of successive characters in the hash.
Although for higher values of N the chance of a collision gets higher (many images will have the same sum-of-difference values for R, G and B intensities across them), each successive iteration should reveal new information about the image that was not tested by the previous iteration.
Could be fairly computationally expensive, but you have the advantage (which I infer from your question you might desire) that you can end the computation of the hash as soon as a negative is detected within a certain threshold.
Just an idea, let me know if I wasn't clear!
What you're describing seems to me to be an example of Locality-Sensitive Hashing applied to the image similarity problem.
I'm not sure that the common prefix property is desirable for a good hash function. I would expect a good hash function to have two properties:
1) Good localization - for images I1 and I2, norm(Hash(I1) - Hash(I2)) should represent the visually perceived similarity of I1 and I2.
2) Good compression - The high-dimension image data should be embedded in the low-dimension space of hash functions in the most discriminative way.
Getting good results from the following:
Scale down (using good scaling that doesn't discard information) to three images:
1x7
7x1
and a 6x6 image.
Convert all to grayscale.
For each image, do the "is the next pixel brighter?" '1':'0' encoding and output it as base62.
Those outputs become the values for three columns: nicely successively refined differencing, packed into 2 chars, 2 chars, and 6 chars. True, it discards all color, but it's still good!
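A rough sketch of that "is the next pixel brighter?" encoding for one thumbnail (the brightness measure and the radix-36 output via BigInteger are stand-ins; a true base62 string would need a custom alphabet):
import java.awt.image.BufferedImage;
import java.math.BigInteger;

// Walks the thumbnail row by row, emitting 1 whenever a pixel is brighter than
// the one before it, then packs the bit string into a compact radix-36 string.
static String brightnessHash(BufferedImage thumb) {
    StringBuilder bits = new StringBuilder("1"); // leading 1 preserves leading zeros
    int previous = -1;
    for (int y = 0; y < thumb.getHeight(); y++) {
        for (int x = 0; x < thumb.getWidth(); x++) {
            int rgb = thumb.getRGB(x, y);
            int brightness = ((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF);
            if (previous >= 0) {
                bits.append(brightness > previous ? '1' : '0');
            }
            previous = brightness;
        }
    }
    return new BigInteger(bits.toString(), 2).toString(36);
}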

what values of an image should I use to produce a haar wavelet?

I currently have a Java program that gets the RGB values for each of the pixels in an image. I also have a method to calculate a Haar wavelet on a 2D matrix of values. However, I don't know which values I should give to my method that calculates the Haar wavelet. Should I average each pixel's RGB values and compute a Haar wavelet on that, or maybe just use one of R, G, B?
I am trying to create a unique fingerprint for an image. I read elsewhere that this was a good method as I can take the dot product of 2 wavelets to see how similar the images are to each other.
Please let me know of what values I should be computing a Haar wavelet on.
Thanks
Jess
You should regard the R/G/B components as different images: create one matrix each for R, G and B, then apply the wavelet to each of them independently.
You can then reconstruct the R/G/B images from the 3 wavelet-compressed channels and finally combine them into a 3-channel bitmap.
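For reference, a minimal single-level 2D Haar step on one such channel might look like this (assuming a double[][] matrix with even dimensions; this is only a sketch of the standard averaging/differencing, not a full multi-level transform):
// One level of the 2D Haar transform: each 2x2 block becomes one average
// (top-left quadrant) and three detail coefficients (the other quadrants).
static double[][] haarLevel(double[][] channel) {
    int h = channel.length, w = channel[0].length;
    double[][] out = new double[h][w];
    for (int y = 0; y < h; y += 2) {
        for (int x = 0; x < w; x += 2) {
            double a = channel[y][x],     b = channel[y][x + 1];
            double c = channel[y + 1][x], d = channel[y + 1][x + 1];
            out[y / 2][x / 2]                 = (a + b + c + d) / 4; // average (LL)
            out[y / 2][x / 2 + w / 2]         = (a - b + c - d) / 4; // horizontal detail
            out[y / 2 + h / 2][x / 2]         = (a + b - c - d) / 4; // vertical detail
            out[y / 2 + h / 2][x / 2 + w / 2] = (a - b - c + d) / 4; // diagonal detail
        }
    }
    return out;
}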
Since eznme didn't answer your question (You want fingerprints, he explains compression and reconstruction), here's a method you'll often come across:
You separate color and brightness information (chrominance and luma), and weigh them differently. Sometimes you'll even throw away the chrominance and just use the luma part. This reduces the size of your fingerprint significantly (~factor three) and takes into account how we perceive an image - mainly by local brightness, not by absolute color. As a bonus you gain some robustness concerning color manipulation of the image.
The separation can be done in different ways, e.g. transforming your RGB image to YUV or YIQ color space. If you only want to keep the luma component, these two color spaces are equivalent. However, they encode the chrominance differently.
Here's the linear transformation for the luma Y from RGB:
Y = 0.299*R + 0.587*G + 0.114*B
When you take a look at the mathematics, you notice that we're doing nothing other than creating a grayscale image, taking into account that we perceive green as brighter than red, and red as brighter than blue, when they are all numerically equal.
In case you want to keep a bit of chrominance information, in order to keep your fingerprint as concise as possible you could reduce the resolution of the two U, V components (each actually 8 bits). You could join them both into one 8-bit value by reducing their information to 4 bits each and combining them with the shift operator (I don't know how that works in Java). The chrominance should weigh less than the luma in the final fingerprint-distance calculation (the dot product you mentioned).
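To illustrate the shift-operator packing in Java (a sketch only; it assumes U and V have already been offset into the 0..255 range, as in the usual byte representations of YUV):
// Luma from RGB, using the same weights as the formula above.
static int luma(int r, int g, int b) {
    return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
}

// Packs two chroma components (each 0..255) into a single byte-sized value:
// the top 4 bits of U go into the high nibble, the top 4 bits of V into the low nibble.
static int packChroma(int u, int v) {
    return ((u >> 4) << 4) | (v >> 4);
}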
