For a project I am working on, I was tasked with creating a way of converting an image into a non-cryptographic hash so it could easily be compared with similar images. However, I ran into an issue where the JVM would begin consuming memory rapidly, even though the Java Monitoring & Management Console (JMMC) did not report any increase in memory consumption.
When I first ran the application, the Task Manager would report values like this:
However, after only about 30 seconds, those values would have doubled or tripled.
I used the JMMC to create a dump of the process, but it only reported around 1.3MB of usage:
The strangest part to me is that the application performs an operation which lasts about 15 seconds, then waits for 100 seconds (a debugging sleep), and it is during those 100 seconds of thread sleeping that the memory used doubles.
Here are my two classes:
ImageHashGenerator.java
package com.arkazex.srcbot;
import java.awt.Color;
import java.awt.Graphics;
import java.awt.Image;
import java.awt.image.BufferedImage;
public class ImageHashGenerator {
    public static byte[] generateHash(Image image, int resolution) {
        // Resize the image
        Image rscaled = image.getScaledInstance(resolution, resolution, Image.SCALE_SMOOTH);
        // Convert the scaled image into a buffered image
        BufferedImage scaled = convert(rscaled);
        // Create the hash array
        byte[] hash = new byte[resolution * resolution * 3];
        // Variables
        Color color;
        int index = 0;
        // Generate the hash
        for (int x = 0; x < resolution; x++) {
            for (int y = 0; y < resolution; y++) {
                // Get the color
                color = new Color(scaled.getRGB(x, y));
                // Save the colors
                hash[index++] = (byte) color.getRed();
                hash[index++] = (byte) color.getGreen();
                hash[index++] = (byte) color.getBlue();
            }
        }
        // Return the generated hash
        return hash;
    }

    // Convert Image to BufferedImage
    private static BufferedImage convert(Image img) {
        // Create a new BufferedImage
        BufferedImage image = new BufferedImage(img.getWidth(null), img.getHeight(null), BufferedImage.TYPE_3BYTE_BGR);
        // Draw the source image into it, disposing of the Graphics when done
        Graphics g = image.getGraphics();
        g.drawImage(img, 0, 0, null);
        g.dispose();
        // Return the image
        return image;
    }
}
Test.java
package com.arkazex.srcbot;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;
public class Test {
    public static void main(String[] args) throws IOException {
        // Create a hash
        byte[] hash = ImageHashGenerator.generateHash(ImageIO.read(new File("img1.JPG")), 8); // Memory grows to around 150MB here
        System.out.println(new String(hash));
        try { Thread.sleep(100000); } catch (Exception e) {} // Memory grows to around 300MB here
    }
}
EDIT: The program stopped growing to 300MB after a few seconds for no apparent reason. I had not changed anything in the code; it just stopped doing it.
I think what you are missing here is that some of the image classes use off-heap memory. This is (presumably) invisible to the JMMC because it is only told about on-heap usage. The OS-level memory monitoring sees it ... because it is looking at the total resource consumption of the JVM running your application.
The problem is that the off-heap memory blocks are only reclaimed when the corresponding on-heap image objects are finalized. That only happens when they are garbage collected.
The program stopped growing to 300MB after a few seconds for no apparent reason. I had not changed anything in the code, it just stopped doing it.
I expect that the JVM decided it was time to do a full GC (or something like that) and that caused it to free up lots of space in the off-heap memory pool. That meant the JVM no longer needed to keep growing the pool.
(I am being deliberately vague because I don't actually know how off-heap memory allocation works under the covers in a modern JVM. But if you want to investigate, the JVM source code can be downloaded ...)
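One way to test this theory (my suggestion, not part of the original answer) is to flush the image and explicitly request a collection after generating the hash; flush() eagerly releases resources held by the image, and System.gc() is only a hint that the JVM may ignore:

// Hypothetical diagnostic sketch, reusing the names from Test.java above
BufferedImage img = ImageIO.read(new File("img1.JPG"));
byte[] hash = ImageHashGenerator.generateHash(img, 8);
img.flush(); // eagerly release any resources (possibly off-heap) held by the image
System.gc(); // request, but not force, a full collection

If the OS-level memory usage stops climbing after this, off-heap image buffers were indeed the culprit.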
See the explanation in the /** comments */
public class Test {
    public static void main(String[] args) throws IOException {
        // Create a hash
        /** Here it allocates (3 * resolution^2) bytes of memory for a byte array */
        byte[] hash = ImageHashGenerator.generateHash(ImageIO.read(new File("img1.JPG")), 8); // Memory grows to around 150MB here
        /** And here it copies the same bytes again into a String.
            With resolution = 8 that is only 192 bytes, but why print raw hash bytes as a String at all? */
        System.out.println(new String(hash));
        try { Thread.sleep(100000); } catch (Exception e) {} // Memory grows to around 300MB here
    }
}
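As a side note (mine, not the answerer's): if the goal is just to inspect the hash, printing it as hex avoids building a String from raw bytes that mostly aren't printable characters:

// Print the hash as hexadecimal instead of new String(hash)
StringBuilder sb = new StringBuilder(hash.length * 2);
for (byte b : hash) {
    sb.append(String.format("%02x", b));
}
System.out.println(sb);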
Related
As the title says, I'm trying to resize a PNG image in order to reach a target file size (in terms of megabytes).
I searched a lot on SO and over the web and found a lot of code, but none of it takes the final file size into consideration.
I've put together some working code, but I'm trying to optimize its performance.
Example:
Source image size = 30 MB
Target file output size = 5 MB
Current flow:
1 - Load the PNG image as a BufferedImage
2 - Recursively use Scalr.resize(...) to resize the image
2.1 - At each step, use ImageIO.write to store the compressed PNG in a temporary file
2.2 - Check the size using File.length; if the size on disk is > 5 MB, return to step 2
3 - Save the image using ImageIO.write(...)
This method works; by fine-tuning some parameters (such as the scale factor) I'm able to accomplish the task.
I'm trying to understand if I can improve this by calculating/guessing the final file size without storing the image in a temporary file.
There is a byte[] stored inside the BufferedImage that I can get via BufferedImage.getData().getDataBuffer(), which represents the content of the image, but obviously the size of this array is 2x or 3x bigger than the final file size because of the PNG compression algorithm.
I've tried some formulas to calculate the value, something like:
w * h * bitDepth * 8 / 1024 / 1024, but I'm sure I'm losing a lot of data and the numbers do not add up!
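As an aside (my addition, not from the original post): the raw pixel data occupies w * h * bitDepth / 8 bytes (dividing by 8 converts bits to bytes, rather than multiplying), and PNG compression then shrinks that by a content-dependent factor, which is exactly why no closed-form formula can predict the final file size:

// Hypothetical upper-bound estimate of the uncompressed pixel data
long rawBytes = (long) width * height * bitDepth / 8; // bitDepth in bits per pixel
double rawMegabytes = rawBytes / (1024.0 * 1024.0);   // the PNG file will be smaller than this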
At the moment I'm using mainly this code:
static void resize(BufferedImage image, String outPath, int scalingFactor) throws Exception {
    image = Scalr.resize(image, image.getWidth() - scalingFactor);
    // image.getData().getDataBuffer() - the byteArray containing the image
    File tempFile = File.createTempFile("" + System.currentTimeMillis(), ".png");
    ImageIO.write(image, "png", tempFile);
    System.out.println("Calculated size in bytes is: " + tempFile.length() + " - factor: " + scalingFactor);
    // MAX_SIZE defined in bytes
    if (tempFile.length() > MAX_SIZE) {
        // recursion starts here
        resize(image, outPath, chooseFactor(tempFile, 4));
    } else {
        // break the recursive cycle
        ImageIO.write(image, "png", new File(outPath));
    }
}

static int chooseFactor(File image, int scale) {
    // MEGABYTE is 1024*1024
    double mbSize = (double) image.length() / MEGABYTE;
    return (int) ((mbSize / scale) * 100);
}
Is there a way to calculate/guess the final file size starting from the BufferedImage object?
Please tell me if I haven't made myself clear, or if I can provide additional information.
Also, feel free to recommend a more appropriate title for the question if you think this one is not explanatory enough.
Thanks.
Any monotone function of the image width/height can be used to perform a binary search.
This approach will work well for many changes that may be needed (changing from PNG to JPG, adding compression, changing optimization targets) versus an ad-hoc solution such as directly predicting the size of a PNG (which, for example, could simply change depending on what library is installed on your production servers or on the client that your application uses).
The stored byte count is expected to be monotone in the image size (in any case, my implementation is safe [but not optimal] for non-monotonic functions).
This function performs a binary search over the lower domain (i.e. it never upscales the image) using any predicate:
static BufferedImage downScaleSearch(BufferedImage source, Function<BufferedImage, Boolean> downScale) {
    int initialSize = Math.max(source.getWidth(), source.getHeight());
    int a = 1;
    int b = initialSize;
    BufferedImage image = source;
    while (true) {
        int c = (a + b) / 2 - 1;
        // fixed point: no size between a and b remains to try
        if (c <= a)
            return image;
        BufferedImage scaled = Scalr.resize(source, c);
        if (downScale.apply(scaled)) {
            b = c;
        } else {
            // the last candidate will be the largest image not greater than the limit
            image = scaled;
            a = c;
        }
    }
}
If we are interested in the final PNG file size, the search function will be the PNG file size itself:
static final Path output = Paths.get("/tmp/downscaled.png");

static long persistAndReturnSize(BufferedImage image) {
    try {
        if (ImageIO.write(image, "png", output.toFile()))
            return Files.size(output);
        throw new RuntimeException("Cannot write PNG file!");
    } catch (IOException e) {
        // ImageIO.write and Files.size declare IOException, which the
        // Function<BufferedImage, Boolean> lambda below cannot propagate
        throw new RuntimeException(e);
    }
}
(You could persist to memory instead of the filesystem; a sketch follows.)
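A minimal in-memory variant might look like this (my sketch; the method name and error handling are assumptions, not the answerer's code):

static long encodedPngSize(BufferedImage image) {
    // Encode into a byte array instead of a file and return the encoded size
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try {
        if (!ImageIO.write(image, "png", bos))
            throw new RuntimeException("Cannot encode PNG!");
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return bos.size();
}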
Now we can generate images whose size is no greater than any fixed value:
public static void main(String... args) throws IOException {
    BufferedImage image = ImageIO.read(Paths.get("/home/josejuan/tmp/test.png").toFile());
    for (long sz : asList(10_000, 30_000, 80_000, 150_000)) {
        final long MAX_SIZE = sz;
        BufferedImage bestFit = downScaleSearch(image, i -> persistAndReturnSize(i) >= MAX_SIZE);
        ImageIO.write(bestFit, "png", output.toFile());
        System.out.println("Size: " + sz + " >= " + Files.size(output));
    }
}
with output
Size: 10000 >= 9794
Size: 30000 >= 29518
Size: 80000 >= 79050
Size: 150000 >= 143277
NOTE: if you do not use compression, or you can accept an approximation, you can probably replace the persistAndReturnSize function with an estimator that does not persist the image.
NOTE: our search space is size = 1, 2, ..., but you could perform a similar search over more parameters, like compression level, pixel color space, ... (although then your domain will probably not be monotone, and you should use https://en.wikipedia.org/wiki/Gradient_descent or similar).
We have an application which serves images. To speed up response time, we cache the BufferedImage objects directly in memory.
class Provider {
    @Override
    public IData render(String[] layers, String coordinate) {
        int rwidth = 256, rheight = 256;
        ArrayList<BufferedImage> result = new ArrayList<BufferedImage>();
        for (String layer : layers) {
            String lkey = layer + "-" + coordinate;
            BufferedImage imageData = cacher.get(lkey);
            if (imageData == null) {
                try {
                    imageData = generateImage(layer, coordinate, rwidth, rheight);
                    cacher.put(lkey, imageData);
                } catch (IOException e) {
                    e.printStackTrace();
                    continue;
                }
            }
            if (imageData != null) {
                result.add(imageData);
            }
        }
        return new Data(rwidth, rheight, rwidth, result);
    }

    private BufferedImage generateImage(String layer, String coordinate, int rwidth, int rheight) throws IOException {
        BufferedImage image = new BufferedImage(rwidth, rheight, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.RED);
        g.drawString(layer + "-" + coordinate, new Random().nextInt(rwidth), new Random().nextInt(rheight));
        g.dispose();
        return image;
    }
}
class Data implements IData {
    // fields (not shown in the original post) so the snippet compiles
    private int imageWidth;
    private int imageHeight;
    private BufferedImage imageResult;

    public Data(int imageWidth, int imageHeight, int originalWidth, ArrayList<BufferedImage> images) {
        this.imageWidth = imageWidth;
        this.imageHeight = imageHeight;
        this.imageResult = new BufferedImage(this.imageWidth, this.imageHeight, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = imageResult.createGraphics();
        for (BufferedImage imgData : images) {
            g.drawImage(imgData, 0, 0, null);
            imgData = null;
        }
        imageResult.flush();
        g.dispose();
        images.clear();
    }

    @Override
    public void save(OutputStream out, String format) throws IOException {
        ImageIO.write(this.imageResult, format, out);
        out.flush();
        this.imageResult = null;
    }
}
usage:
class ImageServlet extends HttpServlet {
    void doGet(HttpServletRequest req, HttpServletResponse res) throws IOException {
        // the "coordinate" parameter is assumed; the original snippet omitted it
        IData data = provider.render(req.getParameter("layers").split(","), req.getParameter("coordinate"));
        OutputStream out = res.getOutputStream();
        data.save(out, "png");
        out.flush();
    }
}
Note: the provider field is a single instance.
However, it seems there is a possible memory leak, because I get an Out Of Memory exception when the application has been running for about 2 minutes.
Then I used VisualVM to check the memory usage:
Even if I perform GC manually, the memory cannot be released.
And though there are only 300+ BufferedImages cached, and 20MB+ of memory is used, 1.3GB+ of memory is retained. In fact, through Firebug I can confirm that a generated image is less than 1KB. So I think the memory usage is not healthy.
Once I stop using the cache (commenting out the following line):
//cacher.put(lkey, imageData);
The memory usage looks good:
So it seems that the cached BufferedImages cause the memory leak.
Then I tried to transform each BufferedImage to a byte[] and cache the byte[] instead of the object itself, and the memory usage stayed normal. However, I found that the serialization and deserialization of the BufferedImage cost too much time.
So I wonder if you have any experience with image caching?
Update:
Since so many people have said that there is no memory leak, just that my cacher uses too much memory: I'm not sure, but I have tried caching byte[] instead of BufferedImage directly, and the memory usage looks good. And I cannot imagine 322 images taking up 1.5GB+ of memory; even as @BrettOkken said, the total size should be (256 * 256 * 4 bytes) * 322 / 1024 / 1024 = 80MB, far less than 1GB.
And just now I changed to caching the bytes and monitored the memory again; the code changed like this:
BufferedImage ig = generateImage(layer, coordinate, rwidth, rheight);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ImageIO.write(ig, "png", bos);
imageData = bos.toByteArray();
tileCacher.put(lkey, imageData);
And the memory usage:
Same code, same operations.
Note from both VisualVM screenshots that the 97.5% of memory consumed by 4,313 instances of int[] (which I assume belong to the cached buffered images) is not consumed in the non-cached version.
Although you have a less-than-1KB PNG image (which is compressed, as per the PNG format), this single image is generated out of multiple instances of BufferedImage (which are not compressed). Hence you cannot directly correlate the image size seen in the browser with the memory occupied on the server. So the issue here is not a memory leak but the amount of memory required to cache these uncompressed layers of buffered images.
Strategy to resolve this is to tweak your caching mechanism:
- If possible, cache compressed versions of the layers instead of raw images.
- Ensure that you will never run out of memory by limiting the cache size, by instance count or by amount of memory utilized. Use either an LRU or LIRS cache eviction policy (a minimal LRU sketch follows this list).
- Use a custom key object with coordinate and layer as two separate variables, overriding equals/hashCode, to use as the key.
- Observe the behavior: if you have too many cache misses, you will need a better caching strategy, or the cache may be unnecessary overhead.
- I believe you are caching layers because you expect combinations of layers and coordinates and hence cannot cache final images, but depending on the kind of request pattern you expect, you may want to consider that option if possible.
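For illustration, here is a minimal size-bounded LRU cache sketch built on LinkedHashMap's access-order mode (the class is my own; the 500-entry capacity is an arbitrary assumption, and a production cache library would work just as well):

import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, so iteration is least-recently-used first
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries; // evict the least recently used entry beyond the cap
    }
}

// hypothetical usage: Map<String, byte[]> cacher = new LruCache<>(500);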
I'm not sure what caching API you are using or what the actual values in your requests are. However, based on VisualVM it looks to me like String objects are leaking. Also, as you mentioned, if you turn off caching the problem is resolved.
Consider the snippet of your code below.
String lkey = layer + "-" + coordinate;
BufferedImage imageData = cacher.get(lkey);
Now here are a few things to consider about this code:
- You are possibly getting new String objects each time for lkey.
- Your cache has no upper limit and no eviction policy (e.g. LRU).
- If the cacher compares keys with == instead of String.equals(), then since these are new String objects they will never match, causing a new entry each time. A composite-key sketch follows this list.
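For the last two points, a composite key with explicit equals/hashCode removes both the repeated string concatenation and any doubt about how keys are compared (a sketch; the class and field names are my own):

// Hypothetical composite cache key
final class TileKey {
    private final String layer;
    private final String coordinate;

    TileKey(String layer, String coordinate) {
        this.layer = layer;
        this.coordinate = coordinate;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof TileKey)) return false;
        TileKey k = (TileKey) o;
        return layer.equals(k.layer) && coordinate.equals(k.coordinate);
    }

    @Override
    public int hashCode() {
        return 31 * layer.hashCode() + coordinate.hashCode();
    }
}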
VisualVM is a start but it doesn't give the complete picture.
You need to trigger a heap dump while the application is using a high amount of memory.
You can trigger a heap dump from VisualVM. It can also be done automatically on an OOME if you add this vmarg to the java process:
-XX:+HeapDumpOnOutOfMemoryError
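For example, a full launch command might look like this (the jar name and dump path are placeholders):
java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dumps -jar yourapp.jar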
Use Memory Analyzer Tool to open and inspect the heap dump.
The tool is quite capable and can help you walk the object references to discover:
1. What is actually using your memory.
2. Why the objects from #1 aren't being garbage collected.
I have a small piece of code that takes a screenshot of my desktop every five minutes.
However, I'm a little confused by the amount of memory it takes up - often it will creep up to 200MB of RAM, which I'm sure is excessive... Can anyone tell me a) sensible ways to reduce the memory footprint or b) why it's going up at all?
/**
* Code modified from code given in http://whileonefork.blogspot.co.uk/2011/02/java-multi-monitor-screenshots.html following a SE question at
* http://stackoverflow.com/questions/10042086/screen-capture-in-java-not-capturing-whole-screen and then modified by a code review at http://codereview.stackexchange.com/questions/10783/java-screengrab
*/
package com.tmc.personal;
import java.awt.AWTException;
import java.awt.GraphicsDevice;
import java.awt.GraphicsEnvironment;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;
import java.util.concurrent.TimeUnit;
import javax.imageio.ImageIO;
class ScreenCapture {
    static int minsBetweenScreenshots = 5;

    public static void main(String[] args) {
        int indexOfPicture = 1000; // should be only used for naming file...
        while (true) {
            takeScreenshot("ScreenCapture" + indexOfPicture++);
            try {
                TimeUnit.MINUTES.sleep(minsBetweenScreenshots);
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
    // from http://www.coderanch.com/t/409980/java/java/append-file-timestamp
    private final static String getDateTime() {
        DateFormat df = new SimpleDateFormat("yyyy-MM-dd_hh:mm:ss");
        df.setTimeZone(TimeZone.getTimeZone("PST"));
        return df.format(new Date());
    }
    public static void takeScreenshot(String filename) {
        Rectangle allScreenBounds = getAllScreenBounds();
        Robot robot;
        try {
            robot = new Robot();
            BufferedImage screenShot = robot.createScreenCapture(allScreenBounds);
            ImageIO.write(screenShot, "jpg", new File(filename + getDateTime() + ".jpg"));
        } catch (AWTException e) {
            System.err.println("Something went wrong starting the robot");
            e.printStackTrace();
        } catch (IOException e) {
            System.err.println("Something went wrong writing files");
            e.printStackTrace();
        }
    }
    /**
     * Okay so all we have to do here is find the screen with the lowest x, the
     * screen with the lowest y, the screen with the highest value of x + width,
     * and the screen with the highest value of y + height.
     *
     * @return A rectangle that covers all screens that might be nearby...
     */
    private static Rectangle getAllScreenBounds() {
        Rectangle allScreenBounds = new Rectangle();
        GraphicsEnvironment ge = GraphicsEnvironment.getLocalGraphicsEnvironment();
        GraphicsDevice[] screens = ge.getScreenDevices();
        int farx = 0;
        int fary = 0;
        for (GraphicsDevice screen : screens) {
            Rectangle screenBounds = screen.getDefaultConfiguration().getBounds();
            // finding the one corner
            if (allScreenBounds.x > screenBounds.x) {
                allScreenBounds.x = screenBounds.x;
            }
            if (allScreenBounds.y > screenBounds.y) {
                allScreenBounds.y = screenBounds.y;
            }
            // finding the other corner
            if (farx < (screenBounds.x + screenBounds.width)) {
                farx = screenBounds.x + screenBounds.width;
            }
            if (fary < (screenBounds.y + screenBounds.height)) {
                fary = screenBounds.y + screenBounds.height;
            }
            allScreenBounds.width = farx - allScreenBounds.x;
            allScreenBounds.height = fary - allScreenBounds.y;
        }
        return allScreenBounds;
    }
}
The other answers are right that Java will use as much memory as it is allowed to, at which point it will garbage collect. To work around this, you can specify a smaller max heap size in the JVM settings. You do this with the -Xmx setting. For example, if you think you only need 32MB, run it as:
java -Xmx32M [your main class or jar here]
The heap of your program (the non-stack memory) will never take more than 32MB, but it will crash if it needs more than that at once (and that's where you'll need to profile). I don't see any obvious leaks in your program (assuming ImageIO doesn't require any cleanup), though, so I think you'll be fine.
The JVM garbage collector will eventually clear your memory heap. To manually request a collection, call Runtime.getRuntime().gc();, but I don't advise doing that every 5 minutes.
For a modern computer, 200MB is not an excessive amount of memory. The JVM will let the heap grow for a while if you're creating and discarding lots of objects so that your program doesn't get bogged down with garbage collection. Let your program run for several hours and then check back if you think there's a problem.
I'm wondering if there are any algorithms out there written in Java currently for determining if an image has a low range of different pixel colours contained within it.
I'm trying to detect placeholder images, which typically consist of high percentages of single colours (typically white and grey pixels), as opposed to full-colour photos, which contain a plethora of different colours.
If nothing exists, I'll write something myself (I was thinking about sampling arbitrary pixels in random positions across the image, or averaging out all the pixel colours contained in the image) and then determining the quantities of the different colours I find. There may be a trade-off between speed and accuracy depending on the methodology used.
Any advice / pointers / reading material appreciated.
One way to do it would be:
final BufferedImage image = ...; // your image
final Set<Integer> colours = new HashSet<Integer>();
for (int x = 0; x < image.getWidth(); x++) {
    for (int y = 0; y < image.getHeight(); y++) {
        colours.add(image.getRGB(x, y));
    }
}
// Check your pixels here. In grayscale images, R equals G equals B.
You can also use Java Advanced Imaging (JAI), since it provides a Histogram class:
package com.datroop.histogram;

import java.awt.image.renderable.ParameterBlock;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;

import javax.media.jai.Histogram;
import javax.media.jai.JAI;
import javax.media.jai.PlanarImage;
import javax.media.jai.ROI;
import javax.media.jai.RenderedOp;

public class HistogramCreator {

    private HistogramCreator() {
    }

    public static int[] createHistogram(final PlanarImage image) {
        // set up the histogram
        final int[] bins = { 256 };
        final double[] low = { 0.0D };
        final double[] high = { 256.0D };
        Histogram histogram = new Histogram(bins, low, high);

        final ParameterBlock pb = new ParameterBlock();
        pb.addSource(image);
        pb.add(null);
        pb.add(1);
        pb.add(1);

        final RenderedOp op = JAI.create("histogram", pb);
        histogram = (Histogram) op.getProperty("histogram");

        // get histogram contents
        final int[] local_array = new int[histogram.getNumBins(0)];
        for (int i = 0; i < histogram.getNumBins(0); i++) {
            local_array[i] = histogram.getBinSize(0, i);
        }
        return local_array;
    }

    public static void main(String[] args) {
        try {
            String filename = "file://localhost/C:/myimage.jpg";
            System.setProperty("com.sun.media.jai.disableMediaLib", "true");
            // Create the histogram
            int[] myHistogram = createHistogram(JAI.create("url", new URL(filename)));
            // Check it out here
            System.out.println(Arrays.toString(myHistogram));
        } catch (MalformedURLException e) {
            e.printStackTrace();
        }
    }
}
A simple low-overhead approach would be to do a histogram of the red, green and blue component values separately. There would only be 256 colour levels for each, so it would be quite efficient - you could build each histogram in a new int[256] array.
Then you just count the number of non-zero values in each of the histograms and check whether they all meet some threshold (say, at least 20 different values in each would imply a photograph). A sketch of this follows.
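A minimal sketch of that idea (mine, not the answerer's; the channel masks assume the default ARGB packing returned by getRGB):

int[] red = new int[256], green = new int[256], blue = new int[256];
for (int x = 0; x < image.getWidth(); x++) {
    for (int y = 0; y < image.getHeight(); y++) {
        int rgb = image.getRGB(x, y); // packed as 0xAARRGGBB
        red[(rgb >> 16) & 0xFF]++;
        green[(rgb >> 8) & 0xFF]++;
        blue[rgb & 0xFF]++;
    }
}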
An alternative approach would be to create a HashSet of colour values and keep adding to it as you scan the image. Since HashSets hold unique values, it will store only the unique colours. To avoid the HashSet getting too large, you can bail out once its size hits a pre-determined threshold (maybe 1000 unique colours?) and conclude that you have a photograph; see the sketch below.
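A sketch of that early-exit scan (my code; the method name and threshold are assumptions):

static boolean looksLikePhoto(BufferedImage image, int threshold) {
    Set<Integer> colours = new HashSet<>();
    for (int x = 0; x < image.getWidth(); x++) {
        for (int y = 0; y < image.getHeight(); y++) {
            colours.add(image.getRGB(x, y));
            if (colours.size() >= threshold) {
                return true; // many distinct colours: probably a photo
            }
        }
    }
    return false; // few distinct colours: probably a placeholder
}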
I'm opening a bunch of files using JFileChooser, and for each image I create a BufferedImage using image = ImageIO.read(path);, where image is declared as a static field.
Now I've got 30 files of 1MB each, and after running read() 60 times, my memory usage (checked in the OS process manager) grows by about 70MB.
Because my image variable is static, it can't be the case that every image's content is kept somewhere. So my question is: why am I losing so much memory?
I'm writing an app that needs to load tons of pictures into memory. Is there a leak somewhere? Is it the garbage collector's task to clean up unused data?
Here is my code to read this data:
public class FileListElement {
    private static final long serialVersionUID = -274112974687308718L;
    private static final int IMAGE_WIDTH = 1280;

    // private BufferedImage thumb;
    private static BufferedImage image;
    private String name;

    public FileListElement(File f) throws IllegalImageException {
        // BufferedImage image = null;
        try {
            image = ImageIO.read(f);
            // if (image.getWidth() != 1280 || image.getHeight() != 720) {
            //     throw new IllegalImageException();
            // }
        } catch (IOException e) {
            e.printStackTrace();
        }
        image.flush();
        //
        // thumb = scale(image, 128, 72);
        // image = null;
        name = "aa";
    }
}
What's wrong with it? Maybe I'm doing something wrong?
I need the raw pixels from tons of images, or compressed images loaded into RAM, so that I have fast access to any pixel of any image.
It's odd that loading a 1MB picture takes much more than 1MB of memory.
You can't count on the current memory usage being the amount of memory that is actually needed; garbage collection does not run constantly, especially if you are far from your max memory usage. Try loading more images; you might find there is no issue.
It's odd that loading 1Mb pic takes much more than 1Mb.
Well, I would expect the format stored on disk to be compressed and therefore smaller than a BufferedImage in memory. So I don't think this is odd.
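To put rough numbers on that (my example; the dimensions are hypothetical):

// A "1 MB" JPEG of 3000x2000 pixels decodes to an uncompressed raster:
int width = 3000, height = 2000;
long rasterBytes = (long) width * height * 3; // 3 bytes per pixel for TYPE_3BYTE_BGR
// = 18,000,000 bytes, roughly 17 MB in memory for a 1 MB file on disk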