Most advanced image compression today (not on browser)?

I am writing an application in Java to view images that contain a lot of text and graphics, like a screenshot of a webpage; actually, it's an image of a magazine article. Some parts are text, some parts are graphics.
My client program is written in Java, so I can use any image format. What is the best image compression format I can get my hands on in Java, so that I can both compress and decompress?
It would be nice if the image became progressively clearer as it loaded, but that's not necessary; it's not 1997 anymore (remember GIF loading).

You haven't supplied one key piece of information: does the compression need to be lossless, or is lossy fine? And if lossy is fine, within what tolerance?
You can reduce an entire image to a single bit of information if you're prepared to accept a highly lossy compression format. :-)
Seriously though, the two main lossless formats are GIF and PNG (both in the browser and out of it), and the biggest (by far) lossy format is JPG. Other formats like BMP and TIF are nowhere near as efficient.
All three of these formats are well supported in Java (either directly or with readily available third-party libraries). PNG tends to achieve better compression ratios than GIF.
See:
PNG vs. GIF compression
GIF vs. PNG: this one gives reference to (and measures) the PNGpong format.
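To get a feel for the lossless/lossy trade-off on your own material, a small comparison like the following may help. It is only a sketch: the class name, the method names and the synthetic "page" (black stripes standing in for text lines plus one flat colored box) are made up for illustration, and the sizes you see will depend heavily on the real images.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

public class FormatSizeComparison {

    /** Encodes the image in memory with the named ImageIO format and returns the bytes. */
    public static byte[] encode(BufferedImage image, String formatName) {
        try (ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            if (!ImageIO.write(image, formatName, out)) {
                throw new IllegalArgumentException("No ImageIO writer for: " + formatName);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Draws a stand-in page: white background, thin dark stripes as "text", one flat box. */
    public static BufferedImage samplePage(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        g.setColor(Color.WHITE);
        g.fillRect(0, 0, width, height);
        g.setColor(Color.BLACK);
        for (int y = 20; y < height; y += 20) {
            g.fillRect(10, y, width / 3, 2);   // hypothetical "text line"
        }
        g.setColor(new Color(40, 90, 200));
        g.fillRect(width / 2, height / 2, width / 3, height / 3);
        g.dispose();
        return image;
    }

    public static void main(String[] args) {
        BufferedImage page = samplePage(800, 600);
        for (String format : new String[] {"png", "jpg"}) {
            System.out.println(format + ": " + encode(page, format).length + " bytes");
        }
    }
}
```

Running the same loop over a few real scans of your magazine pages should tell you more than any general rule.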

DjVu (http://djvu.org/) is probably one of the smartest formats for compressing and storing text-oriented images. It actually stores the text as text and adds images for the backgrounds. This means that you can partly treat your documents as text (full-text search, copy/paste of the text, ...). The viewing part is probably well supported in Java.
The biggest problem is the lack of good free support for encoding the documents. There are some open-source tools available, but the last time I checked none of them was very user-friendly or developer-friendly. There are very good commercial tools, but they are pretty expensive.

For lossy compression, you might also want to consider JPEG 2000. It's much more efficient than standard JPEG compression, especially at very low quality settings; it's relatively widely adopted, and there are plenty of encoding/decoding libraries available already.

In addition to cletus' points:
There are tools like pngout, pngcrush and optipng that can compress PNG images even further.

I prefer PNG for any internet image work I'm doing, as GIF has a limited palette and JPG doesn't handle transparency.

DLI image compression is the best available out there. It isn't a standard, but it is still a lot of fun to play with.
https://sites.google.com/site/dlimagecomp/
Have a look at http://www.jpegmini.com/main/home
Recompress the JPG file with one of the tools listed at http://www.maximumcompression.com/data/jpg.php
A noise filter can be used to further improve the signal-to-noise ratio:
http://compression.ru/video/deblocking/index_en.html
This is just for research purposes; I can't think of a way to implement this in Java.

Related

Is there a way to incrementally write to an image file to avoid running out of RAM when rendering a large image?

Currently I am running out of RAM when rendering images and while I have optimized it as much as I can for memory efficiency, I have realized that for images as large as I want it will never be enough so instead I want to write to a file as I am rendering. I have no idea how I could do this and I'm almost certain that most image formats won't be suitable for this as to recalculate compression they would need to put the image being appended to into RAM.
This is my current code. I'm happy to completely switch libraries to get this to work.
panel.logic.Calculate(false); // Renders the image - I can make this render in steps
Graphics2D g2d = bufferedImage.createGraphics();
panel.paint(g2d);
g2d.dispose();
SimpleDateFormat formatter = new SimpleDateFormat("dd-MM-yyyy_HH-mm-ss");
Date date = new Date();
File file = new File(formatter.format(date) + ".png");
try {
    ImageIO.write(bufferedImage, "png", file);
} catch (IOException ioException) {
    ioException.printStackTrace();
}

Yes, this is possible in several image formats, but won't be easy with the standard Java Graphics2D api. In particular, the class java.awt.image.BufferedImage explicitly represents an image where the entire bitmap is held in memory.
I would start by asking, how large are the images you are thinking of here? Unless your generating program is unusually memory constrained, then any image that is too big to hold in memory during generating will then be too big to hold in memory during display, so would be useless, I think?
In order to write an image file in a "streaming" style, you will need a format that allows you to write pixels or regions. This will be hard in image formats that are more sophisticated like JPEG, but easier in image formats that are more pixel oriented like BMP or PNG.
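For BMP, you would need to write the file header, then stream the pixel array into the file, which you could do pixel-by-pixel (or row-by-row) without holding the whole thing in memory. An uncompressed BMP has no footer, so nothing needs to be written after the pixel data; the only catch is that each row is padded to a multiple of 4 bytes. The format is described here: https://en.wikipedia.org/wiki/BMP_file_format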
For PNG, it would be much the same, except that the file format is quite a bit more complicated and involves a compression layer (which can still be handled in a streaming format).
There aren't many libraries to handle this approach, because of the obvious limitations I outlined above and that other commenters have outlined: if an image is so large that it needs it, then it will be too large to ever display it.
I think you might be able to persuade ImageIO to write a BMP or PNG in a streaming fashion if you implement a custom RenderedImage class. I might see if I can get that to work; I'll update here if so.
Example code
I had a go at writing a PNG image in a streaming fashion using ImageIO.
Here's the code:
https://gist.github.com/RichardBradley/e7326ec777faccb9579ad4e0b0358f87
I found that the PNG encoder will request the image one scanline at a time, regardless of the Tile settings of the Image.
See com/sun/imageio/plugins/png/PNGImageWriter.java:818
(This may in fact be a bug in PNGImageWriter, but no-one has noticed because no-one writes images in a streaming style in real world use.)
If you want to stream the data pixel-by-pixel instead of line-by-line, you could fork PNGImageWriter and refactor that section. I think it should be quite possible.
(I am not 100% certain that something inside the ImageIO / PNGImageWriter pipeline will not just buffer the image to memory anyway. You could turn the image size right up and retest to be sure.)
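For comparison, the BMP route described above can be done without ImageIO at all. The following is a minimal sketch, not a tested production writer: the class name StreamingBmpWriter and its methods are my own invention, it only handles uncompressed 24-bit images, and it stores rows top-down (negative height in the header) so the renderer can emit rows in natural order. Only one row is ever held in memory.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Writes an uncompressed 24-bit BMP one row at a time. */
public class StreamingBmpWriter implements AutoCloseable {
    private static final int HEADER_SIZE = 54;  // 14-byte file header + 40-byte info header
    private final OutputStream out;
    private final int width;
    private final byte[] rowBuffer;             // one padded row of B,G,R bytes

    public StreamingBmpWriter(OutputStream target, int width, int height) throws IOException {
        int rowSize = (width * 3 + 3) / 4 * 4;  // rows are padded to a multiple of 4 bytes
        long dataSize = (long) rowSize * height;
        if (width < 1 || height < 1 || dataSize > Integer.MAX_VALUE - HEADER_SIZE) {
            throw new IllegalArgumentException("Image size not supported by this sketch");
        }
        this.out = new BufferedOutputStream(target);
        this.width = width;
        this.rowBuffer = new byte[rowSize];

        ByteBuffer header = ByteBuffer.allocate(HEADER_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        header.put((byte) 'B').put((byte) 'M');
        header.putInt((int) (HEADER_SIZE + dataSize)); // total file size
        header.putInt(0);                              // reserved
        header.putInt(HEADER_SIZE);                    // offset of the pixel data
        header.putInt(40);                             // BITMAPINFOHEADER size
        header.putInt(width);
        header.putInt(-height);                        // negative height: rows stored top-down
        header.putShort((short) 1);                    // color planes
        header.putShort((short) 24);                   // bits per pixel
        header.putInt(0);                              // BI_RGB, i.e. no compression
        header.putInt((int) dataSize);                 // size of the pixel data
        header.putInt(2835).putInt(2835);              // resolution, roughly 72 DPI
        header.putInt(0).putInt(0);                    // no palette
        out.write(header.array());
    }

    /** Writes the next row, top to bottom; each int is a pixel in 0xRRGGBB form. */
    public void writeRow(int[] rgb) throws IOException {
        if (rgb.length != width) {
            throw new IllegalArgumentException("Expected " + width + " pixels per row");
        }
        for (int x = 0; x < width; x++) {
            rowBuffer[3 * x] = (byte) rgb[x];              // blue
            rowBuffer[3 * x + 1] = (byte) (rgb[x] >> 8);   // green
            rowBuffer[3 * x + 2] = (byte) (rgb[x] >> 16);  // red
        }
        out.write(rowBuffer);                              // padding bytes stay zero
    }

    @Override
    public void close() throws IOException {
        out.close();
    }
}
```

You would wrap a FileOutputStream in it, render one row (or one band of rows) at a time, and call writeRow for each. Because the sizes in the BMP header are 32-bit, this stops working somewhere below 4 GB; the sketch is more conservative and refuses anything above 2 GB.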

Your problem is not that the final image may not fit in memory.
The problem is that the rendering process takes too much memory.
This means that you have to modify the rendering process in such a way that it will write its intermediate results to disk instead of keeping it in memory.
This may mean that you can use the BMP format and write bit by bit to the disk as described in the answer provided by Rich (ok, in larger chunks, not really each single bit …), or you write an intermediate format of your own, or you allocate disk memory as cache memory.
But when your current rendering process finishes without an OOME, writing the resulting image to disk cannot be the real issue; only when writing would mean that the given data structure has to be converted again into a particular format, this could cause an issue (for example, the renderer returns a byte array holding the image as BMP, but the output should be a JPEG – in that case, you may have to hold the image in memory twice, and that could cause the OOME).
But without knowing details about what panel.logic.Calculate() and panel.paint() are really doing (in detail!), the question is difficult to answer.

First, I think you can assign more memory to the JVM with the -Xmx JVM parameter.
Second, use a lazy-loading strategy: split the image into pieces and load each piece only when it is displayed on the panel.

Maybe you can take the following steps:
Downsize the image
Change image format
Assign more memory to JVM

BufferedImage reduces Image size

I was using java ImageIO and BufferedImage for some image operations and wanted to see how it behaves when I output the image as a jpg. There I found some interesting behaviour that I can't quite explain. I have the following code. The code reads an image and then outputs the same image as "copy.jpg" in the same folder. The code is in Kotlin, but the functions used are java functions:
val image = File("some/image/path.jpg")
val bufImage = ImageIO.read(image.inputStream())
FileOutputStream(File(image.parentFile, "copy.jpg")).use { os ->
    ImageIO.write(bufImage, "jpg", os)
}
I would expect it to output exactly the same file, except maybe for the meta information. However, the resulting file was almost a tenth of the size of the original. I doubt the meta information would account for that much. The exact size difference varied depending on which image file I used, but the output image was smaller every time. Yet I could not see a quality difference from the old file; when zooming in, I would see the same pixels.
Why is the file size reduced so dramatically?

JPEG is lossy compression: it throws away lots of information in order to keep the file small.  (An uncompressed image file could be orders of magnitude larger.)
It's intended to throw away information that you're not likely to see or care about, of course; but it still loses some image data.
And the loss is generational: if you have an image that came from a JPEG file, and then recompress it to a JPEG file, it will usually lose more data, giving a worse-quality result than the first JPEG file — even if the compression settings are exactly the same.  (Trying to approximate an already-compressed image won't work the same as trying to approximate the original source image. And there's no way to recover information which is already lost!)
That's almost certainly what's happening here.  Your code reads a JPEG file and expands it into a BufferedImage (which holds the uncompressed image data), and then compresses it again into a new JPEG file, which loses further quality.  It's probably using a lot higher compression than the first file used, hence the smaller size.
I'd be surprised if you couldn't see any difference between the two JPEG files in an image viewer or editor, when magnified.  (JPEG artefacts are most obvious around sharp edges and boundaries, but if you know what to look for you can sometimes see them elsewhere.  Subtle changes can be easier to see if you can line up both images on the exact same area of screen and flip directly between them.)
You can control how much information is lost when creating a JPEG — but the ImageIO.write() method you're using doesn't provide a way to do that.  See this question for how to do it.  (It's in Java, but you should be able to follow it.)
Obviously, the more information you're prepared to lose, the smaller file you can end up with.  But note that if you choose a high-quality setting, the result could be a lot larger than the first JPEG, even though it will probably still lose slightly more quality.
(That's why, if you're doing any sort of processing on an image, it's best to keep it in lossless formats until the very end, and compress to a lossy format like JPEG only once, to avoid losing quality each time you save and reload.)
As you indicate, another reason could be the loss of non-image data — you're unlikely to notice the loss of metadata such as camera settings, but the file could have had a sizeable thumbnail image too.
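For reference, here is roughly what setting the quality explicitly looks like with the standard javax.imageio classes. Treat it as a sketch: the class and method names (JpegQuality, encodeJpeg) are mine, and it writes to memory only to keep the example short. As far as I know, the built-in JPEG writer uses a quality of 0.75 when you don't specify one, which would explain the smaller files if the originals were saved at a higher setting.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

public class JpegQuality {

    /**
     * Encodes the image as JPEG with an explicit quality between 0.0 (smallest file)
     * and 1.0 (best quality), and returns the encoded bytes.
     */
    public static byte[] encodeJpeg(BufferedImage image, float quality) {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(bytes)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(out);
            // No stream metadata, no thumbnails, no image metadata.
            writer.write(null, new IIOImage(image, null, null), param);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            writer.dispose();
        }
        // The ImageOutputStream is closed at this point, so everything has been flushed.
        return bytes.toByteArray();
    }
}
```

Comparing encodeJpeg(image, 0.75f).length with encodeJpeg(image, 0.95f).length for one of your files should show how strongly the setting drives the file size.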

Pure Java alternative to JAI ImageIO for detecting CMYK images

First, I'd like to explain the situation/requirements that led to the question:
In our web application we can't support CMYK images (JPEG) since IE 8 and below can't display them.
Thus we need to detect when someone wants to upload such an image and deny it.
Unfortunately, Java's ImageIO won't read those images or would not enable me to get the detected color space. From debugging it seems like JPEGImageReader internally gets the color space code 11 (which would mean JCS_YCCK) but I can't safely access that information.
When querying the reader for the image types I get nothing for CMYK, so I might assume no image types = unsupported image.
I converted the source CMYK image to RGB using an imaging tool in order to test whether it would then be readable (I tried to simulate the admin's steps when getting the message "No CMYK supported"). However, JPEGImageReader would not read that image either, since it assumes (comment in the source!) a 3-component RGB color space, but the image header reports 4 components (maybe RGBA or ARGB), and thus an IllegalArgumentException is thrown.
Thus, ImageIO is not an option since I can't reliably get the color space of an image and I can't tell the admin why an otherwise fine image (it can be displayed by the browser) would not be accepted due to some internal error.
This led me to try JAI ImageIO whose CLibJPEGImageReader does an excellent job and correctly reads all my test images.
However, since we're deploying our application in a JBoss that might host other applications as well, we'd like to keep them as isolated as possible. AFAIK, I'd need to install JAI ImageIO to the JRE or otherwise make the native libs available in order to use them, and thus other applications might get access to them as well, which might cause side effects (at least we'd have to test a lot to ensure that's not the case).
That's the explanation for the question, and here it comes again:
Is there any pure Java alternative to JAI ImageIO which reliably detects and possibly converts CMYK images?
Thanks in advance,
Thomas

I found a solution that is OK for our needs: Apache Commons Sanselan. This library reads JPEG headers quite quickly and accurately (at least for all my test images), and it handles a number of other image formats as well.
The downside is that it won't read JPEG image data, but I can do that with the basic JRE tools.
Reading JPEG images for conversion is quite easy (the ones that ImageIO refuses to read, too):
JPEGImageDecoder decoder = JPEGCodec.createJPEGDecoder(new FileInputStream( new File(pFilename) ) );
BufferedImage sourceImg = decoder.decodeAsBufferedImage();
Then if Sanselan tells me the image is actually CMYK, I get the source image's raster and convert myself:
for( /*each pixel in the raster, which is represented as int[4]*/ )
{
    double k = pixel[3] / 255.0;
    double r = (255.0 - pixel[0])*k;
    double g = (255.0 - pixel[1])*k;
    double b = (255.0 - pixel[2])*k;
}
This gives quite good results, with the RGB images being neither too bright nor too dark. However, I'm not sure why multiplying by k prevents the brightening. The JPEG is actually decoded in native code, and the CMYK->RGB conversion formula I found states something different; I just tried the multiplication to see the visual result.
If anybody could shed some light on this, I'd be grateful.
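In case it is useful to anyone, the loop above could be written out in full along these lines. This is a sketch only: the class name CmykRasterConverter and its methods are made up, it applies exactly the multiply-by-k formula from this post (which, as said, was found by experiment and is not a color-managed conversion), and it assumes the raster really has four 8-bit bands.

```java
import java.awt.image.BufferedImage;
import java.awt.image.Raster;

public class CmykRasterConverter {

    /** Applies the multiply-by-k formula to one 4-band pixel and returns 0xRRGGBB. */
    public static int toRgb(int[] pixel) {
        double k = pixel[3] / 255.0;
        int r = (int) Math.round((255.0 - pixel[0]) * k);
        int g = (int) Math.round((255.0 - pixel[1]) * k);
        int b = (int) Math.round((255.0 - pixel[2]) * k);
        return (r << 16) | (g << 8) | b;
    }

    /** Converts a 4-band raster (assumed 8 bits per band) into an RGB BufferedImage. */
    public static BufferedImage convert(Raster source) {
        if (source.getNumBands() != 4) {
            throw new IllegalArgumentException("Expected 4 bands, got " + source.getNumBands());
        }
        int width = source.getWidth();
        int height = source.getHeight();
        BufferedImage rgb = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int[] pixel = new int[4];   // reused for every pixel to avoid allocations
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                source.getPixel(source.getMinX() + x, source.getMinY() + y, pixel);
                rgb.setRGB(x, y, toRgb(pixel));
            }
        }
        return rgb;
    }
}
```

With the decoder from above that would be called as CmykRasterConverter.convert(sourceImg.getRaster()).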

I've posted a pure Java solution for reading all sorts of JPEG images and converting them to RGB.
It's built on the following facts:
While ImageIO cannot read JPEG images with CMYK as a buffered image, it can read the raw pixel data (raster).
Sanselan (or Apache Commons Imaging as it's called now) can be used to read the details of CMYK images.
There are images with inverted CMYK values (an old Photoshop bug).
There are images with YCCK instead of CMYK (can easily be converted).
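The first point can be used on its own for a rough detection step. The sketch below is an illustration with made-up names (JpegBandCounter, countBands), not the code from the linked solution: it reads the raw raster with the standard JPEG reader and reports the number of bands. Four bands suggest CMYK or YCCK, three suggest RGB/YCbCr. Note that it decodes the whole image, so reading the header with Sanselan is cheaper if detection is all you need.

```java
import java.awt.image.Raster;
import java.io.IOException;
import java.io.InputStream;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

public class JpegBandCounter {

    /** Returns the number of color bands in the first image of the given JPEG stream. */
    public static int countBands(InputStream jpeg) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(jpeg)) {
            Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("jpeg");
            if (!readers.hasNext()) {
                throw new IOException("No JPEG reader available");
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in);
                // readRaster skips the color conversion that fails for 4-component JPEGs.
                Raster raster = reader.readRaster(0, null);
                return raster.getNumBands();
            } finally {
                reader.dispose();
            }
        }
    }
}
```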

Beware of the other answer: Java 7 does not allow Sun's implementation to be used directly without special parameters, as indicated in import com.sun.image.codec.jpeg.*.

In our web application we can't support CMYK images (JPEG) since
IE 8 and below can't display them. Thus we need to detect when someone
wants to upload such an image and deny it.
I don't agree with your "Thus we need to detect when someone wants to upload such an image and deny it". A much more user-friendly policy would be to convert it to something other than CMYK.
The rest of your post is a bit confusing in that regard, since you ask for both detection and conversion, which are two different things. Once again, I think converting the image is much more user-friendly.
No need to write in bold btw:
Is there any pure Java alternative to JAI ImageIO which reliably
detects and possibly converts CMYK images?
Pure Java I don't know, but ImageMagick works fine to convert CMYK image to RGB ones. Calling ImageMagick on the server-side from Java really isn't complicated. I used to do it manually by calling an external process but nowadays there are wrappers like JMagick and im4java.

How can I do image manipulation on a very large BMP?

I am trying to do some manipulation (specifically, conversion to a different type of splitting into tiles) on a set of very large (a few GB) BMP image files.
I'm not sure I understand the BMP file format, but is it necessary to load the entire file into memory? I was unable to find any API that didn't require loading the entire file at some point. ImageMagick wasn't able to do it either.
Java would be the best tool of choice for me, but any other solution including command line tools or desktop software would be acceptable.

Based on this, it should be reasonably apparent that you can use the fact that it's row-packed. You should be able to read part of a row, store it, advance to the same position in the next row, and repeat until you have completed a tile of the desired size. Obviously, you may be able to do multiple tiles at once if you can store an entire row's worth of tiles in memory.
It is not necessary to load more than the header of the file at once. The header format is described in the linked Wikipedia entry. It's probably worth paying attention to any compression scheme being used: compression is likely to make this task a bit harder (though still not impossible) :)
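The row-by-row idea above could look something like this in Java. It is a minimal sketch under several assumptions: the class and method names (BmpTileReader, readTile) are invented, only uncompressed 24-bit files with a standard 40-byte info header are handled, and the tile is returned as raw B,G,R bytes with rows ordered top to bottom.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class BmpTileReader {

    /** Reads one tile from an uncompressed 24-bit BMP without loading the whole file. */
    public static byte[] readTile(String path, int tileX, int tileY, int tileW, int tileH)
            throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            byte[] raw = new byte[54];
            file.readFully(raw);
            ByteBuffer header = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
            long pixelOffset = header.getInt(10) & 0xFFFFFFFFL; // start of the pixel array
            int width = header.getInt(18);
            int rawHeight = header.getInt(22);
            int bitsPerPixel = header.getShort(28);
            int compression = header.getInt(30);
            if (raw[0] != 'B' || raw[1] != 'M' || bitsPerPixel != 24 || compression != 0) {
                throw new IOException("Only uncompressed 24-bit BMP is handled in this sketch");
            }
            int height = Math.abs(rawHeight);
            boolean bottomUp = rawHeight > 0;      // positive height: last image row comes first
            if (tileX < 0 || tileY < 0 || tileW < 1 || tileH < 1
                    || tileX + tileW > width || tileY + tileH > height) {
                throw new IllegalArgumentException("Tile lies outside the image");
            }
            long rowSize = ((long) width * 3 + 3) / 4 * 4;  // rows are padded to 4 bytes
            byte[] tile = new byte[tileW * tileH * 3];
            for (int row = 0; row < tileH; row++) {
                int imageRow = tileY + row;
                long fileRow = bottomUp ? height - 1 - imageRow : imageRow;
                // Jump straight to the tile's slice of this row and read only that.
                file.seek(pixelOffset + fileRow * rowSize + (long) tileX * 3);
                file.readFully(tile, row * tileW * 3, tileW * 3);
            }
            return tile;
        }
    }
}
```

For a few GB of input, reading a whole row of tiles per pass (as suggested above) would mean far fewer seeks than fetching each tile separately.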

I'm not sure what was available in 2011, when this question was asked, but Java 7 (and possibly 6) has a good approach for this.
Use the ImageReader and ImageReadParam classes; they let you read any rectangular part of the source image. Set the source rectangle and read:
... // initialization
private javax.imageio.ImageReadParam m_params;
private javax.imageio.ImageReader m_reader;
... // reading
m_params.setSourceRegion( readRect );
BufferedImage rdImg = m_reader.read( i, m_params );
... // processing/displaying etc
As BMP is a very simple format, uncompressed in 99.99% of cases (this approach also works for any standard Java image type), reading a part of it this way is fast and uses little memory apart from the resulting BufferedImage. You can even reuse the previous BufferedImage on subsequent requests for a rectangle of the same dimensions; see ImageReadParam.setDestination(BufferedImage destination). I haven't tested this option, though.
I also found one small reading bug in the BMP reader class implementation of the Java runtime library. If anybody is interested, I'll show a simple way to correct it.
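A complete version of the fragments above might look like the following. The wrapper class and method (RegionReader, readRegion) are hypothetical names of mine; the ImageIO calls themselves are the standard ones. Whether the reader really avoids touching the rest of the file depends on the format and the reader implementation, so measure it on your own files.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

public class RegionReader {

    /** Reads only the given rectangle of the first image in the file. */
    public static BufferedImage readRegion(File file, Rectangle region) throws IOException {
        try (ImageInputStream in = ImageIO.createImageInputStream(file)) {
            if (in == null) {
                throw new IOException("Cannot open " + file);
            }
            // Picks a reader by sniffing the file contents (BMP, PNG, JPEG, ...).
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No ImageIO reader for " + file);
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in);
                ImageReadParam param = reader.getDefaultReadParam();
                param.setSourceRegion(region);
                // The returned image has the size of the region, not of the whole file.
                return reader.read(0, param);
            } finally {
                reader.dispose();
            }
        }
    }
}
```

To cut a large file into tiles, you would call readRegion in a loop with a moving Rectangle and write each result out with ImageIO.write.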

library for server side image resampling using java?

I want to serve resampled (downsized) versions of images using JSP. The original images are stored in the database as blobs. I want to create a JSP that serves a downsampled image with decent quality (not pixelated) according to the passed image width/height (e.g. getimage.jsp?imageid=xxxx&maxside=200). Can you point me to an open-source API or code that I can call from the JSP page?

Java already contains libraries for image manipulation. It should be easy to resize an image and output it from a JSP.
This servlet looks like it does a very similar thing to what you want your JSP to do.

Is there anything wrong with the built-in Image.getScaledInstance(w, h, hints)? (*)
Use hints=Image.SCALE_SMOOTH to get non-horrible thumbnailing. Then use an ImageIO to convert to the required format for output.
*: well yes, there is something wrong with it: it's a bit slow, but with all the other web overhead to worry about, that's not likely to be much of an issue. It's also not the best quality when upscaling images, where a drawImage with the BICUBIC rendering hint is more suitable. But you're talking about downscaling only at the moment.
Be sure to check the sizes passed in so that you can't DoS your servlet by passing in enormous sizes causing a memory-eatingly-huge image to be created.
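Put together, the scaling part could be sketched like this. The class name, the method name and the MAX_ALLOWED_SIDE limit of 2000 pixels are assumptions of mine; pick whatever bound suits your site. The servlet/JSP plumbing and the blob loading are left out.

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;

public class Thumbnailer {
    /** Hypothetical upper bound on the requested size, to keep memory use in check. */
    static final int MAX_ALLOWED_SIDE = 2000;

    /** Downscales so that the longer side is at most maxSide pixels; never upscales. */
    public static BufferedImage scaleToMaxSide(BufferedImage source, int maxSide) {
        if (maxSide < 1 || maxSide > MAX_ALLOWED_SIDE) {
            throw new IllegalArgumentException("maxside must be between 1 and " + MAX_ALLOWED_SIDE);
        }
        int longest = Math.max(source.getWidth(), source.getHeight());
        double factor = Math.min(1.0, (double) maxSide / longest);
        int width = Math.max(1, (int) Math.round(source.getWidth() * factor));
        int height = Math.max(1, (int) Math.round(source.getHeight() * factor));

        // SCALE_SMOOTH averages source pixels, which avoids the pixelated look.
        Image scaled = source.getScaledInstance(width, height, Image.SCALE_SMOOTH);

        // Copy into a BufferedImage so that ImageIO can encode the result.
        BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = target.createGraphics();
        g.drawImage(scaled, 0, 0, null);
        g.dispose();
        return target;
    }
}
```

In the servlet you would then call something like ImageIO.write(Thumbnailer.scaleToMaxSide(original, maxSide), "jpg", response.getOutputStream()) after setting the content type.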
