PDFBox convert PDF to TIFF. Reduce image size in bytes - java

I'm trying to convert a PDF to a TIFF image. I got it working with PDFBox, but the image is too big.
For example, a 224 KB PDF produces a 1.4 MB image.
How can I make the TIFF file smaller without losing quality?
Here is some of the code:
TIFFImageWriterSpi tiffspi = new TIFFImageWriterSpi();
ImageWriter writer = tiffspi.createWriterInstance();
// One write param is enough; cast it to reach the TIFF-specific settings
TIFFImageWriteParam param = (TIFFImageWriteParam) writer.getDefaultWriteParam();
param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
// LZW is a lossless compression type
param.setCompressionType("LZW");
param.setCompressionQuality(1.0f);
writer.setOutput(output);
writer.write(null, new IIOImage(image, null, null), param);

Here are some guidelines:
Match the colours to your output. If you are rendering in black and white, use bi-level output, which translates to one bit per pixel. If you have few colours without too much shading or mixing, like highlight colouring or cartoon-style graphics, use 256 colours. Only use full colour if you have photographs in your PDF. If you have to produce full colour, your quest for smallness is doomed.
Match your compression to your colour depth. For monochrome use CCITT T.4 or CCITT T.6, which are far more efficient for bit sequences. LZW works best on byte sequences such as 256-colour. If you have to produce full colour, your only hope of decent compression is JPEG, but this will fuzz your text and lines.
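As a rough sketch of the monochrome guideline, the following converts a rendered page to one bit per pixel and writes it with CCITT T.6 compression. It assumes Java 9 or later, where javax.imageio ships its own TIFF writer (the question uses the separate JAI plugin instead). The class and method names are made up for illustration.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper, not part of PDFBox or ImageIO.
class BilevelTiff {

    // Redraw the image onto a 1-bit-per-pixel (black/white) buffer.
    static BufferedImage toBilevel(BufferedImage src) {
        BufferedImage bw = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = bw.createGraphics();
        g.drawImage(src, 0, 0, null);
        g.dispose();
        return bw;
    }

    // Encode as TIFF with the given compression type, e.g. "CCITT T.6" or "LZW".
    // Assumption: the built-in TIFF writer of Java 9+ is available.
    static byte[] encodeTiff(BufferedImage img, String compressionType) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("tiff").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(bytes)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionType(compressionType);
            writer.setOutput(out);
            writer.write(null, new IIOImage(img, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }
}
```

A call such as BilevelTiff.encodeTiff(BilevelTiff.toBilevel(image), "CCITT T.6") would replace the LZW write in the question. Whether the black-and-white result is acceptable depends on the PDF content.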

Related

Should I convert BufferedImage.TYPE_4BYTE_ABGR to BufferedImage.TYPE_3BYTE_BGR?

I am working on image interpolation, using bicubic interpolation to double the resolution of an image in Java with AffineTransformOp. I used a BufferedImage of TYPE_4BYTE_ABGR for the upscaling. When I tried to save the upscaled image with ImageIO.write, I found that OpenJDK does not support JPEG encoding for TYPE_4BYTE_ABGR, so I converted the upscaled image from TYPE_4BYTE_ABGR to TYPE_3BYTE_BGR. After saving it, I found that the upscaled file is far smaller (about half the size) than the original image.
So I assume that the original (input) image is stored with four channels (ARGB) while the upscaled (output) image uses three channels (RGB), and that is why it is smaller.
Now my question is: should I use this conversion?
Is any information lost?
Does the quality of the image remain the same?
P.S. I've read in the ImageIO documentation that when ARGB is converted to RGB, the alpha value is premultiplied into the RGB values, and I think this should not affect the quality of the image.
I solved my problem and want to share my answer. My original image was actually grayscale, with a gray color space (a single 8-bit channel) and a JPEG quality of 90. The problem arose because I used TYPE_4BYTE_ABGR for the upscaling instead of TYPE_BYTE_GRAY. Secondly, when you save the image to a file in JPEG format, ImageIO.write uses a compression quality of 75 by default, so the file gets smaller. Use a compression quality that suits you, or save the image in PNG format. You can view information about your image with identify -verbose image.jpg on Linux, which shows the color space, image type, quality, etc. You can check this post to see how to set the compression quality manually in ImageIO.
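The two fixes described in this answer could look roughly like the sketch below: upscale into a TYPE_BYTE_GRAY buffer and write the JPEG with an explicit quality. Note that it swaps in Graphics2D with a bicubic rendering hint rather than the AffineTransformOp from the question. The class and method names are hypothetical, and 0.9f in the usage note is only the quality value mentioned above.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper for illustration only.
class GrayJpeg {

    // Double the resolution, keeping a single 8-bit gray channel.
    static BufferedImage upscale2x(BufferedImage src) {
        int w = src.getWidth() * 2;
        int h = src.getHeight() * 2;
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BICUBIC);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return dst;
    }

    // Encode as JPEG with an explicit quality between 0.0 and 1.0
    // instead of relying on the writer's default.
    static byte[] encodeJpeg(BufferedImage img, float quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(bytes)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(out);
            writer.write(null, new IIOImage(img, null, null), param);
        } finally {
            writer.dispose();
        }
        return bytes.toByteArray();
    }
}
```

For example, GrayJpeg.encodeJpeg(GrayJpeg.upscale2x(input), 0.9f) keeps the output in the same single-channel form as a grayscale input.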

Extract TIFF images from PDF without decoding

With the help of iText 5, I would like to extract all TIFF images from a given PDF file and save them as TIFF files.
Examples and other posts (1, 2) use the following method:
1. Create a PdfImageObject from the PDF stream, which in line 189 decodes the image stream (if a corresponding filter implementation is present).
2. Call PdfImageObject#getImageAsBytes(), which returns JPEG (original), PNG (re-encoded) or TIFF (in case of 8 bits per pixel).
As a result, a TIFF image with 1-bit color depth is converted to PNG, which is not what I need.
Another approach would be to call PdfImageObject#getBufferedImage(), which decodes the image in step (2) into a raster, and afterwards encode it again as TIFF using ImageIO.write(bufferedImage, "tiff", file).
As one can see, this is not efficient. Another solution, shown in this post, demonstrates how to save the encoded TIFF image stream to a file by prepending a TIFF header to it. That is the solution I am looking for.
Can iText help here?
PDF images are not TIFF images.
PDFs however can contain images that use compression techniques that are also used in TIFF, e.g. Flate, CCITT, LZW, JPEG.
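Because only the compression is shared, wrapping the raw stream means writing the TIFF header and tag directory yourself. The sketch below does this for a CCITT Group 4 stream (a CCITTFaxDecode image with K < 0), using only the JDK. It is not an iText API. It assumes the caller has already read the raw stream bytes, /Width and /Height from the image dictionary, and that the remaining filter parameters are at their defaults. The photometric value (0 = WhiteIsZero, 1 = BlackIsZero) depends on /BlackIs1 and /Decode and is left to the caller.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper: puts a minimal single-strip TIFF header
// in front of an already encoded CCITT Group 4 stream.
class CcittTiffWrapper {
    private static final int SHORT = 3; // TIFF field type: 16-bit unsigned
    private static final int LONG = 4;  // TIFF field type: 32-bit unsigned

    static byte[] wrapGroup4(byte[] ccittData, int width, int height, int photometric) {
        final int entries = 8;
        final int ifdOffset = 8; // the directory follows the 8-byte header
        // entry count (2) + entries (12 bytes each) + next-IFD pointer (4)
        final int dataOffset = ifdOffset + 2 + entries * 12 + 4;

        ByteBuffer buf = ByteBuffer.allocate(dataOffset + ccittData.length)
                .order(ByteOrder.LITTLE_ENDIAN);
        // Header: "II" (little endian), magic number 42, offset of first IFD
        buf.put((byte) 'I').put((byte) 'I').putShort((short) 42).putInt(ifdOffset);

        buf.putShort((short) entries);
        // Entries must be sorted by ascending tag number.
        entry(buf, 256, LONG, width);             // ImageWidth
        entry(buf, 257, LONG, height);            // ImageLength
        entry(buf, 258, SHORT, 1);                // BitsPerSample
        entry(buf, 259, SHORT, 4);                // Compression: CCITT Group 4
        entry(buf, 262, SHORT, photometric);      // PhotometricInterpretation
        entry(buf, 273, LONG, dataOffset);        // StripOffsets
        entry(buf, 278, LONG, height);            // RowsPerStrip: one strip
        entry(buf, 279, LONG, ccittData.length);  // StripByteCounts
        buf.putInt(0);                            // no further IFD

        buf.put(ccittData);                       // image data, untouched
        return buf.array();
    }

    private static void entry(ByteBuffer buf, int tag, int type, int value) {
        buf.putShort((short) tag).putShort((short) type).putInt(1).putInt(value);
    }
}
```

The image data is copied as-is, so nothing is decoded or re-encoded. Streams that use other filters or non-default parameters would need different tags.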

converting an image format with ImageJ

I have the following code that I use to convert an image from fits format to jpeg format
ImagePlus fitsImage = openImage(fitsImagePath);
final File out = new File(fullPath + fileNameNoExt + ".jpg");
BufferedImage jpgImage = fitsImage.getBufferedImage();
ImageIO.write(jpgImage, "jpg", out);
The actual format change works and I do get a JPG file, but the problem is that the resulting file is in black and white, and I know for a fact that the image I am using is colored.
So the question is: what should I do to make the resulting image colored?
cheers,
es
For some reason the getBufferedImage() function is only copying the data in an 8-bit format. As I am unfamiliar with the FITS format: what pixel depth does it have, and what pixel depth does your data have?
If you are importing 8-bit data that is false colored with red, green, or blue, then on export it will keep its 8-bit grayscale values and not the false color.
If you want it to keep its RGB colors, you will have to convert it into an RGB format before exporting.
The flatten() method might help, as it returns a copy of the image converted to RGB format:
fitsImage.flatten()

How to save lossless jpg in java?

I have to save a JPEG image losslessly. I am working on a steganography project, but Java compresses my result when saving it. I have searched many forums and tried everything, but nothing worked.
Here is my example code for saving a JPEG image losslessly:
BufferedImage image = ImageIO.read(new File("sources/image.jpg"));
ImageWriter writer = ImageIO.getImageWritersByFormatName("JPEG").next();
JPEGImageWriteParam jpegParams = new JPEGImageWriteParam(null);
jpegParams.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
jpegParams.setCompressionQuality(1f);
writer.setOutput(ImageIO.createImageOutputStream(new File("example.jpg")));
writer.write(null, new IIOImage(image,null,null), jpegParams);
writer.dispose();
After this process, the PSNR value I compute is 28.53173, and example.jpg is bigger than image.jpg.
I tried to import the JAI library, but I am not sure whether Java 8 supports JAI.
Conventional JPEG is always lossy. Even at 100% compression quality there will be some loss of information, although it will be minimised.
Your example.jpg is bigger because it was encoded with a 100% quality factor, while most JPEG encoders have a default value of 50%-75%, which is most likely what was used for the original image.jpg. You can try different quality factors to see when both files have the same size.
A lossless JPEG format does exist (available in JAI), but it should be thought of as a different format from conventional JPEG. It is not widely used and you will probably not be able to view it in most applications, which would practically defeat the point of sharing an innocent image.
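One way to see the difference is to round-trip an image through each encoder and compare pixels. The sketch below is only an illustration: it encodes to memory with the standard ImageIO JPEG and PNG writers, and the class and method names are made up. PNG is used here as an example of a lossless format, not as something the question asked for.

```java
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper for illustration only.
class LossCheck {

    // Encode as JPEG at the given quality (0.0 to 1.0) and decode again.
    static BufferedImage jpegRoundTrip(BufferedImage img, float quality) throws IOException {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(bytes)) {
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(out);
            writer.write(null, new IIOImage(img, null, null), param);
        } finally {
            writer.dispose();
        }
        return ImageIO.read(new ByteArrayInputStream(bytes.toByteArray()));
    }

    // Encode as PNG and decode again.
    static BufferedImage pngRoundTrip(BufferedImage img) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ImageIO.write(img, "png", bytes);
        return ImageIO.read(new ByteArrayInputStream(bytes.toByteArray()));
    }

    // Mean squared error over the R, G and B channels of two same-sized images.
    static double mse(BufferedImage a, BufferedImage b) {
        double sum = 0;
        for (int y = 0; y < a.getHeight(); y++) {
            for (int x = 0; x < a.getWidth(); x++) {
                int p = a.getRGB(x, y);
                int q = b.getRGB(x, y);
                for (int shift = 0; shift <= 16; shift += 8) {
                    int d = ((p >> shift) & 0xFF) - ((q >> shift) & 0xFF);
                    sum += d * d;
                }
            }
        }
        return sum / (3.0 * a.getWidth() * a.getHeight());
    }

    // PSNR in dB for 8-bit channels; infinite when the images are identical.
    static double psnr(double mse) {
        return 10 * Math.log10(255.0 * 255.0 / mse);
    }
}
```

Comparing LossCheck.mse(image, LossCheck.jpegRoundTrip(image, 1f)) with the PNG equivalent on your own cover image shows whether hidden bits would survive the save.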

Java: Send BufferedImage through Socket with a low bitdepth

The title says enough I think.
I have a full-quality BufferedImage and I want to send it through an OutputStream with a low bit depth. I don't want an algorithm that changes the quality pixel by pixel, so it should still be full quality.
So the goal is to write the image (full resolution, full size) to the OutputStream using as few bytes as possible.
Thanks,
Martijn
You need to encode the image data into the format that has the right characteristics for your image. If it's 24-bit color and has a lot of colors and you want to lose no quality, you are probably stuck with PNG, but look into lossless JPEG 2000.
If you can lose some quality, then try
Lossy JPEG 2000 -- much smaller than JPEG for the same quality loss
reducing the number of colors and using a color mapped format
If the image has only gray or black and white data, make sure that you are encoding it as such (8-bit gray or 1-bit black and white). Then, make sure you use an encoder that is tuned for that kind of format (for example TIFF with Group 4 or JBIG2).
Another good option is to remove all of the unwanted meta-data from the image (or make sure that your encoder doesn't put any in).
If you want to stick with what's in Java, you probably have to use TIFF, PNG or JPEG -- there are third-party image encoders (for example, my company, Atalasoft, makes a .NET version of these advanced encoders -- there are Java vendors out there as well).
FYI: reducing bit depth usually means reducing quality (unless the reduction is meaningless). For example if I have a 24-bit color image, but all of the colors are gray (where R==G==B), then reducing to 8-bit gray does not lose quality. This is also true if your image has any collection of 256 different colors and you switch to a color mapped (indexed or paletted) format. You reduce the number of bytes and don't reduce quality.
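The color-mapped case can be checked mechanically: count the distinct colors and build a palette only if there are at most 256. The following is a minimal sketch under the assumption that the image is opaque (alpha is ignored). The class and method names are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.awt.image.IndexColorModel;
import java.awt.image.WritableRaster;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only.
class PaletteReducer {

    // Returns an 8-bit indexed copy with identical colors,
    // or null if the image uses more than 256 distinct colors.
    // Assumption: the image is opaque; alpha is ignored.
    static BufferedImage toIndexedIfLossless(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();

        // First pass: collect distinct RGB values and assign palette slots.
        Map<Integer, Integer> palette = new LinkedHashMap<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int rgb = src.getRGB(x, y) & 0xFFFFFF;
                palette.putIfAbsent(rgb, palette.size());
                if (palette.size() > 256) {
                    return null; // too many colors for a lossless palette
                }
            }
        }

        // Build the color map from the collected entries.
        int n = palette.size();
        byte[] r = new byte[n];
        byte[] g = new byte[n];
        byte[] b = new byte[n];
        for (Map.Entry<Integer, Integer> e : palette.entrySet()) {
            int rgb = e.getKey();
            int i = e.getValue();
            r[i] = (byte) (rgb >> 16);
            g[i] = (byte) (rgb >> 8);
            b[i] = (byte) rgb;
        }
        IndexColorModel icm = new IndexColorModel(8, n, r, g, b);

        // Second pass: store one palette index per pixel.
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_INDEXED, icm);
        WritableRaster raster = dst.getRaster();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                raster.setSample(x, y, 0, palette.get(src.getRGB(x, y) & 0xFFFFFF));
            }
        }
        return dst;
    }
}
```

The result holds one byte per pixel instead of three or four and can then be written with an encoder suited to indexed data, such as PNG.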
Have a look at the java.util.zip package for everything about compression:
http://java.sun.com/developer/technicalArticles/Programming/compression/
Though JPEG and PNG are already compressed, you can use the Deflater class to compress the bytes further while keeping exactly the same image quality. The gain from compressing an already compressed format is generally not significant. If you group several images together, though, deflating them can be significant (since deflation recognizes common patterns in the data, which reduces repetition and thus byte count).
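A minimal sketch of lossless byte-level compression with java.util.zip is shown below. It uses the stream wrappers DeflaterOutputStream and InflaterInputStream rather than the Deflater class directly, and it assumes Java 9 or later for readAllBytes. The class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper for illustration only.
class DeflateBytes {

    // Compress arbitrary bytes (for example an encoded image) before sending.
    static byte[] deflate(byte[] data) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(data);
        }
        return out.toByteArray();
    }

    // Restore the original bytes on the receiving side.
    static byte[] inflate(byte[] compressed) throws IOException {
        try (InflaterInputStream in =
                     new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            return in.readAllBytes();
        }
    }
}
```

On a socket, the same wrappers can be placed directly around the socket's OutputStream and InputStream instead of byte arrays.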
