I made a Java application that offers a print preview for PS files.
My program uses Ghostscript and Ghost4J to load the PostScript file and produces a list of Images (one per page) using the SimpleRenderer.render method. A simple JList then shows only the image corresponding to the page the user selected.
This worked fine until I hit a really big PS file, which caused an OutOfMemoryError when executing this line:
PSDocument pdocument = new PSDocument(new File(filename));
I know that it is possible to read a file a little at a time using InputStreams; the problem is that I can't think of a way to map the bytes I read to the actual pages of the document.
For example, I tried to read 100 MB of the PS file at a time:
int buffer_size = 100000000;
byte[] buffer = new byte[buffer_size];
FileInputStream partial = new FileInputStream(filename);
partial.read(buffer, 0, buffer_size);
document.load(new ByteArrayInputStream(buffer));
SimpleRenderer renderer = new SimpleRenderer();
//how many pages do i have to read?
List<Image> images = renderer.render(document, firstpage ??, lastpage ??);
Am I missing some Ghost4J functionality for reading a file partially?
Or does anyone have other suggestions or approaches for solving this problem?
I am really struggling.
I found out that I can use the Ghost4J Core API to retrieve a reduced set of pages from a PostScript file as Images:
Ghostscript gs = Ghostscript.getInstance();
String[] gsArgs = new String[9];
gsArgs[0] = "-dQUIET";
gsArgs[1] = "-dNOPAUSE";
gsArgs[2] = "-dBATCH";
gsArgs[3] = "-dSAFER";
gsArgs[4] = "-sDEVICE=display";
gsArgs[5] = "-sDisplayHandle=0";
gsArgs[6] = "-dDisplayFormat=16#804";
gsArgs[7] = "-sPageList="+firstPage+"-"+lastPage;
gsArgs[8] = "-f"+filename;
//create display callback (capture display output pages as images)
ImageWriterDisplayCallback displayCallback = new ImageWriterDisplayCallback();
//set display callback
gs.setDisplayCallback(displayCallback);
//run PostScript (also works with PDF) and exit interpreter
try {
gs.initialize(gsArgs);
gs.exit();
Ghostscript.deleteInstance();
} catch (GhostscriptException e) {
System.out.println("ERROR: " + e.getMessage());
e.printStackTrace();
}
return displayCallback.getImages(); //return List<Images>
This solves the problem of rendering pages as images for the preview.
However, I could not find a way to use Ghost4J to get the total number of pages of a PS file (in case the file is too big to open with Document.load()).
So I still need some help.
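One possible way to get a page count without loading the document, sketched here under the assumption that the file follows the Adobe Document Structuring Conventions (DSC) and marks every page with a %%Page: comment: stream the file line by line and count those comments. The class and method names are hypothetical and not part of Ghost4J. Files without DSC comments, or with embedded documents that carry their own %%Page: lines, will give a wrong count.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Ghost4J: counts DSC page markers in a PostScript stream.
class PsPageCounter {

    // Reads one line at a time, so memory use does not grow with the file size
    // (assuming the file has ordinary line breaks).
    static int countPages(Reader source) {
        int pages = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // "%%Page: <label> <ordinal>" opens each page in a DSC-conforming file.
                // "%%Pages:" and "%%PageOrder:" do not match this prefix.
                if (line.startsWith("%%Page:")) {
                    pages++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pages;
    }

    public static void main(String[] args) {
        String sample = "%!PS-Adobe-3.0\n%%Pages: 2\n%%Page: 1 1\nshowpage\n"
                + "%%Page: 2 2\nshowpage\n%%EOF\n";
        System.out.println(countPages(new StringReader(sample)));
    }
}
```

For a real file, a reader such as Files.newBufferedReader(Paths.get(filename), StandardCharsets.ISO_8859_1) could be passed in; the result can then bound the firstPage and lastPage values given to -sPageList.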
https://tukaani.org/xz/java.html This site provides an XZ library for compressing and decompressing files. I would like to give it a shot, but I am lost.
Does anyone have experience with this, or know of a tutorial? Thanks.
I used this library recently, and the working code is on my GitHub: XZ compression algorithm. You can use this class in your Android project. The main method below shows how to use it.
public static void main(String[] args){
String input = "Some string blah blah blah";
System.out.println("XZ or LZMA2 library.....");
// If you are using some file instead of plain text you have to
//convert it to bytes here and pass it to `compress` method.
byte[] xzCompressed = XZ_LZMA2.compress(input);
System.out.println("Input length:" + input.length());
System.out.println("XZ Compressed Length: "+ xzCompressed.length);
String xzDecompressed = XZ_LZMA2.decompress(xzCompressed);
System.out.println("XZ Decompressed : "+ xzDecompressed);
// If you are using decompression for some compressed file instead of
// plain text return bytes from `decompress` method and put it in
//FileOutputStream to get file back
}
Note: the XZ compression algorithm needs a lot of memory to run. It is not recommended for mobile platforms like Android, where it can cause an OutOfMemoryError. Android provides the ZLIB compression classes Deflater and Inflater, which work well on the platform.
You can use the XZ-Java static library from Android AOSP or the 'org.tukaani:xz:1.8' library to compress files in the XZ format. Here is working code that compresses a file in XZ format. Create an AsyncTask for compressing multiple files and use the Java code below to do the compression.
Android.mk Changes for building in AOSP:
LOCAL_STATIC_JAVA_LIBRARIES := \
xz-java
OR
Gradle File Changes for building in Android Studio:
implementation 'org.tukaani:xz:1.8'
Java Code:
// Requires: java.io.FileInputStream, java.io.FileOutputStream,
// org.tukaani.xz.LZMA2Options, org.tukaani.xz.XZOutputStream
public void CompressFile(String inputFile, String outputFile) {
    try {
        LZMA2Options opts = new LZMA2Options();
        opts.setPreset(7);
        // try-with-resources closes the streams in reverse order, even on failure
        try (FileInputStream fiStream = new FileInputStream(inputFile);
             FileOutputStream foStream = new FileOutputStream(outputFile);
             XZOutputStream xzOStream = new XZOutputStream(foStream, opts)) {
            // Copy raw bytes; reading line by line with Scanner would corrupt binary files
            byte[] buffer = new byte[8192];
            int n;
            while ((n = fiStream.read(buffer)) != -1) {
                xzOStream.write(buffer, 0, n);
            }
        }
        // Remove the source only after the compressed file has been written completely
        Utils.removeFile(inputFile);
    } catch (Exception e) {
        Log.e(TAG, "CompressFileXZ() Exception: " + e.toString());
    }
}
My Android app loads a data file from its assets directory on startup, and I already know the decompressed size, so all I needed to write was:
byte[] data = new byte[ /* decompressed size here */ ];
new org.tukaani.xz.XZInputStream(context.getAssets().open("file.xz")).read(data);
and then git clone https://git.tukaani.org/xz-java.git and cp -r xz-java/src/org into my app's src directory (and ensure all .java files are mentioned on my javac command line—I still use old-school command-line scripts to compile my apps; I haven't set them up for Android Studio or Gradle).
However, the resulting app took six seconds to decompress 3M of compressed data on a 2013 Sony Xperia Z Ultra (Android 4.4), and changing the compression level of xz did not noticeably affect that 6-second startup. Yes it only took 1 second on a 2018 Samsung S9 running Android 10, but it's the older phones that need more compression, as they're the ones with less space available, so adding an unacceptable startup delay to the app on older devices seems to be self-defeating, especially if the alternative java.util.zip.Inflater is near instantaneous:
byte[] data = new byte[ /* compressed size here */];
context.getAssets().open("file.z").read(data);
java.util.zip.Inflater i=new java.util.zip.Inflater();
i.setInput(data);
byte[] decompressed=new byte[ /* decompressed size here */ ];
i.inflate(decompressed); i.end();
data = decompressed; /* allow earlier byte[] to be gc'd */
and for that fast startup you pay only a 20% increase in the APK size over the one with the xz file (I compressed using zopfli to get a tiny bit smaller than gzip -9 although it's still bigger than xz -0).
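The Inflater snippet above relies on a single read() and a single inflate() call filling their arrays, which the InputStream and Inflater contracts do not strictly guarantee. A slightly more defensive sketch, assuming the asset is a zlib stream whose compressed and decompressed sizes are known in advance (the class and method names are illustrative, and the Deflater-based compress method exists only to make the example self-contained):

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative helper; the names are not from the original post.
class ZlibAsset {

    // Reads exactly 'size' bytes; readFully loops internally until the array is full.
    static byte[] readExactly(InputStream in, int size) {
        byte[] data = new byte[size];
        try {
            new DataInputStream(in).readFully(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return data;
    }

    // Inflates zlib data into an array of the known decompressed size.
    static byte[] decompress(byte[] compressed, int decompressedSize) {
        Inflater inflater = new Inflater();
        byte[] result = new byte[decompressedSize];
        int offset = 0;
        try {
            inflater.setInput(compressed);
            while (offset < result.length && !inflater.finished()) {
                int n = inflater.inflate(result, offset, result.length - offset);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input, or a preset dictionary is required
                }
                offset += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid zlib data", e);
        } finally {
            inflater.end(); // release native memory promptly
        }
        if (offset != result.length) {
            throw new IllegalArgumentException(
                    "Expected " + result.length + " bytes, got " + offset);
        }
        return result;
    }

    // Test-data generator only: produces a zlib stream like the one shipped in the APK.
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }
}
```

On Android the call would look roughly like decompress(readExactly(context.getAssets().open("file.z"), compressedSize), decompressedSize).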
Tukaani's code doesn't currently seem to make an equivalent of setInput available. Tukaani's XZDecDemo.java contains the comment "Since XZInputStream does some buffering internally anyway, BufferedInputStream doesn't seem to be needed here to improve performance", but for completeness I tried it anyway:
byte[] data = new byte[ /* decompressed size here */ ];
new org.tukaani.xz.XZInputStream(
new java.io.BufferedInputStream(
context.getAssets().open("file.xz"),
/* compressed size here */)).read(data);
however this made no noticeable difference to the 6-second delay (so it seems the comment is correct: the performance is just as bad either way).
I am writing a utility application using the open-source, Java-based PDFBox to convert a PDF file containing a hyperlink that opens an mp3 file, replacing the hyperlink with a sound object.
I used the PDFBox API since it appears to be mature enough to work with sound objects. I can read the PDF file and find the hyperlink that references the mp3, but I am not able to replace it with a sound object. I created the sound object and associated it with the action, but it does not work. I think I am missing an important part of how to create a sound object using PDActionSound. Is it possible to refer to an external wav file using the PDFBox API?
for (PDPage pdPage : pages) {
List<PDAnnotation> annotations = pdPage.getAnnotations();
for (PDAnnotation pdAnnotation : annotations) {
if (pdAnnotation instanceof PDAnnotationLink) {
PDAnnotationLink link = ((PDAnnotationLink) pdAnnotation);
PDAction action = link.getAction();
if (action instanceof PDActionLaunch) {
PDActionLaunch launch = ((PDActionLaunch) action);
String fileInfo = launch.getFile().getFile();
if (fileInfo.contains(".mp3")) {
/* create Sound object referring to external mp3*/
//something like
PDActionSound actionSound = new PDActionSound(
soundStream);
//set the ActionSound to the link.
link.setAction(actionSound);
}
}
}
}
}
How do I create a sound object (PDActionSound) and add it to the link successfully?
Speaking of mature, that part has never been used, and now that I had a closer look at the code, I think some work remains to be done... Please try this, I created this with PDFBox 2.0 after reading the PDF specification:
PDSimpleFileSpecification fileSpec = new PDSimpleFileSpecification(new COSString("/C/dir1/dir2/blah.mp3")); // see "File Specification Strings" in PDF spec
COSStream soundStream = new COSStream();
soundStream.createOutputStream().close();
soundStream.setItem(COSName.F, fileSpec);
soundStream.setInt(COSName.R, 44100); // put actual sample rate here
PDActionSound actionSound = new PDActionSound();
actionSound.getCOSObject().setItem(COSName.getPDFName("Sound"), soundStream);
link.setAction(actionSound); // reassign the new action to the link annotation
edit: as the above didn't work, here's an alternative solution as requested in the comments. The file is embedded. It works only with .WAV files, and you have to know their details (channels, sampling rate, bits per sample). About half a second is lost at the beginning. The sound you should hear is "I am Al Bundy". I tried with MP3 and didn't succeed. While googling, I found some texts saying that only "old" formats (wav, aif etc.) are supported. I did find another way to play sounds ("Renditions") that even worked with embedded mp3 in another product, but the generated structure in the PDF is even more complex.
COSStream soundStream = new COSStream();
OutputStream os = soundStream.createOutputStream(COSName.FLATE_DECODE);
URL url = new URL("http://cd.textfiles.com/hackchronii/WAV/ALBUNDY1.WAV");
InputStream is = url.openStream();
// FileInputStream is = new FileInputStream(".....WAV");
IOUtils.copy(is, os);
is.close();
os.close();
// See p. 506 in PDF spec, Table 294
soundStream.setInt(COSName.C, 1); // channels
soundStream.setInt(COSName.R, 22050); // sampling rate
//soundStream.setString(COSName.E, "Signed"); // The encoding format for the sample data
soundStream.setInt(COSName.B, 8); // The number of bits per sample value per channel. Default value: 8
// soundStream.setName(COSName.CO, "MP3"); // doesn't work
PDActionSound actionSound = new PDActionSound();
actionSound.getCOSObject().setItem(COSName.getPDFName("Sound"), soundStream);
link.setAction(actionSound);
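The channel count, sampling rate and bits per sample set above have to match the audio file. A small sketch of reading them with the standard javax.sound.sampled API instead of hard-coding them; the class and method names are made up for illustration, and this was not part of the tested PDFBox code:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Illustrative helper; not part of PDFBox.
class WavInfo {

    // Returns {channels, sampling rate in Hz, bits per sample} for WAV or AIFF data.
    static int[] describe(InputStream audio) {
        // getAudioInputStream needs mark/reset support, hence the BufferedInputStream.
        try (AudioInputStream in =
                     AudioSystem.getAudioInputStream(new BufferedInputStream(audio))) {
            AudioFormat format = in.getFormat();
            return new int[] {
                format.getChannels(),
                Math.round(format.getSampleRate()),
                format.getSampleSizeInBits()
            };
        } catch (UnsupportedAudioFileException | IOException e) {
            throw new IllegalArgumentException("Cannot read audio header", e);
        }
    }

    // Builds one second of placeholder 8-bit mono audio at 22050 Hz as an in-memory WAV file.
    static byte[] sampleWav() {
        // Arguments: sample rate, bits per sample, channels, signed, big-endian
        AudioFormat format = new AudioFormat(22050f, 8, 1, false, false);
        byte[] samples = new byte[22050];
        AudioInputStream pcm = new AudioInputStream(
                new ByteArrayInputStream(samples), format, samples.length);
        ByteArrayOutputStream wav = new ByteArrayOutputStream();
        try {
            AudioSystem.write(pcm, AudioFileFormat.Type.WAVE, wav);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return wav.toByteArray();
    }

    public static void main(String[] args) {
        int[] info = describe(new ByteArrayInputStream(sampleWav()));
        System.out.println("C=" + info[0] + " R=" + info[1] + " B=" + info[2]);
    }
}
```

The three values could then be passed to soundStream.setInt for COSName.C, COSName.R and COSName.B.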
Update 9.7.2016:
We discussed this on the PDFBox mailing list, and thanks to Gilad Denneboom we know two more things:
1) in Adobe Acrobat it only lets you select either WAV or AIF files
2) code by Gilad Denneboom with MP3SPI to convert MP3 to raw:
private static InputStream getAudioStream(String filename) throws Exception {
File file = new File(filename);
AudioInputStream in = AudioSystem.getAudioInputStream(file);
AudioFormat baseFormat = in.getFormat();
AudioFormat decodedFormat = new AudioFormat(
AudioFormat.Encoding.PCM_UNSIGNED,
baseFormat.getSampleRate(),
baseFormat.getSampleSizeInBits(),
baseFormat.getChannels(),
baseFormat.getChannels(),
baseFormat.getSampleRate(),
false);
return AudioSystem.getAudioInputStream(decodedFormat, in);
}
I am using Java to automate the creation and modification of Open Office Calc documents.
I was wondering how to get the number of sheets in a spreadsheet. I can't seem to find any Count, Length, size or similar functions.
Here is my code. Thanks in advance!
public static void openDocument(String filename)
{
try
{
// Get the remote office component context
xContext = Bootstrap.bootstrap();
// Get the remote office service manager
XMultiComponentFactory xMCF = xContext.getServiceManager();
// Get the root frame (i.e. desktop) of openoffice framework.
oDesktop = xMCF.createInstanceWithContext("com.sun.star.frame.Desktop", xContext);
// Desktop has 3 interfaces. The XComponentLoader interface provides ability to load components.
XComponentLoader xCompLoader = (XComponentLoader) UnoRuntime.queryInterface(XComponentLoader.class,
oDesktop);
PropertyValue[] loadProps = new PropertyValue[0];
xSpreadsheetComponent = xCompLoader.loadComponentFromURL(getUpdatedPath(filename), "_blank", 0, loadProps);
xStorable = (XStorable) UnoRuntime.queryInterface(XStorable.class, xSpreadsheetComponent);
xSpreadsheetDocument = (XSpreadsheetDocument) UnoRuntime.queryInterface(XSpreadsheetDocument.class,
xSpreadsheetComponent);
xSpreadsheets = xSpreadsheetDocument.getSheets();
// Need code here to get number of sheets
}
catch (Exception e)
{
e.printStackTrace();
}
}
This is more of a comment (since I do not know the correct syntax for Java - maybe you need to do a .queryInterface on xSpreadsheets?), but posting as an answer to include an image. Using Bernard Marcelly's object inspection tool XRay (http://bernard.marcelly.perso.sfr.fr/index2.html) shows that an XSpreadsheets object has a method .getCount(). I tested this method using OpenOffice Basic and it works as expected.
I solved my issue using this:
int numberOfSheets = xSpreadsheets.getElementNames().length;
I have, for example, 1000 images whose names are all very similar and differ only in the number: "ImageNmbr0001", "ImageNmbr0002", ..., "ImageNmbr1000".
I would like to load every image and store them in an ImageProcessor array.
That way, if I call a method on an element of this array, the method is applied to that picture, for example to count its black pixels.
I can use a for loop to get the numbers from 1 to 1000, turn them into strings, and append them to the common part of the file name to load each image.
However, I would still have to turn each image into an element I can store in an array, and I don't know a method yet that receives a string (the file path) and returns the ImageProcessor for the image stored there.
Also, my approach at the moment seems rather clumsy and not too elegant. So I would be very happy if someone could show me a better way to do this using methods from these packages:
import ij.ImagePlus;
import ij.plugin.filter.PlugInFilter;
import ij.process.ImageProcessor;
I think I found a solution:
Opener opener = new Opener();
String imageFilePath = "somePath";
ImagePlus imp = opener.openImage(imageFilePath);
ImageProcessor ip = imp.getProcessor();
That does the job, but thank you for your time and effort.
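The numbered file names can be built with String.format instead of substring manipulation. A minimal sketch, assuming the names are zero-padded to four digits as in the question (the class name and the Opener usage described in the comment are illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative helper for generating the numbered image names.
class ImageNames {

    // "%04d" pads the number with leading zeros to four digits, e.g. 1 becomes "0001".
    static String nameFor(int number) {
        return String.format(Locale.ROOT, "ImageNmbr%04d", number);
    }

    // Builds the names for images first..last, inclusive.
    static List<String> namesFor(int first, int last) {
        List<String> names = new ArrayList<>();
        for (int i = first; i <= last; i++) {
            names.add(nameFor(i));
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = namesFor(1, 1000);
        System.out.println(names.get(0) + " ... " + names.get(names.size() - 1));
        // With ImageJ on the classpath, each name (plus directory and extension)
        // could be passed to opener.openImage(...) and the resulting
        // imp.getProcessor() stored at index i - 1 of an ImageProcessor[1000].
    }
}
```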
I'm not sure I understand exactly what you want, but I definitely would not save the information for each image in a separate file, for two reasons:
- Saving and reading the content of many small files is slower than using one medium-sized file
- Each file adds overhead (a path, a minimum size on disk, etc.)
If you want performance, group multiple image descriptions into a single description file.
If you don't want to create a binary description file, you can always use a database, which is built for this kind of access and usually performs well for both reads and writes.
I don't know exactly what your needs are, but you could try writing a binary file with fixed-size records and reading it back later.
Example:
// Requires the stream classes from java.io
public static void main(String[] args) throws IOException {
    // Write 1000 fixed-size records; try-with-resources flushes and closes the stream
    try (DataOutputStream dout = new DataOutputStream(
            new BufferedOutputStream(new FileOutputStream("description.bin")))) {
        for (int x = 0; x < 1000; x++) {
            dout.writeInt(10); // Write Int data
        }
    }
    // Read the records back in the same order; errors propagate instead of being swallowed
    try (DataInputStream din = new DataInputStream(
            new BufferedInputStream(new FileInputStream("description.bin")))) {
        for (int x = 0; x < 1000; x++) {
            System.out.println(din.readInt()); // Read Int data
        }
    }
}
In this example, the code writes integers to the "description.bin" file and then reads them back.
This is fast in Java, provided the file streams are buffered (for example with BufferedOutputStream and BufferedInputStream).
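Because every record has the same size, a single record can also be read without scanning the file: for int data, record i starts at byte offset i * 4. A minimal sketch using RandomAccessFile, with hypothetical class and method names:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical fixed-size record file: one int per image.
class FixedRecords {

    static final int RECORD_SIZE = Integer.BYTES; // 4 bytes per record

    // Overwrites the file with one int record per value.
    static void write(Path file, int[] values) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            raf.setLength(0); // discard any previous content
            for (int value : values) {
                raf.writeInt(value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Jumps straight to record 'index' and reads it.
    static int read(Path file, int index) {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek((long) index * RECORD_SIZE);
            return raf.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("description", ".bin");
        write(file, new int[] {5, 10, 15}); // e.g. black-pixel counts of three images
        System.out.println(read(file, 2));
        Files.delete(file);
    }
}
```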
We are attempting to generate documents using iText that are formed largely from "template" files - smaller PDF files that are combined together into one composite file using the PdfContentByte.addTemplate method. We then automatically and silently print the new file using the *nix command lp. This usually works; however occasionally, some files that are generated will fail to print. The document proceeds through all queues and arrives at the printer proper (a Lexmark T652n, in this case), its physical display gives a message of pending progress, and even its mechanical components whir up in preparation - then, the printing job vanishes spontaneously without a trace, and the printer returns to being ready.
The oddity is how specific this issue tends to be. For starters, the files in question print without fail when done manually through Adobe PDF Viewer, and can be read fine by editors like Adobe Live Cycle. Furthermore, the content of the file affects whether it is plagued by this issue, but not in a clear way: adding a specific template 20 times could cause the problem, while doing it 19 or 21 times might be fine, and using a different template will change the pattern entirely and might cause it to happen after 37 times instead. Generating a document with the exact same content will be consistent on whether or not the issue occurs, but any subtle and seemingly irrelevant change in content will change whether the problem happens.
While it could be considered a hardware issue, the fact remains that certain iText-generated files have this issue while others do not. Is our method of file creation sometimes creating files that are somehow considered corrupt only to the printer and only sometimes?
Here is a relatively small code example that generates documents using the repetitive template method similar to our main program. It uses this file as a template and repeats it a specified number of times.
public class PDFFileMaker {
private static final int INCH = 72;
final private static float MARGIN_TOP = INCH / 4;
final private static float MARGIN_BOTTOM = INCH / 2;
private static final String DIREC = "/pdftest/";
private static final String OUTPUT_FILEPATH = DIREC + "cooldoc_%d.pdf";
private static final String TEMPLATE1_FILEPATH = DIREC + "template1.pdf";
private static final Rectangle PAGE_SIZE = PageSize.LETTER;
private static final Rectangle TEMPLATE_SIZE = PageSize.LETTER;
private ByteArrayOutputStream workingBuffer;
private ByteArrayOutputStream storageBuffer;
private ByteArrayOutputStream templateBuffer;
private float currPosition;
private int currPage;
private int formFillCount;
private int templateTotal;
private static final int DEFAULT_NUMBER_OF_TIMES = 23;
public static void main (String [] args) {
System.out.println("Starting...");
PDFFileMaker maker = new PDFFileMaker();
File file = null;
try {
file = maker.createPDF(DEFAULT_NUMBER_OF_TIMES);
}
catch (Exception e) {
e.printStackTrace();
}
if (file == null || !file.exists()) {
System.out.println("File failed to be created.");
}
else {
System.out.println("File creation successful.");
}
}
public File createPDF(int inCount) throws Exception {
templateTotal = inCount;
String sFilepath = String.format(OUTPUT_FILEPATH, templateTotal);
workingBuffer = new ByteArrayOutputStream();
storageBuffer = new ByteArrayOutputStream();
templateBuffer = new ByteArrayOutputStream();
startPDF();
doMainSegment();
finishPDF(sFilepath);
return new File(sFilepath);
}
private void startPDF() throws DocumentException, FileNotFoundException {
Document d = new Document(PAGE_SIZE);
PdfWriter w = PdfWriter.getInstance(d, workingBuffer);
d.open();
d.add(new Paragraph(" "));
d.close();
w.close();
currPosition = 0;
currPage = 1;
formFillCount = 1;
}
protected void finishPDF(String sFilepath) throws DocumentException, IOException {
//Transfers data from buffer 1 to builder file
PdfReader r = new PdfReader(workingBuffer.toByteArray());
PdfStamper s = new PdfStamper(r, new FileOutputStream(sFilepath));
s.setFullCompression();
r.close();
s.close();
}
private void doMainSegment() throws FileNotFoundException, IOException, DocumentException {
File fTemplate1 = new File(TEMPLATE1_FILEPATH);
for (int i = 0; i < templateTotal; i++) {
doTemplate(fTemplate1);
}
}
private void doTemplate(File f) throws FileNotFoundException, IOException, DocumentException {
PdfReader reader = new PdfReader(new FileInputStream(f));
//Transfers data from the template input file to temporary buffer
PdfStamper stamper = new PdfStamper(reader, templateBuffer);
stamper.setFormFlattening(true);
AcroFields form = stamper.getAcroFields();
//Get size of template file via looking for "end" Acrofield
float[] area = form.getFieldPositions("end");
float size = TEMPLATE_SIZE.getHeight() - MARGIN_TOP - area[4];
//Requires Page Break
if (size >= PAGE_SIZE.getHeight() - MARGIN_TOP - MARGIN_BOTTOM + currPosition) {
PdfReader subreader = new PdfReader(workingBuffer.toByteArray());
PdfStamper substamper = new PdfStamper(subreader, storageBuffer);
currPosition = 0;
currPage++;
substamper.insertPage(currPage, PAGE_SIZE);
substamper.close();
subreader.close();
workingBuffer = storageBuffer;
storageBuffer = new ByteArrayOutputStream();
}
//Set Fields
form.setField("field1", String.format("Form Text %d", formFillCount));
form.setField("page", String.format("Page %d", currPage));
formFillCount++;
stamper.close();
reader.close();
//Read from working buffer, stamp to storage buffer, stamp template from template buffer
reader = new PdfReader(workingBuffer.toByteArray());
stamper = new PdfStamper(reader, storageBuffer);
reader.close();
reader = new PdfReader(templateBuffer.toByteArray());
PdfImportedPage page = stamper.getImportedPage(reader, 1);
PdfContentByte cb = stamper.getOverContent(currPage);
cb.addTemplate(page, 0, currPosition);
stamper.close();
reader.close();
//Reset buffers - working buffer takes on storage buffer data, storage and template buffers clear
workingBuffer = storageBuffer;
storageBuffer = new ByteArrayOutputStream();
templateBuffer = new ByteArrayOutputStream();
currPosition -= size;
}
}
Running this program with a DEFAULT_NUMBER_OF_TIMES of 23 produces this document and causes the failure when sent to the printer. Changing it to 22 times produces this similar-looking document (simply with one less "line") which does not have the problem and prints successfully. Using a different PDF file as a template component completely changes these numbers or makes it so that it may not happen at all.
While this problem is likely too specific and has too many factors for other people to reasonably be expected to reproduce it, the question of possibilities remains. What about the file generation could cause this unusual behavior? What might cause one file to be acceptable to a specific printer but another, generated in the same manner and differing only in seemingly trivial ways, to be unacceptable? Is there a bug in iText triggered by using the stamper template commands too heavily? This has been a long-running bug for us, so any assistance is appreciated; additionally, I am willing to answer questions or have extended conversations in chat as necessary in an effort to get to the bottom of this.
The design of your application more or less abuses the otherwise perfectly fine PdfStamper functionality.
Allow me to explain.
The contents of a page can be expressed as a stream object or as an array of stream objects. When changing a page using PdfStamper, the content of this page is always an array of stream objects, consisting of the original stream object or the original array of stream objects, to which extra elements are added.
By adding the same template with a newly created PdfStamper object over and over again, you increase the number of elements in the page contents array dramatically. You also introduce a huge number of q and Q operators that save and restore the graphics state. The reason why you have random behavior is clear: the memory and CPU available to process the PDF can vary from one moment to another. One time, there will be sufficient resources to deal with 20 q operators (each of which saves the state); the next time there will only be sufficient resources to deal with 19. The problem occurs when the process runs out of resources.
While the PDFs you're creating aren't illegal according to ISO-32000-1, some PDF processors simply choke on these PDFs. iText is a toolbox that allows you to create PDFs that can make me very happy when I look under the hood, but it also allows you to create horrible PDFs if you don't use the toolbox wisely. The latter is what happened in your case.
You should solve this by reusing the PdfStamper instance instead of creating a new PdfStamper over and over again. If that's not possible, please post another question, using fewer words, explaining exactly what you want to achieve.
Suppose that you have many different source files with PDF snippets that need to be added to a single page. For instance: suppose that each PDF snippet is a coupon and you need to create a sheet with 30 coupons. Then you'd use a single PdfWriter instance, import pages with getImportedPage() and add them at the correct position using addTemplate().
Of course, I have no idea what your project is about. The idea of coupons on a page was inspired by your test PDF.
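Reusing one writer means the position of every snippet has to be worked out by the application rather than by re-stamping the document. A library-independent sketch of that bookkeeping, assuming all snippets have the same known height; the class, its fields and the numbers are illustrative and not taken from iText:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative layout helper; independent of iText.
class SnippetLayout {

    // Where one snippet goes: 1-based page number and vertical shift in points.
    static final class Placement {
        final int page;
        final float yOffset;

        Placement(int page, float yOffset) {
            this.page = page;
            this.yOffset = yOffset;
        }
    }

    // Stacks 'count' snippets of equal height from the top of each page downwards.
    static List<Placement> plan(int count, float snippetHeight,
                                float pageHeight, float marginTop, float marginBottom) {
        float usable = pageHeight - marginTop - marginBottom;
        if (snippetHeight <= 0 || snippetHeight > usable) {
            throw new IllegalArgumentException("Snippet does not fit on a page");
        }
        List<Placement> placements = new ArrayList<>();
        int page = 1;
        float used = 0f;
        for (int i = 0; i < count; i++) {
            if (used + snippetHeight > usable) {
                page++;      // the next snippet starts a new page
                used = 0f;
            }
            // A negative offset shifts the imported content downwards,
            // mirroring currPosition in the question.
            placements.add(new Placement(page, 0f - used));
            used += snippetHeight;
        }
        return placements;
    }

    public static void main(String[] args) {
        // 23 snippets of 100 pt on a LETTER page (792 pt high), margins 18 pt and 36 pt.
        List<Placement> placements = plan(23, 100f, 792f, 18f, 36f);
        Placement last = placements.get(placements.size() - 1);
        System.out.println("pages needed: " + last.page);
    }
}
```

Each placement would then drive one addTemplate() call on the single writer's content, with a new page started whenever the page number increases.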