How to add custom properties metadata to a PDF using Apache FOP (Java)

I am using Apache FOP to create PDF files and need to add specific metadata to the PDF. In Adobe Reader this is called "custom properties", and each entry consists of a name and a value.
I can add simple metadata like this:
out = new ByteArrayOutputStream();
fop = fopFactory.newFop(MimeConstants.MIME_PDF, foUserAgent, out);
foUserAgent.setKeywords("some keywords");
But I need to add custom metadata with an arbitrary name and value. Any idea how to do it?

You might have some luck with the XMP support in FOP 1.1. Try it with some of the keys defined in the XMP specification.


I would like to create a copy of a Word or Excel file using POI

I would like to create a copy of a Word or Excel file using POI.
I know that POI can read Word and Excel files, and that reading covers not only the values but also attributes such as font size, table colors, and the background color of each cell. By reading both the values and the attributes of an .xlsx or .docx document, I want to produce a copy of the document exactly as it is. Is there open source sample code for this available anywhere?
Look at Apache POI or docx4j for working with .docx documents.
Techniques for adding text to a document are covered at https://www.slideshare.net/plutext/document-generation-2012osdcsydney
For binary .doc files, one option is POI's HWPF support, which docx4j often pulls in as a dependency. However, it is not a great approach, since it does not convert the .doc to docx4j's internal representation: you are more or less stuck in HWPF land.
Another option is to use JODConverter to convert the .doc to a .docx and, if necessary, back again. This is often the simplest route.
To open an Excel file and save it to another file, I use this code:
// Needs java.io.* plus org.apache.poi.ss.usermodel.Workbook and WorkbookFactory.
// try-with-resources closes the streams and the workbook even if an exception is thrown.
try (InputStream template = new FileInputStream("C:\\source excel path\\input.xlsx");
     Workbook wb = WorkbookFactory.create(template);
     FileOutputStream out = new FileOutputStream("C:\\path to copy excel to\\output.xlsx")) {
    // Save the workbook to a different location or filename.
    wb.write(out);
}
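If the goal is only an exact duplicate of the file, parsing it is not strictly necessary: a byte-for-byte copy keeps every value, font, and cell color by construction. Note that this swaps POI for plain file copying. A minimal sketch using only the JDK, where the class name OfficeFileCopy and the command-line paths are placeholders:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class OfficeFileCopy {

    /** Copies source to target byte for byte, replacing target if it already exists. */
    static Path copy(Path source, Path target) throws IOException {
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // args[0] and args[1] are hypothetical source and destination paths.
        Path copied = copy(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied to " + copied);
    }
}
```

This only helps when nothing in the document has to change. As soon as the copy needs to be modified, reading it through POI as shown above is the way to go.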

How to know whether a file is in .docx or .doc format with Apache POI

I know this can be done by extension or by MIME type. Is there any other way to tell whether a file is a .docx or a .doc?
If it is just a matter of deciding between .doc and .docx for a collection of files that are known to be one or the other but are not marked with an extension, you can use the fact that a .docx file is a zipped collection of files. Something along these lines might help:
boolean isZip = new ZipInputStream( fileStream ).getNextEntry() != null;
where fileStream is whatever file or other input stream you wish to evaluate. You could further evaluate a zipped file by looking for key .docx entries. A good starting reference is Word Document (DOCX). Likewise, if you know it is just a binary file, you can test for Word's File Information Block (see Word (.doc) Binary File Format)
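The "look for key .docx entries" idea can be sketched with the JDK alone. This assumes the main document part is stored under the conventional name word/document.xml, which is what Word writes but is an assumption here; the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class DocxEntryCheck {

    // Conventional name of the main document part in a .docx package (assumed).
    private static final String MAIN_PART = "word/document.xml";

    /**
     * Returns true if the stream is a ZIP archive containing word/document.xml.
     * The stream is consumed and closed by this method.
     */
    static boolean looksLikeDocx(InputStream in) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() returns null right away when the data is not a ZIP.
            while ((entry = zip.getNextEntry()) != null) {
                if (MAIN_PART.equals(entry.getName())) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A plain ZIP or an .xlsx file passes the isZip test above but fails this one, which is the point of checking for a specific entry.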
You could use Apache Tika for content detection. But you should be aware that this is a huge framework (many required dependencies) for such a small task.
There is a way with Apache POI, though it is not straightforward.
Try to read a .docx file using the HWPFDocument class. It fails with the following error:
org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied
data appears to be in the Office 2007+ XML. You are calling the part
of POI that deals with OLE2 Office Documents. You need to call a
different part of POI to process this data (eg XSSF instead of HSSF)
Code along these lines triggers it:
String filePath = "C:\\XXXX\\XXXX.docx";
FileInputStream inStream;
try {
    inStream = new FileInputStream(new File(filePath));
    // HWPFDocument only understands the binary .doc format.
    HWPFDocument doc = new HWPFDocument(inStream);
    WordExtractor wordExtractor = new WordExtractor(doc);
    System.out.println("Getting words" + wordExtractor.getText());
} catch (Exception e) {
    // Also reached for I/O errors, not only for .docx input.
    System.out.print("It's not a .doc format");
}
A .docx file can be read using the XWPFDocument class.
Why don't you use Apache Tika:
File file = new File("File Here");
Tika tika = new Tika();
String filetype = tika.detect(file);
System.out.println(filetype);
Assuming you're using Apache POI, you have a few options.
One is to grab the first few bytes of the file and pass them to the POIFSFileSystem.hasPOIFSHeader(byte[]) method. If you have a stream that supports mark/reset, you can instead use POIFSFileSystem.hasPOIFSHeader(InputStream). If those return true, try to open the file as a .doc with HWPF; otherwise try it as a .docx with XWPF.
Otherwise, if you prefer a try/catch approach, try to open the file with POIFSFileSystem and catch OfficeXmlFileException: if it opens fine it's a .doc, and if you get the exception it's a .docx.
If you look at the source code for WorkbookFactory, you'll see the first pattern in use, and you can copy a similar set of logic from that.
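To show what a header check of this kind boils down to, here is a rough JDK-only sketch that compares the leading bytes against the OLE2 signature (D0 CF 11 E0 A1 B1 1A E1) and the ZIP local file header signature (PK\3\4). It is not POI's implementation; the class, enum, and method names are invented for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class WordFormatSniffer {

    enum Format { OLE2, ZIP, UNKNOWN }

    // Signature at the start of every OLE2 compound file (.doc, .xls, .ppt, ...).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Signature of a ZIP local file header; .docx, .xlsx, and plain ZIPs start with it.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies a file by the first bytes passed in. */
    static Format sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) return Format.OLE2;
        if (startsWith(header, ZIP_MAGIC)) return Format.ZIP;
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the file and classifies them. */
    static Format sniff(Path file) throws IOException {
        byte[] header = new byte[8];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < header.length) {
                int n = in.read(header, total, header.length - total);
                if (n < 0) break; // file is shorter than 8 bytes
                total += n;
            }
        }
        return sniff(Arrays.copyOf(header, total));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

A result of OLE2 only says the file is some OLE2 document and ZIP only says it is some ZIP archive, so this works when the input is already known to be either .doc or .docx.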

I can't import the com.itextpdf.text.Document class

I'm building an Android app and want to use iText to create a PDF file, but I can't use the Document class. As I have seen in tutorials, there should be an import of com.itextpdf.text.Document in order to use the Document class. For this app, I'm using the com.itextpdf:itext-pdfa:5.5.9 library. I want to create a simple PDF file with 2 paragraphs, something like this:
try {
    File pdfFolder = new File(Environment.getExternalStoragePublicDirectory(
            Environment.DIRECTORY_DOCUMENTS), "pdfdemo");
    if (!pdfFolder.exists()) {
        pdfFolder.mkdir();
    }
    Date date = new Date();
    String timeStamp = new SimpleDateFormat("yyyyMMdd_HHmmss").format(date);
    // Use the (parent, child) constructor so the file is created inside pdfFolder.
    File myFile = new File(pdfFolder, timeStamp + ".pdf");
    OutputStream output = new FileOutputStream(myFile);
    Document document = new Document();
    PdfAWriter.getInstance(document, output);
    document.open();
    document.add(new Paragraph(mSubjectEditText.getText().toString()));
    document.add(new Paragraph(mBodyEditText.getText().toString()));
    document.close();
} catch (Exception e) {
    // Report the failure instead of silently swallowing it.
    e.printStackTrace();
}
Could anyone help me with this problem? What am I doing wrong?
You say:
I'm using com.itextpdf:itext-pdfa:5.5.9 library
That is wrong for two reasons:
itext-pdfa is an add-on to iText that is meant for writing or manipulating PDF/A documents. It requires the core iText library. Read about the different parts of iText on the official web site: https://developers.itextpdf.com/itext-java
You say you want to use iText on Android, but you are referring to iText for Java. iText for Java contains classes that are not allowed on Android (java.awt.*, javax.nio,...). You should use the Android port for iText, which is called iTextG: https://developers.itextpdf.com/itextg-android
It's as if you're using iText without having visited the official iText web site. How is that even possible?
Just open your app-level Gradle file and add the following line to your dependencies:
implementation 'com.itextpdf:itext-pdfa:5.5.9'
It works for me.

Java byteArray[] to docx

I have a .doc file as a byte[].
Is it possible to convert it from byte[] into a .docx file?
I tried just changing the file extension programmatically, but that does not work.
Any suggestions?
I generate the report using Eclipse BIRT.
The code that saves the doc:
options = new RenderOptionBase();
ByteArrayOutputStream bos = new ByteArrayOutputStream();
options.setOutputStream(bos);
options.setOutputFormat("doc");
if(parameters != null){
task.setParameterValues(parameters);
}
task.setRenderOption(options);
task.run();
return bos.toByteArray();
// task is an IRunAndRenderTask
The problem is that we use BIRT 3.7, which does not support DocxRenderOption.
Take a look at Aspose.Words for Java: http://www.aspose.com/java/word-component.aspx
It has really good documentation too: http://www.aspose.com/docs/display/wordsjava/load+or+create+a+document
The code is as simple as:
// Open a document.
Document doc = new Document("input.doc");
// Save document.
doc.save("output.docx");
Step 1: save the .doc file.
Step 2: convert the file using this library and save it as a .docx file.
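Step 1 can be done with the JDK alone. The sketch below only writes the byte[] to disk unchanged. It does not convert anything, so the result is still a .doc and the conversion library is still needed for step 2. The class name ReportFileWriter and the method name are placeholders.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ReportFileWriter {

    /**
     * Writes the rendered report bytes to target as they are.
     * The file is created if missing and overwritten if present.
     */
    static Path saveDoc(byte[] content, Path target) throws IOException {
        return Files.write(target, content);
    }
}
```

The returned path is what would then be handed to the conversion step, for example as the input file name in the Aspose snippet above.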

Append full PDF file to FOP PDF

I have an XML file that is already being rendered as a PDF and sent by a servlet:
TraxInputHandler input = new TraxInputHandler(
new File(XML_LOCATION+xmlFile+".xml"),
new File(XSLT_LOCATION)
);
ByteArrayOutputStream out = new ByteArrayOutputStream();
//driver is just `new Driver()`
synchronized (driver) {
driver.reset();
driver.setRenderer(Driver.RENDER_PDF);
driver.setOutputStream(out);
input.run(driver);
}
//response is HttpServletResponse
byte[] content = out.toByteArray();
response.setContentType("application/pdf");
response.setContentLength(content.length);
response.getOutputStream().write(content);
response.getOutputStream().flush();
This is all working perfectly fine.
However, I now have another PDF file that I need to include in the output. This is just a totally separate .pdf file that I was given. Is there any way that I can append this file either to the response, the driver, out, or anything else to include it in the response to the client? Will that even work? Or is there something else I need to do?
We also use FOP to generate some documents, and we accept uploaded documents, all of which we eventually combine into a single PDF.
You can't just send them sequentially out the stream, because the combined result needs a proper PDF file header, metadata, etc.
We use the iText library to combine the files, starting off with
PdfReader reader = new PdfReader(/*String*/fileName);
reader.consolidateNamedDestinations();
We later loop through the readers, adding the pages from each PDF to the new combined destination PDF and adjusting the bookmarks and page numbers as we go.
AFAIK, FOP doesn't provide this sort of functionality.
