I've been having issues using POI-3.10-FINAL where editing a PPTX doesn't fully work. I noticed that I am successfully able to add new slides, but modifications to shapes (in this case, a table) aren't reflected in the outputted PPTX file.
I was able to fix it by switching from poi-ooxml-schemas-*.jar to ooxml-schemas-1.1.jar but the resulting PPTX file seems to be corrupted: PowerPoint 2007 fails to open it but PowerPoint 2010 repairs it first, then properly opens it.
In investigating the issue, I noticed that the "docProps/app.xml" is not being updated correctly (I'm assuming other files within the PPTX aren't being updated as well).
Any ideas?
I've been able to troubleshoot and fix my POI issues using Microsoft's Open XML SDK. The SDK scans a PPTX file (or any other Open XML document) and compiles a list of all the errors it finds.
In my case, I was setting the text value of one of my table cells to null. In turn, POI generated the xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" declaration on the topmost slide tag and used xsi:nil="true" within the cell's tag, which PowerPoint absolutely did not like.
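A minimal sketch of a guard against that null case. The class and method names (CellText, orEmpty) are hypothetical and not part of POI; the only assumption about POI is that the cell text is eventually passed to a setter such as setText(String) on the table cell.

```java
/** Hypothetical helper, not part of POI: normalizes text before it is written to a table cell. */
final class CellText {
    private CellText() {}

    /** Returns the given text unchanged, or an empty string when it is null. */
    static String orEmpty(String text) {
        return text == null ? "" : text;
    }
}
```

With this in place, a call along the lines of cell.setText(CellText.orEmpty(value)) never hands null to POI, so no xsi:nil attribute should be needed for that cell.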
Another issue I was having was that I was modifying and creating new rows and columns within my table. To make things easier in my code, whenever I did anything to a cell, I set the border to black with a width of 1 and the fill color to white. For some odd reason, it seems that POI was not replacing the border information but appending it, which left the PPTX with two conflicting values rather than one (I have to investigate this one further, but checking the border and fill information before trying to set them definitely fixed my issue).
These issues were rather easy to fix once I figured them out.
Related
I am trying to understand how the FolioReader algorithm works. One problem which I have come across is that it does not copy the css for certain files when it loads it to the web view.
This is the algorithm I am using:
https://github.com/FolioReader/FolioReader-Android
A certain case I am working with has the following problem: It loads the file, however some titles which are centred are moved to the left.
I'm trying to find a way to copy the whole slide, and paste it with the formatting of 'as an image' to a blank slide, using POI's APIs.
The reason why I'm trying to do this is because I want to save each slide as an image.
POI's Slide.draw() API by itself does not do a very good job of saving the slides' contents to images.
For example it cannot draw the following two types of objects:
Tables created on PowerPoint
Charts pasted from Excel
Is there any way to 'copy' and 'paste with formatting (as an image)' via POI, just as you would do these two operations in MS PowerPoint on Windows, in order to save the slides as they are?
Since Slide.draw() works just fine with images, once I paste all the objects as a single flat image onto a slide (using POI), I'll be good to go.
Or if there is a better way to save the slides contents as intact as possible, could you please let me know?
Whatever method I use needs to be under the Apache license or something more permissive.
Also, I read the following post:
Programmatically extracting slides as images from a PowerPoint presentation (.PPT)
unoconv is GPL-licensed, so it is not an option for us.
JODConverter is LGPL-licensed; I'm not sure if it's acceptable so I will talk to my boss and check.
Then I ran POI as a command line tool, as suggested by Michael, but I ended up with the same problem (tables and pasted Excel charts do not show up in the saved images).
Thanks.
I've been trying to use Batik and iText to create a PDF in my application containing an SVG graph, but I can only find examples where the SVG is placed at fixed coordinates. I don't want that. I want the image to flow within the page, preferably scaled to the page width whether the page is landscape or portrait, inline with any other content I might add, with everything wrapping around it.
Is this possible? I'm beginning to suspect my wish that somewhere it will start behaving like HTML is going to be in vain.
Many thanks,
Andrew
P.S. I have a working Java class to place the SVG on the page as a template, as per this tutorial:
http://itextpdf.com/examples/iia.php?id=263
P.P.S. For people looking for this issue in relation to Vaadin Charts, I've tagged the question as Vaadin related as well, since that's what generated my SVG.
I'm having some trouble with JasperReports. I created a form with iReport that includes two subreports, each of which generates a grid of values (1 or 2 characters long).
The PDF compiled by iReport works fine and looks good, but if I use the same *.jrxml and *.jasper files in my web app, the generated PDF has some minor differences. One big problem is that some cells of the grid are now 2 lines high. Values like "NB" only use one line, but "GS", for example, uses 2 lines.
I can't find the cause. Workarounds with a smaller font size or wider cells didn't help.
Make sure the font you are using in the template is available on the JVM generating the report. If the font doesn't exist then a different font will be used. If changing the font isn't an option then you can create a font extension package. Creating a font extension is documented here: JasperReports Font Reference
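A quick way to check this is to list the font families the JVM can see and look for the one used in the template. The sketch below uses only the standard java.awt.GraphicsEnvironment API; the class name FontCheck and its methods are hypothetical, and the family name you pass in is whatever your template uses.

```java
import java.awt.GraphicsEnvironment;
import java.util.Arrays;
import java.util.Collection;

/** Hypothetical helper: checks whether a font family is visible to a JVM. */
final class FontCheck {
    private FontCheck() {}

    /** Case-insensitive lookup of a family name in a collection of installed family names. */
    static boolean isInstalled(String family, Collection<String> installedFamilies) {
        for (String name : installedFamilies) {
            if (name.equalsIgnoreCase(family)) {
                return true;
            }
        }
        return false;
    }

    /** Font family names available to the JVM this code runs in. */
    static Collection<String> jvmFontFamilies() {
        return Arrays.asList(
                GraphicsEnvironment.getLocalGraphicsEnvironment().getAvailableFontFamilyNames());
    }
}
```

Running FontCheck.isInstalled(templateFont, FontCheck.jvmFontFamilies()) inside the web application (not on the machine where iReport runs) shows whether the server JVM can see the template's font.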
Sounds like your web application could be using a different version of JasperReports than iReport does. Making the cells sufficiently wide should at least allow the text to fit on one line.
Create a Java desktop test that generates a PDF from the .jrxml and check whether it produces the same results as the web app. If it does, the difference lies in how iReport is working; if it doesn't, you know the problem is in how you are creating or viewing the PDF in the web app.
Is there any free Java library for extracting text from PDF that is compatible with Google App Engine?
I've read about PDFJet, but it can't read PDF, can it?
Is there perhaps another way to extract text from PDF? I tried http://www.pdfdownload.org/, but unfortunately it doesn't handle non-English characters correctly.
iText now has a text parsing module (I'm one of the parser authors). See the com.itextpdf.text.pdf.parser.PdfContentReaderTool class for an example of how to use it.
PdfBox does not run on GAE. It uses Java classes that are not allowed there.
(GAE only permits these http://code.google.com/appengine/docs/java/jrewhitelist.html)
I have partially modified a very old version of PdfBox (0.7.3) to be GAE compliant. Now I'm able to extract text from PDF (a whole page or a rectangular area). I only modified the minimum needed for PDF text extraction, not the whole of PdfBox. :)
The idea was to remove references to java.awt.Rectangle and the like, using my own "rectangle" class.
More info: http://fhtino.blogspot.com/2010/04/pdfbox-text-extration-gae.html
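The replacement idea can be sketched as below. This is a hypothetical stand-in, not the class from the linked post: a plain value type with just enough geometry for region-based text extraction and no dependency on java.awt. The name SimpleRect and the choice of float coordinates are assumptions.

```java
/** Hypothetical stand-in for java.awt.Rectangle with no AWT dependency. */
final class SimpleRect {
    final float x;
    final float y;
    final float width;
    final float height;

    SimpleRect(float x, float y, float width, float height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    /** True when the point lies inside the rectangle (left/top edges inclusive, right/bottom exclusive). */
    boolean contains(float px, float py) {
        return px >= x && px < x + width
            && py >= y && py < y + height;
    }
}
```

Text positions reported during extraction can then be filtered with contains(...) instead of the AWT class, which is the kind of substitution the whitelist restriction calls for.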
I modified the latest (1.8.0-Snapshot) version to run on Google App Engine. I had to disable one unit test, but it runs fine for simple text extraction.
Following a simple try-fail-fix approach, I had to modify 5 files in total. Pretty doable.
You'll also have to explicitly use a RandomAccessBuffer, like Fabrizio explained.
For the extra lazy, here's the compiled jar, the dependencies for text extraction, and the patch. Note that it might not work for every use case (e.g. rectangle-based extraction). I used it to extract the text of a whole page.
https://docs.google.com/folder/d/0B53n_gP2oU6iVjhOOVBNZHk0a0E/edit
I know there is http://pdfbox.apache.org/index.html
Apache PDFBox is an open source Java PDF library for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents.
but I've never tested it.
Last month I finished extracting text from PDF files in my project. I used the Xpdf tool to get the text and the text coordinates, but I used it in Xcode (Objective-C). The tool is open source, written in C++, and can be used from many languages. However, I don't know whether Xpdf would work with your Java code. Anyway, you can give it a try.