I am using the Apache POI library to create PowerPoint slides with Java.
Our client is interested in embedded text, images and videos. No fancy
stuff like charts etc. is needed for now. I understand that XSLF is still
under development and not yet a mature product.
I have achieved my goal using the Apache POI HSLF model, but one thing is missing: embedded videos do not show any playback controls. After a little research I found that the difference comes from the ppt and pptx file formats. To solve this issue I am now migrating from HSLF to XSLF. Unfortunately, the XSLF library has no method for adding a video file (unlike HSLF's addMovie method).
Which approach do you recommend? Is there any other way to show the playback controls in ppt files (and not pptx), for example through an additional ActiveX control or media player? If so, how should it be done using Java?
Beginning with PowerPoint 2010 it's possible to embed videos in PPTX files (... instead of linking them or using some kind of ActiveX/YouTube combo). If you embed MP4 videos you need to have the QuickTime plugin installed.
Regarding the playback controls, my PP 2010 viewer displays them when you move the mouse over the video shape. Sometimes they never show up again, when you click straight into the image instead of waiting for the popup.
The following code ...
fetches an MPEG (could also be a local file)
creates a snapshot of the frame at the 5th second, which is used as the preview image. I've used the Xuggle libs here, but of course any other libs are OK too (... plain JMF (without extension pack) couldn't handle these MPEGs)
embeds image and video
and adds some arbitrary ;) stuff, which PP needs to actually play the video
The code is in the XSLF examples.
(Update 2016-02-06: moved the code to the POI examples, so there's only one place to be modified in case of new features. Furthermore, there was a regression in POI 3.13 that made it impossible to add pictures after adding movies to the media directory; this is fixed in the upcoming POI 3.14.)
I'm trying to find a way to copy the whole slide, and paste it with the formatting of 'as an image' to a blank slide, using POI's APIs.
The reason why I'm trying to do this is because I want to save each slide as an image.
POI's Slide.draw() API by itself does not do a very good job of saving the slides' contents to images.
For example it cannot draw the following two types of objects:
Tables created on PowerPoint
Charts pasted from Excel
Is there any way to 'copy' and 'paste with formatting (as an image)' via POI, just as you would do these two operations in MS PowerPoint running on Windows, in order to save the slides as they are?
Since Slide.draw() works just fine with images, once I paste all the objects as a single flat image onto a slide (using POI), I'll be good to go.
Or if there is a better way to save the slides contents as intact as possible, could you please let me know?
The license which comes along with the method needs to be Apache license or otherwise something more permissive.
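For reference, the Slide.draw() route described above boils down to painting onto the Graphics2D of an off-screen image. Below is a minimal sketch of that step in plain Java 2D. The class name SlideSnapshot and the painter parameter are made up for illustration; with POI one would presumably pass the slide's draw method as the painter, which is an assumption and not shown here. It does not by itself fix the missing tables or charts.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.function.Consumer;

/** Renders anything that can paint itself onto a Graphics2D into an image. */
final class SlideSnapshot {

    /**
     * @param width   page width in pixels
     * @param height  page height in pixels
     * @param painter draws the content; hypothetical stand-in for a slide's draw call
     */
    static BufferedImage render(int width, int height, Consumer<Graphics2D> painter) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_ANTIALIASING,
                               RenderingHints.VALUE_ANTIALIAS_ON);
            // A fresh RGB image is black, so paint a white page background first.
            g.setPaint(Color.WHITE);
            g.fillRect(0, 0, width, height);
            painter.accept(g);
        } finally {
            g.dispose();
        }
        return image;
    }

    public static void main(String[] args) {
        // Stand-in painter: a red rectangle instead of a real slide.
        BufferedImage img = render(200, 100, g -> {
            g.setPaint(Color.RED);
            g.fillRect(10, 10, 50, 50);
        });
        System.out.println(img.getWidth() + "x" + img.getHeight());
    }
}
```

The resulting BufferedImage can then be written out with javax.imageio.ImageIO.write(image, "png", file).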
Also, I read the following post:
Programmatically extracting slides as images from a PowerPoint presentation (.PPT)
unoconv is GPL-licensed, so it is not an option for us.
JODConverter is LGPL-licensed; I'm not sure if it's acceptable so I will talk to my boss and check.
Then I ran the POI as a command line tool, as suggested by Michael,
but I ended up getting the same problem (tables and pasted Excel charts
do not show up in the saved images.)
Thanks.
This is my first question, or rather set of questions. I am in my last semester of college, and reading PDFs is one of the components I am developing for my thesis.
I have been reading questions about reading PDF documents, but none of them has a solid answer. What are the ways to read a PDF document? From what I have read, there are APIs that can read PDF documents, such as PDFBox, muPDF, and iText. I have not seen any others, but these are the ones mentioned in other posts.
First, PDFBox: I read that it cannot be used because of its AWT dependencies, and Android has no AWT or Swing classes, so PDFBox is out of the question. muPDF: I have not read anything about it; it was recommended to me, but I want to know whether it is usable for reading PDF documents. iText: this is the API I encounter most often in PDF and Android related questions; the problem here is its license? (Correct me if I'm wrong.) I have not tried any of these three yet because I want to know whether there are other solutions besides them.
Other than APIs, I think PDF reader applications can be used too, if I'm not mistaken? If so, how? I'm not looking for code but for an explanation of how you did it and how you implemented it in your application.
I have thought of another way, but I do not know if it is possible: how about converting the PDF document into a .txt or .doc file on the Android device? When I load a PDF document, code would convert it into a .txt/.doc file, and the application would search and extract text from that file rather than from the PDF document.
If you are asking why I need this kind of component: I'm working on an application that searches and extracts text from a PDF document on Android.
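To make the "convert first, then search" idea concrete, here is a rough sketch of only the search half, assuming the PDF text has already been extracted into a String by whichever library ends up being used. TextSearch and findLines are made-up names, and the matching rule (case-insensitive substring, line by line) is just one possible choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Searches already-extracted text; the extraction itself is assumed to happen elsewhere. */
final class TextSearch {

    /** Returns every line of the text that contains the query, ignoring case. */
    static List<String> findLines(String extractedText, String query) {
        List<String> hits = new ArrayList<>();
        String needle = query.toLowerCase(Locale.ROOT);
        // "\\R" matches any line break (\n, \r\n, ...).
        for (String line : extractedText.split("\\R")) {
            if (line.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(line.trim());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Chapter 1\nAndroid basics\nReading a PDF on android\nThe end";
        for (String hit : findLines(text, "android")) {
            System.out.println(hit);
        }
    }
}
```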
These are my questions:
What are the ways to read a PDF Document in Android?
What are your experiences in using this kind of method?
How did you do it using this kind of method(just a flow/explanation would do)?
If the method has a License what would be the problem in the future?
PS: Correct me if I'm wrong.
Thank you.
This is a very iText specific answer so it does not answer all your questions, but it may still be of help to you.
What are the ways to read a PDF Document in Android?
I use the iText Java bindings (keep reading to find out about LGPL licensing).
What are your experiences in using this kind of method?
Great! It covers all things PDF related. (Older versions may be a little different)
How did you do it using this kind of method(just a flow/explanation would do)?
I assume this question is related to the "other way" that you thought of so it is not relevant to the iText PDF library?
If the method has a License what would be the problem in the future?
I still encourage you to use the current version of iText. However, iText 2.1.7 and older fall under the old LGPL license, which gives you far more free rein and is more suitable for commercial or private/closed use with no real problems. From what I can tell, all the functionality you are after is available in version 2.1.7.
The AGPL license for the current version of iText is pretty decent. From what I understand, you do need to publish your program under a similar license and make the code freely available to others (it would pay to check the details, though). If sharing code is not a problem, then the latest version of iText is worth looking into.
References:
LGPL License: http://www.gnu.org/licenses/lgpl.html
iText AGPL License: http://itextpdf.com/terms-of-use/agpl.php
I was also working on an Android application where I needed to open PDF files. At first I thought about doing this with third-party APIs. I tried to use muPDF, but it was not a good experience.
Then our team leader suggested first installing a PDF viewer application on the device and then using code to open the PDF through that installed viewer.
You can easily download a PDF viewer application from Google Play; then you can use the following code to open the PDF from your application:
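// Needs java.io.File, android.net.Uri and android.content.Intent; run inside an Activity.
File file = new File("/sdcard/MyPDFfile.pdf");
Uri path = Uri.fromFile(file);
// ACTION_VIEW plus the MIME type lets Android pick an installed PDF viewer.
Intent intent = new Intent(Intent.ACTION_VIEW);
intent.setDataAndType(path, "application/pdf");
intent.setFlags(Intent.FLAG_ACTIVITY_CLEAR_TOP);
// Throws ActivityNotFoundException if no installed app can handle the intent.
startActivity(intent);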
This code automatically looks for an installed PDF application on your device and passes the intent to that application to open the PDF file.
I have just started working with GWT. I was wondering how I can move widgets dynamically on the fly (on the web page in the client's browser), for example to move a table row up and down, or to upload an Excel file and display its content right away. I am talking about something like a dashboard. Is there a comprehensive tutorial I can refer to?
Have a look at the gwt-dnd lib:
http://code.google.com/p/gwt-dnd/
GWT is made for doing the kind of things you are describing. To move widgets you can either set their position or dynamically modify their CSS. To move rows around in a table, look at the API of whatever table class you are using. To upload an Excel file, do a Google search for 'gwt upload' and you will find some instructions, but to display the file you will need to convert it (probably to XML). How you convert the file on the server depends on which server you are using. I have also seen a third-party widget that will do that for you.
If you're looking for transition effects or animations, then check out gwtquery. It's very similar to jQuery and has simple, good examples to start with.
I'm using a JTree to browse the content of a folder, and when a user clicks on a file I want the software to show a preview of it (a screenshot of its first page).
The files are mostly Office documents and PDF.
I managed to do it for PDF files using a module downloaded from Sun, but I'd like to know if there is a way to do it using any software (JARs preferably) or even the built-in Windows API.
I was thinking of converting the file to PDF and then previewing that PDF, but this isn't optimal.
Any ideas ?
I had a similar problem, and the best I found after a couple of days of googling is the following.
Alfresco has the same problem and resolved it with:
OpenOffice running in server mode (socket); Alfresco sends all the office documents to OpenOffice to convert them to PDF
Those PDFs are converted to a .swf viewer with SWFTools
This .swf is integrated in the HTML
For images, it uses ImageMagick to create a small version of the file, I suppose
Personally, I will try to implement it this way:
Converting office documents to PDF with OpenOffice in socket mode
Transforming the first page of the PDF into a PNG with the JPedal library (the LGPL version)
Displaying that PNG to the end user
For images I would perhaps use ImageMagick too, but for now I'm using the Seam Image.scaleToFit API
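The image step can also be done without any extra library. The following is a minimal plain Java 2D sketch of a "scale to fit" operation; it is a stand-in written for illustration, not Seam's implementation, and the names Thumbnails and scaleToFit as well as the "never enlarge" rule are my own assumptions.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Shrinks an image so it fits inside a bounding box, keeping the aspect ratio. */
final class Thumbnails {

    static BufferedImage scaleToFit(BufferedImage source, int maxWidth, int maxHeight) {
        // The smaller of the two ratios is the one that makes both sides fit.
        double ratio = Math.min((double) maxWidth / source.getWidth(),
                                (double) maxHeight / source.getHeight());
        ratio = Math.min(ratio, 1.0); // assumption: small images are not enlarged
        int width = Math.max(1, (int) Math.round(source.getWidth() * ratio));
        int height = Math.max(1, (int) Math.round(source.getHeight() * ratio));

        BufferedImage scaled = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return scaled;
    }

    public static void main(String[] args) {
        BufferedImage page = new BufferedImage(800, 600, BufferedImage.TYPE_INT_RGB);
        BufferedImage preview = scaleToFit(page, 200, 200);
        System.out.println(preview.getWidth() + "x" + preview.getHeight());
    }
}
```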
I had the same problem too and stumbled over this thread. Starting from Anthony's solution, I am using LibreOffice in socket mode to convert office documents directly to PNG. Unfortunately this isn't possible for PDFs. Here is a good overview of which ways are possible.
unoconv --connection 'socket,host=127.0.0.1,port=2220,tcpNoDelay=1;urp;StarOffice.ComponentContext' -f png -e PageRange=1 your_file_name.extension
Little reference to start Libre Office in socket mode: click me
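Since the question is about a Java application, the unoconv call above could be launched from Java roughly like this. It is only a sketch: UnoconvPreview, buildCommand and convert are made-up names, the arguments are copied from the command line above, and it assumes unoconv is on the PATH and the office instance is already listening on the given port.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Builds and runs the unoconv command shown above for a single input file. */
final class UnoconvPreview {

    /** Assembles the argument list; each element is passed as-is, so no shell quoting is needed. */
    static List<String> buildCommand(String host, int port, String inputFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("unoconv");
        cmd.add("--connection");
        cmd.add("socket,host=" + host + ",port=" + port
                + ",tcpNoDelay=1;urp;StarOffice.ComponentContext");
        cmd.add("-f");
        cmd.add("png");
        cmd.add("-e");
        cmd.add("PageRange=1");
        cmd.add(inputFile);
        return cmd;
    }

    /** Runs the conversion and returns the process exit code. */
    static int convert(String host, int port, String inputFile)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(host, port, inputFile))
                .inheritIO() // show unoconv's own output and errors
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        System.out.println(String.join(" ", buildCommand("127.0.0.1", 2220, "report.docx")));
    }
}
```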
I asked this a long time ago: solution
Is there any free Java library for extracting text from PDF that is compatible with Google App Engine?
I've read about PDFJet, but it can't read PDF, can it?
Is there perhaps another way to extract text from PDF? I tried http://www.pdfdownload.org/, but unfortunately it doesn't handle non-English characters correctly.
iText now has a text parsing module (I'm one of the parser authors). See the com.itextpdf.text.pdf.parser.PdfContentReaderTool class for an example of how to use it.
PdfBox does not run on GAE. It uses Java classes that are not allowed.
(GAE only permits these http://code.google.com/appengine/docs/java/jrewhitelist.html)
I have partially modified a very old version of PdfBox (0.7.3) to be GAE compliant. Now I'm able to extract text from PDF (whole page or rectangular area). I modified only the minimum part needed for PDF text extraction, not the whole PdfBox. :)
The idea was to remove references to java.awt.Rectangle & co. by using my own "rectangle" class.
More info: http://fhtino.blogspot.com/2010/04/pdfbox-text-extration-gae.html
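To illustrate the kind of replacement meant here, a home-grown rectangle without any java.awt dependency can be very small. The sketch below is not the class from the modified PdfBox; TextRegion, its fields, and the half-open contains rule are assumptions made up for this example.

```java
/** AWT-free rectangle, usable where java.awt classes are not allowed. */
final class TextRegion {
    private final float x;
    private final float y;
    private final float width;
    private final float height;

    TextRegion(float x, float y, float width, float height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    /** True if the point lies inside the region; right and bottom edges are exclusive. */
    boolean contains(float px, float py) {
        return px >= x && px < x + width
            && py >= y && py < y + height;
    }

    public static void main(String[] args) {
        TextRegion region = new TextRegion(10, 10, 100, 50);
        System.out.println(region.contains(20, 20));
        System.out.println(region.contains(200, 20));
    }
}
```

Text extraction limited to an area then only needs to ask the region whether each character's position falls inside it.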
I modified the latest (1.8.0-Snapshot) version to run on Google App Engine. I had to disable one unit test, but it runs fine for simple text extraction.
Following a simple try-fail-fix approach, I had to modify 5 files in total. Pretty doable.
You'll also have to explicitly use a RandomAccessBuffer, like Fabrizio explained.
For the extra lazy, here's the compiled JAR, the dependencies for text extraction, and the patch. Note that it might not work for every use case (e.g. rectangle-based extraction). I used it to extract the text of a whole page.
https://docs.google.com/folder/d/0B53n_gP2oU6iVjhOOVBNZHk0a0E/edit
I know there is http://pdfbox.apache.org/index.html
Apache PDFBox is an open source Java
PDF library for working with PDF
documents. This project allows
creation of new PDF documents,
manipulation of existing documents and
the ability to extract content from
documents.
but I've never tested it.
Last month I finished extracting text from PDF files in my project. I used the Xpdf tool to get the text and the text coordinates, but I used it from Xcode (Objective-C). The tool is open source and written in C++, so it can be integrated into many languages. However, I don't know whether Xpdf would work from Java. Anyway, you can try this tool.