Parsing text from PDF file in application hosted in Google App Engine - java

I am working on uploading a PDF file (stored locally or in Google Drive), saving it, and then parsing the text from it. The text is then written to a Word document.
This functionality works in my locally hosted application. However, after deploying it to Google App Engine, I can no longer parse a PDF file.
How can I read a PDF file in a Java application hosted on Google App Engine?
In Tomcat:
new PDFParser(new RandomAccessFile(file, "r")); // reads and parses the PDF
In GAE:
new PDFParser(new RandomAccessFile(file, "r")); // throws "access denied"

Programs running on Google App Engine cannot do everything that a program on a local workstation can. For example, there is no normal GUI, which can be a hindrance to code that uses AWT, even if only to create an image file or to interpret font information.
Thus, programs and libraries not developed with GAE in mind may fail when deployed to GAE.
Some libraries have special versions for use with GAE, e.g. for iText there is iTextG.
As there are similar restrictions on Android, switching to an Android version of one's library may also help.
As a bottom line, when developing for GAE you have to check whether your libraries are compatible with GAE. If they aren't, you have to switch, either to GAE (or Android) versions of them or other libraries altogether.
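As an illustration of avoiding file-system access, here is a minimal sketch that keeps the uploaded bytes in memory instead of opening a RandomAccessFile. It assumes the PDF library in use has a parser entry point that accepts an InputStream or a byte array; that is not confirmed in this thread, so check the library's documentation. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: buffers an upload in memory, no file-system access.
class InMemoryUpload {

    // Copies the whole stream into a byte array.
    static byte[] readFully(InputStream in) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // A PDF file starts with the ASCII marker "%PDF-".
    static boolean looksLikePdf(byte[] data) {
        byte[] magic = {'%', 'P', 'D', 'F', '-'};
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for the upload stream a servlet request would provide.
        InputStream upload = new ByteArrayInputStream(
                "%PDF-1.4 sample".getBytes(StandardCharsets.US_ASCII));
        byte[] data = readFully(upload);
        System.out.println("PDF header found: " + looksLikePdf(data));
        // The bytes could now be handed to a parser that accepts an
        // InputStream, e.g. new ByteArrayInputStream(data).
    }
}
```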

Related

Configuring a .Properties file in Google App Engine

I have tried to use a Properties file with Google App Engine. I can read a Properties file, but I can't write to one because FileOutputStream is not supported by Google App Engine's Java runtime environment.
Is there any way to write data to a Properties file on Google App Engine? Or is there any other way to read and write some text to a file without using the Datastore?
Writing data to a file is not supported on App Engine. If you only need to read files, you can ship them with your application, or better still put them in your WEB-INF folder so that they are not directly accessible.
If all you are trying to do is model the properties, then consider using the Datastore to model a Property entity. A generic piece of code for reading and writing properties from the Datastore could help you in other projects too.
If you do want to read and write files, you have the option of Google Cloud Storage or Google Drive, whose APIs are accessible from a Google App Engine application, but I believe that might be overkill for what you want.
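A minimal sketch of the idea, using only java.util.Properties: the properties are serialized to a String and parsed back, so no FileOutputStream is needed. Where the resulting text is persisted (a Datastore entity, a Cloud Storage object) is left open here; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: converts Properties to and from plain text in memory.
class PropertiesAsText {

    // Serializes the properties in the standard .properties format.
    static String toText(Properties props) {
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    // Parses text in .properties format back into a Properties object.
    static Properties fromText(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("greeting", "hello");
        String text = toText(props);          // store this string wherever writes are allowed
        Properties restored = fromText(text); // read it back later
        System.out.println(restored.getProperty("greeting"));
    }
}
```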

How to edit native google documents programmatically? [duplicate]

I found a few rather discouraging Q&As here which say that Google documents cannot be modified programmatically through the Google Drive API; there is only an upload/download option.
I checked these similar topics:
How to programmatically manipulate native google doc files
How do you create a document in Google Docs programmatically?
As I understand it, we cannot directly download and upload the native Google Docs format. Is there any other way to meet this requirement?
Has anyone tried to trigger a Google Apps Script programmatically on a selected document? Is it possible to start a Google Apps Script programmatically with some input parameters?
I just need to replace a few pieces of text in native Google documents, but I cannot use a download->modify->upload flow (e.g. with Word/HTML/PDF formats) because that would break the formatting of pictures, borders, etc. (customer requirement: full Google integration, no proprietary formats).
Do you have any ideas or tips that would be worth exploring?
We are trying to use Google Drive as a very simple templating system (thousands of users, hundreds of Google documents), but it seems to be the wrong approach because there are a lot of limitations along the way.
You can't use the Drive API to programmatically manage the content of a Google Document but you can use the Document Service in Apps Script to perform text replacing and other editing:
https://developers.google.com/apps-script/service_document
We invoke a Google Apps Script, deployed on the same domain as the web app, which changes the content of the documents before we download them in a proprietary format. We are just replacing a few strings, nothing complex.
This solution works but it is a bit fragile (you have to install the Apps Script and the Google App Engine app in one domain). We are not sure how quickly changes are propagated after the script is triggered, so we always wait a short time, e.g. 10 seconds, before we try to download the modified document.
An important disadvantage is that you cannot invoke the script from localhost, so development is a bit slower because we have to upload our app to Google App Engine each time.
Nowadays it's possible to edit documents from Java and other programming languages through the Google Docs API, without having to use Google Apps Script.
It is also possible to execute Google Apps Script code from other programming platforms by using the Google Apps Script API, but that doesn't work with service accounts.
Notes:
There are some features available in the Google Docs user interface that aren't available in the Google Docs API.
Inserting content inside tables that have rows and columns of different sizes might be complex because of the way the indices work. Something that might help is to build the document from bottom to top, so that each insertion does not shift the indices of the positions still to be edited.
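The bottom-to-top idea can be illustrated on a plain string. This is only a model of the index behavior and does not call the Google Docs API. The Insert type and the class name are hypothetical. Insertions are applied in descending index order, so the positions that are still pending keep their original meaning.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical model: applies insertions from the end of the text backwards.
class BottomUpInserts {

    // One pending insertion, addressed by an index into the ORIGINAL text.
    static final class Insert {
        final int index;
        final String text;

        Insert(int index, String text) {
            this.index = index;
            this.text = text;
        }
    }

    static String apply(String document, List<Insert> inserts) {
        List<Insert> ordered = new ArrayList<>(inserts);
        // Highest index first: later text moves, earlier indices stay valid.
        ordered.sort(Comparator.comparingInt((Insert i) -> i.index).reversed());
        StringBuilder result = new StringBuilder(document);
        for (Insert insert : ordered) {
            result.insert(insert.index, insert.text);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        List<Insert> inserts = Arrays.asList(
                new Insert(5, ","),
                new Insert(11, "!"));
        System.out.println(apply("Hello world", inserts)); // prints "Hello, world!"
    }
}
```

Applying the same two insertions top to bottom would require adding 1 to the second index after the first insertion; working backwards avoids that bookkeeping.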

How do I access a file's path without using ActiveX or Java

I'm developing a client-based application, and one of my projects needs to access a file and then move it to another folder over the network.
I wrote an ActiveX control for this, but some of my customers said that they don't use Internet Explorer, so I decided to move my program to Java.
Today I read the news that Apple is removing Java from all OS X web browsers. They still support Java, but they have stopped including pre-installed versions of Java in OS X.
So I want to know: is there any way to solve this problem with JavaScript or something else? I don't want to use external plug-ins.
There is no way to ask a web browser to move a file without using a plugin.
An <input type="file"> will suffice for uploading a file to the server. The File API will allow you to do more than just upload it, but that doesn't extend to moving it.

How to run java applications in Google App Engine?

I have created a simple Java application containing buttons, text fields and so on. I have created the JAR file and also the JNLP file for it and had it signed by Jarsigner.
Now I want to be able to run the java application (Java Web-Start) from a Google App Engine project. I used:
<a href="/.../mycode.jnlp">Launch application</a>
This would work on a normal HTML page but does not work in my Java App Engine project, which uses JSP pages. If I click the link on App Engine, it just downloads the JNLP file and doesn't run the Java program.
I have searched for solutions to this to no avail. Any help would be appreciated!

Multiple file uploader with previews

I'm trying to find something that will let users upload multiple files to a website. The requirements are that it let them easily select multiple files (preferably with something like check boxes) and that it displays a preview of the images they select.
I'd prefer to only use Javascript or Flash if possible, but Java is also an option (this needs to work on platforms where Silverlight isn't available).
So far all I've been able to find are things that use the native file selector (which doesn't show previews on Windows, and makes it unclear that you can select multiple by holding ctrl).
I'm not sure if the preview requirement is even possible, but it's the most important.
This is a Firefox solution:
It uses the FileReader JavaScript object to load, display and upload images.
http://hacks.mozilla.org/2011/01/how-to-develop-a-html5-image-uploader/
It still doesn't show previews in the file selection dialog, but at least it lets you preview the images before uploading.
And here is a ready made java applet solution:
http://jumploader.com/doc_overview.html
To upload multiple files I use RichFaces rich:fileUpload component.
Concerning the preview, I had a similar problem, and the best I found after a couple of days of googling is the following.
Alfresco has the same problem and solved it this way:
OpenOffice runs in server mode (socket), and Alfresco sends all office documents to it to be converted to PDF
Those PDFs are converted to a .swf viewer with SWFTools
This .swf is embedded in the HTML
For images, I suppose it uses ImageMagick to create a small version of the file
Personally, I will try to implement it this way:
Convert office documents to PDF with OpenOffice in socket mode
Transform the first page of the PDF into a PNG with the JPedal library
Display that PNG to the end user
For images I would perhaps use ImageMagick too, but for now I'm using the Seam Image.scaleToFit API
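For the image step, a scale-to-fit thumbnail can be sketched with the standard java.awt.image classes. This is not the Seam implementation, only an assumption of what such a helper might do, and the class name is hypothetical. It relies on java.awt, so it is meant for a regular JVM rather than a restricted environment.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// Hypothetical helper: shrinks an image to fit a bounding box, keeping aspect ratio.
class ThumbnailSketch {

    static BufferedImage scaleToFit(BufferedImage source, int maxWidth, int maxHeight) {
        // Use the smaller ratio so both dimensions fit; never enlarge (cap at 1.0).
        double scale = Math.min(1.0, Math.min(
                (double) maxWidth / source.getWidth(),
                (double) maxHeight / source.getHeight()));
        int width = Math.max(1, (int) Math.round(source.getWidth() * scale));
        int height = Math.max(1, (int) Math.round(source.getHeight() * scale));

        BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D g = target.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return target;
    }

    public static void main(String[] args) {
        // Stand-in for an uploaded image; real code might decode it with ImageIO.read.
        BufferedImage original = new BufferedImage(400, 200, BufferedImage.TYPE_INT_ARGB);
        BufferedImage thumb = scaleToFit(original, 100, 100);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight());
    }
}
```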
I am assuming 2 things here:
1) Some kind of client/enduser will be doing the file upload
2) You get some kind of say on what the client installs on their computer to help make this happen.
If this is the case, my first suggestion would be:
Give them FTP or SFTP client software to upload files. The PHP page you make can link to FileZilla, along with instructions on how to use it. FTP and SFTP are the protocols designed for transferring files. HTTP is not well designed for it, and neither are browsers.
Once the user has the (S)FTP client software installed, you can give them URLs specific to their user account to upload files to, and you can have a backend script process and load or move the files they upload. It's fairly easy to create a temporary directory with a server-side script, have the client upload files via FTP, and then have them go back to the web browser and click a button that says "Done uploading, please process my stuff".
The browser can even give back confirmations on everything that gets uploaded/processed.
