My employer has tasked me with determining the feasibility of migrating our existing in-house change management service (web based) to a SharePoint solution. Everything has been easy so far, except for one issue: each change management item (there are several thousand) may have any number of attachment files associated with it, and the attachments are invoked through JavaScript. These files need to be downloaded and put into a document library.
(ex. ... onClick="DownloadAttachment(XXXXX,'ProjectID=YYYY');return false">Attachment... ).
To avoid selecting them all manually, I've been looking over posts from people wanting to do something similar. There seem to be many possible solutions, but they often seem more complicated than they need to be.
So I suppose in a nutshell I'm asking what would be the best way to approach this issue that yields some sort of desktop application or script that can interact with web pages and will let me select and organize all the attachments. (Making a purely web-based app (PHP, JavaScript, Rails, etc.) is not an option for me, so I'm throwing that out there now.)
Thanks in advance.
1. Given a document id and project id (XXXXX and YYYY respectively in your example), figure out the URL from which the file contents can be downloaded. You can observe a few URL links in the browser and detect the pattern which your web application uses.
2. Use a tool like Selenium to get a list of the XXXXXs and YYYYs of the documents you need to download.
3. Write a bash script with wget to download the files locally and put them in the correct folders.
This is a "one off" migration, right?
1. Get access to your in-house application's database, and create an SQL query which pulls out rows showing the attachment names (XXXXX?) and the issue/project (YYYY?), e.g.:
| file_id | issue_id | file_name            |
|       5 |      123 | Feasibility Test.xls |
2. Analyze the DownloadAttachment method and figure out how it generates the URL that it calls for each download.
3. Start a script (personally I'd go for Python) that will do the migration work.
4. Program the script to connect and run the SQL query, or to read a CSV file you create manually from step #1.
5. Program the script to use the details to determine the target filename and the URL to download from.
6. Program the script to download the file from the given URL, and place it on the hard drive with the proper name. (In Python, you might use urllib; in Python 3 the relevant module is urllib.request.)
Hopefully that will get you as far as a bunch of files categorized by "issue" like:
issue123/Feasibility Test.xls
issue123/Billing Invoice.doc
issue456/Feasibility Test.xls
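The script described in this answer could be sketched in Python roughly as follows. This is only a minimal sketch under assumptions: the CSV columns are named as in the example table, and the URL pattern in `build_url` (host, path, and query parameter names) is entirely hypothetical and has to be replaced with whatever DownloadAttachment actually calls.

```python
import csv
import shutil
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical endpoint: replace with the URL pattern that
# DownloadAttachment really uses in your application.
BASE_URL = "http://changemgmt.example.com/download"


def build_url(file_id, issue_id):
    """Build the download URL for one attachment (assumed pattern)."""
    query = urllib.parse.urlencode({"AttachmentID": file_id, "ProjectID": issue_id})
    return f"{BASE_URL}?{query}"


def target_path(root, issue_id, file_name):
    """Return root/issue<issue_id>/<file_name>, matching the layout above."""
    return Path(root) / f"issue{issue_id}" / file_name


def migrate(csv_path, root="attachments"):
    """Download every attachment listed in a CSV file with the columns
    file_id, issue_id, file_name."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            file_id = row["file_id"].strip()
            issue_id = row["issue_id"].strip()
            destination = target_path(root, issue_id, row["file_name"].strip())
            destination.parent.mkdir(parents=True, exist_ok=True)
            # Copy the response body into the target file in chunks.
            with urllib.request.urlopen(build_url(file_id, issue_id)) as response:
                with open(destination, "wb") as out:
                    shutil.copyfileobj(response, out)
```

Calling `migrate("attachments.csv")` would then produce the per-issue folders. If the application requires a login, the requests would also need to carry the session cookie or credentials, which this sketch does not handle.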
Thank you, everyone. I was able to get what I needed using HtmlUnit and Java. I made a report of all change items with attachments, and the program traverses it, visits each item, copies the page source, searches that for instances of the download method, and copies the unique ID of each attachment. From this it builds an .xls of all items and their attachments.
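The extraction step (finding each call to the download method in a page's source and collecting the IDs) can also be illustrated with a regular expression. The sketch below is in Python rather than the HtmlUnit and Java combination the poster used, and it assumes the onClick markup looks exactly like the example in the question; real pages may quote or space the arguments differently. The IDs in the sample string are made up.

```python
import re

# Matches DownloadAttachment(<attachment id>,'ProjectID=<project id>')
# as it appears in the question's onClick example.
DOWNLOAD_CALL = re.compile(
    r"DownloadAttachment\(\s*(\w+)\s*,\s*'ProjectID=(\w+)'\s*\)"
)


def find_attachments(page_source):
    """Return (attachment_id, project_id) pairs found in the page source."""
    return DOWNLOAD_CALL.findall(page_source)


# Made-up sample markup in the shape shown in the question.
sample = (
    '<a href="#" onClick="DownloadAttachment(40211,\'ProjectID=873\');'
    'return false">Attachment A</a>'
    '<a href="#" onClick="DownloadAttachment(40212,\'ProjectID=873\');'
    'return false">Attachment B</a>'
)
print(find_attachments(sample))
```

The resulting pairs could be written to a CSV file and fed into whatever performs the actual downloads.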
I'm working on a small side-project for our company that does the following:
1. PDF-based documents received through Office 365 Outlook are temporarily stored in OneDrive, using Power Automate
2. Text data is extracted from the PDFs using a few Java libraries
3. Based on extracted data an appropriate filename and filepath is created
4. The PDFs are permanently saved in OneDrive
The issue right now is that my Java program runs locally, i.e. points 2, 3 and 4 require code to run 24/7 on my PC. I'd like to transition to a cloud-based solution.
What is the easiest way to accomplish this? The solution doesn't have to be free, but shouldn't cost more than $20/mo. Our company already has an Azure subscription, though I'm not familiar yet with Azure.
What you are looking for is a solution that uses a serverless computing execution model. Azure Functions seems to be a possible choice here. It appears to have input bindings that respond to OneDrive files, and likewise output bindings.
The cost depends on the number of documents processed, not on how long the solution is available. I assume we are talking about a small number of documents a month, so this should come out cheaper than other execution models.
I need help on how to propose a new website. I don't know how to start, and I hope you can guide me (whether it is better to make an applet, a servlet, use another technology, etc.).
I have a website in ASP that reads text files located on the server in the same directory as the site. There are n files (maybe about 300 plain text files generated by an external application). The website only reads them and generates a menu from the data they contain. Depending on the selected menu options, it reads specific files and passes this information to Flash movies, which generate statistical graphs.
The Flash movies are very old and cause problems in browsers. For example, they can't be loaded on all platforms. The ASP technology is also obsolete.
We want to change the technology and create a site that reads a series of text files hosted on the server and passes these parameters to a chart (we would use JavaScript libraries, for example Morris). We are interested in Java. What do you recommend? If it is Java, can this be done with applets, or should we use servlets? Or is there an easier way to do it?
I use amcharts (http://www.amcharts.com/) to generate our charts.
I build the data using classic ASP ... then pass it to an array in JS and then use the amcharts tools, which are very powerful and flexible.
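The same pattern (build the data on the server, hand it to the page as a JavaScript array, let the chart library draw it) does not depend on classic ASP. As a rough sketch in Python, assuming each text file holds one `label;value` pair per line (the real file layout is not given in the question, so this format is invented for illustration):

```python
import json


def parse_stats(text):
    """Turn 'label;value' lines into a list of dicts a chart library can consume."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        label, value = line.split(";", 1)
        rows.append({"label": label.strip(), "value": float(value)})
    return rows


def to_js_array(rows):
    """Serialize the rows as JSON text that the page's JavaScript can parse."""
    return json.dumps(rows)


# Made-up sample content in the assumed format.
sample = "2011;12.5\n2012;14\n"
print(to_js_array(parse_stats(sample)))
```

The server-side technology (servlet, ASP, or anything else) only has to return that JSON text; the charting itself stays in the browser.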
I want to use the class com.google.gwt.user.client.ui.FileUpload for file uploading, but I ran into the following issue: I cannot find out how to set its text programmatically. In TextBox I have the setText method to do it. How can I do it in FileUpload?
You cannot set it, since GWT's FileUpload wraps an HTML input type="file". There is no way to set the filename or path (text) on it.
It can only be selected by the user.
You can only read the value from it.
If you are talking about showing some text next to the control, like "select file", use a Label and add it.
http://en.wikipedia.org/wiki/JavaScript#Security
JavaScript and the DOM provide the potential for malicious authors to deliver scripts to run on a client computer via the web. Browser authors contain this risk using two restrictions. First, scripts run in a sandbox in which they can only perform web-related actions, not general-purpose programming tasks like creating files. Second, scripts are constrained by the same origin policy: scripts from one web site do not have access to information such as usernames, passwords, or cookies sent to another site
I have a requirement to process an external request to populate a HTML form with the parameters mentioned in the URL. This part is working fine. However, the URL also contains paths to files present on the client machine and I want to upload those files from the client machine to the server without user interaction.
Since it is not possible with HTML/JavaScript to programmatically select files, I tried the applet approach using JUpload. However, I am not able to figure out how to preselect a file on applet initialization. It is not necessary to upload the files right away, but I want at least to select the files automatically. The user can then review the info and submit the form and the files in the applet.
Is it possible with this library? Or can you direct me to some better path?
OK, so I found my answer in a different library with a similar name. With Smartwerkz JUpload we can pass the parameters preselectedFiles="filePath" and autostartUpload=true to preselect files and upload them automatically without user interaction. I hope it will help someone someday.
I need some ideas on how I can best solve this problem.
I have a JBoss Seam application running on JBoss 4.3.3.
A small portion of this application generates an HTML and a PDF document based on an OpenOffice template.
The files that are generated I put inside /tmp/ on the filesystem.
I have tried with System.getProperties("tmp.dir") and some other options, and they always return $JBOSS_HOME/bin
I would like to choose the path $JBOSS_HOME/$DEPLOY/myEAR.ear/myWAR.war/WhateverLocationHere/
However, I don't know how I can programmatically choose the path without giving an absolute path, or setting $JBOSS_HOME and $DEPLOY.
Anybody know how I can do this?
The second question:
I want to easily preview these generated files. Either through JavaScript, or whatever is the easiest way. However, JavaScript cannot access the filesystem on the server, so I cannot open the file through JavaScript.
Any easy solutions out there?
Not sure how you are generating your PDFs, but if possible, skip the disk IO altogether: stash the PDF content in a byte[] and flush it out to the user from a servlet that sets the MIME type to application/pdf*. The servlet responds to a URL that is specified by a link in your client or set dynamically in a <div> by JavaScript. You're probably taking the memory hit anyway, and in addition to skipping the IO, you don't have to worry about deleting the tmp files when you're done with the preview.
*I think this is right. Need to look it up.
Not sure I have a complete grasp of what you are trying to achieve, but I'll give it a try anyway:
My assumption is that your final goal is to make some files (PDF, HTML) available to end users via a web application.
In that case, why not have Apache serve those files to the end users? Then your JBoss application only needs to know the path of a directory that is mapped to an Apache virtual host.
So basically, create a file and save it as /var/www/html/myappfiles/tempfile.pdf (the folder your application knows), and then provide http://mydomain.com/myappfiles (an Apache virtual host) to your users. The rest will be done by the web server.
You will have to set an environment variable or system property to let your application know where your folder resides (/var/www/html/myappfiles/ in this example).
Hopefully I was not way off :)
I agree with Peter (yo Pete!). Put the directory outside of your WAR and set up an environment variable pointing to it. Have a read of this post by Jacob Orshalick about how to configure environment variables in Seam:
As for previewing PDFs, have a look at how Google Docs handles previewing PDFs - it displays them as an image. To do this with Java check out the Sun PDF Renderer.
I'm not sure if this works in JBoss, given that you want a path inside a WAR archive, but you could try using ServletContext.getRealPath(String).
However, I personally would not want generated files to be inside my deployed application; instead I would configure an external data directory somewhere like $JBOSS_HOME/server/default/data/myapp
First, most platforms use java.io.tmpdir to set a temporary directory. Some servlet containers redefine this property to be something underneath their tree. Why do you care where the file gets written?
Second, I agree with Nicholas: After generating the PDF on the server side, you can generate a URL that, when clicked, sends the file to the browser. If you use MIME type application/pdf, the browser should do the right thing with it.
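The idea of returning the generated PDF straight from memory with the application/pdf MIME type can be shown in a few lines. The thread is about Java servlets; the sketch below uses Python's WSGI interface purely as a compact stand-in, and `generate_pdf_bytes` is a hypothetical placeholder for whatever actually renders the document.

```python
def generate_pdf_bytes():
    """Hypothetical placeholder for whatever produces the PDF in memory."""
    return b"%PDF-1.4\n% placeholder content, not a valid document\n"


def pdf_app(environ, start_response):
    """WSGI application: answer every request with the in-memory PDF."""
    body = generate_pdf_bytes()
    headers = [
        ("Content-Type", "application/pdf"),
        ("Content-Length", str(len(body))),
        # "inline" asks the browser to display the file rather than save it.
        ("Content-Disposition", 'inline; filename="preview.pdf"'),
    ]
    start_response("200 OK", headers)
    return [body]
```

For a quick local check, the standard library's `wsgiref.simple_server.make_server` can serve `pdf_app`. In a servlet, the equivalent step would be setting the response content type and writing the byte array to the response output stream; no temporary file is involved either way.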