There is a link to a Google Drive file that is accessible to anyone without authentication. Is there any way to download that file using just the link and some sort of HTTP client? Most examples around rely on the Drive API and file IDs, but I'd like to stick with a more lightweight approach, and at the same time avoid parsing the web page that results from an HTTP GET to the URL.
"If you want to share a direct link, simply change the format of the link from this:
drive.google.com/file/d/FILE_ID/edit?usp=sharing
To this:
drive.google.com/uc?export=download&id=FILE_ID
Note that you'll need to grab the file ID from the original link and append it to the end of the new link.
With the new format, any link you share will automatically download to your recipient's computer."
http://lifehacker.com/share-direct-links-to-files-in-google-drive-and-skip-th-1493813665 //Jan 2014
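A minimal Java sketch of that download, assuming the file is shared publicly and is small enough not to trigger Drive's virus-scan confirmation page (FILE_ID and the output file name are placeholders):

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class DriveDownload {
    public static void main(String[] args) throws Exception {
        String fileId = "FILE_ID"; // the ID taken from the original sharing link
        URL url = new URL("https://drive.google.com/uc?export=download&id=" + fileId);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setInstanceFollowRedirects(true); // Drive redirects to the actual content URL
        try (InputStream in = conn.getInputStream()) {
            // Save the response body to disk; the output file name is just a placeholder
            Files.copy(in, Paths.get("downloaded.bin"), StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }
}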
I am interested in extracting a particular div from the source code of a website. I am able to do this using JSoup, by getting the entire source code using
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

Document doc = Jsoup.connect("http://example.com").get();
Element div = doc.getElementById("importantDiv");
However, the problem is that I need to do this about 20,000 times a day to pick up all the changes that are happening in the div. Fetching and re-creating the whole document every time would use a lot of network bandwidth, which I would like to avoid. Is there a way to extract the required element without re-creating the entire document on the client side?
NOTE: The code snippet is an example, not the actual URL or ID that I need to extract.
I don't believe you can request specific portions of a web page. JSoup is basically a web client library, and a client has no control over what the server sends it. The server dictates what is sent, so you can't really request a segment of a web page without requesting the whole page.
Do you have access to this webpage, or is it an external website?
If you don't have control of the server side, you cannot do it. You will need to download the complete HTML. Note, though, that this is just the HTML, not the rest of the resources such as stylesheets, images, JavaScript files, etc.
To save bandwidth you would need to install some code on the server, so that it serves just the bits of information required.
Take a look at the URLConnection class. You can use it to open a connection to a URL, get the connection's input stream, and read only as many bytes as you need. This works and you won't have to download the entire document, but unfortunately you can't start downloading from an offset; you always have to start from the beginning of the document.
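A rough sketch of that approach, reusing the example URL from the question; it stops reading after the first 8 KB of the response instead of pulling the whole page:

import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public class PartialRead {
    public static void main(String[] args) throws Exception {
        URLConnection conn = new URL("http://example.com").openConnection();
        byte[] buffer = new byte[8192]; // read at most the first 8 KB of the page
        int total = 0;
        try (InputStream in = conn.getInputStream()) {
            int read;
            while (total < buffer.length
                    && (read = in.read(buffer, total, buffer.length - total)) != -1) {
                total += read;
            }
        } // closing the stream early means the rest of the page is never read
        System.out.println(new String(buffer, 0, total, "UTF-8"));
    }
}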
I need a solution for the problem below in Java:
On arrival of a new mail in MS Outlook for a particular email ID, a web service should be executed automatically.
Is it possible? Please help!
You can do that using JavaMail. You will need to find the configuration details, but a standard code snippet for this would be something like below. I copied the snippet from here; the official JavaMail link has a pretty decent set of examples (e.g. how to read attachments).
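A minimal sketch along those lines, assuming the mailbox is reachable over IMAP (the host, account and password below are placeholders): poll the inbox on a schedule and trigger your web service for each message that has not been seen yet.

import java.util.Properties;
import javax.mail.Flags;
import javax.mail.Folder;
import javax.mail.Message;
import javax.mail.Session;
import javax.mail.Store;
import javax.mail.search.FlagTerm;

public class MailPoller {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("mail.store.protocol", "imaps");
        Session session = Session.getInstance(props);
        Store store = session.getStore("imaps");
        // Host, account and password are placeholders for your own configuration
        store.connect("imap.example.com", "user@example.com", "password");
        Folder inbox = store.getFolder("INBOX");
        inbox.open(Folder.READ_ONLY);
        // Find messages that have not been marked as seen yet
        Message[] unread = inbox.search(new FlagTerm(new Flags(Flags.Flag.SEEN), false));
        for (Message msg : unread) {
            System.out.println("New mail: " + msg.getSubject());
            // call your web service here, e.g. via HttpURLConnection
        }
        inbox.close(false);
        store.close();
    }
}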
For storing the email as a file in a folder, you can use Apache Commons FileUtils: write the email to a file and copy it to whatever folder you want.
There is one more interesting resource
I'm not sure if this is possible, but is there a way to grab the Yahoo cookie from Firefox's cookies.sqlite file and then use that information in a Java program? When I logged into Yahoo, I told it to keep me logged in. Since the browser thinks I'm still logged in, that info is stored in a cookie (I assume).
I saw that Yahoo has a developer API and an OAuth library. To use OAuth to log in, I would need to register my program, but I don't want to register unless I have to. I found this post on SO on how to use sqlite.exe to view the file. However, the file looks like gibberish (to a human) and I can't tell which entry is my cookie.
Is there another way to parse this file to get my Yahoo cookie and use it in a Java program? Do I have to register my "secret" program with Yahoo to use OAuth properly to log into Yahoo? Thanks in advance for any help you can give me.
For Firefox, try this. It's based on the session recovery file Firefox stores. This is bash syntax, not Java, but it can probably be adapted pretty easily.
grep -o '{"host":"<HOSTNAME>"[^}]*}' $HOME/.mozilla/firefox/*.default/sessionstore-backups/recovery.js
That should dump out each cookie as a JSON entry that is associated with whatever you put in for <HOSTNAME>. You can adjust beyond that to extract the specific cookie you want.
Note: If you have more than one FF profile, you may need to adjust the *.default portion. The directory name is stored in .mozilla/firefox/profiles.ini, but extracting it from there is really overkill if only a single profile exists.
Cookies are stored in an sqlite file, so this worked for me:
$ sqlite3 ~/.mozilla/firefox/*.default/cookies.sqlite
sqlite> select name,value from moz_cookies where host="bugs.kde.org" and name LIKE "Bugzill%";
My use case is to extract the Bugzilla cookie in order to pass it to a script.
For other use cases, adjust the SQL query accordingly, obviously.
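If you want to do the same from Java directly, here is a small sketch using a JDBC SQLite driver (an assumption; any SQLite JDBC driver on the classpath will do, e.g. xerial's sqlite-jdbc). The profile directory name is a placeholder, and you may need to copy the database first because Firefox can keep it locked while running.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class CookieReader {
    public static void main(String[] args) throws Exception {
        // Adjust the profile directory name; requires an SQLite JDBC driver on the classpath
        String db = System.getProperty("user.home")
                + "/.mozilla/firefox/PROFILE.default/cookies.sqlite";
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:" + db);
             PreparedStatement ps = conn.prepareStatement(
                     "select name, value from moz_cookies where host like ?")) {
            ps.setString(1, "%yahoo.com"); // hostname pattern to match
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("name") + "=" + rs.getString("value"));
                }
            }
        }
    }
}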
I want to upload a file to a general website.
It seems I'll have to use HTTP requests, which I can capture with Fiddler (http://www.fiddler2.com/).
But I'm a complete beginner at handling HTTP or web requests. Could someone point me to a good tutorial on HTTP requests using C++ or Java?
Most websites require a login, for example. Also, does a general website accept such HTTP requests for uploading a file? It seems that for uploading a file to a general website you need a login handler, the correct request to upload the file, and then the upload itself. To get those correct requests, do I have to look at what that specific site asks for?
The goal is to upload a file to a general website of my choice and expand from there to more websites.
Thanks in advance!
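For illustration only, a raw multipart/form-data upload in Java looks roughly like the sketch below. The URL, the form field name "file", the file name and the commented-out login cookie are all placeholders; every site expects its own fields and authentication, which is exactly what a Fiddler capture will show you.

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SimpleUpload {
    public static void main(String[] args) throws Exception {
        Path file = Paths.get("report.pdf");              // file to send (placeholder)
        String boundary = "----JavaUploadBoundary";
        URL url = new URL("https://example.com/upload");  // placeholder upload URL
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "multipart/form-data; boundary=" + boundary);
        // A session cookie obtained after logging in would typically go here:
        // conn.setRequestProperty("Cookie", "SESSIONID=...");
        try (OutputStream out = conn.getOutputStream()) {
            String head = "--" + boundary + "\r\n"
                    + "Content-Disposition: form-data; name=\"file\"; filename=\""
                    + file.getFileName() + "\"\r\n"
                    + "Content-Type: application/octet-stream\r\n\r\n";
            out.write(head.getBytes(StandardCharsets.UTF_8));
            Files.copy(file, out);                        // stream the file body
            out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("Server responded: " + conn.getResponseCode());
    }
}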
I was trying to crawl some website content using a jsoup and Java combination, save the relevant details to my database, and repeat the same activity daily.
But here is the deal: when I open the website in a browser I get the fully rendered HTML (with all the element tags in place). The JavaScript part, when I test it, works just fine (the one I'm supposed to use to extract the correct data).
But when I do a parse/get with jsoup (from a Java class), only the initial page is downloaded for parsing. In other words, some parts of the website are dynamic; I want that data, but since it is rendered after the GET, asynchronously in the browser, I'm unable to capture it with jsoup.
Does anybody know a way around this? Am I using the right toolset? More experienced people, I'd appreciate your advice.
You need to check first whether the website you're crawling requires any of the following to show all of its content:
Authentication with login/password
Some sort of session validation in the HTTP headers
Cookies
Some sort of time delay to load all of the content (sites that lean heavily on JavaScript libraries, CSS and asynchronous data may need this)
A specific User-Agent header
A proxy password if, for example, you're inside a corporate network security configuration
If anything on this list is needed, you can supply that data as parameters in your jsoup.connect(); a sketch follows the docs link below. Please refer to the official documentation:
http://jsoup.org/cookbook/input/load-document-from-url
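For instance, a minimal sketch passing some of those items through jsoup.connect() (the URL, cookie name/value and header values are placeholders):

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class CrawlWithParams {
    public static void main(String[] args) throws Exception {
        Document doc = Jsoup.connect("http://example.com/page")      // placeholder URL
                .userAgent("Mozilla/5.0 (X11; Linux x86_64)")        // a specific User-Agent
                .header("Accept-Language", "en-US")                  // extra HTTP headers
                .cookie("SESSIONID", "value-from-your-login")        // session cookie (placeholder)
                .timeout(30000)                                      // give slow pages more time
                .get();
        System.out.println(doc.title());
    }
}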