OneDrive API download in parts

OneDrive API download in parts - java

I am currently trying to develop a Java-based app to access OneDrive.
Today I tried to implement the download as described here: https://dev.onedrive.com/items/download.htm
I wanted to use the range parameter to let the user pause large downloads. But no matter how I send the parameter, whether in the HTTP request header or in the URL as a GET parameter, the API always sends me the complete file.
Things I tried so far:
https:/ /api.onedrive.com/v1.0/drive/items/***/content?range=0-8388607
(OAuth via HTTP header)
https:/ /api.onedrive.com/v1.0/drive/items/***/content:
Header: Authorization: ***
range: 0-8388607
https:/ /api.onedrive.com/v1.0/drive/items/***/content:
Header: Authorization: ***
range: bytes=0-8388607
I also tried Content-Range and various upper- and lower-case variations, without success. Any reason why this does not work?
PS:
The links are broken because I am using a new account that only allows 2 links per post. I am aware that there is a space between the two slashes in my post ;)

Requesting a range of the file is supported. You might want to use Fiddler or some other tool to check whether the original headers are still being passed after the 302 redirect is performed. Below are the HTTP requests and responses I get when I provide the Range header, which is passed on after the 302 redirect occurs. You'll notice that an HTTP 206 Partial Content response is returned. Additionally, to resume a download, you can use "Range: bytes=1025-", or whatever byte follows the last one you received. I hope that helps.
GET https://api.onedrive.com/v1.0/drive/items/item-id/content HTTP/1.1
Authorization: Bearer
Range: bytes=0-1024
Host: api.onedrive.com
HTTP/1.1 302 Found
Content-Length: 0
Location: https://kplnyq.dm2302.livefilestore.com/edited_location
Other headers removed
GET https://kplnyq.dm2302.livefilestore.com/edited_location
Range: bytes=0-1024
Host: kplnyq.dm2302.livefilestore.com
HTTP/1.1 206 Partial Content
Cache-Control: public
Content-Length: 1025
Content-Type: audio/mpeg
Content-Location: https://kplnyq.dm2302.livefilestore.com/edited_location
Content-Range: bytes 0-1024/4842585
Expires: Tue, 11 Aug 2015 21:34:52 GMT
Last-Modified: Mon, 12 Dec 2011 21:33:41 GMT
Accept-Ranges: bytes
Server: Microsoft-HTTPAPI/2.0
Other headers removed
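For the Java side, here is a minimal sketch of the same exchange using plain HttpURLConnection. It turns automatic redirects off and re-sends the Range header to the Location target by hand, so you can be sure the header survives the 302. The class and method names (RangeDownload, fetchRange) are made up for illustration, and the assumption that the redirect target needs no Authorization header is taken from the trace above, not from the OneDrive documentation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

final class RangeDownload {

    /** Builds a Range header value; pass lastByte < 0 for an open-ended range ("bytes=1025-"). */
    static String rangeHeader(long firstByte, long lastByte) {
        return lastByte < 0
                ? "bytes=" + firstByte + "-"
                : "bytes=" + firstByte + "-" + lastByte;
    }

    /** Reads the total size from a Content-Range value such as "bytes 0-1024/4842585"; -1 if unknown. */
    static long totalLength(String contentRange) {
        int slash = contentRange == null ? -1 : contentRange.lastIndexOf('/');
        if (slash < 0) {
            return -1;
        }
        String total = contentRange.substring(slash + 1).trim();
        return total.equals("*") ? -1 : Long.parseLong(total);
    }

    /**
     * Requests one byte range and copies it to sink. Returns the final status code:
     * 206 means the range was honoured, 200 means the server sent the whole file.
     */
    static int fetchRange(String itemUrl, String accessToken, long firstByte, long lastByte,
                          OutputStream sink) throws IOException {
        String range = rangeHeader(firstByte, lastByte);
        HttpURLConnection conn = (HttpURLConnection) new URL(itemUrl).openConnection();
        conn.setInstanceFollowRedirects(false);          // handle the 302 ourselves
        conn.setRequestProperty("Authorization", "Bearer " + accessToken);
        conn.setRequestProperty("Range", range);
        int status = conn.getResponseCode();
        if (status == HttpURLConnection.HTTP_MOVED_TEMP) { // 302 Found
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("302 without a Location header");
            }
            conn = (HttpURLConnection) new URL(location).openConnection();
            conn.setRequestProperty("Range", range);      // re-send the range to the redirect target
            status = conn.getResponseCode();
        }
        try (InputStream in = conn.getInputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sink.write(buffer, 0, n);
            }
        }
        return status;
    }
}
```

To resume, keep track of how many bytes you have written and call fetchRange again with firstByte set to that count and lastByte set to -1.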

Related

Testing of Raw HTTP post in java, 400 Bad Request

I have developed a raw HTTP POST in Java. I am trying to post a file to the POST request dump website http://www.posttestserver.com/, but it returns the error
400 Bad Request. Please let me know what needs to be done to avoid this error.
In this code, output is the stream that writes to the server.
filename is the path on the server; here filename is initialized to post.php
output.println("POST"+" "+filename+" HTTP/1.1\r");
//output.println("Content-Length: "+data.length());
output.println("Content-Type: multipart/form-data, boundary=AaB03x\r");
output.println("Content-length: 100\r");
//As http1.1 is by default keep-alive , close the connection explicitly
output.println("Connection: Close");
// blank line
output.println();
output.println("--AaB03x");
output.print(
"--AaB03x Content-Disposition: form-data; name=\"fileID\"; filename=\"temp.txt\" Content-Type: text/plain "
+"/nHello How are you?"
+ "/n--AaB03x--");
output.flush();
Error is
HTTP/1.1 400 Bad Request
Date: Wed, 18 Mar 2015 02:22:00 GMT
Server: Apache
Vary: Accept-Encoding
Content-Length: 226
Connection: close
Content-Type: text/html; charset=iso-8859-1
400 Bad Request
Bad Request
Your browser sent a request that this server could not understand.

This might be a Content-Type issue. The server may be expecting a request with Content-Type text/html, but your request is sent as multipart/form-data.
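Separately from the Content-Type question, a few things in the posted request are worth checking, since any of them can produce a 400: HTTP/1.1 requires a Host header, the boundary parameter is normally separated from the media type with a semicolon rather than a comma, the code writes "/n" where a line break was intended, and Content-length is hard-coded to 100 instead of being computed from the body. Below is a minimal sketch of a request builder that avoids those points. The class name RawMultipartRequest and its parameters are made up for illustration, and I have not run it against posttestserver.com.

```java
import java.nio.charset.StandardCharsets;

final class RawMultipartRequest {

    private static final String CRLF = "\r\n";

    /** Returns the complete request text (headers, blank line, multipart body). */
    static String build(String host, String path, String boundary,
                        String fieldName, String fileName, String fileText) {
        // Each part: boundary line, part headers, blank line, content, CRLF.
        String body = "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: text/plain" + CRLF
                + CRLF
                + fileText + CRLF
                + "--" + boundary + "--" + CRLF;   // closing boundary

        // Content-Length counts bytes of the body, not characters.
        int length = body.getBytes(StandardCharsets.UTF_8).length;

        return "POST " + path + " HTTP/1.1" + CRLF
                + "Host: " + host + CRLF
                + "Content-Type: multipart/form-data; boundary=" + boundary + CRLF
                + "Content-Length: " + length + CRLF
                + "Connection: close" + CRLF
                + CRLF
                + body;
    }
}
```

Write the returned string to the socket with print rather than println, so that no platform-specific line endings are added on top of the explicit CRLF pairs.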

JDOM2 - Follow Redirects (HTTP Error 301)

I'm currently working on a third-party program for a website using its public XML API. I don't want to go into detail about what the program actually does, because there seems to be a problem right at the beginning. The website's API expects a client to follow redirects and to set a proper user agent to identify the application itself, but the JDOM2 library, which I use for this project, doesn't seem to do either of these things. Neither the SAXBuilder (org.jdom2.input) integrated in the package nor the native HttpURLConnection (java.net) class seems to do a proper job.
I'm very confused and don't know where to start at all. Is there any way to make the JDOM2 library follow redirects or am I just missing a simple method call?

JDOM uses the URL given to the SAXBuilder to create a URL Connection, and from that connection, it opens an input stream to read the XML content.
While I understand that the HTTP protocol has a redirect functionality, that is something that is handled by the client.... consider this:
# curl -i 'http://stackoverflow.com/questions/24913206'
HTTP/1.1 301 Moved Permanently
Cache-Control: public, no-cache="Set-Cookie", max-age=60
Content-Type: text/html; charset=utf-8
Expires: Wed, 23 Jul 2014 18:44:06 GMT
Last-Modified: Wed, 23 Jul 2014 18:43:06 GMT
Location: /questions/24913206/jdom2-follow-redirects-http-error-301
Vary: *
X-Frame-Options: SAMEORIGIN
Set-Cookie: prov=xxxx.yyyy.zzzz; domain=.stackoverflow.com; expires=Fri, 01-Jan-2055 00:00:00 GMT; path=/; HttpOnly
Date: Wed, 23 Jul 2014 18:43:05 GMT
Content-Length: 174
<html><head><title>Object moved</title></head><body>
<h2>Object moved to here.</h2>
</body></html>
The data that will be given to JDOM when it builds from the URL http://stackoverflow.com/questions/24913206 will be the redirect / HTTP-301 to http://stackoverflow.com/questions/24913206/jdom2-follow-redirects-http-error-301, and the HTML content that makes that human readable.
Now, the URL handling API for Java just returns the input stream for JDOM. What you are suggesting is that JDOM should interpret that stream, and automatically redirect.
There are a few problems with this.
JDOM does not even know it is an HTTP URL. It is often a File name, or an FTP URL, etc.
What if you did not want to follow the redirect?
etc.
The other issue is that this should be either supported natively by Java, or actively by the application.
What are the real solutions:
Tell all HTTP requests in your application to follow redirects using: HttpURLConnection.setFollowRedirects(true)
Don't give JDOM a raw URL to build from, but process it yourself:
URL httpurl = new URL(.....);
HttpURLConnection conn = (HttpURLConnection) httpurl.openConnection();
conn.setInstanceFollowRedirects(true);
conn.connect();
Document doc = saxBuilder.build(conn.getInputStream());
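If you also need the user agent from the question, or want full control over which redirects are followed, a slightly longer sketch is to follow the Location header yourself and hand the resulting stream to JDOM. The class RedirectingFetcher and its methods are hypothetical helper names, and the sketch assumes an http or https URL.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;

final class RedirectingFetcher {

    /** Resolves a Location header (absolute or relative) against the URL that returned it. */
    static String resolve(String base, String location) {
        return URI.create(base).resolve(location).toString();
    }

    /** Opens the URL with the given User-Agent, following up to maxRedirects redirects by hand. */
    static InputStream open(String url, String userAgent, int maxRedirects) throws IOException {
        String current = url;
        for (int hop = 0; hop <= maxRedirects; hop++) {
            HttpURLConnection conn = (HttpURLConnection) new URL(current).openConnection();
            conn.setInstanceFollowRedirects(false);
            conn.setRequestProperty("User-Agent", userAgent);
            int status = conn.getResponseCode();
            boolean redirect = status == 301 || status == 302 || status == 303
                    || status == 307 || status == 308;
            if (!redirect) {
                return conn.getInputStream();   // throws IOException for 4xx/5xx responses
            }
            String location = conn.getHeaderField("Location");
            conn.disconnect();
            if (location == null) {
                throw new IOException("Redirect without a Location header from " + current);
            }
            current = resolve(current, location);
        }
        throw new IOException("More than " + maxRedirects + " redirects starting at " + url);
    }
}
```

The stream returned by open can then be passed to saxBuilder.build(InputStream) exactly as in the snippet above.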

Tumblr API Photo Post Returns 401 (Not Authorized)

I'm attempting to use the Tumblr API in an Android app to authorize users and make text and photo posts. I'm using the Scribe library. So far, I can successfully obtain an access token and use it to get user info. I can also make text posts without any issues. This tells me that I'm signing requests correctly.
However, I've spent the last week and a half attempting to make photo posts without success. I continuously receive 401 errors (Not Authorized). I've read through many posts on the Tumblr support forum as well as here on Stack Overflow, but was unable to find a solution.
I'm reluctant to include the Jumblr library because I'm trying to keep my app as lean as possible. That said, I reviewed the Jumblr code and decided to mimic how photo posts are sent (https://github.com/tumblr/jumblr/blob/master/src/main/java/com/tumblr/jumblr/request/MultipartConverter.java). I'm still receiving the exact same error.
Below is an example of my multipart POST request and the response I receive. I've replaced the blog name and the OAuth signature, consumer key, and token variables, and have removed the binary image data for brevity's sake. Everything else is untouched. I have a few questions...
Are there any other variables that should be included in the multipart section? A Stack Overflow user stated that placing the "oauth_" signature variables in there fixed his problem. I didn't have success with this, but maybe there was something I was missing.
The Jumblr app doesn't appear to do any encoding of the image data, although the Tumblr documentation states that it should be URL encoded. Right now I'm sending it as the Jumblr app appears to (raw binary). Is this correct?
Does anything else in my request look incorrect?
REQUEST:
NOTE: I learned that the OAuth signature should be generated WITHOUT the multipart form. My code takes that into account when building this request!
POST http://api.tumblr.com/v2/blog/**REMOVED**.tumblr.com/post HTTP/1.1
Content-Type: multipart/form-data, boundary=cbe6b79db1b3cbe6b79e104e
Authorization: OAuth oauth_signature="**REMOVED**", oauth_version="1.0", oauth_nonce="3181201716", oauth_signature_method="HMAC-SHA1", oauth_consumer_key="**REMOVED**", oauth_timestamp="1388791537", oauth_token="**REMOVED**"
Content-Length: 1001
User-Agent: Dalvik/1.6.0 (Linux; U; Android 4.3; SM-N900T Build/JSS15J)
Host: api.tumblr.com
Connection: Keep-Alive
Accept-Encoding: gzip
--cbe6b79db1b3cbe6b79e104e
Content-Disposition: form-data; name="type"
photo
--cbe6b79db1b3cbe6b79e104e
Content-Disposition: form-data; name="caption"
Another pic test...
--cbe6b79db1b3cbe6b79e104e
Content-Disposition: form-data; name="data[0]"; filename="postr_media_file_1388791537-1709648435.jpg"
Content-Type: image/jpeg
---- BINARY DATA REMOVED FOR BREVITY ----
RESPONSE:
HTTP/1.1 401 Not Authorized
Server: nginx
Date: Fri, 03 Jan 2014 23:25:39 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: close
Set-Cookie: tmgioct=52c746f34266840643527780; expires=Mon, 01-Jan-2024 23:25:39 GMT; path=/; httponly
P3P: CP="ALL ADM DEV PSAi COM OUR OTRo STP IND ONL"
3c
{"meta":{"status":401,"msg":"Not Authorized"},"response":[]}

I posted the answer in the "Tumblr API Discussion" Google Group. This is what I did:
The key to doing it correctly is NOT just signing without the multipart form!!! Here are the steps...
Add all fields EXCEPT the data field as regular URL-encoded POST body variables
Sign the request
Remove ALL of the POST variables you just added from the request
Add the multipart form, including the data field this time
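To make the "sign the request" step concrete, here is a minimal sketch of generic OAuth 1.0 HMAC-SHA1 signing over exactly those fields (the oauth_ parameters plus type, caption and so on, but not the data field). This is not Scribe's API and not necessarily what Scribe does internally; OAuth1Signer and its methods are names I made up, and sorting the parameters with a plain TreeMap is a simplification that is fine for simple ASCII keys like these.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class OAuth1Signer {

    /** Percent-encodes in the style OAuth 1.0 expects (space as %20, '*' encoded, '~' left alone). */
    static String percentEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8")
                    .replace("+", "%20")
                    .replace("*", "%2A")
                    .replace("%7E", "~");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Builds METHOD&url&params from the oauth_ parameters and the non-file form fields. */
    static String signatureBaseString(String method, String url, Map<String, String> params) {
        TreeMap<String, String> sorted = new TreeMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            sorted.put(percentEncode(e.getKey()), percentEncode(e.getValue()));
        }
        StringBuilder joined = new StringBuilder();
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            if (joined.length() > 0) {
                joined.append('&');
            }
            joined.append(e.getKey()).append('=').append(e.getValue());
        }
        return method.toUpperCase(Locale.ROOT) + "&" + percentEncode(url)
                + "&" + percentEncode(joined.toString());
    }

    /** Signs the base string with HMAC-SHA1 and returns the Base64 value for oauth_signature. */
    static String sign(String baseString, String consumerSecret, String tokenSecret) {
        try {
            String key = percentEncode(consumerSecret) + "&" + percentEncode(tokenSecret);
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(baseString.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The point of the steps above is that the map passed to signatureBaseString contains the text fields but never the binary data part; the signature that comes back goes into the Authorization header, and the body is then sent as multipart.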
Some things to consider...
The Content-Type in the header should be "multipart/form-data"
The Content-Disposition of all form parts should be "form-data" and, of course, include a valid "name" attribute (ie. type, caption, etc...)
The Content-Disposition of the data part should also include a "filename" attribute
The only form part that should contain a Content-Type is data, and it should be set to the mime type of the file you are uploading (ie. "image/jpeg")
I used "data[0]" as the name of the data field. I haven't tested this with just "data", but according to everything I've read it should work that way as well. If you are creating a photo set, I believe you simply add additional parts (i.e. data[1], data[2], etc...). Again, I haven't tested anything except "data[0]", so do your due diligence!!!
I did NOT encode the binary image data!!! I saw people spending considerable amount of time on this in other posts when adding the image as a POST body variable. If doing this as a multipart form, you can skip the encoding and send raw binary data! ;-)
I hope this helps someone! I've spent two weeks banging my head on random solid objects trying to figure this out. The implementation is very easy to do, but there is zero documentation available on how exactly to build POST requests for photos properly. The official docs really should include that. If I had known what I just posted above, I could have completed this in minutes instead of weeks!!!
The last request I posted earlier is still valid, but here it is again. Just remember what I mentioned about the signature!!!
REQUEST:
POST http://api.tumblr.com/v2/blog/REMOVED.tumblr.com/post HTTP/1.1
Content-Type: multipart/form-data, boundary=c60f7c041c02c60f7c046e9b
Authorization: OAuth oauth_signature="***REMOVED***", oauth_version="1.0", oauth_nonce="315351812", oauth_signature_method="HMAC-SHA1", oauth_consumer_key="***REMOVED***", oauth_timestamp="1388785116", oauth_token="***REMOVED***"
Content-Length: 1001
User-Agent: Dalvik/1.6.0 (Linux; U; Android 4.3; SM-N900T Build/JSS15J)
Host: api.tumblr.com
Connection: Keep-Alive
Accept-Encoding: gzip
--c60f7c041c02c60f7c046e9b
Content-Disposition: form-data; name="type"
photo
--c60f7c041c02c60f7c046e9b
Content-Disposition: form-data; name="caption"
Another pic test...
--c60f7c041c02c60f7c046e9b
Content-Disposition: form-data; name="data[0]"; filename="postr_media_file_1388785116-1709648435.jpg"
Content-Type: image/jpeg
***** BINARY DATA REMOVED FOR BREVITY *****
--c60f7c041c02c60f7c046e9b--
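For anyone building the body by hand, here is a minimal sketch of how a multipart body like the one above could be assembled as bytes, with the image written raw and unencoded. MultipartBody and its parameters are illustrative names only; this is not the Scribe or Jumblr API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class MultipartBody {

    private static final String CRLF = "\r\n";

    /** Builds the body: one part per text field, then one file part holding the raw bytes. */
    static byte[] build(String boundary, Map<String, String> textFields,
                        String fileField, String fileName, String mimeType, byte[] fileBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> field : textFields.entrySet()) {
            // Text parts carry only a Content-Disposition header, no Content-Type.
            write(out, "--" + boundary + CRLF
                    + "Content-Disposition: form-data; name=\"" + field.getKey() + "\"" + CRLF
                    + CRLF
                    + field.getValue() + CRLF);
        }
        // The data part adds a filename and the MIME type of the file.
        write(out, "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + fileField
                + "\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: " + mimeType + CRLF
                + CRLF);
        out.write(fileBytes, 0, fileBytes.length);   // raw binary, no URL or Base64 encoding
        write(out, CRLF + "--" + boundary + "--" + CRLF);
        return out.toByteArray();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

The length of the returned array is the value to send as Content-Length.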

URLConnection does not handle content length via proxy correctly

I faced the following problem: When URLConnection is used via proxy the content length is always set to -1.
First I checked that proxy really returns the Content-Length (lynx and wget are also working via proxy; there is no other way to go to internet from local network):
$ lynx -source -head ftp://ftp.wipo.int/pub/published_pct_sequences/publication/2003/1218/WO03_104476/WO2003-104476-001.zip
HTTP/1.1 200 OK
Last-Modified: Mon, 09 Jul 2007 17:02:37 GMT
Content-Type: application/x-zip-compressed
Content-Length: 30745
Connection: close
Date: Thu, 02 Feb 2012 17:18:52 GMT
$ wget -S -X HEAD ftp://ftp.wipo.int/pub/published_pct_sequences/publication/2003/1218/WO03_104476/WO2003-104476-001.zip
--2012-04-03 19:36:54-- ftp://ftp.wipo.int/pub/published_pct_sequences/publication/2003/1218/WO03_104476/WO2003-104476-001.zip
Resolving proxy... 10.10.0.12
Connecting to proxy|10.10.0.12|:8080... connected.
Proxy request sent, awaiting response...
HTTP/1.1 200 OK
Last-Modified: Mon, 09 Jul 2007 17:02:37 GMT
Content-Type: application/x-zip-compressed
Content-Length: 30745
Connection: close
Age: 0
Date: Tue, 03 Apr 2012 17:36:54 GMT
Length: 30745 (30K) [application/x-zip-compressed]
Saving to: `WO2003-104476-001.zip'
In Java I wrote:
URL url = new URL("ftp://ftp.wipo.int/pub/published_pct_sequences/publication/2003/1218/WO03_104476/WO2003-104476-001.zip");
int length = url.openConnection().getContentLength();
logger.debug("Got length: " + length);
and I get -1. I started to debug FtpURLConnection and it turned out that the necessary information is in underlying HttpURLConnection.responses field however it is never properly populated from there:
(there is Content-Length: 30745 in headers). The content length is not updated when you start reading the stream or even after the stream was read. Code:
URL url = new URL("ftp://ftp.wipo.int/pub/published_pct_sequences/publication/2003/1218/WO03_104476/WO2003-104476-001.zip");
URLConnection connection = url.openConnection();
logger.debug("Got length (1): " + connection.getContentLength());
InputStream input = connection.getInputStream();
byte[] buffer = new byte[4096];
int count = 0, len;
while ((len = input.read(buffer)) > 0) {
count += len;
}
logger.debug("Got length (2): " + connection.getContentLength() + " but wanted " + count);
Output:
Got length (1): -1
Got length (2): -1 but wanted 30745
It seems like it is a bug in JDK6, so I have opened new bug#7168608.
If somebody can help me write code that returns the correct content length for a direct FTP connection, an FTP connection via proxy, and local file:/ URLs, I would appreciate it.
If the problem cannot be worked around with JDK6, please suggest another library that definitely works for all the cases I've mentioned (Apache HttpClient?).

Remember that proxies will often change the representation of the underlying entity. In your case I suspect the proxy is altering the transfer encoding, which in turn makes the Content-Length meaningless even if it is supplied.
You are falling afoul of the following two sections of the HTTP 1.1 spec:
4.4 Message Length
...
...
If a Content-Length header field (section 14.13) is present, its decimal value in OCTETs represents both the entity-length and the transfer-length. The Content-Length header field MUST NOT be sent if these two lengths are different (i.e., if a Transfer-Encoding header field is present). If a message is received with both a Transfer-Encoding header field and a Content-Length header field, the latter MUST be ignored.
14.41 Transfer-Encoding
The Transfer-Encoding general-header field indicates what (if any) type of transformation has been applied to the message body in order to safely transfer it between the sender and the recipient. This differs from the content-coding in that the transfer-coding is a property of the message, not of the entity.
Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding
Transfer-codings are defined in section 3.6. An example is:
Transfer-Encoding: chunked
If multiple encodings have been applied to an entity, the transfer- codings MUST be listed in the order in which they were applied. Additional information about the encoding parameters MAY be provided by other entity-header fields not defined by this specification.
Many older HTTP/1.0 applications do not understand the Transfer- Encoding header.
So the URLConnection is then ignoring the Content-Length header, as per the spec, because it is meaningless in the presence of chunked transfers.
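As a rough illustration of the rule quoted from section 4.4, here is a sketch of how a client could decide whether a Content-Length value is usable. It is not the JDK's actual implementation; MessageLength and declaredBodyLength are names invented for this example.

```java
import java.util.Map;
import java.util.TreeMap;

final class MessageLength {

    /**
     * Returns the body length declared by the headers, or -1 when it cannot be trusted:
     * a Transfer-Encoding other than "identity" means Content-Length must be ignored.
     */
    static long declaredBodyLength(Map<String, String> headers) {
        // Header names are case-insensitive, so look them up that way.
        Map<String, String> byName = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        byName.putAll(headers);

        String transferEncoding = byName.get("Transfer-Encoding");
        if (transferEncoding != null && !transferEncoding.trim().equalsIgnoreCase("identity")) {
            return -1;
        }
        String contentLength = byName.get("Content-Length");
        if (contentLength == null) {
            return -1;
        }
        try {
            return Long.parseLong(contentLength.trim());
        } catch (NumberFormatException e) {
            return -1;   // malformed value, treat as unknown
        }
    }
}
```

With the headers from the wget output in the question this would return 30745, and -1 as soon as a "Transfer-Encoding: chunked" header is added.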
In your debugger screenshot it's not clear whether the Transfer-Encoding header is present. Please let us know...
On further investigation - it seems that lynx does not show all the headers returned when you issue a lynx -head. It is not showing the Transfer-Encoding header critical to this discussion.
Here's the proof of the discrepancy with a publicly visible website:
Ξ▶ lynx -useragent='dummy' -source -head http://www.bbc.co.uk
HTTP/1.1 302 Found
Server: Apache
X-Cache-Action: PASS (non-cacheable)
X-Cache-Age: 0
Content-Type: text/html; charset=iso-8859-1
Date: Tue, 03 Apr 2012 13:33:06 GMT
Location: http://www.bbc.co.uk/mobile/
Connection: close
Ξ▶ wget -useragent='dummy' -S -X HEAD http://www.bbc.co.uk
--2012-04-03 14:33:22-- http://www.bbc.co.uk/
Resolving www.bbc.co.uk... 212.58.244.70
Connecting to www.bbc.co.uk|212.58.244.70|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: Apache
Cache-Control: private, max-age=15
Etag: "7e0f292b2e5e4c33cac1bc033779813b"
Content-Type: text/html
Transfer-Encoding: chunked
Date: Tue, 03 Apr 2012 13:33:22 GMT
Connection: keep-alive
X-Cache-Action: MISS
X-Cache-Age: 0
X-LB-NoCache: true
Vary: Cookie
Since I am obviously not inside your network I can't replicate your exact circumstances, but please validate that you really aren't getting a Transfer-Encoding header when passing through a proxy.

I think it's a "bug" in the JDK related to handling FTP connections which are proxied. The FtpURLConnection delegates to an HttpURLConnection when a proxy is in use. However, the FtpURLConnection doesn't seem to delegate any of the header management to this HttpURLConnection in this situation. Thus, you can correctly get the streams, but I don't think you can access any "header" values like content length or content type. (This is based on a quick glance over the OpenJDK source for 1.6; I could have missed something.)

One thing I would check is whether the length is available once the response has actually been read (writing off the top of my head, so expect mistakes):
URLConnection connection = url.openConnection();
InputStream input = connection.getInputStream();
byte[] buffer = new byte[4096];
// Drain the stream; read() returns -1 at end of stream.
while (input.read(buffer) != -1)
;
logger.debug("Got length: " + connection.getContentLength());
If the size you are getting is good, then look for a way to make URLConnection read the headers but not the data, to avoid reading the whole response.

http: conditional get does not give a chance to refresh headers without sending body again

I don't know if this is a bug or a feature in the http spec, or I am not understanding things ok.
I have a resource that changes at most once a week, at the week's beginning. If it didn't change, then the previous week's resource continues to be valid for the whole week.
(For all our tests we have shortened the one-week period to five minutes, but I think our observations are still valid.)
First we send the resource with the header Expires: next Monday. The whole week the browser retrieves from the cache. If on Monday we have a new resource then it is retrieved with its new headers and everything is ok.
The problem occurs when the resource is not renewed. In response to the conditional GET our app (Java+Tomcat) sends new headers with Expires: next Monday but without the body. But our frontend server (Apache) removes this header, because the spec says you should not send new headers if the resource did not change. So from then on (until the resource changes) the browser will send a conditional GET every time, when we would like it to continue serving straight from the cache.
Is there a spec compliant way to update the headers without updating the body? (or sending it again)
And subquestion: how to make apache pass along tomcat's headers?

An Expires header alone is not enough. According to RFC 2616 section 13.3.4, a server should respond with two headers, Last-Modified and ETag, to do conditional GET right:
In other words, the preferred behavior for an HTTP/1.1 origin server is to send both a strong entity tag and a Last-Modified value.
And if the client is HTTP/1.1 compliant, it should send If-Modified-Since. Then the server is supposed to respond as follows (quoted from Roy Fielding's proposal to add conditional GET):
If resource is inaccessible (for whatever reason), then the server should return a 4XX message just like it does now.
If resource no longer exists, the server should return a 404 Not Found response (i.e. same as now).
If resource is accessible but its last modification date is earlier (less than) or equal to the date passed, the server should return a 304 Not Modified message (with no body).
If resource is accessible and its last modification date is later than the date passed, the server should return a 200 OK message (i.e. same as now) with body.
So, I guess you don't need to configure Apache and/or Tomcat the way you described. You need to make your application HTTP/1.1 compliant.
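The date comparison in the rules above could be sketched like this on the application side. This is only an illustration of the 304/200 decision, not Tomcat's or Apache's implementation; ConditionalGet and statusFor are made-up names, and treating a missing or unparsable If-Modified-Since as "send the full response" is my own assumption.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;

final class ConditionalGet {

    /**
     * Returns 304 when the resource was not modified after the date the client sent,
     * otherwise 200.
     */
    static int statusFor(Instant lastModified, String ifModifiedSince) {
        if (ifModifiedSince == null) {
            return 200;   // unconditional request
        }
        try {
            Instant since = ZonedDateTime
                    .parse(ifModifiedSince.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            // HTTP dates only have one-second resolution, so compare at that precision.
            Instant modified = lastModified.truncatedTo(ChronoUnit.SECONDS);
            return modified.isAfter(since) ? 200 : 304;
        } catch (DateTimeParseException e) {
            return 200;   // unreadable date: fall back to the full response
        }
    }
}
```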

Try sending a valid HTTP-Date for the Expires header?

One way to solve the problem is to use a separate URI for each week. The canonical URL redirects to the appropriate URL for the week, and instructs the browser to cache the redirect for a week. The dated URLs themselves never change, so they can be served with an expiry that tells the browser to cache them forever.
Canonical URL : /path/to/resource
Status Code : 301
Location : /path/to/resource/12-dec or /path/to/resource/19-dec
Expires : Next Monday
Week 1 : /path/to/resource/12-dec
Status code : 200
Expires : Never
Week 2 : /path/to/resource/19-dec
Status code : 200
Expires : Never
When the cache expires on Monday, you just send a redirect response. You send either last week's URL or this week's, but you never send the entire response body.
With this approach, you have eliminated conditional gets. You have also made your resources "unmodifiable-once-published", and you also get versioned resources.
The only caveat - redirects aren't cached by all browsers even though the http spec requires them to do so. Notably IE8 and below don't cache. For details, look at the column "cache browser redirects" in browserscope.
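A small sketch of how the dated URL and the redirect's expiry could be computed, assuming weeks start on Monday and the suffix format shown above (12-dec, 19-dec). WeeklyResource and its methods are hypothetical names. Note that this always points at the current week's Monday; if the resource was not renewed you would redirect to the previous dated URL instead.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;

final class WeeklyResource {

    private static final DateTimeFormatter SUFFIX =
            DateTimeFormatter.ofPattern("d-MMM", Locale.ENGLISH);

    /** Path of the dated resource for the week containing today, e.g. /path/to/resource/12-dec. */
    static String weeklyPath(String canonicalPath, LocalDate today) {
        LocalDate monday = today.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        return canonicalPath + "/" + SUFFIX.format(monday).toLowerCase(Locale.ENGLISH);
    }

    /** Date on which the cached redirect should expire: the Monday after today. */
    static LocalDate redirectExpiry(LocalDate today) {
        return today.with(TemporalAdjusters.next(DayOfWeek.MONDAY));
    }
}
```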

In HTTP/1.1, Cache-Control: max-age takes precedence over the Expires header; use Cache-Control: max-age instead.
Make sure you are including Last-Modified.
It's optional, but you may also want to specify Cache-Control: must-revalidate, to make sure intermediate proxies don't deliver potentially stale content.
You don't need to set ETag.
Example request:
GET http://localhost/images/logo.png HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://localhost/default.aspx
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Host: localhost
Connection: Keep-Alive
The response includes the requested content:
HTTP/1.1 200 OK
Cache-Control: max-age=10
Content-Type: image/png
Last-Modified: Sat, 21 Feb 2009 11:28:18 GMT
Accept-Ranges: bytes
Date: Sun, 18 Dec 2011 05:48:34 GMT
Content-Length: 2245
Requests made before the 10 second timeout are resolved from cache, with no HTTP request. After the timeout:
GET http://localhost/images/logo.png HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://localhost/default.aspx
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
If-Modified-Since: Sat, 21 Feb 2009 11:28:18 GMT
Host: localhost
The response is just headers, without content:
HTTP/1.1 304 Not Modified
Cache-Control: max-age=10
Last-Modified: Sat, 21 Feb 2009 11:28:18 GMT
Accept-Ranges: bytes
Date: Sun, 18 Dec 2011 05:49:04 GMT
Subsequent requests are again resolved from the browser's cache until the specified cache expiration time.
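Applied to the weekly resource in the question, the same idea would mean sending a max-age that runs until the next Monday, on both the 200 and the 304 response. A minimal sketch of that calculation follows. CacheLifetime is a made-up helper name, and "the week begins at midnight on Monday in the server's time zone" is an assumption about the asker's setup.

```java
import java.time.DayOfWeek;
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.temporal.TemporalAdjusters;

final class CacheLifetime {

    /** Seconds from now until 00:00 on the next Monday, in the zone of now. */
    static long secondsUntilNextMonday(ZonedDateTime now) {
        ZonedDateTime nextMonday = now.toLocalDate()
                .with(TemporalAdjusters.next(DayOfWeek.MONDAY))
                .atStartOfDay(now.getZone());
        return Duration.between(now, nextMonday).getSeconds();
    }

    /** Value for the Cache-Control response header. */
    static String cacheControl(ZonedDateTime now) {
        return "max-age=" + secondsUntilNextMonday(now);
    }
}
```

Because max-age is relative to the time of the response, a 304 that carries a fresh value lets the browser go back to serving from its cache until the following Monday.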
