HTTP Get: Only download the header? (HEAD is not supported) - java

In my code I use some Http Get request to download some files as a stream. I use the following code:
public String getClassName(String url) throws ClientProtocolException, IOException {
    HttpResponse response = sendGetRequestJsonText(url);
    Header[] all = response.getAllHeaders();
    for (Header h : all) {
        System.out.println(h.getName() + ": " + h.getValue());
    }
    Header[] headers = response.getHeaders("Content-Disposition");
    InputStreamParser.convertStreamToString(response.getEntity().getContent());
    String result = "";
    for (Header header : headers) {
        result = header.getValue();
    }
    return result.substring(result.indexOf("''") + "''".length(), result.length()).trim();
}
But this downloads the full content of the response. I want to retrieve only the HTTP headers, without the content. A HEAD request does not seem to work: I get status 501 (Not Implemented). How can I do that?

Instead of making a GET request, you might consider just making a HEAD request:
The HEAD method is identical to GET except that the server MUST NOT
return a message-body in the response. The metainformation contained
in the HTTP headers in response to a HEAD request SHOULD be identical
to the information sent in response to a GET request. This method can
be used for obtaining metainformation about the entity implied by the
request without transferring the entity-body itself. This method is
often used for testing hypertext links for validity, accessibility,
and recent modification.

You might be able to use the Range header in your request to specify a range of bytes to include in the response entity. Possibly something like:
Range: bytes=0-0
If it does work, you should receive back a 206 Partial Content with the bytes specified in your Range header present in the response entity. However, I've not tried this, and it's also not guaranteed to work:
A server MAY ignore the Range header.
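As a rough sketch of the Range idea above, the following uses the JDK's HttpURLConnection instead of Apache HttpClient (a substitution made only to keep the example free of dependencies). The class and method names are hypothetical, and whether the server honours Range: bytes=0-0 is an assumption. A server that ignores the header answers 200 with the full body, which the sketch avoids reading by disconnecting once the headers have arrived.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class HeaderOnlyFetch {

    // Sends a GET that asks for a single byte and returns one response header.
    // The body is never read; disconnect() drops the connection instead.
    static String fetchHeader(String url, String headerName) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("Range", "bytes=0-0"); // the server MAY ignore this
        try {
            con.getResponseCode(); // 206 if the range was honoured, typically 200 otherwise
            return con.getHeaderField(headerName);
        } finally {
            con.disconnect();
        }
    }

    // Extracts the name from a value such as
    //   attachment; filename*=UTF-8''report.pdf
    // (the same idea as the substring call in the question).
    static String extractFileName(String contentDisposition) {
        if (contentDisposition == null) {
            return "";
        }
        int marker = contentDisposition.indexOf("''");
        if (marker < 0) {
            return "";
        }
        return contentDisposition.substring(marker + 2).trim();
    }
}
```

A call might look like HeaderOnlyFetch.extractFileName(HeaderOnlyFetch.fetchHeader(url, "Content-Disposition")), where url is whatever address the original GET used.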


400 Bad request on Java Webclient multipart/formdata post request

I'm having problems posting a multipart/form-data request to a REST API. The request returns a 400 Bad Request response.
This is what the request should look like. The link shows a screenshot captured from a successful request made by the web interface.
Successful request
This is the Java code I created.
public void importModel(String projectId, String modelId, MultipartFile file, String fileName) throws IOException {
    MultipartBodyBuilder builder = new MultipartBodyBuilder();
    builder.part("data", file.getBytes(), MediaType.APPLICATION_OCTET_STREAM)
            .header("Content-Disposition", "form-data; name=data; filename=" + fileName);
    MultiValueMap<String, HttpEntity<?>> parts = builder.build();
    WebClient webClient = WebClient.builder()
            .filters(exchangeFilterFunctions -> {
                exchangeFilterFunctions.add(logRequest());
                exchangeFilterFunctions.add(logResponse());
            })
            .build();
    String request = webClient.post()
            .uri(getBaseUriBuilder()
                    .pathSegment(getTeamSlug())
                    .path(API_PATH_PROJECTS)
                    .pathSegment(projectId)
                    .path(API_PATH_MODEL)
                    .pathSegment(modelId)
                    .path("/importasync")
                    .build())
            .contentType(MediaType.MULTIPART_FORM_DATA)
            .contentLength(file.getSize())
            .header(HttpHeaders.AUTHORIZATION, getPrefixedAuthToken())
            .body(BodyInserters.fromMultipartData(parts))
            .exchange()
            .flatMap(FlatService::apply)
            .block();
    return;
}
Any help is much appreciated. Thanks in advance!
Have you tried sending the request with another tool such as Postman?
There you can inspect the properties that are sent with the request.
A 400 error can be caused by any of the following problems with your request:
Wrong URL: as with a 404 error, a Bad Request can result when the address is mistyped or contains special characters that are not allowed.
Faulty cookies: a cookie in your browser that is too old or corrupted can also produce a 400.
Outdated DNS entries: your DNS cache may hold records that point to wrong or outdated IP addresses.
Files that are too big: when you try to upload a very large file, the server can reject the request.
Header lines that are too long: client and server exchange information about the request in headers, and some servers limit the header length.
It also helps if you can find out a more specific sub-status. IIS, for example, reports these:
400.1: Invalid Destination Header
400.2: Invalid Depth Header
400.3: Invalid If Header
400.4: Invalid Overwrite Header
400.5: Invalid Translate Header
400.6: Invalid Request Body
400.7: Invalid Content
400.8: Invalid Timeout
400.9: Invalid Lock Token
If you are not the server admin, you could ask them about the server's specifications, or use a tool like Postman to send requests to the server and find out more specific error codes.
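One thing that may be worth checking in the code from the question, offered as an assumption rather than a confirmed diagnosis: .contentLength(file.getSize()) declares the size of the file alone, while a multipart body also contains boundary lines and part headers, so the declared length would not match what is actually sent. The plain-Java sketch below (hypothetical class name, boundary and file name, no Spring involved) builds a one-part body by hand to make the difference visible.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class MultipartLengthDemo {

    // Builds a minimal multipart/form-data body containing a single file part.
    static byte[] buildBody(String boundary, String fileName, byte[] file) {
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"data\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n\r\n";
        String tail = "\r\n--" + boundary + "--\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] tailBytes = tail.getBytes(StandardCharsets.US_ASCII);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(headBytes, 0, headBytes.length);
        out.write(file, 0, file.length);
        out.write(tailBytes, 0, tailBytes.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] file = "model-data".getBytes(StandardCharsets.US_ASCII);
        byte[] body = buildBody("demoBoundary", "model.bin", file);
        // The body is always longer than the file it wraps.
        System.out.println("file bytes: " + file.length + ", body bytes: " + body.length);
    }
}
```

If that is the cause here, removing the explicit contentLength call and letting the client work out the length itself would be the first thing to try.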

Java writeBytes white space replaced by '+'

When trying to do a POST request using HttpURLConnection, white space in the message is replaced by '+' and '=' is appended to the end of the string.
I am using JDK 1.8.0_91. Here is my code:
public void sendPost(String message) throws Exception {
    String url = "http://localhost:8081/subscribe";
    URL obj = new URL(url);
    HttpURLConnection con = (HttpURLConnection) obj.openConnection();
    // add request header
    con.setRequestMethod("POST");
    /*String urlParameters = "sn=C02G8416DRJM&cn=&locale=&caller=&num=12345";*/
    // Send post request
    con.setDoOutput(true);
    DataOutputStream wr = new DataOutputStream(con.getOutputStream());
    wr.writeBytes(message);
    wr.flush();
    wr.close();
}
sendPost("Say Hi") arrives on the server side as: Say+Hi=
When you don't set the Content-Type request header (using setRequestProperty()), it defaults to application/x-www-form-urlencoded.
When a POST request has content type application/x-www-form-urlencoded, the ServletRequest.getParameter methods can be used to extract the parameters from the request body. This consumes the request body, so any subsequent call to getInputStream() or getReader() will find nothing left to read (the stream is already at its end).
When your Spring handler method asks for the request body with the @RequestBody annotation, and the body has already been consumed, Spring will rebuild the body from the parameters, in order to supply the method parameter (it is basically faking it). When Spring rebuilds the request body from the parsed parameters, it will URL encode them as required by the application/x-www-form-urlencoded content type. This encoding converts spaces to +.
The original request body with Say Hi was parsed as parameter "Say Hi" with value "" (parameter without = has empty value). When that is rebuilt in compliance with application/x-www-form-urlencoded, the parameter is encoded as Say+Hi=, which is what you're seeing.
This only happens if one of the ServletRequest.getParameter methods has been called, so you likely have a Filter that does that. If that filter was removed, the request body wouldn't have been consumed, and the @RequestBody parameter would have received the original request body.
Another way to prevent the problem is to send the request body as a text/plain content type, since that seems to be how you want it anyway, i.e. the request body is not actually x-www-form-urlencoded:
con.setRequestProperty("Content-Type", "text/plain; charset=ISO-8859-1");
Of course, if the request body truly is x-www-form-urlencoded content, like your commented code would suggest, then there shouldn't be any spaces, and there was no problem to begin with.
To recap: Your problem is that you're sending arbitrary text in the POST body, but you didn't set the content type, so it defaulted to application/x-www-form-urlencoded, and your content doesn't conform to the specifications of that content type, i.e. your request is malformed and behaved unpredictably.
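To illustrate the explanation above, here is a minimal sketch. rebuildFormBody is a hypothetical helper that imitates, in very simplified form, what the answer describes the framework doing when it rebuilds a body from one parsed parameter. sendPlainText shows the suggested fix with an explicit Content-Type. The class name and the choice of UTF-8 are assumptions made for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class PlainTextPost {

    // URL-encodes one value the way application/x-www-form-urlencoded requires.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Simplified imitation of rebuilding a form body from one parsed parameter.
    static String rebuildFormBody(String name, String value) {
        return encode(name) + "=" + encode(value);
    }

    // Sends the message as text/plain so it is not treated as form data.
    static int sendPlainText(String url, String message) throws IOException {
        HttpURLConnection con = (HttpURLConnection) URI.create(url).toURL().openConnection();
        con.setRequestMethod("POST");
        con.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
        con.setDoOutput(true);
        try (OutputStream out = con.getOutputStream()) {
            // Explicit charset, unlike DataOutputStream.writeBytes, which
            // discards the high byte of every char.
            out.write(message.getBytes(StandardCharsets.UTF_8));
        }
        return con.getResponseCode();
    }

    public static void main(String[] args) {
        // "Say Hi" parsed as a parameter name with an empty value, then rebuilt.
        System.out.println(rebuildFormBody("Say Hi", "")); // prints Say+Hi=
    }
}
```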

Header values overwritten on redirect in HTTPClient

I'm using httpclient 4.2.5 to make http requests which have to handle redirects as well.
Here is a little example to understand the context:
A sends http request (using httpclient 4.2.5) to B
B sends 302 redirect (containing url to C) back to A
A follows redirect to C
C retrieves the request URL and does some work with it
If C parses the request URL by request.getRequestURL() (HttpServlet API) it contains e.g. host and port of the original request from step 1, which is wrong.
The problem exists in step 2, where httpclient handles the redirect. It just copies all headers from the original request (step 1) to the current request (step 3). I already had a look at the responsible code, via grepcode:
DefaultRequestDirector
HttpUriRequest redirect = redirectStrategy.getRedirect(request, response, context);
HttpRequest orig = request.getOriginal();
redirect.setHeaders(orig.getAllHeaders());
I don't really understand why all headers of the original request are copied to the current request.
For example, a simple test with cURL behaves as expected: C receives the correct host and port.
Implementing my own redirect strategy does not help because the original headers are copied after it.
I had the same problem when trying to download files from Bitbucket's download section using HttpClient. After the first request, Bitbucket sends a redirect to a CDN, which then complains if the Authorization header is set.
I worked around it by changing the DefaultRedirectStrategy.getRedirect() method to return a redirect object that does not allow Authorization headers to be set.
I work with Scala, so here is the code:
val http = new DefaultHttpClient()
http.setRedirectStrategy(new DefaultRedirectStrategy() {
  override def getRedirect(
    request: HttpRequest, response: HttpResponse, context: HttpContext
  ): HttpRequestBase = {
    val uri: URI = getLocationURI(request, response, context)
    val method: String = request.getRequestLine.getMethod
    if (method.equalsIgnoreCase(HttpHead.METHOD_NAME)) {
      new HttpHead(uri) {
        override def setHeaders(headers: Array[Header]): Unit = {
          super.setHeaders(headers.filterNot(_.getName == "Authorization"))
        }
      }
    }
    else {
      new HttpGet(uri) {
        override def setHeaders(headers: Array[Header]): Unit = {
          super.setHeaders(headers.filterNot(_.getName == "Authorization"))
        }
      }
    }
  }
})
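The same filtering idea can be sketched independently of HttpClient, in plain Java. The helper below is hypothetical and not part of any library. It copies the caller's headers for a redirect but drops Authorization when the redirect goes to a different host (a variation on the code above, which drops it unconditionally), and it compares the header name case-insensitively because HTTP header names are not case-sensitive.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

final class RedirectHeaders {

    // Returns the headers to send to the redirect target.
    static Map<String, String> forRedirect(Map<String, String> original, URI from, URI to) {
        boolean sameHost = from.getHost() != null
                && from.getHost().equalsIgnoreCase(to.getHost());
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> header : original.entrySet()) {
            boolean isAuthorization = "Authorization".equalsIgnoreCase(header.getKey());
            if (isAuthorization && !sameHost) {
                continue; // do not leak credentials to another host
            }
            result.put(header.getKey(), header.getValue());
        }
        return result;
    }
}
```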
Please note orig.getAllHeaders() returns an array of headers explicitly added to the message by the caller. The code from DefaultRequestDirector posted above does not copy request headers automatically generated by HttpClient such as Host, Content-Length, Transfer-Encoding and so on.
If you post a wire log of the session exhibiting the problem, I may be able to tell why redirects do not work as expected.

HttpClient: Determine empty entity in response

I'm wondering how to detect an empty HTTP response.
By empty HTTP response I mean a response that only has some headers set and contains an empty HTTP body.
For example: I do an HTTP POST to a web server, but the web server only returns a status code for my HTTP POST and nothing else.
The problem is that I have written a little HTTP framework on top of Apache HttpClient to do automatic JSON parsing etc. So the default use case of this framework is to make a request and parse the response. However, if the response does not contain data, as in the example above, I want to ensure that my framework skips JSON parsing.
So I do something like this:
HttpResponse response = httpClient.execute(uriRequest);
HttpEntity entity = response.getEntity();
if (entity != null) {
    InputStream in = entity.getContent();
    // json parsing
}
However, entity is always != null, and the retrieved input stream is != null as well. Is there a simple way to determine whether the HTTP body is empty?
The only way I see is to check whether the server response contains the Content-Length header field set to 0.
But not every server sets this field.
Any suggestions?
In HttpClient, getEntity() can return null. See the latest samples.
However, there's a difference between an empty entity, and no entity. Sounds like you've got an empty entity. (Sorry to be pedantic -- it's just that HTTP is pedantic. :) With respect to detecting empty entities, have you tried reading from the entity input stream? If the response is an empty entity, you should get an immediate EOF.
Do you need to determine if the entity is empty without reading any bytes from the entity body? Based on the code above, I don't think you do. If that's the case, you can just wrap the entity InputStream with a PushbackInputStream and check:
HttpResponse response = httpClient.execute(uriRequest);
HttpEntity entity = response.getEntity();
if (entity != null) {
    // Declared as PushbackInputStream so that unread() is available
    PushbackInputStream in = new PushbackInputStream(entity.getContent());
    try {
        int firstByte = in.read();
        if (firstByte != -1) {
            in.unread(firstByte);
            // json parsing (read from "in", not from entity.getContent())
        }
        else {
            // empty
        }
    }
    finally {
        // Don't close so we can reuse the connection
        EntityUtils.consumeQuietly(entity);
        // Or, if you're sure you won't re-use the connection
        in.close();
    }
}
It's best not to read the entire response into memory just in case it's large. This solution will test for emptiness using constant memory (4 bytes :).
EDIT: <pedantry> In HTTP, if a request has no Content-Length header, then there should be a Transfer-Encoding: chunked header. If there is no Transfer-Encoding: chunked header either, then you should have no entity as opposed to an empty entity. (A response with neither header can still carry a body, which then ends when the server closes the connection.) </pedantry>
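The peek-and-push-back check can be tried in isolation with in-memory streams. The sketch below uses only the JDK. The class and method names are hypothetical, and in the real code the wrapped stream would come from entity.getContent().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

final class EmptyBodyCheck {

    // Returns true if the stream is already at EOF. Otherwise the byte that
    // was read is pushed back, so a later parser still sees the whole body.
    static boolean isEmpty(PushbackInputStream in) throws IOException {
        int first = in.read();
        if (first == -1) {
            return true;
        }
        in.unread(first);
        return false;
    }

    public static void main(String[] args) throws IOException {
        PushbackInputStream empty =
                new PushbackInputStream(new ByteArrayInputStream(new byte[0]));
        PushbackInputStream json =
                new PushbackInputStream(new ByteArrayInputStream("{}".getBytes(StandardCharsets.UTF_8)));
        System.out.println(isEmpty(empty));     // prints true
        System.out.println(isEmpty(json));      // prints false
        System.out.println((char) json.read()); // prints {
    }
}
```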
I would suggest using the class EntityUtils to get the response as a String. If it returns the empty string, then the response is empty.
HttpEntity entity = client.execute(uriRequest).getEntity();
// EntityUtils.toString() rejects a null entity, so handle "no entity" first
String resp = (entity == null) ? "" : EntityUtils.toString(entity);
if (resp == null || resp.isEmpty()) {
    // no entity or empty entity
} else {
    // got something
    JSON.parse(resp);
}
The assumption here is that, for the sake of code simplicity and maintainability, you don't care to distinguish between an empty entity and no entity, and that if there is a response, you need to read it anyway.

Problem reading request body in servlet

I'm writing an HTTP proxy that is part of a test/verification
system. The proxy filters all requests coming from the client device
and directs them towards various systems under test.
The proxy is implemented as a servlet where each request is forwarded
to the target system; it handles both GET and POST. Sometimes the
response from the target system is altered to fit various test
conditions, but that is not part of the problem.
When forwarding a request, all headers are copied except for those
that are part of the actual HTTP transfer, such as the Content-Length
and Connection headers.
If the request is an HTTP POST, then the entity body of the request is
forwarded as well, and this is where it sometimes doesn't work.
The code reading the entity body from the servlet request is the following:
URL url = new URL(targetURL);
HttpURLConnection conn = (HttpURLConnection)url.openConnection();
String method = request.getMethod();
java.util.Enumeration headers = request.getHeaderNames();
while(headers.hasMoreElements()) {
    String headerName = (String)headers.nextElement();
    String headerValue = request.getHeader(headerName);
    if (...) { // do various adaptive stuff based on header
    }
    conn.setRequestProperty(headerName, headerValue);
}
// here is the part that fails
char postBody[] = new char[1024];
int len;
if(method.equals("POST")) {
    logger.debug("guiProxy, handle post, read request body");
    conn.setDoOutput(true);
    BufferedReader br = request.getReader();
    BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(conn.getOutputStream()));
    do {
        logger.debug("Read request into buffer of size: " + postBody.length);
        len = br.read(postBody, 0, postBody.length);
        logger.debug("guiProxy, send request body, got " + len + " bytes from request");
        if(len != -1) {
            bw.write(postBody, 0, len);
        }
    } while(len != -1);
    bw.close();
}
So what happens is that the first time a POST is received, -1
characters are read from the request reader. A Wireshark trace shows
that the entity body containing the URL-encoded POST parameters is
there, and it is in one TCP segment, so there are no network-related
differences.
The second time, br.read successfully returns the 232 bytes in the
POST request entity body, and every subsequent request works as well.
The only difference between the first and the subsequent POST requests
is that the first one carries no cookies, while the second one carries
a cookie that maps to the JSESSION.
Could it be a side effect of the entity body not being available
because request processing in the servlet container has already read
the POST parameters? But then why does it work on subsequent requests?
I believe the solution is to ignore the entity body on POST requests
containing URL-encoded data, fetch all parameters from the servlet
request using getParameter instead, and reinsert them into the
outgoing request.
That is tricky, though, since the POST request could also contain GET
parameters. Our application does not do that right now, but
implementing it correctly is some work.
So my question is basically: why does the reader from
request.getReader() return -1 when reading even though an entity body
is present in the request? If the entity body is not available for
reading, then getReader should throw an IllegalStateException. I
have also tried with InputStream using getInputStream(), with the same
results.
All of this is tested on apache-tomcat-6.0.18.
So my question is basically: why does the reader from request.getReader() return -1 when reading
It will return -1 when there is no body or when it has already been read. You cannot read it twice. Make sure that nothing earlier in the request/response chain has read it.
even though an entity body is present in the request? If the entity body is not available for reading, then getReader should throw an IllegalStateException.
It will only throw that when you have already called getInputStream() on the request, not when the body is unavailable.
I have also tried with InputStream using getInputStream(), with the same results.
After all, I'd prefer streaming bytes over characters, because you then don't need to take character encoding into account (which you aren't doing so far; this may lead to problems later, once you get all of this to work).
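As a minimal sketch of the byte-streaming suggestion, the loop below copies raw bytes from an input stream to an output stream without decoding them. The class and method names are hypothetical. In the proxy, the source would be request.getInputStream() and the destination conn.getOutputStream(), and the copy only finds data if nothing earlier in the chain has already consumed the body.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class BodyForwarder {

    // Copies all remaining bytes from in to out and returns how many were copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
            total += len;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "a=1&b=two".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(body), target);
        System.out.println(copied + " bytes forwarded");
    }
}
```

A return value of 0 for a POST that should carry a body is a hint that the body was read before the copy started.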
It seems that moving
BufferedReader br = request.getReader();
before all other operations that read the request (like request.getHeader()) works well for me.
