Apache Camel sends internal headers to HTTP calls - java

Apache Camel by default translates all the headers present in a message to HTTP headers.
This is very useful, but since many components also use headers for internal state, internal information can leak into external HTTP calls.
For example, if the job was started by a timer or cron trigger, we end up sending the following HTTP headers on the wire:
http-outgoing-1 >> fireTime: Mon Dec 28 10:14:00 CET 2020
http-outgoing-1 >> jobDetail: JobDetail 'Camel_camel-1.sync': jobClass: 'org.apache.camel.component.quartz.CamelJob concurrentExectionDisallowed: false persistJobDataAfterExecution: false isDurable: false requestsRecovers: false
http-outgoing-1 >> jobInstance: org.apache.camel.component.quartz.CamelJob@4ff90d52
http-outgoing-1 >> jobRunTime: -1
http-outgoing-1 >> mergedJobDataMap: org.quartz.JobDataMap@2338bfe0
http-outgoing-1 >> nextFireTime: Mon Dec 28 10:15:00 CET 2020
http-outgoing-1 >> refireCount: 0
http-outgoing-1 >> scheduledFireTime: Mon Dec 28 10:14:00 CET 2020
http-outgoing-1 >> scheduler: org.quartz.impl.StdScheduler@17a5f565
http-outgoing-1 >> trigger: Trigger 'Camel_camel-1.syncJob': triggerClass: 'org.quartz.impl.triggers.CronTriggerImpl calendar: 'null' misfireInstruction: 1 nextFireTime: Mon Dec 28 10:15:00 CET 2020
http-outgoing-1 >> triggerGroup: Camel_camel-1
http-outgoing-1 >> triggerName: syncJob
... (actual needed HTTP headers:)
http-outgoing-1 >> Connection: Keep-Alive
http-outgoing-1 >> User-Agent: Apache-HttpClient/4.5.12 (Java/11.0.6)
I know that I can remove all headers before making the HTTP call.
However, my other routes add internal headers of my own that I still need, so it is not simple to strip only the "junk" headers right before the call while keeping mine (which also end up in the HTTP call, by the way).
I am aware that I can use properties for this.
I am aware that I can disable the automatic mapping of message headers to HTTP headers. In that case I am not sure whether I need to manually add the headers that HTTP needs (like User-Agent).
Another frequent misuse of headers: if you make two HTTP calls and forget to clear the headers in between,
the output headers of the first call become input headers of the second.
Has anyone found a good workaround for this issue?

This is how HTTP-based endpoints (camel-http, camel-http4, camel-jetty, camel-restlet, camel-cxf, camel-cxfrs) process headers by default (you can customize this behavior using HeaderFilterStrategy).
Consumer: creates an In message with CamelHttp* headers which record the status of the incoming message, all of the HTTP headers from the original message, and URL options (Jetty only).
Producer: converts the Exchange to the target message format. CamelHttp* headers control the behaviour of the HTTP producer endpoint, Camel* headers are filtered out because they are intended for internal use, and all other headers are converted to HTTP headers, with the exception of content-length, content-type, cache-control, connection, date, pragma, trailer, transfer-encoding, upgrade, via, warning.
it is not simple to just remove all "junk" headers for me at point before sending the call, but keeping my internal ones (which also end up in the HTTP call btw.).
You could prefix your internal headers with Camel to avoid leaking them as HTTP headers. If you need some of them as HTTP headers, you can manually map them to a different key before the call.
In this case I am not sure if I need to add manually the headers which are needed for HTTP (like user-agent).
You don't need to add them.
Also other frequent misuse of the headers is if you make 2 HTTP calls and forget to clear the headers, the output headers of the first call will become input headers of the second.
In this case, you should at least remove the control headers with .removeHeaders("CamelHttp*").
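The producer-side rule described in this answer can be modelled in a few lines of plain Java. This is only a sketch of the rule as stated above, not Camel's own code: the class name OutgoingHeaderFilter is hypothetical, and matching the "Camel" prefix case-insensitively is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper that models the rule from the answer; not part of Camel.
class OutgoingHeaderFilter {
    // Headers the answer lists as never converted to HTTP headers.
    private static final Set<String> SKIPPED = Set.of(
            "content-length", "content-type", "cache-control", "connection", "date",
            "pragma", "trailer", "transfer-encoding", "upgrade", "via", "warning");

    /** Returns the message headers that would still be sent on the wire. */
    static Map<String, Object> filter(Map<String, Object> messageHeaders) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : messageHeaders.entrySet()) {
            String lower = e.getKey().toLowerCase(Locale.ROOT);
            // Assumption: the "Camel" prefix is matched case-insensitively.
            if (lower.startsWith("camel") || SKIPPED.contains(lower)) {
                continue;
            }
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }
}
```

Under this model a header such as fireTime survives the filter and leaks, while the same value stored under a Camel-prefixed key (for example a made-up CamelMyFireTime) does not.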

Related

Strange "Allow" header in OPTIONS request to CORS-enabled spring boot endpoint

To test this, one can use the sample code from https://spring.io/guides/gs/rest-service-cors/ with no changes.
Here's the output from an OPTIONS request without any CORS headers:
$ curl -X OPTIONS -i http://localhost:8080/greeting
HTTP/1.1 200
Allow: GET,HEAD,OPTIONS
Content-Length: 0
Date: Wed, 24 Jul 2019 16:45:25 GMT
As expected, the Allow header is correct, as the method is annotated with @GetMapping.
But now let's simulate a CORS preflight OPTIONS request (which is not really necessary for a GET, but that's not the point), adding Origin and Access-Control-Request-Method:
$ curl -X OPTIONS -H'Origin: http://localhost:9000' -H'Access-Control-Request-Method: GET' -i http://localhost:8080/greeting
HTTP/1.1 200
Vary: Origin
Vary: Access-Control-Request-Method
Vary: Access-Control-Request-Headers
Access-Control-Allow-Origin: http://localhost:9000
Access-Control-Allow-Methods: GET
Access-Control-Max-Age: 1800
Allow: GET, HEAD, POST, PUT, DELETE, OPTIONS, PATCH
Content-Length: 0
Date: Wed, 24 Jul 2019 16:48:36 GMT
The CORS headers have been correctly included, but note that Allow now lists more methods than actually allowed (and which are indeed not allowed, with or without CORS; a 405 "Method not allowed" error is returned if one tries to POST to that URL).
Even more strange, Access-Control-Allow-Methods correctly lists only GET.
Am I misunderstanding some detail about how CORS should work, or is this a bug in Spring Boot?
Allow
The Allow header lists the set of methods supported by a resource.
Access-Control-Allow-Methods
The Access-Control-Allow-Methods response header specifies the method or methods allowed when accessing the resource in response to a preflight request.
Allow just states which methods the Spring Boot application supports in general, while Access-Control-Allow-Methods tells you which methods you have access to.
As @Thomas stated, Allow is a resource response header.
So if you look closely at the @RequestMapping properties, you will see method : RequestMethod[] https://docs.spring.io/spring/docs/current/javadoc-api/org/springframework/web/bind/annotation/RequestMapping.html#method--
If you go to RequestMethod docs you will find the following :
Java 5 enumeration of HTTP request methods. Intended for use with the
RequestMapping.method() attribute of the RequestMapping annotation.
Note that, by default, DispatcherServlet supports GET, HEAD, POST,
PUT, PATCH and DELETE only. DispatcherServlet will process TRACE and
OPTIONS with the default HttpServlet behavior unless explicitly told
to dispatch those request types as well: Check out the
"dispatchOptionsRequest" and "dispatchTraceRequest" properties,
switching them to "true" if necessary.
So by default @RequestMapping will allow [GET, HEAD, POST, PUT, PATCH, DELETE].
If you want to restrict a resource or handler method to specific HTTP methods, you can use
@RequestMapping(method = {RequestMethod.GET, RequestMethod.POST})

HTTP headers not returned on EC2

I have a Spring Boot application deployed on 2 EC2 instances (staging and production environments). I have an endpoint that is used for downloading a file. It looks like this (the app is written in Kotlin):
@PostMapping("/download")
open fun download(@RequestBody request: DownloadRequest, servletResponse: HttpServletResponse) {
    val file = getByteArray(request.fileId)
    servletResponse.outputStream.write(file)
    servletResponse.contentType = MediaType.APPLICATION_OCTET_STREAM_VALUE
    servletResponse.setHeader(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"${request.fileId}.zip\"")
}
When I execute a download request on the staging machine everything is fine. I get back the file and the response has the headers set. These are the headers that I can see in Postman:
Cache-Control →no-cache, no-store, max-age=0, must-revalidate
Connection →keep-alive
Content-Disposition →attachment; filename="345412.zip"
Content-Length →11756
Content-Type →application/octet-stream
Date →Tue, 04 Apr 2017 09:04:19 GMT
Expires →0
Pragma →no-cache
X-Application-Context →application:8081
X-Content-Type-Options →nosniff
X-Frame-Options →DENY
X-XSS-Protection →1; mode=block
When I do the same request on production, the response body contains the file content, but the 2 headers that I set manually, "Content-Type" and "Content-Disposition", are missing:
Cache-Control →no-cache, no-store, max-age=0, must-revalidate
Connection →keep-alive
Content-Length →56665
Date →Tue, 04 Apr 2017 09:06:45 GMT
Expires →0
Pragma →no-cache
X-Application-Context →application:8081
X-Content-Type-Options →nosniff
X-Frame-Options →DENY
X-XSS-Protection →1; mode=block
Both machines have the exact same JAR deployed in a Docker container. Both calls are done directly against the EC2 instances, using their private IPs, so no ELB is involved. The configuration of the 2 instances is identical, with no differences that I could find in the AWS Console.
Do you know what might cause this? Is there a setting in EC2 that can prevent some HTTP headers from being sent back in responses? I cannot find any reason why the headers are sent back in one case and not in the other.
The issue was fixed by first writing the response headers and then the response body:
@PostMapping("/download")
open fun download(@RequestBody request: DownloadRequest, servletResponse: HttpServletResponse) {
    val file = getByteArray(request.fileId)
    // Set the headers first, then write the body.
    servletResponse.contentType = MediaType.APPLICATION_OCTET_STREAM_VALUE
    servletResponse.setHeader(HttpHeaders.CONTENT_DISPOSITION, "attachment; filename=\"${request.fileId}.zip\"")
    servletResponse.outputStream.write(file)
}
If you start writing the body first, the headers might not be properly set. I am still not sure why this was reproducing only on the production machine.

OneDrive API download in parts

I am currently trying to develop a Java based app to access OneDrive.
Today I tried to implement the download as described here: https://dev.onedrive.com/items/download.htm
I wanted to use the range parameter to offer the user the ability to pause large downloads. But no matter how I send the parameter, be it within the HTTP request header or in the URL as a GET parameter, it always sends me the complete file.
Things I tried so far:
https:/ /api.onedrive.com/v1.0/drive/items/***/content?range=0-8388607
(OAuth via HTTP header)
https:/ /api.onedrive.com/v1.0/drive/items/***/content:
Header: Authorization: ***
range: 0-8388607
https:/ /api.onedrive.com/v1.0/drive/items/***/content:
Header: Authorization: ***
range: bytes=0-8388607
I also tried Content-Range and various combinations of lower and upper case, without success. Any reason why this does not work?
PS:
The links are broken because I am using a new account that only allows 2 links per post. I am aware that there is a space between the two slashes in my post ;)
Requesting a range of the file is supported. You might want to use Fiddler or some other tool to check whether the original headers are still being passed after the 302 redirect is performed. Below are the HTTP requests and responses when I provide the Range header, which is passed on after the 302 redirect occurs. You'll notice that an HTTP 206 Partial Content response is returned. Additionally, to resume a download, you can use "Range: bytes=1025-", or whatever offset follows the last byte received. I hope that helps.
GET https://api.onedrive.com/v1.0/drive/items/item-id/content HTTP/1.1
Authorization: Bearer
Range: bytes=0-1024
Host: api.onedrive.com
HTTP/1.1 302 Found
Content-Length: 0
Location: https://kplnyq.dm2302.livefilestore.com/edited_location
Other headers removed
GET https://kplnyq.dm2302.livefilestore.com/edited_location
Range: bytes=0-1024
Host: kplnyq.dm2302.livefilestore.com
HTTP/1.1 206 Partial Content
Cache-Control: public
Content-Length: 1025
Content-Type: audio/mpeg
Content-Location: https://kplnyq.dm2302.livefilestore.com/edited_location
Content-Range: bytes 0-1024/4842585
Expires: Tue, 11 Aug 2015 21:34:52 GMT
Last-Modified: Mon, 12 Dec 2011 21:33:41 GMT
Accept-Ranges: bytes
Server: Microsoft-HTTPAPI/2.0
Other headers removed
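To make the resume step concrete, here is a minimal Java sketch that derives the next Range value from the Content-Range of the last 206 response. The class and method names are hypothetical, and it assumes the server always reports a numeric total length (a "*" total is not handled).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; not part of any OneDrive SDK.
class RangeResume {
    // Matches values such as "bytes 0-1024/4842585".
    private static final Pattern CONTENT_RANGE =
            Pattern.compile("bytes (\\d+)-(\\d+)/(\\d+)");

    /** Range header value requesting everything from byte offset 'from' onwards. */
    static String resumeFrom(long from) {
        return "bytes=" + from + "-";
    }

    /**
     * Given the Content-Range of the last partial response, returns the Range
     * value for the next request, or null when the whole file has been received.
     */
    static String nextRange(String contentRange) {
        Matcher m = CONTENT_RANGE.matcher(contentRange.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported Content-Range: " + contentRange);
        }
        long lastByte = Long.parseLong(m.group(2));
        long total = Long.parseLong(m.group(3));
        return lastByte + 1 >= total ? null : resumeFrom(lastByte + 1);
    }
}
```

With the response shown above, nextRange("bytes 0-1024/4842585") yields "bytes=1025-", which matches the resume header suggested in the answer.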

Override the HTTP response status text

How can I override the HTTP status text in Tomcat 7?
I'm using HttpServletResponse.sendError(401, "Invalid username or Password"), but when I look at the response status in the client it gives 401 Unauthorized.
Is there any way to override it?
Tomcat no longer supports USE_CUSTOM_STATUS_MSG_IN_HEADER property.
Changelog from 8.5.0:
RFC 7230 states that clients should ignore reason phrases in HTTP/1.1
response messages. Since the reason phrase is optional, Tomcat no
longer sends it. As a result the system property
org.apache.coyote.USE_CUSTOM_STATUS_MSG_IN_HEADER is no longer used
and has been removed. (markt)
RFC 7230, Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing, June 2014. Section 3.1.2:
The reason-phrase element exists for the sole purpose of providing
a textual description associated with the numeric status code,
mostly out of deference to earlier Internet application protocols
that were more frequently used with interactive text clients. A
client SHOULD ignore the reason-phrase content.
Edit catalina.properties and add the property:
org.apache.coyote.USE_CUSTOM_STATUS_MSG_IN_HEADER=true
With that set in my dev environment, then when I do:
response.sendError(HttpServletResponse.SC_BAD_REQUEST,
"A very very very bad request");
I see:
HTTP/1.1 400 A very very very bad request
Server: Apache-Coyote/1.1
Content-Type: text/html;charset=utf-8
Content-Language: en
Content-Length: 1024
Date: Fri, 20 Dec 2013 11:09:54 GMT
Connection: close
Also discussed here and here
No - the response codes are set according to RFC 2616. If you want to communicate a message to the user (to the API client) either write it in the body or in a response header

http: conditional get does not give a chance to refresh headers without sending body again

I don't know if this is a bug or a feature in the HTTP spec, or whether I am not understanding things correctly.
I have a resource that changes at most once a week, at the week's beginning. If it didn't change, then the previous week's resource continues to be valid for the whole week.
(For all our tests we have shortened the one-week period to five minutes, but I think our observations are still valid.)
First we send the resource with the header Expires: next Monday. The whole week the browser retrieves from the cache. If on Monday we have a new resource then it is retrieved with its new headers and everything is ok.
The problem occurs when the resource is not renewed. In response to the conditional GET our app (Java + Tomcat) sends new headers with Expires: next Monday but without the body. But our frontend server (Apache) removes this header, because the spec says you should not send new headers if the resource did not change. So from then on (until the resource changes) the browser sends a conditional GET every time, when we would like it to continue serving straight from the cache.
Is there a spec-compliant way to update the headers without updating the body (or sending it again)?
And a subquestion: how do I make Apache pass along Tomcat's headers?
Just an Expires header is not enough. According to RFC 2616 section 13.3.4, a server needs to respond with two headers, Last-Modified and ETag, to do conditional GET right:
In other words, the preferred behavior for an HTTP/1.1 origin server is to send both a strong entity tag and a Last-Modified value.
And if the client is HTTP/1.1 compliant, it should send If-Modified-Since. Then the server is supposed to respond as follows (quoted from Roy Fielding's proposal to add conditional GET):
If resource is inaccessible (for whatever reason), then the server should return a 4XX message just like it does now.
If resource no longer exists, the server should return a 404 Not Found response (i.e. same as now).
If resource is accessible but its last modification date is earlier (less than) or equal to the date passed, the server should return a 304 Not Modified message (with no body).
If resource is accessible and its last modification date is later than the date passed, the server should return a 200 OK message (i.e. same as now) with body.
So, I guess you don't need to configure Apache and/or Tomcat the way you described. You need to make your application HTTP/1.1 compliant.
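The last two rules in the list above (304 versus 200) boil down to a single date comparison. Here is a minimal Java sketch of that decision for a resource that exists and is accessible. The class name is hypothetical, and treating an unparsable If-Modified-Since value as absent is an assumption.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;

// Hypothetical helper covering only the "resource is accessible" cases.
class ConditionalGet {
    /** Returns 304 when the resource is unchanged since the given date, else 200. */
    static int status(String ifModifiedSince, ZonedDateTime lastModified) {
        if (ifModifiedSince == null) {
            return 200; // unconditional request: send the full body
        }
        ZonedDateTime since;
        try {
            since = ZonedDateTime.parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME);
        } catch (DateTimeParseException e) {
            return 200; // assumption: an invalid date is ignored
        }
        // HTTP dates have one-second resolution, so drop fractions before comparing.
        ZonedDateTime modified = lastModified.truncatedTo(ChronoUnit.SECONDS);
        return modified.isAfter(since) ? 200 : 304;
    }
}
```

A servlet or filter would call something like this with the request's If-Modified-Since header and send the body only when the result is 200.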
Try sending a valid HTTP-Date for the Expires header?
One way to solve the problem is using separate URIs for each week. The canonical url redirects to the appropriate url for the week, and instructs the browser to cache the redirect for a week. Also, URLs that have a date in them will instruct the browser to cache forever.
Canonical URL : /path/to/resource
Status Code : 301
Location : /path/to/resource/12-dec or /path/to/resource/19-dec
Expires : Next Monday
Week 1 : /path/to/resource/12-dec
Status code : 200
Expires : Never
Week 2 : /path/to/resource/19-dec
Status code : 200
Expires : Never
When the cache expires on Monday, you just send a redirect response. You either send last week's URL or this week's, but you never send the entire response body.
With this approach, you have eliminated conditional gets. You have also made your resources "unmodifiable-once-published", and you also get versioned resources.
The only caveat - redirects aren't cached by all browsers even though the http spec requires them to do so. Notably IE8 and below don't cache. For details, look at the column "cache browser redirects" in browserscope.
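As a rough illustration of the per-week URL scheme, the redirect target could be derived from the date on which the current version was published. This is a sketch only: the class name is hypothetical, and it assumes each version is identified by the Monday of its week, formatted like the 12-dec and 19-dec examples above.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;

// Hypothetical helper for building the versioned URL behind the canonical one.
class WeeklyResourceUrl {
    private static final DateTimeFormatter DAY_MONTH =
            DateTimeFormatter.ofPattern("d-MMM", Locale.ENGLISH);

    /** Appends the Monday of the publication week to the canonical path. */
    static String forVersion(String canonicalPath, LocalDate publishedOn) {
        LocalDate monday = publishedOn.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        return canonicalPath + "/" + DAY_MONTH.format(monday).toLowerCase(Locale.ROOT);
    }
}
```

For example, forVersion("/path/to/resource", LocalDate.of(2011, 12, 14)) gives /path/to/resource/12-dec, because 12 December 2011 was the Monday of that week (the year is chosen here only for illustration).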
The Expires header was basically deprecated with HTTP 1.1; use Cache-Control: max-age instead.
Make sure you are including Last-Modified.
It's optional, but you may also want to specify Cache-Control: must-revalidate, to make sure intermediate proxies don't deliver potentially stale content.
You don't need to set ETag.
Example request:
GET http://localhost/images/logo.png HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://localhost/default.aspx
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Host: localhost
Connection: Keep-Alive
The response includes the requested content:
HTTP/1.1 200 OK
Cache-Control: max-age=10
Content-Type: image/png
Last-Modified: Sat, 21 Feb 2009 11:28:18 GMT
Accept-Ranges: bytes
Date: Sun, 18 Dec 2011 05:48:34 GMT
Content-Length: 2245
Requests made before the 10 second timeout are resolved from cache, with no HTTP request. After the timeout:
GET http://localhost/images/logo.png HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://localhost/default.aspx
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
If-Modified-Since: Sat, 21 Feb 2009 11:28:18 GMT
Host: localhost
The response is just headers, without content:
HTTP/1.1 304 Not Modified
Cache-Control: max-age=10
Last-Modified: Sat, 21 Feb 2009 11:28:18 GMT
Accept-Ranges: bytes
Date: Sun, 18 Dec 2011 05:49:04 GMT
Subsequent requests are again resolved from the browser's cache until the specified cache expiration time.
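For the weekly resource in the question, the max-age value would be the number of seconds left until next Monday rather than a fixed 10. A minimal Java sketch follows, assuming the week starts at midnight on Monday in the time zone of the timestamp passed in; the class name is hypothetical.

```java
import java.time.DayOfWeek;
import java.time.Duration;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper that builds a Cache-Control value expiring next Monday.
class WeeklyMaxAge {
    /** Returns e.g. "max-age=65456" for the seconds between 'now' and next Monday 00:00. */
    static String cacheControl(ZonedDateTime now) {
        ZonedDateTime nextMonday = now
                .with(TemporalAdjusters.next(DayOfWeek.MONDAY)) // strictly after today
                .truncatedTo(ChronoUnit.DAYS);                  // midnight of that day
        long seconds = Duration.between(now, nextMonday).getSeconds();
        return "max-age=" + seconds;
    }
}
```

The same value can be sent on both the 200 and the 304 response, as the Cache-Control header is in the example exchange above.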
