Is a URI containing a comma valid in a HTTP Link header? - java

Is the following HTTP Link header, containing a comma, valid?
Link: <http://www.example.com/foo,bar.html>; rel="canonical"
RFC5988 says:
Note that extension relation types are REQUIRED to be absolute URIs in
Link headers, and MUST be quoted if they contain a semicolon (";") or
comma (",") (as these characters are used as delimiters in the header
itself).
This doesn't cover the #link-value, however. That must be a URI-Reference as per RFC 3986, which seems to allow a comma. The Link header itself can also carry multiple values; from RFC 5988, section 5.5:
Link: </TheBook/chapter2>;
rel="previous"; title*=UTF-8'de'letztes%20Kapitel,
</TheBook/chapter4>;
rel="next"; title*=UTF-8'de'n%c3%a4chstes%20Kapitel
I'm parsing this link header in Java using BasicHeaderValueParser from Apache HttpCore 4.4.9 using the following code:
final String linkHeader = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
final HeaderElement[] parsedHeaders = BasicHeaderValueParser.parseElements(linkHeader, null);
for (HeaderElement headerElement : parsedHeaders)
{
    System.out.println(headerElement);
}
which tokenises on the comma and prints the following:
<http://www.example.com/foo
bar.html>; rel=canonical
Is this valid behaviour?

The comma is of course valid.
What you're missing is that the BasicHeaderValueParser is not generic. It only supports certain HTTP header fields, and "Link" isn't one of them (see the syntax description at https://hc.apache.org/httpcomponents-core-ga/httpcore/apidocs/org/apache/http/message/HeaderValueParser.html).

RFC 3986, section 3.3, clearly states that a URI path may contain sub-delimiters, which are defined in section 2.2 and include the comma (",").
RFC 5988 clearly states that it is the relation types that must be quoted if they contain a comma, not the URI.
I think there is very little room for interpretation, and IMHO it's an incomplete implementation on the HttpCore side.
The BasicHeaderValueParser uses ',' as the element delimiter, neglecting the fact that this character is also valid inside the header field's values. That is probably OK for most cases, although not 100% compliant.
You may, however, provide your own custom parser as the second argument (instead of null).
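As a rough illustration of what Link-aware splitting could look like, here is a minimal sketch that does not use HttpCore at all. It splits a Link header value on commas only when they appear outside <...> and outside quoted strings. The class and method names (LinkHeaderSplitter, splitLinkValues) are made up for this example, and the code does not validate the header or parse the individual link-params.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HttpCore: splits a Link header value
// into link-values without breaking on commas inside <...> or "...".
class LinkHeaderSplitter {

    static List<String> splitLinkValues(String header) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inUri = false;     // between '<' and '>'
        boolean inQuotes = false;  // inside a quoted-string

        for (int i = 0; i < header.length(); i++) {
            char c = header.charAt(i);
            if (inQuotes) {
                // Keep backslash-escaped characters together inside quotes.
                if (c == '\\' && i + 1 < header.length()) {
                    current.append(c).append(header.charAt(++i));
                    continue;
                }
                if (c == '"') {
                    inQuotes = false;
                }
            } else if (inUri) {
                if (c == '>') {
                    inUri = false;
                }
            } else if (c == '<') {
                inUri = true;
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                // A top-level comma ends the current link-value.
                addIfNotBlank(values, current);
                current.setLength(0);
                continue;
            }
            current.append(c);
        }
        addIfNotBlank(values, current);
        return values;
    }

    private static void addIfNotBlank(List<String> values, StringBuilder sb) {
        String s = sb.toString().trim();
        if (!s.isEmpty()) {
            values.add(s);
        }
    }

    public static void main(String[] args) {
        String header = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
        // The comma inside the angle brackets does not split the value.
        System.out.println(splitLinkValues(header));
    }
}
```

With the header from the question this yields a single link-value, while a header carrying two links separated by a top-level comma yields two.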

Related

Handling white spaces in Request Parameter in springboot

In my HQL I'm using
queryListBuilder.append(" and f.nom like '%"+ nomFil +"%' ");
nomFil is a string that may contain white space between words.
When I send
http://localhost:8080/list?nom=First Last
I get an empty result.
PS: the value exists in the target table in my DB.
Is there any way to handle white space in request parameters?
You need to encode and decode the query parameters.
Ref: https://www.baeldung.com/java-url-encoding-decoding
You should encode nomFil if you use it inside a URL, as:
URLEncoder.encode(nomFil, "UTF-8");
See percent-encoding:
Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a Uniform Resource Identifier (URI) under certain circumstances. Although it is known as URL encoding it is, in fact, used more generally within the main Uniform Resource Identifier (URI) set, which includes both Uniform Resource Locator (URL) and Uniform Resource Name (URN). As such, it is also used in the preparation of data of the application/x-www-form-urlencoded media type, as is often used in the submission of HTML form data in HTTP requests.
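As a small sketch of the advice above, the following builds the request URL from the question with the parameter value encoded. The class and method names (QueryParamEncoding, listUrl) are made up for the example. Note that URLEncoder produces application/x-www-form-urlencoded output, so a space becomes "+" rather than "%20".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds the /list URL with an encoded "nom" parameter.
class QueryParamEncoding {

    static String listUrl(String nom) {
        try {
            return "http://localhost:8080/list?nom=" + URLEncoder.encode(nom, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints http://localhost:8080/list?nom=First+Last
        System.out.println(listUrl("First Last"));
    }
}
```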

JAX-RS/Jersey path parameter regex for a simple string

I am trying to match the strings v1 and v2. For that, I am trying the following regex: ^v(1|2) (I also tried it with $, which is probably what I need). When I test it at http://www.regextester.com/, it seems to work fine, but when I use it in a JAX-RS path expression it doesn't work. The expression I use is below:
@Path("/blah/{ver:^v(1|2)}/ep")
Is there anything specific to JAX-RS that I am missing?
Your attempt does not work because of the anchor ^. Quoting from the JAX-RS specification, chapter 3.7.3 (emphasis mine):
The function R(A) converts a URI path template annotation A into a regular expression as follows:
1. URI encode the template, ignoring URI template variable specifications.
2. Escape any regular expression characters in the URI template, again ignoring URI template variable specifications.
3. Replace each URI template variable with a capturing group containing the specified regular expression or ‘([^/]+?)’ if no regular expression is specified.
4. If the resulting string ends with ‘/’ then remove the final character.
5. Append ‘(/.*)?’ to the result.
Because each URI template variable's regular expression is placed inside a capturing group in the middle of a larger pattern, you can't embed anchors in it.
As such, the following will work and will match v1 or v2:
@Path("/blah/{ver:v[12]}/ep")
Try the following (without anchors):
@Path("/blah/{ver : v(1|2)}/ep")
Also, if the alternatives differ by a single character only, use a character class instead of the | operator:
@Path("/blah/{ver : v[12]}/ep")
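To see why the anchor breaks the match, the conversion can be imitated with plain java.util.regex. This is only a sketch: it hard-codes the /blah/{ver:...}/ep template, wraps the variable's regex in a capturing group and appends (/.*)? as the quoted rules describe. It is not how any particular JAX-RS implementation builds its patterns, and the names PathTemplateRegex and matches are made up.

```java
import java.util.regex.Pattern;

// Hypothetical imitation of the template-to-regex conversion for
// the single template /blah/{ver:REGEX}/ep.
class PathTemplateRegex {

    static Pattern toPattern(String variableRegex) {
        // Variable regex goes into a capturing group; "(/.*)?" is appended.
        return Pattern.compile("/blah/(" + variableRegex + ")/ep(/.*)?");
    }

    static boolean matches(String variableRegex, String path) {
        return toPattern(variableRegex).matcher(path).matches();
    }

    public static void main(String[] args) {
        // '^' can only match at the start of the input, never after "/blah/".
        System.out.println(matches("^v(1|2)", "/blah/v1/ep")); // false
        System.out.println(matches("v(1|2)", "/blah/v1/ep"));  // true
        System.out.println(matches("v[12]", "/blah/v2/ep"));   // true
        System.out.println(matches("v[12]", "/blah/v3/ep"));   // false
    }
}
```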

In SOAP MTOM, what is the syntax to specify "content-id" in Attachment Part section?

On this link, http://axis.apache.org/axis2/java/core/docs/mtom-guide.html#MTOM_Backward_Compatibility_with_SwA, the "content-id" is specified in angle brackets:
--MIMEBoundary4A7AE55984E7438034
content-type: application/octet-stream
content-transfer-encoding: binary
content-id: <1.A91D6D2E3D7AC4D580@apache.org>
In the XOP element in the SOAP part, it is referenced as:
<xop:Include href="cid:1.A91D6D2E3D7AC4D580@apache.org"
xmlns:xop="http://www.w3.org/2004/08/xop/include" />
(No angle brackets here.)
I don't see anywhere that the angle brackets are mandatory.
I am using the SAAJ APIs, and they don't seem to add any brackets to the content ID provided.
Can anyone shed some more light on this?
This is specified in RFC 2392:
A "cid" URL is converted to the corresponding Content-ID message header by removing the "cid:" prefix, converting the % encoded character to their equivalent US-ASCII characters, and enclosing the remaining parts with an angle bracket pair, "<" and ">".
Some SwA/MTOM implementations don't conform to that spec and don't add the brackets. This is generally not a problem because most SwA/MTOM implementations accept such non-conforming messages.
Regarding SAAJ, the Javadoc of the AttachmentPart#setContentId(String) method specifies this:
Sets the MIME header whose name is "Content-Id" with the given value.
This means that you should pass it a value that includes the brackets.
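The conversion described in RFC 2392 can be sketched with java.net.URI, whose getSchemeSpecificPart() returns the part after "cid:" with percent-escapes already decoded. The class and method names (CidToContentId, toContentId) are made up for the example; the returned string is what one would then pass to AttachmentPart#setContentId(String).

```java
import java.net.URI;

// Hypothetical helper: turns a "cid:" URL into a Content-ID header value.
class CidToContentId {

    static String toContentId(String cidUrl) {
        URI uri = URI.create(cidUrl);
        if (!"cid".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("Not a cid URL: " + cidUrl);
        }
        // getSchemeSpecificPart() drops the "cid:" prefix and decodes %XX escapes.
        return "<" + uri.getSchemeSpecificPart() + ">";
    }

    public static void main(String[] args) {
        // prints <1.A91D6D2E3D7AC4D580@apache.org>
        System.out.println(toContentId("cid:1.A91D6D2E3D7AC4D580@apache.org"));
    }
}
```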

What is the scheme-specific part in a URI?

I can't find any explanation as to what exactly the "scheme-specific part" of a URI is.
From Wikipedia:
All URIs and absolute URI references are formed with a scheme name,
followed by a colon character (":"), and the remainder of the URI
called (in the outdated RFCs 1738 and 2396, but not the current STD
66/RFC 3986) the scheme-specific part.
The scheme-specific part is what comes after the first ":".
Example:
http://stackoverflow.com/questions/24077453/
scheme: http
scheme-specific part: //stackoverflow.com/questions/24077453/
Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme. As such, the URI syntax is a federated and extensible naming system wherein each scheme's specification may further restrict the syntax and semantics of identifiers using that scheme.
See section 3.1 of the URI RFC: https://www.rfc-editor.org/rfc/rfc3986#section-3.1
The scheme simply identifies which protocol the URL uses, such as HTTP or HTTPS; the scheme-specific part is everything after it.
So include the scheme in the URL for it to work:
With scheme
http://localhost:8080/api/notes
Without scheme
localhost:8080/api/notes
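In Java, java.net.URI still exposes this split through getScheme() and getSchemeSpecificPart(). The sketch below (SchemeParts and describe are made-up names) prints both for the URLs mentioned in this thread. It also shows what can happen when the scheme is left out: the parser takes the text before the first colon, here "localhost", as the scheme.

```java
import java.net.URI;

// Hypothetical helper: shows how java.net.URI splits a string into
// scheme and scheme-specific part.
class SchemeParts {

    static String describe(String s) {
        URI uri = URI.create(s);
        return uri.getScheme() + " | " + uri.getSchemeSpecificPart();
    }

    public static void main(String[] args) {
        // prints http | //stackoverflow.com/questions/24077453/
        System.out.println(describe("http://stackoverflow.com/questions/24077453/"));
        // prints http | //localhost:8080/api/notes
        System.out.println(describe("http://localhost:8080/api/notes"));
        // Without "http://", "localhost" is parsed as the scheme:
        // prints localhost | 8080/api/notes
        System.out.println(describe("localhost:8080/api/notes"));
    }
}
```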

How do I correctly set the HTTP Content-Disposition for large file names in Java?

I'm working on some requirements that will lead to arbitrary PDF files being downloaded from a J2EE web server. The names may look like this:
Xxxxxxxxxxxxxxxxxx - Yyyyyyyyyy - Aaaaaaaaaaa - Bbbbbbbb ccc Dddddddddddddd - abc1234560 - 2009-03-26 – 235959.pdf
Now I've read a couple of sections in RFC2183:
http://www.ietf.org/rfc/rfc2183.txt
For instance
A short (length <= 78 characters) parameter value containing
only non-`tspecials' characters SHOULD be represented as a single
`token'. A short parameter value containing only ASCII characters,
but including `tspecials' characters, SHOULD be represented as
`quoted-string'. Parameter values longer than 78 characters, or
which contain non-ASCII characters, MUST be encoded as specified in
[RFC 2184].
Etc., etc. Now there are millions of things that can go wrong if I don't read through all of those RFCs... or I choose a library which handles such RFC specifications. Is there any such thing for Java? Or am I paranoid, and it's actually sufficient to just write this header to the output stream:
String quoted = "\"" + filename.replace("\"", "\\\"") + "\"";
addHeader("Content-Disposition", "attachment; filename=" + quoted);
I had a similar problem in the past and found the following solution.
The first URL looks like http://myhost.com/file/1234
where 1234 is the file ID. Let's say that the file name should be my-very-long-file-name.pdf. So, instead of setting an HTTP header, redirect the call to a URL like
http://myhost.com/download/1234/my-very-long-file-name.pdf
The servlet mapped to /download/ will take the ID from the URL and write the file to its output stream. The browser will extract the file name from the URL and offer to download and save the file under that name. I hope this will work for you for long file names as well.
RFC 2183 isn't relevant; RFC 6266 is.
Also, the 78-character limit only applies to email, not HTTP, so you don't have to worry about that.
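RFC 6266 allows a filename* parameter, encoded as described in RFC 5987, next to the plain filename parameter. The example name in the question contains an en dash, which is not ASCII, so a sketch along these lines may be useful. It emits an ASCII fallback in filename (non-ASCII characters replaced by "_", an arbitrary choice for this example) plus a UTF-8 percent-encoded filename*. The class and method names are made up, and this is not a complete implementation of either RFC.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a Content-Disposition value with both
// an ASCII "filename" fallback and a UTF-8 "filename*" parameter.
class ContentDispositionSketch {

    // attr-char from RFC 5987: characters allowed unencoded in filename*.
    private static final String ATTR_CHARS =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!#$&+-.^_`|~";

    static String encodeExtValue(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int unsigned = b & 0xFF;
            if (unsigned < 0x80 && ATTR_CHARS.indexOf((char) unsigned) >= 0) {
                sb.append((char) unsigned);
            } else {
                // Percent-encode every other UTF-8 byte.
                sb.append(String.format("%%%02X", unsigned));
            }
        }
        return sb.toString();
    }

    static String asciiFallback(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);   // escape inside the quoted-string
            } else if (c >= 0x20 && c < 0x7F) {
                sb.append(c);                // printable ASCII is kept
            } else {
                sb.append('_');              // assumption: replace everything else
            }
        }
        return sb.toString();
    }

    static String contentDisposition(String fileName) {
        return "attachment; filename=\"" + asciiFallback(fileName)
                + "\"; filename*=UTF-8''" + encodeExtValue(fileName);
    }

    public static void main(String[] args) {
        // \u2013 is the en dash from the example file name.
        // prints attachment; filename="2009-03-26 _ 235959.pdf"; filename*=UTF-8''2009-03-26%20%E2%80%93%20235959.pdf
        System.out.println(contentDisposition("2009-03-26 \u2013 235959.pdf"));
    }
}
```

The result would then be passed to something like addHeader("Content-Disposition", ...) as in the question.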
