URI.getHost() returns null - java

This prints null:
System.out.println(new URI("http://a.1a/").getHost());
But this prints a.1a:
System.out.println(new URL("http://a.1a/").getHost());
If all URLs are URIs (but not all URIs are URLs) shouldn't a valid URL that has a host component also have the same host component (instead of null) as a URI?

Look at the Javadoc:
https://docs.oracle.com/javase/8/docs/api/java/net/URI.html:
Returns: The host component of this URI, or null if the host is undefined
OK, so why is the host part of your particular URI ("http://a.1a/") undefined? Look at the RFC:
https://www.ietf.org/rfc/rfc2396.txt
Hostnames take the form described in Section 3 of [RFC1034] and
Section 2.1 of [RFC1123]: a sequence of domain labels separated by
".", each domain label starting and ending with an alphanumeric
character and possibly also containing "-" characters. The rightmost
domain label of a fully qualified domain name will never start with a digit... To actually be "Uniform" as a resource
locator, a URL hostname should be a fully qualified domain name.
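A minimal sketch of how this plays out in java.net.URI, assuming (as I read the class documentation) that an authority which cannot be parsed as a server-based host is kept as a registry-based authority: getHost() is then null, while getAuthority() still returns the raw text. The class name HostCheck and the comparison URI http://a.a1/ are made up for illustration.

```java
import java.net.URI;

public class HostCheck {
    // Host component, or null when the authority is not a valid server-based host.
    static String hostOf(String uri) {
        return URI.create(uri).getHost();
    }

    // Raw authority text, available even when the host could not be parsed.
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    public static void main(String[] args) {
        // Rightmost label "1a" starts with a digit, so no host is reported.
        System.out.println(hostOf("http://a.1a/"));
        System.out.println(authorityOf("http://a.1a/"));
        // Hypothetical comparison: rightmost label "a1" starts with a letter.
        System.out.println(hostOf("http://a.a1/"));
    }
}
```

If the host is needed anyway, reading getAuthority() and handling any user-info or port part yourself is one possible fallback.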

Related

Handling white spaces in Request Parameter in springboot

In my HQL I'm using
queryListBuilder.append(" and f.nom like '%"+ nomFil +"%' ");
nomFil is a string that may contain white spaces between words.
When I send
http://localhost:8080/list?nom=First Last
I get an empty result.
PS: the value exists in the target table in my DB.
Is there any way to handle white spaces in request parameters?
You need to encode and decode the query params.
Ref : https://www.baeldung.com/java-url-encoding-decoding
You should encode nomFil if you use it inside a URL, like this:
URLEncoder.encode(nomFil, "UTF-8");
See percent-encoding:
Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a Uniform Resource Identifier (URI) under certain circumstances. Although it is known as URL encoding it is, in fact, used more generally within the main Uniform Resource Identifier (URI) set, which includes both Uniform Resource Locator (URL) and Uniform Resource Name (URN). As such, it is also used in the preparation of data of the application/x-www-form-urlencoded media type, as is often used in the submission of HTML form data in HTTP requests.
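As a rough sketch of the advice above, the client can encode the parameter value before appending it to the URL. The class and method names here (QueryParamEncoding, buildListUrl) are hypothetical; the host, path and parameter name are taken from the question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class QueryParamEncoding {
    // Encodes a single query parameter value as application/x-www-form-urlencoded.
    static String encodeParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Builds the request URL from the question with an encoded "nom" value.
    static String buildListUrl(String nom) {
        return "http://localhost:8080/list?nom=" + encodeParam(nom);
    }

    public static void main(String[] args) {
        // The space in the value is encoded, so the URL contains no raw whitespace.
        System.out.println(buildListUrl("First Last"));
    }
}
```

Only the parameter value is encoded here, not the whole URL, so the "?" and "=" delimiters stay intact.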

Is a URI containing a comma valid in a HTTP Link header?

Is the following HTTP Link header, containing a comma, valid?
Link: <http://www.example.com/foo,bar.html>; rel="canonical"
RFC5988 says:
Note that extension relation types are REQUIRED to be absolute URIs in
Link headers, and MUST be quoted if they contain a semicolon (";") or
comma (",") (as these characters are used as delimiters in the header
itself).
This doesn't cover the #link-value however. That must be a URI-Reference as per RFC 3987 which seems to allow this. The link header itself can also have multiple values, from RFC5988 section 5.5:
Link: </TheBook/chapter2>;
rel="previous"; title*=UTF-8'de'letztes%20Kapitel,
</TheBook/chapter4>;
rel="next"; title*=UTF-8'de'n%c3%a4chstes%20Kapitel
I'm parsing this link header in Java using BasicHeaderValueParser from Apache HttpCore 4.4.9 using the following code:
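import org.apache.http.HeaderElement;
import org.apache.http.message.BasicHeaderValueParser;

final String linkHeader = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
// Passing null as the parser argument selects the default parser.
final HeaderElement[] parsedHeaders = BasicHeaderValueParser.parseElements(linkHeader, null);
for (HeaderElement headerElement : parsedHeaders) {
    System.out.println(headerElement);
}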
which tokenises on the comma and prints the following:
<http://www.example.com/foo
bar.html>; rel=canonical
Is this valid behaviour?
The comma is of course valid.
What you're missing is that the BasicHeaderValueParser is not generic. It only supports certain HTTP header fields, and "Link" isn't one of them (see the syntax description in https://hc.apache.org/httpcomponents-core-ga/httpcore/apidocs/org/apache/http/message/HeaderValueParser.html).
RFC 3986, section 3.3 states that a URI path may contain sub-delimiters, which are defined in section 2.2 and include the comma (",").
RFC 5988 requires quoting for relation types that contain a comma, not for the URI.
I think there is very little room for interpretation and it's IMHO an incomplete implementation on the HttpCore side.
The BasicHeaderValueParser uses the ',' as element delimiter, neglecting the fact that this character is a valid character for the header fields - which is probably ok for most cases, although not 100% compliant.
You may, however, provide your own custom parser as the second argument (instead of null).
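For illustration only, here is a library-independent sketch of the splitting rule a Link-aware parser needs: a comma separates link-values only when it appears outside the angle-bracketed URI and outside quoted strings. It is not an HttpCore HeaderValueParser implementation; the class and method names (LinkHeaderSplitter, splitLinkValues) are made up, and it does not parse the individual parameters.

```java
import java.util.ArrayList;
import java.util.List;

public class LinkHeaderSplitter {
    // Splits a Link header value into link-values on top-level commas only.
    static List<String> splitLinkValues(String header) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inAngle = false;   // inside <...>, where the URI lives
        boolean inQuotes = false;  // inside a quoted parameter value
        for (int i = 0; i < header.length(); i++) {
            char c = header.charAt(i);
            if (inQuotes) {
                if (c == '\\' && i + 1 < header.length()) {
                    // Keep an escaped character together with its backslash.
                    current.append(c).append(header.charAt(++i));
                    continue;
                }
                if (c == '"') {
                    inQuotes = false;
                }
            } else if (inAngle) {
                if (c == '>') {
                    inAngle = false;
                }
            } else if (c == '<') {
                inAngle = true;
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                // Top-level comma: the current link-value is complete.
                addIfNotBlank(values, current);
                current.setLength(0);
                continue;
            }
            current.append(c);
        }
        addIfNotBlank(values, current);
        return values;
    }

    private static void addIfNotBlank(List<String> values, StringBuilder sb) {
        String value = sb.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
    }

    public static void main(String[] args) {
        String header = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
        for (String value : splitLinkValues(header)) {
            System.out.println(value);
        }
    }
}
```

With the header from the question this yields a single link-value, because the only comma sits inside the angle brackets.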

Dispatcher servlet spring and url pattern

I'm new to the Spring framework. Today I ran into the dispatcher servlet configuration in the web.xml file, and I have a question about the URL pattern /. What does the "/" pattern actually apply to when I deploy the web application on a Tomcat server: host:port/ or host:port/myWeb/?
The pattern / will make your servlet the default servlet for the app, meaning it will pick up every pattern that doesn't have another exact match.
URL pattern mapping :
A string beginning with a / character and ending with a /* suffix is used for path mapping.
A string beginning with a *. prefix is used as an extension mapping.
A string containing only the / character indicates the default servlet of the application. In this case the servlet path is the request URI minus the context path and the path info is null.
All other strings are used for exact matches only.
Rules for path mapping :
The container will try to find an exact match of the path of the request to the path of the servlet. A successful match selects the servlet.
The container will recursively try to match the longest path-prefix. This is done by stepping down the path tree a directory at a time, using the / character as a path separator. The longest match determines the servlet selected.
If the last segment in the URL path contains an extension (e.g. .jsp), the servlet container will try to match a servlet that handles requests for the extension. An extension is defined as the part of the last segment after the last . character.
If none of the previous three rules results in a servlet match, the container will attempt to serve content appropriate for the resource requested. If a default servlet is defined for the application, it will be used.
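The four rules above can be sketched as a small matcher. This is a simplified illustration, not how Tomcat or any other container actually implements it; the class name, method name and sample patterns are hypothetical, and the path passed in is assumed to be the request URI with the context path (for example /myWeb) already removed.

```java
import java.util.Arrays;
import java.util.List;

public class ServletMappingSketch {
    // Returns the url-pattern that would be selected for the given path, or null.
    static String match(List<String> patterns, String path) {
        // Rule 1: exact match (patterns that are neither prefix nor extension mappings).
        for (String p : patterns) {
            boolean isPrefix = p.startsWith("/") && p.endsWith("/*");
            boolean isExtension = p.startsWith("*.");
            if (!isPrefix && !isExtension && !p.equals("/") && p.equals(path)) {
                return p;
            }
        }
        // Rule 2: longest path-prefix match, one directory at a time.
        String best = null;
        for (String p : patterns) {
            if (p.startsWith("/") && p.endsWith("/*")) {
                String prefix = p.substring(0, p.length() - 2);
                boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
                if (matches && (best == null || p.length() > best.length())) {
                    best = p;
                }
            }
        }
        if (best != null) {
            return best;
        }
        // Rule 3: extension match on the last path segment.
        int lastSlash = path.lastIndexOf('/');
        int lastDot = path.lastIndexOf('.');
        if (lastDot > lastSlash) {
            String extensionPattern = "*" + path.substring(lastDot);
            if (patterns.contains(extensionPattern)) {
                return extensionPattern;
            }
        }
        // Rule 4: fall back to the default servlet, if one is mapped to "/".
        return patterns.contains("/") ? "/" : null;
    }

    public static void main(String[] args) {
        List<String> patterns = Arrays.asList("/", "*.jsp", "/api/*", "/api/admin/*", "/login");
        for (String path : Arrays.asList("/login", "/api/admin/users", "/page.jsp", "/static/logo.png")) {
            System.out.println(path + " -> " + match(patterns, path));
        }
    }
}
```

In this sketch a servlet mapped to "/" only receives requests that no exact, prefix or extension mapping has claimed, which is why it behaves as the application's default.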

URL encoding + sign

I have an application with + sign in its name (eg. DB+JSP.jws).
I get an error when trying to create a connection, because Java decodes the + in the URL as a space and hence cannot add the connection to DB JSP/../META-INF/connection.xml (File not found exception).
Any way to circumvent this only by using URLEncoder.encode() and URLDecoder.decode() methods?
You need to encode the URL correctly: '+' is a reserved character in a URL and can only be used literally in the correct context; otherwise it must be encoded as %2B.
Your URL string would be encoded as "DB%2BJSP.jws".
So if you defined the following:
String url = URLEncoder.encode("DB+JSP.jws", "UTF-8");
System.out.println(url);
The output would be the same:
DB%2BJSP.jws
You can prepend "http://localhost/" to the encoded URL as you need to.
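A small round-trip sketch of why the file name breaks, using only URLEncoder and URLDecoder: decoding a raw "+" yields a space, while decoding "%2B" yields the original "+". The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class PlusSignEncoding {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String name = "DB+JSP.jws";
        // Decoding the raw name turns '+' into a space, which is the reported problem.
        System.out.println(decode(name));
        // Encoding first and decoding afterwards restores the original name.
        System.out.println(decode(encode(name)));
    }
}
```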

Java : File.toURI().toURL() on Windows file

The system I'm running on is Windows XP, with JRE 1.6.
I do this :
public static void main(String[] args) {
    try {
        System.out.println(new File("C:\\test a.xml").toURI().toURL());
    } catch (Exception e) {
        e.printStackTrace();
    }
}
and I get this : file:/C:/test%20a.xml
How come the given URL doesn't have two slashes before the C: ? I expected file://C:.... Is it normal behaviour?
EDIT :
From Java source code : java.net.URLStreamHandler.toExternalForm(URL)
result.append(":");
if (u.getAuthority() != null && u.getAuthority().length() > 0) {
    result.append("//");
    result.append(u.getAuthority());
}
It seems that the Authority part of a file URL is null or empty, and thus the double slash is skipped. So what is the authority part of a URL and is it really absent from the file protocol?
That's an interesting question.
First things first: I get the same results on JRE6. I even get that when I lop off the toURL() part.
RFC2396 does not actually require two slashes. According to section 3:
The URI syntax is dependent upon the
scheme. In general, absolute URI are
written as follows:
<scheme>:<scheme-specific-part>
Having said that, RFC2396 has been superseded by RFC3986, which states
The generic URI syntax consists of a
hierarchical sequence of components
referred to as the scheme, authority,
path, query, and fragment.
URI       = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part = "//" authority path-abempty
          / path-absolute
          / path-rootless
          / path-empty
The scheme and path components are
required, though the path may be empty
(no characters). When authority is
present, the path must either be empty
or begin with a slash ("/") character.
When authority is not present, the
path cannot begin with two slash
characters ("//"). These restrictions
result in five different ABNF rules
for a path (Section 3.3), only one of
which will match any given URI
reference.
So, there you go. Since file URIs have no authority segment, they're forbidden from starting with //.
However, that RFC didn't come around until 2005, and Java references RFC2396, so I don't know why it's following this convention, as file URLs before the new RFC have always had two slashes.
To answer why you can have both:
file:/path/file
file:///path/file
file://localhost/path/file
RFC3986 (3.2.2. Host) states:
"If the URI scheme defines a default for host, then that default applies when the host subcomponent is undefined or when the registered name is empty (zero length). For example, the "file" URI scheme is defined so that no authority, an empty host, and "localhost" all mean the end-user's machine, whereas the "http" scheme considers a missing authority or empty host invalid."
So the "file" scheme translates file:///path/file to have a context of the end-user's machine even though the authority is an empty host.
As far as using it in a browser is concerned, it doesn't matter. I have typically seen file:///... but one, two or three '/' will all work. This makes me think (without looking at the java documentation) that it would be normal behavior.
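To see how java.net.URI treats the three forms, here is a short sketch that parses each one and compares the authority and path. The class name FileUriForms is made up, and the URIs reuse the file name from the question. It assumes, per my reading of the URI class, that an empty authority (file:///...) is reported as undefined, i.e. null.

```java
import java.net.URI;

public class FileUriForms {
    // Authority component, or null if undefined.
    static String authorityOf(String uri) {
        return URI.create(uri).getAuthority();
    }

    // Decoded path component.
    static String pathOf(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String[] forms = {
            "file:/C:/test%20a.xml",           // no authority at all
            "file:///C:/test%20a.xml",         // empty authority
            "file://localhost/C:/test%20a.xml" // explicit localhost
        };
        for (String form : forms) {
            System.out.println(form + " authority=" + authorityOf(form) + " path=" + pathOf(form));
        }
    }
}
```

All three forms carry the same path; only the authority differs, which matches the RFC 3986 statement that no authority, an empty host, and "localhost" mean the same machine for the file scheme.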
