What is the scheme-specific part in a URI? - java

I can't find any explanation as to what exactly the "scheme-specific part" of a URI is.

From Wikipedia:
All URIs and absolute URI references are formed with a scheme name,
followed by a colon character (":"), and the remainder of the URI
called (in the outdated RFCs 1738 and 2396, but not the current STD
66/RFC 3986) the scheme-specific part.
The scheme-specific part is everything after the first colon (:).
Example:
http://stackoverflow.com/questions/24077453/
Here the scheme is http and the scheme-specific part is //stackoverflow.com/questions/24077453/.
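The split described above can be checked with java.net.URI. This is a minimal sketch; the class name SchemeSpecificPartDemo and the mailto address are made up for illustration.

```java
import java.net.URI;

final class SchemeSpecificPartDemo {
    // Returns "scheme | scheme-specific part" for the given URI string.
    static String describe(String text) {
        URI uri = URI.create(text);
        return uri.getScheme() + " | " + uri.getSchemeSpecificPart();
    }

    public static void main(String[] args) {
        // Hierarchical URI: the scheme-specific part starts with "//".
        System.out.println(describe("http://stackoverflow.com/questions/24077453/"));
        // Opaque URI: the scheme-specific part is just the address.
        System.out.println(describe("mailto:someone@example.com"));
    }
}
```

Note that getSchemeSpecificPart() returns the decoded form; getRawSchemeSpecificPart() keeps any percent-escapes.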

Each URI begins with a scheme name that refers to a specification for assigning identifiers within that scheme. As such, the URI syntax is a federated and extensible naming system wherein each scheme's specification may further restrict the syntax and semantics of identifiers using that scheme.
See section 3.1 of the URI RFC: https://www.rfc-editor.org/rfc/rfc3986#section-3.1

The scheme simply identifies which protocol the URL uses, such as HTTP or HTTPS; the scheme-specific part is whatever follows it.
So include the scheme in the URL for it to work properly.
With scheme:
http://localhost:8080/api/notes
Without scheme:
localhost:8080/api/notes
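To see why the scheme-less form causes trouble, here is a small sketch of how java.net.URI reads both strings. The class and method names are hypothetical; the point is only that a parser treats the text before the first colon as the scheme.

```java
import java.net.URI;

final class MissingSchemeDemo {
    // Scheme as parsed by java.net.URI, or null if there is none.
    static String schemeOf(String text) {
        return URI.create(text).getScheme();
    }

    // Host as parsed by java.net.URI, or null if no host was recognised.
    static String hostOf(String text) {
        return URI.create(text).getHost();
    }

    public static void main(String[] args) {
        String withScheme = "http://localhost:8080/api/notes";
        String withoutScheme = "localhost:8080/api/notes";
        System.out.println(schemeOf(withScheme) + " / " + hostOf(withScheme));
        // Without "http://", "localhost" is taken as the scheme and no host is found.
        System.out.println(schemeOf(withoutScheme) + " / " + hostOf(withoutScheme));
    }
}
```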

Related

How do I parse an HttpExchange request after an ending slash?

I have a request as follows:
localhost:8000/location/:01
My code takes as input an HttpContext request.
func(HttpExchange r) {
String area_path = r.getRequestURI(); // Equals string "/location/"
}
How do I parse an HttpExchange correctly so I can pull out the "01" from this path and store it as a variable?
That (localhost:8000/location/:01) is not a valid URL or URI
A plain colon character is not legal in the path of a URL or URI. If you want to put a colon in the path, it must be percent-encoded. Furthermore, if this was a URL, it would start with a protocol; e.g. http:.
Now ... it is unclear what the HTTP stack you are using will do with a syntactically incorrect URL / URI, but it could simply be ignoring the colon and the characters after it.
Your code looks a bit odd too. You have tagged the question as [java], but the snippet is not valid Java: func is not a Java keyword, and the declaration reads more like JavaScript-style pseudocode. At the same time, it appears to use the com.sun.net.httpserver.HttpExchange Java class. I don't know what to make of that ...
My advice:
Don't use a colon character in the URL path.
If you must do it, then percent-encode the colon.
If you cannot encode it properly, then you may need to find and use a different framework for your HTTP request handling. One that will accept and handle a malformed URL / URI in the way that you want. (Good luck finding one!)
Unfortunately, the details in your question are too sketchy to give more detailed advice.
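As a rough sketch of the percent-encoding advice, assuming the client sends /location/%3A01 and the handler receives the java.net.URI that HttpExchange.getRequestURI() returns, the trailing value could be pulled out like this. The class and method names are hypothetical.

```java
import java.net.URI;

final class PathSegmentDemo {
    // Returns the text after the last '/' of the decoded path,
    // dropping one leading ':' if present.
    // Assumes a hierarchical request URI, so getPath() is not null.
    static String lastSegment(URI requestUri) {
        String path = requestUri.getPath(); // decoded: "%3A" becomes ":"
        String segment = path.substring(path.lastIndexOf('/') + 1);
        return segment.startsWith(":") ? segment.substring(1) : segment;
    }

    public static void main(String[] args) {
        // Inside a handler this would be exchange.getRequestURI().
        URI requestUri = URI.create("/location/%3A01");
        System.out.println(lastSegment(requestUri));
    }
}
```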

Handling white spaces in Request Parameter in springboot

In my HQL I'm using
queryListBuilder.append(" and f.nom like '%"+ nomFil +"%' ");
nomFil is a string that may contain white spaces between words.
When I send
http://localhost:8080/list?nom=First Last
I get an empty result.
PS: the value exists in the target table in my DB.
Is there any way to handle white spaces in request parameters?
You need to encode and decode the query params.
Ref : https://www.baeldung.com/java-url-encoding-decoding
You should encode nomFil before putting it in a URL, for example:
URLEncoder.encode(nomFil, "UTF-8");
See Percent encoding
Percent-encoding, also known as URL encoding, is a mechanism for encoding information in a Uniform Resource Identifier (URI) under certain circumstances. Although it is known as URL encoding it is, in fact, used more generally within the main Uniform Resource Identifier (URI) set, which includes both Uniform Resource Locator (URL) and Uniform Resource Name (URN). As such, it is also used in the preparation of data of the application/x-www-form-urlencoded media type, as is often used in the submission of HTML form data in HTTP requests.
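A minimal sketch of the encoding step on the client side and the matching decoding step follows. It assumes Java 10 or later for the Charset overloads; the class name is hypothetical, and the base URL is the one from the question.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class QueryParamEncodingDemo {
    // Builds the request URL with the parameter value form-encoded.
    static String buildUrl(String nomFil) {
        return "http://localhost:8080/list?nom="
                + URLEncoder.encode(nomFil, StandardCharsets.UTF_8);
    }

    // Reverses the encoding; accepts both "+" and "%20" for a space.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("First Last"));
        System.out.println(decode("First%20Last"));
    }
}
```

URLEncoder produces application/x-www-form-urlencoded output, so a space becomes "+" rather than "%20"; both are decoded back to a space in a query string.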

Is a URI containing a comma valid in a HTTP Link header?

Is the following HTTP Link header, containing a comma, valid?
Link: <http://www.example.com/foo,bar.html>; rel="canonical"
RFC5988 says:
Note that extension relation types are REQUIRED to be absolute URIs in
Link headers, and MUST be quoted if they contain a semicolon (";") or
comma (",") (as these characters are used as delimiters in the header
itself).
This doesn't cover the #link-value however. That must be a URI-Reference as per RFC 3987 which seems to allow this. The link header itself can also have multiple values, from RFC5988 section 5.5:
Link: </TheBook/chapter2>;
rel="previous"; title*=UTF-8'de'letztes%20Kapitel,
</TheBook/chapter4>;
rel="next"; title*=UTF-8'de'n%c3%a4chstes%20Kapitel
I'm parsing this link header in Java using BasicHeaderValueParser from Apache HttpCore 4.4.9 using the following code:
final String linkHeader = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
final HeaderElement[] parsedHeaders = BasicHeaderValueParser.parseElements(linkHeader, null);
for (HeaderElement headerElement : parsedHeaders)
{
    System.out.println(headerElement);
}
which tokenises on the comma and prints the following:
<http://www.example.com/foo
bar.html>; rel=canonical
Is this valid behaviour?
The comma is of course valid.
What you're missing is that the BasicHeaderValueParser is not generic. It only supports certain HTTP header fields, and "Link" isn't one of them (see the syntax description in https://hc.apache.org/httpcomponents-core-ga/httpcore/apidocs/org/apache/http/message/HeaderValueParser.html).
RFC 3986, section 3.3, states that a URI path may contain sub-delimiters, which are defined in section 2.2 and include the comma (,).
RFC 5988 states that it is the relation types, not the URI, that must be quoted if they contain a comma.
I think there is very little room for interpretation and it's IMHO an incomplete implementation on the HttpCore side.
The BasicHeaderValueParser uses the ',' as element delimiter, neglecting the fact that this character is a valid character for the header fields - which is probably ok for most cases, although not 100% compliant.
You may, however, provide your own custom parser as the second argument (instead of null).
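As an alternative that does not depend on HttpCore at all, the header value can be split by hand. The sketch below is not HttpCore's API and not a complete RFC 5988 parser; it only illustrates splitting on commas that sit outside <...> and outside quoted strings. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LinkHeaderSplitter {
    // Splits a Link header value into link-values, ignoring commas that
    // appear inside <...> or inside a double-quoted string.
    static List<String> splitLinkValues(String headerValue) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inAngle = false;
        boolean inQuotes = false;
        for (int i = 0; i < headerValue.length(); i++) {
            char c = headerValue.charAt(i);
            if (inQuotes) {
                current.append(c);
                if (c == '\\' && i + 1 < headerValue.length()) {
                    current.append(headerValue.charAt(++i)); // keep escaped char
                } else if (c == '"') {
                    inQuotes = false;
                }
            } else if (inAngle) {
                current.append(c);
                if (c == '>') {
                    inAngle = false;
                }
            } else if (c == ',') {
                addIfNotEmpty(values, current);
                current.setLength(0);
            } else {
                if (c == '"') {
                    inQuotes = true;
                } else if (c == '<') {
                    inAngle = true;
                }
                current.append(c);
            }
        }
        addIfNotEmpty(values, current);
        return values;
    }

    private static void addIfNotEmpty(List<String> values, StringBuilder current) {
        String value = current.toString().trim();
        if (!value.isEmpty()) {
            values.add(value);
        }
    }

    public static void main(String[] args) {
        String header = "<http://www.example.com/foo,bar.html>; rel=\"canonical\"";
        // The comma inside the angle brackets does not split the value.
        System.out.println(splitLinkValues(header));
    }
}
```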

URI vs URL vs URN

There are a lot of discussions, posts, comments, and questions on the internet about how to differentiate URI, URL, and URN.
One answer on SO explains it, but I am confused by the result I get in my code.
Q: If URI is a superset of URL, then how come I get the following output:
URI : /XXX/abc.do
URL : http://examplehost:8080/XXX/abc.do
when I write the code below:
System.out.println("URI : " + httpRequestObj.getRequestURI());
System.out.println("URL : " + httpRequestObj.getRequestURL());
EDIT: Could you share a detailed answer that keeps both Java and the original concepts of URI, URL, and URN in scope?
Regards,
Arun Kumar
java.net.URI API gives a good explanation:
A URI is a uniform resource identifier while a URL is a uniform resource locator. Hence every URL is a URI, abstractly speaking, but not every URI is a URL. This is because there is another subcategory of URIs, uniform resource names (URNs), which name resources but do not specify how to locate them. The mailto, news, and isbn URIs shown above are examples of URNs.
If URI is super set of URL then how come it got this following output ...
The definitions of URI and URL cannot be used to infer the behaviour of getRequestURI() and getRequestURL(). To understand what the methods return, you need to read the javadocs and the Servlet specification.
The meaning of those methods is what it is because the HttpServletRequest API has evolved over time, and that evolution has had to maintain backwards compatibility.
getRequestURI() does return a URI, and getRequestURL() does return a URL, but the URI and URL are for different things.
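The "every URL is a URI, but not every URI is a URL" point can be sketched with java.net.URI alone, without a servlet container. The class and method names here are hypothetical; the two strings are the outputs from the question.

```java
import java.net.MalformedURLException;
import java.net.URI;

final class UriVersusUrlDemo {
    // True if the text is a URI that java.net can also turn into a URL.
    static boolean canBecomeUrl(String text) {
        try {
            URI.create(text).toURL();
            return true;
        } catch (IllegalArgumentException | MalformedURLException e) {
            // Not absolute, or no protocol handler for the scheme.
            return false;
        }
    }

    public static void main(String[] args) {
        // A relative reference is a valid URI but cannot locate anything by itself.
        System.out.println(canBecomeUrl("/XXX/abc.do"));
        // An absolute http URI is also a URL.
        System.out.println(canBecomeUrl("http://examplehost:8080/XXX/abc.do"));
    }
}
```

toURL() only builds the object; it does not open a connection or resolve the host name.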

Java : File.toURI().toURL() on Windows file

The system I'm running on is Windows XP, with JRE 1.6.
I do this :
public static void main(String[] args) {
    try {
        System.out.println(new File("C:\\test a.xml").toURI().toURL());
    } catch (Exception e) {
        e.printStackTrace();
    }
}
and I get this : file:/C:/test%20a.xml
How come the given URL doesn't have two slashes before the C: ? I expected file://C:.... Is it normal behaviour?
EDIT :
From Java source code : java.net.URLStreamHandler.toExternalForm(URL)
result.append(":");
if (u.getAuthority() != null && u.getAuthority().length() > 0) {
    result.append("//");
    result.append(u.getAuthority());
}
It seems that the Authority part of a file URL is null or empty, and thus the double slash is skipped. So what is the authority part of a URL and is it really absent from the file protocol?
That's an interesting question.
First things first: I get the same results on JRE6. I even get that when I lop off the toURL() part.
RFC2396 does not actually require two slashes. According to section 3:
The URI syntax is dependent upon the
scheme. In general, absolute URI are
written as follows:
<scheme>:<scheme-specific-part>
Having said that, RFC2396 has been superseded by RFC3986, which states
The generic URI syntax consists of a
hierarchical sequence of components
referred to as the scheme, authority,
path, query, and fragment.
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
hier-part = "//" authority path-abempty
/ path-absolute
/ path-rootless
/ path-empty
The scheme and path components are
required, though the path may be empty
(no characters). When authority is
present, the path must either be empty
or begin with a slash ("/") character.
When authority is not present, the
path cannot begin with two slash
characters ("//"). These restrictions
result in five different ABNF rules
for a path (Section 3.3), only one of
which will match any given URI
reference.
So, there you go. Since the file URI that Java produces has no authority component, its path cannot begin with //, which is why you get file:/C:/... (the path-absolute form).
However, that RFC didn't come around until 2005, and Java references RFC2396, so I don't know why it's following this convention, as file URLs before the new RFC have always had two slashes.
To answer why you can have both:
file:/path/file
file:///path/file
file://localhost/path/file
RFC3986 (3.2.2. Host) states:
"If the URI scheme defines a default for host, then that default applies when the host subcomponent is undefined or when the registered name is empty (zero length). For example, the "file" URI scheme is defined so that no authority, an empty host, and "localhost" all mean the end-user's machine, whereas the "http" scheme considers a missing authority or empty host invalid."
So the "file" scheme translates file:///path/file to have a context of the end-user's machine even though the authority is an empty host.
As far as using it in a browser is concerned, it doesn't matter. I have typically seen file:///... but one, two or three '/' will all work. This makes me think (without looking at the java documentation) that it would be normal behavior.
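The three spellings listed above can be compared with java.net.URI. This is only a sketch with hypothetical class and method names; it checks how the strings are parsed and does not touch the file system.

```java
import java.net.URI;

final class FileUriFormsDemo {
    // Decoded path component of the URI.
    static String pathOf(String text) {
        return URI.create(text).getPath();
    }

    // Authority component, or null when the URI has none.
    static String authorityOf(String text) {
        return URI.create(text).getAuthority();
    }

    public static void main(String[] args) {
        String[] forms = {
            "file:/path/file",
            "file:///path/file",
            "file://localhost/path/file"
        };
        for (String form : forms) {
            // All three forms carry the same path; only the authority differs.
            System.out.println(form + " -> path=" + pathOf(form)
                    + ", authority=" + authorityOf(form));
        }
    }
}
```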
