If we have a URL, e.g. www.google.de, how can I get ONLY the "google" part?
In Java, new URL(url).getHost() works, but it gives me google.de,
and this is not what I want.
Thank you
EDIT: If we have something like www.google.co.uk, then I also want only "google" as the result.
I don't want "google.de" or "www.google"; I ONLY want "google".
Splitting on a period and selecting the first or second element (whichever is not "www") would work:
URL url = new URL("http://www.host.ext.ext");
String host = url.getHost(); // host = "www.host.ext.ext"
String[] splitHost = host.split("\\."); // splitHost = { "www", "host", "ext", "ext" }
host = splitHost[0].equals("www") ? splitHost[1] : splitHost[0]; // host = "host"
If anything other than http://www. can come before the name, and the extension may consist of more than one part (.co.uk, for instance), then there is no easy way to get just the part you want. As far as I know, you would have to iterate over a list of known extensions and return the part immediately before the longest matching extension.
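The "longest matching extension" idea could be sketched roughly like this. The suffix list here is a tiny, made-up stand-in (a real one would have to come from something like the Public Suffix List), and the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.List;

class HostLabelSketch {
    // Hypothetical, deliberately tiny suffix list, for illustration only.
    static final List<String> SUFFIXES = Arrays.asList("co.uk", "uk", "com", "de");

    // Returns the label directly before the longest matching suffix.
    static String mainLabel(String host) {
        String best = "";
        for (String suffix : SUFFIXES) {
            if (host.endsWith("." + suffix) && suffix.length() > best.length()) {
                best = suffix;
            }
        }
        if (best.isEmpty()) {
            return host; // unknown extension: return the host unchanged
        }
        // Drop ".<suffix>" and keep only the last remaining label.
        String rest = host.substring(0, host.length() - best.length() - 1);
        return rest.substring(rest.lastIndexOf('.') + 1);
    }

    public static void main(String[] args) {
        System.out.println(mainLabel("www.google.de"));    // google
        System.out.println(mainLabel("www.google.co.uk")); // google
    }
}
```

The result is only as good as the suffix list: any extension missing from it falls through to the "return the host unchanged" branch.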
The most basic solution would be using
System.out.println(url.split("\\.")[1]);
Or you could try this https://stackoverflow.com/a/23079402/2555419
public String getHostName(String url) throws URISyntaxException {
URI uri = new URI(url);
String hostname = uri.getHost();
// to provide faultproof result, check if not null then return only hostname, without www.
if (hostname != null) {
return hostname.startsWith("www.") ? hostname.substring(4) : hostname;
}
return hostname;
}
I would like to attach a platform parameter to a URL, using ? if the URL has no query string and & if it already has one.
So I have added the following:
String api_url;
//constructor next to assign api_url value
//method to extract url and process request
processData(){
String apiUrl = "";
String[] urlParams = this.api_url.split("\\?");
if (urlParams.length > 0){
apiUrl = this.api_url+"&platform="+tokenService.getToken(AppDetailsHelpers.AppSettingsKeys.PLATFORM);
}else {
apiUrl = this.api_url+"?platform="+tokenService.getToken(AppDetailsHelpers.AppSettingsKeys.PLATFORM);
}
}
The above always evaluates urlParams to a non-empty array, even when the URL doesn't contain a ?
Example for a url
http://test.com
is resolved with the above code as
http://test.com&platform=12
But I expected it to be http://test.com?platform=12
I have tried adding
String[] urlParams = this.api_url.split("?");
But it throws a "Dangling metacharacter" error. What am I missing here? Why does this fail?
This is the expected behaviour of String#split. Running "http://test.com".split("\\?") returns an array with one element, "http://test.com". So, just update your condition to if (urlParams.length > 1).
You could also consider parsing your String into a Uri, in which case you may not need this check at all and could instead use:
Uri.parse(api_url)
.buildUpon()
.appendQueryParameter("platform", tokenService.getToken(AppSettingsKeys.PLATFORM))
.build().toString();
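As for the "Dangling metacharacter" error: split takes a regular expression, and a bare ? is a quantifier with nothing in front of it to quantify. If plain Java is enough, the regex can be avoided altogether. A minimal sketch, where appendPlatform and the value "12" are made up for illustration and the value is assumed not to need URL encoding:

```java
class PlatformParamSketch {
    // Hypothetical helper: picks '&' when a query string already exists, '?' otherwise.
    static String appendPlatform(String apiUrl, String platform) {
        char separator = apiUrl.indexOf('?') >= 0 ? '&' : '?';
        return apiUrl + separator + "platform=" + platform;
    }

    public static void main(String[] args) {
        // Even without a '?', split returns a one-element array, never an empty one.
        System.out.println("http://test.com".split("\\?").length);       // 1
        System.out.println(appendPlatform("http://test.com", "12"));     // http://test.com?platform=12
        System.out.println(appendPlatform("http://test.com?a=1", "12")); // http://test.com?a=1&platform=12
    }
}
```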
I have a relative URL string and know the host and protocol. How can I build an absolute URL string?
Seems easy? Yes, at first glance, until escaped characters come into play. I have to build an absolute URL from the Location header of a 302 HTTP(S) response.
Let's consider an example:
protocol: http
host: example.com
location: /path/path?param1=param1Data&param2= "
First I tried to build the URL string like this:
String urlString = protocol + "://" + host + location;
The constructor of the URL class does not escape spaces and double quotes:
new URL(urlString)
The constructors of the URI class fail with an exception:
new URI(urlString)
The URI.resolve method also fails with an exception.
Then I found that URI can escape params in the query string, but only with a few of its constructors, for example:
URI uri = new URI("http", "example.com",
"/path/path", "param1=param1Data&param2= \"", null);
This constructor needs the path and the query as separate arguments, but I have a relative URL that is not split into path and query parts.
I could check whether the relative URL contains a "?" and treat everything before it as the path and everything after it as the query. But what if the relative URL contains no path, only a query, and the query itself contains a "?"? Then this will not work, because part of the query will be treated as the path.
So I still cannot see how to build an absolute URL from a relative URL.
These accepted answers seem just wrong:
how to get URL using relative path
Append relative URL to java.net.URL
Building an absolute URL from a relative URL in Java
It would also be nice to handle the scenario where the relative URL is given in relation to a URL that has both a host and some path part:
initial url http://example.com/...some path...
relative /home?...query here ...
A Java core solution would be great, though using a good library is also an option.
The first ? indicates where the query string begins:
3.4. Query
[...] The query component is indicated by the first question mark (?) character and terminated by a number sign (#) character or by the end of the URI.
A simple approach (that won't handle fragments and assumes that the query string is always present) would be:
String protocol = "http";
String host = "example.com";
String location = "/path/path?key1=value1&key2=value2";
String path = location.substring(0, location.indexOf("?"));
String query = location.substring(location.indexOf("?") + 1);
URI uri = new URI(protocol, host, path, query, null);
A better approach that can also handle fragments could be:
String protocol = "http";
String host = "example.com";
String location = "/path/path?key1=value1&key2=value2#fragment";
// Split the location without removing the delimiters
String[] parts = location.split("(?=\\?)|(?=#)");
String path = null;
String query = null;
String fragment = null;
// Iterate over the parts to find path, query and fragment
for (String part : parts) {
// The query string starts with ?
if (part.startsWith("?")) {
query = part.substring(1);
continue;
}
// The fragment starts with #
if (part.startsWith("#")) {
fragment = part.substring(1);
continue;
}
// Path is what's left
path = part;
}
URI uri = new URI(protocol, host, path, query, fragment);
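The same idea can also be written without a regex, cutting only at the first # and then at the first ?, so that later question marks stay inside the query. This is just a sketch: toAbsolute is a made-up name, and it assumes the location is not already percent-encoded, because the multi-argument URI constructors always quote % and would escape it a second time.

```java
import java.net.URI;
import java.net.URISyntaxException;

class AbsoluteUriSketch {
    // Hypothetical helper: splits a relative location into path, query and fragment.
    static URI toAbsolute(String scheme, String host, String location) {
        String fragment = null;
        int hash = location.indexOf('#');
        if (hash >= 0) {                       // fragment starts at the first '#'
            fragment = location.substring(hash + 1);
            location = location.substring(0, hash);
        }
        String query = null;
        int mark = location.indexOf('?');
        if (mark >= 0) {                       // query starts at the first '?'
            query = location.substring(mark + 1);
            location = location.substring(0, mark);
        }
        try {
            // This constructor quotes illegal characters such as spaces and double quotes.
            return new URI(scheme, host, location, query, fragment);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute("http", "example.com",
                "/path/path?param1=param1Data&param2= \"").toASCIIString());
        System.out.println(toAbsolute("http", "example.com", "?q=a?b").getQuery());
    }
}
```

With the example from the question, the space and the double quote should come out as %20 and %22 in the query.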
The best way seems to be to create a URI object with the multi piece constructors, and then convert it to a URL like so:
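// The five-argument form is needed: with only four arguments, the last one is treated as the fragment.
URI uri = new URI("https", "sitename.domain.tld", "/path/goes/here", "param1=value&param2=otherValue", null);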
URL url = uri.toURL();
I have a URL like: http://myfile.com/File1/beauty.png
I have to remove the http://site address/ part from the string.
That means the result should be File1/beauty.png
Note: the site address might be anything (e.g. some.com, some.org)
See here: http://docs.oracle.com/javase/tutorial/networking/urls/urlInfo.html
Just create a URL object out of your string and use URL.getPath() like this:
String s = new URL("http://myfile.com/File1/beauty.png").getPath();
If you don't need the slash at the beginning, you can remove it via s.substring(1, s.length());
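Put together, that could look something like the sketch below. Note that it uses java.net.URI rather than URL (so no checked exception has to be handled), and relativePath is just an illustrative name:

```java
import java.net.URI;

class RelativePathSketch {
    // Hypothetical helper: returns the path without its leading slash.
    static String relativePath(String url) {
        String path = URI.create(url).getPath();
        if (path == null) {
            return ""; // opaque URIs such as mailto: have no path
        }
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public static void main(String[] args) {
        System.out.println(relativePath("http://myfile.com/File1/beauty.png")); // File1/beauty.png
        System.out.println(relativePath("https://some.org/File1/beauty.png"));  // File1/beauty.png
    }
}
```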
Edit, according to comment:
If you are not allowed to use URL, this would be your best bet: Extract main domain name from a given url
See the accepted answer. Basically, you have to get a TLD list, find the domain, and strip everything up to the end of the domain name.
If, as you say, you only want to use the standard String methods then this should do it.
public static String getPath(String url){
if(url.contains("://")){
url = url.substring(url.indexOf("://")+3);
url = url.substring(url.indexOf("/") + 1);
} else {
url = url.substring(url.indexOf("/")+1);
}
return url;
}
If the URL contains ://, then we know that the string you are looking for comes after the third /. Otherwise, it should come after the first. If we run the following:
System.out.println(getPath("http://myfile.com/File1/beauty.png"));
System.out.println(getPath("https://myfile.com/File1/beauty.png"));
System.out.println(getPath("www1.myfile.com/File1/beauty.png"));
System.out.println(getPath("myfile.co.uk/File1/beauty.png"));
The output is:
File1/beauty.png
File1/beauty.png
File1/beauty.png
File1/beauty.png
You can use the below approach to fetch the required data.
String url = "http://myfile.org/File1/beauty.png";
URL u = new URL(url);
String[] arr = url.split(u.getAuthority());
System.out.println(arr[1]);
Output - /File1/beauty.png
String s = "http://www.freegreatpicture.com/files/146/26189-abstract-color-background.jpg";
s = s.substring(s.indexOf("/", s.indexOf("/") + 1));
I have the following URL that I need to get the 4805206 code from.
href="http://adserver.adtech.de/adlink|832|4805206|0|1686|AdId=9624985;BnId=1;itime=527032581;nodecode=yes;link=http://URL/Recruiters/Lex-Consultancy-3979.aspx"
I was wondering if it's possible to do this, and if so, how?
Here's my Java Selenium class:
public void checkAdTechKeys(WebDriver driver) {
if(driver.getCurrentUrl().equalsIgnoreCase("URL"))
{
HP_LeftSearchBox(driver);//enter search terms
driver.get("URL");
// driver.findElement(By.linkText("Read More")).getAttribute("href").toString();
String url = new String(driver.findElement(By.linkText("Read More")).getAttribute("href").toString());
// url = url.split("|")[2];
System.out.println(url);
}else{
setup.loadHomePage(driver);
checkAdTechKeys(driver);
}
}
The code with a small modification that prints out that number:
driver.get("http://irishjobs.ie/");
String url = driver.findElement(By.linkText("Read More")).getAttribute("href");
String[] parsedUrl = url.split("\\|");
System.out.println(parsedUrl[2]);
Two things that you missed:
escaping the "|"
.split() returns an array of strings, not a string.
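Both points can be tried without a browser by running the split on the href from the question. In this sketch, Pattern.quote is used as an alternative to the "\\|" escape, and thirdField is a made-up helper name:

```java
import java.util.regex.Pattern;

class PipeSplitSketch {
    // Hypothetical helper: returns the third '|'-separated field of the link.
    static String thirdField(String href) {
        // split takes a regex and '|' means alternation there, so it has to be made literal.
        String[] parts = href.split(Pattern.quote("|"));
        return parts[2];
    }

    public static void main(String[] args) {
        String href = "http://adserver.adtech.de/adlink|832|4805206|0|1686|AdId=9624985;"
                + "BnId=1;itime=527032581;nodecode=yes;"
                + "link=http://URL/Recruiters/Lex-Consultancy-3979.aspx";
        System.out.println(thirdField(href)); // 4805206
    }
}
```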
I've seen things like:
user:password@smtpserver:port
in the past, but I'm not sure whether some library parsed that to build the properties for creating a session, or whether there is some sort of accepted format.
While there is a SMTP URL Scheme, I have never seen anyone use it. In practice, most applications provide four separate fields for host, port, user name and password. But if you really need to put those four components into a single string, the example you provided is probably the best-known format for something like this.
Using a URI to specify a network resource such as an SMTP server is probably the closest thing to an "accepted" format you'll see. An SMTP URI would look something like smtp://user:host@example.com:port, or perhaps just smtp://example.com. You'd use a generic URI parsing library to extract the various components.
There's also an old RFC draft for SMTP URLs
I think that would work.
I would like to add, as an answer, how I'm using the java.net.URI class to get information from that URI.
import java.lang.reflect.Method;
import java.net.URI;

class Demo {
public static void main( String ... args ) throws Exception {
System.out.println( inspect( new URI("smtp://user:port@host:25")));
}
// invoke all the getXyz methods on object and construct
// a string with the result.
private static String inspect( Object o ) throws Exception {
StringBuilder builder = new StringBuilder();
for( Method m : o.getClass().getMethods() ) {
String name = m.getName();
if( name.startsWith("get")) {
builder.append( name )
.append(" = " )
.append( m.invoke( o ) )
.append( "\n" );
}
}
return builder.toString();
}
}
Output
getAuthority = user:port@host:25
getFragment = null
getPath =
getQuery = null
getScheme = smtp
getHost = host
getPort = 25
getUserInfo = user:port
getRawAuthority = user:port@host:25
getRawFragment = null
getRawPath =
getRawQuery = null
getRawSchemeSpecificPart = //user:port@host:25
getRawUserInfo = user:port
getSchemeSpecificPart = //user:port@host:25
getClass = class java.net.URI
I was configuring a Node app and had a hard time getting this right.
smtp://username:password@smtphost:port
The funny part was that my password contained an '@', so the parser assumed the hostname started there.
The username also contained an '@' followed by a server name, but that was fine, since it is delimited by the colon ":".
Hope this helps.
Cheers
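For what it's worth, the usual way around this in a URI is to percent-encode an '@' inside the password as %40, since a literal '@' is what separates the user info from the host. A small Java sketch with made-up credentials, host and port (splitting at the first colon is a simplification that assumes the username itself contains no ':'):

```java
import java.net.URI;

class SmtpUriSketch {
    // Hypothetical helper: splits the decoded user info into { user, password }.
    static String[] credentials(URI uri) {
        String userInfo = uri.getUserInfo(); // already percent-decoded
        int colon = userInfo.indexOf(':');
        return new String[] { userInfo.substring(0, colon), userInfo.substring(colon + 1) };
    }

    public static void main(String[] args) {
        // "%40" is the encoded '@' inside the password "p@ssword".
        URI uri = URI.create("smtp://username:p%40ssword@smtphost:587");
        String[] creds = credentials(uri);
        System.out.println(creds[0]);      // username
        System.out.println(creds[1]);      // p@ssword
        System.out.println(uri.getHost()); // smtphost
        System.out.println(uri.getPort()); // 587
    }
}
```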