How to extract part of a domain from a string in Java? - java

If my domain is 31.example.appspot.com (App Engine prepends a version number), I can retrieve the domain from the Request object like this:
String domain = request.getServerName();
// domain == 31.example.appspot.com
What I want is to extract everything except the version number so I end up with two values:
String fullDomain; // example.appspot.com
String appName; // example
Since the domain could be anything from:
1.example.appspot.com
to:
31.example.appspot.com
How do I extract the fullDomain and appName values in Java?
Would a regex be appropriate here?

If you are always sure of this pattern, then just find the first dot and start from there.
fullDomain = domain.substring(domain.indexOf('.'));
UPDATE: following James's and Sean's comments (the line above keeps the leading dot and does not extract appName), here is the full, correct code:
int dotIndex = domain.indexOf(".")+1;
fullDomain = domain.substring(dotIndex);
appName = domain.substring(dotIndex,domain.indexOf(".",dotIndex));

Take a look at split method on java.lang.String.
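For illustration, a minimal sketch of the split approach. It assumes the host name always starts with a version label followed by at least two more labels; the class and method names are made up for this example.

```java
class DomainSplit {
    // Splits "31.example.appspot.com" into at most three parts:
    // the version label, the app name, and everything after it.
    static String[] parts(String domain) {
        String[] p = domain.split("\\.", 3);
        if (p.length < 3) {
            throw new IllegalArgumentException("unexpected host: " + domain);
        }
        return p;
    }

    // The label right after the version number.
    static String appName(String domain) {
        return parts(domain)[1];
    }

    // Everything except the leading version label.
    static String fullDomain(String domain) {
        String[] p = parts(domain);
        return p[1] + "." + p[2];
    }

    public static void main(String[] args) {
        String domain = "31.example.appspot.com";
        System.out.println(fullDomain(domain));
        System.out.println(appName(domain));
    }
}
```

The limit argument of 3 keeps the rest of the domain in one piece, so it does not matter how many labels follow the app name.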

Related

Getting file extension from http url using Java

I know about FilenameUtils.getExtension() from Apache Commons IO.
But in my case I'm extracting extensions from http(s) URLs, so when I have something like
https://your_url/logo.svg?position=5
this method returns svg?position=5
Is there a good way to handle this situation, without writing the logic myself?
You can use Java's URL class (java.net.URL), which is very useful in cases like this. You could do something like this:
String url = "https://your_url/logo.svg?position=5";
URL fileIneed = new URL(url);
You then have a lot of getter methods on the fileIneed variable. In your case, getPath() returns this:
fileIneed.getPath() ---> "/logo.svg"
Then pass that to the Apache method you are already using, and you will get the "svg" string:
FilenameUtils.getExtension(fileIneed.getPath()) ---> "svg"
Java URL class docs:
https://docs.oracle.com/javase/7/docs/api/java/net/URL.html
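If you would rather not depend on Commons IO, the same idea can be sketched with the JDK alone. Note the swap: this uses java.net.URI instead of the URL class shown above, and the class and method names are hypothetical. It assumes the input is a syntactically valid URI; URI.create throws IllegalArgumentException otherwise.

```java
import java.net.URI;

class UrlExtension {
    // Returns the extension of the last path segment, or "" if there is none.
    static String extensionOf(String url) {
        // getPath() excludes the query string and the fragment.
        String path = URI.create(url).getPath();
        if (path == null) {
            return ""; // opaque URIs such as mailto: have no path
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("https://example.com/logo.svg?position=5"));
    }
}
```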
If you want a brandname® solution, then consider using the Apache method after stripping off the query string, if it exists:
String url = "https://your_url/logo.svg?position=5";
url = url.replaceAll("\\?.*$", "");
String ext = FilenameUtils.getExtension(url);
System.out.println(ext);
If you want a one-liner which does not even require an external library, then consider this option using String#replaceAll:
String url = "https://your_url/logo.svg?position=5";
String ext = url.replaceAll(".*/[^.]+\\.([^?]+)\\??.*", "$1");
System.out.println(ext);
svg
Here is an explanation of the regex pattern used above:
.*/ match everything up to, and including, the LAST path separator
[^.]+ then match one or more non-dot characters, i.e. the filename
\. match a dot
([^?]+) match AND capture one or more non-? characters, which is the extension
\??.* match an optional ? followed by the rest of the query string, if present

Java Regexp to match domain of url

I would like to use a Java regex to match the domain of a URL. For example,
for www.table.google.com, I would like to get 'google' out of the URL, namely the second-to-last word in the URL string.
Any help will be appreciated !!!
It really depends on the complexity of your inputs...
Here is a pretty simple regex:
.+\\.(.+)\\..+
It captures whatever lies between the last two dots; \\. matches a literal dot.
And here are some examples for that pattern: https://regex101.com/r/L52oz6/1.
As you can see, it works for simple inputs but not for complex URLs.
But why reinvent the wheel? There are plenty of really good libraries that correctly parse any complex URL. Still, for simple inputs a small regex is easily built. If this does not solve the problem for your inputs, let me know and I will adjust the pattern.
Note that you can also just use simple splitting like:
String[] elements = input.split("\\.");
String secondToLastElement = elements[elements.length - 2];
But don't forget the index-bound checking.
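A small sketch of the split variant with the bounds check included. The class name, the method name and the choice of Optional for the "not enough labels" case are my own assumptions, not part of the answer above.

```java
import java.util.Optional;

class SecondToLast {
    // Returns the second-to-last dot-separated label, if the input has at least two labels.
    static Optional<String> secondToLastLabel(String input) {
        String[] elements = input.split("\\.");
        if (elements.length < 2) {
            return Optional.empty(); // nothing before the last label
        }
        return Optional.of(elements[elements.length - 2]);
    }

    public static void main(String[] args) {
        System.out.println(secondToLastLabel("www.table.google.com").orElse("(none)"));
        System.out.println(secondToLastLabel("localhost").orElse("(none)"));
    }
}
```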
Or, if you are looking for a very quick solution, walk through the input starting from the last position: go backwards until you find the first dot, then continue until you find the second dot, and extract the part in between with input.substring(index1, index2).
There is already a method for exactly that purpose, namely String#lastIndexOf (see the documentation).
Take a look at this code snippet:
String input = ...
int indexLastDot = input.lastIndexOf('.');
// Search backwards starting one character before the last dot (fromIndex is inclusive).
int indexSecondToLastDot = input.lastIndexOf('.', indexLastDot - 1);
// Take the text strictly between the two dots.
String secondToLastWord = input.substring(indexSecondToLastDot + 1, indexLastDot);
Don't forget bounds checking: lastIndexOf returns -1 when the input contains no dot, and substring then throws.
The advantage of this approach is that it is really fast: it scans the string directly, with no regex matching and no intermediate array.
My attempt:
(?<scheme>https?:\/\/)?(?<subdomain>\S*?)(?<domainword>[^.\s]+)(?<tld>\.[a-z]+|\.[a-z]{2,3}\.[a-z]{2,3})(?=\/|$)
Demo. Works correctly for:
http://www.foo.stackoverflow.com
http://www.stackoverflow.com
http://www.stackoverflow.com/
http://stackoverflow.com
https://www.stackoverflow.com
www.stackoverflow.com
stackoverflow.com
http://www.stackoverflow.com
http://www.stackoverflow.co.uk
foo.www.stackoverflow.com
foo.www.stackoverflow.co.uk
foo.www.stackoverflow.co.uk/a/b/c
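For reference, here is how that pattern might be used from Java. This is only a sketch: the class and method names are made up, the \/ escapes are written as plain / (equivalent in Java), and returning null when nothing matches is my own choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DomainWord {
    // Same pattern as above, written as a Java string literal.
    private static final Pattern DOMAIN = Pattern.compile(
        "(?<scheme>https?://)?(?<subdomain>\\S*?)(?<domainword>[^.\\s]+)"
        + "(?<tld>\\.[a-z]+|\\.[a-z]{2,3}\\.[a-z]{2,3})(?=/|$)");

    // Returns the named group "domainword", or null if the pattern does not match.
    static String domainWord(String url) {
        Matcher m = DOMAIN.matcher(url);
        return m.find() ? m.group("domainword") : null;
    }

    public static void main(String[] args) {
        System.out.println(domainWord("http://www.stackoverflow.com/"));
        System.out.println(domainWord("foo.www.stackoverflow.co.uk/a/b/c"));
    }
}
```

Named groups need Java 7 or later. The pattern only accepts lowercase letters in the TLD part, so lowercase the input first if that matters.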
// Requires java.util.regex.Matcher and java.util.regex.Pattern.
// Dots are escaped so that they match literally.
private static final Pattern URL_MATCH_GET_SECOND_AND_LAST =
    Pattern.compile("www\\.(.*)\\.google\\.(.*)", Pattern.CASE_INSENSITIVE);
String sURL = "www.table.google.com";
Matcher matchURL = URL_MATCH_GET_SECOND_AND_LAST.matcher(sURL);
if (matchURL.find()) {
    String sFirst = matchURL.group(1);  // "table"
    String sSecond = matchURL.group(2); // "com"
}

Regular expression string in java

My problem concerns a command string sent from the client to the server. With this command I have to log in to the database managed by the server. According to the project's guidelines, the command's format is: login#username;password. The string is sent to the server through a socket. The user enters his credentials, and these are put into the command and sent to the server. The server has to split the username and the password out of the command and check in the database whether the user already exists. Here is my problem: since both the username and the password may contain ; (the same character as the separator), I don't understand how the server can split the username and password out of the command so that they are identical to the ones entered on the client.
Do you have any ideas? I was thinking of using a regular expression; is this the right way?
I would just split the string into a user/pass string like this:
String userPass = "User;Pass";
String[] parts = userPass.split(";");
String user = parts[0];
String pass = parts[1];
If you really have to split a string by a separator that can occur in both substrings, there is no way to make sure the string is always split correctly!
I think you can use the Unicode unit separator character U+001F to separate your strings safely, as the user will have some difficulty entering such a character!
But depending on your application and string processing this could cause trouble too, as there are some incompatibilities (e.g. XML 1.0 doesn't support it at all).
Otherwise, if at most one of the substrings may contain such a character, you can easily use one of the methods already presented to split the string and safely extract the data.
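A rough sketch of the unit separator idea. This changes the command format from the project's login#username;password, so it only applies if you are allowed to do that; the class and method names are hypothetical, and it assumes the client rejects or strips U+001F from user input.

```java
class UnitSeparatorLogin {
    // U+001F (unit separator): assumed never to appear in a username or password.
    private static final char SEP = '\u001F';
    private static final String PREFIX = "login#";

    // Client side: build the command string.
    static String buildCommand(String user, String password) {
        return PREFIX + user + SEP + password;
    }

    // Server side: returns {user, password}.
    static String[] parseCommand(String command) {
        if (!command.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a login command");
        }
        String payload = command.substring(PREFIX.length());
        int sep = payload.indexOf(SEP);
        if (sep < 0) {
            throw new IllegalArgumentException("missing separator");
        }
        return new String[] { payload.substring(0, sep), payload.substring(sep + 1) };
    }

    public static void main(String[] args) {
        String[] credentials = parseCommand(buildCommand("us;er", "p#ss;word"));
        System.out.println(credentials[0]);
        System.out.println(credentials[1]);
    }
}
```

Because only the first U+001F is treated as the separator, a ; or # inside either field passes through unchanged.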
This will work only if User/password doesn't contain # or ;
String loginStr = "login#User;Password";
String[] splittedLogin = loginStr.split("#");
String[] loginCredentials = splittedLogin[1].split(";");
String user = loginCredentials[0];
String password = loginCredentials[1];
System.out.println(user);

How do I split the rest of the URL from the last path of it

I have this file URL: http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf which will be converted to http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new%20sample.pdf later.
Now I can get the last path by:
public static String getLastPathFromUrl(String url) {
return url.replaceFirst(".*/([^/?]+).*", "$1");
}
which gives me new sample.pdf,
but how do I get the rest of the URL: http://xxx.xxx.xx.xx/resources/upload/2014/09/02/ ?
An easier way to get the last path segment from a URL is String.split, like this:
String url = "http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf";
String[] urlArray = url.split("/");
String lastPath = urlArray[urlArray.length-1];
This converts your URL into an array, which can then be used in many ways. There are various ways to get the URL without the last segment: one is to join the array generated above using this answer; another is to use lastIndexOf() and substring, like this:
String restOfUrl = url.substring(0, url.lastIndexOf("/") + 1); // + 1 keeps the trailing slash
PS: Although you can learn something by doing this, I think your best solution is to replace the space with %20 in the complete URL string; that would be the fastest and would make more sense.
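To get both pieces in one go, the lastIndexOf idea can be sketched like this. It assumes the URL has no query string or fragment containing a slash; the class and method names are made up.

```java
class UrlParts {
    // Returns {everything up to and including the last slash, the last path segment}.
    static String[] splitAtLastSlash(String url) {
        int slash = url.lastIndexOf('/');
        if (slash < 0) {
            return new String[] { "", url }; // no slash at all
        }
        return new String[] { url.substring(0, slash + 1), url.substring(slash + 1) };
    }

    public static void main(String[] args) {
        String url = "http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf";
        String[] parts = splitAtLastSlash(url);
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```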
I am not sure I understood correctly, but when you say
I have this file URL: URL/new sample.pdf which will be converted to URL/new%20sample.pdf later.
it looks like you are trying to replace the space with %20, or, put simply, to escape characters that are not allowed in a URL. If that is what you need, use the built-in
URLEncoder.encode(String s, String enc); you can use UTF-8 as the encoding. Note that it performs form encoding, so a space becomes + rather than %20, and it also escapes : and /, so apply it to individual path segments rather than to the whole URL.
http://docs.oracle.com/javase/7/docs/api/java/net/URLEncoder.html
If you really need to split it, and assuming you are interested in the part of the URL after http://, remove http:// and store the remainder in a string variable called, say, remainingURL. Then use
List<String> myList = new ArrayList<>(Arrays.asList(remainingURL.split("/")));
You can iterate over myList to get the remaining URL fragments.
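I've found it (a workaround: java.io.File is meant for file-system paths, so it may collapse the // after the scheme or change separators depending on the platform):
File file=new File("http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf");
System.out.println(file.getPath().replace(file.getName(),"")); // replace, not replaceAll: the name is not a regex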
Output:
http://xxx.xxx.xx.xx/resources/upload/2014/09/02/
Spring solution:
List<String> pathSegments = UriComponentsBuilder.fromUriString(url).build().getPathSegments();
String lastPath = pathSegments.get(pathSegments.size()-1);

How to Separate the String in the java for the below requirement?

I need to extract part of a string. The string is set dynamically.
For example:
1. String a="C:\Wowza Media Systems\Wowza Media Server 2.2.3\content\user2\weight.mp4"
I need to extract user2 from it.
2. String a="C:users\Wowza Media Systems\Wowza Media Server 2.2.3\content\user2\sample.flv"
The value of a is assigned dynamically, and I need the part of the string after content and before the file name (weight.mp4 in the first example).
You can also approach it like this:
String s = "C:/Wowza Media Systems/Wowza Media Server 2.2.3/content/user2/weight.mp4";
String[] strArray = s.split("/");
String fileName = strArray[strArray.length - 1]; /* weight.mp4 */
int index = s.lastIndexOf(fileName); // lastIndexOf, in case the name also occurs earlier in the path
String path = s.substring(0, index); /* C:/Wowza Media Systems/Wowza Media Server 2.2.3/content/user2/ */
You just want to substring the sequence between two last slashes. Look at methods 'lastIndexOf' and 'substring' of String class.
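Roughly, that suggestion could look like the sketch below. It assumes forward slashes as separators (for backslashes, replace '/' with '\\'); the class and method names are made up, and returning null when there is no parent folder is my own choice.

```java
class ParentFolder {
    // Returns the text between the last two slashes, e.g. "user2".
    static String parentFolderName(String path) {
        int last = path.lastIndexOf('/');
        if (last <= 0) {
            return null; // no slash, or only a leading one: no parent folder name
        }
        // Search backwards from the character before the last slash.
        int previous = path.lastIndexOf('/', last - 1);
        // If there is no earlier slash, previous is -1 and the substring starts at 0.
        return path.substring(previous + 1, last);
    }

    public static void main(String[] args) {
        String a = "C:/Wowza Media Systems/Wowza Media Server 2.2.3/content/user2/weight.mp4";
        System.out.println(parentFolderName(a));
    }
}
```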
Have you tried something like this?
a = a.replace("users", "");
It's kind of hard for me to answer because I don't fully understand what you are trying to do.
Is it only about removing "users", or are you trying to do something more?
If I understood you correctly, you want to fetch the file name from that String. If so:
If you have your String a defined like this:
String a="C:\\Wowza Media Systems\\Wowza Media Server 2.2.3\\content\\user2\\weight.mp4";
try the code:
// split takes a regex, so a literal backslash must be written as "\\\\"
String[] split = a.split("\\\\");
String file = null;
if(split.length!=0) file=split[split.length-1];
System.out.println(file);
String end = a.substring(a.lastIndexOf("\\") + 1);       // <- get the end (the file name)
String tmp = a.substring(0, a.lastIndexOf("\\"));        // <- get the rest
String start = tmp.substring(0, tmp.lastIndexOf("\\"));  // <- get the start
The + 1 in the first line skips the backslash itself. I haven't run this, but it should give you an idea of how to solve your problem.
