String url getting extension - java

I am trying to get the extension (dk, com, org, eu, or any other domain extension) from a String.
For example:
http://www.example.com/siteone/sitetwo/currentpage
From this String I would like to get the .com
I could go the messy route and use substring, but the problem comes when a URL looks like this:
dk.webpage.otherstuff.com/page
So how can I do this in a way that doesn't require me to check everything at every step?

Use the getHost() method like this:
public static String getDomainName(String testUrl) throws URISyntaxException {
URI fullUri = new URI(testUrl);
String domainName = fullUri.getHost();
return domainName.startsWith("www.") ? domainName.substring(4) : domainName;
}
After that, just use substring to get the .com part of the domain name.
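As a rough sketch of how the two steps above might fit together: the class name DomainExtension and the method extensionOf are made up for illustration, and prepending http:// to scheme-less input is an assumption (URI.getHost() returns null when the string has no scheme).

```java
import java.net.URI;

class DomainExtension {
    /** Returns the last label of the host, including the leading dot, e.g. ".com". */
    static String extensionOf(String url) {
        // Assumption: input without "://" is a bare host/path, so give it a scheme.
        String withScheme = url.contains("://") ? url : "http://" + url;
        String host = URI.create(withScheme).getHost();
        if (host == null) {
            throw new IllegalArgumentException("No host found in: " + url);
        }
        int lastDot = host.lastIndexOf('.');
        // A host without any dot (e.g. "localhost") has no extension.
        return lastDot < 0 ? "" : host.substring(lastDot);
    }
}
```

With this, extensionOf("dk.webpage.otherstuff.com/page") gives .com. It only looks at the last label, so example.co.uk gives .uk; multi-part suffixes need a public suffix list, which is what the Guava class mentioned in another answer provides.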

Use Guava's InternetDomainName class. Specifically have a look at the publicSuffix method.

Try this:
String ext = url.replaceAll(".*//[^/]*(\\.\\w+)/.*", "$1");
Some test code:
String url = "http://www.example.com/siteone/sitetwo/currentpage";
String ext = url.replaceAll(".*//[^/]*(\\.\\w+)/.*", "$1");
System.out.println(ext);
Output:
.com
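One thing to watch with the replaceAll approach: when the pattern does not match (for example, a URL with no path after the host), replaceAll returns the input unchanged rather than signalling a failure. A possible variation, sketched here with a hypothetical class RegexExtension and a slightly relaxed pattern that also accepts the end of the string after the host, uses Matcher.find so that "no match" is explicit:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexExtension {
    // "//", then the host, capturing its last ".label", followed by "/" or end of input.
    private static final Pattern EXTENSION = Pattern.compile("//[^/]*(\\.\\w+)(?:/|$)");

    /** Returns the extension (with leading dot) or an empty Optional if nothing matched. */
    static Optional<String> extensionOf(String url) {
        Matcher matcher = EXTENSION.matcher(url);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

Like the original pattern, this still requires a // before the host, so scheme-less input yields an empty result.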

Try this :
private String getExtensionFromDomain(String domainName){
int p = domainName.lastIndexOf(".") +1;
return domainName.substring(p);
}
In case of example.co.ma this will output: ma (without the leading dot; drop the +1 if you want .ma)

Related

Get the specific value from the last part of a String in Android

I wanted to get the value from the last part of a String, here's my example String
String str="www.mywebsite.com?id=0001&user=myname"
I would like to get the word myname from that String. All the examples I'm seeing look like this:
String getUser = str.substring(str.length() - 6);
but the length of the user value changes with every transaction, so I can't hard-code it. Can anyone please help me get the user value from that String? Thanks.
String getUser = str.substring(str.lastIndexOf("=") + 1, str.length());
Will return myname.
If your string is going to be a uri, you can use Uri's getQueryParameter:
String str="www.mywebsite.com?id=0001&user=myname";
Uri uri = Uri.parse(str);
return uri.getQueryParameter("user");
Try this
String getUser = str.substring(str.lastIndexOf("=") + 1);
Try this
String foo = "www.mywebsite.com?id=0001&user=myname";
String[] split = foo.split("=");
String name = split[split.length-1];
With String.split(), you can split a string on a delimiter and put the values in an array. It is similar to PHP's explode function.
String str="www.mywebsite.com?id=0001&user=myname";
String user = getQueryParam(str);
public static final String getQueryParam(String url) throws URISyntaxException {
List<NameValuePair> params = URLEncodedUtils.parse(new URI(url), "UTF-8");
for (NameValuePair param : params){
if("user".equalsIgnoreCase(param.getName())) {
return param.getValue();
}
}
return null;
}
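If pulling in a library or the Android Uri class is not an option, the same lookup can be sketched with the standard library alone. Everything named here (QueryParams, queryParam) is hypothetical, and the sketch assumes Java 10+ for URLDecoder.decode(String, Charset) and a simple key=value&key=value query format:

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

class QueryParams {
    /** Returns the first value of the named query parameter, if present. */
    static Optional<String> queryParam(String url, String name) {
        int questionMark = url.indexOf('?');
        if (questionMark < 0) {
            return Optional.empty();
        }
        String query = url.substring(questionMark + 1);
        int fragment = query.indexOf('#');
        if (fragment >= 0) {
            query = query.substring(0, fragment); // ignore any #fragment
        }
        for (String pair : query.split("&")) {
            int equals = pair.indexOf('=');
            String key = equals < 0 ? pair : pair.substring(0, equals);
            if (decode(key).equals(name)) {
                return Optional.of(equals < 0 ? "" : decode(pair.substring(equals + 1)));
            }
        }
        return Optional.empty();
    }

    private static String decode(String text) {
        return URLDecoder.decode(text, StandardCharsets.UTF_8);
    }
}
```

Unlike the lastIndexOf("=") approaches, this does not depend on user being the last parameter in the string.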

How to remove specific character from a string?

I have a string that stores a user's login name, and I want to remove specific characters from it: I want to remove "@gmail.com" and keep just the name before the @, then save it as a new string.
How can I do this?
Here's an example. The email can be any email address, not just gmail.com:
public class Test {
public static void main(String[] args) {
String email = "nobody@gmail.com";
String nameOnly = email.substring(0,email.indexOf('@'));
System.out.println(nameOnly);
}
}
Make sure the email format is correct, then use the split method to split the string at the '@' character and take the first part of the result.
var str = "username@amailserver.com";
var res = str.split("@");
var username = res[0];
You can also use a regex with String's replaceAll method to strip it.
Sample:
String s = "Rod_Algonquin@company.co.nz";
String newS = s.replaceAll("@(.*).(.*)", "");
System.out.println(newS);
This works with any domain extension.
If you only want to strip particular extensions such as .org or .net, you need to change the regex @(.*).(.*)

Java: how to get the text between "http://" and the first following "/" occurrence? And after the first "/" occurrence?

I am still a novice with regular expressions ("regex") in Java.
If I have a URL like this: "http://somedomain.someextention/somefolder/.../someotherfolder/somepage"
What is the simplest way to get :
"somedomain.someextention" ?
"somefolder/.../someotherfolder/somepage" ?
"somepage" ?
Thanks !
You don't have to (and probably shouldn't) use regex here. Instead, use classes designed to handle things like this, for example the URL, URI, and File classes:
String address = "http://somedomain.someextention/somefolder/.../someotherfolder/somepage";
URL url = new URL(address);
File file = new File(url.getPath());
System.out.println(url.getHost());
System.out.println(url.getPath());
System.out.println(file.getName());
Output:
somedomain.someextention
/somefolder/.../someotherfolder/somepage
somepage
You may now need to get rid of the / at the start of the path to your resource. You can use substring(1) for that if the path starts with /.
But if you really must use regex you can try with
^https?://([^/]+)/(.*/([^/]+))$
Now
group 1 will contain host name,
group 2 will contain path to resource
group 3 will contain name of resource
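For illustration, the pattern above could be applied with java.util.regex roughly like this. The class name UrlParts and the method split are hypothetical; the regex itself is taken unchanged from the answer:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlParts {
    private static final Pattern URL_PATTERN =
            Pattern.compile("^https?://([^/]+)/(.*/([^/]+))$");

    /** Returns {host, path, resource name}, or empty if the URL does not fit the pattern. */
    static Optional<String[]> split(String url) {
        Matcher matcher = URL_PATTERN.matcher(url);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { matcher.group(1), matcher.group(2), matcher.group(3) });
    }
}
```

Note that group 2 requires at least one / inside the path, so a URL with a single path segment such as http://host/page does not match at all, which is one reason the class-based approach is usually the safer choice.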
The best way to get those components is to use the URI class; e.g.
URI uri = new URI(str);
String domain = uri.getHost();
String path = uri.getPath();
int pos = path.lastIndexOf("/");
...
// or use File to parse the path string.
You could do it using regexes on the raw URL string, but there is a risk that you won't correctly cope with all of the variability that is possible in a URL. (Hint: the regex supplied by @Pchenko doesn't :-)) And you would definitely need to use a decoder to deal with possible percent encoding.
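To make the percent-encoding point concrete, here is a small sketch (the class DecodedPath and its method names are invented for this example). URI.getPath() returns the decoded path, while URI.getRawPath() returns it exactly as written in the URL:

```java
import java.net.URI;

class DecodedPath {
    /** Last path segment with percent-escapes decoded, e.g. "my page". */
    static String lastSegment(String url) {
        return afterLastSlash(URI.create(url).getPath());
    }

    /** Last path segment exactly as it appears in the URL, e.g. "my%20page". */
    static String rawLastSegment(String url) {
        return afterLastSlash(URI.create(url).getRawPath());
    }

    private static String afterLastSlash(String path) {
        if (path == null) {
            return ""; // opaque URIs such as mailto: have no path
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }
}
```

A regex or substring applied to the raw string sees only the encoded form, so it would report my%20page where the decoded name is my page.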
This uses neither regex nor URI; it is simple substring code, offered as exercise material. It lacks format validation for a few corner cases.
int lastDelim = str.lastIndexOf('/');
if (lastDelim<0) throw new IllegalArgumentException("Invalid url");
int startIdx = str.indexOf("//");
startIdx = startIdx<0 ? 0 : startIdx+2;
int pathDelim = str.indexOf('/', startIdx);
String domain = str.substring(startIdx, pathDelim);
String path = str.substring(pathDelim+1, lastDelim);
String page = str.substring(lastDelim+1);
If you would like to use regex to decode the URL instead of using the URI class, as described in the previous answers, the below link gives a nice tutorial of regex, and it explains decoding a sample URL as well. You could learn it there and try it out.
http://www.beedub.com/book/2nd/regexp.doc.html
It's not regex, or scalable at that, it works though:
public class SomeClass
{
public static void main(String[] args)
{
SomeClass sclass = new SomeClass();
String[] string =
sclass.parseURL("http://somedomain.someextention/somefolder/.../someotherfolder/somepage");
System.out.println(string[0]);
System.out.println(string[1]);
System.out.println(string[2]);
}
private String[] parseURL(String url)
{
String part1 = url.substring("http://".length(), url.indexOf("/", "http://".length()));
String part2 = url.substring("http://".length() + part1.length() + 1, url.lastIndexOf("/"));
String part3 = url.substring(url.lastIndexOf("/") + 1);
return new String[] { part1, part2, part3 };
}
}
Output:
somedomain.someextention
somefolder/.../someotherfolder
somepage

Extracting the file name from the variable dynamically

I have a method that receives a file path as its parameter, as I confirmed while debugging:
private void processFile(String filePath)
{
}
Now this file path can be like:
C:\abc\file1.txt
or
C:\abc\def\file1.txt
or
C:\ghj\ytr\wer\file1.txt
Now I need to extract only the file name and store it in a string. For example, the string s should end up holding file1.txt:
String s = "file1.txt";
How can I achieve this?
This should do the trick
String s = new File(filepath).getName();
although I would rename filepath to filePath.
You can find File#getName() documentation here
You can use lastIndexOf and substring for this case:
String s = filepath.substring(filepath.lastIndexOf(File.separator)+1);
File.getName also takes similar approach, see source below:
public String getName() {
int index = path.lastIndexOf(separatorChar);
if (index < prefixLength) return path.substring(prefixLength);
return path.substring(index + 1);
}
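Both approaches above rely on the platform separator, so a Windows-style path such as C:\abc\file1.txt is only split correctly when the code runs on Windows. If the paths may come from a different platform than the one running the code, a separator-agnostic variant could look like the following sketch (FileNames and fileNameOf are hypothetical names):

```java
class FileNames {
    /** Returns everything after the last '/' or '\\', whichever comes later. */
    static String fileNameOf(String filePath) {
        int lastSeparator = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
        // lastSeparator is -1 when there is no separator, so the whole string is returned.
        return filePath.substring(lastSeparator + 1);
    }
}
```

The trade-off is that a backslash is a legal file name character on Unix-like systems, so this treats such names as paths.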

How to obtain the last path segment of a URI

I have as input a string that is a URI. How can I get the last path segment (which in my case is an id)?
This is my input URL:
String uri = "http://base_path/some_segment/id";
and I have to obtain the id. I have tried this:
String strId = "http://base_path/some_segment/id";
strId = strId.replace(path);
strId = strId.replaceAll("/", "");
Integer id = new Integer(strId);
return id.intValue();
but it doesn't work, and surely there must be a better way to do it.
Is this what you are looking for?
URI uri = new URI("http://example.com/foo/bar/42?param=true");
String path = uri.getPath();
String idStr = path.substring(path.lastIndexOf('/') + 1);
int id = Integer.parseInt(idStr);
Alternatively:
URI uri = new URI("http://example.com/foo/bar/42?param=true");
String[] segments = uri.getPath().split("/");
String idStr = segments[segments.length-1];
int id = Integer.parseInt(idStr);
import android.net.Uri;
Uri uri = Uri.parse("http://example.com/foo/bar/42?param=true");
String token = uri.getLastPathSegment();
Here's a short method to do it:
public static String getLastBitFromUrl(final String url){
// return url.replaceFirst("[^?]*/(.*?)(?:\\?.*)","$1);" <-- incorrect
return url.replaceFirst(".*/([^/?]+).*", "$1");
}
Test Code:
public static void main(final String[] args){
System.out.println(getLastBitFromUrl(
"http://example.com/foo/bar/42?param=true"));
System.out.println(getLastBitFromUrl("http://example.com/foo"));
System.out.println(getLastBitFromUrl("http://example.com/bar/"));
}
Output:
42
foo
bar
Explanation:
.*/ // find anything up to the last / character
([^/?]+) // find (and capture) all following characters up to the next / or ?
// the + makes sure that at least 1 character is matched
.* // find all following characters
$1 // this variable references the first (and only) captured group from above
// I.e. the entire string is replaced with just the portion
// captured by the parentheses above
I know this is old, but the solutions here seem rather verbose. Just an easily readable one-liner if you have a URL or URI:
String filename = new File(url.getPath()).getName();
Or if you have a String:
String filename = new File(new URL(url).getPath()).getName();
If you are using Java 8 and you want the last segment in a file path, you can do this:
Path path = Paths.get("example/path/to/file");
String lastSegment = path.getFileName().toString();
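If you have a URL such as http://base_path/some_segment/id, you can do the following. Note that this treats the URL as a file path, which works on Unix-like systems but fails on Windows, where ':' is not allowed in a path.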
final Path urlPath = Paths.get("http://base_path/some_segment/id");
final Path lastSegment = urlPath.getName(urlPath.getNameCount() - 1);
In Android
Android has a built in class for managing URIs.
Uri uri = Uri.parse("http://base_path/some_segment/id");
String lastPathSegment = uri.getLastPathSegment();
If you have commons-io included in your project, you can do it without creating unnecessary objects with org.apache.commons.io.FilenameUtils
String uri = "http://base_path/some_segment/id";
String fileName = FilenameUtils.getName(uri);
System.out.println(fileName);
Will give you the last part of the path, which is the id
In Java 7+ a few of the previous answers can be combined to allow retrieval of any path segment from a URI, rather than just the last segment. We can convert the URI to a java.nio.file.Path object, to take advantage of its getName(int) method.
Unfortunately, the static factory Paths.get(uri) is not built to handle the http scheme, so we first need to separate the scheme from the URI's path.
URI uri = URI.create("http://base_path/some_segment/id");
Path path = Paths.get(uri.getPath());
String last = path.getFileName().toString();
String secondToLast = path.getName(path.getNameCount() - 2).toString();
To get the last segment in one line of code, simply nest the lines above.
Paths.get(URI.create("http://base_path/some_segment/id").getPath()).getFileName().toString()
To get the second-to-last segment while avoiding index numbers and the potential for off-by-one errors, use the getParent() method.
String secondToLast = path.getParent().getFileName().toString();
Note the getParent() method can be called repeatedly to retrieve segments in reverse order. In this example, the path only contains two segments, otherwise calling getParent().getParent() would retrieve the third-to-last segment.
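Because converting a URL path to a java.nio.file.Path depends on the file system rules of the platform the code runs on, a platform-independent alternative is to split the URI's path directly. The following is only a sketch; PathSegments, segmentsOf, and fromEnd are names invented here:

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

class PathSegments {
    /** Returns the non-empty path segments of the URI, in order. */
    static List<String> segmentsOf(String uri) {
        List<String> segments = new ArrayList<>();
        String path = URI.create(uri).getPath();
        if (path == null) {
            return segments; // opaque URI, no path
        }
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                segments.add(part); // skips the empty string before the leading "/"
            }
        }
        return segments;
    }

    /** n = 1 is the last segment, n = 2 the second-to-last, and so on. */
    static Optional<String> fromEnd(String uri, int n) {
        List<String> segments = segmentsOf(uri);
        int index = segments.size() - n;
        return (n >= 1 && index >= 0) ? Optional.of(segments.get(index)) : Optional.empty();
    }
}
```

Asking for a segment that does not exist returns an empty Optional instead of throwing, which avoids the NullPointerException that repeated getParent() calls can cause on short paths.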
You can also use replaceAll:
String uri = "http://base_path/some_segment/id";
String lastSegment = uri.replaceAll(".*/", "");
System.out.println(lastSegment);
result:
id
You can use the getPathSegments() function of Android's Uri class. (Android Documentation)
Consider your example URI:
Uri uri = Uri.parse("http://base_path/some_segment/id");
You can get the last segment using:
List<String> pathSegments = uri.getPathSegments();
String lastSegment = pathSegments.get(pathSegments.size() - 1);
lastSegment will be id.
I'm using the following in a utility class:
public static String lastNUriPathPartsOf(final String uri, final int n, final String... ellipsis)
throws URISyntaxException {
return lastNUriPathPartsOf(new URI(uri), n, ellipsis);
}
public static String lastNUriPathPartsOf(final URI uri, final int n, final String... ellipsis) {
return uri.toString().contains("/")
? (ellipsis.length == 0 ? "..." : ellipsis[0])
+ uri.toString().substring(StringUtils.lastOrdinalIndexOf(uri.toString(), "/", n))
: uri.toString();
}
In Dart, you can get the list of path segments from the Uri class:
String id = Uri.tryParse("http://base_path/some_segment/id")?.pathSegments.last ?? "InValid URL";
It returns id if the URL is valid; if it is invalid, it returns "InValid URL".
Convert the URI to a URL and call getFile() if you would rather not extract the file with substring. Note that getFile() returns the whole path plus any query string, not only the last segment.
