I'm fairly new to Java and I'm trying to extract part of a string:
Say I have a URL and I want a specific part of it, such as a filename:
String url = "http://example.com/filename02563.zip";
The 02563 part is generated at random every time and it's not always 5 characters long.
I want Java to find what's between "m/" (from .com/) and the end of the line, to get the filename alone.
Now consider this example:
Say I have an html file that I want a snippet extracted from. Below would be the extracted example:
<applet name=someApplet id=game width="100%" height="100%" archive=someJarFile0456799.jar code=classInsideAJarFile.class mayscript>
I want to extract the jar filename, so I want to get the text between "ve=" and ".jar". The extension will always be ".jar", so including this is not important.
How would I do this? If possible, could you comment the code so I understand what's happening?
Use the Java URI class where you can access the individual elements.
URI uri = new URI("http://example.com/filename02563.zip");
String filename = uri.getPath(); // "/filename02563.zip", including the leading '/'
Granted, this will need a little more work if the resource no longer resides in the root path.
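A minimal sketch of that extra work, assuming the filename is always the last segment of the path. The class and method names (UriFileName, fileNameOf) are made up for illustration, and URI.create is used only to avoid the checked exception of the URI constructor:

```java
import java.net.URI;

final class UriFileName {
    // Returns the text after the last '/' of the URI's path component.
    // Assumes a hierarchical URI (http, https, ...) so getPath() is not null.
    static String fileNameOf(String url) {
        String path = URI.create(url).getPath();   // e.g. "/files/filename02563.zip"
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(fileNameOf("http://example.com/filename02563.zip"));
    }
}
```

Because getPath() excludes the query string, a URL such as http://example.com/a/b/file.zip?x=1 should still yield file.zip.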
You can use the lastIndexOf() and substring() methods from the String class to extract a specific piece of a String:
String url = "http://example.com/filename02563.zip";
String filename = url.substring(url.lastIndexOf("/") + 1); //+1 skips ahead of the '/'
You have answers for your first question, so this is for the second one. Normally I would use an XML parser, but your example is not valid XML, so this will be solved with a regex (as you wanted).
String url = "<applet name=someApplet id=game width=\"100%\" height=\"100%\" archive=someJarFile0456799.jar code=classInsideAJarFile.class mayscript>";
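// (?<=archive=) requires "archive=" just before the match, .*? matches as little as possible,
// and (?= ) stops the match right before the next space
Pattern pattern = Pattern.compile("(?<=archive=).*?(?= )");
Matcher m = pattern.matcher(url);
if (m.find())
    System.out.println(m.group()); // the matched text, without "archive=" and without the space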
output:
someJarFile0456799.jar
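If the ".jar" suffix should be left out as well, a capturing group is one option. This is only a sketch: it assumes the archive attribute is unquoted and contains no spaces, as in the example, and the names ArchiveName and jarBaseName are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ArchiveName {
    // "archive=" literally, then capture the shortest run of non-space
    // characters that is followed by a literal ".jar".
    private static final Pattern ARCHIVE = Pattern.compile("archive=(\\S+?)\\.jar");

    // Returns the jar name without ".jar", or null when nothing matches.
    static String jarBaseName(String html) {
        Matcher m = ARCHIVE.matcher(html);
        return m.find() ? m.group(1) : null;   // group(1) is the text inside the parentheses
    }

    public static void main(String[] args) {
        String html = "<applet name=someApplet archive=someJarFile0456799.jar code=classInsideAJarFile.class mayscript>";
        System.out.println(jarBaseName(html));
    }
}
```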
I currently have a S3 bucket directory key like this:
String dir = "s3://mybucket/workflow/science/sweet-humoor/vars";
What I am trying to do is to get the prefix of this S3 directory. A prefix is the key without s3://mybucket/, so what I want to have is workflow/science/sweet-humoor/vars
Now, what would be an elegant way to achieve this? I know the quickest way is to do a substring(14), but this will break whenever the bucket name changes.
How would you handle this?
Use a regular expression with replaceAll:
String result = directoryKey.replaceAll("s3://[^/]+/", "");
The regex here is:
s3://[^/]+/
It matches the part that you want to remove, which is s3:// followed by a bunch of non-slash characters, followed by a slash.
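A slightly more defensive variant, as a sketch: anchoring the pattern with ^ and using replaceFirst means only a leading "s3://bucket/" can be removed, even if the same text appears later in the key. S3Prefix and keyOf are names invented for this example.

```java
final class S3Prefix {
    // Removes a leading "s3://<bucket>/" and returns the rest unchanged.
    static String keyOf(String s3Uri) {
        // ^ anchors the match at the start of the string
        return s3Uri.replaceFirst("^s3://[^/]+/", "");
    }

    public static void main(String[] args) {
        System.out.println(keyOf("s3://mybucket/workflow/science/sweet-humoor/vars"));
    }
}
```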
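It's cleanest to use the Java library functions for URIs and paths instead of handling the Strings directly. What you have is a URI. Note that new URL(dir) would throw a MalformedURLException here, because Java has no built-in handler for the s3 protocol, and Paths.get(uri) would need a file system provider for that scheme, so parse it with java.net.URI and build the Path from its parts:
URI uri = URI.create(dir);
Path fullpath = Paths.get("/" + uri.getAuthority() + uri.getPath());
Now you have a Path (i.e. the "/mybucket/workflow/science/sweet-humoor/vars" part), and you can get the subpath by
// start index 1 to skip the first directory element; the end index is exclusive
Path subpath = fullpath.subpath(1, fullpath.getNameCount());
You can make a File out of this (subpath.toFile()), or just get the path string (on Windows its separators will be backslashes) by
subpath.toString();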
The URIBuilder class from the org.apache.http.client.utils package can do that.
URIBuilder builder = new URIBuilder(dir);
String thePath = builder.getPath();
This automatically extracts /workflow/science/sweet-humoor/vars from the path. The retrieved path does not include mybucket, because URIBuilder sees the first part immediately after the protocol specifier (s3://) as hostname.
Further processing can be done through Path p = Paths.get(thePath).
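The same split into bucket and key can be sketched with java.net.URI from the standard library, without the Apache dependency. This assumes the bucket name is acceptable as a URI authority; S3Parts, bucketOf and keyOf are hypothetical names.

```java
import java.net.URI;

final class S3Parts {
    // The part between "s3://" and the next '/' is the URI authority.
    static String bucketOf(String s3Uri) {
        return URI.create(s3Uri).getAuthority();
    }

    // The path starts with '/', which is dropped to get the S3 key prefix.
    static String keyOf(String s3Uri) {
        String path = URI.create(s3Uri).getPath();
        return path.startsWith("/") ? path.substring(1) : path;
    }

    public static void main(String[] args) {
        String dir = "s3://mybucket/workflow/science/sweet-humoor/vars";
        System.out.println(bucketOf(dir) + " -> " + keyOf(dir));
    }
}
```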
You can try this:
String dir2=dir.replaceAll("s3://"+dir.split("/")[2]+"/","");
String dir = "s3://mybucket/workflow/science/sweet-humoor/vars";
dir = dir.replace("//", "").substring( dir.indexOf("/") );
System.err.println(dir); // prints mybucket/workflow/science/sweet-humoor/vars
I would split the string by "/", take the values from index 3 onwards and join them with "/". Sample code in Python.
input_string = "s3://mybucket/workflow/science/sweet-humoor/vars"
list1 = (input_string.split("/"))
print(list1)
print("/".join(list1[3:]))
Output:
workflow/science/sweet-humoor/vars
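Since the question is about Java, here is a rough Java equivalent of the same idea. Passing a limit of 4 to split() leaves everything after the bucket in the last element, so no join is needed. SplitPrefix and keyOf are names made up for this sketch.

```java
final class SplitPrefix {
    // Splits into at most 4 parts: "s3:", "", the bucket, and everything after it.
    static String keyOf(String s3Uri) {
        String[] parts = s3Uri.split("/", 4);
        return parts.length == 4 ? parts[3] : "";   // "" when there is nothing after the bucket
    }

    public static void main(String[] args) {
        System.out.println(keyOf("s3://mybucket/workflow/science/sweet-humoor/vars"));
    }
}
```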
Now I know about FilenameUtils.getExtension() from apache.
But in my case I'm processing extensions from http(s) urls, so in case I have something like
https://your_url/logo.svg?position=5
this method is gonna return svg?position=5
Is there a better way to handle this situation? I mean without writing this logic myself.
You can use the java.net.URL class from the Java standard library. It has a lot of utility in these cases. You should do something like this:
String url = "https://your_url/logo.svg?position=5";
URL fileIneed = new URL(url);
Then, you have a lot of getter methods on the fileIneed variable. In your case, getPath() will retrieve this:
fileIneed.getPath() ---> "/logo.svg"
And then use the Apache library that you are using, and you will have the "svg" String.
FilenameUtils.getExtension(fileIneed.getPath()) ---> "svg"
JAVA URL library docs >>>
https://docs.oracle.com/javase/7/docs/api/java/net/URL.html
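If pulling in Commons IO just for this is not wanted, the last step can be sketched with plain String methods. The example uses java.net.URI instead of URL (a swap made here, not part of the answer above), example.com is a placeholder host, and UrlExtension / extensionOf are hypothetical names.

```java
import java.net.URI;

final class UrlExtension {
    // Returns the extension of the last path segment, or "" if there is none.
    static String extensionOf(String url) {
        String path = URI.create(url).getPath();            // the query string is not part of the path
        String name = path.substring(path.lastIndexOf('/') + 1);
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("https://example.com/logo.svg?position=5")); // prints svg
    }
}
```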
If you want a brandname® solution, then consider using the Apache method after stripping off the query string, if it exists:
String url = "https://your_url/logo.svg?position=5";
url = url.replaceAll("\\?.*$", "");
String ext = FilenameUtils.getExtension(url);
System.out.println(ext);
If you want a one-liner which does not even require an external library, then consider this option using String#replaceAll:
String url = "https://your_url/logo.svg?position=5";
String ext = url.replaceAll(".*/[^.]+\\.([^?]+)\\??.*", "$1");
System.out.println(ext);
svg
Here is an explanation of the regex pattern used above:
.*/        match everything up to, and including, the LAST path separator
[^.]+      then match one or more non-dot characters, i.e. match the filename
\.         match a dot
([^?]+)    match AND capture one or more non-? characters, which is the extension
\??.*      match an optional ? followed by the rest of the query string, if present
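One caveat with replaceAll is that it returns the input unchanged when the pattern does not match. A sketch with Pattern and Matcher makes that case explicit. The pattern below is a tightened variation written for this example, not the one from the answer: it refuses slashes inside the filename so a dot in the host name is not mistaken for an extension. RegexExtension and extensionOf are invented names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexExtension {
    // .*/          everything up to the last '/' before the file name
    // [^/?]*\.     the file name up to its last dot
    // ([^/?.]+)    the extension (captured)
    // (?:\?.*)?    an optional query string
    private static final Pattern EXT = Pattern.compile(".*/[^/?]*\\.([^/?.]+)(?:\\?.*)?");

    // Returns the extension, or "" when the URL has none.
    static String extensionOf(String url) {
        Matcher m = EXT.matcher(url);
        return m.matches() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("https://your_url/logo.svg?position=5"));
    }
}
```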
I have audio and video names like this :
- azan1(1).mp3
- Funny.mp4
So I need only the audio name as azan and the video name as Funny. I am a newbie in Android and don't know how I can get only the filename. How can I achieve this in code?
Try this way to get filename without extension:-
if (fileName.indexOf(".") > 0)
fileName = fileName.substring(0, fileName.lastIndexOf("."));
In Java, a simple and efficient way to get the filename:
String resultName= filename.split("\\.")[0].split("\\(")[0];
As mentioned by 44kksharma, you can split the String at the . to get the extension. The only problem as I can see is if the file name contains . elsewhere (for an instance filename.test.mp3) - the file is an mp3 but one could argue filename.test is a part of the file name. If you think of it like that, this is the right approach using splitting:
String resultName = filename.split("\\.mp")[0];
If you have other extensions, you can do this:
String resultName = filename.split("\\.mp|\\.wav|\\.otherformat")[0];
mp3 and mp4 both have mp in them, so files with either extension are guaranteed to contain .mp.
The | means "or" in regex.
Alternatively, you can use the replaceAll method:
String result = filename.replaceAll("\\.mp3|\\.mp4", "");
replace works too, but as it doesn't use regex and replaces every literal occurrence, I find it can end up replacing the wrong characters or otherwise messing up the result.
Finally, you could use substring too, but split(), replace() and replaceAll() each let you do it in one line (replace() being the non-regex option).
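To make sure only a trailing extension is removed, the regex can be anchored with $. This is a sketch under the assumption that .mp3 and .mp4 are the only extensions of interest; ExtensionStripper and stripMediaExtension are hypothetical names.

```java
final class ExtensionStripper {
    // Removes ".mp3" or ".mp4" only when it is at the very end of the name.
    static String stripMediaExtension(String filename) {
        return filename.replaceFirst("\\.(mp3|mp4)$", "");
    }

    public static void main(String[] args) {
        System.out.println(stripMediaExtension("Funny.mp4"));
        System.out.println(stripMediaExtension("azan1(1).mp3"));
    }
}
```

A name such as a.mp3.mp4 then becomes a.mp3 rather than a, which an unanchored replaceAll would produce.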
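String audiostr = audioname; // falls back to the unchanged name
if (audioname.contains(".mp3"))
{
    audiostr = audioname.replace(".mp3", "");
}
String videostr = videoname; // falls back to the unchanged name
if (videoname.contains(".mp4"))
{
    videostr = videoname.replace(".mp4", "");
}
Then set the resulting String on the TextView you need.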
I have this file URL: http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf which will be converted to http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new%20sample.pdf later.
Now I can get the last path by:
public static String getLastPathFromUrl(String url) {
return url.replaceFirst(".*/([^/?]+).*", "$1");
}
which will give me new sample.pdf
but how do I get the remainder of the URL: http://xxx.xxx.xx.xx/resources/upload/2014/09/02/ ?
Easier way to get last path from URL would be to use String.split function, like this:-
String url = "http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf";
String[] urlArray = url.split("/");
String lastPath = urlArray[urlArray.length-1];
This converts your URL into an array which can then be used in many ways. There are various ways to get the URL without its last path segment; one way could be to join the above generated array using this answer. Or use lastIndexOf() and substring() like this:-
String restOfUrl = url.substring(0, url.lastIndexOf("/") + 1); // + 1 keeps the trailing '/'
PS:- Although you can learn something by doing this, I think your best solution would be to replace the space with %20 in the complete URL String; that would be the fastest and make the most sense.
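Both pieces can be obtained in one go by cutting at the last slash. A small sketch, with UrlSplit and splitAtLastSlash as invented names:

```java
final class UrlSplit {
    // Returns { everything up to and including the last '/', the rest }.
    static String[] splitAtLastSlash(String url) {
        int cut = url.lastIndexOf('/') + 1;   // 0 when there is no '/' at all
        return new String[] { url.substring(0, cut), url.substring(cut) };
    }

    public static void main(String[] args) {
        String[] parts = splitAtLastSlash("http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf");
        System.out.println(parts[0]);
        System.out.println(parts[1]);
    }
}
```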
I am not sure if I understood it correctly but when you say
I have this file URL: URL/new sample.pdf which will be converted to URL/new%20sample.pdf later.
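It looks like you are trying to replace the space with %20 in the URL or, in simple words, trying to take care of unwanted characters in the URL. If that is what you need, use the pre-built
URLEncoder.encode(String s, String enc); you can use UTF-8 as the encoding. Be aware that URLEncoder produces HTML form encoding: it turns a space into + rather than %20 and also escapes : and /, so apply it to individual values rather than to the complete URL.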
http://docs.oracle.com/javase/7/docs/api/java/net/URLEncoder.html
If you really need to split it, assuming that you are interested in the URL after http://, remove http:// and store the remaining URL in a String variable called, say, remainingURL. Then use
List<String> myList = new ArrayList<>(Arrays.asList(remainingURL.split("/")));
You can iterate on myList to get rest of URL fragments.
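For getting %20 in the path specifically, one option is the multi-argument java.net.URI constructor, which quotes characters that are illegal in a path. This is a sketch, not a full URL encoder: example.com stands in for the real host, and PathEncoder / encode are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class PathEncoder {
    // Builds scheme://host/path and percent-encodes illegal path characters such as spaces.
    static String encode(String scheme, String host, String path) {
        try {
            return new URI(scheme, host, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);   // rethrown unchecked to keep the sketch short
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("http", "example.com", "/resources/upload/2014/09/02/new sample.pdf"));
    }
}
```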
I've found it:
File file=new File("http://xxx.xxx.xx.xx/resources/upload/2014/09/02/new sample.pdf");
System.out.println(file.getPath().replaceAll(file.getName(),""));
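Output on a Unix-like system (java.io.File treats the string as a file path and collapses the "//" after "http:", so File is not a reliable way to handle URLs):
http:/xxx.xxx.xx.xx/resources/upload/2014/09/02/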
Spring solution:
List<String> pathSegments = UriComponentsBuilder.fromUriString(url).build().getPathSegments();
String lastPath = pathSegments.get(pathSegments.size()-1);
I have a problem here: I have a String that contains the value C:\Users\Ewen\AppData\Roaming\MyProgram\Test.txt, and I want to remove the C:\Users\Ewen\AppData\Roaming\MyProgram\ and the extension so that only Test is left. So the question is, how can I remove any part of the string?
Thanks for your time! :)
If you're working strictly with file paths, try this
String path = "C:\\Users\\Ewen\\AppData\\Roaming\\MyProgram\\Test.txt";
File f = new File(path);
System.out.println(f.getName()); // Prints "Test.txt"
Thanks but I also want to remove the .txt
OK then, try this
String fName = f.getName();
System.out.println(fName.substring(0, fName.lastIndexOf('.')));
Please see this for more information.
The String class has all the necessary power to deal with this. Methods you may be interested in:
String.split(), String.substring(), String.lastIndexOf()
Those 3, and more, are described here: http://docs.oracle.com/javase/1.4.2/docs/api/java/lang/String.html
Give it some thought, and you'll have it working in no time :).
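For reference, here is one way those String methods could be combined. The sketch accepts both '/' and '\' as separators, which matters because java.io.File only treats '\' as a separator on Windows. BaseName and of are names invented for this example.

```java
final class BaseName {
    // Returns the file name without directories and without the last extension.
    static String of(String path) {
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        String name = path.substring(sep + 1);        // sep is -1 when there is no separator
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name; // keeps names like ".bashrc" intact
    }

    public static void main(String[] args) {
        System.out.println(of("C:\\Users\\Ewen\\AppData\\Roaming\\MyProgram\\Test.txt"));
    }
}
```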
I recommend using FilenameUtils.getBaseName(String filename). The FilenameUtils class is a part of Apache Commons IO.
According to the documentation, the method "will handle a file in either Unix or Windows format". "The text after the last forward or backslash and before the last dot is returned" as a String object.
String filename = "C:\\Users\\Ewen\\AppData\\Roaming\\MyProgram\\Test.txt";
String baseName = FilenameUtils.getBaseName(filename);
System.out.println(baseName);
The above code prints Test.