How to parse this string in Java?


prefix/dir1/dir2/dir3/dir4/..
How to parse the dir1, dir2 values out of the above string in Java?
The prefix here can be:
/usr/local/apache2/resumes

If you want to split the String at the / character, the String.split method will work:
For example:
String s = "prefix/dir1/dir2/dir3/dir4";
String[] tokens = s.split("/");
for (String t : tokens)
System.out.println(t);
Output
prefix
dir1
dir2
dir3
dir4
Edit
Case with a / in the prefix, and we know what the prefix is:
String s = "slash/prefix/dir1/dir2/dir3/dir4";
String prefix = "slash/prefix/";
String noPrefixStr = s.substring(s.indexOf(prefix) + prefix.length());
String[] tokens = noPrefixStr.split("/");
for (String t : tokens)
System.out.println(t);
The substring without the prefix "slash/prefix/" is made by the substring method. That String is then run through split.
Output:
dir1
dir2
dir3
dir4
Edit again
If this String actually represents a file path, using the File class is probably preferable to string manipulation. Classes like File, which already take into account the intricacies of dealing with file paths, are going to be more robust.
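As a rough sketch of that idea, the snippet below uses java.nio.file.Path (swapped in here for File; it is the newer path API) to strip a known prefix and list the remaining directory names. The class and method names are made up for illustration, and the code assumes the full path really does sit under the prefix.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PathSegments {
    // Hypothetical helper: returns the name elements of fullPath that follow prefix.
    // Assumes fullPath is located under prefix.
    static List<String> segmentsAfterPrefix(String prefix, String fullPath) {
        // relativize() yields the path from prefix to fullPath, e.g. dir1/dir2/...
        Path relative = Paths.get(prefix).relativize(Paths.get(fullPath));
        List<String> names = new ArrayList<>();
        // A Path is iterable over its name elements
        for (Path element : relative) {
            names.add(element.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(segmentsAfterPrefix(
                "/usr/local/apache2/resumes",
                "/usr/local/apache2/resumes/dir1/dir2/dir3/dir4"));
    }
}
```

Because the prefix is treated as a path rather than a string, a missing or extra trailing slash on the prefix should not matter here.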

...
String str = "bla!/bla/bla/";
String parts[] = str.split("/");
// To get the first "bla!"
String dir1 = parts[0];

In this case, why not use new File("prefix/dir1/dir2/dir3/dir4") and go from there?

String str = "/usr/local/apache/resumes/dir1/dir2";
String prefix = "/usr/local/apache/resumes/";
if( str.startsWith(prefix) ) {
str = str.substring(prefix.length()); // keep what follows the prefix
String parts[] = str.split("/");
// dir1=parts[0];
// dir2=parts[1];
} else {
// It doesn't start with your prefix
}

String result;
String str = "/usr/local/apache2/resumes/dir1/dir2/dir3/dir4";
String regex ="(dir)+[\\d]";
Matcher matcher = Pattern.compile( regex ).matcher( str);
while (matcher.find( ))
{
result = matcher.group();
System.out.println(result);
}
output--
dir1
dir2
dir3
dir4

Using the String.split method will certainly work, as the other answers here say.
Also, the StringTokenizer class can be used to parse the String using / as the delimiter.
import java.util.StringTokenizer;
public class Test
{
public static void main(String []args)
{
String s = "prefix/dir1/dir2/dir3/dir4/..";
StringTokenizer tokenizer = new StringTokenizer(s, "/");
String dir1 = tokenizer.nextToken();
String dir2 = tokenizer.nextToken();
System.out.println("Dir 1 : "+dir1);
System.out.println("Dir 2 : " + dir2);
}
}
Gives the output as :
Dir 1 : prefix
Dir 2 : dir1
You can find more about StringTokenizer in its Javadoc.

If it's a File, you can get the parts by creating an instance of File and then asking for its segments.
This is good because it'll work regardless of the direction of the slashes; it's platform independent (except for the "drive letters" on Windows...)
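File has no single "give me all segments" call, so one way to read the suggestion above is to walk up the parent chain with getParentFile() and collect each getName(). The class and method names below are invented for this sketch.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class FileSegments {
    // Hypothetical helper: collects the name of each path component, first to last.
    static List<String> segments(String path) {
        Deque<String> names = new ArrayDeque<>();
        // Walk from the last component up to the top; getParentFile() returns null at the end
        for (File f = new File(path); f != null; f = f.getParentFile()) {
            String name = f.getName();
            if (!name.isEmpty()) { // a root such as "/" has an empty name
                names.addFirst(name);
            }
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        System.out.println(segments("prefix/dir1/dir2/dir3/dir4"));
    }
}
```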

public class Test {
public static void main(String args[]) {
String s = "pre/fix/dir1/dir2/dir3/dir4/..";
String prefix = "pre/fix/"; // include the trailing slash so the first token is not empty
String[] tokens = s.substring(prefix.length()).split("/");
for (int i=0; i<tokens.length; i++) {
System.out.println(tokens[i]);
}
}
}

String.split(String regex) is convenient, but if you don't need regular expression handling, go with the substring(..) example, java.util.StringTokenizer, or Apache Commons Lang. Avoiding regular expressions can make the split 1 to 2 orders of magnitude faster.
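For reference, a regex-free split can be written with indexOf and substring alone. This is only a sketch (the class and method names are made up), and whether it is actually faster than String.split on a given JVM is worth measuring rather than assuming.

```java
import java.util.ArrayList;
import java.util.List;

class PlainSplit {
    // Hypothetical helper: splits s on a single character without using regex.
    // Unlike String.split, it keeps empty tokens, including a trailing one.
    static List<String> split(String s, char delimiter) {
        List<String> parts = new ArrayList<>();
        int start = 0;
        int end;
        while ((end = s.indexOf(delimiter, start)) != -1) {
            parts.add(s.substring(start, end));
            start = end + 1; // continue after the delimiter
        }
        parts.add(s.substring(start)); // whatever follows the last delimiter
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("prefix/dir1/dir2/dir3/dir4", '/'));
    }
}
```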

String s = "prefix/dir1/dir2/dir3/dir4";
String parts[] = s.split("/");
System.out.println(parts[0]); // "prefix"
System.out.println(parts[1]); // "dir1"
...

Related

How to extract number suffix from a filename

In Java I have a filename, for example ABC.12.txt.gz, and I want to extract the number 12 from it. Currently I am using lastIndexOf and extracting substrings multiple times.

You could try using pattern matching
import java.util.regex.Pattern;
import java.util.regex.Matcher;
// ... Other features
String fileName = "..."; // Filename with number extension
// \\D* skips leading non-digits; a greedy .* here would leave only the last digit for the group
Pattern pattern = Pattern.compile("^\\D*(\\d+).*$"); // Pattern to extract number
// Then try matching
Matcher matcher = pattern.matcher(fileName);
String numberExt = "";
if(matcher.matches()) {
numberExt = matcher.group(1);
} else {
// The filename has no numeric value in it.
}
// Use your numberExt here.

You can just separate every numeric part from alphanumeric ones by using a regular expression:
public static void main(String args[]) {
String str = "ABC.12.txt.gz";
String[] parts = str.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
// view the resulting parts
for (String s : parts) {
System.out.println(s);
}
// do what you want with those values...
}
This will output
ABC.
12
.txt.gz
Then take the parts you need and do what you have to do with them.

We can use something like this to extract the number from a string
String fileName="ABC.12.txt.gz";
String numberOnly= fileName.replaceAll("[^0-9]", "");
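One caveat with stripping every non-digit: if the name contains more than one number, the digits get glued together (ABC.12.v3.gz would give 123). A small sketch that takes only the first run of digits instead, with a made-up class and method name:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FirstNumber {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Hypothetical helper: returns the first run of digits in fileName, or null if there is none.
    static String firstNumber(String fileName) {
        Matcher m = DIGITS.matcher(fileName);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("ABC.12.txt.gz"));
    }
}
```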

Removing link from Text in Java?

I need to change something like this -> Hello, go here http://www.google.com for your ...
grab the link, change it in a method I made, and put it back into the string like this
-> Hello, go here http://www.yahoo.com for your...
Here is what I have so far:
if(Text.toLowerCase().contains("http://"))
{
// Do stuff
}
else if(Text.toLowerCase().contains("https://"))
{
// Do stuff
}
All I need to do is change the URL in the String to something different. The URL in the String will not always be http://www.google.com, so I cannot just say replace("http://www.google.com","")

Use regex:
String oldUrl = text.replaceAll(".*(https?://)www((\\.\\w+)+).*", "www$2");
text = text.replaceAll("(https?://)www(\\.\\w+)+", "$1" + traslateUrl(oldUrl));
Note: code changed to meet extra requirements in comments below.
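Another way to do the same find, translate, and put-back step is a Matcher loop with appendReplacement, which handles several URLs in one pass. This is a sketch under assumptions: the pattern https?://\S+ simply treats everything up to the next whitespace as the URL (so trailing punctuation would be included), and translateUrl below is a hypothetical stand-in for the asker's own method.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlRewriter {
    // Assumption: a URL runs from http:// or https:// to the next whitespace
    private static final Pattern URL = Pattern.compile("https?://\\S+");

    // Hypothetical stand-in for the asker's own translation method
    static String translateUrl(String url) {
        return url.replace("google", "yahoo");
    }

    static String rewriteUrls(String text) {
        Matcher m = URL.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement keeps '$' and '\' in the new URL from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(translateUrl(m.group())));
        }
        m.appendTail(out); // copy the text after the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(rewriteUrls("Hello, go here http://www.google.com for your ..."));
    }
}
```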

You can grab the link from the string using the code below. I assumed the string will contain only .com domains.
String input = "Hello, go here http://www.google.com";
Pattern pattern = Pattern.compile("https?://www\\.[a-z-]*\\.com"); // dots escaped so they match a literal "."
Matcher m = pattern.matcher(input);
while (m.find()) {
String str = m.group();
}

Have you tried something like:
s = s.replaceFirst("https?://\\S+", newLink); // newLink holds the replacement URL
This will find the first word beginning with http, up to the next whitespace, and replace it with whatever you want.
If you want to keep the link, you can do:
String oldURL = null;
if (s.contains("http")) {
String[] words = s.split(" ");
for (String word: words) {
if (word.contains("http")) {
oldURL = word;
break;
}
}
//then replace the url or whatever
}

You can try this
private String removeUrl(String commentstr)
{
String urlPattern = "((https?|ftp|gopher|telnet|file|Unsure|http):((//)|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*)";
Pattern p = Pattern.compile(urlPattern,Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(commentstr);
// Remove every match in one pass; the matched URL is never reused as a regex
return m.replaceAll("").trim();
}

Extract content after "=" and before "&", Regex expression in java

I want to extract part of a string: the content after the "=" and before the "&", as in this example:
asdfaf=afl10109&adsfjkl
I want to extract "afl10109" out of the string. Can anyone show me how to do this? I am very new to regular expressions...

Use replaceAll() to replace the whole input with just what you want:
String target = str.replaceAll(".*=(.*)&.*", "$1");
The target is captured in a group (group number 1), which is then referenced in the replacement string.
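One thing to keep in mind with the replaceAll() approach is that when the pattern does not match, the input comes back unchanged, so a missing value cannot be told apart from a real one. A sketch of the same capture-group idea with an explicit Matcher, which can report "no match" (the class and method names are invented here):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenEqualsAndAmp {
    // Group 1 captures everything after '=' up to (not including) the next '&'
    private static final Pattern VALUE = Pattern.compile("=([^&]*)&");

    // Hypothetical helper: returns the captured value, or null if the pattern is not found.
    static String extract(String s) {
        Matcher m = VALUE.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("asdfaf=afl10109&adsfjkl"));
    }
}
```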

try
public static void main(String args[]) {
String input="asdfaf=afl10109&adsfjkl";
Pattern pattern = Pattern.compile("=[^&]*&");
Matcher m = pattern.matcher(input);
while (m.find()) {
String str = m.group();
System.out.println( str.substring(1,str.length()-1));
}
}

This is not regex but you can also use split()
String str = "asdfaf=afl10109&adsfjkl";
System.out.println(str.split("=")[1].split("&")[0]);
Output:
afl10109

Using good old String#substring()
String str = "foo=bar&baz";
int begin = str.indexOf('=');
if (begin != -1) {
int end = str.indexOf('&', begin);
if (end != -1) {
System.out.println(str.substring(begin+1, end)); // bar
}
}

Deleting everything except last part of a String?

What kind of method would I use to make this:
http://www.site.net/files/file1.zip
To
file1.zip?

String yourString = "http://www.site.net/files/file1.zip";
int index = yourString.lastIndexOf('/');
String targetString = yourString.substring(index + 1);
System.out.println(targetString);// file1.zip

String str = "http://www.site.net/files/file1.zip";
str = str.substring(str.lastIndexOf("/")+1);

You could use regex to extract the last part:
@Test
public void extractFileNameFromUrl() {
final Matcher matcher = Pattern.compile("[\\w+.]*$").matcher("http://www.site.net/files/file1.zip");
Assert.assertEquals("file1.zip", matcher.find() ? matcher.group(0) : null);
}
It'll return only "file1.zip". Included here as a test as I used it to validate the code.

Use split:
String[] arr = "http://www.site.net/files/file1.zip".split("/");
Then:
String lastPart = arr[arr.length-1];
Update: Another simpler way to get this:
File file = new File("http://www.site.net/files/file1.zip");
System.out.printf("Path: [%s]%n", file.getName()); // file1.zip
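Since the input is a URL rather than a local file, java.net.URI may be a closer fit than File. The sketch below (made-up class and method names) takes the path component and keeps what follows the last slash; it assumes a well-formed, hierarchical URL, because URI.create throws on malformed input and getPath() can be null for opaque URIs such as mailto: addresses.

```java
import java.net.URI;

class LastSegment {
    // Hypothetical helper: returns the last path segment of a hierarchical URL.
    static String fileName(String url) {
        // getPath() excludes the query string and fragment
        String path = URI.create(url).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        System.out.println(fileName("http://www.site.net/files/file1.zip"));
    }
}
```

Unlike the plain lastIndexOf('/') on the whole string, this leaves out anything after a "?" or "#".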

Cut ':' && " " from a String with a tokenizer

Right now I am a little bit confused. I want to manipulate this string with a tokenizer:
Bob:23456:12345 Carl:09876:54321
I am using a StringTokenizer, but when I try:
String signature1 = tok.nextToken(":");
tok.nextToken(" ");
I get:
12345 Carl
However, I want to get the first int and the second int into variables.
Any ideas?

You have two different delimiters, so you should handle them separately.
First, split the space-separated values using String.split(" "), which returns a String[].
Then, for each String, split on ":" (or use a tokenizer).
I believe that will work.
Code:
String input = "Bob:23456:12345 Carl:09876:54321";
String[] words = input.split(" ");
for (String word : words) {
String[] token = word.split(":");
String name = token[0];
int value0 = Integer.parseInt(token[1]);
int value1 = Integer.parseInt(token[2]);
}

Following code should do:
String input = "Bob:23456:12345 Carl:09876:54321";
StringTokenizer st = new StringTokenizer(input, ": ");
while(st.hasMoreTokens())
{
String name = st.nextToken();
String val1 = st.nextToken();
String val2 = st.nextToken();
}

Seeing as you have multiple patterns, you cannot handle them with only one tokenizer.
You need to first split it based on whitespace, then split based on the colon.
Something like this should help:
String[] s = "Bob:23456:12345 Carl:09876:54321".split(" ");
System.out.println(Arrays.toString(s ));
String[] so = s[0].split(":", 2);
System.out.println(Arrays.toString(so));
And you'd get this:
[Bob:23456:12345, Carl:09876:54321]
[Bob, 23456:12345]
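Putting the two splits together, a minimal sketch might look like the following (class and method names are made up). It keeps the fields as Strings on purpose: a value such as 09876 loses its leading zero if it is passed through Integer.parseInt, which may or may not matter for these signatures.

```java
import java.util.ArrayList;
import java.util.List;

class Entries {
    // Hypothetical helper: returns one String[] {name, first, second} per entry.
    static List<String[]> parse(String input) {
        List<String[]> rows = new ArrayList<>();
        // First split on whitespace, then split each entry on ':'
        for (String word : input.trim().split("\\s+")) {
            String[] fields = word.split(":");
            if (fields.length == 3) { // skip anything that is not name:value:value
                rows.add(fields);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String[] row : parse("Bob:23456:12345 Carl:09876:54321")) {
            System.out.println(row[0] + " -> " + row[1] + ", " + row[2]);
        }
    }
}
```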

If you must use a tokeniser then I think you need to use it twice
String str = "Bob:23456:12345 Carl:09876:54321";
StringTokenizer spaceTokenizer = new StringTokenizer(str, " ");
while (spaceTokenizer.hasMoreTokens()) {
StringTokenizer colonTokenizer = new StringTokenizer(spaceTokenizer.nextToken(), ":");
colonTokenizer.nextToken(); // skip the name (Bob, Carl)
while (colonTokenizer.hasMoreTokens()) {
System.out.println(colonTokenizer.nextToken());
}
}
outputs
23456
12345
09876
54321
Personally though I would not use tokenizer here and use Claudio's answer which splits the strings.

