Java StringTokenizer skips characters if they are part of the delimiter - java

I have an issue using Java's StringTokenizer:
String myString = "1||2||3|||4";
StringTokenizer stp = new StringTokenizer(myString, "||");
while (stp.hasMoreTokens()) {
System.out.println(stp.nextToken());
}
Actual output: [1, 2, 3, 4]
Expected output: [1, 2, 3, '|4']
Could anyone help me with this?

Try this:
String myString = "1||2||3|||4";
String[] s=myString.split("\\|\\|");
for (String string : s) {
System.err.println(string);
}

I don't think you can do anything about this, because that is how StringTokenizer works. (You can set returnDelims to true and handle the delimiters manually, but that is often harder than it looks.)
String myString = "1||2||3|||4";
String[] tokens = myString.split("\\|\\|");
for(String token : tokens)
{
System.out.println(token);
}
You can use split which does what you want.
Output:
1
2
3
|4

It is better to use the split method of the String class for this: StringTokenizer treats each character of the given string as a separate delimiter, whereas split takes a regular expression. I would use this:
String[] splitStr = myString.split("[|]{2}");
This matches wherever the character class [|] (a single pipe) occurs twice in a row.
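If you would rather not hand-escape the delimiter, here is a minimal sketch that splits on the literal text "||" by passing it through Pattern.quote. The class and method names (LiteralSplit, splitLiteral) are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralSplit {
    // Splits on the delimiter taken literally; Pattern.quote neutralises
    // regex metacharacters such as '|'.
    static String[] splitLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        // The third pipe in "|||" is left over and stays attached to "4".
        System.out.println(Arrays.toString(splitLiteral("1||2||3|||4", "||")));
    }
}
```

The result should be the same as with "\\|\\|" or "[|]{2}"; the only difference is who does the escaping.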

You may be thinking of String.split, which splits on a delimiter pattern.
A StringTokenizer takes the delimiter string and treats every character in it as a delimiter. So in effect you just specified the "|" character redundantly, a second time.
The split function is probably what you wanted:
System.out.println(Arrays.toString("1||2||3|||4".split("\\|\\|")));
This produces
[1, 2, 3, |4]
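To see the per-character behaviour for yourself, here is a small sketch that runs StringTokenizer with "||" and with "|" as the delimiter string. The helper name tokenize and the class name are hypothetical, only there to make the comparison easy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class DelimiterSetDemo {
    // Collects all tokens; each character of delimiterChars is treated
    // as a separate delimiter by StringTokenizer.
    static List<String> tokenize(String input, String delimiterChars) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delimiterChars);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Both calls yield the same tokens, because "||" is just '|' listed twice.
        System.out.println(tokenize("1||2||3|||4", "||"));
        System.out.println(tokenize("1||2||3|||4", "|"));
    }
}
```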

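Take a look at this simple version, which uses a single "|" as the delimiter (it prints 1, 2, 3 and 4, the same tokens as the original code):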
StringTokenizer stp = new StringTokenizer(myString, "|");
while (stp.hasMoreTokens()) {
System.out.println(stp.nextToken());
}

Related

How to split string using regex without consuming the splitter part?

How would I split a string without consuming the splitter part?
Something like this, except that instead of : I'm using the regex #[a-fA-F0-9]{6}.
String from = "one:two:three";
String[] to = {"one", ":", "two", ":", "three"};
I already tried Apache Commons Lang, since it has StringUtils.splitPreserveAllTokens(), but it does not work with regex.
EDIT: I guess I should have been more specific; this is closer to what I was looking for.
String string = "Some text here #58a337test #a5fadbtest #123456test as well. "
        + "#58a337Word#a5fadbwith#123456more hex codes.";
String[] parts = string.split("#[a-fA-F0-9]{6}");
/* Desired output: ["Some text here ","#58a337","test ","#a5fadb","test ","#123456","test as well. ",
"#58a337","Word","#a5fadb","with","#123456","more hex codes."] */
EDIT 2: Solution!
final String string = "Some text here #58a337test #a5fadbtest #123456test as well. "
        + "#58a337Word#a5fadbwith#123456more hex codes.";
String[] parts = string.split("(?=#.{6})|(?<=#.{6})");
for(String s: parts) {
System.out.println(s);
}
Output:
Some text here
#58a337
test
#a5fadb
test
#123456
test as well.
#58a337
Word
#a5fadb
with
#123456
more hex codes.
You could use \\b (a word boundary; the backslash is escaped for the Java string) to split in your case:
final String string = "one:two:three";
String[] parts = string.split("\\b");
for(String s: parts) {
System.out.println(s);
}
The answer given by @vrintle (+1) is probably the tightest code that can be written for your exact input. But if your input might contain other non-word characters besides :, you can split more precisely using lookarounds:
String from = "one:two:three";
String[] parts = from.split("(?<=:)|(?=:)");
System.out.println(Arrays.toString(parts));
This prints:
[one, :, two, :, three]
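The same lookaround idea can be applied to the hex-colour case from the question. This is only a sketch, and it assumes the delimiter is always # followed by exactly six hex digits; the class and method names are invented for the example.

```java
import java.util.Arrays;

class KeepDelimiters {
    // Zero-width lookahead and lookbehind: split just before and just after
    // each '#' followed by six hex digits, so the code itself is kept as a token.
    private static final String AROUND_HEX =
            "(?=#[a-fA-F0-9]{6})|(?<=#[a-fA-F0-9]{6})";

    static String[] splitKeepingHexCodes(String input) {
        return input.split(AROUND_HEX);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(
                splitKeepingHexCodes("Some #58a337text#a5fadbhere")));
    }
}
```

Using the explicit [a-fA-F0-9] class instead of . keeps the split from firing on a # that is followed by six arbitrary characters.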

How to make Java split on a pipe

I have to split a string into an array, and the string may contain empty fields, for example:
|maria|joao|fernando||
but the empty fields at the end of the line are ignored.
I am using this regex: split("\\|")
The result should be: maria,joao,fernando,null
but I get: maria,joao,fernando
You can use:
String str = "|maria|joao|fernando||";
String[] tokens = str.replaceAll("^[|]|[|]$", "").split("[|]", -1);
//=> [maria, joao, fernando, ""]
Steps:
Strip the leading and trailing | with replaceAll, leaving maria|joao|fernando| as the input to split.
Then call split with -1 as the second argument so that trailing empty tokens are returned as well.
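To make the effect of the second argument concrete, here is a short sketch comparing split with and without a limit of -1 on the original string (no replaceAll step). The class and method names are placeholders.

```java
import java.util.Arrays;

class TrailingEmptyTokens {
    // Default split: trailing empty strings are dropped from the result.
    static String[] splitDefault(String line) {
        return line.split("\\|");
    }

    // A negative limit keeps every field, including trailing empty ones.
    static String[] splitKeepAll(String line) {
        return line.split("\\|", -1);
    }

    public static void main(String[] args) {
        String line = "|maria|joao|fernando||";
        System.out.println(Arrays.toString(splitDefault(line)));
        System.out.println(Arrays.toString(splitKeepAll(line)));
    }
}
```

Note that the leading | still produces an empty first element in both cases, which is why the answer strips it first.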
You only need to add double backslashes to your split string:
String s = "|maria|joao|fernando||";
String [] st =s.split("\\|");
for (String string : st) {
System.out.print(string+",");
}
Java 8 solution
List<String> params = Pattern
.compile("\\|")
.splitAsStream("|maria|joao|fernando||")
.filter(e -> e.trim().length() > 0) // Remove spaces only or empty strings
.collect(Collectors.toList());

Split string against some characters except the # character

I want to split a string against the following characters
~!#$%^&*()_+­=<>,.?/:;"'{}|[]\, \n,\t, space
I tried the \\s regex delimiter, but I don't want # to be treated as a split character: a string like this is #funny should be split into this, is and #funny.
I have tried the following, but it doesn't work:
"this is #funny".split("\\s")
Any ideas?
Just list the characters you want inside square brackets, which means "any of these". Escape characters that are special in Java strings once (like \") and characters that are special in regex twice (like \\[):
@Test
public void testName() throws Exception
{
String[] split = "this is #funny".split("[~!#$%^&*()_+­=<>,.?/:;\"'{}|\\[\\]\\\\ \\n\\t]");
for (String string : split)
{
logger.debug(string);
}
}
Use the replaceAll(String regex, String replacement) method of String:
String result = "this is #funny".replaceAll("[~!#$%^&*()_+­=<>,.?/:;\"'{}|\\[\\]\\,\\n\\t]", "");
System.out.println(result);
You can try to implement this:
String[] split = "this&is%a#funny^string".split("[^#\\p{Alnum}]|\\s+");
for (String string : split){
System.out.println(string);
}
Also check the Java API documentation for Pattern for more information on how to process strings.
It looks like this will work for you:
String[] split = str.split("[^a-zA-Z&&[^#]]+");
This uses character class intersection (&&) to split on non-letter characters, except the hash.
Here's some test code:
String str = "this is #funny";
String[] split = str.split("[^a-zA-Z&&[^#]]+");
System.out.println(Arrays.toString(split));
Output:
[this, is, #funny]
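The same intent can also be written without nesting, by listing # inside a single negated class. This is just a sketch with made-up names; it sidesteps the question of how ^ combines with && in a nested class, which is easy to misread and, as far as I know, has not behaved identically across Java releases.

```java
import java.util.Arrays;

class SplitExceptHash {
    // Splits on runs of characters that are neither ASCII letters nor '#'.
    static String[] splitKeepingHash(String input) {
        return input.split("[^a-zA-Z#]+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitKeepingHash("this is #funny")));
        System.out.println(Arrays.toString(splitKeepingHash("this&is%a#funny^string")));
    }
}
```

Digits are treated as separators here; add 0-9 to the class if they should stay inside tokens.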

Regex Pattern to avoid : and , in the strings

I have a string that comes from the DB.
The string looks something like this:
ABC:def,ghi:jkl,hfh:fhgh,ahf:jasg
In short, it is String:String, repeated for a large number of values.
I need to parse this string to get only the words, without any : or , and store each word in an ArrayList.
I can do it by calling split twice, but I figured that with a regex I could do it in one go and get the ArrayList.
String strLine="category:hello,good:bye,wel:come";
Pattern titlePattern = Pattern.compile("[a-z]");
Matcher titleMatcher = titlePattern.matcher(strLine);
int i=0;
while(titleMatcher.find())
{
i=titleMatcher.start();
System.out.println(strLine.charAt(i));
}
However, it is not giving me proper results. It only gives me the index of each match, and then I need to append the characters myself, which is neither logical nor efficient.
Is there any way around this?
String strLine="category:hello,good:bye,wel:come";
String a[] = strLine.split("[,:]");
for(String s :a)
System.out.println(s);
Use Java's StringTokenizer.
Sample:
StringTokenizer st = new StringTokenizer(in, ":,");
while(st.hasMoreTokens())
System.out.println(st.nextToken());
Even if you can use a regular expression to parse the entire string at once, I think it would be less readable than splitting it with multiple steps.
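For completeness, the Matcher approach from the question can be made to work by matching whole runs of non-delimiter characters and reading them with group() rather than a single [a-z] character at a time. A minimal sketch, assuming the only separators are : and , (class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordList {
    // One or more characters that are neither ':' nor ','.
    private static final Pattern WORD = Pattern.compile("[^:,]+");

    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        Matcher m = WORD.matcher(line);
        while (m.find()) {
            result.add(m.group()); // the whole matched word, not just its index
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words("category:hello,good:bye,wel:come"));
    }
}
```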

String Tokenizing in java

I need to tokenize a string using a delimiter.
StringTokenizer can tokenize the string with a given delimiter. But when there are two consecutive delimiters in the string, it does not treat the empty string between them as a token.
Thanks in advance for your help.
The second parameter to the constructor of StringTokenizer object is just a string containing all delimiters that you require.
StringTokenizer st = new StringTokenizer(str, "#!");
In this case there are two delimiters, # and !.
Consider this example :
String s = "Hello, i am using Stack Overflow;";
System.out.println("s = " + s);
String delims = " ,;";
StringTokenizer tokens = new StringTokenizer(s, delims);
while(tokens.hasMoreTokens())
System.out.println(tokens.nextToken());
Here the 3 delimiter characters (space, comma and semicolon) are dropped, so after the s = ... line the tokens printed are:
Hello
i
am
using
Stack
Overflow
Look into String.split()
This should do what you are looking for.
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html
Use the split() method of java.lang.String and pass it a regular expression that matches one or more of your delimiters.
For example, "a||b|||c||||d" could be tokenized with split("\\|{2,}"), giving the array [a, b, c, d].
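If the goal is instead to keep an empty token between two consecutive delimiters, here is a small sketch contrasting StringTokenizer with split called with a limit of -1. The names are invented for the example and the comma delimiter is just an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class EmptyTokens {
    // StringTokenizer never returns empty tokens between adjacent delimiters.
    static List<String> withTokenizer(String input, String delims) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delims);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // split with a negative limit keeps empty strings, including trailing ones.
    static List<String> withSplit(String input, String regex) {
        return Arrays.asList(input.split(regex, -1));
    }

    public static void main(String[] args) {
        System.out.println(withTokenizer("a,,b,", ",")); // two tokens
        System.out.println(withSplit("a,,b,", ","));     // four tokens, two of them empty
    }
}
```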
