String tokenizing in Java

I need to tokenize a string using a delimiter.
StringTokenizer can tokenize the string with a given delimiter. But when there are two consecutive delimiters in the string, it does not return the empty string between them as a token.
Thanks in advance for your help.
Regards,

The second parameter of the StringTokenizer constructor is a string containing all the delimiter characters that you require.
StringTokenizer st = new StringTokenizer(str, "#!");
In this case there are two delimiters: # and !
Consider this example:
String s = "Hello, i am using Stack Overflow;";
System.out.println("s = " + s);
String delims = " ,;";
StringTokenizer tokens = new StringTokenizer(s, delims);
while (tokens.hasMoreTokens())
    System.out.println(tokens.nextToken());
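Here there are 3 delimiters: space, comma and semicolon. They only separate the tokens and are not returned themselves, so the output is:
s = Hello, i am using Stack Overflow;
Hello
i
am
using
Stack
Overflow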
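For the consecutive-delimiter case in the question, one possible workaround is the three-argument constructor StringTokenizer(String str, String delim, boolean returnDelims), which returns each delimiter character as a token of its own. The sketch below rebuilds the empty tokens from that stream. The class and method names (ReturnDelimsDemo, tokensWithEmpties) are made up for illustration, and it assumes that every delimiter is a single character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class ReturnDelimsDemo {

    // Hypothetical helper: rebuilds the empty tokens that StringTokenizer
    // normally skips, by asking it to return the delimiters as tokens too.
    static List<String> tokensWithEmpties(String input, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delimiters, true);
        boolean lastWasDelimiter = true; // treat the start of input like a delimiter
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            boolean isDelimiter = token.length() == 1 && delimiters.contains(token);
            if (isDelimiter) {
                if (lastWasDelimiter) {
                    result.add(""); // two delimiters in a row: empty token
                }
                lastWasDelimiter = true;
            } else {
                result.add(token);
                lastWasDelimiter = false;
            }
        }
        if (lastWasDelimiter) {
            result.add(""); // input ended with a delimiter (or was empty)
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokensWithEmpties("a,,b", ",")); // prints [a, , b]
    }
}
```

This is more code than a call to split, so it is probably only worth it when StringTokenizer has to be used for other reasons.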

Look into String.split()
This should do what you are looking for.
http://docs.oracle.com/javase/6/docs/api/java/lang/String.html

Use the split() method of java.lang.String and pass it a regular expression that matches one or more of your delimiters.
For example, "a||b|||c||||d" can be tokenized with split("\\|{2,}"), giving the array [a, b, c, d].
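If the empty token between two consecutive delimiters should be kept, as the question asks, split can do that as well. A minimal sketch, assuming a single literal delimiter; EmptyTokenDemo and its method names are invented for this example. Pattern.quote keeps the delimiter from being read as a regex, and the limit -1 keeps trailing empty strings too.

```java
import java.util.Arrays;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class EmptyTokenDemo {

    // Splits on a literal delimiter. Empty tokens between consecutive
    // delimiters are kept; the limit -1 also keeps trailing empty tokens.
    static String[] splitKeepingEmpty(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter), -1);
    }

    // Counts tokens the way StringTokenizer sees them (empty tokens skipped).
    static int countWithTokenizer(String input, String delimiters) {
        return new StringTokenizer(input, delimiters).countTokens();
    }

    public static void main(String[] args) {
        String input = "a,,b,";
        System.out.println(Arrays.toString(splitKeepingEmpty(input, ","))); // prints [a, , b, ]
        System.out.println(countWithTokenizer(input, ","));                 // prints 2
    }
}
```

Without the -1 limit, split(",") drops the trailing empty string but still keeps the one between the two commas.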

Related

StringTokenizer vs. String.split?

Someone just asked a question on String.split() ("String split comma and parenthesis - Java") and the solution was to use StringTokenizer. Why doesn't String.split() split on parentheses?
public static void main(String[] args) {
    String a = "(id,created,employee(id,firstname," +
               "employeeType(id), lastname),location)";
    StringTokenizer tok = new StringTokenizer(a, "(), ");
    System.out.println("StringTokenizer example");
    while (tok.hasMoreElements()) {
        String b = (String) tok.nextElement();
        System.out.println(b);
    }
    System.out.println("Split example");
    String[] array = a.split("(),");
    for (String ii : array) {
        System.out.println(ii);
    }
}
Outputs:
StringTokenizer example
id
created
employee
id
firstname
employeeType
id
lastname
location
Split example
(id
created
employee(id
firstname
employeeType(id)
lastname)
location)
There was a discussion of String.split() vs. StringTokenizer at "Scanner vs. StringTokenizer vs. String.Split", but it doesn't explain the parentheses. Is this by design? What's going on here?
If you want split to split on the characters '(', ')', ',', and ' ', you need to pass a regex that matches any of those. The easiest is to use a character class:
String[] array = a.split("[(), ]");
Normally, parentheses in a regex are a grouping operator and have to be escaped if you intend them to be used as literals. However, inside a character class (between [ and ]), the parenthesis characters do not have to be escaped.
StringTokenizer does not support regular expressions. It treats each character of its delimiter string "(), " as a separate delimiter, so it splits the input wherever it encounters (, ), a comma or a space.
String.split takes a regular expression, in which parentheses group sub-expressions. Since there is nothing inside the parentheses, the group matches the empty string, so the pattern "()," behaves like "," and only the comma is used as the delimiter.
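One detail the character class alone does not cover: the input starts with a delimiter and contains runs of delimiters such as "), ", so split("[(), ]") also returns empty strings. A sketch that reproduces the StringTokenizer output with split, assuming empty strings should simply be dropped (CharClassSplitDemo and tokens are names made up for this example):

```java
import java.util.ArrayList;
import java.util.List;

class CharClassSplitDemo {

    // Splits on runs of '(', ')', ',' or ' ' and drops the empty leading
    // string that split() returns when the input starts with a delimiter.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        for (String part : input.split("[(), ]+")) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String a = "(id,created,employee(id,firstname,"
                + "employeeType(id), lastname),location)";
        // prints [id, created, employee, id, firstname, employeeType, id, lastname, location]
        System.out.println(tokens(a));
    }
}
```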

Splitting string based on delimiter

A string in the form 23,4,555,67 is taken as input via readLine, and another input is the key to search for among the elements linearly. My question is: how can we recognize the elements of the string, which are separated by commas?
You can split the String using split:
String[] tokens = "23,4,555,67".split(",");
String s = "23,4,555,67";
String[] tokens = s.split(",");
This will give you a string array with the numbers.
Alternatively, you can use a StringTokenizer (from java.util).
It is useful when your string is delimited by more than one character, but it works just as well with a single delimiter character. Your example using StringTokenizer:
StringTokenizer st = new StringTokenizer("23,4,555,67", ",");
while (st.hasMoreTokens())
    System.out.println(st.nextToken());
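To connect this to the linear search mentioned in the question, here is a minimal sketch. It assumes every element is a valid int, and the names LinearSearchDemo and indexOf are invented for illustration.

```java
class LinearSearchDemo {

    // Returns the index of key among the comma-separated numbers, or -1
    // if it is not present. Throws NumberFormatException on bad input.
    static int indexOf(String csv, int key) {
        String[] tokens = csv.split(",");
        for (int i = 0; i < tokens.length; i++) {
            // trim() tolerates spaces around a number, e.g. "23, 4"
            if (Integer.parseInt(tokens[i].trim()) == key) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("23,4,555,67", 555)); // prints 2
        System.out.println(indexOf("23,4,555,67", 5));   // prints -1
    }
}
```

Comparing parsed ints rather than raw strings means that a key of 5 does not match the element 555.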

How to replace all numbers in java string

I have strings like this: String s="ram123",d="ram varma656887"
I want strings like ram and ram varma, so how do I separate the name from the combined string?
I tried using a regex, but it is not working:
PersonName.setText(cursor.getString(cursor.getColumnIndex(cursor
.getColumnName(1))).replaceAll("[^0-9]+"));
The correct regex for matching a digit is just [0-9]; you can skip the +, since replaceAll replaces every match anyway.
However, your usage of replaceAll is wrong. It is defined as replaceAll(String regex, String replacement), so the correct code in your example would be replaceAll("[0-9]", "").
You can also use the regex \d to match digits. The regex you used has a ^ inside the character class, which negates it, so it matches any characters other than 0-9.
String s="ram123";
System.out.println(s);
/* You don't need the + because you are using the replaceAll method */
s = s.replaceAll("\\d", ""); // or you can also use [0-9]
System.out.println(s);
To remove the numbers, the following code will do the trick. Strings are immutable, so assign the result:
stringname = stringname.replaceAll("[0-9]", "");
Do it as follows:
String name = "ram varma656887";
name = name.replaceAll("[0-9]","");
System.out.println(name);//ram varma
Alternatively, you can do this:
String name = "ram varma656887";
name = name.replaceAll("\\d","");
System.out.println(name);//ram varma
Something like the following will also work for you:
String given = "ram varma656887";
String[] arr = given.split("\\d");
String data = "";
for (String x : arr) {
    data = data + x;
}
System.out.println(data);//ram varma
I think you missed the second argument of replaceAll. You need to pass an empty string as the second argument instead of leaving it out.
Try:
replaceAll(<your regexp>, "")
You can use the String replaceAll() method.
This method replaces each substring of this string that matches the given regular expression with the given replacement.
Here is the syntax of this method:
public String replaceAll(String regex, String replacement)
Here is the detail of parameters:
regex -- the regular expression to which this string is to be matched.
replacement -- the string which would replace found expression.
Return Value:
This method returns the resulting String.
For your question, use this:
String s = "ram123", d = "ram varma656887";
System.out.println("s = " + s.replaceAll("[0-9]", ""));
System.out.println("d = " + d.replaceAll("[0-9]", ""));
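To make the difference between [0-9] and the negated [^0-9] from the question concrete, here is a small sketch. DigitStripDemo, stripDigits and keepDigits are names chosen for this example only.

```java
class DigitStripDemo {

    // Removes every ASCII digit and keeps everything else.
    static String stripDigits(String input) {
        return input.replaceAll("[0-9]", "");
    }

    // The negated class from the question does the opposite:
    // it removes everything that is NOT a digit.
    static String keepDigits(String input) {
        return input.replaceAll("[^0-9]+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripDigits("ram123"));          // prints ram
        System.out.println(stripDigits("ram varma656887")); // prints ram varma
        System.out.println(keepDigits("ram varma656887"));  // prints 656887
    }
}
```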

Java StringTokenizer skips characters if they are part of the delimiter

I have an issue using the Java StringTokenizer:
String myString = "1||2||3|||4";
StringTokenizer stp = new StringTokenizer(myString, "||");
while (stp.hasMoreTokens()) {
    System.out.println(stp.nextToken());
}
actual output : [1,2,3,4]
expected output : [1,2,3,'|4']
Could anyone help me with this?
Try this:
String myString = "1||2||3|||4";
String[] s = myString.split("\\|\\|");
for (String string : s) {
    System.out.println(string);
}
I think you cannot do anything about it, because that is how StringTokenizer works. (You could pass true for returnDelims and handle the delimiters manually, but that is sometimes harder than it looks.)
String myString = "1||2||3|||4";
String[] tokens = myString.split("\\|\\|");
for (String token : tokens) {
    System.out.println(token);
}
You can use split which does what you want.
Output:
1
2
3
|4
It is recommended to use the split method of the String class for this, since StringTokenizer treats every character of the given delimiter string as a separate delimiter, whereas split takes a regular expression. I would use this:
String[] splitStr = myString.split("[|]{2}");
This matches wherever the character class [|] (a single pipe) occurs twice in a row.
You are maybe thinking of String.split, which splits on the whole delimiter pattern rather than on its individual characters.
A StringTokenizer takes the delimiter string and recognizes all characters in it as a delimiter. So in fact you redundantly specified the "|" character a second time.
The split function is probably what you wanted:
System.out.println(Arrays.toString("1||2||3|||4".split("\\|\\|")));
This produces
[1, 2, 3, |4]
Note that for StringTokenizer a single "|" delimiter behaves the same as "||"; this still prints 1, 2, 3 and 4, not |4:
StringTokenizer stp = new StringTokenizer(myString, "|");
while (stp.hasMoreTokens()) {
    System.out.println(stp.nextToken());
}
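The two behaviours discussed in these answers can be compared side by side. A sketch with invented names (PipeDelimiterDemo, withTokenizer, withSplit); Pattern.quote is used here in place of hand-escaping each pipe, so that the delimiter is treated as one literal string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

class PipeDelimiterDemo {

    // StringTokenizer: every character of 'delimiters' is a delimiter,
    // so "||" means exactly the same as "|".
    static List<String> withTokenizer(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    // split: the whole delimiter string must match; Pattern.quote turns it
    // into a literal so the pipes are not read as regex alternation.
    static List<String> withSplit(String input, String delimiter) {
        return Arrays.asList(input.split(Pattern.quote(delimiter)));
    }

    public static void main(String[] args) {
        String myString = "1||2||3|||4";
        System.out.println(withTokenizer(myString, "||")); // prints [1, 2, 3, 4]
        System.out.println(withSplit(myString, "||"));     // prints [1, 2, 3, |4]
    }
}
```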

Regex Pattern to avoid : and , in the strings

I have a string which comes from the DB.
The string looks something like this:
ABC:def,ghi:jkl,hfh:fhgh,ahf:jasg
In short, it is String:String pairs separated by commas, repeated for many values.
I need to parse this string to get only the words, without any : or , and store each word in an ArrayList.
I can do it using the split function (twice), but I figured that using a regex I could do it in one go and get the ArrayList.
String strLine = "category:hello,good:bye,wel:come";
Pattern titlePattern = Pattern.compile("[a-z]");
Matcher titleMatcher = titlePattern.matcher(strLine);
int i = 0;
while (titleMatcher.find()) {
    i = titleMatcher.start();
    System.out.println(strLine.charAt(i));
}
However, it is not giving me proper results. It ends up giving me the index of each match found, and then I need to append the characters, which is neither logical nor efficient.
Is there any way around this?
String strLine = "category:hello,good:bye,wel:come";
String[] a = strLine.split("[,:]");
for (String s : a)
    System.out.println(s);
Use Java's StringTokenizer.
Sample:
StringTokenizer st = new StringTokenizer(in, ":,");
while (st.hasMoreTokens())
    System.out.println(st.nextToken());
Even if you can use a regular expression to parse the entire string at once, I think it would be less readable than splitting it with multiple steps.
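For completeness, the Matcher approach from the question can be made to work by matching whole runs of characters and reading them with group(), not one character at a time via start(). A sketch under the assumption that a word is any run of characters other than : and , (WordExtractDemo and words are illustrative names):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordExtractDemo {

    // One or more characters that are neither ':' nor ','.
    private static final Pattern WORD = Pattern.compile("[^:,]+");

    // Collects every word of the line into a list, in order of appearance.
    static List<String> words(String line) {
        List<String> result = new ArrayList<>();
        Matcher matcher = WORD.matcher(line);
        while (matcher.find()) {
            result.add(matcher.group()); // the matched text, not its index
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [category, hello, good, bye, wel, come]
        System.out.println(words("category:hello,good:bye,wel:come"));
    }
}
```

Unlike the [a-z] pattern in the question, this also picks up upper-case words such as ABC in the sample data.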
