How to scan words in a text without newline character? - java

I am having trouble understanding how the Scanner class works. From this input:
AAAA BBBG GREZZ
ADFG GTRE
FREZZ
I would like to get this output as an ArrayList:
[AAAA, BBBG, GREZZ, ADFG, GTRE, FREZZ]
My code is the following:
System.out.println("list of words?");
Scanner scan3 = new Scanner(System.in);
scan3.useDelimiter("[\\s+\\n]");
ArrayList<String> test = new ArrayList<String>();
while (scan3.hasNext()) {
    String temp = scan3.next();
    if (temp.equals("STOP")) {
        break;
    } else {
        test.add(temp);
    }
}
System.out.println(test);
My input is:
AAAA BBBG GREZZ
ADFG GTRE
FREZZ STOP
And my output is:
[AAAA, BBBG, GREZZ, , ADFG, GTRE, , FREZZ]
My question is twofold:
Why do I have an empty element inserted in the list (which is, I believe, related to the new line)?
As you might have noticed, I add the string "STOP" at the end in order to stop the loop, because the condition scan3.hasNext() always returns "true". Is that the only way to proceed?
Many thanks for your help.

Just use String#split() with a suitable delimiter. It sounds like in this case you just need "[ \\n]+" (that's a space and a \n), but you may want to tweak that a bit; for example, you might want to split on all whitespace with "\\s+".
The reason for the empty entry in the list is that your delimiter matches only a single character: inside the square brackets the + is a literal plus sign, not a quantifier. Whenever two delimiter characters are adjacent (for example a space followed by a line ending, or the \r\n pair that ends a line on Windows), the scanner sees an empty token between them. You need to modify the regex so it matches a run of one or more characters.
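As a rough illustration of the split-based approach, the sketch below splits a multi-line string on runs of whitespace. The class and method names (SplitWordsDemo, words) are made up for this example, and the input is hard-coded rather than read from System.in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitWordsDemo {
    // Splits on runs of whitespace (spaces, tabs, \r, \n), so adjacent
    // delimiters do not produce empty entries between words.
    static List<String> words(String text) {
        String trimmed = text.trim(); // avoid an empty first element
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        String input = "AAAA BBBG GREZZ\r\nADFG GTRE\r\nFREZZ";
        System.out.println(words(input));
    }
}
```

The trim() call matters because split keeps an empty first element when the text starts with a delimiter.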

Your + appears to be in the wrong place. Try instead using scan3.useDelimiter("[\\s\\n]+");
Regarding,
As you might have noticed, I add the string "STOP" at the end in order to stop the loop, because the condition scan3.hasNext() always returns "true". Is that the only way to proceed?
What other way would you wish to proceed? What other stop condition do you propose to use?
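For what it is worth, here is a minimal sketch of the Scanner variant with a corrected delimiter. It assumes the words arrive through any Scanner (a String-backed one is used here so the example runs on its own); ScanWordsDemo and readWords are illustrative names. On System.in, hasNext() does not really "always return true": it blocks until more input or end of input arrives, so the loop also ends when the user closes the stream (typically Ctrl+D on Linux/macOS, Ctrl+Z then Enter on Windows).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ScanWordsDemo {
    // Reads whitespace-separated words until "STOP" or end of input.
    static List<String> readWords(Scanner scanner) {
        scanner.useDelimiter("\\s+"); // one or more whitespace characters
        List<String> words = new ArrayList<>();
        while (scanner.hasNext()) {
            String word = scanner.next();
            if (word.equals("STOP")) {
                break; // optional sentinel, as in the question
            }
            words.add(word);
        }
        return words;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner("AAAA BBBG GREZZ\r\nADFG GTRE\r\nFREZZ STOP");
        System.out.println(readWords(scanner));
    }
}
```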

Related

Java or Eclipse 'fails' on Whitespaces(?)

After many questions asked by other users this is my first one for which I was not able to find a fitting answer.
However, the problem sounds weird and actually is:
I have had more than one situation in which whitespaces were part of the problem, and the common solutions to be found on Stack Overflow or elsewhere did not help me.
First I wanted to split a String on whitespaces. Should be something like
String[] str = input.split(" ");
But neither (" ") nor any regex like ("\\s+") worked for me. Not really a problem at all. I just chose a different character to split on. :)
Now I'm trying to clean up a string by removing all whitespaces. Common solution to find is
String str = input.replaceAll(" ", "");
I tried to use the regex again and also (" *", "") to prevent an exception if the string includes no whitespaces. Again, none of these worked for me.
Now I'm asking myself whether this is a kind of weird problem on my Java/Eclipse platform or if I'm doing something basically wrong. Technically I do not think so, because all the code above works fine with any other character to split/clean on.
Hope to have made myself understood.
Regards Drebin
edit to make it clearer:
I'm caring just about the "replacing" right now.
My code does accept a combination of values and names separated by comma and series of these separated by semicolon, e.g.:
1,abc;2,def;3,ghi
This gets split twice, first on comma, then on semicolon. That works fine.
Now I want to clear such an input by removing all whitespaces to proceed as explained above. Therefore I use, as already explained, String.replaceAll(" ", ""), but it does NOT work. Instead, everything in the string after the FIRST whitespace, no matter where it is, gets removed and is lost. E.g. the String from above would change to
1,abc;
if there is whitespace after the first semicolon.
Hope this part of code works for you:
import java.util.*;
public class Main {
    public static void main(String[] args) {
        // some info output
        Scanner scan = new Scanner(System.in);
        String input;
        System.out.println("\n wait for input ...");
        input = scan.next();
        if (input.equals("info"))
        {
            // special case for information
        }
        else if (input.equals("end"))
        {
            scan.close();
            System.exit(0);
        }
        else
        {
            // here is the problem:
            String input2 = input.replaceAll(" ", "");
            System.out.println("DEBUG: " + input2);
            // further code for cleared Strings
        }
    }
}
I really do not know how to make it even clearer now ...
The next method of Scanner returns the next token - with the default delimiters that will be a single word, not the complete line.
Use the nextLine method if you want to get the complete line.
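To make the difference concrete, here is a small sketch that feeds the question's sample line (with a space after the first semicolon) to both methods. The class and method names are invented for the example, and a String-backed Scanner stands in for System.in.

```java
import java.util.Scanner;

class NextVsNextLineDemo {
    // next() stops at the first whitespace, so only the first token comes back.
    static String firstToken(String text) {
        try (Scanner scan = new Scanner(text)) {
            return scan.next();
        }
    }

    // nextLine() returns the whole line; spaces can then be removed safely.
    static String lineWithoutSpaces(String text) {
        try (Scanner scan = new Scanner(text)) {
            return scan.nextLine().replace(" ", "");
        }
    }

    public static void main(String[] args) {
        String typed = "1,abc; 2,def;3,ghi";
        System.out.println(firstToken(typed));        // 1,abc;
        System.out.println(lineWithoutSpaces(typed)); // 1,abc;2,def;3,ghi
    }
}
```

This matches the symptom described in the question: the text after the first whitespace was never read, so replaceAll had nothing left to clean.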

How to remove first string and comma from a text file

I have a text file whose content looks like [1,'I am java, and I am happy, I am.....']. I want to remove the first integer and the comma. When I ran my code, the result started after the last comma instead: I am......
If you only want to remove commas from a String, you can use String.replaceAll(",",""); If you want to replace them by spaces, use String.replaceAll(","," "):
while ((line = br.readLine()) != null) {
contents.append(line.replaceAll(",", " "));
}
Also in your code you seem to split the input, but don't use the result of this operation.
You need to use indexOf, which returns the index within this string of the first occurrence of the specified character.
lastIndexOf returns the index within this string of the last occurrence of the specified substring, searching backward.
System.out.print(s.substring(s.indexOf(",")+1));
Use the following code:
System.out.println(line.substring(2));
substring takes the beginning index as a parameter and returns the part of the string from that index to the end.
Note that you are using lastIndexOf(). Use indexOf() to get the first index as shown below.
System.out.println(line.substring(line.indexOf(',') + 1));
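The difference between the two lookups can be seen side by side in the sketch below. The helper names are hypothetical, and returning the line unchanged when there is no comma is an assumption made for this example.

```java
class FirstCommaDemo {
    // Everything after the FIRST comma; the line is returned unchanged if there is none.
    static String afterFirstComma(String line) {
        int comma = line.indexOf(',');
        return comma < 0 ? line : line.substring(comma + 1);
    }

    // Everything after the LAST comma, which reproduces the behaviour described in the question.
    static String afterLastComma(String line) {
        int comma = line.lastIndexOf(',');
        return comma < 0 ? line : line.substring(comma + 1);
    }

    public static void main(String[] args) {
        String line = "[1,'I am java, and I am happy, I am.....']";
        System.out.println(afterFirstComma(line));
        System.out.println(afterLastComma(line));
    }
}
```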
I'm taking your String literally, but you could use String#replaceFirst, for example...
String text = "[1,'I am java, and I am happy, I am.....']";
text = text.replaceFirst("\\[\\d,", "[");
System.out.println(text);
Which outputs...
['I am java, and I am happy, I am.....']
If you want to update the file, you are either going to have to read all the lines into some kind of List (modifying them as you please) and once finished, write the List back to the file (after you've closed it after reading it).
Alternatively, you could write each updated line to a second file and, once you're finished, close both files, delete the first and rename the second back in its place...
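The first option (read all lines, modify them, write them back) might look roughly like the sketch below. It uses java.nio.file.Files instead of a BufferedReader, which is a substitution on my part, and it widens the pattern to \\d+ so multi-digit numbers are handled too. The names RewriteFileDemo, transform and rewrite are illustrative, and the demo works on a temporary file rather than the asker's real one.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RewriteFileDemo {
    // Removes a leading "[<digits>," from each line, keeping the "[".
    static List<String> transform(List<String> lines) {
        List<String> updated = new ArrayList<>();
        for (String line : lines) {
            updated.add(line.replaceFirst("\\[\\d+,", "["));
        }
        return updated;
    }

    // Reads the whole file, transforms it in memory, then overwrites the file.
    static void rewrite(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        Files.write(file, transform(lines), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("demo", ".txt");
        try {
            Files.write(file, Arrays.asList("[1,'I am java, and I am happy']", "[22,'second line']"),
                    StandardCharsets.UTF_8);
            rewrite(file);
            Files.readAllLines(file, StandardCharsets.UTF_8).forEach(System.out::println);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Reading everything into memory is fine for small files; for very large ones the second-file approach described above is the safer choice.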
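Try this code:
String[] s = line.split(",");
// Rejoin everything after the first field (note: the remaining commas are dropped).
String m = "";
for (int i = 1; i < s.length; i++)
{
    m = m + s[i];
}
// Append to the output buffer (contents), not to the reader.
contents.append(m);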
String input = "[1,'I am java, and I am happy, I am.....']";
//Getting String after first comma
String output = StringUtils.substringAfter(input, ",");
System.out.println("Output:"+output);
//replacing commas;
System.out.println("Final o/p:"+StringUtils.replace(output, ",",""));
You can use the methods in the StringUtils class for string manipulation. To use StringUtils, you need to add the Apache Commons Lang jar to your classpath and import the class. This API covers many common String operations. For more details, you can see the link
http://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringUtils.html

Removing items from String

I am trying to replace all occurrences of a substring from a String.
I want to replace "\t\t\t" with "<3tabs>"
I want to replace "\t\t\t\t\t\t" with "<6tabs>"
I want to replace "\t\t\t\t" with "< >"
I am using
s = s.replace("\t\t\t\t", "< >");
s = s.replace("\t\t\t", "<3tabs>");
s = s.replace("\t\t\t\t\t\t", "<6tabs>");
But it is no use: it does not replace anything. Then I tried using
s = s.replaceAll("\t\t\t\t", "< >");
s = s.replaceAll("\t\t\t", "<3tabs>");
s = s.replaceAll("\t\t\t\t\t\t", "<6tabs>");
Again, no use: it does not replace anything. After trying these two methods, I tried StringBuilder.
I was able to replace the items through StringBuilder. My question is: why am I unable to replace the items directly through String with the above two commands? Is there any method with which I can directly replace items in a String?
Try in this order:
String s = "This\t\t\t\t\t\tis\t\t\texample\t\t\t\t";
s = s.replace("\t\t\t\t\t\t", "<6tabs>");
s = s.replace("\t\t\t\t", "< >");
s = s.replace("\t\t\t", "<3tabs>");
System.out.print(s);
output:
This<6tabs>is<3tabs>example< >
<6tabs> is never going to find a match, because the checks before it will already have replaced part of every six-tab run.
You need to start with largest match first.
Strings are immutable so you can't directly modify them, s.replace() returns a new String with the modifications present in it. You then assign that back to s though so it should work fine.
Put things in the correct order and step through it with a debugger to see what is happening.
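To see the effect of the ordering without a debugger, the following sketch runs the question's order and the longest-first order on the same input. The replacement strings, including "< >", are copied verbatim from the question; the class and method names are made up for this example.

```java
class ReplaceOrderDemo {
    // The order used in the question: four tabs, then three, then six.
    static String questionOrder(String s) {
        s = s.replace("\t\t\t\t", "< >");
        s = s.replace("\t\t\t", "<3tabs>");
        s = s.replace("\t\t\t\t\t\t", "<6tabs>"); // can never match by now
        return s;
    }

    // Longest pattern first, so shorter patterns cannot eat part of a longer run.
    static String longestFirst(String s) {
        s = s.replace("\t\t\t\t\t\t", "<6tabs>");
        s = s.replace("\t\t\t\t", "< >");
        s = s.replace("\t\t\t", "<3tabs>");
        return s;
    }

    public static void main(String[] args) {
        String sixTabs = "a\t\t\t\t\t\tb";
        // Tabs are made visible as \t for printing.
        System.out.println(questionOrder(sixTabs).replace("\t", "\\t"));
        System.out.println(longestFirst(sixTabs).replace("\t", "\\t"));
    }
}
```

With the question's order, the six-tab run is split into a four-tab match plus two leftover tabs, which is why "<6tabs>" never shows up.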
Go through your text, divide it into a char[] array, then use a for loop to go through the individual characters.
Don't print them out straight, but print them using a %x tag (or %d if you like decimal numbers).
char[] characters = myString.toCharArray();
for (char c : characters)
{
    // print each character's code in hexadecimal, one per line
    System.out.printf("%x%n", (int) c);
}
Get an ASCII table and look up all the numbers for the characters, and see whether there are any \n or \f or \r. Do this before or after.
Different operating systems use different line terminating characters: Windows uses \r\n and Linux uses \n. You should find out which one you have from your example. Obviously, if you strip \n and leave \r, you will still have the text break into separate lines.
You might be more successful if you write a regular expression (see this part of the Java Tutorial, etc) which includes whitespace and line terminators, and use it as a delimiter with the String.split() method, then print the individual tokens in order.

Reading a line from a text file Java

I am trying to read a line from a file using BufferedReader and Scanner. I can create both of those no problem. What I am looking to do is read one line, count the number of commas in that line, and then go back and grab each individual item. So if the file looked like this:
item1,item2,item3,etc.
item4,item5,item6,etc.
The program would return that there are four commas, and then go back and get one item at a time. It would repeat for the next line. Returning the number of commas is crucial for my program, otherwise, I would just use the Scanner.useDelimiter() method. I also don't know how to return to the beginning of the line to grab each item.
Why not just split the String. The split method accepts a delimiter (regex) as an argument and breaks the String into a String[]. This will eliminate the need to return to the beginning.
String value = "item1,item2,item3";
String[] tokens = value.split(",");
To get the number of commas, just use tokens.length - 1 (as long as the line does not end with a comma, because split drops trailing empty strings).
String.split() Documentation
split() can be used to achieve this, e.g.:
String line = "item1,item2,item3";
String[] words = line.split(",");
If you absolutely must know the number of commas, a similar question has already been answered:
Java: How do I count the number of occurrences of a char in a String?
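Putting the two ideas together, a minimal sketch could count the commas directly and split with a negative limit so that trailing empty items are kept. CommaCountDemo, countCommas and items are illustrative names, not part of any answer above.

```java
class CommaCountDemo {
    // Counts commas by scanning the characters once.
    static int countCommas(String line) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == ',') {
                count++;
            }
        }
        return count;
    }

    // A negative limit keeps trailing empty strings, so length - 1 always equals the comma count.
    static String[] items(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "item1,item2,item3,etc.";
        System.out.println(countCommas(line) + " commas");
        for (String item : items(line)) {
            System.out.println(item);
        }
    }
}
```

There is no need to "go back" to the start of the line: once the line is held in a String, it can be inspected as many times as needed.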

Print string up to a certain word - Java

I was wondering in Java how I could print a string until it reaches the word "quit" in that string and then instantly stop printing at that point. For instance if the string value was:
"Hi there this is a random string quit this should not be printed"
All that should be printed is "Hi there this is a random string".
I was trying something like this, but I believe it to be wrong.
if ( input.indexOf( "quit" ) > -1 )
{
//code to stop printing here
}
Instead of thinking about the problem as "how to stop printing" (because once you start printing something in Java it's pretty hard to stop it), think about it in terms of "How can I print only the words up to a certain point?" For example:
int quit_position = input.indexOf("quit");
if (quit_position >= 0) {
System.out.println(input.substring(0, quit_position));
} else {
System.out.println(input);
}
Looks like homework, so this answer is in homework style. :-)
You're on the right track.
Save the value of that indexOf to an integer.
Then it's like you have a finger pointing at the right spot - ie, at the end of the substring you really want to print.
That's a hint anyway...
EDIT: Looks like people are giving it to you anyway. But here are some more thoughts:
You might want to think about upper and lower case as well.
Also consider what you are going to do if 'quit' is not there.
Also, the solutions here don't strictly solve your problem: they'll print unnecessary spaces too, after the last word ends and before 'quit' starts. If that is a problem, you could consider String tokenization or an adaptation of the replaceAll solution to also cover the whitespace leading into 'quit'.
This has a one-line solution:
System.out.println(input.replaceAll("quit.*", ""));
String.replaceAll() takes a regex to match, which I've specified to be "the literal 'quit' and everything following", which is to be replaced by an empty string "" (i.e. effectively deleted) in the returned String.
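One possible adaptation of the regex approach, sketched here under a few assumptions of my own: "quit" should match regardless of case, it should only match as a whole word (so "quite" is left alone), and the whitespace just before it should be dropped as well. QuitCutDemo and beforeQuit are hypothetical names.

```java
import java.util.regex.Pattern;

class QuitCutDemo {
    // Optional whitespace, the whole word "quit" in any case, then everything after it.
    private static final Pattern QUIT_AND_REST =
            Pattern.compile("\\s*\\bquit\\b.*", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the text before the first standalone "quit", or the input unchanged if there is none.
    static String beforeQuit(String input) {
        return QUIT_AND_REST.matcher(input).replaceFirst("");
    }

    public static void main(String[] args) {
        String input = "Hi there this is a random string quit this should not be printed";
        System.out.println(beforeQuit(input));
    }
}
```

DOTALL is included so that the cut also covers any later lines when the input spans more than one line.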
If you don't mind trailing spaces in your string
int index = input.indexOf("quit");
if (index == -1) index = input.length();
return input.substring(0, index);
