I'm trying to read an InputStream of String tokens with a Scanner. Every token ends with a comma ,. An empty string "" is also a valid token. In that case the whole token is just the comma that ends it.
The InputStream is slowly read from another process, and any tokens should be handled as soon as they have been fully read. Therefore reading the whole InputStream to a String is out of the question.
An example input could look like this:
ab,,cde,fg,
If I set the delimiter of the Scanner to a comma, it seems to handle the job just fine.
InputStream input = slowlyArrivingStreamWithValues("ab,,cde,fg,");
Scanner scan = new Scanner(input);
scan.useDelimiter(Pattern.quote(","));
while (scan.hasNext()) {
System.out.println(scan.next());
}
output:
ab
(empty line)
cde
fg
However, problems appear when the stream begins with an empty token. For some reason Scanner ignores the first token if it is empty.
/* begins with empty token */
InputStream input = slowlyArrivingStreamWithValues(",ab,,cde,fg,");
...
output:
ab
(empty line)
cde
fg
Why does Scanner ignore the first token? How can I include it?
Try using a lookbehind as the pattern:
(?<=,)
and then strip the comma from each token you match. With this zero-width delimiter every token keeps its terminating comma, so an empty token arrives as a lone ",". Consider the following code:
String input = ",ab,,cde,fg,";
Scanner scan = new Scanner(input);
scan.useDelimiter("(?<=,)");
while (scan.hasNext()) {
System.out.println(scan.next().replaceAll(",", ""));
}
This outputs the following:
(empty line)
ab
(empty line)
cde
fg
It's easier if you write it yourself, without using Scanner:
static List<String> getValues(String source){
List<String> list = new ArrayList<String>();
for(int i = 0; i < source.length();i++){
String s = "";
while(source.charAt(i) != ','){
s+=source.charAt(i++);
if(i >= source.length()) break;
}
list.add(s);
}
return list;
}
For example, if source = ",a,,b,,c,d,e", the output will be "", "a", "", "b", "", "c", "d", "e".
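Since the question is about a slowly arriving stream rather than a String, the same hand-written loop could be applied to a Reader. The following is only a minimal sketch under that assumption; CommaTokenReader and readTokens are hypothetical names, not part of the original answer. Each token is handed to the caller as soon as its terminating comma has been read, and empty tokens are kept.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only.
final class CommaTokenReader {

    // Reads one character at a time and passes each comma-terminated token
    // to the handler as soon as its closing comma arrives.
    static void readTokens(Reader in, Consumer<String> handler) {
        StringBuilder token = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                if (c == ',') {
                    handler.accept(token.toString()); // may be the empty string
                    token.setLength(0);
                } else {
                    token.append((char) c);
                }
            }
            // Characters after the last comma form an unterminated token
            // and are deliberately dropped in this sketch.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        readTokens(new StringReader(",ab,,cde,fg,"), tokens::add);
        System.out.println(tokens); // prints [, ab, , cde, fg]
    }
}
```

For a real InputStream you would wrap it first, for example with new InputStreamReader(input, StandardCharsets.UTF_8), assuming the other process writes UTF-8.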
Related
I have a file in the following format, records are separated by newline but some records have line feed in them, like below. I need to get each record and process them separately. The file could be a few Mb in size.
<?aaaaa>
<?bbbb
bb>
<?cccccc>
I have the code:
FileInputStream fs = new FileInputStream(FILE_PATH_NAME);
Scanner scanner = new Scanner(fs);
scanner.useDelimiter(Pattern.compile("<\\?"));
while (scanner.hasNext()) {
String line = scanner.next();
System.out.println(line);
}
scanner.close();
But the result I got has the beginning <? removed:
aaaaa>
bbbb
bb>
cccccc>
I know the Scanner consumes any input that matches the delimiter pattern. All I can think of is to add the delimiter pattern back to each record manually.
Is there a way to NOT have the delimiter pattern removed?
Break on a newline only when preceded by a ">" char:
scanner.useDelimiter("(?<=>)\\R"); // Note you can pass a string directly
\R matches any line break sequence, independent of the platform (available since Java 8)
(?<=>) is a look behind that asserts (without consuming) that the previous char is a >
Plus it's cool because <=> looks like Darth Vader's TIE fighter.
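To see the delimiter in action, here is a small self-contained sketch. It reads from a String instead of a file to keep the example short; RecordSplitter and splitRecords are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper for illustration only.
final class RecordSplitter {

    // Splits on a line break only when the previous character is '>',
    // so line breaks inside a record stay part of that record.
    static List<String> splitRecords(String text) {
        List<String> records = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter("(?<=>)\\R");
            while (scanner.hasNext()) {
                records.add(scanner.next());
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String sample = "<?aaaaa>\n<?bbbb\nbb>\n<?cccccc>";
        for (String record : splitRecords(sample)) {
            // Brackets make the record boundaries visible.
            System.out.println("[" + record + "]");
        }
    }
}
```

Because the lookbehind consumes nothing, each record keeps both its leading <? and its trailing >, and the second record still contains its embedded line break.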
I'm assuming you want to ignore the newline character '\n' everywhere.
I would read the whole file into a String and then remove all of the '\n's in the String. The part of the code this question is about looks like this:
String fileString = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
fileString = fileString.replace("\n", "");
Scanner scanner = new Scanner(fileString);
... //your code
Feel free to ask any further questions you might have!
Here is one way of doing it by using a StringBuilder:
public static void main(String[] args) throws FileNotFoundException {
Scanner in = new Scanner(new File("C:\\test.txt"));
StringBuilder builder = new StringBuilder();
String input = null;
while (in.hasNextLine() && null != (input = in.nextLine())) {
for (int x = 0; x < input.length(); x++) {
builder.append(input.charAt(x));
if (input.charAt(x) == '>') {
System.out.println(builder.toString());
builder = new StringBuilder();
}
}
}
in.close();
}
Input:
<?aaaaa>
<?bbbb
bb>
<?cccccc>
Output (nextLine() drops the line break, so the two halves of the second record are joined directly):
<?aaaaa>
<?bbbbbb>
<?cccccc>
I know that by default, the Scanner skips over whitespaces and newlines.
There is something wrong with my code because my Scanner does not ignore "\n".
For example: the input is "this is\na test." and the desired output is "this is a test."
This is what I have so far:
Scanner scan = new Scanner(System.in);
String token = scan.nextLine();
String[] output = token.split("\\s+");
for (int i = 0; i < output.length; i++) {
if (hashmap.containsKey(output[i])) {
output[i] = hashmap.get(output[i]);
}
System.out.print(output[i]);
if (i != output.length - 1) {
System.out.print(" ");
}
}
nextLine() ignores the specified delimiter (as optionally set by useDelimiter()), and reads to the end of the current line.
Since input is two lines:
this is
a test.
only the first line (this is) is returned.
You then split that on whitespace, so output will contain [this, is].
Since you never use the scanner again, the second line (a test.) will never be read.
In essence, your title is right on point: Java Scanner does not ignore new lines (\n)
It specifically processed the newline when you called nextLine().
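One way to act on this is to keep calling nextLine() until the input is exhausted and join the lines with a space. This is a minimal sketch, not the asker's code; LineJoiner and joinLines are hypothetical names, and the word replacement via the hashmap is left out.

```java
import java.util.Scanner;
import java.util.StringJoiner;

// Hypothetical helper for illustration only.
final class LineJoiner {

    // Reads every remaining line and joins them with a single space,
    // so line breaks in the input end up as ordinary separators.
    static String joinLines(Scanner scanner) {
        StringJoiner joined = new StringJoiner(" ");
        while (scanner.hasNextLine()) {
            joined.add(scanner.nextLine());
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Scanner scan = new Scanner("this is\na test.");
        System.out.println(joinLines(scan)); // prints: this is a test.
    }
}
```

With System.in as the source, the loop ends only when the input reaches end of file.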
You don't have to use a Scanner to do this:
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
String result = in.lines().collect(Collectors.joining(" "));
Or, if you really want to use a Scanner, this should also work:
Scanner scanner = new Scanner(System.in);
Spliterator<String> si = Spliterators.spliteratorUnknownSize(scanner, Spliterator.ORDERED);
String result = StreamSupport.stream(si, false).collect(Collectors.joining(" "));
The code below is adapted from a reference I found online, so there might be some similarities. I'm trying to remove an entire line based on its first field (aaaa or bbbb in this example). The file uses "|" as the delimiter, but my code is not working. Do I need to split the line first, or is my approach wrong?
Data in players.dat (example):
bbbb|aaaaa|cccc
aaaa|bbbbbb|cccc
Code is below
public class testcode {
public static void main(String[] args)throws IOException
{
File inputFile = new File("players.dat");
File tempFile = new File ("temp.dat");
BufferedReader read = new BufferedReader(new FileReader(inputFile));
BufferedWriter write = new BufferedWriter(new FileWriter(tempFile));
Scanner UserInput = new Scanner(System.in);
System.out.println("Please Enter Username:");
String UserIn = UserInput.nextLine();
String lineToRemove = UserIn;
String currentLine;
while((currentLine = read.readLine()) != null) {
// trim newline when comparing with lineToRemove
String trimmedLine = currentLine.trim();
if(trimmedLine.equals(lineToRemove)) continue;
write.write(currentLine + System.getProperty("line.separator"));
}
write.close();
read.close();
boolean success = tempFile.renameTo(inputFile);
}
}
Your code compares the entire line it reads from the file to the user name the user enters, but you say in your question that you actually only want to compare to the first part up to the first pipe (|). Your code doesn't do that.
What you need to do is read the line from the file, get the part of the string up to the first pipe symbol (split the string) and skip the line based on comparing the first part of the split string to the lineToRemove variable.
To make it easier, you could also add the pipe symbol to the user input and then do this:
String lineToRemove = UserIn + "|";
...
if (trimmedLine.startsWith(lineToRemove)) continue;
This spares you from splitting the string.
Scanner.nextLine() returns the line without its line separator, so no newline ends up in UserIn. Trimming still guards against stray spaces around the name, so to be safe you could change the above to:
String lineToRemove = UserIn.trim() + "|";
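Put together, the prefix check could look like the sketch below. It works on an in-memory list instead of files so that it stays self-contained; FirstFieldFilter and removeUser are hypothetical names, not part of the asker's program.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
final class FirstFieldFilter {

    // Keeps every line whose first pipe-delimited field differs from userName.
    static List<String> removeUser(List<String> lines, String userName) {
        // Appending the pipe ensures that "aaaa" does not also match "aaaab|...".
        String prefix = userName.trim() + "|";
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().startsWith(prefix)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("bbbb|aaaaa|cccc", "aaaa|bbbbbb|cccc");
        System.out.println(removeUser(lines, "aaaa")); // prints [bbbb|aaaaa|cccc]
    }
}
```

In the original program the same condition would go inside the readLine() loop, in place of the equals() comparison.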
I am creating an application, but I'm not sure how to get certain parts of a string. I have read in a file like this:
*tp*|21394398437984|163600
*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567
*2*|AAA|1234567894561237|STOP|20140527|Success||Automated|DPSRN1234568
*3*|2
I need to read the lines beginning with *2*, so I did:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
while (s.hasNext()) {
String str1 = s.nextLine ();
if(str1.startsWith("*2*")) {
System.out.print(str1);
}
}
This reads the whole line, which is fine. My issue now is that I need to extract specific fields from each such line: the 2nd (numbers), the 4th (numbers), the 5th (Success) and the 7th (DPSRN).
I was thinking about using | as the delimiter, but I'm not sure where to go from there. Any help would be great.
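You should use String.split("\\|"), which gives you an array (String[]). The pipe has to be escaped because split() takes a regular expression, in which an unescaped | means alternation.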
Try following:
String test="*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
String tok[]=test.split("\\|");
for(String s:tok){
System.out.println(s);
}
Output :
*2*
AAA
1234567894561236
STOP
20140527
Success
(empty line)
Automated
DSPRN1234567
What you require will be placed at tok[2], tok[4], tok[5] and tok[8].
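As a hedged variation on the snippet above, Pattern.quote can be used instead of escaping the pipe by hand, and a negative limit keeps trailing empty fields. PipeFields and fields are hypothetical names chosen for this sketch.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class PipeFields {

    // Splits on a literal '|'. The limit of -1 keeps trailing empty fields,
    // which split() would otherwise drop.
    static String[] fields(String line) {
        return line.split(Pattern.quote("|"), -1);
    }

    public static void main(String[] args) {
        String test = "*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
        String[] tok = fields(test);
        System.out.println(tok.length);      // prints 9
        System.out.println(tok[6].isEmpty()); // prints true (the empty field between the two pipes)
        System.out.println(tok[8]);          // prints DSPRN1234567
    }
}
```

The empty field at index 6 is why the last value sits at index 8 rather than 7.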
Just split the returned line based on your search, which would return an array of String elements where you can retrieve your elements based on their index:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
String searchLine = "";
while (s.hasNext()) {
searchLine = s.nextLine();
if(searchLine.startsWith("*2*")) {
break;
}
}
// split() takes a regex, so the pipe must be escaped
String[] strs = searchLine.split("\\|");
String secondArgument = strs[2];
String fourthArgument = strs[4];
String fifthArgument = strs[5];
// the empty field at index 6 pushes the DSPRN value to index 8
String seventhArgument = strs[8];
System.out.println(secondArgument);
System.out.println(fourthArgument);
System.out.println(fifthArgument);
System.out.println(seventhArgument);
I'm having trouble figuring out how to read the rest of an input line. I need to tokenize the first word and then treat the rest of the input line as one whole token.
public Command getCommand()
{
String inputLine; // will hold the full input line
String word1 = null;
String word2 = null;
System.out.print("> "); // print prompt
inputLine = reader.nextLine();
// Find up to two words on the line.
Scanner tokenizer = new Scanner(inputLine);
if(tokenizer.hasNext()) {
word1 = tokenizer.next(); // get first word
if(tokenizer.hasNext()) {
word2 = tokenizer.next(); // get second word
// note: just ignores the rest of the input line.
}
}
// Now check whether this word is known. If so, create a command
// with it. If not, create a "null" command (for unknown command).
if(commands.isCommand(word1)) {
return new Command(word1, word2);
}
else {
return new Command(null, word2);
}
}
The input:
take spinning wheel
Output:
spinning
Desired Output:
spinning wheel
Use split()
String[] line = scan.nextLine().split(" ");
String firstWord = line[0];
String secondWord = line[1];
This splits the line at each space and returns an array. Using the index you can get any word you want. Note that line[1] is only the second word ("spinning"), not the rest of the line.
OR -
String inputLine = scan.nextLine(); // the line you read
String desiredOutput = inputLine.substring(inputLine.indexOf(" ") + 1);
You can try like this also...
String s = "This is Testing Result";
System.out.println(s.split(" ")[0]);
System.out.println(s.substring(s.split(" ")[0].length() + 1)); // everything after the first word
Use split(String regex, int limit)
String[] line = scan.nextLine().split(" ", 2);
String firstWord = line[0];
String rest= line[1];
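Refer to the Java documentation for String.split(String regex, int limit). With a limit of 2 the array has at most two elements, and the second one holds everything after the first space.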
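One thing the snippet above leaves open is a line with only one word, where line[1] would throw an ArrayIndexOutOfBoundsException. A minimal sketch that guards against this follows; CommandLineSplitter and splitCommand are hypothetical names, and splitting on "\\s+" after trim() is an assumption about how extra whitespace should be handled.

```java
// Hypothetical helper for illustration only.
final class CommandLineSplitter {

    // Returns a two-element array: the first word and the rest of the line.
    // Either element is null when that part is missing.
    static String[] splitCommand(String inputLine) {
        // Limit 2: split at the first run of whitespace only.
        String[] parts = inputLine.trim().split("\\s+", 2);
        String first = parts[0].isEmpty() ? null : parts[0];
        String rest = parts.length > 1 ? parts[1] : null;
        return new String[] { first, rest };
    }

    public static void main(String[] args) {
        String[] words = splitCommand("take spinning wheel");
        System.out.println(words[0]); // prints take
        System.out.println(words[1]); // prints spinning wheel
    }
}
```

In getCommand() the two elements would take the place of word1 and word2.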