Splitting array on new line - java

I am submitting the following input through stdin:
4 2
30 one
30 two
15 three
25 four
My code is:
public static void main(String[] args) throws IOException {
BufferedReader stdin = new BufferedReader(new InputStreamReader(System.in));
String submittedString;
System.out.flush();
submittedString = stdin.readLine();
zipfpuzzle mySolver = new zipfpuzzle();
mySolver.getTopSongs(submittedString);
}
Which calls:
//Bringing it all together
public String getTopSongs(String myString) {
setUp(myString);
calculateQuality();
qualitySort();
return titleSort();
}
Which calls
public void setUp(String myString) {
String tempString = myString;
//Creating array where each element is a line
String[] lineExplode = tempString.split("\\n+");
//Setting up numSongsAlbum and songsToSelect
String[] firstLine = lineExplode[0].split(" ");
numSongsAlbum = Integer.parseInt(firstLine[0]);
songsToSelect = Integer.parseInt(firstLine[1]);
System.out.println(lineExplode.length);
//etc
}
However, for some reason lineExplode.length returns value 1... Any suggestions?
Kind Regards,
Dario

String[] lineExplode = tempString.split("\\n+");
The argument to String#split is a String that contains a regular expression

Your String#split regex will work file on Strings with newline characters.
String[] lineExplode = tempString.split("\n");
The problem is that your tempString has none of these characters, hence the size of the array is 1.
Why not just put the readLine in a loop and add the Strings to an ArrayList
String submittedString;
while (!(submittedString= stdin.readLine()).equals("")) {
myArrayList.add(submittedString);
}

Are you sure the file is using UNIX-style line endings (\n)? For a cross-platform split, use:
String[] lineExplode = tempString.split("[\\n\\r]+");

You should use "\\n" character to separate by new line but check that not all OS use the same separators ( http://en.wikipedia.org/wiki/Newline )
To solve this is very useful the system property line.separator that contains the current separator charater(s) for the current OS that is running the application.
You should use:
String[] lineExplode = tempString.split("\\\\n");
using \n as separator
Or:
String lineSeparator = System.getProperty("line.separator");
String[] lineExplode = tempString.split(lineSeparator);
Using the current OS separator
Or:
String lineSeparator = System.getProperty("line.separator");
String[] lineExplode = tempString.split(lineSeparator + "+");
Using the current OS separator and requiring one item

Its better to use this split this way:
String[] lineExplode =
tempString.split(Pattern.quote(System.getProperty("line.separator")) + '+');
To keep this split on new line platform independent.
UPDATE: After looking at your posted code it is clear that OP is reading just one line (till \n) in this line:
submittedString = stdin.readLine();
and there is no loop to read further lines from input.

Related

Java how to read a line from a text file that has multiple strings and double values?

I want to create a program that reads from a text file with three different parts and then outputs the name. E.g. text file:
vanilla 12 24
chocolate 23 20
chocolate chip 12 12
However, there is a bit of an issue on the third line, as there is a space. So far, my code works for the first two lines, but then throws a InputMismatchException on the third one. How do I make it so it reads both words from one line and then outputs it? My relevant code:
while (in.hasNext())
{
iceCreamFlavor = in.next();
iceCreamRadius = in.nextDouble();
iceCreamHeight = in.nextDouble();
out.println("Ice Cream: " + iceCreamFlavor);
}
In your input file, the separator between fields is composed of multiples spaces, no ?
if yes, you could simply use split method of String object.
You read a line.
You split it to obtain a String array.
String[] splitString = myString.split(" ");
Ther first element «0» is the String, the two others can be parsed as double
This could looks like :
try (BufferedReader br = new BufferedReader(new FileReader("path/to/the/file.txt"))) {
String line;
while ((line = br.readLine()) != null) {
String[] lineSplitted = line.split(" ");
String label = lineSplitted[0];
double d1 = Double.parseDouble(lineSplitted[1]);
double d2 = Double.parseDouble(lineSplitted[2]);
}
} catch (IOException e) {
e.printStackTrace();
}
You can use scanner.useDelimiter to change the delimiter or use a regular expression to parse the line.
//sets delimiter to 2 or more consecutive spaces
Scanner s = new Scanner(input).useDelimiter("(\\s){2-}");
Check the Scanner Javadoc for examples:

How to break a file into tokens based on regex using Java

I have a file in the following format, records are separated by newline but some records have line feed in them, like below. I need to get each record and process them separately. The file could be a few Mb in size.
<?aaaaa>
<?bbbb
bb>
<?cccccc>
I have the code:
FileInputStream fs = new FileInputStream(FILE_PATH_NAME);
Scanner scanner = new Scanner(fs);
scanner.useDelimiter(Pattern.compile("<\\?"));
if (scanner.hasNext()) {
String line = scanner.next();
System.out.println(line);
}
scanner.close();
But the result I got have the begining <\? removed:
aaaaa>
bbbb
bb>
cccccc>
I know the Scanner consumes any input that matches the delimiter pattern. All I can think of is to add the delimiter pattern back to each record mannully.
Is there a way to NOT have the delimeter pattern removed?
Break on a newline only when preceded by a ">" char:
scanner.useDelimiter("(?<=>)\\R"); // Note you can pass a string directly
\R is a system independent newline
(?<=>) is a look behind that asserts (without consuming) that the previous char is a >
Plus it's cool because <=> looks like Darth Vader's TIE fighter.
I'm assuming you want to ignore the newline character '\n' everywhere.
I would read the whole file into a String and then remove all of the '\n's in the String. The part of the code this question is about looks like this:
String fileString = new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
fileString = fileString.replace("\n", "");
Scanner scanner = new Scanner(fileString);
... //your code
Feel free to ask any further questions you might have!
Here is one way of doing it by using a StringBuilder:
public static void main(String[] args) throws FileNotFoundException {
Scanner in = new Scanner(new File("C:\\test.txt"));
StringBuilder builder = new StringBuilder();
String input = null;
while (in.hasNextLine() && null != (input = in.nextLine())) {
for (int x = 0; x < input.length(); x++) {
builder.append(input.charAt(x));
if (input.charAt(x) == '>') {
System.out.println(builder.toString());
builder = new StringBuilder();
}
}
}
in.close();
}
Input:
<?aaaaa>
<?bbbb
bb>
<?cccccc>
Output:
<?aaaaa>
<?bbbb bb>
<?cccccc>

Reading file line with multiple delimiters in java

I am trying to read a file line by line with multiple delimiters. I am using a regex for splitting but its not considering space (" ") as a delimiters. File contains ;, #, ,, and space as delimiters. What am I doing wrong?
File line looks like this - ADD R1, R2, R3
public static void initialize() throws IOException {
PC = 4000;
BufferedReader fileReader = new BufferedReader(new FileReader("test/ascii.txt"));
String str;
while((str = fileReader.readLine()) != null){
Instruction instruction = new Instruction();
String[] parts = str.split("[ ,:;#]");
instruction.instrAddr = String.valueOf(PC++);
System.out.println(instruction.instrAddr);
instruction.opcode = parts[0];
System.out.println(instruction.opcode);
instruction.dest = parts[1];
System.out.println(instruction.dest);
instruction.source_1 = parts[2];
System.out.println(instruction.source_1);
instruction.source_2 = parts[3];
System.out.println(instruction.source_2);
}
fileReader.close();}
The output prints 4000 (PC value), ADD, R1, " " and R2. How to avoid space? Is there anything wrong with the regex str.split("[ ,:;#]"); ?
Are you sure those are actually spaces?
This should work for any white-space:
#Test
public void test() {
String s = "1 2,3:4;5#6\t7";
Assert.assertEquals(7, s.split("[\\s,:;#]").length);
}

Retrieving part of a string using a delimiter

Okay So I am creating an application but I'm not sure how to get certain parts of the string. I have read In a file as such:
*tp*|21394398437984|163600
*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567
*2*|AAA|1234567894561237|STOP|20140527|Success||Automated|DPSRN1234568
*3*|2
I need to read the lines beginning with 2 so I done:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
while (s.hasNext()) {
String str1 = s.nextLine ();
if(str1.startsWith("*2*")) {
System.out.print(str1);
}
}
So this will read the whole line I'm fine with that, Now my issue is I need to extract the 2nd line beginning with numbers the 4th with numbers the 5th with success and the 7th(DPSRN).
I was thinking about using a String delimiter with | as the delimiter but I'm not sure where to go after this any help would be great.
You should use String.split("|"), it will give you an array - String[]
Try following:
String test="*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
String tok[]=test.split("\\|");
for(String s:tok){
System.out.println(s);
}
Output :
*2*
AAA
1234567894561236
STOP
20140527
Success
Automated
DSPRN1234567
What you require will be placed at tok[2], tok[4], tok[5] and tok[8].
Just split the returned line based on your search, which would return an array of String elements where you can retrieve your elements based on their index:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
String searchLine = "";
while (s.hasNext()) {
searchLine = s.nextLine();
if(searchLine.startsWith("*2*")) {
break;
}
}
String[] strs = searchLine.split("|");
String secondArgument = strs[2];
String forthArgument = strs[4];
String fifthArgument = strs[5];
String seventhArgument = strs[7];
System.out.println(secondArgument);
System.out.println(forthArgument);
System.out.println(fifthArgument);
System.out.println(seventhArgument);

string tokenizer wrong usage in java

I believe I am not using correctly String Tokenizer. Here is my code:
buffer = new byte[(int) (end - begin)];
fin.seek(begin);
fin.read(buffer, 0, (int) (end - begin));
StringTokenizer strk = new StringTokenizer(new String(buffer),
DELIMS,true);
As you can see I am reading a chunk of lines from a file(end and begin are line numbers) and I am transfering the data to a string tokenizer. My delimitators are:
DELIMS = "\r\n ";
because I want to separate words that have a space between them, or are on the next line.
However this code sometimes separates whole words also. What could be the explanation?? Is my DELIMS string conceived wrong?
Also I am passing "true" as an argument to the tokenizer because I want the delimitators to be treated as tokens as well.( I want this because I want to count the line I am currently at)
Could you please help me. Thanks a lot.
To start with, your method for converting bytes into a String is a bit suspect, and this overall method will be less-than-efficient, especially for a larger file.
Are you required to use StringTokenizer? If not, I'd strongly recommend using Scanner instead. I'd provide you with an example, but will ask that you just refer to the Javadocs instead, which are quite comprehensive and already contain good examples. That said, it accepts delimiters as well - but as Regular Expressions, so just be aware.
You could always wrap your input stream in a LineNumberReader. That will keep track of the line number for you. LineNumberReader extends BufferedReader, which has a readLine() method. With that, you could use a regular StringTokenizer to get your words as tokens. You could use regular expressions or Scanner, but for this case, StringTokenizer is simpler for beginners to understand and quicker.
You must have a RandomAccessFile. You didn't specify that, but I'm guessing based on the methods you used. Try something like:
byte [] buffer = ...; // you know how to get this.
ByteArrayInputStream stream = new ByteArrayInputStream(buffer);
// if you have java.util.Scanner
{
int lineNumber = 0;
Scanner s = new Scanner(stream);
while (s.hasNextLine()) {
lineNum++;
String line = s.nextLine();
System.out.format("I am on line %s%n", lineNum);
Scanner lineScanner = new Scanner(line);
while (lineScanner.hasNext()) {
String word = lineScanner.next();
// do whatever with word
}
}
}
// if you don't have java.util.Scanner, or want to use StringTokenizer
{
LineNumberReader reader = new LineNumberReader(
new InputStreamReader(stream));
String line = null;
while ((line = reader.nextLine()) != null) {
System.out.println("I am on line " + reader.getLineNumber());
StringTokenizer tok = new StringTokenizer(line);
while (tok.hasMoreTokens()) {
String word = tok.nextToken();
// do whatever with word
}
}
}

Categories