How to read specific patterns with FileReader in Java - java

I have a text file that has a repeated pattern after line 14, which does not have a fixed length.
The pattern is as follows:
'
String
String
Int
The apostrophe is used to separate each chunk of data, which I am trying to save into a HashMap for each type (e.g. one HashMap for the first line of each block, and so on).
What is the best way to check whether there is a block of text (is the apostrophe the best marker?) and then save the next three lines and continue?
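One possible approach, as a minimal sketch only: treat a line consisting of a single apostrophe as the start of a block and read the following three lines. The class name BlockReader, the use of the block's position as the map key, and the headerLines parameter (14 in the file described above) are assumptions for illustration; the question does not say what the maps should be keyed by.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: one map per line type, keyed by block position (an assumption).
final class BlockReader {
    final Map<Integer, String> first = new HashMap<>();
    final Map<Integer, String> second = new HashMap<>();
    final Map<Integer, Integer> numbers = new HashMap<>();

    /** Skips headerLines lines, then reads blocks of: "'" / String / String / int. */
    static BlockReader read(Reader source, int headerLines) {
        BlockReader result = new BlockReader();
        try (BufferedReader in = new BufferedReader(source)) {
            for (int i = 0; i < headerLines; i++) {
                if (in.readLine() == null) {
                    return result; // input is shorter than the header
                }
            }
            int block = 0;
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().equals("'")) {
                    continue; // not a block separator, keep looking
                }
                String a = in.readLine();
                String b = in.readLine();
                String n = in.readLine();
                if (a == null || b == null || n == null) {
                    break; // truncated final block
                }
                result.first.put(block, a);
                result.second.put(block, b);
                // Throws NumberFormatException if the third line is not an integer.
                result.numbers.put(block, Integer.parseInt(n.trim()));
                block++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "header\n'\nalpha\nbeta\n42\n'\ngamma\ndelta\n7\n";
        BlockReader r = read(new StringReader(sample), 1);
        System.out.println(r.first + " " + r.second + " " + r.numbers);
    }
}
```

For a real file, a java.io.FileReader can be passed as the source in place of the StringReader. The apostrophe works as a marker as long as no data line can consist of a lone apostrophe.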

Related

Print all text in-between two characters

I want to print a specific part of text from a text file.
For example :
number)street)city)state)country)
I want to print whatever lies between one ) and the next, so that any street name or country can go into the text file. What I have done so far:
I connected a Scanner to the file and created a while loop with .hasNextLine();
Then I read each line of the text file into a String: String line = textscanner.nextLine();
Then, to print out the country for example, I created a substring: System.out.print(line.substring(25));
However, this won't work if there are different street or country names in the file. How do I make it print anything from one ) to the next )?
You can take advantage of Java's split() method, which takes a regular expression to use as the separator between fields; the separator is often a comma, as in .csv files. I'm going to skip the part about reading the file and just use this string as an example and put the words into an array:
String line = "number)street)city)state)country)";
String[] words = line.split("\\)");
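Note that ) is a regex metacharacter, so it must be escaped, and inside a Java string literal the escape is written with a double backslash. Passing an unescaped ")" compiles but throws a PatternSyntaxException at runtime (unmatched closing parenthesis).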
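Put together, the split can be tried out with a small program like the sketch below. The class and method names (ParenSplit, fields) are made up for illustration.

```java
import java.util.Arrays;

final class ParenSplit {
    /** Splits a line such as "number)street)city)" on the ")" character. */
    static String[] fields(String line) {
        // ")" is a regex metacharacter, so it is escaped as "\\)" in the literal.
        return line.split("\\)");
    }

    public static void main(String[] args) {
        String[] words = fields("number)street)city)state)country)");
        System.out.println(Arrays.toString(words)); // [number, street, city, state, country]
        System.out.println(words[4]);               // country
    }
}
```

Because split() discards trailing empty strings, the final ) does not produce an extra empty element.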

Java: Optimized way of reading a file, validating it line by line, and then splitting it into smaller files?

I am trying to read a file of around 160 MB using BufferedReader, reading each line into a String and validating it (checking the first character of each line). If the file is valid, I re-read it and split it based on the addresses in each line, saving the results in a map because multiple lines can go to the same address. Once the complete file is read, I write the results out through FTPS. Storing everything as Strings uses too much memory.
File format
blocks of AJZ/AJJZ/AJJJZ
From the A line we have to extract the addresses and then send that block (AJZ). One block can be sent to multiple addresses, and if more than one block belongs to the same address (e.g. Address2), we should consolidate the blocks.
AAddress1,Address2
J7777
Z02
A00Address2,Address3
JH77
Z00...
You can use the Flyweight design pattern to compress your string.
For example, you can store each word only once and use a placeholder (some integer) unique for each word in the original text. This way you end up with an array of placeholders.
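A rough sketch of that idea, assuming the repeated values are the comma-separated addresses from the A lines: each distinct string is stored once and referred to by an int. WordPool and its methods are hypothetical names, and stripping the leading A prefix is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical pool: each distinct word is stored once and replaced by an int id.
final class WordPool {
    private final Map<String, Integer> ids = new HashMap<>();
    private final List<String> words = new ArrayList<>();

    /** Returns the id for a word, assigning the next free id on first sight. */
    int idOf(String word) {
        Integer id = ids.get(word);
        if (id == null) {
            id = words.size();
            ids.put(word, id);
            words.add(word);
        }
        return id;
    }

    /** Looks up the original word for an id returned by idOf. */
    String wordOf(int id) {
        return words.get(id);
    }

    /** Number of distinct words stored so far. */
    int size() {
        return words.size();
    }

    /** Encodes a comma-separated list as an array of ids. */
    int[] encode(String commaSeparated) {
        String[] parts = commaSeparated.split(",");
        int[] out = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = idOf(parts[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        WordPool pool = new WordPool();
        int[] first = pool.encode("Address1,Address2");
        int[] second = pool.encode("Address2,Address3");
        System.out.println(first.length + " + " + second.length
                + " ids, " + pool.size() + " distinct strings");
    }
}
```

This only saves memory when the same values repeat often; lines that are mostly unique gain little from it.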

Reading a line from a text file Java

I am trying to read a line from a file using BufferedReader and Scanner. I can create both of those no problem. What I am looking to do is read one line, count the number of commas in that line, and then go back and grab each individual item. So if the file looked like this:
item1,item2,item3,etc.
item4,item5,item6,etc.
The program would return that there are four commas, and then go back and get one item at a time. It would repeat for the next line. Returning the number of commas is crucial for my program, otherwise, I would just use the Scanner.useDelimiter() method. I also don't know how to return to the beginning of the line to grab each item.
Why not just split the String? The split method accepts a delimiter (a regex) as an argument and breaks the String into a String[]. This eliminates the need to return to the beginning of the line.
String value = "item1,item2,item3";
String[] tokens = value.split(",");
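To get the number of commas, use tokens.length - 1. Note that split(",") drops trailing empty strings, so this undercounts when a line ends with a comma; split(",", -1) keeps them.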
String.split() Documentation
split() can be used to achieve this, e.g.:
String line = "item1,item2,item3";
String[] words = line.split(",");
If you absolutely must know the number of commas, a similar question has already been answered:
Java: How do I count the number of occurrences of a char in a String?
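As a small illustration of both suggestions, the sketch below counts commas with a plain loop and splits with a negative limit so that trailing empty items are kept. The names CommaCount, countCommas and items are made up for this example.

```java
import java.util.Arrays;

final class CommaCount {
    /** Counts the commas in a line without splitting it. */
    static int countCommas(String line) {
        int count = 0;
        for (int i = 0; i < line.length(); i++) {
            if (line.charAt(i) == ',') {
                count++;
            }
        }
        return count;
    }

    /** Splits on commas; the -1 limit keeps trailing empty items. */
    static String[] items(String line) {
        return line.split(",", -1);
    }

    public static void main(String[] args) {
        String line = "item1,item2,item3";
        System.out.println(countCommas(line));            // 2
        System.out.println(Arrays.toString(items(line))); // [item1, item2, item3]
    }
}
```

With the -1 limit, items(line).length - 1 always equals countCommas(line), including for lines that end in a comma.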

Suggested ways of reading a text file with inconsistent formatting

I'm trying to read a text file of numbers as a double array, and after trying various methods (usually resulting in an input format exception) I have come to the conclusion that the text file I am trying to read is inconsistent in its delimiting.
The majority of the text is in the form "0.000,0.000", so I have been using a Scanner and useDelimiter(",") to read in each value.
It turns out though (this is a big file of numbers) that some of the formatting is in the form "0.000 0.000" (at the end of a line, I presume), which of course produces an input format exception.
This is an open question really. I'm a pretty basic Java programmer, so I would just like to see if there are any suggestions for doing this. Is Scanner the correct class to use for this?
Thank you for your time!
Read file as text line-by-line. Then split line into parts:
String[] parts = line.split("[ ,]");
Now iterate over the parts and call Double.parseDouble() for each part.
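A minimal sketch of this approach follows. It uses a slightly more tolerant pattern, "[,\\s]+", so that a run such as a comma followed by a space counts as one separator; that pattern and the names LineDoubles and parseLine are assumptions for this example, not part of the answer above.

```java
import java.util.Arrays;

final class LineDoubles {
    /** Parses one line whose values are separated by commas and/or whitespace. */
    static double[] parseLine(String line) {
        return Arrays.stream(line.trim().split("[,\\s]+"))
                .filter(part -> !part.isEmpty())   // skip the empty piece a leading separator leaves
                .mapToDouble(Double::parseDouble)  // throws NumberFormatException on bad input
                .toArray();
    }

    public static void main(String[] args) {
        double[] values = parseLine("0.000,1.500 2.250");
        System.out.println(Arrays.toString(values)); // [0.0, 1.5, 2.25]
    }
}
```

Double.parseDouble always expects a dot as the decimal separator, regardless of the default locale.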
Scanner allows any Java Regex Pattern to function as a delimiter. You should be able to use any number of delimiters by doing the following:
scanner.useDelimiter("[,\\s]"); // Will match commas and whitespace
I'd like to comment this in instead of making it a separate answer, but my reputation is too low. Apologies, Alex.
You mentioned having two different delimited characters used in different instances, not a combination of the two as a single delimiter.
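You can use the vertical bar as a logical OR in a regular expression. Note that it only has that meaning outside a character class; inside square brackets, | is a literal character, and the class already matches any one of its members.
scanner.useDelimiter(",|\\s"); // Will match a comma or a whitespace character as appropriate
Line by line:
String[] parts = line.split(",|\\s");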
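The Scanner variant could look like the sketch below. It assumes the values use a dot as the decimal separator, which is why the locale is pinned to Locale.US, and it uses "[,\\s]+" so that consecutive separators (for example a comma followed by a line break) are treated as one. ScannerDoubles and readAll are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

final class ScannerDoubles {
    /** Reads every double from text delimited by commas and/or whitespace. */
    static List<Double> readAll(String text) {
        List<Double> values = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useLocale(Locale.US);      // assume "." as the decimal separator
            scanner.useDelimiter("[,\\s]+");   // one or more commas or whitespace characters
            while (scanner.hasNextDouble()) {  // stops at the first token that is not a double
                values.add(scanner.nextDouble());
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(readAll("0.000,1.500 2.250\n3.0")); // [0.0, 1.5, 2.25, 3.0]
    }
}
```

To read from a file instead, the Scanner can be constructed from a java.io.File; the delimiter and locale calls stay the same.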

Java parsing text file

I need to write a parser for text files (at least 20 KB), and I need to determine whether words from a set (about 400 words and numbers) appear in the file. So I am looking for the most efficient way to do this (if a match is found, I need to do some further processing of that line and its previous line).
What I currently do is exclude lines that certainly contain no information (metadata-like lines) and then compare word by word, but I don't think that comparing word by word alone is the most efficient approach.
Can anyone please provide some tips, hints, or ideas?
Thank you very much
It depends on what you mean by "efficient".
If you want a very straightforward way to code it, keep in mind that the String class in Java has the method String.contains(CharSequence sequence).
You could then put the file content into a String and iterate over the keywords you want to check, using contains() to see whether any of them appear in that String.
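In code, that straightforward version might look like this sketch (ContainsSearch and keywordsIn are made-up names). It assumes the file content is already in a String.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ContainsSearch {
    /** Returns the keywords that occur anywhere in content, in keyword order. */
    static List<String> keywordsIn(String content, List<String> keywords) {
        List<String> found = new ArrayList<>();
        for (String keyword : keywords) {
            if (content.contains(keyword)) { // substring match, not whole-word match
                found.add(keyword);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<String> keywords = Arrays.asList("quick", "dog", "row");
        System.out.println(keywordsIn("the quick brown fox", keywords)); // [quick, row]
    }
}
```

Note that contains() matches substrings, so "row" is reported because it occurs inside "brown". It also scans the content once per keyword and does not tell you which line matched.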
How about the following:
Put all your keywords in a HashSet (Set<String> keywords;)
Read the file one line at a time
For each line in the file:
    Tokenize into words
    For each word in the line:
        If the word is contained in keywords (keywords.contains(word)):
            Process the actual line
            If a previous line is available:
                Process the previous line
    Keep track of the previous line (prevLine = line;)
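The outline above can be sketched roughly as follows. Several details are assumptions: words are split on non-word characters ("\\W+"), matching is case-sensitive, each matching line is reported once even if it contains several keywords, and "processing" is reduced to collecting the pair of previous and current line. KeywordScan and matches are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class KeywordScan {
    /** Returns one {previousLine, line} pair per line that contains a keyword. */
    static List<String[]> matches(Reader source, Set<String> keywords) {
        List<String[]> result = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String prevLine = null; // stays null until a first line has been read
            String line;
            while ((line = in.readLine()) != null) {
                for (String word : line.split("\\W+")) {
                    if (keywords.contains(word)) { // constant-time lookup on average
                        result.add(new String[] {prevLine, line});
                        break; // report each matching line only once
                    }
                }
                prevLine = line;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> keywords = new HashSet<>(Arrays.asList("alpha", "beta"));
        String text = "meta\nfoo bar\nbaz alpha\n";
        for (String[] m : matches(new StringReader(text), keywords)) {
            System.out.println("previous: " + m[0] + ", match: " + m[1]);
        }
    }
}
```

Compared with calling contains() once per keyword, this reads the file once and does one set lookup per word, so the cost does not grow with the number of keywords.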
