How to read a file with delimiters - java

I have a file like this, in which each section is delimited by "**". How can I read each section and put them into different data structures?
AAA
BBB
CCC
**
ccc:cc
ddd:dd
**
xyz;XYZ
abc;ABC
**
Name: John
Email: john#gmail.com
Name: Jack
Email: jack#gmail.com
Name: kate
Email: kake#hotmail.com
**
In a while loop, I can test whether the line equals "**". But since the number of lines in each section is unknown, it seems hard to tell which section a particular line belongs to.
String line;
while ((line = reader.readLine()) != null) {
    if (!line.equals("**")) {
        // The line has to be parsed and built into different data structures.
        // For the first section, AAA, BBB and CCC will be added to an ArrayList.
    }
}

IMO you should just make the reading method a little bit more clever.
Here is an example (a kind of pseudocode, assuming you have a reader that does the actual I/O):
void main() throws IOException {
    List<List<String>> sections = new ArrayList<>();
    while (reader.hasMoreDataToProcess()) { // placeholder for your own end-of-input check
        sections.add(processSection(reader));
    }
}
List<String> processSection(BufferedReader reader) throws IOException {
    List<String> section = new ArrayList<>();
    String line;
    // Stop at the delimiter, or at the end of the file if the last section has no delimiter.
    while ((line = reader.readLine()) != null && !line.equals("**")) {
        section.add(line);
    }
    return section;
}
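The idea above can be sketched as a small runnable class. This is only a sketch under assumptions: the class name SectionReader and the method readSections are made up for illustration, and a StringReader stands in for the real file so the example runs without one.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
class SectionReader {
    // Collects lines into sections, closing a section at each delimiter line.
    static List<List<String>> readSections(BufferedReader reader, String delimiter) {
        List<List<String>> sections = new ArrayList<>();
        List<String> current = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.equals(delimiter)) {
                    sections.add(current);        // the section that just ended
                    current = new ArrayList<>();  // start the next one
                } else {
                    current.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!current.isEmpty()) {
            sections.add(current);                // trailing section without a closing delimiter
        }
        return sections;
    }

    public static void main(String[] args) {
        String sample = "AAA\nBBB\nCCC\n**\nccc:cc\nddd:dd\n**\n";
        List<List<String>> sections =
                readSections(new BufferedReader(new StringReader(sample)), "**");
        System.out.println(sections); // [[AAA, BBB, CCC], [ccc:cc, ddd:dd]]
    }
}
```

Because each section is identified by its position in the outer list, section 0 can be parsed as plain values, section 1 as colon-separated pairs, and so on.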

Sorry, in a hurry, so pseudocode:
currentSection = []
sections = [currentSection]
for each line:
    if line is the separator:
        currentSection = []
        add currentSection to sections
    else:
        add line to currentSection

You can use the split method of the String class in Java.
String string = "a-b,b-d,c-s,d-w,e-e,f-e";
String[] parts = string.split(",");
String part1 = parts[0]; // a-b
String part2 = parts[1]; // b-d

You should use a Scanner for this scenario. Here's how you do it. This code is not tested.
File file = new File("somefile.txt");
try (Scanner sc = new Scanner(file)) {
    sc.useDelimiter("\\*\\*"); // the asterisks are escaped because the delimiter is a regex
    while (sc.hasNext()) {
        String section = sc.next(); // one whole section, including its line breaks
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
}

You can use a Scanner with a FileInputStream to scan the file, using useDelimiter(String) (which accepts a regex pattern) to set your delimiter.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
public class Test {
    public static void main(String[] args) {
        ArrayList<String> firstList = new ArrayList<>();
        ArrayList<String> secondList = new ArrayList<>();
        // try-with-resources closes the scanner, so no explicit close() is needed
        try (Scanner scanner = new Scanner(new FileInputStream(new File("yourFile.txt"))).useDelimiter("[*]+")) {
            firstList.add(scanner.next());
            secondList.add(scanner.next());
            // and so on
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}
This will take everything above ** and create a String out of it. If you want, you can then split the String, and grab the data from each line.
String[] split = scanner.next().split("\n");
for(String string : split) {
firstList.add(string);
}
In the first example, the regex [*]+ matches one or more consecutive * characters. Learn more about regex (regular expressions) to add flexibility.
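As a rough sketch of the Scanner approach end to end, the following splits the text on "**" and then splits each chunk into lines. The class name ScannerSections and the method parse are hypothetical, and the input is a String rather than a file so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper, not part of any library.
class ScannerSections {
    // Returns one list of lines per "**"-delimited section, skipping empty chunks.
    static List<List<String>> parse(String text) {
        List<List<String>> sections = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter("\\*\\*")) {
            while (scanner.hasNext()) {
                // Each chunk still carries the line breaks around the delimiter, so trim it.
                String chunk = scanner.next().trim();
                if (chunk.isEmpty()) {
                    continue;
                }
                List<String> lines = new ArrayList<>();
                for (String line : chunk.split("\\r?\\n")) {
                    lines.add(line.trim());
                }
                sections.add(lines);
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        System.out.println(parse("AAA\nBBB\n**\nccc:cc\nddd:dd\n**\n"));
    }
}
```

Trimming before splitting matters here: without it, every section after the first would start with an empty string, because the line break that follows "**" belongs to the next chunk.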

Related

Retrieving part of a string using a delimiter

I am creating an application, but I'm not sure how to get certain parts of a string. I have read in a file that looks like this:
*tp*|21394398437984|163600
*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567
*2*|AAA|1234567894561237|STOP|20140527|Success||Automated|DPSRN1234568
*3*|2
I need to read the lines beginning with *2*, so I did:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
while (s.hasNext()) {
String str1 = s.nextLine ();
if(str1.startsWith("*2*")) {
System.out.print(str1);
}
}
This reads the whole line, which is fine. My issue is that I need to extract certain fields from it: the 2nd (numbers), the 4th (numbers), the 5th (Success) and the 7th (DPSRN).
I was thinking about using | as the delimiter, but I'm not sure where to go after this. Any help would be great.
You should use String.split("\\|"); it will give you an array (String[]). The pipe must be escaped because split takes a regular expression, in which a bare | means alternation.
Try following:
String test="*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
String tok[]=test.split("\\|");
for(String s:tok){
System.out.println(s);
}
Output :
*2*
AAA
1234567894561236
STOP
20140527
Success
Automated
DSPRN1234567
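What you require will be placed at tok[2], tok[4], tok[5] and tok[8]. Note that the two adjacent | characters produce an empty string at tok[6] (printed as a blank line, omitted above), which is why DSPRN1234567 ends up at tok[8].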
Just split the matching line on the delimiter, which returns an array of String elements that you can retrieve by index:
s = new Scanner(new BufferedReader(new FileReader("example.dat")));
String searchLine = "";
while (s.hasNextLine()) {
    searchLine = s.nextLine();
    if (searchLine.startsWith("*2*")) {
        break; // stop at the first matching line
    }
}
// split takes a regex, so the pipe must be escaped
String[] strs = searchLine.split("\\|");
String secondArgument = strs[2];
String fourthArgument = strs[4];
String fifthArgument = strs[5];
String seventhArgument = strs[8]; // index 8: the empty field after Success occupies index 6
System.out.println(secondArgument);
System.out.println(fourthArgument);
System.out.println(fifthArgument);
System.out.println(seventhArgument);
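To make the indexing concrete, here is a minimal sketch that splits one of the sample lines on a literal pipe. The class name PipeFields and the method fields are made up for this example; Pattern.quote is used as an alternative to escaping by hand, and the limit of -1 is an added choice so that trailing empty fields are not dropped.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class PipeFields {
    // Splits a record on a literal "|" and keeps empty fields, including trailing ones.
    static String[] fields(String line) {
        return line.split(Pattern.quote("|"), -1);
    }

    public static void main(String[] args) {
        String line = "*2*|AAA|1234567894561236|STOP|20140527|Success||Automated|DSPRN1234567";
        String[] f = fields(line);
        System.out.println(f.length); // 9, because the empty field counts
        System.out.println(f[2]);     // 1234567894561236
        System.out.println(f[4]);     // 20140527
        System.out.println(f[5]);     // Success
        System.out.println(f[8]);     // DSPRN1234567
    }
}
```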

Using useDelimiter() in Java to isolate a piece of text

I have a text file with content that looks like this:
Event=ThermostatNight,time=0
Event=LightOn,time=2000
Event=WaterOff,time=8000
Event=ThermostatDay,time=10000
Event=Bell,time=9000,rings=5
Event=WaterOn,time=6000
Event=LightOff,time=4000
Event=Terminate,time=12000
I have to use a Scanner to read the file, loop through each line of text, and isolate each event. For example, I need to isolate "ThermostatNight" in the first line and put it in an array; the next one would be "LightOn", and so on. It's a small piece of a large project that I am working on for an intermediate Java course. With the useDelimiter argument shown below, I get exactly the opposite of what I want. Is there a quick fix for this? Note that I must use the useDelimiter() method.
public void readFile2() {
    array2 = new ArrayList<String>();
    while (s.hasNext()) {
        s.useDelimiter("=(.*?),");
        array2.add(s.next());
    }
}
You can use multiple delimiters.
//scanner.useDelimiter("Event=|,time=([0-9]*)");
scanner.useDelimiter("Event=|,(.)+[\\r\\n]*Event=|,(.)+[\\r\\n]*");
// alternatively, you can use this
//scanner.useDelimiter("Event=|,time=([0-9]*)[\\r\\n]*Event=|,time=([0-9]*)");
while (scanner.hasNext())
{
System.out.println(scanner.next());
}
Probably not the best, but it will work.
Since you are required to use only useDelimiter(), and assuming the structure of the file does not change, you can do this:
public static void main(String[] args) {
Scanner sc;
try {
sc = new Scanner(new File("/home/xxx/text.txt"));
sc.useDelimiter(",time=(.*?)\\nEvent=");
ArrayList<String> eventlist = new ArrayList<String>();
String tmp = null;
if (sc.hasNext()) {
tmp = sc.next();
tmp = tmp.split("=")[1]; // Just First line
}
while (sc.hasNext()) {
eventlist.add(tmp);
System.out.println(tmp); // for test only remove it
tmp = sc.next();
}
tmp = tmp.split(",")[0];
eventlist.add(tmp);
System.out.println(tmp); // for test only , remove it
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
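Another way to look at it: treat everything that is not an event name as the delimiter. The sketch below is one possible pattern, not the only one; the class name EventNames and the method extract are hypothetical, and the input is a String so it runs without a file. The first alternative swallows the tail of one line together with the "Event=" of the next, so no empty token appears between two adjacent delimiters; the second alternative covers the tail of the last line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper, not part of any library.
class EventNames {
    // Everything from the first comma of a line up to the next "Event=" is one delimiter.
    private static final String DELIMITER = "(,.*[\\r\\n]*)?Event=|,.*[\\r\\n]*";

    static List<String> extract(String text) {
        List<String> events = new ArrayList<>();
        try (Scanner scanner = new Scanner(text).useDelimiter(DELIMITER)) {
            while (scanner.hasNext()) {
                events.add(scanner.next());
            }
        }
        return events;
    }

    public static void main(String[] args) {
        String sample = "Event=ThermostatNight,time=0\n"
                + "Event=Bell,time=9000,rings=5\n"
                + "Event=Terminate,time=12000\n";
        System.out.println(extract(sample));
    }
}
```

Since "." does not match line terminators in Java by default, ",.*" stops at the end of the current line, which also handles lines with extra fields such as rings=5.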

Searching content of a file

I don't have a lot of experience working with files. I have a file to which I have written the following:
Test 112
help 456
news 456
Friendly 554
fileOUT.write("Test 112\r\n"); // this is an example of how I entered the data
Now I am trying to search the file for the word news and display the whole line that contains it.
This is what I have attempted:
if(fileIN.next().contains("news")){
System.out.println("kkk");
}
This does not work. The following does find the word news, because it displays kkk, but I don't know how to display only the line in which news was found.
while(fileIN.hasNext()){
if(fileIN.next().contains("news")){
System.out.println("kkk");
}
}
What must be displayed is news 456.
Thank You
You want to call fileIN.nextLine().contains("news")
Try using the Scanner class if you are not already. It does a wonderful job of splitting input from a stream by some delimiter (in this case the newline character).
Here's a simple code example:
String pathToFile = "data.txt";
String textToSearchFor = "news";
// Wrap the path in a File: new Scanner(String) would scan the path text itself, not the file.
Scanner scanner = new Scanner(new File(pathToFile)); // throws FileNotFoundException
while (scanner.hasNextLine()) {
    String line = scanner.nextLine();
    if (line.contains(textToSearchFor)) {
        System.out.println(line);
    }
}
scanner.close();
And here's an advanced code example that does much more than you asked. Enjoy!
// Searches a file for any of the given strings. Ignores case if caseSensitive is false.
public void searchFile(String file, boolean caseSensitive, String... textToSearchFor) throws FileNotFoundException {
    // new Scanner(new File(...)) reads the file; new Scanner(String) would scan the path text itself
    try (Scanner scanner = new Scanner(new File(file))) {
        while (scanner.hasNextLine()) {
            String originalLine = scanner.nextLine();
            String line = caseSensitive ? originalLine : originalLine.toLowerCase();
            for (String searchText : textToSearchFor) {
                if (!caseSensitive) searchText = searchText.toLowerCase();
                if (line.contains(searchText)) {
                    System.out.println(originalLine); // print each matching line once
                    break;
                }
            }
        }
    }
}
// usage
searchFile("data.txt", true, "news", "Test", "bob");
searchFile("data.txt", false, new String[]{"test", "News"}); // case-insensitive search
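You can try this code:
File file = new File(path);
try (BufferedReader in = new BufferedReader(new FileReader(file))) {
    String s;
    // readLine() returns null at the end of the file
    while ((s = in.readLine()) != null) {
        if (s.contains("news")) {
            System.out.println(s); // prints the whole matching line, e.g. news 456
        }
    }
} catch (IOException e) {
    e.printStackTrace(); // do not silently swallow I/O errors
}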
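For comparison, the same line search can be sketched with BufferedReader.lines(), which streams the lines lazily. The class name LineSearch and the method matchingLines are made up for illustration, and a StringReader replaces the file so the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
class LineSearch {
    // Returns every line that contains the given text, in file order.
    static List<String> matchingLines(BufferedReader reader, String text) {
        return reader.lines()
                .filter(line -> line.contains(text))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String data = "Test 112\r\nhelp 456\r\nnews 456\r\nFriendly 554\r\n";
        BufferedReader reader = new BufferedReader(new StringReader(data));
        System.out.println(matchingLines(reader, "news")); // [news 456]
    }
}
```

With a real file, the reader would be created with new BufferedReader(new FileReader("data.txt")) inside a try-with-resources block.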

removeAll operation on arraylist makes program hang

I'm trying to read in from two files and store them in two separate arraylists. The files consist of words which are either alone on a line or multiple words on a line separated by commas.
I read each file with the following code (not complete):
ArrayList<String> temp = new ArrayList<>();
FileInputStream fis;
fis = new FileInputStream(fileName);
Scanner scan = new Scanner(fis);
while (scan.hasNextLine()) {
Scanner input = new Scanner(scan.nextLine());
input.useDelimiter(",");
while (scan.hasNext()) {
String md5 = scan.next();
temp.add(md5);
}
}
scan.close();
return temp;
Each file contains almost 1 million words (I don't know the exact number), so I'm not entirely sure that the above code works correctly - but it seems to.
I now want to find out how many words are exclusive to the first file/arraylist. To do so I planned on using list1.removeAll(list2) and then checking the size of list1 - but for some reason this is not working. The code:
public static ArrayList differentWords(String fileName1, String fileName2) {
ArrayList<String> file1 = readFile(fileName1);
ArrayList<String> file2 = readFile(fileName2);
file1.removeAll(file2);
return file1;
}
My main method contains a few different calls and everything works fine until I reach the above code, which just causes the program to hang (in netbeans it's just "running").
Any idea why this is happening?
You are not using input in
while (scan.hasNextLine()) {
Scanner input = new Scanner(scan.nextLine());
input.useDelimiter(",");
while (scan.hasNext()) {
String md5 = scan.next();
temp.add(md5);
}
}
I think you meant to do this:
while (scan.hasNextLine()) {
Scanner input = new Scanner(scan.nextLine());
input.useDelimiter(",");
while (input.hasNext()) {
String md5 = input.next();
temp.add(md5);
}
}
but that said you should look into String#split() that will probably save you some time:
while (scan.hasNextLine()) {
String line = scan.nextLine();
String[] tokens = line.split(",");
for (String token: tokens) {
temp.add(token);
}
}
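Try this:
// Calling file1.remove() inside a for-each loop over file1 throws
// ConcurrentModificationException, so remove through an explicit Iterator.
Iterator<String> it = file1.iterator();
while (it.hasNext()) {
    if (file2.contains(it.next())) {
        it.remove();
    }
}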
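A likely reason for the apparent hang, stated here as an assumption since nobody in the thread measured it: ArrayList.removeAll calls contains() on its argument for every element, and contains() on an ArrayList is a linear scan, so two lists of about a million words need on the order of 10^12 comparisons. A sketch of the usual workaround is to copy the second list into a HashSet first. The class name ExclusiveWords and the method onlyInFirst are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of any library.
class ExclusiveWords {
    // Returns the words of 'first' that do not occur in 'second', keeping order and duplicates.
    static List<String> onlyInFirst(List<String> first, List<String> second) {
        Set<String> lookup = new HashSet<>(second); // contains() is constant time on average
        List<String> result = new ArrayList<>();
        for (String word : first) {
            if (!lookup.contains(word)) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("apple", "pear", "plum", "fig");
        List<String> file2 = Arrays.asList("pear", "fig", "kiwi");
        System.out.println(onlyInFirst(file1, file2)); // [apple, plum]
    }
}
```

Building a new list also avoids the cost of shifting elements that repeated removal from the middle of an ArrayList incurs.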

reading from text file to string array

I can search for a string in my text file. However, I want to sort the data within this ArrayList and implement an algorithm. Is it possible to read from a text file and store the values (Strings) in a String[] array?
Also is it possible to separate the Strings? So instead of my Array having:
[Alice was beginning to get very tired of sitting by her sister on the, bank, and of having nothing to do:]
is it possible to have an array such as:
["Alice", "was", "beginning", "to", "get", ...]
public static void main(String[]args) throws IOException
{
Scanner scan = new Scanner(System.in);
String stringSearch = scan.nextLine();
BufferedReader reader = new BufferedReader(new FileReader("File1.txt"));
List<String> words = new ArrayList<String>();
String line;
while ((line = reader.readLine()) != null) {
words.add(line);
}
for(String sLine : words)
{
if (sLine.contains(stringSearch))
{
int index = words.indexOf(sLine);
System.out.println("Got a match at line " + index);
}
}
//Collections.sort(words);
//for (String str: words)
// System.out.println(str);
int size = words.size();
System.out.println("There are " + size + " Lines of text in this text file.");
reader.close();
System.out.println(words);
}
To split a line into an array of words, use this:
String[] words = sentence.split("[^\\w']+");
The regex [^\w'] means "not a word char or an apostrophe"
This will capture words with embedded apostrophes like "can't" and skip over all punctuation.
Edit:
A comment has raised the edge case of parsing a quoted word such as 'this' as this.
Here's the solution for that - you have to first remove wrapping quotes:
String[] words = input.replaceAll("(^|\\s)'([\\w']+)'(\\s|$)", "$1$2$3").split("[^\\w']+");
Here's some test code with edge and corner cases:
public static void main(String[] args) throws Exception {
String input = "'I', ie \"me\", can't extract 'can't' or 'can't'";
String[] words = input.replaceAll("(^|[^\\w'])'([\\w']+)'([^\\w']|$)", "$1$2$3").split("[^\\w']+");
System.out.println(Arrays.toString(words));
}
Output:
[I, ie, me, can't, extract, can't, or, can't]
Also is it possible to separate the Strings?
Yes, you can split a string on whitespace and common punctuation like this:
String[] strSplit;
String str = "This is test for split";
strSplit = str.split("[\\s,;!?\"]+");
See String API
Moreover, you can also read a text file word by word.
// Keeping the loop inside the try block avoids a NullPointerException when the file is missing.
try (Scanner scan = new Scanner(new BufferedReader(new FileReader("Your File Path")))) {
    while (scan.hasNext()) {
        System.out.println(scan.next()); // tokens are separated by whitespace by default
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
See Scanner API
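Putting the pieces together, here is a minimal sketch that turns the lines already read into a String[] of words, using the same [^\w']+ split shown earlier. The class name WordArray and the method toWords are made up for this example, and the lines are passed in as a list so that no file is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any library.
class WordArray {
    // Splits every line into words and returns them all as one array.
    static String[] toWords(List<String> lines) {
        List<String> words = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.split("[^\\w']+")) {
                if (!word.isEmpty()) {   // a line starting with punctuation yields a leading ""
                    words.add(word);
                }
            }
        }
        return words.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Alice was beginning to get very tired of sitting by her sister on the",
                "bank, and of having nothing to do:");
        String[] words = toWords(lines);
        Arrays.sort(words); // the array can now be sorted or passed to another algorithm
        System.out.println(Arrays.toString(words));
    }
}
```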
