removeAll operation on arraylist makes program hang - java

I'm trying to read in from two files and store them in two separate arraylists. The files consist of words which are either alone on a line or multiple words on a line separated by commas.
I read each file with the following code (not complete):
ArrayList<String> temp = new ArrayList<>();
FileInputStream fis;
fis = new FileInputStream(fileName);
Scanner scan = new Scanner(fis);
while (scan.hasNextLine()) {
Scanner input = new Scanner(scan.nextLine());
input.useDelimiter(",");
while (scan.hasNext()) {
String md5 = scan.next();
temp.add(md5);
}
}
scan.close();
return temp;
Each file contains almost 1 million words (I don't know the exact number), so I'm not entirely sure that the above code works correctly - but it seems to.
I now want to find out how many words are exclusive to the first file/arraylist. To do so I planned on using list1.removeAll(list2) and then checking the size of list1 - but for some reason this is not working. The code:
public static ArrayList differentWords(String fileName1, String fileName2) {
ArrayList<String> file1 = readFile(fileName1);
ArrayList<String> file2 = readFile(fileName2);
file1.removeAll(file2);
return file1;
}
My main method contains a few different calls and everything works fine until I reach the above code, which just causes the program to hang (in netbeans it's just "running").
Any idea why this is happening?

You are not using input in
while (scan.hasNextLine()) {
Scanner input = new Scanner(scan.nextLine());
input.useDelimiter(",");
while (scan.hasNext()) {
String md5 = scan.next();
temp.add(md5);
}
}
I think you meant to do this:
while (scan.hasNextLine()) {
Scanner input = new Scanner(scan.nextLine());
input.useDelimiter(",");
while (input.hasNext()) {
String md5 = input.next();
temp.add(md5);
}
}
but that said you should look into String#split() that will probably save you some time:
while (scan.hasNextLine()) {
String line = scan.nextLine();
String[] tokens = line.split(",");
for (String token: tokens) {
temp.add(token);
}
}

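Try this (an explicit Iterator is needed, because calling file1.remove(s1) inside a for-each loop over file1 throws a ConcurrentModificationException). Note that it still compares the lists pairwise, so it will be slow for lists of this size:
Iterator<String> it = file1.iterator();
while (it.hasNext()) {
if (file2.contains(it.next())) {
it.remove();
}
}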
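A likely explanation for the apparent hang, separate from the Scanner mix-up: ArrayList.removeAll checks every element of the first list against the argument with contains, and contains on an ArrayList is a linear scan. With roughly a million words in each list, that is on the order of 10^12 string comparisons, so the program is probably not stuck, just very slow. A minimal sketch of one way around this, assuming the words fit in memory; the names ExclusiveWords and exclusiveTo are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ExclusiveWords {
    // Hypothetical helper: returns the words of first that do not occur in second.
    static List<String> exclusiveTo(List<String> first, List<String> second) {
        // Copy the second list into a HashSet so each contains() is a hash lookup
        Set<String> lookup = new HashSet<>(second);
        // Work on a copy so the caller's list is left untouched
        List<String> result = new ArrayList<>(first);
        // removeAll still visits every element of result, but each check is now cheap
        result.removeAll(lookup);
        return result;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("a", "b", "c", "b");
        List<String> file2 = Arrays.asList("b", "d");
        System.out.println(exclusiveTo(file1, file2)); // prints [a, c]
    }
}
```

If only the count of distinct exclusive words matters, putting both lists into HashSets and calling removeAll on the first set would work as well.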

Related

Object needs to be created from the file and placed into an array

I am trying to create an object for each line of text and as each object is created, place it into an array. I'm struggling to place it into an array. This is my code:
File inFile = new File("shareholders.txt");
Scanner inputFile = new Scanner(inFile);
String str;
Shareholder shareholder = new Shareholder();
while (inputFile.hasNext()) {
str = inputFile.nextLine();
String tokens[] = str.split(",");
shareholder.setID(tokens[0]);
shareholder.setName(tokens[1]);
shareholder.setAddress(tokens[2]);
shareholder.setPortfolioID(tokens[3]);
}
If you have a fixed number of shareholders (n below), you can do this:
File inFile = new File("shareholders.txt");
Scanner inputFile = new Scanner(inFile);
String str;
int i=0;
Shareholder[] shareholder = new Shareholder[n];
while (inputFile.hasNext()) {
str = inputFile.nextLine();
String tokens[] = str.split(",");
shareholder[i++] = new Shareholder(tokens[0],tokens[1],tokens[2],tokens[3]);
}
Or, if you don't know the number of shareholders, you can use a list:
File inFile = new File("shareholders.txt");
Scanner inputFile = new Scanner(inFile);
String str;
List<Shareholder> list = new ArrayList<>();
while (inputFile.hasNext()) {
str = inputFile.nextLine();
String tokens[] = str.split(",");
list.add(new Shareholder(tokens[0],tokens[1],tokens[2],tokens[3]));
}
I think a list of shareholder objects might make the most sense here:
File inFile = new File("shareholders.txt");
Scanner inputFile = new Scanner(inFile);
String str;
List<Shareholder> list = new ArrayList<>();
while (inputFile.hasNext()) {
Shareholder shareholder = new Shareholder();
str = inputFile.nextLine();
String tokens[] = str.split(",");
shareholder.setID(tokens[0]);
shareholder.setName(tokens[1]);
shareholder.setAddress(tokens[2]);
shareholder.setPortfolioID(tokens[3]);
list.add(shareholder);
}
The reason a list makes sense here is because you might not know how many shareholders are present in the input file. Hence, an array might not work so well in this case (and even if the number of shareholders were fixed it could change at some later date).
Before reading the file, you cannot know how many lines it has.
The number of lines matters because an array must be created with a fixed size. Without it, you would have to grow the array repeatedly by creating a new, bigger one and copying the contents, which is awkward and slow.
So instead of working with an array directly, use an ArrayList while reading and return a plain array obtained from that ArrayList at the end.
My suggestion as a solution for this issue is the following. Please note that the code is not complete: parsing each line into the Shareholder object is left for you to fill in.
public Shareholder[] readFileIntoArray(String filename) throws FileNotFoundException
{
File sourceFile = new File(filename);
Scanner file = new Scanner(sourceFile);
// The ArrayList grows as needed, so the number of lines need not be known up front
ArrayList<Shareholder> lines = new ArrayList<>();
Shareholder sh;
while(file.hasNextLine())
{
String line = file.nextLine();
sh = new Shareholder();
//Parse line into the Shareholder object
lines.add(sh);
}
file.close();
// Convert the list to an array of the right type
return lines.toArray(new Shareholder[0]);
}

How to read a file with delimiters

I have a file like this, in which each section is delimited by "**". How can I read each section and put them into different data structures?
AAA
BBB
CCC
**
ccc:cc
ddd:dd
**
xyz;XYZ
abc;ABC
**
Name: John
Email: john#gmail.com
Name: Jack
Email: jack#gmail.com
Name: kate
Email: kake#hotmail.com
**
In a while loop, I can test whether the line equals "**". But since the number of lines in each section is unknown, it seems hard to recognize which section a particular line belongs to?
String line;
while((line=reader.readLine()) != null){
if(!line.equals("**")){
// The line has to be parsed and built into different data structures.
// For the first section, AAA, BBB, CCC will be added into an ArrayList.
}
}
IMO you should just make the reading method a little bit more clever.
Here is an example (a kind of pseudo code, assuming you have a reader that does an actual IO):
void main() {
List<List<String>> sections = ...
while(reader.hasMoreDataToProcess()) {
sections.add(processSection(reader));
}
}
List<String> processSection(reader) {
List<String> section = ...
do {
String line = reader.readLine();
if(line.equals("**")) { // end of section or whatever delimiter you have
return section;
}
section.add(line);
}while(true);
}
Sorry, in a hurry, so pseudocode:
currentSection = []
sections = [currentSection]
for each line:
if line is the separator:
currentSection = []
add currentSection to sections
else:
add line to currentSection
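The pseudocode above could be turned into Java roughly as follows. This is only a sketch: it assumes a separator is a line containing nothing but "**", and the names SectionReader and readSections are invented for the example. Unlike the pseudocode, it closes a section when it sees the separator, so a file that ends with "**" does not produce an empty trailing section.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class SectionReader {
    // Hypothetical helper: groups lines into sections, closing a section at each "**" line.
    static List<List<String>> readSections(BufferedReader reader) {
        List<List<String>> sections = new ArrayList<>();
        List<String> current = new ArrayList<>();
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().equals("**")) {
                    // Separator: store the finished section and start a new one
                    sections.add(current);
                    current = new ArrayList<>();
                } else {
                    current.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Keep trailing lines if the input does not end with a separator
        if (!current.isEmpty()) {
            sections.add(current);
        }
        return sections;
    }

    public static void main(String[] args) {
        String data = "AAA\nBBB\nCCC\n**\nccc:cc\nddd:dd\n**\n";
        List<List<String>> sections = readSections(new BufferedReader(new StringReader(data)));
        System.out.println(sections); // prints [[AAA, BBB, CCC], [ccc:cc, ddd:dd]]
    }
}
```

Because the sections come back in file order, the index tells you which parser to apply: sections.get(0) holds the plain words, sections.get(1) the colon-separated pairs, and so on.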
You can use split method of the string class in Java.
String string = "a-b,b-d,c-s,d-w,e-e,f-e";
String[] parts = string.split(",");
String part1 = parts[0]; // a-b
String part2 = parts[1]; // b-d
You should use scanner for this scenario. Here's how you do it. This code is not tested.
File file = new File("somefile.txt");
try {
Scanner sc = new Scanner(file);
sc.useDelimiter("\\*\\*");
while (sc.hasNext()) {
String s = sc.next();
}
sc.close();
}
catch (FileNotFoundException e) {
e.printStackTrace();
}
You can use a Scanner with a FileInputStream to scan the file, using useDelimiter(String) (which accepts a regex pattern) to set your delimiter.
public class Test {
public static void main(String[] args) {
ArrayList<String> firstList = new ArrayList<>();
ArrayList<String> secondList = new ArrayList<>();
try(Scanner scanner = new Scanner(new FileInputStream(new File("yourFile.txt"))).useDelimiter("[*]+")) {
firstList.add(scanner.next());
secondList.add(scanner.next());
// and so on
scanner.close();
} catch(FileNotFoundException e) {
e.printStackTrace();
}
}
}
This will take everything above ** and create a String out of it. If you want, you can then split the String, and grab the data from each line.
String[] split = scanner.next().split("\n");
for(String string : split) {
firstList.add(string);
}
In the first example, the regex [*]+ matches one or more consecutive * characters. Learn more about regex (regular expressions) to add flexibility.
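One detail to watch with the delimiter approach: each token still contains the line breaks around the "**" line, so splitting it on "\n" yields blank entries. Here is a small sketch that combines the delimiter with a per-line split and drops the blanks. The names DelimiterScan and sections are hypothetical, and it assumes "**" never appears inside a data line, since the delimiter would split there too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class DelimiterScan {
    // Hypothetical helper: one list of non-blank lines per "**"-delimited section.
    static List<List<String>> sections(Scanner scanner) {
        // Every occurrence of "**" ends a token
        scanner.useDelimiter("\\*\\*");
        List<List<String>> result = new ArrayList<>();
        while (scanner.hasNext()) {
            List<String> section = new ArrayList<>();
            // A token spans several lines; split it and skip the blank leftovers
            for (String line : scanner.next().split("\\r?\\n")) {
                if (!line.trim().isEmpty()) {
                    section.add(line.trim());
                }
            }
            if (!section.isEmpty()) {
                result.add(section);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner("AAA\nBBB\n**\nxyz;XYZ\n**\n");
        System.out.println(sections(scanner)); // prints [[AAA, BBB], [xyz;XYZ]]
        scanner.close();
    }
}
```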

Scanning, splitting and assigning values from a text file

I'm having trouble scanning a given file for certain words and assigning them to variables. So far I've chosen to use Scanner over BufferedReader because it's more familiar. I'm given a text file, and for this particular part I'm trying to read the first two words of each line (potentially unlimited lines) and maybe add them to an array of sorts. This is what I have:
File file = new File("example.txt");
Scanner sc = new Scanner(file);
while (sc.hasNextLine()) {
String line = sc.nextLine();
String[] ary = line.split(",");
I know it's a fair distance off; however, I'm new to coding and cannot get past this wall...
An example input would be...
ExampleA ExampleAA, <other items separated by ",">
ExampleB ExampleBB, <other items separated by ",">
...
and the proposed output
VariableA = ExampleA ExampleAA
VariableB = ExampleB ExampleBB
...
You can try something like this
File file = new File("D:\\test.txt");
Scanner sc = new Scanner(file);
List<String> list =new ArrayList<>();
int i=0;
while (sc.hasNextLine()) {
list.add(sc.nextLine().split(",",2)[0]);
i++;
}
char point='A';
for(String str:list){
System.out.println("Variable"+point+" = "+str);
point++;
}
My input:
ExampleA ExampleAA, <other items separated by ",">
ExampleB ExampleBB, <other items separated by ",">
Output:
VariableA = ExampleA ExampleAA
VariableB = ExampleB ExampleBB
To rephrase, you are looking to read the first 2 words of a line (everything before the first comma) and store it in a variable to process further.
To do so, your current code looks fine, however, when you grab the line's data, use the substring function in conjunction with indexOf to just get the first part of the String before the comma. After that, you can do whatever processing you want to do with it.
In your current code, ary[0] should give you the first 2 words.
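As an illustration of the substring and indexOf suggestion, here is a minimal sketch. The name beforeFirstComma is invented for the example, and returning the whole line when no comma is present is an assumption about the desired behaviour.

```java
class FirstField {
    // Hypothetical helper: the text before the first comma, or the whole line if there is none.
    static String beforeFirstComma(String line) {
        int comma = line.indexOf(',');
        // indexOf returns -1 when the line contains no comma
        String head = (comma >= 0) ? line.substring(0, comma) : line;
        return head.trim();
    }

    public static void main(String[] args) {
        System.out.println(beforeFirstComma("ExampleA ExampleAA, x, y")); // prints ExampleA ExampleAA
    }
}
```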
public static void main(String[] args) throws IOException
{
File file = new File("example.txt");
FileReader fr = new FileReader(file);
BufferedReader br = new BufferedReader(fr);
String line = "";
List<String[]> l = new ArrayList<>();
while ((line = br.readLine()) != null) {
System.out.println(line);
line = line.trim(); // remove leading and trailing whitespace
String[] arr = line.split(",");
String[] ary = arr[0].trim().split("\\s+");
if (ary.length < 2) {
continue; // skip lines without two words before the first comma
}
String firstTwoWords[] = new String[2];
firstTwoWords[0] = ary[0];
firstTwoWords[1] = ary[1];
l.add(firstTwoWords);
}
br.close();
Iterator<String[]> it = l.iterator();
while (it.hasNext()) {
String firstTwoWords[] = it.next();
System.out.println(firstTwoWords[0] + " " + firstTwoWords[1]);
}
}

How to read data from a text file into arrays in Java

I am having trouble with a programming assignment. I need to read data from a txt file and store it in parallel arrays. The txt file contents are formatted like this:
Line1: Stringwith466numbers
Line2: String with a few words
Line3(int): 4
Line4: Stringwith4643numbers
Line5: String with another few words
Line6(int): 9
Note: The "Line1: ", "Line2: ", etc is just for display purposes and isn't actually in the txt file.
As you can see it goes in a pattern of threes. Each entry to the txt file is three lines, two strings and one int.
I would like to read the first line into an array, the second into another, and the third into an int array. Then the fourth line would be added to the first array, the 5th line to the second array and the 6th line into the third array.
I have tried to write the code for this but can't get it working:
//Create Parallel Arrays
String[] moduleCodes = new String[3];
String[] moduleNames = new String[3];
int[] numberOfStudents = new int[3];
String fileName = "myfile.txt";
readFileContent(fileName, moduleCodes, moduleNames, numberOfStudents);
private static void readFileContent(String fileName, String[] moduleCodes, String[] moduleNames, int[] numberOfStudents) throws FileNotFoundException {
// Create File Object
File file = new File(fileName);
if (file.exists())
{
Scanner scan = new Scanner(file);
int counter = 0;
while(scan.hasNext())
{
String code = scan.next();
String moduleName = scan.next();
int totalPurchase = scan.nextInt();
moduleCodes[counter] = code;
moduleNames[counter] = moduleName;
numberOfStudents[counter] = totalPurchase;
counter++;
}
}
}
The above code doesn't work properly. When I try to print out an element of an array, it returns null for the string arrays and 0 for the int array, suggesting that the code to read the data in isn't working.
Any suggestions or guidance much appreciated as it's getting frustrating at this point.
The fact that only nulls get printed suggests that the file doesn't exist or is empty (assuming you print the arrays correctly).
It's a good idea to put in some checking to make sure everything is fine:
if (!file.exists())
System.out.println("The file " + fileName + " doesn't exist!");
Or you can actually just skip the above and also take out the if (file.exists()) line in your code and let the FileNotFoundException get thrown.
Another problem is that next() splits the input on whitespace by default, and the second line of each entry contains whitespace, so a single next() call returns only its first word.
nextLine should work:
String code = scan.nextLine();
String moduleName = scan.nextLine();
int totalPurchase = Integer.parseInt(scan.nextLine());
Or, changing the delimiter should also work: (with your code as is)
scan.useDelimiter("\\r?\\n");
You are reading the file line by line, so try this:
while(scan.hasNextLine()){
String code = scan.nextLine();
String moduleName = scan.nextLine();
int totalPurchase = Integer.parseInt(scan.nextLine().trim());
moduleCodes[counter] = code;
moduleNames[counter] = moduleName;
numberOfStudents[counter] = totalPurchase;
counter++;
}
String code = scan.nextLine();
String moduleName = scan.nextLine();
int totalPurchase = scan.nextInt();
scan.nextLine();
The extra nextLine() call moves the scanner past the line break that nextInt() leaves behind, so the next read starts on the following line.
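Putting the line-based suggestions together, a self-contained sketch might look like the one below. It is not the asker's exact setup: it uses lists instead of fixed-size parallel arrays so the number of entries need not be known in advance, it assumes every entry consists of exactly three lines, and the class name ModuleReader and the sample data are invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class ModuleReader {
    // Parallel lists: index i in each list belongs to the same entry
    final List<String> codes = new ArrayList<>();
    final List<String> names = new ArrayList<>();
    final List<Integer> counts = new ArrayList<>();

    // Reads entries of three lines each: code, name, number of students.
    void read(Scanner scan) {
        while (scan.hasNextLine()) {
            String code = scan.nextLine();
            if (code.trim().isEmpty()) {
                continue; // tolerate blank lines between entries
            }
            String name = scan.nextLine();
            // Parsing the whole line avoids the leftover line break that nextInt() leaves
            int count = Integer.parseInt(scan.nextLine().trim());
            codes.add(code);
            names.add(name);
            counts.add(count);
        }
    }

    public static void main(String[] args) {
        ModuleReader reader = new ModuleReader();
        reader.read(new Scanner("CS101\nIntro to Programming\n4\nCS202\nData Structures\n9\n"));
        System.out.println(reader.names);  // prints [Intro to Programming, Data Structures]
        System.out.println(reader.counts); // prints [4, 9]
    }
}
```

To read from a file instead, pass new Scanner(new File(fileName)) and handle the FileNotFoundException.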

Scanner to reset pointer at previous line

My problem could be solved if the Scanner class had a previous() method. I am asking this question to find out whether there is any way to achieve this functionality.
Input:
a file with contents like
a,1
a,2
a,3
b,1
c,1
c,2
c,3
c,4
d,1
d,2
d,3
e,1
f,1
I need to create a list of all lines that start with the same letter.
try {
Scanner scanner = new Scanner(new File(fileName));
List<String> procList = null;
String line =null;
while (scanner.hasNextLine()){
line = scanner.nextLine();
System.out.println(line);
String[] sParts = line.split(",");
procList = new ArrayList<String>();
procList.add(line);
boolean isSamealpha = true;
while(isSamealpha){
String s1 = scanner.nextLine();
if (s1.contains(sParts[0])){
procList.add(s1);
}else{
isSamealpha = false;
System.out.println(procList);
}
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
I get output like
a,1
[a,1, a,2, a,3]
c,1
[c,1, c,2, c,3, c,4]
d,2
[d,2, d,3]
f,1
[f,1]
As you can see, it missed the lists for b and e. If I had a scanner.previous() method, I would have put it in the else of the second while loop. Because there is no previous method, I am stuck.
Please let me know if there are any methods I can use. I can't use FileUtils.readLines() because it's a 3GB file and I don't want to hold the whole file in memory.
I would suggest reconsidering your algorithm instead. You are missing tokens because your algorithm involves reading ahead to determine when the sequence has broken, yet you aren't collecting that next line of input into the same structures that you are placing "duplicate" entries.
You can solve this without needing to read backwards. If you know that the input is always sorted, just read line by line and keep a reference to the last line (to compare with the current one).
Below is some sample code that should help. (I only typed this; I did no checking.)
Scanner scanner = new Scanner(new File(fileName));
List<String> procList = null;
String line = null;
String previousAlpha = null;
while (scanner.hasNextLine()){
line = scanner.nextLine();
// the key is the part before the first comma
String alpha = line.split(",")[0];
if (previousAlpha == null) {
// very first line in the file
procList = new ArrayList<String>();
procList.add(line);
System.out.println(line);
previousAlpha = alpha;
}
else if (alpha.equals(previousAlpha)) {
// same letter as before
procList.add(line);
}
else {
// new letter, but not the very first
// line: print the finished group and start a new one
System.out.println(procList);
procList = new ArrayList<String>();
procList.add(line);
System.out.println(line);
previousAlpha = alpha;
}
}
// print the last group, which no following line closes
if (procList != null) {
System.out.println(procList);
}
scanner.close();
