Counting and comparing strings from CSV file - Java

I have a big CSV file of 18000 rows. They represent different kinds of articles from a liquor store. If I split up the rows with a BufferedReader, I get columns with things like article number, name, amount of alcohol, price, etc.
The seventh column holds the type of beverage (beer, wine, rum, etc.). What would be the best way to count how many articles there are of each type? I would like to be able to do this without having to know the types in advance. And I only want to use java.util.*.
Can I read through the column and store it in a queue, storing each new type in a set while reading? Then maybe I can compare all elements in the queue to this set?
br = new BufferedReader(new FileReader(filename));
while ((line = br.readLine()) != null) {
// using \t as separator
String[] articles = line.split(cvsSplitBy);
The output should be something like:
There are:
100 beers
2000 wines
etc. etc.
in the assortment

You could use a HashMap<String, Integer> to count how many products of each category you have. Usage would look like this:
HashMap<String, Integer> productCount = new HashMap<>(); // this must be outside the loop
// read your CSV here; everything below goes inside the loop
String product = csvColumns[6];
if (!productCount.containsKey(product)) { // first time we see this product
    productCount.put(product, 1);
} else { // product already exists, add 1
    int count = productCount.get(product);
    productCount.put(product, count + 1);
}
DISCLAIMER: You have to be sure all products are named identically, e.g. beer vs. Beer. To Java these are different Strings and so will get separate counts. One way around this is to convert every product name to uppercase, e.g. beer -> BEER, Beer -> BEER. The names will then appear in uppercase when you show the results.
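For reference, here is a minimal self-contained sketch of the same counting idea. It assumes tab-separated rows with the type in the seventh column (index 6); the class name BeverageTypeCounter, the method countByColumn and the sample rows are made up for illustration. It uses Map.merge instead of the explicit containsKey check and a TreeMap so the types print in sorted order.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class BeverageTypeCounter {

    // Counts how often each value occurs in the given zero-based column.
    static Map<String, Integer> countByColumn(BufferedReader reader, String separator, int column)
            throws IOException {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by type name
        String line;
        while ((line = reader.readLine()) != null) {
            String[] fields = line.split(separator);
            if (fields.length <= column) {
                continue; // skip rows that are too short to have this column
            }
            // Normalize so that "beer" and "Beer" are counted together.
            String type = fields[column].trim().toUpperCase(Locale.ROOT);
            counts.merge(type, 1, Integer::sum); // insert 1, or add 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample rows standing in for the real file.
        String sample = "1\tPils\t5.0\t12\tx\ty\tbeer\n"
                + "2\tMerlot\t13.0\t89\tx\ty\twine\n"
                + "3\tLager\t4.5\t10\tx\ty\tBeer\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            Map<String, Integer> counts = countByColumn(reader, "\t", 6);
            System.out.println("There are:");
            counts.forEach((type, n) -> System.out.println(n + " " + type));
        }
    }
}
```

To run it against the real file, you would wrap a FileReader instead of the StringReader and pass your own separator.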

Related

Comparison of values from file not working in java

Currently I am stuck on an issue with comparing values in a text file. My requirement, described below, is a bit unusual.
I receive a text file with data in the format below. Each line starts with a number in a particular format.
223---other line values
354---other line value
756---other line values
754---other line values
854---other line values
923---other line values
I have to validate that all the lines start in this order: 2, 3, 7, 8, 9. There can be multiple lines between 2 and 9, for example lines starting with 2,3,7,7,8,3,7,7,8,9. It is guaranteed that the 2 and 9 lines are the first and last lines in the file. Multiple 7s can appear between 3 and 8.
I came up with the logic below for this comparison, but it works for only one combination: lines starting with 2,3,7,7,8,9.
When there are multiple occurrences, say 2,3,7,7,8,3,7,7,8,9, it does not work. Can someone please help me with what is wrong here and how I can solve this issue? If there is a better way to meet my requirement, please suggest it. The volume of the input file is not high, at most about 10 to 20 thousand lines.
Set<String> recordTypeOrder = new HashSet<>();
BufferedReader rdr = new BufferedReader(new StringReader("path to my file here"));
for (String line = rdr.readLine(); line != null; line = rdr.readLine()) {
if(line.startsWith("2")){
recordTypeOrder.add("2");
}else if(line.startsWith("3")){
recordTypeOrder.add("3");
}else if(line.startsWith("7")){
recordTypeOrder.add("7");
}else if(line.startsWith("8")){
recordTypeOrder.add("8");
}else if(line.startsWith("9")){
recordTypeOrder.add("9");
}
}
Set<String> orderToCompare = new TreeSet<>(recordTypeOrder);
boolean compare = orderToCompare.equals(actualOrder());
if(!compare){
logger.info("== Processing failed =====");
throw new CustomException("======= Processing failed =======");
}
private static Set<String> actualOrder(){
Set<String> actualOrder= new HashSet<>();
actualOrder.add("2");
actualOrder.add("3");
actualOrder.add("7");
actualOrder.add("8");
actualOrder.add("9");
return actualOrder;
}
Many Thanks
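You need to keep both the order and the repetition of the 3,7,7,8 groups. A Set stores each value only once, and neither a HashSet nor a TreeSet remembers the order in which the lines arrived, so I think a TreeSet won't work. Can you try another data structure, such as a LinkedHashMap? That way you could store the data you need in arrival order and then write a custom function to validate it.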
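One possible shape for such a custom validation function is a small state machine that only remembers the previous record type, rather than the LinkedHashMap suggested above. This is a sketch under assumptions: the pattern is taken to be 2, then one or more groups of 3, one or more 7s, 8, and finally 9; the class name RecordOrderValidator and the method isValidOrder are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class RecordOrderValidator {

    // Returns true if the first characters of the lines follow 2 (3 7+ 8)+ 9.
    static boolean isValidOrder(List<String> lines) {
        char previous = 0; // 0 means "no line seen yet"
        for (String line : lines) {
            if (line.isEmpty()) {
                return false; // assumption: empty lines are not allowed
            }
            char current = line.charAt(0);
            boolean allowed;
            switch (current) {
                case '2': allowed = previous == 0; break;                      // only as the first line
                case '3': allowed = previous == '2' || previous == '8'; break; // starts a group
                case '7': allowed = previous == '3' || previous == '7'; break; // one or more 7 lines
                case '8': allowed = previous == '7'; break;                    // closes a group
                case '9': allowed = previous == '8'; break;                    // only after a closed group
                default:  allowed = false;                                     // unknown record type
            }
            if (!allowed) {
                return false;
            }
            previous = current;
        }
        return previous == '9'; // the file must end with the 9 line
    }

    public static void main(String[] args) {
        List<String> good = Arrays.asList("223---a", "354---b", "756---c", "754---d", "854---e",
                "354---f", "756---g", "854---h", "923---i");
        List<String> bad = Arrays.asList("223---a", "354---b", "854---c", "923---d"); // no 7 line
        System.out.println(isValidOrder(good));
        System.out.println(isValidOrder(bad));
    }
}
```

Because it only looks at one line at a time, the same check can run inside the existing BufferedReader loop without storing the whole file.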

jython convert text file to list of strings

I must convert a text file into a list of strings separated by commas (with no whitespace and without the first line). After printing that, I need to print the name of each state, how many lines contain each state, the sum of all Cen2010 values (the first number in each line) for each state, the sum of Est2013 values (the last number in each line) for each state, and the total change from the Cen2010 population to the Est2013 population for each state.
Text File Example:
NAME,STNAME,Cen2010,Base2010,Est2010,Est2011,Est2012,Est2013
"Abingdon city",Illinois,3319,3286,3286,3270,3242,3227
"Addieville village",Illinois,252,252,252,250,250,247
"Addison village",Illinois,36942,36964,37007,37181,37267,37385
"Adeline village",Illinois,85,85,85,84,84,83
Current Code:
def readPopest():
    censusfile=pickAFile()
    cf=open(censusfile,"rt")
    cflines=cf.readlines()
    for i in range(len(cflines)-1):
        lines=cflines[i+1]
        estimate=lines.strip().split(',')
        print estimate
Returning:
['"Abingdon city"', 'Illinois', '3319', '3286', '3286', '3270', '3242', '3227']
['"Addieville village"', 'Illinois', '252', '252', '252', '250', '250', '247']
['"Addison village"', 'Illinois', '36942', '36964', '37007', '37181', '37267', '37385']
['"Adeline village"', 'Illinois', '85', '85', '85', '84', '84', '83']
I think you could import this data into an SQL database, where it is very easy to sum, filter, etc.
But in Python we have dictionaries. You can read the data and fill a dictionary whose keys are the state names. Then for each line you add the town to the list of towns in that state and add its numbers to the numbers already saved. Of course, for the first town in a state you must create the structure with two lists: one for towns and one for numbers. In code it looks like this:
def add_items(main_dict, state, town, numbers):
    try:
        towns_arr, numbers_arr = main_dict[state]
        towns_arr.append(town)
        for i in range(len(numbers)):
            numbers_arr[i] += numbers[i]
    except KeyError:
        town_arr = [town, ]
        main_dict[state] = [town_arr, numbers]
Now you must use it in your main code that reads file:
state_dict = {}
cf = open(censusfile, "rt")
lines = cf.readlines()
for line in lines[1:]: # we skip 1st line
    arr = line.strip().split(',')
    town = arr[0]
    state = arr[1]
    numbers = [int(x) for x in arr[2:]]
    add_items(state_dict, state, town, numbers)
print(state_dict)
As homework, try to print this dictionary in the desired format.

Searching a txt file for specific characters (Java)

I have a big txt file (a dictionary) which contains about 100k+ words, ordered like this:
tree trees asderi 12
car cars asdfei 123
mouse mouses dasrkfi 333
plate plates asdegvi 333
......
(P.S. There are no empty rows in between.)
What I want to do is check the third column (asderi in the first row) and, if the word contains both the letters "i" and "e", copy the first word in that row (tree in this case) to a new txt file. I don't need a whole solution, just an example of how to read the third word, check it for those letters, and, if the check is true, print out the first word in that line.
With big data files you want to process the data line by line rather than read all of it into memory. You may want to start with this:
BufferedReader br = new BufferedReader(new FileReader(new File("C:/sample/sample.txt")));
String line;
while ((line = br.readLine()) != null) {
// process the line.
}
br.close();
Once you have the line, I bet you will be able to use the common String methods like .indexOf(..), .substring(..) and .split(..) to acquire the data you want (especially since the source file seems to have well-structured data).
So, assuming your "columns" are always separated by a space, no column ever contains a word with a space, and no column is ever missing, you could get the columns using .split like this:
// this will be the current line of the file
String s = "tree trees asderi 12";
String[] fragments = s.split(" ");
String thirdColumn = fragments[2];
boolean hasI = thirdColumn.contains("i");
String firstColumn = fragments[0];
System.out.println("Fragment: "+thirdColumn+" contains i: "+hasI+" thats why i want the first fragment: "+firstColumn);
But in the end you will have to experiment a bit with the String methods to get it all together, especially for the special cases this file will probably bring up ;)
You may update your question with the source you manage to write using these hints, and then ask again if you get stuck.
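As a rough sketch of how the pieces could fit together, the method below checks the third column for both letters, which is what the question asks for. It assumes whitespace-separated columns; the class name ThirdColumnFilter and the method name are made up, and the sample lines are the ones from the question. Writing the result to a new file is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ThirdColumnFilter {

    // Returns the first word of every line whose third word contains both "i" and "e".
    static List<String> firstWordsWhereThirdHasIAndE(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            String[] fragments = line.trim().split("\\s+"); // split on runs of whitespace
            if (fragments.length < 3) {
                continue; // special case: line with a missing column
            }
            String thirdColumn = fragments[2];
            if (thirdColumn.contains("i") && thirdColumn.contains("e")) {
                result.add(fragments[0]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "tree trees asderi 12",
                "car cars asdfei 123",
                "mouse mouses dasrkfi 333",
                "plate plates asdegvi 333");
        // "dasrkfi" has no "e", so "mouse" is skipped.
        System.out.println(firstWordsWhereThirdHasIAndE(lines)); // prints [tree, car, plate]
    }
}
```

In the real program the loop body would sit inside the while ((line = br.readLine()) != null) loop, so the file is still processed one line at a time.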

How to implement array processing with read file data?

I'm trying to figure out how to read data from a file, using an array. The data in the file is listed like this:
Clarkson 80000
Seacrest 100000
Dunkleman 75000
...
I want to store that information using an array. Currently I have something like this to read the data and use it:
String name1 = in1.next();
int vote1 = in1.nextInt();
//System.out.println(name1 +" " + vote1);
String name2 = in1.next();
int vote2 = in1.nextInt();
//System.out.println(name2 +" " + vote2);
String name3 = in1.next();
int vote3 = in1.nextInt();
...
//for all names
The problem is that doing it this way means I can never handle the file data for more contestants or whatnot.
While I can use this approach, handle all the math in different methods, and get the expected output... it's really inefficient, I think.
Output expected:
American Idol Fake Results for 2099
Idol Name      Votes Received   % of Total Votes
__________________________________________________
Clarkson               80,000              14.4%
Seacrest              100,000              18.0%
Dunkleman              75,000              13.5%
Cowell                110,000              19.7%
Abdul                 125,000              22.4%
Jackson                67,000              12.0%
Total Votes           557,000
The winner is Abdul!
I figure reading input file data into arrays is likely easy using java.io.BufferedReader. Is there a way not to use that?
I looked at this: Java: How to read a text file, but I'm stuck thinking this is a different implementation.
I want to try to process all the information through understandable arrays and maybe at least 2-3 methods (in addition to the main method that reads and stores all data for runtime). But say I want to use that data to find percentages and so on (like in the output), figure out the winner... and maybe even alphabetize the results!
I want to try something and learn how the code works to get a feel for the concept at hand. ;c
int i = 0;
while (in.hasNext()) {
    String name = in.next();   // next() reads one token; nextLine() would swallow the votes too
    int vote = in.nextInt();
    // Do whatever you need here: print, save name and vote, etc.
    // For example, to keep everything in one 2D String array, store the vote as text:
    array[i][0] = name;
    array[i][1] = String.valueOf(vote);
    // If you want to store names and votes separately, use two arrays:
    nameArray[i] = name;
    voteArray[i] = vote;
    i++;
}
This loops until the scanner finds that there is no more input to read. Inside the loop, you can do anything you want (print name and votes, etc.). In this case, you save all the values into array[][].
array[][] will look like this:
array[0][0]= Clarkson
array[0][1]= 80000
array[1][0]= Seacrest
array[1][1]= 100000
...and so on.
Also, I can see that you have to do some maths. So, if you save the votes as a String, you should convert them to double this way:
double votesInDouble= Double.parseDouble(array[linePosition][1]);
You have several options:
1. Create a class to represent your file data, then keep an array of those objects.
2. Maintain two arrays in parallel, one for the names and the other for the votes.
3. Use a Map, where the name of the person is the key and the number of votes is the value:
   a) it gives you direct access like an array
   b) you don't need to create a class
Option 1:
public class Idol
{
private String name;
private int votes;
public Idol(String name, int votes)
{
// ...
}
}
int index = 0;
Idol[] idols = new Idol[SIZE];
// read from file
String name1 = in1.next();
int vote1 = in1.nextInt();
//create Idol
Idol i = new Idol(name1, vote1);
// insert into array, increment index
idols[index++] = i;
Option 2:
int index = 0;
String[] names = new String[SIZE];
int[] votes = new int[SIZE];
// read from file
String name1 = in1.next();
int vote1 = in1.nextInt();
// insert into arrays
names[index] = name1;
votes[index++] = vote1;
Option 3:
// create Map
Map<String, Integer> idolMap = new HashMap<>();
// read from file
String name1 = in1.next();
int vote1 = in1.nextInt();
// insert into Map
idolMap.put(name1, vote1);
Now you can go back and manipulate the data to your heart's content.
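To show how Option 1 could carry through to the totals, percentages and winner from the question, here is a minimal sketch. It swaps the fixed-size array for an ArrayList so the number of contestants does not have to be known in advance; the class IdolResults and its helper methods are made up for illustration, and the Scanner reads the sample data from a String instead of the real file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Scanner;

class IdolResults {

    // Simple value class, as in Option 1.
    static class Idol {
        final String name;
        final int votes;

        Idol(String name, int votes) {
            this.name = name;
            this.votes = votes;
        }
    }

    // Reads "name votes" pairs until the input is exhausted.
    static List<Idol> readIdols(Scanner in) {
        List<Idol> idols = new ArrayList<>();
        while (in.hasNext()) {
            String name = in.next();
            int votes = in.nextInt();
            idols.add(new Idol(name, votes));
        }
        return idols;
    }

    static int totalVotes(List<Idol> idols) {
        int total = 0;
        for (Idol idol : idols) {
            total += idol.votes;
        }
        return total;
    }

    // Assumes the list is not empty; on a tie the first idol with the top score wins.
    static Idol winner(List<Idol> idols) {
        Idol best = idols.get(0);
        for (Idol idol : idols) {
            if (idol.votes > best.votes) {
                best = idol;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Scanner in = new Scanner("Clarkson 80000\nSeacrest 100000\nDunkleman 75000\n"
                + "Cowell 110000\nAbdul 125000\nJackson 67000\n");
        List<Idol> idols = readIdols(in);
        int total = totalVotes(idols);
        for (Idol idol : idols) {
            double percent = 100.0 * idol.votes / total;
            System.out.printf(Locale.US, "%-12s %,10d %6.1f%%%n", idol.name, idol.votes, percent);
        }
        System.out.printf(Locale.US, "Total Votes  %,10d%n", total);
        System.out.println("The winner is " + winner(idols).name + "!");
    }
}
```

Keeping the reading, the totalling and the winner search in separate methods matches the "2-3 methods" the question asks about, and sorting the list by name would be one more small method.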

Get csv and compare lines. ArrayList? Java

I don't use Java very often and now I've got a problem.
I want to read a CSV file like this one:
A,B,C,D
A,B,F,K
E,F,S,A
A,B,C,S
A,C,C,S
Java doesn't have dynamic arrays, so I chose an ArrayList. This works so far. The problem is:
How can I store the ArrayList? I think another ArrayList would help.
This is what I've got:
BufferedReader reader = new BufferedReader(
new InputStreamReader(this.getClass().getResourceAsStream(
"../data/" + filename + ".csv")));
List rows = new ArrayList();
String line;
while ((line = reader.readLine()) != null) {
rows.add(Arrays.asList(line.split(",")));
}
Now I get an ArrayList with a size of 5 for rows.size().
How do I get row[0][0], for example?
What do I want to do? I want to find rows that are the same except for the last column.
For example, I want to find row 0 and row 3.
Thank you very much.
Thank you all! You helped me a lot. =) Maybe Java and I will become friends =) THANKS!
You don't need to know the row size in advance; String.split() returns a String array:
List<String[]> rows = new ArrayList<String[]>();
String line = null;
while((line = reader.readLine()) != null)
rows.add(line.split(",", -1));
To access a specific row:
int len = rows.get(0).length;
String val = rows.get(0)[0];
Also, are you always comparing by the entire row except the last column? You could just strip off the last value (line.replaceFirst(",[^,]*$", "")) and compare the rows as strings (you have to be careful about whitespace and other formatting, of course).
A slightly different way:
Set<String> rows = new HashSet<String>();
String line = null;
while((line = reader.readLine()) != null){
if(!rows.add(line.substring(0, line.lastIndexOf(','))))
System.out.println("duplicate found: " + line);
}
Of course, modify as necessary if you actually need to capture the matching lines.
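If you do need to capture which rows match, one way is to group the row indices by everything before the last comma. This is only a sketch: the class name RowMatcher and the method groupByAllButLastColumn are hypothetical, a line without any comma is treated as its own key, and the sample data is the CSV from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowMatcher {

    // Maps "row without its last column" to the indices of all rows sharing that prefix.
    static Map<String, List<Integer>> groupByAllButLastColumn(List<String> lines) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>(); // keeps first-seen order
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            int lastComma = line.lastIndexOf(',');
            // Assumption: a line without a comma is used as the key unchanged.
            String key = lastComma < 0 ? line : line.substring(0, lastComma);
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A,B,C,D", "A,B,F,K", "E,F,S,A", "A,B,C,S", "A,C,C,S");
        for (Map.Entry<String, List<Integer>> entry : groupByAllButLastColumn(lines).entrySet()) {
            if (entry.getValue().size() > 1) {
                // Only row 0 and row 3 share the prefix "A,B,C".
                System.out.println(entry.getKey() + " -> rows " + entry.getValue());
            }
        }
    }
}
```

For the sample file this reports rows 0 and 3, which is the pair the question asks for.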
You'll need to declare an ArrayList of arrays. Assuming the CSV file has a known number of columns, the only dynamic list needed here is the "rows" of your "table": an ArrayList (rows) of char[] arrays (columns). (If the number of columns is not known, then an ArrayList of ArrayLists is fine.)
It's just like a 2D table in any other language: an array of arrays. It's just that in this case one of the arrays needs to be dynamic.
To read the file you'll need two loops: one that reads each line, just as you're doing, and another that reads it char by char.
Just a quick note: if you are going to declare an array like this:
char[] row = new char[5];
and then going to add each row to the ArrayList like this:
yourList.add(row);
If you reuse that one row array for every line, you will end up with a list full of references to the same array. You'll need to use the .clone() method like this:
yourList.add(row.clone());
To access it like table[1][2], use arraylist.get(1)[2] for an ArrayList of arrays, or arraylist.get(1).get(2) for an ArrayList of ArrayLists.
