jython convert text file to list of strings - java

I must convert a text file into a list of strings split on commas (with no whitespace and without the first line). After printing that, I need to print, for each state: its name, how many lines contain it, the sum of all Cen2010 values (the 1st number in each line), the sum of all Est2013 values (the last number in each line), and the total change from the Cen2010 population to the Est2013 population.
Text File Example:
NAME,STNAME,Cen2010,Base2010,Est2010,Est2011,Est2012,Est2013
"Abingdon city",Illinois,3319,3286,3286,3270,3242,3227
"Addieville village",Illinois,252,252,252,250,250,247
"Addison village",Illinois,36942,36964,37007,37181,37267,37385
"Adeline village",Illinois,85,85,85,84,84,83
Current Code:
def readPopest():
    censusfile=pickAFile()
    cf=open(censusfile,"rt")
    cflines=cf.readlines()
    for i in range(len(cflines)-1):
        lines=cflines[i+1]
        estimate=lines.strip().split(',')
        print estimate
Returning:
['"Abingdon city"', 'Illinois', '3319', '3286', '3286', '3270', '3242', '3227']
['"Addieville village"', 'Illinois', '252', '252', '252', '250', '250', '247']
['"Addison village"', 'Illinois', '36942', '36964', '37007', '37181', '37267', '37385']
['"Adeline village"', 'Illinois', '85', '85', '85', '84', '84', '83']

I think you could import this data into an SQL database, where it is very easy to sum, filter, etc.
But in Python we have dictionaries. You can read the data and fill a dictionary whose key is the name of the state. Then, for each line, you add the town to the list of towns in that state and add its numbers to the numbers already saved. For the first town in a state you must create a structure with two lists: one for towns and one for numbers. In code it looks like:
def add_items(main_dict, state, town, numbers):
    try:
        towns_arr, numbers_arr = main_dict[state]
        towns_arr.append(town)
        for i in range(len(numbers)):
            numbers_arr[i] += numbers[i]
    except KeyError:
        # first town seen for this state
        towns_arr = [town, ]
        main_dict[state] = [towns_arr, numbers]
Now you must use it in your main code that reads the file:
state_dict = {}
cf = open(censusfile, "rt")
lines = cf.readlines()
for line in lines[1:]: # we skip 1st line
    arr = line.strip().split(',')
    town = arr[0]
    state = arr[1]
    numbers = [int(x) for x in arr[2:]]
    add_items(state_dict, state, town, numbers)
print(state_dict)
As homework, try to print this dictionary in the desired format.
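For comparison with the Java snippets further down this page, the same grouping idea can be sketched in Java. This is only an illustration, not the answer's code: the class `StateTotals` and its methods are hypothetical names, and it assumes that no town name contains a comma, that the state is the second field, that Cen2010 is the third field, and that Est2013 is the last field.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class StateTotals {

    /** Returns, per state, { line count, sum of Cen2010, sum of Est2013 }. */
    static Map<String, long[]> totalsByState(List<String> lines) {
        Map<String, long[]> totals = new TreeMap<>(); // TreeMap keeps states sorted by name
        for (int i = 1; i < lines.size(); i++) {      // start at 1 to skip the header line
            String[] fields = lines.get(i).trim().split(",");
            if (fields.length < 3) {
                continue; // not a data line
            }
            long cen2010 = Long.parseLong(fields[2]);
            long est2013 = Long.parseLong(fields[fields.length - 1]);
            // Create the per-state accumulator the first time a state is seen.
            long[] t = totals.computeIfAbsent(fields[1], state -> new long[3]);
            t[0]++;
            t[1] += cen2010;
            t[2] += est2013;
        }
        return totals;
    }

    /** Formats one state's totals, including the change from Cen2010 to Est2013. */
    static String describe(String state, long[] t) {
        return state + ": " + t[0] + " lines, Cen2010 " + t[1]
                + ", Est2013 " + t[2] + ", change " + (t[2] - t[1]);
    }
}
```

Calling `describe` for each entry of the returned map prints one summary line per state.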

Related

Counting and comparing strings from csv file

I have a big csv file of 18000 rows. They represent different kinds of articles from a liquor store. If I split up the rows with a BufferedReader, I get columns with things like article number, name, amount of alcohol, price, etc.
The seventh column holds the type of beverage (beer, wine, rum, etc.). What would be the best way to count how many articles there are of each type? I would like to be able to do this without having to know the types in advance. And I only want to use java.util.*.
Can I read through the column and store it in a queue, and while reading store each new type in a set? Then maybe I can compare all elements in the queue to this set?
br = new BufferedReader(new FileReader(filename));
while ((line = br.readLine()) != null) {
// using \t as separator
String[] articles = line.split(cvsSplitBy);
The output should be something like:
There are:
100 beers
2000 wines
etc. etc.
in the assortment
You could use a HashMap<String, Integer> to track how many products of each category you have. The usage would look like this:
HashMap<String, Integer> productCount = new HashMap<>(); // This must be outside the loop
// Read your CSV here; the rest should be placed inside the loop
String product = csvColumns[6];
if (!productCount.containsKey(product)) { // The product doesn't exist yet
    productCount.put(product, 1);
} else { // Product exists, add 1
    int count = productCount.get(product);
    productCount.put(product, count + 1);
}
DISCLAIMER: You have to be sure all products are named the same way, e.g. beer or Beer. To Java these are different Strings and so will have different counts. One way around this is to convert every product name to uppercase, e.g. beer -> BEER, Beer -> BEER. As a result, all names will be shown in uppercase when you display the results.
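A compact way to combine the counting and the uppercase normalization is `Map.merge`, available since Java 8. The following is a minimal sketch rather than the answerer's code: the class name `CategoryCounter` is made up, and the separator and column index are passed in as parameters because the question does not pin them down.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class CategoryCounter {

    /** Counts rows per category found in the given (zero-based) column. */
    static Map<String, Integer> countByCategory(List<String> rows, String separator, int column) {
        Map<String, Integer> counts = new TreeMap<>(); // TreeMap keeps categories sorted
        for (String row : rows) {
            String[] columns = row.split(separator);
            if (columns.length <= column) {
                continue; // skip rows that are too short
            }
            // Normalize so "beer" and "Beer" share one counter.
            String category = columns[column].trim().toUpperCase(Locale.ROOT);
            // Insert 1 for a new category, otherwise add 1 to the existing count.
            counts.merge(category, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList(
                "1\tA\tx\tx\tx\tx\tbeer",
                "2\tB\tx\tx\tx\tx\tBeer",
                "3\tC\tx\tx\tx\tx\twine");
        countByCategory(rows, "\t", 6)
                .forEach((category, count) -> System.out.println(count + " " + category));
    }
}
```

Only `java.util` classes are used, which matches the constraint in the question.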

Questions regarding programming a single-line calculator in Java

I am currently an early CS student and have begun projects outside of class just to gain more experience. I thought I would try to design a calculator.
However, instead of using prompts like "Input a number" etc. I wanted to design one that would take an input of for example "1+2+3" and then output the answer.
I have made some progress, but I am stuck on how to make the calculator more flexible.
Scanner userInput = new Scanner(System.in);
String tempString = userInput.nextLine();
String calcString[] = tempString.split("");
Here, I take the user's input, 1+2+3 as a String that is then stored in tempString. I then split it and put it into the calcString array.
This works out fine, I get "1+2+3" when printing out all elements of calcString[].
for (i = 0; i <= calcString.length; i += 2) {
calcIntegers[i] = Integer.parseInt(calcString[i]);
}
I then convert the integer parts of calcString[] to actual integers by putting them into a integer array.
This gives me "1 0 2 0 3", where the zeroes are where the operators should eventually be.
if (calcString[1].equals("+") && calcString[3].equals("+")) {
int retVal = calcIntegers[0] + calcIntegers[2] + calcIntegers[4];
System.out.print(retVal);
}
This is where I am kind of stuck. This works out fine, but obviously isn't very flexible, as it doesn't account for multiple operators at the same time, like 1 / 2 * 3 - 4.
Furthermore, I'm not sure how to expand the calculator to take in longer lines. I have noticed a pattern where the even elements will contain numbers, and then odd elements contain the operators. However, I'm not sure how to implement this so that it will convert all even elements to their integer counterparts, and all the odd elements to their actual operators, then combine the two.
Hopefully you guys can throw me some tips or hints to help me with this! Thanks for your time, sorry for the somewhat long question.
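Create the string to hold the expression:
String expr = "1 + 2 / 3 * 4"; //or something else
Use the String method .split():
String[] tokens = expr.split(" ");
Loop through the tokens array with a for loop. If you encounter a number, push it onto a stack. If you encounter an operator AND there are two numbers on the stack, pop them off, operate on them, and push the result back onto the stack. Keep looping until no more tokens are available. At the end, there will be only one number left on the stack, and that is the answer. Note that, taken literally, this expects each operator to come after its two operands (postfix order, such as 1 2 +); for infix input like the expression above, remember the operator and apply it once the following number has been pushed.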
The "stack" in java can be represented by an ArrayList and you can add() to push items onto the stack and then you can use list.get(list.size()-1); list.remove(list.size()-1) as the pop.
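A minimal sketch of this stack idea, adapted to space-separated infix input: the operator is remembered until the number after it has been pushed, and then the top two numbers are popped and combined. The names `StackEvaluator`, `evaluateLeftToRight` and `apply` are made up for this illustration, and the sketch deliberately evaluates strictly left to right, so it ignores operator precedence.

```java
import java.util.ArrayList;
import java.util.List;

final class StackEvaluator {

    /** Evaluates space-separated infix tokens strictly left to right. */
    static int evaluateLeftToRight(String expr) {
        List<Integer> stack = new ArrayList<>(); // ArrayList used as a stack
        String pendingOp = null;                 // operator waiting for its right operand
        for (String token : expr.trim().split("\\s+")) {
            if (token.matches("[-+*/]")) {
                pendingOp = token;
                continue;
            }
            stack.add(Integer.parseInt(token)); // push
            if (pendingOp != null && stack.size() >= 2) {
                int right = stack.remove(stack.size() - 1); // pop
                int left = stack.remove(stack.size() - 1);  // pop
                stack.add(apply(left, pendingOp, right));   // push the result
                pendingOp = null;
            }
        }
        return stack.get(stack.size() - 1);
    }

    private static int apply(int left, String op, int right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            case "/": return left / right; // integer division
            default: throw new IllegalArgumentException("Unknown operator: " + op);
        }
    }
}
```

With this left-to-right rule, `"1 + 2 / 3 * 4"` is computed as ((1 + 2) / 3) * 4, which is not what the usual precedence rules would give.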
You are taking input from the user, and it can contain two-digit numbers too.
So
for (i = 0; i <= calcString.length; i += 2) {
calcIntegers[i] = Integer.parseInt(calcString[i]);
}
will not work for two-digit numbers, because the loop assumes every number occupies exactly one element and steps with i += 2.
A better way is to check each character in the string to see whether it falls in the digit range. You can use a condition based on ASCII values ('0' to '9').
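One way to apply the per-character range check is to collect consecutive digits into a buffer and emit the buffer as one token whenever a non-digit is reached. The sketch below assumes only the four operators +, -, * and / plus optional whitespace; `Tokenizer` and `tokenize` are illustrative names, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

final class Tokenizer {

    /** Splits an expression such as "4*51/6-3" into number and operator tokens. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder number = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c >= '0' && c <= '9') {
                number.append(c);              // still inside a number
            } else {
                if (number.length() > 0) {     // a number just ended
                    tokens.add(number.toString());
                    number.setLength(0);
                }
                if (c == '+' || c == '-' || c == '*' || c == '/') {
                    tokens.add(String.valueOf(c));
                } else if (!Character.isWhitespace(c)) {
                    throw new IllegalArgumentException("Unexpected character: " + c);
                }
            }
        }
        if (number.length() > 0) {             // flush the trailing number
            tokens.add(number.toString());
        }
        return tokens;
    }
}
```

After tokenizing, numbers sit at even indices and operators at odd indices regardless of how many digits each number has, so the even/odd pattern noticed in the question holds again.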
Since you have split your entire input into single-character strings, what you should do is check where the operations appear in your calcString array.
You can use this regex to check whether any particular String is an operation:
Pattern.matches("[-+*/]", operation)
where operation is a String value in calcString.
Use this check to separate values from operations: first find the elements that pass the check, then group together the neighboring elements that do not.
For example,
If user inputs
4*51/6-3
You should find that calcString[1],calcString[4] and calcString[6] are operations.
Then you should find the values you need to perform operations on by consolidating neighboring digits that are not separated by operations. In the above example, you must consolidate calcString[2] and calcString[3]
To consolidate such digits you can use a function like the following:
public static int consolidate(int startPosition, int endPosition, String[] digits)
{
    // start with the least significant digit
    int number = Integer.parseInt(digits[endPosition]);
    int power = 10;
    for(int i=endPosition-1; i>=startPosition; i--)
    {
        number = number + (power*Integer.parseInt(digits[i]));
        power*=10;
    }
    return number;
}
where startPosition is the position of the first digit of a number (the start of the array, or the position immediately following an operation),
and endPosition is the position of its last digit (the last position before the next operation).
The array containing the user input must also be passed in as the third argument!
In the example above you can consolidate calcString[2] and calcString[3] by calling:
consolidate(2,3,calcString)
Remember to verify that only integers exist between the mentioned positions in calcString!
REMEMBER!
You should account for a situation where the user enters multiple operations consecutively.
You need a priority processing algorithm based on BODMAS (Brackets, Orders, Division, Multiplication, Addition and Subtraction) or another mathematical rule of your preference.
Remember to specify that your program handles only +, -, * and /, and not power, root, etc. functions.
Choose your data types according to the range of inputs you are expecting. A Java int only handles values from -2,147,483,648 to 2,147,483,647!
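For the four operators mentioned, one simple way to honor the priority rule is a two-pass evaluation: fold every * and / into the running term first, then apply + and - from left to right. The sketch below is one possible approach, not a full BODMAS implementation: it has no brackets or powers, uses int arithmetic (so division truncates), and expects a list of alternating number and operator tokens. `PrecedenceEvaluator` is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;

final class PrecedenceEvaluator {

    /** Evaluates alternating number/operator tokens, e.g. ["1", "+", "2", "*", "3"]. */
    static int evaluate(List<String> tokens) {
        if (tokens.size() % 2 == 0) {
            throw new IllegalArgumentException("Expected: number (operator number)*");
        }
        // Pass 1: fold * and / into the previous term, keep + and - for later.
        List<Integer> terms = new ArrayList<>();
        List<String> additiveOps = new ArrayList<>();
        terms.add(Integer.parseInt(tokens.get(0)));
        for (int i = 1; i < tokens.size(); i += 2) {
            String op = tokens.get(i);
            int value = Integer.parseInt(tokens.get(i + 1));
            int last = terms.size() - 1;
            switch (op) {
                case "*": terms.set(last, terms.get(last) * value); break;
                case "/": terms.set(last, terms.get(last) / value); break; // integer division
                case "+":
                case "-": additiveOps.add(op); terms.add(value); break;
                default: throw new IllegalArgumentException("Unknown operator: " + op);
            }
        }
        // Pass 2: apply + and - from left to right.
        int result = terms.get(0);
        for (int i = 0; i < additiveOps.size(); i++) {
            int term = terms.get(i + 1);
            result = additiveOps.get(i).equals("+") ? result + term : result - term;
        }
        return result;
    }
}
```

Two consecutive operators make `Integer.parseInt` throw a `NumberFormatException`, which covers the "multiple operations consecutively" case by rejecting the input instead of guessing.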

Regex to parse multiline data

I have a following data from a file and I would like to see if I can do a regex parsing here
Name (First Name) City Zip
John (retired) 10007
Mark Baltimore 21268
....
....
Avg Salary
70000 100%
It's not a big file, and the entire contents are available in a String object with newline characters (\n) (String data = "data from the file").
I am trying to get the name, city, and zip, and then the salary and percentage details.
Data inside () is considered part of the Name field.
Spaces are valid in the Name field; the other fields contain no spaces.
'Avg Salary' appears only at the end of the file.
Will it be easy to do this via regex parsing in Java?
If the text file is space-aligned, you can (and probably should) extract the fields based on the number of characters. So, you take the first n characters in each line as the Name, the next m characters as the City, and so on.
Here is some code that extracts the fields using the above method, calculating the field widths automatically from the header line, which we assume is known.
String data = "data from the file";
// Count the newlines so the array has enough rows
int numNewLines = data.length()-data.replace("\n","").length();
String[][] result = new String[numNewLines][3];
String[] lines = data.split("\n");
int avgSalary = 0;
// Column offsets are taken from the header line
int secondFieldStart = lines[0].indexOf("City");
int thirdFieldStart = lines[0].indexOf("Zip");
for(int i=1; i<lines.length; i++){
    String line = lines[i].trim();
    if(line.equals("Avg Salary")){
        // The salary is the first whitespace-separated token on the next line
        avgSalary = Integer.parseInt(lines[i+1].trim().split("\\s+")[0]);
        break;
    }
    result[i-1][0] = line.substring(0,secondFieldStart).trim(); // Name
    result[i-1][1] = line.substring(secondFieldStart,thirdFieldStart).trim(); // City
    result[i-1][2] = line.substring(thirdFieldStart).trim(); // Zip
}
Using regex will be possible, but it will be more complicated. And regex won't be able to differentiate person's name and city's name anyway:
Consider this case:
John Long-name Joe New York 21003
How would you know the name is John Long-name Joe instead of John Long-name Joe New if you don't know that the length of the first field is at most 20 characters? (note that length of John Long-name Joe is 19 characters, leaving one space between it and New in New York)
Of course if your fields are separated by other characters (like tab character \t), you can split each line based on that. And it's easy to modify the code above to accommodate that =)
Since the solution I proposed above is simpler, I guess you might want to try it instead =)

Selecting the lines in a text file starting with integer values in java

I have extracted text from a PDF using PDFBox for Java, and the output is as follows:
Tribhuvan University
Institute of Engineering
Entrance Examination Board
BE/BArch Entrance Examination 2070
Pass List
ROLLNO NAME GENDER DISTRICT Percent Rank
1001 AADARSH        DEO MALE Saptari 51.429 3442
1002 AADARSH        MALLA MALE Bajhang 43.429 5714
1003 AADARSHA        KHANAL MALE Rupandehi 40.571 6709
The list goes on, with the first 6 lines repeated on every page (150 pages). What I need to do in Java is select the lines that start with an integer value and create a new file containing only those lines.
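You can split the output into separate lines and then test each line with .matches("[0-9].*"). (Note that startsWith compares against a literal string, not a regex, so it cannot be used for this.)
For example:
// let's presume that you've loaded the lines into "List<String> lines"..
// empty ArrayList for storing the selected lines
List<String> linesToWrite = new ArrayList<>();
for(String line : lines)
{
    // matches() must cover the whole line, hence the trailing .*
    if(line.matches("[0-9].*"))
    {
        linesToWrite.add(line);
    }
}
// and now write it to the other file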
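The reading and writing steps that the snippet leaves open could be filled in with `java.nio.file.Files`. This is a sketch under assumptions: the extracted text is taken to be stored as a UTF-8 file, the class name `RollNumberFilter` and its method names are illustrative, and `Character.isDigit` is used in place of the regex (it also accepts non-ASCII digits).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class RollNumberFilter {

    /** Keeps only the lines whose first character is a digit. */
    static List<String> linesStartingWithDigit(List<String> lines) {
        List<String> selected = new ArrayList<>();
        for (String line : lines) {
            if (!line.isEmpty() && Character.isDigit(line.charAt(0))) {
                selected.add(line);
            }
        }
        return selected;
    }

    /** Reads the input file, filters it, and writes the result to the output file. */
    static void filterFile(Path input, Path output) throws IOException {
        List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
        Files.write(output, linesStartingWithDigit(lines), StandardCharsets.UTF_8);
    }
}
```

A call such as `RollNumberFilter.filterFile(Paths.get("passlist.txt"), Paths.get("rolls.txt"))` would then produce the cleaned list; both file names here are placeholders.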
// and now write it to the other file

search elements in an array in java

I'm wondering what kind of method I should use to search for elements in an array, and what data structure to use to store the return value.
For example, a txt file contains the following:
123 Name line Moon night table
124 Laugh Cry Dog
123 quote line make pet table
127 line array hello table
and the search elements are line+table.
I read every line as a string and then split it by spaces.
The output should look like this:
123 2 (ID 123 occurs twice in lines that contain the search elements)
127 1
I'd like suggestions on what kind of method to use to search for the elements in the array and what kind of data structure to use to store the return value (the ID and the number of occurrences; I'm thinking of a HashMap).
Read the text file and store each line that ends with table in an ArrayList<String>. Then use contains on each element of the ArrayList<String>. Store the result in a HashMap<key,value> where the key is the ID and the value is an Integer representing the number of times the ID occurs.
First, I would keep reading through the file line by line, there's really no other way of going about it other than that.
Second, to pick out the rows to save, you don't need to do the split (assumption: they all end in (space)table). You can just get them by using:
if (line.endsWith(" table"))
Then, I would suggest using a Map<String, Integer> datatype to store your information. This way, you have the number of the table (key) and how many times it was found in the file (value).
Map<String, Integer> map = new HashMap<String, Integer>();
....reading file....
if (line.endsWith(" table")) {
    // the ID is everything before the first space
    String number = line.substring(0, line.indexOf(" "));
    if (!map.containsKey(number)) {
        map.put(number, 1);
    } else {
        Integer value = map.get(number);
        value++;
        map.put(number, value);
    }
}
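If every search element has to be present, not just a trailing "table", the line can be split into words and checked with `containsAll` before counting. A minimal sketch, assuming the ID is the first whitespace-separated word of each line; `SearchTermCounter` and `countMatches` are illustrative names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SearchTermCounter {

    /** Counts, per ID, the lines that contain every one of the search terms. */
    static Map<String, Integer> countMatches(List<String> lines, List<String> searchTerms) {
        Map<String, Integer> counts = new LinkedHashMap<>(); // keeps first-seen order of IDs
        for (String line : lines) {
            String[] words = line.trim().split("\\s+");
            if (words.length < 2) {
                continue; // blank line or ID without any words
            }
            String id = words[0];
            List<String> rest = Arrays.asList(words).subList(1, words.length);
            if (rest.containsAll(searchTerms)) {
                counts.merge(id, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

For the four sample lines in the question and the search terms `line` and `table`, the resulting map is `{123=2, 127=1}`.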
