Sorting .csv IDs in natural order in Java

I'm trying to write some code that will take in a list of IDs (numbers and letters) from a .csv file and output them to a new file with the IDs in "natural order". My files are compiling, but I am getting the error:
java.lang.NumberFormatException: For input string: "Alpha"
I think the issue is that I am not accounting for both number and letter values in the .csv file. What am I doing wrong? Sorry if my variable names are confusing...
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
public class IdReader {
public static String CSV_FILE_PATH = "/Users/eringray/Desktop/idreader/idData.csv";
public static void main(String[] args){
try {
BufferedReader br = new BufferedReader(new FileReader(CSV_FILE_PATH));
BufferedWriter bw = new BufferedWriter(new FileWriter(CSV_FILE_PATH + ".tsv"));
ArrayList<String> textIds = new ArrayList<>();
ArrayList<Integer> numberIds = new ArrayList<>();
String line = "";
while((line = br.readLine()) != null) {
String[] values = line.split(" ");
if(values.length == 1) {
String idAsString = values[0];
try{
int id = Integer.parseInt(idAsString);
numberIds.add(id);
}
catch(NumberFormatException e){
textIds.add(idAsString);
}
}
}
Collections.sort(textIds);
Collections.sort(numberIds);
for(int i = 0; i < textIds.size(); i++){
String stu = textIds.get(i);
String lineText = stu.toString();
bw.write(lineText);
bw.newLine();
}
for(int i = 0; i < numberIds.size(); i++){
int numValues = numberIds.get(i);
bw.write(numValues);
bw.newLine();
}
br.close();
bw.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
}

The exception is coming at this line
int id = Integer.parseInt(idAsString);
Clearly "Alpha" is not an integer, so Integer.parseInt throws a NumberFormatException. When you encounter strings that cannot be converted into numbers, you can either skip them or throw an exception.
Update
//Use two separate lists, one for numbers and the other for text
ArrayList<String> textIds = new ArrayList<>();
ArrayList<Integer> numberIds = new ArrayList<>();
String line = "";
while((line = br.readLine()) != null) {
String[] values = line.split(" ");
if(values.length == 1) {
String idAsString = values[0];
try {
//Parse the value. If successful, it means it was a number. Add to integer array.
int id = Integer.parseInt(idAsString);
numberIds.add(id);
}catch (NumberFormatException e){
//If not successful, it means it was a string.
textIds.add(idAsString);
}
}
}
//In the end, sort both lists
Collections.sort(textIds);
Collections.sort(numberIds);
for(int i = 0; i < textIds.size(); i++){
String stu = textIds.get(i);
bw.write(stu);
bw.newLine();
}
for(int i = 0; i < numberIds.size(); i++){
int numValues = numberIds.get(i);
bw.write(numValues+"");
bw.newLine();
}
br.close();
bw.close();
I am not putting code for writing this data to a new file. I hope you can do that.
Sample Input
4
6
33
2
5632
23454
Alpha
So after running my code
numberIds will have [ 2,4,6,33,5632,23454]
textIds will have ["Alpha"]
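If a single sorted list is wanted instead of two, the same idea can be expressed as one comparator. The following is a minimal sketch, not the answerer's code: the class name NaturalIdSort, the helper isDigits, and the choice to put numeric IDs before text IDs are all assumptions made for illustration.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NaturalIdSort {
    // Hypothetical helper: true when the ID consists only of ASCII digits.
    static boolean isDigits(String s) {
        return !s.isEmpty() && s.chars().allMatch(c -> c >= '0' && c <= '9');
    }

    // Assumed ordering: numeric IDs first (by value), then text IDs alphabetically.
    static final Comparator<String> NATURAL = (a, b) -> {
        boolean na = isDigits(a);
        boolean nb = isDigits(b);
        if (na && nb) {
            // BigInteger avoids overflow for very long digit strings.
            return new BigInteger(a).compareTo(new BigInteger(b));
        }
        if (na != nb) {
            return na ? -1 : 1;
        }
        return a.compareTo(b);
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sortIds(List<String> ids) {
        List<String> copy = new ArrayList<>(ids);
        copy.sort(NATURAL);
        return copy;
    }

    public static void main(String[] args) {
        // prints [2, 4, 33, 23454, Alpha]
        System.out.println(sortIds(Arrays.asList("33", "Alpha", "4", "23454", "2")));
    }
}
```

Swapping the two return values in the mixed case would put text IDs first, which matches the order in which the answer above writes its two lists.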

The NumberFormatException occurs because of the alphabetic characters in the input.
Use the isNumeric(str) method of StringUtils (https://commons.apache.org/proper/commons-lang/javadocs/api-2.6/org/apache/commons/lang/StringUtils.html) to check whether the input is numeric, and convert it to an int only if it is.
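A similar pre-check is possible without an extra library. The sketch below is an illustration under assumptions: the class IdCheck and the method isSafeInt are hypothetical names, and the rule is deliberately conservative (non-negative values of at most nine digits) so that a string of digits too long for an int is rejected rather than passed on to Integer.parseInt.

```java
class IdCheck {
    // Accepts 1 to 9 ASCII digits only. Nine digits always fit in an int,
    // so Integer.parseInt cannot overflow on anything this method accepts.
    static boolean isSafeInt(String s) {
        return s != null && s.matches("\\d{1,9}");
    }

    public static void main(String[] args) {
        String[] samples = {"33", "Alpha", "", "99999999999"};
        for (String s : samples) {
            if (isSafeInt(s)) {
                System.out.println(s + " -> number " + Integer.parseInt(s));
            } else {
                System.out.println("\"" + s + "\" -> treated as text");
            }
        }
    }
}
```

The length limit is the design choice worth noting: a pure "all digits" test would accept 99999999999, which still makes Integer.parseInt throw.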

Related

Arraylist with data from CSV file is full of null values?

For a question I have to find the number of unique Strings in a column and write it into a CSV file.
My plan was to put the first String inside an ArrayList, then loop through the column and add any strings not already in the ArrayList.
This works with an ordinary ArrayList, but for some reason the ArrayList containing data from the CSV file is all null.
Can anyone explain why this is and how I can fix it? My code is below.
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
public class uniqueTrips {
static ArrayList<String> removeDuplicates(ArrayList<String> list) {
ArrayList<String> result = new ArrayList<>();
HashSet<String> set = new HashSet<>();
for (String item : list) {
if (!set.contains(item)) {
result.add(item);
set.add(item);
}
}
return result;
}
public static void main(String[] args) throws IOException {
ArrayList<String> trips = new ArrayList<String>();
try {
BufferedReader reader = new BufferedReader(new FileReader("Passenger_Weather_Combined.csv"));
BufferedWriter writer = new BufferedWriter(new FileWriter("result.csv"));
double[] attribute = new double[15];
double[][] attributes = new double[77284][15];
String[][] attributes2 = new String[77284][15];
String line = reader.readLine();
int number = 0; // Rows!
trips.add(attributes2[1][9]);
while (line != null) {
String[] att = line.split(",");
attributes[number] = attribute;
line = reader.readLine();
}
System.out.println(removeDuplicates(trips).size());
writer.newLine();
number++;
writer.close();
reader.close();
} catch (IOException e) {
}
}
}
This bit here is where you iterate through the rows of the file:
while (line != null) {
String[] att = line.split(",");
attributes[number] = attribute;
line = reader.readLine();
}
However, you are not actually touching the data you read in. You read the row into line, you split the line into columns, and save it in att. Then you never use the value in att from then on - instead, you use the variable attribute, which doesn't contain any meaningful data. Try changing the loop to this:
while (line != null) {
String[] att = line.split(",");
attributes[number] = new double[att.length];
for (int i = 0; i < att.length; i++)
attributes[number][i] = Double.parseDouble(att[i]);
line = reader.readLine();
number++; // Very important, otherwise you're always updating the same row...
}
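For the original goal of counting the unique strings in one column, the values can be collected straight from the split line into a Set. This is a rough sketch under assumptions: UniqueColumn and uniqueValues are hypothetical names, and the split on "," assumes the file has no quoted fields containing commas.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class UniqueColumn {
    // Collects the distinct values of one column, in first-seen order.
    static Set<String> uniqueValues(List<String> lines, int column) {
        Set<String> seen = new LinkedHashSet<>();
        for (String line : lines) {
            // Limit -1 keeps trailing empty cells so column indexes stay stable.
            String[] cells = line.split(",", -1);
            if (column < cells.length) {
                seen.add(cells[column].trim());
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,Airport", "2,Downtown", "3,Airport");
        Set<String> trips = uniqueValues(lines, 1);
        // prints 2 [Airport, Downtown]
        System.out.println(trips.size() + " " + trips);
    }
}
```

Because the Set rejects duplicates on insertion, no separate removeDuplicates pass is needed afterwards.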

Remove duplicate strings from an arraylist<Object>

My program opens a file and saves its words along with their byte distance from the beginning of the file. However, the file has many duplicate words that I don't want. I also want my list to be in alphabetical order. The problem is that when I fix the order, the duplicates get messed up, and vice versa. Here is my code:
import java.io.*;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Set;
class MyMain {
public static void main(String[] args) throws IOException {
ArrayList<DictPage> listOfWords = new ArrayList<DictPage>();
LinkedList<Page> Eurethrio = new LinkedList<Page>();
File file = new File("C:\\Kennedy.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
//This will reference one line at a time...
String line = null;
int line_count=0;
int byte_count;
int total_byte_count=0;
int fromIndex;
int kat = 0;
while( (line = br.readLine())!= null ){
line_count++;
fromIndex=0;
String [] tokens = line.split(",\\s+|\\s*\\\"\\s*|\\s+|\\.\\s*|\\s*\\:\\s*");
String line_rest=line;
for (int i=1; i <= tokens.length; i++) {
byte_count = line_rest.indexOf(tokens[i-1]);
//if ( tokens[i-1].length() != 0)
//System.out.println("\n(line:" + line_count + ", word:" + i + ", start_byte:" + (total_byte_count + fromIndex) + "' word_length:" + tokens[i-1].length() + ") = " + tokens[i-1]);
fromIndex = fromIndex + byte_count + 1 + tokens[i-1].length();
if (fromIndex < line.length())
line_rest = line.substring(fromIndex);
if(!listOfWords.contains(tokens[i-1])){//So that the same word is not stored twice
//listOfWords.add(tokens[i-1]);
listOfWords.add(new DictPage(tokens[i-1],kat));
kat++;
}
Eurethrio.add(new Page("Kennedy",fromIndex));
}
total_byte_count += fromIndex;
Eurethrio.add(new Page("Kennedy", total_byte_count));
}
Set<DictPage> hs = new HashSet<DictPage>();
hs.addAll(listOfWords);
listOfWords.clear();
listOfWords.addAll(hs);
if (listOfWords.size() > 0) {
Collections.sort(listOfWords, new Comparator<DictPage>() {
@Override
public int compare(final DictPage object1, final DictPage object2) {
return object1.getWord().compareTo(object2.getWord());
}
} );
}
//Print the words...
for (int i = 0; i<listOfWords.size();i++){
System.out.println(""+listOfWords.get(i).getWord()+" "+listOfWords.get(i).getPage());
}
for (int i = 0;i<Eurethrio.size();i++){
System.out.println(""+Eurethrio.get(i).getFile()+" "+Eurethrio.get(i).getBytes());
}
}
}
Use a TreeSet instead of an ArrayList, and you get ordering and duplicate removal automatically.
In the first place, why are you using an ArrayList to store your list of words?
ArrayList<DictPage> listOfWords = new ArrayList<DictPage>();
You should use a Set (HashSet, TreeSet, or another Set implementation) to store your words if you don't want duplicates.
Set<DictPage> listOfWords = new HashSet<DictPage>(); //no duplicates but not sorted
Or
Set<DictPage> listOfWords = new TreeSet<DictPage>(); //no duplicates and sorted as well
This makes sure that your collection of words does not contain any duplicates, provided DictPage defines equality by word: HashSet relies on equals() and hashCode(), and TreeSet relies on DictPage implementing Comparable (or on a Comparator passed to the constructor).
If you want them sorted straight away, TreeSet makes that easier.
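As an illustration of the TreeSet suggestion, here is a minimal sketch that passes a Comparator to the constructor, so the element class does not need to implement Comparable. The nested DictPage class is a hypothetical stand-in, since the asker's real class is not shown; only getWord() and getPage() are taken from the question.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

class SortedWords {
    // Hypothetical stand-in for the asker's DictPage class.
    static class DictPage {
        private final String word;
        private final int page;

        DictPage(String word, int page) {
            this.word = word;
            this.page = page;
        }

        String getWord() { return word; }
        int getPage() { return page; }
    }

    // A set that orders entries by word and treats equal words as duplicates.
    static Set<DictPage> newWordSet() {
        Comparator<DictPage> byWord = Comparator.comparing(DictPage::getWord);
        return new TreeSet<>(byWord);
    }

    public static void main(String[] args) {
        Set<DictPage> words = newWordSet();
        words.add(new DictPage("nation", 0));
        words.add(new DictPage("ask", 1));
        words.add(new DictPage("nation", 2)); // rejected: "nation" is already present
        for (DictPage p : words) {
            System.out.println(p.getWord() + " " + p.getPage());
        }
    }
}
```

With this comparator the first occurrence of each word is the one that is kept, which may or may not be the entry you want when the page numbers differ.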
Use this:
public void stripDuplicatesFromFile(String filename) {
try {
BufferedReader reader = new BufferedReader(new FileReader(filename));
Set<String> lines = new HashSet<String>();
String line;
while ((line = reader.readLine()) != null) {
lines.add(line);
}
reader.close();
BufferedWriter writer = new BufferedWriter(new FileWriter(filename));
for (String unique : lines) {
writer.write(unique);
writer.newLine();
}
writer.close();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
It takes a file path as input, finds duplicate lines, and removes them. Do not use it on a large file, because every unique line is held in memory. I use this method on a very small .txt file (a kind of log file where order is not important).
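If the original line order does matter, a LinkedHashSet can replace the HashSet. The sketch below is one possible variant, assuming Java 8 or later and UTF-8 text; DedupLines and stripDuplicates are hypothetical names. It still reads the whole file into memory, so the warning about large files applies here too.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class DedupLines {
    // Rewrites the file in place, keeping the first occurrence of each line
    // and preserving the original order of those first occurrences.
    static void stripDuplicates(Path file) throws IOException {
        Set<String> unique = new LinkedHashSet<>(Files.readAllLines(file));
        Files.write(file, unique);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("dedup", ".txt");
        Files.write(tmp, Arrays.asList("b", "a", "b", "c"));
        stripDuplicates(tmp);
        // prints [b, a, c]
        System.out.println(Files.readAllLines(tmp));
        Files.delete(tmp);
    }
}
```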

Read a file into Map<Integer, ArrayList<Double>>

I saw some similar questions, but mine is a little different.
I define a
Map<Integer, ArrayList<Double>> fl;
My input .txt file:
1 0.56 0.57 0.73 ..
2 2.3 3.50 ...
9 4.98 0.99 ..
How to read the file into the map fl?
Thanks!
Use a Scanner and first call Scanner.nextInt(), which gives you the first integer.
Then call Scanner.nextLine(), which gives you all the remaining doubles on the line as a String. Split it and parse each token to a double.
Repeat until the end of the file.
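The Scanner steps above can be sketched roughly as follows. This is an illustration, not the answerer's code: ScannerMapReader and read are hypothetical names, the input is supplied as an in-memory String instead of a file, and whitespace-separated tokens are assumed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

class ScannerMapReader {
    // Reads "key v1 v2 ..." lines: nextInt() takes the key, nextLine() the rest.
    static Map<Integer, List<Double>> read(Scanner in) {
        Map<Integer, List<Double>> fl = new LinkedHashMap<>();
        while (in.hasNextInt()) {
            int key = in.nextInt();
            // Guard against a key that is the very last token of the input.
            String rest = in.hasNextLine() ? in.nextLine().trim() : "";
            List<Double> values = new ArrayList<>();
            if (!rest.isEmpty()) {
                for (String token : rest.split("\\s+")) {
                    // Double.parseDouble always expects '.' as the decimal separator.
                    values.add(Double.parseDouble(token));
                }
            }
            fl.put(key, values);
        }
        return fl;
    }

    public static void main(String[] args) {
        String input = "1 0.56 0.57 0.73\n2 2.3 3.50\n9 4.98 0.99\n";
        // prints {1=[0.56, 0.57, 0.73], 2=[2.3, 3.5], 9=[4.98, 0.99]}
        System.out.println(read(new Scanner(input)));
    }
}
```

To read from a file, the Scanner would be constructed from a File or Path instead of a String.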
Here's a try.
I've compiled and run the code.
Make sure the input file is in the same directory as your project if you use an IDE. This only applies if you do not modify the path below.
package fileread;
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
public class FileRead {
private static HashMap<Integer, ArrayList<Double>> map = new HashMap<>();
private static BufferedReader reader;
public static void main(String[] args) {
try
{
reader = new BufferedReader(new FileReader("input"));
//or reader = new BufferedReader(new FileReader("C:\\full-path-to-your-file"));
String line;
while((line = reader.readLine()) != null)
{
String[] tokens = line.split(" ");
Integer i;
Double d;
ArrayList<Double> list = new ArrayList<>();
i = Integer.valueOf(tokens[0]);
for(int j = 1; j < tokens.length; j++)
list.add(Double.valueOf(tokens[j]));
map.put(i, list);
}
}catch(IOException ex)
{
//break execution
}finally
{
if(reader != null)
try
{
reader.close();
}catch (IOException ex) {
//don't break :)
}
}
for(Integer i : map.keySet())
{
ArrayList<Double> l = map.get(i);
System.out.print("Line " + i + ": ");
for(Double d: l)
System.out.print(d + " ");
System.out.println();
}
}
}
The code for parsing the file and populating the map should be like below
try {
BufferedReader bReader = new BufferedReader(new FileReader(new File("c:/input .txt")));
String line = "";
Map<Integer, ArrayList<Double>> fl = new HashMap<Integer, ArrayList<Double>>();
while ((line = bReader.readLine()) != null) {
String[] strArray = line.split(" ");
// One key and one list per line, created outside the token loop
int key = Integer.valueOf(strArray[0]);
ArrayList<Double> value = new ArrayList<Double>();
for (int i=1;i<strArray.length;i++) {
value.add(Double.valueOf(strArray[i]));
}
fl.put(key, value);
}
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

Importing data from a large file into a 2d Array

I am trying to import a large data file and insert the information into a 2D array. The file has around 19,000 lines and 5 columns. I believe my code is correct: there are no run-time errors or exceptions. The problem is that when I print data[15000][0], I get null, even though my file has more than 15,000 lines, so that element should be populated. Printing data[5000][0] works. What could be wrong? I have 19,000 cities on 19,000 different lines, but it seems that past roughly 10,000 rows nothing gets stored in the 2D array. Help please.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileNotFoundException;
import java.io.IOException;
public class Data1
{
public static void main(String[] args)
{
try{
FileReader file = new FileReader("/Users/admin/Desktop/population.csv");
BufferedReader in = new BufferedReader(file);
String title = in.readLine();
String[][] data = new String[20000][5];
int currentRow = 0;
String current;
int i = 0;
String temp;
while ((temp = in.readLine()) !=null)
{
String[] c = new String[5];
String line = in.readLine().replaceAll("\"", ""); //changing the format of the data input
c = line.split(",");
c[1] = c[1].replace(" ", "");
for (int j = 0; j <data[0].length; j++)
{
current = c[j];
data[i][j] = c[j];
}
i++;
}
System.out.println(data[15000][0]);
}
catch (FileNotFoundException ex)
{
ex.printStackTrace();
}
catch (IOException ex)
{
ex.printStackTrace();
}
catch (Exception ex)
{
ex.printStackTrace();
}
}
}
You're throwing away a line on each loop.
while (in.readLine() != null)
should be
String temp;
while ((temp = in.readLine()) != null)
And then no calls to .readLine() inside the loop but refer to "temp".
Read line only once...
String line=null;
while ((line=in.readLine()) !=null) // reading line once here
{
String[] c = new String[5];
line = line.replaceAll("\"", ""); //
c = line.split(",");
c[1] = c[1].replace(" ", "");
One of your errors is in the loop:
while (in.readLine() !=null)
{
String[] c = new String[5];
String line = in.readLine().replaceAll("\"", ""); //changing the format of the data input
c = line.split(",");
c[1] = c[1].replace(" ", "");
Each time you invoke in.readLine() it reads a line, so you skip one line on every iteration: you call readLine twice (thus reading two lines) but store only the second line.
You should replace it with:
String line=in.readLine();
while (line !=null)
{
String[] c = new String[5];
line = line.replaceAll("\"", ""); //changing the format of the data input (Strings are immutable, so keep the result)
c = line.split(",");
c[1] = c[1].replace(" ", "");
//whatever code you have
//last line of the loop
line=in.readLine();
Can you provide us with a couple of lines of your file? And are you sure that the whole file is formatted correctly?
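Putting the answers above together, each row is read exactly once and stored in a list that grows as needed, so no row count has to be guessed in advance. This is a minimal sketch under assumptions: CsvRows and readRows are hypothetical names, the first line is treated as a header as in the question, and fields are assumed not to contain commas inside quotes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class CsvRows {
    // Skips the header, then reads every remaining line exactly once.
    static List<String[]> readRows(BufferedReader in) throws IOException {
        in.readLine(); // header line, discarded
        List<String[]> rows = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) { // the only readLine() call in the loop
            if (line.isEmpty()) {
                continue; // ignore blank lines
            }
            rows.add(line.replace("\"", "").split(","));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        String csv = "city,pop\n\"A\",1\n\"B\",2\n\"C\",3\n";
        List<String[]> rows = readRows(new BufferedReader(new StringReader(csv)));
        // prints 3 C
        System.out.println(rows.size() + " " + rows.get(2)[0]);
    }
}
```

For a real file, the BufferedReader would wrap a FileReader as in the question.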

Reading data from a text file in Java

I have a problem reading data from a text file and putting it into a 2-dimensional array. A sample of the dataset is:
1,2,3,4,5,6
1.2,2.3,4.5,5.67,7.43,8
The problem with this code is that it reads only the first line and does not read the following lines. Any suggestion is appreciated.
package test1;
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class Test1{
public static void main(String args[])throws FileNotFoundException, IOException{
try{
double[][] X = new double[2][6];
BufferedReader input = new BufferedReader(new FileReader(file));
String [] temp;
String line = input.readLine();
String delims = ",";
temp = line.split(delims);
int rowCounter = 0;
while ((line = input.readLine())!= null) {
for(int i = 0; i<6; i++){
X[rowCounter][i] = Double.parseDouble(temp[i]);
}
rowCounter++;
}
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}finally{
}
}
}
Have you tried the Array utilities? Something like this:
while ((line = input.readLine())!= null) {
List<String> someList = Arrays.asList(line.split(","));
//do your conversion to double here
rowCounter++;
}
I think the blank line might be throwing your for loop off
The only place that your temp array is being assigned is before your while loop. You need to assign your temp array inside the loop, and don't read from the BufferedReader until the loop.
String[] temp;
String line;
String delims = ",";
int rowCounter = 0;
while ((line = input.readLine())!= null) {
temp = line.split(delims); // Moved inside the loop.
for(int i = 0; i<6; i++){
X[rowCounter][i] = Double.parseDouble(temp[i]);
}
rowCounter++; // Advance to the next row of X.
}
Try:
String line;
String delims = ",";
int rowCounter = 0;
while ((line = input.readLine())!= null) {
String[] temp = line.split(delims); // split the line just read; do not call readLine() again here
for(int i = 0; i<6; i++){
X[rowCounter][i] = Double.parseDouble(temp[i]);
}
...
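readLine does not need a newline character at the end of the last line: BufferedReader.readLine() returns the final line even when the file ends without a line terminator, and returns null only once the end of the stream is reached. A missing trailing blank line is therefore not the cause here.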
I couldn't run the code, but one of your problems is that you were only splitting the first text line.
package Test1;
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
public class Test1 {
public static void main(String args[]) {
try {
double[][] X = new double[2][];
BufferedReader input = new BufferedReader(new FileReader(file));
String line = null;
String delims = ",";
int rowCounter = 0;
while ((line = input.readLine()) != null) {
String[] temp = line.split(delims);
X[rowCounter] = new double[temp.length]; // allocate the row now that its length is known
for (int i = 0; i < temp.length; i++) {
X[rowCounter][i] = Double.parseDouble(temp[i]);
}
rowCounter++;
}
} catch (Exception e) {// Catch exception if any
System.err.println("Error: " + e.getMessage());
e.printStackTrace();
} finally {
}
}
}
I formatted your code to make it more readable.
I deferred setting the size of the second element of the two dimensional array until I knew how many numbers were on a line.
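The same deferred-size idea can be taken one step further so that the number of rows does not have to be known either. The following is a sketch under assumptions: DoubleRows and parse are hypothetical names, the lines are passed in as a list instead of being read from the undefined file variable, and blank lines are skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DoubleRows {
    // Parses comma-separated numbers; each row is sized from its own line.
    static double[][] parse(List<String> lines) {
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // a blank line would otherwise fail in parseDouble
            }
            String[] tokens = line.split(",");
            double[] row = new double[tokens.length];
            for (int i = 0; i < tokens.length; i++) {
                row[i] = Double.parseDouble(tokens[i].trim());
            }
            rows.add(row);
        }
        return rows.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        double[][] x = parse(Arrays.asList("1,2,3,4,5,6", "", "1.2,2.3,4.5,5.67,7.43,8"));
        // prints 2 5.67
        System.out.println(x.length + " " + x[1][3]);
    }
}
```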
