Compare Two CSV Files and Fetch Data - java

I have two CSV files. The master CSV file has around 500,000 records; the daily CSV file has 50,000 records.
The daily CSV file is missing a few columns, which have to be fetched from the master CSV file.
For example
DailyCSV File
id,name,city,zip,occupation
1,Jhon,Florida,50069,Accountant
MasterCSV File
id,name,city,zip,occupation,company,exp,salary
1, Jhon, Florida, 50069, Accountant, AuditFirm, 3, $5000
What I have to do is read both files and match the records by ID. If the ID is present in the master file, I have to fetch company, exp and salary and write them to a new CSV file.
How can I achieve this?
What I have done so far:
while (true) {
line = bstream.readLine();
lineMaster = bstreamMaster.readLine();
if (line == null || lineMaster == null)
{
break;
}
else
{
while(lineMaster != null)
readlineSplit = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);
String splitId = readlineSplit[4];
String[] readLineSplitMaster =lineMaster.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);
String SplitIDMaster = readLineSplitMaster[13];
System.out.println(splitId + "|" + SplitIDMaster);
//System.out.println(splitId.equalsIgnoreCase(SplitIDMaster));
if (splitId.equalsIgnoreCase(SplitIDMaster)) {
String writeLine = readlineSplit[0] + "," + readlineSplit[1] + "," + readlineSplit[2] + "," + readlineSplit[3] + "," + readlineSplit[4] + "," + readlineSplit[5] + "," + readLineSplitMaster[15]+ "," + readLineSplitMaster[16] + "," + readLineSplitMaster[17];
System.out.println(writeLine);
pstream.print(writeLine + "\r\n");
}
}
}pstream.close();
fout.flush();
bstream.close();
bstreamMaster.close();

First of all, your current parsing approach will be painfully slow. Use a dedicated CSV parsing library to speed things up. With uniVocity-parsers you can process your 500K records in less than a second. This is how you can use it to solve your problem:
First let's define a few utility methods to read/write your files:
//opens the file for reading (using UTF-8 encoding)
private static Reader newReader(String pathToFile) {
try {
return new InputStreamReader(new FileInputStream(new File(pathToFile)), "UTF-8");
} catch (Exception e) {
throw new IllegalArgumentException("Unable to open file for reading at " + pathToFile, e);
}
}
//creates a file for writing (using UTF-8 encoding)
private static Writer newWriter(String pathToFile) {
try {
return new OutputStreamWriter(new FileOutputStream(new File(pathToFile)), "UTF-8");
} catch (Exception e) {
throw new IllegalArgumentException("Unable to open file for writing at " + pathToFile, e);
}
}
Then, we can start reading your daily CSV file, and generate a Map:
public static void main(String... args){
//First we parse the daily update file.
CsvParserSettings settings = new CsvParserSettings();
//here we tell the parser to read the CSV headers
settings.setHeaderExtractionEnabled(true);
//and to select ONLY the following columns.
//This ensures rows with a fixed size will be returned in case some records come with less or more columns than anticipated.
settings.selectFields("id", "name", "city", "zip", "occupation");
CsvParser parser = new CsvParser(settings);
//Here we parse all data into a list.
List<String[]> dailyRecords = parser.parseAll(newReader("/path/to/daily.csv"));
//And convert them to a map. ID's are the keys.
Map<String, String[]> mapOfDailyRecords = toMap(dailyRecords);
... //we'll get back here in a second.
This is the code to generate a Map from the list of daily records:
/* Converts a list of records to a map. Uses element at index 0 as the key */
private static Map<String, String[]> toMap(List<String[]> records) {
HashMap<String, String[]> map = new HashMap<String, String[]>();
for (String[] row : records) {
//column 0 will always have an ID.
map.put(row[0], row);
}
return map;
}
With the map of records, we can process your master file and generate the list of updates:
private static List<Object[]> processMasterFile(final Map<String, String[]> mapOfDailyRecords) {
//we'll put the updated data here
final List<Object[]> output = new ArrayList<Object[]>();
//configures the parser to process only the columns you are interested in.
CsvParserSettings settings = new CsvParserSettings();
settings.setHeaderExtractionEnabled(true);
settings.selectFields("id", "company", "exp", "salary");
//All parsed rows will be submitted to the following RowProcessor. This way the bigger Master file won't
//have all its rows stored in memory.
settings.setRowProcessor(new AbstractRowProcessor() {
@Override
public void rowProcessed(String[] row, ParsingContext context) {
// Incoming rows from MASTER will have the ID as index 0.
// If the daily update map contains the ID, we'll get the daily row
String[] dailyData = mapOfDailyRecords.get(row[0]);
if (dailyData != null) {
//We got a match. Let's join the data from the daily row with the master row.
Object[] mergedRow = new Object[8];
for (int i = 0; i < dailyData.length; i++) {
mergedRow[i] = dailyData[i];
}
for (int i = 1; i < row.length; i++) { //starts from 1 to skip the ID at index 0
mergedRow[i + dailyData.length - 1] = row[i];
}
output.add(mergedRow);
}
}
});
CsvParser parser = new CsvParser(settings);
//the parse() method will submit all rows to the RowProcessor defined above.
parser.parse(newReader("/path/to/master.csv"));
return output;
}
Finally, we can get the merged data and write everything to another file:
... // getting back to the main method here
//Now we process the master data and get a list of updates
List<Object[]> updatedData = processMasterFile(mapOfDailyRecords);
//And write the updated data to another file
CsvWriterSettings writerSettings = new CsvWriterSettings();
writerSettings.setHeaders("id", "name", "city", "zip", "occupation", "company", "exp", "salary");
writerSettings.setHeaderWritingEnabled(true);
CsvWriter writer = new CsvWriter(newWriter("/path/to/updates.csv"), writerSettings);
//Here we write everything, and get the job done.
writer.writeRowsAndClose(updatedData);
}
This should work like a charm. Hope it helps.
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).

I will approach the problem in a step by step manner.
First I will parse/read the master CSV file and keep its content in a hashmap, where the key is each record's unique 'id'. For the value, you can store a nested hash or simply create a Java class to hold the information.
Example of hash:
{
'1' : { 'name': 'Jhon',
'City': 'Florida',
'zip' : 50069,
....
}
}
Next, read the CSV file you want to compare (the daily file). For each row, read the 'id' and check whether that key exists in the hashmap you created earlier.
If it exists, get the information you need from the hashmap and write it to a new CSV file.
Also, you might want to consider using a third-party CSV parser to make this task easier.
If you use Maven, you can follow this example I found on the net. Otherwise, you can search for an Apache Commons CSV parser example.
http://examples.javacodegeeks.com/core-java/apache/commons/csv-commons/writeread-csv-files-with-apache-commons-csv-example/
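The steps described in this answer can be sketched in plain Java as follows. This is only a minimal illustration, not a definitive implementation: the class name `MasterLookup` and its methods are hypothetical, the files are assumed to be already read into lists of lines with a header as the first line, and the naive `split(",")` does not handle quoted fields that contain commas (a CSV library is the safer choice for real data). It needs Java 9+ for `List.of`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class MasterLookup {

    /** Indexes master rows by the trimmed value of column 0 (the id). Assumes line 0 is the header. */
    static Map<String, String[]> indexById(List<String> masterLines) {
        Map<String, String[]> byId = new HashMap<>();
        for (String line : masterLines.subList(1, masterLines.size())) {
            String[] cols = line.split(",", -1);
            for (int i = 0; i < cols.length; i++) {
                cols[i] = cols[i].trim();
            }
            byId.put(cols[0], cols);
        }
        return byId;
    }

    /** Appends company, exp and salary (master columns 5 to 7) to every daily row whose id is in the master. */
    static List<String> enrich(List<String> dailyLines, Map<String, String[]> masterById) {
        List<String> out = new ArrayList<>();
        out.add(dailyLines.get(0) + ",company,exp,salary");
        for (String line : dailyLines.subList(1, dailyLines.size())) {
            String id = line.split(",", -1)[0].trim();
            String[] master = masterById.get(id);
            if (master != null && master.length >= 8) {
                out.add(line + "," + master[5] + "," + master[6] + "," + master[7]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> master = List.of(
                "id,name,city,zip,occupation,company,exp,salary",
                "1, Jhon, Florida, 50069, Accountant, AuditFirm, 3, $5000");
        List<String> daily = List.of(
                "id,name,city,zip,occupation",
                "1,Jhon,Florida,50069,Accountant");
        enrich(daily, indexById(master)).forEach(System.out::println);
    }
}
```

Note that this sketch keeps the whole master file in memory (as this answer suggests), whereas the previous answer keeps only the smaller daily file in memory and streams the master file.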


Arrange List in Java to output specific columns

I have 2 CSV files which contain the same data, but the columns of the two files are in a different order.
I want to output both lists in the same order.
Printing the first list:
System.out.println(csv1);
gives the column order:
Employee, Address, Name, Email
Printing the second list:
System.out.println(csv2);
gives the column order:
Address, Email, Employee, Name
How can I sort the lists to print in this column order?
Employee, Name, Email, Address
Note: I can't use integer positions such as col(1), col(3), because column 1 in csv1 does not match column 1 in csv2.
data is read as follows:
List<String> ret = new ArrayList<>();
BufferedReader r = new BufferedReader(new InputStreamReader(str));
// skip the header line, strip the double quotes and collect every remaining line
Stream<String> lines = r.lines().skip(1);
lines.forEachOrdered(line -> ret.add(line.replace("\"", "")));
I've assumed that you need to parse these two CSV files and output the columns in a fixed order.
You can use the Apache Commons CSV library for parsing. I've used the examples below.
Solution using an external library:
test1.csv
Address,Email,Employee,Name
SecondMainRoad,test2@gmail.com,Frank,Michael
test2.csv
Employee,Address,Name,Email
John,FirstMainRoad,Doe,test@gmail.com
Sample program
public static void main(String[] args) throws IOException {
try(Reader csvReader = Files.newBufferedReader(Paths.get
("test2.csv"))) {
// Initialize CSV parser and iterator.
CSVParser csvParser = new CSVParser(csvReader, CSVFormat.Builder.create()
.setRecordSeparator(System.lineSeparator())
.setHeader()
.setSkipHeaderRecord(true)
.setIgnoreEmptyLines(true)
.build());
Iterator<CSVRecord> csvRecordIterator = csvParser.iterator();
while(csvRecordIterator.hasNext())
{
final CSVRecord csvRecord = csvRecordIterator.next();
final Map<String, String> recordMap = csvRecord.toMap();
System.out.println(String.format("Employee:%s", recordMap.get("Employee")));
System.out.println(String.format("Name:%s", recordMap.get("Name")));
System.out.println(String.format("Email:%s", recordMap.get("Email")));
System.out.println(String.format("Address:%s", recordMap.get("Address")));
}
}
}
Standalone solution:
public class CSVTesterMain {
public static void main(String[] args) {
// I have used string variables to hold csv data, In this case, you can replace with file output lines.
String csv1 = "Employee,Address,Name,Email\r\n" +
"John,FirstMainRoad,Doe,test@gmail.com\r\n" +
"Henry,ThirdCrossStreet,Joseph,email@gmail.com";
String csv2 = "Address,Email,Employee,Name\r\n" +
"SecondMainRoad,test2@gmail.com,Michael,Sessner\r\n" +
"CrossRoad,test25@gmail.com,Vander,John";
// Map key - To hold header information
// Map Value - List of lines holding values to the corresponding headers.
Map<String, List<String>> dataMap = new HashMap<>();
Stream<String> csv1LineStream = csv1.lines();
Stream<String> csv2LineStream = csv2.lines();
// We are using the same method to parse different csv formats. We are maintaining reference to the headers
// in the form of Map key which will helps us to emit output later as per our format.
populateDataMap(csv1LineStream, dataMap);
populateDataMap(csv2LineStream, dataMap);
// Now we have dataMap that holds data from multiple csv files. Key of the map is responsible to
// determine the header sequence.
// Print the output as per the sequence Employee, Name, Email, Address
System.out.println("Employee,Name,Email,Address");
dataMap.forEach((header, lineList) -> {
// Logic to determine the index value for each column.
List<String> headerList = Arrays.asList(header.split(","));
int employeeIdx = headerList.indexOf("Employee");
int nameIdx = headerList.indexOf("Name");
int emailIdx = headerList.indexOf("Email");
int addressIdx = headerList.indexOf("Address");
// Now we know the index value of each of these columns that can be emitted in our format.
// You can output to a file in your case.
// Iterate through each line, split and output as per the format.
lineList.forEach(line -> {
String[] data = line.split(",");
System.out.println(String.format("%s,%s,%s,%s", data[employeeIdx],
data[nameIdx],
data[emailIdx],
data[addressIdx]
));
});
});
}
private static void populateDataMap(Stream<String> csvLineStream, Map<String, List<String>> dataMap) {
// Populate data map associating the data to respective headers.
Iterator<String> csvIterator = csvLineStream.iterator();
// Fetch header. (In my example, I am sure that my first line is always the header).
String header = csvIterator.next();
if(! dataMap.containsKey(header))
dataMap.put(header, new ArrayList<>());
// Iterate through the remaining lines and populate data map.
while(csvIterator.hasNext())
dataMap.get(header).add(csvIterator.next());
}
}
Here I am using the Jackson dataformat library to parse the CSV files.
Dependency
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-csv</artifactId>
<version>2.13.2</version>
</dependency>
File 1
employee, address, name, email
1, address 1, Name 1, name1@example.com
2, address 2, Name 2, name2@example.com
3, address 3, Name 3, name3@example.com
File 2
address, email, employee, name
address 4, name4@example.com, 4, Name 4
address 5, name5@example.com, 5, Name 5
address 6, name6@example.com, 6, Name 6
Java Program
Here EmployeeDetails is a POJO class (not shown). The program expects the directory containing the CSV files to be passed as the first argument.
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import java.io.*;
import java.util.ArrayList;
import java.util.List;
public class EmployeeDataParser {
public static void main(String[] args) {
File directoryPath = new File(args[0]);
File filesList[] = directoryPath.listFiles();
List<EmployeeDetails> employeeDetails = new ArrayList<>();
EmployeeDataParser employeeDataParser=new EmployeeDataParser();
for(File file : filesList) {
System.out.println("File path: "+file.getAbsolutePath());
employeeDataParser.readEmployeeData(employeeDetails, file.getAbsolutePath());
}
System.out.println("number of employees into list: " + employeeDetails.size());
employeeDataParser.printEmployeeDetails(employeeDetails);
}
private List<EmployeeDetails> readEmployeeData(List<EmployeeDetails> employeeDetails,
String filePath){
CsvMapper csvMapper = new CsvMapper();
CsvSchema schema = CsvSchema.emptySchema().withHeader();
ObjectReader oReader = csvMapper.readerFor(EmployeeDetails.class).with(schema);
try (Reader reader = new FileReader(filePath)) {
MappingIterator<EmployeeDetails> mi = oReader.readValues(reader);
while (mi.hasNext()) {
EmployeeDetails current = mi.next();
employeeDetails.add(current);
}
} catch (IOException e) {
System.out.println("IOException Caught !!!");
// print the full stack trace (printing getStackTrace() would only print an array reference)
e.printStackTrace();
}
return employeeDetails;
}
private void printEmployeeDetails(List<EmployeeDetails> employeeDetails) {
System.out.printf("%5s %15s %25s %15s", "Employee", "Name", "Email", "Address");
System.out.println();
for(EmployeeDetails empDetail:employeeDetails){
System.out.format("%5s %15s %25s %15s", empDetail.getEmployee(),
empDetail.getName(),
empDetail.getEmail(),
empDetail.getAddress());
System.out.println();
}
}
}

Why can't my phone app open a file it stored in a Room Database?

Problem: My Android phone app can open various file types stored in an Android Room pre-populated SQLite database, but it cannot open files the app itself has added to the pre-populated database (except that it can open .txt files). I believe the issue is probably with how I coded the copying and conversion of a selected file to byte[] data. The app is Java based, and I have done this in Java before in a desktop app, so I just can't seem to find the issue. Maybe it is a permission issue; I'm just not sure, and someone standing outside looking in may see what I can't.
What I have tried: Since the app can open various existing pre-populated files successfully from the DB, I've concentrated on and stepped through the methods that write files to the DB. I'm not receiving any errors. I suspect it may just be a minor issue since I can't seem to see it.
What I'm trying to do: I'm trying to emulate the desktop version of this app in an Android phone version. I know it's not recommended or common practice to store files in a DB, but this app needs to be able to read and write files to the DB supporting it. This will be a full range of file types like the desktop version (e.g., pics, docs, audio, video, etc.). However, as I stated above, .txt files seem to have no issue. The user can select files stored on their phone into a table that captures the fileName and filePath in a TableRow of a TableLayout. Below are the methods involved. The plan is to refactor functionality once I get it working:
Capturing the full path and filename for each row - Uses the captured filepath to convert to a byte[] to store the data. The filename and file byte data are stored in a Files table, example, Files(fileName, fileData(byte[])). Each file is added to an ArrayList<Files> which the method returns
public static List<Files> captureNoteFiles(TableLayout table){
List<Files> noteFiles = new ArrayList<>();
int i = table.getChildCount();
if(i>1){
for (int itr = 1; itr<i; itr++) { // iterating through indexes
TableRow tr = (TableRow) table.getChildAt(itr);
TextView tv = (TextView) tr.getChildAt(1); // 1 is the file path position
File f = new File(tv.getText().toString());
String n = f.getName();
try {
FileInputStream fis = new FileInputStream(f.getPath());
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] buf = new byte[1024];
for (int read; (read = fis.read(buf)) != -1; ) {
bos.write(buf, 0, read);
}
fis.close();
noteFiles.add(new Files(0, n, bos.toByteArray()));
} catch (Exception e) {
e.printStackTrace();
Log.d("Input File", e.toString());
}
}
}
return noteFiles;
}
Iteration of the ArrayList - The ArrayList<Files> is iterated and each file is inserted into the Files table; the generated ID is captured to associate those files with a particular note of reference.
public static void addNewNoteFiles(int noteID, List<Files> nf){
if(nf.size()>0) {
for (Files f : nf) {
long id = rdb.getFilesDao().addFile(f);
rdb.getFilesByNoteDao().insert(new FilesByNote(noteID, (int) id));
}
}
}
Files Entity
@Entity(tableName = "Files")
public class Files implements Parcelable {
@PrimaryKey(autoGenerate = true)
@ColumnInfo(name = "FileID")
private int fileID;
@ColumnInfo(name = "FileName")
private String fileName;
@TypeConverters(FileTypeConverter.class)
@ColumnInfo(name = "FileData", typeAffinity = ColumnInfo.TEXT)
private byte[] fileData;
@SuppressWarnings(RoomWarnings.CURSOR_MISMATCH)
public Files(int fileID, String fileName, byte[] fileData){
this.fileID = fileID;
this.fileName = fileName;
this.fileData = fileData;
}
}
First you are assuming that an insert works as per :-
long id = rdb.getFilesDao().addFile(f);
rdb.getFilesByNoteDao().insert(new FilesByNote(noteID, (int) id));
What if the row isn't inserted and the insert returns an id of -1?
So I'd suggest adding getters to the Files class such as :-
public int getFileID() {
return fileID;
}
public String getFileName() {
return fileName;
}
public byte[] getFileData() {
return fileData;
}
and then add the following to FilesDao :-
@Query("SELECT coalesce(length(FileData), 0) FROM Files WHERE FileID=:fileId")
abstract long getFilesDataLength(long fileId);
and then amending the addNewNoteFiles to be :-
public static void addNewNoteFiles(int noteID, List<Files> nf){
final String TAG = "ADDNEWNOTE";
if(nf.size()>0) {
for (Files f : nf) {
long id = rdb.getFilesDao().addFile(f);
if (id > 0) {
long lengthOfFileData = rdb.getFilesDao().getFilesDataLength(id);
Log.d(TAG,
"Inserted File = " + f.getFileName() +
" DataLength = " + f.getFileData().length +
" ID = " + f.getFileID() +
" Length of Stored Data = " + lengthOfFileData);
if (f.getFileData().length != lengthOfFileData) {
Log.d(TAG,"WARNING FileData length MISMATCH for File = " + f.getFileName() + "\n\t Expected " + f.getFileData().length + " Found " + lengthOfFileData);
}
rdb.getFilesByNoteDao().insert(new FilesByNote(noteID, (int) id));
} else {
Log.d(TAG,"NOT INSERTED File = " + f.getFileName());
}
}
}
}
Run and check the log. Are all the files inserted? Do the lengths match? Are the lengths as expected? (If all or some of the lengths are 0, then obviously something is amiss when building the ByteArrayOutputStream.)
You may wish to add a similar check for inserting the FilesByNote, i.e. have the insert Dao method return a long (it returns the rowid) and check whether the value is > 0.
You may wonder what rowid is. It's a normally hidden column. FilesByNote appears to be an associative table mapping (associating) Notes with Files, so it presumably has a composite primary key (NoteId and FileId) that is not an alias of the rowid, and the rowid therefore stays hidden. However, its value is still auto-generated, and the insert returns it, or -1 if no row is inserted.
ALL tables, with the exception of tables defined with WITHOUT ROWID, have a rowid column. Room does not allow the definition of WITHOUT ROWID tables.
You wouldn't be concerned about the actual value, just that it is greater than 0 and thus a row was inserted.
The above may help to determine any issues encountered when inserting the data. If none are found, then the issue is elsewhere.
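To rule out the file-reading step separately from the database step, the byte array can be compared with the file's length on disk before it is inserted. The following is only a sketch in plain Java (it uses no Room or Android API); the class `FileBytes` and its method names are hypothetical and not part of the question's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class FileBytes {

    /** Reads the whole file into memory; the streams are closed even if reading fails. */
    static byte[] readFully(File file) throws IOException {
        try (InputStream in = new FileInputStream(file);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[8192];
            int read;
            while ((read = in.read(buf)) != -1) {
                out.write(buf, 0, read);
            }
            return out.toByteArray();
        }
    }

    /** True when the in-memory copy has exactly as many bytes as the file on disk. */
    static boolean lengthMatches(File file, byte[] data) {
        return file.length() == data.length;
    }
}
```

If `lengthMatches` returns true for a file that still cannot be opened after it is read back from the database, the reading code is probably fine and the stored length reported by the query above becomes the more interesting number.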

Join csv files based on common column in java

I want to join two CSV files based on a common column. My two CSV files and the final CSV file look like this.
Here are the example files - 1st file looks like:
sno,first name,last name
--------------------------
1,xx,yy
2,aa,bb
2nd file looks like:
sno,place
-----------
1,pp
2,qq
Output:
sno,first name,last name,place
------------------------------
1,xx,yy,pp
2,aa,bb,qq
Code:
CSVReader r1 = new CSVReader(new FileReader("c:/csv/file1.csv"));;
CSVReader r2 = new CSVReader(new FileReader("c:/csv/file2.csv"));;
HashMap<String,String[]> dic = new HashMap<String,String[]>();
int commonCol = 1;
r1.readNext(); // skip header
String[] line = null;
while ((line = r1.readNext()) != null)
{
dic.put(line[commonCol],line)
}
commonCol = 1;
r2.readNext();
String[] line2 = null;
while ((line2 = r2.readNext()) != null)
{
if (dic.keySet().contains(line2[commonCol])
{
// append line to existing entry
}
else
{
// create a new entry and pre-pend it with default values
// for the columns of file1
}
}
foreach (String[] line : dic.valueSet())
{
// write line to the output file.
}
I don't know how to proceed further to get desired output. Any help will be appreciated.
Thanks
First, you need to use zero as your commonCol value, as the first column has index zero rather than one. Then, since dic maps each key to a String[], merge the arrays rather than joining strings:
if (dic.containsKey(line2[commonCol]))
{
// Get the whole line from the first file.
String[] firstPart = dic.get(line2[commonCol]);
// Get the line from the second file, without the common column.
String[] secondPart = Arrays.copyOfRange(line2, 1, line2.length);
// Join together and put in the HashMap.
String[] joined = Arrays.copyOf(firstPart, firstPart.length + secondPart.length);
System.arraycopy(secondPart, 0, joined, firstPart.length, secondPart.length);
dic.put(line2[commonCol], joined);
}
else
{
// Create a new entry and prepend it with default values
// for the columns of file1 (the key, then placeholders for first name and last name).
String[] firstPart = {line2[commonCol], "default", "values"};
String[] secondPart = Arrays.copyOfRange(line2, 1, line2.length);
String[] joined = Arrays.copyOf(firstPart, firstPart.length + secondPart.length);
System.arraycopy(secondPart, 0, joined, firstPart.length, secondPart.length);
dic.put(line2[commonCol], joined);
}
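Putting the pieces together, the whole join can be sketched as one self-contained method. This is a hedged illustration rather than the answer's exact code: the class `CsvJoin`, the `filler` parameter and the fixed column counts are assumptions, the rows are assumed to be already parsed (for example by `CSVReader.readNext()`) with the header removed, and the common column is assumed to be at index 0 in both files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CsvJoin {

    /**
     * Joins two sets of rows on column 0. Each output row has leftCols + rightCols - 1 cells;
     * cells with no source value are set to filler.
     */
    static List<String> join(List<String[]> left, int leftCols,
                             List<String[]> right, int rightCols, String filler) {
        // LinkedHashMap keeps the rows in first-seen order
        Map<String, String[]> merged = new LinkedHashMap<>();
        for (String[] row : left) {
            String[] out = new String[leftCols + rightCols - 1];
            Arrays.fill(out, filler);
            System.arraycopy(row, 0, out, 0, leftCols);
            merged.put(row[0], out);
        }
        for (String[] row : right) {
            String[] out = merged.get(row[0]);
            if (out == null) {
                // key only present in the second file: left columns stay at the filler value
                out = new String[leftCols + rightCols - 1];
                Arrays.fill(out, filler);
                out[0] = row[0];
                merged.put(row[0], out);
            }
            // copy everything except the key into the cells after the left columns
            System.arraycopy(row, 1, out, leftCols, rightCols - 1);
        }
        List<String> lines = new ArrayList<>();
        for (String[] out : merged.values()) {
            lines.add(String.join(",", out));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String[]> file1 = Arrays.asList(new String[]{"1", "xx", "yy"}, new String[]{"2", "aa", "bb"});
        List<String[]> file2 = Arrays.asList(new String[]{"1", "pp"}, new String[]{"2", "qq"});
        System.out.println("sno,first name,last name,place");
        join(file1, 3, file2, 2, "").forEach(System.out::println);
    }
}
```

Writing the result with `String.join` does no quoting, so values that contain commas would need a CSV writer instead.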

Create CSV file with columns and values from HashMap

Be gentle,
This is my first time using Apache Commons CSV 1.7.
I am creating a service to process some CSV inputs,
add some additional information from exterior sources,
then write out this CSV for ingestion into another system.
I store the information that I have gathered into a list of
HashMap<String, String> for each row of the final output csv.
The Hashmap contains the <ColumnName, Value for column>.
I have issues using the CSVPrinter to correctly assign the values of the HashMaps into the rows.
I can concatenate the values into a string with commas between the variables;
however,
this just inserts the whole string into the first column.
I cannot define or hardcode the headers since they are obtained from a config file and may change depending on which project uses the service.
Here is some of my code:
try (BufferedWriter writer = Files.newBufferedWriter(
Paths.get(OUTPUT + "/" + project + "/" + project + ".csv"));)
{
CSVPrinter csvPrinter = new CSVPrinter(writer,
CSVFormat.RFC4180.withFirstRecordAsHeader());
csvPrinter.printRecord(columnList);
for (HashMap<String, String> row : rowCollection)
{
//Need to map __record__ to column -> row.key, value -> row.value for whole map.
csvPrinter.printrecord(__record__);
}
csvPrinter.flush();
}
Thanks for your assistance.
You actually have multiple concerns with your technique:
How do you maintain column order?
How do you print the column names?
How do you print the column values?
Here are my suggestions.
Maintain column order.
Do not use HashMap,
because it is unordered.
Instead,
use LinkedHashMap which has a "predictable iteration order"
(i.e. maintains order).
Print column names.
Every row in your list contains the column names in the form of key values,
but you only print the column names as the first row of output.
The solution is to print the column names before you loop through the rows.
Get them from the first element of the list.
Print column values.
The "billal GHILAS" answer demonstrates a way to print the values of each row.
Here is some code:
try (BufferedWriter writer = Files.newBufferedWriter(
Paths.get(OUTPUT + "/" + project + "/" + project + ".csv"));)
{
CSVPrinter csvPrinter = new CSVPrinter(writer,
CSVFormat.RFC4180.withFirstRecordAsHeader());
// This assumes that the rowCollection will never be empty.
// An anonymous scope block just to limit the scope of the variable names.
{
HashMap<String, String> firstRow = rowCollection.get(0);
int valueIndex = 0;
String[] valueArray = new String[firstRow.size()];
for (String currentValue : firstRow.keySet())
{
valueArray[valueIndex++] = currentValue;
}
csvPrinter.printRecord(valueArray);
}
for (HashMap<String, String> row : rowCollection)
{
int valueIndex = 0;
String[] valueArray = new String[row.size()];
for (String currentValue : row.values())
{
valueArray[valueIndex++] = currentValue;
}
csvPrinter.printRecord(valueArray);
}
csvPrinter.flush();
}
for (HashMap<String,String> row : rowCollection) {
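// size the record by the column list, so rows with missing or extra keys cannot overflow it
Object[] record = new Object[columnList.size()];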
for (int i = 0; i < columnList.size(); i++) {
record[i] = row.get(columnList.get(i));
}
csvPrinter.printRecord(record);
}
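The two ideas in these answers (take the column order from an ordered map, then look each value up by column name) can be shown without any CSV library. The sketch below is illustrative only: `OrderedRows` and its methods are hypothetical names, and missing keys are filled with an empty string by assumption. The resulting lists could then be handed to `CSVPrinter.printRecord(Iterable<?>)`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderedRows {

    /** Column names in the iteration order of the first row (predictable for a LinkedHashMap). */
    static List<String> headerOf(Map<String, String> firstRow) {
        return new ArrayList<>(firstRow.keySet());
    }

    /** Values of one row arranged in the given column order; missing keys become empty strings. */
    static List<String> toRecord(List<String> columns, Map<String, String> row) {
        List<String> record = new ArrayList<>(columns.size());
        for (String column : columns) {
            record.add(row.getOrDefault(column, ""));
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("project", "alpha");
        row.put("owner", "sam");
        List<String> columns = headerOf(row);
        System.out.println(String.join(",", columns));
        System.out.println(String.join(",", toRecord(columns, row)));
    }
}
```

Looking values up by column name means every row is written in the same order even if the individual maps were filled in a different order.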

Cannot iterate through CSV columns

I'm building a stock screener that applies a calculation through each column of a csv file. However, when I run the for loop, I only get one result back.
String path = "C:/Users/0/Desktop/Git/Finance/Data/NQ100.csv";
Reader buf = Files.newBufferedReader(Paths.get(path));
CSVParser parsed = new CSVParser(buf, CSVFormat.DEFAULT.withFirstRecordAsHeader()
.withIgnoreHeaderCase().withTrim());
// Parse tickers
Map<String, Integer> header = parsed.getHeaderMap();
List<String> tickerList = new ArrayList<>(header.keySet());
for (int x=1; x < tickerList.size(); x++) { <----------------------- PROBLEM
// Accessing closing price by Header names
List<Double> closeList = new ArrayList<>();
for (CSVRecord record : parsed) {
String stringClose = record.get(x);
Double close = Double.valueOf(stringClose);
closeList.add(close);
}
// Percentage Change
List<Double> pctList = new ArrayList<>();
for (int i=1; i < closeList.size(); i++) {
Double pct = closeList.get(i) / closeList.get(i-1) - 1;
pctList.add(pct);
}
// Statistics
Double sum = 0.0, var = 0.0, mean, sd, rfr, sr;
// Mean
for (Double num : pctList) sum += num;
mean = sum/pctList.size();
// Standard Deviation
for (Double num: pctList) var += Math.pow(num - mean, 2);
sd = Math.sqrt(var/pctList.size());
// Risk Free Rate
rfr = Math.pow((1+0.03),(1/252.0))-1;
// Sharpe Ratio
sr = Math.sqrt(252) * ((mean-rfr)/sd);
System.out.println(tickerList.get(x) + " " + sr);
}
My data looks like this:
,AAL,AAPL,ADBE
2007-10-25,26.311651,23.141403,47.200001
2007-10-26,26.273216,23.384495,47.0
2007-10-29,26.004248,23.43387,47.0
So I was expecting:
AAL XXX
AAPL XXX
ADBE XXX
But I got just:
AAL 0.3604941921663456
Would be grateful if you guys can help me find the problem!
A CSVParser can be iterated only once. It implements Iterable<CSVRecord>, but it reads the records from the underlying reader as you go, so once the input is exhausted there is nothing left to iterate.
So the records are consumed the first time through, while you calculate the statistics for AAL; when the loop reaches AAPL and ADBE, the parser is handled as an empty one.
You can handle this by copying the parsed records into a helper list. Add the following code before the outer for loop (in Java 8 this can of course be a one-liner, but this version works for earlier versions too):
List<CSVRecord> records = new ArrayList<>();
for (CSVRecord record : parsed) {
records.add(record);
}
And change the line:
for (CSVRecord record : parsed) {
to:
for (CSVRecord record : records) {
For the CSV you've provided, you will then get the following output:
AAL -21.583101145880306
AAPL 23.417753561072438
ADBE -16.75343297000953
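The Java 8 one-liner mentioned above could look like the `materialize` method in the following sketch. It is a hedged, library-free illustration: `IterateOnce` and `oneShot` are hypothetical names, and `oneShot` only imitates a source that can be read once, it is not how CSVParser is implemented. With Commons CSV itself, `parsed.getRecords()` should return the same kind of list directly.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

final class IterateOnce {

    /** Copies everything the Iterable still has to offer into a list that can be looped over repeatedly. */
    static <T> List<T> materialize(Iterable<T> source) {
        return StreamSupport.stream(source.spliterator(), false).collect(Collectors.toList());
    }

    /** Stand-in for a one-pass source: every call to iterator() hands back the same shared iterator. */
    static <T> Iterable<T> oneShot(List<T> items) {
        Iterator<T> shared = items.iterator();
        return () -> shared;
    }

    public static void main(String[] args) {
        Iterable<String> source = oneShot(Arrays.asList("row1", "row2", "row3"));
        List<String> records = materialize(source);
        int firstPass = 0;
        for (String r : records) firstPass++;
        int secondPass = 0;
        for (String r : records) secondPass++;
        int leftInSource = 0;
        for (String r : source) leftInSource++;
        // the list can be traversed twice (3 and 3); the original source is now exhausted (0)
        System.out.println(firstPass + " " + secondPass + " " + leftInSource);
    }
}
```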
Here's a block of code that works for me. If I understand your question correctly, you only want to "read" each column and row from a CSV file. Hope it helps.
br = new BufferedReader(new InputStreamReader(new FileInputStream(archivo), "UTF8"));
while ((line = br.readLine()) != null) {
if(a!=0){
String[] datos = line.split(cvsSplitBy);
System.out.println(datos[0] + " - " + datos[1] + " - " + datos[2]);
}
a++;
}
