CSV file to array - Java

I am a novice to Java and I cannot seem to figure this one out. I have a CSV file in the following format:
String1,String2
String1,String2
String1,String2
String1,String2
Each line is a pair, and each new line is a new record. In the real world the CSV file will change in size; sometimes it will have 3 records, or 4, or even 10.
My issue is: how do I read the values into an array and dynamically adjust its size? I would imagine we would first have to parse through the CSV file to get the number of records, then create the array with that size, then go through the CSV again and store the values in it.
I'm just not sure how to accomplish this.
Any help would be appreciated.

You can use an ArrayList instead of an array. An ArrayList is a dynamic array. For example:
Scanner scan = new Scanner(new File("yourfile"));
ArrayList<String[]> records = new ArrayList<String[]>();
String[] record;
while (scan.hasNextLine())
{
    record = scan.nextLine().split(",");
    records.add(record);
}
//now records has your records.
//here is a way to loop through the records (process them)
for (String[] temp : records)
{
    for (String temp1 : temp)
    {
        System.out.print(temp1 + " ");
    }
    System.out.print("\n");
}
Just replace "yourfile" with the absolute path to your file.
If you don't like the enhanced for loop in the first example, here is a more traditional for loop for processing the data:
for (int i = 0; i < records.size(); i++)
{
    for (int j = 0; j < records.get(i).length; j++)
    {
        System.out.print(records.get(i)[j] + " ");
    }
    System.out.print("\n");
}
Both for loops are doing the same thing though.

You can read the CSV into a 2-dimensional array in just 2 lines with the open-source library uniVocity-parsers.
Refer to the following code as an example:
public static void main(String[] args) throws FileNotFoundException {
    /**
     * ---------------------------------------
     * Read CSV rows into 2-dimensional array
     * ---------------------------------------
     */
    // 1st, create a CSV parser with the default configs
    CsvParser parser = new CsvParser(new CsvParserSettings());
    // 2nd, parse all rows from the CSV file into a 2-dimensional array
    List<String[]> resolvedData = parser.parseAll(new FileReader("/examples/example.csv"));
    // 3rd, process the 2-dimensional array with business logic
    // ......
}

tl;dr
Use the Java Collections rather than arrays, specifically a List or Set, to auto-expand as you add items.
Define a class to hold your data read from CSV, instantiating an object for each row read.
Use the Apache Commons CSV library to help with the chore of reading/writing CSV files.
Class to hold data
Define a class to hold the data of each row being read from your CSV. Let's use a Person class with a given name and surname, to be more concrete than the example in your Question.
In Java 16 and later, more briefly define the class as a record.
record Person ( String givenName , String surname ) {}
In older Java, define a conventional class.
package work.basil.example;

public class Person {

    public String givenName, surname;

    public Person ( String givenName , String surname ) {
        this.givenName = givenName;
        this.surname = surname;
    }

    @Override
    public String toString ( ) {
        return "Person{ " +
                "givenName='" + givenName + '\'' +
                " | surname='" + surname + '\'' +
                " }";
    }
}
Collections, not arrays
Using the Java Collections is generally better than using mere arrays. The collections are more flexible and more powerful. See Oracle Tutorial.
Here we will use the List interface to collect each Person object instantiated from data read in from the CSV file. We use the concrete ArrayList implementation of List which uses arrays in the background. The important part here, related to your Question, is that you can add objects to a List without worrying about resizing. The List implementation is responsible for any needed resizing.
If you happen to know the approximate size of your list to be populated, you can supply an optional initial capacity as a hint when creating the List.
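For example, a minimal sketch of the capacity hint (the 1_000 here is an assumed estimate of the row count, not a value from your data):
List<Person> people = new ArrayList<>(1_000); // pre-sizes the backing array; the list still grows automatically if more rows arrive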
Apache Commons CSV
The Apache Commons CSV library does a nice job of reading and writing several variants of CSV and Tab-delimited formats.
Example app
Here is an example app, in a single PersonIo.java file. The Io is short for input-output.
Example data.
GivenName,Surname
Alice,Albert
Bob,Babin
Charlie,Comtois
Darlene,Deschamps
Source code.
package work.basil.example;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;

import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class PersonIo {

    public static void main ( String[] args ) {
        PersonIo app = new PersonIo();
        app.doIt();
    }

    private void doIt ( ) {
        Path path = Paths.get( "/Users/basilbourque/people.csv" );
        List < Person > people = this.read( path );
        System.out.println( "People: \n" + people );
    }

    private List < Person > read ( final Path path ) {
        Objects.requireNonNull( path );
        if ( Files.notExists( path ) ) {
            System.out.println( "ERROR - no file found for path: " + path + ". Message # de1f0be7-901f-4b57-85ae-3eecac66c8f6." );
        }
        List < Person > people = List.of(); // Default to empty list.
        try {
            // Prepare an empty list sized to hold the data read from file.
            int initialCapacity = ( int ) Files.lines( path ).count();
            people = new ArrayList <>( initialCapacity );
            // Read CSV file.
            BufferedReader reader = Files.newBufferedReader( path );
            Iterable < CSVRecord > records = CSVFormat.RFC4180.withFirstRecordAsHeader().parse( reader );
            for ( CSVRecord record : records ) {
                // GivenName,Surname
                // Alice,Albert
                // Bob,Babin
                // Charlie,Comtois
                // Darlene,Deschamps
                String givenName = record.get( "GivenName" );
                String surname = record.get( "Surname" );
                // Use the read data to instantiate a Person object.
                Person p = new Person( givenName , surname );
                // Collect it.
                people.add( p );
            }
        } catch ( IOException e ) {
            e.printStackTrace();
        }
        return people;
    }
}
When run.
People:
[Person{ givenName='Alice' | surname='Albert' }, Person{ givenName='Bob' | surname='Babin' }, Person{ givenName='Charlie' | surname='Comtois' }, Person{ givenName='Darlene' | surname='Deschamps' }]

Related

Arrange List in Java to output specific columns

I have 2 CSV files with the same data, but the output of the two files is in a different column order. I want to output both lists in the same order.
List csv1:
System.out.println(csv1);
Employee, Address, Name, Email
List csv2:
System.out.println(csv2);
Address, Email, Employee, Name
How can I sort the lists to print in the column order:
Employee, Name, Email, Address
Note: I can't use integer indexes like col(1), col(3) because column 1 in csv1 does not match column 1 in csv2.
The data is read as follows:
List<String> ret = new ArrayList<>();
BufferedReader r = new BufferedReader(new InputStreamReader(str));
Stream<String> lines = r.lines().skip(1);
lines.forEachOrdered(
    line -> {
        line = line.replace("\"", "");
        ret.add(line);
    });
I've assumed that you need to parse these two CSV files and output the columns in a fixed order.
You can use the Apache Commons CSV library for parsing. I've used the example files below.
Solution using an external library:
test1.csv
Address,Email,Employee,Name
SecondMainRoad,test2@gmail.com,Frank,Michael
test2.csv
Employee,Address,Name,Email
John,FirstMainRoad,Doe,test@gmail.com
Sample program
public static void main(String[] args) throws IOException {
    try (Reader csvReader = Files.newBufferedReader(Paths.get("test2.csv"))) {
        // Initialize CSV parser and iterator.
        CSVParser csvParser = new CSVParser(csvReader, CSVFormat.Builder.create()
                .setRecordSeparator(System.lineSeparator())
                .setHeader()
                .setSkipHeaderRecord(true)
                .setIgnoreEmptyLines(true)
                .build());
        Iterator<CSVRecord> csvRecordIterator = csvParser.iterator();
        while (csvRecordIterator.hasNext())
        {
            final CSVRecord csvRecord = csvRecordIterator.next();
            final Map<String, String> recordMap = csvRecord.toMap();
            System.out.println(String.format("Employee:%s", recordMap.get("Employee")));
            System.out.println(String.format("Name:%s", recordMap.get("Name")));
            System.out.println(String.format("Email:%s", recordMap.get("Email")));
            System.out.println(String.format("Address:%s", recordMap.get("Address")));
        }
    }
}
Standalone solution:
public class CSVTesterMain {
    public static void main(String[] args) {
        // I have used string variables to hold the csv data. In your case, you can replace them with lines read from the files.
        String csv1 = "Employee,Address,Name,Email\r\n" +
                "John,FirstMainRoad,Doe,test@gmail.com\r\n" +
                "Henry,ThirdCrossStreet,Joseph,email@gmail.com";
        String csv2 = "Address,Email,Employee,Name\r\n" +
                "SecondMainRoad,test2@gmail.com,Michael,Sessner\r\n" +
                "CrossRoad,test25@gmail.com,Vander,John";
        // Map key - holds the header information.
        // Map value - list of lines holding values for the corresponding headers.
        Map<String, List<String>> dataMap = new HashMap<>();
        Stream<String> csv1LineStream = csv1.lines();
        Stream<String> csv2LineStream = csv2.lines();
        // We use the same method to parse different csv formats. We maintain a reference to the headers
        // in the form of the map key, which helps us emit output later in our own column order.
        populateDataMap(csv1LineStream, dataMap);
        populateDataMap(csv2LineStream, dataMap);
        // Now dataMap holds data from multiple csv files. The key of the map determines the header sequence.
        // Print the output in the sequence Employee, Name, Email, Address.
        System.out.println("Employee,Name,Email,Address");
        dataMap.forEach((header, lineList) -> {
            // Logic to determine the index value of each column.
            List<String> headerList = Arrays.asList(header.split(","));
            int employeeIdx = headerList.indexOf("Employee");
            int nameIdx = headerList.indexOf("Name");
            int emailIdx = headerList.indexOf("Email");
            int addressIdx = headerList.indexOf("Address");
            // Now we know the index value of each of these columns and can emit them in our format.
            // You could output to a file in your case.
            // Iterate through each line, split, and output as per the format.
            lineList.forEach(line -> {
                String[] data = line.split(",");
                System.out.println(String.format("%s,%s,%s,%s", data[employeeIdx],
                        data[nameIdx],
                        data[emailIdx],
                        data[addressIdx]));
            });
        });
    }

    private static void populateDataMap(Stream<String> csvLineStream, Map<String, List<String>> dataMap) {
        // Populate the data map, associating the data to its respective headers.
        Iterator<String> csvIterator = csvLineStream.iterator();
        // Fetch the header. (In my example, the first line is always the header.)
        String header = csvIterator.next();
        if (!dataMap.containsKey(header))
            dataMap.put(header, new ArrayList<>());
        // Iterate through the remaining lines and populate the data map.
        while (csvIterator.hasNext())
            dataMap.get(header).add(csvIterator.next());
    }
}
Here I am using the Jackson dataformat library to parse the CSV files.
Dependency
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-csv</artifactId>
<version>2.13.2</version>
</dependency>
File 1
employee, address, name, email
1, address 1, Name 1, name1@example.com
2, address 2, Name 2, name2@example.com
3, address 3, Name 3, name3@example.com
File 2
address, email, employee, name
address 4, name4@example.com, 4, Name 4
address 5, name5@example.com, 5, Name 5
address 6, name6@example.com, 6, Name 6
Java Program
Here EmployeeDetails is a POJO class, and it is expected that the location of the CSV files is passed as an argument.
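The original post does not show EmployeeDetails, so here is a minimal sketch of what it might look like (the field names are assumptions derived from the CSV headers; Jackson binds the headers to these bean properties by name):
public class EmployeeDetails {
    private String employee;
    private String address;
    private String name;
    private String email;

    public String getEmployee() { return employee; }
    public void setEmployee(String employee) { this.employee = employee; }
    public String getAddress() { return address; }
    public void setAddress(String address) { this.address = address; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getEmail() { return email; }
    public void setEmail(String email) { this.email = email; }
}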
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class EmployeeDataParser {
    public static void main(String[] args) {
        File directoryPath = new File(args[0]);
        File[] filesList = directoryPath.listFiles();
        List<EmployeeDetails> employeeDetails = new ArrayList<>();
        EmployeeDataParser employeeDataParser = new EmployeeDataParser();
        for (File file : filesList) {
            System.out.println("File path: " + file.getAbsolutePath());
            employeeDataParser.readEmployeeData(employeeDetails, file.getAbsolutePath());
        }
        System.out.println("number of employees in list: " + employeeDetails.size());
        employeeDataParser.printEmployeeDetails(employeeDetails);
    }

    private List<EmployeeDetails> readEmployeeData(List<EmployeeDetails> employeeDetails,
                                                   String filePath) {
        CsvMapper csvMapper = new CsvMapper();
        CsvSchema schema = CsvSchema.emptySchema().withHeader();
        ObjectReader oReader = csvMapper.readerFor(EmployeeDetails.class).with(schema);
        try (Reader reader = new FileReader(filePath)) {
            MappingIterator<EmployeeDetails> mi = oReader.readValues(reader);
            while (mi.hasNext()) {
                EmployeeDetails current = mi.next();
                employeeDetails.add(current);
            }
        } catch (IOException e) {
            System.out.println("IOException caught!");
            e.printStackTrace();
        }
        return employeeDetails;
    }

    private void printEmployeeDetails(List<EmployeeDetails> employeeDetails) {
        System.out.printf("%5s %10s %15s %25s", "Employee", "Name", "Email", "Address");
        System.out.println();
        for (EmployeeDetails empDetail : employeeDetails) {
            System.out.format("%5s %15s %25s %15s", empDetail.getEmployee(),
                    empDetail.getName(),
                    empDetail.getEmail(),
                    empDetail.getAddress());
            System.out.println();
        }
    }
}

Ignoring invalid object in CSV file

So I got a .csv over which I am iterating and creating objects based on the columns. Now, in the constructor of the to-be-generated object I'm checking a few conditions and throwing Exceptions if said conditions are not met.
Now I've been asking myself - assuming there are some objects in that list that would cause an Exception to be thrown, would there be any possibility to stop going through the constructor of the about-to-be-created-object and simply go to the next line in the .csv and continue building my list?
So in a nutshell:
go over a .csv
build objects based on columns
if an object cannot be created (because an Exception is being thrown in the constructor), ignore it and go to the next element in the list
Is this possible?
Thanks!
I actually found a solution. I put the line where I was getting my data from the .csv in a try-catch block, so the program wouldn't terminate. However, as this line is in a method that has to return an object, I needed to return something outside of the try-catch block, which is why I returned null. The list in my main was then being filled with valid objects, but also with a few null objects.
I then "removed" the null objects from the list in my main by filtering them out while filling up my list, using .stream().map(SimpleCsvParser::parseLine).filter(p -> p != null).collect(Collectors.toList());.
Thanks!
I needed to return an object outside of the try-catch-block, which is why I returned null.
Re-organize your code so that you do not always have to return a new object. Trap for failure of the new-object creation: if an exception is thrown, skip that particular loop iteration when it comes to adding to the collection of new objects.
Let's make a Person class for a demo. Note how in the constructor we look for valid data, throwing an IllegalArgumentException if not received. This is called data-validation.
package work.basil.example;

import java.util.Objects;

public class Person
{
    public String givenName, surname;

    // -------| Constructors |------------
    public Person ( String givenName , String surname )
    {
        if ( Objects.isNull( givenName ) || givenName.isEmpty() ) { throw new IllegalArgumentException(); }
        if ( Objects.isNull( surname ) || surname.isEmpty() ) { throw new IllegalArgumentException(); }
        this.givenName = givenName;
        this.surname = surname;
    }

    // -------| Accessors |------------
    // Read-only, no setters.
    public String getGivenName ( )
    {
        return givenName;
    }

    public String getSurname ( )
    {
        return surname;
    }

    // -------| Object |-----------------
    @Override
    public boolean equals ( Object o )
    {
        if ( this == o ) return true;
        if ( o == null || getClass() != o.getClass() ) return false;
        Person person = ( Person ) o;
        return givenName.equals( person.givenName ) &&
                surname.equals( person.surname );
    }

    @Override
    public int hashCode ( )
    {
        return Objects.hash( givenName , surname );
    }

    @Override
    public String toString ( )
    {
        return "Person{ " +
                "givenName='" + givenName + '\'' +
                " | surname='" + surname + '\'' +
                " }";
    }
}
And some fake data. We expect failure on the third line, where the first name is Zero but the last name field is blank. Our Person class requires two String objects that are both non-null and non-empty.
String input =
        "Alice,Anderson\r\n" + // Standard CSV requires CRLF as newline.
        "Bob,Barker\r\n" +
        "Zero,\r\n" +
        "Carol,Carrington";
Use the Apache Commons CSV library to do the work of reading our input.
➥ Notice the inner try-catch around the calls to csvRecord.get and the new Person. If those lines fail, we skip adding a Person object to our List. The loop moves on to the next input from the CSV file.
At no point are we trying to juggle a null object of type Person. If we do not have a valid Person object at the ready, we move on.
List < Person > persons = new ArrayList <>();
CSVFormat format = CSVFormat.RFC4180;
try ( // Try-with-resources syntax used here, to automatically close the `Reader` whether or not exceptions are thrown.
        Reader reader = new StringReader( input ) ;
)
{
    CSVParser parser = new CSVParser( reader , format );
    for ( CSVRecord csvRecord : parser ) // Each line of CSV is parsed into a `CSVRecord` object.
    {
        try
        {
            String givenName = csvRecord.get( 0 ); // Annoying zero-based index number.
            String surname = csvRecord.get( 1 ); // `CSVRecord::get` throws `ArrayIndexOutOfBoundsException` if no value is found for that index.
            Person person = new Person( givenName , surname );
            persons.add( person ); // If an exception is thrown during `CSVRecord::get` or during construction of the `Person` object, this line of code is not executed.
        }
        catch ( ArrayIndexOutOfBoundsException | IllegalArgumentException | NullPointerException e )
        {
            // Log the issue, and move on to the next loop iteration.
            System.out.println( "INFO Import failed on row # " + csvRecord.getRecordNumber() + ". Exception: " + e );
            // Be aware that if your CSV input has multi-line values, the returned record
            // number does *not* correspond to the current line number of the parser that created this record.
        }
    }
}
catch ( IOException e )
{
    e.printStackTrace();
}
System.out.println( "persons = " + persons );
When run.
INFO Import failed on row # 3. Exception: java.lang.IllegalArgumentException
persons = [Person{ givenName='Alice' | surname='Anderson' }, Person{ givenName='Bob' | surname='Barker' }, Person{ givenName='Carol' | surname='Carrington' }]

Compare Two CSV Files and Fetch Data

I have two CSV files. One master CSV file has around 500,000 records. Another daily CSV file has 50,000 records.
The daily CSV file misses a few columns, which have to be fetched from the master CSV file.
For example
DailyCSV File
id,name,city,zip,occupation
1,Jhon,Florida,50069,Accountant
MasterCSV File
id,name,city,zip,occupation,company,exp,salary
1, Jhon, Florida, 50069, Accountant, AuditFirm, 3, $5000
What I have to do is read both files and match the records by ID. If an ID is present in the master file, then I have to fetch the company, exp, and salary and write them to a new CSV file.
How can I achieve this?
What I have done currently:
while (true) {
    line = bstream.readLine();
    lineMaster = bstreamMaster.readLine();
    if (line == null || lineMaster == null)
    {
        break;
    }
    else
    {
        while (lineMaster != null)
            readlineSplit = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);
        String splitId = readlineSplit[4];
        String[] readLineSplitMaster = lineMaster.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);
        String SplitIDMaster = readLineSplitMaster[13];
        System.out.println(splitId + "|" + SplitIDMaster);
        //System.out.println(splitId.equalsIgnoreCase(SplitIDMaster));
        if (splitId.equalsIgnoreCase(SplitIDMaster)) {
            String writeLine = readlineSplit[0] + "," + readlineSplit[1] + "," + readlineSplit[2] + "," + readlineSplit[3] + "," + readlineSplit[4] + "," + readlineSplit[5] + "," + readLineSplitMaster[15] + "," + readLineSplitMaster[16] + "," + readLineSplitMaster[17];
            System.out.println(writeLine);
            pstream.print(writeLine + "\r\n");
        }
    }
}
pstream.close();
fout.flush();
bstream.close();
bstreamMaster.close();
First of all, your current parsing approach will be painfully slow. Use a dedicated CSV parsing library to speed things up. With uniVocity-parsers you can process your 500K records in less than a second. This is how you can use it to solve your problem:
First let's define a few utility methods to read/write your files:
//opens the file for reading (using UTF-8 encoding)
private static Reader newReader(String pathToFile) {
    try {
        return new InputStreamReader(new FileInputStream(new File(pathToFile)), "UTF-8");
    } catch (Exception e) {
        throw new IllegalArgumentException("Unable to open file for reading at " + pathToFile, e);
    }
}

//creates a file for writing (using UTF-8 encoding)
private static Writer newWriter(String pathToFile) {
    try {
        return new OutputStreamWriter(new FileOutputStream(new File(pathToFile)), "UTF-8");
    } catch (Exception e) {
        throw new IllegalArgumentException("Unable to open file for writing at " + pathToFile, e);
    }
}
Then, we can start reading your daily CSV file, and generate a Map:
public static void main(String... args) {
    //First we parse the daily update file.
    CsvParserSettings settings = new CsvParserSettings();
    //here we tell the parser to read the CSV headers
    settings.setHeaderExtractionEnabled(true);
    //and to select ONLY the following columns.
    //This ensures rows of a fixed size will be returned, in case some records come with fewer or more columns than anticipated.
    settings.selectFields("id", "name", "city", "zip", "occupation");
    CsvParser parser = new CsvParser(settings);
    //Here we parse all data into a list.
    List<String[]> dailyRecords = parser.parseAll(newReader("/path/to/daily.csv"));
    //And convert them to a map. IDs are the keys.
    Map<String, String[]> mapOfDailyRecords = toMap(dailyRecords);
    ... //we'll get back here in a second.
This is the code to generate a Map from the list of daily records:
/* Converts a list of records to a map. Uses the element at index 0 as the key. */
private static Map<String, String[]> toMap(List<String[]> records) {
    HashMap<String, String[]> map = new HashMap<String, String[]>();
    for (String[] row : records) {
        //column 0 will always have an ID.
        map.put(row[0], row);
    }
    return map;
}
With the map of records, we can process your master file and generate the list of updates:
private static List<Object[]> processMasterFile(final Map<String, String[]> mapOfDailyRecords) {
    //we'll put the updated data here
    final List<Object[]> output = new ArrayList<Object[]>();
    //configures the parser to process only the columns you are interested in.
    CsvParserSettings settings = new CsvParserSettings();
    settings.setHeaderExtractionEnabled(true);
    settings.selectFields("id", "company", "exp", "salary");
    //All parsed rows will be submitted to the following RowProcessor. This way the bigger master file won't
    //have all its rows stored in memory.
    settings.setRowProcessor(new AbstractRowProcessor() {
        @Override
        public void rowProcessed(String[] row, ParsingContext context) {
            // Incoming rows from MASTER will have the ID at index 0.
            // If the daily update map contains the ID, we'll get the daily row.
            String[] dailyData = mapOfDailyRecords.get(row[0]);
            if (dailyData != null) {
                //We got a match. Let's join the data from the daily row with the master row.
                Object[] mergedRow = new Object[8];
                for (int i = 0; i < dailyData.length; i++) {
                    mergedRow[i] = dailyData[i];
                }
                for (int i = 1; i < row.length; i++) { //starts from 1 to skip the ID at index 0
                    mergedRow[i + dailyData.length - 1] = row[i];
                }
                output.add(mergedRow);
            }
        }
    });
    CsvParser parser = new CsvParser(settings);
    //the parse() method will submit all rows to the RowProcessor defined above.
    parser.parse(newReader("/path/to/master.csv"));
    return output;
}
Finally, we can get the merged data and write everything to another file:
... // getting back to the main method here
//Now we process the master data and get a list of updates
List<Object[]> updatedData = processMasterFile(mapOfDailyRecords);
//And write the updated data to another file
CsvWriterSettings writerSettings = new CsvWriterSettings();
writerSettings.setHeaders("id", "name", "city", "zip", "occupation", "company", "exp", "salary");
writerSettings.setHeaderWritingEnabled(true);
CsvWriter writer = new CsvWriter(newWriter("/path/to/updates.csv"), writerSettings);
//Here we write everything, and get the job done.
writer.writeRowsAndClose(updatedData);
}
This should work like a charm. Hope it helps.
Disclosure: I am the author of this library. It's open-source and free (Apache V2.0 license).
I will approach the problem in a step-by-step manner.
First, I will parse/read the master CSV file and keep its content in a hashmap, where the key will be each record's unique 'id'. As for the value, you can store the fields in a hash or simply create a Java class to store the information.
Example of hash:
{
'1' : { 'name': 'Jhon',
'City': 'Florida',
'zip' : 50069,
....
}
}
Next, read your comparison CSV file. For each row, read the 'id' and check if that key exists in the hashmap you created earlier.
If it exists, then access the information you need from the hashmap and write it to a new CSV file.
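A minimal sketch of those two steps using plain JDK classes (the file names and column positions are assumptions taken from the example data above; a naive split(",") is used, so a real CSV parser is still advisable for quoted fields):
import java.io.*;
import java.util.*;

public class CsvJoinSketch {
    public static void main(String[] args) throws IOException {
        // Step 1: load the master file into a map keyed by id.
        Map<String, String[]> master = new HashMap<>();
        try (BufferedReader r = new BufferedReader(new FileReader("master.csv"))) {
            r.readLine(); // skip the header row
            String line;
            while ((line = r.readLine()) != null) {
                String[] cols = line.split(",");
                master.put(cols[0].trim(), cols);
            }
        }
        // Step 2: stream the daily file, look up each id, and write the joined row.
        try (BufferedReader r = new BufferedReader(new FileReader("daily.csv"));
             PrintWriter w = new PrintWriter(new FileWriter("joined.csv"))) {
            w.println("id,name,city,zip,occupation,company,exp,salary");
            r.readLine(); // skip the header row
            String line;
            while ((line = r.readLine()) != null) {
                String id = line.split(",")[0].trim();
                String[] m = master.get(id);
                if (m != null) { // master columns 5..7 are company, exp, salary
                    w.println(line + "," + m[5].trim() + "," + m[6].trim() + "," + m[7].trim());
                }
            }
        }
    }
}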
Also, you might want to consider using a 3rd-party CSV parser to make this task easier.
If you use Maven, you can follow this example I found on the net. Otherwise you can just google for an Apache Commons CSV parser example:
http://examples.javacodegeeks.com/core-java/apache/commons/csv-commons/writeread-csv-files-with-apache-commons-csv-example/

XML/JSON comparison: XML writes faster, JSON reads faster - is my measurement wrong?

DISCLAIMER: I do not want to discuss about JSON and XML. Really.
I made the following benchmark and I wonder if I am totally wrong:
#1
I made a simple model of Java POJOs (a course which has a list of students as well as a list of topics, and some properties such as a name). It looks like this:
public class Course {
    private String name;
    private String description;
    private int rating;
    private List< Student > students = new ArrayList<>();
    private List< Topic > topics = new ArrayList<>();
    // getter and setter...
}
public class Student {
...
}
public class Topic {
...
}
#2
I generated a course instance with, let's say, 10,000 students and 10,000 topics, and initialized it with random values:
Course course = new Course();
for ( int i = 0 + t; i < 10000 + t; i++ ) {
    Student student = new Student();
    student.setAge( ( int ) ( 100 * Math.random() ) + t );
    student.setName( UUID.randomUUID().toString() );
    course.getStudents().add( student );
}
for ( int i = 0 + t; i < 10000 + t; i++ ) {
    Topic topic = new Topic();
    topic.setDescription( UUID.randomUUID().toString() );
    topic.setDifficulty( ( int ) ( 10 * Math.random() ) + t );
    topic.setName( UUID.randomUUID().toString() );
    topic.setRating( ( int ) ( 10 * Math.random() ) + t );
    course.getTopics().add( topic );
}
#3
I wrote this Course object to a file (500 times in a loop) and measured the time with XML and JSON. And I read the object from a file (500 times in a loop), also with XML and JSON. For example, here is the code I used to write the object as JSON and the code I used to write it as XML:
Writer writer = new FileWriter( this.jsonFile );
Gson gson = new GsonBuilder().create();
Course course = createCourse( i );
gson.toJson( course, writer );
writer.close();
JAXBContext jaxbContext = JAXBContext.newInstance( Course.class );
Marshaller jaxbMarshaller = jaxbContext.createMarshaller();
jaxbMarshaller.setProperty( Marshaller.JAXB_FORMATTED_OUTPUT, false );
Course course = createCourse( i );
jaxbMarshaller.marshal( course, this.xmlFile );
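For reference, the corresponding read side would look roughly like this (a sketch assumed from the standard Gson and JAXB APIs, reusing the gson and jaxbContext objects from the write snippets; the original post shows only the write code):
Reader jsonReader = new FileReader( this.jsonFile );
Course courseFromJson = gson.fromJson( jsonReader, Course.class );
jsonReader.close();
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
Course courseFromXml = ( Course ) jaxbUnmarshaller.unmarshal( this.xmlFile );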
#4
I did this with 200, 2,000, 20,000, 100,000 and 200,000 students and topics in the course object. The results were the following:
The XML file is read much more slowly than the JSON file.
The JSON file is written more slowly than the XML file, but not by much. Also, if I use a small object (200 and 2,000 students/topics), JSON is a little bit faster (but nearly the same).
The Google Doc with more details is at https://docs.google.com/spreadsheets/d/1t6tRfja6cKEnl7oYxSa69bwDvIcDniy9lK5rzOZqb30/edit?usp=sharing
#5
I used:
Java 8
GSON for JSON
JAXB for XML
Question
Is there any logical explanation for my results? Did I make an obvious mistake in my measurements? Or is it just a coincidence?

Reading data from CSV file. Convert data into an array

I am trying to get a project finished but am having no luck. It is an online course, so my only communication is through email. The instructor has yet to reply to my four emails over the last five days.
For this assignment we had to download a CSV file containing NASDAQ stock price info for a specific company. I chose GOOG (Google). Below are the requirements for the code portion.
Create a second file ReadFiles.java. This is the file that will read in the data from your csv file. Note: You will want to use a smaller version of your data file (20 rows) for testing.
Your ReadFiles.java class requires the following methods:
Method: check to see if the file exists
Method: find number of rows in csv file
Method: Converts the csv file to a multi-dimensional array
Method: PrintArray
Method: Return array using a get method
Create a file DataAnalyzer.java. This file will be used to call the methods in ReadFiles.java. Be sure to demonstrate that all of your methods work through DataAnalyzer.java.
This is what I have so far.
package Analysis;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.Scanner;

public class ReadFiles
{
    public static int numberOfRows;
    public static int rowNumber = 0;
    public static int columnNumber = 0;

    public static void main(String[] args)
    {
        Scanner kb = new Scanner(System.in);
        String fileName;
        System.out.print("Enter the file name >> ");
        fileName = kb.nextLine();
        File f = new File("D:\\Java\\Assignment 3\\" + fileName);
        if (f.exists())
        {
            System.out.print("File exists.");
        }
        fileName = "D:\\Java\\Assignment 3\\" + fileName;
        try
        {
            BufferedReader br = new BufferedReader(new FileReader(fileName));
            StringTokenizer st = null;
            while ((fileName = br.readLine()) != null)
            {
                rowNumber++;
                numberOfRows++;
                st = new StringTokenizer(fileName, ",");
                while (st.hasMoreTokens())
                {
                    columnNumber++;
                    System.out.println("Row " + rowNumber +
                            ", Column " + columnNumber
                            + ", Entry : " + st.nextToken());
                }
                columnNumber = 0;
            }
        }
        catch (FileNotFoundException e)
        {
            e.printStackTrace();
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }

    public static void rows()
    {
        System.out.println("Total Rows: " + numberOfRows);
    }
}
The book we have been given for the course is no help. All of the "Examples" and "You do it" portions give errors. Also, in the entire chapter this assignment is based on, there is not one mention of an array.
When I run this code I do not get any error. I am shown the following:
File exists.
Row 1, Column 1, Entry : 30/12/2011
Row 1, Column 2, Entry : 642.02
Row 1, Column 3, Entry : 646.76
Row 1, Column 4, Entry : 642.02
Row 1, Column 5, Entry : 645.9
Row 1, Column 6, Entry : 1782300
Row 1, Column 7, Entry : 645.9
Row 2, Column 1, Entry : 29/12/2011
Row 2, Column 2, Entry : 641.49
I am shown from row 1 - 19 (the entire file).
What I do not understand is how to create separate methods in this class to convert to an array, print the array, and return the array.
Any help would be much appreciated.
Thanks
You need to define 2 classes, DataAnalyzer and ReadFiles. You usually have one file per class, although this is not a requirement. The structure of ReadFiles has been provided, so you will have a file called ReadFiles.java like this:
public class ReadFiles{
    //instance var(s)
    ...
    //constructor(s)
    ...
    //method(s)

    /**
     * Checks whether the file exists
     */
    public boolean exists(){
        ....
    }

    /*
     * Number of rows in the file
     */
    public int getRowCount(){
        ....
    }

    // add the rest yourself!!
}
You'll also need a file called DataAnalyzer.java:
public class DataAnalyzer{
    public static void main(String[] args){
        //create a ReadFiles instance, call its methods, and check that they return what is expected
    }
}
Assume the ReadFiles class manages a single input file; it probably needs an instance variable to hold that information. The DataAnalyzer will need to tell ReadFiles which file to analyse (a constructor parameter seems a good choice).
My advice is to create your skeleton structure (you have already been told what it is) and start building the functionality of each method, one at a time.
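For instance, the array-conversion method could look roughly like this (a sketch only; the fileName instance variable and the getRowCount() method are assumed from the skeleton above):
// Sketch: convert the CSV file into a multi-dimensional array.
public String[][] toArray() throws IOException {
    String[][] data = new String[getRowCount()][];
    try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
        String line;
        int row = 0;
        while ((line = br.readLine()) != null) {
            data[row++] = line.split(","); // each row becomes one array of column values
        }
    }
    return data;
}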
