Parse Excel spreadsheet into model in Java

I have an excel spreadsheet that contains service delivery information for a single client at a time. For example, Max Inc will be provided with health assessments at 3 of their corporate offices. An office may have deliveries of health assessments (service type) on multiple days and performed by different doctors.
I've created what I believe to be JavaBeans to ultimately represent all this information and the relationship between entities such as client, delivery, service, and individual sessions in a delivery.
My problem now is, what is the best way to read in and parse the data from the excel spreadsheet?
I was thinking I could create a static utility class (like a factory class) that reads the Excel spreadsheet (using Apache POI HSSF) in one method, uses a variety of other methods to parse the data from the spreadsheet, and ultimately returns a client object which contains all the other objects, and so on.
At some point during this process I also need some data from a SQL Server DB, which I thought I would just pull in using JDBC as needed.
Am I heading in the right direction with this approach? Or would you recommend doing this another way?

Try simplifying this a little. Save the spreadsheet as a CSV file, which is really easy to import into Java and subsequently into a database.
You could use a BufferedReader to read in the lines and split them on the "," delimiter. Each split gives you a String array whose values you can add to your database.
Definitely try and avoid interfacing with Excel.
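A minimal sketch of that CSV approach, assuming each exported row looks like clientName,officeName,serviceType,doctor (the column layout is an assumption, not something from the question):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

public class CsvImport {

    // Reads one delivery row per line, splitting on the "," delimiter.
    // Note: this naive split breaks on quoted fields that contain commas;
    // use a CSV library if your export can contain those.
    static List<String[]> readRows(Reader source) throws IOException {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                rows.add(line.split(",", -1)); // -1 keeps trailing empty fields
            }
        }
        return rows;
    }
}
```

From here, each String[] can be mapped into your beans or inserted into the database.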

Simply reading cells from an Excel sheet is easy with POI. See: http://poi.apache.org/spreadsheet/quick-guide.html#CellContents
This will avoid any manual steps like converting to another format.
All you really need is a factory method to read in the spreadsheet and return back a list of objects. Then you can do your thing.
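A hedged sketch of such a factory method, assuming a reasonably recent POI version (WorkbookFactory handles both .xls and .xlsx, including the HSSF case from the question); mapping the raw rows into your Client/Delivery beans depends on your spreadsheet layout:

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.ss.usermodel.WorkbookFactory;

public class ClientFactory {

    // Sketch of the factory method: walks the first sheet and collects
    // the cell text row by row. Turning those rows into Client, Delivery,
    // Service, etc. beans would happen in further private methods.
    static List<List<String>> readSheet(File excelFile) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        try (Workbook workbook = WorkbookFactory.create(excelFile)) {
            Sheet sheet = workbook.getSheetAt(0);
            for (Row row : sheet) {
                List<String> values = new ArrayList<>();
                for (Cell cell : row) {
                    values.add(cell.toString()); // crude; prefer DataFormatter in real code
                }
                rows.add(values);
            }
        }
        return rows;
    }
}
```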

Related

What is the best approach for saving statistical data on a file using spring framework?

What is the best approach for saving statistical data to a file using the Spring framework? Is there any available library that offers reading and updating the data in a file, or should I build my own I/O code?
I already have a relational database, but I don't like the approach of creating additional tables to save the calculated values across multiple tables with joins; I also don't want to add more complexity to the project by using an additional database like MongoDB for just one task.
To understand the complexity of this report, imagine you are drawing a chart of the total number of daily transactions for a full year, with billions of records at any time, plus a lot of extra information (totals and averages in different currencies at different rates).
So my approach was to generate those data into a file on a regular basis, so that later I don't need to generate them again when requested; I only append the new dates to the file as they become available.
Is this approach fine? and what is the best library to do that in an efficient way?
Update
I found this answer useful for why sometimes people prefer using flat files rather than the relational or non-relational one
Is it faster to access data from files or a database server?
I would prefer to use MongoDB for such purposes, but if you need a simple approach, you can write your data to a CSV/Excel file.
Just use plain I/O:
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

List<String> data = new ArrayList<>();
data.add("head1;head2;head3"); // header row
data.add("a;b;c");
data.add("e;f;g");
data.add("9;h;i");
Files.write(Paths.get("my.csv"), data); // writes one line per list entry
That is all)
How to convert your own object to such a string ('field1;field2') I think you know.
You could also use a CSV library (e.g. Apache Commons CSV), but I think plain I/O like this is much faster.
Files.write(Paths.get("my.csv"), data, StandardOpenOption.APPEND);
If you want to append data to an existing file; there are many different options in StandardOpenOption.
For reading, you can use Files.readAllLines(Paths.get("my.csv")); it returns a list of strings.
Also you can read lines in range.
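For example, a slice of the file can be read lazily with Files.lines plus skip/limit, without loading the whole file at once:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CsvRange {

    // Reads lines [from, from + count) of a text file; the stream
    // is closed by try-with-resources so the file handle is released.
    static List<String> readRange(Path file, long from, long count) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.skip(from).limit(count).collect(Collectors.toList());
        }
    }
}
```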
But if you need to retrieve a single column, update rows matching a condition, and so on, you should read about MongoDB or other non-relational databases. It is difficult to cover MongoDB here; read the documentation.
Enjoy)
I found a library that can be used to write/read CSV files easily and can map them to objects as well: Jackson data formats.
Find an example with Spring.
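A rough sketch of that Jackson CSV mapping (using jackson-dataformat-csv); the DailyStat bean and its field names are assumptions for illustration:

```java
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class StatsCsv {

    // A plain bean matching the CSV columns; names are illustrative.
    public static class DailyStat {
        public String date;
        public long total;
        public double average;
    }

    // Reads the whole CSV (with a header row) into a list of beans.
    static List<DailyStat> read(File csv) throws IOException {
        CsvMapper mapper = new CsvMapper();
        CsvSchema schema = mapper.schemaFor(DailyStat.class).withHeader();
        try (MappingIterator<DailyStat> it =
                 mapper.readerFor(DailyStat.class).with(schema).readValues(csv)) {
            return it.readAll();
        }
    }

    // Writes the beans back out with a header row.
    static void write(File csv, List<DailyStat> stats) throws IOException {
        CsvMapper mapper = new CsvMapper();
        CsvSchema schema = mapper.schemaFor(DailyStat.class).withHeader();
        mapper.writer(schema).writeValue(csv, stats);
    }
}
```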

Informix, MySQL and Oracle blob contains

We have an application that runs with any of IBM Informix, MySQL and Oracle, and we are using Java with Hibernate to connect to the database. We will store XML, CSV and other text-based files inside the database (clob column). The entities in Java are byte[] objects.
One feature request to the application is now to "grep" content inside the data. So I need to find all files with a specific content.
On regular char/varchar fields I can use like '%xyz%', but this is not working on byte[] / blobs.
The first approach was to load each entity, cast the byte[] into a string and use the contains method in Java. If the user enters any filter parameters on other (non-clob) columns, I apply those filters first, to reduce the number of blobs I have to scan.
That worked quite well for 100 files (clobs), as long as the application and database are on the same server. But I think it will get really slow if I have 1,000,000 files inside the database and the database is not always on the same network. So I think that is not a good idea.
My next thought was creating a database procedure, but I am not quite sure whether this is possible in all of Informix, MySQL and Oracle.
The last but not favored method is to store the content of the data not inside a clob. Maybe I can use a different datatype for that?
Does anyone have a good idea how to realize that? I need a solution for all three DBMS. The application knows what kind of DBMS it is connected to, so it would be okay if I had three different solutions (one for each DBMS).
I am completely open to changing what kind of datatype I use (BLOB, CLOB ...) — I can modify that as I want.
Note: the clobs will range from about 5 KiB to about 500 KiB, with a maximum of 1 MiB.
Look into Apache Lucene or other text indexing library.
https://en.wikipedia.org/wiki/Lucene
http://en.wikipedia.org/wiki/Full_text_search
If you go with a DB specific solution like Oracle Text Search you will have to implement a custom solution for each database. I know from experience that Oracle Text search takes significant time to learn and involves a lot of tweaking to get just right.
Also, if you use a DB-specific solution you would receive different results in each DB even if the data sets were the same (each DB has its own methods of indexing and retrieving the data).
By going with a third-party solution like Lucene, you only have to learn one solution and the results will be consistent regardless of the DB.
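A rough sketch of what that looks like with the Lucene API (exact signatures vary between Lucene releases, so verify against your version; the field names "id" and "content" are assumptions):

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.Directory;

public class ClobIndex {

    // Index one clob's text under its database primary key.
    // (Opening a writer per document is wasteful; batch your inserts in real code.)
    static void index(Directory dir, long id, String clobText) throws IOException {
        try (IndexWriter writer =
                 new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document doc = new Document();
            doc.add(new StoredField("id", id));                      // stored, not searched
            doc.add(new TextField("content", clobText, Field.Store.NO)); // searched, not stored
            writer.addDocument(doc);
        }
    }

    // Returns the ids of clobs whose content matches the query text,
    // replacing the like '%xyz%' scan with an index lookup.
    static long[] search(Directory dir, String queryText) throws Exception {
        try (DirectoryReader reader = DirectoryReader.open(dir)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            ScoreDoc[] hits = searcher.search(
                new QueryParser("content", new StandardAnalyzer()).parse(queryText),
                100).scoreDocs;
            long[] ids = new long[hits.length];
            for (int i = 0; i < hits.length; i++) {
                ids[i] = searcher.doc(hits[i].doc).getField("id").numericValue().longValue();
            }
            return ids;
        }
    }
}
```

The index lives outside the databases, so the same code works unchanged against Informix, MySQL and Oracle; the trade-off is that you must keep the index in sync when clobs change.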

Bulk Update in Oracle from CSV file

I have a table and a CSV file. What I want to do is update the table from the CSV.
The CSV file is as follows (no delta):
1,yes
2,no
3,yes
4,yes
What I have done, through Java:
1. Read the CSV file and build two lists, yesContainList and noContainList.
2. Add to each list the id values that have yes and no, respectively.
3. Turn each list into a comma-separated string.
4. Update the table using the comma-separated string.
It works fine, but when I need to handle lakhs (hundreds of thousands) of records it gets somewhat slow.
Could anyone tell me whether this is the correct way, or is there a better way to do this update?
There are 2 basic techniques to do this:
sqlldr
Use an external table.
Both methods are explained here:
Update a column in table using SQL*Loader?
Jobs like bulk operations, imports, exports, or heavy SQL work are not recommended to be done outside the RDBMS, for performance reasons.
By fetching and sending large tables through ODBC-like APIs you will suffer network round trips, memory usage, and I/O hits.
When designing a client-server application (like J2EE), would you design a heavy batch operation to be called and controlled synchronously from the user-interface layer, or would you design a server-side process triggered by a client's command?
Think about your java code as UI layer and RDBMS as server side.
BTW, RDBMSs have embedded features for these operations, like SQL*Loader in Oracle.
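If the update does stay on the Java side, the comma-separated-string approach from the question can be replaced with a JDBC batch, which scales much better; a sketch under the assumption that the table is called my_table with id and flag columns (hypothetical names):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CsvBatchUpdate {

    // Parses "id,flag" CSV lines into an ordered id -> flag map.
    static Map<Integer, String> parse(List<String> csvLines) {
        Map<Integer, String> rows = new LinkedHashMap<>();
        for (String line : csvLines) {
            String[] parts = line.split(",");
            rows.put(Integer.valueOf(parts[0].trim()), parts[1].trim());
        }
        return rows;
    }

    // Sends the updates in batches instead of building one giant
    // comma-separated IN (...) string per flag value.
    static void update(Connection conn, Map<Integer, String> rows) throws SQLException {
        String sql = "UPDATE my_table SET flag = ? WHERE id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (Map.Entry<Integer, String> e : rows.entrySet()) {
                ps.setString(1, e.getValue());
                ps.setInt(2, e.getKey());
                ps.addBatch();
                if (++pending % 1000 == 0) {
                    ps.executeBatch(); // flush every 1000 rows
                }
            }
            ps.executeBatch(); // flush the remainder
        }
    }
}
```

For truly huge files, though, the sqlldr/external-table answers above will still beat any JDBC approach.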

Validate microsoft Excel data cell in Java

How do I validate an Excel column in Java (Spring)? Validation such as: a column should be a string, or should not be null. After reading the Excel file, I do validation; then all the data will be stored in a database table.
The short answer is: you probably have to use the JXL library (or something similar). It allows you to access concrete cells in Excel sheets and get their values (amongst other things).
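Whichever library reads the cells, the validation itself can be plain Java. A sketch, assuming the sheet has already been read into String arrays (the not-null/not-blank rule and the message format are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class CellValidator {

    // Validates one sheet of already-read cell values: every cell in
    // `requiredColumns` must be present and non-blank. Returns one
    // message per violation; an empty list means the sheet is valid,
    // and only then should the rows be stored in the database.
    static List<String> validate(String[][] rows, int... requiredColumns) {
        List<String> errors = new ArrayList<>();
        for (int r = 0; r < rows.length; r++) {
            for (int c : requiredColumns) {
                String value = c < rows[r].length ? rows[r][c] : null;
                if (value == null || value.trim().isEmpty()) {
                    errors.add("Row " + (r + 1) + ", column " + (c + 1) + " must not be empty");
                }
            }
        }
        return errors;
    }
}
```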

Store Retrieve Data from Database to Java

What's the best way to store data retrieved from database in to Java for processing?
Here's some context:
Every month, data from Excel files is stored in the database, and our web application (plain JSP/Servlet) does some processing on that data. Currently we use an ArrayList of HashMaps to store the data retrieved from the tables, but it seems very clunky. Is there a better way or data structure for this?
It's not really possible to create some sort of a Model class for each of these because there's no logical "User" object or anything like that. It's basically chunks of random data that need to be processed. Stored procedure is not an answer as well because the processing logic is quite complicated.
Try using Java APIs to get faster execution:
Apache POI
Java Excel API, namely JXL
Check this sample tutorial using JXL: Link
If your Excel files are in CSV format, use openCSV.
It's not really possible to create some sort of a Model class for each of these because there's no logical "User" object or anything like that. It's basically chunks of random data that need to be processed. Stored procedure is not an answer as well because the processing logic is quite complicated.
Then there is not really a better way than using a List<Map<String, Object>>. To soften the pain, you could abstract the Map<String, Object> away by extending it into another class, e.g. Row, with convenience methods to cast the values (.getAsDouble(columnName) or even T get(columnName, Class<T> type), etc.), so that traversing and manipulating it are less scary.
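A sketch of that Row idea (the method names follow the answer's suggestion):

```java
import java.util.HashMap;

// A row of untyped column data with typed convenience accessors,
// so callers don't have to cast Map values by hand.
public class Row extends HashMap<String, Object> {

    // Typed accessor: casts the stored value to the requested type,
    // throwing ClassCastException if the column holds something else.
    public <T> T get(String columnName, Class<T> type) {
        return type.cast(get(columnName));
    }

    // Convenience accessor for numeric columns; accepts any Number.
    public double getAsDouble(String columnName) {
        return ((Number) get(columnName)).doubleValue();
    }
}
```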