Opencsv cannot read entire file - java

I am trying to read data from a CSV file that is 332,462 KB, with 136 columns and 297,388 rows. Then I want to insert it into an Oracle database table which has exactly the same column mapping, except that I add one more column at the end of the table to record today's date.
So everything looks fine, no exceptions; the only thing is I can only read a small part, like 7,619 rows, and then the program stops. The finished part in the database is what I want, that is correct, but I don't know why it stops. I tried using readNext(), readAll(), and passing an InputStreamReader to CSVReader, and all of these ways have the same result.
What is the cause of this? One thing I am thinking is that this CSV file has some empty rows that the CSVReader reads as the end of the file?
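For reference, a minimal sketch with opencsv (the file name and the opencsv 5.x builder API are assumptions here) that just counts what readNext() actually returns; in my experience a blank line comes back as a single empty field rather than null, so it should not end the read:

import com.opencsv.CSVReader;
import com.opencsv.CSVReaderBuilder;

import java.io.FileReader;

public class RowCounter {
    public static void main(String[] args) throws Exception {
        int rows = 0;
        int blank = 0;
        try (CSVReader reader = new CSVReaderBuilder(new FileReader("data.csv")).build()) {
            String[] fields;
            while ((fields = reader.readNext()) != null) {
                rows++;
                // A blank line usually comes back as one empty field, not as null,
                // so it should not end the read -- it can simply be skipped.
                if (fields.length == 1 && fields[0].trim().isEmpty()) {
                    blank++;
                    continue;
                }
                // ... map the 136 fields and add them to a JDBC batch here ...
            }
        }
        System.out.println("rows read: " + rows + ", blank rows skipped: " + blank);
    }
}

Comparing the printed count against the 297,388 rows you expect should tell you whether opencsv itself stops early or whether the rows are lost later, on the insert side.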

Related

What's fastest between adding row by row and multiple rows at once in MySQL

I'm parsing a large text file to add it into a MySQL database. The file has more than 10K lines, each line with more than 20 columns (columns separated by ",").
Before I start coding, I wanted to know which is the best way (in terms of memory and execution time) between these two solutions:
The easy way, line by line: adding row by row (more than 10K INSERT INTO statements).
The complicated way: parsing the whole file, creating an ArrayList for each column, then inserting all the data in only one statement.
Does the "complicated way" save me a lot of execution time?
Thanks
Thanks to the solution from Gordon Linoff I was able to do it.
The statement to execute from Java was:
LOAD DATA LOCAL INFILE 'file.txt' INTO TABLE TableName FIELDS TERMINATED BY ';';
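For reference, a minimal sketch of running that statement from Java over JDBC; the connection URL, credentials, and the allowLoadLocalInfile flag (which recent Connector/J versions require for LOCAL INFILE) are assumptions:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class BulkLoad {
    public static void main(String[] args) throws Exception {
        // allowLoadLocalInfile=true is needed by recent Connector/J versions for LOCAL INFILE
        String url = "jdbc:mysql://localhost:3306/mydb?allowLoadLocalInfile=true"; // assumed URL
        try (Connection con = DriverManager.getConnection(url, "user", "pass");
             Statement st = con.createStatement()) {
            int loaded = st.executeUpdate(
                    "LOAD DATA LOCAL INFILE 'file.txt' "
                    + "INTO TABLE TableName FIELDS TERMINATED BY ';'");
            System.out.println(loaded + " rows loaded");
        }
    }
}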

Loading a word2vec model into a MySQL database

I have a word2vec model stored in a text file as:
also -0.036738 -0.062687 -0.104392 -0.178325 0.010501 0.049380....
one -0.089568 -0.191083 0.038558 0.156755 -0.037399 -0.013798....
The size of the text file is more than 8GB.
I want to read this file into a MySQL database, using the first word as the key (in one column) and the rest of the line as another column. Is it possible to do so without reading each line and splitting it?
I went through some related questions but they didn't match what I want:
How to read a file and add its content to database?
read text file content and insert it into a mysql database
You can do it by:
making a simple for loop that iterates over the records in the model
aggregating about 100 records in an array
using MySQL's bulk insert feature to insert hundreds of records at once
using a fast language like Go if you can.
This thing you are trying to do is very possible; let me know if you need code for it.
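The answer suggests Go, but the same idea in Java/JDBC might look like the sketch below; the table vectors(word, embedding), the file name, and the connection settings are assumptions, and each line is split once at the first space:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class Word2VecLoader {
    public static void main(String[] args) throws Exception {
        // rewriteBatchedStatements lets Connector/J collapse the batch into multi-row inserts
        String url = "jdbc:mysql://localhost:3306/mydb?rewriteBatchedStatements=true"; // assumed
        try (Connection con = DriverManager.getConnection(url, "user", "pass");
             BufferedReader in = Files.newBufferedReader(Paths.get("model.txt"));      // assumed file
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO vectors (word, embedding) VALUES (?, ?)")) {          // assumed table
            con.setAutoCommit(false);
            String line;
            int pending = 0;
            while ((line = in.readLine()) != null) {
                int space = line.indexOf(' ');
                if (space < 0) continue;                      // skip malformed lines
                ps.setString(1, line.substring(0, space));    // first word = key column
                ps.setString(2, line.substring(space + 1));   // rest of the line = second column
                ps.addBatch();
                if (++pending % 100 == 0) ps.executeBatch();  // ~100 records per round trip
            }
            ps.executeBatch();                                // flush the last partial batch
            con.commit();
        }
    }
}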

How do I best handle a simple column replacement in a database using a java program?

I have a table A [id, name] and it has about, say, 10 million records. I need to replace all the names with 10 million unique names. For this I have a text file that acts as a lookup file; it has 10 million names in it, separated by newlines. So, a bunch of questions:
How do I go about randomly replacing these 10 million names in the database with the 10 million names in the text file? I can think of a few approaches: caching the entire file and creating a map of what has been replaced, so that I never reuse entries from the lookup file, OR loading the lookup file into a database table and making use of that table.
In general, what would be a good ratio of writes to reads that would make the case for using a database instead of a file? Say, if your program is reading a file a million times and writing to another file a million times, would you switch to using a database? What is the upper limit really (if there is any)?
Well, let's think: you have THAT many names that they can't all be loaded into memory, so we are going to find a solution that is as usable as possible.
For the random approach you can create a temp column in the database, create a unique key over it, and always do this:
1) Take the name on line "x" (chosen at random or however you want).
2) Pick a random record "y" in the database which has not been replaced yet (this can be tracked with just one boolean).
3) Try to set the name from line x on record y AND, on the same record, write x into the temp column.
4) If a unique-constraint error comes, it means the name was already given to someone; repeat once more with another x.
If we can track "x" and we are sure we are not reusing already-given names, we don't need the unique constraint.
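A rough JDBC sketch of that unique-key approach; the table and column names (A, name, temp_line, replaced), the connection URL, and the line-reading helper are illustrative assumptions, and the "next un-replaced record" query stands in for a truly random y:

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLIntegrityConstraintViolationException;
import java.util.Random;
import java.util.stream.Stream;

public class RandomNameReplacer {

    // Reads the name on line x of the lookup file without holding the whole file in memory
    // (slow when called repeatedly, but the memory footprint stays flat).
    static String nameAtLine(Path lookup, long x) throws Exception {
        try (Stream<String> lines = Files.lines(lookup)) {
            return lines.skip(x).findFirst().orElseThrow(IllegalStateException::new);
        }
    }

    public static void main(String[] args) throws Exception {
        Path lookup = Paths.get("names.txt");                 // assumed lookup file
        long lineCount;
        try (Stream<String> lines = Files.lines(lookup)) {
            lineCount = lines.count();
        }
        Random rnd = new Random();

        try (Connection con = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//host:1521/service", "user", "pass");   // assumed connection
             PreparedStatement pickY = con.prepareStatement(
                     "SELECT id FROM A WHERE replaced = 0 AND ROWNUM = 1");      // next un-replaced record
             PreparedStatement update = con.prepareStatement(
                     "UPDATE A SET name = ?, temp_line = ?, replaced = 1 WHERE id = ?")) {

            while (true) {
                long id;
                try (ResultSet rs = pickY.executeQuery()) {
                    if (!rs.next()) break;                     // every record has been replaced
                    id = rs.getLong(1);
                }
                while (true) {
                    long x = (long) (rnd.nextDouble() * lineCount);   // step 1: random line x
                    try {
                        update.setString(1, nameAtLine(lookup, x));   // step 3: write the name...
                        update.setLong(2, x);                         // ...and x into the temp column
                        update.setLong(3, id);
                        update.executeUpdate();
                        break;                                        // success, next record
                    } catch (SQLIntegrityConstraintViolationException dup) {
                        // step 4: line x was already used (UNIQUE on temp_line), try another x
                    }
                }
            }
        }
    }
}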
Instead of replacing the names randomly, if I had been in your place I would have opted for a batch-based approach where I would process the data in chunks: have a reader that reads a chunk, a processor that applies the new values from the file, and a writer that writes the updated values back to the database.
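A plain-JDBC sketch of that chunked reader/processor/writer flow (a framework such as Spring Batch would structure it similarly); the connection details, the chunk size, and the assumption that names are paired with ids in order are all illustrative:

import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ChunkedRename {
    private static final int CHUNK = 1_000;   // rows per batch flush

    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//host:1521/service", "user", "pass");   // assumed
             BufferedReader names = Files.newBufferedReader(Paths.get("names.txt"));
             PreparedStatement readIds = con.prepareStatement(
                     "SELECT id FROM A ORDER BY id");                            // the "reader"
             PreparedStatement update = con.prepareStatement(
                     "UPDATE A SET name = ? WHERE id = ?")) {                    // the "writer"

            con.setAutoCommit(false);
            int pending = 0;
            try (ResultSet ids = readIds.executeQuery()) {
                String name;
                while (ids.next() && (name = names.readLine()) != null) {
                    update.setString(1, name);        // the "processor": pair next name with next id
                    update.setLong(2, ids.getLong(1));
                    update.addBatch();
                    if (++pending % CHUNK == 0) {
                        update.executeBatch();        // flush one chunk
                    }
                }
            }
            update.executeBatch();
            con.commit();
        }
    }
}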
Your second question is a bit unclear. The decision to go with files is purely based on the requirement. If you are getting the data in flat files, you will have to read from them. Even if there are billions of rows, moving all the data from a flat file into a database table and then using that table again just to update another table is overkill: you are unnecessarily persisting data which you are not going to use once the intended table column has been updated.

Reading bulk data from a CSV and writing it into another CSV

I have one CSV file with 1 crore (10 million) lines of data. From this file I have to read the data, and using the first field I need to check conditions against my DB; from the DB I take one key, append it to all the previous data, and write it to another CSV. I have written code for this, but it takes days just to read and write 2 lakh (200,000) lines of data. I am using a single thread to do all of this.
The steps I followed are:
1) Reading data from the CSV.
2) Reading the first field from the CSV and checking conditions (here I am checking 5 conditions).
3) Writing into the output CSV.
In my opinion, reading the CSV into the database and then writing a statement to filter the data would be more efficient than reading the file line by line in code.
You may refer to the links below to learn more about CSV-to-database import if you use MySQL:
http://dev.mysql.com/doc/refman/5.0/en/mysqlimport.html
http://support.modwest.com/content/6/253/en/how-do-i-import-delimited-data-into-mysql.html
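As a rough illustration of that approach (MySQL is assumed to match the links above, and the staging/lookup table names and the join are placeholders for the real five conditions): load the file once, let the database do the filtering, and dump the result set to the output CSV, e.g. with opencsv's CSVWriter:

import com.opencsv.CSVWriter;

import java.io.FileWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ImportFilterExport {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/mydb?allowLoadLocalInfile=true"; // assumed
        try (Connection con = DriverManager.getConnection(url, "user", "pass");
             Statement st = con.createStatement()) {

            // 1) bulk-load the big CSV into a staging table (assumed to exist already)
            st.executeUpdate("LOAD DATA LOCAL INFILE 'input.csv' "
                    + "INTO TABLE staging FIELDS TERMINATED BY ','");

            // 2) let the database apply the conditions in one set-based query;
            //    the join below is a placeholder for the real five conditions
            try (ResultSet rs = st.executeQuery(
                         "SELECT s.*, k.some_key FROM staging s "
                         + "JOIN lookup_keys k ON k.field1 = s.field1");
                 CSVWriter out = new CSVWriter(new FileWriter("output.csv"))) {
                out.writeAll(rs, true);   // 3) dump the filtered result set, with headers
            }
        }
    }
}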

How to insert only unique values from a CSV file in Oracle database?

I am uploading a CSV file using a Servlet and inserting it into an Oracle table using JDBC. I need to insert only the unique values from the CSV file. If a value from the CSV file is already in the database table, then it should not be inserted. So it should insert only unique values from the CSV file.
These options should help avoid an additional DB call to handle this situation.
Option 1: Very simple and needs the least coding ... but it only works when there is no global transaction boundary.
Just let all the inserts run; in case of a constraint exception, just catch it, do "nothing", and loop on to the next value (a sketch follows these options).
Option 2: Every time you read a row from the CSV, add it to a collection; before adding, just check whether the object already exists (e.g. arrayList.contains(objectInstance)) and continue adding only when there is no object with the same data. At the end, do a bulk insert.
Note: If the data is large, go for fixed-size chunks for the bulk insert.
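A minimal sketch of Option 1, assuming a unique constraint or index already exists on the relevant column(s) and that the driver maps the violation to SQLIntegrityConstraintViolationException (recent Oracle drivers should; older ones may require checking the SQLException error code). File, table, and column names are placeholders:

import com.opencsv.CSVReader;

import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLIntegrityConstraintViolationException;

public class UniqueOnlyInsert {
    public static void main(String[] args) throws Exception {
        try (Connection con = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//host:1521/service", "user", "pass");   // assumed
             CSVReader csv = new CSVReader(new FileReader("upload.csv"));        // assumed file
             PreparedStatement ps = con.prepareStatement(
                     "INSERT INTO my_table (col1, col2) VALUES (?, ?)")) {       // assumed table
            String[] row;
            while ((row = csv.readNext()) != null) {
                try {
                    ps.setString(1, row[0]);
                    ps.setString(2, row[1]);
                    ps.executeUpdate();
                } catch (SQLIntegrityConstraintViolationException dup) {
                    // duplicate: the unique constraint rejected the row, so just move on
                }
            }
        }
    }
}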
Consider these steps:
Read a value from the CSV file.
Search for that value in the database and, if it is not found, insert it.
I'm guessing the way you're inserting data into the database is in a loop: you read from the CSV and insert into the DB. So why not simply do a SELECT statement to check whether the value exists, and if it does, don't insert it?
