Read multiple CSV files with CsvJdbc - java

I need to bind a group of CSV files, named in the format "YYYY-MM-DD hh:mm:ss.csv" and located in the same folder, to a single table that contains the data from all of the files.
I need to read the data from a Java EE application, so I would like to create a connection pool inside the application server. I found the CsvJdbc driver, which allows multiple files to be read as a single entity. A good starting point was this page, in the section with this paragraph:
To read several files (for example, daily log files) as a single table, set the database connection property indexedFiles. The following example demonstrates how to do this.
The example would work for me, but the problem is that my filenames have no header word before the date. The corresponding table name therefore becomes an empty string, which obviously makes it impossible to query the table.
How can I tell the driver to map the pattern to a table when the filenames have no header part?
P.S. I already tried to use hsqldb as a frontend to the csv files but it does not support multiple files.

Set up CsvJdbc to read several files as described in http://csvjdbc.sourceforge.net/doc.html and then use an empty table name in the SQL query, because your CSV filenames do not have any header before the part matched by the fileTailPattern regular expression. For example:
props.put("fileTailPattern", "(\\d+)-(\\d+)-(\\d+) (\\d+):(\\d+):(\\d+)");
props.put("fileTailParts", "Year,Month,Day,Hour,Minutes,Seconds");
...
ResultSet results = stmt.executeQuery("SELECT * FROM \"\" AS T1");
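To see why the table name ends up empty, it can help to check what is left of a filename once the tail pattern has matched. The sketch below is not CsvJdbc code; it only uses java.util.regex with the same pattern as above, and the class and method names (TailPatternDemo, tablePrefix) and the sample filenames are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TailPatternDemo {
    // Same regular expression as the fileTailPattern property in the answer above.
    static final String TAIL = "(\\d+)-(\\d+)-(\\d+) (\\d+):(\\d+):(\\d+)";

    // Hypothetical helper: returns whatever precedes the tail pattern in the
    // file name, or null if the name is not "<prefix><tail>.csv".
    static String tablePrefix(String fileName) {
        Matcher m = Pattern.compile("^(.*?)" + TAIL + "\\.csv$").matcher(fileName);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // A file name with no header word leaves an empty prefix.
        System.out.println("[" + tablePrefix("2012-05-01 10:15:30.csv") + "]");
        // A file name with a header word leaves that word as the prefix.
        System.out.println("[" + tablePrefix("log_2012-05-01 10:15:30.csv") + "]");
    }
}
```

With filenames like the ones in the question the prefix is the empty string, which is why the query has to select FROM "".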

Related

Ingest data from JDBC connections to Hive : Handling binary columns

The following diagram depicts the simplified ingestion flow we are building to ingest data from different RDBMSs into Hive.
Step 1: Using a JDBC connection to the data source, source data is streamed and saved in a CSV file on HDFS using the HDFS Java API.
Basically, we execute a 'SELECT *' query and save each row to the CSV file until the ResultSet is exhausted.
Step 2: Using the LOAD DATA INPATH command, the Hive table is populated from the CSV file created in Step 1.
We use JDBC ResultSet.getString() to get column data.
This works fine for non-binary data.
But for BLOB and CLOB type columns, we cannot write column data into a text/CSV file.
My question: is it possible to use the ORC or Avro format to handle binary columns? Do these formats support writing row by row?
(Update: We are aware of technologies such as Sqoop and NiFi; the reason for implementing our own custom ingestion flow is beyond the scope of this question.)
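For illustration only: one workaround the question does not mention is to keep the text file and Base64-encode binary values before writing them. The sketch below assumes each row arrives as an Object[] (for example built from ResultSet.getObject() or getBytes()); the class and method names are hypothetical, and how the Hive side decodes and parses the file is left out.

```java
import java.util.Base64;

class CsvRowEncoder {
    // Hypothetical helper: renders one row as a CSV line.
    // byte[] values are Base64-encoded so they stay printable text.
    static String toCsvLine(Object[] row) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < row.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            Object value = row[i];
            if (value == null) {
                continue; // NULL is written as an empty field
            }
            if (value instanceof byte[]) {
                // The Base64 alphabet has no commas, quotes or newlines.
                sb.append(Base64.getEncoder().encodeToString((byte[]) value));
            } else {
                sb.append(quote(value.toString()));
            }
        }
        return sb.toString();
    }

    // Quotes a text field only when it contains a delimiter, quote or newline.
    static String quote(String s) {
        if (s.contains(",") || s.contains("\"") || s.contains("\n")) {
            return "\"" + s.replace("\"", "\"\"") + "\"";
        }
        return s;
    }

    public static void main(String[] args) {
        Object[] row = {1, "a,b", new byte[] {0, 1, 2}};
        System.out.println(toCsvLine(row));
    }
}
```

Whether the quoting is understood on load depends on how the Hive table is defined, so treat the quoting rule here as an assumption too.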

Reading file content to Mysql stored procedure

I have a scenario where I have a file of the form:
id,class,type
1,234,gg
2,235,kk
3,236,hth
4,237,rgg
5,238,rgr
I also have a table, PROPS, in my database of the form:
id,class,property
1,7735,abc
2,3454,efg
3,235,hij
4,238,klm
5,24343,xyx
Now the first file and the DB table are joined on class, so that the final output will be of the form:
id,class,type,property
1,235,kk,hij
2,238,rgr,klm
I could search the DB table once for each class record in the first file, and so forth.
But this would take too much time.
Is there any way to do the same thing through a MySQL stored procedure?
My question is whether there is a way to read the first file's content line by line (WITHOUT MAKING USE OF A TEMPORARY TABLE), check each class against the class in the DB table, write the result into an output file, and return the output file, all using a MySQL stored procedure.
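To make the intended result concrete, here is the same join written as a small in-memory lookup. This is plain Java, not a stored procedure, so it does not answer the question as asked; it only shows the join on class using the sample data above. It assumes the PROPS rows have already been loaded into a map keyed by class, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ClassJoinSketch {
    // fileLines: rows of the form "id,class,type" (no header line).
    // propertyByClass: class -> property, assumed to be loaded from PROPS once.
    static List<String> join(List<String> fileLines, Map<String, String> propertyByClass) {
        List<String> out = new ArrayList<>();
        int id = 1; // output rows are renumbered, as in the expected result
        for (String line : fileLines) {
            String[] fields = line.split(",");
            String property = propertyByClass.get(fields[1]);
            if (property != null) {
                out.add(id + "," + fields[1] + "," + fields[2] + "," + property);
                id++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("7735", "abc");
        props.put("3454", "efg");
        props.put("235", "hij");
        props.put("238", "klm");
        props.put("24343", "xyx");
        List<String> file = Arrays.asList(
                "1,234,gg", "2,235,kk", "3,236,hth", "4,237,rgg", "5,238,rgr");
        for (String row : join(file, props)) {
            System.out.println(row);
        }
    }
}
```

One map lookup per file line replaces one database search per line, which is the cost the question is worried about.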

How to test/verify database table loaded by jdbc against data file?

Hi, I am creating a table from a schema file and loading it from a data file through JDBC. I am doing a batch upload using PreparedStatement and executeBatch. The data file contents have the following structure:
key time rowid stream
X 11:40 1 A
Y 3:30 2 B
I am able to load the table into the database successfully. But I would like to test/verify the table loaded into the database against this same data file. How do I do it? How do I compare the table in the database with the data file? I am new to JDBC. Please guide. Thanks in advance.
Like Loki said, you can use a tool like DBUnit. Another option is to make a rudimentary integration test whereby your test generates a dump file of your table and compares this dump with the original "good" file.
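A rough sketch of the dump-and-compare idea, without DBUnit: parse the data file into rows, read the table back into rows, and report what is missing. Only the comparison is shown here; reading the table via JDBC is assumed to produce the second list, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TableFileDiff {
    // Parses whitespace-separated data lines, skipping the header line.
    static List<List<String>> parse(List<String> lines) {
        List<List<String>> rows = new ArrayList<>();
        for (int i = 1; i < lines.size(); i++) {
            rows.add(Arrays.asList(lines.get(i).trim().split("\\s+")));
        }
        return rows;
    }

    // Returns the rows of 'expected' that have no counterpart in 'actual'.
    // Duplicates are respected: each actual row cancels one expected row.
    static List<List<String>> missing(List<List<String>> expected, List<List<String>> actual) {
        List<List<String>> rest = new ArrayList<>(expected);
        for (List<String> row : actual) {
            rest.remove(row);
        }
        return rest;
    }

    public static void main(String[] args) {
        List<List<String>> fromFile = parse(Arrays.asList(
                "key time rowid stream", "X 11:40 1 A", "Y 3:30 2 B"));
        // In a real test this list would come from a SELECT on the loaded table.
        List<List<String>> fromTable = new ArrayList<>();
        fromTable.add(Arrays.asList("X", "11:40", "1", "A"));
        System.out.println("Missing in table: " + missing(fromFile, fromTable));
    }
}
```

Calling missing() in both directions (file against table, then table against file) also catches rows that exist only in the table. Values such as times may come back from the database in a different textual form, so they may need normalizing before the comparison.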
You need DbUnit. Check more details here: http://dbunit.sourceforge.net/howto.html
DbUnit helps you write test cases against data from a database.

insert Xml file data into MySQL table

I want to insert data from an XML file into a MySQL table, choosing which columns to insert into, using Java. How can this be done?
It really depends on the format of your XML file. If your XML file is a direct export from MySQL, please refer to this question.
If your XML is in some other format, then I would probably use JAXB to parse the XML into POJOs, then write some logic to map the POJOs to the database table.
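As a rough illustration of the "parse, then map the chosen columns" step: the sketch below uses the JDK's built-in DOM parser instead of JAXB (JAXB is a separate dependency on recent JDKs), and it assumes a made-up XML layout where each row is an element whose child elements are the fields. The class, method, table and tag names are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmlColumnPicker {
    // Builds a parameterized INSERT for the chosen columns.
    // Table and column names must come from a trusted list, not from user input.
    static String insertSql(String table, List<String> columns) {
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", columns)
                + ") VALUES (" + placeholders + ")";
    }

    // Extracts the chosen child elements of every <rowTag> element.
    // A missing child element becomes null.
    static List<List<String>> extract(String xml, String rowTag, List<String> columns) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList rows = doc.getElementsByTagName(rowTag);
            List<List<String>> result = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                Element row = (Element) rows.item(i);
                List<String> values = new ArrayList<>();
                for (String column : columns) {
                    NodeList cells = row.getElementsByTagName(column);
                    values.add(cells.getLength() > 0 ? cells.item(0).getTextContent() : null);
                }
                result.add(values);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

Each extracted row can then be bound to a PreparedStatement created from insertSql(...) with setString and executed, or added to a batch.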

data parsing from a file into java and then into a mysql database

I have a .data file in the format given above. I am writing a program in Java that will take the values from the .data file and put them in a buffer. My Java program is connected to MySQL (on Windows) via JDBC. So I need to read the values from the file in the above format and put them in the buffer like
Insert Into building values ("--", "---",----)
In this way, I store these values and JDBC will populate the database tables in MySQL (on Windows). Please tell me the best way.
Check out the answers to this question for reading file lines and splitting them into chunks. I know the question says Groovy: but most answers are Java. Then insert the values you retrieved via JDBC.
Actually, since your data file is obviously CSV, you could also use a CSV library like OpenCSV to read the values.
The data is in CSV format, so use a CSV library to parse the file and then just add some JDBC code to insert this into database.
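Or just call the MySQL CSV import command (mysqlimport) from Java:
try {
    // Run mysqlimport; replace the placeholders with your own options and files
    String command = "mysqlimport [options] db_name textfile1 [textfile2 ...]";
    Process child = Runtime.getRuntime().exec(command);
    // Wait for the import to finish; a non-zero exit code means it failed
    int exitCode = child.waitFor();
} catch (IOException | InterruptedException e) {
    // Do not swallow the error silently
    e.printStackTrace();
}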
This is the fourth question for the same task... If your data file is well formatted like in the example you provided, then you don't have to split the line into values:
Source: "AAH196","Austin","TX","Virginia Beach","VA"
Target: INSERT INTO BUILDING VALUES("AAH196","Austin","TX","Virginia Beach","VA");
<=> "INSERT INTO BUILDING VALUES(" + Source + ");"
Just take a complete row from your CSV file and concatenate it into a SQL statement.
(See my answer to question 1 of 4. BTW, if SQL INJECTION is a potential problem, splitting a line into values is not a solution either.)
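On the SQL injection point: splitting alone does not help, but binding the split values to a PreparedStatement does, because the values are never pasted into the SQL text. A minimal sketch follows, assuming the quoted, comma-separated line format shown above and an already open Connection; the class and method names are hypothetical, and the parser is deliberately simple (it does not handle escaped quotes inside a field).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class BuildingInsert {
    // Splits a line such as "AAH196","Austin","TX" into unquoted values.
    // Commas inside double quotes are kept as part of the value.
    static List<String> parseLine(String line) {
        List<String> values = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                values.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        values.add(current.toString());
        return values;
    }

    // Builds "INSERT INTO BUILDING VALUES (?, ?, ...)" with n placeholders.
    static String placeholderSql(int n) {
        StringBuilder sb = new StringBuilder("INSERT INTO BUILDING VALUES (");
        for (int i = 0; i < n; i++) {
            sb.append(i == 0 ? "?" : ", ?");
        }
        return sb.append(")").toString();
    }

    // Inserts one line; the values are bound as parameters, not concatenated.
    static void insert(Connection conn, String line) throws SQLException {
        List<String> values = parseLine(line);
        try (PreparedStatement ps = conn.prepareStatement(placeholderSql(values.size()))) {
            for (int i = 0; i < values.size(); i++) {
                ps.setString(i + 1, values.get(i));
            }
            ps.executeUpdate();
        }
    }
}
```

For a large file, preparing the statement once and using addBatch() and executeBatch() would avoid one round trip per row.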
You can bind your CSV to Java beans using opencsv.
http://opencsv.sourceforge.net/
You can make these beans persistent using an ORM framework like Hibernate or Cayenne, or with JPA, which are based on annotations and map your fields to tables easily without writing any SQL statements.
This would be a perfect job for Groovy. Here's a gist with a small skeleton script to build upon.
