My file has the following format:
Table1; Info
rec_x11;rec_x21;rec_x31;rec_x41
rec_x12;rec_x22;rec_x32;rec_x42
...
\n
Table2; Info
rec_x11;rec_x21;rec_x31;rec_x41
rec_x12;rec_x22;rec_x32;rec_x42
...
\n
Table3; Info
rec_x11;rec_x21;rec_x31;rec_x41
rec_x12;rec_x22;rec_x32;rec_x42
...
Each batch of records, starting on the line after the TableX header and ending at the empty-line delimiter, is about 700-800 lines long.
Each such batch of lines (rec_xyz...) needs to be imported into the MyISAM table whose name is given in the header of the batch (TableX).
I am familiar with the option of piping the stream into the LOAD DATA command using shell commands.
I am interested in a simple Java code snippet that parses this file and executes LOAD DATA for a single batch of records each time (in a for loop, perhaps using a seek command).
For now I am trying to use IGNORE LINES to skip over already-processed records, but I don't know whether there is an option to ignore lines from BELOW.
Is there a more efficient way to parse and load this type of file into the DB?
EDIT
I have read that the MySQL JDBC driver supports supplying an input stream to LOAD DATA starting from 5.1.3. Can I use it to iterate over the file with an input stream and change the LOAD DATA statement each time?
I am attaching my code as a solution.
This solution is based on the additional functionality (setLocalInfileInputStream) added in MySQL Connector/J 5.1.3 and later.
I am piping an input stream into the LOAD DATA statement instead of using a direct file URL.
Additional info: I am using BoneCP as a connection pool.
public final void readFile(final String path)
        throws IOException, SQLException, InterruptedException {
    File file = new File(path);
    final Connection connection = getSqlDataSource().getConnection();
    Statement statement = SqlDataSource.getInternalStatement(connection.createStatement());
    try {
        Scanner fileScanner = new Scanner(file);
        // Split the file on empty lines, so each token is one table batch.
        fileScanner.useDelimiter(Pattern.compile("^$", Pattern.MULTILINE));
        while (fileScanner.hasNext()) {
            // Skip blank lines until the "TableX; Info" header line is reached.
            String line;
            while ((line = fileScanner.nextLine()).isEmpty());
            // The rest of the batch (up to the next empty line) becomes the stream to load.
            InputStream is = new ByteArrayInputStream(fileScanner.next().getBytes("UTF-8"));
            String[] tableName = line.split(getSeparator());
            setTable((tableName[0] + "_" + tableName[1]).replace('-', '_'));
            String sql = "LOAD DATA LOCAL INFILE '" + SingleCsvImportBean.getOsDependantFileName(file) + "' "
                    + "INTO TABLE " + SqlUtils.escape(getTable())
                    + " FIELDS TERMINATED BY '" + getSeparator()
                    + "' ESCAPED BY '' LINES TERMINATED BY '" + getLinefeed() + "' ";
            sql += "(" + implodeStringArray(getFields(), ", ") + ")";
            sql += getSetClause();
            // Feed the batch to LOAD DATA from the in-memory stream instead of the file name.
            ((com.mysql.jdbc.Statement) statement).setLocalInfileInputStream(is);
            statement.execute(sql);
        }
    } finally {
        statement.close();
        connection.close();
    }
}
Related
I would like to insert a file that has no extension. It is a text file without a .txt extension. This is my code:
public boolean setData(List<String> data) {
    SABConnection connection = new SABConnection();
    boolean bool = false;
    try {
        PreparedStatement ps = connection.connectToSAB().prepareStatement("INSERT INTO AS400.ZXMTR03 VALUES (?)");
        if (!data.isEmpty()) {
            for (String file : data) {
                File fi = new File(file);
                FileInputStream fis = new FileInputStream(file);
                ps.setAsciiStream(1, fis);
                int done = ps.executeUpdate();
                if (done > 0) {
                    System.out.println("File: " + fi.getName() + " Inserted successfully");
                } else {
                    System.out.println("Insertion of File: " + fi.getName() + " failed");
                }
            }
            bool = true;
        } else {
            System.out.println("Le repertoire est vide");
        }
    } catch (Exception e) {
        System.out.println("error caused by: " + e.getMessage());
    }
    return bool;
}
I keep getting a data truncation error.
PS:
The file ZXMTR03 doesn't have columns.
To insert such a thing manually into the AS400 I write this statement: insert into ZXMTR03 (select * from n.niama/ZXMTR02) and it works. When I write insert into ZXMTR03 values ('12345') it also works.
I'm using the JT400 library.
You can't insert a stream file into a database table like that.
Assuming your text file has EOL indicators, you'd need to split it into rows and insert one row at a time, or insert some distinct number of rows at a time using a multi-row insert.
Also, you're wrong in thinking ZXMTR03 doesn't have columns; every DB table on the IBM i has at least one column.
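For the row-at-a-time idea, a rough, hypothetical sketch might look like the following (it assumes the question's connection object ultimately yields a plain java.sql.Connection and that ZXMTR03 is effectively a single character column wide enough to hold one line):

import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.PreparedStatement;

public static void insertFileAsRows(Connection connection, String path) throws Exception {
    // One row per line of the stream file, flushed in multi-row batches.
    try (BufferedReader reader = new BufferedReader(new FileReader(path));
         PreparedStatement ps = connection.prepareStatement("INSERT INTO AS400.ZXMTR03 VALUES (?)")) {
        String line;
        int count = 0;
        while ((line = reader.readLine()) != null) {
            ps.setString(1, line);
            ps.addBatch();
            if (++count % 500 == 0) {
                ps.executeBatch();      // multi-row insert every 500 lines
            }
        }
        ps.executeBatch();              // flush any remaining rows
    }
}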
Alternatively, you could copy the text file to the Integrated File System (IFS), which supports stream files, using FTP, SMB, NFS, etc., or even the JT400 IFS file classes, and then make use of the Copy From Import File (CPYFRMIMPF) command or perhaps the IFS Copy (CPY) command. If you are on a current version of the OS, you might want to check out the QSYS2 IFS_READ() table functions.
I have a Java program that creates an SQL script that will later be imported as a file into MySQL. The Java program cannot directly access the MySQL database but has to generate an SQL file with all the insert commands to be ingested by MySQL. Without getting into too many details, we can ignore any security concerns because the data is used once and then the database is deleted.
The Java code does something like this:
String sql = "INSERT INTO myTable (column1, column2) VALUES (1, 'hello world');";
BufferedWriter bwr = new BufferedWriter(new FileWriter(new File("output.sql")));
bwr.write(sql);
// then flushes the stream, etc.
The issue I have is that I now need to include a byte[] array as the third column. Therefore I want to do:
byte[] data = getDataFromSomewhere();
// Convert byte[] to String and replace all single quote with an escape character for the import
String dataString = new String(data, StandardCharsets.UTF_8).replaceAll("'", "\\\\'");
String sql = "INSERT INTO myTable (column1, column2, column3) VALUES (1, 'hello world', '" + dataString + "');";
BufferedWriter bwr = new BufferedWriter(new FileWriter(new File("output.sql")));
bwr.write(sql);
// then flushes the stream, etc.
The problem is that on the other computer when I load the file I get the following error:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''????' at line 1
The core of the code to load the SQL file is simply:
try (Stream<String> stream = Files.lines(Paths.get(IMPORT_SQL_FILE))) {
    stream.forEach(line -> {
        try {
            // Yes it could be batched but I omitted it for simplicity
            executeStatement(connection, line);
        } catch (Exception e) {
            e.printStackTrace();
            System.exit(-1);
        }
    });
}
And if I load the file directly in MySQL I get the following error message:
1 row(s) affected, 1 warning(s): 1300 Invalid utf8 character string: 'F18E91'
Therefore my question is how can I generate an SQL file with binary data from Java?
Inline your BLOB data into a hexadecimal literal:
StringBuilder sb = new StringBuilder(data.length * 2);
for (byte b : data) {
    sb.append(String.format("%02x", b));
}
String sql = "INSERT INTO myTable (column1, column2, column3) "
        + "VALUES (1, 'hello world', x'" + sb.toString() + "');";
I have a 20 GB text file that I would like to read and store into a database. The problem is that when I try to load it, the program is terminated before it can print anything that would show what it is doing, and it seems like this might be due to the size of the file. If anyone has suggestions on how to read this file efficiently, please show me.
From another post, Read large files in Java:
First, if your file contains binary data, then using BufferedReader would be a big mistake (because you would be converting the data to String, which is unnecessary and could easily corrupt the data); you should use a BufferedInputStream instead. If it's text data and you need to split it along linebreaks, then using BufferedReader is OK (assuming the file contains lines of a sensible length).
Regarding memory, there shouldn't be any problem if you use a decently sized buffer (I'd use at least 1MB to make sure the HD is doing mostly sequential reading and writing).
If speed turns out to be a problem, you could have a look at the java.nio packages - those are supposedly faster than java.io.
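A minimal sketch of line-by-line reading with a generous buffer (the path and buffer size here are just illustrative):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public static void processHugeFile(String path) throws IOException {
    int bufferSize = 1024 * 1024; // 1 MB buffer keeps the disk reading mostly sequentially
    try (BufferedReader reader = new BufferedReader(new FileReader(path), bufferSize)) {
        String line;
        while ((line = reader.readLine()) != null) {
            // handle one line at a time; the whole 20 GB file is never in memory
        }
    }
}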
As for loading it into a database, make sure you make use of some sort of bulk loading API, otherwise it would take forever.
Here is an example of a bulk loading routine I use for Netezza ...
private static final void executeBulkLoad(
        Connection connection,
        String schema,
        String tableName,
        File file,
        String filename,
        String encoding) throws SQLException {
    String filePath = file.getAbsolutePath();
    String logFolderPath = filePath.replace(filename, "");
    String SQLString = "INSERT INTO " + schema + "." + tableName + "\n";
    SQLString += "SELECT * FROM\n";
    SQLString += "EXTERNAL '" + filePath + "'\n";
    SQLString += "USING\n";
    SQLString += "(\n";
    SQLString += " ENCODING '" + encoding + "'\n";
    SQLString += " QUOTEDVALUE 'NO'\n";
    SQLString += " FILLRECORD 'TRUE'\n";
    SQLString += " NULLVALUE 'NULL'\n";
    SQLString += " SKIPROWS 1\n";
    SQLString += " DELIMITER '\\t'\n";
    SQLString += " LOGDIR '" + logFolderPath + "'\n";
    SQLString += " REMOTESOURCE 'JDBC'\n";
    SQLString += " CTRLCHARS 'TRUE'\n";
    SQLString += " IGNOREZERO 'TRUE'\n";
    SQLString += " ESCAPECHAR '\\'\n";
    SQLString += ");";
    Statement statement = connection.createStatement();
    statement.execute(SQLString);
    statement.close();
}
If you need to load the information into a database, you can use Spring Batch. With it you can read your file, manage transactions, run processing over the file, persist the rows into a database, and control how many records to process before each commit. I think it is a better option, because reading the large file is only the first problem; your next problem will be managing the database transactions, controlling the commits, etc. I hope it helps you. A rough sketch follows.
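This is only a minimal configuration sketch, assuming Spring Batch 4.x, a hypothetical semicolon-delimited file, a two-column target table, and illustrative names throughout; the chunk size is what controls how many records go into each commit.

import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
@EnableBatchProcessing
public class HugeFileLoadConfig {

    // Simple POJO matching the (hypothetical) two columns of the file.
    public static class Record {
        private String col1;
        private String col2;
        public String getCol1() { return col1; }
        public void setCol1(String v) { col1 = v; }
        public String getCol2() { return col2; }
        public void setCol2(String v) { col2 = v; }
    }

    @Bean
    public FlatFileItemReader<Record> reader() {
        // Streams the file record by record, so the 20 GB file is never held in memory.
        BeanWrapperFieldSetMapper<Record> mapper = new BeanWrapperFieldSetMapper<>();
        mapper.setTargetType(Record.class);
        return new FlatFileItemReaderBuilder<Record>()
                .name("hugeFileReader")
                .resource(new FileSystemResource("/data/huge-file.txt"))
                .delimited().delimiter(";")
                .names(new String[] {"col1", "col2"})
                .fieldSetMapper(mapper)
                .build();
    }

    @Bean
    public JdbcBatchItemWriter<Record> writer(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<Record>()
                .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
                .sql("INSERT INTO records (col1, col2) VALUES (:col1, :col2)")
                .dataSource(dataSource)
                .build();
    }

    @Bean
    public Step loadStep(StepBuilderFactory steps, JdbcBatchItemWriter<Record> writer) {
        return steps.get("loadStep")
                .<Record, Record>chunk(1000)   // one transaction/commit per 1000 records
                .reader(reader())
                .writer(writer)
                .build();
    }

    @Bean
    public Job loadJob(JobBuilderFactory jobs, Step loadStep) {
        return jobs.get("loadJob").start(loadStep).build();
    }
}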
If you are reading a very large file, always prefer InputStreams. For example:
BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
String line = null;
StringBuilder responseData = new StringBuilder();
while ((line = in.readLine()) != null) {
    // process line
}
MySQL documentation says:
When the LOAD DATA INFILE statement finishes, it returns an information string in the following format:
Records: 1 Deleted: 0 Skipped: 0 Warnings: 0
Now, I want to get this information string when I execute LOAD DATA INFILE through my Java code. This is needed because I have to tally the number of inserts made into the table, to make sure that we do not miss any records. Note that my Java code is all right; I just need to know how to capture this string.
The key is to use executeUpdate();
Try this
Connection con = DriverManager.getConnection("jdbc:mysql://localhost/foobar", "root", "password");
Statement stmt = con.createStatement();
String sql =
"load data infile 'c:/temp/some_data.txt' \n" +
" replace \n" +
" into table prd \n" +
" columns terminated by '\\t' \n" +
" ignore 1 lines";
int rows = stmt.executeUpdate(sql);
System.out.println("Rows: " + rows);
The following will work through JDBC. Note that to use LOAD DATA INFILE you need superuser privilege, which you don't need for LOAD DATA LOCAL INFILE.
Connection con = DriverManager.getConnection("jdbc:mysql://localhost/foobar", "root", "password");
Statement stmt = con.createStatement();
String sql =
"load data infile 'c:/temp/some_data.txt' \n" +
" replace \n" +
" into table prd \n" +
" columns terminated by '\\t' \n" +
" ignore 1 lines";
stmt.execute(sql);
I am trying to execute a BULK INSERT statement on SQL Server 2008 Express.
(It basically takes all fields in a specified file and inserts these fields into appropriate columns in a table.)
Given below is an example of the bulk insert statement--
BULK INSERT SalesHistory FROM 'c:\SalesHistoryText.txt' WITH (FIELDTERMINATOR = ',')
Given below is the Java code I am trying to use (but it's not working). Can someone tell me what I am doing wrong here, or point me to a Java code sample/tutorial that uses the BULK INSERT statement?
public void insertdata(String filename)
{
    String path = System.getProperty("user.dir");
    String createString = "BULK INSERT Assignors FROM " + path + "\\" + filename + ".txt WITH (FIELDTERMINATOR = ',')";
    try
    {
        // Load the SQLServerDriver class, build the
        // connection string, and get a connection
        Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver");
        String connectionUrl = "jdbc:sqlserver://arvind-pc\\sqlexpress;" +
                "database=test01;" +
                "user=sa;" +
                "password=password1983";
        Connection con = DriverManager.getConnection(connectionUrl);
        System.out.println("Connected.");
        // Create and execute an SQL statement that returns some data.
        String SQL = "BULK INSERT dbo.Assignor FROM " + path + "\\" + filename + ".txt WITH (FIELDTERMINATOR = ',')";
        Statement stmt = con.createStatement();
        ResultSet rs = stmt.executeQuery(SQL);
        // Iterate through the data in the result set and display it.
        while (rs.next())
        {
            //System.out.println(rs.getString(1) + " " + rs.getString(2));
            System.out.println(" Going through data");
        }
    }
    catch (Exception e)
    {
        System.out.println(e.getMessage());
        System.exit(0);
    }
}
I'd guess that your SQL string is missing the single quotes around the filename. Try the following:
String SQL = "BULK INSERT dbo.Assignor FROM '" + path + "\\" +filename+ ".txt' WITH (FIELDTERMINATOR = ',')";
EDIT in response to your comment: I wouldn't expect there to be anything in the ResultSet following a bulk insert, in much the same way that I wouldn't expect anything in a ResultSet following an ordinary INSERT statement. These statements just insert the data they are given into a table; they don't return it as well.
If you're not getting any error message, then it looks like your bulk insert is working. If you query the table in SQLCMD or SQL Server Management Studio, do you see the data?
INSERT, UPDATE, DELETE and BULK INSERT statements are not queries, so you shouldn't be using them with the executeQuery() method. executeQuery() is only intended for running SELECT queries. I recommend using the executeUpdate(String) method instead. This method returns an int, which is normally the number of rows inserted/updated/deleted.
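Putting both answers together, a corrected sketch (same connection, path, and file naming as in the question) might look like this:

String sql = "BULK INSERT dbo.Assignor FROM '" + path + "\\" + filename + ".txt' "
        + "WITH (FIELDTERMINATOR = ',')";
Statement stmt = con.createStatement();
int rows = stmt.executeUpdate(sql);   // normally reports the number of rows inserted
System.out.println("Inserted rows: " + rows);
stmt.close();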