How to handle a read transaction in Java to provide consistency?

I am developing a client in Java. It communicates with the server via actions. Actions are social-like actions (an example of an action is a user viewing the profile of another user).
With the View Profile example above, the client executes 4 queries to get the data from the database server. To provide consistency, I want to put the 4 queries in a transaction. So in my View Profile function, I first call conn.setAutoCommit(false), then run the queries, and at the end, before returning, I set auto commit back to true with conn.setAutoCommit(true) (see the code snippet below).
try {
    // set auto commit to false to manually handle the transaction
    conn.setAutoCommit(false);
    // execute query 1
    // ...
    // execute query 2
    // ...
    // execute query 3
    // ...
    // execute query 4
    // ...
    // set auto commit back to true so other actions are not affected
    conn.setAutoCommit(true);
} catch (SQLException e) {
    e.printStackTrace(System.out);
} finally {
    try {
        conn.close();
    } catch (SQLException e) {
        e.printStackTrace(System.out);
    }
}
However, when I run the code, I sometimes notice that the data returned from this action is not consistent. When I try to combine the 4 queries into a single query, I do get consistency.
My question is: does setting autoCommit in Java really work for read transactions like in my example, when I want to issue separate queries to the DBMS? If not, how can I provide consistency if I want to query the DBMS in 4 separate queries?
FYI, the database server I use is Oracle DB.

For Oracle, selects never do dirty reads, so they are always implicitly TRANSACTION_READ_COMMITTED; each statement sees a consistent snapshot as of its own start, so simply turning off auto-commit does not give you one snapshot across all four selects. If you are ingesting data at a high rate, my guess is that the data is changing between the first and last select, so your best bet would be to combine the selects into one using 3 UNIONs.
See http://www.oracle.com/technetwork/issue-archive/2005/05-nov/o65asktom-082389.html
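To illustrate the suggestion, here is a rough sketch of folding the four profile queries into one statement with UNION ALL. The table and column names are invented placeholders, and each branch is shaped to return the same two columns so the union is valid; because it is a single statement, Oracle's statement-level read consistency guarantees it reads from one snapshot.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ViewProfileQuery {

    // Placeholder tables/columns; every branch returns (part, val) so the UNION is valid.
    private static final String COMBINED_SQL =
        "SELECT 'profile' AS part, p.display_name AS val FROM profiles p WHERE p.user_id = ? "
      + "UNION ALL SELECT 'followers', TO_CHAR(COUNT(*)) FROM followers f WHERE f.user_id = ? "
      + "UNION ALL SELECT 'posts', TO_CHAR(COUNT(*)) FROM posts po WHERE po.user_id = ? "
      + "UNION ALL SELECT 'friends', TO_CHAR(COUNT(*)) FROM friends fr WHERE fr.user_id = ?";

    public static void viewProfile(Connection conn, long userId) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(COMBINED_SQL)) {
            for (int i = 1; i <= 4; i++) {
                ps.setLong(i, userId);
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // dispatch on rs.getString("part") and read rs.getString("val")
                }
            }
        }
    }
}

If the queries really have to stay separate, Oracle also lets you start the transaction with SET TRANSACTION READ ONLY (or ISOLATION LEVEL SERIALIZABLE), which makes every query in that transaction read from the same snapshot.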

Related

How to proceed with deleting information on two databases

I need to delete items from two databases - one internal, managed by my team, and another managed by some other team (they hold different, but related, data). The constraint is that if either of these deletes fails, the entire operation should be cancelled and rolled back.
Now, I can control and access my own database easily, but not the database managed by the other team. My line of thought is as follows:
1) Delete from my database first (if it fails, abort everything straightaway).
2) Assuming step 1 succeeds, call the API from the other team to delete the data on their side as well.
3) If step 2 succeeds, all is good... if it fails, I'll roll back the delete on my database from step 1.
In order to achieve step 3, I think I will have to save the data in step 1 in some variables within the function. Roughly speaking...
public void deleteData(String id) {
    // MyEntity stands in for whatever type getEntity(id) returns
    Optional<MyEntity> entityToBeDeleted = getEntity(id);
    try {
        deleteFromMyDB(id);
    } catch (Exception e) {
        throw e;
    }
    try {
        deleteFromOtherDB(id);
    } catch (Exception e) {
        persistInMyDB(entityToBeDeleted);
        throw e;
    }
}
Now I am aware that the above code looks horrible. Can any guru give me some advice on how to do this better?
What does it mean if the remote deletion fails? That the deletion should not happen at all?
Can the local deletion fail for a non-transient reason?
A possible solution (sketched in code after this list) is:
Create a "pending deletions" table in your database which will contain the keys of records you want to delete.
When you need to delete record, insert a row in this table.
Then delete the record from the remote system.
If this succeeds, delete the "pending deletion" record and the local record, preferably in a single transaction.
Whenever you start your system, check the "pending deletion" table and delete any records mentioned there from the local and remote systems (I assume that both these operations are idempotent). Then delete the "pending deletion" record.
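A minimal JDBC sketch of that pattern, under some assumptions: a pending_deletions table and a my_table local table (both names invented), and a hypothetical RemoteClient wrapper around the other team's delete API. The startup recovery pass that re-drives leftover pending rows is omitted.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class TwoSystemDeleter {

    private final Connection conn;           // connection to the local DB
    private final RemoteClient remoteClient; // hypothetical wrapper around the other team's API

    public TwoSystemDeleter(Connection conn, RemoteClient remoteClient) {
        this.conn = conn;
        this.remoteClient = remoteClient;
    }

    public void delete(String id) throws SQLException {
        // 1. Record the intent to delete (committed immediately, auto-commit is on).
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO pending_deletions (record_id) VALUES (?)")) {
            ps.setString(1, id);
            ps.executeUpdate();
        }

        // 2. Delete on the remote system (must be idempotent so it can be retried).
        remoteClient.delete(id);

        // 3. Delete locally and clear the pending marker in a single transaction.
        conn.setAutoCommit(false);
        try (PreparedStatement delLocal =
                 conn.prepareStatement("DELETE FROM my_table WHERE id = ?");
             PreparedStatement delPending =
                 conn.prepareStatement("DELETE FROM pending_deletions WHERE record_id = ?")) {
            delLocal.setString(1, id);
            delLocal.executeUpdate();
            delPending.setString(1, id);
            delPending.executeUpdate();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(true);
        }
    }

    // Hypothetical interface over the other team's delete API.
    public interface RemoteClient {
        void delete(String id);
    }
}

Because the pending row is committed before the remote call, a crash between steps still leaves enough information to finish or undo the deletion on restart.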

Java : Cancel failed insert using Hibernate

I'm looking for a way to cancel a failed insert, using Hibernate.
Context: I've got a program which has to format and then transfer data from a source database to a destination Oracle database. Since I've got a lot of data to process, I want to be able to insert in bulks (e.g. bulks of 100 rows). The thing is, sometimes an insert can fail because of a bad format (typically, trying to insert a 50-character string into a field that can only take up to 32). I could bypass the problem by checking whether the row is valid before trying to insert it, but I'm looking for another way to do it.
I tried to do something like this :
List<MyDataObject> dataList = processData();
HibernateUtils myUtils = HibernateUtils.getInstance();
myUtils.openTransaction(); // opens the transaction so it is not automatically committed after every insert
int i = 0;
for (MyDataObject data : dataList) {
    myUtils.setSavepoint(); // creates a savepoint
    try {
        myUtils.insertData(data); // does not commit, but persists the data object into the DB
        myUtils.flush();
    } catch (RuntimeException e) {
        myUtils.rollbackSavepoint(); // rolls back to the savepoint I created right before inserting the last element
        myUtils.commitTransaction();
        i = 0;
        continue;
    }
    if (++i == 100) {
        myUtils.commitTransaction();
        i = 0;
    }
}
myUtils.closeTransaction();
However, it doesn't work: the unflushed, failed insert is not rolled back even though I rolled back to the savepoint I created before inserting, probably because it was never actually flushed in the first place (flushing throws an error because of the bad format).
My savepoint rollback is working: if I throw a "fake" RuntimeException after inserting some element, that last element won't be in the database.
How can I bypass the problem? (I'd like a way to discard the unflushed SQL instructions while keeping the flushed ones in the transaction.)
Thank you in advance for any help
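One alternative sometimes used for this kind of load (not from the original post) is Hibernate's StatelessSession, which executes each INSERT immediately instead of queuing it in the first-level cache, so there is no unflushed statement left behind when a row fails. A rough sketch, reusing MyDataObject from the question and assuming the destination database behaves like Oracle, where a failed statement does not invalidate the rest of the transaction:

import java.util.List;

import org.hibernate.SessionFactory;
import org.hibernate.StatelessSession;
import org.hibernate.Transaction;

public class BulkInserter {

    // Inserts rows in chunks of 100; a row that fails (e.g. value too long) is simply skipped.
    public static void insertAll(SessionFactory sessionFactory, List<MyDataObject> dataList) {
        StatelessSession session = sessionFactory.openStatelessSession();
        Transaction tx = session.beginTransaction();
        int i = 0;
        try {
            for (MyDataObject data : dataList) {
                try {
                    session.insert(data); // executes the INSERT immediately, nothing is queued
                } catch (RuntimeException e) {
                    continue;             // only this row is lost; earlier rows stay in the transaction
                }
                if (++i == 100) {
                    tx.commit();
                    tx = session.beginTransaction();
                    i = 0;
                }
            }
            tx.commit();
        } finally {
            session.close();
        }
    }
}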

Most efficient multithreading Database Insert in Java

We have to read a lot of data from an HDD (~50 GB) into our database, but our multithreaded procedure is pretty slow (~2 h for ~10 GB) because of a thread lock inside org.sqlite.core.NativeDB.reset[native] (see thread sampler).
We read our data relatively fast and use our insert method to execute a prepared statement. But only once we have collected about 500,000 datasets do we commit all these statements to our database. Currently we use JDBC as the interface to our SQLite database.
Everything works fine so far if you use one thread in total. But if you want to use multiple threads, you do not see much of a performance/speed increase, because only one thread can run at a time, not in parallel.
We already reuse our prepared statement, and all threads use one instance of our Database class to prevent file locks (there is one connection to the database).
Unfortunately we have no clue how to improve our insert method any further. Is anyone able to give us some tips/solutions, or a way to avoid this NativeDB.reset method?
We do not have to use SQLite, but we would like to use Java.
(Threads are named 1,2,...,15)
private String INSERT = "INSERT INTO urls (url) VALUES (?);";

public void insert(String urlFromFile) {
    try {
        preparedStatement.setString(1, urlFromFile);
        preparedStatement.executeUpdate();
    } catch (SQLException e) {
        e.printStackTrace();
    }
}
Updated insert method as suggested by @Andreas, but it is still throwing some exceptions:
public void insert(String urlFromFile) {
    try {
        preparedStatement.setString(1, urlFromFile);
        preparedStatement.addBatch();
        ++callCounter;
        if (callCounter % 500000 == 0 && callCounter > 0) {
            preparedStatement.executeBatch();
            commit();
            System.out.println("Exec");
        }
    } catch (SQLException e) {
        e.printStackTrace();
    }
}
java.lang.ArrayIndexOutOfBoundsException: 9
at org.sqlite.core.CorePreparedStatement.batch(CorePreparedStatement.java:121)
at org.sqlite.jdbc3.JDBC3PreparedStatement.setString(JDBC3PreparedStatement.java:421)
at UrlDatabase.insert(UrlDatabase.java:85)
Most databases have some sort of bulk insert functionality, though there's no standard for it, AFAIK.
PostgreSQL has COPY and MySQL has LOAD DATA, for instance.
I don't think that SQLite has this facility, though - it might be worth switching to a database that does.
SQLite has no write concurrency.
The fastest way to load a large amount of data is to use a single thread (and a single transaction) to insert everything into the DB (and not to use WAL).
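A minimal sketch of that single-writer approach, assuming the Xerial sqlite-jdbc driver on the classpath and the urls table from the question (the database file name is a placeholder): one thread, one connection, one transaction, with the JDBC batch flushed periodically and a single commit at the end.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class SingleWriterLoader {

    public static void load(List<String> urls) throws SQLException {
        // One writer thread, one connection, one transaction for the whole load.
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:urls.db")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps =
                     conn.prepareStatement("INSERT INTO urls (url) VALUES (?)")) {
                int pending = 0;
                for (String url : urls) {
                    ps.setString(1, url);
                    ps.addBatch();
                    if (++pending == 500_000) { // flush the JDBC batch periodically
                        ps.executeBatch();
                        pending = 0;
                    }
                }
                ps.executeBatch();              // remaining rows
                conn.commit();                  // single commit at the end
            } catch (SQLException e) {
                conn.rollback();
                throw e;
            }
        }
    }
}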

Hibernate Batch Processing Using Native SQL

I have an application using Hibernate. One of its modules calls native SQL (a stored procedure) in a batch process. Roughly, what it does is that every time it writes a file it updates a field in the database. Right now I am not sure how many files will need to be written, as that depends on the number of transactions per day, so it could be anywhere from zero to a million.
If I use this code snippet in while loop will I have any problems?
@Transactional
public void test()
{
    // The for loop represents a list of records that needs to be processed.
    for (int i = 0; i < 1000000; i++)
    {
        // Process the records and write the information into a file.
        ...
        // Update a field(s) in the database using a stored procedure based on the processed information.
        updateField(String.valueOf(i));
    }
}

@Transactional(propagation = Propagation.MANDATORY)
public void updateField(String value)
{
    Session session = getSession();
    SQLQuery sqlQuery = session.createSQLQuery("exec spUpdate :value");
    sqlQuery.setParameter("value", value);
    sqlQuery.executeUpdate();
}
Will I need any other configurations for my data source and transaction manager?
Will I need to set hibernate.jdbc.batch_size and hibernate.cache.use_second_level_cache?
Will I need to use session flush and clear for this? The samples in the Hibernate tutorial use POJOs and not native SQL, so I am not sure whether they are also applicable.
Please note that another part of the application is already using Hibernate, so as much as possible I would like to stick with Hibernate.
Thank you for your time, and I am hoping for a quick response. If possible, a code snippet would really be useful for me.
Application Work Flow
1) Query Database for the transaction information. (Transaction date, Type of account, currency, etc..)
2) For each account process transaction information. (Discounts, Current Balance, etc..)
3) Write the transaction information and processed information to a file.
4) Update a database field based on the process information
5) Go back to step 2 while there are still accounts. (Assuming that no exceptions are thrown)
The code snippet will open and close the session for each iteration, which is definitely not good practice.
Is it possible for you to have a job which checks how many new files have been added to the folder?
The job could run, say, every 15/25 minutes, check how many files were changed/added in the last 15/25 minutes, and update the database in a batch (a sketch follows below).
Something like that will lower the number of session open/close cycles and should be much faster than this.
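A rough sketch of that periodic job, assuming a Spring-managed Hibernate SessionFactory and @EnableScheduling; the stored-procedure call mirrors the question, while findRecentlyProcessedIds() and the 15-minute schedule are hypothetical placeholders.

import java.util.Collections;
import java.util.List;

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class FieldUpdateJob {

    @Autowired
    private SessionFactory sessionFactory;

    @Scheduled(fixedDelay = 15 * 60 * 1000) // run every 15 minutes
    @Transactional
    public void updateProcessedFiles() {
        Session session = sessionFactory.getCurrentSession();
        int i = 0;
        for (String id : findRecentlyProcessedIds()) {
            // same stored-procedure call as in the question
            session.createSQLQuery("exec spUpdate :value")
                   .setParameter("value", id)
                   .executeUpdate();
            if (++i % 50 == 0) {
                session.flush(); // keep memory bounded on large batches
                session.clear();
            }
        }
    }

    // Hypothetical: collect the ids of files written since the last run.
    private List<String> findRecentlyProcessedIds() {
        return Collections.emptyList();
    }
}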

Sybase JConnect: ENABLE_BULK_LOAD usage

Can anyone out there provide an example of bulk inserts via JConnect (with ENABLE_BULK_LOAD) to Sybase ASE?
I've scoured the internet and found nothing.
I got in touch with one of the engineers at Sybase and they provided me a code sample. So, I get to answer my own question.
Basically, here is a rundown, as the code sample is pretty large... This assumes a lot of pre-initialized variables, but otherwise it would be a few hundred lines. Anyone interested should get the idea. This can yield up to 22K insertions a second in a perfect world (as per Sybase, anyway).
SybDriver sybDriver = (SybDriver) Class.forName("com.sybase.jdbc3.jdbc.SybDriver").newInstance();
sybDriver.setVersion(com.sybase.jdbcx.SybDriver.VERSION_6);
DriverManager.registerDriver(sybDriver);

// DB props (after including normal login/password etc.)
props.put("ENABLE_BULK_LOAD", "true");

// open connection here for sybDriver
dbConn.setAutoCommit(false);

String SQLString = "insert into batch_inserts (row_id, colname1, colname2)\n values (?,?,?) \n";

PreparedStatement pstmt;
try
{
    pstmt = dbConn.prepareStatement(SQLString);
}
catch (SQLException sqle)
{
    displaySQLEx("Couldn't prepare statement", sqle);
    return;
}

for (String[] val : valuesToInsert)
{
    pstmt.setString(1, val[0]); // row_id varchar(30)
    pstmt.setString(2, val[1]); // logical_server varchar(30)
    pstmt.setString(3, val[2]); // client_host varchar(30)
    try
    {
        pstmt.addBatch();
    }
    catch (SQLException sqle)
    {
        displaySQLEx("Failed to build batch", sqle);
        break;
    }
}

try {
    pstmt.executeBatch();
    dbConn.commit();
    pstmt.close();
} catch (SQLException sqle) {
    // handle
}

try {
    if (dbConn != null)
        dbConn.close();
} catch (Exception e) {
    // handle
}
After following most of your advice, we didn't see any improvement over simply creating a massive string and sending that across in batches of ~100-1000 rows with a surrounding transaction. We got around:
Big String Method [5000 rows in 500 batches]: 1716 ms = ~2914 rows per second.
(This is terrible!)
Our db is sitting on a virtual host with one CPU (i7 underneath) and the table schema is:
CREATE TABLE
archive_account_transactions
(
account_transaction_id INT,
entered_by INT,
account_id INT,
transaction_type_id INT,
DATE DATETIME,
product_id INT,
amount float,
contract_id INT NULL,
note CHAR(255) NULL
)
with four indexes on account_transaction_id (pk), account_id, DATE, contract_id.
Just thought I would post a few comments. First, we're connecting using:
jdbc:sybase:Tds:40.1.1.2:5000/ikp?EnableBatchWorkaround=true;ENABLE_BULK_LOAD=true
We did also try the .addBatch syntax described above, but it was marginally slower than just using a Java StringBuilder to build the batch SQL manually and pushing it across in one execute statement. Removing the column names in the insert statement gave us a surprisingly large performance boost; it seemed to be the only thing that actually affected performance. The ENABLE_BULK_LOAD param didn't seem to affect it at all, nor did EnableBatchWorkaround; we also tried DYNAMIC_PREPARE=false, which sounded promising but also didn't seem to do anything.
Any help getting these parameters actually functioning would be great! In other words, are there any tests we could run to verify that they are in effect? I'm still convinced that this performance isn't close to pushing the boundaries of Sybase, as MySQL out of the box does more like 16,000 rows per second using the same "big string method" with the same schema.
Cheers
Rod
In order to get the sample provided by Chris Kannon working, do not forget to disable auto commit mode first:
dbConn.setAutoCommit(false);
And place the following line before dbConn.commit():
pstmt.executeBatch();
Otherwise this technique will only slow down the insertion.
I don't know how to do this in Java, but you can bulk-load text files with the LOAD TABLE SQL statement. We did it with Sybase ASA over jConnect.
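For what it's worth, a rough sketch of issuing LOAD TABLE over jConnect: the host, database, table, file path, and delimiter are placeholders, the exact LOAD TABLE options depend on the ASA version, and note that the file path is resolved on the database server, not on the client.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class LoadTableExample {
    public static void main(String[] args) throws Exception {
        Class.forName("com.sybase.jdbc3.jdbc.SybDriver"); // register the jConnect driver
        String url = "jdbc:sybase:Tds:dbhost:2638/mydb";  // placeholder host/port/db
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement()) {
            // the server reads the file, so the path must exist on the DB host
            stmt.executeUpdate("LOAD TABLE batch_inserts FROM '/data/batch_inserts.csv' DELIMITED BY ','");
        }
    }
}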
Support for Batch Updates
Batch updates allow a Statement object to submit multiple update commands
as one unit (batch) to an underlying database for processing together.
Note: To use batch updates, you must refresh the SQL scripts in the sp directory
under your jConnect installation directory.
See BatchUpdates.java in the sample (jConnect 4.x) and sample2 (jConnect
5.x) subdirectories for an example of using batch updates with Statement,
PreparedStatement, and CallableStatement.
jConnect also supports dynamic PreparedStatements in batch.
Reference:
http://download.sybase.com/pdfdocs/jcg0420e/prjdbc.pdf
http://manuals.sybase.com/onlinebooks/group-jcarc/jcg0520e/prjdbc/#ebt-link;hf=0;pt=7694?target=%25N%14_4440_START_RESTART_N%25#X
Other Batch Update Resources
http://java.sun.com/j2se/1.3/docs/guide/jdbc/spec2/jdbc2.1.frame6.html
http://www.jguru.com/faq/view.jsp?EID=5079
