PostgreSQL committed data not visible - Java

I noticed weird behavior in my application. It looks like committed data is not visible right after the commit. The algorithm looks like this:
connection1 - insert into table row with id = 5
connection1 - commit, close
connection2 - open
connection2 - select from table row with id = 5 (no results)
connection2 - insert into table row with id = 5 (PRIMARY KEY VIOLATION, result is in db)
If the select on connection2 returns no results then I do an insert; otherwise it is an update.
The server has many databases (~200). It looks like the commit succeeds, but the changes show up in the DB later. I use Java and JDBC. Any ideas would be appreciated.

This behavior corresponds to the REPEATABLE READ isolation mode, see SET TRANSACTION:
REPEATABLE READ
All statements of the current transaction can only see rows committed before the first query or data-modification statement was executed in this transaction.
Try connection.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED) to see if it makes a difference.
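A minimal sketch of that suggestion (my illustration, not the original code; the DataSource ds, the table name mytable and the helper method are placeholders):
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// Open connection2 in READ COMMITTED mode so it sees the row that
// connection1 committed a moment earlier, then decide INSERT vs UPDATE.
static boolean rowExists(DataSource ds, int id) throws SQLException {
    try (Connection connection2 = ds.getConnection()) {
        connection2.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
        try (PreparedStatement ps =
                 connection2.prepareStatement("SELECT 1 FROM mytable WHERE id = ?")) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next(); // true -> UPDATE the row, false -> INSERT it
            }
        }
    }
}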

Related

Spring JDBC Template batchUpdate to update thousands of records in a table

I have an update query which I am trying to execute through the batchUpdate method of Spring JDBC template. This update query can potentially match thousands of rows in the EVENT_DYNAMIC_ATTRIBUTE table that need to be updated. Will updating thousands of rows in a table cause any issue in the production database apart from a timeout? For example, will it crash the database or slow down the entire database engine for other connections, etc.?
Is there a better way to achieve this instead of firing a single update query through Spring JDBC template or JPA? I have the following settings for the JDBC template:
this.jdbc = new JdbcTemplate(ds);
jdbc.setFetchSize(1000);
jdbc.setQueryTimeout(0); // zero means there is no limit
The update query:
UPDATE EVENT_DYNAMIC_ATTRIBUTE eda
SET eda.ATTRIBUTE_VALUE = 'claim',
eda.LAST_UPDATED_DATE = SYSDATE,
eda.LAST_UPDATED_BY = 'superUsers'
WHERE eda.DYNAMIC_ATTRIBUTE_NAME_ID = 4002
AND eda.EVENT_ID IN
(WITH category_data
AS ( SELECT c.CATEGORY_ID
FROM CATEGORY c
START WITH CATEGORY_ID = 495984
CONNECT BY PARENT_ID = PRIOR CATEGORY_ID)
SELECT event_id
FROM event e
WHERE EXISTS
(SELECT 't'
FROM category_data cd
WHERE cd.CATEGORY_ID = e.PRIMARY_CATEGORY_ID))
If it is a one-time thing, I normally first select the records that need to be updated into a temporary table or a CSV, and I make sure to save the primary keys of those records. Then I read the records in batches from the temporary table or CSV and do the update using the primary key. This way the table is not locked for a long time, each batch contains a fixed set of records that need the update, and the updates are done by primary key, so they are very fast. And if any update fails, you know which records failed by logging the failed primary keys to a log file or an error table. I have followed this approach many times for updating millions of records in the PROD database, as it is a very safe approach.
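A rough sketch of that batching approach with Spring's JdbcTemplate (my illustration, not the poster's code; the class name and the primary key column EVENT_DYNAMIC_ATTRIBUTE_ID are assumptions):
import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;

public class BatchedAttributeUpdate {

    // Update by primary key; table and value columns follow the question,
    // but EVENT_DYNAMIC_ATTRIBUTE_ID is an assumed primary key column name.
    private static final String UPDATE_SQL =
        "UPDATE EVENT_DYNAMIC_ATTRIBUTE "
      + "SET ATTRIBUTE_VALUE = 'claim', LAST_UPDATED_DATE = SYSDATE, LAST_UPDATED_BY = 'superUsers' "
      + "WHERE EVENT_DYNAMIC_ATTRIBUTE_ID = ?";

    private final JdbcTemplate jdbc;

    public BatchedAttributeUpdate(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    /** Applies the update to the previously selected primary keys in batches of batchSize rows. */
    public void update(List<Long> idsToUpdate, int batchSize) {
        jdbc.batchUpdate(UPDATE_SQL, idsToUpdate, batchSize,
            (ps, id) -> ps.setLong(1, id)); // one parameterized statement per key
    }
}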

JDBC : Batch insert not inserting value to database

I have to execute multiple insert queries using JDBC, for which I am trying to execute a batch statement. Everything works fine in my code, but when I try to see the values in the table, the table is empty.
Here is the code :
SessionImpl sessionImpl = (SessionImpl) getSessionFactory().openSession();
Connection conn = (Connection) sessionImpl.connection();
Statement statement = (Statement) conn.createStatement();
for (String query : queries) {
    statement.addBatch(query);
}
statement.executeBatch();
statement.close();
conn.close();
And the List<String> queries contains insert queries like:
insert into demo values (null,'Sharmzad','10006','http://demo.com','3 Results','some values','$44.00','10006P2','No Ratings','No Reviews','Egypt','Duration: 8 hours','tour','Day Cruises');
And the table structure is like:
create table demo ( ID INTEGER PRIMARY KEY AUTO_INCREMENT,supplierName varchar(200),supplierId varchar(200),supplierUrl varchar(200),totalActivities varchar(200),activityName varchar(200),activityPrice varchar(200),tourCode varchar(200),starRating varchar(200),totalReviews varchar(200),geography varchar(200),duration varchar(200),category varchar(200),subCategory varchar(200));
No exception is thrown anywhere but no value is inserted. Can someone explain?
Most JDBC drivers default to auto-commit, but some of them do not. If you don't know, you should either call .setAutoCommit(true) before executing the statements or call .commit() after them.
Could be a transaction issue. Perhaps you're not committing your transaction? If so, then it is normal not to see anything in the database.
You can check whether this is the case by running a client in READ_UNCOMMITTED transaction mode right after .executeBatch() (but before close()) and seeing whether any rows are there.
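A minimal sketch of that check (my illustration; jdbcUrl, user and password are placeholders, the table name demo comes from the question, and the usual java.sql imports are assumed):
try (Connection checkConn = DriverManager.getConnection(jdbcUrl, user, password)) {
    // Read uncommitted data so rows still buffered in the writer's open transaction are visible.
    checkConn.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);
    try (Statement check = checkConn.createStatement();
         ResultSet rs = check.executeQuery("SELECT COUNT(*) FROM demo")) {
        rs.next();
        System.out.println("rows visible to READ UNCOMMITTED: " + rs.getInt(1));
    }
}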
You shouldn't assign a value to ID; instead, list all the other column names explicitly:
insert into demo
(
supplierName
,supplierId
,supplierUrl
,totalActivities
,activityName
,activityPrice
,tourCode
,starRating
,totalReviews
,geography
,duration
,category
,subCategory
)
values (
'Sharmzad'
,'10006'
,'http://demo.com'
,'3 Results'
,'some values'
,'$44.00'
,'10006P2'
,'No Ratings'
,'No Reviews'
,'Egypt'
,'Duration: 8 hours'
,'tour'
,'Day Cruises'
);
And add a commit to your code.
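Putting both suggestions together, a sketch adapted from the question's code (it may not match the poster's real surrounding method): disable auto-commit, run the batch, and commit explicitly before closing.
Connection conn = (Connection) sessionImpl.connection();
conn.setAutoCommit(false);
Statement statement = conn.createStatement();
try {
    for (String query : queries) {
        statement.addBatch(query);
    }
    statement.executeBatch();
    conn.commit();       // without an explicit commit the inserts may never become visible
} catch (SQLException e) {
    conn.rollback();     // undo the partial batch on failure
    throw e;
} finally {
    statement.close();
    conn.close();
}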

neo4j insert using jdbc but cannot query immediately within the same connection

Question background:
1. The database is Neo4j 2.3.1, accessed via the JDBC driver;
2. The DB connection is initialized as a class member; it defaults to auto-commit (not changed).
To avoid inserting duplicates, I query before each insert. After the program stopped, I found duplicates. Why?
code:
String query = "CREATE (n:LABEL {name:'jack'})";
System.out.println(query);
Statement stmt = dbConnection.createStatement();
stmt.executeUpdate(query);
stmt.close();
Use MERGE + unique constraints instead
How do you "check"?
You would have to check in the same tx and also take a write lock
After debugging I found that for neo4j-jdbc (v2.1.4) the default connection transaction isolation level is TRANSACTION_NONE. I then set it to TRANSACTION_READ_COMMITTED and the above issue disappeared. So I think that TRANSACTION_READ_COMMITTED forces the previous insert to be committed, though this is not the recommended way. For isolation levels refer to: Difference between read commit and repeatable read
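As suggested above, a more robust fix than querying before inserting is MERGE together with a uniqueness constraint; a minimal sketch over the same JDBC connection (the label and property follow the question, and the constraint statement is an assumption to be run once, separately):
// Run once, e.g. as a setup step (Neo4j 2.x Cypher syntax):
//   CREATE CONSTRAINT ON (n:LABEL) ASSERT n.name IS UNIQUE
// MERGE then matches the existing node or creates it atomically,
// so no separate "query before insert" step is needed.
String merge = "MERGE (n:LABEL {name:'jack'})";
Statement stmt = dbConnection.createStatement();
stmt.executeUpdate(merge);
stmt.close();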

PSQL JDBC Transactions Cause Deadlock

Question:
Updated:
Why does inserting a row into table A with a foreign key constraint to table B and then updating the row in table B that the inserted row in table A references in a transaction cause a deadlock?
Scenario:
reservation.time_slot_id has a foreign key constraint to time_slot.id.
When a reservation is made the following SQL is run:
BEGIN TRANSACTION
INSERT INTO reservations (..., time_slot_id) VALUES (..., $timeSlotID)
UPDATE time_slot SET num_reservations = 5 WHERE id = $timeSlotID
COMMIT
I am load testing my server with about 100 concurrent users, each making a reservation for the same time slot (same $timeSlotID for each user).
If I don't use a transaction (remove cn.setAutoCommit(false);, cn.commit(), etc.) this problem does not occur.
Environment:
PostgreSQL 9.2.4
Tomcat v7.0
JDK 1.7.0_40
commons-dbcp-1.4.jar
commons-pool-1.6.jar
postgresql-9.2-1002.jdbc4.jar
Code:
// endpoint start
// there are some other SELECT ... LEFT JOIN ... WHERE ... queries up here but they don't seem to be related
...

// create a reservation in the time slot then increment the count
cn.setAutoCommit(false);
try
{
    st = cn.prepareStatement("INSERT INTO reservation (time_slot_id, email, created_timestamp) VALUES (?, ?, ?)");
    st.setInt   (1, timeSlotID); // timeSlotID is the same for every user
    st.setString(2, email);
    st.setInt   (3, currentTimestamp);
    st.executeUpdate();
    st.close();

    st = cn.prepareStatement("UPDATE time_slot SET num_reservations = 5 WHERE id = ?"); // set to 5 instead of incrementing for testing
    st.setInt(1, timeSlotID); // timeSlotID is the same for every user
    st.executeUpdate();
    st.close();

    cn.commit();
}
catch (SQLException e)
{
    cn.rollback();
    ...
}
finally
{
    cn.setAutoCommit(true);
}
...
// endpoint end
PSQL Error:
ERROR: deadlock detected
DETAIL: Process 27776 waits for ExclusiveLock on tuple (2,179) of relation 49817 of database 49772; blocked by process 27795.
Process 27795 waits for ShareLock on transaction 3962; blocked by process 27777.
Process 27777 waits for ExclusiveLock on tuple (2,179) of relation 49817 of database 49772; blocked by process 27776.
Process 27776: UPDATE time_slot SET num_reservations = 5 WHERE id = $1
Process 27795: UPDATE time_slot SET num_reservations = 5 WHERE id = $1
Process 27777: UPDATE time_slot SET num_reservations = 5 WHERE id = $1
HINT: See server log for query details.
STATEMENT: UPDATE time_slot SET num_reservations = 5 WHERE id = $1
How the foreign key can cause a deadlock (in PostgreSQL 9.2 and below).
Let's say there is a child table referencing a parent table:
CREATE TABLE time_slot(
id int primary key,
num_reservations int
);
CREATE TABLE reservation(
time_slot_id int,
created_timestamp timestamp,
CONSTRAINT time_slot_fk FOREIGN KEY (time_slot_id)
REFERENCES time_slot( id )
);
INSERT INTO time_slot values( 1, 0 );
INSERT INTO time_slot values( 2, 0 );
Suppose that the FK column in the child table is modified in session one, which fires an ordinary insert statement (to test this behavior, open one session in the SQL Shell (psql) and turn auto-commit off, or start the transaction with a BEGIN statement):
BEGIN;
INSERT INTO reservation VALUES( 2, now() );
When the FK column in the child table is modified, the DBMS has to look up the parent table to ensure the existence of the parent record.
If the inserted value doesn't exist in the referenced (parent) table, the DBMS aborts the transaction and reports an error.
If the value does exist, the record is inserted into the child table, but the DBMS has to ensure transaction integrity: no other transaction can delete or modify the referenced record in the parent table until the transaction ends (until the INSERT into the child table is committed).
PostgreSQL 9.2 (and below) ensures database integrity in such a case by placing a shared read lock on the record in the parent table. The shared read lock doesn't prevent readers from reading the locked record, but it does prevent writers from modifying it.
OK, now we have a new record in the child table inserted by session 1 (with a write lock placed on it by session 1), and a shared read lock placed on record 2 in the parent table. The transaction is not yet committed.
Suppose that session 2 starts the same transaction, referencing the same record in the parent table:
BEGIN;
INSERT INTO reservation VALUES( 2, now() );
The query executes fine, without any errors: it inserts a new record into the child table and also places a shared read lock on record 2 in the parent table. Shared locks don't conflict; many transactions can lock a record in shared read mode and don't have to wait for each other (only write locks conflict).
Now (a few milliseconds later) session 1 fires, as part of the same transaction, this command:
UPDATE time_slot
SET num_reservations = num_reservations + 1
WHERE id = 2;
In PostgreSQL 9.2 the above command "hangs", waiting for the shared lock placed by session 2.
And now suppose that the same command, a few milliseconds later, is run in session 2:
UPDATE time_slot
SET num_reservations = num_reservations + 1
WHERE id = 2;
This command is supposed to "hang" and should wait for a write lock placed on the record by UPDATE from session 1.
But the result is:
ERROR: deadlock detected
DETAIL: Process 5604 waits for ExclusiveLock on tuple (0,2) of relation 41363 of database 16393; blocked by process 3816.
Process 3816 waits for ShareLock on transaction 1036; blocked by process 5604.
HINT: See server log for query details.
(the server message above is translated from Polish)
the UPDATE command from session 2 is trying to place a write lock on record 2, which is locked by session 1
session 1 is trying to place a write lock on the same record, which is locked (in shared mode) by session 2
----> ...... deadlock.
The deadlock can be prevented by placing a write lock on the parent record up front using SELECT FOR UPDATE.
The above test case will not cause the deadlock in PostgreSQL 9.3 (try it); locking behaviour in such cases was improved in 9.3.
------------ EDIT - additional questions -------------------
why does the insert statement not release the lock after it is done? Or does it remain for the entire transaction which is why not using a transaction does not cause a deadlock?
All statements that modify data within the transaction (insert, update, delete) place locks on modified records. These locks remain active until the transaction ends - by issuing commit or rollback.
Because auto-commit is turned off on the JDBC connection, successive SQL commands are automatically grouped into one transaction.
The explanation is here:
http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#setAutoCommit%28boolean%29
If a connection is in auto-commit mode, then all its SQL statements will be executed and committed as individual transactions. Otherwise, its SQL statements are grouped into transactions that are terminated by a call to either the method commit or the method rollback.
How does the SELECT FOR UPDATE prevent the deadlock?
SELECT FOR UPDATE places a write lock on the record. It is the first command in the whole transaction, so the lock is placed at the beginning. When another transaction starts (in another session), it also executes SELECT FOR UPDATE, trying to lock the same record. Write locks conflict - two transactions cannot lock the same record at the same time - therefore the SELECT FOR UPDATE of the second transaction blocks and waits until the first transaction releases the lock (by issuing commit or rollback); effectively the second transaction waits until the whole first transaction ends.
In the first scenario, the INSERT statement places two locks:
- a write lock on the inserted record in the reservation table
- and a read shared lock on the record in the time_slot table referenced by the foreign key constraint
Shared read locks don't conflict: two or more transactions can lock the same record in shared mode and continue execution; they don't have to wait for each other. But later, when the UPDATE is issued within the same transaction and tries to place a write lock on the record already locked in shared mode, this causes the deadlock.
Would placing the increment first also prevent the deadlock?
Yes, you are right. This prevents the deadlock, because a write lock is placed on the record at the beginning of the transaction. Another transaction also tries to update the same record at the beginning, and has to wait at this point because the record is already locked (in write mode) by another session.
While I still don't understand it, I added:
SELECT * FROM time_slot WHERE id = ? FOR UPDATE
as the first statement in the transaction. This seems to have solved my problem as I no longer get a deadlock.
I would still love for someone to give a proper answer and explain this to me.
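For completeness, a sketch of where that statement sits in the transaction (adapted from the question's code, not verified against the real endpoint): locking the time_slot row first serializes the competing transactions, so the shared FK lock is never upgraded to a conflicting write lock later.
cn.setAutoCommit(false);
try
{
    // acquire the row (write) lock on the time slot up front
    st = cn.prepareStatement("SELECT * FROM time_slot WHERE id = ? FOR UPDATE");
    st.setInt(1, timeSlotID);
    st.executeQuery();
    st.close();

    // ... INSERT INTO reservation ... and UPDATE time_slot ... exactly as before ...

    cn.commit();
}
catch (SQLException e)
{
    cn.rollback();
}
finally
{
    cn.setAutoCommit(true);
}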

hibernate batch insert - how does flush work?

I need to insert a lot of data into a database using Hibernate. I was looking at batch inserts in Hibernate; what I am using is similar to the example in the manual:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();
But I see that flush doesn't write the data to the database.
Reading about it, if the code is inside a transaction then nothing will be committed to the database until the transaction performs a commit.
So what is the need for flush/clear? It seems useless: if the data is not written to the database then it is still in memory.
How can I force Hibernate to write data to the database?
Thanks
The data is sent to the database and is not in memory anymore. It's just not made definitively persistent until the transaction commits. It's exactly the same as if you executed the following sequence of statements in any database tool:
begin;
insert into ...
insert into ...
insert into ...
// here, three inserts have been done on the database. But they will only be made
// definitively persistent at commit time
...
commit;
The flush consists in executing the insert statements.
The commit consists in executing the commit statement.
The data will be written to the database, but according to the transaction isolation level you will not see it (from other transactions) until the transaction is committed.
Use an SQL statement logger that prints the statements sent over the database connection; then you will see that the statements are sent to the database.
For best performance you also have to commit transactions. Flushing and clearing the session clears Hibernate's caches, but the data just moves to the JDBC connection's buffers and is still uncommitted (different RDBMS / drivers show different behaviour); you are only shifting the problem to another place without a real improvement in performance.
Having flush() at the location mentioned also saves you memory, as your session will be cleared regularly. Otherwise you will have 100000 objects in memory and might run out of memory for larger counts. Check out this article.
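A sketch of one common variant of the manual's loop that also commits in chunks (my illustration of the advice above, not the poster's code; the chunk size of 1000 is arbitrary, and it assumes hibernate.jdbc.batch_size is set, e.g. to 20, so the driver actually batches the inserts):
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 1; i <= 100000; i++) {
    session.save(new Customer(/* ... */));
    if (i % 20 == 0) {
        session.flush();   // push the pending INSERTs to the JDBC driver
        session.clear();   // detach them so they can be garbage collected
    }
    if (i % 1000 == 0) {   // make each chunk durable and keep the transaction small
        tx.commit();
        tx = session.beginTransaction();
    }
}
tx.commit();
session.close();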
