I have a program that extracts words from a file and inserts those words into a MySQL table.
The program works fine: it commits the transaction after all the words from the file are inserted into the table. If anything goes wrong during the transaction, nothing is inserted into the table, since autoCommit is set to false.
Since the records become permanent in the table once a transaction is committed, I was wondering: is there a way to undo such transactions? And if there are tons of different transactions, how do I manage to undo them?
If a commit succeeds then it's done and complete; you can't roll it back at that point, so the commit should not be issued until you are sure everything succeeded.
Edit:
To be more clear: con.rollback() should be used within a catch block, so that it runs if con.commit() fails.
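A minimal sketch of that pattern, assuming a plain JDBC connection named con (the "words" table, its column, and the helper names are made up for illustration):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class WordInserter {

    // Hypothetical names: the "words" table and its "word" column are placeholders.
    static void insertWords(String url, String user, String pass, List<String> words) throws SQLException {
        try (Connection con = DriverManager.getConnection(url, user, pass)) {
            con.setAutoCommit(false);
            try (PreparedStatement ps = con.prepareStatement("INSERT INTO words (word) VALUES (?)")) {
                for (String w : words) {
                    ps.setString(1, w);
                    ps.addBatch();
                }
                ps.executeBatch();
                con.commit();      // once this succeeds, the rows are permanent
            } catch (SQLException e) {
                con.rollback();    // undoes everything since the last commit
                throw e;
            }
        }
    }
}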
I have been struggling with an architectural problem.
I have a table in a DB2 v9.7 database into which I need to insert ~250000 rows, with 13 columns each, in a single transaction. I specifically need this data to be inserted as one unit of work.
A simple INSERT INTO with executeBatch gives me:
The transaction log for the database is full. SQL Code: -964, SQL State: 57011
I don't have rights to change the size of transaction log. So I need to resolve this problem on the developer's side.
My second thought was to use a savepoint before all the inserts, but then I found out that savepoints only work within the current transaction, so they don't help me.
Any ideas?
You want to perform a large insert as a single transaction, but you don't have enough log space for such a transaction and no permission to increase it.
This means you need to break up your insert into multiple database transactions and manage the higher-level commit or rollback on the application side. There is nothing in the driver, either JDBC or CLI, to help with that, so you will have to write custom code that records all committed rows and manually deletes them if you need to roll back.
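A rough sketch of what that custom code could look like with plain JDBC (MY_TABLE, its columns, and the chunk size are made-up placeholders); each chunk is committed separately, and the already-committed rows are deleted by key if a later chunk fails:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class ChunkedInsert {

    // Sketch only: MY_TABLE, ID and VAL are hypothetical names; the chunk size is arbitrary.
    static void insertInChunks(Connection con, List<Long> ids, List<String> vals) throws SQLException {
        final int CHUNK = 5000;
        int committed = 0;                                  // number of rows already committed
        con.setAutoCommit(false);
        try (PreparedStatement ins = con.prepareStatement(
                "INSERT INTO MY_TABLE (ID, VAL) VALUES (?, ?)")) {
            for (int i = 0; i < ids.size(); i++) {
                ins.setLong(1, ids.get(i));
                ins.setString(2, vals.get(i));
                ins.addBatch();
                if ((i + 1) % CHUNK == 0 || i == ids.size() - 1) {
                    ins.executeBatch();
                    con.commit();                           // small transaction keeps log usage low
                    committed = i + 1;
                }
            }
        } catch (SQLException e) {
            con.rollback();                                 // drop the current, uncommitted chunk
            undoCommitted(con, ids.subList(0, committed));  // manual "rollback" of committed chunks
            throw e;
        }
    }

    static void undoCommitted(Connection con, List<Long> committedIds) throws SQLException {
        try (PreparedStatement del = con.prepareStatement("DELETE FROM MY_TABLE WHERE ID = ?")) {
            for (Long id : committedIds) {
                del.setLong(1, id);
                del.addBatch();
            }
            del.executeBatch();
            con.commit();
        }
    }
}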
Another alternative might be to use the LOAD command by means of the ADMIN_CMD() system stored procedure. LOAD requires less log space. However, for this to work you will need to write rows that you want to insert into a file on the database server or to a shared filesystem or drive accessible from the server.
Hi, you can use the EXPORT/LOAD commands to export/import large tables; this should be very fast. The LOAD command should not be using the transaction log. You may have a problem if your user has no privilege to write files on the server filesystem.
call SYSPROC.ADMIN_CMD('EXPORT TO /export/location/file.txt OF DEL MODIFIED BY COLDEL0x09 DECPT, select * from some_table ' )
call SYSPROC.ADMIN_CMD('LOAD FROM /export/location/file.txt OF DEL MODIFIED BY COLDEL0x09 DECPT, KEEPBLANKS INSERT INTO other_table COPY NO');
My application parses a CSV file, about 100-200 records per file, performs database CRUD operations, and commits them all at the end.
public static void main(String[] args) {
    Transaction t = null;               // declared outside try so the catch block can see it
    try {
        List<Row> rows = parseCSV();
        t = openHibernateTransaction();
        // doCrudStuff INSERTS some records in the database
        for (Row r : rows)
            doCrudStuff(r);
        t.commit();
    } catch (Exception ex) {
        // log error
        if (t != null) t.rollback();
    }
}
When I was about to doCrudStuff on the 78th Row, I suddenly got this error:
Data truncation: Data too long for column 'SOME_COLUMN_UNRELATED_TO_78TH_ROW' at row 1.
I read the stack trace and the error was triggered by a SELECT statement to a table unrelated to the 78th row. Huh, weird right?
I checked the CSV file and found that on the 77th row, some field was indeed too long for the database column. But Hibernate didn't catch the error during the INSERT of the 77th row; it only threw the error when I was doing a SELECT for the 78th row. Why is it delayed?
Does Hibernate really behave like this? I commit only once at the very end because I want to make sure that everything succeeded; otherwise, I roll back.
It's actually not that weird once you take into account what Hibernate is doing behind the scenes for you.
Hibernate does not actually execute your write statements (update, insert) until it needs to. In your case I assume your "doCrudStuff" executes a select and then an update or insert, right?
This is what is happening:
You tell hibernate to execute "UPDATE my_table SET something = value;" which causes hibernate to cache this in the session and return right away.
You may do more writes, which Hibernate will likely continue to cache in the session until either 1) you manually flush the session or 2) Hibernate decides it's time to flush the session.
You then execute a SELECT statement to get some data from the database. At this point, the state of the database is not consistent with the state of the session since there is data waiting to be written. Hibernate will then start executing your writes to catch up the database state to the session state.
If one of the writes fails, the stack trace will not point to the exact place where you asked Hibernate to execute the write (this is an important distinction between an ORM and using JDBC directly); instead, it will fail at the point where the session had to be flushed (either manually or automatically).
At the expense of performance, you can always tell hibernate to flush your session after your writes. But as long as you are aware of the lifecycle of the hibernate session and how it caches those queries, you should be able to more easily debug these.
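For instance, a sketch under the assumption that the Hibernate session used by doCrudStuff is in scope as session:

// Flush after each write so a bad row fails at the INSERT/UPDATE that caused it,
// not at some later SELECT. This costs extra round trips to the database.
for (Row r : rows) {
    doCrudStuff(r);      // queues the insert/update in the Hibernate session
    session.flush();     // push the pending SQL to the database right now
    session.clear();     // optional: evict flushed entities to keep the session small
}
t.commit();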
By the way, if you want to see this in practice, you can tell Hibernate to log the queries it executes.
Hope this helps!
EDIT: I understand how this can be confusing, let me try to augment my answer by highlighting the difference between a Transaction and a Hibernate Session.
A transaction is a sequence of atomic operations performed on the database. Until a transaction is committed, it is typically not visible to other clients of the database. The state of the transaction is fully managed by the database - i.e. you can start a transaction and send your operations to the database, and it will ensure consistency of these operations within the transaction.
A Hibernate Session is a session managed by Hibernate, outside the database, mostly for performance reasons. Hibernate will queue operations whenever possible to improve performance, and only go to the database when it deems necessary.
Imagine you have 50 marbles that are all different colors and need to be stored in their correct buckets, but the buckets are 100 feet away and you need someone to sort the marbles into their rightful buckets. You ask your friend Bob to store the blue marbles, then the red marbles, then the green marbles. Your friend is smart and anticipates that you will ask him to make multiple round trips, so he waits until your last request to walk those 100 feet and store them in their proper buckets, which is much faster than making 3 round trips.
Now imagine that you ask him to store the yellow marbles, and then you ask him how many total marbles you have across all the buckets. He is then forced to go to the buckets (since he needs to gather information) and store the yellow marbles (so he can accurately count all buckets) before he can give you an answer. This is in essence what Hibernate is doing with your data.
Now, in your case, imagine there is NO yellow bucket. Bob unfortunately is not going to find that out until he tries to answer your query about how many total marbles you have - so in the sequence of events, he will come back and tell you he couldn't complete your request only after he tries to count the marbles (as opposed to when you asked him to store the yellow ones, which is what he was actually unable to do).
Hope this helps clear things a little bit!
We are using Spring and Hibernate for a web application.
The application has a shopping cart where the user can place items. In order to keep the items visible between different logins, the item values in the shopping cart are stored in tables. When the shopping cart is submitted, the items are saved into a different table, where we need to generate the order number.
When we insert the values into that table, we get the order number by taking the max order number and adding 1 to it. We are using the Spring transaction manager and Hibernate; in the code flow we get the order number and update the Hibernate object to hold the order number value. When I debug, I notice that the order number entity bean is inserted only when the complete transaction is committed.
The issue is that when two requests are submitted to the server at the same time, the same order number is used for both, and only one request's data gets inserted; the other request's value, which should again be unique, cannot be inserted.
The order number column in the table is unique.
I noticed when debugging that the persistence layer does not insert into the database even after issuing a session flush:
session.flush()
It just updates memory and inserts the data into the database only at the end of the Spring transaction. I tried explicitly issuing a commit on the transaction:
session.getTransaction().commit();
This inserted the values into the database immediately, but further along in the code flow it displayed a message that the transaction could not be started.
Any help is highly appreciated.
Added:
I am using an Oracle database.
There is a sequence number which is unique for that table, and the order number maps to it.
Follow these steps:
1) Create a service method with propagation REQUIRES_NEW in a different service class.
2) Move your code (whatever code you want to flush to the db) into this new method.
3) Call this method from the existing API. (Because of proxying in Spring, we have to call this new service method from a different class, otherwise REQUIRES_NEW will not work; this is what makes sure your data gets flushed.)
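A minimal sketch of those steps with annotation-driven Spring transactions; the class name, the ORDER_SEQ sequence, and the query are illustrative assumptions, not your actual code:

import org.hibernate.SessionFactory;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

// Sketch only: OrderNumberService and ORDER_SEQ are hypothetical names.
@Service
public class OrderNumberService {

    private final SessionFactory sessionFactory;

    public OrderNumberService(SessionFactory sessionFactory) {
        this.sessionFactory = sessionFactory;
    }

    // Runs in its own transaction, so the allocated number is committed immediately,
    // independently of the caller's ongoing transaction.
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public long allocateOrderNumber() {
        Number next = (Number) sessionFactory.getCurrentSession()
                .createSQLQuery("SELECT ORDER_SEQ.NEXTVAL FROM dual")
                .uniqueResult();
        return next.longValue();
    }
}

The existing service then injects this bean (a different class, so the Spring proxy is honoured) and calls allocateOrderNumber() instead of doing the max + 1 query inline.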
I would set the order number with a database trigger that runs in the same transaction as the shopping cart insert.
After you save the shopping cart, to see the updated order count, you'll have to call:
session.refresh(cart);
The count shouldn't be managed by Hibernate (insertable/updatable = false, or @Transient).
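A sketch of such a mapping, assuming a Cart entity whose order number is filled in by the database trigger (entity, table, and column names are illustrative):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;
import javax.persistence.Table;

@Entity
@Table(name = "SHOPPING_CART")
public class Cart {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "cart_seq")
    @SequenceGenerator(name = "cart_seq", sequenceName = "CART_SEQ")
    private Long id;

    // Filled in by the database trigger, so Hibernate must never write it.
    @Column(name = "ORDER_NUM", insertable = false, updatable = false)
    private Long orderNumber;

    public Long getOrderNumber() {
        return orderNumber;
    }
}

After saving and flushing the cart, session.refresh(cart) re-reads the row and picks up the trigger-assigned value.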
Your first problem is that of serializing access around the number generation when multiple threads are executing the same logic. If you could use Oracle sequences, this would be taken care of automatically at the database level, as sequences are guaranteed to return unique values no matter how many times they are called. Since this now needs to be managed on the server side, you would need to use a synchronization mechanism around your number generation logic (select max and increment by one) across the transaction boundary. You can make the service method synchronized (your service class would be a Spring-managed singleton) and declare the transaction boundary around it. However, please note that this has performance implications and is usually bad for scalability.
Another option could be a variation of this: store the id to be allocated in a separate table with one column "currentVal" and use a pessimistic lock to get the next number. This way, the main table does not carry any big lock; the lock is held only on the number-generator row, and it is held until the main entity creation transaction completes. The main idea behind these techniques is to serialize access to the sequence generator and hold the lock until the main entity transaction commits. Also, delay the number generation as late as possible.
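A sketch of that variation, assuming a one-row ORDER_NUMBER table with a CURRENT_VAL column and a JPA EntityManager (all names illustrative):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name = "ORDER_NUMBER")
public class OrderNumber {

    @Id
    private Long id;                        // a single well-known row, e.g. id = 1

    @Column(name = "CURRENT_VAL")
    private long currentVal;

    public long next() { return ++currentVal; }
}

Inside the order-creation transaction you would then lock that row as late as possible and increment it; the row lock is held until the transaction commits, so a concurrent request blocks instead of reading the same value:

OrderNumber n = entityManager.find(OrderNumber.class, 1L, LockModeType.PESSIMISTIC_WRITE);
long orderNum = n.next();   // the increment is flushed and the lock released at commit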
The solution suggested by @Vlad is a good one if using triggers is fine in your design.
Regarding your question about the flush behaviour: the SQL is sent to the database at the flush call, but the data is not committed until the transaction is committed declaratively or a manual commit is called. The transaction itself can see the data it proposes to change, but other transactions cannot, depending on the transaction isolation level.
How can I specify the rollback point for a transaction in Spring?
Assuming the following scenario: I have to perform a really long insert into the db which takes quite some time (several minutes). This insert operation is wrapped in a transaction, which ensures that if a problem occurs, the transaction is aborted and the database is restored to the status preceding the beginning of the transaction.
However, this solution affects the performance of the application, since other transactions cannot access the db while the long transaction is being executed. I solved this issue by splitting the large transaction into several smaller transactions that perform the same operation. However, if one of these small transactions fails, the database rolls back only to the status preceding this last transaction. Unfortunately, this would leave the database in an incorrect state. I want that, if an error occurs in any of these smaller transactions, the database rolls back to the status before the first small transaction (i.e. exactly the same status it would roll back to if the operation were performed as a single transaction).
Do you have any suggestion how I can achieve this using Spring transactions?
You should look at http://docs.spring.io/spring/docs/4.0.3.RELEASE/javadoc-api/org/springframework/transaction/TransactionStatus.html .
It has the required functionality:
- create savepoint
- release savepoint
- rollback to savepoint
Of course, your transaction manager (and underlying JDBC driver, and DB) should support the functionality.
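A sketch of how the savepoint calls could be used with a programmatic transaction; TransactionTemplate comes from your configured transaction manager, and Runnable simply stands in for whatever performs each partial insert:

import java.util.List;

import org.springframework.transaction.TransactionStatus;
import org.springframework.transaction.support.TransactionCallbackWithoutResult;
import org.springframework.transaction.support.TransactionTemplate;

// Sketch only: assumes the transaction manager and driver support JDBC savepoints.
public class SavepointExample {

    private final TransactionTemplate txTemplate;

    public SavepointExample(TransactionTemplate txTemplate) {
        this.txTemplate = txTemplate;
    }

    public void runAll(final List<Runnable> steps) {
        txTemplate.execute(new TransactionCallbackWithoutResult() {
            @Override
            protected void doInTransactionWithoutResult(TransactionStatus status) {
                Object savepoint = status.createSavepoint();    // the rollback point
                try {
                    for (Runnable step : steps) {
                        step.run();                             // one partial insert
                    }
                    status.releaseSavepoint(savepoint);
                } catch (RuntimeException e) {
                    status.rollbackToSavepoint(savepoint);      // back to the rollback point
                    throw e;
                }
            }
        });
    }
}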
If you can use the same primary key sequence for the staging tables and the production tables, then you can move the data from staging to production in batches. When a small transaction fails, you can use the keys in the staging table to delete the corresponding rows from the production table. That way you can restore the production table to its original state.
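A rough JDBC sketch of that idea; STG_ITEMS, PROD_ITEMS and the shared ID column are hypothetical names:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class StagingMover {

    // Move one key range from staging to production in its own small transaction.
    static void moveBatch(Connection con, long fromId, long toId) throws SQLException {
        con.setAutoCommit(false);
        try (Statement st = con.createStatement()) {
            st.executeUpdate("INSERT INTO PROD_ITEMS SELECT * FROM STG_ITEMS WHERE ID BETWEEN "
                    + fromId + " AND " + toId);
            con.commit();
        }
    }

    // If any later batch fails, the keys still present in staging identify exactly
    // which rows to remove, restoring production to its original state.
    static void undoAllBatches(Connection con) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.executeUpdate("DELETE FROM PROD_ITEMS WHERE ID IN (SELECT ID FROM STG_ITEMS)");
            con.commit();
        }
    }
}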
I have to implement a requirement for a Java CRUD application where users want to keep their search results intact even if they perform actions which affect the criteria by which the returned rows are matched.
Confused? Ok. Let me give you a familiar example. In Gmail, if you do an advanced search on unread emails, you are presented with a list of matching results. Click on an entry and then go back to the search list. What happens is that you have just read that entry, but it hasn't disappeared from the original result set. Only that line has changed from bold to normal.
I need to implement the exact same behaviour, but the application is designed in such a way that any transaction is persisted first and then the UI requeries the db to stay in sync. The complexity of the application and the size of the database prevent me from simply caching the matching rows in memory and making the changes both in the db and in memory.
I'm thinking of solving the problem at the database level by creating an intermediate table in the Oracle database holding pointers to the matching records and requerying only those records to keep the UI in sync with the data. Any ideas?
In Oracle, if you open a cursor, the results of that cursor are static, regardless of whether another transaction inserts a row that would appear in your cursor, or updates or deletes a row that does exist in your cursor.
The challenge then is to not close the cursor if you want results consistent from when the cursor was opened.
If the UI maintains a single session on the database, one solution is to use Global Temporary Tables in Oracle. When you execute a search, insert the unique IDs into the GTT, then the UI just queries the GTT.
If the UI doesn't keep the session open, you could do the same thing but with an ordinary table. Then, of course, you'd just have to add some cleanup code to remove old search results from the table.
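A sketch of the temporary-table variant, assuming a GTT named SEARCH_RESULTS created with ON COMMIT PRESERVE ROWS, a searched table called ITEM, and a single long-lived connection (all names are placeholders):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

// Assumes the GTT was created once, e.g.:
//   CREATE GLOBAL TEMPORARY TABLE SEARCH_RESULTS (ID NUMBER) ON COMMIT PRESERVE ROWS;
public class SearchResultCache {

    // Remember which rows matched the search, visible only to this session.
    static void storeSearchResults(Connection con, List<Long> matchingIds) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement("INSERT INTO SEARCH_RESULTS (ID) VALUES (?)")) {
            for (Long id : matchingIds) {
                ps.setLong(1, id);
                ps.addBatch();
            }
            ps.executeBatch();
            con.commit();   // rows survive the commit because of ON COMMIT PRESERVE ROWS
        }
    }

    // The UI refreshes by joining the GTT back to the real table, so it shows
    // current column values but only for the originally matched rows.
    static void refresh(Connection con) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                     "SELECT i.* FROM ITEM i JOIN SEARCH_RESULTS r ON i.ID = r.ID")) {
            while (rs.next()) {
                System.out.println(rs.getLong("ID"));   // render the row in the UI instead
            }
        }
    }
}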
You can use a flashback query to read data from the past. For example, select * from employee as of timestamp to_timestamp('01-MAY-2011 070000', 'DD-MON-YYYY HH24MISS');
Oracle only stores this historical information for a limited period of time. You'll need to look into your retention settings: the UNDO_RETENTION parameter, the UNDO tablespace retention guarantee and proper sizing; LOBs also have their own retention setting.
Create two connections to the database.
Set the first one to READ ONLY (using SET TRANSACTION READ ONLY) and do your searching from that connection, but make sure you never end that transaction by issuing a commit or rollback.
As a read only transaction only sees the data as it was at the time the transaction started, the first connection will never see any changes to the database - not even committed ones.
Then you can do your updates in the second connection without affecting the results in the first connection.
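A sketch of the two-connection setup with plain JDBC (connection details and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class SnapshotSearch {

    // The first connection holds a read-only transaction, so it keeps seeing the data
    // as of the moment the transaction began; updates go through the second connection.
    public static void main(String[] args) throws SQLException {
        String url = "jdbc:oracle:thin:@//dbhost:1521/orcl";   // placeholder connection details
        try (Connection readCon = DriverManager.getConnection(url, "app", "secret");
             Connection writeCon = DriverManager.getConnection(url, "app", "secret")) {

            readCon.setAutoCommit(false);
            try (Statement st = readCon.createStatement()) {
                st.execute("SET TRANSACTION READ ONLY");       // the snapshot starts here
            }
            // ... run every search query on readCon; never commit or roll it back ...

            writeCon.setAutoCommit(false);
            // ... perform updates on writeCon and commit them; readCon still sees the old snapshot ...
        }
    }
}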
If you cannot use two connections, you could implement the updates through stored procedures that use autonomous transactions, then you can keep the read only transaction open in the single connection you have.