I have some JDBC code set up here.
I am basically grabbing records from one table (test.selectiontable) located at 11.11.1.111 and inserting them into test.insertiontable at 22.22.2.222. It's a thread-based process, so the thread keeps looking for and transferring records until all of them are transferred.
Since I don't want to insert duplicate records into the other table, test.selectiontable has a field called DTSStatusType_ti with an initial value of 1. After transferring the records to test.insertiontable, I therefore update it as follows:
Line # 112 SelectQueueRS.updateInt( "DTSStatusType_ti", 3 );
Line # 113 SelectQueueRS.updateString( "Queued_DialerIP_vch", "22.22.2.222" );
The initial value of Queued_DialerIP_vch in test.selectiontable is 11.11.1.111.
Although the records do get updated by the two lines above, I don't think this is an efficient way to update them. Could anyone suggest a more efficient way to update the records that also makes 100% sure these records are never duplicated? Please feel free to suggest any changes to my code. Thanks.
Instead of updating the ResultSet of the original query, you can:
Start a transaction with connMain.setAutoCommit(false)
If others can modify the data, you should use SELECT FOR UPDATE which locks the selected rows
Collect the primary key ids from the query result
Do the insert to the remote datasource
Update the original table with one update query, by the collected ids (WHERE id IN (...))
Commit the transaction with connMain.commit()
Steps 3 and 4 can be done in the same loop.
By the way you can also start a transaction on the remote datasource and do a batch insert.
You should also take care to run your transactions inside try-catch blocks and to close your resources; a rough sketch of the whole flow follows below. This may help: http://docs.oracle.com/javase/tutorial/jdbc/basics/transactions.html
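For illustration, here is a rough sketch of that flow in plain JDBC. The table and column names from the question are reused, but the primary key column QueueId_bi, the payload columns col1/col2, and the way the two connections are obtained are assumptions, so treat this as a template rather than a drop-in replacement.

import java.sql.*;
import java.util.ArrayList;
import java.util.List;

public class TransferSketch {
    // connMain points at 11.11.1.111, connRemote at 22.22.2.222 -- both assumed to be open already
    static void transferBatch(Connection connMain, Connection connRemote) throws SQLException {
        connMain.setAutoCommit(false);
        connRemote.setAutoCommit(false);
        List<Long> ids = new ArrayList<>();
        try (Statement sel = connMain.createStatement();
             // FOR UPDATE locks the selected rows until commit; QueueId_bi is an assumed PK column
             ResultSet rs = sel.executeQuery(
                     "SELECT QueueId_bi, col1, col2 FROM test.selectiontable "
                   + "WHERE DTSStatusType_ti = 1 FOR UPDATE");
             PreparedStatement ins = connRemote.prepareStatement(
                     "INSERT INTO test.insertiontable (col1, col2) VALUES (?, ?)")) {

            while (rs.next()) {                       // steps 3 and 4 in the same cycle
                ids.add(rs.getLong("QueueId_bi"));
                ins.setString(1, rs.getString("col1"));
                ins.setString(2, rs.getString("col2"));
                ins.addBatch();
            }
            ins.executeBatch();                       // batch insert on the remote datasource
            connRemote.commit();

            if (!ids.isEmpty()) {
                // one UPDATE for all transferred rows instead of updating the ResultSet row by row
                StringBuilder marks = new StringBuilder();
                for (int i = 0; i < ids.size(); i++) marks.append(i == 0 ? "?" : ",?");
                try (PreparedStatement upd = connMain.prepareStatement(
                        "UPDATE test.selectiontable SET DTSStatusType_ti = 3, "
                      + "Queued_DialerIP_vch = '22.22.2.222' WHERE QueueId_bi IN (" + marks + ")")) {
                    for (int i = 0; i < ids.size(); i++) upd.setLong(i + 1, ids.get(i));
                    upd.executeUpdate();
                }
            }
            connMain.commit();
        } catch (SQLException e) {
            connMain.rollback();
            connRemote.rollback();
            throw e;
        }
    }
}

Because the source rows stay locked until connMain.commit(), another session selecting the same rows FOR UPDATE has to wait, which is what keeps records from being picked up and transferred twice.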
I have been struggling with an architectural problem.
I have a table in a DB2 v9.7 database into which I need to insert ~250,000 rows, with 13 columns each, in a single transaction. I specifically need this data to be inserted as one unit of work.
A simple INSERT INTO with executeBatch gives me:
The transaction log for the database is full. SQL Code: -964, SQL State: 57011
I don't have the rights to change the size of the transaction log, so I need to resolve this problem on the developer's side.
My second thought was to use a savepoint before all the inserts, but then I found out that savepoints only work within the current transaction, so that doesn't help me.
Any ideas?
You want to perform a large insert as a single transaction, but you don't have enough log space for such a transaction and no permission to increase it.
This means you need to break up your insert into multiple database transactions and manage the higher-level commit or rollback on the application side. There is nothing in the driver, either JDBC or CLI, to help with that, so you will have to write custom code to record all committed rows and manually delete them if you need to roll back.
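A minimal sketch of that chunked approach, assuming a hypothetical table MY_TABLE with two placeholder columns and a chunk size small enough for the available log space; the keys of committed rows are collected so a compensating DELETE can undo a partial load if the application needs to "roll back":

import java.sql.*;
import java.util.List;

public class ChunkedInsert {
    static final int CHUNK_SIZE = 5000;   // assumption: small enough to fit in the transaction log

    // rows holds the ~250,000 rows to load; MY_TABLE, COL1 and COL2 are placeholders.
    // committedKeys is filled as chunks commit, so on failure the caller knows which rows to DELETE.
    static void insertInChunks(Connection conn, List<Object[]> rows, List<Object> committedKeys)
            throws SQLException {
        List<Object> pendingKeys = new java.util.ArrayList<>();   // current, uncommitted chunk
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO MY_TABLE (COL1, COL2) VALUES (?, ?)")) {
            int inChunk = 0;
            for (Object[] row : rows) {
                ps.setObject(1, row[0]);
                ps.setObject(2, row[1]);
                ps.addBatch();
                pendingKeys.add(row[0]);                  // assume COL1 identifies the row
                if (++inChunk == CHUNK_SIZE) {
                    ps.executeBatch();
                    conn.commit();                        // each chunk is its own DB transaction
                    committedKeys.addAll(pendingKeys);
                    pendingKeys.clear();
                    inChunk = 0;
                }
            }
            ps.executeBatch();
            conn.commit();
            committedKeys.addAll(pendingKeys);
        } catch (SQLException e) {
            conn.rollback();   // undoes only the current chunk; earlier chunks must be deleted by key
            throw e;
        }
    }
}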
Another alternative might be to use the LOAD command by means of the ADMIN_CMD() system stored procedure. LOAD requires less log space. However, for this to work you will need to write rows that you want to insert into a file on the database server or to a shared filesystem or drive accessible from the server.
You can use the EXPORT/LOAD commands to export/import large tables; this should be very fast. The LOAD command does not use the transaction log. You may have a problem if your user has no privilege to write files on the server filesystem.
call SYSPROC.ADMIN_CMD('EXPORT TO /export/location/file.txt OF DEL MODIFIED BY COLDEL0x09 DECPT, select * from some_table ' )
call SYSPROC.ADMIN_CMD('LOAD FROM /export/location/file.txt OF DEL MODIFIED BY COLDEL0x09 DECPT, KEEPBLANKS INSERT INTO other_table COPY NO');
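If you need to trigger those commands from the Java side, they can be issued through a CallableStatement against ADMIN_CMD; a sketch reusing the command strings above (the file path and table names are the same placeholders):

import java.sql.*;

public class AdminCmdSketch {
    static void exportAndLoad(Connection conn) throws SQLException {
        try (CallableStatement cs = conn.prepareCall("CALL SYSPROC.ADMIN_CMD(?)")) {
            // export the source table to a file on the database server
            cs.setString(1, "EXPORT TO /export/location/file.txt OF DEL "
                    + "MODIFIED BY COLDEL0x09 DECPT, SELECT * FROM some_table");
            cs.execute();
            // load the file into the target table; LOAD logs far less than row-by-row INSERTs
            cs.setString(1, "LOAD FROM /export/location/file.txt OF DEL "
                    + "MODIFIED BY COLDEL0x09 DECPT, KEEPBLANKS "
                    + "INSERT INTO other_table COPY NO");
            cs.execute();
        }
    }
}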
I need a little help here because I'm struggling a bit to find the best solution for my problem. I googled and didn't find any enlightening answer.
So, first of all, I'll explain the idea.
1 - I have a Java application that inserts data into my database (Oracle DB) using JDBC.
2 - My database is logically split in two: one part contains tables with exported information (from another application) and the other part contains tables that represent some reports.
3 - My Java app only inserts information into the export tables.
4 - I've developed some packages that transform the data from the export tables into the report tables (generating some reports).
5 - These packages are scheduled to execute 2-3 times a day.
So, my problem is that when the transformation task starts, I want to prevent new DML operations. Then, when the transformation stops, all new data that was supposed to be inserted/updated during that time should be inserted into the export tables.
I thought of two approaches:
1 - During the transformation window, divert the DML operations to a temporary table.
2 - Lock the tables, but I don't have much experience with this. My main question is: can I force DML operations in JDBC to wait until the lock is released? I haven't tried it yet, but I've read here and there that after some time a lock-wait-timeout exception (or something like that) is thrown.
Can anyone more experienced give me some advice?
If you have any doubts about what I'm trying to do, just ask.
Do not try locking tables as a solution. Sadly, that is common but rarely necessary. Just a few ideas:
At the start of the transformation, select the data from the export table into a global temporary table, then execute your transformation packages against that temp table
Create a materialized view over the export table. Investigate the options to refresh on commit, but it seems you only need to refresh it just before your transformation
Analyze your exported data. As in many other cases, most of the data will probably never change once imported; only new data needs to be analyzed. To aid processing, add a timestamp field called date_last_modified and a trigger on the table that updates date_last_modified whenever a row is updated. This lets you work with the smallest possible data set of "only changed records" (a rough sketch follows after this answer)
You should also investigate using BULK COLLECT to optimize your cursor. This will allow you to get a group of records all at once, sort of a snapshot of the data at a point in time
I believe you are over-thinking this. If you fetch records one at a time, Oracle will give you the state of each record as of the last commit by any user. If you bulk collect a group of records, they go into memory and will, again, represent the state as of a point in time.
The best way to feel more comfortable about this is to set up a test case. Set up a cursor that sleeps during every processing cycle. Open another session and change the data that is being processed. See what happens....
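As a rough illustration of the date_last_modified idea from the list above, executed over JDBC; the table name export_table, the trigger name, and the column are all made-up placeholders, and in practice the DDL would live in a migration script rather than in application code:

import java.sql.*;

public class IncrementalPick {
    // One-time setup: add the tracking column and a trigger that stamps every insert/update
    static void setup(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute("ALTER TABLE export_table ADD (date_last_modified TIMESTAMP DEFAULT SYSTIMESTAMP)");
            st.execute(
                "CREATE OR REPLACE TRIGGER trg_export_last_mod "
              + "BEFORE INSERT OR UPDATE ON export_table FOR EACH ROW "
              + "BEGIN :NEW.date_last_modified := SYSTIMESTAMP; END;");
        }
    }

    // The transformation job then only reads rows changed since its last run
    static ResultSet changedSince(Connection conn, Timestamp lastRun) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
            "SELECT * FROM export_table WHERE date_last_modified > ?");
        ps.setTimestamp(1, lastRun);
        return ps.executeQuery();   // caller closes the statement via rs.getStatement()
    }
}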
There's a DB that contains approximately 300-400 records. I can make a simple query for fetching 30 records like:
SELECT * FROM table
WHERE isValidated = false
LIMIT 30
A few more words about the content of the table. There's a column named isValidated that can (as you correctly guessed) take one of two values: true or false. After a query, some of the records get validated (isValidated=true), approximately 5-6 records out of each batch of 30. Consequently, each subsequent query will fetch most of the still-unvalidated records (isValidated=false) from the previous query again; with this approach I'll never get to the end of the table.
The validation process is done with Java + Hibernate. I'm new to Hibernate, so I use a Criterion-based query for this simple select.
Are there any best practices for such a task? The option of adding a flag field (marking records that have already been fetched) is inappropriate (over-engineering for this DB).
Maybe there's a way to create some virtual table where records that have already been processed are stored, or something like that. BTW, after all the records are processed, the plan is to start processing them again (it is possible that some of them will need to be validated).
Thank you for your help in advance.
I can imagine several solutions:
store everything in memory. You only have 400 records, and it could be a perfectly fine solution given this small number
use an order by clause (which you should do anyway) on a unique column (the PK, for example), store the ID of the last loaded record, and make sure the next query uses where ID > :lastId (sketched below)
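A rough sketch of that second idea using the legacy Hibernate Criteria API the question mentions; the Record entity and its id/isValidated properties are assumed names, not something taken from the original code:

import java.util.List;
import org.hibernate.Session;
import org.hibernate.criterion.Order;
import org.hibernate.criterion.Restrictions;

public class ValidationPager {

    // Placeholder for the mapped entity behind the table; property names are assumptions
    public static class Record {
        private Long id;
        private boolean isValidated;
        public Long getId() { return id; }
    }

    private long lastId = 0L;   // highest id already handed out to the validation step

    @SuppressWarnings("unchecked")
    List<Record> nextBatch(Session session) {
        List<Record> batch = session.createCriteria(Record.class)
                .add(Restrictions.eq("isValidated", false))
                .add(Restrictions.gt("id", lastId))     // keyset pagination: only rows after the last one seen
                .addOrder(Order.asc("id"))
                .setMaxResults(30)
                .list();
        if (batch.isEmpty()) {
            lastId = 0L;                                // end of table reached -- start over on the next call
        } else {
            lastId = batch.get(batch.size() - 1).getId();
        }
        return batch;
    }
}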
I have some queries that run for quite a long time (20-30 minutes). If a lot of these queries are started simultaneously, the connection pool is drained quickly.
Is it possible to wrap a long-running query in a statement (procedure) that stores the result of a generic query in a temp table, terminating the connection, and then fetch (poll) the results later on demand?
EDIT: the queries and data structures are optimized, and tips like 'check your indices and execution plan' don't work for me. I'm looking for a way to store [maybe a] byte representation of a generic result set for later retrieval.
First of all, 20-30 minutes is an extremely long time for a query - are you sure you aren't missing any indexes for the query? Do check your execution plan - you could get a huge performance gain from a well-placed index.
In MySQL, you could do
INSERT INTO `cached_result_table` (
SELECT your_query_here
)
(of course, cached_result_table needs to have the exact same column structure as your SELECT returns, otherwise you'll get an error).
Then, you could query these cached results (instead of the original tables), and only run the above query from time to time - to update the cached_result_table.
Of course, the query will need to run at least once initially, which will take the 20-30 minutes you mentioned. I suggest pre-populating the cached table before the data is requested, and keeping some locking mechanism to prevent the update query from running several times simultaneously. Pseudocode:
init:
    insert select your_big_query

work:
    if your_big_query cached table is empty or nearing expiration:
        refresh in the background:
            check flag to see if there's another "refresh" process running
            if yes:
                end  // don't run two your_big_queries at the same time
            else:
                set flag
                re-run your_big_query, save to cached table
                clear flag
    serve data to clients always from cached table
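One possible Java shape for that pseudocode, using a single-permit semaphore as the "flag"; the refresh method body is a placeholder for the INSERT INTO cached_result_table (SELECT ...) statement above:

import java.util.concurrent.Semaphore;

public class CacheRefresher {
    private final Semaphore refreshFlag = new Semaphore(1);   // the "flag" from the pseudocode

    // Called when the cached table is empty or nearing expiration
    void refreshInBackground() {
        if (!refreshFlag.tryAcquire()) {
            return;   // another refresh is already running -- don't run two big queries at once
        }
        new Thread(() -> {
            try {
                rerunBigQueryIntoCachedTable();   // hypothetical: INSERT INTO cached_result_table (SELECT ...)
            } finally {
                refreshFlag.release();            // clear the flag even if the query fails
            }
        }).start();
    }

    void rerunBigQueryIntoCachedTable() {
        // placeholder for the 20-30 minute query feeding cached_result_table
    }
}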
An easy way to do that in Oracle is "CREATE TABLE sometempname AS SELECT...". That will create a new table using the result columns from the select.
Not quite sure what you are requesting.
Currently you have 50 database sessions. Say 40 of them get tied up running long-running queries; that leaves 10 to service the rest.
What you seem to be asking for is for those 40 queries to run asynchronously (in the background) without clogging up the connection pool of 50. The question is: do you want those 40 running concurrently with (potentially) another 50 queries from the connection pool, or do you want them queued up in some way?
Queuing can be done (look into DBMS_SCHEDULER and DBMS_JOB). But you will need to deliver those results into some other table and know how to deliver that result set. The old fashioned way is simply to generate reports on request that get delivered to a directory on a shared drive or by email. Could be PDF or CSV or Excel.
If you want the 40 running concurrently alongside the 50 'connection pool' settings, then you may be best off setting up a separate connection pool for the long-running queries.
You can look into Resource Manager for terminating calls that take too long or too many resources. That way the quickie pool can't get bogged down in long running requests.
The most generic approach in Oracle I can think of is creating a stored procedure that converts a result set into XML and stores it as a CLOB/XMLType column in a table holding the results of your long-running queries.
You can find more on generating XML from a generic result set here.
select dbms_xmlgen.getxml('select employee_id, first_name,
       last_name, phone_number from employees where rownum < 6') xml
  from dual
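From the Java side, the generated XML can be pulled back as a CLOB with an ordinary PreparedStatement; a sketch, where the inner query string would be the example query above:

import java.sql.*;

public class XmlResultFetch {
    // Runs dbms_xmlgen.getxml over an arbitrary query text and returns the XML as a String
    static String fetchAsXml(Connection conn, String innerQuery) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "SELECT dbms_xmlgen.getxml(?) FROM dual")) {
            ps.setString(1, innerQuery);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                Clob clob = rs.getClob(1);
                return clob == null ? null : clob.getSubString(1, (int) clob.length());
            }
        }
    }
    // Example call, reusing the query above:
    // String xml = fetchAsXml(conn, "select employee_id, first_name, last_name, "
    //         + "phone_number from employees where rownum < 6");
}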
So I have a database into which a lot of data is being inserted from a Java application. Usually I insert into table1 and get the last id, then insert into table2 and get the last id from there, and finally insert into table3 and get that id as well to work with it within the application. I insert around 1000-2000 rows of data every 10-15 minutes.
Using a lot of small inserts and selects on a production web server is not really good, because it sometimes bogs down the server.
My question is: is there a way to insert multiple rows into table1, table2, and table3 without such a huge number of selects and inserts? Is there some SQL-fu technique I'm missing?
Since you're probably relying on auto_increment primary keys, you have to do the inserts one at a time, at least for table1 and table2, because MySQL won't give you more than the very last key generated.
You should never have to select. You can get the last inserted id from the Statement using the getGeneratedKeys() method. See an example showing this in the MySQL manual for the Connector/J:
http://dev.mysql.com/doc/refman/5.1/en/connector-j-usagenotes-basic.html#connector-j-examples-autoincrement-getgeneratedkeys
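A short sketch of that pattern; the table and column names here are only placeholders:

import java.sql.*;

public class GeneratedKeysSketch {
    static long insertAndGetId(Connection conn, String name) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO table1 (name) VALUES (?)",
                Statement.RETURN_GENERATED_KEYS)) {       // ask the driver to return the auto_increment value
            ps.setString(1, name);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                return keys.getLong(1);                   // no extra SELECT LAST_INSERT_ID() round trip needed
            }
        }
    }
}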
Other recommendations:
Use multi-row INSERT syntax for table3 (a sketch follows after this answer).
Use ALTER TABLE DISABLE KEYS while you're importing, and re-enable them when you're finished.
Use explicit transactions. I.e. begin a transaction before your data-loading routine, and commit at the end. I'd probably also commit after every 1000 rows of table1.
Use prepared statements.
Unfortunately, you can't use the fastest method for bulk load of data, LOAD DATA INFILE, because that doesn't allow you to get the generated id values per row.
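For the multi-row INSERT and explicit-transaction suggestions, a sketch along these lines could work; table3 and its two columns are placeholders:

import java.sql.*;
import java.util.List;

public class Table3Loader {
    // Inserts many table3 rows in one statement and one transaction
    static void loadTable3(Connection conn, List<long[]> rows) throws SQLException {
        if (rows.isEmpty()) return;
        conn.setAutoCommit(false);                         // explicit transaction around the whole load
        StringBuilder sql = new StringBuilder("INSERT INTO table3 (table2_id, data) VALUES ");
        for (int i = 0; i < rows.size(); i++) {
            sql.append(i == 0 ? "(?, ?)" : ", (?, ?)");    // multi-row INSERT syntax
        }
        try (PreparedStatement ps = conn.prepareStatement(sql.toString())) {
            int p = 1;
            for (long[] row : rows) {
                ps.setLong(p++, row[0]);
                ps.setLong(p++, row[1]);
            }
            ps.executeUpdate();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}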
There's a lot to talk about here:
It's likely that network latency is killing you if each of those INSERTs is another network roundtrip. Try batching your requests so they only require a single roundtrip for the entire transaction.
Speaking of transactions, you don't mention them. If all three of those INSERTs need to be a single unit of work, you'd better be handling transactions properly. If you don't know how, it's better to research them.
Try caching requests if they're reused a lot. The fastest roundtrip is the one you don't make.
You could redesign your database so that the primary key is not a database-generated, auto-incremented value, but rather a client-generated UUID. Then you could generate all the keys for every record upfront and batch the inserts however you like (sketched below).
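A sketch of that client-generated-UUID idea; all table and column names are made up, and in a real loader you would accumulate many rows per batch before executing:

import java.sql.*;
import java.util.UUID;

public class UuidBatchInsert {
    static void insertOne(Connection conn, String payload1, String payload2, String payload3)
            throws SQLException {
        // Generate all keys up front -- nothing needs to be read back from the database
        String id1 = UUID.randomUUID().toString();
        String id2 = UUID.randomUUID().toString();
        String id3 = UUID.randomUUID().toString();

        conn.setAutoCommit(false);                         // all three inserts as one unit of work
        try (PreparedStatement ps1 = conn.prepareStatement("INSERT INTO table1 (id, data) VALUES (?, ?)");
             PreparedStatement ps2 = conn.prepareStatement("INSERT INTO table2 (id, table1_id, data) VALUES (?, ?, ?)");
             PreparedStatement ps3 = conn.prepareStatement("INSERT INTO table3 (id, table2_id, data) VALUES (?, ?, ?)")) {
            ps1.setString(1, id1); ps1.setString(2, payload1); ps1.addBatch();
            ps2.setString(1, id2); ps2.setString(2, id1); ps2.setString(3, payload2); ps2.addBatch();
            ps3.setString(1, id3); ps3.setString(2, id2); ps3.setString(3, payload3); ps3.addBatch();
            ps1.executeBatch();
            ps2.executeBatch();
            ps3.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        }
    }
}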