I would like to perform updates to a MySQL database using two separate classes (that do different things) -- one doing so every 10 seconds, and the second every minute. I have a few gaps in my Java knowledge and I'm wondering what the best way to achieve it is.
Importantly, if connectivity to the database is lost, I need reconnection attempts to occur indefinitely, and I'm guessing the use of Prepared Statements will make queries more efficient. Should the connection to the database be left open all the time or closed between the updates being run? Maybe I also need to think about clearing objects/resources out of memory if the class instances are going to be run indefinitely.
Answer to your first question, "How do I run a Java class at a specific interval?"
You can use Quartz triggers to run a Java class at a fixed interval.
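If you don't want the Quartz dependency, the JDK's built-in ScheduledExecutorService can do the same job. A minimal sketch (TenSecondUpdater and MinuteUpdater are hypothetical Runnable implementations standing in for your two classes):

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class UpdateScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        // First class runs every 10 seconds, second class every minute.
        scheduler.scheduleAtFixedRate(new TenSecondUpdater(), 0, 10, TimeUnit.SECONDS);
        scheduler.scheduleAtFixedRate(new MinuteUpdater(), 0, 1, TimeUnit.MINUTES);
    }
}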
Second question: "What if the connection is lost?"
In the class that runs at each interval, check the connection: if it is open, don't attempt to reconnect; if it has been lost, first reconnect to the database and then run the update.
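Something along these lines might work for that check-then-reconnect logic, assuming a plain DriverManager connection (URL and credentials are placeholders); it retries indefinitely, which is what the question requires:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

static Connection ensureConnection(Connection conn) throws InterruptedException {
    while (true) {
        try {
            // isValid() pings the server with a 2-second timeout (JDBC 4+)
            if (conn != null && conn.isValid(2)) {
                return conn;
            }
            return DriverManager.getConnection(
                    "jdbc:mysql://localhost:3306/mydb", "user", "password");
        } catch (SQLException e) {
            Thread.sleep(5000); // wait 5 seconds before the next attempt
        }
    }
}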
Third question: "Should the connection be left open the whole time?"
If you want to run updates with a very small time gap then I think you should keep it open, but I'm not sure about it. Maybe you should try keeping it open.
I would keep the database connection open but not try to manage the open connection yourself. Use a connection pooling library like Apache DBCP. It lets you specify a validationQuery, typically a very simple SELECT, that is executed before the actual statement to validate that the connection is still alive. In my application this works rock solid (one app that runs 24/7; queries about every minute).
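For illustration, a minimal DBCP 2 setup with a validationQuery might look like this (URL and credentials are placeholders):

import org.apache.commons.dbcp2.BasicDataSource;

public class PoolSetup {
    static BasicDataSource createPool() {
        BasicDataSource ds = new BasicDataSource();
        ds.setUrl("jdbc:mysql://localhost:3306/mydb"); // placeholder URL
        ds.setUsername("user");                        // placeholder credentials
        ds.setPassword("password");
        ds.setValidationQuery("SELECT 1"); // cheap query used to test connections
        ds.setTestOnBorrow(true);          // validate each connection as it leaves the pool
        return ds;
    }
}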
I am working on a Java web application that uses Weblogic to connect to an Informix database. In the application we have multiple threads creating records in a table.
It happens pretty often that it fails and the following error is thrown:
java.sql.SQLException: Could not do a physical-order read to fetch next row....
Caused by: java.sql.SQLException: ISAM error: record is locked.
I am assuming that both threads are trying to insert or update when the record is locked.
I did some research and found that there is an option to tell the database that, instead of throwing an error, it should wait for the lock to be released:
SET LOCK MODE TO WAIT;
SET LOCK MODE TO WAIT 17;
I don't think that there is an option in JDBC to use this setting. How do I go about using this setting in my java web app?
You can always send that SQL straight up: create a Statement with createStatement() and execute that exact SQL.
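For example, run once right after the connection is opened (conn is assumed to be an open java.sql.Connection):

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

static void setLockMode(Connection conn) throws SQLException {
    try (Statement st = conn.createStatement()) {
        st.execute("SET LOCK MODE TO WAIT 17");
    }
}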
The more 'normal' / modern approach to this problem is a combination of MVCC, the transaction level 'SERIALIZABLE', retry, and random backoff.
I have no idea if Informix is anywhere near that advanced, though. Modern DBs such as Postgres are (mysql does not count as modern for the purposes of MVCC/serializable/retry/backoff, and transactional safety).
Doing MVCC/Serializable/Retry/Backoff in raw JDBC is very complicated; use a library such as JDBI or JOOQ.
MVCC: A mechanism whereby transactions are shallow clones of the underlying data. 2 separate transactions can both read and write to the same records in the same table without getting in each other's way. Things aren't 'saved' until you commit the transaction.
SERIALIZABLE: A transaction level (also called isolation level), settable with jdbcDbObj.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE); - the safest level. If you know how version control systems work: You're asking the database to aggressively rebase everything so that the entire chain of commits is ordered into a single long line of events: Each transaction acts as if it was done after the previous transaction was completed. The simplest way to implement this level is to globally lock all the things. This is, of course, very detrimental to multithread performance. In practice, good DB engines (such as postgres) are smarter than that: Multiple threads can simultaneously run transactions without just being frozen and waiting for locks; the DB engine instead checks whether the things that the transaction did (not just writing, also reading) are conflict-free with simultaneous transactions. If yes, it's all allowed. If not, all but one of the simultaneous transactions throw a retry exception. This is the only level that lets you do this sequence of events safely:
Fetch the balance of isaace's bank account.
Fetch the balance of rzwitserloot's bank account.
Subtract €10,- from isaace's number, failing if the balance is insufficient.
Add €10,- to rzwitserloot's number.
Write isaace's new balance to the db.
Write rzwitserloot's new balance to the db.
Commit the transaction.
Any level less than SERIALIZABLE will silently fail the job; if multiple threads do the above simultaneously, no SQLExceptions occur but the sum of the balance of isaace and rzwitserloot will change over time (money is lost or created – in between steps 1 & 2 vs. step 5/6/7, another thread sets new balances, but these new balances are lost due to the update in 5/6/7). With serializable, that cannot happen.
RETRY: The way smart DBs solve the problem is by failing (with a 'retry' error) all but one transaction: they check whether any SELECT done by the entire transaction would be affected by transactions that have been committed to the db after this transaction was opened. If the answer is yes (some selects would have gone differently), the transaction fails. The point of this error is to tell the code that ran the transaction to just.. start from the top and do it again. Most likely this time there won't be a conflict and it will work. The assumption is that conflicts CAN occur but usually do not occur, so it is better to assume 'fair weather' (no locks, just do your stuff), check afterwards, and try again in the exotic scenario that it conflicted, vs. trying to lock rows and tables. Note that, for example, ethernet works the same way (assume fair weather, recover from errors afterwards).
BACKOFF: One problem with retry is that computers are too consistent: If 2 threads get in the way of each other, they can both fail, both try again, just to fail again, forever. The solution is that the threads twiddle their thumbs for a random amount of time, to guarantee that at some point, one of the two conflicting retriers 'wins'.
In other words, if you want to do it 'right' (see the bank account example), but also relatively 'fast' (not globally locking), get a DB that can do this, and use JDBI or JOOQ; otherwise, you'd have to write code to run all DB stuff in a lambda block, catch the SQLException, check the SqlState to see if it is indicating that you should retry (sqlstate codes are DB-engine specific), and if yes, rerun that lambda after waiting an exponentially increasing amount of time that also includes a random factor. That's fairly complicated, which is why I strongly advise you to rely on JOOQ or JDBI to take care of this for you.
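To make that concrete, a rough sketch of the hand-rolled version (SqlWork is a hypothetical functional interface; SQLState "40001" is the standard serialization-failure code on e.g. Postgres, but codes vary per engine, so check yours):

import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.ThreadLocalRandom;

public class SerializableRetry {
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    static void inSerializableTransaction(Connection conn, SqlWork work)
            throws SQLException, InterruptedException {
        long backoff = 50; // milliseconds
        for (int attempt = 0; ; attempt++) {
            try {
                conn.setAutoCommit(false);
                conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
                work.run(conn);
                conn.commit();
                return;
            } catch (SQLException e) {
                conn.rollback();
                if (!"40001".equals(e.getSQLState()) || attempt >= 10) {
                    throw e; // not a serialization conflict, or we give up
                }
                // Exponential backoff with a random factor so conflicting
                // retriers don't keep colliding in lockstep.
                Thread.sleep(backoff + ThreadLocalRandom.current().nextLong(backoff));
                backoff *= 2;
            }
        }
    }
}

Usage would then be something like inSerializableTransaction(conn, c -> { /* the bank-transfer statements */ });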
If you aren't ready for that level of DB usage, just make a statement and send "SET LOCK MODE TO WAIT 17;" as an SQL statement straight up, at the start of opening any connection. If you're using a connection pool there is usually a place where you can configure SQL statements to be run on connection start.
The Informix JDBC driver does allow you to automatically set the lock wait mode when you connect to the server.
Simply pass via the DataSource or connection URL the following parameter
IFX_LOCK_MODE_WAIT=17
The values for JDBC are
(-1) Wait forever
(0) not wait (default)
(> 0) wait this many seconds
See https://www.ibm.com/support/knowledgecenter/SSGU8G_14.1.0/com.ibm.jdbc.doc/ids_jdbc_040.htm
Connection conn = DriverManager.getConnection(
    "jdbc:Informix-sqli://cleo:1550:IFXHOST=cleo;PORTNO=1550;"
    + "user=rdtest;password=my_passwd;IFX_LOCK_MODE_WAIT=17");
I have the following scenario:
MethodA() calls MethodB()
MethodB() calls MethodC()
All methods have to execute some query against the DB. So, to do this, I create a connection object and pass it along the chain of methods so the connection can be reused.
Assumption here is that connection pooling is not being employed.
Now my question is: should a single connection be opened, reused, and closed at the starting point (in the above example, the connection would be opened and closed in MethodA)? Or should I create a separate connection for each method?
Reusing the connection seems better, but then I will have to keep the connection open till the control comes back to MethodA().
I have read that reusing the connection is better, as connections are expensive to create. But I have also read that it's better to close the connection as soon as possible, i.e., once you are done with the query call.
Which approach is better and why?
It sounds like you are only querying the DB and not updating or inserting. If that is the case then you avoid many of the transactional semantics in such a nested procedure call.
If that is true, then simply connect once, do all your querying and close the connection. While usage of a connection pool is somewhat orthogonal to your question - use one if you can. They greatly simplify your code because you can have the pool automatically test the connection before it gives one to you. It will auto-create a new connection if the connection was lost (let's say because the DB was bounced).
Finally, you want to minimize the number of times you create a DB connection, BECAUSE it is expensive. However, this is often non-trivial. Databases themselves only support a maximum number of connections; if there are many clients, you need to take this into consideration. If you have the trivial case - one database, and your program is the only one making connections - then open the connection and leave it open for the duration of the program. Leaving it open would require you to validate it yourself, so using a DB pool of size 1 avoids that.
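A sketch of the connect-once shape for the question's scenario (dataSource is a hypothetical javax.sql.DataSource; with a pool of size 1, close() simply returns the connection to the pool):

import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

void methodA(DataSource dataSource) throws SQLException {
    try (Connection conn = dataSource.getConnection()) {
        // ... methodA's own queries ...
        methodB(conn);
    } // closed here (or returned to the pool) once control comes back
}

void methodB(Connection conn) throws SQLException {
    // ... methodB's queries ...
    methodC(conn);
}

void methodC(Connection conn) throws SQLException {
    // ... methodC's queries ...
}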
I've a typical scenario & need to understand the best possible way to handle it, so here goes -
I'm developing a solution that will retrieve data from a remote SOAP based web service & will then push this data to an Oracle database on network.
Also, this will be a scheduled task that will execute every 15 minutes.
I've event queues on the remote service that contain the INSERT/UPDATE/DELETE operations done since the last retrieval, & once I retrieve the events for the last 15 minutes, it starts queuing events again for the next retrieval.
Now, it's just pushing data to Oracle, so all my interactions are INSERT & UPDATE statements.
There are around 60 tables on Oracle, some of them having 100+ columns. Moreover, for every 15-minute cycle there would be around 60-70 inserts, 100+ updates & 10-20 deletes.
This will be an executable jar file that terminates after each run & starts again on the next 15-minute cycle.
So, I need to understand how I should handle WRITE operations (best practices) to improve performance for this application as a whole.
Current Test Code (on every cycle) -
Connects to remote service to get events.
Creates a connection with DB (single connection object).
Identifies the type of operation (INSERT/UPDATE/DELETE) & table on which it is done.
After above, calls the respective method based on type of operation & table.
Uses PreparedStatement with positional parameters, & retrieves each column value from the remote service & assigns it to the statement parameters.
Commits the statement & returns to the get-event class to process the next event.
The above is repeated till all the retrieved events are processed, after which the program closes & then starts on the next cycle & everything repeats again.
Thanks for the help!
If you are inserting or updating one row at a time, consider executing a batch insert or a batch update instead. Beyond a certain number of rows, batching gives much better performance.
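A sketch of what batching might look like with PreparedStatement (the events table and the Event class are hypothetical stand-ins for your data):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

static void insertEvents(Connection conn, List<Event> events) throws SQLException {
    String sql = "INSERT INTO events (id, payload) VALUES (?, ?)";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
        for (Event e : events) {
            ps.setLong(1, e.getId());
            ps.setString(2, e.getPayload());
            ps.addBatch();     // queue the row instead of executing immediately
        }
        ps.executeBatch();     // send the whole batch in one round-trip
    }
}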
The number of DB operations you are talking about (200 every 15 minutes) is tiny and will be easy to finish in less than 15 minutes. Some concrete suggestions:
You should profile your application to understand where it is spending its time. If you don't do this, then you don't know what to optimize next and you don't know if something you did helped or hurt.
If possible, try to get all of the events in one round-trip to the remote server.
You should reuse the connection to the remote service (probably by using a library that supports connection persistence and reuse).
You should reuse the DB connections by using a connection pooling library rather than creating a new connection for each insert/update/delete. Believe it or not, creating the connection probably takes 100+ times as long as doing your DB operation once you have the connection in hand.
You should consider doing multiple (or all) of the database operations in the same transaction rather than creating a new transaction for each row that is changed (see the sketch after this list). However, you should carefully consider your failure modes so that you don't lose any events (if that is an important consideration).
You should consider utilizing prepared statement caching. This may help, but maybe not if Oracle is configured properly.
You should consider trying to analyze your operations to find any that can be batched together. This can be a lot faster if you have some "hot" operations that get done often.
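A minimal sketch of the one-transaction-per-cycle idea from the list above (applyEvent and Event stand in for your existing per-event INSERT/UPDATE/DELETE methods and data):

import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

static void processCycle(Connection conn, List<Event> events) throws SQLException {
    conn.setAutoCommit(false);
    try {
        for (Event event : events) {
            applyEvent(conn, event); // your existing insert/update/delete logic
        }
        conn.commit();               // one commit for the whole 15-minute batch
    } catch (SQLException e) {
        conn.rollback();             // nothing half-applied; events can be retried
        throw e;
    }
}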
"I've a typical scenario"
No you haven't. You have a bespoke architecture, with a unique data model, unique data and unique business requirements. That's not a bad thing, it's the state of pretty much every computer system that's not been bought off-the-shelf (and even some of them).
So, it's an experiment and you must approach it as such. There is no "best practice". Try various things and see what works best.
"need to understand best possible way to handle this"
You will improve your chances of success enormously by hiring somebody who understands Oracle databases.
I am developing a simple database project. It has some functions that will be performed on the database. I need advice about opening and closing the database connection. Which of the two ways below is better?
When the program starts, open the database connection, and close it when the program finishes.
Each time a function is called, open the database connection, and close it at the end of that function.
I am confused about which one is better. In the first case, the database connection stays open for the duration of the program. In the second case, an open and a close operation are performed on every function call, which will reduce efficiency and also make the code longer.
So which one is better?
Better is subjective. If you have only a few clients connecting to the database for short periods of time, then perhaps leaving a connection open is the better way to go. However, if you have hundreds of clients needing connections, there is a limited number of simultaneous connections that can be supported, and keeping connections open for long periods would be detrimental.
There is an overhead involved in opening and closing connections to the database, but you have to balance that against whether your platform actually pools connections for you, in which case the connection only appears to be closed: it will be reused if needed and only really closed after some period of time. There are many other possible criteria as well.
Better is what meets your needs in the simplest to maintain manner.
Make a separate function like getConnection() which will create and return a Connection object. Put all your code that will create the connection there.
Now, whenever a function needs a database connection, call getConnection():
public void myFunction() throws java.sql.SQLException
{
    // getConnection() returns an already-open java.sql.Connection,
    // so there is no separate open() call.
    try (java.sql.Connection con = MyConnectionClass.getConnection()) {
        // do your business logic here
    } // try-with-resources closes con automatically
}
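And a minimal sketch of what MyConnectionClass itself might look like, assuming a plain DriverManager connection (URL and credentials are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public final class MyConnectionClass {
    private static final String URL = "jdbc:mysql://localhost:3306/mydb";
    private static final String USER = "user";
    private static final String PASSWORD = "password";

    public static Connection getConnection() throws SQLException {
        // Returns a freshly opened connection; callers are responsible for closing it.
        return DriverManager.getConnection(URL, USER, PASSWORD);
    }
}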
There is very little cost to leaving a connection open longer than necessary, so that would favor your idea to open when the program starts and close upon exiting the program.
I'd like to come up with the best approach for re-connecting to MS SQL Server when connection from JBoss AS 5 to DB is lost temporarily.
For Oracle, I found this SO question: "Is there any way to have the JBoss connection pool reconnect to Oracle when connections go bad?" which says it uses an Oracle specific ping routine and utilizes the valid-connection-checker-class-name property described in JBoss' Configuring Datasources Wiki.
What I'd like to avoid is to have another SQL run every time a connection is pulled from the pool which is what the other property check-valid-connection-sql basically does.
So for now, I'm leaning towards an approach which uses exception-sorter-class-name but I'm not sure whether this is the best approach in the case of MS SQL Server.
Hoping to hear your suggestions on the topic. Thanks!
I am not sure it will work the way you describe it (transparently).
The valid connection checker (this can be either a SQL statement in the *-ds.xml file or a class that does the lifting) is meant to be called when a connection is taken from the pool, as the DB could have closed it while it was sitting in the pool. If the connection is no longer valid, it is closed and a new one is requested from the DB. This is completely transparent to the application and only happens (as you say) when the connection is taken out of the pool. You can then use that connection in your application for a long time.
The exception sorter is meant to report to the application if e.g. ORA-0815 is a harmless or bad return code for a SQL statement. If it is a harmless one it is basically swallowed, while for a bad one it is reported to the application as an Exception.
So if you want to use the exception sorter to find bad connections in the pool, you need to be prepared that basically every statement that you fire could throw a stale-connection Exception and you would need to close the connection and try to obtain a new one. This means appropriate changes in your code, which you can of course do.
I think firing a cheap sql statement at the DB every now and then to check if a connection from the pool is still valid is a lot less expensive than doing all this checking 'by hand'.
Btw: while there is the generic connection checker sql that works with all databases, some databases provide another way of testing if the connection is good; Oracle has a special ping command for this, which is used in the special OracleConnectionChecker class you refer to. So it may be that there is something similar for MS-SQL, which is less expensive than a simple SQL statement.
I successfully used the background validation property background-validation-millis from https://community.jboss.org/wiki/ConfigDataSources
With JBoss 5.1 (I don't know with other versions), you can use
<valid-connection-checker-class-name>org.jboss.resource.adapter.jdbc.vendor.MSSQLValidConnectionChecker</valid-connection-checker-class-name>
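Put together, the relevant fragment of the *-ds.xml might look like this (the 60-second interval is illustrative; background-validation-millis is the property from the wiki page linked above):

<valid-connection-checker-class-name>org.jboss.resource.adapter.jdbc.vendor.MSSQLValidConnectionChecker</valid-connection-checker-class-name>
<background-validation-millis>60000</background-validation-millis>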