Spring Batch job remains in EXECUTING

I created a job which uses a reader of type
org.springframework.batch.item.database.HibernateCursorItemReader to execute a query.
The problem is that the database connection in this case hits the connection limit (I get the Oracle error ORA-12519, TNS:no appropriate service handler found) and, surprisingly, I noticed exit_code=EXECUTING and status=STARTED in the BATCH_STEP_EXECUTION table.
If I run the job again, it responds "A job execution for this job is already running", and if I issue -restart on this task, it complains with the message "No failed or stopped execution found for job".
How does Spring Batch manage these fatal failure situations? Do I have to remove this execution information manually, or is there a reset option?
Thank you for any help

The current release of Spring Batch (2.2.0) doesn't appear to have an out-of-the-box solution for this situation. As discussed in this question, 'manual' intervention in the database may be required. Alternatively, if it is a particular job that is hanging (that is, you know the job name), you can do the following:
Use JobExplorer.findRunningJobExecutions(jobName) to get the executions that are still marked as running.
Go through the list of executions and 'fail' them (JobExecution.upgradeStatus(BatchStatus.FAILED)).
Save the change using JobRepository.update(jobExecution).
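For the 'manual' database route mentioned above, a minimal JDBC sketch might look like the following. It assumes the default Spring Batch metadata schema (tables BATCH_JOB_EXECUTION and BATCH_STEP_EXECUTION with STATUS, EXIT_CODE, END_TIME and JOB_EXECUTION_ID columns) and that you already know the id of the stuck execution. The class and method names are hypothetical. It also sets END_TIME, on the assumption that an execution without an end time may still be treated as running.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

// Hypothetical helper: marks one stuck job execution and its started steps as FAILED.
final class StuckExecutionReset {

    // Assumes the default table and column names of the Spring Batch metadata schema.
    static final String FAIL_STEPS =
        "UPDATE BATCH_STEP_EXECUTION SET STATUS = 'FAILED', EXIT_CODE = 'FAILED', END_TIME = ? "
        + "WHERE JOB_EXECUTION_ID = ? AND STATUS = 'STARTED'";

    static final String FAIL_JOB =
        "UPDATE BATCH_JOB_EXECUTION SET STATUS = 'FAILED', EXIT_CODE = 'FAILED', END_TIME = ? "
        + "WHERE JOB_EXECUTION_ID = ? AND STATUS = 'STARTED'";

    // Runs both updates in one transaction and returns the number of rows changed.
    static int markFailed(Connection con, long jobExecutionId) throws SQLException {
        Timestamp now = new Timestamp(System.currentTimeMillis());
        boolean oldAutoCommit = con.getAutoCommit();
        con.setAutoCommit(false);
        try {
            int rows = update(con, FAIL_STEPS, now, jobExecutionId)
                     + update(con, FAIL_JOB, now, jobExecutionId);
            con.commit();
            return rows;
        } catch (SQLException e) {
            con.rollback();
            throw e;
        } finally {
            con.setAutoCommit(oldAutoCommit);
        }
    }

    private static int update(Connection con, String sql, Timestamp endTime, long id)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setTimestamp(1, endTime);
            ps.setLong(2, id);
            return ps.executeUpdate();
        }
    }
}
```

Once the rows are marked FAILED, the -restart option should be able to find a failed execution to resume, provided nothing else in the metadata still marks it as running.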

Just an FYI on why this connection-limit problem occurs when using a cursor-based reader (JdbcCursorItemReader or HibernateCursorItemReader):
The cursor item reader opens a separate connection even if a connection is already open for the transaction (reader -> processors -> writer). So each step execution needs two connections, even though it runs in a single transaction and hits the same database. This causes the connection bottleneck, so the number of database connections should be double the number of threads configured in the thread pool that executes the steps in parallel.
This can also be resolved if you provide a separate connection to your cursor reader.
JdbcPagingItemReader is another ItemReader implementation, and it uses the same connection that is opened for the transaction.
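The sizing rule above is simple arithmetic. As a sketch (the class and method names are made up for illustration, and it only models the "two connections per step thread" rule stated in this answer):

```java
// Hypothetical helper for sizing a connection pool when cursor-based readers are used.
final class CursorReaderPoolSizing {

    // Each step thread holds one transactional connection plus one cursor connection.
    static int requiredConnections(int stepThreads) {
        if (stepThreads < 0) {
            throw new IllegalArgumentException("stepThreads must not be negative");
        }
        return Math.multiplyExact(stepThreads, 2);
    }

    public static void main(String[] args) {
        System.out.println(requiredConnections(8)); // prints 16
    }
}
```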

Related

Where to handle DB connection related exceptions in a Spring Boot/JPA application?

I'm relatively new to Spring and I've created a small app that connects to an Oracle db, regularly sends some queries and receives some data. How can I handle eventual exceptions caused by connection issues? They can be thrown both at start-up (which I noticed is different from normal running: it is something related to bean creation and is thrown by SpringApplication.run) and during normal running (thrown by the methods doing the query, like findAll(); I created a @ControllerAdvice for that, but this won't work for the start-up part).
The database connection is configured based on some application.properties data and autoconfiguration (Hikari pool).
Basically I want this: if there's any connection issue, whether right from the start or during running, make 3 retries with a back-off of 1 minute; if none work, a method should be called, then the app should close gracefully. How can I do this?
Is it possible to specify a specific back-off time for the Hikari pool? (The default seems to be around 5 seconds, which seems a bit too often.)
Thank you!
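The retry behaviour asked for above (a fixed number of attempts, a fixed back-off, then a final callback) can be sketched in plain Java, independent of Spring or Hikari. All names here are hypothetical, and whether "3 retries" means three attempts in total is left to the caller via maxAttempts.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical helper: retries an action with a fixed back-off between attempts.
final class ConnectionRetry {

    // Calls the action up to maxAttempts times, sleeping backOff between failed attempts.
    // If every attempt fails, runs onGiveUp and rethrows the last exception.
    static <T> T callWithRetry(Callable<T> action, int maxAttempts,
                               Duration backOff, Runnable onGiveUp) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(backOff.toMillis()); // wait before the next attempt
                }
            }
        }
        onGiveUp.run(); // the "method that should be called" once all attempts failed
        throw last;
    }
}
```

A caller could wrap a query such as findAll() in callWithRetry(..., 3, Duration.ofMinutes(1), onGiveUp) and perform its own shutdown once the exception propagates.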

JdbcSQLNonTransientConnectionException: Database may be already in use: "Waited for database closing longer than 1 minute"

We are using H2 started as database server process and listening on standard TCP/IP port 9092.
Our application is deployed in a Tomcat using connection pooling. We do a purge during idle time which at the end results in closing all connections to H2. From time to time we observe errors when the application tries to open the connection to H2 again:
SCHEDULERSERVICE schedule: Exception: Database may be already in use: "Waited for database closing longer than 1 minute". Possible solutions: close all other connection(s); use the server mode [90020-199]
org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be already in use: "Waited for database closing longer than 1 minute". Possible solutions: close all other connection(s); use the server mode [90020-199]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:617)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:427)
at org.h2.message.DbException.get(DbException.java:205)
at org.h2.message.DbException.get(DbException.java:181)
at org.h2.engine.Engine.openSession(Engine.java:209)
at org.h2.engine.Engine.createSessionAndValidate(Engine.java:178)
at org.h2.engine.Engine.createSession(Engine.java:161)
at org.h2.server.TcpServerThread.run(TcpServerThread.java:160)
at java.lang.Thread.run(Thread.java:748)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:617)
at org.h2.engine.SessionRemote.done(SessionRemote.java:607)
at org.h2.engine.SessionRemote.initTransfer(SessionRemote.java:143)
at org.h2.engine.SessionRemote.connectServer(SessionRemote.java:431)
at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:317)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:169)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:148)
at org.h2.Driver.connect(Driver.java:69)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
The problem occurs when the Tomcat connection pool closes all idle (unused) connections and one connection that was still in use is closed afterwards.
The next attempt to open a new connection fails; a retry succeeds after some wait time.
Under which circumstances does this exception happen?
What does the exception mean?
Are there any recommendations to follow to avoid the problem?
It sounds to me as if H2 closes the database after the last connection has been closed.
When does the database close occur?
How can database closures be controlled?
Thx in advance
Thorsten
An embedded database in a web application needs careful handling of its lifecycle.
You can add a javax.servlet.ServletContextListener implementation (marked with the @WebListener annotation or declared in web.xml) and add an explicit database shutdown to its contextDestroyed() method.
You can force the database shutdown there with connection.createStatement().execute("SHUTDOWN"). If your application needs to write something to the database during unload, it should do so before that command.
Without the explicit shutdown, H2 closes the database when all connections are closed, unless some other behavior was configured explicitly (with parameters in the JDBC URL, for example). For example, DB_CLOSE_DELAY sets an additional delay; maybe your application uses that setting and therefore H2 doesn't close the database immediately, or the application doesn't close all connections immediately.
Anyway, when you update the web application on the fly, Tomcat tries to initialize the new version before the old version is unloaded. If H2 is on the classpath of the web application itself, the new version will be unable to connect to the database during the short period when the new version is already online but the old version isn't unloaded yet.
If you don't like it, you can run the standalone H2 Server process and use remote connections to it in your web applications.
Another option is to move H2 to the classpath of Tomcat itself and configure the connection pool as resource in the server.xml, in that case it shouldn't be affected by the lifecycle of your applications.
In both these cases you shouldn't use the SHUTDOWN command.
UPDATED
With client-server connections to a remote server, this exception means that the server decided to close the database because there were no active connections. This operation can't be interrupted or reverted midway. An attempt to open a new connection to the same database during this process waits at most 1 minute for the close to complete so that the database can be re-opened. This timeout is not configurable.
There are two possible solutions.
The DB_CLOSE_DELAY setting can be used with some large value in seconds. When all connections are closed, the database will stay online for the specified number of seconds. -1 can also be used to set an infinite timeout.
You can try to speed up the shutdown process, but you have to figure out for yourself what takes so much time. The file compaction procedure is limited to 200 milliseconds by default; it may take longer, but I think it shouldn't take that long. Maybe you have a lot of temporary objects or uncommitted data. Maybe the database file is very highly fragmented. It's hard to say what's going wrong without further investigation.
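As a small illustration of the first option, DB_CLOSE_DELAY can be appended to the JDBC URL as a setting. The helper below is only a sketch: the class name is hypothetical, the host, port, and database path in the usage line are placeholders, and it assumes the base URL does not already contain a DB_CLOSE_DELAY setting.

```java
// Hypothetical helper that appends the DB_CLOSE_DELAY setting to an H2 JDBC URL.
final class H2Urls {

    // seconds = -1 keeps the database open until it is shut down explicitly.
    static String withCloseDelay(String baseUrl, int seconds) {
        if (seconds < -1) {
            throw new IllegalArgumentException("seconds must be -1 or greater");
        }
        return baseUrl + ";DB_CLOSE_DELAY=" + seconds;
    }

    public static void main(String[] args) {
        // Placeholder URL for a server listening on the standard TCP port 9092.
        System.out.println(withCloseDelay("jdbc:h2:tcp://localhost:9092/~/test", -1));
        // prints jdbc:h2:tcp://localhost:9092/~/test;DB_CLOSE_DELAY=-1
    }
}
```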

Using DB.requestDone() to force database connections back into connection pool

I have a grails application that uses a quartz job to automatically augment documents with data supplied from an external service. The quartz job uses a non-transactional Service to query and update documents from mongodb. The actual querying and updating uses mongo's native driver (no GORM). The quartz job and Service do not return database connections to the connection pool resulting in the error "Connection wait timeout" once all connections have been consumed.
I can fix the problem by adding a call to DB.requestDone() after querying and updating in the spawned thread. I am not sure about the ramifications of using requestDone for this purpose.
Are there negative consequences for calling requestDone without ever calling requestStart?
Are there any threading issues with requestStart/requestDone? For example, what happens if another thread is in the middle of querying Mongo when requestDone is called?
Is there a better way to ensure a database connection is returned to the connection pool?
FYI, I tried adding cursor.close() but that did not resolve the problem.

Multithreaded data processing hanging on PostgreSQL

I'm trying to use the 8 threads of my new processor to handle transactions on a PostgreSQL database. It has to process geographic data in PostGIS, which I already do with just one processor core (one thread). I'm using Java (JDBC4) to create one Connection for each thread. Each connection receives the job of processing groups of geometric entities, where one SELECT and one UPDATE statement are used for each entity. Each entity is processed by unique ID and no relation functions are used, so there are no dependencies between the transactions.
The application can be started to run with a variable number of threads. When I run it, all except one of the threads hang. Even if I try to run with just two threads, one hangs. With the "Server status" tool from pgAdmin3 I can see that all the hanging threads are "IDLE in transaction", some in "ExclusiveLock" mode, some in "RowExclusiveLock" mode and some in "AccessShareLock" mode.
I've adjusted my postgresql.conf as described in http://jayant7k.blogspot.com/2010/06/postgresql-tuning-quick-tips.html
I've tried to put the threads to sleep for a while right after the UPDATE statement with no success.
Why are the locks being created? Is there a way to avoid these locks, given that there is no reason for one query to depend on another?
Thanks for any help
Did you set min-pool-size and max-pool-size for the JDBC connection pool?
In your case, the minimum should be 8.

JDBC connection for a background thread being closed accessing in Websphere

I have an application running in Websphere Portal Server inside of Websphere Application Server 6.0 (WAS). In this application for one particular functionality that takes a long time to complete, I am firing a new thread that performs this action. This new thread opens a new Session from Hibernate and starts performing DB transactions with it. Sometimes (haven't been able to see a pattern), the transactions inside the thread work fine and the process completes successfully. Other times however I get the errors below:
org.hibernate.exception.GenericJDBCException: could not load an entity: [OBJECT NAME#218294]
...
Caused by: com.ibm.websphere.ce.cm.ObjectClosedException: DSRA9110E: Connection is closed.
Method cleanup failed while trying to execute method cleanup on ManagedConnection WSRdbManagedConnectionImpl@642aa0d8 from resource jdbc/MyJDBCDataSource. Caught exception: com.ibm.ws.exception.WsException: DSRA0080E: An exception was received by the Data Store Adapter. See original exception message: Cannot call 'cleanup' on a ManagedConnection while it is still in a transaction..
How can I stop this from happening? Why does it seem that WAS wants to kill my connections even though they're not done? Is there a way I can stop WAS from attempting to close this particular connection?
Thanks
I mentioned two possible causes in my other answer: 1. the optional hibernate.connection.release_mode parameter, or 2. a problem with unmanaged threads. Now that I have read this question, I really start to think that your problem may be related to the fact that you're spawning your own threads. Since they aren't managed by the container, connections used in these threads may appear as "leaked" (not closed properly), and I wouldn't be surprised if WAS tries to recover them at some point.
If you want to start a long running job, you should use a WorkManager. Don't spawn threads yourself.
