I am encountering a very strange problem in production. I have deployed a new application that runs in Tomcat 7 as a Windows service. The application is a batch process that handles around 100-200 records at a time and writes their output to a flat file. When we run it in production, each record's status changes from pending to verified (which is correct) and the flat file is generated without any exception. However, when the batch process completes, all transactions are rolled back as if nothing had happened: the records' status reverts to pending, the flat file is empty, and so on. The same application runs fine in the staging and QA environments.
Can someone help me with this problem, or at least tell me where I should start looking?
Since I don't know where the problem is, I am not attaching any source code or error logs. Please let me know which details I should share.
Thank you
Related
I have a web server which is presenting some data from a Cloudera cluster. The data are stored on HBase and the cluster is secured with Kerberos. When I try to perform a get, the server hangs without logging any error.
So far I've tried:
1. Launching the web server from the command line after a kinit (the server is just for testing purposes, so log-in duration and complex procedures to start it are not an issue)
2. The runAs approach described here, both with and without the configuration file import from this answer.
3. The CLASSPATH configuration approach described here
4. Global authentication with UserGroupInformation.loginUserFromKeytab (with and without all the configurations from points 2 and 3)
I've executed all the gets from hbase shell after kiniting with the web server's user and they work in reasonable time (less than a second, while the last time I left the connection open the server didn't respond in over an hour), so it's not a performance or authorization issue. Inside the same web server, with every configuration listed, I'm able to perform other actions, like connecting to HBase and getting the table instances.
I've also checked the logs from Kerberos, HBase and my web server and none of them presents any error. In fact, I'm pretty much afraid that the authorization works, but it just gets stuck in some loop during the get.
UPDATES: After more testing, I've verified that there is a user set right before the call to HBase's API. Also, I've checked and no calls are made to HBase. So this is not an authentication problem, but something else. Did anybody have the same problem?
I previously did not have any issues with this, as I have been deploying an Azure web app from Eclipse without problems. Usually it takes a few minutes, but currently it is taking forever with no real progress from what I can tell. I have tried restarting Eclipse as well as deleting the Azure app and creating a new one. Neither of those worked. Are there some settings I need to reset?
EDIT: Yes, it is a Web App created in Azure, and I previously had no trouble deploying at regular intervals. Last time, however, I wanted to abort the deployment and attempted to do so in Eclipse, but it kept running and seemed to hang, so I shut down Eclipse and tried to deploy again. Instead of taking just a few minutes, it is now stuck at the beginning with no progress.
I then decided to delete the web app, create a new one, and deploy to the new web app from Eclipse, but it is still the same with no progress.
EDIT: Adding Screenshots of the general environment and Azure configuration.
To summarize the comments above: it seems that the Azure Web App waits for a timeout before it recovers or resets a failed deployment connection. One suggestion is to restart the Web App so that the state of the current instance is reset quickly, or to wait a while and then try to reconnect.
In some similar cases, if that does not help, the metadata files in the current workspace may be involved; creating a new workspace and reconnecting from there may work.
I'm looking for a way to test whether my application can survive the database it connects to going down and coming back up again. In that case, the expected behaviour is for the application to first throw an exception and then reconnect successfully once the database is back up.
I would like to do so in an automated integration test (so manually powering the database off and on and watching how the app behaves is out of the question).
Connecting to an in-memory database is an option, but this would devalue the test somewhat since the prod code and test code won't run against exactly the same db and drivers, so it is not ideal.
Another option would be for the test to trigger a process that cuts off the connection between the database and the app for a few seconds (not sure how to do that).
Any other ideas?
Technical stack is Java with Spring (jdbc templates).
I think you can try supplying incorrect db login information to simulate the connection loss.
If you are looking at killing the connection, depending on your setup you could create a batch script to disable your network adapter, and another to switch it back on again once you are ready.
This might help
How can I write a batch file to toggle my network adapters?
and you can use this to call the script
How do I run a batch file from my Java Application?
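The "cut off the connection for a few seconds" idea from the question can also be sketched without touching the network adapter, by routing the application's database traffic through a small TCP relay that the test can stop and restart. The following is only a minimal sketch under assumptions: the class name `ToggleableTcpProxy` and its methods are hypothetical, it binds to the loopback interface only, and it has not been tried against any particular database driver.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical test helper: relays TCP traffic to the database and can be cut on demand. */
public class ToggleableTcpProxy {
    private final String targetHost;
    private final int targetPort;
    private final Set<Socket> openSockets = ConcurrentHashMap.newKeySet();
    private int listenPort;            // 0 means "pick a free port on first start"
    private ServerSocket server;

    public ToggleableTcpProxy(int listenPort, String targetHost, int targetPort) {
        this.listenPort = listenPort;
        this.targetHost = targetHost;
        this.targetPort = targetPort;
    }

    /** Port the application under test should connect to instead of the real database port. */
    public synchronized int getPort() { return listenPort; }

    /** "Database up": start accepting and relaying connections. */
    public synchronized void start() throws IOException {
        ServerSocket s = new ServerSocket();
        s.setReuseAddress(true);       // allow rebinding the same port right after stop()
        s.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), listenPort));
        listenPort = s.getLocalPort(); // remember the port so a restart reuses it
        server = s;
        Thread t = new Thread(() -> acceptLoop(s), "proxy-accept");
        t.setDaemon(true);
        t.start();
    }

    /** "Database down": refuse new connections and drop every open one. */
    public synchronized void stop() {
        closeQuietly(server);
        server = null;
        for (Socket s : openSockets) closeQuietly(s);
        openSockets.clear();
    }

    private void acceptLoop(ServerSocket s) {
        while (!s.isClosed()) {
            Socket client = null;
            try {
                client = s.accept();
                Socket target = new Socket(targetHost, targetPort);
                openSockets.add(client);
                openSockets.add(target);
                relay(client, target);
                relay(target, client);
                if (s.isClosed()) {    // stop() ran while this connection was being set up
                    closeQuietly(client);
                    closeQuietly(target);
                }
            } catch (IOException e) {
                closeQuietly(client);  // accept() ended by stop(), or the target is unreachable
            }
        }
    }

    /** Copies bytes one way until either side closes, then tears down both sockets. */
    private void relay(Socket from, Socket to) {
        Thread t = new Thread(() -> {
            byte[] buf = new byte[8192];
            try {
                InputStream in = from.getInputStream();
                OutputStream out = to.getOutputStream();
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                    out.flush();
                }
            } catch (IOException ignored) {
                // expected when either side closes or stop() cuts the link
            } finally {
                closeQuietly(from);
                closeQuietly(to);
                openSockets.remove(from);
                openSockets.remove(to);
            }
        }, "proxy-relay");
        t.setDaemon(true);
        t.start();
    }

    private static void closeQuietly(Closeable c) {
        if (c == null) return;
        try { c.close(); } catch (IOException ignored) { }
    }
}
```

In an integration test, the data source URL would point at `getPort()` on localhost rather than at the database itself. The test would then call `stop()`, assert that the next query throws, call `start()`, and assert that a later query succeeds. This keeps the production database and driver in the loop, which the in-memory option does not.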
I have a Grails project that uses MySQL for GORM support. When I start the project without MySQL running, it gives me a series of exception messages. This is fine and expected. The problem is that there is no way to catch them so that I can report to users that there is an issue with the system and that they need to wait until it gets fixed.
There are two cases for this problem.
The first is when a running project (deployed inside Tomcat) has its MySQL connection killed, say by stopping the MySQL service. In this case, it keeps throwing database exceptions without any grace. The error catching mechanism fails for me. I have mapped 500 to an error page, but it does not get rendered either. Nginx, which acts as a reverse proxy, eventually displays its own timeout page.
The second is when the project starts loading (say the Tomcat container is started) and the MySQL service is already down. In this case, the project startup seems to be affected to the point that it fails, although Tomcat reports in the log that the war is running. When accessed, it just shows a blank page (saying 404), which is still mysterious because I have mapped 404 to an error page, which does not work either.
Thanks.
If there is no db at startup, the war will not be loaded into Tomcat at all, so you will get the default 404 error pages at that point. Errors are written to the log in this case, but not much else happens. It sounds like your users are hosting their own instances of the app; what I like to do in that case is expose a health page.
The health page can live at a known URL. With no db you will get a 404, but you can document that any non-200 status code is an error and provide info on how to fix it. Also, if you are automating deployment, this check is great for reporting the status of the deployment.
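As a rough illustration of the health page idea, here is a minimal sketch using the JDK's built-in `com.sun.net.httpserver.HttpServer`. It is not Grails code: the class name `HealthPage`, the `/health` path, the `dbProbe` callback, and the choice of 503 for the failure case are all assumptions made for the example. In a real application the probe would be something like a cheap test query against the database.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

/** Hypothetical health page: 200 when the database probe passes, 503 otherwise. */
public class HealthPage {

    /**
     * Starts a small HTTP server exposing /health.
     * dbProbe is an assumed callback that returns true when the database answers.
     */
    public static HttpServer start(int port, BooleanSupplier dbProbe) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health", exchange -> {
            boolean up;
            try {
                up = dbProbe.getAsBoolean();
            } catch (RuntimeException e) {
                up = false;            // a failing probe counts as "database down"
            }
            byte[] body = (up ? "OK" : "DATABASE UNAVAILABLE").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(up ? 200 : 503, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }
}
```

A deployment script or monitoring check would then request the known URL and treat any status other than 200 as a failure, which also covers the 404 returned when the war never loaded.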
I don't know why my Heroku application is in crashed state.
Log and code at https://github.com/jstar88/LibreTitan/blob/master/log.txt
Running application at http://libretitan.herokuapp.com/
The problem is that your database is in an inconsistent state, so Play wants to run DOWNS evolutions, but you have not started your server with -DapplyEvolutions.default=true and -DapplyDownEvolutions.default=true. If this is a production system, I would not recommend doing that until you have read about and fully understand how Play's evolutions work, because DOWNS could cause destructive changes to your data. The documentation can be found here:
http://www.playframework.com/documentation/2.1.0/Evolutions
Since you are running on Heroku, be sure to also set evolutions.use.locks=true so the evolutions will still work if you scale to more than one dyno.