Google Cloud Pub/Sub memory leak on re-deploy (Netty-based) - Java

My Tomcat web service uses real-time developer notifications for Android, which require Google Cloud Pub/Sub. It works flawlessly and all notifications are received immediately. The only problem is that it uses so much RAM that the machine responds far more slowly than it should, and the memory is not released after the application is undeployed. The application uses HttpServlet (specifically Jersey, with contextInitialized and contextDestroyed used to set and clear references), and commenting out the Pub/Sub code reduces memory usage considerably.
Here is the code for subscribing-unsubscribing for Android subscription notifications.
package com.example.webservice;

import com.example.webservice.Log;
import com.google.api.core.ApiService;
import com.google.api.gax.core.FixedCredentialsProvider;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.common.collect.Lists;
import com.google.pubsub.v1.ProjectSubscriptionName;

import java.io.FileInputStream;

public class SubscriptionTest
{
    // for hiding purposes
    private static final String projectId1 = "api-000000000000000-000000";
    private static final String subscriptionId1 = "realtime_notifications_subscription";
    private static final String TAG = "SubscriptionTest";

    private ApiService subscriberService;
    private MessageReceiver receiver;

    // Called when "contextInitialized" is called.
    public void initializeSubscription()
    {
        Log.w(TAG, "Initializing subscriptions...");
        try
        {
            GoogleCredentials credentials1 = GoogleCredentials.fromStream(new FileInputStream("googlekeys/apikey.json"))
                    .createScoped(Lists.newArrayList("https://www.googleapis.com/auth/cloud-platform"));
            ProjectSubscriptionName subscriptionName1 = ProjectSubscriptionName.of(projectId1, subscriptionId1);

            // Instantiate an asynchronous message receiver
            receiver =
                    (message, consumer) ->
                    {
                        consumer.ack();
                        // do processing
                    };

            // Create a subscriber for "my-subscription-id" bound to the message receiver
            Subscriber subscriber1 = Subscriber.newBuilder(subscriptionName1, receiver)
                    .setCredentialsProvider(FixedCredentialsProvider.create(credentials1))
                    .build();
            subscriberService = subscriber1.startAsync();
        }
        catch (Throwable e)
        {
            Log.e(TAG, "Exception while initializing async message receiver.", e);
            return;
        }
        Log.w(TAG, "Subscription initialized. Messages should come now.");
    }

    // Called when "contextDestroyed" is called.
    public void removeSubscription()
    {
        if (subscriberService != null)
        {
            subscriberService.stopAsync();
            Log.i(TAG, "Awaiting subscriber termination...");
            subscriberService.awaitTerminated();
            Log.i(TAG, "Subscriber termination done.");
        }
        subscriberService = null;
        receiver = null;
    }
}
And this is the message Tomcat logs after the application is undeployed (names may not match exactly, but that is not important):
org.apache.catalina.loader.WebappClassLoaderBase.checkThreadLocalMapForLeaks The web application
[example] created a ThreadLocal with key of type [java.lang.ThreadLocal]
(value [java.lang.ThreadLocal@2cb2fc20]) and a value of type
[io.grpc.netty.shaded.io.netty.util.internal.InternalThreadLocalMap]
(value [io.grpc.netty.shaded.io.netty.util.internal.InternalThreadLocalMap@4f4c4b1a])
but failed to remove it when the web application was stopped.
Threads are going to be renewed over time to try and avoid a probable memory leak.
From what I've observed, Netty creates a static ThreadLocal with a strong reference to an InternalThreadLocalMap value, which seems to be what triggers this message. I've tried to clear it with code like the following (it's probably overkill, but none of the answers have worked for me so far, and this doesn't seem to be working either):
InternalThreadLocalMap.destroy();
FastThreadLocal.destroy();

for (Thread thread : Thread.getAllStackTraces().keySet())
{
    if (thread instanceof FastThreadLocalThread)
    {
        // Handle the memory leak that netty causes.
        InternalThreadLocalMap map = ((FastThreadLocalThread) thread).threadLocalMap();
        if (map == null)
            continue;
        for (int i = 0; i < map.size(); i++)
            map.setIndexedVariable(i, null);
        ((FastThreadLocalThread) thread).setThreadLocalMap(null);
    }
}
After the undeploy (or a stop-start), Tomcat detects a memory leak if I click Find leaks (obviously). The problem is that the RAM and CPU that were being used are not released, apparently because the subscription is not closed properly. Re-deploying the app increases the used RAM further on every deploy: if it uses 200 MB at first, after the second deploy it grows to 400, then 600, then 800 MB, and so on without limit until the machine slows down enough to die.
It is a serious issue and I have no idea how to solve it. The stop methods are called as defined, and awaitTerminated is also called and returns immediately (meaning the receiver has actually stopped listening), but the RAM behind it is not released.
So far I've only seen questions about the Python clients (ref 1, ref 2); nobody seems to mention the Java client, and I'm starting to lose hope in this setup.
I've opened an issue about this problem as well.
What should I do to resolve this issue? Any help is appreciated, thank you very much.

I don't know if it will fully fix your issue, but you appear to be leaking some memory by not closing the FileInputStream.
The first option is to extract the FileInputStream into a variable and call its close() method after you are done reading the content.
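For illustration only, a sketch of that first option (the stream, credential, and logging names mirror the question's code):

FileInputStream stream = null;
try {
    stream = new FileInputStream("googlekeys/apikey.json");
    GoogleCredentials credentials1 = GoogleCredentials.fromStream(stream)
            .createScoped(Lists.newArrayList("https://www.googleapis.com/auth/cloud-platform"));
    // ... build and start the subscriber as before ...
} catch (Exception e) {
    Log.e(TAG, "Exception while initializing async message receiver.", e);
    return;
} finally {
    if (stream != null) {
        try {
            stream.close(); // release the file handle even if credential loading failed
        } catch (IOException e) {
            Log.e(TAG, "Failed to close credentials stream.", e);
        }
    }
}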
A second (and better) option for working with these kinds of streams is try-with-resources. Since FileInputStream implements the AutoCloseable interface, it will be closed automatically when the try-with-resources block exits.
Example:
try (FileInputStream stream = new FileInputStream("googlekeys/apikey.json")) {
    GoogleCredentials credentials1 = GoogleCredentials.fromStream(stream)
            .createScoped(Lists.newArrayList("https://www.googleapis.com/auth/cloud-platform"));
    // ...
} catch (Exception e) {
    Log.e(TAG, "Exception while initializing async message receiver.", e);
    return;
}

Related

Unneeded verticle overhead and undeploy

Does Vert.x have any overhead for deployed verticles? Is there any reason to undeploy them once they become unneeded?
Please look at MyVerticle below: its only purpose is to load data when the app launches, and after loading the verticle is no longer needed. Is it sufficient to call consumer.unregister()? Are there any reasons to undeploy MyVerticle?
public class MyVerticle extends AbstractVerticle {

    private MessageConsumer consumer;

    @Override
    public void start() {
        consumer = vertx.eventBus().consumer(AppConstants.SOME_ADDRESS, this::load);
    }

    public void load(Message message) {
        LocalMap<Short, String> map = vertx.sharedData().getLocalMap(AppConstants.MAP_NAME);
        try (
            DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(AppConstants.INDEX_PATH)))
        ) {
            while (true) {
                map.put(
                    in.readShort(),
                    in.readUTF()
                );
            }
        } catch (EOFException eof) {
            message.reply(AppConstants.SUCCESS);
        } catch (IOException ioe) {
            message.fail(100, "Fail to load index in memory");
            throw new RuntimeException("There are no recovery policy", ioe);
        } finally {
            // is this sufficient, or do I need to undeploy programmatically?
            consumer.unregister();
        }
    }
}
Verticles can be seen as applications running on Vert.x. Deploying and undeploying them only has a small overhead if, for example, you're doing high availability or failover: in that case Vert.x needs to keep track of deployed instances, monitor for failures, re-spawn verticles on other nodes, etc.
Undeploying will also allow you to perform any clean-up (although you're not using it in your example) by calling the stop() method.
When not running with HA in mind, undeploying only allows you to recover any memory that was allocated by your verticle but is no longer referenced (plus the memory used to keep internal track of deployed verticles, which should be negligible, a single object reference).
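For completeness, here is a minimal sketch of what that clean-up could look like if you do keep the undeploy route; it reuses the consumer field from the question and only adds a stop() override:

public class MyVerticle extends AbstractVerticle {

    private MessageConsumer<Object> consumer;

    @Override
    public void start() {
        consumer = vertx.eventBus().consumer(AppConstants.SOME_ADDRESS, this::load);
    }

    @Override
    public void stop() {
        // Called by Vert.x when the verticle is undeployed; release the consumer here.
        if (consumer != null) {
            consumer.unregister();
        }
    }

    private void load(Message<Object> message) {
        // ... load the index into the shared LocalMap as in the question ...
    }
}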

Application doesn't start up with tomcat 8 due to memory leak [duplicate]

I'm experiencing a memory leak due to orphaned threads in Tomcat. Particularly, it seems that Guice and the JDBC driver are not closing threads.
Aug 8, 2012 4:09:19 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: A web application appears to have started a thread named [com.google.inject.internal.util.$Finalizer] but has failed to stop it. This is very likely to create a memory leak.
Aug 8, 2012 4:09:19 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: A web application appears to have started a thread named [Abandoned connection cleanup thread] but has failed to stop it. This is very likely to create a memory leak.
I know this is similar to other questions (such as this one), but in my case the answer of "don't worry about it" won't be sufficient, as it is causing problems for me. I have a CI server which regularly redeploys this application, and after 6-10 reloads the CI server will hang because Tomcat is out of memory.
I need to be able to clear up these orphaned threads so I can run my CI server more reliably. Any help would be appreciated!
I just dealt with this problem myself. Contrary to some other answers, I do not recommend issuing the t.stop() command. This method has been deprecated, and for good reason; see Oracle's reasons for deprecating it.
However, there is a solution for removing this error without needing to resort to t.stop()...
You can use most of the code @Oso provided; just replace the following section:
Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
Thread[] threadArray = threadSet.toArray(new Thread[threadSet.size()]);
for (Thread t : threadArray) {
    if (t.getName().contains("Abandoned connection cleanup thread")) {
        synchronized (t) {
            t.stop(); //don't complain, it works
        }
    }
}
Replace it using the following method provided by the MySQL driver:
try {
    AbandonedConnectionCleanupThread.shutdown();
} catch (InterruptedException e) {
    logger.warn("SEVERE problem cleaning up: " + e.getMessage());
    e.printStackTrace();
}
This should properly shutdown the thread, and the error should go away.
I've had the same issue, and as Jeff says, the "don't worry about it" approach was not the way to go.
I wrote a ServletContextListener that stops the hung thread when the context is being destroyed, and then registered that listener in the web.xml file.
I already know that stopping a thread is not an elegant way to deal with it, but otherwise the server keeps crashing after two or three deploys (restarting the app server is not always possible).
The class I created is:
public class ContextFinalizer implements ServletContextListener {

    private static final Logger LOGGER = LoggerFactory.getLogger(ContextFinalizer.class);

    @Override
    public void contextInitialized(ServletContextEvent sce) {
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        Driver d = null;
        while (drivers.hasMoreElements()) {
            try {
                d = drivers.nextElement();
                DriverManager.deregisterDriver(d);
                LOGGER.warn(String.format("Driver %s deregistered", d));
            } catch (SQLException ex) {
                LOGGER.warn(String.format("Error deregistering driver %s", d), ex);
            }
        }
        Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
        Thread[] threadArray = threadSet.toArray(new Thread[threadSet.size()]);
        for (Thread t : threadArray) {
            if (t.getName().contains("Abandoned connection cleanup thread")) {
                synchronized (t) {
                    t.stop(); //don't complain, it works
                }
            }
        }
    }
}
After creating the class, register it in the web.xml file:
<web-app ...>
    <listener>
        <listener-class>path.to.ContextFinalizer</listener-class>
    </listener>
</web-app>
The least invasive workaround is to force initialisation of the MySQL JDBC driver from code outside of the webapp's classloader.
In tomcat/conf/server.xml, modify (inside the Server element):
<Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener" />
to
<Listener className="org.apache.catalina.core.JreMemoryLeakPreventionListener"
classesToInitialize="com.mysql.jdbc.NonRegisteringDriver" />
With mysql-connector-java-8.0.x use com.mysql.cj.jdbc.NonRegisteringDriver instead
This assumes you put the MySQL JDBC driver into tomcat's lib directory and not inside your webapp.war's WEB-INF/lib directory, as the whole point is to load the driver before and independently of your webapp.
References:
http://bugs.mysql.com/bug.php?id=68556#c400606
http://tomcat.apache.org/tomcat-7.0-doc/config/listeners.html#JRE_Memory_Leak_Prevention_Listener_-_org.apache.catalina.core.JreMemoryLeakPreventionListener
http://markmail.org/message/dmvlkps7lbgpngil
com.mysql.jdbc.NonRegisteringDriver source v5.1
com.mysql.cj.jdbc.NonRegisteringDriver source v8.0
Changes in connector/J v8.0
Effective from MySQL connector 5.1.23 onwards, a method is provided to shut the abandoned connection cleanup thread down, AbandonedConnectionCleanupThread.shutdown.
However, we don't want direct dependencies in our code on the otherwise opaque JDBC driver code, so my solution is to use reflection to find the class and method and invoke it if found. The following complete code snippet is all that's needed, executed in the context of the class loader that loaded the JDBC driver:
try {
    Class<?> cls = Class.forName("com.mysql.jdbc.AbandonedConnectionCleanupThread");
    Method mth = (cls == null ? null : cls.getMethod("shutdown"));
    if (mth != null) { mth.invoke(null); }
}
catch (Throwable thr) {
    thr.printStackTrace();
}
This cleanly ends the thread if the JDBC driver is a sufficiently recent version of the MySQL connector and otherwise does nothing.
Note it has to be executed in the context of the class loader because the thread is a static reference; if the driver class is not being or has not already been unloaded when this code is run then the thread will not be running for subsequent JDBC interactions.
I took the best parts of the answers above and combined them into an easily extensible class. This combines Oso's original suggestion with Bill's driver improvement and Software Monkey's reflection improvement. (I liked the simplicity of Stephan L's answer too, but sometimes modifying the Tomcat environment itself is not a good option, especially if you have to deal with autoscaling or migration to another web container.)
Instead of directly referring to the class name, thread name, and stop method, I also encapsulated these into a private inner ThreadInfo class. Using a list of these ThreadInfo objects, you can include additional troublesome threads to be shut down with the same code. This is a more complex solution than most people likely need, but it should work more generally when you need that.
import java.lang.reflect.Method;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.List;
import java.util.Set;
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Context finalization to close threads (MySQL memory leak prevention).
 * This solution combines the best techniques described in the linked Stack
 * Overflow answer.
 * @see Tomcat Guice/JDBC Memory Leak
 */
public class ContextFinalizer implements ServletContextListener {

    private static final Logger LOGGER =
        LoggerFactory.getLogger(ContextFinalizer.class);

    /**
     * Information for cleaning up a thread.
     */
    private class ThreadInfo {

        /**
         * Name of the thread's initiating class.
         */
        private final String name;

        /**
         * Cue identifying the thread.
         */
        private final String cue;

        /**
         * Name of the method to stop the thread.
         */
        private final String stop;

        /**
         * Basic constructor.
         * @param n Name of the thread's initiating class.
         * @param c Cue identifying the thread.
         * @param s Name of the method to stop the thread.
         */
        ThreadInfo(final String n, final String c, final String s) {
            this.name = n;
            this.cue = c;
            this.stop = s;
        }

        /**
         * @return the name
         */
        public String getName() {
            return this.name;
        }

        /**
         * @return the cue
         */
        public String getCue() {
            return this.cue;
        }

        /**
         * @return the stop
         */
        public String getStop() {
            return this.stop;
        }
    }

    /**
     * List of information on threads required to stop. This list may be
     * expanded as necessary.
     */
    private List<ThreadInfo> threads = Arrays.asList(
        // Special cleanup for MySQL JDBC Connector.
        new ThreadInfo(
            "com.mysql.jdbc.AbandonedConnectionCleanupThread", //$NON-NLS-1$
            "Abandoned connection cleanup thread", //$NON-NLS-1$
            "shutdown" //$NON-NLS-1$
        )
    );

    @Override
    public void contextInitialized(final ServletContextEvent sce) {
        // No-op.
    }

    @Override
    public final void contextDestroyed(final ServletContextEvent sce) {
        // Deregister all drivers.
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver d = drivers.nextElement();
            try {
                DriverManager.deregisterDriver(d);
                LOGGER.info(
                    String.format(
                        "Driver %s deregistered", //$NON-NLS-1$
                        d
                    )
                );
            } catch (SQLException e) {
                LOGGER.warn(
                    String.format(
                        "Failed to deregister driver %s", //$NON-NLS-1$
                        d
                    ),
                    e
                );
            }
        }
        // Handle remaining threads.
        Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
        Thread[] threadArray = threadSet.toArray(new Thread[threadSet.size()]);
        for (Thread t : threadArray) {
            for (ThreadInfo i : this.threads) {
                if (t.getName().contains(i.getCue())) {
                    synchronized (t) {
                        try {
                            Class<?> cls = Class.forName(i.getName());
                            if (cls != null) {
                                Method mth = cls.getMethod(i.getStop());
                                if (mth != null) {
                                    mth.invoke(null);
                                    LOGGER.info(
                                        String.format(
                                            "Connection cleanup thread %s shutdown successfully.", //$NON-NLS-1$
                                            i.getName()
                                        )
                                    );
                                }
                            }
                        } catch (Throwable thr) {
                            LOGGER.warn(
                                String.format(
                                    "Failed to shutdown connection cleanup thread %s: %s", //$NON-NLS-1$
                                    i.getName(),
                                    thr.getMessage()
                                )
                            );
                            thr.printStackTrace();
                        }
                    }
                }
            }
        }
    }
}
I went a step further than @Oso and improved the code above in two ways.
First, I added the Finalizer thread to the need-to-kill check:
for (Thread t : threadArray) {
    if (t.getName().contains("Abandoned connection cleanup thread")
            || t.getName().matches("com\\.google.*Finalizer")
    ) {
        synchronized (t) {
            logger.warn("Forcibly stopping thread to avoid memory leak: " + t.getName());
            t.stop(); //don't complain, it works
        }
    }
}
Second, I sleep for a little while to give the threads time to stop. Without that, Tomcat kept complaining:
try {
    Thread.sleep(1000);
} catch (InterruptedException e) {
    logger.debug(e.getMessage(), e);
}
Bill's solution looks good, however I found another solution directly in MySQL bug reports:
[5 Jun 2013 17:12] Christopher Schultz
Here is a much better workaround until something else changes.
Enable Tomcat's JreMemoryLeakPreventionListener (enabled by default on Tomcat 7), and add this attribute to the <Listener> element:
classesToInitialize="com.mysql.jdbc.NonRegisteringDriver"
If "classesToInitialize" is already set on your <Listener>, just add NonRegisteringDriver to the existing value, separated by a comma.
and the answer:
[8 Jun 2013 21:33] Marko Asplund
I did some testing with the JreMemoryLeakPreventionListener / classesToInitialize workaround (Tomcat 7.0.39 + MySQL Connector/J 5.1.25).
Before applying the workaround thread dumps listed multiple AbandonedConnectionCleanupThread instances after redeploying the webapp several times. After applying the workaround there's only one AbandonedConnectionCleanupThread instance.
I had to modify my app, though, and move MySQL driver from the webapp to Tomcat lib.
Otherwise, the classloader is unable to load com.mysql.jdbc.NonRegisteringDriver at Tomcat startup.
I hope this helps everyone who is still fighting with this issue...
It seems this was fixed in 5.1.41. You could upgrade Connector/J to 5.1.41 or newer.
https://dev.mysql.com/doc/relnotes/connector-j/5.1/en/news-5-1-41.html
The implementation of AbandonedConnectionCleanupThread has now been improved, so that there are now four ways for developers to deal with the situation:
When the default Tomcat configuration is used and the Connector/J jar is put into a local library directory, the new built-in application detector in Connector/J now detects the stopping of the web application within 5 seconds and kills AbandonedConnectionCleanupThread. Any unnecessary warnings about the thread being unstoppable are also avoided. If the Connector/J jar is put into a global library directory, the thread is left running until the JVM is unloaded.
When Tomcat's context is configured with the attribute clearReferencesStopThreads="true", Tomcat is going to stop all spawned threads when the application stops unless Connector/J is being shared with other web applications, in which case Connector/J is now protected against an inappropriate stop by Tomcat; the warning about the non-stoppable thread is still issued into Tomcat's error log.
When a ServletContextListener is implemented within each web application that calls AbandonedConnectionCleanupThread.checkedShutdown() on context destruction, Connector/J now, again, skips this operation if the driver is potentially shared with other applications. No warning about the thread being unstoppable is issued to Tomcat's error log in this case.
When AbandonedConnectionCleanupThread.uncheckedShutdown() is called, the AbandonedConnectionCleanupThread is closed even if Connector/J is shared with other applications. However, it may not be possible to restart the thread afterwards.
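As an illustration of the third option above, a minimal sketch of such a listener might look like this (the class name is a placeholder, and it assumes Connector/J 5.1.41+ under the com.mysql.jdbc package):

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import com.mysql.jdbc.AbandonedConnectionCleanupThread;

public class ConnectorJCleanupListener implements ServletContextListener {

    @Override
    public void contextInitialized(ServletContextEvent sce) {
        // Nothing to do on startup.
    }

    @Override
    public void contextDestroyed(ServletContextEvent sce) {
        try {
            // Connector/J skips this internally if the driver may be shared with other webapps.
            AbandonedConnectionCleanupThread.checkedShutdown();
        } catch (Throwable t) {
            t.printStackTrace();
        }
    }
}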
If you look at the source code, they called setDaemon(true) on the thread, so it won't block shutdown:
Thread t = new Thread(r, "Abandoned connection cleanup thread");
t.setDaemon(true);
See "To prevent a memory leak, the JDBC Driver has been forcibly unregistered". Bill's answer deregisters all Driver instances, including instances that may belong to other web applications. I have extended Bill's answer with a check that the Driver instance belongs to the right ClassLoader.
Here is the resulting code (in a separate method, because my contextDestroyed has other things to do):
// See https://stackoverflow.com/questions/25699985/the-web-application-appears-to-have-started-a-thread-named-abandoned-connect
// and
// https://stackoverflow.com/questions/3320400/to-prevent-a-memory-leak-the-jdbc-driver-has-been-forcibly-unregistered/23912257#23912257
private void avoidGarbageCollectionWarning()
{
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    Enumeration<Driver> drivers = DriverManager.getDrivers();
    Driver d = null;
    while (drivers.hasMoreElements()) {
        try {
            d = drivers.nextElement();
            if (d.getClass().getClassLoader() == cl) {
                DriverManager.deregisterDriver(d);
                logger.info(String.format("Driver %s deregistered", d));
            }
            else {
                logger.info(String.format("Driver %s not deregistered because it might be in use elsewhere", d.toString()));
            }
        }
        catch (SQLException ex) {
            logger.warning(String.format("Error deregistering driver %s, exception: %s", d.toString(), ex.toString()));
        }
    }
    try {
        AbandonedConnectionCleanupThread.shutdown();
    }
    catch (InterruptedException e) {
        logger.warning("SEVERE problem cleaning up: " + e.getMessage());
        e.printStackTrace();
    }
}
I wonder whether the call AbandonedConnectionCleanupThread.shutdown() is safe. Can it interfere with other web applications? I hope not, because the AbandonedConnectionCleanupThread.run() method is not static but the AbandonedConnectionCleanupThread.shutdown() method is.

Workaround for Java bug which causes crash dump

A program that I've developed is crashing the JVM occasionally due to this bug: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8029516. Unfortunately the bug has not been resolved by Oracle and the bug report says that there are no known workarounds.
I've tried to modify the example code from the bug report by calling .register(sWatchService, eventKinds) in the KeyWatcher thread instead, adding all pending register requests to a list that I loop through in the KeyWatcher thread, but it's still crashing. I'm guessing this just has the same effect as synchronizing on sWatchService (as the submitter of the bug report tried).
Can you think of any way to get around this?
From comments:
It appears that we have an issue with I/O cancellation when there is a pending ReadDirectoryChangesW outstanding.
The statement and example code indicate that the bug is triggered when:
There is a pending event that has not been consumed (it may or may not be visible to WatchService.poll() or WatchService.take())
WatchKey.cancel() is called on the key
This is a nasty bug with no universal workaround; the approach depends on the specifics of your application. Consider pooling watches in a single place so you don't need to call WatchKey.cancel(). If at some point the pool becomes too large, close the entire WatchService and start over. Something similar to:
import java.io.IOException;
import java.nio.file.ClosedWatchServiceException;
import java.nio.file.FileSystems;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent.Kind;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FileWatcherService {

    static Kind<?>[] allEvents = new Kind<?>[] {
        StandardWatchEventKinds.ENTRY_CREATE,
        StandardWatchEventKinds.ENTRY_DELETE,
        StandardWatchEventKinds.ENTRY_MODIFY
    };

    WatchService ws;

    // Keep track of paths and registered listeners
    Map<String, List<FileChangeListener>> listeners = new ConcurrentHashMap<String, List<FileChangeListener>>();
    Map<WatchKey, String> keys = new ConcurrentHashMap<WatchKey, String>();
    boolean toStop = false;

    public interface FileChangeListener {
        void onChange();
    }

    public void addFileChangeListener(String path, FileChangeListener l) throws IOException {
        if (!listeners.containsKey(path)) {
            listeners.put(path, new ArrayList<FileChangeListener>());
            keys.put(Paths.get(path).register(ws, allEvents), path);
        }
        listeners.get(path).add(l);
    }

    public void removeFileChangeListener(String path, FileChangeListener l) {
        if (listeners.containsKey(path))
            listeners.get(path).remove(l);
    }

    public void start() throws IOException {
        ws = FileSystems.getDefault().newWatchService();
        new Thread(new Runnable() {
            public void run() {
                while (!toStop) {
                    try {
                        WatchKey key = ws.take();
                        for (FileChangeListener l : listeners.get(keys.get(key)))
                            l.onChange();
                    } catch (InterruptedException | ClosedWatchServiceException e) {
                        return; // stopped or closed, exit the watch loop
                    }
                }
            }
        }).start();
    }

    public void stop() throws IOException {
        toStop = true;
        ws.close();
    }
}
I've managed to create a workaround though it's somewhat ugly.
The bug is in the JDK method WindowsWatchKey.invalidate(), which releases the native buffer while subsequent calls may still access it. This one-liner fixes the problem by delaying buffer clean-up until GC.
Here is a compiled patch to JDK. In order to apply it add the following Java command-line flag:
-Xbootclasspath/p:jdk-8029516-patch.jar
If patching JDK is not an option in your case, there is still a workaround on the application level. It relies on the knowledge of Windows WatchService internal implementation.
public class JDK_8029516 {
    private static final Field bufferField = getField("sun.nio.fs.WindowsWatchService$WindowsWatchKey", "buffer");
    private static final Field cleanerField = getField("sun.nio.fs.NativeBuffer", "cleaner");
    private static final Cleaner dummyCleaner = Cleaner.create(Thread.class, new Thread());

    private static Field getField(String className, String fieldName) {
        try {
            Field f = Class.forName(className).getDeclaredField(fieldName);
            f.setAccessible(true);
            return f;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void patch(WatchKey key) {
        try {
            cleanerField.set(bufferField.get(key), dummyCleaner);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
Call JDK_8029516.patch(watchKey) right after the key is registered, and it will prevent watchKey.cancel() from releasing the native buffer prematurely.
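For example (the watched directory and event kinds here are arbitrary, and watchService is an existing WatchService):

WatchKey key = Paths.get("C:\\watched-dir").register(watchService,
        StandardWatchEventKinds.ENTRY_CREATE,
        StandardWatchEventKinds.ENTRY_MODIFY,
        StandardWatchEventKinds.ENTRY_DELETE);
JDK_8029516.patch(key); // apply the workaround right after registration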
You might not be able to work around the problem itself, but you could deal with the error and handle it. I don't know your specific situation, but I imagine the biggest issue is the crash of the whole JVM. Putting everything in a try block does not work, because you cannot catch a JVM crash.
Not knowing more about your project makes it difficult to suggest a good/acceptable solution, but maybe this could be an option: Do all the file watching stuff in a separate JVM process. From your main process start a new JVM (e.g. using ProcessBuilder.start()). When the process terminates (i.e. the newly started JVM crashes), restart it. Obviously you need to be able to recover, i.e. you need to keep track of what files to watch and you need to keep this data in your main process too.
Now the biggest remaining part is to implement some communication between the main process and the file watching process. This could be done using standard input/output of the file watching process or using a Socket/ServerSocket or some other mechanism.
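A rough sketch of that supervision loop, assuming the watcher is packaged as a separate JAR (all names here are placeholders):

import java.io.File;

public class WatcherSupervisor {

    public static void main(String[] args) throws Exception {
        while (true) {
            // Start the file-watching JVM; "file-watcher.jar" stands in for your watcher process.
            Process watcher = new ProcessBuilder("java", "-jar", "file-watcher.jar")
                    .redirectErrorStream(true)
                    .redirectOutput(new File("watcher.log"))
                    .start();

            int exitCode = watcher.waitFor(); // blocks until the watcher JVM exits (e.g. crashes)
            System.err.println("Watcher exited with code " + exitCode + ", restarting...");

            // At this point the main process must re-send the list of files to watch.
        }
    }
}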

tomcat 7 + freebsd 7.2 (32bits) multithread app throws OutOfMemoryError: could not create native thread

Situation: there is a server based on FreeBSD 7.2 with Tomcat 7 installed. Tomcat 7 runs a huge multithreaded server application which reads data through a ServerSocket from a number of different devices, so there are listener threads on each port for the different device types. When a device is authorized, the listener creates a new thread that collects data from that device (one device per thread); the incoming data is not large, by the way. There is also a WebSocket listener for clients that works the same way: one thread per client. This all worked really well until the number of devices increased from 200 to 1500 (the number of clients rose too, to nearly 300).
Ideally the process must work with more than 3000 active threads, but it crashes.
Here is code for thread pool on which application based.
public class MultiThreadPool {

    private static final int INITIAL_POOL_SIZE = 1250;

    private final LinkedBlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>();

    private final ThreadPoolExecutor exec = new ThreadPoolExecutor(INITIAL_POOL_SIZE, Integer.MAX_VALUE, 1, TimeUnit.MINUTES, queue, new ThreadFactory() {
        @Override
        public Thread newThread(Runnable r) {
            Thread th = new Thread(r);
            //th.setDaemon(true);
            th.setContextClassLoader(null);
            th.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
                @Override
                public void uncaughtException(Thread t, Throwable e) {
                    //t.setDaemon(true);
                    //throw new UnsupportedOperationException("Not supported yet.");
                }
            });
            return th;
        }
    });

    private static class SingletonHolder {
        private final static MultiThreadPool instance = new MultiThreadPool();
    }

    private static MultiThreadPool getInstance() {
        return SingletonHolder.instance;
    }

    public MultiThreadPool() {
    }

    public static void initPool() {
        getInstance();
    }

    public static void executeRunnable(Runnable r) {
        try {
            getInstance().exec.execute(r);
        } catch (OutOfMemoryError e) {
            System.out.println("** Out of memory: " + e.getMessage());
        }
    }

    public static void stopPool() {
        getInstance().exec.shutdownNow();
        //getInstance().executor.shutdownNow();
    }
}
All tasks are run through the executeRunnable(Runnable r) method. Increasing INITIAL_POOL_SIZE (the core size of the ThreadPoolExecutor) to 1300, for example, causes OutOfMemoryError: could not create native thread. If INITIAL_POOL_SIZE stays at 1250, the queued tasks beyond the core threads do not run at all until some of the core threads complete, which means clients stop being served in both situations.
I've already tried tuning the JVM with -Xss, -server, -Xms, -Xmx, perm size, etc.
Even the OS hard limits were increased using loader.conf and login.conf, but none of this helps very much, or it makes things worse. Maybe there are other limits in FreeBSD which can be raised by building the kernel with a different configuration?
Anyway, I need to increase the number of active threads on the server so it can process the full set of devices and clients. By the way, the devices send data at least every 5 seconds.
Please show me the right way to fix this problem!
PS: This app was built by another guy, so I can't say a lot about it because it's really huge. But I will try to provide more information if you need it.
The OS limits the number of threads that can be created, regardless of memory. Depending on which OS you're on, check that limit and whether you can increase it.
For instance, on macOS you check it via:
sysctl kern.num_threads
kern.num_threads: 10240
For Linux:
cat /proc/sys/kernel/threads-max
3500
and so forth.
You can address this in two ways:
Increase the limit.
Use a pool of threads instead of creating thousands of them (generally I like that idea better); see the sketch below.
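A minimal sketch of the second approach, using a single bounded pool shared by all devices (the pool size is an arbitrary example to be tuned for your hardware):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DeviceWorkers {

    // One bounded pool instead of one native thread per device/client.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(64);

    public static void handleDevice(Runnable deviceTask) {
        // Queued if all workers are busy, instead of failing to create a new native thread.
        POOL.submit(deviceTask);
    }

    public static void shutdown() {
        POOL.shutdownNow();
    }
}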

Tomcat web application stops automatically

I have a dedicated server running CentOS 5.9 and Apache Tomcat 5.5.36. I have written a Java web application which runs every minute to collect data from multiple sensors. I am using a ScheduledExecutorService to execute the threads (one thread for each sensor every minute, and there can be more than a hundred sensors). The flow of each thread is:
Collect sensor information from the database.
Sends the command to the instrument to collect data.
Update the database with the data values.
There is another application that checks the database every minute and sends alerts to the users (if necessary). I have monitored the application using JVisualVM and can't find any memory leak in any thread. The application works fine, but after some time (24 to 48 hours) it stops working. I can't figure out what the problem could be: is it a server configuration problem, too many threads, or something else?
Does anyone have any idea what might be going wrong, or has anyone done this kind of work before? Please help, thanks.
UPDATE : including code
public class Scheduler {

    private final ScheduledExecutorService scheduler =
        Executors.newScheduledThreadPool(1);

    public void startProcess(int start) {
        final Runnable uploader = new Runnable() {
            @SuppressWarnings("rawtypes")
            public void run()
            {
                // Select data from the database
                ArrayList dataList = getData();
                for (int i = 0; i < dataList.size(); i++)
                {
                    String args = dataList.get(i).toString();
                    ExecutorThread comThread = new ExecutorThread(args...);
                    comThread.start();
                }
            }
        };
        scheduler.scheduleAtFixedRate(uploader, 0, 60, TimeUnit.SECONDS);
    }
}

public class ExecutorThread extends Thread {

    // private variables...

    public ExecutorThread(args..)
    {
        // Initialise private variables
    }

    public void run()
    {
        // Collect data from sensor
        // Update database
    }
}
Can't say much without the code, but you need to be sure that your threads always exit properly: they don't hang in memory on an exception, they close their database connections, etc.
Also, for monitoring the application, you can take a thread dump every so often to see how many threads the application creates.
Another suggestion is to configure Tomcat (the JVM) to take a heap dump on OutOfMemoryError. If that's the issue, you'll be able to analyze what is filling up the memory.
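For example, a heap dump on OutOfMemoryError can be requested with the standard HotSpot flags added to the JVM options Tomcat is started with (e.g. in CATALINA_OPTS; the dump path is just an example):

-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/tomcat-oom.hprof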
Take heed of this innocuous line from the ScheduledExecutorService.schedule... Javadoc
If any execution of the task encounters an exception, subsequent executions are suppressed.
This means that if you are running into an Exception at some point and not handling it, the Exception will propagate into the ScheduledExecutorService and it will kill your task.
To avoid this problem you need to make sure the entire Runnable is wrapped in a try...catch and Exceptions are guaranteed to never be unhandled.
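A sketch of that wrapping, applied to the uploader task from the question (the printStackTrace call is just a placeholder for whatever error handling you prefer):

final Runnable uploader = new Runnable() {
    public void run() {
        try {
            // ... existing body: read the sensor list and start the worker threads ...
        } catch (Throwable t) {
            // Swallow everything; an escaping exception would silently cancel the scheduled task.
            t.printStackTrace();
        }
    }
};
scheduler.scheduleAtFixedRate(uploader, 0, 60, TimeUnit.SECONDS);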
You can also extend the ScheduledExecutorService (also mentioned in the javadoc) to handle uncaught exceptions :-
final ScheduledExecutorService ses = new ScheduledThreadPoolExecutor(10) {
    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        if (t == null && r instanceof Future<?>) {
            try {
                Object result = ((Future<?>) r).get();
            } catch (CancellationException ce) {
                t = ce;
            } catch (ExecutionException ee) {
                t = ee.getCause();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // ignore/reset
            }
        }
        if (t != null) {
            System.out.println(t);
        }
    }
};
Here the afterExecute method simply prints the Throwable with System.out.println, but it could do other things: alert users, restart tasks, etc.
