Netty running at 100% CPU

Netty running at 100% CPU - java

I've seen other references to this issue, such as here and here, although these reference different versions of Netty. Tried this using the latest in the 4.0 branch (4.0.29) and in the 5.0 alpha branch (5.0-Alpha3). Local (non-linux) jdk 1.8.040, fine. Remote (Linux) with java jdk 1.8.025-b17 get 100% cpu.
Linux kernel version 2.6.32.
Tried using EpollEventLoopGroup();
Tried calling
workerGroup = new NioEventLoopGroup();
workerGroup.rebuildSelectors();
Can anyone offer any suggestions? I've seen references to this bug w/different versions of Netty. Jdk bug? Netty bug? Process goes to 100% immediately on startup and stays there.
Update: Upgraded to java 1.8.045, same difference.
JStack output of all runnable threads (there's some rabbitmq stuff in there, only included for completeness - that's common to other applications, and is not the cause of the problem).

As we identified in the comments, the thread that consumed CPU is busy in the following stack:
"pool-9-thread-1" #49 prio=5 os_prio=0 tid=0x00007ffd508e8000 nid=0x3a0c runnable [0x00007ffd188b6000]
java.lang.Thread.State: RUNNABLE
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.poll(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I have managed to reproduce a similar behavior by creating a ScheduledThreadPoolExecutor, configuring it to allow core threads to time out, and scheduling a lot of repeating tasks with a short delay. It yields a lot of CPU on my machine and the jstack output is similar (sometimes deeper into the poll method). This code reproduces it:
ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
executor.setKeepAliveTime(1, TimeUnit.MINUTES);
executor.allowCoreThreadTimeOut(true);
for (long i = 0; i < 1000; i++) {
executor.scheduleAtFixedRate(new Runnable() {
#Override
public void run() {
}
}, 0, 1, TimeUnit.NANOSECONDS);
}
Now we just have to identify which code sets up a broken ScheduledThreadPoolExecutor. I searched through the RabbitMQ and Netty source code without finding anything obvoius. Could it be something you do in your own code?
Edit: As mentioned in the comments, the root cause was a ScheduledThreadPoolExecutor initialized with 0 which apparently can cause a CPU spin om some platforms. This was done in the OP's code.

Related

Fork Join pool hangs

The case is that an application hangs infinitely from time to time.
Seems that the bug sits in the following snippet:
ForkJoinPool pool = new ForkJoinPool(1); // parallelism = 1
List<String> entries = ...;
pool.submit(() -> {
entries.stream().parallel().forEach(entry -> {
// An I/O op.
...
});
}).get();
Thread pool-4-thread-1 that executes the code freezes on get():
"pool-4-thread-1" #35 prio=5 os_prio=0 tid=0x00002b42e4013800 nid=0xb7d1 in Object.wait() [0x00002b427b72f000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.concurrent.ForkJoinTask.externalInterruptibleAwaitDone(ForkJoinTask.java:367)
- locked <0x00000000e08b68b8> (a java.util.concurrent.ForkJoinTask$AdaptedRunnableAction)
at java.util.concurrent.ForkJoinTask.get(ForkJoinTask.java:1001)
...other app methods
One can assume that the task passed to submit() executes too long.
But surprisingly there is no ForkJoinPool-N-worker-N occurrences in the thread dump, so looks like the pool doesn't perform any computations!
How is that possible? If no tasks are executed by the pool, why pool-4-thread-1 thread waits inside get()?
P.S. I know that it's not recommended to execute I/O-related tasks in ForkJoinPool, but still interested in the root of the problem.
Update. When parallelism is set to value greater than 1, no problems are detected.

Set parallelism = N where N > 1 solved the problem.
Strange thing but seems that there is some bug in ForkJoinPool similar to what is stated here.

Spark on mesos Executors failing with OOM Errors

We are using spark 2.0.2 managed by a DCOS system that fetch data from a Kafka 1.0.0 messaging service and writes parquet in a hdfs system.
Every thing was working ok, but when we increase the number of topics in Kafka, our spark executors began to crash constantly with OOM errors:
java.lang.OutOfMemoryError: Java heap space
at org.apache.parquet.column.values.dictionary.IntList.initSlab(IntList.java:90)
at org.apache.parquet.column.values.dictionary.IntList.<init>(IntList.java:86)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter.<init>(DictionaryValuesWriter.java:93)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter$PlainDoubleDictionaryValuesWriter.<init>(DictionaryValuesWriter.java:422)
at org.apache.parquet.column.ParquetProperties.dictionaryWriter(ParquetProperties.java:139)
at org.apache.parquet.column.ParquetProperties.dictWriterWithFallBack(ParquetProperties.java:178)
at org.apache.parquet.column.ParquetProperties.getValuesWriter(ParquetProperties.java:203)
at org.apache.parquet.column.impl.ColumnWriterV1.<init>(ColumnWriterV1.java:83)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.newMemColumn(ColumnWriteStoreV1.java:68)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.getColumnWriter(ColumnWriteStoreV1.java:56)
at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.<init>(MessageColumnIO.java:183)
at org.apache.parquet.io.MessageColumnIO.getRecordWriter(MessageColumnIO.java:375)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:109)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:99)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:217)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:175)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:146)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:113)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:87)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:62)
at org.apache.parquet.avro.AvroParquetWriter.<init>(AvroParquetWriter.java:47)
at npm.parquet.ParquetMeasurementWriter.ensureOpenWriter(ParquetMeasurementWriter.java:91)
at npm.parquet.ParquetMeasurementWriter.write(ParquetMeasurementWriter.java:75)
at npm.ingestion.spark.StagingArea$Measurements.store(StagingArea.java:100)
at npm.ingestion.spark.StagingArea$StagingAreaStorage.store(StagingArea.java:80)
at npm.ingestion.spark.StagingArea.add(StagingArea.java:40)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.sendToStagingArea(Kafka2HDFSPM.java:207)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.consumeRecords(Kafka2HDFSPM.java:193)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.process(Kafka2HDFSPM.java:169)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:133)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:111)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:218)
18/03/20 18:41:13 ERROR [Executor task launch worker-0] SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.OutOfMemoryError: Java heap space
at org.apache.parquet.column.values.dictionary.IntList.initSlab(IntList.java:90)
at org.apache.parquet.column.values.dictionary.IntList.<init>(IntList.java:86)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter.<init>(DictionaryValuesWriter.java:93)
at org.apache.parquet.column.values.dictionary.DictionaryValuesWriter$PlainDoubleDictionaryValuesWriter.<init>(DictionaryValuesWriter.java:422)
at org.apache.parquet.column.ParquetProperties.dictionaryWriter(ParquetProperties.java:139)
at org.apache.parquet.column.ParquetProperties.dictWriterWithFallBack(ParquetProperties.java:178)
at org.apache.parquet.column.ParquetProperties.getValuesWriter(ParquetProperties.java:203)
at org.apache.parquet.column.impl.ColumnWriterV1.<init>(ColumnWriterV1.java:83)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.newMemColumn(ColumnWriteStoreV1.java:68)
at org.apache.parquet.column.impl.ColumnWriteStoreV1.getColumnWriter(ColumnWriteStoreV1.java:56)
at org.apache.parquet.io.MessageColumnIO$MessageColumnIORecordConsumer.<init>(MessageColumnIO.java:183)
at org.apache.parquet.io.MessageColumnIO.getRecordWriter(MessageColumnIO.java:375)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:109)
at org.apache.parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:99)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:217)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:175)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:146)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:113)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:87)
at org.apache.parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:62)
at org.apache.parquet.avro.AvroParquetWriter.<init>(AvroParquetWriter.java:47)
at npm.parquet.ParquetMeasurementWriter.ensureOpenWriter(ParquetMeasurementWriter.java:91)
at npm.parquet.ParquetMeasurementWriter.write(ParquetMeasurementWriter.java:75)
at npm.ingestion.spark.StagingArea$Measurements.store(StagingArea.java:100)
at npm.ingestion.spark.StagingArea$StagingAreaStorage.store(StagingArea.java:80)
at npm.ingestion.spark.StagingArea.add(StagingArea.java:40)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.sendToStagingArea(Kafka2HDFSPM.java:207)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.consumeRecords(Kafka2HDFSPM.java:193)
at npm.ingestion.spark.Kafka2HDFSPM$SubsetProcessor.process(Kafka2HDFSPM.java:169)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:133)
at npm.ingestion.spark.Kafka2HDFSPM$FetchSubsetsAndStore.call(Kafka2HDFSPM.java:111)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$foreachPartition$1.apply(JavaRDDLike.scala:218)
We tried to increase the available the executors memory, review the code, but we couldn't find anything wrong.
Another info: we are using RDDs in spark.
Have someone encountered a similar problem, that already been solved

What is the heap configuration for the executor? By default, Java will autotune its heap according to machine memory. You need to change it to fit in your container with -Xmx setting.
See this article about running Java in the container
https://github.com/fabianenardon/docker-java-issues-demo/tree/master/memory-sample

Java "Thread-2" without stack prevents termination

I have a pretty complex java program, which doesn't terminate.
The eclipse debugger is showing a thread that can be suspended, but has no stack trace.
It is called "Thread-2".
The jstack -l output for this thread is:
"Thread-2" #17 prio=5 os_prio=0 tid=0x00007f1268002800 nid=0x3342 runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
I added a breakpoint to Thread.start(), but I cannot find a thread called "Thread-2".
The thread just appears after the two "AWT-Event-Queue" threads were created.
I don't create any threads manually in my program.
After the main thread and all other threads exit, and the JFrame is disposed, the following threads still exist:
Thread [AWT-EventQueue-0] (Running)
Thread [Thread-2] (Running)
Thread [DestroyJavaVM] (Running)
When suspending the VM, the following threads exist:
Daemon System Thread [Signal Dispatcher] (Suspended)
Daemon System Thread [Finalizer] (Suspended)
Daemon System Thread [Reference Handler] (Suspended)
Daemon System Thread [Java2D Disposer] (Suspended)
Daemon System Thread [AWT-XAWT] (Suspended)
Thread [AWT-EventQueue-0] (Suspended)
Thread [Thread-2] (Suspended)
Thread [DestroyJavaVM] (Suspended)
How can I get more information about this thread, or allow it to terminate?
EDIT 1:
According to the Dependency Hierarchy of the eclipse pom.xml view, I use the following third party libraries:
guava 17.0 [compile]
hamcrest-core 1.3 [test]
junit 4.11 [test]
log4j-api 2.0-beta9 [compile]
log4j-core 2.0-beta9 [compile]
EDIT 2:
Adding breakpoints to all constructors of the thread class, as suggested in https://stackoverflow.com/a/35128213/577485, I see that Thread-0 and Thread-1 are created by log4j, but not Thread-2. It still just appears as before and no breakpoint triggers when it is constructed.
EDIT 3:
Now it's getting creepy. Not even the stop() method works, when invoked on the thread. I added it to the code given in https://stackoverflow.com/a/35128149/577485.
At least System.exit(int) still works. But as said in a comment, I don't want to use that.
EDIT 4:
Info about my system:
I'm running the newest stable version of Ubuntu 15.10 Wily. I have the security, updates and backports repos enabled.
My JVM version is:
java version "1.7.0_91"
OpenJDK Runtime Environment (IcedTea 2.6.3) (7u91-2.6.3-0ubuntu0.15.10.1)
OpenJDK 64-Bit Server VM (build 24.91-b01, mixed mode)
EDIT 5:
I executed the program with Java version jre-8u71-linux-x64 directly downloaded from java.com, but the error persists. jstack -l shows the same strange thread. Note that the program was still built with the older java version. Edit: After compiling it with java8u72 from oracle.com, I get the same behaviour.
EDIT 6:
I iterated over all the threads fields, here's the output. I cannot get any hint out of those fields, the thread doesn't even have a target.
name: [C#6b67034
priority: 5
threadQ: null
eetop: 140274638530560
single_step: false
daemon: false
stillborn: false
target: null
group: java.lang.ThreadGroup[name=main,maxpri=10]
contextClassLoader: null
inheritedAccessControlContext: java.security.AccessControlContext#0
threadInitNumber: 3
threadLocals: null
inheritableThreadLocals: null
stackSize: 0
nativeParkEventPointer: 0
tid: 17
threadSeqNumber: 20
threadStatus: 5
parkBlocker: null
blocker: null
blockerLock: java.lang.Object#16267862
MIN_PRIORITY: 1
NORM_PRIORITY: 5
MAX_PRIORITY: 10
EMPTY_STACK_TRACE: [Ljava.lang.StackTraceElement;#453da22c
SUBCLASS_IMPLEMENTATION_PERMISSION: ("java.lang.RuntimePermission" "enableContextClassLoaderOverride")
uncaughtExceptionHandler: null
defaultUncaughtExceptionHandler: null
threadLocalRandomSeed: 0
threadLocalRandomProbe: 0
threadLocalRandomSecondarySeed: 0
EDIT 7:
Added a watchpoint to the name field of Thread. It is only accessed by my analytics code, and seems to be never written...
EDIT 8:
jstack -F -m throws an error for my program:
Attaching to process ID 10973, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.71-b15
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.tools.jstack.JStack.runJStackTool(JStack.java:140)
at sun.tools.jstack.JStack.main(JStack.java:106)
Caused by: java.lang.RuntimeException: Unable to deduce type of thread from address 0x00007ff68000c000 (expected type JavaThread, CompilerThread, ServiceThread, JvmtiAgentThread, or SurrogateLockerThread)
at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:169)
at sun.jvm.hotspot.runtime.Threads.first(Threads.java:153)
at sun.jvm.hotspot.tools.PStack.initJFrameCache(PStack.java:200)
at sun.jvm.hotspot.tools.PStack.run(PStack.java:71)
at sun.jvm.hotspot.tools.PStack.run(PStack.java:58)
at sun.jvm.hotspot.tools.PStack.run(PStack.java:53)
at sun.jvm.hotspot.tools.JStack.run(JStack.java:66)
at sun.jvm.hotspot.tools.Tool.startInternal(Tool.java:260)
at sun.jvm.hotspot.tools.Tool.start(Tool.java:223)
at sun.jvm.hotspot.tools.Tool.execute(Tool.java:118)
at sun.jvm.hotspot.tools.JStack.main(JStack.java:92)
... 6 more
Caused by: sun.jvm.hotspot.types.WrongTypeException: No suitable match for type of address 0x00007ff68000c000
at sun.jvm.hotspot.runtime.InstanceConstructor.newWrongTypeException(InstanceConstructor.java:62)
at sun.jvm.hotspot.runtime.VirtualConstructor.instantiateWrapperFor(VirtualConstructor.java:80)
at sun.jvm.hotspot.runtime.Threads.createJavaThreadWrapper(Threads.java:165)
... 16 more
The class name of the strange thread is java.lang.Thread.
I don't use any command line arguments for executing the program. Adding the -Dlog4j2.disable.jmx=true option gives the strange thread the name Thread-1.
I updated log4j to 2.5, and the strange thread now has name Thread-0, when the -Dlog4j2.disable.jmx=true option is given, and Thread-1, when it isn't.
EDIT 9:
Completely removed log4j, error persists. The thread is now called Thread-0.
Here's the project, if that helps.

I don't understand why the breakpoint on Thread.start() did not work, but you could also try intercepting the thread >>creation<< by setting a breakpoint on the Thread constructors, or on the (internal) Thread.init() method.
The fact that the thread has the name Thread-2 implies that it was created by one of the constructors that generates a default thread name. That suggests that it was not created by the JVM or standard Java class libraries. It also narrows down the constructors that could have been used to create it.
How can I get more information about this thread ...
I can't think of any way apart from setting breakpoints.
... or allow it to terminate?
If you can find where it is created, you should be able to use setDaemon(true) to mark it as a daemon thread. However, this needs to be done before the thread is started.
Another possibility would be to find the thread by traversing the ThreadGroup tree and then calling Thread.interrupt() on it. ( Thread.getAllStackTraces() is another way of tracking down the thread object. ) However, there is no guarantee that the thread will "respect" the interrupt and shut down.
Finally, you could just call System.exit(...).
UPDATE
I mentioned that the thread might not respect interrupt() and I'm not surprised that stop() doesn't work. (It is deprecated, and may not even be implemented on some platforms.)
However, if you have managed to implement code that actually finds the mystery thread, you could dig around to find either the Thread subclass, or the Runnable that it is instantiated with. If you can print out the fully qualified class name, that will give you a big clue as to where it comes from. (Assuming that you are still having no success with breakpoints, then you may need to use "nasty" reflection to extract the runnable from the thread's private target field.)

Not sure if this enough for you, but the following code will allow you to try to interrupt any Thread by its name:
//Set of current Threads
Set<Thread> setOfThread = Thread.getAllStackTraces().keySet();
//Iterate over set to find yours
for(Thread thread : setOfThread){
if (thread.getName().equals("Thread-2")) {
thread.interrupt();
break;
}
}
Also, take a look on this article from JavaSpecialists that tries to identify the creator of a Thread based on the fact that the constructor of Thread makes a call to the security manager. If we add a custom SecurityManager to our System, we may track the initiator of a Thread.

First off, you should change your _exit flag to volatile, since it's read from one thread (your main method), and written by another (JCF/Swing event handler), so it's possible your main thread isn't getting a "fresh" value. Specifically: the thread may be saving the field to a CPU register and not reloading it from memory as you're looping. 'volatile' will prevent that behavior:
private volatile boolean _exit;
However, based on your stack traces I don't think that's your problem, since we don't see your main method in there. But this should be done anyway, it's just good practice.
Assuming that doesn't fix it, I'm guessing your problem is that you have at least one other Window (besides AgentFrame) that isn't being disposed. The AWT thread won't stop until all Windows are disposed.
Put this at the end of your main method:
System.out.println(Arrays.toString(Window.getWindows()))
I'm guessing you're going to see more than just your DrawFrame there. If I were to guess, I'd say UISettingsFrame is in there, undisposed.

Performance tuning for Netty 4.1 on linux machine

I am building a messaging application using Netty 4.1 Beta3 for designing my server and the server understands MQTT protocol.
This is my MqttServer.java class that sets up the Netty server and binds it to a specific port.
EventLoopGroup bossPool=new NioEventLoopGroup();
EventLoopGroup workerPool=new NioEventLoopGroup();
try {
ServerBootstrap boot=new ServerBootstrap();
boot.group(bossPool,workerPool);
boot.channel(NioServerSocketChannel.class);
boot.childHandler(new MqttProxyChannel());
boot.bind(port).sync().channel().closeFuture().sync();
} catch (Exception e) {
e.printStackTrace();
}finally {
workerPool.shutdownGracefully();
bossPool.shutdownGracefully();
}
}
Now I did a load testing of my application on my Mac having the following configuration
The netty performance was exceptional. I had a look at the jstack while executing my code and found that netty NIO spawns about 19 threads and none of them seem to be stuck up waiting for channels or something else.
Then I executed my code on a linux machine
This is a 2 core 15GB machine. The problem is that the packet sent by my MQTT client seems to take a long time to pass through the netty pipeline and also on taking jstack I found that there were 5 netty threads and all were stuck up like this
."nioEventLoopGroup-3-4" #112 prio=10 os_prio=0 tid=0x00007fb774008800 nid=0x2a0e runnable [0x00007fb768fec000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000006d0fdc898> (a
io.netty.channel.nio.SelectedSelectionKeySet)
- locked <0x00000006d100ae90> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000006d0fdc7f0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:621)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:309)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:834)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
Is this some performance issue related to epoll on linux machine. If yes then what changes should be made to netty configuration to handle this or to improve performance.
Edit
Java Version on local system is :-
java version "1.8.0_40"
Java(TM) SE Runtime Environment (build 1.8.0_40-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.40-b25, mixed mode)
Java version on AWS is :-
openjdk version "1.8.0_40-internal"
OpenJDK Runtime Environment (build 1.8.0_40-internal-b09)
OpenJDK 64-Bit Server VM (build 25.40-b13, mixed mode)

Here are my findings from implementing a very simple HTTP → Kafka forklift:
Consider switching to EpollEventLoopGroup.
Simple autoreplace NioEventLoopGroup → EpollEventLoopGroup gave me 30% perfomance boost.
Removing LoggingHandler from the pipeline (if you have any) can give you a CPU usage drop (in my case CPU the drop was almost unbelievable: 80%).

Play around with the worker threads to see if this improves performance. The standard constructor of NioEventLoopGroup() creates the default amount of event loop threads:
DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt(
"io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));
As you can see you can pass io.netty.eventLoopThreads as a launch argument but I usually don't do that.
You can also pass the amount of threads in the constructor of NioEventLoopGroup().
In our environment we have netty servers that accept communication from hundreds of clients. Usually one boss thread to handle the connections is enough. The worker thread amount needs to be scaled though. We use this:
private final static int BOSS_THREADS = 1;
private final static int MAX_WORKER_THREADS = 12;
EventLoopGroup bossGroup = new NioEventLoopGroup(BOSS_THREADS);
EventLoopGroup workerGroup = new NioEventLoopGroup(calculateThreadCount());
private int calculateThreadCount() {
int threadCount;
if ((threadCount = SystemPropertyUtil.getInt("io.netty.eventLoopThreads", 0)) > 0) {
return threadCount;
} else {
threadCount = Runtime.getRuntime().availableProcessors() * 2;
return threadCount > MAX_WORKER_THREADS ? MAX_WORKER_THREADS : threadCount;
}
}
So in our case we use just one boss thread. The worker threads depend on if a launch argument has been given. If not then use cores * 2 but never more than 12.
You will have to test yourself though what numbers work best for your environment.

Difference between newScheduledThreadPool(1) and newSingleThreadScheduledExecutor()

I am wondering what is the difference between these two methods of Executors class? I have a web application where I'm checking some data every 100 ms so that's why I'm using this scheduler with scheduleWithFixedDelay method. I want to know which method should I use in this case (newScheduledThreadPool or newSingleThreadScheduledExecutor)?
I also have one more question - in VisualVM where I monitor my Glassfish server I noticed that I have some threads in PARK state - for example:
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <3cb9965d> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1088)
at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Is it possible that these threads are connected with scheduler because I don't have any idea what else would create them? These threads are never destroyed, so I am afraid that this could cause some troubles. Here is a screenshot (new Thread-35 will be created in 15minutes and so on...):

As documentation states:
Unlike the otherwise equivalent newScheduledThreadPool(1) the returned executor is guaranteed not to be reconfigurable to use additional threads.
So when using newScheduledThreadPool(1)you will be able to add more threads later.

newSingleThreadScheduledExecuto() is wrapped by a delegate, as you can see in Executors.java:
public static ScheduledExecutorService newSingleThreadScheduledExecutor() {
return new DelegatedScheduledExecutorService(new ScheduledThreadPoolExecutor(1));
}
Differences (from javadoc):
if this single thread terminates due to a failure during execution prior to shutdown, a new one will take its place if needed to execute subsequent tasks.
Unlike the otherwise equivalent {#code newScheduledThreadPool(1)} the returned executor is guaranteed not to be reconfigurable to use additional threads.
reply to your comment:
do this also apply for newScheduledThreadPool(1) or not?
no, you need to take care of thread failure yourself.
As for Unsafe.park(), see this.

As pointed out by wings and tagir the differences is in "how to manage failure".
About Thread your thread is in Wait status; park is not a status but is a method to put the Thread in wait status; see also
How to detect thread being blocked by IO?
However, let me suggest a different way to implement a scheduled thread on Java EE; you should take a look at EJB's TimerService
#Singleton
public class TimerSessionBean {
#Resource
TimerService timerService;
public void setTimer(long intervalDuration) {
Timer timer = timerService.createTimer(intervalDuration,
"Created new programmatic timer");
}
#Timeout
public void lookForData(Timer timer) {
//this.setLastProgrammaticTimeout(new Date());
....
}
//OR
#Schedule(minute = "*/1", hour = "*")
public void runEveryMinute() {
...
}
}

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Netty running at 100% CPU - java

Related

Fork Join pool hangs

Spark on mesos Executors failing with OOM Errors

Java "Thread-2" without stack prevents termination

Performance tuning for Netty 4.1 on linux machine

Difference between newScheduledThreadPool(1) and newSingleThreadScheduledExecutor()

Categories

Resources