How to emulate frequent full GC caused by many StackTraceElement objects in the heap - Java

Recently my operations colleague reported that our production environment has many full GCs, which hurt app response time. He supplied a screenshot, specifically pointing out that StackTraceElement occupies 85 MB, and suggested removing code such as:
e.printStackTrace();
Now I want to reproduce this situation locally, so I wrote the test code below:
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.log4j.Logger; // assuming log4j 1.x, which has Logger.getLogger(Class)

public class FullGCByLogTest {
    private static final Logger log = Logger.getLogger(FullGCByLogTest.class);
    public static final byte[] _1M = new byte[1 * 1024 * 1024]; // placeholder purpose

    public static void main(String[] args) throws InterruptedException {
        int nThreads = 1000; // number of concurrent threads
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        while (true) {
            final CountDownLatch latch = new CountDownLatch(nThreads);
            for (int i = 0; i < nThreads; i++) {
                pool.submit(new Runnable() {
                    @Override
                    public void run() {
                        latch.countDown();
                        try {
                            latch.await(); // wait so all threads execute the code below together
                        } catch (InterruptedException e1) {
                        }
                        try {
                            int i = 1 / 0;
                            System.out.println(i);
                        } catch (Exception e) {
                            e.printStackTrace();
                            // log.error(e.getMessage(), e);
                        }
                    }
                });
            }
            try {
                Thread.sleep(100); // pause 100 ms between rounds of submissions
            } catch (InterruptedException e) {
            }
        }
    }
}
and I ran this class with these VM args:
-Xmx4m -Xms4m -XX:NewSize=1m -XX:MaxNewSize=1m -XX:+PrintGCDetails
Then in jvisualvm's Visual GC view I saw the old gen at 7 MB, even though I set the max heap to 4 MB.
In addition, in the heap dump I did not find any StackTraceElement. So how can I emulate this problem successfully?

The stack trace is captured when an exception object is instantiated (the StackTraceElement objects themselves are typically materialized when the trace is first examined, for example by printStackTrace()), and they become eligible for garbage collection as soon as the exception object is unreachable.
I suspect that the real cause for your (apparent) storage leak is that something in your code is saving lots of exception objects.
Calling printStackTrace() does not leak objects, so your colleague has misdiagnosed the problem. However, calling printStackTrace() all over the place is ugly, and if it happens frequently it will lead to performance issues.
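The pattern to look for is something that keeps the exceptions themselves reachable. A minimal sketch of such a leak (the ErrorHistory class and its use are hypothetical, just to illustrate):

import java.util.ArrayList;
import java.util.List;

public class ErrorHistory {
    private static final List<Throwable> RECENT = new ArrayList<Throwable>();

    public static void record(Throwable t) {
        t.getStackTrace(); // materializes the StackTraceElement[] for this exception
        RECENT.add(t);     // the list is never cleared, so the exceptions and
                           // their stack traces accumulate in the old gen
    }
}

An in-memory error log, a cache of failed requests, or a listener that stores exceptions are the kinds of structures to look for in the production heap dump.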
Your simulation and its results are a red herring. The probable reason the heap is bigger than you asked for is that the JVM has "rounded up" to a larger heap size. (4 MB is a minuscule heap size, impractical for most Java programs.)
So how could I emulate this problem successfully?
Emulation is highly unlikely to tell you anything useful. You need to get hold of a heap dump from the production system and analyze that.
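For example, assuming a HotSpot JVM and that you can attach to the production process, jmap can capture a dump of the live objects (the pid is a placeholder):

jmap -dump:live,format=b,file=heap.hprof <pid>

The live option forces a full GC first, so the dump contains only reachable objects; you can then open the .hprof file in a heap analyzer and see what retains the StackTraceElement instances.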

Related

How can multithreading help increase performance in this situation?

I have a piece of code like this:
while (true) { // simplified; loops over the rows to process
    x = jdbc_readOperation();
    y = getTokens(x);
    jdbc_insertOperation(y);
}
public List<String> getTokens(String divText) {
    List<String> tokenList = new ArrayList<String>();
    Matcher subMatcher = Pattern.compile("\\[[^\\]]*]").matcher(divText);
    while (subMatcher.find()) {
        String token = subMatcher.group();
        tokenList.add(token);
    }
    return tokenList;
}
What I know is that multithreading can save time when one thread is blocked by I/O or the network. With these synchronous operations, every step has to wait for the previous step to finish. What I want here is to maximize CPU utilization on getTokens().
My first thought was to put getTokens() in the run method of a class and create multiple threads. But I think that will not work, since it seems you cannot get a performance benefit by running multiple threads on pure computation.
Will adopting multithreading help increase performance in this case? If so, how can I do that?
It will depend on the pace at which jdbc_readOperation() produces data compared with the pace at which getTokens(x) processes it. Knowing that will help you figure out whether multithreading is going to help you.
You could try something like this (just for you to get the idea):
int workToBeDoneQueueSize = 1000;
int workDoneQueueSize = 1000;
BlockingQueue<String> workToBeDone = new LinkedBlockingQueue<>(workToBeDoneQueueSize);
BlockingQueue<List<String>> workDone = new LinkedBlockingQueue<>(workDoneQueueSize); // List<String>, since that is what getTokens returns
new Thread(() -> {
    try {
        while (true) {
            workToBeDone.put(jdbc_readOperation());
        }
    } catch (InterruptedException e) {
        e.printStackTrace();
        // handle InterruptedException here
    }
}).start();
int numOfWorkerThreads = 5; // just an example
for (int i = 0; i < numOfWorkerThreads; i++) {
    new Thread(() -> {
        try {
            while (true) {
                workDone.put(getTokens(workToBeDone.take()));
            }
        } catch (InterruptedException e) {
            e.printStackTrace();
            // handle InterruptedException here
        }
    }).start();
}
new Thread(() -> {
    // you could improve this by making it a batch operation
    try {
        while (true) {
            jdbc_insertOperation(workDone.take());
        }
    } catch (InterruptedException e) {
        e.printStackTrace();
        // handle InterruptedException here
    }
}).start();
Or you could learn how to use the ThreadPoolExecutor. (https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html)
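For instance, the worker stage above could use a pool instead of hand-rolled threads. A sketch under the same assumptions, reusing the workToBeDone and workDone queues from the snippet above:

ExecutorService workers = Executors.newFixedThreadPool(numOfWorkerThreads);
for (int i = 0; i < numOfWorkerThreads; i++) {
    workers.submit(() -> {
        try {
            while (true) {
                workDone.put(getTokens(workToBeDone.take()));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and let the task end
        }
    });
}
// workers.shutdownNow() interrupts the loops when you are done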
Okay, to speed up getTokens() you can split the input String divText using the String.substring() method. Split it into as many substrings as there are threads running the getTokens() method; each thread then "runs" on its own substring of divText.
Creating more threads than the CPU can handle should be avoided, since context switches create inefficiency.
https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#substring-int-int-
An alternative is splitting the input String of getTokens with the String.split method (http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29), e.g. in case the text is made up of words separated by spaces or other symbols. Specific parts of the resulting String array can then be passed to different threads; a sketch follows.
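Here is a sketch of that idea (my assumptions: a fixed pool, and newline as the split point, which is only safe if a [token] never spans two lines; a split point inside a token would lose it):

List<String> tokenizeInParallel(String divText, int nThreads)
        throws InterruptedException, ExecutionException {
    String[] chunks = divText.split("\n"); // assumes a token never spans lines
    ExecutorService pool = Executors.newFixedThreadPool(nThreads);
    List<Future<List<String>>> results = new ArrayList<>();
    for (String chunk : chunks) {
        results.add(pool.submit(() -> getTokens(chunk))); // getTokens from the question
    }
    List<String> tokens = new ArrayList<>();
    for (Future<List<String>> f : results) {
        tokens.addAll(f.get()); // get() blocks; chunk order is preserved
    }
    pool.shutdown();
    return tokens;
}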

How to determine the max possible fixed thread pool size?

Is there a method or utility to determine how many threads can be created in a program, for example with Executors.newFixedThreadPool(numberThreads)? The following custom rollout works, but is obviously not an enterprise-grade solution:
public class ThreadCounter {
    public static void main(String[] args) {
        System.out.println("max number threads = " + getMaxNumberThreads());
    }

    static int getMaxNumberThreads() {
        final int[] maxNumberThreads = {0};
        try {
            while (true) {
                new Thread(() -> {
                    try {
                        maxNumberThreads[0]++; // racy, but good enough for a rough count
                        Thread.sleep(Integer.MAX_VALUE); // park the thread forever
                    } catch (InterruptedException e) {
                    }
                }).start();
            }
        } catch (Throwable t) {
            // typically OutOfMemoryError: unable to create new native thread
        }
        return maxNumberThreads[0];
    }
}
So as a general rule, creating more threads than the number of processors you have isn't really beneficial, because context switching becomes a bottleneck. You can find the number of available processors using the availableProcessors() method, like so:
numThreads = Runtime.getRuntime().availableProcessors();
executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numThreads);
This provides good general scalability as all available processors will be used in your thread pool.
Now sometimes, due to a lot of I/O blocking or other factors, you may find it makes sense to increase the number of threads beyond the processor count. In that case you can just multiply numThreads, for example to double the thread pool:
executor = (ThreadPoolExecutor) Executors.newFixedThreadPool(numThreads * 2);
I would only recommend this once some benchmarking has been done to see if it's worth it though.
So it's not a theoretical maximum as such (that is determined by the underlying operating system), but it probably gives you the realistic limit for taking advantage of your computer's hardware.
Hope this helps!

NioSocketChannel$WriteRequestQueue causing OutOfMemory

I am using Netty to perform a large file upload. It works fine, but the client's RAM usage seems to grow with the size of the file. This is not the expected behaviour, since everything is piped from reading the source file to writing the target file.
At first I thought of some kind of adaptive buffer growing until Xmx is reached, but setting Xmx to a reasonable value (50M) leads to an OutOfMemoryError soon after the upload starts.
After some research using Eclipse Memory Analyzer, it appears that the object retaining the heap memory is:
org.jboss.netty.channel.socket.nio.NioSocketChannel$WriteRequestQueue
Is there any option for setting a limit to this queue or do I have to code my own queue using ChannelFutures to control the number of bytes and block the pipe when the limit is reached?
Thanks for your help,
Regards,
Renaud
Answer from @normanmaurer on the Netty GitHub:
You should use Channel.isWritable() to check whether the "queue" is full. If it is, you need to wait until there is space to write more. The effect you see happens when you write data more quickly than it can be sent out to the client.
You can get around this kind of problem by writing the file via DefaultFileRegion or ChunkedFile.
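A minimal sketch of that check against the Netty 3 API (the ThrottledWriter class is hypothetical; it blocks the producing thread while the channel is saturated and relies on channelInterestChanged firing when writability flips back):

import org.jboss.netty.buffer.ChannelBuffer;
import org.jboss.netty.channel.*;

public class ThrottledWriter extends SimpleChannelHandler {

    // Call this from the thread that produces the buffers.
    public static ChannelFuture write(Channel channel, ChannelBuffer buf)
            throws InterruptedException {
        synchronized (channel) {
            while (!channel.isWritable() && channel.isConnected()) {
                channel.wait(); // woken up by channelInterestChanged below
            }
        }
        return channel.write(buf);
    }

    @Override
    public void channelInterestChanged(ChannelHandlerContext ctx, ChannelStateEvent e)
            throws Exception {
        Channel channel = e.getChannel();
        synchronized (channel) {
            channel.notifyAll(); // writability may have changed
        }
        super.channelInterestChanged(ctx, e);
    }
}

The handler must be added to the channel's pipeline for the notification to fire; using DefaultFileRegion or ChunkedFile with a ChunkedWriteHandler avoids the hand-rolled throttling entirely.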
@normanmaurer thank you I missed this method of the Channel!
I guess I need to read what's happening inside:
org.jboss.netty.handler.stream.ChunkedWriteHandler
UPDATED: 2012/08/30
This is the code I made to solve my problem:
public class LimitedChannelSpeaker {
    Channel channel;
    final Object lock = new Object();
    long maxMemorySizeB;
    long size = 0;
    Map<ChannelBufferRef, Integer> buffer2readablebytes = new HashMap<ChannelBufferRef, Integer>();

    public LimitedChannelSpeaker(Channel channel, long maxMemorySizeB) {
        this.channel = channel;
        this.maxMemorySizeB = maxMemorySizeB;
    }

    public ChannelFuture speak(ChannelBuffer buff) {
        if (buff.readableBytes() > maxMemorySizeB) {
            throw new IndexOutOfBoundsException("The buffer is larger than the maximum allowed size of " + maxMemorySizeB + "B.");
        }
        synchronized (lock) {
            while (size + buff.readableBytes() > maxMemorySizeB) {
                try {
                    lock.wait(); // block until pending writes complete and free up budget
                } catch (InterruptedException ex) {
                    throw new RuntimeException(ex);
                }
            }
            ChannelBufferRef ref = new ChannelBufferRef(buff);
            ref.register();
            ChannelFuture future = channel.write(buff);
            future.addListener(ref); // reuse the registered ref so unregister() finds it in the map
            return future;
        }
    }

    private void spoken(ChannelBufferRef ref) {
        synchronized (lock) {
            ref.unregister();
            lock.notifyAll();
        }
    }

    private class ChannelBufferRef implements ChannelFutureListener {
        int readableBytes;

        public ChannelBufferRef(ChannelBuffer buff) {
            readableBytes = buff.readableBytes();
        }

        public void unregister() {
            buffer2readablebytes.remove(this);
            size -= readableBytes;
        }

        public void register() {
            buffer2readablebytes.put(this, readableBytes);
            size += readableBytes;
        }

        @Override
        public void operationComplete(ChannelFuture future) throws Exception {
            spoken(this);
        }
    }
}
For a desktop background application:
Netty is designed for highly scalable servers, e.g. around 10,000 connections. For a desktop application with fewer than a few hundred connections, I would use plain IO. You may find the code is much simpler, and it should use less than 1 MB.
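For comparison, a blocking-IO upload is just a socket and a copy loop. A minimal sketch (host, port and buffer size are placeholders):

import java.io.*;
import java.net.Socket;

public class PlainUpload {
    public static void upload(File file, String host, int port) throws IOException {
        Socket socket = new Socket(host, port);
        InputStream in = new FileInputStream(file);
        try {
            OutputStream out = socket.getOutputStream();
            byte[] buf = new byte[8192]; // the only buffer ever held in memory
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n); // blocks until the OS accepts the bytes,
                                      // so memory use stays flat regardless of file size
            }
            out.flush();
        } finally {
            in.close();
            socket.close();
        }
    }
}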

HeapDumpOnOutOfMemoryError works only once on periodical tasks

I have a couple of applications that run at specified intervals. To monitor OutOfMemoryError I decided to enable HeapDumpOnOutOfMemoryError, and before doing so I did some research. Some of the applications have a maximum heap size of 2 GB, so generating multiple heap dumps in rapid succession could eat up all the disk space.
I've written a small program to check how it will work.
import java.util.LinkedList;
import java.util.List;

public class Test implements Runnable {
    public static void main(String[] args) throws Exception {
        new Thread(new Test()).start();
    }

    public void run() {
        while (true) {
            try {
                List<Object> list = new LinkedList<Object>();
                while (true) {
                    list.add(new Object());
                }
            } catch (Throwable e) {
                System.out.println(e);
            }
            try {
                Thread.sleep(1000);
            } catch (InterruptedException ignored) {
            }
        }
    }
}
And here is the result
$ java -XX:+HeapDumpOnOutOfMemoryError -Xmx2M Test
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid25711.hprof ...
Heap dump file created [14694890 bytes in 0,101 secs]
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
It works as i would want it to, but i would like to know why.
Looking at the OpenJDK 6 source code, I found the following:
void report_java_out_of_memory(const char* message) {
  static jint out_of_memory_reported = 0;

  // A number of threads may attempt to report OutOfMemoryError at around the
  // same time. To avoid dumping the heap or executing the data collection
  // commands multiple times we just do it once when the first threads reports
  // the error.
  if (Atomic::cmpxchg(1, &out_of_memory_reported, 0) == 0) {
    // create heap dump before OnOutOfMemoryError commands are executed
    if (HeapDumpOnOutOfMemoryError) {
      tty->print_cr("java.lang.OutOfMemoryError: %s", message);
      HeapDumper::dump_heap_from_oome();
    }

    if (OnOutOfMemoryError && OnOutOfMemoryError[0]) {
      VMError err(message);
      err.report_java_out_of_memory();
    }
  }
}
How does the first if statement work?
EDIT: It seems that a heap dump should be created every time the message is printed, but that does not happen. Why?
The if statement contains a compare-and-exchange atomic operation, which returns 0 if and only if the exchange was performed by the running thread. Compare-and-exchange (also known as compare-and-swap) works as follows:
1. You supply the value you think the variable currently contains (0 in your case; the variable is out_of_memory_reported).
2. You supply the value you would like to exchange it for (1 in your case).
3. If the variable holds the expected value, it is atomically exchanged for the replacement (no other thread can change the value after it has been compared against your expectation), and the old value, 0, is returned.
4. Otherwise nothing happens, and a value different from 0 is returned to indicate the failure.
As to your EDIT: out_of_memory_reported is a static variable, so it stays 1 for the life of the JVM. Only the first OutOfMemoryError therefore triggers the heap dump; the later "java.lang.OutOfMemoryError: Java heap space" lines in your output come from the System.out.println(e) in your catch block, not from the VM.
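The Java-level equivalent of that guard is AtomicInteger.compareAndSet(). A sketch of the same report-once pattern (the ReportOnce class is hypothetical):

import java.util.concurrent.atomic.AtomicInteger;

public class ReportOnce {
    private static final AtomicInteger reported = new AtomicInteger(0);

    public static void report(String message) {
        // Exactly one thread wins the 0 -> 1 race and does the expensive work;
        // every later or losing caller sees the flag already set and skips it.
        if (reported.compareAndSet(0, 1)) {
            System.err.println("java.lang.OutOfMemoryError: " + message);
            // dump the heap here
        }
    }
}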

How to make the java system release Soft References?

I'm going to use a SoftReference-based cache (a pretty simple thing by itself). However, I came across a problem when writing a test for it.
The objective of the test is to check that the cache requests the previously cached object from the server again after a memory cleanup occurs.
Here I hit the problem of how to make the system release the soft-referenced objects. Calling System.gc() is not enough, because soft references are not released until memory is low. I'm running this unit test on a PC, so the memory budget for the VM can be pretty large.
================== Added later ==============================
Thank you all who took care to answer!
After considering all the pros and cons, I decided to go the brute-force way as advised by nanda and jarnbjo. It appeared, however, that the JVM is not that dumb: it won't even attempt garbage collection if you ask for a block that is alone bigger than the VM's memory budget. So I modified the code like this:
/* Force releasing SoftReferences */
try {
    final List<long[]> memhog = new LinkedList<long[]>();
    while (true) {
        memhog.add(new long[102400]);
    }
} catch (final OutOfMemoryError e) {
    /* At this point all SoftReferences have been released - GUARANTEED. */
}
/* continue the test here */
/* continue the test here */
This piece of code forces the JVM to flush all SoftReferences. And it's very fast to do.
It works better than the Integer.MAX_VALUE approach, since here the JVM really tries to allocate that much memory:
try {
    Object[] ignored = new Object[(int) Runtime.getRuntime().maxMemory()];
} catch (OutOfMemoryError e) {
    // Ignore
}
I now use this bit of code everywhere I need to unit test code using SoftReferences.
Update: This approach will indeed work only with less than 2G of max memory.
Also, one needs to be very careful with SoftReferences. It's easy to accidentally keep a hard reference that negates the effect of the SoftReference.
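For instance, one surviving strong reference is enough to defeat the test:

Object value = new Object();
SoftReference<Object> ref = new SoftReference<Object>(value);
// As long as 'value' stays reachable, the collector must not clear 'ref',
// no matter how much memory pressure the test creates.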
Here is a simple test that shows it working every time on OS X. I would be interested to know whether the JVM's behavior is the same on Linux and Windows.
for (int i = 0; i < 1000; i++) {
    SoftReference<Object> softReference = new SoftReference<Object>(new Object());
    if (null == softReference.get()) {
        throw new IllegalStateException("Reference should NOT be null");
    }
    try {
        Object[] ignored = new Object[(int) Runtime.getRuntime().maxMemory()];
    } catch (OutOfMemoryError e) {
        // Ignore
    }
    if (null != softReference.get()) {
        throw new IllegalStateException("Reference should be null");
    }
    System.out.println("It worked!");
}
An improvement that also works with more than 2 GB of max memory; it loops until an OutOfMemoryError occurs:
@Test
public void shouldNotHoldReferencesToObject() {
    final SoftReference<T> reference = new SoftReference<T>( ... );

    // Sanity check
    assertThat(reference.get(), not(equalTo(null)));

    // Force an OoM
    try {
        final ArrayList<Object[]> allocations = new ArrayList<Object[]>();
        int size;
        while ((size = Math.min(Math.abs((int) Runtime.getRuntime().freeMemory()), Integer.MAX_VALUE)) > 0) {
            allocations.add(new Object[size]);
        }
    } catch (OutOfMemoryError e) {
        // great!
    }

    // Verify object has been garbage collected
    assertThat(reference.get(), equalTo(null));
}
1. Set the parameter -Xmx to a very small value.
2. Prepare your soft reference.
3. Create as many objects as possible, and ask for the object every time, until the cache asks the server for it again.
This is my small test. Modify it as you need:
@Test
public void testSoftReference() throws Exception {
    Set<Object[]> s = new HashSet<Object[]>();
    SoftReference<Object> sr = new SoftReference<Object>(new Object());
    int i = 0;
    while (true) {
        try {
            s.add(new Object[1000]);
        } catch (OutOfMemoryError e) {
            // ignore
        }
        if (sr.get() == null) {
            System.out.println("Soft reference is cleared. Success!");
            break;
        }
        i++;
        System.out.println("Soft reference is not yet cleared. Iteration " + i);
    }
}
You could explicitly clear the soft reference in your test (SoftReference.clear() does exactly that) and thereby simulate that it has been released.
This avoids any complicated test setup that is memory- and garbage-collection-dependent; see the sketch below.
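A sketch of that approach, using SoftReference.clear() to stand in for the garbage collector:

SoftReference<Object> ref = new SoftReference<Object>(new Object());
ref.clear(); // simulates the GC clearing the referent under memory pressure
assert ref.get() == null; // the cache should now fetch from the server again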
Instead of a long-running loop (as suggested by nanda), it's probably faster and easier to simply create a huge primitive array to allocate more memory than is available to the VM, then catch and ignore the OutOfMemoryError:
try {
    long[] foo = new long[Integer.MAX_VALUE];
} catch (OutOfMemoryError e) {
    // ignore
}
This will clear all weak and soft references, unless your VM has more than 16 GB of heap available (Integer.MAX_VALUE longs at 8 bytes each come to roughly 16 GB).
