The only way I know to force garbage collection is to use ForceGarbageCollection() from JVMTI. Is there any cross-platform way to force GC (so I don't need to create a JVMTI library for each platform)?
I think that the answer is No.
But I also think that you shouldn't need to do this anyway.
The way to request the garbage collector to run is to call System.gc(). But as the javadoc explains, this can be ignored.
The normal reason that System.gc() is ignored is that the JVM has been launched with the option -XX:+DisableExplicitGC. This is NOT the default. It only happens if the person or script or whatever launching the JVM wants this behavior.
So you are really asking for a way for an application to override the user's or administrator's explicit instructions to ignore System.gc() calls. You should not be doing that. It is not the application's or the application writer's prerogative to override the user's wishes.
If your Java application really needs to run the GC explicitly, include in the installation instructions that it should NOT be run with the -XX:+DisableExplicitGC option. Then System.gc() should work.
So why did they provide a way to disable gc() calls?
Basically because explicitly running the GC is bad practice (see Why is it bad practice to call System.gc()?) and (nearly always [1]) unnecessary in a properly written application [2]. If your application relies on the GC running at specific times to function, then you have made a mistake in the application design.
1 - A couple of exceptions are test cases for code that uses Reference types and similar, and interactive games where you want to (say) clean up between levels to avoid a GC pause during normal play.
2 - It is not uncommon for a Java programmer to start out as a C or C++ programmer. It can be difficult for such people to realize that they don't need to take a hand in Java memory management. The JVM (nearly always) has better understanding of when to run the GC. People also come across Object.finalize and dream up "interesting" ways to use it ... without realizing that it is an expensive and (ultimately) unreliable mechanism.
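For the test-case exception mentioned in footnote 1, a minimal sketch of how a test might request collection and then check a WeakReference could look like this. The class and method names are made up for illustration, and a result of false is perfectly legal, because System.gc() is only a request.

```java
import java.lang.ref.WeakReference;

final class GcRequestDemo {
    /**
     * Requests a collection up to maxAttempts times and reports whether the
     * weak reference was cleared. The JVM is free to ignore every request.
     */
    static boolean clearedAfterGcRequests(WeakReference<?> ref, int maxAttempts) {
        for (int i = 0; i < maxAttempts && ref.get() != null; i++) {
            System.gc(); // a hint, not a command
            try {
                Thread.sleep(10); // give a concurrent collector a moment
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        // Nothing else refers to the new Object, so it is eligible for collection.
        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.out.println("cleared: " + clearedAfterGcRequests(ref, 50));
    }
}
```

If the JVM was started with -XX:+DisableExplicitGC, the loop simply runs out of attempts and the method returns false.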
I have a C++ codebase, in which I'm using JNI to create a JVM and occasionally interact with a library implemented in Java. I'm curious whether, in this use case, Java's garbage collector will still reliably run and clean up?
Most of the information that I find online about JNI seems to be about the "opposite" use case, where people generally appear to have mainly Java code, which sometimes interacts with native code through JNI. For such a use case, I find for example the following online:
The automatic garbage collection of local references that are no longer in scope prevents memory leaks in most situations. This automatic garbage collection occurs when a native thread returns to Java (native methods) or detaches from the JVM (Invocation API). Local reference memory leaks are possible if automatic garbage collection does not occur. A memory leak might occur if a native method does not return to the JVM, or if a program that uses the Invocation API does not detach from the JVM.
I'm not sure what exactly "returns to Java" means in this context. Is occasionally calling into Java-based methods from C++ sufficient? Does that already count as "returning to Java"? If not, are there any ways to make sure that the garbage collector gets a chance to run in my use case?
The JVM created with JNI is a full JVM, including GC.
Think of it this way: the java command that you normally use to run Java programs is nothing but a small JNI program that creates a JVM, locates the class named on the command line, and makes a static call to the main(String[]) method.
For the profiler which I implement using JVMTI I would like to start measuring the execution time of all Java methods. The JVMTI offers the events:
MethodEntry
MethodExit
So this would be quite easy to implement, however I came across this note in the API:
Enabling method entry or exit events will significantly degrade performance on many platforms and is thus not advised for performance critical usage (such as profiling). Bytecode instrumentation should be used in these cases.
But my profiling agent works headless, which means the collected data is serialized and sent via a socket to a server application that displays the results. How should I realize this using bytecode instrumentation? I am confused about how to go on from here. Could someone explain whether I have to switch strategy, or how I can approach this problem?
I don't know about the Sun JVM but the IBM JVM goes into what we call FullSpeedDebug mode when you request the MethodEntry/Exit events.... FSD slows down execution quite a bit.
As you say, you can use BCI, as my profiler does, but unless you are selective about which methods you instrument you will also see a slowdown. For example, my profiler inserts an if(profiling) callProfilerHook() on every entry and on all of the possible exits in a method, on all object creations, and in some other areas as well. These additional checks can slow down execution by over 50%.
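To make the guard concrete, here is a rough sketch of what an instrumented method would look like if the inserted bytecode were written as Java source. All names here (ProfilerHookSketch, onEntry, onExit, the counters) are hypothetical and are not taken from any real profiler; an actual agent would rewrite the bytecode rather than the source.

```java
import java.util.concurrent.atomic.AtomicLong;

final class ProfilerHookSketch {
    // Hypothetical global switch that every instrumented method checks.
    static volatile boolean profiling = false;
    static final AtomicLong entries = new AtomicLong();
    static final AtomicLong exits = new AtomicLong();

    // Hypothetical hooks; a real agent would record timestamps or call stacks here.
    static void onEntry(String method) { entries.incrementAndGet(); }
    static void onExit(String method) { exits.incrementAndGet(); }

    // Source-level equivalent of a method after instrumentation.
    static int square(int x) {
        if (profiling) onEntry("square");
        try {
            return x * x;
        } finally {
            // The finally block covers normal returns and exceptions alike.
            if (profiling) onExit("square");
        }
    }

    public static void main(String[] args) {
        profiling = true;
        square(7);
        System.out.println(entries.get() + " entries, " + exits.get() + " exits");
    }
}
```

Even with profiling switched off, each call still pays for the volatile read and the branch, which is one source of the slowdown described above.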
As for how to do BCI: well, I wrote my own C library to do it. It's technically not hard (hint: just delete the StackMapTable), but it may take you a while. Alternatively, you can use ASM et al.
Finally, your callback hook will add overhead and, on small methods, render the reported CPU/clock time meaningless unless you perform some sophisticated overhead calculation. Even if you do this, your callback code affects what is in the processor's L1 caches, and the Java code becomes less efficient because it has less room.
My profiler basically ignores the reported times, as I visualize the execution in an interesting way. I'm looking to understand the flow of all of the code, and in most cases simply what code is running (most Java projects have no idea of the millions of lines of third-party code running in their app).
In short: Tomcat uses a thread pool, so threads are reused. Some libraries use ThreadLocal variables but don't clean them up (using .remove()), so in effect they return "dirty" threads to the pool.
Tomcat has new features of detecting these things on shutdown, and cleaning the thread locals. But it means the threads are "dirty" during the whole execution.
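The "dirty thread" effect is easy to reproduce outside Tomcat. The following is a small sketch using a single-thread executor as a stand-in for the container's pool; DirtyThreadDemo and CURRENT_USER are illustrative names, not anything from Tomcat or a real library.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class DirtyThreadDemo {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two tasks on a one-thread pool and returns what the second task sees. */
    static String valueSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable firstRequest = () -> {
            CURRENT_USER.set("request-1");
            try {
                // ... handle the request ...
            } finally {
                if (cleanUp) CURRENT_USER.remove(); // what a well-behaved library does
            }
        };
        Callable<String> secondRequest = () -> CURRENT_USER.get();
        try {
            pool.submit(firstRequest).get();
            return pool.submit(secondRequest).get(); // same worker thread is reused
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without remove(): " + valueSeenByNextTask(false)); // request-1
        System.out.println("with remove(): " + valueSeenByNextTask(true));     // null
    }
}
```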
What I can do is implement a Filter and, right after the request completes (and the thread is returned to the pool), clean all ThreadLocals, using the code from Tomcat (the method there is called checkThreadLocalsForLeaks).
The question is, is it worth it? Two pros:
preventing memory leaks
preventing nondeterministic behaviour by the libraries, which assume the thread is "fresh"
One con:
The solution uses reflection, so it's potentially slow. All reflection data (Fields) will be cached, of course, but still.
Another option is to report the issue to the libraries that don't clean their thread locals.
I would go through the route of reporting the issue to the library developers for 2 reasons:
It will help other people who want to use the same library but lack the skills / time to find such a HORRIBLE memory leak.
To help the developers of the library to build a better product.
Honestly, I've never seen this type of error before, and I think it's an exception rather than something we should guard against because it happens often. Could you share which library you've seen this behaviour in?
As a side note, I wouldn't mind enabling that filter in the development / test environment and logging a critical error if a ThreadLocal variable is still attached.
In theory, this seems like a good idea. However, I could see some situations where you might not want to do this. For instance, some of the XML-related technologies have non-trivial setup costs (like setting up DocumentBuilders and Transformers). If you are doing a lot of that in your webapp, it may make sense to cache these instances in ThreadLocals (as the utilities are generally not thread-safe). In this case, you probably don't want to clean these between requests.
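A sketch of that kind of deliberate per-thread caching, using the standard javax.xml.parsers API, might look like the following. The XmlBuilders class is an illustrative name, and the factory is left at its defaults purely to keep the example short.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

final class XmlBuilders {
    // One DocumentBuilder per thread: the setup cost is paid once per pooled thread.
    private static final ThreadLocal<DocumentBuilder> BUILDER =
        ThreadLocal.withInitial(() -> {
            try {
                return DocumentBuilderFactory.newInstance().newDocumentBuilder();
            } catch (ParserConfigurationException e) {
                throw new IllegalStateException(e);
            }
        });

    /** Returns the calling thread's cached builder, creating it on first use. */
    static DocumentBuilder get() {
        return BUILDER.get();
    }

    public static void main(String[] args) {
        System.out.println("same instance reused: " + (get() == get()));
    }
}
```

A filter that wipes every ThreadLocal after each request would throw this cache away and force the builder to be recreated on the next request.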
If you think there's a chance that the dirtiness of the threads will actually cause problems, then this is a sensible thing to do. Problems are to be avoided where possible.
The use of threadlocals may be bad behaviour by the library, and you should certainly report it to the authors, but sadly, right now, it's down to you to deal with it.
I wouldn't worry too much about performance. The slow bit in reflection is the metadata lookup; once you have a Field object, then using it is fairly quick, and gets quicker over time - AIUI, it starts out working by making a native call into the JVM, but after some number of uses, it generates bytecode for the access, which can then be compiled into native code, optimised, inlined, etc, so it shouldn't be much slower than a direct field access. I don't think the Tomcat code reuses the Field objects across requests, though, so if you want to take advantage of that, you'd have to write your own cleaning code. In any case, the performance cost will be far smaller than the cost of the IO associated with the request.
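As a rough illustration of reusing a Field object, the sketch below looks the field up once and then uses it for every access. The Holder class and its secret field are invented for the example; real cleaning code would target the thread-local fields of Thread instead, which may need extra access permissions on recent Java versions.

```java
import java.lang.reflect.Field;

final class CachedFieldAccess {
    /** Stand-in for a class whose private state we want to reset. */
    static final class Holder {
        private String secret = "initial";
    }

    // The expensive metadata lookup happens once, when this class is initialized.
    private static final Field SECRET_FIELD = lookup();

    private static Field lookup() {
        try {
            Field f = Holder.class.getDeclaredField("secret");
            f.setAccessible(true);
            return f;
        } catch (NoSuchFieldException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Reads the private field through the cached Field object. */
    static String read(Holder h) {
        try {
            return (String) SECRET_FIELD.get(h);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Clears the private field through the cached Field object. */
    static void clear(Holder h) {
        try {
            SECRET_FIELD.set(h, null);
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Holder h = new Holder();
        System.out.println(read(h)); // initial
        clear(h);
        System.out.println(read(h)); // null
    }
}
```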
I've read in many threads that it is impossible to turn off garbage collection on Sun's JVM. However, for the purpose of our research project we need this feature. Can anybody recommend a JVM implementation which does not have garbage collection or which allows turning it off? Thank you.
I wanted to find a fast way to keep all objects in memory for a simple initial proof of concept.
The simple way to do this is to run the JVM with a heap that is so large that the GC never needs to run. Set the -Xmx and -Xms options to a large value, and turn on GC logging to confirm that the GC doesn't run for the duration of your test.
This will be quicker and more straightforward than modifying the JVM.
(In hindsight, this may not work. I vaguely recall seeing evidence that implied that the JVM does not always respect the -Xms setting, especially if it was really big. Still, this approach is worth trying before trying some much more difficult approach ... like modifying the JVM.)
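Besides GC logging, one way to confirm from inside the program that no collection happened during a test is to compare the collection counts reported by the standard GarbageCollectorMXBean interface before and after the workload. This is only a sketch, and GcWatch and its methods are illustrative names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcWatch {
    /** Sum of the collection counts of all collectors; undefined (-1) counts are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Runs the workload and returns how many collections were reported while it ran. */
    static long collectionsDuring(Runnable workload) {
        long before = totalCollections();
        workload.run();
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());
        long collections = collectionsDuring(() -> {
            byte[][] keep = new byte[64][];
            for (int i = 0; i < keep.length; i++) {
                keep[i] = new byte[1024 * 1024]; // about 64 MB, all kept reachable
            }
        });
        System.out.println("collections during workload: " + collections);
    }
}
```

Run with a large -Xms and -Xmx, a result of 0 suggests the heap was big enough for the test; any other value means the experiment needs more headroom.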
Also, this whole thing strikes me as unnecessary (even counter-productive) for what you are actually trying to achieve. The GC won't throw away objects unless they are garbage. And if they are garbage, you won't be able to use them. And the performance of a system with GC disabled / negated is not going to be indicative of how a real application will perform.
UPDATE - From Java 11 onwards, you have the much simpler option of using the Epsilon (no-op) garbage collector; see
JEP 318: Epsilon: A No-Op Garbage Collector (Experimental)
You add the following options when you launch the JVM:
-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC
When the heap is filled, no attempt is made to collect garbage. Instead, the Epsilon GC terminates the JVM.
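Since a collector that never reclaims anything can only fill up, it may help to keep an eye on how much heap is left. Here is a minimal sketch using only the standard Runtime API; HeapHeadroom and its method names are made up for illustration, and the figures are approximations.

```java
final class HeapHeadroom {
    /** Approximate number of heap bytes currently in use. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Approximate number of bytes that can still be allocated before -Xmx is reached. */
    static long remainingBytes() {
        return Runtime.getRuntime().maxMemory() - usedBytes();
    }

    public static void main(String[] args) {
        System.out.println("used bytes:      " + usedBytes());
        System.out.println("remaining bytes: " + remainingBytes());
    }
}
```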
Depending on your needs this could perhaps work:
Using the -Xbootclasspath option you may specify your own implementation of API classes. You could then, for instance, override the implementation of Object and add a globalList.add(this) to the constructor to prevent the objects from being garbage collected. It's a hack for sure, but for a simple case study it's perhaps sufficient.
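The retention idea itself can be sketched without touching java.lang.Object by using an ordinary base class as a stand-in. Everything below (RetainAllSketch, Retained, Sample) is hypothetical; it only keeps alive objects of classes that extend the base class, which is far weaker than patching Object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RetainAllSketch {
    /** Base class that registers every instance in a global list, so none become garbage. */
    static class Retained {
        private static final List<Object> ALL =
            Collections.synchronizedList(new ArrayList<>());

        protected Retained() {
            ALL.add(this); // leaks "this" on purpose; that is the whole hack
        }

        static int retainedCount() {
            return ALL.size();
        }
    }

    /** Example subclass whose instances stay strongly reachable forever. */
    static final class Sample extends Retained { }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            new Sample(); // never assigned, yet still reachable through the list
        }
        System.out.println("retained: " + Retained.retainedCount());
    }
}
```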
Another option is to take an open source jvm and comment out the parts that initiate garbage collection. I would guess it is not that complicated.
Sun's JVM has no such option. AFAIK, no other JVM has this option either.
You did not state exactly what you are trying to achieve, but you have two options. One is to use a profiler to see exactly what the GC is doing, so that you can take its effects into consideration. The other is to compile one of the JVMs from source and disable GC there.
You can only turn off the GC if it's not actually needed (otherwise your application would run out of memory), and if you don't need to GC, it shouldn't run anyway.
The simplest option would be to not discard any objects; this will avoid GC being performed (and set the max memory very high so you don't run out).
You may find that you get GCs on startup, and you may consider it acceptable as long as there is no GC while running.
The question is old, but for those who might be interested, there is a proposal to
Develop a GC that only handles memory allocation, but does not implement any actual memory reclamation mechanism. Once available Java heap is exhausted, perform the orderly JVM shutdown.
JEP draft: Epsilon GC: The Arbitrarily Low Overhead Garbage (Non-)Collector
Maybe you could try making your VM's available memory sufficient for GC never to be run.
My (albeit limited) experience leads me to suggest that the VM is, by default, extremely lazy and extremely reluctant to run GC.
Giving -Xmx16384M (or some such) and making sure that your research subject stays well below that limit might give you the environment you wish to obtain, although even then it will obviously not be guaranteed.
There actually exists a dirty hack to temporarily pause GC. First create a dummy array in Java. Then, in JNI, use the GetPrimitiveArrayCritical function to get hold of the pointer to the array. The Sun JVM will disable GC to ensure that the array is never moved and the pointer stays valid. To re-enable GC, you can call the ReleasePrimitiveArrayCritical function on the pointer. But this is very implementation specific, since other VM implementations may pin the object instead of disabling GC entirely. (Tested to work on Oracle JDK 7 & 8.)
Take a look at Oracle's JRockit JVM. I've seen very good near-deterministic performance on Intel hardware with this JVM and you can prod and poke the runtime using the Mission Control utility to see how well it's performing.
Though you can't turn GC off completely, I believe that you can use the -Xnoclassgc option to disable the collection of classes. The GC can be tuned to minimize latency at the expense of leaving memory consumption to grow. You may need a license to drop the latency as low as you need if you're going this route.
There is also a Realtime version of the JRockit JVM available but I don't think that there is a free-to-developers version of this available.
Can you get an open source JVM and disable its GC, for example Sun's HotSpot?
If there was no Garbage Collection what would you expect to be the semantics of code like this?
public class MyClass {
    public void aMethod() {
        String text = new String("xyz");
    }
}
In the absence of GC any item newed and with a stack scoped reference could never be reclaimed. Even if your own classes could decide not to use local variables like this, or to use only primitive types I don't see how you would safely use any standard Java library.
I'd be interested to hear more about your usage scenario.
If I had this problem I would get IBM's Jikes Research Virtual Machine because:
The run-time system is written in Java itself (with special extensions)
The whole thing was designed as a research vehicle and is relatively easy to tweak.
You can't turn off GC forever, because Java programs do allocate and eventually you'll run out of memory, but it's quite possible that you can delay GC for the duration of your experiment by telling the JVM not to start collecting until the heap gets really big. (That trick might work on other JVMs as well, but I wouldn't know where to find the knobs to start twirling.)
I would like to run a Java program with garbage collection switched off. Managing memory in my own code is not so difficult.
However the program needs quite a lot of I/O.
Is there any way (short of using JNI for all I/O operations) that I could achieve this using pure Java?
Thanks
Daniel
What you are trying to achieve is frequently done in investment banking to develop low-latency real-time systems.
To avoid GC you simply need to make sure not to allocate memory after the startup and warm-up phase of your application.
As you seem to have noticed Java NIO internally does unwanted memory allocation.
Unfortunately, you have no choice but to write JNI replacements for the problematic calls.
You need at least to write a replacement for the NIO Selector.
You will have to avoid using most of the Java libraries due to similar unwanted memory allocations.
For example, you will have to avoid immutable objects like String, avoid boxing, and re-implement collections so that they preallocate enough entries for the whole lifetime of your program.
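As a small illustration of a preallocating, boxing-free collection, here is a sketch of a fixed-capacity queue of primitive ints. IntRingBuffer is a made-up class rather than part of any library; it allocates its backing array once in the constructor and never again.

```java
final class IntRingBuffer {
    private final int[] slots; // allocated once, up front
    private int head;          // index of the oldest element
    private int tail;          // index where the next element goes
    private int size;

    IntRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        slots = new int[capacity];
    }

    /** Adds a value without allocating; returns false when the buffer is full. */
    boolean offer(int value) {
        if (size == slots.length) {
            return false;
        }
        slots[tail] = value;
        tail = (tail + 1) % slots.length;
        size++;
        return true;
    }

    /** Removes the oldest value, or returns emptyValue when nothing is queued. */
    int poll(int emptyValue) {
        if (size == 0) {
            return emptyValue;
        }
        int value = slots[head];
        head = (head + 1) % slots.length;
        size--;
        return value;
    }

    int size() {
        return size;
    }
}
```

Returning a caller-chosen sentinel from poll avoids both boxing the result into an Integer and throwing an exception on the empty case.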
Writing Java code this way is not easy, but certainly possible.
I am developing a platform to do just that.
Managing memory in my own code is not so difficult.
It's not difficult - It's impossible. For example:
public void foo() {
    Object o = new Object();
    // free(o); // Doh! No "free" keyword in Java.
}
Without the aid of the garbage collector how can the memory consumed by o be reclaimed?
I'm assuming from your question that you might want to avoid the sporadic pauses caused by garbage collection due to the high level of I/O being performed by your app. If this is the case there are techniques for minimising the number of objects created (e.g. re-using objects from a pool). You could also consider enabling the Concurrent Mark Sweep Collector.
The concurrent mark sweep collector, also known as the concurrent collector or CMS, is targeted at applications that are sensitive to garbage collection pauses.
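The object-pool technique mentioned above can be sketched as follows. SimplePool is an illustrative class, not a library API; it is not thread-safe, and callers are responsible for resetting an object's state before reusing it.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

final class SimplePool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    /** Creates the pool and preallocates the given number of instances. */
    SimplePool(Supplier<T> factory, int preallocate) {
        this.factory = factory;
        for (int i = 0; i < preallocate; i++) {
            free.push(factory.get());
        }
    }

    /** Hands out a pooled instance, creating a new one only if the pool is empty. */
    T acquire() {
        T pooled = free.poll();
        return pooled != null ? pooled : factory.get();
    }

    /** Returns an instance to the pool so that it can be reused instead of discarded. */
    void release(T instance) {
        free.push(instance);
    }

    int available() {
        return free.size();
    }

    public static void main(String[] args) {
        SimplePool<StringBuilder> pool = new SimplePool<>(StringBuilder::new, 4);
        StringBuilder sb = pool.acquire();
        sb.setLength(0); // reset state before reuse
        sb.append("hello");
        pool.release(sb);
        System.out.println("available: " + pool.available());
    }
}
```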
It's very hard (but not impossible) to disable GC in a JVM.
Look at the JNI "critical" functions for hints.
You can also essentially ensure you don't GC by not allocating any more objects (write a JVMTI agent that slaps you if you do, and instrument your code).
Finally, you can force a fatal OutOfMemoryError by ensuring that every object you allocate is never freed, thus when you hit -Xmx memory used, you'll fall over as GC won't be able to reclaim anything (mind you, you'll GC one or more times at this point before you fall over in a heap).
The real question is why you'd want to? What upside do you see in doing it? Is it for realtime? If so, I'd consider looking at one of the several realtime JVMs available on the market (Oracle, IBM, & others all sell them). I can't honestly think of another reason to do this while still using Java.
The only way you are going to be able to turn off garbage collection is to modify the JVM. This should be feasible with the OpenJDK 6 codebase.
However, what you will get at the end is a JVM that leaks memory like crazy, with no reasonable hope of fixing the leaks. The Java class library APIs are designed and implemented on the assumption that there is a GC taking care of memory management. This is so fundamental that any serious attempt to "fix" it would lead to a language / library that is not recognizable as Java.
If you want a non-garbage collected language, use C or C++.
Modern JVMs are so good at handling short-lived objects that any scheme you devise on your own will be slower.
This is because the objects you handle yourself will become long-lived and receive extra deluxe treatment from the JVM in terms of being moved around etc. Of course, this is by the garbage collector, which you want to turn off, but you can do very little without any gc.
So, before you start considering which optimization to use, establish a baseline: take a large, unoptimized program and profile it. Then do your tweaks and see if they help; you will never know if you do not have a baseline.
As other people have mentioned you can't disable the GC. However, you can choose to use the experimental 'Epsilon' garbage collector which will never actually perform any garbage collections. Warning: it will crash if your JVM runs out of memory (because it's not doing any garbage collections).
There's more info (including the command-line switch to use) at:
http://openjdk.java.net/jeps/318
Good luck!
Garbage collection is automated memory management in Java, so you cannot disable the GC.
Since you say, "its all about predictability not straight line speed," you should look at using a realtime Java system with deterministic garbage collection.