I would like to run a Java program with garbage collection switched off. Managing memory in my own code is not so difficult.
However the program needs quite a lot of I/O.
Is there any way (short of using JNI for all I/O operations) that I could achieve this using pure Java?
Thanks
Daniel
What you are trying to achieve is frequently done in investment banking to develop low-latency real-time systems.
To avoid GC you simply need to make sure not to allocate memory after the startup and warm-up phase of your application.
As you seem to have noticed Java NIO internally does unwanted memory allocation.
Unfortunately, you have no choice but write JNI replacements for the problematic calls.
You need at least to write a replacement for the NIO Selector.
You will have to avoid using most of the Java libraries due to similar unwanted memory allocations.
For example you will have to avoid using immutable object like String, avoid Boxing, re-implement Collections that preallocate enough entries for the whole lifetime of your program.
Writing Java code this way is not easy, but certainly possible.
I am developing a platform to do just so.
Managing memory in my own code is not
so difficult.
It's not difficult - It's impossible. For example:
public void foo() {
Object o = new Object();
// free(o); // Doh! No "free" keyword in Java.
}
Without the aid of the garbage collector how can the memory consumed by o be reclaimed?
I'm assuming from your question that you might want to avoid the sporadic pauses caused by garbage collection due to the high level of I/O being performed by your app. If this is the case there are techniques for minimising the number of objects created (e.g. re-using objects from a pool). You could also consider enabling the Concurrent Mark Sweep Collector.
The concurrent mark sweep collector,
also known as the concurrent collector
or CMS, is targeted at applications
that are sensitive to garbage
collection pauses.
It's very hard (but not impossible) to disable GC in a JVM.
Look at the JNI "critical" functions for hints.
You can also essentially ensure you don't GC by not allocating any more objects (write a JVMTI agent that slaps you if you do, and instrument your code).
Finally, you can force a fatal OutOfMemoryError by ensuring that every object you allocate is never freed, thus when you hit -Xmx memory used, you'll fall over as GC won't be able to reclaim anything (mind you, you'll GC one or more times at this point before you fall over in a heap).
The real question is why you'd want to? What upside do you see in doing it? Is it for realtime? If so, I'd consider looking at one of the several realtime JVMs available on the market (Oracle, IBM, & others all sell them). I can't honestly think of another reason to do this while still using Java.
The only way you are going to be able to turn off garbage collection is to modify the JVM. This is should be feasible with OpenJDK 6 codebase.
However, the what you will get at the end is a JVM that leaks memory like crazy, with no reasonable hope of fixing the leaks. The Java class library APIs are designed and implemented on the assumption that there is a GC taking care of memory management. This is so fundamental that any serious attempt to "fix" it would lead to a language / library that is not recognizable as Java.
If you want a non-garbage collected language, use C or C++.
Modern JVM's are so good at handling short-lived objects that any scheme you devise on your own will be slower.
This is because the objects you handle yourself will become long-lived and receive extra deluxe treatment from the JVM in terms of being moved around etc. Of course, this is by the garbage collector, which you want to turn off, but you can do very little without any gc.
So, before you start considering what optimization to use, then establish a baseline where you have a large unoptimized, program and profile it. Then do your tweaks, and see if it helps, but you will never know if you do not have a baseline.
As other people have mentioned you can't disable the GC. However, you can choose to use the experimental 'Epsilon' garbage collector which will never actually perform any garbage collections. Warning: it will crash if your JVM runs out of memory (because it's not doing any garbage collections).
There's more info (including the command-line switch to use) at:
http://openjdk.java.net/jeps/318
Good luck!
GarbageCollection is automated memory management in java.So you can not disable GC
Since you say, "its all about predictability not straight line speed," you should look at using a realtime Java system with deterministic garbage collection.
Related
Many monitoring tools, like the otherwise phantastic JavaMelody, just monitor the current memory usage. If you want to check for memory leaks or impending out of memory situations, this is not particularily helpful, if you have an application that generates loads of garbage which gets collected immediately. Not perfect, but IMHO much more interesting, would it be to monitor the memory usage immediately after a major garbage collection. If that's high, a crash is looming over you.
So: can you find out the memory usage immediately after the last major garbage collection - either from Java code or via JMX? I know there are some tools like VisualVM which do this (which is no option for production use), and it can be written in the garbage collection log, but I'm looking for a more straightforward solution than parsing the garbage collection logfile. :-) To be clear: I'm looking for something that can easily be used in any application in production, not any expensive tool for debugging.
In case that matters: JDK 7 with -XX:+UseConcMarkSweepGC , but I am interested in general answers, too.
Information about memory available right after gc (youg or old) is available via JMX.
Garbage collector MBean has attribute LastGcInfo which is composite data object including information about memory pool sizes before and after GC.
In addition, starting with Java 7 JMX notification subscription could be used to receive GC events without polling.
You can find example of code working with GC MBean here.
Probably 'Dynatrace' is an option... it's a very powerful monitoring tool (not only for memory).
http://www.dynatrace.com/en/index.html
A very crude way would be to monitor the minima of Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory() for some time. At least that would not require you to know intimate details about memory pools, as monitoring LastGcInfo in Alexey Ragozin's answer does. This might require you to get notifications about garbage collections.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Creating a memory leak with Java
There is a "Garbage Collector" in Java, but does this mean that memory leaks are totally absent in a Java applications? If not, how and why do they happen?
I am more interested in scenarios in applications using JavaSE.
No - memory leaks can still exists in Java. They are just of a "different kind".
Wiki: Memory Leak
A memory leak, in computer science (or leakage, in this context), occurs when a computer program consumes memory but is unable to release it [the memory] back to the operating system.
In the case of Java it (normally) is when an unused/unneeded object is never made eligible for reclamation. For instance, an object may be stashed in a global List and never removed even if the object is never accessed later. In this case the JVM won't release the object/memory - it can't - because the object might be needed later, even if it never is.
(As an aside, some objects, such as directly allocated ByteBuffers also consume "out of JVM heap" memory which might not be reclaimed in a timely manner due to the nature of finalizers and memory pressure.)
In the case of Java, a "memory leak" is a semantic issue and not so much an issue of "not being able to release under any circumstances". Of course, with buggy JNI/JNA code, all bets are off ;-)
Happy coding.
Depends on how you define memory leak.
If you specifically mean having allocated memory that is no longer referenced by some memory root, then no, the garbage collector will eventually clean all of those up.
If you mean generally having your memory footprint grow without bound, that is easily possible. Just have some collection referenced by a static field and constantly added to.
Memory leaks in java are very possible. Here is a good article which has an example using core java. Fundamentally, a memory leak happens in java when the garbage collector cannot reclaim an object because the application holds a reference to it that it won't release, even though the object itself might no longer be used. The easiest way to create a memory leak in java is to have your application hold a reference to something, but not using it.
In the example, the unused object is a static List, and adding things to that list will eventually cause the JVM to run out of memory. Static collections are a pretty common source of "leaks", as they are typically long lived and mutable.
There are a few good responses so far. I don't want to recreate those posts, so I'll just add that one thing most people don't think about in connection with this subject is leaks in native code running via JNI. Native code running via JNI uses the JVM's heap space to allocate memory. So, if your application uses native code running via JNI that has a leak, your application has a leak.
Any object that has one or more live references to it will not be garbage-collected. So as long as some variable (either static, in the heap, or in the stack) refers to an object, that object will continue to occupy non-reclaimable memory space.
Unclosed resources (like sockets, JDBC connections, etc.) and constantly growing static collections are some of the better-known leak producers.
I want to know about the efficiency of Java and the advantages and disadvantages of Java Virtual Machine and Android.
Efficiency is the low use of memory, low use of the processor and fast execution.
Mobile devices are simpler than PC, then the apps need to be more efficient. Servers receive many connections and they need to be very efficient. Many mobile devices use Android and Java apps, and many servers use PHP.
Can Java and interpreted languages, such as Java Script, Python and PHP, be more efficient than C and C++?
JIT (just in time) advantages:
It can optimize better, because it knows the value of some variables and where it is used or changed.
It knows the processor and can optimize with processor specific instructions.
It is easier to transform functions into inline function.
It can remove known conditional tests and remove blocks that will not be run.
Java disadvantages:
When the app run for the first time, the app will be very slow, because the bytecodes will be interpreted and JIT compiler will do many analysis to find good optimizations. The apps cannot use the maximum of the hardware power. If an app is a game or a real-time app, if it be run for the first successfully and with no delay, but it uses the maximum of the hardware power, then the next time the app be run, it will not use the maximum of the hardware power due to optimizations. The problem is the app cannot be designed to use the maximum of the hardware power after the optimization, because it will be too slow on the first run, and will not continue to run.
Java checks if the array index is not out of bounds, and it checks if the pointers are not null. It will add several internal "if"s to generated code.
All objects use garbage collector, including objects that are very easy to manually delete.
All instances of objects are created with dynamic memory allocation, including objects that can easily use the stack. If a loop iteration begins creating an instance of a class and ends deleting the created object, dynamic memory allocation will be inefficient.
The garbage collector needs to stop the app while it cleans the memory and it is very undesired for games, GUI apps and real-time apps. Reference counting is slow and it cannot handle circular references. Multi-threaded garbage collector is slower and it needs more use of the CPU.
Can Java and interpreted languages, such as Java Script, Python and PHP, be more efficient than C and C++?
It's very difficult to get more efficient than the best C and C++ programs. There's a lot of C and C++ programs that are nowhere near as efficient as that though, and beating them with (modern) Java code is quite practical if you're any good.
I've also heard good things about the current best-of-breed Javascript engines, but I've never studied them in detail.
With Python and PHP (and many other languages besides) it's a bit different. Those languages are written in C, so it's obvious they cannot be more efficient than C (follows by construction). Yet it's much easier to write efficient code in them (i.e., that uses what is in-effect a very well-written C library) than it is to start from scratch. In particular, it reduces the number of defects per program. That's a very important metric in practice; anyone can produce fast code if it's allowed to be wrong.
In general, I advise not worrying about getting maximal efficiency. You run up against the law of diminishing returns. Instead, use sensible overall algorithms (or, as a friend of mine once said to me, “look after the big O()s and let the constant factors look after themselves”) and focus on the question of whether the program is good enough in practice. Once it is, stop fiddling around and ship it!
Let's pick apart your claimed disadvantages:
When the app run for the first time, the app will be very slow, because the bytecodes will be interpreted and JIT compiler will do many analysis to find good optimizations. The apps cannot use the maximum of the hardware power.
JIT compilation is an implementation issue. Not all platforms do it. Indeed, the Android platform could be modified to 1) do ahead of time compilation, or 2) cache the native code produced by the JIT to give faster startup next time you run the app.
It is interesting that various Java vendors have tried these strategies at various times, and yet the empirical evidence is that plain JIT is the best strategy.
Java checks if the array index is not out of bounds, and it checks if the pointers are not null. It will add several internal "if"s to generated code.
The JIT compiler can optimize away many of these tests. For the rest, the overheads tend to be relatively small; e.g. a few percent difference ... not a factor of 2.
Note that the alternative to checking is the risk that typical application bugs will crash the android platform. Certainly, garbage collection becomes problematic if applications can trash memory.
All objects use garbage collector, including objects that are very easy to manually delete.
The flip-side is that it is easy to forget to delete objects, delete objects twice, use them after they have been deleted and so on. These mistakes all lead to bugs that tend to be hard to track down.
All instances of objects are created with dynamic memory allocation, including objects that can easily use the stack. If a loop iteration begins creating an instance of a class and ends deleting the created object, dynamic memory allocation will be inefficient.
Java dynamic memory allocation and object creation is FAST. Faster than in C++ for example.
The garbage collector needs to stop the app while it cleans the memory and it is very undesired for games, GUI apps and real-time apps.
Use a concurrent / low-pause garbage collector then. Another approach is to implement your app to not generate lots of garbage ... and seldom trigger garbage collection.
Reference counting is slow and it cannot handle circular references.
No decent Java GC uses reference counting. (On the other hand, a lot of C / C++ manual memory management schemes do. For instance, so-called smart pointer schemes in C++.)
Multi-threaded garbage collector is slower and it needs more use of the CPU.
You actually mean concurrent collection I think. Yes it does, but that's the penalty you pay for the extra responsiveness that you demand for interactive games / realtime apps.
What you describe as 'efficient' I would describe as 'ideal'. An application that requires little memory, little CPU time and runs quickly, put another way, is one that is good, fast, and cheap all at the same time. Never mind if it does anything useful or interesting.
The only comparison I'd view as reasonable, if all three goals are required, is among applications that produce a common result. In that case, it is unlikely, given a competing group of evenly-capable programmers, that any one implementation would excel on all three counts over the others.
That said, your question leaves out a key criterion to the mobile market: rate of application development. Mobile applications also profit far more from positive user experience than back-end optimization. Without that constraint, the question of efficiency as you put it, seems to me more of an ponderous consideration than a practical one.
But to the actual question: can a language like Java produce more efficient code than one that compiles statically to the instruction set of the target machine? Probably not. Can it be as efficient, or efficient enough? Absolutely. If we considered an execution platform with fixed, severely constrained resources that changes infrequently, it would be a different matter.
In any language, the way to get fast execution is to do the job with as little execution as possible, and as little garbage collection as possible.
That sounds like a vacuous generality, but what it means in practice, regardless of language, is
For the data structure design, keep it as simple as possible. Stay away from the fancy collection classes full of bells and whistles. Especially stay away from notifications as a way of keeping data consistent. If your data is normalized, it can never be inconsistent. If you can't normalize it, it's better to tolerate temporary inconsistency, than to try to keep it tight with notifications.
Performance problems creep in, even into the best code. You should try not to make them, but you will still make them. Most important is knowing how to find them, once made, and remove them. Here's a blow-by-blow example. If in doing this, you find you need a better big-O algorithm, then put it in. Putting one in without being sure it's needed is a recipe for slowness.
No language can rescue a program from non-removed performance problems. The language and its compiler, JITter, etc. are like a race horse. It's fine to want a good horse, but it's a waste if the jockey isn't as slim as possible.
Your program is the jockey, and it's your job to take it on a weight-loss program.
I will paste an interesting answer given by the James Gosling himself in the Book Masterminds of Programming.
Well, I’ve heard it said that
effectively you have two compilers in
the Java world. You have the compiler
to Java bytecode, and then you have
your JIT, which basically recompiles
everything specifically again. All of
your scary optimizations are in the
JIT.
James: Exactly. These days we’re
beating the really good C and C++
compilers pretty much always. When you
go to the dynamic compiler, you get
two advantages when the compiler’s
running right at the last moment. One
is you know exactly what chipset
you’re running on. So many times when
people are compiling a piece of C
code, they have to compile it to run
on kind of the generic x86
architecture. Almost none of the
binaries you get are particularly well
tuned for any of them. You download
the latest copy of Mozilla,and it’ll
run on pretty much any Intel
architecture CPU. There’s pretty much
one Linux binary. It’s pretty generic,
and it’s compiled with GCC, which is
not a very good C compiler.
When HotSpot runs, it knows exactly
what chipset you’re running on. It
knows exactly how the cache works. It
knows exactly how the memory hierarchy
works. It knows exactly how all the
pipeline interlocks work in the CPU.
It knows what instruction set
extensions this chip has got. It
optimizes for precisely what machine
you’re on. Then the other half of it
is that it actually sees the
application as it’s running. It’s able
to have statistics that know which
things are important. It’s able to
inline things that a C compiler could
never do. The kind of stuff that gets
inlined in the Java world is pretty
amazing. Then you tack onto that the
way the storage management works with
the modern garbage collectors. With a
modern garbage collector, storage
allocation is extremely fast.
I've read in many threads that it is impossible to turn off garbage collection on Sun's JVM. However, for the purpose of our research project we need this feature. Can anybody recommend a JVM implementation which does not have garbage collection or which allows turning it off? Thank you.
I wanted to find a fast way to keep all objects in memory for a simple initial proof of concept.
The simple way to do this is to run the JVM with a heap that is so large that the GC never needs to run. Set the -Xmx and -Xms options to a large value, and turn on GC logging to confirm that the GC doesn't run for the duration of your test.
This will be quicker and more straightforward than modifying the JVM.
(In hindsight, this may not work. I vaguely recall seeing evidence that implied that the JVM does not always respect the -Xms setting, especially if it was really big. Still, this approach is worth trying before trying some much more difficult approach ... like modifying the JVM.)
Also, this whole thing strikes me as unnecessary (even counter-productive) for what you are actually trying to achieve. The GC won't throw away objects unless they are garbage. And if they are garbage, you won't be able to use them. And the performance of a system with GC disabled / negated is not going to indicative of how a real application will perform.
UPDATE - From Java 11 onwards, you have the much simpler option of using the Epsilon (no-op) garbage collector; see
JEP 318: Epsilon: A No-Op Garbage Collector (Experimental)
You add the following options when you launch the JVM:
-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC
When the heap is filled, no attempt is made to collect garbage. Instead, the Epsilon GC terminates the JVM.
Depending on your needs this could perhaps work:
Using the -Xbootclasspath option you may specify your own implementation of API classes. You could then for instance override the implementation of Object, and add to the constructor, a globalList.add(this) to prevent the objects from being garbage collected. It's a hack for sure, but for simple case-study it's perhaps sufficient.
Another option is to take an open source jvm and comment out the parts that initiate garbage collection. I would guess it is not that complicated.
Sun's JVM has no such option. AFAIK, no other JVM has this option either.
You did not state what it is that you are exactly trying to achieve but you have one of two options: either use a profiler and see exactly what the GC is doing, that way you can take its effects into consideration. The other is to compile one of the JVMs from source, and disable GC from there.
You can only turn off the GC if its not actually needed (otherwise your application would run out of memory) and if you didn't need to GC, it shouldn't run anyway.
The simplest option would be to not discard any objects, this will avoid GC being performed (And set the max memory very high so you don't run out).
You may find that you get GCs on startup and you may consider a no-GC when running acceptable.
the question is old but for those who might be interested, there is a proposal to
Develop a GC that only handles memory allocation, but does not implement any actual memory reclamation mechanism. Once available Java heap is exhausted, perform the orderly JVM shutdown.
JEP draft: Epsilon GC: The Arbitrarily Low Overhead Garbage (Non-)Collector
Maybe you could try making your VM's available memory sufficient for GC never to be run.
My (allbeit limited) experience leads me to suggest that the VM is, by default, extremely lazy and extremely reluctant to run GC.
giving -Xmx 16384M (or some such) and making sure that your research subject stays well below that limit, might give you the environment you wish to obtain, allthough even then it will obviously not be guaranteed.
There actually exists a dirty hack to temporarily pause GC. First create a dummy array in Java. Then, in JNI, use GetPrimitiveArrayCritical function to get hold of the pointer to the array. The Sun JVM will disable GC to ensure that the array is never moved and the pointer stays valid. To re-enable GC, you can call the ReleasePrimitiveArrayCritical function on the pointer. But this is very implementation specific since other VM impl may pin the object instead of disabling GC entirely. (Tested to work on Oracle Jdk 7 & 8)
Take a look at Oracle's JRockit JVM. I've seen very good near-deterministic performance on Intel hardware with this JVM and you can prod and poke the runtime using the Mission Control utility to see how well it's performing.
Though you can't turn GC off completely, I believe that you can use the -Xnoclassgc option to disable the collection of classes. The GC can be tuned to minimize latency at the expense of leaving memory consumption to grow. You may need a license to drop the latency as low as you need if you're going this route.
There is also a Realtime version of the JRockit JVM available but I don't think that there is a free-to-developers version of this available.
Can you get an open source JVM and disable its GC, for example Sun's Hotspot?
If there was no Garbage Collection what would you expect to be the semantics of code like this?
public myClass {
public void aMethod() {
String text = new String("xyz");
}
}
In the absence of GC any item newed and with a stack scoped reference could never be reclaimed. Even if your own classes could decide not to use local variables like this, or to use only primitive types I don't see how you would safely use any standard Java library.
I'd be interested to hear more about your usage scenario.
If I had this problem I would get IBM's Jikes Research Virtual Machine because:
The run-time system is written in Java itself (with special extensions)
The whole thing was designed as a research vehicle and is relatively easy to tweak.
You can't turn off GC forever, because Java programs do allocate and eventually you'll run out of memory, but it's quite possible that you can delay GC for the duration of your experiment by telling the JVM not to start collecting until the heap gets really big. (That trick might work on other JVMs as well, but I wouldn't know where to find the knobs to start twirling.)
This may be a bit off topic of "right answer, not discussion."
However, I am trying to debug my thought process, so maybe someone can help me:
I use compilers all the time, and the fact that I'm giving up control over machine code generation (the layout of my caches, and the flow of electrons) does not bother me.
However, giving up control of memory layout (being able to place stuff in memory) and memory management (garbage collection) still bothers me these days.
Have others dealt with this? If so, how did you get past it? (In particular, how I often feel "safer" in C++ than in Java.)
Thanks!
Your feeling is, naturally, very subjective.
You might feel comfortable managing your own memory space in C++.
Others might appreciate the easiness of Java managing the heap for you, and reducing memory management overhead to a minimum.
Programming domain has an influence as well. For example, in an embedded environment, you most likely will not have the privilege to enjoy a garbage collection mechanism, leaving you to manage your own memory, whether you like it or not.
Bottom line - subjective and domain-dependent.
Confront your nightmare! Profile a busy application in NetBeans and watch the garbage collector do its job.
If you trust the JVM with code generation, why not trust it with data generation too?
Please note that things like cache sizes on CPU may influence the optimal placement of your objects, and that the JIT basically knows better than you because it can measure and take action in the process.
If you've ever used COM under C++ its really no different to using "Release()". The momory may or may not be freed right then or it may be freed somewhere down the line when the thing using it has finished using it.
Best thing to do is just assume it works and stop worrying about it.
The original poster asked about (a) memory layout and (b) memory management. The previous answers only talk about memory management.
Regarding memory layout, the keyword to search for seems to be "struct".
C and C++ both have memory layout control. D should as well.
It appears (based on a quick search) Java does not.
C# has grants memory layout control via structs. See:
Stack Overflow: incorrect members order in a C# structure
http://www.developerfusion.com/article/84519/mastering-structs-in-c/
Go's data structures are called "structs", but I cannot tell if they grant control over memory layout. (I suspect they do, but have not been able to confirm this.)
I welcome any corrections/additions to the above.
(And regarding memory management, I'm quite happy to let the language/platform do it.)