Java JNI: What's the benefit of GetString() vs. GetStringCritical()?

According to the JNI docs, GetStringCritical() disables garbage collection while you hold on to memory managed by the JVM. Using it instead of GetString() puts your native code at risk of deadlock if you call back into the Java layer or perform a blocking operation before ReleaseStringCritical() is invoked. So what's the benefit of the Critical function, and is it synonymous with disabling garbage collection while you access JVM-managed memory?

First a correction: according to the specs, the GetXxxCritical() methods may disable garbage collection.
So what's the benefit of the Critical function,
The advantage of using the GetXxxCritical() methods is that they are more likely to return a pointer to the original data rather than a copy of the data. This means that they are likely to be faster. Of course, this comes at a cost in terms of the documented restrictions / caveats.
... and is it synonymous with disabling garbage collection while you access JVM-managed memory?
Well no; see my correction. The methods may disable garbage collection, or they may not. It will depend on how your platform implements the JNI interfaces, and possibly on other things such as JVM garbage collector options.

Related

Out of memory errors in Java API that uses finalizers in order to free memory allocated by C calls

We have a Java API that is a wrapper around a C API.
As such, we end up with several Java classes that are wrappers around C++ classes.
These classes implement the finalize method in order to free the memory that has been allocated for them.
Generally, this works fine. However, in high-load scenarios we get out of memory exceptions.
Memory dumps indicate that virtually all the memory (around 6 GB in this case) is filled with the finalizer queue and the objects waiting to be finalized.
For comparison, the C API on its own never goes over around 150 MB of memory usage.
Under low load, the Java implementation can run indefinitely. So this doesn't seem to be a memory leak as such. It just seems to be that under high load, new objects that require finalizing are generated faster than finalizers get executed.
Obviously, the 'correct' fix is to reduce the number of objects being created. However, that's a significant undertaking and will take a while. In the meantime, is there a mechanism that might help alleviate this issue? For example, by giving the GC more resources.
Java was designed around the idea that finalizers could be used as the primary cleanup mechanism for objects that go out of scope. Such an approach may have been almost workable when the total number of objects was small enough that the overhead of an "always scan everything" garbage collector would have been acceptable, but there are relatively few cases where finalization would be an appropriate cleanup measure in a system with a generational garbage collector (which nearly all JVM implementations are going to have, because it offers a huge speed boost compared to always scanning everything).
Using Closeable along with the try-with-resources construct is a vastly superior approach whenever it's workable. There is no guarantee that finalize methods will get called with any degree of timeliness, and there are many situations where patterns of interrelated objects may prevent them from getting called at all. While finalize can be useful for some purposes, such as identifying objects which got improperly abandoned while holding resources, there are relatively few purposes for which it would be the proper tool.
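For illustration, here is a minimal sketch (all names hypothetical, native calls replaced by stubs) of the Closeable/try-with-resources approach: the wrapper releases its native memory deterministically when the block exits, so nothing accumulates on the finalizer queue.

```java
// A minimal sketch of deterministic cleanup via AutoCloseable instead of finalize().
public class NativeBuffer implements AutoCloseable {
    private long handle;                          // pointer returned by the C library

    public NativeBuffer(long size) {
        handle = allocateNative(size);            // stand-in for the real JNI call
    }

    @Override
    public void close() {
        if (handle != 0) {
            freeNative(handle);                   // stand-in for the real JNI call
            handle = 0;                           // make close() idempotent
        }
    }

    // Stubs standing in for the real native bindings.
    private static long allocateNative(long size) { return size; }
    private static void freeNative(long handle) { }

    public static void main(String[] args) {
        // The buffer is freed as soon as the block exits, without involving the GC.
        try (NativeBuffer buf = new NativeBuffer(1024)) {
            // ... use buf ...
        }
    }
}
```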
If you do need to use finalizers, you should understand an important principle: contrary to popular belief, finalizers do not trigger when an object is actually garbage collected; they fire when an object would have been garbage collected but for the existence of a finalizer somewhere [including, but not limited to, the object's own finalizer]. No object can actually be garbage collected while any reference to it exists in any local variable, in any other object to which any reference exists, or in any object with a finalizer that hasn't run to completion. Further, to avoid having to examine all objects on every garbage-collection cycle, objects which have been alive for a while will be given a "free pass" on most GC cycles. Thus, if an object with a finalizer is alive for a while before it is abandoned, it may take quite a while for its finalizer to run, and it will keep objects to which it holds references around long enough that they're likely to also earn a "free pass".
I would thus suggest that, to the extent possible, even when it's necessary to use finalizers, you should limit their use to privately-held objects which in turn avoid holding strong references to anything which isn't explicitly needed for their cleanup task.
Phantom references are an alternative to finalizers available in Java.
Phantom references allow you to better control resource reclamation process.
you can combine explicit resource disposal (e.g. the try-with-resources construct) with GC-based disposal
you can employ multiple threads for postmortem housekeeping
Using phantom references is complicated, though. In this article you can find a minimal example of phantom-reference-based resource housekeeping.
In modern Java there is also the Cleaner class, which is also based on phantom references but provides the infrastructure (reference queue, worker threads, etc.) for ease of use.
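For illustration, here is a minimal sketch (names hypothetical, native calls replaced by stubs) of the Cleaner-based approach: explicit disposal via close(), with the Cleaner acting only as a GC-driven safety net.

```java
import java.lang.ref.Cleaner;

// A minimal sketch of Cleaner-based housekeeping for a hypothetical native resource.
public class CachedResource implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup state must not reference the owning CachedResource,
    // otherwise the resource could never become phantom reachable.
    private static final class State implements Runnable {
        private final long handle;                 // hypothetical native handle
        State(long handle) { this.handle = handle; }
        @Override public void run() { freeNative(handle); }
    }

    private final Cleaner.Cleanable cleanable;

    public CachedResource(long size) {
        long handle = allocateNative(size);        // stand-in for the real native call
        this.cleanable = CLEANER.register(this, new State(handle));
    }

    // Explicit disposal (e.g. via try-with-resources). Cleanable.clean() runs the
    // action at most once, so the Cleaner is only a fallback for instances the
    // caller forgot to close.
    @Override
    public void close() {
        cleanable.clean();
    }

    // Stubs standing in for the real native bindings.
    private static long allocateNative(long size) { return size; }
    private static void freeNative(long handle) { }
}
```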

Why is the finalize() method deprecated in Java 9?

(This question is different from Why would you ever implement finalize()? This question is about deprecation from the Java platform, and the other question is about whether one should use this mechanism in applications.)
Why is the finalize() method deprecated in Java 9?
Yes, it could be used in the wrong way (like saving an object from garbage collection [only once, though], or trying to close some native resources in it [which is better than not closing them at all]), but so could many other methods.
So is finalize() really so dangerous or absolutely useless that it's necessary to kick it out of Java?
Although the question was asking about the Object.finalize method, the subject really is about the finalization mechanism as a whole. This mechanism includes not only the surface API Object.finalize, but it also includes specifications of the programming language about the life cycle of objects, and practical impact on garbage collector implementations in JVMs.
Much has been written about why finalization is difficult to use from the application's point of view. See the questions Why would you ever implement finalize()? and Should Java 9 Cleaner be preferred to finalization? and their answers. See also Effective Java, 3rd edition by Joshua Bloch, Item 8.
Briefly, some points about the problems associated with using finalizers are:
they are notoriously difficult to program correctly
in particular, they can be run unexpectedly when an object becomes unreachable unexpectedly (but correctly); for example, see my answer to this question
finalization can easily break subclass/superclass relationships
there is no ordering among finalizers
a given object's finalize method is invoked at most once by the JVM, even if that object is "resurrected" (see the sketch after this list)
there are no guarantees about timeliness of finalization or even that it will run at all
there is no explicit registration or deregistration mechanism
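As an illustration of the "at most once" point above, here is a minimal sketch showing resurrection: the finalizer makes the object reachable again, yet finalize() never runs a second time. (The System.gc() calls and sleeps are only hints used to make the behavior observable; they are not guaranteed to trigger anything.)

```java
// A minimal, illustrative sketch of "resurrection": the finalizer stores a
// reference to `this` in a static field, so the object becomes reachable again.
// Its finalize() is not invoked a second time when it is later collected for good.
public class Resurrector {
    static Resurrector zombie;              // strong reference set by the finalizer

    @Override
    protected void finalize() {
        System.out.println("finalize() called");
        zombie = this;                      // resurrect the object
    }

    public static void main(String[] args) throws InterruptedException {
        new Resurrector();                  // immediately unreachable
        System.gc();                        // request a GC (only a hint)
        Thread.sleep(500);                  // give the finalizer thread time to run

        zombie = null;                      // drop the resurrected reference again
        System.gc();
        Thread.sleep(500);                  // finalize() is NOT called a second time
    }
}
```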
The above are difficulties with the use of finalization. Anyone who is considering using finalization should reconsider, given the above list of issues. But are these issues sufficient to deprecate finalization in the Java platform? There are several additional reasons explained in the sections below.
Finalization Potentially Makes Systems Fragile
Even if you write an object that uses finalization correctly, it can cause problems when your object is integrated into a larger system. Even if you don't use finalization at all, being integrated into a larger system, some parts of which use finalization, can result in problems. The general issue is that worker threads that create garbage need to be in balance with the garbage collector. If the garbage collector falls behind, at least some collectors can "stop the world" and do a full collection to catch up. Finalization complicates this interaction. Even if the garbage collector is keeping up with application threads, finalization can introduce bottlenecks and slow down the system, or it can cause delays in freeing resources that result in exhaustion of those resources. This is a systems problem. Even if the actual code that uses finalization is correct, problems can still occur in correctly programmed systems.
(Edit 2021-09-16: this question describes a problem where a system works fine under low load but fails under high load, likely because the relative rate of allocation outstrips the rate of finalization under high load.)
Finalization Contributes to Security Issues
The SEI CERT Oracle Coding Standard for Java has a rule MET12-J: Do not use finalizers. (Note, this is a site about secure coding.) In particular, it says
Improper use of finalizers can result in resurrection of garbage-collection-ready objects and result in denial-of-service vulnerabilities.
Oracle's Secure Coding Guidelines for Java SE is more explicit about potential security issues that can arise using finalization. In this case it is not a problem with code that uses finalization. Instead, finalization can be used by an attacker to attack sensitive code that hasn't properly defended itself. In particular, Guideline 7-3 / OBJECT-3 states in part,
Partially initialized instances of a non-final class can be accessed via a finalizer attack. The attacker overrides the protected finalize method in a subclass and attempts to create a new instance of that subclass. This attempt fails ... but the attacker simply ignores any exception and waits for the virtual machine to perform finalization on the partially initialized object. When that occurs the malicious finalize method implementation is invoked, giving the attacker access to this, a reference to the object being finalized. Although the object is only partially initialized, the attacker can still invoke methods on it....
Thus, the presence of the finalization mechanism in the platform imposes a burden on programmers who are trying to write high assurance code.
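For illustration, here is a minimal sketch of the finalizer attack described in the guideline (class names are hypothetical): the constructor's check throws, but the attacker's finalize() is still handed `this` for the partially initialized instance. Declaring the sensitive class final, or validating arguments before the superclass constructor runs, defends against this.

```java
// A minimal sketch of the finalizer attack (illustrative names only).
class SensitiveOp {
    SensitiveOp(boolean allowed) {
        if (!allowed) {
            throw new SecurityException("not permitted");
        }
    }
    void doPrivilegedThing() {
        System.out.println("sensitive operation executed");
    }
}

class Attack extends SensitiveOp {
    Attack() {
        super(false);                             // always throws
    }

    @Override
    protected void finalize() {
        // Invoked later by the GC on the partially initialized object.
        doPrivilegedThing();                      // attacker now holds `this`
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            new Attack();
        } catch (SecurityException expected) {
            // Ignore the failure and simply wait for finalization.
        }
        System.gc();                              // a hint only; for illustration
        Thread.sleep(500);
    }
}
```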
Finalization Adds Complexity to Specifications
The Java Platform is defined by several specifications, including specifications for the language, the virtual machine, and the class library APIs. The impact of finalization is spread thinly across all of these, but it repeatedly makes its presence felt. For example, finalization has a very subtle interaction with object creation (which is already complicated enough). Finalization has also appeared in Java's public APIs, meaning that evolution of those APIs has (up to now) been required to remain compatible with previously specified behaviors. Evolving these specifications is made more costly by the presence of finalization.
Finalization Adds Complexity to Implementations
This is mainly about garbage collectors. There are several garbage collection implementations, and all are required to pay the cost of implementing finalization. The implementations are quite good at minimizing the runtime overhead if finalization isn't used. However, the implementation still needs to be there, and it needs to be correct and well tested. This is an ongoing development and maintenance burden.
Summary
We've seen elsewhere that it's not recommended for programmers to use finalization. However, if something is not useful, it doesn't necessarily follow that it should be deprecated. The points above illustrate the fact that even if finalization isn't used, the mere presence of the mechanism in the platform imposes ongoing specification, development, and maintenance costs. Given the lack of usefulness of the mechanism and the costs it imposes, it makes sense to deprecate it. Eventually, getting rid of finalization will benefit everyone.
As of this writing (2019-06-04) there is no concrete plan to remove finalization from Java. However, it is certainly the intent to do so. We've deprecated the Object.finalize method, but have not marked it for removal. This is a formal recommendation that programmers stop using this mechanism. It's been known informally that finalization shouldn't be used, but of course it's necessary to take a formal step. In addition, certain finalize methods in library classes (for example, ZipFile.finalize) have been deprecated "for removal" which means that the finalization behavior of these classes may be removed from a future release. Eventually, we hope to disable finalization in the JVM (perhaps first optionally, and then later by default), and at some point in the future actually remove the finalization implementation from garbage collectors.
(Edit 2021-11-03: JEP 421 has just been posted, which proposes to deprecate finalization for removal. At this writing it's in the "candidate" state but I expect it will move forward. The deprecations added by this JEP are a formal notification that finalization will be removed at some point in a subsequent Java release. Perhaps not surprisingly, there's a fair overlap between the material in this answer and in the JEP, though the JEP is more precise and describes a moderate evolution in our thinking on the topic.)
(Edit 2022-04-04: JEP 421 Deprecate Finalization for Removal has been integrated and delivered in JDK 18.)

Is it possible to mark java objects non-collectable from gc perspective to save on gc-sweep time?

Is it possible to mark java objects non-collectable from gc perspective to save on gc-sweep time?
Something along the lines of http://wwwasd.web.cern.ch/wwwasd/lhc++/Objectivity/V5.2/Java/guide/jgdStorage.fm.html and specifically non-garbage-collectible containers there (non-garbage-collectable?).
The problem is that I have lots of ordinary temporary objects, but I also have even bigger objects (several gigabytes) that are stored for cache purposes. There is no reason for the Java GC to traverse all those cache gigabytes trying to find anything to collect, because they contain cached data which have their own timeouts.
This way I could partition my data in a custom way into infinite-lived and normal-lived objects, and hopefully the GC would be quite fast, because normal objects don't live long and add up to a smaller amount of memory.
There are some workarounds to this problem, such as Apache DirectMemory and the commercial Terracotta BigMemory (http://terracotta.org/products/bigmemory), but a Java-native solution would be nicer (I mean free and probably more reliable?). I also want to avoid serialization overhead, which means it should happen within the same JVM. To my understanding, DirectMemory and BigMemory operate mainly off heap, which means that the objects must be serialized/deserialized to/from memory outside the JVM. Simply marking non-GC regions within the JVM would seem a better solution. Using files for the cache is not an option either; it has the same unaffordable serialization/deserialization overhead. The use case is an HA server with lots of data used in random (human) order and low latency needed.
Any memory the JVM manages is also garbage-collected by the JVM. And any “live” objects which are directly available to Java methods without deserialization have to live in JVM memory. Therefore in my understanding you cannot have live objects which are immune to garbage collection.
On the other hand, the usage you describe should make the generational approach to garbage collection quite efficient. If your big objects stay around for a while, they will be checked for reclamation less often. So I doubt there is much to be gained from avoiding those checks.
Is it possible to mark java objects non-collectable from gc perspective to save on gc-sweep time?
No it is not possible.
You can prevent objects from being garbage collected by keeping them reachable, but the GC will still need to trace them to check reachability on each full GC (at least).
It is simply my assumption that when the JVM is starving it begins scanning all those unnecessary objects too.
Yes. That is correct. However, unless you've got LOTS of objects that you want to be treated this way, the overhead is likely to be insignificant. (And anyway, a better idea is to give the JVM more memory ... if that is possible.)
Quite simply, for you to be able to do this, the garbage collection algorithm would need to be aware of such a flag, and take it into account when doing its work.
I'm not aware of any of the standard GC algorithms having such a flag, so for this to work you would need to write your own GC algorithm (after deciding on some feasible way to communicate this information to it).
In principle, in fact, you've already started down this track - you're deciding how garbage collection should be done rather than being happy to leaving it to the JVM's GC algo. Is the situation you describe a measurable problem for you; something for which the existing garbage collection is insufficient, but your plan would work? Garbage collectors are extremely well-tuned, so I wouldn't be surprised if the "inefficient" default strategy is actually faster than your naively-optimal one.
(Doing manual memory management is tricky and error-prone at the best of times; managing some memory yourself while using a stock garbage collector to handle the rest seems even worse. I expect you'd run into a lot of edge cases where the GC assumes it "knows" what's happening with the whole heap, which would no longer be true. Steer clear if you can...)
The recommended approaches would be to use either a commercial RTSJ implementation to avoid GC, or to use off-heap memory. One could also look into soft references for caches as well (they do get collected).
This is not recommended:
If for some reason you do not believe these options are sufficient, you could look into direct memory access, which is UNSAFE (part of sun.misc.Unsafe). You can use the 'theUnsafe' field to get the 'Unsafe' instance. Unsafe allows you to allocate/deallocate memory via 'allocateMemory' and 'freeMemory'. This is not under GC control nor limited by the JVM heap size. The impact on the GC/application, once you go down this route, is not guaranteed - which is why using byte buffers might be the way to go (if you're not using an RTSJ-like implementation).
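For illustration only, and with the caveat above that this is not recommended, here is a minimal sketch of the Unsafe route on a JDK where sun.misc.Unsafe is still accessible via reflection; the allocated memory lives outside the GC-managed heap and must be freed manually.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// A minimal sketch of allocating memory outside the GC-managed heap via sun.misc.Unsafe.
public class OffHeapExample {
    public static void main(String[] args) throws Exception {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);
        Unsafe unsafe = (Unsafe) f.get(null);

        long address = unsafe.allocateMemory(1024);   // 1 KiB outside the Java heap
        try {
            unsafe.putLong(address, 42L);
            System.out.println(unsafe.getLong(address));
        } finally {
            unsafe.freeMemory(address);               // the GC will never reclaim this
        }
    }
}
```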
Hope this helps.
Live Java objects will always be part of the GC life cycle. Or, said another way, marking an object as non-GC would have the same order of overhead as having your object referenced by a root reference (a static final map, for instance).
But thinking a bit further, data put in a cache is most likely temporary, and will eventually be evicted. At that point you will start to appreciate the JVM and the GC again.
If you have 100's of GBs of permanent data, you may want to rethink the architecture of your application, and try to shard and distribute your data (horizontally scalability).
Last but not least, lots of work has been done around serialization, and the overhead of serialization (I'm not speaking about the poor reputation of ObjectInputStream and ObjectOutputStream) is not that big.
More than that, if your data is mainly composed of primitive types (including byte arrays), there are efficient ways to readInt() or readBytes() from off-heap buffers (for instance netty.io's ChannelBuffer). This could be a way to go.
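For illustration, here is a minimal sketch of reading and writing primitives in an off-heap buffer using only the standard API (rather than a third-party buffer class): the contents of a direct ByteBuffer are not Java objects, so the GC never traces them.

```java
import java.nio.ByteBuffer;

// A minimal sketch of storing primitives in off-heap (direct) memory.
public class DirectBufferCache {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(1024);
        buf.putInt(0, 12345);                 // absolute put at offset 0
        buf.putLong(4, 9876543210L);          // absolute put at offset 4
        System.out.println(buf.getInt(0));
        System.out.println(buf.getLong(4));
    }
}
```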

Preventing native memory leak by customizing garbage collector?

Let's say I'm writing an API in Java that refers to some native C libraries that require destructors to be called explicitly. If the destructors are not called, I run out of native memory.
Is there a way to protect users of my API from having to call the destructors explicitly, by having the garbage collector call the destructors somehow? (Perhaps based on some estimate I make of the size of the used native memory?)
I know Java doesn't have its garbage collector as part of the Java API, but perhaps there is some way to get this implemented?
One alternative if you have control over creation of your objects is to reference them with a WeakReference using the constructor that takes a ReferenceQueue. When they get out of scope, the Reference will be queued and you can have your own thread polling the queue and call some clean up function.
Why?
Well, it is slightly more efficient than adding finalizers to your classes (because finalizers force the GC to do some special handling of those objects).
Edit: The following two (variations of the same article) describe it:
http://java.sun.com/developer/technicalArticles/javase/finalization/
http://www.devx.com/Java/Article/30192
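For illustration, here is a minimal sketch of that ReferenceQueue approach (names hypothetical, native call replaced by a stub). The reference subclass carries the native handle itself, because the wrapper object is no longer retrievable once it has been collected.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// A minimal sketch of ReferenceQueue-based cleanup for hypothetical native handles.
public class NativeCleanupQueue {
    private static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();

    // Keep the reference objects strongly reachable until they are processed.
    private static final Set<HandleRef> PENDING = ConcurrentHashMap.newKeySet();

    private static final class HandleRef extends WeakReference<Object> {
        final long handle;                            // native pointer to free later
        HandleRef(Object wrapper, long handle) {
            super(wrapper, QUEUE);
            this.handle = handle;
        }
    }

    // Call this when the wrapper object is created.
    public static void register(Object wrapper, long handle) {
        PENDING.add(new HandleRef(wrapper, handle));
    }

    static {
        Thread cleaner = new Thread(() -> {
            while (true) {
                try {
                    Reference<?> ref = QUEUE.remove();   // blocks until something is enqueued
                    freeNative(((HandleRef) ref).handle);
                    PENDING.remove(ref);
                } catch (InterruptedException e) {
                    return;                              // shut the cleanup thread down
                }
            }
        }, "native-cleanup");
        cleaner.setDaemon(true);
        cleaner.start();
    }

    // Stub standing in for the real native destructor binding.
    private static void freeNative(long handle) { }
}
```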
Peter Lawrey has a very good point when he says:
Even so, waiting for the GC to cleanup can be inefficient and you may want to expose a means of explicitly cleaning up the resource if its required.
Whenever you can assume your users to be on Java7, take a look at java.lang.AutoCloseable as it will help them do that automatically when using the new try-with-resources.
In addition to using finalize(), you may need to trigger a GC yourself if you run out of resources because a GC hasn't been run yet to make those cleanup calls.
ByteBuffer.allocateDirect() has this issue. It needs the GC to clean up its ByteBuffers; however, you can reach your maximum direct memory before a GC is triggered, so the code has to detect this and trigger a System.gc() explicitly.
Even so, waiting for the GC to cleanup can be inefficient and you may want to expose a means of explicitly cleaning up the resource if its required.
The garbage collector will call finalize() on a Java object when the object is about to be GCed, and inside the finalize you could call the destructor. Just make a new Java object for every destructor that needs to be called, and keep a reference to that Java object until you want the destructor to become callable.
In practice, finalize() will be called sooner or later (even though technically Java makes no guarantee that any particular object will ever be GCed). The only exception is if the object is still around when the process is shutting down: then it may indeed never get GCed.
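For illustration, here is a minimal sketch of the guard-object pattern described above (names hypothetical, native call replaced by a stub): each native resource is paired with a small Java object whose finalize() invokes the C destructor once the pair becomes unreachable.

```java
// A minimal sketch of a guard object whose finalizer releases a native resource.
public class NativeGuard {
    private final long handle;                    // pointer to the native struct

    public NativeGuard(long handle) {
        this.handle = handle;
    }

    @Override
    protected void finalize() throws Throwable {
        try {
            destroyNative(handle);                // stand-in for the real C destructor binding
        } finally {
            super.finalize();
        }
    }

    // Stub standing in for the real native binding.
    private static void destroyNative(long handle) { }
}
```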

The Java language specification allows a dummy gc() method. Why?

I've a hard time understanding the following:
"The Java language specification allows a dummy gc() method."
Why would the standard do this? It's effectively making a very important feature of Java optional. This would also mean my same program will behave differently on two different JVM implementations! Something totally against Java's important feature of portability.
It's effectively making a very important feature of Java optional.
The GC is not made optional by that in Java. What is made optional is an explicit garbage collection, triggered by a call to gc(). And this is completely acceptable, since explicitly triggering a garbage collection is rarely necessary and interferes with the function of a modern garbage collector.
Calling the gc method suggests that the Java Virtual Machine expend effort toward recycling unused objects in order to make the memory they currently occupy available for quick reuse. When control returns from the method call, the Java Virtual Machine has made a best effort to reclaim space from all discarded objects.
Calling gc() is not guaranteed to trigger garbage collection, i.e. you can't force garbage collection. That's why the method can be a dummy.
The "best effort" mentioned above might amount to nothing more than waiting for the implicit garbage collection.
This would also mean my same program will behave differently on two different JVM implementations! Something totally against Java's important feature of portability.
Differences of this sort between JVM implementations are actually a good thing. They allow improvements in the underlying details that don't affect the correctness of a program and usually improve performance. Garbage collection happens to be an area where the JVM spec deliberately allows implementations room to experiment with different approaches to solving the problem. Sun explicitly states that calling gc() will not force a memory sweep, so programmers cannot claim to expect a certain behavior from every JVM.
The core idea: calling gc() is optional; you can request a garbage collection if you want.
But you still have the "automatic" garbage collection working as it should. gc() is just a hint that may make a program more efficient.
It is usually a good idea to let the JVM figure out when it needs to do garbage collection instead of explicitly telling it to do so. The fact that the Java spec allows a dummy method does not change the fact that Java does garbage collection, it just means that it can ignore requests from you to do it explicitly.
Here is a good (slightly old) article about garbage collection in Java, which has a short paragraph on System.gc(): http://www.ibm.com/developerworks/library/j-jtp01274.html
If your program depends on System.gc(), you should either rethink your design or switch to another programming language.
Garbage collection can be implemented differently on each VM, and the Sun VM alone comes with several different implementations.
It would be interesting to know in what way your program relies on System.gc().
If it is for finalizers, they come with their own problems and should only be used to help with debugging.
