Notification of any String object construction in Java 8 HotSpot VM - java

Is there a way to get notified on all invocations to constructor of String class (either directly or using reflection) without weaving or instrumenting rt.jar?
Further is it possible to filter these notifications only for calls within a specific package?
Further is it possible to make these notifications async (like events) so that actual JVM invocations are not slowed down?
My use case is to intercept all strings being created, pattern-match on their content, and raise alerts based on some rules (all in the backend) as part of a platform component.
As I don't want to instrument rt.jar, AspectJ seems to be out of the question (as LTW can't be done on Java core classes). The potential tool seems to be JVM TI, but I am not exactly sure how to achieve it.
Thanks,
Harish

Is there a way to get notified on all invocations to constructor of String class (either directly or using reflection) without weaving or instrumenting rt.jar in compile time?
You are not compiling the String class, so you can only do weaving at runtime. And yes, this is the only way without creating a custom JVM.
Further is it possible to filter these notifications only for calls within a specific package?
It is possible to check the caller with Reflection.getCallerClass(n)
Further is it possible to make these notifications async (like events) so that actual JVM invocations are not slowed down
All of this is very expensive, as is passing work to another thread.
make a pattern match on the content
Pattern matching is very expensive compared to creating a String. If you are not careful you will slow down your application by an order of magnitude or two. I suggest you reconsider your real requirements and see if there is another way to do what you are trying to do.
Are you sure you don't want to use a profiler to do this? Note: even profilers generally only sub-sample, e.g. every 10th allocation. There are plenty of free ones; in fact, two come with the JVM. I suggest using Flight Recorder to track allocations, as it has very low overhead.
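To make the cost of the async hand-off concrete, here is a minimal sketch of how it could look, assuming you already have some hook that sees each string. AsyncStringScanner, its counters, and the regex rule are hypothetical names for illustration only; nothing here intercepts String construction by itself. The calling thread only does a non-blocking offer into a bounded queue and drops the string when the queue is full, so the pattern match runs on a background thread.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.regex.Pattern;

/** Hypothetical helper: the hot path only enqueues; a daemon thread does the matching. */
final class AsyncStringScanner {
    private final BlockingQueue<String> queue;
    private final Pattern rule;
    final AtomicLong alerts = new AtomicLong();   // strings that matched the rule
    final AtomicLong dropped = new AtomicLong();  // strings discarded because the queue was full

    AsyncStringScanner(String regex, int capacity) {
        this.rule = Pattern.compile(regex);
        this.queue = new ArrayBlockingQueue<>(capacity);
        Thread worker = new Thread(this::drain, "string-scanner");
        worker.setDaemon(true);                   // do not keep the JVM alive for scanning
        worker.start();
    }

    /** Called on the application thread: never blocks, drops the string when the queue is full. */
    void submit(String s) {
        if (!queue.offer(s)) {
            dropped.incrementAndGet();
        }
    }

    /** Runs on the worker thread: takes strings one by one and applies the rule. */
    private void drain() {
        try {
            while (true) {
                String s = queue.take();
                if (rule.matcher(s).find()) {
                    alerts.incrementAndGet();     // a real component would raise its alert here
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();   // stop quietly when interrupted
        }
    }
}
```

Even in this form every submit still costs a queue operation on the application thread, which is far more than the allocation it observes, so sampling or dropping under load is probably unavoidable.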

Related

Block instances of a class at the JVM level?

Is there a way to configure the JVM to block instances of a class being created?
I'd like to do this to ensure no service running in the JVM is allowed to create instances of a class that has been identified as a security risk in a CVE; let's call that class BadClass.
NOTE: I'm looking for a general solution, so the following is purely additional information. I would normally address this by switching the library out, or upgrading it to a version that doesn't have the exploit, but it's part of a larger library that won't be addressing the issue for some time. So I'm not even using BadClass anywhere, but want to completely block it.
I do not know of a JVM parameter, but here are some alternatives that might put you in a position to meet your requirements:
You can write a custom ClassLoader that gives you fine control over what gets loaded. Normal use cases would be plugin loading etc. In your case this is more security governance at the devops level.
If you have a CI/CD pipeline with integration tests, you could also start the JVM with the -verbose:class parameter and see which classes are loaded when running your tests. It seems a bit hacky, but maybe it suits your use case. I'm just throwing everything into the ring; it's up to you to judge the best fit.
Depending on your build system (Maven?) you could restrict application builds to your private cached libs. That way you have full control and can put a library review layer in between. This would also share responsibility between devs and the repository admins.
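The custom class loader idea can be sketched roughly as follows. BlockingClassLoader is a hypothetical name, and this is only an illustration of the approach: it refuses names on a deny list and otherwise delegates as usual. It only affects classes resolved through this loader; code loaded by the parent that references BadClass directly would still resolve it through the parent.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical loader that refuses to resolve a configured set of class names. */
final class BlockingClassLoader extends ClassLoader {
    private final Set<String> blocked;

    BlockingClassLoader(ClassLoader parent, Set<String> blockedNames) {
        super(parent);
        // Defensive copy so later changes to the caller's set do not alter the policy.
        this.blocked = Collections.unmodifiableSet(new HashSet<>(blockedNames));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (blocked.contains(name)) {
            // Fail before any delegation, so the parent never gets asked for this class.
            throw new ClassNotFoundException("Blocked by policy: " + name);
        }
        return super.loadClass(name, resolve);
    }
}
```

Application code would have to be loaded through such a loader (for example as a child-first loader over the application's jars) for the deny list to have any effect.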
A distinct non-answer: Do not even try!
What if that larger library that has this dependency wants to call that method? What should happen then?
In other words, what is your blocking supposed to do?
Throw some Error instance, that leads to a teardown of the JVM?
Return null, so that (maybe much later) other code runs into a NPE?
Remember: that class doesn't exist in a void. There is other code invoking it. That code isn't prepared for you coming in, and well, doing what again?!
I think there are no good answers to these questions.
So, if you really want to "manipulate" things:
Try sneaking a different version of that specific class into your classpath instead: either an official one that doesn't have the security issue, or something that complies with the required interface and does something less harmful. Or, if you dare to go down that path, do as the other answer suggests and get into the "my own classloader" business.
In any case, your first objective: get clean on your requirements here. What does blocking mean?!
Have you considered using a Java agent?
It can intercept class loading in any classloader and manipulate the class's content before the class is actually loaded. Then you may either modify the class to remove/fix its bugs, or return a dummy class that throws an error in its static initializer.

Java: Method hooking & Finding object instances

Situation
Hi, I have 2 problems.
The situation is that I'm writing a Java API for Windows that also provides tools for injecting code into a process and then manipulating the target. I have already implemented the injection part, for example injecting a jar into another jar. At this point my jar gets called (while the target is already running) and starts in a completely static context.
Goals & problems
From here I have two goals:
I'd like to interact with the target's objects, thus I need references. For many objects this is already possible because they provide static access to their instances. For example java.awt.Frame#getFrames() provides access to all created Frame objects. But it would be awesome if there were a way to get access to arbitrary objects on the heap. Something like 'Heap#getAllObjectInstances()'.
Given an object instance, I'd like to hook up onto arbitrary functions of this object. For example whenever BufferStrategy#show() gets called, I want it to call another method first.
So I summarize the problems as follows:
How to get arbitrary object references from a static context?
How to hook up onto arbitrary functions?
Remarks
What I've done so far, remarks and ideas:
The JDI (Java Debug Interface) provides such a method via VirtualMachine#allClasses() -> ReferenceType#instances(0). But the JDI needs the target JVM to be started with additional debug parameters, which is not an option for me. One could go down to a low level and analyze the heap with memory tools, but I hope someone knows a more high-level approach. Using the Windows API would be an option for me as I'm familiar with JNA/JNI, but I don't know of such a tool.
The last resort would be to use IAT hooking with C code, a very low-level approach that I'd like to avoid. As I can assume I have an object reference at this point, does the Reflection API perhaps provide a way to change an object's method? Or at least provide a simple hooking mechanism?
Be aware that changing the target's source code is certainly not an option for me. Since the target is already running, bytecode manipulation could also be an option.
Scenario
A scenario where this would come in handy:
The target is a game, deployed as a jar. It renders with a double-buffer strategy, using the BufferStrategy class. It displays the image with BufferStrategy#show(). We inject our jar into the game and would like to draw an overlay with additional information. For this we get a reference to the BufferStrategy in use and hook onto its show method, so that it calls our drawOverlay method every time it gets called and then passes control back to the original show method.
What you need is a JVMTI agent - a native library that makes use of the JVM Tool Interface.
Agents can be attached dynamically to a running VM using the Attach API.
See VirtualMachine.loadAgentPath.
To get all instances of a given class, use the JVMTI IterateOverInstancesOfClass function.
See the related question for details.
To intercept a method of a foreign class you'll need the JVMTI RetransformClasses API. The same can also be achieved with the Java-level instrumentation API, see Instrumentation.retransformClasses.
For an example of JVMTI-level method interception, refer to demo/jvmti/mtrace from the Oracle JDK demos and samples package.
Java-level instrumentation will be easier with bytecode manipulation libraries like Byte Buddy.
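A minimal sketch of the Java-level route, under stated assumptions: HookAgent, TargetTransformer and the target name com/example/game/Renderer are hypothetical, and the actual bytecode rewriting is left as a comment because it depends on the library you pick. The transformer here only counts how often it sees the target and returns null, which leaves the class unchanged.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Kept package-private so the snippet compiles in any file; a real agent class would be public.
final class HookAgent {
    /** Hypothetical target, in the internal (slash-separated) form that transformers receive. */
    static final String TARGET = "com/example/game/Renderer";

    static final class TargetTransformer implements ClassFileTransformer {
        final AtomicInteger seen = new AtomicInteger();

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (TARGET.equals(className)) {
                seen.incrementAndGet();
                // A bytecode library would rewrite classfileBuffer here and return the new bytes.
            }
            return null; // null tells the JVM to keep the class bytes unchanged
        }
    }

    /** Entry point invoked when the agent jar is loaded into an already running VM. */
    public static void agentmain(String args, Instrumentation inst) throws Exception {
        inst.addTransformer(new TargetTransformer(), true); // true = usable for retransformation
        String dotted = TARGET.replace('/', '.');
        for (Class<?> c : inst.getAllLoadedClasses()) {
            if (c.getName().equals(dotted) && inst.isModifiableClass(c)) {
                inst.retransformClasses(c); // re-runs the transformer on the loaded class
            }
        }
    }
}
```

The agent jar's manifest would need Agent-Class pointing at this class and Can-Retransform-Classes: true. A Java agent like this is loaded with VirtualMachine.loadAgent, the jar-based counterpart of loadAgentPath.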

Reflection versus container configurations [duplicate]

This question already has answers here:
If reflection in Java slows down execution by orders, why do so many frameworks use it ?
(2 answers)
Closed 7 years ago.
Using reflection in Java is very expensive because it affects performance very badly, right? But I wonder why reflection is so widely used in container configuration (web.xml), frameworks like Struts and REST frameworks, and ORMs like Hibernate.
How can this be justified? Is it because reflection is used only once when the container starts up, or is there some other reason behind it?
There is no other way for them to do what they do (a good example of this might be Spring framework - it doesn't force you to use any interface when using dependency injection, and since it has no interface to use and doesn't know your classes at compile time, the only way is to inspect them via reflection)
The reflection-heavy parts are not (or should not be) executed too often
Reflection isn't very expensive if done right (e.g. if you look up the method you want to call only once, cache the java.lang.reflect.Method object you found, and use it in further invocations)
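The caching mentioned in the last point could look something like this. CachedInvoker is a hypothetical helper and only handles public no-argument methods; the cache key uses the class name for brevity, which would need refining if the same class name can come from several class loaders.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: looks each Method up once and reuses it for later calls. */
final class CachedInvoker {
    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    /** Invokes a public no-argument method by name on the given target. */
    static Object call(Object target, String methodName) throws ReflectiveOperationException {
        Class<?> type = target.getClass();
        String key = type.getName() + '#' + methodName;
        Method m = CACHE.get(key);
        if (m == null) {
            m = type.getMethod(methodName); // the comparatively slow lookup, done once per key
            CACHE.put(key, m);
        }
        return m.invoke(target);            // the repeated part: invocation only
    }
}
```

For example, CachedInvoker.call("hello", "length") resolves String#length on the first call and reuses the cached Method for every later String.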
First, I wouldn't say that using reflection has such a detrimental effect on code performance. Of course there is an overhead, but there are optimisation techniques in place that keep the performance impact to a minimum. As far as the trade-off between performance and usability is concerned, the specific requirements of the product being developed should be taken into account. For example, would I use a heavily reflection-based framework on mobile? I think not. Does it make sense on the backend? I would say yes.
Second, having annotation-based configuration doesn't always mean that reflection is used at application runtime. There are frameworks that make use of annotation processing and generate Java code during compilation, which is later used as "normal code". Also, a lot of frameworks use annotation configuration in conjunction with bytecode generation at runtime, so reflection is kept to a minimum.

Can a Custom Delegating Classloader Cache loadClass() results safely?

We've got a custom classloader, called here MainClassLoader, sitting on top of a Java web application (specifically on Tomcat 7 where the parent classloader is the WebAppClassLoader). This custom classloader is set as the TCCL for the web application, and its purpose is to delegate the lookup of classpath resources (including classes and non-class resources) to a set of other custom classloaders, each of which represents a pluggable module to the application. (MainClassLoader itself loads nothing.)
MainClassLoader.loadClass() will do parent-first delegation, and upon a ClassNotFoundException, go one by one through the pluggable child classloaders to see which of them will provide the result. If none of them can, it then throws the ClassNotFoundException.
The logic here is a bit more complicated, however, and combining that with the fact that our end users may end up having several (in the tens) of these child modules plugged in, we're finding that the classloader ends up being one of the more CPU-intensive parts of the application, given how reliant Java is today on reflection-based command pattern implementations. (By that I mean there are a lot of Class.forName() calls to load and instantiate classes at runtime.)
We started noticing this first in periodic thread dumps of the application to catch the app "in action" to see what it is doing, plus profiling through JProfiler certain use cases that were known to be slower than desired.
I've written a very simple caching approach for MainClassLoader where the results of a loadClass() (including a ClassNotFoundException) call are cached in a concurrent map with weak values (keyed by the String className), and the performance of this class went high enough to totally fall off the hot spots list of JProfiler.
However, I'm concerned about whether we can really safely do this. Will such a cache get in the way of intended classloader logic? What are the pitfalls one might expect in doing this?
Some obvious ones I anticipate:
(1) Memory - obviously this cache consumes memory, and if left unbounded is a possible memory drain. We can address this using a limited cache size (we're using Google's Guava CacheBuilder for this cache).
(2) Dynamic classloading, especially in development - If a new or updated class/resource is added to the classpath after our cache has stored a now-stale result, this would confuse the system, most likely resulting in ClassNotFoundExceptions being thrown when the class should now be loadable. A small TTL on the cached "not found" entries might help here, but my bigger concern is what happens during development when we update a class and it gets hot-swapped into the JVM. This class would most likely be in one of the classloaders that MainClassLoader delegates to, and so its cache could conceivably have a stale (older) version of the class. However, since I'm using weak values, would this help to mitigate this? My understanding of weak references is that they don't go away, even when eligible for collection, until the GC runs a pass where it decides to reclaim them.
These are my two known issues/concerns with this approach, but what scares me is that classloading is a bit of a black art (if not a dark science) that is full of gotchas when you do non-standard things here.
So what am I not worried about that I should be worried about?
UPDATE/EDIT
We ended up opting NOT to do the local caching as I prototyped above (it just seems dangerous and redundant with the caching/optimization done by the JVM), but did some optimization within our loadClass() method. Basically the logic we have in this loadClass() method (see comments below) did not follow a "best case" path through the code when it could have, e.g. when there were no "customization" modules in place, we were still behaving as though there were, letting that classloader throw a ClassNotFoundException and catching it and doing the next checks. This pattern meant that a given class load operation would nearly always go through at least 3 try/catch blocks with a ClassNotFoundException being thrown in each. Quite expensive. Some extra code to determine whether there were any URLs associated with the classloaders being delegated to allowed us to bypass those checks (and the resultant exception throw/catch), giving us an almost 25000% boost in performance for this class.
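The fast-path idea from the update could be sketched like this. It is not our actual code: ModuleDelegatingLoader is a hypothetical stand-in, and it assumes the module loaders are URLClassLoaders whose URL lists do not change after construction. Modules with no URLs are filtered out once, so a lookup never pays for a throw/catch against a loader that cannot supply anything.

```java
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for MainClassLoader: parent first, then only modules that have URLs. */
final class ModuleDelegatingLoader extends ClassLoader {
    private final List<URLClassLoader> activeModules = new ArrayList<>();

    ModuleDelegatingLoader(ClassLoader parent, List<URLClassLoader> modules) {
        super(parent);
        for (URLClassLoader module : modules) {
            if (module.getURLs().length > 0) {   // skip modules that cannot supply anything
                activeModules.add(module);
            }
        }
    }

    /** Number of modules that will actually be consulted. */
    int activeModuleCount() {
        return activeModules.size();
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Reached only after the parent-first delegation in ClassLoader.loadClass has failed.
        for (URLClassLoader module : activeModules) {
            try {
                return module.loadClass(name);
            } catch (ClassNotFoundException ignored) {
                // this module does not have it; try the next one
            }
        }
        throw new ClassNotFoundException(name);
    }
}
```

With no active modules, a miss costs one exception from the parent and one from this loader, instead of one per configured module.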
I'd still like comments on my original question, however, to help keep the issue alive to be answered.
What are the concerns in doing our own caching in a custom classloader, other than those I already listed?

Dump execution - java?

Is it possible to dump the complete program execution in Java? I have to go through the complete process flow of an execution for specific input values. Using step over and step into is a bit time consuming, and I wanted to find out whether any Java command dumps the execution.
Maybe you want to have a look at the Chronon Time Travel Debugger.
I haven't tried it out yet, after a long beta period it seems to be now officially available and may satisfy your demands. It's a commercial product, but offers a free time trial.
Another alternative may be debugging against a core file using the jsadebugd utility provided with the JDK. (You can't step forwards and backwards, but you can examine the stack/monitors of all threads, which might already help you.)
If you only need the method calls, as stated in a comment, maybe a profiler which uses instrumentation like jprofiler or yourkit will also be helpful.
Or you want to have a look at btrace, a dtrace-like tool.
If you're able to modify/build the application, also some sort of a small AOP method interceptor will do the job.
If I understand correctly, you want something like a view of all the method calls that happen when your program processes some set of inputs. You can often get this kind of information out of a profiler, such as JProbe:
http://www.quest.com/jprobe/
You can run the program under JProbe, and then it will present a visual call graph of all of the method calls or a list of all method calls along with their frequency of execution.
Somewhat related are static analysis tools, such as Understand:
http://www.scitools.com/
Static analysis tools tend to focus on figuring out overall code structure rather than what happens with a specific set of inputs though.
Of course, you can always change code, but it's probably too much work to change every method in a large system to print a debugging string. Aspect-oriented programming tends to be a good approach for this kind of problem, because it's a cross-cutting concern across the codebase. There are a few different Java AOP solutions. I've used Spring AOP with dynamic proxies, which isn't enough to cover all method executions, but it is good enough for covering any method execution defined on an interface for a bean managed in a Spring container:
http://static.springsource.org/spring/docs/3.1.0.M1/spring-framework-reference/html/aop.html
For example, I've written a TimingAspect that wraps the execution of a method and logs its execution time after it completes. When I want to use it, I update my Spring applicationContext.xml to specify pointcuts for the methods I want to measure. You could define a similar TracingAspect to print a debugging message at the start of each method execution. Just remember to leave this off for production deployment.
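As a rough illustration of the tracing idea without Spring, here is a sketch that uses a plain JDK dynamic proxy (java.lang.reflect.Proxy) rather than Spring AOP. Tracing and its trace method are hypothetical names; like Spring's proxies, this only sees calls made through an interface.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.List;

/** Hypothetical helper that records method names before delegating to the real object. */
final class Tracing {
    /** Wraps target so that every call made through iface is appended to log first. */
    @SuppressWarnings("unchecked")
    static <T> T trace(Class<T> iface, T target, List<String> log) {
        return (T) Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (proxy, method, args) -> {
                    log.add(method.getName());           // the "debugging message"
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        throw e.getCause();              // rethrow what the real method threw
                    }
                });
    }
}
```

Wrapping a Runnable with Tracing.trace(Runnable.class, task, log) and calling run() adds "run" to the log and then runs the task; a real tracer would write to a logger instead of a list.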
For all of these approaches, measuring every single method call is probably going to cause information overload. You'll probably want to selectively measure just a few important pieces of your own codebase, filtering out core JDK methods and third-party libraries.
