I am looking for an approach that uses a Java agent or class instrumentation (preferably at a lower level than user classes) to intercept all object creation in the JVM, whether through new or any alternative way of creating an object. There is a similar question, but it doesn't focus on Java agents or on anything lower-level than instrumenting user classes.
Java Objects can be created in several different ways.
From Java code, when a Java method, either interpreted or compiled, executes one of the following bytecode instructions: new, newarray, anewarray, multianewarray.
From native code, when native methods, including those in standard class library, call one of JNI functions: NewObject, NewObjectArray, NewStringUTF, NewDirectByteBuffer, etc.
Directly from VM runtime, when a new object is created internally by JVM, for example, in response to Object.clone(), Throwable.getStackTrace(), Class.getInterfaces(), etc.
Unfortunately, there is no single point where you can collect objects from all these sources. However, there are means for intercepting all of them.
Objects instantiated from Java can be caught by an Instrumentation agent. The agent needs to define a ClassFileTransformer that will scan the bytecode of all loaded classes for object-creating instructions and modify it.
Note: there is no need to intercept all new instructions, you can instrument Object() constructor instead. But you still need to intercept array allocation instructions.
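A minimal sketch of the agent skeleton described above. It only counts the class files offered to the transformer; the actual bytecode rewriting is left as a placeholder, because it needs a bytecode library such as ASM. The class names AllocationAgent and CountingTransformer are made up for illustration, and the agent jar is assumed to name the agent class in its Premain-Class manifest attribute.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical agent class; in a real agent jar it would normally be public.
class AllocationAgent {

    /** Transformer that sees every class file before the JVM defines the class. */
    static final class CountingTransformer implements ClassFileTransformer {
        final AtomicInteger seen = new AtomicInteger();

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            seen.incrementAndGet();
            // Placeholder: a real agent would parse classfileBuffer here,
            // add a hook call around newarray/anewarray/multianewarray
            // (and new, or the Object() constructor), and return the
            // rewritten class file bytes.
            return null; // null tells the JVM to keep the class unchanged
        }
    }

    static final CountingTransformer TRANSFORMER = new CountingTransformer();

    /** Invoked by the JVM before main() when started with -javaagent:agent.jar */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(TRANSFORMER);
    }
}
```

Classes that are already loaded when premain runs (java.lang.Object, java.lang.String and so on) never pass through this transformer on their own; they would have to be retransformed explicitly.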
JNI functions can be intercepted by a JVMTI agent. You need to define your own native hooks for NewObjectArray, NewStringUTF, etc. and then replace the JNI function table. See the JVMTI Reference for details.
Objects created by the VM can be caught by JVMTI Event Callback mechanism. The desired event is VMObjectAlloc.
Note: JVM will not post VMObjectAlloc event for objects allocated from Java or by JNI functions.
All other ways of object instantiation (cloning, reflection, deserialization) fall into one of the above categories.
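To make those three paths concrete, here is a small sketch that creates the same kind of object through reflection, cloning and deserialization, and counts how often the class's own constructor runs. CreationPaths and Point are made-up names for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class CreationPaths {

    static class Point implements Cloneable, Serializable {
        static int constructorCalls = 0;
        int x;

        public Point() { constructorCalls++; }

        @Override
        public Point clone() {
            try {
                return (Point) super.clone(); // copy made by Object.clone(), no constructor runs
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    /** Reflection still goes through Point's constructor. */
    static Point viaReflection() {
        try {
            return Point.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Point viaClone(Point p) {
        return p.clone();
    }

    /** Deserializing a Serializable class does not run that class's constructor. */
    static Point viaDeserialization(Point p) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The cloned and deserialized copies appear without Point's constructor being invoked, which is why hooking only user-class constructors is not enough.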
Get JDK 8 Demos and Samples from Oracle Java SE Downloads website.
There is a sample JVMTI agent for exactly this question.
Look under
jvmti/heapTracker
jvmti/hprof
You can take a look at this open-source Java agent created by the Devexperts team:
https://github.com/Devexperts/aprof
It provides nice reports that show where memory is allocated. But, as far as I know, the current version doesn't intercept objects created via JNI or sun.misc.Unsafe.allocateInstance.
It is a pure Java agent that manipulates bytecode with ASM. Before each object allocation, aprof inserts a method call that tracks the allocation size and the location stack (where the allocation occurs).
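Roughly, the inserted call lands in a tracker like the one sketched below. This is not aprof's actual API; AllocationTracker, record and bytesAt are invented names, and the location string and byte estimate are assumed to be supplied by the instrumented call site.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical tracker; illustrates the idea only, not aprof's implementation.
class AllocationTracker {

    private static final Map<String, LongAdder> BYTES_BY_LOCATION = new ConcurrentHashMap<>();

    /** Hook that instrumented code would call just before an allocation. */
    static void record(String location, long estimatedBytes) {
        BYTES_BY_LOCATION
            .computeIfAbsent(location, key -> new LongAdder())
            .add(estimatedBytes);
    }

    /** Total bytes recorded so far for one allocation site (0 if never seen). */
    static long bytesAt(String location) {
        LongAdder total = BYTES_BY_LOCATION.get(location);
        return total == null ? 0L : total.sum();
    }
}
```

Note that this tracker allocates objects itself (the LongAdder and map entries), so a real tool has to keep its own allocations out of the instrumented path to avoid recursion.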
Related
Is there a way to get notified when a method from a Different JVM is called.
Dev Env: JDK8, Windows 10 (later on cloud for deployment).
I have a couple of Java applications running, one in an app server and the other as a standalone batch process.
Whenever a core Java class method, e.g. PrintStream.print, is called in either of these JVMs, I need to get a handle to the input string and log it somewhere else.
I tried with
1. Java bytecode manipulation libraries, e.g. Javassist, to transform bytecode using Instrumentation. But this only lets me get a handle on and manipulate user-defined classes and third-party library classes, not java.*, sun.*, etc. Even if it could be done somehow, the official documentation says that instrumenting classes in rt.jar violates the JRE binary code license, so this may not be the way to go:
https://docs.oracle.com/javase/8/docs/technotes/tools/windows/java.html
2. Reflection: can be used when you are on the same JVM; I'm not sure whether it works across JVMs.
Appreciate suggestions.
Situation
Hi, I have 2 problems.
The situation is that I'm writing a Java API for Windows that also provides tools for injecting code into a process and then manipulating the target. I have already implemented the injection part, for example injecting a jar into another jar. At this point my jar gets called (while the target is already running) and starts in a completely static context.
Goals & problems
From here I have two goals:
I'd like to interact with the target's objects, thus I need references. For many objects this is already possible because they provide static access to their instances. For example, java.awt.Frame#getFrames() provides access to all created Frame objects. But it would be awesome if there were a way to get access to arbitrary objects on the heap. Something like 'Heap#getAllObjectInstances()'.
Given an object instance, I'd like to hook into arbitrary functions of this object. For example, whenever BufferStrategy#show() gets called, I want it to call another method first.
So I summarize the problems as follows:
How to get arbitrary object references from a static context?
How to hook up onto arbitrary functions?
Remarks
What I've done so far, remarks and ideas:
The JDI (Java Debug Interface) provides such a method via VirtualMachine#allClasses() -> ReferenceType#instances(0). But the JDI needs the target JVM to be started with additional debug parameters, which is not an option for me. One could go low-level and analyze the heap with memory tools, but I hope someone knows a more high-level approach. Using the Windows API would be an option for me as I'm familiar with JNA/JNI, but I don't know of such a tool.
The last resort would be IAT hooking with C code, a very low-level approach that I'd like to avoid. Since I can assume I have an object reference at this point, does the Reflection API perhaps provide a way to change an object's method? Or at least provide a simple hooking mechanism?
Be aware that changing the target's code beforehand is certainly not an option for me. Since the target is already running, bytecode manipulation could also be an option.
Scenario
A scenario where this would come in handy:
The target is a game, deployed as a jar. It renders with a double-buffer strategy, using the BufferStrategy class. It displays the image with BufferStrategy#show(). We inject our jar into the game and would like to draw an overlay with additional information. For this we get a reference to the BufferStrategy in use and hook into its show method, so that it calls our drawOverlay method every time it gets called and then passes control back to the original show method.
What you need is a JVMTI agent: a native library that makes use of the JVM Tool Interface.
Agents can be attached dynamically to a running VM using the Attach API.
See VirtualMachine.loadAgentPath.
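A minimal sketch of the attach step, assuming the Attach API (com.sun.tools.attach, in tools.jar on JDK 8 and in the jdk.attach module on later JDKs) is on the class path. AgentAttacher is a made-up name, and the pid and library path are whatever your injector already knows.

```java
import com.sun.tools.attach.VirtualMachine;

// Hypothetical helper around the Attach API.
class AgentAttacher {

    /**
     * Attaches to the JVM with the given process id and loads a native
     * JVMTI agent from an absolute library path. The options string is
     * passed through to the agent's Agent_OnAttach function.
     */
    static void attach(String pid, String agentLibraryPath, String options) throws Exception {
        if (pid == null || pid.trim().isEmpty()) {
            throw new IllegalArgumentException("pid is required");
        }
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgentPath(agentLibraryPath, options);
        } finally {
            vm.detach(); // the agent stays loaded after we detach
        }
    }
}
```

The attaching process normally has to run as the same user as the target JVM.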
To get all instances of a given class, use the JVMTI IterateOverInstancesOfClass function.
See the related question for details.
To intercept a method of a foreign class you'll need the JVMTI RetransformClasses API. The same can also be achieved with the Java-level instrumentation API; see Instrumentation.retransformClasses.
For the example of JVMTI-level method interception refer to demo/jvmti/mtrace from Oracle JDK demos and samples package.
Java-level instrumentation will be easier with bytecode manipulation libraries like Byte Buddy.
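For the Java-level route, a sketch of a dynamically attached agent could look like this. RetransformAgent and TargetedTransformer are invented names, the target class name is assumed to arrive as the agent argument, and the rewriting step itself is only a placeholder where Byte Buddy or ASM would do the work. The agent jar is assumed to declare Agent-Class and Can-Retransform-Classes: true in its manifest.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical agent class; in a real agent jar it would normally be public.
class RetransformAgent {

    /** Only reacts to one class, identified by its internal (slash-separated) name. */
    static final class TargetedTransformer implements ClassFileTransformer {
        private final String internalName;
        final AtomicInteger hits = new AtomicInteger();

        TargetedTransformer(String internalName) {
            this.internalName = internalName;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (!internalName.equals(className)) {
                return null; // not our target: leave unchanged
            }
            hits.incrementAndGet();
            // Placeholder: rewrite classfileBuffer so the target method
            // calls the hook first, then return the new bytes.
            return null;
        }
    }

    /** Invoked by the JVM when the agent jar is loaded into a running VM. */
    public static void agentmain(String targetClassName, Instrumentation inst) throws Exception {
        Class<?> target = Class.forName(targetClassName);
        TargetedTransformer transformer =
            new TargetedTransformer(targetClassName.replace('.', '/'));
        inst.addTransformer(transformer, true);  // true = may retransform loaded classes
        inst.retransformClasses(target);         // re-runs transformers on the loaded class
    }
}
```

Retransformation may change method bodies but not add or remove methods or fields, so the hook call has to live inside the existing method.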
Is there a way to get notified on all invocations to constructor of String class (either directly or using reflection) without weaving or instrumenting rt.jar?
Further is it possible to filter these notifications only for calls within a specific package?
Further, is it possible to make these notifications async (like events) so that the actual JVM invocations are not slowed down?
My use case is to intercept all strings being created, pattern-match on the content, and raise alerts based on some rules (all in the backend) as part of a platform component.
As I don't want to instrument rt.jar, AspectJ seems to be out of the question (as LTW can't be done on Java core classes). The potential tool seems to be JVM TI, but I am not exactly sure how to achieve this with it.
Thanks,
Harish
Is there a way to get notified on all invocations to constructor of String class (either directly or using reflection) without weaving or instrumenting rt.jar in compile time?
You are not compiling the String class, so you can only do weaving at runtime. And yes, this is the only way without creating a custom JVM.
Further is it possible to filter these notifications only for calls within a specific package?
It is possible to check the caller with Reflection.getCallerClass(n)
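Since Reflection.getCallerClass(n) is an internal sun.reflect API, here is a sketch of the same filter built on a plain stack trace instead, which is slower but uses only public classes. CallerFilter and its method names are made up for illustration.

```java
// Hypothetical helper; swaps Reflection.getCallerClass(n) for a stack trace.
class CallerFilter {

    /**
     * Returns the class name of the caller of the method that invoked this
     * helper, or null if the stack is too shallow.
     */
    static String callerOfCaller() {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] = callerOfCaller, stack[1] = the hook asking, stack[2] = its caller
        return stack.length > 2 ? stack[2].getClassName() : null;
    }

    /** True if the fully qualified class name lies in the package or a subpackage. */
    static boolean isInPackage(String className, String packageName) {
        return className != null && className.startsWith(packageName + ".");
    }

    /** Intended to be called from a hook: did the hook's caller come from the package? */
    static boolean hookCalledFrom(String packageName) {
        StackTraceElement[] stack = new Throwable().getStackTrace();
        // stack[0] = hookCalledFrom, stack[1] = the hook, stack[2] = the hook's caller
        return stack.length > 2 && isInPackage(stack[2].getClassName(), packageName);
    }
}
```

Capturing a stack trace on every String creation is costly in its own right, so this kind of check only makes sense on a sampled subset of calls.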
Further is it possible to make these notifications async (like events) so that actual JVM invocations are not slowed down
All of this is very expensive, as is passing work to another thread.
make a pattern match on the content
Pattern matching is very expensive compared to creating a String. If you are not careful you will slow down your application by an order of magnitude or two. I suggest you reconsider your real requirements and see if there is another way to do what you are trying to do.
Are you sure you don't want to use a profiler to do this? Note: even profilers generally only sub-sample, e.g. every 10th allocation. There are plenty of free ones; in fact, two come with the JVM. I suggest using Flight Recorder to track allocations as this has a very low overhead.
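If you still go down this road, a sketch of keeping the cost off the hot path is a bounded queue that drops values when full, with the pattern matching done on a background thread. AsyncStringAuditor and its methods are hypothetical names, and how the strings get submitted in the first place is outside this sketch.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.regex.Pattern;

// Hypothetical auditor: the hot path only enqueues, matching happens elsewhere.
class AsyncStringAuditor {

    private final BlockingQueue<String> queue;
    private final Pattern rule;
    private final AtomicLong alerts = new AtomicLong();

    AsyncStringAuditor(int capacity, String regex) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.rule = Pattern.compile(regex); // compile once, not per string
    }

    /** Hot path: never blocks. Returns false if the queue is full and the value is dropped. */
    boolean submit(String value) {
        return queue.offer(value);
    }

    /** Checks everything currently queued and returns the number of matches. */
    int drain() {
        int matched = 0;
        String next;
        while ((next = queue.poll()) != null) {
            if (rule.matcher(next).find()) {
                matched++;
                alerts.incrementAndGet();
            }
        }
        return matched;
    }

    /** Starts a daemon thread that matches queued strings until interrupted. */
    Thread startWorker() {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    if (rule.matcher(queue.take()).find()) {
                        alerts.incrementAndGet();
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "string-auditor");
        worker.setDaemon(true);
        worker.start();
        return worker;
    }

    long alertCount() {
        return alerts.get();
    }
}
```

Dropping on a full queue trades completeness for a bounded slowdown; even so, the enqueue itself is not free when it runs for every String.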
The method DriverManager.getCallerClassLoader() in class java.sql.DriverManager is declared as native. I understand that all the class loaders references in an application are available in the current executing JVM. Also, my basic understanding about native method is that it's used to call the method defined in native libraries and they execute outside the JVM execution environment.
My question is, what is that needed by DriverManager.getCallerClassLoader() which requires its implementation to be native?
My basic understanding about native method is that it's used to call the method defined in native libraries
This is correct, native methods represent calls of the code that is part of a natively compiled library
and they execute outside the JVM execution environment
That is what native methods typically do. That is, the native methods that Java users write. However, native methods are not limited in what they can do: once you're outside of the JVM, you can do what you wish. In fact, Java's built-in classes, such as Class<T>, rely heavily on the ability to do so, with dozens of native methods sprinkled around their Java code.
One of these methods is the package-private java.lang.Class<T>.getClassLoader0 (yes, with a zero). The implementation of ClassLoader.getCallerClassLoader ultimately refers to this method, which queries the internals of the JVM to fetch the class loader.
Note that DriverManager cannot forward the call to ClassLoader.getCallerClassLoader, because that would return DriverManager's own class loader (DriverManager would be the caller of getCallerClassLoader). It is not possible for DriverManager to repeat the "magic" of ClassLoader's getCallerClassLoader either, because it is located in a different package (i.e. not in java.lang), so Class<T>.getClassLoader0 is not accessible. That is why it is forced to move getCallerClassLoader into native territory, where the native code can obtain the calling class and fetch its class loader without restrictions.
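The forwarding problem can be reproduced with the public StackWalker API. This is only an illustration: StackWalker arrived in Java 9 and is not what DriverManager uses, and CallerDemo, Service, Direct and Forwarder are made-up names.

```java
// Illustration only: caller-sensitive lookups see the immediate caller.
class CallerDemo {

    private static final StackWalker WALKER =
        StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    /** Plays the role of the caller-sensitive method. */
    static class Service {
        static Class<?> whoCalledMe() {
            // Returns the class of whoever invoked whoCalledMe()
            return WALKER.getCallerClass();
        }
    }

    /** Calls the service directly, so it is reported as the caller. */
    static class Direct {
        static Class<?> ask() {
            return Service.whoCalledMe();
        }
    }

    /** Forwards through Direct; the service still only sees Direct. */
    static class Forwarder {
        static Class<?> ask() {
            return Direct.ask();
        }
    }
}
```

Forwarder.ask() yields Direct.class rather than Forwarder.class, which is the same effect that would make a forwarded getCallerClassLoader call report the forwarding class's loader.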
For bytecode instrumentation in Java, there are the ASM framework and the BCEL and Javassist libraries.
However, I need to do the instrumentation in native code, since some Java classes are already loaded by the time the javaagent runs, e.g. java.lang.Thread, java.lang.Class, etc.
Is there any library for instrumenting Java classes in native code?
Edit:
Seems there is a bit of confusion.
What I want is:
Create a native Java agent, which uses JVMTI APIs to change the bytecode of a class while it's being loaded, using the OnClassLoad event hook.
I encountered this problem during my doctoral research. The answer that worked best for me was to perform the byte-code modification in a separate JVM using a java library (I used ASM).
I used the JVMTI class load hook to capture the class file and transmit it to the separate JVM using a tcp connection. Once the class had been modified within the separate JVM I returned it to the JVMTI Agent, which copies it into VM memory and returns a pointer to the modified class file to the JVM.
I found that it was too difficult to weave classes within the same JVM that was being profiled, as the system class files I wanted to modify (java.lang.Object, for example) had to be loaded before any of the class files I needed to perform the weaving. I hunted for C/C++ bytecode libraries without much success, before settling on the separate-JVM approach I finally used.
You can parameterize the JVMTI agent with the hostname/port of the weaver JVM, or you could use some form of discovery, depending on your requirements.
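The weaver side of that setup might look roughly like the sketch below. The wire format (a 4-byte big-endian length followed by the class file bytes, in both directions) is an assumption made for this example, not the format I actually used, and weave() is a placeholder for the ASM step. WeaverServer is a made-up name.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical weaver process that the JVMTI agent connects to.
class WeaverServer {

    /** Placeholder for the real rewriting step (e.g. an ASM visitor chain). */
    static byte[] weave(byte[] classFile) {
        return classFile;
    }

    /** Handles one request: [int length][class bytes] in, same framing out. */
    static void handle(InputStream rawIn, OutputStream rawOut) throws IOException {
        DataInputStream in = new DataInputStream(rawIn);
        DataOutputStream out = new DataOutputStream(rawOut);
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("negative class file length: " + length);
        }
        byte[] original = new byte[length];
        in.readFully(original);
        byte[] woven = weave(original);
        out.writeInt(woven.length);
        out.write(woven);
        out.flush();
    }

    /** Accepts agent connections and serves requests until each agent disconnects. */
    static void serve(int port) throws IOException {
        try (ServerSocket server = new ServerSocket(port)) {
            while (true) {
                try (Socket agent = server.accept()) {
                    InputStream in = agent.getInputStream();
                    OutputStream out = agent.getOutputStream();
                    try {
                        while (true) {
                            handle(in, out);
                        }
                    } catch (EOFException disconnected) {
                        // agent closed the connection; wait for the next one
                    }
                }
            }
        }
    }
}
```

On the agent side, the class load hook blocks until the reply arrives, so class loading in the profiled JVM is only as fast as this round trip.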
The JIT will turn byte code into native code. If you want to produce native code, you need to let the JIT do it or write native code which is called via JNI.
Perhaps what you are trying to achieve can be done more simply another way.
Create a native Java agent, which uses JVMTI APIs to change the bytecode of a class while it's being loaded, using the OnClassLoad event hook.
Though you don't need to do what you want. Why make the solution more complicated (and less likely to work) than it needs to be?
You cannot change the byte code of a class once it has been loaded. You can either make sure your instrumentation runs before it is loaded, or you can create a new ClassLoader, and re-load the classes inside of it by not asking the parent class. You can't use those classes with code loaded outside of the ClassLoader though, as that code will refer to the earlier loaded, non-altered class.
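A sketch of the second option, a class loader that defines its own copy of selected classes instead of asking the parent first. ReloadingLoader is a made-up name, the package prefix is whatever you want to re-load, and the class bytes are assumed to be readable as resources from the parent loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical child-first loader for classes under one package prefix.
class ReloadingLoader extends ClassLoader {

    private final String prefix; // e.g. "com.example."

    ReloadingLoader(ClassLoader parent, String prefix) {
        super(parent);
        this.prefix = prefix;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            if (!name.startsWith(prefix)) {
                return super.loadClass(name, resolve); // normal parent-first delegation
            }
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                byte[] bytes = readClassBytes(name);
                // The bytes could be altered here before the class is defined.
                // Note: java.* classes cannot be defined this way.
                loaded = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    /** Reads the original class file through the parent, without loading the class there. */
    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

A class defined this way is a different Class object from the one the parent loaded, so instances of one cannot be cast to the other.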