Rewriting method calls within compiled Java classes

I want to replace calls to a given class with calls to another class within a method body whilst parsing compiled class files...
Or, put another way, is there a way of detecting usages of a given class in a method and replacing just that part of the method, using something like Javassist?
For example, if I had the compiled version of
class A { public int m() { int i = 2; B.multiply(i,i); return i; } }
is there a method of detecting the use of B and then altering the code to perform
class A { public int m() { int i = 2; C.divide(i,i); return i; } }
I know the alternative would be to write a parser to grep the source files for usages but I would prefer a more elegant solution such as using reflection to generate new compiled class files.
Any thoughts?

As @djna says, it is possible to modify bytecode files before you load them, but you probably do not want to do this:
The code that does the code modification is likely to be complex and hard to maintain.
The code that has been modified is likely to be difficult to debug. For a start, a source level debugger will show you source code that no longer corresponds to the code that you are actually executing.
Bytecode rewriting is useful in certain cases. For example, JDO implementations use bytecode rewriting to replace object member fetches with calls into the persistence libraries. However, if you have access to the source code for these files, you'll get a better (i.e. more maintainable) solution by preprocessing (or generating) the source code.
EDIT: and AOP or Groovy sound like viable alternatives too, depending on the extent of rewriting that you anticipate doing.

BCEL or ASM.
I recently looked at a number of libraries for reading Java class files. BCEL was the fastest, had the least number of dependencies, compiled out of the box, and had a deliciously simple API. I preferred BCEL to ASM because ASM has more dependencies (although the API is reputedly simpler).
AspectJ, as previously mentioned, is another viable option.
BCEL is truly simple. You can get a list of methods in three lines of code:
ClassParser cp = new ClassParser( "A.class" );
JavaClass jc = cp.parse();
Method[] m = jc.getMethods();
There are other API facilities for further introspection, including, I believe, ways to get the instructions in a method. However, this solution will likely be more laborious than AspectJ.
Another possibility is to change the multiply or divide methods themselves, rather than trying to change all instances of the code that calls the operation. That would be an easier road to take with BCEL (or ASM).

The format of byte code for compiled Java is specified and products exist that manipulate it.
This library appears to have the capability you need. I've no idea how easy it is to do these transformations reliably.
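Since the class file format is specified, the detection half of the question can be sketched with nothing but the JDK: every class that a compiled class refers to shows up as a CONSTANT_Class entry in its constant pool. The class and method names below are made up for illustration, and this only reports references. It does not rewrite anything, which is where a bytecode library comes in.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of any library.
class ConstantPoolScan {
    /** Returns the internal names (e.g. "java/lang/Object") of all classes the class file refers to. */
    static Set<String> referencedClasses(byte[] classFile) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
        if (in.readInt() != 0xCAFEBABE) throw new IOException("not a class file");
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count];
        for (int i = 1; i < count; i++) { // constant pool entries are numbered from 1
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                     // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class -> index of its name
                case 8: case 16: case 19: case 20: in.skipBytes(2); break; // String, MethodType, Module, Package
                case 15: in.skipBytes(3); break;                           // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.skipBytes(4); break;                                // Integer, Float, refs, NameAndType, (Invoke)Dynamic
                case 5: case 6: in.skipBytes(8); i++; break;               // Long, Double occupy two slots
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int index : classNameIndex) {
            if (index != 0) names.add(utf8[index]);
        }
        return names;
    }
}
```

Run against the compiled A from the question, the returned set should include B, which tells you whether A is worth handing to a rewriting library at all. Array types also appear here, in descriptor form such as [Ljava/lang/String;.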

If you don't mind using Groovy, you can intercept the call to B.multiply and replace it with C.divide. You can find an example here.

It's much easier to perform these operations ahead-of-time, where the executable on disk is modified before launching the application. Manipulating the code in memory at run time is even more prone to errors than manipulating code in memory in C/C++. Why do you need to do this?

Related

How to get the variables name declared within a method in java

How can I get the names of the variables declared within a method in a Java class?
For example:
import java.lang.reflect.Method;

public class A {
    String x;

    public void xyz() {
        int i;
        String z = null;
        // some code here
    }

    public static void main(String[] args) {
        Method[] methods = A.class.getDeclaredMethods();
        for (Method q : methods) {
            // some code here to get the variables declared inside the method (i.e. q)
        }
    }
}
How can I do that?
Thanks in advance..

There is no simple way to do this.
If those were fields, you could get their names using reflection. However, local variable and parameter names are not loaded into the JVM. So you would need to resort to reading the "A.class" file and extracting the debug information for that method. And the bad news is that if the class wasn't compiled with debug information, then even that wouldn't work.
There are libraries around for reading ".class" files for various purposes, but I can't give a specific recommendation.
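To make the distinction concrete, here is a small sketch of what reflection does and does not give you. The class names are invented for the example. Field names always come back; parameter names come back only if the class was compiled with javac's -parameters flag (Parameter.isNamePresent() tells you which case you are in); locals such as i and z are simply not reachable this way.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Parameter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for this answer.
class ReflectionNames {
    static class A {
        String x;

        public void xyz(int count) {
            int i = count;   // local variable: invisible to reflection
            String z = null; // local variable: invisible to reflection
        }
    }

    /** Field names are always available through reflection. */
    static List<String> fieldNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!f.isSynthetic()) names.add(f.getName()); // skip compiler/tool-generated fields
        }
        return names;
    }

    /** Parameter names are real only when the class was compiled with -parameters. */
    static List<String> parameterNames(Method m) {
        List<String> names = new ArrayList<>();
        for (Parameter p : m.getParameters()) {
            // Without -parameters, getName() returns a placeholder such as arg0.
            names.add(p.isNamePresent() ? p.getName() : "(synthesized) " + p.getName());
        }
        return names;
    }
}
```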
But the $64,000 question is "But why ...?". What is the point of listing the local variable names for a method from Java? Can't you just look at the source code? Can't you dump the ".class" file with "javap" or decompile it with some 3rd party decompiler?
I thought that for big programs it would be useful, for understanding and analyzing them, to be able to list the variables, their types, the method names, their parameter lists and so on. That is the only reason...
I think you just need a decent IDE ...

To paraphrase another answer, there's no simple way to do this with reflection.
There is a way to do it. You need a full Java source code parser and name/type resolver ("symbol tables").
The Java compiler offers internal APIs to get at that information. Eclipse JDT offers something similar. Our DMS Software Reengineering Toolkit offers a full parser with this information easily accessible, and considerable additional help to build analyzers and/or code generators that take advantage of this extra information. (You can see this information extracted by DMS in the example Java Source Code Browser at my site, see bio).
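As a rough illustration of the parser route, the sketch below uses the tree API that ships with the JDK (com.sun.source, in the jdk.compiler module, so it needs a JDK rather than a bare JRE). It only parses, so it reports the type as written in the source and does no name or type resolution. LocalVariableLister and localsByMethod are names made up for this example.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical helper: parses source text and lists the locals declared in each method body.
class LocalVariableLister {
    static Map<String, List<String>> localsByMethod(String className, String source) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // null when no compiler is available
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (CompilationUnitTree unit : task.parse()) { // parse only, no attribution
            new TreeScanner<Void, Void>() {
                private List<String> current; // non-null only while inside a method body

                @Override
                public Void visitMethod(MethodTree method, Void p) {
                    current = new ArrayList<>();
                    result.put(method.getName().toString(), current);
                    scan(method.getBody(), p); // body only, so parameters are skipped
                    current = null;
                    return null;
                }

                @Override
                public Void visitVariable(VariableTree variable, Void p) {
                    if (current != null) current.add(variable.getType() + " " + variable.getName());
                    return super.visitVariable(variable, p);
                }
            }.scan(unit, null);
        }
        return result;
    }
}
```

For the class A in the question, localsByMethod("A", source).get("xyz") gives the entries "int i" and "String z", while the field x is left out because it is not inside a method body.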

Programmatic code modification (e.g. variable extraction) in Java

I know it's possible to do nice stuff with Reflection, such as invoking methods, or altering the values of fields. Is it possible to do heavier code modification, though, at runtime and programmatically?
For instance, if I have a method:
public void foo(){
this.bar = 100;
}
Can I write a program that modifies the innards of this method, notices that it assigns a constant to a field, and turns it into the following:
public int baz = 100;
public void foo(){
this.bar = baz;
}
Perhaps Java isn't really the language to do this kind of thing in - if not, I'm open to suggestions for languages that would allow me to basically reparse or inspect code in this way, and be able to alter it so precisely. I might be pipe dreaming here though, so please tell me if this is the case also.

Just adding a suggestion from a friend - Apache Commons' BCEL looks excellent:
http://commons.apache.org/bcel/manual.html
The Byte Code Engineering Library (Apache Commons BCEL™) is intended to
give users a convenient way to analyze, create, and manipulate (binary)
Java class files (those ending with .class). Classes are represented by
objects which contain all the symbolic information of the given class:
methods, fields and byte code instructions, in particular.
Such objects can be read from an existing file, be transformed by a
program (e.g. a class loader at run-time) and written to a file again.
An even more interesting application is the creation of classes from
scratch at run-time. The Byte Code Engineering Library (BCEL) may be
also useful if you want to learn about the Java Virtual Machine (JVM)
and the format of Java .class files.

You are looking for software that does bytecode manipulation. There are several frameworks for this, but the two best known at the moment are:
ASM
Javassist
When modifying the bytecode of Java classes at runtime, keep the following in mind:
If you change a class's bytecode after the class has been loaded by a classloader, you'll have to find a way to reload its class definition (either through classloading tricks or by using hotswap functionality).
If you change the class's interface (for example, by adding new methods or fields), you will only be able to reach the additions through reflection.

It's probably fair to say that Java wasn't designed with this purpose in mind, but you can do it potentially. How and when depends a little on the ultimate aim of the exercise. A couple of options:
At the source code level, you can use the Java Compiler API to compile arbitrary code into a class file (which you can then load).
At the bytecode level, you can write an agent that installs a ClassFileTransformer to arbitrarily alter a class "on the fly" as it is loaded. In practice, if you do this, you will also probably make use of a library such as BCEL (Bytecode Engineering Library) to make manipulating the class easier.
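For the agent option, the skeleton is small. The sketch below is only the plumbing: the class name LoggingAgent and the com/example/ package filter are placeholders, and the transformer returns null (meaning "leave the class unchanged") at the point where BCEL or a similar library would produce new bytes.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent class. The agent jar's manifest needs "Premain-Class: LoggingAgent",
// and the JVM is started with -javaagent:path/to/agent.jar.
class LoggingAgent {
    static class Transformer implements ClassFileTransformer {
        int seen; // how many matching classes passed through

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            // className uses the internal form, e.g. com/example/Foo (placeholder package).
            if (className != null && className.startsWith("com/example/")) {
                seen++;
                // A bytecode library would rewrite classfileBuffer here and return the new bytes.
            }
            return null; // null tells the JVM to keep the original bytes
        }
    }

    /** Called by the JVM before main() when the agent is attached at startup. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new Transformer());
    }
}
```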

You want to investigate program transformation systems (PTS), which provide general facilities for parsing and transforming languages at the source level. PTS provide rewrite rules that say in effect, "if you see this pattern, replace it by that pattern" using the surface syntax of the target language. This is done using full parsers so the rewrite rule really operates on language syntax and not text; such rewrite rules obviously won't attempt to modify code-like text in comments, unlike tools based on regexps.
Our DMS Software Reengineering Toolkit is one of these. It provides not only the usual parsing, AST building and prettyprinting (reproducing compilable source code complete with comments), but also supports symbol tables and control and data flow analysis. These are needed for almost any interesting transformations. DMS also has front ends for a variety of dialects of Java as well as many other languages.
Bytecode transformers exist because they are much easier to build; it is pretty easy to "parse" bytecode. Of course, you can't make permanent source changes with a bytecode transformer, so it is a lot less useful.

You mean like this?
String script1 = "println(\"OK!\");";
eval( script1 );
script1 += "println(\"... well, maybe NOT OK after all\");";
eval( script1 );
Output:
OK!
OK!
... well, maybe NOT OK after all
... use a scripting extension to Java. Groovy and other things like that would probably allow you to do what you want. I've written a scripting extension which integrates with Java through reflection almost seamlessly myself; contact me if you're interested in the details.

Best choice? Edit bytecode (asm) or edit java file before compiling

Goal
Detecting where comparisons between and copies of variables are made
Inject code near the line where the operation has happened
The purpose of the code: every time the class is run, increment a counter
General purpose: count the number of comparisons and copies made during an execution with certain parameters
2 options
Note: I always have a .java file to begin with
1) Edit java file
Find comparisons with regex and inject pieces of code near the line
And then compile the class (My application uses JavaCompiler)
2)Use ASM Bytecode engineering
Again detect the events I want to track and inject code into the bytecode
And then use the (already compiled but modified) class
My Question
What is the best/cleanest way? Is there a better way to do this?

If you go for the Java route, you don't want to use regexes -- you want a real Java parser. So that may influence your decision. Mind, the Oracle JVM includes one, as part of their internal private classes that implement the Java compiler, so you don't actually have to write one yourself if you don't want to. But decoding the Oracle AST is not a five-minute task either. And, of course, using that is not portable, if that's important.
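To show why a parser beats a regex here, this is a minimal sketch that counts comparison operators using the tree API bundled with the JDK (com.sun.source, which needs a JDK rather than a bare JRE). ComparisonCounter is a made-up name, and the sketch only finds the comparisons; injecting the counter code would be a further step.

```java
import com.sun.source.tree.BinaryTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.Tree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.net.URI;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical helper: counts ==, !=, <, >, <= and >= in Java source text.
class ComparisonCounter {
    private static final Set<Tree.Kind> COMPARISONS = EnumSet.of(
            Tree.Kind.EQUAL_TO, Tree.Kind.NOT_EQUAL_TO, Tree.Kind.LESS_THAN,
            Tree.Kind.GREATER_THAN, Tree.Kind.LESS_THAN_EQUAL, Tree.Kind.GREATER_THAN_EQUAL);

    static int countComparisons(String source) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///Input.java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        int[] count = {0}; // array so the anonymous scanner can update it
        for (CompilationUnitTree unit : task.parse()) {
            new TreeScanner<Void, Void>() {
                @Override
                public Void visitBinary(BinaryTree node, Void p) {
                    if (COMPARISONS.contains(node.getKind())) count[0]++;
                    return super.visitBinary(node, p); // keep descending into operands
                }
            }.scan(unit, null);
        }
        return count[0];
    }
}
```

Because the scanner works on the syntax tree, a string literal like "a < b" or a comment containing a == b is not counted, which is exactly the case where a regex gets it wrong.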
If you go the ASM route, the bytecode will initially be easier to analyze, since the semantics are a lot simpler. Whether the simplicity of analyses outweighs the unfamiliarity is unknown in terms of net time to your solution. In the end, in terms of generated code, neither is "better".
There is an apparent simplicity in just looking at generated Java source code and "knowing" that What You See Is What You Get, versus doing primitive dumps of class files for debugging and so on, but all that apparent simplicity is there because of your existing comfort with the Java language. Once you spend some time dredging through bytecode, that too will become comfortable. It's just a question of whether it's worth the time to you to get there in the first place.

Generally it all depends on how comfortable you are with either option and how critical performance is. The bytecode manipulation will be much faster and somewhat simpler, but you'll have to understand how bytecode works and how to use the ASM framework.
Intercepting variable access is probably one of the simplest use cases for ASM. You could find a few more complex scenarios in this AOSD'07 paper.
Here is simplified code for intercepting variable access:
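ClassReader cr = ...;
ClassWriter cw = ...;
// ClassReader.accept() takes a ClassVisitor, so the MethodVisitor is returned from visitMethod.
cr.accept(new ClassVisitor(Opcodes.ASM9, cw) { // Opcodes.ASM9 assumes ASM 9.x
    @Override
    public MethodVisitor visitMethod(int access, String name, String desc, String sig, String[] exceptions) {
        MethodVisitor mv = super.visitMethod(access, name, desc, sig, exceptions);
        return new MethodVisitor(Opcodes.ASM9, mv) {
            @Override
            public void visitVarInsn(int opcode, int var) {
                if (opcode == Opcodes.ALOAD) { // loading Object var
                    ... insert method call
                }
                super.visitVarInsn(opcode, var); // keep the original instruction
            }
        };
    }
}, 0);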

If it were me, I'd probably use the ASM option.
If you need a tutorial on ASM, I stumbled upon this user-written tutorial.

How Do I Place Auto-generated Java Classes in a Single .java File?

As everyone knows, public Java classes must be placed in their own file named [ClassName].java
( When java class X required to be placed into a file named X.java? )
However, we are auto-generating 50+ java classes, and I'd like to put them all in the same file for our convenience. This would make it substantially easier to generate the file(s), and copy them around when we need to.
Is there any way I can get around this restriction? It seems like more of a stylistic concern - and something I might be able to disable with a compiler flag.
If not, what would you recommend?

Can you put wrapper class around your classes? Something like:
public class Wrapper {
public static class A {...}
public static class B {...}
....
}
Then you can access them via Wrapper.A, Wrapper.B.

At the .class level, this is a requirement per the Java spec. Even the inner classes get broken out into their own class file in the form Outer$Inner.class. I believe the same is true at the language level.
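A small sketch of what a single generated source file can look like. The names Generated, A, B and Helper are invented. It relies on two things javac does: nested classes compile to Outer$Inner.class, and a file may hold several top-level classes as long as at most one of them is public (and that one must match the file name).

```java
// File: Generated.java (hypothetical). Generated may be declared public if the file has this name.
class Generated {
    // Nested classes: referenced as Generated.A in source, compiled to Generated$A.class.
    public static class A {
        public String name() { return "A"; }
    }

    public static class B {
        public String name() { return "B"; }
    }
}

// A second top-level class in the same file; it must stay package-private.
class Helper {
    String name() { return "Helper"; }
}
```

Either way the compiler still emits one .class file per class (Generated.class, Generated$A.class, Generated$B.class, Helper.class), so only the source side gets simpler to copy around.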
Your best bet is to generate the files and make your copy script smart. Perhaps generate them and zip them up. Usually, if you have to move these files around then either everyone has the same generator script OR you distribute them as a JAR.

Is there any way I can get around this restriction?
You can change your generated source code to make it acceptable; e.g. by using nested classes, or by putting the generated classes into their own package.
It seems like more of a stylistic concern - and something I might be able to disable with a compiler flag.
It is not just a stylistic concern:
The one file per class rule is allowed by the Java Language Specification.
It is implemented by all mainstream Java compilers.
It is implemented by all mainstream JVMs in the form of the default classloader behavior.
It is assumed by 3rd party Java tools; e.g. IDEs, style checkers, bug checkers, code generation frameworks, etc.
In short, while it would theoretically be legal to implement a Java ecosystem that didn't have this restriction, it is impractical. No such compiler switch exists, and implementing one would be impractical for the reasons above.
The nested class solution is a good one. Another alternative would be to put the generated classes into a separate package (still one file per class) to make them easier to manage.

Runtime code generation and compilation

Say I have this code that uses some input (e.g. a URL path) to determine which method to run, via reflection:
// init
map.put("/users/*", "viewUser");
map.put("/users", "userIndex");
// later
String methodName = map.get(path);
Method m = Handler.class.getMethod(methodName, ...);
m.invoke(handler, ...);
This uses reflection so the performance can be improved. It could be done like this:
// init
map.put("/users/*", new Runnable() { public void run() { handler.viewUser(); } });
map.put("/users", new Runnable() { public void run() { handler.userIndex(); } });
// later
Runnable action = map.get(path);
action.run();
But manually creating all those Runnables like that has its own issues.
I'm wondering, can I generate them at runtime? So I'd have an input map as in the first example, and would dynamically create the map of the second example.
Sure, generating it is just a matter of building a string, but what about compiling and loading it?
Note: I know the performance increase is so little it's a perfect example of premature optimization. This is therefore an academic question, I'm interested in runtime generation and compilation of code.

The only ways to generate code dynamically are to either generate source code and compile it, or to generate bytecode and load it at runtime. There are templating solutions out there for the former, and bytecode manipulation libraries for the latter. Without a real case and some profiling I don't think you can really say which will be better. From a maintenance point of view I think reflection is the best option when available.
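There is also a middle ground that stays inside the JDK and that none of the answers here propose, so treat it as a side note: java.lang.invoke can look a method up once and wrap it in a Runnable, leaving the class generation to the runtime. The sketch below assumes the handler methods take no arguments and return void; RunnableRoutes and Handler are placeholder names.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandleProxies;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example: turns a path -> method-name map into a path -> Runnable map.
class RunnableRoutes {
    public static class Handler {
        public final StringBuilder log = new StringBuilder();
        public void viewUser() { log.append("viewUser;"); }
        public void userIndex() { log.append("userIndex;"); }
    }

    static Map<String, Runnable> build(Handler handler, Map<String, String> methodNames)
            throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType noArgsVoid = MethodType.methodType(void.class);
        Map<String, Runnable> actions = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : methodNames.entrySet()) {
            // Resolve the method once, then bind it to the handler instance.
            MethodHandle mh = lookup.findVirtual(Handler.class, e.getValue(), noArgsVoid).bindTo(handler);
            // Let the runtime produce the Runnable implementation.
            actions.put(e.getKey(), MethodHandleProxies.asInterfaceInstance(Runnable.class, mh));
        }
        return actions;
    }
}
```

Usage mirrors the second snippet in the question: build the map once at init, then call actions.get(path).run() per request.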

I think you can achieve this with the code found here. It has been a while since I tried this, and I'm no longer sure where I found the code I was using, but this looks like the same thing.
Basically, you use the 1.6 Compiler API, but use an "untraditional" way to find source files and write class files: The Compiler takes an Iterable<JavaFileObject>, where you plug in your memory-backed implementation, and a JavaFileManager that handles writing class files, where you hold the binary compiler output in memory.
Now that your code was compiled, you only need a custom ClassLoader that can read from your in-memory byte code and load the class with the right FQCN etc.
And, luckily, all that seems to be ready ;)
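In case the linked code is unavailable, here is my own minimal sketch of the approach described above, not the code behind the link. The source comes from a string-backed JavaFileObject, a forwarding JavaFileManager keeps the compiler output in memory, and a small ClassLoader defines the class from those bytes. InMemoryCompiler is an invented name, and it needs a JDK since ToolProvider.getSystemJavaCompiler() returns null when no compiler is available.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.tools.FileObject;
import javax.tools.ForwardingJavaFileManager;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.StandardJavaFileManager;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles one source string and loads the class without touching disk.
class InMemoryCompiler {
    static Class<?> compile(String className, String source) throws Exception {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        Map<String, ByteArrayOutputStream> output = new HashMap<>(); // binary name -> class bytes

        JavaFileObject src = new SimpleJavaFileObject(
                URI.create("string:///" + className.replace('.', '/') + ".java"),
                JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };

        JavaFileManager manager = new ForwardingJavaFileManager<StandardJavaFileManager>(
                compiler.getStandardFileManager(null, null, null)) {
            @Override
            public JavaFileObject getJavaFileForOutput(JavaFileManager.Location location, String name,
                                                       JavaFileObject.Kind kind, FileObject sibling) {
                // Hand the compiler an in-memory sink instead of a file on disk.
                return new SimpleJavaFileObject(
                        URI.create("mem:///" + name.replace('.', '/') + kind.extension), kind) {
                    @Override
                    public OutputStream openOutputStream() {
                        ByteArrayOutputStream out = new ByteArrayOutputStream();
                        output.put(name, out);
                        return out;
                    }
                };
            }
        };

        boolean ok = compiler.getTask(null, manager, null, null, null, List.of(src)).call();
        if (!ok) throw new IllegalArgumentException("compilation failed for " + className);

        ClassLoader loader = new ClassLoader(InMemoryCompiler.class.getClassLoader()) {
            @Override
            protected Class<?> findClass(String name) throws ClassNotFoundException {
                ByteArrayOutputStream out = output.get(name);
                if (out == null) throw new ClassNotFoundException(name);
                byte[] bytes = out.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            }
        };
        return loader.loadClass(className);
    }
}
```

For the question's use case you would generate one class per route that implements Runnable, compile it this way, instantiate it with getDeclaredConstructor().newInstance(), and put the instance into the map.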

Actually, the reflection engine will generate similar invocation stubs internally, if you invoke the same methods over and over again. (Just use the same Method objects instead of recreating them again and again.)

Well, you could write code to a .java file, compile it with javac (how to do that) and load it into Java using Reflection.
But maybe, as a trade-off, you could also fetch the Method objects during initialization - so you would just have to call the invoke() method for every request.
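The trade-off could look roughly like this: resolve each Method once when the route is registered, and only call invoke() per request. CachedDispatch and its Handler are stand-ins for the question's classes, with no-argument handler methods assumed for brevity.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical example: the reflective lookup happens once, at registration time.
class CachedDispatch {
    public static class Handler {
        public String viewUser() { return "viewUser"; }
        public String userIndex() { return "userIndex"; }
    }

    private final Map<String, Method> routes = new HashMap<>();
    private final Handler handler = new Handler();

    /** Init: look the method up by name and cache the Method object. */
    void register(String path, String methodName) throws NoSuchMethodException {
        routes.put(path, Handler.class.getMethod(methodName));
    }

    /** Later: no lookup by name any more, just the invocation. */
    Object dispatch(String path) throws ReflectiveOperationException {
        return routes.get(path).invoke(handler);
    }
}
```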
