Use java reflection to analyze Java project - java

I'm having an issue with Java reflection.
How can I load a .java file, or a whole project, and then analyze it?
Input: .java code
Output: the analyzed classes, methods, attributes, relations between classes, etc.

Analyzing .java files is a lot more difficult than it sounds: they are plain text, so you need textual analysis (parsing) to get anything out of them. A tool like PMD does exactly that and performs static code analysis on .java files.
https://pmd.github.io/
Analyzing .class files, however, is a lot easier. For this task you need to create a custom class loader object (URLClassLoader should work) and use it to find and load all of the Class objects. Then you can use those objects' methods to get information about the classes. A tool that performs static code analysis on .class files is FindBugs.
http://findbugs.sourceforge.net
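The class-file approach described above could be sketched roughly as follows. This is a minimal illustration, not a full analyzer: the class name ClassInspector, its methods, and the command-line arguments are hypothetical, and it assumes the project has already been compiled into a directory (or jar) of .class files.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: loads compiled classes and reports their structure via reflection.
class ClassInspector {

    /** Loads a class by its fully qualified name from a directory or jar of compiled classes. */
    static Class<?> load(Path classesRoot, String className) throws Exception {
        URL[] urls = { classesRoot.toUri().toURL() };
        // The loader is left open on purpose: closing it could break lazy loading
        // of other classes that the inspected class refers to.
        URLClassLoader loader = new URLClassLoader(urls, ClassInspector.class.getClassLoader());
        // initialize = false: do not run static initializers of the code under analysis.
        return Class.forName(className, false, loader);
    }

    /** Describes superclass, interfaces, declared fields and declared methods of a class. */
    static List<String> describe(Class<?> type) {
        List<String> lines = new ArrayList<>();
        lines.add("class: " + type.getName());
        if (type.getSuperclass() != null) {
            lines.add("extends: " + type.getSuperclass().getName());
        }
        for (Class<?> iface : type.getInterfaces()) {
            lines.add("implements: " + iface.getName());
        }
        for (Field field : type.getDeclaredFields()) {
            lines.add("field: " + Modifier.toString(field.getModifiers())
                    + " " + field.getType().getName() + " " + field.getName());
        }
        for (Method method : type.getDeclaredMethods()) {
            lines.add("method: " + method.getName() + "/" + method.getParameterCount());
        }
        return lines;
    }

    // Usage (arguments are assumptions): java ClassInspector <classes dir or jar> <class name>
    public static void main(String[] args) throws Exception {
        Class<?> type = load(Paths.get(args[0]), args[1]);
        describe(type).forEach(System.out::println);
    }
}
```

Superclass, interface and field-type entries give a first approximation of the relations between classes; relations that only appear inside method bodies are not visible to reflection.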
Hopefully this helps you a bit.

Related

Java Bytecode Manipulation Without External Library

Libraries such as ASM, BCEL, Javassist and AspectJ are all capable of runtime bytecode manipulation, but how do they achieve this?
I have done some basic bytecode manipulation using ASM before, but I don't understand how it works. Is the Java Agent executed in the JVM before the remainder of the program, allowing ASM to load the compiled classes and edit them before they are executed by the JVM?
If so, is it possible to perform Java bytecode manipulation without an external library like ASM, for example by loading the compiled class files with a BufferedReader and writing a custom parser?
These libraries build on standard Java APIs which, of course, you can also use yourself without these libraries.
First of all, Java class files are just sequences of bytes in a well-defined format, as specified in JVMS §4, The class File Format. The primary task of the mentioned libraries is to provide tools for processing byte sequences in this format. The second task is obtaining the definitions of existing classes and exporting modified or newly created ones.
There are two different ways of dealing with the second task. One is to read compiled classes from persistent storage such as file systems or jar files and write them back to that storage while the code in question is not running, as build and deployment tools do. This should be trivial, as it just boils down to reading and writing bytes.
The other is to manipulate classes at runtime, which Java Agents can do via the Instrumentation API. It offers mechanisms for intercepting classes at loading/definition time, before their first use, and also for redefining classes that are already loaded. Redefinition currently can't change a class freely: it has to retain all field and method declarations, so it is mainly useful for changing the executable code of methods.
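As a rough sketch of the runtime route, a Java Agent registers a ClassFileTransformer through the Instrumentation API. The names LoggingAgent and LoggingTransformer are hypothetical, and this transformer only records class names instead of rewriting bytes. It assumes the agent is packaged in a jar whose manifest names the class in a Premain-Class entry and that the JVM is started with -javaagent pointing at that jar.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical agent: observes classes at load time without modifying them.
class LoggingAgent {

    /** Records the internal name (e.g. "java/util/ArrayList") of each class passed to it. */
    static class LoggingTransformer implements ClassFileTransformer {
        private final List<String> seen = Collections.synchronizedList(new ArrayList<>());

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            seen.add(className);
            // Returning null tells the JVM to keep the original bytes.
            // A real transformer would return a modified copy of classfileBuffer here.
            return null;
        }

        List<String> seen() {
            return seen;
        }
    }

    /** Called by the JVM before the application's main method when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }
}
```

Producing the modified bytes inside transform is exactly the part that libraries such as ASM take care of.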
If you want examples of class file processing without additional 3rd party libraries, there are some answers on Stack Overflow:
Extract the class name from a class file
Find all class dependencies
Parse the constant pool
Iterate over all instructions of a method
Of course, these examples are only single-purpose code or sketches. If you expand them to something more general or useful, you will soon end up at basically re-implementing these libraries.
Class files are just a sequence of bytes, the format of which is specified in The Java Virtual Machine Specification. BufferedReader is for text files so you'd want BufferedInputStream, but the format is quite complicated.
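To give an idea of what reading the format by hand looks like, here is a minimal sketch that parses only the first eight bytes of a class file: the magic number 0xCAFEBABE followed by the minor and major version. The class name ClassFileHeader is hypothetical; everything after the version fields (constant pool, fields, methods, attributes) is where the real complexity starts.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: reads the fixed-size header at the start of a class file.
class ClassFileHeader {
    final int minorVersion;
    final int majorVersion;

    private ClassFileHeader(int minorVersion, int majorVersion) {
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    /** Reads magic number, minor version and major version from the stream. */
    static ClassFileHeader read(InputStream in) {
        try {
            // DataInputStream reads big-endian values, which is what the class file format uses.
            DataInputStream data = new DataInputStream(new BufferedInputStream(in));
            int magic = data.readInt();
            if (magic != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file: bad magic number");
            }
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort();
            return new ClassFileHeader(minor, major);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Usage (argument is an assumption): java ClassFileHeader path/to/Some.class
    public static void main(String[] args) throws IOException {
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            ClassFileHeader header = read(in);
            System.out.println("major=" + header.majorVersion + " minor=" + header.minorVersion);
        }
    }
}
```

A major version of 52, for example, corresponds to Java 8.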
You can load manipulated class files as if they were generated by javac. You can also load them dynamically with java.net.URLClassLoader.newInstance or similar. A Java Agent allows modification of class files as they are loaded, either through a Java or a native interface (the latter being necessary if you want to modify classes that are loaded before classes that load classes).
Recently the Java development group at Oracle announced a new JEP (JDK Enhancement Proposal) for class file manipulation. Once it is delivered, additional libraries should no longer be needed for this.
https://openjdk.org/jeps/8280389

Compiling a Java file out of context

I have a large pre-compiled project with lots of packages and class files. I have extracted one of the class files, decompiled it, and edited some of the code inside. Now I would like to compile the changed code and re-insert it into the original pre-compiled project. Unfortunately the code has many references to objects in the pre-compiled project, so I cannot compile it without it already being in the project, which creates a rather large paradox. Is there any way for me to do what I am trying to accomplish?
Just compile it with a classpath which refers to the existing class files (or the jar file that contains those class files). There should be no problem.
However, note that if you change any constants in the file, those changes won't be reflected in any other code that refers to those constants.
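Compiling one file against the existing class files amounts to running javac with -classpath pointing at the project's classes or jar. A minimal sketch using the JDK's javax.tools API is shown below; the class name SingleFileCompiler and all paths are hypothetical, and it assumes the code runs on a JDK (on a plain JRE no system compiler is available).

```java
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles a single source file against already compiled classes.
class SingleFileCompiler {

    /** Builds the javac arguments: existing classes on the classpath, output into outputDir. */
    static List<String> javacArguments(String classpath, String outputDir, String sourceFile) {
        List<String> args = new ArrayList<>();
        args.add("-classpath");
        args.add(classpath);   // directory or jar holding the pre-compiled project
        args.add("-d");
        args.add(outputDir);   // where the new .class file is written (must exist)
        args.add(sourceFile);  // the edited .java file
        return args;
    }

    /** Runs the system Java compiler and reports whether compilation succeeded. */
    static boolean compile(String classpath, String outputDir, String sourceFile) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler; run this on a JDK");
        }
        String[] args = javacArguments(classpath, outputDir, sourceFile).toArray(new String[0]);
        // null streams mean: use System.in, System.out and System.err.
        int exitCode = compiler.run(null, null, null, args);
        return exitCode == 0;
    }

    // Usage (arguments are assumptions): java SingleFileCompiler original.jar patched Edited.java
    public static void main(String[] args) {
        System.out.println(compile(args[0], args[1], args[2]) ? "compiled" : "failed");
    }
}
```

The resulting .class file can then replace the original one in the project's directory or jar.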
It would generally be a much better idea to recompile from the complete source code. It would also be a better idea to use the original source code than just the result of decompilation - do you not have access to the original source code? (If you don't, are you sure that what you're doing is even legal in your country? I'm not a lawyer, but you should at least check...)
I would recompile the whole thing to avoid problems, but if you MUST, try this and let me know if it works for you:
Instead of loading the class in your original project, load it using Class.forName: http://docs.oracle.com/javase/7/docs/api/java/lang/Class.html
Remember that you cannot change the signatures of the methods, as that would break the contract (interface) that the rest of the project was compiled against.
Also keep in mind the serialVersionUID (see: What is a serialVersionUID and why should I use it?)

How eclipse deduces the methods of a class in a jar file

like:
import com.xxx.utility.*;
class MyClass {
    public static void main(String[] args) {
        MyUtility ut = new MyUtility();
        MyUtility.doAdd(5, 6);
        .......
    }
}
When you type "." after MyUtility, Eclipse shows all the methods you can use. How does Eclipse achieve this?
Does Eclipse use reflection on the fly (like the answer in this thread)?
The architecture of the Eclipse software is described here; section 6.1.2, Java Development Tools (JDT), briefly describes the incremental build system used. That system would have all the relevant information to populate the autocomplete mechanism.
For the exact mechanism, you would have to look at the Eclipse source code.
Yes, Eclipse (and any other Java IDE) uses reflection.
In fact Eclipse uses a ClassLoader for each project's libraries, so it loads the classes in jar files, and after that everything is easy: it can get the information using reflection.
By the way, Java IDEs not only use reflection but also read class debug info, to extract parameter names and so on.
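For illustration only, this is what a reflection-based member lookup of the kind this answer describes could look like. The class name MemberCompletion is hypothetical and nothing here is taken from the Eclipse code base; the following answer argues that Eclipse does not actually work this way.

```java
import java.lang.reflect.Method;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: lists public method names that start with what the user typed so far.
class MemberCompletion {

    static SortedSet<String> publicMethods(Class<?> type, String prefix) {
        SortedSet<String> names = new TreeSet<>();
        // getMethods() returns public methods, including inherited ones.
        for (Method method : type.getMethods()) {
            if (method.getName().startsWith(prefix)) {
                names.add(method.getName()); // overloads collapse into one entry
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Proposals after typing "someString.toU"
        System.out.println(publicMethods(String.class, "toU"));
    }
}
```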
There is an explanation in this article. Basically the Eclipse Java compiler builds an Abstract Syntax Tree (AST) of your code which lets it find all the information it needs for autocompletion very quickly.
So it is not using reflection for this; rather, it compiles the code into an internal form for quick access.
When no source code is available (you just have a .class file) it is still possible to construct the part of the AST containing the class methods and types which are needed for completion. This appears to be done by reading the .class files directly rather than using a class loader (org.eclipse.jdt.internal.compiler.classfmt.ClassFileReader).

Statement level reflection in java

I'm looking for a mechanism in Java (maybe like reflection) to access the statements in a program. For example, I want to access the statements of a function and then walk through the statement tree to analyze Java programs.
In .NET, Microsoft provides Extended Reflection. What is the alternative in Java?
For C files, CIL processes the .c source files and lets us access the statements and walk through the tree (even changing and inserting code) statically. A tool that processes .java code and does similar work would solve my problem.
If you aim to analyze Java code via a Java application you will need a copy of the source code. .java files are essentially text files, so with the source code in hand your program could read the files similar to reading any text file.
There are several tools such as PMD and Clover that will perform this analysis for you. It may save time and resources to use an established tool. Although Clover is no longer a free tool it provides extensive metrics on code complexity. I believe PMD may provide similar metrics.
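If you would rather walk the statements yourself than read the source as raw text, one option that is not mentioned in this answer is the Compiler Tree API that ships with the JDK (packages com.sun.source.tree and com.sun.source.util). The sketch below is an assumption about what might fit the question, not an established recipe: the class name StatementLister is hypothetical, it needs a JDK rather than a JRE, and it only lists the top-level statements of each method body.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.tree.StatementTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

// Hypothetical helper: parses Java source text and lists the statements of each method.
class StatementLister {

    /** Returns entries such as "f: IF" for every top-level statement in every method body. */
    static List<String> statementKinds(String fileName, String source) {
        // Wrap the source text so the compiler can read it without touching the disk.
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        List<String> found = new ArrayList<>();
        try {
            // parse() only builds the syntax trees; nothing is type-checked or compiled.
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitMethod(MethodTree method, Void unused) {
                        if (method.getBody() != null) { // abstract methods have no body
                            for (StatementTree statement : method.getBody().getStatements()) {
                                found.add(method.getName() + ": " + statement.getKind());
                            }
                        }
                        return super.visitMethod(method, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        String source = "class Demo { int f(int a) { int b = a + 1; if (b > 2) { b = 0; } return b; } }";
        statementKinds("Demo.java", source).forEach(System.out::println);
    }
}
```

Overriding further visit methods of TreeScanner (visitIf, visitAssignment and so on) lets the walk descend into nested statements and expressions.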

Hide a class in a .jar

Whenever I build my app, all classes are (logically) visible in the .jar that comes out of it.
That includes a class that holds the connection information for my MySQL server (for the app to connect to). But I don't want this information to be publicly visible!
How can I "hide" this code or "hide" the class?
Thanks!
I think you mean you don't want someone to reverse engineer the .class files inside your jar file. There are many decompilers that can do that.
So you would need to obfuscate your code with an obfuscator utility.
The process of obfuscation will convert bytecode into a logically equivalent version that is extremely difficult for decompilers to pick apart. Keep in mind that the decompilation process is extremely complicated and cannot be easily 'tweaked' to bypass obfuscated code. Essentially the process is as follows:
Compile Java source code using a regular compiler (i.e. the JDK).
Run the obfuscator, passing in the compiled class file as a parameter. The result will be a different output file (perhaps with a different extension).
This file, when renamed as a .class file, will be functionally equivalent to the original bytecode. It will not affect performance because a virtual machine will still be able to interpret it.
Here is an article describing this process in more detail and introducing an early obfuscator, Crema:
http://www.javaworld.com/javaworld/javatips/jw-javatip22.html
