We need to generate Java source code. We do this by modeling the abstract syntax tree and having a tree walker that generates the actual source code text. So far, so good.
Since my AST code is a bit old, it does not have support for annotations and generics. So I'm looking around for open projects to use for future projects with code generation needs. And this is where the actual issue comes. We want to test that the code generated has the correct behavior.
This is where I got the idea to evaluate the AST directly instead of generating the Java source code, compiling it, and running tests against that code. An evaluator would speed up the unit tests, and one could evaluate smaller pieces of generated code, such as a single method, making the "units" more reasonable.
So far I have found the com.sun.codemodel project, which seems quite nice as a modern (supporting Java 5 and 6 features) AST-based code-generating solution.
Anyone know if there is another project that would allow me to evaluate pieces of AST directly (such as a single generated method)?
To evaluate Java, you need all the semantic analysis that goes along with it ("what is the scope of this identifier? What type does it have?") as well as an interpreter.
To get that semantic analysis, you need more than just an AST: you need full name resolution (symbol table building) and type resolution (determination of expression types and validation that expressions are valid in the context in which they are found), as well as class lookup (which actual method does foo refer to?).
With that, you can consider building an interpreter by crawling over the trees in execution order. You'll also need to build a storage manager; you might not need to do a full garbage collector, but you'll need something. You'll also need an interpreter for .class files if you really want to run something, which means you need a parser (and name/type resolution for the class files, too).
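To make the "crawl over the trees in execution order" idea concrete, here is a minimal sketch of a tree-walking evaluator. Everything in it is hypothetical: the Expr, Literal, Name and Binary node types are invented for illustration, the only type is int, and a plain map stands in for the symbol table. Real Java evaluation needs the full name and type resolution described above, plus statements, objects and storage management.

```java
import java.util.HashMap;
import java.util.Map;

class AstEvaluator {
    /** Hypothetical miniature AST node; a real Java AST has far more node kinds. */
    interface Expr {
        int eval(Map<String, Integer> scope);
    }

    static final class Literal implements Expr {
        private final int value;
        Literal(int value) { this.value = value; }
        @Override public int eval(Map<String, Integer> scope) { return value; }
    }

    static final class Name implements Expr {
        private final String id;
        Name(String id) { this.id = id; }
        @Override public int eval(Map<String, Integer> scope) {
            // Name resolution: find the identifier in the current symbol table.
            Integer value = scope.get(id);
            if (value == null) {
                throw new IllegalStateException("unresolved name: " + id);
            }
            return value;
        }
    }

    static final class Binary implements Expr {
        private final char op;
        private final Expr left;
        private final Expr right;
        Binary(char op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        @Override public int eval(Map<String, Integer> scope) {
            int l = left.eval(scope);   // operands are evaluated left to right, as in Java
            int r = right.eval(scope);
            switch (op) {
                case '+': return l + r;
                case '-': return l - r;
                case '*': return l * r;
                default: throw new IllegalArgumentException("unknown operator: " + op);
            }
        }
    }

    /** Evaluates a method body (here a single expression) against its argument bindings. */
    static int evaluate(Expr body, Map<String, Integer> arguments) {
        return body.eval(new HashMap<>(arguments));
    }

    public static void main(String[] args) {
        // Models the body of: int f(int x, int y) { return x * (y + 2); }
        Expr body = new Binary('*', new Name("x"),
                new Binary('+', new Name("y"), new Literal(2)));
        Map<String, Integer> scope = new HashMap<>();
        scope.put("x", 3);
        scope.put("y", 4);
        System.out.println(evaluate(body, scope)); // prints 18
    }
}
```

A unit test for a generated method would then build the tree, bind the arguments, and compare the result of evaluate with the expected value, with no compile step in between.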
I don't know if Eclipse has all this (at least the storage manager part you can get for free :). I'd sort of expect it to, given that its original design was to support Java development, but I've been sorely disappointed by lots of tools over the years.
The DMS Software Reengineering Toolkit is a program analysis/transformation tool that handles many languages. It has a full Java front end including parsing, AST building, symbol table construction and name resolution, type resolution, and call graph construction (needed to resolve virtual function calls), and has a .class file reader to boot with name resolution. So it would be a good foundation for building an interpreter.
DMS can construct arbitrary ASTs, too, and then generate source code from them, so it would handle the code generation end, too, just fine.
[The reason DMS exists is the "sorely disappointed" part].
I'm not sure if this is what you're looking for, but Eclipse's JDT project provides a very good view on the Java AST (including the Java 5 and 6 features). It has a series of utilities and tools for code viewing/rewriting (not necessarily generation). They're all licensed under the Eclipse Public License.
You can get more info at http://eclipse.org/jdt/
In C/C++, the program is first given to the preprocessor to include files, perform macro expansions, etc., and then given to the compiler to convert the code into assembly, and the process goes on. But in Java I do not see any use of a preprocessor. Why is that, and what performs the tasks that the preprocessor normally handles?
The pre-processor is not a necessary step of the compilation process in Java.
In C/C++, functions stored in different files are "included" in other files, which essentially means the included file's text is copied and pasted in its entirety into the including file. This was a pretty good idea given the hardware capabilities of the time, but more modern languages use something called "symbolic imports".
Symbolic imports involve looking for symbols in another file rather than using text directly. In Java, this can involve importing constants or classes. These imports act as references to code in other files. Thus, rather than having to go through the trouble of having the pre-processor copy and paste code around and eventually figuring out what code belongs to which file, Java allows doing these imports on a semantic level directly.
This makes a pre-processor unnecessary to the compilation process of the language, and so, along with other reasons, it has been left out.
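As a small illustration of symbolic imports, the sketch below imports one constant and one type by name. The class name ImportDemo and the method circleArea are made up for the example; the imported symbols are from the standard library. Nothing is textually copied into the file: the compiler resolves PI and List against the already compiled classes.

```java
import static java.lang.Math.PI; // symbolic import of a single constant
import java.util.Arrays;
import java.util.List;           // symbolic import of a single type

class ImportDemo {
    /** Uses the imported constant by its simple name. */
    static double circleArea(double radius) {
        return PI * radius * radius;
    }

    public static void main(String[] args) {
        List<Double> radii = Arrays.asList(1.0, 2.0);
        for (double r : radii) {
            System.out.println("radius " + r + " -> area " + circleArea(r));
        }
    }
}
```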
This page describes how I can use the code generator in javac to generate code given that I can build an AST (using a separate parser which I wrote). The technique involves editing javac's source code to basically bypass the Java parser, so that one could supply his/her own AST to the code generator. This could work, but I was hoping to do it in a slightly cleaner way. I want to include the code generating part of javac as a library in my project so I can use it to generate code, without bringing with it the rest of javac's source.
Is there a way to do this with javac, or is there perhaps a better library?
Also, feel free to change the question's title. I couldn't think of a better one, but it's a little ambiguous. If you suggest an edit for a better title, I'll accept it.
I think what you might be interested in is a Java library like BCEL (Byte Code Engineering Library).
I played around with it back when I took a class on compiler construction. Basically, it has a nice wrapper for generating the constant pool, inserting named bytecode instructions into a method, and whatnot. When you are done, you can either load the class at runtime with a custom classloader or write it out to a file in the normal way.
With BCEL, it should be relatively easy to go from the syntax tree to the java bytecodes, albeit a bit tedious, but you may want to just use BCEL to generate the raw bytecode without building the tree as well in some cases.
Another cool framework is ASM, a bytecode analysis and manipulation framework.
In case you do not want to use a framework, as of now (2014), it is not possible to generate bytecode from a tree using the arbitrary representations of com.sun.source.tree.* as said here.
I'm parsing Java source files and I need to extract type information to guess at signatures of called methods.
e.g. I have foo.x(bar) and I need to figure out the type of foo and bar.
I'm using a java parser that gives me a complete AST, but I'm running into problems with scoping. Is there a different parser I could use that resolves this?
This can't be resolved perfectly because of reflection, but I'm hoping a good parser can deal with scoping and casting issues at the least.
edit: I can't assume that other source files will be present, so I can't simply follow the method call to its source and read the signature from the method declaration
Java's Pluggable Annotation Processing Framework (and its associated APIs) are designed to model Java code in the way you're talking about. You can invoke the Java compiler at runtime, giving you access to the model of the source using the APIs in the javax.lang.model packages. An introductory article is available here.
Basically, getting the types of identifiers would require you to run a big part of a Java compiler yourself. In Java there is a long way from parsing to resolving types, so implementing this is a challenging task even for good programmers.
Perhaps your best way through this would be to take the Java Compiler from OpenJDK, run the relevant compiler phases and extract the types from that.
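A rough sketch of that approach, assuming a JDK (not just a JRE) of version 9 or later so that the javax.tools and com.sun.source APIs are available. The class TypeProbe, its helper StringSource, and the method describeCalls are names invented for this example. It parses one in-memory source file, runs the compiler's analysis phase, and reports the resolved receiver and argument types of each call of the form foo.x(bar).

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.ExpressionTree;
import com.sun.source.tree.MemberSelectTree;
import com.sun.source.tree.MethodInvocationTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePath;
import com.sun.source.util.TreePathScanner;
import com.sun.source.util.Trees;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class TypeProbe {
    /** Holds source text in memory so that no file has to exist on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;
        StringSource(String fileName, String code) {
            super(URI.create("string:///" + fileName), Kind.SOURCE);
            this.code = code;
        }
        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Returns "receiverType.method(argType, ...)" for every call written as a.b(...). */
    static List<String> describeCalls(String fileName, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system compiler; run on a JDK");
        }
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null,
                Collections.singletonList(new StringSource(fileName, code)));
        List<String> result = new ArrayList<>();
        try {
            Iterable<? extends CompilationUnitTree> units = task.parse();
            task.analyze(); // name and type resolution happen in this phase
            Trees trees = Trees.instance(task);
            for (CompilationUnitTree unit : units) {
                new TreePathScanner<Void, Void>() {
                    @Override
                    public Void visitMethodInvocation(MethodInvocationTree call, Void unused) {
                        ExpressionTree select = call.getMethodSelect();
                        if (select instanceof MemberSelectTree) {
                            MemberSelectTree member = (MemberSelectTree) select;
                            TreePath selectPath = new TreePath(getCurrentPath(), select);
                            StringBuilder sb = new StringBuilder();
                            // Type of the receiver expression (the "foo" in foo.x(bar)).
                            sb.append(trees.getTypeMirror(
                                    new TreePath(selectPath, member.getExpression())));
                            sb.append('.').append(member.getIdentifier()).append('(');
                            String separator = "";
                            for (ExpressionTree arg : call.getArguments()) {
                                sb.append(separator).append(trees.getTypeMirror(
                                        new TreePath(getCurrentPath(), arg)));
                                separator = ", ";
                            }
                            result.add(sb.append(')').toString());
                        }
                        return super.visitMethodInvocation(call, unused);
                    }
                }.scan(new TreePath(unit), null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String code = "class A { void m(String foo, int bar) { foo.charAt(bar); } }";
        describeCalls("A.java", code).forEach(System.out::println);
    }
}
```

When referenced classes are missing from the class path, the compiler reports errors on standard error and the affected expressions get error types, so the results are only as good as the sources and class files you can supply.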
What you need is all the name and type resolution machinery. As another poster observed, one way you can get that is to abuse the Java compiler.
But you likely have some goal in mind other than compiling java; once you have those names, you want to do something with them. The Java compiler is unlikely to help you here.
What you really want is a foundation for building a tool that processes the Java language, including name and type resolution, that will help you do the rest of your task.
Our DMS Software Reengineering Toolkit is generalized program analysis and transformation machinery. It parses code, builds ASTs, manages symbol tables, provides generic flow analysis mechanisms, and supports AST modification (or construction) both procedurally and in terms of surface syntax patterns, including (re)generation of compilable text from the ASTs, including any comments.
DMS has a Java Front End that enables DMS to process Java, build Java ASTs, and do all that name and type resolution you want. Yes, that's a lot of machinery, equivalent to what the Java compiler has; go read your latest Java reference manual. You can build whatever custom tool you need on top of that foundation.
What you won't be able to do as a practical issue is full, accurate name and type resolution without the rest of the Java source files (or corresponding class files), no matter how you tackle it. You might be able to produce some heuristic guess, but that's all it would be.
I'm looking for a way to automatically generate source code for new methods within an existing Java source code file, based on the fields defined within the class.
In essence, I'm looking to execute the following steps:
Read and parse SomeClass.java
Iterate through all fields defined in the source code
Add source code method someMethod()
Save SomeClass.java (Ideally, preserving the formatting of the existing code)
What tools and techniques are best suited to accomplish this?
EDIT
I don't want to generate code at runtime; I want to augment existing Java source code
What you want is a Program Transformation system.
Good ones have parsers for the language you care about, build ASTs representing the parsed program, provide you with access to the AST for analysis and modification, and can regenerate source text from the AST. Your remark about "scanning the fields" is just a kind of traversal of the AST representing the program. For each interesting analysis result you produce, you want to make a change to the AST, perhaps somewhere else, but nonetheless in the AST.
And after all the changes are made, you want to regenerate text with comments (as originally entered, or as you have constructed in your new code).
There are several tools that do this specifically for Java.
Jackpot provides a parser, builds ASTs, and lets you code Java procedures to do what you want with the trees. Upside: easy conceptually. Downside: you write a lot more Java code to climb around/hack at trees than you'd expect. Jackpot only works with Java.
Stratego and TXL parse your code, build ASTs, and let you write "source-to-source" transformations (using the syntax of the target language, e.g., Java in this case) to express patterns and fixes. Additional good news: you can define any programming language you like as the target language to be processed, and both of these have Java definitions.
But they are weak on analysis: often you need symbol tables and data flow analysis to really make the analyses and changes you need. And they insist that everything is a rewrite rule, whether that helps you or not; this is a little like insisting you only need a hammer in your toolbox; after all, everything can be treated like a nail, right?
Our DMS Software Reengineering Toolkit allows the definition of an arbitrary target language (and has many predefined languages including Java), includes all the source-to-source transformation capabilities of Stratego and TXL and the procedural capability of Jackpot, and additionally provides symbol tables and control and data flow analysis information. The compiler guys taught us these things were necessary to build strong compilers (= "analysis + optimizations + refinement") and it is true of code generation systems too, for exactly the same reasons. Using this approach you can generate code and optimize it to the extent you have the knowledge to do so. One example, similar to your serialization ideas, is to generate fast XML readers and writers for specified XML DTDs; we've done that with DMS for Java and COBOL.
DMS has been used to read/modify/write many kinds of source files. A nice example that will make the ideas clear can be found in this technical paper, which shows how to modify code to insert instrumentation probes: Branch Coverage Made Easy.
A simpler but more complete example of defining an arbitrary language and transformations to apply to it can be found at How to transform Algebra using the same ideas.
Have a look at Java Emitter Templates. They allow you to create Java source files by using a markup language. It is similar to how you can use a scripting language to spit out HTML, except you spit out compilable source code. The syntax for JET is very similar to JSP and so isn't too tricky to pick up. However, this may be overkill for what you're trying to accomplish. Here are some resources if you decide to go down that path:
http://www.eclipse.org/articles/Article-JET/jet_tutorial1.html
http://www.ibm.com/developerworks/library/os-ecemf2
http://www.vogella.de/articles/EclipseJET/article.html
Modifying the same Java source file with auto-generated code is a maintenance nightmare. Consider generating a new class that extends your current class and adds the desired method. Use reflection to read from the user-defined class and create Velocity templates for the auto-generated classes. Then, for each user-defined class, generate its extending class. Integrate the code generation phase into your build lifecycle.
Or you may use 'bytecode enhancement' techniques to enhance the classes without having to modify the source code.
Updates:
Mixing auto-generated code with hand-written code always poses a risk that someone will modify it in the future just to tweak a small behavior. On the next build, those changes will be lost.
You will have to rely solely on the comments at the top of the auto-generated source to prevent developers from doing so.
Version control: let's say you update the template of someMethod(). Now every source file gets a new version, even though the changes are auto-generated, and you will see redundant history.
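The generate-a-subclass idea can be sketched roughly as follows. This is only an illustration under several assumptions: plain string concatenation stands in for the Velocity templates mentioned above, Person is a hypothetical hand-written class, the generated class is assumed to live in the same package as its base class, and someMethod() is given an arbitrary body that lists the fields.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

class SubclassGenerator {
    /** Hypothetical hand-written class; it is read via reflection and never modified. */
    static class Person {
        protected String name;
        protected int age;
    }

    /** Emits source for a subclass of base whose someMethod() lists each inherited field. */
    static String generate(Class<?> base, String generatedName) {
        StringBuilder expr = new StringBuilder("\"" + base.getSimpleName() + "{\"");
        boolean first = true;
        for (Field field : base.getDeclaredFields()) {
            int mods = field.getModifiers();
            // A subclass cannot read private fields; static and synthetic ones are skipped too.
            if (Modifier.isPrivate(mods) || Modifier.isStatic(mods) || field.isSynthetic()) {
                continue;
            }
            if (!first) {
                expr.append(" + \", \"");
            }
            expr.append(" + \"").append(field.getName()).append("=\" + ").append(field.getName());
            first = false;
        }
        expr.append(" + \"}\"");
        return "// GENERATED FILE: do not edit by hand, regenerate it in the build.\n"
                + "public class " + generatedName + " extends " + base.getCanonicalName() + " {\n"
                + "    public String someMethod() {\n"
                + "        return " + expr + ";\n"
                + "    }\n"
                + "}\n";
    }

    public static void main(String[] args) {
        // A build step would write this text to a generated-sources directory instead.
        System.out.print(generate(Person.class, "PersonGenerated"));
    }
}
```

Because the generated text goes into its own file, the hand-written class keeps a clean history and the generated one can be left out of version control.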
You can use cglib to generate code at runtime.
Iterating through the fields and defining someMethod is a pretty vague problem statement, so it's hard to give you a very useful answer, but Eclipse's refactoring support provides some excellent tools. It'll give you constructors which initialize a selected set of the defined members, and it'll also define a toString method for you.
I don't know what other someMethod()'s you'd want to consider, but there's a start for you.
I'd be very wary of injecting generated code into files containing hand-written code. Hand-written code should be checked into revision control, but generated code should not be; the code generation should be done as part of the build process. You'd have to structure your build process so that for each file you make a temporary copy, inject the generated source code into it, and compile the result, without touching the original source file that the developers work on.
Antlr is really a great tool that can be used very easily for transforming Java source code to Java source code.
What I'd like to do is scan a set of Java classes, and trace all method calls from a specific method of an Abstract Class, and within that context, build a list of all code which performs some operation (in this case, instantiates an instance of a certain class). I want to know, the line number, and the arguments supplied.
I've begun looking at BCEL, but it doesn't seem to have call graph tracing built in? I'm hesitant to write my own because getting the overloading, type signatures and polymorphic dispatch right might be tricky.
I half expected a tool or example code to exist, but I haven't found anything yet. It really feels like I'm about to reinvent a wheel. But if I do it will be an open source wheel and available on GitHub ;-)
PS: You will find the existing question "How to Generate a Java Call Graph", because it sounds identical, but it's not at all what I need.
You can use the java-callgraph tool suite to create accurate enough static and dynamic callgraphs for Java.
You can use Doxygen with Graphviz. It is easy to install and use.
You can try JavaDepend. It offers many features for analyzing dependencies and metrics, and it also provides CQL, a SQL-like language for querying your code base.
Disclosure: it's commercial software.
Soot should allow you to easily achieve what you are looking for:
http://www.sable.mcgill.ca/soot/
It can construct precise call graphs fully automatically.
You can find all necessary documentation here:
http://www.sable.mcgill.ca/soot/tutorial/index.html
Also, there's an active mailing list for Soot.
It sounds like you want something that provides access to the abstract syntax and a complete symbol table. Then a custom scan of the ASTs of the functions in the call graph rooted in each implementing method (as indicated by the symbol tables) of an abstract method gives you a chance to locate a new operation whose type is the specific class of interest.
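The scan described in the paragraph above might look something like this once a call graph and the per-method AST results are available. The sketch assumes some front end has already produced them: CallGraphScan, its addCall/addNew methods, and the method and type names in main are all hypothetical placeholders, and resolving overloads and polymorphic dispatch is assumed to have happened when the edges were built.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CallGraphScan {
    /** One `new` expression found in a method body, as an AST scan would report it. */
    static final class Instantiation {
        final String method;
        final String type;
        final int line;
        final List<String> args;
        Instantiation(String method, String type, int line, List<String> args) {
            this.method = method;
            this.type = type;
            this.line = line;
            this.args = args;
        }
        @Override public String toString() {
            return method + ":" + line + " new " + type + args;
        }
    }

    private final Map<String, List<String>> callees = new HashMap<>();
    private final Map<String, List<Instantiation>> instantiations = new HashMap<>();

    void addCall(String caller, String callee) {
        callees.computeIfAbsent(caller, k -> new ArrayList<>()).add(callee);
    }

    void addNew(String method, String type, int line, String... args) {
        instantiations.computeIfAbsent(method, k -> new ArrayList<>())
                .add(new Instantiation(method, type, line, Arrays.asList(args)));
    }

    /** Walks everything reachable from the roots and collects `new targetType(...)` sites. */
    List<Instantiation> find(Collection<String> roots, String targetType) {
        List<Instantiation> hits = new ArrayList<>();
        Set<String> visited = new HashSet<>(); // guards against recursion cycles
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String method = work.pop();
            if (!visited.add(method)) {
                continue;
            }
            for (Instantiation i : instantiations.getOrDefault(method, Collections.emptyList())) {
                if (i.type.equals(targetType)) {
                    hits.add(i);
                }
            }
            for (String callee : callees.getOrDefault(method, Collections.emptyList())) {
                work.push(callee);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        CallGraphScan graph = new CallGraphScan();
        graph.addCall("Impl.run", "Helper.build");
        graph.addNew("Helper.build", "Widget", 42, "name", "1");
        // The roots would be every implementation of the abstract method.
        System.out.println(graph.find(Arrays.asList("Impl.run"), "Widget"));
    }
}
```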
The DMS Software Reengineering Toolkit is generalized compiler technology providing basic services of parsing, AST building/navigation, symbol table building/navigation, control flow, data flow and call graph construction. DMS has an optional Java Front End that provides a full Java parser, builds Java ASTs and symbol tables, and can construct a call graph. The Java Front End can also read .class files; you weren't clear as to whether you wanted to climb into class files, too, hunting for information.
The answer you want isn't off the shelf. You need to build some custom code to implement the ideas in the first paragraph, but DMS can provide most of the raw material. It doesn't provide much detail from the .class files (these are used mostly to resolve types in source code).
For a 'recent' Eclipse install (relative to the question), see Certiv CallGraph.
CallGraph enables graphical analysis of program call relations and flow sequencing. Also enables exploration of extended class inheritance hierarchies.
Call-path analysis and class hierarchy resolution are performed using the JDT platform Search and Call Hierarchy mechanisms.
Sequence diagrams are generated from a static analysis of the JDT platform AST for any selected class or method.
Uses Zest as the graphics visualization engine.
You can install it via the Eclipse marketplace. I am not involved in making this.
You cannot zoom out, which is not very practical, but it supports sequence diagrams, which is nice, and lets you open and close nodes on demand to dig further.
Requirements:
Eclipse 4.6 (Neon) on Java 8 VM
Eclipse Zest Visualization Toolkit 1.7
Eclipse Public License v1.0
You can see:
https://github.com/Adrninistrator/java-all-call-graph/blob/main/README-en.md
The output example:
upward
org.mybatis.spring.SqlSessionUtils:lambda$closeSqlSession$6(org.apache.ibatis.session.SqlSession)
[0]#org.mybatis.spring.SqlSessionUtils:lambda$closeSqlSession$6(org.apache.ibatis.session.SqlSession)
[1]# org.mybatis.spring.SqlSessionUtils:closeSqlSession(org.apache.ibatis.session.SqlSession,org.apache.ibatis.session.SqlSessionFactory)
[2]# org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor:invoke(java.lang.Object,java.lang.reflect.Method,java.lang.Object[]) !entry!
org.mybatis.spring.SqlSessionUtils:lambda$getSqlSession$0()
[0]#org.mybatis.spring.SqlSessionUtils:lambda$getSqlSession$0()
[1]# org.mybatis.spring.SqlSessionUtils:getSqlSession(org.apache.ibatis.session.SqlSessionFactory) !entry!
[1]# org.mybatis.spring.SqlSessionUtils:getSqlSession(org.apache.ibatis.session.SqlSessionFactory,org.apache.ibatis.session.ExecutorType,org.springframework.dao.support.PersistenceExceptionTranslator)
[2]# org.mybatis.spring.SqlSessionUtils:getSqlSession(org.apache.ibatis.session.SqlSessionFactory) !entry!
[2]# org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor:invoke(java.lang.Object,java.lang.reflect.Method,java.lang.Object[]) !entry!
downward
org.mybatis.spring.SqlSessionFactoryBean:scanClasses(java.lang.String,java.lang.Class)
[0]#org.mybatis.spring.SqlSessionFactoryBean:scanClasses(java.lang.String,java.lang.Class)
[1]# org.springframework.util.StringUtils:tokenizeToStringArray(java.lang.String,java.lang.String)
[1]# org.springframework.util.ClassUtils:convertClassNameToResourcePath(java.lang.String)
[1]# org.springframework.core.io.support.ResourcePatternResolver:getResources(java.lang.String)
[1]# org.springframework.core.type.classreading.MetadataReaderFactory:getMetadataReader(org.springframework.core.io.Resource)
[1]# org.springframework.core.type.classreading.MetadataReader:getClassMetadata()
[1]# org.springframework.core.type.ClassMetadata:getClassName()
[1]# org.apache.ibatis.io.Resources:classForName(java.lang.String)
[2]# org.apache.ibatis.io.ClassLoaderWrapper:classForName(java.lang.String)
[3]# org.apache.ibatis.io.ClassLoaderWrapper:getClassLoaders(java.lang.ClassLoader)
[3]# org.apache.ibatis.io.ClassLoaderWrapper:classForName(java.lang.String,java.lang.ClassLoader[])
[1]# org.mybatis.spring.SqlSessionFactoryBean:lambda$scanClasses$19(org.springframework.core.io.Resource,java.lang.Throwable)
A wonderful Git repo for this is here:
https://github.com/gajanandjha/Java-Call-Tree-Generator
It generates a call tree of all threads in a Java process. The repo README gives some Unix commands for capturing the thread traces you need, and the tool then generates a simple web page with a tree view of all the methods each thread has visited.