I have a post-compilation step that manipulates the Java bytecode of generated classes. I'd like to make life as painless as possible for library consumers, so I'm looking at ways I can make this process automatic and (if possible) compiler agnostic.
The Annotation Processing API provides many of the desired features (automatic service discovery; supported by Eclipse). Unfortunately, this is aimed at code generators and doesn't support manipulation of existing artefacts:
"The initial inputs to the tool are considered to be created by the zeroth round; therefore, attempting to create a source or class file corresponding to one of those inputs will result in a FilerException."
The Decorator pattern recommended by the API is not an option.
I can see how to perform the step with a runtime agent/instrumentation, but this is a worse option than a manual build step as it would require anyone even peripherally touched by the API to configure their JVMs in a non-obvious manner.
Is there a way to plug into or wrap the compiler tool as invoked by javac? Has anyone successfully subverted the annotation processors to manipulate bytecode, no matter what the doc says?
The Groovy compiler is the only bytecode compiler that lets you hook into the compilation process (example: generating bytecode to support the Singleton pattern).
The Annotation Processing API is not meant to change the code. As you have already found out, all you can do is install a classloader, examine the bytecode at runtime and manipulate it. It's braindead but it works. This follows the general "we're afraid that a developer could try something stupid" theme which you will find throughout Java. There is no way to extend javac. The relevant classes are either private, final or will change with the next version of Java.
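For reference, the runtime route mentioned above could look roughly like the following sketch, which uses the standard java.lang.instrument API rather than a custom classloader. The class names and the com/example/generated/ prefix are made up for illustration, and no actual bytecode rewriting is done here; a real agent would plug a library such as ASM into the marked spot.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent entry point. In a real agent this would be a public
// class in its own file, named by Premain-Class in the agent jar manifest.
class DemoAgent {
    // Called by the JVM before main() when started with -javaagent:<agent jar>.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new PrefixTransformer("com/example/generated/"));
    }
}

class PrefixTransformer implements ClassFileTransformer {
    private final String prefix; // internal class name prefix, '/'-separated

    PrefixTransformer(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className == null || !className.startsWith(prefix)) {
            return null; // null tells the JVM to keep the class unchanged
        }
        // A real agent would rewrite the bytes here. This sketch only
        // returns an identical copy.
        return classfileBuffer.clone();
    }
}
```

This also shows why the approach is awkward for library consumers: nothing happens unless every JVM is launched with the -javaagent option.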
Another option is to write annotated Java, for example you write a class "ExampleTpl.java". Then, you use a precompiler which expands the annotations in that file to get "Example.java". In the rest of the code, you use Example and ignore ExampleTpl.
For Eclipse, there is a bug report to automate this step. I'm not aware of any other work in this area.
It can be done.
Take a look at my blog post "Roman Numerals, in our Java", where an annotation processor is used to rewrite code. The limitation is that it works with Sun's javac only.
I want to modify the rule so that it targets only public interfaces (not public classes etc.). Is this possible? I'm using this rule in Java code, but it's too strict for my project and I would love to know if there is a way to change it a little bit.
Link for rule: https://rules.sonarsource.com/java/RSPEC-1213
For an existing ruleset on SonarQube, talk to your sonar administrator to change the rules that are enforced on the code and remove that particular one from global enforcement.
There have been a few times I've gone to the admins of the tool for the install that I use and said "this rule isn't one that I care about or will enforce and only makes it confusing" and had them remove that rule from the globally run ruleset.
Is it possible to write your own rule?
Yes, it is possible. According to SonarQube's docs on adding coding rules, you have some options. Either you can write a plugin for SonarQube and add that to your instance (docs), or you can write an external application that analyzes the code and whose results SonarQube consumes.
If you don't have your own instance of SonarQube or aren't up to writing the associated plugin or external tooling... you might want to look at PMD (site) instead.
For PMD, writing a custom rule can be much simpler (docs). One of the ways that PMD works is by 'compiling' the Java code into an XML representation of the abstract syntax tree for Java and then running xpath queries against that XML (tutorial).
The xpath rule can then be included in a project's configuration.
What about turning it off for the code that I'm working on?
If a specific rule is one that you don't want to invoke, you could suppress it with @SuppressWarnings("java:S106") (that particular suppression is for System.out.println use, but the same structure can be used for other rules) or by adding // NOSONAR too strict on the line. There are spots where I have such comments because following the rule for a particular piece of code is problematic, so I suppress it for that line, method, or class, with a comment about why that is done.
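A small sketch of both suppression styles, in case it helps. The class and method names are made up; java:S106 is the rule key mentioned above, and the NOSONAR comment hides every Sonar issue reported on that line, not just one rule.

```java
class SuppressionDemo {

    // Suppresses only rule java:S106 (use of System.out) for this method.
    @SuppressWarnings("java:S106")
    static void printBanner(String text) {
        System.out.println(text);
    }

    // Returns the parsed number, or 0 when the text is not a valid integer.
    static int parseOrZero(String s) {
        try {
            return Integer.parseInt(s);
        } catch (NumberFormatException e) { // NOSONAR falling back to 0 is intentional here
            return 0;
        }
    }

    public static void main(String[] args) {
        printBanner("parsed: " + parseOrZero("42"));
    }
}
```

Putting the reason right after NOSONAR keeps the justification next to the suppression, which is what I do in my own code.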
That particular rule... I'm gonna agree with the Java (and now Oracle) guidelines and follow it. The reason is that if anyone else works on the code, they'll expect it to follow that convention. Having a consistent understanding of what things should be where in code, so that another developer doesn't need to dig through an entire file to find the constructor when it is expected to be at the top (under the field definitions), is a good thing. What's more, it limits the future cases where a developer goes through to make things consistent with conventions, which results in a lot of "style: updating code to follow style guide" commits later.
We are migrating a system written in C to Java and must retain existing processes (no debate). We currently "embed" compile-time information into the C application using the C preprocessor, for example:
cc -o xxx.o -DCOMP_ARG='"compile time arg"' xxx.c
The xxx.c file can then use "COMP_ARG" and its value will be embedded in the code and we have little worry about it being changed inadvertently.
We realize Java likes to use properties files; however, our requirements are such that some information must be embedded in the code, so properties files are not an option - these values cannot be specified at runtime. To illustrate the point, such data could be a date-stamp of when the file was compiled, but the exact data is irrelevant to the question.
We are looking for a way to specify at compile time various values that are available to the Java code. We are quite aware that Java does not have a pre-processor as C does, so the mechanism would be different.
Our current solution uses a code generation step (Maven), which does work; however, Eclipse wreaks havoc when dealing with the generated source files, so we had to turn off "Build Automatically". We really want to find a more robust solution.
We appreciate any help, thanks.
"The xxx.c file can then use "COMP_ARG" and its value will be embedded in the code and we have little worry about it being changed inadvertently."
"...our requirements are such that some information be embedded in the code...."
"We are looking for a way to specify at compile time various values that are available to the Java code. We are quite aware that Java does not have a pre-processor as does C, so the mechanism would be different."
It seems that the best way to solve this problem would be to make use of annotations in your code.
In Java, annotations are a kind of interface declaration, but they do not enforce a behavioral contract with an implementing class. Rather, they are meant to define a contract with some external framework, preprocessor, or with the compiler itself. Annotations are used extensively in Java EE 5.0 (and later) to specify configuration and behavior to the framework within which the developer's code runs. A similar @-prefixed notation is used by the Javadoc documentation processor, although Javadoc tags such as @param live in doc comments and are not Java annotations. There, the tags allow you to specify and format the information which you intend to appear in the documentation when Javadoc runs.
Annotations can be defined to be accessible at runtime. In such a case, the primary mechanism for accessing annotations is the Java Reflection facility. For example, annotations with a retention policy of RUNTIME and defined on a class, can be accessed through that class's corresponding Class object:
import java.lang.annotation.Annotation;
Class<?> myCls = MyClass.class; // the "class literal" for MyClass
Annotation[] annotations = myCls.getDeclaredAnnotations(); // only annotations declared directly on MyClass
Annotations can include arguments for parameters to allow for more flexibility in configuration. The use of annotations is most convenient when the code itself can be so annotated.
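As a rough illustration of the points above, the sketch below defines an annotation with one parameter and RUNTIME retention, applies it to a class, and reads the value back through reflection. The names CompiledWith, AnnotatedService and AnnotationLookup are invented for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical annotation carrying a single string parameter.
@Retention(RetentionPolicy.RUNTIME) // keep it visible to reflection
@Target(ElementType.TYPE)           // allowed on classes and interfaces
@interface CompiledWith {
    String value();
}

@CompiledWith("compile time arg")
class AnnotatedService {
}

class AnnotationLookup {
    // Returns the annotation's value, or null if the class is not annotated.
    static String compiledWith(Class<?> cls) {
        CompiledWith a = cls.getAnnotation(CompiledWith.class);
        return a == null ? null : a.value();
    }

    public static void main(String[] args) {
        System.out.println(compiledWith(AnnotatedService.class));
    }
}
```

Note that the annotation value still has to be a constant written in the source, so on its own this does not inject anything from the compiler command line.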
A quick tutorial on how annotations are defined and used in Java is available here: https://docs.oracle.com/javase/tutorial/java/annotations/
I'm going to post my own answer which seems to be "Can't be done" - what can't be done, apparently, is provide at compile time to Java, a set of parameters that gets passed to the program at execution time. The solution appears to be to continue with what I am doing which is to update a Java source file with the compile-time data and figure out how to coax Eclipse to stop over-writing the files.
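For anyone landing here later, the "update a Java source file" approach could look something like the sketch below. BuildInfo, its field names and the timestamp are placeholders I made up for illustration; the idea is that the build rewrites the literals before javac runs.

```java
// Hypothetical generated file: the build step would rewrite the literals
// below (for example from a template) and then compile it normally.
final class BuildInfo {
    private BuildInfo() {
    }

    // Compile-time constants: javac copies these values into the class
    // files of every class that reads them.
    static final String COMP_ARG = "compile time arg";
    static final String BUILD_STAMP = "1970-01-01T00:00:00Z"; // placeholder value
}

class BuildInfoDemo {
    static String describe() {
        return "arg=" + BuildInfo.COMP_ARG + ", built=" + BuildInfo.BUILD_STAMP;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Because the constants are inlined into their users, the values cannot be changed at runtime, but every class that reads them has to be recompiled whenever the generated file changes.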
Thanks to everyone who commented.
Is it possible to use a bytecode manipulation library like ASM at compile time?
Specifically, I'd like to use Java's annotation processing API to implement boilerplate-heavy methods on annotated classes. Implementing an annotation processor is straightforward enough, but it seems like the .class files don't yet exist when the Processor is run. Is there another way?
You might be interested in Javassist ( http://www.jboss.org/javassist ) which can enhance and save classes as a post-compilation step.
This article describes how to save enhanced classes: https://dzone.com/articles/implementing-build-time
In particular, once you have altered a class, you can do something like this:
compiledClass.writeFile("/tmp/modifiedClassesFolder");
It should be possible since the following project is doing it: Project Lombok
Also:
Java 8 will bring a new mechanism that allows you to write plug-ins for the Java compiler (javac). A compiler plug-in lets you add new phases to javac without making changes to its code base. New behavior can be encapsulated in a plug-in and distributed for other people to use. For example, javac plug-ins could be used to do the following:
• Add extra compile-time checks
• Add code transformations
• Perform customized analysis of source code
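A minimal sketch of such a plug-in, using the com.sun.source.util.Plugin interface that shipped with Java 8. The plug-in name and what it prints are invented for this example, and it only observes compilation phases; actual code transformations would need to reach into javac's internal tree classes, which is not shown here.

```java
import com.sun.source.util.JavacTask;
import com.sun.source.util.Plugin;
import com.sun.source.util.TaskEvent;
import com.sun.source.util.TaskListener;

// Hypothetical plug-in. To be discovered by javac it would have to be a
// public class listed in META-INF/services/com.sun.source.util.Plugin.
class PhaseLoggingPlugin implements Plugin {

    @Override
    public String getName() {
        return "PhaseLogging"; // enabled with -Xplugin:PhaseLogging
    }

    @Override
    public void init(JavacTask task, String... args) {
        task.addTaskListener(new TaskListener() {
            @Override
            public void started(TaskEvent e) {
                // nothing to do when a phase starts
            }

            @Override
            public void finished(TaskEvent e) {
                // ANALYZE ends once attribution and flow analysis are done
                if (e.getKind() == TaskEvent.Kind.ANALYZE && e.getSourceFile() != null) {
                    System.out.println("analyzed " + e.getSourceFile().getName());
                }
            }
        });
    }
}
```

Like the annotation processor trick, this ties you to javac, so it does not help with Eclipse's compiler.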
You should use CGLib instead. With CGLib you can add proxies with method interceptors and have the interceptor implement your boilerplate code. Another option is to look at Javassist. With Javassist you literally create a new subclass using actual text (in strings) and have javassist compile it into byte-code.
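To give an idea of the interceptor style without pulling in a dependency, here is a sketch using the JDK's own java.lang.reflect.Proxy as a stand-in. This is not CGLib: the JDK proxy can only implement interfaces, whereas CGLib can also subclass concrete classes. Greeter and ProxyDemo are made-up names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

interface Greeter {
    String greet(String name);
}

class ProxyDemo {
    // Builds a Greeter whose method calls are all routed to one handler,
    // which is where the boilerplate would be implemented once.
    static Greeter newGreeter() {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("greet")) {
                return "Hello, " + args[0];
            }
            // toString/hashCode/equals also arrive here in this sketch
            throw new UnsupportedOperationException(method.getName());
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        System.out.println(newGreeter().greet("world"));
    }
}
```

The trade-off compared with compile-time generation is that the work happens on every call at runtime rather than once in the build.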
I have a scenario where I have code written against version 1 of a library but I want to ship version 2 of the library instead. The code has shipped and is therefore not changeable. I'm concerned that it might try to access classes or members of the library that existed in v1 but have been removed in v2.
I figured it would be possible to write a tool to do a simple check to see if the code will link against the newer version of the library. I appreciate that the code may still be very broken even if the code links. I am thinking about this from the other side - if the code won't link then I can be sure there is a problem.
As far as I can see, I need to run through the bytecode checking for references, method calls and field accesses to library classes then use reflection to check whether the class/member exists.
I have a three-fold question:
(1) Does such a tool exist already?
(2) I have a niggling feeling it is much more complicated than I imagine and that I have missed something major - is that the case?
(3) Do you know of a handy library that would allow me to inspect the bytecode such that I can find the method calls, references etc.?
Thanks!
I think that Clirr - a binary compatibility checker - can help here:
Clirr is a tool that checks Java libraries for binary and source compatibility with older releases. Basically you give it two sets of jar files and Clirr dumps out a list of changes in the public api. The Clirr Ant task can be configured to break the build if it detects incompatible api changes. In a continuous integration process Clirr can automatically prevent accidental introduction of binary or source compatibility problems.
Changing the library in your IDE will result in all possible compile-time errors.
You don't need anything else, unless your code uses another library, which in turn uses the updated library.
Be especially wary of Spring configuration files. Class names are configured as text and don't show up as missing until runtime.
If you have access to the source code, you could just compile the source against the new library. If it doesn't compile, you definitely have a problem. If it compiles, you may still have a problem if the program uses reflection, some kind of IoC stuff like Spring, etc.
If you have unit tests, then you may have a better chance of catching any linking errors.
If you only have the .class files of the program, then I don't know of any tools that would help besides decompiling the class files to source and compiling the source again against the new library, but that doesn't sound too healthy.
The checks you mentioned are done by the JVM/Java class loader, see e.g. Linking of Classes and Interfaces.
So "attempting to link" can be simply achieved by trying to run the application. Of course you could hoist the checks to run them yourself on your collection of .class/.jar files. I guess a bunch of 3rd party byte code manipulators like BCEL will also do similar checks for you.
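If you do hoist the checks yourself, the reflection half might look like this rough sketch. LinkCheck is a made-up helper; extracting the referenced class and member names from the bytecode (with BCEL or similar) is not shown, and getMethod only sees public methods and ignores the return type, so the real JVM resolution rules are stricter than this.

```java
class LinkCheck {
    // True if the loader can find the class. Passing 'false' asks the JVM
    // not to run the class's static initializers.
    static boolean classExists(String name, ClassLoader loader) {
        try {
            Class.forName(name, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // True if the class has a public method with this name and parameters.
    static boolean methodExists(String className, String method,
                                ClassLoader loader, Class<?>... params) {
        try {
            Class<?> c = Class.forName(className, false, loader);
            c.getMethod(method, params);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClassLoader cl = ClassLoader.getSystemClassLoader();
        System.out.println(classExists("java.lang.String", cl));
        System.out.println(methodExists("java.lang.String", "charAt", cl, int.class));
    }
}
```

To test against v2 of the library, you would pass a classloader (for example a URLClassLoader) that sees only the new jar.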
I notice that you mention reflection in the tags. If you load classes/invoke methods through reflection, there's no way to analyse this in general.
Good luck!
I'm developing a system that allows developers to upload custom groovy scripts and freemarker templates.
I can provide a certain level of security at a very high level with the default Java security infrastructure - i.e. prevent code from accessing the filesystem or network, however I have a need to restrict access to specific methods.
My plan was to modify the Groovy and Freemarker runtimes to read Annotations that would either whitelist or blacklist certain methods, however this would force me to maintain a forked version of their code, which is not desirable.
All I essentially need to be able to do is prevent the execution of specific methods when called from Groovy or Freemarker. I've considered a hack that would look at the call stack, but this would be a massive speed hit (and is quite messy).
Does anyone have any other ideas for implementing this?
You can do it by subclassing the GroovyClassLoader and enforcing your constraints within an AST Visitor. This post explains how to do it: http://hamletdarcy.blogspot.com/2009/01/groovy-compile-time-meta-magic.html
Also, the code referenced there is in the samples folder of Groovy 1.6 installer.
You should have a look at the groovy-sandbox project from kohsuke. Also have a look at his blog post on this topic, which explains what his solution addresses: it provides sandboxing, but with a performance drawback.
OSGi is great for this. You can partition your code into bundles and set exactly what each bundle exposes, and to what other bundles. Would that work for you?
You might also consider the java-sandbox (http://blog.datenwerke.net/p/the-java-sandbox.html), a recently developed library that allows you to securely execute untrusted code from within Java.
Also see: http://blog.datenwerke.net/2013/06/sandboxing-groovy-with-java-sandbox.html