Removal of a block of code during runtime - java

I need to remove a particular function or class from my Java code when it is being packaged into a .jar file using Maven. The caveat is that the function or class should stay in the source code.
Is there any way to achieve this using Maven and/or any Java utilities?
(There are a lot of functions, about 400, and their implementations are very large as well, so commenting out the code is not an option.)

Okay, so the real problem is this:
We have a code base which includes certain parts that are not currently being used, but they may be used in the future, so we want to keep them in the code base, but we do not want them to be shipped to customers. (Due to, uhm, reasons.) What are the best practices for achieving this? Note that commenting them out would be impractical.
The proper way to achieve this is to extract all those parts into a separate module, and refrain from shipping that module.
The hacky way is to use a hard-coded feature flag.
A normal (non-hard-coded) feature flag is a boolean which controls a certain aspect of the behavior of our software. For example, if you build an mp3 player application, and you want to add support for the aac file format, but you do not want to ship support for it yet, then you might want to create a boolean supportAacFeatureFlag() method, and have all code that pertains to the aac file format invoke that method and check the return value before doing anything. It is important to note that this must be a method, not a constant, so that its value is not known at compilation time, because every single if statement that checks the value of a constant is bound to yield a "condition is always true" or "condition is always false" warning. The great benefit of feature flags over commenting-out code is that the code controlled by a feature flag must still compile, so it must be semantically correct. The problem with feature flags is that they do not eliminate the code; the code still gets shipped, it just does not get executed.
A hard-coded feature flag is a feature flag which is implemented using a constant. The constant condition warning will be issued in every single use of that flag, so it will have to be globally disabled. (That's why this approach is hacky: we normally want all warnings enabled.) The benefit of using a constant is that its value is known at compilation time, so even though the compiler will still compile the controlled code, it will refrain from emitting any bytecode for it, so the code essentially does not get shipped to customers. Only empty functions get shipped.
Note that this is documented behavior of the Java compiler. In other languages like C++ and C# the compiler always emits all code, and you have to use other means of controlling code generation, like #defined symbols, which, in my opinion, are also very hacky.
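To make the difference between the two kinds of flag concrete, here is a minimal sketch that reuses the mp3/aac example from above. The class names, the system property name, and the return strings are all made up for illustration; only the contrast between a compile-time constant and a method is the point.

```java
// Hypothetical names throughout; this only illustrates the two flag styles.
final class FeatureFlags {
    // Hard-coded flag: a compile-time constant, so javac can drop the guarded code.
    static final boolean SUPPORT_AAC = false;

    // Normal flag: a method, so its value is not known at compilation time.
    // The property name "player.supportAac" is an assumption for this sketch.
    static boolean supportAacFeatureFlag() {
        return Boolean.getBoolean("player.supportAac");
    }

    private FeatureFlags() {}
}

final class Player {
    // With SUPPORT_AAC == false, javac emits no bytecode for the body of the if.
    static String decodeHardCoded(String file) {
        if (FeatureFlags.SUPPORT_AAC) {
            return "aac:" + file;
        }
        return "unsupported";
    }

    // The guarded code is always shipped; it is merely skipped at run time.
    static String decodeRuntime(String file) {
        if (FeatureFlags.supportAacFeatureFlag()) {
            return "aac:" + file;
        }
        return "unsupported";
    }
}
```

Running javap -c on the compiled Player class should show the difference: decodeHardCoded contains no trace of the aac branch, while decodeRuntime still contains it.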
An alternative that I am sure some people will opt for is to keep the unused code in a separate feature branch and remove it from the master branch. I would strongly advise against this, because any refactorings applied to the master branch will not affect the feature branch, so the code will diverge, and it will be a nightmare to integrate it in the future.

Related

apply CheckReturnValue to entire project

I work on a large legacy Java 8 (Android) application. We recently found a bug that was caused by an ignored method result. Specifically, a caller of a send() method didn't take the right actions when the sending failed. It's been fixed, but now I want to add some static analysis to help find other existing bugs of the same nature in our code and, additionally, to prevent new bugs of the same nature from being added in the future.
We already use FindBugs, PMD, Checkstyle, Lint, and SonarQube. So I figured that one of these probably already has the check I'm looking for, and it just needs to be enabled. But after a few hours of searching and testing, I don't think that's the case.
For reference, this is the code I was testing with:
public class Application {
    public static void main(String[] args) {
        foo(); // I want this to be caught
        Bar aBar = new Bar();
        aBar.baz(); // I want this to be caught
    }
    static boolean foo() {
        return System.currentTimeMillis() % 2 == 0;
    }
}
public class Bar {
    boolean baz() {
        return System.currentTimeMillis() % 2 == 0;
    }
}
I want to catch this on the caller side since some callers may use the value while others do not. (The send() method described above was this case)
I found the following existing static analysis rules but they only seem to apply to very specific circumstances to avoid false positives and not work on my example:
1. Return values from functions without side effects should not be ignored (only for immutable classes in the Java API)
2. Method ignores exceptional return value (only for known methods like File.delete())
3. Method ignores return value (only for methods annotated with javax.annotation.CheckReturnValue I think...)
4. Method ignores return value, is this OK? (only when the return value is the same type as the type the method is invoked on)
5. Return value of method without side effect is ignored (only when the method does not produce any effect other than return value)
So far the best option seems to be #3, but it requires me to annotate EVERY method or class in my HUGE project. Java 9+ seems to allow annotating at the package level, but that's not an option for me. Even if it were, the project has A LOT of packages. I would really like a way to configure this to be applied to my whole project from one or a few locations instead of needing to modify every file.
Lastly, I came across this Stack Overflow answer that showed me that IntelliJ has this check with a "Report all ignored non-library calls" option. This seems to work as far as highlighting in the IDE goes. But I want this to cause CI to fail. I found there's a way to trigger this from the command line using IntelliJ tools, but this still outputs an XML/JSON file and I'll need to write custom code to parse that output. I'd also need to install IDE tools onto the CI machine, which seems like overkill.
Does anyone know of a better way to achieve what I want? I can't be the first person to only care about false negatives and not care about false positives. I feel like it should be manageable to have any return value that is currently unused either be logged or be explicitly marked as intentionally ignored, via an annotation or a convention of assigning it to a variable, like they do in Error Prone.
Scenarios like the one you describe invariably give rise to a substantial software defect (a true bug in every respect), made more frustrating and knotty because the code fails silently, which allows the problem to remain hidden. Your desire to identify any similar hidden defects (and correct them) is easy to understand; however, (I humbly suggest) static code analysis may not be the best strategy:
Working from the concerns you express in your question: a CheckReturnValue rule runs a high risk of producing a cascade of //Ignore code comments, rule violationSuppress clauses, and/or #suppressRule annotations that far outnumber the rule's positive defect detection count.
The Java programming language further increases the likelihood of a high rule suppression count once garbage collection, and how it affects software development, is taken into consideration. Java garbage collection is based on reachability: an instance becomes eligible for collection once no live reference can reach it. It therefore makes perfect sense for Java developers to avoid unnecessary references, and to naturally adopt the practice of ignoring unimportant method call return values. The ignored instances simply fall off the local call stack, and most of them become unreachable and eligible for garbage collection right away.
Shifting now from a negative perspective to positive, I offer alternatives, for your consideration, that (I believe) will improve your results, as well as your probability to reach a successful outcome.
Based on your description of the scenario and the resulting defect, it feels like the proximate root cause of the problem is a unit testing failure or an integration testing failure. Given a send operation that may (and almost certainly will at some point) fail, both unit testing and integration testing absolutely should have incorporated multiple possible failure scenarios and verified failure scenario handling. I obviously don't know, but I'm willing to bet that if you focus on creating and running unit tests and integration tests, the quality of the system will improve at every step, the improvements will be clearly evident, and you may very well uncover some or all of the hidden bugs that are the cause of your current concern.
Consider keeping the gist of your current static code analysis research alive, but shift your approach in a new direction. The first time I read your question, I was struck by the realization that the code checks you would like to perform exist in multiple unrelated locations across the code base and are quickly becoming overly complex, the specific details of the checks are different in many sections of code, and each of the special cases makes the overall effort unrealistic. Basically, what you would like to implement represents a cross-cutting goal that falls across a sizable section of the code base, and the implementation details have made what is a fairly simple good idea ridiculously complex. Your question is almost a textbook example of a problem that is best implemented taking a cross-cutting aspect-oriented approach.
If you have the time and interest, please take a look at the AspectJ framework, maybe code a few exploratory aspects, and let me know what you think. I'd like to hear your thoughts, if you feel like having a geeky dev conversation at some point. I really hope this is helpful-
You may use IntelliJ IDEA's inspection: Java | Probable bugs | Result of method call ignored with the "Report all ignored non-library calls" option enabled. It catches both cases provided in your code sample.
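Whichever tool ends up reporting the ignored results, each reported call site has to be resolved in one of two ways: handle the value, or state that ignoring it is intentional. The sketch below shows both, under the assumption that a variable named unused is accepted as the "intentionally ignored" marker, as in the Error Prone convention mentioned in the question. FailingSender, Caller, and the retry queue are hypothetical names, not part of any of the tools discussed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sender used only for illustration; it always reports failure.
final class FailingSender {
    boolean send(String message) {
        return false;
    }
}

final class Caller {
    final List<String> retryQueue = new ArrayList<String>();

    // The result is checked, so a failed send is queued instead of silently lost.
    void sendOrQueue(FailingSender sender, String message) {
        boolean sent = sender.send(message);
        if (!sent) {
            retryQueue.add(message);
        }
    }

    // The result is deliberately dropped; the variable name documents the intent.
    void fireAndForget(FailingSender sender, String message) {
        boolean unused = sender.send(message);
    }
}
```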

Make my logger very effective to my Java-application

I am struggling with the following problem and ask for help.
My application has a logger module. This takes the trace level and the message (as string).
Messages often have to be constructed from different sources and/or in different ways (e.g. sometimes using String.format prior to logging, other times using the toString methods of different objects, etc.). Therefore, the construction of the messages cannot be generalized.
What I want is to make my logger module efficient. That means the trace messages should only be constructed if the current trace level lets the message through, and this without copy-pasted code all over my application.
With C/C++, this was very easy to achieve using macros:
#define LOG_IT(level, message) if(level>=App.actLevel_) LOG_MSG(message);
The LOG_MSG call and the string construction were performed only if the trace level enabled that message.
With Java, I can't find any similar possibility. What I want: the logging should be one line (no if-else copy-pastes everywhere), and the string construction (an expensive operation) should only be done if necessary.
The only solution I know is to surround every logger call with an if statement. But this is exactly what I avoided previously in the C++ app, and what I want to avoid in my current Java implementation.
My problem is that only Java 1.6 is available on the target system. Therefore Supplier is not an option.
What can I do in Java? How can this C/C++ method easily be done?
Firstly, I would encourage you to read this if you're thinking about implementing your own logger.
Then, I'd encourage you to look at a well-established logging API such as SLF4J. Whilst it is possible to create your own, using a pre-existing API will save you time and effort, and above all else provide you with more features and flexibility out of the box (e.g. file-based configuration and customisability; look at Mapped Diagnostic Context).
To your specific question, there isn't a simple way to do what you're trying to do. C/C++ are fundamentally different from Java in that the preprocessor allows for macros like the one you've created above. Java doesn't really have an easy-to-use equivalent, though there are examples of projects that make use of compile-time code generation, which is probably the closest equivalent (e.g. Project Lombok, MapStruct).
The simplest way I know of to avoid expensive string building operations whilst logging is to surround the building of the string with a simple conditional:
if ( logger.isTraceEnabled() )
{
// Really expensive operation here
}
Or, if you're using Java 8, the standard logging library (java.util.logging) takes a java.util.function.Supplier<String> argument which will only be executed if the current log level matches that of the logging method being called:
log.fine(()-> "Value is: " + getValue());
There is also currently a ticket open for SLF4j to implement this functionality here.
If you're really really set on implementing your own logger, the two above features are easy enough to implement yourself, but again I'd encourage you not to.
Edit: Aspectj compile time weaving can be used to achieve something similar to what you're trying to achieve. It would allow you to wrap all your logging statements with a conditional statement in order to remove the boilerplate checking.
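Since the question is limited to Java 1.6, here is a rough sketch of what "implementing it yourself" could look like without lambdas: a one-method interface plays the role of Supplier, and an anonymous inner class plays the role of the lambda. All names (Message, LazyLogger, LazyLoggerDemo) are hypothetical, integer levels are assumed as in the question's macro, and the StringBuilder merely stands in for a real log sink.

```java
// Minimal sketch of a Java 6 compatible lazy logger; all names are hypothetical.
interface Message {
    String build();
}

final class LazyLogger {
    private final int activeLevel;
    final StringBuilder out = new StringBuilder(); // stands in for the real sink

    LazyLogger(int activeLevel) {
        this.activeLevel = activeLevel;
    }

    // build() is invoked only when the level passes the check.
    void log(int level, Message message) {
        if (level >= activeLevel) {
            out.append(message.build()).append('\n');
        }
    }
}

final class LazyLoggerDemo {
    static int buildCount = 0; // counts how often the message was actually built

    static void run(LazyLogger logger, final int value) {
        logger.log(1, new Message() {
            public String build() {
                buildCount++;
                return String.format("Value is: %d", value);
            }
        });
    }
}
```

The call site is wordier than the C macro or a Java 8 lambda, but the string is only formatted when the level check passes, and there is no if statement at the call site.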
Newer logging libraries, including java.util.logging, have a second form of their methods that takes a Supplier<String>.
e.g. log.info( ()->"Hello"); instead of log.info("Hello");.
The get() method of the supplier is only called if the message actually has to be logged, so your string is only constructed in that case.
I think the most important thing to understand here is that the C/C++ macro solution does not save computational effort by skipping the construction of the logged message when the log level is such that the message would not be logged.
Why is that so? Simply because the macro method would make the pre-processor substitute every usage of the macro:
LOG_IT(level, message)
with the code:
if(level>=App.actLevel_) LOG_MSG(message);
Substituting anything you passed as level and anything you passed as message along with the macro itself. The resulting code to be compiled will be exactly the same as if you copied and pasted the macro code everywhere in your program. The only thing macros help you with, is to avoid the actual copying and pasting, and to make the code more readable and maintainable.
Sometimes they manage to do it; other times they make the code more cryptic and thus harder to maintain. In any case, macros do not provide deferred execution to save you from actually constructing the string, as the Java 8 Logger class does by using lambda expressions. Java defers the execution of the body of a lambda until the last possible time. In other words, the body of the lambda is executed after the if statement.
To go back to your example in C/C++: you, as a developer, would probably want the code to work regardless of the log level, so you would be forced to construct a valid string message and pass it to the macro. Otherwise, at certain log levels, the program would crash! So, since the message string construction code must come before the call to the macro, you will execute it every time, regardless of the log level.
So, to make the equivalent to your code is quite simple in Java 6! You just use the built-in class: Logger. This class provides support for logging levels automatically, so you do not need to create a custom implementation of them.
If what you are asking is how to implement deferred execution without lambdas, though, I do not think it is possible.
If you wanted real deferred execution in C/C++, you would have to write the logging code so that it takes a function pointer to a function returning the message string. Your code would execute the function passed via the function pointer inside the if statement, and you would call your macro passing not a string but a function that creates and returns the string. I believe the actual C/C++ code to do this is out of scope for this question... The key concept here is that C/C++ provide you with the tools for deferred execution simply because they support function pointers. Java does not support function pointers until Java 8.

performance issues using PropertyUtilsBean.getProperty()

I am currently having some performance issues using PropertyUtilsBean.getProperty to evaluate some property-expressions like:
obj.propert1.coolMap[1].property
I am currently using PropertyUtilsBean.getProperty for this, but it has to be executed a lot of times and I can see the enormous amount of wasted CPU time in JProfiler.
As the expression for each occurrence never changes, I would like to sort of pre-initialize the property accessor. In an older version of my software I used reflection to save an instance of the getter Method, so all I had to do was call the method for a given object. In the current version we added support for more complex expressions and therefore switched to PropertyUtilsBean ... it seems this was a bad decision from a performance point of view.
Is there a way to have reusable access to a property which is able to interpret such property expressions as PropertyUtilsBean understands them? I would really like to be able to do this without implementing it manually.
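For reference, the "pre-initialized accessor" idea described above can be sketched as follows: the expression is parsed once into a list of getter Method objects, and each later evaluation only invokes them. This is a hand-rolled illustration, not a PropertyUtilsBean feature; CompiledPath is a hypothetical name, and the sketch only handles dotted chains of simple getXxx properties. Indexed and mapped segments such as coolMap[1] are deliberately left out.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Sketch: resolves a dotted path of simple properties once, then reuses the getters.
// Indexed and mapped segments such as coolMap[1] are NOT handled here.
final class CompiledPath {
    private final List<Method> getters = new ArrayList<Method>();

    CompiledPath(Class<?> root, String expression) throws NoSuchMethodException {
        Class<?> type = root;
        for (String name : expression.split("\\.")) {
            Method getter = findGetter(type, name);
            getters.add(getter);
            // Uses the declared return type; properties that exist only on a
            // runtime subclass would not be found by this sketch.
            type = getter.getReturnType();
        }
    }

    private static Method findGetter(Class<?> type, String name) throws NoSuchMethodException {
        String suffix = Character.toUpperCase(name.charAt(0)) + name.substring(1);
        return type.getMethod("get" + suffix); // boolean "is" getters are not handled
    }

    // Evaluates the precompiled path; returns null if an intermediate value is null.
    Object get(Object target) throws ReflectiveOperationException {
        Object current = target;
        for (Method getter : getters) {
            if (current == null) {
                return null;
            }
            current = getter.invoke(current);
        }
        return current;
    }
}
```

A CompiledPath would be created once per expression and kept, for example in a map keyed by the expression string, so that the parsing and method lookup cost is paid only once.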

From Static Typing to Dynamic Typing

I have always worked on statically typed languages (C/C++, Java). I have been playing with Clojure and I really like it.
One thing I am worried about is this: say that I have a window function that takes 3 modules as arguments, and along the way the requirements change and I need to pass another module to the function. In a statically typed language I just change the function and the compiler complains everywhere I used it. But in Clojure it won't complain until the function is called. I can just do a regex search and replace, but it seems there is a chance to miss a call, and it will go unnoticed until that function is actually called. How do you guys deal with this?
This is one of the reasons automated testing/test driven development is even more important in dynamically typed languages. I haven't used Clojure (I mostly use Ruby), so unfortunately I can't recommend a specific testing framework.
The first thing I'd like to mention is that Bruce Eckel has written a very interesting article called Strong Typing vs Strong Testing (the link is down at the moment, unfortunately, but hopefully it will be up soon).
His idea is that when dealing with compiled languages, the compiler is just acting as the first, automatic step of automatic testing. When making the move to a dynamic language, you lose this first level of automatic testing. But in both cases, this first, automatic level is just one part of testing, and not even a very important part.
His point is that if you're developing programs properly, i.e. doing some form of tests and regression tests, the lack of a compiler will only force you to add some more, somewhat basic tests anyways, which is why it's no big loss.
So I guess the first answer I'd give you is, focus on your testing, something you should be doing anyway, and such changes shouldn't affect you too badly.
The second thing I'd like to mention is many dynamic languages that I've seen (for example, Python) have much better abilities to change what methods/classes do without breaking existing code.
For example, with Python, if your method used to accept two parameters but now requires a third one, you can always add a default parameter without breaking any existing code, but that you can now utilize. This is a very basic technique, but in Python's case (and I assume most other dynamic languages as well), these techniques can get much more interesting; since they're dynamic, you can pretty much change the implementation of functions for specific modules, change what variables mean, etc.
I'd suggest looking at which techniques Clojure has that allow similar things, and deciding if they apply in your situation.
You do the same thing you did if the method was part of a public interface that you weren't the only user of.
You add a new method with the extra module and change the old one to call the new one with a suitable default.
Oh, and if your program is that big, make sure you have good tests (test-is should make it simpler than Java).
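In Java terms, the "keep the old method and delegate" advice looks roughly like the sketch below; in Clojure the same shape would presumably be a function with two arities. The class name, the String-typed modules, and the default value are placeholders invented for this example.

```java
// Hypothetical example: the window originally took three modules.
final class Windows {
    static final String DEFAULT_FOURTH = "default";

    // Old signature kept, so existing callers compile and behave as before.
    static String open(String a, String b, String c) {
        return open(a, b, c, DEFAULT_FOURTH);
    }

    // New signature with the additional module.
    static String open(String a, String b, String c, String d) {
        return a + "," + b + "," + c + "," + d;
    }

    private Windows() {}
}
```

Existing three-argument calls keep working unchanged, and only the callers that need the new module have to be touched.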
Test coverage is definitely important. But a dynamically typed language will allow you to work in a different way. In a strongly typed language (like Java), a change in the interface requires modifying all the callers. In Ruby, you could do this, but you probably won't. Instead, you'll probably add flexibility to the method in one of a few ways. Namely:
you tend to have very few methods that take as many as three parameters in Ruby (as opposed to Java). Because you don't have strong typed interface of Java, you break the problem down into smaller pieces and steps. It's much more common to write methods that take just 1 parameter, and then refactor when it becomes more complex.
it's possible-- and common-- to leave the old behavior in place while adding more arguments. For example, if you have to add a third argument to a two argument method, you will set its default value to preserve the old behavior (and save you a refactor). If you are familiar with Javascript libraries like jQuery, they take advantage of this everywhere with "optional" arguments.
similar to optional arguments, methods can grow to take a flexible parameter list. With solid test coverage, you can quite easily add a new behavior to an existing method and safely know you haven't broken the existing code. In Rails, methods like "render" take a wide range of options.
You're not completely without compiler support in Clojure. In the specific example you give, it's the arity of the function that changed, which would be picked up by compiling the Clojure code. I'm still making the strong -> dynamic typing transition and find this comforting!
You lose some level of refactoring and type safety when you move to dynamic languages. The more information the compiler has, the more it can do at compile time for you.
Tim Bray discusses it here, a critique of which by Cedric is here, and there is a post on Artima discussing it at length.
If you really need static typing, you can use https://github.com/clojure/core.typed and its Leiningen module to test static variable passing.

are there any potential issues with obfuscating an application?

I am building a spring mvc web application.
I plan on using hibernate.
I don't have much experience with obfuscating etc.
What are the potential downsides to obfuscating an application?
I understand that there might be issues with debugging the app, and recovering lost source code is also an issue.
Are there any known issues with the actually running of the application? Can bugs be introduced?
Since this is an area I am looking for general guidance, please feel free to open up any issues that I should be aware of.
There are certainly some potential performance/maintenance issues, but a good obfuscator will let you get round at least some of them. Things to look out for:
an obvious one: if your code calls methods by reflection or dynamically loads classes, then this is liable to fail if the class/method names are obfuscated; a good obfuscator will let you select class/method names not to obfuscate to get round this problem;
a similar issue can occur if not all of your application is compiled at the same time;
if it deals directly at the bytecode level, an obfuscator can create code that in principle a Java compiler cannot create (e.g. it can insert arbitrary GOTO instructions, whereas from Java these can only be created as part of a loop)-- this may be a bit theoretical, but if I were writing a JVM, I'd optimise performance for sequences of bytecodes that a Java compiler can create, not ones that it can't...
the obfuscator is liable to make other subtle changes to performance if it significantly alters the number of bytecodes in a method, or in some way changes whether a given method/piece of code hits thresholds for certain JVM optimisations (e.g. "inline methods with fewer than X bytecodes").
But as you can see, some of these effects are a little subtle and theoretical-- so to some extent what you need to do is soak-test your application after obfuscation, just as you would with any other major change.
You should also be careful not to assume that obfuscation hides your code/algorithm (if that is your intention) as much as you want it to-- use a decompiler to have a look at the contents of the resulting obfuscated classes.
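The reflection problem from the first bullet can be reproduced without any obfuscator, simply by looking up a method under a name it no longer has. In the sketch below, ReportService and ReflectiveCaller are hypothetical; passing "a" as the method name simulates a class whose generateReport method was renamed while the string at the call site was left untouched.

```java
import java.lang.reflect.Method;

// Hypothetical service whose method would be renamed by an obfuscator.
final class ReportService {
    public String generateReport() {
        return "report";
    }
}

final class ReflectiveCaller {
    // The method name is a plain string, which a renaming obfuscator
    // does not necessarily rewrite together with the method itself.
    static Object call(Object target, String methodName) throws ReflectiveOperationException {
        Method method = target.getClass().getMethod(methodName);
        return method.invoke(target);
    }
}
```

ReflectiveCaller.call(service, "generateReport") works against the original class, while a lookup under a stale name fails with NoSuchMethodException at run time, not at build time. This is why such classes and methods are usually put on the obfuscator's keep list.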
Surprised no one has mentioned speed - in general, more obfuscated = slower-running code
[Edit] I can't believe this has -2. It is a correct answer.
Shortening identifiers and removing unused methods will decrease the file-size, but have 0 impact on the running speed (other than the few nanoseconds shaved off the loading time). In the meanwhile, most of the obfuscation of the program comes from added code:
Breaking 1 method into 5; interleaving methods; merging classes [aggregation transformations]
Splitting 1 arithmetic expression into 10; jumbling the control-flow [computation transformations]
And adding chunks of code that do nothing [opaque predicates]
are all common obfuscation techniques that cause a program to run slower.
You may want to look at some of the comments here, to decide if obfuscating makes sense:
https://stackoverflow.com/questions/1988451/net-obfuscation
You may want to express why you want to obfuscate. IMO the best reasons are mainly to have a smaller application, as you can get rid of classes that aren't being used in your project, while obfuscating.
I have never seen bugs introduced, as long as you aren't using reflection on the assumption that you can find something by name, since private methods, for example, will have their names changed.
The biggest problem centers around the fact that obfuscating programs generally make a guarantee of not changing the behavior of their target program. In some cases it proves to be very hard to do this -- for example, imagine a program which checks the value of certain private fields via reflection from a string array. An obfuscator may not be able to tell that this string also needs to be updated correspondingly, and the result will be unexpected access errors that pop up at runtime.
Worse still, it may not be obvious that the behavior of a program has changed subtly -- then you may not know that there's a problem at all, until your customer finds it first and gets upset.
Generally, professional-grade obfuscation products are sophisticated enough to catch some kinds of problems and prevent them, but ultimately it can be challenging to cover all the bases. The best defense is to run unit tests against the obfuscated result and make sure that all your expected behavior continues to hold true.
One free obfuscator you might want to check out is Babel. It is designed to be used on the command line (like many other obfuscators), and there is a Reflector add-in that will provide a UI for you.
When it comes to obfuscation, you really need to analyze what your goal is. In your case - if you have a web application (mvc) are you planning on selling it as a canned downloadable application? (if not and you keep the source on your web servers then you don't need it).
You might look at the components and pick only certain parts to obfuscate ... not the whole thing. In general, ASP.NET apps break pretty easily when you try to add obfuscation after you have developed them, due to all the reflection used.
Pretty much everything mentioned above is true ... it all depends on how many features you turn on to make it hard to reverse your code:
Renaming of members (fields/methods/events/properties) is most common (comes in different flavors: simple renaming of methods from something like GetId() to a() all the way to unreadable characters and removal of namespaces). BTW: This is where reflection usually breaks. Your assembly file may end up being smaller due to smaller strings being used too.
String encryption: this makes it harder to reverse the static strings used in your code. BTW: this paired with renaming makes it difficult for you to debug your renaming problems ... so you might turn it on after you have that working. It also has to add code that decrypts each string right before it is used in IL.
Code mangling ... this is what BlueRaja was referring to. It makes your code look like spaghetti code, to make it harder for someone to figure out. The CLR does not like this ... it can't optimize things as easily, and your final code will most likely run slower due to the additional branching and to things not being inlined because of the IL rewriting used for this option. BTW: this option really does raise the bar on what it takes to reverse your source code, but may come with a performance hit.
Removal of unused code. Some obfuscators offer you the option to trim any code that they find is not being used. This may make your assembly a little smaller if you have a lot of dead code hanging around ... but it is just a free benefit obfuscators throw in.
My advice is to only use it if you know why you are using it and design with that end in mind ... don't try to add it after you've finished your code (I've done that and it's not fun)
