I need to remove a particular function or a class from my Java code when it is being converted into .jar file using Maven. But the caveat is the function or class should stay inside the source code.
Is there any such way in which I can achieve this using Maven and/or any Java utilities?
(There are a lot of functions, around 400, and their implementations are very large as well, so commenting out the code is not an option.)
Okay, so the real problem is this:
We have a code base which includes certain parts that are not currently being used, but they may be used in the future, so we want to keep them in the code base, but we do not want them to be shipped to customers. (Due to, uhm, reasons.) What are the best practices for achieving this? Note that commenting them out would be impractical.
The proper way to achieve this is to extract all those parts into a separate module, and refrain from shipping that module.
The hacky way is to use a hard-coded feature flag.
A normal (non-hard-coded) feature flag is a boolean which controls a certain aspect of the behavior of our software. For example, if you build an mp3 player application, and you want to add support for the aac file format, but you do not want to ship support for it yet, then you might want to create a boolean supportAacFeatureFlag() method, and have all code that pertains to the aac file format invoke that method and check the return value before doing anything. It is important to note that this must be a method, not a constant, so that its value is not known at compilation time, because every single if statement that checks the value of a constant is bound to yield a "condition is always true" or "condition is always false" warning. The great benefit of feature flags over commenting-out code is that the code controlled by a feature flag must still compile, so it must be semantically correct. The problem with feature flags is that they do not eliminate the code; the code still gets shipped, it just does not get executed.
A hard-coded feature flag is a feature flag which is implemented using a constant. The constant condition warning will be issued in every single use of that flag, so it will have to be globally disabled. (That's why this approach is hacky: we normally want all warnings enabled.) The benefit of using a constant is that its value is known at compilation time, so even though the compiler will still compile the controlled code, it will refrain from emitting any bytecode for it, so the code essentially does not get shipped to customers. Only empty functions get shipped.
Note that this is documented behavior of the Java compiler. In other languages like C++ and C# the compiler always emits all code, and you have to use other means of controlling code generation, like #defined symbols, which, in my opinion, are also very hacky.
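To make the difference between the two kinds of flag concrete, here is a minimal sketch. All names in it (FeatureFlags, SUPPORT_AAC, supportAac, Player, the "player.aac" system property) are hypothetical and chosen only to match the mp3/aac example above; it is not taken from any real code base.

```java
// Hypothetical names throughout; a sketch of the two flag styles described above.
final class FeatureFlags {
    // Hard-coded flag: a compile-time constant, so `if (SUPPORT_AAC)` acts as
    // conditional compilation and javac can leave the guarded branch out of the bytecode.
    static final boolean SUPPORT_AAC = false;

    // Normal flag: a method, so the value is not known at compilation time
    // and the guarded code is always shipped.
    static boolean supportAac() {
        return Boolean.getBoolean("player.aac"); // reads a JVM system property
    }

    private FeatureFlags() {}
}

final class Player {
    static String describe(String fileName) {
        if (FeatureFlags.SUPPORT_AAC) {
            // Still type-checked by the compiler, so it cannot rot silently,
            // but not emitted while the constant is false.
            if (fileName.endsWith(".aac")) {
                return "aac";
            }
        }
        if (fileName.endsWith(".mp3")) {
            return "mp3";
        }
        return "unsupported";
    }
}
```

Flipping SUPPORT_AAC to true and recompiling brings the aac branch back; inspecting the class with javap -c is one way to check what was actually emitted.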
An alternative which I am sure some people will opt for, but which I would strongly advise against, is to keep the unused code in a separate feature branch and remove it from the master branch. Any refactorings applied to the master branch will not affect the feature branch, so the code will diverge, and integrating it in the future will be a nightmare.
I'm converting a Java library to Objective-C. The Java code uses exceptions flagrantly (to my Objective-C accustomed mind). When converting, should I be throwing Objective-C exceptions (only within the library; I'll catch them before they leave) or should I use NSError constructs?
I'm familiar with the use-case for exceptions in regular Objective-C code; i.e. only for truly exceptional errors. If I don't get a definitive answer here, I'll probably use NSErrors.
Throwing and catching exceptions in Objective-C is expensive (except on 32-bit Mac OS X, where the @try part of the exception-catching code is the expensive part instead of the @catch part).
You're better off returning error codes in some mechanism (such as NSError, the OO way to do it in Objective-C.) Let those bubble up to the code that accesses your framework, and then let that code handle it appropriately.
Memory cleanup in the case of error with either system should not be a huge worry, as you should be able to put most objects and allocations in an autorelease pool. However, be advised that the pool will eat up any NSError or NSException objects created within its scope, so you'll have to make sure those objects survive past the end of your code with additional retains and releases. (Slightly off-topic, but I've seen a lot of people screw this part up when doing error handling.)
If the library can handle all the exceptions on its own, then they're not really exceptions in the Objective-C sense of the word: no programmer error has occurred; rather, something wholly anticipated has happened. You should use error codes/NSError as needed, but you might be able to dispense with much of the info provided by the exceptions and just return a value that indicates an "error" (nil, 0, and NSNotFound being some of the common ones) and handle that. You might also consider error-handling delegate methods.
I would just return errors on the Objective-C library (NSError approach). After all, it's the way error handling is done in C.
I wouldn't worry too much about using or not using exceptions. If it makes your code cleaner, use them. The only thing you really have to be careful about is not throwing exceptions through code that you do not know is exception safe, which is pretty much all code you did not write.
Yes, using exceptions is expensive, but worrying about that is an example of premature optimisation. After all, compared with a C function call, Objective-C message dispatch is expensive, but you don't hear Objective-C programmers saying "don't use Objective-C messages".
Seeing a checked exception in an API is not rare; one of the most well-known examples is IOException in Closeable.close(). Dealing with this exception often really annoys me. An even more annoying example came up in one of our projects. It consists of several components, and each component declares a specific checked exception. The problem (in my opinion) is that at design time it was not known exactly what those exceptions would be. Thus, for instance, the component Configurator declared ConfiguratorException. When I asked why we did not just use unchecked exceptions, I was told that we want our app to be robust and not to blow up at runtime. But that seems to be a weak argument because:
Most of those exceptions effectively make the app unusable. Yes, it doesn't blow up, but it cannot do anything except flood the log with messages.
Those exceptions are not specific and virtually mean that 'something bad happened'. How is the client supposed to recover?
In fact, all recovery consists of logging the exception and then swallowing it. This is performed in one large try-catch statement.
I think, that this is a recurring pattern. But still checked exceptions are widely used in APIs. What is the reason for this? Are there certain types of APIs that are more appropriate for checked exceptions?
There has been a lot of controversy around this issue.
Take a look at this classic article about that subject http://www.mindview.net/Etc/Discussions/CheckedExceptions
I personally tend to favor the use of runtime exceptions and have started to consider the use of checked exceptions in an API a bad idea.
In fact, some very popular Java APIs have started to do the same: for instance, Hibernate dropped its use of checked exceptions in favor of runtime exceptions from version 3, and the Spring Framework also favors runtime over checked exceptions.
One of the problems with large libraries is that they do not document all the exceptions that may be thrown, so your code may bomb at any time if an undocumented RuntimeException just happens to be thrown from deep down code you do not "own".
By explicitly declaring all of those, at least the developer using said library has the compiler's help in dealing with them correctly.
There is nothing like doing forensic analysis at 3 in the morning to discover that some situation triggered such an undeclared exception.
Checked exceptions should only be thrown for things that are 1) exceptional, meaning they are an exception to the rule of success (most poor exception throwing comes from the terrible habit of defensive coding), and 2) actionable by the client. If something happens that the client of the API can't possibly affect in any way, make it a RuntimeException.
There are different views on the matter, but I tend to view things as follows:
a checked exception represents an event which one could reasonably expect to occur under some predictable, exceptional circumstances that are still "within the normal operating conditions of a program/typical caller", and which can typically be dealt with not too far up the call stack;
an unchecked exception represents a condition that we "wouldn't really expect to occur" within the normal running environment of a program, and which can be dealt with fairly high up the call stack (or indeed possibly cause us to shut down the application in the case of a simpler app);
an error represents a condition which, if it occurs, we would generally expect to result in us shutting down the application.
For example, it's quite within the realms of a typical environment that under some exceptional-- but fairly predictable-- conditions, closing a file could cause an I/O error (flushing a buffer to a file on closing when the disk is full). So the decision to let Closeable throw a checked IOException is probably reasonable.
On the other hand, there are some examples within the standard Java APIs where the decision is less defensible. I would say that the XML APIs are typically overfussy when it comes to checked exceptions (why is not finding an XML parser something you really expect to happen and deal with in a typical application...?), as is the reflection API (you generally really expect class definitions to be found and not to be able to plough on regardless if they're not...). But many decisions are arguable.
In general, I would agree that exceptions of the "configuration exception" type should probably be unchecked.
Remember if you are calling a method which declares a checked exception but you "really really don't expect it to be thrown and really wouldn't know what to do if it were thrown", then you can programmatically "shrug your shoulders" and re-cast it to a RuntimeException or Error...
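As a minimal sketch of that "shrug", the helper below wraps the checked IOException from Closeable.close() in a RuntimeException, keeping the original as the cause. The class name Quietly and its method are hypothetical, introduced only for illustration.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper: converts a checked exception we do not expect,
// and could not handle anyway, into an unchecked one.
final class Quietly {
    static void close(Closeable resource) {
        try {
            resource.close();
        } catch (IOException e) {
            // Keep the original exception as the cause so the stack trace survives.
            throw new RuntimeException("Unexpected failure while closing resource", e);
        }
    }

    private Quietly() {}
}
```

Callers can then write Quietly.close(stream) without a try/catch of their own, and the failure still surfaces with its full cause chain if it ever happens.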
You can, in fact, use Exception tunneling so that a generic exception (such as your ConfiguratorException) can give more detail about what went wrong (such as a FileNotFound).
In general I would caution against this however, as this is likely to be a leaky abstraction (no one should care whether your configurator is trying to pull its data from the filesystem, database, across a network or whatever)
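For reference, exception tunneling might look like the sketch below. ConfiguratorException is the name from the question; its constructor, the Configurator.open method, and the idea that the configuration comes from a file are assumptions made for the example.

```java
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.Reader;

// The generic, component-level checked exception from the question.
class ConfiguratorException extends Exception {
    ConfiguratorException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical component: callers only ever see ConfiguratorException,
// but the underlying detail travels along as the cause.
final class Configurator {
    static Reader open(String path) throws ConfiguratorException {
        try {
            return new FileReader(path);
        } catch (FileNotFoundException e) {
            throw new ConfiguratorException("Cannot load configuration from " + path, e);
        }
    }

    private Configurator() {}
}
```

A caller that inspects getCause() and finds a FileNotFoundException has learned that the configurator reads from the filesystem, which is exactly the leak described above.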
If you are using checked exceptions then at least you'll know where and why your abstractions are leaky. IMHO, this is a good thing.
I am building a spring mvc web application.
I plan on using hibernate.
I don't have much experience with obfuscating etc.
What are the potential downsides to obfuscating an application?
I understand that there might be issues with debugging the app, and recovering lost source code is also an issue.
Are there any known issues with the actually running of the application? Can bugs be introduced?
Since this is an area I am looking for general guidance, please feel free to open up any issues that I should be aware of.
There are certainly some potential performance/maintenance issues, but a good obfuscator will let you get round at least some of them. Things to look out for:
an obvious one: if your code calls methods by reflection or dynamically loads classes, then this is liable to fail if the class/method names are obfuscated; a good obfuscator will let you select class/method names not to obfuscate to get round this problem;
a similar issue can occur if not all of your application is compiled at the same time;
if it deals directly at the bytecode level, an obfuscator can create code that in principle a Java compiler cannot create (e.g. it can insert arbitrary GOTO instructions, whereas from Java these can only be created as part of a loop)-- this may be a bit theoretical, but if I were writing a JVM, I'd optimise performance for sequences of bytecodes that a Java compiler can create, not ones that it can't...
the obfuscator is liable to make other subtle changes to performance if it significantly alters the number of bytecodes in a method, or in some way changes whether a given method/piece of code hits thresholds for certain JVM optimisations (e.g. "inline methods with fewer than X bytecodes").
But as you can see, some of these effects are a little subtle and theoretical-- so to some extent what you need to do is soak-test your application after obfuscation, just as you would with any other major change.
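The reflection problem from the first point is easy to reproduce. In the sketch below, ReportPlugin, PluginLoader, and the method name "render" are hypothetical; the lookup works only as long as the names in the strings match the names in the compiled classes.

```java
import java.lang.reflect.Method;

// Hypothetical class that is only ever reached through reflection.
final class ReportPlugin {
    public String render() {
        return "report";
    }
}

final class PluginLoader {
    // Looks up a class and a no-argument method purely by name.
    // If an obfuscator renames ReportPlugin or render() without updating the
    // strings passed in here, this fails at runtime with ClassNotFoundException
    // or NoSuchMethodException.
    static String invokeByName(String className, String methodName)
            throws ReflectiveOperationException {
        Class<?> type = Class.forName(className);
        Object instance = type.getDeclaredConstructor().newInstance();
        Method method = type.getMethod(methodName);
        return (String) method.invoke(instance);
    }

    private PluginLoader() {}
}
```

This is the kind of class and method you would list in the obfuscator's "do not rename" configuration, and the kind of call path a post-obfuscation test run should exercise.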
You should also be careful not to assume that obfuscation hides your code/algorithm (if that is your intention) as much as you want it to-- use a decompiler to have a look at the contents of the resulting obfuscated classes.
Surprised no one has mentioned speed - in general, more obfuscated = slower-running code
[Edit] I can't believe this has -2. It is a correct answer.
Shortening identifiers and removing unused methods will decrease the file-size, but have 0 impact on the running speed (other than the few nanoseconds shaved off the loading time). In the meanwhile, most of the obfuscation of the program comes from added code:
Breaking 1 method into 5; interleaving methods; merging classes [aggregation transformations]
Splitting 1 arithmetic expression into 10; jumbling the control-flow [computation transformations]
And adding chunks of code that do nothing [opaque predicates]
are all common obfuscation techniques that cause a program to run slower.
You may want to look at some of the comments here, to decide if obfuscating makes sense:
https://stackoverflow.com/questions/1988451/net-obfuscation
You may want to express why you want to obfuscate. IMO the best reasons are mainly to have a smaller application, as you can get rid of classes that aren't being used in your project, while obfuscating.
I have never seen bugs introduced, as long as you aren't using reflection on the assumption that you can find something by name; private methods, for example, will have their names changed.
The biggest problem centers around the fact that obfuscating programs generally make a guarantee of not changing the behavior of their target program. In some cases it proves to be very hard to do this -- for example, imagine a program which checks the value of certain private fields via reflection from a string array. An obfuscator may not be able to tell that this string also needs to be updated correspondingly, and the result will be unexpected access errors that pop up at runtime.
Worse still, it may not be obvious that the behavior of a program has changed subtly -- then you may not know that there's a problem at all, until your customer finds it first and gets upset.
Generally, professional-grade obfuscation products are sophisticated enough to catch some kinds of problems and prevent them, but ultimately it can be challenging to cover all the bases. The best defense is to run unit tests against the obfuscated result and make sure that all your expected behavior continues to hold true.
One free obfuscator you might want to check out is Babel. It is designed to be used on the command line (like many other obfuscators), and there is a Reflector add-in that will provide a UI for you.
When it comes to obfuscation, you really need to analyze what your goal is. In your case - if you have a web application (mvc) are you planning on selling it as a canned downloadable application? (if not and you keep the source on your web servers then you don't need it).
You might look at the components and pick only certain parts to obfuscate ... not the whole thing. In general ASP.Net apps break pretty easily when you try to add obfuscation after you have developed them, due to all the reflection used.
Pretty much everything mentioned above is true ... it all depends on how many features you turn on to make it hard to reverse your code:
Renaming of members (fields/methods/events/properties) is most common (comes in different flavors: simple renaming of methods from something like GetId() to a() all the way to unreadable characters and removal of namespaces). BTW: This is where reflection usually breaks. Your assembly file may end up being smaller due to smaller strings being used too.
String encryption: this makes it harder to reverse the static strings used in your code. BTW: this paired with renaming makes it difficult for you to debug your renaming problems ... so you might turn it on after you have that working. This also has to add code to decrypt the string right before it is used in IL.
Code mangling ... this is what BlueRaja was referring to. It makes your code look like spaghetti code, to make it harder for someone to figure out. The CLR does not like this ... it can't optimize things as easily, and your final code will most likely run slower due to the additional branching and to some things not being inlined because of the IL rewriting used for this option. BTW: this option really does raise the bar on what it takes to reverse your source code, but may come with a performance hit.
Removal of unused code. Some obfuscators offer you the option to trim any code that they find is not being used. This may make your assembly a little smaller if you have a lot of dead code hanging around ... but it is just a free benefit obfuscators throw in.
My advice is to only use it if you know why you are using it and design with that end in mind ... don't try to add it after you've finished your code (I've done that and it's not fun)
I know Googling I can find an appropriate answer, but I prefer listening to your personal (and maybe technical) opinions.
What is the main reason of the difference between Java and C# in throwing exceptions?
In Java the signature of a method that throws an exception has to use the "throws" keyword, while in C# you don't know at compile time whether an exception could be thrown.
In the article The Trouble with Checked Exceptions and in Anders Hejlsberg's (designer of the C# language) own voice, there are three main reasons for C# not supporting checked exceptions as they are found and verified in Java:
Neutral on Checked Exceptions
“C# is basically silent on the checked exceptions issue. Once a better solution is known—and trust me we continue to think about it—we can go back and actually put something in place.”
Versioning with Checked Exceptions
“Adding a new exception to a throws clause in a new version breaks client code. It's like adding a method to an interface. After you publish an interface, it is for all practical purposes immutable, …”
“It is funny how people think that the important thing about exceptions is handling them. That is not the important thing about exceptions. In a well-written application there's a ratio of ten to one, in my opinion, of try finally to try catch. Or in C#, using statements, which are like try finally.”
Scalability of Checked Exceptions
“In the small, checked exceptions are very enticing… The trouble begins when you start building big systems where you're talking to four or five different subsystems. Each subsystem throws four to ten exceptions. Now, each time you walk up the ladder of aggregation, you have this exponential hierarchy below you of exceptions you have to deal with. You end up having to declare 40 exceptions that you might throw. … It just balloons out of control.”
In his article, “Why doesn't C# have exception specifications?”, Anson Horton (Visual C# Program Manager) also lists the following reasons (see the article for details on each point):
Versioning
Productivity and code quality
Impracticality of having class author differentiate between checked and unchecked exceptions
Difficulty of determining the correct exceptions for interfaces.
It is interesting to note that C# does, nonetheless, support documentation of exceptions thrown by a given method via the <exception> tag and the compiler even takes the trouble to verify that the referenced exception type does indeed exist. There is, however, no check made at the call sites or usage of the method.
You may also want to look into Exception Hunter, a commercial tool by Red Gate Software that uses static analysis to determine and report exceptions thrown by a method and which may potentially go uncaught:
Exception Hunter is a new analysis tool that finds and reports the set of possible exceptions your functions might throw – before you even ship. With it, you can locate unhandled exceptions easily and quickly, down to the line of code that is throwing the exceptions. Once you have the results, you can decide which exceptions need to be handled (with some exception handling code) before you release your application into the wild.
Finally, Bruce Eckel, author of Thinking in Java, has an article called, “Does Java need Checked Exceptions?”, that may be worth reading up as well because the question of why checked exceptions are not there in C# usually takes root in comparisons to Java.
Because the response to checked exceptions is almost always:
try {
    // exception-throwing code
} catch (Exception e) {
    // either
    log.error("Error fooing bar", e);
    // OR
    throw new RuntimeException(e);
}
If you actually know that there is something you can do if a particular exception is thrown, then you can catch it and then handle it, but otherwise it's just incantations to appease the compiler.
The basic design philosophy of C# is that actually catching exceptions is rarely useful, whereas cleaning up resources in exceptional situations is quite important. I think it's fair to say that using (the IDisposable pattern) is their answer to checked exceptions. See [1] for more.
[1] http://www.artima.com/intv/handcuffs.html
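For readers coming from Java, the closest analogue to a C# using statement is try-with-resources. The sketch below is only an illustration of "cleanup without catching"; TrackedResource and CleanupDemo are hypothetical names, and the counter exists solely so the cleanup can be observed.

```java
// Hypothetical resource that counts how many instances are currently open.
final class TrackedResource implements AutoCloseable {
    static int openCount = 0;

    TrackedResource() {
        openCount++;
    }

    void use(boolean fail) {
        if (fail) {
            throw new IllegalStateException("boom");
        }
    }

    @Override
    public void close() {
        openCount--;
    }
}

final class CleanupDemo {
    // No catch block: the exception, if any, simply propagates to the caller,
    // but close() is guaranteed to run first.
    static void run(boolean fail) {
        try (TrackedResource resource = new TrackedResource()) {
            resource.use(fail);
        }
    }

    private CleanupDemo() {}
}
```

The method neither handles nor declares anything, yet the resource is released on both the normal and the exceptional path, which is the try-finally-heavy style the quote above describes.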
By the time .NET was designed, Java had had checked exceptions for quite some time, and this feature was viewed by Java developers as controversial at best. Thus the .NET designers chose not to include it in the C# language.
Fundamentally, whether an exception should be handled or not is a property of the caller, rather than of the function.
For example, in some programs there is no value in handling an IOException (consider ad hoc command-line utilities to perform data crunching; they're never going to be used by a "user", they're specialist tools used by specialist people). In some programs, there is value in handling an IOException at a point "near" to the call (perhaps if you get a FNFE for your config file you'll drop back to some defaults, or look in another location, or something of that nature). In other programs, you want it to bubble up a long way before it's handled (for example, you might want it to abort until it reaches the UI, at which point it should alert the user that something has gone wrong).
Each of these cases is dependent on the application, and not the library. And yet, with checked exceptions, it is the library that makes the decision. The Java IO library makes the decision that it will use checked exceptions (which strongly encourage handling that's local to the call) when in some programs a better strategy may be non-local handling, or no handling at all.
This shows the real flaw with checked exceptions in practice, and it's far more fundamental than the superficial (although also important) flaw that too many people will write stupid exception handlers just to make the compiler shut up. The problem I describe is an issue even when experienced, conscientious developers are writing the program.
Interestingly, the guys at Microsoft Research have added checked exceptions to Spec#, their superset of C#.
Anders himself answers that question in this episode of the Software Engineering Radio podcast.
I went from Java to C# because of a job change. At first, I was a little concerned about the difference, but in practice, it hasn't made a difference.
Maybe, it's because I come from C++, which has the exception declaration, but it's not commonly used. I write every single line of code as if it could throw -- always use using around Disposable and think about cleanup I should do in finally.
In retrospect the propagation of the throws declaration in Java didn't really get me anything.
I would like a way to say that a function definitely never throws -- I think that would be more useful.
Additionally to the responses that were written already, not having checked exceptions helps you in many situations a lot. Checked exceptions make generics harder to implement and if you have read the closure proposals you will notice that every single closure proposal has to work around checked exceptions in a rather ugly way.
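The friction with generics can be seen in the lambda support Java eventually shipped: java.util.function.Function.apply declares no checked exceptions, so a method that throws IOException cannot be passed where a Function is expected without an adapter. The sketch below shows one common workaround; IoFunction, Lambdas, unchecked, and load are hypothetical names, not part of any library.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical functional interface that, unlike Function, may throw IOException.
@FunctionalInterface
interface IoFunction<T, R> {
    R apply(T input) throws IOException;
}

final class Lambdas {
    // Adapter: turns an IOException-throwing function into a plain Function
    // by tunneling the checked exception through UncheckedIOException.
    static <T, R> Function<T, R> unchecked(IoFunction<T, R> function) {
        return input -> {
            try {
                return function.apply(input);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // Stand-in for real I/O; fails for an empty name.
    static String load(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("empty name");
        }
        return "contents of " + name;
    }

    static List<String> loadAll(List<String> names) {
        // Passing Lambdas::load directly to map() would not compile,
        // because Function.apply cannot throw IOException.
        Function<String, String> loader = unchecked(Lambdas::load);
        return names.stream().map(loader).collect(Collectors.toList());
    }

    private Lambdas() {}
}
```

Every generic abstraction that accepts a callback has to either take a route like this or add an extra type parameter for the exception, which is the kind of workaround the closure proposals had to include.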
I sometimes miss checked exceptions in C#/.NET.
I suppose besides Java no other notable platform has them. Maybe the .NET guys just went with the flow...