I want to keep only java util, io, nioand math packages and want to remove all other packages like java.sql and others from my JDK.
How can I remove them?
So if I write some program which import removed packages it will give
error package doesn't exist.
Use a SecurityManager instead of hacking the JDK
I'm going to give you the best answer I can.
Why you really shouldn't be doing what you want to do
When you're writing code, it is commonly agreed to develop that code in a way that is extendable. That is, your code can be plugged into other applications, or it can be changed and added to, very easily. Now with that principle in mind, let's review what happens when you delete the possible functionality of your program. Say you delete the SQL package, and in the future, you want a backend database to provide some persistence in your program. You're screwed.
The idea of Java, in fact I'd go as far as to say the major advantage of Java, is it's commonality, consistency and standardization of patterns. A getter is always a getter. A variable (that isn't a constant) starts with a lower case letter. Classes have a standardized way of being structured. All these things make developing in Java quite intuitive.
The JDK is part of that consistency, and to edit it is to really impact one of the major points of Java. It sounds like you want to implement your program in a different, more compact language.
Finally, you have no idea how the client may want to extend your project in the future. IF you want to have some repeatable business from the client, and generate a good reputation at the same time, you want to design your code with good design practise in mind.
There is no such tool, AFAIK.
Removing stuff from the Java libraries can be technically tricky, 'cos it can be difficult to know if your code might directly or indirectly use some class or method.
There are potentially "licensing issues" if you add or remove classes from a JRE installer, and ship it to other people.
Concerning your proposed use case.
If you are building this as a web application, then you are going to have a lot of difficulty cutting out classes that are not needed. A typical webapp server-side framework uses a lot of Java SE interfaces.
If you accepted and ran code someone who wanted to try and bring down your service, they could do it without using only the Object class. (Hint: infinite loops and filling the heap.) Running untrusted code on your server is a bad idea. Period.
Think about the consequence for someone trying to run legitimate code on your server. Why shouldn't they be allowed to use library classes / methods? (I'd certainly be a bit miffed if I couldn't use "ordinary" library classes ...)
My advice would be reconsider if it was a good idea to implement such a service at all ... given the risks, and the difficulty you could have if your safeguards were ineffective. If you decide to proceed, I advise running the untrusted code within the JVM in a security box. As a second level of defence in case Java security is compromised, I'd recommend running the service "chrooted" or better still in an isolated virtual machine that can be turned off if you run into problems.
Related
First of all, I know how to build a Java application. But I have always been puzzled about where to put my classes. There are proponents for organizing the packages in a strictly domain oriented fashion, others separate by tier.
I have a problems with
naming and placing
So,
Where do you put your domain specific constants?
Where do you place class for stuff which is both infrastructural and domain specific (for instance I have a FileStorageStrategy class, which stores the files either in the database, or alternatively in database)?
thx
Here's an excellent approach to coming up with your own style and approach to class naming and organization style:
Find a great IDE. I really like Eclipse. It provides all of the features that follow.
Identify a naming style and organizational layout. This is an inductive approach to developing your own effective best practices.
Ask yourself: does #2 help you or are you fighting with it?
Archive your system frequently. Use a great configuration management system like git or GitHub! It will make #5 easier.
Make small changes. The IDE should allow you to rename class at any point of your system and make the changes globally. Moving classes should be equally easy. Doing this will prevent getting hamstrung by analysis: analysis paralysis.
Loop to #2. Yes, infinite loop. I've been doing this loop for a very long time. (Duh, it's an infinite loop!) In fact, sometimes I thrash: I flip-flop between 2 styles.
While in your infinite loop, examine online style guides and coding standards. Google and NASA JPL have great ones. Also, look at the great open-source APIs to see how the community of developers name and organize. If someone else is going to work with your code, you don't want to confuse them in any way. Worst case, they'll ignore what you did and rewrite it.
Don't be afraid to experiment. Don't be afraid to make mistakes. Don't always rely on what others do or have done. Good luck.
I am curious about what automatic methods may be used to determine if a Java app running on a Windows or PC is malware. (I don't really even know what exploits are available to such an app. Is there someplace I can learn about the risks?) If I have the source code, are there specific packages or classes that could be used more harmfully than others? Perhaps they could suggest malware?
Update: Thanks for the replies. I was interested in knowing if this would be possible, and it basically sounds totally infeasible. Good to know.
If it's not even possible to automatically determine whether a program terminates, I don't think you'll get much leverage in automatically determining whether an app does "naughty stuff".
Part of the problem of course is defining what constitutes malware, but the majority is simply that deducing proofs about the behaviour of other programs is surprisingly difficult/impossible. You may have some luck spotting particular patterns, but on the whole you can't be confident (and I suspect it's provably impossible) that you've caught all possible attack vectors.
And in the general sphere, catching 95% of vectors isn't really worthwhile when the attackers simply concentrate on the remaining 5%.
Well, there's always the fundamental philosophical question: what is a malware? It's code that was intended to do damage, or at least code that doesn't do what it claims to. How do you plan to judge intent based on libraries it uses?
Having said that, if you at least roughly know what the program is supposed to do, you can indeed find suspicious packages, things the program wouldn't normally need to access. Like network connections when the program is meant to run as a desktop app. But then the network connection could just be part of an autoupdate feature. (Is autoupdate itself a malware? Sometimes it feels like it is.)
Another indicator is if a program that ostensibly doesn't need any special privileges, refuses to run in a sandbox. And the biggest threat is if it tries to load a native library when it shouldn't need one.
But all these only make sense if you know what the code is supposed to do. An antivirus package might use very similar techniques to viruses, the only difference is what's on the label.
Here is a general outline for how you can bound the possible actions your java application can take. Basically you are testing to see if the java application is 'inert' (can't take harmful actions) and thus it probably not mallware.
This won't necessarily tell you mallware or not, as others have pointed out. The app could still do annoying things like pop-up windows. Perhaps the best indication, is to see if the application is digitally signed by an author you trust; if not -- be afraid.
You can disassemble the class files to determine which Java APIs the application uses; you are looking for points where the java app uses the OS. Since java uses a virtual machine, there are well defined points where a java application could take potentially harmful actions -- these are the 'gateways' to various OS calls (for example opening a socket or reading a file).
Its difficult to enumerate all the APIs, different functions which execute the same OS action should require the same Permission. But java's docs don't provide an exhaustive list.
Does the java app use any native libraries -- if so its a big red flag.
The JVM does not offer the ability to run arbitrary code, or use native system APIs; in particular it does not offer the ability to modify the registry (a typical action of PC mallware). The only way a java application can do this is via native libraries. Typically there is no need for a normal application written in java to use native code (unless it needs to use devices).
Check for System.loadLibrary() or System.load() or Runtime.loadLibrary() or Runtime.load(). This is how the VM loads native libraries.
Does it use the network or file system?
Look for use of java.io, java.net.
Does it make system calls (via Runtime.exec())
You can check for the use of java.lang.Runtime.exec() or ProcessBuilder.exec().
Does it try to control the keyboard / mouse?
You could also run the application in a restricted policy JVM (the instructions/tools for doing this are not as simple as they should be) and see what fails (see Oracle's security tutorial) -- note that disassembly is the only way to be sure, just because the app doesn't do anything harmful once, doesn't mean it won't in the future.
This definitely is not easy, and I was surprised to find how many places one needs to look at (for example several java functions load native libraries, not just one).
I have been working on Java Swing project. Its design is pretty poor. I have been given task to make its design better. Primarily I was thinking that I will decouple Java Swing code and the business code by following MVC pattern.
Doing this stuff manually seems error prone. Is there any tool/software that decouples GUI layer code (written in Java Swing) and the business layer code (written in core Java).
As far as I can tell the answer to that question is "No, there is not such a tool."
I would hazard a guess to say that it is not even likely that someone could make such a tool. The problem is that thereis no easy way to distinquish between "business logic" and "GUI" code.
On projects that I have worked on, we made a concerted effort in the design phase to keep the GUI and the business logic separate, but I can't count the number of times that it was unavoidable that there was cross-over, especially when some business rule drove how the GUI should behave.
I don't envy your task. Dealing with, and modifying existing code, especially code that you didn't create, IS error prone. And the whole concept of MVC programming is really just way to think about how you build a program not a set of existing library of tools to use.
I learned MVC programming back in Smalltalk days, when it was originally invented by the smart people at Xerox Parc, and even they couldn't avoid some cross-over coupling of code.
Automated tools work well when the input is well defined and structured. Based on what you have described, your inputs are far from that.
Yes, doing this work is going to be complicated and a lot of work. But this is something that you just need to roll up your sleeves and dive into it. Before you start touching the code, sit down and make sure you fully understand how things are laid out now (and why that is a bad thing). Then start planning how you would do it better.
Now that you have a plan of how things should be, start working with the code and updating the legacy code.
There isn't a tool, but there are techniques.
Refactoring is a practice of making small changes to code to alter it's architecture without altering it's function. You've already probably done refactoring to a degree, but sometimes reading up on a subject you know about can help sharpen one's skills.
Many IDEs do support some of the simple refactorings. Eclipse and NetBeans have different strengths in their refactoring support, and depending on your current environment, you may find one to be more useful than the other.
What refactoring won't do for you is to give you a roadmap of how to get from point A (your spaghetti code) to point B (your cleanly separated out MVC). Instead it will just make you a better driver along the path. Unfortunately, (or fortunately for all the developers out there) good brains are still needed to figure out what sections of code intend to do, and whether they are more appropriately placed in the Model, View, or Controller. The rest is just teasing out the dependencies in such a manner that you can eventually move the code up or down in the sequence, and eventually push it into a method that can then be moved to the appropriate class.
Good luck.
I also don't know of any tool. And as others here - kind of - point out, I think such tools would end up making programmers redundant. We don't want that, do we. :-D
But, checking manually if you actually managed to decouple the code should not be to complicated. Just make sure your "business part" imports no swing packages:
find in files "swing"
move all those files in one package
check if there is any "business logic" in there => take it out
...
Is rjava the only way to connect R to Java? I am asking because there is a disclaimer at the end of the web page:
This interface uses Java reflection
API to find the correct method so it
is much slower and may not be right
(works for simple examples but may not
for more complex ones). For now its
use is discouraged in programs as it
may change in the future.
This is slightly concerning. How do you address this issue? I know that Rweka has a self-contained interface, so I may look into that package, but maybe many R users have already gone through the pains.
It is not the only one as the Omegahat project also has the RSJava package. But as many of the other brilliant innovations from Omegahat (which practically speaking is really just Duncan Temple Lang), this one may not build as easily or reliably.
The rJava package on the other hand is used by almost thirty other packages
CADStat, Containers, Deducer, JGR,
RFreak, RImageJ, RJDBC, RLadyBug,
aCGH.Spline, ant, arulesNBMiner,
colbycol, cshapes, dynGraph, farmR,
gWidgetsrJava, glmulti,
helloJavaWorld, iplots, rSymPy, rcdk,
rcdklibs, scagnostics, spcosa, RKEA,
RWeka, Snowball, openNLP, wordnet
which I take as quite the endorsement.
I think that disclaimer only applies if you use the $ operator to access your java objects. As long as you stick with the .jcall function you won't incur the overhead.
In terms of experience using rJava, I've found it works exactly as advertised and for my package (farmR) it hasn't caused any performance problems. I don't make a huge number of calls into java though, and I haven't used any of the java GUI toolkits.
I am an Rweka user, and I can tell you it is amazingly quick, it outperforms weka alone, while using it's functions in the r environment. I think that the R package has a very special way to integrate inside the language java libraries, nevertheless these libraries need to be prepared to allow this. For being able to do a proper integration you will need to do an important amount of research in order to see how to make things fit properly. I recommend you to read the documentation that comes with R, which details which are the best practices for writing NEW LIBRARIES libraries.
I'm in the process of reviewing a code base (~20K LOC) and trying to determine how to migrating it from 1.4.2 to 5. Obviously, it's not an overnight project and the suggestion which I have received is to write new code against Java 5 and migrate the old code in a piece-meal fashion. Also, I'm no expert in the new features in Java 5 (i.e. I know of them, but have never written any for production use).
My questions:
What features of Java 5 are typically used in production code? (i.e. generics, auto-boxing, etc.) Are there features to be avoided / not considered to be best-practices?
What are the best refactoring strategies which I can use migrate a code base of this size? (i.e. make changes to classes one at a time only when a class is edited, etc.) Objective - reduce risk on the code base. Limitation - resources to do refactoring.
Any advice is appreciated - thanks in advance.
UPDATE - a year too late, but better late than never? =)
Thank you for all of the comments - lots of great points of view. In the life of a software developer, there's always going to be the projects you strive to finish but never get around to because of something more "urgent".
With respect to the use of Java 5 (at that time), it was something which was required in the client's production environment, so that was why we did not use Java 6.
I found that the stronger typing for collections, enums and unboxing of primitives were the features I tend to apply the most, both to old and new code. The refactoring was fairly straight-forward, but code comprehension improved significantly and standards became easier to enforce. The ones I had the most trouble with was the generics; I think it's a concept which I still haven't had a chance to fully grasp and appreciate yet, so it was difficult for me to find previous cases where the application of generics was appropriate.
Thanks again to everyone who contributed to this thread and apologies for the late follow up.
Java 5 is almost completely backwards compatible with Java 4. Typically, the only change you must make when you migrate is to rename any usages of the new enum keyword in the Java 4 code.
The full list of potential compatibility problems is listed here:
http://java.sun.com/j2se/1.5.0/compatibility.html
The only other one that I've run into in practice is related to the change in the JAXP implementation. In our case, it simply meant removing xerces.jar from the classpath.
As far as refactoring goes, I think that migrating your collection classes to use the new strongly-typed generic versions and removing unnecessary casting is a good idea. But as another poster pointed out, changing to generic collections tends to work best if you work in vertical slices. Otherwise, you end up having to add casting to the code to make the generic types compatible with the non-generic types.
Another feature I like to use when I'm migrating code is the #Override annotation. It helps to catch inheritance problems when you're refactoring code.
http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Override.html
The new concurrency library is very useful if your code uses threading. For example, you may be able to replace home-grown thread pools with a ThreadPoolExecutor.
http://java.sun.com/j2se/1.5.0/docs/relnotes/features.html#concurrency
I would definitely take the approach of updating the code as you change it during normal maintenance. Other than the compatibility issues, I don't think there is a compelling reason to use the new Java 5 features unless you're already changing the code for other reasons.
There is one very real issue with the "viral" nature of generics; once you start introducing them at a given layer in an architecture you generally want to introduce it at the layer above & below as well. I have found that introducing generics is probably best done in full "verticals". But you do not have to do all the verticals at once.
This is a really hard question to answer because it depends on what code will be affected and how critical that code is.
First and foremost, when migration is a nontrivial undertaking, do yourself a favour and upgrade to the latest version of Java, which would be Java 6 not Java 5. Java 6 has been out for a year and a half or more and is mature. There's no reason to not pick it over Java 5 (imho).
Secondly, like any software project, your goal should be to get something into production as soon as you possibly can. So you need to identify a slice of your system. The smaller the better, the more non-cdritical, the better.
The other thing to do is just try starting up your app under Java 6 and seeing what breaks. It might be worse than you expected. It might be much better.
The other thing you'll probably need to be aware of is that by the sounds of it you will have jars/libraries in your app that have since been deprecated. Some may not even be compatible with Java beyond 1.4.2. You will probably want to upgrade all of these to the latest version as well.
This will probably mean more stuff breaking but using old/deprecated APIs is just kicking the can down the street and causes you other problems.
There are exceptions to this where upgrading can have far-reaching consequences. Axis1 to Axis2 comes to mind. Those situations require more careful thought.
As for what features are used... all of them pretty much. I can't think of any that should be avoided off the top of my head.
Also, I just noticed the size of your project: ~20K LOC. That's actually quite small (eg I've written an app about that size in the last 3 months by myself).
Lastly, this also depends on how easily you will find things that break. If you have good unit test coverage then great. That's pretty rare though. If you can just run through the app and reliably find problems it's not too bad.
The problematic situations are where scenarios are hard to test and it's likely you won't uncover problems straight away. That calls for more caution.
You would want to migrate stuff that doesn't work in the transition from 1.4 to 5 (not sure what that would be), but I'd be wary of migrating stuff for the sake of it.
If you do take this route, some questions:
Do you have comprehensive test coverage ? If not, you should write unit tests for the code you're going to be migrating.
Do you have components that are widely used within your codebase ? If so, they are probably candidates to be migrated in terms of their API (e.g. using generics etc.)
In terms of what's widely used from Java 5. Generics is important and makes your life a lot easier. I don't see autoboxing too much, nor enums (this is all relative). Varargs almost never. Annotations are useful for frameworks, but I consume these. I don't think I've ever implemented one myself.
20 (non-comment) kloc should be small enough to insert generics with a big bang. Obviously make sure your code compiles an runs on Java SE 5 first. The relatively easy thing about generics is that adding them makes very little change to semantics (certain overloadings can change because of implicit cases - Iterator<char[]> iter; ... System.out.println(iter.next()); as a bad example off the top of my head).
Some cases adding generics will highlight conceptual problems with the code. Using one Map as two maps with disjoint key sets, for example. TreeMap is an example in the Java library where a single class has two distinct mode (using Comparator<T> or Comparable<T>).
Things like enhanced-for and auto-boxing are very local and can be added piecemeal. enums are rarer and might take some thinking about how you are actually going to use them.
I think you're going about this the wrong way. Your plan shouldn't be to update all current code to Java 1.5, your plan should be to ensure that all current code runs exactly the same in 1.5 as it did in 1.4.2, and that all future code written will work fine in 1.5.
I've gone through a few transitions like this of varied sized code bases. The goal was always to make sure we had a ton of unit tests so that we could easily plug in 1.5 and run our tests through it. We actually encountered about 10 problems, mostly related to regular expression libraries not supporting something or supporting something differently.
Write all new code in 1.5 then, and if you change an older class for whatever reason, spend a minute and implement generics, but there's no reason to refactor everything. That sounds a bit dangerous to me if you don't have the tests in place.