How to reset loaded classes fast in java? - java

I'm using a custom classloader to load some java classes. I need to execute some methods from these loaded classes in a loop. For each loop iteration I need a fresh initialization of all the classes (all static fields). I have measured that the execution time is three times slower if I use a new classloader for each iteration than the execution time when not using a fresh classloader in each iteration.
Can I reset the loaded classes to their initial state without loading them with a new classloader?
Or is there a way to speed up recurring loading of the same classes in different classloaders?

When you load the class with a new classloader, the JVM will almost certainly have to re-JIT the bytecode. Until it does, the first uses of the newly loaded classes will be slower.
I assume these classes are library code which you cannot modify? Because the fact that you're having to use the classes in this way suggests flawed design to me.

Just off the top of my head: can you take a snapshot of the initial state of the class using reflection and then restore it?
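A minimal sketch of that snapshot idea, under some loud assumptions: `StaticStateSnapshot` and `SnapshotDemoState` are hypothetical names made up for illustration, only non-final static fields are handled, and the copy is shallow (a mutable object referenced by a static field is not cloned, so changes made inside it survive a restore).

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: remembers the values of a class's non-final static fields. */
final class StaticStateSnapshot {
    private final Map<Field, Object> values = new HashMap<>();

    /** Reads every non-final static field of the given class (this also triggers its initialization). */
    static StaticStateSnapshot capture(Class<?> type) {
        StaticStateSnapshot snapshot = new StaticStateSnapshot();
        try {
            for (Field field : type.getDeclaredFields()) {
                int mods = field.getModifiers();
                if (Modifier.isStatic(mods) && !Modifier.isFinal(mods) && !field.isSynthetic()) {
                    field.setAccessible(true);
                    snapshot.values.put(field, field.get(null)); // shallow copy only
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return snapshot;
    }

    /** Writes the remembered values back into the static fields. */
    void restore() {
        try {
            for (Map.Entry<Field, Object> entry : values.entrySet()) {
                entry.getKey().set(null, entry.getValue());
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}

/** Hypothetical stand-in for one of the loaded classes. */
final class SnapshotDemoState {
    static int counter = 0;
    static String label = "initial";
}
```

Capturing once before the loop and calling `restore()` at the start of each iteration avoids creating a new classloader, but it only works if the initial state can be expressed as plain field values.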

Related

Creating a SimpleName to CanonicalName map statically

I need to create a map of our domain classes simple names to their fully canonical names. I want to do this only for classes that are under our package structure, and that implement Serializable.
In serialization we use the canonical names of classes a lot. It's a good default behaviour, as it's a very conservative approach, but our model objects are going to move around between packages, and I don't want that to represent a breaking change requiring migration scripts, so I'd like this map. I've already tooled our serializer to use this map; now I just need a good strategy for populating it. It's been frustrating.
First alternative: have each class announce itself statically
the most obvious and most annoying: edit each class in question to include the code
static {
    Bootstrapper.classAliases.put(
        ThisClass.class.getSimpleName(),
        ThisClass.class.getCanonicalName()
    );
}
I knew I could do this from the get-go, I started on it, and I really hate it. There's no way this is going to be maintained properly, new classes will be introduced, somebody will forget to add this line, and I'll get myself in trouble.
Second alternative: read through the jar
traverse the jar our application is in, load each class, and see if it should be added to this map. This solution smelled pretty bad -- I'm disturbing the normal loading order and I'm coupled tightly to a particular deployment scheme. Gave up on this fairly quickly.
Third alternative: use java.lang.Instrumentation
requires me to run java with a java agent. More specifics about deployment.
Fourth alternative: hijack class loaders
My first idea was to see if I could add a listener to the class loaders, and then listen for my desired classes being loaded, adding them to this map as they're loaded into the JVM. Strictly speaking this isn't doing it statically, but it's close enough.
After discovering the tree-like nature of class loaders, and the various different schemes used by the different threads and different libraries, I thought that implementing this solution would be both too complicated and lead to bugs.
Fifth alternative: leverage the build system & a properties file
This one seems like one of the better solutions but I don't have the ant skill to do it. My plan would be to search each file for the pattern
//using human readable regex
[whitespace]* package [whitespace]* com.mycompany [char]*;
[char not 'class']*
class [whitespace]+ (<capture:"className">[nameCharacter]+) [char not '{']* implements [char not '{'] Serializable [char not '{'] '{'
//using notepad++'s regex
\s*package\s+([A-Za-z\._]*);.*class\s+(\w+)\s+implements\s+[\w,_<>\s]*Serializable
and then write out each matching entry in the form [pathFound][className]=[className] to a properties file.
Then I add some fairly simple code to load this properties file into a map at runtime.
Am I missing something obvious? Why is this so difficult to do? I know that the lazy nature of Java classes means that the language is antithetical to code asking the question "what classes are there", and I guess my problem is a derivative of this question, but still, I'm surprised at how much I'm having to scratch my brain to do this.
So I suppose my question is 2 fold:
how would you go about making this map?
If it would be with your build system, what is the ant code needed to do it? Is this worth converting to gradle for?
Thanks for any help
I would start with your fifth alternative. There is a bytecode manipulation project called Javassist which lets you load .class files and deal with them using Java objects. For example, you can load a "Foo.class" and start asking it things like its package, its public methods, etc.
Check out the ClassPool and CtClass objects.
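// Assumes Javassist is on the classpath and ALL_CLASS_FILES_IN_PROJECT is a collection of .class files
// (for example, populated with a glob pattern via Apache Commons IO).
ClassPool pool = ClassPool.getDefault(); // "default" is a reserved word and cannot be a variable name
List<CtClass> classes = new ArrayList<>();
for (File file : ALL_CLASS_FILES_IN_PROJECT) {
    // Close each stream once Javassist has read the class file
    try (FileInputStream in = new FileInputStream(file)) {
        classes.add(pool.makeClass(in));
    }
}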
The classes list will have all the classes ready for you to now deal with. You can add this to a static block in some entry point class that always gets loaded.
If this doesn't work for you, the next best bet is to use a Java agent. It's not that hard to do, but it will have some implications for your deployment (the agent jar must be made available and -javaagent added to the startup args).
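For comparison, the properties-file route from the question's fifth alternative can be sketched without any build-tool knowledge. This is only an illustration: `ClassAliases` is a hypothetical name, the `simpleName=canonicalName` file layout is an assumption (the question describes a slightly different entry format), and the regex is a simplified take on the pattern in the question that can be fooled by the word "class" inside comments or strings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for building and loading a simple-name to canonical-name map. */
final class ClassAliases {
    // A package declaration, later followed by a class whose header (the text before
    // its first '{') contains "implements ... Serializable".
    private static final Pattern SERIALIZABLE_CLASS = Pattern.compile(
        "package\\s+([\\w.]+)\\s*;.*?\\bclass\\s+(\\w+)[^{]*?\\bimplements\\b[^{]*?\\bSerializable\\b",
        Pattern.DOTALL);

    /** Build-time step: derive one alias entry from the text of a .java source file. */
    static Optional<Map.Entry<String, String>> aliasFor(String source, String rootPackage) {
        Matcher m = SERIALIZABLE_CLASS.matcher(source);
        if (!m.find()) {
            return Optional.empty();
        }
        String pkg = m.group(1);
        String simpleName = m.group(2);
        boolean underRoot = pkg.equals(rootPackage) || pkg.startsWith(rootPackage + ".");
        return underRoot
            ? Optional.of(Map.entry(simpleName, pkg + "." + simpleName))
            : Optional.empty();
    }

    /** Runtime step: turn the contents of the generated properties file into a map. */
    static Map<String, String> load(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> aliases = new HashMap<>();
        for (String key : props.stringPropertyNames()) {
            aliases.put(key, props.getProperty(key));
        }
        return aliases;
    }
}
```

The build would call `aliasFor` on each source file and write the entries out; the application would call `load` once at startup. Two classes with the same simple name in different packages would collide in such a map, which is worth checking for at build time.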

When should classes be initialised - at load time or at first use?

One can load a class dynamically using this method of java.lang.Class:
public static Class<?> forName(String name, boolean initialize,
ClassLoader loader)
According to the JavaDoc, the second parameter is used to control the timing of class initialization (execution of static initialization code). If true, the class is initialized after loading and during the execution of this method; if false, initialization is delayed until the first time the class is used.
Now, I understand all that, but the docs don't say how to decide which strategy to use. Is it better to always do initialization immediately? Is it better to always delay it to first use? Does it depend on the circumstances?
Yes, it depends on circumstances, but usually it is preferred to just let classes be loaded and initialized on first use.
Cases when you might want to early initialize them (e.g. by calling forName() for them):
Static initialization blocks might perform checks for external resources (e.g. files, database connection), and if those fail, you don't even want to continue the execution of your program.
Similar to the previous: loading external, native libraries. If those fail (or are not suitable for the current platform), you might want to detect that early and not continue with your app.
Static initialization blocks might perform lengthy operations, and you don't want delays/lags later on when the classes are really needed; you can initialize them early or on separate background threads.
If you have static configuration files where class names are specified as text, you might want to initialize/load them early to detect configuration errors/typos. Such examples are logger config files, web.xml, spring context etc.
Many classes in the standard Java library cache certain data; for example, HttpURLConnection caches the HTTP user agent returned by System.getProperty("http.agent"). Once the value has been read and cached, changing the property (for example with System.setProperty()) has no effect. You can force such caching by initializing the relevant classes early, which protects the cached values from being modified by code later on.
Cases when you should not initialize early:
Classes which might only be needed in rare cases, or might not be needed at all throughout the run of your application. For example, a GUI application might only show the About dialog when the user selects the Help/About menu. There is obviously no need to load the relevant classes early (e.g. AboutDialog), because this is a rare case and in most runs the user will not need it.
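The effect of the initialize flag discussed above can be observed directly. The following is a small sketch; `InitTimingDemo`, `Lazy`, and `Eager` are hypothetical names, and the static blocks only record when they run.

```java
import java.util.ArrayList;
import java.util.List;

final class InitTimingDemo {
    /** Records the order in which things happen. */
    static final List<String> EVENTS = new ArrayList<>();

    static class Lazy {
        static { EVENTS.add("Lazy initialized"); }
        static void touch() { }
    }

    static class Eager {
        static { EVENTS.add("Eager initialized"); }
    }

    /** Loads one class without and one with initialization, then uses the lazy one. */
    static List<String> run() {
        ClassLoader loader = InitTimingDemo.class.getClassLoader();
        try {
            // A class literal such as Lazy.class does not itself trigger initialization.
            Class.forName(Lazy.class.getName(), false, loader);  // load only
            EVENTS.add("Lazy loaded");
            Class.forName(Eager.class.getName(), true, loader);  // load and initialize
            EVENTS.add("Eager loaded");
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        Lazy.touch(); // first real use: the static block of Lazy runs here
        return new ArrayList<>(EVENTS);
    }
}
```

Called once in a fresh JVM, `run()` records "Lazy loaded", "Eager initialized", "Eager loaded", "Lazy initialized", in that order: with `initialize` set to false the static block is deferred until the first call to `Lazy.touch()`.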

Dynamic program updating, runtime compilation, and class loaders

I have an application that needs the ability to update parts of itself (one class at a time) without stopping and restarting. With the JavaCompiler API, it is straightforward to generate modified class source code, recompile, load, and instantiate a class. I can do this all in memory (no files read from disk or net).
The application will never instantiate more than one object of such a class. There will only be two or three references to that object. When the modified class is loaded and instantiated, all those references will be changed to the new object. I can also probably guarantee that no method in the affected class is running in another thread while loading the modified class.
My question is this: will my class loader have problems loading a modified class with the same name as a class it previously loaded?
If I do not explicitly implement a cache of loaded classes in the class loader, would that avoid problems? Or could delegation to a parent class loader still cause a problem?
I hope to use a single instance of my class loader, but if necessary, I could instantiate a new one each time I update a class.
Note: I looked at OSGI and it seems to be quite a bit more than I need.
There's a useful example on this at http://tutorials.jenkov.com/java-reflection/dynamic-class-loading-reloading.html
We do quite a bit of dynamic class reloading ourselves (using Groovy for compilation). Note that if you've got class dependencies then you may need to recompile these dependencies on reload. In dev stacks we keep track of these dependencies and recompile whenever dependencies become stale. In production stacks we opted for a non-reloading ClassLoader and create a new ClassLoader whenever anything changes. So you can do it either way.
BTW - you might find the GroovyScriptEngine at http://grepcode.com/file/repo1.maven.org/maven2/org.codehaus.groovy/groovy-all/1.8.5/groovy/util/GroovyScriptEngine.java#GroovyScriptEngine very interesting if you want to dig around how they do it.
Okay, it should work: when you load the new class, it will replace the class name in the appropriate tables, and the memory should be GC'd. That said, I'd give it a strenuous test with a real short program that compiles a nontrivial class and replaces it, say 10,000 times.
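Before relying on a single loader instance, it is worth testing one detail: as far as I know, a given class loader instance cannot define the same binary class name twice, and a second `defineClass` call for that name fails with a `LinkageError`. The sketch below checks exactly that and shows the "new ClassLoader per change" alternative mentioned above. `ReloadDemo`, `Plugin`, and `DefiningLoader` are hypothetical names, and for brevity the bytecode is read from the already compiled class rather than produced with the JavaCompiler API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ReloadDemo {

    /** Stand-in for the class that would be regenerated and recompiled at runtime. */
    public static class Plugin {
        public String version() { return "v1"; }
    }

    /** Minimal loader that exposes defineClass for in-memory bytecode. */
    static final class DefiningLoader extends ClassLoader {
        DefiningLoader(ClassLoader parent) { super(parent); }

        Class<?> define(String name, byte[] bytecode) {
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }

    /** Reads the compiled bytes of a class that is already on the classpath. */
    static byte[] bytecodeOf(Class<?> type) {
        String resource = type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("no bytecode found for " + type);
            }
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defining the same name twice in ONE loader instance is rejected. */
    static boolean sameLoaderRejectsSecondDefinition() {
        String name = Plugin.class.getName();
        byte[] bytecode = bytecodeOf(Plugin.class);
        DefiningLoader loader = new DefiningLoader(ReloadDemo.class.getClassLoader());
        loader.define(name, bytecode);
        try {
            loader.define(name, bytecode);
            return false;
        } catch (LinkageError expected) {
            return true;
        }
    }

    /** A fresh loader per version yields distinct classes that share a name. */
    static boolean freshLoadersYieldDistinctClasses() {
        String name = Plugin.class.getName();
        byte[] bytecode = bytecodeOf(Plugin.class);
        ClassLoader parent = ReloadDemo.class.getClassLoader();
        Class<?> first = new DefiningLoader(parent).define(name, bytecode);
        Class<?> second = new DefiningLoader(parent).define(name, bytecode);
        return first != second && first.getName().equals(second.getName());
    }
}
```

With a fresh loader per update, the old class and its loader become eligible for garbage collection once nothing references the old instance, its class, or the loader itself.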

Problem with static attributes

My problem is that I'm working on a project that requires me to run multiple instances of someone else's code, which has many static attributes/variables. This causes all the instances to share those resources and, well, crash. I can run multiple instances of this other person's program if I create a .jar file from it and open it multiple times by running the .jar in Windows, but calling the "main" method multiple times in my code (which is what I need to do) won't work.
I thought about creating a .jar and using Runtime.getRuntime().exec( "myprog.jar" ); to call the program multiple times, but that won't work for me since I have to pass an instance of my object to this new program and I don't think this solution would allow for that.
PS: This is also posted in the Sun forums, so I'll post the answer I get there here, or the answer I get here there, naturally giving proper credit once this is solved =P.
Remember that a static element in Java is unique only in the context of a classloader (hierarchy); a class is uniquely identified in a JVM by the tuple {classloader, classname}.
You need to instantiate isolated classloaders and load the jar with each of them. Each loaded class (and thus its static elements) is unique to its classloader, so the copies will not interfere with one another.
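A minimal sketch of that isolation, assuming nothing about the real legacy code: `LegacyCounter` is a hypothetical stand-in for a class with static state, and `IsolatingLoader` is an illustrative loader that defines its own copy of that one class instead of delegating to its parent. With a real jar that is kept off the main classpath, a plain `URLClassLoader` per instance would play the same role.

```java
import java.io.IOException;
import java.io.InputStream;

final class IsolatedStaticsDemo {

    /** Stand-in for legacy code that keeps its state in a static field. */
    public static class LegacyCounter {
        private static int count = 0;
        public static int next() { return ++count; }
    }

    /** Defines its own copy of one class; every other class is delegated to the parent. */
    static final class IsolatingLoader extends ClassLoader {
        private final String isolatedName;

        IsolatingLoader(String isolatedName, ClassLoader parent) {
            super(parent);
            this.isolatedName = isolatedName;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(isolatedName)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded == null) {
                    String resource = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(resource)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        loaded = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(loaded);
                }
                return loaded;
            }
        }
    }

    /** Calls LegacyCounter.next() n times inside a brand-new loader and returns the last value. */
    static int countInFreshLoader(int n) {
        try {
            ClassLoader parent = IsolatedStaticsDemo.class.getClassLoader();
            String name = LegacyCounter.class.getName();
            Class<?> copy = new IsolatingLoader(name, parent).loadClass(name);
            int last = 0;
            for (int i = 0; i < n; i++) {
                last = (Integer) copy.getMethod("next").invoke(null);
            }
            return last;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countInFreshLoader(3));
        System.out.println(countInFreshLoader(2));
    }
}
```

Each call starts from a fresh `count`, because every loader gets its own copy of the class and therefore its own statics. Objects passed between the copies must be of types loaded by a shared parent loader, otherwise casts fail with a ClassCastException.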
I'd say you have three alternatives:
Refactor the legacy application so that it doesn't use static attributes. If you can do this, this may be the best solution in the long term.
Continue with your approach of launching the legacy application in a separate JVM. There are a number of ways that you can pass (copies of) objects to another JVM. For example, you could serialize them and pass them via the child process's input stream. Or you could stringify them and pass them as arguments. In either case, you'll need to create your own 'main' class/method that deals with the object passing before calling the legacy app.
I think you should be able to use classloader magic to dynamically load a fresh copy of the legacy application each time you run it. If you create a new classloader each time, you should get a fresh copy of the legacy application classes with a separate set of statics. But, you have to make sure that the legacy app is not on your main classpath. The problem with this approach is that it is expensive, and you are likely to create memory leaks.
The description is a little confusing.
If you are running the code multiple times, you are running multiple independent processes, each running in its own JVM. There is no way that they are actually sharing the values of their static fields. Java doesn't let you directly share memory between multiple VMs.
Can you elaborate more (ideally with examples and code) what the attributes are defined as and what kind of failures you are getting? This may be completely unrelated to them being static.
In particular, what exactly do you mean by shared resources? What resources are your programs sharing?
The proper approach was already suggested - using custom ClassLoaders. Another thing comes to my mind, which might seem ugly, but will probably do, and is a bit more object-oriented approach.
The legacy code is used for its operations, and it incorrectly uses static instead of instance variables. You can fix that using inheritance and reflection:
create (or reuse) a utility class that copies instance variables to static ones
extend the classes in question and provide the same instance variables as the static ones
override all methods. In the overriding methods use the utility to copy the state of the current object to the static variables, and then delegate to (call) the super methods.
Then start using instances of your classes instead of the legacy ones. That way you will simulate the proper behaviour.
Keep in mind that this is NOT thread-safe.

Classes, Static Methods, or Instance Methods - Memory Consumption and Executable Size in Compiled Languages?

I keep wondering about this to try to improve performance and size of my Flex swfs, how do classes vs. static methods vs. instance methods impact performance and the final compiled "executable's" size? Thinking how it might be possible to apply something like HAML and Sass to Flex...
Say I am building a very large admin interface with lots of components and views, and each of those components has a Skin object applied to them (thinking of the Spark Skinning Architecture for Flex).
Now I want to add 10 different Effects to every skin (say there are 100 components on the screen, so that's 1000 instantiated effects). Is it better to:
Have each Effect be a Class (BlurEffect, GlowEffect...), and add those 10 to the skin.
Have all Effects be instance methods in one larger class, say "MultiEffect.as", and add that one class to the skin, referenced like multiEffect.glow().
Have all Effects be static methods in one singleton-esque "EffectManager.as" class, and just reference the effects in the skin via EffectManager.glow(this).
So
Multiple Effect Classes per skin, vs.
One Effect class per skin, with instance methods, vs.
One Effect class globally, with static methods
How do those things impact memory and executable size (swf size in this example)? I know that classes are better OO practices and that static methods are slower than instance methods, and that Singletons are to be avoided, so it's not about performance necessarily. More about memory (which if smaller would be better in some cases), and file size.
Couldn't find such information for Flex, but for Java (which shouldn't be too different), object creation overhead is only 8 bytes of memory.
That means if we're talking about 1000 instances, the overhead of using objects for each instance is at most 8K - negligible. If 100x more, it's still 800K which is still nothing.
So, echoing the previous answers, choose the option that gives you a better design.
Oh, and the difference in resulting file size is pretty much nothing.
