Is there any way to create a class that extends the ByteBuffer class?
Some abstract methods from ByteBuffer are package private, and if I create package java.nio, security exception is thrown.
I would want to do that for performance reasons - getInt for example has about 10 method invocations, as well as quite a few if's. Even if all checks are left, and only method calls are inlined and big/small endian checks are removed, tests that I've created show that it can be about 4 times faster.
You can't extend ByteBuffer, and thank God for that.
You can't extend it because there are no protected constructors. Why the "thank God" part? Well, having only two real subclasses ensures that the JVM can heavily optimize any code involving ByteBuffer.
Last, if you really need to extend the class, edit the bytecode: make the constructor protected and make DirectByteBuffer (and DirectByteBufferR) public. Extending HeapByteBuffer serves no purpose whatsoever, since you can access the underlying array anyway.
Then use -Xbootclasspath/p to add your own classes, and extend in whatever package you need (outside java.nio). That's how it's done.
Another way is to use sun.misc.Unsafe and do whatever you need with direct access to the memory starting at address().
I would want to do that for
performance reasons - getInt for
example has about 10 method
invocations, as well as quite a few
if's. Even if all checks are left, and
only method calls are inlined and
big/small endian checks are removed,
tests that I've created show that it
can be about 4 times faster.
Now the good part: use gdb and look at the machine code that is actually generated. You'd be surprised how many checks are removed.
I can't imagine why anyone would want to extend these classes. They exist to allow good performance, not just OO polymorphic execution.
edit:
How to declare any class and bypass Java verifier
On Unsafe: Unsafe has two methods that bypass the verifier, and if you have a class that extends ByteBuffer you can just call either of them. You need a hacked version of ByteBuffer (which is super easy to make) with public access and a protected constructor, just to satisfy the compiler.
The methods are below. Use them at your own risk. Once you have defined the class this way, you can even use it with the new keyword (provided there is a suitable constructor).
public native Class defineClass(String name, byte[] b, int off, int len, ClassLoader loader, ProtectionDomain protectionDomain);
public native Class defineClass(String name, byte[] b, int off, int len);
You can disregard protection levels by using reflection, but that kinda defeats the performance goal in a big way.
You can NOT create a class in the java.nio package - doing so (and distributing the result in any way) violates Sun's Java license and could theoretically get you into legal troubles.
I don't think there's a way to do what you want to do without going native - but I also suspect that you're succumbing to the temptation of premature optimization. Assuming that your tests are correct (which microbenchmarks are often not): are you really sure that access to ByteBuffer is going to be the performance bottleneck in your actual application? It's kinda irrelevant whether ByteBuffer.get() could be 4 times faster when your app only spends 5% of its time there and 95% processing the data it's fetched.
Wanting to bypass all checks for the sake of (possibly purely theoretical) performance does not sound like a good idea. The cardinal rule of performance tuning is "First make it work correctly, THEN make it work faster".
Edit: If, as stated in the comments, the app actually does spend 20-40% of its time in the ByteBuffer methods and the tests are correct, that means a speedup potential of 15-30% - significant, but IMO not worth starting to use JNI or messing with the API source. I'd try to exhaust all other options first:
Are you using the -server VM?
Could the app be modified to make fewer calls to ByteBuffer rather than trying to speed up those it does make?
Use a profiler to see where the calls are coming from - perhaps some are outright unnecessary
Maybe the algorithm can be modified, or you can use some sort of caching
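The 15-30% figure in the edit above can be reproduced with a little arithmetic: if a part of the program takes a fraction p of the run time and becomes s times faster, the time saved is p * (1 - 1/s). A minimal sketch of that calculation follows (the class and method names are made up for illustration):

```java
final class SpeedupEstimate {
    /**
     * Fraction of total run time saved when a part that takes `fraction`
     * of the time becomes `factor` times faster.
     */
    static double timeSaved(double fraction, double factor) {
        return fraction * (1.0 - 1.0 / factor);
    }

    public static void main(String[] args) {
        // 20% and 40% of the time spent in ByteBuffer, 4x faster access
        System.out.println(timeSaved(0.20, 4.0));
        System.out.println(timeSaved(0.40, 4.0));
    }
}
```

With the numbers from the comments (20-40% of the time, 4 times faster) this gives roughly 0.15 and 0.30 of the total run time.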
ByteBuffer is abstract so, yes, you can extend it... but I think what you want to do is extend the class that is actually instantiated which you likely cannot. It could also be that the particular one that gets instantiated overrides that method to be more efficient than the one in ByteBuffer.
I would also say that you are likely wrong in general about all of that being needed - perhaps it isn't for what you are testing, but likely the code is there for a reason (perhaps on other platforms).
If you do believe that you are correct on it open a bug and see what they have to say.
If you want to add to the nio package you might try setting the boot classpath when you call Java. It should let you put your classes in before the rt.jar ones. Type java -X to see how to do that, you want the -Xbootclasspath/p switch.
+50 bounty for a way to circumvent the access restriction (it cannot be
done using reflection alone. Maybe
there is a way using sun.misc.Unsafe
etc.?)
Answer is: there is no way to circumvent all access restrictions in Java.
sun.misc.Unsafe works under the authority of security managers, so it won't help
Like Sarnum said:
ByteBuffer has package private
abstract _set and _get methods, so you
couldn't override it. And also all the
constructors are package private, so
you cannot call them.
Reflection allows you to bypass a lot of stuff, but only if the security manager allows it. There are many situations where you have no control on the security manager, it is imposed on you. If your code were to rely on fiddling with security managers, it would not be 'portable' or executable in all circumstances, so to speak.
The bottom line of the question is that trying to override byte buffer is not going to solve the issue.
There is no other option than implementing a class yourself, with the methods you need. Making methods final where you can will help the compiler in its effort to perform optimizations (inlining, and less code generated for runtime polymorphism).
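As a rough illustration of "implementing a class yourself", here is a minimal sketch of a final reader over a byte[] with a fixed big-endian byte order. The name FixedOrderReader is hypothetical, and this is not a replacement for ByteBuffer: it only shows what a class with no byte-order switch and no polymorphism might look like.

```java
import java.nio.ByteBuffer;

final class FixedOrderReader {
    private final byte[] data;

    FixedOrderReader(byte[] data) {
        this.data = data;
    }

    /** Reads a big-endian int; the array accesses are still bounds-checked by the JVM. */
    int getInt(int index) {
        return ((data[index] & 0xFF) << 24)
             | ((data[index + 1] & 0xFF) << 16)
             | ((data[index + 2] & 0xFF) << 8)
             |  (data[index + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] raw = {0x12, 0x34, 0x56, 0x78, (byte) 0x9A};
        FixedOrderReader reader = new FixedOrderReader(raw);
        // Compare against ByteBuffer, whose default order is also big-endian
        System.out.println(reader.getInt(0) == ByteBuffer.wrap(raw).getInt(0));
        System.out.println(reader.getInt(1) == ByteBuffer.wrap(raw).getInt(1));
    }
}
```

Whether this is actually faster than a heap ByteBuffer depends on the JVM and has to be measured in the real application.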
The simplest way to get the Unsafe instances is via reflection. However if reflection is not available to you, you can create another instance. You can do this via JNI.
I tried, in bytecode, to create an instance WITHOUT calling a constructor, which would allow you to create an instance of an object with no accessible constructors. However, this did not work, as I got a VerifyError for the bytecode. The object has to have had a constructor called on it.
What I do is have a ParseBuffer which wraps a direct ByteBuffer. I use reflection to obtain the Unsafe reference and the address. To avoid running off the end of the buffer and killing the JVM, I allocate more pages than I need and as long as they are not touched no physical memory will be allocated to the application. This means I have far less bounds checks and only check at key points.
Using the debug version of the OpenJDK, you can see the Unsafe get/put methods turn into a single machine code instruction. However, this is not available in all JVM and may not get the same improvement on all platforms.
Using this approach I would say you can get about a 40% reduction in timings but comes at a risk which normal Java code does not have i.e. you can kill the JVM. The usecase I have is an object creation free XML parser and processor of the data contained using Unsafe compared with using a plain direct ByteBuffer. One of the tricks I use in the XML parser is to getShort() and getInt() to examine multiple bytes at once rather than examining each byte one at a time.
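The trick of examining several bytes at once can be sketched with a plain ByteBuffer as well. The example below is an illustration, not the parser described above: it checks for the four ASCII bytes "<?xm" with a single getInt() comparison. The class name is hypothetical, and the constant assumes the buffer uses the default big-endian order.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class MultiByteMatch {
    // The ASCII bytes '<', '?', 'x', 'm' packed into one big-endian int
    private static final int XML_DECL_PREFIX =
            ('<' << 24) | ('?' << 16) | ('x' << 8) | 'm';

    /** True if the four bytes at pos are "<?xm"; assumes a big-endian buffer. */
    static boolean startsWithXmlDecl(ByteBuffer buf, int pos) {
        return buf.limit() - pos >= 4 && buf.getInt(pos) == XML_DECL_PREFIX;
    }

    public static void main(String[] args) {
        ByteBuffer doc = ByteBuffer.wrap(
                "<?xml version=\"1.0\"?><a/>".getBytes(StandardCharsets.US_ASCII));
        ByteBuffer other = ByteBuffer.wrap("<root/>".getBytes(StandardCharsets.US_ASCII));
        System.out.println(startsWithXmlDecl(doc, 0));
        System.out.println(startsWithXmlDecl(other, 0));
    }
}
```

One comparison replaces four byte reads and four comparisons; the length check keeps the read inside the buffer.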
Using reflection to get the Unsafe instance is an overhead you incur once. Once you have the Unsafe instance, there is no further overhead.
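A minimal sketch of that one-time reflective lookup follows. It assumes a HotSpot/OpenJDK-style JVM where sun.misc.Unsafe keeps its singleton in a private static field named theUnsafe; that field name is an implementation detail, not a documented API. Recent JDK releases deprecate these memory-access methods, so treat this purely as an illustration. The class name UnsafeAccess is made up, and the example uses memory it allocates itself rather than the address of a direct ByteBuffer.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeAccess {
    // Looked up once; afterwards calls go straight to Unsafe
    private static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            // "theUnsafe" is an assumption about the JVM's internals
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Writes an int to freshly allocated native memory and reads it back. */
    static int roundTrip(int value) {
        long address = UNSAFE.allocateMemory(4);
        try {
            UNSAFE.putInt(address, value);   // no bounds check at all
            return UNSAFE.getInt(address);
        } finally {
            UNSAFE.freeMemory(address);
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(roundTrip(0xCAFEBABE)));
    }
}
```

Reading or writing outside the allocated range is not caught by anything and can crash the JVM, which is the risk mentioned above.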
A Java agent could modify ByteBuffer's bytecode and change the constructor's access modifier. Of course you'd need to install the agent in the JVM, and you would still have to get your subclass to compile. If you're considering such optimizations then you must be up for it!
I've never attempted such low level manipulation. Hopefully ByteBuffer is not needed by the JVM before your agent can hook into it.
I am answering the question you WANT the answer to, not the one you asked. Your real question is "how can I make this go faster?" and the answer is "handle the integers an array at a time, and not singly."
If the bottleneck is truly the ByteBuffer.getInt() or ByteBuffer.getInt(location), then you do not need to extend the class, you can use the pre-existing IntBuffer class to grab data in bulk for more efficient processing.
int totalLength = numberOfIntsInBuffer;
ByteBuffer myBuffer = whateverMyBufferIsCalled;
int[] block = new int[1024];
IntBuffer intBuff = myBuffer.asIntBuffer();
int partialLength = totalLength / 1024;
// Handle big blocks of 1024 ints at a time
try {
    for (int i = 0; i < partialLength; i++) {
        intBuff.get(block);
        // Do processing on ints, w00t!
    }
    partialLength = totalLength % 1024; // modulo to get remainder
    if (partialLength > 0) {
        intBuff.get(block, 0, partialLength);
        // Do final processing on ints
    }
} catch (BufferUnderflowException bufo) {
    // well, dang!
}
This is MUCH, MUCH faster than getting an int at a time. Iterating over the int[] array, which has set and known-good bounds, will also let your code JIT much tighter by eliminating bounds checks and the exceptions ByteBuffer can throw.
If you need further performance, you can tweak the code, or roll your own size-optimized byte[] to int[] conversion code. I was able to get some performance improvement using that in place of the IntBuffer methods with partial loop unrolling... but it's not suggested by any means.
Related
I have bean util library and we cache Method/Fields of properties, of course. Reading and writing goes via reflection.
There is an idea to skip reflection and for each method/field to bytecode-generate a simple object that directly calls the target. For example, if we have setFoo(String s) method, we would call a set(String s) method of this generated class that internally calls setFoo(). Again, we are replacing the reflection call with the runtime generated direct call.
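To make the idea concrete, here is a hand-written sketch of what such a generated class might look like next to its reflective equivalent. Bean, Setter and BeanFooSetter are hypothetical names, and the "generated" class is written out by hand here; a real implementation would emit the equivalent bytecode at runtime.

```java
import java.lang.reflect.Method;

final class AccessorSketch {
    static final class Bean {
        private String foo;
        public void setFoo(String s) { foo = s; }
        public String getFoo() { return foo; }
    }

    interface Setter {
        void set(Object target, Object value);
    }

    // Stand-in for the runtime-generated class: a direct, non-reflective call
    static final class BeanFooSetter implements Setter {
        public void set(Object target, Object value) {
            ((Bean) target).setFoo((String) value);
        }
    }

    // The reflective version that the generated class would replace
    static Setter reflectiveFooSetter() {
        final Method method;
        try {
            method = Bean.class.getMethod("setFoo", String.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
        return (target, value) -> {
            try {
                method.invoke(target, value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        };
    }

    /** Sets foo to "bar" through either kind of setter and returns the stored value. */
    static String demo(boolean direct) {
        Setter setter = direct ? new BeanFooSetter() : reflectiveFooSetter();
        Bean bean = new Bean();
        setter.set(bean, "bar");
        return bean.getFoo();
    }

    public static void main(String[] args) {
        System.out.println(demo(true));
        System.out.println(demo(false));
    }
}
```

Both paths produce the same result; the open question is only whether one extra class per property is worth the difference in call cost.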
I know Java does a similar thing with GeneratedMethodAccessor, but its cache may be limited by a JVM argument.
Does anyone know whether it makes sense to roll my own implementation, considering the performance? On the one hand it sounds fine, but on the other, many new classes will be created and will fill up perm gen space.
Any experience on this subject?
You are trying to re-invent cglib's FastMethod
In fact, Reflection is not slower at all. See
https://stackoverflow.com/a/23580143/3448419
Reflection can do more than 50,000,000 invocations per second. It is unlikely to be a bottleneck.
Edit 2:
Does a program with a fully object-oriented implementation give high performance? Most frameworks are written to use the full power of object orientation. However, reflection is also heavily used to achieve this, for example for AOP and dependency injection, and the use of reflection affects performance to a certain extent.
So, is it good practice to use reflection? Is there some alternative to reflection among the language's own constructs? To what extent should reflection be used?
Reflection is, in itself and by nature, slow. See this question for more details.
This is caused by a few reasons. Jon Skeet explains it nicely:
Check that there's a parameterless constructor
Check the accessibility of the parameterless constructor
Check that the caller has access to use reflection at all
Work out (at execution time) how much space needs to be allocated
Call into the constructor code (because it won't know beforehand that the constructor is empty)
Basically, reflection has to perform all the above steps before invocation, whereas normal method invocation has to do much less.
The JITted code for instantiating B is incredibly lightweight.
Basically it needs to allocate enough memory (which is just
incrementing a pointer unless a GC is required) and that's about it -
there's no constructor code to call really; I don't know whether the
JIT skips it or not but either way there's not a lot to do.
With that said, there are many cases where Java is not dynamic enough to do what you want, and reflection provides a simple and clean alternative. Consider the following scenario:
You have a large number of classes which represent various items, i.e. a Car, Boat, and House.
They all extend/implement the same class: LifeItem.
Your user inputs one of 3 strings, "Car", "Boat", or "House".
Your goal is to access a method of LifeItem based on the parameter.
The first approach that comes to mind is to build an if/else structure, and construct the wanted LifeItem. However, this is not very scalable and can become very messy once you have dozens of LifeItem implementations.
Reflection can help here: it can be used to dynamically construct a LifeItem object based on its name, so a "Car" input would get dispatched to a Car constructor. Suddenly, what could have been hundreds of lines of if/else code turns into a single line of reflection. The if/else objection is less compelling on a Java 7+ platform, thanks to the introduction of switch statements on Strings, but even then a switch with hundreds of cases is something I'd want to avoid. Here's what the difference in cleanliness would look like in most cases:
Without reflection:
public static void main(String[] args) {
    String input = args[0];
    if (input.equals("Car"))
        doSomething(new Car(args[1]));
    else if (input.equals("Boat"))
        doSomething(new Boat(args[1]));
    else if (input.equals("House"))
        doSomething(new House(args[1]));
    ... // Possibly dozens more if/else statements
}
Whereas by utilizing reflection, it could turn into:
public static void main(String[] args) {
    String input = args[0];
    try {
        doSomething((LifeItem) Class.forName(input).getConstructor(String.class).newInstance(args[1]));
    } catch (Exception ie) {
        System.err.println("Invalid input: " + input);
    }
}
Personally, I'd say the latter is neater, more concise, and more maintainable than the first. In the end it's a personal preference, but that's just one of the many cases where reflection is useful.
Additionally, when using reflection, you should attempt to cache as much information as possible. In other words employ simple, logical things, like not calling get(Declared)Method everywhere if you can help it: rather, store it in a variable so you don't have the overhead of refetching the reference whenever you want to use it.
So those are the two extremes of the pros and cons of reflection. To sum up: if reflection improves your code's readability (as it would in the scenario presented), by all means go for it. And if you do, just think about reducing the number of get* reflection calls: those are the easiest to trim.
While reflection is more expensive than "traditional code", premature optimization is the root of all evil. From a decade of empirical evidence, I assume that a method invoked via reflection will hardly affect performance unless it is invoked from a heavy loop, and even so there have been some performance enhancements to reflection:
Certain reflective operations, specifically Field, Method.invoke(),
Constructor.newInstance(), and Class.newInstance(), have been
rewritten for higher performance. Reflective invocations and
instantiations are several times faster than in previous releases
(Enhancements in J2SDK 1.4)
Note that method lookup (i.e. Class.getMethod) is not mentioned above, and choosing the right Method object usually requires additional steps (such as traversing the class hierarchy while asking for the "declared method" in case it is not public), so I tend to save the found Method in a suitable map whenever possible, so that the next time the cost is only that of a Map.get() and a Method.invoke(). I guess that any well-written framework can handle this correctly.
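A minimal sketch of such a cache is shown below. The class name MethodCache and the string key format are assumptions made for illustration; a production cache would likely key on the Class object itself to avoid clashes between class loaders.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class MethodCache {
    private static final Map<String, Method> CACHE = new ConcurrentHashMap<>();

    /** Looks up a public method once and reuses the Method object afterwards. */
    static Method lookup(Class<?> type, String name, Class<?>... params) {
        String key = type.getName() + '#' + name + Arrays.toString(params);
        return CACHE.computeIfAbsent(key, k -> {
            try {
                return type.getMethod(name, params);
            } catch (NoSuchMethodException e) {
                throw new IllegalArgumentException(e);
            }
        });
    }

    /** Invokes a public no-argument method by name using the cached Method. */
    static Object call(Object target, String name) {
        try {
            return lookup(target.getClass(), name).invoke(target);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(call("hello", "length"));
        // The second lookup returns the same cached Method instance
        System.out.println(lookup(String.class, "length") == lookup(String.class, "length"));
    }
}
```

After the first call, each invocation costs one map lookup plus Method.invoke(), as described above.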
One should also consider that certain optimizations, such as method inlining or escape analysis, are not possible if reflection is used (see Java HotSpot™ Virtual Machine Performance Enhancements). But this doesn't mean that reflection has to be avoided at all costs.
However, I think that the decision to use reflection should be based on other criteria, such as code readability, maintainability, design practices, etc. When using reflection in your own code (as opposed to using a framework that internally uses reflection), one risks transforming compile-time errors into run-time errors, which are harder to debug. In some cases, one could replace the reflective invocation with a traditional OOP pattern such as Command or Abstract Factory.
I can give you one example (but sorry, I can't show you the test results, because it was a few months ago). I wrote an XML library (custom, project-oriented) which replaced some old DOM parser code with classes + annotations. My code was half the size of the original. I did tests, and yes, reflection was more expensive, but not by much (something like 0.3 seconds out of 14-15 seconds of execution, a loss of about 2%). In places where code is executed infrequently, reflection can be used with only a small performance loss.
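As one possible reading of the factory alternative, the sketch below replaces Class.forName with a map from names to constructor references. LifeItem, Car and Boat come from the example earlier in this thread; the registry class and the describe() method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class LifeItemRegistry {
    interface LifeItem {
        String describe();
    }

    static final class Car implements LifeItem {
        private final String name;
        Car(String name) { this.name = name; }
        public String describe() { return "Car " + name; }
    }

    static final class Boat implements LifeItem {
        private final String name;
        Boat(String name) { this.name = name; }
        public String describe() { return "Boat " + name; }
    }

    // One entry per implementation; a typo here is a compile-time error
    private static final Map<String, Function<String, LifeItem>> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put("Car", Car::new);
        FACTORIES.put("Boat", Boat::new);
    }

    static LifeItem create(String kind, String arg) {
        Function<String, LifeItem> factory = FACTORIES.get(kind);
        if (factory == null) {
            throw new IllegalArgumentException("Invalid input: " + kind);
        }
        return factory.apply(arg);
    }

    public static void main(String[] args) {
        System.out.println(create("Car", "Beetle").describe());
    }
}
```

The trade-off is that each new LifeItem needs one registration line, whereas the reflective version picks up new classes without any change.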
Moreover, I am sure, that my code can be improved for better performance.
So, I suggest these tips:
Use reflection if you can do it in a way that is beautiful, compact & laconic;
Do not use reflection if your code will be executed many-many times;
Use reflection, if you need to project a huge amount of information from another source (XML-files, for example) to Java application;
The best usage for reflections and annotations is where code is executed only once (pre-loaders).
I'm writing code in the Java ME environment, so speed is absolutely an important factor. I have read several places that reflection of any sort (even the very limited amounts that are allowed on java ME) can be a very large bottleneck.
So, my question is this: is doing String.class.getName() slow? What about myCustomObject.getClass().getName()? Is it better to simply replace those with string constants, like "java.lang.String" and "com.company.MyObject"?
In case you're wondering, I need the class names of all primitives (and non-primitives as well) because Java ME does not provide a default serialization implementation and thus I have to implement my own. I need a generic serialization solution that will work for both communication across the network as well as local storage (RMS, but also JSR-75)
Edit
I'm using Java 1.3 CLDC.
String.class.getName() should not be slow, because the class is already known before the line executes, i.e. the compiler puts the class reference in place ahead of time.
myCustomObject.getClass().getName() would be a bit slower than the previous one, as the class is retrieved at execution time.
Reflection is not unnaturally slow; it's just as slow as you'd expect, but no slower. First, calling a method via reflection requires all the object creation and method calling that is obvious from the reflection API; second, if you're calling methods through reflection, HotSpot won't be able to optimize through the calls.
Calling getClass().getName() is no slower than you'd expect, either: the cost of a couple of virtual method calls plus a member-variable fetch. The .class version is essentially the same, plus or minus a variable fetch.
I can't speak for Java ME, but I'm not surprised at the overhead by using reflection on a resource constrained system. I wouldn't think it is unbearably slow, but certainly you would see improvements from hard-coding the names into a variable.
Since you mentioned you were looking at serialization, I'd suggest you take a look at how it's done in the Kryo project. You might find some of their methods useful; heck, you might even be able to use it in Java ME. (Unfortunately, I have no experience with ME.)
I'm a beginner and I've always read that it's bad to repeat code. However, it seems that in order to not do so, you would have to have extra method calls usually. Let's say I have the following class
public class BinarySearchTree<E extends Comparable<E>> {
    private BinaryTree<E> root;
    private final BinaryTree<E> EMPTY = new BinaryTree<E>();
    private int count;
    private Comparator<E> ordering;

    public BinarySearchTree(Comparator<E> order) {
        ordering = order;
        clear();
    }

    public void clear() {
        root = EMPTY;
        count = 0;
    }
}
Would it be more optimal for me to just copy and paste the two lines in my clear() method into the constructor instead of calling the actual method? If so how much of a difference does it make? What if my constructor made 10 method calls with each one simply setting an instance variable to a value? What's the best programming practice?
Would it be more optimal for me to just copy and paste the two lines in my clear() method into the constructor instead of calling the actual method?
The compiler can perform that optimization. And so can the JVM. The terminology used by compiler writers and JVM authors is "inline expansion".
If so how much of a difference does it make?
Measure it. Often, you'll find that it makes no difference. And if you believe that this is a performance hotspot, you're looking in the wrong place; that's why you'll need to measure it.
What if my constructor made 10 method calls with each one simply setting an instance variable to a value?
Again, that depends on the generated bytecode and any runtime optimizations performed by the Java Virtual machine. If the compiler/JVM can inline the method calls, it will perform the optimization to avoid the overhead of creating new stack frames at runtime.
What's the best programming practice?
Avoiding premature optimization. The best practice is to write readable and well-designed code, and then optimize for the performance hotspots in your application.
What everyone else has said about optimization is absolutely true.
There is no reason from a performance point of view to inline the method. If it's a performance issue, the JIT in your JVM will inline it. In Java, method calls are so close to free that it isn't worth thinking about.
That being said, there's a different issue here. Namely, it is bad programming practice to call an overridable method (i.e., one that is not final, static, or private) from the constructor. (Effective Java, 2nd Ed., p. 89, in the item titled "Design and document for inheritance or else prohibit it")
What happens if someone adds a subclass of BinarySearchTree called LoggingBinarySearchTree that overrides all public methods with code like:
public void clear() {
    this.callLog.addCall("clear");
    super.clear();
}
Then the LoggingBinarySearchTree will never be constructable! The issue is that this.callLog will be null when the BinarySearchTree constructor is running, but the clear that gets called is the overridden one, and you'll get a NullPointerException.
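The failure described above can be reproduced with a stripped-down sketch. Base and Logging are hypothetical stand-ins for BinarySearchTree and LoggingBinarySearchTree, and a plain List replaces the callLog type, which is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

final class ConstructorOverrideDemo {
    static class Base {
        Base() {
            clear(); // dispatches to the subclass override, if there is one
        }
        public void clear() { }
    }

    static class Logging extends Base {
        // Assigned only after the Base constructor has returned
        private final List<String> callLog = new ArrayList<>();

        @Override
        public void clear() {
            callLog.add("clear"); // callLog is still null during Base()
            super.clear();
        }
    }

    /** Returns true if constructing Logging fails with a NullPointerException. */
    static boolean constructionFails() {
        try {
            new Logging();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(constructionFails());
    }
}
```

Field initializers of the subclass run only after the superclass constructor completes, which is why the override sees a null callLog.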
Note that Java and C++ differ here: in C++, a superclass constructor that calls a virtual method ends up calling the one defined in the superclass, not the overridden one. People switching between the two languages sometimes forget this.
Given that, I think it's probably cleaner in your case to inline the clear method when called from the constructor, but in general in Java you should go ahead and make all the method calls you want.
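An alternative to copying the two lines, sketched here under the assumption that the class may be subclassed, is to move the shared logic into a private method that both the constructor and clear() call. Private methods cannot be overridden, so the constructor never runs subclass code. Tree, LoggingTree and reset() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class SafeConstructorDemo {
    static class Tree {
        private int count;

        Tree() {
            reset(); // private, so no subclass code runs here
        }

        public void clear() {
            reset();
        }

        private void reset() {
            count = 0;
        }

        int size() {
            return count;
        }
    }

    static class LoggingTree extends Tree {
        private final List<String> callLog = new ArrayList<>();

        @Override
        public void clear() {
            callLog.add("clear");
            super.clear();
        }

        List<String> log() {
            return callLog;
        }
    }

    /** Constructs a LoggingTree, clears it once, and returns the recorded calls. */
    static List<String> demo() {
        LoggingTree tree = new LoggingTree(); // no NullPointerException this time
        tree.clear();
        return tree.log();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

This keeps a single copy of the reset logic while avoiding the overridable call from the constructor.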
I would definitely leave it as is. What if you change the clear() logic? It would be impractical to find all the places where you copied the 2 lines of code.
Generally speaking (and as a beginner this means always!) you should never make micro-optimisations like the one you're considering. Always favour readability of code over things like this.
Why? Because the compiler / hotspot will make these sorts of optimisations for you on the fly, and many, many more. If anything, when you try and make optimisations along these sorts of lines (though not in this case) you'll probably make things slower. Hotspot understands common programming idioms, if you try and do that optimisation yourself it probably won't understand what you're trying to do so it won't be able to optimise it.
There's also a much greater maintenance cost. If you start repeating code then it's going to be much more effort to maintain, which will probably be a lot more hassle than you might think!
As an aside, you may get to some points in your coding life where you do need to make low level optimisations - but if you hit those points, you'll definitely, definitely know when the time comes. And if you don't, you can always go back and optimise later if you need to.
The best practice is to measure twice and cut once.
Once you've wasted time on optimization, you can never get it back again! (So measure first and ask yourself whether it's worth optimising. How much actual time will you save?)
In this case, the Java VM is probably already doing the optimization you are talking about.
The cost of a method call is the creation (and disposal) of a stack frame and some extra byte code expressions if you need to pass values to the method.
The pattern that I follow is to ask whether the method in question would satisfy one of the following:
Would it be helpful to have this method available outside this class?
Would it be helpful to have this method available in other methods?
Would it be frustrating to rewrite this every time I needed it?
Could the versatility of the method be increased with the use of a few parameters?
If any of the above are true, it should be wrapped up in its own method.
Keep the clear() method when it helps readability. Having unmaintainable code is more expensive.
Optimizing compilers usually do a pretty good job of removing the redundancy from these "extra" operations; in many instances, the difference between "optimized" code and code simply written the way you want, and run through an optimizing compiler is none; that is to say, the optimizing compiler usually does just as good a job as you'd do, and it does it without causing any degradation of the source code. In fact, many times, "hand-optimized" code ends up being LESS efficient, because the compiler considers many things when doing the optimization. Leave your code in a readable format, and don't worry about optimization until a later time.
"Premature optimization is the root of
all evil." - Donald Knuth
I wouldn't worry about the method call as much as the logic of the method. If it were a critical system that needed to "be fast", I would look at optimising the code that takes a long time to execute.
Given the memory of modern computers, this is very inexpensive. It's always better to break your code up into methods so someone can quickly read what's going on. It will also help with narrowing down errors in the code if the error is restricted to a single method with a body of a few lines.
As others have said, the cost of the method call is trivial-to-nada, as the compiler will optimize it for you.
That said, there are dangers in making calls to instance methods from a constructor. You run the risk of later updating the instance method so that it tries to use an instance variable that has not yet been initialized by the constructor. That is, you don't necessarily want to separate the construction activities from the constructor.
Another question--your clear() method sets the root to EMPTY, which is initialized when the object is created. If you then add nodes to EMPTY, and then call clear(), you won't be resetting the root node. Is this the behavior you want?
Before I ask my question can I please ask not to get a lecture about optimising for no reason.
Consider the following questions purely academic.
I've been thinking about the efficiency of accesses between root (ie often used and often accessing each other) classes in Java, but this applies to most OO languages/compilers. The fastest way (I'm guessing) that you could access something in Java would be a static final reference. Theoretically, since that reference is available during loading, a good JIT compiler would remove the need to do any reference lookup to access the variable and point any accesses to that variable straight to a constant address. Perhaps for security reasons it doesn't work that way anyway, but bear with me...
Say I've decided that there are some order of operations problems or some arguments to pass at startup that means I can't have a static final reference, even if I were to go to the trouble of having each class construct the other as is recommended to get Java classes to have static final references to each other. Another reason I might not want to do this would be... oh, say, just for example, that I was providing platform specific implementations of some of these classes. ;-)
Now I'm left with two obvious choices. I can have my classes know about each other with a static reference (on some system hub class), which is set after constructing all classes (during which I mandate that they cannot access each other yet, thus doing away with order of operations problems at least during construction). On the other hand, the classes could have instance final references to each other, were I now to decide that sorting out the order of operations was important or could be made the responsibility of the person passing the args - or more to the point, providing platform specific implementations of these classes we want to have referencing each other.
A static variable means you don't have to look up the location of the variable with respect to the class it belongs to, saving you one operation. A final variable means you don't have to look up the value at all but it does have to belong to your class, so you save 'one operation'. OK I know I'm really handwaving now!
Then something else occurred to me: I could have static final stub classes, kind of like a wacky interface where each call was relegated to an 'impl' which can just extend the stub. The performance hit then would be the double function call required to run the functions and possibly I guess you can't declare your methods final anymore. I hypothesised that perhaps those could be inlined if they were appropriately declared, then gave up as I realised I would then have to think about whether or not the references to the 'impl's could be made static, or final, or...
So which of the three would turn out fastest? :-)
Any other thoughts on lowering frequent-access overheads or even other ways of hinting performance to the JIT compiler?
UPDATE: After running several hours of test of various things and reading http://www.ibm.com/developerworks/java/library/j-jtp02225.html I've found that most things you would normally look at when tuning e.g. C++ go out the window completely with the JIT compiler. I've seen it run 30 seconds of calculations once, twice, and on the third (and subsequent) runs decide "Hey, you aren't reading the result of that calculation, so I'm not running it!".
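The "I'm not running it" effect in the update can be partly guarded against by making sure the benchmarked result is actually used. The sketch below is a hand-rolled illustration with hypothetical names, not a rigorous benchmark; a dedicated harness such as JMH deals with this and other pitfalls far more thoroughly.

```java
final class BenchmarkSink {
    /** Some deterministic work whose result depends on every iteration. */
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        long sink = 0;
        for (int run = 0; run < 5; run++) {
            long start = System.nanoTime();
            sink += work(1_000_000); // accumulate so the call is not dead code
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            System.out.println("run " + run + ": " + elapsedMicros + " us");
        }
        // Printing the accumulated value keeps the result observable
        System.out.println("sink = " + sink);
    }
}
```

Timings from early runs include interpretation and compilation, so they usually differ from later runs, which matches the behaviour described in the update.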
FWIW you can test data structures and I was able to develop an arraylist implementation that was more performant for my needs using a microbenchmark. The access patterns must have been random enough to keep the compiler guessing, but it still worked out how to better implement a generic-ified growing array with my simpler and more tuned code.
As far as the test here was concerned, I simply could not get a benchmark result! My simple test of calling a function and reading a variable from a final vs non-final object reference revealed more about the JIT than the JVM's access patterns. Unbelievably, calling the same function on the same object at different places in the method changes the time taken by a factor of FOUR!
As the guy in the IBM article says, the only way to test an optimisation is in-situ.
Thanks to everyone who pointed me along the way.
It's worth noting that static fields are stored in a special per-class object which contains the static fields for that class. Using static fields instead of object fields is unlikely to be any faster.
See the update, I answered my own question by doing some benchmarking, and found that there are far greater gains in unexpected areas and that performance for simple operations like referencing members is comparable on most modern systems where performance is limited more by memory bandwidth than CPU cycles.
Assuming you found a way to reliably profile your application, keep in mind that it will all go out the window should you switch to another jdk impl (IBM to Sun to OpenJDK etc), or even upgrade version on your existing JVM.
The reason you are having trouble, and would likely get different results with different JVM implementations, lies in the Java spec: it explicitly states that it does not define optimizations and leaves it to each implementation to optimize (or not) in any way, so long as execution behavior is unchanged by the optimization.