Performance of Overriding vs. if-statement - java

I'm extending and improving a Java application which also does long-running searches with a small DSL (specifically, it is used for model finding; yes, the problem is NP-complete in general).
During this search I want to show a small progress bar on the console. Because of the generic structure of the DSL I cannot calculate the overall search space size. Therefore I can only output the progress of the first "backtracking" statement.
Now the question:
I can use a flag for each backtracking statement to indicate that this statement should report the progress. When evaluating the statement I can check the flag with an if-statement:
public class EvalStatement {
    boolean reportProgress;

    public EvalStatement(boolean report) {
        reportProgress = report;
    }

    public void evaluate() {
        int progress = 0;
        while (someCondition) {
            // do something
            // maybe call other statements (tree structure)
            if (reportProgress) {
                // This is only executed by the root node, i.e.,
                // the condition is true only about 30 times, whereas
                // it is false millions or billions of times
                ++progress;
                reportProgress(progress);
            }
        }
    }
}
I can also use two different classes:
A base class which does nothing
A subclass that does the output
This would look like this:
public class EvalStatement {
    private ProgressWriter out;

    public EvalStatement(boolean report) {
        if (report)
            out = new ProgressWriterOut();
        else
            out = ProgressWriter.instance;
    }

    public void evaluate() {
        while (someCondition) {
            // do something
            // maybe call other statements (tree structure)
            out.reportProgress();
        }
    }
}
public class ProgressWriter {
    public static ProgressWriter instance = new ProgressWriter();

    public void reportProgress() {}
}

public class ProgressWriterOut extends ProgressWriter {
    private int progress = 0;

    @Override
    public void reportProgress() {
        // This is only executed by the root node, i.e.,
        // this body runs only about 30 times, whereas the
        // no-op version runs millions or billions of times
        ++progress;
        // Write the progress anywhere, e.g.,
        System.out.print('#');
    }
}
And now, really, the question(s):
Is the Java method lookup for the call faster than the if-statement?
In addition, would an interface and two independent classes be faster?
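I.e., roughly something like this (a sketch; the reporter names are just illustrative):

public interface ProgressReporter {
    void reportProgress(int progress);
}

public class NoOpReporter implements ProgressReporter {
    public void reportProgress(int progress) {
        // intentionally empty
    }
}

public class ConsoleReporter implements ProgressReporter {
    public void reportProgress(int progress) {
        System.out.print('#');
    }
}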
I know Log4J recommends putting an if-statement around log calls, but I think the main reason there is the construction of the parameters, especially strings. I have only primitive types here.
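For comparison, the Log4J idiom I mean looks like this (a minimal sketch assuming Log4J 1.x; the guard avoids building the message string, which is the expensive part):

import org.apache.log4j.Logger;

public class GuardExample {
    private static final Logger logger = Logger.getLogger(GuardExample.class);

    void evaluate(Object statement, long millis) {
        // The if avoids the string concatenation when debug is off;
        // the check itself costs next to nothing.
        if (logger.isDebugEnabled()) {
            logger.debug("Evaluated " + statement + " in " + millis + " ms");
        }
    }
}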
EDIT:
I clarified the code a little bit (what is called often; the use of the singleton is irrelevant here).
Furthermore, I made two long-term runs of the search in which the if-statement (respectively the method call) was hit 1,840,306,311 times, on a machine doing nothing else:
The if version took 10 h 9 min 15 s (50,343 "hits" per second).
The override version took 10 h 6 min 13 s (50,595 "hits" per second).
I would say this does not give a definitive answer, because the 0.5% difference is within measurement tolerance.
My conclusion: they behave more or less the same, but the overriding approach could be faster in the long term, as guessed by Kane in the answers.

I think this is the textbook definition of over-optimization. You're not even sure you have a performance problem. Unless you're making MILLIONS of calls across that section, it won't even show up in your hotspot reports if you profile it. If-statements and method calls are on the order of nanoseconds to execute, so the difference between them amounts to saving 1-10 ns at the most. For a delay to even be perceived by a human as slow, it needs to be on the order of 100 milliseconds, and that's if the user is paying close attention, e.g. actively clicking. If they're watching a progress bar they aren't going to notice it.
Say you wanted to check whether this adds even 1 s of extra time, and suppose one of the options saves 10 ns (it's probably more like 1-4 ns). That means the section would need to be called 100,000,000 times to save 1 s. And I can guarantee that if you have 100 million calls being made, you'll find 10 other areas that are more expensive than the choice between an if and polymorphism. Seems sort of silly to debate the merits of 10 ns on the off chance you might save 1 s, doesn't it?
I'd be more concerned about your usage of a singleton than performance.

I wouldn't worry about this - the cost is very small; output to the screen, or the computation itself, will be much slower.

The only way to really answer this question is to try both and profile the code under normal circumstances. There are lots of variables.
That said, if I had to guess, I would say the following:
In general, an if-statement compiles down to less bytecode than a method call, but with the JIT compiler optimizing, your method call may get inlined, at which point there is no call at all. Also, with branch prediction, the cost of the if-statement is minimal.
Again in general, using the interfaces will be faster than testing whether you should report every time the loop runs: over the long run, the cost of loading two classes, testing once, and instantiating one is less than running a particular test eleventy bajillion times.
Again, the better way to decide is to profile the code on real-world examples both ways, and maybe even report back your results. However, I have a hard time seeing this being the performance bottleneck of your application; your time is probably better spent optimizing elsewhere if speed is a concern.
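If you do want numbers, a JMH microbenchmark is the usual way to compare the two shapes today; a minimal sketch (the class names are illustrative, and the results will vary by JVM and hardware):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class DispatchBench {
    // Stand-in for the two-class design from the question.
    static class Writer { void report() { } }

    Writer writer = new Writer();   // no-op writer, held as the base type
    boolean reportProgress = false; // flag for the if-version
    int progress;

    @Benchmark
    public int ifVersion() {
        if (reportProgress) {
            ++progress;
        }
        return progress; // returning the value keeps dead-code elimination away
    }

    @Benchmark
    public int overrideVersion() {
        writer.report();
        return progress;
    }
}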

Putting anything on the monitor is orders of magnitude slower than either choice. If you really have a performance problem there (which I doubt), you'd need to reduce the number of calls to print.

I would assume that method lookup is faster than evaluating if(). In fact, the version with the if needs a method lookup as well.
And if you really want to squeeze out every bit of performance, use final methods in your ProgressWriters, as this can allow the JVM to inline the method, so there is no method lookup, and possibly not even a method call, in the machine code eventually compiled from the bytecode.
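As a sketch of that idea (assuming the two-class design from the question; final prevents further overriding, which makes it easier for the JIT to devirtualize and inline the call):

public class ProgressWriterOut extends ProgressWriter {
    private int progress = 0;

    @Override
    public final void reportProgress() {
        // final: no subclass can override this further
        ++progress;
        System.out.print('#');
    }
}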
But they are probably both rather close in performance. I would suggest testing/profiling first, and then concentrating on the real performance issues.

Related

Without volatile, a thread can still see changes made by another thread - why?

public class VisibleDemo {
    private boolean flag;

    public VisibleDemo setFlag(boolean flag) {
        this.flag = flag;
        return this;
    }

    public static void main(String[] args) throws InterruptedException {
        VisibleDemo t = new VisibleDemo();
        new Thread(() -> {
            long l = System.currentTimeMillis();
            while (true) {
                if (System.currentTimeMillis() - l > 600) {
                    break;
                }
            }
            t.setFlag(true);
        }).start();
        new Thread(() -> {
            long l = System.currentTimeMillis();
            while (true) {
                if (System.currentTimeMillis() - l > 500) {
                    break;
                }
            }
            while (!t.flag) {
                // if (System.currentTimeMillis() - l > 598) {
                // }
            }
            System.out.println("end");
        }).start();
    }
}
If the busy-wait loop does not contain the following code, the program never prints "end":
if (System.currentTimeMillis() - l > 598) {
}
With this code in the loop, it will probably print "end"; sometimes it still does not.
When the threshold is less than 598 (e.g. 550), or the code is absent, "end" is never printed.
At exactly 598, "end" is probably printed.
Above 598, "end" is printed every time.
Notes:
598 is the value on my computer; yours may be a different number.
The flag is not volatile, so why can the reader thread see the newest value?
First: I want to know why.
Second: I need help. I want to know in which scenarios a JVM thread's working copy of a variable is flushed to, or refreshed from, main memory.
OS: Windows 10
Java: jdk8u231
Your code is suffering from a data-race and that is why it is behaving unreliably.
The JMM is defined in terms of the happens-before relation. So if you have 2 actions A and B, and A happens-before B, then B should see A and everything before A. It is very important to understand that happens-before doesn't imply happening-before (so ordering based on physical time) and vice versa.
The 'flag' field is accessed concurrently; one thread is reading it while another thread is writing it. In JMM terms this is called conflicting access.
Conflicting accesses are fine as long as they are performed using some form of synchronization, because the synchronization will induce happens-before edges. But since the 'flag' accesses are plain loads/stores, there is no synchronization, and as a consequence there will not be a happens-before edge to order the load and the store. A conflicting access that isn't ordered by a happens-before edge is called a data race, and that is the problem you are suffering from.
When there is a data race, funny things can happen, but it will not lead to undefined behavior as is possible in C++ (where undefined behavior can effectively lead to any possible outcome, including crashes and super weird behavior). So a load still needs to see some value that was actually written; it can't see a value coming out of thin air.
If we look at your code:
while (!t.flag) {
    ...
}
Because the flag field isn't updated within the loop and is just a plain load, the compiler is allowed to optimize this code to:
if (!t.flag) {
    while (true) {
        ...
    }
}
This particular optimization is called loop hoisting (or loop invariant code motion).
So this explains why the loop doesn't need to complete.
Why does it complete when you access System.currentTimeMillis? Because you got lucky; apparently this call prevents the JIT from applying the above optimization. But keep in mind that System.currentTimeMillis doesn't have any formal synchronization semantics and therefore doesn't induce happens-before edges.
How to fix your code?
The simplest way to fix your code is to make 'flag' volatile, or to perform both the read and the write from a synchronized block. If you want to go really hardcore: use VarHandle getOpaque/setOpaque. Officially that is still a data race, because opaque access doesn't induce happens-before edges, but it prevents the compiler from optimizing out the load/store; it is a benign data race. Its primary advantage is slightly better performance, because it doesn't prevent the reordering of surrounding loads/stores.
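A sketch of the hardcore variant (requires Java 9+; the simpler fix is just declaring the field volatile; field and class names follow the question's code):

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class VisibleDemo {
    private boolean flag;

    private static final VarHandle FLAG;
    static {
        try {
            FLAG = MethodHandles.lookup()
                    .findVarHandle(VisibleDemo.class, "flag", boolean.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void setFlag() {
        FLAG.setOpaque(this, true);
    }

    void spinUntilSet() {
        // An opaque load cannot be optimized out of the loop,
        // so the write above is eventually observed.
        while (!(boolean) FLAG.getOpaque(this)) {
        }
    }
}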
I want to know the scenarios: when the worker cache of jvm thread will refresh to/from main memory.
This is a fallacy. Caches on modern CPUs are always coherent; this is taken care of by a cache-coherence protocol such as MESI. Writing to main memory for every volatile read/write would be extremely slow. For more information, see the following excellent post; if you want to know more about cache coherence and memory ordering, check this excellent book, which you can download for free.
I want to know the scenarios: when the worker cache of jvm thread will refresh to/from main memory.
When Taylor Swift is playing on your music player, it'll be 598, unless it's tuesday, then it'll be 599.
No, really. It's that arbitrary. The JVM spec gives the JVM the right to come up with any old number for any reason if your code isn't properly guarded.
The problem is JVM diversity. There is a crazy combinatorial explosion:
There are about 8 OSes give or take.
There are like 20 different 'chip lines', with different pipelining behaviour.
These chips can run in various mitigation modes to protect against attacks like Spectre. Let's call it 3.
There are about 8 different major JVM vendors.
These come in ~10 or so different versions (java 8, java 9, java 10, java 11, etc).
That gives us about 38,400 different combinations.
The point of the JMM (Java Memory Model) is to remove the handcuffs from a JVM implementation. A JVM implementation is looking for this optimal case:
It wants the freedom to use the various tricks that CPUs use to run code as fast as possible. For example, it wants to be free to 're-order': given a(); b();, to run b() first and a() later. Which is okay, if a and b are utterly independent and are not in any way looking at each other's modifications. The reason it wants to do this is that CPUs are pipelines: even processing a single instruction is in fact a chain of many separate steps, and the 'parse the instruction' step can get cracking on another instruction the very moment it is done, even if the previous instruction is still being processed by the rest of the pipe. In fact, the CPU could have 4 separate 'instruction parser units' parsing 4 consecutive instructions in parallel. This is NOT the kind of parallelism that multiple cores give you: this is a single core parsing 4 consecutive instructions in parallel, because parsing instructions is slightly slower than running them. That's just one particular Intel chip line, but that's the point: if the memory model of the Java specification said that a JVM simply can't use this stuff, then JVMs on that particular Intel chip would run slow as molasses. We don't want that.
Nevertheless, the memory-model rules can't be so preferential to the JVM's right to re-order and do all sorts of crazy things that it becomes impossible to write reliable code for JVMs. Imagine the Java language spec said that the JVM can re-order any 2 instructions in a method at any time, even if those 2 instructions touch the same field. That would be great for JVM engineers; they could go nuts optimizing code on the fly to re-order it optimally. But it would make it impossible to write Java code.
So, a balance has been struck. This balance takes the following form:
The JMM gives you specific rules - these rules take the form of: "If you do X, then the JVM guarantees Y".
But that is all. In particular, there is nothing written about what happens if you do not do X. All you know is, that then Y is not guaranteed. But 'not guaranteed' does not mean: Will definitely NOT happen.
Here is an example:
class Data {
    static int a = 0;
    static int b = 0;
}

class Thread1 extends Thread {
    public void run() {
        Data.a = 5;
        Data.b = 10;
    }
}

class Thread2 extends Thread {
    public void run() {
        int a = Data.a;
        int b = Data.b;
        System.out.println(a);
        System.out.println(b);
    }
}

class Main {
    public static void main(String[] args) {
        new Thread1().start();
        new Thread2().start();
    }
}
This code:
Makes 2 fields, which start out at 0 and 0.
Runs one thread that first sets a to 5 and then sets b to 10.
Starts a second thread that reads these 2 fields into local vars and then prints these.
The JVM spec says that it is valid for a JVM to:
Print 0/0
Print 5/0
Print 0/10
Print 5/10
But it would not be legal for a JVM to e.g. print '20/20', or '10/5'.
Let's zoom in on the 0/10 case because that is utterly bizarre - how could a JVM possibly do that? Well, reordering!
WILL a JVM print 0/10? On some combinations of JVM vendor and version + architecture + OS + phase of the moon, YES IT WILL. On most, no it won't. Ever. Still, imagine you wrote this code, you rely on 0/10 NEVER occurring, you test the heck out of your code, and you verify that indeed, even running the test a million times, it never happens. You ship it to the production server and it runs fine for a week, and then, just as you are giving the demo to the really important potential customer, all heck breaks loose: your app is broken, because from time to time the 0/10 case does occur.
You file a bug with your JVM vendor. And they close it as 'intended behaviour - wontfix'. That will really happen, because that really is the intended behaviour. If you write code that relies on a thing being true that is NOT guaranteed by the JMM, then YOU wrote a bug, even if on your particular hardware on this particular day it is completely impossible for you to make this bug occur right now.
This means one simple and very nasty conclusion is the only correct one: You cannot test this stuff.
So, if you adhere to the rule that if there are no tests then you can't know if your code works, guess what? You cannot ever know if your code is fine. Ever.
That then leads to the conclusion that you don't want to write any such code.
This sounds crazy (how can you simply not ever, ever write anything multicore?) but it's not as nuts as you think. This only comes up if 2 threads are dependent on ordering relative to each other for some in-process action. For example, if two threads are both accessing the same field of the same instance. Simply... don't do that.
It's easier than you think: If all 'communication' between threads goes via the database and you know how to use transactions in databases, voila. Or you use a message bus service like RabbitMQ.
If for some job you really must write multithread code where the threads interact with each other, don't shoot the messenger: It is NOT POSSIBLE to test that you did it right. So write it very carefully.
A second conclusion is that the JMM doesn't explain how things work or what happens. It merely says: IF you follow these rules, I guarantee you that THIS will happen. If you don't follow these rules, anything can happen. A JVM is free to do all sorts of crazy shenanigans, and neither this documentation nor any other will ever enumerate all the crazy things that could happen. After all, there are at least 38,400 different combinations, and it's crazy to attempt to document all 38,400!
So, what are the core rules?
The core rules are the so-called happens-before relationships. The basic rule is simply this:
There are various ways to establish H-B relationships. Such a relationship is always between 2 lines of code. Two lines of code might be unrelated, H-B wise; or the rules state that line A 'happens-before' line B.
If and only if the rules state this, then it is impossible to observe a state of the universe (the values of all fields of all instances in the entire JVM) at line B as it was before line A ran.
That's it. For example, if line A 'happens before' line B, but line B does not attempt to witness any field change A made, then the JVM is still free to reorder and have B run before A. The point is that this shouldn't matter - you're not observing, so why does it matter?
We can 'fix' our weird 0/10 case by setting up H-B: if the 'grab the static field values and save them to local a/b vars' code happens-after thread 1's setting of them, then we can be sure the code will always print 5/10, and the JMM guarantees mean that a JVM which doesn't print that is broken. (A concrete sketch follows the list of rules below.)
H-B are also transitive (if HB(A, B) is true, and HB(B, C) is true, then HB(A, C) is also true).
How do you set up HB?
If line B would run after line A as per the usual understanding of how things run, and both are being run by the same thread, HB(A, B). This is obvious: If you just write x(); y();, then y cannot observe state as it was before x ran.
HB(thread.start(), X) where X is the very first line in the started thread.
HB(EndS, StartS), where EndS is the exiting of a synchronized block on object ref Z, and StartS is another thread entering a synchronized block (on ref Z as well) later.
HB(W, R), where W is a write to volatile variable Z and R is a subsequent read of Z. In other words, a volatile write happens-before every later read of that variable.
There are a few more exotic ways. There's also a separate HB relationship for constructors and final variables that they initialize, but generally this one is real easy to understand (once a constructor returns, whatever final fields it initialized are definitely set and cannot be observed to not be set, even if otherwise no actual HB relationship has been established. This applies only to final fields).
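To make this concrete, a sketch of the volatile rule applied to the earlier Data example: if the reader reads b first and sees 10, it is then guaranteed to also see a == 5, so the bizarre 0/10 outcome becomes impossible (0/0 and 5/0 can still happen if the reader simply runs too early):

class Data {
    static int a = 0;
    static volatile int b = 0; // volatile: writes before the write to b
                               // are visible to whoever reads b afterwards
}

class Thread1 extends Thread {
    public void run() {
        Data.a = 5;  // plain write
        Data.b = 10; // volatile write: the HB edge starts here
    }
}

class Thread2 extends Thread {
    public void run() {
        int b = Data.b; // volatile read: the HB edge ends here
        int a = Data.a; // if b was seen as 10, this must see 5
        System.out.println(a);
        System.out.println(b);
    }
}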
This explains why you observe weird values. This also explains why your question of 'I want to know when a JVM thread will refresh to/from main memory' is not answerable: Because the java memory model spec and the java virtual machine spec intentionally and specifically make no promises on how that works. One JVM can work one way, another JVM can do it completely differently.
The reason I started off with the seeming joke about playing Taylor Swift is: a CPU has cores, and the cores are limited. A modern computer, especially a desktop, is doing thousands of things at once, and will therefore be rotating apps through cores all the time. Whether a field update is 'flushed out' to main memory (NOTE: THAT IS DANGEROUS THINKING - THE DOCS DO NOT ACTUALLY GUARANTEE THAT JVMS CAN BE UNDERSTOOD IN THOSE TERMS!) might depend on whether it gets rotated out of a core or not. And that in turn might depend on your music player hitting a particular compressed music file that takes a bit more CPU to decompress the next block so that it can be queued up in the audio buffer.
Hence, and this is no joke, the song you are playing on your music player can in fact change the number you get. Hence, why you have to give up: You CANNOT enumerate 'if my computer is in this state, then this code will always produce Y number'. There are billions of states you'd have to enumerate. Impossible.

How are JVM optimizations based on assumptions?

In section 12.3.3, "Unrealistic Sampling of Code Paths", the book Java Concurrency in Practice says:
"In some cases, the JVM may make optimizations based on assumptions that may only be true temporarily, and later back them out by invalidating the compiled code if they become untrue."
I cannot understand the above statement.
What are these JVM assumptions?
How does the JVM know whether the assumptions are true or untrue?
If the assumptions are untrue, does it influence the correctness of my data?
The statement that you quoted has a footnote which gives an example:
For example, the JVM can use monomorphic call transformation to convert a virtual method call to a direct method call if no classes currently loaded override that method, but it invalidates the compiled code if a class is subsequently loaded that overrides the method.
The details are very, very, very complex here, so the following is an extremely oversimplified example.
Imagine you have an interface:
interface Adder { int add(int x); }
The method is supposed to add a value to x and return the result. Now imagine a program that uses an implementation of this interface:
class OneAdder implements Adder {
    public int add(int x) {
        return x + 1;
    }
}

class Example {
    void run() {
        OneAdder a1 = new OneAdder();
        int result = compute(a1);
        System.out.println(result);
    }

    private int compute(Adder a) {
        int sum = 0;
        for (int i = 0; i < 100; i++) {
            sum = a.add(sum);
        }
        return sum;
    }
}
In this example, the JVM could do certain optimizations. A very low-level one is that it could avoid using a vtable for calling the add method, because there is only one implementation of this method in the given program. But it could even go further, and inline this only method, so that the compute method essentially becomes this:
private int compute(Adder a) {
    int sum = 0;
    for (int i = 0; i < 100; i++) {
        sum += 1;
    }
    return sum;
}
and in principle, even this
private int compute(Adder a) {
    return 100;
}
But the JVM can also load classes at runtime. So there may be a case where this optimization has already been done, and later, the JVM loads a class like this:
class TwoAdder implements Adder {
    public int add(int x) {
        return x + 2;
    }
}
Now, the optimization that has been done to the compute method may become "invalid", because it's not clear whether it is called with a OneAdder or a TwoAdder. In this case, the optimization has to be undone.
This should answer part 1 of your question.
Regarding 2: the JVM keeps track of all the optimizations that have been done, of course. It knows that it has inlined the add method based on the assumption that there is only one implementation of that method. When it finds another implementation, it has to undo the optimization.
Regarding 3: the optimizations are done while the assumptions hold. When they become untrue, the optimization is undone, so this does not affect the correctness of your program.
Update:
Again, the example above was very simplified, referring to the footnote that was given in the book. For further information about the optimization techniques of the JVM, you may refer to https://wiki.openjdk.java.net/display/HotSpot/PerformanceTechniques . Specifically, the speculative (profile-based) techniques can probably be considered to be mostly based on "assumptions" - namely, on assumptions that are made based on the profiling data that has been collected so far.
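On HotSpot you can actually watch such an assumption being backed out. Running the Example program with a standard compilation-logging flag lists methods as they are JIT-compiled, and when the assumption breaks (here: when TwoAdder is loaded), the previously compiled method is marked as invalidated ("made not entrant"); the exact log format varies by JVM version:

java -XX:+PrintCompilation Example

With -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining you can additionally see the inlining decisions themselves.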
Taking the quoted text in context, this section of the book is actually talking about the importance of using realistic test data (inputs) when you do performance testing.
Your questions:
What are these JVM assumptions?
I think the text is talking about two things:
On the one hand, it seems to be talking about optimizing based on the measurement of code paths. For example whether the "then" or "else" branch of an if statement is more likely to be executed. This can indeed result in generation of different code and is susceptible to producing sub-optimal code if the initial measurements are incorrect.
On the other hand, it also seems to be talking about optimizations that may turn out to be invalid. For example, at a certain point in time there may be only one implementation of a given interface method loaded by the JVM. On seeing this, the optimizer may decide to simplify the calling sequence to avoid polymorphic method dispatching. (The term used in the book for this is "monomorphic call transformation".) A bit later, a second implementation may be loaded, causing the optimizer to back out that optimization.
The first of these cases only affects performance.
The second of these would affect correctness (as well as performance) if the optimizer didn't back out the optimization. But the optimizer does do that. So it only affects performance. (The methods containing the affected calls need to be re-optimized, and that affects overall performance.)
How does the JVM know whether the assumptions are true or untrue?
In the first case, it doesn't.
In the second case, the problem is noticed when the JVM loads the second implementation and sees a flag on (say) the interface method saying that the optimizer has assumed it is effectively a final method. On seeing this, the loader triggers the "back out" before any damage is done.
If the assumptions are untrue, does it influence the correctness of my data?
No it doesn't. Not in either case.
But the takeaway from the section is that the nature of your test data can influence performance measurements. And it is not simply a matter of size. The test data also needs to cause the application to behave the same way (take similar code paths) as it would behave in "real life".

Java, optimal calling of objects and methods

Let's say I have the following code:
private Rule getRuleFromResult(Fact result) {
    Rule output = null;
    for (int i = 0; i < rules.size(); i++) {
        if (rules.get(i).getRuleSize() == 1) {
            output = rules.get(i);
            return output;
        }
        if (rules.get(i).getResultFact().getFactName().equals(result.getFactName()))
            output = rules.get(i);
    }
    return output;
}
Is it better to leave it as it is or to change it as follows:
private Rule getRuleFromResult(Fact result) {
    Rule output = null;
    Rule current = null;
    for (int i = 0; i < rules.size(); i++) {
        current = rules.get(i);
        if (current.getRuleSize() == 1) {
            return current;
        }
        if (current.getResultFact().getFactName().equals(result.getFactName()))
            output = current;
    }
    return output;
}
When executing, the program goes through rules.get(i) each time as if it were the first call, and I think that in a more involved example (say, the chained calls in the second if) this takes more time and slows execution. Am I right?
Edit: To answer a few comments at once: I know that in this particular example the time gain will be super tiny, but it was just to get the general idea across. I've noticed I tend to have very long chains like object.get().set().change().compareTo()... and many of them repeat. Across the whole codebase that time gain could be significant.
Your instinct is correct: saving intermediate results in a variable rather than re-invoking a method multiple times is faster. Often the performance difference will be too small to measure, but there's an even better reason to do this: clarity. By saving the value into a variable, you make it clear that you intend to use the same value everywhere; if you re-invoke the method multiple times, it's unclear whether you are doing so because you expect it to return different results on different invocations. (For instance, list.size() will return a different result if you've added items to list in between calls.) Additionally, using an intermediate variable gives you an opportunity to name the value, which can make the intention of the code clearer.
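For instance, in the loop above the intermediate variable also lets you pick a telling name:

Rule candidate = rules.get(i); // fetched once, named for its role
if (candidate.getRuleSize() == 1) {
    return candidate;
}
if (candidate.getResultFact().getFactName().equals(result.getFactName())) {
    output = candidate;
}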
The only difference between the two versions is that the first may call rules.get(i) twice for the same index.
So the second version is a little bit faster in general, but you will not feel any difference unless the list is big.
It depends on the data structure behind the "rules" object. If it is a LinkedList, then yes, the second one is much faster, because every rules.get(i) has to walk the list to find element i. If it is a structure with constant-time access by index (like an ArrayList, which is backed by an array), it is the same.
In general, yes, it's probably a tiny bit faster (nanoseconds, I'd guess) the first time it's called. Later on it will probably be optimized by the JIT compiler either way.
But what you are doing is so-called premature optimization. You usually should not think about things that provide only an insignificant performance improvement.
What is more important is readability, so the code can be maintained later on.
You could even do more premature optimization, like saving the length in a local variable (for arrays, the for-each loop effectively does this internally). But again, in 99% of cases it doesn't make sense to do it.
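For what it's worth, the for-each form reads cleanest and sidesteps the question entirely; a sketch of the same method:

private Rule getRuleFromResult(Fact result) {
    Rule output = null;
    for (Rule rule : rules) {
        if (rule.getRuleSize() == 1) {
            return rule;
        }
        if (rule.getResultFact().getFactName().equals(result.getFactName())) {
            output = rule;
        }
    }
    return output;
}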

Which approach shows better performance: encapsulating into a method or not?

While writing code I sometimes bump into the situation where I need to choose between creating a separate method (the advantage being that I can use my own, shorter syntax later) or calling the complex method that already exists each time (also fewer lines of code).
Here are the examples using different programming languages (Objective-C and Java) to explain the question.
Objective-C example:
- (double)maxValueFinder:(NSMutableArray *)data {
    double max = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
    return max;
}
then later:
...
double max = [self maxValueFinder:data];
...
or just every time try to call:
...
double max = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
...
Java example:
public static double maxFinder(ArrayList<Double> data) {
    double maxValue = Collections.max(data);
    return maxValue;
}
then later:
...
double max = maxFinder(data);
...
or just every time try to call:
...
double max = Collections.max(data);
...
or a more complex case, to make the point of my question sharper:
//using jsoup
public static Element getElement(Document content){
Element link = content.getElementsByTag("a").first();
return link;
}
or every time:
...
Element link = content.getElementsByTag("a").first();
...
Which approach costs fewer resources (performance, memory), or is it the same?
It absolutely doesn't matter. At least in your Java case you're uselessly recreating existing functionality, which is ridiculous.
You should first see if the functionality is contained in the standard library, then see if existing well known libraries have it, and only after that should you consider writing implementations yourself (especially for more complex functionality).
Performance has nothing to do with your question, except in the sense that the more time you spend on recreating existing functionality, the less time you have left for actual new code (therefore lowering your programming performance).
As for creating wrapper methods, that can be useful in some cases, especially if the actual method calls are often chained and you find yourself having more and more of those in the code. But there's a delicate difference between code clarity and writing excessive code.
public void parseHtml() {
    parseFirstPart();
    parseSecondPart();
    parseThirdPart();
}
If we assume that each parse method contains only one or maybe two method calls, then adding these additional methods is most likely useless, since the same thing can be achieved with proper commenting. If the parse methods contain a lot of calls, it makes sense to extract methods out of them. There's no rule about it; it's a skill you learn while you program (and it depends a lot on what you view as beautiful code).
It's absolutely useless to recreate existing functionality, because the function is already implemented in the library.
As for performance, in both cases you are executing the same line:
double maxValue = Collections.max(data);
so performance doesn't matter here; either way you run the same code.

Java execution speed

I'm new to Java programming.
I am curious about speed of execution and also the speed of creation and destruction of objects.
I've got several methods like the following:
private static void getAbsoluteThrottleB() {
    int A = Integer.parseInt(Status.LineToken.nextToken());
    Status.AbsoluteThrottleB = A * 100 / 255;
    Log.level1("Absolute Throttle Position B: " + Status.AbsoluteThrottleB);
}
and
private static void getWBO2S8Volts() {
    int A = Integer.parseInt(Status.LineToken.nextToken());
    int B = Integer.parseInt(Status.LineToken.nextToken());
    int C = Integer.parseInt(Status.LineToken.nextToken());
    int D = Integer.parseInt(Status.LineToken.nextToken());
    // Dividing by 32768.0 / 256.0 keeps the fractional part;
    // plain integer division would truncate it away.
    Status.WBO2S8Volts = ((A * 256) + B) / 32768.0;
    Status.WBO2S8VoltsEquivalenceRatio = ((C * 256) + D) / 256.0 - 128;
    Log.level1("WideBand Sensor 8 Voltage: " + Double.toString(Status.WBO2S8Volts));
    Log.level1("WideBand Sensor 8 Volt EQR: " + Double.toString(Status.WBO2S8VoltsEquivalenceRatio));
}
Would it be wise to create a separate method to process the data, since it is repetitive? Or would it be faster to keep each of these as a single method? I have several of these that would need to be rewritten, and I am wondering whether extracting a method would actually improve execution speed, whether it is just as good, or whether there is some number of instructions beyond which extracting a method becomes worthwhile.
Basically: what is faster, or when does it become faster, to process objects in a single method versus delegating to another method that processes several similar objects?
It seems like, at runtime, pulling a new variable and performing a math operation on it should be quicker than calling a method which then pulls a variable and performs the math operation. My question is really where the time actually goes.
These methods are all called only to read data and set a Status variable. There are nearly 200 methods like this in my class, which generate the data.
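For example, I could extract something like this (a sketch only; readInt would be a new helper, with Status.LineToken being the shared tokenizer from my code):

private static int readInt() {
    // parse the next whitespace-separated token as an int
    return Integer.parseInt(Status.LineToken.nextToken());
}

private static void getAbsoluteThrottleB() {
    Status.AbsoluteThrottleB = readInt() * 100 / 255;
    Log.level1("Absolute Throttle Position B: " + Status.AbsoluteThrottleB);
}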
The speed difference between invoking a piece of code inside a method and invoking it inline is negligible, especially compared with using the right algorithm for the task.
I would recommend using the method anyway, not for performance but for maintainability. If you need to change one line of code that turns out to have introduced a bug, and that segment is copy/pasted in 50 different places, it will be much harder to change (and to spot) than if it lives in one single place.
So don't worry about the performance penalty introduced by using methods, because it is practically nothing (even better, the VM may inline some of the calls).
I think S. Lott's comment on your question probably hits the nail perfectly on the head - there's no point optimizing code until you're sure the code in question actually needs it. You'll most likely end up spending a lot of time and effort for next to no gain, otherwise.
I'll also second Support's answer, in that the difference in execution time between invoking a separate method and invoking the code inline is negligible (this was actually what I wanted to post, but he kind of beat me to it). It may even be zero, if the JIT decides to inline the method anyway (HotSpot's JIT does inline small, hot methods).
There is one advantage of the separate method approach however - if you separate your data-processing code into a separate method, you could in theory achieve some increased performance by having that method called from a separate thread, thus decoupling your (possibly time-consuming) processing code from your other code.
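A minimal sketch of that idea with an ExecutorService (processLine is a hypothetical stand-in for the parsing methods; all names here are illustrative):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Pipeline {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    void submit(final String line) {
        // hand the (possibly slow) processing off, so the thread
        // reading the data can keep going
        worker.execute(new Runnable() {
            public void run() {
                processLine(line);
            }
        });
    }

    void processLine(String line) {
        // hypothetical: parse the tokens and update Status here
    }

    void shutdown() {
        worker.shutdown();
    }
}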
I am curious about speed of execution and also speed of creation and destruction of objects.
Creation of objects in Java is fast enough that you shouldn't need to worry about it, except in extreme and unusual situations.
Destruction of objects in a modern Java implementation has zero cost ... unless you use finalizers. And there are very few situations where you should even think of using a finalizer.
Basically, what is faster or when does it become faster to use a single method to process objects versus using another method to process several like objects?
The difference is negligible relative to everything else that is going on.
As @S.Lott says: "Please don't micro-optimize". Focus on writing code that is simple, clear, precise and correct, and that uses the most appropriate algorithms. Only "micro"-optimize when you have clear evidence of a critical bottleneck.
