I'm new to Java programming.
I am curious about speed of execution and also speed of creation and destruction of objects.
I've got several methods like the following:
private static void getAbsoluteThrottleB() {
int A = Integer.parseInt(Status.LineToken.nextToken());
Status.AbsoluteThrottleB=A*100/255;
Log.level1("Absolute Throttle Position B: " + Status.AbsoluteThrottleB);
}
and
private static void getWBO2S8Volts() {
int A = Integer.parseInt(Status.LineToken.nextToken());
int B = Integer.parseInt(Status.LineToken.nextToken());
int C = Integer.parseInt(Status.LineToken.nextToken());
int D = Integer.parseInt(Status.LineToken.nextToken());
Status.WBO2S8Volts=((A*256)+B)/32768;
Status.WBO2S8VoltsEquivalenceRatio=((C*256)+D)/256 - 128;
Log.level1("WideBand Sensor 8 Voltage: " + Double.toString(Status.WBO2S8Volts));
Log.level1("WideBand Sensor 8 Volt EQR:" + Double.toString(Status.WBO2S8VoltsEquivalenceRatio));
}
Would it be wise to create a separate method to process the data, since it is repetitive? Or would it just be faster to execute it as a single method? I have several of these which would need to be rewritten, and I am wondering whether it would actually improve speed of execution, whether it is just as good, or whether there is a number of instructions at which it becomes a good idea to create a new method.
Basically, what is faster or when does it become faster to use a single method to process objects versus using another method to process several like objects?
It seems like at runtime, pulling a new variable and then performing a math operation on it is quicker than calling a separate method, then pulling a variable, then performing a math operation on it. My question is really where the speed is.
These methods are all called only to read data and set a Status.Variable. There are nearly 200 methods in my class which generate data.
The speed difference between invoking a piece of code inside a method or outside of it is negligible, especially compared with using the right algorithm for the task.
I would recommend using the method anyway, not for performance but for maintainability. If you need to change one line of code that turns out to introduce a bug, and you have this code segment copy/pasted in 50 different places, it will be much harder to change (and spot) than if it were in one single place.
So don't worry about the performance penalty introduced by using methods, because it is practically nothing (even better, the VM may inline some of the calls).
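As a minimal sketch of the maintainability point above, the repeated "read the next token as an int" and scaling steps from the question can live in one place. The class name TokenReader and the helpers nextInt, toPercent and toWord are hypothetical, and the question's Status and Log classes are not reproduced here.

```java
import java.util.StringTokenizer;

// Hypothetical helper class; the names are illustrative, not from the original code.
final class TokenReader {
    private TokenReader() {}

    // Reads the next token and parses it as an int, so the parsing rule lives in one place.
    static int nextInt(StringTokenizer tokens) {
        return Integer.parseInt(tokens.nextToken());
    }

    // Scales a raw 0..255 value to a 0..100 percentage (integer arithmetic, as in the question).
    static int toPercent(int raw) {
        return raw * 100 / 255;
    }

    // Combines two raw bytes into one 16-bit value: (high * 256) + low.
    static int toWord(int high, int low) {
        return (high * 256) + low;
    }
}
```

A method such as getAbsoluteThrottleB could then shrink to something like TokenReader.toPercent(TokenReader.nextInt(tokens)), and a fix to the parsing rule would only have to be made once.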
I think S. Lott's comment on your question probably hits the nail perfectly on the head - there's no point optimizing code until you're sure the code in question actually needs it. You'll most likely end up spending a lot of time and effort for next to no gain, otherwise.
I'll also second Support's answer, in that the difference in execution time between invoking a separate method and invoking the code inline is negligible (this was actually what I wanted to post, but he kinda beat me to it). It may even be zero, since the JVM's JIT compiler can inline small, frequently called methods.
There is one advantage of the separate method approach however - if you separate your data-processing code into a separate method, you could in theory achieve some increased performance by having that method called from a separate thread, thus decoupling your (possibly time-consuming) processing code from your other code.
I am curious about speed of execution and also speed of creation and destruction of objects.
Creation of objects in Java is fast enough that you shouldn't need to worry about it, except in extreme and unusual situations.
Destruction of objects in a modern Java implementation has zero cost ... unless you use finalizers. And there are very few situations that you should even think of using a finalizer.
Basically, what is faster or when does it become faster to use a single method to process objects versus using another method to process several like objects?
The difference is negligible relative to everything else that is going on.
As @S.Lott says: "Please don't micro-optimize". Focus on writing code that is simple, clear, precise and correct, and that uses the most appropriate algorithms. Only "micro" optimize when you have clear evidence of a critical bottleneck.
In the following piece of code we make a call listType.getDescription() twice:
for (ListType listType: this.listTypeManager.getSelectableListTypes())
{
if (listType.getDescription() != null)
{
children.add(new SelectItem( listType.getId() , listType.getDescription()));
}
}
I would tend to refactor the code to use a single variable:
for (ListType listType: this.listTypeManager.getSelectableListTypes())
{
String description = listType.getDescription();
if (description != null)
{
children.add(new SelectItem(listType.getId() ,description));
}
}
My understanding is the JVM is somehow optimized for the original code and especially nesting calls like children.add(new SelectItem(listType.getId(), listType.getDescription()));.
Comparing the two options, which one is the preferred method and why? That is in terms of memory footprint, performance, readability/ease, and others that don't come to my mind right now.
When does the latter code snippet become more advantageous over the former, that is, is there any (approximate) number of listType.getDescription() calls when using a temp local variable becomes more desirable, as listType.getDescription() always requires some stack operations to store the this object?
I'd nearly always prefer the local variable solution.
Memory footprint
A single local variable costs 4 or 8 bytes. It's a reference and there's no recursion, so let's ignore it.
Performance
If this is a simple getter, the JIT compiler can inline it, so there's practically no difference. If it's an expensive call which can't be optimized, storing the result manually makes it faster.
Readability
Follow the DRY principle. In your case it hardly matters, as the local variable name is about as long as the method call, but for anything more complicated it helps readability, as you don't have to find the 10 differences between the two expressions. If you know they're the same, make it clear by using the local variable.
Correctness
Imagine your SelectItem does not accept nulls and your program is multithreaded. The value of listType.getDescription() can change in the meantime and you're toast.
Debugging
Having a local variable containing an interesting value is an advantage.
The only thing to win by omitting the local variable is saving one line. So I'd do it only in cases when it really doesn't matter:
very short expression
no possible concurrent modification
simple private final getter
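The correctness point above can be illustrated with a small, deterministic sketch. The class ChangingListType is a hypothetical stand-in for ListType whose getter returns a value on the first call and null afterwards, which simulates another thread changing the description between the check and the use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the question's ListType; the description changes between calls.
final class ChangingListType {
    private int calls = 0;

    // Simulates a concurrent update: non-null on the first call, null afterwards.
    String getDescription() {
        return (calls++ == 0) ? "first" : null;
    }
}

final class LocalVariableDemo {
    private LocalVariableDemo() {}

    // Calls the getter twice: the null check and the stored value can disagree.
    static List<String> collectCallingTwice(ChangingListType type) {
        List<String> out = new ArrayList<>();
        if (type.getDescription() != null) {
            out.add(type.getDescription()); // second call, may yield null
        }
        return out;
    }

    // Reads the value once: the checked value is the value that gets stored.
    static List<String> collectWithLocal(ChangingListType type) {
        List<String> out = new ArrayList<>();
        String description = type.getDescription();
        if (description != null) {
            out.add(description);
        }
        return out;
    }
}
```

With this stand-in, the two-call version stores a null even though the check passed, while the local-variable version stores exactly the value it checked.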
I think option two is definitely better because it improves the readability and maintainability of your code, which is the most important thing here. This kind of micro-optimization won't really help you unless you are writing an application where every millisecond is important.
I'm not sure either is preferred. What I would prefer is clearly readable code over performant code, especially when the performance gain is negligible. In this case I suspect there's next to no noticeable difference (especially given the JVM's optimisations and code-rewriting capabilities).
In the context of imperative languages, the value returned by a function call cannot be memoized (See http://en.m.wikipedia.org/wiki/Memoization) because there is no guarantee that the function has no side effect. Accordingly, your strategy does indeed avoid a function call at the expense of allocating a temporary variable to store a reference to the value returned by the function call.
In addition to being slightly more efficient (which does not really matter unless the function is called many times in a loop), I would opt for your style due to better code readability.
I agree on everything. About the readability I'd like to add something:
I see lots of programmers doing things like:
if (item.getFirst().getSecond().getThird().getForth() == 1 ||
item.getFirst().getSecond().getThird().getForth() == 2 ||
item.getFirst().getSecond().getThird().getForth() == 3)
Or even worse:
item.getFirst().getSecond().getThird().setForth(item2.getFirst().getSecond().getThird().getForth())
If you are calling the same chain of 10 getters several times, please use an intermediate variable. It's just much easier to read and debug.
I would agree with the local variable approach for readability only if the local variable's name is self-documenting. Calling it "description" wouldn't be enough (which description?). Calling it "selectableListTypeDescription" would make it clear. I would throw in that the incremented variable in the for loop should be named "selectableListType" (especially if the "listTypeManager" has accessors for other ListTypes).
The other reason would be if there's no guarantee that this is single-threaded or that your list is immutable.
Let's say I have the following code:
private Rule getRuleFromResult(Fact result){
Rule output=null;
for (int i = 0; i < rules.size(); i++) {
if(rules.get(i).getRuleSize()==1){output=rules.get(i);return output;}
if(rules.get(i).getResultFact().getFactName().equals(result.getFactName())) output=rules.get(i);
}
return output;
}
Is it better to leave it as it is or to change it as follows:
private Rule getRuleFromResult(Fact result){
Rule output=null;
Rule current=null;
for (int i = 0; i < rules.size(); i++) {
current=rules.get(i);
if(current.getRuleSize()==1){return current;}
if(current.getResultFact().getFactName().equals(result.getFactName())) output=current;
}
return output;
}
When executing, the program goes through rules.get(i) each time as if it were the first time, and I think that in a much more advanced example (say, as in the second if) it takes more time and slows execution. Am I right?
Edit: To answer a few comments at once: I know that in this particular example the time gain will be super tiny, but it was just to get the general idea. I noticed I tend to have very long lines like object.get.set.change.compareTo... etc., and many of them repeat. In the scope of the whole code, that time gain can be significant.
Your instinct is correct--saving intermediate results in a variable rather than re-invoking a method multiple times is faster. Often the performance difference will be too small to measure, but there's an even better reason to do this--clarity. By saving the value into a variable, you make it clear that you are intending to use the same value everywhere; if you re-invoke the method multiple times, it's unclear if you are doing so because you are expecting it to return different results on different invocations. (For instance, list.size() will return a different result if you've added items to list in between calls.) Additionally, using an intermediate variable gives you an opportunity to name the value, which can make the intention of the code clearer.
The only difference between the two versions is that the first one may call rules.get(i) more than once per iteration when the rule size is not 1.
So the second version is a little bit faster in general, but you will not feel any difference if the list is not big.
It depends on the type of data structure that the "rules" object is. If it is a linked list, then yes, the second one is much faster, as it does not need to traverse the list again on each rules.get(i) call. If it is a data type that gives immediate access to element i (like an array or an ArrayList), then it is about the same.
In general, yes, it's probably a tiny bit faster (nanoseconds, I guess) the first time it is called. Later on it will probably be optimized by the JIT compiler either way.
But what you are doing is so-called premature optimization. You usually should not think about things that provide only an insignificant performance improvement.
What is more important is the readability to maintain the code later on.
You could even do more premature optimization like saving the length in a local variable, which is done by the for each loop internally. But again in 99% of cases it doesn't make sense to do it.
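For what it's worth, a for-each loop avoids the repeated rules.get(i) calls altogether, because it walks the list with an iterator regardless of whether the list is array-based or linked. The sketch below keeps the question's logic; the Fact and Rule classes are minimal hypothetical stand-ins, and RuleFinder is an invented holder for the rules list.

```java
import java.util.List;

// Minimal hypothetical stand-ins for the question's Fact and Rule classes.
final class Fact {
    private final String name;
    Fact(String name) { this.name = name; }
    String getFactName() { return name; }
}

final class Rule {
    private final int size;
    private final Fact result;
    Rule(int size, Fact result) { this.size = size; this.result = result; }
    int getRuleSize() { return size; }
    Fact getResultFact() { return result; }
}

final class RuleFinder {
    private final List<Rule> rules;
    RuleFinder(List<Rule> rules) { this.rules = rules; }

    // Same logic as the question: return the first rule of size 1 immediately,
    // otherwise the last rule whose result fact name matches.
    Rule getRuleFromResult(Fact result) {
        String wanted = result.getFactName(); // does not change in the loop, so read it once
        Rule output = null;
        for (Rule current : rules) {          // iterator-based, no repeated get(i)
            if (current.getRuleSize() == 1) {
                return current;
            }
            if (current.getResultFact().getFactName().equals(wanted)) {
                output = current;
            }
        }
        return output;
    }
}
```

The local names current and wanted also document that the same value is meant at every use, which is the readability argument made in the answers above.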
While writing code, I sometimes run into a situation where I need to choose between creating a separate method (the advantage is that I can use my own syntax later) and calling the existing, more complex method directly (which also means fewer lines of code).
Here are the examples using different programming languages (Objective-C and Java) to explain the question.
Objective-C example:
-(double) maxValueFinder: (NSMutableArray *)data {
double max = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
return max;
}
then later:
...
double max = [self maxValueFinder:data];
...
or just every time try to call:
...
double max = [[data valueForKeyPath:@"@max.intValue"] doubleValue];
...
Java example:
public static double maxFinder (ArrayList<Double> data) {
double maxValue = Collections.max(data);
return maxValue;
}
then later:
...
double max = maxFinder(data);
...
or just every time try to call:
...
double max = Collections.max(data);
...
or more complex case to make the point of my question sharper:
//using jsoup
public static Element getElement(Document content){
Element link = content.getElementsByTag("a").first();
return link;
}
or every time:
...
Element link = content.getElementsByTag("a").first();
...
Which approach costs fewer resources (performance, memory), or is it the same?
It absolutely doesn't matter. At least in your Java case you're uselessly recreating existing functionality, which is ridiculous.
You should first see if the functionality is contained in the standard library, then see if existing well known libraries have it, and only after that should you consider writing implementations yourself (especially for more complex functionality).
Performance has nothing to do with your question, except in the sense that the more time you spend on recreating existing functionality, the less time you have left for actual new code (therefore lowering your programming performance).
As for creating wrapper methods, that can be useful in some cases, especially if the actual method calls are often chained and you find yourself having more and more of those in the code. But there's a delicate difference between code clarity and writing excessive code.
public void parseHtml() {
parseFirstPart();
parseSecondPart();
parseThirdPart();
}
If we assume that each parse method only contains 1 or maybe 2 method calls, then adding these additional methods is most likely useless, since the same thing can be achieved by proper commenting. If the parse methods contain a lot of calls, it makes sense to extract methods out of them. There's no rule about it; it's a skill you learn while you program (and of course it depends a lot on what you view as beautiful code).
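As one hedged illustration of a wrapper that does earn its place: a bare Collections.max call throws NoSuchElementException on an empty collection, so a wrapper that defines the empty case adds behaviour rather than just renaming the call. The class Stats and the method maxOrDefault are hypothetical names for this sketch.

```java
import java.util.Collections;
import java.util.List;

final class Stats {
    private Stats() {}

    // Hypothetical wrapper: unlike a bare Collections.max call, it defines what
    // happens for an empty list instead of throwing NoSuchElementException.
    static double maxOrDefault(List<Double> data, double fallback) {
        return data.isEmpty() ? fallback : Collections.max(data);
    }
}
```

A wrapper that only forwards to Collections.max, like maxFinder in the question, adds a name but no behaviour, which is the case the answer above calls useless.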
It's absolutely useless to recreate existing functionality,
because this function is already implemented in the library.
As for performance, in both cases you are executing the same line:
double maxValue = Collections.max(data);
Performance does not matter in either case because you are running the same code.
Let's say we are trying to build a document scanner class in Java that takes one input argument, the log path (e.g. C:\document\text1.txt). Which of the following implementations would you prefer based on performance/memory/modularity?
ArrayList<String> fileListArray = new ArrayList<String>();
fileListArray.add("C:\\document\\text1.txt");
fileListArray.add("C:\\document\\text2.txt");
.
.
.
//Implementation A
for(int i =0, j = fileListArray.size(); i < j; i++){
MyDocumentScanner ds = new MyDocumentScanner(fileListArray.get(i));
ds.scanDocument();
ds.resultOutput();
}
//Implementation B
MyDocumentScanner ds = new MyDocumentScanner();
for(int i=0, j=fileListArray.size(); i < j; i++){
ds.setDocPath(fileListArray.get(i));
ds.scanDocument();
ds.resultOutput();
}
Personally I would prefer A due to its encapsulation, but it seems like more memory usage due to creation of multiple instances. I'm curious if there is an answer to this, or it is another "that depends on the situation/circumstances" dilemma?
Although this is obviously opinion-based, I will try to give an answer with my opinion.
Your approach A is far better. Your document scanner obviously handles a file. That should be set at construction time and be saved in an instance field, so every method can refer to this field. Moreover, the constructor can do some checks on the file reference (null check, existence, ...).
Your approach B has two very serious disadvantages:
After constructing a document scanner, clients could easily call all of the methods. If no file was set before, you must handle that "illegal state" with maybe an IllegalStateException. Thus, this approach increases code and complexity of that class.
There seems to be a series of method calls that a client should or can perform. It's easy to call the file setting method again in the middle of such a series with a completely other file, breaking the whole scan facility. To avoid this, your setter (for the file) should remember whether a file was already set. And that nearly automatically leads to approach A.
Regarding the creation of objects: Modern JVMs are really very fast at creating objects. Usually, there is no measurable performance overhead for that. The processing time (here: the scan) usually is much higher.
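A minimal sketch of approach A as described above, assuming the path is passed as a String. The class name SingleDocumentScanner is hypothetical, and the scanning itself is replaced by a placeholder that only reports the path.

```java
import java.util.Objects;

// Hypothetical sketch of approach A; the real scanning work is replaced by a placeholder.
final class SingleDocumentScanner {
    private final String docPath; // set once in the constructor, never changed

    SingleDocumentScanner(String docPath) {
        // Validate at construction time so no method ever sees a missing path.
        this.docPath = Objects.requireNonNull(docPath, "docPath must not be null");
        if (docPath.isEmpty()) {
            throw new IllegalArgumentException("docPath must not be empty");
        }
    }

    String getDocPath() {
        return docPath;
    }

    // Placeholder for the real scan; it only reports which path it would scan.
    String scanDocument() {
        return "scanned " + docPath;
    }
}
```

Because there is no setter, the "illegal state" and "path changed mid-scan" cases from the list above cannot occur, and each loop iteration simply creates a fresh, short-lived instance.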
If you don't need multiple instances of DocumentScanner to co-exist, I see no point in creating a new instance in each iteration of the loop. It just creates work to the garbage collector, which has to free each of those instances.
If the length of the array is small, it doesn't make much difference which implementation you choose, but for large arrays, implementation B is more efficient, both in terms of memory (fewer instances created that the GC hasn't freed yet) and CPU (less work for the GC).
Are you implementing DocumentScanner or using an existing class?
If the latter, and it was designed for being able to parse multiple documents in a row, you can just reuse the object as in variant B.
However, if you are designing DocumentScanner, I would recommend to design it such that it handles a single document and does not even have a setDocPath method. This leads to less mutable state in that class and thus makes its design much easier. Also using an instance of the class becomes less error-prone.
As for performance, there won't be a measurable difference unless instantiating a DocumentScanner is doing a lot of work (like instantiating many other objects, too). Instantiating and freeing objects in Java is pretty cheap if they are used only for a short time due to the generational garbage collector.
I'm extending and improving a Java application which also does long running searches with a small DSL (in detail it is used for Model-Finding, yes it's in general NP-Complete).
During this search I want to show a small progress bar on the console. Because of the generic structure of the DSL I cannot calculate the overall search space size. Therefore I can only output the progress of the first "backtracking" statement.
Now the question:
I can use a flag for each backtracking statement to indicate that this statement should report the progress. When evaluating the statement I can check the flag with an if-statement:
public class EvalStatement {
boolean reportProgress;
public EvalStatement(boolean report) {
reportProgress = report;
}
public void evaluate() {
int progress = 0;
while(someCondition) {
// do something
// maybe call other statement (tree structure)
if (reportProgress) {
// This is only executed by the root node, i. e.,
// the condition is only true for about 30 times whereas
// it is false millions or billions of times
++progress;
reportProgress(progress);
}
}
}
}
I can also use two different classes:
A class which does nothing
A subclass that is doing the output
This would look like this:
public class EvalStatement {
private ProgressWriter out;
public EvalStatement(boolean report) {
if (report)
out = new ProgressWriterOut();
else
out = ProgressWriter.instance;
}
public void evaluate() {
while(someCondition) {
// do something
// maybe call other statement (tree structure)
out.reportProgress(progress);
}
}
}
public class ProgressWriter {
public static ProgressWriter instance = new ProgressWriter();
public void reportProgress(int progress) {}
}
public class ProgressWriterOut extends ProgressWriter {
int progress = 0;
public void reportProgress(int progress) {
// This is only executed by the root node, i. e.,
// the condition is only true for about 30 times whereas
// it is false millions or billions of times
++this.progress; // increment the field, not the shadowing parameter
// Put progress anywhere, e. g.,
System.out.print('#');
}
}
And now the actual question(s):
Is the Java lookup of the method to call faster than the if statement?
In addition, would an interface and two independent classes be faster?
I know Log4J recommends putting an if-statement around log calls, but I think the main reason is the construction of the parameters, especially strings. I have only primitive types.
EDIT:
I clarified the code a little bit (what is called often... the usage of the singleton is irrelevant here).
Further, I made two long-term runs of the search, in which the if-statement or, respectively, the method call was hit 1.840.306.311 times on a machine doing nothing else:
The if version took 10h 6min 13s (50.343 "hits" per second)
The overriding version took 10h 9min 15s (50.595 "hits" per second)
I would say, this does not give a real answer, because the 0,5% difference is in the measuring tolerance.
My conclusion: They more or less behave the same, but the overriding approach could be faster in the long-term as guessed by Kane in the answers.
I think this is the textbook definition of over-optimization. You're not really even sure you have a performance problem. Unless you're making MILLIONS of calls across that section, it won't even show up in your hotspot reports if you profile it. If statements and method calls take on the order of nanoseconds to execute. So for a difference between them, you are talking about saving 1-10ns at the most. For that to even be perceived by a human as being slow, it needs to be on the order of 100 milliseconds, and that's if the user is even paying attention, like actively clicking, etc. If they're watching a progress bar, they aren't even going to notice it.
Say we wanted to see if that added even 1s extra time, and you found one of those could save 10 ns (it's probably like a savings of 1-4ns). So that would mean you'd need that section to be called 100,000,000 times in order to save 1s. And I can guarantee you if you have 100 Million calls being made you'll find 10 other areas that are more expensive than the choice of if or polymorphism there. Seems sorta silly to debate the merits of 10ns on the off chance you might save 1s doesn't it?
I'd be more concerned about your usage of a singleton than performance.
I wouldn't worry about this - the cost is very small, output to the screen or computation would be much slower.
The only way to really answer this question is to try both and profile the code under normal circumstances. There are lots of variables.
That said, if I had to guess, I would say the following:
In general, an if statement compiles down to less bytecode than a method call, but with a JIT compiler optimizing, your method call may get inlined, which leaves no call at all in the compiled code. Also, with branch prediction of the if-statement, the cost is minimal.
Again, in general, using the interfaces will be faster than testing if you should report every time the loop is run. Over the long run, the cost of loading two classes, testing once, and instantiating one, is going to be less than running a particular test eleventy bajillion times. Over the long term.
Again, the better way to do this would be to profile the code on real world examples both ways, maybe even report back your results. However, I have a hard time seeing this being the performance bottleneck for your application... your time is probably better spent optimizing elsewhere if speed is a concern.
Putting anything on the monitor is orders of magnitude slower than either choice. If you really got a performance problem there (which I doubt) you'd need to reduce the number of calls to print.
I would assume that method lookup is faster than evaluating if(). In fact, also the version with the if needs a method lookup.
And if you really want to squeeze out every bit of performance, use private final methods in your ProgressWriter classes, as this can allow the JVM to inline the method, so there would be no method lookup and not even a method call in the machine code derived from the bytecode after it is finally compiled.
But, probably, they are both rather close in performance. I would suggest to test/profile, and then concentrate on the real performance issues.
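For reference, the "interface and two independent classes" variant asked about in the question could be sketched as follows. All names here (ProgressListener, NoProgress, ConsoleProgress, Evaluator) are hypothetical, and the loop is reduced to a fixed iteration count. Whether the JIT inlines the call depends on what it observes at the call site at runtime, so this is something to measure rather than assume.

```java
// Hypothetical names throughout; a sketch of the interface-based variant.
interface ProgressListener {
    void reportProgress();
}

// Does nothing; shared by every statement that should stay silent.
final class NoProgress implements ProgressListener {
    static final NoProgress INSTANCE = new NoProgress();
    private NoProgress() {}
    @Override
    public void reportProgress() { /* intentionally empty */ }
}

// Counts the steps and prints one '#' per step.
final class ConsoleProgress implements ProgressListener {
    private int progress = 0;
    @Override
    public void reportProgress() {
        ++progress;
        System.out.print('#');
    }
    int getProgress() {
        return progress;
    }
}

final class Evaluator {
    private final ProgressListener out;
    Evaluator(ProgressListener out) {
        this.out = out;
    }

    // The loop body contains no if; the choice was made once, at construction time.
    void evaluate(int iterations) {
        for (int i = 0; i < iterations; i++) {
            out.reportProgress();
        }
    }
}
```

The root statement would be built with new Evaluator(new ConsoleProgress()) and every other statement with new Evaluator(NoProgress.INSTANCE), so the counter lives in the reporter and no parameter needs to be passed.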