I'm using Collections.sort to sort an ArrayList of objects, and I want to see if there is a more efficient compareTo method for what I'm trying to do.
Here's the method:
@Override
public int compareTo(Song s) {
if (runningTime > s.runningTime) {
return -1;
} else if (runningTime < s.runningTime) {
return 1;
}
int lastCmp = title.compareTo(s.title);
return (lastCmp != 0 ? lastCmp : composer.compareTo(s.composer));
}
If anyone could suggest a more efficient method (i.e. quicker runtime) I would be very grateful.
Just like MeBigFatGuy said, any improvement is insignificant, but I think you can still clean up the code a bit by removing an unnecessary if-else branch. My two cents.
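@Override
public int compareTo(Song s) {
if (runningTime != s.runningTime) {
// Longest running time first. Integer.compare avoids the overflow that
// plain subtraction can cause (this assumes runningTime is an int).
return Integer.compare(s.runningTime, runningTime);
}
int lastCmp = title.compareTo(s.title);
return (lastCmp != 0 ? lastCmp : composer.compareTo(s.composer));
}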
Assuming that the priority of song ordering is fixed (running time; title if running times are the same; composer if running times and titles are the same), then there's not much you can do better. If the priorities are not fixed, then perhaps testing on composer before title may speed things up a bit; it depends on your actual data. I'd keep the test on running times first, because that is always going to be faster than string comparisons.
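If the priority order is fixed as described above, the same comparison can also be written with Comparator chaining. This is only a sketch: the Song class here is hypothetical, and it assumes runningTime is an int and title and composer are Strings. It is not expected to run faster than the hand-written version; it just states the ordering in one place.

```java
import java.util.Comparator;

// Hypothetical Song class; the field types are assumptions, not taken from the question.
class Song implements Comparable<Song> {
    final int runningTime;
    final String title;
    final String composer;

    // Built once and reused: longest running time first, then title, then composer.
    private static final Comparator<Song> ORDER =
            Comparator.comparingInt((Song s) -> s.runningTime).reversed()
                      .thenComparing(s -> s.title)
                      .thenComparing(s -> s.composer);

    Song(int runningTime, String title, String composer) {
        this.runningTime = runningTime;
        this.title = title;
        this.composer = composer;
    }

    @Override
    public int compareTo(Song other) {
        return ORDER.compare(this, other);
    }
}
```

The cheap int comparison still runs first, so the string comparisons are only reached when running times are equal.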
Looks fine to me as well. If you're running into performance problems, you might check whether you're sorting too often. I'm not sure whether Collections.sort is sensitive to this by now, but you might gain something if you don't re-sort an already sorted list after inserting only a few songs.
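One way to avoid re-sorting after each insertion is to put the new element directly at its sorted position. The sketch below uses Collections.binarySearch for that; the class and method names (SortedInsert, insertSorted) are made up for illustration, and it assumes the list is already sorted by the elements' natural ordering.

```java
import java.util.Collections;
import java.util.List;

class SortedInsert {
    // Inserts item into an already sorted list so that the list stays sorted.
    static <T extends Comparable<? super T>> void insertSorted(List<T> sorted, T item) {
        int pos = Collections.binarySearch(sorted, item);
        if (pos < 0) {
            // When the key is not found, binarySearch returns -(insertion point) - 1.
            pos = -(pos + 1);
        }
        sorted.add(pos, item);
    }
}
```

For an ArrayList the insertion itself still shifts elements, so this helps most when only a few songs are added between reads.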
Depending on how long it takes to read a property value, it might be a tiny bit quicker to store runningTime and s.runningTime into local variables first. So on average, instead of reading them 1.5 or more times per call, you'd only read them once per call.
I'm relatively new to Android development and Java. I made an Android app for my own personal use because of how inconvenient it has become to interchange new part numbers with old part numbers.
Long story short, the part numbers in my computer were updated, but they won't change the labels on any of the old part numbers 'until supplies last' (a very long time). The part numbers are in NO LOGICAL ORDER, and are alphanumeric. So they have to be strings (I think? I'm from a mostly C++ background). We were handed a packet of 10 pages of part numbers to search through every time we have an order to interchange.
So I made an app to interchange the numbers on the computer with the numbers on the label. But it is a huge, long list of if/else statements. For example:
else if(str.equals("EUR31")) { return "D102 B"; }
else if(str.equals("EUR228")) { return "D222 B"; }
else if(str.equals("EUR1072")) { return "D311 B"; }
else if(str.equals("ACT482")) { return "D646 B"; }
else if(str.equals("ACT325")) { return "D649 B"; }
else if(str.equals("EUR394")) { return "D712 B"; }
else if(str.equals("ACT526")) { return "D723 B"; }
else if(str.equals("EUR391")) { return "D729 B"; }
The question: Is there a way to optimize this so it's not 300 lines of if/else statements? I have looked at hash tables, etc., but if it's possible, I'm not sure how to implement it properly. This is merely for myself; it works fast and without any bugs on my phone, I'm just looking to improve it.
For this kind of lookup, a hash table (HashMap in Java) is the best fit because it gives O(1) lookup time on average. For your question, an implementation with a hash table would look like this:
Create a Map like:
Map<String,String> strMap = new HashMap<String,String>();
// now put your strings into strMap like
strMap.put("your_key", "yourValue");
To retrieve the value corresponding to a key:
if(strMap.containsKey(str)){// your str
return strMap.get(str);
}
Adapt it to your needs (for example, inside loops).
Example:
As per your question, if you put all your strings into the map one by one, like:
strMap.put("EUR31","D102 B");
strMap.put("EUR228", "D222 B");
strMap.put("EUR1072","D311 B");
..........
..........
then you can retrieve data from the map like this:
String str = "EUR31";
if(strMap.containsKey(str)){
return strMap.get(str);
}
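Putting the pieces above together, a minimal self-contained sketch could look like the following. The class, field, and method names (PartNumbers, LABELS, labelFor) are invented for illustration; only the four sample pairs come from the question. The map is filled once in a static initializer, and a single get() replaces the containsKey/get pair.

```java
import java.util.HashMap;
import java.util.Map;

class PartNumbers {
    // Computer part number -> label part number, filled once when the class loads.
    private static final Map<String, String> LABELS = new HashMap<>();

    static {
        LABELS.put("EUR31", "D102 B");
        LABELS.put("EUR228", "D222 B");
        LABELS.put("EUR1072", "D311 B");
        LABELS.put("ACT482", "D646 B");
    }

    // Returns the label part number, or null when the key is unknown.
    static String labelFor(String computerPartNumber) {
        return LABELS.get(computerPartNumber);
    }
}
```

With 300 pairs, the put calls could also be replaced by reading the pairs from a file, which keeps the data out of the code.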
I have a program that processes rather large amounts of data. It compares one static ArrayList of strings to another, checking whether each string is contained in it.
But after processing, let's say, 40k+ strings, it begins to fail on the check. By fail I mean it stops recognizing that a string already exists in the other list.
Is there a reason for this, or is the ArrayList simply too large?
Thanks
EDIT
for (int i = 0; i < arraylist1.size(); i++) {
boolean enter = true;
for (int x = 0; x < arraylist2.size() && enter; x++) {
if (arraylist1.get(i).getString().matches(arraylist2.get(x))) {
enter = false;
}
}
if (enter) {
//do something
}
}
EDIT
Off-topic to the question but using .equals() instead of .matches() improves the performance MASSIVELY.
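A likely reason the two calls behave differently: String.matches() treats its argument as a regular expression, while equals() compares the characters literally. The small demo below (the class name is arbitrary) shows a string that equals itself but does not match itself as a regex, which is one way a check like the one above can start "missing" strings that are actually present.

```java
class MatchesVsEquals {
    public static void main(String[] args) {
        String s = "a+b";
        // Literal comparison: same characters, so this prints true.
        System.out.println(s.equals("a+b"));
        // Regex comparison: "a+b" means one or more 'a' followed by 'b',
        // which the text "a+b" does not satisfy, so this prints false.
        System.out.println(s.matches("a+b"));
        // The regex does match "aab", so this prints true.
        System.out.println("aab".matches("a+b"));
    }
}
```

matches() also compiles the pattern on every call, which is consistent with the large speed difference reported above.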
The simple answer is: no.
ArrayLists do not lose what is in them.
Your symptoms could be caused by a number of things, including threading/synchronization issues, subtle differences in the string, etc.
You should consider using a HashSet anyway though. It will make the "contains" check much much faster.
Using a HashSet, all your code above becomes:
List<String> list = new ArrayList<>(); // the strings to check
Set<String> set = new HashSet<>(); // the strings to check against
for (String str: list) {
if (!set.contains(str)) {
//do something
}
}
Much simpler and incredibly faster.
If you do need to keep both collections as List, the same code still works, because List also has a contains() method. The API doesn't change, but the check becomes a linear scan and performance will be much worse.
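As a sketch of how this could be wired up when the data arrives as two lists: build the HashSet once from the list being searched, then scan the other list. The names here (MissingStrings, notIn) are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MissingStrings {
    // Returns the entries of candidates that do not occur in known.
    static List<String> notIn(List<String> candidates, List<String> known) {
        // Built once; each contains() call is then O(1) on average.
        Set<String> knownSet = new HashSet<>(known);
        List<String> missing = new ArrayList<>();
        for (String s : candidates) {
            if (!knownSet.contains(s)) {
                missing.add(s);
            }
        }
        return missing;
    }
}
```

This replaces the nested loop with one pass over each list, and it compares strings with equals() semantics rather than as regular expressions.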
For the following piece of code, SonarQube computes the method's cyclomatic complexity as 9:
String foo() {
if (cond1) return a;
if (cond2) return b;
if (cond3) return c;
if (cond4) return d;
return e;
}
I understand that, per the rules for computation (http://docs.sonarqube.org/display/SONAR/Metrics+-+Complexity), the complexity of 9 is correct.
So the complexity of the method is 4 (if) + 4 (return) + 1 (method) = 9.
This complexity can be reduced if I have a single exit point:
String foo() {
String temp;
if (cond1) {
temp = a;
} else if (cond2) {
temp = b;
} else if (cond3) {
temp = c;
} else if (cond4) {
temp = d;
} else {
temp = e;
}
return temp;
}
I believe this code is more cluttered and less readable than the previous version, and I feel that methods which return on guard conditions are better programming practice. So is there a good reason why the return statement is counted when computing cyclomatic complexity? Can the logic for the computation be changed so that it doesn't promote a single exit point?
I agree you should use some common sense and go with the code which you believe is simplest.
BTW, you can simplify your code and have just one return if you use ? :
String foo() {
return cond1 ? a :
cond2 ? b :
cond3 ? c :
cond4 ? d : e;
}
"So is there a good reason why return statement is considered for computation of cyclomatic complexity? Can the logic for computation be changed so that it doesn't promote single exit point."
In your example, having multiple returns doesn't add to the complexity and, as @Peter Lawrey says, you should employ common sense.
Does this mean that multiple return statements never add to complexity and that the rule should be removed? I don't think so. It would be very easy to come up with an example of a method which is hard to read because of multiple return statements. Just imagine a 100-line method with 4 different return statements sprinkled throughout. That is the kind of issue this rule tries to catch.
This is a known problem with cyclomatic complexity.
Also there is good reason to think that cyclomatic complexity is useless. It correlates strongly with SLOC and only weakly with actual bugs. In fact SLOC is just as good a predictor of defects as cyclomatic complexity. The same goes for most other complexity metrics.
See http://www.leshatton.org/Documents/TAIC2008-29-08-2008.pdf, starting around slide 16.
Other answers have made good points about the computation involved.
I'd like to point out that your assertion that the code is less readable is false, because in one instance you have braces, and in the other you don't.
String foo() {
String output = e;
if (cond1) output = a;
else if (cond2) output = b;
else if (cond3) output = c;
else if (cond4) output = d;
return output;
}
This is as readable as the example you gave with return statements.
Whether or not you allow braceless if statements is a question of style that you should probably be consistent with across all your code.
The more important issue that cyclomatic complexity does address is side effects. If computing the value of cond1, cond2, etc. has side effects (that is, if they are stateful method calls rather than fields, as they are in this case), then the conceptual complexity of the code is much higher when you might return early than when you can't.
Assume that we have a given interface:
public interface StateKeeper {
public abstract void negateWithoutCheck();
public abstract void negateWithCheck();
}
and following implementations:
class StateKeeperForPrimitives implements StateKeeper {
private boolean b = true;
public void negateWithCheck() {
if (b == true) {
this.b = false;
}
}
public void negateWithoutCheck() {
this.b = false;
}
}
class StateKeeperForObjects implements StateKeeper {
private Boolean b = true;
@Override
public void negateWithCheck() {
if (b == true) {
this.b = false;
}
}
@Override
public void negateWithoutCheck() {
this.b = false;
}
}
Moreover, assume that the negate*Check() methods can be called one or more times, and that it is hard to say what the upper bound on the number of calls is.
The question is which method in both implementations is 'better' in terms of execution speed, garbage collection, memory allocation, etc.: negateWithCheck or negateWithoutCheck?
Does the answer depend on which of the two proposed implementations we use, or does it not matter?
Does the answer depend on the estimated number of calls? For what number of calls is it better to use one method or the other?
There might be a slight performance benefit in using the one with the check. I highly doubt that it matters in any real life application.
Premature optimization is the root of all evil (Donald Knuth)
You could measure the difference between the two. Let me emphasize that these kinds of things are notoriously difficult to measure reliably.
Here is a simple-minded way to do this. You can hope for performance benefits if the check recognizes that the value doesn't have to be changed, saving you an expensive write into the memory. So I have changed your code accordingly.
interface StateKeeper {
public abstract void negateWithoutCheck();
public abstract void negateWithCheck();
}
class StateKeeperForPrimitives implements StateKeeper {
private boolean b = true;
public void negateWithCheck() {
if (b == false) {
this.b = true;
}
}
public void negateWithoutCheck() {
this.b = true;
}
}
class StateKeeperForObjects implements StateKeeper {
private Boolean b = true;
public void negateWithCheck() {
if (b == false) {
this.b = true;
}
}
public void negateWithoutCheck() {
this.b = true;
}
}
public class Main {
public static void main(String args[]) {
StateKeeper[] array = new StateKeeper[10_000_000];
for (int i=0; i<array.length; ++i)
//array[i] = new StateKeeperForObjects();
array[i] = new StateKeeperForPrimitives();
long start = System.nanoTime();
for (StateKeeper e : array)
e.negateWithCheck();
//e.negateWithoutCheck();
long end = System.nanoTime();
System.err.println("Time in milliseconds: "+((end-start)/1000000));
}
}
I get the following:
             check    no check
primitive    17ms     24ms
Object       21ms     24ms
I didn't find any performance penalty for the check in the opposite case, where the check is always superfluous because the value always has to be changed.
Two things: (1) These timings are unreliable. (2) This benchmark is far from any real life application; I had to make an array of 10 million elements to actually see something.
I would simply pick the function with no check. I highly doubt that in any real application you would get a measurable performance benefit from the function that has the check, and the check is error-prone and harder to read.
Short answer: the version without the check will always be faster.
An assignment takes a lot less computation time than a comparison. Therefore: an IF statement is always slower than an assignment.
When comparing 2 variables, your CPU will fetch the first variable, fetch the second variable, compare those 2 and store the result into a temporary register. That's 2 fetches, 1 compare and 1 store.
When you assign a value, your CPU will fetch the value on the right hand of the '=' and store it into the memory. That's 1 fetch and 1 store.
In general, if you need to set some state, just set the state. If, on the other hand, you have to do something more (like log the change, notify someone about the change, etc.), then you should first inspect the old value.
But, in the case when methods like the ones you provided are called very intensely, there may be some performance difference in checking vs non-checking (whether the new value is different). Possible outcomes are:
1-a) check returns false
1-b) check returns true, value is assigned
2) value is assigned without check
As far as I know, writing is always slower than reading (all the way down to register level), so the fastest outcome is 1-a. If in your case the value is usually left unchanged ('more than 50%' logic is just not good enough; the exact percentage has to be figured out empirically), then you should go with checking, as this eliminates a redundant write operation (the value assignment). If, on the other hand, the value is different more often than not, assign it without checking.
You should test your concrete cases, do some profiling, and based on the result determine the best implementation. There is no general "best way" for this case (apart from "just set the state").
As for boolean vs Boolean here, I would say (off the top of my head) that there should be no performance difference.
Only today I've seen a few answers and comments repeating that
Premature optimization is the root of all evil
Well obviously one if statement more is one thing more to do, but... it doesn't really matter.
And garbage collection and memory allocation... not an issue here.
I would generally consider negateWithCheck to be slightly slower due to there always being a comparison. Also notice that in StateKeeperForObjects you are introducing some autoboxing: 'true' and 'false' are primitive boolean values.
Assuming you fix StateKeeperForObjects to use objects throughout, then there is potentially a difference between the two implementations, but most likely not a noticeable one.
The speed will depend slightly on the number of calls, but in general the speed should be considered to be the same whether you call it once or many times (ignoring secondary effects such as caching, jit, etc).
It seems to me, a better question is whether or not the performance difference is noticeable. I work on a scientific project that involves millions of numerical computations done in parallel. We started off using Objects (e.g. Integer, Double) and had less than desirable performance, both in terms of memory and speed. When we switched all of our computations to primitives (e.g. int, double) and went over the code to make sure we were not introducing anything funky through autoboxing, we saw a huge performance increase (both memory and speed).
I am a huge fan of avoiding premature optimization, unless it is something that is "simple" to implement. Just be wary of the consequences. For example, do you have to represent null values in your data model? If so, how do you do that using a primitive? Doubles can be done easily with NaN, but what about Booleans?
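To make the autoboxing and null points above concrete, here is a small sketch with a hypothetical class (BoxedFlag) that uses a Boolean field where null stands for "unknown". Comparing a Boolean with the primitive literal true unboxes it, which throws a NullPointerException when the field is null; Boolean.TRUE.equals(...) avoids the unboxing.

```java
class BoxedFlag {
    // null means "unknown" in this sketch; a primitive boolean could not express that.
    private Boolean flag;

    void set(Boolean value) {
        flag = value;
    }

    // Null-safe: no unboxing happens, and an unknown flag counts as not set.
    boolean isSet() {
        return Boolean.TRUE.equals(flag);
    }

    // Unboxes flag for the comparison; throws NullPointerException when flag is null.
    boolean isSetUnsafe() {
        return flag == true;
    }
}
```

Whether a third "unknown" state is needed at all is a data-model decision; if it is not, the primitive boolean avoids both the boxing and the null case.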
negateWithoutCheck() is preferable because, if we count the operations per call, negateWithoutCheck() performs only one, i.e. this.b = false;, whereas negateWithCheck() performs one extra (the comparison) on top of that.
This is a simplified example. I have this enum declaration as follows:
public enum ELogLevel {
None,
Debug,
Info,
Error
}
I have this code in another class:
if ((CLog._logLevel == ELogLevel.Info) || (CLog._logLevel == ELogLevel.Debug) || (CLog._logLevel == ELogLevel.Error)) {
System.out.println(formatMessage(message));
}
My question is whether there is a way to shorten the test. Ideally I would like something to the tune of (this is borrowed from Pascal/Delphi):
if (CLog._logLevel in [ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error])
Instead of the long list of comparisons. Is there such a thing in Java, or maybe a way to achieve it? I am using a trivial example, my intention is to find out if there is a pattern so I can do these types of tests with enum value lists of many more elements.
EDIT: It looks like EnumSet is the closest thing to what I want. The naive way of implementing it is via something like:
if (EnumSet.of(ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error).contains(CLog._logLevel))
But under benchmarking, this performs two orders of magnitude slower than the long if/then statement, I guess because the EnumSet is being instantiated every time it runs. This is a problem only for code that runs very often, and even then it's a very minor problem, since over 100M iterations we are talking about 7ms vs 450ms on my box; a very minimal amount of time either way.
What I settled on for code that runs very often is to pre-instantiate the EnumSet in a static variable, and use that instance in the loop, which cuts down the runtime back down to a much more palatable 9ms over 100M iterations.
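A sketch of that pre-instantiated variant is shown below. The enum and the _logLevel field mirror the question; the PRINTABLE constant and the shouldPrint() method are names made up for this example. The set is created once when the class loads and is never modified afterwards.

```java
import java.util.EnumSet;
import java.util.Set;

enum ELogLevel { None, Debug, Info, Error }

class CLog {
    static ELogLevel _logLevel = ELogLevel.Info;

    // Created once at class-load time instead of on every check.
    private static final Set<ELogLevel> PRINTABLE =
            EnumSet.of(ELogLevel.Info, ELogLevel.Debug, ELogLevel.Error);

    static boolean shouldPrint() {
        return PRINTABLE.contains(_logLevel);
    }
}
```

Because the field only exposes the Set interface and nothing writes to it after initialization, sharing one instance across calls is safe in this sketch.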
So it looks like we have a winner! Thanks guys for your quick replies.
What you want is an EnumSet:
http://docs.oracle.com/javase/1.5.0/docs/api/java/util/EnumSet.html
Put the elements you want to test for in the set, and then use the Set method contains().
import java.util.EnumSet;
public class EnumSetExample
{
enum Level { NONE, DEBUG, INFO, ERROR };
public static void main(String[] args)
{
EnumSet<Level> subset = EnumSet.of(Level.DEBUG, Level.INFO);
for(Level currentLevel : EnumSet.allOf(Level.class))
{
if (subset.contains(currentLevel))
{
System.out.println("we have " + currentLevel.toString());
}
else
{
System.out.println("we don't have " + currentLevel.toString());
}
}
}
}
There's no way to do it concisely in Java. The closest you can come is to dump the values in a set and call contains(). An EnumSet is probably most efficient in your case. You can shorten the set initialization a little using the double brace idiom, though this has the drawback of creating a new anonymous inner class at each place you use it, and hence increases memory usage slightly.
In general, logging levels are implemented as integers:
public static final int LEVEL_NONE = 0;
public static final int LEVEL_DEBUG = 1;
public static final int LEVEL_INFO = 2;
public static final int LEVEL_ERROR = 3;
and then you can test for severity using simple comparisons:
if (CLog._logLevel >= LEVEL_DEBUG) {
// log
}
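The same severity test can be kept with an enum instead of int constants, because Java enums are comparable in declaration order. The following is a sketch under that assumption; the names LogSeverity, SeverityCheck, and atLeast are hypothetical, and it only works if the constants are declared from least to most severe.

```java
// Declaration order defines the ordering used by compareTo().
enum LogSeverity { NONE, DEBUG, INFO, ERROR }

class SeverityCheck {
    // True when current is at least as severe as threshold.
    static boolean atLeast(LogSeverity current, LogSeverity threshold) {
        return current.compareTo(threshold) >= 0;
    }
}
```

For example, SeverityCheck.atLeast(level, LogSeverity.DEBUG) covers DEBUG, INFO, and ERROR in one comparison, at the cost of making the declaration order significant.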
You could use a list of the required levels (Lists.newArrayList comes from Guava), i.e.:
List<ELogLevel> levels = Lists.newArrayList(ELogLevel.Info,
ELogLevel.Debug, ELogLevel.Error);
if (levels.contains(CLog._logLevel)) {
//
}