Is there any difference in optimization between integer and string comparison?

Is there any difference in optimization between integer and string comparison? - java

I'm trying to make a game and I have a Selection class that holds a string named str in it. I apply the following code to my selection objects every 17 milliseconds.
if(s.Str == "Upgrade") {
}else if(s.Str == "Siege") {
}else if(s.Str == "Recruit") {
}
In other words, these selection objects will do different jobs according to their types(upgrade,siege etc...). I am using str variable elsewhere. my question is that:
Would it be more optimized if I assign the types to an integer when I first create the objects?
if(s.type == 1) {
}else if(s.type == 2) {
}else if(s.type == 3) {
}
This would make me write extra lines of code(Since I have to separate objects by type when I first create) and make the code more difficult to understand, but would there be a difference between comparing integers rather than comparing strings?

If you compare strings >that< way, there is probably no performance difference.
However, that is the WRONG WAY to compare strings. The correct way is to use the equals(Object) method. For example.
if (s.Str.equals("Upgrade")) {
Read this:
How do I compare strings in Java?
I apply the following code to my selection objects every 17 milliseconds.
The time that it will take to test two strings for equality is probably in the order of tens of NANOseconds. So ... basically ... the difference between comparing strings or integers is irrelevant.
This illustrates why premature optimization is a bad thing. You should only optimize code when you know that it is going to be worthwhile to spend your time on it; i.e. when you know there is going to be a pay-off.
So should I optimize after I write and finish all the code? Does 'not doing premature optimization' means that?
No it doesn't exactly mean that. (Well .. not to me anyway.) What it means to me is that you shouldn't optimize until:
you have a working program whose performance you can measure,
you have determined specific (quantifiable) performance criteria,
you have a means of measuring the performance; e.g. an appropriate benchmarks involving real or realistic use-cases, and
you have good a means of identifying the actual performance hotspots.
If you try to optimize before you have the above, you are likely to optimize the wrong parts of the code for the wrong reasons, and your effort (programmer time) is likely to be spent inefficiently.
In your specific case, my gut feeling is that if you followed the recommended process you would discover1 that this String vs int (vs enum) is irrelevant to your game's observable performance2.
But if you want to be more scientific than "gut feeling", you should wait until you have 1 through 4 settled, and then measure to see if the actual performance meets your criteria. Only then should you decide whether or not to optimize.
1 - My prediction assumes that your characterization of the problem is close enough to reality. That is always a risk when people try to identify performance issues "by eye" rather than by measuring.
2 - It is relevant to other things; e.g. code readability and maintainability, but I'm not going to address those in this Answer.

The Answer by Stephen C is correct and wise. But your example code is ripe for a different solution entirely.
Enum
If you want performance, type-safety, easier-to-read code, and want to ensure valid values, use enum objects rather than mere strings or integers.
public enum Action { UPGRADE , SIEGE , RECRUIT }
You can use a switch for the various enum possible objects.
Action action = Action.SIEGE ;
…
switch ( action )
{
case UPGRADE:
doUpgradeStuff() ;
break;
case SIEGE:
doSiegeStuff() ;
break;
case RECRUIT:
doRecruitStuff() ;
break;
default:
doDefaultStuff() ;
break;
}
Using enums this way will get even better in the future. See JEP 406: Pattern Matching for switch (Preview).
See Java Tutorials by Oracle on enums. And for an example, see their tutorial using enums for month, day-of-week, and text style.
See also this Question, linked to others.

Comparing primitive numbers like Integer will be definitely faster compared to String in Java. It will give you faster performance if you are executing it every 17 milliseconds.

Yes there is difference. String is a object and int is a primitive type. when you are doing object == "string" it is matching the address. You need to use equals method to check the exact match.

Related

Why using default trash value for string is wrong?

tl;dr;
Why using
string myVariable = "someInitialValueDifferentThanUserValue99999999999";
as default value is wrong?
explanation of situation:
I had a discussion with a colleague at my workplace.
He proposed to use some trash value as default in order to differentiate it from user value.
An easy example it would be like this:
string myVariable = "someInitialValueDifferentThanUserValue99999999999";
...
if(myVariable == "someInitialValueDifferentThanUserValue99999999999")
{
...
}
This is quite obvious and intuitive for me that this is wrong.
But I could not give a nice argument for this, beyond that:
this is not professional.
there is a slight chance that someone would input the same value.
Once I read that if you have such a situation your architecture or programming habits are wrong.
edit:
Thank you for the answers. I found a solution that satisfied me, so I share with the others:
It is good to make a bool guard value that indicates if the initialization of a specific object has been accomplished.
And based on this private bool variable I can deduce if I play with a string that is default empty value "" from my mechanism (that is during initialization) or empty value from the user.
For me, this is a more elegant way.

Optional
Optional can be used.
Returns an empty Optional instance. No value is present for this Optional.
API Note:
Though it may be tempting to do so, avoid testing if an object is empty by comparing with == against instances returned by Option.empty(). There is no guarantee that it is a singleton. Instead, use isPresent().
Ref: Optional
Custom escape sequence shared by server and client
Define default value
When the user enter's the default value, escape the user value
Use a marker character
Always define the first character as the marker character
Take decision based on this character and strip this character for any actual comparison
Define clear boundaries for the check as propagating this character across multiple abstractions can lead to code maintenance issues.

Small elaboration on "It's not professional":
It's often a bad idea, because
it wastes memory when not a constant (at least in Java - of course, unless you're working with very limited space that's negligible).
Even as constant it may introduce ambiguity once you have more classes, packages or projects ("Was it NO_INPUT, INPUT_NOT_PROVIDED, INPUT_NONE?")
usually it's a sign that there will be no standardized scope-bound Defined_Marker_Character in the Project Documentation like suggested in the other answers
it introduces ambiguity for how to deal with deciding if an input has been provided or not
In the end you will either have a lot of varying NO_INPUT constants in different classes or end up with a self-made SthUtility class that defines one constant SthUtility.NO_INPUT and a static method boolean SthUtility.isInputEmpty(...) that compares a given input against that constant, which basically is reinventing Optional. And you will be copy-pasting that one class into every of your projects.

There is really no need as you can do the following as of Java 11 which was four releases ago.
String value = "";
// true only if length == 0
if (value.isEmpty()) {
System.out.println("Value is empty");
}
String value = " ";
// true if empty or contains only white space
if (value.isBlank()) {
System.out.println("Value is blank");
}
And I prefer to limit uses of such strings that can be searched in the class file that might possibly lead to exploitation of the code.

Bugs found by FindBugs plugin Eclipse

using bug finder plugin, I found this bugs but does not understand why it was seen as bug in the code. Does anybody know and give me proper explanation regarding these? Thanks.
Source Code - https://drive.google.com/open?id=1gAyHFcdHBShV-9oC5G7GeOtCGf7bXoso;
Patient.java:17 Patient.generatePriority() uses the nextDouble method of Random to generate a random integer; using nextInt is more efficient [Of Concern(18), Normal confidence]
public int generatePriority(){
Random random = new Random();
int n = 5;
return (int)(random.nextDouble()*n);
}
ExaminationRoom.java:25 ExaminationRoom defines equals and uses Object.hashCode() [Of Concern(16), Normal confidence]
public boolean equals(ExaminationRoom room){
if (this.getWaitingPatients().size() == room.getWaitingPatients().size()){
return true;
}
else {
return false;
}
}
ExaminationRoom.java:15 ExaminationRoom defines compareTo(ExaminationRoom) and uses Object.equals() [Of Concern(16), Normal confidence]
// Compares sizes of waiting lists
#Override
public int compareTo(ExaminationRoom o) {
if (this.getWaitingPatients().size() > o.getWaitingPatients().size()){
return 1;
}
else if (this.getWaitingPatients().size() < o.getWaitingPatients().size()){
return -1;
}
return 0;
}
Hospital.java:41 Bad month value of 12 passed to new java.util.GregorianCalendar(int, int, int) in Hospital.initializeHospital() [Scary(7), Normal confidence]
doctors.add(new Doctor("Hermione", "Granger", new GregorianCalendar(1988, 12, 10), Specialty.PSY, room102));
Person.java:29 Return value of String.toLowerCase() ignored in Person.getFullName() [Scariest(3), High confidence]
public String getFullName(){
firstName.toLowerCase();
Character.toUpperCase(firstName.charAt(0));
lastName.toLowerCase();
Character.toUpperCase(lastName.charAt(0));
return firstName + " " + lastName;
}

Don’t create a new Random object each time.
Use random.nextInt(n).
Define a hashCode method in ExaminationRoom.
Having your compareTo method be inconsistent with equals() may or may not be OK.
Use LocalDate instead of GregorianCalendar.
Pick up and use the return values from String.toLowerCase() and Character.toUpperCase().
Consider SpotBugs as a newer alternative to FindBugs.
Details
Random
Creating a new Random object each time you need one gives your poorer pseudo-random numbers with a high risk of numbers being repeated. Declare a static variable holding the Random object outside your method and initialize it in the declaration (Random is thread-safe, so you can safely do that). For drawing a pseudo-random number from 0 through 4 use
int n = 5;
return random.nextInt(n);
It’s not only more efficient (as FindBugs says), I first of all find it much more readable.
hashCode()
#Override
public int hashCode() {
return Objects.hash(getWaitingPatients());
}
compareTo()
The equals method you have shown us seems to contradict FindBugs here. It does look a bit funny, though. Are two waiting rooms considered the same if they have the same number of waiting patients? Please think again. If you end up deciding that they are not equal but should be sorted into the same spot without discrimination, your compareTo method is inconsistent with equals(). If so, please insert a comment stating this fact. If you want FindBugs not to report this as a bug in subsequent analyses, you have two options:
Insert an annotation telling FindBugs to ignore the “bug”.
Create a FindBugs ignore XML file including this point.
I’m sorry I don’t remember the detauls of each, but your search engine should be helpful.
Don’t use GregorianCalendar
The GregorianCalendar class is poorly designed and long outdated. I suggest you evict it from your code and use LocalDate from java.time, the modern Java date and time API, instead.
doctors.add(new Doctor("Hermione", "Granger", LocalDate.of(1988, Month.DECEMBER, 10), Specialty.PSY, room102));
String.toLowerCase()
This was already treated in the other answer. Changing a name to have the first letter in upper case and the rest in lower case is not as simple as it sounds.
firstName.toLowerCase();
Character.toUpperCase(firstName.charAt(0));
The first of these two lines doesn’t modify the string firstName because strings have been designed to be immutable and toLowerCase() to return a new string with all letters in lower case (according to the rules of the default locale of the JVM, confusing). The second line also doesn’t modify any character because Java is call-by-value (look it up), so no method can modify a variable passed as argument. You’re not even passing a variable, but the return value from a different method. Also Character.toUpperCase() returns a new char in lower case.
What you need to do is pick up the values returned from these two method calls, use a substring operation for removing the first letter from the lower-case version of the name and concatenate the upper-case version of that letter with the remainder of the lower-case string. If it’s complicated, I am sure that your search engine can find examples of where and how it is done.
A bit of an aside: You may want to think twice before forcing Dr. Jack McNeil to be written as Mcneil and Dr. Ludwig von Saulsbourg as Von saulsbourg.
SpotBugs
It’s only something I have heard, I haven’t checked myself. The source code of FindBugs has been taken over by a project called SpotBugs. They say that SpotBugs is being developed more actively than FindBugs. So you may consider switching. I am myself a happy SpotBugs user in my daily work.
Links
What issues should be considered when overriding equals and hashCode in Java?
Documentation of Comparable explaining what it means that compareTo() is inconsistent with equals() and how to go about it.
Oracle tutorial: Date Time explaining how to use java.time.
Chicago Hope listing Jack McNeil as Orthopedic Surgeon.
Dr. Jack mentioning Dr. Ludwig von Saulsbourg on the cast.
SpotBugs

First thing to remember about "bug finder" tools, is that they are usually only guidelines. With that said:
The class GregorianCalendar counts months from 0, meaning 0 is January, 11 is December. 12 represents the 13th month which doesn't exist. Since the function expects an int, and you gave it an int, no compiler error is generated, even though this is certainly a bug. This article does a good job explaining the reasons to upgrade, and give examples of how to use the new APIs: https://www.baeldung.com/java-8-date-time-intro
If in doubt, you can always check the documentation. In this case, the class Calendar (which GregorianCalendar extends) delcares a static constant public static final int JANUARY = 0; This confirms that january is indeed 0, but also indicates that we can use this constant in our code. You might find new GregorianCalendar(1988, Calendar.JANUARY, 10) to be a bit more readable.
You may also want to consider switching to the more modern and standard systems used to deal with time. The Java 8 Time libraries are the "new standard", and are definitely worth looking into.
Secondly, Strings are immutable in Java. This means that once a String is created, its value can never be changed. This may be counter to your intuitions, as you may have seen code such as:
String s = "hello";
s = s + " world";
However, this doesn't modify the string s. Instead, s + " world" creates a new String, and assigns it to the variable s.
Similarly, s.toLowerCase() doesn't change what s is, it only generates a new String which you must assign.
You probably want firstName = firstName.toLowerCase();
With your first example, nothing immediately jumps out to me as "bad", but if you look at the messages generated by your tool, they label the first example as "Of Concern", but label the others (e.g. the string.toLowerCase() example) as "Scary"/"Scariest". Although I am not familiar with this tool in particular, I imagine this is indicating more of a "Code Smell", rather than an actual bug.
Perhaps look into Unit Testing, if you want to reassure yourself that your code works.

Nesting formats in String.format

Is nesting formats possible with java's String.format? An example would be;
String fooPadded = String.format("FOO:%1$10s", "foo");
// fooPadded:"FOO: foo"
String barPadded = String.format("%1$15s", fooPadded);
// barPadded:" FOO: foo"
Instead of calling 2 consecutive format methods which would be expensive in terms of performance, I want to wrap foo rule with bar rule in other terms reduce format to single one.

Are you having a performance problem in your program? If so, you are right in wanting to do something about it. If not, you shouldn’t. If you have, String.format() would not be my first suspect nor the second for taking too long time. Go measure before making any changes to your nice and readable code.
That said, I think the way to limit to one call to format() is:
String barPadded = String.format("%5s%10s", "FOO:", "foo");
I don’t think you can do nesting except with two calls as in your question.
And if "foo" happened to be exactly 11 chars long, my code would not give the exact same result as the code in your question.

Switch statement vs Ifs in Java

Herbert Schildt in "The Complete Reference" - Chapter 5:
A switch statement is usually more efficient than a set of nested ifs.
When the compiler compiles a switch statement , it inspects each of the case constants and creates a "Jump Table" that it will use for selecting the path of execution depending on the value of the expression. Therefore, if you need to select among a large group of values, a switch statement will run much faster than the equivalent logic coded using a sequence of if-elses. The compiler can do this because it knows that the case constants are all the same type and simply must be compared for equality with the switch expression. The compiler has no such knowledge of a long list of if expressions.
what does he mean by the term "JUMP TABLE"?
Switch differs from if-statements in that, switch can test only for equality whereas if can evaluate any type of boolean expression. If a switch(expression) does not match any of the case constants, that means it goes inside the default case. Isn't it a case of unequality? That makes me think of having no such big difference between the switch and if.
From an extract in the same book, he wrote:
An if-then-else statement can test expressions based on ranges of values or conditions, whereas a switch statement tests expressions based only on a single integer, enumerated value, or String object.
Doesn't that make if more powerful than switch?
I have worked in different projects on Core Java, but never felt the need to make use of the switch statement, and also not seen other associates make use of it. Not even a single time. How come a much more powerful switch control statement fails to land itself ahead of if in terms of usability?

1) The compiler creates a table with a row for every case statement, where the case condition is the key and the block of code is the value. When the switch statement is executed in your code, the block of the case can directly be accessed by the key of the case block. Where in nested if blocks you need to check and execute every condition before your will find the right one where your enter the conditional block of code.
Think of it like a hashtable (for switch case) compared to a linear linked list (for nested if statements) and compare the time to look up a specific value. The lookup time of a table is O(1) (just get it in one operation) where the lookup time in the linked list is O(n) (look at every item until you get the right one).
3) yes, an if statement can be used more flexible and express more, but in the case you want to switch between different codeblocks for a list of different values a switch block is executed faster and more readable.
4) you never need to use a switch because nested ifs can express the same, but in some cases it is more elegant. But because of the rare time it is good to use, programmers forget about it and don't use it even in the spots where it would fit better.
2) extract the difference out of the answers to the other points.

I use switch all the time, if you aren't using it you're doing Java wrong.
A jump table can be thought of like an array of locations in the code. For example imagine the following:
case 0:
// do stuff
break;
case 1:
// do stuff 2
break;
case 3:
// do stuff 3
break;
Then you have a jump table just containing
codeLocations[] = { do stuff, do stuff 2, do stuff 3 }.
Then to find the correct bit of code for case 1 Java just needs to goto codeLocations[1]. No actual integer comparisons are needed.
Obviously the actual implementation is rather more complicated than this as it needs to handle much more varied cases but it gives the idea.
Neither if nor switch are more powerful than each other. They are both tools. You're asking whether a hammer is more powerful than a screwdriver. Yes a hammer will knock more things into the wood, but a screwdriver will do a better job if you are working with screws :)
To expand on the "doing java wrong" thing:
Switch is a tool in the Java language, one that has been added for a reason.
You could apply the exact same question to for loops. Everything you can do with a for loop you can also do with a while loop (although it will likely involve writing quite a bit more code). Try it sometime.
This doesn't make while "more powerful" or even "more useful" than for. They are different tools with different strengths and weaknesses.
switch and if are the same as for and while. They do similar jobs and in a lot of places you could use either. However there are clear examples where one is preferable to the other.
If you are programming using a language while ignoring one of the fundamental tools in that languages toolkit then you are using that language wrong. This is not some obscure feature that hardly ever gets used, this is one of the primary flow control statements in the language.
If a carpenter only ever uses a hammer and never uses a screwdriver... they are doing it wrong. Equally if a programmer never uses a significant language feature...they are doing it wrong.
A proverb applies here "If the only tool you have is a hammer, everything looks like a nail".

1: A jump table is a table of jump targets. A jump target is an address in the code that corresponds to the code belonging to a case in the switch. If your switch cases are, for a simple example, 0, 1, ... N, then the N + 1 targets would be at index 0, 1, ... N of the table, respectively. You just calculate the value you're switching on an then look up the target by indexing into the table, not by comparing N + 1 times. That's the efficiency.
2: default is a case of inequality, yes, but very limited, as you note. Its the case for "not equal to any other of the cases" but it can't be "less than a given value" or "greater than..."
3: Yes, if is more powerful. However, switch is more efficient, sometimes. Efficiency != power.
4: Power != efficiency. But, to answer the implied question: the more efficient switch statement is not used because your colleagues have not used it. You have to ask them why. Maybe it's a matter of taste, or of ignorance of the advantages of switch. The greater efficiency of a switch statement is almost always going to be unnoticeable, unless you are in a very performance-sensitive context, so people are generally unaware of it. If performance is not paramount, then readability is. In some cases a developer might consider a switch statement and decide that a series of if statements is easier to read and understand.

1) In a switch case, a jump table maps a position in the code segment to a particular case. For instance, considering this snippet:
int a;
// a is manipulated here
switch (a) {
case 1: // address 0
// ...
break;
case 2: // address 1
// ...
break;
case 3: // address 2
// ...
break;
default: // other cases
// ...
break;
}
There would be a jump table mapping a=1 to the effective code address marked by address 0, a=2 to address 1, and so on. This can become more apparent when observing the resulting bytecode. With this approach, the number of conditional jumps in the bytecode becomes much smaller than in the case of if-else statements.
2) It's true that if-else statements can test of unequality, but achieving the same behavior of a switch-case statement involves chaining if-else statements. Here is an equivalent version of the above in terms of behavior.
if (a==1) {
// ...
} else if (a==2) {
// ...
} else if (a==3) {
// ...
} else {
// for any other cases
}
It could be a matter of personal opinion at some point, but a switch-case statement is more readable in this particular case.
But naturally, if-else statements are still indispensable, and you are expected to use the right tool for the job. If-else is more versatile than switch-case, but the latter may produce nicer code in some places. Hopefully, this answers the 3rd and 4th part of your question.

In layman's terms difference is negligible.
Both have their pros and cons. In most cases you would never use switch - instead you would use hashmap with actions or some other construct that has better readability.
I would reccomend you read Efficient Java or https://code.google.com/p/guava-libraries/wiki/GuavaExplained
In c# you could write this like :
class MyClass
{
static Dictionary<string, Func<MyClass, string, bool>> commands
= new Dictionary<string, Func<MyClass, string, bool>>
{
{ "Foo", (#this, x) => #this.Foo(x) },
{ "Bar", (#this, y) => #this.Bar(y) }
};
public bool Execute(string command, string value)
{
return commands[command](this, value);
}
public bool Foo(string x)
{
return x.Length > 3;
}
public bool Bar(string x)
{
return x == "";
}
}
After that you could have
var item = new MyClass();
item.Execute("Foo","ooF");

Someone told me that it is saving memory to use numbers directly instead of static final int fields, is that true?

In my Android project, there are many constances to represent bundle extra keys, Handler's message arguments, dialog ids ant etc.
Someone in my team uses some normal number to do this, like:
handler.sendMessage(handler.obtainMessage(MESSAGE_OK, 1, 0));
handler.sendMessage(handler.obtainMessage(MESSAGE_OK, 2, 0));
handler.sendMessage(handler.obtainMessage(MESSAGE_OK, 3, 0));
in handler:
switch (msg.arg1) {
case 1:
break;
case 2:
break;
case 3:
break;
}
he said too many static final constances cost a lot of memory. but i think his solution makes the code hard to read and refactor.
I have read this question and googled a lot and failed to find an answer.
java: is using a final static int = 1 better than just a normal 1?
I hope someone could show me the memory cost of static finals.
Sorry for my poor English.

You shouldn't bother to change it to literals, it will make your code less readable and less maintainable.
In the long run you will benefit from this "lose of memory"

Technically, he is right - static int fields do cost some additional memory.
However, the cost is negligible. It's an int, plus the associated metadata for the reflection support. The benefits of using meaningfull names that make your code more readable, and ensure that the semantic of that number is well known and consistent evewhere it is used, clearly outweight that cost.
You can do a simple test. Write a small application that calls handler.sendMessage 1000 times with different number literal, build it and note down the size of the .dex file. Then replace the 1000 literals with 1000 static int consts, and do the same. Compare the two sizes and you will get an idea of the order of magnitude of additional memory your app will need. (And just for completeness, post the numbers here as comment :-))

It saves a very small amount of memory - basically just the extra metadata required to record the extra constant in the relevant class and refer to it from other classes.
It is NOT worth worrying about this, unless you are extremely memory constrained.
Using well-named static final constants rather than mysterious magic numbers is much better for your code maintainability and sanity in the long run.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.