Why using default trash value for string is wrong? - java

tl;dr;
Why using
string myVariable = "someInitialValueDifferentThanUserValue99999999999";
as default value is wrong?
explanation of situation:
I had a discussion with a colleague at my workplace.
He proposed to use some trash value as default in order to differentiate it from user value.
An easy example it would be like this:
string myVariable = "someInitialValueDifferentThanUserValue99999999999";
...
if(myVariable == "someInitialValueDifferentThanUserValue99999999999")
{
...
}
This is quite obvious and intuitive for me that this is wrong.
But I could not give a nice argument for this, beyond that:
this is not professional.
there is a slight chance that someone would input the same value.
Once I read that if you have such a situation your architecture or programming habits are wrong.
edit:
Thank you for the answers. I found a solution that satisfied me, so I share with the others:
It is good to make a bool guard value that indicates if the initialization of a specific object has been accomplished.
And based on this private bool variable I can deduce if I play with a string that is default empty value "" from my mechanism (that is during initialization) or empty value from the user.
For me, this is a more elegant way.

Optional
Optional can be used.
Returns an empty Optional instance. No value is present for this Optional.
API Note:
Though it may be tempting to do so, avoid testing if an object is empty by comparing with == against instances returned by Option.empty(). There is no guarantee that it is a singleton. Instead, use isPresent().
Ref: Optional
Custom escape sequence shared by server and client
Define default value
When the user enter's the default value, escape the user value
Use a marker character
Always define the first character as the marker character
Take decision based on this character and strip this character for any actual comparison
Define clear boundaries for the check as propagating this character across multiple abstractions can lead to code maintenance issues.

Small elaboration on "It's not professional":
It's often a bad idea, because
it wastes memory when not a constant (at least in Java - of course, unless you're working with very limited space that's negligible).
Even as constant it may introduce ambiguity once you have more classes, packages or projects ("Was it NO_INPUT, INPUT_NOT_PROVIDED, INPUT_NONE?")
usually it's a sign that there will be no standardized scope-bound Defined_Marker_Character in the Project Documentation like suggested in the other answers
it introduces ambiguity for how to deal with deciding if an input has been provided or not
In the end you will either have a lot of varying NO_INPUT constants in different classes or end up with a self-made SthUtility class that defines one constant SthUtility.NO_INPUT and a static method boolean SthUtility.isInputEmpty(...) that compares a given input against that constant, which basically is reinventing Optional. And you will be copy-pasting that one class into every of your projects.

There is really no need as you can do the following as of Java 11 which was four releases ago.
String value = "";
// true only if length == 0
if (value.isEmpty()) {
System.out.println("Value is empty");
}
String value = " ";
// true if empty or contains only white space
if (value.isBlank()) {
System.out.println("Value is blank");
}
And I prefer to limit uses of such strings that can be searched in the class file that might possibly lead to exploitation of the code.

Related

Is there any difference in optimization between integer and string comparison?

I'm trying to make a game and I have a Selection class that holds a string named str in it. I apply the following code to my selection objects every 17 milliseconds.
if(s.Str == "Upgrade") {
}else if(s.Str == "Siege") {
}else if(s.Str == "Recruit") {
}
In other words, these selection objects will do different jobs according to their types(upgrade,siege etc...). I am using str variable elsewhere. my question is that:
Would it be more optimized if I assign the types to an integer when I first create the objects?
if(s.type == 1) {
}else if(s.type == 2) {
}else if(s.type == 3) {
}
This would make me write extra lines of code(Since I have to separate objects by type when I first create) and make the code more difficult to understand, but would there be a difference between comparing integers rather than comparing strings?
If you compare strings >that< way, there is probably no performance difference.
However, that is the WRONG WAY to compare strings. The correct way is to use the equals(Object) method. For example.
if (s.Str.equals("Upgrade")) {
Read this:
How do I compare strings in Java?
I apply the following code to my selection objects every 17 milliseconds.
The time that it will take to test two strings for equality is probably in the order of tens of NANOseconds. So ... basically ... the difference between comparing strings or integers is irrelevant.
This illustrates why premature optimization is a bad thing. You should only optimize code when you know that it is going to be worthwhile to spend your time on it; i.e. when you know there is going to be a pay-off.
So should I optimize after I write and finish all the code? Does 'not doing premature optimization' means that?
No it doesn't exactly mean that. (Well .. not to me anyway.) What it means to me is that you shouldn't optimize until:
you have a working program whose performance you can measure,
you have determined specific (quantifiable) performance criteria,
you have a means of measuring the performance; e.g. an appropriate benchmarks involving real or realistic use-cases, and
you have good a means of identifying the actual performance hotspots.
If you try to optimize before you have the above, you are likely to optimize the wrong parts of the code for the wrong reasons, and your effort (programmer time) is likely to be spent inefficiently.
In your specific case, my gut feeling is that if you followed the recommended process you would discover1 that this String vs int (vs enum) is irrelevant to your game's observable performance2.
But if you want to be more scientific than "gut feeling", you should wait until you have 1 through 4 settled, and then measure to see if the actual performance meets your criteria. Only then should you decide whether or not to optimize.
1 - My prediction assumes that your characterization of the problem is close enough to reality. That is always a risk when people try to identify performance issues "by eye" rather than by measuring.
2 - It is relevant to other things; e.g. code readability and maintainability, but I'm not going to address those in this Answer.
The Answer by Stephen C is correct and wise. But your example code is ripe for a different solution entirely.
Enum
If you want performance, type-safety, easier-to-read code, and want to ensure valid values, use enum objects rather than mere strings or integers.
public enum Action { UPGRADE , SIEGE , RECRUIT }
You can use a switch for the various enum possible objects.
Action action = Action.SIEGE ;
…
switch ( action )
{
case UPGRADE:
doUpgradeStuff() ;
break;
case SIEGE:
doSiegeStuff() ;
break;
case RECRUIT:
doRecruitStuff() ;
break;
default:
doDefaultStuff() ;
break;
}
Using enums this way will get even better in the future. See JEP 406: Pattern Matching for switch (Preview).
See Java Tutorials by Oracle on enums. And for an example, see their tutorial using enums for month, day-of-week, and text style.
See also this Question, linked to others.
Comparing primitive numbers like Integer will be definitely faster compared to String in Java. It will give you faster performance if you are executing it every 17 milliseconds.
Yes there is difference. String is a object and int is a primitive type. when you are doing object == "string" it is matching the address. You need to use equals method to check the exact match.

Bugs found by FindBugs plugin Eclipse

using bug finder plugin, I found this bugs but does not understand why it was seen as bug in the code. Does anybody know and give me proper explanation regarding these? Thanks.
Source Code - https://drive.google.com/open?id=1gAyHFcdHBShV-9oC5G7GeOtCGf7bXoso;
Patient.java:17 Patient.generatePriority() uses the nextDouble method of Random to generate a random integer; using nextInt is more efficient [Of Concern(18), Normal confidence]
public int generatePriority(){
Random random = new Random();
int n = 5;
return (int)(random.nextDouble()*n);
}
ExaminationRoom.java:25 ExaminationRoom defines equals and uses Object.hashCode() [Of Concern(16), Normal confidence]
public boolean equals(ExaminationRoom room){
if (this.getWaitingPatients().size() == room.getWaitingPatients().size()){
return true;
}
else {
return false;
}
}
ExaminationRoom.java:15 ExaminationRoom defines compareTo(ExaminationRoom) and uses Object.equals() [Of Concern(16), Normal confidence]
// Compares sizes of waiting lists
#Override
public int compareTo(ExaminationRoom o) {
if (this.getWaitingPatients().size() > o.getWaitingPatients().size()){
return 1;
}
else if (this.getWaitingPatients().size() < o.getWaitingPatients().size()){
return -1;
}
return 0;
}
Hospital.java:41 Bad month value of 12 passed to new java.util.GregorianCalendar(int, int, int) in Hospital.initializeHospital() [Scary(7), Normal confidence]
doctors.add(new Doctor("Hermione", "Granger", new GregorianCalendar(1988, 12, 10), Specialty.PSY, room102));
Person.java:29 Return value of String.toLowerCase() ignored in Person.getFullName() [Scariest(3), High confidence]
public String getFullName(){
firstName.toLowerCase();
Character.toUpperCase(firstName.charAt(0));
lastName.toLowerCase();
Character.toUpperCase(lastName.charAt(0));
return firstName + " " + lastName;
}
Don’t create a new Random object each time.
Use random.nextInt(n).
Define a hashCode method in ExaminationRoom.
Having your compareTo method be inconsistent with equals() may or may not be OK.
Use LocalDate instead of GregorianCalendar.
Pick up and use the return values from String.toLowerCase() and Character.toUpperCase().
Consider SpotBugs as a newer alternative to FindBugs.
Details
Random
Creating a new Random object each time you need one gives your poorer pseudo-random numbers with a high risk of numbers being repeated. Declare a static variable holding the Random object outside your method and initialize it in the declaration (Random is thread-safe, so you can safely do that). For drawing a pseudo-random number from 0 through 4 use
int n = 5;
return random.nextInt(n);
It’s not only more efficient (as FindBugs says), I first of all find it much more readable.
hashCode()
#Override
public int hashCode() {
return Objects.hash(getWaitingPatients());
}
compareTo()
The equals method you have shown us seems to contradict FindBugs here. It does look a bit funny, though. Are two waiting rooms considered the same if they have the same number of waiting patients? Please think again. If you end up deciding that they are not equal but should be sorted into the same spot without discrimination, your compareTo method is inconsistent with equals(). If so, please insert a comment stating this fact. If you want FindBugs not to report this as a bug in subsequent analyses, you have two options:
Insert an annotation telling FindBugs to ignore the “bug”.
Create a FindBugs ignore XML file including this point.
I’m sorry I don’t remember the detauls of each, but your search engine should be helpful.
Don’t use GregorianCalendar
The GregorianCalendar class is poorly designed and long outdated. I suggest you evict it from your code and use LocalDate from java.time, the modern Java date and time API, instead.
doctors.add(new Doctor("Hermione", "Granger", LocalDate.of(1988, Month.DECEMBER, 10), Specialty.PSY, room102));
String.toLowerCase()
This was already treated in the other answer. Changing a name to have the first letter in upper case and the rest in lower case is not as simple as it sounds.
firstName.toLowerCase();
Character.toUpperCase(firstName.charAt(0));
The first of these two lines doesn’t modify the string firstName because strings have been designed to be immutable and toLowerCase() to return a new string with all letters in lower case (according to the rules of the default locale of the JVM, confusing). The second line also doesn’t modify any character because Java is call-by-value (look it up), so no method can modify a variable passed as argument. You’re not even passing a variable, but the return value from a different method. Also Character.toUpperCase() returns a new char in lower case.
What you need to do is pick up the values returned from these two method calls, use a substring operation for removing the first letter from the lower-case version of the name and concatenate the upper-case version of that letter with the remainder of the lower-case string. If it’s complicated, I am sure that your search engine can find examples of where and how it is done.
A bit of an aside: You may want to think twice before forcing Dr. Jack McNeil to be written as Mcneil and Dr. Ludwig von Saulsbourg as Von saulsbourg.
SpotBugs
It’s only something I have heard, I haven’t checked myself. The source code of FindBugs has been taken over by a project called SpotBugs. They say that SpotBugs is being developed more actively than FindBugs. So you may consider switching. I am myself a happy SpotBugs user in my daily work.
Links
What issues should be considered when overriding equals and hashCode in Java?
Documentation of Comparable explaining what it means that compareTo() is inconsistent with equals() and how to go about it.
Oracle tutorial: Date Time explaining how to use java.time.
Chicago Hope listing Jack McNeil as Orthopedic Surgeon.
Dr. Jack mentioning Dr. Ludwig von Saulsbourg on the cast.
SpotBugs
First thing to remember about "bug finder" tools, is that they are usually only guidelines. With that said:
The class GregorianCalendar counts months from 0, meaning 0 is January, 11 is December. 12 represents the 13th month which doesn't exist. Since the function expects an int, and you gave it an int, no compiler error is generated, even though this is certainly a bug. This article does a good job explaining the reasons to upgrade, and give examples of how to use the new APIs: https://www.baeldung.com/java-8-date-time-intro
If in doubt, you can always check the documentation. In this case, the class Calendar (which GregorianCalendar extends) delcares a static constant public static final int JANUARY = 0; This confirms that january is indeed 0, but also indicates that we can use this constant in our code. You might find new GregorianCalendar(1988, Calendar.JANUARY, 10) to be a bit more readable.
You may also want to consider switching to the more modern and standard systems used to deal with time. The Java 8 Time libraries are the "new standard", and are definitely worth looking into.
Secondly, Strings are immutable in Java. This means that once a String is created, its value can never be changed. This may be counter to your intuitions, as you may have seen code such as:
String s = "hello";
s = s + " world";
However, this doesn't modify the string s. Instead, s + " world" creates a new String, and assigns it to the variable s.
Similarly, s.toLowerCase() doesn't change what s is, it only generates a new String which you must assign.
You probably want firstName = firstName.toLowerCase();
With your first example, nothing immediately jumps out to me as "bad", but if you look at the messages generated by your tool, they label the first example as "Of Concern", but label the others (e.g. the string.toLowerCase() example) as "Scary"/"Scariest". Although I am not familiar with this tool in particular, I imagine this is indicating more of a "Code Smell", rather than an actual bug.
Perhaps look into Unit Testing, if you want to reassure yourself that your code works.

How is the "empty string" sequence represented under the hood in Java?

Throughout my career I've often seen calls like this:
if( "".equals(foo) ) { //do stuff };
How is the empty string understood in terms of data in the lower-levels of Java?
Specifically, by "Lower-levels of Java" I'm referring to the actual contents of memory or some C/C++ construct being used to represent the "" sequence, rather than high-level implementations in Java.
I had previously checked the Java Language Specification which lead me to this, and noting that the "empty string" wasn't really given much more definition than that, this is then what led to the head-scratching.
I then ran javap on some various classes trying to tease out an answer through bytecode, but the behavior in regards to "How is the machine dealing with the sequence "" wasn't really any more clear. Having then excluded byte code and Java code I then posted the question here, hoping that someone would shed some light on the issue from a lower-level perspective.
There's no such thing as "the empty string character". A character is always a UTF-16 code unit, and there's no "empty" code unit. There's "an empty string" which is represented exactly the same way as any other string:
A char[] reference
An index into that char[]
A length
In this case, the length would be 0. The char[] reference could potentially be a reference to an empty char array, which could potentially be shared between all instance of String which have a length of 0.
(Code such as substring could be implemented by detecting 0-length requests and always returning the same reference to an empty string, but I'm not aware of implementations doing that.)

Good practice for string pool in Ruby

In Java, we usually create a StringPool.class to store frequently used Strings. For example: we declarepublic static final String SPACE = " ";and we call StringPool.SPACEwhen needed.Is it good practice to do this in Ruby as well? If yes, can you give an example of StringPoolin Ruby?
If your intention is to group a set of constants in ruby within a specific context, you could do this using a class or a module like so:
class MyConstants
CONST_1 = "Constant1"
CONST_2 = "Constant2"
# ...
end
or
module MyConstants
CONST_1 = "Constant1"
CONST_2 = "Constant2"
end
You could then access those constants in the following way:
MyConstants::CONST
Note that constant values can be anything besides strings, even symbols. As previously mentioned in other answers, it is a common idiom in ruby to use symbols. However, this pattern makes sense when you want to make explicit the fact that your constants belong to a certain context (i.e. like an enumeration). This enforces IMHO the semantics of your application.
Your answer was already provided by Ari in comments, but I will duplicate it here to help others find answers fast.
You need a symbol.
What is a symbol?
Symbol is immutable, fast to compare string.
How to make a symbol?
You can define its instances these ways:
:foo # => :foo <- it's a symbol
"string".to_sym # => :string <- symbol too. But prefer a first way
# due to avoid string object creation.
It mustn't be magic!
And it is not. All symbols are stored in special symbol-table even your :foo and :string. You can look into symbols table:
Symbol.all_symbols #=> [:execute_script, :array_new_range, :ExtendCommand, ...]
This table seems to be really big. What if you want to find symbol in it?
Symbol.all_symbols.index(:foobar42) # => bad. It will find index of symbol
# :foobar42 but it will create it first
# if it wasn't.
Symbol.all_symbols.index(/foobar42/) # returns nil if there's no such a symbol
The symbols are fast to compare because they are immutable. Method == just checks if object_ids are same.
When to use symbols?
I'm pretty sure you know. When you want to use one string couple of times and not planning to change it.
That's all.
My first thought after reading your question was: hey, why do they need string pools? I thought strings are immutable in java. That's why we have to use string builders and other stuff. But then I saw public static final String. Oh yeah, it is java.

Best way to validate a String against many patterns

This is a question more about best practices/design patterns than regexps.
In short I have 3 values: from, to and the value I want to change. From has to match one of several patterns:
XX.X
>XX.X
>=XX.X
<XX.X
<=XX.X
XX.X-XX.X
Whereas To has to be a decimal number. Depending on what value is given in From I have to check whether a value I want to change satisfies the From condition. For example the user inputs "From: >100.00 To: 150.00" means that every value greater than 100.00 should be changed.
The regexp itself isn't a problem. The thing is if I match the whole From against one regexp and it passes I still need to check which option was inputted - this will generate at least 5 IFs in my code and every time I want to add another option I will need to add another IF - not cool. Same thing if I were to create 5 Patterns.
Now I have a HashMap which holds a pattern as the key and a ValueMatcher as the value. When a user inputs a From value then I match it in a loop against every key in that map and if it matches then I use the corresponding ValueMatcher to actually check if the value that I want to change satisfies the "From" value.
This aproach on the other hand requires me to have a HashMap with all the possibilities, a ValueMatcher interface and 5 implementations each with only 1 short "matches" methode. I think it sure is better than the IFs, but still looks like an exaggerated solution.
Is there any other way to do it? Or is this how I actually should do it? I really regret that we can't hold methods in a HashMap/pass them as arguments because then I'd only have 1 class with all the matching methodes and store them in a HashMap.
How about a chain of responsibility.
Each ValueMatcher object exactly one From/To rule and a reference to the next ValueMatcher in the chain. Each ValueMatcher has a method which examines a candidate and either transaforms it or passes it on to the next in the chain.
This way adding a new rule is a trivial extension and the controlling code just passes the candidate to the first member of the chain.
a ValueMatcher interface and 5 implementations each with only 1 short "matches" methode. I think it sure is better than the IFs, but still looks like an exaggerated solution.
Well, for something as simple as evaluating a number against an operator and a limit value, couldn't you just write one slightly more generic ValueMatcher which has a limit value and an operator as its parameters? It would then be pretty easy to add 5 instances of this ValueMatcher with a few combinations of >, >=, etc.
EDIT: Removed non Java stuff... sorry about that.

Categories