Java String initialization (part 2) - java

I asked this goofy question earlier today and got good answers. I think what I really meant to ask is the following:
String aString = ""; // Or = null ?
if(someCondition)
aString = "something";
return aString;
In this case, the string has to be initialized in order to return it. I always thought that either option (setting it to "" or to null) looks kind of ugly. I was just wondering what others do here... or is it more a matter of whether you want an empty string or null being passed around in your program (and whether you are prepared to handle either)?
Also assume that the intermediary logic is too long to cleanly use the conditional (? :) operator.

return (someCondition) ? "something" : "";
or
return (someCondition) ? "something" : null;
Typically though if your function says it will return a String I prefer to actually return a String instead of a null. Either way the calling function should probably check for both cases.

It depends on what you want to achieve.
In general I would return null to signal that nothing was processed.
Cases might pop up later where someCondition is true but the string you build is "" anyway; returning null when nothing was processed lets you tell those cases apart.
i.e.
String aString = null;
if(someCondition)
aString = "something";
return aString;
but it all really depends on what you want to achieve...
e.g. if the code is supposed to build a string that is displayed directly in the UI, you would go for "" instead
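To make that distinction concrete, here is a minimal sketch. The class and method names (ReturnStyles, buildOrNull, buildOrEmpty) are made up for illustration and are not from the question; only the null-versus-"" choice comes from the discussion above.

```java
// Hypothetical helper names; only the null-versus-"" choice is from the thread.
class ReturnStyles {
    // Returns null to signal "nothing was processed".
    static String buildOrNull(boolean someCondition, String value) {
        String aString = null;
        if (someCondition) {
            aString = value; // may legitimately be ""
        }
        return aString;
    }

    // Returns "" so the result can be shown directly without a null check.
    static String buildOrEmpty(boolean someCondition, String value) {
        String aString = "";
        if (someCondition) {
            aString = value;
        }
        return aString;
    }

    public static void main(String[] args) {
        // With null, "not processed" and "processed but empty" stay distinguishable.
        System.out.println(buildOrNull(false, "something") == null); // true
        System.out.println(buildOrNull(true, "").isEmpty());         // true
        // With "", both cases collapse into the same value.
        System.out.println(buildOrEmpty(false, "something").equals(buildOrEmpty(true, ""))); // true
    }
}
```

Which of the two is appropriate still depends on what the caller is prepared to handle.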

In this case it might be best to do something like:
public void func() {
boolean condition = getConditionFromSomewhere();
String condString = getAppropriateValue(condition);
}
public String getAppropriateValue(boolean condition) {
if (condition) {
return "something";
} else {
return "somethingElse";
}
}
It may seem a bit overkill for a boolean condition, but if you get into more complex conditions (more choices, like enums and the like), it would nicely abstract that logic away. And with a descriptive method name make it almost self-documenting to boot.
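As a rough sketch of the "more choices" case, the same idea with an enum might look like the following. The Status enum, its constants, and the returned strings are assumptions invented for this example.

```java
// Hypothetical enum and values, only to show the selection logic
// hidden behind a descriptively named method.
class StatusLabels {
    enum Status { ACTIVE, SUSPENDED, CLOSED }

    static String labelFor(Status status) {
        switch (status) {
            case ACTIVE:
                return "something";
            case SUSPENDED:
                return "somethingElse";
            default:
                return "somethingElseAgain";
        }
    }

    public static void main(String[] args) {
        System.out.println(labelFor(Status.ACTIVE));    // something
        System.out.println(labelFor(Status.SUSPENDED)); // somethingElse
    }
}
```

Every path returns a value, so the caller never needs a placeholder initialization such as "" or null.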

Since you're asking our opinions...
I'm a bit overconscious about quality. I prefer that all my "if" statements have an "else", because (1) it helps to understand the code if there are multiple nested "if's", and (2) it forces me to consider the possibility (what should happen if the condition is false?).
Regarding reason (1), I prefer to avoid nested if's, but sometimes you inherit code with a lot of if's.
if(someCondition)
aString = "something";
else
aString = "";
I would prefer "null", because it would make the app fail and dump a stack that I can follow. An empty string, in contrast, would keep things going. Naturally, it depends on the logic of your code what is better: null or "".

Yes, this is just a matter of whether you want a null or an empty string being passed around in your program. If you plan to append things to it later, and the someCondition just indicates that you should give it a first value to begin with, then use the empty string. If you plan to have it indicate that there is a string or there is nothing, then perhaps using null is better.

It's just a preference about what your API should do. If you return null, API users may get a NullPointerException if they do not check for null. But if you return "", an error may pass silently in cases where an empty string is not a valid result. It is a preference of how you write your API and depends on the use case.

Thinking about it, I rarely use "" in code for any purpose.

Related

Why can I use .equalsIgnoreCase("anotherString") without assigning it to a variable or within a control flow statement?

I came across this code:
for (final String s : myList)
{
s.equalsIgnoreCase(test);
updateNeeded = true;
break;
}
I suspect that this is not what the programmer actually wanted to do. I believe he meant to write something like:
for (final String s : myList)
{
if(s.equalsIgnoreCase(test))
{
updateNeeded = true;
break;
}
}
However, I don't understand why there is no error in the first code snippet.
s.equalsIgnoreCase(test);
since the method .equalsIgnoreCase("anotherString") returns a boolean and it is not being assigned to anything or used within a control flow statement.
It's just a method call. You don't have to use the result of a method call for anything else.
It's rarely a good idea to ignore the result of a non-void method (in particular, the return value of InputStream.read is sometimes ignored when it really shouldn't be) but the language specification makes no attempt to call this out as a problem.
It was probably meant to be as you think. You can call a method without assigning its result, as sometimes you just don't need the result.
You are right, this is probably an error. However, the Java compiler has no way of knowing that the equalsIgnoreCase method is a "pure function", i.e. that it produces no side effects and is meaningless unless you use its return value.
Most programming languages allow you to ignore the return value of a method.
There are times that you would not care about the returned value and in these cases you should be allowed to.
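A small sketch of the difference between the two loops in the question. The wrapper class, the method names, and the sample list are assumptions added for illustration.

```java
import java.util.Arrays;
import java.util.List;

class IgnoredResultDemo {
    // Mirrors the first snippet: the comparison result is discarded,
    // so the flag is set whenever the list has at least one element.
    static boolean buggy(List<String> myList, String test) {
        boolean updateNeeded = false;
        for (final String s : myList) {
            s.equalsIgnoreCase(test); // legal expression statement, result thrown away
            updateNeeded = true;
            break;
        }
        return updateNeeded;
    }

    // Mirrors the second snippet: the flag depends on the comparison.
    static boolean fixed(List<String> myList, String test) {
        boolean updateNeeded = false;
        for (final String s : myList) {
            if (s.equalsIgnoreCase(test)) {
                updateNeeded = true;
                break;
            }
        }
        return updateNeeded;
    }

    public static void main(String[] args) {
        List<String> myList = Arrays.asList("alpha", "beta");
        System.out.println(buggy(myList, "gamma")); // true
        System.out.println(fixed(myList, "gamma")); // false
        System.out.println(fixed(myList, "BETA"));  // true
    }
}
```

Both versions compile without complaint, which is why this kind of mistake is usually caught by static analysis tools or review rather than by the compiler.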

What is the value of the test !"".equals(someString)

In a project I've been trying to familiarise myself with, I ran across a method that looks like this:
public boolean testString(String string){
return string != null && !"".equals(string);
}
What is the value of testing the string for emptiness this way instead of with the variable first? I understand why we see constant-first (Yoda syntax) in C, but is there any reason to do so with method calls in Java?
note: I do understand about NullPointerException, which is not possible in this instance. I'm looking for a value to doing it this way in this case particularly.
In this context it makes little difference, as it already tested for null. Usually you do it this way to make sure you don't call a member on a null-reference (resulting in a NullPointerException), i.e.
"test".equals(myString)
will never throw a null pointer exception whereas
myString.equals("test")
will if myString is null. So basically, the first test makes sure it's a string (not null) AND it's equal to "test".
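A minimal sketch of the two orderings side by side; the class and method names are made up for the example.

```java
class NullSafeEquals {
    // Constant first: equals(null) simply returns false.
    static boolean constantFirst(String myString) {
        return "test".equals(myString);
    }

    // Variable first: calling a method on a null reference throws.
    static boolean variableFirst(String myString) {
        return myString.equals("test");
    }

    public static void main(String[] args) {
        System.out.println(constantFirst(null));   // false
        System.out.println(constantFirst("test")); // true
        try {
            variableFirst(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
    }
}
```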
For two strings it doesn't matter much, but when there is a non-final type involved it can be a micro-optimization.
If the left hand side is a non-overridden concrete type, then the dispatch becomes static.
Consider what the JIT has to do for
Object o;
String s;
o.equals(s)
vs
s.equals(o)
In the first, the JIT has to find the actual equals method used, whereas in the second, it knows that it can only be String.equals.
I adopted the habit of doing
"constant value" == variableName
in other languages, since it means that the code will fail to parse if I mis-type = instead of ==.
And when I learned Java, I kept that order preference.
The usual reason for using "constant string".equals(variable) is that this works properly even if variable is null (unlike variable.equals("constant string")). In your case, however, since you are testing that string != null in a short-circuit boolean test, it's entirely a matter of style (or habit).
If they just did this:
!"".equals(string);
then they're avoiding the possibility of a NullPointerException, which is pretty smart. However, they're checking for null right before this condition, which is technically not necessary.
Is the project running any tools like Checkstyle? If it is, putting the variable first may cause the Checkstyle run to fail. Another reason is that putting the empty string first removes the possibility of a NullPointerException when the variable is null, because the equals call will simply evaluate to false. If you had the variable first and the variable was null, it would throw an exception.
It is more than a coder preference. If the purpose of the method was only to check that string is not an empty String (without caring whether it's null) then it makes sense to have the constant first to avoid a NullPointerException.
e.g. This method will return the boolean outcome. false in case string is null.
public boolean testString(String string){
return !"".equals(string);
}
while this one may throw a runtime exception if string is null
public boolean testString(String string){
return !string.equals("");
}
No, it is unnatural, and harder to read. It triggers a pause for most readers, and may wastefully consume lots of resources on stackoverflow.com.
(Better use string.isEmpty() anyway)
There are no fixed rules though; sometimes this is easier to read
if( null != foobar(blahblahblah, blahblahblah, blahblahblah) )
than
if( foobar(blahblahblah, blahblahblah, blahblahblah) != null )
This question can be answered on a number of levels:
What does the example mean?
As other answers have explained, !"".equals(str) tests if str is a non-empty string. In general, the <stringLiteral>.equals(str) idiom is a neat way of testing a string that deals with the null case without an explicit test. (If str is null then the expression evaluates to false.)
Is this particular example best practice?
In general no. The !"".equals(str) part deals with the case where str is null, so the preceding null test is redundant.
However, if str was null in the vast majority of cases, this usage would possibly be faster.
What is a better way to do this from a code-style perspective?
return !"".equals(str);
or
return str != null && !str.isEmpty();
However, the second approach doesn't work with Java versions prior to 1.6 ... because isEmpty() is a recent API extension.
What is the optimal way to do this?
My gut feeling is that return str != null && !str.isEmpty(); will be fastest. The String.isEmpty() method is implemented as a one-line test, and is small enough that the JIT compiler will inline it. The String.equals(Object) method is a lot more complicated, and too big to be inlined.
Miško Hevery (see his videos on youtube) calls this type of overkill "paranoid programming" :-)
Probably in this video:
http://www.youtube.com/watch?v=wEhu57pih5w
See also here: http://misko.hevery.com/2009/02/09/to-assert-or-not-to-assert/

Java While-Loops

So while rewriting some code, I came across something along the lines of:
Method 1
while ( iter.hasNext() ) {
Object obj = iter.next();
if ( obj instanceof Something ) {
returnValue = (Something) obj;
break;
}
}
I re-wrote it as the following without much thought (the actual purpose of the re-write was for other logic in this method):
Method 2
while ( (iter.hasNext()) && (returnValue == null) ) {
Object obj = iter.next();
if ( obj instanceof Something ) {
returnValue = (Something) obj;
}
}
I personally don't have any strong preference for either and really don't see anything wrong with either approach. Can anyone else think of benefits or consequences of using either approach? The variable, returnValue is returned. How would people feel if that was the last block in the method, and it is just returned?
EDIT: So here's what I'm doing: Currently this method takes a set of authorizations and validates them - returning a boolean. This method allows grouping so you can specify that at least one or all (meaning if at least one authorization is valid, pass the whole set). However, this method does not support levels of authorization which is what I'm changing it to do so that each level can specify different grouping. All this bit is just background information...does not have that much to do with the above code - an alternative method is used to perform the above block of code.
Clearer, for me, would be to extract this as a method of its own; then you can simply return the value rather than assigning it to a local.
while (iter.hasNext()) {
Object obj = iter.next();
if (obj instanceof Something)
return (Something)obj;
}
return null;
Better yet would be a foreach loop
for (Object o : yourList)
if (o instanceof Something)
return (Something)o;
return null;
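For completeness, here is one way the extracted method could look as a self-contained unit. This is a variation rather than the code above: it takes the wanted type as a Class parameter instead of hard-coding Something, and the name firstInstanceOf and the sample list are assumptions.

```java
import java.util.Arrays;
import java.util.List;

class FirstMatch {
    // Returns the first element that is an instance of the given type,
    // or null if there is none. The early return replaces both the
    // break and the extra loop condition.
    static <T> T firstInstanceOf(Iterable<?> items, Class<T> type) {
        for (Object o : items) {
            if (type.isInstance(o)) {
                return type.cast(o);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Object> items = Arrays.<Object>asList(1, "two", 3.0, "four");
        System.out.println(firstInstanceOf(items, String.class)); // two
        System.out.println(firstInstanceOf(items, Long.class));   // null
    }
}
```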
I like the first as it is more clear why you are breaking out of the loop. It says "if this condition is true, set the value and break the loop, we're done". The other took me another second or so, but as you said, there is no huge difference.
I prefer the explicit jump (whether it's a break or a return). It's difficult for me to articulate why, but I can compare it to writing in active vs. passive voice. Sure, it's merely style, but active is a lot more straightforward to read.
One school of thought is that a method should only have one return statement/point at the end of the method.
I tend to use whatever is clearer in the specific circumstances.
As an aside:
1) Should you be rewriting code that works (assuming it does)?
2) Should you be rewriting code that doesn't have tests (assuming it does not)?
Those are equivalent in this case but that is no longer the case if you want to do something after the if condition, which is often the case.
Performance is a wash and even if it wasn't, it's an irrelevant micro-optimization. The key goal should be readability and maintainability.
Method 1 is much clearer in what it is doing, it breaks the loop when obj instanceof Something==true, although method 2 is also pretty obvious.
Another slight advantage of Method 1 is that it is slightly faster, since Method 2 performs one more condition check on every iteration. What if that check took more than 1 minute?
So obviously Method 1 is better and easier to understand.
Clearly the first form is much better because it is less complex. If you are not doing anything with returnValue you could simply return it as soon as you have found a match.
I don't agree that the first form is clearly better... The problem with returns, breaks and continues is that they make the code harder to refactor. The example given here is too simple to require such manipulation but in general it is all too easy to refactor a complex piece of code into its own method and neglect to ensure the caller's return paths remain valid. This is where automated testing can help, but I still prefer most of my checking to come out of the compiler.
I took the approach of asking what the loop is doing - Two things, it checks for the existence of an Object of type Fish, and it makes that Object available outside the loop. If we separate the two responsibilities we get:
Object obj = null;
while (iter.hasNext() && !(obj instanceof Fish)) {
obj = iter.next();
}
if (obj instanceof Fish) {
returnValue = (Fish) obj;
}
I still don't like it... perhaps it's the instanceof or the use of Object or even just the nature of Iterator but it doesn't seem like nice code.
After your edit, it seems that you will have to iterate through the whole list to check for all kinds of authorities. The code would be like:
Object returnValue=null; //Notice in your original code
while(iter.hasNext()){
Object kind=iter.next();
switch(kind.getType()){
case "fish":
case "reptile":
if(returnValue!=bird&&returnValue!=mammal)
returnValue=(coldblood) kind;
break;
case "bird":
case "mammal":
returnValue=(coldblood) kind;
break;
default:
//Fall-through
}
}
return returnValue;
So this is what I have to say:
You said you needed to implement multiple levels of permissions, I believe that you will have to iterate through the whole collection.
That said, I recommend using Method 1, that is, if you have to break. The break keyword is clearer, and it is slightly faster (one less jump, however slight).

Should I set the initial java String values from null to ""?

Often I have a class as such:
public class Foo
{
private String field1;
private String field2;
// etc etc etc
}
This makes the initial values of field1 and field2 equal to null. Would it be better to have all my String class fields as follows?
public class Foo
{
private String field1 = "";
private String field2 = "";
// etc etc etc
}
Then, if I'm consistent with class definition I'd avoid a lot of null pointer problems. What are the problems with this approach?
That way lies madness (usually). If you're running into a lot of null pointer problems, that's because you're trying to use them before actually populating them. Those null pointer problems are loud obnoxious warning sirens telling you where that use is, allowing you to then go in and fix the problem. If you just initially set them to empty, then you'll be risking using them instead of what you were actually expecting there.
Absolutely not. An empty string and a null string are entirely different things and you should not confuse them.
To explain further:
"null" means "I haven't initialized this variable, or it has no value"
"empty string" means "I know what the value is, it's empty".
As Yuliy already mentioned, if you're seeing a lot of null pointer exceptions, it's because you are expecting things to have values when they don't, or you're being sloppy about initializing things before you use them. In either case, you should take the time to program properly - make sure things that should have values have those values, and make sure that if you're accessing the values of things that might not have value, that you take that into account.
Does it actually make sense in a specific case for the value to be used before it is set somewhere else, and to behave as an empty String in that case? i.e. is an empty string actually a correct default value, and does it make sense to have a default value at all?
If the answer is yes, setting it to "" in the declaration is the right thing to do. If not, it's a recipe for making bugs harder to find and diagnose.
I disagree with the other posters. Using the empty string is acceptable. I prefer to use it whenever possible.
In the great majority of cases, a null String and an empty String represent the exact same thing - unknown data. Whether you represent that with a null or an empty String is a matter of choice.
Generally it would be best to avoid this. A couple of reasons:
Getting a NullPointerException is generally a good warning that you are using a variable before you should be, or that you forgot to set it. Setting it to an empty string would get rid of the NullPointerException, but would likely cause a different (and harder to track down) bug further down the line in your program.
There can be a valid difference between null and "". A null value usually indicates that no value was set or the value is unknown. An empty string indicates that it was deliberately set to be empty. Depending on your program, that subtle difference could be important.
I know this is an old question but I wanted to point out the following:
String s = null;
s += "hello";
System.out.println(s); // this will print nullhello
whereas
String s = "";
s += "hello";
System.out.println(s); // this will print hello
Obviously the real answer to this is that one should use StringBuffer rather than just concatenating strings, but as we all know, for some code it is just simpler to concatenate.
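A small sketch of both behaviours, plus a builder-based variant. Note that it uses StringBuilder, the unsynchronized sibling of StringBuffer available since Java 5, as a swapped-in choice; the class and method names are invented for the example.

```java
class ConcatDemo {
    static String concatOntoNull() {
        String s = null;
        s += "hello"; // the null reference is converted to the text "null"
        return s;
    }

    static String concatOntoEmpty() {
        String s = "";
        s += "hello";
        return s;
    }

    // Builder-based variant that skips null parts instead of
    // letting them be rendered as "null".
    static String build(String... parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            if (part != null) {
                sb.append(part);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatOntoNull());          // nullhello
        System.out.println(concatOntoEmpty());         // hello
        System.out.println(build("hel", null, "lo"));  // hello
    }
}
```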
I would suggest neither.
Instead you should give your fields sensible values. If they don't have to change, I would make them final.
public class Foo {
private final String field1;
private final String field2;
public Foo(String field1, String field2) {
this.field1 = field1;
this.field2 = field2;
}
// etc.
}
No need to assign I'm-not-initialised-yet values. Just give the fields their initial values.
I would avoid doing this, you need to know if your instances aren't getting populated with data correctly.
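One way to find out early that an instance is not being populated correctly is to check the values at construction time. The sketch below assumes Java 7 or later for java.util.Objects; the class name CheckedFoo and its accessors are hypothetical.

```java
import java.util.Objects;

class CheckedFoo {
    private final String field1;
    private final String field2;

    // Fails at construction time, naming the offending field,
    // instead of failing later at some distant use of the value.
    CheckedFoo(String field1, String field2) {
        this.field1 = Objects.requireNonNull(field1, "field1");
        this.field2 = Objects.requireNonNull(field2, "field2");
    }

    String field1() { return field1; }
    String field2() { return field2; }

    public static void main(String[] args) {
        CheckedFoo ok = new CheckedFoo("a", "");
        System.out.println(ok.field2().isEmpty()); // true: "" is an accepted, deliberate value
        try {
            new CheckedFoo("a", null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage()); // field2
        }
    }
}
```

An empty string is still allowed here when the caller passes it on purpose; only the unset case is rejected.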
Null is better, that is why they are called unchecked exceptions {Null pointer exception}. When the exception is thrown, it tells you that you have to initialize it to some non null value before calling any methods on it.
If you do
private String field1 = "";
You are trying to suppress the error. It is hard to find the bug later.
I think when you write String s = null, only the variable s is created on the stack and no object exists on the heap, but as soon as you write String s = "", a "" object is created on the heap. Since Strings are immutable, every time you assign a new value to a String variable a new object is created on the heap. So I think String s = null is more efficient than String s = "".
Suggestions are welcome!!!!!
No way. Why do you want to do that? That will give incorrect results. null and "" are not the same.

Is making an empty string constant worth it?

I have a co-worker that swears by
//in a singleton "Constants" class
public static final String EMPTY_STRING = "";
in a constants class available throughout the project. That way, we can write something like
if (Constants.EMPTY_STRING.equals(otherString)) {
...
}
instead of
if ("".equals(otherString)) {
...
}
I say it's
not worth it--it doesn't save any space in the heap/stack/string pool,
ugly
abuse of a constants class.
Who is the idiot here?
String literals are interned by default, so no matter how many times you refer to "" in code, there will only be one empty String object. I don't see any benefit in declaring EMPTY_STRING. Otherwise, you might as well declare ONE, TWO, THREE, FOUR, etc. for integer literals.
Of course, if you want to change the value of EMPTY_STRING later, it's handy to have it in one place ;)
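The interning point can be checked with reference comparisons. This is only a sketch; the class and method names are made up, and == is used on purpose here to compare object identity rather than content.

```java
class InternDemo {
    static final String EMPTY_STRING = "";

    // Two separate "" literals refer to the same interned object.
    static boolean literalsShareOneObject() {
        String a = "";
        String b = "";
        return a == b;
    }

    // The constant is just another reference to that same object.
    static boolean constantIsSameObjectAsLiteral() {
        return EMPTY_STRING == "";
    }

    // Only an explicit new String(...) creates a distinct object.
    static boolean explicitNewIsDifferentObject() {
        return new String("") != "";
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject());        // true
        System.out.println(constantIsSameObjectAsLiteral()); // true
        System.out.println(explicitNewIsDifferentObject());  // true
    }
}
```

In regular code, string content should still be compared with equals() or isEmpty(), not with ==.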
Why on earth would you want a global variable in Java? James Gosling really tried to get rid of them; don't bring them back, please.
Either
0 == possiblyEmptyString.length()
or
possiblyEmptyString.isEmpty() // Java 6 and later
are just as clear.
I much prefer seeing EMPTY_STRING.
It makes it English. "".equals 'reads' differently than EMPTY_STRING.equals.
Ironically the whole point of constants is to make them easily changeable. So unless your co-worker plans to redefine EMPTY_STRING to be something other than an empty string - which would be a really stupid thing to do - casting a genuine fixed construct such as "" to a constant is a bad thing.
As Dan Dyer says, it's like defining the constant ONE to be 1: it is completely pointless and would be utterly confusing - potentially risky - if someone redefined it.
Well, I could guess too, but I did a quick test... Almost like cheating...
An arbitrary string is checked using various methods. (several iterations)
The results suggest that isEmpty() is both faster and indeed more readable;
If isEmpty() is not available, length() is a good alternative.
Using a constant is probably not worth it.
"".equals(someString()) :24735 ms
t != null && t.equals("") :23363 ms
t != null && t.equals(EMPTY) :22561 ms
EMPTY.equals(someString()) :22159 ms
t != null && t.length() == 0 :18388 ms
t != null && t.isEmpty() :18375 ms
someString().length() == 0 :18171 ms
In this scenario;
"IAmNotHardCoded".equals(someString())
I would suggest defining a constant in a r e l e v a n t place, since a global class
for all constants really sucks. If there is no relevant place, you are probably doing something else wrong...
Customer.FIELD_SHOE_SIZE //"SHOE_SIZE"
might be considered a relevant place, whereas
CommonConstants.I__AM__A__LAZY__PROGRAMMER // true
is not.
For BigDecimals and similar things, I tend to end up defining a final static locally, like:
private final static BigDecimal ZERO = new BigDecimal(0);
private final static BigDecimal B100 = new BigDecimal("100.00");
That bugs me; wouldn't it be nice to have some sugar for BigIntegers and BigDecimals...
I'm with your coworker. While the empty string is hard to mistype, you can accidentally put a space in there and it may be difficult to notice when scanning the code. More to the point it is a good practice to do this with all of your string constants that get used in more than one place -- although, I tend to do this at the class level rather than as global constants.
FWIW, C# has a static property string.Empty for just this purpose and I find that it improves the readability of the code immensely.
As a tangent to the question, I generally recommend using a utility function when what you're really checking for is "no useful value" rather than, specifically, the empty string. In general, I tend to use:
import org.apache.commons.lang.StringUtils;
// Check if a String is whitespace, empty ("") or null.
StringUtils.isBlank(mystr);
// Check if a String is empty ("") or null.
StringUtils.isEmpty(mystr);
The concept being that the above two:
Check the various other cases, including being null safe, and (more importantly)
Convey what you are trying to test, rather than how to test it.
David Arno states:
Ironically the whole point of constants is to make them easily changeable
This is simply not true. The whole point of constants is reuse of the same value and for greater readability.
It is very rare that constant values are changed (hence the name). It is more often that configuration values are changed, but persisted as data somewhere (like a config file or registry entry)
Since early programming, constants have been used to turn things like cryptic hex values such as 0xff6d8da412 into something humanly readable without ever intending to change the values.
const int MODE_READ = 0x000000FF;
const int MODE_EXECUTE = 0x00FF0000;
const int MODE_WRITE = 0x0000FF00;
const int MODE_READ_WRITE = 0x0000FFFF;
I don't like either choice. Why not if (otherString.length() == 0)
Edit: I actually always code
if (otherString == null || otherString.length() == 0)
yes--it offers no benefit.
depends on what you're used to, I'm sure.
No, it's just a constant--not an abuse.
The same argument comes up in .NET from time to time (where there's already a readonly static field string.Empty). It's a matter of taste - but personally I find "" less obtrusive.
Hehe, funny thing is:
Once it compiles, you won't see a difference (in the byte-code) between the "static final" thing and the string literal, as the Java compiler always inlines a "static final String" constant into the target class. Just change your empty string into something recognizable (like the LGPL text) and look at the resulting *.class file of code that references that constant. You will find your text copied into that class file.
One case where it does make sense to have a constant with the value of the empty string is when the name captures the semantics of the value. For example:
if (Constants.FORM_FIELD_NOT_SET.equals(form.getField("foobar"))) {
...
}
This makes the code more self documenting (apart from the argument that a better design is to add the method checking whether a field is set to the form itself).
We just do the following for situations like this:
public class StaticUtils
{
public static boolean empty(CharSequence cs)
{
return cs == null || cs.length() == 0;
}
public static boolean has(CharSequence cs)
{
return !empty(cs);
}
}
Then just import static StaticUtils.*
Hmm, the rules are right but are being taken in a different sense! Let's look at the cause. Firstly, all object references in Java are compared using equals(). Earlier on, in some languages it was done using the '==' operator, and if by accident someone used '=' for '==', a catastrophe. Now the question of magic numbers/constants: for a computer all constants/numbers are similar. Instead of 'int ONE=1' one can surely use 1, but will that hold true for double PI = 3.141...? What happens if someone tries to change the precision sometime later?
If we were to come up with a checklist, the rule would address the general guideline, wouldn't it? All I mean to say is that rules are supposed to aid; we can surely bend the rules only when we know them very well. Common sense prevails. As suggested by a friend of mine, program constants like 0/1 which denote exit conditions can be hard coded and hence the magic number principle doesn't apply. But for those which participate in logical checks/rules, better keep them as configurable constants.
Why it is preferable to use String.Empty in C# and therefore a public constant in other languages, is that constants are static, therefore only take up one instance in memory.
Every time you do something like this: -
stringVariable = "";
you are creating a new instance of a null string, and pointing to it with stringVariable.
So every time you make an assignment of "" to a variable (pointer), that "" null string is a new string instance until it no longer has any pointer assignments to it.
initializing strings by pointing them all to the same constant, means only one "" is ever created and every initialized variable points to the same null string.
It may sound trivial, but creating and destroying strings is much more resource intensive than creating pointers (variables) and pointing them to an existing string.
As string initialization is common, it is good practice to do: -
const String EMPTY_STRING = "";
String variable1 = EMPTY_STRING;
String variable2 = EMPTY_STRING;
String variable3 = EMPTY_STRING;
String variable4 = EMPTY_STRING;
String variable5 = EMPTY_STRING;
You have created 5 string pointers but only 1 string
rather than: -
String variable1 = "";
String variable2 = "";
String variable3 = "";
String variable4 = "";
String variable5 = "";
You have created 5 string pointers and 5 separate null strings.
Not a major issue in this case, but in thousands of lines of code in dozens of classes, it is unnecessary memory waste and processor use, creating another null string variable, when they can all point to the same one, making applications much more efficient.
Of course, compilers should be clever enough to determine several static strings and reuse duplicates, but why assume?
Also, it's less prone to introducing errors as "" and " " will both compile, yet you may miss the space you accidentally added which could produce spurious run time errors, for example conditional logic such as: -
myVariable = " ";
While (myVariable == ""){
...
}
Code inside the while block is unreachable because myVariable will not satisfy the condition on the first iteration. The error of initializing with " " instead of "" is easy to miss, whereas: -
myVariable = EMPTY_STRING;
While (myVariable == EMPTY_STRING){
...
}
... is less likely to cause runtime errors, especially as misspelling EMPTY_STRING would generate a compile error instead of having to catch the error at run time.
The cleanest solution, would be to create a static class that contains members of all kinds of string constants you need, should you require more than just an empty string.
public static class StringConstants{
public static String Empty = "";
public static String EMail = "mailto:%s";
public static String http = "http://%s";
public static String https = "https://%s";
public static String LogEntry = "TimeStamp:%tYmdHMSL | LogLevel:%s| Type:%s | Message: '%s'";
}
String myVariable = StringConstants.Empty;
You may even be able to extend the native String object, depending on your language.
If you ever wish to store "empty" strings in a nullable string column in Oracle, you will have to change the definition of EMPTY_STRING to be something other than ""! (I recall from the last time I was forced to use Oracle that it does not know the difference between an empty string and a null string).
However this should be done in your data access layer so the rest of the app does not know about it, and/or sort out your data model so you don’t need to store empty string AND null strings in the same column.
Or simply just have it as string.IsNullOrEmpty(otherString)
