I was asked this question today in an interview.
Can someone explain the right answer to me?
Here is the code.
String s1= "hellow";
String s2= "Hellow again";
System.out.println(s1+s2);
How many strings are created in the above code?
I think it will be 3. Any suggestions?
The string literals "hellow" and "Hellow again" are in the string pool.
Now when we concatenate with s1 + s2, what really happens is the following:
new StringBuilder(s1).append(s2).toString()
which itself creates a new String (see for yourself). So it depends what you mean by "create"; if you're asking how many String objects exist at the very end of the snippet, then the answer is 3. But note that the string produced by s1+s2 is not retained and is therefore eligible to be garbage collected after it is printed.
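A minimal sketch of the distinction above, assuming a standard JVM. The class and method names (StringCountDemo, concat) are made up for illustration. It checks that the literals come from the pool, while a run-time concatenation yields a separate object with the same characters.

```java
class StringCountDemo {
    // Hypothetical helper: the operands are method parameters, not
    // compile-time constants, so the concatenation happens at run time.
    static String concat(String a, String b) {
        return a + b;
    }

    public static void main(String[] args) {
        String s1 = "hellow";
        String s2 = "Hellow again";
        String joined = concat(s1, s2);

        // Identical literals refer to the same pooled object.
        System.out.println(s1 == "hellow");                      // true
        // The run-time result has the same characters as the literal...
        System.out.println(joined.equals("hellowHellow again")); // true
        // ...but it is a different object from the pooled literal.
        System.out.println(joined == "hellowHellow again");      // false
    }
}
```

Comparing with == is used here only to observe object identity; ordinary code should compare strings with equals.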
This answer is essentially an extension of arshajii's answer
The answer all depends on what your interviewer(s) meant by "create", and technically also depends on what other code is present.
If you disassemble the bytecode generated by just that snippet, you get this:
public static void main(java.lang.String[]);
Code:
0: ldc #2 // String hellow
2: astore_1
3: ldc #3 // String Hellow again
5: astore_2
6: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
9: new #5 // class java/lang/StringBuilder
12: dup
13: invokespecial #6 // Method java/lang/StringBuilder."<init>":()V
16: aload_1
17: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
20: aload_2
21: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
24: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
27: invokevirtual #9 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
30: return
Because "hellow" and "Hellow again" are string literals in the source code, they get placed into the constant pool at compile time, and so are present at program startup. As you can see, the strings "hellow" and "Hellow again" are simply loaded (ldc == load constant). They are not created by the above code snippet. The only String that's created is the one from the StringBuilder.
Now, if you declare the two variables final, you get this:
public static void main(java.lang.String[]);
Code:
0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #3 // String hellowHellow again
5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
Based on this, you can also argue that no String objects are created by the above code snippet, as the compiler can optimize this statement. This is probably not the answer that the interviewers were looking for, since it depends on final being present.
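The effect of final can also be observed without a disassembler. The following is a small sketch (the class and method names are invented for this example): when both operands are final variables initialised with literals, the concatenation is a compile-time constant and refers to the pooled literal, whereas without final it is computed at run time.

```java
class FinalConcatDemo {
    static boolean foldedWithFinal() {
        final String s1 = "hellow";
        final String s2 = "Hellow again";
        // Both operands are constant variables, so s1 + s2 is a
        // compile-time constant and refers to the pooled literal.
        return (s1 + s2) == "hellowHellow again";
    }

    static boolean foldedWithoutFinal() {
        String s1 = "hellow";
        String s2 = "Hellow again";
        // Not a constant expression: a new String is built at run time.
        return (s1 + s2) == "hellowHellow again";
    }

    public static void main(String[] args) {
        System.out.println(foldedWithFinal());    // true
        System.out.println(foldedWithoutFinal()); // false
    }
}
```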
Strings are immutable so s1+s2 creates a new String instance. If you want to avoid this you should use a StringBuffer.
Java evaluates a method's argument expressions before it invokes the method. The concatenation inside the parentheses therefore creates a new String first, and that String is then written to standard output.
My answer: 3
The answer is implementation dependent, and it also depends on the context ... and what you mean by creation.
The execution of those lines of code creates one String object for the concatenation, and possibly others within the println call [1]. These creations would happen each time your application executes those lines.
At least a further two String objects will be created when the code is loaded and String objects are created for the String literals. However, that is a once off ...
[1] In some class libraries, println(s) is implemented as print(s + newline). Hence the println call may create another String object.
And the right answer for an interview question is probably not "3" but some (but not necessarily all) of the discussion above. They'd definitely want you to know that + creates a string; probably want some awareness that the constants s1 and s2 are created/exist as compile-time constants; and may be suitably impressed if you knew about the effect of adding final. Such questions are often not about getting the right answer, but thinking about them in the right way.
3 is the answer.
But use a StringBuilder (StringBuffer is just the thread-safe version of StringBuilder, and that thread safety is rarely useful).
You should use the StringBuilder.append method, instead of the + operator.
Related
I used the following simple logic to answer a question like this one:
if (a)      // 1 operation
    if (b)  // 1 operation
and
if (a && b) // 1, 1(&&), 1 => 3 operations
So, 2 operations versus 3, but in the first example the compiler needs to call another instruction to be executed.
Is this logic correct?
Does it depend on the compiler?
Does an empty statement, such as a lone ;, cost a noticeable amount of time?
This also discusses the same problem, but without considering this logic.
Please help us to clarify this issue.
There are two methods to answer such a question precisely:
1.) Look at the IL code and/or the assembly code produced and count the CPU cycles needed to execute it (hint: this is not for beginners)
2.) Build a small test program which executes both variants a large number of times, use StopWatch() to create useful and readable timing output, and run it several times.
3.) Speculate about what you think the optimization step of the compiler is able to do and what this software will do, argue with others for hours
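The second method can be sketched in Java roughly as follows. This is a naive harness under stated assumptions: it uses System.nanoTime instead of the StopWatch() mentioned above, the class name IfTiming and the test data are invented, and the timings it prints are only indicative. A dedicated benchmarking harness such as JMH would give more trustworthy numbers.

```java
class IfTiming {
    // Variant 1: nested ifs.
    static int nested(boolean[] a, boolean[] b) {
        int hits = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i])
                if (b[i])
                    hits++;
        }
        return hits;
    }

    // Variant 2: a single condition joined with &&.
    static int combined(boolean[] a, boolean[] b) {
        int hits = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] && b[i])
                hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        boolean[] a = new boolean[n];
        boolean[] b = new boolean[n];
        for (int i = 0; i < n; i++) {
            a[i] = (i % 2 == 0);
            b[i] = (i % 3 == 0);
        }
        // Repeat the measurement so the JIT compiler has a chance to warm up.
        for (int run = 0; run < 5; run++) {
            long t0 = System.nanoTime();
            int r1 = nested(a, b);
            long t1 = System.nanoTime();
            int r2 = combined(a, b);
            long t2 = System.nanoTime();
            System.out.printf("run %d: nested=%d ns, combined=%d ns (hits %d/%d)%n",
                    run, t1 - t0, t2 - t1, r1, r2);
        }
    }
}
```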
I assumed the compiler would produce the same byte code for your two cases. So I tested this with two different source files:
public class Test1 {
public static void main(String[] args) {
if (args[0].equals("a"))
if (args[1].equals("b"))
System.out.println("Foo");
}
}
and...
public class Test2 {
public static void main(String[] args) {
if (args[0].equals("a") && args[1].equals("b"))
System.out.println("Foo");
}
}
Inspecting their byte code with javap -c Test1 etc., the results are identical:
public static void main(java.lang.String[]);
Code:
0: aload_0
1: iconst_0
2: aaload
3: ldc #2 // String a
5: invokevirtual #3 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
8: ifeq 30
11: aload_0
12: iconst_1
13: aaload
14: ldc #4 // String b
16: invokevirtual #3 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
19: ifeq 30
22: getstatic #5 // Field java/lang/System.out:Ljava/io/PrintStream;
25: ldc #6 // String Foo
27: invokevirtual #7 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
30: return
Consequently, the performance would be identical, although I welcome comments if anyone can think of an example where different byte code is produced.
My results are using Oracle's javac from Java 1.7. Results could be different with other compilers, although I suspect they won't be for this case.
There are two ways to think about your question:
the Java language definition
the language defines that the 2nd example uses short-circuit evaluation, which means that the right-hand condition is evaluated only if the left-hand one is true; any code inside the second condition is skipped otherwise
the JVM optimization
the JVM runtime will remove the dead code blocks if it can prove they don't have any side effects
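Short-circuit evaluation can be made visible by counting how often a condition is evaluated. In this sketch the names ShortCircuitDemo and check are invented for illustration. It suggests that the nested form and the && form evaluate the same number of conditions.

```java
class ShortCircuitDemo {
    static int calls = 0;

    // Hypothetical condition that records how often it is evaluated.
    static boolean check(boolean value) {
        calls++;
        return value;
    }

    static int callsWithNestedIf(boolean a, boolean b) {
        calls = 0;
        if (check(a))
            if (check(b))
                System.out.println("both true");
        return calls;
    }

    static int callsWithAnd(boolean a, boolean b) {
        calls = 0;
        if (check(a) && check(b))
            System.out.println("both true");
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWithNestedIf(false, true)); // 1
        System.out.println(callsWithAnd(false, true));      // 1
        System.out.println(callsWithAnd(true, false));      // 2
    }
}
```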
Is there any difference between options 1 and 2 in terms of concatenation if I do it in an instance method? I mean, in either case only one object will ultimately be constructed, i.e. "abc". The only difference I see is that test will stay in PermGen space even after the thread leaves the instance method, whereas x will be garbage collected once the thread is out of the method; but the number of objects constructed will be the same. Right?
// option 1
String test="a"+"b"+"c";
// option 2
String x = new StringBuffer().append("a").append("b").append("c").toString();
I referred to http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuffer.html to reach this conclusion.
First notice that the documentation you have linked is very old. Notice it's for Java 1.4.2.
J2SE 1.4.2 is in its Java Technology End of Life (EOL) transition period. The EOL transition period began Dec, 11 2006 and will complete October 30th, 2008, when J2SE 1.4.2 will have reached its End of Service Life (EOSL).
In newer versions of the documentation this statement has been removed. However another statement has been added that you should be aware of:
As of release JDK 5, this class has been supplemented with an equivalent class designed for use by a single thread, StringBuilder. The StringBuilder class should generally be used in preference to this one, as it supports all of the same operations but it is faster, as it performs no synchronization.
Secondly notice that the documentation you refer to has this code:
x = "a" + 4 + "c";
The 4 there isn't just a typo. Your example is different because the compiler will convert the code to use just a single string literal. These two lines are the same:
x = "a" + "b" + "c";
x = "abc";
The string literal will be interned.
But in the general case where the compiler cannot just use a single string literal, the compiler will transform the first version into the second, except it will use StringBuilder instead because it is more efficient.
First of all, use StringBuilder instead of StringBuffer. StringBuffer is not formally deprecated, but StringBuilder is the recommended choice when no synchronization is needed.
As for your question, nowadays it doesn't really matter: the compiler automatically transforms String concatenation into StringBuilder calls.
There are only two cases where using it explicitly makes sense. The first is better code readability (for example, if you are building long Strings such as SQL queries). The second is when you concatenate Strings in a loop: the compiler creates a new StringBuilder instance on every pass through the loop, so be careful about that.
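The loop case can be sketched as follows; the class and method names are invented for the example. Both methods return the same text, but the first builds a fresh String on every iteration, while the second reuses one builder.

```java
class LoopConcatDemo {
    // Each iteration builds a new String from the previous one.
    static String withPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // A single builder is reused across all iterations.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus(5));    // 01234
        System.out.println(withBuilder(5)); // 01234
    }
}
```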
First of all, StringBuilder is to StringBuffer what ArrayList is to Vector: it should be preferred because it's not synchronized.
Your first String is entirely constructed at compilation time, and is stored as a String literal. This literal is interned inside a pool, and the test variable always points to the same String instance.
Your second snippet dynamically concatenates, at runtime, three String literals. It returns a new String instance each time it's called.
Looking at the bytecode generated by the 2 examples, the first string is transformed into the "abc" string literal, whereas the second calls StringBuffer methods. You can actually test it with System.out.println(test == "abc");, which prints true.
0: ldc #2 // String abc
2: astore_1
3: new #3 // class java/lang/StringBuffer
6: dup
7: invokespecial #4 // Method java/lang/StringBuffer."<init>":()V
10: ldc #5 // String a
12: invokevirtual #6 // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
15: ldc #7 // String b
17: invokevirtual #6 // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
20: ldc #8 // String c
22: invokevirtual #6 // Method java/lang/StringBuffer.append:(Ljava/lang/String;)Ljava/lang/StringBuffer;
25: invokevirtual #9 // Method java/lang/StringBuffer.toString:()Ljava/lang/String;
28: astore_2
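The difference shown in the bytecode can also be checked from Java code. A short sketch, with invented names (LiteralVsBufferDemo, option1, option2) that mirror the two options in the question:

```java
class LiteralVsBufferDemo {
    static String option1() {
        // Constant expression: the compiler folds it into the literal "abc".
        return "a" + "b" + "c";
    }

    static String option2() {
        // Built at run time: every call produces a new String object.
        return new StringBuffer().append("a").append("b").append("c").toString();
    }

    public static void main(String[] args) {
        System.out.println(option1() == "abc");      // true
        System.out.println(option2() == "abc");      // false
        System.out.println(option2().equals("abc")); // true
        System.out.println(option1() == option1());  // true
        System.out.println(option2() == option2());  // false
    }
}
```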
In this specific case, where you're concatenating three string literals at compile time, the compiler will generate code just as if you'd typed:
String test="abc";
thus avoiding any intermediate objects altogether.
I think both are the same in terms of memory usage.
I am working on Java code optimization. I'm unclear about the difference between String.valueOf and concatenating with + "":
int intVar = 1;
String strVar = intVar + "";
String strVar = String.valueOf(intVar);
What is the difference between line 2 and 3?
public void foo(){
int intVar = 5;
String strVar = intVar+"";
}
This approach uses StringBuilder to create resultant String
public void foo();
Code:
0: iconst_5
1: istore_1
2: new #2; //class java/lang/StringBuilder
5: dup
6: invokespecial #3; //Method java/lang/StringBuilder."<init>":()V
9: iload_1
10: invokevirtual #4; //Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder;
13: ldc #5; //String
15: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
18: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
21: astore_2
22: return
public void bar(){
int intVar = 5;
String strVar = String.valueOf(intVar);
}
This approach invokes simply a static method of String to get the String version of int
public void bar();
Code:
0: iconst_5
1: istore_1
2: iload_1
3: invokestatic #8; //Method java/lang/String.valueOf:(I)Ljava/lang/String;
6: astore_2
7: return
which in turn calls Integer.toString()
Ask yourself the purpose of the code. Is it to:
Concatenate an empty string with a value
Convert a value to a string
It sounds much more like the latter to me... which is why I'd use String.valueOf. Whenever you can make your code read in the same way as you'd describe what you want to achieve, that's a good thing.
Note that this works for all types, and will return "null" when passed a null reference rather than throwing a NullPointerException. If you're using a class (not an int as in this example) and you want it to throw an exception if it's null (e.g. because that represents a bug), call toString on the reference instead.
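The null behaviour described above can be sketched like this; the class and method names are made up for illustration. Note that the argument is typed as Object so that the String.valueOf(Object) overload is the one being called.

```java
class NullToStringDemo {
    static String viaValueOf(Object o) {
        return String.valueOf(o); // yields "null" when o is null
    }

    static String viaToString(Object o) {
        return o.toString(); // throws NullPointerException when o is null
    }

    public static void main(String[] args) {
        Object missing = null;
        System.out.println(viaValueOf(missing)); // null
        try {
            viaToString(missing);
        } catch (NullPointerException e) {
            System.out.println("toString() threw NullPointerException");
        }
        System.out.println(viaValueOf(42)); // 42
    }
}
```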
Using String.valueOf(int), or better, Integer.toString(int) is relatively more efficient for the machine. However, unless performance is critical (in which case I wouldn't suggest you use either), ""+ x is a much more efficient use of your time. IMHO, this is usually more important. Sometimes massively more important.
In other words, ""+ wastes an object, but Integer.toString() creates several anyway. Either your time is more important or you want to avoid creating objects at all costs. You are highly unlikely to be in the position that creating several objects is fine, but creating one more is not.
I'd prefer valueOf(), because I think it's more readable and explicit.
Any concerns about performance are micro-optimizations that wouldn't be measurable. I wouldn't worry about them until I could take a measurement and see that they made a difference.
Well, if you look into the JRE source code, Integer.getChars(...) is the most vital method which actually does the conversion from integer to char[], but it's a package-private method.
So the question is how to get this method called with minimum overhead.
The following is an overview of the 3 approaches, tracing the calls to our target method. Please look into the JRE source code to understand this better.
"" + intVar compiles to :
new StringBuilder() => StringBuilder.append(int) => Integer.getChars(...)
String.valueOf(intVar) => Integer.toString(intVar) => Integer.getChars(...)
Integer.toString(intVar) => Integer.getChars(...)
The first method unnecessarily creates one extra object i.e. the StringBuilder.
The second simply delegates to third method.
So you have the answer now.
PS: Various compile time and runtime optimizations come into play here. So actual performance benchmarks may say something else depending on different JVM implementations which we can't predict, so I generally prefer the approach which looks efficient by looking at the source code.
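For reference, the three approaches side by side. This sketch (with an invented class name) only confirms that they produce the same text; it says nothing about their relative speed.

```java
class IntToStringDemo {
    static String viaConcat(int value) {
        return "" + value;
    }

    static String viaValueOf(int value) {
        return String.valueOf(value);
    }

    static String viaToString(int value) {
        return Integer.toString(value);
    }

    public static void main(String[] args) {
        int intVar = 5;
        System.out.println(viaConcat(intVar));   // 5
        System.out.println(viaValueOf(intVar));  // 5
        System.out.println(viaToString(intVar)); // 5
    }
}
```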
The first line is equivalent to
String strVal = String.valueOf(intVar) + "";
so that there is some extra (and pointless) work to do. Not sure if the compiler optimizes away concatenations with empty string literals. If it does not (and looking at #Jigar's answer it apparently does not), this will in turn become
String strVal = new StringBuilder().append(String.valueOf(intVar))
.append("").toString();
So you should really be using String.valueOf directly.
From the point of view of optimization, I will always prefer String.valueOf() of the two. The first one is just a hack that tricks the compiler into converting intVar to a String because of the + operator.
Even though answers here are correct in general, there's one point that is not mentioned.
"" + intVar has better performance compared to String.valueOf() or Integer.toString(). So, if performance is critical, it's better to use empty string concatenation.
See this talk by Aleksey Shipilëv. Or these slides of the same talk (slide #24)
Concatenating Strings and other variables actually uses String.valueOf() (and StringBuilder) underneath, so the compiler will hopefully discard the empty String and produce the same bytecodes in both cases.
String strVar1 = intVar+"";
String strVar2 = String.valueOf(intVar);
strVar1 is equivalent to strVar2, but using int + "" is not an elegant way to do it.
Using valueOf is more effective.
I saw this question a few minutes ago, and decided to take a look in the java String class to check if there was some overloading for the + operator.
I couldn't find anything, but I know I can do this
String ab = "ab";
String cd = "cd";
String both = ab + cd; //both = "abcd"
Where's that implemented?
From the Fine Manual:
The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuilder(or StringBuffer) class and its append method. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java. For additional information on string concatenation and conversion, see Gosling, Joy, and Steele, The Java Language Specification.
See String Concatenation in the JLS.
The compiler treats your code as if you had written something like:
String both = new StringBuilder().append(ab).append(cd).toString();
Edit: Any reference? Well, if I compile and decompile the OP's code, I get this:
0: ldc #2; //String ab
2: astore_1
3: ldc #3; //String cd
5: astore_2
6: new #4; //class java/lang/StringBuilder
9: dup
10: invokespecial #5; //Method java/lang/StringBuilder."<init>":()V
13: aload_1
14: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
17: aload_2
18: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
21: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
24: astore_3
25: return
So, it's like I said.
Most of the answers here are correct (it's handled by the compiler, + is converted to .append()...)
I wanted to add that everyone should take a look at the source code for String and append at some point, it's pretty impressive.
I believe it came down to something like:
"a"+"b"+"c"
=
new StringBuilder().append("a").append("b").append("c")
But then some magic happens. This turns into:
Create a character array of length 3
copy a into the first position.
copy b into the second
copy c into the third
Whereas most people believe that it will create a 2 character array with "ab", and then throw it away when it creates a three character array with "abc". It actually understands that it's being chained and does some manipulation outside what you would assume if these were simple library calls.
There is also a trick where if you have the string "abc" and you ask for a substring that turns out to be "bc", they CAN share the exact same underlying array. You'll notice that there is a start position, end position and "shared" flag.
In fact, if it's not shared, it's possible for it to extend the length of a string array and copy the new characters in when appending.
Now I'm just being confusing. Read the source code--it's fairly cool.
Very Late Edit:
The part about sharing the underlying array isn't quite true any more. They had to de-optimize String a little because people were downloading giant strings, taking a tiny sub-string and keeping it. This was holding the entire underlying array in storage, it couldn't be GC'd until all sub-references were dropped.
It is handled by the compiler.
This is special behavior documented in the language specification.
15.18.1 String Concatenation Operator +
If only one operand expression is of type String, then string conversion is performed on the other operand to produce a string at run time. The result is a reference to a String object (newly created, unless the expression is a compile-time constant expression (§15.28)) that is the concatenation of the two operand strings. The characters of the left-hand operand precede the characters of the right-hand operand in the newly created string. If an operand of type String is null, then the string "null" is used instead of that operand.
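A few of the rules quoted above, sketched in code (the class and method names are invented). The last two methods also rely on + being evaluated left to right, so whether a numeric addition or a string conversion happens depends on the position of the String operand.

```java
class ConcatRulesDemo {
    static String withNull() {
        String missing = null;
        // A null String operand is replaced by the text "null".
        return "value: " + missing;
    }

    static String numbersFirst() {
        int a = 1, b = 2;
        // a + b is numeric addition; the sum is then converted to a string.
        return a + b + "a";
    }

    static String stringFirst() {
        int a = 1, b = 2;
        // "a" + a is already a String, so b is appended as text as well.
        return "a" + a + b;
    }

    public static void main(String[] args) {
        System.out.println(withNull());     // value: null
        System.out.println(numbersFirst()); // 3a
        System.out.println(stringFirst());  // a12
    }
}
```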
It's done at the language level. The Java Language Specification is very specific about what string addition must do.
String is handled as a built-in type at the compiler level, much like int, double, float, etc. In effect, every compiler implements operator overloading for its built-in types; Java simply does not make operator overloading available to developers (unlike C++).
Interestingly enough: This question was logged as a bug: http://bugs.sun.com/view_bug.do?bug_id=4905919
We have a huge code base and we suspect that there are quite a few "+" based string concats in the code that might benefit from the use of StringBuilder/StringBuffer. Is there an effective way or existing tools to search for these, especially in Eclipse?
A search by "+" isn't a good idea since there's a lot of math in the code, so this needs to be something that actually analyzes the code and types to figure out which additions involve strings.
I'm pretty sure FindBugs can detect these. If not, it's still extremely useful to have around.
Edit: It can indeed find concatenations in a loop, which is the only time it really makes a difference.
Just make sure you really understand where it's actually better to use StringBuilder. I'm not saying you don't know, but there are certainly plenty of people who would take code like this:
String foo = "Your age is: " + getAge();
and turn it into:
StringBuilder builder = new StringBuilder("Your age is: ");
builder.append(getAge());
String foo = builder.toString();
which is just a less readable version of the same thing. Often the naive solution is the best solution. Likewise some people worry about:
String x = "long line" +
"another long line";
when actually that concatenation is performed at compile-time.
As nsander's quite rightly said, find out if you've got a problem first...
Why not use a profiler to find the "naive" string concatenations that actually matter? Only switch over to the more verbose StringBuffer if you actually need it.
Chances are you will make your performance worse and your code less readable. The compiler already makes this optimization, and unless you are in a loop, it will generally do a better job. Furthermore, in JDK 8 they may come out with StringUberBuilder, and all your code which uses StringBuilder will run slower, while the "+" concatenated strings will benefit from the new class.
“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.” - Donald Knuth
IntelliJ can find these using "structural search". You search for "$a + $b" and set the characteristics of both $a and $b as type java.lang.String.
However, if you have IntelliJ, it likely has a built in inspection that will do a better job of finding what you want anyway.
I suggest using a profiler. This is really a performance question and if you can't make the code show up with reasonable test data there is unlikely to be any value in changing it.
Jon Skeet (as always) and the others have already said all that is needed but I would really like to emphasize that maybe you are hunting for a non existing performance improvement...
Take a look at this code:
public class StringBuilding {
public static void main(String args[]) {
String a = "The first part";
String b = "The second part";
String res = a+b;
System.gc(); // Inserted to make it easier to see "before" and "after" below
res = new StringBuilder().append(a).append(b).toString();
}
}
If you compile it and disassemble it with javap, this is what you get.
public static void main(java.lang.String[]);
Code:
0: ldc #2; //String The first part
2: astore_1
3: ldc #3; //String The second part
5: astore_2
6: new #4; //class java/lang/StringBuilder
9: dup
10: invokespecial #5; //Method java/lang/StringBuilder."<init>":()V
13: aload_1
14: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
17: aload_2
18: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
21: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
24: astore_3
25: invokestatic #8; //Method java/lang/System.gc:()V
28: new #4; //class java/lang/StringBuilder
31: dup
32: invokespecial #5; //Method java/lang/StringBuilder."<init>":()V
35: aload_1
36: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
39: aload_2
40: invokevirtual #6; //Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
43: invokevirtual #7; //Method java/lang/StringBuilder.toString:()Ljava/lang/String;
46: astore_3
47: return
As you can see, 6-21 are pretty much identical to 28-43. Not much of an optimization, right?
Edit: The loop issue is valid though...
Instead of searching for just a +, search for "+ and +" (a plus sign next to a quote); those will probably find the vast majority. Cases where you are concatenating multiple variables will be tougher.
If you have a huge code base you probably have lots of hotspots, which may or may not involve "+" concatenation. Just run your usual profiler, and fix the big ones, regardless of what kind of construct they are.
It would be an odd approach to fix just one class of (potential) bottleneck, rather than fixing the actual bottlenecks.
With PMD, you can write rules with XPath or using a Java syntax. It might be worth investigating whether it can match the string concatenation operator—it certainly seems within the purview of static analysis. This is such a vague idea, I'm going to make this "community wiki"; if anyone else wants to elaborate (or create their own answer along these lines), please do!
Forget it - your JVM most likely does it already - see the JLS, 15.18.1.2 Optimization of String Concatenation:
An implementation may choose to perform conversion and concatenation in one step to avoid creating and then discarding an intermediate String object. To increase the performance of repeated string concatenation, a Java compiler may use the StringBuffer class or a similar technique to reduce the number of intermediate String objects that are created by evaluation of an expression.