Can I get some concrete explanation of memory and runtime overhead with the below two statements?
String CONST = "string constant";
StringBuilder sb1 = new StringBuilder();
sb1.append(CONST);
StringBuilder sb2 = new StringBuilder();
sb2.append("string constant");
Does the second version create a String object and add it to the string pool?
Is there any scenario (consider many string appends as well) where one can be justified as better than the other?
There is no difference in memory or runtime overhead between these two versions.
Use whichever seems more readable or maintainable. If you're reusing the same string constant in many places, the constant is long, or might change, then pulling out a constant might be appropriate.
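To make the "no difference" claim a bit more concrete, here is a minimal sketch (the class and method names are made up for illustration, and CONST is declared static final here, which the question does not show). Identical string literals are interned, so the named constant and the inline literal refer to the same String object, and both builders end up holding the same characters.

```java
// Hypothetical demo class; all names here are illustrative only.
class ConstantVsLiteral {
    static final String CONST = "string constant";

    // Version 1: append a named constant.
    static String viaConstant() {
        StringBuilder sb1 = new StringBuilder();
        sb1.append(CONST);
        return sb1.toString();
    }

    // Version 2: append the literal directly.
    static String viaLiteral() {
        StringBuilder sb2 = new StringBuilder();
        sb2.append("string constant");
        return sb2.toString();
    }

    public static void main(String[] args) {
        // Identity comparison on purpose: both names point at one interned String.
        System.out.println(CONST == "string constant");
        // The two builders produce equal content.
        System.out.println(viaConstant().equals(viaLiteral()));
    }
}
```

Both lines print true, so the second version does not create an extra String for the literal.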
In reference to the runtime overhead, running a simulation of both methods yielded almost identical results.
My tests were done with 10,000,000,000 iterations and the runtime was:
Method 1 - 95109ms (~9.5ns average)
Method 2 - 95002ms (~9.5ns average)
So definitely no noticeable difference in performance.
Therefore, as @LouisWasserman said in their answer, just use the one that keeps your code clean and legible.
What is the most efficient way to use an instance of DecimalFormat together with a StringBuilder? When numbers are appended to a string in a loop, for example. There is format(long number, StringBuffer toAppendTo, FieldPosition pos), but that uses a StringBuffer and not a StringBuilder, so it is not compatible, and there is also formatToCharacterIterator(Object obj) but that both must create an object for the iterator and does not work with primitive types, so it also requires additional potential boxing.
It seems to me calling format(long number) to produce a string and append it to the StringBuilder is the easiest option, but having to create a string just to append it seems to be kinda defeating the purpose of the StringBuilder. Is there really no other option?
Edit: I have decided to do some measurements to see the performance difference between these options. Based on the OpenJDK implementation, it seems all methods eventually get routed to either format(long, StringBuffer, FieldPosition) or format(double, StringBuffer, FieldPosition) (with the exception of large BigInteger and BigDecimal), so it would seem when appending just numbers, this way will always be faster with StringBuffer.
And indeed, directly using StringBuffer is about 20 % faster on my machine than via StringBuilder and intermediate string. However, the opposite is true when no number formatting is done and only strings are appended – then StringBuffer is 20 % slower. But considering formatting a number is about 5 times slower than simply appending a string, StringBuilder seems to be only ever more efficient when there are significantly more appends than formats.
Unless you can ascertain that the additional String and StringBuffer creation is an actual real-world bottleneck in your application, I suspect you are worrying unnecessarily. It's true that if we could re-invent the history of the JDK, these calls probably would have been defined as taking Appendable. But the authors may well have considered adding a corresponding method when StringBuilder and Appendable were added in Java 5 and decided it was not worth it.
Remember that in modern JVMs, the creation and disposal of temporary objects doesn't have the performance overhead that it once used to. In addition, various flavours of NumberFormat-- including DecimalFormat-- actually provide an internal 'fast path' (see the fastFormat() method) that avoids the StringBuffer creation you mention (though not the temporary String creation). The JVM also generally optimises for uncontended synchronisation as you will have here.
If a profile genuinely shows that DecimalFormat.format() is a bottleneck in your application, then there is a chance you may need to consider implementing a specific method to include the optimisations you require. But I suspect you'll find that it's not actually an issue.
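For reference, the two options discussed in the question can be sketched as follows. This is only an illustration under my own assumptions: the class name, the "#,##0" pattern and the fixed US locale are not from the question. Both methods use calls that exist on DecimalFormat, and they produce the same text; which one is faster would have to be measured for your workload.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.FieldPosition;
import java.util.Locale;

// Hypothetical demo class; pattern and locale are assumptions for the example.
class DecimalFormatAppend {
    // Fixed locale so the grouping separator is predictable.
    static DecimalFormat newFormat() {
        return new DecimalFormat("#,##0", DecimalFormatSymbols.getInstance(Locale.US));
    }

    // Option 1: format to a temporary String, then append it to a StringBuilder.
    static String viaStringBuilder(long[] numbers) {
        DecimalFormat df = newFormat();
        StringBuilder sb = new StringBuilder();
        for (long n : numbers) {
            sb.append(df.format(n)).append(' ');
        }
        return sb.toString();
    }

    // Option 2: let DecimalFormat write directly into a StringBuffer.
    static String viaStringBuffer(long[] numbers) {
        DecimalFormat df = newFormat();
        StringBuffer buf = new StringBuffer();
        FieldPosition pos = new FieldPosition(0); // field tracking is not needed here
        for (long n : numbers) {
            df.format(n, buf, pos).append(' ');
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        long[] numbers = {1, 1234, 1234567};
        System.out.println(viaStringBuilder(numbers));
        System.out.println(viaStringBuffer(numbers));
    }
}
```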
Problem
I wrote 2 programs, one in Delphi and one in Java, for string concatenation and I noticed a much faster string concatenation in Delphi compared to Java.
Java
String str = new String();
long t0 = System.nanoTime();
for (int i = 0; i < 50000; i++)
str += "abc";
long t1 = System.nanoTime();
System.out.println("String + String needed " + (t1 - t0) / 1000000 + "ms");
Delphi
Stopwatch.Start;
for i := 1 to 50000 do
str := str + 'abc';
Stopwatch.Stop;
ShowMessage('Time in ms: ' + IntToStr(Stopwatch.ElapsedMilliseconds));
Question
Both measure the time in milliseconds, but the Delphi program is much faster: 1 ms vs. Java's 2 seconds. Why is string concatenation so much faster in Delphi?
Edit: Looking back at this question with more experience I should have come to the conclusion that the main difference comes from Delphi being compiled and Java being compiled and then run in the JVM.
TLDR
There may be other factors, but certainly a big contributor is likely to be Delphi's default memory manager. It's designed to be a little wasteful of space in order to reduce how often memory is reallocated.
Considering memory manager overhead
When you have a straight-forward memory manager (you might even call it 'naive'), your loop concatenating strings would actually be more like:
//pseudo-code
for I := 1 to 50000 do
begin
  if CanReallocInPlace(Str) then
    //Great when True; but this might not always be possible.
    ReallocMem(Str, Length(Str) + Length(Txt))
  else
  begin
    AllocMem(NewStr, Length(Str) + Length(Txt));
    Copy(Str, NewStr, Length(Str));
    FreeMem(Str);
    //The new block now takes the place of the old string.
    Str := NewStr;
  end;
  //Append the new text at the end of the enlarged block.
  Copy(Txt, Str[Length(Str)-Length(Txt)], Length(Txt))
end;
Notice that on every iteration you increase the allocation. And if you're unlucky, you very often have to:
Allocate memory in a new location
Copy the existing 'string so far'
Finally release the old string
Delphi (and FastMM)
However, Delphi has switched from the default memory manager used in its early days to a previously 3rd party one (FastMM) that's designed to run faster, primarily by:
(1) Using a sub-allocator i.e. getting memory from the OS a 'large' page at a time.
Then performing allocations from the page until it runs out.
And only then getting another page from the OS.
(2) Aggressively allocating more memory than requested (anticipating small growth).
Then it becomes more likely that a slightly larger request can be reallocated in-place.
These techniques can (though it's not guaranteed) increase performance.
But it definitely does waste space. (And with unlucky fragmentation, the wastage can be quite severe.)
Conclusion
Certainly the simple app you wrote to demonstrate the performance greatly benefits from the new memory manager. You run through a loop that incrementally reallocates the string on every iteration. Hopefully with as many in-place allocations as possible.
You could attempt to circumvent some of FastMM's performance improvements by forcing additional allocations in the loop. (Though sub-allocation of pages would still be in effect.)
So simplest would be to try an older Delphi compiler (such as D5) to demonstrate the point.
FWIW: String Builders
You said you "don't want to use the String Builder". However, I'd like to point out that a string builder obtains similar benefits. Specifically (if implemented as intended): a string builder wouldn't need to reallocate the substrings all the time. When it comes time to finally build the string, the correct amount of memory can be allocated in a single step, and all portions of the 'built string' copied to where they belong.
In Java (and C#) strings are immutable objects. That means that if you have:
string s = "String 1";
then the compiler allocates memory for this string. Having then
s = s + " String 2"
gives us "String 1 String 2" as expected, but because of the immutability of strings, a new string is allocated with exactly the size needed to contain "String 1 String 2", and the content of both strings is copied to the new location. The original strings are then reclaimed by the garbage collector. In Delphi a string is more "copy-on-write" and reference counted, which is much faster.
C# and Java have the class StringBuilder, which behaves a lot like Delphi strings and is considerably faster when modifying and manipulating strings.
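As a rough illustration, the Java loop from the question can be put next to a StringBuilder version like this. The class and method names are my own, and the timing is as naive as in the question (no JIT warm-up, single run), so treat the numbers as indicative only.

```java
// Hypothetical benchmark sketch; not a rigorous measurement.
class ConcatBenchmark {
    // Same approach as in the question: every += builds a brand new String.
    static String withPlus(int n) {
        String str = "";
        for (int i = 0; i < n; i++) {
            str += "abc"; // copies everything accumulated so far
        }
        return str;
    }

    // StringBuilder appends into a growable internal buffer instead.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append("abc");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 50000;
        long t0 = System.nanoTime();
        String a = withPlus(n);
        long t1 = System.nanoTime();
        String b = withBuilder(n);
        long t2 = System.nanoTime();
        System.out.println("String += needed " + (t1 - t0) / 1_000_000 + "ms");
        System.out.println("StringBuilder needed " + (t2 - t1) / 1_000_000 + "ms");
        System.out.println("Same result: " + a.equals(b));
    }
}
```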
I've been in the habit of doing:
int num = 12;
String text = ""+12;
for a long time, but I've found that to be a very inefficient mechanism for a large number of additions.
For those cases I generally do something like:
// this is pseudo code here..
FileInputStream fis = new FileInputStream(fis);
StringBuilder builder = new StringBuilder();
while(input.hasNext()) {
builder.append(input.nextString());
}
My question is: when coding for Android (vs. the general Java case), is the performance trade-off in the small case worth using StringBuilder, or are there any other reasons to prefer StringBuilder in these small cases? It seems like it's a lot of extra typing in the simple case presented above. I also suspect (though I have not confirmed) that the memory allocations in the simple case are probably not worth it.
Edit: Suppose that the values being appended aren't known at compile time, i.e. they aren't constants.
Also, the example above of ""+12 is a poorly chosen example. Suppose it was
String userGeneratedText = textInput.getText().toString();
int someVal = intInput.getInt();
String finalVal = userGeneratedText+someVal;
If your code is as short as what you've shown here:
String text = "foo" + 12;
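The compiler will automatically rewrite the concatenation for you. When the operands are not all compile-time constants, javac targeting Java 8 or earlier emits the equivalent of the following (an all-literal expression such as the one above is simply folded into a single constant):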
String text = new StringBuilder().append("foo").append(12).toString();
So don't worry about the inefficiency of this code, because it will work better than you expect.
For cases when you need to append very large Strings, or you don't know how many objects (Strings, ints, booleans, etc.) you will concatenate, use a StringBuilder as you do in your second code sample.
Here's a more in depth explanation about how the String concatenation works: http://blog.eyallupu.com/2010/09/under-hood-of-java-strings.html
As far as I know, String is an immutable object. That means its state cannot be changed: whenever you append a value to a String, a new String containing the appended value is created, and the old one is left for the garbage collector.
But this is not the case with StringBuilder. StringBuilder is mutable, which means its old value does not have to be discarded. Any change or append takes place in the existing object.
I know I am not covering in depth but this might cause major performance difference.
When should we use + for concatenation of strings, when is StringBuilder preferred, and when is it suitable to use concat?
I've heard StringBuilder is preferable for concatenation within loops. Why is it so?
Thanks.
Modern Java compilers convert your + operations into StringBuilder append calls. That is, if you write str = str1 + str2 + str3, the compiler will generate the equivalent of the following code:
StringBuilder sb = new StringBuilder();
str = sb.append(str1).append(str2).append(str3).toString();
You can decompile code using DJ or Cavaj to confirm this :)
So now it's more a matter of choice than of performance benefit whether to use + or StringBuilder :)
However, if the compiler does not do this for you (which may happen if you are using some private Java SDK), then StringBuilder is surely the way to go, as you avoid lots of unnecessary String objects.
I tend to use StringBuilder on code paths where performance is a concern. Repeated string concatenation within a loop is often a good candidate.
The reason to prefer StringBuilder is that both + and concat create a new object every time you call them (provided the right hand side argument is not empty). This can quickly add up to a lot of objects, almost all of which are completely unnecessary.
As others have pointed out, when you use + multiple times within the same statement, the compiler can often optimize this for you. However, in my experience this argument doesn't apply when the concatenations happen in separate statements. It certainly doesn't help with loops.
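To illustrate the point about separate statements and loops, here is a rough sketch (class and method names are invented for the example). The middle method is only an approximation of what javac emits when it lowers + to StringBuilder for Java 8 and earlier targets; newer compilers use a different mechanism, but each statement still produces a fresh String.

```java
// Hypothetical demo class; loweredByHand is an approximation, not actual compiler output.
class SeparateStatements {
    // Each += is compiled on its own, so each one yields an intermediate String.
    static String withPlus(String[] parts) {
        String s = "";
        for (String p : parts) {
            s += p;
            s += ";";
        }
        return s;
    }

    // Roughly what the loop above amounts to: one throwaway builder per statement.
    static String loweredByHand(String[] parts) {
        String s = "";
        for (String p : parts) {
            s = new StringBuilder().append(s).append(p).toString();
            s = new StringBuilder().append(s).append(";").toString();
        }
        return s;
    }

    // A single builder hoisted out of the loop avoids the intermediate Strings.
    static String withBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            sb.append(p).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(withPlus(parts));
        System.out.println(loweredByHand(parts));
        System.out.println(withBuilder(parts));
    }
}
```

All three methods return the same text; they differ only in how many temporary objects are created along the way.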
Having said all this, I think top priority should be writing clear code. There are some great profiling tools available for Java (I use YourKit), which make it very easy to pinpoint performance bottlenecks and optimize just the bits where it matters.
P.S. I have never needed to use concat.
From Java/J2EE Job Interview Companion:
String
String is immutable: you can’t modify a String object but can replace it by creating a new instance. Creating a new instance is rather expensive.
//Inefficient version using immutable String
String output = "Some text";
int count = 100;
for (int i = 0; i < count; i++) {
output += i;
}
return output;
The above code would build 100 new String objects (one per loop iteration), of which 99 would be thrown away immediately. Creating new objects is not efficient.
StringBuffer/StringBuilder
StringBuffer is mutable: use StringBuffer or StringBuilder when you want to modify the contents. StringBuilder was added in Java 5 and it is identical in all respects to StringBuffer except that it is not synchronised, which makes it slightly faster at the cost of not being thread-safe.
//More efficient version using mutable StringBuffer
StringBuffer output = new StringBuffer(199); // 9 chars of "Some text" + 190 digits for 0..99
output.append("Some text");
for (int i = 0; i < count; i++) {
output.append(i);
}
return output.toString();
The above code creates only two new objects, the StringBuffer and the final String that is returned. StringBuffer expands as needed, which is costly however, so it would be better to initialise the StringBuffer with the correct size from the start as shown.
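Whether a chosen initial capacity is actually large enough can be checked with capacity(). The sketch below is my own illustration (the class name is hypothetical, and it uses StringBuilder rather than StringBuffer). For the loop above, the final text is 199 characters long: 9 for "Some text", 10 for the one-digit numbers and 180 for the two-digit numbers.

```java
// Hypothetical demo class for inspecting builder capacity.
class BuilderCapacity {
    // Appends "Some text" followed by the numbers 0..count-1.
    static StringBuilder fill(StringBuilder output, int count) {
        output.append("Some text");
        for (int i = 0; i < count; i++) {
            output.append(i);
        }
        return output;
    }

    public static void main(String[] args) {
        // Presized to the exact final length: no expansion is needed.
        StringBuilder presized = fill(new StringBuilder(199), 100);
        // Default capacity: the buffer has to grow several times along the way.
        StringBuilder defaultSized = fill(new StringBuilder(), 100);

        System.out.println(presized.length() + " chars, capacity " + presized.capacity());
        System.out.println(defaultSized.length() + " chars, capacity " + defaultSized.capacity());
    }
}
```

If the capacity reported for the presized builder is still the value you passed in, no reallocation happened.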
If all concatenated elements are constants (example : "these" + "are" + "constants"), then I'd prefer the +, because the compiler will inline the concatenation for you. Otherwise, using StringBuilder is the most effective way.
If you use + with non-constants, the Compiler will internally use StringBuilder as well, but debugging becomes hell, because the code used is no longer identical to your source code.
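The difference between the constant and the non-constant case can be observed with a small experiment. This is just a sketch with invented names; it deliberately compares references with == to show whether a new String object was created at run time.

```java
// Hypothetical demo class; == is used on purpose to compare object identity.
class ConstantFolding {
    // All operands are literals, so javac folds them into one constant,
    // which is then the same interned object as the literal on the right.
    static boolean literalsFolded() {
        String s = "these" + "are" + "constants";
        return s == "theseareconstants";
    }

    // 'are' is not final, so the expression is not a compile-time constant
    // and the concatenation happens at run time, creating a new String.
    static boolean nonConstantFolded() {
        String are = "are";
        String s = "these" + are + "constants";
        return s == "theseareconstants";
    }

    public static void main(String[] args) {
        System.out.println(literalsFolded());
        System.out.println(nonConstantFolded());
    }
}
```

The first call prints true and the second prints false, although both strings have equal content.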
My recommendation would be as follows:
+: Use when concatenating 2 or 3 Strings simply to keep your code brief and readable.
StringBuilder: Use when building up complex String output or where performance is a concern.
String.format: You didn't mention this in your question but it is my preferred method for creating Strings as it keeps the code the most readable / concise in my opinion and is particularly useful for log statements.
concat: I don't think I've ever had cause to use this.
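For comparison, here is the same message built in all four ways. This is only a sketch; the class name, the message and the parameters are made up, and Locale.ROOT is passed to String.format so the number formatting does not depend on the machine's default locale.

```java
import java.util.Locale;

// Hypothetical demo class showing four equivalent ways to build one string.
class FourWays {
    static String plus(String user, int count) {
        return "User " + user + " has " + count + " items";
    }

    static String builder(String user, int count) {
        return new StringBuilder()
                .append("User ").append(user)
                .append(" has ").append(count)
                .append(" items")
                .toString();
    }

    static String format(String user, int count) {
        return String.format(Locale.ROOT, "User %s has %d items", user, count);
    }

    // concat only accepts Strings, so the int has to be converted first.
    static String concat(String user, int count) {
        return "User ".concat(user).concat(" has ")
                .concat(String.valueOf(count)).concat(" items");
    }

    public static void main(String[] args) {
        System.out.println(plus("bob", 3));
        System.out.println(builder("bob", 3));
        System.out.println(format("bob", 3));
        System.out.println(concat("bob", 3));
    }
}
```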
Use StringBuilder if you do a lot of manipulation. Usually a loop is a pretty good indication of this.
The reason for this is that using normal concatenation produces lots of intermediate String objects that can't easily be "extended" (i.e. each concatenation operation produces a copy, requiring memory and CPU time to make). A StringBuilder, on the other hand, only needs to copy the data in some cases (inserting something in the middle, or having to resize because the result becomes too big), so it saves on those copy operations.
Using concat() has no real benefit over using + (it might be ever so slightly faster for a single +, but once you do a.concat(b).concat(c) it will actually be slower than a + b + c).
Use + for single statements and StringBuilder for multiple statements/ loops.
The performance gain from the compiler applies only to concatenating constants.
The remaining uses are actually slower than using StringBuilder directly.
There is no problem with using "+", e.g. for creating an Exception message, because it does not happen often and the application is already in trouble at that moment anyway. Avoid using "+" in loops.
For creating meaningful messages or other parametrized strings (XPath expressions, for example) use String.format; it is much more readable.
I suggest using concat for concatenating two strings and StringBuilder otherwise; see my explanation of the concatenation operator (+) vs concat().
I'm curious, how far the optimization of the following code snippet will go.
As far as I know, whenever the capacity of a StringBuffer is extended, it costs some CPU work, because its content has to be reallocated. However, I guess Java compiler optimization could precalculate the required capacity instead of doing multiple reallocations.
The question is: will the following snippet of code be optimized so?
public static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
    StringBuilder stringBuilder = new StringBuilder();
    parameters.forEach(
        (key, value) -> stringBuilder.append(key).append("=").append(value).append("&"));
    // Drop the trailing "&" (assumes at least one parameter); delete(start, end) needs start <= end.
    return baseURL + "?" + stringBuilder.delete(stringBuilder.length() - 1, stringBuilder.length());
}
In Java, most optimization is performed by the runtime's just in time compiler, so generally javac optimizations don't matter much.
As a consequence, a Java compiler is not required to optimize string concatenation, though all tend to do so as long as no loops are involved. You can check the extent of such compile time optimizations by using javap (the class file disassembler included with the JDK).
So, could javac conceivably optimize this? To determine the length of the string builder, it would have to iterate the map twice. Since Java does not feature const references, and the compiler has no special treatment for Map, the compiler can not determine that this rewrite would preserve the meaning of the code. And even if it could, it's not at all clear that the gains would be worth the cost of iterating twice. After all, modern processors can copy 4 to 8 characters in a single CPU instruction. Since memory access is sequential, there won't be any cache misses while growing the buffer. On the other hand, iterating the map a second time will likely cause additional cache misses, because the Map entries (and the strings they reference) can be scattered all over main memory.
In any case, I would not worry about the efficiency of this code. Even if your URL is 1000 characters long, resizing the buffer will take about 0.1 microseconds. Unless you have evidence that this really is a performance hotspot, your time is probably better spent elsewhere.
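For completeness, this is roughly what the "iterate twice" precalculation would look like if you wrote it by hand. It is only a sketch under my own assumptions: the class name is invented, keys and values are assumed to be non-null, and no URL encoding is performed. Whether it beats the single-pass version would have to be measured.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical two-pass variant: measure first, then build into an exactly sized buffer.
class PresizedUrlBuilder {
    static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
        if (parameters.isEmpty()) {
            return baseURL;
        }
        // First pass: compute the exact final length.
        int length = baseURL.length();
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            // One char for the leading '?' or '&', one for '='.
            length += 1 + e.getKey().length() + 1 + e.getValue().length();
        }
        // Second pass: append into a buffer that never has to grow.
        StringBuilder sb = new StringBuilder(length);
        sb.append(baseURL);
        char separator = '?';
        for (Map.Entry<String, String> e : parameters.entrySet()) {
            sb.append(separator).append(e.getKey()).append('=').append(e.getValue());
            separator = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("a", "1");
        params.put("b", "2");
        System.out.println(getGetRequestURL("http://example.com", params));
    }
}
```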
First of all:
You can find out what (javac) compile time optimizations occur by looking at the bytecodes using the javap tool.
You can find out what JIT compiler optimizations are performed by getting the JVM to dump the native code.
So, if you need to know how your code has been optimized (on a particular platform) for practical reasons, then you should check.
In reality, the optimizations by javac are pretty simple-minded, and do not go to the extent of precalculating buffer sizes. I haven't checked, but I expect that the same is true for the JIT compiler. I doubt that it makes any attempt to preallocate a StringBuilder with an "optimal" size.
Why?
The reasons include the following:
An inaccurate precalculation (on average) doesn't help, and may be worse than doing nothing.
An accurate precalculation typically involves measuring the (dynamic) lengths of the actual strings to be joined.
Implementing the optimization logic would be complicated, and would make the optimizers slower and more effort to maintain.
At runtime, measuring the Strings introduces overhead. Whether you would come out ahead often enough to make a difference is difficult to determine. (People don't like optimizations that make their code run slower ...)
There are better (more efficient) ways to do large scale text assembly than using String concatenation. The programmer (who has more knowledge of the problem domain and application logic) can optimize this better than a compiler, if it matters enough to spend the developer effort on it.
One optimization is to put the baseURL and the question mark into the stringBuilder instead of using String concatenation at the end, such as:
public static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
    StringBuilder stringBuilder = new StringBuilder(baseURL);
    stringBuilder.append("?");
    parameters.forEach((key, value) -> stringBuilder.append(key).append("=").append(value).append("&"));
    // Drop the trailing "&" (or the "?" if there were no parameters).
    stringBuilder.setLength(stringBuilder.length() - 1);
    return stringBuilder.toString();
}
If you want a little more speed, and since javac or the JIT will not optimize based on potential string size, you can track that yourself without incurring much overhead by adding a max size tracker, such as this:
protected static int URL_SIZE = 256;
public static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
    StringBuilder stringBuilder = new StringBuilder(URL_SIZE);
    stringBuilder.append(baseURL);
    stringBuilder.append("?");
    parameters.forEach((key, value) -> stringBuilder.append(key).append("=").append(value).append("&"));
    // Remember the largest size seen so far for the next call.
    int size = stringBuilder.length();
    if (size > URL_SIZE) {
        URL_SIZE = size;
    }
    stringBuilder.setLength(size - 1);
    return stringBuilder.toString();
}
That said, with some testing of 1 million calls, I found that the different versions performed as follows (in milliseconds):
Your version: total = 1151, average = 230
Above version 1: total = 936, average = 187
Above version 2: total = 839, average = 167
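As a side note on the trailing-separator handling in the versions above: a StringJoiner places the "&" only between entries, so nothing has to be trimmed afterwards, and it can fall back to the bare base URL when the map is empty. This is a sketch of an alternative, not something I have benchmarked; the class name is invented and no URL encoding is done.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical alternative using StringJoiner for the separators.
class JoinerUrlBuilder {
    static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
        // Delimiter "&", prefix "<baseURL>?", no suffix.
        StringJoiner joiner = new StringJoiner("&", baseURL + "?", "");
        // With no parameters, return just the base URL (no dangling '?').
        joiner.setEmptyValue(baseURL);
        parameters.forEach((key, value) -> joiner.add(key + "=" + value));
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "java");
        params.put("page", "2");
        System.out.println(getGetRequestURL("http://example.com/search", params));
    }
}
```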