What is the advantage or purpose of doing this:
int a = 42;
StringBuffer sb = new StringBuffer(40);
String s = sb.append("a = ").append(a).append("!").toString();
System.out.println(sb);
Result: a = 42!
instead of this:
int a = 42;
String s = "a = " + a + "!";
System.out.println(s);
In your scenario, I'm not sure there is a difference, because all of your "+" operators are in a single expression (which only creates a String once). In general, though, Strings are immutable objects: they are never modified in place. Each concatenation creates a new String and discards the old one, which is the work StringBuffer lets you avoid.
So when you build a string over several steps, you will have more efficient code if you use StringBuffers (or, more commonly, StringBuilders). If you google "String vs. StringBuffer vs. StringBuilder" you can find many articles detailing the statistics.
Efficiency. String concatenation in Java uses StringBuilders in the background anyway, so in some cases you can eke out a bit of efficiency by controlling that yourself.
Just run the code 10,000 times and measure the time. It should be obvious.
Some background information: String is immutable while StringBuilder is not. So every time you concatenate a String you have to copy an array.
PS: Sometimes the compiler optimizes things though. Maybe if you make your variable static final it would be just one String internally and no concatenation.
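To make the static final remark concrete, here is a small sketch. The class and method names are made up for illustration. When every operand is a compile-time constant, the whole expression becomes a single interned literal; with a run-time value, a new String is built on each call.

```java
class ConstantFoldingDemo {
    static final int A = 42; // compile-time constant

    static String folded() {
        // Every operand is a constant, so the compiler emits the single literal "a = 42!".
        return "a = " + A + "!";
    }

    static String built(int a) {
        // 'a' is only known at run time, so a new String is created on each call.
        return "a = " + a + "!";
    }

    public static void main(String[] args) {
        // Reference comparison (==) is used on purpose to show object identity.
        System.out.println(folded() == "a = 42!");      // true: same interned literal
        System.out.println(built(42) == "a = 42!");     // false: freshly created object
        System.out.println(built(42).equals(folded())); // true: same contents
    }
}
```

The == comparisons are only there to expose object identity; normal code should compare strings with equals().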
First of all, StringBuffer is synchronized, so you would typically use StringBuilder. The + operator was reimplemented to use StringBuilder a while ago.
Second, as @Riggy mentioned, Java actually does optimize + as long as the concatenations occur in a single expression. But if you were to do:
String s = "";
s += a;
s += b;
s += c;
s += d;
Then the effective code would become:
String s ="";
s = new StringBuilder(s).append(a).toString();
s = new StringBuilder(s).append(b).toString();
s = new StringBuilder(s).append(c).toString();
s = new StringBuilder(s).append(d).toString();
which is less efficient than
String s = new StringBuilder().append(a).append(b).append(c).append(d).toString();
Because of compiler optimizations, it may or may not make any difference in your app. You'll have to run comparison speed tests to see.
But before you obsess about performance, get the program working right. "Premature optimization is the root of all evil."
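A minimal sketch of the two shapes discussed in this answer, with hypothetical method names. Both produce the same text; the first goes through one temporary builder and one intermediate String per statement, while the second uses a single builder.

```java
class ChainedAppendDemo {
    // One += per statement: each line builds and discards an intermediate String.
    static String withPlusEquals(int a, int b, int c, int d) {
        String s = "";
        s += a;
        s += b;
        s += c;
        s += d;
        return s;
    }

    // One explicit builder: a single buffer and a single final toString().
    static String withOneBuilder(int a, int b, int c, int d) {
        return new StringBuilder().append(a).append(b).append(c).append(d).toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlusEquals(1, 2, 3, 4));
        System.out.println(withOneBuilder(1, 2, 3, 4));
    }
}
```

For four short appends the difference is unlikely to be measurable; the second form mostly matters when the number of statements grows.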
I am concatenating a String in a loop but it takes ages, why is that?
for (String object : jsonData) {
counter++;
finalJsonDataStr += object;
}
Variable object is a piece of JSON, up to 70 chars and the loop goes approx 50k times.
I understand some people advise StringBuffer or StringBuilder, but this link says it brings no performance improvement: StringBuilder vs String concatenation in toString() in Java
Use a String Builder to append to strings.
When you concatenate, Java actually creates a new String holding the result of the concatenation.
Do it many times and you are creating a gazillion strings for nothing.
Try:
StringBuilder sb = new StringBuilder();
for (String object : jsonData) {
counter++;
sb.append(object.toString()); //this does the concatenation internally
//but is very efficient
}
finalJsonDataStr = sb.toString(); //this gives you back the whole string
Remark:
When you do stuff like
myString = "hello " + someStringVariable + " World!" + " My name is " + name;
The compiler is smart enough to replace all that with a single StringBuilder, like:
myString = new StringBuilder("hello ")
.append(someStringVariable)
.append(" World!")
.append(" My name is ")
.append(name).toString();
But for some reason I don't know, it doesn't do it when the concatenation happens inside a loop.
You should use a StringBuffer or a StringBuilder.
When you add Strings with plus, a StringBuilder is created, the strings are appended, and a new String is returned by the StringBuilder's toString() method. So imagine this object creation and string manipulation happening 50k times. It's much better if you instantiate only one StringBuilder yourself and just append strings to it.
This answer could be of use to you: concatenation operator (+) vs concat()
Before going to the actual problem, see how internal concatenation works.
String testString ="str"+"ingcon"+"catenation";
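If we print the String declared above to the console, the result is stringconcatenation, which is correct, so + works fine. The actual question is how the + symbol does its magic, since it is not ordinary mathematical addition. The code snippet below shows roughly what code using + is converted to. (Strictly speaking, when every operand is a literal, as here, javac folds the expression into a single constant at compile time; the builder form applies once a non-constant operand is involved.)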
StringBuilder compilerGeneratedBuilder = new StringBuilder();
compilerGeneratedBuilder.append("str");
compilerGeneratedBuilder.append("ingcon");
compilerGeneratedBuilder.append("catenation");
String finalString = compilerGeneratedBuilder.toString();
A loop of 50k iterations is a significant performance bottleneck to consider.
In such cases, use a StringBuilder with its append method, because every + concatenation creates a new StringBuilder object and a new String. That leads to 50k object creations.
With a single StringBuilder and the append method, you save both the object-creation time and the memory.
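A rough way to see this for yourself is sketched below. It is a crude wall-clock comparison, not a proper benchmark (no warm-up, single run), and the class name, method names and sample JSON fragments are assumptions made for illustration. It uses 5,000 pieces rather than 50k so the += version finishes quickly.

```java
import java.util.ArrayList;
import java.util.List;

class LoopConcatTiming {
    static String joinWithPlus(List<String> parts) {
        String result = "";
        for (String part : parts) {
            result += part; // copies everything accumulated so far on every pass
        }
        return result;
    }

    static String joinWithBuilder(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append(part); // appends into one growing buffer
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < 5_000; i++) {
            parts.add("{\"id\":" + i + ",\"name\":\"item\"}");
        }
        long t0 = System.nanoTime();
        String viaPlus = joinWithPlus(parts);
        long t1 = System.nanoTime();
        String viaBuilder = joinWithBuilder(parts);
        long t2 = System.nanoTime();
        System.out.println("+=            : " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("StringBuilder : " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("same result   : " + viaPlus.equals(viaBuilder));
    }
}
```

The gap widens as the number of pieces grows, because the += version re-copies the whole accumulated string on each iteration.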
I'm learning Java and am wondering what's the best way to modify strings here (both for performance and to learn the preferred method in Java). Assume you're looping through a string and checking each character/performing some action on that index in the string.
Do I use the StringBuilder class, or convert the string into a char array, make my modifications, and then convert the char array back to a string?
Example for StringBuilder:
StringBuilder newString = new StringBuilder(oldString);
for (int i = 0; i < oldString.length() ; i++) {
newString.setCharAt(i, 'X');
}
Example for char array conversion:
char[] newStringArray = oldString.toCharArray();
for (int i = 0; i < oldString.length() ; i++) {
newStringArray[i] = 'X';
}
myString = String.valueOf(newStringArray);
What are the pros/cons to each different way?
I take it that StringBuilder is going to be more efficient, since converting to a char array makes copies of the array each time you update an index.
I say do whatever is most readable/maintainable until you know that String "modification" is slowing you down. To me, this is the most readable:
String s = "foo";
s += "bar";
s += "baz";
If that's too slow, I'd use a StringBuilder. You may want to compare this to StringBuffer. If performance matters and synchronization does not, StringBuilder should be faster. If synchronization is needed, then you should use StringBuffer.
Also it's important to know that these strings are not being modified. In java, Strings are immutable.
This is all context specific. If you optimize this code and it doesn't make a noticeable difference (and this is usually the case), then you just thought longer than you had to and you probably made your code more difficult to understand. Optimize when you need to, not because you can. And before you do that, make sure the code you're optimizing is the cause of your performance issue.
What are the pros/cons to each different way? I take it that StringBuilder is going to be more efficient, since converting to a char array makes copies of the array each time you update an index.
As written, the code in your second example will create just two arrays: one when you call toCharArray(), and another when you call String.valueOf() (String stores data in a char[] array). The element manipulations you are performing should not trigger any object allocations. There are no copies being made of the array when you read or write an element.
If you are going to be doing any sort of String manipulation, the recommended practice is to use a StringBuilder. If you are writing very performance-sensitive code, and your transformation does not alter the length of the string, then it might be worthwhile to manipulate the array directly. But since you are learning Java as a new language, I am going to guess that you are not working in high frequency trading or any other environment where latency is critical. Therefore, you are probably better off using a StringBuilder.
If you are performing any transformations that might yield a string of a different length than the original, you should almost certainly use a StringBuilder; it will resize its internal buffer as necessary.
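As an illustration of a transformation that changes the length, here is a minimal sketch that escapes double quotes by prefixing a backslash. The class and method names are invented for the example. A char[] of the original length would not fit the result, whereas the builder grows as needed.

```java
class EscapeQuotesDemo {
    // Returns a copy of the input with a backslash inserted before every double quote.
    static String escapeQuotes(String input) {
        // Initial capacity is only a hint; the builder resizes itself if more room is needed.
        StringBuilder out = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeQuotes("say \"hi\""));
    }
}
```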
On a related note, if you are doing simple string concatenation (e.g, s = "a" + someObject + "c"), the compiler will actually transform those operations into a chain of StringBuilder.append() calls, so you are free to use whichever you find more aesthetically pleasing. I personally prefer the + operator. However, if you are building up a string across multiple statements, you should create a single StringBuilder.
For example:
public String toString() {
return "{field1 =" + this.field1 +
", field2 =" + this.field2 +
...
", field50 =" + this.field50 + "}";
}
Here, we have a single, long expression involving many concatenations. You don't need to worry about hand-optimizing this, because the compiler will use a single StringBuilder and just call append() on it repeatedly.
String s = ...;
if (someCondition) {
s += someValue;
}
s += additionalValue;
return s;
Here, you'll end up with two StringBuilders being created under the covers, but unless this is an extremely hot code path in a latency-critical application, it's really not worth fretting about. Given similar code, but with many more separate concatenations, it might be worth optimizing. Same goes if you know the strings might be very large. But don't just guess--measure! Demonstrate that there's a performance problem before you try to fix it. (Note: this is just a general rule for "micro optimizations"; there's rarely a downside to explicitly using a StringBuilder. But don't assume it will make a measurable difference: if you're concerned about it, you should actually measure.)
String s = "";
for (final Object item : items) {
s += item + "\n";
}
Here, we're performing a separate concatenation operation on each loop iteration, which means a new StringBuilder will be allocated on each pass. In this case, it's probably worth using a single StringBuilder since you may not know how large the collection will be. I would consider this an exception to the "prove there's a performance problem before optimizing" rule: if the operation has the potential to explode in complexity based on input, err on the side of caution.
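For the loop case, the single-builder version might look like the sketch below. The class and method names are illustrative, not taken from the question.

```java
import java.util.Arrays;
import java.util.List;

class JoinLinesDemo {
    // Appends each item followed by a newline, using one builder for the whole loop.
    static String joinLines(List<?> items) {
        StringBuilder sb = new StringBuilder();
        for (Object item : items) {
            sb.append(item).append('\n'); // append(Object) uses String.valueOf, so null becomes "null"
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinLines(Arrays.asList("first", 2, 3.0)));
    }
}
```

The work done here is proportional to the total output length, instead of re-copying the accumulated text on every pass.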
Which option will perform the best is not an easy question.
I did a benchmark using Caliper:
Method            Runtime (ns)
array             88
builder           126
builderTillEnd    76
concat            3435
Benchmarked methods:
public static String array(String input)
{
char[] result = input.toCharArray(); // COPYING
for (int i = 0; i < input.length(); i++)
{
result[i] = 'X';
}
return String.valueOf(result); // COPYING
}
public static String builder(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result.toString(); // COPYING
}
public static StringBuilder builderTillEnd(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result;
}
public static String concat(String input)
{
String result = "";
for (int i = 0; i < input.length(); i++)
{
result += 'X'; // terrible COPYING, COPYING, COPYING... same as:
// result = new StringBuilder(result).append('X').toString();
}
return result;
}
Remarks
If we want to modify a String, we have to do at least 1 copy of that input String, because Strings in Java are immutable.
java.lang.StringBuilder extends java.lang.AbstractStringBuilder. StringBuilder.setCharAt() is inherited from AbstractStringBuilder and looks like this:
public void setCharAt(int index, char ch) {
if ((index < 0) || (index >= count))
throw new StringIndexOutOfBoundsException(index);
value[index] = ch;
}
AbstractStringBuilder internally uses the simplest char array: char value[]. So, result[i] = 'X' is very similar to result.setCharAt(i, 'X'), however the second will call a polymorphic method (which probably gets inlined by JVM) and check bounds in if, so it will be a bit slower.
Conclusions
If you can operate on StringBuilder until the end (you don't need String back) - do it. It's the preferred way and also the fastest. Simply the best.
If you want a String in the end and this is the bottleneck of your program, then you might consider using a char array. In the benchmark, the char array was ~25% faster than StringBuilder. Be sure to properly measure the execution time of your program before and after optimization, because there is no guarantee about this 25%.
Never concatenate Strings in a loop with + or +=, unless you really know what you are doing. Usually it's better to use an explicit StringBuilder and append().
I'd prefer to use the StringBuilder class where the original string is modified.
For String manipulation, I like the StringUtils class. You'll need the Apache Commons Lang dependency to use it.
I can use either concatenation operator(+) or concat() method for string concatenation
String myData = "a"+"b"; // using concatenation operator
String myData = "a".concat("b"); // using concat() of String
But to concatenate a string with an integer I cannot use concat() directly, so I have to use one of the following:
String myData = "a"+5;
String myData = "a".concat(String.valueOf(5));
But I found something strange in the following lines when I use the concatenation operator and concat():
String myData = "a"+null; //output = anull
String myData = "a".concat(String.valueOf(null)); // output = NullPointerException
or
String myData = "a".concat(null); // output = NullPointerException
The following questions came to mind:
1) How do the concat() method and the concatenation operator work, and what is the difference in their logic?
2) Can we really concatenate a null using (+)? If so, why can't the concat() method do the same?
Thanks
1) The + operator (to produce Strings) always goes through an intermediate StringBuilder (or StringBuffer if targeting old platforms before 1.5 (released 7 (seven) years ago)). For concatenating two or perhaps three Strings concat will generally be faster due to the lack of intermediate. However, for longer concatenations + will win because there will be fewer intermediate allocations.
2) null generally indicates an error (quite possibly a design error). In general, errors should be reported as early as possible, which String.concat does. However, + or StringBuilder concatenation is often used to produce debug strings, so the null is tolerated and produces a result suitable for debugging (but not UI!).
There's no difference in the semantics, you just got your example wrong.
String myData = "a"+null; //output = anull
String myData = "a".concat(String.valueOf( (Object)null ) ); // output = anull
And this will compile and work. + behaves as if String.valueOf() was invoked for every operand.
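The behaviours discussed in these answers can be checked with a short sketch. The class and helper names are made up for the example. An unqualified String.valueOf(null) resolves to the char[] overload, which is why it throws, while the Object overload returns "null".

```java
class NullConcatDemo {
    // The + operator converts a null operand to the text "null".
    static String plusNull() {
        String none = null;
        return "a" + none;
    }

    // concat() dereferences its argument, so a null argument throws.
    static boolean concatNullThrows() {
        String none = null;
        try {
            "a".concat(none);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // valueOf(char[]) builds a new String from the array, so a null array throws.
    static boolean valueOfCharArrayNullThrows() {
        char[] none = null;
        try {
            String.valueOf(none);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    // The cast selects valueOf(Object), which maps null to "null".
    static String concatValueOfObjectNull() {
        return "a".concat(String.valueOf((Object) null));
    }

    public static void main(String[] args) {
        System.out.println(plusNull());
        System.out.println(concatNullThrows());
        System.out.println(valueOfCharArrayNullThrows());
        System.out.println(concatValueOfObjectNull());
    }
}
```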
I've heard that using StringBuilder is faster than using string concatenation, but I'm tired of wrestling with StringBuilder objects all of the time. I was recently exposed to the SLF4J logging library and I love the "just do the right thing" simplicity of its formatting when compared with String.format. Is there a library out there that would allow me to write something like:
int myInteger = 42;
MyObject myObject = new MyObject(); // Overrides toString()
String result = CoolFormatingLibrary.format("Simple way to format {} and {}",
myInteger, myObject);
Also, is there any reason (including performance but excluding fine-grained control of date and significant digit formatting) why I might want to use String.format over such a library if it does exist?
Although the Accepted answer is good, if (like me) one is interested in exactly Slf4J-style semantics, then the correct solution is to use Slf4J's MessageFormatter
Here is an example usage snippet:
public static String format(String format, Object... params) {
return MessageFormatter.arrayFormat(format, params).getMessage();
}
(Note that this example discards a last argument of type Throwable)
For concatenating strings one time, the old reliable "str" + param + "other str" is perfectly fine (it's actually converted by the compiler into a StringBuilder).
StringBuilders are mainly useful if you have to keep adding things to the string, but you can't get them all into one statement. For example, take a for loop:
String str = "";
for (int i = 0; i < 1000000; i++) {
str += i + " "; // ignoring the last-iteration problem
}
This will run much slower than the equivalent StringBuilder version:
StringBuilder sb = new StringBuilder(); // for extra speed, define the size
for (int i = 0; i < 1000000; i++) {
sb.append(i).append(" ");
}
String str = sb.toString();
But these two are functionally equivalent:
String str = var1 + " " + var2;
String str2 = new StringBuilder().append(var1).append(" ").append(var2).toString();
Having said all that, my actual answer is:
Check out java.text.MessageFormat. Sample code from the Javadocs:
int fileCount = 1273;
String diskName = "MyDisk";
Object[] testArgs = {new Long(fileCount), diskName};
MessageFormat form = new MessageFormat("The disk \"{1}\" contains {0} file(s).");
System.out.println(form.format(testArgs));
Output:
The disk "MyDisk" contains 1,273 file(s).
There is also a static format method which does not require creating a MessageFormat object.
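A minimal sketch of the static variant, MessageFormat.format(String, Object...), wrapped in a hypothetical helper. Number formatting follows the default locale, so the exact digits and grouping can vary between machines.

```java
import java.text.MessageFormat;

class StaticMessageFormatDemo {
    // Hypothetical helper: fills the pattern in one call, without constructing a MessageFormat by hand.
    static String describe(String diskName, int fileCount) {
        return MessageFormat.format("The disk \"{1}\" contains {0} file(s).", fileCount, diskName);
    }

    public static void main(String[] args) {
        System.out.println(describe("MyDisk", 1273));
    }
}
```

The static method parses the pattern on every call, so for a pattern used many times it may be worth keeping a MessageFormat instance around instead.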
All such libraries will boil down to string concatenation at their most basic level, so there won't be much performance difference from one to another.
Plus, it is worth bearing in mind that String.format() is a poor implementation of sprintf built on regular expressions, so if you profile your code you will see Pattern objects and int[] arrays that you were not expecting.
MessageFormat and the SLF4J MessageFormatter are generally faster and allocate less junk.
I am reading a csv file that has about 50,000 lines and 1.1MiB in size (and can grow larger).
In Code1, I use String to process the csv, while in Code2 I use StringBuilder (only one thread executes the code, so no concurrency issues)
Using StringBuilder makes the code a little bit harder to read than using the normal String class.
Am I prematurely optimizing things with StringBuilder in Code2 to save a bit of heap space and memory?
Code1
fr = new FileReader(file);
BufferedReader reader = new BufferedReader(fr);
String line = reader.readLine();
while ( line != null )
{
int separator = line.indexOf(',');
String symbol = line.substring(0, separator);
int begin = separator;
separator = line.indexOf(',', begin+1);
String price = line.substring(begin+1, separator);
// Publish this update
publisher.publishQuote(symbol, price);
// Read the next line of fake update data
line = reader.readLine();
}
Code2
fr = new FileReader(file);
BufferedReader reader = new BufferedReader(fr);
StringBuilder stringBuilder = new StringBuilder(reader.readLine());
while( stringBuilder.toString() != null ) {
int separator = stringBuilder.toString().indexOf(',');
String symbol = stringBuilder.toString().substring(0, separator);
int begin = separator;
separator = stringBuilder.toString().indexOf(',', begin+1);
String price = stringBuilder.toString().substring(begin+1, separator);
publisher.publishQuote(symbol, price);
stringBuilder.replace(0, stringBuilder.length(), reader.readLine());
}
Edit
I eliminated the toString() calls, so fewer String objects will be produced.
Code3
while( stringBuilder.length() > 0 ) {
int separator = stringBuilder.indexOf(",");
String symbol = stringBuilder.substring(0, separator);
int begin = separator;
separator = stringBuilder.indexOf(",", begin+1);
String price = stringBuilder.substring(begin+1, separator);
publisher.publishQuote(symbol, price);
Thread.sleep(10);
stringBuilder.replace(0, stringBuilder.length(), reader.readLine());
}
Also, the original code is downloaded from http://www.devx.com/Java/Article/35246/0/page/1
Will the optimized code increase performance of the app? - my question
The second code sample will not save you any memory nor any computation time. I am afraid you might have misunderstood the purpose of StringBuilder, which is really meant for building strings - not reading them.
Within the loop of your second code sample, every single line contains the expression stringBuilder.toString(), essentially turning the buffered string into a String object over and over again. Your actual string operations are done against these objects. Not only is the first code sample easier to read, but it is almost certainly the more performant of the two.
Am I prematurely optimizing things with StringBuilder? - your question
Unless you have profiled your application and have come to the conclusion that these very lines cause a notable slowdown in execution speed, yes. Unless you are really sure that something will be slow (e.g. if you recognize high computational complexity), you definitely want to do some profiling before you start making optimizations that hurt the readability of your code.
What kind of optimizations could be done to this code? - my question
If you have profiled the application and decided this is the right place for an optimization, you should consider looking into the features offered by the Scanner class. This might give you both better performance (profiling will tell you if this is true) and simpler code.
Am I prematurely optimizing things with StringBuilder in Code2 to save a bit of heap space and memory?
Most probably: yes. But, only one way to find out: profile your code.
Also, I'd use a proper CSV parser instead of what you're doing now: http://ostermiller.org/utils/CSV.html
Code2 is actually less efficient than Code1 because every time you call stringBuilder.toString() you're creating a new java.lang.String instance (in addition to the existing StringBuilder object). This is less efficient in terms of space and time due to the object creation overhead.
Assigning the contents of readLine() directly to a String and then splitting that String will typically be performant enough. You could also consider using the Scanner class.
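A minimal sketch of the readLine-then-split approach, reading from an in-memory string so it runs on its own. The class name, the sample data and the "symbol,price,..." layout are assumptions based on the question's code. It does not handle quoted fields or embedded commas, which is where a real CSV parser earns its keep.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class QuoteLineParser {
    // Returns {symbol, price} pairs from text whose lines look like "SYMBOL,PRICE,...".
    static List<String[]> parse(String csvText) {
        List<String[]> quotes = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(csvText))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split(",");
                if (fields.length < 2) {
                    continue; // skip blank or malformed lines
                }
                quotes.add(new String[] { fields[0], fields[1] });
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return quotes;
    }

    public static void main(String[] args) {
        for (String[] quote : parse("IBM,120.50,x\nSUN,5.10,y\n")) {
            System.out.println(quote[0] + " -> " + quote[1]);
        }
    }
}
```

In the original program the StringReader would be replaced by the FileReader, and each pair would be passed to publisher.publishQuote.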
Memory Saving Tip
If you encounter multiple repeating tokens in your input consider using String.intern() to ensure that each identical token references the same String object; e.g.
String[] tokens = parseTokens(line);
for (String token : tokens) {
// Construct business object referencing interned version of token.
BusinessObject bo = new BusinessObject(token.intern());
// Add business object to collection, etc.
}
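What intern() buys you can be seen in a few lines. This is only a sketch with an invented class name: two equal strings created separately are distinct objects, but their interned versions are the same object.

```java
class InternDemo {
    // Returns the canonical, pooled instance for the given token.
    static String canonical(String token) {
        return token.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces distinct objects, standing in for tokens cut from separate lines.
        String first = new String("IBM");
        String second = new String("IBM");
        System.out.println(first == second);                       // false: two objects
        System.out.println(canonical(first) == canonical(second)); // true: one shared object
    }
}
```

Whether this saves memory in practice depends on how often tokens repeat, so it is worth measuring before adopting it widely.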
StringBuilder is usually used like this:
StringBuilder sb = new StringBuilder();
sb.append("You").append(" can chain ")
.append(" your ").append(" strings ")
.append("for better readability.");
String myString = sb.toString(); // only call once when you are done
System.out.println(sb); // also calls sb.toString().. print myString instead
StringBuilder has several advantages:
StringBuffer's operations are synchronized but StringBuilder's are not, so using StringBuilder will improve performance in single-threaded scenarios.
Once the buffer has been expanded, it can be reused by invoking setLength(0) on the object. Interestingly, if you step through in a debugger and examine the StringBuilder's internals, you will see that the old contents still exist even after invoking setLength(0). The call simply resets the length to zero, and the next chars you append overwrite the old ones from the start of the buffer.
If you are not sure about the length of the string, it is better to use StringBuilder, because once the buffer has been expanded you can reuse the same buffer for strings of smaller or equal size.
StringBuffer and StringBuilder are almost the same in all operations, except that StringBuffer is synchronized and StringBuilder is not.
If you don't have multithreading, it is better to use StringBuilder.
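The setLength(0) reuse pattern mentioned above can be sketched as follows. The class, the helper names and the row data are hypothetical. One builder is cleared and refilled for every row, and its capacity is left untouched by the reset.

```java
import java.util.ArrayList;
import java.util.List;

class BuilderReuseDemo {
    // Renders each row as comma-separated text, reusing a single builder.
    static List<String> renderRows(int[][] rows) {
        StringBuilder sb = new StringBuilder(64);
        List<String> lines = new ArrayList<>();
        for (int[] row : rows) {
            sb.setLength(0); // logically empties the builder; the backing buffer is kept
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(row[i]);
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    // Checks that resetting the length does not shrink the buffer.
    static boolean capacitySurvivesReset() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            sb.append("padding ");
        }
        int grown = sb.capacity();
        sb.setLength(0);
        return sb.length() == 0 && sb.capacity() == grown;
    }

    public static void main(String[] args) {
        System.out.println(renderRows(new int[][] { { 1, 2, 3 }, { 4 }, { 5, 6 } }));
        System.out.println(capacitySurvivesReset());
    }
}
```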