How can we re assign the value of a StringBuffer or StringBuilder Variable?
StringBuffer sb=new StringBuffer("teststr");
Now i have to change the value of sb to "testString" without emptying the contents.
I am looking at a method which can do this assignment directly without using separate memory allocation.I think we can do it only after emptying the contents.
sb.setLength(0);
sb.append("testString");
It should first be mentioned that StringBuilder is generally preferred to StringBuffer. From StringBuffer's own API:
As of release JDK 5, this class has been supplemented with an equivalent class designed for use by a single thread, StringBuilder. The StringBuilder class should generally be used in preference to this one, as it supports all of the same operations but it is faster, as it performs no synchronization.
That said, I will stick to StringBuffer for the rest of the answer because that's what you're asking; everything that StringBuffer does, StringBuilder also... except synchronization, which is generally unneeded. So unless you're using the buffer in multiple threads, switching to StringBuilder is a simple task.
The question
StringBuffer sb = new StringBuffer("teststr");
"Now i have to change the value of sb to "testString" without emptying the contents"
So you want sb to have the String value "testString" in its buffer? There are many ways to do this, and I will list some of them to illustrate how to use the API.
The optimal solution: it performs the minimum edit from "teststr" to "testString". It's impossible to do it any faster than this.
StringBuffer sb = new StringBuffer("teststr");
sb.setCharAt(4, 'S');
sb.append("ing");
assert sb.toString().equals("testString");
This needlessly overwrites "tr" with "tr".
StringBuffer sb = new StringBuffer("teststr");
sb.replace(4, sb.length(), "String");
assert sb.toString().equals("testString");
This involves shifts due to deleteCharAt and insert.
StringBuffer sb = new StringBuffer("teststr");
sb.deleteCharAt(4);
sb.insert(4, 'S');
sb.append("ing");
assert sb.toString().equals("testString");
This is a bit different now: it doesn't magically know that it has "teststr" that it needs to edit to "testString"; it assumes only that the StringBuffer contains at least one occurrence of "str" somewhere, and that it needs to be replaced by "String".
StringBuffer sb = new StringBuffer("strtest");
int idx = sb.indexOf("str");
sb.replace(idx, idx + 3, "String");
assert sb.toString().equals("Stringtest");
Let's say now that you want to replace ALL occurrences of "str" and replace it with "String". A StringBuffer doesn't have this functionality built-in. You can try to do it yourself in the most efficient way possible, either in-place (probably with a 2-pass algorithm) or using a second StringBuffer, etc.
But instead I will use the replace(CharSequence, CharSequence) from String. This will be more than good enough in most cases, and is definitely a lot more clear and easier to maintain. It's linear in the length of the input string, so it's asymptotically optimal.
String before = "str1str2str3";
String after = before.replace("str", "String");
assert after.equals("String1String2String3");
Discussions
"I am looking for the method to assign value later by using previous memory location"
The exact memory location shouldn't really be a concern for you; in fact, both StringBuilder and StringBuffer will reallocate its internal buffer to different memory locations whenever necessary. The only way to prevent that would be to ensureCapacity (or set it through the constructor) so that its internal buffer will always be big enough and it would never need to be reallocated.
However, even if StringBuffer does reallocate its internal buffer once in a while, it should not be a problem in most cases. Most data structures that dynamically grows (ArrayList, HashMap, etc) do them in a way that preserves algorithmically optimal operations, taking advantage of cost amortization. I will not go through amortized analysis here, but unless you're doing real-time systems etc, this shouldn't be a problem for most applications.
Obviously I'm not aware of the specifics of your need, but there is a fear of premature optimization since you seem to be worrying about things that most people have the luxury of never having to worry about.
What do you mean with "reassign"? You can empty the contents by using setLength() and then start appending new content, if that's what you mean.
Edit: For changing parts of the content, you can use replace().
Generally, this kind of question can be easily answered by looking at the API doc of the classes in question.
You can use a StringBuilder in place of a StringBuffer, which is typically what people do if they can (StringBuilder isn't synchronized so it is faster but not threadsafe). If you need to initialize the contents of one with the other, use the toString() method to get the string representation. To recycle an existing StringBuilder or StringBuffer, simply call setLength(0).
Edit
You can overwrite a range of elements with the replace() function. To change the entire value to newval, you would use buffer.replace(0,buffer.length(),newval). See also:
StringBuilder
StringBuffer
You might be looking for the replace() method of the StringBuffer:
StringBuffer sb=new StringBuffer("teststr");
sb.replace(0, sb.length() - 1, "newstr");
Internally, it removes the original string, then inserts the new string, but it may save you a step from this:
StringBuffer sb=new StringBuffer("teststr");
sb.delete(0, sb.length() - 1);
sb.append("newstr");
Using setLength(0) reassigns a zero length StringBuffer to the variable, which, I guess, is not what you want:
StringBuffer sb=new StringBuffer("teststr");
// Reassign sb to a new, empty StringBuffer
sb.setLength(0);
sb.append("newstr");
Indeed, I think replace() is the best way. I checked the Java-Source code. It really overwrites the old characters.
Here is the source code from replace():
public AbstractStringBuffer replace(int start, int end, String str)
{
if (start < 0 || start > count || start > end)
throw new StringIndexOutOfBoundsException(start);
int len = str.count;
// Calculate the difference in 'count' after the replace.
int delta = len - (end > count ? count : end) + start;
ensureCapacity_unsynchronized(count + delta);
if (delta != 0 && end < count)
VMSystem.arraycopy(value, end, value, end + delta, count - end);
str.getChars(0, len, value, start);
count += delta;
return this;
}
Changing entire value of StringBuffer:
StringBuffer sb = new StringBuffer("word");
sb.setLength(0); // setting its length to 0 for making the object empty
sb.append("text");
This is how you can change the entire value of StringBuffer.
You can convert to/from a String, as follows:
StringBuffer buf = new StringBuffer();
buf.append("s1");
buf.append("s2");
StringBuilder sb = new StringBuilder(buf.toString());
// Now sb, contains "s1s2" and you can further append to it
Related
I want to read an input stream with a buffer and append each buffer to a string until the stream is empty.
char[] buffer = new char[32];
while ((bytesRead = streamReader.read(buffer, 0, 32)) != -1)
message += new String(buffer);
I couldn't find a method of String to append a character array directly and opted for this. Is there a way to know if this is wasting cycles copying the character array to a new string, only to immediately discard it?
This may be premature optimization but it glares at me pretty strongly.
I believe this constructor calls Arrays.copyOf().
No.
There are more efficient ways. Using a java.util.StringBuilder in your loop is more efficient as it is performing a series of operations. In your loop, += will create another String object each time. For example, if you were to do String d = a + b + c + d... for 1_000_000 Strings, then you'd be making 1_000_000 String objects as waste. Strings are immutable.
Related Link:
String concatenation: concat() vs "+" operator
Off-Site Link but still related
https://dzone.com/articles/string-concatenation-performacne-improvement-in-ja
I'd like to know for character concatenation in Java - which one of the below method would be better for readability, maintenance and performance - either 'char array' or 'string builder'.
The method has to take the first letter from both the strings, append and return it.
Eg:
Input 1: ABC Input 2: DEF -> method should return AD.
using string builder:
private String getString(String str1, String str2) {
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.append(str1.charAt(0));
stringBuilder.append(str2.charAt(0));
return stringBuilder.toString();
}
using char array:
private String getString(String str1, String str2) {
char[] charArray = new char[2];
charArray[0] = str1.charAt(0);
charArray[1] = str2.charAt(0);
return String.valueOf(charArray);
}
StringBuilder is just a wrapper around a char[], adding functionality like resizing the array as necessary; and moving elements when you insert/delete etc.
It might be marginally faster to use the char[] directly for some things, but you'd lose (or have to reimplement) a lot of the useful functionality.
charArray is good in term of Performance and readability too but it hard to maintain the code like this. It can cause the error like Null pointer. You just need to add the null check with char[] code.
On the other side StringBuffer internally use the char. So, char is better here and also by doing this we are not creating an Object. Memory point of view. It's good not to create that one.
If you review the source code for StringBuilder, you will find that internally it uses a char[] to represent the buffered string. So both versions of your code are doing very similar things. However, I would vote for using StringBuilder, because it offers an API which can much more than the plain char[] which sits inside its implementation.
I'm learning Java and am wondering what's the best way to modify strings here (both for performance and to learn the preferred method in Java). Assume you're looping through a string and checking each character/performing some action on that index in the string.
Do I use the StringBuilder class, or convert the string into a char array, make my modifications, and then convert the char array back to a string?
Example for StringBuilder:
StringBuilder newString = new StringBuilder(oldString);
for (int i = 0; i < oldString.length() ; i++) {
newString.setCharAt(i, 'X');
}
Example for char array conversion:
char[] newStringArray = oldString.toCharArray();
for (int i = 0; i < oldString.length() ; i++) {
myNameChars[i] = 'X';
}
myString = String.valueOf(newStringArray);
What are the pros/cons to each different way?
I take it that StringBuilder is going to be more efficient since the converting to a char array makes copies of the array each time you update an index.
I say do whatever is most readable/maintainable until you you know that String "modification" is slowing you down. To me, this is the most readable:
Sting s = "foo";
s += "bar";
s += "baz";
If that's too slow, I'd use a StringBuilder. You may want to compare this to StringBuffer. If performance matters and synchronization does not, StringBuilder should be faster. If sychronization is needed, then you should use StringBuffer.
Also it's important to know that these strings are not being modified. In java, Strings are immutable.
This is all context specific. If you optimize this code and it doesn't make a noticeable difference (and this is usually the case), then you just thought longer than you had to and you probably made your code more difficult to understand. Optimize when you need to, not because you can. And before you do that, make sure the code you're optimizing is the cause of your performance issue.
What are the pros/cons to each different way. I take it that StringBuilder is going to be more efficient since the convering to a char array makes copies of the array each time you update an index.
As written, the code in your second example will create just two arrays: one when you call toCharArray(), and another when you call String.valueOf() (String stores data in a char[] array). The element manipulations you are performing should not trigger any object allocations. There are no copies being made of the array when you read or write an element.
If you are going to be doing any sort of String manipulation, the recommended practice is to use a StringBuilder. If you are writing very performance-sensitive code, and your transformation does not alter the length of the string, then it might be worthwhile to manipulate the array directly. But since you are learning Java as a new language, I am going to guess that you are not working in high frequency trading or any other environment where latency is critical. Therefore, you are probably better off using a StringBuilder.
If you are performing any transformations that might yield a string of a different length than the original, you should almost certainly use a StringBuilder; it will resize its internal buffer as necessary.
On a related note, if you are doing simple string concatenation (e.g, s = "a" + someObject + "c"), the compiler will actually transform those operations into a chain of StringBuilder.append() calls, so you are free to use whichever you find more aesthetically pleasing. I personally prefer the + operator. However, if you are building up a string across multiple statements, you should create a single StringBuilder.
For example:
public String toString() {
return "{field1 =" + this.field1 +
", field2 =" + this.field2 +
...
", field50 =" + this.field50 + "}";
}
Here, we have a single, long expression involving many concatenations. You don't need to worry about hand-optimizing this, because the compiler will use a single StringBuilder and just call append() on it repeatedly.
String s = ...;
if (someCondition) {
s += someValue;
}
s += additionalValue;
return s;
Here, you'll end up with two StringBuilders being created under the covers, but unless this is an extremely hot code path in a latency-critical application, it's really not worth fretting about. Given similar code, but with many more separate concatenations, it might be worth optimizing. Same goes if you know the strings might be very large. But don't just guess--measure! Demonstrate that there's a performance problem before you try to fix it. (Note: this is just a general rule for "micro optimizations"; there's rarely a downside to explicitly using a StringBuilder. But don't assume it will make a measurable difference: if you're concerned about it, you should actually measure.)
String s = "";
for (final Object item : items) {
s += item + "\n";
}
Here, we're performing a separate concatenation operation on each loop iteration, which means a new StringBuilder will be allocated on each pass. In this case, it's probably worth using a single StringBuilder since you may not know how large the collection will be. I would consider this an exception to the "prove there's a performance problem before optimizing rule": if the operation has the potential to explode in complexity based on input, err on the side of caution.
Which option will perform the best is not an easy question.
I did a benchmark using Caliper:
RUNTIME (NS)
array 88
builder 126
builderTillEnd 76
concat 3435
Benchmarked methods:
public static String array(String input)
{
char[] result = input.toCharArray(); // COPYING
for (int i = 0; i < input.length(); i++)
{
result[i] = 'X';
}
return String.valueOf(result); // COPYING
}
public static String builder(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result.toString(); // COPYING
}
public static StringBuilder builderTillEnd(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result;
}
public static String concat(String input)
{
String result = "";
for (int i = 0; i < input.length(); i++)
{
result += 'X'; // terrible COPYING, COPYING, COPYING... same as:
// result = new StringBuilder(result).append('X').toString();
}
return result;
}
Remarks
If we want to modify a String, we have to do at least 1 copy of that input String, because Strings in Java are immutable.
java.lang.StringBuilder extends java.lang.AbstractStringBuilder. StringBuilder.setCharAt() is inherited from AbstractStringBuilder and looks like this:
public void setCharAt(int index, char ch) {
if ((index < 0) || (index >= count))
throw new StringIndexOutOfBoundsException(index);
value[index] = ch;
}
AbstractStringBuilder internally uses the simplest char array: char value[]. So, result[i] = 'X' is very similar to result.setCharAt(i, 'X'), however the second will call a polymorphic method (which probably gets inlined by JVM) and check bounds in if, so it will be a bit slower.
Conclusions
If you can operate on StringBuilder until the end (you don't need String back) - do it. It's the preferred way and also the fastest. Simply the best.
If you want String in the end and this is the bottleneck of your program, then you might consider using char array. In benchmark char array was ~25% faster than StringBuilder. Be sure to properly measure execution time of your program before and after optimization, because there is no guarantee about this 25%.
Never concatenate Strings in the loop with + or +=, unless you really know what you do. Usally it's better to use explicit StringBuilder and append().
I'd prefer to use StringBuilder class where original string is modified.
For String manipulation, I like StringUtil class. You'll need to get Apache commons dependency to use it
which of the following is an efficient way to reverse words in a string ?
public String Reverse(StringTokenizer st){
String[] words = new String[st.countTokens()];
int i = 0;
while(st.hasMoreTokens()){
words[i] = st.nextToken();i++}
for(int j = words.length-1;j--)
output = words[j]+" ";}
OR
public String Reverse(StringTokenizer st, String output){
if(!st.hasMoreTokens()) return output;
output = st.nextToken()+" "+output;
return Reverse(st, output);}
public String ReverseMain(StringTokenizer st){
return Reverse(st, "");}
while the first way seems more readable and straight forward, there are two loops in it. In the 2nd method, I've tried doing it in tail-recursive way. But I am not sure whether java does optimize tail-recursive code.
you could do this in just one loop
public String Reverse(StringTokenizer st){
int length = st.countTokens();
String[] words = new String[length];
int i = length - 1;
while(i >= 0){
words[i] = st.nextToken();i--}
}
But I am not sure whether java does optimize tail-recursive code.
It doesn't. Or at least the Sun/Oracle Java implementations don't, up to and including Java 7.
References:
"Tail calls in the VM" by John Rose # Oracle.
Bug 4726340 - RFE: Tail Call Optimization
I don't know whether this makes one solution faster than the other. (Test it yourself ... taking care to avoid the standard micro-benchmarking traps.)
However, the fact that Java doesn't implement tail-call optimization means that the 2nd solution is liable to run out of stack space if you give it a string with a large (enough) number of words.
Finally, if you are looking for a more space efficient way to implement this, there is clever way that uses just a StringBuilder.
Create a StringBuilder from your input String
Reverse the characters in the StringBuilder using reverse().
Step through the StringBuilder, identifying the start and end offset of each word. For each start/end offset pair, reverse the characters between the offsets. (You have to do this using a loop.)
Turn the StringBuilder back into a String.
You can test results by timing both of them on a large amount of results
eg. You reverse 100000000 strings and see how many seconds it takes. You could also compare start and end system timestamps to get the exact difference between the two functions.
StringTokenizer is not deprecated but if you read the current JavaDoc...
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
String[] strArray = str.split(" ");
StringBuilder sb = new StringBuilder();
for (int i = strArray.length() - 1; i >= 0; i--)
sb.append(strArray[i]).append(" ");
String reversedWords = sb.substring(0, sb.length -1) // strip trailing space
I am reading a csv file that has about 50,000 lines and 1.1MiB in size (and can grow larger).
In Code1, I use String to process the csv, while in Code2 I use StringBuilder (only one thread executes the code, so no concurrency issues)
Using StringBuilder makes the code a little bit harder to read that using normal String class.
Am I prematurely optimizing things with StringBuilder in Code2 to save a bit of heap space and memory?
Code1
fr = new FileReader(file);
BufferedReader reader = new BufferedReader(fr);
String line = reader.readLine();
while ( line != null )
{
int separator = line.indexOf(',');
String symbol = line.substring(0, seperator);
int begin = separator;
separator = line.indexOf(',', begin+1);
String price = line.substring(begin+1, seperator);
// Publish this update
publisher.publishQuote(symbol, price);
// Read the next line of fake update data
line = reader.readLine();
}
Code2
fr = new FileReader(file);
StringBuilder stringBuilder = new StringBuilder(reader.readLine());
while( stringBuilder.toString() != null ) {
int separator = stringBuilder.toString().indexOf(',');
String symbol = stringBuilder.toString().substring(0, separator);
int begin = separator;
separator = stringBuilder.toString().indexOf(',', begin+1);
String price = stringBuilder.toString().substring(begin+1, separator);
publisher.publishQuote(symbol, price);
stringBuilder.replace(0, stringBuilder.length(), reader.readLine());
}
Edit
I eliminated the toString() call, so there will be less string objects produced.
Code3
while( stringBuilder.length() > 0 ) {
int separator = stringBuilder.indexOf(",");
String symbol = stringBuilder.substring(0, separator);
int begin = separator;
separator = stringBuilder.indexOf(",", begin+1);
String price = stringBuilder.substring(begin+1, separator);
publisher.publishQuote(symbol, price);
Thread.sleep(10);
stringBuilder.replace(0, stringBuilder.length(), reader.readLine());
}
Also, the original code is downloaded from http://www.devx.com/Java/Article/35246/0/page/1
Will the optimized code increase performance of the app? - my question
The second code sample will not save you any memory nor any computation time. I am afraid you might have misunderstood the purpose of StringBuilder, which is really meant for building strings - not reading them.
Within the loop or your second code sample, every single line contains the expression stringBuilder.toString(), essentially turning the buffered string into a String object over and over again. Your actual string operations are done against these objects. Not only is the first code sample easier to read, but it is most certainly as performant of the two.
Am I prematurely optimizing things with StringBuilder? - your question
Unless you have profiled your application and have come to the conclusion that these very lines causes a notable slowdown on the execution speed, yes. Unless you are really sure that something will be slow (eg if you recognize high computational complexity), you definately want to do some profiling before you start making optimizations that hurt the readability of your code.
What kind of optimizations could be done to this code? - my question
If you have profiled the application, and decided this is the right place for an optimization, you should consider looking into the features offered by the Scanner class. Actually, this might both give you better performance (profiling will tell you if this is true) and more simple code.
Am I prematurely optimizing things with StringBuilder in Code2 to save a bit of heap space and memory?
Most probably: yes. But, only one way to find out: profile your code.
Also, I'd use a proper CSV parser instead of what you're doing now: http://ostermiller.org/utils/CSV.html
Code2 is actually less efficient than Code1 because every time you call stringBuilder.toString() you're creating a new java.lang.String instance (in addition to the existing StringBuilder object). This is less efficient in terms of space and time due to the object creation overhead.
Assigning the contents of readLine() directly to a String and then splitting that String will typically be performant enough. You could also consider using the Scanner class.
Memory Saving Tip
If you encounter multiple repeating tokens in your input consider using String.intern() to ensure that each identical token references the same String object; e.g.
String[] tokens = parseTokens(line);
for (String token : tokens) {
// Construct business object referencing interned version of token.
BusinessObject bo = new BusinessObject(token.intern());
// Add business object to collection, etc.
}
StringBuilder is usually used like this:
StringBuilder sb = new StringBuilder();
sb.append("You").append(" can chain ")
.append(" your ").append(" strings ")
.append("for better readability.");
String myString = sb.toString(); // only call once when you are done
System.out.prinln(sb); // also calls sb.toString().. print myString instead
StringBuilder has several good things
StringBuffer's operations are synchronized but StringBuilder is not, so using StringBuilder will improve performance in single threaded scenarios
Once the buffer is expanded the buffer can be reused by invoking setLength(0) on the object. Interestingly if you step into the debugger and examine the contents of StringBuilder you will see that contents are still exists even after invoking setLength(0). The JVM simply resets the pointer beginning of the string. Next time when you start appending the chars the pointer moves
If you are not really sure about length of string, it is better to use StringBuilder because once the buffer is expanded you can reuse the same buffer for smaller or equal size
StringBuffer and StringBuilder are almost same in all operations except that StringBuffer is synchronized and StringBuilder is not
If you dont have multithreading then it is better to use StringBuilder