What's the best way to set a StringBuilder object?

I created a StringBuilder object without any value and appended some values afterwards. I later wanted to replace the object with an entirely different string.
Here's the code:
StringBuilder finalVersion = new StringBuilder();
finalVersion.append("6.0");
for (int i = 0; i < list.length(); i++) {
    if (/*...*/) {
        finalVersion.append(".2");
    } else {
        finalVersion.append(".1");
    }
    if (/*...*/) {
        if (/*...*/) {
            finalVersion = new StringBuilder("Invalid parameter");
        }
    }
}
What I've done is I created a new object to change the value, but perhaps there is a better approach without using stringBuilder.setLength(0);.
Could someone help?

To solve it, I created a new object to change the value, but I guess there is a better approach without using sb.setLength(0)
Both of those are good approaches. Another is sb.delete(0, sb.length()).
And when you want to replace the existing string content, sb.replace(0, sb.length(), "new content") is another option.
It really depends what your goal is:
Calling new StringBuffer(...) will give you a freshly allocated object with either the default capacity or a capacity that you specify.
The setLength, delete and replace approaches will recycle the existing object. This has advantages and disadvantages.
On the plus side, you don't allocate a new object [1], so there is less garbage.
On the minus side, the string buffer uses the same amount of space as before, whether or not it needs to. Also, if you do this repeatedly, the chances are that the buffer and its backing array will be tenured by the GC, adding to the long-term memory load. You can free the unused capacity by calling sb.trimToSize(), but that is liable to cause a reallocation; i.e. what you were trying to avoid by not using new.
My advice would be to use new unless either the context means that you can't, or your profiling tells you that new is generating too much garbage.
Looking at the code [2], I think that setLength should be marginally faster than delete for emptying a StringBuffer. It gets more complicated when you are replacing the contents with new contents. Now you are comparing
sb.setLength(0); sb.append("new content");
versus
sb.replace(0, sb.length(), "new content");
It needs to be measured ... if you care enough [3] about performance to be comparing the cases.
[1] Unless the replacement string is large enough that the buffer needs to grow to hold it.
[2] By my reading of various versions of the StringBuilder and AbstractStringBuilder code, the delete method will always call System.arraycopy. However, to understand the performance impact, one would need to benchmark this carefully for different sizes of StringBuilder and across different Java versions.
[3] Actually, if you need to care. Beware the evils of premature optimization.
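The capacity trade-off described above can be observed directly. The following is a minimal sketch, assuming a StringBuilder rather than a StringBuffer; the class and method names are made up for illustration. The exact capacities depend on the JDK's growth policy, so only their relationship is meaningful.

```java
final class BuilderCapacityDemo {
    // Returns {capacity before reset, capacity after setLength(0), capacity after trimToSize()}.
    static int[] capacities() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("0123456789");   // grow the backing array well past the default size
        }
        int before = sb.capacity();
        sb.setLength(0);               // empties the builder but keeps the backing array
        int afterReset = sb.capacity();
        sb.trimToSize();               // may reallocate a smaller backing array
        int afterTrim = sb.capacity();
        return new int[] { before, afterReset, afterTrim };
    }

    public static void main(String[] args) {
        int[] c = capacities();
        System.out.println("before=" + c[0] + " afterReset=" + c[1] + " afterTrim=" + c[2]);
        System.out.println("fresh=" + new StringBuilder().capacity());
    }
}
```

A recycled builder keeps whatever capacity it grew to, while a freshly constructed one starts again at the documented default of 16 characters.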

You can use replace. This method doesn't create a new StringBuilder object.
builder.replace(0, builder.length(), "Invalid parameter");
Or, if doing it in two statements is fine with you, you can setLength(0), then append.
But really, you shouldn't worry about creating a new StringBuilder unless you actually encounter performance problems.
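As a rough illustration, the options mentioned in this thread can be placed side by side. This is only a sketch; the class and method names are hypothetical, and all three produce the same final content.

```java
final class ReplaceContentsDemo {
    // Overwrites the whole content in one call, reusing the same builder.
    static String viaReplace(StringBuilder builder, String text) {
        builder.replace(0, builder.length(), text);
        return builder.toString();
    }

    // Empties the builder first, then appends, again reusing the same builder.
    static String viaSetLength(StringBuilder builder, String text) {
        builder.setLength(0);
        builder.append(text);
        return builder.toString();
    }

    // Allocates a fresh builder, as in the question.
    static String viaNew(String text) {
        return new StringBuilder(text).toString();
    }
}
```

Which one is preferable is mostly a readability question unless profiling says otherwise.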

Related

Java software design - Looping, object creation VS modifying variables. Memory, performance & reliability comparison

Let's say we are trying to build a document scanner class in Java that takes one input argument, the log path (e.g. C:\document\text1.txt). Which of the following implementations would you prefer based on performance/memory/modularity?
ArrayList<String> fileListArray = new ArrayList<String>();
fileListArray.add("C:\\document\\text1.txt");
fileListArray.add("C:\\document\\text2.txt");
.
.
.
//Implementation A
for(int i =0, j = fileListArray.size(); i < j; i++){
MyDocumentScanner ds = new MyDocumentScanner(fileListArray.get(i));
ds.scanDocument();
ds.resultOutput();
}
//Implementation B
MyDocumentScanner ds = new MyDocumentScanner();
for(int i=0, j=fileListArray.size(); i < j; i++){
ds.setDocPath(fileListArray.get(i));
ds.scanDocument();
ds.resultOutput();
}
Personally I would prefer A due to its encapsulation, but it seems like more memory usage due to creation of multiple instances. I'm curious if there is an answer to this, or it is another "that depends on the situation/circumstances" dilemma?

Although this is obviously opinion-based, I will try to give my opinion.
Your approach A is far better. Your document scanner obviously handles a file. That should be set at construction time and be saved in an instance field, so every method can refer to this field. Moreover, the constructor can do some checks on the file reference (null check, existence, ...).
Your approach B has two very serious disadvantages:
After constructing a document scanner, clients could easily call all of the methods. If no file was set before, you must handle that "illegal state" with maybe an IllegalStateException. Thus, this approach increases code and complexity of that class.
There seems to be a series of method calls that a client should or can perform. It's easy to call the file setting method again in the middle of such a series with a completely other file, breaking the whole scan facility. To avoid this, your setter (for the file) should remember whether a file was already set. And that nearly automatically leads to approach A.
Regarding the creation of objects: Modern JVMs are really very fast at creating objects. Usually, there is no measurable performance overhead for that. The processing time (here: the scan) usually is much higher.
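A minimal sketch of what approach A could look like, assuming the path is passed as a String. The class name PathBoundScanner and the body of scanDocument are placeholders, not the asker's actual MyDocumentScanner.

```java
final class PathBoundScanner {
    private final String docPath;   // set once in the constructor, never changes afterwards

    PathBoundScanner(String docPath) {
        // Validate up front so no method ever sees a scanner without a file.
        if (docPath == null || docPath.isEmpty()) {
            throw new IllegalArgumentException("docPath must be a non-empty path");
        }
        this.docPath = docPath;
    }

    // Placeholder for the real scan; here it only reports which file it would read.
    String scanDocument() {
        return "scanned " + docPath;
    }
}
```

Because there is no setter, the "no file set yet" and "file changed mid-scan" states described above cannot occur.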

If you don't need multiple instances of DocumentScanner to co-exist, I see no point in creating a new instance in each iteration of the loop. It just creates work for the garbage collector, which has to free each of those instances.
If the length of the array is small, it doesn't make much difference which implementation you choose, but for large arrays, implementation B is more efficient, both in terms of memory (fewer instances created that the GC hasn't freed yet) and CPU (less work for the GC).

Are you implementing DocumentScanner or using an existing class?
If the latter, and it was designed for being able to parse multiple documents in a row, you can just reuse the object as in variant B.
However, if you are designing DocumentScanner, I would recommend designing it such that it handles a single document and does not even have a setDocPath method. This leads to less mutable state in that class and thus makes its design much easier. Using an instance of the class also becomes less error-prone.
As for performance, there won't be a measurable difference unless instantiating a DocumentScanner is doing a lot of work (like instantiating many other objects, too). Instantiating and freeing objects in Java is pretty cheap if they are used only for a short time due to the generational garbage collector.

java performance : string literal

I would like to know about the relation between string literals and Java performance.
For example, does using statements like the ones below many times have any impact on performance? I have thousands of classes, and in many places we use statements like these:
1) buffer.append(",");
2) buffer.append("}");
3) String.append("10,000 times...same lines") // printing same lines in many classes
4) String.someStringMethod("same line many times") // using any String method
Does this have a performance impact in terms of memory management etc.? Is there a cleaner way?
Thanks

It is really difficult to comment on examples that make no sense. However:
In general there are no particular efficiency concerns with Java String literals.
In general there are no particular efficiency concerns with methods that take String literals as arguments.
String concatenation / building can present efficiency concerns if a particular piece of code is executed often enough. However, if you need to build strings, then there is not a lot you can do about it.
There are one or two things that it is worth taking steps to avoid. The main one is this:
String s = "";
for (/* lots of times */) {
// do stuff
s += someOtherString;
}
The problem is that this generates and then discards lots of temporary Strings. The more efficient way to do it is this:
StringBuilder sb = new StringBuilder();
for (/* lots of times */) {
// do stuff
sb.append(someOtherString);
}
String s = sb.toString();
However, it is probably only worth while optimizing this kind of thing if the profiler tells you that this particular bit of code is a bottleneck.

Any code you write affects performance, so it's always better to invoke one append() with ",}" instead of 2 appends.
There's no method append in java.lang.String class.
None of the String methods make changes to the String object. Instead, methods like String.substring(), String.concat(), String.replace() create new String objects. This means performance is affected more significantly than if you use StringBuffer.
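The difference between the two kinds of methods can be seen in a small sketch. The class and method names below are made up for illustration.

```java
final class ImmutableStringDemo {
    // String methods return new objects; the receiver is never modified.
    static String afterStringCalls() {
        String s = "abc";
        s.concat("def");        // result discarded, s is untouched
        s.replace('a', 'x');    // likewise
        return s;
    }

    // StringBuffer methods modify the buffer in place.
    static String afterBufferCalls() {
        StringBuffer sb = new StringBuffer("abc");
        sb.append("def");
        return sb.toString();
    }
}
```

Here afterStringCalls() still returns "abc", while afterBufferCalls() returns "abcdef".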
So generally StringBuffer methods are faster than those of String. However, a newer class called StringBuilder was introduced in Java 5. There's only one difference from StringBuffer: it's not thread-safe. In real-world cases thread management is taken care of by higher-level containers, making it unnecessary to ensure thread safety for each class. In those cases you're advised to use StringBuilder. That should be the fastest.
To further improve the performance of StringBuilder, you have to know the length of the resulting string so that you can allocate a StringBuilder of the proper size. If it's too big you'll waste some memory, but that's usually a minor problem. If it's too small, though, StringBuilder will have to recreate its internal character array to make space for more characters. That would make that particular append() invocation slow. Actually, it's not the invocation that's slow, but the garbage collection invoked to clean the memory up.
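One way to apply the sizing advice is to compute an upper bound before building. The sketch below assumes the parts are already available as an array; the join helper is hypothetical and only illustrates the capacity constructor.

```java
final class PresizedBuilderDemo {
    // Joins the parts with commas, sizing the builder once up front.
    static String join(String[] parts) {
        int total = 0;
        for (String p : parts) {
            total += p.length() + 1;               // +1 leaves room for a separator
        }
        StringBuilder sb = new StringBuilder(total);  // no regrowth needed while appending
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }
}
```

Whether this is worth the extra pass over the data is something only a profiler can tell for a given workload.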
Particular methods of String class may be better or faster than those in StringBuffer/StringBuilder, but you have to be more specific with your questions for me to answer that.

The "Why" behind PMD's StringInstantiation rule

Along the lines of an existing thread, The “Why” behind PMD's rules, I'm trying to figure out the meaning of one particular PMD rule : String and StringBuffer Rules.StringInstantiation.
This rule states that you shouldn't explicitly instantiate String objects. As per their manual page :
Avoid instantiating String objects; this is usually unnecessary since they are immutable and can be safely shared.
This rule is defined by the following Java class: net.sourceforge.pmd.lang.java.rule.strings.StringInstantiationRule
Example(s):
private String bar = new String("bar"); // just do a String bar = "bar";
http://pmd.sourceforge.net/pmd-5.0.1/rules/java/strings.html
I don't see how this syntax is a problem, other than it being pointless. Does it affect overall performance?
Thanks for any thought.

With String foo = "foo" there will be one instance of "foo" in the string pool (this is referred to as string interning; the pool lived in PermGen space up to Java 6 and has been on the main heap since Java 7). If you were to later type String bar = "foo" there would still only be one "foo" in the pool.
Writing String foo = new String( "foo" ) will also create a String object to count against the heap.
Thus, the rule is there to prevent wasting memory.
Cheers,

It shouldn't usually affect performance in any measurable way, but:
private String bar = new String("bar"); // just do a String bar = "bar";
If you execute this line a million times you will have created a million objects
private String bar = "bar"; // just do a String bar = "bar";
If you execute this line a million times you will have created one Object.
There are scenarios where that actually makes a difference.
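The object-count claim can be checked with identity comparisons. This is a small sketch with made-up names; using == on strings is deliberate here, because the point is object identity rather than content.

```java
final class StringIdentityDemo {
    // Two identical literals refer to the same interned instance.
    static boolean literalsShareInstance() {
        String a = "bar";
        String b = "bar";
        return a == b;
    }

    // The constructor always yields a new object with equal content.
    static boolean constructorMakesNewInstance() {
        String a = "bar";
        String c = new String("bar");
        return a != c && a.equals(c);
    }

    // Each constructor call yields yet another object.
    static boolean eachCallIsDistinct() {
        String first = new String("bar");
        String second = new String("bar");
        return first != second;
    }
}
```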

Does it affect overall performance?
Well, performance and maintenance. Doing something which is pointless makes the reader wonder why the code is there in the first place. When that pointless operation also involves creating new objects (two in this case - a new char[] and a new String) that's another reason to avoid it...
In the past, there has been a reason to call new String(existingString) if the existing string was originally obtained as a small substring of a longer string - or other ways of obtaining a string backed by a large character array. I believe that this is not the case with more recent implementations of Java, but obviously you can still be using an old one. This shouldn't be a problem for constant strings anyway, mind you.
(You could argue that creating a new object allows you to synchronize on it. I would avoid synchronizing on strings to start with though.)

One difference is the memory footprint:
String a = "abc"; //one object
String b = "abc"; //same object (i.e. a == b) => still one object in memory
String c = new String("abc"); // This is a new object - now 2 objects in memory
To be honest, the only reason I can think of for using the String constructor is in combination with substring, which used to return a view on the original string. Using the String constructor in that case helps get rid of the original string if it is no longer needed.
However, since Java 7u6 this is no longer the case, so I don't see any reason to use it any more.

It can be useful, because it creates a new identity, and sometimes object identities are important/crucial to an application. For example, it can be used as an internal sentinel value. There are other valid use cases too, e.g. to avoid a constant expression.
If a beginner writes such code, it's very likely a mistake. But that is a very short learning period. It is highly unlikely that any moderately experienced Java programmer would write that by mistake; it must be for a specific purpose. File it under "it looks like a stupid mistake, but it takes effort to make, so it's probably intended".
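The sentinel use case mentioned above might look roughly like this. It is a sketch only; the class SentinelDemo and its members are hypothetical.

```java
final class SentinelDemo {
    // Deliberately a fresh instance: no literal or caller-supplied string can be == to it.
    private static final String UNSET = new String("unset");

    private String value = UNSET;

    void set(String newValue) {
        value = newValue;
    }

    // Identity comparison on purpose; equals() would misfire if a caller stored "unset".
    boolean isSet() {
        return value != UNSET;
    }
}
```

With a plain literal as the sentinel, a caller passing the literal "unset" would be indistinguishable from the unset state.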

It is
pointless
confusing
slightly slower
You should try to write the simplest, clearest code you can. Adding pointless code is bad all round.

What will use more memory

I am working on improving the performance of my app. I am confused about which of the following will use more memory. Here sb is a StringBuffer.
String strWithLink = sb.toString();
clickHereTextview.setText(
Html.fromHtml(strWithLink.substring(0,strWithLink.indexOf("+"))));
OR
clickHereTextview.setText(
Html.fromHtml(sb.toString().substring(0,sb.toString().indexOf("+"))));

In terms of memory an expression such as
sb.toString().indexOf("+")
has little to no impact as the string will be garbage collected right after evaluation. (To avoid even the temporary memory usage, I would recommend doing
sb.indexOf("+")
instead though.)
However, there's a potential leak involved when you use String.substring. Last time I checked, substring basically returned a view of the original string, so the original string is still resident in memory.
The workaround is to do
String strWithLink = sb.toString();
... new String(strWithLink.substring(0,strWithLink.indexOf("+"))) ...
    ^^^^^^^^^^
to detach the wanted string, from the original (potentially large) string. Same applies for String.split as discussed over here:
Java String.split memory leak?

The second will use more memory, because each call to StringBuilder#toString() creates a new String instance.
http://www.docjar.com/html/api/java/lang/StringBuilder.java.html

Analysis
If we look at StringBuilder's OpenJDK sources:
public String toString() {
// Create a copy, don't share the array
return new String(value, 0, count);
}
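We see that it instantiates a whole new String object on every call. These are ordinary heap objects, not string-pool entries (only literals and explicitly interned strings go into the pool), so you get as many new instances as the number of times you call sb.toString().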
Outcome
Use String strWithLink = sb.toString(); and reuse that variable: every use then refers to the same String instance instead of creating a new one.
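What toString() returns can be probed with identity comparisons. The following sketch uses made-up names and a StringBuilder; the content "click+here" is arbitrary.

```java
final class ToStringInstancesDemo {
    // Two calls give two distinct objects with equal content.
    static boolean callsReturnDistinctObjects() {
        StringBuilder sb = new StringBuilder("click+here");
        String first = sb.toString();
        String second = sb.toString();
        return first != second && first.equals(second);
    }

    // The result is not the pooled instance that the equal literal refers to.
    static boolean resultIsNotThePooledLiteral() {
        StringBuilder sb = new StringBuilder("click+here");
        return sb.toString() != "click+here";
    }

    // Only an explicit intern() maps the result to the pooled instance.
    static boolean internReturnsPooledLiteral() {
        StringBuilder sb = new StringBuilder("click+here");
        return sb.toString().intern() == "click+here";
    }
}
```

Storing the result in a local variable therefore avoids the second allocation simply because toString() is called once.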

Check other people's answers, the second one does take a little bit more memory, but this sounds like you are over optimizing. Keeping your code clear and readable should be the priority. I'd suggest you don't worry so much about such tiny optimizations if readability will suffer.

The less work you do, the more efficient it usually is. In this case, you don't need to call toString at all
clickHereTextview.setText(Html.fromHtml(sb.substring(0, sb.indexOf("+"))));
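One caveat with the one-liner: indexOf returns -1 when the delimiter is absent, and substring(0, -1) then throws a StringIndexOutOfBoundsException. A hedged sketch of a guard follows; the helper name prefixBefore is made up, and falling back to the whole text is an assumption about the desired behaviour.

```java
final class PrefixBeforeDelimiter {
    // Returns the text before the first delimiter, or the whole text if there is none.
    static String prefixBefore(StringBuffer sb, String delimiter) {
        int end = sb.indexOf(delimiter);
        return end < 0 ? sb.toString() : sb.substring(0, end);
    }
}
```

In the question's code this would be used as Html.fromHtml(PrefixBeforeDelimiter.prefixBefore(sb, "+")).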

Creating new objects always takes up more memory. However, in your case the difference seems insignificant.
Also, in the first version you are creating a local variable; the variable itself is just a reference on the stack, while the String it points to lives on the heap.
Whenever the string is referenced in more than one place in your method, it is good to use
String strWithLink = sb.toString();, as you can use the same strWithLink everywhere. Otherwise, if there is only one reference, it's fine to just use sb.toString() directly.

Java: Difference between two ways when using new Object

For example, if you want to reverse a string, there are two ways.
The first:
String a = "StackOverFlow";
a = new StringBuffer(a).reverse().toString();
and the second:
String a = "StackOverFlow";
StringBuffer b = new StringBuffer(a);
a = b.reverse().toString();
About the code above, I have two questions:
1) In the first snippet, does Java create a "dummy object" StringBuffer in memory before doing the reverse and converting to String?
2) Is the first snippet more optimized than the second because it lets the GC work more effectively? (This is the main question I want to ask.)

Both snippets will create the same number of objects. The only difference is the number of local variables. This probably won't even change how many values are on the stack etc - it's just that in the case of the second version, there's a name for one of the stack slots (b).
It's very important that you differentiate between objects and variables. It's also important to write the most readable code you can first, rather than trying to micro-optimize. Once you've got clear, working code you should measure to see whether it's fast enough to meet your requirements. If it isn't, you should profile it to work out where you can make changes most effectively, and optimize that section, then remeasure, etc.
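For reference, the two variants from the question can be written as methods and compared. This is only a sketch with hypothetical names; both create exactly one StringBuffer and one result String per call.

```java
final class ReverseDemo {
    // First variant: the StringBuffer is never given a name.
    static String inline(String a) {
        return new StringBuffer(a).reverse().toString();
    }

    // Second variant: same objects, plus a named local variable b.
    static String withLocal(String a) {
        StringBuffer b = new StringBuffer(a);
        return b.reverse().toString();
    }
}
```

Both return "wolFrevOkcatS" for the input "StackOverFlow".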

The first way will create a very real, not at all a "dummy object" for the StringBuffer.
Unless there are other references to b below the last line of your code, the optimizer has enough information to let the environment garbage-collect b as soon as it's done with toString.
The fact that there is no variable for b does not make the object created by new less real. The compiler will probably optimize both snippets into identical bytecode, too.

StringBuffer b is not a dummy object; it is a reference, basically just a pointer that resides on the stack and is very small memory-wise. So not only does it make no difference in performance (GC has nothing to do with this example), but the Java compiler will probably remove it altogether (unless it's used in other places in the code).

In answer to your first question, yes, Java will create a StringBuffer object. It works pretty much the way you think it does.
To your second question, I'm pretty sure that the Java compiler will take care of that for you. The compiler is not without its faults but I think in a simple example like this it will optimize the byte code.
Just a tip though, in Java Strings are immutable. This means they cannot be changed. So when you assign a new value to a String Java will carve out a piece of memory, put the new String value in it, and redirect the variable to the new memory space. After that the garbage collector should come by and clear out the old string.
