Char array or String Builder - java

For character concatenation in Java, I'd like to know which of the methods below is better for readability, maintenance and performance: a 'char array' or a 'string builder'.
The method has to take the first letter from both the strings, append and return it.
Eg:
Input 1: ABC Input 2: DEF -> method should return AD.
using string builder:
private String getString(String str1, String str2) {
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.append(str1.charAt(0));
stringBuilder.append(str2.charAt(0));
return stringBuilder.toString();
}
using char array:
private String getString(String str1, String str2) {
char[] charArray = new char[2];
charArray[0] = str1.charAt(0);
charArray[1] = str2.charAt(0);
return String.valueOf(charArray);
}

StringBuilder is just a wrapper around a char[], adding functionality like resizing the array as necessary; and moving elements when you insert/delete etc.
It might be marginally faster to use the char[] directly for some things, but you'd lose (or have to reimplement) a lot of the useful functionality.

A char array is fine in terms of performance and readability, but code like this is harder to maintain. It can fail with a NullPointerException if either argument is null, so you need to add a null check to the char[] version.
On the other hand, StringBuilder uses a char[] internally anyway. So the char[] is the better choice here: it avoids creating the extra StringBuilder object, which is slightly better from a memory point of view.

If you review the source code for StringBuilder, you will find that internally it uses a char[] to represent the buffered string. So both versions of your code are doing very similar things. However, I would vote for using StringBuilder, because it offers an API which can do much more than the plain char[] which sits inside its implementation.
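A minimal sketch of both approaches with the input check mentioned in the answers above. It assumes that null or empty arguments should be rejected with an IllegalArgumentException; that policy and the class name FirstLetters are illustrative and not part of the question.

```java
final class FirstLetters {

    // Rejects null or empty input; this policy is an assumption for the sketch.
    private static void requireNonEmpty(String s) {
        if (s == null || s.isEmpty()) {
            throw new IllegalArgumentException("input must be a non-empty string");
        }
    }

    // StringBuilder version: an initial capacity of 2 avoids any internal resize.
    static String viaBuilder(String str1, String str2) {
        requireNonEmpty(str1);
        requireNonEmpty(str2);
        return new StringBuilder(2)
                .append(str1.charAt(0))
                .append(str2.charAt(0))
                .toString();
    }

    // char[] version: the String constructor copies the two-element array.
    static String viaCharArray(String str1, String str2) {
        requireNonEmpty(str1);
        requireNonEmpty(str2);
        return new String(new char[] { str1.charAt(0), str2.charAt(0) });
    }

    public static void main(String[] args) {
        System.out.println(viaBuilder("ABC", "DEF"));   // AD
        System.out.println(viaCharArray("ABC", "DEF")); // AD
    }
}
```

Both methods return the same result, so the choice is mostly about which one reads better to you.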

Java IF Statement necessary?

I have:
String str = "Hello, how, are, you";
I want to create a helper method that removes the commas from any string. Which of the following is more accurate?
private static String removeComma(String str){
if(str.contains(",")){
str = str.replaceAll(",","");
}
return str;
}
OR
private static String removeComma(String str){
str = str.replaceAll(",","");
return str;
}
Seems like I don't need the IF statement but there might be a case where I do.
If there is a better way let me know.
Both are functionally equivalent but the former is more verbose and will probably be slower because it runs an extra operation.
Also note that you don't need replaceAll (which accepts a regular expression): replace will do.
So I would go for:
private static String removeComma(String str){
return str.replace(",", "");
}
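The difference between replace and replaceAll only shows up when the search text contains regex metacharacters. A small sketch of that point (the class name ReplaceDemo and the sample strings are made up for illustration):

```java
final class ReplaceDemo {

    // Literal replacement: every "," is removed, no regex is involved.
    static String removeComma(String str) {
        return str.replace(",", "");
    }

    // Literal replacement of "." removes only the dots.
    static String removeDotsLiteral(String str) {
        return str.replace(".", "");
    }

    // replaceAll treats "." as a regex that matches any character
    // (except line terminators), so every character is removed.
    static String removeDotsRegex(String str) {
        return str.replaceAll(".", "");
    }

    public static void main(String[] args) {
        System.out.println(removeComma("Hello, how, are, you")); // Hello how are you
        System.out.println(removeDotsLiteral("a.b.c"));           // abc
        System.out.println(removeDotsRegex("a.b.c").isEmpty());   // true
    }
}
```

For a comma the two methods behave the same, which is why the simpler replace is enough here.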
The IF statement is unnecessary, unless you're handling "large" strings (we're talking megabytes or more).
If you're using the IF statement, your code will first search for the first occurrence of a comma, and then execute the replacement. This could be costly if the comma is near the end of the string and your string is large, since it will have to be traversed twice.
Without the IF statement, commas will be replaced if they exist. If there are none, your string will be left untouched.
Bottom line: use the version without the IF statement.
Both are correct, but the second one is cleaner since the IF statement of the first alternative is not needed.
It comes down to how likely it is that the strings in your input contain a comma.
If that probability is high, call replaceAll without checking first.
But unless you are working with extremely large strings, I guess you will see no difference in performance at all.
Just another solution with time complexity O(n), space complexity O(n):
public static String removeComma(String str){
int length = str.length();
StringBuffer sb = new StringBuffer();
for (int i = 0; i < length; i++) {
char c = str.charAt(i);
if (c != ',') {
sb.append(c);
}
}
return sb.toString();
}

How can I efficiently use StringBuilder?

In the past, I've always used printf to format printing to the console but the assignment I currently have (creating an invoice report) wants us to use StringBuilder, but I have no idea how to do so without simply using " " for every gap needed. For example... I'm supposed to print this out
Invoice Customer Salesperson Subtotal Fees Taxes Discount Total
INV001 Company Eccleston, Chris $ 2357.60 $ 40.00 $ 190.19 $ -282.91 $ 2304.88
But I don't know how to get everything to line up using the StringBuilder. Any advice?
StringBuilder aims to reduce the overhead associated with creating strings.
As you may or may not know, strings are immutable. What this means is that something like
String a = "foo";
String b = "bar";
String c = a + b;
String d = c + c;
creates a new string for each line. If all we are concerned about is the final string d, the line with string c is wasting space because it creates a new String object when we don't need it.
String builder simply delays actually building the String object until you call .toString(). At that point, it converts an internal char[] to an actual string.
Let's take another example.
String foo() {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 100; i++)
sb.append(i);
return sb.toString();
}
Here, we only create one string. StringBuilder will keep track of the chars you have added to your string in its internal char[] value. Note that value.length will generally be larger than the total chars you have added to your StringBuilder, but value might run out of room for what you're appending if the string you are building gets too big. When that happens, it'll resize, which just means replacing value with a larger char[], and copying over the old values to the new array, along with the chars of whatever you appended.
Finally, when you call sb.toString(), the StringBuilder will call a String constructor that takes an argument of a char[].
That means only one String object was created, and we only needed enough memory for our char[] and to resize it.
Compare with the following:
String foo() {
String toReturn = "";
for (int i = 0; i < 100; i++)
toReturn += "" + i;
return toReturn;
}
Here, we have 101 string objects created (maybe more, I'm unsure). We only needed one though! This means that at every call, we're disposing the original string toReturn represented, and creating another string.
With a large string, especially, this is very expensive, because at every call you need to first acquire as much memory as the new string needs, and dispose of as much memory as the old string had. It's not a big deal when things are kept short, but when you're working with entire files this can easily become a problem.
In a nutshell: if you're appending / removing information before finalizing an output, use a StringBuilder. If your strings are very short, I think it is OK to just concatenate normally for convenience, but it is up to you to define what "short" is.
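For the alignment part of the question, one option is to keep using format specifiers and append the formatted pieces to the StringBuilder. A minimal sketch, assuming made-up column widths, a reduced set of columns, and a hypothetical InvoiceReport class; the real assignment may require a different layout.

```java
import java.util.Locale;

final class InvoiceReport {

    // Column widths are assumptions chosen for this example.
    // "%-8s" left-aligns text in 8 columns, "%10.2f" right-aligns a number in 10.
    private static final String HEADER_FORMAT = "%-8s %-12s %-18s %11s%n";
    private static final String ROW_FORMAT = "%-8s %-12s %-18s $%10.2f%n";

    static String render() {
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(HEADER_FORMAT,
                "Invoice", "Customer", "Salesperson", "Total"));
        // Locale.ROOT keeps the decimal separator a '.' regardless of the default locale.
        sb.append(String.format(Locale.ROOT, ROW_FORMAT,
                "INV001", "Company", "Eccleston, Chris", 2304.88));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```

Because every field is padded to a fixed width before it is appended, the rows line up without counting spaces by hand.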

Best way to modify an existing string? StringBuilder or convert to char array and back to string?

I'm learning Java and am wondering what's the best way to modify strings here (both for performance and to learn the preferred method in Java). Assume you're looping through a string and checking each character/performing some action on that index in the string.
Do I use the StringBuilder class, or convert the string into a char array, make my modifications, and then convert the char array back to a string?
Example for StringBuilder:
StringBuilder newString = new StringBuilder(oldString);
for (int i = 0; i < oldString.length() ; i++) {
newString.setCharAt(i, 'X');
}
Example for char array conversion:
char[] newStringArray = oldString.toCharArray();
for (int i = 0; i < oldString.length() ; i++) {
newStringArray[i] = 'X';
}
String myString = String.valueOf(newStringArray);
What are the pros/cons to each different way?
I take it that StringBuilder is going to be more efficient since the converting to a char array makes copies of the array each time you update an index.
I say do whatever is most readable/maintainable until you know that String "modification" is slowing you down. To me, this is the most readable:
String s = "foo";
s += "bar";
s += "baz";
If that's too slow, I'd use a StringBuilder. You may want to compare this to StringBuffer. If performance matters and synchronization does not, StringBuilder should be faster. If synchronization is needed, then you should use StringBuffer.
Also it's important to know that these strings are not being modified. In java, Strings are immutable.
This is all context specific. If you optimize this code and it doesn't make a noticeable difference (and this is usually the case), then you just thought longer than you had to and you probably made your code more difficult to understand. Optimize when you need to, not because you can. And before you do that, make sure the code you're optimizing is the cause of your performance issue.
"What are the pros/cons to each different way? I take it that StringBuilder is going to be more efficient since the converting to a char array makes copies of the array each time you update an index."
As written, the code in your second example will create just two arrays: one when you call toCharArray(), and another when you call String.valueOf() (String stores data in a char[] array). The element manipulations you are performing should not trigger any object allocations. There are no copies being made of the array when you read or write an element.
If you are going to be doing any sort of String manipulation, the recommended practice is to use a StringBuilder. If you are writing very performance-sensitive code, and your transformation does not alter the length of the string, then it might be worthwhile to manipulate the array directly. But since you are learning Java as a new language, I am going to guess that you are not working in high frequency trading or any other environment where latency is critical. Therefore, you are probably better off using a StringBuilder.
If you are performing any transformations that might yield a string of a different length than the original, you should almost certainly use a StringBuilder; it will resize its internal buffer as necessary.
On a related note, if you are doing simple string concatenation (e.g, s = "a" + someObject + "c"), the compiler will actually transform those operations into a chain of StringBuilder.append() calls, so you are free to use whichever you find more aesthetically pleasing. I personally prefer the + operator. However, if you are building up a string across multiple statements, you should create a single StringBuilder.
For example:
public String toString() {
return "{field1 =" + this.field1 +
", field2 =" + this.field2 +
...
", field50 =" + this.field50 + "}";
}
Here, we have a single, long expression involving many concatenations. You don't need to worry about hand-optimizing this, because the compiler will use a single StringBuilder and just call append() on it repeatedly.
String s = ...;
if (someCondition) {
s += someValue;
}
s += additionalValue;
return s;
Here, you'll end up with two StringBuilders being created under the covers, but unless this is an extremely hot code path in a latency-critical application, it's really not worth fretting about. Given similar code, but with many more separate concatenations, it might be worth optimizing. Same goes if you know the strings might be very large. But don't just guess--measure! Demonstrate that there's a performance problem before you try to fix it. (Note: this is just a general rule for "micro optimizations"; there's rarely a downside to explicitly using a StringBuilder. But don't assume it will make a measurable difference: if you're concerned about it, you should actually measure.)
String s = "";
for (final Object item : items) {
s += item + "\n";
}
Here, we're performing a separate concatenation operation on each loop iteration, which means a new StringBuilder will be allocated on each pass. In this case, it's probably worth using a single StringBuilder since you may not know how large the collection will be. I would consider this an exception to the "prove there's a performance problem before optimizing rule": if the operation has the potential to explode in complexity based on input, err on the side of caution.
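The loop case above can be rewritten with one explicit StringBuilder. This is only a sketch; the class name JoinLines and the method name are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class JoinLines {

    // One StringBuilder is reused for the whole loop, so the work stays
    // proportional to the total output size instead of growing per iteration.
    static String joinWithNewlines(Iterable<?> items) {
        StringBuilder sb = new StringBuilder();
        for (Object item : items) {
            sb.append(item).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Object> items = Arrays.asList("a", 1, "c");
        System.out.print(joinWithNewlines(items));
    }
}
```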
Which option will perform the best is not an easy question.
I did a benchmark using Caliper:
Method          Runtime (ns)
array           88
builder         126
builderTillEnd  76
concat          3435
Benchmarked methods:
public static String array(String input)
{
char[] result = input.toCharArray(); // COPYING
for (int i = 0; i < input.length(); i++)
{
result[i] = 'X';
}
return String.valueOf(result); // COPYING
}
public static String builder(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result.toString(); // COPYING
}
public static StringBuilder builderTillEnd(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result;
}
public static String concat(String input)
{
String result = "";
for (int i = 0; i < input.length(); i++)
{
result += 'X'; // terrible COPYING, COPYING, COPYING... same as:
// result = new StringBuilder(result).append('X').toString();
}
return result;
}
Remarks
If we want to modify a String, we have to do at least 1 copy of that input String, because Strings in Java are immutable.
java.lang.StringBuilder extends java.lang.AbstractStringBuilder. StringBuilder.setCharAt() is inherited from AbstractStringBuilder and looks like this:
public void setCharAt(int index, char ch) {
if ((index < 0) || (index >= count))
throw new StringIndexOutOfBoundsException(index);
value[index] = ch;
}
AbstractStringBuilder internally uses the simplest char array: char value[]. So, result[i] = 'X' is very similar to result.setCharAt(i, 'X'), however the second will call a polymorphic method (which probably gets inlined by JVM) and check bounds in if, so it will be a bit slower.
Conclusions
If you can operate on StringBuilder until the end (you don't need String back) - do it. It's the preferred way and also the fastest. Simply the best.
If you want a String in the end and this is the bottleneck of your program, then you might consider using a char array. In the benchmark above, the char array version took about 30% less time than the StringBuilder version (88 ns vs. 126 ns). Be sure to properly measure the execution time of your program before and after the optimization, because there is no guarantee that this difference will carry over.
Never concatenate Strings in a loop with + or +=, unless you really know what you are doing. Usually it's better to use an explicit StringBuilder and append().
I'd prefer to use the StringBuilder class when the original string needs to be modified.
For String manipulation, I like the StringUtils class. You'll need to add the Apache Commons Lang dependency to use it.

Most efficient way to fill a String with a specified length with a specified character?

Basically given an int, I need to generate a String with the same length containing only the specified character. Related question here, but it relates to C# and it does matter what's in the String.
This question, and my answer to it are why I am asking this one. I'm not sure what's the best way to go about it performance wise.
Example
Method signature:
String getPattern(int length, char character);
Usage:
//returns "zzzzzz"
getPattern(6, 'z');
What I've tried
String getPattern(int length, char character) {
String result = "";
for (int i = 0; i < length; i++) {
result += character;
}
return result;
}
Is this the best that I can do performance-wise?
You should use StringBuilder instead of concatenating chars this way. Use StringBuilder.append().
StringBuilder will give you better performance. The problem with concatenation the way you are doing it is that each time a new String is created (String is immutable), the old string is copied, the new character is appended, and the old String is thrown away. It's a lot of extra work that over a period of time (like in a big for loop) will cause performance degradation.
StringUtils from commons-lang or Strings from guava are your friends. As already stated avoid String concatenations.
StringUtils.repeat("a", 3) // => "aaa"
Strings.repeat("hey", 3) // => "heyheyhey"
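If you are on Java 11 or later, the standard library has String.repeat(int), so no extra dependency is needed. A minimal sketch (the class name RepeatDemo is hypothetical, and the Java 11+ requirement is an assumption about your environment):

```java
final class RepeatDemo {

    // Requires Java 11+ for String.repeat(int).
    // A negative length makes repeat throw IllegalArgumentException.
    static String getPattern(int length, char character) {
        return String.valueOf(character).repeat(length);
    }

    public static void main(String[] args) {
        System.out.println(getPattern(6, 'z')); // zzzzzz
    }
}
```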
Use primitive char arrays & some standard util classes like Arrays
import java.util.Arrays;
public class Test {
static String getPattern(int length, char character) {
char[] cArray = new char[length];
Arrays.fill(cArray, character);
// return Arrays.toString(cArray);
return new String(cArray);
}
static String buildPattern(int length, char character) {
StringBuilder sb= new StringBuilder(length);
for (int i = 0; i < length; i++) {
sb.append(character);
}
return sb.toString();
}
public static void main(String args[]){
long time = System.currentTimeMillis();
getPattern(10000000,'c');
time = System.currentTimeMillis() - time;
System.out.println(time); //prints 93
time = System.currentTimeMillis();
buildPattern(10000000,'c');
time = System.currentTimeMillis() - time;
System.out.println(time); //prints 188
}
}
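EDIT: Arrays.toString() gave lower performance since it builds its result with a StringBuilder (and it returns a bracketed, comma-separated representation such as [c, c, c] rather than the plain repeated characters), but new String(cArray) did the magic.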
Yikes, no.
A String is immutable in java; you can't change it. When you say:
result += character;
You're creating a new String every time.
You want to use a StringBuilder and append to it, then return a String with its toString() method.
I think it would be more efficient to do it like following,
String getPattern(int length, char character)
{
char[] list = new char[length];
for(int i =0;i<length;i++)
{
list[i] = character;
}
return new String(list);
}
Concatenating Strings is never the most efficient option, since String is immutable. For better performance you should use StringBuilder and append():
String getPattern(int length, char character) {
StringBuilder sb = new StringBuilder(length);
for (int i = 0; i < length; i++) {
sb.append(character);
}
return sb.toString();
}
Performance-wise, I think you'd have better results creating a small String and concatenating (using StringBuilder of course) until you reach the requested size: concatenating/appending "zzz" to "zzz" probably performs better than concatenating 'z' three times (well, maybe not for such small numbers, but when you reach 100 or so chars, doing ten concatenations of 'z' followed by ten concatenations of "zzzzzzzzzz" is probably better than 100 concatenations of 'z').
Also, because you ask about GWT, results will vary a lot between DevMode (pure Java) and "production mode" (running in JS in the browser), and are likely to vary depending on the browser.
The only way to really know is to benchmark, everything else is pure speculation.
And possibly use deferred binding to use the most performing variant in each browser (that's exactly how StringBuilder is emulated in GWT).
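The chunked idea from this answer could look roughly like the sketch below in plain Java. The chunk size of 10 and the class name ChunkedPattern are arbitrary assumptions, and whether this beats the simpler versions is exactly what a benchmark would have to show.

```java
import java.util.Arrays;

final class ChunkedPattern {

    // Arbitrary chunk size picked for illustration only.
    private static final int CHUNK_SIZE = 10;

    static String getPattern(int length, char character) {
        if (length < 0) {
            throw new IllegalArgumentException("length must be >= 0");
        }
        // Build one small block of the character up front.
        char[] chunk = new char[Math.min(CHUNK_SIZE, length)];
        Arrays.fill(chunk, character);

        StringBuilder sb = new StringBuilder(length);
        int remaining = length;
        // Append whole chunks while at least one full chunk still fits.
        while (chunk.length > 0 && remaining >= chunk.length) {
            sb.append(chunk);
            remaining -= chunk.length;
        }
        // Append the leftover characters (fewer than one chunk, possibly zero).
        sb.append(chunk, 0, remaining);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(getPattern(6, 'z')); // zzzzzz
    }
}
```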

How to Reassign value of StringBuffer?

How can we reassign the value of a StringBuffer or StringBuilder variable?
StringBuffer sb=new StringBuffer("teststr");
Now I have to change the value of sb to "testString" without emptying the contents.
I am looking for a method that can do this assignment directly without a separate memory allocation. I think we can do it only after emptying the contents.
sb.setLength(0);
sb.append("testString");
It should first be mentioned that StringBuilder is generally preferred to StringBuffer. From StringBuffer's own API:
As of release JDK 5, this class has been supplemented with an equivalent class designed for use by a single thread, StringBuilder. The StringBuilder class should generally be used in preference to this one, as it supports all of the same operations but it is faster, as it performs no synchronization.
That said, I will stick to StringBuffer for the rest of the answer because that's what you're asking about; everything that StringBuffer does, StringBuilder does too, except synchronization, which is generally unneeded. So unless you're using the buffer in multiple threads, switching to StringBuilder is a simple task.
The question
StringBuffer sb = new StringBuffer("teststr");
"Now i have to change the value of sb to "testString" without emptying the contents"
So you want sb to have the String value "testString" in its buffer? There are many ways to do this, and I will list some of them to illustrate how to use the API.
The optimal solution: it performs the minimum edit from "teststr" to "testString". It's impossible to do it any faster than this.
StringBuffer sb = new StringBuffer("teststr");
sb.setCharAt(4, 'S');
sb.append("ing");
assert sb.toString().equals("testString");
This needlessly overwrites "tr" with "tr".
StringBuffer sb = new StringBuffer("teststr");
sb.replace(4, sb.length(), "String");
assert sb.toString().equals("testString");
This involves shifts due to deleteCharAt and insert.
StringBuffer sb = new StringBuffer("teststr");
sb.deleteCharAt(4);
sb.insert(4, 'S');
sb.append("ing");
assert sb.toString().equals("testString");
This is a bit different now: it doesn't magically know that it has "teststr" that it needs to edit to "testString"; it assumes only that the StringBuffer contains at least one occurrence of "str" somewhere, and that it needs to be replaced by "String".
StringBuffer sb = new StringBuffer("strtest");
int idx = sb.indexOf("str");
sb.replace(idx, idx + 3, "String");
assert sb.toString().equals("Stringtest");
Let's say now that you want to replace ALL occurrences of "str" and replace it with "String". A StringBuffer doesn't have this functionality built-in. You can try to do it yourself in the most efficient way possible, either in-place (probably with a 2-pass algorithm) or using a second StringBuffer, etc.
But instead I will use the replace(CharSequence, CharSequence) from String. This will be more than good enough in most cases, and is definitely a lot more clear and easier to maintain. It's linear in the length of the input string, so it's asymptotically optimal.
String before = "str1str2str3";
String after = before.replace("str", "String");
assert after.equals("String1String2String3");
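For completeness, a do-it-yourself in-place version of "replace all" on a StringBuffer could be sketched as below. It is the simple indexOf loop, not the most efficient algorithm mentioned above, because each replace may shift the tail of the buffer. The class and method names are hypothetical.

```java
final class ReplaceAllInPlace {

    // Replaces every occurrence of target in sb, scanning left to right.
    // Text inserted by a replacement is never scanned again.
    static void replaceAll(StringBuffer sb, String target, String replacement) {
        if (target.isEmpty()) {
            throw new IllegalArgumentException("target must not be empty");
        }
        int idx = sb.indexOf(target);
        while (idx != -1) {
            sb.replace(idx, idx + target.length(), replacement);
            // Continue searching after the text that was just inserted.
            idx = sb.indexOf(target, idx + replacement.length());
        }
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("str1str2str3");
        replaceAll(sb, "str", "String");
        System.out.println(sb); // String1String2String3
    }
}
```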
Discussions
"I am looking for the method to assign value later by using previous memory location"
The exact memory location shouldn't really be a concern for you; in fact, both StringBuilder and StringBuffer will reallocate its internal buffer to different memory locations whenever necessary. The only way to prevent that would be to ensureCapacity (or set it through the constructor) so that its internal buffer will always be big enough and it would never need to be reallocated.
However, even if StringBuffer does reallocate its internal buffer once in a while, it should not be a problem in most cases. Most data structures that grow dynamically (ArrayList, HashMap, etc.) do so in a way that preserves algorithmically optimal operations, taking advantage of cost amortization. I will not go through the amortized analysis here, but unless you're building real-time systems etc., this shouldn't be a problem for most applications.
Obviously I'm not aware of the specifics of your need, but there is a fear of premature optimization since you seem to be worrying about things that most people have the luxury of never having to worry about.
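To make the capacity point concrete, here is a small sketch that sets the capacity through the constructor and then through ensureCapacity. The sizes 64 and 1000 and the class name CapacityDemo are arbitrary choices for illustration.

```java
final class CapacityDemo {

    // Returns the capacity at three points: after construction,
    // after a small append, and after ensureCapacity.
    static int[] capacities() {
        StringBuilder sb = new StringBuilder(64); // room for 64 chars up front
        int initial = sb.capacity();

        sb.append("testString");                  // fits, so the buffer is not reallocated
        int afterAppend = sb.capacity();

        sb.ensureCapacity(1000);                  // grows the buffer to at least 1000
        int afterEnsure = sb.capacity();

        return new int[] { initial, afterAppend, afterEnsure };
    }

    public static void main(String[] args) {
        for (int capacity : capacities()) {
            System.out.println(capacity);
        }
    }
}
```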
What do you mean by "reassign"? You can empty the contents by using setLength() and then start appending new content, if that's what you mean.
Edit: For changing parts of the content, you can use replace().
Generally, this kind of question can be easily answered by looking at the API doc of the classes in question.
You can use a StringBuilder in place of a StringBuffer, which is typically what people do if they can (StringBuilder isn't synchronized so it is faster but not threadsafe). If you need to initialize the contents of one with the other, use the toString() method to get the string representation. To recycle an existing StringBuilder or StringBuffer, simply call setLength(0).
Edit
You can overwrite a range of elements with the replace() function. To change the entire value to newval, you would use buffer.replace(0,buffer.length(),newval). See also:
StringBuilder
StringBuffer
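A short sketch of the two ways to swap the whole content, written so you can see that the same buffer object is reused in both cases. The class and method names are hypothetical.

```java
final class ReassignDemo {

    // Overwrites the entire content in one call; end index is exclusive,
    // so buffer.length() covers the last character too.
    static void replaceContent(StringBuffer buffer, String newval) {
        buffer.replace(0, buffer.length(), newval);
    }

    // Empties the existing buffer in place, then appends the new value.
    static void resetAndAppend(StringBuffer buffer, String newval) {
        buffer.setLength(0);
        buffer.append(newval);
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("teststr");
        replaceContent(sb, "testString");
        System.out.println(sb); // testString
        resetAndAppend(sb, "newstr");
        System.out.println(sb); // newstr
    }
}
```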
You might be looking for the replace() method of the StringBuffer:
StringBuffer sb=new StringBuffer("teststr");
sb.replace(0, sb.length(), "newstr");
Internally, it removes the original string, then inserts the new string, but it may save you a step from this:
StringBuffer sb=new StringBuffer("teststr");
sb.delete(0, sb.length());
sb.append("newstr");
Using setLength(0) truncates the existing StringBuffer to zero length (it does not create a new object), so the old contents are discarded, which, I guess, is not what you want:
StringBuffer sb=new StringBuffer("teststr");
// Empty the existing buffer in place; sb still refers to the same object
sb.setLength(0);
sb.append("newstr");
Indeed, I think replace() is the best way. I checked the Java-Source code. It really overwrites the old characters.
Here is the source code from replace():
public AbstractStringBuffer replace(int start, int end, String str)
{
if (start < 0 || start > count || start > end)
throw new StringIndexOutOfBoundsException(start);
int len = str.count;
// Calculate the difference in 'count' after the replace.
int delta = len - (end > count ? count : end) + start;
ensureCapacity_unsynchronized(count + delta);
if (delta != 0 && end < count)
VMSystem.arraycopy(value, end, value, end + delta, count - end);
str.getChars(0, len, value, start);
count += delta;
return this;
}
Changing entire value of StringBuffer:
StringBuffer sb = new StringBuffer("word");
sb.setLength(0); // setting its length to 0 for making the object empty
sb.append("text");
This is how you can change the entire value of StringBuffer.
You can convert to/from a String, as follows:
StringBuffer buf = new StringBuffer();
buf.append("s1");
buf.append("s2");
StringBuilder sb = new StringBuilder(buf.toString());
// Now sb contains "s1s2" and you can further append to it
