When working with a StringBuilder, I often append two char values by calling StringBuilder#append(char) twice, rather than calling StringBuilder#append(String) once.
I.e.:
StringBuilder builder = new StringBuilder();
builder.append(' ').append('t'); // would append(" t") work better here?
return builder.toString();
I would like to know:
Which approach is better performance-wise
Which approach is more common and why
I have already read through Using character instead of String for single-character values in StringBuffer append but it does not answer my question.
That question pertains to whether appending a single character (append('c')) is better than a single-character string (append("c")). I already understand why appending a single character is better than a single-character string, but I do not know whether appending a two-character string (append("ab")) is better than twice appending each character (append('a').append('b')).
In my testing, both took about the same time, although appending a string might be slightly slower (by maybe 10 or so nanoseconds).
However, appending a string is much more popular, as it's easier to use and understand.
This is an interesting one to figure out.
As we all know, arrays are fast, and internally String uses a character array to store its value.
Internally, both methods call the super.append(XXX) method of the AbstractStringBuilder class.
Here is the code of append in AbstractStringBuilder for String and CharSequence:
public AbstractStringBuilder append(String str) {
if (str == null) str = "null";
int len = str.length();
ensureCapacityInternal(count + len);
str.getChars(0, len, value, count);
count += len;
return this;
}
public AbstractStringBuilder append(CharSequence s, int start, int end) {
if (s == null)
s = "null";
if ((start < 0) || (start > end) || (end > s.length()))
throw new IndexOutOfBoundsException(
"start " + start + ", end " + end + ", s.length() "
+ s.length());
int len = end - start;
ensureCapacityInternal(count + len);
for (int i = start, j = count; i < end; i++, j++)
value[j] = s.charAt(i);
count += len;
return this;
}
These are the methods called internally when you call append.
Both methods call ensureCapacityInternal to expand the array if needed, so we can set that call aside.
The main difference comes in the next line of code.
The method with the String argument calls getChars, which internally calls System.arraycopy. That is a native method, so its cost cannot be read off the Java source; it depends on the OS/JVM.
The CharSequence method uses a for loop over the length of the input CharSequence:
for (int i = start, j = count; i < end; i++, j++)
value[j] = s.charAt(i);
I.e. its complexity depends on the length of the input.
Other posts I have read about System.arraycopy all say that it is more efficient than copying an array with a loop, and so does the book Effective Java.
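To make the comparison concrete, here is a minimal sketch of the two copying styles side by side. The class and method names (ArrayCopyDemo, copyWithLoop, copyWithArraycopy) are made up for illustration and are not part of the JDK; the sketch only shows that both produce the same buffer contents and says nothing about which is faster on a given JVM.

```java
import java.util.Arrays;

class ArrayCopyDemo {
    // Copies src[0..len) into dst starting at dstPos with an explicit loop,
    // similar in spirit to the CharSequence overload quoted above.
    static void copyWithLoop(char[] src, char[] dst, int dstPos, int len) {
        for (int i = 0, j = dstPos; i < len; i++, j++) {
            dst[j] = src[i];
        }
    }

    // The same copy expressed as a single bulk call.
    static void copyWithArraycopy(char[] src, char[] dst, int dstPos, int len) {
        System.arraycopy(src, 0, dst, dstPos, len);
    }

    public static void main(String[] args) {
        char[] src = "hello".toCharArray();
        char[] a = new char[8];
        char[] b = new char[8];
        copyWithLoop(src, a, 2, src.length);
        copyWithArraycopy(src, b, 2, src.length);
        // Both buffers now hold the same contents.
        System.out.println(Arrays.equals(a, b));
    }
}
```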
So, in my opinion: if the input is short, use the CharSequence version. Why burden the JVM for a short String?
If you have a long string, such as a sentence, go for the method with the String argument. Also remember that space usage increases in this case, i.e. String is immutable and you are creating more Strings every time. String.valueOf() and (String) obj are examples.
Edited:
public AbstractStringBuilder append(char c) { ensureCapacityInternal(count + 1); value[count++] = c; return this; }
This method is used when the argument is a char.
It appears to be faster than the others,
because it only assigns the char at index count++. The only System.arraycopy involved is the one inside ensureCapacityInternal, which is common to all of the other methods.
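For reference, the two styles from the question produce the same result, so the choice is about speed and readability only. A small sketch (AppendDemo and its method names are just illustrative):

```java
class AppendDemo {
    // Two single-char appends, each taking the append(char) path.
    static String twoChars() {
        StringBuilder builder = new StringBuilder();
        builder.append(' ').append('t');
        return builder.toString();
    }

    // One two-character String append, taking the append(String) path.
    static String oneString() {
        StringBuilder builder = new StringBuilder();
        builder.append(" t");
        return builder.toString();
    }

    public static void main(String[] args) {
        // The resulting strings are equal either way.
        System.out.println(twoChars().equals(oneString()));
    }
}
```

If you want to measure the difference yourself, a proper benchmark harness is a safer bet than hand-rolled System.nanoTime() loops, since differences of a few nanoseconds are easily lost in JIT warm-up noise.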
Hope this will help. :)
Let's say there is a string like " world ". This String only has a blank at the front and the end. Is trim() faster than replace()?
I used replace() once, and my mentor said not to use it since trim() is probably faster.
If not, what's the advantage of trim() over replace()?
If we look at the source code for the methods:
replace():
public String replace(CharSequence target, CharSequence replacement) {
String tgtStr = target.toString();
String replStr = replacement.toString();
int j = indexOf(tgtStr);
if (j < 0) {
return this;
}
int tgtLen = tgtStr.length();
int tgtLen1 = Math.max(tgtLen, 1);
int thisLen = length();
int newLenHint = thisLen - tgtLen + replStr.length();
if (newLenHint < 0) {
throw new OutOfMemoryError();
}
StringBuilder sb = new StringBuilder(newLenHint);
int i = 0;
do {
sb.append(this, i, j).append(replStr);
i = j + tgtLen;
} while (j < thisLen && (j = indexOf(tgtStr, j + tgtLen1)) > 0);
return sb.append(this, i, thisLen).toString();
}
Vs trim():
public String trim() {
int len = value.length;
int st = 0;
char[] val = value; /* avoid getfield opcode */
while ((st < len) && (val[st] <= ' ')) {
st++;
}
while ((st < len) && (val[len - 1] <= ' ')) {
len--;
}
return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
}
As you can see replace() calls multiple other methods and iterates throughout the entire String, while trim() simply iterates over the beginning and ending of the String until the character isn't a white space. So in the single respect of trying to only remove white space before and after a word, trim() is more efficient.
We can run some benchmarks on this:
public static void main(String[] args) {
long testStartTime = System.nanoTime();
trimTest();
long trimTestTime = System.nanoTime() - testStartTime;
testStartTime = System.nanoTime();
replaceTest();
long replaceTime = System.nanoTime() - testStartTime;
System.out.println("Time for trim(): " + trimTestTime);
System.out.println("Time for replace(): " + replaceTime);
}
public static void trimTest() {
for(int i = 0; i < 1000000; i ++) {
new String(" string ").trim();
}
}
public static void replaceTest() {
for(int i = 0; i < 1000000; i ++) {
new String(" string ").replace(" ", "");
}
}
Output:
Time for trim(): 53303903
Time for replace(): 485536597
//432,232,694 difference
Assuming that the people writing the Java library code are doing a good job [1], you can assume that a special purpose method (like trim()) will be as fast as, and probably faster than, a general purpose method (like replace(...)) doing the same thing.
Two reasons:
If the special purpose method is slower, its implementation can be rewritten as equivalent calls to the general purpose one, making the performance equivalent in most cases. A competent programmer will do this because it reduces maintenance costs.
In the special purpose method, it is likely that there will be optimizations that can be made that don't apply in the general-purpose case.
In this case we know that trim() only needs to look at the start and end of the string ... whereas replace(...) needs to look at all of the characters in the string. (We can infer this from the description of what the respective methods do.)
If we assume "competence" then we can infer that the developers will have done the analysis and not implemented trim() sub-optimally [2]; i.e. they won't code trim() to examine all characters.
There is another reason to use the special purpose method over the general purpose. It makes your code simpler, easier to read, and easier to inspect for correctness. This may well be more important than performance.
This clearly applies in the case of trim() versus replace(...).
1 - We can in this case. There are lots of eyes looking at this code, and lots of people who will complain loudly about egregious performance issues.
2 - Unfortunately, it is not always as straightforward as this. A library method needs to be optimized for "typical" behavior, but it also needs to avoid pathological performance in edge-cases. It is not always possible to achieve both things.
trim() is definitely faster to type, yes. It doesn't take any parameters.
It is also much faster to understand what you were trying to do. You were trying to trim the string, rather than replacing all the spaces it contains with the empty string, knowing from other context that there is only space at the beginning and the end of the string.
Indeed much faster no matter how you look at it. Don't complicate the life of the persons who're trying to read your code. Most of the time, it will be you months later, or at least someone you don't hate.
Trim will prune the outer characters until it reaches non-whitespace. I believe it trims spaces, tabs, and newlines.
Replace will scan the entire string (so it could be a sentence) and would replace inner " " with "", essentially compressing the words together.
They have different use cases, though: obviously, one is to clean up user input, whereas the other is to update a string wherever matches are found.
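A quick sketch of that difference in behavior (TrimVsReplaceDemo and its method names are only for illustration): on a single word the two give the same result, but on a sentence replace() also eats the inner spaces.

```java
class TrimVsReplaceDemo {
    // Strips leading and trailing whitespace only.
    static String trimmed(String s) {
        return s.trim();
    }

    // Removes every single-space occurrence, including inner ones.
    static String spacesRemoved(String s) {
        return s.replace(" ", "");
    }

    public static void main(String[] args) {
        String word = " world ";
        String sentence = " hello world ";
        System.out.println("[" + trimmed(word) + "] [" + spacesRemoved(word) + "]");
        System.out.println("[" + trimmed(sentence) + "] [" + spacesRemoved(sentence) + "]");
    }
}
```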
That being said, run times: replace will run in O(N) time, as it looks for all matching characters. Trim is also O(N) in the worst case, but will most likely only touch a few characters at each end.
I think the idea behind trim came from people who would type input but accidentally press space before submitting their forms, essentially trying to save the field "Foo " instead of "Foo".
s.trim() shortens a String s. This means no characters have to be moved from one index to another. It starts at the first character (s.toCharArray()[0]) of the String and shortens the String character by character until the first non-whitespace character occurs. It works the same way to shorten the String at the end. So it narrows the String from both ends. If a String has no leading or trailing whitespace, trim is done after checking the first and the last character.
In the case of " world ".trim(), two steps are needed: one to remove the leading whitespace at the first index, and a second to remove the trailing whitespace at the last index.
" world ".replace(" ", "") will need at least n = " world ".length() steps. It has to check every character to see whether it has to be replaced. And if we take into account that the implementation of String.replace(...) needs to compile a Pattern, build a Matcher, and then replace all the matched regions, it seems far more complex than shortening a String.
We also have to consider that " world ".replace(" ", "") does not replace whitespace in general but only the String " ". Since String replace(CharSequence target, CharSequence replacement) compiles the target using Pattern.LITERAL, we cannot use the character class \s. To be more accurate, we would have to compare " world ".trim() to " world ".replaceAll("\\s", ""). It is still not the same, because whitespace in String trim() is defined as c <= ' ' for each c in s.toCharArray().
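The last point can be seen with a control character such as U+0001, which trim() treats as whitespace (it is <= ' ') but the regex class \s does not match. A small sketch, with WhitespaceDefinitionDemo being a made-up name:

```java
class WhitespaceDefinitionDemo {
    // trim() strips leading/trailing chars whose code is <= ' ' (U+0020).
    static String viaTrim(String s) {
        return s.trim();
    }

    // \s matches space, tab, newline, vertical tab, form feed and carriage
    // return, anywhere in the string, but not other control characters.
    static String viaRegex(String s) {
        return s.replaceAll("\\s", "");
    }

    public static void main(String[] args) {
        String input = "\u0001 a b \u0001";
        // trim() removes the control chars and outer spaces, keeps the inner space.
        System.out.println(viaTrim(input).length());
        // replaceAll removes all three spaces but keeps both control chars.
        System.out.println(viaRegex(input).length());
    }
}
```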
Summarizing: String.trim() should be faster - especially for long strings
The description how the methods work is based on the implementation of String in Java 8. But implementations can change.
But the question should be: what do you intend to do with the string? Do you want to trim it or to replace some characters? Use the corresponding method accordingly.
Why is StringBuilder much faster than string concatenation using the + operator, even though the + operator is internally implemented using either StringBuffer or StringBuilder?
public void shortConcatenation(){
long startTime = System.currentTimeMillis();
while (System.currentTimeMillis() - startTime <= 1000){
character += "Y";
}
System.out.println("short: " + character.length());
}
//// using String builder
public void shortConcatenation2(){
long startTime = System.currentTimeMillis();
StringBuilder sb = new StringBuilder();
while (System.currentTimeMillis() - startTime <= 1000){
sb.append("Y");
}
System.out.println("string builder short: " + sb.length());
}
I know that there are a lot of similar questions posted here, but these don't really answer my question.
Do you understand how it works internally?
Every time you do stringA += stringB; a new string is created and assigned to stringA, so it will consume memory (a new string instance!) and time (copying the old string plus the new characters of the other string).
StringBuilder will use an array of characters internally and when you use the .append() method it will do several things:
check whether there is any free space for the string to append
do some more internal checks and run System.arraycopy to copy the characters of the string into the array.
Personally, I think the allocation of a new string every time (creating a new instance of string, put the string, etc.) could be very expensive in terms of memory and speed (in while/for, etc. especially).
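To illustrate the two steps listed above, here is a deliberately simplified builder. TinyBuilder is a hypothetical class written for this answer; its initial size and doubling policy are assumptions and not the JDK's actual implementation.

```java
import java.util.Arrays;

class TinyBuilder {
    private char[] value = new char[4]; // deliberately small initial buffer
    private int count = 0;              // number of chars currently in use

    TinyBuilder append(String str) {
        int len = str.length();
        ensureCapacity(count + len);        // step 1: make sure there is room
        str.getChars(0, len, value, count); // step 2: bulk-copy the new chars
        count += len;
        return this;
    }

    // Grows the buffer only when needed; existing chars are copied once per growth.
    private void ensureCapacity(int minCapacity) {
        if (minCapacity > value.length) {
            int newCapacity = Math.max(minCapacity, value.length * 2);
            value = Arrays.copyOf(value, newCapacity);
        }
    }

    int capacity() {
        return value.length;
    }

    @Override
    public String toString() {
        return new String(value, 0, count);
    }

    public static void main(String[] args) {
        TinyBuilder tb = new TinyBuilder().append("ab").append("cd").append("e");
        System.out.println(tb + " (capacity " + tb.capacity() + ")");
    }
}
```

The point of the design is that most appends just write into spare room, whereas += on a String has to copy everything accumulated so far on every single call.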
In your example, using a StringBuilder is better, but if you need something simple like a .toString(), for example,
public String toString() {
return StringA + " - " + StringB;
}
it makes no difference (well, in this case it is better to avoid the StringBuilder overhead, which is useless here).
Strings in Java are immutable. This means that methods that operate on strings cannot ever change the value of a string. String concatenation using += works by allocating memory for an entirely new string that is the concatenation of the two previous ones, and replacing the reference with this new string. Each new concatenation requires the construction of an entirely new String object.
In contrast, the StringBuilder and StringBuffer classes are implemented as a mutable sequence of characters. This means that as you append new Strings or characters onto a StringBuilder, it simply updates its internal array to reflect the changes you've made. This means that new memory is only allocated when the string grows past the buffer already existing in a StringBuilder.
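A small sketch of that difference (ImmutabilityDemo is just an illustrative name): after +=, the variable points at a different object and the old String is untouched, whereas append() hands back the very same builder.

```java
class ImmutabilityDemo {
    // += leaves the original String unchanged and rebinds the variable
    // to a newly built String.
    static boolean concatCreatesNewObject() {
        String a = "x";
        String before = a;
        a += "y";
        return a != before && before.equals("x") && a.equals("xy");
    }

    // append() mutates the builder in place and returns the same instance.
    static boolean appendReusesBuilder() {
        StringBuilder sb = new StringBuilder("x");
        StringBuilder returned = sb.append("y");
        return returned == sb && sb.toString().equals("xy");
    }

    public static void main(String[] args) {
        System.out.println(concatCreatesNewObject());
        System.out.println(appendReusesBuilder());
    }
}
```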
I can list a very nice example for understanding the same (I mean I felt it's a nice example). Check the code here taken from a LeetCode problem: https://leetcode.com/problems/remove-outermost-parentheses/
1: Using String
public String removeOuterParentheses(String S) {
String a = "";
int num = 0;
for(int i=0; i < S.length()-1; i++) {
if(S.charAt(i) == '(' && num++ > 0) {
a += "(";
}
if(S.charAt(i) == ')' && num-- > 1) {
a += ")";
}
}
return a;
}
And now, using StringBuilder.
public String removeOuterParentheses(String S) {
StringBuilder sb = new StringBuilder();
int a = 0;
for(char ch : S.toCharArray()) {
if(ch == '(' && a++ > 0) sb.append('(');
if(ch == ')' && a-- > 1) sb.append(')');
}
return sb.toString();
}
The performance of the two varies by a huge margin. The first submission uses String, while the second uses StringBuilder.
As explained above, the theory is the same. String is immutable, i.e. its state cannot be changed. The first version is therefore expensive, because a new String is allocated whenever a concatenation function or "+" is used. It consumes a lot of heap and is slower as a result. In comparison, StringBuilder is mutable: it only appends to its buffer and does not put the same load on memory.
I have been working on Project Euler problem 4. I am new to Java, and believe I have found the answer (906609 = 993 * 913, by using Excel!).
When I print the lines that are commented out, I can see that my string manipulations have worked. I've researched a few ways to compare strings in case I had not understood something, but this routine doesn't give me a result.
Please help me identify why it is not printing the answer.
James
public class pall{
public static void main(String[] args){
int i;
int j;
long k;
String stringProd;
for(i=994;i>992; i--){
for (j=914;j>912; j--){
k=(i*j);
stringProd=String.valueOf(k);
int len=stringProd.length();
char[] forwards=new char[len];
char[] back = new char[len];
for(int l=0; l<len; l++){
forwards[l]=stringProd.charAt(l);
}
for(int m=0; m<len;m++){
back[m]=forwards[len-1-m];
}
//System.out.println(forwards);
//System.out.println(back);
if(forwards.toString().equals(back.toString())){
System.out.println(k);}
}
}
}
}
You are comparing the string representation of your array. toString() doesn't give you what you think. For example, the below code makes it clear:
char[] arr1 = {'a', 'b'};
char[] arr2 = {'a', 'b'};
System.out.println(arr1.toString() + " : " + arr2.toString());
this code prints:
[C@16f0472 : [C@18d107f
So, the string representations of the two arrays are different, even though the contents are equal. This is because arrays don't override the toString() method; they inherit Object#toString().
The toString method for class Object returns a string consisting of
the name of the class of which the object is an instance, the at-sign
character @, and the unsigned hexadecimal representation of the hash
code of the object. In other words, this method returns a string equal
to the value of:
getClass().getName() + '@' + Integer.toHexString(hashCode())
So, in the above output, [C is the output of char[].class.getName(), and 18d107f is the hashcode.
You also can't compare the arrays using forwards.equals(back), as arrays in Java don't override equals() or hashCode() either. Any options? Yes, for comparing arrays you can use the Arrays#equals(char[], char[]) method:
if (Arrays.equals(forwards, back)) {
System.out.println(k);
}
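Putting the comparison options next to each other, as a self-contained sketch (CharArrayCompareDemo and its method names are made up for illustration):

```java
import java.util.Arrays;

class CharArrayCompareDemo {
    // Arrays inherit Object#equals, so this is an identity check.
    static boolean sameByObjectEquals(char[] a, char[] b) {
        return a.equals(b);
    }

    // Compares length and contents element by element.
    static boolean sameByContents(char[] a, char[] b) {
        return Arrays.equals(a, b);
    }

    // Builds real Strings from the contents and compares those.
    static boolean sameAsStrings(char[] a, char[] b) {
        return new String(a).equals(new String(b));
    }

    public static void main(String[] args) {
        char[] forwards = {'9', '0', '6', '6', '0', '9'};
        char[] back = {'9', '0', '6', '6', '0', '9'};
        System.out.println(sameByObjectEquals(forwards, back)); // false: two distinct array objects
        System.out.println(sameByContents(forwards, back));     // true
        System.out.println(sameAsStrings(forwards, back));      // true
    }
}
```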
Also, to get your char arrays, you don't need those loops. You can use the String#toCharArray() method. And to get the reverse of the String, you can wrap the string in a StringBuilder instance and use its reverse() method:
char[] forwards = stringProd.toCharArray();
char[] back = new StringBuilder(stringProd).reverse().toString().toCharArray();
And now that you have found an easy way to reverse a string, how about using the String#equals() method directly and skipping those character arrays altogether?
String stringPod = String.valueOf(k);
String reverseStringPod = new StringBuilder(stringPod).reverse().toString();
if (stringPod.equals(reverseStringPod)) {
System.out.println(k);
}
Finally, since this is about Project Euler, which is mostly about speed and mathematics, you should consider avoiding String utilities and doing it with plain division and modulus arithmetic: extract the individual digits from the beginning and the end, and compare them.
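One possible arithmetic-only variant, as a sketch: instead of comparing digits pairwise, it rebuilds the number with its digits reversed and compares the two values. DigitPalindromeDemo is a hypothetical name, and the code assumes inputs small enough that the reversed value cannot overflow a long (true for products of two three-digit numbers).

```java
class DigitPalindromeDemo {
    // Reverses the decimal digits with % and / and compares with the original.
    // Assumes n is small enough that the reversed value fits in a long.
    static boolean isPalindrome(long n) {
        if (n < 0) {
            return false; // treat negative numbers as non-palindromes
        }
        long original = n;
        long reversed = 0;
        while (n > 0) {
            reversed = reversed * 10 + n % 10; // append the lowest digit
            n /= 10;                           // drop the lowest digit
        }
        return reversed == original;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome(993L * 913L)); // 906609
        System.out.println(isPalindrome(994L * 913L));
    }
}
```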
To convert a string to char[] use
char[] forward = stringProd.toCharArray();
To convert a char[] to String, use String(char[]) constructor:
String backStr = new String(back); // Not the same as back.toString()
However, this is not the most performant solution, for several reasons:
You do not need to construct a back array to check if a string is a palindrome - you can walk the string from both ends, comparing the characters as you go, until you either find a difference or your indexes meet in the middle.
Rather than constructing a new array in a loop, you could reuse the same array - in case you do want to continue with an array, you could allocate it once for the maximum length of the product k, and use it in all iterations of your loop.
You do not need to convert a number to string in order to check if it is a palindrome - you can get its digits by repeatedly taking the remainder of division by ten, and then dividing by ten to go to the next digit.
Here is an illustration of the last point:
boolean isPalindrome(int n) {
int[] digits = new int[10];
if (n < 0) n = -n;
int len = 0;
while (n != 0) {
digits[len++] = n % 10;
n /= 10;
}
// Start two indexes from the opposite sides
int left = 0, right = len-1;
// Loop until they meet in the middle
while (left < right) {
if (digits[left++] != digits[right--]) {
return false;
}
}
return true;
}
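The first point in the list above (walking the string from both ends, without building a reversed copy) could look roughly like this. StringPalindromeDemo is an illustrative name only:

```java
class StringPalindromeDemo {
    // Compares characters from both ends and stops at the first mismatch
    // or when the two indexes meet in the middle.
    static boolean isPalindrome(String s) {
        int left = 0;
        int right = s.length() - 1;
        while (left < right) {
            if (s.charAt(left++) != s.charAt(right--)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome(String.valueOf(993 * 913)));
    }
}
```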
Is there a method or way to keep reading chars from a string, putting them in a new string, until a certain char is reached?
For example, keep reading from < to >, but no further.
Thanks
Of course. You will need:
the method String.charAt(int)
the + operator (or the method String.concat, or, if performance matters, the class StringBuilder)
the for statement
and perhaps an if-statement with a break statement
The statements and operators are explained in the Java Tutorial, and the method in the api javadoc.
(And no, I will not provide an implementation, since you would learn little by copying it)
You can actually write a utility that does that
public class StringUtil {
public static String copy(String str, char startChar, char endChar) {
int startPos = str.indexOf(startChar);
int endPos = str.lastIndexOf(endChar);
if (endPos < startPos) {
throw new RuntimeException("endPos < startPos");
}
char[] dest = new char[endPos - startPos + 1];
str.getChars(startPos, endPos + 1, dest, 0); // srcEnd is exclusive, so + 1 to include endChar
return new String(dest);
}
};
PS Untested....
Alternatively, you can
String result = str.substring(startPos, endPos + 1); //If you want to include the ">" tag.
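A complete version of the substring idea, as a sketch. It assumes you want to stop at the first > after the <, which is how I read "but no further"; BracketExtractDemo and extract are made-up names, and returning null for "not found" is just one possible convention.

```java
class BracketExtractDemo {
    // Returns the text from the first startChar through the next endChar
    // (both inclusive), or null if either delimiter is missing.
    static String extract(String str, char startChar, char endChar) {
        int startPos = str.indexOf(startChar);
        if (startPos < 0) {
            return null;
        }
        // Search for the end delimiter only after the start delimiter.
        int endPos = str.indexOf(endChar, startPos + 1);
        if (endPos < 0) {
            return null;
        }
        return str.substring(startPos, endPos + 1);
    }

    public static void main(String[] args) {
        System.out.println(extract("say <Hello> and <Bye>", '<', '>'));
    }
}
```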
This may not be what you want but it will give you the desired string.
String desiredString="<Hello>".split("[<>]")[1];
In Java,
I need to read lines of text from a file and then reverse each line, writing the reversed version into another file. I know how to read from one file and write to another. What I don't know how to do is manipulate the text so that "This is line 1" would be written into the second file as "1 enil si sihT"
Since this is homework, you are probably interested in writing your own implementation of a reverse method.
The naive version visits the string backwards (from the last index to index 0) while copying it into a StringBuilder:
public String reverse(String s) {
StringBuilder sb = new StringBuilder();
for (int i = s.length() - 1; i >= 0; i--) {
sb.append(s.charAt(i));
}
return sb.toString();
}
For example, for the String "hello":
h e l l o
0 1 2 3 4 // indexes for charAt()
the method starts at index 4 ('o'), then index 3 ('l'), and so on down to index 0 ('h').
StringBuilder buffer = new StringBuilder(theString);
return buffer.reverse().toString();
If this is homework, it would be better for you to understand how data is stored in the string itself.
A string may be represented as an array of characters
String line = // read line ....;
char [] data = line.toCharArray();
To reverse an array you have to swap the positions of the elements. The first in the last, the last in the first and so on.
int l = data.length;
char temp;
temp = data[0]; // put the first element in "temp" to avoid losing it.
data[0] = data[l - 1]; // put the last value in the first;
data[l - 1] = temp; // and the first in the last.
Continue with the rest of the elements ( hint use a loop ) in the array and then create a new String with the result:
String modifiedString = new String( data ); // where data is the reversed array.
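If you get stuck on the loop, one way it could look is sketched below: two indexes move towards each other and swap as they go. SwapReverseDemo is only an illustrative name, and the sketch swaps plain char values, so it assumes the text contains no surrogate pairs.

```java
class SwapReverseDemo {
    // Reverses the line by swapping elements from both ends of the array.
    static String reverse(String line) {
        char[] data = line.toCharArray();
        for (int i = 0, j = data.length - 1; i < j; i++, j--) {
            char temp = data[i]; // keep the left element so it is not lost
            data[i] = data[j];   // put the right element on the left
            data[j] = temp;      // and the saved left element on the right
        }
        return new String(data);
    }

    public static void main(String[] args) {
        System.out.println(reverse("This is line 1"));
    }
}
```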
If it is not (and you really just need to get the work done), use:
StringBuilder.reverse()
Good luck.
String reversed = new StringBuilder(textLine).reverse().toString();
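For completeness, here is a sketch of how that one-liner might fit into the whole task of reading one file and writing another. The class and method names (ReverseLinesDemo, reverseAll, reverseLines) are made up, UTF-8 is assumed as the encoding, and the whole input is read into memory, which is fine for small files only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReverseLinesDemo {
    // Reverses each line independently; the order of the lines is kept.
    static List<String> reverseAll(List<String> lines) {
        List<String> reversed = new ArrayList<>();
        for (String textLine : lines) {
            reversed.add(new StringBuilder(textLine).reverse().toString());
        }
        return reversed;
    }

    // Reads every line of 'in', reverses it, and writes the result to 'out'.
    static void reverseLines(Path in, Path out) throws IOException {
        List<String> lines = Files.readAllLines(in, StandardCharsets.UTF_8);
        Files.write(out, reverseAll(lines), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the real input and output files.
        Path in = Files.createTempFile("lines", ".txt");
        Path out = Files.createTempFile("reversed", ".txt");
        Files.write(in, Arrays.asList("This is line 1", "This is line 2"), StandardCharsets.UTF_8);
        reverseLines(in, out);
        System.out.println(Files.readAllLines(out, StandardCharsets.UTF_8));
    }
}
```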
The provided answers all suggest using an already existing method, which is sound advice and usually more effective than writing your own.
Depending on the assignment, however, your teacher might expect you to write a method of your own. If that is the case, try using a for loop to walk through the string character by character, only instead of counting from zero and up, start counting from the last character index and down to zero, consecutively building the reversed string.
While we're feeding horrible, finished answers to the poor student, we might as well whet his appetite for the bizarre. If strings were guaranteed to be reasonably short and CPU time was no object, this is what I'd code:
public static String reverse(String str) {
if (str.length() == 0) return "";
else return reverse(str.substring(1)) + str.charAt(0);
}
(OK, I admit it: my current favorite language is Clojure, a Lisp!)
BONUS HOMEWORK: Figure out if, how and why this works!
java.lang.StringBuffer has a reverse method.