Why is the second code more efficient than the first one? - java

I am confused about two pieces of code: why is the second one below more efficient than the first?
Both simply reverse a String, but the first is slower than the second and I cannot work out why.
The first code is:
String reverse1(String s) {
String answer = "";
for(int j = s.length() - 1; j >= 0; j--) {
answer += s.charAt(j);
}
return answer;
}
The second code is:
String reverse2(String s) {
char answer[] = new char[s.length()];
for(int j = s.length() - 1; j >= 0; j--) {
answer[s.length() - j - 1] = s.charAt(j);
}
return new String(answer);
}
I cannot see why the second code is more efficient than the first, so I'd appreciate any insight on this.

The first code declares
String answer;
Strings are immutable. Therefore, every append operation reallocates the entire string, copies it, then copies in the new character.
The second code declares
char answer[];
Arrays are mutable, so each iteration copies only a single character. The final string is created once, not once per iteration of the loop.
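To put rough numbers on this, here is a minimal sketch that counts how many characters get written under each strategy. It is a simplified model, not a measurement: it assumes every += really builds a fresh string of the new length and ignores anything the compiler or JIT might do. The class and method names are made up for illustration.

```java
class CopyCount {
    // Model of answer += c repeated n times: the k-th append produces a new
    // string of length k, so k characters are written on that step.
    static long copiesWithConcat(int n) {
        long copied = 0;
        for (int k = 1; k <= n; k++) {
            copied += k;
        }
        return copied; // n * (n + 1) / 2 in total
    }

    // Model of the char[] version: n single-character writes into the array,
    // plus one n-character copy when new String(char[]) is called.
    static long copiesWithArray(int n) {
        return 2L * n;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("concat: " + copiesWithConcat(n));
        System.out.println("array:  " + copiesWithArray(n));
    }
}
```

Under this model the first version grows quadratically with the input length while the second grows linearly, which is why the gap only becomes obvious for long strings.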

Your question is perhaps difficult to answer exactly, in part because the answer would depend on the actual implementation of the first version. This, in turn, would depend on what version of Java you are using, and what the compiler decided to do.
Assuming that the compiler keeps the first version verbatim as you wrote it, then yes, the first version might be more inefficient, because it would require allocating a new string for each step in the reversal process. The second version, on the contrary, just maintains a single array of characters.
However, if the compiler is smart enough to use a StringBuilder, then the answer changes. Consider the following first version:
String reverse1(String s) {
    StringBuilder answer = new StringBuilder();
    for (int j = s.length() - 1; j >= 0; j--)
        answer.append(s.charAt(j));
    return answer.toString(); // the method returns String, so convert the builder
}
Under the hood, StringBuilder is implemented using a character array. So calling StringBuilder#append is somewhat similar to the second version, i.e. it just adds new characters to the end of the buffer.
So, if your first version really concatenates String objects on every iteration, it is less efficient than the second version, but with a StringBuilder it might be on par with the second version.
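As a sketch of the StringBuilder approach in runnable form: the version below also passes the expected length to the StringBuilder(int capacity) constructor, so the internal buffer should not need to grow while appending. The presizing is an extra assumption on my part, not something the question's code does, and the class name is made up.

```java
class ReverseWithBuilder {
    static String reverse(String s) {
        // Presize the buffer to the final length so append never has to enlarge it.
        StringBuilder answer = new StringBuilder(s.length());
        for (int j = s.length() - 1; j >= 0; j--) {
            answer.append(s.charAt(j));
        }
        // One final copy turns the mutable buffer into an immutable String.
        return answer.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("stressed"));
    }
}
```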

String is immutable. Whenever you do answer += s.charAt(j); it creates a new object. Try printing GC logs using -XX:+PrintGCDetails and see if the latency is caused by minor GC.

String objects are immutable, so every time you append you create another object, allocate space for it, and so on; this is quite inefficient when you need to concatenate many strings.
Your char array method fits your specific need well, but if you need more general string concatenation support, consider StringBuilder.

In this code you are creating a new String object in each loop iteration, because String is an immutable class:
String reverse1(String s) {
String answer = "";
for (int j = s.length() - 1; j >= 0; j--)
answer += s.charAt(j);
return answer;
}
In this code you have already allocated the memory for the char array. It creates only a single String, on the last line, so it is more efficient:
String reverse2(String s) {
char answer[] = new char[s.length()];
for (int j = s.length() - 1; j >= 0; j--)
answer[s.length() - j - 1] = s.charAt(j);
return new String(answer);
}

Why is the second code more efficient than the first one?
String is immutable: with answer += s.charAt(j); you create a new String instance on each loop iteration, which makes your code slow.
Instead of String, I suggest using StringBuilder in a single-threaded context, for both performance and readability (it might be a little slower than a fixed-size char array, but it is more readable):
String reverse1(String s) {
StringBuilder answer = new StringBuilder("");
for (int j = s.length() - 1; j >= 0; j--)
answer.append(s.charAt(j));
return answer.toString();
}

The JVM treats strings as immutable. Hence, every time you append to the existing string, you are actually creating a new string! This means that a new string object has to be created on the heap for every loop iteration. Creating an object and maintaining its lifecycle has its overhead. Add to that the garbage collection of the discarded strings (the string created in one iteration is no longer referenced in the next, and hence becomes eligible for collection by the JVM).
You should consider using a StringBuilder. I ran some tests, and the time taken by the StringBuilder code is not much different from that of the fixed-length array.
There are some nuances to how the JVM treats strings. There are things like string interning that the JVM does so that it does not have to create a new object for multiple strings with the same content. You might want to look into that.

Related

Error in swapping characters in stringbuffer object

I am trying to sort a string in alphabetical order, but I am facing an error on the line:
sb.charAt(j)=sb.charAt(j+1);
where the compiler shows the error: expected variable; found value
The rest of the code is as follows:
import java.util.Scanner;
class a
{
public static void main(String[] agrs)
{
Scanner sc = new Scanner(System.in);
String s = sc.next();
StringBuffer sb = new StringBuffer();
sb.append(s);
for(int i = 0; i< s.length(); i++)
{
for(int j = 0; j<s.length(); j++)
{
if(s.charAt(j)>s.charAt(j+1)){
char temp = s.charAt(j);
sb.charAt(j)=sb.charAt(j+1);
sb.charAt(j+1)=temp;
}
}
}}}
Kindly help me out, as I'm a beginner and cannot figure out why this issue is occurring. Thank you.
This looks like a homework assignment where the goal is to sort the characters of a text being entered, so if you enter gfedbca the result should be abcdefg.
You already got a comment telling you what the problem is: StringBuffer#charAt() does not return a reference into StringBuffer's internal array that you can assign to. Depending on the actual assignment, you can call StringBuffer's setCharAt method, or you can take another approach by converting the text to a char array and sorting that. There are actually helper classes in the JDK that do this for you; have a look, e.g., at the class java.util.Arrays.
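A minimal sketch of the char-array route mentioned above, using java.util.Arrays.sort(char[]). The class and method names are illustrative only, and this sorts by char value, which matches alphabetical order only for input such as all-lowercase ASCII letters.

```java
import java.util.Arrays;

class SortChars {
    static String sortChars(String s) {
        char[] chars = s.toCharArray(); // mutable copy of the characters
        Arrays.sort(chars);             // sorts in place by char value
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(sortChars("gfedbca"));
    }
}
```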
As already answered by many, the issue is in charAt(index) you are using, as this returns the character at the given index rather than setting a char at the index position.
My suggestion is to change your sorting approach. For simpler solutions, where smaller data sets (like your problem) are used, you should use a well-known sorting algorithm, like Insertion Sort.
You may get help for the algo from here: http://www.geeksforgeeks.org/insertion-sort/
StringBuffer's charAt just returns the value of the char at the index. If you want to swap two chars you need to use the setter for that, so you possibly want to do something like:
for (int j = 0; j < sb.length() - 1; j++) {
    // compare the characters in sb, not s: s never changes while sb is being sorted
    if (sb.charAt(j) > sb.charAt(j + 1)) {
        char temp = sb.charAt(j);
        sb.setCharAt(j, sb.charAt(j + 1));
        sb.setCharAt(j + 1, temp);
    }
}
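Put together with the outer loop from the question, a complete swap-based sort could look like the sketch below. It compares the characters held in the buffer itself rather than in the original string, since the original string never changes during the sort, and it stops the inner loop one short of the end so that j + 1 stays in range. The class and method names are made up.

```java
class BubbleSortChars {
    static String sort(String s) {
        StringBuffer sb = new StringBuffer(s);
        // Repeat the pass once per character; each pass bubbles larger chars right.
        for (int i = 0; i < sb.length(); i++) {
            for (int j = 0; j < sb.length() - 1; j++) {
                if (sb.charAt(j) > sb.charAt(j + 1)) {
                    char temp = sb.charAt(j);
                    sb.setCharAt(j, sb.charAt(j + 1));
                    sb.setCharAt(j + 1, temp);
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(sort("gfedbca"));
    }
}
```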
This method can only return values and cannot set them. I guess you might want to use this method instead:
setCharAt()
It should meet your requirement.

Best way to modify an existing string? StringBuilder or convert to char array and back to string?

I'm learning Java and am wondering what's the best way to modify strings here (both for performance and to learn the preferred method in Java). Assume you're looping through a string and checking each character/performing some action on that index in the string.
Do I use the StringBuilder class, or convert the string into a char array, make my modifications, and then convert the char array back to a string?
Example for StringBuilder:
StringBuilder newString = new StringBuilder(oldString);
for (int i = 0; i < oldString.length() ; i++) {
newString.setCharAt(i, 'X');
}
Example for char array conversion:
char[] newStringArray = oldString.toCharArray();
for (int i = 0; i < oldString.length() ; i++) {
    newStringArray[i] = 'X'; // write into the array created above
}
String myString = String.valueOf(newStringArray);
What are the pros/cons to each different way?
I take it that StringBuilder is going to be more efficient, since converting to a char array makes copies of the array each time you update an index.
I say do whatever is most readable/maintainable until you know that String "modification" is slowing you down. To me, this is the most readable:
String s = "foo";
s += "bar";
s += "baz";
If that's too slow, I'd use a StringBuilder. You may want to compare this to StringBuffer. If performance matters and synchronization does not, StringBuilder should be faster. If synchronization is needed, then you should use StringBuffer.
Also it's important to know that these strings are not being modified. In java, Strings are immutable.
This is all context specific. If you optimize this code and it doesn't make a noticeable difference (and this is usually the case), then you just thought longer than you had to and you probably made your code more difficult to understand. Optimize when you need to, not because you can. And before you do that, make sure the code you're optimizing is the cause of your performance issue.
What are the pros/cons to each different way? I take it that StringBuilder is going to be more efficient since converting to a char array makes copies of the array each time you update an index.
As written, the code in your second example will create just two arrays: one when you call toCharArray(), and another when you call String.valueOf() (String stores data in a char[] array). The element manipulations you are performing should not trigger any object allocations. There are no copies being made of the array when you read or write an element.
If you are going to be doing any sort of String manipulation, the recommended practice is to use a StringBuilder. If you are writing very performance-sensitive code, and your transformation does not alter the length of the string, then it might be worthwhile to manipulate the array directly. But since you are learning Java as a new language, I am going to guess that you are not working in high frequency trading or any other environment where latency is critical. Therefore, you are probably better off using a StringBuilder.
If you are performing any transformations that might yield a string of a different length than the original, you should almost certainly use a StringBuilder; it will resize its internal buffer as necessary.
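As an illustration of a transformation that changes the length, here is a small sketch that replaces each tab with a number of spaces. The example task and the names expandTabs and spacesPerTab are my own choices, not from the question; the point is only that the builder grows as needed.

```java
class ExpandTabs {
    // Replaces each tab with spacesPerTab spaces; the result may be longer
    // than the input, so a fixed-size char[] of input.length() would not fit.
    static String expandTabs(String input, int spacesPerTab) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\t') {
                for (int k = 0; k < spacesPerTab; k++) {
                    out.append(' ');
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expandTabs("a\tb", 4));
    }
}
```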
On a related note, if you are doing simple string concatenation (e.g, s = "a" + someObject + "c"), the compiler will actually transform those operations into a chain of StringBuilder.append() calls, so you are free to use whichever you find more aesthetically pleasing. I personally prefer the + operator. However, if you are building up a string across multiple statements, you should create a single StringBuilder.
For example:
public String toString() {
return "{field1 =" + this.field1 +
", field2 =" + this.field2 +
...
", field50 =" + this.field50 + "}";
}
Here, we have a single, long expression involving many concatenations. You don't need to worry about hand-optimizing this, because the compiler will use a single StringBuilder and just call append() on it repeatedly.
String s = ...;
if (someCondition) {
s += someValue;
}
s += additionalValue;
return s;
Here, you'll end up with two StringBuilders being created under the covers, but unless this is an extremely hot code path in a latency-critical application, it's really not worth fretting about. Given similar code, but with many more separate concatenations, it might be worth optimizing. Same goes if you know the strings might be very large. But don't just guess--measure! Demonstrate that there's a performance problem before you try to fix it. (Note: this is just a general rule for "micro optimizations"; there's rarely a downside to explicitly using a StringBuilder. But don't assume it will make a measurable difference: if you're concerned about it, you should actually measure.)
String s = "";
for (final Object item : items) {
s += item + "\n";
}
Here, we're performing a separate concatenation operation on each loop iteration, which means a new StringBuilder will be allocated on each pass. In this case, it's probably worth using a single StringBuilder since you may not know how large the collection will be. I would consider this an exception to the "prove there's a performance problem before optimizing" rule: if the operation has the potential to explode in complexity based on input, err on the side of caution.
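A sketch of the last loop rewritten around one explicit StringBuilder. The wrapper class and method name are hypothetical; the loop body mirrors s += item + "\n".

```java
import java.util.Arrays;
import java.util.List;

class JoinItems {
    static String describe(List<?> items) {
        // One builder for the whole loop instead of one per iteration.
        StringBuilder sb = new StringBuilder();
        for (final Object item : items) {
            sb.append(item).append('\n'); // append(Object) uses String.valueOf(item)
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(Arrays.asList("first", 2, 3.0)));
    }
}
```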
Which option will perform the best is not an easy question.
I did a benchmark using Caliper:
Benchmark         Runtime (ns)
array             88
builder           126
builderTillEnd    76
concat            3435
Benchmarked methods:
public static String array(String input)
{
char[] result = input.toCharArray(); // COPYING
for (int i = 0; i < input.length(); i++)
{
result[i] = 'X';
}
return String.valueOf(result); // COPYING
}
public static String builder(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result.toString(); // COPYING
}
public static StringBuilder builderTillEnd(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result;
}
public static String concat(String input)
{
String result = "";
for (int i = 0; i < input.length(); i++)
{
result += 'X'; // terrible COPYING, COPYING, COPYING... same as:
// result = new StringBuilder(result).append('X').toString();
}
return result;
}
Remarks
If we want to modify a String, we have to do at least 1 copy of that input String, because Strings in Java are immutable.
java.lang.StringBuilder extends java.lang.AbstractStringBuilder. StringBuilder.setCharAt() is inherited from AbstractStringBuilder and looks like this:
public void setCharAt(int index, char ch) {
if ((index < 0) || (index >= count))
throw new StringIndexOutOfBoundsException(index);
value[index] = ch;
}
AbstractStringBuilder internally uses the simplest char array: char value[]. So, result[i] = 'X' is very similar to result.setCharAt(i, 'X'), however the second will call a polymorphic method (which probably gets inlined by JVM) and check bounds in if, so it will be a bit slower.
Conclusions
If you can operate on StringBuilder until the end (you don't need String back) - do it. It's the preferred way and also the fastest. Simply the best.
If you want String in the end and this is the bottleneck of your program, then you might consider using char array. In benchmark char array was ~25% faster than StringBuilder. Be sure to properly measure execution time of your program before and after optimization, because there is no guarantee about this 25%.
Never concatenate Strings in a loop with + or +=, unless you really know what you are doing. Usually it's better to use an explicit StringBuilder and append().
I'd prefer to use the StringBuilder class where the original string is modified.
For String manipulation, I like the StringUtils class. You'll need to add the Apache Commons Lang dependency to use it.

Executing code N times and other code N+1 times

The question is about while-loops in which I need some code to be executed N times and some other code N+1 times. It is NOT about concatenating Strings; I just use this as a badly coded yet short example.
Let me explain my question by providing an example.
Say I want to concatenate N+1 Strings, gluing them together with "\n", for example. I will then have N+1 lines of text, but I only need to add "\n" N times.
Is there any boilerplate solution for this type of loop in which you have to execute some code N times and other code N+1 times? I'm NOT asking for solution to concatenate Strings! That is just a (bad) example. I'm looking for the general solution.
The problem I have with this is code duplication, so to code my example I'll do this (bad pseudo code, I know I have to use StringBuilder etc.):
String[] lines = <some array of dimension N+1>;
String total = lines[0];
for (int i = 1; i < N + 1; i++){
total += "\n" + lines[i];
}
The problem becomes worse if the code that has to be executed N+1 times, becomes larger, of course. Then I would do something like
codeA(); // adding the line of text
for (int i = 1; i < N + 1; i++){
codeB(); // adding the "\n"
codeA();
}
To remove the duplication, you can do this differently by checking inside the loop, but I find this quite stupid, as I know beforehand that the check is predetermined: it will only be false on the first iteration:
for (int i = 0; i < N + 1; i++){
if (i > 0){
codeB(); // adding the "\n"
}
codeA();
}
Is there any solution for this, a sort of while-loop that initializes once with codeA() and then keeps looping over codeB() and codeA()?
People must have run into this before, I guess. Just wondering if there are any beautiful solutions for this.
To my disappointment, I believe that there is no such construct that satisfies the conditions as you have stated them, and I will attempt to explain why (though I can't prove it in a strictly mathematical way).
The requirements of the problem are:
1) We have two parts of code: codeA() and codeB()
2) The two parts are executed a different number of times, N and N+1
3) We want to avoid adding a condition inside the loop
4) We want to execute each part only as many times as strictly necessary
2) is a direct consequence of 1). If we didn't have two parts of code we would not need a different number of executions. We would have a single loop body.
4) is again a consequence of 1). There is no redundant execution if we have a single loop body. We can control its execution through the loop's condition
So the restrictions are basically 1) and 3).
Now inside the loop we need to answer two questions on each iteration: a) do we execute codeA()? and b) do we execute codeB()? We simply do not have enough information to decide since we only have a single condition (the condition of the loop) and that condition will be used to decide if both of the code parts would be executed or not.
So we need to break 1) and/or 3). Either we add the extra condition inside the loop, or we delegate the decision to some other code (thus not having two parts anymore).
Apparently an example of delegation could be (I am using the string concatenation example):
String [] lines = ...
for (int i = 0; i < N; i++){
// delegate to a utility class LineBuilder (perhaps an extension of StringBuilder) to concatenate lines
// this class would still need to check a condition e.g. for the first line to skip the "\n"
// since we have delegated the decisions we do not have two code parts inside the loop
lineBuilder.addLine( lines[i] );
}
Now a more interesting case of delegation would be if we could delegate the decision to the data itself (this might be worth keeping in mind). Example:
List<Line> lines = Arrays.asList(
new FirstLine("Every"), // note this class is different
new Line("word"),
new Line("on"),
new Line("separate"),
new Line("line") );
StringBuffer sb = new StringBuffer();
for (Line l : lines) {
// Again the decision is delegated. Data knows how to print itself
// Line would return: "\n" + s
// FirstLine would return: s
sb.append( l.getPrintVersion() );
}
Of course, all of the above does not mean that you couldn't implement a class that tries to solve the problem. I believe, though, that this is beyond the scope of your original question, not to mention that it would be overkill for simple loops.
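For completeness, here is one way the Line and FirstLine classes from the data-delegation example could be filled in so that the loop runs. These classes are hypothetical; only their names and the behaviour of getPrintVersion() come from the comments in the example, and I use StringBuilder instead of StringBuffer since no synchronization is needed here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical data classes: each line knows how to print itself.
class Line {
    protected final String text;

    Line(String text) {
        this.text = text;
    }

    String getPrintVersion() {
        return "\n" + text; // every line except the first starts on a new line
    }
}

class FirstLine extends Line {
    FirstLine(String text) {
        super(text);
    }

    @Override
    String getPrintVersion() {
        return text; // the first line has no leading separator
    }
}

class LineDemo {
    static String render(List<Line> lines) {
        StringBuilder sb = new StringBuilder();
        for (Line l : lines) {
            sb.append(l.getPrintVersion()); // no condition inside the loop
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Line> lines = Arrays.asList(
                new FirstLine("Every"), new Line("word"), new Line("on"),
                new Line("separate"), new Line("line"));
        System.out.println(render(lines));
    }
}
```

The condition has not disappeared; it has moved into the choice of which class to instantiate for each element.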
Concatenating Strings like this is a bad idea and a much bigger issue IMHO.
However to answer your question I would do
String sep = "";
StringBuilder sb= new StringBuilder();
for(String s: lines) {
sb.append(sep).append(s);
sep = "\n";
}
String all = sb.toString();
Note: there is usually a good way to avoid needing to create this String at all, such as processing the lines as you get them. It is hard to say without more context.
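For the string-joining case specifically (not the general N versus N+1 question), Java 8 and later already ship this logic as String.join and java.util.StringJoiner. A short sketch, with a made-up wrapper class:

```java
import java.util.StringJoiner;

class JoinLines {
    // String.join puts the delimiter between elements, never after the last one.
    static String joinWithStringJoin(String[] lines) {
        return String.join("\n", lines);
    }

    // StringJoiner does the same incrementally, which helps when lines arrive one by one.
    static String joinWithJoiner(String[] lines) {
        StringJoiner joiner = new StringJoiner("\n");
        for (String line : lines) {
            joiner.add(line);
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        String[] lines = {"line1", "line2", "line3"};
        System.out.println(joinWithStringJoin(lines));
        System.out.println(joinWithJoiner(lines));
    }
}
```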
This kind of thing is fairly common, like when you build sql. This is the pattern that I follow:
String[] lines = ...; // init somehow
boolean firstTime = true;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < lines.length; i++) {
    if (firstTime) firstTime = false;
    else sb.append('\n'); // separator before every line except the first
    sb.append(lines[i]);
}
String total = sb.toString();
Compare this with the first example from the question:
String[] lines = <some array of dimension N+1>;
String total = lines[0];
for (int i = 1; i < N + 1; i++){
total += "\n" + lines[i];
}
Assuming you have an array of [0] = 'line1' and [1] = 'line2', that loop also ends up with the desired output:
line1\nline2
The difference is that it has to special-case lines[0] before the loop and builds the result with += instead of a StringBuilder.
The example I provided is clear, and does not perform poorly. In fact a much bigger performance gain is made by utilizing StringBuilder/Buffer. Having clear code is essential for the pro.
Personally, I have the same problem most of the time. For the String example I use the StringBuilder as you said, and just delete the characters added too much:
StringBuilder sb = new StringBuilder();
for(int i=0; i<N; i++) {
sb.append(array[i]).append("\n");
}
if (sb.length() > 0) sb.delete(sb.length() - 1, sb.length()); // drop the trailing "\n"; the check covers N == 0
In the general case, I suppose there is no other way than adding the if you suggested. To make the code clearer, I would check at the end of the for loop:
StringBuilder sb = new StringBuilder();
for(int i=0; i<N; i++) {
sb.append(array[i]);
if(i < N - 1) { // skip the separator after the last element
sb.append("\n");
}
}
But I totally agree that it is sad to have this double logic.
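One general shape that avoids both the duplicated call and a per-iteration "is this the first pass" flag is the so-called loop-and-a-half: run codeA, leave the loop if nothing remains, otherwise run codeB and go around again. The sketch below applies it to the joining example; the class and method names are made up, and the two append calls stand in for codeA() and codeB().

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LoopAndAHalf {
    static String join(List<String> parts, String separator) {
        StringBuilder out = new StringBuilder();
        Iterator<String> it = parts.iterator();
        if (!it.hasNext()) {
            return ""; // nothing to do for an empty input
        }
        while (true) {
            out.append(it.next());   // codeA: runs N+1 times
            if (!it.hasNext()) {
                break;               // exit in the middle of the loop body
            }
            out.append(separator);   // codeB: runs N times
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("line1", "line2", "line3"), "\n"));
    }
}
```

There is still one test per iteration, but it is the loop's own termination test moved into the middle of the body rather than an extra condition.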

Which approach is better and optimized between two of the following?

Below are the two approaches for reversing the string with calling API methods. Please tell me which approach is better, with justification.
public String functionOne(String str){
char arr[] = str.toCharArray();
int limit = arr.length/2;
for (int i = arr.length-1, j = 0; j < limit; i--, j++) {
char c = arr[i];
arr[i] = arr[j];
arr[j] = c;
}
return new String(arr);
}
public String functionTwo(String str) {
StringBuilder strBuilder = new StringBuilder();
char[] strChars = str.toCharArray();
for (int i = strChars.length - 1; i >= 0; i--) {
strBuilder.append(strChars[i]);
}
return strBuilder.toString();
}
Actually, when I ran my code on a string of length 100000, the second approach took twice as long as the first. Using System.currentTimeMillis(), I measured 1 ms for the first approach and 2 ms for the second.
How about this:
new StringBuilder("some string").reverse().toString();
The API already in place for this will likely use the most efficient manner.
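Wrapped in a method, the built-in call could be used as in the sketch below (the class name is made up). One documented difference from the hand-written loops is that StringBuilder.reverse() treats a surrogate pair as a single character, so characters outside the Basic Multilingual Plane are not split.

```java
class BuiltInReverse {
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("some string"));
    }
}
```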
Second is more readable: I can skim through that without having to think about what it's doing. I would go with that every time (unless there's a good reason why you need it to be milliseconds quicker?)
The first one stops my brain for a few seconds. That means it's dangerous and can easily be broken by future changes. It either needs comments, or replacing (with the second one).
Both are equivalent in terms of complexity. The first uses n/2 swaps, which is O(n), and the second does n appends, which is also O(n) time complexity.
In practice both will run almost equally well because n or n/2 operation won't make much difference.
EDIT: If you don't get the time complexity meaning, try generating a random string of large length say 1 million and calculate the time for both approach.

new String() vs literal string performance

This question has been asked many times on StackOverflow but none of them were based on performance.
In Effective Java book it's given that
If String s = new String("stringette"); occurs in a loop or in a frequently invoked method, millions of String instances can be created needlessly.
The improved version is simply the following:
String s = "stringette";
This version uses a single String instance, rather than creating a new one each time it is executed.
So, I tried both and found significant improvement in performance:
for (int j = 0; j < 1000; j++) {
String s = new String("hello World");
}
takes about 399 372 nanoseconds.
for (int j = 0; j < 1000; j++) {
String s = "hello World";
}
takes about 23 000 nanoseconds.
Why is there so much performance improvement? Is there any compiler optimization happening inside?
In the first case, a new object is being created in each iteration, in the second case, it's always the same object, being retrieved from the String constant pool.
In Java, when you do:
String bla = new String("xpto");
You force the creation of a new String object, this takes up some time and memory.
On the other hand, when you do:
String muchMuchFaster = "xpto"; //String literal!
The String will only be created the first time (a new object), and it'll be cached in the String constant pool, so every time you refer to it in its literal form, you're getting the exact same object, which is amazingly fast.
Now you may ask... what if two different points in the code retrieve the same literal and change it, aren't there problems bound to happen?!
No, because Strings, in Java, as you may very well know, are immutable! So any operation that would mutate a String returns a new String, leaving any other references to the same literal happy on their way.
This is one of the advantages of immutable data structures, but that's another issue altogether, and I could write a couple of pages on the subject.
Edit
Just a clarification, the constant pool isn't exclusive to String types, you can read more about it here, or if you google for Java constant pool.
http://docs.oracle.com/javase/specs/jvms/se7/jvms7.pdf
Also, a little test you can do to drive the point home:
String a = new String("xpto");
String b = new String("xpto");
String c = "xpto";
String d = "xpto";
System.out.println(a == b);
System.out.println(a == c);
System.out.println(c == d);
With all this, you can probably figure out the results of these Sysouts:
false
false
true
Since c and d are the same object, the == comparison holds true.
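A practical consequence, sketched below with made-up helper names: == asks whether two references point to the same object, while equals asks whether the contents match, so content comparisons should normally use equals.

```java
class StringIdentity {
    // Reference comparison: true only if both variables point to the same object.
    static boolean sameObject(String x, String y) {
        return x == y;
    }

    // Content comparison: true if both strings hold the same characters.
    static boolean sameContent(String x, String y) {
        return x.equals(y);
    }

    public static void main(String[] args) {
        String a = new String("xpto");
        String c = "xpto";
        System.out.println(sameObject(a, c));
        System.out.println(sameContent(a, c));
    }
}
```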
The performance difference is in fact much greater: HotSpot has an easy time compiling the entire loop
for (int j = 0; j < 1000; j++)
{String s="hello World";}
out of existence so the runtime is a solid 0. This, however, happens only after the JIT compiler kicks in; that's what warmup is for, a mandatory procedure when microbenchmarking anything on the JVM.
This is the code I ran:
public static void timeLiteral() {
for (int j = 0; j < 1_000_000_000; j++)
{String s="hello World";}
}
public static void main(String... args) {
for (int i = 0; i < 10; i++) {
final long start = System.nanoTime();
timeLiteral();
System.out.println((System.nanoTime() - start) / 1000);
}
}
And this is a typical output:
1412
38
25
1
1
0
0
1
0
1
You can observe the JIT taking effect very soon.
Note that I don't iterate one thousand, but one billion times in the inner method.
As has already been answered, the second retrieves the instance from the String pool (remember, Strings are immutable).
Additionally, you should check out the intern() method, which lets you put a String created at runtime into the pool when you do not know its value at compile time, e.g.:
String s = stringVar.intern();
or
new String(stringVar).intern();
One additional fact: a String object caches its hash code after the first call to hashCode(), so a pooled string used as a key in hash-based data structures such as HashMap does not have its hash code recomputed each time.
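A small sketch of intern() on strings built at runtime. The helper buildAtRuntime is hypothetical and exists only to produce two equal strings that are not compile-time constants.

```java
class InternDemo {
    // Builds the string at runtime so it is not a compile-time constant.
    static String buildAtRuntime(String prefix, int n) {
        return new StringBuilder(prefix).append(n).toString();
    }

    public static void main(String[] args) {
        String x = buildAtRuntime("id", 42);
        String y = buildAtRuntime("id", 42);
        System.out.println(x == y);                   // two separate objects
        System.out.println(x.intern() == y.intern()); // one canonical pooled instance
        System.out.println(x.hashCode() == y.hashCode());
    }
}
```

Whether interning pays off depends on how often the same values repeat, so it is worth measuring before applying it widely.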
The JVM maintains a pool of references to unique String objects that are literals. In your new String example you are wrapping the literals with an instance of each.
See http://www.precisejava.com/javaperf/j2se/StringAndStringBuffer.htm
