I've been told that code such as:
for (int i = 0; i < x.length(); i++) {
// blah
}
is actually O(n^2) because of the repeated calls to x.length(). Instead I should use:
int l = x.length();
for (int i = 0; i < l; i++) {
// blah
}
Is this true? Is string length stored as a private integer attribute of the String class? Or does String.length() really walk the whole string just to determine its length?
No. Getting the length of a Java string is O(1), because Java's String class keeps the length stored with the object rather than computing it on each call.
The advice you've received is true of C, amongst other languages, but not Java. C's strlen walks the char array looking for the end-of-string character. Joel's talked about it on the podcast, but in the context of C.
Contrary to what has been said so far, there is no guarantee that String.length() is a constant time operation in the number of characters contained in the string. Neither the javadocs for the String class nor the Java Language Specification require String.length to be a constant time operation.
However, in Sun's implementation String.length() is a constant time operation. Ultimately, it's hard to imagine why any implementation would have a non-constant time implementation for this method.
String stores the length in a separate variable. Since string is immutable, the length will never change.
It will need to calculate the length only once when it is created, which happens when memory is allocated for it.
Hence it's O(1).
In the event you didn't know you could write it this way:
for (int i = 0, l = x.length(); i < l; i++) {
// Blah
}
It's just slightly cleaner since l's scope is smaller.
You should be aware that the length() method returns the number of UTF-16 code units, which is not necessarily the same as the number of characters (code points) in all cases.
OK, the chances of that actually affecting you are pretty slim, but there's no harm in knowing it.
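To make the distinction concrete, here is a small sketch. The class name LengthDemo and the sample string are made up for illustration. A character outside the Basic Multilingual Plane is stored as a surrogate pair, so it counts as two code units but one code point:

```java
final class LengthDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the whole string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, written as its surrogate pair
        String s = "a\uD83D\uDE00";
        System.out.println(codeUnits(s));  // 3
        System.out.println(codePoints(s)); // 2
    }
}
```

If you index with charAt(i) in a loop bounded by length(), you are iterating over code units, which is fine for most text but can split a surrogate pair.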
I don't know how well the link will translate, but see the source of String#length. In short, #length() has O(1) complexity because it's just returning a field. This is one of the many advantages of immutable strings.
There is no need to worry about calling x.length() repeatedly. The length is part of the state kept in the String object, and the length() method simply returns it whenever we call x.length().
Here is the length() method as defined in the String class (this form is from JDK 9 and later):
public int length() {
return value.length >> coder();
}
The length() method returns a value derived from state that is already stored in the String object; nothing is scanned.
According to this, the length is a field of the String object.
I was always told strings in Java are immutable, unless you're going to use the StringBuilder class or the StringWriter class.
Take a look at this practice question I found online,
Given a string and a non-negative int n, return a larger string that is n copies of the original string.
stringTimes("Hi", 2) → "HiHi"
stringTimes("Hi", 3) → "HiHiHi"
stringTimes("Hi", 1) → "Hi"
and the solution came out to be
Solution:
public String stringTimes(String str, int n) {
String result = "";
for (int i=0; i<n; i++) {
result = result + str; // could use += here
}
return result;
}
As you can see in the solution, the third line assigns the string, and then we change it in the for loop. This makes no sense to me! (I answered the question in another, more newbie way.) Once I saw this solution I knew I had to ask you guys.
Thoughts? I know I'm not that great at programming, but I haven't seen this type of example here before, so I thought I'd share.
The trick to understanding what's going on is the line below:
result = result + str;
or its equivalent
result += str;
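The Java compiler performs a trick on this syntax. Behind the scenes, it generates code that, for non-null strings, behaves like the following (the actual bytecode usually goes through StringBuilder or, in newer Java versions, an invokedynamic call, but the effect is the same):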
result = result.concat(str);
Variable result participates in this expression twice - once as the original string on which concat method is called, and once as the target of an assignment. The assignment does not mutate the original string, it replaces the entire String object with a new immutable one provided by concat.
Perhaps it would be easier to see if we introduce an additional String variable into the code:
String temp = result.concat(str);
result = temp;
Once the first line has executed, you have two String objects - temp, which is "HiHi", and result, which is still "Hi". When the second line is executed, result gets replaced with temp, acquiring a new value of "HiHi".
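A minimal sketch of the same idea that you can run yourself. The class and method names are invented for illustration. It keeps a second reference to the original object and checks that the concatenation left it untouched:

```java
final class ReassignDemo {
    // Returns true if concatenation produced a new object and left the
    // original String unchanged.
    static boolean originalUnchanged() {
        String result = "Hi";
        String before = result;   // second reference to the same object
        result = result + "Hi";   // builds a new String and rebinds 'result'
        return before.equals("Hi")
                && result.equals("HiHi")
                && before != result; // two distinct objects
    }

    public static void main(String[] args) {
        System.out.println(originalUnchanged()); // true
    }
}
```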
If you use Eclipse, you can set a breakpoint and step through the code. You will see that the id of result (shown in the "Variables" view) changes every time Java executes
result = result + str;
On the other hand, if you use StringBuffer like
StringBuffer result = new StringBuffer("");
for(int i = 0; i < n; i++){
result.append(str);
}
the id of result will not change.
String objects are indeed immutable. result is not a String, it is a reference to a String object. In each iteration, a new String object is created and assigned to the same reference. The old object with no reference is eventually destroyed by a garbage collector. For a simple example like this, it is a possible solution. However, creating a new String object in each iteration in a real-world application is not a smart idea.
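As a rough sketch of the alternative for the stringTimes exercise, assuming n is non-negative as the problem states, a single StringBuilder avoids creating a new String on every pass. The wrapper class name is made up:

```java
final class StringTimes {
    // Builds n copies of str using one growable buffer.
    // Assumes n >= 0, as in the exercise statement.
    static String stringTimes(String str, int n) {
        StringBuilder result = new StringBuilder(str.length() * n);
        for (int i = 0; i < n; i++) {
            result.append(str); // mutates the builder, no new String here
        }
        return result.toString(); // one String is created at the end
    }

    public static void main(String[] args) {
        System.out.println(stringTimes("Hi", 3)); // HiHiHi
    }
}
```

On Java 11 and later, str.repeat(n) does the same job.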
I am trying to override compareTo in Java such that it works as follows. There will be two string arrays containing k strings each. The compareTo method will go through the words in order, comparing the kth element of each array. The arrays will then be sorted accordingly. The code I have currently is as follows, but it does not work properly.
I need a return statement outside the for-loop. I'm not sure what this return statement should return, since one of the for-loop return statements will always be reached.
Also, am I using continue correctly here?
public int compareTo(WordNgram wg) {
for (int k = 0; k < (this.myWords).length; k++) {
String temp1 = (this.myWords)[k];
String temp2 = (wg.myWords)[k];
int last = temp1.compareTo(temp2);
if (last == 0) {
continue;
} else {
return last;
}
}
}
You want to compare the two strings at the same location:
int last = temp1.compareTo(temp2);
The Java compiler requires that every path through a non-void method ends in a return statement. In your case you must return 0 at the end, so that when both arrays contain exactly the same strings the caller will know they are equal.
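Putting that together, one possible version looks like this. It is a sketch, not the asker's full class: the constructor is invented so the example is self-contained, and the tie-break on array length is my own assumption for the case where the arrays do not have the same size (the question says they always do, in which case it simply returns 0).

```java
final class WordNgram implements Comparable<WordNgram> {
    private final String[] myWords;

    // Hypothetical constructor, added only to make the sketch runnable.
    WordNgram(String... words) {
        this.myWords = words.clone();
    }

    @Override
    public int compareTo(WordNgram wg) {
        // Only compare positions that exist in both arrays.
        int shared = Math.min(myWords.length, wg.myWords.length);
        for (int k = 0; k < shared; k++) {
            int last = myWords[k].compareTo(wg.myWords[k]);
            if (last != 0) {
                return last; // first differing word decides the order
            }
        }
        // All shared words are equal: 0 for equal lengths,
        // otherwise the shorter array sorts first (an assumption).
        return Integer.compare(myWords.length, wg.myWords.length);
    }
}
```

The continue in the original is not wrong, but it is unnecessary: falling through to the next iteration does the same thing.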
You should start listening to your compiler, because after looking at your code for a minute I spotted two cases with no defined result: when this.myWords.length is 0, and when all the words are equal.
Also, I personally find it very difficult to handle multiple method exit points with all possible inputs considered, so I would rather use a single return statement, which makes debugging easier and the results more predictable. In your case, for example, I would collect the results of compareTo in a collection whenever they differ from 0. After the for-loop has finished, you can decide from the state of this collection whether to return 0 (empty collection) or the first value in the collection. I like this more formal approach because it forces you to think in sets, as in "Give me all comparison results where compareTo returns anything other than 0. If this list is empty, the result is 0; otherwise it is the first element of the list."
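For illustration, the collection-based, single-exit idea described above might look roughly like this. It is only a sketch: the class and method names are invented, it works on plain arrays rather than the asker's class, and it only compares positions present in both arrays.

```java
import java.util.ArrayList;
import java.util.List;

final class SingleExitCompare {
    // Collects every non-zero comparison result, then decides once at the end.
    static int compareWords(String[] a, String[] b) {
        List<Integer> nonZero = new ArrayList<>();
        int shared = Math.min(a.length, b.length);
        for (int k = 0; k < shared; k++) {
            int c = a[k].compareTo(b[k]);
            if (c != 0) {
                nonZero.add(c);
            }
        }
        // Empty list means every compared pair was equal.
        int result = nonZero.isEmpty() ? 0 : nonZero.get(0);
        return result; // the only exit point
    }
}
```

The trade-off is that this version keeps comparing after the first difference has been found, so it does more work than an early return.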
I'm learning Java and am wondering what's the best way to modify strings here (both for performance and to learn the preferred method in Java). Assume you're looping through a string and checking each character/performing some action on that index in the string.
Do I use the StringBuilder class, or convert the string into a char array, make my modifications, and then convert the char array back to a string?
Example for StringBuilder:
StringBuilder newString = new StringBuilder(oldString);
for (int i = 0; i < oldString.length() ; i++) {
newString.setCharAt(i, 'X');
}
Example for char array conversion:
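char[] newStringArray = oldString.toCharArray(); // copies the characters once
for (int i = 0; i < newStringArray.length; i++) {
    newStringArray[i] = 'X'; // in-place write, no allocation
}
String newString = String.valueOf(newStringArray); // copies once more into a new String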
What are the pros/cons to each different way?
I take it that StringBuilder is going to be more efficient since the converting to a char array makes copies of the array each time you update an index.
I say do whatever is most readable/maintainable until you know that String "modification" is slowing you down. To me, this is the most readable:
String s = "foo";
s += "bar";
s += "baz";
If that's too slow, I'd use a StringBuilder. You may want to compare this to StringBuffer. If performance matters and synchronization does not, StringBuilder should be faster. If synchronization is needed, then you should use StringBuffer.
Also it's important to know that these strings are not being modified. In java, Strings are immutable.
This is all context specific. If you optimize this code and it doesn't make a noticeable difference (and this is usually the case), then you just thought longer than you had to and you probably made your code more difficult to understand. Optimize when you need to, not because you can. And before you do that, make sure the code you're optimizing is the cause of your performance issue.
What are the pros/cons to each different way? I take it that StringBuilder is going to be more efficient since converting to a char array makes copies of the array each time you update an index.
As written, the code in your second example will create just two arrays: one when you call toCharArray(), and another when you call String.valueOf() (String stores data in a char[] array). The element manipulations you are performing should not trigger any object allocations. There are no copies being made of the array when you read or write an element.
If you are going to be doing any sort of String manipulation, the recommended practice is to use a StringBuilder. If you are writing very performance-sensitive code, and your transformation does not alter the length of the string, then it might be worthwhile to manipulate the array directly. But since you are learning Java as a new language, I am going to guess that you are not working in high frequency trading or any other environment where latency is critical. Therefore, you are probably better off using a StringBuilder.
If you are performing any transformations that might yield a string of a different length than the original, you should almost certainly use a StringBuilder; it will resize its internal buffer as necessary.
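As an example of a transformation that changes the length, here is a small sketch that replaces each tab with four spaces. The class name and the choice of four spaces are arbitrary. With a char array you would have to work out the final size up front; the StringBuilder grows as needed:

```java
final class ExpandTabs {
    // Replaces every tab with four spaces; the result may be longer
    // than the input, so a fixed-size char[] would not be enough.
    static String expandTabs(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\t') {
                out.append("    ");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expandTabs("a\tb")); // a    b
    }
}
```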
On a related note, if you are doing simple string concatenation (e.g., s = "a" + someObject + "c"), the compiler will actually transform those operations into a chain of StringBuilder.append() calls, so you are free to use whichever you find more aesthetically pleasing. I personally prefer the + operator. However, if you are building up a string across multiple statements, you should create a single StringBuilder.
For example:
public String toString() {
return "{field1 =" + this.field1 +
", field2 =" + this.field2 +
...
", field50 =" + this.field50 + "}";
}
Here, we have a single, long expression involving many concatenations. You don't need to worry about hand-optimizing this, because the compiler will use a single StringBuilder and just call append() on it repeatedly.
String s = ...;
if (someCondition) {
s += someValue;
}
s += additionalValue;
return s;
Here, you'll end up with two StringBuilders being created under the covers, but unless this is an extremely hot code path in a latency-critical application, it's really not worth fretting about. Given similar code, but with many more separate concatenations, it might be worth optimizing. Same goes if you know the strings might be very large. But don't just guess--measure! Demonstrate that there's a performance problem before you try to fix it. (Note: this is just a general rule for "micro optimizations"; there's rarely a downside to explicitly using a StringBuilder. But don't assume it will make a measurable difference: if you're concerned about it, you should actually measure.)
String s = "";
for (final Object item : items) {
s += item + "\n";
}
Here, we're performing a separate concatenation operation on each loop iteration, which means a new StringBuilder will be allocated on each pass. In this case, it's probably worth using a single StringBuilder since you may not know how large the collection will be. I would consider this an exception to the "prove there's a performance problem before optimizing" rule: if the operation has the potential to explode in complexity based on input, err on the side of caution.
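The loop case rewritten with one explicit StringBuilder could look like the sketch below. The class and method names are invented, and the list is passed in as a parameter so the example is self-contained:

```java
import java.util.List;

final class JoinLines {
    // Appends each item followed by a newline, reusing one builder
    // for the whole loop instead of allocating one per iteration.
    static String joinLines(List<?> items) {
        StringBuilder sb = new StringBuilder();
        for (final Object item : items) {
            sb.append(item).append('\n'); // append(Object) uses String.valueOf
        }
        return sb.toString();
    }
}
```

The work done is now proportional to the total output length rather than growing quadratically with the number of items.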
Which option will perform the best is not an easy question.
I did a benchmark using Caliper:
METHOD          RUNTIME (NS)
array           88
builder         126
builderTillEnd  76
concat          3435
Benchmarked methods:
public static String array(String input)
{
char[] result = input.toCharArray(); // COPYING
for (int i = 0; i < input.length(); i++)
{
result[i] = 'X';
}
return String.valueOf(result); // COPYING
}
public static String builder(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result.toString(); // COPYING
}
public static StringBuilder builderTillEnd(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result;
}
public static String concat(String input)
{
String result = "";
for (int i = 0; i < input.length(); i++)
{
result += 'X'; // terrible COPYING, COPYING, COPYING... same as:
// result = new StringBuilder(result).append('X').toString();
}
return result;
}
Remarks
If we want to modify a String, we have to do at least 1 copy of that input String, because Strings in Java are immutable.
java.lang.StringBuilder extends java.lang.AbstractStringBuilder. StringBuilder.setCharAt() is inherited from AbstractStringBuilder and looks like this:
public void setCharAt(int index, char ch) {
if ((index < 0) || (index >= count))
throw new StringIndexOutOfBoundsException(index);
value[index] = ch;
}
AbstractStringBuilder internally uses the simplest char array: char value[]. So, result[i] = 'X' is very similar to result.setCharAt(i, 'X'), however the second will call a polymorphic method (which probably gets inlined by JVM) and check bounds in if, so it will be a bit slower.
Conclusions
If you can operate on StringBuilder until the end (you don't need String back) - do it. It's the preferred way and also the fastest. Simply the best.
If you want String in the end and this is the bottleneck of your program, then you might consider using char array. In benchmark char array was ~25% faster than StringBuilder. Be sure to properly measure execution time of your program before and after optimization, because there is no guarantee about this 25%.
Never concatenate Strings in a loop with + or +=, unless you really know what you are doing. Usually it's better to use an explicit StringBuilder and append().
I'd prefer to use StringBuilder class where original string is modified.
For String manipulation, I like the StringUtils class. You'll need to add the Apache Commons Lang dependency to use it.
Here is one of the constructors for the String object in Java:
public String(String original) {
int size = original.count;
char[] originalValue = original.value;
char[] v;
if (originalValue.length > size) {
// The array representing the String is bigger than the new
// String itself. Perhaps this constructor is being called
// in order to trim the baggage, so make a copy of the array.
int off = original.offset;
v = Arrays.copyOfRange(originalValue, off, off+size);
} else {
// The array representing the String is the same
// size as the String, so no point in making a copy.
v = originalValue;
}
this.offset = 0;
this.count = size;
this.value = v;
}
The line I care about is if (originalValue.length > size). I don't think this condition can ever be true, so the code inside the if would never be executed. The String is in fact an array of characters. original.count should be equal to its value's length (its value is an array of characters), so the condition would never hold.
I may be wrong, so I need your explanation. Thanks for your help.
VipHaLong.
The String is in fact an array of characters
No it's not. It's an object which internally has a reference to an array of characters.
original.count should be equal to its value's length (its value is an array of characters)
Not necessarily. It depends on the exact version of Java you're looking at, but until recently several strings could refer to the same char[], each using a different portion of the array.
For example, if you have:
String longString = "this is a long string";
String shortString = longString.substring(0, 2);
... the object referred to by shortString would use the same char[] that the original string referred to, but with a start offset of 0 and a count of 2. So if you then called:
String copyOfShortString = new String(shortString);
that would indeed go into the if block you were concerned about in your question.
As of Java 7 update 5, the Oracle JRE has changed to make substring always take a copy. (The pros and cons behind this can get quite complicated, but it's worth being aware of both systems.)
It looks like the version of code you're looking at is an older version where string objects could share an underlying array but view different portions.
The String implementation that you are looking at does not copy character data when you create a substring. Instead, multiple String objects can refer to the same character array but have different offset and count (and therefore length).
Therefore, the if condition can, in fact, be true.
Note that this sharing of character arrays has been removed in recent versions of the Oracle JDK.
I am struggling with the charAt method. What I want to know is whether charAt can take more than one number as a parameter, so that you can look at more than one character in a single call.
No, there is no vanilla JavaScript method for that. You could always write one that prototypes the String object, though:
String.prototype.charsAt = function(indexes) {
var returned = [ ];
for(var i = 0; i < indexes.length; i++)
{
returned.push(this.charAt(indexes[i]));
}
return returned;
}
You can then call it using:
var text = 'mystring';
alert(text.charsAt([0, 1]));
You can see a working demo here > http://jsfiddle.net/MDNRS/. As others have said though, this is really entirely trivial, as you should use substr() or other methods.
From javadoc:
charAt
public char charAt(int index)
Returns the char value at the specified index. An index ranges from 0 to length() - 1. The first char value of the sequence is at index 0, the next at index 1, and so on, as for array indexing.
If you need something else, you can use toCharArray() to get a copy of the string's characters as a char[] and do what you need with it.
It looks like your understanding of the charAt(int index) method is a little bit off. Calling this method simply returns the single character at the specified index.
If you are looking for searching a specific character sequence, you might want to look into the contains(CharSequence cs) method of String class.
Reference:
Java API
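Since the question is about Java rather than JavaScript, here is a rough Java counterpart of the charsAt idea. This is a hypothetical helper, not part of the String API; it just calls charAt once per requested index:

```java
final class CharsAt {
    // Hypothetical helper: returns the characters of s at the given indexes,
    // in the order the indexes were passed.
    static char[] charsAt(String s, int... indexes) {
        char[] out = new char[indexes.length];
        for (int i = 0; i < indexes.length; i++) {
            out[i] = s.charAt(indexes[i]); // throws if an index is out of range
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(new String(charsAt("mystring", 0, 1))); // my
    }
}
```

For a contiguous range, "mystring".substring(0, 2) already gives you "my" without any helper.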