Java StringBuilder Delete Last Occurrence of Character Efficiently

What is the most efficient way to delete the last occurrence of a char from a StringBuilder?
My current solution is O(N), but I feel like this problem can be solved in constant time.
public StringBuilder deleteLastOccurance(StringBuilder builder, char c) {
    int lastIndex = builder.lastIndexOf(String.valueOf(c));
    if (lastIndex != -1) {
        builder.deleteCharAt(lastIndex); // O(N)
    }
    return builder;
}

In the end it will be O(n) time no matter what. The last occurrence cannot be found without scanning the characters (in the worst case all of them), and deleting a character from the middle shifts every character after it.
The internal Java API methods do the same work under the hood.
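To make the cost concrete, here is a minimal sketch (the class and method names are illustrative, not from the question). It scans backward with charAt, which avoids building a String just for the search, but both the scan and the shift done by deleteCharAt remain linear in the worst case:

```java
final class LastOccurrence {
    // Removes the last occurrence of c from builder, if any, and returns the builder.
    static StringBuilder deleteLast(StringBuilder builder, char c) {
        // Scan from the end: O(n) when c is absent or near the front.
        for (int i = builder.length() - 1; i >= 0; i--) {
            if (builder.charAt(i) == c) {
                // Shifts the characters after i one slot left: O(n - i).
                builder.deleteCharAt(i);
                break;
            }
        }
        return builder;
    }

    public static void main(String[] args) {
        System.out.println(deleteLast(new StringBuilder("banana"), 'a')); // banan
        System.out.println(deleteLast(new StringBuilder("banana"), 'x')); // banana
    }
}
```

The only cheap case is when the match happens to sit at the very end of the builder, where both the scan and the shift touch a single character.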

Related

Count the Characters in a String Recursively & treat "eu" as a Single Character

I am new to Java, and I'm trying to figure out how to count characters in a given string and treat the combination of two characters "eu" as a single character, while still counting every other character as one character.
And I want to do that using recursion.
Consider the following example.
Input:
"geugeu"
Desired output:
4 // g + eu + g + eu = 4
Current output:
2
I've been trying a lot and still can't seem to figure out how to implement it correctly.
My code:
public static int recursionCount(String str) {
    if (str.length() == 1) {
        return 0;
    }
    else {
        String ch = str.substring(0, 2);
        if (ch.equals("eu")) {
            return 1 + recursionCount(str.substring(1));
        }
        else {
            return recursionCount(str.substring(1));
        }
    }
}

OP wants to count all characters in a string but adjacent characters "ae", "oe", "ue", and "eu" should be considered a single character and counted only once.
Below code does that:
public static int recursionCount(String str) {
    int n;
    n = str.length();
    if(n <= 1) {
        return n; // return 1 if one character left or 0 if empty string.
    }
    else {
        String ch = str.substring(0, 2);
        if(ch.equals("ae") || ch.equals("oe") || ch.equals("ue") || ch.equals("eu")) {
            // consider as one character and skip next character
            return 1 + recursionCount(str.substring(2));
        }
        else {
            // don't skip next character
            return 1 + recursionCount(str.substring(1));
        }
    }
}

Recursion explained
In order to address a particular task using Recursion, you need a firm understanding of how recursion works.
And the first thing you need to keep in mind is that every recursive solution should (either explicitly or implicitly) contain two parts: Base case and Recursive case.
Let's have a look at them closely:
Base case - the part that represents a simple edge case (or a set of edge cases), i.e. a situation in which recursion should terminate. The outcome for these edge cases is known in advance. For this task, the base case is when the given string is empty, and since there's nothing to count, the return value should be 0. That is sufficient for the algorithm to work; outcomes for other inputs are derived from the recursive case.
Recursive case - the part of the method where recursive calls are made and where the main logic resides. Every recursive call eventually hits the base case and starts building its return value.
In the recursive case, we need to check whether the given string starts with a particular string such as "eu". For that we don't need to generate a substring (keep in mind that object creation is costly). Instead we can use String.startsWith(), which compares the prefix against the beginning of this string in place and is therefore cheaper (reminder: starting from Java 9, String is backed by an array of bytes, and each character is represented with either one or two bytes depending on the character encoding). We also don't have to worry about the length of the string, because if the string is shorter than the prefix, startsWith() returns false.
Implementation
That said, here's how an implementation might look:
public static int recursionCount(String str) {
    if(str.isEmpty()) {
        return 0;
    }
    return str.startsWith("eu") ?
        1 + recursionCount(str.substring(2)) : 1 + recursionCount(str.substring(1));
}
Note: besides being able to implement a solution, you also need to evaluate its time and space complexity.
In this case, because we create a new string with every call, the time complexity is quadratic, O(n^2) (reminder: creating the new string requires allocating memory and copying the bytes of the original string). The worst-case space complexity is also O(n^2), since each pending call can keep its own substring alive until the recursion unwinds.
There's a way of solving this problem recursively in linear time, O(n), without generating a new string at every call. For that we need to introduce a second argument, the current index, and each recursive call should advance this index by either 1 or 2 (I'm not going to implement this solution; I'm leaving it as an exercise for the OP/reader).
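For readers who want to check their attempt at that exercise, here is one possible sketch of the index-based variant. The class name and the two-argument helper are assumptions made for illustration; the only library call it relies on is String.startsWith(String prefix, int toffset), which compares in place at the given offset:

```java
final class EuCounter {
    // Counts the characters in str, treating each "eu" pair as a single character.
    static int count(String str) {
        return count(str, 0);
    }

    // Hypothetical helper: counts from position index to the end without copying the string.
    private static int count(String str, int index) {
        if (index >= str.length()) {
            return 0; // base case: nothing left to count
        }
        // Advance by 2 when "eu" starts at index, otherwise by 1.
        int step = str.startsWith("eu", index) ? 2 : 1;
        return 1 + count(str, index + step);
    }

    public static void main(String[] args) {
        System.out.println(count("geugeu")); // 4
    }
}
```

No strings are allocated, so the time is O(n). The recursion depth is still proportional to the input length, so the call stack uses O(n) space and very long inputs can overflow it.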
In addition
In addition, here's a concise and simple non-recursive solution using String.replace():
public static int count(String str) {
    return str.replace("eu", "_").length();
}
If you need to handle multiple combinations of characters (which were listed in the first version of the question), you can use a regular expression with String.replaceAll():
public static int count(String str) {
    return str.replaceAll("ue|au|oe|eu", "_").length();
}

Java IF Statement necessary?

I have:
String str = "Hello, how, are, you";
I want to create a helper method that removes the commas from any string. Which of the following is more accurate?
private static String removeComma(String str){
    if(str.contains(",")){
        str = str.replaceAll(",","");
    }
    return str;
}
OR
private static String removeComma(String str){
    str = str.replaceAll(",","");
    return str;
}
Seems like I don't need the IF statement but there might be a case where I do.
If there is a better way let me know.

Both are functionally equivalent but the former is more verbose and will probably be slower because it runs an extra operation.
Also note that you don't need replaceAll (which accepts a regular expression): replace will do.
So I would go for:
private static String removeComma(String str){
    return str.replace(",", "");
}

The IF statement is unnecessary, unless you're handling "large" strings (we're talking megabytes or more).
If you use the IF statement, your code first searches for the first occurrence of a comma and then performs the replacement. This can be costly if the comma is near the end of a large string, since the string has to be traversed twice.
Without the IF statement, commas are replaced if they exist. If there are none, the string is left untouched.
Bottom line: use the version without the IF statement.

Both are correct, but the second one is cleaner since the IF statement of the first alternative is not needed.

It depends on how likely the strings you handle are to contain a comma.
If the probability is high, call replaceAll without checking first.
But unless you are working with extremely large strings, I guess you will see no difference in performance at all.

Just another solution with time complexity O(n), space complexity O(n):
public static String removeComma(String str){
    int length = str.length();
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < length; i++) {
        char c = str.charAt(i);
        if (c != ',') {
            sb.append(c);
        }
    }
    return sb.toString();
}

Space complexity of a recursive algorithm

I have a recursive algorithm to check for a palindrome in Java. It should return true if the given string is a palindrome and false otherwise. The method uses substring, which makes the complexity a little trickier to determine.
Here's my algorithm:
static boolean isPalindrome (String str) {
    if (str.length() > 1) {
        if (str.charAt(0) == (str.charAt(str.length() - 1))) {
            if (str.length() == 2) return true;
            return isPalindrome(str.substring(1, str.length() - 1));
        }
        return false;
    }
    else {
        return true;
    }
}
What is the space complexity of this algorithm?
I mean, when I call substring(), does it create a new string every time? What does the substring method actually do in Java?

In older versions of Java (mainly Java 6 and earlier)*, substring returned a new instance that shared the internal char array of the longer string. In that case substring had a time and space complexity of O(1).
Newer versions use a different representation of String, which does not rely on a shared array. Instead, substring allocates a new array of just the required size and copies the contents from the longer string. In that case substring has a time and space complexity of O(n).
*Actually, the change was introduced in update 6 of Java 7.
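With the copying substring, the recursive calls allocate strings of length n-2, n-4, and so on, and each pending call still references its own copy, so the algorithm can use up to O(n^2) space. A minimal sketch of one way to avoid the copies is to pass indices instead of substrings. The class name and the three-argument helper below are illustrative assumptions, not part of the original question:

```java
final class PalindromeCheck {
    // Returns true if str reads the same forward and backward.
    static boolean isPalindrome(String str) {
        return isPalindrome(str, 0, str.length() - 1);
    }

    // Hypothetical helper: compares str[left] with str[right] and moves inward.
    private static boolean isPalindrome(String str, int left, int right) {
        if (left >= right) {
            return true; // zero or one character left to compare
        }
        if (str.charAt(left) != str.charAt(right)) {
            return false;
        }
        return isPalindrome(str, left + 1, right - 1);
    }

    public static void main(String[] args) {
        System.out.println(isPalindrome("racecar")); // true
        System.out.println(isPalindrome("abca"));    // false
    }
}
```

This version allocates no strings. Its space cost is the O(n) call stack, because Java does not eliminate the tail call.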

Java CharAt() and deleteCharAt() performance

I've been wondering about the implementation of the charAt function for String/StringBuilder/StringBuffer in Java.
What is its complexity?
Also, what about deleteCharAt() in StringBuffer/StringBuilder?

For String, StringBuffer, and StringBuilder, charAt() is a constant-time operation.
For StringBuffer and StringBuilder, deleteCharAt() is a linear-time operation.
StringBuffer and StringBuilder have very similar performance characteristics. The primary difference is that the former is synchronized (so is thread-safe) while the latter is not.

Let us look at the actual Java implementation (only the relevant code) of each of these methods in turn. That by itself will tell us about their efficiency. (The listings are from an older JDK in which these classes are backed by a char array.)
String.charAt :
public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}
As we can see, it is just a single array access which is a constant time operation.
StringBuffer.charAt :
public synchronized char charAt(int index) {
    if ((index < 0) || (index >= count))
        throw new StringIndexOutOfBoundsException(index);
    return value[index];
}
Again, single array access, so a constant time operation.
StringBuilder.charAt :
public char charAt(int index) {
    if ((index < 0) || (index >= count))
        throw new StringIndexOutOfBoundsException(index);
    return value[index];
}
Again, a single array access, so a constant-time operation. Even though these three methods look the same, there are some minor differences. For example, only StringBuffer.charAt is synchronized. Similarly, the bounds check is slightly different for String.charAt (guess why?). A closer look at these implementations reveals other minor differences among them.
Now, let us look at deleteCharAt implementations.
String does not have a deleteCharAt method. The reason might be that it is an immutable object, so exposing an API which explicitly indicates that it modifies the object is probably not a good idea.
Both StringBuffer and StringBuilder are subclasses of AbstractStringBuilder. The deleteCharAt method of these two classes delegates to the implementation in the parent class.
StringBuffer.deleteCharAt :
public synchronized StringBuffer deleteCharAt(int index) {
    super.deleteCharAt(index);
    return this;
}
StringBuilder.deleteCharAt :
public StringBuilder deleteCharAt(int index) {
    super.deleteCharAt(index);
    return this;
}
AbstractStringBuilder.deleteCharAt :
public AbstractStringBuilder deleteCharAt(int index) {
    if ((index < 0) || (index >= count))
        throw new StringIndexOutOfBoundsException(index);
    System.arraycopy(value, index+1, value, index, count-index-1);
    count--;
    return this;
}
A closer look at AbstractStringBuilder.deleteCharAt reveals that it calls System.arraycopy, which can be O(N) in the worst case. So deleteCharAt has O(N) time complexity.

The charAt method is O(1).
The deleteCharAt method on StringBuilder and StringBuffer is O(N) on average, assuming you are deleting a random character from an N character StringBuffer / StringBuilder. (It has to move, on average, half of the remaining characters to fill up the "hole" left by the deleted character. There is no amortization over multiple operations; see below.) However, if you delete the last character, the cost will be O(1).
There is no deleteCharAt method for String.
In theory, StringBuilder and StringBuffer could be optimized for the case where you are inserting or deleting multiple characters in a "pass" through the buffer. They could do this by maintaining an optional "gap" in the buffer, and moving characters across it. (IIRC, emacs implements its text buffers this way.) The problems with this approach are:
It requires more space, for the attributes that say where the gap is, and for the gap itself.
It makes the code a lot more complicated, and slows down other operations. For instance, charAt would have to compare the offset with the start and end points of the gap, and make the corresponding adjustments to the actual index value before fetching the character array element.
It is only going to help if the application does multiple inserts / deletes on the same buffer.
Not surprisingly, this "optimization" has not been implemented in the standard StringBuilder / StringBuffer classes. However, a custom CharSequence class could use this approach.
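To illustrate the gap idea, here is a minimal sketch of such a custom CharSequence. The class GapBuffer and all of its method names are hypothetical and exist only for this example; it is neither a JDK class nor a tuned implementation. Deleting or inserting next to the current gap position costs O(1), while moving the gap costs time proportional to the distance moved:

```java
final class GapBuffer implements CharSequence {
    private char[] data;
    private int gapStart; // index of the first free slot
    private int gapEnd;   // index one past the last free slot

    GapBuffer(CharSequence initial) {
        int n = initial.length();
        data = new char[n + 16];
        for (int i = 0; i < n; i++) data[i] = initial.charAt(i);
        gapStart = n;          // the gap initially sits at the end
        gapEnd = data.length;
    }

    @Override public int length() {
        return data.length - (gapEnd - gapStart);
    }

    @Override public char charAt(int index) {
        if (index < 0 || index >= length()) throw new IndexOutOfBoundsException("index " + index);
        // Indices at or past the gap must skip over it.
        return index < gapStart ? data[index] : data[index + (gapEnd - gapStart)];
    }

    // Moves the gap so that it starts at logical position pos.
    private void moveGap(int pos) {
        if (pos < gapStart) {
            int n = gapStart - pos;
            System.arraycopy(data, pos, data, gapEnd - n, n);
            gapStart -= n;
            gapEnd -= n;
        } else if (pos > gapStart) {
            int n = pos - gapStart;
            System.arraycopy(data, gapEnd, data, gapStart, n);
            gapStart += n;
            gapEnd += n;
        }
    }

    void deleteCharAt(int index) {
        if (index < 0 || index >= length()) throw new IndexOutOfBoundsException("index " + index);
        moveGap(index);
        gapEnd++; // the deleted character is absorbed into the gap
    }

    void insert(int index, char c) {
        if (index < 0 || index > length()) throw new IndexOutOfBoundsException("index " + index);
        if (gapStart == gapEnd) grow();
        moveGap(index);
        data[gapStart++] = c;
    }

    // Reallocates with a fresh gap; only called when the gap is empty.
    private void grow() {
        char[] bigger = new char[data.length * 2 + 16];
        int tail = data.length - gapEnd;
        System.arraycopy(data, 0, bigger, 0, gapStart);
        System.arraycopy(data, gapEnd, bigger, bigger.length - tail, tail);
        gapEnd = bigger.length - tail;
        data = bigger;
    }

    @Override public CharSequence subSequence(int start, int end) {
        return toString().substring(start, end);
    }

    @Override public String toString() {
        return new String(data, 0, gapStart) + new String(data, gapEnd, data.length - gapEnd);
    }
}
```

Note that charAt now carries the extra comparison against the gap position, which is exactly the slowdown of other operations described above.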

charAt is super fast (and can use intrinsics for String), it's a simple index into an array. deleteCharAt would require an arraycopy, thus deleting a char won't be fast.

A String in the JDK is backed by an array, which supports random access by index. Therefore the time complexity of charAt is O(1). As with any array, deleting an element takes O(n) time, because the elements after it have to be shifted.

Summary of all responses from above:
charAt is O(1) since it's just accessing an index of an array.
deleteCharAt can be O(N) in the worst case since it shifts every character after the deleted one.

The time complexity for a code segment

In some online notes, I read the following Java code snippet for reversing a string, which is claimed to have quadratic time complexity. It seems to me that the "for" loop over i just iterates over the whole length of s. How does that cause quadratic time complexity?
public static String reverse(String s)
{
    String rev = new String();
    for (int i = (s.length()-1); i>=0; i--) {
        rev = rev.append(s.charAt(i));
    }
    return rev.toString();
}

public static String reverse(String s)
{
    String rev = " ";
    for (int i=s.length()-1; i>=0; i--)
        rev.append(s.charAt(i)); // <--------- This is O(n)
    return rev.toString();
}
I copy-pasted your code. I'm not sure where you got this, but String doesn't actually have an append method. Maybe rev is a StringBuilder or another Appendable.

Possibly because the append call does not execute in constant time. If it's linear with the length of the string, that would explain it.

append has to find the end of the string, which is O(n). So, you have an O(n) loop executed O(n) times.

I don't think String has an append method, so this code won't compile.
But, coming to the problem of quadratic complexity, let us assume that you are actually appending a character to the string using the '+' operator or the String.concat() method.
String objects are immutable. So, whenever you append to a string, a new, longer string is created, the old string's contents are copied into it, the new character is appended, and the previous string becomes garbage. This process takes more and more time as the string grows.
The loop runs O(n) times, and on every iteration you take O(n) time to copy the string character by character. This leads to quadratic complexity.
It would be better to use StringBuilder or StringBuffer. I guess the time complexity you mentioned applies to older Java compilers; newer compilers implement the '+' operation with a StringBuilder. Note, though, that inside a loop the accumulated string is still copied on every iteration, so it is an explicit StringBuilder created outside the loop that avoids the quadratic cost.
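As a sketch of that suggestion (the class name is illustrative), the loop from the question can append to a single StringBuilder created before the loop. Each append is amortized O(1), so the whole reversal is O(n):

```java
final class ReverseDemo {
    // Reverses s by appending its characters, last to first, to one builder.
    static String reverse(String s) {
        // Pre-sizing the builder avoids intermediate resizes.
        StringBuilder rev = new StringBuilder(s.length());
        for (int i = s.length() - 1; i >= 0; i--) {
            rev.append(s.charAt(i));
        }
        return rev.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("abc")); // cba
    }
}
```

In practice new StringBuilder(s).reverse().toString() does the same job in one line; the explicit loop is kept here only to mirror the code in the question.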
