which code is more efficient? - java

Which of the following is the more efficient way to reverse the words in a string?
public String Reverse(StringTokenizer st){
    String[] words = new String[st.countTokens()];
    int i = 0;
    while(st.hasMoreTokens()){
        words[i] = st.nextToken(); i++;
    }
    String output = "";
    for(int j = words.length-1; j >= 0; j--)
        output += words[j]+" ";
    return output;
}
OR
public String Reverse(StringTokenizer st, String output){
if(!st.hasMoreTokens()) return output;
output = st.nextToken()+" "+output;
return Reverse(st, output);}
public String ReverseMain(StringTokenizer st){
return Reverse(st, "");}
While the first way seems more readable and straightforward, it uses two loops. In the second method I've tried a tail-recursive approach, but I am not sure whether Java optimizes tail-recursive code.

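You could do this with just one explicit loop by filling the array from the back:
public String Reverse(StringTokenizer st){
    int length = st.countTokens();
    String[] words = new String[length];
    int i = length - 1;
    while(i >= 0){
        words[i] = st.nextToken(); i--;
    }
    return String.join(" ", words); // String.join needs Java 8+
}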

But I am not sure whether java does optimize tail-recursive code.
It doesn't. Or at least the Sun/Oracle Java implementations don't, up to and including Java 7.
References:
"Tail calls in the VM" by John Rose @ Oracle.
Bug 4726340 - RFE: Tail Call Optimization
I don't know whether this makes one solution faster than the other. (Test it yourself ... taking care to avoid the standard micro-benchmarking traps.)
However, the fact that Java doesn't implement tail-call optimization means that the 2nd solution is liable to run out of stack space if you give it a string with a large (enough) number of words.
Finally, if you are looking for a more space-efficient way to implement this, there is a clever way that uses just a StringBuilder:
Create a StringBuilder from your input String
Reverse the characters in the StringBuilder using reverse().
Step through the StringBuilder, identifying the start and end offset of each word. For each start/end offset pair, reverse the characters between the offsets. (You have to do this using a loop.)
Turn the StringBuilder back into a String.
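The four steps above can be sketched roughly as follows. This is only an illustration: the class and method names are made up, and it assumes words are separated by single spaces and that the text contains no surrogate pairs.

```java
final class WordReverser {
    /** Reverses the order of space-separated words; each word stays readable. */
    static String reverseWords(String input) {
        StringBuilder sb = new StringBuilder(input);   // step 1
        sb.reverse();                                  // step 2: "ab cd" becomes "dc ba"
        int start = 0;
        for (int i = 0; i <= sb.length(); i++) {       // step 3
            // a space or the end of the buffer closes the current word
            if (i == sb.length() || sb.charAt(i) == ' ') {
                reverseRange(sb, start, i - 1);
                start = i + 1;
            }
        }
        return sb.toString();                          // step 4
    }

    /** Swaps characters pairwise between the inclusive offsets lo and hi. */
    private static void reverseRange(StringBuilder sb, int lo, int hi) {
        while (lo < hi) {
            char tmp = sb.charAt(lo);
            sb.setCharAt(lo++, sb.charAt(hi));
            sb.setCharAt(hi--, tmp);
        }
    }
}
```

Apart from the StringBuilder itself and the final String, no per-word objects are allocated.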

You can compare them by timing both on a large number of inputs,
e.g. reverse 100000000 strings with each version and see how many seconds it takes. Comparing system timestamps taken at the start and at the end of each run gives you the difference between the two functions.
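A rough sketch of such a timing loop, assuming System.nanoTime() as the timestamp source. The names ReverseTiming, reverseIterative and timeNanos are hypothetical, and a naive loop like this is exposed to the usual micro-benchmarking traps (JIT warm-up, dead-code elimination), so treat the numbers as an indication only.

```java
import java.util.StringTokenizer;

final class ReverseTiming {
    // Written to on every call so the JIT cannot discard the work being timed.
    static volatile int sink;

    // Iterative variant standing in for the first approach in the question.
    static String reverseIterative(String s) {
        StringTokenizer st = new StringTokenizer(s);
        String[] words = new String[st.countTokens()];
        for (int i = words.length - 1; i >= 0; i--) {
            words[i] = st.nextToken();
        }
        return String.join(" ", words);
    }

    /** Returns the elapsed nanoseconds for `runs` calls on the same input. */
    static long timeNanos(String input, int runs) {
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink = reverseIterative(input).length();
        }
        return System.nanoTime() - start;
    }
}
```

Calling timeNanos once and discarding the result before the measured run gives the JIT a chance to compile the method first.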

StringTokenizer is not deprecated but if you read the current JavaDoc...
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
String[] strArray = str.split(" ");
StringBuilder sb = new StringBuilder();
for (int i = strArray.length - 1; i >= 0; i--)
    sb.append(strArray[i]).append(" ");
String reversedWords = sb.substring(0, sb.length() - 1); // strip trailing space


Android String.split("") returning extra element

I am trying to split a word into its individual letters.
I tried both String.split("") and String.split("|"); however, when I split a word it creates an extra empty element.
Example:
word = "word";
int n = word.length();
Log.i("20",Integer.toString(n));
String[] letters = word.split("|");
Log.i("25",Integer.toString(letters.length));
The output in the Android Monitor is:
07-21 15:50:23.084 5711-5711/com.strizhevskiy.movetester I/20: 4
07-21 15:50:23.085 5711-5711/com.strizhevskiy.movetester I/25: 5
I put the individual letters into TextView blocks and I can actually see an extra empty TextView.
When I test these methods in my regular Java it outputs the expected answer: 4.
I am almost tempted to think this is an actual bug in Android's implementation of the method.
I am thinking you want to do this:
public Character[] toCharacterArray( String s ) {
if ( s == null ) {
return null;
}
int len = s.length();
Character[] array = new Character[len];
for (int i = 0; i < len ; i++) {
array[i] = s.charAt(i); // autoboxed; new Character(...) is deprecated since Java 9
}
return array;
}
Instead of splitting a word without delimiters?
I hope this helps!
It's hard to say whether it's a bug or expected behavior, because what you are doing doesn't make much sense. You are trying to split a string on a logical OR with nothing on either side (split expects a regular expression, not a plain string), so the result could differ between Android and desktop Java, and I don't see an issue there.
Anyway, there are many ways to achieve what you want in a normal way, e.g. iterating over the word char by char in a loop, or just using String's toCharArray method.
Thank you for the suggestions. My current work-around is to use a mock array and copying over into a fresh array using System.arraycopy().
String[] mockLetters = word.split("");
int n = word.length();
String[] letters = new String[n];
System.arraycopy(mockLetters,1,letters,0,n);
I appreciate the suggestions to use toCharArray(). However, these letters then get put into TextViews, and TextView doesn't seem to accept char. I could, of course, make it work, but I've decided to stick with what I currently have.
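For what it's worth, the toCharArray() suggestion can still produce Strings, which sidesteps the split behaviour entirely. A minimal sketch (Letters and toLetters are made-up names):

```java
final class Letters {
    /** Returns each character of `word` as its own one-character String. */
    static String[] toLetters(String word) {
        char[] chars = word.toCharArray();
        String[] letters = new String[chars.length];
        for (int i = 0; i < chars.length; i++) {
            letters[i] = String.valueOf(chars[i]); // char -> String
        }
        return letters;
    }
}
```

The result has exactly word.length() elements regardless of which split implementation the platform ships.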
Tom, in a comment to my question, answered my underlying issue:
Why String.split() worked differently in Android than it does in Java?
Apparently the rules for String.split() changed with Java 8.
Try passing 0 as the limit, per the documentation below, so that trailing empty strings are discarded.
String[] split(String regex, int limit)
If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.
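To illustrate what the limit argument does, here is a small sketch using a comma delimiter (the class and method names are invented for the example). Note that the limit only concerns trailing empty strings; split(regex) on its own already behaves like a limit of 0.

```java
final class SplitLimitDemo {
    /** Same as split(",", 0): trailing empty strings are dropped. */
    static String[] dropTrailing(String s) {
        return s.split(",");
    }

    /** A negative limit keeps every piece, including trailing empty strings. */
    static String[] keepTrailing(String s) {
        return s.split(",", -1);
    }
}
```

For the input "a,b,," the first method returns two elements and the second returns four.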

optimization - converting std input to integer array in java

I want to read each line of input, store the numbers in an int[] array, perform some calculations, then move on to the next line of input as fast as possible.
Input (stdin)
2 4 8
15 10 5
12 14 3999 -284 -71
0 -213 18 4 2
0
This is a pure optimization problem and not entirely good practice in the real world, as I'm assuming perfect input. I'm interested in how to improve my current method for reading input from stdin and representing it as an integer array. I have seen approaches that use Scanner and its nextInt method, but I've read in multiple places that Scanner is a lot slower than BufferedReader.
Can this taking in of input step be improved?
Current Method
BufferedReader bufferedInput = new BufferedReader(new InputStreamReader(System.in));
String line;
String[] lineArray;
try{
// a line with just "0" indicates end of std input
while((line = bufferedInput.readLine()) != null && !line.equals("0")){ // compare Strings with equals(), not !=
lineArray = line.split("\\s+"); // is "\\s+" the optimized regex
int arrlength = lineArray.length;
int[] lineInt = new int[arrlength];
for(int i = 0; i < arrlength; i++){
lineInt[i] = Integer.parseInt(lineArray[i]);
}
// Perform some operations on lineInt, then regenerate a new
// lineInt with inputs from next line of stdin
}
}catch(IOException e){
}
Judging from other questions (e.g. "Difference between parseInt and valueOf in java?"), parseInt seems to be the most efficient method for converting strings to integers. Any enlightenment would be of great help.
Thank you :)
Edit 1: removed GCD information and 'algorithm' tag
Edit 2: (hopefully) made question more concise, grammatical fix ups
First of all, I just want to point out that it is totally pointless to optimize in your particular example.
For your example, most people would agree that the best solution is not the optimal one. Rather, the most readable solution will be the best.
Having said that, if you want the most optimal solution, then don't use Scanner, don't use BufferedReader.readLine(), don't use String.split and don't use Integer.parseInt(...).
Instead read characters one at a time using BufferedReader.read() and parse and convert them to int by hand. You also need to implement your own "extendable array of int" type that behaves like an ArrayList<Integer>.
This is a lot of (unnecessary) work, and many more lines of code to maintain. BAD IDEA ...
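To give an idea of what that hand-rolled approach looks like, here is a rough sketch. FastIntReader and readLine are hypothetical names; the code assumes perfectly formed input (digits, minus signs, spaces, line breaks) and does no overflow checking.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Arrays;

final class FastIntReader {
    /**
     * Reads one line of space-separated integers, one character at a time.
     * Returns an empty array for a blank line or at end of input.
     */
    static int[] readLine(Reader in) throws IOException {
        int[] buf = new int[8];             // hand-rolled growable int array
        int size = 0;
        int current = 0;
        boolean negative = false;
        boolean inNumber = false;
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c >= '0' && c <= '9') {
                current = current * 10 + (c - '0');
                inNumber = true;
            } else if (c == '-') {
                negative = true;
            } else if (inNumber) {          // a space or '\r' ends the number
                if (size == buf.length) buf = Arrays.copyOf(buf, size * 2);
                buf[size++] = negative ? -current : current;
                current = 0;
                negative = false;
                inNumber = false;
            }
        }
        if (inNumber) {                     // last number on the line
            if (size == buf.length) buf = Arrays.copyOf(buf, size * 2);
            buf[size++] = negative ? -current : current;
        }
        return Arrays.copyOf(buf, size);
    }
}
```

In practice the Reader would be a BufferedReader wrapped around System.in, so that each read() is served from the buffer rather than from the underlying stream.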
I second what Stephen said, the speed of parsing is likely to massively outperform the speed of actual I/O done, therefore improving parsing won't give you much.
Seriously, don't do this unless you've built the whole system, profiled it and found that inefficient parsing is what keeps it from hitting its performance targets.
But strictly just as an exercise, and because the general principle may be useful elsewhere, here's an example of how to parse it straight from a string.
The assumptions are:
You will use a sensible encoding, where the characters 0..9 are consecutive.
The only characters in the stream will be 0..9, minus sign and space.
All the numbers are well-formed.
Another important caveat is that for the sake of simplicity I used ArrayList, which is a bad idea for storing primitives, the overhead of boxing/unboxing probably wipes out all improvement in parsing speed. In the real world I'd use a list variant custom-made for primitives.
public static List<Integer> parse(String s) {
List<Integer> ret = new ArrayList<Integer>();
int sign = 1;
int current = 0;
boolean inNumber = false;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (c >= '0' && c <= '9') { //we assume a sensible encoding
current = current * 10 + sign * (c-'0');
inNumber = true;
}
else if (c == ' ' && inNumber) {
ret.add(current);
current = 0;
inNumber = false;
sign = 1;
}
else if (c == '-') {
sign = -1;
}
}
if (inNumber) {
ret.add(current);
}
return ret;
}

Java IF Statement necessary?

I have:
String str = "Hello, how, are, you";
I want to create a helper method that removes the commas from any string. Which of the following is more accurate?
private static String removeComma(String str){
if(str.contains(",")){
str = str.replaceAll(",","");
}
return str;
}
OR
private static String removeComma(String str){
str = str.replaceAll(",","");
return str;
}
Seems like I don't need the IF statement but there might be a case where I do.
If there is a better way let me know.
Both are functionally equivalent but the former is more verbose and will probably be slower because it runs an extra operation.
Also note that you don't need replaceAll (which accepts a regular expression): replace will do.
So I would go for:
private static String removeComma(String str){
return str.replace(",", "");
}
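For a comma the two methods give the same result, because a comma has no special meaning in a regular expression. The difference shows up with regex metacharacters such as a dot. A small sketch (ReplaceDemo and its methods are invented for illustration):

```java
final class ReplaceDemo {
    /** Literal replacement: only real dots are removed. */
    static String stripDotsLiteral(String s) {
        return s.replace(".", "");
    }

    /** Regex replacement: "." matches any character, so every character is removed. */
    static String stripDotsRegex(String s) {
        return s.replaceAll(".", "");
    }
}
```

With the input "1.5.2" the literal version yields "152" while the regex version yields an empty string, which is rarely what was intended.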
The IF statement is unnecessary, and with "large" strings (we're talking megabytes or more) it can even cost you.
If you use the IF statement, your code will first search for the first occurrence of a comma, and then execute the replacement. This could be costly if the comma is near the end of the string and your string is large, since the string will have to be traversed twice.
Without the IF statement, commas will be replaced if they exist. If there are none, your string is left untouched.
Bottom line: use the version without the IF statement.
Both are correct, but the second one is cleaner since the IF statement of the first alternative is not needed.
It depends on how likely it is that a string in your universe of strings contains a comma.
If the probability is high, call replaceAll without checking first.
BUT if you are not using extremely large strings, I guess you will see no difference in performance at all.
Just another solution with time complexity O(n), space complexity O(n):
public static String removeComma(String str){
int length = str.length();
StringBuffer sb = new StringBuffer();
for (int i = 0; i < length; i++) {
char c = str.charAt(i);
if (c != ',') {
sb.append(c);
}
}
return sb.toString();
}

Best way to modify an existing string? StringBuilder or convert to char array and back to string?

I'm learning Java and am wondering what's the best way to modify strings here (both for performance and to learn the preferred method in Java). Assume you're looping through a string and checking each character/performing some action on that index in the string.
Do I use the StringBuilder class, or convert the string into a char array, make my modifications, and then convert the char array back to a string?
Example for StringBuilder:
StringBuilder newString = new StringBuilder(oldString);
for (int i = 0; i < oldString.length() ; i++) {
newString.setCharAt(i, 'X');
}
Example for char array conversion:
char[] newStringArray = oldString.toCharArray();
for (int i = 0; i < oldString.length() ; i++) {
    newStringArray[i] = 'X';
}
String myString = String.valueOf(newStringArray);
What are the pros/cons to each different way?
I take it that StringBuilder is going to be more efficient since the converting to a char array makes copies of the array each time you update an index.
I say do whatever is most readable/maintainable until you know that String "modification" is slowing you down. To me, this is the most readable:
String s = "foo";
s += "bar";
s += "baz";
If that's too slow, I'd use a StringBuilder. You may want to compare this to StringBuffer. If performance matters and synchronization does not, StringBuilder should be faster. If synchronization is needed, then you should use StringBuffer.
It's also important to know that these strings are not being modified. In Java, Strings are immutable.
This is all context specific. If you optimize this code and it doesn't make a noticeable difference (and this is usually the case), then you just thought longer than you had to and you probably made your code more difficult to understand. Optimize when you need to, not because you can. And before you do that, make sure the code you're optimizing is the cause of your performance issue.
What are the pros/cons to each different way? I take it that StringBuilder is going to be more efficient since the converting to a char array makes copies of the array each time you update an index.
As written, the code in your second example will create just two arrays: one when you call toCharArray(), and another when you call String.valueOf() (String stores data in a char[] array). The element manipulations you are performing should not trigger any object allocations. There are no copies being made of the array when you read or write an element.
If you are going to be doing any sort of String manipulation, the recommended practice is to use a StringBuilder. If you are writing very performance-sensitive code, and your transformation does not alter the length of the string, then it might be worthwhile to manipulate the array directly. But since you are learning Java as a new language, I am going to guess that you are not working in high frequency trading or any other environment where latency is critical. Therefore, you are probably better off using a StringBuilder.
If you are performing any transformations that might yield a string of a different length than the original, you should almost certainly use a StringBuilder; it will resize its internal buffer as necessary.
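As an illustration of a length-changing transformation, here is a small sketch that expands each tab into four spaces. The class name, the method name and the choice of four spaces are assumptions made for the example.

```java
final class TabExpander {
    /** Replaces every tab with four spaces, so the result can be longer than the input. */
    static String expandTabs(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\t') {
                sb.append("    ");          // the internal buffer grows when needed
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Doing the same with a char[] would mean counting the tabs first to size the array, which is the bookkeeping StringBuilder takes care of.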
On a related note, if you are doing simple string concatenation (e.g, s = "a" + someObject + "c"), the compiler will actually transform those operations into a chain of StringBuilder.append() calls, so you are free to use whichever you find more aesthetically pleasing. I personally prefer the + operator. However, if you are building up a string across multiple statements, you should create a single StringBuilder.
For example:
public String toString() {
return "{field1 =" + this.field1 +
", field2 =" + this.field2 +
...
", field50 =" + this.field50 + "}";
}
Here, we have a single, long expression involving many concatenations. You don't need to worry about hand-optimizing this, because the compiler will use a single StringBuilder and just call append() on it repeatedly.
String s = ...;
if (someCondition) {
s += someValue;
}
s += additionalValue;
return s;
Here, you'll end up with two StringBuilders being created under the covers, but unless this is an extremely hot code path in a latency-critical application, it's really not worth fretting about. Given similar code, but with many more separate concatenations, it might be worth optimizing. Same goes if you know the strings might be very large. But don't just guess--measure! Demonstrate that there's a performance problem before you try to fix it. (Note: this is just a general rule for "micro optimizations"; there's rarely a downside to explicitly using a StringBuilder. But don't assume it will make a measurable difference: if you're concerned about it, you should actually measure.)
String s = "";
for (final Object item : items) {
s += item + "\n";
}
Here, we're performing a separate concatenation operation on each loop iteration, which means a new StringBuilder will be allocated on each pass. In this case, it's probably worth using a single StringBuilder since you may not know how large the collection will be. I would consider this an exception to the "prove there's a performance problem before optimizing" rule: if the operation has the potential to explode in complexity based on input, err on the side of caution.
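The loop rewritten with one explicit StringBuilder might look like the sketch below (JoinLines and joinLines are illustrative names, and a List stands in for the items collection):

```java
import java.util.List;

final class JoinLines {
    /** Appends "item\n" for every element using one StringBuilder for the whole loop. */
    static String joinLines(List<?> items) {
        StringBuilder sb = new StringBuilder();
        for (Object item : items) {
            sb.append(item).append('\n');
        }
        return sb.toString();
    }
}
```

Each character is copied into the builder once (plus occasional buffer growth) instead of the whole accumulated string being recopied on every iteration.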
Which option will perform the best is not an easy question.
I did a benchmark using Caliper:
METHOD          RUNTIME (NS)
array           88
builder         126
builderTillEnd  76
concat          3435
Benchmarked methods:
public static String array(String input)
{
char[] result = input.toCharArray(); // COPYING
for (int i = 0; i < input.length(); i++)
{
result[i] = 'X';
}
return String.valueOf(result); // COPYING
}
public static String builder(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result.toString(); // COPYING
}
public static StringBuilder builderTillEnd(String input)
{
StringBuilder result = new StringBuilder(input); // COPYING
for (int i = 0; i < input.length(); i++)
{
result.setCharAt(i, 'X');
}
return result;
}
public static String concat(String input)
{
String result = "";
for (int i = 0; i < input.length(); i++)
{
result += 'X'; // terrible COPYING, COPYING, COPYING... same as:
// result = new StringBuilder(result).append('X').toString();
}
return result;
}
Remarks
If we want to modify a String, we have to do at least 1 copy of that input String, because Strings in Java are immutable.
java.lang.StringBuilder extends java.lang.AbstractStringBuilder. StringBuilder.setCharAt() is inherited from AbstractStringBuilder and looks like this:
public void setCharAt(int index, char ch) {
if ((index < 0) || (index >= count))
throw new StringIndexOutOfBoundsException(index);
value[index] = ch;
}
AbstractStringBuilder internally uses the simplest char array: char value[]. So, result[i] = 'X' is very similar to result.setCharAt(i, 'X'), however the second will call a polymorphic method (which probably gets inlined by JVM) and check bounds in if, so it will be a bit slower.
Conclusions
If you can operate on StringBuilder until the end (you don't need String back) - do it. It's the preferred way and also the fastest. Simply the best.
If you want String in the end and this is the bottleneck of your program, then you might consider using char array. In benchmark char array was ~25% faster than StringBuilder. Be sure to properly measure execution time of your program before and after optimization, because there is no guarantee about this 25%.
Never concatenate Strings in a loop with + or +=, unless you really know what you are doing. Usually it's better to use an explicit StringBuilder and append().
I'd prefer to use the StringBuilder class where the original string is modified.
For String manipulation, I like the StringUtils class. You'll need the Apache Commons Lang dependency to use it.

Splitting string N into N/X strings

I would like some guidance on how to split a string into N separate strings based on an arithmetic operation; for example string.length()/300.
I am aware of ways to do it with delimiters such as
testString.split(",");
but how does one use greedy/reluctant/possessive quantifiers with the split method?
Update: As requested, a similar example of what I am looking to achieve:
String X = "32028783836295C75546F7272656E745C756E742E657865000032002E002E005C0";
Resulting in X/3 (more or less... done by hand)
X[0] = 32028783836295C75546F
X[1] = 6E745C756E742E6578650
X[2] = 65000032002E002E005C0
Don't worry about explaining how to put it into the array; I have no problem with that, only with how to split using an arithmetic operation instead of a delimiter.
You could do that by splitting on (?<=\G.{5}) whereby the string aaaaabbbbbccccceeeeefff would be split into the following parts:
aaaaa
bbbbb
ccccc
eeeee
fff
The \G matches the (zero-width) position where the previous match occurred. Initially, \G starts at the beginning of the string. Note that by default the . meta char does not match line breaks, so if you want it to match every character, enable DOT-ALL: (?s)(?<=\G.{5}).
A demo:
class Main {
public static void main(String[] args) {
int N = 5;
String text = "aaaaabbbbbccccceeeeefff";
String[] tokens = text.split("(?<=\\G.{" + N + "})");
for(String t : tokens) {
System.out.println(t);
}
}
}
which can be tested online here: http://ideone.com/q6dVB
EDIT
Since you asked for documentation on regex, here are the specific tutorials for the topics the suggested regex contains:
\G, see: http://www.regular-expressions.info/continue.html
(?<=...), see: http://www.regular-expressions.info/lookaround.html
{...}, see: http://www.regular-expressions.info/repeat.html
If there's a fixed length that you want each String to be, you can use Guava's Splitter:
int length = string.length() / 300;
Iterable<String> splitStrings = Splitter.fixedLength(length).split(string);
Each String in splitStrings with the possible exception of the last will have a length of length. The last may have a length between 1 and length.
Note that unlike String.split, which first builds an ArrayList<String> and then uses toArray() on that to produce the final String[] result, Guava's Splitter is lazy and doesn't do anything with the input string when split is called. The actual splitting and returning of strings is done as you iterate through the resulting Iterable. This allows you to just iterate over the results without allocating a data structure and storing them all or to copy them into any kind of Collection you want without going through the intermediate ArrayList and String[]. Depending on what you want to do with the results, this can be considerably more efficient. It's also much more clear what you're doing than with a regex.
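If pulling in Guava is not an option, a similar fixed-length split can be sketched in plain Java. Note this version is eager, not lazy like Splitter, and the names FixedLengthSplit and chunks are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class FixedLengthSplit {
    /** Cuts `text` into pieces of `size` characters; the last piece may be shorter. */
    static List<String> chunks(String text, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < text.length(); start += size) {
            // clamp the end index so the last piece cannot run past the string
            int end = Math.min(start + size, text.length());
            parts.add(text.substring(start, end));
        }
        return parts;
    }
}
```

For the question's use case, size would be computed from the string length first, e.g. string.length() / 300, taking care that the result is not zero for short strings.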
How about plain old String.substring? On JVMs before Java 7u6 it is memory friendly, as it reuses the original char array; newer versions copy the characters instead.
Well, I think this is probably as efficient a way to do this as any other.
int N = 300;
int sublen = testString.length() / N;
String[] subs = new String[N];
for (int i = 0; i < N; i++) {
    // the last piece also takes any remainder characters
    int end = (i == N - 1) ? testString.length() : (i + 1) * sublen;
    subs[i] = testString.substring(i * sublen, end);
}
You can do it faster if you need the items as a char[] array rather than as individual Strings, depending on how you need to use the results, e.g. using testString.toCharArray().
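Dunno, you'll probably need a method that takes a string and an int number of parts and returns the pieces. Something like this (note it returns one extra, shorter piece when the length is not divisible by parts):
import java.util.ArrayList;
public static String[] splitInto(String splitString, int parts)
{
    int dlength = Math.max(1, splitString.length() / parts); // avoid a zero step
    ArrayList<String> retVal = new ArrayList<String>();
    for (int i = 0; i < splitString.length(); i += dlength)
    {
        // clamp the end index so the last piece cannot run past the string
        retVal.add(splitString.substring(i, Math.min(i + dlength, splitString.length())));
    }
    return retVal.toArray(new String[0]);
}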
