Compile time optimization of String concatenation - java

I'm curious how far the optimization of the following code snippet will go.
As far as I know, whenever the capacity of a StringBuilder is exceeded, it costs some CPU work, because its contents have to be reallocated. However, I would guess the Java compiler could precalculate the required capacity instead of performing multiple reallocations.
The question is: will the following snippet of code be optimized in that way?
public static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
    StringBuilder stringBuilder = new StringBuilder();
    parameters.forEach(
        (key, value) -> stringBuilder.append(key).append("=").append(value).append("&"));
    return baseURL + "?" + stringBuilder.deleteCharAt(stringBuilder.length() - 1); // drop the trailing '&'
}

In Java, most optimization is performed by the runtime's just-in-time compiler, so javac optimizations generally don't matter much.
As a consequence, a Java compiler is not required to optimize string concatenation, though all tend to do so as long as no loops are involved. You can check the extent of such compile-time optimizations by using javap (the bytecode disassembler included with the JDK).
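For example, here is a minimal class you can disassemble yourself (a sketch; the class and method names are mine):
public class Concat {
    // Compile with `javac Concat.java`, then inspect with `javap -c Concat`.
    // On Java 9+, javac typically emits a single invokedynamic call to
    // StringConcatFactory.makeConcatWithConstants for this expression;
    // on Java 8 it emits an explicit StringBuilder append chain.
    static String url(String base, String query) {
        return base + "?" + query;
    }
}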
So, could javac conceivably optimize this? To determine the required length of the string builder, it would have to iterate the map twice. Since Java does not feature const references, and the compiler has no special treatment for Map, the compiler cannot determine that this rewrite would preserve the meaning of the code. And even if it could, it's not at all clear that the gains would be worth the cost of iterating twice. After all, modern processors can copy 4 to 8 characters in a single CPU instruction. Since the copy is sequential memory access, there won't be many cache misses while growing the buffer. On the other hand, iterating the map a second time will likely cause additional cache misses, because the Map entries (and the strings they reference) can be scattered all over main memory.
In any case, I would not worry about the efficiency of this code. Even if your URL is 1000 characters long, resizing the buffer will take about 0.1 microseconds. Unless you have evidence that this really is a performance hotspot, your time is probably better spent elsewhere.

First of all:
You can find out what (javac) compile time optimizations occur by looking at the bytecodes using the javap tool.
You can find out what JIT compiler optimizations are performed by getting the JVM to dump the native code.
So, if you need to know how your code has been optimized (on a particular platform) for practical reasons, then you should check.
In reality, the optimizations by javac are pretty simple-minded, and do not go to the extent of precalculating buffer sizes. I haven't checked, but I expect that the same is true for the JIT compiler. I doubt that it makes any attempt to preallocate a StringBuilder with an "optimal" size.
Why?
The reasons include the following:
An inaccurate precalculation (on average) doesn't help, and may be worse than doing nothing.
An accurate precalculation typically involves measuring the (dynamic) lengths of the actual strings to be joined (a hand-rolled version is sketched after this list).
Implementing the optimization logic would be complicated, and would make the optimizers slower and more effort to maintain.
At runtime, measuring the strings introduces overhead. Whether you would come out ahead often enough to make a difference is hard to determine. (People don't like optimizations that make their code run slower ...)
There are better (more efficient) ways to do large-scale text assembly than using String concatenation. The programmer (who has more knowledge of the problem domain and application logic) can optimize this better than a compiler, if it matters enough to spend the developer effort on it.
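For illustration, a hand-rolled version of the precalculation described above might look like this (a sketch only, not anything the compiler produces; it iterates the map twice, once to measure and once to append):
public static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
    // First pass: measure, so the builder never has to grow.
    int capacity = baseURL.length() + 1; // room for '?'
    for (Map.Entry<String, String> e : parameters.entrySet()) {
        capacity += e.getKey().length() + e.getValue().length() + 2; // '=' and '&'
    }
    // Second pass: append into a builder that is already big enough.
    StringBuilder sb = new StringBuilder(capacity);
    sb.append(baseURL).append('?');
    for (Map.Entry<String, String> e : parameters.entrySet()) {
        sb.append(e.getKey()).append('=').append(e.getValue()).append('&');
    }
    sb.setLength(sb.length() - 1); // drop the trailing '&' (assumes a non-empty map, as in the question)
    return sb.toString();
}
Whether the second iteration pays for itself is exactly the cache-miss trade-off discussed above.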

One optimization is to put the baseURL and the '?' separator into the StringBuilder up front, instead of using String concatenation at the end, such as:
public static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
    StringBuilder stringBuilder = new StringBuilder(baseURL);
    stringBuilder.append("?");
    parameters.forEach((key, value) -> stringBuilder.append(key).append("=").append(value).append("&"));
    stringBuilder.setLength(stringBuilder.length() - 1); // drop the trailing '&'
    return stringBuilder.toString();
}
If you want a little more speed, and since neither javac nor the JIT will optimize based on potential string size, you can track that yourself without incurring much overhead by adding a maximum-size tracker, such as this:
protected static int URL_SIZE = 256;

public static String getGetRequestURL(String baseURL, Map<String, String> parameters) {
    StringBuilder stringBuilder = new StringBuilder(URL_SIZE);
    stringBuilder.append(baseURL);
    stringBuilder.append("?");
    parameters.forEach((key, value) -> stringBuilder.append(key).append("=").append(value).append("&"));
    int size = stringBuilder.length();
    if (size > URL_SIZE) {
        URL_SIZE = size; // remember the largest URL built so far
    }
    stringBuilder.setLength(size - 1); // drop the trailing '&'
    return stringBuilder.toString();
}
That said, with some testing of 1 million calls, I found that the different versions performed as follows (in milliseconds):
Your version: total = 1151, average = 230
Above version 1: total = 936, average = 187
Above version 2: total = 839, average = 167

Related

Is LinkedList.toString().replace() O(2*k)?

I know that converting a suitable object (e.g., a linked list) to a string using the toString() method is an O(n) operation, where n is the length of the linked list. However, if you then want to replace something in that string using the replace() method, is that also an O(k) operation, where k is the length of the string?
For example, for the line String str = path.toString().replace("[", "").replace("]", "").replace(",", "");, does this run through the length of the linked list once, and then through the length of the string an additional 3 times? If so, is there a more efficient way to do what that line of code does?
Yes, it would. replace() has no idea that [ and ] are only found at the start and end. In fact, it's worse: you get another loop for copying the string over (the string has an underlying array, and that needs to be cloned in its entirety to lop a character out of it).
If your intent is to replace every [ in the string, then no, there is no faster way. However, if your actual intent is simply not to have the opening and closing brackets, then either write your own loop to toString the contents, something like:
LinkedList<Foo> foos = ...;
StringBuilder out = new StringBuilder();
for (Foo f : foos) out.append(out.length() == 0 ? "" : ", ").append(f);
return out.toString();
Or, if the list elements are themselves strings (String.join requires CharSequences):
String.join(", ", foos);
Or even:
foos.stream().map(Object::toString).collect(Collectors.joining(", "));
None of this is quite the same thing as .replace("[", "") - after all, if a [ symbol is part of the toString() of any Foo object, .replace("[", "") would strip it out as well - though you probably didn't want that to happen.
Note that the way modern CPUs work, unless that list has over a few thousand elements in it, looping it 4 times is essentially free and takes no measurable time. The concept of O(n) only 'kicks in' after a certain number of iterations, and on modern hardware it tends to be a lot of iterations before it matters. Often other concerns are much more important. As a simple example: a linked list, in general, has horrible performance relative to something like ArrayList, even in cases where, O(k)-wise, it should be faster. That is because linked lists create extra objects, and these tend to be non-contiguous (not near each other in memory). Modern CPUs can't read main memory directly: they ask the memory controller to replace one of the on-die cache pages with the contents of another memory page, which takes 500 to 1,000 cycles, and the CPU goes to sleep for those 1,000 cycles while it waits. You can see how reducing the number of times that happens can have a rather marked effect on performance, and yet the O(k) analysis doesn't and cannot take it into account.
Do not worry about performance unless you have a real-life scenario where the program appears to run slower than you think it should. Then use a profiler to figure out which 1% of the code is eating 99% of the resources (because it's virtually always a 1% 'hot path' that is responsible), and optimize just that 1%. It's pretty much impossible to predict what the 1% is going to be, so don't bother trying to do so while writing code; it just leads to harder-to-maintain, less flexible code - which, ironically enough, tends to make adjusting the hot path harder. Worrying about performance, in essence, slows down the code. Hence it's very important not to worry about that, and to worry instead about code that is easy to read and easy to modify.

Java string concatenation optimisation is applied in this case?

Let's imagine I have a lib which contains the following simple method:
private static final String CONSTANT = "Constant";
public static String concatStringWithCondition(String condition) {
return "Some phrase" + condition + CONSTANT;
}
What if someone wants to use my method in a loop? As I understand it, that string optimisation (where + gets replaced with StringBuilder or whatever is more optimal) does not work in that case - or does it only hold for strings initialised outside of the loop?
I'm using java 11 (Dropwizard).
Thanks.
No, this is fine.
The only case that string concatenation can be problematic is when you're using a loop to build one single string. Your method by itself is fine. Callers of your method can, of course, mess things up, but not in a way that's related to your method.
The code as written should be as efficient as making a StringBuilder and appending these 3 parts to it. There is certainly no difference at all between a literal ("Some phrase") and an expression the compiler can treat as a compile-time constant (which CONSTANT clearly is here, given that it is static, final, not null, and of a constant-able type - all primitives and String).
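A related detail you can verify the same way with javap: when every operand is a compile-time constant, there is no concatenation at runtime at all; javac folds the whole expression into a single pooled literal. A small sketch (field names mine):
private static final String CONSTANT = "Constant";
// Folded by javac into the single literal "Some phraseConstant";
// the class file contains no concatenation code for this field at all.
private static final String FOLDED = "Some phrase" + CONSTANT;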
However, is even that 'efficient'? I doubt it: making a StringBuilder is not particularly cheap either. It's orders of magnitude cheaper than continually making new strings, sure, but there's always a bigger fish:
It doesn't matter
Computers are fast. Really, really fast. It is highly likely that you can write this incredibly badly (performance wise) and it still won't be measurable. You won't even notice. Less than a millisecond slower.
In general, anybody who worries about performance at this level simply lacks perspective and knowledge: if you apply that level of fretting to your Java code, and you have the knowledge to know what could in theory be non-perfectly-performant, you'll be sweating every third character you ever type. That's no way to program. So gain that perspective (or take it from me - "just git gud" is not exactly something you can do in a week; take it on faith for now, and as you learn you can start verifying), and don't worry about it. Unless you actually run into a situation where the code is slower than it feels like it could be, or slower than it needs to be. Then toss profilers and microbenchmark testing frameworks at it, and THEN, armed with all that information (and not before!), consider optimizing. The reports tell you what to optimize, because literally less than 1% of the code is responsible for 99% of the performance loss, so spending any time on code that isn't in that 1% is an utter waste of time. Hence why you must get those reports first, or not start at all.
... or perhaps it does
But if it does matter, and it's really that 1% of the code that is responsible for 99% of the loss, then usually you need to go a little further than just 'optimize the method'. Optimize the entire pipeline.
What is happening with this string? Take that into consideration.
For example, let's say that it, itself, is being appended to a much bigger stringbuilder. In which case, making a tiny stringbuilder here is incredibly inefficient compared to rewriting the method to:
public static void concatStringWithCondition(StringBuilder sb, String condition) {
    sb.append("Some phrase").append(condition).append(CONSTANT);
}
Or, perhaps this data is being turned into bytes using UTF_8 and then tossed onto a web socket. In that case:
private static final byte[] PREFIX = "Some phrase".getBytes(StandardCharsets.UTF_8);
private static final byte[] SUFFIX = "Some Constant".getBytes(StandardCharsets.UTF_8);

public void concatStringWithCondition(OutputStream out, String condition) throws IOException {
    out.write(PREFIX);
    out.write(condition.getBytes(StandardCharsets.UTF_8));
    out.write(SUFFIX);
}
and check whether that OutputStream is buffered. If not, make it buffered; that will help a ton and would completely dwarf the cost of not using string concatenation. If the condition string can get quite large, the above is no good either: you want a CharsetEncoder that encodes straight to the OutputStream, and you may even want to replace all of that with some ByteBuffer-based approach.
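For instance, wrapping an assumed raw stream (socketOut here is hypothetical) is a one-liner:
OutputStream out = new BufferedOutputStream(socketOut, 8192);
concatStringWithCondition(out, condition);
out.flush(); // flush once per message, not once per tiny write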
Conclusion
1. Assume performance is never relevant until it is.
2. IF performance truly must be tackled, strap in; it'll take ages to do it right. Doing it 'wrong' (applying dumb rules of thumb that do not work) isn't useful. Either do it right, or don't do it.
3. IF you're still on board, always start with profiler reports and use JMH to gather information.
4. Be prepared to rewrite the pipeline - to change the method signatures - in order to optimize.
5. That means that micro-optimizing, which usually sacrifices nice abstracted APIs, is actively bad for performance - because changing pipelines is considerably more difficult if all code is micro-optimized, given that this usually comes at the cost of abstraction.
And now the circle is complete: point 5 shows why worrying about performance the way you are doing in this question is in fact detrimental. It is far too likely that this worry results in you 'optimizing' some code in a way that doesn't actually run faster (because the JVM is a complex beast). And even if it did, it would be irrelevant, because the code path this code is on is literally only 0.01% or less of the total runtime expenditure - and in the meantime you've made your APIs worse and lost abstraction, which makes any actually useful optimization much harder than it needs to be.
But I really want rules of thumb!
Alright, fine. Here are 2 easy rules of thumb to follow that will lead to better performance:
When in rome...
The JVM is an optimising marvel and will run the craziest code quite quickly anyway. However, it does this primarily by being a giant pattern-matching machine: it finds recognizable code snippets and rewrites them into the fastest machine code it can, carefully tuned to just your combination of hardware. However, this pattern machine isn't voodoo magic: it has a limited set of patterns. Which patterns do JVM makers ship with their JVMs? Why, the common patterns, of course. Why include a pattern for exotic code virtually nobody ever writes? That would be a waste of space.
So, write code the way java programmers tend to write it. Which very much means: Do not write crazy code just because you think it might be faster. It'll likely be slower. Just follow the crowd.
Trivial example:
Which one is faster:
List<String> list = new ArrayList<String>();
for (int i = 0; i < 10000; i++) list.add(someRandomName());
// option 1:
String[] arr = list.toArray(new String[list.size()]);
// option 2:
String[] arr = list.toArray(new String[0]);
You might think: obviously option 1, right? Option 2 'wastes' a string array, making a 0-length array just to toss it in the garbage right after. But you'd be wrong: option 2 is in fact faster. (If you want an explanation: the JVM recognizes it and does a hacky move: it makes a new string array that does not need to be initialized with all zeroes first. Normal Java code cannot do this (arrays are necessarily initialized blank, to prevent memory corruption issues), but specifically .toArray(new X[0])? Those pattern-matching machines I told you about detect it and replace it with code that just blits the refs straight into a patch of memory without wasting time writing zeroes to it first.)
It's a subtle difference that is highly unlikely to matter - it just highlights: Your instincts? They will mislead you every time.
Fortunately, .toArray(new X[0]) is common Java code. And easier and shorter. So just write nice, convenient code that looks like how other folks write, and you'd have gotten the right answer here - without having to know such crazy esoterics as how the JVM wastes time zeroing out that array and how hotspot's pattern matching might eliminate this, thus making it faster. That's just one of 5 million things you'd have to know, and nobody can know them all. Thus: just write Java code in simple, common styles.
Algorithmic complexity is a thing hotspot can't fix for you
Given an O(n^3) algorithm fighting an O(log(n) * n^2) algorithm, make n large enough and the second algorithm has to win; that's what big-O notation means. The JVM can do a lot of magic, but it can pretty much never optimize an algorithm into a faster 'class' of algorithmic complexity. You might be surprised at how large n has to be before algorithmic complexity dominates, but it is reasonable to recognize that your algorithm can be fundamentally faster, and to do the work of rewriting it to the more efficient algorithm, even without profiler reports and benchmark harnesses and the like.

Difference between StringBuilder append(CONST) and append("new string")

Can I get some concrete explanation of memory and runtime overhead with the below two statements?
String CONST = "string constant";
StringBuilder sb1 = new StringBuilder();
sb1.append(CONST);
StringBuilder sb2 = new StringBuilder();
sb2.append("string constant");
Does the second version create a new String object and add it to the string pool?
Is there any scenario (consider many string appends as well) where one can be justified as better than the other?
There is no difference in memory or runtime overhead between these two versions.
Use whichever seems more readable or maintainable. If you're reusing the same string constant in many places, the constant is long, or might change, then pulling out a constant might be appropriate.
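For what it's worth, the two forms don't even produce distinct objects: the JLS guarantees that identical string literals are interned, so the constant and the inline literal refer to the same pooled instance. A quick way to convince yourself:
String CONST = "string constant";
System.out.println(CONST == "string constant"); // true: both point at the same interned String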
In reference to the runtime overhead, running a simulation of both methods yielded almost identical results.
My tests were done with 10,000,000,000 iterations, and the runtimes were:
Method 1 - 95109ms (~9.5ns average)
Method 2 - 95002ms (~9.5ns average)
So definitely no noticeable difference in performance.
Therefore, as @LouisWasserman said in their answer, just use the one that keeps your code clean and legible.

Why is string concatenation faster in Delphi than in Java?

Problem
I wrote 2 programs, one in Delphi and one in Java, for string concatenation and I noticed a much faster string concatenation in Delphi compared to Java.
Java
String str = new String();
long t0 = System.nanoTime();
for (int i = 0; i < 50000; i++)
    str += "abc";
long t1 = System.nanoTime();
System.out.println("String + String needed " + (t1 - t0) / 1000000 + "ms");
Delphi
Stopwatch.Start;
for i := 1 to 50000 do
  str := str + 'abc';
Stopwatch.Stop;
ShowMessage('Time in ms: ' + IntToStr(Stopwatch.ElapsedMilliseconds));
Question
Both measure the time in milliseconds, but the Delphi program is much faster: 1 ms vs. Java's 2 seconds. Why is string concatenation so much faster in Delphi?
Edit: Looking back at this question with more experience, I should have come to the conclusion that the main difference comes from Delphi being compiled ahead of time to native code, while Java is compiled to bytecode and then run in the JVM.
TLDR
There may be other factors, but certainly a big contributor is likely to be Delphi's default memory manager. It's designed to be a little wasteful of space in order to reduce how often memory is reallocated.
Considering memory manager overhead
When you have a straight-forward memory manager (you might even call it 'naive'), your loop concatenating strings would actually be more like:
//pseudo-code
for I := 1 to 50000 do
begin
  if CanReallocInPlace(Str) then
    //Great when True; but this might not always be possible.
    ReallocMem(Str, Length(Str) + Length(Txt))
  else
  begin
    AllocMem(NewStr, Length(Str) + Length(Txt));
    Copy(Str, NewStr, Length(Str));
    FreeMem(Str);
    Str := NewStr; //the string now lives at the new location
  end;
  Copy(Txt, Str[Length(Str) - Length(Txt)], Length(Txt));
end;
Notice that on every iteration you increase the allocation. And if you're unlucky, you very often have to:
Allocate memory in a new location
Copy the existing 'string so far'
Finally release the old string
Delphi (and FastMM)
However, Delphi has switched from the default memory manager used in its early days to a previously third-party one (FastMM) that's designed to run faster, primarily by:
(1) Using a sub-allocator i.e. getting memory from the OS a 'large' page at a time.
Then performing allocations from the page until it runs out.
And only then getting another page from the OS.
(2) Aggressively allocating more memory than requested (anticipating small growth).
Then it becomes more likely that a slightly larger request can be reallocated in place.
These techniques can (though it's not guaranteed) increase performance.
But they definitely do waste space. (And with unlucky fragmentation, the wastage can be quite severe.)
Conclusion
Certainly the simple app you wrote for this demonstration benefits greatly from the new memory manager. You run through a loop that incrementally reallocates the string on every iteration, hopefully with as many in-place reallocations as possible.
You could attempt to circumvent some of FastMM's performance improvements by forcing additional allocations in the loop (though the sub-allocation of pages would still be in effect).
Simpler, though, would be to try an older Delphi compiler (such as D5) to demonstrate the point.
FWIW: String Builders
You said you "don't want to use the String Builder", but I'd like to point out that a string builder obtains similar benefits. Specifically (if implemented as intended), a string builder wouldn't need to reallocate the substrings all the time. When it comes time to finally build the string, the correct amount of memory can be allocated in a single step, and all portions of the 'built string' can be copied to where they belong.
In Java (and C#) strings are immutable objects. That means that if you have:
string s = "String 1";
then the compiler allocates memory for this string. Writing
s = s + " String 2";
gives us "String 1 String 2" as expected, but because of the immutability of strings, a new string is allocated, with exactly the size needed to contain "String 1 String 2", and the content of both strings is copied to the new location. The original strings are then deleted by the garbage collector. In Delphi, a string is "copy-on-write" and reference counted, which is much faster.
C# and Java have the class StringBuilder, which behaves a lot like Delphi strings and is considerably faster when modifying and manipulating strings.
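For comparison, here is the usual Java rewrite of the benchmark loop from the question using StringBuilder (a sketch mirroring the original timing code):
StringBuilder sb = new StringBuilder();
long t0 = System.nanoTime();
for (int i = 0; i < 50000; i++)
    sb.append("abc");
String str = sb.toString();
long t1 = System.nanoTime();
System.out.println("StringBuilder needed " + (t1 - t0) / 1000000 + "ms");
This brings the Java loop into the same ballpark as the Delphi one, because the builder's buffer grows geometrically instead of a new string being allocated on every iteration.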

Existing solution to "smart" initial capacity for StringBuilder

I have a piece of logging- and tracing-related code which is called often throughout the codebase, especially when tracing is switched on. A StringBuilder is used to build the String. The strings have a reasonable maximum length, I suppose on the order of hundreds of chars.
Question: Is there an existing library to do something like this:
// in reality, StringBuilder is final;
// we would have to create a delegating version instead,
// which is quite a big class because of all the append() overloads
public class SmarterBuilder extends StringBuilder {
    private final AtomicInteger capRef;

    SmarterBuilder(AtomicInteger capRef) {
        // optionally save memory at the expense of worst-case resizes,
        // e.g. super(capRef.get() * 3 / 4):
        super(capRef.get());
        this.capRef = capRef;
    }

    public void syncCap() {
        // call when the string is fully built
        int cap;
        do {
            cap = capRef.get();
            if (cap >= length()) break;
        } while (!capRef.compareAndSet(cap, length()));
    }
}
To take advantage of this, my logging-related class would have a shared capRef variable with suitable scope.
(Bonus Question: I'm curious, is it possible to do syncCap() without looping?)
Motivation: I know the default capacity of StringBuilder is always too little. I could (and currently do) throw in an ad-hoc initial capacity value of 100, which still results in a resize in some number of cases, but not always. However, I do not like magic numbers in the source code, and this feature is a case of "optimize once, use in every project".
Make sure you do the performance measurements to make sure you really are getting some benefit for the extra work.
As an alternative to a StringBuilder-like class, consider a StringBuilderFactory. It could provide two static methods, one to get a StringBuilder, and the other to be called when you finish building a string. You could pass it a StringBuilder as argument, and it would record the length. The getStringBuilder method would use statistics recorded by the other method to choose the initial size.
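A minimal sketch of that factory idea (all names here are mine; this is not an existing library):
import java.util.concurrent.atomic.AtomicInteger;

public final class StringBuilderFactory {
    private static final AtomicInteger maxLen = new AtomicInteger(16); // seeded with the JDK default

    public static StringBuilder getStringBuilder() {
        return new StringBuilder(maxLen.get());
    }

    public static void recordLength(StringBuilder sb) {
        // keep a running maximum; accumulateAndGet retries internally on contention
        maxLen.accumulateAndGet(sb.length(), Math::max);
    }
}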
There are two ways you could avoid looping in syncCap:
Synchronize.
Ignore failures.
The argument for ignoring failures in this situation is that you only need a random sampling of the actual lengths; if another thread updates the value at the same time, you still get a reasonably up-to-date view of the string lengths.
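In code, the "ignore failures" variant of syncCap() is a single attempt with no loop (a sketch based on the class in the question):
public void syncCap() {
    int cap = capRef.get();
    if (length() > cap) {
        capRef.compareAndSet(cap, length()); // a lost race merely skips one sample
    }
}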
You could store the length of each built string in a statistics array: run your app, and at shutdown take the 90th percentile of your string lengths (sort all length values, and take the value at array pos = sortedLengths.size() * 0.9).
That way you create an initial StringBuilder size that 90% of your strings will fit into.
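A sketch of that shutdown-time calculation (names are mine; types are fully qualified to stay self-contained):
static int suggestedCapacity(java.util.List<Integer> recordedLengths) {
    java.util.List<Integer> sorted = new java.util.ArrayList<>(recordedLengths);
    java.util.Collections.sort(sorted);
    if (sorted.isEmpty()) return 16; // fall back to StringBuilder's default capacity
    int pos = (int) (sorted.size() * 0.9);
    return sorted.get(Math.min(pos, sorted.size() - 1));
}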
Update
The value could be hard-coded (like Java does for the value 10 in ArrayList), read from a config file, or calculated automatically in a test phase. But the percentile calculation is not free, so it's best to run your project for a while, measure the 90th percentile on the fly inside the SmartBuilder, output it from time to time, and later change the property file to use that value.
That way you would get optimal results for each project.
Or if you go one step further: Let your smart Builder update that value from time to time in the config file.
But all this is only worth the effort for data with some millions of entries, like digital road maps, etc.
