String allocation of literals

Another question taught me that concatenation of String literals is evaluated at compile time. In a project I'm working on, we build the multi-line Strings of big queries using a StringBuffer. It appends only literals, so it had me wondering whether something similar might happen.
In the following code, will the buffer append its contents at compile time? How would this behave when multiple threads try to execute this function?
public static String querySomething(int arg){
    StringBuffer buffer = new StringBuffer();
    buffer.append("A quite long query");
    buffer.append("that doesn't fit in one line");
    buffer.append("...");
    return buffer.toString();
}
Wouldn't it be better to define the String as a constant since it would be thread safe and we know it can get concatenated at compile time with the plus operator? Something like:
private static final String REALLY_LONG_QUERY1 = "A quite long query that"
    + "doesn't fit in one line"
    + "...";

Wouldn't it be better to define the String as a constant ...
Basically, yes.
... since it would be thread safe and we know it can get concatenated at compile time with the plus operator.
These assertions are both correct.
However, you would not need to worry about thread safety in the StringBuffer version of your code either.
The StringBuffer class is thread-safe.
If the StringBuffer instance is only visible to one thread (e.g. the thread calling the method that declares and uses the instance), then the instance is thread confined and does not need to be a thread-safe data structure. (And you could use StringBuilder instead ...)
The primary advantage of the version that uses + concatenation of literals is that it takes zero time at runtime, and causes no allocation of objects ... apart from the one String object that represents the concatenated string constant that is allocated when your class is loaded.
In fact, in many places where people explicitly use StringBuilder or StringBuffer to "optimize" string concatenation, it either has no effect, or actually makes the code slower:
As you noted, the Java compiler evaluates concatenation of literals (using +) at compile time, but it can't do the same thing for explicit StringBuilder.append calls.
In addition, the Java compiler will typically translate non-constant String concatenations (using +) in an expression into equivalent code using StringBuilder.
The only cases where it is worthwhile to use StringBuilder explicitly are when the string building spans multiple statements; e.g. because you are concatenating stuff in a loop.
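The difference between the two approaches can be observed directly. The following is a minimal sketch, not taken from the original project: the class name ConstantFoldingDemo and the query text are made up for illustration. It relies on the fact that compile-time constant strings are interned, so a folded concatenation is the very same object as the equivalent single literal, while a builder produces a fresh String on every call.

```java
final class ConstantFoldingDemo {
    // Hypothetical query text. javac folds the three literals into one
    // compile-time constant, so no concatenation happens at run time.
    static final String FOLDED = "SELECT * "
            + "FROM t "
            + "WHERE id = ?";

    // Built at run time on every call; toString() returns a new String object.
    static String built() {
        StringBuilder sb = new StringBuilder();
        sb.append("SELECT * ");
        sb.append("FROM t ");
        sb.append("WHERE id = ?");
        return sb.toString();
    }

    public static void main(String[] args) {
        String literal = "SELECT * FROM t WHERE id = ?";
        // Reference comparison is used on purpose here to expose object identity.
        System.out.println("folded is same object as literal: " + (FOLDED == literal));
        System.out.println("built is same object as literal:  " + (built() == literal));
        System.out.println("built has equal contents:         " + built().equals(literal));
    }
}
```

The reference comparisons are only there to make identity visible; ordinary code should still compare strings with equals.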

I would prefer the second solution (merely using + operator).
Why? Because:
More readable
More functional in style (functional programming being both fashionable and efficient today): it avoids useless temporary local variables, especially mutable ones like buffer.

In the following code, will the buffer append its contents at compile time?
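No. The append calls run each time the method is executed; only + concatenation of literals is evaluated at compile time.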
How would this behave when multiple threads are trying to execute this function?
No problems, since each thread would use its own StringBuffer (it is declared inside the method).
Wouldn't it be better to define the String as a constant?
Yes, it would make more sense here.

StringBuffer fits better when you want to build a string whose actual size you don't know at compile time, for example:
public static String querySomething(int arg) {
    StringBuffer buffer = new StringBuffer();
    while (...) {
        buffer.append(someStuff());
    }
    return buffer.toString();
}
In your case, a constant is more suitable.

Related

Is it wise to declare a String as final if I use it many times?

I have a method repeatedMethod like this:
public static void repeatedMethod() {
// something else
anotherMethod("myString");
// something else
}
public static void anotherMethod(String str) {
//something that doesn't change the value of str
}
and I call the repeatedMethod many times.
I would like to ask if it is wise to declare myString as static final outside that method like this:
public static final String s = "myString";
public void repeatedMethod() {
anotherMethod(s);
}
I think that when I do anotherMethod("myString"), a new instance of String is created. And since I do that many times, many instances of String are created.
Therefore, it might be better to create only one instance of String outside the repeatedMethod and use only that one every time.

What you are doing is right but for the wrong reason.
When you do anotherMethod("myString"), no new instance of String is actually created: the call reuses the String from the String constant pool (refer to this question).
However, factoring common String values out as constants (i.e. as private static final) is a good practice: if that constant ever needs to change, you only need to change it in one place of the source code (for example, if tomorrow "myString" needs to become "myString2", you only have one modification to make).

String literals of the same text are identical, so there won't be excessive object creation as you fear.
But it's good to put that string literal in a static final variable (a constant) with a descriptive name that documents the purpose of that string. It's generally a recommended practice to extract string literals to constants with good names.
This is especially true when the same string literal appears in more than one place in the code (a "magic string"), in which case it's strongly recommended to extract to a constant.

No, you don't need to do that. There is a "constant pool" in the JVM: every inline string (e.g. "myString") is implicitly treated as a constant, and identical inline strings are put in the constant pool just once.
For example, for
String i = "test", j = "test";
there will be just one instance of the constant "test" in the constant pool.
Also refer to
http://www.thejavageek.com/2013/06/19/the-string-constant-pool/

Optimize for clarity before worrying about performance. String literals are only created once, ever (unless the literal is unloaded). Performance is not only usually less important, it is irrelevant in this case.
You should define a constant instead of repeating the same String, to make it clear that these strings don't just happen to be the same but must be the same. Otherwise, someone maintaining the code later who has to modify one String cannot tell whether all of them should be changed or not.
BTW, when you optimize for clarity you are also often optimizing for performance. The JIT looks for common patterns, and if you try to outsmart the optimizer you are more likely to confuse it, resulting in less optimal code.

Making a set of changes to a string in Java - best practice approach

Looking for the best practice Java approach for the following problem.
I have a (relatively) long string and a set of (non-overlapping) changes to make to it. Let's say the changes have the signature:
change(int startIndex, int endIndex, String replacement);
and an example would be
assert doChange("aaa", new Change(1, 2, "hello")).equals("ahelloa");
My plan is to work backwards through the string (so that shifting indexes are avoided), splitting it into three pieces each time and then stitching in the replacement. But I can imagine there is a much more effective/Java-like approach... is there an API call I've missed?

The standard Java String is immutable, which makes it unsuitable for extended string-based operations. But there are also the classes StringBuffer and StringBuilder, which represent a mutable string designed for being manipulated. They even have a built-in replace(start, end, str) method which does exactly what you are trying to do.
The main difference between these two classes is that StringBuffer is thread-safe while StringBuilder is not. When you don't have multiple threads accessing the same string, use StringBuilder, because it generally performs faster.
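As a rough sketch of how this could look, the following applies a list of changes with StringBuilder.replace, working from the highest start index down so that earlier offsets stay valid. The Change class and the applyChanges method are hypothetical names modelled on the question's signature, and endIndex is assumed to be exclusive, as it is in StringBuilder.replace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class StringChanges {
    // Hypothetical value type mirroring change(startIndex, endIndex, replacement).
    // endIndex is exclusive, matching StringBuilder.replace.
    static final class Change {
        final int startIndex;
        final int endIndex;
        final String replacement;

        Change(int startIndex, int endIndex, String replacement) {
            this.startIndex = startIndex;
            this.endIndex = endIndex;
            this.replacement = replacement;
        }
    }

    // Applies non-overlapping changes, all expressed in original-string indexes.
    static String applyChanges(String input, List<Change> changes) {
        List<Change> ordered = new ArrayList<>(changes);
        // Highest start index first: a replacement never shifts the text before it.
        ordered.sort(Comparator.comparingInt((Change c) -> c.startIndex).reversed());

        StringBuilder sb = new StringBuilder(input);
        for (Change c : ordered) {
            sb.replace(c.startIndex, c.endIndex, c.replacement);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Change> changes = Arrays.asList(
                new Change(0, 1, "X"),
                new Change(1, 2, "hello"));
        System.out.println(applyChanges("aaa", changes));
    }
}
```

Copying and sorting the list keeps the caller's list untouched; if the changes are already known to be sorted, iterating in reverse would avoid the sort.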

Equivalence of static const char* in a java function

I make regular use of this idiom in C++:
/*return type*/ foo(/*parameters*/){
static const char* bar = "Bar";
/*some code here*/
}
Internally this gets added to a table of string literals. Does this Java code do a similar thing:
/*return type*/ foo(/*parameters*/){
final String bar = "Bar";
/*some code here*/
}
or am I unwittingly introducing inefficiencies here?

Strings are immutable in Java. This means you don't have to give hints to have the JVM know it won't change and optimize it.
String literals are interned to avoid redundancies, which means they already are "added to a table of string literals". Using final here isn't necessary.

I think your solution is correct and they are as near to equivalent as Java can express.
As other answers have mentioned, strings are immutable and the final does not add any performance enhancement; however, I feel the final is semantically useful here. Much like 'const' in C++, 'final' ensures that the value cannot be changed, and attempting to do so will result in a compiler error. It seems to me that this is a desirable behavior in your case.
Also (much as in the case with C++ const) it might lead to some possible optimizations that otherwise would not be considered.

The final keyword in a Java method doesn't do what you think it does.
In this particular case, final makes the variable unmodifiable. That's it. The content itself does get added to a global table of String constants, but the pointer to it (pardon the terminology) is technically set every single time.
The final keyword is mainly useful from within a method to make the variable available to any anonymous classes you create after it. It's Java's half-hearted support for what they like to consider a "closure". Before Java 8 it was also the only way to access an outer local variable from within an anonymous inner class; since Java 8 the variable only needs to be effectively final.
final String bar = "Bar";
final Set<String> allTheBars = new HashSet<String>() {{
    add(bar);
}};

You can specify for a string to be added to the literal pool (referred to as interning in Java) by invoking String.intern() as follows:
final String bar = myString.intern();
It is basically the same concept as the literal pool, using the same object for the given string. Note that string literals are interned automatically. It also allows you to compare interned strings by reference, so it might be more efficient. However, both sides of the comparison must be strings returned by intern(). Thus
a.equals(b);
could be replaced with
a.intern() == b.intern();
Note that you don't actually want to do the above exactly as depicted. Ideally you keep the interned strings around and reuse them. However, there are some pitfalls to interned strings: on older JVMs they may not be garbage collected, and the method itself is a bit expensive.
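To illustrate the relationship between literals and intern(), here is a small sketch. The class InternDemo and the helper runtimeBar are invented for this example; the helper stands in for text that arrives at run time, for instance from a file.

```java
final class InternDemo {
    // Returns a String equal to "Bar" that is a distinct object,
    // standing in for text obtained at run time.
    static String runtimeBar() {
        return new String(new char[] {'B', 'a', 'r'});
    }

    public static void main(String[] args) {
        String literal = "Bar";          // interned automatically
        String runtime = runtimeBar();   // equal contents, different object

        // Reference comparisons are deliberate here to show object identity.
        System.out.println("runtime == literal:          " + (runtime == literal));
        System.out.println("runtime.intern() == literal: " + (runtime.intern() == literal));
        System.out.println("runtime.equals(literal):     " + runtime.equals(literal));
    }
}
```

In everyday code equals remains the safer comparison, since a single non-interned string is enough to make a reference comparison fail.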

java performance : string literal

I would like to know about the relation between string literal and java performance.
For example, does using the statements below a large number of times have any impact on performance? I have thousands of classes, and we use statements like these many times:
1) buffer.append(",");
2) buffer.append("}");
3) String.append("10,000 times...same lines") // printing the same lines in many classes
4) String.someStringMethod("same line many times") // using any String method
Does this have a performance impact in terms of memory management etc.? Is there a cleaner way?
Thanks

It is really difficult to comment on examples that make no sense. However:
In general there are no particular efficiency concerns with Java String literals.
In general there are no particular efficiency concerns with methods that take String literals as arguments.
String concatenation / building can present efficiency concerns if a particular piece of code is executed often enough. However, if you need to build strings, then there is not a lot you can do about it.
There are one or two things that it is worth taking steps to avoid. The main one is this:
String s = "";
for (/* lots of times */) {
// do stuff
s += someOtherString;
}
The problem is that this generates and then discards lots of temporary Strings. The more efficient way to do it is this:
StringBuilder sb = new StringBuilder();
for (/* lots of times */) {
// do stuff
sb.append(someOtherString);
}
String s = sb.toString();
However, it is probably only worth while optimizing this kind of thing if the profiler tells you that this particular bit of code is a bottleneck.

Any code you write affects performance, so it's always better to invoke one append() with ",}" instead of 2 appends.
There's no method append in java.lang.String class.
None of the String methods make changes to the String object. Instead, methods like String.substring(), String.concat() and String.replace() create new String objects. This means repeated modifications cost more than they would with a StringBuffer.
So generally StringBuffer methods are faster than those of String for this kind of work. However, a newer class called StringBuilder was introduced in Java 5. It has only one difference from StringBuffer: it's not thread-safe. In real-world cases thread management is taken care of by higher-level containers, making it unnecessary to ensure thread safety for each class. In those cases you're advised to use StringBuilder. That should be the fastest.
In order to further improve the performance of StringBuilder, you have to know the resulting string length so that you can allocate a StringBuilder of the proper size. If it's too big you'll waste some memory, but that's usually a minor problem. If it's too small, though, StringBuilder will have to recreate its internal character array to make space for more characters. That would make that particular append() invocation slow. Actually, it's not the invocation that's slow, but the garbage collection invoked to clean the memory up.
Particular methods of String class may be better or faster than those in StringBuffer/StringBuilder, but you have to be more specific with your questions for me to answer that.
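The pre-sizing advice above might look like the following sketch. PresizedJoin and its join method are hypothetical names for illustration only; the code computes the exact result length first and passes it to the StringBuilder(int capacity) constructor so that the internal array never has to grow.

```java
import java.util.Arrays;
import java.util.List;

final class PresizedJoin {
    // Joins parts with a separator, sizing the builder up front so its
    // internal array is allocated once and never regrown.
    static String join(List<String> parts, String separator) {
        int length = 0;
        for (String part : parts) {
            length += part.length();
        }
        // One separator between each pair of neighbouring parts.
        length += separator.length() * Math.max(0, parts.size() - 1);

        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(join(Arrays.asList("a", "b", "c"), ","));
    }
}
```

Whether the extra pass over the parts pays off depends on the workload; for short strings the default capacity is usually good enough.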

The "Why" behind PMD's StringInstantiation rule

Along the lines of an existing thread, The “Why” behind PMD's rules, I'm trying to figure out the meaning of one particular PMD rule : String and StringBuffer Rules.StringInstantiation.
This rule states that you shouldn't explicitly instantiate String objects. As per their manual page :
Avoid instantiating String objects; this is usually unnecessary since they are immutable and can be safely shared.
This rule is defined by the following Java class: net.sourceforge.pmd.lang.java.rule.strings.StringInstantiationRule
Example(s):
private String bar = new String("bar"); // just do a String bar = "bar";
http://pmd.sourceforge.net/pmd-5.0.1/rules/java/strings.html
I don't see how this syntax is a problem, other than it being pointless. Does it affect overall performance?
Thanks for any thought.

With String foo = "foo" there will be one instance of "foo" in the string pool (this is referred to as string interning). The pool lived in PermGen space before Java 7 and has been on the main heap since. If you were to later write String bar = "foo", there would still be only one "foo" in the pool.
Writing String foo = new String("foo") will additionally create a separate String object on the heap.
Thus, the rule is there to prevent wasting memory.
Cheers,

It shouldn't usually affect performance in any measurable way, but:
private String bar = new String("bar"); // just do a String bar = "bar";
If you execute this line a million times, you will have created a million objects.
private String bar = "bar"; // a literal: always the same pooled instance
If you execute this line a million times, you will have created one object.
There are scenarios where that actually makes a difference.

Does it affect overall performance?
Well, performance and maintenance. Doing something which is pointless makes the reader wonder why the code is there in the first place. When that pointless operation also involves creating new objects (two in this case - a new char[] and a new String) that's another reason to avoid it...
In the past, there has been a reason to call new String(existingString) if the existing string was originally obtained as a small substring of a longer string - or other ways of obtaining a string backed by a large character array. I believe that this is not the case with more recent implementations of Java, but obviously you can still be using an old one. This shouldn't be a problem for constant strings anyway, mind you.
(You could argue that creating a new object allows you to synchronize on it. I would avoid synchronizing on strings to start with, though.)

One difference is the memory footprint:
String a = "abc"; //one object
String b = "abc"; //same object (i.e. a == b) => still one object in memory
String c = new String("abc"); // This is a new object - now 2 objects in memory
To be honest, the only reason I can think of, why one would use the String constructor is in combination with substring, which is a view on the original string. Using the String constructor in that case helps getting rid of the original string if it is not needed any longer.
However, since Java 7u6 this is no longer the case, so I don't see any reason to use it any more.

It can be useful, because it creates a new identity, and sometimes object identities are important/crucial to an application. For example, it can be used as an internal sentinel value. There are other valid use cases too, e.g. to avoid constant expression.
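The sentinel use case could be sketched like this. SentinelDemo, MISSING and isMissing are illustrative names, not from any library. Because new String creates an object that no literal and no caller-supplied value can be identical to, a reference comparison distinguishes "no mapping" from a stored value that merely has the same text.

```java
import java.util.HashMap;
import java.util.Map;

final class SentinelDemo {
    // A unique object: equal in content to the literal "missing",
    // but never the same reference as any other string.
    private static final String MISSING = new String("missing");

    // True only when the map has no mapping for the key, even if some
    // key is mapped to the text "missing".
    static boolean isMissing(Map<String, String> map, String key) {
        // Identity comparison is intentional: only the sentinel itself matches.
        return map.getOrDefault(key, MISSING) == MISSING;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("k", "missing");
        System.out.println("k absent:     " + isMissing(map, "k"));
        System.out.println("other absent: " + isMissing(map, "other"));
    }
}
```

A plain literal would not work as the sentinel here, since the stored value "missing" would be the same pooled object.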
If a beginner writes such code, it's very likely a mistake. But that is a very short learning period. It is highly unlikely that any moderately experienced Java programmer would write that by mistake; it must be for a specific purpose. File it under "it looks like a stupid mistake, but it takes effort to make, so it's probably intended".

It is
pointless
confusing
slightly slower
You should try to write the simplest, clearest code you can. Adding pointless code is bad all round.
