I'm writing a MUD (text-based game) at the moment using Java. One of the major aspects of a MUD is formatting strings and sending them back to the user. How would this best be accomplished?
Say I wanted to send the following string:
You say to Someone "Hello!" - where "Someone", "say" and "Hello!" are all variables. Which would be best performance wise?
"You " + verb + " to " + user + " \"" + text + "\""
or
String.format("You %1$s to %2$s \"%3$s\"", verb, user, text)
or some other option?
I'm not sure which is going to be easier to use in the end (which is important because it'll be everywhere), but I'm thinking about it at this point because concatenating with +'s is getting a bit confusing with some of the bigger lines. I feel that using StringBuilder in this case will simply make it even less readable.
Any suggestion here?
If the strings are built using a single concatenation expression; e.g.
String s = "You " + verb + " to " + user + " \"" + text + "\"";
then this is more or less equivalent to the more long winded:
StringBuilder sb = new StringBuilder();
sb.append("You ");
sb.append(verb);
sb.append(" to ");
sb.append(user);
sb.append(" \"");
sb.append(text);
sb.append('"');
String s = sb.toString();
In fact, a classic Java compiler will compile the former into the latter ... almost. In Java 9, they implemented JEP 280, which replaces the sequence of constructor and method calls in the bytecodes with a single invokedynamic bytecode. The runtime system then optimizes this (see note 1 below).
The efficiency issues arise when you start creating intermediate strings, or building strings using += and so on. At that point, StringBuilder becomes more efficient because you reduce the number of intermediate strings that get created and then thrown away.
Now when you use String.format(), it should be using a StringBuilder under the hood. However, format also has to parse the format String each time you make the call, and that is an overhead you don't have if you do the string building optimally.
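As a rough illustration of the options discussed so far, the sketch below builds the same message three ways. The class and method names (SayMessage, withPlus, withBuilder, withFormat) are made up for this example and are not part of the original question.

```java
final class SayMessage {
    // Single concatenation expression; the compiler decides how to build it.
    static String withPlus(String verb, String user, String text) {
        return "You " + verb + " to " + user + " \"" + text + "\"";
    }

    // Explicit StringBuilder, roughly what a classic compiler generates for the above.
    static String withBuilder(String verb, String user, String text) {
        return new StringBuilder()
                .append("You ").append(verb)
                .append(" to ").append(user)
                .append(" \"").append(text).append('"')
                .toString();
    }

    // String.format has to parse the format string on every call.
    static String withFormat(String verb, String user, String text) {
        return String.format("You %s to %s \"%s\"", verb, user, text);
    }
}
```

All three return the same string for the same arguments, so the choice mostly comes down to readability unless a profiler says otherwise.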
Having said this, my advice would be to write the code in the way that is most readable. Only worry about the most efficient way to build strings if profiling tells you that this is a real performance concern. (Right now, you are spending time thinking about ways to address a performance issue that may turn out to be insignificant or irrelevant.)
Another answer mentions that using a format string may simplify support for multiple languages. This is true, though there are limits as to what you can do with respect to such things as plurals, genders, and so on.
1 - As a consequence, hand optimization as per the example above might actually have negative consequences, for Java 9 or later. But this is a risk you take whenever you micro-optimize.
I think that concatenation with + is more readable than using String.format.
String.format is good when you need to format number and dates.
When concatenating with plus, the compiler can transform the code into an efficient form. With String.format, I don't know.
I prefer concatenation with plus; I think it is easier to understand.
The key to keeping it simple is to never look at it. Here is what I mean:
Joiner join = Joiner.on(" ");
public void constructMessage(StringBuilder sb, Iterable<String> words) {
join.appendTo(sb, words);
}
I'm using the Guava Joiner class to make readability a non-issue. What could be clearer than "join"? All the nasty bits regarding concatenation are nicely hidden away. By using Iterable, I can use this method with all sorts of data structures, Lists being the most obvious.
Here is an example of a call using a Guava ImmutableList (which is more efficient than a regular list, since any methods that modify the list just throw exceptions, and correctly represents the fact that constructMessage() cannot change the list of words, just consume it):
StringBuilder outputMessage = new StringBuilder();
constructMessage(outputMessage,
new ImmutableList.Builder<String>()
.add("You", verb, "to", user, "\"", text, "\"")
.build());
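If adding Guava is not an option, a similar effect can probably be had with String.join, which has been in the standard library since Java 8. This is only a sketch of the same idea; JoinSketch and joinWords are hypothetical names.

```java
final class JoinSketch {
    // Joins the words with single spaces, like Joiner.on(" ") in the answer above.
    static String joinWords(Iterable<String> words) {
        return String.join(" ", words);
    }
}
```

Note that a space separator is placed between every pair of elements, so quote characters passed as separate words end up surrounded by spaces; attaching them to the text beforehand avoids that.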
I will be honest and suggest that you take the first one if you want less typing, or the latter one if you are looking for a more C-style way of doing it.
I sat here for a minute or two pondering the idea of what could be a problem, but I think it comes down to how much you want to type.
Anyone else have an idea?
Assuming you are going to reuse base strings often, store your templates like this:
String mystring = "You $1 to $2 \"$3\"";
Then just get a copy and replace each $X with what you want.
This would work really well for a resource file too.
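One possible way to do the $X replacement in a single pass is sketched below. NumberedTemplate and fill are hypothetical names, and the behaviour for slots without a matching value (they are left untouched) is an assumption made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumberedTemplate {
    // Matches $1, $2, ...; at most 9 digits so the number always fits in an int.
    private static final Pattern SLOT = Pattern.compile("\\$(\\d{1,9})");

    static String fill(String template, String... values) {
        Matcher m = SLOT.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int index = Integer.parseInt(m.group(1)) - 1; // $1 maps to values[0]
            // Slots with no matching value are copied through unchanged (assumption).
            String value = (index >= 0 && index < values.length) ? values[index] : m.group();
            // quoteReplacement keeps '$' and '\' in the value from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Scanning once avoids the trap where replacing $1 first also clobbers the start of $10, and it keeps values that themselves contain "$2" from being substituted again.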
I think String.format looks cleaner.
However you can use StringBuilder and use append function to create the string you want
The best, performance-wise, would probably be to use a StringBuffer.
When should we use + for concatenation of strings, when is StringBuilder preferred, and when is it suitable to use concat?
I've heard StringBuilder is preferable for concatenation within loops. Why is it so?
Thanks.
Modern Java compilers convert your + operations into StringBuilder append calls. That is, if you write str = str1 + str2 + str3, then the compiler will generate the following code:
StringBuilder sb = new StringBuilder();
str = sb.append(str1).append(str2).append(str3).toString();
You can decompile code using DJ or Cavaj to confirm this :)
So now it's more a matter of choice than performance benefit to use + or StringBuilder :)
However, if the compiler does not do this for you (which may happen if you are using some private Java SDK), then StringBuilder is surely the way to go, as you end up avoiding lots of unnecessary String objects.
I tend to use StringBuilder on code paths where performance is a concern. Repeated string concatenation within a loop is often a good candidate.
The reason to prefer StringBuilder is that both + and concat create a new object every time you call them (provided the right hand side argument is not empty). This can quickly add up to a lot of objects, almost all of which are completely unnecessary.
As others have pointed out, when you use + multiple times within the same statement, the compiler can often optimize this for you. However, in my experience this argument doesn't apply when the concatenations happen in separate statements. It certainly doesn't help with loops.
Having said all this, I think top priority should be writing clear code. There are some great profiling tools available for Java (I use YourKit), which make it very easy to pinpoint performance bottlenecks and optimize just the bits where it matters.
P.S. I have never needed to use concat.
From Java/J2EE Job Interview Companion:
String
String is immutable: you can’t modify a String object but can replace it by creating a new instance. Creating a new instance is rather expensive.
//Inefficient version using immutable String
String output = "Some text";
int count = 100;
for (int i = 0; i < count; i++) {
output += i;
}
return output;
The above code would build 100 new String objects (one per loop iteration), of which 99 would be thrown away immediately. Creating new objects is not efficient.
StringBuffer/StringBuilder
StringBuffer is mutable: use StringBuffer or StringBuilder when you want to modify the contents. StringBuilder was added in Java 5 and it is identical in all respects to StringBuffer except that it is not synchronised, which makes it slightly faster at the cost of not being thread-safe.
//More efficient version using mutable StringBuffer
StringBuffer output = new StringBuffer(110);
output.append("Some text");
for (int i = 0; i < count; i++) {
output.append(i);
}
return output.toString();
The above code creates only two new objects, the StringBuffer and the final String that is returned. StringBuffer expands as needed, which is costly however, so it would be better to initialise the StringBuffer with the correct size from the start as shown.
If all concatenated elements are constants (example : "these" + "are" + "constants"), then I'd prefer the +, because the compiler will inline the concatenation for you. Otherwise, using StringBuilder is the most effective way.
If you use + with non-constants, the Compiler will internally use StringBuilder as well, but debugging becomes hell, because the code used is no longer identical to your source code.
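The constant case can be checked with a small sketch. ConstantFolding is a hypothetical class written for this illustration; it relies on the language rule that compile-time constant strings are folded and interned.

```java
final class ConstantFolding {
    // All operands are literals, so this is a compile-time constant expression.
    static final String FOLDED = "these" + "are" + "constants";

    static boolean sameObjectAsLiteral() {
        // Constant strings are interned, so the folded value and the
        // equivalent literal refer to the same String object.
        return FOLDED == "theseareconstants";
    }
}
```

The identity comparison is used here only to demonstrate folding; ordinary code should still compare strings with equals.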
My recommendation would be as follows:
+: Use when concatenating 2 or 3 Strings simply to keep your code brief and readable.
StringBuilder: Use when building up complex String output or where performance is a concern.
String.format: You didn't mention this in your question but it is my preferred method for creating Strings as it keeps the code the most readable / concise in my opinion and is particularly useful for log statements.
concat: I don't think I've ever had cause to use this.
Use StringBuilder if you do a lot of manipulation. Usually a loop is a pretty good indication of this.
The reason for this is that using normal concatenation produces lots of intermediate String objects that can't easily be "extended" (i.e. each concatenation operation produces a copy, requiring memory and CPU time to make). A StringBuilder, on the other hand, only needs to copy the data in some cases (inserting something in the middle, or having to resize because the result becomes too big), so it saves on those copy operations.
Using concat() has no real benefit over using + (it might be ever so slightly faster for a single +, but once you do a.concat(b).concat(c) it will actually be slower than a + b + c).
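Besides speed, there is one behavioural difference worth knowing when choosing between the two, shown in the sketch below: + turns a null operand into the text "null", whereas concat throws a NullPointerException. ConcatVsPlus and its methods are hypothetical names for this example.

```java
final class ConcatVsPlus {
    // String concatenation operator: a null operand becomes the text "null".
    static String viaPlus(String a, String b) {
        return a + b;
    }

    // String.concat: throws NullPointerException if the argument is null.
    static String viaConcat(String a, String b) {
        return a.concat(b);
    }
}
```

For non-null arguments both methods return equal strings.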
Use + for single statements and StringBuilder for multiple statements/ loops.
The performance gain from the compiler applies to concatenating constants.
The remaining uses are actually slower than using StringBuilder directly.
There is no problem with using "+", e.g. for creating a message for an Exception, because it does not happen often and the application is already somehow screwed at that moment. Avoid using "+" in loops.
For creating meaningful messages or other parametrized strings (XPath expressions, e.g.) use String.format - it is much more readable.
I suggest using concat for concatenating two strings and StringBuilder otherwise; see my explanation for concatenation operator (+) vs concat()
What is the style recommendation for the Java string concatenation operator "+"?
Edit: Specifically, should it be used or not?
Thinking in Java (Eckel) says that the overloaded + operator is implemented using StringBuilder (although not all compilers may be supporting this as per alphazero's answer) and thus multiple String objects and the associated memory use and garbage collection are avoided. Given this, I would answer my own question by saying that the + operator is probably fine, style-wise. The only caveat is that the + is the only instance of overloading in the language and that exceptionalism might count as a minor reason not to use it. In retrospect, the advantage of terseness is pretty significant in some situations and that has got to count for a lot of style.
As long as your team members are comfortable with it.
Because there is no "correct" coding style. But I agree that you should always use white-spaces between strings and operator for better readability.
Following Java's coding conventions Strings should be concatenated like:
String str = "Long text line "
+ "more long text.";
Make sure the '+' operator always begins the next line.
https://www.oracle.com/technetwork/java/javase/documentation/codeconventions-136091.html#248
It is perfectly fine to use the '+' operator for String concatenation, there are different libraries that provide other structure for it, but for me it is the most simple way.
Hope this helps!
Happy coding,
Brady
Is this what you meant?
"string1" + "string"
or, if you have long lines
"a really long string....." +
"another really long string......" +
"ditto once again" +
"the last one, I promise"
If you have the time to format this right, then:
"a really long string....." +
"another really long string......" +
"ditto once again" +
"the last one, I promise"
Basically, every time you use the + operator, you should use it with at least one whitespace before and after. If you're using it when concatenating long strings, put it at the end of the line.
The overall recommendation is not to use this form (at all) if performance is of concern, and to instead use StringBuilder or StringBuffer (per your threading model). The reason is simply this: Strings in java are immutable and the '+' operator will create many intermediary String objects when processing expressions of form S1 + S2 + ... + Sn.
[Edit: Optimization of String Concatenation]
In my project there are some code snippets which uses StringBuffer objects, and the small part of it is as follows
StringBuffer str = new StringBuffer();
str.append("new " + "String()");
So I was confused by the use of the append method together with the + operator.
That is, the code above could be written as
str.append("new ").append("String()");
So are the two lines above the same? (Functionally yes, but) is there any particular reason to prefer one over the other, e.g. performance or readability?
thanks.
In that case it's more efficient to use the first form - because the compiler will convert it to:
StringBuffer str = new StringBuffer();
str.append("new String()");
because it concatenates constants.
A few more general points though:
If either of those expressions wasn't a constant, you'd be better off (performance-wise) with the two calls to append, to avoid creating an intermediate string for no reason
If you're using a recent version of Java, StringBuilder is generally preferred
If you're immediately going to append a string (and you know what it is at construction time), you can pass it to the constructor
Actually the bytecode compiler will replace all string concatenations that involve non-constants in a Java program with invocations of StringBuilder. That is
int userCount = 2;
System.out.println("You are the " + userCount + " user");
will be rewritten as
int userCount = 2;
System.out.println(new StringBuilder().append("You are the ").append(userCount).append(" user").toString());
That is at least what is observable when decompiling Java class files compiled with JDK 5 or 6. See this post.
The second form is most efficient in terms of performance because there is only one string object that is created and is appended to the stringbuffer.
The first form creates three string objects 1) for "new" 2)for "new String" 3) for the concatenated result of 1) and 2). and this third string object is concatenated to the string buffer.
Unless you are working with concurrent systems, use StringBuilder instead of StringBuffer. It's faster but not thread-safe :)
It also shares the same API, so it's more or less a straight find/replace.
Given a string with replacement keys in it, how can I most efficiently replace these keys with runtime values, using Java? I need to do this often, fast, and on reasonably long strings (say, on average, 1-2kb). The form of the keys is my choice, since I'm providing the templates here too.
Here's an example (please don't get hung up on it being XML; I want to do this, if possible, cheaper than using XSL or DOM operations). I'd want to replace all #[^#]*?# patterns in this with property values from bean properties, true Property properties, and some other sources. The key here is fast. Any ideas?
<?xml version="1.0" encoding="utf-8"?>
<envelope version="2.3">
<delivery_instructions>
<delivery_channel>
<channel_type>#CHANNEL_TYPE#</channel_type>
</delivery_channel>
<delivery_envelope>
<chan_delivery_envelope>
<queue_name>#ADDRESS#</queue_name>
</chan_delivery_envelope>
</delivery_envelope>
</delivery_instructions>
<composition_instructions>
<mime_part content_type="application/xml">
<content><external_uri>#URI#</external_uri></content>
</mime_part>
</composition_instructions>
</envelope>
The naive implementation is to use String.replaceAll() but I can't help but think that's less than ideal. If I can avoid adding new third-party dependencies, so much the better.
The appendReplacement method in Matcher looks like it might be useful, although I can't vouch for its speed.
Here's the sample code from the Javadoc:
Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());
EDIT: If this is as complicated as it gets, you could probably implement your own state machine fairly easily. You'd pretty much be doing what appendReplacement is already doing, although a specialized implementation might be faster.
It's premature to leap to writing your own. I would start with the naive replace solution, and actually benchmark that. Then I would try a third-party templating solution. THEN I would take a stab at the custom stream version.
Until you get some hard numbers, how can you be sure it's worth the effort to optimize it?
Does Java have a form of regexp replace() where a function gets called?
I'm spoiled by the Javascript String.replace() method. (For that matter you could run Rhino and use Javascript, but somehow I don't think that would be anywhere near as fast as a pure Java call even if the Javascript compiler/interpreter were efficient)
edit: never mind, @mmyers probably has the best answer.
gratuitous point-groveling: (and because I wanted to see if I could do it myself :)
Pattern p = Pattern.compile("#([^#]*?)#");
Matcher m = p.matcher(s);
StringBuffer sb = new StringBuffer();
while (m.find())
{
m.appendReplacement(sb,substitutionTable.lookupKey(m.group(1)));
}
m.appendTail(sb);
// replace "substitutionTable.lookupKey" with your routine
You really want to write something custom so you can avoid processing the string more than once. I can't stress this enough - as most of the other solutions I see look like they are ignoring that problem.
Optionally turn the text into a stream. Read it char by char forwarding each char to an output string/stream until you see the # then read to the next # slurping out the key, substituting the key into the output: repeat until end of stream.
I know it's plain old brute force - but it's probably the best.
I'm assuming you have some reasonable assumption around '#' not just 'showing up' independent of your token keys in the input. :)
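The single-pass idea described above might look roughly like the following. It works on a String with indexOf rather than a character stream, which is a simplification; HashTemplate and render are hypothetical names, and leaving unknown keys in place is a choice made only for this sketch.

```java
import java.util.Map;

final class HashTemplate {
    // Walks the template once, copying literal text and swapping #KEY# for its value.
    static String render(String template, Map<String, String> values) {
        StringBuilder out = new StringBuilder(template.length());
        int pos = 0;
        while (pos < template.length()) {
            int open = template.indexOf('#', pos);
            if (open < 0) break;                  // no more markers
            int close = template.indexOf('#', open + 1);
            if (close < 0) break;                 // unmatched '#': rest is copied verbatim
            out.append(template, pos, open);      // literal text before the key
            String value = values.get(template.substring(open + 1, close));
            if (value != null) {
                out.append(value);
            } else {
                // Unknown key: keep the original "#KEY#" text (assumption).
                out.append(template, open, close + 1);
            }
            pos = close + 1;
        }
        out.append(template, pos, template.length()); // remaining tail
        return out.toString();
    }
}
```

As the answer notes, this only behaves sensibly if '#' never appears in the template outside of key markers.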
please don't get hung up on it being XML; I want to do this, if possible, cheaper than using XSL or DOM operations
Whatever's downstream from your process will get hung up if you don't also process the inserted strings for character escapes. Which isn't to say that you can't do it yourself if you have good cause, but it does mean you have to both make sure your patterns are all in text nodes and correctly escape the replacement text.
What exact advantage does #Foo# have over the standard &Foo; syntax already built into the XML libraries which ship with Java?
Text processing is always going to be bounded if you don't shift your paradigm. I don't know how flexible your domain is, so I'm not sure if this is applicable, but here goes:
try creating an index into where your text substitution is - this is especially good if the template doesn't change often, because it becomes part of the "compile" of the template, into a binary object that can take in the value required for the substitutions, and blit out the entire string as a byte array. This object can be cached/saved, and next time, resubstitute in the new value to use again. I.e., you save on parsing the document every time. (implementation is left as an exercise to the reader =D )
But please use a profiler to check whether this is actually the bottleneck that you say it is before embarking on writing a custom templating engine. The problem may actually be elsewhere.
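A minimal sketch of the "compile once, render many times" idea follows. It keeps Strings rather than byte arrays, so it is simpler than what the answer describes; CompiledTemplate is a hypothetical class, and rendering missing keys as empty text is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class CompiledTemplate {
    // Alternating pieces: even indexes hold literal text, odd indexes hold keys.
    private final List<String> parts = new ArrayList<>();

    // Parses the template once; the '#' markers are located only here.
    CompiledTemplate(String template) {
        int pos = 0;
        while (true) {
            int open = template.indexOf('#', pos);
            int close = open < 0 ? -1 : template.indexOf('#', open + 1);
            if (close < 0) break;
            parts.add(template.substring(pos, open));       // literal
            parts.add(template.substring(open + 1, close)); // key
            pos = close + 1;
        }
        parts.add(template.substring(pos)); // trailing literal
    }

    // Rendering only appends pre-split pieces; no searching is needed.
    String render(Map<String, String> values) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            String part = parts.get(i);
            if (i % 2 == 0) {
                out.append(part);
            } else {
                out.append(values.getOrDefault(part, "")); // missing key -> empty (assumption)
            }
        }
        return out.toString();
    }
}
```

An instance can be cached and reused with different value maps, which is where any saving over re-parsing would come from; whether that saving matters is something only a profiler can tell.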
As others have said, appendReplacement() and appendTail() are the tools you need, but there's something you have to watch out for. If the replacement string contains any dollar signs, the method will try to interpret them as capture-group references. If there are any backslashes (which are used to escape the dollar sign), it will either eat them or throw an exception.
If your replacement string is dynamically generated, you may not know in advance whether it will contain any dollar signs or backslashes. To prevent problems, you can append the replacement directly to the StringBuffer, like so:
Pattern p = Pattern.compile("#([^#]*?)#");
Matcher m = p.matcher(s);
StringBuffer sb = new StringBuffer();
while (m.find())
{
m.appendReplacement(sb, "");
sb.append(substitutionTable.lookupKey(m.group(1)));
}
m.appendTail(sb);
You still have to call appendReplacement() each time, because that's what keeps you in sync with the match position. But this trick avoids a lot of pointless processing, which could give you a noticeable performance boost as a bonus.
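Another way to sidestep the dollar-sign and backslash problem is Matcher.quoteReplacement, which escapes the replacement text so that it is inserted literally. The sketch below assumes a plain Map in place of the substitutionTable used above; SafeReplace and its render method are hypothetical names, and missing keys are rendered as empty text for simplicity.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SafeReplace {
    private static final Pattern KEY = Pattern.compile("#([^#]*?)#");

    static String render(String s, Map<String, String> table) {
        Matcher m = KEY.matcher(s);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = table.getOrDefault(m.group(1), ""); // missing key -> empty (assumption)
            // quoteReplacement escapes '$' and '\' so the value is taken literally.
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

This does a little more work per match than appending directly to the buffer, so it trades some of the speed benefit for keeping the replacement logic in one call.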
this is what I use, from the apache commons project
http://commons.apache.org/lang/api/org/apache/commons/lang/text/StrSubstitutor.html
I also have a non-regexp based substitution library, available here. I have not tested its speed, and it doesn't directly support the syntax in your example. But it would be easy to extend to support that syntax; see, for instance, this class.
Take a look at a library that specializes in this, e.g., Apache Velocity. If nothing else, you can bet their implementation for this part of the logic is fast.
I wouldn't be so sure the accepted answer is faster than String.replaceAll(String,String). Here, for comparison, is the implementation of String.replaceAll and the Matcher.replaceAll that is used under the covers. It looks very similar to what the OP is looking for, and I'm guessing it's probably more optimized than this simplistic solution.
public String replaceAll(String s, String s1)
{
return Pattern.compile(s).matcher(this).replaceAll(s1);
}
public String replaceAll(String s)
{
reset();
boolean flag = find();
if(flag)
{
StringBuffer stringbuffer = new StringBuffer();
boolean flag1;
do
{
appendReplacement(stringbuffer, s);
flag1 = find();
} while(flag1);
appendTail(stringbuffer);
return stringbuffer.toString();
} else
{
return text.toString();
}
}
... Chii is right.
If this is a template that has to be run so many times that speed matters, find the index of your substitution tokens to be able to get to them directly without having to start at the beginning each time. Abstract the 'compilation' into an object with the nice properties, they should only need updating after a change to the template.
Rythm, a Java template engine, has now been released with a new feature called String interpolation mode, which allows you to do something like:
String result = Rythm.render("Hello @who!", "world");
The above case shows that you can pass arguments to a template by position. Rythm also allows you to pass arguments by name:
Map<String, Object> args = new HashMap<String, Object>();
args.put("title", "Mr.");
args.put("name", "John");
String result = Rythm.render("Hello @title @name", args);
Since your template content is relatively long, you could put it into a file and then call Rythm.render using the same API:
Map<String, Object> args = new HashMap<String, Object>();
// ... prepare the args
String result = Rythm.render("path/to/my/template.xml", args);
Note that Rythm compiles your template into Java bytecode and is fairly fast, about 2 times faster than String.format.
Is there a more elegant way of doing this in Java?
String value1 = "Testing";
String test = "text goes here " + value1 + " more text";
Is it possible to put the variable directly in the string and have its value evaluated?
String test = String.format("test goes here %s more text", "Testing");
is the closest thing that you could write in Java
A more elegant way might be:
String value = "Testing";
String template = "text goes here %s more text";
String result = String.format(template, value);
Or alternatively using MessageFormat:
String template = "text goes here {0} more text";
String result = MessageFormat.format(template, value);
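Put together as a compilable sketch, the two template styles look like this. TemplateStyles and its method names are made up for the illustration; String.format and java.text.MessageFormat are the standard library calls shown above.

```java
import java.text.MessageFormat;

final class TemplateStyles {
    // printf-style placeholder: %s
    static String withFormat(String value) {
        return String.format("text goes here %s more text", value);
    }

    // Indexed placeholder: {0}
    static String withMessageFormat(String value) {
        return MessageFormat.format("text goes here {0} more text", value);
    }
}
```

One thing to watch with MessageFormat is that single quotes have special meaning in its patterns, so a literal apostrophe has to be written as two single quotes.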
Note, if you're doing this for logging, then you can avoid the cost of performing this when the log line would be below the threshold. For example, with SLF4J:
The following two lines will yield the exact same output. However, the second form will outperform the first form by a factor of at least 30, in case of a disabled logging statement.
logger.debug("The new entry is "+entry+".");
logger.debug("The new entry is {}.", entry);
It may be done by some template libraries. But beware, Strings are immutable in Java. So in every case, at some low level, the concatenation will be done.
You'll always have to use some form of concatenation for this (assuming value1 isn't a constant like you show here).
The way you've written it will implicitly construct a StringBuilder and use it to concatenate the strings. Another method is String.format(String, Object...), which is analogous to sprintf from C. But even with format(), you can't avoid concatenation.
What you want is called String interpolation. It is not possible in Java, although JRuby, Groovy and probably other JVM languages do that.
Edit: as for elegance, you can use a StringBuffer or check the other poster's solution. But at the low level, this will always be concatenation, as the other posters said.
You can use this free library. It gives you sprintf like functionality. Or use String.format static method provided you use Java 5 or newer.
Why do you think string concatenation isn't elegant?
If all you are doing is simple concatenation, I'd argue that code readability is more important and I'd leave it like you have it. It's more readable than using a StringBuilder.
Performance won't be the problem that most people think it is.
Read this from CodingHorror
I would use a StringBuffer. It's a common practice when you are dealing with strings. It may seem a bit odd when you see it for the first time, but you'll quickly get used to it.
String test = new StringBuffer("text goes here ").append(value1).append(" more text").toString();
Strings are immutable, thus a new instance is created after every concatenation. This can cause performance issues when used in loops.
StringBuffer is a mutable version of String - that means you can create one, modify it as you want, and still have only one instance. When desired, you can get a String representation of the StringBuffer by calling its toString() method.
The problem is not whether this is an elegant way or not. The idea behind using a template system may be that you put your template in a normal text file and don't have to change Java code if you change your message (or think about i18n).