VisualVM - strange self time - java

Today I got confused by the results of VisualVM profiling.
I have the following simple Java method:
public class Encoder {
    ...
    private BitString encode(InputStream in, Map<Character, BitString> table)
            throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        BitString result = new BitString();
        int i;
        while ((i = reader.read()) != -1) {
            char ch = (char) i;
            BitString string = table.get(ch);
            result = result.append(string);
        }
        return result;
    }
}
This method reads characters from a stream, one at a time. For each character it looks up its bit-string representation and joins these bit-strings to represent the whole stream.
A BitString is a custom data structure that represents a sequence of bits using an underlying byte array.
The method performs very poorly. The problem lies with BitString#append - that method creates a new byte array, copies the bits from both input BitStrings, and returns them as a new BitString instance. Since each call copies everything accumulated so far, the loop as a whole is quadratic in the input size.
public BitString append(BitString other) {
    BitString result = new BitString(size + other.size);
    int pos = 0;
    for (byte b : this) {
        result.set(pos, b);
        pos++;
    }
    for (byte b : other) {
        result.set(pos, b);
        pos++;
    }
    return result;
}
However, when I tried to use VisualVM to verify what's happening, here's what I got:
I have very little experience with VisualVM and profiling in general. To my understanding, this looks as if the problem lay somewhere in the encode method itself, not in append.
To be sure, I surrounded the whole encode method and the append call with custom time measurements (the added lines are marked with >>), like this:
public class Encoder {
    private BitString encode(InputStream in, Map<Character, BitString> table)
            throws IOException {
>>      long startTime = System.currentTimeMillis();
>>      long appendDuration = 0;
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        BitString result = new BitString();
        int i;
>>      long count = 0;
        while ((i = reader.read()) != -1) {
            char ch = (char) i;
            BitString string = table.get(ch);
>>          long appendStartTime = System.currentTimeMillis();
            result = result.append(string);
>>          long appendEndTime = System.currentTimeMillis();
>>          appendDuration += appendEndTime - appendStartTime;
>>          count++;
>>          if (count % 1000 == 0) {
>>              log.info(">>> CHARACTERS PROCESSED: " + count);
>>              long endTime = System.currentTimeMillis();
>>              log.info(">>> TOTAL ENCODE DURATION: " + (endTime - startTime) + " ms");
>>              log.info(">>> TOTAL APPEND DURATION: " + appendDuration + " ms");
>>          }
        }
        return result;
    }
}
And I got the following results:
CHARACTERS PROCESSED: 102000
TOTAL ENCODE DURATION: 188276 ms
TOTAL APPEND DURATION: 188179 ms
This seems to contradict the results from VisualVM.
What am I missing?

You're seeing this behavior because VisualVM can only sample the call stack at safepoints, and the JVM is optimizing the safepoints out of your code. This causes the samples to be lumped together under "Self time" instead, which makes it artificially inflated and misleading. There are two possible fixes:
To make VisualVM work better, add JVM options to keep more safepoints, like -XX:-Inline and -XX:+UseCountedLoopSafepoints. These will slow your code down some, but will make the profiling results much more accurate. This solution is easy, and it's usually good enough. Just remember to remove these options when you're not profiling!
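For example (app.jar here is just a placeholder for however you launch your application):
java -XX:-Inline -XX:+UseCountedLoopSafepoints -jar app.jar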
If you don't mind switching profilers, you can use Java Mission Control or honest-profiler. These have the ability to take samples at places other than safepoints, using a special API of the JVM. This solution is a little more accurate, since you're profiling the fully optimized code, but in my opinion those profilers are harder to use than VisualVM.
In your specific case, the method call to BitString.append() is probably being inlined by the JVM for performance. This causes the safepoint that's normally at the end of a method call to be removed, which means that method won't show up in the profiler anymore.
There's an excellent blog post here with more details about what safepoints are and how they work, and another one here that talks in more detail about the interaction between safepoints and sampling profilers.

[THIS ANSWER IS INVALID, but I'm keeping it until the OP gets help from another member, because the comments on this post help others understand the problem.]
In this case VisualVM measured actual CPU time, but the value you measured is elapsed (wall-clock) time.
If the executing threads had to wait for I/O or the network, that time would not be measured as CPU time.
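To see the difference between the two clocks for yourself, you can compare wall-clock time against the per-thread CPU time reported by ThreadMXBean. A minimal sketch (the class name is mine, and the sleep is just a stand-in for any blocking wait, such as the reader.read() calls above):
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuVsElapsed {
    public static void main(String[] args) throws InterruptedException {
        // assumes CPU time measurement is supported (true on typical desktop JVMs)
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long cpu0 = bean.getCurrentThreadCpuTime();   // CPU time, in nanoseconds
        long wall0 = System.currentTimeMillis();      // wall-clock time, in milliseconds

        Thread.sleep(1000);                           // blocked: consumes no CPU

        long cpuMs = (bean.getCurrentThreadCpuTime() - cpu0) / 1_000_000;
        long wallMs = System.currentTimeMillis() - wall0;
        // Expect roughly: elapsed ~1000 ms, cpu ~0 ms
        System.out.println("elapsed: " + wallMs + " ms, cpu: " + cpuMs + " ms");
    }
}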

Related

System.out.print consumes too much memory when printing to console. Is it possible to reduce?

I have a simple program:
public class Test {
    public static void main(String[] args) {
        for (int i = 0; i < 1_000_000; i++) {
            System.out.print(1);
        }
    }
}
I launched the profiler. Here are the results:
I assume that memory grows because of calls to this method:
public void print(int i) {
    write(String.valueOf(i));
}
Is there a way to print int values in the console without memory drawdown?
On my local machine I tried adding if (i % 10000 == 0) System.gc(); to the loop, and memory consumption evened out. But the system that checks the solution still doesn't accept it. I tried changing the step value, but it still doesn't pass, either on memory (it should use less than 20 MB) or on time (< 1 sec).
EDIT: I tried this
String str = "hz";
for (int i = 0; i < 1_000_0000; i++) {
    System.out.print(str);
}
But I got the same result:
EDIT 2: If I write this code
public class Test {
    public static void main(String[] args) {
        byte[] bytes = "hz".getBytes();
        for (int i = 0; i < 1_000_0000; i++) {
            System.out.write(bytes, 0, bytes.length);
        }
    }
}
I get this:
Therefore, I do not believe the JVM itself is producing this garbage; it would show up in both cases.
You need to convert the int into characters without generating a new String each time you do it. This could be done in a couple of ways:
Write a custom "int to characters" method that converts to ASCII bytes in a byte[] (see @AndyTurner's example code below). Then write the byte[]. And repeat.
Use a ByteBuffer, fill it directly using a custom "int to characters" converter method, and use a Channel to output the bytes when the buffer is full. And repeat. (A sketch of this approach follows at the end of this answer.)
If done correctly, you should be able to output the numbers without generating any garbage ... other than your once-off buffers.
Note that System.out is a PrintStream wrapping a BufferedOutputStream wrapping a FileOutputStream. And when you output a String, directly or indirectly, using one of the print methods, it actually goes through a BufferedWriter that is internal to the PrintStream. It is complicated ... and apparently the print(String) method generates garbage somewhere in that complexity.
Concerning your EDIT 1: when you repeatedly print out a constant string, you are still apparently generating garbage. I was surprised by this, but I guess it is happening in the BufferedWriter.
Concerning your EDIT 2: when you repeatedly write from a byte[], the garbage generation all but disappears. This confirms that at least one of my suggestions will work.
However, since you are monitoring the JVM with an external profile, your JVM is also running an agent that is periodically sending updates to your profiler. That agent will most likely be generating a small amount of garbage. And there could be other sources of garbage in the JVM; e.g. if you have JVM GC logging enabled.
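For what it's worth, here is a minimal sketch of the second suggestion above (ByteBuffer plus Channel). The class name, buffer size, and digit-conversion routine are all mine, and this is only one possible way to write it:
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

public class ChannelIntPrinter {
    public static void main(String[] args) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024);
        try (WritableByteChannel out = Channels.newChannel(new FileOutputStream(FileDescriptor.out))) {
            for (int n = 0; n < 1_000_000; n++) {
                if (buf.remaining() < 11) {       // worst case: "-2147483648"
                    drain(buf, out);
                }
                putDigits(buf, 1);                // the OP's loop prints the constant 1
            }
            drain(buf, out);
        }
    }

    private static void drain(ByteBuffer buf, WritableByteChannel out) throws IOException {
        buf.flip();
        while (buf.hasRemaining()) {
            out.write(buf);
        }
        buf.clear();
    }

    // Converts i to ASCII digits directly in the buffer; no String is created.
    private static void putDigits(ByteBuffer buf, int i) {
        long v = i;
        if (v < 0) {                              // long arithmetic avoids MIN_VALUE overflow
            buf.put((byte) '-');
            v = -v;
        }
        long div = 1;
        while (v / div >= 10) {
            div *= 10;
        }
        for (; div > 0; div /= 10) {
            buf.put((byte) ('0' + (int) ((v / div) % 10)));
        }
    }
}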
Since you have discovered that printing a byte[] keeps memory allocation within the required bounds, you can use this fact:
Allocate a byte array the length of the ASCII representation of Integer.MIN_VALUE (11 characters, the longest an int can be). Then you can fill the array backwards to convert a number i:
int p = buffer.length;
if (i == Integer.MIN_VALUE) {
    // -Integer.MIN_VALUE overflows an int, so peel off the last digit first
    buffer[--p] = (byte) ('0' - i % 10);
    i /= 10;
}
boolean neg = i < 0;
if (neg) i = -i;
do {
    buffer[--p] = (byte) ('0' + i % 10);
    i /= 10;
} while (i != 0);
if (neg) buffer[--p] = '-';
Then write this to your stream:
out.write(buffer, p, buffer.length - p);
You can reuse the same buffer to write as many numbers as you wish.
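Wrapped into a self-contained class, it could look like the sketch below (the class name, the buffered stream, and the main method are my additions, just to make the snippet runnable):
import java.io.BufferedOutputStream;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class IntWriter {
    private final byte[] buffer = new byte[11];   // 11 = length of "-2147483648"
    private final OutputStream out;

    public IntWriter(OutputStream out) {
        this.out = out;
    }

    // The conversion from above, followed by the single write call.
    public void write(int i) throws IOException {
        int p = buffer.length;
        if (i == Integer.MIN_VALUE) {             // -MIN_VALUE overflows, peel off one digit
            buffer[--p] = (byte) ('0' - i % 10);
            i /= 10;
        }
        boolean neg = i < 0;
        if (neg) i = -i;
        do {
            buffer[--p] = (byte) ('0' + i % 10);
            i /= 10;
        } while (i != 0);
        if (neg) buffer[--p] = '-';
        out.write(buffer, p, buffer.length - p);
    }

    public static void main(String[] args) throws IOException {
        OutputStream out = new BufferedOutputStream(new FileOutputStream(FileDescriptor.out), 64 * 1024);
        IntWriter w = new IntWriter(out);
        for (int i = 0; i < 1_000_000; i++) {
            w.write(1);
        }
        out.flush();
    }
}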
The pattern of memory usage is typical for Java; your code is mostly irrelevant to it. To control Java memory usage you need to use some -X parameters; for example, "-Xms512m -Xmx512m" will set both the minimum and maximum heap size to 512 MB. BTW, to minimize the sawtooth shape of the memory graph, it is recommended to set the min and max sizes to the same value. Those params are given to java on the command line when you run your program, for example:
java -Xms512m -Xmx512m myProgram
There are other ways as well. Here is one link where you can read more about it: Oracle docs. There are other params that control stack size and some other things. Code written without memory-usage considerations may influence memory usage as well, but in your case the code is too trivial to do anything. Most memory concerns are addressed by configuring the JVM memory-usage params.

Why is BufferedReader read() much slower than readLine()?

I need to read a file one character at a time and I'm using the read() method from BufferedReader. *
I found that read() is about 10x slower than readLine(). Is this expected? Or am I doing something wrong?
Here's a benchmark with Java 7. The input test file has about 5 million lines and 254 million characters (~242 MB) **:
The read() method takes about 7000 ms to read all the characters:
@Test
public void testRead() throws IOException, UnindexableFastaFileException {
    BufferedReader fa = new BufferedReader(new FileReader(new File("chr1.fa")));
    long t0 = System.currentTimeMillis();
    int c;
    while ((c = fa.read()) != -1) {
        //
    }
    long t1 = System.currentTimeMillis();
    System.err.println(t1 - t0); // ~ 7000 ms
}
The readLine() method takes only ~700 ms:
@Test
public void testReadLine() throws IOException {
    BufferedReader fa = new BufferedReader(new FileReader(new File("chr1.fa")));
    String line;
    long t0 = System.currentTimeMillis();
    while ((line = fa.readLine()) != null) {
        //
    }
    long t1 = System.currentTimeMillis();
    System.err.println(t1 - t0); // ~ 700 ms
}
* Practical purpose: I need to know the length of each line, including the newline characters (\n or \r\n) AND the line length after stripping them. I also need to know whether a line starts with the > character. For a given file this is done only once, at the start of the program. Since EOL chars are not returned by BufferedReader.readLine(), I'm resorting to the read() method. If there are better ways of doing this, please say.
** The gzipped file is here http://hgdownload.cse.ucsc.edu/goldenpath/hg19/chromosomes/chr1.fa.gz. For those who may be wondering, I'm writing a class to index fasta files.
The important thing when analyzing performance is to have a valid benchmark before you start. So let's start with a simple JMH benchmark that shows what our expected performance after warmup would be.
One thing we have to consider is that modern operating systems like to cache file data that is accessed regularly, so we need some way to clear the caches between tests. On Windows there's a small utility that does just this; on Linux you should be able to do it by writing to the appropriate pseudo file (e.g. /proc/sys/vm/drop_caches).
The code then looks as follows:
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

@BenchmarkMode(Mode.AverageTime)
@Fork(1)
public class IoPerformanceBenchmark {
    private static final String FILE_PATH = "test.fa";

    @Benchmark
    public int readTest() throws IOException, InterruptedException {
        clearFileCaches();
        int result = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(FILE_PATH))) {
            int value;
            while ((value = reader.read()) != -1) {
                result += value;
            }
        }
        return result;
    }

    @Benchmark
    public int readLineTest() throws IOException, InterruptedException {
        clearFileCaches();
        int result = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(FILE_PATH))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result += line.chars().sum();
            }
        }
        return result;
    }

    private void clearFileCaches() throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("EmptyStandbyList.exe", "standbylist");
        pb.inheritIO();
        pb.start().waitFor();
    }
}
and if we run it with
chcp 65001 # set codepage to utf-8
mvn clean install; java "-Dfile.encoding=UTF-8" -server -jar .\target\benchmarks.jar
we get the following results (about 2 seconds are needed to clear the caches for me and I'm running this on a HDD so that's why it's a good deal slower than for you):
Benchmark Mode Cnt Score Error Units
IoPerformanceBenchmark.readLineTest avgt 20 3.749 ± 0.039 s/op
IoPerformanceBenchmark.readTest avgt 20 3.745 ± 0.023 s/op
Surprise! As expected, there's no performance difference here at all after the JVM has settled into a steady state. But there is one outlier in the readTest method:
# Warmup Iteration 1: 6.186 s/op
# Warmup Iteration 2: 3.744 s/op
which is exactly the problem you're seeing. The most likely reason I can think of is that OSR (on-stack replacement) isn't doing a good job here, or that the JIT is simply kicking in too late to make a difference on the first iteration.
Depending on your use case this might be a big problem or negligible (if you're reading a thousand files it won't matter, if you're only reading one this is a problem).
Solving such a problem is not easy and there are no general solutions, although there are ways to handle it. One easy test to see if we're on the right track is to run the code with the -Xcomp option, which forces HotSpot to compile every method on its first invocation. And indeed, doing so causes the large delay at the first invocation to disappear:
# Warmup Iteration 1: 3.965 s/op
# Warmup Iteration 2: 3.753 s/op
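(For reference, that is the same command line as before with the extra flag added, something like:
java -Xcomp "-Dfile.encoding=UTF-8" -server -jar .\target\benchmarks.jar
Keep in mind that -Xcomp is a diagnostic tool here, not a setting you'd want in production.)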
Possible solution
Now that we have a good idea what the actual problem is (my guess is still that all those locks are neither being coalesced nor using the efficient biased-locking implementation), the solution is rather straightforward: reduce the number of function calls. (Yes, we could have arrived at this solution without everything above, but it's always nice to have a good grip on the problem, and there might have been a solution that didn't involve changing much code.)
The following code runs consistently faster than either of the other two - you can play with the array size, but it's surprisingly unimportant (presumably because, contrary to the other methods, read(char[]) does not have to acquire a lock for every character read, so the cost per call is lower to begin with).
private static final int BUFFER_SIZE = 256;
private char[] arr = new char[BUFFER_SIZE];

@Benchmark
public int readArrayTest() throws IOException, InterruptedException {
    clearFileCaches();
    int result = 0;
    try (BufferedReader reader = new BufferedReader(new FileReader(FILE_PATH))) {
        int charsRead;
        while ((charsRead = reader.read(arr)) != -1) {
            for (int i = 0; i < charsRead; i++) {
                result += arr[i];
            }
        }
    }
    return result;
}
This is most likely good enough performance-wise, but if you wanted to improve performance even further, using a file mapping might help (I wouldn't count on too large an improvement in a case such as this, but if you know that your text is always ASCII, you could make some further optimizations).
So this is the practical answer to my own question: don't use BufferedReader.read(); use FileChannel instead. (Obviously I'm not answering the WHY I put in the title.) Here's the quick and dirty benchmark; hopefully others will find it useful:
@Test
public void testFileChannel() throws IOException {
    FileChannel fileChannel = FileChannel.open(Paths.get("chr1.fa"));
    long n = 0;
    int noOfBytesRead = 0;
    long t0 = System.nanoTime();
    while (noOfBytesRead != -1) {
        ByteBuffer buffer = ByteBuffer.allocate(10000);
        noOfBytesRead = fileChannel.read(buffer);
        buffer.flip();
        while (buffer.hasRemaining()) {
            char x = (char) buffer.get();
            n++;
        }
    }
    long t1 = System.nanoTime();
    System.err.println((float) (t1 - t0) / 1e6); // ~ 250 ms
    System.err.println("nchars: " + n); // 254235640 chars read
}
With ~250 ms to read the whole file char by char, this strategy is considerably faster than BufferedReader.readLine() (~700 ms), let alone read(). Adding if statements in the loop to check for x == '\n' and x == '>' makes little difference, and putting in a StringBuilder to reconstruct lines doesn't affect the timing much either. So this is plenty good for me (at least for now).
Thanks to @Marco13 for mentioning FileChannel.
Java JIT optimizes away empty loop bodies, so your loops actually look like this:
while((c = fa.read()) != -1);
and
while((line = fa.readLine()) != null);
I suggest you read up on benchmarking here and the optimization of the loops here.
As to why the time taken differs:
Reason one (this only applies if the bodies of the loops contain code): in the first example, you're doing one operation per character; in the second, one per line. This adds up the more lines/characters you have.
while ((c = fa.read()) != -1) {
    // One operation per character.
}
while ((line = fa.readLine()) != null) {
    // One operation per line.
}
Reason two: in the BufferedReader class, readLine() doesn't use read() behind the scenes - it uses its own code. readLine() does fewer operations per character to read a line than it would take to read the same line with read() - this is why readLine() is faster at reading an entire file.
Reason three: it takes far more iterations to read a file character by character than line by line (unless each character is on its own line); read() is called many more times than readLine().
Thanks @Voo for the correction. What I mentioned below is correct from the FileReader#read() vs. BufferedReader#readLine() point of view, BUT not correct from the BufferedReader#read() vs. BufferedReader#readLine() point of view, so I have struck out the answer.
Using the read() method on BufferedReader is not a good idea; it wouldn't cause you any harm, but it certainly wastes the purpose of the class.
The whole purpose in life of BufferedReader is to reduce I/O by buffering the content. You can read about it in the Java tutorials. You may also notice that read() in BufferedReader is actually inherited from Reader, while readLine() is BufferedReader's own method.
If you want to use the read() method, then I would say you'd better use FileReader, which is meant for that purpose. You can read about it in the Java tutorials.
So, I think the answer to your question is very simple (without going into benchmarking and all those explanations):
Each read() is handled by the underlying OS and triggers disk access, network activity, or some other relatively expensive operation.
When you use readLine(), you save all this overhead, so readLine() will always be faster than read() - perhaps not substantially for small data, but faster.
It is not surprising to see this difference if you think about it: one test is iterating over the lines of a text file, while the other is iterating over its characters.
Unless each line contains exactly one character, it is expected that readLine() is way faster than read(). (Although, as pointed out by the comments above, it is arguable, since a BufferedReader buffers the input and the physical file reading might not be the only performance-relevant operation.)
If you really want to test the difference between the two, I would suggest a setup where you iterate over each character in both tests. E.g. something like:
void readTest(BufferedReader r) throws IOException {
    int c;
    StringBuilder b = new StringBuilder();
    while ((c = r.read()) != -1)
        b.append((char) c);
}

void readLineTest(BufferedReader r) throws IOException {
    String line;
    StringBuilder b = new StringBuilder();
    while ((line = r.readLine()) != null)
        for (int i = 0; i < line.length(); i++)
            b.append(line.charAt(i));
}
Besides the above, please use a Java performance diagnostic tool to benchmark your code, and read up on how to microbenchmark Java code.
According to the documentation:
Every read() method call makes an expensive system call.
Every readLine() method call still makes an expensive system call, however, for more bytes at once, so there are fewer calls.
A similar situation happens when we issue a database update command for each record we want to update, versus a batch update, where we make one call for all the records.

High Level Java Optimization

There are many questions, answers, and opinions about how to do low-level Java optimization with for, while, and do-while loops, and whether it's even necessary.
My question is more about high-level optimization in design. Let's assume I have to do the following:
For a given string input, count the occurrence of each letter in the string.
This is not a major problem when the string is a few sentences, but what if instead we want to count the occurrence of each word in a 900,000-word file? Building loops just wastes time.
So what is the high-level design pattern that can be applied to this type of problem?
I guess my major point is that I tend to use loops to solve many problems, and I would like to get out of the habit of using loops.
thanks in advance
Sam
P.S. If possible, can you produce some pseudocode for solving the 900,000-word file problem? I tend to understand code better than English, which I assume is the same for most visitors of this site.
The word count problem is one of the most widely covered problems in the Big Data world; it's kind of the Hello World of frameworks like Hadoop. You can find ample information throughout the web on this problem.
I'll give you some thoughts on it anyway.
First, 900,000 words might still be small enough to build a hash map for, so don't discount the obvious in-memory approach. You said pseudocode is fine, so:
h = new HashMap<String, Integer>();
for each word w picked up while tokenizing the file {
    h[w] = w in h ? h[w] + 1 : 1
}
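In actual Java (8+), the same idea is a few lines with HashMap.merge; a minimal sketch, where the file name words.txt and the class name are my own placeholders:
import java.io.IOException;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

public class WordCount {
    public static void main(String[] args) throws IOException {
        Map<String, Integer> counts = new HashMap<>();
        try (Scanner in = new Scanner(Paths.get("words.txt"))) {
            while (in.hasNext()) {
                // merge() inserts 1 for a new word, or adds 1 to the existing count
                counts.merge(in.next().toLowerCase(), 1, Integer::sum);
            }
        }
        counts.forEach((word, count) -> System.out.println(word + " " + count));
    }
}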
Now once your dataset is too large to build an in-memory hashmap, you can do your counting like so:
Tokenize into words, writing each word to its own line in a file
Use the Unix sort command to produce the next file
Count as you traverse the sorted file
These three steps go in a Unix pipeline. Let the OS do the work for you here.
Now, as you get even more data, you want to bring in map-reduce frameworks like hadoop to do the word counting on clusters of machines.
Now, I've heard that when you get into obscenely large datasets, doing things in a distributed environment doesn't help anymore, because the transmission time overwhelms the counting time, and in the case of word counting everything has to "be put back together anyway", so then you have to use some very sophisticated techniques that I suspect you can find in research papers.
ADDENDUM
The OP asked for an example of tokenizing the input in Java. Here is the easiest way:
import java.util.Scanner;

public class WordGenerator {
    /**
     * Tokenizes standard input into words, writing each word to standard output,
     * one per line. Because it reads from standard input and writes to standard
     * output, it can easily be used in a pipeline combined with sort, uniq, and
     * any other such application.
     */
    public static void main(String[] args) {
        Scanner input = new Scanner(System.in);
        while (input.hasNext()) {
            System.out.println(input.next().toLowerCase());
        }
    }
}
Now here is an example of using it:
echo -e "Hey Moe! Woo\nwoo woo nyuk-nyuk why soitenly. Hey." | java WordGenerator
This outputs
hey
moe!
woo
woo
woo
nyuk-nyuk
why
soitenly.
hey.
You can combine this tokenizer with sort and uniq like so:
echo -e "Hey Moe! Woo\nwoo woo nyuk-nyuk why soitenly. Hey." | java WordGenerator | sort | uniq
Yielding
hey
hey.
moe!
nyuk-nyuk
soitenly.
why
woo
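And to get actual counts out of the pipeline (the counting step from the recipe above), uniq can do the tallying for you with its -c flag:
echo -e "Hey Moe! Woo\nwoo woo nyuk-nyuk why soitenly. Hey." | java WordGenerator | sort | uniq -c
Yielding
1 hey
1 hey.
1 moe!
1 nyuk-nyuk
1 soitenly.
1 why
3 woo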
Now if you only want to keep letters and throw away all punctuation, digits and other characters, change your scanner definition line to:
Scanner input = new Scanner(System.in).useDelimiter(Pattern.compile("\\P{L}"));
And now
echo -e "Hey Moe! Woo\nwoo woo^nyuk-nyuk why#2soitenly. Hey." | java WordGenerator | sort | uniq
Yields
hey
moe
nyuk
soitenly
why
woo
There is a blank line in the output; I'll let you figure out how to whack it. :)
The fastest solution is O(n), AFAIK: use a loop to iterate over the string, get each character, and update the count in a HashMap accordingly. At the end the HashMap contains all the characters that occurred and a count of the occurrences of each.
Some pseudo-code (may not compile):
HashMap<Character, Integer> map = new HashMap<Character, Integer>();
for (int i = 0; i < str.length(); i++) {
    char c = str.charAt(i);
    if (map.containsKey(c)) map.put(c, map.get(c) + 1);
    else map.put(c, 1);
}
It's hard to do much better than using a loop to solve this problem. IMO, the best way to speed up this sort of operation is to split the workload into different units of work and process the units with different processors (using threads, for example, if you have a multiprocessor computer).
You shouldn't assume 900,000 is a lot of words. If you have a CPU with 8 threads at 3 GHz, that's 24 billion clock cycles per second. ;)
However, for counting characters, using an int[] will be much faster. There are only 65,536 possible char values.
StringBuilder words = new StringBuilder();
Random rand = new Random();
for (int i = 0; i < 10 * 1000 * 1000; i++)
    words.append(Long.toString(rand.nextLong(), 36)).append(' ');
String text = words.toString();

long start = System.nanoTime();
int[] charCount = new int[Character.MAX_VALUE];
for (int i = 0; i < text.length(); i++)
    charCount[text.charAt(i)]++;
long time = System.nanoTime() - start;
System.out.printf("Took %,d ms to count %,d characters%n", time / 1000 / 1000, text.length());
prints
Took 111 ms to count 139,715,647 characters
Even 11x the number of words takes only a fraction of a second.
A much longer parallel version is a little faster.
public static void main(String... args) throws InterruptedException, ExecutionException {
    StringBuilder words = new StringBuilder();
    Random rand = new Random();
    for (int i = 0; i < 10 * 1000 * 1000; i++)
        words.append(Long.toString(rand.nextLong(), 36)).append(' ');
    final String text = words.toString();

    long start = System.nanoTime();

    // start a thread pool to generate 4 tasks to count sections of the text.
    final int nThreads = 4;
    ExecutorService es = Executors.newFixedThreadPool(nThreads);
    List<Future<int[]>> results = new ArrayList<Future<int[]>>();
    int blockSize = (text.length() + nThreads - 1) / nThreads;
    for (int i = 0; i < nThreads; i++) {
        final int min = i * blockSize;
        final int max = Math.min(min + blockSize, text.length());
        results.add(es.submit(new Callable<int[]>() {
            @Override
            public int[] call() throws Exception {
                int[] charCount = new int[Character.MAX_VALUE];
                for (int j = min; j < max; j++)
                    charCount[text.charAt(j)]++;
                return charCount;
            }
        }));
    }
    es.shutdown();

    // combine the results.
    int[] charCount = new int[Character.MAX_VALUE];
    for (Future<int[]> resultFuture : results) {
        int[] result = resultFuture.get();
        for (int i = 0, resultLength = result.length; i < resultLength; i++) {
            charCount[i] += result[i];
        }
    }

    long time = System.nanoTime() - start;
    System.out.printf("Took %,d ms to count %,d characters%n", time / 1000 / 1000, text.length());
}
prints
Took 45 ms to count 139,715,537 characters
But for a String with less than a million words, it's not likely to be worth it.
As a general rule, you should just write things in a straightforward way, and then do performance tuning to make it as fast as possible.
If that means putting in a faster algorithm, do so, but at first, keep it simple.
For a small program like this, it won't be too hard.
The essential skill in performance tuning is not guessing.
Instead, let the program itself tell you what to fix.
This is my method.
For more involved programs, like this one, experience will show you how to avoid the over-thinking that ends up causing a lot of the poor performance it is trying to avoid.
You have to use a divide-and-conquer approach and avoid racing for resources. There are different approaches and/or implementations for that. The idea is the same - split the work and parallelize the processing.
On a single machine you can process chunks of the data in separate threads, although having the chunks on the same disk will slow things down considerably. Having more threads means more context switching; for throughput it is IMHO better to have a smaller number of them and keep them busy.
You can split the processing into stages and use SEDA or something similar, and with really big data you use map-reduce - counting then just comes at the expense of distributing the data across a cluster.
I'll be glad if somebody points to another widely-used API.

Why use StringBuilder? StringBuffer can work with multiple threads as well as with one thread

Suppose our application has only one thread, and we are using StringBuffer. What is the problem with that?
I mean, if StringBuffer can handle multiple threads through synchronization, what is the problem with using it from a single thread?
Why use StringBuilder instead?
StringBuffers are thread-safe, meaning that they have synchronized methods to control access so that only one thread can access a StringBuffer object's synchronized code at a time. Thus, StringBuffer objects are generally safe to use in a multi-threaded environment where multiple threads may be trying to access the same StringBuffer object at the same time.
StringBuilder's access is not synchronized, so it is not thread-safe. By not being synchronized, the performance of StringBuilder can be better than that of StringBuffer. Thus, if you are working in a single-threaded environment, using StringBuilder instead of StringBuffer may result in increased performance. This is also true of other situations such as a StringBuilder local variable (i.e., a variable within a method) where only one thread will be accessing the StringBuilder object.
So, prefer StringBuilder because:
Small performance gain.
StringBuilder is a 1:1 drop-in replacement for the StringBuffer class.
StringBuilder is not thread synchronized and therefore performs better on most implementations of Java
Check this out :
Don't Use StringBuffer!
StringBuffer vs. StringBuilder performance comparison
StringBuilder is supposed to be a (tiny) bit faster because it isn't synchronised (thread safe).
You can notice the difference in really heavy applications.
The StringBuilder class should generally be used in preference to this one, as it supports all of the same operations but it is faster, as it performs no synchronization.
http://download.oracle.com/javase/6/docs/api/java/lang/StringBuffer.html
Using StringBuffer in multiple threads is next to useless and in reality almost never happens.
Consider the following
Thread1: sb.append(key1).append("=").append(value1);
Thread2: sb.append(key2).append("=").append(value2);
each append is synchronized, but a thread can stop at any point, so you can get any of the following combinations and more
key1=value1key2=value2
key1key2==value2value1
key2key1=value1=value2
key2=key1=value2value1
This can be avoided by synchronizing the whole line at a time, but this defeats the point of using StringBuffer instead of StringBuilder.
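For illustration, "synchronizing the whole line at a time" would look like the sketch below - correct, but at that point the buffer's own per-method synchronization buys you nothing (sb, key1, and value1 are the same names as in the example above):
// Each thread holds the buffer's monitor for its whole line,
// so the three appends can no longer interleave with another thread's.
synchronized (sb) {
    sb.append(key1).append("=").append(value1);
}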
Even if you have a correctly synchronized view, it's more complicated than just creating a thread-local copy of the whole line (e.g. a StringBuilder) and logging lines one at a time to a class like a Writer.
StringBuffer is not wrong in a single-threaded application. It will work just as well as StringBuilder.
The only difference is the tiny overhead added by having all synchronized methods, which brings no advantage in a single-threaded application.
My opinion is that the main reason StringBuilder was introduced is that the compiler uses StringBuffer (and now StringBuilder) when it compiles code that contains String concatenation: in those cases synchronization is never necessary and replacing all of those places with an un-synchronized StringBuilder can provide a small performance improvement.
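For example, a concatenation like the first line below is compiled by javac (up to Java 8; Java 9+ uses invokedynamic instead) into roughly the chain shown under it:
String s = a + "=" + b;
// ... becomes, approximately:
String s = new StringBuilder().append(a).append("=").append(b).toString();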
StringBuilder has better performance because its methods are not synchronized.
So if you do not need to build a String concurrently (which is a rather untypical scenario anyway), then there's no need to "pay" for the unnecessary synchronization overhead.
This will help you guys. To be straight: StringBuilder is faster than StringBuffer:
public class ConcatPerf {

    private static final int ITERATIONS = 100000;
    private static final int BUFFSIZE = 16;

    private void concatStrAdd() {
        System.out.print("concatStrAdd -> ");
        long startTime = System.currentTimeMillis();
        String concat = new String("");
        for (int i = 0; i < ITERATIONS; i++) {
            concat += i % 10;
        }
        //System.out.println("Content: " + concat);
        long endTime = System.currentTimeMillis();
        System.out.print("length: " + concat.length());
        System.out.println(" time: " + (endTime - startTime));
    }

    private void concatStrBuff() {
        System.out.print("concatStrBuff -> ");
        long startTime = System.currentTimeMillis();
        StringBuffer concat = new StringBuffer(BUFFSIZE);
        for (int i = 0; i < ITERATIONS; i++) {
            concat.append(i % 10);
        }
        long endTime = System.currentTimeMillis();
        //System.out.println("Content: " + concat);
        System.out.print("length: " + concat.length());
        System.out.println(" time: " + (endTime - startTime));
    }

    private void concatStrBuild() {
        System.out.print("concatStrBuild -> ");
        long startTime = System.currentTimeMillis();
        StringBuilder concat = new StringBuilder(BUFFSIZE);
        for (int i = 0; i < ITERATIONS; i++) {
            concat.append(i % 10);
        }
        long endTime = System.currentTimeMillis();
        // System.out.println("Content: " + concat);
        System.out.print("length: " + concat.length());
        System.out.println(" time: " + (endTime - startTime));
    }

    public static void main(String[] args) {
        ConcatPerf st = new ConcatPerf();
        System.out.println("Iterations: " + ITERATIONS);
        System.out.println("Buffer    : " + BUFFSIZE);
        st.concatStrBuff();
        st.concatStrBuild();
        st.concatStrAdd();
    }
}
Output
run:
Iterations: 100000
Buffer : 16
concatStrBuff -> length: 100000 time: 11
concatStrBuild -> length: 100000 time: 4
concatStrAdd ->
Manish, although there is just one thread operating on your StringBuffer instance, there is some overhead in acquiring and releasing the monitor lock on the StringBuffer instance whenever any of its methods are invoked. Hence StringBuilder is a preferable choice in single thread environment.
There is a huge cost to synchronizing objects. Don't see a program as a standalone entity; it's not a problem when you are reading the concepts and applying them to small programs like the one in your question. The problems arise when we want to scale the system: your single-threaded program might then be depended on by several other methods/programs/entities, and synchronized objects can cause serious complexity in terms of performance. So if you are sure that there is no need to synchronize an object, you should use StringBuilder, as it is good programming practice. In the end we want to learn programming to make scalable, high-performance systems, and that is what we should do!

Synchronization on local variables

Today I came across the method constructServiceUrl() of the org.jasig.cas.client.util.CommonUtils class. It struck me as very strange:
final StringBuffer buffer = new StringBuffer();
synchronized (buffer)
{
    if (!serverName.startsWith("https://") && !serverName.startsWith("http://"))
    {
        buffer.append(request.isSecure() ? "https://" : "http://");
    }
    buffer.append(serverName);
    buffer.append(request.getRequestURI());
    if (CommonUtils.isNotBlank(request.getQueryString()))
    {
        final int location = request.getQueryString().indexOf(
                artifactParameterName + "=");
        if (location == 0)
        {
            final String returnValue = encode ? response.encodeURL(buffer.toString()) : buffer.toString();
            if (LOG.isDebugEnabled())
            {
                LOG.debug("serviceUrl generated: " + returnValue);
            }
            return returnValue;
        }
        buffer.append("?");
        if (location == -1)
        {
            buffer.append(request.getQueryString());
        }
        else if (location > 0)
        {
            final int actualLocation = request.getQueryString()
                    .indexOf("&" + artifactParameterName + "=");
            if (actualLocation == -1)
            {
                buffer.append(request.getQueryString());
            }
            else if (actualLocation > 0)
            {
                buffer.append(request.getQueryString().substring(0, actualLocation));
            }
        }
    }
}
Why did the author synchronize on a local variable?
This is an example of manual "lock coarsening" and may have been done to get a performance boost.
Consider these two snippets:
StringBuffer b = new StringBuffer();
for (int i = 0; i < 100; i++) {
    b.append(i);
}
versus:
StringBuffer b = new StringBuffer();
synchronized (b) {
    for (int i = 0; i < 100; i++) {
        b.append(i);
    }
}
In the first case, the StringBuffer must acquire and release a lock 100 times (because append is a synchronized method), whereas in the second case, the lock is acquired and released only once. This can give you a performance boost and is probably why the author did it. In some cases, the compiler can perform this lock coarsening for you (but not around looping constructs because you could end up holding a lock for long periods of time).
By the way, the compiler can detect that an object is not "escaping" from a method and so remove acquiring and releasing locks on the object altogether (lock elision) since no other thread can access the object anyway. A lot of work has been done on this in JDK7.
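If you want to experiment, both optimizations can be toggled with HotSpot flags. They are on by default in modern JVMs, so you would typically use the minus forms to see the difference (MyBenchmark is a placeholder for whatever class you run):
java -XX:-DoEscapeAnalysis -XX:-EliminateLocks MyBenchmark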
Update:
I carried out two quick tests:
1) WITHOUT WARM-UP:
In this test, I did not run the methods a few times to "warm-up" the JVM. This means that the Java Hotspot Server Compiler did not get a chance to optimize code e.g. by eliminating locks for escaping objects.
                    JDK 1.4.2_19   JDK 1.5.0_21   JDK 1.6.0_21   JDK 1.7.0_06
WITH-SYNC (ms)              3172           1108           3822           2786
WITHOUT-SYNC (ms)           3660            801            509            763
STRINGBUILDER (ms)           N/A            450            434            475
With JDK 1.4, the code with the external synchronized block is faster. However, with JDK 5 and above the code without external synchronization wins.
2) WITH WARM-UP:
In this test, the methods were run a few times before the timings were calculated. This was done so that the JVM could optimize code by performing escape analysis.
                    JDK 1.4.2_19   JDK 1.5.0_21   JDK 1.6.0_21   JDK 1.7.0_06
WITH-SYNC (ms)              3190            614            565            587
WITHOUT-SYNC (ms)           3593            779            563            610
STRINGBUILDER (ms)           N/A            450            434            475
Once again, with JDK 1.4, the code with the external synchronized block is faster. However, with JDK 5 and above, both methods perform equally well.
Here is my test class (feel free to improve):
public class StringBufferTest {
public static void unsync() {
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < 9999999; i++) {
buffer.append(i);
buffer.delete(0, buffer.length() - 1);
}
}
public static void sync() {
StringBuffer buffer = new StringBuffer();
synchronized (buffer) {
for (int i = 0; i < 9999999; i++) {
buffer.append(i);
buffer.delete(0, buffer.length() - 1);
}
}
}
public static void sb() {
StringBuilder buffer = new StringBuilder();
synchronized (buffer) {
for (int i = 0; i < 9999999; i++) {
buffer.append(i);
buffer.delete(0, buffer.length() - 1);
}
}
}
public static void main(String[] args) {
System.out.println(System.getProperty("java.version"));
// warm up
for(int i = 0 ; i < 10 ; i++){
unsync();
sync();
sb();
}
long start = System.currentTimeMillis();
unsync();
long end = System.currentTimeMillis();
long duration = end - start;
System.out.println("Unsync: " + duration);
start = System.currentTimeMillis();
sync();
end = System.currentTimeMillis();
duration = end - start;
System.out.println("sync: " + duration);
start = System.currentTimeMillis();
sb();
end = System.currentTimeMillis();
duration = end - start;
System.out.println("sb: " + duration);
}
}
Inexperience, incompetence, or more likely dead yet benign code that remains after refactoring.
You're right to question the worth of this - modern compilers will use escape analysis to determine that the object in question cannot be referenced by another thread, and so will elide (remove) the synchronization altogether.
(In a broader sense, it is sometimes useful to synchronize on a local variable - they are still objects after all, and another thread can still have a reference to them (so long as they have been somehow "published" after their creation). Still, this is seldom a good idea as it's often unclear and very difficult to get right - a more explicitly locking mechanism with other threads is likely to prove better overall in these cases.)
I don't think the synchronization can have any effect, since buffer is never passed to another method or stored in a field before it goes out of scope, so no other thread can possibly have access to it.
The reason it is there could be political - I've been in a similar situation: a "pointy-haired boss" insisted that I clone a string in a setter method instead of just storing the reference, for fear of having the contents changed. He didn't deny that strings are immutable, but insisted on cloning it "just in case." Since it was harmless (just like this synchronization), I did not argue.
That's a bit insane...it doesn't do anything except add overhead. Not to mention that calls to StringBuffer are already synchronized, which is why StringBuilder is preferred for cases where you won't have multiple threads accessing the same instance.
IMO, there is no need to synchronize on that local variable. Only if it were exposed to others, e.g. by passing it to a function that stores it and potentially uses it in another thread, would the synchronization make sense.
But as this is not the case, I see no use for it.
