Java: What's the reason behind System.out.println() being that slow?

For small programs that can be written in a text editor, I use the classic System.out.println() for tracing.
I guess you all know how frustrating it is to use that in a loop with a high number of iterations. Why is it so slow? What's the reason behind it?

This has nothing whatsoever to do with the JVM. Printing text to screen simply involves a lot of work for the OS in drawing the letters and especially scrolling. If you redirect System.out to a file, it will be much faster.

This is very OS-dependent. For example, in Windows, writing to the console is a blocking operation, and it's also slow, and so writing lots of data to the console slows down (or blocks) your application. In unix-type OSes, writing to the console is buffered, so your app can continue unblocked, and the console will catch up as it can.

Yes, there is a huge amount of overhead in writing to the console, far greater than that required to write to a file or a socket. Also, if there are a large number of threads, they are all contending for the same lock. I would recommend using something other than System.out.println to trace.

This has nothing to do with Java or the JVM, but with the console terminal. In most OSes I know, writing to the console is slow.

Buffering can help enormously. Try this:
System.setOut( new PrintStream(new BufferedOutputStream(System.out)) );
But beware: you won't see the output appear gradually, but all in a flash. Which is great, but if you're using it for debugging, and the program crashes before it terminates, in some circumstances it's possible you won't see the text printed just before the crash.
This is because the buffer wasn't flushed before the crash. It was printed, but it's still in the buffer, and didn't make it out to the console where you can see it. I remember this happening to me, in a puzzling debug session. It's best to occasionally flush explicitly, to make sure you see it:
System.out.flush();
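Put together, the buffering advice above could look roughly like the sketch below. The class name BufferedTrace and the 8192-byte buffer size are assumptions made for illustration; the point is only that autoflush is off and that the flush happens in a finally block, so buffered text is not lost when the loop throws.

```java
import java.io.BufferedOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;

// Hypothetical helper class, not part of any library.
class BufferedTrace {
    // Wraps the target in a buffer. Autoflush is off, so println() does not
    // push every line through to the underlying stream.
    static PrintStream buffered(OutputStream target) {
        return new PrintStream(new BufferedOutputStream(target, 8192), false);
    }

    public static void main(String[] args) {
        PrintStream original = System.out;
        PrintStream out = buffered(original);
        System.setOut(out);
        try {
            for (int i = 0; i < 100_000; i++) {
                System.out.println("step " + i);
            }
        } finally {
            // Flush even if the loop throws, so the last lines are not lost.
            out.flush();
            System.setOut(original);
        }
    }
}
```

A flush in finally does not help if the JVM is killed outright, so an occasional explicit flush inside a long loop is still worth considering.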

Some terminals just are faster than others. This may vary even within one operating system.

This might look as if it doesn't directly answer your question, but my advice is to never use System.out for tracing (if by that you mean a kind of debugging, just to follow the progress of your app).
System.out has several problems for debugging:
Once the app ends and you close the console, you lose the log.
You have to remove those statements (or comment them out) once your app is working properly. If you later want to reactivate them, you have to uncomment and comment them again, which is tedious.
I recommend using log4j instead and "watching" the log file, either with a tail command (there's also a Tail for Windows) or with an Eclipse plugin like LogWatcher.

One interesting thing I've noticed about writing to the terminal (at least in Windows) is that it actually runs much, much faster if the window is minimized. This is definitely closely tied to Michael Borgwardt's answer about drawing and scrolling. Really, if you're logging enough to notice the slowdown, you're probably better off writing to a file.

The slowness is due to the large number of Java-to-native transitions, which happen on every line break or flush. If the iteration has too many steps, System.out.println() isn't much help. If the individual iteration steps are not that important by themselves, you can call System.out.println() only on every 10th or 100th step. You can also wrap System.out in a BufferedOutputStream. And of course there is always the option of making the printing asynchronous via an ExecutorService.
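As a rough sketch of two of these ideas combined (printing only every Nth step, and handing the print to a single-threaded ExecutorService), something like the following could work. The class name ThrottledTrace, the helper shouldReport and the interval of 100 are assumptions chosen for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class, not part of any library.
class ThrottledTrace {
    // True only for every interval-th step, so most iterations print nothing.
    static boolean shouldReport(int step, int interval) {
        return step % interval == 0;
    }

    public static void main(String[] args) throws InterruptedException {
        // A single worker thread keeps the lines in submission order.
        ExecutorService printer = Executors.newSingleThreadExecutor();
        for (int i = 0; i < 1_000; i++) {
            final int step = i;
            if (shouldReport(step, 100)) {
                printer.submit(() -> System.out.println("step " + step));
            }
        }
        // Let queued print tasks finish before the program exits.
        printer.shutdown();
        printer.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

The loop itself never waits on the console; only the worker thread does.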

Related

What is the proper way to close streams on Java's exec?

After
Runtime.getRuntime().exec(command);
I see syscalls happening that show 2~3 file descriptors (FIFO pipes). What is the proper way to close them with the try-with-resources pattern?
Most historical tribal knowledge found on Java forums suggests:
// out of date!
... } finally {
IOUtils.closeQuietly(p.getOutputStream());
IOUtils.closeQuietly(p.getInputStream());
IOUtils.closeQuietly(p.getErrorStream());
}
but that doesn't sound right because 1) the closeQuietly method is deprecated and most libraries suggest using try-with-resources, and 2) it is inelegant, as I might not necessarily have all the streams.
And simply moving the exec() call into the try feels wrong, as it is not the resource I will call close() on.
Closing them isn't necessary; they close by themselves when the process dies. If the process never dies, it is also not necessary: either you make a new never-dying process every so often, in which case your system is going to run out of resources and crash whether you close these or not, or you make it only once, in which case these resources aren't going to count for much. For what it is worth, these are quite lightweight resources, and often they simply cannot be 'closed' in the sense that the resources can be 'freed': closing them either keeps them open but denies further chat (and sends EOFs where needed), or reroutes them to /dev/null. Generally, processes just have 3 pipes on them and will continue to have them until the process dies.
Yes, closeQuietly is a silly idea for virtually all purposes, and so it is here. If closing these streams somehow fails, you probably don't want to silently ignore that.
If you must close them, each of these 3 streams is closable individually. However, note that you're reading rules of thumb and attempting to apply them as if they were gospel truth. try-with-resources is not always the right answer, and it is not a universal replacement for close(), let alone closeQuietly.
For example, try-with-resources is specifically designed around a period of usage. You declare the span of statements within which the resource should be available (the braces that go with the try block), and the construct then ensures that the resource is closed as soon as code flow leaves that span of statements, no matter how it exits. That makes it probably irrelevant here, too! Consider the two cases:
You are starting a long-lived process and don't care about the in/out. You just want the process to run and to keep running. This means there is no span at all, and you should just call close() on these if somehow you feel it is important to try to save the resources even though most likely this accomplishes nothing at all. No span-of-statements means try-with-resources isn't right.
You are starting a short-lived process that you interact with. The right thing to 'close' is the process itself, except you can't use try-with-resources for that. It can only be used on auto-closeables (resources where the class that represents them implements AutoCloseable). Most do, some don't. Lock is a famous one. Process is another: to 'close' it, you invoke destroy() or even destroyForcibly(). You cannot use try-with-resources to do this (not without ugly hacks that defeat the purpose)! Once you close/destroy the process, the streams that went along with it are dead too.
More generally the principle is: If you create it, you close it. If you never call getOutputStream() you never created them. On some OSes, fetching these streams and then closing them wastes more resources than not doing this. Thus, if the argument is based on some sort of purity model, then you shouldn't close them either. If it's based on pragmatics, you'd have to test how heavy these resources really are (most likely, extremely light), whether closing them actually saves you some pipes (most likely, it will not), and whether close()-ing the result of invoking getOutputStream() on the process even helps if the answers to the above questions make that relevant (it probably will, but the spec does not guarantee this).
They are very lightweight resources that in almost every case don't require closing...
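For the short-lived case discussed above, a minimal sketch might look like this. The class name ShortLivedProcess and the method runAndCapture are made up for illustration, and it assumes Java 9 or later for InputStream.readAllBytes(). The one stream that is actually fetched is scoped with try-with-resources, while the process itself is cleaned up with destroy() in a finally block.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical example class, not part of any library.
class ShortLivedProcess {
    // Runs a command, returns its combined stdout/stderr, then destroys it.
    static String runAndCapture(String... command) {
        Process p;
        try {
            p = new ProcessBuilder(command).redirectErrorStream(true).start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Only the stream we fetched is closed here ("if you create it, you close it").
        try (InputStream in = p.getInputStream()) {
            String output = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            p.waitFor();
            return output;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            // Process is not AutoCloseable; destroy() is its "close".
            p.destroy();
        }
    }

    public static void main(String[] args) {
        String java = System.getProperty("java.home") + "/bin/java";
        System.out.print(runAndCapture(java, "-version"));
    }
}
```

Calling destroy() on a process that has already exited has no effect, so the finally block is safe on the normal path as well.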

Is Opening Java FileOutputStream efficient?

I'm writing a singleton logger for my program right now, and I was wondering whether it would be better to open and close the stream every time I log something, or to open it at creation of the singleton and close it at the termination of the program. And if I were to do that, how would I close it at termination?
The main advantage of opening the file once is performance. You save yourself the penalty of calling open each time and seeking to the end of the file for appending; this gets worse if the file is big (and some logs tend to be).
The cons are:
You might not be able to read the last log line immediately if there is some buffering in the writer (delayed writes). However, this can be fixed by flushing after each write (you might lose some performance, but this is not usually relevant).
You cannot simultaneously write to the same log from different processes. But you probably don't need this - and if you need it, the open-and-close solution still needs to deal with concurrency.
Some external log processing (typically, log rotation with renaming) becomes problematic. To allow for this, you might need to implement some signalling that closes and reopens the file.
Typically, the advantage outweighs the cons, so the general rule is to keep the log file open. But that depends on the scenario.
(As other answers point out, normally you'd prefer to use some standard logging library instead of implementing this on your own. But it's instructive to give it a try, or at least to think of all the issues involved).
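Do not close it, just flush. This is what the Log4j FileAppender does by default.
You should open once (and close once). If you do nothing, the file is closed when the JVM exits, although data still sitting in a buffer may be lost. Rather than overriding Object.finalize(), which is deprecated since Java 9 and not guaranteed to run, you may prefer to close it explicitly, for example from a shutdown hook registered with Runtime.addShutdownHook().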
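The open-once approach from the answers above could be sketched as follows. The class name FileLog and the file name app.log are assumptions for illustration; the program is expected to keep a single instance around. The file is opened in the constructor, every message is flushed after writing, and a shutdown hook closes the file at normal JVM termination.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical logger class, not part of any library.
final class FileLog {
    private final BufferedWriter writer;

    FileLog(Path file) {
        try {
            // Open once, in append mode, and keep the writer for the program's lifetime.
            writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Close the file when the JVM shuts down normally.
        Runtime.getRuntime().addShutdownHook(new Thread(this::close));
    }

    synchronized void log(String message) {
        try {
            writer.write(message);
            writer.newLine();
            writer.flush(); // flush, don't close, so the last line is readable right away
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    synchronized void close() {
        try {
            writer.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FileLog log = new FileLog(Path.of("app.log"));
        log.log("application started");
    }
}
```

A shutdown hook does not run if the process is killed forcibly, which is one more reason to flush after each write.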

Is FileOutputStream.write(byte[]) always blocking?

I wondered whether FileOutputStream.write(byte[]) always blocks the current thread, leading to a thread context switch, or whether the operation may not block if the OS buffers are large enough to handle the bytes.
The reason for these thoughts is that I wondered whether the logging I do with log4j in my application is a real performance hit, and whether it would be faster to use a queue of logging messages which is read by a separate thread and written to the log files (I know the disadvantage: logging statements are swallowed if the app quits before the statements in the queue are flushed to disk).
No, I didn't profile it yet, these are rather conceptual thoughts.
Need not be.
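FileOutputStream.write(byte[]) ends up in a native method. Common sense would suggest that the write may just copy the bytes into the operating system's buffers, with the data actually committed to disk later. (FileOutputStream.flush() itself does nothing; FileDescriptor.sync() is the call that forces the data out.)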
You can use the log4j org.apache.log4j.AsyncAppender and logging calls will not block. The actual logging is done in another thread so you won't need to worry about calls to log4j not returning in a timely manner.
By default immediateFlush is enabled which means that logging is slower but ensures that each append request is actually written out. You can set this to false if you don't care whether or not the last lines are written out if your application crashes.
log4j.appender.R.ImmediateFlush=false
Also, take a look at this post on Log4j: Performance Tips, in which the author has some test stats on using immediateFlush, bufferedIO and asyncAppender. He concludes that for local logging "set immediateFlush=false, and leave bufferedIO at the default of don't buffer" and that "asycAppender actually takes longer than normal non-asyc".
It's likely going to depend on the OS, drivers and underlying file system. If write caching is enabled, for example, it'll probably return right away. I've seen gigabytes/day of logs written synchronously without affecting performance too much, as long as IO isn't bottlenecked. It's still probably worth writing them asynchronously if you're concerned about response times. And it eliminates potential future issues, e.g. if you changed to writing to a network drive and the network has issues.
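The queue-plus-writer-thread idea from the question can be sketched in plain Java as below. This is not how log4j's AsyncAppender is implemented; the class name QueueLogger, the sentinel value and the Consumer-based sink are assumptions made for the example. Callers only enqueue a message, and close() drains the queue before returning, which addresses the concern about swallowed messages on a normal exit.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical asynchronous logger, not part of any library.
final class QueueLogger implements AutoCloseable {
    // Unique sentinel object, compared by identity, that tells the worker to stop.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    QueueLogger(Consumer<String> sink) {
        worker = new Thread(() -> {
            try {
                // Take messages one by one until the sentinel arrives.
                for (String msg = queue.take(); msg != STOP; msg = queue.take()) {
                    sink.accept(msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "log-writer");
        worker.start();
    }

    // Never blocks on I/O: the caller only appends to an unbounded queue.
    void log(String message) {
        queue.add(message);
    }

    // Drains everything queued before the sentinel, then waits for the worker.
    @Override
    public void close() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        List<String> written = new CopyOnWriteArrayList<>();
        try (QueueLogger log = new QueueLogger(written::add)) {
            log.log("first");
            log.log("second");
        }
        System.out.println(written);
    }
}
```

An unbounded queue trades memory for latency; a bounded queue would push back on callers when the writer cannot keep up.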

Weird Java problem, while loop termination

I have a piece of code that looks like this:
Algorithm a = null;
while(a == null)
{
a = grid.getAlgorithm();
}
getAlgorithm() in my Grid class returns some subtype of Algorithm depending on what the user chooses from some options.
My problem is that even after an algorithm is selected, the loop never terminates. That's not the tricky bit, though: if I simply place a System.out.println("Got here"); after my call to getAlgorithm(), the program runs perfectly fine and the loop terminates as intended.
My question is: why does adding that magic print statement suddenly make the loop terminate?
Moreover, this issue first came up when I started using my new laptop. I doubt that's related, but I figured it would be worth mentioning.
Edit: The program in question is NOT multithreaded. The code for getAlgorithm() is:
public Algorithm getAlgorithm ()
{
return algorithm;
}
Where algorithm is initially null, but will change value upon some user input.
I believe the issue has to do with how grid.getAlgorithm is executed. If there is very little cost associated with executing the method, then your while loop will cycle very quickly as long as the method continues to return null. That is often referred to as a busy wait.
Now it sounds like your new laptop is encountering a starvation issue which didn't manifest on your old computer. It is hard to say why but if you look at the link I included above, the Wikipedia article does indicate that busy waits do have unpredictable behavior. Maybe your old computer handles user IO better than your new laptop. Regardless, on your new laptop, that loop is taking resources away from whatever is handling your user IO hence it is starving the process that is responsible for breaking the loop.
You are doing active polling. This is a bad practice. You should at least let the polling thread sleep (with Thread.sleep). Since println does some io, it probably does just that. If your app is not multithreaded it is unlikely to work at all.
If this loop is to wait for user input in a GUI then ouch. Bad, bad idea and even with Thread.sleep() added I'd never recommend it. Instead, you most likely want to register an event listener on the component in question, and only have the validation code fire off when the contents change.
It's more than likely that your program is locking up because you've reached some form of deadlock, especially if your application is multithreaded. Rather than trying to solve this issue and hack your way around it, I'd seriously consider redesigning how this part of the application works.
You should check getAlgorithm(); there must be something wrong in the method.
There are two scenarios:
Your code is really not meant to be multi-threaded. In this case you need to insert some sort of user input in the loop. Otherwise you might as well leave it as Algorithm a = grid.getAlgorithm(); and prevent the infinite loop.
Your code is multi-threaded in which case you have some sort of 'visibility' problem. Go to Atomicity, Visibility and Ordering or read Java Concurrency in Practice to learn more about visibility. Essentially it means that without some sort of synchronization between threads, the thread you are looping in may never find out that the value has changed due to optimizations the JVM may perform.
You did not mention any context around how this code is run. If it is a console-based application and you started from a 'main' function, you would know if there was multi-threading. I am assuming this is not the case since you say there is no multithreading. Another option would be that this is a Swing application, in which case you should read Multithreaded Swing Applications. It might be a web application, in which case something similar to the Swing case might apply.
In any case you could always debug the application to see which thread is writing to the 'algorithm' variable, then see which thread is reading from it.
I hope this is helpful. In any case, you may find more help if you give a little more context in your question. Especially for a question with such an intriguing title as 'Weird Java problem, while loop termination'.
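To illustrate the visibility scenario described above, here is a small sketch under the assumption that a second thread (standing in for the UI thread) sets the field. The class name VisibilityDemo is hypothetical, and a String stands in for the Algorithm type from the question. Declaring the field volatile guarantees that the polling loop eventually sees the write; without it, the JVM is allowed to keep reading a stale value.

```java
// Hypothetical example class, not the asker's Grid.
class VisibilityDemo {
    // volatile: a write by one thread becomes visible to reads in other threads.
    private volatile String algorithm;

    String getAlgorithm() {
        return algorithm;
    }

    void setAlgorithm(String value) {
        algorithm = value;
    }

    // Polls until another thread has set the field, then returns the value.
    static String waitForSelection() {
        VisibilityDemo grid = new VisibilityDemo();
        Thread ui = new Thread(() -> {
            try {
                Thread.sleep(100); // simulate the user taking a moment to choose
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            grid.setAlgorithm("A*");
        });
        ui.start();

        String a = null;
        while (a == null) {
            a = grid.getAlgorithm();
            Thread.onSpinWait(); // hint to the runtime that this is a busy-wait loop
        }
        return a;
    }

    public static void main(String[] args) {
        System.out.println(waitForSelection());
    }
}
```

This only makes the polling loop correct; as other answers point out, an event listener or a blocking hand-off is usually a better design than polling.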

In Java, how to log a message every time a given object's monitor is entered or exited?

I am trying to debug some C/Java bindings that use some custom refcounting/locking. I would like to have the JVM print a message every time a given object has its monitor entered or exited. Is there any way to do this? Basically, I want this:
synchronized(lock) {
...
System.out.println("hi");
...
}
to print this:
*** "lock" monitorenter
hi
*** "lock" monitorexit
I have looked at the XX options and found nothing. This is OpenJDK 6.
Good question. The only solution I could come up with is basically this:
Use a custom class-loader and preprocess the files by using a bytecode manipulation library such as ASM. (ASM has a good example of how to work with bytecode rewriting in class loaders.)
Then simply add a call to System.out.println before each monitorenter and monitorexit.
Thanks to the nice visitor pattern in the ASM library, this shouldn't be more than a screen or two of code.
Trying to debug a concurrency issue with creative uses of print statements is a losing battle, as your print statements could have their own concurrency bug and not print in the order you expect. Trying to debug or println your way out of a concurrency bug may sound good, but I don't think it will get you the result you want. You need to use careful thinking and logic to reason that your code is correct (more Computer Science than Software Engineering).
Concurrency issues are very hard. If you haven't read Java Concurrency in Practice, make sure you go read it. Then look at all the possible ways that your synchronized block can be reached, all the things it can change that are outside the scope of the lock, etc.
This would be a perfect situation to use DTrace.
Unfortunately that requires Solaris or OS X.
Fortunately OpenSolaris can still be downloaded and run in a virtual machine. It runs best in VirtualBox.
I don't believe there is a way to bind to a "locking" event in java. But you can look into java.lang.management for various locking information. For example, there is ThreadMXBean.findDeadlockedThreads()
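A minimal sketch of the java.lang.management call mentioned above: the class name DeadlockCheck and its helper method are assumptions for illustration, while ThreadMXBean.findDeadlockedThreads() is the standard API and returns null when no deadlock is detected.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical example class, not part of any library.
class DeadlockCheck {
    // Number of threads currently deadlocked on monitors or ownable synchronizers.
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when nothing is deadlocked
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + deadlockedThreadCount());
    }
}
```

This reports deadlocks after the fact; it does not give a callback on every monitor enter or exit.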
Unless you write your own locking class (or modify the existing one), my guess is that it would be rather difficult to do what you want, especially if you are using a synchronized block over a monitor object and not a Lock class. However, you can use the jstack command supplied with the JDK to analyze your process at runtime (check here for the man page), and there is also the JVM option -XX:+PrintConcurrentLocks for printing your locks if you stop your JVM process using Ctrl-Break (more options here).
I would suggest extending an existing implementation of the Lock interface or implementing one of your own (http://download.oracle.com/javase/tutorial/essential/concurrency/newlocks.html).
Now you can override the lock and unlock methods. So instead of using synchronized methods/statements, make use of this facility and put your logging in the lock/unlock methods :)
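A sketch of that suggestion, assuming it is acceptable to replace the synchronized block with an explicit lock: the class name LoggingLock and the Consumer-based sink are made up for the example, and the message format simply mirrors the output asked for in the question. Only lock() and unlock() are overridden, so tryLock() and lockInterruptibly() would go unlogged.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Hypothetical logging lock, not part of any library.
class LoggingLock extends ReentrantLock {
    private final String name;
    private final Consumer<String> sink;

    LoggingLock(String name, Consumer<String> sink) {
        this.name = name;
        this.sink = sink;
    }

    @Override
    public void lock() {
        super.lock();
        // Logged after acquiring, so the message is emitted while holding the lock.
        sink.accept("*** \"" + name + "\" monitorenter");
    }

    @Override
    public void unlock() {
        // Logged before releasing, for the same reason.
        sink.accept("*** \"" + name + "\" monitorexit");
        super.unlock();
    }

    public static void main(String[] args) {
        LoggingLock lock = new LoggingLock("lock", System.out::println);
        lock.lock();
        try {
            System.out.println("hi");
        } finally {
            lock.unlock();
        }
    }
}
```

Because both messages are emitted while the lock is held, enter and exit lines for the same lock cannot interleave with each other across threads.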
