Running multiple instances of the same program slows down all of them - java

I have a single-threaded program that utilizes nearly 100% CPU when run alone. If I instantiate multiple instances of it on separate prompts, they all slow down considerably (about 8 fold). I am running jre version 1.7.0_25 on Ubuntu 12.04 with a Intel® Core™ i7-3930K CPU # 3.20GHz × 12 and 64 GB RAM. What could be causing the slow down? Surely, the programs can not be competing for the same CPU. And, I have always made sure that I run lesser instances than there are cores. I appreciate any insight.
Thanks,
Suresh

Each instance creates a separate JVM with a distinct initial heap (defined by -Xms JVM argument) and program resources. If you have too many instances, there is no surprise the OS will be swapping memory to disk and CPU's context will switch continuously between the java processes.

Related

java.util.concurrent.ScheduledExecutorService performance on different computers

I have 2 Java programs. Each has a memory region, and programs pass messages to each other's memory regions. Java program checks its memory region with ScheduledExecutorService at a certain period of time, i.e. 5 miliseconds. Then it does some conversion and prints the message to screen. Two programs run succesfully and they print messages to screen ( I have implemented all JNI related things and programs run succesfully). The problem is that when I start these programs on my computer (windows 8, 64 bits, 8gb ram, 2.67 Ghz i5 processor) they run really fast. However, when I run these exactly same programs with exactly same native c++ library where I handle shared memory related works, on a different computer they run slower. The more interesting is that I tried to run them on a both better computer with 32 gb ram, i7 processor and worse computer with 6 gb ram and 64 bits core duo cpu and still they run slower. So another computer with either better or worse features than my computer did not matter on the speed of programs. All I see they run slower on other computers.
Any help is appreciated. Thanks.

Why is multithreaded Java Program not faster on 'super' Linux server vs laptop Win7?

Intro
So far, I have been working on a piece of software which I am now testing to see the benefit of concurrency. I am testing the same software using two different systems:
System 1: 2 x Intel(R) Xeon(R) CPU E5-2665 # 2.40GHz with a
total of 16 cores , 64GB of RAM running on
Scientific LINUX 6.1 and JAVA SE Runtime Enviroment (build
1.7.0_11-b21).
System 2 Lenovo Thinkpad T410 with Intel i5
processor # 2.67GHz with 4 cores, 4GB of ram running windows 7 64-bit
and JAVA SE Runtime Enviroment (build 1.7.0_11-b21).
Details: The program simulates patients with type 1 diabetes. It does some import (read from csv), some numerical computations(Dopri54 + newton) and some export (Write to csv).
I have exclusive rights to the server, so there should be no noise at all.
Results
These are my results:
Now as you can see system 1 is just as fast as system 2 despite it is a pretty powerfull machine. I have no idea why this is the case - and I am confident that the system is the same. The number of threads goes from 10-100.
Question:
Why would does the two runs have similar execution time despite system 1 being significantly more powerfull than system 2?
UPDATE!
Now, I just thought a bit about what you guys said about it being an I/O memory issue. So, I thought that if I could reduce the file size it would speed up the program, right? I managed to reduce the import file size with a factor of 5, however, no performance improvement at all. Do you guys still think it is the same problem?
As you write .csv files, it is possible that the bottleneck is not your camputation power, but the writing rate on your hard disk.
Almost certainly this means that either CPU time is not the bottleneck for this application, or that something about it is making it resistant to effective parallelization, or both.
For example if reading the data from disk is actually the limiting factor then faster disks are what matters, not faster processors.
If it's running out of memory then that will be a bigger bottlneck.
If it takes more time to spawn each thread than the actual processing inside the thread.
etc.
In this sort of optimization work metrics are king. You need real hard solid numbers for how long things are taking, and where in your program you are losing that time. Only then can you see where to focus your efforts and see if they are effective.

Difference betweeen jvm on linux and solaris machines

I'm running 2 jboss5.1 server, on linux & solaris machines, with similar jvm (xms & xmx) configurations. But when i check the memory usage on server start:
linux machine -- 2.1gb mem usage (RES)
Solaris machine -- 500mb mem usage
Memory used by jboss process on linux is above 1 gb from the start (even before any class loading starts). When i take dump from linux its size is around 700 mb only.
What could be causing such a difference of memory?
A lot of things could make the difference, and there is not enough information here to know what. For example, are they both 64-bit OS's and 64-bit JVMs? What about the behavior of malloc - that's up to the OS. Just because a process asks for N bytes of memory doesn't mean it immediately gets that much memory - memory allocators can be very clever. Then there is the question of whether it's actually an apples-to-apples measurement in terms of how the OS reports it.
"Memory usage" means a lot of things. Are we talking about the Java heap (if you take heap dumps of both VMs after startup and an identical priming bit of work, are they the same size or different?), or that plus class data, etc.? You also have hotspot in the picture, compiling Java bytecode into native code that will be different between the two OS's (maybe very different sizes if your Solaris box is a Sparc machine)
The most likely thing is 64-bit vs. 32-bit, but it's impossible to say. You might use some native profiling tools on each to see what calls are allocating memory - that would start to clarify things.
Unless it's causing a problem, it's probably not something to worry about - but healthy curiosity is a good thing.

Advantages to use Java on Solaris

On many forums I found that people use Solaris for their Java applications.
I interested what are the main advantages of such combination?
My first assumption is that Solaris is very fast.
I also found out that on Solaris it is possible to match one-to-one java threads with kernel threads - as I understand it results in again very fast thread creation.
Please correct me if I'm wrong and are there any other main points?
What Solaris gives you (as its Software not hardware) over Linux or Windows is greater system manageability and low level tracing like DTrace.
What you appear to be asking about is having more threads running concurrently which is a feature of the hardware. If you run Solaris x86 or Linux or Window on the same hardware you will have the same number of logical threads. However if you run Solaris on some SPARC processors which have lots of logical threads (32 or more) running concurrently which reduces overhead if you have a need for that many threads.
The http://en.wikipedia.org/wiki/SPARC_T3 process supports up to 512 logical threads across 16 cores. This can really improve performance where you have a need for so many threads, e.g. using many blocking IO connections.
However if you need only one to six critical threads (and a bunch of non-critcal threads) a plain x64 processor will be much faster, and cheaper. (As it is designed to handle less threads faster and are mass produced on a larger scale)
We use Solaris for java applications at my workplace. I do not know about any exact performance advantage, but the reasons we decided to use Solaris were:
Solaris Service Management Facility (http://www.oreillynet.com/pub/a/sysadmin/2006/04/13/using-solaris-smf.html )
Ability to copy the entire zone backup to another box in case of HW failure.
We run application servers such as Weblogic, and it helps that SMF starts them back up if they crash for any reason. Also, we do backup of our zones at regular intervals, and from what I hear- the zone can be moved to another machine in case of HW failure, and the application back to normal.

Running JAVA on Windows Intel vs Solaris Sparc (T1000)

Hi I'm trying to test my JAVA app on Solaris Sparc and I'm getting some weird behavior. I'm not looking for flame wars. I just curious to know what is is happening or what is wrong...
I'm running the same JAR on Intel and on the T1000 and while on the Windows machine I'm able to get 100% (Performance monitor) cpu utilisation on the Solaris machine I can only get 25% (prstat)
The application is a custom server app I wrote that uses netty as the network framework.
On the Windows machine I'm able to reach just above 200 requests/responses a second including full business logic and access to outside 3rd parties while on the Solaris machine I get about 150 requests/responses at only 25% CPU
One could only imagine how many more requests/responses I could get out of the Sparc if I can make it uses full power.
The servers are...
Windows 2003 SP2 x64bit, 8GB, 2.39Ghz Intel 4 core
Solaris 10.5 64bit, 8GB, 1Ghz 6 core
Both using jdk 1.6u21 respectively.
Any ideas?
The T1000 uses a multi-core CPU, which means that the CPU can run multiple threads simultaneously. If the CPU is at 100% utilization, it means that all cores are running at 100%. If your application uses less threads than the number of cores, then your application cannot use all the cores, and therefore cannot use 100% of the CPU.
Without any code, it's hard to help out. Some ideas:
Profile the Java app on both systems, and see where the difference is. You might be surprised. Because the T1 CPU lacks out-of-order execution, you might see performance lacking in strange areas.
As Erick Robertson says, try bumping up the number of threads to the number of virtual cores reported via prstat, NOT the number of regular cores. The T1000 uses UltraSparc T1 processors, which make heavy use of thread-level parallelism.
Also, note that you're using the latest-gen Intel processors and old Sun ones. I highly recommend reading Developing and Tuning Applications on UltraSPARC T1 Chip Multithreading Systems and Maximizing Application Performance on Chip Multithreading (CMT) Architectures, both by Sun.
This is quite an old question now, but we ran across similar issues.
An important fact to notice is that SUN T1000 is based on UltraSpac T1 processor which only have 1 single FPU for 8 cores.
So if you application does a lot or even some Float-Point calculation, then this might become an issue, as the FPU will become the bottleneck.

Categories