Java application running significantly faster on Mac OS than Windows

I am trying to run a Java program which finds polydivisible numbers. However, I have found that it runs significantly slower on my Windows computer than on a MacBook Pro.
I found another question here, Java code running faster on Mac with slower processor than on my Windows computer?, with a similar problem, but it has no conclusive answer as to why it is slower.
Does anyone know why this might be and how it could be fixed?
Windows Specs:
i7 7700K @ 4.2 GHz (4 cores, 8 threads)
16 GB RAM @ 3000 MHz (DDR4)
Timings: https://gist.github.com/JosephBywater/f79f5e8277d148c26804c85c2c6a399a
MacBook Pro (Early 2015):
Intel i5 (unknown model, possibly i5 5257U) @ 2.7 GHz
8 GB RAM @ 1867 MHz (DDR3)
Timings: https://gist.github.com/oliverdunk/ad3dd134b653c43c9928
Code: https://github.com/oliverdunk/Polydivisbles
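For context, the kind of work being timed looks roughly like the sketch below. This is not the code from the linked repository, just a minimal Java illustration of a polydivisibility check (a number is polydivisible if, for every k, its first k digits form a number divisible by k).

// Minimal polydivisibility check, for illustration only (not the linked benchmark code).
public class Polydivisible {

    // Returns true if every decimal prefix of n is divisible by its length.
    static boolean isPolydivisible(long n) {
        String digits = Long.toString(n);
        long prefix = 0;
        for (int k = 0; k < digits.length(); k++) {
            prefix = prefix * 10 + (digits.charAt(k) - '0');
            if (prefix % (k + 1) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isPolydivisible(381654729L)); // true: the classic 9-digit example
        System.out.println(isPolydivisible(123456789L)); // false: 1234 is not divisible by 4
    }
}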

Related

Different CPU usage in Java VisualVM

In a game I'm making, I used Java VisualVM to check why my FPS was throttling on my laptop. (I develop on my desktop, which has better specs.) What I noticed was that the render() function in my Tiles class was hogging most of the CPU time on my laptop. (See this picture for the laptop CPU times.)
Next, I checked whether this was also the case on my desktop, since there is no FPS throttle there. The results on my desktop were as follows: (Desktop CPU times)
What struck me as odd was that on my laptop, rendering the tiles seems to hog most of the CPU time, whereas on my desktop, the game loop itself takes up most of the CPU time.
I'm struggling to find an explanation for this. Could it be a hardware difference? And how is it that the render() method takes up more CPU time than the actual game loop (which it is a part of)?
Laptop specs:
CPU: Intel i7-7500U (4 cores, 2.70 GHz)
GPU: Intel HD Graphics 620 (Display), NVidia GeForce 940MX (Render)
RAM: 8GB
Desktop specs:
CPU: Intel i5-4460 (4 cores, 3.2 GHz)
GPU: NVidia GeForce GTX 760
RAM: 8GB
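For reference, the code being profiled presumably has roughly the following shape (a hypothetical sketch; the asker's actual Tiles.render() is not shown). It also illustrates why a profiler's per-method figures can point at render() rather than the loop that contains it: self time is charged to the method that is actually executing, not to its callers.

// Hypothetical shape of a game loop calling a render() hot spot; placeholder work only.
public class GameLoopSketch {

    public static void main(String[] args) {
        gameLoop();
    }

    static void gameLoop() {
        for (int frame = 0; frame < 1_000; frame++) {
            update();  // game logic: usually cheap
            render();  // drawing the tiles: often the hot spot
        }
    }

    static void update() {
        // placeholder for game logic
    }

    static void render() {
        // placeholder for Tiles.render(): loop over the tiles and draw each one
        double sink = 0;
        for (int i = 0; i < 100_000; i++) {
            sink += Math.sqrt(i);
        }
        if (sink < 0) {
            System.out.println(sink); // keeps the work from being optimized away
        }
    }
}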

Why is Tomcat 7 on 32 Bit CPU/OS/Java so much slower than on 64 Bit?

                      Raspberry Pi3    Raspberry Pi3    Odroid C2        Odroid XU4
                      1,20 GHz         1,20 GHz         1,5 GHz          2,0 GHz
                      Debian 32 Bit    SuSE 64 Bit      Ubuntu 64 Bit    Ubuntu 32 Bit
Start Apache Tomcat   04:30,00         00:29,06         00:27,45         04:08,39
1. page (1. request)  00:50,00         00:03,91         00:03,66         00:24,75
1. page (2. request)  00:03,30         00:00,79         00:00,77         00:02,39
I'm working on an IoT kind of project and needed to test whether a web frontend implemented in Java, using Tomcat as the web server, is "fast enough" on our candidate hardware. We need to choose between the Raspberry Pi3, Odroid C2 and Odroid XU4. The Pi3 and C2 both have a 64 Bit CPU with slightly different performance according to their specs; the XU4 only has a 32 Bit CPU but in theory should be faster than the other two as well. The important point is that the Pi3 by default runs a 32 Bit OS even though it has a 64 Bit CPU, the XU4 runs 32 Bit as well, but the C2 runs a 64 Bit OS incl. 64 Bit Java etc.
Comparing all those devices in their default settings, we found that the C2 was significantly faster than the other two: 4+ minutes vs. ~30 seconds for a restart of Tomcat with a test application of ours. Additionally, tools like htop showed that for most of the runtime all cores of the C2 were in use, whereas the Pi3 and XU4 were mostly only able to put one core under load. That great performance difference remained after Tomcat had loaded and we were able to browse through our test app: ~1,5 seconds vs. 4 to 5,5 seconds for just browsing some page with some CSS/JS.
While the default OS for the Pi3 is 32 Bit only, we were able to successfully install a special 64 Bit SuSE distribution. And guess what happened? The performance was now much closer to what we had already seen on the C2, almost the same for many tests, even though the Pi3 is clocked at only 1,2 GHz vs. the C2's 1,5 GHz. Especially interesting was that now all cores of the Pi3 were under load most of the time as well, so the overall behaviour was very much like the C2's.
So just by switching to a 64 Bit OS, Java etc. we saw that dramatic improvement in performance. Everything else stayed the same: same test app, Tomcat etc., nothing overclocked, no different storage or anything else. How can that be? What is responsible for that dramatic improvement?
With a 64 Bit OS we see that all cores of the devices are under load more than with 32 Bit. But why should the Linux kernel scheduler care this much about whether it's running on 32 or 64 Bit?
And if it doesn't and the difference comes from Java, why and how? Shouldn't a 32 Bit and a 64 Bit JVM perform almost identically in such a simple test? Shouldn't both in particular put almost the same load on the cores and not behave this differently? The architecture of the OS shouldn't have any effect on how many threads are used inside the JVM; that is mostly under the control of Tomcat and our test app and therefore didn't change. According to what I've read about 32 vs. 64 Bit Java performance, the difference should be negligible in my use case. Additionally, other users who see better performance from a 64 Bit JVM don't seem to see a factor of 4 to 5 like I do, and the differences in CPU load on individual cores aren't explained either.
Our test is not I/O bound; we don't allocate much memory or work with many threads or such. It's almost strictly CPU: only compiling Java classes and serving HTML, CSS and JS. But we see very different load on the cores depending on 32/64 Bit, and very different performance results.
One of my colleagues said he read somewhere that Java internally works with 64 Bit values only, and that therefore on a 32 Bit CPU/OS more cycles are needed to process the same thing. I guess his source didn't really mean everything, but only references/pointers to memory, like for objects. But I can't believe that a 32 Bit JVM really uses 64 Bit pointers internally for no reason, especially when optimizations like compressed oops exist. But it might be an explanation, so any ideas on that?
If it's of any interest, the packages on the 32 Bit OSes all had "armhf" as their architecture, compared to "arm64" on the 64 Bit ones. I thought that might have an influence on how Java was built, maybe really using 64 Bit pointers for some weird reason?
Java was always OpenJDK 8, the same architecture as the OS and as current as the package manager of the OS provides. The Pi3 with SuSE had 1.8_144, Ubuntu provided 1.8_131 for both the 32 Bit and 64 Bit installations, and all were server VMs. Additionally, the Linux kernel differed, e.g. the Pi3 with SuSE vs. the C2 and XU4 with Ubuntu: the Pi3 had some current 4.x, the C2 some old 3.14 and the XU4 some current 4.9.
So, any ideas on where the difference comes from? Thanks!
You've said you installed OpenJDK 8 from the standard package.
There has never been an optimized build of OpenJDK 8 for 32-bit ARM (at least on Debian and Ubuntu). The default package is built from the "Zero" port, which does not even have a JIT compiler.
root@localhost:~# java -server -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-1~bpo8+1-b11)
OpenJDK Zero VM (build 25.131-b11, interpreted mode)
^^^^^^^ ^^^^^^^^^^^^^^^^
Try installing the Oracle JDK manually from the Java SE downloads page.
It has an optimized HotSpot JVM inside, and it indeed runs much faster.
root@localhost:~# /usr/java/jdk1.8.0_131/bin/java -server -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) Server VM (build 25.131-b11, mixed mode)
^^^^^^^^^^^^^^^^^^^^^ ^^^^^^^^^^
By contrast, the AArch64 port of HotSpot has been part of OpenJDK for a long time. So on a 64-bit OS, the default OpenJDK package comes with the HotSpot JVM, which includes an optimizing JIT compiler.
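If you want to confirm from inside a running application which VM it actually ended up on, the standard JVM system properties are enough. A minimal sketch (the exact strings printed will of course differ per installation):

// Prints which JVM variant and architecture the current process is running on.
public class VmCheck {
    public static void main(String[] args) {
        // e.g. "OpenJDK Zero VM" vs. "Java HotSpot(TM) Server VM"
        System.out.println("java.vm.name: " + System.getProperty("java.vm.name"));
        // e.g. "interpreted mode" (no JIT) vs. "mixed mode" (interpreter + JIT)
        System.out.println("java.vm.info: " + System.getProperty("java.vm.info"));
        // e.g. "arm" (armhf) vs. "aarch64" (arm64)
        System.out.println("os.arch:      " + System.getProperty("os.arch"));
    }
}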

Java cannot fully use all the OS cores on Windows

I have two machines with exactly the same hardware: an Intel i7 CPU (4 cores, 8 threads/virtual cores) and 16 GB of memory. The only difference is that one machine runs Windows 7 and the other runs Ubuntu.
My multi-threaded program has 8 threads doing similar tasks. I noticed that when it is running on the Windows machine, only 4 virtual cores are active at about 30% CPU usage each, while the other 4 virtual cores sit at 0%. When the same program runs on the Ubuntu machine, all 8 virtual cores have roughly equal CPU usage of 30%.
Besides, the program runs significantly faster on the Ubuntu machine.
Why does Java not use all the CPU cores on Windows? Is there anything particular about the JRE/OS settings?
More details on System info:
Windows 7 64 bit, Oracle 64 bit Java 1.7
Ubuntu 14.04 LTS 64 bit, OpenJDK 64 bit Java 1.7
More details on the task (a rough sketch of this pipeline follows the list below):
One thread reads the file, initializes instances and feeds them into an input buffer (a LinkedBlockingQueue).
Eight threads calculate some measures for each given instance. The calculation results are fed into an output buffer (another LinkedBlockingQueue).
One thread writes the results to a file (somewhat I/O intensive).
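A rough sketch of that pipeline, assuming plain threads and a poison-pill shutdown; the instance/result types and the file I/O are reduced to strings here, so this is an illustration of the structure rather than the asker's code.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// One reader thread -> input queue -> eight worker threads -> output queue -> one writer thread.
public class QueuePipeline {
    private static final String END = "__END__"; // poison pill marking the end of the stream

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> input = new LinkedBlockingQueue<>();
        BlockingQueue<String> output = new LinkedBlockingQueue<>();
        int workerCount = 8;

        // Reader: stands in for the thread that reads the file and creates instances.
        Thread reader = new Thread(() -> {
            try {
                for (int i = 0; i < 100_000; i++) {
                    input.put("instance-" + i);
                }
                for (int i = 0; i < workerCount; i++) {
                    input.put(END); // one poison pill per worker
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Workers: stand in for the CPU-bound "measure" calculation.
        List<Thread> workers = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            Thread t = new Thread(() -> {
                try {
                    String item;
                    while (!(item = input.take()).equals(END)) {
                        output.put(item + " -> measure=" + item.hashCode());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(t);
        }

        // Writer: stands in for the thread doing the (I/O-heavy) result writing.
        Thread writer = new Thread(() -> {
            try {
                long written = 0;
                while (!output.take().equals(END)) {
                    written++; // real code would write the result to a file here
                }
                System.out.println("results written: " + written);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        reader.start();
        workers.forEach(Thread::start);
        writer.start();

        reader.join();
        for (Thread t : workers) {
            t.join();
        }
        output.put(END); // all workers are done, tell the writer to stop
        writer.join();
    }
}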

Java algorithm performance difference between Linux and Windows

We observe a very significant performance difference in a Java algorithm between Linux (Ubuntu) and Windows. The algorithm runs on the JADE multi-agent system, in which each agent has its own process. The agents perform computationally intensive operations and communicate with each other. The difference in performance is 3x-4x in favor of Windows. The test machine is an i7-3770 quad core (8 logical cores). We have checked with Java 7 and 8, 32 bit and 64 bit versions. Under Linux, all the cores have the same load: about 50% is spent in userspace and 50% in the kernel. Under Windows, 80% is spent in user mode and 20% in the kernel.
The PNG files contain the Java profile information sorted by various columns from Windows and Linux. (jdk8-u25-x64)
Recently we doubled the Linux performance using the "chrt" command: "chrt -f 99 java …", which enables SCHED_FIFO scheduling for the Java process.
We suspect that Linux needlessly switches between tasks too many times.
Is there a way to increase java performance on Linux to match Windows performance?

Why is there such a big difference in memory use of a Java application in Windows XP 32 vs Windows 7 64

I have a little Java application that I wrote for recording my work activities. Since I have it open all day, every day, one of the things that originally concerned me about the choice of language was the amount of memory it would use.
Happily, under Windows XP it would typically consume about 5 MB when minimized and 12 or so when maximized, and it happily runs with -Xmx5M (memory consumption according to Windows Task Manager).
When I upgraded my home PC to newer hardware and, at the same time, to Windows 7 64 (although I installed and am using the 32 bit JVM), I immediately noticed that the JVM for this application now always reports 68 MB+... and that's with -Xmx5M -Xss16K, according to Task Manager's "Working Set".
Both the old and new machines had/have 4 GB of RAM, of which 512 MB is used by video. Both were running recent builds of Java 6 - about update 15 for WinXP, and now update 24 for Win7. The application's footprint on disk is 70 K in 12 classes. Moreover, my work computer still runs Windows XP with Java 6_24, and it shows about 12 MB for this identical application - and by identical I mean literally that, since the two systems are synced for all my development tools.
As a developer, I need to understand the reasons why my applications appear to chew up so much memory.
Can anyone shed some light on this, and suggest how to meaningfully reduce the memory footprint for a Java 6 application?
Edit
The answer may lie in an excessive PermGen size. According to JVisualVM, I have a heap of:
Size: 5.2 MB, Used: 4.3 MB (peak) and Allocated 6.2 MB.
but for the PermGen:
Size: 12.5 MB, Used: 4.6 MB (peak) and Allocated 67.1 MB.
So is it possible that the 68 MB shown in Task Manager in Win 7 is simply requested but unassigned virtual memory?
EDIT 2
Reducing PermGen to 12 MB had no effect on the process RAM, but JVisualVM did show it as reduced (apparently 12 MB constitutes some sort of minimum, because going lower than that had no effect in JVisualVM).
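One way to see those numbers without JVisualVM is to query the JVM's memory pool MXBeans directly. A minimal sketch; the pool names printed vary by JVM version, and a PermGen pool only exists up to Java 7:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Prints used / committed / max bytes for every memory pool the JVM exposes
// (eden, survivor, old gen, PermGen on Java 6/7, code cache, ...).
public class MemoryPools {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            System.out.printf("%-25s used=%,d committed=%,d max=%,d%n",
                    pool.getName(), u.getUsed(), u.getCommitted(), u.getMax());
        }
    }
}

The "committed" value is what the JVM has actually asked the OS for, whereas "max" is only an upper bound, which is the distinction the question about requested but unassigned virtual memory is getting at.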
The 64 bit OS uses 64 bits as the size of pointers while a 32 bit OS uses 32 bits. That might just be one of the reasons though.
Remember that everything in Java is a pointer (for the most part) so when switching to a 64 bit machine, with 64 bit memory addresses, your pointers double in size.
There is an option in the JVM (-XX:+UseCompressedOops) to turn on compressed object pointers. This will put your memory usage near the levels you saw on 32 bit machines.
For the 64-bit JVM you can specify an option that will reduce the overhead of 64-bit addressing. As long as the heap is under 32GB it can essentially compress the pointers back to 32-bit. In a typical Java application this saves about 40% of the memory. See Compressed oops in the Hotspot JVM for more details. Here's the option for it:
-XX:+UseCompressedOops
Note that if you use a 32-bit JVM on a 64-bit OS, you won't pay the overhead of 64-bit memory addresses (although there is a different, very minor, translation overhead to pay there).
I believe that Windows 7 doesn't report memory use the same as XP either (I don't have a reference offhand though). So for a fair comparison of 32-bit vs 64-bit you need to run both on the same version of Windows.
This has nothing to do with the operating system being 64-bit - Windows XP and Windows 7 use RAM in very different ways.
Because of this, Windows 7 will almost always report all programs as using more memory than Windows XP did. Don't worry about this - this is just a result of Windows 7 using your memory the way it should be: as a cache.
While it's not the whole story, the PermGen size is 30% larger on 64-bit platforms:
-XX:MaxPermSize - Size of the Permanent Generation. [5.0 and newer: 64 bit VMs are scaled 30% larger; 1.4 amd64: 96m; 1.3.1 -client: 32m.]
Java 6 HotSpot VM Options
