I am writing an OpenGL app in Java using JOGL. I am trying to completely avoid creating objects during the app's main phase, since that could lead to small periodic lags caused by the GC.
I want to wrap some of JOGL's methods with my own. Imagine a method void method(int[] result, int offset) that receives an array and an offset and writes one integer value into the array at the specified index. I want to wrap it with a simple int getResult().
So I need to create a temporary array somewhere, and I must do that in advance (because of the first point).
But if it is stored in a field of the class containing the wrapper method, I am forced to make the wrapper method synchronized. I know that synchronization under mostly single-threaded access shouldn't produce a big overhead, but I would still like to know whether there is a better solution.
Notes:
Synchronized is not the answer: 3,000,000 empty synchronized blocks (just monitorenter/monitorexit) take 17 ms. You have only 16.(6) ms per frame if you want to keep 60 fps.
As I don't have enough reputation to vote up, the only way I found to show my appreciation for Dave's answer is to write a demo:
class Test {
private static final int CYCLES = 1000000000;
int[] global = new int[1];
ThreadLocal<int[]> local = new ThreadLocal<int[]>();
void _fastButIncorrect() { global[0] = 1; }
synchronized void _slowButCorrect() { global[0] = 1; }
void _amazing() {
int[] tmp = local.get();
if( tmp == null ){
tmp = new int[1];
local.set(tmp);
}
tmp[0] = 1;
}
long fastButIncorrect() {
long l = System.currentTimeMillis();
for (int i = 0; i < CYCLES; i++) _fastButIncorrect();
return System.currentTimeMillis() - l;
}
long slowButCorrect() {
long l = System.currentTimeMillis();
for (int i = 0; i < CYCLES; i++) _slowButCorrect();
return System.currentTimeMillis() - l;
}
long amazing() {
long l = System.currentTimeMillis();
for (int i = 0; i < CYCLES; i++) _amazing();
return System.currentTimeMillis() - l;
}
void test() {
System.out.println(
"fastButIncorrect cold: " + fastButIncorrect() + "\n" +
"slowButCorrect cold: " + slowButCorrect() + "\n" +
"amazing cold: " + amazing() + "\n" +
"fastButIncorrect hot: " + fastButIncorrect() + "\n" +
"slowButCorrect hot: " + slowButCorrect() + "\n" +
"amazing hot: " + amazing() + "\n"
);
}
public static void main(String[] args) {
new Test().test();
}
}
On my machine the results (in ms) are:
fastButIncorrect cold: 40
slowButCorrect cold: 8871
amazing cold: 46
fastButIncorrect hot: 38
slowButCorrect hot: 9165
amazing hot: 41
Thanks again, Dave!
If you don't have too many threads, you can use a ThreadLocal:
ThreadLocal<int[]> tmpArrayThreadLocal = new ThreadLocal<int[]>();
Code to use this:
int[] tmpArray = tmpArrayThreadLocal.get();
if( tmpArray == null ){
tmpArray = new int[100];
tmpArrayThreadLocal.set(tmpArray);
}
method(tmpArray, 5);
You could clean up the code by encapsulating the ThreadLocal in another class.
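A minimal sketch of that encapsulation, assuming Java 8 or later for ThreadLocal.withInitial. The names IntResultBuffer and Wrapper are hypothetical, and method(...) is only a stand-in for the JOGL-style call from the question:

```java
// Hypothetical helper: hands out one scratch array per thread.
final class IntResultBuffer {
    // withInitial creates the array lazily, the first time each thread asks for it.
    private static final ThreadLocal<int[]> SCRATCH =
            ThreadLocal.withInitial(() -> new int[1]);

    private IntResultBuffer() {}

    static int[] get() {
        return SCRATCH.get();
    }
}

final class Wrapper {
    // Stand-in for the real call that writes one int into result[offset].
    static void method(int[] result, int offset) {
        result[offset] = 42;
    }

    // The wrapper from the question: no synchronization, no allocation after
    // the first call on a given thread.
    static int getResult() {
        int[] tmp = IntResultBuffer.get();
        method(tmp, 0);
        return tmp[0];
    }

    public static void main(String[] args) {
        System.out.println(getResult());
    }
}
```

Each thread pays for one small array on its first call, so warming the wrapper up before the main phase keeps that allocation out of the render loop.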
I have a piece of source code in Java 8:
public class Test {
public static void main(String[] args) {
testObject(1.3);
testObject(1.4);
}
private static void testObject(double num) {
System.out.println("test:" + num);
long sta = System.currentTimeMillis();
int size = 10000000;
Object[] o = new Object[(int) (size * num)];
for (int i = 0; i < size; i++) {
o[i] = "" + i;
}
System.out.println("object[]: " + (System.currentTimeMillis() - sta) + " ms");
}
}
Execution result:
test:1.3
object[]: 7694 ms
test:1.4
object[]: 3826 ms
Why is the running time so different when the array length is 1.4 * size?
I wanted to see how Java array assignment works, but I couldn't find anything on Google.
In addition, keep in mind that System.currentTimeMillis returns wall-clock time. If your OS reschedules during the for-loop and a different process gets the CPU, the wall-clock time keeps increasing even though your program is not executing.
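To see the difference between wall-clock time and the time the thread actually spent on the CPU, a rough sketch along these lines can help. It assumes the JVM supports per-thread CPU time through ThreadMXBean; TimingSketch and work(...) are made-up names, and the workload is only a stand-in:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class TimingSketch {
    // Stand-in workload: sums i % 7 for i in [0, n).
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuOk = threads.isCurrentThreadCpuTimeSupported();
        // Repeat the measurement: early runs include JIT warm-up, and any
        // single run may include time spent descheduled by the OS.
        for (int run = 0; run < 5; run++) {
            long wallStart = System.nanoTime();
            long cpuStart = cpuOk ? threads.getCurrentThreadCpuTime() : 0L;
            long result = work(50_000_000);
            long cpuNs = cpuOk ? threads.getCurrentThreadCpuTime() - cpuStart : -1L;
            long wallNs = System.nanoTime() - wallStart;
            System.out.println("run " + run + ": wall " + wallNs / 1_000_000
                    + " ms, cpu " + cpuNs / 1_000_000 + " ms (result " + result + ")");
        }
    }
}
```

If the wall-clock figure is much larger than the CPU figure, the thread spent part of the interval waiting rather than running.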
So, like the question title says, I'm trying to learn multithreaded programming. I have an awkward program to help me understand that multithreading is faster than regular execution. The program has seven classes in one Java file: one test class, three classes that implement Runnable, and three regular classes. The six classes all do the same thing: count to 10 million and return the result. My problem is that the three classes run on three threads didn't return the correct counts as I expected, whereas the three regular classes work fine.
I would really appreciate it if anyone could help me understand why this happens! I am using JDK 9 and Eclipse 2018-12.
import java.time.Duration;
import java.time.Instant;
class MyMultiThreadExample{
public static void main(String[] args) {
GameOne g1 = new GameOne();
GameTwo g2 = new GameTwo();
GameThree g3 = new GameThree();
Thread thread1 = new Thread(g1);
Thread thread2 = new Thread(g2);
Thread thread3 = new Thread(g3);
Instant start1 = Instant.now();
thread1.start();
thread2.start();
thread3.start();
Instant end1 = Instant.now();
long elapsed = Duration.between(start1, end1).toMillis();
int total = g1.getCount() + g2.getCount() + g3.getCount();
System.out.println("MultiThread running cost " + elapsed + " to count " + total + " times");
GameFour g4 = new GameFour();
GameFive g5 = new GameFive();
GameSix g6 = new GameSix();
Instant start2 = Instant.now();
g4.run();
g5.run();
g6.run();
Instant end2 = Instant.now();
long elapsed2 = Duration.between(start2, end2).toMillis();
int total2 = g4.getCount() + g5.getCount() + g6.getCount();
System.out.println("Sequential running cost " + elapsed2 + " to count " + total2 + " times");
}
}
class GameOne implements Runnable {
int count1 = 0;
@Override
public void run() {
for (int i = 0; i < 10000000; i++) {
// System.out.print("Game1 at round " + count + " now");
count1++;
}
}
public int getCount() {
System.out.println("GameOne counts " + count1);
return count1;
}
}
class GameTwo implements Runnable {
int count2 = 0;
@Override
public void run() {
for (int i = 0; i < 10000000; i++) {
// System.out.print("Game2 at round " + count + " now");
count2++;
}
}
public int getCount() {
System.out.println("GameTwo counts " + count2);
return count2;
}
}
class GameThree implements Runnable {
int count3 = 0;
@Override
public void run() {
for (int i = 0; i < 10000000; i++) {
// System.out.print("Game3 at round " + count + " now");
count3++;
}
}
public int getCount() {
System.out.println("GameThree counts " + count3);
return count3;
}
}
class GameFour {
int count4 = 0;
public void run() {
for (int i = 0; i < 10000000; i++) {
// System.out.print("Game3 at round " + count + " now");
count4++;
}
}
public int getCount() {
System.out.println("GameFour counts " + count4);
return count4;
}
}
class GameFive {
int count5 = 0;
public void run() {
for (int i = 0; i < 10000000; i++) {
// System.out.print("Game3 at round " + count + " now");
count5++;
}
}
public int getCount() {
System.out.println("GameFive counts " + count5);
return count5;
}
}
class GameSix {
int count6 = 0;
public void run() {
for (int i = 0; i < 10000000; i++) {
// System.out.print("Game3 at round " + count + " now");
count6++;
}
}
public int getCount() {
System.out.println("GameSix counts " + count6);
return count6;
}
}
I have an awkward program to help me understand that multithreading is faster than regular execution.
It's important to understand that this is not always the case. You should only use multiple threads when you have long-running tasks that can run in parallel. If your tasks are short, they will almost certainly run faster on a single thread, as there is an overhead in creating and especially in synchronizing threads.
With that out of the way, you are not actually measuring the correct time here.
When you call Thread.start(), it will run the relevant Runnable in parallel with the code inside your function.
To let the Threads run until they complete before proceeding, you must call Thread#join():
thread1.start();
thread2.start();
thread3.start();
// all 3 Threads may be running now, but maybe not even started!
// let's wait for them to finish running by joining them
thread1.join();
thread2.join();
thread3.join();
This is the easiest way to wait... but there are others and this is a complex topic.
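You may also run into trouble because your tasks have mutable state (the count variables), and the visibility of changes made by different threads needs to be carefully managed. Here, Thread#join() already guarantees that the main thread sees each counter's final value once the join returns; for state that is read while the threads are still running, you would need something like volatile so that updates become visible to other threads.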
To learn more about concurrency in Java, I recommend you read about it. The Baeldung tutorials are excellent.
You're forgetting to call thread.join() -- this waits until the thread finishes executing.
Otherwise you're reading the counters in the middle of the execution.
Your code should be:
thread1.start();
thread2.start();
thread3.start();
thread1.join();
thread2.join();
thread3.join();
Additionally, all your classes can be compacted into a single class Game:
class Game implements Runnable {
String name;
int count = 0;
public Game(String name) {
this.name = name;
}
@Override
public void run() {
for (int i = 0; i < 10000000; i++) {
// System.out.print(name + " at round " + count + " now");
count++;
}
}
public int getCount() {
System.out.println(name + " counts " + count);
return count;
}
}
Each will have its own counter, and you can run them in a thread or in the same thread by calling run() -- your main method remains mostly unchanged except where they're instantiated. They can be instantiated like:
Game g1 = new Game("GameOne");
Game g2 = new Game("GameTwo");
Game g3 = new Game("GameThree");
Game g4 = new Game("GameFour");
Game g5 = new Game("GameFive");
Game g6 = new Game("GameSix");
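Putting the pieces together, here is a sketch of how the timing and the join() calls could look with the single Game class. JoinDemo and runInThreads are hypothetical names, and the Game class is repeated in compact form only so the example is self-contained:

```java
final class Game implements Runnable {
    private final String name;
    private int count = 0;

    Game(String name) {
        this.name = name;
    }

    @Override
    public void run() {
        for (int i = 0; i < 10_000_000; i++) {
            count++;
        }
    }

    int getCount() {
        System.out.println(name + " counts " + count);
        return count;
    }
}

final class JoinDemo {
    // Starts one thread per game, waits for all of them, then sums the counters.
    static int runInThreads(Game... games) {
        Thread[] threads = new Thread[games.length];
        for (int i = 0; i < games.length; i++) {
            threads[i] = new Thread(games[i]);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // after join() returns, that thread's writes are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        int total = 0;
        for (Game g : games) {
            total += g.getCount();
        }
        return total;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        int total = runInThreads(
                new Game("GameOne"), new Game("GameTwo"), new Game("GameThree"));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("MultiThread running cost " + elapsedMs
                + " ms to count " + total + " times");
    }
}
```

The end timestamp is taken only after every join() has returned, so the measured interval covers the actual work rather than just the cost of starting the threads.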
I have two threads that both read the same static variable (a big object: an array with 500_000_000 ints).
The two threads are each pinned to a CPU (1 and 2) via CPU affinity to minimize jitter.
Do you know whether the two threads will slow each other down because the static variable is read by both threads running on different CPUs?
import net.openhft.affinity.AffinityLock;
public class BigObject {
public final int[] array = new int[500_000_000];
public static final BigObject bo_static = new BigObject();
public BigObject() {
for( int i = 0; i<array.length; i++){
array[i]=i;
}
}
public static void main(String[] args) {
final Boolean useStatic = true;
Integer n = 2;
for( int i = 0; i<n; i++){
final int k = i;
Runnable r = new Runnable() {
@Override
public void run() {
BigObject b;
if( useStatic){
b = BigObject.bo_static;
}
else{
b = new BigObject();
}
try (AffinityLock al = AffinityLock.acquireLock()) {
while(true){
long nt1 = System.nanoTime();
double sum = 0;
for( int i : b.array){
sum+=i;
}
long nt2 = System.nanoTime();
double dt = (nt2-nt1)*1e-6;
System.out.println(k + ": sum " + sum + " " + dt);
}
}
}
};
new Thread(r).start();
}
}
}
Thanks
In your case there won't be a slowdown from doing it multi-threaded: since you are only doing reads, there is no need to invalidate any shared state between your CPUs.
Depending on the background load there could be bus limitations and the like, but if the affinity is also defined at the OS level, most of the traffic would be inter-CPU and inter-core communication that is easy to prefetch (since you access the data sequentially) rather than memory-to-CPU communication. Background load would affect performance in the single-threaded case as well, so there is no need to argue about it.
If the whole system is dedicated to your program, then you would have roughly 20Gb/s of memory bandwidth on modern CPUs, which is more than enough for your data set.
I'm working on a fork of FernFlower from JetBrains and I've been adding minor improvements to it.
One thing that really annoys me about FernFlower is that it infers the type of a local variable from the value pushed by bipush/sipush etc., while Jode and Procyon somehow manage to find the original type of the local variable.
Here is the original source code.
public static void main(String[] args) throws Exception {
int hello = 100;
char a2 = 100;
short y1o = 100;
int hei = 100;
System.out.println(a2+" "+y1o+", "+hei+", "+hello);
}
When decompiled with FernFlower, it outputs this:
public static void main(String[] args) throws Exception {
byte hello = 100;
char a2 = 100;
byte y1o = 100;
byte hei = 100;
System.out.println(a2 + " " + y1o + ", " + hei + ", " + hello);
}
But when decompiled with Jode/Procyon it outputs the original local variable types:
public static void main(String[] args)
throws Exception
{
int hello = 100;
char a2 = 'd';
short y1o = 100;
byte hei = 100;
System.out.println(a2 + " " + y1o + ", " + hei + ", " + hello);
}
I was wondering how this is possible, because I thought no local variable type information was stored at compile time. How can I add the same functionality to FernFlower?
.class files optionally contain a 'LocalVariableTable' attribute for debugging purposes. If you invoke the command javap -l <Class>.class you can see the data if it is present.
So after looking around and debugging, I found that for some reason FernFlower decides to completely ignore some of the data in the LocalVariableTable.
Here is FernFlower's original code for decoding the LocalVariableTable:
public void initContent(ConstantPool pool) throws IOException {
DataInputFullStream data = stream();
int len = data.readUnsignedShort();
if (len > 0) {
mapVarNames = new HashMap<Integer, String>(len);
for (int i = 0; i < len; i++) {
data.discard(4);
int nameIndex = data.readUnsignedShort();
data.discard(2);
int varIndex = data.readUnsignedShort();
mapVarNames.put(varIndex, pool.getPrimitiveConstant(nameIndex).getString());
}
} else {
mapVarNames = Collections.emptyMap();
}
}
If you want type information you need to add the following:
@Override
public void initContent(ConstantPool pool) throws IOException {
DataInputFullStream data = stream();
int len = data.readUnsignedShort();
if (len > 0) {
mapVarNames = new HashMap<Integer, String>(len);
mapVarTypes = new HashMap<Integer, String>(len);
for (int i = 0; i < len; i++) {
int start = data.readUnsignedShort();
int end = start + data.readUnsignedShort();
int nameIndex = data.readUnsignedShort();
int typeIndex = data.readUnsignedShort();
int varIndex = data.readUnsignedShort();
mapVarNames.put(varIndex, pool.getPrimitiveConstant(nameIndex).getString());
mapVarTypes.put(varIndex, pool.getPrimitiveConstant(typeIndex).getString());
}
} else {
mapVarNames = Collections.emptyMap();
mapVarTypes = Collections.emptyMap();
}
}
It now outputs the same code as Jode with proper variable types :)
I wonder why FernFlower chose to ignore this information.
Is it possible to convert the function go into a non-recursive function? Some hints or a starting sketch would be very helpful.
public static TSPSolution solve(CostMatrix _cm, TSPPoint start, TSPPoint[] points, long seed) {
TSPSolution sol = TSPSolution.randomSolution(start, points, seed, _cm);
double t = initialTemperature(sol, 1000);
int frozen = 0;
System.out.println("-- Simulated annealing started with initial temperature " + t + " --");
return go(_cm, sol, t, frozen);
}
private static TSPSolution go(CostMatrix _cm, TSPSolution solution, double t, int frozen) {
if (frozen >= 3) {
return solution;
}
i++;
TSPSolution bestSol = solution;
System.out.println(i + ": " + solution.fitness() + " " + solution.time() + " "
+ solution.penalty() + " " + t);
ArrayList<TSPSolution> nHood = solution.nHood();
int attempts = 0;
int accepted = 0;
while (!(attempts == 2 * nHood.size() || accepted == nHood.size()) && attempts < 500) {
TSPSolution sol = nHood.get(rand.nextInt(nHood.size()));
attempts++;
double deltaF = sol.fitness() - bestSol.fitness();
if (deltaF < 0 || Math.exp(-deltaF / t) > Math.random()) {
accepted++;
bestSol = sol;
nHood = sol.nHood();
}
}
frozen = accepted == 0 ? frozen + 1 : 0;
double newT = coolingSchedule(t);
return go(_cm, bestSol, newT, frozen);
}
This is an easy one, because it is tail-recursive: there is no code between the recursive call & what the function returns. Thus, you can wrap the body of go in a loop while (frozen<3), and return solution once the loop ends. And replace the recursive call with assignments to the parameters: solution=bestSol; t=newT;.
You need to think about two things:
What changes on each step?
When does the algorithm end?
And the answers should be:
bestSol (solution), newT (t), frozen (frozen)
When frozen >= 3 is true
So, the easiest way is just to enclose the whole function in something like
while (frozen < 3) {
...
...
...
frozen = accepted == 0 ? frozen + 1 : 0;
//double newT = coolingSchedule(t);
t = coolingSchedule(t);
solution = bestSol;
}
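The same transformation on a deliberately simplified example. This is a toy stand-in for go, not the TSP code: the state is just a number, a step is "accepted" while t is at least 1, and TailCallToLoop is a made-up name. It only illustrates that replacing the tail call with assignments to the parameters yields the same result:

```java
final class TailCallToLoop {
    // Tail-recursive form: the recursive call is the last thing the method does.
    static double goRecursive(double value, double t, int frozen) {
        if (frozen >= 3) {
            return value;
        }
        boolean accepted = t >= 1.0;             // toy acceptance rule
        double best = accepted ? value - t : value;
        int newFrozen = accepted ? 0 : frozen + 1;
        double newT = t * 0.5;                   // toy cooling schedule
        return goRecursive(best, newT, newFrozen);
    }

    // Iterative form: the exit test becomes the loop condition and the
    // recursive call becomes assignments to the former parameters.
    static double goIterative(double value, double t, int frozen) {
        while (frozen < 3) {
            boolean accepted = t >= 1.0;
            double best = accepted ? value - t : value;
            frozen = accepted ? 0 : frozen + 1;
            t = t * 0.5;
            value = best;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(goRecursive(100.0, 8.0, 0)); // prints 85.0
        System.out.println(goIterative(100.0, 8.0, 0)); // prints 85.0
    }
}
```

Starting from 100 with t = 8, the steps at t = 8, 4, 2 and 1 are accepted (100 - 15 = 85), and the next three steps are rejected, which ends both versions.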
As a rule of thumb, the simplest way to make a recursive function iterative is to load the first element onto a Stack, and instead of calling the recursion, add the result to the Stack.
For instance:
public Item recursive(Item myItem)
{
if (myItem.GetExitCondition().IsMet())
{
return myItem;
}
... do stuff ...
return recursive(myItem);
}
Would become:
public Item iterative(Item myItem)
{
Stack<Item> workStack = new Stack<>();
// Load the first element, otherwise the loop below never runs.
workStack.push(myItem);
while (!workStack.isEmpty())
{
Item workItem = workStack.pop();
if (workItem.GetExitCondition().IsMet())
{
return workItem;
}
... do stuff ...
workStack.push(workItem);
}
// No solution was found (!).
return myItem;
}
This code is untested and may (read: does) contain errors. It may not even compile, but should give you a general idea.
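An explicit stack pays off when the recursion is not a tail call, for example when a method calls itself more than once. Here is a small self-contained sketch with a made-up Node type, summing a binary tree both ways (ArrayDeque is used as the stack; all names are only for illustration):

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class Node {
    final int value;
    final Node left;
    final Node right;

    Node(int value, Node left, Node right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

final class StackTraversal {
    // Two recursive calls per node, so this is not tail-recursive.
    static int sumRecursive(Node n) {
        if (n == null) {
            return 0;
        }
        return n.value + sumRecursive(n.left) + sumRecursive(n.right);
    }

    // Same result with an explicit stack: instead of recursing into a child,
    // push it and let the loop pick it up later.
    static int sumIterative(Node root) {
        Deque<Node> work = new ArrayDeque<>();
        if (root != null) {
            work.push(root); // load the first element
        }
        int sum = 0;
        while (!work.isEmpty()) {
            Node n = work.pop();
            sum += n.value;
            if (n.left != null) {
                work.push(n.left);
            }
            if (n.right != null) {
                work.push(n.right);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Node tree = new Node(1, new Node(2, new Node(4, null, null), null),
                new Node(3, null, null));
        System.out.println(sumRecursive(tree)); // prints 10
        System.out.println(sumIterative(tree)); // prints 10
    }
}
```

For a tail-recursive method such as go, the stack never holds more than one element, which is why a plain loop is enough there.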