Java static variables and cache

I have two threads and they are both reading the same static variable (some big object - an array with 500_000_000 ints).
Each thread is pinned to its own CPU (1 and 2) via CPU affinity to minimize jitter.
Will the two threads slow each other down because the same static variable is read by both threads running on different CPUs?
import net.openhft.affinity.AffinityLock;
public class BigObject {
public final int[] array = new int[500_000_000];
public static final BigObject bo_static = new BigObject();
public BigObject() {
for( int i = 0; i<array.length; i++){
array[i]=i;
}
}
public static void main(String[] args) {
final Boolean useStatic = true;
Integer n = 2;
for( int i = 0; i<n; i++){
final int k = i;
Runnable r = new Runnable() {
@Override
public void run() {
BigObject b;
if( useStatic){
b = BigObject.bo_static;
}
else{
b = new BigObject();
}
try (AffinityLock al = AffinityLock.acquireLock()) {
while(true){
long nt1 = System.nanoTime();
double sum = 0;
for( int i : b.array){
sum+=i;
}
long nt2 = System.nanoTime();
double dt = (nt2-nt1)*1e-6;
System.out.println(k + ": sum " + sum + " " + dt);
}
}
}
};
new Thread(r).start();
}
}
}
Thanks

In your case there won't be a slowdown from running it multi-threaded: since you are only reading, there is no need to invalidate any shared state between your CPUs.
Depending on the background load there could be bus limitations, but if the affinity is also defined at the OS level, most of the traffic would be inter-CPU and inter-core communication that is easy to prefetch (since you access the data sequentially) rather than memory-to-CPU communication. Background load would affect performance in the single-threaded case as well, so it is not worth arguing about here.
If the whole system is dedicated to your program, then you would have approximately 20 GB/s of memory bandwidth on modern CPUs, which is more than enough for your data set.
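As a rough back-of-the-envelope check of the bandwidth argument, the sketch below estimates how long one sequential pass over the array would take. It assumes the figure above means roughly 20 gigabytes per second, and the "two threads each get half" case is a deliberately pessimistic simplification, not a measurement. The class and method names are made up for illustration.

```java
final class ScanTimeEstimate {
    /** Seconds needed to stream `elements` ints once at `bytesPerSecond`. */
    static double secondsPerScan(long elements, double bytesPerSecond) {
        return elements * (double) Integer.BYTES / bytesPerSecond;
    }

    public static void main(String[] args) {
        long elements = 500_000_000L; // array length from the question
        double bandwidth = 20e9;      // assumed: ~20 GB/s, the figure quoted in the answer

        // 500_000_000 ints * 4 bytes = 2 GB per pass over the array
        System.out.printf("one thread, full bandwidth: %.3f s per scan%n",
                secondsPerScan(elements, bandwidth));

        // Pessimistic assumption: two threads split the memory bandwidth evenly
        System.out.printf("two threads, half bandwidth each: %.3f s per scan%n",
                secondsPerScan(elements, bandwidth / 2));
    }
}
```

Comparing these estimates with the `dt` values printed by the program in the question would show whether the threads are anywhere near the memory limit.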

Related

Why do consumers decrease the producer's performance

I'm currently trying to increase the performance of my software by implementing the producer-consumer pattern. In my particular case I have a producer that sequentially creates Rows and multiple consumers that perform some task for a given batch of rows.
The problem I'm facing now is that when I measure the performance of my Producer-Consumer pattern, I can see that the producer's running time massively increases and I don't understand why this is the case.
So far I mainly profiled my code and did micro-benchmarking yet the results did not lead me to the actual problem.
public class ProdCons {
static class Row {
String[] _cols;
Row() {
_cols = Stream.generate(() -> "Row-Entry").limit(5).toArray(String[]::new);
}
}
static class Producer {
private static final int N_ITER = 8000000;
final ExecutorService _execService;
final int _batchSize;
final Function<Row[], Consumer> _f;
Producer(final int batchSize, final int nThreads, Function<Row[], Consumer> f) throws InterruptedException {
_execService = Executors.newFixedThreadPool(nThreads);
_batchSize = batchSize;
_f = f;
// init all threads to exclude their generation time
startThreads();
}
private void startThreads() throws InterruptedException {
List<Callable<Void>> l = Stream.generate(() -> new Callable<Void>() {
@Override
public Void call() throws Exception {
Thread.sleep(10);
return null;
}
}).limit(4).collect(Collectors.toList());
_execService.invokeAll(l);
}
long run() throws InterruptedException {
final long start = System.nanoTime();
int idx = 0;
Row[] batch = new Row[_batchSize];
for (int i = 0; i < N_ITER; i++) {
batch[idx++] = new Row();
if (idx == _batchSize) {
_execService.submit(_f.apply(batch));
batch = new Row[_batchSize];
idx = 0;
}
}
final long time = System.nanoTime() - start;
_execService.shutdownNow();
_execService.awaitTermination(100, TimeUnit.MILLISECONDS);
return time;
}
}
static abstract class Consumer implements Callable<String> {
final Row[] _rowBatch;
Consumer(final Row[] data) {
_rowBatch = data;
}
}
static class NoOpConsumer extends Consumer {
NoOpConsumer(Row[] data) {
super(data);
}
@Override
public String call() throws Exception {
return null;
}
}
static class SomeConsumer extends Consumer {
SomeConsumer(Row[] data) {
super(data);
}
@Override
public String call() throws Exception {
String res = null;
for (int i = 0; i < 1000; i++) {
res = "";
for (final Row r : _rowBatch) {
for (final String s : r._cols) {
res += s;
}
}
}
return res;
}
}
public static void main(String[] args) throws InterruptedException {
final int nRuns = 10;
long totTime = 0;
for (int i = 0; i < nRuns; i++) {
totTime += new Producer(100, 1, (data) -> new NoOpConsumer(data)).run();
}
System.out.println("Avg time with NoOpConsumer:\t" + (totTime / 1000000000d) / nRuns + "s");
totTime = 0;
for (int i = 0; i < nRuns; i++) {
totTime += new Producer(100, 1, (data) -> new SomeConsumer(data)).run();
}
System.out.println("Avg time with SomeConsumer:\t" + (totTime / 1000000000d) / nRuns + "s");
}
}
Since the consumers run in different threads than the producer, I would expect the producer's running time not to be affected by the consumers' workload. However, running the program I get the following output:
#1 Thread, #100 batch size
Avg time with NoOpConsumer: 0.7507254368s
Avg time with SomeConsumer: 1.5334749871s
Note that the time measurement only covers the production time, not the consumer time, and that not submitting any jobs at all takes ~0.6 s on average.
Even more surprising is that when I increase the number of threads from 1 to 4, I get the following results (4-cores with hyperthreading).
#4 Threads, #100 batch size
Avg time with NoOpConsumer: 0.7741189636s
Avg time with SomeConsumer: 2.5561667638s
Am I doing something wrong? What am I missing? Currently I have to believe that the running time differences are due to context switches or anything related to my system.
Threads are not completely isolated from one another.
It looks like your SomeConsumer class allocates a lot of memory, and this produces garbage collection work that is shared between all threads, including your producer thread.
It also accesses a lot of memory, which can knock the memory used by the producer out of L1 or L2 cache. Accessing real memory takes a lot longer than accessing cache, so this can make your producer take longer as well.
Note also that I didn't actually verify that you're measuring the producer time properly, and it's easy to make mistakes there.
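To illustrate the allocation point, here is a minimal sketch of how the inner concatenation could be written with a single `StringBuilder` instead of `res += s`, which creates a new `String` on every iteration. The class `ConcatSketch` and the method `join` are hypothetical names, and plain `String[][]` stands in for the `Row` batch. Whether this actually removes the producer slowdown has not been measured here; it only reduces the garbage the consumer creates.

```java
final class ConcatSketch {
    /** Concatenates all columns of all rows using one growing buffer. */
    static String join(String[][] rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] cols : rows) {
            for (String s : cols) {
                sb.append(s); // appends in place, no intermediate String per step
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[][] batch = {
            {"Row-Entry", "Row-Entry"},
            {"Row-Entry"}
        };
        System.out.println(join(batch));
    }
}
```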

Java factorial calculation with threads

I am trying to calculate the factorial of very large numbers using threads, but the threadless function is faster. How can I use parallel computing with threads?
public class Faktoriyel implements Runnable{
private Sayi sayi;
public Sayi faktoriyelSonuc;
public Faktoriyel(Sayi sayi){
this.sayi = sayi;
}
@Override
public void run() {
BigInteger fact = new BigInteger("1");
for (int i = 1 ;i <= sayi.GetSayi().longValue() ; i++) {
fact = fact.multiply(new BigInteger(i + ""));
}
faktoriyelSonuc = new Sayi(fact.toString());
System.out.println(faktoriyelSonuc.GetSayi());
}
}
This is the main class:
public class Project1{
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
long baslangicSeri = System.nanoTime();
System.out.println(SeriFaktoriyel(new Sayi("200000")));
long bitisSeri = System.nanoTime();
double SerigecenSure = (double)(bitisSeri-baslangicSeri)/1000000000;
System.out.println("Seri Hesaplama : "+SerigecenSure+" saniye");
long baslangicParalel = System.nanoTime();
ExecutorService havuz = Executors.newFixedThreadPool(10);
havuz.execute(new Faktoriyel(new Sayi("200000")));
havuz.shutdown();
while(!havuz.isTerminated()){ }
long bitisParalel = System.nanoTime();
double gecenSure = (double)(bitisParalel-baslangicParalel)/1000000000;
System.out.println("Paralel hesaplama : "+gecenSure+" saniye");
}
public static String SeriFaktoriyel(Sayi sayi){
BigInteger fact = new BigInteger("1");
for (int i = 1; i <= sayi.GetSayi().longValue() ; i++) {
fact = fact.multiply(new BigInteger(i + ""));
}
return fact.toString();
}
}
There are two things I can point out that hurt the performance of your threaded version:
System.out.println() has significant overhead, and in the threaded version it is called inside the task itself.
You are using a thread pool of size 10. Unless you have 10 cores on your computer, your program suffers from redundant context switches. (You will get better performance with a thread pool sized to the number of cores your machine actually has.)
If this is unfamiliar to you, I would recommend reading about context switches :)
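Note also that the posted code submits a single task to the pool, so the whole product is still computed by one thread. One possible way to actually use several threads, sketched below under my own assumptions, is to split the range 1..n into chunks, multiply each chunk in its own task, and combine the partial products at the end. The class `ParallelFactorial` and its methods are hypothetical, and a plain `int` replaces the `Sayi` wrapper, which was not shown.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelFactorial {
    /** Product of all integers in [from, to], computed on the calling thread. */
    static BigInteger rangeProduct(int from, int to) {
        BigInteger p = BigInteger.ONE;
        for (int i = from; i <= to; i++) {
            p = p.multiply(BigInteger.valueOf(i));
        }
        return p;
    }

    /** Computes n! by multiplying one chunk of the range per task. */
    static BigInteger factorial(int n, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<BigInteger>> parts = new ArrayList<>();
            int chunk = Math.max(1, (n + nThreads - 1) / nThreads);
            for (int from = 1; from <= n; from += chunk) {
                final int lo = from;
                final int hi = Math.min(n, from + chunk - 1);
                Callable<BigInteger> task = () -> rangeProduct(lo, hi);
                parts.add(pool.submit(task));
            }
            BigInteger result = BigInteger.ONE;
            for (Future<BigInteger> f : parts) {
                result = result.multiply(f.get()); // blocks until that chunk is done
            }
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(factorial(20, threads));
    }
}
```

The chunks here are equal in length but not in cost, since the later chunks multiply larger numbers, so the speedup on a real input such as 200000 would need to be measured rather than assumed.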

unexpected answers in multithreading in java

This is my code, which increments the variable 'res' by one 4*10^7 times using 4 threads:
class MathSin extends Thread {
public double a;
public MathSin(int degree) {
a = degree;
}
@Override
public void run() {
for (int i = 0; i < Math.pow(10., 7); i++)
MathThreads.res++;
}
}
class MathThreads {
public static double res = 0;
public static void main(String args[]) {
MathSin st = new MathSin(8);
MathSin ct = new MathSin(8);
MathSin tt = new MathSin(8);
MathSin qt = new MathSin(8);
st.start();
ct.start();
tt.start();
qt.start();
try { // wait for completion of all thread and then sum
st.join();
ct.join(); // wait for completion of MathCos object
tt.join();
qt.join();
System.out.println(res);
} catch (InterruptedException IntExp) {
}
}
}
These are some of the results:
1.8499044E7
2.3446789E7
.
.
.
I expected to get 4.0E7, but I get a different answer each time.
How can I fix this problem?
What is the problem?
You are observing race conditions while updating the static variable res.
MathThreads.res++
is equivalent to:
double tmp = MathThreads.res;
MathThreads.res = tmp + 1;
Now what happens if two threads read the same value of res into tmp at the same time, and both update res with tmp + 1? One increment is simply lost: res ends up being tmp + 1 instead of tmp + 2.
So with 4 threads updating res concurrently, you end up with an unpredictable result: it is impossible to predict the final value of res because of these race conditions. Two executions of the same code will give you different answers.
How to solve this issue?
To make your code thread-safe, you need to use a thread-safe structure for res: a structure that can be concurrently updated and accessed.
In your case, an AtomicLong seems the perfect choice:
public static AtomicLong res = new AtomicLong(0);
And in the run method:
for (int i = 0; i < Math.pow(10., 7); i++) {
MathThreads.res.incrementAndGet();
}
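Put together, a self-contained version of the fix might look like the sketch below. The class `AtomicCounterDemo` and the method `count` are illustrative names, and the counter is a local `AtomicLong` rather than a static field so the method can be called repeatedly.

```java
import java.util.concurrent.atomic.AtomicLong;

final class AtomicCounterDemo {
    /** Starts nThreads threads that each increment a shared counter, then returns its final value. */
    static long count(int nThreads, int incrementsPerThread) {
        AtomicLong res = new AtomicLong(0);
        Thread[] workers = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    res.incrementAndGet(); // atomic read-modify-write, no lost updates
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // wait for every worker before reading the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return res.get();
    }

    public static void main(String[] args) {
        System.out.println(count(4, 10_000_000));
    }
}
```

With 4 threads and 10^7 increments each, the total is always 4 * 10^7, unlike the unsynchronized `res++`.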

Java Threading- Am I writing thread-safe code?

I'm currently trying to get into parallel processing, so to do this, I'm writing a program that processes an image, giving information about its color values overall- I am doing some tests on this one class with a randomly generated array of integers, and 4 threads are running to process every 4th pixel, from their respective starting places. I was just wondering if this read is thread-safe? Can multiple threads read the same data structure if that's what I want?
import java.awt.image.BufferedImage;
import java.lang.Thread;
public class ImageProcessor extends Thread {
public static void main(String[] args) {
int[] z = new int[10000000];
for (int i = 0; i < 10000000; i++) {
double a = (Math.random()*1000000);
z[i] = (int) a;
}
ImageProcessor ip = new ImageProcessor();
ip.imgRGBPercent(z);
}
public ImageProcessor() {
}
public void process(int[] x, int startPoint) {
(new Thread(new ReadThread(x, startPoint))).start();
}
public int[] imgRGBPercent(int[] x) {
ReadThread first = new ReadThread(x, 0);
ReadThread second = new ReadThread(x, 1);
ReadThread third = new ReadThread(x, 2);
ReadThread fourth = new ReadThread(x, 3);
Thread a = (new Thread(first));
Thread b = (new Thread(second));
Thread c = (new Thread(third));
Thread d = (new Thread(fourth));
long timeMetric = System.currentTimeMillis();
a.start();
b.start();
c.start();
d.start();
try {
a.join();
}
catch (Exception e) {
}
try {
b.join();
}
catch (Exception e) {
}
try {
c.join();
}
catch (Exception e) {
}
try {
d.join();
}
catch (Exception e) {
}
int redTotal, blueTotal, greenTotal;
redTotal = first.getRGBTotals()[0] + second.getRGBTotals()[0] + third.getRGBTotals()[0] + fourth.getRGBTotals()[0];
blueTotal = first.getRGBTotals()[1] + second.getRGBTotals()[1] + third.getRGBTotals()[1] + fourth.getRGBTotals()[1];
greenTotal = first.getRGBTotals()[2] + second.getRGBTotals()[2] + third.getRGBTotals()[2] + fourth.getRGBTotals()[2];
System.out.println(greenTotal);
System.out.println(System.currentTimeMillis() - timeMetric);
timeMetric = System.currentTimeMillis();
ColorValue cv1 = new ColorValue();
int sum = 0;
int sum1 = 0;
int sum2 = 0;
for (int i = 0; i < x.length; i++) {
sum += cv1.getGreen(x[i]);
sum1 += cv1.getRed(x[i]);
sum2 += cv1.getBlue(x[i]);
}
System.out.println(sum);
System.out.println(System.currentTimeMillis() - timeMetric);
int[] out = new int[3];
return out;
}
private class ReadThread implements Runnable {
private int[] colorArr;
private int startPoint, redTotal, blueTotal, greenTotal;
private ColorValue cv;
public ReadThread(int[] x, int startPoint) {
colorArr = x;
this.startPoint = startPoint;
cv = new ColorValue();
}
@Override
public void run() {
//System.out.println("hit");
for (int i = startPoint; i < colorArr.length; i+=4 ) {
redTotal += ColorValue.getRed(colorArr[i]);
blueTotal += ColorValue.getBlue(colorArr[i]);
greenTotal += ColorValue.getGreen(colorArr[i]);
}
}
public int[] getRGBTotals() {
int[] out = new int[3];
out[0] = redTotal;
out[1] = blueTotal;
out[2] = greenTotal;
return out;
}
}
}
Yes. As long as the data structure is not modified while it's being read, you're safe. Every write done before starting a thread will be visible to the started thread.
This logic would concern me a little:
for (int i = startPoint; i < colorArr.length; i+=4 ) {
redTotal += ColorValue.getRed(colorArr[i]);
blueTotal += ColorValue.getBlue(colorArr[i]);
greenTotal += ColorValue.getGreen(colorArr[i]);
}
colorArr is a reference to an array; the reference was passed to the Runnable during the constructor, but the array itself was created outside.
In the complete program you posted, I don't think it's a problem, since this array isn't modified anywhere in your program after the point where you start the threads. But in a larger, "real-world" case, you have to be aware that you're reading colorArr[i] three times and the value may not be the same each time, if there are other threads that could make changes to colorArr. That's one of the things you have to watch out for when writing concurrent code. This would be a little better:
for (int i = startPoint; i < colorArr.length; i+=4 ) {
int color = colorArr[i];
redTotal += ColorValue.getRed(color);
blueTotal += ColorValue.getBlue(color);
greenTotal += ColorValue.getGreen(color);
}
But depending on what your needs are, you may need to take extra steps to make sure no part of the program can modify colorArr at any point while the entire loop is running. Then you need to start looking into lock objects and synchronized, and you'd want to seriously consider setting up a separate class for the colorArr, with methods for modifying and reading the array that are either synchronized methods or contain logic to ensure that things are synchronized properly--by putting the array in its own class, the needed synchronization logic could be encapsulated in that class, so clients of the class wouldn't have to worry about it. That's the kind of thing you need to think about when you start using concurrency.
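A minimal sketch of such a wrapper class is shown below. `SharedPixels` and its methods are hypothetical names, and locking every access on the object's monitor is only the simplest option; a real program might prefer a read/write lock or immutable snapshots, depending on how often the array changes.

```java
final class SharedPixels {
    private final int[] data;

    SharedPixels(int[] initial) {
        this.data = initial.clone(); // private copy, so outside code cannot bypass the locks
    }

    /** Reads one element while holding the lock. */
    synchronized int get(int index) {
        return data[index];
    }

    /** Writes one element while holding the lock. */
    synchronized void set(int index, int value) {
        data[index] = value;
    }

    /** Returns a consistent copy that a reader can loop over without holding the lock. */
    synchronized int[] snapshot() {
        return data.clone();
    }

    int length() {
        return data.length; // array length never changes, so no lock is needed
    }

    public static void main(String[] args) {
        SharedPixels pixels = new SharedPixels(new int[] {1, 2, 3});
        pixels.set(1, 9);
        System.out.println(pixels.get(1));
    }
}
```

A reader that needs the whole loop to see one consistent state can call `snapshot()` once and iterate over the copy.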
Yes, multiple threads can read the same objects safely as long as no other thread is modifying them at the same time. Depending on what you're doing, the Fork-Join framework may be useful: it manages a lot of the threading details for you, so it would be worth investigating.
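As a rough idea of what a Fork-Join version could look like, the sketch below sums one color channel of a shared, read-only array with a `RecursiveTask`. Since `ColorValue` was not shown in the question, the `green` helper here is an assumption (bits 8 to 15 of a packed RGB int), and the threshold is an arbitrary value that would need tuning.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

final class GreenSumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000; // arbitrary cutoff, tune by measuring

    private final int[] pixels;
    private final int from; // inclusive
    private final int to;   // exclusive

    GreenSumTask(int[] pixels, int from, int to) {
        this.pixels = pixels;
        this.from = from;
        this.to = to;
    }

    /** Hypothetical stand-in for ColorValue.getGreen: bits 8..15 of a packed RGB int. */
    static int green(int rgb) {
        return (rgb >> 8) & 0xFF;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += green(pixels[i]); // read-only access to the shared array
            }
            return sum;
        }
        int mid = from + (to - from) / 2;
        GreenSumTask left = new GreenSumTask(pixels, from, mid);
        GreenSumTask right = new GreenSumTask(pixels, mid, to);
        left.fork();                          // run the left half asynchronously
        return right.compute() + left.join(); // compute the right half here, then combine
    }

    static long sumGreen(int[] pixels) {
        return ForkJoinPool.commonPool().invoke(new GreenSumTask(pixels, 0, pixels.length));
    }

    public static void main(String[] args) {
        int[] pixels = new int[1_000_000];
        java.util.Arrays.fill(pixels, 0x00FF00);
        System.out.println(sumGreen(pixels));
    }
}
```

Each task works on a contiguous slice rather than every 4th element, which tends to be friendlier to the cache than the strided loop in the question.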

How to remove manually created pauses in Main-thread?

Problem description:
We are given a matrix randomly filled with digits and have to create a separate thread for each row of the matrix that counts how many times each digit occurs in that row.
Without these sleeps in the main thread, it does not work correctly.
Here's my solution:
public class TestingMatrixThreads {
public static void main(String[] arr) throws InterruptedException {
int[][] a = new int[67][6];
// class.Count works with class.Matrix, that's why I've made it this way
Matrix m = new Matrix(a);
m.start();
Thread.sleep(1000); // Here comes the BIG question -> how to avoid these
// manually created pauses
Count c;
Thread t;
// Creating new threads for each row of the matrix
for (int i = 0; i < Matrix.matr.length; i++) {
c = new Count(i);
t = new Thread(c);
t.start();
}
//Again - the same question
System.out.println("Main - Sleep!");
Thread.sleep(50);
System.out.println("\t\t\t\t\tMain - Alive!");
int sum = 0;
for (int i = 0; i < Count.encounters.length; i++) {
System.out.println(i + "->" + Count.encounters[i]);
sum += Count.encounters[i];
}
System.out.println("Total numbers of digits: " + sum);
}
}
class Count implements Runnable {
int row;
public static int[] encounters = new int[10]; // here I store the number of each digit's(array's index) encounters
public Count(int row) {
this.row = row;
}
public synchronized static void increment(int number) {
encounters[number]++;
}
@Override
public void run() {
System.out.println(Thread.currentThread().getName() + ", searching in row " + row + " STARTED");
for (int col = 0; col < Matrix.matr[0].length; col++) {
increment(Matrix.matr[row][col]);
}
try {
Thread.sleep(1); // If it's missing threads are starting and stopping consequently
} catch (InterruptedException e) {
}
System.out.println(Thread.currentThread().getName() + " stopped!");
}
}
class Matrix extends Thread {
static int[][] matr;
public Matrix(int[][] matr) {
Matrix.matr = matr;
}
@Override
public void run() {
//print();
fill();
System.out.println("matrix filled");
print();
}
public static void fill() {
for (int i = 0; i < matr.length; i++) {
for (int j = 0; j < matr[0].length; j++) {
matr[i][j] = (int) (Math.random() * 10);
}
}
}
public static void print() {
for (int i = 0; i < matr.length; i++) {
for (int j = 0; j < matr[0].length; j++) {
System.out.print(matr[i][j] + " ");
}
System.out.println();
}
}
}
P.S. I'm sorry if this question is too stupid for you to answer, but I'm a newbie in Java programming, as well as it's my very first post in stackoverflow, so please excuse me for the bad formatting, too :)
Thank you in advance!
Replace the Thread.sleep with m.join().
Doing this makes the main thread wait for the other thread to complete its work before it continues its execution.
Cheers
To answer your main question:
Thread.join();
For example:
public static void main(String[] args) throws Exception {
final Thread t = new Thread(new Runnable() {
@Override
public void run() {
System.out.println("Do stuff");
}
});
t.start();
t.join();
}
The start call, as you know, kicks off the other Thread and runs the Runnable. The join call then waits for that started thread to finish.
A more advanced way to deal with multiple threads is with an ExecutorService. This detaches the threads themselves from the tasks they do. You can have a pool of n threads and m > n tasks.
Example:
public static void main(String[] args) throws Exception {
final class PrintMe implements Callable<Void> {
final String toPrint;
public PrintMe(final String toPrint) {
this.toPrint = toPrint;
}
@Override
public Void call() throws Exception {
System.out.println(toPrint);
return null;
}
}
final List<Callable<Void>> callables = new LinkedList<>();
for (int i = 0; i < 10; ++i) {
callables.add(new PrintMe("I am " + i));
}
final ExecutorService es = Executors.newFixedThreadPool(4);
es.invokeAll(callables);
es.shutdown();
es.awaitTermination(1, TimeUnit.DAYS);
}
Here we have 4 threads and 10 tasks.
If you go down this route you probably need to look into the Future API so that you can check whether the tasks completed successfully. You can also return a value from the task; in your case a Callable<Integer> would seem to be appropriate so that you can return the result of your calculation from the call method and gather up the results from the Future.
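Applied to the digit-counting problem, that idea could look roughly like the sketch below. It is a variation on the suggestion above: each task returns an `int[]` of ten per-row counts rather than a single `Integer`, so no shared static array or `synchronized` method is needed. The class `RowDigitCounter` and its method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class RowDigitCounter {
    /** Counts occurrences of the digits 0..9 in the matrix, one task per row. */
    static int[] countDigits(int[][] matrix, int nThreads) {
        ExecutorService es = Executors.newFixedThreadPool(nThreads);
        try {
            List<Callable<int[]>> tasks = new ArrayList<>();
            for (int[] row : matrix) {
                tasks.add(() -> {
                    int[] local = new int[10]; // private to this task, no locking needed
                    for (int digit : row) {
                        local[digit]++;
                    }
                    return local;
                });
            }
            int[] total = new int[10];
            // invokeAll blocks until every task has finished, so no sleeps are required
            for (Future<int[]> f : es.invokeAll(tasks)) {
                int[] local = f.get();
                for (int d = 0; d < 10; d++) {
                    total[d] += local[d];
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            es.shutdown();
        }
    }

    public static void main(String[] args) {
        int[][] matrix = new int[67][6];
        for (int[] row : matrix) {
            for (int j = 0; j < row.length; j++) {
                row[j] = (int) (Math.random() * 10);
            }
        }
        int[] counts = countDigits(matrix, 4);
        for (int d = 0; d < counts.length; d++) {
            System.out.println(d + "->" + counts[d]);
        }
    }
}
```

Here the matrix is filled on the main thread before any task is submitted, which also removes the need for the first sleep.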
As other Answers have stated, you can do this simply using join; e.g.
Matrix m = new Matrix(a);
m.start();
m.join();
However, I just want to note that if you do that, you are not going to get any parallelism from the Matrix thread. You would be better off doing this:
Matrix m = new Matrix(a);
m.run();
i.e. executing the run() method on the main thread. You might get some parallelism by passing m to each "counter" thread, and having them all join the Matrix thread ... but I doubt that it will be worthwhile.
Frankly, I'd be surprised if you get a worthwhile speedup for any of the multi-threading you are trying here:
If the matrix is small, the overheads of creating the threads will dominate.
If the matrix is large, you are liable to run into memory contention issues.
The initialization phase takes O(N^2) computations compared with the parallelized 2nd phase that has N threads doing O(N) computations. Even if you can get a decent speedup in the 2nd phase, the 1st phase is likely to dominate.
