I am implementing a recommendation algorithm in Java.
However, I have a serious problem: the dataset is very large and the computation is too slow, so I need to parallelize it in Java.
For example, given
for (int i = 0; i < 10000000; i++) { ~~~ }
I want to split this loop into ranges such as
process 1: for (int i = 0; i < 10000; i++)
process 2: for (int i = 10000; i < 20000; i++)
process 3: for (int i = 20000; i < 30000; i++)
...
I know similar methods in Python. How do I do parallel programming in Java?
Hope this will help you.
public class MyRunnable implements Runnable {
private final long countUntil;
MyRunnable(long countUntil) {
this.countUntil = countUntil;
}
@Override
public void run() {
long sum = 0;
for (long i = 1; i < countUntil; i++) {
sum += i;
}
System.out.println(sum);
}
}
import java.util.ArrayList;
import java.util.List;
public class Main {
public static void main(String[] args) {
// We will store the threads so that we can check if they are done
List<Thread> threads = new ArrayList<Thread>();
// We will create 500 threads
for (int i = 0; i < 500; i++) {
Runnable task = new MyRunnable(10000000L + i);
Thread worker = new Thread(task);
// We can set the name of the thread
worker.setName(String.valueOf(i));
// Start the thread; never call run() directly
worker.start();
// Remember the thread for later usage
threads.add(worker);
}
int running = 0;
do {
running = 0;
for (Thread thread : threads) {
if (thread.isAlive()) {
running++;
}
}
System.out.println("We have " + running + " running threads. ");
} while (running > 0);
}
}
I got it from here.
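For the loop splitting the question asks about, a minimal sketch with a fixed thread pool might look like the following. It assumes the iterations are independent of each other; process(i), parallelSum and RangeSplitDemo are hypothetical names standing in for the real loop body, not anything from the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RangeSplitDemo {

    // Hypothetical stand-in for the real per-item work of the loop body.
    static long process(int i) {
        return i;
    }

    // Splits [0, n) into contiguous chunks and sums process(i) over each chunk in parallel.
    static long parallelSum(int n, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (n + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int start = 0; start < n; start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, n); // exclusive upper bound
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (int i = from; i < to; i++) {
                        sum += process(i);
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // blocks until that chunk is finished
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int workers = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(10_000_000, workers));
    }
}
```

Sizing the pool to the number of available processors, rather than starting hundreds of threads, keeps the thread-creation overhead small. Each chunk keeps its own local sum, so no shared variable has to be synchronized.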
I'd like to keep a counter of executed threads, to use in the same threads that I am executing.
The problem here is that although the counter increases, it increases unevenly and from the console output I got this (I have a for loop that executes 5 threads with ExecutorService):
This is a test. N:3
This is a test. N:4
This is a test. N:4
This is a test. N:4
This is a test. N:4
As you can see instead of getting 1,2,3,4,5 I got 3,4,4,4,4.
I assume this is because the for loop is running fast enough to execute the threads, and the threads are fast enough to execute the code requesting for the counter faster than the counter can update itself (does that even make sense?).
Here is the code (it is smaller and there is no meaningful use for the counter):
for (int i = 0; i < 5; i++)
{
Thread thread;
thread = new Thread()
{
public void run()
{
System.out.println("This is test. N: "+aldo );
//In here there is much more stuff, saying it because it might slow down the execution (if that is the culprit?)
return;
}
};
threadList.add(thread);
}
//later
for (int i = 0; i < threadList.size(); i++)
{
executor.execute(threadList.get(i));
aldo = aldo + 1;
}
executor.shutdown();
try
{
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}
catch (InterruptedException e)
{
}
Yes, the counter aldo (along with a few other lists, I think) is missing from the code; their declarations are very simple.
The best way I know of doing this is by creating a custom thread class with a constructor that passes in a number. The variable holding the number can then be used later for any needed logging. Here is the code I came up with.
public static void main(String[] args) {
class NumberedThread implements Runnable {
private final int number;
public NumberedThread(int number) {
this.number = number;
}
@Override
public void run() {
System.out.println("This is test. N: " + number);
}
}
List<Thread> threadList = new ArrayList<>();
for (int i = 1; i < 6; i++) threadList.add(new Thread(new NumberedThread(i)));
ExecutorService executor = Executors.newFixedThreadPool(10);
for (Thread thread : threadList) executor.execute(thread);
executor.shutdown();
try {
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}
catch (InterruptedException ignored) { }
}
You could also use a string object instead if you wanted to name the threads.
aldo is not modified by the tasks in the thread, but instead is modified in the main thread, here:
for (int i = 0; i < threadList.size(); i++) {
executor.execute(threadList.get(i));
//here...
aldo = aldo + 1;
}
Also, since you want a counter that can increase its value in several threads, then you may use an AtomicInteger rather than int.
Your code should look like this:
AtomicInteger aldo = new AtomicInteger(1);
for (int i = 0; i < 5; i++) {
executor.execute( () -> {
System.out.println("This is test. N: " + aldo.getAndIncrement());
});
}
The following code is from chapter 13 of "Understanding the JVM: Advanced Features and Best Practices, Second Edition".
But when the main thread reaches "while (Thread.activeCount() > 1)", it blocks and nothing is printed.
public class Code_12_1 {
public static AtomicInteger race = new AtomicInteger(0);
public static void increment(){
race.incrementAndGet();
}
private static final int THREADS_COUNT = 20;
public static void main(String[] args) throws Exception{
Thread[] threads = new Thread[THREADS_COUNT];
for (int i = 0; i < THREADS_COUNT; i++) {
threads[i] = new Thread(new Runnable() {
@Override
public void run() {
for (int j = 0; j < 10000; j++) {
increment();
}
}
});
threads[i].start();
}
while (Thread.activeCount() > 1)
Thread.yield();
System.out.println(race);
}
}
But when I change "while (Thread.activeCount() > 1)" to "while (Thread.activeCount() > 2)", the program runs correctly and outputs the answer 200000.
So why does the thread block when it executes "while (Thread.activeCount() > 1)"?
I have figured it out. I ran your code, and it turns out that there is already another thread running in the current ThreadGroup besides the main thread, even before you start any of the other threads. So the following print statement gave me a count of 2 (see initialCount):
public static void main(String[] args) throws Exception{
Thread[] threads = new Thread[THREADS_COUNT];
int initialCount = Thread.activeCount();
System.out.println("Initial count: " + initialCount);
for (int i = 0; i < THREADS_COUNT; i++) {
threads[i] = new Thread(new Runnable() {
@Override
public void run() {
for (int j = 0; j < 10000; j++) {
increment();
}
}
});
threads[i].start();
}
while (Thread.activeCount() > 1){
System.out.println(race);
System.out.println(Thread.activeCount());
Thread.yield();
}
}
So, what you have to do is:
while (Thread.activeCount() > initialCount){
Thread.yield();
}
And, it will work as expected.
Reasoning: The loop never exits because, after all the threads you started have finished, this additional thread is still alive, so Thread.activeCount() never drops to 1.
If you insert a print statement before Thread.yield(), you get an output like,
200000
200000
200000
200000
200000
200000
...
...
This supports the reasoning that all the Threads have finished and just 2 are left that keep the loop running.
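An alternative that does not depend on how many threads the JVM or IDE has already started is to join each worker thread directly instead of polling Thread.activeCount(). A minimal sketch, using the same 20 threads and 10000 increments as the code above (the class and method names are made up for illustration):

```java
import java.util.concurrent.atomic.AtomicInteger;

class JoinInsteadOfActiveCount {
    static final int THREADS_COUNT = 20;

    // Starts the workers, waits for each of them with join(), and returns the final count.
    static int runAndCount() throws InterruptedException {
        AtomicInteger race = new AtomicInteger(0);
        Thread[] threads = new Thread[THREADS_COUNT];
        for (int i = 0; i < THREADS_COUNT; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 10000; j++) {
                    race.incrementAndGet();
                }
            });
            threads[i].start();
        }
        // join() returns once that specific thread has terminated,
        // regardless of any other threads alive in the thread group.
        for (Thread t : threads) {
            t.join();
        }
        return race.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runAndCount()); // prints 200000
    }
}
```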
For a class assignment, we have to count the number of words in a txt file by splitting it into n segments, where n can be set before launching the program. Each segment should then get its own thread, which counts the words and then stops. At the end, the main thread should collect all the individual word counts and add them together.
This is (part of) what I wrote so far
for (int i = 0; i < segments; i++){
Thread thread = new Thread();
thread.start();
int words = counting(stringarray[i]);
totalwords += words;
long nanos = ManagementFactory.getThreadMXBean().getThreadCpuTime(Thread.currentThread().getId());
System.out.println("This Thread read " + words + " words. The total word count now is " + totalwords +
". The time it took to finish for this thread is " + nanos +".");
System.out.println("Number of active threads from the given thread: " + Thread.activeCount());
}
Now, while this gets the primary job done (counting the words in different threads and adding them to the total), I don't know how to just "leave the thread be" and then add the individual word counts together after every thread has done its job.
Additionally, while this is definitely starting multiple threads, it only ever prints out that I have 2, or maybe 3 threads running at a time, even if I split the txt into 100 segments. Is there a way to have them all run at the same time?
The wording of the question suggests that each thread has its own counter, so I would declare a thread class:
public class WordCounter extends Thread {
private String text;
private int count;
public WordCounter(String text) {
this.text = text;
}
public int getCount() {
return count;
}
@Override
public void run() {
count = counting(text);
}
}
and use it as follows:
WordCounter[] threads = new WordCounter[segments];
for (int i = 0; i < segments; ++i) {
threads[i] = new WordCounter(stringarray[i]);
threads[i].start();
}
int total = 0;
for (int i = 0; i < segments; ++i) {
threads[i].join();
total += threads[i].getCount();
}
You can use the following code snippet as a basis.
Note that if you increment a shared variable from different threads, the operation has to be thread-safe. That's why an AtomicInteger is used as the counter:
final List<String> segments = new ArrayList<>();
//TODO:Fill segments ... this is up to you
//In case threads will increment same variable it has to be thread-safe
final AtomicInteger worldCount = new AtomicInteger();
//Create Thread for each segment (this is definitely not optimal)
List<Thread> workers = new ArrayList<>(segments.size());
for (int i = 0; i < segments.size(); i++) {
final String segment = segments.get(i);
Thread worker = new Thread(new Runnable() {
@Override
public void run() {
//increment worldCount
worldCount.addAndGet(counting(segment));
}
});
workers.add(worker);
worker.start();
}
//Wait until all Threads are finished
for (Thread worker : workers) {
worker.join();
}
int result = worldCount.get();
The same solution, but with an ExecutorService:
final List<String> segments = new ArrayList<>();
segments.add("seg1");
segments.add("seg2");
segments.add("seg 3");
final AtomicInteger worldCount = new AtomicInteger();
List<Future<Integer>> workers = new ArrayList<>(segments.size());
ExecutorService executor = Executors.newFixedThreadPool(segments.size());
for (String segment : segments) {
Future<Integer> worker = executor.submit(() -> worldCount.addAndGet(counting(segment)));
workers.add(worker);
}
executor.shutdown();
if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
System.out.println("Still waiting...");
System.exit(0);
}
int result = worldCount.get();
System.out.println("result = " + result);
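If you would rather not share a counter at all, each task can return its own count and the main thread can add them up via Future.get(). This is only a sketch: counting() here is a hypothetical whitespace-based stand-in for the method from the question, and the class and method names are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WordCountWithFutures {

    // Hypothetical stand-in for the question's counting(): words separated by whitespace.
    static int counting(String segment) {
        String trimmed = segment.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Counts each segment in its own task and sums the per-segment results.
    static int countAll(List<String> segments) throws Exception {
        if (segments.isEmpty()) {
            return 0; // newFixedThreadPool rejects a pool size of 0
        }
        ExecutorService executor = Executors.newFixedThreadPool(segments.size());
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (String segment : segments) {
                Callable<Integer> task = () -> counting(segment);
                results.add(executor.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // waits for that task, then returns its count
            }
            return total;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> segments = Arrays.asList("one two", "three", " four five six ");
        System.out.println("result = " + countAll(segments)); // prints result = 6
    }
}
```

Because get() blocks until the task is done, no separate awaitTermination timeout is needed to collect the total.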
I have a quad-core CPU, and I've been experimenting with multithreading for performance reasons. I wrote the code below just to see how fast it would go, and I noticed that it's actually slower than the second code block, which uses only the main thread:
int numCrunchers = Runtime.getRuntime().availableProcessors();
public void crunch() {
int numPairs = 1000;
for(int i=0; i < numPairs; i++)
pairs.add(...);
int share = pairs.size()/numCrunchers;
for(int i=0; i < numCrunchers; i++) {
Cruncher cruncher = crunchers.get(i);
for(int j=0; j < share; j++)
cruncher.nodes.add(pairs.poll());
}
for(Cruncher cruncher : crunchers)
threadpool.execute(cruncher);
threadpool.shutdown();
try {
threadpool.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
private class Cruncher implements Runnable {
public BlockingQueue<Pair<PathNode>> nodes = new LinkedBlockingQueue<Pair<PathNode>>();
private AStarPathfinder pathfinder;
private LinkedList<PathNode> path = new LinkedList<PathNode>();
public Cruncher(GridGraph graph) {
pathfinder = new AStarPathfinder(graph);
}
@Override
public void run() {
while(true) {
path.clear();
Pair<PathNode> pair = nodes.poll();
if(pair != null) {
pathfinder.search(path, pair.first(), pair.second());
paths.add(new LinkedList<PathNode>(path));
} else {
System.out.println("This cruncher is done");
break;
}
}
}
}
Each thread took around 34,000,000,000 nanoseconds on my PC, but when I used only the main thread, the whole run took just 1,090,195,046 nanoseconds, roughly a 31x difference.
LinkedList<Pair<PathNode>> pairs = new LinkedList<Pair<PathNode>>();
int numPairs = 1000;
AStarPathfinder pathfinder = new AStarPathfinder(graph);
for(int i=0; i < numPairs; i++)
pairs.add(...);
long current = System.nanoTime();
for(int i=0; i < numPairs; i++) {
Pair<PathNode> pair = pairs.poll();
path.clear();
pathfinder.search(path, pair.first(), pair.second());
}
System.out.printf("Operation took %d nanoseconds", System.nanoTime() - current);
My question is: why does using multiple threads make my program run slower? Is the code not properly taking advantage of all the cores on my CPU? I ran this several times and the results were similar: a 30x or greater time difference between the multithreaded and single-threaded versions.
Edit:
I decided to measure the time of each individual operation in the multithreaded version:
while(true) {
path.clear();
Pair<PathNode> pair = nodes.poll();
if(pair != null) {
long current = System.nanoTime();
pathfinder.search(path, pair.first(), pair.second());
paths.add(new LinkedList<PathNode>(path));
System.out.printf("Took %d nanoseconds\n", System.nanoTime() - current);
} else {
System.out.println("This cruncher is done");
break;
}
}
and in the single-threaded version:
LinkedList<Pair<PathNode>> pairs = new LinkedList<Pair<PathNode>>();
int numPairs = 1000;
AStarPathfinder pathfinder = new AStarPathfinder(graph);
for(int i=0; i < numPairs; i++)
pairs.add(...);
for(int i=0; i < numPairs; i++) {
long current = System.nanoTime();
Pair<PathNode> pair = pairs.poll();
path.clear();
pathfinder.search(path, pair.first(), pair.second());
System.out.printf("Operation took %d nanoseconds", System.nanoTime() - current);
}
Each Cruncher has its own AStarPathfinder instance, so the pathfinder.search() couldn't be causing blocking between each of the threads. The multithreaded application was still much slower.
Problem description:
We are given a matrix randomly filled with digits and have to create a separate thread for each row of the matrix that counts how many times each digit occurs in that row.
Without the sleeps in the main thread, it does not work correctly.
Here's my solution:
public class TestingMatrixThreads {
public static void main(String[] arr) throws InterruptedException {
int[][] a = new int[67][6];
// class.Count works with class.Matrix, that's why I've made it this way
Matrix m = new Matrix(a);
m.start();
Thread.sleep(1000); // Here comes the BIG question -> how to avoid these
// manually created pauses
Count c;
Thread t;
// Creating new threads for each row of the matrix
for (int i = 0; i < Matrix.matr.length; i++) {
c = new Count(i);
t = new Thread(c);
t.start();
}
//Again - the same question
System.out.println("Main - Sleep!");
Thread.sleep(50);
System.out.println("\t\t\t\t\tMain - Alive!");
int sum = 0;
for (int i = 0; i < Count.encounters.length; i++) {
System.out.println(i + "->" + Count.encounters[i]);
sum += Count.encounters[i];
}
System.out.println("Total numbers of digits: " + sum);
}
}
class Count implements Runnable {
int row;
public static int[] encounters = new int[10]; // here I store the number of each digit's(array's index) encounters
public Count(int row) {
this.row = row;
}
public synchronized static void increment(int number) {
encounters[number]++;
}
@Override
public void run() {
System.out.println(Thread.currentThread().getName() + ", searching in row " + row + " STARTED");
for (int col = 0; col < Matrix.matr[0].length; col++) {
increment(Matrix.matr[row][col]);
}
try {
Thread.sleep(1); // Without this pause, the threads start and finish one after another
} catch (InterruptedException e) {
}
System.out.println(Thread.currentThread().getName() + " stopped!");
}
}
class Matrix extends Thread {
static int[][] matr;
public Matrix(int[][] matr) {
Matrix.matr = matr;
}
@Override
public void run() {
//print();
fill();
System.out.println("matrix filled");
print();
}
public static void fill() {
for (int i = 0; i < matr.length; i++) {
for (int j = 0; j < matr[0].length; j++) {
matr[i][j] = (int) (Math.random() * 10);
}
}
}
public static void print() {
for (int i = 0; i < matr.length; i++) {
for (int j = 0; j < matr[0].length; j++) {
System.out.print(matr[i][j] + " ");
}
System.out.println();
}
}
}
P.S. I'm sorry if this question is too stupid for you to answer, but I'm a newbie in Java programming and this is my very first post on Stack Overflow, so please excuse the bad formatting, too :)
Thank you in advance!
Replace the Thread.sleep with m.join().
Doing this makes the main thread wait for the other thread to complete its work before continuing its own execution.
Cheers
To answer your main question:
Thread.join();
For example:
public static void main(String[] args) throws Exception {
final Thread t = new Thread(new Runnable() {
@Override
public void run() {
System.out.println("Do stuff");
}
});
t.start();
t.join();
}
The start call, as you know, kicks off the other Thread and runs the Runnable. The join call then waits for that started thread to finish.
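Applied to the matrix question, the same idea removes the second sleep too: keep a reference to every row thread and join each one before reading the results. The sketch below is an assumption-laden rewrite rather than the original program: it fills the matrix on the main thread, and gives each thread its own slot in a hypothetical perRow array so that no synchronization is needed.

```java
import java.util.Random;

class JoinRowCounters {

    // Starts one thread per row, joins them all, and returns the digit counts 0..9.
    static int[] countDigits(int[][] matrix) throws InterruptedException {
        int rows = matrix.length;
        int[][] perRow = new int[rows][10]; // each thread writes only to its own row
        Thread[] workers = new Thread[rows];
        for (int r = 0; r < rows; r++) {
            final int row = r;
            workers[r] = new Thread(() -> {
                for (int digit : matrix[row]) {
                    perRow[row][digit]++;
                }
            });
            workers[r].start();
        }
        int[] encounters = new int[10];
        for (int r = 0; r < rows; r++) {
            workers[r].join(); // after join() returns, this worker's writes are visible here
            for (int d = 0; d < 10; d++) {
                encounters[d] += perRow[r][d];
            }
        }
        return encounters;
    }

    public static void main(String[] args) throws InterruptedException {
        int[][] matrix = new int[67][6];
        Random random = new Random();
        for (int[] row : matrix) {
            for (int j = 0; j < row.length; j++) {
                row[j] = random.nextInt(10); // digits 0..9
            }
        }
        int[] encounters = countDigits(matrix);
        int sum = 0;
        for (int d = 0; d < 10; d++) {
            System.out.println(d + "->" + encounters[d]);
            sum += encounters[d];
        }
        System.out.println("Total numbers of digits: " + sum);
    }
}
```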
A more advanced way to deal with multiple threads is with an ExecutorService. This detaches the threads themselves from the tasks they do. You can have a pool of n threads and m > n tasks.
Example:
public static void main(String[] args) throws Exception {
final class PrintMe implements Callable<Void> {
final String toPrint;
public PrintMe(final String toPrint) {
this.toPrint = toPrint;
}
@Override
public Void call() throws Exception {
System.out.println(toPrint);
return null;
}
}
final List<Callable<Void>> callables = new LinkedList<>();
for (int i = 0; i < 10; ++i) {
callables.add(new PrintMe("I am " + i));
}
final ExecutorService es = Executors.newFixedThreadPool(4);
es.invokeAll(callables);
es.shutdown();
es.awaitTermination(1, TimeUnit.DAYS);
}
Here we have 4 threads and 10 tasks.
If you go down this route, you will probably need to look into the Future API so that you can check whether the tasks completed successfully. You can also return a value from the task; in your case a Callable<Integer> would seem appropriate, so that you can return the result of your calculation from the call method and gather up the results from the Futures.
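As a rough sketch of that Future-based approach for the matrix problem: each row becomes a Callable and the main thread merges the results. It uses Callable<int[]> rather than Callable<Integer>, because each row produces ten counts; the class name, method name and pool size are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RowCountWithFutures {

    // Submits one task per row to a fixed pool and merges the per-row digit counts.
    static int[] countDigits(int[][] matrix, int poolSize) throws Exception {
        List<Callable<int[]>> tasks = new ArrayList<>();
        for (int[] row : matrix) {
            tasks.add(() -> {
                int[] counts = new int[10];
                for (int digit : row) {
                    counts[digit]++;
                }
                return counts;
            });
        }
        ExecutorService es = Executors.newFixedThreadPool(poolSize);
        try {
            int[] encounters = new int[10];
            // invokeAll blocks until every task has completed.
            for (Future<int[]> future : es.invokeAll(tasks)) {
                int[] counts = future.get(); // rethrows a task failure as ExecutionException
                for (int d = 0; d < 10; d++) {
                    encounters[d] += counts[d];
                }
            }
            return encounters;
        } finally {
            es.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[][] matrix = {{1, 2, 2}, {2, 9, 0}};
        int[] encounters = countDigits(matrix, 4);
        for (int d = 0; d < 10; d++) {
            System.out.println(d + "->" + encounters[d]);
        }
    }
}
```

Here the pool has 4 threads no matter how many rows there are, which is the "n threads, m > n tasks" arrangement described above.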
As other Answers have stated, you can do this simply using join; e.g.
Matrix m = new Matrix(a);
m.start();
m.join();
However, I just want to note that if you do that, you are not going to get any parallelism from the Matrix thread. You would be better off doing this:
Matrix m = new Matrix(a);
m.run();
i.e. executing the run() method on the main thread. You might get some parallelism by passing m to each "counter" thread, and having them all join the Matrix thread ... but I doubt that it will be worthwhile.
Frankly, I'd be surprised if you get a worthwhile speedup for any of the multi-threading you are trying here:
If the matrix is small, the overheads of creating the threads will dominate.
If the matrix is large, you are liable to run into memory contention issues.
The initialization phase takes O(N^2) computations compared with the parallelized 2nd phase that has N threads doing O(N) computations. Even if you can get a decent speedup in the 2nd phase, the 1st phase is likely to dominate.