For a class assignment, we have to count the number of words in a txt file by splitting it into n segments, where n can be set before launching the program. Each segment should then get its own thread, which counts the words and then stops. At the end, the main thread should collect all the individual word counts and add them together.
This is (part of) what I have written so far:
for (int i = 0; i < segments; i++){
Thread thread = new Thread();
thread.start();
int words = counting(stringarray[i]);
totalwords += words;
long nanos = ManagementFactory.getThreadMXBean().getThreadCpuTime(Thread.currentThread().getId());
System.out.println("This Thread read " + words + " words. The total word count now is " + totalwords +
". The time it took to finish for this thread is " + nanos +".");
System.out.println("Number of active threads from the given thread: " + Thread.activeCount());
}
Now, while this gets the primary job done (counting the words in different threads and adding them to the total), I don't know how to just "leave the thread be" and then add the individual word counts together after every thread has done its job.
Additionally, while this is definitely starting multiple threads, it only ever prints out that I have 2, or maybe 3 threads running at a time, even if I split the txt into 100 segments. Is there a way to have them all run at the same time?
The wording of the question suggests that each thread has its own counter, so I would declare a thread class:
public class WordCounter extends Thread {
private String text;
private int count;
public WordCounter(String text) {
this.text = text;
}
public int getCount() {
return count;
}
@Override
public void run() {
count = counting(text);
}
}
and use it as follows:
WordCounter[] threads = new WordCounter[segments];
for (int i = 0; i < segments; ++i) {
threads[i] = new WordCounter(stringarray[i]);
threads[i].start();
}
int total = 0;
for (int i = 0; i < segments; ++i) {
threads[i].join();
total += threads[i].getCount();
}
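For reference, here is a self-contained sketch of the same idea that can be compiled and run as is. The counting(...) helper is not shown in the question, so the version below (splitting on whitespace) is only an assumed stand-in, and the class name WordCountJoinSketch is made up for the example.

```java
class WordCountJoinSketch {

    // Hypothetical stand-in for the question's counting(...) helper:
    // counts whitespace-separated tokens.
    static int counting(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // One worker per segment; each worker keeps its own count.
    static class WordCounter extends Thread {
        private final String text;
        private int count;

        WordCounter(String text) {
            this.text = text;
        }

        int getCount() {
            return count;
        }

        @Override
        public void run() {
            count = counting(text);
        }
    }

    static int countAll(String[] segments) {
        WordCounter[] threads = new WordCounter[segments.length];
        for (int i = 0; i < segments.length; i++) {
            threads[i] = new WordCounter(segments[i]);
            threads[i].start();
        }
        int total = 0;
        try {
            for (WordCounter t : threads) {
                t.join();              // wait for this worker to finish
                total += t.getCount(); // safe to read after join()
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return total;
    }

    public static void main(String[] args) {
        String[] segments = {"one two three", "four five", "six"};
        System.out.println("total = " + countAll(segments)); // prints "total = 6"
    }
}
```

Note that the work happens inside run(). A plain new Thread() without a Runnable, as in the question, starts and ends immediately, and the counting then runs on the main thread.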
You may use the following code snippet as a basis.
Note that if several threads increment a common variable, the increment has to be thread-safe. That is why an AtomicInteger is used as the counter:
final List<String> segments = new ArrayList<>();
//TODO:Fill segments ... this is up to you
//In case threads will increment same variable it has to be thread-safe
final AtomicInteger worldCount = new AtomicInteger();
//Create Thread for each segment (this is definitely not optimal)
List<Thread> workers = new ArrayList<>(segments.size());
for (int i = 0; i < segments.size(); i++) {
final String segment = segments.get(i);
Thread worker = new Thread(new Runnable() {
@Override
public void run() {
//increment worldCount
worldCount.addAndGet(counting(segment));
}
});
workers.add(worker);
worker.start();
}
//Wait until all Threads are finished
for (Thread worker : workers) {
worker.join();
}
int result = worldCount.get();
The same solution, but with an ExecutorService:
final List<String> segments = new ArrayList<>();
segments.add("seg1");
segments.add("seg2");
segments.add("seg 3");
final AtomicInteger worldCount = new AtomicInteger();
List<Future<Integer>> workers = new ArrayList<>(segments.size());
ExecutorService executor = Executors.newFixedThreadPool(segments.size());
for (String segment : segments) {
Future<Integer> worker = executor.submit(() -> worldCount.addAndGet(counting(segment)));
workers.add(worker);
}
executor.shutdown();
if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
System.out.println("Still waiting...");
System.exit(0);
}
int result = worldCount.get();
System.out.println("result = " + result);
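As a variation on the executor version above, each task can return its own count and the main thread can add the results up from the Futures, so no shared counter is needed. This is only a sketch: counting(...) is an assumed stand-in for the helper from the question, and the class name is made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureWordCountSketch {

    // Hypothetical stand-in for counting(...): whitespace-separated tokens.
    static int counting(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    static int countAll(List<String> segments) {
        if (segments.isEmpty()) {
            return 0; // newFixedThreadPool rejects a pool size of 0
        }
        ExecutorService executor = Executors.newFixedThreadPool(segments.size());
        try {
            List<Future<Integer>> futures = new ArrayList<>(segments.size());
            for (String segment : segments) {
                // The lambda returns a value, so it is submitted as a Callable<Integer>.
                futures.add(executor.submit(() -> counting(segment)));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // blocks until that task is done
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a counting task failed", e.getCause());
        } finally {
            executor.shutdown();
        }
    }
}
```

Because get() waits for each task, no separate awaitTermination timeout is needed to know that the total is complete.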
While testing concurrency, I found something unexpected.
Concurrency was controlled using ConcurrentHashMap and AtomicLong.
public class HumanRepository {
private final static Map<Long, Human> STORE = new ConcurrentHashMap<>();
private AtomicLong sequence = new AtomicLong();
public void save(Human human) {
STORE.put(sequence.incrementAndGet(), human);
}
public int size() {
return STORE.size();
}
public Long getSeq() {
return sequence.get();
}
}
I tested saving in multiple threads.
@Test
void name() throws NoSuchMethodException, InterruptedException {
final int threads = 3_500;
final ExecutorService es = Executors.newFixedThreadPool(threads);
final CountDownLatch count = new CountDownLatch(threads);
final HumanRepository repository = new HumanRepository();
for (int i = 0; i < threads; i++) {
try {
es.execute(() -> repository.save(new Human("aa")));
} finally {
count.countDown();
}
}
count.await();
System.out.println("seq = " + repository.getSeq());
System.out.println("size = " + repository.size());
}
I ran it with 3500 tasks at once and expected 3500 for both seq and size.
But sometimes I get seq=3499, size=3500.
That is strange: seq does not come out as 3500, and it makes no sense for seq to be 3499 when the size is 3500.
I don't understand why seq and the number of entries in the map differ, and why seq is not 3500.
** If I add Thread.sleep(400L); after count.await();, seq is, surprisingly, 3500.
You are not actually waiting for all tasks to complete, which means that if you get the 3500/3500 output, it's by chance.
Specifically, you decrement the countdown latch on the main thread right after scheduling the job, instead of inside the job once it is done. That means your CountDownLatch is basically just another glorified loop variable that doesn't do any inter-thread communication. Try something like this instead:
for (int i = 0; i < threads; i++) {
es.execute(() -> {
repository.save(new Human("aa"));
count.countDown();
});
}
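A minimal, self-contained sketch of that fix follows. The pool size of 8 and the class and method names are arbitrary choices for the example, and an AtomicLong stands in for the repository. Putting countDown() in a finally block is an extra precaution so that a failing task cannot leave the main thread waiting forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

class LatchInsideTaskSketch {

    // Runs the given number of tasks and returns the sequence value
    // observed after every task has counted down.
    static long runTasks(int tasks) {
        ExecutorService es = Executors.newFixedThreadPool(8); // pool size is arbitrary
        CountDownLatch done = new CountDownLatch(tasks);
        AtomicLong sequence = new AtomicLong();
        try {
            for (int i = 0; i < tasks; i++) {
                es.execute(() -> {
                    try {
                        sequence.incrementAndGet(); // the actual work
                    } finally {
                        done.countDown();           // signalled by the worker, after the work
                    }
                });
            }
            done.await(); // returns only once all tasks have counted down
            return sequence.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            es.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("seq = " + runTasks(3_500)); // prints "seq = 3500"
    }
}
```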
You are calling count.countDown() outside the thread that executes HumanRepository.save(), so the main thread does not actually wait for the worker threads to finish.
As a result, you may read repository.getSeq() while a worker thread is still running. Can you try the following code?
final int threads = 3_500;
final ExecutorService es = Executors.newFixedThreadPool(threads);
final CountDownLatch count = new CountDownLatch(threads);
final HumanRepository repository = new HumanRepository();
for (int i = 0; i < threads; i++) {
try {
es.execute(() -> {
repository.save(new Human("aa"));
count.countDown();
});
} finally {
}
}
count.await();
System.out.println("seq = " + repository.getSeq());
System.out.println("size = " + repository.size());
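If the latch is only there to wait for the executor's tasks, another option (not the one used in the answers above) is to drop it and let the executor itself do the waiting with shutdown() followed by awaitTermination(). The sketch below assumes a plain ConcurrentHashMap in place of HumanRepository; the pool size, the timeout and all names are illustrative.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

class AwaitTerminationSketch {

    // Returns {seq, size} as seen after all tasks have finished.
    static long[] saveAll(int tasks) {
        Map<Long, String> store = new ConcurrentHashMap<>();
        AtomicLong sequence = new AtomicLong();
        ExecutorService es = Executors.newFixedThreadPool(8); // pool size is arbitrary
        for (int i = 0; i < tasks; i++) {
            es.execute(() -> store.put(sequence.incrementAndGet(), "aa"));
        }
        es.shutdown(); // no new tasks; already submitted ones still run
        try {
            if (!es.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new long[] {sequence.get(), store.size()};
    }
}
```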
I'd like to keep a counter of executed threads, to use in the same threads that I am executing.
The problem here is that although the counter increases, it increases unevenly and from the console output I got this (I have a for loop that executes 5 threads with ExecutorService):
This is a test. N:3
This is a test. N:4
This is a test. N:4
This is a test. N:4
This is a test. N:4
As you can see instead of getting 1,2,3,4,5 I got 3,4,4,4,4.
I assume this is because the for loop is running fast enough to execute the threads, and the threads are fast enough to execute the code requesting for the counter faster than the counter can update itself (does that even make sense?).
Here is the code (it is smaller and there is no meaningful use for the counter):
for (int i = 0; i < 5; i++)
{
Thread thread;
thread = new Thread()
{
public void run()
{
System.out.println("This is test. N: "+aldo );
//In here there is much more stuff, saying it because it might slow down the execution (if that is the culprit?)
return;
}
};
threadList.add(thread);
}
//later
for (int i = 0; i < threadList.size(); i++)
{
executor.execute(threadList.get(i));
aldo = aldo + 1;
}
executor.shutdown();
try
{
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}
catch (InterruptedException e)
{
}
Yes, the counter aldo (along with a few other lists, I think) is missing from the code; their declarations are very simple.
The best way I know of doing this is by creating a custom thread class with a constructor that passes in a number. The variable holding the number can then be used later for any needed logging. Here is the code I came up with.
public static void main(String[] args) {
class NumberedThread implements Runnable {
private final int number;
public NumberedThread(int number) {
this.number = number;
}
@Override
public void run() {
System.out.println("This is test. N: " + number);
}
}
List<Thread> threadList = new ArrayList<>();
for (int i = 1; i < 6; i++) threadList.add(new Thread(new NumberedThread(i)));
ExecutorService executor = Executors.newFixedThreadPool(10);
for (Thread thread : threadList) executor.execute(thread);
executor.shutdown();
try {
executor.awaitTermination(Long.MAX_VALUE, TimeUnit.NANOSECONDS);
}
catch (InterruptedException ignored) { }
}
You could also use a string object instead if you wanted to name the threads.
aldo is not modified by the tasks in the thread, but instead is modified in the main thread, here:
for (int i = 0; i < threadList.size(); i++) {
executor.execute(threadList.get(i));
//here...
aldo = aldo + 1;
}
Also, since you want a counter whose value can be incremented from several threads, you may use an AtomicInteger rather than an int.
Your code should look like this:
AtomicInteger aldo = new AtomicInteger(1);
for (int i = 0; i < 5; i++) {
executor.execute( () -> {
System.out.println("This is test. N: " + aldo.getAndIncrement());
});
}
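Here is a runnable sketch of the AtomicInteger approach. It also collects the numbers the tasks saw, to show that every task gets a distinct value. The class name and the collecting set are additions for the example only.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounterSketch {

    // Runs the tasks and returns the (sorted) set of numbers they observed.
    static Set<Integer> runTasks(int tasks) {
        AtomicInteger aldo = new AtomicInteger(1);
        Set<Integer> seen = ConcurrentHashMap.newKeySet(); // thread-safe set
        ExecutorService executor = Executors.newFixedThreadPool(tasks);
        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                int n = aldo.getAndIncrement(); // atomic read-and-increment
                seen.add(n);
                System.out.println("This is test. N: " + n);
            });
        }
        executor.shutdown();
        try {
            executor.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new TreeSet<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(runTasks(5)); // last line printed is "[1, 2, 3, 4, 5]"
    }
}
```

Each task gets its own number, but the "This is test" lines may still be printed in any order, because the threads are not ordered relative to each other.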
Let's say I have n threads concurrently taking values from a shared queue:
public class WorkerThread implements Runnable{
private BlockingQueue queue;
private ArrayList<Integer> counts = new ArrayList<>();
private int count=0;
public void run(){
while(true) {
queue.pop();
count++;
}
}
}
Then for each thread, I want to count every 5 seconds how many items it has dequeued, and then store it in its own list (counts)
I've seen in the question "Print 'hello world' every X seconds" how you can run some code every x seconds:
Timer t = new Timer();
t.scheduleAtFixedRate(new TimerTask(){
@Override
public void run(){
counts.add(count);
count = 0;
}
}, 0, 5000);
The problem with this is that I can't access count variable and the list of counts unless they are static. But I don't want them to be static because I don't want the different threads to share those variables.
Any ideas of how to handle this?
I don't think scheduled execution (neither Timer nor ScheduledExecutorService) fits your case, because each scheduled invocation would start a new task with its own while loop, so the number of tasks would grow constantly.
If you don't need to access the list of counts while the workers are running, I would suggest something like this:
static class Task implements Runnable {
private final ThreadLocal<List<Integer>> counts = ThreadLocal.withInitial(ArrayList::new);
private volatile List<Integer> result = new ArrayList<>();
private BlockingQueue<Object> queue;
public Task(BlockingQueue<Object> queue) {
this.queue = queue;
}
@Override
public void run() {
int count = 0;
long start = System.nanoTime();
try {
while (!Thread.currentThread().isInterrupted()) {
queue.take();
count++;
long end = System.nanoTime();
if ((end - start) >= TimeUnit.SECONDS.toNanos(1)) {
counts.get().add(count);
count = 0;
start = end;
}
}
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
// the last value
counts.get().add(count);
// copy the result cause it's not possible
// to access thread local variable outside of this thread
result = counts.get();
}
public List<Integer> getCounts() {
return result;
}
}
public static void main(String[] args) throws Exception {
ExecutorService executorService = Executors.newFixedThreadPool(3);
BlockingQueue<Object> blockingQueue = new LinkedBlockingQueue<>();
Task t1 = new Task(blockingQueue);
Task t2 = new Task(blockingQueue);
Task t3 = new Task(blockingQueue);
executorService.submit(t1);
executorService.submit(t2);
executorService.submit(t3);
for (int i = 0; i < 50; i++) {
blockingQueue.add(new Object());
Thread.sleep(100);
}
// unlike shutdown() interrupts running threads
executorService.shutdownNow();
executorService.awaitTermination(1, TimeUnit.SECONDS);
System.out.println("t1 " + t1.getCounts());
System.out.println("t2 " + t2.getCounts());
System.out.println("t3 " + t3.getCounts());
int total = Stream.concat(Stream.concat(t1.getCounts().stream(), t2.getCounts().stream()), t3.getCounts().stream())
.reduce(0, (a, b) -> a + b);
// 50 as expected
System.out.println(total);
}
Why not a static AtomicLong?
Or could the WorkerThread(s) publish that they popped an item to the TimerTask (or somewhere else), so that the TimerTask reads that info?
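One possible reading of this suggestion, sketched below: each worker owns a non-static AtomicInteger, and a single scheduled task periodically moves the current value into that worker's own list with getAndSet(0). The scheduled task only samples the counter and does not run the worker loop. All names, the two-worker setup and the period are assumptions for the example; the question's interval would be 5000 ms.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SampledWorkerSketch implements Runnable {
    private final BlockingQueue<Object> queue;
    private final AtomicInteger count = new AtomicInteger();           // items since the last sample
    private final List<Integer> counts = new CopyOnWriteArrayList<>(); // one entry per sample

    SampledWorkerSketch(BlockingQueue<Object> queue) {
        this.queue = queue;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                queue.take();
                count.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // interrupted while waiting: stop
        }
    }

    // Called from the sampler thread: store the current count and reset it atomically.
    void sample() {
        counts.add(count.getAndSet(0));
    }

    List<Integer> getCounts() {
        return counts;
    }

    // Feeds `items` objects to two workers and returns the sum of all samples.
    static int demo(int items, long periodMillis) {
        BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        SampledWorkerSketch w1 = new SampledWorkerSketch(queue);
        SampledWorkerSketch w2 = new SampledWorkerSketch(queue);
        ExecutorService workers = Executors.newFixedThreadPool(2);
        ScheduledExecutorService sampler = Executors.newSingleThreadScheduledExecutor();
        try {
            workers.execute(w1);
            workers.execute(w2);
            sampler.scheduleAtFixedRate(() -> { w1.sample(); w2.sample(); },
                    periodMillis, periodMillis, TimeUnit.MILLISECONDS);
            for (int i = 0; i < items; i++) {
                queue.add(new Object());
                Thread.sleep(1);
            }
            while (!queue.isEmpty()) {
                Thread.sleep(1); // wait until everything has been taken
            }
            workers.shutdownNow(); // interrupts the blocked take() calls
            workers.awaitTermination(5, TimeUnit.SECONDS);
            sampler.shutdown();
            sampler.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during demo", e);
        } finally {
            workers.shutdownNow();
            sampler.shutdownNow();
        }
        w1.sample(); // pick up whatever was counted after the last scheduled sample
        w2.sample();
        int total = 0;
        for (int c : w1.getCounts()) total += c;
        for (int c : w2.getCounts()) total += c;
        return total;
    }
}
```

Calling SampledWorkerSketch.demo(50, 20) should account for all 50 items across the two workers' lists. How the items are split between the workers is not deterministic.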
I am learning Java and trying to sum the elements of an array with multiple threads, but I always get the wrong result.
I tried 4 different methods of thread synchronization and all of them failed. Everything is explained in the comments.
My result (bad):
Without threads: 4949937, 15ms
With threads: 4944805, 78ms
Maybe I am executing the System.out.println on summarizeT() too early, I mean before all the threads finish their work? With .join() the summarizeT() method works fine. But does .join() block the "main" thread until all the other threads are finished?
Main class:
public class Main
{
static int size = 100000; //size of tab
static int length = 100; //each thread gets 100 elements of tab, thread 1 calculates sum from 0 to 99, thread 2 from 100 to 199 etc.
static int[] tab = new int[size];
static Random generator = new Random();
static void initialize()
{
for (int i = 0; i < size; i++)
tab[i] = generator.nextInt(100);
}
static int summarize() //summarize with only one thread
{
int sum = 0;
for (int i = 0; i < size; i++)
sum += tab[i];
return sum;
}
static int summarizeT() //summarize with more threads (size / length)
{
int threadsCounter = size/length;
int start = 0; //pointer to table from where each thread should start counting
int[] sum = new int[1]; //I am sharing the sum value between threads with table, not sure if it is best method to pass the value between them
sum[0] = 0;
Thread[] threads = new Thread[threadsCounter]; //needed for .join() test
for (int i = 0; i < threadsCounter; i++)
{
threads[i] = new Thread(new MyThread(tab, start, sum));
threads[i].start();
start += length; //moving the start pointer, next thread should start counting from next 100 indexes
}
/*for (int i = 0; i < threadsCounter; i++) // adding .join() solves the problem, but is it a good solution?
{
try {
threads[i].join();
} catch (InterruptedException e) {
e.printStackTrace();
}
}*/
return sum[0];
}
public static void main(String[] args)
{
initialize();
long start = Calendar.getInstance().getTimeInMillis();
System.out.println("Without threads: " + summarize() + ", " + (Calendar.getInstance().getTimeInMillis() - start) + "ms");
start = Calendar.getInstance().getTimeInMillis();
System.out.println("With threads: " + summarizeT() + ", " + (Calendar.getInstance().getTimeInMillis() - start) + "ms"); //giving wrong answer
}
}
MyThread class:
import java.util.concurrent.Semaphore;
public class MyThread extends Thread
{
int[] tab;
int[] sum;
int start;
MyThread(int tab[], int start, int sum[]) //in args: main table, starting index, value that is being shared between threads
{
this.tab = tab;
this.start = start;
this.sum = sum;
}
@Override
public void run()
{
int end = start + Main.length; //place where thread should stop counting
int temp = 0; //needed to sum the "subtable"
while (start < end)
{
temp += tab[start];
start++;
}
// Method 1
Semaphore semaphore = new Semaphore(1);
try {
semaphore.acquire();
} catch (InterruptedException e1) {
e1.printStackTrace();
}
try
{
sum[0] += temp;
} catch (Exception e) {
} finally {
semaphore.release();
}
// Method 2
/*Object lock = new Object();
synchronized(lock)
{
sum[0] += temp;
}*/
// Method 3
/*synchronized(this)
{
sum[0] += temp;
}*/
// Method 4
//summarize(temp);
// Method 5 - no threads synchronization, works only when .join() is used, the same as other methods
//sum[0] += temp;
}
private synchronized void summarize(int value)
{
sum[0] += value;
}
}
Isn't the problem here that you create a lock object, or semaphore object, in each thread, rather than having one object that all threads synchronise on?
Each thread creates its own Semaphore object (for example), so no other thread will ever contend for it. You need to create one object that ALL threads have access to and synchronise on that. You might consider synchronising on the array that you are writing the results into.
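A minimal sketch of that idea, assuming a dedicated lock object created once and captured by every thread (synchronising on the shared sum array itself would work the same way). The class name, method name and chunking parameters are made up for the example, and the threads are joined before the result is read.

```java
import java.util.ArrayList;
import java.util.List;

class SharedLockSumSketch {

    // Sums tab using one thread per chunk; all threads use the SAME lock.
    static int sum(int[] tab, int chunk) {
        final Object lock = new Object(); // created once, shared by every thread
        final int[] sum = new int[1];
        List<Thread> threads = new ArrayList<>();
        for (int start = 0; start < tab.length; start += chunk) {
            final int from = start;
            final int to = Math.min(start + chunk, tab.length);
            Thread t = new Thread(() -> {
                int temp = 0;
                for (int i = from; i < to; i++) {
                    temp += tab[i];
                }
                synchronized (lock) { // only one thread at a time updates the total
                    sum[0] += temp;
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // wait for every partial sum before reading the total
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sum[0];
    }
}
```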
There are some problems with your solution.
You should use an AtomicInteger to hold the result; that way you don't need to synchronize the sum update.
BTW, the way you synchronize is invalid. For a semaphore to work, the same instance has to be shared between all threads. Your try/catch/finally blocks are also wrong: you should do the acquire() and the sum update in one try block and the release() in its finally. The way you did it, the sum update can run even though acquire() failed.
Also, you return from summarizeT() without waiting for the threads to finish. You have to call thread.join() or wait for them in some other way.
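Putting the first and the last point together, a sketch could look like the following. It uses an AtomicInteger for the total and join() for the waiting; the class name and the chunking parameters are illustrative and not taken from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class AtomicSumSketch {

    // Sums tab with one thread per chunk, accumulating into an AtomicInteger.
    static int sum(int[] tab, int chunk) {
        AtomicInteger sum = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int start = 0; start < tab.length; start += chunk) {
            final int from = start;
            final int to = Math.min(start + chunk, tab.length);
            Thread t = new Thread(() -> {
                int temp = 0;
                for (int i = from; i < to; i++) {
                    temp += tab[i];
                }
                sum.addAndGet(temp); // atomic, no explicit lock needed
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // without this, the method could return a partial sum
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sum.get();
    }
}
```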
Problem description:
We are given a matrix randomly filled with digits and have to create a separate thread for each row of the matrix that counts how many times each digit occurs in that row.
Without the sleeps in the main thread, it does not work correctly.
Here's my solution:
public class TestingMatrixThreads {
public static void main(String[] arr) throws InterruptedException {
int[][] a = new int[67][6];
// class.Count works with class.Matrix, that's why I've made it this way
Matrix m = new Matrix(a);
m.start();
Thread.sleep(1000); // Here comes the BIG question -> how to avoid these
// manually created pauses
Count c;
Thread t;
// Creating new threads for each row of the matrix
for (int i = 0; i < Matrix.matr.length; i++) {
c = new Count(i);
t = new Thread(c);
t.start();
}
//Again - the same question
System.out.println("Main - Sleep!");
Thread.sleep(50);
System.out.println("\t\t\t\t\tMain - Alive!");
int sum = 0;
for (int i = 0; i < Count.encounters.length; i++) {
System.out.println(i + "->" + Count.encounters[i]);
sum += Count.encounters[i];
}
System.out.println("Total numbers of digits: " + sum);
}
}
class Count implements Runnable {
int row;
public static int[] encounters = new int[10]; // here I store the number of each digit's(array's index) encounters
public Count(int row) {
this.row = row;
}
public synchronized static void increment(int number) {
encounters[number]++;
}
@Override
public void run() {
System.out.println(Thread.currentThread().getName() + ", searching in row " + row + " STARTED");
for (int col = 0; col < Matrix.matr[0].length; col++) {
increment(Matrix.matr[row][col]);
}
try {
Thread.sleep(1); // If this is missing, the threads start and stop one after another
} catch (InterruptedException e) {
}
System.out.println(Thread.currentThread().getName() + " stopped!");
}
}
class Matrix extends Thread {
static int[][] matr;
public Matrix(int[][] matr) {
Matrix.matr = matr;
}
@Override
public void run() {
//print();
fill();
System.out.println("matrix filled");
print();
}
public static void fill() {
for (int i = 0; i < matr.length; i++) {
for (int j = 0; j < matr[0].length; j++) {
matr[i][j] = (int) (Math.random() * 10);
}
}
}
public static void print() {
for (int i = 0; i < matr.length; i++) {
for (int j = 0; j < matr[0].length; j++) {
System.out.print(matr[i][j] + " ");
}
System.out.println();
}
}
}
P.S. I'm sorry if this question is too stupid for you to answer, but I'm a newbie in Java programming, as well as it's my very first post in stackoverflow, so please excuse me for the bad formatting, too :)
Thank you in advance!
Replace the Thread.sleep with m.join().
This makes the main thread wait for the other thread to complete its work before it continues its execution.
Cheers
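The second sleep in the question (the one before the results are printed) can be replaced the same way, by keeping the counter threads in a list and joining each of them. The sketch below is a simplified stand-in for the question's Matrix and Count classes, so its structure and names are assumptions rather than the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class JoinInsteadOfSleepSketch {

    // Fills a matrix on a separate thread and waits for it with join().
    static int[][] fillInBackground(int rows, int cols) {
        int[][] matrix = new int[rows][cols];
        Random random = new Random();
        Thread filler = new Thread(() -> {
            for (int[] row : matrix) {
                for (int j = 0; j < row.length; j++) {
                    row[j] = random.nextInt(10);
                }
            }
        });
        filler.start();
        joinOrFail(filler); // takes the place of Thread.sleep(1000)
        return matrix;
    }

    // One thread per row; returns how often each digit 0..9 occurs.
    static int[] countDigits(int[][] matrix) {
        int[] encounters = new int[10];
        List<Thread> counters = new ArrayList<>();
        for (int[] row : matrix) {
            Thread t = new Thread(() -> {
                for (int digit : row) {
                    synchronized (encounters) { // shared array, shared lock
                        encounters[digit]++;
                    }
                }
            });
            counters.add(t);
            t.start();
        }
        for (Thread t : counters) {
            joinOrFail(t); // takes the place of Thread.sleep(50)
        }
        return encounters;
    }

    static void joinOrFail(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    public static void main(String[] args) {
        int sum = 0;
        for (int n : countDigits(fillInBackground(67, 6))) {
            sum += n;
        }
        System.out.println("Total numbers of digits: " + sum); // 67 * 6 = 402
    }
}
```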
To answer your main question:
Thread.join();
For example:
public static void main(String[] args) throws Exception {
final Thread t = new Thread(new Runnable() {
@Override
public void run() {
System.out.println("Do stuff");
}
});
t.start();
t.join();
}
The start call, as you know, kicks off the other Thread and runs the Runnable. The join call then waits for that started thread to finish.
A more advanced way to deal with multiple threads is with an ExecutorService. This detaches the threads themselves from the tasks they do. You can have a pool of n threads and m > n tasks.
Example:
public static void main(String[] args) throws Exception {
final class PrintMe implements Callable<Void> {
final String toPrint;
public PrintMe(final String toPrint) {
this.toPrint = toPrint;
}
@Override
public Void call() throws Exception {
System.out.println(toPrint);
return null;
}
}
final List<Callable<Void>> callables = new LinkedList<>();
for (int i = 0; i < 10; ++i) {
callables.add(new PrintMe("I am " + i));
}
final ExecutorService es = Executors.newFixedThreadPool(4);
es.invokeAll(callables);
es.shutdown();
es.awaitTermination(1, TimeUnit.DAYS);
}
Here we have 4 threads and 10 tasks.
If you go down this route, you probably need to look into the Future API so that you can check whether the tasks completed successfully. You can also return a value from the task; in your case a Callable<Integer> would seem to be appropriate so that you can return the result of your calculation from the call method and gather up the results from the Future.
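As a rough sketch of that last idea applied to the digit-counting problem: since each row produces ten counts rather than one number, the tasks below return an int[] (a Callable<int[]>, which is a deviation from the Callable<Integer> mentioned above), and the main thread merges the results from the Futures. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallableDigitCountSketch {

    // One task per row; each task counts into its own local array.
    static int[] countDigits(int[][] matrix, int poolSize) {
        List<Callable<int[]>> tasks = new ArrayList<>();
        for (int[] row : matrix) {
            tasks.add(() -> {
                int[] local = new int[10];
                for (int digit : row) {
                    local[digit]++;
                }
                return local; // no shared state, so no locking is needed
            });
        }
        ExecutorService es = Executors.newFixedThreadPool(poolSize);
        try {
            int[] total = new int[10];
            // invokeAll blocks until all tasks have completed.
            for (Future<int[]> future : es.invokeAll(tasks)) {
                int[] local = future.get(); // rethrows a task failure, if any
                for (int i = 0; i < total.length; i++) {
                    total[i] += local[i];
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a counting task failed", e.getCause());
        } finally {
            es.shutdown();
        }
    }
}
```

With this shape there are no sleeps and no synchronized increments: the only synchronization points are invokeAll and get().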
As other Answers have stated, you can do this simply using join; e.g.
Matrix m = new Matrix(a);
m.start();
m.join();
However, I just want to note that if you do that, you are not going to get any parallelism from the Matrix thread. You would be better off doing this:
Matrix m = new Matrix(a);
m.run();
i.e. executing the run() method on the main thread. You might get some parallelism by passing m to each "counter" thread, and having them all join the Matrix thread ... but I doubt that it will be worthwhile.
Frankly, I'd be surprised if you get a worthwhile speedup for any of the multi-threading you are trying here:
If the matrix is small, the overheads of creating the threads will dominate.
If the matrix is large, you are liable to run into memory contention issues.
The initialization phase takes O(N^2) computations compared with the parallelized 2nd phase that has N threads doing O(N) computations. Even if you can get a decent speedup in the 2nd phase, the 1st phase is likely to dominate.