Why do the threads stop when the other starts to execute? - java

I'm new to the Java concurrency API, and I've searched but didn't find an answer to my question.
I have code that looks for every file inside a directory and its subdirectories, and other code that copies every found file that matches a specified pattern.
I separated this code into one Runnable implementation called DirSearch and one Callable implementation called FileSearch, and I submit them using an ExecutorService.
Here is the code:
private boolean execute() {
ExecutorService executor = Executors.newFixedThreadPool(threadsNumber);
BlockingQueue<File> dirQueue = new LinkedBlockingQueue<>();
BlockingQueue<File> fileQueue = new LinkedBlockingQueue<>(10000);
boolean isFinished = false;
try {
for(int i = 0; i < dirThreads; i++) {
executor.submit(new DirSearch(dirQueue, fileQueue, count, dirThreads));
}
count.incrementAndGet();
dirQueue.add(baseDir);
Future<Boolean> future = executor.submit(new FileSearch(fileQueue, outputDirectory, filename));
isFinished = future.get();
} catch(ExecutionException | InterruptedException | RuntimeException ex) {
ex.printStackTrace();
} finally {
executor.shutdownNow();
}
return isFinished;
}
...
private void copyFile(File in, File out) {
Path inPath = Paths.get(in.getAbsolutePath());
Path outPath = Paths.get(out.getAbsolutePath(), in.getName());
try {
main.updateCurrentLabel(outPath.toString());
switch(mode) {
case "1":
Files.copy(inPath, outPath, StandardCopyOption.REPLACE_EXISTING);
break;
case "2":
Files.move(inPath, outPath, StandardCopyOption.REPLACE_EXISTING);
break;
default:
break;
}
main.updateCopiedLabel(String.valueOf(countCpFiles.incrementAndGet()));
} catch(IOException ex) {
ex.printStackTrace();
}
}
...
private class DirSearch implements Runnable {
...
@Override
public void run() {
try {
File dir = dirQueue.take();
while(dir != new File("")) {
File[] elements = dir.listFiles();
if(elements != null) {
for(File element : elements) {
if(element.isDirectory()) {
count.incrementAndGet();
dirQueue.put(element);
} else {
fileQueue.put(element);
}
}
}
if(count.decrementAndGet() == 0) {
end();
}
dir = dirQueue.take();
}
} catch(InterruptedException ex) {
ex.printStackTrace();
}
}
...
}
...
private class FileSearch implements Callable<Boolean> {
...
@Override
public Boolean call() {
boolean isFinished = false;
try {
File file = fileQueue.take();
while(file != new File("")) {
incrementAnalyzed();
String foundFile = file.getName().toLowerCase();
if(foundFile.matches(filename.replace("?", ".?").replace("*", ".*?"))) {
copyFile(file, outputDirectory);
}
file = fileQueue.take();
}
isFinished = true;
} catch(InterruptedException ex) {
ex.printStackTrace();
}
return isFinished;
}
}
The problem is: when FileSearch starts to copy files, the other threads (DirSearch) stop and don't look for any new files until the copy is completed. Why is this happening? Am I doing something wrong, or is this not the correct approach?

Two possible explanations come to mind, though I can't guarantee that either applies to your specific situation:
1. The Java VM gets only one core from your CPU, which means it can only run one thread at a time.
2. Your threads both use the same variable, which means only one of them is allowed to manipulate it at a time. For this specific problem, look up the Java keyword "synchronized".
My guess is that the root of the problem is #1.
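Hypothesis #1 is cheap to check before digging further. The following is a minimal sketch, assuming a hypothetical helper class CoreCheck that is not part of the question's code; it only asks the JVM how many processors it may use.

```java
// Hypothetical helper (not from the question) for checking hypothesis #1.
class CoreCheck {
    /** Number of processors the JVM reports as available to it. */
    static int availableCores() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        int cores = availableCores();
        System.out.println("Processors available to the JVM: " + cores);
        if (cores == 1) {
            // With a single core, threads take turns instead of running in parallel.
            System.out.println("Only one thread can run at any instant.");
        }
    }
}
```

If this reports more than one processor, explanation #1 probably does not apply and the cause is more likely in how the threads share state.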

Related

How to apply multithreading in java to find a word in a directory(main directory and the word given) and recursively its subdirectories

I am quite new on Stack Overflow and a beginner in Java so please forgive me if I have asked this question in an improper way.
PROBLEM
I have an assignment which tells me to make use of multi-threading to search files for a given word, which might be present in any file of type .txt and .html, on any-level in the given directory (So basically the entire directory). The absolute file path of the file has to be displayed on the console if the file contains the given word.
WHAT HAVE I TRIED
So I thought of dividing the task into two parts, searching and multithreading.
I was able to get the searching part working (File_search.java). It gives satisfactory results: it walks the directory and searches every matching file for the given word.
File_search.java
public class File_search{
String fin_output = "";
public String searchInTextFiles(File dir,String search_word) {
File[] a = dir.listFiles();
for(File f : a){
if(f.isDirectory()) {
searchInTextFiles(f,search_word);
}
else if(f.getName().endsWith(".txt") || f.getName().endsWith(".html") || f.getName().endsWith(".htm") ) {
try {
searchInFile(f,search_word);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
return fin_output;
}
public void searchInFile(File f,String search_word) throws FileNotFoundException {
final Scanner sc = new Scanner(f);
while(sc.hasNextLine()) {
final String lineFromFile = sc.nextLine();
if(lineFromFile.contains(search_word)) {
fin_output += "FILE : "+f.getAbsolutePath().toString()+"\n";
}
}
}
Now I want to use multiple threads to run the File_search.java task using a ThreadPoolExecutor service. I'm not sure whether I should do it using Runnable, Callable, the Thread class, or some other method.
Can you please help me with the code to do the multi-threading part? Thanks :)
I agree with the comment by @chrylis -cautiouslyoptimistic, but for the purpose of understanding, the following should help.
A simpler approach is to traverse the directories in the main thread (the logic you have in searchInTextFiles) and run the per-file search (the logic in searchInFile) in a thread pool of, say, 10 threads.
The sample code below illustrates this.
public class Traverser {
private List<Future<String>> futureList = new ArrayList<Future<String>>();
private ExecutorService executorService;
public Traverser() {
executorService = Executors.newFixedThreadPool(10);
}
public static void main(String[] args) throws InterruptedException, ExecutionException {
System.out.println("Started");
long start = System.currentTimeMillis();
Traverser traverser = new Traverser();
traverser.searchInTextFiles(new File("Some Directory Path"), "Some Text");
for (Future<String> future : traverser.futureList) {
System.out.println(future.get());
}
traverser.executorService.shutdown();
while(!traverser.executorService.isTerminated()) {
System.out.println("Not terminated yet, sleeping");
Thread.sleep(1000);
}
long end = System.currentTimeMillis();
System.out.println("Time taken :" + (end - start));
}
public void searchInTextFiles(File dir,String searchWord) {
File[] filesList = dir.listFiles();
for(File file : filesList){
if(file.isDirectory()) {
searchInTextFiles(file,searchWord);
}
else if(file.getName().endsWith(".txt") || file.getName().endsWith(".html") || file.getName().endsWith(".htm") ) {
try {
futureList.add(executorService.submit(new SearcherTask(file,searchWord)));
} catch (Exception e) {
e.printStackTrace();
}
}
}
}}
public class SearcherTask implements Callable<String> {
private File inputFile;
private String searchWord;
public SearcherTask(File inputFile, String searchWord) {
this.inputFile = inputFile;
this.searchWord = searchWord;
}
@Override
public String call() throws Exception {
StringBuilder result = new StringBuilder();
Scanner sc = null;
try {
sc = new Scanner(inputFile);
while (sc.hasNextLine()) {
final String lineFromFile = sc.nextLine();
if (lineFromFile.contains(searchWord)) {
result.append("FILE : " + inputFile.getAbsolutePath().toString() + "\n");
}
}
} catch (Exception e) {
//log error
throw e;
} finally {
// sc is still null if the Scanner constructor threw
if (sc != null) sc.close();
}
return result.toString();
}}
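The sleep loop around isTerminated() in main can usually be replaced by awaitTermination, which blocks until the pool has drained. A minimal sketch of that, assuming a hypothetical class ShutdownSketch and trivial stand-in tasks in place of SearcherTask:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; the tasks stand in for SearcherTask.
class ShutdownSketch {
    /** Submits all tasks, waits for the pool to drain, returns results in submission order. */
    static List<String> runAll(List<Callable<String>> tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(10);
        List<Future<String>> futures = new ArrayList<>();
        for (Callable<String> task : tasks) {
            futures.add(pool.submit(task));
        }
        pool.shutdown(); // accept no new tasks; submitted ones still run
        List<String> results = new ArrayList<>();
        try {
            // Blocks until every task has finished or the timeout elapses.
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
                throw new IllegalStateException("tasks did not finish in time");
            }
            for (Future<String> future : futures) {
                results.add(future.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return results;
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = new ArrayList<>();
        tasks.add(() -> "a.txt");
        tasks.add(() -> "b.html");
        System.out.println(runAll(tasks));
    }
}
```

The one-minute timeout is an arbitrary assumption; pick whatever bound suits the size of the directory tree.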

How to gracefully wait to job task finish in BlockingQueue java

I am writing a job queue using BlockingQueue and ExecutorService. It waits for new data in the queue; whenever data is put into the queue, the executor service fetches it. The problem is that I am using a loop to wait for the queue to have data, so the CPU usage is very high.
I am new to this API and not sure how to improve this.
ExecutorService mExecutorService = Executors.newSingleThreadExecutor();
BlockingQueue<T> mBlockingQueue = new ArrayBlockingQueue();
public void handleRequests() {
Future<T> future = mExecutorService.submit(new WorkerHandler(mBlockingQueue, mQueueState));
try {
value = future.get();
} catch (InterruptedException | ExecutionException e) {
e.printStackTrace();
}
if (mListener != null && returnedValue != null) {
mListener.onNewItemDequeued(value);
}
}
}
private static class WorkerHandler<T> implements Callable<T> {
private final BlockingQueue<T> mBlockingQueue;
private PollingQueueState mQueueState;
PollingRequestHandler(BlockingQueue<T> blockingQueue, PollingQueueState state) {
mBlockingQueue = blockingQueue;
mQueueState = state;
}
@Override
public T call() throws Exception {
T value = null;
while (true) { // problem is here, this loop takes full cpu usage if queue is empty
if (mBlockingQueue.isEmpty()) {
mQueueState = PollingQueueState.WAITING;
} else {
mQueueState = PollingQueueState.FETCHING;
}
if (mQueueState == PollingQueueState.FETCHING) {
try {
value = mBlockingQueue.take();
break;
} catch (InterruptedException e) {
Log.e(TAG, e.getMessage(), e);
break;
}
}
}
Any suggestions on how to improve this would be much appreciated!
You don't need to test whether the queue is empty; just call take(), and the thread blocks until data is available.
When an element is put on the queue, the thread wakes up and the value is set.
If you don't need to cancel the task you just need:
@Override
public T call() throws Exception {
T value = mBlockingQueue.take();
return value;
}
If you want to be able to cancel the task:
@Override
public T call() throws Exception {
T value = null;
while (value==null) {
try {
// poll returns null on timeout, so the loop retries until a value arrives
value = mBlockingQueue.poll(50L, TimeUnit.MILLISECONDS);
} catch (InterruptedException e) {
Log.e(TAG, e.getMessage(), e);
break;
}
}
return value;
}
if (mBlockingQueue.isEmpty()) {
mQueueState = PollingQueueState.WAITING;
} else {
mQueueState = PollingQueueState.FETCHING;
}
if (mQueueState == PollingQueueState.FETCHING)
Remove these lines, the break;, and the matching closing brace.
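Both answers come down to letting take() do the waiting. A minimal, self-contained sketch of that behavior follows; the class BlockingTakeSketch, its method names, and the 100 ms delay are assumptions made for illustration, not part of the question's code. It also shows that a worker blocked in take() can be cancelled, because Future.cancel(true) interrupts it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo class, not the question's WorkerHandler.
class BlockingTakeSketch {
    /** Returns the item received by a worker that blocks in take(). */
    static String takeOne(String item) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // The worker waits inside take() without spinning while the queue is empty.
            Callable<String> worker = queue::take;
            Future<String> future = executor.submit(worker);
            Thread.sleep(100);  // simulate a slow producer
            queue.put(item);    // wakes the blocked worker
            return future.get(5, TimeUnit.SECONDS);
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    /** Returns true if a worker blocked in take() is cancelled and its thread exits. */
    static boolean cancelBlockedTake() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> worker = queue::take;
            Future<String> future = executor.submit(worker);
            Thread.sleep(100);   // give the worker time to block
            future.cancel(true); // interrupts the worker thread
            executor.shutdown();
            return future.isCancelled() && executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(takeOne("job-1"));
        System.out.println("cancelled cleanly: " + cancelBlockedTake());
    }
}
```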

Thread synchronization doesn't work unless I add a print statement

I am making a program that checks if a string is contained in a tree of directories and text files and I use producer-consumer pattern. Unfortunately my consumer thread doesn't want to stop unless I add a print statement. I tried everything - synchronization, making fields volatile but still can't find the problem.
public class Producer
extends Thread
{
private volatile Storage store;
private volatile Reader read;
Producer(Storage store, Reader read){
this.read = read;
this.store = store;
}
public void run()
{
while (!read.isEmpty()) {
String FileName = read.returnAllPaths().peek().getFileName().toString();
String item = null;
try {
item = read.returnAllPaths().take().toString();
} catch (InterruptedException e1) {
e1.printStackTrace();
}
File currentFile = new File(item);
try (BufferedReader reader = new BufferedReader(new FileReader(currentFile))) {
String line;
while ((line = reader.readLine()) != null) {
FileAndLine current = new FileAndLine(FileName, line);
store.fillStore(current);
}
} catch (IOException e) {
e.printStackTrace();
}
}
store.setEndOfPaths(true);
}
}
public class Consumer
extends Thread
{
private volatile Storage store;
private String clue;
public Consumer(Storage store, String clue){
this.store = store;
this.clue = clue;
}
public void run()
{
FileAndLine currentLine;
while(!store.isEndOfPaths() || !store.isEmpty()){
currentLine = store.depleteStore();
System.out.println("q");
if(currentLine.line.contains(clue))
System.out.println(currentLine.FileName + ": " + currentLine.line);
}
}
}
public class Storage {
private BlockingQueue<FileAndLine> Store;
private boolean full;
private volatile boolean endOfPaths;
public Storage(){
Store = new LinkedBlockingQueue<FileAndLine>();
full = false;
}
private boolean isFull(){
return full;
}
public synchronized BlockingQueue<FileAndLine> getStore(){
return this.Store;
}
public synchronized boolean isEmpty(){
return Store.isEmpty();
}
public synchronized void setEndOfPaths(boolean set){
endOfPaths = set;
}
public synchronized boolean isEndOfPaths(){
return endOfPaths;
}
public synchronized void fillStore(FileAndLine line){
while(isFull()){
try {
wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
Store.add(line);
full = false;
notifyAll();
if(Store.size() == 1000){
full = true;
}
}
public synchronized FileAndLine depleteStore(){
FileAndLine line;
if(endOfPaths == true && Store.isEmpty())
{
return new FileAndLine("", "");
}
while(Store.isEmpty())
{
try {
wait();
} catch (InterruptedException e) {
e.printStackTrace();
}
}
line = new FileAndLine(Store.remove());
if(Store.size() < 1000){
full = false;
notifyAll();
}
return line;
}
}
When you do
private volatile Reader read;
This means that the reference read is volatile: when you read this field, its access is volatile. However, it doesn't mean that the object it references is thread-safe. The test you have is
while (!read.isEmpty()) {
and it is this test which has to be thread safe.
Note: when you write to the console, you are indirectly using a synchronized block, because the printing methods of PrintStream are synchronized, and this acts as both a read and a write barrier.
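To make the point concrete: the state that the loop condition actually reads has to be published safely, not just the reference to the object holding it. Here is a minimal sketch, assuming a hypothetical class StopFlagSketch with a volatile boolean field; it is not the Reader or Storage from the question.

```java
// Hypothetical demo class; endOfInput plays the role of the loop's stop condition.
class StopFlagSketch {
    // volatile makes the write in signalEnd() visible to the spinning worker.
    private volatile boolean endOfInput = false;

    void signalEnd() {
        endOfInput = true;
    }

    /** Starts a spinning worker, raises the flag, and reports whether the worker exited in time. */
    static boolean workerStops(long timeoutMillis) {
        StopFlagSketch flag = new StopFlagSketch();
        Thread worker = new Thread(() -> {
            while (!flag.endOfInput) {
                // busy-wait with no println in the loop body
            }
        });
        worker.setDaemon(true); // never keep the JVM alive because of this demo thread
        worker.start();
        try {
            Thread.sleep(50);
            flag.signalEnd();
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStops(2000));
    }
}
```

Without volatile on endOfInput, the worker is allowed to keep seeing the old value, which matches the symptom of a loop that only ends once a print statement is added.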
I am making a program that checks if a string is contained in a tree of directories and text files
String text = "looking-for";
Files.walk(Paths.get("mydir"))
.parallel()
.filter(p -> p.toFile().isFile())
.forEach(p -> {
try {
if (Files.lines(p)
.anyMatch(l -> l.contains(text))) {
System.out.println("file " + p + " contains " + text);
}
} catch (IOException e) {
e.printStackTrace();
}
});
You don't need to manage dividing the work across multiple threads and coordinating them yourself, especially when the task is data processing.
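A self-contained variant of the stream approach is sketched below. The class WalkSearchSketch, its method names, and the temporary files are assumptions made for illustration. Compared with the snippet above, it closes both streams with try-with-resources and treats unreadable files as non-matches.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class built around Files.walk and Files.lines.
class WalkSearchSketch {
    /** True if any line of the file contains text; unreadable files count as no match. */
    static boolean contains(Path file, String text) {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.anyMatch(line -> line.contains(text));
        } catch (IOException | UncheckedIOException e) {
            return false; // e.g. permission denied or not valid UTF-8
        }
    }

    /** Regular files under root that contain text, in sorted order. */
    static List<Path> filesContaining(Path root, String text) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.parallel()
                    .filter(Files::isRegularFile)
                    .filter(p -> contains(p, text))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a small temporary tree and returns the names of the matching files. */
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("walk-sketch");
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.write(root.resolve("a.txt"), Arrays.asList("nothing here"));
            Files.write(sub.resolve("b.txt"), Arrays.asList("we are looking-for this"));
            return filesContaining(root, "looking-for").stream()
                    .map(p -> p.getFileName().toString())
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```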

Why does multithreaded version take the same amount of time as single threaded version?

I have the following work queue implementation, which I use to limit the number of threads in use. It works by me initially adding a number of Runnable objects to the queue, and when I am ready to begin, I run "begin()". At this point I do not add any more to the queue.
public class WorkQueue {
private final int nThreads;
private final PoolWorker[] threads;
private final LinkedList queue;
Integer runCounter;
boolean hasBegun;
public WorkQueue(int nThreads) {
runCounter = 0;
this.nThreads = nThreads;
queue = new LinkedList();
threads = new PoolWorker[nThreads];
hasBegun = false;
for (int i = 0; i < nThreads; i++) {
threads[i] = new PoolWorker();
threads[i].start();
}
}
public boolean isQueueEmpty() {
synchronized (queue) {
if (queue.isEmpty() && runCounter == 0) {
return true;
} else {
return false;
}
}
}
public void begin() {
hasBegun = true;
synchronized (queue) {
queue.notify();
}
}
public void add(Runnable r) {
if (!hasBegun) {
synchronized (queue) {
queue.addLast(r);
runCounter++;
}
} else {
System.out.println("has begun executing. Cannot add more jobs ");
}
}
private class PoolWorker extends Thread {
public void run() {
Runnable r;
while (true) {
synchronized (queue) {
while (queue.isEmpty()) {
try {
queue.wait();
} catch (InterruptedException ignored) {
}
}
r = (Runnable) queue.removeFirst();
}
// If we don't catch RuntimeException,
// the pool could leak threads
try {
r.run();
synchronized (runCounter) {
runCounter--;
}
} catch (RuntimeException e) {
// You might want to log something here
}
}
}
}
}
This is a runnable I use to keep track of when all the jobs on the work queue have finished:
public class QueueWatcher implements Runnable {
private Thread t;
private String threadName;
private WorkQueue wq;
public QueueWatcher(WorkQueue wq) {
this.threadName = "QueueWatcher";
this.wq = wq;
}
@Override
public void run() {
while (true) {
if (wq.isQueueEmpty()) {
java.util.Date date = new java.util.Date();
System.out.println("Finishing and quiting at:" + date.toString());
System.exit(0);
break;
} else {
try {
Thread.sleep(1000);
} catch (InterruptedException ex) {
Logger.getLogger(PlaneGenerator.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
}
public void start() {
wq.begin();
System.out.println("Starting " + threadName);
if (t == null) {
t = new Thread(this, threadName);
t.setDaemon(false);
t.start();
}
}
}
This is how I use them:
WorkQueue wq = new WorkQueue(9); //Get same results regardless of 1,2,3,8,9
QueueWatcher qw = new QueueWatcher(wq);
SomeRunnable1 sm1 = new SomeRunnable1();
SomeRunnable2 sm2 = new SomeRunnable2();
SomeRunnable3 sm3 = new SomeRunnable3();
SomeRunnable4 sm4 = new SomeRunnable4();
SomeRunnable5 sm5 = new SomeRunnable5();
wq.add(sm1);
wq.add(sm2);
wq.add(sm3);
wq.add(sm4);
wq.add(sm5);
qw.start();
But regardless of how many threads I use, the result is always the same: it always takes about 1m 10s to complete. This is about the same as when I used a single-threaded version (when everything ran in main()).
If I set wq to anywhere from 1 to 9 threads, it is always between 1m 8s and 1m 10s. What is the problem? The jobs (SomeRunnable) have nothing to do with each other and cannot block each other.
EDIT: Each of the runnables just read some image files from the filesystems and create new files in a separate directory. The new directory eventually contains about 400 output files.
EDIT: It seems that only one thread is ever doing work. I made the following changes:
I let the PoolWorker store an ID:
PoolWorker(int id){
this.threadId = id;
}
Before running I print the id of the worker.
System.out.println(this.threadId + " got new task");
r.run();
In WorkQueue constructor when creating the poolworkers I do:
for (int i = 0; i < nThreads; i++) {
threads[i] = new PoolWorker(i);
threads[i].start();
}
But it seems that only thread 0 does any work, as the output is always:
0 got new task
Use queue.notifyAll() to start processing.
Currently you're using queue.notify(), which will only wake a single thread. (The big clue that pointed me to this was when you mentioned only a single thread was running.)
Also, synchronizing on the Integer runCounter isn't doing what you think it's doing: runCounter++ actually assigns a new Integer object each time, so you end up synchronizing on many different Integer objects.
On a side note, using raw threads and the wait/notify paradigm is complicated and error-prone even for the best programmers. That is why Java introduced the java.util.concurrent package, which provides thread-safe BlockingQueue implementations and Executors for easily managing multithreaded apps.
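As a rough illustration of that side note, the hand-written WorkQueue and QueueWatcher could be replaced by a fixed thread pool plus an AtomicInteger counter. This is a minimal sketch under stated assumptions: the class PoolSketch and the placeholder job body are hypothetical and stand in for the SomeRunnable classes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical replacement for the hand-written WorkQueue and QueueWatcher.
class PoolSketch {
    /** Runs jobCount placeholder jobs on nThreads threads and returns how many completed. */
    static int runJobs(int nThreads, int jobCount) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        // AtomicInteger replaces synchronizing on a reassigned Integer field.
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < jobCount; i++) {
            pool.execute(() -> {
                // placeholder for the real work, e.g. reading and writing image files
                completed.incrementAndGet();
            });
        }
        pool.shutdown(); // no more jobs will be added, as in begin()
        try {
            // Replaces the QueueWatcher polling loop.
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed jobs: " + runJobs(3, 5));
    }
}
```

All pool threads pull jobs from the executor's internal queue, so no explicit notify or notifyAll call is needed.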

Java Get notification if file is modified

I'm looking for a way to get a notification when a certain file is modified. I want to call a certain method when this happens, but in some cases I want the method not to be called.
I tried the following:
class FileListener extends Thread {
private Node n;
private long timeStamp;
public FileListener(Node n) {
this.n = n;
this.timeStamp = n.getFile().lastModified();
}
private boolean isModified() {
long newStamp = n.getFile().lastModified();
if (newStamp != timeStamp) {
timeStamp = newStamp;
return true;
} else {
return false;
}
}
public void run() {
while(true) {
if (isModified()) {
n.setStatus(STATUS.MODIFIED);
}
try {
Thread.sleep(1000);
} catch(Exception e) {
e.printStackTrace();
}
}
}
}
The Node class contains a reference to the file, a STATUS (enum) and a reference to the FileListener of that file. When the file is modified, I want the status to change to STATUS.MODIFIED. But in some cases the file that the Node refers to is replaced by a new File, and I don't want the status to change to MODIFIED automatically. In that case I tried this:
n.listener.interrupt(); //interrupts the listener
n.setListener(null); //sets listener to null
n.setFile(someNewFile); //Change the file in the node
//Introduce a new listener, which will look at the new file.
n.setListener(new FileListener(n));
n.listener.start(); // start the thread of the new listener
But what I get is an exception thrown by 'Thread.sleep(1000)', because the sleep was interrupted, and when I check the status, it has still been changed to STATUS.MODIFIED.
Am I doing something wrong?
What about the WatchService? http://docs.oracle.com/javase/7/docs/api/java/nio/file/WatchService.html
WatchService watcher = FileSystems.getDefault().newWatchService();
Path dir = ...;
try {
WatchKey key = dir.register(watcher, ENTRY_MODIFY);
} catch (IOException x) {
System.err.println(x);
}
And then:
for (;;) {
//wait for key to be signaled
WatchKey key;
try {
key = watcher.take();
} catch (InterruptedException x) {
return;
}
for (WatchEvent<?> event: key.pollEvents()) {
WatchEvent.Kind<?> kind = event.kind();
if (kind == OVERFLOW) {
continue;
}
...
}
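For completeness, here is a self-contained sketch of the same idea that waits for a bounded time instead of looping forever. The class WatchSketch, its method names, and the 200 ms timeout are assumptions made for illustration. Note the call to key.reset(), which is needed before the key can report further events.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it watches a directory, since WatchService registers directories.
class WatchSketch {
    /** Waits up to timeoutMillis for ENTRY_MODIFY events in dir and returns the changed names. */
    static List<String> modifiedNames(Path dir, long timeoutMillis) {
        List<String> names = new ArrayList<>();
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_MODIFY);
            // poll returns null if nothing was signalled within the timeout
            WatchKey key = watcher.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (key != null) {
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were lost; nothing useful to record
                    }
                    names.add(event.context().toString());
                }
                key.reset(); // re-arm the key so it can be signalled again
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names;
    }

    /** Watches a fresh, untouched temporary directory; no events are expected. */
    static List<String> demoQuietDirectory() {
        try {
            Path dir = Files.createTempDirectory("watch-sketch");
            return modifiedNames(dir, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("changes seen: " + demoQuietDirectory());
    }
}
```

To stop watching one file and start watching another, as in the question, closing the old WatchService (or cancelling its key) avoids the interrupted-sleep problem of the polling thread.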
