How can I avoid ConcurrentModificationException in the below scenario - Java

In the below scenario, I can get a ConcurrentModificationException while calling dslList.clear().
How can I avoid the concurrent modification exception without hampering performance?
public void processFiles(List<FileIdBothDirectoryInformation> files) {
AtomicInteger count = new AtomicInteger(0);
ExecutorService executorService = Executors.newFixedThreadPool(30);
List<Xxx> dslList = new ArrayList<>();
List<CompletableFuture<Void>> completableFutures = files.stream().map(file -> CompletableFuture.runAsync(() -> {
try {
File localFile = nasDataLoader.downloadFile(file.getFilePath());
if(localFile != null) {
List<String> lines = parse(localFile); // element type assumed for illustration
dslList.addAll(lines.stream().map(line -> {
apply(line);
return new Xxx(line);
}).collect(Collectors.toList()));
if(dslList.size() >= BATCH_SIZE || count.incrementAndGet() == files.size()) {
persistenceManager.persist(dslList);
dslList.clear();
}
}
} catch (Exception e) { // exception type and handling elided in the question
}
}, executorService)).collect(Collectors.toList());
List<Void> result = completableFutures.stream().map(CompletableFuture::join).collect(Collectors.toList());
executorService.shutdown();
}
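The ConcurrentModificationException comes from several pool threads mutating and iterating the same unsynchronized ArrayList (addAll, clear, and the iteration inside persist) at the same time. One way to avoid it, sketched below under the assumption that the surrounding class still defines BATCH_SIZE, persistenceManager, and Xxx as in the question (the helper name addAndMaybeFlush and the batchLock field are hypothetical), is to guard the batch behind one lock and hand persist() a defensive copy, so no thread ever iterates a list that another thread is still modifying:

private final Object batchLock = new Object();
private final List<Xxx> dslList = new ArrayList<>();

private void addAndMaybeFlush(List<Xxx> newItems, boolean lastFile) {
    List<Xxx> toPersist = null;
    synchronized (batchLock) {
        dslList.addAll(newItems);
        if (dslList.size() >= BATCH_SIZE || lastFile) {
            toPersist = new ArrayList<>(dslList); // snapshot for persist()
            dslList.clear();
        }
    }
    if (toPersist != null && !toPersist.isEmpty()) {
        persistenceManager.persist(toPersist);    // runs outside the lock
    }
}

Each CompletableFuture task would then call addAndMaybeFlush(parsedItems, count.incrementAndGet() == files.size()) instead of touching dslList directly, so the lock is held only for the short list operations while the persist call itself stays parallel.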

Related

How to declare Futures in Java without warnings?

I'm trying to use Futures, and IntelliJ is giving me various warnings; I'm not sure how to code it 'correctly'. The code works, but obviously I want to learn best practice.
public void futuresTest() {
try {
List<String> valuesToProcess = List.of("A","B","C","D","E");
ExecutorService executor = Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
List<Future<MyObject>> futures = new ArrayList<>();
for(String s : valuesToProcess) {
futures.add((Future<MyObject>) executor.submit(new MyObject(s))); //<THIS
}
LOG.info("Waiting for threads to finish...");
boolean termStatus = executor.awaitTermination(10, TimeUnit.MINUTES);
if (termStatus) {
LOG.info("Success!");
} else {
LOG.warn("Timed Out!");
for(Future<MyObject> f : futures) {
if(!f.isDone()) {
LOG.warn("Failed to process {}:",f);
}
}
}
} catch (InterruptedException e) {
throw new RuntimeException(e);
}
}
Gives Unchecked cast: 'java.util.concurrent.Future<capture<?>>' to 'java.util.concurrent.Future<model.MyObject>'
List<Future> futures = new ArrayList<>();
for(String s : valuesToProcess) {
futures.add( executor.submit(new MyObject(s)));
}
Gives Raw use of parameterized class 'Future'
Is it just supposed to be List<Future<?>> futures = new ArrayList<>();? That has no warnings, but I would think I should be specifying my object.
Based on the comments, it does sound like the correct approach is:
List<Future<?>> futures = new ArrayList<>();
for(String s : valuesToProcess) {
futures.add(executor.submit(new MyObject(s)));
}
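If the goal is a properly typed Future<MyObject> rather than Future<?>, another option is to have MyObject implement Callable<MyObject> and return itself, since ExecutorService.submit(Callable<T>) returns Future<T> with no cast and no warning. The sketch below is illustrative only (the original MyObject is not shown in the question); note also that awaitTermination only stops waiting early if shutdown() has been called first:

class MyObject implements Callable<MyObject> {
    private final String value;
    MyObject(String value) { this.value = value; }
    @Override
    public MyObject call() {
        // ... do the actual processing of value here ...
        return this;
    }
}

List<Future<MyObject>> futures = new ArrayList<>();
for (String s : valuesToProcess) {
    futures.add(executor.submit(new MyObject(s))); // typed Future, no cast needed
}
executor.shutdown();                                // required for awaitTermination to return early
boolean termStatus = executor.awaitTermination(10, TimeUnit.MINUTES);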

CompletableFuture getting stuck and halts processing indefinitely

Hi, can someone help as to why the main below never completes? When I pass 1 to the test method it gets stuck completely; however, passing 2 makes the run OK. I want to understand the actual issue and also what the correct way to code this would be.
public class Test {
private static final ScheduledExecutorService EXECUTOR = Executors.newScheduledThreadPool(1, r -> {
Thread t = defaultThreadFactory().newThread(r);
t.setDaemon(true);
return t;
});
public static void main(String[] args) {
Test t = new Test();
t.test(1).toCompletableFuture().join();
System.out.println("DONE");
}
public CompletionStage<Void> run(int i) {
if (i == 1) throw new RuntimeException();
CompletableFuture<Void> future = new CompletableFuture<>();
future.completeExceptionally(new RuntimeException());
return future;
}
public CompletionStage<Void> test(int i) {
CompletableFuture<Void> future = new CompletableFuture<>();
EXECUTOR.schedule(() -> run(i).handle((output, error) -> {
if (error instanceof CompletionException) {
error = error.getCause();
}
if (error != null) {
CompletableFuture<Void> failedFuture = new CompletableFuture<>();
failedFuture.completeExceptionally(error);
return failedFuture;
}
return completedFuture(output);
}).thenCompose(u -> u).thenApply(future::complete).exceptionally(future::completeExceptionally), 0, TimeUnit.SECONDS);
return future;
}
}
EDIT: I used the solution suggested by Holger and it is working fine. In this solution the handle catches the runtime exception. Why not previously?
public class Test {
private static final ScheduledExecutorService EXECUTOR = Executors.newScheduledThreadPool(1, r -> {
Thread t = defaultThreadFactory().newThread(r);
t.setDaemon(true);
return t;
});
public static void main(String[] args) {
Test t = new Test();
System.out.println("STARTED");
t.test(1).toCompletableFuture().join();
System.out.println("DONE");
}
public CompletionStage<Void> run(int i) {
if (i == 1)
throw new RuntimeException();
CompletableFuture<Void> future = new CompletableFuture<>();
future.completeExceptionally(new RuntimeException());
return future;
}
public CompletionStage<Void> test(int i) {
return completedFuture(i).thenComposeAsync(this::run, r -> EXECUTOR.schedule(r, 0, TimeUnit.SECONDS))
.handle((res, ex) -> {
if (ex == null)
return completedFuture(res);
if (ex instanceof CompletionException) {
ex = ex.getCause();
}
CompletableFuture<Void> failedFuture = new CompletableFuture<>();
failedFuture.completeExceptionally(ex);
return failedFuture;
}).thenCompose(u -> u);
}
}
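The reason handle catches the exception in the revised version but not in the original: with i == 1, run(i) throws synchronously inside the scheduled lambda, before handle (or any other stage) is attached, so the exception simply escapes the scheduled task, nothing ever completes the outer future, and join() blocks forever. In the revised version, thenComposeAsync invokes this::run on the framework's behalf, so the synchronous throw completes that stage exceptionally and flows into handle like any other failure. A minimal sketch of guarding the synchronous throw while keeping the original shape of test(i) (illustrative only, not the posted fix):

EXECUTOR.schedule(() -> {
    CompletionStage<Void> stage;
    try {
        stage = run(i);                      // may throw synchronously when i == 1
    } catch (Throwable t) {
        future.completeExceptionally(t);     // without this, join() would hang
        return;
    }
    stage.whenComplete((value, error) -> {
        if (error != null) future.completeExceptionally(error);
        else future.complete(value);
    });
}, 0, TimeUnit.SECONDS);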

Java multi-threaded with CompletableFuture works slower

I tried to write code for counting files of certain type on my computer.
I tested both the single-thread solution and the multi-threaded asynchronous solution, and it seems like the single-thread one works faster. Is anything wrong with my code? And if not, why isn't it working faster?
The code below:
AsynchFileCounter - the asynchronous version.
ExtensionFilter - the file filter that lists only directories and files with the specified extension.
BasicFileCounter - the single-thread version.
public class AsynchFileCounter {
public int countFiles(String path, String extension) throws InterruptedException, ExecutionException {
ExtensionFilter filter = new ExtensionFilter(extension, true);
File f = new File(path);
return countFilesRecursive(f, filter);
}
private int countFilesRecursive(File f, ExtensionFilter filter) throws InterruptedException, ExecutionException {
return CompletableFuture.supplyAsync(() -> f.listFiles(filter))
.thenApplyAsync(files -> {
int count = 0;
for (File file : files) {
if(file.isFile())
count++;
else
try {
count += countFilesRecursive(file, filter);
} catch (Exception e) {
e.printStackTrace();
}
}
return count;
}).get();
}
}
public class ExtensionFilter implements FileFilter {
private String extension;
private boolean allowDirectories;
public ExtensionFilter(String extension, boolean allowDirectories) {
if(extension.startsWith("."))
extension = extension.substring(1);
this.extension = extension;
this.allowDirectories = allowDirectories;
}
@Override
public boolean accept(File pathname) {
if(pathname.isFile() && pathname.getName().endsWith("." + extension))
return true;
if(allowDirectories) {
if(pathname.isDirectory())
return true;
}
return false;
}
}
public class BasicFileCounter {
public int countFiles(String path, String extension) {
ExtensionFilter filter = new ExtensionFilter(extension, true);
File f = new File(path);
return countFilesRecursive(f, filter);
}
private int countFilesRecursive(File f, ExtensionFilter filter) {
int count = 0;
File [] ar = f.listFiles(filter);
for (File file : ar) {
if(file.isFile())
count++;
else
count += countFilesRecursive(file, filter);
}
return count;
}
}
You have to spawn multiple asynchronous jobs and must not wait immediately for their completion:
public int countFiles(String path, String extension) {
ExtensionFilter filter = new ExtensionFilter(extension, true);
File f = new File(path);
return countFilesRecursive(f, filter).join();
}
private CompletableFuture<Integer> countFilesRecursive(File f, FileFilter filter) {
return CompletableFuture.supplyAsync(() -> f.listFiles(filter))
.thenCompose(files -> {
if(files == null) return CompletableFuture.completedFuture(0);
int count = 0;
CompletableFuture<Integer> fileCount = new CompletableFuture<>(), all=fileCount;
for (File file : files) {
if(file.isFile())
count++;
else
all = countFilesRecursive(file, filter).thenCombine(all, Integer::sum);
}
fileCount.complete(count);
return all;
});
}
Note that File.listFiles may return null.
This code will count all files of a directory immediately but launch a new asynchronous job for sub-directories. The results of the sub-directory jobs are combined via thenCombine, to sum their results. For simplification, we create another CompletableFuture, fileCount to represent the locally counted files. thenCompose returns a future which will be completed with the result of the future returned by the specified function, so the caller can use join() to wait for the final result of the entire operation.
For I/O operations, it may help to use a different thread pool, as the default ForkJoinPool is configured to utilize the CPU cores rather than the I/O bandwidth:
public int countFiles(String path, String extension) {
ExecutorService es = Executors.newFixedThreadPool(30);
ExtensionFilter filter = new ExtensionFilter(extension, true);
File f = new File(path);
int count = countFilesRecursive(f, filter, es).join();
es.shutdown();
return count;
}
private CompletableFuture<Integer> countFilesRecursive(File f,FileFilter filter,Executor e){
return CompletableFuture.supplyAsync(() -> f.listFiles(filter), e)
.thenCompose(files -> {
if(files == null) return CompletableFuture.completedFuture(0);
int count = 0;
CompletableFuture<Integer> fileCount = new CompletableFuture<>(), all=fileCount;
for (File file : files) {
if(file.isFile())
count++;
else
all = countFilesRecursive(file, filter,e).thenCombine(all,Integer::sum);
}
fileCount.complete(count);
return all;
});
}
There is no best number of threads; this depends on the actual execution environment and would be subject to measurement and tuning. When the application is supposed to run in different environments, this should be a configurable parameter.
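For example, a minimal way to make it configurable via a system property (the property name is just an illustration) would be:

// e.g. run with -Dfilecounter.threads=30; falls back to the number of CPU cores otherwise
int parallelism = Integer.getInteger("filecounter.threads",
        Runtime.getRuntime().availableProcessors());
ExecutorService es = Executors.newFixedThreadPool(parallelism);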
But consider that you might be using the wrong tool for the job. An alternative is Fork/Join tasks, which support interacting with the thread pool to determine the current saturation, so once all worker threads are busy, the task will proceed scanning locally with ordinary recursion rather than submitting more asynchronous jobs:
public int countFiles(String path, String extension) {
ExtensionFilter filter = new ExtensionFilter(extension, true);
File f = new File(path);
return POOL.invoke(new FileCountTask(f, filter));
}
private static final int TARGET_SURPLUS = 3, TARGET_PARALLELISM = 30;
private static final ForkJoinPool POOL = new ForkJoinPool(TARGET_PARALLELISM);
static final class FileCountTask extends RecursiveTask<Integer> {
private final File path;
private final FileFilter filter;
public FileCountTask(File file, FileFilter ff) {
this.path = file;
this.filter = ff;
}
@Override
protected Integer compute() {
return scan(path, filter);
}
private static int scan(File directory, FileFilter filter) {
File[] fileList = directory.listFiles(filter);
if(fileList == null || fileList.length == 0) return 0;
List<FileCountTask> recursiveTasks = new ArrayList<>();
int count = 0;
for(File file: fileList) {
if(file.isFile()) count++;
else {
if(getSurplusQueuedTaskCount() < TARGET_SURPLUS) {
FileCountTask task = new FileCountTask(file, filter);
recursiveTasks.add(task);
task.fork();
}
else count += scan(file, filter);
}
}
for(int ix = recursiveTasks.size() - 1; ix >= 0; ix--) {
FileCountTask task = recursiveTasks.get(ix);
if(task.tryUnfork()) task.complete(scan(task.path, task.filter));
}
for(FileCountTask task: recursiveTasks) {
count += task.join();
}
return count;
}
}
I figured it out. Since I am adding up the results in this line:
count += countFilesRecursive(file, filter);
and using get() to receive the result, I am actually waiting for the result, instead of really parallelising the code.
This is my current code, which actually runs much faster than the single-thread code. However, I could not figure out an elegant way of knowing when the parallel method is done.
I would love to hear how I should solve that.
Here's the ugly way I am using:
public class AsynchFileCounter {
private LongAdder count;
public int countFiles(String path, String extension) {
count = new LongAdder();
ExtensionFilter filter = new ExtensionFilter(extension, true);
File f = new File(path);
countFilesRecursive(f, filter);
// ******** The way I check whether The function is done **************** //
int prev = 0;
int cur = 0;
do {
prev = cur;
try {
Thread.sleep(50);
} catch (InterruptedException e) {}
cur = (int)count.sum();
} while(cur>prev);
// ******************************************************************** //
return count.intValue();
}
private void countFilesRecursive(File f, ExtensionFilter filter) {
CompletableFuture.supplyAsync(() -> f.listFiles(filter))
.thenAcceptAsync(files -> {
for (File file : files) {
if(file.isFile())
count.increment();
else
countFilesRecursive(file, filter);
}
});
}
}
I made some changes to the code:
I use AtomicInteger to count the files instead of the LongAdder.
After reading Holger's answer, I decided to count directories being processed. When the number goes down to zero, the work is done. So I added a lock and a condition to let the main thread know when the work is done.
I added a check for whether File.listFiles() returns null. I ran the code on Windows and it never did (I had an empty directory, and it returned an empty array), but since it uses native code, it might return null on another OS.
public class AsynchFileCounter {
private AtomicInteger count;
private AtomicInteger countDirectories;
private ReentrantLock lock;
private Condition noMoreDirectories;
public int countFiles(String path, String extension) {
count = new AtomicInteger();
countDirectories = new AtomicInteger();
lock = new ReentrantLock();
noMoreDirectories = lock.newCondition();
ExtensionFilter filter = new ExtensionFilter(extension, true);
File f = new File(path);
countFilesRecursive(f, filter);
lock.lock();
try {
noMoreDirectories.await();
} catch (InterruptedException e) {}
finally {
lock.unlock();
}
return count.intValue();
}
private void countFilesRecursive(File f, ExtensionFilter filter) {
countDirectories.getAndIncrement();
CompletableFuture.supplyAsync(() -> f.listFiles(filter))
.thenAcceptAsync(files -> countFiles(filter, files));
}
private void countFiles(ExtensionFilter filter, File[] files) {
if(files != null) {
for (File file : files) {
if(file.isFile())
count.incrementAndGet();
else
countFilesRecursive(file, filter);
}
}
int currentCount = countDirectories.decrementAndGet();
if(currentCount == 0) {
lock.lock();
try {
noMoreDirectories.signal();
}
finally {
lock.unlock();
}
}
}
}
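A quick way to exercise and compare the two versions (the path and extension are placeholders, and timing with System.nanoTime is only a rough comparison, not a rigorous benchmark):

public static void main(String[] args) {
    String root = "/some/path";   // placeholder
    String ext = ".java";         // placeholder

    long t0 = System.nanoTime();
    int single = new BasicFileCounter().countFiles(root, ext);
    long t1 = System.nanoTime();
    int parallel = new AsynchFileCounter().countFiles(root, ext);
    long t2 = System.nanoTime();

    System.out.printf("single-thread: %d files in %d ms%n", single, (t1 - t0) / 1_000_000);
    System.out.printf("parallel:      %d files in %d ms%n", parallel, (t2 - t1) / 1_000_000);
}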

JUnit run threads with static variable in Java

I'm trying to run two threads inside JUnit. The following code will be invoked by several JUnit tests.
I want to stop both threads when result is not null. What should I do? The problem is that multiple JUnit tests share the same String result object and somehow this code gets blocked by a previous test: when another test invokes this method, result is assigned to null, and the previous test blocks in the while(true) loop.
static String result = null;
public static synchronized String remoteClogTailDir(final int maxRetry, String hostName,
final String identifier, final String remoteClogDirPaths, final String whichKeyValue) {
result = null;
final String[] hosts = hostName.split(",");
if(hosts != null && hosts.length == 2){
Thread t1 = null;
Thread t2 = null;
t1 = new Thread(new Runnable(){
@Override
public void run(){
String resultOfThread = null;
resultOfThread = remoteClogTailDir(maxRetry, hosts[0].trim(), identifier, null,
remoteClogDirPaths, false, whichKeyValue);
if(result == null && resultOfThread != null){
result = resultOfThread;
}
}
});
t2 = new Thread(new Runnable(){
@Override
public void run(){
String resultOfThread = null;
resultOfThread = remoteClogTailDir(maxRetry, hosts[1].trim(), identifier, null,
remoteClogDirPaths, false, whichKeyValue);
if(result == null && resultOfThread != null){
result = resultOfThread;
}
}
});
t1.start();
t2.start();
while(true){
if(result != null){
t1.interrupt();
t2.interrupt();
return result;
}
}
}else{
return remoteClogTailDir(maxRetry, hostName, identifier, null,
remoteClogDirPaths, false, whichKeyValue);
}
}
If I understand correctly, you want to execute several searches in parallel and take the first one that completes. You shouldn't use static fields for that.
You could use an ExecutorCompletionService for such tasks:
ExecutorService executor = Executors.newCachedThreadPool();
CompletionService<String> ecs = new ExecutorCompletionService<String>(executor);
List<Future<String>> futures = new ArrayList<Future<String>>();
try {
futures.add(ecs.submit(search1));
futures.add(ecs.submit(search2));
for (int i = 0; i < futures.size(); ++i) {
String result = ecs.take().get();
if (result != null) {
return result;
}
}
} finally {
for (Future<String> f : futures) {
f.cancel(true);
}
}
executor.shutdownNow();
with search1 or search2 being a simple Callable:
Callable<String> search1 = new Callable<String>() {
public String call() {
return remoteClogTailDir(...);
}
};
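Putting the pieces together (the method name firstResult and the local variable names are illustrative, not from the question), the two-host branch could be written without the static result field roughly like this:

private static String firstResult(final int maxRetry, String[] hosts, final String identifier,
        final String remoteClogDirPaths, final String whichKeyValue)
        throws InterruptedException, ExecutionException {
    ExecutorService executor = Executors.newCachedThreadPool();
    CompletionService<String> ecs = new ExecutorCompletionService<>(executor);
    List<Future<String>> futures = new ArrayList<>();
    try {
        for (final String host : hosts) {
            futures.add(ecs.submit(() -> remoteClogTailDir(
                    maxRetry, host.trim(), identifier, null, remoteClogDirPaths, false, whichKeyValue)));
        }
        for (int i = 0; i < futures.size(); i++) {
            String result = ecs.take().get();   // blocks until the next task completes
            if (result != null) {
                return result;                  // first non-null result wins
            }
        }
        return null;                            // neither host produced a result
    } finally {
        for (Future<String> f : futures) {
            f.cancel(true);                     // stop the search that is still running
        }
        executor.shutdownNow();
    }
}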

Is there a read write lock with listeners for java?

Is there a Java library that implements something that behaves like a ReadWriteLock but uses listeners or CompletableFuture/CompletionStage instead of blocking?
Ideally I'd like to write:
lock = ...
CompletionStage stage = lock.lockRead();
stage.thenAccept(r -> { doSomething(); r.release(); });
And also important:
CompletionStage stage = lock.tryLockWrite(10, TimeUnit.SECONDS);
stage.handle(callback);
I'm looking to know if something like this exists and if it does how is it called.
I'm not looking to implement this myself, but rather use a library to simplify some framework code.
I think writing it yourself shouldn't be too hard. Chances are it would take less time than looking for a library. It's pretty simple overall:
static final int STATE_UNLOCKED = 0;
static final int STATE_READING = 1;
static final int STATE_WRITING = 2;
int state = STATE_UNLOCKED;
int readers = 0;
Queue<CompletableFuture<Void>> queueWriters = new LinkedList<CompletableFuture<Void>>();
Queue<CompletableFuture<Void>> queueReaders = new LinkedList<CompletableFuture<Void>>();
public synchronized CompletionStage<Void> lockWriter() {
CompletableFuture<Void> l = new CompletableFuture<Void>();
if (state == STATE_UNLOCKED) {
state = STATE_WRITING;
l.complete(null);
return l;
}
queueWriters.offer(l);
return l;
}
public synchronized CompletionStage<Void> lockReader() {
CompletableFuture<Void> l = new CompletableFuture<Void>();
if (state != STATE_WRITING) {
state = STATE_READING;
readers++;
l.complete(null);
return l;
}
queueReaders.offer(l);
return l;
}
public void unlock() {
CompletableFuture<Void> l = null;
synchronized(this) {
if (state == STATE_READING) {
readers--;
if (readers > 0) {
return;
}
}
l = queueReaders.poll();
if (l != null) {
state = STATE_READING;
readers++;
}
else {
l = queueWriters.poll();
if (l != null) {
state = STATE_WRITING;
}
else {
state = STATE_UNLOCKED;
return;
}
}
}
l.complete(null);
while (true) {
synchronized (this) {
if (state != STATE_READING) {
return;
}
l = queueReaders.poll();
if (l == null) {
return;
}
readers++;
}
l.complete(null);
}
}
Adding timed locking (by using some sort of "expiring queue") or writer-starvation prevention (by preventing additional readers from being executed if queueWriters is not empty) to the above shouldn't be that much more difficult either.
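Usage of the sketch above would look roughly like this, assuming the fields and methods are wrapped in a class named, say, AsyncReadWriteLock (the class name and doSomething()/updateSharedState() are placeholders):

AsyncReadWriteLock lock = new AsyncReadWriteLock();

// non-blocking read: the stage completes once the read lock is held
lock.lockReader().thenAccept(ignored -> {
    try {
        doSomething();
    } finally {
        lock.unlock();   // always release, even if doSomething() throws
    }
});

// non-blocking write
lock.lockWriter().thenAccept(ignored -> {
    try {
        updateSharedState();
    } finally {
        lock.unlock();
    }
});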
