Multithreaded file download in Java

I am trying to implement multithreading using ExecutorService to download files in parallel. Below is my code:
public void downloadFiles(List<String> filenames, final String fileSavePath) {
if (filenames != null && filenames.size() > 0) {
List<Callable<Void>> jobs = new ArrayList<>();
for (final String fileName : filenames) {
jobs.add(new Callable<Void>() {
public Void call() throws Exception {
downloadFile(fileName, fileSavePath);
return null;
}
});
}
performJobs(jobs);
}
}
My requirement is that I want to return a status from this method after all the files have been downloaded successfully. I am not sure how to do this. I cannot access a variable of the inner class from the outer one.
Any advice would be appreciated.
Thanks

A Callable can return a result. When you submit a job to the executor service, you get a future back. Calling get() on it will give you back the result returned by the Callable which can very well be the status of that particular download.
In your particular example, instead of returning null, return the result of downloading the file. Another way can be to use a shared thread-safe queue between the callables and add the status to that queue (though it's a roundabout way of doing stuff). You can also use this sort of trick to "update" some status on the UI etc.

From the Javadoc of Callable:
A task that returns a result and may throw an exception. Implementors
define a single method with no arguments called call.
Taking a cue from this, change List<Callable<Void>> jobs to List<Callable<Boolean>> jobs and change the return type of your call method to match. After the task completes, you can then check the returned status.
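As a rough illustration of that change, here is a minimal sketch that collects a Boolean status per download and combines them into one result. The downloadFile stub, the pool size of 4, and the class name are assumptions made for the example; they are not the asker's real code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DownloadStatusSketch {

    // Hypothetical stand-in for the asker's downloadFile(); returns true on success.
    static boolean downloadFile(String fileName, String fileSavePath) {
        return !fileName.isEmpty();
    }

    // Returns true only if every download reported success.
    static boolean downloadFiles(List<String> fileNames, String fileSavePath)
            throws InterruptedException {
        if (fileNames == null || fileNames.isEmpty()) {
            return true; // nothing to download
        }
        List<Callable<Boolean>> jobs = new ArrayList<>();
        for (String fileName : fileNames) {
            Callable<Boolean> job = () -> downloadFile(fileName, fileSavePath);
            jobs.add(job);
        }
        ExecutorService executor = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            boolean allOk = true;
            // invokeAll blocks until every job has finished.
            for (Future<Boolean> result : executor.invokeAll(jobs)) {
                try {
                    allOk &= result.get();
                } catch (ExecutionException e) {
                    allOk = false; // the job threw instead of returning a status
                }
            }
            return allOk;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(downloadFiles(Arrays.asList("a.txt", "b.txt"), "/tmp"));
    }
}
```

Because invokeAll waits for all jobs, the method can return a single status to its caller without any shared variable between the inner and outer class.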

Use an ExecutorCompletionService.
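A minimal sketch of what that could look like, assuming a hypothetical downloadFile that throws on failure and a pool of 4 threads. An ExecutorCompletionService hands back futures in completion order, so each result can be handled as soon as it is ready.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CompletionServiceSketch {

    // Hypothetical download step; returns the file name on success, throws on failure.
    static String downloadFile(String fileName) {
        if (fileName.isEmpty()) {
            throw new IllegalArgumentException("empty file name");
        }
        return fileName;
    }

    // Counts the downloads that finished without throwing.
    static int downloadAll(List<String> fileNames) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(4); // assumed pool size
        CompletionService<String> completion = new ExecutorCompletionService<>(executor);
        try {
            for (String fileName : fileNames) {
                Callable<String> job = () -> downloadFile(fileName);
                completion.submit(job);
            }
            int succeeded = 0;
            for (int i = 0; i < fileNames.size(); i++) {
                try {
                    completion.take().get(); // take() waits for the next finished task
                    succeeded++;
                } catch (ExecutionException e) {
                    // this download failed; e.getCause() holds the reason
                }
            }
            return succeeded;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(downloadAll(Arrays.asList("a.txt", "b.txt", "c.txt")));
    }
}
```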

Related

Do something based on result of CompletableFuture without blocking thread in Java

First off, I am not really familiar with CompletableFuture. What I am trying to do is retrieve data from a database via CompletableFuture and then do something with the result. Using CompletableFuture#join/get to work with the data is blocking the thread.
CompletableFuture<IPlayerData> future = playerDataManager.getOfflinePlayerDataAsync(target);
IPlayerData result = future.join(); //blocks the thread, e.g. if the database isn't reachable
//work with the result (maybe callback?)
Note that I am trying not to run the code above on a separate thread. Is there any good way to do this without blocking? I am pretty sure that there is something wrong with my code (maybe getOfflinePlayerDataAsync?) and I really don't know how to continue.
@Override
public CompletableFuture<IPlayerData> getOfflinePlayerDataAsync(OfflinePlayer player) {
CompletableFuture<IPlayerData> future = new CompletableFuture<>();
DiscoBox.schedule(() -> future.complete(handler.loadObject(player.getUniqueId().toString()))).createAsyncTask(); //gets object from database
return future;
}
If you don't want to block the thread that is "receiving" the value, you could do one of the following:
// Get the value if it is available
if (future.isDone()) {
value = future.get();
// do something with value
}
// Get the value if it is available in the next second
try {
value = future.get(1, TimeUnit.SECONDS);
// do something with value
} catch (TimeoutException ex) {
// ho hum
}
With a CompletableFuture, there are other non-blocking alternatives; e.g.
value = future.getNow(someDefault);
if (value != someDefault) {
// do something with value
}
Note with all of the above, you are only making one attempt to get the value. If one attempt is not enough you might be tempted to do something like this:
while (!future.isDone()) {
// do something else
}
value = future.get();
// do something with value
but this blocks the thread in effect: you don't get to move past the end of the above code until the future has been completed.
If you don't care when the value is available because you don't intend to do anything with it, you can simply ignore the future.
Finally, the other way to deliver values asynchronously is using callbacks. You could provide a callback function as a parameter to your getOfflinePlayerDataAsync method. Then you could deliver the result like this:
DiscoBox.schedule(() -> callback(handler.loadObject(...))).createAsyncTask();
The callback might simply assign the returned value to some shared variable, or it could do something more complicated. The key thing is that it will execute on the async task's thread, not on the thread that calls getOfflinePlayerDataAsync.
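Since the method already returns a CompletableFuture, one way to get callback behaviour without changing its signature is to attach the callback to the future with thenAccept, which is part of the standard CompletableFuture API. The sketch below is only an illustration: loadPlayerDataAsync, the String payload, and the AtomicReference sink are hypothetical stand-ins for the asker's database call and result handling.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;

class NonBlockingFutureSketch {

    // Hypothetical stand-in for getOfflinePlayerDataAsync(); runs on a pool thread.
    static CompletableFuture<String> loadPlayerDataAsync(String playerId) {
        return CompletableFuture.supplyAsync(() -> "data-for-" + playerId);
    }

    // Registers callbacks and returns immediately; nothing here waits for the result.
    static CompletableFuture<Void> showPlayerData(String playerId, AtomicReference<String> sink) {
        return loadPlayerDataAsync(playerId)
                .thenAccept(sink::set)          // runs once the value is available
                .exceptionally(error -> {       // runs instead if loading failed
                    sink.set("unavailable");
                    return null;
                });
    }

    public static void main(String[] args) {
        AtomicReference<String> sink = new AtomicReference<>();
        CompletableFuture<Void> done = showPlayerData("steve", sink);
        done.join(); // only so the demo does not exit before the callback has run
        System.out.println(sink.get()); // prints data-for-steve
    }
}
```

As with a hand-written callback, the code passed to thenAccept normally runs on the thread that completes the future, so anything it touches needs to be safe to use from that thread.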

Runnable locked (park) using ExecutorService and BlockingQueue

Note: I understand the site rules, but I can't post all of the code (it is complex and large).
I put DIFFERENT code on GitHub (the real code is too large and is not needed here) that reproduces the problem shown in the video (the main class is joseluisbz.mock.support.TestOptimalDSP and the switching class is joseluisbz.mock.support.runnable.ProcessorDSP).
Please don't recommend another jar or external library for this code.
I wish I could be more specific, but I don't know which part to extract and show.
Before you close this question: I am willing to refine it if someone tells me where to look (technical detail).
I made a video in order to show my issue.
Even to formulate the question, I made a diagram to show the situation.
My program has a JTree showing the relations between Workers.
My diagram shows the interaction between threads, whose lifecycle is controlled with ExecutorService executorService = Executors.newCachedThreadPool(); and List<Future<?>> listFuture = Collections.synchronizedList(new ArrayList<>());
Each Runnable is started from its constructor like this: listFuture().add(executorService().submit(this)); The queues are created like this: BlockingQueue<Custom> someBlockingQueue = new LinkedBlockingQueue<>();
My diagram shows each Worker's parent, if it has one.
It also shows the write relationships between the BlockingQueues.
RunnableStopper stops the related runnables that a Worker holds as properties.
RunnableDecrementer, RunnableIncrementer and RunnableFilter each run a loop that handles every Custom received on its BlockingQueue.
For each one they create a RunnableProcessor (it has no loop, but its processing takes a long time; once the task is finished it should be collected by the GC).
Internally, the RunnableIncrementer has a map: Map<Integer, List<Custom>> mapListDelayedCustom = new HashMap<>();//Collections.synchronizedMap(new HashMap<>());
When a Custom arrives, I need to obtain the list of last received Customs: List<Custom> listDelayedCustom = mapListDelayedCustom.putIfAbsent(custom.getCode(), new ArrayList<>());
I'm controlling the size (it does not grow indefinitely).
My code stops working when I add the following lines:
if (listDelayedCustom.size() > SomeValue) {
//No operation has yet been included in if sentence
}
But with the lines commented out, it doesn't block:
//if (listDelayedCustom.size() > SomeValue) {
// //No operation has yet been included in if sentence
//}
What could be blocking my Runnable?
It makes no sense to me that adding the lines above (an if statement that only evaluates the size of a list) makes it stop working.
Any advice on how to make my question more specific?
First, the way you set thread names is wrong. You use this pattern:
public class Test
{
public static class Task implements Runnable
{
public Task()
{
Thread.currentThread().setName("Task");
}
@Override
public void run()
{
System.out.println("Task: "+Thread.currentThread().getName());
}
}
public static void main(String[] args)
{
new Thread(new Task()).start();
System.out.println("Main: "+Thread.currentThread().getName());
}
}
which gives the (undesired) result:
Main: Task
Task: Thread-0
It's incorrect because, in the Task constructor, the thread has not started yet, so you're changing the name of the calling thread, not the one of the spawned thread. You should set the name in the run() method.
As a result, the thread names in your screenshot are wrong.
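A minimal sketch of the corrected pattern, with the name set inside run() so that it applies to the spawned thread. The observedName field and the runAndGetName helper are additions made only so the effect can be checked; they are not part of the asker's code.

```java
class ThreadNameSketch {

    static class Task implements Runnable {
        // Records the name seen from inside the spawned thread.
        volatile String observedName;

        @Override
        public void run() {
            // Here currentThread() is the thread that executes the task.
            Thread.currentThread().setName("Task");
            observedName = Thread.currentThread().getName();
        }
    }

    // Starts the task on a new thread and returns the name that thread reported.
    static String runAndGetName() throws InterruptedException {
        Task task = new Task();
        Thread thread = new Thread(task);
        thread.start();
        thread.join();
        return task.observedName;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Task: " + runAndGetName());
        System.out.println("Main: " + Thread.currentThread().getName());
    }
}
```

When you create the Thread yourself, passing the name to the constructor, as in new Thread(task, "Task"), achieves the same thing. With an ExecutorService, a custom ThreadFactory is the usual place to name pool threads.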
Now the real issue. In WorkerDSPIncrement, you have this line:
List<ChunkDTO> listDelayedChunkDTO = mapListDelayedChunkDTO.putIfAbsent(chunkDTO.getPitch(), new ArrayList<>());
The documentation for putIfAbsent() says:
If the specified key is not already associated with a value (or is mapped to null) associates it with the given value and returns null, else returns the current value.
Since the map is initially empty, the first time you call putIfAbsent(), it returns null and assigns it to listDelayedChunkDTO.
Then you create a ProcessorDSP object:
ProcessorDSP processorDSP = new ProcessorDSP(controlDSP, upNodeDSP, null,
dHnCoefficients, chunkDTO, listDelayedChunkDTO, Arrays.asList(parent.getParentBlockingQueue()));
It means you pass null as the listDelayedChunkDTO parameter. So when this line executes in ProcessorDSP:
if (listDelayedChunkDTO.size() > 2) {
it throws a NullPointerException and the runnable stops.
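One possible fix, sketched here under the assumption that the intent is "get the list for this key, creating it if needed", is Map.computeIfAbsent, which returns the value now associated with the key rather than the previous one. The String element type and the method names are placeholders for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ComputeIfAbsentSketch {

    // putIfAbsent returns the PREVIOUS value, which is null for a new key.
    static List<String> viaPutIfAbsent(Map<Integer, List<String>> map, int key) {
        return map.putIfAbsent(key, new ArrayList<>());
    }

    // computeIfAbsent returns the value that is now associated with the key.
    static List<String> viaComputeIfAbsent(Map<Integer, List<String>> map, int key) {
        return map.computeIfAbsent(key, k -> new ArrayList<>());
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> map = new HashMap<>();
        System.out.println(viaPutIfAbsent(map, 1));     // prints null
        System.out.println(viaComputeIfAbsent(map, 2)); // prints []
    }
}
```

Note that a task submitted to an ExecutorService that dies from an uncaught exception does not print a stack trace by itself; the exception is stored in the Future and only surfaces when get() is called, which is why the runnable appeared to stop silently.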

Single that emits a value that passed function returned

I want a Single that calls a certain function and then completes with the value that function has returned.
The following is something similar:
Single.fromCallable(this::func);
The problem with that is that it calls this::func every time a subscriber is added. So if this::func counts its calls, the third subscriber would receive the count from the third call.
I see that as a problem because, what if this::func were a long-running operation?
And I don't get it, does that mean that Single::onComplete has been called twice? Which I thought it was impossible, and it doesn't make sense, because, how can something complete twice?
And since I'm an Android programmer Single::fromFuture doesn't work here, is there some alternative to it?
I will demonstrate my problem with the following example:
class SingleFromCallableTest {
int funcCalls = 0;
int func(){
return funcCalls++;
}
@Test
public void run(){
Single<Integer> source = Single.fromCallable(this::func);
source.subscribe(System.out::println); // prints 0 (funcCalls++ returns the value before incrementing)
source.subscribe(System.out::println); // prints 1
}
}
IMO, the second subscription shouldn't have triggered another call, because a Single should succeed just once.
Just like with SingleSubject, once you call onSuccess, it cannot succeed again.
If Single.fromCallable worked the way I think it should, then, in the previous example, source could have completed even before the first subscriber subscribed, which means that only the following way of subscribing would make sense:
Single.fromCallable(this::func).subscribe(System.out::println);
But actually, maybe even then it's possible to miss a value emitted by the Single; maybe that's why this is not possible.
The method #fromCallable creates a cold Single: the callable is not invoked when the Single is created, but each time a subscriber subscribes. Therefore the function will be invoked for every subscriber, and each subscriber receives its own onSuccess. If you want to cache the value, use the #cache operator. Please have a look at the two tests provided.
The test 'notCached' will invoke the function for each subscription. The test 'cached' will invoke the function only once. If you want to share the result, create the Single with Single#fromCallable followed by #cache, and re-use that instance.
Environment
dependencies {
compile 'io.reactivex.rxjava2:rxjava:2.1.6'
compile 'org.mockito:mockito-core:2.11.0'
testCompile("org.junit.jupiter:junit-jupiter-api:5.0.0")
testRuntime("org.junit.jupiter:junit-jupiter-engine:5.0.0")
}
Tests
@Test
void notCached() throws Exception {
Callable<Integer> mock = mock(Callable.class);
when(mock.call()).thenReturn(10);
Single<Integer> integerSingle = Single.fromCallable(mock);
Disposable subscribe1 = integerSingle.subscribe();
Disposable subscribe2 = integerSingle.subscribe();
verify(mock, times(2)).call();
}
@Test
void cached() throws Exception {
Callable<Integer> mock = mock(Callable.class);
when(mock.call()).thenReturn(10);
Single<Integer> integerSingle = Single.fromCallable(mock).cache();
Disposable subscribe1 = integerSingle.subscribe();
Disposable subscribe2 = integerSingle.subscribe();
Disposable subscribe3 = integerSingle.subscribe();
Disposable subscribe4 = integerSingle.subscribe();
verify(mock, times(1)).call();
}

Running a method using multithread in java

I have method as
public List<SenderResponse> sendAllFiles(String folderName) {
List<File> allFiles = getListOfFiles();
List<SenderResponse> finalResponse = new ArrayList<SenderResponse>();
for (File file : allFiles) {
finalResponse.getResults().add(sendSingleFile(file));
}
return finalResponse;
}
which runs on a single thread. I want to run sendSingleFile(file) on multiple threads so I can reduce the total time taken to send the files.
How can I run sendSingleFile(file) on multiple threads for the various files and get the final response?
I found a few articles using ThreadPoolExecutor. But how do I handle the response returned by sendSingleFile(file) and add it to one final SenderResponse?
I am kind of new to multithreading. Please suggest the best way to process these files.
Define an executor service
ExecutorService executor = Executors.newFixedThreadPool(MAX_THREAD); //Define integer value of MAX_THREAD
Then for each job you can do something like this:-
Callable<SenderResponse> task = () -> {
try {
return sendSingleFile(file);
}
catch (InterruptedException e) {
throw new IllegalStateException("Interrupted", e);
}
};
Future<SenderResponse> future = executor.submit(task);
future.get(MAX_TIME_TO_WAIT, TimeUnit.SECONDS); //Blocking call. MAX_TIME_TO_WAIT is max time future will wait for the process to execute.
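Note that calling get() right after each submit() inside the loop would make the files go out one at a time again. A minimal sketch of the whole method, assuming a hypothetical sendSingleFile that returns a String in place of SenderResponse and a pool of 4 threads, submits every task first and only then collects the results:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelSendSketch {

    // Hypothetical stand-in for sendSingleFile(file); String replaces SenderResponse.
    static String sendSingleFile(String file) {
        return "sent:" + file;
    }

    static List<String> sendAllFiles(List<String> files)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(4); // assumed pool size
        try {
            // Phase 1: submit everything so the tasks can run concurrently.
            List<Future<String>> futures = new ArrayList<>();
            for (String file : files) {
                Callable<String> task = () -> sendSingleFile(file);
                futures.add(executor.submit(task));
            }
            // Phase 2: collect results; the list keeps the order of the input files.
            List<String> responses = new ArrayList<>();
            for (Future<String> future : futures) {
                responses.add(future.get());
            }
            return responses;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendAllFiles(Arrays.asList("a.txt", "b.txt")));
        // prints [sent:a.txt, sent:b.txt]
    }
}
```

Only the calling thread touches the result list here, so a plain ArrayList is sufficient.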
You start by writing code that works for the single-threaded solution. The code you posted wouldn't even compile: the method signature says to return SenderResponse, whereas you use and return a List<SenderResponse> within the method!
When that stuff works, you continue with this:
You create an instance of ExecutorService, with as many threads as you want.
You submit tasks to that service.
Each task knows about the result list object. The task does its work, and adds the result to that result list.
The one point to be careful about: making sure that add() is synchronized somehow - having multiple threads update an ordinary ArrayList is not safe.
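The steps above can be sketched as follows, using Collections.synchronizedList as one way to make add() safe. The sendSingleFile stub, the String result type, the pool size, and the one-minute timeout are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SharedResultListSketch {

    // Hypothetical stand-in for sendSingleFile(file); String replaces SenderResponse.
    static String sendSingleFile(String file) {
        return "sent:" + file;
    }

    static List<String> sendAllFiles(List<String> files) throws InterruptedException {
        // Every add() on this wrapper is synchronized, so tasks can share it.
        List<String> responses = Collections.synchronizedList(new ArrayList<>());
        ExecutorService executor = Executors.newFixedThreadPool(4); // assumed pool size
        for (String file : files) {
            executor.execute(() -> responses.add(sendSingleFile(file)));
        }
        executor.shutdown(); // accept no new tasks, let the submitted ones finish
        if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("uploads did not finish in time");
        }
        return responses;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(sendAllFiles(Arrays.asList("a.txt", "b.txt")).size());
    }
}
```

With this approach the responses arrive in completion order, not in the order of the input files.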
For your situation, I would use a work stealing pool (ForkJoin executor service) and submit "jobs" to it. If you're using guava, you can wrap that in a listeningDecorator which will allow you to add a listener on the futures it returns.
Example:
// create the executor service
ListeningExecutorService exec = MoreExecutors.listeningDecorator(Executors.newWorkStealingPool());
for(Foo foo : bar) {
// submit can accept Runnable or Callable<T>
final ListenableFuture<T> future = exec.submit(() -> doSomethingWith(foo));
// Run something when it is complete.
future.addListener(() -> doSomeStuff(future), exec);
}
Note that the listener will be called whether the future was successful or not.

How to synchronise asynchronous methods in RxJava? Async Waterfall in RxJava

With RxJava, how to sequentially execute asynchronous methods on a list of data?
Using the Node.js Async module it is possible to call a function over an array of objects sequentially in the following fashion:
var dataArray = ['file1','file2','file3']; // array of arguments
var processor = function(filePath, callback) { // function to call on dataArray's arguments
fs.access(filePath, function(err) { // perform an async operation
// process next item in dataArray only after the following line is called
callback(null, !err)
});
};
async.every(dataArray, processor, function(err, result) {
// process results
});
What's nice about this is that code executed within the processor function can be asynchronous and the callback can be run only once the async task is finished. This means that each object in dataArray will be processed one after another, not in parallel.
Now, in RxJava I can call a processing function over dataArray by calling:
String[] dataArray = new String[] {"file1", "file2", "file3"};
Observable.fromArray(dataArray).subscribe(new Consumer<String>() { // in RxJava 1 it's Action1
@Override
public void accept(@NonNull String filePath) throws Exception {
//perform operation on filePath
}
});
However, if I perform an asynchronous operation on filePath, how can I ensure the sequential execution on items of dataArray? What I'd be looking for is something along the lines of:
String[] dataArray = new String[] {"file1", "file2", "file3"};
Observable.fromArray(dataArray).subscribe(new Consumer<String>() {
// process next item in dataArray only after the callback is called
@Override
public void accept(@NonNull String filePath, CallNext callback) throws Exception {
// SomeDatabase will call callback once
// the asynchronous someAsyncOperation finishes
SomeDatabase.someAsyncOperation(filePath, callback);
}
});
And furthermore, how do I call some code once all items of dataArray have been processed? Sort of a 'completion listener'?
Note: I realise I'm probably getting the RX concept wrong. I'm asking since I can't find any guideline on how to implement the Node.js' Async pattern I mentioned above using RxJava. Also, I'm working with Android hence no lambda functions.
After a couple of days of trial and error, the solution I used for this question is as follows. Any suggestions on improvement are most welcome!
private Observable<File> makeObservable(final String filePath){
return Observable.create(new ObservableOnSubscribe<File>() {
@Override
public void subscribe(final ObservableEmitter<File> e) throws Exception {
// someAsyncOperation will call the Callback.onResult after
// having finished the asynchronous operation.
// Callback<File> is an abstract code I put together for this example
SomeDatabase.someAsyncOperation(filePath, new Callback<File>(){
@Override
public void onResult(Error error, File file){
if (error != null){
e.onError(error);
} else {
e.onNext(file);
e.onComplete();
}
}
});
}
});
}
private void processFilePaths(ArrayList<String> dataArray){
ArrayList<Observable<File>> observables = new ArrayList<>();
for (String filePath : dataArray){
observables.add(makeObservable(filePath));
}
Observable.concat(observables)
.toList()
.subscribe(new Consumer<List<File>>() {
@Override
public void accept(@NonNull List<File> files) throws Exception {
// process results.
// Files are now sequentially processed using the asynchronous code.
}
});
}
In short what's happening:
Turn dataArray into an array of Observable's
Each Observable performs the asynchronous operation at the time of subscription and feeds the data to onNext after the async operation finishes
Use Observable.concat() on the array of Observable's to ensure sequential execution
Use .toList() to combine the results of Observable's into one List
While this code does what I require, I'm not entirely satisfied with this solution for a few reasons:
I'm not sure if executing the asynchronous code within the Observable.create()...subscribe(){ is the right way to use Observable's.
I read that using Observable.create should be used with caution (ref docs)
I'm not sure the behaviour I'm trying to achieve requires creating a new Observable for each item in the dataArray. It seems more natural to me to have one observable that is capable of releasing data sequentially - but again, that may be just my old style of thinking.
I've read that using concatMap should do the trick, but I could never figure out how to apply it to this example, and found concat doing just the trick. Anyone caring to explain the difference between the two?
I also tried using .zip function on the array of Observable's before, and while I managed to get the List of results at the end, the async operations were executed in parallel not sequentially.
One very simple and easy way to perform asynchronous operations (calculations etc.) is to use the RxJava Async Utils library (not written by me, just used in our code).
Easiest way to run a method in a background thread is to use Async.start(this::methodToCall, Schedulers.io()) which returns an Observable<T> that produces the return value of the passed method after the method call completes.
There are many other alternative methods that allow you to e.g. report intermediate results via an Observer<T>
If you want to use Observable.from() and concatMap() (or similar ways), remember to first move the processing to a background scheduler with .observeOn(Schedulers.io()) or any other scheduler you want.
To call a function over an array of objects sequentially, you do not need any async facilities. Just make a for-loop over the array.
If you need that loop to execute asynchronously, then please describe what asynchrony you need: either to start the loop after array is asynchronously filled, or you want to react asynchronously to the result of the computation, or both. In either case, class CompletableFuture may help.
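As one illustration of how CompletableFuture could cover the "sequential but asynchronous" case from the question, the sketch below chains one asynchronous step per item with thenCompose, so each step starts only after the previous one has completed, and the returned future acts as the completion listener. The processAsync operation and the order log are hypothetical stand-ins for SomeDatabase.someAsyncOperation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class SequentialAsyncSketch {

    // Hypothetical async operation; records the order in which items are started.
    static CompletableFuture<String> processAsync(String filePath, List<String> startOrder) {
        return CompletableFuture.supplyAsync(() -> {
            startOrder.add(filePath);
            return filePath.toUpperCase();
        });
    }

    // Chains the operations so that item n+1 starts only after item n has completed.
    static CompletableFuture<List<String>> processSequentially(
            List<String> filePaths, List<String> startOrder) {
        CompletableFuture<List<String>> chain =
                CompletableFuture.completedFuture(new ArrayList<>());
        for (String filePath : filePaths) {
            chain = chain.thenCompose(results ->
                    processAsync(filePath, startOrder).thenApply(result -> {
                        results.add(result);
                        return results;
                    }));
        }
        return chain; // completes once every item has been processed
    }

    public static void main(String[] args) {
        List<String> startOrder = Collections.synchronizedList(new ArrayList<>());
        processSequentially(Arrays.asList("file1", "file2", "file3"), startOrder)
                .thenAccept(results -> System.out.println("all done: " + results))
                .join(); // only so the demo waits before exiting
    }
}
```

Lambdas are used here for brevity; on a toolchain without lambda support each one can be written as an anonymous Function or Supplier instead.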
