Using SwingWorker publish efficiently - java

I am using SwingWorker to query a server process for a large number of "result" objects on a background thread. As individual results arrive I want to publish them and display them on the GUI.
My question is: Given that I will be receiving potentially thousands of results, is it more efficient to call publish(V... chunks) once for every N results, or should I just call publish for each result received?
I see that the documentation states that multiple calls to publish will be coalesced into a single call to process, but I wasn't sure if it was still better to retain some form of control in my own code by throttling when I call publish. What do people recommend?

I say do the simplest thing that works: leave it to the Swing API to perform the throttling, and if you run into problems later on it will be an easy fix to add additional throttling yourself at that time (plus you'll have the justification for doing so).
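A minimal sketch of that simplest approach, publishing each result individually and letting Swing coalesce the chunks; String stands in for your result type and fetchNextResult() is a placeholder for the real server call:

import java.util.List;
import javax.swing.DefaultListModel;
import javax.swing.SwingWorker;

class ResultWorker extends SwingWorker<Void, String> {

    private final DefaultListModel<String> model;

    ResultWorker(DefaultListModel<String> model) {
        this.model = model;
    }

    @Override
    protected Void doInBackground() {
        String result;
        while ((result = fetchNextResult()) != null && !isCancelled()) {
            publish(result);                  // one call per result; Swing coalesces them
        }
        return null;
    }

    @Override
    protected void process(List<String> chunk) {
        // Runs on the EDT; a single call may deliver many coalesced results.
        for (String result : chunk) {
            model.addElement(result);
        }
    }

    private String fetchNextResult() {
        return null;                          // placeholder: ask the server for the next result
    }
}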

Related

Are Co-Routines in Android Development only for Kotlin? Retrieval from android room using MVVM, way to get id returned to activity immediately?

This is a general question regarding android development and the use of co-routines. I am relatively new to developing in android and have created an application using the MVVM architecture model.
I am currently having a problem where I insert into a table and retrieve an ID back in an observer with LiveData.
I then need to use this ID immediately to insert into another table to act as a foreign key.
One table defines the entry and the other the fields associated to that entry.
My issue is that the insertion of the initial ID is happening in the background, so by the time the ID is returned to the activity an error has already been thrown up.
I need some way of either:
waiting for the ID to be returned, or
having the insertion run in the foreground (but I am unsure how to do this).
I have seen one solution is to use co-routines but this seems to just be a Kotlin solution.
Does anyone know of a solution that would work in android java to immediately retrieve the ID of insertion in the activity to use for the next insert?
*I am using a Room SQL database.
Ok, correct me if I'm wrong, but what I think you want is a way to chain asynchronous operations together in a synchronous way.
So you have one operation which needs to insert into a table asynchronously, and another operation which needs to use the id from the result of the first operation to insert into another table.
So your second operation requires the first operation to have finished before it runs. But your first operation is running in the background, so the question arises: "How do I make sure not to fire the second operation until the first one has finished?"
This is the concept of "chaining" asynchronous calls. Or, in other words, performing asynchronous calls in a synchronous fashion.
Because you need to use Java you won't be able to use Kotlin coroutines (because that's a Kotlin language feature). Fortunately, there are several methods for achieving this in Java.
I personally would recommend the use of RxJava. There are loads of operators for combining asynchronous operations. The one you'd probably want for this use case is called flatMap, an operator that waits for the first operation's result before invoking the second operation, with the result of the first one as its argument.
However, RxJava is quite a big dependency to add and also has quite a learning curve. So, choosing to use this tool will depend on how prevalent this kind of problem is in your code base.
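A rough sketch of what that could look like with RxJava 3 (plus RxAndroid); entryDao.insertEntry() returning the new row id and fieldDao.insertField() are made-up names standing in for your own Room DAO methods:

import io.reactivex.rxjava3.android.schedulers.AndroidSchedulers;
import io.reactivex.rxjava3.core.Single;
import io.reactivex.rxjava3.schedulers.Schedulers;

// insertEntry() is assumed to be a blocking @Insert DAO method returning the new row id;
// insertField() is the second insert that needs that id as a foreign key.
Single.fromCallable(() -> entryDao.insertEntry(entry))
        .flatMap(id -> Single.fromCallable(() -> fieldDao.insertField(id, fields)))
        .subscribeOn(Schedulers.io())                       // both inserts run off the main thread
        .observeOn(AndroidSchedulers.mainThread())
        .subscribe(
                result -> { /* both inserts finished; safe to continue in the Activity */ },
                error -> { /* handle the failure */ });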
Another option is to set up a shared single-thread executor which would be used to issue both operations on the same background thread. Because it is a single background thread, as long as you issue the commands to the executor sequentially, they will execute sequentially, but on a background thread. So, assuming your Room DB functions are blocking (i.e. when you issue them, the current thread waits for the operation to complete), you can have a chained operation like so:
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

// Create a shared single-threaded executor to run both operations on the same background thread
private final Executor sharedSingleThreadExecutor = Executors.newSingleThreadExecutor();

private void doThingAThenThingB() {
    // Sequentially call thing A on the shared background thread
    sharedSingleThreadExecutor.execute(() -> {
        // Do thing A
        doThingA();
    });

    // Sequentially call thing B on the shared background thread; because the
    // executor is single-threaded, this only runs once thing A has finished
    sharedSingleThreadExecutor.execute(() -> {
        // Do thing B
        doThingB();
    });
}

An alternative to Thread.sleep() when using asynchronous servlets?

What I'm trying to do is to wait for all requests from users on a specific webpage to come in (they are coming in at about the same time), then process them in a servlet, checking which requests send the correct value for some parameter, and output the final result to all the users. I'm using AsyncContexts for this, and I am currently using Thread.sleep(1000) for each user's request, so that the final result is output only when the data is completely collected. However, I've read that Thread.sleep(1000) is very inefficient to use in web apps, and was wondering if you could suggest some other way of ensuring all the data is collected before the results are output. I could provide code if necessary, however, it is a bit messy.
It seems like you know the number of requests that will arrive at your servlet. In that case you may use a CountDownLatch, a Semaphore, or any blocking collection provided in the Java 5 concurrency API.
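A rough sketch of the CountDownLatch route with asynchronous servlets; the /collect path, the fixed EXPECTED_REQUESTS count and the "combined result" payload are placeholders for your own setup:

import java.io.IOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import javax.servlet.AsyncContext;
import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@WebServlet(urlPatterns = "/collect", asyncSupported = true)
public class CollectServlet extends HttpServlet {

    private static final int EXPECTED_REQUESTS = 10;   // assumed: you know the request count up front

    private final List<AsyncContext> pending = new CopyOnWriteArrayList<>();
    private final CountDownLatch latch = new CountDownLatch(EXPECTED_REQUESTS);

    @Override
    public void init() {
        // Single waiter thread: blocks until every expected request has arrived,
        // then writes the combined result to all of them. No Thread.sleep polling.
        new Thread(() -> {
            try {
                latch.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            for (AsyncContext ctx : pending) {
                try {
                    ctx.getResponse().getWriter().println("combined result");
                } catch (IOException ignored) {
                }
                ctx.complete();
            }
        }).start();
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Park the request without tying up the container thread.
        AsyncContext ctx = req.startAsync();
        ctx.setTimeout(30_000);          // safety net in case some requests never arrive
        pending.add(ctx);
        latch.countDown();               // releases the waiter once the last request arrives
    }
}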

Showing a state of another thread in GUI

I have a GUI and the GUI is starting another thread (Java). This thread is starting a class which is crawling many websites. Now I want to show in the GUI how many websites have been crawled and how many are left.
I wonder what's the best solution for that.
My first idea was to start a timer in the GUI and periodically ask the crawler how many are left. But I guess this is quite dirty...
Alternatively one could pass the GUI to the crawler and have it call a GUI method every time the count of finished websites changes. But I don't think that's much better?
What is the best way to do something like that?
It depends.
Asking the crawler how much work it has done isn't a bad idea. The benefit is you can actually control when an update occurs and can balance out the load.
The downside is that the information may go stale very quickly and you may never get accurate results, as by the time you've read the values, the crawler may have already changed them.
You could have the crawler provide a callback interface, which the GUI registers with, and when the crawler updates its state it calls back to the GUI.
The problem here is the UI may become swamped with results, causing it to lag as it tries to keep up. Equally, while the crawler is firing these notifications, it isn't doing its work...
(Assuming Swing)
In either case, you need to make sure that any updates you make to the UI are made from within the Event Dispatching Thread. This means that if you use the callback method, the updates will come from the crawler's thread context. You will need to resync these with the EDT.
In this case you could simply use a SwingWorker which provides mechanisms for syncing updates back to the EDT for you.
Check out Concurrency in Swing for more details
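A rough sketch of the callback route; CrawlerListener is a made-up interface, and SwingUtilities.invokeLater does the resync with the EDT mentioned above:

import javax.swing.JLabel;
import javax.swing.SwingUtilities;

// The crawler calls progressChanged() from its own thread; invokeLater hops back to the EDT.
interface CrawlerListener {
    void progressChanged(int crawled, int total);
}

class CrawlerProgressLabel extends JLabel implements CrawlerListener {

    @Override
    public void progressChanged(int crawled, int total) {
        // Called on the crawler's thread; re-dispatch the UI update onto the EDT.
        SwingUtilities.invokeLater(() -> setText(crawled + " / " + total + " sites crawled"));
    }
}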
Register a callback function with your thread. When your data is dirty, invoke this callback function to notify the GUI thread to update. Don't forget to use synchronization.

AsyncTasks are too slow for several simultaneous networking operations

I'm developing an app which must interact heavily with the server. So the user inputs a name and password, and after authorization the following tasks must be performed:
The app has to fetch all incoming and outgoing messages for this user and load them into the SQLite database.
Fetch all the user's friends (JSON with id, names, contact_data) and also load them into the app's database.
Jump to the next activity and display incoming messages from the local database.
The problem is that these operations are too slow, and when the app starts the new activity there is nothing to fetch from the database: the AsyncTasks have not completed yet. I'm forced to use AsyncTask.get() in order to wait for them all to complete, but this takes over 16 seconds! So what should I do: use threads, or hold the fetched data in memory before loading it into the database and display it in the new activity instead of fetching it from the database? But even without the database tasks, the other fetching tasks take nearly 10 seconds! So what should I do?
OK, a couple of things are going pretty wrong here.
Do not use AsyncTasks for networking. Use a service. In short, this is because your AsyncTask will stop as soon as the Activity that started it stops. This means that network requests get aborted easily, data gets lost, and the work has to start over again when the Activity is opened again.
Do not use .get() on AsyncTasks. This makes the UI thread wait for the task to complete, making the whole AsyncTask idea kinda useless. In other words: This blocks your UI.
What you should do:
Read up on using services. You can also have a look at a great opensource library called RoboSpice to help you with this.
Stop using .get() on AsyncTasks; if you want to know when a task is done, just use a listener (see the sketch after this list).
Execute AsyncTasks on a thread pool ( myTask.executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR); ) when possible.
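A bare-bones sketch of the listener idea; FetchListener and fetchMessages() are made up for illustration:

import android.os.AsyncTask;
import java.util.List;

class FetchMessagesTask extends AsyncTask<Void, Void, List<String>> {

    interface FetchListener {
        void onFetched(List<String> messages);
    }

    private final FetchListener listener;

    FetchMessagesTask(FetchListener listener) {
        this.listener = listener;
    }

    @Override
    protected List<String> doInBackground(Void... params) {
        return fetchMessages();               // placeholder for the network call
    }

    @Override
    protected void onPostExecute(List<String> messages) {
        listener.onFetched(messages);         // runs on the UI thread, no get() needed
    }

    private List<String> fetchMessages() {
        return java.util.Collections.emptyList();
    }
}

// Usage (showMessages is your own UI method):
// new FetchMessagesTask(messages -> showMessages(messages))
//         .executeOnExecutor(AsyncTask.THREAD_POOL_EXECUTOR);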
You should use a Service. That way it can always complete the tasks it was doing, and you can finish all of your work. Besides that, you should initialize the app once and after that only update the data; that can't take 10 seconds, otherwise you have another problem. The nice thing about a Service is that it can run in the background. See: Services in Android Tutorial.
== Edit
Also take a look at GreenDao. This library provides fast SQLite operations without the large setup!
AsyncTasks are not meant to run several small tasks concurrently. Quoting the docs:
When first introduced, AsyncTasks were executed serially on a single background thread. Starting with DONUT, this was changed to a pool of threads allowing multiple tasks to operate in parallel. Starting with HONEYCOMB, tasks are executed on a single thread to avoid common application errors caused by parallel execution.
Use Threads in a ThreadPool when you want to run multiple tasks concurrently.
How you want to handle this situation is up to you. When the background tasks take too long, you can always show an alert dialog to the user and then take them to the activity once the data has been populated. Many apps show a 'Loading' screen when this happens. You can also show the 'Loading' Spinner control if no data is available yet. Never show a blank screen.
If the server-side calls are under your control, employ some sort of caching to speed things up. Any API call that lasts more than a second will make for an impatient user. If not, employ one of the techniques mentioned in the previous paragraph. #Perception's technique is also one to consider if you can do it.

Patterns/Principles for thread-safe queues and "master/worker" program in Java

I have a problem which I believe is the classic master/worker pattern, and I'm seeking advice on implementation. Here's what I currently am thinking about the problem:
There's a global "queue" of some sort, and it is a central place where "the work to be done" is kept. Presumably this queue will be managed by a kind of "master" object. Threads will be spawned to go find work to do, and when they find work to do, they'll tell the master thing (whatever that is) to "add this to the queue of work to be done".
The master, perhaps on an interval, will spawn other threads that actually perform the work to be done. Once a thread completes its work, I'd like it to notify the master that the work is finished. Then, the master can remove this work from the queue.
I've done a fair amount of thread programming in Java in the past, but it's all been prior to JDK 1.5 and consequently I am not familiar with the appropriate new APIs for handling this case. I understand that JDK7 will have fork-join, and that that might be a solution for me, but I am not able to use an early-access product in this project.
The problems, as I see them, are:
1) how to have the "threads doing the work" communicate back to the master telling them that their work is complete and that the master can now remove the work from the queue
2) how to efficiently have the master guarantee that work is only ever scheduled once. For example, let's say this queue has a million items, and it wants to tell a worker to "go do these 100 things". What's the most efficient way of guaranteeing that when it schedules work to the next worker, it gets "the next 100 things" and not "the 100 things I've already scheduled"?
3) choosing an appropriate data structure for the queue. My thinking here is that the "threads finding work to do" could potentially find the same work to do more than once, and they'd send a message to the master saying "here's work", and the master would realize that the work has already been scheduled and consequently should ignore the message. I want to ensure that I choose the right data structure such that this computation is as cheap as possible.
Traditionally, I would have done this in a database, in sort of a finite-state-machine manner, working "tasks" through from start to complete. However, in this problem, I don't want to use a database because of the high volume and volatility of the queue. In addition, I'd like to keep this as light-weight as possible. I don't want to use any app server if that can be avoided.
It is quite likely that this problem I'm describing is a common problem with a well-known name and accepted set of solutions, but I, with my lowly non-CS degree, do not know what this is called (i.e. please be gentle).
Thanks for any and all pointers.
As far as I understand your requirements, you need an ExecutorService. ExecutorService has a submit(Callable task) method whose return value is a Future. A Future is a blocking way to communicate back from the worker to the master, and you could easily expand this mechanism to work in an asynchronous manner. And yes, an ExecutorService such as ThreadPoolExecutor also maintains a work queue, so in most cases you don't need to bother with scheduling yourself. The java.util.concurrent package already has efficient implementations of thread-safe queues (ConcurrentLinkedQueue, which is non-blocking, and LinkedBlockingQueue, which is blocking).
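A small sketch of that mechanism; the string work items are placeholders for your real tasks:

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class Master {

    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(4);

        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            final int item = i;
            // submit() queues the task; the pool's internal work queue does the scheduling
            results.add(workers.submit(() -> "processed item " + item));
        }

        // Future.get() blocks until that worker has finished its task
        for (Future<String> result : results) {
            System.out.println(result.get());
        }

        workers.shutdown();
    }
}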
Check out java.util.concurrent in the Java library.
Depending on your application it might be as simple as cobbling together some blocking queue and a ThreadPoolExecutor.
Also, the book Java Concurrency in Practice by Brian Goetz might be helpful.
First, why do you want to hold the items after a worker has started doing them? Normally, you would have a queue of work and a worker would take items out of this queue. This would also solve the "how can I prevent workers from getting the same item?" problem.
To your questions:
1) how to have the "threads doing the work" communicate back to the master telling them that their work is complete and that the master can now remove the work from the queue
The master could listen to the workers using the listener/observer pattern.
2) how to efficiently have the master guarantee that work is only ever scheduled once. For example, let's say this queue has a million items, and it wants to tell a worker to "go do these 100 things". What's the most efficient way of guaranteeing that when it schedules work to the next worker, it gets "the next 100 things" and not "the 100 things I've already scheduled"?
See above. I would let the workers pull the items out of the queue.
3) choosing an appropriate data structure for the queue. My thinking here is that the "threads finding work to do" could potentially find the same work to do more than once, and they'd send a message to the master saying "here's work", and the master would realize that the work has already been scheduled and consequently should ignore the message. I want to ensure that I choose the right data structure such that this computation is as cheap as possible.
There have been implementations of a blocking queue in the JDK since Java 5.
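A minimal sketch of workers pulling from such a queue; the string work items and the STOP poison pill are placeholders:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class QueueDemo {

    private static final String STOP = "STOP";

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        Runnable worker = () -> {
            try {
                while (true) {
                    String item = queue.take();       // blocks until work is available
                    if (STOP.equals(item)) {
                        return;
                    }
                    System.out.println(Thread.currentThread().getName() + " handled " + item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        Thread w1 = new Thread(worker, "worker-1");
        Thread w2 = new Thread(worker, "worker-2");
        w1.start();
        w2.start();

        // The "threads finding work" just put items on the queue; duplicates
        // can be filtered out by the master before this point.
        for (int i = 0; i < 6; i++) {
            queue.put("task-" + i);
        }
        queue.put(STOP);
        queue.put(STOP);

        w1.join();
        w2.join();
    }
}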
Don't forget Jini and Javaspaces. What you're describing sounds very like the classic producer/consumer pattern that space-based architectures excel at.
A producer will write the jobs into the space. One or more consumers will take jobs out (under a transaction) and work on them in parallel, and then write the results back. Since it's under a transaction, if a problem occurs the job is made available again for another consumer.
You can scale this trivially by adding more consumers. This works especially well when the consumers are separate VMs and you scale across the network.
If you are open to the idea of Spring, then check out their Spring Integration project. It gives you all the queue/thread-pool boilerplate out of the box and leaves you to focus on the business logic. Configuration is kept to a minimum using #annotations.
BTW, the Goetz book is very good.
This doesn't sound like a master-worker problem, but a specialized client on top of a thread pool. Given that you have a lot of scavenging threads and not a lot of processing units, it may be worthwhile simply doing a scavenging pass and then a computing pass. By storing the work items in a Set, the uniqueness constraint will remove duplicates. The second pass can submit all of the work to an ExecutorService to perform the processing in parallel.
A master-worker model generally assumes that the data provider has all of the work and supplies it to the master to manage. The master controls the work execution and deals with distributed computation, time-outs, failures, retries, etc. A fork-join abstraction is a recursive rather than iterative data provider. A map-reduce abstraction is a multi-step master-worker that is useful in certain scenarios.
A good example of master-worker is trivially parallel problems, such as finding prime numbers. Another is a data load where each entry is independent (validate, transform, stage). The need to process a known working set, handle failures, etc. is what makes a master-worker model different from a thread pool. This is why a master must be in control and push the work units out, whereas a thread pool allows workers to pull work from a shared queue.
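A rough sketch of that two-pass idea; the string work items stand in for whatever the scavenging threads find:

import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class TwoPass {

    public static void main(String[] args) throws InterruptedException {
        // Pass 1: scavenging threads add items; the Set silently drops duplicates.
        Set<String> workItems = ConcurrentHashMap.newKeySet();
        workItems.add("item-1");
        workItems.add("item-2");
        workItems.add("item-1");                         // duplicate, ignored

        // Pass 2: process everything in parallel.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        pool.invokeAll(workItems.stream()
                .map(item -> (Callable<Void>) () -> {
                    System.out.println("processing " + item);
                    return null;
                })
                .collect(Collectors.toList()));
        pool.shutdown();
    }
}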
