Android: cancellable Java Collection sort in AsyncTask.doInBackground


I am writing an application for Android mobile phones.
I have a java.util.ArrayList that contains objects which require a custom java.util.Comparator to be sorted properly.
I know I can use java.util.Collections.sort() but the amount of data is such that I want to sort my ArrayList in an android.os.AsyncTask.
I do not want to use several AsyncTask objects that would each sort a subset of the data.
Any AsyncTask can be cancelled so I want to regularly call AsyncTask.isCancelled() while I sort. If it returns true, I give up on sorting (and on my whole data set).
I Googled but could not find an AsyncTask-friendly way to sort while regularly checking for cancellation.
I may be able to call isCancelled() in my implementation of java.util.Comparator.compare() and throw my own subclass of java.lang.RuntimeException if it returns true. Then try{java.util.Collections.sort(ArrayList, Comparator);} catch () {} for that specific exception. I don't feel entirely comfortable with that approach.
Alternatively, I can use an intermediary java.util.TreeSet and write 2 loops that each check for cancellation before every iteration. The first loop would add all the items of the ArrayList to the TreeSet (and my Comparator implementation keeps them sorted on insertion). The second loop would add all the objects in the TreeSet back into the ArrayList, in the correct order, thanks to the TreeSet's ordered java.util.Iterator. This approach uses some extra memory but will assuredly work.
The last approach I can think of is to implement myself what I have actually been looking for in Android: A generic java.util.List (preferably quick)sort using a java.util.Comparator and an android.os.AsyncTask that regularly checks for cancellation.
Has anybody found that?
Do you have any other solution to this problem?
EDIT:
Although I haven't given any thought to what the sorting method signature would look like, I would also be perfectly happy with using an android.os.CancellationSignal to decide when to abandon the sorting.

I'll try to describe my thought process here. If anybody has a better suggestion at any point, feel free to chime in.
Let's restate what we are trying to achieve here.
We need a sorting algorithm with the following properties:
1. It runs on a single task.
2. It is in place, i.e. it does not use extra memory.
3. We should be able to cancel the sort at will, i.e. return immediately or very close to it when we decide it's no longer needed.
4. It is efficient.
5. It does not use exceptions to control a perfectly normal flow of your application. You are right about not feeling comfortable about that one ☺
There is no native Android tool to do that, AFAIK.
Let's focus for a second on requirement 3.
Here is a quote from the AsyncTask documentation, from the section on cancelling a task:
"To ensure that a task is cancelled as quickly as possible, you should always check the return value of isCancelled() periodically from doInBackground(Object[]), if possible (inside a loop for instance.)"
This means that an iterative sorting algorithm, where you check the isCancelled() flag on each iteration, will fulfil this requirement. The problem is that simple iterative sorting algorithms, such as insertion sort, are often not very efficient. That shouldn't matter too much for small inputs, but since you say your typical input is a huge ArrayList, and that is what triggered our quest anyway, we need to keep things as efficient as possible.
Since you mentioned quicksort, I was thinking: it has got almost everything we need! It's efficient, it's in place, and it runs on a single task. There is only one shortfall. In its classic form it is recursive, meaning it won't return immediately upon cancellation. Luckily, a brief Google search yields many results that can help, among them an iterative variant of quicksort. It replaces the recursive call stack with an explicit stack that stores the same index ranges the recursive implementation would pass to "partition".
Take this algorithm, add a check of asyncTask.isCancelled() on each iteration, and you've got yourself a solution that meets all the requirements.
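A minimal sketch of that idea, not a tuned implementation. The class name CancellableSort and the BooleanSupplier parameter are assumptions made for illustration; inside an AsyncTask, something like this::isCancelled could be passed in (on older Android API levels a small hand-written interface could stand in for BooleanSupplier). The check runs once per partition step, so the worst-case delay before returning is one pass over the largest remaining range.

```java
import java.util.ArrayDeque;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical helper: an iterative quicksort that polls a cancellation flag.
final class CancellableSort {
    private CancellableSort() {}

    /**
     * Sorts the list in place. Returns true if the sort completed, or false
     * if it was cancelled (the list is then a partially sorted permutation).
     */
    static <T> boolean sort(List<T> list, Comparator<? super T> cmp, BooleanSupplier cancelled) {
        // Explicit stack of [lo, hi] index ranges; replaces the recursive call stack.
        Deque<int[]> pending = new ArrayDeque<>();
        if (list.size() > 1) {
            pending.push(new int[] {0, list.size() - 1});
        }
        while (!pending.isEmpty()) {
            if (cancelled.getAsBoolean()) {
                return false; // give up between partition steps
            }
            int[] range = pending.pop();
            int lo = range[0];
            int hi = range[1];
            int p = partition(list, cmp, lo, hi);
            if (p - 1 > lo) pending.push(new int[] {lo, p - 1});
            if (p + 1 < hi) pending.push(new int[] {p + 1, hi});
        }
        return true;
    }

    // Lomuto partition using the middle element as the pivot.
    private static <T> int partition(List<T> list, Comparator<? super T> cmp, int lo, int hi) {
        int mid = lo + (hi - lo) / 2;
        swap(list, mid, hi);
        T pivot = list.get(hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (cmp.compare(list.get(i), pivot) < 0) {
                swap(list, i, store);
                store++;
            }
        }
        swap(list, store, hi); // pivot lands in its final position
        return store;
    }

    private static <T> void swap(List<T> list, int i, int j) {
        T tmp = list.get(i);
        list.set(i, list.get(j));
        list.set(j, tmp);
    }
}
```

From doInBackground one would call CancellableSort.sort(myList, myComparator, this::isCancelled) and discard the data if it returns false. If finer-grained cancellation is needed, the same flag could also be polled every few thousand comparisons inside the partition loop.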

Related

does caching reduce the running time in java?

For example, what I need to implement at the moment is called the Submission history. This requires me to use a data structure whose methods each run in better than O(n), and someone told me to use
HashMap<studentId, TreeMap<Date, studentScore>>
since:
getBestGrade method: find all submissions for a student in O(1) and then find the best submission in O(N) (you can improve it by caching the best score).
So my question is, how would I go about using caching for getBestGrade?
What I am thinking is to first make a class for the tree-map and add put, remove and getBestGrade methods to it. Then I just call it from another class.
Also, how does the use of caching reduce the time complexity (big-O)?
Please help... thanks.

This technique is called memoization. Java 8 added features that help with it.
Whether it pays off depends on how often the cached operation is repeated. You should manage the cache size, of course. It may give you some advantages, but it may also eat up your memory.
BTW, this is a good way to store the data. Looking up a student by studentId takes O(1) on average, and looking up a Date within that student's TreeMap takes O(log n):
HashMap<studentId, TreeMap<Date, studentScore>>
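One possible way to cache the best score, sketched under assumptions: the class name SubmissionHistory, String student ids and Integer scores are all made up for illustration. The cached value is updated on every put, so getBestGrade becomes an O(1) average lookup; the O(n) rescan only happens when the current best entry itself is removed or overwritten with a lower score.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical wrapper around HashMap<studentId, TreeMap<Date, studentScore>>.
final class SubmissionHistory {
    // One student's submissions ordered by date, plus the cached best score.
    private static final class StudentRecord {
        final TreeMap<Date, Integer> byDate = new TreeMap<>();
        Integer best; // null while the student has no submissions
    }

    private final Map<String, StudentRecord> records = new HashMap<>();

    void put(String studentId, Date when, int score) {
        StudentRecord r = records.computeIfAbsent(studentId, id -> new StudentRecord());
        Integer replaced = r.byDate.put(when, score);
        if (replaced != null && replaced.equals(r.best) && score < replaced) {
            r.best = maxOf(r.byDate); // the old best was overwritten with a lower score
        } else if (r.best == null || score > r.best) {
            r.best = score;           // common case: O(1) cache update
        }
    }

    void remove(String studentId, Date when) {
        StudentRecord r = records.get(studentId);
        if (r == null) return;
        Integer removed = r.byDate.remove(when);
        if (removed != null && removed.equals(r.best)) {
            r.best = maxOf(r.byDate); // rescan only when the best entry is removed
        }
    }

    /** Returns the cached best score, or null if the student has no submissions. */
    Integer getBestGrade(String studentId) {
        StudentRecord r = records.get(studentId);
        return r == null ? null : r.best;
    }

    private static Integer maxOf(TreeMap<Date, Integer> submissions) {
        Integer best = null;
        for (Integer s : submissions.values()) {
            if (best == null || s > best) best = s;
        }
        return best;
    }
}
```

So caching does not change the cost of finding a maximum from scratch; it moves that work to the (rarer) writes that invalidate the cached value, which makes the frequent read cheap.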

Is there an alternative to CopyOnWriteArrayList which can be sorted?

I have a collection of 'effects' I draw on an 'object' in a GUI ( gradients, textures, text etc ). The nature of the underlying system means that this effect collection can be accessed by multiple threads. The majority of operations are reads, so at the moment I'm using a CopyOnWriteArrayList which works ok.
But now I need to sort the collection of effects based on their draw order whenever I add a new effect or change an effect's draw order. I also need to be able to iterate through the collection forwards & in reverse ( iterator.next() & iterator.previous() ).
After some research I've found CopyOnWriteArrayLists don't like being sorted:
Behaviour of CopyOnWriteArrayList
If you tried to sort a CopyOnWriteArrayList you'll see the list throws an UnsupportedOperationException (the sort invokes set on the collection N times). You should only use this list when you are doing upwards of 90+% reads.
I also found a suggestion of using ConcurrentSkipListSet, as it handles concurrency & sorting, but looking at the JavaDoc I'm worried about this:
Beware that, unlike in most collections, the size method is not a constant-time operation.
And I use & rely on the size() method quite a bit.
I could implement a system of manual synchronization on the effect collection as a standard ArrayList, but this is a very big refactor & I'd rather exhaust all other possibilities, if anyone has any ideas? Thanks for sticking with me this far.

Probably the best way to go is to manually synchronize at the point where you are sorting your collection. You could do something like this (pseudocode):
synchronize {
    convert the CopyOnWriteArrayList to a normal list
    sort the normal list
    convert the normal list back to a CopyOnWriteArrayList and replace the shared instance
}
Alternatively, you might just use a normal ArrayList and roll your own solution using ReentrantReadWriteLock. This should work OK if you have more reads than writes.
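The pseudocode above might look roughly like this in Java. This is only a sketch: the class name SortedEffects and its methods are hypothetical, and it assumes every write goes through this class so that the synchronized methods are the only writers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical holder for the shared, always-sorted effect list.
final class SortedEffects<E> {
    private final Comparator<? super E> drawOrder;
    // volatile so that readers always see the most recently swapped-in list
    private volatile CopyOnWriteArrayList<E> effects = new CopyOnWriteArrayList<>();

    SortedEffects(Comparator<? super E> drawOrder) {
        this.drawOrder = drawOrder;
    }

    /** Adds an effect and re-sorts; writers are serialized by the lock. */
    synchronized void add(E effect) {
        List<E> copy = new ArrayList<>(effects); // convert to a normal list
        copy.add(effect);
        Collections.sort(copy, drawOrder);       // sort the normal list
        effects = new CopyOnWriteArrayList<>(copy); // replace the shared instance
    }

    /** Call after an effect's draw order has changed. */
    synchronized void resort() {
        List<E> copy = new ArrayList<>(effects);
        Collections.sort(copy, drawOrder);
        effects = new CopyOnWriteArrayList<>(copy);
    }

    /** Lock-free read: a stable view with O(1) size() and a two-way listIterator(). */
    List<E> snapshot() {
        return Collections.unmodifiableList(effects);
    }
}
```

Readers never block: they keep iterating over whichever list instance they obtained from snapshot(), while writers build and publish a new sorted instance.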

Good algorithm for generating call graphs?

I am writing some code to generate call graphs for a particular intermediate representation without executing it by statically scanning the IR code. The IR code itself is not too complex and I have a good understanding of what function call sequences look like so all I need to do is trace the calls. I am currently doing it the obvious way:
Keep track of where we are
If we encounter a function call, branch to that location, execute and come back
While branching put an edge between the caller and the callee
I am satisfied with my progress so far, but I want to make sure that I am not reinventing the wheel here and running into corner cases. Are there any accepted good algorithms (and/or design patterns) that do this efficiently?
UPDATE:
The IR code is a byte-code disassembly from a homebrew Java-like language and looks like the Jasmin specification.

From an academic perspective, here are some considerations:
Do you care about being conservative / correct? For example, suppose the code you're analyzing contains a call through a function pointer. If you're just generating documentation, then it's not necessary to deal with this. If you're doing a code optimization that might go wrong, you will need to assume that 'call through pointer' means 'could be anything.'
Beware of exceptional execution paths. Your IR may or may not abstract this away from you, but keep in mind that many operations can throw both language-level exceptions as well as hardware interrupts. Again, it depends on what you want to do with the call graph later.
Consider how you'll deal with cycles (e.g. recursion, mutual recursion). This may affect how you write code for traversing the graphs later on (i.e., they will need some sort of 'visited' set to avoid traversing cycles forever).
Cheers.
Update March 6:
Based on extra information added to the original post:
Be careful about virtual method invocations. Keep in mind that, in general, it is unknowable which method will execute. You may have to assume that the call will go to any of the subclasses of a particular class. The standard example goes a bit like this: suppose you have an ArrayList<A>, and you have class B extends A. Based on a random number generator, you will add instances of A and B to the list. Now you call x.foo() for all x in the list, where foo() is a virtual method in A with an override in B. So, by just looking at the source code, there is no way of knowing whether the loop calls A.foo, B.foo, or both at run time.
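To make the point about cycles concrete, here is a small worklist-based sketch. The input format is an assumption: a map from each function name to the names it may call, as a scanner over the IR might produce. For a virtual call, the conservative choice would be to list every override that could be the target. The class name CallGraphBuilder is likewise made up.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical builder: collects caller -> callee edges reachable from an entry point.
final class CallGraphBuilder {
    private CallGraphBuilder() {}

    static Map<String, Set<String>> build(String entry, Map<String, List<String>> callSites) {
        Map<String, Set<String>> edges = new LinkedHashMap<>();
        Set<String> visited = new HashSet<>();      // guards against recursion cycles
        Deque<String> worklist = new ArrayDeque<>();
        visited.add(entry);
        worklist.push(entry);
        while (!worklist.isEmpty()) {
            String caller = worklist.pop();
            Set<String> callees = edges.computeIfAbsent(caller, k -> new LinkedHashSet<>());
            for (String callee : callSites.getOrDefault(caller, Collections.emptyList())) {
                callees.add(callee);                // edge between caller and callee
                if (visited.add(callee)) {
                    worklist.push(callee);          // scan each function only once
                }
            }
        }
        return edges;
    }
}
```

Because each function is pushed at most once, direct and mutual recursion terminate, and functions unreachable from the entry point never appear in the result.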

I don't know the algorithm, but pycallgraph does a decent job. It is worth checking out the source for it. It is not long and should be good for checking out existing design patterns.

Multiple threads modifying a collection in Java?

The project I am working on requires a whole bunch of queries towards a database. In principle there are two types of queries I am using:
(I) Read from an Excel file, check for a couple of parameters and do a query for hits in the database. These hits are then registered as a series of custom classes. Any hit may (and most likely will) occur more than once, so this part of the code checks and updates the occurrence in a custom list implementation that extends ArrayList.
(II) For each hit found, do a detail query and parse the output, so that the classes created in (I) get detailed info.
I figured I would use multiple threads to optimize time-wise. However I can't really come up with a good way to solve the problem that occurs with the collection these items are stored in. To elaborate a little bit; throughout the execution objects are supposed to be modified by both (I) and (II).
I deliberately didn't c/p any code, as it would take big chunks of code to make any sense. I hope it makes some sense with the description above.
Thanks,

In Java 5 and above, you may either use CopyOnWriteArrayList or a synchronized wrapper around your list. In earlier Java versions, only the latter choice is available. The same is true if you absolutely want to stick to the custom ArrayList implementation you mention.
CopyOnWriteArrayList is feasible if the container is read much more often than written (changed), which seems to be true based on your explanation. Its atomic addIfAbsent() method may even help simplify your code.
[Update] On second thought, a map sounds more fitting to the use case you describe. So if changing from a list to e.g. a map is an option, you should consider ConcurrentHashMap. [/Update]
Changing the objects within the container does not affect the container itself, however you need to ensure that the objects themselves are thread-safe.
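As a rough illustration of the map-based suggestion, occurrence counting per hit could look like the following. The class name HitCounter and the use of a String key to identify a hit are assumptions; merge() requires Java 8 or later and performs the read-modify-write atomically on a ConcurrentHashMap.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical thread-safe occurrence counter keyed by some hit identifier.
final class HitCounter {
    private final ConcurrentHashMap<String, Integer> occurrences = new ConcurrentHashMap<>();

    /** Registers one more occurrence of the given hit; safe to call from many threads. */
    void record(String hitKey) {
        occurrences.merge(hitKey, 1, Integer::sum);
    }

    int count(String hitKey) {
        return occurrences.getOrDefault(hitKey, 0);
    }

    public static void main(String[] args) throws InterruptedException {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < 1000; n++) {
                    counter.record("hit-A");
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        // 4 threads x 1000 atomic increments
        System.out.println(counter.count("hit-A")); // prints 4000
    }
}
```

If the values were mutable detail objects rather than counters, the map would still only protect the container; the objects themselves would need their own synchronization, as noted above.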

Just use the new java.util.concurrent package.
Classes like ConcurrentLinkedQueue and ConcurrentHashMap are already there for you to use and are all thread-safe.

Priority Queues in Java

java.util.PriorityQueue allows a Comparator to be passed at construction time. When inserting elements, they are ordered according to the priority specified by the comparator.
What happens when the priority of an element changes after it has been inserted? When does the PriorityQueue reorder elements? Is it possible to poll an element that does not actually have minimal priority?
Are there good implementations of a priority queue which allow efficient priority updates?

You should remove the element, change it, and re-insert it, since ordering occurs when it is inserted. Although it involves several steps, it might be efficient enough. (I just noticed the comment about removal being O(n).)
One problem is that it will also re-order when you remove the element, which is redundant if you are just going to re-insert it a moment later. If you implement your own priority queue from scratch, you could have an update() that skips this step, but extending Java's class won't work because you are still limited to the remove() and add() provided by the base.
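A small sketch of the remove, change, re-insert approach with java.util.PriorityQueue. The Task and UpdatableTasks classes are hypothetical; note that remove(Object) is a linear scan, so each update costs O(n).

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical wrapper that keeps the heap consistent when a priority changes.
final class UpdatableTasks {
    static final class Task {
        final String name;
        int priority; // lower value = polled first

        Task(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }
    }

    private final PriorityQueue<Task> queue =
            new PriorityQueue<>(Comparator.comparingInt((Task t) -> t.priority));

    void add(Task task) {
        queue.add(task);
    }

    /** Removes the task, changes its priority, then re-inserts it. */
    void updatePriority(Task task, int newPriority) {
        boolean wasQueued = queue.remove(task); // O(n) scan; Task uses identity equality
        task.priority = newPriority;            // mutate only while outside the heap
        if (wasQueued) {
            queue.add(task);                    // O(log n) re-insertion
        }
    }

    Task poll() {
        return queue.poll();
    }
}
```

Changing task.priority directly while the task is still queued would leave the heap in an inconsistent state, and poll() could then return an element that is not actually minimal.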

I would expect the PriorityQueue to not reorder things - and it could get very confused if it tries to do a binary search to find the right place to put any new entries.
Generally speaking I'd expect changing the priority of something already in a queue to be a bad idea, just like changing the values making up a key in a hash table.
