Java Thread Good Practice this way? - java

This is a fairly long post, so thanks for reading.
My app is supposed to process a lot of sound files, let's say 4000+. My first approach was to load a certain amount (let's say 200 MB) of sound data, process it, write it, and then null the references so the GC can free it. But since the data is loaded over the intranet, this does not seem to be the best way: file access is slow, and calculations should start as soon as the first file is loaded. To achieve this, I changed the concept to a sort of producer/consumer (I think). Here are my classes so far:
Reader/Producer
public class ReaderThread extends Thread {
List<Long> files;
ConcurrentLinkedQueue<Long> loaded = new ConcurrentLinkedQueue<Long>();
boolean finished = false;
public ReaderThread( List<Long> soundFiles) {
this.files = soundFiles;
}
@Override
public void run() {
Iterator<Long> it = files.iterator();
while(it.hasNext()) {
Long id = it.next();
if (FileLoader.load(id)) {
loaded.add(id);
}
}
finished = true;
}
public Long getNextId() {
while(loaded.isEmpty()) {
if( finished ) {
return null;
}
}
Long id = loaded.poll();
return id;
}
}
This is the writer (not the consumer):
public class WriterThread extends Thread {
ConcurrentLinkedQueue<Long> loaded = new ConcurrentLinkedQueue<Long>();
String directory;
boolean abort = false;
public WriterThread(String directory) {
this.directory = directory;
}
@Override
public void run() {
while(!(abort&&loaded.isEmpty())) {
if(!loaded.isEmpty()) {
Long id = loaded.poll();
FileWriter.write(id, directory);
FileManager.unload(id);
}
}
}
public synchronized void submit(Long id) {
loaded.add(id);
}
public synchronized void halt() {
abort = true;
}
}
This is the part where all things get together:
// Forgive me the "t" and "w". ;-)
t = new ReaderThread(soundSystem,soundfilesToProcess);
w = new WriterThread(soundSystem,outputDirectory );
t.start();
w.start();
long start = System.currentTimeMillis();
while(!abort) {
Long id = t.getNextId();
if(id!=null) {
SoundFile soundFile = soundSystem.getSoundfile(id);
ProcessorChain pc = new ProcessorChain(soundFile, getProcessingChain(), w);
Future<List<ProcessorResult>> result = es.submit(pc);
results.add(result);
}else {
break;
}
}
for( Future<List<ProcessorResult>> result : results) {
List<ProcessorResult> tempResults;
try {
tempResults = result.get();
processResults(tempResults);
} catch (InterruptedException e) {
e.printStackTrace();
} catch (ExecutionException e) {
e.printStackTrace();
}
}
w.halt();
"ProcessorChain" is a runnable.
es.submit -> "es" is a CachedThreadPool.
What I need to know first is whether this approach is "good" or more like "nonsense". It seems to work quite well, but I have a small problem with the writer thread: in some cases not all files get written. The writer thread's submit method is called by the ProcessorChain when it has finished its work.
The second thing is thread safety. Did I miss something?

I believe it will be (a lot) simpler if each thread reads, processes and then writes a whole sound file (one thread per file).
You can use Java thread pools and let the operating system/Java VM parallelize the read/process/write across multiple files to gain efficiency. I may be wrong, but from what you described a simpler solution would be enough, and then you can measure your bottleneck if further improvements are needed.
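A minimal sketch of that one-task-per-file idea, under stated assumptions: `processFile` is a hypothetical stand-in for "load, process and write one sound file", and the pool size is an arbitrary parameter, not something taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerFileProcessing {

    // Hypothetical stand-in for "load, process and write one sound file".
    // It only echoes the id so the sketch stays runnable.
    static Long processFile(Long id) {
        return id;
    }

    // Submits one task per file and waits for all of them to finish.
    static List<Long> processAll(List<Long> ids, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (Long id : ids) {
                futures.add(pool.submit(() -> processFile(id)));
            }
            List<Long> done = new ArrayList<>();
            for (Future<Long> f : futures) {
                try {
                    done.add(f.get()); // blocks until this file is finished
                } catch (ExecutionException e) {
                    // One failed file should not stop the others.
                    System.err.println("Processing failed: " + e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList(1L, 2L, 3L), 2));
    }
}
```

A fixed-size pool bounds how many files are held in memory at once, which may matter more here than raw throughput.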

I think the approach is ok in general (one thread for reading input, one for writing output and one or more for processing).
A couple of suggestions:
1 - You probably want to use semaphores instead of having your threads spin in a constant loop. For example, with a semaphore your writer thread would simply block until a file was actually available to write. Currently it spins, potentially wasting 1/3 of your CPU cycles when there's nothing to write.
2 - You probably want to explicitly create worker threads instead of doing the work on the main thread. That way you can have multiple threads processing at the same time.
That may already be what ProcessorChain is doing, but it's not clear from the snippet.
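The first suggestion could be sketched as follows. Note that this swaps the semaphore for a `BlockingQueue`, a different tool that gives the same "block until there is work" behaviour. The `STOP` sentinel and the `written` list are assumptions made for illustration; a real writer would call the question's `FileWriter.write(id, directory)` instead of recording ids.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BlockingWriter extends Thread {

    // Sentinel ("poison pill") that tells the writer to stop.
    // Assumes no real file id equals Long.MIN_VALUE.
    private static final Long STOP = Long.MIN_VALUE;

    private final BlockingQueue<Long> queue = new LinkedBlockingQueue<>();
    // Records what was "written" so the sketch is observable.
    private final List<Long> written = Collections.synchronizedList(new ArrayList<>());

    @Override
    public void run() {
        try {
            while (true) {
                Long id = queue.take(); // blocks without spinning until an id arrives
                if (id.equals(STOP)) {
                    return; // everything submitted before halt() has been handled
                }
                written.add(id);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    void submit(Long id) { queue.add(id); }

    void halt() { queue.add(STOP); }

    List<Long> written() { return written; }

    // Small driver: submits n ids, halts, waits for the writer, returns what was written.
    static List<Long> demo(int n) {
        BlockingWriter w = new BlockingWriter();
        w.start();
        for (long i = 0; i < n; i++) {
            w.submit(i);
        }
        w.halt();
        try {
            w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return w.written();
    }
}
```

Because the stop signal travels through the same queue as the work, every id submitted before `halt()` is processed before the thread exits.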

Related

Lock or wait cache load

We need to lock a method responsible for loading database data into a HashMap-based cache.
A possible situation is that a second thread tries to access the method while the first thread is still loading the cache.
We consider the second thread's effort in this case to be superfluous. We would therefore like to have that second thread wait until the first thread is finished, and then return (without loading the cache again).
What I have works, but it seems quite inelegant. Are there better solutions?
private static final ReentrantLock cacheLock = new ReentrantLock();
private void loadCachemap() {
if (cacheLock.tryLock()) {
try {
this.cachemap = retrieveParamCacheMap();
} finally {
cacheLock.unlock();
}
} else {
try {
cacheLock.lock(); // wait until thread doing the load is finished
} finally {
try {
cacheLock.unlock();
} catch (IllegalMonitorStateException e) {
logger.error("loadCachemap() finally {}",e);
}
}
}
}
I prefer a more resilient approach using read locks AND write locks. Something like:
private static final ReadWriteLock cacheLock = new ReentrantReadWriteLock();
private static final Lock cacheReadLock = cacheLock.readLock();
private static final Lock cacheWriteLock = cacheLock.writeLock();
private void loadCache() throws Exception {
// Expiry.
while (storeCache.expired(CachePill)) {
/**
* Allow only one in - all others will wait for 5 seconds before checking again.
*
* Eventually the one that got in will finish loading, refresh the Cache pill and let all the waiting ones out.
*
* Also waits until all read locks have been released - not sure if that might cause problems under busy conditions.
*/
if (cacheWriteLock.tryLock(5, TimeUnit.SECONDS)) {
try {
// Got a lock! Start the rebuild if still out of date.
if (storeCache.expired(CachePill)) {
rebuildCache();
}
} finally {
cacheWriteLock.unlock();
}
}
}
}
Note that storeCache.expired(CachePill) detects a stale cache, which may be more than you want, but the concept is the same: establish a write lock before updating the cache, which denies all read attempts until the rebuild is done. Also, either manage multiple write attempts in a loop of some sort, or just drop out and let the read lock wait for access.
A read from the cache now looks like this:
public Object load(String id) throws Exception {
Store store = null;
// Make sure cache is fresh.
loadCache();
// Establish a read lock so we do not attempt a read while the cache is being updated.
// Acquire it before the try block so that unlock() only runs if lock() succeeded.
cacheReadLock.lock();
try {
store = storeCache.get(storeId);
} finally {
// Make sure the lock is cleared.
cacheReadLock.unlock();
}
return store;
}
The primary benefit of this form is that read access does not block other read access but everything stops cleanly during a rebuild - even other rebuilds.
You didn't say how complicated your structure is and how much concurrency / congestion you need. There are many ways to address your need.
If your data is simple, use a ConcurrentHashMap or similar to hold your data. Then just read and write from any thread without extra locking.
Another alternative is to use the actor model and put reads and writes on the same queue.
If all you need is to fill a read-only map which is initialized from the database once requested, you could use any form of double-checked locking, which may be implemented in a number of ways. The easiest variant would be the following:
private volatile Map<T, V> cacheMap;
public void loadCacheMap() {
if (cacheMap == null) {
synchronized (this) {
if (cacheMap == null) {
cacheMap = retrieveParamCacheMap();
}
}
}
}
But I would personally prefer to avoid any form of synchronization here and just make sure that the initialization is done before any other thread can access it (for example in a form of init method in a DI container). In this case you would even avoid overhead of volatile.
EDIT: The answer works only when initial load is expected. In case of multiple updates, you could try to replace the tryLock by some other form of test and test-and-set, for example using something like this:
private final AtomicReference<CountDownLatch> sync =
new AtomicReference<>(new CountDownLatch(0));
private void loadCacheMap() throws InterruptedException {
CountDownLatch oldSync = sync.get();
if (oldSync.getCount() == 0) { // if nobody is updating now
CountDownLatch newSync = new CountDownLatch(1);
if (sync.compareAndSet(oldSync, newSync)) {
try {
cacheMap = retrieveParamCacheMap();
} finally {
newSync.countDown(); // release waiters even if the load fails
}
return;
}
}
sync.get().await(); // wait for the thread that is doing the load
}

Concurrent algorithm for interrupting and restarting a calculation

I am developing an application which allows the user to adjust several parameters and then performs a computation which can take up to a minute, after which it displays the result to the user.
I would like the user to be able to adjust the parameters and restart the calculation, terminating the progress of the current calculation.
Additionally, from the programming perspective, I would like to be able to block until the calculation is completed or interrupted, and be able to know which.
In pseudo code, this is roughly what I am looking for:
method performCalculation:
interrupt current calculation if necessary
asynchronously perform calculation with current parameters
method performCalculationBlock:
interrupt current calculation if necessary
perform calculation with current parameters
if calculation completes:
return true
if calculation is interrupted:
return false
What I have so far satisfies the first method, but I am not sure how to modify it to add the blocking functionality:
private Thread computationThread;
private Object computationLock = new Object();
private boolean pendingComputation = false;
...
public MyClass() {
...
computationThread = new Thread() {
public void run() {
while (true) {
synchronized (computationLock) {
try {
computationLock.wait();
pendingComputation = false;
calculate();
} catch (InterruptedException e) {
}
}
}
}
private void checkForPending() throws InterruptedException {
if (pendingComputation)
throw new InterruptedException();
}
private void calculate() {
...
checkForPending();
...
checkForPending();
...
// etc.
}
};
computationThread.start();
}
private void requestComputation() {
pendingComputation = true;
synchronized (computationLock) {
computationLock.notify();
}
}
What is the best way to go about adding this functionality? Or is there a better way to design the program to accomplish all of these things?
If you are using JDK 5 or later, check the java.util.concurrent package. The FutureTask class seems to match your requirement: a cancellable asynchronous computation with a blocking get().
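A minimal sketch of how that could map onto the two methods from the pseudocode, assuming a single-threaded `ExecutorService` (whose `submit` wraps the task in a `FutureTask`). The class name and the choice to pass the calculation in as a `Runnable` are illustrative assumptions, not part of the question.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class RestartableCalculation {

    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private Future<?> current; // guarded by "this"

    // Cancels any running calculation and starts a new one asynchronously.
    synchronized Future<?> performCalculation(Runnable calculation) {
        if (current != null) {
            current.cancel(true); // interrupts the worker thread if it is running
        }
        current = executor.submit(calculation);
        return current;
    }

    // Starts a calculation and blocks until it ends.
    // Returns true if it completed, false if a newer request cancelled it.
    boolean performCalculationBlock(Runnable calculation) throws InterruptedException {
        Future<?> mine = performCalculation(calculation);
        try {
            mine.get();
            return true;
        } catch (CancellationException e) {
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Calculation failed", e.getCause());
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

`cancel(true)` only interrupts the worker thread. The calculation still has to cooperate, for example by checking `Thread.currentThread().isInterrupted()` at the points where the question's code calls `checkForPending()`.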

Use finalize() in my case?

I have an ImageWrapper class that saves images to temporary files in disk in order to free heap memory, and allows reloading them when needed.
class ImageWrapper {
File tempFile;
public ImageWrapper(BufferedImage img) {
// save image to tempFile and gc()
}
public BufferedImage getImage() {
// read the image from tempFile and return it.
}
public void delete() {
// delete image from disk.
}
}
My concern is how to make sure that the files get deleted when such an ImageWrapper instance is garbage collected (otherwise I risk filling the disk with unneeded images). This must be done while the application is still running (as opposed to cleanup during termination), because it is likely to run for long periods.
I'm not fully familiar with Java's GC concept, and I was wondering if finalize() is what I'm looking for. My idea was to call delete() (on a separate thread, for that matter) from an overridden finalize() method. Is that the right way to do it?
UPDATE:
I don't think I can close() the object as suggested by many users, because each such image is passed to a list of listeners which I don't control and which might keep a reference to the object. The only time when I'm certain to be able to delete the file is when no references are held, hence I thought finalize() was the right way. Any suggestions?
UPDATE 2:
What are the scenarios where finalize() will not be called? If the only possibilities are exiting the program (in an expected or unexpected way), I can accept that, because it means I risk only one unneeded temp file left undeleted (the one that was being processed during exit).
Another approach is to use File.deleteOnExit() which marks a file for the JVM to delete upon exit. I realise it's not quite what you're looking for, but may be of interest.
To be clear, if your JVM dies unexpectedly, it won't clear those files. As such, you may want to architect your solution to clear up cache files on startup, such that you don't build up a mass of unused cache files over time.
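A minimal sketch of such a startup cleanup, assuming the temp files live in a dedicated directory and share a name prefix. Both the directory and the prefix are hypothetical conventions, not something stated in the question.

```java
import java.io.File;

class CacheCleaner {

    // Deletes leftover files with the given prefix from dir and returns how many
    // were removed. The prefix convention is an assumption so that unrelated
    // files in the same directory are left alone.
    static int purgeStaleFiles(File dir, String prefix) {
        File[] leftovers = dir.listFiles((d, name) -> name.startsWith(prefix));
        if (leftovers == null) {
            return 0; // dir does not exist or is not a directory
        }
        int deleted = 0;
        for (File f : leftovers) {
            if (f.isFile() && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }
}
```

Calling this once at startup, before any new temp files are created, removes whatever a previous crashed run left behind.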
A good alternative to finalize() is a PhantomReference. One way to use it is:
public class FileReference extends PhantomReference<CachedImage> {
private final File _file;
public FileReference(CachedImage img, ReferenceQueue<CachedImage> q, File f) {
super(img, q);
_file = f;
}
public File getFile() {
return _file;
}
}
Then use it like:
public class CachedImage {
private static final ReferenceQueue<CachedImage>
refQue = new ReferenceQueue<CachedImage>();
static {
Thread t = new Thread() {
@Override
public void run() {
try {
while (true) {
FileReference ref = (FileReference)refQue.remove();
File f = ref.getFile();
f.delete();
}
} catch (Throwable t) {
_log.error(t);
}
}
};
t.setDaemon(true);
t.start();
}
private final FileReference _ref;
public CachedImage(BufferedImage bi, File tempFile) {
tempFile.deleteOnExit();
saveAndFree(bi, tempFile);
_ref = new FileReference(this, refQue, tempFile);
}
...
}
It is not recommended to use finalize(). The problem is that you can't count on the garbage collector ever collecting a particular object, so any code that you put into your class's overridden finalize() method is not guaranteed to run.
There's no guarantee that your finalize method will ever get called; in particular, any objects hanging around when the program exits are usually just thrown away with no cleanup. Closeable is a much better option.
As an alternative to @Brian Agnew's answer, why not install a ShutdownHook that clears out your cache directory?
public class CleanCacheOnShutdown extends Thread {
@Override
public void run() { ... }
}
Runtime.getRuntime().addShutdownHook(new CleanCacheOnShutdown());
I ended up using a combination of File.deleteOnExit() (thanks @Brian), and a ScheduledExecutorService that goes over a ReferenceQueue of PhantomReferences to my class instances, according to this post.
I add this answer because no one suggested using ReferenceQueue (which I think is the ideal solution for my problem), and I think it will be helpful for future readers.
The (somewhat simplified) outcome is this (changed the class name to CachedImage):
public class CachedImage {
private static Map<PhantomReference<CachedImage>, File>
refMap = new HashMap<PhantomReference<CachedImage >, File>();
private static ReferenceQueue<CachedImage>
refQue = new ReferenceQueue<CachedImage>();
static {
Executors.newScheduledThreadPool(1).scheduleWithFixedDelay(new Thread() {
@Override
public void run() {
try {
Reference<? extends CachedImage> phanRef =
refQue.poll();
while (phanRef != null) {
File f = refMap.remove(phanRef); // remove the entry so the map does not grow forever
f.delete();
phanRef = refQue.poll();
}
} catch (Throwable t) {
_log.error(t);
}
}
}, 1, 1, TimeUnit.MINUTES);
}
public CachedImage(BufferedImage bi, File tempFile) {
tempFile.deleteOnExit();
saveAndFree(bi, tempFile);
PhantomReference<CachedImage> pref =
new PhantomReference<CachedImage>(this, refQue);
refMap.put(pref, tempFile);
}
...
}

Threaded BlockingQueue, clarification needed

When working with a BlockingQueue, I implemented the following logic to read from it until told otherwise. Unfortunately, the following happens intermittently:
The problem:
Even after shouldContinueReading is set to false, the loop does not CONSISTENTLY break.
The problem is intermittent; sometimes everything works fine.
As part of the QThread class, I declare:
public static volatile boolean shouldContinueReading = true;
The run() method (confirmed to be executing) contains:
while (shouldContinueReading) {
try {
String retrieved = qIn.poll(2, TimeUnit.MILLISECONDS);
if (retrieved != null)
consume(retrieved);
} catch (InterruptedException e) {
e.printStackTrace();
}
}
System.out.println("I am out"); // <-- not always seen
if (qIn.remainingCapacity() > 0) {
try {
consume(qIn.take());
} catch (InterruptedException e) {
e.printStackTrace();
}
}
While this is going on, in another thread, shouldContinueReading changes state when certain things happen:
while (stillReading) {
// do nothing
}
QThread.shouldContinueReading = false;
Update: problem resolved
Turns out the problem lies a bit further:
private void consume(String take) {
// some processing
produce(newData.toString());
}
private void produce(String newData) {
System.out.println(newData);
try {
qOut.put(newData); // <-- Problem is here. Should use offer instead of put
} catch (InterruptedException e) {
e.printStackTrace();
}
}
Both qIn (queue in) and qOut (queue out) are declared as:
private volatile BlockingQueue<String> qIn;
private volatile BlockingQueue<String> qOut;
The objects themselves are created elsewhere as follows and passed down to the constructor:
BlockingQueue<String> q1 = new SynchronousQueue<String>();
BlockingQueue<String> q2 = new SynchronousQueue<String>();
QThread qThread = new QThread(q1, q2);
Any suggestions? What should I do with qOut? Am I not declaring it correctly?
I bet QThread.shouldContinueReading = false; isn't always getting executed, or the reading thread is not executing in the first place. In other words, the problem you are seeing is likely somewhere upstream, not here. The first thing I'd do is pin down exactly where the problem lies, with 100% confidence (add some more print statements).
Apart from that, I'd recommend using the thread interruption mechanism instead of rolling your own flag, especially if this is production code. Interruption is itself just a glorified flag, but it also affects third-party code such as BlockingQueue, which makes the implementation simpler and even more efficient.
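A minimal sketch of an interruption-based reader loop. It uses a `LinkedBlockingQueue` and a `consumed` list purely for illustration (the question uses `SynchronousQueue` and its own `consume` method), so treat those names as assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class InterruptibleReader extends Thread {

    private final BlockingQueue<String> in;
    // Stand-in for the question's consume(...): just records what was read.
    private final List<String> consumed = Collections.synchronizedList(new ArrayList<>());

    InterruptibleReader(BlockingQueue<String> in) {
        this.in = in;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                consumed.add(in.take()); // blocks; throws if the thread is interrupted
            }
        } catch (InterruptedException e) {
            // Interruption is the stop signal here, so just let the thread end.
        }
    }

    List<String> consumed() {
        return consumed;
    }

    // Driver: feeds two items, waits until they are consumed, then interrupts the reader.
    static boolean demo() {
        BlockingQueue<String> q = new LinkedBlockingQueue<>();
        InterruptibleReader reader = new InterruptibleReader(q);
        reader.start();
        q.add("a");
        q.add("b");
        try {
            while (reader.consumed().size() < 2) {
                Thread.sleep(10);
            }
            reader.interrupt(); // replaces setting a shared boolean flag
            reader.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !reader.isAlive() && reader.consumed().size() == 2;
    }
}
```

Blocking calls such as `take()` and `put()` throw `InterruptedException` when the thread is interrupted, so a reader stuck in `qOut.put(...)` would also be released, which a plain boolean flag cannot do.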

Is there an alternate/better way to do this simple logic in java?

I have a method, say method1(), that takes a while to run. During its execution, if there is another call to method1(), it should be ignored. I have, roughly, something like this:
boolean mFlag = false;
void method1()
{
if(!mFlag)
{
mFlag=true;
// do Stuff
mFlag=false;
}
}
This works. But I was wondering if there is a better way to do this preferably not involving any flags.
Yes, you should really be using something from java.util.concurrent.locks. Your example isn't strictly correct: that boolean needs to be volatile, and even then the check-then-set is not atomic.
ReentrantLock lock = new ReentrantLock();
void method1()
{
if(lock.tryLock())
{
try {
if (!(lock.getHoldCount() > 1)) {
//do Some Stuff
}
} finally {
lock.unlock();
}
}
}
Edited to handle skipping execution on re-entrance, as indicated in your comment. Unfortunately there isn't really a great way to do that with the built-in library, as it's a bit of an odd use case, but I still think using the built-in library is the better option.
Are you trying to guard against re-entry from the same thread, or against multiple threads accessing at the same time?
Assuming multi-threaded access, the light approach is to use java.util.concurrent.atomic. No need for anything as "heavy" as a lock (provided there are not further requirements).
Assuming no-reentry from the same method:
private final AtomicBoolean inMethod = new AtomicBoolean();
void method1() {
if (inMethod.compareAndSet(false, true)) { // Alternatively !inMethod.getAndSet(true)
try {
// do Stuff
} finally {
inMethod.set(false); // Need to cover exception case!
}
}
}
If you want to allow re-entry within the same thread, it gets messy enough that you might as well use locks:
private final AtomicReference<Thread> inMethod = new AtomicReference<Thread>();
void method1() {
final Thread current = Thread.currentThread();
final Thread old = inMethod.get();
if (
old == current || // We already have it.
inMethod.compareAndSet(null, current) // Acquired it.
) {
try {
// do Stuff
} finally {
inMethod.set(old); // Could optimise for no change.
}
}
}
Could use the Execute Around idiom for this.
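A minimal sketch of how Execute Around could wrap the atomic-flag guard, so that the acquire/release logic lives in one place and callers only pass in the work. The class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class SingleEntryGuard {

    private final AtomicBoolean busy = new AtomicBoolean(false);

    // Runs the action unless another call is already in progress.
    // Returns true if the action ran, false if the call was ignored.
    boolean runIfIdle(Runnable action) {
        if (!busy.compareAndSet(false, true)) {
            return false; // someone is already inside; ignore this call
        }
        try {
            action.run();
            return true;
        } finally {
            busy.set(false); // always release, even if the action throws
        }
    }
}
```

With this, method1() reduces to something like `guard.runIfIdle(() -> { /* do stuff */ })`. Note that this variant also ignores re-entrant calls from the same thread.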
Maybe you should use synchronized methods
http://download.oracle.com/javase/tutorial/essential/concurrency/syncmeth.html
