I have to implement a queue to which objects will be added and removed by two different threads, at different times, based on some factor. My problem is that the requirement says the queue (the queue itself plus the data it holds) must not take up more than 200 KB. If it is at 200 KB, a thread should wait for space to become available before pushing more data. The objects pushed may vary in size. I can create a Java queue, but its size() will return the number of objects pushed rather than the total memory used. How do I determine the total size of the data my queue is referring to?
Consider the object pushed as
class A {
    int x;
    byte[] buf; // array size varies per object
}
There is no out of the box functionality for this in Java. (In part, because there is no easy way to know if the objects added to the collection are referenced elsewhere and therefore if adding them takes up additional memory.)
For your use case, you would probably be best off just subclassing a queue class. Override the add method to add the size of the object to a counter (obviously you will have to make this calculation thread safe) and to throw an IllegalStateException if there isn't room. Similarly, decrement the counter in an overridden remove method.
The method of determining how much space to add to the counter could vary. Farlan suggested using this, and that looks like it would work. But since you say you are dealing with a byte array, the size of the data you are adding might already be known to you. You will also have to consider whether you want to account for any of the overhead: the object itself takes some space, as does the reference inside the queue, plus the queue object itself. You could figure out exact values for those, but since your requirement seems to be simply to avoid running out of memory, rough estimates should do as long as you are consistent.
The details of which queue class you want to subclass may depend on how much contention you expect between the threads. But it sounds like you have a handle on the sync issues.
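A minimal sketch of such a byte-bounded queue, using wait/notify for the blocking behaviour. The class name ByteBoundedQueue and the SizeEstimator interface are my own, not from the question:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of a queue bounded by total payload bytes rather than element count.
public class ByteBoundedQueue<T> {

    public interface SizeEstimator<T> {
        long sizeOf(T item);
    }

    private final Deque<T> queue = new ArrayDeque<>();
    private final long maxBytes;
    private final SizeEstimator<T> estimator;
    private long usedBytes = 0;

    public ByteBoundedQueue(long maxBytes, SizeEstimator<T> estimator) {
        this.maxBytes = maxBytes;
        this.estimator = estimator;
    }

    // Blocks until enough space is free for this item.
    // Note: an item larger than maxBytes would block forever here;
    // a real implementation should reject it up front.
    public synchronized void put(T item) throws InterruptedException {
        long size = estimator.sizeOf(item);
        while (usedBytes + size > maxBytes) {
            wait();
        }
        queue.addLast(item);
        usedBytes += size;
        notifyAll(); // wake consumers waiting for data
    }

    // Blocks until an item is available; frees its bytes on removal.
    public synchronized T take() throws InterruptedException {
        while (queue.isEmpty()) {
            wait();
        }
        T item = queue.removeFirst();
        usedBytes -= estimator.sizeOf(item);
        notifyAll(); // wake producers waiting for space
        return item;
    }
}
```

For the class A in the question, the estimator could return something like buf.length plus a rough per-object overhead constant, as discussed above.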
Related
I have an array to which many threads are writing. However each thread has a pre-assigned range of indices which it may write to. Further, nothing will be reading from the array until all threads are done.
So far, so thread-safe. The problem arises when I need to expand the array, by which of course I mean swap it out for a larger array which copies the first. This is only done occasionally (similar to an ArrayList).
Currently I'm acquiring a lock for every single write to the array. Even though there is no need to lock in order to keep the array consistent, I'm having to lock in case the array is currently being copied/swapped.
As there are very many writes I don't want to require a lock for them. I'm okay with a solution which requires locking for writer threads only while the array is being copied and swapped, as this is infrequent.
But I can't just impose write locks only when the copy/swap is in progress, as threads may already be committing writes to the old array.
I think I need some variety of barrier which waits for all writes to complete, then pauses the threads while I copy/swap the array. But CyclicBarrier would require me to know exactly how many threads are currently active, which is non-trivial and possibly susceptible to edge-cases in which the barrier ends up waiting forever, or lowers itself too early. In particular I'm not sure how I'd deal with a new thread coming in while the barrier is already up, or how to deal with threads which are currently polling a job queue, so will never decrement the barrier count while there are no new jobs.
I may have to implement something which (atomically) counts active threads and tries to pre-empt all the edge cases.
But this may well be a "solved" problem that I don't know about, so I'm hoping there may be a simpler (therefore better) solution than the Cyclic barrier/thread counting. Ideally one which uses an existing utility class.
By the way, I've considered CopyOnWriteArray. This is no use to me, as it copies for every write (a lot of them), not just array expansions.
Also note the structure written to pretty much has to be an array, or array-based.
Thanks
Although it's technically not correct, you can probably use a ReadWriteLock. The threads that are writing to a single portion all use a read lock (this is the technically incorrect part, they're not reading...), and the resize uses a write lock. That way, all writing threads can work together. A resize has to wait until all portioned writes are done, which then blocks the entire array. Once that is done, all portioned writes can continue.
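A rough sketch of that inverted use of ReadWriteLock (the class and method names are illustrative):

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Per-slot writers share the read lock; the rare resize takes the write lock.
public class ExpandableArray {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile long[] data = new long[16];

    // Called by worker threads; many can run concurrently.
    public void set(int index, long value) {
        lock.readLock().lock();
        try {
            data[index] = value;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Called rarely; blocks until all in-flight writes have finished,
    // and blocks new writes until the swap is done.
    public void ensureCapacity(int capacity) {
        lock.writeLock().lock();
        try {
            if (data.length < capacity) {
                long[] bigger = new long[Math.max(capacity, data.length * 2)];
                System.arraycopy(data, 0, bigger, 0, data.length);
                data = bigger;
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    public long get(int index) { return data[index]; }

    public int length() { return data.length; }
}
```

The data field is volatile so that writers entering after a resize see the new array reference.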
There is a solution with some overhead, but no locking.
But first, I would recommend using a 2-D array (an array of arrays) unless you absolutely need a 1-D array. You can then expand the top-level array without affecting the contents of the lower-level arrays. You can also write a wrapper class for this to access the whole thing using 1-D indices if you wish.
But if you really want to have a 1-D array, I would recommend the following:
I am assuming each thread has a number which uniquely identifies it and can be converted to a small index (otherwise, I don't see how you would index into the main array).
I also assume you have a reference to the main array called mainArray which is statically accessible, but it could also be injected into the threads. It should be declared volatile.
You need another array currentArrays of length numberOfThreads, also available to all of the threads. Each element will contain a reference to the main-array instance that thread is currently using.
When you need to grow the array, allocate a new array and write its reference to mainArray. You don't need to copy anything at this point.
Before accessing the main array in your threads you need to grab a local reference to it (i.e., a local variable) by assigning from mainArray.
Then compare the grabbed reference with the reference in currentArrays. If it is the same, carry on, being careful to use the local reference.
If it is different, call a method (that you will write) to copy the part of the previous array for your thread to the new array and then carry on as before. Write the new array reference to currentArrays for that thread. Again, use the local reference until you are done.
The old array should be garbage collected once all of the threads have finished copying their part of it, which means not until all threads have had at least one request requiring it.
There will be some initialisation code for first time use which should be obvious (all currentArrays elements are set to mainArray).
I believe this should work. There is obviously the overhead of comparing array references before you can access the array; however, if you do a lot of array accesses in a single transaction/request you can save the array reference that you grabbed, pass it around and only recheck when you need to grab it again. That should reduce the overhead.
Disclaimer: I haven't tested it. Comments welcome.
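As a rough, equally untested sketch of the scheme described above (all names are mine; it assumes each numbered thread owns a fixed slice of the array):

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

// Grow-without-locking scheme: the coordinator publishes a new array,
// and each worker lazily copies its own slice the next time it looks.
public class GrowableSharedArray {
    private volatile long[] mainArray;
    private final AtomicReferenceArray<long[]> currentArrays;
    private final int sliceSize;

    public GrowableSharedArray(int threads, int sliceSize) {
        this.sliceSize = sliceSize;
        this.mainArray = new long[threads * sliceSize];
        this.currentArrays = new AtomicReferenceArray<>(threads);
        for (int i = 0; i < threads; i++) {
            currentArrays.set(i, mainArray); // first-time initialisation
        }
    }

    // Called by the coordinating thread; no copying happens here.
    public void grow(int newLength) {
        mainArray = new long[newLength];
    }

    // Called by worker thread `threadId` before touching its slice;
    // the returned reference must be used for all accesses until the
    // next acquire.
    public long[] acquire(int threadId) {
        long[] local = mainArray;                 // grab once
        long[] known = currentArrays.get(threadId);
        if (local != known) {
            // A new array was published: copy this thread's slice over,
            // then adopt the new array.
            int from = threadId * sliceSize;
            System.arraycopy(known, from, local, from, sliceSize);
            currentArrays.set(threadId, local);
        }
        return local;
    }
}
```

The old array becomes garbage-collectable once every thread has called acquire at least once after a grow, matching the lifetime described above.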
We have a whole bunch of data sources where we consult some REST API or other and get back a list of objects. I'm trying to design an abstraction layer that doesn't need to know how to contact any specific API instance or how to semantically interpret the objects, but that guarantees that we get back a list of objects from whichever class implements the interface we need at the time.
I expect at times the numbers of results to be quite large (but always finite!) and often slow to retrieve, so I require something that does not load everything into memory all at once but allows the results of the list to be worked with as they become available. I'm fine if the list blocks on next or hasNext or whatever the appropriate analogue is.
What's the most appropriate abstraction / approach for achieving these goals and how is it implemented?
My gut tells me it ought to be some flavor of Java 8 Streams, possibly created via the Java 9 Stream.iterate method, but I'm not too familiar with functional programming paradigms and can't for the life of me figure out how one would populate the elements of the Stream as they became available from the REST calls and close it out when it's finished.
It turns out I was confusing myself by conflating two issues: how to provide an Iterator in an Interface (which is trivial), and how to populate that Iterator in the background. I ended up with roughly the following:
Create a custom abstract class which implements Iterator. That class has an internal BlockingQueue and an internal List. It also defines an abstract method which is intended to perform all the activities of population in a single invocation.
The first time hasNext() is called, kick off a daemon thread which invokes that abstract method. Then, while the thread is alive (meaning it's still populating the BlockingQueue) or the List isn't empty (meaning not all elements have been consumed via next()), poll against the BlockingQueue until it has at least one element in it. Once it does, remove that element and add it to the List. next() merely returns elements from the List.
This results in lazy loading (nothing occurs until hasNext() is called for the first time) that also happens asynchronously in the background -- the caller will be able to process things as soon as they're available (hasNext() will block if things aren't available), and it doesn't use up an unreasonable amount of memory (the BlockingQueue will block if it has too many elements).
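A simplified sketch of that design. It deviates slightly from the description above: it uses a poison-pill sentinel to signal the end of the stream instead of checking thread liveness, and holds at most one pending element instead of an internal List. All names are my own:

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Iterator populated lazily by a background daemon thread.
public abstract class BackgroundIterator<T> implements Iterator<T> {
    private static final Object END = new Object();   // poison pill
    private final BlockingQueue<Object> queue = new ArrayBlockingQueue<>(64);
    private Object next;
    private boolean started;

    // Subclasses fetch pages from the REST API and call emit() per element.
    protected abstract void populate() throws Exception;

    protected final void emit(T element) throws InterruptedException {
        queue.put(element); // blocks when full, bounding memory use
    }

    @Override
    public boolean hasNext() {
        if (!started) {                               // lazy start
            started = true;
            Thread t = new Thread(() -> {
                try { populate(); } catch (Exception ignored) { }
                finally {
                    try { queue.put(END); } catch (InterruptedException ignored) { }
                }
            });
            t.setDaemon(true);
            t.start();
        }
        if (next == null) {
            try { next = queue.take(); }              // blocks until data
            catch (InterruptedException e) { throw new RuntimeException(e); }
        }
        return next != END;
    }

    @SuppressWarnings("unchecked")
    @Override
    public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        T result = (T) next;
        next = null;
        return result;
    }
}
```

If Streams are preferred at the call site, such an Iterator can be wrapped via Spliterators and StreamSupport.stream.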
I know that threads keep the values of variables in the CPU cache of the core they run on, so the CPU doesn't have to wait as long when it needs to read those values.
But, for example, if I have this object
public class MyObject {
    int a = 2;
}
and the thread does something like this:
MyObject obj = new MyObject();
obj.a=3;
My question is:
what will be saved in the CPU cache?
The whole MyObject structure, or just the reference?
I think the whole structure (that makes more sense), but I prefer to ask because I would like to be sure about that.
I'm a newbie to multithreading, and I'm sure how a CPU cache works is more complex than this, but at the moment I just need the basic picture.
In your example, only one thread is acting. For this thread, the cache is transparent: there is no way to tell whether a value is in the cache, in main memory, or both. First, values are put in the cache, but very soon, at some unknown moment, they are pushed out.
"i would like to be sure about that" - why? Your program behaviour does not depend on this.
This question has two sides:
What the CPU does: the CPU is designed to keep in the cache everything that is needed very often. If you change a value, the CPU keeps the change in the cache until it needs to write it back to main memory (this actually depends on the CPU's strategy: write-back vs. write-through). The "need" to write to main memory arises either under program control or when the CPU decides it needs the space for other data. To answer one part of your question: to the CPU everything is data, both the value you set in Java and the internal object data structures. To access your value, you need the object's address first, so that is very probably in the cache too :)
The second point is what a Java programmer should and should not expect: this is defined very precisely in the Java Memory Model. Start here: http://en.wikipedia.org/wiki/Java_Memory_Model
So for your lines:
MyObject obj = new MyObject();
obj.a = 3;
There is no guarantee that another thread running after this code sees the new value. It may not even see your new object reference, seeing null instead. You need a synchronized block or a volatile variable.
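A minimal sketch of the volatile fix; the Publisher class and its field names are illustrative, not from the question:

```java
// volatile on the field makes writes to `a` visible to other threads.
public class MyObject {
    volatile int a = 2;
}

class Publisher {
    // Publishing the object through a volatile reference ensures a reader
    // that sees the reference also sees the fully constructed object.
    static volatile MyObject shared;

    static void writer() {
        MyObject obj = new MyObject();
        obj.a = 3;
        shared = obj; // happens-before any subsequent read of `shared`
    }

    static Integer reader() {
        MyObject obj = shared;
        return obj == null ? null : obj.a;
    }
}
```

With neither volatile nor synchronization, the Java Memory Model allows the reader to see null or the stale value 2 indefinitely.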
I know that if two threads are writing to the same place, I need to make sure they do it in a safe way so they cause no problems. But what if just one thread does all the writing while another only reads?
In my case I'm using a thread in a small game for the first time, to keep the updating separate from the rendering. The class that does all the rendering never writes to anything it reads, so I'm no longer sure whether I need to guard every read and write on everything they both share.
I will take the right steps to make sure the renderer does not try to read anything that no longer exists, but when calling getters on things like the player and other entities, should I be treating them the same way? Or would making values like x, y coordinates and booleans like "alive" volatile do the trick?
My understanding of this has become very murky and could do with some enlightening.
Edit: The shared data will be anything that needs to be drawn and moved and stored in lists of objects.
For example, the player and other entities.
With the given information it is not possible to specify an exact solution, but it is clear that you need some way to synchronize between the threads. The issue is that as long as the write operations are not atomic, you could be reading data at the moment it is being updated. This means you might, for instance, get an old y-coordinate with a new x-coordinate.
Basically, you only don't need to worry about synchronization if both threads only read the information or, even better, if all the data structures are immutable (so neither thread can modify the objects). The best way to proceed is to think first about which operations need to be atomic, and then create a solution that makes those operations atomic.
Don't forget: get it working, get it right, get it optimized (in that order).
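One common way to make such an update atomic is to publish an immutable snapshot through a single volatile reference. A sketch, with all class and field names my own:

```java
// The render thread can never observe a half-updated (x, y) pair, because
// both coordinates are replaced together via one volatile reference write.
public class Entity {

    // Immutable position: once constructed, it never changes.
    public static final class Position {
        public final double x, y;
        public Position(double x, double y) { this.x = x; this.y = y; }
    }

    private volatile Position position = new Position(0, 0);
    private volatile boolean alive = true;

    // Update thread: swap in a whole new snapshot with one write.
    public void moveTo(double x, double y) {
        position = new Position(x, y);
    }

    // Render thread: read the reference once, then use the consistent pair.
    public Position position() { return position; }

    public boolean isAlive() { return alive; }

    public void kill() { alive = false; }
}
```

Simple flags like alive can stay plain volatile fields, since a single boolean write is already atomic; the snapshot trick is only needed when several values must change together.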
You could have problems in this case if the lists' sizes are variable and you don't synchronize access to them. Consider this:
read-only thread reads mySharedList size and it sees it is 15; at that moment its CPU time finishes and read-write thread is given the CPU
read-write thread deletes an element from the list, now its size is 14.
read-only thread is granted CPU time again; it tries to read the last element using the (now stale) size it read before being interrupted, and you'll get an exception.
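A sketch of avoiding that stale-size race by holding the list's lock across the whole size-then-index sequence (the class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Collections.synchronizedList makes individual calls atomic, but a
// size()-then-get() sequence still needs an explicit synchronized block.
public class SharedListExample {
    static final List<String> mySharedList =
            Collections.synchronizedList(new ArrayList<>());

    // Render thread: without the synchronized block, the update thread
    // could remove an element between size() and get().
    static String lastOrNull() {
        synchronized (mySharedList) {
            int size = mySharedList.size();
            return size == 0 ? null : mySharedList.get(size - 1);
        }
    }
}
```

Iterating over such a list must likewise happen inside a synchronized (mySharedList) block, as the synchronizedList documentation notes.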
I have some objects that are created and destroyed very often and that can exist in many lists at the same time. To ensure no references to them are left, the objects have an isDestroyed flag, and if it is set, each list is responsible for removing the object from itself.
However, this is of course fertile ground for memory leaks. What if I forget to remove objects from one of the lists? To visually monitor that the program behaves correctly, I override finalize and increment a global variable to track destructions (not a formal test, just to get an idea). However, as I have no control over the GC, I could in theory wait forever until something is destroyed.
So the question is two-fold: When having objects that are in multiple lists, is a "isDestroyed" considered a good way to control the object lifetime? It forces everyone who uses the object to take care to remove it from their lists, which seems bad.
And, is there any good way to see when the reference count reaches zero on an object, i.e. when it's scheduled for destruction?
EDIT: To be more specific, in my case my objects represent physical entities in a room. I have one manager class that draws each object, so they are all in one list. Another list contains all the objects that are clickable. Having all objects in one list and using polymorphism or instanceof is not an option in this case. When an object is "destroyed", it should be neither shown nor clickable, so I want to remove it from both lists.
You should have a look at the java.lang.ref Package.
And, is there any good way to see when the reference count reaches
zero on an object, ie when its scheduled for destruction?
You can use a ReferenceQueue object.
From JavaDoc of java.lang.ref.ReferenceQueue
Reference queues, to which registered reference objects are appended by the garbage collector after the appropriate reachability changes are detected.
I think this is what WeakReference and ReferenceQueue are for: you create a WeakReference for the object you are tracking and associate it with a ReferenceQueue. Then you have another thread that processes the WeakReferences as they are returned from ReferenceQueue.remove(). WeakReferences are added to the ReferenceQueue when the referenced objects are GC'd. But can you give an example of the lists you are trying to clean up when the referenced objects are dead?
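A minimal sketch of that WeakReference/ReferenceQueue approach; the DestructionTracker class and its method names are my own:

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Track objects via WeakReference: the GC enqueues the reference once the
// object is unreachable, so a cleanup thread can react to its death.
public class DestructionTracker {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    // Register an object; keep only the returned WeakReference in your
    // bookkeeping, never a strong reference to the object itself.
    public WeakReference<Object> track(Object obj) {
        return new WeakReference<>(obj, queue);
    }

    // Typically run in a background thread: remove() blocks until the GC
    // has cleared one of the tracked references.
    public void awaitNextCollection() throws InterruptedException {
        queue.remove(); // returns the enqueued WeakReference
        // ... here you could count the destruction, log it, etc. ...
    }
}
```

Note this only tells you when an object has already been collected; as long as a forgotten list still holds a strong reference, the object stays reachable and the queue stays silent, which is exactly the leak symptom you can monitor for.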
The way this is usually handled is through the Observer pattern. Each list attaches a destroy-listener that gets notified upon destruction. How this meshes with your architecture, I have no details to judge from.
If you want to be notified, I'm almost sure you need PhantomReference; read here:
http://weblogs.java.net/blog/2006/05/04/understanding-weak-references