I'm making a game in Java. Every enemy in the game is a thread, and each one constantly loops through the game's data structures (I always use the Vector class).
Lately I have been getting a ConcurrentModificationException because an element is added to or removed from a Vector while a thread is iterating over it. I know there are strategies to avoid this add/remove problem (I already use some that avoid problems with removes, but I still have problems with adds).
I heard that Java supports a Vector/List that avoids the ConcurrentModificationException.
Do you have any idea of what this structure might be?
Thanks.
Check out java.util.concurrent, it has what you're looking for.
CopyOnWriteArrayList. But read its javadocs carefully and consider whether in practice it gives the behavior you are expecting (check the memory consistency effects), and whether the performance overhead is worth it. Besides that, synchronization with ReentrantReadWriteLock, AtomicReference, or Collections.synchronizedList may help you.
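To make the trade-off concrete, here is a minimal sketch (the enemy names are made up) showing why CopyOnWriteArrayList never throws ConcurrentModificationException: every iterator walks a snapshot of the backing array, so adds made during the loop are not visible to that iterator.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class CopyOnWriteDemo {
    public static void main(String[] args) {
        List<String> enemies = new CopyOnWriteArrayList<>();
        enemies.add("goblin");
        enemies.add("orc");

        int seen = 0;
        for (String e : enemies) {
            // Adding during iteration is safe (no exception), but the
            // iterator will NOT see "troll": it walks a snapshot taken
            // when the loop started.
            enemies.add("troll");
            seen++;
        }
        System.out.println(seen);            // 2 - the snapshot size
        System.out.println(enemies.size());  // 4 - both trolls were added
    }
}
```

Each write copies the whole backing array, so this is only cheap when reads vastly outnumber writes, which is exactly the caveat about performance overhead above.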
Is there a way to iterate a Collection and retrieve only a subset of attributes, without loading/unloading each full object into the cache? It seems like a waste to load/unload the WHOLE (possibly big) object when I need only some attribute(s), especially if the objects are big. It might also cause unnecessary cache conflicts when loading that unneeded data, right?
To clarify, by "load to cache" I mean "process that object via the processor". Say there are objects with ten attributes each, and in the iterating loop I only use one of them. In such a scenario, I think it's a waste to load the other nine attributes from memory into the processor. Isn't there a way to extract just the attributes I need without loading the full object?
Also, does something like Google's Guava solve the problem internally?
THANK YOU!
It's not usually the first place to look, but it's certainly not impossible that you're running into cache sharing problems. If you're really convinced (from realistic profiling or analysis of hardware counters) that this is a bottleneck worth addressing, you might consider altering your data structures to use parallel arrays of primitives (akin to column-based storage in some database architectures), e.g. one "column" as a float[], another as a short[], a third as a String[], all indexed by the same identifier. This structure allows you to "query" individual columns without loading into cache any columns that aren't currently needed.
I have some low-level algorithmic code that would really benefit from C's struct. I ran some microbenchmarks on various alternatives and found that parallel arrays was the most effective option for my algorithms (that may or may not apply to your own).
Note that a parallel-array structure will be considerably more complex to maintain and mutate than using Objects in java.util collections. So I'll reiterate - I'd only take this approach after you've convinced yourself that the benefit will be worth the pain.
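As a rough sketch of the parallel-array layout described above (the field names here are illustrative, not taken from the question): each attribute lives in its own primitive array, so a scan over one attribute touches only that array.

```java
// Column-oriented layout: one array per attribute, all sharing
// the same index space.
public class EntityColumns {
    private final float[] x;      // column 1: a position coordinate
    private final short[] health; // column 2: hit points
    private final String[] name;  // column 3: labels

    public EntityColumns(int capacity) {
        x = new float[capacity];
        health = new short[capacity];
        name = new String[capacity];
    }

    public void set(int i, float xi, short h, String n) {
        x[i] = xi;
        health[i] = h;
        name[i] = n;
    }

    // Scanning one "column" streams through a single contiguous
    // float[]; the health and name columns never enter the cache.
    public float sumX() {
        float sum = 0f;
        for (float v : x) {
            sum += v;
        }
        return sum;
    }
}
```

Compare this with an `Entity[]` of ten-field objects, where each element visited pulls the whole object (or at least its cache lines) in just to read one field.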
There is no way in Java to manage loading to processor caches, and there is no way to change how the JVM works with objects, so the answer is no.
Java is not a low-level language and hides such details from the programmer.
The JVM will decide how much of the object it loads. It might load the whole object as some kind of read-ahead optimization, or load only the fields you actually access, or analyze the code during JIT compilation and do a combination of both.
Also, how large are the objects you're worried about? I have rarely seen classes with more than a few fields, and I would not consider that big.
Is there a thread-safe implementation of a tree in Java? I have found a bit of information that recommends using synchronized() around the add and remove methods, but I'm interested in seeing if there is anything built into Java.
Edit: I am trying to use an Octree. Just learning as I go, but I am using this project to learn both multi-threading and spatial indexing so there are lots of new topics for me here. If anyone has some particularly good reference material please do share.
From the documentation for TreeMap:
SortedMap m = Collections.synchronizedSortedMap(new TreeMap(...));
Note that this only makes each call synchronized. In many cases this is the wrong granularity for an application and you are better off synchronizing at a higher level. See the docs for synchronizedSortedMap.
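A minimal sketch of what "the wrong granularity" means in practice: the wrapper synchronizes each call, but iteration is a sequence of calls, so the docs require you to hold the map's monitor yourself for the whole traversal.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

public class SyncSortedMapDemo {
    public static void main(String[] args) {
        SortedMap<Integer, String> m =
                Collections.synchronizedSortedMap(new TreeMap<>());
        m.put(1, "a");
        m.put(2, "b");

        // put()/get() are individually synchronized, but a traversal
        // spans many calls, so another thread could mutate the map
        // between them. Lock the whole iteration manually:
        synchronized (m) {
            for (Integer k : m.keySet()) {
                System.out.println(k + "=" + m.get(k));
            }
        }
    }
}
```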
You can use Collections.synchronizedSet() or synchronizedMap() to add synchronization around individual methods, but thread safety isn't really a property of a data structure; it's a property of an application. The wrapper will not be sufficient if you iterate over the tree, or perform a series of operations that need to be atomic.
A java.util.concurrent.ConcurrentSkipListMap might be of interest. This is overkill for most uses, but if you need fine-grained synchronization there's nothing like it. And overkill beats underkill. Of course, it's not a tree, but does the same job. I do not believe you can get low-level synchronization in a real tree.
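For the sorted, tree-like use case (such as the octree-adjacent spatial indexing mentioned in the question), a short sketch of ConcurrentSkipListMap: it keeps keys ordered like a TreeMap does, but is safe for concurrent reads and writes without any external locking.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class SkipListDemo {
    public static void main(String[] args) {
        // Sorted like a TreeMap, but lock-free and safe to use from
        // many threads at once.
        ConcurrentNavigableMap<Integer, String> map =
                new ConcurrentSkipListMap<>();
        map.put(30, "c");
        map.put(10, "a");
        map.put(20, "b");

        System.out.println(map.firstKey());           // 10
        // Range queries work like TreeMap's, handy for spatial lookups:
        System.out.println(map.headMap(30).keySet()); // [10, 20]
    }
}
```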
I need to hold a large number of elements (500k or so) in a list or a set, and I need high-performance traversal, addition, and removal. This will be done in a multithreaded environment, and I don't care if I get to see updates made after traversal began (weakly consistent). What Java collection is right for this scenario?
I need to hold a large number of elements (500k or so) in a list or a set I need to do high performance traversal, addition and removal. ... This will be done in a multithreaded environment
ConcurrentSkipListMap - it's not a List, but List semantics are practically useless in a concurrent environment anyway.
It keeps the elements sorted in a tree-like structure rather than accessible via hashing, so you need a natural ordering (or an external one via a Comparator).
If you need only add/remove at the ends of a Queue - ConcurrentLinkedQueue.
Synchronized collections are not suited to a multithreaded environment if you expect even moderate contention, and they require holding the lock for the entire traversal as well. I'd advise against ConcurrentHashMap too.
In the end: if you are going for a real multi-CPU setup (64+ cores), expect high contention, and don't want natural ordering, follow this link: http://sourceforge.net/projects/high-scale-lib
Here is a very good article on selecting a collection depending on your application:
http://www.developer.com/java/article.php/3829891/Selecting-the-Best-Java-Collection-Class-for-Your-Application.htm
you can try this as well
http://www.javamex.com/tutorials/collections/how_to_choose.shtml
If traversal == read, and add/remove == update, I'd say that it's not often that a single collection is optimized for both operations.
But your best bet is likely to be a HashMap.
Multithreaded - so look at j.u.concurrent. Maybe ConcurrentHashMap used as a Set - e.g. use put(x, x) instead of add(x).
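A small sketch of the "ConcurrentHashMap used as a Set" idea: rather than calling put(x, x) by hand, Collections.newSetFromMap (available since Java 6) gives you a proper Set view over any empty map with dummy values.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentSetDemo {
    public static void main(String[] args) {
        // A thread-safe Set backed by a ConcurrentHashMap; the map's
        // Boolean values are dummies the wrapper manages for you.
        Set<String> set = Collections.newSetFromMap(
                new ConcurrentHashMap<String, Boolean>());
        set.add("a");
        set.add("b");
        set.add("a"); // duplicate, ignored like any Set

        System.out.println(set.size()); // 2
    }
}
```

Iteration over the backing ConcurrentHashMap is weakly consistent, which matches the question's stated tolerance for seeing (or missing) updates made after traversal began.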
If you do additions and removals often, then something "linked" is probably the best choice. That way every add/remove only updates a couple of node links, in contrast to an ArrayList, for example, where part of the backing array has to be shifted. The problem is that you are asking for the holy grail of collections.
Taking a look at the Concurrent Collections might help.
But what do you mean by "traversal"?
If you need to add or remove items in the middle of a list quickly, LinkedList is a good choice. To use it in a multithreaded environment, you need to synchronize it like this:
List l = Collections.synchronizedList(new LinkedList());
On the other hand, given the large size of the data, is it possible to store it in a database and use the in-memory collection as a cache?
Are duplicate items allowed?
If yes, a Set can't be used; otherwise you could use a SortedSet.
The project I am working on requires a whole bunch of queries against a database. In principle there are two types of queries I am using:
read from excel file, check for a couple of parameters and do a query for hits in the database. These hits are then registered as a series of custom classes. Any hit may (and most likely will) occur more than once so this part of the code checks and updates the occurrence in a custom list implementation that extends ArrayList.
for each hit found, do a detail query and parse the output, so that the classes created in (I) get detailed info.
I figured I would use multiple threads to save time. However, I can't really come up with a good way to handle the collection these items are stored in. To elaborate a little: throughout execution, objects are supposed to be modified by both (I) and (II).
I deliberately didn't copy/paste any code, as it would take big chunks of it to make any sense. I hope the description above makes some sense.
Thanks,
In Java 5 and above, you may either use CopyOnWriteArrayList or a synchronized wrapper around your list. In earlier Java versions, only the latter choice is available. The same is true if you absolutely want to stick to the custom ArrayList implementation you mention.
CopyOnWriteArrayList is feasible if the container is read much more often than written (changed), which seems to be true based on your explanation. Its atomic addIfAbsent() method may even help simplify your code.
[Update] On second thought, a map sounds more fitting to the use case you describe. So if changing from a list to e.g. a map is an option, you should consider ConcurrentHashMap. [/Update]
Changing the objects within the container does not affect the container itself, however you need to ensure that the objects themselves are thread-safe.
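A rough sketch of how addIfAbsent() could simplify the hit-registration step; the HitRegistry class and the idea of keying hits by a string ID are assumptions for illustration, not from the question.

```java
import java.util.concurrent.CopyOnWriteArrayList;

public class HitRegistry {
    private final CopyOnWriteArrayList<String> hits =
            new CopyOnWriteArrayList<>();

    // Safe to call from both the search pass (I) and the detail
    // pass (II): addIfAbsent is atomic, so each hit is registered
    // exactly once even when two threads see it at the same time.
    public boolean register(String hitId) {
        return hits.addIfAbsent(hitId);
    }

    public int count() {
        return hits.size();
    }
}
```

The same "register once, enrich later" pattern is what makes the ConcurrentHashMap suggestion in the update attractive: putIfAbsent() plays the same role with map semantics.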
Just use the java.util.concurrent package.
Classes like ConcurrentLinkedQueue and ConcurrentHashMap are already there for you to use and are all thread-safe.
I'm looking for a good hash map implementation. Specifically, one that's good for creating a large number of maps, most of them small. So memory is an issue. It should be thread-safe (though losing the odd put might be an OK compromise in return for better performance), and fast for both get and put. And I'd also like the moon on a stick, please, with a side-order of justice.
The options I know are:
HashMap. Disastrously un-thread safe.
ConcurrentHashMap. My first choice, but this has a hefty memory footprint - about 2k per instance.
Collections.synchronizedMap(HashMap). That's working OK for me, but I'm sure there must be faster alternatives.
Trove or Colt - I think neither of these are thread-safe, but perhaps the code could be adapted to be thread safe.
Any others? Any advice on what beats what when? Any really good new hash map algorithms that Java could use an implementation of?
Thanks in advance for your input!
Collections.synchronizedMap() simply makes all the Map methods synchronized.
ConcurrentMap is really the interface you want, and there are several implementations (e.g. ConcurrentHashMap, ConcurrentSkipListMap). It has several operations that Map doesn't, which matter for thread-safe code. Plus it is more granular than a synchronized Map: an operation locks only a slice of the backing data structure rather than the entire thing.
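To show what those extra operations buy you, a short sketch of the atomic check-then-act methods on ConcurrentMap, each of which would be a race-prone two-step dance on a plain synchronized Map:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class ConcurrentMapOps {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> m = new ConcurrentHashMap<>();

        m.putIfAbsent("a", 1);  // inserts 1 atomically
        m.putIfAbsent("a", 99); // no-op: "a" is already present
        m.replace("a", 1, 2);   // succeeds only if the value is still 1
        m.remove("a", 5);       // no-op: the value is 2, not 5

        System.out.println(m.get("a")); // 2
    }
}
```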
I have no experience with the following, but I once worked on a project whose team swore by Javolution for real-time and memory-sensitive tasks.
I notice in the API there is a FastMap that claims to be thread-safe. As I say, I've no idea if it's any good for you, but it's worth a look:
API for FastMap
Javolution Home
Google Collection's MapMaker seems like it can do the job too.
Very surprising that it has a 2k footprint! How about lowering ConcurrentHashMap's concurrency setting (e.g. to 2-3) and shrinking its initial size?
I don't know where that memory consumption comes from, but it may have something to do with maintaining the striped locks. If you lower the concurrency setting, there will be fewer of them.
If you want good performance with out-of-the-box thread safety, ConcurrentHashMap is really nice.
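A minimal sketch of the tuning suggested above, using the three-argument ConcurrentHashMap constructor (initial capacity, load factor, concurrency level):

```java
import java.util.concurrent.ConcurrentHashMap;

public class SmallConcurrentMap {
    public static void main(String[] args) {
        // A small initial capacity plus a low concurrency level
        // (fewer lock stripes) trims the per-instance footprint for
        // maps that see little write contention.
        ConcurrentHashMap<String, String> m =
                new ConcurrentHashMap<>(4, 0.75f, 2);
        m.put("k", "v");
        System.out.println(m.get("k")); // v
    }
}
```

Note that on Java 8 and later the concurrency level is only a sizing hint (the striped-lock design was replaced), so the savings apply mainly to the older implementations this thread is discussing.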
Well, there's a spruced-up Colt in Apache Mahout. It's still not in the concurrent business, though. What's wrong with protecting the code with a synchronized block? Are you expecting some devilishly complex scheme that holds locks at a smaller granularity than put or get?
If you can code one, please contribute it to Mahout.
It's worth taking a look at the persistent hash maps in Clojure.
These are immutable, thread safe data structures with performance comparable to classic Java HashMaps. You'd obviously need to wrap them if you want a mutable map, but that shouldn't be difficult.
http://clojure.org/data_structures