Simple database-like collection class in Java

The problem: Maintain a bidirectional many-to-one relationship among java objects.
Something like the Google/Commons Collections bidi maps, but I want to allow duplicate values on the forward side, and have sets of the forward keys as the reverse side values.
It would be used something like this:
// maintaining disjoint areas on a gameboard. Location is a space on the
// gameboard; Regions refer to disjoint collections of Locations.
MagicalManyToOneMap<Location, Region> forward = // the game universe
Map<Region, Set<Location>> inverse = forward.getInverse(); // live, not a copy
Location parkplace = Game.chooseSomeLocation(...);
Region mine = forward.get(parkplace); // assume !null; should be O(log n)
Region other = Game.getSomeOtherRegion(...);
// moving a Location from one Region to another:
forward.put(parkplace, other);
// or equivalently:
inverse.get(other).add(parkplace); // should also be O(log n) or so
// expected consistency:
assert ! inverse.get(mine).contains(parkplace);
assert forward.get(parkplace) == other;
// and this should be fast, not iterate every possible location just to filter for mine:
for (Location l : mine) { /* do something clever */ }
The simple Java approaches are:
1. Maintain only one side of the relationship, either as a Map<Location, Region> or a Map<Region, Set<Location>>, and collect the inverse relationship by iteration when needed; or
2. Make a wrapper that maintains both sides' Maps, and intercept all mutating calls to keep both sides in sync.
Approach 1 is O(n) instead of O(log n), which is becoming a problem. I started in on approach 2 and was in the weeds straightaway. (Do you know how many different ways there are to alter a Map entry?)
This is almost trivial in the sql world (Location table gets an indexed RegionID column). Is there something obvious I'm missing that makes it trivial for normal objects?

I might be misunderstanding your model, but if your Location and Region have correct equals() and hashCode() implementations, then the Location -> Region side is just a classic Map (multiple distinct keys can point to the same value object). The Region -> Set of Locations side is a Multimap (available in Google Collections/Guava). You could compose your own class with the proper add/remove methods to keep both submaps in sync.
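A minimal sketch of that composition, assuming Guava's HashMultimap for the inverse side (the class name and its methods are illustrative, not from the original answer):
import com.google.common.collect.HashMultimap;
import com.google.common.collect.SetMultimap;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical composed class: keeps a forward Map and an inverse Multimap in sync.
public class ManyToOneIndex<L, R> {
    private final Map<L, R> forward = new HashMap<>();
    private final SetMultimap<R, L> inverse = HashMultimap.create();

    public void put(L l, R r) {
        R old = forward.put(l, r);  // update forward side
        if (old != null) {
            inverse.remove(old, l); // undo the stale inverse entry
        }
        inverse.put(r, l);          // record the new inverse entry
    }

    public R get(L l) {
        return forward.get(l);
    }

    public Set<L> getInverse(R r) {
        return inverse.get(r);      // a live view in Guava's Multimap
    }
}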
Maybe overkill, but you could also use an in-memory SQL database (HSQLDB, etc.). That lets you create indexes on as many columns as you need.
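For illustration, a rough sketch of the in-memory route over plain JDBC (the table and column names are made up for the example; assumes the HSQLDB driver is on the classpath):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Hypothetical schema mirroring the Location/Region model.
public class InMemoryDb {
    public static void main(String[] args) throws Exception {
        // The "mem:" URL keeps the whole database in memory.
        try (Connection conn = DriverManager.getConnection("jdbc:hsqldb:mem:game", "SA", "");
             Statement st = conn.createStatement()) {
            st.execute("CREATE TABLE location (id INT PRIMARY KEY, region_id INT)");
            // The index is what makes the Region -> Locations lookup fast.
            st.execute("CREATE INDEX idx_region ON location (region_id)");
        }
    }
}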

I think you could achieve what you need with the following two classes. While it does involve two maps, they are not exposed to the outside world, so there shouldn't be a way for them to get out of sync. As for storing the same "fact" twice, I don't think you'll get around that in any efficient implementation, whether the fact is stored twice explicitly as it is here, or implicitly as it would be when your database creates an index to make joins on your two tables more efficient. You can add new things to the MagicSet and it will update both mappings, or you can add things to the MagicMapper, which will then update the inverse map automatically. The girlfriend is calling me to bed now so I cannot run this through a compiler, but it should be enough to get you started. What puzzle are you trying to solve?
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

public class MagicSet<L, R> {
    private final Map<L, R> forward;
    private final R r;
    private final Set<L> set;

    public MagicSet(Map<L, R> forward, R r) {
        this.forward = forward;
        this.r = r;
        this.set = new HashSet<L>();
    }

    public void add(L l) {
        set.add(l);
        forward.put(l, r);
    }

    public void remove(L l) {
        set.remove(l);
        forward.remove(l);
    }

    public int size() {
        return set.size();
    }

    public boolean contains(L l) {
        return set.contains(l);
    }

    // Caution: do not use the remove method from this iterator. If this class were going
    // to be reused often, you would want to return a wrapped iterator that handled the
    // remove method properly. In fact, if you did that, I think you could then extend
    // AbstractSet and MagicSet would fully implement java.util.Set.
    public Iterator<L> iterator() {
        return set.iterator();
    }
}
import java.util.HashMap;
import java.util.Map;

public class MagicMapper<L, R> { // note that it doesn't implement Map, though it could with some extra work. I don't get the impression you need that, though.
    private final Map<L, R> forward;
    private final Map<R, MagicSet<L, R>> inverse;

    public MagicMapper() {
        forward = new HashMap<L, R>();
        inverse = new HashMap<R, MagicSet<L, R>>();
    }

    public R getForward(L key) {
        return forward.get(key);
    }

    // returns the live MagicSet (not a java.util.Set; see the iterator note above).
    // This assumes you want a null if you try to use a key that has no mapping;
    // otherwise you'd return an empty MagicSet.
    public MagicSet<L, R> getBackward(R key) {
        return inverse.get(key);
    }

    public void put(L l, R r) {
        R oldVal = forward.get(l);
        // if the L already belonged to an R, we need to undo that mapping
        MagicSet<L, R> oldSet = inverse.get(oldVal);
        if (oldSet != null) { oldSet.remove(l); }
        // now get the set the R belongs to, and add the L to it
        MagicSet<L, R> newSet = inverse.get(r); // bug fix: look up by r, not l
        if (newSet == null) {
            newSet = new MagicSet<L, R>(forward, r);
            inverse.put(r, newSet);
        }
        newSet.add(l); // magically updates the "forward" map
    }
}
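A quick usage sketch (my own illustration, not part of the original answer) of how the two classes stay in sync:
MagicMapper<Location, Region> map = new MagicMapper<Location, Region>();
map.put(parkplace, mine);   // forward: parkplace -> mine; inverse: mine -> {parkplace}
map.put(parkplace, other);  // moves parkplace; the "mine" set no longer contains it
assert map.getForward(parkplace) == other;
assert !map.getBackward(mine).contains(parkplace);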

Related

What is the fastest and most concise/correct way to implement this model class backed by values in a 2-dimensional array?

I solved this problem using a graph, but unfortunately now I'm stuck with having to use a 2d array and I have questions about the best way to go about this:
public class Data {
    int[][] structure;

    public Data(int x, int y) {
        structure = new int[x][y];
    }

    public <<TBD>> generateRandom() {
        // This is what my question is about
    }
}
I have a controller/event handler class:
public class Handler implements EventHandler {
    @Override
    public void onEvent(Event<T> e) {
        this.dataInstance.generateRandom();
        // ... other stuff
    }
}
Here is what each method will do:
Data.generateRandom() will generate a random value at a random location in the 2d int array if there exists a value in the structure that is not initialized or is equal to zero
If there is no available spot in the structure, the structure's state is final (i.e. in the literal sense, not the Java declaration)
This is what I'm wondering:
What is the most efficient way to check if the board is full? Using a graph, I was able to check if the board was full in O(1) and get an available yet also random location in worst-case O(n^2 - 1), best case O(1). Obviously now with an array improving n^2 is tough, so I'm just now focusing on execution speed and LOC. Would the fastest way now be to check the entire 2d array using streams, like:
Arrays.stream(board).flatMapToInt(tile -> tile.getX()).map(x -> x > 0).count() > board.getWidth() * board.getHeight()
(1) You can definitely use a parallel stream to safely perform read-only operations on the array. You can also use an anyMatch call, since all you care about (for the isFull check) is whether any one space hasn't been initialized. That could look like this:
Arrays.stream(structure)
      .parallel()
      .flatMapToInt(Arrays::stream)
      .anyMatch(i -> i == 0)
However, that is still an n^2 solution. What you could do, though, is keep a counter of the number of spaces possible that you decrement when you initialize a space for the first time. Then the isFull check would always be constant time (you're just comparing an int to 0).
import java.util.concurrent.ThreadLocalRandom;

public class Data {
    private int numUninitialized;
    private int[][] structure;

    public Data(int x, int y) {
        if (x <= 0 || y <= 0) {
            throw new IllegalArgumentException("You can't create a Data object with an argument that isn't a positive integer.");
        }
        structure = new int[x][y];
        numUninitialized = x * y; // bug fix: assign the field, don't shadow it with a local
    }

    public void generateRandom() {
        if (isFull()) {
            // do whatever you want when the array is full
        } else {
            // Calculate the random space you want to set a value for
            int x = ThreadLocalRandom.current().nextInt(structure.length);
            int y = ThreadLocalRandom.current().nextInt(structure[0].length);
            if (structure[x][y] == 0) {
                // A new, uninitialized space
                numUninitialized--;
            }
            // Populate the space with a random value (note: nextInt can return 0,
            // which this scheme would treat as uninitialized again)
            structure[x][y] = ThreadLocalRandom.current().nextInt(Integer.MIN_VALUE, Integer.MAX_VALUE);
        }
    }

    public boolean isFull() {
        return 0 == numUninitialized;
    }
}
Now, this is with my understanding that each time you call generateRandom you take a random space (including ones already initialized). If you are supposed to ONLY choose a random uninitialized space each time it's called, then you'd do best to hold an auxiliary data structure of all the possible grid locations, so that you can easily pick a random open space and tell if the structure is full; a sketch of that idea follows.
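A minimal sketch of such an auxiliary structure (my illustration; the class and method names are made up), using a swap-with-last trick so picking a random open cell is O(1):
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Keeps every still-open (x, y) cell; picking is O(1) via swap-with-last removal.
class OpenCells {
    private final List<int[]> open = new ArrayList<>();

    OpenCells(int width, int height) {
        for (int x = 0; x < width; x++)
            for (int y = 0; y < height; y++)
                open.add(new int[] { x, y });
    }

    boolean isFull() {
        return open.isEmpty();
    }

    /** Removes and returns a uniformly random open cell, or null if full. */
    int[] takeRandom() {
        if (open.isEmpty()) return null;
        int i = ThreadLocalRandom.current().nextInt(open.size());
        int[] cell = open.get(i);
        // Swap the last element into slot i so removal is O(1), not O(n).
        open.set(i, open.get(open.size() - 1));
        open.remove(open.size() - 1);
        return cell;
    }
}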
(2) What notification method is appropriate for letting other classes know the array is now immutable? It's kind of hard to say as it depends on the use case and the architecture of the rest of the system this is being used in. If this is an MVC application with a heavy use of notifications between the data model and a controller, then an observer/observable pattern makes a lot of sense. But if your application doesn't use that anywhere else, then perhaps just having the classes that care check the isFull method would make more sense.
(3) Java is efficient at creating and freeing short lived objects. However, since the arrays can be quite large I'd say that allocating a new array object (and copying the data) over each time you alter the array seems ... inefficient at best. Java has the ability to do some functional types of programming (especially with the inclusion of lambdas in Java 8) but only using immutable objects and a purely functional style is kind of like the round hole to Java's square peg.

Efficient search in datastructure ArrayList

I have an ArrayList which contains my nodes. A node has a source, a target and a cost. I currently have to iterate over the whole ArrayList, which takes a while for over 1000 nodes. I therefore tried sorting the list by source and then using binary search to find the matching pair. Unfortunately that only works if I compare either source or target, but I have to compare both to get the right pair. Is there another way to search an ArrayList efficiently?
Unfortunately, no. ArrayLists are not made to be searched efficiently; they are made to store data, not to search it. If you merely want to know whether an item is contained, I would suggest you use a HashSet, as the lookup will have a time complexity of O(1) instead of O(n) for the ArrayList (assuming that you have implemented a working equals method for your objects).
If you want to do fast searches for objects, I recommend using an implementation of Dictionary such as HashMap. If you can afford the space requirement, you can have multiple maps, each with a different key, to get a fast lookup of your object no matter what key you have to search by. Keep in mind that the lookup also requires a correct equals method. Unfortunately, this requires that each key be unique, which may not be a brilliant idea in your case.
However, you can use a HashMap to store, for each source, a list of the nodes that have that source. You can do the same for cost and target. That way you can substantially reduce the number of nodes you need to iterate over. This should prove to be a good solution for a sparsely connected network.
private HashMap<Source, ArrayList<Node>> sourceMap = new HashMap<Source, ArrayList<Node>>();
private HashMap<Target, ArrayList<Node>> targetMap = new HashMap<Target, ArrayList<Node>>();
private HashMap<Cost, ArrayList<Node>> costMap = new HashMap<Cost, ArrayList<Node>>();

/** Look for a node with a given source */
for (Node node : sourceMap.get(keySource))
{
    /** Test the node for equality with a given node. Equals method below */
    if (node.equals(nodeYouAreLookingFor)) { return node; }
}
In order to be sure that your code will work, be sure to overwrite the equals method. I know I have said so already but this is a very common mistake.
@Override
public boolean equals(Object object)
{
    if (object instanceof Node)
    {
        Node node = (Node) object;
        return source.equals(node.getSource()) && target.equals(node.getTarget());
    }
    return false;
}
If you don't, the test will simply compare references which may or may not be equal depending on how you handle your objects.
Edit: Just read what you base your equality on. The equals method should be implemented in your Node class. However, for it to work, you need to implement and override the equals method for the source and target too, that is, if they are objects. Be watchful though: if they are Nodes too, this may result in quite some tests spanning all of the network.
Update: Added code to reflect the purpose of the code in the comments.
ArrayList<Node> matchingNodes = new ArrayList<Node>(sourceMap.get(desiredSource)); // copy first: retainAll mutates in place and returns a boolean
matchingNodes.retainAll(targetMap.get(desiredTarget));
Now you have a list of all nodes that match the source and target criteria. Provided that you are willing to sacrifice a bit of memory, the lookup above will have a complexity of O(|sourceMap| * (|sourceMap| + |targetMap|)). While this is superior to a plain linear lookup over all nodes, O(|allNodeList|), if your network is big enough (and with 1000 nodes I think it is) you could benefit a lot. If your network resembles a naturally occurring network then, as Albert-László Barabási has shown, it is likely scale-free. This means that splitting your network into lists by at least source and target will likely (I have no proof of this) result in a scale-free size distribution of those lists. Therefore, I believe the complexity of looking up source and target will be substantially reduced, as |sourceMap| and |targetMap| should be much smaller than |allNodeList|.
You'll need to combine the source and target into a single comparator, e.g.
public int compare(Node o1, Node o2) {
    if (o1.source < o2.source) { return -1; }
    else if (o1.source > o2.source) { return 1; }
    // else o1.source == o2.source
    else if (o1.target < o2.target) { return -1; }
    else if (o1.target > o2.target) { return 1; }
    else return 0;
}
You can use the compareTo() method to compare your nodes.
You can create two ArrayLists: the first sorted by source, the second sorted by target.
Then you can search by source or target using binarySearch on the corresponding list, as sketched below.
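A rough sketch of that two-list idea, assuming Node has numeric source/target fields and a (source, target, cost) constructor (both assumptions of mine):
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// One copy of the node list per sort key.
Comparator<Node> bySource = Comparator.comparingInt((Node n) -> n.source);
List<Node> sortedBySource = new ArrayList<>(allNodes);
Collections.sort(sortedBySource, bySource);

// A probe node carries the key we're searching for; its cost is irrelevant here.
Node probe = new Node(desiredSource, desiredTarget, 0);
int i = Collections.binarySearch(sortedBySource, probe, bySource);
if (i >= 0) {
    // Nodes with equal source are adjacent; scan outwards from i to match the target too.
}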
You can make a helper class to store source-target pairs:
class SourceTarget {
    public final Source source; // public fields are OK when they're final and immutable.
    public final Target target; // you can use getters but I'm lazy
    // (don't give this object setters. Map keys should ideally be immutable)

    public SourceTarget( Source s, Target t ){
        source = s;
        target = t;
    }

    @Override
    public boolean equals( Object other ){
        // Only equal when both source and target are equal
        if( !(other instanceof SourceTarget) ){ return false; }
        SourceTarget st = (SourceTarget) other;
        return source.equals( st.source ) && target.equals( st.target );
    }

    @Override
    public int hashCode(){
        // Implemented consistently with equals
        return 31 * source.hashCode() + target.hashCode();
    }
}
Then store your things in a HashMap<SourceTarget, List<Node>>, with each source-target pair mapped to the list of nodes that have exactly that source-target pair.
To retrieve just use
List<Node> results = map.get( new SourceTarget( node.source, node.target ) );
As an alternative to making a helper class, you can use the comparator in Zim-Zam's answer and a TreeMap<Node, List<Node>>, with a representative Node object acting as the SourceTarget pair; a sketch follows.
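A brief sketch of that TreeMap variant, assuming the combined source/target comparator from above is available as bySourceThenTarget (that name and the Node constructor are my assumptions):
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// The comparator orders by (source, target), so any Node carrying the right
// source and target works as a lookup key; its cost is ignored.
TreeMap<Node, List<Node>> map = new TreeMap<>(bySourceThenTarget);
for (Node n : allNodes) {
    map.computeIfAbsent(n, k -> new ArrayList<>()).add(n);
}
Node probe = new Node(desiredSource, desiredTarget, 0);
List<Node> results = map.get(probe); // O(log n) lookup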

Limited SortedSet

I'm looking for an implementation of SortedSet with a limited number of elements, so that if more elements are added than the specified maximum, the comparator decides whether to add the item and drop the last one from the set.
SortedSet<Integer> t1 = new LimitedSet<Integer>(3);
t1.add(5);
t1.add(3);
t1.add(1);
// [1,3,5]
t1.add(2);
// [1,2,3]
t1.add(9);
// [1,2,3]
t1.add(0);
// [0,1,2]
Is there an elegant way in the standard API to accomplish this?
I've written a JUnit test for checking implementations:
@Test
public void testLimitedSortedSet() {
    final LimitedSortedSet<Integer> t1 = new LimitedSortedSet<Integer>(3);
    t1.add(5);
    t1.add(3);
    t1.add(1);
    System.out.println(t1); // [1, 3, 5]
    t1.add(2);
    System.out.println(t1); // [1, 2, 3]
    t1.add(9);
    System.out.println(t1); // [1, 2, 3]
    t1.add(0);
    System.out.println(t1); // [0, 1, 2]
    Assert.assertTrue(3 == t1.size());
    Assert.assertEquals(Integer.valueOf(0), t1.first());
}
With the standard API you'd have to do it yourself, i.e. extend one of the sorted set classes and add the logic you want to the add() and addAll() methods. Shouldn't be too hard.
Btw, I don't fully understand your example:
t1.add(9);
// [1,2,3]
Shouldn't the set contain [1,2,9] afterwards?
Edit: I think now I understand: you want to only keep the smallest 3 elements that were added to the set, right?
Edit 2: An example implementation (not optimised) could look like this:
class LimitedSortedSet<E> extends TreeSet<E> {

    private final int maxSize;

    LimitedSortedSet( int maxSize ) {
        this.maxSize = maxSize;
    }

    @Override
    public boolean addAll( Collection<? extends E> c ) {
        boolean added = super.addAll( c );
        if( size() > maxSize ) {
            E firstToRemove = (E)toArray()[maxSize];
            removeAll( tailSet( firstToRemove ) );
        }
        return added;
    }

    @Override
    public boolean add( E o ) {
        boolean added = super.add( o );
        if( size() > maxSize ) {
            E firstToRemove = (E)toArray()[maxSize];
            removeAll( tailSet( firstToRemove ) );
        }
        return added;
    }
}
Note that tailSet() returns the subset including the parameter (if it is in the set). This means that if you can't calculate the next higher value (which doesn't need to be in the set), you'll have to re-add that element. This is done in the code above.
If you can calculate the next value, e.g. if you have a set of integers, doing something like tailSet( lastElement + 1 ) would be sufficient and you'd not have to re-add the last element.
Alternatively you can iterate over the set yourself and remove all elements that follow the last one you want to keep, for example as sketched below.
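A compact version of that trim, assuming the same maxSize field as above; pollLast() is a NavigableSet method that TreeSet provides:
// Trim the set down to maxSize by repeatedly removing the greatest element.
private void trimToMaxSize() {
    while( size() > maxSize ) {
        pollLast(); // removes and returns the last (greatest) element
    }
}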
Another alternative, although that might be more work, would be to check the size before inserting an element and remove accordingly.
Update: as msandiford correctly pointed out, the first element that should be removed is the one at index maxSize. Thus there's no need to readd (re-add?) the last wanted element.
Important note:
As @DieterDP correctly pointed out, the implementation above violates the Collection#add() API contract, which states that if a collection refuses to add an element for any reason other than it being a duplicate, an exception must be thrown.
In the example above the element is first added but might be removed again due to size constraints, or other elements might be removed, so this violates the contract.
To fix that you might want to change add() and addAll() to throw exceptions in those cases (or maybe in any case, in order to make them unusable) and provide alternate methods to add elements which don't violate any existing API contract.
In any case the above example should be used with care, since using it with code that isn't aware of the violations might result in unwanted and hard-to-debug errors.
I'd say this is a typical application for the decorator pattern, similar to the decorator collections exposed by the Collections class: unmodifiableXXX, synchronizedXXX, singletonXXX etc. I would take Guava's ForwardingSortedSet as base class, and write a class that decorates an existing SortedSet with your required functionality, something like this:
public final class SortedSets {
    public static <T> SortedSet<T> maximumSize(
            final SortedSet<T> original, final int maximumSize ){
        return new ForwardingSortedSet<T>() {
            @Override
            protected SortedSet<T> delegate() {
                return original;
            }
            @Override
            public boolean add(final T e) {
                if( original.size() < maximumSize ){
                    return original.add(e);
                }else return false;
            }
            // implement other methods accordingly
        };
    }
}
No, there is nothing like that in the existing Java library.
But you can build one like the class below using composition. I believe it will be easy.
public class LimitedSet<E> implements SortedSet<E> {

    private final TreeSet<E> treeSet = new TreeSet<E>();
    private final int expectedSize;

    public LimitedSet(int expectedSize) {
        this.expectedSize = expectedSize;
    }

    public boolean add(E e) {
        boolean result = treeSet.add(e);
        if (treeSet.size() > expectedSize) {
            // remove the one you like ;)
        }
        return result;
    }

    // all other methods delegate to the "treeSet"
}
UPDATE
After reading your comment
As you always need to remove the last element:
you can consider maintaining a stack internally
it will increase memory complexity by O(n)
but it's possible to retrieve the last element in just O(1)... constant time
It should do the trick, I believe

Creating method filters

In my code I have a List<Person>. Attributes of the objects in this list may include something along the lines of:
ID
First Name
Last Name
In a part of my application, I will be allowing the user to search for a specific person by using any combination of those three values. At the moment, I have a switch statement simply checking which fields are filled out, and calling the method designated for that combination of values.
i.e.:
switch typeOfSearch
if 0, lookById()
if 1, lookByIdAndName()
if 2, lookByFirstName()
and so on. There are actually 7 different types.
This makes me have one method for each statement. Is this a 'good' way to do this? Is there a way that I should use a parameter or some sort of 'filter'? It may not make a difference, but I'm coding this in Java.
You can do something more elegant with maps and interfaces. Try this, for example:
interface LookUp {
    Person lookUpBy(HttpRequest req);
}
Map<Integer, LookUp> map = new HashMap<Integer, LookUp>();
map.put(0, new LookUpById());
map.put(1, new LookUpByIdAndName());
...
In your controller you can then do:
int type = Integer.parseInt(request.getParameter("type"));
Person person = map.get(type).lookUpBy(request);
This way you can quickly look up the method with a map. Of course you can also use a long switch but I feel this is more manageable.
If good means "the language does it for me", no.
If good means 'readable', I would define in Person a method match() that returns true if the object matches your search criteria. It's also probably a good idea to create a Criteria class that encapsulates the search criteria (which fields you are looking for, and with which values) and pass it to match(Criteria criteria), as sketched below.
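A tiny sketch of that idea (the Criteria fields and the null-means-ignore convention are my assumptions, not the answer's):
// Hypothetical criteria holder; a null field means "don't filter on this field".
class Criteria {
    Integer id;
    String firstName;
    String lastName;
}

// Inside Person:
public boolean match(Criteria c) {
    return (c.id == null || c.id.equals(getId()))
        && (c.firstName == null || c.firstName.equals(getFirstName()))
        && (c.lastName == null || c.lastName.equals(getLastName()));
}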
That approach quickly becomes unmanageable, since the number of combinations grows fast.
Create a PersonFilter class holding all the possible query parameters, and visit each person in the list:
private class PersonFilter {

    private String id;
    private String firstName;
    private String lastName;

    // constructor omitted

    public boolean accept(Person p) {
        if (this.id != null && !this.id.equals(p.getId())) {
            return false;
        }
        if (this.firstName != null && !this.firstName.equals(p.getFirstName())) {
            return false;
        }
        if (this.lastName != null && !this.lastName.equals(p.getLastName())) {
            return false;
        }
        return true;
    }
}
The filtering is now implemented by
public List<Person> filter(List<Person> list, PersonFilter filter) {
    List<Person> result = new ArrayList<Person>();
    for (Person p : list) {
        if (filter.accept(p)) {
            result.add(p);
        }
    }
    return result;
}
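On Java 8+, the same filtering can be written with Predicate and streams; a small sketch (my addition, and the example criteria values are made up):
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Compose criteria as predicates instead of a dedicated filter class.
Predicate<Person> byFirstName = p -> "Alice".equals(p.getFirstName()); // example criterion
Predicate<Person> byLastName  = p -> "Smith".equals(p.getLastName());  // example criterion

List<Person> result = list.stream()
        .filter(byFirstName.and(byLastName))
        .collect(Collectors.toList());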
At some point you should take a look at something like Lucene, which will give you the best scalability, manageability and performance for this type of searching. Not knowing the amount of data you're dealing with, I'd only recommend it as a longer-term solution for a larger set of objects to search over. It's an amazing tool!

What is a data structure kind of like a hash table, but infrequently-used keys are deleted?

I am looking for a data structure that operates similar to a hash table, but where the table has a size limit. When the number of items in the hash reaches the size limit, a culling function should be called to get rid of the least-retrieved key/value pairs in the table.
Here's some pseudocode of what I'm working on:
class MyClass {
    private Map<Integer, Integer> cache = new HashMap<Integer, Integer>();

    public int myFunc(int n) {
        if (cache.containsKey(n))
            return cache.get(n);
        int next = . . . ; // some complicated math. guaranteed next != n.
        int ret = 1 + myFunc(next);
        cache.put(n, ret);
        return ret;
    }
}
What happens is that there are some values of n for which myFunc() will be called lots of times, but many other values of n which will only be computed once. So the cache could fill up with millions of values that are never needed again. I'd like to have a way for the cache to automatically remove elements that are not frequently retrieved.
This feels like a problem that must be solved already, but I'm not sure what the data structure is that I would use to do it efficiently. Can anyone point me in the right direction?
Update: I knew this had to be an already-solved problem. It's called an LRU cache, and it's easy to make by extending the LinkedHashMap class. Here is the code that incorporates the solution:
class MyClass {
    private final static int SIZE_LIMIT = 1000;

    private Map<Integer, Integer> cache =
        new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > SIZE_LIMIT;
            }
        };

    public int myFunc(int n) {
        if (cache.containsKey(n))
            return cache.get(n);
        int next = . . . ; // some complicated math. guaranteed next != n.
        int ret = 1 + myFunc(next);
        cache.put(n, ret);
        return ret;
    }
}
You are looking for an LRUList/Map. Check out LinkedHashMap:
The removeEldestEntry(Map.Entry) method may be overridden to impose a policy for removing stale mappings automatically when new mappings are added to the map.
Googling "LRU map" and "I'm feeling lucky" gives you this:
http://commons.apache.org/proper/commons-collections//javadocs/api-release/org/apache/commons/collections4/map/LRUMap.html
A Map implementation with a fixed maximum size which removes the least recently used entry if an entry is added when full.
Sounds pretty much spot on :)
WeakHashMap will probably not do what you expect it to... read the documentation carefully and ensure that you know exactly what you get from weak and strong references.
I would recommend you have a look at java.util.LinkedHashMap and use its removeEldestEntry method to maintain your cache. If your math is very resource intensive, you might want to move entries to the front whenever they are used, to ensure that only unused entries fall to the end of the map (that is what the third, accessOrder, argument to the LinkedHashMap constructor does, as in the code above).
The Adaptive Replacement Cache policy is designed to keep one-time requests from polluting your cache. This may be fancier than you're looking for, but it does directly address your "filling up with values that are never needed again".
Take a look at WeakHashMap
You probably want to implement a Least-Recently Used policy for your map. There's a simple way to do it on top of a LinkedHashMap:
http://www.roseindia.net/java/example/java/util/LRUCacheExample.shtml
