Efficient search in an ArrayList data structure - Java

I have an ArrayList which contains my nodes. A node has a source, a target and a cost. Currently I have to iterate over the whole ArrayList, which takes quite a while for more than 1000 nodes. I therefore tried sorting the list by source and using binary search to find the corresponding pair, but that only works if I compare either source or target, and I have to compare both to get the right pair. Is there another way to search an ArrayList efficiently?

Unfortunately, no. ArrayLists are not designed to be searched efficiently; they are meant to store data, not to look it up. If you merely want to know whether an item is contained, I would suggest you use a HashSet, as the lookup has a time complexity of O(1) instead of O(n) for the ArrayList (assuming you have implemented working equals and hashCode methods for your objects).
If you want to do fast lookups of objects, I recommend using a Dictionary implementation such as HashMap. If you can afford the space requirement, you can keep multiple maps, each with a different key, so you have a fast lookup of your object no matter which key you have to search by. Keep in mind that the lookup also requires a correct equals implementation. Unfortunately, this requires each key to be unique, which may not be ideal in your case.
However, you can use a HashMap to store, for each source, a list of the nodes that have that source. You can do the same for cost and target. That way you can substantially reduce the number of nodes you need to iterate over. This should prove to be a good solution for a sparsely connected network.
private HashMap<Source, ArrayList<Node>> sourceMap = new HashMap<Source, ArrayList<Node>>();
private HashMap<Target, ArrayList<Node>> targetMap = new HashMap<Target, ArrayList<Node>>();
private HashMap<Cost, ArrayList<Node>> costMap = new HashMap<Cost, ArrayList<Node>>();

/* Look for a node with a given source */
for (Node node : sourceMap.get(keySource)) {
    /* Test the node for equality with a given node. Equals method below */
    if (node.equals(nodeYouAreLookingFor)) { return node; }
}
In order to be sure that your code will work, be sure to override the equals method. I know I have said so already, but this is a very common mistake.
@Override
public boolean equals(Object object)
{
    if (object instanceof Node)
    {
        Node node = (Node) object;
        return source.equals(node.getSource()) && target.equals(node.getTarget());
    }
    return false;
}
If you don't, the test will simply compare references which may or may not be equal depending on how you handle your objects.
Edit: I just read what you base your equality on. The equals method should be implemented in your node class. However, for it to work, you also need to implement and override equals for the source and target, that is, if they are objects. Be watchful though: if they are Nodes too, this may result in quite a few tests spanning the whole network.
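Since the source objects are also used as HashMap keys in the maps above, they need a hashCode that is consistent with equals as well. A minimal sketch of what such a key class could look like (the id field is only an assumption; adapt it to whatever actually identifies a source):

public class Source {
    private final String id; // hypothetical identifying field

    public Source(String id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object object) {
        if (!(object instanceof Source)) {
            return false;
        }
        return id.equals(((Source) object).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode(); // must be consistent with equals, or HashMap lookups will miss
    }
}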
Update: Added code to reflect the purpose of the code in the comments.
ArrayList<Node> matchingNodes = new ArrayList<Node>(sourceMap.get(desiredSource));
matchingNodes.retainAll(targetMap.get(desiredTarget));
Now you have a list of all nodes that match the source and target criteria. Provided that you are willing to sacrifice a bit of memory, the lookup above costs roughly O(|sourceList| * |targetList|) for the retainAll call, since the two lists themselves are fetched from the maps in constant time. As long as those per-key lists are much smaller than the full node list, this beats a linear scan over all nodes, O(|allNodeList|), and with 1000 nodes your network should be big enough to benefit. If your network resembles a naturally occurring network then, as Albert-László Barabási has shown, it is likely scale-free. This means that splitting your network into per-source and per-target lists will likely (I have no proof for this) give those lists a scale-free size distribution, so |sourceList| and |targetList| should be substantially smaller than |allNodeList|.

You'll need to combine the source and target into a single comparator, e.g.
// assuming source and target are numeric; otherwise use their compareTo methods
public int compare(Node o1, Node o2) {
    if (o1.source < o2.source) { return -1; }
    else if (o1.source > o2.source) { return 1; }
    // else o1.source == o2.source, so break the tie on target
    else if (o1.target < o2.target) { return -1; }
    else if (o1.target > o2.target) { return 1; }
    else { return 0; }
}

You can use the compareTo() method to compare your nodes.
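For that, Node would have to implement Comparable. A minimal sketch, assuming source and target are int fields (an assumption; adjust to your actual types), ordering by source and then target as in the comparator above:

public class Node implements Comparable<Node> {
    int source;
    int target;
    int cost;

    @Override
    public int compareTo(Node other) {
        // compare by source first, then break ties on target
        int bySource = Integer.compare(this.source, other.source);
        return bySource != 0 ? bySource : Integer.compare(this.target, other.target);
    }
}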

You can create two ArrayLists. The first sorted by source, the second sorted by target.
Then you can search by source or target using binarySearch on the corresponding List.
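A sketch of that approach, assuming a Node with accessible int fields source and target and a no-argument constructor (both assumptions), using java.util.Collections:

ArrayList<Node> bySource = new ArrayList<Node>(allNodes);
ArrayList<Node> byTarget = new ArrayList<Node>(allNodes);
bySource.sort(Comparator.comparingInt(n -> n.source));
byTarget.sort(Comparator.comparingInt(n -> n.target));

Node probe = new Node();      // probe object carrying only the key we search for
probe.source = desiredSource;
int i = Collections.binarySearch(bySource, probe, Comparator.comparingInt(n -> n.source));
if (i >= 0) {
    // bySource.get(i) has the desired source; nodes with the same source sit next to it
}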

You can make a helper class to store source-target pairs:
class SourceTarget {
    public final Source source; // public fields are OK when they're final and immutable
    public final Target target; // you can use getters but I'm lazy
    // (don't give this object setters; map keys should ideally be immutable)

    public SourceTarget(Source s, Target t) {
        source = s;
        target = t;
    }

    @Override
    public boolean equals(Object other) {
        // only equal when both source and target are equal
        if (!(other instanceof SourceTarget)) {
            return false;
        }
        SourceTarget st = (SourceTarget) other;
        return source.equals(st.source) && target.equals(st.target);
    }

    @Override
    public int hashCode() {
        // implemented consistently with equals
        return 31 * source.hashCode() + target.hashCode();
    }
}
Then store your things in a HashMap<SourceTarget, List<Node>>, with each source-target pair mapped to the list of nodes that have exactly that source-target pair.
To retrieve just use
List<Node> results = map.get( new SourceTarget( node.source, node.target ) );
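Building the map in the first place could look roughly like this (a sketch using the same node.source and node.target fields as in the line above):

Map<SourceTarget, List<Node>> map = new HashMap<SourceTarget, List<Node>>();
for (Node n : allNodes) {
    map.computeIfAbsent(new SourceTarget(n.source, n.target), k -> new ArrayList<Node>())
       .add(n);
}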
Alternatively to making a helper class, you can use the comparator in Zim-Zam's answer and a TreeMap<Node,List<Node>> with a representative Node object acting as the SourceTarget pair.

Related

Balance a binary search tree

I'm trying to implement a binary search tree class in Java with a method that can rebalance the tree if there is a difference in height. My approach is to first store the values of the nodes in a List (an attribute of the class).
I then want to take the middle element of this list and assign it to the root of the tree. After that I take the left and right parts of the list and do the same thing recursively for the left and right children of the root, and so on.
My algorithm doesn't seem to work, though, and I don't understand what I'm doing wrong. Could someone take a look at my code and explain what the problem is? What I do is basically pass the ordered list of the tree's elements (an attribute of the class) and the root into the function below:
public void build(BinaryNode<E> n, List<E> list) {
    int idx = (int) Math.floor(list.size() / 2);
    if (n != null) {
        n.element = list.get(idx);
    } else if (n == null) {
        n = new BinaryNode<E>(list.get(idx));
    }
    if (!list.subList(0, idx).isEmpty()) {
        build(n.left, list.subList(0, idx));
    }
    if (!list.subList(idx + 1, list.size()).isEmpty()) {
        build(n.right, list.subList(idx + 1, list.size()));
    }
    return;
}
Kind regards,
Java method calls are "call by value". This means changing a parameter (like n in your case) has no effect outside of the method.
Try to define your method like this
public BinaryNode<E> build(List<E> list) { ... }
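A sketch of what that could look like, assuming BinaryNode<E> exposes element, left and right as in your code:

public BinaryNode<E> build(List<E> list) {
    if (list.isEmpty()) {
        return null;
    }
    int idx = list.size() / 2;
    // the middle element becomes the root of this subtree
    BinaryNode<E> n = new BinaryNode<E>(list.get(idx));
    n.left = build(list.subList(0, idx));
    n.right = build(list.subList(idx + 1, list.size()));
    return n;
}

Then call it as root = build(sortedList); so the newly created nodes are actually linked into the tree instead of being lost when the method returns.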
Try investigating AVL trees.
Some useful links:
https://en.wikipedia.org/wiki/AVL_tree
https://www.geeksforgeeks.org/avl-tree-set-1-insertion/

How to implement Union-Find using linked lists?

I need to write a piece of code using the Kruskal algorithm, which in turn needs the Union-Find algorithm.
This includes the methods Make-Set(x), Find-Set(x) and Union(x, y).
I need to implement them using linked lists, but I am not sure of how to start with the Make-Set method.
The Make-Set Method should create a set and make the first element into a key (to compare sets). How exactly would I declare a key using linked lists?
In short: how do I implement this pseudocode for linked lists in Java?
Make-Set(x)
x.p = x
x.rank = 0
Thanks for your help in advance!
I've heard this referred to in the past not as "Union-Find" but as a disjoint set. It isn't exactly a linked list, since the nodes do have a link, but they aren't necessarily linked up in a linear fashion. It's more like a tree where each node has a pointer to its parent and you can walk up the tree to the root.
I don't have much time right now, but here's a quick sketch of how I would implement it in Java:
class Disjoint {
    Disjoint next; // parent pointer; null means this node is the root of its set

    Disjoint findSet() {
        Disjoint head = this;
        if (next != null) {
            head = next.findSet();
            next = head; // path compression: point directly at the root
        }
        return head;
    }

    void union(Disjoint other) {
        Disjoint us = this.findSet();
        Disjoint them = other.findSet();
        us.next = them; // attach our root under their root
    }
}
Creating an instance is your Make-Set. What you call Find-Set I would call find head or find leader, maybe find identity; I've called it findSet here, though. It walks the chain to find the root of the tree. It also performs an optional optimization known as path compression: it snaps all the links on the way back out of the recursive call so that they all point directly at the root. This keeps the chains short.
Finally, Union is implemented just by assigning one root's next pointer to point at the other set. I'm not sure what you intended with rank; if it's the size of the set, you can add a field for that and simply sum them when you union two sets. But you initialize it to 0 for a new set when I would expect it to be initialized to 1.
Two nodes a and b belong to the same set if a.findSet() == b.findSet(). If you need the nodes to carry some data, make the class generic and provide the data to the constructor, and add a getter:
class Disjoint<T> {
    Disjoint<T> next;
    T data;

    public Disjoint(final T data) {
        this.data = data;
    }

    public T getData() {
        return data;
    }

    // rest of class identical except Disjoint is replaced with Disjoint<T> everywhere
}
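A small usage sketch of the class above, mapping the pseudocode operations onto it:

Disjoint<String> a = new Disjoint<String>("a"); // Make-Set(a)
Disjoint<String> b = new Disjoint<String>("b"); // Make-Set(b)
Disjoint<String> c = new Disjoint<String>("c"); // Make-Set(c)

a.union(b); // Union(a, b)

System.out.println(a.findSet() == b.findSet()); // true: a and b are in the same set
System.out.println(a.findSet() == c.findSet()); // false: c is still in its own set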

Best way to code large if-else-statements of Objects in Java?

I'm using Swing and have a JTree in my GUI. During execution, I parse through the nodes of the tree to determine what action to perform. Here's a snippet:
// action to modify node
public void modifyMenuItem(DefaultMutableTreeNode node)
{
    // if node is node 1 in the tree
    if (node.equals(treeNodes.node1))
    {
        // perform action 1
    }
    // if node is node 2 in the tree
    else if (node.equals(treeNodes.node2))
    {
        // perform action 2
    }
    // if node is node 3 in the tree
    else if (node.equals(treeNodes.node3))
    {
        // perform action 3
    }
    // etc.
}
Problem is, I have close to 50 nodes in my tree and I'm afraid this type of implementation is really hurting performance. I have similar if-statements throughout my code. What is the preferred way to handle large if-statements over objects like this? Obviously I can't use a switch statement since these aren't integer values, so should I create a HashMap and then branch based on the hash keys?
I would use polymorphism:
public void modifyMenuItem(DefaultMutableTreeNode node) {
    ((MyUserObject) node.getUserObject()).doAction();
}
For this to work, all the user objects of your nodes must be instances of the same MyUserObject interface.
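A rough sketch of what that could look like; MyUserObject is not a Swing type here, just a hypothetical interface you define yourself and attach to the nodes when building the tree:

public interface MyUserObject {
    void doAction();
}

// when building the tree, attach an implementation to each node
DefaultMutableTreeNode node1 = new DefaultMutableTreeNode(new MyUserObject() {
    @Override
    public void doAction() {
        // perform action 1
    }
});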
Not knowing more about what you are trying to do, I'd suggest you use polymorphism. You create subclasses of DefaultMutableTreeNode that also implement an interface of yours, let's call it ActionPerformer, with a method performAction(). You use instances of those classes as nodes in your tree and can then simply write:
// I suppose you can't change the signature?
public void modifyMenuItem(DefaultMutableTreeNode node) {
    if (node instanceof ActionPerformer) {
        ActionPerformer ap = (ActionPerformer) node;
        ap.performAction();
    } else {
        logger.log("How did that get here?");
    }
}
Can you use a map with 50 items, keyed on the different treenodes (or treenode names), and returning an 'Action' that can be performed on the node?
Map<String, Action> map = new HashMap<String, Action>();
// populate the map with items and Actions
Action action = map.get(node.getName());
action.perform(node);
Either the tree nodes need to implement equals and hashCode correctly... or you can just use the node name (given that it is unique).
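Action here is not any particular library class, just a small hypothetical callback interface; a sketch of how the map could be populated (using lambdas, so Java 8+):

interface Action {
    void perform(DefaultMutableTreeNode node);
}

Map<String, Action> map = new HashMap<String, Action>();
map.put("node1", n -> { /* perform action 1 */ });
map.put("node2", n -> { /* perform action 2 */ });

Action action = map.get(nodeName); // nodeName being whatever unique key you use
if (action != null) {
    action.perform(node);
}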
Good luck!

Simple database-like collection class in Java

The problem: Maintain a bidirectional many-to-one relationship among Java objects.
Something like the Google/Commons Collections bidi maps, but I want to allow duplicate values on the forward side, and have sets of the forward keys as the reverse side values.
Used something like this:
// maintaining disjoint areas on a gameboard. Location is a space on the
// gameboard; Regions refer to disjoint collections of Locations.
MagicalManyToOneMap<Location, Region> forward = // the game universe
Map<Region, Set<Location>> inverse = forward.getInverse(); // live, not a copy
Location parkplace = Game.chooseSomeLocation(...);
Region mine = forward.get(parkplace); // assume !null; should be O(log n)
Region other = Game.getSomeOtherRegion(...);
// moving a Location from one Region to another:
forward.put(parkplace, other);
// or equivalently:
inverse.get(other).add(parkplace); // should also be O(log n) or so
// expected consistency:
assert ! inverse.get(mine).contains(parkplace);
assert forward.get(parkplace) == other;
// and this should be fast, not iterate every possible location just to filter for mine:
for (Location l : mine) { /* do something clever */ }
The simple Java approaches are: 1. To maintain only one side of the relationship, either as a Map<Location, Region> or a Map<Region, Set<Location>>, and collect the inverse relationship by iteration when needed; or, 2. To make a wrapper that maintains both sides' Maps, and intercept all mutating calls to keep both sides in sync.
1 is O(n) instead of O(log n), which is becoming a problem. I started in on 2 and was in the weeds straightaway. (Know how many different ways there are to alter a Map entry?)
This is almost trivial in the sql world (Location table gets an indexed RegionID column). Is there something obvious I'm missing that makes it trivial for normal objects?
I might be misunderstanding your model, but if your Location and Region have correct equals() and hashCode() implementations, then the Location -> Region side is just a classical simple Map (multiple distinct keys can point to the same object value). The Region -> Set of Location side is a Multimap (available in Google Collections). You could compose your own class with the proper add/remove methods to manipulate both submaps.
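A sketch of that composition, using Guava's (Google Collections') com.google.common.collect.HashMultimap for the reverse side; the class name RegionIndex and the method names are just placeholders:

class RegionIndex {
    private final Map<Location, Region> forward = new HashMap<Location, Region>();
    private final SetMultimap<Region, Location> inverse = HashMultimap.create();

    void put(Location l, Region r) {
        Region old = forward.put(l, r);
        if (old != null) {
            inverse.remove(old, l); // keep the reverse view consistent when a Location moves
        }
        inverse.put(r, l);
    }

    Region get(Location l) {
        return forward.get(l);
    }

    Set<Location> locationsIn(Region r) {
        return inverse.get(r); // live view, as Multimap.get returns
    }
}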
Maybe overkill, but you could also use an in-memory SQL database (HSQLDB, etc.). It allows you to create indexes on many columns.
I think you could achieve what you need with the following two classes. While it does involve two maps, they are not exposed to the outside world, so there shouldn't be a way for them to get out of sync. As for storing the same "fact" twice, I don't think you'll get around that in any efficient implementation, whether the fact is stored twice explicitly as it is here, or implicitly as it would be when your database creates an index to make joins more efficient on your two tables. You can add new things to the MagicSet and it will update both mappings, or you can add things to the MagicMapper, which will then update the inverse map automatically. I haven't been able to run this through a compiler, but it should be enough to get you started. What puzzle are you trying to solve?
public class MagicSet<L, R> {
    private Map<L, R> forward;
    private R r;
    private Set<L> set;

    public MagicSet(Map<L, R> forward, R r) {
        this.forward = forward;
        this.r = r;
        this.set = new HashSet<L>();
    }

    public void add(L l) {
        set.add(l);
        forward.put(l, r);
    }

    public void remove(L l) {
        set.remove(l);
        forward.remove(l);
    }

    public int size() {
        return set.size();
    }

    public boolean contains(L l) {
        return set.contains(l);
    }

    // Caution: do not use the remove method from this iterator. If this class were going
    // to be reused often, you would want to return a wrapped iterator that handles remove
    // properly. In fact, if you did that, I think you could then extend AbstractSet and
    // MagicSet would fully implement java.util.Set.
    public Iterator<L> iterator() {
        return set.iterator();
    }
}
public class MagicMapper<L, R> { // note that it doesn't implement Map, though it could with some extra work. I don't get the impression you need that though.
    private Map<L, R> forward;
    private Map<R, MagicSet<L, R>> inverse;

    public MagicMapper() {
        forward = new HashMap<L, R>();
        inverse = new HashMap<R, MagicSet<L, R>>();
    }

    public R getForward(L key) {
        return forward.get(key);
    }

    public MagicSet<L, R> getBackward(R key) {
        return inverse.get(key); // this assumes you want a null if you try to use a key
                                 // that has no mapping; otherwise you'd return a blank MagicSet
    }

    public void put(L l, R r) {
        R oldVal = forward.get(l);
        // if the L already belonged to an R, we need to undo that mapping
        MagicSet<L, R> oldSet = inverse.get(oldVal);
        if (oldSet != null) { oldSet.remove(l); }
        // now get the set that the R maps to, and add the L to it
        MagicSet<L, R> newSet = inverse.get(r);
        if (newSet == null) {
            newSet = new MagicSet<L, R>(forward, r);
            inverse.put(r, newSet);
        }
        newSet.add(l); // magically updates the "forward" map
    }
}

What is a data structure kind of like a hash table, but infrequently-used keys are deleted?

I am looking for a data structure that operates similar to a hash table, but where the table has a size limit. When the number of items in the hash reaches the size limit, a culling function should be called to get rid of the least-retrieved key/value pairs in the table.
Here's some pseudocode of what I'm working on:
class MyClass {
    private Map<Integer, Integer> cache = new HashMap<Integer, Integer>();

    public int myFunc(int n) {
        if (cache.containsKey(n))
            return cache.get(n);
        int next = . . . ; // some complicated math. guaranteed next != n.
        int ret = 1 + myFunc(next);
        cache.put(n, ret);
        return ret;
    }
}
What happens is that there are some values of n for which myFunc() will be called lots of times, but many other values of n which will only be computed once. So the cache could fill up with millions of values that are never needed again. I'd like to have a way for the cache to automatically remove elements that are not frequently retrieved.
This feels like a problem that must be solved already, but I'm not sure what the data structure is that I would use to do it efficiently. Can anyone point me in the right direction?
Update: I knew this had to be an already-solved problem. It's called an LRU cache and is easy to make by extending the LinkedHashMap class. Here is the code that incorporates the solution:
class MyClass {
    private final static int SIZE_LIMIT = 1000;

    private Map<Integer, Integer> cache =
        new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
            protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                return size() > SIZE_LIMIT;
            }
        };

    public int myFunc(int n) {
        if (cache.containsKey(n))
            return cache.get(n);
        int next = . . . ; // some complicated math. guaranteed next != n.
        int ret = 1 + myFunc(next);
        cache.put(n, ret);
        return ret;
    }
}
You are looking for an LRUList/Map. Check out LinkedHashMap:
The removeEldestEntry(Map.Entry) method may be overridden to impose a policy for removing stale mappings automatically when new mappings are added to the map.
Googling "LRU map" and "I'm feeling lucky" gives you this:
http://commons.apache.org/proper/commons-collections//javadocs/api-release/org/apache/commons/collections4/map/LRUMap.html
A Map implementation with a fixed maximum size which removes the least recently used entry if an entry is added when full.
Sounds pretty much spot on :)
WeakHashMap will probably not do what you expect it to... read the documentation carefully and make sure you know exactly what you get from weak and strong references.
I would recommend you have a look at java.util.LinkedHashMap and use its removeEldestEntry method to maintain your cache. If your math is very resource intensive, you might want to move entries to the front whenever they are used, to ensure that only unused entries fall to the end of the map.
The Adaptive Replacement Cache policy is designed to keep one-time requests from polluting your cache. This may be fancier than you're looking for, but it does directly address your "filling up with values that are never needed again".
Take a look at WeakHashMap
You probably want to implement a Least-Recently Used policy for your map. There's a simple way to do it on top of a LinkedHashMap:
http://www.roseindia.net/java/example/java/util/LRUCacheExample.shtml
