Algorithm: how to check whether two sets conflict - java

Write a function:
Input: two sets, one reflecting equality relationships, like {A = B, B = C}, and the other reflecting inequality relationships, like {A != C}
Output: whether the two sets conflict, returned as a boolean
public boolean isConflict(List<String> equalRelation, List<String> notEqualRelation) {
    //......
}

Create a graph from your data. Every variable is a node and every equality is an undirected edge between the corresponding nodes. Each connected component then represents a group of nodes that are all equal to one another. To find contradictions, check whether the two nodes of each not-equal relationship are in the same component. If they are, this is a contradiction.
You can use DFS or BFS to find the connected components. But if your data is dynamic, it is better to use a disjoint-set data structure.
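The disjoint-set idea can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the class name ConflictChecker is hypothetical, and it assumes each relation is a string of the form "A = B" or "A != C", which the question does not actually specify.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: a disjoint-set forest keyed by variable name.
final class ConflictChecker {
    // Maps each variable to its parent; a root is its own parent.
    private final Map<String, String> parent = new HashMap<>();

    // Returns the representative of x's component and compresses the path to it.
    private String find(String x) {
        parent.putIfAbsent(x, x);
        String root = x;
        while (!root.equals(parent.get(root))) {
            root = parent.get(root);
        }
        while (!x.equals(root)) {
            String next = parent.get(x);
            parent.put(x, root);
            x = next;
        }
        return root;
    }

    // Merges the components containing a and b.
    private void union(String a, String b) {
        parent.put(find(a), find(b));
    }

    // Assumed input format: "A = B" for equalities, "A != C" for inequalities.
    static boolean isConflict(List<String> equalRelation, List<String> notEqualRelation) {
        ConflictChecker sets = new ConflictChecker();
        for (String relation : equalRelation) {
            String[] sides = relation.split("=");
            sets.union(sides[0].trim(), sides[1].trim());
        }
        for (String relation : notEqualRelation) {
            String[] sides = relation.split("!=");
            // Same representative means the two variables were forced to be equal.
            if (sets.find(sides[0].trim()).equals(sets.find(sides[1].trim()))) {
                return true;
            }
        }
        return false;
    }
}
```

With the example from the question, {A = B, B = C} and {A != C} end up with A and C in the same component, so the call reports a conflict.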

You have done half the job:
keep your set of equal relationships as a Set<Pair<String,String>>, and do the same for the not-equal relationships;
build a Map from every symbol to the reduced symbol it equals. For example, A=B gives you, if A < B: A=>A (implicit) and B=>A; and if A=>C, then B=>C as well. You have to iterate another time to reduce everything. That way, every group of equal symbols maps to a unique one (the least, in my example);
finally, iterate through your Set of non-equals and check every pair (X,Y): if reduced(X) = reduced(Y), then X = Y, and you have a conflict there.
If you want to keep Pair<>, use this, courtesy of "A Java collection of value pairs? (tuples?)":
public class Pair<L,R> implements java.io.Serializable {
    private final L left;
    private final R right;
    public Pair(L left, R right) {
        this.left = left;
        this.right = right;
    }
    public L getLeft() { return left; }
    public R getRight() { return right; }
    @Override
    public int hashCode() { return left.hashCode() ^ right.hashCode(); }
    @Override
    public boolean equals(Object o) {
        if (o == null) return false;
        if (!(o instanceof Pair)) return false;
        // Wildcards avoid the raw-type warning; only equals() is called on the parts.
        Pair<?, ?> pairo = (Pair<?, ?>) o;
        return this.left.equals(pairo.getLeft()) &&
            this.right.equals(pairo.getRight());
    }
}

Vogel612 is pretty much right. The basic algorithm is: complete the equality relation, then test each inequality in the list to see if any conflicts. However, you would not generally do any completion on the inequalities.
I’ll mention that equivalence is transitive (a = b = c implies a = c) but inequality is not (1 ≠ 1+1 and 2 ≠ 1, but 1+1 = 2). Both are symmetric, though (if a = b, b = a), and equivalence is also reflexive (each element must equal itself).
The exact data structure to use might depend on whether we have a list of all the elements in advance, how many there are, and how sparse the truth table for the relation is likely to be. If the amount of memory we need to store the truth table as a two-dimensional array is reasonable, we could do it that way.
Vogel612’s approach of storing the table as a collection of rows, each row containing the set of elements equivalent to its index, is a good way of storing a truth table with only a few elements equivalent to every element. (If every row is guaranteed to contain at least one element, itself, this should become an array or ArrayList.)
Another approach is to create an arbitrary total ordering on the element names, such as the order in which we added them to an ArrayList, the hash value of their names, or a lexicographic ordering. This order has nothing to do with the values of those elements or the provided equality relation, but can be used to put every pair into a canonical form. That is, our elements are listed a, b, c, and a < b < c when we compare their names even if the values are equal by the relation. Keep a set (probably a HashSet in Java) of tuples of elements (a,b) such that 'a' precedes 'b' in this ordering and their contents are equal, letting the facts that a = a, b = b and b = a be implicit. This will be efficient when very few other pairs of elements are equivalent. There are also more complicated data structures for sparse matrix storage that are not in the standard library.
In any case, you always want x ≠ x to fail and to look for the tuples in the same canonical form that you stored them in.
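A rough sketch of the canonical-pair idea, under a few assumptions of mine: the class and method names (CanonicalPairs, addEqual, conflicts) are hypothetical, the ordering is the lexicographic order of the names, and the equality relation has already been completed, so only storage and lookup are shown.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical store for an already-completed equality relation.
final class CanonicalPairs {
    // Each known equality is kept once, as [name that sorts first, name that sorts second].
    private final Set<List<String>> equalPairs = new HashSet<>();

    // Orders the two names lexicographically so (a, b) and (b, a) share one key.
    private static List<String> canonical(String x, String y) {
        return x.compareTo(y) <= 0 ? Arrays.asList(x, y) : Arrays.asList(y, x);
    }

    void addEqual(String x, String y) {
        if (!x.equals(y)) {            // x = x stays implicit
            equalPairs.add(canonical(x, y));
        }
    }

    boolean areEqual(String x, String y) {
        return x.equals(y) || equalPairs.contains(canonical(x, y));
    }

    // An inequality x != y conflicts when the stored relation says x = y.
    // In particular, x != x always conflicts.
    boolean conflicts(String x, String y) {
        return areEqual(x, y);
    }
}
```

Lookups use the same canonical form as insertions, so it does not matter in which order the two names of an inequality are given.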

Swapping keys in TreeSet?

Suppose I have this class:
import java.util.TreeSet;

public class Node implements Comparable<Node>{
    public float key;
    public TreeSet<Node> neighbors;
    public Node() {
        //fill neighbors somehow
    }
    @Override
    public int compareTo(Node n) {
        if(this.key == n.key)
            return 0;
        else if(this.key > n.key)
            return 1;
        else
            return -1;
    }
}
So this is a classic node of a graph, where each node is connected to a set of nodes (i.e. its neighbors). I'm using TreeSet because I often (very often) need to know all the neighbors with a key bigger (or smaller) than a certain value. Now, let's suppose I have this method:
//swap nodes keys
void swapKeys(Node a, Node b){
    float ak = a.key;
    a.key = b.key;
    b.key = ak;
}
Notice that this method changes only the two nodes' keys, nothing more.
Does this "break" the structure, or will everything continue to work fine?
If this breaks the structure, what about this simple solution:
//swap nodes keys
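void swapKeys(Node a, Node b){
    // Node itself has no remove/add; the calls go through the neighbor sets.
    a.neighbors.remove(b);
    b.neighbors.remove(a);
    float ak = a.key;
    a.key = b.key;
    b.key = ak;
    a.neighbors.add(b);
    b.neighbors.add(a);
}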
From the TreeSet documentation :
Note that the ordering maintained by a set (whether or not an explicit
comparator is provided) must be consistent with equals if it is to
correctly implement the Set interface.
Your Node class' Comparable implementation is not consistent with equals (compareTo can return 0 for two Node instances which are not equal).
This in itself makes your Node class unfit to be elements of a TreeSet.
Even the proposed workaround is not sufficient.
You may be tempted to fix this by implementing equals() (and hashCode()) based upon the value contained in the node. But to no avail, as this would go against a warning in the documentation of the general Set interface:
Note: Great care must be exercised if mutable objects are used as set
elements. The behavior of a set is not specified if the value of an
object is changed in a manner that affects equals comparisons while
the object is an element in the set. A special case of this
prohibition is that it is not permissible for a set to contain itself
as an element.
So adding equals and hashCode is still not sufficient: your instances must also be immutable.
However, the simplest solution seems to be to forgo the Comparable interface altogether, not implement equals and hashCode, and simply use a HashSet instead of a TreeSet. In that case you can change the contents of your nodes without consequences for the proper functioning of the set of neighbours.
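To make the ordering problem concrete, here is a small sketch of what happens when a key changes while its element sits in a TreeSet. The names (TreeSetMutationDemo, KeyedNode, BY_KEY) are hypothetical, and the demo assumes distinct keys and an external Comparator, so it does not touch the equals-consistency issue described above.

```java
import java.util.Comparator;
import java.util.TreeSet;

// Hypothetical demo of mutating a sort key inside a TreeSet.
final class TreeSetMutationDemo {
    static final class KeyedNode {
        float key;
        KeyedNode(float key) { this.key = key; }
    }

    static final Comparator<KeyedNode> BY_KEY = (a, b) -> Float.compare(a.key, b.key);

    // Changing the key in place leaves the node at its old position in the tree,
    // so a later lookup walks to where the new key would be and misses it.
    static boolean foundAfterInPlaceChange() {
        TreeSet<KeyedNode> set = new TreeSet<>(BY_KEY);
        KeyedNode one = new KeyedNode(1f);
        set.add(one);
        set.add(new KeyedNode(2f));
        set.add(new KeyedNode(3f));
        one.key = 10f;                 // the set is not told about this
        return set.contains(one);
    }

    // Removing first, then changing the key, then re-adding keeps the order valid.
    static boolean foundAfterRemoveChangeAdd() {
        TreeSet<KeyedNode> set = new TreeSet<>(BY_KEY);
        KeyedNode one = new KeyedNode(1f);
        set.add(one);
        set.add(new KeyedNode(2f));
        set.add(new KeyedNode(3f));
        set.remove(one);
        one.key = 10f;
        set.add(one);
        return set.contains(one) && set.last() == one;
    }
}
```

The first method returns false and the second returns true. Note that the remove/change/add sequence would have to be applied to every set that contains the node, not only to the neighbor sets of the two nodes being swapped.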

Arrange complex datatype for fast partial match comparison

I am writing a Java program that compares two datasets, each containing data of the same type. The datatypes are basically classes containing Strings, ints and a String[]. Let's call this class Foo and the datasets a and b. For each item in a, I need to find the item in b that matches it most closely.
My problem is speed. I have outlined below, in pseudo-code, what I do right now. As you can imagine, it doesn't scale very well with increasing size (and my datasets DO grow a lot). If anyone could point me in the direction of a better solution, I would greatly appreciate it. I am aware that sorting the arrays would increase speed vastly for, e.g., String or int comparisons, but since my datatype is more complex, I don't see how that could work here.
Foo[] a = new Foo[...];
Foo[] b = new Foo[...];
for (Foo item_a : a) {
    double bestMatch = 0;
    for (Foo item_b : b) {
        double match = compareFoo(item_a,item_b);
        if (match > bestMatch) {
            bestMatch = match;
        }
    }
    //Do stuff with bestMatch - display, save etc.
}
private double compareFoo(Foo item_a, Foo item_b) {
    //Compare every element of a and b,
    //return value between 0 (no match) and 1 (identical)
}

Java: How to refer to a class inside the class but of another element?

I'm not too sure how to explain this, but basically I am trying to refer to the front of the List that Element A belongs to (it can be from any list). What happens instead is that, when it goes through the elements of the list, it compares against a different list and ends up not matching, i.e. it compares the original list, which contains the front b, to the list containing Element A. Now I'm just wondering how I would get b set to the front of Element A's list so that I can compare where it is.
/*front is a dummy element used to keep position.
List is a class i have made under requirements of naming for subject.
i don't want a solution. I only want to know about how to do it.
This is what is an example code of whats causing the problem USED IN DRIVER PROGRAM
DLL.concat(DLL2);
it is basically getting DLL's front and going through the loop when it should be using DLL2's.
DLL and DLL2 are both Lists
***/
//will return the index of the Element for comparing
private int checkElement(Element a){
    Element b = front;
    int i = 0;
    while (b != a && i<size)
    {
        b = b.next;
        i++;
    }
    return i;
}
//edit: add
//size is the size of the list gets increased everytime a variable is added to the list on top of the dummy element.
//Item is a private class inside the List class. it contains the values: element,next, previous in which element contains an object, next and previous contain the next element in the list and the previous one (its a double linked list)
// this is what causes the error to turn up in the above method as im using two different lists and joining them.
public void concat(List L){
    if (splice(L.first(),L.last(),last())){
        size = size+L.size;
    }
}
//this is the splice method for cutting out elements and attaching them after t
//just using the check method to assert that a<b and will later use it to assert t not inbetween a and b
public boolean splice(Element a, Element b, Element t){
    if (checkElement(a) < checkElement(b)){
        Element A = a.previous;
        Element B = b.next;
        A.next = B;
        B.previous = A;
        Element T = t.next;
        b.next = T;
        a.previous = t;
        t.next = a;
        T.previous = b;
        return true;
    }
    else {
        System.out.println("Splicing did not occur due to b<a");
        return false;
    }
}
So despite my comment, I see one glaring problem with this. The equality operators don't compare the contents of reference types, that is, anything other than a primitive type (double, int, etc.). What happens is that you're comparing the addresses of the instances, and unless they are literally the same object (same address in memory), the comparison is never going to return true. Maybe that's what you want, but I suspect not. You need to override the method
public boolean equals(Object obj);
and use that to compare two instances of a given class. Am I correct in my assumptions?
Edit: OK, I think my original guess was correct. It works if they are from the same list because they end up being the same elements (stored in the same memory location). You need to use equals() or !equals() rather than == and !=. Try that, and see if it solves your problems. Also, don't just call them: you must override equals to actually compare the elements' internal properties.
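A small sketch of the difference, using a hypothetical Box class that is not part of the question's List or Element code:

```java
import java.util.Objects;

// Hypothetical value class, used only to contrast == with equals().
final class Box {
    private final String value;

    Box(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Box)) return false;
        return Objects.equals(value, ((Box) o).value);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(value);
    }

    public static void main(String[] args) {
        Box first = new Box("x");
        Box second = new Box("x");
        System.out.println(first == second);      // false: two distinct objects
        System.out.println(first.equals(second)); // true: same contents
    }
}
```

Whether == or equals() is the right test depends on whether you mean "this exact node" or "a node holding the same data".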

Mutable objects and hashCode

Have the following class:
import java.util.HashSet;
import java.util.Set;

public class Member {
    private int x;
    private long y;
    private double d;
    public Member(int x, long y, double d) {
        this.x = x;
        this.y = y;
        this.d = d;
    }
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + x;
        result = (int) (prime * result + y);
        result = (int) (prime * result + Double.doubleToLongBits(d));
        return result;
    }
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj instanceof Member) {
            Member other = (Member) obj;
            return other.x == x && other.y == y
                && Double.compare(d, other.d) == 0;
        }
        return false;
    }
    public static void main(String[] args) {
        Set<Member> test = new HashSet<Member>();
        Member b = new Member(1, 2, 3);
        test.add(b);
        System.out.println(b.hashCode());
        b.x = 0;
        System.out.println(b.hashCode());
        Member first = test.iterator().next();
        System.out.println(test.contains(first));
        System.out.println(b.equals(first));
        System.out.println(test.add(first));
    }
}
It produces the following results:
30814
29853
false
true
true
Because the hashCode depends on the state of the object, the object can no longer be retrieved properly, so the check for containment fails. The HashSet is no longer working properly. A solution would be to make Member immutable, but is that the only solution? Should all classes added to HashSets be immutable? Is there any other way to handle the situation?
Regards.
Objects in hashsets should either be immutable, or you need to exercise discipline in not changing them after they've been used in a hashset (or hashmap).
In practice I've rarely found this to be a problem - I rarely find myself needing to use complex objects as keys or set elements, and when I do it's usually not a problem just not to mutate them. Of course if you've exposed the references to other code by this time, it can become harder.
Yes. While keeping your class mutable, you can compute the hashCode and equals methods based on immutable values of the class (perhaps a generated id) to adhere to the hashCode contract defined in the Object class:
Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Depending on your situation this may be easier or not.
class Member {
    // Shared counter that hands out a unique id to each instance.
    private static long nextId = 0;
    private final long id = nextId++;
    // other members here...
    @Override
    public int hashCode() { return Long.hashCode(id); }
    @Override
    public boolean equals( Object o ) {
        if( this == o ) { return true; }
        if( o instanceof Member ) { return this.id == ((Member)o).id; }
        return false;
    }
    // ...
}
If you need a thread-safe attribute, you may consider using AtomicLong instead, but again, it depends on how you are going to use your object.
As already mentioned, one can accept the following three solutions:
1. Use immutable objects; even when your class is mutable, you may use immutable identities in your hashCode implementation and equals checking, e.g. an ID-like value.
2. Similarly to the above, implement add/remove to take a clone of the inserted object, not the actual reference. HashSet does not offer a get function (e.g. to allow you to alter the object later on); thus, you can be sure that no duplicates will exist.
3. Exercise discipline in not changing them after they've been used, as @Jon Skeet suggests.
But if for some reason you really need to modify objects after they have been inserted into a HashSet, you need to find a way of "informing" your collection of the changes. To achieve this functionality:
4. You can use the Observer design pattern, and extend HashSet to implement the Observer interface. Your Member objects must be Observable and update the HashSet on any setter or other method that affects hashCode and/or equals.
Note 1: Extending 3, using 4: we may accept alterations, but those that do not create an already existing object (eg I updated a user's ID, by assigning a new ID, not setting it to an existing one). Otherwise, you have to consider the scenario where an object is transformed in such a way that is now equal to another object already existing in the Set. If you accept this limitation, 4th suggestion will work fine, else you must be proactive and define a policy for such cases.
Note 2: You have to provide both previous and current states of the altered object on your update implementation, because you have to initially remove the older element (eg use getClone() before setting new values), then add the object with the new state. The following snippet is just an example implementation, it needs changes based on your policy of adding a duplicate.
@Override
public void update(Observable newItem, Object oldItem) {
    remove(oldItem);
    if (add(newItem))
        newItem.addObserver(this);
}
I've used similar techniques on projects, where I require multiple indices on a class, so I can look up with O(1) for Sets of objects that share a common identity; imagine it as a MultiKeymap of HashSets (this is really useful, as you can then intersect/union indices and work similarly to SQL-like searching). In such cases I annotate methods (usually setters) that must fireChange-update each of the indices when a significant change occurs, so indices are always updated with the latest states.
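Without the Observer machinery, the remove-then-re-add step from Note 2 can be sketched as a plain helper. This is only an illustration under my own assumptions: SetUpdates and updateInSet are hypothetical names, and the policy for a change that produces a duplicate (the changed item is simply left out of the set) is one arbitrary choice among several.

```java
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper for changing an element that is already stored in a hash-based set.
final class SetUpdates {
    // Removes item while its hash code still matches its bucket, applies the change,
    // then re-adds it under the new hash code.
    // Returns false if item was not in the set (no change is applied), or if the
    // changed item now equals another element (it is then left out of the set).
    static <T> boolean updateInSet(Set<T> set, T item, Consumer<T> change) {
        if (!set.remove(item)) {
            return false;
        }
        change.accept(item);
        return set.add(item);
    }
}
```

A call would look like `SetUpdates.updateInSet(members, member, m -> m.setX(0))`, where `setX` stands for whatever hypothetical setter affects hashCode and equals.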
Jon Skeet has listed all alternatives. As for why the keys in a Map or Set must not change:
The contract of a Set implies that at any time, there are no two objects o1 and o2 such that
o1 != o2 && set.contains(o1) && set.contains(o2) && o1.equals(o2)
Why that is required is especially clear for a Map. From the contract of Map.get():
More formally, if this map contains a mapping from a key
k to a value v such that (key==null ? k==null : key.equals(k)), then this method returns v, otherwise it returns null. (There can be at most one such mapping.)
Now, if you modify a key inserted into a map, you might make it equal to some other key already inserted. Moreover, the map can not know that you have done so. So what should the map do if you then do map.get(key), where key is equal to several keys in the map? There is no intuitive way to define what that would mean - chiefly because our intuition for these datatypes is the mathematical ideal of sets and mappings, which don't have to deal with changing keys, since their keys are mathematical objects and hence immutable.
Theoretically (and more often than not practically too) your class either:
has a natural immutable identity that can be inferred from a subset of its fields, in which case you can use those fields to generate the hashCode from.
has no natural identity, in which case using a Set to store them is unnecessary, you could just as well use a List.
Never change a "hashable field" after putting the object in a hash-based container.
It is as if you (Member) registered your phone number (Member.x) in the yellow pages (the hash-based container) and then changed your number: no one can find you in the yellow pages any more.

Java: Equalator? (removing duplicates from a collection of objects)

I have a bunch of objects of a class Puzzle. I have overridden equals() and hashCode(). When it comes time to present the solutions to the user, I'd like to filter out all the Puzzles that are "similar" (by the standard I have defined), so the user only sees one of each.
Similarity is transitive.
Example:
Result of computations:
A (similar to D)
B (similar to C)
C
D
In this case, only A or D and B or C would be presented to the user - but not two similar Puzzles. Two similar puzzles are equally valid. It is only important that they are not both shown to the user.
To accomplish this, I wanted to use an ADT that prohibits duplicates. However, I don't want to change the equals() and hashCode() methods to return a value about similarity instead. Is there some Equalator, like Comparator, that I can use in this case? Or is there another way I should be doing this?
The class I'm working on is a Puzzle that maintains a grid of letters. (Like Scrabble.) If a Puzzle contains the same words, but is in a different orientation, it is considered to be similar. So the following puzzle:
(2, 2): A
(2, 1): C
(2, 0): T
Would be similar to:
(1, 2): A
(1, 1): C
(1, 0): T
Okay, you have a way of measuring similarity between objects. That means they form a metric space.
The question is, is your space also a Euclidean space, like normal three-dimensional space, or integers, or something like that? If it is, then you could use a binary space partition in however many dimensions you've got.
(The question is, basically: is there a homomorphism between your objects and an n-dimensional real number vector? If so, then you can use techniques for measuring closeness of points in n-dimensional space.)
Now, if it's not a Euclidean space then you've got a bigger problem. An example of a non-Euclidean space that programmers might be most familiar with would be the Levenshtein distance between two strings.
If your problem is similar to seeing how similar a string is to a list of already existing strings, then I don't know of any algorithms that would do that in less than O(n^2) time. Maybe there are some out there.
But another important question is: how much time do you have? How many objects? If you have time, or if your data set is small enough that an O(n^2) algorithm is practical, then you just have to iterate through your list of objects to see whether the new one is within a certain threshold of any of them. If so, reject it.
Just extend AbstractCollection and override the add method. Use an ArrayList or whatever. Your code would look kind of like this:
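import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;

abstract class SimilarityRejector<T> extends AbstractCollection<T> {
    private final ArrayList<T> base;
    private final double threshold;
    public SimilarityRejector(double threshold) {
        base = new ArrayList<T>();
        this.threshold = threshold;
    }
    // Distance between two items. Left abstract here as an assumption, since the
    // measure depends on your own notion of similarity.
    protected abstract double similarityComparison(T a, T b);
    @Override
    public boolean add(T t) {
        // Reject t if it is closer than the threshold to anything already stored.
        for (T compare : base) {
            if (similarityComparison(t, compare) < threshold) return false;
        }
        return base.add(t);
    }
    @Override
    public Iterator<T> iterator() {
        return base.iterator();
    }
    @Override
    public int size() {
        return base.size();
    }
}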
etc. Obviously T would need to be a subclass of some class that you can perform a comparison on. If you have a Euclidean metric, then you can use a space partition, rather than going through every other item.
I'd use a wrapper class that overrides equals and hashCode accordingly.
private static class Wrapper {
    // An instance field: a static final field could not be assigned in the constructor.
    public final Puzzle puzzle;
    public Wrapper(Puzzle puzzle) {
        this.puzzle = puzzle;
    }
    @Override
    public boolean equals(Object object) {
        // ...
    }
    @Override
    public int hashCode() {
        // ...
    }
}
and then you wrap all your puzzles, put them in a map, and get them out again…
public Collection<Collection<Puzzle>> method(Collection<Puzzle> puzzles) {
    Map<Wrapper, Collection<Puzzle>> map = new HashMap<Wrapper, Collection<Puzzle>>();
    for (Puzzle each: puzzles) {
        Wrapper wrapper = new Wrapper(each);
        Collection<Puzzle> coll = map.get(wrapper);
        if (coll == null) map.put(wrapper, coll = new ArrayList<Puzzle>());
        coll.add(each);
    }
    return map.values();
}
Create a TreeSet using your Comparator.
Add all elements to the set.
All duplicates are stripped out.
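Those three steps might look like the sketch below. ComparatorDedup and distinctBy are hypothetical names, and the approach assumes the Comparator defines a consistent total order in which "similar" items compare as 0.

```java
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical helper: de-duplicates by whatever the comparator treats as equal.
final class ComparatorDedup {
    // Keeps the first element of each group for which compare(...) == 0;
    // later elements of the same group are ignored by the set.
    static <T> TreeSet<T> distinctBy(List<T> items, Comparator<? super T> comparator) {
        TreeSet<T> unique = new TreeSet<>(comparator);
        unique.addAll(items);
        return unique;
    }
}
```

For instance, `distinctBy(Arrays.asList("cat", "CAT", "dog"), String.CASE_INSENSITIVE_ORDER)` yields a set of two elements, "cat" and "dog".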
Normally "similarity" is not a transitive relationship. So the first step would be to think of this in terms of equivalence rather than similarity. Equivalence is reflexive, symmetric and transitive.
Easy approach here is to define a puzzle wrapper whose equals() and hashCode() methods are implemented according to the equivalence relation in question.
Once you have that, drop the wrapped objects into a java.util.Set and that filters out duplicates.
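One way to sketch such a wrapper is to derive equals() and hashCode() from a canonical key, so that all members of an equivalence class share the same key. Everything named here (EquivalenceWrapper, distinct, the key function) is hypothetical; for Puzzle, the key function would have to map every orientation of the same puzzle to one normal form, which is not shown.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical wrapper: two wrappers are equal when their canonical keys are equal.
final class EquivalenceWrapper<T, K> {
    final T value;
    private final K key; // assumed non-null

    EquivalenceWrapper(T value, Function<T, K> canonicalKey) {
        this.value = value;
        this.key = canonicalKey.apply(value);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof EquivalenceWrapper)) return false;
        return key.equals(((EquivalenceWrapper<?, ?>) o).key);
    }

    @Override
    public int hashCode() {
        return key.hashCode();
    }

    // Wraps every item, lets the Set drop equivalent ones, and unwraps the survivors.
    static <T, K> List<T> distinct(List<T> items, Function<T, K> canonicalKey) {
        Set<EquivalenceWrapper<T, K>> seen = new LinkedHashSet<>();
        for (T item : items) {
            seen.add(new EquivalenceWrapper<>(item, canonicalKey));
        }
        List<T> result = new ArrayList<>();
        for (EquivalenceWrapper<T, K> wrapper : seen) {
            result.add(wrapper.value);
        }
        return result;
    }
}
```

The original equals() and hashCode() of the wrapped class stay untouched, which is the point of the wrapper.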
IMHO, the most elegant way was described by Gili (TreeSet with custom Comparator).
But if you'd like to do it yourself, this seems to be the easiest and clearest solution:
/**
 * Distinct input list values (cuts duplications)
 * @param items items to process
 * @param comparator comparator to recognize equal items
 * @return new collection with unique values
 */
public static <T> Collection<T> distinctItems(List<T> items, Comparator<T> comparator) {
    List<T> result = new ArrayList<>();
    for (int i = 0; i < items.size(); i++) {
        T item = items.get(i);
        boolean exists = false;
        for (int j = 0; j < result.size(); j++) {
            if (comparator.compare(result.get(j), item) == 0) {
                exists = true;
                break;
            }
        }
        if (!exists) {
            result.add(item);
        }
    }
    return result;
}
