I wrote a maze-solving program which is supposed to support DFS, BFS, A*, Dijkstra's, and a greedy algorithm. Anyway, I chose PriorityQueue for my frontier data structure, since I thought a priority queue can behave like a queue, a stack, or a priority queue depending on the implementation of the comparator.
This is how I implemented my comparator to turn the priority queue into a queue:
Since the "natural ordering" of a priority queue has the least element at the head, and a conventional comparator returns -1 when the first argument is less than the second, the hacked comparator always returns 1 so that the current (last) square will be placed at the tail (this should work recursively):
public int compare(Square square1, Square square2)
{
    return 1;
}
However, my solution for the maze was not optimal after I did a BFS.
The maze starts at the top-right corner, at coordinate (35,1), and my program checks the left, then up, then down, then right neighbour.
Here are the printlns I did:
polled out (35,1)
added (34,1)
added (35,2)
polled out (34,1)
added (33,1)
added (34,2)
polled out (35,2)
added (35,3)
polled out (33,1)
added (32,1)
added (33,2)
polled out (34,2)
added (34,3)
polled out (32,1)
......
Notice that in a BFS, (35,3) should be polled out before (32,1), since the former was added to the queue before the latter. What really confused me is that the data structure behaved like a queue--all new members were added at the back--until I added (32,1), which was placed at the head of the queue.
I thought my comparator would force the priority queue to put newcomers at the back. What is even stranger to me is that the data structure changed its nature from a queue to a stack in the middle.
Many thanks to you guys ahead and sorry about my poor English,
Sincerely,
Sean
The way you've implemented compare is wrong, and would only work if it were called in the very specific way you're assuming. However, you have no idea in what context the PriorityQueue actually calls compare. The compare function might well be called on an existing element inside the data structure instead of the new one, or vice versa.
(Even if you did read the source code and traced it and found that this particular implementation works in a certain way, you shouldn't depend on that if you want your code to be maintainable. At the least, you'd be making yourself more work by having to explain why it works.)
You could just use some sort of counter, assign its current value to each added item, and then implement compare correctly based on that value.
A correct implementation of compare might look like this:
public int compare(Square x, Square y) {
    // Fine while getSomeProperty() returns small, non-negative ints;
    // in the general case use Integer.compare to avoid overflow.
    return x.getSomeProperty() - y.getSomeProperty();
}
Note that if you switch the order of the parameters, the sign of the result flips as well. And no, the returned int does not have to come from {-1, 0, 1}: the spec calls for zero, or any negative or positive integer. You can use any value you wish, so long as its sign is correct.
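To make the counter suggestion concrete, here is a minimal sketch (the FifoFrontier and Entry names are illustrative, not from your code): each element is wrapped with a monotonically increasing sequence number, and the comparator orders by that number alone, so the head of the priority queue is always the oldest element.

import java.util.PriorityQueue;
import java.util.concurrent.atomic.AtomicLong;

public class FifoFrontier<T> {
    private static final AtomicLong SEQ = new AtomicLong();

    private static final class Entry<T> {
        final long order = SEQ.getAndIncrement(); // stamped at insertion time
        final T value;
        Entry(T value) { this.value = value; }
    }

    // Lowest sequence number at the head: first in, first out.
    private final PriorityQueue<Entry<T>> pq =
            new PriorityQueue<>((a, b) -> Long.compare(a.order, b.order));

    public void add(T value) { pq.add(new Entry<>(value)); }

    public T poll() {
        Entry<T> e = pq.poll();
        return e == null ? null : e.value;
    }
}

Flipping the comparison to Long.compare(b.order, a.order) turns the same frontier into a stack for DFS, which is presumably the flexibility you were after.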
I have a list of objects that I am attempting to keep sorted in ascending order, by inserting each item in a for loop like so.
private static void addObjectToIndex(classObject object) {
    for (int i = 0; i < collection.size(); i++)
    {
        if (collection.get(i).ID >= object.ID)
        {
            collection.insertElementAt(object, i);
            return;
        }
    }
    if (classObject.size() == 0)
        classObject.add(object);
}
This is faster than sorting the collection every time I call that function; sorting each time would be simpler but slower, since each insertion costs O(N) as opposed to Collections.sort's O(N log N) per call (unless I'm wrong).
The problem is that when I run Collections.binarySearch to grab an item out of the Vector collection (the collection requires method calls to be atomic), it still ends up returning negative numbers, as shown in the code below.
Comparator<classObject> c = new Comparator<classObject>()
{
    public int compare(classObject u1, classObject u2)
    {
        int z1 = (int) (u1).ID;
        int z2 = (int) (u2).ID;
        if (z1 > z2)
            return 1;
        return z2 <= z1 ? 0 : -1;
    }
};
int result = Collections.binarySearch(collection, new classObject(pID), c);
if (result < 0)
    return null;
if (collection.get(result).ID != pID)
    return null;
else
    return collection.get(result);
Something like
result = -1043246
shows up in the debugger, resulting in the second code snippet returning null.
Is there something I'm missing here? It's probably brain-dead simple. I've tried adjusting the comparison in the for loop that places things in order (<=, >=, <, and >) and it doesn't work. Inserting the object at index i+1 doesn't work either. It still returns null, which makes the entire program blow up.
Any insight would be appreciated.
Boy, did you get here from the 80s, because it sure sounds like you've missed quite a few API updates!
This is faster than sorting the collection every time I call that function; sorting each time would be simpler but slower, since each insertion costs O(N) as opposed to Collections.sort's O(N log N) per call (unless I'm wrong).
You're now spending O(n) on every insert, so that's O(n^2) total, versus the model of 'add everything you want to add without sorting' and then 'at the very end, sort the entire list', which is O(n log n).
Vector is thread-safe, which is why I'm using it as opposed to something else, and that can't change
Nope. Thread safety is not that easy; what you've written isn't thread-safe.
Vector is obsolete and should never be used. What Vector gives you (vs. ArrayList) is that each individual operation on it is thread-safe (i.e. atomic). Note that you can get this behaviour from any list if you really need it with List<T> myList = Collections.synchronizedList(someList);, but it is highly unlikely you want this.
Take your current impl of addObjectToIndex: it is not atomic. It makes many separate method calls on your vector, and these have zero guarantee of being consistent with one another. If two threads both call addObjectToIndex and your computer has more than one core, you will eventually end up with a list that looks like [1, 2, 5, 4, 10] - i.e., not sorted.
The method just doesn't work properly unless its view of your collection is consistent for the entirety of its run. In other words, the whole block needs to be atomic: it either does all of it or none of it, with a consistent view throughout. So stick a synchronized block around the whole thing - in contrast to Vector, which makes each individual call atomic and nothing more, which doesn't help here. More generally, 'just synchronize everything' is a rather inefficient way to use multiple cores - the collections in the java.util.concurrent package are usually vastly more efficient and much easier to use, so you should read through that API and see if anything there works for you.
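As a sketch, using the names from the question (the shared LOCK object is illustrative):

private static final Object LOCK = new Object();

private static void addObjectToIndex(classObject object) {
    // The whole scan-then-insert sequence holds one lock, so no other thread
    // can change the list between finding the position and inserting.
    synchronized (LOCK) {
        for (int i = 0; i < collection.size(); i++) {
            if (collection.get(i).ID >= object.ID) {
                collection.insertElementAt(object, i);
                return;
            }
        }
        collection.add(object); // larger than everything present, or list empty
    }
}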
if(z1 > z2) return 1;
Double-check that your insert code and your comparator actually agree on the ordering; if they ever disagree, the binary search breaks (binarySearch is specced to return arbitrary garbage if the list isn't sorted according to the comparator you hand it). You should use the same comparator everywhere the ordering matters, and not re-implement the logic multiple times (or if you do, at least test it!).
There is also no need to write all this code.
Comparator<classObject> c = Comparator.comparingInt(co -> co.ID);
is all you need.
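The point is then to use that one comparator both to keep the list sorted and to search it; a quick sketch with the question's names:

Comparator<classObject> byId = Comparator.comparingInt(co -> co.ID);
collection.sort(byId); // Vector implements List, so sort(comparator) works on Java 8+
int result = Collections.binarySearch(collection, new classObject(pID), byId);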
However
It looks like what you really want is a collection that keeps itself continually sorted. Java has that; it's called TreeSet. You pass it a Comparator (or you don't, and TreeSet expects the elements you put in to have a natural order; either is fine), and it will keep the collection sorted continually, at very cheap cost (O(log n) per insert, so O(n log n) total - far better than your O(n^2)!). It IS a set, meaning that if the comparator says two items are equal, adding both results in the second add call being ignored (sets cannot contain the same element more than once, and for a TreeSet, 'the same element' is defined solely by 'comparing them returns 0' - TreeSet ignores hashCode and equals entirely).
This sounds like what you really want. If you need two different objects with the same ID to be added anyway, then add more fields to your comparator (instead of returning 0 on the same ID, move on to comparing an insertion timestamp or the like), as in the sketch below. But with a name like 'ID', it sounds like duplicates aren't going to happen.
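A sketch of that, reusing the question's type (the insertionSeq tie-breaker field is hypothetical):

TreeSet<classObject> sorted = new TreeSet<>(
        Comparator.<classObject>comparingInt(co -> co.ID)     // primary key: ID
                  .thenComparingLong(co -> co.insertionSeq)); // hypothetical tie-breaker
sorted.add(new classObject(pID)); // O(log n), and the set stays sorted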
The reason you want to use this off-the-shelf stuff is that otherwise you need to do it yourself, and if you're going to endeavour to write it yourself, you need to be a good programmer. Which you clearly aren't yet (we all started as newbies and learned to become good later; it's the natural order of things). For example, if you try to add an element to a non-empty collection where the element has a larger ID than anything in the collection, your method just won't add anything. That's because you wrote if (classObject.size() == 0) classObject.add(object); but you wanted classObject.add(object); without the if. Also, in Java we write ClassObject, not classObject, and more generally, ClassObject is a completely meaningless name. Find a better name; that helps code be less confusing, and this question does suggest you could use some of that.
I need to implement a priority queue where the priority of an item in the queue can change and the queue adjusts itself so that items are always removed in the correct order. I have some ideas of how I could implement this but I'm sure this is quite a common data structure so I'm hoping I can use an implementation by someone smarter than me as a base.
Can anyone tell me the name of this type of priority queue so I know what to search for or, even better, point me to an implementation?
Priority queues such as this are typically implemented using a binary heap data structure, as someone else suggested, which is usually represented using an array but could also use a binary tree. It actually is not hard to increase or decrease the priority of an element in a heap. If you know you are changing the priorities of many elements before the next element is popped from the queue, you can temporarily turn off dynamic reordering, insert all of the elements at the end of the heap, and then reorder the entire heap (at a cost of O(n)) just before an element needs to be popped. The important thing about heaps is that it costs only O(n) to put an array into heap order, but O(n log n) to sort it.
I have used this approach successfully in a large project with dynamic priorities.
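In Java, that O(n) bulk rebuild is what PriorityQueue's collection constructor gives you; here is a sketch of the 'batch the changes, then rebuild' idea (the Task type is illustrative):

import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

class Task implements Comparable<Task> {
    int priority; // mutable between rebuilds
    Task(int priority) { this.priority = priority; }
    public int compareTo(Task o) { return Integer.compare(priority, o.priority); }
}

public class BatchReorder {
    public static void main(String[] args) {
        PriorityQueue<Task> heap =
                new PriorityQueue<>(List.of(new Task(3), new Task(1), new Task(2)));

        // Change many priorities without touching the heap each time...
        List<Task> snapshot = new ArrayList<>(heap);
        for (Task t : snapshot) t.priority += 10;

        // ...then rebuild once: the collection constructor heapifies in O(n),
        // versus O(n log n) for re-adding the elements one at a time.
        heap = new PriorityQueue<>(snapshot);
        System.out.println(heap.poll().priority); // 11, the smallest new priority
    }
}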
Here is my parameterized priority queue implementation in the Curl programming language.
A standard binary heap supports 5 operations (the examples below assume a max-heap):
* find-max: return the maximum node of the heap
* delete-max: remove the root node of the heap
* increase-key: update a key within the heap
* insert: add a new key to the heap
* merge: join two heaps to form a valid new heap containing all the elements of both
As you can see, in a max-heap you can increase an arbitrary key, and in a min-heap you can decrease an arbitrary key. Unfortunately you can't change keys both ways - but will this do? If you need to change keys both ways, then you might want to think about using a min-max heap.
I would suggest first trying the head-on approach to updating a priority:
delete the item from the queue
re-insert it with the new priority
In C++, this could be done using a std::multimap; the important thing is that the object must remember where it is stored in the structure so it can delete itself efficiently. For the re-insert, it is hard to do better, since you cannot presume to know anything about the other priorities.
#include <map>
#include <string>

class Item;
typedef std::multimap<int, Item*> priority_queue;

class Item
{
public:
    Item() : mPriority(0), mQueue(nullptr) {}

    void add(priority_queue& queue);
    void remove();

    int getPriority() const;
    void setPriority(int priority);

    std::string& accessData();
    const std::string& getData() const;

private:
    int mPriority;
    std::string mData;
    priority_queue* mQueue;              // null when not currently queued
    priority_queue::iterator mIterator;  // where this item lives in the map
};
void Item::add(priority_queue& queue)
{
mQueue = &queue;
mIterator = queue.insert(std::make_pair(mPriority,this));
}
void Item::remove()
{
    mQueue->erase(mIterator); // mQueue is a pointer, so use ->
    mQueue = nullptr;
    mIterator = priority_queue::iterator();
}
void Item::setPriority(int priority)
{
mPriority = priority;
if (mQueue)
{
priority_queue& queue = *mQueue;
this->remove();
this->add(queue);
}
}
I am looking for exactly the same thing! And here is my idea: since the priority of an item keeps changing, it's meaningless to sort the queue before retrieving an item. So we should forget about using a priority queue, and instead "partially" sort the container while retrieving an item.
We can choose from the following STL sorting algorithms:
a. partition
b. stable_partition
c. nth_element
d. partial_sort
e. partial_sort_copy
f. sort
g. stable_sort
partition, stable_partition, and nth_element are linear-time algorithms, so they should be our first choices.
BUT it seems those algorithms are not provided in the standard Java library. As a result, I suggest you use java.util.Collections.max/min to do what you want.
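For example (a sketch; the Item type and getPriority accessor are illustrative): since nothing is kept sorted between retrievals, priorities can change freely, and each retrieval is a single O(n) scan.

// Highest current priority wins; remove it from the plain, unsorted list.
Item best = Collections.max(items, Comparator.comparingInt(Item::getPriority));
items.remove(best);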
Google has a number of answers for you, including an implementation of one in Java.
However, this sounds like something that would be a homework problem, so if it is, I'd suggest trying to work through the ideas yourself first, then potentially referencing someone else's implementation if you get stuck somewhere and need a pointer in the right direction. That way, you're less likely to be "biased" towards the precise coding method used by the other programmer and more likely to understand why each piece of code is included and how it works. Sometimes it can be a little too tempting to do the paraphrasing equivalent of "copy and paste".
I am new to programming in general and to Java in particular. I want to implement an LRU cache and would like it to have O(1) complexity.
I have seen some implementations on the Internet that use a manually implemented doubly linked list for the cache (two arrays, a Node class, previous/next pointers, etc.) and a HashMap where the key is the item to be cached and the value is the timestamp.
I really don't see the reason to use timestamps: the inserted item goes to the head of the manually implemented linked list, the evicted item is the cached item at the tail, and with every insertion the previously cached items shift one position towards the tail.
The only problems that I see are the following:
For the cache lookup (to find whether we have a cache hit or miss for the requested item), we have to "scan" the cache list, which implies a for loop of some type (conventional, for-each, etc.; I don't really care much at this point). Obviously, we don't want that. I believe this issue can be solved easily by using an array of boolean values to indicate whether an item is in the cache or not (true: in, false: out) - let's call it lookupArray. Say the items are distinguished by some numeric ID, i.e. an integer between 1 and N. Then lookupArray will have size N+1 (index 0 goes unused, since array indexing starts from zero) and it will be initialized to all false. When the item with numeric ID k, where 1 <= k <= N, enters the cache, we set lookupArray[k] to true. That way, cache lookup does not need any search through the cache: to check whether the item with numeric ID k is in the cache, we simply check whether lookupArray[k] is true or false. (We already have the index, i.e. we know where to look, so there is no need for a for loop.)
The second problem, though, is not easily solvable. Say we have a cache hit for an item. If this item is not located at the head (i.e. it is not the most recently used item), we have to locate it in the cache list and then move it to the head. As far as I understand, this implies searching the cache list, i.e. a for loop. Then we can't achieve the O(1) objective.
Am I right about (2)? Is there any way to do this without using a HashMap and timestamps?
Since I am relatively new to programming, as I stated at the beginning of the post, I would really appreciate code snippets demonstrating the implementation with a manually implemented doubly linked list, if possible.
Sorry for the long message, I hope it is not only detailed but also clear.
Thank you!
Consider using a queue (in practice a Deque, which allows you to remove an object and insert it at the beginning). It also has a size and can be used for caching.
http://docs.oracle.com/javase/7/docs/api/java/util/Queue.html
Or maybe you should not implement it yourself: there is an LRUMap available in the Apache Commons Collections library.
https://commons.apache.org/proper/commons-collections/javadocs/api-3.2.1/org/apache/commons/collections/map/LRUMap.html
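If you'd rather stay inside the JDK, LinkedHashMap in access order gives the same behaviour - internally it is exactly the HashMap-plus-doubly-linked-list design from the question, which is how it achieves O(1) lookup and reordering. A minimal sketch:

import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; returning true evicts the least recently used entry.
        return size() > capacity;
    }
}

Both get and put are O(1), and get automatically moves the accessed entry to the most recently used position.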
I made a Java program to traverse a tree depth-first. The program is correct, but the choice of which child of a node to visit next is random. For example, in this tree:
sometimes the result is:
A-B-E-C-F-D
A-C-F-D-B-E
A-B-E-D-C-F
I want to write tests (unit tests) for this program, but I have no idea how to do it. Please help.
I thought of making a List that contains the expected elements and comparing it with the result of my depth-first traversal, but since the traversal result is random, I cannot compare it with the elements of the List.
There are 2 properties you want to test for:
Each node is visited exactly once
Traversal is depth first
The first is easy to test: the number of unique nodes visited must equal the number of nodes in the tree. You can test that against any random tree.
The second is slightly trickier - expressing it in the general case is probably more complex than the code under test. It's easier to pick some representative constraints based on the specific known data, i.e.
B must be after A
E must be immediately after B
...
It's hard to conceive of realistic code that satisfies the first property for all trees but would fail the second only in specific cases. So outside of the most formal of safety-critical systems (and what are they doing using dynamic data structures anyway?), that's going to be enough.
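A sketch of both checks in JUnit, assuming hypothetical dfs(root) and buildExampleTree() helpers that produce the six-node tree from the question and return the visit order:

import static org.junit.Assert.*;
import java.util.HashSet;
import java.util.List;
import org.junit.Test;

public class DepthFirstTest {
    @Test
    public void visitsEachNodeExactlyOnce() {
        List<String> order = dfs(buildExampleTree()); // hypothetical helpers
        assertEquals(6, order.size());
        assertEquals(6, new HashSet<>(order).size()); // no node visited twice
    }

    @Test
    public void respectsDepthFirstConstraints() {
        List<String> order = dfs(buildExampleTree());
        assertTrue(order.indexOf("B") > order.indexOf("A"));      // B after A
        assertEquals(order.indexOf("B") + 1, order.indexOf("E")); // E immediately after B
    }
}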
I haven't clicked on your link, but if the code is truly random and is intended to be, then you should write your unit test so that it says "given this input, the output must be one of these three things". This isn't ideal, because it might take many, many runs before a bug shows up (i.e. the first few times you run it, the randomness might just mask the bug), but I suspect it's the best you can do when testing an algorithm with random behaviour.
This means that the order of the children of each node is not deterministic. You probably used a Set to hold children. Consider using a LinkedHashSet (which preserves insertion order) or a SortedSet (which sorts children). This way, the order will always be the same.
If randomness is a feature of your tree and you want to keep it as is, then see the other answers, or change the algorithm itself to make sure you always sort the children while traversing the tree.
Choose a data set for the unit test that has just a few valid results (it should however have more than one, obviously), and test whether the result is one of them.
Alternatively, you could try to impose a well-defined order on the nodes (e.g. by alphabetically sorting the children of each node, instead of managing them in a Set)
I have a set of time stamped values I'd like to place in a sorted set.
public class TimedValue {
    public Date time;
    public double value;

    public TimedValue(Date time, double value) {
        this.time = time;
        this.value = value;
    }
}
The business logic for sorting this set says that values must be ordered in descending value order, unless it's more than 7 days older than the newest value.
So as a test, I came up with the following code...
DateFormat dateFormatter = new SimpleDateFormat("MM/dd/yyyy");
TreeSet<TimedValue> mySet = new TreeSet<TimedValue>(new DateAwareComparator());
mySet.add(new TimedValue(dateFormatter.parse("01/01/2009"), 4.0 )); // too old
mySet.add(new TimedValue(dateFormatter.parse("01/03/2009"), 3.0)); // Most relevant
mySet.add(new TimedValue(dateFormatter.parse("01/09/2009"), 2.0));
As you can see, initially the first value is more relevant than the second, but once the final value is added to the set, the first value has expired and should be the least relevant.
My initial tests say that this should work... that the TreeSet will dynamically reorder the entire list as more values are added.
But even though I see it, I'm not sure I believe it.
Will a sorted collection reorder the entire set as each element is added? Are there any gotchas to using a sorted collection in this manner (i.e. performance)? Would it be better to manually sort the list after all values have been added (I'm guessing it would be)?
Follow-up:
As many (and even I to a certain extent) suspected, the sorted collection does not support this manner of "dynamic reordering". I believe my initial test was "working" quite by accident. As I added more elements to the set, the "order" broke down quite rapidly. Thanks for all the great responses, I refactored my code to use approaches suggested by many of you.
I don't see how your comparator can even detect the change, unless it remembers the newest value it's currently seen - and that sounds like an approach which is bound to end in tears.
I suggest you do something along the following lines:
Collect your data in an unordered set (or list)
Find the newest value
Create a comparator based on that value, such that all comparisons using that comparator will be fixed (i.e. it will never return a different result based on the same input values; the comparator itself is immutable although it depends on the value originally provided in the constructor)
Create a sorted collection using that comparator (in whatever way seems best depending on what you then want to do with it)
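A sketch of such a fixed comparator, built from the question's TimedValue class (the seven-day cutoff math is inlined for brevity):

import java.util.Comparator;
import java.util.Date;

class SnapshotComparator implements Comparator<TimedValue> {
    private final Date cutoff; // newest time minus 7 days, fixed at construction

    SnapshotComparator(Date newest) {
        this.cutoff = new Date(newest.getTime() - 7L * 24 * 60 * 60 * 1000);
    }

    @Override
    public int compare(TimedValue a, TimedValue b) {
        boolean aExpired = a.time.before(cutoff);
        boolean bExpired = b.time.before(cutoff);
        if (aExpired != bExpired) return aExpired ? 1 : -1; // expired values sort last
        return Double.compare(b.value, a.value);            // otherwise descending by value
    }
}

Because every field is final, the comparator's answers can never change underneath the sorted collection. Note that two values comparing equal would still collapse in a TreeSet, so a real version needs a tie-breaker.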
I would advise against this for a few reasons:
Since it's basically a red-black tree behind the scenes (which doesn't necessarily have to be rebuilt from scratch on every insertion), you might easily end up with values in the wrong part of the tree (invalidating most of the TreeSet API).
The behavior is not defined in the spec, and thus may change later even if it's working now.
In the future, when anything goes strangely wrong in anything remotely touching this code, you'll spend time suspecting that this is the cause.
I would recommend either recreating/re-sorting the TreeSet before searching it or (my preference) iterating through the set before the search and removing any of the objects that are too old. You could even, if you wanted to trade some memory for speed, keep a second list ordered by date and backed by the same objects, so that all you would have to do to filter your TreeSet is remove objects from it based on the time-sorted list.
I don't believe the JDK libraries, or even 3rd-party libraries, are written to handle a comparator whose results are not consistent over time. I wouldn't depend on this working. I would worry even more if your Comparator can return not-equal for two values when called at one time, and equal for the same two values when called later.
Read carefully the contract of Comparator.compare(). Does your Comparator satisfy those constraints?
To elaborate, if your Comparator returns that two values are not equals when you call it once, but then later returns that the two values are equal because a later value was added to the set and has changed the output of the Comparator, the definition of "Set" (no duplicates) becomes undone.
Jon Skeet's answer is excellent advice and will avoid the need to worry about these sorts of problems. Truly, if your Comparator does not return values consistent with equals(), then you can have big problems. Whether or not a sorted set will re-sort each time you add something, I wouldn't depend on; but the worst thing that would occur from the order changing is that your set would not remain sorted.
No, this won't work.
If you are using comparable keys in a collection, the results of the comparison between two keys must remain the same over time.
When storing keys in a binary tree, each fork in the path is chosen as the result of the comparison operation. If a later comparison returns a different result, a different fork will be taken, and the previously stored key will not be found.
I am 99% certain this will not work. If a value in the Set suddenly changes its comparison behaviour, it is possible (quite likely, actually) that it will not be found anymore; i.e. set.contains(value) will return false, because the search algorithm will at one point do a comparison and continue in the wrong subtree because that comparison now returns a different result than it did when the value was inserted.
I think the non-changing nature of a Comparator is supposed to be on a per-sort basis, so as long as you are consistent for the duration of a given sorting operation, you are ok (so long as none of the items cross the 7 day boundary mid-sort).
However, you might want to make it more obvious that you are asking specifically about a TreeSet, which I imagine re-uses information from previous sorts to save time when you add a new item so this is a bit of a special case. The TreeSet javadocs specifically defer to the Comparator semantics, so you are probably not officially supported, but you'd have to read the code to get a good idea of whether or not you are safe.
I think you'd be better off doing a complete sort when you need the data sorted, using a single time as "now" so that you don't risk jumping that boundary if your sort takes long enough to make it likely.
It's possible that a record will change from <7 days to >7 days mid-sort, so what you're doing violates the rules for a comparator. Of course this doesn't mean it won't work: lots of things that are documented as "unpredictable" in fact work if you know exactly what is happening internally.
I think the textbook answer is: This is not reliable with the built-in sorts. You would have to write your own sort function.
At the very least, I would say that you can't rely on a TreeSet or any "sorted structure" magically resorting itself when dates roll over the boundary. At best this might work if you re-sort just before displaying, and don't rely on anything remaining correct between updates.
At worst, inconsistent comparisons might break the sorts badly. You have no assurance that this won't put you into an infinite loop or some other deadly black hole.
So I'd say: read the source code from Sun for whatever classes or functions you plan to use, and see if you can figure out what will happen. Testing is good, but there are potentially tricky cases that are difficult to test. The most obvious is: what if, while it's in the process of sorting, a record rolls over the date boundary? That is, it might look at a record once and say it's <7 days, but the next time it sees it, it's >7. That could be bad, bad news.
One obvious trick that occurs to me: Convert the date to an age at the time you add the record to the structure, rather than dynamically. That way it can't change within the sort. If the structure is going to live for more than a few minutes, recalculate the ages at some appropriate time and then re-sort. I doubt someone will say your program is incorrect because you said a record was less than 7 days old when really it's 7 days, 0 hours, 0 minutes, and 2 seconds old. Even if someone noticed, how accurate is their watch?
As already noted, a Comparator cannot do this for you, because the consistency of comparisons over time is violated. Basically, in order to sort the items, you must be able to compare any two of them independently of the rest (and of when the comparison runs), which you obviously cannot do here. So your scenario either would not work or would produce inconsistent results.
Maybe something simpler would be good enough for you:
apply a simple Comparator that orders by value, as you need,
and simply remove from your list/collection all elements that are more than 7 days older than the newest. Basically, whenever a new item is added, check whether it is the newest, and if it is, remove those that are more than 7 days older than it.
This would not work if you also remove items from the list; in that case you would need to keep all those you removed in a separate list (which, by the way, you would sort by date) and add them back to the original list if MAX(date) is smaller after a removal.
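A rough sketch of that pruning step, again using TimedValue from the question (the newest and values fields are illustrative):

// Track the newest timestamp on every insert and evict anything
// that has fallen more than 7 days behind it.
void add(TimedValue v) {
    if (newest == null || v.time.after(newest)) newest = v.time;
    values.add(v);
    final Date cutoff = new Date(newest.getTime() - 7L * 24 * 60 * 60 * 1000);
    values.removeIf(tv -> tv.time.before(cutoff));
}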