I've got an ArrayList called conveyorBelt, which stores orders that have been picked and placed on the conveyor belt. I've got another ArrayList called readyCollected which contains a list of orders that can be collected by the customer.
What I'm trying to do with the method I created is when an ordNum is entered, it returns true if the order is ready to be collected by the customer (thus removing the collected order from readyCollected). If the order hasn't even been picked yet, then it returns false.
I was wondering is this the right way to write the method...
public boolean collectedOrder(int ordNum)
{
    int index = 0;
    Basket b = new Basket(index);
    if (conveyorBelt.isEmpty()) {
        return false;
    }
    else {
        readyCollected.remove(b);
        return true;
    }
}
I'm a little confused since you're not using ordNum at all.
If you want to confirm operation of your code and generally increase the reliability of what you're writing, you should check out unit testing and the Java frameworks available for this.
You can solve this problem using an ArrayList, but I think that this is fundamentally the wrong way to think about the problem. An ArrayList is good for storing a complete sequence of data without gaps where you are only likely to add or remove elements at the very end. It's inefficient to remove elements at other positions, and if you have just one value at a high index, then you'll waste a lot of space filling in all lower positions with null values.
Instead, I'd suggest using a Map that associates order numbers with the particular order. This more naturally encodes what you want - every order number is a key associated with the order. Maps, and particularly HashMaps, have very fast lookups (expected constant time) and use (roughly) the same amount of space no matter how many keys there are. Moreover, the time to insert or remove an element from a HashMap is expected constant time, which is extremely fast.
As for your particular code, I agree with Brian Agnew on this one that you probably want to write some unit tests for it and find out why you're not using the ordNum parameter. That said, I'd suggest reworking the system to use HashMap instead of ArrayList before doing this; the savings in time and code complexity will really pay off.
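To make the HashMap suggestion concrete, here is a minimal sketch; the OrderTracker class, the markReady helper, and the use of String as the order type are all hypothetical stand-ins, since the original Basket class isn't shown:

```java
import java.util.HashMap;
import java.util.Map;

public class OrderTracker {
    // Order number -> order, instead of parallel ArrayLists.
    // String stands in for whatever the real order type is.
    private final Map<Integer, String> readyCollected = new HashMap<>();

    public void markReady(int ordNum, String order) {
        readyCollected.put(ordNum, order);
    }

    // Returns true (and removes the order) if it was ready; false otherwise.
    public boolean collectedOrder(int ordNum) {
        return readyCollected.remove(ordNum) != null;
    }

    public static void main(String[] args) {
        OrderTracker t = new OrderTracker();
        t.markReady(42, "groceries");
        System.out.println(t.collectedOrder(42)); // true, and order 42 is removed
        System.out.println(t.collectedOrder(42)); // false, already collected
    }
}
```

Note that Map.remove both looks up and removes in one expected-constant-time call, which is exactly the "collect" semantics the question describes.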
Based on your description, why isn't this sufficient:
public boolean collectedOrder(int ordNum) {
    return (readyCollected.remove(ordNum) != null);
}
Why does the conveyorBelt ArrayList even need to be checked?
As already pointed out, you most likely need to be using ordNum.
Aside from that the best answer anyone can give with the code you've posted is "perhaps". Your logic certainly looks correct and ties in with what you've described, but whether it's doing what it should depends entirely on your implementation elsewhere.
As a general pointer (which may or may not be applicable in this instance) you should make sure your code deals with edge cases and incorrect values. So you might want to flag something's wrong if readyCollected.remove(b); returns false for instance, since that indicates that b wasn't in the list to remove.
As already pointed out, take a look at unit tests using JUnit for this type of thing. It's easy to use and writing thorough unit tests is a very good habit to get into.
I have a list of classes that I am attempting to sort in ascending order, by adding items in a for loop like so.
private static void addObjectToIndex(classObject object) {
    for (int i = 0; i < collection.size(); i++)
    {
        if (collection.get(i).ID >= object.ID)
        {
            collection.insertElementAt(object, i);
            return;
        }
    }
    if (classObject.size() == 0)
        classObject.add(object);
}
This is faster than sorting it every time I call that function, as that would be simpler but slower, as it gives O(N) time as opposed to using Collections.sort's O(N log N) every time (unless I'm wrong).
The problem is that when I run Collections.binarySearch to attempt to grab an item out of the Vector collection (the collection requires method calls on an atomic basis), it still ends up returning negative numbers as shown in the code below.
Comparator<classObject> c = new Comparator<classObject>()
{
    public int compare(classObject u1, classObject u2)
    {
        int z1 = (int)(u1).ID;
        int z2 = (int)(u2).ID;
        if (z1 > z2)
            return 1;
        return z2 <= z1 ? 0 : -1;
    }
};
int result = Collections.binarySearch(collection, new classObject(pID), c);
if (result < 0)
    return null;
if (collection.get(result).ID != pID)
    return null;
else
    return collection.get(result);
Something like
result = -1043246
Shows up in the debugger, resulting in the second code snippet returning null.
Is there something I'm missing here? It's probably brain dead simple. I've tried adjusting the for loop that places things in order, <=, >=, < and > and it doesn't work. Adding object to the index i+1 doesn't work. Still returning null, which makes the entire program blow up.
Any insight would be appreciated.
Boy, did you get here from the 80s, because it sure sounds like you've missed quite a few API updates!
This is faster than sorting it every time I call that function, as that would be simpler but slower, as it gives O(N) time as opposed to using Collections.sort's O(N log N) every time (unless I'm wrong).
You're now spending an O(n) investment on every insert, so that's O(n^2) total, vs. the model of 'add everything you want to add without sorting it' and then 'at the very end, sort the entire list', which is O(n log n).
Vector is threadsafe which is why I'm using it as opposed to something else, and that can't change
Nope. Threadsafety is not that easy; what you've written isn't thread safe.
Vector is obsolete and should never be used. What Vector does (vs. ArrayList) is that each individual operation on a vector is thread safe (i.e. atomic). Note that you can get this behaviour from any list if you really need it with: List<T> myList = Collections.synchronizedList(someList);, but it is highly unlikely you want this.
Take your current impl of addObjectToIndex. It is not atomic: it makes many different method calls on your vector, and these have zero guarantee of being consistent. If two threads both call addObjectToIndex and your computer has more than one core, then you will eventually end up with a list that looks like: [1, 2, 5, 4, 10] - i.e., not sorted.
Take your addObjectToIndex method: that method just doesn't work properly unless its view of your collection is consistent for the entirety of the run. In other words, that block needs to be 'atomic' - it either does it all or does none of it, and it needs a consistent view throughout. Stick a synchronized block around the entire method body. This is in contrast to Vector, which makes each individual call atomic and nothing more, which doesn't help here. More generally, 'just synchronize' is a rather inefficient way to do multicore work - the various collections in java.util.concurrent are usually vastly more efficient and much easier to use. You should read through that API and see if there's anything that'll work for you.
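To illustrate the point, here is a sketch of making the whole scan-and-insert atomic; it uses an ArrayList plus an explicit lock rather than Vector, and the SortedInserter name and int elements are assumptions for the example:

```java
import java.util.ArrayList;
import java.util.List;

public class SortedInserter {
    private final List<Integer> collection = new ArrayList<>();
    private final Object lock = new Object();

    // The whole scan-and-insert must be atomic: per-call atomicity
    // (what Vector gives you) is not enough, because another thread
    // could insert between our get(i) and our add(i, id).
    public void addInOrder(int id) {
        synchronized (lock) {
            for (int i = 0; i < collection.size(); i++) {
                if (collection.get(i) >= id) {
                    collection.add(i, id);
                    return;
                }
            }
            collection.add(id); // largest so far: append at the end
        }
    }

    // Copy under the lock so callers never see a half-updated list.
    public List<Integer> snapshot() {
        synchronized (lock) {
            return new ArrayList<>(collection);
        }
    }
}
```

Note the unconditional append at the end, which also fixes the "larger than everything" bug discussed below.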
if(z1 > z2) return 1;
I'm pretty sure your insert code sorts ascending, but your comparator sorts descending. Which would break the binary search code (the binary search code is specced to return arbitrary garbage if the list isn't sorted, and as far as the comparator you use here is concerned, it isn't). You should use the same comparator anytime it is relevant, and not re-implement the logic multiple times (or if you do, at least test it!).
There is also no need to write all this code.
Comparator<classObject> c = Comparator.comparingInt(co -> co.ID);
is all you need.
However
It looks like what you really want is a collection that keeps itself continually sorted. Java has that; it's called a TreeSet. You pass it a Comparator (or you don't, and TreeSet expects that the elements you put in have a natural order, either is fine), and it will keep the collection sorted, at very cheap cost (far better than your O(n^2)!), continually. It IS a set, meaning if the comparator says that 2 items are equal, then adding both to the set results in the second add call being ignored (sets cannot contain the same element more than once, and for a TreeSet, 'the same element' is defined solely by 'comparing them returns 0' - TreeSet ignores hashCode and equals entirely).
This sounds like what you really want. If you need 2 different objects with the same ID to be added anyway, then add some more fields to your comparator (instead of returning 0 upon same ID, move on to checking the insertion timestamp or whatnot). But, with a name like 'ID', sounds like duplicates aren't going to happen.
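A minimal sketch of the TreeSet approach (the Item class and the build helper are hypothetical stand-ins for the asker's classObject):

```java
import java.util.Comparator;
import java.util.TreeSet;

public class IdIndex {
    static class Item {
        final int id;
        Item(int id) { this.id = id; }
    }

    // TreeSet keeps itself sorted on every add; an add that compares
    // equal (0) to an existing element is silently ignored.
    static TreeSet<Item> build(int... ids) {
        TreeSet<Item> items = new TreeSet<>(Comparator.comparingInt((Item it) -> it.id));
        for (int id : ids) {
            items.add(new Item(id));
        }
        return items;
    }

    public static void main(String[] args) {
        TreeSet<Item> items = build(10, 3, 7, 7);
        System.out.println(items.size());     // 3: the duplicate 7 was ignored
        System.out.println(items.first().id); // 3
        System.out.println(items.last().id);  // 10
    }
}
```

first(), last(), and the set's iteration order all reflect the comparator, so there is no separate sort step at all.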
The reason you want to use this off-the-shelf stuff is because otherwise you need to do it yourself, and if you're going to endeavour to write it yourself, you need to be a good programmer. Which you clearly aren't (yet - we all started as newbies and learned to become good later, it's the natural order of things). For example, if I try to add an element to a non-empty collection where the element I try to add has a larger ID than anything in the collection, it just won't add anything. That's because you wrote if (classObject.size() == 0) classObject.add(object); but you wanted classObject.add(object); without the if. Also, in Java we write ClassObject, not classObject, and more generally, ClassObject is a completely meaningless name. Find a better name; this helps code be less confusing, and this question does suggest you could use some of that.
What is the most efficient way to remember which objects are processed?
Obviously one could use a hash set:
Set<Foo> alreadyProcessed = new HashSet<>();

void process(Foo foo) {
    if (!alreadyProcessed.contains(foo)) {
        // Do something
        alreadyProcessed.add(foo);
    }
}
This makes me wonder why I would store the object, when I just want to check whether the hash exists in this set. Assuming that any hash of foo is unique.
Is there any more performant way to do this?
Bear in mind that a very large number of objects will be processed and that the actual process code will not always be very heavy. Also, it is not possible for me to have a precompiled worklist of objects; it will be built up dynamically during the processing.
Set#contains can be very fast. It depends on how your hashCode() and equals() methods are implemented. Try to cache the hash code value to make it faster (as String.java does).
The other simple and fastest option is to add a boolean member to your Foo class: foo.done = true;
You can't use the hashcode since equality of the hashcode of two objects does not imply that the objects are equal.
Otherwise, depending on the use case, you want to remember whether you have already processed
a) the same object, tested by reference, or
b) an equal object, tested by a call to Object.equals(Object)
For b) you can use a standard Set implementation.
For a) you can also use a standard Set implementation if you know that the equals method tests reference equality; otherwise you would need something like an identity-based Set.
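The JDK has no IdentityHashSet class, but a Set that tests membership by reference can be built from Collections.newSetFromMap over an IdentityHashMap; a sketch:

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class IdentitySetDemo {
    public static void main(String[] args) {
        // Membership is tested with ==, not equals().
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<>());

        String a = new String("foo");
        String b = new String("foo"); // equal to a, but a different object

        seen.add(a);
        System.out.println(seen.contains(a)); // true: same reference
        System.out.println(seen.contains(b)); // false: equal but distinct object
    }
}
```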
No mention of performance in this answer, you need to address correctness first!
Write good code. Optimize it for performance only if you can show that you need to in your use case.
There is no performance advantage in storing a hash code rather than the object. If you doubt that, remember that what is being stored is a reference to the object, not a copy of it. In reality that's going to be 64 bits, pretty much the same as the hash code. You've already spent a substantial amount of time thinking about a problem that none of your users will ever notice. (If you are doing these calculations millions of times in a tight, mission-critical loop, that's another matter.)
Using the set is simple to understand. Doing anything else runs a risk that a future maintainer will not understand the code and introduce a bug.
Also don't forget that a hash code is not guaranteed to be unique for every distinct object. Every so often storing the hash code will give you a false positive, causing you to fail to process an object you wanted to process. (As an aside, you need to make sure that equals() only considers two objects equal if they are the same object. The default Object.equals() does this, so don't override it.)
Use the Set. If you are processing a very large number of objects, use a more efficient Set than HashSet. That is much more likely to give you a performance speedup than anything clever with hashing.
If I have a collection of objects that I'd like to be able to look up by name, I could of course use a { string => object } map.
Is there ever a reason to use a vector of objects along with a { string => index into this vector } companion map, instead?
I've seen a number of developers do this over the years, and I've largely dismissed it as an indication that the developer is unfamiliar with maps, or is otherwise confused. But in recent days, I've started second-guessing myself, and I'm concerned I might be missing a potential optimization or something, though I can't for the life of me figure out what that could optimize.
There is one reason I can think of:
Besides looking up objects by name, sometimes you also want to iterate through all the objects as efficiently as possible. Using a map + vector can achieve this. You pay a very small penalty for accessing the vector via index, but you could gain a big performance improvement by iterating a vector rather than a map (because a vector is in contiguous memory and more cache friendly).
Of course you can do a similar thing using boost::multi_index, but that has some restrictions on the object itself.
I can think of at least a couple reasons:
For unrelated reasons you need to retain insertion order.
You want to have multiple maps pointing into the vector (different indexes).
Not all items in the vector need to be pointed to by a string.
There's no optimization. If you think about it, it might actually decrease performance (albeit by a few micronanoseconds). This is because the vector-based "solution" would require an extra step to find the object in the vector, while the non-vector-based solution doesn't have to do that.
For example, I would like to do something like the following in java:
int[] numbers = {1,2,3,4,5};
int[] result = numbers*2;
//result now equals {2,4,6,8,10};
Is this possible to do without iterating through the array? Would I need to use a different data type, such as ArrayList? The current iterating step is taking up some time, and I'm hoping something like this would help.
No, you can't multiply each item in an array without iterating through the entire array. As pointed out in the comments, even if you could use the * operator in such a way the implementation would still have to touch each item in the array.
Further, a different data type would have to do the same thing.
I think a different answer from the obvious may be beneficial to others who have the same problem and don't mind a layer of complexity (or two).
In Haskell, there is something known as "Lazy Evaluation", where you could do something like multiply an infinitely large array by two, and Haskell would "do" that. When you accessed the array, it would try to evaluate everything as needed. In Java, we have no such luxury, but we can emulate this behavior in a controllable manner.
You will need to create or extend your own List class and add some new functions. You would need functions for each mathematical operation you wanted to support. I have examples below.
LazyList ll = new LazyList();
// Add a couple million objects
ll.multiplyList(2);
The internal implementation of this would be to create a Queue that stores all the primitive operations you need to perform, so that order of operations is preserved. Now, each time an element is read, you perform all operations in the Queue before returning the result. This means that reads are very slow (depending on the number of operations performed), but we at least get the desired result.
If you find yourself iterating through the whole array each time, it may be useful to de-queue at the end instead of preserving the original values.
If you find that you are making random accesses, I would preserve the original values and returned modified results when called.
If you need to update entries, you will need to decide what that means. Are you replacing a value there, or are you replacing a value after the operations were performed? Depending on your answer, you may need to run backwards through the queue to get a "pre-operations" value to replace an older value. The reasoning is that on the next read of that same object, the operations would be applied again and then the value would be restored to what you intended to replace in the list.
There may be other nuances with this solution, and again the way you implement it would be entirely different depending on your needs and how you access this (sequentially or randomly), but it should be a good start.
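The queue-of-operations idea above can be sketched roughly as follows; LazyList and multiplyList are the hypothetical names from the description, and IntUnaryOperator stands in for a queued primitive operation:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.IntUnaryOperator;

public class LazyList {
    private final List<Integer> values = new ArrayList<>();
    // Pending operations, applied in insertion order on every read.
    private final Deque<IntUnaryOperator> pending = new ArrayDeque<>();

    public void add(int v) {
        values.add(v);
    }

    // Queuing an operation is O(1), no matter how long the list is.
    public void multiplyList(int factor) {
        pending.addLast(x -> x * factor);
    }

    // Reads are slow: O(number of queued operations) per element.
    public int get(int index) {
        int result = values.get(index);
        for (IntUnaryOperator op : pending) {
            result = op.applyAsInt(result);
        }
        return result;
    }
}
```

This keeps the original values intact (the random-access variant described above); a sequential-access variant would instead apply and discard the queue on a full pass.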
With the introduction of Java 8 this task can be done using streams (note that the stream still iterates internally; it's just more concise):
private int[] doubled(int[] numbers) {
    return IntStream.of(numbers).map(x -> x * 2).toArray();
}
No, it isn't. If your collection is really big and you want to do it faster, you can try to operate on elements in 2 or more threads, but you have to take care of synchronization (use a synchronized collection) or divide your collection into 2 (or more) collections and operate on one collection in each thread. I'm not sure whether it will be faster than just iterating through the array - it depends on the size of your collection and on what you want to do with each element. If you want to use this solution you will have to measure whether it is faster in your case - it might be slower, and it will definitely be much more complicated.
Generally, if it's not a critical part of the code and the execution time isn't too long, I would leave it as it is now.
I have a set of time stamped values I'd like to place in a sorted set.
public class TimedValue {
    public Date time;
    public double value;

    public TimedValue(Date time, double value) {
        this.time = time;
        this.value = value;
    }
}
The business logic for sorting this set says that values must be ordered in descending value order, unless it's more than 7 days older than the newest value.
So as a test, I came up with the following code...
DateFormat dateFormatter = new SimpleDateFormat("MM/dd/yyyy");
TreeSet<TimedValue> mySet = new TreeSet<TimedValue>(new DateAwareComparator());
mySet.add(new TimedValue(dateFormatter.parse("01/01/2009"), 4.0 )); // too old
mySet.add(new TimedValue(dateFormatter.parse("01/03/2009"), 3.0)); // Most relevant
mySet.add(new TimedValue(dateFormatter.parse("01/09/2009"), 2.0));
As you can see, initially the first value is more relevant than the second, but once the final value is added to the set, the first value has expired and should be the least relevant.
My initial tests say that this should work... that the TreeSet will dynamically reorder the entire list as more values are added.
But even though I see it, I'm not sure I believe it.
Will a sorted collection reorder the entire set as each element is added? Are there any gotchas to using a sorted collection in this manner (i.e. performance)? Would it be better to manually sort the list after all values have been added (I'm guessing it would be)?
Follow-up:
As many (and even I to a certain extent) suspected, the sorted collection does not support this manner of "dynamic reordering". I believe my initial test was "working" quite by accident. As I added more elements to the set, the "order" broke down quite rapidly. Thanks for all the great responses, I refactored my code to use approaches suggested by many of you.
I don't see how your comparator can even detect the change, unless it remembers the newest value it's currently seen - and that sounds like an approach which is bound to end in tears.
I suggest you do something along the following lines:
Collect your data in an unordered set (or list)
Find the newest value
Create a comparator based on that value, such that all comparisons using that comparator will be fixed (i.e. it will never return a different result based on the same input values; the comparator itself is immutable although it depends on the value originally provided in the constructor)
Create a sorted collection using that comparator (in whatever way seems best depending on what you then want to do with it)
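That recipe can be sketched as follows; the frozenComparator helper and the choice to sort expired entries after live ones are assumptions layered on the TimedValue class from the question:

```java
import java.util.Comparator;
import java.util.Date;
import java.util.List;

public class SnapshotSort {
    static final long SEVEN_DAYS_MS = 7L * 24 * 60 * 60 * 1000;

    static class TimedValue {
        final Date time;
        final double value;
        TimedValue(Date time, double value) { this.time = time; this.value = value; }
    }

    // Build a comparator frozen against the newest timestamp in the data.
    // It never changes after construction, so its results are consistent:
    // expired entries (older than 7 days) sort after live ones, and within
    // each group values sort in descending order.
    static Comparator<TimedValue> frozenComparator(List<TimedValue> data) {
        long newest = Long.MIN_VALUE;
        for (TimedValue tv : data) {
            newest = Math.max(newest, tv.time.getTime());
        }
        final long cutoff = newest - SEVEN_DAYS_MS;
        return (a, b) -> {
            boolean aExpired = a.time.getTime() < cutoff;
            boolean bExpired = b.time.getTime() < cutoff;
            if (aExpired != bExpired) {
                return aExpired ? 1 : -1; // expired entries sort last
            }
            return Double.compare(b.value, a.value); // descending value
        };
    }
}
```

Collect into a plain List first, then call List.sort (or build a TreeSet) with this comparator; since the cutoff is fixed, no element can change sides mid-sort.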
I would advise against this for a few reasons:
Since it's basically a red-black tree behind the scenes (which doesn't necessarily have to be rebuilt from scratch on every insertion), you might easily end up with values in the wrong part of the tree (invalidating most of the TreeSet API).
The behavior is not defined in the spec, and thus may change later even if it's working now.
In the future, when anything goes strangely wrong in anything remotely touching this code, you'll spend time suspecting that this is the cause.
I would recommend either recreating/resorting the TreeSet before searching it or (my preference) iterating through the set before the search and removing any of the objects that are too old. You could even, if you wanted to trade some memory for speed, keep a second list ordered by date and backed by the same objects so that you all you would have to do to filter your TreeSet is remove objects from the TreeSet based on the time-sorted list.
I don't believe the JDK libraries or even 3rd party libraries are written to handle a comparator whose results are not consistent. I wouldn't depend on this working. I would worry more if your Comparator can return not-equal for two values when called one time and can return equal for the same two values if called later.
Read carefully the contract of Comparator.compare(). Does your Comparator satisfy those constraints?
To elaborate, if your Comparator returns that two values are not equals when you call it once, but then later returns that the two values are equal because a later value was added to the set and has changed the output of the Comparator, the definition of "Set" (no duplicates) becomes undone.
Jon Skeet's advice in his answer is excellent advice and will avoid the need to worry about these sorts of problems. Truly, if your Comparator does not return values consistent with equals() then you can have big problems. Whether or not a sorted set will re-sort each time you add something, I wouldn't depend on, but the worst thing that would occur from order changing is your set would not remain sorted.
No, this won't work.
If you are using comparable keys in a collection, the results of the comparison between two keys must remain the same over time.
When storing keys in a binary tree, each fork in the path is chosen as the result of the comparison operation. If a later comparison returns a different result, a different fork will be taken, and the previously stored key will not be found.
I am 99% certain this will not work. If a value in the Set suddenly changes its comparison behaviour, it is possible (quite likely, actually) that it will not be found anymore; i.e. set.contains(value) will return false, because the search algorithm will at one point do a comparison and continue in the wrong subtree because that comparison now returns a different result than it did when the value was inserted.
I think the non-changing nature of a Comparator is supposed to be on a per-sort basis, so as long as you are consistent for the duration of a given sorting operation, you are ok (so long as none of the items cross the 7 day boundary mid-sort).
However, you might want to make it more obvious that you are asking specifically about a TreeSet, which I imagine re-uses information from previous sorts to save time when you add a new item so this is a bit of a special case. The TreeSet javadocs specifically defer to the Comparator semantics, so you are probably not officially supported, but you'd have to read the code to get a good idea of whether or not you are safe.
I think you'd be better off doing a complete sort when you need the data sorted, using a single time as "now" so that you don't risk jumping that boundary if your sort takes long enough to make it likely.
It's possible that a record will change from <7 days to >7 days mid-sort, so what you're doing violates the rules for a comparator. Of course this doesn't mean it won't work: lots of things that are documented as "unpredictable" in fact work if you know exactly what is happening internally.
I think the textbook answer is: This is not reliable with the built-in sorts. You would have to write your own sort function.
At the very least, I would say that you can't rely on a TreeSet or any "sorted structure" magically resorting itself when dates roll over the boundary. At best this might work if you re-sort just before displaying, and don't rely on anything remaining correct between updates.
At worst, inconsistent comparisons might break the sorts badly. You have no assurance that this won't put you into an infinite loop or some other deadly black hole.
So I'd say: Read the source code from Sun for whatever classes or functions you plan to use, and see if you can figure out what will happen. Testing is good, but there are potentially tricky cases that are difficult to test. The most obvious is: What if while it's in the process of sorting, a record rolls over the date boundary? That is, it might look at a record once and say it's <7 but the next time it sees it it's >7. That could be bad, bad news.
One obvious trick that occurs to me: Convert the date to an age at the time you add the record to the structure, rather than dynamically. That way it can't change within the sort. If the structure is going to live for more than a few minutes, recalculate the ages at some appropriate time and then re-sort. I doubt someone will say your program is incorrect because you said a record was less than 7 days old when really it's 7 days, 0 hours, 0 minutes, and 2 seconds old. Even if someone noticed, how accurate is their watch?
As already noted, the Comparator cannot do this for you, because transitivity is violated. Basically, in order to be able to sort the items, you must be able to compare any two of them (independent of the rest), which obviously you cannot do. So your scenario basically either would not work or would produce inconsistent results.
Maybe something simpler would be good enough for you:
apply simple Comparator that uses the Value as you need
and simply remove from your list/collection all elements that are 7 days older than the newest. Basically, whenever a new item is added, you check if it is the newest, and if it is, remove those that are 7 days older than it.
This would not work if you also remove items from the list, in which case you would need to keep all those you removed in a separate list (which, by the way, you would sort by date) and add them back to the original list in case the MAX(date) is smaller after removal.