Java Collections Framework's sorting algorithms - java

I'm trying to understand how Java Collections Framework sorts its collections by default and I got confused, because I read all the collections are being sorted using merge sort. But as I took a look at Array class I saw this: «Implementors should feel free to substitute other algorithms, so long as the specification itself is adhered to. (For example, the algorithm used bysort(Object[]) does not have to be a mergesort, but it does have to be stable.)» Which means it also uses other sorting algorithms. So how exactly are the collections being sorted?

The code to sort collections is delivered with the JRE/JDK.
Anyone who implements the JRE/JDK can choose to implement it in any way he wants, as long as it's conforming (i.e. it actually sorts the collection correctly and the sorting is stable).
Some implementations might choose merge-sort, others might choose something else. No specific implementation is required.

Related

Array vs ArrayList: Dynamic Use

Both ArrayLists and Vectors make use of typical arrays internally. However, that leaves me thinking... why would I use ArrayLists when I can technically do the same thing using Arrays? Is convenience the only reason? Do performance-critical applications ever make use of an ArrayList?
Any tips would be appreciated.
I believe there are multiple reasons to prefer Lists over "implementing lists over arrays" or over "using arrays", but here are the two that I think are most important:
Lists have better support to generics than Arrays (you can, and should, read about it in "Effective Java" by Bloch - see Item 25)
If you ask about using ArrayList vs. implementing it yourself - I find it hard to believe that you'll do a better job than the guys that developed it in openjdk (Josh Bloch and Neal Gafter).
Yes, performance critical applications use ArrayList all the time. It's very unlikely that array access is the dominant factor in the vast majority of programs written in Java.
The ArrayList Collection interface is much richer than the functionality provided by built-in primitive arrays. This extra functionality will save you development time as well as debugging time by not having to write those algorithms yourself.
Additionally, many programmers are already familiar with the ArrayList Collection interface and thus by utilizing the existing standard libraries it will make your code easier to read and maintain for the long term.
One reason is that ArrayLists sizes are dynamic, arrays aren't.
The internal implementation of ArrayList is array only. but ArrayList is an wrapper class which is having more capabilities added to it. These capabilities are not available when you deal with Array directly.
For example,
Delete an element from array, you will have to implement logic if your are using an Array. But if you are using ArrayList, it will do the deletion for you.
Adding an element to array:
If you are using an array, you will have to implement the logic. But using an ArrayList, it is pretty easy.
You will find lot of methods in this ArrayList class that are handy for day to day use.
Hope this will help you.

Why did they decide to make interfaces have "Optional Operations"

ImmutableSet implements the Set interface. The functions that don't make sense to an ImmutableSet are now called "Optional Operations" for Set. I assume for situations like this. So ImmutableSet now throws an UnsupportedOperationException for many Optional Operations.
This seems backwards to me. I was taught that an Interface was a contract so that you could use impose functionality across different implementations. The approach of Optional Operations seem to fundamentally change(contradict?) what Interfaces are meant to do. Implementing this today I would have the Set Interface broken into two interfaces: one for one for immutable operations and a second extending those operations for mutators. (Very quick, off the cuff solution)
I understand that technology changes. I'm not saying It should be done one way or another. My question is, does this change reflect a change in some underlying philosophy for Java? Is it just more of a bandaid to make things backwards compatible? Did I have an incomplete understanding of Interfaces?
The Java Collections API Design FAQ answers this question in detail:
Q: Why don't you support immutability directly in the core collection interfaces so that you can do away with optional operations (and UnsupportedOperationException)?
A: This is the most controversial design decision in the whole API. Clearly, static (compile time) type checking is highly desirable, and is the norm in Java. We would have supported it if we believed it were feasible. Unfortunately, attempts to achieve this goal cause an explosion in the size of the interface hierarchy, and do not succeed in eliminating the need for runtime exceptions (though they reduce it substantially).
Doug Lea, who wrote a popular Java collections package that did reflect mutability distinctions in its interface hierarchy, no longer believes it is a viable approach, based on user experience with his collections package. In his words (from personal correspondence) "Much as it pains me to say it, strong static typing does not work for collection interfaces in Java."
To illustrate the problem in gory detail, suppose you want to add the notion of modifiability to the Hierarchy. You need four new interfaces: ModifiableCollection, ModifiableSet, ModifiableList, and ModifiableMap. What was previously a simple hierarchy is now a messy heterarchy. Also, you need a new Iterator interface for use with unmodifiable Collections, that does not contain the remove operation. Now can you do away with UnsupportedOperationException? Unfortunately not.
Consider arrays. They implement most of the List operations, but not remove and add. They are "fixed-size" Lists. If you want to capture this notion in the hierarchy, you have to add two new interfaces: VariableSizeList and VariableSizeMap. You don't have to add VariableSizeCollection and VariableSizeSet, because they'd be identical to ModifiableCollection and ModifiableSet, but you might choose to add them anyway for consistency's sake. Also, you need a new variety of ListIterator that doesn't support the add and remove operations, to go along with unmodifiable List. Now we're up to ten or twelve interfaces, plus two new Iterator interfaces, instead of our original four. Are we done? No.
Consider logs (such as error logs, audit logs and journals for recoverable data objects). They are natural append-only sequences, that support all of the List operations except for remove and set (replace). They require a new core interface, and a new iterator.
And what about immutable Collections, as opposed to unmodifiable ones? (i.e., Collections that cannot be changed by the client AND will never change for any other reason). Many argue that this is the most important distinction of all, because it allows multiple threads to access a collection concurrently without the need for synchronization. Adding this support to the type hierarchy requires four more interfaces.
Now we're up to twenty or so interfaces and five iterators, and it's almost certain that there are still collections arising in practice that don't fit cleanly into any of the interfaces. For example, the collection-views returned by Map are natural delete-only collections. Also, there are collections that will reject certain elements on the basis of their value, so we still haven't done away with runtime exceptions.
When all was said and done, we felt that it was a sound engineering compromise to sidestep the whole issue by providing a very small set of core interfaces that can throw a runtime exception.
In short, having interfaces like Set with optional operations was done to prevent an exponential explosion in the number of different interfaces needed. It is not as simple as just "immutable" and "mutable". Guava's ImmutableSet then had to implement Set to be interoperable with all other code which uses Sets. It's not ideal but there is really no better way to do it.

From CObArray to ArrayList

I have some pieces of C++ code that store objects in CObArray. I want to re-code the same pieces in Java using ArrayList to store the same objects. Will there be any difference in the overall efficiency?
So is ArrayList the exact correspondent class for CObArray?
I didn't know what a CObArray was: CObArray is part of MS's C++ implementations.
They description of a CObArray sounds like it behaves in a similar fashion to an ArrayList. That is, in terms of implementation and performance. You should bare in mind that the interfaces will differ for sure. For example, Java's ArrayList does not have anything like GetUpperBound(). If you depend on something like this, you sure ensure you can live without corresponding methods.
In addition, the preferred way to work with ArrayList's in Java is by the use of Generics (specify the type that will exist in the collection at compile-time as opposed to casts performed at run-time). This sounds like it may differ from how it works with CObArray's according to AJG85. You must also ensure that before you begin your conversion to Java that you are aware of differences like this and how they work.

Google Collections equivalent to Apache Commons Collections ArrayUtils.toObject and ArrayUtils.toPrimitive

Since everyone praises Google Collections (e.g. in here)
How come I can't find the equivalent of ArrayUtils.toObject() and ArrayUtils.toPrimitive()? is it that unusable? did I miss it?
To be honest I'm not sure if either of those methods should even qualify as a collection-related operation and as such I'd wonder why they're even there in the first place.
To clarify a bit, collections are generically a group of objects with some semantic data binding them together while arrays are just a predetermined set of something. This semantic data may be information about accepting or rejecting nulls, duplicates, objects of wrong types or with unacceptable field values etc.
Most -if not all- collections do use arrays internally, however array itself isn't a collection. To qualify as a collection it needs some relevant magic such as removing and adding objects to arbitrary positions and arrays can't do that. I very much doubt you'll ever see any kind of array support in Google Collections since arrays are not collections.
However since Google Collections is going to be part of Google's Guava libraries which is a general purpose utility class library/framework of sorts, you may find what you want from com.google.common.primitives package, for example Booleans#asList(boolean... backingArray) and Booleans#toArray(Collection<Boolean> collection).
If you absolutely feel they should include equal methods to Apache Commons Collection's .toObject() and .toPrimitive() in there, you can always submit a feature request as new issue.

Java-like Collections in PHP

I'm learning PHP5 (last time I checked PHP was in PHP4 days) and I'm glad to see that PHP5 OO is more Java-alike than the PHP4 one but there's still an issue that makes me feel quite unconfortable because of my Java background : ARRAYS.
I'm reading "Proffesional PHP6" (Wrox) and It shows its own Collection implementation.
I've found other clases like the one in http://aheimlich.dreamhosters.com/generic-collections/Collection.phps based on SPL.
I've also found that there's some kind of Collection in SPL (ArrayObject)
However, I'm surprised because I don't really see people using Collections in PHP, they seem to prefer arrays.
So, isn't it a good idea using Collections in PHP just like people use ArrayList instead of basic arrays in Java? After all, php arrays aren't really like java arrays.
Collections in Java make a lot of sense since it's a strongly typed language. It makes sense to have a collection of say "Cars" and another of "Motorbikes".
However, in PHP, due to the dynamically typed nature, it is quite common to sacrifice the formality of Collections. Arrays are sufficient to be used as generic containers of various object types (Cars, Motorbikes, etc.). Also, the added benefit comes from the fact that arrays can be mutated very easily (which sometimes can be a big disadvantage when proper error checking is absent).
I come from a Java background, and I've found that using a Collections design pattern in PHP does not buy much in the way of advantages (no multi-threading, no optimization of memory allocation, no iterators, etc.).
If you're looking for any of those advantages, its probably better to construct a wrapper class around the array, implementing each feature (iterators, etc.) a la carte.
I am very pro collection objects in PHP, they can be used to add type safety, impliment easy to use search, sort and manipulation functionality, and represent the correct OO approach rather then using arrays and the multitude of useful but procedual functions that operate on them in differing patterns all over the source.
We have various collections that we use for various purposes all neatly inherited promoting type safety, consistent coding standards and a high level of code reuse.
But ultimatley, they are all array's internally!
I suppose really it comes down to choice, but in my object oriented world I like to keep easily repeatable segments of code such as sort and search algorithms in base classes, and I find the object notation more self documenting.
PHP arrays are associative... They're far more powerful than Java's arrays, and include much of the functionality of List<> and Map<>.
What do you mean by "good idea"? They're different tools, using one language in the way you used another usually results in frustration.
I, too, was somewhat dismayed to find no Collection type classes in PHP. Arrays have a couple of real disadvantages in my experience.
First, the number of functions available to manipulate them is somewhat limited. For example, I need to be able to arbitrarily insert and remove items to/from a Collection at a given index position. Doing that with the built-in language functions for arrays in PHP is painful at best.
Second, as a sort of offshoot of the first point, writing clean, readable code that manipulates arrays at any level of complexity beyond simple push/pop and iterator stuff is difficult at best. I often find that I have to use one array to index and keep track of another array in data-intensive apps I create.
I prefer working in a framework (my personal choice is NOLOH). There, I have a real Collection class called ArrayList that has functions such as Add, Insert, RemoveAt, RemoveRange and Toggle. I imagine other PHP frameworks address this issue as well.
A nice implementation of collection in php is provided by Varien Lib, this library is part of Magento code with OSL license. ( more info about Magento license and code reuse here.
Cannot find any source code for the library so the best way is to download magento and then look in /lib/Varien/
Yii has implementation of full java like collections stack
http://www.yiiframework.com/doc/api/1.1/CList
I sometimes use this really simple implementation to give me a rough and ready collection.
Normally the main requirement of a collection is enforcing a group of one type of object, you just have to setup a basic class with a constructor to implement it.
class SomeObjectCollection {
/**
* #var SomeObject[]
*/
private $collection = array();
/**
* #param SomeObject $object1
* #param SomeObject $_ [optional]
*/
function __construct(SomeObject $object1 = null, SomeObject $_ = null)
{
foreach (func_get_args() as $index => $arg) {
if(! $arg instanceof SomeObject) throw new \RuntimeException('All arguments must be of type SomeObject');
$this->collection[] = $arg;
}
}
/**
* #return SomeObject[]
*/
public function getAll()
{
return $this->collection;
}
}

Categories