Why are the 'Arrays' class' methods all static in Java?

I was going through the Java documentation, and I learned that the methods in the Arrays class are all static. I don't really understand why they were made static.
For example, the following code violates the OO approach, because if I have a type, 'X', then all the methods which act on it should be inside it:
int[] a = {34, 23, 12};
Arrays.sort(a);
It would be better if they had implemented it the following way:
int[] a = {34, 23, 12};
a.sort();
Can anyone explain this a bit?

In Java there is no way to extend the functionality of an array. Arrays all inherit from Object, but this gives very little. IMHO this is a deficiency of Java.
Instead, to add functionality for arrays, static utility methods are added to classes like Array and Arrays. These methods are static as they are not instance methods.

Good observation. Observe also that not every array can be sorted. Only arrays of primitives and Objects which implement the Comparable interface can be sorted. So a general sort() method that applies to all arrays is not possible. And so we have several overloaded static methods for each of the supported types that are actually sortable.
Update:
@Holger correctly points out in the comments below that one of the overloaded static methods is indeed Arrays.sort(Object[]) but the docs explicitly state:
All elements in the array must implement the Comparable interface.
So it doesn't work for Objects that don't implement Comparable or one of its subinterfaces.
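To illustrate the point about Comparable, here is a minimal sketch. The class SortableDemo and its nested Point type are hypothetical names made up for this example. It shows Arrays.sort(Object[]) rejecting elements that are not Comparable at runtime, and the same data being sorted once a Comparator is passed explicitly.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortableDemo {
    // Hypothetical element type that does NOT implement Comparable.
    static final class Point {
        final int x;
        Point(int x) { this.x = x; }
    }

    // Returns true if Arrays.sort(Object[]) rejects the array at runtime.
    static boolean naturalSortFails(Object[] items) {
        try {
            Arrays.sort(items);
            return false;
        } catch (ClassCastException e) {
            // Thrown when an element cannot be cast to Comparable.
            return true;
        }
    }

    // Sorts a copy with an explicit Comparator and returns the x values in order.
    static int[] sortedXs(Point[] points) {
        Point[] copy = points.clone();
        Arrays.sort(copy, Comparator.comparingInt(p -> p.x));
        return Arrays.stream(copy).mapToInt(p -> p.x).toArray();
    }

    public static void main(String[] args) {
        Point[] points = { new Point(3), new Point(1), new Point(2) };
        System.out.println(naturalSortFails(points.clone()));
        System.out.println(Arrays.toString(sortedXs(points)));
    }
}
```

Note that the compiler accepts the call to Arrays.sort(Object[]) either way; the failure only shows up when the code runs.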

First of all, Arrays is a utility class, which does exactly that: exposes static methods. It is separate from any arr[] instances and has no OO relation to them. There are several classes like that, like Collections or various StringUtils.
Arrays are collections, they are used to store data. Arrays.sort() is an algorithm which sorts the collection. There may be many other algorithms which sort data in different ways, and all of them would be used in the same way: MyAlgorithm.doSthWithArray(array). Even if there was a sort() method on an array (it would then have to be a SortableArray, because not all Objects can be sorted automatically), all other algorithms would have to be called the old way anyway. Unless there was a visitor pattern introduced... But that makes things too complicated, hence, there is no point.
For a Java Collection there's Collections.sort(), even in C++ there is std::sort which works similarly, as does qsort in C. I don't see a problem here, I see consistency.

Static methods are often used for utility purposes.
So Arrays is a utility class for general-purpose array operations.
Similarly, Collections is a utility class whose static methods operate on collections.

Arrays are kind of like second-class generics. When you make an array it makes a custom class for the array type, but it's not full featured because they decided how arrays would work before they really fleshed out the language.
That, combined with maintaining backwards compatibility, means that Arrays are stuck with an archaic interface.
It's just an old part of the API.

An array is not an object which stores state, beyond the actual values in the array. In other words, it's just a "dumb container". It doesn't "know" any behaviour.
A utility class is a class which has just public static methods which are stateless functions. Sorting is stateless because there's nothing remembered between calls to that method. It runs "standalone", applying its formula to whatever object is passed in, as long as that object is "sortable". A second instance of an Arrays class would have behaviour no different, so just have the one static instance.
As Dariusz pointed out, there are different ways of sorting. So you could have MyArrays.betterSort(array) as well as Arrays.sort(array).
If you wanted to have the array "know" how best to sort its own members, you'd have to have your own array class which extends an array.
But what if you had a situation where you wanted different sorting at different times on the same array? A contrived example, maybe, but there are plenty of similar real-world examples.
And now you're getting complicated. Maybe an array of type T sorts differently than one of type S ....
It's made simple with a static utility and the Comparator<T> interface.
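As a rough sketch of that idea, the same array can be sorted in different orders simply by handing a different Comparator to the static method. ComparatorDemo and sortedCopy are hypothetical names for this example only.

```java
import java.util.Arrays;
import java.util.Comparator;

class ComparatorDemo {
    // Sorts a copy so the caller's array keeps its original order.
    static String[] sortedCopy(String[] words, Comparator<String> order) {
        String[] copy = words.clone();
        Arrays.sort(copy, order);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = { "pear", "fig", "banana" };
        // Alphabetical order.
        System.out.println(Arrays.toString(sortedCopy(words, Comparator.naturalOrder())));
        // Shortest word first.
        System.out.println(Arrays.toString(sortedCopy(words, Comparator.comparingInt(String::length))));
    }
}
```

The array itself never has to "know" either ordering; the ordering is an argument.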

For me this is the perfect solution. I have an array, and I have a class, Arrays, which operates over the data in the array. For example, you may want to hold some random numbers that you never intend to sort; if the array carried sort or any other utility method itself, you would get behavior you don't want. That's why in code design it is good to separate data from behavior.
You can read about the single responsibility principle.

The Arrays class contains methods that are independent of state, so they should be static. It's essentially a utility class.
While OOP principles don't apply, the current way is clearer, more concise, and more readable, since you don't have to worry about polymorphism and inheritance. This all reduces scope, which ultimately reduces the chances that you screw something up.
Now, you may ask yourself "Why can't I extend the functionality of an array in Java?". A nice answer is that this introduces potential security holes, which could break system code.

Related

Why does Java's List have "List.toArray()", but arrays don't have "Array.toList()"?

Arrays don't have a "toList" function, so we need "Arrays.asList" helper functions to do the conversion.
This is quite odd: List has its own function to convert to an array, but arrays need some helper functions to convert to a List. Why not let arrays have a "toList" function, and what's the reason behind this Java design?
Thanks a lot.

Because List instances are actual objects, while arrays (for MOST intents and purposes) behave like primitives and don't expose methods. Technically arrays are objects, which is how they can have a length field and a method such as clone(), but their classes are created at runtime by the JVM.

Another point to consider is that toArray() is declared on an INTERFACE (specifically, the java.util.List interface).
Consequently, you must call toArray() on some object instance that implements List.
This might help explain at least one part of your question: why it's not static ...

As others have pointed out: An array in Java is a rather "low-level" construct. Although it is an Object, it is not really part of the object-oriented world. This is true for arrays of references, like String[], but even more so for primitive arrays like int[] or float[]. These arrays are rather "raw, contiguous blocks of memory", that have a direct representation inside the Java Virtual Machine.
From the perspective of language design, arrays in Java might even be considered as a tribute to C and C++. There are languages that are purely object-oriented and where plain arrays or primitive types do not exist, but Java is not one of them.
More specifically focusing on your question:
An important point here is that Arrays#asList does not do any conversion. It only creates a List that is a view on the array. In contrast to that, the Collection#toArray method indeed creates a new array. I wrote a few words about the implications of this difference in this answer about the lists that are created with Arrays#asList.
Additionally, the Arrays#asList method only works for arrays where the elements are a reference type (like String[]). It does not work for primitive arrays like long[] *. But fortunately, there are several methods to cope with that, and I mentioned a few of them in this answer about how to convert primitive arrays into List objects.
* In fact, Arrays#asList does work for primitive arrays like long[], but it does not create a List<Long>, but a List<long[]>. That's often a source of confusion...
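The view-versus-copy difference and the primitive-array surprise can be sketched in a few lines. AsListDemo and its method names are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class AsListDemo {
    // Writes through the List view and reports what the backing array now holds.
    static String writeThroughView(String[] array, String value) {
        List<String> view = Arrays.asList(array); // a view, no copy is made
        view.set(0, value);
        return array[0];
    }

    // toArray() copies: changing the copy leaves the list untouched.
    static String writeToCopy(List<String> list, String value) {
        String[] copy = list.toArray(new String[0]);
        copy[0] = value;
        return list.get(0);
    }

    // For a primitive array, the whole array becomes the single list element.
    static int sizeOfPrimitiveView(long[] values) {
        List<long[]> wrapped = Arrays.asList(values);
        return wrapped.size();
    }

    public static void main(String[] args) {
        System.out.println(writeThroughView(new String[] { "a", "b" }, "z"));
        System.out.println(writeToCopy(Arrays.asList("a", "b"), "z"));
        System.out.println(sizeOfPrimitiveView(new long[] { 1L, 2L, 3L }));
    }
}
```

Because the list returned by Arrays.asList is backed by the array, it also cannot change size: add and remove throw UnsupportedOperationException.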

From the modern perspective, there's no good reason why arrays lack the method you mention as well as a ton of other methods, which would make them so much easier to use. The only objective reason is that "it made the most sense to the designers of Java twenty two years ago". It was a different landscape in 1996, an era where C and C++ were the dominant general purpose programming languages.
You can compare this to e.g. Kotlin, a language that compiles to the exact same bytecode as Java. Its arrays are full-featured objects.

Arrays behave much like primitives: apart from the length field and clone(), they expose no methods of their own, so you have to use methods from another class. A List is an ordinary reference type, so it can declare methods.

Why we have Arrays and Array in Java

I have run into these two pieces of documentation:
Java documentation for the class Array
Java documentation for the class Arrays
and I'm wondering what the difference is between these two classes. They both provide a different set of static methods, but why are they separate? What is the deeper difference? And how do they relate to each other and to a normal array instance like int[]?
I notice that they are from totally different packages, but still hope to find some clarification. Thanks.

The differences are made fairly clear in the docs.
From Arrays.java:
This class contains various methods for manipulating arrays (such as sorting and searching). This class also contains a static factory that allows arrays to be viewed as lists.
From Array.java:
The Array class provides static methods to dynamically create and access Java arrays.
Essentially, Array is an implementation of core array operations: getting, setting and instantiation.
Arrays is a helper class for wrapping common Array operations (conversion between Arrays and Lists, sorting, searching for a value) without polluting the core Array "api".
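A small sketch of how the two classes are typically used side by side: java.lang.reflect.Array creates and fills an array whose component type is only known at runtime, and java.util.Arrays then sorts and prints it. ArrayVsArraysDemo and newFilledArray are hypothetical names for this example.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

class ArrayVsArraysDemo {
    // java.lang.reflect.Array: reflective creation and element access.
    static Object newFilledArray(Class<?> componentType, int length, Object value) {
        Object array = Array.newInstance(componentType, length);
        for (int i = 0; i < length; i++) {
            // For primitive component types the wrapper value is unwrapped automatically.
            Array.set(array, i, value);
        }
        return array;
    }

    public static void main(String[] args) {
        int[] numbers = (int[]) newFilledArray(int.class, 3, 7);
        numbers[0] = 9;
        // java.util.Arrays: everyday helpers such as sorting and printing.
        Arrays.sort(numbers);
        System.out.println(Arrays.toString(numbers));
    }
}
```

In ordinary code you rarely need Array at all, since `new int[3]` and `numbers[i]` do the same job when the type is known at compile time.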

Collection Interface vs arrays

We are learning about the Collection interface and I was wondering if you all have any good advice for its general use. What can you do with a Collection that you cannot do with an array? What can you do with an array that you cannot do with a Collection (besides allowing duplicates)?

The easy way to think of it is: Collections beat object arrays in basically every single way. Consider:
A collection can be mutable or immutable. A nonempty array must always be mutable.
A collection can allow or disallow null elements. An array must always permit null elements.
A collection can be thread-safe; even concurrent. An array is never safe to publish to multiple threads.
A list or set's equals, hashCode and toString methods do what users expect; on an array they are a common source of bugs.
A collection is type-safe; an array is not. Because arrays "fake" covariance, ArrayStoreException can result at runtime.
A collection can hold a non-reifiable type (e.g. List<Class<? extends E>> or List<Optional<T>>). An array will generate a warning for this.
A collection can have views (unmodifiable, subList...). No such luck for an array.
A collection has a full-fledged API; an array has only set-at-index, get-at-index, length and clone.
Type-use annotations like @Nullable are very confusing with arrays. I promise you can't guess what @A String @B [] @C [] means.
Because of all the reasons above, third-party utility libraries should not bother adding much additional support for arrays, focusing only on collections, so you also have a network effect.
Object arrays will never be first-class citizens in Java APIs.
A couple of the reasons above are covered -- but in much greater detail -- in Effective Java, Third Edition, Item 28, from page 126.
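Two of the points in the list above, the runtime ArrayStoreException caused by array covariance and the surprising equals behaviour, can be reproduced with a short sketch. ArrayPitfallsDemo and its method names are hypothetical.

```java
import java.util.Arrays;

class ArrayPitfallsDemo {
    // A String[] may be referenced as Object[]; the bad store compiles but fails at runtime.
    static boolean covariantStoreFails() {
        Object[] objects = new String[1];
        try {
            objects[0] = Integer.valueOf(42);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // equals() on an array is inherited from Object and compares identity only.
    static boolean arrayEquals(int[] a, int[] b) {
        return a.equals(b);
    }

    // Arrays.equals compares element by element.
    static boolean contentEquals(int[] a, int[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        System.out.println(covariantStoreFails());
        System.out.println(arrayEquals(new int[] { 1, 2 }, new int[] { 1, 2 }));
        System.out.println(contentEquals(new int[] { 1, 2 }, new int[] { 1, 2 }));
        // Lists compare by content out of the box.
        System.out.println(Arrays.asList(1, 2).equals(Arrays.asList(1, 2)));
    }
}
```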
So, why would you ever use object arrays?
You're very tightly optimizing something
You have to interact with an API that uses them and you can't fix it
so convert to/from a List as close to that API as you can
Because varargs (but varargs is overused)
so ... same as previous
Obviously some collection implementations must be using them
I can't think of any other reasons, they suck bad

It's basically a question of the desired level of abstraction.
Most collections can be implemented in terms of arrays, but they provide many more methods on top of it for your convenience. Most collection implementations I know of for instance, can grow and shrink according to demand, or perform other "high-level" operations which basic arrays can't.
Suppose for instance that you're loading strings from a file. You don't know how many new-line characters the file contains, thus you don't know what size to use when allocating the array. Therefore an ArrayList is a better choice.
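A minimal sketch of that scenario, assuming the input arrives through a Reader (a StringReader stands in for the file here, and ReadLinesDemo and readLines are hypothetical names):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {
    // Collects every line; the list grows as needed, so no size has to be known up front.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            // Wrapped so callers in this sketch need no checked-exception handling.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> lines = readLines(new StringReader("first\nsecond\nthird"));
        System.out.println(lines.size() + " lines: " + lines);
    }
}
```

With a plain array you would have to either read the input twice or reallocate and copy by hand whenever the array filled up.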

The details are in the sub interfaces of Collection, like Set and List, and in the related Map interface. Each of those types has its own semantics. A Set typically cannot contain duplicates and has no notion of order (although some implementations do), following the mathematical concept of a set. A List is closest to an array. A Map has specific behavior for put and get: you put an object in under a key, and you retrieve it with the same key.
There are even more details in the implementations of each collection type. For example, any of the hash based collections (e.g. HashSet, HashMap) are based on the hashCode() method that exists on every Java object.
You could simulate the semantics of any collection type on top of an array, but you would have to write a lot of code to do it. For example, to back a Map with an array, you would need to write a method that puts any object entered into your Map into a specific bucket in the array. You would need to handle duplicates. For an array simulating a Set, you would need to write code to not allow duplicates.

The Collection interface is just a base interface for specialised collections -- I am not aware yet of a class that simply just implements Collection; instead classes implement specialized interfaces which extend Collection. These specialized interfaces and abstract classes provide functionality for working with sets (unique objects), growing arrays (e.g. ArrayList), key-value maps etc -- all of which you cannot do out of the box with an array.
However, iterating through an array and setting/reading items from an array remains one of the fastest methods of dealing with data in Java.

One advantage is the Iterator interface. That is, every Collection provides an Iterator. An Iterator is an object that knows how to iterate over the given collection and presents the programmer with a uniform interface regardless of the underlying implementation. That is, a linked list is traversed differently from a binary tree, but the iterator hides these differences from the programmer, making it easier for the programmer to use one or the other collection.
This also leads to the ability to use various implementations of Collection interchangeably if the client code targets the Collection interface itself.
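As a sketch of that interchangeability, the hypothetical method sum below is written once against Collection and Iterator, and works unchanged for an array-backed list, a linked list and a tree-based set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.TreeSet;

class IterationDemo {
    // Depends only on Collection and Iterator, not on how the elements are stored.
    static int sum(Collection<Integer> values) {
        int total = 0;
        Iterator<Integer> it = values.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new ArrayList<>(Arrays.asList(1, 2, 3))));
        System.out.println(sum(new LinkedList<>(Arrays.asList(1, 2, 3))));
        System.out.println(sum(new TreeSet<>(Arrays.asList(3, 1, 2))));
    }
}
```

The enhanced for loop (`for (int v : values)`) uses the same iterator under the hood.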

Java Variable type and instantiation

This has been bugging me for a while and I have yet to find an acceptable answer. Assuming a class which either is a subclass or implements an interface, why would I use the parent class or interface as the type, i.e.
List list = new ArrayList();
Vehicle car = new Car();
In terms of the ArrayList, this now only gives me access to the List methods. If I have a method that takes a List as a parameter then I can pass either a List or an ArrayList to it, as the ArrayList IS A List. Obviously within the method I can only use the List methods, but I can't see a reason to declare its type as List. As far as I can see it just restricts the methods I'm allowed to use elsewhere in the code.
A scenario where List list = new ArrayList() is better than ArrayList list = new ArrayList() would be much appreciated.

You write a program that passes lists around several classes and methods. You now want to use it in a multi threading environment. If you were sensible and declared everything as List, you can now make a single change to one line of code:
List list = Collections.synchronizedList(new ArrayList());
If you had declared the list as an ArrayList, you would instead have to re-write your entire program. The moral of the story - always program to the least restrictive interface that your code requires.
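Here is a rough sketch of that situation with generics added. ListTypeDemo, newList and total are hypothetical names. Only the factory method knows which implementation is in use, so switching to the synchronized wrapper touches one line.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ListTypeDemo {
    // The single place that decides on the concrete implementation.
    static List<Integer> newList(boolean threadSafe) {
        List<Integer> list = new ArrayList<>();
        return threadSafe ? Collections.synchronizedList(list) : list;
    }

    // Written against List, so it accepts either variant unchanged.
    static int total(List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        for (boolean threadSafe : new boolean[] { false, true }) {
            List<Integer> list = newList(threadSafe);
            list.add(1);
            list.add(2);
            System.out.println(total(list));
        }
    }
}
```

Had newList been declared to return ArrayList<Integer>, the synchronized wrapper could not be returned at all, because it is a List but not an ArrayList. One caveat: when several threads really are involved, iterating over a synchronized list still has to be done inside a synchronized block on that list.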

Using the interface or parent type is generally recommended if you only need the functionality of the parent type. The idea is to explicitly document that you don't really care about the implementation, thus making it easier to swap out the concrete class for a different one later.
A good example is the Java collection classes:
If you always use List, Set etc. instead of e.g. ArrayList, you can later switch from ArrayList to LinkedList if you find that it gives e.g. better performance. To do that, just change the constructors (you don't even have to change them all, you can mix). The rest of the code still sees an instance of List and continues working.
If you actually used ArrayList explicitly, you'd have to change it everywhere it's used. If you don't actually need an ArrayList specifically, there's nothing to be gained from using it over the interface.
That's why it's generally recommended (e.g. in "Effective Java" (J.Bloch), Item 52: "Refer to Objects by their interfaces".) to only use interfaces if possible.
Also see this related question: Why classes tend to be defined as interface nowadays?

The key is exactly that the interface or base class restricts what you can do with the variable. For example, if you refactor your code later to use another implementation of that interface or base class, you won't have anything to fear -- you didn't rely on the actual type's identity.
Another thing is that it often makes reading the code easier, e.g. if your method's return type is List you might find it more readable to return a variable of type List.

An interface specifies a contract (what does this thing do), an implementation class specifies the implementation details (how does it do it).
According to good OOP practice, your application code should not be tied to implementation details of other classes. Using an interface keeps your application loosely coupled (read: Coupling)
Also, using an interface lets client code pass in different implementations and apply the decorator pattern using methods like Collections.synchronizedList(), Collections.unmodifiableList() etc.
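A small sketch of the decorator idea, using Collections.unmodifiableList. The class ReadOnlyViewDemo and its methods are hypothetical. Because callers only see the List type, the owner can hand out a read-only wrapper without the callers changing anything.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReadOnlyViewDemo {
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        names.add(name);
    }

    // Callers get a read-only view; later additions by the owner remain visible through it.
    List<String> names() {
        return Collections.unmodifiableList(names);
    }

    // Returns true if the given list refuses modification.
    static boolean rejectsWrites(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        ReadOnlyViewDemo demo = new ReadOnlyViewDemo();
        demo.add("a");
        List<String> view = demo.names();
        System.out.println(rejectsWrites(view));
        demo.add("b");
        System.out.println(view);
    }
}
```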

"A scenario where List list = new ArrayList() is better than ArrayList list = new ArrayList() would be much appreciated."
One concrete example: if it's a field declaration and you have a setList(), which of course should take a List parameter to be flexible.
For local variables (and fields with no setters), there is very little concrete benefit in using the interface type. Many people will do it anyway on general principle.

You were right. In these cases, the variables are fields or local variables, they are not public interface, they are implementation details. Implementation detail should be detailed. You should call an ArrayList an ArrayList, because you just deliberately chose it for your implementation.
People who recycle cliches: look at your post and think a little bit more. It's nonsense.
My previous answer that was downvoted to death:
Use interface or type for variable definition in java?

Java: Should I always replace Arrays for ArrayLists?

Well, it seems to me ArrayLists make it easier to expand the code later on both because they can grow and because they make using Generics easier. However, for multidimensional arrays, I find the readability of the code is better with standard arrays.
Anyway, are there some guidelines on when to use one or the other? For example, I'm about to return a table from a function (int[][]), but I was wondering if it wouldn't be better to return a List<List<Integer>> or a List<int[]>.

Unless you have a strong reason otherwise, I'd recommend using Lists over arrays.
There are some specific cases where you will want to use an array (e.g. when you are implementing your own data structures, or when you are addressing a very specific performance requirement that you have profiled and identified as a bottleneck) but for general purposes Lists are more convenient and will offer you more flexibility in how you use them.
Where you are able to, I'd also recommend programming to the abstraction (List) rather than the concrete type (ArrayList). Again, this offers you flexibility if you decide to change the implementation details in the future.
To address your readability point: if you have a complex structure (e.g. ArrayList of HashMaps of ArrayLists) then consider either encapsulating this complexity within a class and/or creating some very clearly named functions to manipulate the structure.

Choose a data structure implementation and interface based on primary usage:
Random Access: use List for variable type and ArrayList under the hood
Appending: use Collection for variable type and LinkedList under the hood
Loop and process: use Iterable and see the above for use under the hood based on producer code
Use the most abstract interface possible when handing around data. That said don't use Collection when you need random access. List has get(int) which is very useful when random access is needed.
Typed collections like List<String> make up for the syntactical convenience of arrays.
Don't use Arrays unless you have a qualified performance expert analyze and recommend them. Even then you should get a second opinion. Arrays are generally a premature optimization and should be avoided.
Generally speaking you are far better off using an interface rather than a concrete type. The concrete type makes it hard to rework the internals of the function in question. For example, if you return int[][] you have to do all of the computation up front. If you return List<List<Integer>> you can do the computation lazily during iteration (or even concurrently in the background) if that is beneficial.
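One way to picture the lazy variant, as a sketch only: a hypothetical multiplicationTable method that returns a nested List built on AbstractList, where each cell is computed when it is read rather than stored. The class name, method name and the choice of a multiplication table are all assumptions made for illustration.

```java
import java.util.AbstractList;
import java.util.List;

class LazyTableDemo {
    // Returns a read-only rows x cols table; nothing is computed until a cell is accessed.
    static List<List<Integer>> multiplicationTable(int rows, int cols) {
        return new AbstractList<List<Integer>>() {
            @Override
            public List<Integer> get(int row) {
                if (row < 0 || row >= rows) {
                    throw new IndexOutOfBoundsException("row " + row);
                }
                return new AbstractList<Integer>() {
                    @Override
                    public Integer get(int col) {
                        if (col < 0 || col >= cols) {
                            throw new IndexOutOfBoundsException("col " + col);
                        }
                        // Computed on demand; no backing storage exists.
                        return (row + 1) * (col + 1);
                    }

                    @Override
                    public int size() {
                        return cols;
                    }
                };
            }

            @Override
            public int size() {
                return rows;
            }
        };
    }

    public static void main(String[] args) {
        List<List<Integer>> table = multiplicationTable(1000, 1000);
        System.out.println(table.get(2).get(3));
    }
}
```

A caller written against List cannot tell whether the values were precomputed, which is exactly the freedom an int[][] return type takes away.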

The List is more powerful:
You can resize the list after it has been created.
You can create a read-only view onto the data.
It can be easily combined with other collections, like Set or Map.
The array works on a lower level:
Its content can always be changed.
Its length can never be changed.
It uses less memory.
You can have arrays of primitive data types.

I wanted to point out that Lists can hold the wrappers for the primitive data types that would otherwise need to be stored in an array (i.e. a class Double that has only one field: a double). The newer versions of Java convert to and from these wrappers implicitly, at least most of the time, so the ability to put primitives in your Lists should not be a consideration for the vast majority of use cases.
For completeness: the only time that I have seen Java fail to implicitly convert from a primitive wrapper was when those wrappers were composed in a higher order structure: It could not convert a Double[] into a double[].
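Since autoboxing stops at the element level, converting between Double[] and double[] has to be done explicitly. One possible way, assuming Java 8 streams are available, is sketched below; BoxingDemo, box and unbox are hypothetical names.

```java
import java.util.Arrays;

class BoxingDemo {
    // Double[] -> double[]; a null element would cause a NullPointerException here.
    static double[] unbox(Double[] boxed) {
        return Arrays.stream(boxed).mapToDouble(Double::doubleValue).toArray();
    }

    // double[] -> Double[]
    static Double[] box(double[] primitives) {
        return Arrays.stream(primitives).boxed().toArray(Double[]::new);
    }

    public static void main(String[] args) {
        double[] primitives = unbox(new Double[] { 1.5, 2.5 });
        System.out.println(Arrays.toString(primitives));
        System.out.println(Arrays.toString(box(primitives)));
    }
}
```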

It mostly comes down to flexibility/ease of use versus efficiency. If you don't know how many elements will be needed in advance, or if you need to insert in the middle, ArrayLists are a better choice. They use Arrays under the hood, I believe, so you'll want to consider using the ensureCapacity method for performance. Arrays are preferred if you have a fixed size in advance and won't need inserts, etc.
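For the ensureCapacity hint, a minimal sketch follows. CapacityDemo and squares are hypothetical names. Note that ensureCapacity is declared on ArrayList rather than on the List interface, so the variable has to be typed as ArrayList to call it.

```java
import java.util.ArrayList;
import java.util.List;

class CapacityDemo {
    static List<Integer> squares(int count) {
        ArrayList<Integer> result = new ArrayList<>();
        // Reserve room once instead of letting the backing array grow step by step.
        result.ensureCapacity(count);
        for (int i = 0; i < count; i++) {
            result.add(i * i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(squares(5));
    }
}
```

When the size is known at construction time, `new ArrayList<>(count)` achieves the same thing. Either way this only affects allocation behaviour, not the contents of the list.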
