Java Language Design with toString


Why did they decide not to implement a toString method for int[], and instead let it inherit the toString method from Object?

They did implement more reasonable toString methods for arrays. They are located in the java.util.Arrays class.
As for the reasoning: judging by the overloads provided in the Arrays class, I assume that implementing a single generic toString for every type of array is either complex or impossible. The toString method would have to know what type of array it was working on and output the data appropriately. For instance, Object[] must call toString on each element, while char[] must output the characters, and the numeric datatypes must be converted to numeric strings.
The methods in Arrays get this for free, because each overload fixes the array type at compile time.
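A minimal sketch of the overloads described above. The class name ArrayToStringDemo and the helper renderAll are made up for illustration; Arrays.toString and Arrays.deepToString are the real java.util.Arrays methods.

```java
import java.util.Arrays;

public class ArrayToStringDemo {
    // Hypothetical helper: each call below resolves at compile time
    // to a different overload in java.util.Arrays.
    static String[] renderAll() {
        int[] numbers = {1, 2, 3};
        char[] letters = {'a', 'b'};
        Object[] mixed = {"x", 42, null};
        int[][] grid = {{1, 2}, {3, 4}};
        return new String[] {
            Arrays.toString(numbers),   // toString(int[])
            Arrays.toString(letters),   // toString(char[])
            Arrays.toString(mixed),     // toString(Object[]), uses String.valueOf per element
            Arrays.deepToString(grid)   // descends into nested arrays
        };
    }

    public static void main(String[] args) {
        // Prints:
        // [1, 2, 3]
        // [a, b]
        // [x, 42, null]
        // [[1, 2], [3, 4]]
        for (String line : renderAll()) {
            System.out.println(line);
        }
    }
}
```

For nested arrays, Arrays.toString would fall back to the inherited Object.toString for each inner array, which is why deepToString exists as a separate method.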

I guess because of the following reasoning: how would they know how users would want to present their array? It might be "array size: 10" or it might be "[x,y,z]".
They gave you a default; if you want to make it something else, it is easy to do.
You can use Apache's ToStringBuilder to make it easier...
http://commons.apache.org/lang/api/org/apache/commons/lang/builder/ToStringBuilder.html

My guess is that it is because array classes weren't written in Java source code by the language designers; they are created on demand for whatever element type you require (by the JVM at run time, rather than from a source file). Remember, you can have an array of any object type, so the array class has to be created as appropriate for the type you use.
If they were to create a standard method, it's not immediately obvious how it should work. For example, calling toString() on each element and concatenating the results might be OK for a small array, but it doesn't work well for a multidimensional array or an array with 1,000 entries. So I think no toString() method is created to keep all arrays consistent.
Admittedly, it is annoying and sometimes I do think something along the lines of "Array[" + size + "] of " + getClassName() would be so much better than the default.

A bit of guesswork here, but...
There isn't an obvious string representation of an int array. People do it in different ways: comma separated, space separated, enclosed in brackets or parentheses or nothing. That probably drove the decision not to implement it in Java 1.1, along with it being low-priority code (since anyone can very simply implement a method to write an array as a string themselves).
Now you can't change it in Java 1.2 or later, because that would break backward compatibility for anyone already relying on the old behaviour. You can, however, add a utility class that implements the functionality, and that's what they did with java.util.Arrays.

Related

Why does Java's List have "List.toArray()", but arrays don't have "Array.toList()"?

Arrays don't have a "toList" function, so we need "Arrays.asList" helper functions to do the conversion.
This is quite odd: List has its own function to convert to an array, but arrays need some helper functions to convert to a List. Why not let arrays have a "toList" function, and what's the reason behind this Java design?
Thanks a lot.

Because List instances are actual objects, while arrays behave (for MOST intents and purposes) like primitives and don't expose methods of their own. Technically arrays are objects, which is how they can have a length field and support a method call such as clone(), but their classes are created by the JVM rather than compiled from source.

Another point to consider is that toArray() is declared on an INTERFACE (specifically, the java.util.List interface).
Consequently, you must call toArray() on some object instance that implements List.
This might help explain at least one part of your question: why it's not static ...

As others have pointed out: An array in Java is a rather "low-level" construct. Although it is an Object, it is not really part of the object-oriented world. This is true for arrays of references, like String[], but even more so for primitive arrays like int[] or float[]. These arrays are rather "raw, contiguous blocks of memory" that have a direct representation inside the Java Virtual Machine.
From the perspective of language design, arrays in Java might even be considered as a tribute to C and C++. There are languages that are purely object-oriented and where plain arrays or primitive types do not exist, but Java is not one of them.
More specifically focusing on your question:
An important point here is that Arrays#asList does not do any conversion. It only creates a List that is a view on the array. In contrast to that, the Collection#toArray method does create a new array. I wrote a few words about the implications of this difference in this answer about the lists that are created with Arrays#asList.
Additionally, the Arrays#asList method only works for arrays where the elements are a reference type (like String[]). It does not work for primitive arrays like long[] *. But fortunately, there are several methods to cope with that, and I mentioned a few of them in this answer about how to convert primitive arrays into List objects.
* In fact, Arrays#asList does work for primitive arrays like long[], but it does not create a List<Long>, but a List<long[]>. That's often a source of confusion...
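The view-versus-copy distinction and the List<long[]> surprise can be sketched as follows. The class and method names are invented for this example; only Arrays.asList, List.toArray and the stream calls are standard library API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class AsListViewDemo {
    // Arrays.asList returns a view: writing to the list writes to the array.
    static String viewWritesThrough() {
        String[] names = {"a", "b"};
        List<String> view = Arrays.asList(names);
        view.set(0, "z");
        return names[0]; // "z", the array itself changed
    }

    // toArray allocates a new array: changing the copy leaves the list alone.
    static String copyIsIndependent() {
        List<String> list = Arrays.asList("a", "b");
        String[] copy = list.toArray(new String[0]);
        copy[0] = "z";
        return list.get(0); // still "a"
    }

    // A primitive array is treated as ONE element of type long[].
    static int sizeOfWrappedPrimitiveArray() {
        long[] values = {1L, 2L, 3L};
        List<long[]> wrapped = Arrays.asList(values);
        return wrapped.size(); // 1, not 3
    }

    // One way to get a real List<Long>: box each element through a stream.
    static List<Long> boxed() {
        long[] values = {1L, 2L, 3L};
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(viewWritesThrough());            // z
        System.out.println(copyIsIndependent());            // a
        System.out.println(sizeOfWrappedPrimitiveArray());  // 1
        System.out.println(boxed());                        // [1, 2, 3]
    }
}
```

Because the list returned by Arrays.asList is backed by the array, it is also fixed-size: set works, but add and remove throw UnsupportedOperationException.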

From the modern perspective, there's no good reason why arrays lack the method you mention as well as a ton of other methods, which would make them so much easier to use. The only objective reason is that "it made the most sense to the designers of Java twenty two years ago". It was a different landscape in 1996, an era where C and C++ were the dominant general purpose programming languages.
You can compare this to e.g. Kotlin, a language that compiles to the exact same bytecode as Java. Its arrays are full-featured objects.

Arrays behave much like primitives in that they expose almost no methods of their own, so you have to use methods from another class. A List is an ordinary reference type, so it can declare methods.

Is it possible in Java to override 'toString' for an Objects array?

Is it possible in Java to override a toString for an Objects array?
For example, let's say I created a simple class, User (it doesn't really matter which class is it since this is a general question). Is it possible that, once the client creates a User[] array and the client uses System.out.print(array), it won't print the array's address but instead a customized toString()?
PS: of course I can't just override toString() in my class since it's related to single instances.

No. Of course you can create a static method User.toString( User[] ), but it won't be called implicitly.

You can use Arrays.toString(Object[] a); which will call the toString() method on each object in the array.
Edit (from comment):
I understand what it is you're trying to achieve, but Java doesn't support that at this time.
In Java, arrays are objects that are dynamically created and may be assigned to variables of type Object. All methods of class Object may be invoked on an array. See JLS Ch10
When you invoke toString() on an object it returns a string that "textually represents" the object. Because an array inherits toString() from Object unchanged, you only get the name of the class, the @ character and a hex value (the hash code). See Object#toString
The Arrays.toString() method returns the equivalent of the array as a list, which is iterated over and toString() called on each object in the list.
So while you won't be able to do System.out.println(userList); you can do System.out.println(Arrays.toString(userList)); which will essentially achieve the same thing.
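A small sketch of the difference, assuming a hypothetical User class with an overridden toString(). The class names here are made up; only Arrays.toString is library API.

```java
import java.util.Arrays;

public class UserArrayPrintDemo {
    // Hypothetical element type with its own toString().
    static final class User {
        private final String name;

        User(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return "User(" + name + ")";
        }
    }

    // Delegates to Arrays.toString, which calls toString() on each element.
    static String render(User[] users) {
        return Arrays.toString(users);
    }

    public static void main(String[] args) {
        User[] users = {new User("ann"), new User("bob")};

        // Inherited Object.toString(): array type name, '@', hex hash code.
        System.out.println(users);

        // Element-wise rendering: [User(ann), User(bob)]
        System.out.println(render(users));
    }
}
```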

You can create a separate class containing the array, and override toString().
I think the simplest solution is to extend the ArrayList class, and just override toString() (for example, UserArrayList).
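A rough sketch of that suggestion. UserArrayList and User are hypothetical names and the output format is arbitrary; note that ArrayList already inherits a readable toString(), so overriding only matters if you want a different layout.

```java
import java.util.ArrayList;
import java.util.StringJoiner;

public class UserArrayListDemo {
    // Hypothetical element type.
    static final class User {
        private final String name;

        User(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return name;
        }
    }

    // A list type that controls its own string form, which an array cannot do.
    static final class UserArrayList extends ArrayList<User> {
        private static final long serialVersionUID = 1L;

        @Override
        public String toString() {
            StringJoiner joiner = new StringJoiner(" | ", "Users<", ">");
            for (User user : this) {
                joiner.add(user.toString());
            }
            return joiner.toString();
        }
    }

    // Builds a two-element list and returns its customized string form.
    static String demo() {
        UserArrayList users = new UserArrayList();
        users.add(new User("ann"));
        users.add(new User("bob"));
        return users.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // Users<ann | bob>
    }
}
```

With this, System.out.println(users) picks up the customized format implicitly, which is the behaviour the question asked for, at the cost of using a list instead of User[].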

The only way you can do this is to re-compile Object.toString() and add instanceof clauses.
I had requested a change in Project Coin to handle arrays in a more object orientated way. I felt that it's too much for beginners to learn all the functionality you need in Array, Arrays and 7 other helper classes which are commonly used.
I think in the end it was concluded that to make arrays properly object orientated is a non-trivial task which will be pushed back to Java 9 or beyond.

Try this
User[] array = new User[2];
System.out.println(Arrays.asList(array));
Of course, this assumes you have a customized User.toString() method.

You cannot do that. When you declare an array, then Java in fact creates a hidden object of type Array. It is a special kind of class (for example, it supports the index access [] operator) which normal code cannot directly access.
If you wanted to override toString(), you would have to extend this class. But you cannot do it, since it is hidden.
I think it is good that it works this way. If one could extend the Array class, then one could add all kinds of methods there. And when someone else maintains this code, they see custom methods on arrays and they are "WTF... Is this C++ or what?".

Well, you can try to wrap all calls to Object.toString with an AspectJ around advice and return the desired output for arrays. However, I don't think it's a good solution :)

Why is String.length() a method?

If a String object is immutable (and thus obviously cannot change its length), why is length() a method, as opposed to simply being public final int length such as there is in an array?
Is it simply a getter method, or does it make some sort of calculation?
Just trying to see the logic behind this.

Java is a standard, not just an implementation. Different vendors can license and implement Java differently, as long as they adhere to the standard. By making the standard call for a field, that limits the implementation quite severely, for no good reason.
Also a method is much more flexible in terms of the future of a class. It is almost never done, except in some very early Java classes, to expose a final constant as a field that can have a different value with each instance of the class, rather than as a method.
The length() method well predates the CharSequence interface, probably from its first version. Look how well that worked out. Years later, without any loss of backwards compatibility, the CharSequence interface was introduced and fit in nicely. This would not have been possible with a field.
So let's really invert the question (which is what you should do when you design a class intended to remain unchanged for decades): What does a field gain here? Why not simply make it a method?

This is a fundamental tenet of encapsulation.
Part of encapsulation is that the class should hide its implementation from its interface (in the "design by contract" sense of an interface, not in the Java keyword sense).
What you want is the String's length -- you shouldn't care if this is cached, calculated, delegates to some other field, etc. If the JDK people want to change the implementation down the road, they should be able to do so without you having to recompile.

Perhaps a .length() method was considered more consistent with the corresponding method for a StringBuffer, which would obviously need more than a final member variable.
The String class was probably one of the very first classes defined for Java, ever. It's possible (and this is just speculation) that the implementation used a .length() method before final member variables even existed. It wouldn't take very long before the use of the method was well-embedded into the body of Java code existing at the time.

Perhaps because length() comes from the CharSequence interface. A method is a more sensible abstraction than a variable if it's going to have multiple implementations.
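To illustrate why a method fits an interface where a field could not, here is a sketch with a made-up CharSequence implementation (RepeatedChar) next to String and StringBuilder. The names are assumptions for this example; CharSequence and its four methods are the real API.

```java
public class CharSequenceLengthDemo {
    // Hypothetical CharSequence: the same character repeated `count` times,
    // without ever storing a character array.
    static final class RepeatedChar implements CharSequence {
        private final char c;
        private final int count;

        RepeatedChar(char c, int count) {
            if (count < 0) {
                throw new IllegalArgumentException("count must be >= 0");
            }
            this.c = c;
            this.count = count;
        }

        @Override
        public int length() {
            return count;
        }

        @Override
        public char charAt(int index) {
            if (index < 0 || index >= count) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return c;
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            if (start < 0 || end > count || start > end) {
                throw new IndexOutOfBoundsException(start + ".." + end);
            }
            return new RepeatedChar(c, end - start);
        }

        @Override
        public String toString() {
            StringBuilder sb = new StringBuilder(count);
            for (int i = 0; i < count; i++) {
                sb.append(c);
            }
            return sb.toString();
        }
    }

    // Works for any implementation because length() is part of the interface.
    static int measure(CharSequence text) {
        return text.length();
    }

    public static void main(String[] args) {
        System.out.println(measure("abc"));                     // 3
        System.out.println(measure(new StringBuilder("ab")));   // 2
        System.out.println(measure(new RepeatedChar('x', 5)));  // 5
    }
}
```

An interface cannot declare an instance field, so a public final int length on String could not have been abstracted this way later.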

You should always use accessor methods in public classes rather than public fields, regardless of whether they are final or not (see Item 14 in Effective Java).
When you allow a field to be accessed directly (i.e. it is public) you lose the benefit of encapsulation, which means you can't change the representation without changing the API (you break people's code if you do) and you can't perform any action when the field is accessed.
Effective Java provides a really good rule of thumb:
If a class is accessible outside its package, provide accessor methods, to preserve the flexibility to change the class's internal representation. If a public class exposes its data fields, all hope of changing its representation is lost, as client code can be distributed far and wide.
Basically, it is done this way because it is good design practice to do so. It leaves room to change the implementation of String at a later stage without breaking code for everyone.

String is using encapsulation to hide its internal details from you. An immutable object is still free to have mutable internal values as long as its externally visible state doesn't change. Length could be lazily computed. I encourage you to take a look at String's source code.

Checking the source code of String in OpenJDK, it's only a getter.
But as @SteveKuo points out, this could differ depending on the implementation.

In most current JVM implementations, a substring references the char array of the original String for its content, so it needs start and length fields to define its own content, and the length() method is used as a getter. However, this is not the only possible way to implement String.
In a different possible implementation, each String could have its own char array. Since char arrays already have a length field with the correct value, a separate length field in the String object would be redundant. Because String.length() is a method, we don't need one and can just return the internal array.length.
These are two possible implementations of String, both with their own good and bad parts, and they can replace each other because the length() method hides where the length is stored (in the internal array or in its own field).
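The two storage strategies can be sketched side by side. Text, SharedText and OwnedText are invented stand-ins for illustration and are not how any particular JDK implements String.

```java
public class StringLengthStorageDemo {
    // Hypothetical stand-in for the public contract of a string type.
    interface Text {
        int length();

        char charAt(int index);
    }

    // Variant A: shares a backing array and records its own window into it,
    // so the length must be stored in a field.
    static final class SharedText implements Text {
        private final char[] data;
        private final int offset;
        private final int count;

        SharedText(char[] data, int offset, int count) {
            if (offset < 0 || count < 0 || offset > data.length - count) {
                throw new IndexOutOfBoundsException("bad window");
            }
            this.data = data;
            this.offset = offset;
            this.count = count;
        }

        @Override
        public int length() {
            return count;
        }

        @Override
        public char charAt(int index) {
            if (index < 0 || index >= count) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return data[offset + index];
        }
    }

    // Variant B: owns an exactly sized array, so the array's length is the answer.
    static final class OwnedText implements Text {
        private final char[] data;

        OwnedText(String source) {
            this.data = source.toCharArray();
        }

        @Override
        public int length() {
            return data.length;
        }

        @Override
        public char charAt(int index) {
            return data[index];
        }
    }

    public static void main(String[] args) {
        char[] backing = "hello world".toCharArray();
        Text shared = new SharedText(backing, 6, 5); // window over "world"
        Text owned = new OwnedText("world");

        // Callers cannot tell where the length comes from.
        System.out.println(shared.length()); // 5
        System.out.println(owned.length());  // 5
    }
}
```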

Which toString technique is more efficient?

I have a class called Zebra (not her actual name). Zebra overrides the toString method to provide her own convoluted obfuscated stringification.
Which is more efficient to stringify an instance of Zebra? Presuming that I have to do this stringification millions of times per session.
1. zebra.toString()
2. ""+zebra
3. static String BLANK (singleton); BLANK+zebra (multiple executions).
Where the value of zebra is not assured to be the same.
I am conjecturing that the answer could be - no concern: the compiler makes them all equivalent. If that is not the answer, please describe the instantiation process that makes them different. (2) and (3) could be the same, since the compiler would group all similar strings and assign them to a single reference.
Normally, I do ""+zebra because I am too lazy to type zebra.toString().
ATTN: To clarify.
I have seen questions having been criticised like "why do you want to do this, it's impractical" If every programmer refrains from asking questions because it has no practical value, or every mathematician does the same - that would be the end of the human race.
If I wrote an iteration routine, the differences might be too small. I am less interested in an experimental result than I am interested in the difference in processes:
For example, zebra.toString() would invoke only one toString, while ""+zebra would invoke an extra string instantiation and an extra string concat, which would make it less efficient. Or would it? Or does the compiler nullify that?
Please do not answer if your answer is focused on writing an iterative routine, whose results will not explain the compiler or machine process.
Virtue of a good programmer = lazy to write code but not lazy to think.

Number 1 is more efficient.
The other options create an instance of StringBuilder, append an empty string to it, call zebra.toString, append the result of this to the StringBuilder, and then convert the StringBuilder to a String. This is a lot of unnecessary overhead. Just call toString yourself.
This is also true, by the way, if you want to convert a standard type, like Integer, to a String. DON'T write
String s=""+n; // where "n" is an Integer
DO write
String s=n.toString();
or
String s=String.valueOf(n);

As a general rule, I would never use the + operator unless it is on very small final/hard-coded strings. Using this operator usually results in several extra objects in memory being created before your resulting string is returned (this is bad, especially if it happens "millions of times per session").
If you ever do need to concatenate strings, such as when building a unique statement dynamically (for SQL or an output message, for example), use a StringBuilder!!! It is significantly more efficient for concatenating strings.
In the case of your specific question, just use the toString() method. If you don't like typing, use an IDE (like Eclipse or NetBeans) and then use code completion to save you the keystrokes. Just type the first letter or two of the method and then hit "CTRL+SPACE".

zebra.toString() is the best option. Keep in mind zebra might be null, in which case you'll get a NullPointerException. So you might have to do something like
String s = zebra==null ? null : zebra.toString();
""+zebra results in a StringBuilder being created, then "" and zebra.toString() are appended separately, so this is less efficient. Another big difference is that if zebra is null, the resulting string will be "null".

If Zebra is a singleton class, or the same instance of zebra is being used, then you can store the result of toString in Zebra and reuse it for all future calls to toString.
If that's not the case, then in the implementation of toString, cache the part that does not change between calls in one place; this way you avoid creating some String instances every time.
Otherwise I do not see any escape from the problem you have :(

Option 1 is the best option since every option calls the toString() method of zebra, but options 2 and 3 also do other (value free) work.
zebra.toString() - Note that this calls the toString() method of zebra.
""+zebra - This also calls the toString() method of zebra.
static String BLANK; BLANK+zebra; - This also calls the toString() method of zebra.
You admit "I'm lazy so I do stupid stuff". If you are unwilling to stop being lazy, then I suggest you not concern yourself with "which is better", since lazy is likely to trump knowledge.

Since the object's toString method will be invoked implicitly in cases where it is not invoked explicitly, a more "efficient" way doesn't exist unless the "stringification" is happening to the same object. In that case, it's best to cache and reuse instead of creating millions of String instances.
Anyway, this question seems more focused on aesthetics/verbosity than efficiency/performance.

If you want to know things like this you can code small example routines and look at the generated bytecode using the javap utility.
I am conjecturing that the answer could be - no concern: the compiler makes them all equivalent. [...] Normally, I do ""+zebra because I am too lazy to type zebra.toString().
Two things:
First: The two options are different. Think about zebra being null.
Second: I'm too lazy to do this javap stuff for you.
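The null case mentioned in these answers is easy to check. The sketch below uses made-up helper names; the behaviour shown (concatenation and String.valueOf produce "null", a direct call throws) is standard Java.

```java
public class NullStringificationDemo {
    // Concatenation converts a null reference to the text "null".
    static String viaConcat(Object value) {
        return "" + value;
    }

    // String.valueOf(Object) does the same without the concatenation.
    static String viaValueOf(Object value) {
        return String.valueOf(value);
    }

    // A direct call needs an explicit guard, otherwise it throws.
    static String viaGuardedToString(Object value) {
        return value == null ? null : value.toString();
    }

    // Reports whether an unguarded toString() call throws for this value.
    static boolean directCallThrows(Object value) {
        try {
            value.toString();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object zebra = null;
        System.out.println(viaConcat(zebra));          // null (the four-letter string)
        System.out.println(viaValueOf(zebra));         // null (the four-letter string)
        System.out.println(viaGuardedToString(zebra)); // null (a null reference)
        System.out.println(directCallThrows(zebra));   // true
    }
}
```

So the options are not interchangeable even before performance is considered: which one is right depends on what should happen for a null zebra.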

Subclassing Array

If you're not interested in a story, skip the first 2 paragraphs.
I was talking to a friend about arrays and why they (still) crash if you try to access an object that is out of bounds in "modern" languages like Objective C (That's my main language). So we got into a debate and I said that I could write him an Array (I named it GeniusArray) that returns null and prints out an error if you try to access something out of bounds but does not crash.
After sleeping on it I realized that if you are accessing elements that are out of bounds you have some serious errors in your code, and it's maybe not bad for the app to crash so that you are forced to fix it. :-D
But still: I'd like to prove my point and subclass an Array and override the get() method by basically adding this one if statement that every programmer writes relatively often:
// Pseudo code...
if (index < array.count) element= array[index];
I want to do it in Java and not Objective C because that's what my friend "knows" (btw, we are both students).
To cut a long story short: I tried to subclass an Array but it doesn't seem to work. I'm getting this:
Access restriction: The type Attribute.Array is not accessible due to restriction on required library: /System/Library/Frameworks/JavaVM.framework/Versions/1.6.0/Classes/classes.jar GeniusArray.java

Only classes can be subclassed. Array types are not classes. (From here: "There are three kinds of reference types: class types, interface types, and array types.")
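Reflection gives a quick way to look at how the runtime treats array types. This is only an illustrative sketch with invented helper names; it uses standard java.lang.Class and java.lang.reflect.Modifier calls.

```java
import java.lang.reflect.Modifier;

public class ArrayTypeInspection {
    // True for int[].class, String[].class and so on.
    static boolean isArrayType(Class<?> type) {
        return type.isArray();
    }

    // For every array type the direct superclass is Object.
    static Class<?> superclassOf(Class<?> type) {
        return type.getSuperclass();
    }

    // Array types always report the final modifier, so nothing can extend them.
    static boolean isFinal(Class<?> type) {
        return Modifier.isFinal(type.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(int[].class.getName());             // [I
        System.out.println(String[].class.getName());          // [Ljava.lang.String;
        System.out.println(isArrayType(int[].class));          // true
        System.out.println(superclassOf(String[].class));      // class java.lang.Object
        System.out.println(isFinal(int[].class));              // true
    }
}
```

There is no source-level class named after these types to put in an extends clause, which is why the attempt in the question ended up resolving Array to an unrelated library class.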

Languages such as C, C++ and Objective-C do not check array bounds (and hence result in unpredictable behaviour if you try to access an array with an invalid index) for performance reasons.
Java does check array bounds on every array access and you'll get an ArrayIndexOutOfBoundsException if you use an invalid index. Some people argue that because of this built-in check arrays in Java are less efficient than in other programming languages.

Yes. As you've discovered you can't subclass from an array. You'll have to subclass from (say) an ArrayList and use the Decorator pattern to intercept the get() methods (and related).
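A minimal sketch of the ArrayList route, assuming the behaviour described in the question (print an error and return null instead of throwing). GeniusList is a hypothetical name borrowed from the question's GeniusArray, and this version simply overrides get() rather than building a full decorator.

```java
import java.util.ArrayList;

public class GeniusListDemo {
    // Overrides get() so that an out-of-range index is reported
    // and answered with null instead of an exception.
    static final class GeniusList<E> extends ArrayList<E> {
        private static final long serialVersionUID = 1L;

        @Override
        public E get(int index) {
            if (index < 0 || index >= size()) {
                System.err.println("Index " + index + " is out of bounds for size " + size());
                return null;
            }
            return super.get(index);
        }
    }

    // Hypothetical helper: looks up an index in a fixed two-element list.
    static String lookup(int index) {
        GeniusList<String> list = new GeniusList<>();
        list.add("a");
        list.add("b");
        return list.get(index);
    }

    public static void main(String[] args) {
        System.out.println(lookup(1));  // b
        System.out.println(lookup(5));  // null, plus a message on stderr
    }
}
```

Returning null makes a missing element indistinguishable from a stored null, which is one more reason the asker's later conclusion (let it crash and fix the bug) is usually the better design.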
Unfortunately you can't provide an operator overload for [] either, so you're going to be some distance from your original objective.

As far as I recall, you really can't subclass an Array in Java (it is a special type). The VM makes some assumptions about arrays that subclassing might mess up.
Normally, I would just try to stay away from arrays. Use ArrayLists instead.
