Provide Programmatic Access to All Data Available in String Form: toString()


Bloch said: Provide Programmatic Access to All Data Available in String Form.
I am wondering whether he means that toString() should be overridden so that it includes 'all data available'?
I think 'in string form' means that the string is for human reading, so overriding toString() is enough to follow the advice. Am I correct?

No, apparently he meant quite the opposite of that. If a data member is available as part of the toString() output (or other string methods of the class), Bloch's fear is that developers using the API will rely on that and parse the strings to get at the underlying data values. His advice is to provide specific accessors for those data elements, to prevent developers from relying on the format of toString()'s output.
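As a minimal sketch of that advice, consider a hypothetical PhoneNumber class (the class and its field names are assumptions for illustration, not taken from the question). Every value that shows up in toString() is also reachable through an accessor, so no caller has to parse the string:

```java
// Hypothetical value class; names and format are assumptions for illustration.
final class PhoneNumber {
    private final int areaCode;
    private final int prefix;
    private final int lineNumber;

    PhoneNumber(int areaCode, int prefix, int lineNumber) {
        this.areaCode = areaCode;
        this.prefix = prefix;
        this.lineNumber = lineNumber;
    }

    // Accessors expose every value that toString() reveals,
    // so callers never need to parse the string.
    int areaCode() { return areaCode; }
    int prefix() { return prefix; }
    int lineNumber() { return lineNumber; }

    // Human-readable form; its exact format stays free to change.
    @Override
    public String toString() {
        return String.format("(%03d) %03d-%04d", areaCode, prefix, lineNumber);
    }
}
```

Code that needs the area code calls areaCode() rather than cutting characters out of the toString() result, so a later change to the printed format does not break it.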

Related

What is the need for Name interface in Java?

In Java, when is the interface Name (which extends CharSequence) useful? Why not just use final String instead? I do not see any classes that implement Name.
Alternatively, what I am interested in is the use cases of Name, CharSequence, and the String class.

It is more than just a String.
In the description of the equals method it says:
Note that the identity of a Name is a function both of its content in
terms of a sequence of characters as well as the implementation which
created it.
So this implies that there is more to an object that implements Name than just its character representation. It also has to know about the implementation which created it.
If you want to just compare the content you have to use contentEquals.
But you are probably not going to be creating classes using this interface yourself. It is part of the Java language model.

To summarize the answer based on the comments: By @Mark Rotteveel
This is part of the package javax.lang.model.element, which is "Interfaces used to model elements of the Java programming language.". In other words, it is not general purpose. And there are implementations, just not in a public API, as these interfaces are the API (e.g. in Temurin 11, there are
com.sun.tools.javac.util.Name,
com.sun.tools.javac.util.SharedNameTable.NameImpl and
com.sun.tools.javac.util.UnsharedNameTable.NameImpl).
And it is more than just a string, based on @rghome's answer.
Moreover, the reason the modeling is also provided in the same language is: By @Zabuzard
It is mostly for the meta-aspects that Java provides, in particular the reflection API that allows you to inspect the code and retrieve meta-information about it, or to dynamically trigger stuff. For example call a method by its name, given by user input.

CharSequence is a very general interface, more so than String, and contrary to the ubiquitous use of String, one could declare parameters as CharSequence instead, so that a StringBuilder can be passed too.
What speaks for String (a class), however, is its evident immutability and its extra methods, although the CharSequence interface itself does not declare any mutating methods either. As String and StringBuilder are the only commonly useful implementations, one can normally forget about using CharSequence.
Name is a dedicated interface for language modeling itself. It is a form of wrapping a value type (String here) in its own type for specific usage. Like wrapping a time String in a Time. Note that for a case-insensitive language the Name#equals could ignore the case (as in PascalName implements Name). The class/interface should not be used outside language modeling. In the model a Name may contain its declaration.
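To illustrate the CharSequence point above, here is a small sketch of a method that accepts any character sequence. The class and method names are made up for this example:

```java
// Hypothetical helper; the names are assumptions for illustration.
final class CharSequenceDemo {
    // Counts vowels in any character sequence, not only in a String.
    static int countVowels(CharSequence text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if ("aeiouAEIOU".indexOf(text.charAt(i)) >= 0) {
                count++;
            }
        }
        return count;
    }
}
```

Both countVowels("Name") and countVowels(new StringBuilder("String")) compile, whereas a String parameter would force the second caller to call toString() first.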

Serialization vs toString()

Since I'm writing to and reading from files, I was wondering whether there is any difference, or any best practice, between writing the objects directly and writing their string representation to the file, which I personally find easier to handle in my case.
So when should I serialize instead of writing/reading objects as Strings?

There's typically not enough information in the string representation of an object to be used to recreate it.
Java serialization "just works", but does not give you a human-readable representation, if that's what you are looking for.
Another alternative is to read / write JSON representations of your objects. There are several popular JSON serialization / deserialization libraries for Java, including Gson and Jackson.
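For comparison, a minimal sketch of a round trip through built-in Java serialization might look like the following. Account and SerializationDemo are hypothetical names, and the bytes go to an in-memory buffer rather than a file to keep the example self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class used only for illustration.
final class Account implements Serializable {
    private static final long serialVersionUID = 1L;

    final String owner;
    final long balanceCents;

    Account(String owner, long balanceCents) {
        this.owner = owner;
        this.balanceCents = balanceCents;
    }

    // Readable, but lossy: the cents are dropped, so the object
    // cannot be rebuilt from this string.
    @Override
    public String toString() {
        return owner + " has about " + (balanceCents / 100) + " units";
    }
}

final class SerializationDemo {
    // Writes the full object state to a byte array.
    static byte[] write(Account account) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(account);
        }
        return bytes.toByteArray();
    }

    // Rebuilds an equivalent object from the bytes produced by write().
    static Account read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Account) in.readObject();
        }
    }
}
```

The bytes are not human readable, but read(write(account)) gives back both fields exactly, which the toString() output here could not do.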

The answer is in the javadoc for Object.toString().
Returns a string representation of the object. In general, the toString method returns a string that "textually represents" this object. The result should be a concise but informative representation that is easy for a person to read.
Note that it says:
concise,
informative, and
easy for a person to read.
But it does NOT say:
complete,
unambiguous, or
easy for a computer to read.
Serialization is about producing a linear (not necessarily textual) form that can be read by a computer and used to reconstruct the state of the original object.
So a typical serialization is either not particularly human readable (e.g. JSON, XML, YAML) or completely unreadable (e.g. Java Object Serialization, ASN.1). But the flip side is that the information needed to reconstruct an object should all be present, in an unambiguous form.
(There is a lot more that could be said about the various kinds of serialization, their properties and their utility. However, it is beyond the scope of your question.)
Does this preclude toString() from being used for serializing data?
No, it doesn't.
But if you take that approach, you need to code your toString() methods carefully to make sure that what they produce is complete and unambiguous. Then you need to write a corresponding method to parse the toString() output and create a new object from it.
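A minimal sketch of what that pairing could look like, assuming a hypothetical GridPoint class with two int fields (the class name and the "x,y" format are assumptions for illustration):

```java
// Hypothetical class; the "x,y" format is an assumption for illustration.
final class GridPoint {
    private final int x;
    private final int y;

    GridPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // Complete and unambiguous: every field appears, in a fixed order,
    // separated by a character that cannot occur inside an int.
    @Override
    public String toString() {
        return x + "," + y;
    }

    // Inverse of toString(); rejects anything that is not exactly "x,y".
    static GridPoint parse(String text) {
        String[] parts = text.split(",", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected \"x,y\" but got: " + text);
        }
        return new GridPoint(Integer.parseInt(parts[0]), Integer.parseInt(parts[1]));
    }
}
```

Even for two ints this needs a documented format, a parser, and error handling. Fields that may contain the separator (such as free-form strings) would also need escaping.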
... or using their representation as strings on files which in my case I personally find it easier to handle.
I think that as you write larger and more complicated programs, you will get to the stage where that kind of code is tedious and time consuming to write, test and maintain.

Serialization allows you to convert the state of an object into a stream of bytes, which can then be saved to a file on the local disk, sent over the network to another machine, or saved to a database. Deserialization reverses the process: it converts the serialized byte stream back into an object. Note that numbers and other non-string types are not as easy to write to files, or to treat, as Strings are. Without serialization, however, their original state is not guaranteed to be preserved.
Thus, Strings are convenient for simpler situations where serialization is not really necessary, such as a college project. In general, however, this approach is not recommended, as there are better solutions.

Represent email, telephonenumber and id's as POJO's instead of Strings

I have a typical business web application where the domain contains entities like accounts and users. My backend is Java, so they're represented by POJO's. During an early iteration, every attribute of those POJO's was just a string. This made sense because the html input was a string, and the way the data is persisted in the DB is also similar to a string.
Recently, we've been working on validating this kind of input and I found it helps if I switch over to an object notation for this kind of attribute. For example, a TelephoneNumber class consists of:
(int) country calling code
(string) rest of number
(static char) the character to prefix the country calling code (in our case this is a +)
(static pattern) regular expression to match if phonenumber is sensical
methods to compare and validate telephone numbers.
This class has advantages and disadvantages:
not good: Additional object creation and conversion between string/object
good: OOP and all logic regarding telephone numbers is bundled in one class (high cohesion),
good: whenever a telephone number is needed as an argument for a method or constructor, java's strict typing makes it very clear we're not just dealing with a random string.
Compare the possible confusing double strings:
public User(String name, String telephoneNumber)
vs the clean OOP way:
public User(String name, TelephoneNumber telephoneNumber)
I think in this case the advantages outweigh the disadvantages. My concern is now for the following two attributes:
-id's (like b3e99627-9754-4276-a527-0e9fb49d15bb)
-e-mail addresses
These "objects" are really just a single string. It seems overkill to turn them into objects. Especially the user.getMail.getMailString() kind of methods really bother me because I know the mailString is the only attribute of mail. However, if I don't turn them into an object, I lose some of the advantages.
So my question is: How do you deal with these concepts in a web application? Are there best practices or other concerns to take into account?

If you use Strings for everything you are essentially giving up type safety, and you have to "type check" with validation in any class or method where the string is used. Inevitably this validation code gets duplicated and makes other classes bloated, confusing, and potentially inconsistent because the validation isn't the same in all places. You can never really be sure what the string holds, so debugging becomes more difficult, maintenance gets ugly, and ultimately it wastes lots of developer time. Given the power of modern processors, you shouldn't worry about the performance cost of using lots of objects because it's not worth sacrificing programmer productivity (in most cases).
One other thing that I have found is that string variables tend to be more easily abused by future programmers who need to make a "quick fix", so they'll set new values for convenience just where they need them, instead of extending a type and making it clear what's going on.
My recommendation is to use meaningful types everywhere possible.
Maximizing the benefit of typing leads to the idea of "tiny types", which you can read about here: http://darrenhobbs.com/2007/04/11/tiny-types/
Essentially it means you make classes to represent everything. In your example with the User class, that would mean you would also make a Name class to represent the name. Inside that class you might also have two more classes, FirstName and LastName. This adds clarity to your code, and maximizes the number of logical errors the compiler stops you from making. In most cases you would never use a first name where you want a last name and vice versa.
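As a rough sketch of such a tiny type, an e-mail wrapper could look like this. The EmailAddress name, the of/value methods, and the deliberately loose pattern are all assumptions for illustration; real validation rules would be stricter:

```java
import java.util.Objects;
import java.util.regex.Pattern;

// Hypothetical tiny type; the validation rule is an assumption for this sketch.
final class EmailAddress {
    // Loose shape check: something, an @, something, and no whitespace.
    private static final Pattern SHAPE = Pattern.compile("[^@\\s]+@[^@\\s]+");

    private final String value;

    private EmailAddress(String value) {
        this.value = value;
    }

    // The only way to obtain an instance, so every instance is validated once.
    static EmailAddress of(String raw) {
        Objects.requireNonNull(raw, "raw");
        String trimmed = raw.trim();
        if (!SHAPE.matcher(trimmed).matches()) {
            throw new IllegalArgumentException("Not an e-mail address: " + raw);
        }
        return new EmailAddress(trimmed);
    }

    String value() { return value; }

    @Override
    public boolean equals(Object other) {
        return other instanceof EmailAddress && ((EmailAddress) other).value.equals(value);
    }

    @Override
    public int hashCode() { return value.hashCode(); }

    @Override
    public String toString() { return value; }
}
```

A constructor such as User(String name, EmailAddress email) then cannot be called with an arbitrary string, and the validation lives in one place instead of being repeated wherever the value is used.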

One of the biggest advantages of objects is the fact that they can have methods. For example, all your data objects (phone number, address, email etc.) can implement the same interface IValidatable with a validate method, which does the obvious. In this case, it would make sense to wrap email in an object as well, since we do want to validate emails. Regarding ID - assuming it's assigned internally by your app, you probably don't need to wrap it.

Java toString for debugging or actual logical use

This might be a very basic question, apologies if this was already asked.
Should toString() in Java be used for actual program logic, or is it only for debugging/human reading? My basic question is: should I be using toString(), or write a different method called asString(), when I need to use the string representation in the actual program flow?
The reason I ask is I have a bunch of classes in a web service that rely on a toString() to work correctly, in my opinion something like asString() would have been safer.
Thanks

Except for a few specific cases, the toString should be used for debugging, not for the production flow of data.
The method has several limitations which make it less suitable for use in production data flow:
Taking no parameters, the method does not let you easily alter the string representation in response to the environment. In particular, it is difficult to format the string in a way that is sensitive to the current locale.
Being part of the java.lang.Object class, this method is commonly overridden by subclasses. This may be harmful in situations when you depend on the particular representation, because the writers of the subclass may have no idea of your restrictions.
The obvious exceptions to this rule are toString methods of the StringBuilder and the StringBuffer classes, because these two methods simply make an immutable string from the mutable content of the corresponding object.
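One way to work around the first limitation is to keep toString() for debugging and add a separate method that takes the locale as a parameter. The sketch below assumes a hypothetical Distance class; the names are illustrative only:

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical class; names are assumptions for illustration.
final class Distance {
    private final double meters;

    Distance(double meters) {
        this.meters = meters;
    }

    double meters() { return meters; }

    // Debug-oriented representation; no caller should depend on its format.
    @Override
    public String toString() {
        return "Distance[meters=" + meters + "]";
    }

    // Explicit representation for display; the caller chooses the locale.
    String format(Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(meters) + " m";
    }
}
```

With this split, new Distance(1234.5).format(Locale.US) and format(Locale.GERMANY) use different grouping and decimal separators, which a parameterless toString() cannot offer.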

It is not just for debugging/human reading only, it really depends on the context in which the object is being used. For example, if you have a table which is displaying some object X, then you may want the table to display a readable textual representation of X in which case you would usually implement the toString() method. This of course is a basic example but there are many uses in which case implementing toString() would be a good idea.

Should you avoid Guavas Ordering.usingToString()?

This question was prompted after reading Joshua Bloch's "Effective Java". Specifically in Item #10, he argues that it is bad practice to parse an object's string representation and use it for anything except a friendlier printout/debug. The reason is that such a use "is error-prone, results in fragile systems that break if you change the format".
To me it looks like Guava's Ordering.usingToString() is a spot on example of this. So is it bad practice to use it?

Well, if the sorting is only used for deciding in which order to display things to a user, I'd argue it's part of "friendlier printout/debug".
If, however, your code's correctness depends on the ordering, then I'd argue that it's indeed a bad idea to depend on toString.

As the author of that method, I would agree: it's really just a crutch. For those "look, I just need an Ordering<Object>, dammit" cases. It should probably be removed, since you can get its behavior with Ordering.onResultOf(Functions.toStringFunction()) anyway.

If your program ever uses toString() for lexical sorting by natural ordering in such a way that program execution depends on it, then it would be wise to override the default toString() in the class concerned. In that case you should make the toString() method final and clearly document that it is used for ordering.
It would however be much better to create another method returning a String and create an ordering depending on that result, possibly by creating a specific Comparator to do the sorting. See for instance the final method name() used for enumerations in Java. In general it creates the same String as toString() but it is still possible to perform ordering with it even if toString() has been overridden.
If you use the last method, then the Ordering.usingToString() would not be of much use of course.
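The "separate method plus Comparator" approach described above could be sketched like this, using only the JDK. Product, code() and BY_CODE are hypothetical names chosen for the example:

```java
import java.util.Comparator;

// Hypothetical class; names are assumptions for illustration.
final class Product {
    // Orders products by their stable code, independent of toString().
    static final Comparator<Product> BY_CODE = Comparator.comparing(Product::code);

    private final String code;
    private final String label;

    Product(String code, String label) {
        this.code = code;
        this.label = label;
    }

    // Stable key intended for program logic such as sorting.
    String code() { return code; }

    // Display text; free to change without affecting the ordering.
    @Override
    public String toString() {
        return label + " (" + code + ")";
    }
}
```

A list can then be sorted with products.sort(Product.BY_CODE), and rewording toString() later leaves the order untouched.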

There are some obvious cases where it actually makes sense like StringBuffer etc. Obviously it doesn't make sense for most "business" classes to depend on toString().
