Encapsulation: why the name keySet in Java

In Java, to get all the keys in a map we can use the method keySet. But I was wondering why the method name is not just keys. Doesn't the name Set leak details about the implementation?
As per my understanding, Java is a statically typed language, and having types in names makes no sense at all. The calling code must have the correct interface type anyway. If we assume that this strategy is correct, then every method would need a type in its name, which doesn't make any sense. I think @JB Nizet stated the reason behind the choice correctly in his comment.

Doesn't the name Set leak details about the implementation?
1) Set is an interface.
If, for example, the method were named keyHashSet(), it would not be a good idea, as HashSet is an implementation of Set.
But keySet() is fine: the return type is an interface that defines a specific contract, and a method name that conveys this contract is a valid way to program to an interface while staying free to return any implementation of Set.
2) Besides, naming the method keys() rather than keySet() in the Map interface was not possible either.
As @JB Nizet suggests in his comment, a keys() method is already defined in a JDK collection class: Enumeration<K> keys().
It is declared in the Dictionary class that Hashtable inherits from:
public class Hashtable<K,V> extends Dictionary<K,V>...{
But this method, Enumeration<K> keys(), doesn't return a Set.
So Hashtable, which is both a Dictionary and a Map:
public class Hashtable<K,V> extends Dictionary<K,V> implements Map<K,V>,
has to implement one method that returns the keys as an Enumeration and another that returns them as a Set.
Distinct names are therefore required to tell them apart.
You may also note that Set<Map.Entry<K, V>> entrySet(), also defined in the Map interface, follows the same naming logic even though no type in Hashtable's hierarchy defines an entries() method.
This is probably for consistency.
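To make the name clash concrete, here is a minimal sketch (the class and helper names are made up for illustration) that calls both methods on the same Hashtable: keys() from Dictionary yields an Enumeration, while keySet() from Map yields a Set.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashSet;
import java.util.Hashtable;
import java.util.Set;

public class KeysVersusKeySet {

    // Builds a small table to query through both APIs.
    static Hashtable<String, Integer> sampleTable() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("one", 1);
        table.put("two", 2);
        return table;
    }

    // Compares the keys seen through Dictionary.keys() and Map.keySet().
    public static boolean sameKeysThroughBothApis() {
        Hashtable<String, Integer> table = sampleTable();
        Enumeration<String> legacyKeys = table.keys();   // Dictionary contract
        Set<String> mapKeys = table.keySet();            // Map contract
        Set<String> collected = new HashSet<>(Collections.list(legacyKeys));
        return collected.equals(mapKeys);
    }

    public static void main(String[] args) {
        System.out.println(sameKeysThroughBothApis()); // prints true
    }
}
```

Both calls expose the same keys; only the return type differs, which is why the two methods could not share one name.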

I was wondering why the method name is not just keys
Like so many things in modern Java, backwards compatibility got in the way a bit: keys() was already taken (by java.util.Hashtable.keys(), and that class should still be able to implement Map), so they had to choose something else.
Doesn't the name Set leak details about the implementation?
No, it does not. The Map interface already specifies that there cannot be duplicate keys. So the collection of keys is a Set by definition. And Set is still an interface for which there can be different implementations.

Doesn't the name Set leak details about the implementation?
The declaration Set<K> keySet() doesn't tell us any more about the implementation than Set<K> keys() would. In both cases, we know that the method returns a Set. Set is an interface, not a concrete class.
Whether it's a good idea to include that in the name is a matter of style (personally, no, I don't think it is; but reasonable people can differ). But doing so doesn't tell us anything about the implementation that we wouldn't know anyway.
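As a rough illustration of that point, the sketch below (class and method names are hypothetical) asks two different Map implementations for their keySet(). The caller only ever sees the Set contract; the concrete classes printed at the end are implementation details and may differ between JDK versions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class KeySetContract {

    // Fills any Map implementation with the same two entries.
    public static Map<String, Integer> fill(Map<String, Integer> map) {
        map.put("a", 1);
        map.put("b", 2);
        return map;
    }

    // Works against the Set contract only; the concrete class never matters here.
    public static int countKeys(Map<String, Integer> map) {
        Set<String> keys = map.keySet();
        return keys.size();
    }

    // Sets compare by content, whatever their implementation.
    public static boolean keySetsAreEqual() {
        Map<String, Integer> hashed = fill(new HashMap<>());
        Map<String, Integer> sorted = fill(new TreeMap<>());
        return hashed.keySet().equals(sorted.keySet());
    }

    public static void main(String[] args) {
        // The class names below are internal and not part of any contract.
        System.out.println(fill(new HashMap<>()).keySet().getClass().getName());
        System.out.println(fill(new TreeMap<>()).keySet().getClass().getName());
        System.out.println(keySetsAreEqual());
    }
}
```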

Related

What class instance does Map.of() return?

I'm currently learning about collections and I noticed that the factory method Map.of returns an object of type Map<>. However, since Map is only an interface, what class does the Map reference actually point to?
It is not guaranteed to return an instance of any specific class, so your code should not depend on that. However, for the sake of learning, OpenJDK 11 returns an instance of ImmutableCollections.Map1, one of several lightweight special-purpose classes written specifically for the Map.of overloads.
The class differs depending on the map size, and you should not rely on it, as per the answer by @chrylis -cautiouslyoptimistic-.
If you are curious about the class of any Map instance your code uses, just print map.getClass(). Here are a few examples:
System.out.println(Map.of().getClass());
System.out.println(Map.of(1,2).getClass());
System.out.println(Map.of(1,2,3,4,5,6).getClass());
Which (in JDK17) prints:
class java.util.ImmutableCollections$MapN
class java.util.ImmutableCollections$Map1
class java.util.ImmutableCollections$MapN
It returns an Immutable Map from the ImmutableCollections class.
This class is not part of the public API, but it extends AbstractMap which supplies implementations for all of the basic methods needed for a Map.
The important takeaway is that the Map returned by Map.of() is immutable: you can't add to or change it after it is created, and mutator methods such as put throw UnsupportedOperationException. Immutable collections are more secure, are thread safe, and can be more efficient.
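A small sketch of what immutability means in practice (the class and helper names are invented for this example; Map.of requires Java 9 or later): any attempt to modify the map fails with UnsupportedOperationException, whereas an ordinary HashMap accepts the same call.

```java
import java.util.HashMap;
import java.util.Map;

public class ImmutableMapDemo {

    // Returns true if the given map refuses to accept a new entry.
    public static boolean rejectsPut(Map<String, Integer> map) {
        try {
            map.put("extra", 99);
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsPut(Map.of("one", 1)));   // prints true
        System.out.println(rejectsPut(new HashMap<>()));    // prints false
    }
}
```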

Why does the Collection interface have equals() and hashCode()?

Why does the Collection interface have equals(Object o) and hashCode(), given that any implementation will have those by default (inherited from Object) ?
From the Collection JavaDoc:
While the Collection interface adds no stipulations to the general contract
for the Object.equals, programmers who implement the Collection
interface "directly" (in other words, create a class that is a
Collection but is not a Set or a List) must exercise care if they
choose to override the Object.equals. It is not necessary to do so,
and the simplest course of action is to rely on Object's
implementation, but the implementor may wish to implement a "value
comparison" in place of the default "reference comparison." (The List
and Set interfaces mandate such value comparisons.)
The general contract for the Object.equals method states that equals
must be symmetric (in other words, a.equals(b) if and only if
b.equals(a)). The contracts for List.equals and Set.equals state that
lists are only equal to other lists, and sets to other sets. Thus, a
custom equals method for a collection class that implements neither
the List nor Set interface must return false when this collection is
compared to any list or set. (By the same logic, it is not possible to
write a class that correctly implements both the Set and List
interfaces.)
and
While the Collection interface adds no stipulations to the general contract for the Object.hashCode method, programmers should take note that any class that overrides the Object.equals method must also override the Object.hashCode method in order to satisfy the general contract for the Object.hashCode method. In particular, c1.equals(c2) implies that c1.hashCode()==c2.hashCode().
To answer your specific question: why does it have these methods? It's done simply for convenience, so that the Javadoc can give hints as to what implementers should do with these methods (e.g. comparing equality of values rather than references).
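The contracts quoted above can be observed directly. The following sketch (names are illustrative only) checks that two different List implementations with the same elements are equal and share a hash code, while a list and a set holding the same elements are never equal to each other.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;

public class CollectionEqualityDemo {

    // List.equals is a value comparison, independent of the implementation class.
    public static boolean listsWithSameElementsAreEqual() {
        Collection<Integer> a = new ArrayList<>(List.of(1, 2, 3));
        Collection<Integer> b = new LinkedList<>(List.of(1, 2, 3));
        return a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Lists are only equal to lists and sets only to sets, in both directions.
    public static boolean listEqualsSet() {
        Collection<Integer> list = new ArrayList<>(List.of(1, 2, 3));
        Collection<Integer> set = new HashSet<>(List.of(1, 2, 3));
        return list.equals(set) || set.equals(list);
    }

    public static void main(String[] args) {
        System.out.println(listsWithSameElementsAreEqual()); // prints true
        System.out.println(listEqualsSet());                 // prints false
    }
}
```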
To add to the other great answers: the equals method is declared in the Collection interface to settle how comparing two collection instances should work. From the Java 8 documentation:
More generally, implementations of the various Collections Framework
interfaces are free to take advantage of the specified behavior of
underlying Object methods wherever the implementor deems it
appropriate.
So methods from the Object class are redeclared for no other reason than to make the Javadoc more definite. This is also why those methods are not counted among the abstract methods of an interface.
Moreover, in Java 8, along the same line of reasoning, default methods that override a method of the Object class are not allowed and generate a compile error. I believe this was done to prevent this type of confusion. So if you try to create a default method called hashCode(), for example, it will not compile.
Here is a more in-depth explanation of this behavior in Java 8 from the Lambda FAQ:
An interface cannot provide a default implementation for any of the
methods of the Object class. This is a consequence of the “class wins”
rule for method resolution: a method found on the superclass chain
always takes precedence over any default methods that appear in any
superinterface. In particular, this means one cannot provide a default
implementation for equals, hashCode, or toString from within an
interface.
This seems odd at first, given that some interfaces actually define
their equals behavior in documentation. The List interface is an
example. So, why not allow this?
One reason is that it would become more difficult to reason about when
a default method is invoked. The current rules are simple: if a class
implements a method, that always wins over a default implementation.
Since all instances of interfaces are subclasses of Object, all
instances of interfaces have non-default implementations of equals,
hashCode, and toString already. Therefore, a default version of these
on an interface is always useless, and it may as well not compile.
Another reason is that providing default implementations of these
methods in an interface is most likely misguided. These methods
perform computations over the object’s state, but the interface, in
general, has no access to state; only the implementing class has
access to this state. Therefore, the class itself should provide the
implementations, and default methods are unlikely to be useful.
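The "class wins" rule from the quote can be seen with an ordinary method. In this sketch (all type names are hypothetical), a class inherits name() both from a superclass and from an interface default; the superclass version is the one that runs. The same rule would make a default equals or hashCode unreachable, since Object always supplies one.

```java
public class ClassWinsDemo {

    interface Named {
        default String name() {
            return "from interface default";
        }
    }

    static class Base {
        public String name() {
            return "from superclass";
        }
    }

    // Inherits name() from both Base and Named; the superclass method wins.
    static class Derived extends Base implements Named {
    }

    public static String resolvedName() {
        return new Derived().name();
    }

    public static void main(String[] args) {
        System.out.println(resolvedName()); // prints "from superclass"
    }
}
```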
Just to add to the great answers above, it makes sense to have the equals or hashCode methods in this scenario:
Collection<Whatever> list1 = getArrayList();
Collection<Whatever> list2 = getAnotherArrayList();
if(list1.equals(list2)){
// do something
}
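Without the equals method in the interface, the argument goes, we would be forced to use concrete types, which is generally not good practice (strictly speaking, every interface implicitly exposes Object's public methods, so the call above would compile either way; redeclaring equals mainly documents the contract):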
ArrayList<Whatever> list1 = getArrayList();
ArrayList<Whatever> list2 = getAnotherArrayList();
if(list1.equals(list2)){
// do something
}
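As a side note, equals is callable through any interface-typed reference, because interfaces implicitly expose Object's public methods. A minimal sketch with a made-up Label interface that declares no equals of its own:

```java
public class InterfaceEqualsDemo {

    // Hypothetical interface that does not mention equals or hashCode.
    interface Label {
        String text();
    }

    static final class SimpleLabel implements Label {
        private final String text;

        SimpleLabel(String text) {
            this.text = text;
        }

        @Override
        public String text() {
            return text;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof SimpleLabel && ((SimpleLabel) o).text.equals(text);
        }

        @Override
        public int hashCode() {
            return text.hashCode();
        }
    }

    public static boolean sameLabel(String a, String b) {
        Label first = new SimpleLabel(a);
        Label second = new SimpleLabel(b);
        // Compiles although Label never declares equals; dispatches to SimpleLabel.equals.
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println(sameLabel("x", "x")); // prints true
        System.out.println(sameLabel("x", "y")); // prints false
    }
}
```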

What is the difference in these two declarations?

List<String> someName = new ArrayList<String>();
ArrayList<String> someName = new ArrayList<String>();
Does it impact anything on performance?
The first one is a List of Objects and the latter one is an ArrayList of Objects. Correct me if I am wrong. I got confused because ArrayList implements the List interface.
Why do people declare it like this? Does it help in any situations?
When I am receiving some email addresses from a DB, what is the best way to collect them? A List of email address objects?
Finally, one unrelated question: can an interface have two methods with the same name and signature, or the same name with different signatures?
The difference between the declarations is more one of style. It is preferable to declare variables using the abstract type rather than the concrete implementation, because you can change the implementation choice later without changing the variable type. For example, you might change the List to use a LinkedList instead.
If you always use the abstract type (interface or abstract class) wherever you can, especially in method signatures, the client code is free to use whatever implementation they prefer. This makes the code more flexible and easier to maintain.
This is true even of variable declarations. Consider this:
public abstract class MyListUsingClass {
    private List<String> list;

    protected MyListUsingClass(List<String> list) {
        this.list = list;
    }
    ...
}
If the variable list were declared as ArrayList, then only ArrayLists would be accepted in the constructor. This would be a poor choice: always try to let the client code choose the implementations it wants to use.
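A runnable variation on that idea, as a sketch only (NameRegistry and its methods are invented for this example): because the constructor accepts List, callers can hand in an ArrayList, a LinkedList, or any other implementation without the class changing.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class NameRegistry {
    private final List<String> names;

    // Accepting the interface type leaves the implementation choice to the caller.
    public NameRegistry(List<String> names) {
        this.names = names;
    }

    public void register(String name) {
        names.add(name);
    }

    public int count() {
        return names.size();
    }

    public static void main(String[] args) {
        NameRegistry arrayBacked = new NameRegistry(new ArrayList<>());
        NameRegistry linkedBacked = new NameRegistry(new LinkedList<>());
        arrayBacked.register("alice");
        linkedBacked.register("bob");
        System.out.println(arrayBacked.count() + " " + linkedBacked.count()); // prints "1 1"
    }
}
```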
Regarding your last question: interfaces have the same restrictions for methods as classes do, so yes, you can overload methods.
There is no performance impact, because at runtime you are dealing with the same class (ArrayList) in both cases.
They are both lists of Strings. The difference is that the first one is declared as a List but initialized as an ArrayList, which is a more specific type of List.
One instance where it helps is when you use an IDE with context-sensitive suggestions (Eclipse, NetBeans, etc). In the first case, whenever you use the suggestion feature, you will only see the members of the List interface. In the second, you will see all (public) members of ArrayList. In any given programming situation, as long as the more abstract type provides the functionality you need, you want to use that because it makes your code more robust: the more abstract a type is, the less likely it is to change in some future release of the API.
The best way to represent anything always depends on what you intend to use the data for and how much of it there is. Probably a List or a Set of javax.mail.internet.InternetAddress will fit the bill.
An interface can have two methods with the same name only if they have different parameter type signatures. Two methods which both take a single string cannot have the same name even if the parameters have different names, nor can you have two methods with the same name which differ only in return type.
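The overloading rules just described, sketched with a hypothetical Greeter interface. The commented-out declarations show the two cases that would be rejected by the compiler.

```java
public class OverloadDemo {

    // Hypothetical interface with two overloads of the same method name.
    interface Greeter {
        String greet(String name);

        String greet(String first, String last); // fine: different parameter list

        // String greet(String other); // would not compile: only the parameter name differs
        // int greet(String name);     // would not compile: differs only in return type
    }

    static class PlainGreeter implements Greeter {
        @Override
        public String greet(String name) {
            return "Hello, " + name;
        }

        @Override
        public String greet(String first, String last) {
            return greet(first + " " + last);
        }
    }

    public static String greetOne(String name) {
        return new PlainGreeter().greet(name);
    }

    public static String greetTwo(String first, String last) {
        return new PlainGreeter().greet(first, last);
    }

    public static void main(String[] args) {
        System.out.println(greetOne("Ada"));             // prints "Hello, Ada"
        System.out.println(greetTwo("Ada", "Lovelace")); // prints "Hello, Ada Lovelace"
    }
}
```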
In the first case you're declaring a variable of type List and using an ArrayList as its implementation.
In the second case you're declaring the variable as an ArrayList and initializing it with one.
The difference is that, using the interface type (as in the first case), you will access only those methods defined in the List interface, and if ArrayList has some specific implementation methods, in order to access them you will need to cast your list to its sub-type (ArrayList).
In the second case, you're using a more specific type, so no cast is needed at all.
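For instance, trimToSize() is declared on ArrayList but not on List. A minimal sketch (class and method names are illustrative) of reaching it through a List-typed variable:

```java
import java.util.ArrayList;
import java.util.List;

public class CastDemo {

    public static int addAndTrim() {
        List<String> names = new ArrayList<>();
        names.add("a");
        names.add("b");
        // names.trimToSize(); // would not compile: List does not declare trimToSize
        if (names instanceof ArrayList) {
            // The cast exposes the ArrayList-specific method.
            ((ArrayList<String>) names).trimToSize();
        }
        return names.size();
    }

    public static void main(String[] args) {
        System.out.println(addAndTrim()); // prints 2
    }
}
```

Needing such a cast is usually a hint that the variable should have been declared with the concrete type in the first place, or that the specific method is not really required.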
Performance - probably not.
Actually they are lists of Strings, not Objects. Whether the variable is declared with the interface type has no bearing on what the collection holds.
Declaring a variable with the supertype can be useful if you would like to make your code independent of the concrete list implementation. If someday you would like to switch to a LinkedList implementation, this won't be so harmful to the rest of your code.
Create a new type EMail and store instances in some kind of list (e.g. the mentioned LinkedList or ArrayList) or just an array (EMail[]). If you provide more information, this could be more helpful.
Edit:
2. In both cases they are ArrayLists of Strings. The difference is that in the first case you're upcasting to the supertype (losing direct access to some methods specific to ArrayList).
Does it impact anything on performance? No measurable impact. Your code will be the source of your performance issues, not nano-optimizations like this.
The first one is a List of Objects and the latter one is an ArrayList of Objects. Correct me if I am wrong. I got confused because ArrayList implements the List interface. Exactly. You can assign a class reference to any of the types that it implements.
Why do people declare it like this? Does it help in any situations? The reason you might want to is in case you want to change your implementation to use another concrete class that implements List, e.g. LinkedList.
When I am receiving some email addresses from a DB, what is the best way to collect them? A List of email address objects? Define "best". Depends on how you'll use them. Strings might be sufficient; perhaps a better abstraction would work for you.
Finally, one unrelated question: can an interface have two methods with the same name and signature, or the same name with different signatures? Interfaces define signatures, not implementation. You can have two interfaces with methods that define the same signature, but there can only be one implementation when you execute. If you have Cowboy and Artist interfaces, both with void draw() methods, the class that implements both will have to decide what the single implementation will be. There can't be one for Cowboy and another for Artist, because interfaces don't have any notion of implementation.
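The Cowboy and Artist case, written out as a sketch (the implementing class and the call counter are additions for illustration): one draw() method satisfies both interfaces, and calls through either reference land in the same implementation.

```java
public class DrawDemo {

    interface Cowboy {
        void draw();
    }

    interface Artist {
        void draw();
    }

    // A single draw() implementation serves both interfaces.
    static class CowboyArtist implements Cowboy, Artist {
        int drawCalls = 0;

        @Override
        public void draw() {
            drawCalls++;
        }
    }

    public static int callThroughBothInterfaces() {
        CowboyArtist person = new CowboyArtist();
        Cowboy asCowboy = person;
        Artist asArtist = person;
        asCowboy.draw();
        asArtist.draw();
        return person.drawCalls;
    }

    public static void main(String[] args) {
        System.out.println(callThroughBothInterfaces()); // prints 2
    }
}
```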

Why do many Collection classes in Java extend the abstract class and implement the interface as well?

Why do many Collection classes in Java extend the Abstract class and also implement the interface (which is also implemented by the given abstract class)?
For example, class HashSet extends AbstractSet and also implements Set, but AbstractSet already implements Set.
It's a way to remember that this class really implements that interface.
It won't have any bad effect and it can help to understand the code without going through the complete hierarchy of the given class.
From the perspective of the type system the classes wouldn't be any different if they didn't implement the interface again, since the abstract base classes already implement them.
That much is true.
The reason they do implement it anyways is (probably) mostly documentation: a HashSet is-a Set. And that is made explicit by adding implements Set to the end, although it's not strictly necessary.
Note that the difference is actually observable using reflection, but I'd be hard-pressed to produce some code that would break if HashSet didn't implement Set directly.
This may not matter much in practice, but I wanted to clarify that explicitly implementing an interface is not exactly the same as implementing it by inheritance. The difference is present in compiled class files and visible via reflection. E.g.,
for (Class<?> c : ArrayList.class.getInterfaces())
System.out.println(c);
The output shows only the interfaces explicitly implemented by ArrayList, in the order they were written in the source, which [on my Java version] is:
interface java.util.List
interface java.util.RandomAccess
interface java.lang.Cloneable
interface java.io.Serializable
The output does not include interfaces implemented by superclasses, or interfaces that are superinterfaces of those which are included. In particular, Iterable and Collection are missing from the above, even though ArrayList implements them implicitly. To find them you have to recursively iterate the class hierarchy.
It would be unfortunate if some code out there uses reflection and depends on interfaces being explicitly implemented, but it is possible, so the maintainers of the collections library may be reluctant to change it now, even if they wanted to. (There is an observation termed Hyrum's Law: "With a sufficient number of users of an API, it does not matter what you promise in the contract; all observable behaviors of your system will be depended on by somebody".)
Fortunately this difference does not affect the type system. The expressions new ArrayList<>() instanceof Iterable and Iterable.class.isAssignableFrom(ArrayList.class) still evaluate to true.
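Walking the hierarchy recursively could look roughly like this. It is a sketch with invented helper names, not a library API; the exact set printed for ArrayList depends on the JDK version, so no output is shown for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

public class AllInterfaces {

    // Collects every interface a type implements, directly or through inheritance.
    public static Set<Class<?>> allInterfaces(Class<?> type) {
        Set<Class<?>> found = new LinkedHashSet<>();
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Class<?> iface : c.getInterfaces()) {
                collect(iface, found);
            }
        }
        return found;
    }

    // Adds an interface and, recursively, all of its superinterfaces.
    private static void collect(Class<?> iface, Set<Class<?>> found) {
        if (found.add(iface)) {
            for (Class<?> parent : iface.getInterfaces()) {
                collect(parent, found);
            }
        }
    }

    // True only if the interface is named in the type's own implements clause.
    public static boolean declaredDirectly(Class<?> type, Class<?> iface) {
        return Arrays.asList(type.getInterfaces()).contains(iface);
    }

    public static void main(String[] args) {
        System.out.println(allInterfaces(ArrayList.class));
        System.out.println(declaredDirectly(ArrayList.class, Collection.class)); // prints false
    }
}
```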
Unlike Colin Hebert, I don't buy that the people who wrote that cared about readability. (Everyone who thinks the standard Java libraries were written by impeccable gods should take a look at their sources. The first time I did this I was horrified by the code formatting and numerous copy-pasted blocks.)
My bet is it was late, they were tired and didn't care either way.
From the "Effective Java" by Joshua Bloch:
You can combine the advantages of interfaces and abstract classes by adding an abstract skeletal implementation class to go with an interface.
The interface defines the type, perhaps providing some default methods, while the skeletal class implements the remaining non-primitive interface methods atop the primitive interface methods. Extending a skeletal implementation takes most of the work out of implementing an interface. This is the Template Method pattern.
By convention, skeletal implementation classes are called AbstractInterface where Interface is the name of the interface they implement. For example:
AbstractCollection
AbstractSet
AbstractList
AbstractMap
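To show how much work a skeletal class saves, here is a small sketch (RangeList is a made-up example class): extending AbstractList and supplying only get and size is enough to obtain a working read-only List, with contains, indexOf, equals, toString and iteration inherited.

```java
import java.util.AbstractList;
import java.util.List;

public class RangeList extends AbstractList<Integer> {
    private final int start;
    private final int size;

    // Represents the integers start, start + 1, ..., start + size - 1.
    public RangeList(int start, int size) {
        this.start = start;
        this.size = size;
    }

    @Override
    public Integer get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        return start + index;
    }

    @Override
    public int size() {
        return size;
    }

    public static void main(String[] args) {
        List<Integer> range = new RangeList(5, 3);
        System.out.println(range);             // prints [5, 6, 7]
        System.out.println(range.contains(6)); // prints true
        System.out.println(range.indexOf(7));  // prints 2
    }
}
```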
I also believe it is for clarity. The Java Collections framework has quite a hierarchy of interfaces that define the different types of collection. It starts with the Collection interface, which is then extended by three main subinterfaces: Set, List and Queue. There is also SortedSet extending Set and BlockingQueue extending Queue.
Now, concrete classes implementing them are more understandable if they explicitly state which interface in the hierarchy they implement, even though it may look redundant at times. As you mentioned, a class like HashSet implements Set, but a class like TreeSet, though it also extends AbstractSet, implements SortedSet instead, which is more specific than just Set. The declaration on HashSet may look redundant, but the one on TreeSet is not, because TreeSet is required to implement SortedSet. Still, both classes are concrete implementations and are more understandable if both follow the same convention in their declarations.
There are even classes that implement more than one collection type like LinkedList which implements both List and Queue. However, there is one class at least that is a bit 'unconventional', the PriorityQueue. It extends AbstractQueue but doesn't explicitly implement Queue. Don't ask me why. :)
(reference is from Java 5 API)
Too late for an answer?
I am taking a guess to support my answer. Assume the following code:
HashMap extends AbstractMap (does not implement Map)
AbstractMap implements Map
Now imagine someone came along and changed implements Map to some java.util.Map1 with exactly the same set of methods as Map.
In this situation there won't be any compilation error and the JDK compiles (of course, tests will fail and catch this).
Now any client using HashMap as Map m = new HashMap() will start failing, much further downstream.
Since AbstractMap, Map etc. come from the same product, this argument may appear childish (which in all probability it is, or maybe not), but think of a project where the base class comes from a different jar, a third-party library, etc. Then the third party or a different team can change their base implementation.
By implementing the interface in the child class as well, developers try to make the class self-sufficient and proof against API breakage.
In my view, when a class implements an interface it has to implement all the methods present in it (as by default they are public and abstract methods in an interface).
If we don't want to implement all the methods of an interface, the class must be abstract.
So here, if some methods are already implemented in some abstract class implementing a particular interface and we have to extend functionality for other methods that have been left unimplemented, we will need to implement the original interface in our class again to get the remaining set of methods. It helps in maintaining the contractual rules laid down by an interface.
It would result in rework if we were to implement only the interface and override all the methods again with method definitions in our class.
I suppose there might be a different way to handle members of the set, the interface, even when supplying the default operation implementation does not serve as a one-size-fits-all. A circular Queue vs. LIFO Queue might both implement the same interface, but their specific operations will be implemented differently, right?
If you only had an abstract class you couldn't make a class of your own which inherits from another class too.

What's the difference between these two java variable declarations?

public class SomeClass {
private HashSet<SomeObject> contents = new HashSet<SomeObject>();
private Set<SomeObject> contents2 = new HashSet<SomeObject>();
}
What's the difference? In the end they are both a HashSet, aren't they? The second one looks just wrong to me, but I have seen it frequently used, accepted and working.
Set is an interface, and HashSet is a class that implements the Set interface.
Declaring the variable as type HashSet means that no other implementation of Set may be used. You may want this if you need specific functionality of HashSet.
If you do not need any specific functionality from HashSet, it is better to declare the variable as type Set. This leaves the exact implementation open to change later. You may find that for the data you are using, a different implementation works better. By using the interface, you can make this change later if needed.
You can see more details here: When should I use an interface in java?
Set is a collection interface that HashSet implements.
The second option is usually the ideal choice as it's more generic.
Since the HashSet class implements the Set interface, it's legal to assign a HashSet to a Set variable. You could not go the other way, however (assign a Set to a more specific HashSet variable).
Set is an interface that HashSet implements, so if you do this:
Set<E> mySet = new HashSet<E>();
You will still have access to the functionality of HashSet, but you also have the flexibility to replace the concrete instance with an instance of another Set class in the future, such as LinkedHashSet or TreeSet, or another implementation.
The first method uses a concrete class, allowing you to replace the class with an instance of itself or a subclass, but with less flexibility. For example, TreeSet could not be used if your variable type was HashSet.
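A short sketch of that flexibility (class and method names are invented here): a variable declared as Set can be re-pointed from a HashSet to a TreeSet, while the commented-out line shows the assignment a HashSet-typed variable would reject.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetSwapDemo {

    public static String sortedView() {
        Set<String> names = new HashSet<>(List.of("pear", "apple", "fig"));
        // Same variable, different implementation: TreeSet keeps elements sorted.
        names = new TreeSet<>(names);
        // HashSet<String> fixed = new TreeSet<>(names); // would not compile: TreeSet is not a HashSet
        return names.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortedView()); // prints [apple, fig, pear]
    }
}
```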
This is Item 52 from Joshua Bloch's Effective Java, 2nd Edition.
Refer to Objects by their interfaces
... You should favor the use of interfaces rather than classes to refer to objects. If appropriate interface types exist, then parameters, return values, variables, and fields should all be declared using interface types. The only time you really need to refer to an object's class is when you're creating it with a constructor...
// Usually Good - uses interface as type
List<T> tlist = new Vector<T>();
// Typically Bad - uses concrete class as type!
Vector<T> vec = new Vector<T>();
This practice does carry some caveats - if the implementation you want has special behavior not guaranteed by the generic interface, then you have to document your requirements accordingly.
For example, Vector<T> is synchronized, whereas ArrayList<T> (also an implementer of List<T>) is not, so if you required synchronized containers in your design (or not), you would need to document that.
One thing worth mentioning is that the interface vs. concrete class rule is most important for types exposed in an API, e.g. a method parameter or return type. For private fields and variables it only ensures you aren't using any methods from the concrete implementation (i.e. HashSet), but then it's private, so it doesn't really matter.
Another thing is that adding another type reference will slightly increase the size of your compiled class. Most people won't care, but these things add up.
