Closed 5 years ago. This question is opinion-based and is not accepting answers.
First I wish to state that I'm aware of the javadoc of java.lang.Object.hashCode(), so there's no need to mention it again.
What I'm asking is: why isn't java.lang.Object.hashCode() moved into a separate interface named (probably) Hashable, like java.util.Comparator?
To me, hashCode() is only used by hash-based data structures like HashMap or Hashtable, which (a) are not used in every application and (b) are usually used with very few types of keys, like String or Integer, and not with an InputStream (or anything like it).
I know that I do not have to implement hashCode() for every class of mine; however, doesn't adding a method to a class cost some performance? Especially for java.lang.Object, the superclass of every class in Java.
Even if special optimization is done within the JVM so that the loss of performance can be ignored, I still think it's unwise to give every Object a behavior that is rarely implemented at all. According to the Interface Segregation Principle:
No client should be forced to depend on methods it does not use.
I did some searches on the web, and the only related page I could find is this.
The first answer expressed (partly) the same idea as mine, and some others tried to answer the question, saying mainly that "hashCode() for every Object enables storage of objects of any type in HashMap", which I find unsatisfactory.
I here propose my own solution, which satisfies both the Interface Segregation Principle and the ability to store anything in a HashMap, without adding much complexity to the whole system:
Remove hashCode() from java.lang.Object.
Let there be an interface Hashable, containing hashCode() with the same contract as the former java.lang.Object.hashCode().
Let there be an interface HashProvider with a type parameter T containing provideHashCode(T t) to provide a hash code for an object. (Think of Comparator<T>).
Let there be an implementation of HashProvider<Object> called DefaultHashProvider which generates the hash code for any Object using the current implementation of Object.hashCode(). (As of Java 8, Object.hashCode() is a native method; I expect DefaultHashProvider.provideHashCode() to return the same thing for any Object.)
Modify the constructors of HashMap and Hashtable so that anything can be stored in them, by:
Using provideHashCode() if a HashProvider is specified.
Using hashCode() if the keys implement Hashable.
Using DefaultHashProvider otherwise.
I believe that this is possible in practice because it's just a variation of the system of Comparable, Comparator and TreeMap.
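A minimal Java sketch of the proposal (all type names come from the question but are hypothetical; none of this exists in the JDK, and System.identityHashCode stands in for the native default):

```java
// Hypothetical: hashCode() pulled out of Object, Comparator-style.
interface Hashable {
    // Same contract as the former Object.hashCode() would have had.
    int hashCode();
}

interface HashProvider<T> {
    int provideHashCode(T t);
}

// Hashes any Object roughly the way the native Object.hashCode() does today.
final class DefaultHashProvider implements HashProvider<Object> {
    @Override
    public int provideHashCode(Object o) {
        return System.identityHashCode(o);
    }
}

// The lookup rule a modified HashMap constructor would apply.
final class HashResolver {
    static int hashOf(Object key, HashProvider<Object> provider) {
        if (provider != null) {
            return provider.provideHashCode(key); // explicit provider wins
        }
        if (key instanceof Hashable) {
            return ((Hashable) key).hashCode();   // key hashes itself
        }
        return System.identityHashCode(key);      // fallback for everything else
    }
}
```

This mirrors how TreeMap chooses between a supplied Comparator and the keys' own Comparable implementation.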
And let me repeat my question:
Given that the Java development team could surely have come up with a solution similar to mine, is there any good reason for not doing so? Are there advanced considerations that I'm currently unaware of? I have the following hypotheses; does either of them approach the correct answer?
Some language features required by this solution, like generic types, only became available well after the very beginning of Java (generics arrived in 1.5). However, Comparator and Comparable, along with TreeMap, have existed since 1.2; couldn't the technique used to write them be adapted to HashMap?
hashCode() is used somewhere within the JVM, and therefore every Object is required to have a hash code. However, this could also be achieved by using DefaultHashProvider.provideHashCode() (or its native implementation) for non-hashables.
Closed 1 year ago. This question is opinion-based and is not accepting answers.
I'm trying to learn design patterns as good coding practices, and I would like to know whether HashSet is considered a well-written class. For example:
To construct a LinkedHashSet we use the following constructor:
public LinkedHashSet(int initialCapacity, float loadFactor) {
super(initialCapacity, loadFactor, true);
}
Which calls this one:
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
map = new LinkedHashMap<>(initialCapacity, loadFactor);
}
So there is a useless parameter; is it correct to do that?
Also, I see that LinkedHashSet extends HashSet:
public class LinkedHashSet<E> extends HashSet<E>
and HashSet references LinkedHashSet in its own code. Why isn't there something like a compile-time recursion there?
// Create backing HashMap
map = (((HashSet<?>)this) instanceof LinkedHashSet ?
new LinkedHashMap<E,Object>(capacity, loadFactor) :
new HashMap<E,Object>(capacity, loadFactor));
This is heavily opinion-based.
In a lot of situations the programmers had to make hard choices, and with so many teams at work, it's natural that some of the following problems occur.
However, here's my opinion on that topic:
Most of the JDK code is a mess: unnecessary code, and convoluted constructs that modern Java JIT compilers will easily do better than.
There are lots of micro-optimizations, but in the end a lot of implementations could be rewritten to be much faster, cleaner, and more compatible.
Simple classes like ArrayList are already a mess, and if you write your own implementation, it will be twice as fast.
The Input/OutputStream system is a mess; it should have had an interface at the very top.
A lot of solutions like ThreadLocal are inherently unsafe to use and can cause leaks if not worse problems.
There's a lot of repetition of code where there shouldn't be
Hugely lacking support for default conversions, especially String to something else; there should be default methods like s.toInt(int default).
The java.util.function interfaces are a mess, with lots of different names that could have been constructed in a far more logical and intuitive way, along with all the Collectors and stuff. It could be so much easier.
On the other hand:
The Collections structures (inheritance, interfaces, etc.) are overall pretty well implemented, with only very few features missing
the java.nio stuff is really good
The Exceptions (RuntimeException, Throwable) hierarchies are quite good and provide a stable basis for additional custom classes (that one might prefer to / should use)
Some aspects that do not target the implementation of classes, but the language specification:
how they introduced and integrated Lambdas and Streams (the whole functional show)
Generics and Covariance/Contravariance
Reflection and Annotations
All in all, if you give your default Java a little boost with libraries like Project Lombok (adding AOP and stuff), it is awesome. I really love it, and so far no other language could un-favorite Java (even though I hated Java when I had to learn it).
So, as others have stated:
learn from the code (some techniques are REALLY good)
improve upon them (some are obviously BAD)
And finally to the points you addressed:
The dummy parameter. This is a very rare occurrence, and it exists to disambiguate overloads: the package-private three-argument constructor (which installs a LinkedHashMap as the backing map) would otherwise have exactly the same parameter list as the public two-argument constructor.
concerning the compile-time recursion: the Java compiler is effectively a multi-pass compiler, so the 'recursion' does not play a role. What is a little bad about this implementation - a very theoretical/academic complaint - is that HashSet and LinkedHashSet are now statically bound in BOTH directions. LinkedHashSet -> HashSet would be fine (how else would inheritance be implemented!?), but HashSet -> LinkedHashSet is a bit of a bugger. But, well, it's academic, because those two classes are in the same package, and not even the new module system will rip them apart. So unless you write a packaging tool that discerns on such a low level (like I did), this point has no practical impact.
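The dummy-parameter trick can be reconstructed in miniature (class names here are made up for illustration; the real pair is HashSet/LinkedHashSet):

```java
// Two constructors that would otherwise have identical parameter lists
// are disambiguated by an unused boolean.
class BaseSet {
    final String backing;

    // Public constructor: plain backing structure.
    public BaseSet(int initialCapacity, float loadFactor) {
        this.backing = "hash";
    }

    // Package-private constructor, only for the linked subclass: same
    // int/float inputs but a different backing structure. Without the
    // dummy boolean, this would clash with the constructor above.
    BaseSet(int initialCapacity, float loadFactor, boolean dummy) {
        this.backing = "linked";
    }
}

class LinkedSet extends BaseSet {
    public LinkedSet(int initialCapacity, float loadFactor) {
        super(initialCapacity, loadFactor, true); // selects the 3-arg overload
    }
}
```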
Closed 8 years ago. This question is opinion-based and is not accepting answers.
I've seen some legacy code that uses a length property on some objects, and other code that uses a length() method. Currently I'm working with a NodeList from the org.w3c.dom package, and I found that it has a getLength() method to get the number of elements.
My question is: how can I, as a Java developer, know when to use length, length(), size(), or getLength()? Obviously it depends on the object type, and the API is there to read... but the point is how the Java developers select which of these to implement in their classes.
Note: In the question When to use .length vs .length(), Makoto's answer indicates that .length is a property on arrays, not a method call, while length() is a method call on String. But what is the reason? Why not always use either a method or a property, to maintain consistency across the whole API?
how would Java developers select which of [the methods] to implement in their classes?
When you implement classes that contain other objects, it's almost always going to be size(), the method provided by the Collection interface.
As far as other choices go, you should avoid exposing member variables, even final ones, because they cannot be accessed through an interface. Java gets away with it for arrays because of some JVM trickery, but you cannot do the same. Hence, length should be out: it remains in Java because it's not possible to change something that fundamental that has been in the language from day one, but it's definitely not something one should consider when designing new classes.
When you implement your own type that has a length (say, a rectangle or a line segment), you should prefer getLength() to length() because of JavaBeans naming conventions.
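As an illustration, here is a hypothetical class following that convention, where the "length" property is computed rather than stored:

```java
// A read-only "length" bean property exposed through a get-prefixed
// accessor; the value is computed on demand.
class LineSegment {
    private final double x1, y1, x2, y2;

    LineSegment(double x1, double y1, double x2, double y2) {
        this.x1 = x1;
        this.y1 = y1;
        this.x2 = x2;
        this.y2 = y2;
    }

    // Frameworks that discover bean properties reflectively will see this
    // as a property named "length"; a bare public field could neither be
    // computed on demand nor be exposed through an interface.
    public double getLength() {
        return Math.hypot(x2 - x1, y2 - y1);
    }
}
```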
obviously it depends on the object type and the API is there to read...
You already have answered your question yourself: look in the API documentation of whatever class you are using.
but the point is how the Java developers select which of these to implement in their classes.
The classes in Java's standard library have been developed over a long period of time by different people, who did not always make the same choice for the names of methods, so there are inconsistencies and unfortunately you'll just have to live with that.
There is no clear rule, otherwise we wouldn't see such a mixup in the JDK itself. But here are some things to consider when making such a design decision.
Don't worry too much. It is a minor thing and won't make much of a difference, so if you spend longer than 5 minutes thinking about this, you are probably wasting money already.
Use getters when frameworks need them. Many frameworks depend on the getter style. If you need or want such frameworks to work nicely with your class, it might be beneficial to use that style.
Shorter is better. The 'get' part doesn't increase clarity; it just adds a few characters of noise to the source code. So if you don't need it for some reason, don't use it.
Methods are easier to evolve. Length is often a quantity that is not set directly but somehow computed. If you hide that behind a method it gives you the flexibility to change that implementation later on, without changing the API.
Direct field access should be a tiny bit faster, but unless you are working on high-volume online trading or something, the difference isn't even worth thinking about. And if you are, you should do your own measurements before making a decision. The HotSpot compiler will almost surely inline the method call anyway.
So if there aren't any external forces driving you in a different direction, I would go with length().
According to OOP principles, length should be an attribute and getLength() should be a method. Also, the length attribute should be encapsulated and exposed through methods, so getLength() sounds more appropriate.
Unfortunately not all Java library classes follow standards. There are some exceptions, and this is one of them.
In a pure OO language it would probably always be a method like length(), so that in a class hierarchy you can override the length attribute.
But Java is not pure OO, and the main reason for a field (.length) versus a method (length()) is/was performance.
And even Sun/Oracle programmers did some bad class design.
Closed 9 years ago. This question needs to be more focused and is not accepting answers.
Yesterday I attended an interview at a leading IT services company. The technical interview went well, no issues; then I moved on to another round about management, design and process. I answered everything except the question below.
Question asked by interviewer:
Let's say you are developing a class which I am going to consume in my
class by extending it. What are the key points you keep in
mind? E.g., class A has a method called "method A" that returns a collection,
let's say "list". What precautions will you take?
My answer: I said I would consider the following points:
The class and method need to be public
If the method returns a list, it should be generic, so we can avoid a ClassCastException
If this class will be accessed in a multi-threaded environment, the method needs to be synchronized.
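The point about generics can be illustrated with a sketch (class and method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

// A raw list compiles but defers type errors to run time; a generic
// list catches the same mistake at compile time.
class Repository {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List rawNames() {
        List names = new ArrayList();
        names.add("alice");
        names.add(42); // wrong element type slips in silently
        return names;
    }

    static List<String> typedNames() {
        List<String> names = new ArrayList<>();
        names.add("alice");
        // names.add(42);   // would be a compile-time error here
        return names;
    }
}
```

Casting an element of rawNames() to String blows up only when the code runs; with typedNames() the caller needs no cast at all.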
But the interviewer wasn't convinced by my points. He was expecting a different answer from me, but I am not able to get into his thought process and see what he was expecting.
So please provide your suggestions.
I would want you to hold to the design principles of Single Responsibility, Open/Closed, and Dependency Injection. Keep it stateless, simple, and testable. Make sure it can be extended without needing to be changed.
But then, I wasn't interviewing you.
A few more points which haven't been mentioned yet would be:
Decent documentation for your class so that one doesn't have to dig too deep into your code to understand what functionality you offer and what are the gotchas.
Try extending your own class before handing it out to someone else. This way, you personally can feel the pain if your class is not well designed, and thereby can improve it.
If you are returning a list or any collection, one important question you need to ask is: can the caller modify the returned collection? Is the returned list a direct representation of the internal state of your class? In that case, you might want to return a copy to avoid callers messing up your internal state, i.e. maintain proper encapsulation.
Plan the visibility of methods. Draw an explicit line between public, protected, package-private and private methods. Ensure that you don't expose any more than you actually want to. Removing features is hard: if something is missing from your well-designed API, you can add it later, but if you expose a slew of useless public methods, you really can't upgrade your API without deprecating them, since you never know who else is using them.
If you are returning a collection, the first thing you should think about is should I protect myself from the caller changing my internal state e.g.
List list = myObject.getList();
list.retainAll(list2);
// Now I have all the elements in common between list and list2.
The problem is that myObject may not expect you to destroy the contents of the list it returned.
Two common ways to fix this are to take a defensive copy or to wrap the collection with Collections.unmodifiableXxxx(). For extra paranoia, you might do both.
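Both defenses can be sketched as follows (the Roster class is hypothetical; the Collections call is the standard JDK one):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class Roster {
    private final List<String> names =
            new ArrayList<>(Arrays.asList("alice", "bob"));

    // Defensive copy: callers may do anything with the result without
    // touching the internal list.
    List<String> namesCopy() {
        return new ArrayList<>(names);
    }

    // Unmodifiable view: no copy is made, and mutation attempts fail
    // fast with UnsupportedOperationException.
    List<String> namesView() {
        return Collections.unmodifiableList(names);
    }
}
```

The copy costs memory per call but fully decouples the caller; the view is cheap but still reflects later internal changes, so pick based on how the result will be used.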
The way I prefer to get around this is to avoid returning the collection at all. You can return a count and a method to get the n-th value or for a Map return the keys and provide a getter, or you can allow a visitor to each element. This way you don't expose your collection or need a copy.
The question is very generic, but I want to add a few points:
Except for the methods you want to expose, make all other methods and variables private. The whole point is to keep visibility to a minimum.
Wherever possible make it immutable; this will reduce overhead in a multithreaded environment.
You might want to evaluate whether serializability should be supported. If not, then don't provide a default constructor. And if it is serializable, then do evaluate the serialized proxy pattern.
Closed 9 years ago. This question is opinion-based and is not accepting answers.
Is it bad practice to directly manipulate data like:
Sorter.mergeSort(testData); //(testData is now sorted)
Or should I create a copy of the data and then manipulate and return that, like:
sortedData = Sorter.mergeSort(testData); // (sortedData is now sorted and testData remains unsorted)?
I have several sorting methods and I want to be consistent in the way they manipulate data. With my insertionSort method I can work directly on the unsorted data; however, if I want to leave the unsorted data untouched, I would have to create a copy of it inside insertionSort and manipulate and return that (which seems rather unnecessary). On the other hand, my mergeSort method needs to create a copy of the unsorted data one way or another, so I ended up doing something that also seems rather unnecessary as a workaround to returning a new sorted list:
List<Comparable> sorted = mergeSortHelper(target);
target.clear();
target.addAll(sorted);
Please let me know which is the better practice, thanks!
It depends whether you're optimising for performance or functional purity. Generally in Java functional purity is not emphasised; for example, Collections.sort sorts the list you give it (even though it's implemented by making an array copy first).
I would optimise for performance here, as that seems more like typical Java, and anyone who wants to can always copy the collection first, like Sorter.mergeSort(new ArrayList<>(testData));
The best practice is to be consistent.
Personally I prefer my methods to not modify the input parameters since it might not be appropriate in all situations (you're pushing the responsibility onto the end user to make a copy if they need to preserve the original ordering).
That being said, there are clear performance benefits of modifying the input (especially for large lists). So this might be appropriate for your application.
As long as the functionality is clear to the end user you're covered either way!
In Java I usually provide both options (when writing re-usable utility methods, anyway):
/** Return a sorted copy of the data from col. */
public <T extends Comparable<T>> List<T> mergeSort(Collection<T> col);
/** Sort the data in col in place. */
public <T extends Comparable<T>> void mergeSortIn(List<T> col);
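A runnable sketch of the two flavours, with the JDK's Collections.sort standing in for a hand-written merge sort (method names are illustrative):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Sorters {
    // Pure variant: the input list is left untouched.
    static <T extends Comparable<T>> List<T> sortedCopy(List<T> input) {
        List<T> copy = new ArrayList<>(input);
        Collections.sort(copy);
        return copy;
    }

    // In-place variant: mutates its argument and returns nothing,
    // which signals the side effect to callers.
    static <T extends Comparable<T>> void sortIn(List<T> input) {
        Collections.sort(input);
    }
}
```

The void return on the in-place variant is deliberate: returning the same list would invite callers to assume it is a copy.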
I'm making some assumptions re the signatures and types here. That said, the Java norm is - or at least, has been* - generally to mutate state in place. This is often a dangerous thing, especially across API boundaries - e.g. changing a collection passed to your library by its 'client' code. Minimizing the overall state-space and mutable state in particular is often the sign of a well designed application/library.
It sounds like you want to re-use the same test data. To do that I would write a method that builds the test data and returns it. That way, if I need the same test data again in a different test (i.e. to test your mergeSort() / insertionSort() implementations on the same data), you just build and return it again. I commonly do exactly this in writing unit tests (in JUnit, for example).
Either way, if your code is a library class/method for other people to use you should document its behaviour clearly.
Aside: in 'real' code there shouldn't really be any reason to specify that merge sort is the implementation used. The caller should care what it does, not how it does it - so the name wouldn't usually be mergeSort(), insertionSort(), etc.
(*) In some of the newer JVM languages there has been a conscious movement away from mutable data. Clojure strongly discourages mutable state: its core data structures are immutable, with mutation confined to a few managed reference types (at least in normal, single-threaded application development). Scala provides a parallel set of collection libraries that do not mutate the state of collections. This has major advantages in multi-threaded, multi-processor applications, and it is not as time/space expensive as might be naively expected, due to the clever algorithms the collections use.
In your particular case, modifying the "actual" data is more efficient. You are sorting data, and it is observed that it's more efficient to work on sorted data than on unsorted data, so I don't see why you should keep the unsorted data. Check out Why is it faster to process a sorted array than an unsorted array?
Mutable objects can be manipulated in place by the function, like Arrays#sort.
But immutable objects (like String) can only return "new" objects, like String#replace.
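A tiny demonstration of that difference, using only standard-library calls:

```java
import java.util.Arrays;

// Arrays#sort mutates its argument; String#replace must return a new
// object, because Strings are immutable.
class MutabilityDemo {
    public static void main(String[] args) {
        int[] nums = {3, 1, 2};
        Arrays.sort(nums);              // nums itself is now {1, 2, 3}

        String s = "banana";
        String t = s.replace('a', 'o'); // s is unchanged; t is "bonono"
        System.out.println(Arrays.toString(nums) + " " + s + " " + t);
    }
}
```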
Closed 10 years ago as a possible duplicate of: Coding to interfaces?
I am reading the Collections tutorial from Java, and it strongly recommends that code referring to collections be written against the interface type and not the actual implementation type, e.g.:
Set<String> s = new HashSet<String>();
It says that this will give me the flexibility to change the implementation later on, should I decide to:
Set<String> s = new TreeSet<String>();
Other than flexibility, are there any other benefits to implementing a collection using its interface type?
Yes: when using the interface type, you will only have access to the core methods it declares, and these methods are meant to be intuitive. When using the implementation class, you might see more methods that could confuse you or be abused.
For example, a collection interface will have a method that returns the number of elements, let's say size(), but the implementation class might also provide a capacity() method that tells you how big the underlying array is.
But as the tutorial tells you, the most important reason is that you can change the implementation without any effort. Changing the implementation can be interesting for performance optimization in very specific cases.
I think here it's more than just Collections, it's about polymorphism: better to use the interface as the declared type, because you may change implementation and/or concrete class to be bound at runtime. (discussion here could be longer - there are plenty of documents/tutorials about this - Java basics)
The interface type often declares fewer methods than an actual implementation, making it tempting to use the latter because it gives access to those extra methods.
However, this then ties your hands in a significant way. For example, if you decide to expose a return type of Vector from a public method of a class that uses a vector, but later realize that your module would be better served with a LinkedList, you now have some problems -- this will break anything which uses the method returning a vector.
On the other hand, if in the first place you had used a return type of List then there would be no problem -- you could switch the internal Vector to a LinkedList, or implement your own thing that fulfills interface List. In my experience, this is a common event (of course, if you make it difficult or impossible, then it will happen less).
So, unless you have a specific reason to do otherwise (eg, you need to provide access to methods only available with vectors), always use the generic interface type for your return value.
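For example, a hypothetical class whose public return type is the List interface can swap its backing implementation freely, exactly the Vector-to-LinkedList scenario described above:

```java
import java.util.LinkedList;
import java.util.List;
import java.util.Vector;

// Because the public return type is the List interface, the backing
// class can be changed without breaking any caller.
class EventLog {
    private List<String> entries = new Vector<>(); // original choice

    void record(String entry) {
        entries.add(entry);
    }

    // Callers compile against List, never against Vector.
    List<String> entries() {
        return entries;
    }

    // Later: a different implementation, and no caller code changes.
    void switchToLinkedList() {
        entries = new LinkedList<>(entries);
    }
}
```

Had entries() returned Vector, the switch would have been a breaking API change.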
I realize this is still about flexibility, but it is not clear from your post whether you understand how important that is. If you are asking for a better reason to use interfaces - i.e. "I don't care about flexibility and want to use specific types; is that okay?" - the answer is that you should care about flexibility ;) As others have said, this is an important fundamental of good Java programming.