After working on a Java project for some time then coming back to C#, I've found myself really missing AutoValue. Specifically, I'd like the ability to:
Produce an immutable value class with minimal boilerplate.
Have things like equality and hash code automatically handled for me.
Ideally, have it automatically generate a builder to allow fluent construction and arbitrary validation like "if you give parameter A, you must also give B".
In the same vein, a toBuilder()-style function to make a deep copy of an existing instance while making some modifications.
All of that would have been really easy with AutoValue. Is there anything similar? I could, of course, implement all that functionality myself, but it's a lot of boilerplate, making it harder to maintain and more error-prone.
From what you've described, you'll need C# 9 record types to get the equivalent of Java's AutoValue, i.e. in C# 9 you can declare:
public record Person
{
public string FirstName { get; init; }
public string LastName { get; init; }
}
You'll then get:
Immutable behaviour
The benefit of C#'s object initialiser syntax
An automatic default implementation of equality and hashcode
With-expressions, which copy an instance while allowing selected property values to be changed during the copy.
In the interim (C#8 and prior), you'll need to do some of this by hand, i.e.
Declare your class properties as get only
Initialise all properties via a constructor
Create your own static factory / builder methods
Use the code generation tools in your IDE to generate equality members
As an aside, if you have just switched from Java to C#, you may not be aware of structs as value types for trivial 'records', which from the docs:
Structs are best suited for very small data structures that contain primarily data that is not intended to be modified after the struct is created.
Although structs do have a default implementation of value equality, it can be unacceptable: the default hash code may be computed from just the first field, and you'd need to provide your own operator == if you want to use == for value equality.
That said, the use cases for structs must be carefully considered; they should generally be used only for trivial immutable records, or for performance reasons when used in arrays.
Related
I was wondering, when constructing an object, is there any difference between a setter returning this:
public User withName(String name) {
this.name = name;
return this;
}
and a builder (for example one which is generated by Builder Generator plugin for IDEA)?
My first impression is that a setter returning this is much better:
it uses less code - no extra class for builder, no build() call at the end of object construction.
it reads better:
new User().withName("Some Name").withAge(30);
vs
User.UserBuilder.anUserBuilder().withName("Some Name").withAge(30).build();
Then why to use builder at all? Is there anything I am missing?
The crucial thing to understand is the concept of an immutable type.
Let's say I have this code:
public class UnitedStates {
private static final List<String> STATE_NAMES =
Arrays.asList("Washington", "Ohio", "Oregon", "... etc");
public static List<String> getStateNames() {
return STATE_NAMES;
}
}
Looks good, right?
Nope! This code is broken! See, I could do this, whilst twirling my moustache and wielding a monocle:
UnitedStates.getStateNames().set(0, "Turtlia"); // Haha, suck it washington!!
and that will work. Now for ALL callers, apparently there's some state called Turtlia. Washington? Wha? Nowhere to be found.
The problem is that Arrays.asList returns a mutable object: There are methods you can invoke on this object that change it.
Such objects cannot be shared with code you don't trust, and given that you don't remember every line you ever wrote, you can't trust yourself in a month or two, so you basically can't trust anybody. To write this code properly, all you have to do is use List.of (Java 9+) instead of Arrays.asList, because List.of produces an immutable object. It has zero methods that change it. It seems like it has such methods (it has a set method!), but try invoking it. It won't work: you'll get an UnsupportedOperationException, and crucially, the list does not change. It is in fact impossible to change it. Fortunately, String is also immutable.
Immutables are much easier to reason about, and can be shared freely with whatever you like without copying.
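To make the difference concrete, here is a minimal sketch (the class name StateNamesDemo and the helper tryRename are made up for illustration) that tries the same set call on a list from Arrays.asList and on one from List.of:

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of the original example.
class StateNamesDemo {

    /** Tries to overwrite the first element; returns whether the list allowed it. */
    static boolean tryRename(List<String> names) {
        try {
            names.set(0, "Turtlia");
            return true;
        } catch (UnsupportedOperationException e) {
            // Immutable lists reject every mutator with this exception.
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> mutable = Arrays.asList("Washington", "Ohio", "Oregon");
        List<String> immutable = List.of("Washington", "Ohio", "Oregon"); // Java 9+

        System.out.println(tryRename(mutable) + " " + mutable.get(0));
        System.out.println(tryRename(immutable) + " " + immutable.get(0));
    }
}
```

The first list ends up starting with "Turtlia"; the second still starts with "Washington", because the set call never got through.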
So, want your own immutable? Great - but apparently the only way to make one, is to have a constructor where all values are set and that's it - immutable types cannot have set methods, because that would mutate them.
If you have a lot of fields, especially if those fields have the same or similar types, this gets annoying fast. Quick!
new Bridge("Golden Gate", 1280, 1937, 2737);
when was it built? How long is it? What's the length of the largest span?
Uhhhhhhh..... how about this instead:
newBridge()
.name("Golden Gate")
.longestSpan(1280)
.built(1937)
.length(2737)
.build();
Sweet. Names! Builders also let you build over time (by passing the builder around to different bits of code, each responsible for setting up their bits). But a BridgeBuilder isn't a Bridge, and each invocation of build() will make a new one, so you keep the general rules about immutability (a BridgeBuilder is not immutable, but any Bridge objects made by the build() method are).
If we try to do this with setters, it doesn't work. Bridges can't have setters. You can have 'withers', set-like methods that create entirely new objects, but calling these 'set' is misleading, and you create both a ton of garbage (rarely relevant, the GC is very good at collecting short-lived objects) and senseless intermediate bridges:
Bridge goldenGate = Bridge.create().withName("Golden Gate").withLength(2737);
Somewhere in the middle of that operation you have a bridge named 'Golden Gate' with no length at all.
In fact, the builder can decide not to let you build() a bridge with no length, by checking for that and throwing if you try. The process of invoking one 'with' method at a time can't do that. At best it can mark a bridge instance as 'invalid', so that any attempt to interact with it, short of calling .withX() methods on it, results in an exception. But that's more effort, and it leads to a less discoverable API (the with methods are mixed up with the rest, and all the other methods appear to throw some state exception that is normally never relevant... that feels icky).
THAT is why you need builders.
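For reference, a hand-written version of such a builder might look roughly like this. It is only a sketch: the accessor names and the exact validation rules in build() are my assumptions, not something the Bridge example above prescribes.

```java
// Immutable value type: all fields final, no setters.
final class Bridge {
    private final String name;
    private final int longestSpan;
    private final int built;
    private final int length;

    private Bridge(Builder b) {
        this.name = b.name;
        this.longestSpan = b.longestSpan;
        this.built = b.built;
        this.length = b.length;
    }

    static Builder builder() { return new Builder(); }

    String name() { return name; }
    int longestSpan() { return longestSpan; }
    int built() { return built; }
    int length() { return length; }

    // Mutable helper: it is not a Bridge, it only knows how to make one.
    static final class Builder {
        private String name;
        private int longestSpan;
        private int built;
        private int length;

        Builder name(String name) { this.name = name; return this; }
        Builder longestSpan(int longestSpan) { this.longestSpan = longestSpan; return this; }
        Builder built(int built) { this.built = built; return this; }
        Builder length(int length) { this.length = length; return this; }

        Bridge build() {
            // Assumed rules: a bridge needs a name and a positive length.
            if (name == null || name.isEmpty()) {
                throw new IllegalStateException("name is required");
            }
            if (length <= 0) {
                throw new IllegalStateException("length must be positive");
            }
            return new Bridge(this);
        }
    }
}
```

With this, Bridge.builder().name("Golden Gate").build() throws instead of handing out a bridge with no length, which is the check the wither chain cannot do.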
NB: Project Lombok's @Builder annotation gives you builders for no effort at all. All you'd have to write is:
import lombok.Value;
import lombok.Builder;
@Value @Builder
public class Bridge {
String name;
int built;
int length;
int span;
}
and Lombok automatically takes care of the rest. You can just Bridge.builder().name("Golden Gate").span(1280).built(1937).length(2737).build();.
The builder is a design pattern used to bring a clear structure to the code. It is also often used to create immutable objects. You can also check preconditions when the build() method is called.
I think your question is better formulated like:
Shall we create a separate Builder class when implementing the Builder Pattern or shall we just keep returning the same instance?
According to the Head First Design Patterns:
Use the Builder Pattern to encapsulate the construction of a product
and allow it to be constructed in steps.
Hence, encapsulation is the important point.
Let's now see the difference in the approaches you have provided in your original question. The main difference is the Design, of how you implement the Builder Pattern, i.e. how you keep building the object:
1. In the separate ObjectBuilder class approach, you keep returning the Builder object, and you return the finished object only(!) once building is complete. This encapsulates the creation process better, because it clearly separates two distinct phases:
1.1) Building the object;
1.2) Finalizing the building, and returning the built instance (this may give you the facility to have immutable built objects, if you eliminate setters).
2. In the approach of just returning this from the same type, you can still modify the object afterwards, which will probably lead to an inconsistent and insecure class design.
It depends on the nature of your class. If your fields are not final (i.e. if the class can be mutable), then doing this:
new User().setEmail("alalal@gmail.com").setPassword("abcde");
or doing this:
User.newBuilder().withEmail("alalal@gmail.com").withPassword("abcde").build();
... changes nothing.
However, if your fields are supposed to be final (which is generally preferable unless they really need to be mutable, because it avoids unwanted modifications), then the builder pattern guarantees that your object will not be constructed until all fields are set.
Of course, you may reach the same result exposing a single constructor with all the parameters:
public User(String email, String password);
... but when you have a large number of parameters, it becomes more convenient and more readable to see each value being set by name before the object is built.
One advantage of a Builder is you can use it to create an object without knowing its precise class - similar to how you could use a Factory. Imagine a case where you want to create a database connection, but the connection class differs between MySQL, PostgreSQL, DB2 or whatever - the builder could then choose and instantiate the correct implementation class, and you do not need to actually worry about it.
A setter function, of course, can not do this, because it requires an object to already be instantiated.
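A rough sketch of that idea follows. Every name in it (DbConnection, MySqlConnection, PostgresConnection, ConnectionBuilder, the vendor strings) is invented for illustration, and the "connections" only describe themselves rather than talk to a real database:

```java
// Hypothetical abstraction the caller programs against.
interface DbConnection {
    String describe();
}

final class MySqlConnection implements DbConnection {
    private final String host;
    MySqlConnection(String host) { this.host = host; }
    @Override public String describe() { return "mysql://" + host; }
}

final class PostgresConnection implements DbConnection {
    private final String host;
    PostgresConnection(String host) { this.host = host; }
    @Override public String describe() { return "postgresql://" + host; }
}

// The builder collects settings and picks the concrete class in build().
final class ConnectionBuilder {
    private String vendor;
    private String host = "localhost";

    ConnectionBuilder vendor(String vendor) { this.vendor = vendor; return this; }
    ConnectionBuilder host(String host) { this.host = host; return this; }

    DbConnection build() {
        if (vendor == null) {
            throw new IllegalStateException("vendor is required");
        }
        switch (vendor) {
            case "mysql":
                return new MySqlConnection(host);
            case "postgresql":
                return new PostgresConnection(host);
            default:
                throw new IllegalArgumentException("unsupported vendor: " + vendor);
        }
    }
}
```

The caller only ever sees DbConnection: new ConnectionBuilder().vendor("mysql").host("db1").build() returns whichever implementation matches, which a chain of setters on an already constructed object could not do.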
The key point is whether the intermediate object is a valid instance.
If new User() is a valid User, and new User().withName("Some Name") is a valid User, and new User().withName("Some Name").withAge(30) is a valid user, then by all means use your pattern.
However, is a User really valid if you've not provided a name and an age? Perhaps, perhaps not: it could be if there is a sensible default value for these, but names and ages can't really have default values.
The thing about a User.Builder is the intermediate result isn't a User: you set multiple fields, and only then build a User.
Are there any advantages of using Tuples instead of creating a new class in Java?
I've seen something like this a few times
return Pair.of(username, password); and I've always wondered what advantages it has over something like return new Credentials(username, password);.
Java doesn't have a (first class) notion of tuples. Some projects and libraries introduce types like Pair or Tuple2/Tuple3/Tuple4/... to make up for it, but this is often considered poor style in Java.
By contrast, returning a clearly-defined type like Credentials provides not just structure but also type safety and meaningful getters for your data, making your code clearer, safer, and easier to work with. The Auto/Value project in particular makes it quick and painless to create value-types, making tuple-esque types all but unnecessary.
A Pair (Apache) is immutable, for one. You cannot change its values after creation. Many people do in fact choose to create their own class and add methods as necessary.
In general it’s considered better practise to make your own class. You can validate parameters and so on and have the ability to add additional functionality if the need arises.
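As an illustration of what a dedicated class buys you, here is a minimal hand-written Credentials value type. The null checks and the choice to keep the password out of toString() are my own assumptions about what you might want, not something the question specifies:

```java
import java.util.Objects;

// Immutable value type with named accessors instead of getLeft()/getRight().
final class Credentials {
    private final String username;
    private final String password;

    Credentials(String username, String password) {
        // Validation lives in one place; a bare Pair cannot do this.
        this.username = Objects.requireNonNull(username, "username");
        this.password = Objects.requireNonNull(password, "password");
    }

    String username() { return username; }
    String password() { return password; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Credentials)) return false;
        Credentials other = (Credentials) o;
        return username.equals(other.username) && password.equals(other.password);
    }

    @Override
    public int hashCode() {
        return Objects.hash(username, password);
    }

    @Override
    public String toString() {
        // Deliberately omits the password so it does not end up in logs.
        return "Credentials{username=" + username + "}";
    }
}
```

This is exactly the boilerplate that tools like AutoValue generate for you, so in practice you would rarely type it out by hand.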
As dimo414 says, the Pair class is often encountered in 3rd party libs; it has two advantages:
it makes defining a separate class for each key/value pairing unnecessary, so you don't need to define a Credentials class. Of course, this should only be used to temporarily store data, not to be used within your implementation model.
Even if you do have a Credentials class already, usually Pair is immutable, while the Credentials class may not be. That means that it may provide setUsername() and setPassword() methods which you don't always want; using a Pair class makes sure both key and value remain unchanged.
Other than documenting it (obviously it should also be documented), using a special return type (I'm wary of limiting myself to an ImmutableX) or having the user find out at runtime, is there any other way of telling the users of an API that the collection they receive from said API is unmodifiable/immutable?
Are there any naming conventions or marker annotations that universally signal the same thing?
Edit: Unmodifiable and immutable do not mean the same thing, but for the purposes of this question, they are similar enough. The question basically boils down to letting the user know that the returned object does not fully honour its contract (ie. some common operations will throw a runtime exception).
Not a general naming convention, but you might be interested in using this @Immutable annotation: http://aspects.jcabi.com/annotation-immutable.html
Besides the documentation purpose, this mechanism will also validate whether your object is really immutable (during its instantiation) and throw a runtime exception if it is not.
A good, if verbose, solution would be to make your own UnmodifiableCollection wrapper class and return it:
public UnmodifiableCollection giveMeSomeUnmodifiableCollection() {
return new UnmodifiableCollection(new LinkedList());
}
The name of the return type alone would be enough to state the unmodifiability of the collection.
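One way such a wrapper could look is sketched below. I've called it UnmodifiableView to avoid suggesting it is an existing library type, and I'm assuming that exposing only read operations (size, contains, iteration) is enough for callers:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical wrapper: the type itself has no add/remove methods at all.
final class UnmodifiableView<T> implements Iterable<T> {
    private final Collection<T> items;

    UnmodifiableView(Collection<? extends T> source) {
        // Defensive copy, so later changes to 'source' do not show through.
        this.items = Collections.unmodifiableCollection(new ArrayList<T>(source));
    }

    int size() { return items.size(); }

    boolean contains(Object o) { return items.contains(o); }

    @Override
    public Iterator<T> iterator() {
        // The iterator of an unmodifiable collection rejects remove().
        return items.iterator();
    }
}
```

Because the mutators simply do not exist on this type, misuse mostly becomes a compile error rather than a runtime surprise; the one remaining runtime failure is Iterator.remove().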
Document it indeed
Provide an API for checking whether a given object is an immutable collection
Return the collection in a wrapper that holds information on whether the collection inside it is mutable or not - my favorite solution
If possible, don't mix mutable and immutable collections, but pick one of them. Results can always be immutable, as they are results - why change them? If there is such a need, it is a single line to copy the collection into a new, mutable one and modify that (e.g. for chain processing)
Writing an @Immutable annotation on the return type of a method is the best approach. It has multiple benefits:
the annotation documents the meaning for users
a tool can verify that client code respects the annotation (that is, that client code does not have bugs)
a tool can verify that the library code respects the annotation (that is, that library code does not have bugs)
What's more, the verification can occur at compile time, before you ever run your code.
If you want verification at compile time, you can use the IGJ Immutability Checker. It distinguishes between
@Immutable references whose abstract value never changes, and
@ReadOnly references upon which side effects cannot be performed.
I am trying to understand where good contracts end and paranoia starts.
Really, I just have no idea what a good developer should care about and what they should leave out :)
Let's say I have a class that holds value(s), like java.lang.Integer. Its instances are aggregated by other objects (MappedObjects), (one-to-many or many-to-many), and often used inside MappedObjects' methods. For performance reasons, I also track these relationships in TreeMap (guava MultiMap, doesn't matter) in addition, to be able to get fast iterations over MappedObjects bound to some range of Integer keys.
So, to keep system in consistent state, I should modify MappedObject.bind(Integer integer) method to update my Map like:
class MappedObject {
public void bind (Integer integer) {
MegaMap.getInstance().remove(fInteger, this);
fInteger = integer;
MegaMap.getInstance().add(fInteger, this);
}
...
private Integer fInteger;
}
I could just make MappedObject an abstract class with this method declared final, forcing others to inherit from it, but that is rude. If I define MappedObject as an interface with a bind() method and provide a skeletal implementation, another developer might later forget to use it and implement the method himself without updating the Map.
Yes, you should force people to do the right thing with your code. A great example of letting people do the wrong thing is the servlet method init( ServletConfig config ) that expected you would store the servlet config yourself but, obviously, a lot of people forgot to store the config and when running their servlets just failed to work.
When defining APIs, you should always follow the open-closed principle: your class should be open for extension and closed for modification. If your class has to work like this, you should only open extension points where they make sense. All the other functionality should not be available for modification, as that could lead to implementation issues in the future.
Try to focus on functionality first and leave all unnecessary things behind. By the way, you can't prohibit reflection, so don't worry too much about misuse. On the other hand, your API should be clear and straightforward, so users have a clear idea of what they should and shouldn't do with it.
I'd say your classes should be designed for as simple use as possible.
If you allow a developer to override methods, you should definitely document the contract as well as possible. In that case the developer opts to override some basic functionality and is thus responsible for providing an implementation that adheres to the contract.
In cases where you don't want the developer to override parts of the functionality - for security reasons, if there is no sensible alternative etc. - just make that part final. In your case, the bind method might look like this:
class MappedObject {
public final void bind (Integer integer) {
MegaMap.getInstance().remove(fInteger, this);
internalBind( integer );
MegaMap.getInstance().add(fInteger, this);
}
protected void internalBind( Integer integer ) {
fInteger = integer;
}
...
private Integer fInteger;
}
Here you'd allow the developer to override the internalBind() method but ensure that bind() will do the mapping.
To summarize: make using and extending classes as easy as (sensibly) possible, and don't make the developer copy lots of boilerplate code (like the map updates in your case) when he just wants to override some basic functionality (like the actual binding).
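Here is a self-contained sketch of that final-method-plus-hook arrangement. Registry is a made-up stand-in for MegaMap, and Bindable and CountingBindable are hypothetical names; note that the overriding subclass still has to call super.internalBind() for the key to be stored:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for MegaMap: tracks which objects are bound to which key.
final class Registry {
    private static final Map<Integer, Set<Object>> BY_KEY = new HashMap<>();

    static void add(Integer key, Object o) {
        BY_KEY.computeIfAbsent(key, k -> new HashSet<>()).add(o);
    }

    static void remove(Integer key, Object o) {
        Set<Object> bound = BY_KEY.get(key);
        if (bound != null) bound.remove(o);
    }

    static boolean contains(Integer key, Object o) {
        Set<Object> bound = BY_KEY.get(key);
        return bound != null && bound.contains(o);
    }
}

class Bindable {
    private Integer key;

    // final: subclasses cannot skip the registry bookkeeping.
    public final void bind(Integer newKey) {
        if (key != null) Registry.remove(key, this);
        internalBind(newKey);
        Registry.add(key, this);
    }

    // Extension point: subclasses may add behaviour around the assignment.
    protected void internalBind(Integer newKey) {
        this.key = newKey;
    }

    public Integer key() { return key; }
}

class CountingBindable extends Bindable {
    int bindCount;

    @Override
    protected void internalBind(Integer newKey) {
        bindCount++;
        super.internalBind(newKey); // still required, or the key is never stored
    }
}
```

Calling bind(1) and then bind(2) on a CountingBindable leaves it registered under 2 only, with bindCount at 2, even though the subclass never touched the registry itself.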
At the very least, you should do everything that prevents bugs but costs no effort.
For example: use primitive types (int) instead of wrappers (Integer) if the variable is not allowed to be null.
So in your bind method, if you do not intend to bind null, use int instead of Integer as the parameter type.
If you think your API users are stupid, you should prohibit wrong usage. Otherwise you should not stand in their way to do things they need to do.
Documentation and good naming of classes and methods should indicate how to use your API.
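import java.util.HashMap;
class MyThing {
    // Initialise the map up front; calling putAll on a null field would throw a NullPointerException.
    protected HashMap<String,Object> fields = new HashMap<>();
    protected MyThing(HashMap<String,Object> newFields){
        fields.putAll(newFields);
    }
    protected Object get(String key){
        return fields.get(key);
    }
}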
Now a little background. I am using this class as a super class to a bunch of different classes which represent objects from an XML file. This is basically an implementation of an API wrapper and I am using this as an adapter between the parsed XML from an API and a database. Casting is delegated to the caller of the get method. If the subclasses need to do something when they are created or when they return a variable, they just call super and then manipulate what gets returned afterwards. eg.:
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Event extends MyThing {
    public Event(HashMap<String,Object> newFields){
        super(newFields);
        // Removes anything after an @ symbol in returned data
        Pattern p = Pattern.compile("\\@.*$");
        Matcher m = p.matcher((String)fields.get("id"));
        boolean result = m.find();
        if (result)
            fields.put("id", m.replaceFirst(""));
    }
    @Override
    public Object get(String key){
        // Delegate to the superclass lookup, then post-process selected keys.
        Object obj = super.get(key);
        if (key.equals("name")){
            return "Mr./Mrs. " + ((String)obj);
        }
        return obj;
    }
}
The reason I feel like I should do this is so I don't have to write getId, getName, getWhatever methods for every single subclass just because they have different attributes. It would save time and it is pretty self explanatory.
Now this is obviously "unJavalike" and more like a ducktyped language way of doing things, but is there a logical reason why I should absolutely not be doing this?
If you're going to this level of complexity and mucking up your object model just because you don't want to have getters and setters, do it in Groovy instead.
Groovy is a duck typed dynamic language on the JVM that accepts 98% of valid Java code, so you already know most of the language (you don't lose functionality)...there are "more idiomatic" ways of doing things, but you can pick those up with time. It also already has a built in XmlSlurper, which probably does most of what you're trying to do anyway.
As for the "reasons why you shouldn't", you're introducing all types of maintainability concerns.
New classes will always have to derive from the base class.
They will have to implement a constructor that always calls a base constructor
They will have to override get() [which you're basically using to encapsulate your getters and setters anyway, why not just add that method and delegate to those other methods] and write specific logic which is likely to degrade with time.
Why shouldn't you? It'll work, right? Sure. But it's poor engineering in that you're either creating a maintenance nightmare, or reinventing the wheel and likely to do it wrong.
Obviously, it's not type safe.
Future maintainers won't know what the types are supposed to be and will get generally confused as to why you're not using POJOs.
Instead of the constant-time access and compact memory footprint of plain fields, you get the time and space characteristics of a HashMap.
It becomes very difficult to write non-trivial getters/setters in future.
Most data binding systems are designed to work with POJOs/Beans (JAXB, JPA, Jackson, etc).
I'm sure there are more, but this will do. Try using some proper OXM libraries and you'll be much better off.
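For comparison, the plain typed version of the Event from the question could be as small as the sketch below. The class name EventRecord and the two-argument constructor are assumptions of mine; in practice an OXM library would populate the fields for you:

```java
// Hypothetical typed counterpart of the map-backed Event class.
final class EventRecord {
    private final String id;
    private final String name;

    EventRecord(String rawId, String name) {
        // Same clean-up as the map version for single-line ids:
        // drop the '@' and everything after it.
        int at = rawId.indexOf('@');
        this.id = (at >= 0) ? rawId.substring(0, at) : rawId;
        this.name = name;
    }

    // Callers get a String back, with no casting and no key typos.
    String getId() { return id; }

    String getName() { return "Mr./Mrs. " + name; }
}
```

Two getters is not much boilerplate, the compiler checks every call site, and tools that expect beans can work with it directly.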