Java Pattern class doesn't have a public constructor, why? - java

I've been reviewing Java Regex Library, surprised by the fact the Pattern class does not have a public constructor which I've taken for granted for years.
One reason I suspect the static compile method is being used in favor of constructor could be that constructor would always return a new object while a static method might return a previously created (and cached) object provided that the pattern string is the same.
However, it is not the case as demonstrated by the following.
public class PatternCompiler {
public static void main(String[] args) {
Pattern first = Pattern.compile(".");
Pattern second = Pattern.compile(".");
if (first == second) {
System.out.println("The same object has been reused!");
} else {
System.out.println("Why not just use constructor?");
}
}
}
Any other strong rationales behind using static method over constructor?
Edit: I found a related question here. None of the answers there convinced me either. Reading through all answers, I get a feeling that a static method has quite a few advantages over a public constructor regarding creating an object but not the other way around. Is that true? If so, I'm gonna create such static methods for each one of my classes and safely assume that it's both more readable and flexible.

Generally, a class won't have a public constructor for one of three reasons:
The class is a utility class and there is no reason to instantiate it (for example, java.lang.Math).
Instantiation can fail, and a constructor can't return null.
A static method clarifies the meaning behind what happens during instantiation.
In the class of Pattern, the third case is applicable--the static compile method is used solely for clarity. Constructing a pattern via new Pattern(..) doesn't make sense from an explanatory point of view, because there's a sophisticated process which goes on to create a new Pattern. To explain this process, the static method is named compile, because the regex is essentially compiled to create the pattern.
In short, there is no programmatic purpose for making Pattern only constructable via a static method.

One possible reason is that this way, caching can later be added into the method.
Another possible reason is readability. Consider this (often cited) object:
class Point2d{
static Point2d fromCartesian(double x, double y);
static Point2d fromPolar(double abs, double arg);
}
Point2d.fromCartesian(1, 2) and Point2d.fromPolar(1, 2) are both perfectly readable and unambiguous (well... apart from the argument order).
Now, consider new Point2d(1, 2). Are the arguments cartesian coordinates, or polar coordinates? It's even worse if constructors with similar / compatible signatures have entirely different semantics (say, int, int is cartesian, double, double is polar).
This rationale applies to any object that can be constructed in multiple different ways that don't differ in just the argument type. While Pattern, currently, can only be compiled from a regex, different representations of a Pattern may come in the future (admittably, then, compile is a bad method name).
Another possible reason, mentioned by #Vulcan, is that a constructor should not fail.
If Pattern.compile encounters an invalid pattern it throws a PatternSyntaxException. Some people may consider it a bad practice to throw an exception from a constructor. Admittably, FileInputStream does exactly that. Similarly, if the design decision was to return null from the compile method, this would not be possible with a constructor.
In short, a constructor is not a good design choice if:
caching may take place, or
the constructor is semantically ambiguous, or
the creation may fail.

This is just a design decision. In this case there is no "real" advantage. However, this design allows optimisation (caching for instance) without changing the API. See http://gbracha.blogspot.nl/2007/06/constructors-considered-harmful.html

Factory methods have several advantages, some of which are already specified in other answers. The advice to consider factory methods instead of constructors is even the very first chapter in the great book "Effective Java" from Joshua Bloch (a must-read for every Java programmer).
One advantage is that you can have several factory methods which have the same parameter signatures but different names. This you can't achieve with constructors.
For example, one might want to create a Pattern from several input formats, all of which are just Strings:
class Pattern {
compile(String regexp) { ... }
compileFromJson(String json) { ... }
compileFromXML(String xml) { ... }
}
Even if you are not doing this when you create the class, factory methods give you the ability to add such methods latter without causing weirdness.
For example, I have seen classes where the need for a new constructor came later and a special meaning-less second parameter had to be added to the second constructor in order to allow overloading. Obviously, this is very ugly:
class Ugly {
Ugly(String str) { ... }
/* This constructor interpretes str in some other way.
* The second parameter is ignored completely. */
Ugly(String str, boolean ignored) { ... }
}
Unfortunately, I can't remember the name of such a class, but I think it even was in the Java API.
Another advantage which has not been mentioned before is that with factory methods in combination with package-private constructors you can prohibit sub-classing for others, but still use sub-classes yourself. In the case of Pattern, you might want to have private sub-classes like CompiledPattern, LazilyCompiledPattern, and InterpretedPattern, but still prohibit sub-classing to ensure immutability.
With a public constructor, you can either prohibit sub-classing for everybody, or not at all.

If you really want to take the deep dive, plunge into the archives of JSR 51.
Regular expressions have been introduced as part of JSR 51, that’s where you might still find the design decisions in their archives, http://jcp.org/en/jsr/detail?id=51

It has a private constructor.
/**
* This private constructor is used to create all Patterns. The pattern
* string and match flags are all that is needed to completely describe
* a Pattern. An empty pattern string results in an object tree with
* only a Start node and a LastNode node.
*/
private Pattern(String p, int f) {
and compile method calls into that.
public static Pattern compile(String regex) {
return new Pattern(regex, 0);
}
Since you are using == comparison which is for references it will not work
The only reason I can think of this behaviour is that the match flag will be defaulted to zero in the compile method which acts a factory method.

Related

Immutable data in Java - static or instance operators?

Imagine any Java class which is entirely immutable. I will use the following as an example:
public class Point2D {
public final int x;
public final int y;
public Point2D(final int x, final int y) {
this.x = x;
this.y = y;
}
}
Now consider adding an operator on this class: a method which takes one or more instances of Point2D, and returns a new Point2D.
There are two possibilities for this - a static method, or an instance method:
public static Point2D add(final Point2D first, final Point2D second) {
return new Point2D(first.x + second.x, first.y + second.y);
}
or
public Point2D add(final Point2D other) {
return new Point2D(this.x + other.x, this.y + other.y);
}
Is there any reason to pick one over the other? Is there any difference at all between the two? As far as I can tell their behaviour is identical, so any differences must be either in their efficiency, or how easy they are to work with as a programmer.
Using a static method prevents two things:
mocking the class with most mocking frameworks
overwriting the method in a subclass
Depending on context, these things can be okay, but they can also create serious grief in the long run.
Thus, me personally, I only use static when there are really good reasons to do so.
Nonetheless, given the specific Point2D class from the question, I would tend to actually use the static methods. This class smells like it should have "value" semantics, so that two points for the same coordinates are equal and have the same hash code. I also don't see how you would meaningfully extend this class.
Imagine for example a Matrix2D class. There it might make a lot of sense to consider subclasses, such as SparseMatrix for example. And then, most likely, you would want to override computation intensive methods!
There is no practical difference between the two. Where it matters most is in the area of OO design and readability.
The static version of the operation seems more aligned with the static factory pattern. In addition to using a common design pattern, it is a clear creational design, which seems to meet its intent: to create a new object.
On the other hand, instance methods creating new objects are very practical when it comes to immutable objects. The best example of this is the String methods (String.concat(string), etc.). In my opinion, this is more a question of practicality (you don't want to mutate the state of the object; you need to augment the it, but the operation has to result in a new instance).
Is there any reason to pick one over the other?
There may be cases where one fits better than the other (for example, I'd prefer the static method to the instance version in a stream pipeline's reduction - as an example), but there is no evident, absolute preference to be claimed here. So...
I would use the static method for factory operations (although I'd call the method something more like create..., newInstance... for clarity)
I would use the instance method for transformations operations that return new instances to avoid mutating the object.
First and foremost, if it is an immutable make it unsubclassable to others. Usually final is used although you can hide the constructor. Not particularly relevant in this case, but static creation methods allows common values to be reuse instances, specialist implementations to be selected and the ugly diamond (<>) notation to be elided. (If you call your static creation method of it is clear to use when qualified with the type name.)
Addition is usually written as infix. If there are subexpressions involved this will make the client code look much better, though the Java syntax will still force you to have parentheses everywhere. A static method requires qualification or an import static for the client (the latter not really helpful if the method has a name like and, and 'import *' is bad if there other static method that don't make sense without qualification).
Reserve static methods for cases where the object is, in a sense, incidental to the function. For example String's join and format.
As for testing, it should not be necessary to mock a value class or static method. Immutable types should have trusted implementations and therefore not be subtypable by others.

What's the preferred Java convention for naming static constructor methods

What is the preferred convention for naming static constructor methods? For example, say I have an Error class, which has a single constructor which simply initialises the fields, and then some static constructor methods:
class Error {
static Error xxxx(String msg) {
return new Error(msg, -1);
}
static Error xxxx(String msg, int line) {
return new Error(msg, line);
}
final String msg;
final int line;
private Error(String msg, int line) {
this.msg = msg;
this.line = line
}
}
What should I name the xxxx methods. Possibilities include:
valueOf - some Java classes follow this, e.g. Integer.valueOf, but is this only used for boxing primitives?
of - more terse. Error.of(msg, i) seems readable.
error - some pros and cons - see below.
create - gives undue emphasis to the mechanism (create something) instead of the intent (give me a value). For instance, some implementations may cache and re-use values, meaning something isn't always actually created as such.
createError - wordy, and same issue as create.
I tend to code in a functional style, and, possibly as a consequence, my preference was #3, partly because if I static import Error then I can call it simply as error(msg, i), which seems readable and mimics actual constructor usage. However, it may cause confusion with local variables of the same name. For example error = error(msg, i); looks confusing.
I would be interested in seeing evidence or arguments in favour of a particular approach, rather than simple "I like xxx" answers.
If the recent JDK additions are a good indication, then you could look at java.time:
The API has a relatively large surface area in terms of number of
methods. This is made manageable through the use of consistent method
prefixes.
of - static factory method
parse - static factory method focussed on parsing
[...]
of looks like a reasonable candidate for your use case, but what probably matters more than the choice is consistency.
In "Effective Java", in addition to "valueOf" and "of" that you've already mentioned, Joshua Bloch proposes the following:
getInstance
newInstance
getType (i.e. getError)
newType (i.e. newError)
A good name for this is newError(...). It suggest the class is also a factory for itself.
valueOf and of can be used when the factory method uses some kind of cache such that the method invocation doesn't necessary create new instance.
newClassName or createClassName is for general factory methods where it always tries to create an instance.

Java Constructor and static method

When should I use a constructor and when should I use static method?
Can you explain above with small snippet? I skimmed through a few threads but I'm still not clear with this.
Joshua Bloch advises to favor static factory methods instead of constructors (which I think is a good practice). Couple of advantages and disadvantages :
Advantages of static factory methods :
unlike constructors, they have names
unlike constructors, they are not required to create a new object each time they're invoked (you can cache instances : e.g. Boolean.valueOf(..)
unlike constructors, they can return an object of any subtype of their return type (great flexibility)
Disadvantages of static factory methods :
They are not really distiguishable from other static methods (it's hard to find out how to initialize an object if you are not familiar with the API)
The main disadvantage (if you use only static factory methods, and make constructors private) is that you cannot subclass that class.
Use a public constructor when you only ever want to return a new object that type and you want simplicity.
A good example is StringBuilder as it's mutable and you are likely to want a new object each time.
public String toString() {
StringBuilder sb = new StringBuilder();
// append fields to the sb
return sb.toString();
}
Use a static factor method when you might want to re-use objects (esp if immutable), you might want to return a sub-class or you want descriptice construction. A good example is EnumSet which has a number of static factories which do different things even though some have the same arguments.
EnumSet.noneOf(RetentionPolicy.class);
// has the same arguments, but is not the same as
EnumSet.allOf(RetentionPolicy.class);
In this case, using a static factory makes it clear what the difference between these two ways of construction the set.
Also EnumSet can return two different implementations, one optimised for enums with a small number of values (<= 64) RegularEnumSet and another for many values called JumboEnumSet
Always use a constructor if your class has a state (even for a single instance; singleton pattern ).
Only use static for utility methods like in java.lang.Math
Example:
public static int max(int a, int b) {
return (a >= b) ? a : b;
}
Doesn't change any state (instance variables) of an object, thus it can be declared static.
Use constructor when you need an object and other stuffs like functions and variables having one copy for every object.
when you want to do something without creating object then use static method.
Example:
public class Test {
public int value;
public static int staticValue;
public int getValue() {
return ++value;
}
public static int getStaticValue() {
return ++staticValue;
}
}
public class TestClass {
public static void main(String[] args) {
Test obj = new Test();
Test obj1 = new Test();
S.o.p(obj.getValue());
S.o.p(obj1.getValue));
S.o.p(Test.getStaticValue());
S.o.p(Test.getStaticValue());
}
}
Static factory methods have names, constructors don't. Thus factory methods can carry natural documentation about what they do that constructors can't. For example, see the factory methods in the Guava Libraries, like ImmutableMap.copyOf(otherMap). While this might have little effect on behaviour of construction, it has a huge effect on readability of the code. Definitely consider this if you're publishing an API.
Also you can use a factory when you need to do any more complicated configuration of the object you're creating, especially if you need to publish to other threads (registering in pools, exposing as an MBean, all manner of other things...) to avoid racy publication. (See e.g. Java Concurrency In Practice section 3.2)
Static methods that do something (e.g. Math.min) are not really the same thing as static factories, which can be considered direct replacements for constructors, with added flexibility, evolvability and (often) clarity.
Whenever you need to create an instance of an object you will have to use the constructor.
So, if you want to create a Car object, then you will need a constructor for that.
The keyword static means, that your method can be called without creating an instance.
class Car
{
private int num_of_seats;
public Car(int number_of_seats)
{
this.num_of_seats = number_of_seats;
}
// You want to get the name of the class that has to do with
// this class, but it's not bounded with any data of the class
// itself. So you don't need any instance of the class, and
// you can declare it as static.
static String getClassName()
{
return "[Car]";
}
}
In general you will use static class with data that are not correlated with the instance of the object.
Another example is:
class Ring
{
private List nodes;
public Ring(List nodes)
{
this.nodes = nodes;
}
// You want to calculate the distance of two ids on the ring, but
// you don't care about the ring. You care only about the ids.
// However, this functionality logical falls into the notion of
// the ring, that's why you put it here and you can declare it
// as static. That way you don't have to manage the instance of
// ring.
static double calculateDistance(int id_1, int id_2)
{
return (id_1 - id_2)/383; // The divisor is just random just like the calculation.
}
}
As the posts above say, it's just a matter of what you want to do and how you want to do it. Also, don't try to understand everything rightaway, write some code then try different approaches of that code and try to understand what your code does. Examples are good, but you need to write and then understand what you did. I think it's the only way you will figure out
why you do staff the way you have to do.
Static methods do not have to instantiate new objects everytime. Since object instantiation is expensive it allows instances to be cached within the object. So, it can improve performance.
This is the explanation from the Effective Java :
This allows immutable classes (Item 15) to use preconstructed
instances, or to cache instances as they’re constructed, and dispense
them repeatedly to avoid creating unnecessary duplicate objects. The
Boolean.valueOf(boolean) method illustrates this technique: it never
creates an object. This technique is similar to the Flyweight pattern
[Gamma95, p. 195]. It can greatly improve performance if equivalent
objects are requested often, especially if they are expensive to
create.
i.e. if you want to use a singleton, which means that you have only one instance of the object, which might be shared with others, then you need a static method, which will internally will call the constructor. So, every time someone wants an instance of that object you will return always the same, thus you will consume memory only for one. You always need a constructor in object oriented programming, in every OO language. In java an in many other languages the default constructor of an object is implied, and built automatically. But you need some custom functionality you have to make your own.
Above you see a few good examples of the usage. However, if you have something specific in your mind, please let us know. I mean if you have a specific case where you are not sure if you should use a static method or a constructor. Anyhow, you will definitely need a constructor, but I am not sure about the static method.

Final arguments in interface methods - what's the point?

In Java, it is perfectly legal to define final arguments in interface methods and do not obey that in the implementing class, e.g.:
public interface Foo {
public void foo(int bar, final int baz);
}
public class FooImpl implements Foo {
#Override
public void foo(final int bar, int baz) {
...
}
}
In the above example, bar and baz has the opposite final definitions in the class VS the interface.
In the same fashion, no final restrictions are enforced when one class method extends another, either abstract or not.
While final has some practical value inside the class method body, is there any point specifying final for interface method parameters?
It doesn't seem like there's any point to it. According to the Java Language Specification 4.12.4:
Declaring a variable final can serve
as useful documentation that its value
will not change and can help avoid
programming errors.
However, a final modifier on a method parameter is not mentioned in the rules for matching signatures of overridden methods, and it has no effect on the caller, only within the body of an implementation. Also, as noted by Robin in a comment, the final modifier on a method parameter has no effect on the generated byte code. (This is not true for other uses of final.)
Some IDEs will copy the signature of the abstract/interface method when inserting an implementing method in a sub class.
I don't believe it makes any difference to the compiler.
EDIT: While I believe this was true in the past, I don't think current IDEs do this any more.
Final annotations of method parameters are always only relevant to the method implementation never to the caller. Therefore, there is no real reason to use them in interface method signatures. Unless you want to follow the same consistent coding standard, which requires final method parameters, in all method signatures. Then it is nice to be able to do so.
Update: Original answer below was written without fully understanding the question, and therefore does not directly address the question :) Nevertheless, it must be informative for those looking to understand the general use of final keyword.
As for the question, I would like to quote my own comment from below.
I believe you're not forced to implement the finality of an argument to leave you free to decide whether it should be final or not in your own implementation.
But yes, it sounds rather odd that you can declare it final in the interface, but have it non-final in the implementation. It would have made more sense if either:
a. final keyword was not allowed for interface (abstract) method arguments (but you can use it in implementation), or
b. declaring an argument as final in interface would force it to be declared final in implementation (but not forced for non-finals).
I can think of two reasons why a method signature can have final parameters: Beans and Objects (Actually, they are both the same reason, but slightly different contexts.)
Objects:
public static void main(String[] args) {
StringBuilder cookingPot = new StringBuilder("Water ");
addVegetables(cookingPot);
addChicken(cookingPot);
System.out.println(cookingPot.toString());
// ^--- OUTPUT IS: Water Carrot Broccoli Chicken ChickenBroth
// We forgot to add cauliflower. It went into the wrong pot.
}
private static void addVegetables(StringBuilder cookingPot) {
cookingPot.append("Carrot ");
cookingPot.append("Broccoli ");
cookingPot = new StringBuilder(cookingPot.toString());
// ^--- Assignment allowed...
cookingPot.append("Cauliflower ");
}
private static void addChicken(final StringBuilder cookingPot) {
cookingPot.append("Chicken ");
//cookingPot = new StringBuilder(cookingPot.toString());
// ^---- COMPILATION ERROR! It is final.
cookingPot.append("ChickenBroth ");
}
The final keyword ensured that we will not accidentally create a new local cooking pot by showing a compilation error when we attempted to do so. This ensured the chicken broth is added to our original cooking pot which the addChicken method got. Compare this to addVegetables where we lost the cauliflower because it added that to a new local cooking pot instead of the original pot it got.
Beans:
It is the same concept as objects (as shown above). Beans are essentially Objects in Java. However, beans (JavaBeans) are used in various applications as a convenient way to store and pass around a defined collection of related data. Just as the addVegetables could mess up the cooking process by creating a new cooking pot StringBuilder and throwing it away with the cauliflower, it could also do the same with a cooking pot JavaBean.
I believe it may be a superfluous detail, as whether it's final or not is an implementation detail.
(Sort of like declaring methods/members in an interface as public.)

What to call factory-like (java) methods used with immutable objects

When creating classes for "immutable objects" immutable meaning that state of instances can not be changed; all fields assigned in constructor) in Java (and similar languages), it is sometimes useful to still allow creation of modified instances. That is, using an instance as base, and creating a new instance that differs by just one property value; other values coming from the base instance. To give a simple example, one could have class like:
public class Circle {
final double x, y; // location
final double radius;
public Circle(double x, double y, double r) {
this.x = x;
this.y = y;
this.r = r;
}
// method for creating a new instance, moved in x-axis by specified amount
public Circle withOffset(double deltaX) {
return new Circle(x+deltaX, y, radius);
}
}
So: what should method "withOffset" be called? (note: NOT what its name ought to be -- but what is this class of methods called).
Technically it is kind of a factory method, but somehow that does not seem quite right to me, since often factories are just given basic properties (and are either static methods, or are not members of the result type but factory type).
So I am guessing there should be a better term for such methods. Since these methods can be used to implement "fluent interface", maybe they could be "fluent factory methods"?
Better suggestions?
EDIT: as suggested by one of answers, java.math.BigDecimal is a good example with its 'add', 'subtract' (etc) methods.
Also: I noticed that there's this question (by Jon Skeet no less) that is sort of related (although it asks about specific name for method)
EDIT, MAY-2014: My current favorite is mutant factory, FWIW.
I call those types of methods "copy methods".
While the clone() method creates an exact copy, a copy method makes a copy of an instance, usually with an implied or explicit variation. For example, String#toUpperCase() would be a copy method of the immutable String class - it copies an instance with a variation: it upcases all the letters.
I would consider withOffset() method in your example to be a similar copy method.
I don't know of any references that document the term "copy method". I'm borrowing the term "copy" from its use in C++: copy constructors and the "copy" naming guideline from the Taligent Coding Standards (more info).
As for the term "fluent factory methods", I don't know why "fluent" would make a difference, since a "fluent interface" is just an API style (separate from the builder pattern). If the term "factory method" doesn't apply here, I don't see how calling it a "fluent factory method" makes it apply any better.
Hrm … it creates mutated versions of the object … maybe we should call it a mutant factory? :-)
This does not really look like a factory method. The signature alone only tells me that I can chain it in different calls, but not that it is supposed to create a new instance:
I see this use case more like a StringBuilder that allows append("a").append("b").
It could, of course return a new StringBuilder every time (as your Circle does).
So it's not by design a factory. (Think of extracting the interface and writing the JavaDoc for that method: "one must return a new instance, since I'm immutable" -- why so ??) The fact that your class is immutable is just an implementation detail).
EDIT:
Perahs a better example would be BigInteger, since that's also immutable. Along with multiply(BigInteger), it provides a package-private method:
BigInteger multiply(long v)
which returns a new instance and resembles your case very well. That's simply an operation that happens to return a result of the same type as the initial object; not a factory and I don't think this kind of operation really deserves its own name.

Categories