Shouldn't "static" patterns always be static? - java

I just found a bug in some code I didn't write and I'm a bit surprised:
Pattern pattern = Pattern.compile("\\d{1,2}.\\d{1,2}.\\d{4}");
Matcher matcher = pattern.matcher(s);
Despite the fact that this code fails badly on the input data we get (it tries to find dates in the 17.01.2011 format, gets back things like 10396/2011, and then crashes because it can't parse the date, but that really isn't the point of this question ; ) I wonder:
isn't one of the points of Pattern.compile to be a speed optimization (by pre-compiling regexps)?
shouldn't every "static" pattern always be compiled into a static Pattern?
There are so many examples, all around the web, where the same pattern is always recompiled using Pattern.compile that I begin to wonder if I'm seeing things or not.
Isn't (assuming that the string is static and hence not dynamically constructed):
static Pattern pattern = Pattern.compile("\\d{1,2}.\\d{1,2}.\\d{4}");
always preferable to a non-static pattern reference?

Yes, the whole point of pre-compiling a Pattern is to only do it once.
It really depends on how you're going to use it, but in general, pre-compiled patterns stored in static fields should be fine. (Unlike Matchers, which aren't threadsafe and therefore shouldn't really be stored in fields at all, static or not.)
The only caveat with compiling patterns in static initializers is that if the pattern doesn't compile and the static initializer throws an exception, the source of the error can be quite annoying to track down. It's a minor maintainability problem but it might be worth mentioning.
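A minimal sketch of that advice, assuming a hypothetical DateFinder class (the name and method are made up for illustration). The Pattern is compiled once into a static final field, and a new Matcher is created on every call so no mutable state is shared between threads. The dots are escaped here so that they match a literal '.'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class DateFinder {

    // Compiled once, when the class is initialized. Pattern instances are
    // immutable and safe to share between threads.
    private static final Pattern DATE =
            Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");

    private DateFinder() {
    }

    // Returns the first date-like substring of s, or null if there is none.
    static String findDate(String s) {
        // A Matcher holds mutable state, so create one per call instead of
        // keeping it in a field.
        Matcher matcher = DATE.matcher(s);
        return matcher.find() ? matcher.group() : null;
    }
}
```

If the regex in such a field were invalid, the PatternSyntaxException would surface wrapped in an ExceptionInInitializerError the first time the class is used, which is the debugging annoyance mentioned above.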

First, the bug in the pattern is that an unescaped dot (.) matches any character. If you want to match a literal dot, you have to escape it in the regex:
Pattern pattern = Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");
Second, Pattern.compile() is a relatively expensive method. It is generally recommended to initialize constant patterns (I mean patterns that never change and are not generated on the fly) only once. One popular way to achieve this is to put the Pattern.compile() call in a static initializer.
You can also use other approaches, for example the singleton pattern or a framework that creates singleton objects (like Spring).
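To make the dot problem concrete, here is a small sketch that runs both versions of the pattern against the two inputs from the question. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the unescaped and escaped patterns.
final class DotEscapeDemo {

    // Unescaped dots: '.' matches any character (except line terminators).
    static final Pattern LOOSE =
            Pattern.compile("\\d{1,2}.\\d{1,2}.\\d{4}");

    // Escaped dots: only a literal '.' is accepted between the digit groups.
    static final Pattern STRICT =
            Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");

    static boolean looseFinds(String s) {
        return LOOSE.matcher(s).find();
    }

    static boolean strictFinds(String s) {
        return STRICT.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(looseFinds("10396/2011"));  // true: "10", '3', "96", '/', "2011"
        System.out.println(strictFinds("10396/2011")); // false: no literal dots
        System.out.println(strictFinds("17.01.2011")); // true
    }
}
```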

Yes, compiling the Pattern on each use is wasteful, and defining it statically would result in better performance. See this SO thread for a similar discussion.

Static Patterns would remain in memory as long as the class is loaded.
If you are worried about memory and want a throw-away Pattern that you use once in a while and that can get garbage collected when you are finished with it, then you can use a non-static Pattern.

It is a classical time vs. memory trade-off.
If you are compiling a Pattern only once, don't stick it in a static field.
If you measured that compiling Patterns is slow, pre-compile it and put it in a static field.

Related

Is it a good idea to declare all text getting used for logging as public static final

Is it a good idea to declare all text getting used for logging as public static final from performance point of view or otherwise ?
Does it have any advantage other than readability in case one string is getting used only once ?
First, the objective part of your question: is there a performance benefit from declaring a log statement static final, i.e:
private static final String SUCCESS = "Success!";
//[...]
log.info(SUCCESS);
log.info(SUCCESS);
// versus:
log.info("Success!");
log.info("Success!");
The JLS states in section 3.10.5:
[A] string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.
So whether your string literal is declared once as a static final or appears multiple times in the source code, it will always be the same String instance, wherever it is used, and thus take up the same amount of memory, and will be accessed in exactly the same way. There will be no performance difference.
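The interning behaviour can be checked with reference comparisons. This is only a sketch with hypothetical names; == is used deliberately here to compare identity, which is not how strings should normally be compared.

```java
// Hypothetical demo class; == is used on purpose to compare references.
final class InternDemo {

    private static final String SUCCESS = "Success!";

    private InternDemo() {
    }

    // The constant and the literal are both compile-time constants, so they
    // refer to the same interned String instance.
    static boolean constantIsSameAsLiteral() {
        return SUCCESS == "Success!";
    }

    // A String created with 'new' is a distinct object until it is interned.
    static boolean runtimeStringIsDistinctUntilInterned() {
        String built = new String("Success!");
        return built != SUCCESS && built.intern() == SUCCESS;
    }
}
```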
Now the other part of the question: is it a good idea? That is inherently subjective, but my opinion is that you should avoid declaring log messages as static final. Log messages add to the readability of the code, which is especially valuable when the code is being maintained by people who did not write it. For example:
log.warn(LOGIN_ERROR_OCCURRED, userId, attempt);
// compared to:
log.warn("Login failed for user {}; attempt {} of 5.", userId, attempt);
It's much quicker and easier to read the log message in the context of the code, rather than having to jump somewhere else in the code to see the full log message.
Easier internationalization and localization are possible advantages of using identifiers for string constants.
ResourceBundle bundle = ...
private static final String LOGIN_ERROR_OCCURRED = bundle.getString("Login failed for user {}; attempt {} of 5");
But the benefits of i18n/L10n for log messages may be questionable.
Logging strings almost certainly should not be declared public, at least not usually. In most cases, it's better to declare constant variables for them, but sometimes you can be loose about that. Constants should appear together near the top of the class source, so for logging strings this gives a good overview of what happens in the class. It also makes it easier to find them for maintenance, like to edit out silly extraneous exclamation points. (Don't laugh; I've seen them.) I disagree that they obscure the point of log messages, unless you suck at naming variables. Which far too many people do.

Does ReentrantLock use Decorator Design Pattern in java?

ReentrantLock contains an abstract class Sync, and Sync has two subclasses, FairSync and NonfairSync. I want to know: is this the Decorator design pattern?
BTW, are there any good resources about design pattern usage in the Java source code?
No, it's not. Sync (and FairSync/NonfairSync as well) are only nested classes used as a field of ReentrantLock (basically, this is only composition; no special pattern is involved here).
The second question will result in opinion-based answers, since each person has their own tastes when it comes to design patterns (so there is no single good resource about design patterns).
If you really want to start somewhere, start on Wikipedia, where each pattern is explained quite neutrally; in any case it will let you know when (and if) it is appropriate to use them.
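The structure described in the question can be sketched roughly as below. This is a hypothetical, heavily simplified class that does no real locking and is not the JDK source; it only shows a helper object chosen in the constructor and used by delegation.

```java
// Hypothetical, simplified sketch; it does no real locking and is not the
// JDK implementation.
final class SketchLock {

    // The helper is a private implementation detail, chosen once.
    private final Sync sync;

    abstract static class Sync {
        abstract String policy();
    }

    static final class FairSync extends Sync {
        @Override
        String policy() {
            return "fair";
        }
    }

    static final class NonfairSync extends Sync {
        @Override
        String policy() {
            return "nonfair";
        }
    }

    SketchLock(boolean fair) {
        this.sync = fair ? new FairSync() : new NonfairSync();
    }

    // Plain delegation to the composed helper.
    String policy() {
        return sync.policy();
    }
}
```

A decorator, by contrast, implements the same interface as the object it wraps and receives that object from the caller, the way java.io.BufferedReader wraps another Reader. Here the lock and Sync share no interface and the caller never supplies the Sync.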

Programming practice for defining string constants in Java

My understanding of string constants in Java is that one should define a constant when the same string is used in multiple places. This helps reduce typos, reduces the effort of future changes to the string, etc.
But what about strings that are used in a single place? Should we declare a string constant even in that case?
E.g. logging some counter (random example):
CounterLogger.addCounter("Method.Requested" , 1)
Is there an advantage to declaring a constant rather than using the raw string?
Does the compiler do any optimization?
Declaring constants can improve your code because they can be more descriptive. In your example
CounterLogger.addCounter("Method.Requested" , 1)
The method parameter "Method.Requested" is quite self-describing, but the 1 is not; making it a constant would make this example more readable.
CounterLogger.addCounter("Method.Requested" , INITIAL_VALUE)
The way I see it, Strings can be used in one of two ways:
As properties / keys / enumerations - in other words, as an internal representation of other objects or states of your application, where one code component writes them and another one reads them.
In UI - for GUI / console / logging display purposes.
I think it's easy to see how in both cases it's important to avoid hard-coding.
The first kind of strings must (if possible) be stored as constants and exposed to whichever program component might use them for input/output.
Displayed strings (like in your Logger case) are strings that you might change at some point in the future. Having them all stored as static final fields in a dedicated constants class can make later modifications much easier, and helps avoid duplicates of similar messages.
Regarding the optimization question - as others have already answered, I believe there's no significant difference.
Presumably, you'll want to write a unit test for whichever method contains that line of code. That unit test will need access to that String value. If you don't use a constant, you'll have the String repeated twice, and if you have to change it in the future, you'll have to change it in both places.
So best to use a constant, even though the compiler is not going to do any helpful optimisations.
In my view, your case is fine as it is. If you can't see any advantage in declaring it as a constant, don't do it. To support this point, take a look at Spring's JdbcTemplate (I have no doubt that Spring code is a good example to follow); it is full of String literals like these
Assert.notNull(psc, "PreparedStatementCreator must not be null");
Assert.notNull(action, "Callback object must not be null");
throw getExceptionTranslator().translate("StatementCallback", getSql(action), ex);
but only two constants
private static final String RETURN_RESULT_SET_PREFIX = "#result-set-";
private static final String RETURN_UPDATE_COUNT_PREFIX = "#update-count-";
Interestingly, this line
Assert.notNull(sql, "SQL must not be null");
is repeated 5 times in the code; nevertheless, the authors chose not to make it a constant

Java refactoring related to type conversion

What are possible ways to refactor code in which a Java interface is used solely to define lots of constants? You can imagine how this interface is used to access these constants.
This is known as the constant interface anti-pattern. Although the previous link provides a way to fix this (using a class and static imports), I think there is a better way of refactoring this. Follow the suggestion here to fix this. Overall, it is better to move the constants to the appropriate classes/abstractions rather than using one utility constants class. For example, the Calendar class defines only the constants that are relevant to its operations. Also, as CoolBeans suggested, try converting those String constants to enums where applicable.
So you have a big bag of constants (you don't say how big, but I've seen things like this with thousands of entries). Importing all of these values is a mess, as you get all the unrelated values imported into everything.
Rather than automating the changes, what I'd do is separate the constants into logically coherent groups, and then move each group into the class hierarchy where it makes sense. E.g. if you've got constants for COLOR_RED, COLOR_GREEN, DATE_FIELD, WEEK_FIELD, you'd probably want to split them into the color and date hierarchies. For the first pass, ignore edge cases where you can't decide immediately - anything you can do to trim the constants down to coherent groups will help.
Seems like a good use case for enums. So take out the interface and replace it with an enum. Since it is a collection of constants, an enum fits the bill nicely. Moreover, enums are one of the most efficient ways to implement a Singleton in Java.
Instead of duplicating answers, take a look at these good relevant questions.
Java Enum Singleton
Efficient way to implement singleton pattern in Java
Alternatively as Pangea mentioned you can do static imports. I think both approaches are fine but enums in my opinion will be a better placeholder for organizing your unrelated constants in relevant meaningful groups.
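As a rough sketch of the enum approach, suppose the constants interface contained entries such as COLOR_RED = "red" and COLOR_GREEN = "green". The string values, the label() accessor, and fromLabel() are assumptions for illustration, since the original interface is not shown.

```java
// Hypothetical replacement for the color-related constants of a
// constants-only interface; the labels are assumed values.
enum Color {
    RED("red"),
    GREEN("green");

    private final String label;

    Color(String label) {
        this.label = label;
    }

    // The string the old constant used to hold.
    String label() {
        return label;
    }

    // Converts a raw string back to the enum, failing fast on unknown input.
    static Color fromLabel(String label) {
        for (Color c : values()) {
            if (c.label.equals(label)) {
                return c;
            }
        }
        throw new IllegalArgumentException("Unknown color: " + label);
    }
}
```

Callers then refer to Color.RED instead of inheriting COLOR_RED from an interface, and conversion between the raw string and the type happens in one place.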
You could attack the source files with a shell script that does the following:
for all .java files:
if (content matches " class ? implements Singleton {"):
replace "implements Singleton" with ""
append "import static Singleton.*;\n" after package declaration
This is far from perfect (it just ignores cases where a class implements Singleton and other interfaces...) but it could be a practical strategy - and maybe it's OK to solve 80% with a quick script and correct the remaining 20% manually (the IDE will report errors).

Wanted: a very simple Java RegExp API

I'm tired of writing
Pattern p = Pattern.compile(...
Matcher m = p.matcher(str);
if (m.find()) {
...
Over and over again in my code. I was going to write a helper class to make it neater, but then I wondered: is there a library that tries to provide a simpler facade for regular expressions in Java?
I'm thinking something in the style of commons-lang and Guava.
CLARIFICATION: I am actually hoping for some general library that would make working with regular expressions a more streamlined experience, kind of like how Perl does it. The code above was just an example.
I was thinking of something I could use like this:
for (int question : RegEx.findAllInts("SO question #(\\d+)", str)) {
// do something with int
}
Again, this is just an example of one of the many things I'd like to have. Probably not even a good example. APIs are hard.
UPDATE: I guess the answer is "No". Thanks for all the answers, have an upvote.
Why not just write your own wrapper method? Sure, you should not reinvent the wheel but another library also means another dependency.
A Pattern should only be compiled once; save it in a static final field. This at least saves you from repeating this step, both at coding time and at runtime. That is to say, for performance reasons this step should not always go hand-in-hand with creating a Matcher.
In your example, it seems RegEx plays the role of a Matcher object anyway. I hope it's not supposed to be a class with a static method since this would not work in a multithreaded environment -- the find and getInt calls are not connected then. So you need a Matcher of some sort anyway.
And so you're back to precisely the Java API, when design considerations are factored in. No I don't think there's a shorter way to do this correctly and efficiently.
There is a Java library that extends the features of the built-in Java regex library. Have a look at RegExPlus. I haven't tried it personally, but I hope this helps.
Yeah, it's always bugged me, too, having to write so much boilerplate to perform such common tasks. I think it would help a lot if String had a pair of methods like
public String findFirst(String regex)
public String[] findAll(String regex)
These represent the two most commonly performed regex operations that aren't already supported by String methods. If we had those, plus a dynamic replacement facility like Rewriter, we could almost forget about Pattern and Matcher. We would only need them when we're writing something really complicated, like a findAllInts() method. :D
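Until String offers something like that, the two methods can be approximated with a small static helper. This is a sketch with hypothetical names; it takes a Pattern rather than a regex string so that callers can compile once and reuse it, and it creates a new Matcher on each call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper approximating the findFirst/findAll idea.
final class RegexHelper {

    private RegexHelper() {
    }

    // Returns the first match of pattern in input, or null if there is none.
    static String findFirst(Pattern pattern, CharSequence input) {
        Matcher m = pattern.matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns every non-overlapping match of pattern in input, in order.
    static List<String> findAll(Pattern pattern, CharSequence input) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }
}
```

For example, with a precompiled Pattern for "\\d+", findAll on "SO question #1234, SO Question #3456" returns the two strings "1234" and "3456".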
There is Jakarta Regexp (see the RE class). Have a look at this old thread for advantages of Jakarta's RegExp package over the Java built-in RegEx.
Since Java 1.4, you can also use String.matches(String regex), which is a facade for similar code, but for whole-string matching rather than find().
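One difference worth keeping in mind: String.matches only succeeds when the entire string matches the regex, whereas Matcher.find() looks for a match anywhere in the input. It also compiles the regex on every call. A small sketch, with hypothetical class and method names:

```java
import java.util.regex.Pattern;

// Hypothetical demo class contrasting String.matches with Matcher.find.
final class MatchesVsFind {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    private MatchesVsFind() {
    }

    // True only if the whole string consists of digits; the regex is
    // recompiled on each call.
    static boolean wholeStringMatches(String s) {
        return s.matches("\\d+");
    }

    // True if digits occur anywhere in the string; uses the precompiled
    // pattern.
    static boolean containsMatch(String s) {
        return DIGITS.matcher(s).find();
    }
}
```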
For the specific example you give, you might be able to improvise something using Guava's splitter:
for (String number : Splitter.onPattern("[^\\d]+").omitEmptyStrings().split(input)) {
// Do something with the number
}
or more specifically, if you had input like
SO question #1234, SO Question #3456, SO Question #5678
you might do
for (String number : Splitter.onPattern("(, )?SO [Qq]uestion #").omitEmptyStrings().split(input)) {
// Do something
}
It's a bit hacky, but in specific cases it may do what you're after.
