How is Java annotation processor implemented? - java

I am writing an app in which I use integers to represent values with different semantics. For example, I use int to represent both month and year, and I want to avoid accidentally using a variable with one meaning in a context that requires the other. I want to annotate variables representing a month with @Month and those representing a year with @Year, and I would like the compiler to warn me about an unexpected assignment or method call. How do I implement that?
BTW: I don't want to introduce additional classes Month and Year because it is not as efficient and kind of verbose in syntax, e.g. I need to call month.get() to use it where an int is expected, and new Month(m) to create.
I searched online but could not find enough documentation. Any ideas?

Related

Avoid multiple castings of java properties (String) to Integer Double

I am looking for a way to reduce the castings of properties from java Properties to Numeric such as Integer, Double and, if possible, even a class that could convert directly to, for instance, Integer[].
Say I have multiple properties and must pass for instantiation many times and I wish to use a class that parses and casts only once.
Properties mogaProps = parseProperties("mogabcpu/moga");
for (int i = 0; i < 10000; i++) {
    NSGA2Runner GA = new NSGA2Runner(
        Integer.valueOf(mogaProps.getProperty("NUMINITIALCHROMOSSOMES")),
        Integer.valueOf(mogaProps.getProperty("NUMCHROMOSOMES")),
        Double.valueOf(mogaProps.getProperty("MUTATIONRATE")),
        Double.valueOf(mogaProps.getProperty("CROSSOVERRATE")),
        parseStringTo1DArray(mogaProps.getProperty("NUMITERATIONS")));
}
Is there a better alternative avoiding the use of a custom class to specifically hold all potential different data types attributes?
I have found this. However, I could not find this Config class dependency.
Thanks
I am looking for a way to reduce the castings of properties
You are parsing, not casting. The .valueOf calls are creating new values by interpreting the meaning held within text (parsing) rather than reshaping an existing value (casting). But not a useful distinction in the context of your Question, just FYI.
Say I have multiple properties and must pass for instantiation many times and I wish to use a class that parses and casts only once.
If the properties are not changing at runtime, just instantiate your configuration object (a NSGA2Runner object in your case, your GA) once. Pass that object around to the other methods and objects that need the information.
In your example code with the for loop, if your real code is processing the properties once and then using that data 10,000 times, move your GA = line to outside the loop.
By the way, if you are assigning an object once only, mark GA as final. That keyword obstructs any inadvertent attempt to make that variable point to any other object.
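As a rough sketch of "parse once, reuse many times": the MogaSettings class and its field names below are hypothetical, made up for illustration. Only the property keys are taken from the Question. The text is parsed a single time, and the resulting final object is reused inside the loop.

```java
import java.util.Properties;

// Hypothetical holder for the parsed values; not part of the original code.
final class MogaSettings {
    final int numInitialChromosomes;
    final int numChromosomes;
    final double mutationRate;
    final double crossoverRate;

    private MogaSettings(int numInitialChromosomes, int numChromosomes,
                         double mutationRate, double crossoverRate) {
        this.numInitialChromosomes = numInitialChromosomes;
        this.numChromosomes = numChromosomes;
        this.mutationRate = mutationRate;
        this.crossoverRate = crossoverRate;
    }

    // Parses each property exactly once; keys are spelled as in the Question.
    static MogaSettings from(Properties props) {
        return new MogaSettings(
            Integer.parseInt(props.getProperty("NUMINITIALCHROMOSSOMES")),
            Integer.parseInt(props.getProperty("NUMCHROMOSOMES")),
            Double.parseDouble(props.getProperty("MUTATIONRATE")),
            Double.parseDouble(props.getProperty("CROSSOVERRATE")));
    }
}

class ParseOnceDemo {
    public static void main(String[] args) {
        Properties mogaProps = new Properties();
        mogaProps.setProperty("NUMINITIALCHROMOSSOMES", "50");
        mogaProps.setProperty("NUMCHROMOSOMES", "100");
        mogaProps.setProperty("MUTATIONRATE", "0.05");
        mogaProps.setProperty("CROSSOVERRATE", "0.9");

        // Parsed once, outside the loop; 'final' prevents reassignment.
        final MogaSettings settings = MogaSettings.from(mogaProps);
        for (int i = 0; i < 10000; i++) {
            // Each iteration reads the already-parsed numbers.
            double rate = settings.mutationRate;
        }
        System.out.println(settings.numChromosomes);
    }
}
```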
reduce the castings of properties from java Properties to Numeric such as Integer, Double
Your data is stored as text in a Properties. You need to work with that data as numbers, not text. So there is no getting around the chore of parsing that text into numbers.
I am not sure why you are concerned about this. Perhaps performance is your concern? Such parsing is quick and easy. Not a significant impact on performance unless you are often processing millions of such values. Do a bit of micro benchmarking to see for yourself.
Is there a better alternative avoiding the use of a custom class to specifically hold all potential different data types attributes?
Defining a class is the appropriate way to gather together related parts of information in an object-oriented language like Java.
If your data is read-only, then you might want to use the records feature in Java 16 and later. A record is a brief way to write a class whose main purpose is to communicate data transparently and immutably. You merely need to declare the type and name of each member field. The compiler implicitly creates the constructor, getters, equals & hashCode, and toString.
public record NSGA2Runner( int numberOfInitialChromosomes , int numberOfChromosomes , double mutationRate , double crossoverRate , int[] numberOfIterations ) {}

When to use Enum class in java?

It may be a very obvious question, but is it good to use an Enum class if you know that the list of values will keep increasing?
Let's say you define an Event enum that at first contains only [Poo, Too]. Then, as always, some new requirement arrives and it becomes [Poo, Too, App, Laa], and it keeps changing again and again.
So what is the best approach in this case?
tl;dr
If the entire set of possible values is known at compile-time, use enum.
If values can be added or dropped while your system is in use (at runtime), then you cannot use an enum. Use a Set, List, or Map instead.
Known at compile-time
An enum is appropriate when the domain (set of all possible values) is known at compile-time.
If this year your company is offering two products ( Poo & Too ), then make an enum for those two elements.
public enum Product { POO , TOO }
Next year, your company decides to grow their product offerings by adding App & Laa. As part of a planned deployment, add two more objects to your enum.
public enum Product { POO , TOO , APP , LAA }
By the way, notice the naming conventions. The enum has a regular class name (initial cap). The objects being automatically instantiated are constants, and so are named in all-uppercase.
Also, be aware that the enum facility in Java is quite flexible and powerful, much more so than the usual naming-a-number enum scheme seen in most languages. You can have member variables and methods and constructors on a Java enum. For example, you can add a getDisplayName method to provide text more appropriate to a user-interface than the all-caps object name, as seen in DayOfWeek::getDisplayName. You can add quite a bit of functionality, such as ChronoUnit.between.
What you cannot do at runtime with an enum in Java is add or remove objects. Thus the requirement that you know your domain at compile-time. However, when working with a group of enum objects, you can use the highly-optimized EnumSet and EnumMap classes.
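To make the last two paragraphs concrete, here is a small sketch. The displayName field, the getDisplayName method on Product, and the ProductGroups helper are my own illustrative additions, not anything from the Question. It shows an enum carrying a constructor, a member variable, and a method, plus an EnumSet grouping some of its objects.

```java
import java.util.EnumSet;

// The Product enum from above, extended with a hypothetical display name.
enum Product {
    POO("Poo"), TOO("Too"), APP("App"), LAA("Laa");

    private final String displayName;

    // Enum constructors are implicitly private.
    Product(String displayName) {
        this.displayName = displayName;
    }

    String getDisplayName() {
        return displayName;
    }
}

class ProductGroups {
    // Hypothetical grouping: the two products added in the second year.
    static EnumSet<Product> addedNextYear() {
        return EnumSet.of(Product.APP, Product.LAA);
    }

    public static void main(String[] args) {
        for (Product p : addedNextYear()) {
            System.out.println(p + " is shown to users as " + p.getDisplayName());
        }
    }
}
```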
Known at runtime
If you cannot determine the domain at compile-time, if users can add or remove elements at runtime, then use a collection such as a List, Set, or Map rather than an enum.
Singleton
Though not originally intended as a purpose of Enum in Java, an enum happens to be the safest (and simplest) way to implement the Singleton design pattern.
This approach to a singleton is explained in the famous book Effective Java by Dr. Joshua Bloch, et al. Using an enum solves multiple obscure technical problems with other approaches to a singleton.
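A minimal sketch of the enum-as-singleton idea, assuming a made-up IdGenerator that hands out sequential numbers. The name and the counter are purely illustrative.

```java
// Single-element enum: the JVM guarantees exactly one INSTANCE.
enum IdGenerator {
    INSTANCE;

    private int counter;

    // Hypothetical state-changing method; synchronized so concurrent callers get distinct ids.
    synchronized int nextId() {
        return ++counter;
    }
}

class IdGeneratorDemo {
    public static void main(String[] args) {
        int first = IdGenerator.INSTANCE.nextId();
        int second = IdGenerator.INSTANCE.nextId();
        System.out.println(first + " then " + second);
    }
}
```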
Your question is pretty generic and I'm pretty sure there is no single right answer. But judging by the spring* tags, I suppose you might be asking about enums in DTOs that are being sent across your system in serialized form. If that's the case, I would recommend using String in the DTO, while inside a single app it's OK to use an enum. Then you only need to take care of deserialization/conversion in a factory-like manner, with the ability to handle an unknown or missing constant gracefully by logging, providing a fallback, or raising a meaningful error.
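Something along these lines is what I mean by converting in a factory manner. The EventType name and the fromWire method are hypothetical, and whether you return an empty Optional, a fallback constant, or throw is up to you.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical enum used inside the app; the DTO itself carries a plain String.
enum EventType {
    POO, TOO;

    // Converts the wire value; unknown or missing input yields an empty Optional.
    static Optional<EventType> fromWire(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(valueOf(raw.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            // A newer sender may know constants this version does not.
            return Optional.empty();
        }
    }
}

class EventTypeDemo {
    public static void main(String[] args) {
        System.out.println(EventType.fromWire("poo").isPresent());
        System.out.println(EventType.fromWire("Laa").isPresent());
    }
}
```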
It depends on a case-by-case situation and your question doesn't have much context. However, I do recommend using ENUMs for many cases, including if you expect the list of ENUMs to increase.
Some reasons to use them are:
It creates a definite guide of ENUM elements that can be used throughout your code. It eliminates uncertainty over what something is named or what it is. For example ENUM that contains list of animals, or enum of "something".
It's easy to refactor later if you need to change anything.
I'm sure there are many more reasons; I find it like a table of contents sometimes. In many cases you can avoid it completely and be fine, but I think it's better to use it in general if you're on the fence.

Using enums and switch statements to control method execution

My question concerns a specific design convention for methods in Java... but really it would apply to C++, C# and others as well. I don't know what this convention is called, but if there is a standardized convention, I would like to know how to find it. In other words, I wish to describe this convention as I have encountered it and be directed to a place where I can learn more.
Consider java.util.Calendar, specifically its child, GregorianCalendar. It has an interesting "getter / setter" convention. Let's say that you instantiate this object:
GregorianCalendar cal = new GregorianCalendar();
The fields of cal now describe the instant in time (down to the millisecond) at which the constructor was called.
Now let's say that you want to access the year field or the month field. You would use the following getters.
int year = cal.get(Calendar.YEAR);
int month = cal.get(Calendar.MONTH);
Notice that it's not cal.getYear() or cal.getMonth(). It looks like there is only one getter method for this class and that the return value is determined by the parameter naming the desired field. I would imagine that within the class there is an enum set up to list the fields... and that the getter function itself is composed of some kind of switch statement.
This type of architecture is not described in any of my books... it is however something that I've been using in my current work... but I've been doing it "my" way (basically just making it up as I go along). If there is a standardized way of doing this that other people use... I'd sure love to know it. Specifically, using enums and switch statements to control the execution of methods.
Thanks so much for your time! This is my first question on this site... I have been a long time lurker though. :)
First, note that the two approaches to API design are not mutually exclusive: one could have both a "get by index" and a "get by name", i.e.
int y1 = cal.get(Calendar.YEAR);
int y2 = cal.getYear();
The primary driving force behind getters controlled by an int constant in the Calendar class is uniformity: it lets users of the Calendar class, such as the date formatters, build code that accesses the calendar by index, without further interpretation. For example, if you wanted to implement a formatter that takes a format string and stores a data structure to pull data from a calendar, you would be able to do it with an array of integers: "dd-mm-yyyy" would become int[] {Calendar.DAY_OF_MONTH, Calendar.MONTH, Calendar.YEAR}, and you would be able to get the data from the calendar with a simple for loop.
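Roughly, such a loop could look like the sketch below. FieldLoopDemo and joinFields are names I made up. The only Calendar members used are get, DAY_OF_MONTH (the actual name of the day constant), MONTH and YEAR. Note that Calendar.MONTH is zero-based, hence the adjustment.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

class FieldLoopDemo {
    // Reads each requested field by its int index and joins the values with '-'.
    static String joinFields(Calendar cal, int[] fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append('-');
            }
            int value = cal.get(fields[i]);
            if (fields[i] == Calendar.MONTH) {
                value++; // Calendar months start at 0 (January)
            }
            sb.append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Calendar cal = new GregorianCalendar(2011, Calendar.FEBRUARY, 11);
        int[] pattern = {Calendar.DAY_OF_MONTH, Calendar.MONTH, Calendar.YEAR};
        System.out.println(joinFields(cal, pattern));
    }
}
```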
Note that one of the reasons why Calendar uses integer constants instead of enums is backward compatibility: Java did not have enums when the Calendar class was introduced.
Also note that you do not need a switch statement on an enum or int constants to implement Calendar's getters and setters: they can be implemented as direct reads and writes of the calendar component array.
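To illustrate the array idea without claiming this is how Calendar itself is written, here is a toy class. MiniCalendar and its three index constants are invented for this sketch.

```java
// Toy array-backed calendar; the field indices are hypothetical, not Calendar's own.
class MiniCalendar {
    static final int YEAR = 0;
    static final int MONTH = 1;
    static final int DAY = 2;

    private final int[] fields = new int[3];

    // The int constant is used directly as an array index: no switch needed.
    int get(int field) {
        return fields[field];
    }

    void set(int field, int value) {
        fields[field] = value;
    }

    public static void main(String[] args) {
        MiniCalendar cal = new MiniCalendar();
        cal.set(YEAR, 2011);
        cal.set(MONTH, 2);
        System.out.println(cal.get(YEAR) + "/" + cal.get(MONTH));
    }
}
```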
Actually, those are not enums. They are integers. Here is the source code of the Calendar get method:
public int get(int field)
{
complete();
return internalGet(field);
}
But having a single method that accepts an enum and returns different values based on it is good practice.
As far as the design pattern goes, IMHO it is a variation of Factory pattern.
I'm not actually aware of a name for this specific design, although I've seen it used in a few places. It's certainly not one of the standard "Design Patterns" and is really too small to qualify as a design pattern in its own right. It's just a different way of achieving encapsulation over the more traditional way with multiple getters and setters.
If I was to call it something it would probably be something like "flexible getter" or "extensible getter", i.e. "Rather than having multiple getters, let's have one flexible getter."
If I was implementing something like this I would probably use the strategy pattern to do it though:
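public abstract class Getter<T> {
    // Each concrete Getter knows how to pull one value out of the calendar.
    abstract T getData(MyCalendar ob);
}
// Declared inside MyCalendar (or another class in the same package):
public static final Getter<Integer> MONTH = new Getter<Integer>() {
    @Override
    Integer getData(MyCalendar ob) {
        return ob.month;
    }
};
Then your get method just looks like:
public <T> T get(Getter<T> toGet) {
    return toGet.getData(this);
}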
This uses polymorphism to fetch the data rather than a massive switch statement. It is fully flexible and extensible while still being type safe, etc.

Parsing Joda-Time Partials

I'd like to produce Partials from Strings, but can't find anything in the API that supports that. Obviously, I can write my own parser outside of the Joda-Time framework and create the Partials, but I can't imagine that the API doesn't already have the ability to do this.
Use of threeten (JSR-310) would be an acceptable solution, but it doesn't seem to support Partials. I don't know whether that is due to its alpha status, or whether the Partial concept is handled in a different manner, which I haven't discovered.
What is the best way to convert a String (2011, 02/11, etc) into a Partial?
I've extended DateTimeParserBucket. My extended class intercepts calls to the saveField() methods, and stores the field type and value before delegating to super. I've also implemented a method that uses those stored field values to create a Partial.
I'm able to pass my bucket instance to DateTimeParser.parseInto(), and then ask it to create the Partial.
It works, but I can't say I'm impressed with Joda-Time - given that it doesn't support parsing Partials out of the box. The lack of DateTimeFormatter.parsePartial(String) is a glaring omission.
You have to start by defining the valid formats for the Partials you will accept. There is no class which will just take text and infer the best possible match for a Partial. It's way too subjective based on locale, user preference, etc. So there's no way of getting around making a list of all of the valid formats for input. It will be very difficult to make these all mutually exclusive, so there should be priorities. For example, you might want mm/dd and mm/yy to both be valid formats. If I give you the string 02/11, which one should have priority?
Once you've determined exactly the valid formats, you should use DateTimeFormat.forPattern to create a DateTimeFormatter for each one. Then you can use each formatter to try to parseInto a MutableDateTime. Then, go through each field in the MutableDateTime and transfer the value into a Partial.
Unfortunately, there is no better way to handle this in the Joda library.
The ISODateTimeFormat class allows partial printing. As you say, there is no parsing method on DateTimeFormatter (although you can parse to a LocalDate and interpret that).
ThreeTen/JSR-310 has the DateTimeFields class which replaces Partial. Parsing of partials into a CalendricalMerger is supported, however that may not be convertible back into a DateTimeFields yet.

Do people provide multiple mechanisms for doing the same thing in an API?

Is it confusing to design an API with multiple ways of achieving the same outcome? For example, I have my own Date library (which is a simple wrapper around the Java Date/Calendar classes to distinguish a year-month-day, Date, from an instant-in-time, Instant and provide mechanisms to convert between the two). I started off with one method to create an instance of a Date:
Date.valueOfYearMonthDay(int year, int month, int day);
But then I found that the resultant code using the API was not very readable. So I added:
Date.yearMonthDay(int year, int month, int day)
Date.ymd(int year, int month, int day)
Date.date(int year, int month, int day)
Then I started getting fluent:
Date.january().the(int day).in(int year);
(I find that the fluent version is really useful for making readable tests). All these methods do identical things and have accurate JavaDoc. I think I've read that a strength of Perl is that each programmer can choose exactly which method he/she prefers to solve something. And a strength of Java is that there is usually only one way of doing things :-)
What are people's opinions?
I've been doing academic research for the past 10 years on different issues that have to do with API usability in Java.
I can tell you that the statement about having one way to do things in Java is fairly incorrect. There are often many ways to do the same thing in Java. And unfortunately, they are often not consistent or documented.
One problem with bloating the interface of a class with convenience methods is that you are making it more difficult to understand the class and how to use it. The more choices you have, the more complex things become.
In an analysis of some open-source libraries, I've found instances of redundant functionality, added by different individuals using different terms. Clearly a bad idea.
A greater problem is that the information carried by a name is no longer meaningful. For example, things like putLayer vs. setLayer in Swing, where one just updates the layer and the other also refreshes (guess which one?), are a problem. The same goes for getComponentAt and findComponentAt. In other words, the more ways there are to do something, the more you obfuscate everything else and reduce the "entropy" of your existing functionality.
Here is a good example. Suppose you want in Java to replace a substring inside a string with another string. You can use String.replace(CharSequence, CharSequence), which works exactly as you'd expect, literal for literal. Now suppose you wanted to do a regular expression replacement. You could use Java's Matcher and do a regular expression based replacement, and any maintainer would understand what you did. However, you could just write String.replaceAll(String, String), which calls the Matcher version. Many of your maintainers might not be familiar with this and might not realize the consequences, including the fact that the replacement string cannot contain unescaped "$" characters. So replacing "USD" with a "$" sign would work well with replace(), but would cause crazy things with replaceAll().
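A short sketch of that difference. ReplaceDemo and its method names are mine. The String and Matcher calls are standard JDK methods. The exact exception type thrown for a bare "$" replacement has varied between JDK versions, so the demo catches RuntimeException rather than a specific subclass.

```java
import java.util.regex.Matcher;

class ReplaceDemo {
    // Literal replacement: "$" has no special meaning here.
    static String literal(String s) {
        return s.replace("USD", "$");
    }

    // Regex replacement: a bare "$" is read as the start of a group reference.
    static String regexUnescaped(String s) {
        return s.replaceAll("USD", "$");
    }

    // Regex replacement with the replacement text quoted so "$" is taken literally.
    static String regexQuoted(String s) {
        return s.replaceAll("USD", Matcher.quoteReplacement("$"));
    }

    public static void main(String[] args) {
        System.out.println(literal("100 USD"));
        System.out.println(regexQuoted("100 USD"));
        try {
            System.out.println(regexUnescaped("100 USD"));
        } catch (RuntimeException e) {
            System.out.println("replaceAll failed: " + e);
        }
    }
}
```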
Perhaps the greatest problem, however, is that "doing the same thing" is rarely an issue of using the same method. In many places in Java APIs (and I am sure that also in other languages) you would find ways of doing "almost the same thing", but with differences in performance, synchronization, state changes, exception handling, etc. For instance, one call would work straight, while another would establish locks, and another will change the exception type, etc. This is a recipe for trouble.
So bottom line: Multiple ways to do the same thing are not a good idea, unless they are unambiguous and very simple and you take a lot of care to ensure consistency.
I'd echo what some others said in that convenience methods are great, but will take it a step further - all "convenience" methods should eventually call the same underlying method. The only thing that the convenience methods should do other than proxy the request is to take or return variables differently.
No calculations or processing allowed in the convenience methods. If you need to add additional functionality in one of them, go the extra mile and make it happen in the "main" / "real" one.
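A sketch of what I mean, using a made-up SimpleDate class as a stand-in for the asker's Date. The range checks are deliberately crude and only there to show where validation belongs.

```java
// Hypothetical stand-in for the asker's Date class.
final class SimpleDate {
    final int year;
    final int month;
    final int day;

    private SimpleDate(int year, int month, int day) {
        this.year = year;
        this.month = month;
        this.day = day;
    }

    // The "real" method: all validation and processing lives here.
    static SimpleDate valueOfYearMonthDay(int year, int month, int day) {
        if (month < 1 || month > 12) {
            throw new IllegalArgumentException("month out of range: " + month);
        }
        if (day < 1 || day > 31) {
            throw new IllegalArgumentException("day out of range: " + day);
        }
        return new SimpleDate(year, month, day);
    }

    // Convenience method: forwards the arguments and does nothing else.
    static SimpleDate ymd(int year, int month, int day) {
        return valueOfYearMonthDay(year, month, day);
    }
}
```

Because ymd only proxies, it can never validate differently from valueOfYearMonthDay.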
It's fine to provide convenience methods; the real problem is if each entry point begins to behave in subtly different ways. That's when the API isn't convenient anymore. It's just a pain to remember which way is "right," and documentation starts saying "the recommended way is..."
If Date.yearMonthDay() began to validate the date while Date.ymd() didn't, that'd be a problem. The same goes for if each begins supporting different "features": Date.yearMonthDay() could take non-Gregorian dates, and Date.date() could take non-Gregorian dates so long as a 4th object is given that tells the calendar type.
First, please don't invent your own date library. It's too hard to get right. If you absolutely have nothing better to do, be sure to read -- and understand -- Calendrical Calculations. Without understanding Calendrical Calculations you run a big risk of doing things wrong in obscure corner and edge cases.
Second, multiple access to a common underlying method is typical. Lots of Java library API methods state that they are simply a "wrapper" around some other method of the class.
Also, because of the Java language limitations, you often have overloaded method names as a way to provide "optional" arguments to a method.
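For example, something like the following, where Greeter and both greet overloads are invented purely to show the shape. The shorter overload supplies a default and forwards to the longer one.

```java
class Greeter {
    // "Optional" argument omitted: fill in the default and delegate.
    static String greet(String name) {
        return greet(name, "Hello");
    }

    // Full version that does the actual work.
    static String greet(String name, String greeting) {
        return greeting + ", " + name + "!";
    }

    public static void main(String[] args) {
        System.out.println(greet("Ada"));
        System.out.println(greet("Ada", "Welcome"));
    }
}
```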
Multiple access methods is a fine design.
If these do the exact same thing:
Date.yearMonthDay(int year, int month, int day)
Date.ymd(int year, int month, int day)
Date.date(int year, int month, int day)
I think that is bad form. When I am reading your code, I have no clue which one to use.
Things like
canvas.setClipRegion (int left, int top, int right, int bottom);
canvas.setClipRegion (Rect r);
are different in that they allow the caller to access the functionality without having to figure out how to format the data.
My personal opinion is that you should stick with one method to do something. If all 4 methods ultimately call the same method, then you only need one of them. If, however, they do something in addition to calling that method, then they should exist.
so:
// This method should not exist
Date yearMonthDay(final int year, final int month, final int day)
{
return (valueOfYearMonthDay(year, month, day));
}
The first method in addition to the fluent version would make more sense. But the yearMonthDay, ymd, and date methods should go.
Also, different languages have different goals. Just because it makes "sense" in Perl doesn't mean it makes sense in Java (or C#, or C++, or C, or Basic, or...)
I find that the fluent version is really useful for making readable tests.
This is a little bit troublesome because I worry that you might only be testing the fluent version. If the only reason methodX() exists is so you can have a readable test for methodY() then there is no reason for one of methodX() or methodY() to exist. You still need to test them in isolation. You're repeating yourself needlessly.
One of the guiding principles of TDD is that you force yourself into thinking about your API while you're writing your code. Decide which method you want clients of your API to use and get rid of the redundant ones. Users won't thank you for providing convenience methods; they'll curse you for cluttering your API with seemingly useless redundant methods.
