custom annotations to modify values

I need your help, and thanks in advance for your Java/annotations support.
I would like to create a custom annotation, e.g. @ModifyRegex, to annotate variables,
and my approach is to modify/replace parts of the value of the annotated variables.
For example:
@ModifyRegex
private String variable;
if:
variable = "AbC-ABG-kkkk-4711";
then:
variable = "ABC-4711-ABG-kkkk";
I am not sure whether it's possible; if it is, please provide a simple code example.
Thanks

Annotation processors can make new source files. They cannot change existing ones. They also cannot read inside methods. In fact, they can't read any code at all, so the only string literal that you could possibly see is one that is literally (heh) a literal. The Java Language Specification defines when the value assigned to a field is considered a 'compile-time constant' and is written straight in. Here are some cases where it is not:
class Example {
// these are all NOT constant, therefore, cannot be
// retrieved with an annotation processor
long x = System.currentTimeMillis();
Pattern p = Pattern.compile("^AbC-ABG-$");
String s = null; // null is considered non-constant, for some reason.
String z = "HELLO!".toLowerCase();
}
But, if it's truly simple, such as @Foo private String x = "Hello"; where x is a field (and not a local variable in a method someplace), then yes, you can see it.
But all you can do, is make new files. You can't change an existing file. So, at best, you can make a second class that contains public static final String variable2 = "ABC-4711-...";.
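To make that concrete, here is a minimal sketch of such a processor, not a definitive implementation. The `pattern` and `replacement` attributes, the generated class name and the use of the default package are all assumptions made for illustration. Note that `VariableElement.getConstantValue()` only returns a value for a final field initialized with a compile-time constant, so the annotated field would have to be declared final.

```java
import java.io.IOException;
import java.io.Writer;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.lang.model.element.VariableElement;
import javax.tools.Diagnostic;

// Hypothetical annotation; the attribute names are assumptions.
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.FIELD)
@interface ModifyRegex {
    String pattern();
    String replacement();
}

// Assumes everything lives in the default package.
@SupportedAnnotationTypes("ModifyRegex")
public class ModifyRegexProcessor extends AbstractProcessor {

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    // Builds the source text of the generated companion class.
    public static String generateSource(String className, String fieldName, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "public final class " + className + " {\n"
             + "    public static final String " + fieldName + " = \"" + escaped + "\";\n"
             + "}\n";
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (Element e : roundEnv.getElementsAnnotatedWith(ModifyRegex.class)) {
            VariableElement field = (VariableElement) e;
            // null unless the field is final and initialized with a compile-time constant
            Object constant = field.getConstantValue();
            if (!(constant instanceof String)) {
                processingEnv.getMessager().printMessage(Diagnostic.Kind.WARNING,
                        "@ModifyRegex needs a final String field with a constant initializer", e);
                continue;
            }
            ModifyRegex cfg = e.getAnnotation(ModifyRegex.class);
            String modified = ((String) constant).replaceAll(cfg.pattern(), cfg.replacement());
            // One generated class per annotated field, e.g. Example_variable.
            String className = e.getEnclosingElement().getSimpleName() + "_" + e.getSimpleName();
            try (Writer w = processingEnv.getFiler().createSourceFile(className, e).openWriter()) {
                w.write(generateSource(className, e.getSimpleName().toString(), modified));
            } catch (IOException ex) {
                processingEnv.getMessager().printMessage(Diagnostic.Kind.ERROR, ex.toString(), e);
            }
        }
        return true;
    }
}
```

The original field is left untouched; other code would have to refer to the generated constant instead, which is exactly the limitation described above.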
But what about lombok?
Project Lombok does everything I just said you can't do: It inspects actual code, and modifies source files in-flight.
Unfortunately, lombok is a few hundred thousand lines of code and most of it is necessary to do all this: There is no uniform way to do it, so it's a ton of custom code. More to the point, it is also fundamentally extremely complicated: IDEs do code analysis all the time, and if you change structures, that has an effect on everything from 'auto-format my file as it is saved' to refactor scripts, to 'find callers', and so much more. Because there is no standard, you have to patch the editors to figure it out.
Changing a string constant may mean you don't need as much patching. Even then, it's still incredibly complicated.
Lombok is open source if you want to investigate.
NB: I'm a core contributor to Project Lombok.

Related

what's wrong with this approach?

A new Code Review process has been put in place and now my team must not ever declare a string as a local variable, or the commit won't pass the code review. We are now to use constants instead.
So this is absolutely not allowed, even if we're dead sure the string will never be used in any other place
String operationId = "create";
This is what should be used instead:
private static final String OPERATION_ID = "create";
While I totally agree with using constants for strings that appear 2+ times in the code ... I just find it overkill to completely lose the ability to declare a string in place if it's used only once.
Just to make sure it's clear, all the following are NOT ALLOWED under any circumstances:
String div = "div1";
catch (Exception ex) { LOGGER.log("csv file is corrupt"); }
String concatenation: String str = "something ...." + someVar + "something" ... we are to replace someVar with %s, declare the whole thing as a global string, and then later use String.format(....)
if (name.equals("Audi")) {....}
String value = map.get("key");
Any ideas, guys? I want some strong arguments. I'm ready to embrace any stand that's backed by a good argument.
Thanks.

First, let's throw out your assumption: There's nothing inherently wrong with the approach described.
It's not about strings being used in more than one place, it's about constants being easy to find and documented, and your code being consistent.
private static final String OPERATION_ID = "create";
Really, this isn't used anywhere else? Nothing would break if I changed this to the string "beetlejuice"? If something would break, then something else is using this constant... If the "something else" happens to be a codebase in a different language, and that's why they don't share string constants-- that's the exception, not the rule. Consistency!
That said, there are a few things I would standardize in a slightly different manner, but I would still standardize them nonetheless:
I would suggest allowing string literals in the constructors of enums:
public enum Operation {
CREATE("create"),
...
}
because here, the enum is the constant that is being referenced in the code, not the string literal. Declaring the constant as an enum or as a private static final String are equivalent to me, and there's no need to do both.
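As a rough sketch of that enum-as-constant idea: the extra values and the `id()` / `fromId()` helpers below are hypothetical additions for illustration, not part of the original example.

```java
import java.util.Arrays;
import java.util.Optional;

enum Operation {
    CREATE("create"),
    UPDATE("update"),   // hypothetical extra values, for illustration only
    DELETE("delete");

    // The string literal appears exactly once, in the constructor call above.
    private final String id;

    Operation(String id) {
        this.id = id;
    }

    String id() {
        return id;
    }

    // Maps an external string back to the enum constant, if one matches.
    static Optional<Operation> fromId(String id) {
        return Arrays.stream(values())
                     .filter(op -> op.id.equals(id))
                     .findFirst();
    }
}
```

Code then refers to Operation.CREATE rather than to "create", and a lookup such as Operation.fromId("beetlejuice") comes back empty instead of silently carrying an unknown string around.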
Additionally, I would not use this pattern anywhere that it breaks your IDE's ability to warn you about missing strings-- For example, looking up strings from .properties files. Many IDEs will give you proper warnings when you look up a key in a .properties file that doesn't exist, but the extra level of indirection might break that depending upon how smart your IDE is.
catch (Exception ex) { LOGGER.log("csv file is corrupt"); }
This to me is a bit of a gray area - Is this an internal-only message? Are the logs only ever seen by you, the developer, or are they for a user's benefit too?
If it's only for developers of the application, these messages probably don't need to be localized.
If you do expect the user to view the logs, then they should be externalized into a .properties file.

It is good coding style to define a constant for a value/literal when the value/literal is used multiple times.
The imposed coding style forces you to use a constant for every string literal.
The good effect of that coding style is: All string literals which really should be declared as constants are now declared as constants.
The bad implication of that coding style is: You - the developers - are not able to decide if a string literal should be defined as constant or not. This is a heavy punch.
Therefore you should raise your concern that the good intention of the coding style does not compensate for the mistrust in your qualities as developers.

Programming practice for defining string constants in Java

My understanding is that one should define a string constant in Java when the same string is used in multiple places. This helps reduce typos, reduces the effort of future changes to the string, etc.
But what about strings that are used in a single place? Should we declare a string constant even in that case?
For example, logging some counter (random example):
CounterLogger.addCounter("Method.Requested" , 1)
Is there an advantage to declaring a constant rather than using a raw string?
Does the compiler do any optimization?

Declaring constants can improve your code because they can be more descriptive. In your example
CounterLogger.addCounter("Method.Requested" , 1)
The method parameter "Method.Requested" is quite self-describing, but the 1 is not; making this a constant would make the example more readable.
CounterLogger.addCounter("Method.Requested" , INITIAL_VALUE)

The way I see it, Strings can be used in one of two ways:
As properties / keys / enumerations - or in other words, as an internal representation of other objects/states of your application, where one code component writes them and another one reads them.
In UI - for GUI / console / logging display purposes.
I think it's easy to see how in both cases it's important to avoid hard-coding.
The first kind of strings must (if possible) be stored as constants and exposed to whichever program component might use them for input/output.
Displayed strings (like in your logger case) are strings that you might change at some point in the future. Having them all stored as static final fields in a dedicated constants class can make later modifications much easier, and helps avoid duplicates of similar messages.
Regarding the optimization question - as others have already answered, I believe there's no significant difference.

Presumably, you'll want to write a unit test for whichever method contains that line of code. That unit test will need access to that String value. If you don't use a constant, you'll have the String repeated twice, and if you have to change it in the future, you'll have to change it in both places.
So best to use a constant, even though the compiler is not going to do any helpful optimisations.
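A minimal sketch of that argument follows. The CounterLogger here is a hypothetical in-memory stand-in for the one in the question, and RequestHandler is an invented name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the CounterLogger from the question.
class CounterLogger {
    private static final Map<String, Integer> COUNTERS = new HashMap<>();

    static void addCounter(String name, int delta) {
        COUNTERS.merge(name, delta, Integer::sum);
    }

    static int get(String name) {
        return COUNTERS.getOrDefault(name, 0);
    }
}

class RequestHandler {
    // Package-private so that a test in the same package can refer to it
    // instead of repeating the literal.
    static final String METHOD_REQUESTED = "Method.Requested";

    void handle() {
        CounterLogger.addCounter(METHOD_REQUESTED, 1);
    }
}
```

A test can then call new RequestHandler().handle() and check CounterLogger.get(RequestHandler.METHOD_REQUESTED), so renaming the counter means changing one line only.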

In my view, your case is fine. If you can't see any advantage in declaring it as a constant, don't do it. To support this point, take a look at Spring's JdbcTemplate (I have no doubt that Spring code is a good example to follow): it is full of String literals like these
Assert.notNull(psc, "PreparedStatementCreator must not be null");
Assert.notNull(action, "Callback object must not be null");
throw getExceptionTranslator().translate("StatementCallback", getSql(action), ex);
but only two constants
private static final String RETURN_RESULT_SET_PREFIX = "#result-set-";
private static final String RETURN_UPDATE_COUNT_PREFIX = "#update-count-";
Interestingly, this line
Assert.notNull(sql, "SQL must not be null");
repeats 5 times in the code; nevertheless, the authors chose not to make it a constant

How do I ensure the format for saving and parsing string representations of Objects correlate properly

I am making a small boardgame program which needs to persist the state of the board to a file, and later read from the file and re-create the board.
I am delegating this functionality to the class shown below. I would like to implement this such that the save format of a square of the board, along with its co-ordinates, is captured in the SQUARE_FORMAT constant, and the regex for reading that same information is captured in the LOAD_REGEX constant. Both should correlate in code and also be easy to decipher visually (by that I mean that a person should be able to clearly see that they relate to the same data).
Is there an idiom or pattern for doing this in Java code ?
public class BoardPersistenceUtility {
private final String SQUARE_SAVE_FORMAT = "";
private final String LOAD_REGEX = "";
public void save(PrintWriter writer, Board board) {
}
public Board load(BufferedReader reader) {
// Implement
return null;
}
}
Update 1:
On reading my question again, I guess it might be a bit confusing, about what exactly I am looking for. I am specifically looking for the right way to represent SQUARE_SAVE_FORMAT so that it clearly co-relates with the regex LOAD_REGEX.
SQUARE_SAVE_FORMAT would ideally be a String which uses special characters/variables that will be replaced with actual values and the result will be saved to a file. LOAD_REGEX is the corresponding regex that will be used to read contents from the file. The regex will use capturing groups so I can re-create the original object from the values I get from the capturing groups.
My question is, what are the idioms around creating such pairs of Strings - one of them a format string to be used for saving data, and the other a regex to be used while reading that data.
Update 2:
On thinking a bit more, I think I have been able to clarify my question a bit better.
If you look at both the Strings, SQUARE_SAVE_FORMAT is a format string which will be used in String.format() to create the text for a square on the board, which will be saved in the file. The constant SQUARE_LOAD_REGEX is a regex which will be used to read the line and capture relevant parts into named groups, so I can re-create the original object. (Sorry if my regex is slightly incorrect... I quickly wrote something, but I need to refresh some regex principles to ensure that this is indeed what I need.)
If you look at both these Strings visually, it is difficult to correlate them. Perhaps it is because we do not have any named variables in a Java format String. The best we can do is to specify an explicit argument index such as %1$d, where 1 is the index of the argument.
I would like to understand if there is any idiom or pattern to represent such pairs of Strings, where one is used for formatting some data to text and the other is used to read the same text and parse its parts.
public class BoardPersistenceUtility {
private final String SQUARE_SAVE_FORMAT =
"%d,%d:%b-%s";
private final String SQUARE_LOAD_REGEX =
"^(?<row>\\d*),(?<col>\\d*):(?<mine>true|false)-(?<status>\\w)$";
public void save(PrintWriter writer, Board board) {
}
public Board load(BufferedReader reader) {
// Implement
return null;
}
}

Note: you call SQUARE_SAVE_FORMAT and LOAD_REGEX "constants" which they are not, as you haven't declared them static final. It's better to keep terminology clear :-)
The simplest way to link these two is to define a class which encloses both as (final) fields. If you plan to define multiple such pairs of information, you can define multiple instances of the class, one for each type of format.
If you really want to keep these as constants, it may be best to define the enclosing class as an enum. Note that Java enums may contain methods too, so you may choose to implement the save/load logic as Strategies in the enum instances themselves, and call these polymorphically, which may help simplify your code.
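One way this could look, as a sketch only: the enum name SquareFormat, the constant V1 and the method names are assumptions, while the format string and regex are taken from the question (with the backslashes escaped for Java source).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

enum SquareFormat {
    // The two strings sit side by side, so a change to one is hard to miss in the other.
    V1("%d,%d:%b-%s",
       "^(?<row>\\d*),(?<col>\\d*):(?<mine>true|false)-(?<status>\\w)$");

    private final String saveFormat;
    private final Pattern loadPattern;

    SquareFormat(String saveFormat, String loadRegex) {
        this.saveFormat = saveFormat;
        this.loadPattern = Pattern.compile(loadRegex);
    }

    // Produces the text for one square.
    String format(int row, int col, boolean mine, char status) {
        return String.format(saveFormat, row, col, mine, status);
    }

    // Returns {row, col, mine, status} as strings; throws if the line does not match.
    String[] parse(String line) {
        Matcher m = loadPattern.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Malformed square: " + line);
        }
        return new String[] {
            m.group("row"), m.group("col"), m.group("mine"), m.group("status")
        };
    }
}
```

With this shape, SquareFormat.V1.parse(SquareFormat.V1.format(3, 7, true, 'H')) gives back the four values that went in, which is also a natural thing to assert in a unit test.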

I'm still not sure what you mean, but need formatting, so answer instead of comment.
First of all, the names are almost completely unrelated--relate them somehow.
SQUARE_DATA_STORE
SQUARE_DATA_REGEX
Second, there's no point in differentiating the "style" of the saved data if there's only a single BoardPersistenceUtility--if there were multiple formats then that information would be captured in a persistence utility subclass, like SquareFormatPersister or something.
Third, according to your text, one string is where the data will actually be stored. The other is a regular expression. The two will, in this case, never be "visually similar"--regular expressions of any complexity will never (much) look like the strings they can represent. (In this case, we have no clue, because we don't know what the board data can look like, of course.)
If your code is so non-self-explanatory that the reader can't figure out the two fields are related via your comments and your code, something has gone horribly wrong. I'm having a hard time imagining this code is so overwhelmingly complex that their relationship cannot be trivially communicated.
Edit after update
The answer is still no.
You could use a templating mechanism to provide names for the fields, similar to those used in your regex. This might also make the code a bit more self-explanatory as you'd fill the template context with named values (like "row" or "col").
You could use a real parser/generator, but the complexity there is a bit too much.
You could use a DSL (internal using Groovy, JRuby, JavaScript, etc. or external, which brings us back to parsing) and write chunks of the code that way.
IMO you're over-thinking, and over-estimating perceived complexity: except possibly for the templating solution, which IMO is likely over-engineering for the level of difficulty, you'd be far better off writing one or two sentences, which should be more than enough to relate the "fields" of the load and save formats.

Put comments in your code to explain that they're related, how they're related, what they're used for, and that if one is changed, the other should be modified accordingly.
Implement a unit test to make sure that a saved board can be loaded.
Make sure that your build and release process runs the unit tests, and fails if one of them doesn't pass.

Give instructions to the Java parser/lexer

Is there any way to give instructions directly to the parser and lexer from the Java code level? If not, how could one go about doing this at all?
The issue is that I want to have the parser evaluate a variable, back up, then assign the value of that variable as an Object name. Like this:
String s = "text";
SomeClass (s) = new SomeClass();
parser reads--> ok, s evaluates to be "text"...
parser backtracks, while holding "text" in memory and assigns "text" as the name of the new instance of SomeClass, such that one can now do this:
text.callSomeMethod();
I need to do this because I have to instantiate an arbitrary number of objects of SomeClass. Each one has to have a unique name, and it would be ideal to do something like this:
while (someArbitrarySet.hasNext()) {
String s = "token" + Math.random();
SomeClass (s) = new SomeClass();
(s).callSomeMethod();
}
I hope this makes sense...

What you're asking for is what some languages call MACROS. They're also sometimes known as preprocessor definitions, or simply "defines".
A decision was made not to have includes, macros and the like in Java, because they introduce additional code maintenance concerns; the designers concluded that they would lead to code that was not in the style they wanted.
However, just because it's not built into the compiler doesn't mean you couldn't add it to your build script.
As part of your build, you copy all files to a src-comp directory, and as you do, replace your tokens as they're defined.
I don't recommend doing it, but that doesn't mean it isn't possible.

What you describe (creating new named variables at runtime) is possible in interpreted languages like JavaScript, Lua, Bash, but not with a compiled language like Java. When the loop is executed, there is no source code there to manipulate, and all named variables have to be defined before.
Apart from this, your variables don't need a "unique" name, if you are using them sequentially (one after another), you could just as well write your loop as this:
while (someArbitrarySet.hasNext()) {
SomeClass sC = new SomeClass();
sC.callSomeMethod();
}
If you really need your objects at the same time, put them in some sort of data structure. The simplest would be an array, you could use a Collection (like an ArrayList) or a Map (like CajunLuke wrote), if you want to find them again by key.
In fact, an array (in Java) is nothing else than a collection of variables (all of the same type), which you can index by an int.
(The scripting languages which allow creating new variables at runtime also implement this with some kind of map String → (anything), where the map is either method/script-local or belongs to some surrounding object.)
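A small sketch of the "put them in a data structure" suggestion, using a List. SomeClass is a hypothetical stand-in for the class in the question, and InstanceRegistry is an invented name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the SomeClass from the question.
class SomeClass {
    private boolean called;

    void callSomeMethod() {
        called = true;
    }

    boolean wasCalled() {
        return called;
    }
}

class InstanceRegistry {
    // Every instance stays reachable (and so is not garbage-collected)
    // for as long as this list itself is reachable.
    private final List<SomeClass> instances = new ArrayList<>();

    SomeClass create() {
        SomeClass sc = new SomeClass();
        instances.add(sc);
        sc.callSomeMethod();
        return sc;
    }

    // The int index plays the role of the "unique name".
    SomeClass get(int index) {
        return instances.get(index);
    }

    int size() {
        return instances.size();
    }
}
```

Calling create() in a loop yields any number of distinct objects, none of which overwrites another, and no variable names have to be invented.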
You wrote in a comment to the question (better add those things to the question itself, it has an "edit" button):
Without getting into too many details, I'm writing an application that runs within a larger program. Normally, the objects would get garbage-collected after I was done with them, but the larger program maintains them, thus the need for a unique name for each. If I don't give each a unique name, the old object will get overwritten, but it is still needed in the context of the greater program.
So, you want to retain the objects to avoid garbage collection? Use an array (or List or anything else).
The thing is, if you want your larger program to be able to use these objects, you somehow have to give them to this larger program anyway. And then this program would have to retain references to these objects, thereby avoiding garbage collection. So it looks like you want to solve a problem which does not exist by means which do not exist :-)

Not really an answer to the question you asked, but a possible solution to your problem: using a map.
Map<String, SomeClass> variables = new HashMap<>();
while (someArbitrarySet.hasNext()) {
String s = "token" + Math.random();
variables.put(s, new SomeClass());
variables.get(s).callSomeMethod();
}
That way, you can use the "variable name" as the keys into the map, and you can get by without messing with the lexer/parser.
I really hope there is a way to do specifically what you state in Java - it would be really cool.

No. That's not possible.
Even if you could, I can't think of a way to invoke them, because there wouldn't be any compiled code that could successfully reference them.
So the options are the one described by CajunLuke, or to create your own Java parser, probably using the ANTLR sample Java grammar, and hook what you need in there.
Consider the map solution.

This is answered in How do you use Java 1.6 Annotation Processing to perform compile time weaving? .
In short, there is an annotation processing tool that allows you to extend java syntax, and create DSLs that compile to java annotations.
Under JDK 1.5 you had to use apt instead of javac, but under 1.6, processors are selected with the -processor flag to javac. From javac -help:
-processor <class1>[,<class2>,<class3>...] Names of the annotation processors to run; bypasses default discovery process
-processorpath <path> Specify where to find annotation processors

How do I generate the source code to create an object I'm debugging?

Typical scenario for me:
The legacy code I work on has a bug that only a client in production is having
I attach a debugger and figure out how to reproduce the issue on their system given their input. But, I don't know why the error is happening yet.
Now I want to write an automated test on my local system to try and reproduce then fix the bug
That last step is really hard. The input can be very complex and have a lot of data to it. Creating the input by hand (eg: P p = new P(); p.setX("x"); p.setY("x"); imagine doing this 1000 times to create the object) is very tedious and error prone. In fact you may notice there's a typo in the example I just gave.
Is there an automated way to take a field from a break point in my debugger and generate source code that would create that object, populated the same way?
The only thing I've come up with is to serialize this input (using Xstream, for example). I can save that to a file and read it back in in an automated test. This has a major problem: If the class changes in certain ways (eg: a field/getter/setter name is renamed), I won't be able to deserialize the object anymore. In other words, the tests are extremely fragile.

Java standard serialisation is well known to be not very useful when objects change between versions (content, naming of fields). It's fine for quick demo projects.
More suitable for your needs is an approach where objects support your own (binary) custom serialisation:
This is not difficult: use DataOutputStream to write out all fields of an object. But now introduce versioning, by first writing out a versionId. Objects that have only one version write out versionId 1. That way, when you later have to change your objects (remove fields, add fields), you can raise the version number.
Such an ICustomSerializable will then first read the version number from the input stream in a readObject() method and, depending on the version id, call readVersionV1() or e.g. readVersionV2().
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public interface ICustomSerializable {
    void writeObject(DataOutputStream dos) throws IOException;
    Object readObject(DataInputStream dis) throws IOException;
}

public class Foo {
    public static final int VERSION_V1 = 1;
    public static final int VERSION_V2 = 2;
    public static final int CURRENT_VERSION = VERSION_V2;
    private int version;
    private int fooNumber;
    private double fooDouble;

    public void writeObject(DataOutputStream dos) throws IOException {
        // The version id goes first, so a reader knows which layout follows.
        dos.writeInt(this.version);
        if (version == VERSION_V1) {
            writeVersionV1(dos);
        } else if (version == VERSION_V2) {
            writeVersionV2(dos);
        } else {
            throw new IllegalStateException("unknown version: " + this.version);
        }
    }

    public void writeVersionV1(DataOutputStream dos) throws IOException {
        dos.writeInt(this.fooNumber);
        dos.writeDouble(this.fooDouble);
    }
    // writeVersionV2(...) and the read methods are not shown here.
}
Further getters and setters, and a constructor which initialises the version to CURRENT_VERSION, are needed.
This kind of serialisation is safe under refactoring as long as you also change or add the appropriate read and write version. For complex objects using classes from external libs not under your control, it can be more work, but strings and lists are easily serialized.
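The read side is not shown above, so here is a self-contained sketch of how the version dispatch could work in both directions. The class name VersionedFoo, the fooLabel field added in V2, and the toBytes / fromBytes helpers are assumptions made for illustration; the V1 layout follows writeVersionV1 from the answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class VersionedFoo {
    static final int VERSION_V1 = 1; // layout: fooNumber, fooDouble
    static final int VERSION_V2 = 2; // hypothetical: V1 layout plus fooLabel

    int fooNumber;
    double fooDouble;
    String fooLabel = ""; // assumed to have been added in V2

    // Writes the version id first, then the fields that version knows about.
    byte[] toBytes(int version) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bytes)) {
            dos.writeInt(version);
            dos.writeInt(fooNumber);
            dos.writeDouble(fooDouble);
            if (version >= VERSION_V2) {
                dos.writeUTF(fooLabel);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the version id, then dispatches to the matching reader.
    static VersionedFoo fromBytes(byte[] data) {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = dis.readInt();
            switch (version) {
                case VERSION_V1: return readVersionV1(dis);
                case VERSION_V2: return readVersionV2(dis);
                default: throw new IllegalStateException("unknown version: " + version);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static VersionedFoo readVersionV1(DataInputStream dis) throws IOException {
        VersionedFoo foo = new VersionedFoo();
        foo.fooNumber = dis.readInt();
        foo.fooDouble = dis.readDouble();
        return foo; // fooLabel keeps its default for old data
    }

    private static VersionedFoo readVersionV2(DataInputStream dis) throws IOException {
        VersionedFoo foo = readVersionV1(dis);
        foo.fooLabel = dis.readUTF();
        return foo;
    }
}
```

Data saved in the V1 layout still loads after the class has gained a field, which is the property the test fixtures in the question would need.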

I think what you want to do is store the "state", and then restore that in your test to ensure the bug stays fixed.
Short answer: As far as I know there is no such general code generation tool, but as long as several constraints hold, writing such a tool is little work.
Long Comment:
There are constraints under which that can work. If everything is just beans with getters and setters for all the fields you need, then generating code for this is not so difficult. And yes, that would be safe against renaming if you refactor the generated code along with the normal code. If setters are missing, then this approach will not work. And that is only one example of why this is no general solution.
Refactoring can also, for example, move fields to other classes. How do you want to introduce the values from the other fields of that class? How can you later know whether your altered saved state still reflects the critical data? Or worse, imagine the refactoring gives the same field a different meaning than before.
The nature of the bug itself is also a constraint. Imagine for example the bug happened because a field/method had this and that name. If a refactoring now changes the name, the bug will not appear anymore regardless of your state.
Those are just arbitrary examples that may have exactly nothing to do with your real-life cases. But this is a case-by-case decision, not a general strategy. Anyway, if you know that your code, the bug and your refactorings are all well-behaved enough for this, then making such a tool takes less than a day, probably much less.
With xstream you would partially get this as well, but you would have to change the XML yourself. If you used for example db4o, you would have to tell it that this and that field now has this and that name.
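For the simple bean case, such a generator might look roughly like the sketch below. It is not an existing tool: BeanSourceGenerator is an invented name, P is a hypothetical stand-in for the class in the question, and only String, Integer, Boolean and Double property values are handled.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical bean standing in for the P class from the question.
class P {
    private String x;
    private int y;
    public String getX() { return x; }
    public void setX(String x) { this.x = x; }
    public int getY() { return y; }
    public void setY(int y) { this.y = y; }
}

class BeanSourceGenerator {
    // Emits Java statements that rebuild the given bean through its setters.
    static String generate(Object bean, String varName) {
        Class<?> type = bean.getClass();
        // TreeMap gives a stable, alphabetical order of setter calls.
        Map<String, String> calls = new TreeMap<>();
        for (Method setter : type.getMethods()) {
            if (!setter.getName().startsWith("set") || setter.getParameterCount() != 1) {
                continue;
            }
            try {
                Method getter = type.getMethod("get" + setter.getName().substring(3));
                calls.put(setter.getName(), literal(getter.invoke(bean)));
            } catch (NoSuchMethodException e) {
                // No matching getter: the property is skipped.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        StringBuilder out = new StringBuilder();
        String typeName = type.getSimpleName();
        out.append(typeName).append(' ').append(varName)
           .append(" = new ").append(typeName).append("();\n");
        calls.forEach((setter, value) ->
            out.append(varName).append('.').append(setter)
               .append('(').append(value).append(");\n"));
        return out.toString();
    }

    // Renders a property value as a Java literal.
    static String literal(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof String) {
            String s = (String) value;
            return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        if (value instanceof Integer || value instanceof Boolean || value instanceof Double) {
            return value.toString();
        }
        throw new IllegalArgumentException("unsupported property type: " + value.getClass());
    }
}
```

The output is ordinary source code that can be pasted into a test and then renamed along with the rest of the code base by the IDE's refactoring tools; nested objects, collections and classes without setters are not covered.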
