Higher-level, semantic search-and-replace in Java code from the command line

Command-line tools like grep, sed, awk, and perl allow one to carry out textual search-and-replace operations.
However, is there any tool that would allow me to carry out semantic search-and-replace operations in a Java codebase, from the command line?
The Eclipse IDE allows me, e.g., to easily rename a variable, a field, a method, or a class. But I would like to be able to do the same from the command line.
The rename operation above is just one example. I would further like to be able to select the text to be replaced using additional semantic constraints such as:
only the scopes of methods M1, M2 of classes C, D, and E;
only all variables or fields of class C;
all expressions in which a variable of some class occurs;
only the scope of the class definition of a variable;
only the scopes of all overridden versions of method M of class C;
etc.
Having selected the code using such arbitrary semantic constraints, I would like to be able to then carry out arbitrary transformations on it.
So, basically, I would need access to the symbol-table of the code.
Question:
Is there an existing tool for this type of work, or would I have to build one myself?
Even if I have to build one myself, do any tools or libraries exist that would at least provide me the symbol-table of Java code, on top of which I could add my own search-and-replace and other refactoring operations?
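One open-source library that exposes this kind of symbol information is JavaParser together with its symbol solver. Below is a minimal sketch, assuming the javaparser-symbol-solver dependency is on the classpath and the path to a source file is passed as the first command-line argument; the exact API may vary between library versions:
import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.expr.NameExpr;
import com.github.javaparser.symbolsolver.JavaSymbolSolver;
import com.github.javaparser.symbolsolver.resolution.typesolvers.ReflectionTypeSolver;
import java.nio.file.Paths;

public class SymbolDump {
    public static void main(String[] args) throws Exception {
        // Attach a symbol solver so that names can be resolved to declarations.
        StaticJavaParser.getConfiguration()
                .setSymbolResolver(new JavaSymbolSolver(new ReflectionTypeSolver()));

        CompilationUnit cu = StaticJavaParser.parse(Paths.get(args[0]));

        // For every name used in an expression, print the type it resolves to.
        cu.findAll(NameExpr.class).forEach(name -> {
            try {
                System.out.println(name.getNameAsString() + " : "
                        + name.resolve().getType().describe());
            } catch (Exception e) {
                System.out.println(name.getNameAsString() + " : <unresolved>");
            }
        });
    }
}
From a resolved declaration you can recover its type and declaring scope, which is the raw material for the kinds of constraints listed above.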

The only tool that I know can do this easily is the long-awaited Refaster. However, it is still impossible to use it outside of Google. See [the research paper](http://research.google.com/pubs/pub41876.html) and the status on using Refaster outside of Google.
I am the author of AutoRefactor, and I am very interested in implementing this feature as part of this project. Please follow up on the GitHub issue if you would like to help.

What you want is the ability to find code according to syntax, constrained by various semantic conditions, and then be able to replace the found code with new syntax.
Access to the symbol table (symbol type/scope/mentions in scope) is just one kind of semantic constraint. You'll probably want others, such as control flow sequencing (this happens after that) and data flow reaching (data produced here is consumed there). In fact, there is an unbounded number of semantic conditions you might consider important, depending on the properties of the language (does this function access data in parallel with that function?) or your application interests (is this matrix an upper triangular matrix?).
In general, you can't have a tool that has all possible semantic conditions of interest off the shelf. That means you need to be able to express new semantic conditions when you discover the need for them.
The best you might hope for is a tool that
knows the language syntax
has some standard semantic properties built in (my preference is symbol tables, control and data flow analysis)
can express patterns on the source in terms of the source code
can constrain the patterns based on such semantic properties
can be extended with new semantic analyses to provide additional properties
There is a classic category of tools that does this, called source-to-source program transformation systems.
My company offers the DMS Software Reengineering Toolkit, which is one of these. DMS has been used to carry out production transformations at scale on a wide variety of languages (including OP's target: Java). DMS's rewrite rules are of the form:
rule <rule_name>(syntax_parameters): syntax_category =
<match_pattern> -> <replacement_pattern>
if <semantic_condition>;
You can see in a lot more detail what the pattern language and rewrite rules look like: DMS Rewrite Rules.
It is worth noting that the rewrite rules represent operations on trees. This means that while they might look like text string matches, they are not. Consequently, a rewrite rule matches in spite of any whitespace issues (and in DMS's case, even in spite of differences in number radix or character string escapes). This makes DMS pattern matches far more effective than a regex, and a lot easier to write, since you don't have to worry about these issues.
This Software Recommendations link shows how one can define rules with DMS and (as per the OP's request) run them from the command line. This isn't as succinct as running sed, but then it is doing much more complex tasks.
DMS has a Java front end with symbol tables and control and data flow analysis. If one wants additional semantic analyses, one codes them in DMS's underlying programming language.


Why don't compilers use asserts to optimize? [closed]

Consider the following pseudo-C++ code:
std::vector<int> v;
// ... filling the vector here and doing stuff ...
assert(std::is_sorted(v.begin(), v.end()));
auto x = std::find(v.begin(), v.end(), elementToSearchFor);
find has linear runtime, because it's called on a vector, which can be unsorted. But at that line in that specific program we know that either the program is incorrect (as in: it doesn't run to the end if the assertion fails) or the vector being searched is sorted, therefore allowing a binary-search find with O(log n). Optimizing it into a binary search should be done by a good compiler.
This is only the simplest such case I have found so far (more complex assertions may allow even more optimization).
Do some compilers do this? If yes, which ones? If not, why don't they?
Appendix: Some higher-level languages may easily do this (especially FP ones), so this is more about C/C++/Java/similar languages.
Rice's Theorem basically states that non-trivial properties of code cannot be computed in general.
The relationship between is_sorted being true and a faster-than-linear search being possible is a non-trivial property of the program after is_sorted is asserted.
You can arrange for explicit connections between is_sorted and the ability to use various faster algorithms. The way you communicate this information in C++ to the compiler is via the type system. Maybe something like this:
template<typename C>
struct container_is_sorted {
    C c;
    // forward a bunch of methods to `c`.
};
Then you'd invoke a container-based algorithm that would either use a linear search on most containers, or a sorted search on containers wrapped in container_is_sorted.
This is a bit awkward in C++. In a system where variables could carry different compiler-known type-like information at different points in the same stream of code (types that mutate under operations) this would be easier.
I.e., suppose types in C++ had a sequence of tags like int{positive, even} that you could attach to them, and you could change the tags:
int x;
make_positive(x);
Operations on a type that did not actively preserve a tag would automatically discard it.
Then assert( {is sorted}, foo ) could attach the tag {is sorted} to foo. Later code could then consume foo and have that knowledge. If you inserted something into foo, it would lose the tag.
Such tags might be run time (that has cost, however, so unlikely in C++), or compile time (in which case, the tag-state of a given variable must be statically determined at a given location in the code).
In C++, due to the awkwardness of such stuff, we instead by habit simply note it in comments and/or use the full type system to tag things (rvalue vs lvalue references are an example that was folded into the language proper).
So the programmer is expected to know it is sorted, and invoke the proper algorithm given that they know it is sorted.
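A rough Java sketch of the same idea (all class and method names here are invented for illustration): the sortedness invariant is carried by a wrapper type, so the faster algorithm is chosen explicitly by whoever constructed the wrapper, not guessed at by the compiler.
import java.util.Collections;
import java.util.List;

final class SortedList<T extends Comparable<T>> {
    private final List<T> items;

    private SortedList(List<T> items) {
        this.items = items;
    }

    // The only way to obtain a SortedList is through this factory,
    // which establishes the invariant exactly once.
    static <T extends Comparable<T>> SortedList<T> of(List<T> items) {
        Collections.sort(items);
        return new SortedList<>(items);
    }

    // Because the invariant is carried by the type, binary search is safe here.
    boolean contains(T element) {
        return Collections.binarySearch(items, element) >= 0;
    }
}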
Well, there are two parts to the answer.
First, let's look at assert:
7.2 Diagnostics <assert.h>
1 The header defines the assert and static_assert macros and refers to another macro, NDEBUG, which is not defined by <assert.h>. If NDEBUG is defined as a macro name at the point in the source file where <assert.h> is included, the assert macro is defined simply as
#define assert(ignore) ((void)0)
The assert macro is redefined according to the current state of NDEBUG each time that <assert.h> is included.
2 The assert macro shall be implemented as a macro, not as an actual function. If the macro definition is suppressed in order to access an actual function, the behavior is undefined.
Thus, there is nothing left in release-mode to give the compiler any hint that some condition can be assumed to hold.
Still, there is nothing stopping you from redefining assert with an implementation-defined __assume in release-mode yourself (take a look at __builtin_unreachable() in clang / gcc).
Let's assume you have done so. Now, the condition tested could be really complicated and expensive. Thus, you really want to annotate it so it does not ever result in any run-time work. Not sure how to do that.
Let's grant that your compiler even allows that, for arbitrary expressions.
The next hurdle is recognizing what the expression actually tests, and how that relates to the code as written and to any code that is potentially faster but equivalent under the given assumption.
This last step results in an immense explosion of compiler complexity, by either having to create an explicit list of all those patterns to test, or by building a hugely complicated automatic analyzer.
That's no fun, and just about as complicated as building SkyNET.
Also, you really do not want to use an asymptotically faster algorithm on a data-set which is too small for asymptotic time to matter. That would be a pessimization, and you just about need precognition to avoid such.
Assertions are (usually) compiled out of the final code, meaning, among other things, that the code could (silently) fail (by retrieving the wrong value) due to such an optimization if the assertion was not satisfied.
If the programmer (who put the assertion there) knew that the vector was sorted, why didn't he use a different search algorithm? What's the point in having the compiler second-guess the programmer in this way?
How does the compiler know which search algorithm to substitute for which, given that they all are library routines, not a part of the language's semantics?
You said "the compiler". But compilers are not there for the purpose of writing better algorithms for you. They are there to compile what you have written.
What you might have asked is whether the library function std::find should be implemented so that it checks whether it can perform the search with something other than a linear scan. In reality that might be possible if the user has passed in std::set iterators, or even std::unordered_set, and the STL implementer knows the details of those iterators and can make use of them, but not in general and not for vector.
assert itself only applies in debug mode, and optimisations are normally needed for release mode. Also, a failed assert causes an interrupt, not a library switch.
Essentially, there are collections provided for faster lookup, and it is up to the programmer to choose them; it is not for the library writer to try to second-guess what the programmer really wanted to do. (And in my opinion even less so for the compiler to do it.)
In the narrow sense of your question, the answer is: they do if they can, but mostly they can't, because the language isn't designed for it and assert expressions are too complicated.
If assert() is implemented as a macro (as it is in C++), and it has not been disabled (by setting NDEBUG in C++) and the expression can be evaluated at compile time (or can be data traced) then the compiler will apply its usual optimisations. That doesn't happen often.
In most cases (and certainly in the example you gave) the relationship between the assert() and the desired optimisation is far beyond what a compiler can do without assistance from the language. Given the very low level of meta-programming capability in C++ (and Java) the ability to do this is quite limited.
In the wider sense, I think what you're really asking for is a language in which the programmer can make assertions about the intention of the code, from which the compiler can choose between different translations (and algorithms). There have been experimental languages attempting to do that, and Eiffel had some features in that direction, but I'm not aware of any mainstream compiled languages that could do it.
"Optimizing it into a binary search should be done by a good compiler."
No! A linear search results in a much more predictable branch. If the array is short enough, linear search is the right thing to do.
Apart from that, even if the compiler wanted to, the list of ideas and notions it would have to know about would be immense and it would have to do nontrivial logic on them. This would get very slow. Compilers are engineered to run fast and spit out decent code.
You might spend some time playing with formal verification tools whose job is to figure out everything they can about the code they're fed in, which asserts can trip, and so forth. They're often built without the same speed requirements compilers have and consequently they're much better at figuring things out about programs. You'll probably find that reasoning rigorously about code is rather harder than it looks at first sight.

What are the subphases of the semantic analysis compiler phase?

I took an interest in finding out how a compiler really works. I looked through several books and all of them agree that the compiler phases are roughly as follows (correct me if I'm wrong): lexical analysis, syntax analysis, semantic analysis, intermediate code, code optimization, code generation. The lexical and syntax phases look pretty clear and straightforward as methods (but this does not mean easy, of course). However, I'm still not able to find out what the semantic phase really consists of. For one, I know that there should be some subphases like scope checking, declaration checking and type checking, but the question that has been bothering me is: are there other things that have to be done? Can you tell me what the mandatory steps are that have to be taken during this phase? I know this strongly depends on the programming language and the compiler implementation, but could you give me some examples concerning C/C++ and Java? And could you please point me to a book/page/article where I can read about those things in depth? Thanks.
Edit:
The books I looked through were "Compilers: Principles, Techniques, and Tools" (Aho) and "Modern Compiler Design" (Grune, Reeuwijk). I haven't been able to answer this question using them. If you find this question too broad, could you please give an answer considering a compiler implementation of your choice, for either C, C++ or Java.
There are typical "semantic analysis" phases that many compilers go through in one form or another. After lexing and parsing, the following actions typically occur in this order:
Name and type resolution. Determines lexical scopes, the identifiers declared in those scopes, the type information for those identifiers, and, for each non-declaration use of an identifier, the declaration to which it refers.
Control flow analysis. The construction of a control flow graph over the computations explicit and/or implied by the code (e.g., constructors).
Data flow analysis. Determines where variables receive new values, and where those values are read by other parts of the program. (This is often done as a local analysis within procedures, followed possibly by one across procedures.)
Also often done, as part of data flow analysis:
Points-to analysis. Determination, for each pointer at each location in the code, of which entities that pointer might reference.
Call graph. Construction of a call graph across the procedures, often taking into account indirect function pointers whose estimated values occur during the points-to analysis.
As a practical matter, some of these need to be interleaved to produce better results.
Beyond this, there are many analyses used to support various optimizations and code generation passes. If you really want to know more, consult any decent compiler book.
As already mentioned by templatetypedef, semantic analysis is language specific. For C++ it would, among other things, involve determining which template instantiations are required (the C++ language tends towards more and more semantic analysis), and for Java there would need to be some checked-exception analysis.
Even for C, the GNU C compiler can be configured to check the arguments given to printf-style format strings. I guess there are hundreds of semi-semantic-analysis-related options for GCC to choose from. If you are doing a paper on the subject, you could spend an afternoon counting them :)
Besides availability, I find that the semantic analysis is what differentiates the statically typed imperative object-oriented languages of today.
You can't necessarily divide it into sub-phases at all. There are a number of things that have to be done, but at least conceptually they are all done while walking the parse tree from top to bottom and back up again. What exactly they are and how exactly it all happens depends on the language, the statement being processed, the specific compiler writer, ...
You could start to make a list:
Build symbol table.
Find the declarations of variables referenced.
Check compatibility of variable datatypes.
Establish subexpression types.
...
You can see that already these must be somewhat intermingled in practice, rather than constitute separable sub-phases.
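As a toy sketch of the first two items (building a symbol table and finding the declarations of referenced variables), here is how a chain of nested scopes might look in Java; it is illustrative only and not taken from any particular compiler:
import java.util.HashMap;
import java.util.Map;

final class Scope {
    private final Scope parent;                                   // enclosing scope, or null
    private final Map<String, String> symbols = new HashMap<>();  // name -> declared type

    Scope(Scope parent) {
        this.parent = parent;
    }

    void declare(String name, String type) {
        if (symbols.containsKey(name)) {
            throw new IllegalStateException("duplicate declaration: " + name);
        }
        symbols.put(name, type);
    }

    // Walk outward through enclosing scopes, exactly as "find the declaration
    // of a referenced variable" would during semantic analysis.
    String lookup(String name) {
        for (Scope s = this; s != null; s = s.parent) {
            String type = s.symbols.get(name);
            if (type != null) {
                return type;
            }
        }
        throw new IllegalStateException("undeclared identifier: " + name);
    }
}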

Programmatic code modification (e.g. variable extraction) in Java

I know it's possible to do nice stuff with Reflection, such as invoking methods, or altering the values of fields. Is it possible to do heavier code modification, though, at runtime and programmatically?
For instance, if I have a method:
public void foo() {
    this.bar = 100;
}
Can I write a program that modifies the innards of this method, notices that it assigns a constant to a field, and turns it into the following:
public int baz = 100;

public void foo() {
    this.bar = baz;
}
Perhaps Java isn't really the language to do this kind of thing in - if not, I'm open to suggestions for languages that would allow me to basically reparse or inspect code in this way, and be able to alter it so precisely. I might be pipe dreaming here though, so please tell me if this is the case also.
Just adding a suggestion from a friend - Apache Commons' BCEL looks excellent:
http://commons.apache.org/bcel/manual.html
The Byte Code Engineering Library (Apache Commons BCEL™) is intended to give users a convenient way to analyze, create, and manipulate (binary) Java class files (those ending with .class). Classes are represented by objects which contain all the symbolic information of the given class: methods, fields and byte code instructions, in particular.
Such objects can be read from an existing file, be transformed by a program (e.g. a class loader at run-time) and written to a file again. An even more interesting application is the creation of classes from scratch at run-time. The Byte Code Engineering Library (BCEL) may also be useful if you want to learn about the Java Virtual Machine (JVM) and the format of Java .class files.
You are looking for software that allows you to do bytecode manipulation; there are several frameworks for achieving this, but the two best known currently are:
ASM
javassist
When performing bytecode modifications at runtime on Java classes, keep the following in mind:
If you change a class's bytecode after the class has been loaded by a classloader, you'll have to find a way to reload its class definition (either through classloading tricks, or using hotswap functionality).
If you change the class's interface (for example, adding new methods or fields), you will only be able to reach the new members through reflection.
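For a feel of the Javassist route, here is a hedged sketch; the class and method names below are placeholders:
import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtMethod;

public class AddLogging {
    public static void main(String[] args) throws Exception {
        ClassPool pool = ClassPool.getDefault();
        CtClass cc = pool.get("com.example.Foo");      // placeholder class name
        CtMethod m = cc.getDeclaredMethod("foo");      // placeholder method name

        // Inject a statement at the start of the method body.
        m.insertBefore("{ System.out.println(\"entering foo\"); }");

        // Write the modified class file to the "out" directory.
        cc.writeFile("out");
    }
}
With ASM you work at a lower level, implementing ClassVisitor/MethodVisitor callbacks and emitting the rewritten instructions yourself.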
It's probably fair to say that Java wasn't designed with this purpose in mind, but you can do it potentially. How and when depends a little on the ultimate aim of the exercise. A couple of options:
At the source code level, you can use the Java Compiler API to compile arbitrary code into a class file (which you can then load).
At the bytecode level, you can write an agent that installs a ClassFileTransformer to arbitrarily alter a class "on the fly" as it is loaded. In practice, if you do this, you will also probably make use of a library such as BCEL (Byte Code Engineering Library) to make manipulating the class easier.
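A rough sketch of the second option (the target class name is a placeholder, and the agent jar additionally needs a Premain-Class entry in its manifest):
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public class RewritingAgent {
    // Invoked by the JVM before main() when started with -javaagent:agent.jar
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                if ("com/example/Foo".equals(className)) {   // placeholder target class
                    // Return rewritten bytecode here (built with BCEL, ASM, ...).
                }
                return null; // null means "leave this class unchanged"
            }
        });
    }
}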
You want to investigate program transformation systems (PTS), which provide general facilities for parsing and transforming languages at the source level. PTS provide rewrite rules that say in effect, "if you see this pattern, replace it by that pattern" using the surface syntax of the target language. This is done using full parsers so the rewrite rule really operates on language syntax and not text; such rewrite rules obviously won't attempt to modify code-like text in comments, unlike tools based on regexps.
Our DMS Software Reengineering Toolkit is one of these. It provides not only the usual parsing, AST building and prettyprinting (reproducing compilable source code complete with comments), but also supports symbol tables and control and data flow analysis. These are needed for almost any interesting transformations. DMS also has front ends for a variety of dialects of Java as well as many other languages.
Bytecode transformers exist because they are much easier to build; it is pretty easy to "parse" bytecode. Of course, you can't make permanent source changes with a bytecode transformer, so it is a lot less useful.
You mean like this?
String script1 = "println(\"OK!\");";
eval( script1 );
script1 += "println(\"... well, maybe NOT OK after all\");";
eval( script1 );
Output:
OK!
OK!
... well, maybe NOT OK after all
... use a scripting extension to Java. Groovy and other things like that would probably allow you to do what you want. I've written a scripting extension which integrates with Java through reflection almost seamlessly myself; contact me if you're interested in the details.
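If you want to stay within the standard library, the javax.script API gives the same eval-style workflow; which engines are actually available (Groovy, JavaScript, ...) depends on what is on the classpath, so treat this as a sketch:
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class EvalDemo {
    public static void main(String[] args) throws Exception {
        // Returns null if no Groovy engine (e.g. groovy-jsr223) is on the classpath.
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("groovy");

        String script = "println('OK!')";
        engine.eval(script);

        script += "\nprintln('... well, maybe NOT OK after all')";
        engine.eval(script);
    }
}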

Comparison of two Java classes

I have two Java classes that are very similar in semantics but differ in syntax. The differences are minor, like:
Changes in variable names,
Changes in position of some statements (with no dependent lines in between),
Extra imports, etc.
I need to compare these two classes to prove that they are indeed semantically identical. The same needs to be done for a large number of Java file pairs.
The first approach of reading from the two files and comparing the lines, with logic to deal with the differences mentioned above seems inefficient. Is there some other way that I can achieve this task? Any helpful APIs out there?
Compile both of the classes without debug information and then decompile them back to source files. The decompiled files should be a lot more similar than the original source files.
You can improve this further by running some optimizations on the compiled files. For example, you can use ProGuard with just shrinking enabled to remove unused code.
Changes in position of some statements can be hard to detect though.
If you want to examine the changes in the code try Araxis Merge or WinMerge.
But if you want logical differences, I am afraid you might have to do it manually.
I would advise to use one of these tools to look for textual changes and then look for logical differences.
There are a lot of similarity checkers out there, and as yet there is no perfect tool for this. Each has its own advantages and disadvantages. The approaches generally fall into two categories: token-based or tree-based.
Token-based similarity checking is usually done with regular expressions, but other approaches are possible. In one of my projects at university, we developed one utilizing an alignment strategy from the bioinformatics field. The main disadvantage of this technique shows when the sizes of the two sources aren't more or less equal.
The tree-based approach is more like a compiler, so using some compilation techniques it is normally possible (well, more or less) to check for this. Its disadvantage is that the comparison complexity can be exponential.
Comparing line by line won't work. I think you may need to use a parser. I would suggest that you take a look at ANTLR. It should have a Java grammar into which you could put your actions that will do the comparison.
As far as I know there's no way to compare the semantics of two Java classes. Take for example the following two methods:
public String m1(String a, int b) { ... }
and
public String m2(String x, int y) { ... }
Apart from changes in variable and method names, their signature is the same: same return type, and same input types. However, this is no guarantee that the two methods are semantically equivalent. For example, m1 could return a string consisting of the first b characters of a, while m2 could return a string consisting of y repetitions of x. As you can see, although only variables and names change, the semantics of the two methods is totally different.
I don't see an easy way out for your problem. You can perhaps make some assumption and try the following approach:
assume that the method names in the two classes are the same
write test cases (for example with JUnit) for all the methods in the first class
run the test cases on the second class
ensure that the second class does not have other (untested) methods (for example using reflection)
This approach gives you an idea about equivalent semantics, but it makes strong assumptions.
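A rough, hedged sketch of steps 2 and 3 (all names here are hypothetical): drive both classes through the same call via reflection and compare the results. It assumes both classes have a no-arg constructor.
import java.lang.reflect.Method;

public class BehaviouralCheck {
    // Call the same method, with the same arguments, on an instance of each
    // class and fail if the results differ.
    static void assertSameResult(Class<?> first, Class<?> second, String methodName,
                                 Class<?>[] paramTypes, Object[] args) throws Exception {
        Object a = first.getDeclaredConstructor().newInstance();
        Object b = second.getDeclaredConstructor().newInstance();

        Object r1 = first.getMethod(methodName, paramTypes).invoke(a, args);
        Object r2 = second.getMethod(methodName, paramTypes).invoke(b, args);

        if (r1 == null ? r2 != null : !r1.equals(r2)) {
            throw new AssertionError(methodName + " differs: " + r1 + " vs " + r2);
        }
    }
}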
As a final remark, let me add that specifying the semantics of programs is an interesting and open research topic. Some interesting developments in this area include research on Semantic Web Services. A widely adopted approach to giving machine-processable semantics to programs is that of specifying their IOPE: Input and Output types (as in the Java methods above), and their Preconditions and Effects. Preconditions are essentially logical conditions that must hold true for successfully invoking the program, and Effects are formal descriptions of the changes (in the state of the world) caused by the successful execution of the program. Even with IOPE there are a lot of problems ... which I skip in this short description.

Metaprogramming - self explanatory code - tutorials, articles, books [closed]

I am looking into improving my programming skills (actually I try to do my best to suck less each year, as our Jeff Atwood put it), so I was thinking of reading up on metaprogramming and self-explanatory code.
I am looking for something like an idiot's guide to this (free books for download, online resources). I also want more than your average wiki page, and something language-agnostic or, preferably, with Java examples.
Do you know of such resources that would allow me to efficiently put all of this into practice? (I know experience has a lot to say in all of this, but I kind of want to build experience while avoiding the flow of bad decisions - experience - good decisions.)
EDIT:
Something of the likes of this example from the Pragmatic Programmer:
...implement a mini-language to control a simple drawing package... The language consists of single-letter commands. Some commands are followed by a single number. For example, the following input would draw a rectangle:
P 2 # select pen 2
D # pen down
W 2 # draw west 2cm
N 1 # then north 1
E 2 # then east 2
S 1 # then back south
U # pen up
Thank you!
Welcome to the wonderful world of meta-programming :) Meta-programming actually relates to many things. I will try to list what comes to mind:
Macro. The ability to extend the syntax and semantics of a programming language was explored first under the terminology "macro". Several languages have constructs which resemble macros, but the language of choice is of course Lisp. If you are interested in meta-programming, understanding Lisp and its macro system (and the homoiconic nature of the language, where code and data have the same representation) is definitely a must. If you want a Lisp dialect that runs on the JVM, go for Clojure. A few resources:
Clojure mini language
Beating the Averages (why Lisp is a secret weapon)
There is otherwise plenty of resource about Lisp.
DSL. The ability to extend one language's syntax and semantics is now rebranded under the term "DSL". The easiest way to create a DSL is with the interpreter pattern. Then come internal DSLs with fluent interfaces and external DSLs (as per Fowler's terminology). Here is a nice video I watched recently:
DSL: what, why, how
The other answers already pointed to resources in this area.
Reflection. Meta-programming is also inseparable from reflection. The ability to reflect on the program structure at run-time is immensely powerful. It's important then to understand what introspection, intercession and reification are. IMHO, reflection permits two broad categories of things: 1. the manipulation of data whose structure is not known at compile time (the structure of the data is then provided at run-time and the program still works, reflectively); 2. powerful programming patterns such as dynamic proxies, factories, etc. Smalltalk is the language of choice to explore reflection, where everything is reflective. But I think Ruby is also a good candidate for that, with a community that leverages meta-programming (but I don't know much about Ruby myself).
Smalltalk: a reflective language
Magritte: a meta-driven approach to empower developers and end-users
There is also a rich literature on reflection.
Annotations. Annotations could be seen as a subset of the reflective capabilities of a language, but I think they deserve their own category. I already answered once what annotations are and how they can be used. Annotations are meta-data that can be processed at compile-time or at run-time. Java has good support for them with the annotation processor tool, the Pluggable Annotation Processing API, and the mirror API (a minimal processor sketch follows a little further below).
Byte-code or AST transformation. This can be done at compile-time or at run-time. This is a somewhat low-level approach, but it can also be considered a form of meta-programming. (In a sense, it's the same as macros for non-homoiconic languages.)
DSL with Groovy (There is an example at the end that shows how you can plug your own AST transformation with annotations).
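As a small, hedged illustration of the annotation-processing route mentioned above, here is a minimal processor; the @Audited annotation is invented for the example:
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.annotation.processing.SupportedSourceVersion;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

@SupportedAnnotationTypes("com.example.Audited")      // hypothetical annotation
@SupportedSourceVersion(SourceVersion.RELEASE_8)
public class AuditedProcessor extends AbstractProcessor {
    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (TypeElement annotation : annotations) {
            for (Element element : roundEnv.getElementsAnnotatedWith(annotation)) {
                // Here the processor can inspect the element, report messages,
                // or generate new source files through the Filer.
                processingEnv.getMessager().printMessage(Diagnostic.Kind.NOTE,
                        "found @Audited on " + element.getSimpleName());
            }
        }
        return false; // don't claim the annotation; let other processors see it too
    }
}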
Conclusion: Meta-programming is the ability for a program to reason about itself or to modify itself, just like meta Stack Overflow is the place to ask questions about Stack Overflow itself. Meta-programming is not one specific technique, but rather an ensemble of concepts and techniques.
Several things fall under the umbrella of meta-programming. From your question, you seem more interested in the macro/DSL part. But everything is ultimately related, so the other aspects of meta-programming are also definitely worth looking at.
PS: I know that most of the links I've provided are not tutorials or introductory articles. These are resources that I like which describe the concept and the advantages of meta-programming, which I think is more interesting.
I've mentioned C++ template metaprogramming in my comment above. Let me therefore provide a brief example using C++ template meta-programming. I'm aware that you tagged your question with java, yet this may be insightful. I hope you will be able to understand the C++ code.
Demonstration by example:
Consider the following recursive function, which generates the Fibonacci series (0, 1, 1, 2, 3, 5, 8, 13, ...):
unsigned int fib(unsigned int n)
{
    return n >= 2 ? fib(n-2) + fib(n-1) : n;
}
To get an item from the Fibonacci series, you call this function -- e.g. fib(5) --, and it will compute the value and return it to you. Nothing special so far.
But now, in C++ you can re-write this code using templates (somewhat similar to generics in Java) so that the Fibonacci series won't be generated at run-time, but during compile-time:
// fib(n) := fib(n-2) + fib(n-1)
template <unsigned int n>
struct fib // <-- this is the generic version fib<n>
{
    static const unsigned int value = fib<n-2>::value + fib<n-1>::value;
};

// fib(0) := 0
template <>
struct fib<0> // <-- this overrides the generic fib<n> for n = 0
{
    static const unsigned int value = 0;
};

// fib(1) := 1
template <>
struct fib<1> // <-- this overrides the generic fib<n> for n = 1
{
    static const unsigned int value = 1;
};
To get an item from the Fibonacci series using this template, simply retrieve the constant value -- e.g. fib<5>::value.
Conclusion ("What does this have to do with meta-programming?"):
In the template example, it is the C++ compiler that generates the Fibonacci series at compile-time, not your program while it runs. (This is obvious from the fact that in the first example, you call a function, while in the template example, you retrieve a constant value.) You get your Fibonacci numbers without writing a function that computes them! Instead of programming that function, you have programmed the compiler to do something for you that it wasn't explicitly designed for... which is quite remarkable.
This is therefore one form of meta-programming:
Metaprogramming is the writing of computer programs that write or manipulate other programs (or themselves) as their data, or that do part of the work at compile time that would otherwise be done at runtime.
-- Definition from the Wikipedia article on metaprogramming, emphasis added by me.
(Note also the side-effects in the above template example: As you make the compiler pre-compute your Fibonacci numbers, they need to be stored somewhere. The size of your program's binary will increase proportionally to the highest n that's used in expressions containing the term fib<n>::value. On the upside, you save computation time at run-time.)
From your example, it seems you are talking about domain specific languages (DSLs), specifically, Internal DSLs.
Here is a large list of books about DSL in general (about DSLs like SQL).
Martin Fowler has a book that is a work in progress and is currently online.
Ayende wrote a book about DSLs in boo.
Update: (following comments)
Metaprogramming is about creating programs that control other programs (or their data), sometimes using a DSL. In this respect, batch files and shell scripts can be considered to be metaprogramming as they invoke and control other programs.
The example you have shows a DSL that may be used by a metaprogram to control a painting program.
Tcl started out as a way of making domain-specific languages that didn't suck when they grew in complexity to the point where they needed generic programming capabilities. Moreover, it remains very easy to add your own commands, precisely because that's still an important use-case for the language.
If you're wanting an implementation integrated with Java, Jacl is an implementation of Tcl in Java which provides scriptability focused towards DSLs, and also access to any Java object.
(Metaprogramming is writing programs that write programs. Some languages do it far more than others. To pick up on a few specific cases, Lisp is the classic example of a language that does a lot of metaprogramming; C++ tends to relegate it to templates rather than permitting it at runtime; scripting languages all tend to find metaprogramming easier because their implementations are written to be more flexible that way, though that's just a matter of degree.)
Well, in the Java ecosystem, I think the simplest way to implement a mini-language is to use scripting languages like Groovy or Ruby (yes, I know, Ruby is not a native citizen of the Java ecosystem). Both offer rather good DSL specification mechanisms that will allow you to do that far more simply than the Java language would:
Writing DSL in Groovy
Creating Ruby DSL
There are, however, pure Java alternatives, but I think they'll be a little harder to implement.
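For instance, a bare-bones pure-Java interpreter for the drawing mini-language quoted in the question might look like the following sketch; the Pen interface and its methods are invented here:
import java.util.Scanner;

public class DrawingInterpreter {

    // The drawing "device" the mini-language drives; invented for this sketch.
    interface Pen {
        void select(int pen);
        void down();
        void up();
        void draw(int dx, int dy);
    }

    public static void run(String program, Pen pen) {
        for (String line : program.split("\n")) {
            String code = line.split("#", 2)[0].trim();      // strip "# ..." comments
            if (code.isEmpty()) continue;
            Scanner in = new Scanner(code);
            String command = in.next();
            switch (command) {
                case "P": pen.select(in.nextInt()); break;     // select pen
                case "D": pen.down(); break;                   // pen down
                case "U": pen.up(); break;                     // pen up
                case "W": pen.draw(-in.nextInt(), 0); break;   // draw west
                case "E": pen.draw(in.nextInt(), 0); break;    // draw east
                case "N": pen.draw(0, in.nextInt()); break;    // draw north
                case "S": pen.draw(0, -in.nextInt()); break;   // draw south
                default: throw new IllegalArgumentException("unknown command: " + command);
            }
        }
    }
}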
You can have a look at the Eclipse Modeling Project; they've got support for meta-models.
There's a course on Pluralsight about Metaprogramming which might be a good entry point https://app.pluralsight.com/library/courses/understanding-metaprogramming/table-of-contents
