Java LR or LL Parsing - java

A teacher of mine said that Java cannot be LL parsed.
I don't understand this and wonder whether it is true.
I searched for a grammar of Java 8 and found this: https://github.com/antlr/grammars-v4/blob/master/java8/Java8.g4
But even when I try to analyze the grammar, I don't see the problem for LL parsing.
Does anyone know whether this is true, know of a formal proof, or can simply explain to me why it should not be possible to find a grammar for Java that can be LL parsed?
Thanks a lot, guys and girls.

The Java Language Specification for Java 7 says that its grammar is not LL(1):
The grammar presented in this chapter is the basis for the
reference implementation. Note that it is not an LL(1) grammar, though
in many cases it minimizes the necessary look ahead.
A grammar is not LL(1) if it contains either:
left recursion, or
an alternative (A|B) in which two or more alternatives share FIRST symbols, that is, FIRST(A) has one or more symbols that are also in FIRST(B).
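The FIRST-set condition can be checked mechanically. Below is a minimal sketch, assuming the FIRST sets have already been computed; the class name FirstSetCheck and the two simplified FIRST sets are made up for illustration and are not taken from the Java grammar.

```java
import java.util.HashSet;
import java.util.Set;

class FirstSetCheck {
    // Returns the tokens that appear in both FIRST sets.
    // A non-empty result means one token of lookahead cannot choose between the alternatives.
    static Set<String> conflict(Set<String> firstA, Set<String> firstB) {
        Set<String> common = new HashSet<>(firstA);
        common.retainAll(firstB);
        return common;
    }

    public static void main(String[] args) {
        // Hypothetical, heavily simplified FIRST sets for
        // statement : declaration | expressionStatement
        Set<String> firstDeclaration = Set.of("int", "final", "IDENTIFIER");
        Set<String> firstExpressionStatement = Set.of("IDENTIFIER", "(", "++");
        System.out.println(conflict(firstDeclaration, firstExpressionStatement));
    }
}
```

In this toy example both alternatives can start with an identifier (think of "Foo x;" versus "foo = 1;"), so a parser with a single token of lookahead cannot pick one.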

I think it's due to left recursion. LL parsers cannot handle left recursion, and the current Java grammar, at least for Java 7, uses it in some places.
Of course, it is well known that one can construct an equivalent grammar that gets rid of left recursion, but the grammar as currently specified cannot be LL parsed.
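As a small illustration of that rewriting, here is a sketch using a toy rule rather than anything from the Java grammar. The left-recursive rule expr : expr '-' NUM | NUM would make a recursive descent (LL) parser call itself forever without consuming input, while the equivalent rule expr : NUM ('-' NUM)* can be parsed with a simple loop. The class name and the whitespace-separated token format are assumptions made for the example.

```java
class LeftRecursionDemo {
    private final String[] tokens;
    private int pos = 0;

    LeftRecursionDemo(String input) {
        // Tokens are assumed to be separated by whitespace, e.g. "10 - 3 - 2".
        this.tokens = input.trim().split("\\s+");
    }

    // expr : NUM ('-' NUM)*
    // Accepts the same language as the left-recursive  expr : expr '-' NUM | NUM
    // and keeps '-' left-associative by accumulating from left to right.
    int expr() {
        int value = num();
        while (pos < tokens.length && tokens[pos].equals("-")) {
            pos++; // consume '-'
            value -= num();
        }
        return value;
    }

    private int num() {
        if (pos >= tokens.length) {
            throw new IllegalStateException("number expected at end of input");
        }
        return Integer.parseInt(tokens[pos++]);
    }

    static int evaluate(String input) {
        LeftRecursionDemo parser = new LeftRecursionDemo(input);
        int value = parser.expr();
        if (parser.pos != parser.tokens.length) {
            throw new IllegalStateException("unexpected token: " + parser.tokens[parser.pos]);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("10 - 3 - 2")); // prints 5, i.e. (10 - 3) - 2
    }
}
```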

Related

Why does Java use -> instead of => for lambda functions?

I am a .NET and JavaScript developer. Now I am working in Java, too.
In .NET LINQ and JavaScript arrow functions we have =>.
I know Java lambdas are not the same, but they are very similar. Are there any reasons (technical or non-technical) that made Java choose -> instead of =>?
On September 8, 2011, Brian Goetz of Oracle announced to the OpenJDK mailing list that the syntax for lambdas in Java had been mostly decided, but some of the "fine points" like which type of arrow to use were still up in the air:
This just in: the EG has (mostly) made a decision on syntax.
After considering a number of alternatives, we decided to essentially
adopt the C# syntax. We may still deliberate further on the fine points
(e.g., thin arrow vs fat arrow, special nilary form, etc), and have not
yet come to a decision on method reference syntax.
On September 27, 2011, Brian posted another update, announcing that the -> arrow would be used, in preference to C#'s (and the Java prototype's) usage of =>:
Update on syntax: the EG has chosen to stick with the -> form of the
arrow that the prototype currently uses, rather than adopt the =>.
He goes on to provide some description of the rationale considered by the committee:
You could think of this in two ways (I'm sure I'll hear both):
This is much better, as it avoids some really bad interactions with existing operators, such as:
x => x.age <= 0; // duelling arrows
or
Predicate p = x => x.size == 0; // duelling equals
What a bunch of idiots we are, in that we claimed the goal of doing what other languages did, and then made gratuitous changes "just for the sake of doing something different".
Obviously we don't think we're idiots, but everyone can have an opinion :)
In the end, this was viewed as a small tweak to avoid some undesirable
interactions, while preserving the overall goal of "mostly looks like
what lambdas look like in other similar languages."
Howard Lovatt replied in approval of the decision to prefer ->, writing that he "ha[s] had trouble reading Scala code". Paul Benedict of Apache concurred:
I am glad too. Being consistent with other languages is a laudable goal, but
since programming languages aren't identical, the needs for Java can lead to
a different conclusion. The fat arrow syntax does look odd; I admit it. So
in terms of vanity, I am glad to see that punted. The equals character is
just too strongly associated with assignment and equality.
Paigan Jadoth chimed in, too:
I find the "->" much better than "=>". If arrowlings at all instead of the
more regular "#(){...}" pattern, then something definitely distinct from the
gte/lte tokens is clearly better. And "because the others do that" has never
been a good argument, anyway :D.
In summary, then, after considering arguments on both sides, the committee felt that consistency with other languages (=> is used in Scala and C#) was less compelling than clear differentiation from the equality operators, which made -> win out.
But Lieven Lemiengre was skeptical:
Other languages (such as Scala or Groovy) don't have this problem because
they support some placeholder syntax.
In reality you don't write "x => x.age <= 0;"
But this is very common "someList.partition(x => x.age <= 18)" and I agree
this looks bad. Other languages make this clearer using placeholder syntax
"someList.partition(_.age <= 18)" or "someList.partition(it.age <= 18)"
I hope you are considering something like this, these little closures will
be used a lot!
(And I don't think replacing '=>' with '->' will help a lot)
Other than Lieven, I didn't see anyone who criticized the choice of -> and defended => replying on that mailing list. Of course, as Brian predicted, there were almost certainly opinions on both sides, but ultimately, a choice just has to be made in these types of matters, and the committee made the one they did for the stated reasons.
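For comparison, here is roughly how the two "duelling" examples from Brian's message read with the thin arrow that was adopted. This is only an illustrative sketch; the Person class and the constant names are made up.

```java
import java.util.function.Predicate;

class ArrowDemo {
    // Hypothetical type standing in for whatever "x" is in the mailing-list example.
    static final class Person {
        final int age;
        Person(int age) { this.age = age; }
    }

    // With "->" the arrow no longer looks like the "<=" next to it.
    static final Predicate<Person> NOT_POSITIVE_AGE = x -> x.age <= 0;

    // Likewise, "->" is easy to tell apart from "=" and "==".
    static final Predicate<String> IS_EMPTY = s -> s.length() == 0;

    public static void main(String[] args) {
        System.out.println(NOT_POSITIVE_AGE.test(new Person(0))); // prints true
        System.out.println(IS_EMPTY.test("abc"));                 // prints false
    }
}
```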

How can I add variables inside a Java 15 text block? [duplicate]

I just came across a new feature in Java 15, text blocks. I assume that a variable can be added inside a text block by concatenating with the "+" operator, as below:
String html = """
<html>
<body>
<p>Hello, """+strA+"""</p>
</body>
</html>
""";
But do they provide a way to add variables in the style that is becoming popular in many other languages, as below:
String html = """
<html>
<body>
<p>Hello, ${strA}</p>
</body>
</html>
""";
This question might sound silly, but it may be useful in certain scenarios.
Java 15 does not support interpolation directly in either text blocks or plain string literals.
The solution in Java 15 is to use the String.formatted() method:
String html = """
<html>
<body>
<p>Hello, %s</p>
</body>
</html>
""".formatted(strA);
From the spec for text blocks:
Text blocks do not directly support string interpolation.
Interpolation may be considered in a future JEP.
"String interpolation" meaning
evaluating a string literal containing one or more placeholders,
yielding a result in which the placeholders are replaced with their
corresponding values
from Wikipedia
As stated above, maybe we'll get it in the future. Though it is difficult to say how they could possibly implement that without breaking backwards compatibility -- what happens if my string contains ${}, for example? The Java language designers rarely add anything that is likely to break backwards compatibility.
It seems to me that they would be better off either supporting it immediately, or never.
Maybe it would be possible with a new kind of text block. Rather than the delimiter being """, they could use ''' to denote a parameterized text block, for example.
As already discussed, this is not possible in JDK 15, and you cannot change that fact.
But I suppose you are suggesting something like this feature of the C# language:
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/tokens/interpolated
Although this is just syntactic sugar over the string.Format() method in C# (the counterpart of String.format() in Java), it would be nice to have it in Java. It would be an extension to the existing syntax for string literals, and it could of course easily be adapted to the text block specification as well.
If this is what you have in mind, you can make a proposal to the Java Community Process to extend the Java Language Specification. This is a much lighter syntax/semantics enhancement than adding a full-featured template engine to the Java compiler/runtime specification, and it is possible that they would agree with you.
As user @Michael mentioned: no. 'They' (team Project Amber, who are implementing JEP 368) are not providing any way to interpolate the string in the text block.
Note that I somewhat doubt it'll ever happen. For starters, there is the backwards compatibility issue; any attempt to introduce interpolation requires some marker, so that existing text blocks don't suddenly change meaning depending on which version of javac you invoke.
But more to the point, you yourself, asking the question, can't even come up with a valid example, which is perhaps indicative that this feature is less useful than it sounds. It looks like you came up with a valid use case, but that's not actually true: If what you wrote would compile and work, then you just wrote a webapp with a rather serious XSS security leak in it!
The point is, what you really want is 'templating', and whilst templating sounds really simple (just evaluate this expression, then shove the result into the string right where I typed the expression, please!), it just isn't. Escaping is a large reason for that. But you can't blanket-apply the rule that ${strA} in a text block means "evaluate the expression strA, then HTML-escape that, then put it in", for two reasons: who says that the string you're interpolating things into is HTML and not, say, JSON or TOML or CSV or whatnot, and who says that the interpolation I desire requires escaping in the first place? What if I want to dynamically inject <em> or not, and I don't want this to turn into &lt;em&gt;?
Either we update the langspec to cater to all these cases, and now we're inventing an entire templating system and shoving it into a lang spec, which seems like a job far better suited to a dedicated library, or we don't, and the feature seems quite useful but is in fact niche: either you rarely use it, or you have security and other bugs all over your code base. Any lang feature that invites abuse is (and I'd hope one would agree with me on this) not a great feature.
Yes, many languages have this, but the current folks who get to decide what java language features make it into future versions of the language seem to be in the phase that they acknowledge such features exist and will learn lessons from it, but won't add features to java 'just because all these other languages all have it' - some thought and use cases are always considered first, and any such analysis of interpolation on string literals probably leads to: "Eh, probably not a worthwhile addition to the language".
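To make the escaping concern concrete, here is a minimal sketch that combines the formatted() approach from the earlier answer with explicit HTML escaping. The escapeHtml helper is hypothetical and deliberately simplistic; real code would more likely rely on a vetted escaping or templating library. It needs Java 15 or later for text blocks.

```java
class TextBlockEscaping {
    // Hypothetical, minimal HTML escaper written for this illustration only.
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")   // must come first so later entities are not re-escaped
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    // Escapes the untrusted value explicitly, then substitutes it into the text block.
    static String greeting(String name) {
        return """
                <p>Hello, %s</p>
                """.formatted(escapeHtml(name));
    }

    public static void main(String[] args) {
        // The markup in the input ends up as visible text instead of being interpreted as HTML.
        System.out.print(greeting("<script>alert(1)</script>"));
    }
}
```

The escaping step is specific to HTML and is chosen by the caller, which is exactly the decision a built-in ${...} syntax could not make on its own.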

What language is this (think it's Java?), and how do I test (using a browser ide) the math is correct in it?

div(1, sum(1, exp(sum(div(5, product(100, .1)), -5))))
I'm using this in a Solr query, and want to verify that it is the same as :
Where x is 5.
Is this language Java?
If it is, why am I getting this output here:
http://ideone.com/LWYWtU
If it isn't, what language is this and how do I test it?
Thanks in advance for your help.
EDIT: To add more of the surrounding code, here is the full boost value I'm sending to Solr:
if(exists(query({!frange l=0 u=60 v=product(geodist(),0.621371)})),div(1, sum(1, exp(sum(div(product(5), product(100, .1)), -5)))),0)
The reason I think it might be Java is that the docs say "Most Java Math functions are now supported, including:" and then list the math functions I ended up using in this code.
Solr is written in Java, but that's not relevant here, since this is a set of functions that Solr parses and evaluates itself (it is not related to Java, except that the backing functions are implemented in Java).
As far as I can tell, you've mapped the functions correctly, as long as the 5 in product(5) is the same as x. You shouldn't need product there, as the value can be passed to div directly, as far as I can see.
A way to validate it would be to use debugQuery in Solr, see what value the expression evaluates to, and then compare it to your own value. Remember that floating point evaluation can introduce small inaccuracies.
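For an independent cross-check outside Solr, the nested expression can be transcribed by hand into plain Java. This is only a sketch: it assumes div, sum, product and exp mean ordinary division, addition, multiplication and e^x, and it uses doubles, so the last digits may differ from what Solr reports.

```java
class SolrBoostCheck {
    // Hand transcription of: div(1, sum(1, exp(sum(div(x, product(100, .1)), -5))))
    static double boost(double x) {
        double inner = x / (100 * 0.1) + -5;   // sum(div(x, product(100, .1)), -5)
        return 1.0 / (1.0 + Math.exp(inner));  // div(1, sum(1, exp(...)))
    }

    public static void main(String[] args) {
        // For x = 5 this prints a value of roughly 0.989.
        System.out.println(boost(5));
    }
}
```

The result can then be compared with the value shown in the debugQuery output.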

Analyse C++ files from a Java program

After several days of research I turn to you.
I am trying to analyse a C++ file in order to:
Count the number of parameters in a method/function
Count the number of lines in a method/function
etc...
To do this I first tried regexes, but without success (there are too many cases to handle, and the regex becomes illegible).
Now I am trying ANTLR4. Unfortunately I cannot seem to find a grammar for C++ (I found a grammar for C here: https://github.com/antlr/grammars-v4).
(I also tried ANTLR3 with this grammar, but it produces C++ code!)
http://www.antlr3.org/grammar/1295920686207/antlr3.2_cpp_parser4.1.0.zip
So do you know where I can find a C++ grammar for ANTLR4?
Or do you know another way to do what I want?
Thank you in advance for your help.
PS: sorry for my English, I'm a French student.
There are some good answers here. If I were you I would use a pre-built parser. Having tried ANTLR, I would say it takes a long time to build anything good with it. Personally I would try Clang.
Clang has a library that builds an AST from which you can get the info you want.
Some existing tools compute such statistics, for example:
cccc
ccccc
...

implementing unification algorithm

I have spent the last 5 days trying to understand how the unification algorithm works in Prolog.
Now I want to implement such an algorithm in Java.
I thought maybe the best way is to manipulate the string and decompose it into its parts using some data structure such as a stack.
To make it clear:
Suppose the user input is:
a(X,c(d,X)) = a(2,c(d,Y)).
I already take it as one string and split it into two strings (Expression1 and Expression2).
Now, how can I know whether the next char(s) form a variable, a constant, etc.?
I can do it with nested ifs, but that does not seem like a good solution to me.
I tried to use inheritance, but the problem remains (how can I know the type of the chars being read?).
First you need to parse the inputs and build expression trees. Then apply Milner's unification algorithm (or some other unification algorithm) to figure out the mapping of variables to constants and expressions.
A really good description of Milner's algorithm may be found in the Dragon Book: "Compilers: Principles, Techniques and Tools" by Aho, Sethi and Ullman. (Milner's algorithm can also cope with unification of cyclic graphs, and the Dragon Book presents it as a way to do type inference.) By the sounds of it, you could benefit from learning a bit about parsing, which is also covered by the Dragon Book.
EDIT: Other answers have suggested using a parser generator; e.g. ANTLR. That's good advice, but (judging from your example) your grammar is so simple that you could also get by with using StringTokenizer and a hand-written recursive descent parser. In fact, if you've got the time (and inclination) it is worth implementing the parser both ways as a learning exercise.
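A hand-written recursive descent parser for terms like the ones in the question could look roughly like the sketch below. It scans characters directly instead of using StringTokenizer, handles only names, parentheses and commas, and follows the Prolog convention that a name starting with an upper-case letter or an underscore is a variable. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class TermParser {
    // A parsed term: a name plus zero or more argument terms.
    static final class Node {
        final String name;
        final List<Node> args;

        Node(String name, List<Node> args) {
            this.name = name;
            this.args = args;
        }

        boolean isVariable() {
            char first = name.charAt(0);
            return args.isEmpty() && (Character.isUpperCase(first) || first == '_');
        }

        @Override
        public String toString() {
            return args.isEmpty() ? name : name + args;
        }
    }

    private final String src;
    private int pos = 0;

    private TermParser(String src) {
        this.src = src.replaceAll("\\s+", ""); // whitespace carries no meaning in this tiny grammar
    }

    static Node parse(String text) {
        TermParser parser = new TermParser(text);
        Node node = parser.term();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("unexpected input at position " + parser.pos);
        }
        return node;
    }

    // term : NAME ( '(' term (',' term)* ')' )?
    private Node term() {
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '_')) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("name expected at position " + pos);
        }
        String name = src.substring(start, pos);
        List<Node> args = new ArrayList<>();
        if (pos < src.length() && src.charAt(pos) == '(') {
            do {
                pos++; // skip '(' or ','
                args.add(term());
            } while (pos < src.length() && src.charAt(pos) == ',');
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("')' expected at position " + pos);
            }
            pos++; // skip ')'
        }
        return new Node(name, args);
    }

    public static void main(String[] args) {
        System.out.println(parse("a(X,c(d,X))")); // prints a[X, c[d, X]]
    }
}
```

Each character is classified in exactly one place (the term method), which avoids the deeply nested if statements mentioned in the question.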
It sounds like this problem is more to do with parsing than unification specifically. Using something like ANTLR might help in terms of turning the original string into some kind of tree structure.
(It's not quite clear what you mean by "do it by nested if", but if you mean that you're doing something like trying to read an expression, and recursing when meeting each "(", then that's actually one of the right ways to do it -- this is at heart what the code that ANTLR generates for you will do.)
If you are more interested in the mechanics of unifying things than you are in parsing, then one perfectly good way to do this is to construct the internal representation in code directly, and put off the parsing aspect for now. This can get a bit annoying during development, as your Prolog-style statements are now a rather verbose set of Java statements, but it lets you focus on one problem at a time, which is usually helpful.
(If you structure things this way, this should make it straightforward to insert a proper parser later, that will produce the same sort of tree as you have until then been constructing by hand. This will let you attack the two problems separately in a reasonably neat fashion.)
Before you get to do the semantics of the language, you have to convert the text into a form that's easy to operate on. This process is called parsing and the semantic representation is called an abstract syntax tree (AST).
A simple recursive descent parser for Prolog might be hand-written, but it's more common to use a parser toolkit such as Rats! or ANTLR.
In an AST for Prolog, you might have a class Term, with CompoundTerm, Variable, and Atom all being Terms. Polymorphism allows the arguments to a compound term to be any Term.
Your unification algorithm then becomes unifying the name of any compound term, and recursively unifying the value of each argument of corresponding compound terms.
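A minimal sketch of such a class hierarchy and a recursive unify follows. It builds the terms directly in code rather than parsing them, treats the number 2 as an atom for simplicity, and omits the occurs check; it is a plain textbook-style unification and not necessarily Milner's algorithm. All class and method names are made up for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Unifier {
    interface Term {}

    static final class Var implements Term {
        final String name;
        Var(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    static final class Atom implements Term {
        final String name;
        Atom(String name) { this.name = name; }
        @Override public String toString() { return name; }
    }

    static final class Compound implements Term {
        final String functor;
        final List<Term> args;
        Compound(String functor, List<Term> args) { this.functor = functor; this.args = args; }
        @Override public String toString() { return functor + args; }
    }

    // Follows variable bindings until an unbound variable or a non-variable term is reached.
    static Term walk(Term t, Map<String, Term> subst) {
        while (t instanceof Var && subst.containsKey(((Var) t).name)) {
            t = subst.get(((Var) t).name);
        }
        return t;
    }

    // Tries to unify a and b, recording bindings in subst.
    // No occurs check is done, and on failure subst may contain partial bindings.
    static boolean unify(Term a, Term b, Map<String, Term> subst) {
        a = walk(a, subst);
        b = walk(b, subst);
        if (a instanceof Var && b instanceof Var && ((Var) a).name.equals(((Var) b).name)) {
            return true; // the same unbound variable on both sides
        }
        if (a instanceof Var) { subst.put(((Var) a).name, b); return true; }
        if (b instanceof Var) { subst.put(((Var) b).name, a); return true; }
        if (a instanceof Atom && b instanceof Atom) {
            return ((Atom) a).name.equals(((Atom) b).name);
        }
        if (a instanceof Compound && b instanceof Compound) {
            Compound ca = (Compound) a;
            Compound cb = (Compound) b;
            if (!ca.functor.equals(cb.functor) || ca.args.size() != cb.args.size()) {
                return false;
            }
            for (int i = 0; i < ca.args.size(); i++) {
                if (!unify(ca.args.get(i), cb.args.get(i), subst)) {
                    return false;
                }
            }
            return true;
        }
        return false; // e.g. an atom against a compound term
    }

    public static void main(String[] args) {
        // a(X, c(d, X)) = a(2, c(d, Y))
        Term left = new Compound("a", List.of(new Var("X"),
                new Compound("c", List.of(new Atom("d"), new Var("X")))));
        Term right = new Compound("a", List.of(new Atom("2"),
                new Compound("c", List.of(new Atom("d"), new Var("Y")))));
        Map<String, Term> subst = new HashMap<>();
        System.out.println(unify(left, right, subst));   // prints true
        System.out.println(walk(new Var("X"), subst));   // prints 2
        System.out.println(walk(new Var("Y"), subst));   // prints 2
    }
}
```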
