Why does Java use -> instead of => for lambda functions? - java

I am a .NET and JavaScript developer. Now I am working in Java, too.
In .NET LINQ and JavaScript arrow functions we have =>.
I know Java lambdas are not the same, but they are very similar. Are there any reasons (technical or non technical) that made java choose -> instead of =>?

On September 8, 2011, Brian Goetz of Oracle announced to the OpenJDK mailing list that the syntax for lambdas in Java had been mostly decided, but some of the "fine points" like which type of arrow to use were still up in the air:
This just in: the EG has (mostly) made a decision on syntax.
After considering a number of alternatives, we decided to essentially
adopt the C# syntax. We may still deliberate further on the fine points
(e.g., thin arrow vs fat arrow, special nilary form, etc), and have not
yet come to a decision on method reference syntax.
On September 27, 2011, Brian posted another update, announcing that the -> arrow would be used, in preference to C#'s (and the Java prototype's) usage of =>:
Update on syntax: the EG has chosen to stick with the -> form of the
arrow that the prototype currently uses, rather than adopt the =>.
He goes on to provide some description of the rationale considered by the committee:
You could think of this in two ways (I'm sure I'll hear both):
This is much better, as it avoids some really bad interactions with existing operators, such as:
x => x.age <= 0; // duelling arrows
or
Predicate p = x => x.size == 0; // duelling equals
What a bunch of idiots we are, in that we claimed the goal of doing what other languages did, and then made gratuitous changes "just for the sake of doing something different".
Obviously we don't think we're idiots, but everyone can have an opinion :)
In the end, this was viewed as a small tweak to avoid some undesirable
interactions, while preserving the overall goal of "mostly looks like
what lambdas look like in other similar languages."
Howard Lovatt replied in approval of the decision to prefer ->, writing that he "ha[s] had trouble reading Scala code". Paul Benedict of Apache concurred:
I am glad too. Being consistent with other languages is a laudable goal, but
since programming languages aren't identical, the needs for Java can lead to
a different conclusion. The fat arrow syntax does look odd; I admit it. So
in terms of vanity, I am glad to see that punted. The equals character is
just too strongly associated with assignment and equality.
Paigan Jadoth chimed in, too:
I find the "->" much better than "=>". If arrowlings at all instead of the
more regular "#(){...}" pattern, then something definitely distinct from the
gte/lte tokens is clearly better. And "because the others do that" has never
been a good argument, anyway :D.
In summary, then, after considering arguments on both sides, the committee felt that consistency with other languages (=> is used in Scala and C#) was less compelling than clear differentiation from the equality operators, which made -> win out.
But Lieven Lemiengre was skeptical:
Other languages (such as Scala or Groovy) don't have this problem because
they support some placeholder syntax.
In reality you don't write "x => x.age <= 0;"
But this is very common "someList.partition(x => x.age <= 18)" and I agree
this looks bad. Other languages make this clearer using placeholder syntax
"someList.partition(_.age <= 18)" or "someList.partition(it.age <= 18)"
I hope you are considering something like this, these little closures will
be used a lot!
(And I don't think replacing '=>' with '->' will help a lot)
Other than Lieven, I didn't see anyone who criticized the choice of -> and defended => replying on that mailing list. Of course, as Brian predicted, there were almost certainly opinions on both sides, but ultimately, a choice just has to be made in these types of matters, and the committee made the one they did for the stated reasons.

Related

What would be the best way to build a Big-O runtime complexity analyzer for pseudocode in a text file?

I am trying to create a class that takes in a string input containing pseudocode and computes its' worst case runtime complexity. I will be using regex to split each line and analyze the worst-case and add up the complexities (based on the big-O rules) for each line to give a final worst-case runtime. The pseudocode written will follow a few rules for declaration, initilization, operations on data structures. This is something I can control. How should I go about designing a class considering the rules of iterative and recursive analysis?
Any help in C++ or Java is appreciated. Thanks in advance.
class PseudocodeAnalyzer
{
public:
string inputCode;
string performIterativeAnalysis(string line);
string performRecursiveAnalysis(string line);
string analyzeTotalComplexity(string inputCode);
}
An example for iterative algorithm: Check if number in a grid is Odd:
1. Array A = Array[N][N]
2. for i in 1 to N
3. for j in 1 to N
4. if A[i][j] % 2 == 0
5. return false
6. endif
7. endloop
8. endloop
Worst-case Time-Complexity: O(n*n)
The concept: "I wish to write a program that analyses pseudocode in order to print out the algorithmic complexity of the algorithm it describes" is mathematically impossible!
Let me try to explain why that is, or how you get around the inevitability that you cannot write this.
Your pseudocode has certain capabilities. You call it pseudocode, but given that you are now trying to parse it, it's still a 'real' language where terms have real meaning. This language is capable of expressing algorithms.
So, which algorithms can it express? Presumably, 'all of them'. There is this concept called a 'turing machine': You can prove that anything a computer can do, a turing machine can also do. And turing machines are very simple things. Therefore, if you have some simplistic computer and you can use that computer to emulate a turing machine, you can therefore use it to emulate a complete computer. This is how, in fundamental informatics, you can prove that a certain CPU or system is capable of computing all the stuff some other CPU or system is capable of computing: Use it to compute a turing machine, thus proving you can run it all. Any system that can be used to emulate a turing machine is called 'turing complete'.
Then we get to something very interesting: If your pseudocode can be used to express anything a real computer can do, then your pseudocode can be used to 'write'... your very pseudocode checker!
So let's say we do just that and stick the pseudocode that describes your pseudocode checker in a function we shall call pseudocodechecker. It takes as argument a string containing some pseudocode, and returns a string such as O(n^2).
You can then write this program in pseudocode:
1. if pseudocodechecker(this-very-program) == O(n^2)
2. If True runSomeAlgorithmThatIsO(1)
3. If False runSomeAlgorithmTahtIsO(n^2)
And this is self-defeating: We have 'programmed' a paradox. It's like "This statement is a lie", or "the set of all sets that do not contain themselves". If it's false it is true and if it is true it false. [Insert GIF of exploding computer here].
Thus, we have mathematically proved that what you want is impossible, unless one of the following is true:
A. Your pseudocode-based checker is incorrect. As in, it will flat out give a wrong answer sometimes, thus solving the paradox: If you feed your program a paradox, it gives a wrong answer. But how useful is such an app? An app where you know the answer it gives may be incorrect?
B. Your pseudocode-based checker is incomplete: The official definition of your pseudocode language is so incapable, you cannot even write a turing machine in it.
That last one seems like a nice solution; but it is quite drastic. It pretty much means that your algorithm can only loop over constant ranges. It cannot loop until a condition is true, for example. Another nice solution appears to be: The program is capable of realizing that an answer cannot be given, and will then report 'no answer available', but unfortunately, with some more work, you can show that you can still use such a system to develop a paradox.
The answer by #rzwitserloot and the ones given in the link are correct. Let me just add that it is possible to compute an approximation both to the halting problem as well as to finding the time complexity of a piece of code (written in a Turing-complete language!). (Compare that to the existence of automated theorem provers for arithmetic and other second order logics, which are undecidable!) A tool that under-approximated the complexity problem would output the correct time complexity for some inputs, and "don't know" for other inputs.
Indeed, the whole wide field of code analyzers, often built into the IDEs that we use every day, more often than not under-approximate decision problems that are uncomputable, e.g. reachability, nullability or value analyses.
If you really want to write such a tool: the basic idea is to identify heuristics, i.e., common patterns for which a solution is known, such as various patterns of nested for-loops with only very basic arithmetic operations manipulating the indices, or simple recursive functions where the recurrence relation can be spotted straight-away. It would actually be not too hard (though definitely not easy!) to write a tool that could solve most of the toy problems (such as the one you posted) that are given as homework to students, and that are often posted as questions here on SO, since they follow a rather small number of patterns.
If you wish to go beyond simple heuristics, the main theoretical concept underlying more powerful code analyzers is abstract interpretation. Applied to your use case, this would mean developing a mapping between code constructs in your language to code constructs in a different language (or simpler code constructs in the same language) for which it is easier to compute the time complexity. This mapping would have to conform to some constraints, in particular, the mapped constructs have have the same or worse time complexity as the original code. Actually, mapping a piece of code to a recurrence relation would be an example of abstract interpretation. So is replacing a line of code with something like "O(1)". So, the task is just to formalize some of the things that we do in our heads anyway when we are analyzing the time complexity of code.

Optional doesn't play well with Java Stream?

I'm teaching classes on the new Java constructs. I've already introduced my students to Optional<T>. Imagining that there is a Point.getQuadrant() which returns an Optional<Quadrant> (because some points lie in no quadrant at all), we can add the quadrant to a Set<Quadrant> if the point lies in a positive X quadrant, like this:
Set<Quadrant> quadrants = ...;
Optional<Point> point = ...;
point
.filter(p -> p.getX() > 0)
.flatMap(Point::getQuadrant)
.ifPresent(quadrants::add);
(That's off the top of my head; let me know if I made a mistake.)
My next step was to tell my students, "Wouldn't it be neat if you could do the same internal filtering and decisions using functions, but with multiple objects? Well that's what streams are for!" It seemed the perfect segue. I was almost ready to write out the same form but with a list of points, using streams:
Set<Point> points = ...;
Set<Quadrant> quadrants = points.stream()
.filter(p -> p.getX() > 0)
.flatMap(Point::getQuadrant)
.collect(Collectors.toSet());
But wait! Will Stream.flatMap(Point::getQuadrant) correctly unravel the Optional<Quadrant>? It appears not...
I had already read Using Java 8's Optional with Stream::flatMap , but I had thought that what was discussed there was some esoteric side case with no real-life relevance. But now that I'm playing it out, I see that this is directly relevant to most anything we do.
Put simply, if Java now expects us to use Optional<T> as the new optional-return idiom; and if Java now encourages us to use streams for processing our objects, wouldn't we expect to encounter mapping to an optional value all over the place? Do we really have to jump through series of .filter(Optional::isPresent).map(Optional::get) hoops as a workaround until Java 9 gets here years from now?
I'm hoping I just misinterpreted the other question, that I'm just misunderstanding something simple, and that someone can set me straight. Surely Java 8 doesn't have this big of a blind spot, does it?
I’m not sure where your question is aiming at. The answer of the linked question already states that this is a flaw of Java 8 that is addressed in Java 9. So there’s no sense in asking whether this really is a flaw in Java 8.
Besides that, the answer also mentions .flatMap(o -> o.isPresent()? Stream.of(o.get()): Stream.empty()) for converting the optional, rather than your .filter(Optional::isPresent) .map(Optional::get). Nevertheless, you can do it even simpler:
Instead of
.flatMap(Point::getQuadrant)
you can write
.flatMap(p -> p.getQuadrant().map(Stream::of).orElse(null))
as an alternative to Java 9’s
.flatMap(p -> p.getQuadrant().stream())
Note that
.map(Point::getQuadrant).flatMap(Optional::stream)
isn’t so much better compared to the alternatives, unless you have an irrational affinity to method references.

Formula manipulation algorithm

I am wanting to make a program that will when given a formula, it can manipulate the formula to make any value (or in the case of a simultaneous formula, a common value) the subject of the formula.
For example if given:
a + b = c
d + b = c
The program should therefore say:
b = c - a, d = c - b etc.
I'm not sure if java can do this automatically or not when I give the original formula as input. I am not really interested in solving the equation and getting the result of each variable, I am just interested in returning a manipulated formula.
Please let me know if I need to make an algorithm or not for this, and if so, how would I go about doing this. Also, if there are any helpful links that you might have, please post them.
Regards
Take a look at JavaCC. It's a little daunting at first but it's the right tool for something like this. Plus there are already examples of what you are trying to achieve.
Not sure what exactly you are after, but this problem in its general problem is hard. Very hard.
In fact, given a set of "formulas" (axioms), and deduction rules (mathematical equivalence operations), we cannot deduce if a given formula is correct or not. This problem is actually undecideable.
This issue was first addressed by Hilbert as Entscheidungsproblem
I read a book called Fluid Concepts and Creative Analogies by Douglas Hofstadter that talked about this sort of algebraic manipulations that would automatically rewrite equations in other ways attempting to join equations to other equations an infinite (yet restricted) number of ways given rules. It was an attempt to prove yet unproven theorems/proofs by brute force.
http://en.wikipedia.org/wiki/Fluid_Concepts_and_Creative_Analogies
Douglas Hofstadter's Numbo program attempts to do what you want. He doesn't give you the source, only describes how it works in detail.
It sounds like you want a program to do what highschool students do when they solve algebraic problems to move from a position where you know something, modifying it and combining it with other equations, to prove something previously unknown. It takes a strong Artificial intelligence to do this. The part of your brain that does this is the Neo Cortex, which does science, and it's operating principle is as of yet not understood.
If you want something that will do what college students do when they manipulate equations in calculus, you'll have to build a fairly strong artificial intelligence.
http://en.wikipedia.org/wiki/Neocortex
When we can do whole-brain emulation of a human neo cortex, I will post the answer here.
Yes, you need to write some algorithm to do this kind of computer algebra. At least
a parser to interpret the input
an algebra model to relate parsed operands ('a', 'b', ...) and operator ('+', '=')
implement any appropriate rule to support the manipulation you wish to do

In regards to for(), why use i++ rather than ++i?

Perhaps it doesn't matter to the compiler once it optimizes, but in C/C++, I see most people make a for loop in the form of:
for (i = 0; i < arr.length; i++)
where the incrementing is done with the post fix ++. I get the difference between the two forms. i++ returns the current value of i, but then adds 1 to i on the quiet. ++i first adds 1 to i, and returns the new value (being 1 more than i was).
I would think that i++ takes a little more work, since a previous value needs to be stored in addition to a next value: Push *(&i) to stack (or load to register); increment *(&i). Versus ++i: Increment *(&i); then use *(&i) as needed.
(I get that the "Increment *(&i)" operation may involve a register load, depending on CPU design. In which case, i++ would need either another register or a stack push.)
Anyway, at what point, and why, did i++ become more fashionable?
I'm inclined to believe azheglov: It's a pedagogic thing, and since most of us do C/C++ on a Window or *nix system where the compilers are of high quality, nobody gets hurt.
If you're using a low quality compiler or an interpreted environment, you may need to be sensitive to this. Certainly, if you're doing advanced C++ or device driver or embedded work, hopefully you're well seasoned enough for this to be not a big deal at all. (Do dogs have Buddah-nature? Who really needs to know?)
It doesn't matter which you use. On some extremely obsolete machines, and in certain instances with C++, ++i is more efficient, but modern compilers don't store the result if it's not stored. As to when it became popular to postincriment in for loops, my copy of K&R 2nd edition uses i++ on page 65 (the first for loop I found while flipping through.)
For some reason, i++ became more idiomatic in C, even though it creates a needless copy. (I thought that was through K&R, but I see this debated in other answers.) But I don't think there's a performance difference in C, where it's only used on built-ins, for which the compiler can optimize away the copy operation.
It does make a difference in C++, however, where i might be a user-defined type for which operator++() is overloaded. The compiler might not be able to assert that the copy operation has no visible side-effects and might thus not be able to eliminate it.
As for the reason why, here is what K&R had to say on the subject:
Brian Kernighan
you'll have to ask dennis (and it might be in the HOPL paper). i have a
dim memory that it was related to the post-increment operation in the
pdp-11, though beyond that i don't know, so don't quote me.
in c++ the preferred style for iterators is actually ++i for some subtle
implementation reason.
Dennis Ritchie
No particular reason, it just became fashionable. The code produced
is identical on the PDP-11, just an inc instruction, no autoincrement.
HOPL Paper
Thompson went a step further by inventing the ++ and -- operators, which increment or decrement; their prefix or postfix position determines whether the alteration occurs before or after noting the value of the operand. They were not in the earliest versions of B, but appeared along the way. People often guess that they were created to use the auto-increment and auto-decrement address modes provided by the DEC PDP-11 on which C and Unix first became popular. This is historically impossible, since there was no PDP-11 when B was developed. The PDP-7, however, did have a few ‘auto-increment’ memory cells, with the property that an indirect memory reference through them incremented the cell. This feature probably suggested such operators to Thompson; the generalization to make them both prefix and postfix was his own. Indeed, the auto-increment cells were not used directly in implementation of the operators, and a stronger
motivation for the innovation was probably his observation that the translation of ++x was smaller than that of x=x+1.
For integer types the two forms should be equivalent when you don't use the value of the expression. This is no longer true in the C++ world with more complicated types, but is preserved in the language name.
I suspect that "i++" became more popular in the early days because that's the style used in the original K&R "The C Programming Language" book. You'd have to ask them why they chose that variant.
Because as soon as you start using "++i" people will be confused and curios. They will halt there everyday work and start googling for explanations. 12 minutes later they will enter stack overflow and create a question like this. And voila, your employer just spent yet another $10
Going a little further back than K&R, I looked at its predecessor: Kernighan's C tutorial (~1975). Here the first few while examples use ++n. But each and every for loop uses i++. So to answer your question: Almost right from the beginning i++ became more fashionable.
My theory (why i++ is more fashionable) is that when people learn C (or C++) they eventually learn to code iterations like this:
while( *p++ ) {
...
}
Note that the post-fix form is important here (using the infix form would create a one-off type of bug).
When the time comes to write a for loop where ++i or i++ doesn't really matter, it may feel more natural to use the postfix form.
ADDED: What I wrote above applies to primitive types, really. When coding something with primitive types, you tend to do things quickly and do what comes naturally. That's the important caveat that I need to attach to my theory.
If ++ is an overloaded operator on a C++ class (the possibility Rich K. suggested in the comments) then of course you need to code loops involving such classes with extreme care as opposed to doing simple things that come naturally.
At some level it's idiomatic C code. It's just the way things are usually done. If that's your big performance bottleneck you're likely working on a unique problem.
However, looking at my K&R The C Programming Language, 1st edition, the first instance I find of i in a loop (pp 38) does use ++i rather than i++.
Im my opinion it became more fashionable with the creation of C++ as C++ enables you to call ++ on non-trivial objects.
Ok, I elaborate: If you call i++ and i is a non-trivial object, then storing a copy containing the value of i before the increment will be more expensive than for say a pointer or an integer.
I think my predecessors are right regarding the side effects of choosing postincrement over preincrement.
For it's fashonability, it may be as simple as that you start all three expressions within the for statement the same repetitive way, something the human brain seems to lean towards to.
I would add up to what other people told you that the main rule is: be consistent. Pick one, and do not use the other one unless it is a specific case.
If the loop is too long, you need to reload the value in the cache to increment it before the jump to the begining.
What you don't need with ++i, no cache move.
In C, all operators that result in a variable having a new value besides prefix inc/dec modify the left hand variable (i=2, i+=5, etc). So in situations where ++i and i++ can be interchanged, many people are more comfortable with i++ because the operator is on the right hand side, modifying the left hand variable
Please tell me if that first sentence is incorrect, I'm not an expert with C.

implementing unification algorithm

I worked the last 5 days to understand how unification algorithm works in Prolog .
Now ,I want to implement such algorithm in Java ..
I thought maybe best way is to manipulate the string and decompose its parts using some datastructure such as Stacks ..
to make it clear :
suppose user inputs is:
a(X,c(d,X)) = a(2,c(d,Y)).
I already take it as one string and split it into two strings (Expression1 and 2 ).
now, how can I know if the next char(s) is Variable or constants or etc.. ,
I can do it by nested if but it seems to me not good solution ..
I tried to use inheritance but the problem still ( how can I know the type of chars being read ?)
First you need to parse the inputs and build expression trees. Then apply Milner's unification algorithm (or some other unification algorithm) to figure out the mapping of variables to constants and expressions.
A really good description of Milner's algorithm may be found in the Dragon Book: "Compilers: Principles, Techniques and Tools" by Aho, Sethi and Ullman. (Milners algorithm can also cope with unification of cyclic graphs, and the Dragon Book presents it as a way to do type inference). By the sounds of it, you could benefit from learning a bit about parsing ... which is also covered by the Dragon Book.
EDIT: Other answers have suggested using a parser generator; e.g. ANTLR. That's good advice, but (judging from your example) your grammar is so simple that you could also get by with using StringTokenizer and a hand-written recursive descent parser. In fact, if you've got the time (and inclination) it is worth implementing the parser both ways as a learning exercise.
It sounds like this problem is more to do with parsing than unification specifically. Using something like ANTLR might help in terms of turning the original string into some kind of tree structure.
(It's not quite clear what you mean by "do it by nested", but if you mean that you're doing something like trying to read an expression, and recursing when meeting each "(", then that's actually one of the right ways to do it -- this is at heart what the code that ANTLR generates for you will do.)
If you are more interested in the mechanics of unifying things than you are in parsing, then one perfectly good way to do this is to construct the internal representation in code directly, and put off the parsing aspect for now. This can get a bit annoying during development, as your Prolog-style statements are now a rather verbose set of Java statements, but it lets you focus on one problem at a time, which is usually helpful.
(If you structure things this way, this should make it straightforward to insert a proper parser later, that will produce the same sort of tree as you have until then been constructing by hand. This will let you attack the two problems separately in a reasonably neat fashion.)
Before you get to do the semantics of the language, you have to convert the text into a form that's easy to operate on. This process is called parsing and the semantic representation is called an abstract syntax tree (AST).
A simple recursive descent parser for Prolog might be hand written, but it's more common to use a parser toolkit such as Rats! or Antlr
In an AST for Prolog, you might have classes for Term, and CompoundTerm, Variable, and Atom are all Terms. Polymorphism allows the arguments to a compound term to be any Term.
Your unification algorithm then becomes unifying the name of any compound term, and recursively unifying the value of each argument of corresponding compound terms.

Categories