Is this local or static variable capture? - java

In the following code, the lambda expressions capture a static variable.
However, it is also local to the scope of the enclosing class, so would this be local variable capture or static variable capture?
public class ExampleImpl{
static String someStaticVar = "text";
Example lam = () -> {
System.out.println(someStaticVar);
};
interface Example {
void sample();
}
}

The terms “local variable capture” and “static variable capture” do not appear anywhere in the specification, so their meaning would be up to whoever coined these terms.
The most likely interpretation is that “local variable capture” just mean “capture of a local variable” and likewise “static variable capture” means “capture of a static variable”, in other words, capture of a variable which happens to be of either kind, local, instance field, or static field, and then, the answer is quiet simple, the nature of the variables doesn’t change when you place a lambda expression in a different scope.
In your example, someStaticVar always is a static variable, regardless of where you access it.
It’s not clear why this distinction matters to you. There might be technical differences under the hood, which are intentionally unspecified, hence, implementation specific. The most relevant aspect of the type of the captured variables would be that capturing an instance variable will cause the generated instance to keep a reference to the instance. But first, this doesn’t apply to local variables or static variables, second, this is a natural relationship, that code potentially accessing an instance field may prevent the garbage collection of that instance.

Related

Why variable used in lambda expression should be final or effectively final

This question has been previously asked over here
My question regarding why which was answered over here
But I have some doubts about the answer.
The answer provided mentions-
Although other answers prove the requirement, they don't explain why the requirement exists.
The JLS mentions why in §15.27.2:
The restriction to effectively final variables prohibits access to dynamically-changing local variables, whose capture would likely introduce concurrency problems.
To lower the risk of bugs, they decided to ensure captured variables are never mutated.
I am confused by the statement that it would lead to concurrency problems.
I read the article about concurrency problems on Baeldung but still, I am a bit confused about how it will cause concurrency problems, can anybody help me out with an example.
Thanks in advance.
I'd like to preface this answer by saying what I show below is not actually how lambdas are implemented. The actual implementation involves java.lang.invoke.LambdaMetafactory if I'm not mistaken. My answer makes use of some inaccuracies to better demonstrate the point.
Let's say you have the following:
public static void main(String[] args) {
String foo = "Hello, World!";
Runnable r = () -> System.out.println(foo);
r.run();
}
Remember that a lambda expression is shorthand for declaring an implementation of a functional interface. The lambda body is the implementation of the single abstract method of said functional interface. At run-time an actual object is created. So the above results in an object whose class implements Runnable.
Now, the above lambda body references a local variable from the enclosing method. The instance created as a result of the lambda expression "captures" the value of that local variable. It's almost (but not really) like you have the following:
public static void main(String[] args) {
String foo = "Hello, World!";
final class GeneratedClass implements Runnable {
private final String generatedField;
private GeneratedClass(String generatedParam) {
generatedField = generatedParam;
}
#Override
public void run() {
System.out.println(generatedField);
}
}
Runnable r = new GeneratedClass(foo);
r.run();
}
And now it should be easier to see the problems with supporting concurrency here:
Local variables are not considered "shared variables". This is stated in §17.4.1 of the Java Language Specification:
Memory that can be shared between threads is called shared memory or heap memory.
All instance fields, static fields, and array elements are stored in heap memory. In this chapter, we use the term variable to refer to both fields and array elements.
Local variables (§14.4), formal method parameters (§8.4.1), and exception handler parameters (§14.20) are never shared between threads and are unaffected by the memory model.
In other words, local variables are not covered by the concurrency rules of Java and cannot be shared between threads.
At a source code level you only have access to the local variable. You don't see the generated field.
I suppose Java could be designed so that modifying the local variable inside the lambda body only writes to the generated field, and modifying the local variable outside the lambda body only writes to the local variable. But as you can probably imagine that'd be confusing and counterintuitive. You'd have two variables that appear to be one variable based on the source code. And what's worse those two variables can diverge in value.
The other option is to have no generated field. But consider the following:
public static void main(String[] args) {
String foo = "Hello, World!";
Runnable r = () -> {
foo = "Goodbye, World!"; // won't compile
System.out.println(foo);
}
new Thread(r).start();
System.out.println(foo);
}
What is supposed to happen here? If there is no generated field then the local variable is being modified by a second thread. But local variables cannot be shared between threads. Thus this approach is not possible, at least not without a likely non-trivial change to Java and the JVM.
So, as I understand it, the designers put in the rule that the local variable must be final or effectively final in this context in order to avoid concurrency problems and confusing developers with esoteric problems.
When an instance of a lambda expression is created, any variables in the enclosing scope that it refers are copied into it. Now, suppose if that were allowed to modify, and now you are working with a stale value which is there in that copy. On the other hand, suppose the copy is modified inside the lambda, and still the value in the enclosing scope is not updated, leaving an inconsistency. Thus, to prevent such occurrences, the language designers have imposed this restriction. It would probably have made their life easier too. A related answer for an anonymous inner class can be found here.
Another point is that you will be able to pass the lambda expression around and if it is escaped and a different thread executes it, while current thread is updating the same local variable, then there will be some concurrency issues too.
It is for the same reason the anonymous classes require the variables used in their coming out from the scope of themselves must be read-only -> final.
final int finalInt = 0;
int effectivelyFinalInt = 0;
int brokenInt = 0;
brokenInt = 0;
Supplier<Integer> supplier = new Supplier<Integer>() {
#Override
public Integer get() {
return finalInt; // compiles
return effectivelyFinalInt; // compiles
return brokenInt; // doesn't compile
}
};
Lambda expressions are only shortcuts for instances implementing the interface with only one abstract method (#FunctionalInterface).
Supplier<Integer> supplier = () -> brokenInt; // compiles
Supplier<Integer> supplier = () -> brokenInt; // compiles
Supplier<Integer> supplier = () -> brokenInt; // doesn't compile
I struggle to read the Java Language specification to provide support to my statements below, however, they are logical:
Note that evaluation of a lambda expression produces an instance of a functional interface.
Note that instantiating an interface requires implementing all its abstract methods. Doing as an expression produces an anonymous class.
Note that an anonymous class is always an inner class.
Each inner class can access only final or effectively-final variables outside of its scope: Accessing Members of an Enclosing Class
In addition, a local class has access to local variables. However, a local class can only access local variables that are declared final. When a local class accesses a local variable or parameter of the enclosing block, it captures that variable or parameter.

Local variable log defined in an enclosing scope must be final or effectively final

I'm new to lambda and Java8. I'm facing following error.
Local variable log defined in an enclosing scope must be final or
effectively final
public JavaRDD<String> modify(JavaRDD<String> filteredRdd) {
filteredRdd.map(log -> {
placeHolder.forEach(text -> {
//error comes here
log = log.replace(text, ",");
});
return log;
});
return null;
}
The message says exactly what the problem is: your variable log must be final (that is: carry the keyword final) or be effectively final (that is: you only assign a value to it once outside of the lambda). Otherwise, you can't use that variable within your lambda statement.
But of course, that conflicts with your usage of log. The point is: you can't write to something external from within the lambda ... so you have to step back and look for other ways for whatever you intend to do.
In that sense: just believe the compiler.
Beyond that, there is one core point to understand: you can not use a local variable that you can write to. Local variables are "copied" into the context of the lambda at runtime, and in order to achieve deterministic behavior, they can only be read, and should be constants.
If your use case is to write to some object, then it should be a field of your enclosing class for example!
So, long story short:
local variables used (read) inside a lambda must act like a constant
you can not write to local variables!
or the other way round: if you need something to write to, you have to use a field of your surrounding class for example (or provide a call back method)
The reason for this limitation is the same as the reason for the Java language feature that local variables accessed from within (anonymous) inner classes must be (effectively) final.
This answer by rgettman gets into the details of it. rgettman explains the limitations in clear detail and I link to that answer because the behavior of lambda expressions should be same as that of anonymous inner classes. Note that such limitation does not exist for class or instance variables, however. The main reason for this is slightly complicated and I couldn't explain it better than what Roedy Green does it here. Copying here only so it is at one place:
The rule is anonymous inner classes may only access final local
variables of the enclosing method. Why? Because the inner class’s
methods may be invoked later, long after the method that spawned it
has terminated, e.g. by an AWT (Advanced Windowing Toolkit) event. The
local variables are long gone. The anonymous class then must work with
flash frozen copies of just the ones it needs squirreled away covertly
by the compiler in the anonymous inner class object. You might ask,
why do the local variables have to be final? Could not the compiler
just as well take a copy of non-final local variables, much the way it
does for a non-final parameters? If it did so, you would have two
copies of the variable. Each could change independently, much like
caller and callee’s copy of a parameter, however you would use the
same syntax to access either copy. This would be confusing. So Sun
insisted the local be final. This makes irrelevant that there are
actually two copies of it.
The ability for an anonymous class to access the caller’s final local
variables is really just syntactic sugar for automatically passing in
some local variables as extra constructor parameters. The whole thing
smells to me of diluted eau de kludge.
Remember method inner classes can`t modify any value from their surrounding method. Your second lambda expression in forecach is trying to access its surrounding method variable (log).
To solve this you can avoid using lambda in for each and so a simple for each and re-palace all the values in log.
filteredRdd.map(log -> {
for (String text:placeHolder){
log = log.replace(text,",");
}
return log;
});
In some use cases there can be a work around. The following code complains about the startTime variable not being effectively final:
List<Report> reportsBeforeTime = reports.stream()
.filter(r->r.getTime().isAfter(startTime))
.collect(Collectors.toList());
So, just copy the value to a final variable before passing it to lambda:
final LocalTime finalStartTime = startTime;
List<Report> reportsBeforeTime = reports.stream()
.filter(r->r.getTime().isAfter(finalStartTime))
.collect(Collectors.toList());
However, If you need to change a local variable inside a lambda function, that won't work.
If you do not want to create your own object wrapper, you can use AtomicReference, for example:
AtomicReference<String> data = new AtomicReference<>();
Test.lamdaTest(()-> {
//data = ans.get(); <--- can't do this, so we do as below
data.set("to change local variable");
});
return data.get();
One solution is to encapsulate the code in an enclosing (inner class). You can define this:
public abstract class ValueContext<T> {
public T value;
public abstract void run();
}
And then use it like this (example of a String value):
final ValueContext<String> context = new ValueContext<String>(myString) {
#Override
public void run() {
// Your code here; lambda or other enclosing classes that want to work on myString,
// but use 'value' instead of 'myString'
value = doSomethingWithMyString(value);
}};
context.run();
myString = context.value;

Lambdas: local variables need final, instance variables don't

In a lambda, local variables need to be final, but instance variables don't. Why so?
The fundamental difference between a field and a local variable is that the local variable is copied when JVM creates a lambda instance. On the other hand, fields can be changed freely, because the changes to them are propagated to the outside class instance as well (their scope is the whole outside class, as Boris pointed out below).
The easiest way of thinking about anonymous classes, closures and labmdas is from the variable scope perspective; imagine a copy constructor added for all local variables you pass to a closure.
In a document of project lambda, State of the Lambda v4, under Section 7. Variable capture, it is mentioned that:
It is our intent to prohibit capture of mutable local variables. The
reason is that idioms like this:
int sum = 0;
list.forEach(e -> { sum += e.size(); });
are fundamentally serial; it is quite difficult to write lambda bodies
like this that do not have race conditions. Unless we are willing to
enforce—preferably at compile time—that such a function cannot escape
its capturing thread, this feature may well cause more trouble than it
solves.
Another thing to note here is, local variables are passed in the constructor of an inner class when you access them inside your inner class, and this won't work with non-final variable because value of non-final variables can be changed after construction.
While in case of an instance variable, the compiler passes a reference of the object and object reference will be used to access instance variables. So, it is not required in case of instance variables.
PS : It is worth mentioning that anonymous classes can access only final local variables (in Java SE 7), while in Java SE 8 you can access effectively final variables also inside lambda as well as inner classes.
In Java 8 in Action book, this situation is explained as:
You may be asking yourself why local variables have these restrictions.
First, there’s a key
difference in how instance and local variables are implemented behind the scenes. Instance
variables are stored on the heap, whereas local variables live on the stack. If a lambda could
access the local variable directly and the lambda were used in a thread, then the thread using the
lambda could try to access the variable after the thread that allocated the variable had
deallocated it. Hence, Java implements access to a free local variable as access to a copy of it
rather than access to the original variable. This makes no difference if the local variable is
assigned to only once—hence the restriction.
Second, this restriction also discourages typical imperative programming patterns (which, as we
explain in later chapters, prevent easy parallelization) that mutate an outer variable.
Because instance variables are always accessed through a field access operation on a reference to some object, i.e. some_expression.instance_variable. Even when you don't explicitly access it through dot notation, like instance_variable, it is implicitly treated as this.instance_variable (or if you're in an inner class accessing an outer class's instance variable, OuterClass.this.instance_variable, which is under the hood this.<hidden reference to outer this>.instance_variable).
Thus an instance variable is never directly accessed, and the real "variable" you're directly accessing is this (which is "effectively final" since it is not assignable), or a variable at the beginning of some other expression.
Putting up some concepts for future visitors:
Basically it all boils down to the point that compiler should be able to deterministically tell that lambda expression body is not working on a stale copy of the variables.
In case of local variables, compiler has no way to be sure that lambda expression body is not working on a stale copy of the variable unless that variable is final or effectively final, so local variables should be either final or effectively final.
Now, in case of instance fields, when you access an instance field inside the lambda expression then compiler will append a this to that variable access (if you have not done it explicitly) and since this is effectively final so compiler is sure that lambda expression body will always have the latest copy of the variable (please note that multi-threading is out of scope right now for this discussion). So, in case instance fields, compiler can tell that lambda body has latest copy of instance variable so instance variables need not to be final or effectively final. Please refer below screen shot from an Oracle slide:
Also, please note that if you are accessing an instance field in lambda expression and that is getting executed in multi-threaded environment then you could potentially run in problem.
It seems like you are asking about variables that you can reference from a lambda body.
From the JLS §15.27.2
Any local variable, formal parameter, or exception parameter used but not declared in a lambda expression must either be declared final or be effectively final (§4.12.4), or a compile-time error occurs where the use is attempted.
So you don't need to declare variables as final you just need to make sure that they are "effectively final". This is the same rule as applies to anonymous classes.
Within Lambda expressions you can use effectively final variables from the surrounding scope.
Effectively means that it is not mandatory to declare variable final but make sure you do not change its state within the lambda expresssion.
You can also use this within closures and using "this" means the enclosing object but not the lambda itself as closures are anonymous functions and they do not have class associated with them.
So when you use any field (let say private Integer i;)from the enclosing class which is not declared final and not effectively final it will still work as the compiler makes the trick on your behalf and insert "this" (this.i).
private Integer i = 0;
public void process(){
Consumer<Integer> c = (i)-> System.out.println(++this.i);
c.accept(i);
}
Here is a code example, as I didn't expect this either, I expected to be unable to modify anything outside my lambda
public class LambdaNonFinalExample {
static boolean odd = false;
public static void main(String[] args) throws Exception {
//boolean odd = false; - If declared inside the method then I get the expected "Effectively Final" compile error
runLambda(() -> odd = true);
System.out.println("Odd=" + odd);
}
public static void runLambda(Callable c) throws Exception {
c.call();
}
}
Output:
Odd=true
YES, you can change the member variables of the instance but you CANNOT change the instance itself just like when you handle variables.
Something like this as mentioned:
class Car {
public String name;
}
public void testLocal() {
int theLocal = 6;
Car bmw = new Car();
bmw.name = "BMW";
Stream.iterate(0, i -> i + 2).limit(2)
.forEach(i -> {
// bmw = new Car(); // LINE - 1;
bmw.name = "BMW NEW"; // LINE - 2;
System.out.println("Testing local variables: " + (theLocal + i));
});
// have to comment this to ensure it's `effectively final`;
// theLocal = 2;
}
The basic principle to restrict the local variables is about data and computation validity
If the lambda, evaluated by the second thread, were given the ability to mutate local variables. Even the ability to read the value of mutable local variables from a different thread would introduce the necessity for synchronization or the use of volatile in order to avoid reading stale data.
But as we know the principal purpose of the lambdas
Amongst the different reasons for this, the most pressing one for the Java platform is that they make it easier to distribute processing of collections over multiple threads.
Quite unlike local variables, local instance can be mutated, because it's shared globally. We can understand this better via the heap and stack difference:
Whenever an object is created, it’s always stored in the Heap space and stack memory contains the reference to it. Stack memory only contains local primitive variables and reference variables to objects in heap space.
So to sum up, there are two points I think really matter:
It's really hard to make the instance effectively final, which might cause lots of senseless burden (just imagine the deep-nested class);
the instance itself is already globally shared and lambda is also shareable among threads, so they can work together properly since we know we're handling the mutation and want to pass this mutation around;
Balance point here is clear: if you know what you are doing, you can do it easily but if not then the default restriction will help to avoid insidious bugs.
P.S. If the synchronization required in instance mutation, you can use directly the stream reduction methods or if there is dependency issue in instance mutation, you still can use thenApply or thenCompose in Function while mapping or methods similar.
First, there is a key difference in how local and instance variables are implemented behind the scenes. Instance variables are stored in the heap, whereas local variables stored in the stack.
If the lambda could access the local variable directly and the lambda was used in a thread, then the thread using the lambda could try to access the variable after the thread that allocated the variable had deallocated it.
In short: to ensure another thread does not override the original value, it is better to provide access to the copy variable rather than the original one.

What are captured variables in Java Local Classes

The Java documentation for Local Classes says that:
In addition, a local class has access to local variables. However, a
local class can only access local variables that are declared final.
When a local class accesses a local variable or parameter of the
enclosing block, it captures that variable or parameter. For example,
the PhoneNumber constructor can access the local variable numberLength
because it is declared final; numberLength is a captured variable.
What is captured variable,what is its use and why is that needed? Please help me in understanding the concept of it.
What is captured variable,what is its use and why is that needed?
A captured variable is one that has been copied so it can be used in a nested class. The reason it has to be copied is the object may out live the current context. It has to be final (or effectively final in Java 8) so there is no confusion about whether changes to the variable will be seen (because they won't)
Note: Groovy does have this rule and a change to the local variable can mean a change to the value in the enclosing class which is especially confusing if multiple threads are involved.
An example of capture variable.
public void writeToDataBase(final Object toWrite) {
executor.submit(new Runnable() {
public void run() {
writeToDBNow(toWrite);
}
});
// if toWrite were mutable and you changed it now, what would happen !?
}
// after the method returns toWrite no longer exists for the this thread...
Here is a post describing it: http://www.devcodenote.com/2015/04/variable-capture-in-java.html
Here is a snippet from the post:
”It is imposed as a mandate by Java that if an inner class defined within a method references a local variable of that method, that local variable should be defined as final.”
This is because the function may complete execution and get removed from the process stack, with all the variables destroyed but it may be the case that objects of the inner class are still on the heap referencing a particular local variable of that function. To counter this, Java makes a copy of the local variable and gives that as a reference to the inner class. To maintain consistency between the 2 copies, the local variable is mandated to be “final” and non-modifiable.
A captured variable is one from the outside of your local class - one declared in the surrounding block. In some languages this is called a closure.
In the example from the Oracle Docs (simplified) the variable numberLength, declared outside of class PhoneNumber, is "captured".
final int numberLength = 10; // in JDK7 and earlier must be final...
class PhoneNumber {
// you can refer to numberLength here... it has been "captured"
}

Local variables in java

I went through local variables and class variables concept.
But I had stuck at a doubt
" Why is it so that we cannot declare local variables as static " ?
For e.g
Suppose we have a play( ) function :
void play( )
{
static int i=5;
System.out.println(i);
}
It gives me error in eclipse : Illegal modifier for parameter i;
I had this doubt because of the following concepts I have read :
Variables inside method : scope is local i.e within that method.
When variable is declared as static , it is present for the entire class i.e not to particular object.
Please could anyone help me out to clarify the concept.
Thanks.
Because the scope of the local variables is limited to the surrounding block. That's why they cannot be referred to (neither statically, nor non-statically), from other classes or methods.
Wikipedia says about static local variables (in C++ for example):
Static local variables are declared inside a function, just like automatic local variables. They have the same scope as normal local variables, differing only in "storage duration": whatever values the function puts into static local variables during one call will still be present when the function is called again.
That doesn't exist in Java. And in my opinion - for the better.
Java doesn't have static variables like C. Instead, since every method has a class (or instance of a class) associated with it, the persistent scoped variables are best stored at that level (e.g., as private or static private fields). The only real difference is that other methods in the same class can refer to them; since all those methods are constrained to a single file anyway, it's not a big problem in practice.
Static members (variables, functions, etc.) serve to allow callers of the class, whether they're within the class or outside of the class, to execute functions and utilize variables without referring to a specific instance of the class. Because of this, the concept of a "static local" doesn't make sense, as there would be no way for a caller outside of the function to refer to the variable (since it's local to that function).
There are some languages (VB.NET, for example), that have a concept of "static" local variables, though the term "static" is inconsistently used in this scenario; VB.NET static local variables are more like hidden instance variables, where subsequent calls on the same instance will have the previous value intact. For example
Public Class Foo
Public Sub Bar()
Static i As Integer
i = i + 1
Console.WriteLine(i)
End Sub
End Class
...
Dim f As New Foo()
Dim f2 as New Foo()
f.Bar() // Prints "1"
f.Bar() // Prints "2"
f2.Bar() // Prints "1"
So, as you can see, the keyword "static" is not used in the conventional OO meaning here, as it's still specific to a particular instance of Foo.
Because this behavior can be confusing (or, at the very least, unintuitive), other languages like Java and C# are less flexible when it comes to variable declarations. Depending on how you want it to behave, you should declare your variable either as an instance variable or a static/class variable:
If you'd like the variable to exist beyond the scope of the function but be particular to a single instance of the class (like VB.NET does), then create an instance variable:
public class Foo
{
private int bar;
public void Bar()
{
bar++;
System.out.println(bar);
}
}
If you want it to be accessible to all instances of the class (or even without an instance), make it static:
public class Foo
{
private static int bar;
public static void Bar()
{
bar++;
System.out.println(bar);
}
}
(Note that I made Bar() static in the last example, but there is no reason that it has to be.)

Categories