Java memory model and local variable [closed]

This question is related to the Java memory model.
I have a Java method:
public class DataUtil {
    public void process() {
        int c = 0;
        c = c + 1;
        System.out.println(c);
    }
}
In the line "System.out.println(c)", where does the println method take the value of the c variable from to print it on screen?
The CPU cache, or RAM?

To be clear, the question and this answer are solely about the behavior of a program with one thread. Many of the things I say below may not apply to multi-threaded programs.
Where does the println method take the value of the c variable from to print it on screen? The CPU cache, or RAM?
The Java Memory Model (JLS 17.4) says nothing definite about cache and RAM. In general, it specifies visibility behavior without prescribing the way that a particular compiler implements that behavior. The JMM mandates that there is a happens-before relation when a thread writes a variable and subsequently reads it. The happens-before relation constrains the generated code to behave in a certain way. However, it does not mandate any particular implementation approach for achieving those constraints.
In your example, the JMM doesn't even place an implicit constraint on whether the value comes from cache or RAM. The c variable is only accessible to one thread. (It is a local variable!) So the compiler could (in theory) store the variable's value anywhere1. The only constraint is that the most recent value is used when the variable is accessed. The compiler just needs to keep track of where the most recent value is kept ...
As a general rule, the JMM only has something interesting to say about variables and objects that are shared by different threads.
1 - In a register, in RAM, in cache memory, on a hardware stack ... even written on a piece of paper shoved down the back of the sofa, if your hardware platform supports that.
If you are using "memory model" in a broader sense, then the plain answer is "we cannot say".
In languages (not Java!) where the memory model is specified in terms of main memory and cache, the memory model would most likely not constrain this example.
If you are not talking about any specific memory model, but just asking if the value is fetched from cache or RAM, then we cannot say ... because this is an implementation detail of the language implementation. For example, the JVM.
What we can say with a high level of certainty2 is that the implementation will fetch and print the most recent value of c from somewhere in this example.
In the case of Java, the JLS says that it must return the most recent value. (It's in JLS 17.4 if you want to look.) The JLS leaves it up to the Java implementation to decide how to do that.
It is safe to assume that any JVM implementation will have come up with a reliable solution; i.e. that the most recent value of a variable will be used. But figuring out the details would be a big task ... and (IMO) would not be worth the effort. (You don't need to understand the internals of a Volvo 264 automatic gearbox to drive a car.)
2 - We can be certain because there aren't bazillions of bug reports of single-threaded applications not working due to problems reading and writing variables. Also, if there are any doubts, it is possible to examine the JIT compiler source code to understand what it does, or to analyze the native code that it generates.

The Java Memory Model does not regulate that - it's focused on behaviours of multi-threaded programs.
Local variables are allocated on the stack. Parameters to methods such as println are also passed via the stack: they are pushed onto the top of the operand stack before the call (according to the calling convention). That is what happens at the bytecode level, though the JIT compiler (or even the interpreter) may keep values in CPU registers and never touch the stack in RAM at all.
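If you want to see that stack-based picture for yourself, compile the class and run javap -c DataUtil. The disassembly of process() looks roughly like the sketch below (simplified; the exact instructions and constant-pool details vary by javac version, and the // comments are added here for explanation):
public void process();
  Code:
     0: iconst_0                  // push the int constant 0 onto the operand stack
     1: istore_1                  // pop it into local variable slot 1 (c)
     2: iload_1                   // push c
     3: iconst_1                  // push 1
     4: iadd                      // leave c + 1 on the operand stack
     5: istore_1                  // store it back into c
     6: getstatic     System.out  // push the PrintStream
     9: iload_1                   // push c again, as the argument to println
    10: invokevirtual PrintStream.println(int)
    13: return
Nothing in this says anything about cache versus RAM; the JIT is free to keep c in a register the whole time.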

Related

Can the JVM GC move objects in the middle of a reference comparison, causing a comparison to fail even when both sides refer to the same object?

It's well known that GCs will sometimes move objects around in memory. And it's to my understanding that as long as all references are updated when the object is moved (before any user code is called), this should be perfectly safe.
However, I saw someone mention that reference comparison could be unsafe due to the object being moved by the GC in the middle of a reference comparison such that the comparison could fail even when both references should be referring to the same object?
i.e., is there any situation under which the following code would not print "true"?
Foo foo = new Foo();
Foo bar = foo;
if (foo == bar) {
    System.out.println("true");
}
I tried googling this and the lack of reliable results leads me to believe that the person who stated this was wrong, but I did find an assortment of forum posts (like this one) that seemed to indicate that he was correct. But that thread also has people saying that it shouldn't be the case.
Java Bytecode instructions are always atomic in relation to the GC (i.e. no cycle can happen while a single instruction is being executed).
The only time the GC will run is between two Bytecode instructions.
Looking at the bytecode that javac generates for the if statement in your code, we can simply check to see whether a GC would have any effect:
// a GC here wouldn't change anything
ALOAD 1
// a GC cycle here would update all references accordingly, even the one on the stack
ALOAD 2
// same here. A GC cycle will update all references to the object on the stack
IF_ACMPNE L3
// this is the comparison of the two references. no cycle can happen while this comparison
// "is running" so there won't be any problems with this either
Additionally, even if the GC were able to run during the execution of a bytecode instruction, the identity of the object would not change. It's still the same object before and after the cycle.
So, in short, the answer to your question is no; it will always output true.
Source:
https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.21.3
The short answer, looking at the Java 8 specification, is: No.
The == operator always performs a reference (identity) comparison (given that neither reference is null). Even if the object is moved, it is still the same object.
If you see such an effect, you have just found a JVM bug. Go submit it.
It could, of course, be that some obscure implementation of the JVM does not enforce this for whatever strange performance reason. If that is the case, it would be wise to simply move on from that JVM...
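If you want some empirical reassurance (evidence, not proof), here is a small sketch that performs the comparison millions of times while generating enough garbage to make collections happen in between:
public class RefCompareDemo {
    public static void main(String[] args) {
        for (int i = 0; i < 5_000_000; i++) {
            Object foo = new Object();
            Object bar = foo;
            byte[] garbage = new byte[1024];   // allocation pressure so GC cycles actually happen
            if (foo != bar) {
                System.out.println("false");   // never expected to be reached
                return;
            }
        }
        System.out.println("true");
    }
}
Of course a clean run proves nothing by itself; the authoritative argument is the specification cited above.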
TL;DR
You should not think about that kind of stuff whatsoever; it's a dark place.
Java has clearly stated its specifications and you should not doubt them, ever.
2.7. Representation of Objects
The Java Virtual Machine does not mandate any particular internal structure for objects.
Source: JVMS SE8.
I doubt it! If you doubt this very basic operator, you may find yourself doubting everything else; getting frustrated and paranoid with trust issues is not the place you want to be.
What if it happens to me? Such a bug should not exist. The Oracle discussion you linked reports a bug from years ago, which the discussion's OP decided to bring up again for no clear reason, and there is no reliable documentation of such a bug existing nowadays. However, if such a bug (or any other) does occur for you, please submit it here.
To let your worries go away: Java has folded the pointer-to-pointer approach into the JVM's own pointer table; you can read more about its efficiency here.
GCs only happen at points in the program where the state is well-defined and the JVM has exact knowledge of where everything is in registers/the stack/on the heap, so all references can be fixed up when an object gets moved.
I.e. they cannot occur between two arbitrary assembly instructions. Conceptually you can think of them as occurring between bytecode instructions of the JVM, with the GC adjusting all references that have been generated by previous instructions.
You are asking a question with a wrong premise. Since the == operator does not compare memory locations, it isn't sensitive to changes of memory location per se. The == operator, applied to references, compares the identity of the referred objects, regardless of how the JVM implements it.
To name an example that counteracts the usual understanding, a distributed JVM may have objects held in the RAM of different computers, including the possibility of local copies. So simply comparing addresses won’t work. Of course, it’s up to the JVM implementation to ensure that the semantics, as defined in the Java Language Specification, do not change.
If a particular JVM implementation implements a reference comparison by directly comparing memory locations of objects and has a garbage collector that can change memory locations, of course, it’s up to the JVM to ensure that these two features can’t interfere with each other in an incompatible way.
If you are curious about how this can work, e.g. inside optimized, JIT-compiled code, the granularity isn't as fine as you might think. Every sequential stretch of code, including forward branches, can be considered to run fast enough that garbage collection can simply be delayed until it completes. So garbage collection can't happen at just any time inside optimized code, but must be allowed at certain points, e.g.
backward branches (note that due to loop unrolling, not every loop iteration implies a backward branch)
memory allocations
thread synchronization actions
invoking a method that hasn’t been inlined/analyzed
maybe something special, I forgot
So the JVM emits code containing certain "safepoints", at which it is known which references are currently held and how to update them if necessary, and at which, of course, changing locations has no impact on correctness. Between these points, the code can run without having to care about the possibility of changing memory locations, whereas the garbage collector will, when necessary, wait for the code to reach a safepoint, which is guaranteed to happen in finite, rather short time.
But, as said, these are implementation details. On the formal level, things like changing memory locations do not exist, so there is no need to explicitly specify that they are not allowed to change the semantics of Java code. No implementation detail is allowed to do that.
I understand you are asking this question after someone said it behaves that way, but simply asking whether it does behave that way isn't the right approach to evaluating what they said.
What you should really be asking (primarily yourself, others only if you can't decide on an answer) is whether it makes sense for the GC to be allowed to cause a comparison to fail that logically should succeed (basically any comparison that doesn't include a weak reference).
The answer to that is obviously "no", as it would break pretty much anything beyond "hello, world" and probably even that.
So, if allowed, it is a bug -- either in the spec or the implementation. Now since both the spec and the implementation were written by humans, it is possible such a bug exists. If so, it will be reported and almost certainly fixed.
No, because that would be flagrantly ridiculous and a patent bug.
The GC takes a great deal of care behind the scenes to avoid catastrophically breaking everything. In particular, it will only move objects when threads are paused at safepoints, which are specific places in the running code generated by the JVM for threads to be paused at. A thread at a safepoint is in a known state, where the positions of all the possible object references in registers and memory are known, so the GC can update them to point to the object's new address. Garbage collection won't break your comparison operations.
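If you want to watch those safepoint pauses yourself on a recent JDK (9 or later), unified logging can show them; a small sketch, where MyApp stands in for your own main class:
java -Xlog:safepoint MyApp
Each entry records a safepoint operation and roughly how long threads spent reaching and staying in it.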
Java variables hold a reference to the "object", not to the memory location where the object is stored.
Java does this because it allows the JVM to manage memory on its own (e.g. with the garbage collector) and to improve overall memory usage without directly affecting the client program.
As an example of such an improvement, a small range of Integer values (I don't remember exactly how many) is pre-allocated and cached, so that, for example, boxing the small counters of a loop such as for (int i = 0; i < 10; i++) is cheap.
And as an example of an object reference, just try to create an int array and print it:
int[] i = {1,2,3};
System.out.println(i);
You will see that Java prints something starting with [I@. It is saying "this is an array of int", followed by the object's hash code in hexadecimal, not its memory location!
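To make that concrete, a small self-contained sketch (the hex digits after the @ come from the hash code and will differ from run to run):
import java.util.Arrays;

public class PrintArrayDemo {
    public static void main(String[] args) {
        int[] i = {1, 2, 3};
        System.out.println(i);                   // e.g. [I@1b6d3586 -- "array of int" + hash code, not an address
        System.out.println(Arrays.toString(i));  // [1, 2, 3] -- the actual contents
    }
}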

Why Java doesn't allow object on stack? [closed]

I am taking some lessons on classes and objects in Java, coming from a C++ background. I want to know why we cannot choose for objects to be declared on the stack. Why must everything go on the heap except for the primitive types?
Here's something to clarify what I was asking.
Essentially, if we have:
Scanner input = new Scanner(System.in);
Then why can't we have it on the stack in the first place?
One of the strongest attractors of the original Java design (in the mid-1990s) was simplicity. Supporting heap-based objects is essential, whereas stack-based ones are an optimization. Java is not alone here: many languages take that approach (LISP, Haskell, JavaScript, Ruby, etc.). Stack-based allocation does happen in Java, but only as an internal optimization trick and not something that the user can control.
Especially keep in mind that there is an essential difference in how a pointer to an object passed to a function ("a reference passed to a method" in Java-speak) can be treated by the callee: it is not allowed to retain the pointer if it's stack-based. This alone creates huge complications and bug opportunities.
Finally, stack-based objects bring much less to a garbage-collected language than to manually-managed languages like C and C++.
The data on the stack, say a C struct, disappears after the function call has returned. Hence one would need copying and correction of pointers.
Think of the hidden extra functionality needed here:
struct S* f() {
    struct S s = ...;   /* lives in f's stack frame */
    g(&s);              /* fine: s is still alive during this call */
    return &s;          /* dangling: s disappears as soon as f returns */
}
Java was meant as a simplification, having its own management of memory, and doing things directly on the heap seemed more straightforward and less convoluted.
This was in contrast to C++, with its copy constructors, pointers and aliases.
Java does not allow explicit on stack allocation of objects. The language is not competing with low level languages such as C, and the creators of the language made this choice as a simplification.
However, times change, and Java has grown since its humble beginnings. As the JVM becomes more sophisticated, automatic allocation of objects on the stack has become possible. The rationale for this is similar to the 'register' keyword in C: let the compiler manage the low-level detail, since it has become better at it than humans. In Java, automatic allocation of objects onto the stack has been hampered by two factors. Firstly, the Sun/Oracle JVM is very old and very complex by now; it is difficult to change, and Oracle has been careful to avoid breaking backwards compatibility. Secondly, so far the work on stack allocation has not yielded the large benefits that were expected. It did improve some situations, but the JVM has its own trade-offs and behaviours. So this comes down to a question of time/pay-off and priorities. I believe that work to improve the benefits of automatic allocation continues behind the scenes, but there are no plans to make it explicit.
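As an illustration of that automatic, behind-the-scenes stack allocation (scalar replacement driven by escape analysis), here is a hedged sketch: the Point below never escapes sum(), so a modern HotSpot JIT may never allocate it on the heap at all. You can compare runs with -XX:+DoEscapeAnalysis (the default) and -XX:-DoEscapeAnalysis, plus -verbose:gc, to see the difference in garbage-collection activity:
public class EscapeDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // The Point created here never escapes this method, so the JIT's escape
    // analysis may scalar-replace it: no heap allocation, fields live in registers.
    static long sum(int a, int b) {
        Point p = new Point(a, b);
        return (long) p.x + p.y;
    }

    public static void main(String[] args) {
        long total = 0;
        for (int i = 0; i < 50_000_000; i++) {
            total += sum(i, i + 1);
        }
        System.out.println(total);
    }
}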
To put it simply, the key advantage of objects on the stack is that the memory is automatically managed for you. When a function puts objects on the stack, they are cleaned up when the function exits.
Since Java already has automatic garbage collection, this key advantage doesn't bring that much.
Sure, there is an access-speed price you might pay by being unable to allocate objects on the stack directly, but as Marko mentioned, there are internal optimizations that might do just that.
Why must everything go on the heap except for the primitive types?
This statement is not accurate. Primitive types can go on the heap as well if they are part of a class instance. A local variable is stored on the stack, whereas fields of an object are stored on the heap as part of that object.
As for why objects are stored on the heap: it is, after all, a design decision. One reason is that the heap is a managed area in the JVM that is subject to garbage collection. As a managed area of the JVM, it may be organized in generations and may grow or shrink in size. See this section from the JVM specification:
The Java Virtual Machine has a heap that is shared among all Java Virtual Machine threads. The heap is the run-time data area from which memory for all class instances and arrays is allocated.
The heap is created on virtual machine start-up. Heap storage for objects is reclaimed by an automatic storage management system (known as a garbage collector); objects are never explicitly deallocated.
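A tiny example of the point made above: where a value lives depends on where it is declared, not on whether it is primitive.
class Counter {
    private int hits;        // a primitive, but part of an object: stored on the heap inside the Counter instance

    void record() {
        int next = hits + 1; // a local primitive: lives in the current stack frame (or a register)
        hits = next;
    }
}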

Putting a variable outside or inside a method scope within an update loop [closed]

This applies to both C and Java; I'm asking about both.
I've got an update loop that runs maybe a few hundred times a second, indefinitely.
My concerns are mainly about memory management and what happens.
Here's the example:
public void methodA(double Delta)
{
    double doubleTest = Delta;
    SomeObject newObject = new SomeObject(Delta);
}

SomeObject newObject = new SomeObject();
double doubleTest;

public void methodB(double Delta)
{
    doubleTest = Delta;
    newObject.setUpdate(Delta);
}
Now I know in Java that methodA is GC'ed at the cost of performance, but what exactly happens in C or C++? Do variables or objects declared within the method scope get destroyed? If so, which loop is better? (Would we be running out of memory with the second loop?)
Also, is it really worth pre-creating the object for the method update? What's the performance gain, if any?
1. The variables get destroyed when the method returns.
2. If you're passing in a parameter, it does not need to be declared outside the method scope.
3. It would be more efficient, from a writing perspective, to put it all in one line. The memory footprint difference between the two is minimal, if any.
Methods aren't garbage-collected.
I don't see any loops in your code, so I'm confused about what you are asking.
Your two methods do very different things, and so comparing them is difficult. First of all, compilers are very smart these days (Java, C, and C++). Unless SomeObject's constructor has side effects, a reasonable compiler would probably optimize away all calls to methodA anyway, since it does nothing.
methodB does something completely different than methodA, so I'm not sure why you are comparing the two. methodB calls newObject.setUpdate(), which, presuming it has a side effect, will not be removed by the compiler. Of course, if you never actually use newObject anywhere else, the compiler may still determine that it is unnecessary and optimize away all calls to methodB.
In any case, your question is confusing to me because I'm not sure what you are specifically trying to compare.
Now I know in Java that methodA is GC'ed at the cost of performance, but what exactly happens in C or C++? Do variables or objects declared within the method scope get destroyed? If so, which loop is better? (Would we be running out of memory with the second loop?)
There is no concept of a method being garbage collected; only objects are garbage collected.
A local variable's scope is limited to the method/function it is defined in, for both Java and C++. But there is an exception in C++: if you create a dynamic data structure using malloc/calloc (or new), that memory will not be freed until you explicitly free it. There is no garbage collector in C++, so you need to be careful about dynamic memory allocation and about freeing that memory. This responsibility lies with the developer in C++, whereas in Java the JVM's garbage collector takes care of it.

In Java, can operations on object fields bypass the stack?

The Java heap only stores objects, and the stack only stores primitive data and object references.
Consider A.a = B.b, where A.a and B.b are int.
In my understanding, a JVM will first GET the value of B.b from the heap onto the stack, and then PUT that value into A.a, which is also on the heap. It seems that the only way to change data on the heap is to PUT a value from the stack.
My question is: is there some way to operate on data on the Java heap without the stack? E.g., copy the value of B.b directly to A.a without any stack operation.
If you say "it depends on the implementation of the JVM", then my question is about Dalvik.
As far as the abstract machine called JVM (not to be confused with the various pieces of software which go by the same name and implement that abstract machine by mapping it onto real hardware) is concerned, A.a = B.b does indeed load the value of B.b on the stack, then stores it to A.a.1
However, as the name abstract machine may tell you, this is only a way of thinking about semantics. An implementation may do whatever it pleases, as long as it preserves the effect of the program. As you should know, most implementations don't actually interpret JVM instructions most of the time, but instead compile it to machine code for the CPU they run on. If you're concerned about performance or memory traffic, you need to go deeper.
When compiling, the JVM's stack is mostly discarded in favor of the registers most physical CPUs use. If there aren't enough registers available, the hardware stack (which is distinct from the JVM stack!) may also be used. But I digress. On most architectures, there's no instruction for moving from one memory location to the other (see Assembly: MOVing between two memory addresses). However, the hardware stack is also memory, so it's actually impossible to go heap -> stack -> heap. Instead, you'll find that the code loads the value from memory into a register, then stores it to memory from a register.
Finally, if the objects A and B are short-lived and aren't aliased, they may even be elided with their fields ending up on the stack or in registers. Then, this operation becomes even simpler (or may even be removed entirely if it has no effect).
1 These two steps actually take several JVM instructions each, but that's not important here.
When considering the JIT, things get complicated. I think my question is actually about the Java compiler, not the JVM.
If you are thinking of javac, you should assume that it does almost no optimisation and produces bytecode that is almost a literal translation of the code, which is very stack-based.
In fact the bytecode will be using the stack more than your example suggests. For A.a = B.b, with A and B as local references to objects, it does operations like
aload A        // push the reference A (the target of the store)
aload B        // push the reference B
getfield b     // pop B, push the value of B.b
putfield a     // pop the value and A, store the value into A.a
i.e. instead of a single move, there are notionally four stack operations.
The JIT, on the other hand, might optimise this so that it is not even using a register for the value. Depending on the processor, it might emit something like (illustrative pseudo-assembly):
// R3 contains A, R7 contains B,
// a starts at the 14th byte
// b starts at the 16th byte
MOVI [R3+14], [R7+16]
It doesn't depend on the implementation of the JVM. It depends on the Java Virtual Machine Specification, which doesn't provide any other way than via the stack.

Java: Find out memory size of object? [duplicate]

Possible Duplicate:
In Java, what is the best way to determine the size of an object?
In Actionscript I can usually just do:
var myVar:uint = 5;
getSize(myVar)
//outputs 4 bytes
How do I do this in Java?
If you turn off thread-local allocation buffers with -XX:-UseTLAB, you can check Runtime.freeMemory() before and after an allocation. However, in the case of local variables, they don't take space on the heap (as they use the stack), so you can't get their size that way.
However, an int is a 32-bit signed value and you can expect it to use 4 bytes (or more, depending on the JVM and the stack alignment etc.).
The sizeof in C++ is useful for pointer arithmetic. Since Java doesn't allow this, it isn't useful, and was possibly deliberately omitted to avoid developers worrying about low-level details.
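A rough sketch of that measurement idea (run it as java -XX:-UseTLAB ObjectSizeEstimate); it averages over many instances and assumes no other threads are allocating and no GC runs mid-measurement, so treat the number as an estimate only:
public class ObjectSizeEstimate {
    public static void main(String[] args) {
        final int count = 1_000_000;
        Object[] keep = new Object[count];           // hold references so nothing is collected mid-measurement
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory() - rt.freeMemory();
        for (int i = 0; i < count; i++) {
            keep[i] = new Object();
        }
        long after = rt.totalMemory() - rt.freeMemory();
        System.out.printf("~%.1f bytes per instance (%d objects kept)%n",
                (after - before) / (double) count, keep.length);
    }
}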
The only reason C has a sizeof operator (function? well, something) is that it's needed for manual memory management and some pointer-arithmetic stuff.
There's no need to have that in Java. Also, how much memory an object takes up is completely implementation-defined and can't be answered reliably, but you can try some statistics by allocating lots of the same object and averaging - this can work nicely if you observe some basic principles, but that's boring.
If we know some basics about our VM we can also just count memory, so for Hotspot:
2 words overhead per object
every object is 8-byte aligned (i.e. you have to round up to the next multiple of 8)
at least 1 word for variables, i.e. even if you have an object without any variables we "waste" 1 word
Also, you should know your language spec a bit, so that you understand why an inner class has one more reference than is obvious (the reference to its enclosing instance) and why a static nested class does not.
A bit of work, but then it's generally a rather useless thing to know - if you're that worried about memory, you shouldn't be using either ActionScript or Java but C/C++ - you may get identical performance in Java, but you'll generally use about a factor of 2 more memory while doing so...
I believe there is no direct way of doing this. @Peter Lawrey's suggestion could be a close approximation. But you cannot rely on calculating the object size by taking the difference between the available free memory before and after the objects are created, as there could be lots of other allocations happening in the background from other threads as well. Also, the garbage collector could fire up and free some memory in between your operations. Especially in a multithreaded environment, relying on the memory difference is definitely not a solution.
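For completeness, and not covered in the answers above: the standard java.lang.instrument API can report a shallow object size directly, at the cost of packaging a tiny agent. A minimal sketch, assuming you build it into a jar whose manifest contains Premain-Class: SizeAgent and start the JVM with -javaagent:sizeagent.jar (the jar name is just an example):
import java.lang.instrument.Instrumentation;

public class SizeAgent {
    private static volatile Instrumentation inst;

    // Called by the JVM before main() when the agent jar is passed via -javaagent
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        inst = instrumentation;
    }

    // Shallow size of one object in bytes; it does not follow references to other objects
    public static long sizeOf(Object o) {
        return inst.getObjectSize(o);
    }
}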
