Java: Where are non-static methods actually stored? - java

Since in C++ all the methods (non-static and static) are global (stored in static memory area), I am wondering if it is also true for Java.
My guess is that Java stores object methods the same way C++ does, since storing non-static methods on the heap or the stack would waste memory.

All code is stored 'globally'. It has nothing to do with the heap or the stack. They are for data.

When you load a class, a JVM-internal representation of the class is created. This contains, or has pointers to, all the data in the .class file, including the bytecode sequences for the individual methods.
As a part of the class loading process, a table of instance method pointers is created, with one "slot" for each method in either that class or (recursively) its superclasses. The pointers to the individual bytecode sequences (and, should the code be JIT-compiled, the machine code) are placed in that table.
There is only one instance of this table (and the data it points to) for all the instances of the associated class.
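A minimal sketch of the "one table per class" point, seen from the Java side. Reflection only exposes the logical view, not the JVM's internal table, and the names SharedMethodsDemo and Point are made up for this illustration.

```java
import java.lang.reflect.Method;

class SharedMethodsDemo {
    // Hypothetical class used only for this illustration.
    static class Point {
        int x;
        Point(int x) { this.x = x; }
        int getX() { return x; }
    }

    // True when both instances are described by the very same Class object.
    static boolean sameClassObject(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) throws Exception {
        Point p = new Point(1);
        Point q = new Point(2);
        // Each instance carries its own field data ...
        System.out.println(p.getX() + " " + q.getX());
        // ... but both refer to one shared class representation.
        System.out.println(sameClassObject(p, q));
        // Looking the method up through either instance describes the same method.
        Method viaP = p.getClass().getDeclaredMethod("getX");
        Method viaQ = q.getClass().getDeclaredMethod("getX");
        System.out.println(viaP.equals(viaQ));
    }
}
```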

Related

Memory Allocation for objects in java

I am a beginner in Java and I had completed C++.
In C++, memory for member functions is allocated when they are declared as part of a class, not when objects are created. When objects are created, memory is allocated only for the instance variables; that is, each object gets its own instance variables, while the member functions are shared by all the objects.
I know the same is true of instance variables in Java, but what happens with member functions?
In C++:
For example, if we have two instance variables a and b, create three objects x, y, z, and have a member function getData(),
then all three objects have a separate copy of the two instance variables a and b, but share a common copy of getData().
For instance variables, the same holds in Java, but what about member functions?
In Java, the bytecode for the methods exists only once for a class; no copy of the method's bytecode is made for every object. That would be unnecessary and wasteful; the bytecode does not change while the program runs.
So it works the same as what you say for C++:
... the member functions are shared commonly by all the objects.
In Java, member functions are loaded on-demand by the classloader. Since the entire Java standard library is available alongside any Java program, and the program itself may contain any number of classes, this is necessary in order to keep program start times in the range of sanity.
So any time the program needs a class in any way (e.g. to access a static variable or method, or to create an instance of an object) that isn't already loaded, the Java classloader loads that class, which includes the class's member functions. Then, once that class is loaded, it doesn't need to be loaded again if it's needed in the future, because similarly to what you noted in your question, only one copy of the class bytecode is necessary at any given time.
To avoid accumulating more and more loaded classes as the program runs, Java uses garbage collection to unload classes that it knows it can safely unload, similarly to how it uses garbage collection for program data. In practice, a class becomes eligible for unloading only when its defining class loader is no longer reachable, so classes loaded by the bootstrap or system class loader normally stay loaded. Unloaded classes can be reloaded later if they are needed again. There are situations in which unloading a class wouldn't work due to the risk that reloading it would cause static variables to be reinitialized or static code to be run again.
This ability to unload and reload classes can also be put to use in debugging: IntelliJ IDEA has a HotSwap feature that allows you to edit and recompile a class while the program is running, unload that class, then reload the new bytecode so you can quickly test ideas. (Wow, I just realized that totally sounded like an advertisement. I'm not affiliated with JetBrains, in case you're wondering.)

Compile-Time By-Reference Parameters on the JVM

Currently developing on a custom programming language on the JVM, I would like the language to support by-reference parameters in methods. How would I go about doing that? So far, I was able to come up with three different ways to accomplish this.
Wrapper Objects
The idea behind this is to create a wrapper object that is created containing the current value of the field, passed to the by-ref method call, and then unboxed after the call. This is a fairly straight-forward way to do this, but requires a lot of 'garbage' objects that are created and immediately discarded.
Arrays
Simply create an array of the type with 1 element, put the field value in the array, call the method passing the array, and finally assign the field from the array. The nice thing about this is that it ensures runtime type-safety, unlike a generic wrapper class, which would require additional casts.
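As a rough sketch of the array approach, this is what a compiler for such a language might emit for a by-reference call. The class and method names are hypothetical, and the call site is written by hand here rather than generated.

```java
class ArrayRefDemo {
    int counter = 10; // the field to be passed "by reference"

    // Callee: reads and writes the caller's value through slot 0 of the array.
    static void increment(int[] ref) {
        ref[0] = ref[0] + 1;
    }

    // Hand-written stand-in for the code a compiler could generate at the call site.
    void callIncrement() {
        int[] tmp = { counter }; // copy the field value into a 1-element array
        increment(tmp);          // pass the array instead of the field
        counter = tmp[0];        // assign the field from the array afterwards
    }

    public static void main(String[] args) {
        ArrayRefDemo d = new ArrayRefDemo();
        d.callIncrement();
        System.out.println(d.counter); // prints 11
    }
}
```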
Unsafe
This one is slightly more advanced: Use sun.misc.Unsafe to allocate some native memory space, store the field value on that memory, call the method and pass the address, re-assign the field from the native memory address, and free it up again.
Bonus: Is it possible to implement pointers and pointer arithmetic using the Unsafe class?
Wrapper Objects
[...] but requires a lot of 'garbage' objects that are created and immediately discarded.
If the lifetime of such a wrapper is limited to a callsite (+ inlined callee), then the compiler may be able to prove that through escape analysis and avoid the allocation by decomposing the wrapper object into its primitive members and using them directly in the generated code.
That essentially requires that those reference-wrappers are never stored to fields and only passed as method arguments.
Unsafe
Use sun.misc.Unsafe to allocate some native memory space, store the field value on that memory
You cannot store object-references in native memory. The garbage collector would not know about it and thus could change the memory address under your feet or GC the object if that is your only reference.
But since you're creating your own language you could simply desugar field references into object references + an offset. I.e. pass two parameters (object ref + long offset) instead of one. If you know the offset you can use Unsafe to manipulate the field.
Obviously this will only work for object fields. Local references cannot be changed this way.
Bonus: Is it possible to implement pointers and pointer arithmetic using the Unsafe class?
Yes for unmanaged memory.
For memory within the managed heap you are only allowed to point to objects themselves and do pointer arithmetic relative to the object header.
And you always must store object references in Object-typed fields. Storing them in a long would lead to GC-implementations (precise ones at least) missing the reference.
Edit: You might also be interested in ongoing work in the JDK regarding VarHandles.
It's something you probably want to keep in mind when developing your language.
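A small sketch of how a VarHandle (Java 9 and later) can play the role of the "object reference + offset" pair without touching Unsafe. The Counter class and the addTen method are assumptions made up for the example.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class VarHandleRefDemo {
    // Hypothetical class whose field we want to pass by reference.
    static class Counter { int value; }

    // One handle per field; it plays the role of the "offset".
    static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup().findVarHandle(Counter.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Callee receives (object, handle) instead of a raw address.
    static void addTen(Counter target, VarHandle field) {
        int old = (int) field.get(target);
        field.set(target, old + 10); // the write goes straight to the field
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.value = 5;
        addTen(c, VALUE);
        System.out.println(c.value); // prints 15
    }
}
```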
It seems you have missed an important point about the pass-by-reference concept: whenever a write into the reference happens, the referenced variable will be updated. This is different from any concept like yours that will actually pass a copy in a holder and update the original variable upon method return.
You can notice the difference even in single-threaded use case:
foo(myField, ()-> {
// if myField is pass-by-reference, whenever foo() modifies
// it and calls this Runnable, it should see the new value:
System.out.println(myField);
});
Of course, you could make both references accessing the same wrapper, but for an environment allowing (almost) arbitrary code, it would imply that you would have to replace every reference to the field (in the end, change the contents of the field) to the wrapper.
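To make the difference concrete, here is a sketch of the copy-in/copy-out scheme applied to the snippet above. IntRef is a hypothetical holder type, and foo is written so that it modifies the reference before invoking the callback.

```java
class CopyInOutDemo {
    // Hypothetical holder standing in for a by-reference int parameter.
    static final class IntRef {
        int value;
        IntRef(int value) { this.value = value; }
    }

    int myField = 1;

    // Callee writes through the reference, then runs the callback before returning.
    static void foo(IntRef ref, Runnable callback) {
        ref.value = 42;
        callback.run();
    }

    // Returns the value of myField that the callback observed during the call.
    int observedDuringCall() {
        int[] seen = new int[1];
        IntRef tmp = new IntRef(myField);  // copy in
        foo(tmp, () -> seen[0] = myField); // the callback reads the field itself
        myField = tmp.value;               // copy out, too late for the callback
        return seen[0];
    }

    public static void main(String[] args) {
        CopyInOutDemo d = new CopyInOutDemo();
        System.out.println(d.observedDuringCall()); // prints 1, not 42
        System.out.println(d.myField);              // prints 42
    }
}
```

With true pass-by-reference the callback would have seen 42, because the write inside foo would have gone directly to the field.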
So if you want to implement a clean, real pass-by-reference mechanism within the JVM, it must be able to modify the referenced artifact, i.e. field or array slot. For local variables, there is no way to do it, so there’s no way around replacing local variables with a holder object once a reference to it has been created.
So the options are already known: you can pass a java.lang.reflect.Field (does not work with array slots), a pair of java.lang.invoke.MethodHandle, or an arbitrarily typed object (of a generated type) offering read and write access.
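One way the MethodHandle-pair option could look, as a hand-written sketch rather than generated code. IntFieldRef and Counter are hypothetical names; a real implementation would also have to cover static fields and array slots.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

class HandlePairDemo {
    // Hypothetical class whose field is passed by reference.
    static class Counter { int value; }

    // Hypothetical reference type: a getter/setter pair bound to one object.
    static final class IntFieldRef {
        private final MethodHandle getter; // type after bindTo: ()int
        private final MethodHandle setter; // type after bindTo: (int)void

        IntFieldRef(Counter target) {
            try {
                MethodHandles.Lookup lookup = MethodHandles.lookup();
                getter = lookup.findGetter(Counter.class, "value", int.class).bindTo(target);
                setter = lookup.findSetter(Counter.class, "value", int.class).bindTo(target);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }

        int get() {
            try {
                return (int) getter.invokeExact();
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }

        void set(int v) {
            try {
                setter.invokeExact(v);
            } catch (Throwable t) {
                throw new IllegalStateException(t);
            }
        }
    }

    // Callee: every write is immediately visible in the referenced field.
    static void doubleIt(IntFieldRef ref) {
        ref.set(ref.get() * 2);
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        c.value = 21;
        doubleIt(new IntFieldRef(c));
        System.out.println(c.value); // prints 42
    }
}
```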
When implementing this reference accessor type, you can resort to Unsafe to create an anonymous class just like Java’s lambda expression facility does. In fact, you can take a lot of inspiration from the lambda expression mechanism:
Put an invokedynamic instruction at the place where a reference has to be created, pointing to your factory method and providing a handle to the field or array slot
Let the factory analyze the handle and dynamically create the accessor implementation, the main difference being that your type will have two operations, read and write
Use Unsafe to create that class (which might access the field, even if it's private)
If the field is static, create an instance and return a CallSite with a handle returning that instance
Otherwise return a CallSite with a handle pointing to the constructor of the accessor class accepting an object instance or an array
This way you will only have an overhead at first-time usage, while subsequent uses will either use a singleton in the case of static fields or construct an accessor on the fly for instance fields and array slots. The creation of these accessor instances can be elided by HotSpot's escape analysis if used frequently, just like with ordinary objects.

Recursively call a function in Java which creates new objects

Suppose a function is called recursively with smaller arguments, and within this function we create an object of a class. The objects created recursively will all have the same name, so we cannot preserve name uniqueness. How can we handle such cases in Java?
I think this question stems from a misunderstanding. In Java, the name you give to a local variable is 100% irrelevant at the time the code runs - the only purpose is for you, the programmer, to specify which variable you are talking about (by giving its name, and having the compiler figure out what you mean by looking in the local scope, the scope above it and so on).
So, if you have a recursive method that calls itself, and in this method declare variables that hold new objects, then there is no clash as far as Java is concerned and they will all correctly refer to distinct objects in distinct places in memory.
If you actually meant 'I want to record all the new objects I make in my recursive method, but have them be distinctly referable to', then start by making a collection (ArrayList for example) one of the parameters to your recursive method - then you can add all newly made objects to this collection and when it fully returns, it will be full of your newly made objects. But if that's not distinguishing enough, then you need to ask 'what would distinguish these objects?' which will depend upon what the object is for (should some parameter of the recursive method be part of the 'name'? some other state? or does it just need to be random and unique?).
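A short sketch of both points: the local variable has the same name in every recursive call yet refers to a distinct object each time, and a list passed as a parameter collects them. Node and build are made-up names for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class RecursionDemo {
    // Hypothetical object created once per recursive call.
    static final class Node {
        final int depth;
        Node(int depth) { this.depth = depth; }
    }

    // Recurses with a smaller argument and records every object it creates.
    static void build(int n, List<Node> created) {
        if (n == 0) {
            return;
        }
        Node node = new Node(n); // same variable name in every frame, new object each time
        created.add(node);
        build(n - 1, created);
    }

    public static void main(String[] args) {
        List<Node> all = new ArrayList<>();
        build(3, all);
        System.out.println(all.size());               // prints 3
        System.out.println(all.get(0) != all.get(1)); // prints true: distinct objects
    }
}
```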

How variables are created inside a class when the class is not real?

I have read a class is a model for creating objects and does not exist physically whereas Objects are real. But we are creating variables inside a class and are even manipulating them.
How is that possible when class does not exist physically?
When is the memory created for these variables?
Where is the memory created for these variables?
If you mean static class variables, they are guaranteed to be initialized, and any static initialization code inside the class is guaranteed to have run, before the class is used. Loading may happen earlier or later depending on the JVM, but initialization itself is triggered by the first active use of the class (creating an instance, calling a static method, or accessing a non-constant static field). Static variables are basically the same thing as global variables in languages which have those.
So to reiterate: static stuff exists and is initialized before it is first used. JVM implementation takes care of that.
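The "initialized before first use" guarantee can be observed with a static initializer that writes to a log. This is only a sketch; Config, LOG and readLimit are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class StaticInitDemo {
    // Records the order of events.
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical class with static state and a static initializer.
    static class Config {
        static int limit;
        static {
            limit = 100;
            LOG.add("Config initialized");
        }
    }

    // The first access to Config.limit triggers Config's static initializer.
    static int readLimit() {
        LOG.add("about to use Config");
        return Config.limit;
    }

    public static void main(String[] args) {
        System.out.println(readLimit()); // prints 100
        System.out.println(LOG);         // prints [about to use Config, Config initialized]
    }
}
```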
But there is an object: each loaded class is represented by an instance of the class java.lang.Class.
Addition: In fact, classes exist so concretely in Java that they can be serialized, transferred over the network to a different JVM, deserialized there, objects of the class created there and code executed. A simple example of this is vanilla Java applets running in a browser. Another example is the slave nodes in the Jenkins/Hudson CI system, where the slave program is very small and only contains code to receive, deserialize and instantiate both classes and objects of these classes, sent by the master server they're connected to.
Try thinking of it this way. This is NOT an accurate explanation of exactly how any Java runtime does this, but a way of thinking of the class/object duality that may help you.
When you write a class X, you describe both code and data. The runtime will need only one copy of some things -- the code and static variables, for instance -- and one copy per object of other things, like the instance variables. You describe both these things in the class file you write, even though they will be stored separately.
Think of the one-copy-per-class things as all being stored in a block of memory together -- it would be called a struct in C. In Java, the first time the class X is referenced in your program, the runtime allocates this block of memory and associates it with the class X.
When the program executes a statement such as "X x1 = new X()", the runtime allocates another block of memory (or struct) containing all the instance variables, and keeps a separate pointer to that associated with the x1 variable.
Now, when the program executes something like "Arc arc = x1.getArc();", the runtime uses the first pointer to reference the code in the method getArc(), and the second pointer to reference the instance variables associated with x1, and executes the indicated code using those instance variables.
OO programming provides this way of associating data with code that manipulates it, allowing us to organize the program as 'objects' of combined code and data. The runtime does the business of keeping track of the different copies of things for us.
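The one-copy-per-class versus one-copy-per-object split described above, as a minimal sketch. The fields instancesCreated and id are invented for the example.

```java
class ClassVsInstanceDemo {
    static class X {
        static int instancesCreated = 0; // one copy, stored with the class
        final int id;                    // one copy per object

        X() {
            instancesCreated++;
            id = instancesCreated;
        }

        // The code of getId() exists once; 'this' selects which object's data it reads.
        int getId() {
            return id;
        }
    }

    public static void main(String[] args) {
        X x1 = new X();
        X x2 = new X();
        System.out.println(x1.getId() != x2.getId()); // prints true: separate instance data
        System.out.println(X.instancesCreated);       // prints 2: shared by all instances
    }
}
```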
And I think it's inaccurate to say the class will not exist, it just won't exist in the form in which you wrote it.
Classes do exist physically in the JVM at runtime. The explanation that you read was trying to labour one point while leaving the rest of the detail for later. Schools and books do this all of the time.
Within the Oracle JVM, classes have a physical representation from the moment that they are loaded. In fact, every object has a pointer to its class, and many objects can point at the same class.
I wouldn't think of a class or objects as physical things; that seems confusing to me. A class is often described as a blueprint for an object. Objects must be instantiated (created) using the new keyword; when an object is instantiated, its class is used as a blueprint to create the basic default object. The object may then be manipulated, even at runtime, by referencing the location in memory where it is stored and using the methods of the object's class to manipulate the object's fields. At least this is the way it usually should be done; it's what's known as encapsulation and is incredibly important in OOP, so if you're not familiar with encapsulation I would recommend looking into it.
I mentioned an object can be manipulated at runtime; that is a major difference between an object and a class. Classes can be as well, using something called reflection, but that's another topic for another day. A variable or field is sometimes described as an address: it is a reference to the location in memory where the object is stored. Objects are not accessed directly; they are referenced through variables.
JLabel label;
The code above declares a variable that can hold a reference to a JLabel object once one is instantiated with the new keyword. So far no object has been created; we have only declared a variable that WILL refer to the location in memory where our object is stored when it is created. We are not going to access our JLabel object directly; we are going to use the 'label' variable we've created to reference the actual object in memory. So if we create two JLabel objects and instantiate them, like this...
JLabel label;
label = new JLabel();
JLabel anotherLabel = new JLabel("this is another label");
We now have two JLabel objects. For the first, we declare a variable to reference the object, then instantiate it on a separate line. For the second, we declare the reference and instantiate the object all in one line. You can create objects either way, and there are different reasons for using each approach. When an object is created, one of its constructors is called: the first object uses the JLabel constructor that takes no parameters; the second uses the JLabel constructor that takes a String and creates the object displaying the text passed in to the constructor.
Now imagine the program is running and we want to change the first object so it will display some text because it currently is blank since we used the constructor that doesn't take any parameters. We can manipulate the object using the setText(String) method like this.
label.setText("now the first label displays text");
We have not altered the JLabel class in any way by using that method; however, we have changed the object so now it displays text. I'm probably going to lose a bunch of reputation points, or whatever they are, for this answer because I probably didn't explain every detail exactly correctly, but I answered this question because you're asking something that was pretty difficult for me to understand for a long time, probably more so than most because I've never taken a programming class. There's so much to this that I can't possibly explain it entirely without writing a book, so I didn't go into things like scope, access modifiers, static, etc., but I tried to cover what I think is important to understand what you're asking. Like I said, I have no formal education, so take my answer for what it's worth, but hopefully I was able to make it a little easier to understand.
Oh, I forgot about your other question. A variable for an object is created when it is declared. At that point the variable exists, but it does not refer to anything yet (a field holds null until it is assigned) because there is no object. The memory needed to actually store the object is allocated when the object is instantiated.

Java: Where is the memory allocated for the physical bytes of a class when loaded by a ClassLoader?

I have constructed a tiny custom class loader in a dummy application in order to understand how dynamic class loading works. For this question, I don't need to go into details about what it does other than to mention that it instantiates two different instances of my class loader and has each one load different classes, in order that I can satisfy myself by confirming a "ClassNotFoundException" from one of the class loader instances when only the other has loaded a particular class.
However, I have a question that can be easily expressed by the following, hopefully self-explanatory line of code.
Class clazz = myClassLoader.loadClass(theClazz);
This line of code causes my custom class loader to LOAD the class bytes into memory, and to return an instance of a Class object for that class.
My question is this: Where are the physical bytes of memory for the loaded class located (i.e., the contents of the .class file)? Are they stored inside the ClassLoader object, or are they stored inside the Class object (whereupon the ClassLoader object merely contains an internal reference to this Class object) - or somewhere else entirely?
The classloader object has a Collection of all classes it has loaded.
If the same physical class is loaded by two different class loaders, the bytes of that class are in memory twice. The two classes behave like different types. They are not compatible with each other! Where the bytes are stored is not really relevant; I wonder why you want to know that. If you write your own ClassLoader you can "store" them wherever you want. However, at some point you will make a call like ClassLoader.defineClass(String, byte[], int, int). Then the relevant structures in memory inside the VM are created (method area, constant pool, etc.) as mentioned in other answers.
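The flip side of this can be checked without writing a custom loader: a class's identity is its name plus its defining loader, so asking the same loader for the same name again yields the same Class object rather than a second copy. A minimal sketch, with Sample as a made-up class:

```java
class LoaderIdentityDemo {
    // Hypothetical class used only to have something to load.
    static class Sample {}

    // Asks Sample's own defining loader to load it again by name.
    static boolean sameClassOnReload() {
        try {
            ClassLoader loader = Sample.class.getClassLoader();
            Class<?> again = loader.loadClass(Sample.class.getName());
            return again == Sample.class; // same loader + same name => same Class object
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameClassOnReload()); // prints true
        // Core classes are defined by the bootstrap loader, which Java reports as null.
        System.out.println(String.class.getClassLoader()); // prints null
    }
}
```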
From the source code for ClassLoader:
// The classes loaded by this class loader. The only purpose of this table
// is to keep the classes from being GC'ed until the loader is GC'ed.
private Vector classes = new Vector();
The source code for the Java classes is located in src.zip in your JDK directory.
Edit:
Was that what you asked about?
At the lowest level, the binary representation of the class is present in various runtime areas of the virtual machine, most notably in the Method Area and in the Runtime Constant Pool. In simpler terms, the Method Area is expected to contain information about the class, including the code for methods and constructors as evidenced by the following quote from the Virtual Machine Specification:
The Java virtual machine has a method area that is shared among all Java virtual machine threads. The method area is analogous to the storage area for compiled code of a conventional language or analogous to the "text" segment in a UNIX process. It stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors, including the special methods (§3.9) used in class and instance initialization and interface type initialization.
"This line of code causes my custom class loader to LOAD the class bytes into memory, and to return an instance of a Class object for that class"
If I understand your question correctly, memory allocation for objects is done in the heap space of the Java process.
It depends on the JVM, seen for example here or here. Old versions of Mac OS used a pointer to pointer scheme, called a handle.
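The class file and its internal, JVM-specific representation are usually stored in the Permanent Generation, at least in the Sun/Oracle incarnation of the JVM up to Java 7. From Java 8 onward, HotSpot keeps class metadata in Metaspace, which lives in native memory.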
See What does PermGen actually stand for? for more links.
