I want to know why an array created in Java is static even when we use the new keyword to define it.
From what I've read, the new keyword allocates a memory space in the heap whenever it is encountered during run time, so why give the size of the array at all during definition?
e.g. Why can't
int[] array1=new int[20];
simply be:
int[] array1=new int[];
I know that it does not grow automatically and we have ArrayList for that but then what is the use of keyword new in this? It could have been defined as int array1[20]; like we used to do it in C, C++ if it has to be static.
P.S. I know this is an amateurish question but I am an amateur, I tried to Google but couldn't find anything comprehensive.
This may be an amateurish question, but it is one of the best amateurish questions you could ask.
In order for Java to allow you to declare arrays without new, it would have to support an additional kind of data type, which would behave like a primitive in the sense that it would not require allocation, but it would be very much unlike a primitive in the sense that it would be of variable size. That would have immensely complicated the compiler and the JVM.
The approach taken by Java is to provide a bare minimum of primitives, sufficient to get most things done efficiently, and let everything else be done using objects. That's why arrays are objects.
Also, you might be a bit confused about the meaning of "static" here. In C, "static" on a file-level declaration means "of file scope", that is, not visible to other object files. In C++ and in Java, "static" on a class member means "belongs to the class" rather than "belongs to instances of the class". So, the term "static" is not suitable for describing array allocation. "Fixed size" or "fixed, predefined size" would be more suitable terms.
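To illustrate that "fixed size" does not mean "size known at compile time", here is a small sketch. The class and method names are made up for this example; the point is only that the length is an ordinary run-time value and that the array is an object.

```java
final class RuntimeSizedArray {
    // The length is an ordinary run-time value; it becomes fixed
    // only once the array object has been created.
    static int[] makeArray(int n) {
        return new int[n];
    }

    public static void main(String[] args) {
        int[] a = makeArray(20);
        Object o = a;                           // compiles because an array is an object
        System.out.println(a.length);           // 20
        System.out.println(o instanceof int[]); // true
    }
}
```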
Well, in Java almost everything is an object (primitives are the exception), and that includes arrays (they carry their length and other data). That's why you cannot use
int var[20];
In Java the size is not part of a variable's type, so the compiler rejects that declaration. Instead, by using this:
int[] var;
You are declaring that var is of type int[] (int array) so Java understands it.
Also, in Java the length of the array and other data are stored in the array object itself. For this reason you do not give the size when declaring the variable; you give it when creating the array (using new), and that is when the data are stored.
Maybe there is a better reason that Oracle has already given, but the fact that in Java nearly everything is an object must have something to do with it. Java is quite specific about objects and types, unlike C, where you have more freedom but everything is looser (especially when using pointers).
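A minimal sketch of the declaration/creation split described above. The class and method names are invented for illustration; the variable is only a reference, and each array object it refers to carries its own length.

```java
final class ArrayVariableDemo {
    static int lengthAfterReassign() {
        int[] var;          // declares a reference only; no array exists yet
        var = new int[20];  // the size belongs to the array object created here
        var = new int[5];   // the same variable may later refer to an array of another length
        return var.length;
    }

    public static void main(String[] args) {
        System.out.println(lengthAfterReassign()); // 5
    }
}
```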
The main idea of the array data structure is that all its elements are located in one contiguous run of memory cells. That is why you cannot create an array with variable size: it would need an unbounded stretch of memory reserved for it, which is impossible.
If you want to change the size of an array, you have to recreate it.
Since arrays are fixed-size they need to know how much memory to allocate at the time they are instantiated.
ArrayLists or other resizing data structures that internally use arrays to store data actually re-allocate larger arrays when their inner array data structure fills up.
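The re-allocation idea can be sketched roughly like this. It is a toy class with made-up names, and doubling the capacity is an assumption chosen for simplicity, not a statement about the growth policy ArrayList actually uses.

```java
import java.util.Arrays;

final class GrowableIntArray {
    private int[] data = new int[2]; // deliberately tiny initial capacity
    private int size = 0;

    void add(int value) {
        if (size == data.length) {
            // Full: allocate a larger array and copy the old contents over.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() { return size; }

    int capacity() { return data.length; }

    public static void main(String[] args) {
        GrowableIntArray list = new GrowableIntArray();
        for (int i = 0; i < 5; i++) {
            list.add(i * 10);
        }
        System.out.println(list.size() + " elements, capacity " + list.capacity());
    }
}
```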
My understanding of OP's reasoning is:
new is used for allocating dynamic objects (which can grow, like ArrayList), but arrays are static (can't grow). So one of them is unnecessary: the new or the size of the array.
If that is the question, then the answer is simple:
Well, in Java new is necessary for every Object allocation, because in Java all objects are dynamically allocated.
Turns out that in Java, arrays are objects, different from C/C++ where they are not.
All of Java's variables are at most a single 64bit field. Either primitives like
int (32bit)
long (64bit)
...
or references to Objects which depending on JVM / config / OS are 64 or 32 bit fields (but unlike 64bit primitives with atomicity guaranteed).
There is no such thing as C's int[20] "type". Neither is there C's static.
What int[] array = new int[20] boils down to is roughly
int* array = malloc(20 * sizeof(java_int))
Each time you see new in Java you can imagine a malloc and a call to the constructor method in case it's a real Object (not just an array). Each Object is more or less just a struct of a few primitives and more pointers.
The result is a giant network of relatively small structs pointing to other things. And the garbage collector's task is to free all the leaves that have fallen off the network.
And this is also the reason why you can say Java is pass by value: both primitives and pointers are always copied.
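A small sketch of what "the pointer is copied" means in practice. The class and method names are invented for this example.

```java
final class PassByValueDemo {
    static void reassign(int[] arr) {
        arr = new int[] {99}; // only the callee's copy of the reference changes
    }

    static void mutate(int[] arr) {
        arr[0] = 99;          // the copied reference still points at the caller's array
    }

    public static void main(String[] args) {
        int[] a = {1};
        reassign(a);
        System.out.println(a[0]); // 1
        mutate(a);
        System.out.println(a[0]); // 99
    }
}
```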
Regarding static in Java: there is conceptually a struct per class that represents the static context of a class. That's the place where static variables are anchored. Non-static instance variables are anchored in their own instance struct.
class Car {
    static int[] forAllCars = new int[20];
    Object perCar;
}
...
new Car();
translates very loosely (my C is terrible) to
struct Car_Static {
    Object* forAllCars;
};
struct Car_Instance {
    Object* perCar;
};
// .. class load time. Happens once and this is referenced from some root object so it can't get garbage collected
struct Car_Static *car_class = (struct Car_Static*) malloc(sizeof(struct Car_Static));
car_class->forAllCars = malloc(20 * 4); // 20 ints of 4 bytes each
// .. for every new Car();
struct Car_Instance *new_reference = (struct Car_Instance*) malloc(sizeof(struct Car_Instance));
new_reference->perCar = NULL; // all things get 0'd
constructor(new_reference); // run the constructor on the fresh instance
// "new" essentially returns the "new_reference" then
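Back on the Java side, the difference between the per-class and the per-instance slot can be observed directly. This is a sketch that reuses the Car class from above and adds a main method purely for demonstration.

```java
final class Car {
    static int[] forAllCars = new int[20]; // one array reference, owned by the class
    Object perCar;                         // one slot per instance, null until assigned

    public static void main(String[] args) {
        Car first = new Car();
        Car second = new Car();
        Car.forAllCars[0] = 7;
        first.perCar = "only mine";
        System.out.println(Car.forAllCars[0]); // 7
        System.out.println(second.perCar);     // null
    }
}
```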
This is the usual way to declare a Java array:
int[] arr = new int[100];
But this array is using heap space. Is there a way we can declare an array using stack space like in C++?
Arrays are objects irrespective of whether they hold primitive or object types, so like any other object an array is allocated on the heap.
But then, from Java 6u23 on, escape analysis came into existence, and it is activated by default in Java 7.
Escape analysis is about the scope of an object: when an object is created inside a method and never stored or passed anywhere outside it, the JVM knows that the object can't escape this limited method scope and can apply various optimizations to it, such as scalar replacement and lock elision.
In effect, it can then keep the data of an object created in the method scope on the stack of the thread that is executing the method, instead of on the heap.
In a word, no.
The only variables that are stored on the stack are primitives and object references. In your example, the arr reference is stored on the stack, but it references data that is on the heap.
If you're asking this question coming from C++ because you want to be sure your memory is cleaned up, read about garbage collection. In short, Java automatically takes care of cleaning up memory in the heap as well as memory on the stack.
Arrays are dynamically allocated so they go on the heap.
I mean, what happens when you do this:
int[] arr = new int[4];
arr = new int[5];
If the first allocation was done on the stack, how would we garbage collect it? The reference arr is stored on the stack, but the actual array of data must be on the heap.
It's not yet supported as a language feature, because that would require value types since passing on-stack data by reference would not be safe.
But as an optimization (escape analysis) the JVM may already do that for local variables containing small, fixed-size arrays iff it can prove that it does not escape the local/callee scope. That said, it's just a runtime optimization and not some spec guarantee, so relying on it is difficult.
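As an illustration of "does not escape", here is a sketch with made-up names. The array below is never stored in a field, returned, or passed to another method, so it is the kind of allocation a JIT is allowed to optimize away. Whether a particular JVM actually does so cannot be seen from the source and is not guaranteed.

```java
final class LocalArrayDemo {
    // 'tmp' stays entirely inside this method, so it does not escape.
    static int sumOfSquares(int a, int b, int c) {
        int[] tmp = {a * a, b * b, c * c};
        return tmp[0] + tmp[1] + tmp[2];
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1, 2, 3)); // 14
    }
}
```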
I know that when I initialize a char array, I have to write
char[] b= new char[5];
or
char[] b= new char[5]({1,2,3,4,5});
Why can't it look like an ArrayList initialization,
ArrayList<Charset> list = new ArrayList<Charset>();
so that an array would be initialized as:
char[] b = new char[5](); ?
Why are they different? Is it part of Java's design philosophy, or is there some other reason behind it?
If you've ever used C, then the answer is fairly simple. In C, the way you create arrays is by allocating a static length of memory on the stack that is large enough to contain the number of elements, and point to the first element with a pointer - or dynamic length of memory on the heap, and point to the first element with a pointer.
int a[5]; //stack, static allocation
int* a = (int*)malloc(sizeof(int)*5); //heap, dynamic allocation
And in C++, the second version was changed to this, presumably because it makes it clearer what is happening:
int* a = new int[5];
And they took this type of array creation over to Java.
int[] a = new int[5];
Arrays don't really work like typical objects, hence why even creating them and manipulating them with reflection uses a different Array class in order to manipulate the object. (see http://docs.oracle.com/javase/tutorial/reflect/special/arrayInstance.html )
ArrayLists are different, because they're just everyday classes like most things in java, so you initialize them with an actual constructor call:
List<T> list = new ArrayList<T>();
Basically, arrays and classes just work in different ways.
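For reference, here is a short sketch of the array creation forms that Java does accept, plus the reflective route through java.lang.reflect.Array mentioned above. The class and helper method names are invented for this example.

```java
import java.lang.reflect.Array;

final class ArrayCreationForms {
    // Reflective creation goes through the Array helper class, not a constructor.
    static char[] viaReflection(int length) {
        return (char[]) Array.newInstance(char.class, length);
    }

    public static void main(String[] args) {
        char[] zeros = new char[5];                   // five '\u0000' chars
        char[] explicit = new char[] {'a', 'b', 'c'}; // length inferred from the initializer
        char[] shorthand = {'a', 'b', 'c'};           // allowed only in a declaration
        char[] reflective = viaReflection(5);
        System.out.println(zeros.length + " " + explicit.length + " "
                + shorthand.length + " " + reflective.length); // 5 3 3 5
    }
}
```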
That is simply the design of Java. ArrayList and arrays are two different things, so there is no need for them to share the same declaration syntax.
I guess the guys who created Java wanted to keep a syntax close to the C syntax. In Java, arrays are minimalist low-level objects, so their case is a bit particular.
ArrayList is a container, similar to std::vector in C++: it can add and remove elements, whereas an array can't change its size.
Arrays and ArrayList are used for different purposes. If you need a fixed-size collection of objects, go for an array; if you need a dynamically growing collection of objects, go for an ArrayList. The compiler needs some way to know which one you want, hence the different syntax.
Suppose there is an Integer array in my class:
public class Foo {
private Integer[] arr = new Integer[20];
.....
}
On a 64 bit architecture the space requirement for this is ~ (20*8+24) + 24*20 {space required for references + some array overhead + space required for objects}.
Why does Java store references to all 20 Integer objects? Wouldn't knowing the first memory location and the number of items in the array suffice? (I am assuming, as I read somewhere, that objects in an array are placed contiguously anyway.) I want to know the reason for this sort of implementation. Sorry if this is a noobish question.
Like every other class, Integer is a reference type. This means it can only be accessed indirectly, via a reference. You cannot store an instance of a reference type in a field, a local variable, a slot in a collection, etc. -- you always have to store a reference and allocate the object itself separately. There are a variety of reasons for this:
You need to be able to represent null.
You need to be able to replace it with another instance of a subtype (assuming subtypes are possible, i.e. the class is not final). For example, an Object[] may actually store instances of any number of different classes with wildly varying sizes.
You need to preserve sharing, e.g. after a[0] = a[1] = someObject; all three must refer to the same object. This is much more important (vital even) if the object is mutable, but even with immutable objects the difference can be observed via reference equality checks (==).
You need reference assignment to be atomic (cf. Java memory model), so copying the whole instance is even more expensive than it seems.
With these and many other constraints, always storing references is the only feasible implementation strategy (in general). In very specific circumstances, a JIT compiler may avoid allocating an object entirely and store its fields directly (e.g. on the stack), but this is an obscure implementation detail, and not widely applicable. I only mention this for completeness and because it's a wonderful illustration of the as-if rule.
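Three of the constraints listed above (null, subtypes, sharing) can be observed in a few lines. This is only a sketch with made-up names; the atomicity point is not something a small example can demonstrate.

```java
final class ReferenceSlotsDemo {
    static Object[] build() {
        Object[] slots = new Object[4];
        StringBuilder shared = new StringBuilder("x");
        slots[0] = slots[1] = shared;   // two slots, one object
        slots[2] = Integer.valueOf(42); // an instance of a different class in the same array
        shared.append("y");             // visible through both slots
        return slots;                   // slots[3] was never assigned and stays null
    }

    public static void main(String[] args) {
        Object[] s = build();
        System.out.println(s[0] == s[1]); // true
        System.out.println(s[1]);         // xy
        System.out.println(s[2]);         // 42
        System.out.println(s[3]);         // null
    }
}
```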
So in a language like C, memory is separated into 5 different parts: OS kernel, text segment, static memory, dynamic memory, and the stack.
If we declared a static array in C, we had to specify its size beforehand, and after that the size would be fixed forevermore. The program would allocate enough memory for the array and stick it in the static data segment as expected.
However I noticed that in Java, you could do something like this:
public class Test {
static int[] a = new int[1];
public static void main( String[] args ) {
a = new int[2];
}
}
and everything would work as you'd expect. My question is, why does this work in Java?
EDIT: So the consensus is that an int[] in Java acts more like an int* in C. So as a follow-up question, is there any way to allocate arrays in static memory in Java (if no, why not)? Wouldn't this provide quicker access to such arrays?
EDIT2: ^ this is in a new question now: Where are static class variables stored in memory?
The value of a is just a reference to an object. The array creation expression (new int[2]) creates a new object of the right size, and assigns a reference to a.
Note that static in Java is fairly separate to static in C. In Java it just means "related to the type rather than to any particular instance of the type".
In Java, any time you use the word new, memory for that object is allocated on the heap and a reference is returned. This is also true for arrays. The int[] a is just the reference to new int[1]. When you do new int[2], a new array is allocated and a is made to point to it. The old array will be garbage collected when needed.
You are creating a new array, not modifying the old one. The new array will get its own space and the old one will be garbage-collected (so long as nobody else holds a reference to it).
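The "so long as nobody else holds a reference" part can be made visible with a second variable. This is a sketch; the class name and the replace() helper are hypothetical and exist only for this example.

```java
final class ReassignDemo {
    static int[] a = new int[1];

    // Points 'a' at a freshly allocated, longer array and returns the previous one.
    static int[] replace() {
        int[] old = a;               // a second reference keeps the old array reachable
        a = new int[old.length + 1]; // a new array object; the old one is untouched
        return old;
    }

    public static void main(String[] args) {
        int[] old = replace();
        System.out.println(old.length); // 1
        System.out.println(a.length);   // 2
        System.out.println(old == a);   // false
    }
}
```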
I assume that by "static memory" you mean the heap. In Java, the heap serves a similar purpose to the "static data segment" you mentioned. The heap is where objects are allocated, including arrays. The stack, on the other hand, is where local variables (primitives and object references) that are used only during the life of a single method call are placed.
In Java you've merely asked that a strongly typed reference to an array be stored statically for the class Test. You can change what a refers to at runtime, which includes changing the size. This would be the C equivalent of a static storage of an int*.
In Java, a static variable exists as part of the class object. Think of it as an instance variable for the class itself. In your example, a is a reference variable, which refers to some array (or no array at all, if it is null), but the array itself is allocated as all arrays are in Java: off the heap.
Static has a different meaning in Java. In Java when you declare a variable as static it is a class variable and not an instance variable.
In Java, we can always use an array to store object reference. Then we have an ArrayList or HashTable which is automatically expandable to store objects. But does anyone know a native way to have an auto-expandable array of object references?
Edit: What I mean is I want to know if the Java API has some class with the ability to store references to objects (but not storing the actual object like XXXList or HashTable do) AND the ability of auto-expansion.
Java arrays are, by their definition, fixed size. If you need auto-growth, you use XXXList classes.
EDIT - question has been clarified a bit
When I was first starting to learn Java (coming from a C and C++ background), this was probably one of the first things that tripped me up. Hopefully I can shed some light.
Unlike C++, Object arrays in Java do not store objects. They store object references.
In C++, if you declared something similar to:
String myStrings[10];
You would get 10 String objects. At this point, it would be perfectly legal to do something like println(myStrings[5].length); - you'd get '0' - the default constructor for String creates an empty string with length 0.
In Java, when you construct a new array, you get an empty container that can hold 10 String references. So the call:
String[] myStrings = new String[10];
System.out.println(myStrings[5].length());
would throw a NullPointerException, because you haven't actually placed a String reference into the array yet.
If you are coming from a C++ background, think of new String[10] as being equivalent to new String*[10] in C++.
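A runnable sketch of the behaviour just described: a new String[10] holds ten null references until you put something in each slot. The class name and the filled() helper are made up for illustration.

```java
final class NullSlotsDemo {
    // Creates the array and then puts an actual String reference into every slot.
    static String[] filled(int n) {
        String[] s = new String[n]; // n null references, no String objects yet
        for (int i = 0; i < n; i++) {
            s[i] = "";
        }
        return s;
    }

    public static void main(String[] args) {
        String[] myStrings = new String[10];
        System.out.println(myStrings[5]); // null
        try {
            myStrings[5].length();        // dereferences null
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");
        }
        System.out.println(filled(10)[5].length()); // 0
    }
}
```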
So, with that in mind, it should be fairly clear why ArrayList is the solution for an auto expanding array of objects (and in fact, ArrayList is implemented using simple arrays, with a growth algorithm built in that allocates new expanded arrays as needed and copies the content from the old to the new).
In practice, there are actually relatively few situations where we use arrays. If you are writing a container (something akin to ArrayList, or a BTree), then they are useful, or if you are doing a lot of low level byte manipulation - but at the level that most development occurs, using one of the Collections classes is by far the preferred technique.
The standard classes implementing Collection are generally expandable and store only references: you don't store objects in them, you create the objects elsewhere on the heap and only manipulate references to them, until no reference to them remains and they can be garbage collected.
You can put a reference to an object in two or more Collections. That's how you can have sorted hash tables and such...
What do you mean by "native" way? If you want an expandable list of objects then you can use the ArrayList. With List collections you have the get(index) method that allows you to access objects in the list by index which gives you similar functionality to an array. Internally the ArrayList is implemented with an array and the ArrayList handles expanding it automatically for you.
Straight from the Arrays page of the Java Tutorials on the Sun website:
-> An array is a container object that holds a fixed number of values of a single type.
Because the size of the array is declared when it is created, there is actually no way to expand it afterwards. The whole purpose of declaring an array of a certain size is to only allocate as much memory as will likely be used when the program is executed. What you could do is declare a second array whose size is derived from the size of the original, copy all of the original elements into it, and then add the necessary new elements (although this isn't very 'automatic' :) ). Otherwise, as you and a few others have mentioned, the List collections are the most efficient way to go.
In Java, all object variables are references. So
Foo myFoo = new Foo();
Foo anotherFoo = myFoo;
means that both variables are referring to the same object, not to two separate copies. Likewise, when you put an object in a Collection, you are only storing a reference to the object. Therefore using ArrayList or similar is the correct way to have an automatically expanding piece of storage.
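The same thing can be shown with two lists that hold one object. This is a sketch; StringBuilder is used only because it is mutable, and the class and helper names are invented here.

```java
import java.util.ArrayList;
import java.util.List;

final class SharedReferenceDemo {
    // Copies the references, not the objects they point to.
    static List<StringBuilder> copyOfReferences(List<StringBuilder> source) {
        return new ArrayList<>(source);
    }

    public static void main(String[] args) {
        StringBuilder foo = new StringBuilder("a");
        List<StringBuilder> first = new ArrayList<>();
        first.add(foo);
        List<StringBuilder> second = copyOfReferences(first);
        foo.append("b");                                   // change the one shared object
        System.out.println(second.get(0));                 // ab
        System.out.println(first.get(0) == second.get(0)); // true
    }
}
```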
There's no first-class language construct that does that that I'm aware of, if that's what you're looking for.
It's not very efficient, but if you're just appending to an array, you can use Apache Commons ArrayUtils.add(). It returns a copy of the original array with the additional element in it.
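If you would rather not pull in a library, the same copy-and-append idea can be sketched with the JDK alone. This is not the Commons implementation, just a hypothetical helper built on Arrays.copyOf.

```java
import java.util.Arrays;

final class AppendByCopy {
    // Returns a new array one element longer; the original is left unchanged.
    static int[] append(int[] original, int value) {
        int[] copy = Arrays.copyOf(original, original.length + 1);
        copy[original.length] = value;
        return copy;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3};
        int[] b = append(a, 4);
        System.out.println(Arrays.toString(b)); // [1, 2, 3, 4]
        System.out.println(a.length);           // 3
    }
}
```

Every call copies the whole array, so appending n elements one at a time this way costs O(n^2) copies overall.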
If you can write your code in JavaScript, yes, you can do that. JavaScript arrays are sparse arrays and will expand whichever way you want.
You can write
a[0] = 4;
a[1000] = 434;
a[888] = "a string";