Where are the elements of a collection stored in java? [closed] - java

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 months ago.
I'm new to java and trying to understand where implementations of the Collection interface are storing the elements I add to it. I understand how to work with collections using the various methods, but what I don't understand is what is happening under the hood when I write for example:
Collection<Integer> intList= new ArrayList();
intList.add(3);
Is a new field created in the class for the element 3? If so how are these various fields linked to one another when I create the iterator?
Iterator<Integer> intIter = intList.iterator();
boolean test = intIter.hasNext();
Or how are the various elements attached to indices when working with a less general interface like a List?

First of all, Collection is just an interface; the implementation here is ArrayList. So we can focus on the details of ArrayList.
ArrayList is implemented with an internal array. So add(3) effectively sets arr[index] = 3, where index starts at zero.
If you view the source code, you will see that iterator() returns a small wrapper that walks over the array. Its key variables are cursor, lastRet, and expectedModCount.
Finally, as a beginner, reading the source code may resolve most of your questions.
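To make the "internal array plus cursor" idea concrete, here is a minimal sketch. TinyArrayList is a hypothetical, heavily simplified stand-in written for illustration; it is not the JDK source, and the starting capacity and growth step are assumptions chosen to mirror what the answers describe.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical, simplified stand-in for ArrayList; not the JDK source.
final class TinyArrayList<E> implements Iterable<E> {
    private Object[] elements = new Object[10]; // the backing array
    private int size = 0;                       // number of slots in use

    void add(E e) {
        if (size == elements.length) {
            // Backing array is full: allocate a larger one and copy the references over.
            elements = Arrays.copyOf(elements, size + (size >> 1));
        }
        elements[size++] = e; // store the reference in the next free slot
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return (E) elements[index];
    }

    int size() {
        return size;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int cursor = 0; // index of the next element to return

            @Override
            public boolean hasNext() {
                return cursor < size;
            }

            @Override
            public E next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return get(cursor++);
            }
        };
    }

    public static void main(String[] args) {
        TinyArrayList<Integer> list = new TinyArrayList<>();
        list.add(3);
        System.out.println(list.iterator().hasNext()); // true
    }
}
```

So no new field is created per element: there is one array field, and the iterator is just an index into it.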

Collection<Integer> intList= new ArrayList();
This is syntactic sugar. So let's desugar it to understand what happens. Specifically, this code does 3 mostly unrelated things:
Declare a new typed variable, named intList.
Create a new ArrayList object. Objects do not have names.
Copy over the reference to this object into the intList variable.
In java, all variables (so also all parameters, and all fields) are either "primitive" or "reference". That's exhaustive: Those are the only 2 options. Primitives are all variables whose type is one of the primitive types. The primitive types are hardcoded into the java lang spec. They are int, long, double, float, short, byte, char, and boolean - and that's it. Those are the 8. Note that String is not amongst them.
For the primitives, a variable just stores the value. This is simple: all the primitives have the property that their size is fixed and small. At most they are 64 bits, and all CPUs built in the last decade pretty much operate on 64-bit units so that lines up nicely.
For the refs, not so much. A string, for example, can be incredibly large. So, instead, the variable does not contain the data. It instead contains a reference to the data - a pointer. Except we don't like to call it that because the term 'pointer' has baggage, but make no mistake: It's a pointer.
Here, intList is the variable that can only hold references (because its type is not primitive). new ArrayList() creates an entire object and the expression resolves to a reference, which is then assigned to the intList variable.
Think of it like a gigantic beach. new ArrayList() creates a new treasure chest out of thin air, and buries it in the sand. intList is not the treasure chest. No, it's a map to the treasure chest. The . operator is java-ese for: Follow the map and dig up the chest. intList.add(5) means: Take your treasure map called intList, and walk to the X. Now dig. Now open the treasure chest. Now yell add(5) at it. Which does.. whatever the docs say it does. Could be anything, that's the joy of programming.
If you then say intList = null, you're not destroying the treasure chest. Nope, you merely erase your treasure map. The treasure chest is still buried in the sand. However, java has automatic garbage collection: Any treasure chest with the property that no maps exist anymore that could let you find it is 'garbage' and will eventually be dug up and tossed out by the garbage collector. In C and some other languages, you can go on arbitrary digging sprees. Not so in java - you cannot dig, the language will not let you, unless you have a treasure map to lead the way.
That beach is 'the heap' and it is huge: it can be gigabytes in size (the default maximum depends on the JVM and on how much memory the machine has). It grows as needed, up to that maximum, and the garbage collector tosses out the garbage whenever it is necessary to make some room.
So how does ArrayList work, what is in that treasure chest? Simple: Every non-static field as defined in the class ArrayList along with every field in the class mentioned in the extends clause (ArrayList is defined as class ArrayList extends AbstractList, so every field AbstractList has too, and so on). Of course, fields are primitive or treasure maps too, so really those treasure chests aren't as large as you might imagine. They really just store numbers and maps, that's all.
So how does ArrayList work internally? With arrays, hence the name. By default, ArrayList starts with room for 10 elements and tracks how many slots are actually 'used' (initially, ArrayLists are empty, so 0 slots used). As you call .add, it just fills the array at the current 'used' slot and then increments 'used' by 1. Of course, once you try to add an 11th element that can't be done, so what ArrayList's code does is make a new, larger array (poof! a new treasure chest springs into existence, with 15 blank treasure maps inside, as well as a note with room to write down a single number), copies over the 10 existing maps, and then updates its one map that points at 'the treasure chest with all the maps in it' (it's a lot of maps that lead to treasure chests with maps that lead to more chests, and so on!). This means the old treasure chest (the old array) is still around, but garbage. (No map exists that could get you there). Eventually it'll get collected.
When that array fills up too, the same copy happens again. OpenJDK's ArrayList grows the capacity by about half each time it needs to grow (so, from 10, to 15, to 22, to 33, etcetera).
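The growth sequence can be sketched in a few lines. This assumes the rule "new capacity = old capacity + old capacity / 2" with integer arithmetic, which is what OpenJDK's ArrayList uses as far as I know; other implementations may differ. GrowthSketch and capacities are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class GrowthSketch {
    // Assumed growth rule: new = old + old / 2, using integer arithmetic.
    static List<Integer> capacities(int initial, int steps) {
        List<Integer> result = new ArrayList<>();
        int capacity = initial;
        for (int i = 0; i < steps; i++) {
            result.add(capacity);
            capacity = capacity + (capacity >> 1);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(capacities(10, 5)); // [10, 15, 22, 33, 49]
    }
}
```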

Related

How does java handle object references while dealing with ArrayLists? [duplicate]

This question already has answers here:
Modify list has an affect on another list in java
(2 answers)
Closed 1 year ago.
While writing a piece of code, I observed an unusual behaviour.
There is a class object obj1 which has an array list of another class object obj2 called as list1. See the code for reference:
PriorityQueue<Obj2> pq = new PriorityQueue<>(Comparator);
pq.addAll(new ArrayList<>(obj1.getList()));
Obj2 obj2 = pq.poll();
obj2.setField("any value");
System.out.println(obj2);
System.out.println(obj1.getList().get(0));
Both of the println statements above print the changed value.
Why is this happening? I changed the field through the obj2 reference taken from pq, not through obj1 itself.
If we didn't use new ArrayList<>() while adding elements to pq, it would be understandable that both references point to the same object. But I created a new ArrayList to add to pq, and this still happens.
How does java handle object references while dealing with ArrayLists?
It's all references. Everything. All the way down.
Except primitives. The primitive types are int, long, double, float, byte, short, boolean, and char. That's it. The list is hardcoded and you can't make new primitive types.
So, aside from those, it is all pointers. When you write:
MyFoo foo = new MyFoo();
That's just syntax sugar for 2 separate statements:
MyFoo foo;
foo = new MyFoo();
Imagine the heap is a gigantic whiteboard.
So what's happening here is:
MyFoo foo; This makes a little postit for yourself. This postit is named 'foo', it is yours, you can't hand it to anybody else ('local variable' - hence the name 'local'), and it has just enough room to write a coordinate for that gigantic whiteboard, that's all it can hold. It is blank, for now.
new MyFoo() this goes to the whiteboard, finds some blank space on it anywhere, and writes a box, and then in that box, room for all the fields of your MyFoo class. If any fields are non-primitive, it's just enough room to write coordinates. (Each and every object is its own little box on this whiteboard and could be anywhere on it).
The expression new MyFoo() resolves to the coordinates of where you made that box. You then assign this to foo, so, copy down the location of that box on your little postit.
If you then do:
someMethod(foo);
What that does is: Grabs a new postit, copies those coordinates over to the new postit, and then hands the postit off to someMethod. Specifically:
Even if someMethod changes foo directly (foo =), that is: "They scratch out what was on the postit you gave them and write something new on it", which obviously has no effect whatsoever on your postit.
Once that method is done, they burn the postits. You never get them back. Which is fine, you gave them a copy.
If they FOLLOW the coordinates on that postit and take out their pen and edit the whiteboard, and then later on you follow YOUR postit, you will observe what they changed! . and [] are the dereference operators. That's java-ese for: "Take those coordinates, go over to the whiteboard and find the box, and now we do something to the stuff in the box", whereas = is "edit the postit, scratching out what was there and writing something else in".
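The difference between scratching out the postit and editing the whiteboard can be shown with a small sketch. PostItDemo, reassign, and mutate are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class PostItDemo {
    // Reassigning the parameter only changes the callee's own copy of the reference.
    static void reassign(List<String> list) {
        list = new ArrayList<>();
        list.add("lost");
    }

    // Following the reference and mutating the object is visible to the caller.
    static void mutate(List<String> list) {
        list.add("seen");
    }

    public static void main(String[] args) {
        List<String> foo = new ArrayList<>();
        reassign(foo);
        System.out.println(foo); // []
        mutate(foo);
        System.out.println(foo); // [seen]
    }
}
```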
With all that context:
obj1.getList() gets you the coordinates to the list object. This list object is simply a big sack of coordinates - of postits. NOT a list of Obj2s! A list of Obj2 references - of coordinates.
new ArrayList<>(that) makes a new arraylist (new box on the whiteboard); that constructor will dutifully copy each and every COORDINATE over. It does not copy each object. It can't, java has no idea how to copy arbitrary objects, and Lists can hold anything, so it doesn't know how.
You then 'poll' the top coordinate from the priority queue, which received its coordinates from this newly created list. It is the same coordinate as one that obj1.getList() holds.
You then go to the whiteboard, following this coordinate (obj2.setField - I see a dot, so, that's 'follow the coords and get out your whiteboard pen'), and modify what is there.
Hopefully that clears up how it works. Keep thinking of that whiteboard when reasoning about this stuff.
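Here is a minimal reproduction of the situation in the question. Item is a hypothetical stand-in for Obj2, and the comparator is an assumption, since the question does not show its own.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class SharedElementDemo {
    // Hypothetical stand-in for the question's Obj2.
    static final class Item {
        String field;

        Item(String field) {
            this.field = field;
        }
    }

    // Returns true if the polled element is the very same object the original list holds
    // and the change made through it is visible through the original list.
    static boolean polledElementIsShared() {
        List<Item> original = new ArrayList<>();
        original.add(new Item("a"));
        original.add(new Item("b"));

        PriorityQueue<Item> pq = new PriorityQueue<>(Comparator.comparing((Item i) -> i.field));
        pq.addAll(new ArrayList<>(original)); // copies the references, not the Items

        Item polled = pq.poll();              // smallest element, i.e. the "a" item
        polled.field = "any value";

        return polled == original.get(0) && original.get(0).field.equals("any value");
    }

    public static void main(String[] args) {
        System.out.println(polledElementIsShared()); // true
    }
}
```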
Solutions
The simplest solution is to adopt immutables as much as is reasonable. An immutable object is, effectively, the notion of writing the object in permanent marker. A string is immutable. it has no set or add methods at all. For example, str.toLowerCase() does not lowercase the string that str is pointing at. That method makes a new string instead. It's the equivalent of going to the whiteboard which has "hEllo!" written on it someplace, and then instead of wiping out the E and writing an e in its place (that'd be mutating, and no method in string lets you do this), toLowerCase() just draws a new little box on the whiteboard somewhere and copies the characters over, lowercasing them on the fly. The toLowerCase() call then returns the location of this new box.
If you apply the same ideas to your Obj2 class, this problem goes away. So, don't call .setField, call .withField (which makes a clone but with that one field changed) or some such.
If that's not an option, you'd have to deep-clone the list, yes. This is incredibly annoying, because how deep does deep-clone mean? ArrayList itself can't simply deep-clone, you'd have to write it yourself. Something like:
List<Obj2> clone = new ArrayList<>();
for (Obj2 o : original) clone.add(new Obj2(o));
And you'd have to write the Obj2(Obj2 original) {} constructor yourself, copying each field. And, of course, for each non-primitive field pointing at a mutable object, you'd have to clone that too.
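Both options can be sketched together. Item is a hypothetical element class with a single String field; a real class would need every mutable field copied the same way.

```java
import java.util.ArrayList;
import java.util.List;

final class DeepCopyDemo {
    // Hypothetical element class with one field, for illustration only.
    static final class Item {
        private String field;

        Item(String field) {
            this.field = field;
        }

        // Copy constructor: String is immutable, so copying the reference is enough here.
        Item(Item original) {
            this.field = original.field;
        }

        String getField() {
            return field;
        }

        void setField(String field) {
            this.field = field;
        }

        // Immutable-style alternative: returns a changed copy and leaves this object alone.
        Item withField(String newField) {
            return new Item(newField);
        }
    }

    // Builds a new list whose elements are copies, not shared references.
    static List<Item> deepCopy(List<Item> original) {
        List<Item> clone = new ArrayList<>(original.size());
        for (Item item : original) {
            clone.add(new Item(item));
        }
        return clone;
    }

    public static void main(String[] args) {
        List<Item> original = new ArrayList<>();
        original.add(new Item("old"));
        List<Item> clone = deepCopy(original);
        clone.get(0).setField("new");
        System.out.println(original.get(0).getField()); // old
    }
}
```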
The JavaDoc for ArrayList(Collection<? extends E> c) says:
Constructs a list containing the elements of the specified collection, in the order they are returned by the collection's iterator.
It doesn't say "copy".
There's also no built-in "deep copy" mechanism in Java. Even if you use Object.clone (which in general you shouldn't), you will only copy the references inside this object. The references themselves will still point to the original contents.
For example:
class Obj {
String a;
int b;
OtherObj c;
}
In memory this will look like this:
[Reference To String a] [Value of int b] [Reference to OtherObj c]
(only primitive types will be stored directly inside an object; everything else is a class instance and will be stored as a reference. Even the primitive wrappers like Integer are classes and will be stored as references, but those wrappers are immutable, so you can't change their inner values)
So if you create a copy of this object, you will get a new memory location for the copy, but that memory location will contain the same data: [Reference To String a] [Value of int b] [Reference to OtherObj c].
The same happens with ArrayList. In memory it looks like this:
[Reference to Element 1] [Reference to Element 2] [Reference to Element 3] ...
And if you copy that list, you'll get a copy of that part in memory. But all the references will still point to the very same objects.
This all may change with the introduction of Project Valhalla and Value types. But that may still take months or years.

Do arrays in Java store data or pointers

I was reading about data locality and want to use it to improve my game engine that I'm writing.
Let's say that I have created five objects at different times that are now all in different places in the memory not next to each other. If I add them all to an array, will that array only hold pointers to those objects and they will stay in the same place in the memory or will adding them all to an array rearrange them and make them contiguous.
I ask this because I thought that using arrays would be a good way to make them contiguous, but I don't know if an array will fix my problem!
tl;dr
Manipulating an array of references to objects has no effect on the objects, and has no effect on the objects’ location in memory.
Objects
An array of objects is really an array of references (pointers) to objects. A pointer is an address to another location in memory.
We speak of the array as holding objects, but that is not technically accurate. Because Java does not expose pointers themselves to us as programmers, we are generally unaware of their presence. When we access an element in the array, we are actually retrieving a pointer, but Java immediately follows that pointer to locate the object elsewhere in memory.
This automatic look-up, following the pointer to the object, makes the array of pointers feel like an array of objects. The Java programmer thinks of her array as holding her objects when in reality the objects are a hop-skip-and-a-jump away.
Arrays in Java are implemented as contiguous blocks of memory. For an array of objects, the pointers to those objects are being stored in contiguous memory. But when we access the elements, we are jumping to another location in memory to access the actual object that we want.
A Java array cannot grow in place: its length is fixed when it is created. Adding an element beyond that length means a new, larger array must be built elsewhere in memory, with all the pointers being copied over to the new array and the original array then being discarded.
Such a new-array-and-copy-over is “expensive”. When feasible, we want to avoid this operation. If you know the likely maximum size of your array, specify that size when creating the array. The entire block of contiguous memory is claimed immediately, with each slot holding null until you later assign a pointer to it.
Inserting into the middle of an array is also expensive. Either a new array is built and elements copied over, or all the elements after the insertion point must be moved down into their neighboring position.
None of these operations to the array affect the objects. The objects are floating around in the ether of memory. The objects know nothing of the array. Operations on the array do not affect the objects nor their position in memory. The only relationship is that if the reference held in the array is the last reference still pointing to the object, then when that array element is cleared or deleted, the object becomes a candidate for garbage-collection.
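A short sketch of that point: growing an array produces a different array, but the objects it refers to are untouched. GrowArrayDemo and append are hypothetical names; Arrays.copyOf is the standard library call.

```java
import java.util.Arrays;

final class GrowArrayDemo {
    // Returns a new array one slot longer, holding the same references plus the extra one.
    static Object[] append(Object[] array, Object extra) {
        Object[] grown = Arrays.copyOf(array, array.length + 1);
        grown[array.length] = extra;
        return grown;
    }

    public static void main(String[] args) {
        StringBuilder first = new StringBuilder("first");
        Object[] small = { first };
        Object[] grown = append(small, new StringBuilder("second"));
        System.out.println(grown != small);    // true: a different array
        System.out.println(grown[0] == first); // true: the very same object
    }
}
```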
Primitives
In Java, the eight primitive types (byte, short, int, long, float, double, boolean, and char) are not objects and have no classes of their own; they sit outside the object-oriented part of the language. One advantage is that they are fast and take little memory, compared to objects.
An array of primitives holds the values within the array itself. So these values are stored next to one another, contiguous in memory. No references/pointers. No jumping around in memory.
As for adding or inserting, the same behavior discussed above applies. Except that instead of pointers being shuffled around, the actual primitive values are being shuffled around.
Tips
In business apps, it is generally best to use objects.
That means using the wrapper classes instead of primitives. For example, Integer instead of int. The auto-boxing facility in Java makes this easier by automatically converting between primitive values and their object wrapper.
And preferring objects means using a Collection instead of arrays, usually a List, specifically an ArrayList. Or for immutable use, a List implementation returned from the List.of methods (Java 9 and later).
In contrast to business apps, in extreme situations where speed and memory usage are paramount, such as your game engine, then make the most of arrays and primitives.
In the future, the distinction between objects and primitives may blur if the work done in Project Valhalla comes to fruition.
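For the game-engine case, one common way to get the values themselves next to each other is to keep one primitive array per field instead of an array of objects. The sketch below is only an illustration: Particles and its fields are hypothetical, and any actual speed-up depends on the JVM and the workload, so measure before relying on it.

```java
final class Particles {
    // One primitive array per field: the values themselves are stored in the arrays.
    private final float[] x;
    private final float[] y;
    private final float[] vx;
    private final float[] vy;

    Particles(int count) {
        x = new float[count];
        y = new float[count];
        vx = new float[count];
        vy = new float[count];
    }

    void set(int i, float posX, float posY, float velX, float velY) {
        x[i] = posX;
        y[i] = posY;
        vx[i] = velX;
        vy[i] = velY;
    }

    // Advances every particle by dt, walking each array front to back.
    void step(float dt) {
        for (int i = 0; i < x.length; i++) {
            x[i] += vx[i] * dt;
            y[i] += vy[i] * dt;
        }
    }

    float getX(int i) {
        return x[i];
    }

    float getY(int i) {
        return y[i];
    }
}
```

A particle here is just an index into the arrays, so no per-particle object exists to be scattered across the heap.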
The data or values are stored in the objects, and the values are retrieved through references to those objects. Let me clear up one more thing: arrays in Java are themselves objects. So there is no doubt that objects store the values, which are accessed through a reference variable for that particular object. Hope that helps.
Java deals with references to objects only. As such, there's no guarantee that the elements of an array will be contiguous in memory.
Edit: Guess this answer wasn't that clear. My bad. I meant that there's no guarantee that the objects themselves will be contiguous, in spite of the fact that the references will be, as 1-D arrays are stored contiguously. Still, Basil Bourque's answer perfectly explains how this works.

Java: Wrapping objects in some type of collection to store duplicates in a set

I want to make a set of some type of collection (not sure which one yet) as a way of "storing duplicates" in a set. For example if I wanted to add the integer 5 with 39 additional copies I could put it into an arraylist at index 39. Thus if I were to get the size of the arraylist, I would know how many copies of 5 existed within the set.
There are a few other ways I could implement this but I have yet to decide on one. The main issue I'm having with implementing this is that I'm not sure how I can "dynamically" make arraylists (or whatever collection I may end up using) so that whenever someone were to call mySet.add(object), the object is first inserted into a unique arraylist then into the set itself.
Can anyone give me some ideas on how I could approach this?
EDIT:
Sorry I should have been more clear in my question. The point of the code that I'm writing is that we have a set-like collection that allows duplicates. And yes some of the associated methods will be re-written/will have to be re-written. Also my code should be written under the assumption that we do not know what type of object is being inserted(only one data type per set though) nor how many instances of the same object will be added nor how many different unique objects will be added.
I would rather go for using a Map like
Map<Object, Integer> counts = new HashMap<>();
where Object is the Object that you want to count and Integer is the count
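The map-of-counts idea could look roughly like this. CountingBag is a hypothetical name, and the sketch covers only adding and counting, not removal or iteration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical set-like collection that remembers how often each element was added.
final class CountingBag<T> {
    private final Map<T, Integer> counts = new HashMap<>();
    private int size = 0; // total number of elements, duplicates included

    void add(T element) {
        // Inserts 1 for a new element, otherwise adds 1 to the existing count.
        counts.merge(element, 1, Integer::sum);
        size++;
    }

    int count(Object element) {
        return counts.getOrDefault(element, 0);
    }

    int size() {
        return size;
    }

    int distinct() {
        return counts.size();
    }
}
```

Because the class is generic, it works for any element type that implements equals and hashCode consistently, which matches the "one data type per set" requirement from the edit.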
You could try Guava's Multiset; I think it's what you want.
It can store the count of each object. All you need to do is:
multiset.add(object);
If the object is added for the first time, a new entry is created for it; otherwise its count is increased by one.

Best way to share Java variables between classes [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I have an odd problem in Java. I can solve it but the solution seems inefficient.
I have a class, which simplified is
class Zot
{
double edges[];
}
The problem is that in many cases I want instance A and instance B to share ONE of the edge instances. In other words, instance A may allocate edges[2] and want instance B to share edges[2]. This is easy enough in that I can just set instance B to point at instance A's edges[2]. But how can I do it efficiently? If I allocate the edges[] in instance A, I can then simply assign B's reference to point to A's array. But often I only want to allocate a single edge (e.g. edges[2]) in A and then assign it in B. But in Java, one cannot (as far as I know) allocate a single member of an array (as one can in C++). Ideally, I only want to allocate the useful member and assign it. I could, of course, allocate all 4 members, assign the ones I need, then set the unwanted members to null and let GC clean it all up, but if there are hundreds or thousands of instances that seems clumsy.
Suggestions?
You can declare and allocate double edges[] outside of both classes, then pass this array as a parameter in the constructor into both of the instances that want to share it.
In Java an array is also an object. When you make an instance like double edges[] = new double[2]; edges will be passed around as a pointer, not as a copy.
This means if you make a change in the array in your class A, then class B will also see this change.
As I understand the question, you appear to want to share an individual element from your edges array between classes, and not share the whole array itself.
If your edges array was an array of Objects then this would be possible, and could make sense. However, since your array is a primitive array then there is no real concept of sharing an individual element.
You can assign an element of your array to equal the element of another array, but subsequent changes to the element in one array will not be reflected in the other array.
You can share the entire array between classes, in which case any changes will be reflected in both arrays (well, there is only one array, so both classes will see the changes to the single array that they both reference).
Most importantly:
When you create an array of primitives in Java, the memory for all of its elements is allocated immediately, so there is no benefit (or mechanism) to allocating only a single element of the array. So with your current data model, there is no reason not to preallocate your arrays, since you cannot save space with them.
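If sharing a single edge really is needed, one option hinted at above is to store object references instead of primitives. In this sketch, Edge is a hypothetical mutable holder, and the array size of 4 is taken from the question; note that each Edge object costs more memory than a bare double.

```java
final class SharedEdgeDemo {
    // Hypothetical mutable holder for a single edge value.
    static final class Edge {
        double value;

        Edge(double value) {
            this.value = value;
        }
    }

    static final class Zot {
        final Edge[] edges = new Edge[4]; // four references, all null initially
    }

    // Returns the value b sees after a changes the shared edge.
    static double valueSeenByB() {
        Zot a = new Zot();
        Zot b = new Zot();
        a.edges[2] = new Edge(1.5); // allocate only the edge that is needed
        b.edges[2] = a.edges[2];    // b now refers to the same Edge object
        a.edges[2].value = 9.0;     // change it through a
        return b.edges[2].value;
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByB()); // 9.0
    }
}
```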

Is there an expandable list of object references in Java?

In Java, we can always use an array to store object reference. Then we have an ArrayList or HashTable which is automatically expandable to store objects. But does anyone know a native way to have an auto-expandable array of object references?
Edit: What I mean is I want to know if the Java API has some class with the ability to store references to objects (but not storing the actual object like XXXList or HashTable do) AND the ability of auto-expansion.
Java arrays are, by their definition, fixed size. If you need auto-growth, you use XXXList classes.
EDIT - question has been clarified a bit
When I was first starting to learn Java (coming from a C and C++ background), this was probably one of the first things that tripped me up. Hopefully I can shed some light.
Unlike C++, Object arrays in Java do not store objects. They store object references.
In C++, if you declared something similar to:
String myStrings[10];
You would get 10 String objects. At this point, it would be perfectly legal to do something like println(myStrings[5].length); - you'd get '0' - the default constructor for String creates an empty string with length 0.
In Java, when you construct a new array, you get an empty container that can hold 10 String references. So the call:
String[] myStrings = new String[10];
System.out.println(myStrings[5].length());
would throw a null pointer exception, because you haven't actually placed a String reference into the array yet.
If you are coming from a C++ background, think of new String[10] in Java as being equivalent to new String*[10] in C++.
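The behavior described above is easy to check. NullSlotsDemo and freshSlotThrows are hypothetical names for this sketch.

```java
final class NullSlotsDemo {
    // Returns true if calling length() on a fresh array slot throws NullPointerException.
    static boolean freshSlotThrows() {
        String[] myStrings = new String[10]; // ten null references, no String objects yet
        try {
            myStrings[5].length();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] myStrings = new String[10];
        System.out.println(myStrings[5]);          // null
        myStrings[5] = "";                         // now the slot refers to an empty String
        System.out.println(myStrings[5].length()); // 0
        System.out.println(freshSlotThrows());     // true
    }
}
```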
So, with that in mind, it should be fairly clear why ArrayList is the solution for an auto expanding array of objects (and in fact, ArrayList is implemented using simple arrays, with a growth algorithm built in that allocates new expanded arrays as needed and copies the content from the old to the new).
In practice, there are actually relatively few situations where we use arrays. If you are writing a container (something akin to ArrayList, or a BTree), then they are useful, or if you are doing a lot of low level byte manipulation - but at the level that most development occurs, using one of the Collections classes is by far the preferred technique.
The standard resizable classes implementing Collection (ArrayList, HashSet, and so on) store only references: you don't store objects, you create them on the heap and only manipulate references to them, until no reference to them remains and they become eligible for garbage collection.
You can put a reference to an object in two or more Collections. That's how you can have sorted hash tables and such...
What do you mean by "native" way? If you want an expandable list of objects then you can use the ArrayList. With List collections you have the get(index) method that allows you to access objects in the list by index, which gives you similar functionality to an array. Internally the ArrayList is implemented with an array and the ArrayList handles expanding it automatically for you.
Straight from the Array Java Tutorials on the sun webpage:
-> An array is a container object that holds a fixed number of values of a single type.
Because the size of an array is fixed when it is created, there is actually no way to expand it afterwards. The whole purpose of creating an array of a certain size is to only allocate as much memory as will likely be used when the program is executed. What you could do is create a second array whose size is based on the size of the original, copy all of the original elements into it, and then add the necessary new elements (although this isn't very 'automatic' :) ). Otherwise, as you and a few others have mentioned, using one of the List collections is the most efficient way to go.
In Java, all object variables are references. So
Foo myFoo = new Foo();
Foo anotherFoo = myFoo;
means that both variables are referring to the same object, not to two separate copies. Likewise, when you put an object in a Collection, you are only storing a reference to the object. Therefore using ArrayList or similar is the correct way to have an automatically expanding piece of storage.
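A small sketch of both statements. Foo here is a hypothetical class with one int field, and AliasDemo is just a wrapper for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class AliasDemo {
    // Hypothetical class with a single mutable field.
    static final class Foo {
        int value;
    }

    // Changes the object through one variable and reads it back through the list.
    static int valueSeenThroughList() {
        Foo myFoo = new Foo();
        Foo anotherFoo = myFoo;        // copies the reference, not the object
        List<Foo> foos = new ArrayList<>();
        foos.add(myFoo);               // the list stores the same reference

        anotherFoo.value = 42;
        return foos.get(0).value;
    }

    public static void main(String[] args) {
        System.out.println(valueSeenThroughList()); // 42
    }
}
```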
There's no first-class language construct that does that that I'm aware of, if that's what you're looking for.
It's not very efficient, but if you're just appending to an array, you can use Apache Commons ArrayUtils.add(). It returns a copy of the original array with the additional element in it.
If you can write your code in JavaScript, then yes, you can do that. JavaScript arrays are sparse, so an array will expand in whichever way you want.
You can write:
a[0] = 4;
a[1000] = 434;
a[888] = "a string";
