Calculating memory usage of a B-Tree in Java - java

I've implemented a simple B-Tree which maps longs to ints. Now I want to estimate its memory usage with the following method (this applies to a 32-bit JVM only):
class BTreeEntry {
    int entrySize;
    long keys[];
    int values[];
    BTreeEntry children[];
    boolean isLeaf;
    ...

    /** @return used bytes */
    long capacity() {
        long cap = keys.length * (8 + 4) + 3 * 12 + 4 + 1;
        if (!isLeaf) {
            cap += children.length * 4;
            for (int i = 0; i < children.length; i++) {
                if (children[i] != null)
                    cap += children[i].capacity();
            }
        }
        return cap;
    }
}

/** @return memory usage in MB */
public int memoryUsage() {
    return Math.round(rootEntry.capacity() / (1 << 20));
}
But when I tried it, e.g. for 7 million entries, the memoryUsage method reported much higher values than the -Xmx setting would allow! E.g. it says 1040 (MB) while I set -Xmx300! Is the JVM somehow able to optimize the memory layout, e.g. for empty arrays, or what could be my mistake?
Update 1: OK, introducing the isLeaf boolean reduces memory usage a lot, but it is still unclear why I observed values higher than -Xmx. (You can still reproduce this by using isLeaf == false in all constructors.)
Update 2: Hmm, something is very wrong. When increasing the entries per leaf, one would assume that memory usage decreases (when compacting both), because larger arrays involve less reference overhead (and the B-Tree has a smaller height). But the memoryUsage method reports an increased value if I use 500 instead of 100 entries per leaf.

Ohh sh... a bit of fresh air solved this issue ;)
When an entry is full it gets split. In my original split method checkSplitEntry (where I wanted to avoid wasting memory) I made a big memory-wasting mistake:
// left child: just copy pointer and decrease size to index
BTreeEntry newLeftChild = this;
newLeftChild.entrySize = splitIndex;
The problem here is that the old children pointers are still reachable, so in my memoryUsage method I was counting some children twice (especially when I did not compact!). Without this trick everything is fine, and my B-Tree is even more memory efficient because the garbage collector can do its work!
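For illustration only, a non-aliasing split could copy the relevant ranges into a fresh entry instead of reusing this; the sketch below is based on the fields shown above, and the surrounding checkSplitEntry logic is assumed:
// Build the left child from copies so the old, larger arrays (and the children
// beyond splitIndex) become unreachable and can be garbage collected.
BTreeEntry newLeftChild = new BTreeEntry();
newLeftChild.entrySize = splitIndex;
newLeftChild.isLeaf = isLeaf;
newLeftChild.keys = java.util.Arrays.copyOf(keys, splitIndex);
newLeftChild.values = java.util.Arrays.copyOf(values, splitIndex);
if (!isLeaf)
    newLeftChild.children = java.util.Arrays.copyOf(children, splitIndex + 1);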

Related

Java program slows down abruptly when indexing corpus for k-grams

I have a problem which is puzzling me. I'm indexing a corpus of 17,000 text files, and while doing this, I'm also storing all the k-grams (k-character parts of words) for each word in a HashMap to be used later:
public void insert( String token ) {
//For example, car should result in "^c", "ca", "ar" and "r$" for a 2-gram index
// Check if token has already been seen. if it has, all the
// k-grams for it have already been added.
if (term2id.get(token) != null) {
return;
}
id2term.put(++lastTermID, token);
term2id.put(token, lastTermID);
// Is the word long enough? For example, "a" can be bigrammed and trigrammed but not four-grammed.
// K must be <= token.length() + 2; for "ab", K must be <= 4.
List<KGramPostingsEntry> postings = null;
if(K > token.length() + 2) {
return;
}else if(K == token.length() + 2) {
// insert the one K-gram "^<String token>$" into index
String kgram = "^"+token+"$";
postings = index.get(kgram);
SortedSet<String> kgrams = new TreeSet<String>();
kgrams.add(kgram);
term2KGrams.put(token, kgrams);
if (postings == null) {
KGramPostingsEntry newEntry = new KGramPostingsEntry(lastTermID);
ArrayList<KGramPostingsEntry> newList = new ArrayList<KGramPostingsEntry>();
newList.add(newEntry);
index.put("^"+token+"$", newList);
}
// No need to do anything if the posting already exists, so no else clause. There is only one possible term in this case
// Return since we are done
return;
}else {
// We get here if there is more than one k-gram in our term
// insert all k-grams in token into index
int start = 0;
int end = start+K;
//add ^ and $ to token.
String wrappedToken = "^"+token+"$";
int noOfKGrams = wrappedToken.length() - end + 1;
// get K-Grams
String kGram;
int startCurr, endCurr;
SortedSet<String> kgrams = new TreeSet<String>();
for (int i=0; i<noOfKGrams; i++) {
startCurr = start + i;
endCurr = end + i;
kGram = wrappedToken.substring(startCurr, endCurr);
kgrams.add(kGram);
postings = index.get(kGram);
KGramPostingsEntry newEntry = new KGramPostingsEntry(lastTermID);
// if this k-gram has been seen before
if (postings != null) {
// Add this token to the existing postingsList.
// We can be sure that the list doesn't contain the token
// already, else we would previously have terminated the
// execution of this function.
int lastTermInPostings = postings.get(postings.size()-1).tokenID;
if (lastTermID == lastTermInPostings) {
continue;
}
postings.add(newEntry);
index.put(kGram, postings);
}
// if this k-gram has not been seen before
else {
ArrayList<KGramPostingsEntry> newList = new ArrayList<KGramPostingsEntry>();
newList.add(newEntry);
index.put(kGram, newList);
}
}
Clock c = Clock.systemDefaultZone();
long timestart = c.millis();
System.out.println(token);
term2KGrams.put(token, kgrams);
long timestop = c.millis();
System.out.printf("time taken to put: %d\n", timestop-timestart);
System.out.print("put ");
System.out.println(kgrams);
System.out.println();
}
}
The insertion into the HashMap happens on the lines term2KGrams.put(token, kgrams); (there are two of them in the code snippet). When indexing, everything works fine until, at around 15,000 indexed files, things suddenly go bad. Everything slows down immensely, and the program doesn't finish in a reasonable time, if at all.
To try to understand this problem, I've added some prints at the end of the function. This is the output they generate:
http://soccer.org
time taken to put: 0
put [.or, //s, /so, ://, ^ht, cce, cer, er., htt, occ, org, p:/, r.o, rg$, soc, tp:, ttp]
aysos
time taken to put: 0
put [^ay, ays, os$, sos, yso]
http://www.davisayso.org/contacts.htm
time taken to put: 0
put [.da, .ht, .or, //w, /co, /ww, ://, ^ht, act, avi, ays, con, cts, dav, g/c, htm, htt, isa, nta, o.o, ont, org, p:/, rg/, s.h, say, so., tac, tm$, tp:, ts., ttp, vis, w.d, ww., www, yso]
playsoccer
time taken to put: 0
put [^pl, ays, cce, cer, er$, lay, occ, pla, soc, yso]
This looks fine to me, the putting doesn't seem to be taking long time and the k-grams (in this case trigrams) are correct.
But one can see strange behaviour in the pace at which my computer prints this information. In the beginning, everything prints at very high speed. But at 15,000 files, that speed stops, and instead my computer starts printing a few lines at a time, which of course means that indexing the other 2,000 files of the corpus will take an eternity.
Another interesting thing I observed: when I did a keyboard interrupt (Ctrl+C) after it had been printing erratically and slowly as described for a while, it gave me this message:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.base/java.lang.StringLatin1.newString(StringLatin1.java:549)
sahandzarrinkoub@Sahands-MBP:~/Documents/Programming/Information Retrieval/lab3 2$ sh compile_all.sh
Note: ir/PersistentHashedIndex.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Does this mean I'm out of memory? Is that the issue? If so, that's surprising, because I've been storing quite a lot of things in memory already, such as a HashMap containing the document IDs of every single word in the corpus, and a HashMap containing, for every single k-gram, every word it appears in, etc.
Please let me know what you think and what I can do to fix this problem.
To understand this, you must first understand that Java does not grow its memory indefinitely. The JVM is by default configured to start with a minimum heap size and a maximum heap size. When an allocation would exceed the maximum heap size, you get an OutOfMemoryError.
You can change the minimum and maximum heap size for your execution with the vm parameters -Xms and -Xmx respectively. An example for an execution with at least 2, but at most 4 GB would be
java -Xms2g -Xmx4g ...
You can find more options on the man page for java.
Before changing the heap memory, however, take a close look at your system resources, especially whether your system starts swapping. If your system swaps, a larger heap size may let the program run longer, but with equally bad performance. The only thing possible then would be to optimize your program in order to use less memory or to upgrade the RAM of your machine.
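If it is unclear whether the -Xmx value is actually being applied, a quick sanity check (not from the original answer) is to print the heap limits from inside the program:
// Prints the JVM's heap ceiling (roughly the -Xmx value) and the current usage.
long maxHeap = Runtime.getRuntime().maxMemory();
long usedHeap = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
System.out.printf("max heap: %d MB, used: %d MB%n",
        maxHeap / (1024 * 1024), usedHeap / (1024 * 1024));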

Why does ArrayList seriously outperform LinkedList? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 years ago.
I did some research and wrote the following article: http://www.heavyweightsoftware.com/blog/linkedlist-vs-arraylist/ and wanted to post a question here.
class ListPerformanceSpec extends Specification {
def "Throwaway"() {
given: "A Linked List"
List<Integer> list
List<Integer> results = new LinkedList<>()
when: "Adding numbers"
Random random = new Random()
//test each list 100 times
for (int ix = 0; ix < 100; ++ix) {
list = new LinkedList<>()
LocalDateTime start = LocalDateTime.now()
for (int jx = 0; jx < 100000; ++jx) {
list.add(random.nextInt())
}
LocalDateTime end = LocalDateTime.now()
long diff = start.until(end, ChronoUnit.MILLIS)
results.add(diff)
}
then: "Should be equal"
true
}
def "Linked list"() {
given: "A Linked List"
List<Integer> list
List<Integer> results = new LinkedList<>()
when: "Adding numbers"
Random random = new Random()
//test each list 100 times
for (int ix = 0; ix < 100; ++ix) {
list = new LinkedList<>()
LocalDateTime start = LocalDateTime.now()
for (int jx = 0; jx < 100000; ++jx) {
list.add(random.nextInt())
}
long total = 0
for (int jx = 0; jx < 10000; ++jx) {
for (Integer num : list) {
total += num
}
total = 0
}
LocalDateTime end = LocalDateTime.now()
long diff = start.until(end, ChronoUnit.MILLIS)
results.add(diff)
}
then: "Should be equal"
System.out.println("Linked list:" + results.toString())
true
}
def "Array list"() {
given: "A Linked List"
List<Integer> list
List<Integer> results = new LinkedList<>()
when: "Adding numbers"
Random random = new Random()
//test each list 100 times
for (int ix = 0; ix < 100; ++ix) {
list = new ArrayList<>()
LocalDateTime start = LocalDateTime.now()
for (int jx = 0; jx < 100000; ++jx) {
list.add(random.nextInt())
}
long total = 0
for (int jx = 0; jx < 10000; ++jx) {
for (Integer num : list) {
total += num
}
total = 0
}
LocalDateTime end = LocalDateTime.now()
long diff = start.until(end, ChronoUnit.MILLIS)
results.add(diff)
}
then: "Should be equal"
System.out.println("Array list:" + results.toString())
true
}
}
Why does ArrayList outperform LinkedList by 28% for sequential access when LinkedList should be faster?
My question is different from When to use LinkedList over ArrayList? because I'm not asking when to choose it, but why it's faster.
Array-based lists, such as Java's ArrayList, use much less memory for the same amount of data than link-based lists (LinkedList), and this memory is organized sequentially. This essentially decreases CPU cache thrashing by unrelated data. Since access to RAM has 10-20 times higher latency than L1/L2 cache access, this causes a significant time difference.
You can read more about these cache issues in books like this one, or similar resources.
On the other hand, link-based lists outperform array-based ones in operations like inserting into or deleting from the middle of the list.
For a solution that has both memory economy (and thus fast iteration) and fast insertion/deletion, look at combined approaches, such as in-memory B⁺-trees, or arrays of array lists with proportionally increasing sizes.
From LinkedList source code:
/**
 * Appends the specified element to the end of this list.
 *
 * <p>This method is equivalent to {@link #addLast}.
 *
 * @param e element to be appended to this list
 * @return {@code true} (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    linkLast(e);
    return true;
}

/**
 * Links e as last element.
 */
void linkLast(E e) {
    final Node<E> l = last;
    final Node<E> newNode = new Node<>(l, e, null);
    last = newNode;
    if (l == null)
        first = newNode;
    else
        l.next = newNode;
    size++;
    modCount++;
}
From ArrayList source code:
/**
 * Appends the specified element to the end of this list.
 *
 * @param e element to be appended to this list
 * @return <tt>true</tt> (as specified by {@link Collection#add})
 */
public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}
So a linked list has to create a new node for each element added, while an array list does not. ArrayList does not reallocate/resize for each new element, so most of the time the array list simply sets the object in the array and increments the size, while the linked list does much more work.
You also commented:
When I wrote a linked list in college, I allocated blocks at a time and then farmed them out.
I do not think this would work in Java. You cannot do pointer tricks in Java, so you would have to allocate a lot of small arrays, or create empty nodes ahead of time. In both cases the overhead would probably be a bit higher.
Why does ArrayList outperform LinkedList by 28% for sequential access when LinkedList should be faster?
You're assuming that, but don't provide anything to back it up. But it's not really a great surprise. An ArrayList has an array as the underlying data store. Accessing this sequentially is extremely fast, because you know exactly where every element is going to be. The only slowdown comes when the array grows beyond a certain size and needs to be expanded, but that can be optimised.
The real answer would probably be: check the Java source code, and compare the implementations of ArrayList and LinkedList.
One explanation is that your base assumption (that multiplication is slower than memory fetches) is questionable.
Based on this document, an AMD Bulldozer takes 1 clock cycle to perform a 64-bit integer multiply instruction (register × register) with 6 cycles of latency1. By contrast, a memory-to-register load takes 1 clock cycle with 4 cycles of latency. But that assumes that you get a cache hit for the memory fetch. If you get a cache miss, you need to add a number of cycles. (20 clock cycles for an L2 cache miss, according to this source.)
Now that is just one architecture, and others will vary. And we also need to consider other issues, like constraints on the number of multiplications that can be overlapped, and how well the compiler can organize the instructions to minimize instruction dependencies. But the fact remains that for a typical modern pipelined chip architecture, the CPU can execute integer multiplies as fast as it can execute memory-to-register moves, and much faster if there are more cache misses in the memory fetches.
Your benchmark is using lists with 100,000 Integer elements. When you look at the amount of memory involved, and the relative locality of the heap nodes that represent the lists and the elements, the linked list case will use significantly more memory, and have correspondingly worse memory locality. That will lead to more cache misses per cycle of the inner loop, and worse performance.
Your benchmark results are not surprising2 to me.
The other thing to note is that if you use Java's LinkedList, a separate heap node is used to represent each list element. You can implement your own linked lists more efficiently if your element class has its own next field that can be used to chain the elements. However, that brings its own limitations; e.g. an element can only be in one list at a time.
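For illustration only, a minimal sketch of such an intrusive list might look like this; the class and field names are invented, and the trade-off is exactly the one noted above (an element can live in only one list at a time):
// Sketch of an "intrusive" singly linked list: the element itself carries the
// next reference, so no separate Node objects are allocated per add.
class Measurement {
    int value;
    Measurement next; // chains this element into exactly one list
}

class MeasurementList {
    private Measurement head, tail;

    void add(Measurement m) {
        if (head == null) head = m;
        else tail.next = m;
        tail = m; // no extra node allocation
    }
}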
Finally, as @maaartinus points out, a full IMUL is not required in the case of a Java ArrayList. When reading or writing the ArrayList's array, the indexing multiplication will be either ×4 or ×8, and that can be performed by a MOV with one of the standard addressing modes; e.g.
MOV EAX, [EDX + EBX*4 + 8]
This multiplication can be done (at the hardware level) by shifting, with much less latency than a 64-bit IMUL.
1 - In this context, the latency is the number of cycles delay before the result of the instruction is available ... to the next instruction that depends on it. The trick is to order the instructions so that other work is done during the delay.
2 - If anything, I am surprised that LinkedList appears to be doing so well. Maybe calling Random.nextInt() and autoboxing the result is dominating the loop times?

Which data structures to use when storing multiple entities with multiple query criteria?

There is a storage unit with a capacity of N items. Initially this unit is empty.
The space is arranged in a linear manner, i.e. one beside the other in a line.
Each storage space has a number, increasing till N.
When someone drops off their package, it is assigned the first available space. Packages can also be picked up; in that case the space becomes vacant.
Example: if the total capacity is 4 and spaces 1 and 2 are full, the third person to come in will be assigned space 3. If 1, 2 and 3 were full and the 2nd space becomes vacant, the next person to come will be assigned space 2.
The packages they drop have 2 unique properties assigned for immediate identification. First, they are color-coded based on their content, and second, they are assigned a unique identification number (UIN).
What we want is to query the system:
When the input is color, show all the UIN associated with this color.
When the input is color, show all the numbers where these packages are placed (storage space number).
Show where an item with a given UIN is placed, i.e. storage space number.
I would like to know which data structures to use for this case, so that the system works as efficiently as possible.
I am not told which of these operations is most frequent, which means I will have to optimise for all the cases.
Please note: even though the queries do not directly ask for the storage space number, when an item is removed from the store it is removed by its storage space number.
You have mentioned three queries that you want to make. Let's handle them one by one.
I cannot think of a single data structure that can help you with all three queries at the same time. So I'm going to give an answer that uses three data structures, and you will have to keep all three structures' state in sync to keep the application running properly. Consider that the cost of getting respectably fast performance from your application for the desired functionality.
When the input is color, show all the UIN associated with this color.
Use a HashMap that maps Color to a Set of UIN. Whenever an item:
is added - see if the color is present in the HashMap. If yes, add this UIN to the set; otherwise, create a new entry with a new set and then add the UIN.
is removed - Find the set for this color and remove this UIN from the set. If the set is now empty, you may remove this entry altogether.
When the input is color, show all the numbers where these packages are placed.
Maintain a HashMap that maps UIN to the number where an incoming package is placed. From the HashMap that we created in the previous case, you can get the list of all UINs associated with the given Color. Then using this HashMap you can get the number for each UIN which is present in the set for that Color.
So now, when a package is to be added, you will have to add the entry to the previous HashMap in the specific Color bucket and to this HashMap as well. On removing, you will have to remove the entry from here too.
Finally,
Show where an item with a given UIN is placed.
If you have done the previous, you already have the HashMap mapping UINs to numbers. This problem is only a sub-problem of the previous one.
The third DS, as I mentioned at the top, will be a min-heap of ints. The heap will be initialized with the first N integers at the start. Then, as packages come in, the heap will be polled; the number returned represents the storage space where this package is to be put. If the storage unit is full, the heap will be empty. Whenever a package is removed, its number is added back to the heap. Since it is a min-heap, the minimum number bubbles up to the top, satisfying your case that when 4 and 2 are empty, the next space to be filled will be 2. (A compact sketch of all three structures appears after the complexity analysis below.)
Let's do a Big O analysis of this solution for completion.
Time for initialization: O(N), because we have to initialize a heap of N elements. The other two HashMaps are empty to begin with and therefore incur no time cost.
Time for adding a package: includes getting a number and then making the appropriate entries in the HashMaps. Getting a number from the heap takes at most O(log N); adding entries to the HashMaps is O(1). Hence a worst-case overall time of O(log N).
Time for removing a package: also O(log N) at worst, because removal from the HashMaps is only O(1), while adding the freed number back to the min-heap is upper-bounded by O(log N).
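For illustration only, a compact sketch of the three structures described above could look like this; class, method, and variable names are invented, and error handling (e.g. a full unit) is omitted:
import java.util.*;

class StorageUnit {
    private final PriorityQueue<Integer> freeSlots = new PriorityQueue<>(); // min-heap of spaces
    private final Map<String, Set<Long>> colorToUins = new HashMap<>();     // query 1
    private final Map<Long, Integer> uinToSlot = new HashMap<>();           // queries 2 and 3

    StorageUnit(int capacity) {
        for (int i = 1; i <= capacity; i++) freeSlots.add(i);
    }

    int add(long uin, String color) {
        int slot = freeSlots.poll();                                        // O(log N)
        colorToUins.computeIfAbsent(color, c -> new HashSet<>()).add(uin);  // O(1)
        uinToSlot.put(uin, slot);                                           // O(1)
        return slot;
    }

    void remove(long uin, String color) {
        Integer slot = uinToSlot.remove(uin);                               // O(1)
        Set<Long> uins = colorToUins.get(color);
        if (uins != null) {
            uins.remove(uin);
            if (uins.isEmpty()) colorToUins.remove(color);
        }
        if (slot != null) freeSlots.add(slot);                              // O(log N)
    }
}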
This smells of homework or really bad management.
Either way, I have decided to do a version of this where you care most about query speed but don't care about memory or a little extra overhead to inserts and deletes. That's not to say that I think that I'm going to be burning memory like crazy or taking forever to insert and delete, just that I'm focusing most on queries.
Tl;DR - to solve your problem, I use a PriorityQueue, an Array, a HashMap, and an ArrayListMultimap (from guava, a common external library), each one to solve a different problem.
The following section is working code that walks through a few simple inserts, queries, and deletes. This next bit isn't actually Java, since I chopped out most of the imports, class declaration, etc. Also, it references another class called 'Packg'. That's just a simple data structure which you should be able to figure out just from the calls made to it.
Explanation is below the code
import com.google.common.collect.ArrayListMultimap;
private PriorityQueue<Integer> openSlots;
private Packg[] currentPackages;
Map<Long, Packg> currentPackageMap;
private ArrayListMultimap<String, Packg> currentColorMap;
private Object $outsideCall;
public CrazyDataStructure(int howManyPackagesPossible) {
$outsideCall = new Object();
this.currentPackages = new Packg[howManyPackagesPossible];
openSlots = new PriorityQueue<>();
IntStream.range(0, howManyPackagesPossible).forEach(i -> openSlots.add(i));//populate the open slots priority queue
currentPackageMap = new HashMap<>();
currentColorMap = ArrayListMultimap.create();
}
/*
* args[0] = integer, maximum # of packages
*/
public static void main(String[] args)
{
int howManyPackagesPossible = Integer.parseInt(args[0]);
CrazyDataStructure cds = new CrazyDataStructure(howManyPackagesPossible);
cds.addPackage(new Packg(12345, "blue"));
cds.addPackage(new Packg(12346, "yellow"));
cds.addPackage(new Packg(12347, "orange"));
cds.addPackage(new Packg(12348, "blue"));
System.out.println(cds.getSlotsForColor("blue"));//should be a list of {0,3}
System.out.println(cds.getSlotForUIN(12346));//should be 1 (0-indexed, remember)
System.out.println(cds.getSlotsForColor("orange"));//should be a list of {2}
System.out.println(cds.removePackage(2));//should be the orange one
cds.addPackage(new Packg(12349, "green"));
System.out.println(cds.getSlotForUIN(12349));//should be 2, since that's open
}
public int addPackage(Packg packg)
{
synchronized($outsideCall)
{
int result = openSlots.poll();
packg.setSlot(result);
currentPackages[result] = packg;
currentPackageMap.put(packg.getUIN(), packg);
currentColorMap.put(packg.getColor(), packg);
return result;
}
}
public Packg removePackage(int slot)
{
synchronized($outsideCall)
{
if(currentPackages[slot] == null)
return null;
else
{
Packg packg = currentPackages[slot];
currentColorMap.remove(packg.getColor(), packg);
currentPackageMap.remove(packg.getUIN());
currentPackages[slot] = null;
openSlots.add(slot);//return slot to priority queue
return packg;
}
}
}
public List<Packg> getUINsForColor(String color)
{
synchronized($outsideCall)
{
return currentColorMap.get(color);
}
}
public List<Integer> getSlotsForColor(String color)
{
synchronized($outsideCall)
{
return currentColorMap.get(color).stream().map(packg -> packg.getSlot()).collect(Collectors.toList());
}
}
public int getSlotForUIN(long uin)
{
synchronized($outsideCall)
{
if(currentPackageMap.containsKey(uin))
return currentPackageMap.get(uin).getSlot();
else
return -1;
}
}
I use 4 different data structures in my class.
PriorityQueue I use the priority queue to keep track of all the open slots. It's O(log n) for inserts and for removals from the head (poll), so that shouldn't be too bad. Memory-wise, it's not particularly efficient, but it's also linear, so that won't be too bad.
Array I use a regular Array to track by slot #. This is linear for memory, and constant for insert and delete. If you needed more flexibility in the number of slots you could have, you might have to switch this out for an ArrayList or something, but then you'd have to find a better way to keep track of 'empty' slots.
HashMap ah, the HashMap, the golden child of big-O complexity. In return for some memory overhead and an annoying capital letter 'M', it's an awesome data structure. Insertions are reasonable, and queries are constant. I use it to map from a UIN to its Packg (and thus its slot).
ArrayListMultimap the only data structure I use that's not plain Java. This one comes from Guava (Google, basically), and it's just a nice little shortcut to writing your own Map of Lists. Also, it plays nicely with nulls, and that's a bonus to me. This one is probably the least efficient of all the data structures, but it's also the one that handles the hardest task, so... can't blame it. It allows us to grab the list of Packg's by color, in constant time relative to the number of slots and in linear time relative to the number of Packg objects it returns.
When you have this many data structures, it makes inserts and deletes a little cumbersome, but those methods should still be pretty straightforward. If some parts of the code don't make sense, I'll be happy to explain more (by adding comments in the code), but I think it should be mostly fine as-is.
Query 3: Use a hash map; the key is the UIN, the value is an object (storage space number, color) (plus any other information about the package). Cost is O(1) to query, insert or delete. Space is O(k), where k is the current number of UINs.
Queries 1 and 2: Use a hash map + multiple linked lists.
Hash map: the key is the color, the value is a pointer (or reference, in Java) to a linked list of the corresponding UINs for that color.
Each linked list contains UINs.
For query 1: ask the hash map, then return the corresponding linked list. Cost is O(k1), where k1 is the number of UINs for the queried color. Space is O(m + k1), where m is the number of unique colors.
For query 2: do query 1, then apply query 3 to each UIN. Cost is O(k1), where k1 is the number of UINs for the queried color. Space is O(m + k1), where m is the number of unique colors.
To insert: given color, number and UIN, insert into the query-3 hash map an object (number, color); then hash(color) to reach the corresponding linked list and insert the UIN.
To delete: given a UIN, ask query 3 for the color, then use the query-1 structure to delete the UIN from its linked list. Then delete the UIN from the query-3 hash map. (A brief sketch of these two maps follows.)
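A rough sketch of these two maps might look like the following; the names are invented (the answer gives no concrete code), and a Java 16+ record is used for brevity:
import java.util.*;

// Query 3: UIN -> (slot, color). Queries 1/2: color -> linked list of UINs.
record PackageInfo(int slot, String color) {}

class PackageIndex {
    private final Map<Long, PackageInfo> byUin = new HashMap<>();
    private final Map<String, LinkedList<Long>> uinsByColor = new HashMap<>();

    void insert(long uin, int slot, String color) {
        byUin.put(uin, new PackageInfo(slot, color));
        uinsByColor.computeIfAbsent(color, c -> new LinkedList<>()).add(uin);
    }

    void delete(long uin) {
        PackageInfo info = byUin.remove(uin);                 // query 3 gives the color
        if (info != null) {
            List<Long> uins = uinsByColor.get(info.color());
            if (uins != null) uins.remove(Long.valueOf(uin)); // query 1 list
        }
    }
}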
Bonus: to manage the storage space, the situation is the same as memory management in an OS: read more
This is very simple to do with SegmentTree.
Just store the position number in each place and query the minimum; it will match the first vacant place. When you capture a place, just assign 0 to it.
Package information can be stored in a separate array.
Initially it has the following values:
1 2 3 4
After capturing one place it looks as follows:
0 2 3 4
After capturing one more it looks as follows:
0 0 3 4
After capturing one more it looks as follows:
0 0 0 4
After freeing place 2 it looks as follows:
0 2 0 4
After capturing one more it looks as follows:
0 0 0 4
and so on.
If you have a segment tree that fetches the minimum over a range, each operation can be done in O(log N).
Here is my implementation in C#; it is easy to translate to C++ or Java.
public class SegmentTree
{
private int Mid;
private int[] t;
public SegmentTree(int capacity)
{
this.Mid = 1;
while (Mid <= capacity) Mid *= 2;
this.t = new int[Mid + Mid];
for (int i = Mid; i < this.t.Length; i++) this.t[i] = int.MaxValue;
for (int i = 1; i <= capacity; i++) this.t[Mid + i] = i;
for (int i = Mid - 1; i > 0; i--) t[i] = Math.Min(t[i + i], t[i + i + 1]);
}
public int Capture()
{
int answer = this.t[1];
if (answer == int.MaxValue)
{
throw new Exception("Empty space not found.");
}
this.Update(answer, int.MaxValue);
return answer;
}
public void Erase(int index)
{
this.Update(index, index);
}
private void Update(int i, int value)
{
t[i + Mid] = value;
for (i = (i + Mid) >> 1; i >= 1; i = (i >> 1))
t[i] = Math.Min(t[i + i], t[i + i + 1]);
}
}
Here is an example of usage:
int n = 4;
var st = new SegmentTree(n);
Console.WriteLine(st.Capture());
Console.WriteLine(st.Capture());
Console.WriteLine(st.Capture());
st.Erase(2);
Console.WriteLine(st.Capture());
Console.WriteLine(st.Capture());
For getting the storage space number I used a min-heap approach (PriorityQueue). This works in O(log n) time for both removal and insertion.
I used 2 BiMaps, self-created data structures, to store the mapping between UIN, color and storage space number. Internally these BiMaps use a HashMap and an array of size N.
In the first BiMap (BiMap1), a HashMap<Color, Set<StorageSpace>> stores the mapping from color to the set of storage spaces, and a String array String[] colorSpace stores the color at each storage space index.
In the second BiMap (BiMap2), a HashMap<UIN, StorageSpace> stores the mapping between UIN and storage space, and a String array String[] uinSpace stores the UIN at each storage space index.
Querying is straightforward with this approach:
When the input is color, show all the UIN associated with this color.
Get the set of storage spaces from BiMap1; for these spaces, use the array in BiMap2 to get the corresponding UINs.
When the input is color, show all the numbers where these packages are placed(storage space number). Use BiMap1's HashMap to get the list.
Show where an item with a given UIN is placed, i.e. storage space number. Use BiMap2 to get the values from the HashMap.
Now when we are given a storage space to remove, both BiMaps have to be updated. In BiMap1, get the entry from the array, get the corresponding Set, and remove the space number from this set. From BiMap2, get the UIN from the array, remove it, and also remove it from the HashMap.
For both BiMaps the removal and insert operations are O(1), and the min-heap works in O(log N); hence the total time complexity is O(log N). (A rough sketch of the second BiMap is shown below.)
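For illustration only, a rough sketch of the second BiMap (a HashMap plus an array indexed by storage space) could look like this; names are invented, since the answer gives no concrete code:
import java.util.*;

class UinSpaceBiMap {
    private final Map<String, Integer> uinToSpace = new HashMap<>();
    private final String[] uinAtSpace; // index = storage space number

    UinSpaceBiMap(int capacity) {
        uinAtSpace = new String[capacity + 1]; // spaces are numbered 1..N
    }

    void put(String uin, int space) {          // O(1)
        uinToSpace.put(uin, space);
        uinAtSpace[space] = uin;
    }

    void removeBySpace(int space) {            // O(1): the array yields the UIN
        String uin = uinAtSpace[space];
        if (uin != null) {
            uinToSpace.remove(uin);
            uinAtSpace[space] = null;
        }
    }

    Integer spaceOf(String uin) {              // query 3
        return uinToSpace.get(uin);
    }
}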

Graphhopper Dijkstra One-To-Many Memory Error

No matter the size of the graph and the server I use, any time I attempt to route with the dijkstra_one_to_many algorithm, I overflow my heap. The test environment is an m3.2xlarge with 30 GB of RAM and 2x80 GB SSD drives.
java.lang.OutOfMemoryError: Java heap space
I've tracked down the code block that is the problem inside com.graphhopper.routing.DijkstraOneToMany in the findEndNode method:
while (true) {
visitedNodes++;
EdgeIterator iter = outEdgeExplorer.setBaseNode(currNode);
while (iter.next()) {
int adjNode = iter.getAdjNode();
int prevEdgeId = edgeIds[adjNode];
if (!accept(iter, prevEdgeId))
continue;
double tmpWeight = weighting.calcWeight(iter, false, prevEdgeId) + weights[currNode];
if (Double.isInfinite(tmpWeight))
continue;
double w = weights[adjNode];
if (w == Double.MAX_VALUE) {
parents[adjNode] = currNode;
weights[adjNode] = tmpWeight;
heap.insert_(tmpWeight, adjNode);
changedNodes.add(adjNode);
edgeIds[adjNode] = iter.getEdge();
} else if (w > tmpWeight) {
parents[adjNode] = currNode;
weights[adjNode] = tmpWeight;
heap.update_(tmpWeight, adjNode);
changedNodes.add(adjNode);
edgeIds[adjNode] = iter.getEdge();
}
}
if (heap.isEmpty() || isMaxVisitedNodesExceeded() || isWeightLimitExceeded())
return NOT_FOUND;
// calling just peek and not poll is important if the next query is cached
currNode = heap.peek_element();
if (finished())
return currNode;
heap.poll_element();
}
It seems to never find the end node and the internal data structure (min heap?) grows and grows and grows until I run out of heap space. Why is this happening?
I can post my config.properties as well if that is needed. Thank you Peter for putting together an awesome piece of open source software.
The DijkstraOneToMany class is currently not intended to be (easily) used from the outside, e.g. it is not thread-safe. You could switch to a simple Dijkstra, without a different finish condition, to lower your memory requirements for simple cases.
That said ... there can be the following issues:
make sure that you cache the calls to DijkstraOneToMany, as it creates big initial data structures
again: use it from one thread only (e.g. via ThreadLocal)
It seems to never find the end node -> maybe you use the QueryGraph with it? That will not really work, as we create so-called virtual nodes in the QueryGraph which DijkstraOneToMany does not know about. Instead, try to pick the next tower node, e.g. by avoiding QueryGraph completely or manually via an EdgeIterator.
Thank you Peter for putting together an awesome piece of open source software.
It was not just me - it is a community effort :) !

Which is my best option to process big 2D arrays in a Java app?

I'm developing an image processing app in Java (Swing), which has lots of calculations.
It crashes when big images are loaded:
java.lang.OutOfMemoryError: Java heap space, due to things like:
double matrizAdj[][] = new double[18658][18658];
So I've decided to experiment with a lightweight, as-fast-as-possible database to deal with this problem, thinking of using a table as if it were a 2D array, looping through it and inserting the resulting values into another table.
I'm also thinking about using JNI, but I'm not familiar with C/C++ and I don't have the time needed to learn it.
Currently, my problem is not the processing, only the heap overload.
I would like to hear what is my best option to solve this.
EDIT :
A little explanation: first I get all the white pixels from a binarized image into a list. Let's say I get 18k pixels. Then I perform a lot of operations with that list, like variance, standard deviation, covariance... and so on. At the end I have to multiply two 2D arrays ([18000][2] & [2][18000]), resulting in a double[18000][18000] that is causing me trouble. After that, other operations are done with this 2D array, resulting in more than one big 2D array.
I can't deal with requiring large amounts of RAM to use this app.
Well, for trivia's sake, that matrix you're showing consumes roughly 2.6 GB of RAM. So that's a benchmark of how much memory you need, should you decide to pursue that tack.
If it's efficient for you, you could store the rows of the matrix as blobs within a database. In this case you'd have 18658 rows, each with a serialized double[18658] stored in it.
I wouldn't suggest that though.
A better tack would be to use the image file directly, and look at NIO and byte buffers, using mmap to map it into your program's address space.
Then you can use things like DoubleBuffers to access the data. This lets the VM page in as much of the original file as necessary, and it also keeps the data off the Java heap (rather, it's stored in process RAM associated with the JVM). The big benefit is that it keeps these monster data structures away from the garbage collector.
You'll still need physical RAM on the machine, of course, but it's not Java Heap RAM.
But this would likely be the most efficient way for your process to access this data. (A small sketch of the memory-mapped approach is shown below.)
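For illustration only, a minimal sketch of the memory-mapped approach might look like the following; the file name and the decision to map just one row are assumptions, and a full 18658x18658 matrix would have to be mapped in several segments because a single MappedByteBuffer is limited to 2 GB:
import java.io.IOException;
import java.nio.DoubleBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedMatrixDemo {
    public static void main(String[] args) throws IOException {
        int cols = 18658; // one row of the matrix from the question
        try (FileChannel ch = FileChannel.open(Path.of("matrix.dat"),
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // Map one row's worth of doubles; the data lives off the Java heap.
            DoubleBuffer row = ch.map(FileChannel.MapMode.READ_WRITE,
                    0, (long) cols * Double.BYTES).asDoubleBuffer();
            row.put(0, 42.0);               // write a value
            System.out.println(row.get(0)); // read it back
        }
    }
}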
Since you stated that you "can't deal with requiring large amounts of RAM to use this app", your only option is to store the big array off RAM, with disk being the most obvious choice (using a relational database is just unnecessary overhead).
You can use a little utility class which provides persistent 2-dimensional double array functionality. Here is my solution using RandomAccessFile. This solution also has the advantage that you can keep the array and reuse it when you restart the application!
Note: the presented solution is not thread-safe. Synchronization needed if you want to access it from multiple threads concurrently.
Persistent 2-dimensional double array:
public class FileDoubleMatrix implements Closeable {
private final int rows;
private final int cols;
private final long rowSize;
private final RandomAccessFile raf;
public FileDoubleMatrix(File f, int rows, int cols) throws IOException {
if (rows < 0 || cols < 0)
throw new IllegalArgumentException(
"Rows and cols cannot be negative!");
this.rows = rows;
this.cols = cols;
rowSize = cols * 8;
raf = new RandomAccessFile(f, "rw");
raf.setLength(rowSize * rows);
}
/**
* Absolute get method.
*/
public double get(int row, int col) throws IOException {
pos(row, col);
return get();
}
/**
* Absolute set method.
*/
public void set(int row, int col, double value) throws IOException {
pos(row, col);
set(value);
}
public void pos(int row, int col) throws IOException {
if (row < 0 || col < 0 || row >= rows || col >= cols)
throw new IllegalArgumentException("Invalid row or col!");
raf.seek(row * rowSize + col * 8);
}
/**
* Relative get method. Useful if you want to go through the whole array or
* through a contiguous part; use {@link #pos(int, int)} to position.
*/
public double get() throws IOException {
return raf.readDouble();
}
/**
* Relative set method. Useful if you want to go through the whole array or
* through a contiguous part; use {@link #pos(int, int)} to position.
*/
public void set(double value) throws IOException {
raf.writeDouble(value);
}
public int getRows() { return rows; }
public int getCols() { return cols; }
@Override
public void close() throws IOException {
raf.close();
}
}
The presented FileDoubleMatrix supports relative get() and set() methods, which are very useful if you process your whole array or a contiguous part of it (e.g. you iterate over it). Use the relative methods when you can, for faster operation.
Example using the FileDoubleMatrix:
final int rows = 10;
final int cols = 10;
try (FileDoubleMatrix arr = new FileDoubleMatrix(
new File("array.dat"), rows, cols)) {
System.out.println("BEFORE:");
for (int row = 0; row < rows; row++) {
for (int col = 0; col < cols; col++) {
System.out.print(arr.get(row, col) + " ");
}
System.out.println();
}
// Process array; here we increment the values
for (int row = 0; row < rows; row++)
for (int col = 0; col < cols; col++)
arr.set(row, col, arr.get(row, col) + (row * cols + col));
System.out.println("\nAFTER:");
for (int row = 0; row < rows; row++) {
for (int col = 0; col < cols; col++)
System.out.print(arr.get(row, col) + " ");
System.out.println();
}
} catch (IOException e) {
e.printStackTrace();
}
More about the relative get and set methods:
The absolute get and set methods require the position (row and column) of the element to be returned or set. The relative get and set methods do not require the position, they return or set the current element. The current element is in fact the pointer of the underlying file. The position can be set with the pos() method.
Whenever a relative get() or set() method is called, after returning it implicitly moves the pointer to the next element, in row-major order (moving to the next element in the row, and if the end of the row is reached, moving to the first element of the next row, etc.).
For example here is how we can zero the whole array using the relative set method:
// Fill the whole array with zeros using relative set
// First position to the beginning:
arr.pos(0, 0);
// And execute a "set zero" operation
// as many times as many elements the array has:
for ( int i = rows * cols; i > 0; i--)
arr.set(0);
The relative get and set methods automatically move the pointer to the next element.
It should be obvious that in my implementation the absolute get and set methods also move the pointer, which must not be forgotten when relative and absolute get/set methods are mixed.
Another example: let's set the sum of each row as the last element of the row, but also include the last element in the sum! For this we will use a mixture of relative and absolute get/set methods:
// Start with the first row:
arr.pos(0, 0);
for (int row = 0; row < rows; row++) {
double sum = 0;
for (int col = 0; col < cols; col++)
sum += arr.get(); // Relative get to calculate row sum
// Now set the sum to the end of row.
// For this we have to position back, so we use the absolute set.
arr.set(row, cols - 1, sum);
// The absolute set method also moves the pointer, and since
// it is the end of row, it moves to the first of the next row.
}
And that's all. Using the relative get/set methods we don't have to pass the "matrix indices" when processing contiguous parts of the array, and the implementation does not have to re-seek the internal file pointer for every element, which is more than handy when processing millions of elements as in your example.
I would recommend the following things in order.
Investigate why your app is running out of memory. Are you creating arrays or other objects bigger than what you need? You may have done that already, but I still think it's worth mentioning because it should not be ignored.
If you think there is nothing wrong with step 1, then check that you are not running with too-low memory settings or a 32-bit JVM.
If there is no issue with step 2: it's not always true that a lightweight database will give you the best performance. If you don't need to search the temporary data, you may not gain much from implementing a lightweight database. But if your application needs a lot of searching/querying of the temporary data, it may be a different case. If you don't need searching, a custom file format may be fast and efficient.
I hope it helps you solve the issue at hand :)
The simplest fix would be simply to give your program more memory. For example, if you specify -Xmx11g on your Java command line, the JVM will be able to allocate up to 11 GB of heap space - enough memory to carry several copies of your array, which is around 2.6 GB in size, in memory at a time.
If speed is really not an issue, you can do this even if you don't have enough physical memory, by allocating enough virtual memory and letting the OS swap the memory to disk.
I personally also think this is the best solution. Memory on this scale is cheaper than programmer time.
I would suggest a different approach.
Since most image processing operations are done by going over all of the pixels in some order exactly once, it's usually possible to do them on one piece of the image at a time. What I mean is that there's usually no random access to the pixels of the image. If I'm not mistaken, all of the operations you mention in your question fit this description.
Therefore, I would suggest loading the image lazily, a piece at a time. Then implement methods that retrieve the next chunk of pixels once the previous one is processed, and feed these chunks to the algorithms you use.
In order to support that, I would suggest converting the images to a non-compressed format that you can easily create a lazy reader for. (A small sketch of such streaming processing is shown below.)
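As an illustration of this idea only (the answer gives no code), here is a minimal sketch that streams pixel values from an assumed raw file of big-endian doubles and accumulates statistics in a single pass, without holding the whole image in memory; the file name and format are assumptions:
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

public class StreamingStats {
    public static void main(String[] args) throws IOException {
        long count = 0;
        double sum = 0, sumSq = 0;
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream("pixels.raw")))) {
            while (in.available() > 0) {
                double v = in.readDouble(); // one value at a time, buffered underneath
                count++;
                sum += v;
                sumSq += v * v;
            }
        }
        double mean = sum / count;
        double variance = sumSq / count - mean * mean;
        System.out.printf("mean=%f variance=%f%n", mean, variance);
    }
}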
Not sure I would bother with a database for this; just open a temporary file and spill parts of your matrix in there as needed, and delete the file when you're done. Whatever solution you choose has to depend somewhat on your matrix library being able to use it. If you're using a third-party library then you're probably limited to whatever options (if any) it provides. However, if you've implemented your own matrix operations, then I would definitely just go with a temporary file that I manage myself. That will be the fastest and lightest-weight option.
You can use a split-and-reduce technique.
Split your image into small fragments, or use a sliding-window technique:
http://forums.ni.com/t5/Machine-Vision/sliding-window-technique/td-p/2586621
cheers,
