Java code optimization, Heap size constraint (use less memory) - java

I am getting an out-of-memory exception while running this code, and the constraint is heap size. Can anyone suggest a way to optimize this code further?
public class getCustomerList {
public static List <Customer> retrieve() throws ParseException {
List<Customer> customers = new ArrayList<Customer>();
for (int i = 0; i < 100000; i++) {
Customer customer = new Customer();
customer.setAge(new Integer(i));
customer.setBirthDate((new SimpleDateFormat("ddMMyyyy")).parse("01061986"));
customer.setName("Customer" + new String((new Integer(i)).toString()));
customers.add(customer);
}
return customers;
}
}

A few ideas that might help:
Make age a primitive. If it already is, pass the method an int, not an Integer.
customer.setAge(i);
Move SimpleDateFormat outside the loop; currently you create 100000 identical instances.
customer.setBirthDate(format.parse("01061986"));
Do you really need 100000 identical Date instances, one per customer? If you don't, you can get away with setting the same Date instance on every customer (Date is mutable, so this is only safe if nobody modifies it afterwards).
customer.setBirthDate(date);
The current name creation is very inefficient: you create an Integer object, create a String from it (the Integer is then thrown away), and then create a copy of that String (throwing away the original). Just do:
customer.setName("Customer" + i);
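Putting the four suggestions together might look roughly like the sketch below. The Customer class here is a hypothetical stand-in (the question does not show it), assumed to hold an int age, a Date and a String. The pre-sized ArrayList and the parseBirthDate helper are extras of this sketch, not something suggested above.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

class CustomerListSketch {

    // Hypothetical stand-in for the Customer class, which the question does not show.
    static final class Customer {
        private int age;          // primitive int instead of Integer
        private Date birthDate;
        private String name;

        void setAge(int age) { this.age = age; }
        void setBirthDate(Date birthDate) { this.birthDate = birthDate; }
        void setName(String name) { this.name = name; }
        int getAge() { return age; }
        Date getBirthDate() { return birthDate; }
        String getName() { return name; }
    }

    // Helper of this sketch: parses once and wraps the checked exception.
    private static Date parseBirthDate(String text) {
        try {
            return new SimpleDateFormat("ddMMyyyy").parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }

    static List<Customer> retrieve(int count) {
        // Pre-size the list so its backing array is not reallocated while filling.
        List<Customer> customers = new ArrayList<>(count);
        // One formatter and one Date for the whole run; every customer shares the Date.
        Date birthDate = parseBirthDate("01061986");
        for (int i = 0; i < count; i++) {
            Customer customer = new Customer();
            customer.setAge(i);               // no Integer boxing
            customer.setBirthDate(birthDate); // shared instance
            customer.setName("Customer" + i); // no intermediate Integer or String copy
            customers.add(customer);
        }
        return customers;
    }

    public static void main(String[] args) {
        List<Customer> customers = retrieve(100000);
        System.out.println(customers.size() + " customers created");
    }
}
```

If sharing the Date turns out to be unsafe in your code, keep the single SimpleDateFormat and parse inside the loop; that still saves the 100000 formatter instances.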


How to search an array of objects and then update a part of the object in java?

I have an array of objects. Each object is a customer record, consisting of the customer ID (int), first name (String), last name (String), and balance (double).
My problem is that I am not supposed to have duplicate customer records, so if a customer appears in the file twice, I have to just update their balance. I cannot figure out how to search the array to find out whether I need to just update the balance or make a new record in the array.
I feel like I should do this in the getters/setters, but I am not exactly sure.
Edit: to clarify "if they appear in the file twice, I have to just update their balance": I have a file I made in Notepad which is supposed to be a customer list containing all of their information. If the same customer shows up twice, say the following day to buy more stuff, I am not supposed to create a new object for them, since they already have an object and a place in the array. Instead, I am supposed to take the amount they spent and add it to the existing balance within their existing object.
Edit 2: I thought I would give you the bit of code I already have, where I read the values into the array. I based this on the example we did in class, but there we didn't have to update anything, just store information in an array and print it if needed.
public CustomerList(){
data = new CustomerRecord[100]; //i'm only allowed 100 unique customers
try {
Scanner input = new Scanner(new File("Records.txt"));
for(int i = 0; i < 100; ++i){
data[i] = new CustomerRecord();
data[i].setcustomerNumber(input.nextInt());
data[i].setfirstName(input.next());
data[i].setlastName(input.next());
data[i].setTransactionAmount(input.nextDouble());
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
You shouldn't be using arrays in that case. A Set would be much more suitable as it, by definition, does not have duplicate entries.
What you need to do is to implement the equals() and hashCode() methods in your Customer class so they only use id (or id and name fields) but not balance.
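A minimal sketch of what such equals() and hashCode() methods could look like, assuming a hypothetical Customer with an int id and a double balance (the field names are made up for illustration):

```java
import java.util.HashSet;
import java.util.Set;

class CustomerEqualitySketch {

    // Hypothetical Customer: identity is the id only; balance is deliberately ignored.
    static final class Customer {
        private final int id;
        private double balance;

        Customer(int id, double balance) {
            this.id = id;
            this.balance = balance;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Customer)) return false;
            return id == ((Customer) other).id;
        }

        @Override
        public int hashCode() {
            // Must be consistent with equals(): same id, same hash code.
            return Integer.hashCode(id);
        }
    }

    public static void main(String[] args) {
        Set<Customer> customers = new HashSet<>();
        System.out.println(customers.add(new Customer(1, 10.0))); // true: new customer
        System.out.println(customers.add(new Customer(1, 99.0))); // false: same id already present
    }
}
```

Note that Set.add() only reports that the customer already exists; it does not hand back the stored object, so updating the balance still needs a lookup.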
If for some reason you need to use arrays you have two options:
Sort the array and use binary search to check whether the customer is there. This works well if the array rarely gains new entries but you do a lot of balance updates.
Simply do a linear scan of the array, checking each entry to see whether the given customer is already there; if so, update the balance, otherwise add it as a new entry.
It would be something like:
public void updateOrAdd(Customer cst) {
boolean exists = false;
for(Customer existing : array) {
// !!! You need to implement your own equals method in the
// customer so it doesn't take into account the balance !!!
if(existing.equals(cst)) {
exists = true;
existing.updateBalance(cst.getBalance());
break;
}
}
if(!exists) {
// add the cst to the array
}
}
The difference is in runtime: the linear scan is O(n) per lookup and binary search is O(log n), whereas the set solution is O(1) on average (unless you implement your hashCode() method incorrectly).
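For the hash-based route, here is a rough sketch. It swaps in a HashMap keyed by customer number rather than a HashSet, because a map lets you fetch the existing record and update it in one step. CustomerRecord is a simplified, hypothetical version of the class from the question.

```java
import java.util.HashMap;
import java.util.Map;

class CustomerRegistrySketch {

    // Simplified, hypothetical record: ID, names and a running balance.
    static final class CustomerRecord {
        final int customerNumber;
        final String firstName;
        final String lastName;
        double balance;

        CustomerRecord(int customerNumber, String firstName, String lastName, double balance) {
            this.customerNumber = customerNumber;
            this.firstName = firstName;
            this.lastName = lastName;
            this.balance = balance;
        }
    }

    private final Map<Integer, CustomerRecord> byId = new HashMap<>();

    // Adds a new record, or adds the amount to the balance of an existing one.
    void updateOrAdd(int customerNumber, String firstName, String lastName, double amount) {
        CustomerRecord existing = byId.get(customerNumber);
        if (existing != null) {
            existing.balance += amount;
        } else {
            byId.put(customerNumber, new CustomerRecord(customerNumber, firstName, lastName, amount));
        }
    }

    double balanceOf(int customerNumber) {
        CustomerRecord record = byId.get(customerNumber);
        if (record == null) {
            throw new IllegalArgumentException("Unknown customer " + customerNumber);
        }
        return record.balance;
    }

    int size() {
        return byId.size();
    }
}
```

In the file-reading loop you would call updateOrAdd() once per line instead of always creating a new array entry; the "at most 100 unique customers" rule could be enforced by checking size() before adding.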
Suppose you have a Customer array:
Customer[] customers = new Customer[size];
... // fill the array with data
Then you get a new customer object called newCustomer. You need to search for newCustomer in your array and either update it if it is already there or add it if it's not. So you can do something like this:
// Return, if it exists, a customer with id equal to newCustomer.getId()
// (assumes getId() returns an Integer; needs java.util.Arrays, Objects and Optional)
Optional<Customer> existingCustomer =
    Arrays.stream(customers)
        .filter(Objects::nonNull) // skip slots that have not been filled yet
        .filter(c -> newCustomer.getId().equals(c.getId()))
        .findFirst();
if (existingCustomer.isPresent()) {
    // update the customer object with newCustomer information as appropriate
    Customer customer = existingCustomer.get();
    // assuming you have an updateBalance() method
    customer.updateBalance(newCustomer.amountSpent());
} else {
    // if the customer is not in the array already, add them at the next free position
    customers[currentIndex++] = newCustomer;
}

Counting reference targets in a heap dump of Set<WeakReference>

I'm currently looking at the heap dump of this silly little test class (taken at the very end of the main method):
public class WeakRefTest {
static final class RefObj1 { int i; }
static final class RefObj2 { int j; }
public static void main(String[] args) {
Set<WeakReference<?>> objects = new HashSet<>();
RefObj1 obj1 = new RefObj1();
RefObj2 obj2 = new RefObj2();
for (int i = 0; i < 1000; i++) {
objects.add(new WeakReference<RefObj1>(obj1));
objects.add(new WeakReference<RefObj2>(obj2));
}
}
}
Now I'm trying to figure out how to count the number of references to a specific class in objects. If this were a SQL database, it'd be easy:
select objects.className as referent, count(*) as cnt
from java.lang.ref.WeakReference ref
inner join heapObjects objects on ref.referent = objects.objectId
group by objects.className;
Result:
referent | cnt
===================
WeakRefTest$RefObj1 | 1000
WeakRefTest$RefObj2 | 1000
After some research, I figured out that I can construct an Eclipse MAT OQL query that gives me the classes involved:
select DISTINCT OBJECTS classof(ref.referent) from java.lang.ref.WeakReference ref
Alas, this doesn't include their count and OQL doesn't seem to support a GROUP BY clause. Any ideas how to get this information?
Edited to add: In reality, none of the objects added to the Set (nor the Set implementation itself, obviously) are under my control. So sorry, modifying RefObj1 and RefObj2 isn't allowed.
Edit 2: I found this related question about using OQL in jvisualvm, but it turns out that OQL there is actually JavaScript unleashed on a heap dump. I'd be fine with something like that, too. But playing around with it hasn't produced results for me yet. I'll update the question if that changes.
Open the histogram view (there is a toolbar button for this, which looks like a bar graph).
In the first row of the histogram view where it says "Regex", type WeakReference to filter the view.
Find the java.lang.ref.WeakReference line, right-click, and choose "Show Objects By Class" -> "By Outgoing References".
The resulting view should summarize the objects being referred to, grouped by class as you require. The Objects column should indicate the number of instances for each class.
You could just write a method in the object that returns the information and call that from Eclipse...
Since you cannot modify the object, the next best thing is to write a utility function somewhere you can modify and call it from the Eclipse debugger. I don't know Eclipse well enough to help you do it without inserting something into the source code, sorry.
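Such a utility function might look like the sketch below. The class and method names are made up for illustration, and it works on the live process rather than on a heap dump, so it only approximates what the question asks for.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class ReferentCounterSketch {

    // Counts live referents per class name; references already cleared by the GC are skipped.
    static Map<String, Integer> countByReferentClass(Collection<? extends Reference<?>> refs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Reference<?> ref : refs) {
            Object referent = ref.get();
            if (referent != null) {
                counts.merge(referent.getClass().getName(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Set<WeakReference<?>> objects = new HashSet<>();
        Object first = new Object();
        StringBuilder second = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            objects.add(new WeakReference<Object>(first));
            objects.add(new WeakReference<StringBuilder>(second));
        }
        System.out.println(countByReferentClass(objects));
        // Touch the referents here so they stay strongly reachable during the count.
        System.out.println(first.hashCode() + second.length());
    }
}
```

This plays the role of the GROUP BY in the SQL version of the query: one map entry per referent class, with the count as the value.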
I would use a weak hash set (a Set backed by a WeakHashMap). You can just use set.size() to get the number of referenced objects still alive.
static final class RefObj1 { int i; }
static final class RefObj2 { int j; }
public static void main(String[] args) {
Set<Object> objects = Collections.newSetFromMap(new WeakHashMap<Object, Boolean>());
RefObj1 obj1 = new RefObj1();
RefObj2 obj2 = new RefObj2();
for (int i = 0; i < 1000; i++) {
objects.add(obj1);
objects.add(obj2);
}
obj1 = null;
System.gc();
System.out.println("Objects left is " + objects.size());
}
I would expect this to print 0, 1 or 2 depending on how the objects are cleaned up.

Java collection and memory optimization

I wrote a custom index to a custom table which uses 500MB of heap for 500k strings. Only 10% of the strings are unique; the rest are repeats. Every string is of length 4.
How can I optimize my code? Should I use another collection? I tried to implement a custom string pool to save memory:
public class StringPool {
private static WeakHashMap<String, String> map = new WeakHashMap<>();
public static String getString(String str) {
if (map.containsKey(str)) {
return map.get(str);
} else {
map.put(str, str);
return map.get(str);
}
}
}
private void buildIndex() {
if (monitorModel.getMessageIndex() == null) {
// the index: every column gets its own index
ArrayList<HashMap<String, TreeSet<Integer>>> messageIndex = new ArrayList<>(filterableColumn.length);
for (int i = filterableColumn.length - 1; i >= 0; i--) {
// key -> string, value -> TreeSet of the rows which contain the key
HashMap<String, TreeSet<Integer>> hash = new HashMap<>();
messageIndex.add(hash);
}
// create index for every column
for (int i = monitorModel.getParser().getMyMessages().getMessages().size() - 1; i >= 0; --i) {
TreeSet<Integer> tempList;
for (int j = 0; j < filterableColumn.length; j++) {
String value = StringPool.getString(getValueAt(i, j).toString());
if (!messageIndex.get(j).containsKey(value)) {
tempList = new TreeSet<>();
messageIndex.get(j).put(value, tempList);
} else {
tempList = messageIndex.get(j).get(value);
}
tempList.add(i);
}
}
monitorModel.setMessageIndex(messageIndex);
}
}
No need to come up with a custom pool. Just use String.intern().
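A tiny sketch of what String.intern() does; the canonical() wrapper is just a name chosen for this example:

```java
class InternSketch {

    // Returns the canonical instance for the given text via the JVM's built-in string pool.
    static String canonical(String text) {
        return text.intern();
    }

    public static void main(String[] args) {
        String first = new String("ABCD");   // two distinct objects with equal contents
        String second = new String("ABCD");
        System.out.println(first == second);                       // false
        System.out.println(canonical(first) == canonical(second)); // true
    }
}
```

In the indexing code from the question, this would mean replacing StringPool.getString(...) with a call to intern() on the value.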
You might want to examine your memory heap in a profiler. My guess is that the memory consumption isn't primarily in the String storage, but in the many TreeSet<Integer> instances. If so, you could optimize considerably by using primitive arrays (int[], short[], or byte[], depending on the actual size of the integer values you're storing). Or you could look into a primitive collection type, such as those provided by FastUtil or Trove.
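To give an idea of the primitive-array approach, here is a hand-rolled growable int list that could stand in for each TreeSet<Integer>. It is only a sketch: unlike a TreeSet it neither sorts nor removes duplicates, which should be acceptable if each row number is added once and in a known order, as the indexing loop in the question appears to do. The class name is made up.

```java
import java.util.Arrays;

class IntRowListSketch {
    private int[] rows = new int[4]; // backing array of primitive ints, no Integer objects
    private int size;

    // Appends a row number, doubling the backing array when it is full.
    void add(int row) {
        if (size == rows.length) {
            rows = Arrays.copyOf(rows, size * 2);
        }
        rows[size++] = row;
    }

    int size() {
        return size;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return rows[index];
    }

    // Returns a copy trimmed to the number of stored rows.
    int[] toArray() {
        return Arrays.copyOf(rows, size);
    }
}
```

Each stored row then costs 4 bytes plus some slack in the array, instead of a boxed Integer plus a tree node per entry.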
If you do find that the String storage is problematic, I'll assume that you want to scale your application beyond 500k Strings, or that especially tight memory constraints require you to deduplicate even short Strings.
As Dev said, String.intern() will deduplicate Strings for you. One caveat, however - in the Oracle and OpenJDK virtual machines, String.intern() will store those Strings in the VM permanent-generation, such that they will not be garbage-collected in the future. That's appropriate (and helpful) if:
The Strings you're storing do not change throughout the life of the VM (e.g., if you read in a static list at startup and use it throughout the life of your application).
The Strings you need to store fit comfortably in the VM permanent generation (with adequate room for classloading and other consumers of PermGen). Update: see below.
If either of those conditions is false, you are probably correct to build a custom pool. But my recommendation is that you consider a simple HashMap in place of the WeakHashMap you're currently using. You probably don't want these values to be garbage-collected while they're in your cache, and WeakHashMap adds another level of indirection (and the associated object pointers), increasing memory consumption further.
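A minimal sketch of such a HashMap-based pool, assuming single-threaded use (it is not synchronized) and an instance rather than the static map from the question:

```java
import java.util.HashMap;
import java.util.Map;

class SimpleStringPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the first instance seen with these contents, so callers can drop their duplicate.
    String canonical(String text) {
        String existing = pool.putIfAbsent(text, text);
        return existing != null ? existing : text;
    }

    int size() {
        return pool.size();
    }
}
```

Entries stay in the pool until the pool itself becomes unreachable, so it makes sense to let it go once the index has been built if the strings are not needed afterwards.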
Update: I'm told that JDK 7 stores interned Strings (String.intern()) in the main heap, not in perm-gen, as earlier JDKs did. That makes String.intern() less risky if you're using JDK 7.

Array parameter passing

I'm in a beginner's Java class. This lab asks me to make a class "Wallet" that manipulates an array representing a wallet. Wallet contains the "contents[]" array to store integers representing paper currency. The variable "count" holds the number of banknotes in a wallet. After writing methods (that match provided method calls in a separate Driver class) to initialize the Wallet and add currency/update "count", I need to transfer the array of one instantiated Wallet to another. I don't know how that would work, because the Wallet class has so far only been dealing with a wallet called "myWallet", and now I need to take a new Wallet called "yourWallet" and fill it with "myWallet"'s array values.
// I should note that using the Java API library is not allowed in this course
My Wallet class looks like this so far:
public class Wallet
{
// max possible # of banknotes in a wallet
private static final int MAX = 10;
private int contents[];
private int count; // count # of banknotes stored in contents[]
public Wallet()
{
contents = new int[MAX];
count = 0;
}
/** Adds a banknote to the end of a wallet. */
public void addBanknote(int banknoteType)
{
contents[count] = banknoteType;
count = count + 1;
}
/**
* Transfers the contents of one wallet to the end of another. Empties the donor wallet.
*/
public void transfer(Wallet donor)
{
//my code belongs here
}
...
The Driver code looks like this:
public class Driver
{
public static void main(String args[])
{
Wallet myWallet = new Wallet();
myWallet.addBanknote(5);
myWallet.addBanknote(50);
myWallet.addBanknote(10);
myWallet.addBanknote(5);
System.out.println("myWallet contains: " + myWallet.toString());
// transfer all the banknotes from myWallet to yourWallet
Wallet yourWallet = new Wallet();
yourWallet.addBanknote(1);
yourWallet.transfer(myWallet);
System.out.println("\nnow myWallet contains: "
+ myWallet.toString());
System.out.println("yourWallet contains: "
+ yourWallet.toString());
I want to use addBanknote() to help with this, but I don't know how to tell the transfer() method to transfer all of myWallet into yourWallet.
I had the idea to do something like this in transfer():
yourWallet.addBanknote(myWallet.contents[i]);
with a traversal to increase i over the contents of myWallet. It seems horribly wrong, but I'm at a complete loss as to how to write this method.
If my problem is so unclear that nobody can help, I would be more than happy to receive advice on how to ask a better question or on how to search with correct terms.
Thanks for any help you can provide.
I don't want to spoil your homework as you seem to be going the right way, but I do have some comments which you may either take or not :)
First, I would probably put the banknote types in some enumeration. But as that sounds a bit too advanced, consider
public class Wallet {
public static final int ONE_DOLLAR_BILL = 1;
public static final int FIVE_DOLLAR_BILL = 5;
...
// looks a bit more readable to me
myWallet.addBanknote(ONE_DOLLAR_BILL);
Transferring all the banknotes from the donor to yourself should not be so much of a problem
(a for loop would do) but I think you're in a world of hurt if you are trying to implement a
removeBanknote(int banknoteType);
as you are using count not only as a length but also as an index variable. By this I mean that you assume contents[0] ... contents[count-1] hold valid banknotes. And how do you remove one without too much work?
Warning: a bit more advanced
In your case I would probably opt to have a banknoteType of 0 indicating an empty banknote slot in your wallet, and implement addBanknote(int banknoteType) as:
public void addBanknote(int banknoteType) {
for (int i=0; i < contents.length; i++) {
if (contents[i] == 0) {
contents[i] = banknoteType;
count++;
return; // OK inserted banknote at the first empty slot
}
}
throw new RuntimeException("Wallet is full");
}
This may be a bit overwhelming at this point. But it would allow you to implement:
public void removeBanknote(int banknoteType) {
for (int i=0; i < contents.length; i++) {
if (contents[i] == banknoteType) {
contents[i] = 0; // better: NO_BANKNOTE = 0
count--;
return;
}
}
throw new RuntimeException("This wallet does not contain a banknote of type " + banknoteType);
}
Please note that in both methods I return as soon as I have successfully added or removed the banknote. Only when no free slot (or no banknote of the requested type) is found does the for loop run to completion, after which I throw an exception and thereby stop the program.
I think the question is fine and I think you're on the right path. The way you're calling Wallet#addBanknote(int) is correct. What you have said is the right thing:
public void transfer(Wallet donor)
{
// Traverse the donor's wallet
// Add the bank note from the donor to this wallet
// What do you think also needs to happen to make sure
// the donor is actually giving their bank note?
}
Just another thing, what would happen in your Wallet#addBanknote(int) method if you have more contents than the MAX?
You can create either a constructor that takes another wallet, or a function (as already mentioned), and use System.arraycopy to copy the array in one fell swoop. System.arraycopy is fast, and it's definitely overkill for something small like this, but it's a good tool to have in your toolkit.
The other alternative mentioned, copying the elements from one array to the other one by one in a loop, will work fine too.
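A sketch of the System.arraycopy variant, for reference. It assumes the course would allow that call (the question says library use is restricted), and the class name, the bounds checks and the two getters are additions of this sketch:

```java
class WalletArraycopySketch {
    private static final int MAX = 10;
    private final int[] contents = new int[MAX];
    private int count = 0;

    void addBanknote(int banknoteType) {
        if (count == MAX) {
            throw new IllegalStateException("Wallet is full");
        }
        contents[count++] = banknoteType;
    }

    // Appends all of the donor's banknotes in one bulk copy, then empties the donor.
    void transfer(WalletArraycopySketch donor) {
        if (count + donor.count > MAX) {
            throw new IllegalStateException("Not enough room for the donor's banknotes");
        }
        // arguments: source array, source start, destination array, destination start, length
        System.arraycopy(donor.contents, 0, contents, count, donor.count);
        count += donor.count;
        donor.count = 0; // the old values stay in the array but are no longer counted
    }

    int getCount() { return count; }
    int getBanknote(int index) { return contents[index]; }
}
```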
The myWallet inside the transfer method is named 'donor', and with that, it doesn't look horribly wrong:
addBanknote (donor.contents [i]);
You just need a loop around it, and you need to drop the yourWallet. prefix, which is the name of one particular instance of the class. Inside the method, that instance is this, but it needn't be written out, because there is no other addBanknote method in scope that could be meant. (Thanks to mangoDrunk.)
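Spelled out, the loop version could look something like this (spoiler for the homework). The class name, the full-wallet check and the two getters are assumptions added for the sketch; the rest follows the Wallet class from the question.

```java
class WalletLoopSketch {
    private static final int MAX = 10;
    private final int[] contents = new int[MAX];
    private int count = 0;

    void addBanknote(int banknoteType) {
        if (count == MAX) {
            throw new IllegalStateException("Wallet is full");
        }
        contents[count] = banknoteType;
        count = count + 1;
    }

    // Moves every banknote from donor to the end of this wallet, then empties donor.
    void transfer(WalletLoopSketch donor) {
        for (int i = 0; i < donor.count; i++) {
            addBanknote(donor.contents[i]); // same as this.addBanknote(...)
        }
        donor.count = 0; // the donor now reports zero banknotes
    }

    int getCount() { return count; }
    int getBanknote(int index) { return contents[index]; }
}
```

Accessing donor.contents and donor.count directly works even though they are private, because private in Java means private to the class, not to the individual object.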

Better practice to re-instantiate a List or invoke clear()

Using Java (1.6) is it better to call the clear() method on a List or just re-instantiate the reference?
I have an ArrayList that is filled with an unknown number of Objects and periodically "flushed" - where the Objects are processed and the List is cleared. Once flushed the List is filled up again. The flush happens at a random time. The number within the List can potentially be small (10s of Objects) or large (millions of objects).
So is it better to have the "flush" call clear() or new ArrayList() ?
Is it even worth worrying about this sort of issues or should I let the VM worry about it? How could I go about looking at the memory footprint of Java to work this sort of thing out for myself?
Any help greatly appreciated.
The main thing to be concerned about is what other code might have a reference to the list. If the existing list is visible elsewhere, do you want that code to see a cleared list, or keep the existing one?
If nothing else can see the list, I'd probably just clear it - but not for performance reasons; just because the way you've described the operation sounds more like clearing than "create a new list".
The ArrayList<T> docs don't specify what happens to the underlying data structures, but looking at the 1.7 implementation in Eclipse, it looks like you should probably call trimToSize() after clear() - otherwise you could still have a list backed by a large array of null references. (Maybe that isn't an issue for you, of course... maybe that's more efficient than having to copy the array as the size builds up again. You'll know more about this than we do.)
(Of course creating a new list doesn't require the old list to set all the array elements to null... but I doubt that that will be significant in most cases.)
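The two options, and the aliasing difference between them, can be sketched as follows (the class and method names are made up for illustration):

```java
import java.util.ArrayList;

class FlushSketch {

    // Option 1: keep the same list object; trimToSize() then releases the large backing array.
    static void flushInPlace(ArrayList<Object> buffer) {
        buffer.clear();
        buffer.trimToSize();
    }

    // Option 2: hand back a fresh list; the old one becomes garbage once nothing references it.
    static ArrayList<Object> flushByReplacing() {
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        ArrayList<Object> buffer = new ArrayList<>();
        ArrayList<Object> alias = buffer;   // some other code holding the same list
        buffer.add("item");

        flushInPlace(buffer);
        System.out.println(alias.isEmpty()); // true: the alias sees the cleared list

        buffer.add("item");
        buffer = flushByReplacing();
        System.out.println(alias.isEmpty()); // false: the alias still holds the old contents
    }
}
```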
The way you are using it looks very much like how a Queue is used: as you work off the items on the queue, they are removed as you process them.
Using one of the Queue classes might make the code more elegant.
There are also variants which handle concurrent updates in a predictable way.
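A minimal sketch of the queue-style flush, using ArrayDeque as one possible Queue implementation; the String element type and the method name are placeholders:

```java
import java.util.ArrayDeque;
import java.util.Queue;

class QueueFlushSketch {

    // Drains the queue, handling each item as it is removed; returns how many were handled.
    static int flush(Queue<String> pending) {
        int processed = 0;
        String item;
        while ((item = pending.poll()) != null) { // poll() returns null once the queue is empty
            // real processing of item would go here
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        Queue<String> pending = new ArrayDeque<>();
        pending.add("first");
        pending.add("second");
        System.out.println(flush(pending) + " items processed");
    }
}
```

If producers and the flusher run on different threads, java.util.concurrent offers queues such as ConcurrentLinkedQueue and LinkedBlockingQueue that can be polled the same way.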
I think that if the ArrayList is flushed very frequently, for example continuously in a loop, it is better to use clear(); if flushing is infrequent, you may as well create a new instance. Also, since you say the number of elements may vary from tens to millions, you could pick an in-between initial capacity for each new ArrayList you create, so that it avoids resizing many times.
In my measurements, list.clear() has no performance advantage over new XXList.
Here is my investigation comparing the two.
import java.util.ArrayList;
import java.util.List;
public class ClearList {
public static void testClear(int m, int n) {
List<Integer> list = new ArrayList<>();
long start = System.currentTimeMillis();
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
list.add(Integer.parseInt("" + j + i));
}
list.clear();
}
System.out.println(System.currentTimeMillis() - start);
}
public static void testNewInit(int m, int n) {
List<Integer> list = new ArrayList<>();
long start = System.currentTimeMillis();
for (int i = 0; i < m; i++) {
for (int j = 0; j < n; j++) {
list.add(Integer.parseInt("" + j + i));
}
list = new ArrayList<>();
}
System.out.println(System.currentTimeMillis() - start);
}
public static void main(String[] args) {
System.out.println("clear ArrayList:");
testClear(991000, 100);
System.out.println("new ArrayList:");
testNewInit(991000, 100);
}
}
/*--*
* Out:
*
* clear ArrayList:
* 8391
* new ArrayList:
* 6871
*/
