First of all, thanks for the help.
I'm aware of how Java passes references. I need to read one million lines (a word plus a list of integers on each line) from a text file and put them into some structures that are class attributes: one HashMap and two ArrayLists.
The problem is that with the code below, written to save memory by reusing the list "termine_frequenza", when I try to get an element from the "frequency" ArrayList or the "dictionaryMapTD" HashMap, the list returned is always the last list that I added.
Moving the declaration of the ArrayList "termine_frequenza" inside the while loop obviously solves the problem, but then I get a predictable "GC overhead limit exceeded" error because of the repeated allocations (I tried increasing the heap or disabling the limit, but the GC saturates the CPU trying to free memory).
The question is simple: how can I save memory and at the same time read the data correctly? Thanks.
//Class attributes
private HashMap<String, ArrayList> dictionaryMapTD;
private ArrayList<String> words;
private ArrayList<ArrayList> frequency;
//This is the code of a method of the class that reads from a file
br = new BufferedReader(new FileReader("dictionary.txt"));
s = br.readLine();
String[] splitted;
ArrayList<Integer> termine_frequenza = new ArrayList<>();
while(s!=null)
{
termine_frequenza.clear();
splitted = s.split(" ");
words.add(splitted[0]);
for (int i = 1; i < splitted.length; i++)
{
termine_frequenza.add(Integer.valueOf(splitted[i]));
}
frequency.add(termine_frequenza);
dictionaryMapTD.put(splitted[0], termine_frequenza);
s = br.readLine();
}
//END
Change the Xms/Xmx parameters in your eclipse.ini file.
I set them to -Xms256m -Xmx7024m for 3000000 records.
If that has no effect, try setting those parameters for the application itself.
In Eclipse, go to
Run Configurations -> Arguments -> VM arguments
for your application and put
-Xms256m
-Xmx7024m
Then, in your code, move
termine_frequenza = new ArrayList<>();
inside the while loop and remove
termine_frequenza.clear();
The GC should not complain.
In my case it runs for 7000000 records.
Let me know if it helps.
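If raising the heap is not enough, another option is to keep one container per line but make it a primitive int[] instead of an ArrayList<Integer>, which avoids one boxed Integer object per number. The following is only a sketch under the assumption that each line is a word followed by space-separated integers, as described in the question; the class name DictionaryLoader and its fields are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical loader; the names are illustrative, not from the original post.
class DictionaryLoader {
    final Map<String, int[]> dictionaryMapTD = new HashMap<>();
    final List<String> words = new ArrayList<>();
    final List<int[]> frequency = new ArrayList<>();

    void load(BufferedReader br) {
        try {
            String s;
            while ((s = br.readLine()) != null) {
                if (s.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] splitted = s.split(" ");
                // A new array per line, so every entry keeps its own data.
                int[] termineFrequenza = new int[splitted.length - 1];
                for (int i = 1; i < splitted.length; i++) {
                    termineFrequenza[i - 1] = Integer.parseInt(splitted[i]);
                }
                words.add(splitted[0]);
                // The list and the map reference the same array, so it is stored once.
                frequency.add(termineFrequenza);
                dictionaryMapTD.put(splitted[0], termineFrequenza);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        DictionaryLoader loader = new DictionaryLoader();
        loader.load(new BufferedReader(new StringReader("alpha 1 2 3\nbeta 4 5\n")));
        System.out.println(loader.dictionaryMapTD.get("alpha").length); // 3
        System.out.println(loader.frequency.get(1)[0]);                 // 4
    }
}
```

Whether this is enough for one million lines depends on how long the integer lists are, so treat it as a starting point to measure rather than a guaranteed fix.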
I have a similar code:
ArrayList<HashMap<String, String>> myArray = new ArrayList<HashMap<String, String>>();
for(int i=0; i<8000; i++){
HashMap <String, String> hashMap = new HashMap<String, String>();
hashMap.put("key1", string1);
hashMap.put("key2", string2);
myArray.add(hashMap);
}
Sometimes, on older Android devices, this code leads to an OutOfMemoryError on new HashMap.
Is there a way to improve this code?
Thanks
EDIT:
I keep this structure in my Application class so that I can retrieve the array anywhere in the app and do something like this:
ArrayList<String> allObj1 = new ArrayList<String>();
ArrayList<String> allObj2 = new ArrayList<String>();
for (int i = 0; i<myArray.size(); i++) {
String obj1 = myArray.get(i).get("key1");
String obj2 = myArray.get(i).get("key2");
allObj1.add(obj1);
allObj2.add(obj2);
}
String[] stringObj1 = allObj1.toArray(new String[allObj1.size()]);
String[] stringObj2 = allObj2.toArray(new String[allObj2.size()]);
list.setAdapter(new Adapter(this, stringObj1, stringObj2));
You can improve your code a bit. I'll give two solutions; the first is better, but if you can't use it, use the second one.
In both solutions, use a constructor with an initial capacity.
Use SparseArray if you can change your keys to int values:
ArrayList<SparseArray<String>> myArray = new ArrayList<>(8000);
for(int i=0; i<8000; i++) {
SparseArray<String> sp = new SparseArray<>(2);
sp.put(1, string1);
sp.put(2, string2);
myArray.add(sp);
}
Use ArrayMap instead:
ArrayList<ArrayMap<String, String>> myArray = new ArrayList<>(8000);
for (int i = 0; i < 8000; i++) {
ArrayMap<String, String> am = new ArrayMap<>(2);
am.put("key1", string1);
am.put("key2", string2);
myArray.add(am);
}
my answer is coming from here
Why not create an object that holds your properties,
like this:
class A{
String key1;
String key2;
}
ArrayList<A> myArray = new ArrayList<A>();
for(int i=0; i<8000; i++) {
A a=new A();
a.key1=string1;
a.key2=string2;
myArray.add(a);
}
What I'm trying to say is that a HashMap carries a per-object overhead that can be reduced by using a plain object instead.
Your code should be changed to:
String[] stringObj1 = new String[myArray.size()];
String[] stringObj2 = new String[myArray.size()];
for (int i = 0; i < myArray.size(); i++) {
stringObj1[i] = myArray.get(i).get("key1");
stringObj2[i] = myArray.get(i).get("key2");
}
list.setAdapter(new Adapter(this, stringObj1, stringObj2));
This prevents the intermediate lists and saves memory. The copy operation does not start if the memory for the two arrays is not available.
myArray does not seem to be an array but a list. :o
What you have done is good enough. The only concern is that the device doesn't have enough memory to allocate a new HashMap object every time.
When you start the JVM, you define how much RAM it can use for processing. The JVM divides this into several memory areas for its own purposes; two of those are the stack and the heap.
OutOfMemoryError is related to the heap. If you keep large objects, or many referenced objects, in memory, you will see an OutOfMemoryError. If you hold strong references to objects, the GC can't reclaim the memory allocated for them. When the JVM tries to allocate memory for a new object and not enough space is available, it throws OutOfMemoryError because it can't allocate the required amount of memory.
How to avoid: make sure unnecessary objects are eligible for GC.
StackOverflowError is related to the stack. All your local variables and method-call data live on the stack. For every method call, one stack frame is created, and the local variables and call data are placed inside that frame. Once the method finishes, its stack frame is removed. One way to reproduce the error is unbounded recursion (a method that keeps calling itself): you will see a StackOverflowError, because a new frame is pushed for every call but none is ever removed.
How to avoid: make sure method calls terminate (no infinite recursion).
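The stack case is easy to reproduce in isolation. The sketch below (class and method names are made up for illustration) calls a method with no base case and counts how many frames were pushed before the JVM gave up; the exact number depends on the JVM and its stack size setting, so no particular value should be expected.

```java
// Illustrative only: the names are made up for this sketch.
class StackDepthDemo {
    static int depth = 0;

    // Recursion with no base case: every call pushes one more stack frame.
    static void recurse() {
        depth++;
        recurse();
    }

    // Returns how many calls were made before the stack was exhausted.
    static int measure() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // By the time we get here the frames have been unwound again.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Calls before StackOverflowError: " + measure());
    }
}
```

Catching StackOverflowError is done here only to report the depth; in real code the fix is to make the recursion terminate.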
ArrayList<HashMap<String, String>> myArray = new ArrayList<HashMap<String, String>>();
HashMap <String, String> hashMap = new HashMap<String, String>();
for(int i=0; i<8000; i++) {
hashMap.put("key1", string1);
hashMap.put("key2", string2);
myArray.add(hashMap);
}
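Declare hashMap outside the for loop like this, so that you are not creating a new object in the memory pool on every iteration. Be aware that the list then holds 8000 references to the same map, so this is only valid if every entry is meant to contain identical values.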
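To see what sharing one map across iterations does in practice, here is a small self-contained check. The class and method names are made up for illustration, and the values are varied per iteration so the effect is visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: shows the effect of adding one map instance many times.
class SharedMapDemo {
    static List<Map<String, String>> fillWithSharedMap(int n) {
        List<Map<String, String>> myArray = new ArrayList<>(n);
        Map<String, String> hashMap = new HashMap<>();
        for (int i = 0; i < n; i++) {
            hashMap.put("key1", "value" + i); // overwrites the previous value
            myArray.add(hashMap);             // adds another reference to the same map
        }
        return myArray;
    }

    public static void main(String[] args) {
        List<Map<String, String>> list = fillWithSharedMap(3);
        System.out.println(list.get(0).get("key1"));    // value2
        System.out.println(list.get(0) == list.get(2)); // true
    }
}
```

Every element of the list reports the last value written, which is the same symptom described for the shared ArrayList in the first question.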
I have a number of repetitions of a task I would like to put in a for loop. I have to store a time series object as an IExchangeItem, a special class in openDA (a data assimilation software).
This is one of the tasks (that works):
HashMap<String, TimeSeries> items = new LinkedHashMap<String, TimeSeries>();
...
TimeSeries tsc1Q = new TimeSeries(time,value);
id = "Q1";
tsc1Q.setId(id);
this.items.put(id,tsc1Q);
IExchangeItem c1Q = new TimeSeries(tsc1Q);
What changes across the tasks is the id of the time series object and the name of IExchangeItem. I have to create a new IExchangeItem object for each time series.
This is what I tried in the for loop:
HashMap<String, TimeSeries> items = new LinkedHashMap<String, TimeSeries>();
...
TimeSeries temp;
for (int i = 0; i<readDataDim[0]; i++) {
value[0] = values[i];
id = exchangeItemIDs[i];
temp = new TimeSeries(time,value);
temp.setId(id);
this.items.put(id,temp);
IExchangeItem <??> = new TimeSeries(temp); //* How can I handle this line?
}
I know I cannot use dynamic variable names in Java and that arrays, lists, or maps are commonly used to work around this issue (this is why I used <??> in the code snippet above). However, I'm a relative beginner with Java and I have no clue how to work around this specific problem, since I need a new IExchangeItem object for each time series.
From here I take it that my IExchangeItem created in the for loop will not be accessible outside the for loop so how can I initialise n replicates of IExchangeItem outside the for loop?
Edit:
Does a HashMap create n instances of IExchangeItem if I try something like this?
HashMap<String,IExchangeItem> list = new LinkedHashMap<String,IExchangeItem>();
Just one suggestion: write a separate method to which you pass the size of the array (or a fixed number based on the array). In it, create a HashMap and add that many instances with their keys and values. I cannot post this as a comment, hence I am posting it as an answer.
Try to create a new method that takes the value of readDataDim[0]:
public Map<String, IExchangeItem> createAndInitialzeMap(int maxValue) {
Map<String, IExchangeItem> map = new HashMap<>();
String temp = "tempName";
for(int i =0; i < maxValue ; i ++ ) {
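// IExchangeItem is used as an interface in the question, so substitute a concrete class here (the question uses TimeSeries)
map.put(temp + i, new IExchangeItem());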
}
return map;
}
This way you can initialize your map together with the names used as keys, and you can use it anywhere in your app. However, I would consider refactoring such code if time permits.
One more thing: you should read about HashMap. :) :)
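On the question in the edit: a map does not create instances by itself. It only stores references to objects that the loop creates. The sketch below shows the pattern with stand-in types, since the openDA classes are not reproduced here: ExchangeItem, Series and ExchangeItemRegistry are hypothetical names, and Series only mimics the copy constructor used in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-ins for the openDA types; these are NOT the real openDA API.
interface ExchangeItem {
    String getId();
}

class Series implements ExchangeItem {
    private final String id;
    private final double value;

    Series(String id, double value) {
        this.id = id;
        this.value = value;
    }

    // Copy constructor, mirroring "new TimeSeries(temp)" in the question.
    Series(Series other) {
        this(other.id, other.value);
    }

    public String getId() {
        return id;
    }

    double getValue() {
        return value;
    }
}

class ExchangeItemRegistry {
    static Map<String, ExchangeItem> build(String[] ids, double[] values) {
        // Declared before the loop, so it is still in scope after the loop ends.
        Map<String, ExchangeItem> exchangeItems = new LinkedHashMap<>();
        for (int i = 0; i < ids.length; i++) {
            Series temp = new Series(ids[i], values[i]);
            // One new object per iteration; the id takes the place of a variable name.
            exchangeItems.put(ids[i], new Series(temp));
        }
        return exchangeItems;
    }

    public static void main(String[] args) {
        Map<String, ExchangeItem> items =
                build(new String[] {"Q1", "Q2"}, new double[] {1.5, 2.5});
        System.out.println(items.get("Q2").getId()); // Q2
    }
}
```

After the loop, items.get("Q1") plays the role that the variable c1Q had in the hand-written version.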
Hi all, please help me with this scenario: I have multiple files such as aaa.txt, bbb.txt and ccc.txt, with data as follows.
aaa.txt:
100110,StringA,22
200110,StringB,2
300110,StringC, 12
400110,StringD,34
500110,StringE,423
bbb.txt as:
100110,StringA,20.1
200110,StringB,2.1
300110,StringC, 12.2
400110,StringD,3.2
500110,StringE,42.1
and ccc.txt as:
100110,StringA,2.1
200110,StringB,2.1
300110,StringC, 11
400110,StringD,3.2
500110,StringE,4.1
Now I have to read all three files (huge files) and report the result as
100110: (22, 20.1, 2.1).
The issue is the size of the files and how to achieve this in an optimized way.
I assume you have some sort of code to handle reading the files line by line, so I'll pseudocode a scanner that can keep pulling lines.
The easiest way to handle this would be to use a Map. In this case, I'll just use a HashMap.
HashMap<String, String[]> map = new HashMap<>();
while (aaa.hasNextLine()) {
String[] lineContents = aaa.nextLine().split(",");
String[] array = new String[3];
array[0] = lineContents[2].trim();
map.put(lineContents[0], array);
}
while (bbb.hasNextLine()) {
String[] lineContents = bbb.nextLine().split(",");
String[] array = map.get(lineContents[0]);
if (array != null) {
array[1] = lineContents[2].trim(); // the array is already stored in the map, so no second put is needed
} else {
array = new String[3];
array[1] = lineContents[2].trim();
map.put(lineContents[0], array);
}
}
// same for c, with a new index of 2
To make this thread-safe, you would probably use one of these maps.
Then you'd create 3 threads that just read and put.
Unless you are doing a lot of processing on loading these files, or are reading a lot of smaller files, it might work better as a sequential operation.
If your files are all ordered, simply maintain an array of Scanners pointing to your files, read the lines one by one, and write the result to an output file as you go.
Doing so, you only keep as many lines in memory as there are files. It is both time and memory efficient.
If your files are not ordered, you can use the sort command to sort them first.
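A minimal sketch of that sequential approach follows. It assumes every file contains the same ids in the same order and that each line has the form id,name,value; it uses BufferedReader rather than Scanner, and the class name SortedFileMerger is made up for illustration. Output goes to any Appendable, so a FileWriter can be passed for real files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reads one line from each source per step and writes one merged record.
class SortedFileMerger {
    static void merge(List<BufferedReader> readers, Appendable out) {
        if (readers.isEmpty()) {
            return;
        }
        try {
            while (true) {
                String id = null;
                StringBuilder record = new StringBuilder();
                for (BufferedReader reader : readers) {
                    String line = reader.readLine();
                    if (line == null) {
                        return; // one source is exhausted: stop
                    }
                    String[] parts = line.split(",");
                    String lineId = parts[0].trim();
                    if (id == null) {
                        id = lineId;
                        record.append(id).append(": (");
                    } else if (!id.equals(lineId)) {
                        // The "same ids, same order" assumption does not hold.
                        throw new IllegalStateException("Sources out of step at id " + lineId);
                    } else {
                        record.append(", ");
                    }
                    record.append(parts[2].trim());
                }
                // Only one line per source is held in memory at this point.
                out.append(record).append(")\n");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<BufferedReader> readers = Arrays.asList(
                new BufferedReader(new StringReader("100110,StringA,22\n")),
                new BufferedReader(new StringReader("100110,StringA,20.1\n")),
                new BufferedReader(new StringReader("100110,StringA,2.1\n")));
        StringBuilder out = new StringBuilder();
        merge(readers, out);
        System.out.print(out); // 100110: (22, 20.1, 2.1)
    }
}
```

If ids can be missing from some files, the loop would need a proper merge step that compares ids and advances only the readers that are behind; that case is not handled here.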
So I am implementing a mapreduce job which means I am dealing with key value pairs.
I have the variable
Iterable<FreqDataWritable> values
FreqDataWritable is an object that contains pieces of information, but for now I am only concerned with one piece of information it holds which is a String which is accessed by getFilename().
I have the following loop:
ArrayList<String> filenames = new ArrayList<String>();
for(FreqDataWritable i : values) {
filenames.add(i.getFilename());
}
Now all I want to do is print the values in the array list filenames.
for(int i = 0; i < filenames.size(); i++) {
System.out.println(filenames.get(i));
}
However when I do this everything in filenames is the same. The only thing printed out is a single filename printed multiple times.
My original code is more complex than this, but I simplified it for help. Anyone know how to fix this?
Thanks
I figured it out. Hadoop reuses objects in an unusual way, so when I iterated over the values the first time it was just adding the same object over and over again to the ArrayList.
Instead I need to do this:
for(FreqDataWritable i : values) {
filenames.add(new String(i.getFilename()));
}
for(String filename : filenames) {
System.out.println(filename);
}
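The effect can be imitated without Hadoop. The sketch below is a simulation only: Holder and ReusingIterable are made-up classes whose iterator hands back one shared object and refills it on every step, which is the kind of reuse described above. Keeping references to the yielded object gives repeated values, while copying the value out during iteration does not.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simulation only, not Hadoop code. All names here are hypothetical.
class Holder {
    String filename;
}

class ReusingIterable implements Iterable<Holder> {
    private final String[] names;

    ReusingIterable(String... names) {
        this.names = names;
    }

    @Override
    public Iterator<Holder> iterator() {
        final Holder shared = new Holder(); // one instance, refilled on each next()
        return new Iterator<Holder>() {
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < names.length;
            }

            @Override
            public Holder next() {
                shared.filename = names[pos++];
                return shared;
            }
        };
    }
}

class ReuseDemo {
    // Keeps the yielded objects themselves: every entry ends up being the same object.
    static List<String> viaReferences(Iterable<Holder> values) {
        List<Holder> kept = new ArrayList<>();
        for (Holder h : values) {
            kept.add(h);
        }
        List<String> result = new ArrayList<>();
        for (Holder h : kept) {
            result.add(h.filename);
        }
        return result;
    }

    // Copies the value out while iterating: each entry keeps its own value.
    static List<String> viaCopies(Iterable<Holder> values) {
        List<String> result = new ArrayList<>();
        for (Holder h : values) {
            result.add(h.filename);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(viaReferences(new ReusingIterable("a.txt", "b.txt"))); // [b.txt, b.txt]
        System.out.println(viaCopies(new ReusingIterable("a.txt", "b.txt")));     // [a.txt, b.txt]
    }
}
```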
Let me know if this helps.
Have you tried an iterator-based method?
Iterator<FreqDataWritable> it = values.iterator();
if (it.hasNext()) {
filenames.add(it.next().getFilename());
}
while (it.hasNext()) {
// call next() only once per pass and reuse the result
String name = it.next().getFilename();
// skip a name that repeats the previous entry
if (!name.equals(filenames.get(filenames.size() - 1))) {
filenames.add(name);
}
}
I'm trying to read a CSV file into a list of lists (of strings), pass it around for getting some data from a database, build a new list of lists of new data, then pass that list of lists so it can be written to a new CSV file. I've looked all over, and I can't seem to find an example on how to do it.
I'd rather not use simple arrays since the files will vary in size and I won't know what to use for the dimensions of the arrays. I have no issues dealing with the files. I'm just not sure how to deal with the list of lists.
Most of the examples I've found will create multi-dimensional arrays or perform actions inside the loop that's reading the data from the file. I know I can do that, but I want to write object-oriented code. If you could provide some example code or point me to a reference, that would be great.
ArrayList<ArrayList<String>> listOLists = new ArrayList<ArrayList<String>>();
ArrayList<String> singleList = new ArrayList<String>();
singleList.add("hello");
singleList.add("world");
listOLists.add(singleList);
ArrayList<String> anotherList = new ArrayList<String>();
anotherList.add("this is another list");
listOLists.add(anotherList);
Here's an example that reads a list of CSV strings into a list of lists and then loops through that list of lists and prints the CSV strings back out to the console.
import java.util.ArrayList;
import java.util.List;
public class ListExample
{
public static void main(final String[] args)
{
//sample CSV strings...pretend they came from a file
String[] csvStrings = new String[] {
"abc,def,ghi,jkl,mno",
"pqr,stu,vwx,yz",
"123,345,678,90"
};
List<List<String>> csvList = new ArrayList<List<String>>();
//pretend you're looping through lines in a file here
for(String line : csvStrings)
{
String[] linePieces = line.split(",");
List<String> csvPieces = new ArrayList<String>(linePieces.length);
for(String piece : linePieces)
{
csvPieces.add(piece);
}
csvList.add(csvPieces);
}
//write the CSV back out to the console
for(List<String> csv : csvList)
{
//dumb logic to place the commas correctly
if(!csv.isEmpty())
{
System.out.print(csv.get(0));
for(int i=1; i < csv.size(); i++)
{
System.out.print("," + csv.get(i));
}
}
System.out.print("\n");
}
}
}
Pretty straightforward I think. Just a couple points to notice:
I recommend using "List" instead of "ArrayList" on the left side when creating list objects. It's better to pass around the interface "List" because then if later you need to change to using something like Vector (e.g. you now need synchronized lists), you only need to change the line with the "new" statement. No matter what implementation of list you use, e.g. Vector or ArrayList, you still always just pass around List<String>.
In the ArrayList constructor, you can leave the list empty and it will default to a certain size and then grow dynamically as needed. But if you know how big your list might be, you can sometimes save some performance. For instance, if you knew there were always going to be 500 lines in your file, then you could do:
List<List<String>> csvList = new ArrayList<List<String>>(500);
That way you would never waste processing time waiting for your list to grow dynamically. This is why I pass "linePieces.length" to the constructor. Not usually a big deal, but helpful sometimes.
Hope that helps!
If you really want to handle CSV files properly in Java, it's not a good idea to implement a CSV reader/writer yourself. Check out the library below.
http://opencsv.sourceforge.net/
When your CSV document includes double quotes or newlines, you will run into difficulties.
To learn the object-oriented approach, looking at other implementations (in Java) will help you. Also, I don't think managing one row as a List is a good approach: CSV doesn't allow rows with differing numbers of columns.
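As a quick illustration of the quoting problem, here is what a plain split on commas does to a field that itself contains a comma. The class name and the sample line are made up for this sketch.

```java
import java.util.Arrays;

// Illustrative only: shows why String.split(",") is not a CSV parser.
class NaiveSplitDemo {
    static String[] naiveSplit(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // Two logical fields: "Smith, John" (quoted) and 42.
        String line = "\"Smith, John\",42";
        String[] pieces = naiveSplit(line);
        System.out.println(pieces.length);           // 3
        System.out.println(Arrays.toString(pieces)); // ["Smith,  John", 42]
    }
}
```

The quoted field is cut in two and the quote characters are left in the data, which is the kind of case a dedicated CSV library handles for you.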
The example provided by @tster shows how to create a list of lists. I will provide an example of iterating over such a list.
Iterator<List<String>> iter = listOlist.iterator();
while(iter.hasNext()){
Iterator<String> siter = iter.next().iterator();
while(siter.hasNext()){
String s = siter.next();
System.out.println(s);
}
}
Something like this would work for reading:
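import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
// the following goes inside a method
String filename = "something.csv";
List<List<String>> csvData = new ArrayList<List<String>>();
// try-with-resources closes the reader even if reading fails
try (BufferedReader input = new BufferedReader(new FileReader(filename)))
{
String line;
while ((line = input.readLine()) != null)
{
String[] data = line.split(",");
// Arrays.asList wraps the array in a fixed-size list
csvData.add(Arrays.asList(data));
}
}
catch (IOException ex)
{
ex.printStackTrace();
}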
I'd second what xrath said - you're better off using an existing library to handle reading / writing CSV.
If you do plan on rolling your own framework, I'd also suggest not using List<List<String>> as your implementation. You'd probably be better off implementing CSVDocument and CSVRow classes (which may internally use a List<CSVRow> or List<String> respectively), while exposing only an immutable List or an array to users.
Simply using List<List<String>> leaves too many unchecked edge cases and relies on implementation details. For example, are headers stored separately from the data, or are they in the first row of the List<List<String>>? What if I want to access data by column header from the row rather than by index?
What happens when you call things like:
// reads CSV data, 5 rows, 5 columns
List<List<String>> csvData = readCSVData();
csvData.get(1).add("extraDataAfterColumn");
// now row 1 has a value in (nonexistent) column 6
csvData.get(2).remove(3);
// values in columns 4 and 5 moved to columns 3 and 4,
// attempting to access column 5 now throws an IndexOutOfBoundsException.
You could attempt to validate all this when writing out the CSV file, and this may work in some cases... but in others, you'll be alerting the user of an exception far away from where the erroneous change was made, resulting in difficult debugging.
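One possible shape for such classes is sketched below. It is an illustration of the idea rather than a complete design: the names CsvRow and CsvDocument, the choice to store headers separately, and the decision to reject rows of the wrong width at insertion time are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical row type: a defensive copy wrapped as an unmodifiable list.
final class CsvRow {
    private final List<String> values;

    CsvRow(List<String> values) {
        this.values = Collections.unmodifiableList(new ArrayList<>(values));
    }

    String get(int column) {
        return values.get(column);
    }

    List<String> values() {
        return values; // callers cannot add or remove columns
    }
}

// Hypothetical document type: headers are kept apart from the data rows.
final class CsvDocument {
    private final List<String> headers;
    private final List<CsvRow> rows = new ArrayList<>();

    CsvDocument(List<String> headers) {
        this.headers = Collections.unmodifiableList(new ArrayList<>(headers));
    }

    void addRow(List<String> values) {
        // Fail at the point of the bad change, not later when writing the file.
        if (values.size() != headers.size()) {
            throw new IllegalArgumentException(
                    "Expected " + headers.size() + " columns but got " + values.size());
        }
        rows.add(new CsvRow(values));
    }

    String get(int row, String header) {
        int column = headers.indexOf(header);
        if (column < 0) {
            throw new IllegalArgumentException("Unknown column: " + header);
        }
        return rows.get(row).get(column);
    }

    List<CsvRow> rows() {
        return Collections.unmodifiableList(rows);
    }

    public static void main(String[] args) {
        CsvDocument doc = new CsvDocument(Arrays.asList("id", "name"));
        doc.addRow(Arrays.asList("1", "alpha"));
        System.out.println(doc.get(0, "name")); // alpha
    }
}
```

With this shape, the two problem calls from the snippet above fail immediately: adding to a row's list throws UnsupportedOperationException, and a row with the wrong number of columns is rejected by addRow.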
import java.util.ArrayList;
import java.util.List;
public class TEst {
public static void main(String[] args) {
List<Integer> ls=new ArrayList<>();
ls.add(1);
ls.add(2);
List<Integer> ls1=new ArrayList<>();
ls1.add(3);
ls1.add(4);
List<List<Integer>> ls2=new ArrayList<>();
ls2.add(ls);
ls2.add(ls1);
List<List<List<Integer>>> ls3=new ArrayList<>();
ls3.add(ls2);
methodRecursion(ls3);
}
private static void methodRecursion(List ls3) {
for(Object ls4:ls3)
{
if(ls4 instanceof List)
{
methodRecursion((List)ls4);
}else {
System.out.print(ls4);
}
}
}
}
Also, this is an example of how to print a List of Lists using the enhanced for loop:
public static void main(String[] args){
int[] a={1,3, 7, 8, 3, 9, 2, 4, 10};
List<List<Integer>> triplets;
triplets=sumOfThreeNaive(a, 13);
for (List<Integer> list : triplets){
for (int triplet: list){
System.out.print(triplet+" ");
}
System.out.println();
}
}
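The helper sumOfThreeNaive is not shown in the post. Purely as a guess at what it might do, the sketch below treats it as a brute-force search for every combination of three elements (by position) whose sum equals the target; the real method may differ, for example in how it treats duplicate values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical implementation: the original helper is not part of the post.
class TripletDemo {
    static List<List<Integer>> sumOfThreeNaive(int[] a, int target) {
        List<List<Integer>> triplets = new ArrayList<>();
        // Try every combination of three distinct positions i < j < k.
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                for (int k = j + 1; k < a.length; k++) {
                    if (a[i] + a[j] + a[k] == target) {
                        triplets.add(Arrays.asList(a[i], a[j], a[k]));
                    }
                }
            }
        }
        return triplets;
    }

    public static void main(String[] args) {
        // Same printing pattern as above: outer loop over lists, inner loop over elements.
        for (List<Integer> list : sumOfThreeNaive(new int[] {1, 2, 3, 4}, 6)) {
            for (int value : list) {
                System.out.print(value + " ");
            }
            System.out.println();
        }
    }
}
```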