I am reading through a CSV and saving the data to objects (an object is created for each row). The rows in the CSV are grouped by the first element (group number) - somewhere between 2-10ish rows share a group number. There are ~180 groups in the data set. To handle this data more easily, I store the data into HashMaps, where the key is the group number, and the value tied to the key is an ArrayList of the data objects.
As I iterate through the CSV's rows, I add objects to the HashMap, using the row's group number to tell where to put the new data object. If the object has a group number which has not been entered into the HashMap yet, it creates a new key (its group number) and an ArrayList of data objects, containing just itself.
If the row's group number IS a key in the HashMap, it gets the ArrayList tied to the group number, adds the new data object to it, and uses the put function to re-add the new entry, with the updated ArrayList (now with one more data entry tied to the shared group number).
Code example:
ArrayList<CSVData> csvListNew = new ArrayList<CSVData>();
HashMap<Integer,ArrayList<CSVData>> CSVDataMapNew = new HashMap<Integer,ArrayList<CSVData>>();
while ((line = reader.readLine()) != null && !(line.contains(",,,,,,,,,")))
{
System.out.println(line);
String[] csvDataNew = line.split(",");
String currentGroup = csvDataNew[GroupIndex];
try {
currentGroupNumber = Integer.parseInt(currentGroup.replace("group", "").replace(" ", ""));
} catch (Exception ex) {
currentGroupNumber = previousGroupNumber;
}
String path = csvDataNew[PathIndex];
startLine = Integer.parseInt(csvDataNew[StartLineIndex]);
endLine = Integer.parseInt(csvDataNew[EndLineIndex]);
CSVData data = new CSVData(currentGroupNumber, path, startLine, endLine);
if (CSVDataMapNew.containsKey(currentGroupNumber)) { //if it does contain the current key, add the current object to the ArrayList tied to it.
csvListNew = CSVDataMapNew.get(currentGroupNumber);
csvListNew.add(clone);
CSVDataMapNew.put(currentGroupNumber, csvListNew);
} else { //if it doesnt contain the current key, make new entry
csvListNew.add(clone);
CSVDataMapNew.put(currentGroupNumber, csvListNew);
System.out.println(CSVDataMapNew.size());
System.out.println(CSVDataMapNew.get(currentGroupNumber).size());
}
csvListNew.clear(); //to make sure no excess objects are entered into the map.
previousGroupNumber = currentGroupNumber;
}
There are appropriate try-catches, etc. and the CSVDataTable is declared in its own class, being referenced statically.
The issue is, when I add in print statements at each step, it's like each ArrayList within the HashMap gets erased at the end of every loop iteration. So once the CSV is finished being iterated through, it has each key value, but the ArrayLists tied to each key are all empty. (Evidenced by looping through the HashMap afterwards).
How can I resolve this, so when I enter a value into the ArrayList and re 'put' the key and updated ArrayList into the Map, it keeps its data?
"So once the CSV is finished being iterated through, it has each key value, but the ArrayLists tied to each key are all empty."
This
ArrayList<CSVData> csvListNew = new ArrayList<CSVData>();
should be executed once for each key of your map, so that every key gets its own list.
But you use a single ArrayList instance as the value for every key of your map.
And at the end of each loop iteration, you do:
csvListNew.clear();
So every value in your map is an empty ArrayList, because they all refer to the same ArrayList.
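The effect can be reproduced in isolation. The following is a minimal sketch, not the original program: the class name SharedListDemo and the String values are made up for illustration, standing in for the CSVData objects.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SharedListDemo {

    // Mimics the original loop: ONE list instance is reused for every key.
    static Map<Integer, List<String>> buildWithSharedList() {
        Map<Integer, List<String>> map = new HashMap<>();
        List<String> shared = new ArrayList<>();
        for (int group = 1; group <= 3; group++) {
            shared.add("row of group " + group);
            map.put(group, shared); // stores a reference to the list, not a copy
            shared.clear();         // empties the one list every key points to
        }
        return map;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> map = buildWithSharedList();
        System.out.println("keys: " + map.size());
        System.out.println("list for key 1 empty? " + map.get(1).isEmpty());
        System.out.println("same instance for keys 1 and 3? " + (map.get(1) == map.get(3)));
    }
}
```

All three keys exist afterwards, but they share one empty list, which matches the symptom described in the question.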
To solve your problem, if the key doesn't exist in the map, you should create a new ArrayList and associate it with this key:
ArrayList<CSVData> csvListNew = CSVDataMapNew.get(currentGroupNumber);
if (csvListNew == null) {
    csvListNew = new ArrayList<CSVData>();
    CSVDataMapNew.put(currentGroupNumber, csvListNew);
}
Then reuse the csvListNew variable to add the element:
csvListNew.add(clone);
It simplifies your current code, which has undesirable duplication.
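On Java 8 or later, the same get-or-create step can also be sketched with Map.computeIfAbsent, which is a different technique from the explicit null check. The class GroupingSketch and the String[] rows are assumptions made for this illustration and stand in for the CSV parsing and the CSVData class.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupingSketch {

    // Each row is {groupNumber, path}: a simplified stand-in for CSVData.
    static Map<Integer, List<String>> group(String[][] rows) {
        Map<Integer, List<String>> byGroup = new HashMap<>();
        for (String[] row : rows) {
            int groupNumber = Integer.parseInt(row[0]);
            // Creates a fresh list the first time a key is seen, and in
            // every case returns the list stored for that key.
            byGroup.computeIfAbsent(groupNumber, k -> new ArrayList<>()).add(row[1]);
        }
        return byGroup;
    }

    public static void main(String[] args) {
        String[][] rows = { {"1", "a.java"}, {"1", "b.java"}, {"2", "c.java"} };
        System.out.println(group(rows));
    }
}
```

Because each key gets its own list and nothing is cleared, there is no need to put the list back into the map after adding to it.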
You always put the same ArrayList instance as value in your HashMap. That's the ArrayList instance created before the loop and referenced by the csvListNew variable.
This means that when you call csvListNew.clear(), you clear all the ArrayLists of your HashMap.
This can be fixed by creating a new ArrayList each time you want to put a new entry in your HashMap:
if (CSVDataMapNew.containsKey(currentGroupNumber)) {
csvListNew = CSVDataMapNew.get(currentGroupNumber);
csvListNew.add(clone);
} else {
csvListNew = new ArrayList<>(); // that's the main required fix
csvListNew.add(clone);
CSVDataMapNew.put(currentGroupNumber, csvListNew);
System.out.println(CSVDataMapNew.size());
System.out.println(CSVDataMapNew.get(currentGroupNumber).size());
}
In addition, remove the csvListNew.clear() call.
When you get a list from a HashMap, you get a reference to the list. Everything you do with this list afterwards will affect the list that is in the map. This means two things:
You don't have to put the List back into the map after you added something to it
You have to create a new List for every Map entry. You currently don't do that.
This should fix it (also some adjusted code style):
Map<Integer,List<CSVData>> CSVDataMapNew = new HashMap<>();
while ((line = reader.readLine()) != null && !(line.contains(",,,,,,,,,")))
{
System.out.println(line);
String[] csvDataNew = line.split(",");
String currentGroup = csvDataNew[GroupIndex];
try {
currentGroupNumber = Integer.parseInt(currentGroup.replace("group", "").replace(" ", ""));
} catch (Exception ex) {
currentGroupNumber = previousGroupNumber;
}
String path = csvDataNew[PathIndex];
startLine = Integer.parseInt(csvDataNew[StartLineIndex]);
endLine = Integer.parseInt(csvDataNew[EndLineIndex]);
CSVData data = new CSVData(currentGroupNumber, path, startLine, endLine);
if (CSVDataMapNew.containsKey(currentGroupNumber)) {
CSVDataMapNew.get(currentGroupNumber).add(clone);
} else {
ArrayList<CSVData> csvListNew = new ArrayList<CSVData>();
CSVDataMapNew.put(currentGroupNumber, csvListNew);
csvListNew.add(clone);
}
previousGroupNumber = currentGroupNumber;
}
public class JavaApplication13 {
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
// TODO code application logic here
BufferedReader br;
String strLine;
ArrayList<String> arr =new ArrayList<>();
HashMap<Integer,ArrayList<String>> hm = new HashMap<>();
try {
br = new BufferedReader( new FileReader("words.txt"));
while( (strLine = br.readLine()) != null){
arr.add(strLine);
}
} catch (FileNotFoundException e) {
System.err.println("Unable to find the file: fileName");
} catch (IOException e) {
System.err.println("Unable to read the file: fileName");
}
ArrayList<Integer> lengths = new ArrayList<>(); //List to keep lengths information
System.out.println("Total Words: "+arr.size()); //Total words read from file
int i=0;
while(i<arr.size()) //this loop iterates over all the words read from words.txt, which are now stored in arr
{
boolean already=false;
String s = arr.get(i);
//following for loop will check if that length is already in lengths list.
for(int x=0;x<lengths.size();x++)
{
if(s.length()==lengths.get(x))
already=true;
}
//already == true means the map already holds an ArrayList for the current string length
if(already==true)
{
hm.get(s.length()).add(s); //adding that string according to its length in hm(hashmap)
}
else
{
hm.put(s.length(),new ArrayList<>()); //create a new list for this length in hm, then add the string to it
hm.get(s.length()).add(s);
lengths.add(s.length());
}
i++;
}
//Now Print the whole map
for(int q=0;q<hm.size();q++)
{
System.out.println(hm.get(q));
}
}
}
Is this approach right?
Explanation:
Load all the words into an ArrayList.
Then iterate over that list, check the length of each word, and add the word to an ArrayList of strings of that length. These ArrayLists are stored in a HashMap keyed by word length.
Firstly, your code only works for files that contain one word per line, because you're processing whole lines as words. To make your code more general, split each line into words:
String[] words = strLine.split("\\s+");
Secondly, you don't need any temporary data structures. You can add your words to the map right after you read each line from the file. The arr and lengths lists are actually useless here, as they carry no logic beyond temporary storage. You're using the lengths list just to store the lengths which have already been added to the hm map. The same can be achieved by invoking hm.containsKey(s.length()).
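Both points can be combined in one loop. This is a minimal sketch, assuming whitespace-separated words. The class name WordsByLength, the Reader parameter (so the method accepts a file or an in-memory string), and the wrapping of IOException are choices made for this illustration only.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordsByLength {

    static Map<Integer, List<String>> groupByLength(Reader source) {
        Map<Integer, List<String>> hm = new HashMap<>();
        try (BufferedReader br = new BufferedReader(source)) {
            String strLine;
            while ((strLine = br.readLine()) != null) {
                for (String s : strLine.trim().split("\\s+")) {
                    if (s.isEmpty()) {
                        continue; // a blank line yields a single empty token
                    }
                    // containsKey replaces the separate lengths list
                    if (!hm.containsKey(s.length())) {
                        hm.put(s.length(), new ArrayList<>());
                    }
                    hm.get(s.length()).add(s);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hm;
    }

    public static void main(String[] args) {
        // For a real file, pass new java.io.FileReader("words.txt") instead.
        System.out.println(groupByLength(new StringReader("a bc de\nfgh i")));
    }
}
```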
And an additional comment on your code:
for(int x=0;x<lengths.size();x++) {
if(s.length()==lengths.get(x))
already=true;
}
When you have a loop like this, where you only need to find out whether some condition is true for any element, you don't need to keep looping once the condition has been met. You should use the break keyword inside your if statement to terminate the loop, e.g.
for(int x=0;x<lengths.size();x++) {
    if(s.length()==lengths.get(x)) {
        already=true;
        break; // this will terminate the loop after setting the flag to true
    }
}
But as I already mentioned you don't need it at all. That is just for educational purposes.
Your approach is long, confusing, hard to debug and from what I see it's not good performance-wise (check out the contains method). Check this:
String[] words = {"a", "ab", "ad", "abc", "af", "b", "dsadsa", "c", "ghh", "po"};
Map<Integer, List<String>> groupByLength =
Arrays.stream(words).collect(Collectors.groupingBy(String::length));
System.out.println(groupByLength);
This is just an example, but you get the point. I have an array of words, and then I use streams and Java8 magic to group them in a map by length (exactly what you're trying to do). You get the stream, then collect it to a map, grouping by length of the words, so it's gonna put every 1 letter word in a list under key 1 etc.
You can use the same approach, but you have your words in a list so remember to not use Arrays.stream() but just .stream() on your list.
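For the list case, a minimal sketch could look like this. The class name GroupListByLength and the sample words are only illustrative, and arr stands for the list of words from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupListByLength {

    // Same grouping as in the array example, but starting from a List.
    static Map<Integer, List<String>> groupByLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> arr = Arrays.asList("a", "ab", "ad", "abc");
        System.out.println(groupByLength(arr));
    }
}
```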
I have a very long string containing GPS data but this is not important. What I need to do is separate the string which is in an arraylist (one big string) into multiple pieces.
The tricky part is that the string is made up of multiple 'GPS sentences' and I only require two types of these sentences.
The types I need start with $GPGSV and $GPGGA. Basically I need to dump ONLY THESE sentences into another arraylist while leaving all the rest behind.
The new arraylist must be in line-by-line form so that each sentence is followed by a new line.
Each sentence also ends in one white space which could be helpful when splitting up. The arraylist data is shown below. - This is printed from the arraylist.
[$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A,
$GPRMC,151018.000,A,5225.9627,N,00401.1624,W,0.11,104.71,210214,,*14,
$GPGGA,151019.000,5225.9627,N,00401.1624,W,1,09,1.0,38.9,M,51.1,M,,0000*72,
$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A,
$GPGSV,3,1,12,26,80,302,44,09,55,063,40,05,53,191,39,08,51,059,37*79,
$GPGSV,3,2,12,28,43,112,34,15,40,284,42,21,18,305,33,07,18,057,27*7E,
$GPGSV,3,3,12,10,05,153,,24,05,234,38,18,05,318,22,19,05,035,*79,
$GPRMC,151019.000,A,5225.9627,N,00401.1624,W,0.10,105.97,210214,,*1D,
$GPGGA,151020.000,5225.9627,N,00401.1624,W,1,09,1.0,38.9,M,51.1,M,,0000*78,
$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A,
$GPRMC,151020.000,A,5225.9627,N,00401.1624,W,0.12,105.18,210214,,*12,
$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A,
$GPRMC,151021.000,A,5225.9626,N,00401.1624,W,0.11,99.26,210214,,*28,
$GPGGA,151022.000,5225.9626,N,00401.1623,W,1,09,1.0,38.9,M,51.1,M,,0000*7C,
$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A,
$GPRMC,151022.000,A,5225.9626,N,00401.1623,W,0.11,109.69,210214,,*1F,
The data continues up to 2000 sentences.
Any help would be great. Thanks
EDITS ------
Looking back at what I have.. It may be best if I just read in the lines (as the file is formatted to be one sentence per line) which start with either the GSV or the GGA tag. In the buffered reader section of the method, how could I go about doing that? Here is some of my code ....
try {
File gpsioFile = new File(gpsFile);
FileReader file = new FileReader(gpsFile);
BufferedReader buffer = new BufferedReader(file);
StringBuffer stringbuff = new StringBuffer();
String ans;
while ((ans = buffer.readLine()) != null) {
gps.add(ans);
stringbuff.append(ans);
stringbuff.append("\n");
}
} catch (Exception e) {
e.printStackTrace();
}
From this could I get an Arraylist with just the GGA and GSV sentences/lines but in the same order that they were from the file?
Thanks
OK, I'd first start by splitting your string into individual pieces with split():
String[] split = "$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A,".split(",");
You can also use "\n" as the split delimiter instead of "," to get one sentence per item. Either way, this gives you an array over which you can iterate:
List<String> filtered = new ArrayList<String>();
for (String item : split) {
    if (item.startsWith("$GPGSA")) {
        filtered.add(item);
    }
}
filtered will then be a new list with the items you want to keep.
This approach works with JDK 6+. In JDK 8, this kind of problem can be solved more elegantly with the stream API.
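A stream-based version might look like the sketch below. It assumes the sentences are already available as a list with one sentence per element, and it uses the $GPGSV and $GPGGA prefixes as they appear in the sample data. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class SentenceFilter {

    // Keeps only GSV and GGA sentences, preserving their original order.
    static List<String> keepGsvAndGga(List<String> sentences) {
        return sentences.stream()
                .filter(s -> s.startsWith("$GPGSV") || s.startsWith("$GPGGA"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> gps = Arrays.asList(
                "$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A",
                "$GPGGA,151019.000,5225.9627,N,00401.1624,W,1,09,1.0,38.9,M,51.1,M,,0000*72",
                "$GPGSV,3,1,12,26,80,302,44,09,55,063,40,05,53,191,39,08,51,059,37*79");
        // Prints each kept sentence on its own line.
        keepGsvAndGga(gps).forEach(System.out::println);
    }
}
```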
My understanding is that you've got an ArrayList with a single String element. That String is a comma-separated list of values. So step one is to extract the string and split it into its constituent parts. Once you've done that, you can process each item in turn.
private static List<List<String>> splitData(final ArrayList<String> data) {
final List<List<String>> filteredData = new ArrayList<List<String>>();
String fullText = data.get(0);
String[] splitData = fullText.split(",");
List<String> currentList = null;
for (int i = 0;i < splitData.length; i++) {
final String next = splitData[i];
if (startTags.contains(next)) {
if (interestingStartTags.contains(next)) {
currentList = new ArrayList<String>();
filteredData.add(currentList);
} else {
currentList = null;
}
}
if (currentList != null) {
currentList.add(next);
}
}
return filteredData;
}
The two static Set<String> provide the set of all 'gps sentence' start tags and also the set of ones you're interested in. The split data method uses startTags to determine if it has reached the start of a new sentence. If the new tag is also interesting, then a new list is created and added to the List<List<String>>. It is this list of lists that is returned.
If you don't know all of the strings you want to use as 'startTag', then you could use next.startsWith("$GP") or similar.
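The two sets are not shown in the method, so here is one way they might be declared. This is a sketch under assumptions: the tag values are taken from the sentence types visible in the question's sample data, and the class name SentenceTags and the helper isStartTag are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SentenceTags {

    // Assumed: every sentence type that appears in the sample data.
    static final Set<String> startTags = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList("$GPGSA", "$GPRMC", "$GPGGA", "$GPGSV")));

    // Assumed: the two sentence types the question wants to keep.
    static final Set<String> interestingStartTags = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList("$GPGGA", "$GPGSV")));

    // Hypothetical helper: falls back to a prefix check for unknown tags.
    static boolean isStartTag(String token) {
        return startTags.contains(token) || token.startsWith("$GP");
    }

    public static void main(String[] args) {
        System.out.println(isStartTag("$GPRMC"));
        System.out.println(isStartTag("151019.000"));
        System.out.println(interestingStartTags.contains("$GPGSV"));
    }
}
```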
Reading the file
Looking at the updated question of how to read the file you could remove the StringBuffer and instead simply add each line you read to an ArrayList. The code below will step over any lines that do not start with the two tags you are interested in. The order of the lines within lineList will match the order they are found in the file.
FileReader file = new FileReader(gpsFile);
BufferedReader buffer = new BufferedReader(file);
String ans;
ArrayList<String> lineList = new ArrayList<String>();
while ((ans = buffer.readLine()) != null) {
if (ans.startsWith("$GPGSV")||ans.startsWith("$GPGGA")) {
lineList.add(ans);
}
}
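A self-contained variant of this loop is sketched below, using try-with-resources so the reader is closed automatically. It assumes one sentence per line and the $GPGSV and $GPGGA prefixes as they appear in the sample data. The class name GpsLineReader, the Reader parameter, and the wrapping of IOException are choices made for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class GpsLineReader {

    // Returns only the GSV and GGA lines, in the order they were read.
    static List<String> readGsvAndGga(Reader source) {
        List<String> lineList = new ArrayList<>();
        try (BufferedReader buffer = new BufferedReader(source)) {
            String ans;
            while ((ans = buffer.readLine()) != null) {
                if (ans.startsWith("$GPGSV") || ans.startsWith("$GPGGA")) {
                    lineList.add(ans);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lineList;
    }

    public static void main(String[] args) {
        // For a real file, pass new java.io.FileReader(gpsFile) instead.
        String sample =
                "$GPGGA,151019.000,5225.9627,N,00401.1624,W,1,09,1.0,38.9,M,51.1,M,,0000*72\n"
              + "$GPGSA,A,3,28,09,26,15,08,05,21,24,07,,,,1.6,1.0,1.3*3A\n"
              + "$GPGSV,3,1,12,26,80,302,44,09,55,063,40,05,53,191,39,08,51,059,37*79\n";
        for (String line : readGsvAndGga(new StringReader(sample))) {
            System.out.println(line);
        }
    }
}
```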