Store associative array of strings with length as keys

I have this input:
5
it
your
reality
real
our
The first line is the number of strings that follow. I should store them this way (pseudocode):
associative_array = [ 2 => ['it'], 3 => ['our'], 4 => ['real', 'your'], 7 => ['reality']]
As you can see, the keys of the associative array are the lengths of the strings stored in the inner arrays.
So how can I do this in Java? I come from the PHP world, so a comparison with PHP would be very welcome.

MultiMap<Integer, String> m = new MultiHashMap<Integer, String>();
for(String item : originalCollection) {
m.put(item.length(), item);
}

djechlin already posted a better version, but here's a complete standalone example using just JDK classes:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
public class Main {
public static void main(String[] args) throws Exception{
BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
String firstLine = reader.readLine();
int numOfRowsToFollow = Integer.parseInt(firstLine);
Map<Integer,Set<String>> stringsByLength = new HashMap<>(numOfRowsToFollow); //worst-case size
for (int i=0; i<numOfRowsToFollow; i++) {
String line = reader.readLine();
int length = line.length();
Set<String> alreadyUnderThatLength = stringsByLength.get(length); //int boxed to Integer
if (alreadyUnderThatLength==null) {
alreadyUnderThatLength = new HashSet<>();
stringsByLength.put(length, alreadyUnderThatLength);
}
alreadyUnderThatLength.add(line);
}
System.out.println("results: "+stringsByLength);
}
}
its output looks like this:
3
bob
bart
brett
results: {4=[bart], 5=[brett], 3=[bob]}

Java doesn't have associative arrays. But it does have HashMap, which mostly accomplishes the same goal. In your case, you can have multiple values for any given key. So what you could do is make each entry in the HashMap an array or a collection of some kind. ArrayList is a likely choice. That is:
HashMap<Integer,ArrayList<String>> words=new HashMap<Integer,ArrayList<String>>();
I'm not going to go through the code to read your list from a file or whatever, that's a different question. But just to give you the idea of how the structure would work, suppose we could hard-code the list. We could do it something like this:
ArrayList<String> set=new ArrayList<String>();
set.add("it");
words.put(Integer.valueOf(2), set);
set=new ArrayList<String>(); // new list per key; clear() would also empty the list already stored under 2
set.add("your");
set.add("real");
words.put(Integer.valueOf(4), set);
Etc.
In practice, you probably would regularly be adding words to an existing set. I often do that like this:
void addWord(String word)
{
Integer key=Integer.valueOf(word.length());
ArrayList<String> set=words.get(key);
if (set==null)
{
set=new ArrayList<String>();
words.put(key,set);
}
// either way we now have a set
set.add(word);
}
Side note: I often see programmers end a block like this by putting "set" back into the HashMap, i.e. "words.put(key,set)" at the end. This is unnecessary: it's already there. When you get "set" from the HashMap, you're getting a reference, not a copy, so any updates you make are just "there"; you don't have to put it back.
Disclaimer: This code is off the top of my head. No warranties expressed or implied. I haven't written any Java in a while so I may have syntax errors or wrong function names. :-)
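On Java 8 or later, the get-or-create pattern in addWord above can be written more compactly with Map.computeIfAbsent. This is only a minimal sketch: the class name LengthIndex and the method asMap are made up for illustration, and a TreeMap is assumed so that the keys print in ascending order like the question's pseudocode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class; only the JDK collection calls are real API.
class LengthIndex {
    // TreeMap keeps the keys (string lengths) in ascending order.
    private final Map<Integer, List<String>> words = new TreeMap<>();

    void addWord(String word) {
        // Creates the inner list the first time a length is seen,
        // then appends to whichever list is stored under that key.
        words.computeIfAbsent(word.length(), k -> new ArrayList<>()).add(word);
    }

    Map<Integer, List<String>> asMap() {
        return words;
    }

    public static void main(String[] args) {
        LengthIndex index = new LengthIndex();
        for (String w : new String[] {"it", "your", "reality", "real", "our"}) {
            index.addWord(w);
        }
        System.out.println(index.asMap());
        // prints {2=[it], 3=[our], 4=[your, real], 7=[reality]}
    }
}
```

Within one key the words keep their insertion order, which is why "your" comes before "real" here.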

As your key appears to be a small integer, you could use a list of lists. In this case, though, the simplest solution is a multimap-style structure, i.e. a Map of Sets:
Map<Integer, Set<String>> stringsByLength = new LinkedHashMap<>();
for (String s : strings) {
    Integer len = s.length();
    Set<String> set = stringsByLength.get(len); // look up by length, not by the string itself
    if (set == null)
        stringsByLength.put(len, set = new LinkedHashSet<>());
    set.add(s);
}

private HashMap<Integer, List<String>> map = new HashMap<Integer, List<String>>();
void addStringToMap(String s) {
int length = s.length();
if (map.get(length) == null) {
map.put(length, new ArrayList<String>());
}
map.get(length).add(s);
}
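If both the keys and the words under each key should come out sorted, as in the question's pseudocode (4 => ['real', 'your']), one option on Java 8+ is Collectors.groupingBy with a TreeMap and TreeSet. A hedged sketch: the class and method names (GroupByLength, group) are invented for the example, and using a TreeSet means duplicate words are dropped.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical example class, not taken from any of the answers above.
class GroupByLength {
    static TreeMap<Integer, TreeSet<String>> group(List<String> input) {
        return input.stream().collect(Collectors.groupingBy(
                String::length,                          // key: length of the string
                TreeMap::new,                            // outer map sorted by key
                Collectors.toCollection(TreeSet::new))); // inner set sorted alphabetically
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("it", "your", "reality", "real", "our");
        System.out.println(group(input));
        // prints {2=[it], 3=[our], 4=[real, your], 7=[reality]}
    }
}
```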

Related

How to sort data in a CSV file using a particular field in Java?

I want to read a CSV file in Java and sort it using a particular column. My CSV file looks like this:
ABC,DEF,11,GHI....
JKL,MNO,10,PQR....
STU,VWX,12,XYZ....
Considering I want to sort it using the third column, my output should look like:
JKL,MNO,10,PQR....
ABC,DEF,11,GHI....
STU,VWX,12,XYZ....
After some research on what data structure to use to hold the data of CSV, people here suggested to use Map data structure with Integer and List as key and value pairs in this question:
Map<Integer, List<String>>
where the value, List<String> = {[ABC,DEF,11,GHI....], [JKL,MNO,10,PQR....],[STU,VWX,12,XYZ....]...}
And the key will be an auto-incremented integer starting from 0.
So could anyone please suggest a way to sort this Map using an element in the 'List' in Java? Also if you think this choice of data structure is bad, please feel free to suggest an easier data structure to do this.
Thank you.

I would use an ArrayList of ArrayList of String:
ArrayList<ArrayList<String>>
Each entry is one line, which is a list of strings.
You initialize the list by:
List<ArrayList<String>> csvLines = new ArrayList<ArrayList<String>>();
To get the nth line:
List<String> line = csvLines.get(n);
To sort, write a custom Comparator. You can pass the field position used for sorting to the comparator's constructor.
The compare method then reads the String value at the stored position and converts it to a primitive Java type depending on the position. E.g. if you know that position 2 in the CSV holds an integer, convert the String to an int. This is necessary for sorting correctly. You may also pass an ArrayList of Class to the constructor so that it knows the type of each field.
Then use String.compareTo() or Integer.compare(), depending on the column position etc.
Edit: example of working code:
List<ArrayList<String>> csvLines = new ArrayList<ArrayList<String>>();
Comparator<ArrayList<String>> comp = new Comparator<ArrayList<String>>() {
public int compare(ArrayList<String> csvLine1, ArrayList<String> csvLine2) {
// TODO here convert to Integer depending on field.
// example is for numeric field 2
return Integer.valueOf(csvLine1.get(2)).compareTo(Integer.valueOf(csvLine2.get(2)));
}
};
Collections.sort(csvLines, comp);
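The example above hard-codes column 2. A sketch of the constructor-based variant described earlier in this answer might look like the following; ColumnComparator and its boolean numeric flag are assumptions made for illustration (the answer suggests passing a list of Class objects instead), and only int columns are handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical comparator: column index and type hint are given to the constructor.
class ColumnComparator implements Comparator<List<String>> {
    private final int column;      // zero-based field position to sort by
    private final boolean numeric; // true: compare as int, false: compare as String

    ColumnComparator(int column, boolean numeric) {
        this.column = column;
        this.numeric = numeric;
    }

    @Override
    public int compare(List<String> a, List<String> b) {
        String x = a.get(column).trim();
        String y = b.get(column).trim();
        if (numeric) {
            // Numeric comparison, so that 10 sorts before 100 and 2 before 10.
            return Integer.compare(Integer.parseInt(x), Integer.parseInt(y));
        }
        return x.compareTo(y);
    }

    public static void main(String[] args) {
        List<List<String>> csvLines = new ArrayList<>();
        csvLines.add(Arrays.asList("ABC", "DEF", "11", "GHI"));
        csvLines.add(Arrays.asList("JKL", "MNO", "10", "PQR"));
        csvLines.add(Arrays.asList("STU", "VWX", "12", "XYZ"));
        csvLines.sort(new ColumnComparator(2, true));
        System.out.println(csvLines);
        // prints [[JKL, MNO, 10, PQR], [ABC, DEF, 11, GHI], [STU, VWX, 12, XYZ]]
    }
}
```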

In Java 8 you can do
SortedMap<Integer, List<String>> collect = Files.lines(Paths.get(filename))
.collect(Collectors.groupingBy(
l -> Integer.valueOf(l.split(",", 4)[2]),
TreeMap::new, Collectors.toList()));
Note: comparing numbers as Strings is a bad idea as "100" < "2" might not be what you expect.
I would use a sorted multi-map. If you don't have one handy you can do this.
SortedMap<Integer, List<String>> linesByKey = new TreeMap<>();
public void addLine(String line) {
Integer key = Integer.valueOf(line.split(",", 4)[2]); // third field is the sort key
List<String> lines = linesByKey.get(key);
if (lines == null)
linesByKey.put(key, lines = new ArrayList<>());
lines.add(line);
}
This will produce a collection of lines, sorted by the number where lines with duplicate numbers have a preserved order. e.g. if all the lines have the same number, the order is unchanged.
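To get a flat, sorted list of lines back out of such a map, the value lists can be concatenated in key order. A minimal sketch, assuming the third field always parses as an int; the class and method names (SortedCsvLines, sortByThirdColumn) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical example class built around the TreeMap approach above.
class SortedCsvLines {
    static List<String> sortByThirdColumn(List<String> lines) {
        SortedMap<Integer, List<String>> linesByKey = new TreeMap<>();
        for (String line : lines) {
            Integer key = Integer.valueOf(line.split(",", 4)[2].trim());
            linesByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
        }
        // values() iterates in ascending key order; each inner list keeps input order.
        List<String> sorted = new ArrayList<>();
        for (List<String> group : linesByKey.values()) {
            sorted.addAll(group);
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ABC,DEF,11,GHI", "JKL,MNO,10,PQR", "STU,VWX,12,XYZ", "AAA,BBB,11,CCC");
        System.out.println(sortByThirdColumn(lines));
        // prints [JKL,MNO,10,PQR, ABC,DEF,11,GHI, AAA,BBB,11,CCC, STU,VWX,12,XYZ]
    }
}
```

The two lines with key 11 stay in their original relative order, which is the stability property mentioned above.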

You can also use a list of lists:
List<List<String>> Llp = new ArrayList<List<String>>();
Then you need to call sort with a custom comparator that compares the third item in each list:
Collections.sort(Llp, new Comparator<List<String>>() {
    @Override
    public int compare(List<String> o1, List<String> o2) {
        try {
            return o1.get(2).compareTo(o2.get(2));
        } catch (IndexOutOfBoundsException e) {
            return 0;
        }
    }
});

In the below code I have sorted the CSV file based on the second column.
public static void main(String[] args) throws IOException {
String csvFile = "file_1.csv";
String line = "";
String cvsSplitBy = ",";
List<List<String>> llp = new ArrayList<>();
try (BufferedReader br = new BufferedReader(new FileReader(csvFile))) {
while ((line = br.readLine()) != null) {
llp.add(Arrays.asList(line.split(cvsSplitBy)));
}
llp.sort(new Comparator<List<String>>() {
@Override
public int compare(List<String> o1, List<String> o2) {
return o1.get(1).compareTo(o2.get(1));
}
});
System.out.println(llp);
} catch (IOException e) {
e.printStackTrace();
}
}

How can I retrieve the value in a Hashmap stored in an arraylist type hashmap?

I am a beginner in Java. Basically, I have loaded each text document and stored each individual word of the document in a HashMap. After that, I tried storing all the HashMaps in an ArrayList. Now I am stuck on how to retrieve all the words from the HashMaps in the ArrayList!
private static long numOfWords = 0;
private String userInputString;
private static long wordCount(String data) {
long words = 0;
int index = 0;
boolean prevWhiteSpace = true;
while (index < data.length()) {
//Intialise character variable that will be checked.
char c = data.charAt(index++);
//Determine whether it is a space.
boolean currWhiteSpace = Character.isWhitespace(c);
//If previous is a space and character checked is not a space,
if (prevWhiteSpace && !currWhiteSpace) {
words++;
}
//Assign current character's determination of whether it is a spacing as previous.
prevWhiteSpace = currWhiteSpace;
}
return words;
} //
public static ArrayList StoreLoadedFiles()throws Exception{
final File f1 = new File ("C:/Users/Admin/Desktop/dataFiles/"); //specify the directory to load files
String data=""; //reset the words stored
ArrayList<HashMap> hmArr = new ArrayList<HashMap>(); //array of hashmap
for (final File fileEntry : f1.listFiles()) {
Scanner input = new Scanner(fileEntry); //load files
while (input.hasNext()) { //while there are still words in the document, continue to load all the words in a file
data += input.next();
input.useDelimiter("\t"); //similar to split function
} //while loop
String textWords = data.replaceAll("\\s+", " "); //remove all found whitespaces
HashMap<String, Integer> hm = new HashMap<String, Integer>(); //Creates a Hashmap that would be renewed when next document is loaded.
String[] words = textWords.split(" "); //store individual words into a String array
for (int j = 0; j < numOfWords; j++) {
int wordAppearCount = 0;
if (hm.containsKey(words[j].toLowerCase().replaceAll("\\W", ""))) { //replace non-word characters
wordAppearCount = hm.get(words[j].toLowerCase().replaceAll("\\W", "")); //remove non-word character and retrieve the index of the word
}
if (!words[j].toLowerCase().replaceAll("\\W", "").equals("")) {
//Words stored in hashmap are in lower case and have special characters removed.
hm.put(words[j].toLowerCase().replaceAll("\\W", ""), ++wordAppearCount);//index of word and string word stored in hashmap
}
}
hmArr.add(hm);//stores every single hashmap inside an ArrayList of hashmap
} //end of for loop
return hmArr; //return hashmap ArrayList
}
public static void LoadAllHashmapWords(ArrayList m){
for(int i=0;i<m.size();i++){
m.get(i); //stuck here!
}

Firstly, your logic won't work correctly. In the StoreLoadedFiles() method you iterate through the words with for (int j = 0; j < numOfWords; j++) { . The numOfWords field is initialized to zero and is not updated in the code shown, so this loop won't execute at all. You should use the length of the words array instead.
Having said that, to retrieve the values from a list of hashmaps, you should first iterate through the list, and for each hashmap take its entry set. Map.Entry is basically the key-value pair that you store in the hashmap. When you invoke map.entrySet() it returns a java.util.Set<Map.Entry<Key, Value>>. A set is returned because the keys are unique.
So a complete program will look like this.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map.Entry;
import java.util.Scanner;
public class FileWordCounter {
public static List<HashMap<String, Integer>> storeLoadedFiles() {
final File directory = new File("C:/Users/Admin/Desktop/dataFiles/");
List<HashMap<String, Integer>> listOfWordCountMap = new ArrayList<HashMap<String, Integer>>();
Scanner input = null;
StringBuilder data;
try {
for (final File fileEntry : directory.listFiles()) {
input = new Scanner(fileEntry);
input.useDelimiter("\t");
data = new StringBuilder();
while (input.hasNext()) {
data.append(input.next());
}
input.close();
String wordsInFile = data.toString().replaceAll("\\s+", " ");
HashMap<String, Integer> wordCountMap = new HashMap<String, Integer>();
for(String word : wordsInFile.split(" ")){
String strippedWord = word.toLowerCase().replaceAll("\\W", "");
int wordAppearCount = 0;
if(strippedWord.length() > 0){
if(wordCountMap.containsKey(strippedWord)){
wordAppearCount = wordCountMap.get(strippedWord);
}
wordCountMap.put(strippedWord, ++wordAppearCount);
}
}
listOfWordCountMap.add(wordCountMap);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} finally {
if(input != null) {
input.close();
}
}
return listOfWordCountMap;
}
public static void loadAllHashmapWords(List<HashMap<String, Integer>> listOfWordCountMap) {
for(HashMap<String, Integer> wordCountMap : listOfWordCountMap){
for(Entry<String, Integer> wordCountEntry : wordCountMap.entrySet()){
System.out.println(wordCountEntry.getKey() + " - " + wordCountEntry.getValue());
}
}
}
public static void main(String[] args) {
List<HashMap<String, Integer>> listOfWordCountMap = storeLoadedFiles();
loadAllHashmapWords(listOfWordCountMap);
}
}
Since you are a beginner in Java programming, I would like to point out a few best practices that you could start using from the beginning.
Closing resources : In your while loop to read from files you open a Scanner with Scanner input = new Scanner(fileEntry);, but you never close it. This leaks resources such as open file handles. You should always close resources, for example in the finally block of a try-catch-finally.
Avoid unnecessary redundant calls : If an operation gives the same result on every iteration of a loop, move it outside the loop. In your case, setting the scanner delimiter with input.useDelimiter("\t"); is a one-time operation after the scanner is initialized, so you could move it outside the while loop.
Use StringBuilder instead of String : Repeated string manipulations such as concatenation should be done using a StringBuilder (or StringBuffer when you need synchronization) instead of += or +. String is immutable, meaning its value cannot be changed, so each concatenation creates a new String object. This results in a lot of unused instances in memory, whereas StringBuilder is mutable and its value can be changed in place.
Naming convention : The usual naming convention for Java methods and variables is to start with a lower-case letter and capitalize the first letter of each subsequent word. So it's standard practice to name a method storeLoadedFiles as opposed to StoreLoadedFiles. (This could be opinion based ;))
Give descriptive names : It's good practice to give descriptive names, as it helps later code maintenance. For example, wordCountMap is a better name than hm. Anyone who reads your code in the future will understand it faster with descriptive names. Again, opinion based.
Use generics as much as possible : This avoids casts and lets the compiler check element types.
Avoid repetition : Similar to point 2, if an operation produces the same output and is needed multiple times, store the result in a variable and use the variable. In your case you were using words[j].toLowerCase().replaceAll("\\W", "") multiple times. The result is the same each time, but it creates unnecessary instances and repetition. So you could assign it to a String and use that String elsewhere.
Try using the for-each loop wherever possible : This relieves us from taking care of indexing.
These are just suggestions. I tried to include most of them in my code, but I won't say it's perfect. Since you are a beginner, if you adopt these best practices now they'll become ingrained. Happy coding.. :)
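On Java 7 and later, the resource-closing advice above can also be expressed with try-with-resources, which closes the Scanner automatically. The following is only a sketch under a few assumptions: WordCountSketch and countWords are made-up names, the input is any Readable (a StringReader in the demo rather than a file), and the Scanner's default whitespace delimiter is used instead of "\t".

```java
import java.io.StringReader;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

// Hypothetical example class; not part of the program above.
class WordCountSketch {
    static Map<String, Integer> countWords(Readable source) {
        Map<String, Integer> counts = new TreeMap<>();
        // try-with-resources closes the Scanner even if an exception is thrown.
        try (Scanner input = new Scanner(source)) {
            while (input.hasNext()) {
                // Normalize once and reuse the result.
                String stripped = input.next().toLowerCase().replaceAll("\\W", "");
                if (!stripped.isEmpty()) {
                    // merge: store 1 if absent, otherwise add 1 to the existing count.
                    counts.merge(stripped, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(new StringReader("The cat and the hat.")));
        // prints {and=1, cat=1, hat=1, the=2}
    }
}
```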

for (HashMap<String, Integer> map : m) {
for(Entry<String,Integer> e:map.entrySet()){
//your code here
}
}
or, if using java 8 you can play with lambda
m.stream().forEach((map) -> {
map.entrySet().stream().forEach((e) -> {
//your code here
});
});
But first you have to change the method signature to public static void LoadAllHashmapWords(List<HashMap<String,Integer>> m), otherwise you would have to use a cast.
P.S. Are you sure your extraction method works? I tested it a bit and got a list of empty hashmaps every time.

Can I create an array of sets?

Here is what I am trying to do.
I am reading in a list of words with each having a level of complexity. Each line has a word followed by a comma and the level of the word. "watch, 2" for example. I wish to put all of the words of a given level into a set to ensure their uniqueness in that level. There are 5 levels of complexity, so ideally I'd like an array with 5 elements, each of which is a set.
I can then add words to each of the sets as I read them in. Later on, I wish to pull out a random word of a specified level.
I'm happy with everything except how to create an array of sets. I've read several other posts here that seem to agree that this can't be done exactly as I would hope, but I can't find a good work around. (No, I'm not willing to have 5 sets in a switch statement. Goes against the grain.)
Thanks.

You can use a map. Use the level as the key and the set containing the words as the value. This lets you pull out the set for a given level: when a random word is requested from a level, get the value (a set in this case) using the level as the key and pick a random element from it. This will also scale if you increase the number of levels.
public static void main(String[] args) {
Map<Integer, Set<String>> levelSet = new HashMap<Integer, Set<String>>();
//Your code goes here to get the level and word
//
String word="";
int level=0;
addStringToLevel(levelSet,word,level);
}
private static void addStringToLevel(Map<Integer, Set<String>> levelSet,
String word, int level) {
if(levelSet.get(level) == null)
{
// this means this is the first string added for this level
// so create a container to hold the object
levelSet.put(level, new HashSet<String>());
}
Set<String> wordContainer = levelSet.get(level);
wordContainer.add(word);
}
private static String getStringFromLevel(Map<Integer, Set<String>> levelSet,
int level) {
if(levelSet.get(level) == null)
{
return null;
}
Set<String> wordContainer = levelSet.get(level);
return ""; // placeholder: return a random string from wordContainer
}
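The placeholder in getStringFromLevel leaves the random pick open. Since a Set has no index, one simple way is to copy it into a list and index into that. This is a sketch, not the answerer's code: RandomWordPicker and pickRandom are hypothetical names, and the copy costs O(n) per call, which should be fine for small word lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper for picking a random element out of a Set.
class RandomWordPicker {
    static String pickRandom(Set<String> words, Random random) {
        if (words == null || words.isEmpty()) {
            return null; // nothing stored for this level
        }
        // Sets are not indexable, so copy into a list first.
        List<String> copy = new ArrayList<>(words);
        return copy.get(random.nextInt(copy.size()));
    }

    public static void main(String[] args) {
        Set<String> level2 = new HashSet<>(Arrays.asList("watch", "clock", "timer"));
        System.out.println(pickRandom(level2, new Random()));
    }
}
```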

If you are willing to use Guava, try SetMultimap. It will take care of everything for you.
SetMultimap<Integer, String> map = HashMultimap.create();
map.put(5, "value");
The collection will take care of creating the inner Set instances for you unlike the array or List solutions which require either pre-creating the Sets or checking that they exist.

Consider using a List instead of an array.
Doing so might make your life easier.
List<Set<String>> wordSetLevels = new ArrayList<Set<String>>();
// ...
for ( int i = 0; i < 5; i++ ) {
wordSetLevels.add(new HashSet<String>());
}
wordSetLevels = Collections.unmodifiableList(wordSetLevels);
// ...
wordSetLevels.get(2).add("watch");

import java.util.HashSet;
import java.util.List;
import java.util.Set;
public class Main {
private Set<String>[] process(List<String> words) {
@SuppressWarnings("unchecked")
Set<String>[] arrayOfSets = new Set[5];
for(int i=0; i<arrayOfSets.length; i++) {
arrayOfSets[i] = new HashSet<String>();
}
for(String word: words) {
int index = getIndex(word);
String val = getValue(word);
arrayOfSets[index].add(val);
}
return arrayOfSets;
}
private int getIndex(String str) {
//TODO Implement
return 0;
}
private String getValue(String str) {
//TODO Implement
return "";
}
}

How do I grab an index from an array in a HashMap?

I've got a HashMap<Object, String[]> I just want to grab the 0 index position from the String[]. How do I do that?
Can I just do this? mMap.get(position)[0]?

Yes, you can do what you've indicated, provided position is a key in the map.

HashMap doesn't have a 0index and it doesn't have a String[]
You cannot do what you ask because it doesn't make sense.
Can I just do this? mMap.get(position)[0]?
You can. Have you tried this to see if it works? Note: it will fail if map.get() returns null

A map is not an array. It's a dictionary of keys that map to items. So there is no ordered index in the Map part, there's a lookup key. This means that a Map has no "next" item.
In the event that you stored a String[] in the map, then you could get the first element of the String array like so:
String first = ((String[])mMap.get(lookupKey))[0];
Since you are using generics, your compiler will make the casting unnecessary, simplifying the answer to
String first = mMap.get(lookupKey)[0];
Note that this is not using the position to access the item stored in the Map; it's using the lookup key. In addition, in the first form the returned Object is cast to a String[] (because we stored a String[] in the map earlier), and then the first ('0') element is dereferenced.

Here is a little demo program that does what you're asking.
import java.util.Map;
import java.util.HashMap;
public class FirstElementInHashMap {
public static void main(String[] args) {
Map<Object,String[]> mMap = new HashMap<Object,String[]>();
mMap.put("myKeyA", new String[] { "myValue1", "myValue2", "myValue3" });
mMap.put("myKeyB", new String[] { "myValue4", "myValue5", "myValue6" });
mMap.put("myKeyC", new String[] { "myValue7", "myValue8", "myValue9" });
Object position = "myKeyB";
String[] strings = mMap.get(position);
// make sure position exists in the Map and contains a non-empty array
// so we don't throw a NullPointerException or ArrayIndexOutOfBoundsException
String firstStringInArray = null;
if (strings != null && strings.length > 0) {
firstStringInArray = strings[0];
}
System.out.println(firstStringInArray);
}
}
The output of the above program is:
myValue4

Is there an object like a "Set" that can contain only unique string values, but also contain a count on the number of occurrences of the string value?

In Java is there an object like a "Set" that can contain only unique string values, but also contain a count on the number of occurrences of the string value?
The idea is simple
With a data set ala...
A
B
B
C
C
C
I'd like to add each line of text to a Set-like object. Each time that a non-unique text is added to the set I'd like to also have a numeric value associated with the set to display how many times it was added. So if I ran it on the above data set the output would be something like:
A : 1
B : 2
C : 3
any ideas?

You want a "Bag", like the Bag in Apache Commons Collections or the Multiset in Google Collections. You can add the same value to it multiple times, and it'll record the counts of each value. You can then interrogate the counts.
You'd do something like this with Apache Commons' Bag:
Bag myBag = new HashBag();
myBag.add("Orange");
myBag.add("Apple", 4);
myBag.add("Apple");
myBag.remove("Apple", 2);
int apples = myBag.getCount("Apple"); // Should be 3.
int kumquats = myBag.getCount("Kumquat"); // Should be 0.
And this with Google Collections' Multiset.
Multiset<String> myMultiset= HashMultiset.create();
myMultiset.add("Orange");
myMultiset.add("Apple", 4);
myMultiset.add("Apple");
myMultiset.remove("Apple", 2);
int apples = myMultiset.count("Apple"); // 3
int kumquats = myMultiset.count("Kumquats"); // 0
The problem with Apache Collections in general is that it isn't being very actively maintained, and it doesn't yet support Java Generics. To step into this gap, Google's written their own Collections which are extremely powerful. Be sure to evaluate Google Collections first.
Update: Google Collections also offers Multimap, a "collection similar to a Map, but which may associate multiple values with a single key".

Map<String, Integer> would be the best bet. Put in words, what you want to do is map each string to its number of occurrences. Basically, have something like this:
public void add(String s) {
if (map.containsKey(s)) {
map.put(s, map.get(s) + 1);
} else {
map.put(s, 1);
}
}
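On Java 8+, the containsKey/put branching above can be collapsed into a single Map.merge call. A minimal sketch; the wrapper class OccurrenceCounter and its count method are hypothetical and only there to make the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper around a Map<String, Integer> of occurrence counts.
class OccurrenceCounter {
    private final Map<String, Integer> map = new HashMap<>();

    void add(String s) {
        // Stores 1 if the key is absent, otherwise adds 1 to the current value.
        map.merge(s, 1, Integer::sum);
    }

    int count(String s) {
        // Strings that were never added report a count of 0.
        return map.getOrDefault(s, 0);
    }

    public static void main(String[] args) {
        OccurrenceCounter counter = new OccurrenceCounter();
        for (String s : new String[] {"A", "B", "B", "C", "C", "C"}) {
            counter.add(s);
        }
        System.out.println("A : " + counter.count("A")); // prints A : 1
        System.out.println("B : " + counter.count("B")); // prints B : 2
        System.out.println("C : " + counter.count("C")); // prints C : 3
    }
}
```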

Yeap, not directly in the core, but can be built easily with a Map.
Here's a naive implementation:
import java.util.Map;
import java.util.HashMap;
public class SetLike {
private Map<String, Integer> map = new HashMap<String,Integer>();
public void add( String s ) {
if( !map.containsKey( s ) ){
map.put( s, 0 );
}
map.put( s, map.get( s ) + 1 );
}
public void printValuesAndCounts() {
System.out.println( map );
}
public static void main( String [] args ){
String [] data = {"A","B","B","C","C","C"};
SetLike holder = new SetLike();
for( String value : data ) {
holder.add( value );
}
holder.printValuesAndCounts();
}
}
Test it
$ javac SetLike.java
$ java SetLike
{A=1, C=3, B=2}
Of course you can improve it much more. You can implement the Set interface, or a List, or a Collection, etc, you can add the iterators, implement Iterable and so on, it depends on what you want and what you need.

This will be helpful..
List<String> myList=new ArrayList<String>();
myList.add("A");
myList.add("B");
myList.add("B");
myList.add("C");
myList.add("C");
myList.add("C");
Set<String> set=new HashSet<String>(myList);
for (String value : set)
{
int occurance=Collections.frequency(myList, value);
System.out.println(value +" occur "+occurance + " times ");
}
Result :
A occur 1 times
B occur 2 times
C occur 3 times
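Note that Collections.frequency rescans the whole list for every distinct value. For larger inputs, a single pass with the Java 8 stream collectors may be preferable. A sketch, with the class and method names (FrequencyByStream, countAll) invented for the example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class: counts all values in one pass over the list.
class FrequencyByStream {
    static TreeMap<String, Long> countAll(List<String> values) {
        return values.stream().collect(Collectors.groupingBy(
                Function.identity(),     // group equal strings together
                TreeMap::new,            // sorted keys for stable output
                Collectors.counting())); // count the members of each group
    }

    public static void main(String[] args) {
        List<String> myList = Arrays.asList("A", "B", "B", "C", "C", "C");
        System.out.println(countAll(myList));
        // prints {A=1, B=2, C=3}
    }
}
```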
