Alright, so my problem here right now is that I can get all the words from a list, find the occurrence and then add key and value pairs to map, but since I need to return a list of words which frequency is even, I get stuck. Any help?
public static List<String> onlyEvenWordsList(List<String> words) {
Map<String, Integer> wordsWithCount = new HashMap<>();
List<String> onlyEvenWords = new ArrayList<>();
for (String word : words) {
Integer count = wordsWithCount.get(word);
if (count == null) {
count = 0;
}
wordsWithCount.put(word, count + 1);
}
for(Integer value: wordsWithCount.values()){
if(value % 2 == 0){
....
}
}
return onlyEvenWords;
}
public static List<String> onlyEvenWordsList(List<String> words) {
Map<String, Integer> wordsWithCount = new HashMap<>();
List<String> onlyEvenWords = new ArrayList<>();
for (String word : words) {
Integer count = wordsWithCount.get(word);
if (count == null) {
count = 0;
}
wordsWithCount.put(word, count + 1);
}
for(Map.Entry<String, Integer> entry: wordsWithCount.entrySet()){
if(entry.getValue()%2==0){
onlyEvenWords.add(entry.getKey());
}
}
return onlyEvenWords;
}
And to print the list in main
System.out.println(Arrays.toString( onlyEvenWordsList(words).toArray()));
I have a Input String as :
String str="1,1,2,2,2,1,3";
I want count each id occurrence and store them into List,and I want output Like this:
[
{
"count": "3",
"ids": "1, 2"
}
{
"count": "1",
"ids": "3"
}
]
I tried by using org.springframework.util.StringUtils.countOccurrencesOf(input, "a"); like this. But after counting not getting the things like I want.
This will give you the desired result. You first count the occurrences of each character, then you group by count each character in a new HashMap<Integer, List<String>>.
Here's a working example:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
public class Test {
public static void main(String[] args) {
String str = "1,1,2,2,2,1,3";
String[] list = str.split(",");
HashMap<String, Integer> occr = new HashMap<>();
for (int i = 0; i < list.length; i++) {
if (occr.containsKey(list[i])) {
occr.put(list[i], occr.get(list[i]) + 1);
} else {
occr.put(list[i], 1);
}
}
HashMap<Integer, List<String>> res = new HashMap<>();
for (String key : occr.keySet()) {
int count = occr.get(key);
if (res.containsKey(count)) {
res.get(count).add(key);
} else {
List<String> l = new ArrayList<>();
l.add(key);
res.put(count, l);
}
}
StringBuffer sb = new StringBuffer();
sb.append("[\n");
for (Integer count : res.keySet()) {
sb.append("{\n");
List<String> finalList = res.get(count);
sb.append("\"count\":\"" + count + "\",\n");
sb.append("\"ids\":\"" + finalList.get(0));
for (int i = 1; i < finalList.size(); i++) {
sb.append("," + finalList.get(i));
}
sb.append("\"\n}\n");
}
sb.append("\n]");
System.out.println(sb.toString());
}
}
EDIT: A more generalised solution
Here's the method that returns a HashMap<Integer,List<String>>, which contains the number of occurrences of a string as a key of the HashMap where each key has a List<String> value which contains all the strings that occur key number of times.
public HashMap<Integer, List<String>> countOccurrences(String str, String delimiter) {
// First, we count the number of occurrences of each string.
String[] list = str.split(delimiter);
HashMap<String, Integer> occr = new HashMap<>();
for (int i = 0; i < list.length; i++) {
if (occr.containsKey(list[i])) {
occr.put(list[i], occr.get(list[i]) + 1);
} else {
occr.put(list[i], 1);
}
}
/** Now, we group them by the number of occurrences,
* All strings with the same number of occurrences are put into a list;
* this list is put into a HashMap as a value, with the number of
* occurrences as a key.
*/
HashMap<Integer, List<String>> res = new HashMap<>();
for (String key : occr.keySet()) {
int count = occr.get(key);
if (res.containsKey(count)) {
res.get(count).add(key);
} else {
List<String> l = new ArrayList<>();
l.add(key);
res.put(count, l);
}
}
return res;
}
You need to do some boring transfer, I'm not sure if you want to keep the ids sorted. A simple implementation is:
public List<Map<String, Object>> countFrequency(String s) {
// Count by char
Map<String, Integer> countMap = new HashMap<String, Integer>();
for (String ch : s.split(",")) {
Integer count = countMap.get(ch);
if (count == null) {
count = 0;
}
count++;
countMap.put(ch, count);
}
// Count by frequency
Map<Integer, String> countByFrequency = new HashMap<Integer, String>();
for (Map.Entry<String, Integer> entry : countMap.entrySet()) {
String chars = countByFrequency.get(entry.getValue());
System.out.println(entry.getValue() + " " + chars);
if (chars == null) {
chars = "" + entry.getKey();
} else {
chars += ", " + entry.getKey();
}
countByFrequency.put(entry.getValue(), chars);
}
// Convert to list
List<Map<String, Object>> result = new ArrayList<Map<String, Object>>();
for (Map.Entry<Integer, String> entry : countByFrequency.entrySet()) {
Map<String, Object> item = new HashMap<String, Object>();
item.put("count", entry.getKey());
item.put("ids", entry.getValue());
result.add(item);
}
return result;
}
Hey check the below code, it help you to achieve your expected result
public class Test
{
public static void main(String args[])
{
String str = "1,1,2,2,2,1,3"; //Your input string
List<String> listOfIds = Arrays.asList(str.split(",")); //Splits the string
System.out.println("List of IDs : " + listOfIds);
HashMap<String, List<String>> map = new HashMap<>();
Set<String> uniqueIds = new HashSet<>(Arrays.asList(str.split(",")));
for (String uniqueId : uniqueIds)
{
String frequency = String.valueOf(Collections.frequency(listOfIds, uniqueId));
System.out.println("ID = " + uniqueId + ", frequency = " + frequency);
if (!map.containsKey(frequency))
{
map.put(frequency, new ArrayList<String>());
}
map.get(frequency).add(uniqueId);
}
for (Map.Entry<String, List<String>> entry : map.entrySet())
{
System.out.println("Count = "+ entry.getKey() + ", IDs = " + entry.getValue());
}
}
}
One of the approach i can suggest you is to
put each "character" in hashMap as a key and "count" as a value.
Sample code to do so is
String str = "1,1,2,2,2,1,3";
HashMap<String, String> map = new HashMap();
for (String c : str.split(",")) {
if (map.containsKey( c)) {
int count = Integer.parseInt(map.get(c));
map.put(c, ++count + "");
} else
map.put(c, "1");
}
System.out.println(map.toString());
}
<!--first you split string based on "," and store into array, after that iterate array end of array lenght in side loop create new map and put element in map as a Key and set value as count 1 again check the key and increase count value in map-->
like....
String str="1,1,2,2,2,1,3";
String strArray=str.split(",");
Map strMap= new hashMap();
for(int i=0; i < strArray.length(); i++){
if(!strMap.containsKey(strArray[i])){
strMap.put(strArray[i],1)
}else{
strMap.put(strArray[i],strMap.get(strArray[i])+1)
}
}
String str="1,1,2,2,2,1,3";
//Converting given string to string array
String[] strArray = str.split(",");
//Creating a HashMap containing char as a key and occurrences as a value
Map<String,Integer> charCountMap = new HashMap<String, Integer>();
//checking each element of strArray
for(String num :strArray){
if(charCountMap.containsKey(num))
{
//If char is present in charCountMap, incrementing it's count by 1
charCountMap.put(num, charCountMap.get(num)+1);
}
else
{
//If char is not present in charCountMap, and putting this char to charCountMap with 1 as it's value
charCountMap.put(num, 1);
}
}
//Printing the charCountMap
for (Map.Entry<String, Integer> entry : charCountMap.entrySet())
{
System.out.println("ID ="+entry.getKey() + " count=" + entry.getValue());
}
}
// Split according to comma
HashMap<String, Integer> hm = new HashMap<String, Integer>();
for (String key : tokens) {
if (hm.containsKey(key)) {
Integer currentCount = hm.get(key);
hm.put(key, ++currentCount);
} else {
hm.put(key, 1);
}
}
// Organize info according to ID
HashMap<Integer, String> result = new HashMap<Integer, String>();
for (Map.Entry<String, Integer> entry : hm.entrySet()) {
Integer newKey = entry.getValue();
if (result.containsKey(newKey)) {
String newValue = entry.getKey() + ", " + result.get(newKey);
result.put(newKey, newValue);
} else {
result.put(newKey, entry.getKey());
}
}
And here is a complete Java 8 streaming solution for the problem. The main idea is to first build a map of the occurances of each id, which results in:
{1=3, 2=3, 3=1}
(first is ID and second the count) and then to group by it by the count:
public static void main(String[] args) {
String str = "1,1,2,2,2,1,3";
System.out.println(
Pattern.compile(",").splitAsStream(str)
.collect(groupingBy(identity(), counting()))
.entrySet().stream()
.collect(groupingBy(i -> i.getValue(), mapping( i -> i.getKey(), toList())))
);
}
which results in:
{1=[3], 3=[1, 2]}
This is the most compact version I could come up with. Is there anything even smaller?
EDIT: By the way here is the complete class, to get all static method imports right:
import static java.util.function.Function.identity;
import java.util.regex.Pattern;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;
public class Java8StreamsTest6 {
public static void main(String[] args) {
String str = "1,1,2,2,2,1,3";
System.out.println(
Pattern.compile(",").splitAsStream(str)
.collect(groupingBy(identity(), counting()))
.entrySet().stream()
.collect(groupingBy(i -> i.getValue(), mapping(i -> i.getKey(), toList())))
);
}
}
I am writing an inverted index program on java which returns the frequency of terms among multiple documents. I have been able to return the number times a word appears in the entire collection, but I have not been able to return which documents the word appears in. This is the code I have so far:
import java.util.*; // Provides TreeMap, Iterator, Scanner
import java.io.*; // Provides FileReader, FileNotFoundException
public class Run
{
public static void main(String[ ] args)
{
// **THIS CREATES A TREE MAP**
TreeMap<String, Integer> frequencyData = new TreeMap<String, Integer>( );
Map[] mapArray = new Map[5];
mapArray[0] = new HashMap<String, Integer>();
readWordFile(frequencyData);
printAllCounts(frequencyData);
}
public static int getCount(String word, TreeMap<String, Integer> frequencyData)
{
if (frequencyData.containsKey(word))
{ // The word has occurred before, so get its count from the map
return frequencyData.get(word); // Auto-unboxed
}
else
{ // No occurrences of this word
return 0;
}
}
public static void printAllCounts(TreeMap<String, Integer> frequencyData)
{
System.out.println("-----------------------------------------------");
System.out.println(" Occurrences Word");
for(String word : frequencyData.keySet( ))
{
System.out.printf("%15d %s\n", frequencyData.get(word), word);
}
System.out.println("-----------------------------------------------");
}
public static void readWordFile(TreeMap<String, Integer> frequencyData)
{
int total = 0;
Scanner wordFile;
String word; // A word read from the file
Integer count; // The number of occurrences of the word
int counter = 0;
int docs = 0;
//**FOR LOOP TO READ THE DOCUMENTS**
for(int x=0; x<Docs.length; x++)
{ //start of for loop [*
try
{
wordFile = new Scanner(new FileReader(Docs[x]));
}
catch (FileNotFoundException e)
{
System.err.println(e);
return;
}
while (wordFile.hasNext( ))
{
// Read the next word and get rid of the end-of-line marker if needed:
word = wordFile.next( );
// This makes the Word lower case.
word = word.toLowerCase();
word = word.replaceAll("[^a-zA-Z0-9\\s]", "");
// Get the current count of this word, add one, and then store the new count:
count = getCount(word, frequencyData) + 1;
frequencyData.put(word, count);
total = total + count;
counter++;
docs = x + 1;
}
} //End of for loop *]
System.out.println("There are " + total + " terms in the collection.");
System.out.println("There are " + counter + " unique terms in the collection.");
System.out.println("There are " + docs + " documents in the collection.");
}
// Array of documents
static String Docs [] = {"words.txt", "words2.txt",};
Instead of simply having a Map from word to count, create a Map from each word to a nested Map from document to count. In other words:
Map<String, Map<String, Integer>> wordToDocumentMap;
Then, inside your loop which records the counts, you want to use code which looks like this:
Map<String, Integer> documentToCountMap = wordToDocumentMap.get(currentWord);
if(documentToCountMap == null) {
// This word has not been found anywhere before,
// so create a Map to hold document-map counts.
documentToCountMap = new TreeMap<>();
wordToDocumentMap.put(currentWord, documentToCountMap);
}
Integer currentCount = documentToCountMap.get(currentDocument);
if(currentCount == null) {
// This word has not been found in this document before, so
// set the initial count to zero.
currentCount = 0;
}
documentToCountMap.put(currentDocument, currentCount + 1);
Now you're capturing the counts on a per-word and per-document basis.
Once you've completed the analysis and you want to print a summary of the results, you can run through the map like so:
for(Map.Entry<String, Map<String,Integer>> wordToDocument :
wordToDocumentMap.entrySet()) {
String currentWord = wordToDocument.getKey();
Map<String, Integer> documentToWordCount = wordToDocument.getValue();
for(Map.Entry<String, Integer> documentToFrequency :
documentToWordCount.entrySet()) {
String document = documentToFrequency.getKey();
Integer wordCount = documentToFrequency.getValue();
System.out.println("Word " + currentWord + " found " + wordCount +
" times in document " + document);
}
}
For an explanation of the for-each structure in Java, see this tutorial page.
For a good explanation of the features of the Map interface, including the entrySet method, see this tutorial page.
Try adding second map word -> set of document name like this:
Map<String, Set<String>> filenames = new HashMap<String, Set<String>>();
...
word = word.replaceAll("[^a-zA-Z0-9\\s]", "");
// Get the current count of this word, add one, and then store the new count:
count = getCount(word, frequencyData) + 1;
frequencyData.put(word, count);
Set<String> filenamesForWord = filenames.get(word);
if (filenamesForWord == null) {
filenamesForWord = new HashSet<String>();
}
filenamesForWord.add(Docs[x]);
filenames.put(word, filenamesForWord);
total = total + count;
counter++;
docs = x + 1;
When you need to get a set of filenames in which you encountered a particular word, you'll just get() it from the map filenames. Here is the example that prints out all the file names, in which we have encountered a word:
public static void printAllCounts(TreeMap<String, Integer> frequencyData, Map<String, Set<String>> filenames) {
System.out.println("-----------------------------------------------");
System.out.println(" Occurrences Word");
for(String word : frequencyData.keySet( ))
{
System.out.printf("%15d %s\n", frequencyData.get(word), word);
for (String filename : filenames.get(word)) {
System.out.println(filename);
}
}
System.out.println("-----------------------------------------------");
}
I've put a scanner into the main methode, and the word I search for will return the documents the word occurce in. I also return how many times the word occurs, but I will only get it to be the total of times in all of three documents. And I want it to return how many times it occurs in each document. I want this to be able to calculate tf-idf, if u have a total answer for the whole tf-idf I would appreciate. Cheers
Here is my code:
import java.util.*; // Provides TreeMap, Iterator, Scanner
import java.io.*; // Provides FileReader, FileNotFoundException
public class test2
{
public static void main(String[ ] args)
{
// **THIS CREATES A TREE MAP**
TreeMap<String, Integer> frequencyData = new TreeMap<String, Integer>();
Map<String, Set<String>> filenames = new HashMap<String, Set<String>>();
Map<String, Integer> countByWords = new HashMap<String, Integer>();
Map[] mapArray = new Map[5];
mapArray[0] = new HashMap<String, Integer>();
readWordFile(countByWords, frequencyData, filenames);
printAllCounts(countByWords, frequencyData, filenames);
}
public static int getCount(String word, TreeMap<String, Integer> frequencyData)
{
if (frequencyData.containsKey(word))
{ // The word has occurred before, so get its count from the map
return frequencyData.get(word); // Auto-unboxed
}
else
{ // No occurrences of this word
return 0;
}
}
public static void printAllCounts( Map<String, Integer> countByWords, TreeMap<String, Integer> frequencyData, Map<String, Set<String>> filenames)
{
System.out.println("-----------------------------------------------");
System.out.print("Search for a word: ");
String worde;
int result = 0;
Scanner input = new Scanner(System.in);
worde=input.nextLine();
if(!filenames.containsKey(worde)){
System.out.println("The word does not exist");
}
else{
for(String filename : filenames.get(worde)){
System.out.println(filename);
System.out.println(countByWords.get(worde));
}
}
System.out.println("\n-----------------------------------------------");
}
public static void readWordFile(Map<String, Integer> countByWords ,TreeMap<String, Integer> frequencyData, Map<String, Set<String>> filenames)
{
Scanner wordFile;
String word; // A word read from the file
Integer count; // The number of occurrences of the word
int counter = 0;
int docs = 0;
//**FOR LOOP TO READ THE DOCUMENTS**
for(int x=0; x<Docs.length; x++)
{ //start of for loop [*
try
{
wordFile = new Scanner(new FileReader(Docs[x]));
}
catch (FileNotFoundException e)
{
System.err.println(e);
return;
}
while (wordFile.hasNext( ))
{
// Read the next word and get rid of the end-of-line marker if needed:
word = wordFile.next( );
// This makes the Word lower case.
word = word.toLowerCase();
word = word.replaceAll("[^a-zA-Z0-9\\s]", "");
// Get the current count of this word, add one, and then store the new count:
count = countByWords.get(word);
if(count != null){
countByWords.put(word, count + 1);
}
else{
countByWords.put(word, 1);
}
Set<String> filenamesForWord = filenames.get(word);
if (filenamesForWord == null) {
filenamesForWord = new HashSet<String>();
}
filenamesForWord.add(Docs[x]);
filenames.put(word, filenamesForWord);
counter++;
docs = x + 1;
}
} //End of for loop *]
System.out.println("There are " + counter + " terms in the collection.");
System.out.println("There are " + docs + " documents in the collection.");
}
// Array of documents
static String Docs [] = {"Document1.txt", "Document2.txt", "Document3.txt"};
}
Let's say I have something like this:
LinkedHashMap <String, ArrayList<String>> h
keyOne has
stringOne
stringTwo
stringThree
keyTwo has
stringOne
How do I count the size of ArrayList of the associated key? So for keyOne, it should give me 3.
Did you try:
ArrayList<String> tmp = h.get("keyOne");
if ( tmp != null ) {
return tmp.size();
} else {
return 0;
}
You can just write h.get(key).size()
int count = h.get("keyOne").size();
should do it.
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Set;
public class Test {
public static void main(String[] args) {
LinkedHashMap<String, ArrayList<String>> h = new LinkedHashMap<String, ArrayList<String>>();
ArrayList<String> al1 = new ArrayList<String>();
al1.add("Value11");
al1.add("Value12");
al1.add("Value13");
ArrayList<String> al2 = new ArrayList<String>();
al2.add("Value21");
h.put("key1", al1);
h.put("key2", null);
Set<String> set = h.keySet();
for (String key : set) {
ArrayList<String> al = h.get(key);
if (al != null)
System.out.println (key + " : " + al.size());
else
System.out.println (key + " is empty ");
}
}
}
Output:
key1 : 3
key2 is empty