WordCount with treemap

I am writing a program that reads a text file, stores the words in a TreeMap, and then prints the word frequencies (word count) to the console. I keep getting a FileNotFoundException. I think I'm almost done with the rest of the code. Any help, pointers, suggestions, or tips would be appreciated. Thanks. Code below:
import java.util.*;
/**
*
* @author
*
*/
public class WordCount {
public static void main(String[] args) {
// TODO Auto-generated method stub
TextFileInput take = new TextFileInput("noteFile.txt");
String m = take.readLine();
String [] input = m.split("[ \n\t\r,.;:!?(){}}]");
TreeMap <String, Integer> myMap = new TreeMap <String, Integer> ();
/**Set set = myMap.entrySet();
Iterator i = set.iterator();
Map.Entry <String, Integer> me; **/
for(int f = 0; f < input.length; f++) {
String key = input[f].toUpperCase();
if(input[f].length() > 1) {
if(myMap.get(key) == null) {
myMap.put(key, 1);
}
else {
int value = myMap.get(key).intValue();
value++;
myMap.put(key, value);
}
}
}
/**while(i.hasNext()) {
me = (Map.Entry)i.next();
System.out.print(me.getKey() + ": ");
System.out.println(me.getValue()); **/
for(Map.Entry<String, Integer> entry : myMap.entrySet()) {
System.out.println(entry.getKey() + " : "+ entry.getValue());
}
}
}

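I am not sure what TextFileInput is; it is not part of the standard library. You can use File and Scanner to read from the file instead.
Give the absolute path of the file, e.g. C://notepad.txt (on Windows). A relative path is resolved against the directory the program is started from, which is a common cause of FileNotFoundException.
Also, you are reading only one line from the file. Put the read inside a while loop to process every line. To print the TreeMap you can do the following: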
for(String entry : myMap.keySet()) {
System.out.println(entry + " : "+ myMap.get(entry));
}
And the complete code is below,
import java.io.File;
import java.io.FileNotFoundException;
import java.util.*;
public class WordCount {
public static void main(String[] args) throws FileNotFoundException {
File file = new File("C://notepad.txt");
Scanner scanner=new Scanner(file);
TreeMap <String, Integer> myMap = new TreeMap <String, Integer> ();
while(scanner.hasNextLine())
{
String m = scanner.nextLine();
String [] input = m.split("[ \n\t\r,.;:!?(){}}]");
for(int f = 0; f < input.length; f++) {
String key = input[f].toUpperCase();
if(input[f].length() > 1) {
if(myMap.get(key) == null) {
myMap.put(key, 1);
}
else {
myMap.put(key, (myMap.get(key))+1);
}
}
}
}
for(String entry : myMap.keySet()) {
System.out.println(entry + " : "+ myMap.get(entry));
}
}
}
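If Java 8 or later is available, the counting step can be shortened with Map.merge. This is only a sketch under that assumption: the class name TreeMapWordCount and the helper countWords are made up for illustration, and the delimiter set and the length > 1 filter are carried over from the code above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of the original post.
class TreeMapWordCount {

    // Counts upper-cased words across all lines; TreeMap keeps the keys sorted.
    static TreeMap<String, Integer> countWords(List<String> lines) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String token : line.split("[ \\t,.;:!?(){}]+")) {
                // Same filter as the original: ignore empty and one-letter tokens.
                if (token.length() > 1) {
                    counts.merge(token.toUpperCase(), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // With a file argument, read that file; otherwise use a small built-in sample.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("the cat and the dog", "The end.");
        for (Map.Entry<String, Integer> e : countWords(lines).entrySet()) {
            System.out.println(e.getKey() + " : " + e.getValue());
        }
    }
}
```

Passing the file path as a program argument instead of hard-coding it also makes it easier to see which path is actually being opened when a FileNotFoundException shows up.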

Related

How to make a Java program that takes two types of arguments?

I want my program to take two arguments: the first is a text file path, and the second is an integer giving how many lines I want to see.
Right now my code works with the text file path argument, but it never reaches the second method I wrote for the integer argument.
For example, I run my program like this:
run Read test.txt 3
It prints everything in the text file instead of the 3 lines I want.
Any help would be appreciated!
import java.io.*;
import java.util.*;
public class Read{
public static void main(String[] args) throws FileNotFoundException{
String in = new String(args[0]);
String input = in;
if (Character.isDigit(in.charAt(in.length() - 1))){
hasNumInput(input);
}else onlyStrInput(input);
}
public static void onlyStrInput(String filename) throws FileNotFoundException{
File file = new File(filename);
Scanner inFile = new Scanner(file);
HashMap<String, Integer> map = new HashMap<>();
String input = "";
while (inFile.hasNext()){
input = inFile.next();
if (!map.containsKey(input)) map.put(input,1);
else map.put(input, map.get(input) + 1);
}
int index = 1;
while (!map.isEmpty()) {
int max = 0;
String maxP = "";
for (Map.Entry<String, Integer> stringIntegerEntry : map.entrySet()) {
Map.Entry entry = (Map.Entry) stringIntegerEntry;
String key = (String) entry.getKey();
Integer value = (Integer) entry.getValue();
//if equals, alphabetically
if (value == max) {
if (maxP.compareTo(key) > 0) {
max = value;
maxP = key;
}
}
if (value > max) {
max = value;
maxP = key;
}
}
System.out.println(index++ + ". " + maxP + " = " + map.get(maxP));
//pop out the max
map.remove(maxP);
}
}
public static void hasNumInput(String in) throws FileNotFoundException{
int indexSpace = in.indexOf(" ");
String strNum = in.substring(indexSpace + 1,in.length());
int num = Integer.parseInt(strNum);
String filename = in.substring(0,indexSpace);
File file = new File(filename);
Scanner inFile = new Scanner(file);
HashMap<String, Integer> map = new HashMap<>();
String input = "";
while (inFile.hasNext()){
input = inFile.next();
if (!map.containsKey(input)) map.put(input,1);
else map.put(input, map.get(input) + 1);
}
int index = 1;
while (!map.isEmpty() && num > 0) {
int max = 0;
String maxP = "";
for (Map.Entry<String, Integer> stringIntegerEntry : map.entrySet()) {
Map.Entry entry = (Map.Entry) stringIntegerEntry;
String key = (String) entry.getKey();
Integer value = (Integer) entry.getValue();
//if equals, alphabetically
if (value == max) {
if (maxP.compareTo(key) > 0) {
max = value;
maxP = key;
}
}
if (value > max) {
max = value;
maxP = key;
}
}
System.out.println(index++ + ". " + maxP + " = " + map.get(maxP));
//pop out the max
map.remove(maxP);
num--;
}
}
}

The command line is not passed as a single string in args[0]; each argument is a separate element of String[] args. Check the array length and act accordingly. Your hasNumInput method should take an int argument in addition to the current String. Something like:
public static void hasNumInput(String in, int num) throws FileNotFoundException {
File file = new File(in);
Scanner inFile = new Scanner(file);
Then your main might look like
public static void main(String[] args) throws FileNotFoundException {
if (args.length > 1) {
hasNumInput(args[0], Integer.parseInt(args[1]));
} else {
onlyStrInput(args[0]);
}
}
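Once the limit is a real int parameter, the two nearly identical methods in the question could also collapse into one. A rough sketch, assuming Java 8 or later; the names TopWords and topN are hypothetical, and ordering by descending count with alphabetical tie-breaking is my reading of what the original loop intends.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the original post.
class TopWords {

    // Returns up to `limit` entries (limit >= 0), highest count first,
    // with ties in alphabetical order.
    static List<Map.Entry<String, Integer>> topN(List<String> words, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed()
                .thenComparing(Map.Entry.comparingByKey()));
        return entries.subList(0, Math.min(limit, entries.size()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("b", "a", "b", "c", "a", "d");
        int rank = 1;
        for (Map.Entry<String, Integer> e : topN(words, 3)) {
            System.out.println(rank++ + ". " + e.getKey() + " = " + e.getValue());
        }
    }
}
```

To print everything, the caller can pass Integer.MAX_VALUE as the limit, so onlyStrInput no longer needs its own copy of the loop.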

args is an array. As you already know, the first argument is the first element of the array, referred to using args[0] so logically, the second command line argument would be referred to using args[1] and so on.
public class Foo
{
public static void main(String args[])
{
System.out.println("String = " + args[0]);
System.out.println("int = " + args[1]);
}
}
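The args array is of type String[], so if you want an int you have to convert (not cast) the element, for example with Integer.parseInt(args[1]), which returns an int. Integer.valueOf(args[1]) also works, but it returns an Integer.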
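Since the second argument may be missing or may not be a number, it can be worth guarding the conversion. A small sketch; ArgsSketch, parseLimit, and the convention that -1 means "no limit" are all assumptions made for this example.

```java
// Hypothetical helper class, not part of the original post.
class ArgsSketch {

    // Returns the limit from args[1], or defaultLimit when it is absent or not a number.
    static int parseLimit(String[] args, int defaultLimit) {
        if (args.length < 2) {
            return defaultLimit;
        }
        try {
            return Integer.parseInt(args[1]);
        } catch (NumberFormatException e) {
            return defaultLimit;
        }
    }

    public static void main(String[] args) {
        int limit = parseLimit(args, -1); // -1 is used here to mean "no limit"
        System.out.println(limit < 0 ? "print everything" : "print " + limit + " lines");
    }
}
```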

How can I convert a String into ArrayList by counting occurrence of each characters?

I have an input string:
String str="1,1,2,2,2,1,3";
I want to count the occurrences of each id, group the ids by count in a List, and get output like this:
[
{
"count": "3",
"ids": "1, 2"
}
{
"count": "1",
"ids": "3"
}
]
I tried using org.springframework.util.StringUtils.countOccurrencesOf(input, "a"), but after counting I don't get the output I want.

This will give you the desired result. You first count the occurrences of each id, then you group the ids by count in a new HashMap<Integer, List<String>>.
Here's a working example:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
public class Test {
public static void main(String[] args) {
String str = "1,1,2,2,2,1,3";
String[] list = str.split(",");
HashMap<String, Integer> occr = new HashMap<>();
for (int i = 0; i < list.length; i++) {
if (occr.containsKey(list[i])) {
occr.put(list[i], occr.get(list[i]) + 1);
} else {
occr.put(list[i], 1);
}
}
HashMap<Integer, List<String>> res = new HashMap<>();
for (String key : occr.keySet()) {
int count = occr.get(key);
if (res.containsKey(count)) {
res.get(count).add(key);
} else {
List<String> l = new ArrayList<>();
l.add(key);
res.put(count, l);
}
}
StringBuffer sb = new StringBuffer();
sb.append("[\n");
for (Integer count : res.keySet()) {
sb.append("{\n");
List<String> finalList = res.get(count);
sb.append("\"count\":\"" + count + "\",\n");
sb.append("\"ids\":\"" + finalList.get(0));
for (int i = 1; i < finalList.size(); i++) {
sb.append("," + finalList.get(i));
}
sb.append("\"\n}\n");
}
sb.append("\n]");
System.out.println(sb.toString());
}
}
EDIT: A more generalised solution
Here's a method that returns a HashMap<Integer, List<String>>. Each key is a number of occurrences, and its List<String> value contains all the strings that occur that many times.
public HashMap<Integer, List<String>> countOccurrences(String str, String delimiter) {
// First, we count the number of occurrences of each string.
String[] list = str.split(delimiter);
HashMap<String, Integer> occr = new HashMap<>();
for (int i = 0; i < list.length; i++) {
if (occr.containsKey(list[i])) {
occr.put(list[i], occr.get(list[i]) + 1);
} else {
occr.put(list[i], 1);
}
}
/** Now, we group them by the number of occurrences,
* All strings with the same number of occurrences are put into a list;
* this list is put into a HashMap as a value, with the number of
* occurrences as a key.
*/
HashMap<Integer, List<String>> res = new HashMap<>();
for (String key : occr.keySet()) {
int count = occr.get(key);
if (res.containsKey(count)) {
res.get(count).add(key);
} else {
List<String> l = new ArrayList<>();
l.add(key);
res.put(count, l);
}
}
return res;
}
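On Java 8 or later the same two steps can be written more compactly with merge and computeIfAbsent. This is a sketch, not a drop-in replacement: the class name GroupByCount is made up, and TreeMap is swapped in for HashMap only so that the printed order is predictable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of the original answer.
class GroupByCount {

    // Note: String.split treats the delimiter as a regular expression.
    static TreeMap<Integer, List<String>> groupByCount(String str, String delimiter) {
        // Step 1: count each token; TreeMap keeps the tokens in sorted order.
        Map<String, Integer> occurrences = new TreeMap<>();
        for (String token : str.split(delimiter)) {
            occurrences.merge(token, 1, Integer::sum);
        }
        // Step 2: invert the map, collecting the tokens that share a count.
        TreeMap<Integer, List<String>> byCount = new TreeMap<>();
        for (Map.Entry<String, Integer> e : occurrences.entrySet()) {
            byCount.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        return byCount;
    }

    public static void main(String[] args) {
        // prints {1=[3], 3=[1, 2]}
        System.out.println(groupByCount("1,1,2,2,2,1,3", ","));
    }
}
```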

You need to do some tedious conversion between maps. I'm not sure whether you want to keep the ids sorted. A simple implementation is:
public List<Map<String, Object>> countFrequency(String s) {
// Count by char
Map<String, Integer> countMap = new HashMap<String, Integer>();
for (String ch : s.split(",")) {
Integer count = countMap.get(ch);
if (count == null) {
count = 0;
}
count++;
countMap.put(ch, count);
}
// Count by frequency
Map<Integer, String> countByFrequency = new HashMap<Integer, String>();
for (Map.Entry<String, Integer> entry : countMap.entrySet()) {
String chars = countByFrequency.get(entry.getValue());
System.out.println(entry.getValue() + " " + chars);
if (chars == null) {
chars = "" + entry.getKey();
} else {
chars += ", " + entry.getKey();
}
countByFrequency.put(entry.getValue(), chars);
}
// Convert to list
List<Map<String, Object>> result = new ArrayList<Map<String, Object>>();
for (Map.Entry<Integer, String> entry : countByFrequency.entrySet()) {
Map<String, Object> item = new HashMap<String, Object>();
item.put("count", entry.getKey());
item.put("ids", entry.getValue());
result.add(item);
}
return result;
}

Check the code below; it should help you get the expected result.
import java.util.*;
public class Test
{
public static void main(String args[])
{
String str = "1,1,2,2,2,1,3"; //Your input string
List<String> listOfIds = Arrays.asList(str.split(",")); //Splits the string
System.out.println("List of IDs : " + listOfIds);
HashMap<String, List<String>> map = new HashMap<>();
Set<String> uniqueIds = new HashSet<>(Arrays.asList(str.split(",")));
for (String uniqueId : uniqueIds)
{
String frequency = String.valueOf(Collections.frequency(listOfIds, uniqueId));
System.out.println("ID = " + uniqueId + ", frequency = " + frequency);
if (!map.containsKey(frequency))
{
map.put(frequency, new ArrayList<String>());
}
map.get(frequency).add(uniqueId);
}
for (Map.Entry<String, List<String>> entry : map.entrySet())
{
System.out.println("Count = "+ entry.getKey() + ", IDs = " + entry.getValue());
}
}
}

One approach I can suggest is to put each "character" in a HashMap as the key, with its count as the value.
Sample code:
String str = "1,1,2,2,2,1,3";
HashMap<String, String> map = new HashMap<>();
for (String c : str.split(",")) {
if (map.containsKey( c)) {
int count = Integer.parseInt(map.get(c));
map.put(c, ++count + "");
} else
map.put(c, "1");
}
System.out.println(map.toString());

First split the string on "," and store the parts in an array. Then iterate over the array: if an element is not yet in the map, put it in as a key with a count of 1; otherwise increase its count by 1.
For example:
String str = "1,1,2,2,2,1,3";
String[] strArray = str.split(",");
Map<String, Integer> strMap = new HashMap<>();
for (int i = 0; i < strArray.length; i++) {
if (!strMap.containsKey(strArray[i])) {
strMap.put(strArray[i], 1);
} else {
strMap.put(strArray[i], strMap.get(strArray[i]) + 1);
}
}

String str="1,1,2,2,2,1,3";
//Converting given string to string array
String[] strArray = str.split(",");
//Creating a HashMap containing char as a key and occurrences as a value
Map<String,Integer> charCountMap = new HashMap<String, Integer>();
//checking each element of strArray
for(String num :strArray){
if(charCountMap.containsKey(num))
{
//If the id is already in charCountMap, increment its count by 1
charCountMap.put(num, charCountMap.get(num)+1);
}
else
{
//If the id is not in charCountMap yet, add it with 1 as its value
charCountMap.put(num, 1);
}
}
//Printing the charCountMap
for (Map.Entry<String, Integer> entry : charCountMap.entrySet())
{
System.out.println("ID ="+entry.getKey() + " count=" + entry.getValue());
}

// Split according to comma (str is the input string from the question)
String[] tokens = str.split(",");
HashMap<String, Integer> hm = new HashMap<String, Integer>();
for (String key : tokens) {
if (hm.containsKey(key)) {
Integer currentCount = hm.get(key);
hm.put(key, ++currentCount);
} else {
hm.put(key, 1);
}
}
// Group the ids by their count
HashMap<Integer, String> result = new HashMap<Integer, String>();
for (Map.Entry<String, Integer> entry : hm.entrySet()) {
Integer newKey = entry.getValue();
if (result.containsKey(newKey)) {
String newValue = entry.getKey() + ", " + result.get(newKey);
result.put(newKey, newValue);
} else {
result.put(newKey, entry.getKey());
}
}

Here is a complete Java 8 streams solution for the problem. The main idea is to first build a map of the occurrences of each id, which results in:
{1=3, 2=3, 3=1}
(the key is the id and the value is the count), and then to group the ids by count:
public static void main(String[] args) {
String str = "1,1,2,2,2,1,3";
System.out.println(
Pattern.compile(",").splitAsStream(str)
.collect(groupingBy(identity(), counting()))
.entrySet().stream()
.collect(groupingBy(i -> i.getValue(), mapping( i -> i.getKey(), toList())))
);
}
which results in:
{1=[3], 3=[1, 2]}
This is the most compact version I could come up with. Is there anything even smaller?
EDIT: By the way here is the complete class, to get all static method imports right:
import static java.util.function.Function.identity;
import java.util.regex.Pattern;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;
public class Java8StreamsTest6 {
public static void main(String[] args) {
String str = "1,1,2,2,2,1,3";
System.out.println(
Pattern.compile(",").splitAsStream(str)
.collect(groupingBy(identity(), counting()))
.entrySet().stream()
.collect(groupingBy(i -> i.getValue(), mapping(i -> i.getKey(), toList())))
);
}
}
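Not smaller, but if the printed order matters: the two-argument groupingBy collects into a map whose iteration order is not guaranteed. A variant that passes TreeMap::new as the map factory keeps both stages sorted. This is a sketch; the class and method names are made up.

```java
import static java.util.function.Function.identity;
import static java.util.stream.Collectors.counting;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.mapping;
import static java.util.stream.Collectors.toList;

import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical class name, not part of the original answer.
class SortedGrouping {

    static Map<Long, List<String>> idsByCount(String str) {
        return Pattern.compile(",").splitAsStream(str)
                // id -> count, sorted by id
                .collect(groupingBy(identity(), TreeMap::new, counting()))
                .entrySet().stream()
                // count -> ids, sorted by count
                .collect(groupingBy(Map.Entry::getValue, TreeMap::new,
                        mapping(Map.Entry::getKey, toList())));
    }

    public static void main(String[] args) {
        // prints {1=[3], 3=[1, 2]}
        System.out.println(idsByCount("1,1,2,2,2,1,3"));
    }
}
```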

Same word occurrence in a string. Java [duplicate]

I am writing a very basic Java program that calculates the frequency of each word in a sentence. So far I have managed to do this much:
import java.io.*;
class Linked {
public static void main(String args[]) throws IOException {
BufferedReader br = new BufferedReader(
new InputStreamReader(System.in));
System.out.println("Enter the sentence");
String st = br.readLine();
st = st + " ";
int a = lengthx(st);
String arr[] = new String[a];
int p = 0;
int c = 0;
for (int j = 0; j < st.length(); j++) {
if (st.charAt(j) == ' ') {
arr[p++] = st.substring(c,j);
c = j + 1;
}
}
}
static int lengthx(String a) {
int p = 0;
for (int j = 0; j < a.length(); j++) {
if (a.charAt(j) == ' ') {
p++;
}
}
return p;
}
}
I have extracted each word and stored it in an array. Now the problem is how to count the number of times each word is repeated, and how to display the result so that repeated words are not shown multiple times. Can you help me with this?

Use a map with the word as the key and the count as the value, something like this:
Map<String, Integer> map = new HashMap<>();
for (String w : words) {
Integer n = map.get(w);
n = (n == null) ? 1 : ++n;
map.put(w, n);
}
If you are not allowed to use java.util, you can sort arr using some sorting algorithm and do this:
String[] words = new String[arr.length];
int[] counts = new int[arr.length];
words[0] = arr[0];
counts[0] = 1;
for (int i = 1, j = 0; i < arr.length; i++) {
if (words[j].equals(arr[i])) {
counts[j]++;
} else {
j++;
words[j] = arr[i];
counts[j] = 1;
}
}
An interesting solution with ConcurrentHashMap since Java 8
ConcurrentMap<String, Integer> m = new ConcurrentHashMap<>();
m.compute("x", (k, v) -> v == null ? 1 : v + 1);

In Java 8, you can write this in two simple lines! In addition you can take advantage of parallel computing.
Here's the most beautiful way to do this:
Stream<String> stream = Stream.of(text.toLowerCase().split("\\W+")).parallel();
Map<String, Long> wordFreq = stream
.collect(Collectors.groupingBy(String::toString,Collectors.counting()));

import java.util.*;
public class WordCounter {
public static void main(String[] args) {
String s = "this is a this is this a this yes this is a this what it may be i do not care about this";
String a[] = s.split(" ");
Map<String, Integer> words = new HashMap<>();
for (String str : a) {
if (words.containsKey(str)) {
words.put(str, 1 + words.get(str));
} else {
words.put(str, 1);
}
}
System.out.println(words);
}
}
Output:
{a=3, be=1, may=1, yes=1, this=7, about=1, i=1, is=3, it=1, do=1, not=1, what=1, care=1}

Try this
public class Main
{
public static void main(String[] args)
{
String text = "the quick brown fox jumps fox fox over the lazy dog brown";
String[] keys = text.split(" ");
String[] uniqueKeys;
int count = 0;
System.out.println(text);
uniqueKeys = getUniqueKeys(keys);
for(String key: uniqueKeys)
{
if(null == key)
{
break;
}
for(String s : keys)
{
if(key.equals(s))
{
count++;
}
}
System.out.println("Count of ["+key+"] is : "+count);
count=0;
}
}
private static String[] getUniqueKeys(String[] keys)
{
String[] uniqueKeys = new String[keys.length];
uniqueKeys[0] = keys[0];
int uniqueKeyIndex = 1;
boolean keyAlreadyExists = false;
for(int i=1; i<keys.length ; i++)
{
for(int j=0; j<=uniqueKeyIndex; j++)
{
if(keys[i].equals(uniqueKeys[j]))
{
keyAlreadyExists = true;
}
}
if(!keyAlreadyExists)
{
uniqueKeys[uniqueKeyIndex] = keys[i];
uniqueKeyIndex++;
}
keyAlreadyExists = false;
}
return uniqueKeys;
}
}
Output:
the quick brown fox jumps fox fox over the lazy dog brown
Count of [the] is : 2
Count of [quick] is : 1
Count of [brown] is : 2
Count of [fox] is : 3
Count of [jumps] is : 1
Count of [over] is : 1
Count of [lazy] is : 1
Count of [dog] is : 1

From Java 10 you can use the following:
import java.util.Arrays;
import java.util.stream.Collectors;
public class StringFrequencyMap {
public static void main(String... args){
String[] wordArray = {"One", "One", "Two","Three", "Two", "two"};
var freq = Arrays.stream(wordArray)
.collect(Collectors.groupingBy(x -> x, Collectors.counting()));
System.out.println(freq);
}
}
Output:
{One=2, two=1, Two=2, Three=1}

You could try this
public static void frequency(String s) {
String trimmed = s.trim().replaceAll(" +", " ");
String[] a = trimmed.split(" ");
ArrayList<Integer> p = new ArrayList<>();
for (int i = 0; i < a.length; i++) {
if (p.contains(i)) {
continue;
}
int d = 1;
for (int j = i+1; j < a.length; j++) {
if (a[i].equals(a[j])) {
d += 1;
p.add(j);
}
}
System.out.println("Count of "+a[i]+" is:"+d);
}
}

package naresh.java;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;
public class StringWordDuplicates {
static void duplicate(String inputString){
HashMap<String, Integer> wordCount = new HashMap<String,Integer>();
String[] words = inputString.split(" ");
for(String word : words){
if(wordCount.containsKey(word)){
wordCount.put(word, wordCount.get(word)+1);
}
else{
wordCount.put(word, 1);
}
}
//Extracting of all keys of word count
Set<String> wordsInString = wordCount.keySet();
for(String word : wordsInString){
if(wordCount.get(word)>1){
System.out.println(word+":"+wordCount.get(word));
}
}
}
public static void main(String args[]){
duplicate("I am Java Programmer and IT Server Programmer with Java as Best Java lover");
}
}

class find
{
public static void main(String nm,String w)
{
int l,i;
int c=0;
l=nm.length();String b="";
for(i=0;i<l;i++)
{
char d=nm.charAt(i);
if(d!=' ')
{
b=b+d;
}
if(d==' ')
{
if(b.compareTo(w)==0)
{
c++;
}
b="";
}
}
System.out.println(c);
}
}

import java.util.*;
public class wordFrequency {
private static Scanner scn;
public static void countwords(String sent) {
sent = sent.toLowerCase().replaceAll("[^a-z ]", "");
ArrayList<String> arr = new ArrayList<String>();
String[] sentarr = sent.split(" ");
Map<String, Integer> a = new HashMap<String, Integer>();
for (String word : sentarr) {
arr.add(word);
}
for (String word : arr) {
int count = Collections.frequency(arr, word);
a.put(word, count);
}
for (String key : a.keySet()) {
System.out.println(key + " = " + a.get(key));
}
}
public static void main(String[] args) {
scn = new Scanner(System.in);
System.out.println("Enter sentence:");
String inp = scn.nextLine();
countwords(inp);
}
}

Determine the frequency of words in a file.
File f = new File(fileName);
Scanner s = new Scanner(f);
Map<String, Integer> counts =
new HashMap<String, Integer>();
while( s.hasNext() ){
String word = s.next();
if( !counts.containsKey( word ) )
counts.put( word, 1 );
else
counts.put( word,
counts.get(word) + 1 );
}
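The same loop, wrapped in a method and closed with try-with-resources, might look like this. It is only a sketch: TokenCounter and count are made-up names, and a Scanner over a String stands in for the file so the example runs without one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical helper class, not part of the original answer.
class TokenCounter {

    // Counts whitespace-separated tokens from any Scanner source.
    static Map<String, Integer> count(Scanner s) {
        Map<String, Integer> counts = new HashMap<>();
        while (s.hasNext()) {
            counts.merge(s.next(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // For a real file, use new Scanner(new File(fileName)) here instead.
        try (Scanner s = new Scanner("to be or not to be")) {
            System.out.println(count(s));
        }
    }
}
```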

The following program finds the frequency, sorts it accordingly, and prints it.
Below is the output grouped by frequency:
0-10:
The 2
Is 4
11-20:
Have 13
Done 15
Here is my program:
package com.company;
import java.io.*;
import java.util.*;
import java.lang.*;
/**
* Created by ayush on 12/3/17.
*/
public class Linked {
public static void main(String args[]) throws IOException {
BufferedReader br = new BufferedReader(
new InputStreamReader(System.in));
System.out.println("Enter the sentence");
String st = br.readLine();
st=st.trim();
st = st + " ";
int count = lengthx(st);
System.out.println(count);
String arr[] = new String[count];
int p = 0;
int c = 0;
for (int i = 0; i < st.length(); i++) {
if (st.charAt(i) == ' ') {
arr[p] = st.substring(c,i);
System.out.println(arr[p]);
c = i + 1;
p++;
}
}
Map<String, Integer> map = new HashMap<>();
for (String w : arr) {
Integer n = map.get(w);
n = (n == null) ? 1 : ++n;
map.put(w, n);
}
for (String key : map.keySet()) {
System.out.println(key + " = " + map.get(key));
}
Set<Map.Entry<String, Integer>> entries = map.entrySet();
Comparator<Map.Entry<String, Integer>> valueComparator = new Comparator<Map.Entry<String,Integer>>() {
@Override
public int compare(Map.Entry<String, Integer> e1, Map.Entry<String, Integer> e2) {
Integer v1 = e1.getValue();
Integer v2 = e2.getValue();
return v1.compareTo(v2); }
};
List<Map.Entry<String, Integer>> listOfEntries = new ArrayList<Map.Entry<String, Integer>>(entries);
Collections.sort(listOfEntries, valueComparator);
LinkedHashMap<String, Integer> sortedByValue = new LinkedHashMap<String, Integer>(listOfEntries.size());
for(Map.Entry<String, Integer> entry : listOfEntries){
sortedByValue.put(entry.getKey(), entry.getValue());
}
System.out.println("HashMap after sorting entries by values ");
Set<Map.Entry<String, Integer>> entrySetSortedByValue = sortedByValue.entrySet();
for(Map.Entry<String, Integer> mapping : entrySetSortedByValue){
System.out.println(mapping.getKey() + " ==> " + mapping.getValue());
}
}
static int lengthx(String a) {
int count = 0;
for (int j = 0; j < a.length(); j++) {
if (a.charAt(j) == ' ') {
count++;
}
}
return count;
}
}
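With Java 8 or later, the sort-by-value part can be shortened using Map.Entry.comparingByValue. A sketch under that assumption; SortByValue and sortByValue are made-up names, and the tie-break on the key is my addition so that the output order is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, not part of the original answer.
class SortByValue {

    static LinkedHashMap<String, Integer> sortByValue(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(map.entrySet());
        // Ascending by count; ties are broken by key.
        entries.sort(Map.Entry.<String, Integer>comparingByValue()
                .thenComparing(Map.Entry.comparingByKey()));
        // LinkedHashMap remembers insertion order, so the sorted order is kept.
        LinkedHashMap<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("b", 2);
        counts.put("a", 2);
        counts.put("c", 1);
        // prints {c=1, a=2, b=2}
        System.out.println(sortByValue(counts));
    }
}
```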

Simply use the Java 8 stream collector Collectors.groupingBy:
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;
static String[] COUNTRY_NAMES
= { "China", "Australia", "India", "USA", "USSR", "UK", "China",
"France", "Poland", "Austria", "India", "USA", "Egypt", "China" };
Map<String, Long> result = Stream.of(COUNTRY_NAMES).collect(
Collectors.groupingBy(Function.identity(), Collectors.counting()));

Count the frequency of the elements of a list in Java 8:
List<Integer> list = new ArrayList<Integer>();
Collections.addAll(list,3,6,3,8,4,9,3,6,9,4,8,3,7,2);
Map<Integer, Long> frequencyMap = list.stream().collect(Collectors.groupingBy(Function.identity(),Collectors.counting()));
System.out.println(frequencyMap);
Note:
To count string frequencies, split the string, convert the array to a list, and use the same stream pipeline to count; the result is then a Map<String, Long> frequencyMap.

String s[]=st.split(" ");
String sf[]=new String[s.length];
int count[]=new int[s.length];
sf[0]=s[0];
int j=1;
count[0]=1;
for(int i=1;i<s.length;i++)
{
int t=j-1;
while(t>=0)
{
if(s[i].equals(sf[t]))
{
count[t]++;
break;
}
t--;
}
if(t<0)
{
sf[j]=s[i];
count[j]++;
j++;
}
}

Here is a simple, easy-to-understand solution for this problem that covers the test cases listed in the comment:
import java.util.HashMap;
import java.util.Map;
/*
* Problem Statement - Count Frequency of each word in a given string, ignoring special characters and space
* Input 1 - "To be or Not to be"
* Output 1 - to(2 times), be(2 times), or(1 time), not(1 time)
*
* Input 2 -"Star 123 ### 123 star"
* Output - Star(2 times), 123(2 times)
*/
public class FrequencyofWords {
public static void main(String[] args) {
String s1="To be or not **** to be! is all i ask for";
fnFrequencyofWords(s1);
}
//-------Supporting Function-----------------
static void fnFrequencyofWords(String s1) {
//------- Convert String to proper format----
s1=s1.replaceAll("[^A-Za-z0-9\\s]","");
s1=s1.replaceAll(" +"," ");
s1=s1.toLowerCase();
//-------Create String to an array with words------
String[] s2=s1.split(" ");
System.out.println(s1);
//-------- Create a HashMap to store each word and its count--
Map <String , Integer> map=new HashMap<String, Integer>();
for(int i=0;i<s2.length;i++) {
if(map.containsKey(s2[i])) //---- Check whether the word already exists---
{
map.put(s2[i], 1+ map.get(s2[i])); //-- Increment value by 1 if the word already exists--
}
else {
map.put(s2[i], 1); // --- Add the word to the map with value 1 if it does not exist yet--
}
}
System.out.println(map); //--- Print the HashMap with Key, Value Pair-------
}
}

public class WordFrequencyProblem {
public static void main(String args[]){
String s="the quick brown fox jumps fox fox over the lazy dog brown";
String alreadyProcessedWords="";
boolean isCount=false;
String[] splitWord = s.split("\\s|\\.");
for(int i=0;i<splitWord.length;i++){
String word = splitWord[i];
int count = 0;
isCount=false;
if(!alreadyProcessedWords.contains(word)){
for(int j=0;j<splitWord.length;j++){
if(word.equals(splitWord[j])){
count++;
isCount = true;
alreadyProcessedWords=alreadyProcessedWords+word+" ";
}
}
}
if(isCount)
System.out.println(word + " present " + count);
}
}
}

import java.util.Arrays;
import java.util.Collections;
import java.util.List;
public class TestSplit {
public static void main(String[] args) {
String input="Find the repeated word which is repeated in this string";
List<String> output = Arrays.asList(input.split(" "));
for(String str: output) {
int occurrences = Collections.frequency(output, str);
System.out.println("Occurrence of " + str+ " is "+occurrences);
}
System.out.println(output);
}
}

Please try this; it may help you.
public static void main(String[] args) {
String str1="I am indian , I am proud to be indian proud.";
Map<String,Integer> map=findFrquenciesInString(str1);
System.out.println(map);
}
private static Map<String,Integer> findFrquenciesInString(String str1) {
String[] strArr=str1.split(" ");
Map<String,Integer> map=new HashMap<>();
for(int i=0;i<strArr.length;i++) {
int count=1;
for(int j=i+1;j<strArr.length;j++) {
if(strArr[i].equals(strArr[j]) && !strArr[i].equals("-1")) {
strArr[j]="-1";
count++;
}
}
if(count>1 && !strArr[i].equals("-1")) {
map.put(strArr[i], count);
strArr[i]="-1";
}
}
return map;
}

try this
public void count()throws IOException
{
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
System.out.println("Enter the string");
String s = in.readLine();
int l = s.length();
int a=0,b=0,c=0,i,j,y=0;
char d;
String x;
String n[] = new String [50];
int m[] = new int [50];
for (i=0;i<50;i++)
{
m[i]=0;
}
for (i=0;i<l;i++)
{
d = s.charAt(i);
if((d==' ')||(d=='.'))
{
x = s.substring(a,i);
a= i+1;
for(j=0;j<b;j++)
{
if(x.equalsIgnoreCase(n[j]) == true)
{
m[j]++;
c = 1;
}
}
if(c==0)
{
n[b] = x;
m[b] = 1;
b++;
}
}
c=0;
}
for(i=0;i<b;i++)
{
for (j=0;j<b;j++)
{
if(y<m[j])
{
y=m[j];
}
}
if(m[i]==y)
{
System.out.println(n[i] + " : " + m[i]);
m[i]=0;
}
y=0;
}
}

Java Inverted Index program

I am writing an inverted index program in Java that returns the frequency of terms across multiple documents. I have been able to return the number of times a word appears in the entire collection, but I have not been able to return which documents the word appears in. This is the code I have so far:
import java.util.*; // Provides TreeMap, Iterator, Scanner
import java.io.*; // Provides FileReader, FileNotFoundException
public class Run
{
public static void main(String[ ] args)
{
// **THIS CREATES A TREE MAP**
TreeMap<String, Integer> frequencyData = new TreeMap<String, Integer>( );
Map[] mapArray = new Map[5];
mapArray[0] = new HashMap<String, Integer>();
readWordFile(frequencyData);
printAllCounts(frequencyData);
}
public static int getCount(String word, TreeMap<String, Integer> frequencyData)
{
if (frequencyData.containsKey(word))
{ // The word has occurred before, so get its count from the map
return frequencyData.get(word); // Auto-unboxed
}
else
{ // No occurrences of this word
return 0;
}
}
public static void printAllCounts(TreeMap<String, Integer> frequencyData)
{
System.out.println("-----------------------------------------------");
System.out.println(" Occurrences Word");
for(String word : frequencyData.keySet( ))
{
System.out.printf("%15d %s\n", frequencyData.get(word), word);
}
System.out.println("-----------------------------------------------");
}
public static void readWordFile(TreeMap<String, Integer> frequencyData)
{
int total = 0;
Scanner wordFile;
String word; // A word read from the file
Integer count; // The number of occurrences of the word
int counter = 0;
int docs = 0;
//**FOR LOOP TO READ THE DOCUMENTS**
for(int x=0; x<Docs.length; x++)
{ //start of for loop [*
try
{
wordFile = new Scanner(new FileReader(Docs[x]));
}
catch (FileNotFoundException e)
{
System.err.println(e);
return;
}
while (wordFile.hasNext( ))
{
// Read the next word and get rid of the end-of-line marker if needed:
word = wordFile.next( );
// This makes the Word lower case.
word = word.toLowerCase();
word = word.replaceAll("[^a-zA-Z0-9\\s]", "");
// Get the current count of this word, add one, and then store the new count:
count = getCount(word, frequencyData) + 1;
frequencyData.put(word, count);
total = total + count;
counter++;
docs = x + 1;
}
} //End of for loop *]
System.out.println("There are " + total + " terms in the collection.");
System.out.println("There are " + counter + " unique terms in the collection.");
System.out.println("There are " + docs + " documents in the collection.");
}
// Array of documents
static String Docs [] = {"words.txt", "words2.txt",};
}

Instead of simply having a Map from word to count, create a Map from each word to a nested Map from document to count. In other words:
Map<String, Map<String, Integer>> wordToDocumentMap;
Then, inside your loop which records the counts, you want to use code which looks like this:
Map<String, Integer> documentToCountMap = wordToDocumentMap.get(currentWord);
if(documentToCountMap == null) {
// This word has not been found anywhere before,
// so create a Map to hold its per-document counts.
documentToCountMap = new TreeMap<>();
wordToDocumentMap.put(currentWord, documentToCountMap);
}
Integer currentCount = documentToCountMap.get(currentDocument);
if(currentCount == null) {
// This word has not been found in this document before, so
// set the initial count to zero.
currentCount = 0;
}
documentToCountMap.put(currentDocument, currentCount + 1);
Now you're capturing the counts on a per-word and per-document basis.
Once you've completed the analysis and you want to print a summary of the results, you can run through the map like so:
for(Map.Entry<String, Map<String,Integer>> wordToDocument :
wordToDocumentMap.entrySet()) {
String currentWord = wordToDocument.getKey();
Map<String, Integer> documentToWordCount = wordToDocument.getValue();
for(Map.Entry<String, Integer> documentToFrequency :
documentToWordCount.entrySet()) {
String document = documentToFrequency.getKey();
Integer wordCount = documentToFrequency.getValue();
System.out.println("Word " + currentWord + " found " + wordCount +
" times in document " + document);
}
}
For an explanation of the for-each structure in Java, see this tutorial page.
For a good explanation of the features of the Map interface, including the entrySet method, see this tutorial page.
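The nested-map idea above can be tried without any files. The following is only a minimal sketch: the class name PerDocumentCounts, the method count, and passing documents as a name-to-text map are assumptions made for illustration, not part of the original program. It also uses computeIfAbsent and merge (Java 8 or later) in place of the explicit null checks.

```java
import java.util.Map;
import java.util.TreeMap;

public class PerDocumentCounts {
    // Builds word -> (document name -> count). Documents are passed in as
    // name -> text so the sketch runs without files on disk (an assumption).
    public static Map<String, Map<String, Integer>> count(Map<String, String> documents) {
        Map<String, Map<String, Integer>> wordToDocumentMap = new TreeMap<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            for (String raw : doc.getValue().split("\\s+")) {
                // Normalisation similar to the question: lower case, strip punctuation.
                String word = raw.toLowerCase().replaceAll("[^a-z0-9]", "");
                if (word.isEmpty()) {
                    continue; // token was only punctuation or whitespace
                }
                wordToDocumentMap
                        .computeIfAbsent(word, k -> new TreeMap<>()) // inner map on first sight
                        .merge(doc.getKey(), 1, Integer::sum);       // add one for this document
            }
        }
        return wordToDocumentMap;
    }

    public static void main(String[] args) {
        Map<String, String> documents = new TreeMap<>();
        documents.put("words.txt", "The cat saw the dog.");
        documents.put("words2.txt", "A dog barked.");
        System.out.println(count(documents));
    }
}
```

Using TreeMap at both levels keeps words and document names sorted when printed; HashMap would work equally well if order does not matter.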

Try adding a second map from each word to the set of document names it appears in, like this:
Map<String, Set<String>> filenames = new HashMap<String, Set<String>>();
...
word = word.replaceAll("[^a-zA-Z0-9\\s]", "");
// Get the current count of this word, add one, and then store the new count:
count = getCount(word, frequencyData) + 1;
frequencyData.put(word, count);
Set<String> filenamesForWord = filenames.get(word);
if (filenamesForWord == null) {
filenamesForWord = new HashSet<String>();
}
filenamesForWord.add(Docs[x]);
filenames.put(word, filenamesForWord);
total = total + count;
counter++;
docs = x + 1;
When you need the set of filenames in which a particular word was encountered, just get() it from the map filenames. Here is an example that prints, for every word, all the file names in which it was encountered:
public static void printAllCounts(TreeMap<String, Integer> frequencyData, Map<String, Set<String>> filenames) {
System.out.println("-----------------------------------------------");
System.out.println(" Occurrences Word");
for(String word : frequencyData.keySet( ))
{
System.out.printf("%15d %s\n", frequencyData.get(word), word);
for (String filename : filenames.get(word)) {
System.out.println(filename);
}
}
System.out.println("-----------------------------------------------");
}
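The word-to-filenames map can also be wrapped in a small class so that a lookup for an unknown word never returns null. This is a sketch under assumptions: the class FilenameIndex and its methods add and filesFor are hypothetical names, not taken from the code above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class FilenameIndex {
    private final Map<String, Set<String>> filenames = new HashMap<>();

    // Records that the word was seen in the given file.
    public void add(String word, String filename) {
        // computeIfAbsent creates the set the first time a word is seen;
        // a Set ignores duplicates, so repeated words in one file are harmless.
        filenames.computeIfAbsent(word, k -> new TreeSet<>()).add(filename);
    }

    // Returns the files containing the word, or an empty set if it was never seen.
    public Set<String> filesFor(String word) {
        return filenames.getOrDefault(word, Collections.emptySet());
    }

    public static void main(String[] args) {
        FilenameIndex index = new FilenameIndex();
        index.add("dog", "words.txt");
        index.add("dog", "words2.txt");
        index.add("dog", "words.txt"); // duplicate, ignored by the set
        System.out.println(index.filesFor("dog"));
        System.out.println(index.filesFor("cat"));
    }
}
```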

I've put a Scanner into the main method, and searching for a word returns the documents the word occurs in. I also return how many times the word occurs, but I can only get the total across all three documents. I want it to return how many times the word occurs in each document, so that I can calculate tf-idf. If you have a complete answer for the whole tf-idf calculation, I would appreciate it. Cheers
Here is my code:
import java.util.*; // Provides TreeMap, Iterator, Scanner
import java.io.*; // Provides FileReader, FileNotFoundException
public class test2
{
public static void main(String[ ] args)
{
// **THIS CREATES A TREE MAP**
TreeMap<String, Integer> frequencyData = new TreeMap<String, Integer>();
Map<String, Set<String>> filenames = new HashMap<String, Set<String>>();
Map<String, Integer> countByWords = new HashMap<String, Integer>();
Map[] mapArray = new Map[5];
mapArray[0] = new HashMap<String, Integer>();
readWordFile(countByWords, frequencyData, filenames);
printAllCounts(countByWords, frequencyData, filenames);
}
public static int getCount(String word, TreeMap<String, Integer> frequencyData)
{
if (frequencyData.containsKey(word))
{ // The word has occurred before, so get its count from the map
return frequencyData.get(word); // Auto-unboxed
}
else
{ // No occurrences of this word
return 0;
}
}
public static void printAllCounts( Map<String, Integer> countByWords, TreeMap<String, Integer> frequencyData, Map<String, Set<String>> filenames)
{
System.out.println("-----------------------------------------------");
System.out.print("Search for a word: ");
String worde;
int result = 0;
Scanner input = new Scanner(System.in);
worde=input.nextLine();
if(!filenames.containsKey(worde)){
System.out.println("The word does not exist");
}
else{
for(String filename : filenames.get(worde)){
System.out.println(filename);
System.out.println(countByWords.get(worde));
}
}
System.out.println("\n-----------------------------------------------");
}
public static void readWordFile(Map<String, Integer> countByWords ,TreeMap<String, Integer> frequencyData, Map<String, Set<String>> filenames)
{
Scanner wordFile;
String word; // A word read from the file
Integer count; // The number of occurrences of the word
int counter = 0;
int docs = 0;
//**FOR LOOP TO READ THE DOCUMENTS**
for(int x=0; x<Docs.length; x++)
{ //start of for loop [*
try
{
wordFile = new Scanner(new FileReader(Docs[x]));
}
catch (FileNotFoundException e)
{
System.err.println(e);
return;
}
while (wordFile.hasNext( ))
{
// Read the next word and get rid of the end-of-line marker if needed:
word = wordFile.next( );
// This makes the Word lower case.
word = word.toLowerCase();
word = word.replaceAll("[^a-zA-Z0-9\\s]", "");
// Get the current count of this word, add one, and then store the new count:
count = countByWords.get(word);
if(count != null){
countByWords.put(word, count + 1);
}
else{
countByWords.put(word, 1);
}
Set<String> filenamesForWord = filenames.get(word);
if (filenamesForWord == null) {
filenamesForWord = new HashSet<String>();
}
filenamesForWord.add(Docs[x]);
filenames.put(word, filenamesForWord);
counter++;
docs = x + 1;
}
} //End of for loop *]
System.out.println("There are " + counter + " terms in the collection.");
System.out.println("There are " + docs + " documents in the collection.");
}
// Array of documents
static String Docs [] = {"Document1.txt", "Document2.txt", "Document3.txt"};
}
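One way to get per-document counts, and a tf-idf value from them, is to keep one count map per document instead of the single countByWords map. The sketch below is an illustration only: the class TfIdfSketch and its methods are hypothetical, documents are passed as in-memory strings rather than read from files, and the tf-idf formula is one common variant (term frequency divided by document length, natural-log inverse document frequency). Other definitions exist.

```java
import java.util.HashMap;
import java.util.Map;

public class TfIdfSketch {
    // Builds document name -> (word -> occurrences in that document).
    // Documents are given as name -> text so no files are needed (an assumption).
    public static Map<String, Map<String, Integer>> countPerDocument(Map<String, String> documents) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (Map.Entry<String, String> doc : documents.entrySet()) {
            Map<String, Integer> countByWords = new HashMap<>();
            for (String raw : doc.getValue().split("\\s+")) {
                String word = raw.toLowerCase().replaceAll("[^a-z0-9]", "");
                if (!word.isEmpty()) {
                    countByWords.merge(word, 1, Integer::sum);
                }
            }
            counts.put(doc.getKey(), countByWords);
        }
        return counts;
    }

    // tf-idf under one common definition (assumed here, variants exist):
    //   tf  = occurrences of word in document / total words in document
    //   idf = ln(number of documents / number of documents containing word)
    public static double tfIdf(Map<String, Map<String, Integer>> counts, String word, String document) {
        Map<String, Integer> inDocument = counts.get(document);
        if (inDocument == null || !inDocument.containsKey(word)) {
            return 0.0; // unknown document, or word absent from it
        }
        int wordsInDocument = 0;
        for (int c : inDocument.values()) {
            wordsInDocument += c;
        }
        int documentsWithWord = 0;
        for (Map<String, Integer> other : counts.values()) {
            if (other.containsKey(word)) {
                documentsWithWord++;
            }
        }
        double tf = (double) inDocument.get(word) / wordsInDocument;
        double idf = Math.log((double) counts.size() / documentsWithWord);
        return tf * idf;
    }

    public static void main(String[] args) {
        Map<String, String> documents = new HashMap<>();
        documents.put("Document1.txt", "dog dog cat");
        documents.put("Document2.txt", "cat bird");
        Map<String, Map<String, Integer>> counts = countPerDocument(documents);
        for (String document : counts.keySet()) {
            System.out.println(document + ": dog occurs "
                    + counts.get(document).getOrDefault("dog", 0) + " times, tf-idf = "
                    + tfIdf(counts, "dog", document));
        }
    }
}
```

With this definition a word that appears in every document gets an idf of ln(1) = 0, so its tf-idf is 0 regardless of how often it occurs.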

Create word count of text using hashmap

I am trying to write a program as a tutorial for myself on HashMaps. I ask the user to enter text, split it into words, store the words in a HashMap, and then increase the count when a word repeats. This is my program:
import java.util.*;
import java.lang.*;
import javax.swing.JOptionPane;
import java.io.*;
public class TestingTables
{
public static void main(String args[])
{
{
String s = JOptionPane.showInputDialog("Enter any text.");
String[] splitted = s.split(" ");
HashMap hm = new HashMap();
int x;
for (int i=0; i<splitted.length ; i++) {
hm.put(splitted[i], i);
System.out.println(splitted[i] + " " + i);
if (hm.containsKey(splitted[i])) {
x = ((Integer)hm.get(splitted[i])).intValue();
hm.put(splitted[i], new Integer(x+1)); }
}
}
}
}
When I input "random random random", I get:
random 0
random 1
random 2
What do I need to change so I get:
random 3
Also, do I need to use an iterator to print out the hashmap, or is what I used OK?

Your initialization is wrong: hm.put(splitted[i], i) stores the word's index, not a count.
You should initialize to 0 or to 1 (to count, not to index).
So do this loop first:
for (int i = 0; i < splitted.length; i++) {
if (!hm.containsKey(splitted[i])) {
hm.put(splitted[i], 1);
} else {
hm.put(splitted[i], (Integer) hm.get(splitted[i]) + 1);
}
}
Then just do one more loop (iterate through the keys of the HashMap) and print the counts out.
for (Object word : hm.keySet()){
System.out.println(word + " " + (Integer) hm.get(word));
}

import java.util.*;
import java.lang.*;
import javax.swing.JOptionPane;
import java.io.*;
public class TestingTables
{
public static void main(String args[])
{
{
String s = JOptionPane.showInputDialog("Enter any text.");
String[] splitted = s.split(" ");
// Typed map: the values are Integer, so no casts are needed
Map<String, Integer> hm = new HashMap<String, Integer>();
for (int i=0; i<splitted.length ; i++) {
if (hm.containsKey(splitted[i])) {
// Word seen before: increment its count
int cont = hm.get(splitted[i]);
hm.put(splitted[i], cont + 1);
} else {
// First occurrence of this word
hm.put(splitted[i], 1);
}
}
}
}
}
Your Map declaration uses a raw type. Declare the Map with type parameters, as in Map<String, Integer>, so that no casts are needed.

This should work; it's a pretty simple implementation:
Map<String, Integer> hm = new HashMap<String, Integer>();
int x;
for (int i = 0; i < splitted.length; i++) {
if (hm.containsKey(splitted[i])) {
x = hm.get(splitted[i]);
hm.put(splitted[i], x + 1);
} else {
hm.put(splitted[i], 1);
}
}
for (String key : hm.keySet()) {
System.out.println(key + " " + hm.get(key));
}
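On Java 8 or later, the containsKey/get/put sequence can be collapsed into a single Map.merge call. The following is a minimal sketch of that variant; the class name MergeWordCount and the method countWords are made up for illustration, and the text is passed as a parameter instead of being read from a dialog.

```java
import java.util.Map;
import java.util.TreeMap;

public class MergeWordCount {
    // Counts how often each whitespace-separated word occurs in the text.
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // Stores 1 for a new word, otherwise adds 1 to the existing count.
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("random random random")); // prints {random=3}
    }
}
```

Splitting on "\\s+" instead of a single space avoids empty entries when the input contains consecutive spaces.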
