Compare Lists of Pairs to find similars - java

Movie1{{'hello',5},{'foo',3}}
Movie2{{'hi',2},{'foo',2}}
While testing, I am using 2 movies, each with around 20 unique words grouped into pairs of word and frequency.
public ArrayList<Pair<String, Integer>> getWordsAndFrequency() {
String[] keys = description.split(" ");
String[] uniqueKeys;
int count = 0;
uniqueKeys = getUniqueKeys(keys);
for (String key : uniqueKeys) {
if (null == key) {
break;
}
for (String s : keys) {
if (key.equals(s)) {
count++;
}
}
words.add(Pair.of(key, count));
count = 0;
}
sortWords(words);
return words;
}

Your bug is that getWordsAndFrequency() adds more entries to words every time it is called, so the word list gets longer and longer with each call. To fix this, calculate the words and frequencies once, add these Pairs to the list, and have getWordsAndFrequency() simply return the list rather than recalculating it every time.
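The fix described above can be sketched roughly as follows. This is only a minimal sketch under assumptions: `CachedMovie` is a hypothetical stand-in for the asker's movie class, and it uses a plain `Map` instead of the `Pair` list to stay dependency-free. The counting happens once in the constructor, and the getter only hands back the stored result.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the asker's movie class (name is an assumption).
class CachedMovie {
    private final Map<String, Integer> wordFrequency = new LinkedHashMap<>();

    CachedMovie(String description) {
        // Count every word exactly once, at construction time.
        for (String word : description.split("\\s+")) {
            if (!word.isEmpty()) {
                wordFrequency.merge(word, 1, Integer::sum);
            }
        }
    }

    // Returns the stored result; repeated calls never grow the collection.
    Map<String, Integer> getWordsAndFrequency() {
        return Collections.unmodifiableMap(wordFrequency);
    }
}
```

Calling getWordsAndFrequency() any number of times now returns the same entries, because nothing is added after construction.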

Can you put the data (currently stored in an ArrayList of pairs) in a HashMap?
You can then compute the intersection of the keyword sets of the two movies and add their scores.
For example:
Map<String, Integer> keyWordsMovie1 = movie1.getWordsAndFrequency();
Map<String, Integer> keyWordsMovie2 = movie2.getWordsAndFrequency();
Set<String> intersection = new HashSet<String>(keyWordsMovie1.keySet()); // start with all keywords in movie1
intersection.retainAll(keyWordsMovie2.keySet()); // keep only the keywords movie2 also has
for (String keyWord : intersection) {
int freq1 = keyWordsMovie1.get(keyWord);
int freq2 = keyWordsMovie2.get(keyWord);
//you now have the frequencies of the keyword in both movies
}


How to get few random keys from HashMap

I have a map with titles. I want to print 10 random keys from my hashmap.
For example my map (String, Object) contains 100 pairs: "A, new Object(...)", "B, ...", "C, ..." etc.
I want to get 10 random keys from this map and append it to one string.
So my string should look like: "A\nD\nB".
A quick way to get random 10 keys without repetition is putting the keys in a list and using Collections.shuffle to shuffle the list.
Map<String, Object> map = ...yourmap
ArrayList<String> keys = new ArrayList<>(map.keySet());
Collections.shuffle(keys);
List<String> randomTenKeys = keys.subList(0, 10);
Creating a list of all keys and shuffling it is not the most efficient thing you can do. You can do it in a single pass with a reservoir sampling algorithm. I haven't looked into it but you can probably find an implementation in some Apache or Guava library.
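For reference, a single-pass reservoir sampler is short enough to write by hand. The following is a minimal sketch of the classic "Algorithm R", not a library API; the class and method names (`ReservoirSampler`, `sample`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper; the names are assumptions, not from any library.
class ReservoirSampler {
    // Picks up to k items uniformly at random in one pass over the input.
    static <T> List<T> sample(Iterable<T> items, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(k);
        int seen = 0;
        for (T item : items) {
            seen++;
            if (reservoir.size() < k) {
                reservoir.add(item);          // fill the reservoir first
            } else {
                int j = rng.nextInt(seen);    // uniform in [0, seen)
                if (j < k) {
                    reservoir.set(j, item);   // keep the new item with probability k/seen
                }
            }
        }
        return reservoir;
    }
}
```

With the map from the question this would be called as `ReservoirSampler.sample(map.keySet(), 10, new Random())`. The selection is uniform, but the order inside the returned list is not itself shuffled, so shuffle the small result if the order matters.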
Joni's answer is quite good and short, but here is a fully working example if you'd like. I split your problem into two methods: one to return a list of randomly selected keys and another to print keys in whichever way you like. You could combine the two methods into one, but it's better to keep them separate.
import java.util.*;
import java.util.stream.IntStream;
public class Test {
public static void main(String [] args){
Map<String, Object> map = new HashMap<>();
//You can use for loop instead to make a map of String, Integer.
IntStream.rangeClosed(0, 9).forEach(i -> map.put(i +"", i));//Map of 10 numbers.
List<String> keys = getRandomKeys(map, 3);
String allKeys = combineKeys(keys, "\n");
System.out.println(allKeys);
}
public static List<String> getRandomKeys(Map<String, Object> map, int keyCount) {
List<String> keys = new ArrayList<>(map.keySet());
for(int i = 0; i < map.size()-keyCount; i++){
int idx = (int) ( Math.random() * keys.size() );
keys.remove(idx);
}
return keys;
}
public static String combineKeys(List<String> keys, String separator){
String all = "";
for(int i = 0; i < keys.size() - 1; i++){
all = all + keys.get(i) + separator;
}
all += keys.get(keys.size()-1);//last element does not need separator.
return all;
}
}
A HashMap already stores its entries in no particular order, so iterating over it gives an arbitrary (though not truly random) ordering.
You can iterate over it directly:
for(Map.Entry entry : map.entrySet())
str.append(entry.getKey()+" "+entry.getValue());
However, if you want a new order every time, you can shuffle your data.
To shuffle, first copy all the keys into an array or list.
Then shuffle that list and iterate over it to get the values from the HashMap.
This is a complementary answer to Joni's answer. Use String.join to join the randomTenKeys.
Given below is Joni's answer:
Map<String, Object> map = ...yourmap
ArrayList<String> keys = new ArrayList<>(map.keySet());
Collections.shuffle(keys);
List<String> randomTenKeys = keys.subList(0, 10);
and the complementary answer is:
String joinedKeys = String.join("\n", randomTenKeys);
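// Copy the keys into a list: a Set has no positional get(int) or remove(int)
List<String> keys = new ArrayList<>(myMap.keySet());
String combined = "";
// Stop early if the map holds fewer than 10 keys
for (int i = 0; i < 10 && !keys.isEmpty(); i++)
{
    int random = (int)(Math.random() * keys.size());
    // remove(int) returns the removed key, so each key is picked at most once
    String key = keys.remove(random);
    combined += key + "\n";
}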

Getting Objects with specified values from hashtable Java

I have a hashtable with (String, Object) entries. I have to segregate all objects by the length of the key String and create an array of arrays of Strings with the same length. Can someone guide me on how to accomplish that?
My code so far:
Set<String> keys = words.keySet();
ArrayList<ArrayList<Word>> outer = new ArrayList<ArrayList<Word>>();
ArrayList<Word> inner = new ArrayList<Word>();
for(String key: keys) {
for (int i=0; i< 15; i++) {
if (key.length() == i) {
inner.add(words.get(key));
}
outer.add(i, inner);
}
}
The way you're looping is inefficient: you may not have many words of certain sizes, so you'll be needlessly checking the length of every single word against i for each length. You can just go through your list of words once, use a map to associate each length with the words of that length, and then collate the lists at the end.
Try this:
Map<Integer, List<String>> sizeMap = new HashMap<>();
for (String key : keys) {
    // Create the list for this length on first use, then add the word to it
    // (adding only an empty list on first sight would lose the first word of each length)
    sizeMap.computeIfAbsent(key.length(), k -> new ArrayList<>()).add(key);
}
// Convert the map to a list of lists
List<List<String>> outer = new ArrayList<>(sizeMap.values());
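The same grouping can also be written with streams. This is just a sketch, assuming Java 8 or later; `LengthGrouper` and `groupByLength` are hypothetical names. It collects into a TreeMap so that the groups come out ordered by length, which a plain HashMap does not promise.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper; the names are assumptions for illustration.
class LengthGrouper {
    // Groups keys by their length; the TreeMap keeps groups ordered by length.
    static List<List<String>> groupByLength(Collection<String> keys) {
        Map<Integer, List<String>> byLength = keys.stream()
                .collect(Collectors.groupingBy(String::length, TreeMap::new, Collectors.toList()));
        return new ArrayList<>(byLength.values());
    }
}
```

Note that lengths with no words simply produce no group, so the index in the outer list is not the word length.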

how to find the duplicates in ArrayList using hashmap in java?

My program reads large txt files (in MBs) which contain the source IP and destination IP (for example 192.168.125.10,112.25.2.1). Here read is an ArrayList in which the data is present.
I have generated unique ids (uid, int type) using srcip and destip, and now I am storing them in
static ArrayList<Integer[]> prev = new ArrayList<Integer[]>();
where the array is:
static Integer[] multi1;
multi1 = new Integer[]{(int)uid,count,flag};
I have to print all the uids with their counts (frequencies) using a HashMap.
Please suggest a solution.
for (ArrayList<String> read : readFiles.values())
{
if(file_count<=2)
{
for(int i=0 ; i<read.size() ; i++)
{
String str1=read.get(i).split(",")[0];//get only srcIP
String str2=read.get(i).split(",")[1];//get only destIP
StringTokenizer tokenizer1=new StringTokenizer(str1,".");
StringTokenizer tokenizer2=new StringTokenizer(str2,".");
if(tokenizer1.hasMoreTokens()&&tokenizer2.hasMoreTokens())
{
sip_oct1=Integer.parseInt(tokenizer1.nextToken());
sip_oct2=Integer.parseInt(tokenizer1.nextToken());
sip_oct3=Integer.parseInt(tokenizer1.nextToken());
sip_oct4=Integer.parseInt(tokenizer1.nextToken());
dip_oct1=Integer.parseInt(tokenizer2.nextToken());
dip_oct2=Integer.parseInt(tokenizer2.nextToken());
dip_oct3=Integer.parseInt(tokenizer2.nextToken());
dip_oct4=Integer.parseInt(tokenizer2.nextToken());
uid=uniqueIdGenerator(sip_oct1,sip_oct2,sip_oct3,sip_oct4,dip_oct1,dip_oct2,dip_oct3,dip_oct4);
}
multi1 = new Integer[]{(int)uid,count,flag};
prev.add(multi1);
System.out.println(prev.get(i)[0]);//getting uids from prev
Map<ArrayList<Integer []> , Integer> map = new HashMap<ArrayList<Integer[]>, Integer>();
for (int j=0 ; j<prev.size() ; j++)
{
Integer temp=map.get(prev.get(i)[0]);
count = map.get(temp);
map.put(temp, (count == null) ? 1 : count++);
}
printMap(map);
System.out.println("uids--->"+prev.get(i)[0]+" Count--- >"+count+" flag--->"+prev.get(i)[2]);
}
}
file_count++;
}
}
public static void printMap(Map<ArrayList<Integer[]>, Integer> map)
{
for (Entry<ArrayList<Integer[]>, Integer> entry : map.entrySet())
{
System.out.println(" Value : "+ entry.getValue()+"key : "+entry.getKey());
}
}
public static double uniqueIdGenerator(int oc1,int oc2,int oc3,int oc4,int oc5,int oc6,int oc7,int oc8)
{
int a,b;
double c;
a=((oc1*10+oc2)*10+oc3)*10+oc4;
b=((oc5*10+oc6)*10+oc7)*10+oc8;
c= Math.log(a)+Math.log(b);
return Math.round(c*1000);
}
Now that I understand what you want, there are (at least) 2 ways of doing this.
1st: Make a list with the uids. Then make a second list where you can hold a value (your uid) and keep a count. I first thought of a HashMap, but updating the count there seemed awkward to me. Maybe use an ArrayList of lists with 2 values each.
Then loop over your list of uids and check with a second for loop whether the uid is already in the second list. If it is, add one to the count. If it is not, add it to the list.
2nd: Do the same thing, but with classes (very Java). Then you can put even more info into the class ;)
Hope this helps!
*edit: @RC. indeed gives cleaner code.
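Since the question asks for a HashMap specifically, here is a minimal sketch of counting with one, assuming Java 8+ and assuming the uids have already been collected into a list of Integers. `UidCounter` and `countUids` are hypothetical names. The map is keyed by the uid itself; a map keyed by `ArrayList<Integer[]>`, as in the question's code, can never be matched by looking up a single uid.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the names are assumptions for illustration.
class UidCounter {
    // Maps each uid to the number of times it occurs in the input.
    static Map<Integer, Integer> countUids(List<Integer> uids) {
        Map<Integer, Integer> counts = new LinkedHashMap<>();
        for (Integer uid : uids) {
            // Insert 1 for a new uid, or add 1 to the existing count.
            counts.merge(uid, 1, Integer::sum);
        }
        return counts;
    }
}
```

Printing is then a single loop over `counts.entrySet()`. A LinkedHashMap is used here only so the uids print in first-seen order; a plain HashMap counts just as well.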

Create a 2d Boolean array in Java from table data

I have a .csv file of type:
Event Participant
ConferenceA John
ConferenceA Joe
ConferenceA Mary
ConferenceB John
ConferenceB Ted
ConferenceC Jessica
I would like to create a 2D boolean matrix of the following format:
Event John Joe Mary Ted Jessica
ConferenceA 1 1 1 0 0
ConferenceB 1 0 0 1 0
ConferenceC 0 0 0 0 1
I start by reading in the csv and using it to initialize an ArrayList of type:
AttendaceRecord(String title, String employee)
How can I iterate through this ArrayList to create a boolean matrix like the one above in Java?
This is the easiest way I can think of for you. This answer can certainly be improved or done in a completely different way. I'm taking this approach because you mentioned that you are not completely familiar with Map (I'm also guessing with Set). Anyway let's dive in.
In your AttendanceRecord class you are going to need the following instance variables: two LinkedHashSets and one LinkedHashMap. LinkedHashSet #1 will store all conferences and LinkedHashSet #2 will store all participants. The LinkedHashMap will store the conferences as keys and the participant lists as values. The reason for this will be clear in a minute. I'll first explain why you need the LinkedHashSet.
Purpose of LinkedHashSet
Notice in your 2d array, the rows (conferences) and columns (participants) are arranged in the order they were read. Not only that, all duplicates read from the file are gone. A LinkedHashSet fits this purpose perfectly, because it preserves insertion order and eliminates duplicates. We then have a one-to-one relationship between the row and column positions of the 2d array and each LinkedHashSet via its array representation. Let's use John from ConferenceA as an example. John will be at position 0 in the array representation of the participant Set and ConferenceA will be at position 0 in the array representation of the conference Set. Not only that, the size of each array will be used to determine the size of your 2d array (2darray[conferenceArrayLength][participantArrayLength]).
Purpose of the LinkedHashMap
We need the LinkedHashMap to preserve the ordering of the elements (hence Linked). The elements will be stored internally like this.
ConferenceA :John Joe Mary
ConferenceB :John Ted
ConferenceC :Jessica
We will then iterate through the data structure and send each key value pair to a function which returns the position of each element from each array returned from each LinkedHashSet. As each row and column position is returned, we will add a 1 to that position in the 2d array.
Note: I used an Integer array for my example, substitute as needed.
AttendanceRecord.java
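import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
public class AttendanceRecord {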
private Map<String, ArrayList> attendanceRecordMap = new LinkedHashMap<String, ArrayList>();
private Set<String> participants = new LinkedHashSet<String>();
private Set<String> conferences = new LinkedHashSet<String>();
public AttendanceRecord() {
}
public Map<String, ArrayList> getAttendanceRecordMap() {
return attendanceRecordMap;
}
public Object[] getParticipantsArray() {
return participants.toArray();
}
public Object[] getConferencesArray() {
return conferences.toArray();
}
public void addToRecord(String title, String employee) {
conferences.add(title);
participants.add(employee);
if (attendanceRecordMap.containsKey(title)) {
ArrayList<String> tempList = attendanceRecordMap.get(title);
tempList.add(employee);
} else {
ArrayList<String> attendees = new ArrayList<String>();
attendees.add(employee);
attendanceRecordMap.put(title, attendees);
}
}
}
Test.java
import java.util.ArrayList;
import java.util.Map;
public class Test {
public static void main(String[] args) {
AttendanceRecord attendanceRecord = new AttendanceRecord();
//These are hardcoded. You will have to substitute with your code
//when you read the file
attendanceRecord.addToRecord("ConferenceA", "John");
attendanceRecord.addToRecord("ConferenceA", "Joe");
attendanceRecord.addToRecord("ConferenceA", "Mary");
attendanceRecord.addToRecord("ConferenceB", "John");
attendanceRecord.addToRecord("ConferenceB", "Ted");
attendanceRecord.addToRecord("ConferenceC", "Jessica");
int[][] jaccardArray = new int[attendanceRecord.getConferencesArray().length][attendanceRecord.getParticipantsArray().length];
setUp2dArray(jaccardArray, attendanceRecord);
print2dArray(jaccardArray);
}
public static void setUp2dArray(int[][] jaccardArray, AttendanceRecord record) {
Map<String, ArrayList> recordMap = record.getAttendanceRecordMap();
for (String key : recordMap.keySet()) {
ArrayList<String> attendees = recordMap.get(key);
for (String attendee : attendees) {
int row = findConferencePosition(key, record.getConferencesArray());
int column = findParticipantPosition(attendee, record.getParticipantsArray());
System.out.println("Row inside " + row + "Col inside " + column);
jaccardArray[row][column] = 1;
}
}
}
public static void print2dArray(int[][] jaccardArray) {
for (int i = 0; i < jaccardArray.length; i++) {
for (int j = 0; j < jaccardArray[i].length; j++) {
System.out.print(jaccardArray[i][j]);
}
System.out.println();
}
}
public static int findParticipantPosition(String employee, Object[] participantArray) {
int position = -1;
for (int i = 0; i < participantArray.length; i++) {
if (employee.equals(participantArray[i].toString())) {
position = i;
break;
}
}
return position;
}
public static int findConferencePosition(String conference, Object[] conferenceArray) {
int position = -1;
for (int i = 0; i < conferenceArray.length; i++) {
if (conference.equals(conferenceArray[i])) {
position = i;
break;
}
}
return position;
}
}
Basically you'll want to start by searching through your input strings to find each of the names (String.contains) and set a boolean array of each field name.
Then you'll make an array of those boolean arrays (or a list, whatever).
Then you simply sort through them, looking for T/F and printing corresponding messages.
I included some very rough pseudocode, assuming I am understanding your problem correctly.
// For first row
String[] labelStrings = {"Event", "John", "Joe", "Mary", "Ted", "Jessica"};
// For the matrix data
// List to iterate horizontally EDIT: Made boolean!
List<Boolean> strList = new ArrayList<>();
// List to iterate vertically
List<List<Boolean>> listList = new ArrayList<>();
/* for all the entries in AttendanceRecord (watch your spelling, OP)
for all data sets mapping title to employee
add the row data to strList, then add strList to listList */
// Print the header row once
System.out.println(String.join("\t", labelStrings));
// iterate row by row (for each horizontal entry in the column of entries)
for (int i = 0; i < listList.size(); i++) {
    for (int j = 0; j < listList.get(i).size(); j++) {
        // print listList.get(i).get(j)
    }
}
Sorry, I'm just reading through the comments now.
You'll definitely want to arrange your data in a way that is easy to iterate through. Since you have a fixed table size, you could hardcode a boolean array for each entry and then print on validation they were mapped to the event as indicated in your input string.
Try creating a hash map containing
HashMap<String, HashMap<String, Integer>> map = new HashMap<>(); // conference -> (name -> 0 or 1)
As you iterate through your ArrayList, you can do something like
innerMap = map.get(conferenceStr)
innerMap.put(nameStr, 1)
Of course, you'll need some initialization logic: for example, you can check whether innerMap.get(nameStr) exists and, if not, iterate over every inner map and call innerMap.put(nameStr, 0).
This structure can be used to generate that final 2D boolean matrix.
Elaboration edit:
ArrayList<AttendanceRecord> attendanceList = new ArrayList<AttendanceRecord>();
// populate list with info from the csv (you implied you can do this)
HashMap<String, HashMap<String, Integer>> map = new HashMap<String, HashMap<String, Integer>>();
//map to store every participant, this seems inefficient though
HashMap<String, Integer> participantMap = new HashMap<String, Integer>();
for (AttendanceRecord record : attendanceList) {
    String title = record.getTitle();
    String employee = record.getEmployee();
    participantMap.put(employee, 0);
    HashMap<String, Integer> innerMap = map.get(title);
    if (innerMap == null) {
        innerMap = new HashMap<String, Integer>();
        // register the new inner map, otherwise it is lost after this iteration
        map.put(title, innerMap);
    }
    innerMap.put(employee, 1);
}
//now we have all the data we need, it's just about how you want to format it
For example, if you wanted to just print out a table like that, you could iterate through every element of map doing this:
for (HashMap<String, Integer> innerMap : map.values()) {
    // keySet() holds the participant names; values() would only give the placeholder 0s
    for (String employee : participantMap.keySet()) {
        if (innerMap.containsKey(employee)) {
            //print 1
        } else {
            //print 0
        }
    }
}
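To end up with an actual 2D boolean array as the question asks, one option is to give every event and every participant an index in first-seen order and then fill the matrix. This is a minimal sketch under assumptions: each input row is taken to be a two-element array {event, participant}, and `AttendanceMatrix` and `build` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; names and input shape are assumptions for illustration.
class AttendanceMatrix {
    // rows: each String[] is {event, participant}, in file order.
    static boolean[][] build(List<String[]> rows) {
        Map<String, Integer> eventIndex = new LinkedHashMap<>();
        Map<String, Integer> personIndex = new LinkedHashMap<>();
        for (String[] row : rows) {
            // Assign the next free index the first time a name is seen.
            eventIndex.putIfAbsent(row[0], eventIndex.size());
            personIndex.putIfAbsent(row[1], personIndex.size());
        }
        boolean[][] matrix = new boolean[eventIndex.size()][personIndex.size()];
        for (String[] row : rows) {
            // Mark the cell for this (event, participant) pair.
            matrix[eventIndex.get(row[0])][personIndex.get(row[1])] = true;
        }
        return matrix;
    }
}
```

The index maps double as the row and column labels if a header needs to be printed, since a LinkedHashMap iterates in insertion order.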

How do I get the list of possible values in a HashMap<String, Integer>? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
How do I iterate over each Entry in a Map?
I'm writing a program that takes an input (data), which is an array of strings, and returns them in order of frequency of appearance, then in alphabetical order if they appear the same number of times. I've used a HashMap to map each string to the number of times it appears in the array. My idea after that was to use a for loop to iterate through each number of appearances, but I'm unable to find a command that returns the number of unique values in a HashMap. Does anyone know how to get this value?
Also, if you have a simpler way to perform the task I described, any advice is welcome.
HashMap<String, Integer> sortmap = new HashMap<String, Integer>();
ArrayList<String> stringlist = new ArrayList<String>();
ArrayList<String> stringlist2 = new ArrayList<String>();
for(String x : data)
{
if(sortmap.containsKey(x)){
sortmap.put(x, sortmap.get(x)+1);
}
else{
sortmap.put(x, 1);
}
}
for (String s : sortmap.keySet()){
for (int i : sortmap.values()){
if (sortmap.get(s) == i){
stringlist2.add(s);
}
}
}
Double looping at the end is very unfortunate.
Take sortmap.entrySet() and store it in an array. Then sort that array with Arrays.sort using your own Comparator which first takes into account the counters and if they are equal, compares strings alphabetically.
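A rough sketch of that suggestion, with one swap: it copies the entries into a List and uses List.sort instead of an array with Arrays.sort, which avoids creating a generic array. `FrequencySorter` and `sortByFrequency` are hypothetical names, and Java 8+ is assumed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the names are assumptions for illustration.
class FrequencySorter {
    static String[] sortByFrequency(String[] data) {
        Map<String, Integer> counts = new HashMap<>();
        for (String s : data) {
            counts.merge(s, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Higher count first; ties are broken alphabetically by key.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        String[] result = new String[entries.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = entries.get(i).getKey();
        }
        return result;
    }
}
```

This does one sort over the distinct strings instead of scanning the whole map once per possible count.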
I figured it out - here's the full logic for people who were wondering:
public String[] sort(String[] data) {
TreeMap<String, Integer> sortmap = new TreeMap<String, Integer>();
ArrayList<String> stringlist = new ArrayList<String>();
for(String x : data){
if(sortmap.containsKey(x))
sortmap.put(x, sortmap.get(x)+1);
else
sortmap.put(x, 1);
}
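// No explicit sort is needed here (sorting a copy of values() would have no effect):
// the TreeMap already iterates its keys alphabetically, which handles the ties below.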
for (int i = data.length; i > 0; i--){
for (Entry<String, Integer> k : sortmap.entrySet()){
if (k.getValue() == i)
stringlist.add(k.getKey());
}
}
String[] output = stringlist.toArray(new String[stringlist.size()]);
return output;
}
