Reversing a HashMap storing words and their line numbers - java

I have a HashMap
HashMap<String, LinkedList<Integer>> indexMap;
which is storing all words in a file and their corresponding line numbers where they appear.
Example -
This is just an example
to demonstrate what I am saying an is
Would display
This [1]
demonstrate [2]
an [1 2]
is [1 2]
...
....
And so on. I want to reverse this HashMap so that it displays the words stored at each line number.
For the particular example above, it should display
1 [This, an, just, example, is]
2 [demonstrate, what, to, I, am, saying, is, an]
For this particular task, this is what I have done -
import java.io.FileReader;
import java.io.LineNumberReader;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;
public class ReverseIndex {
private static Map<String, LinkedList<Integer>> indexMap = new HashMap<String, LinkedList<Integer>>();
public static LinkedList<Integer> getIndex(String word) {
return indexMap.get(word);
}
public static void main(String[] args) {
try {
LineNumberReader rdr = new LineNumberReader(
new FileReader(
args[0]));
String line = "";
int lineNumber = 0;
//CREATING THE INITIAL HASHMAP WHICH WE WANT TO REVERSE
while ((line = rdr.readLine()) != null) {
lineNumber++;
String[] words = line.split("\\s+");
for (int i = 0; i < words.length; i++) {
LinkedList<Integer> temp = new LinkedList<Integer>();
if (getIndex(words[i]) != null)
temp = getIndex(words[i]);
temp.add(lineNumber);
indexMap.put(words[i], temp);
}
}
//FINISHED CREATION
Map<Integer, LinkedList<String>> myNewHashMap = new HashMap<Integer, LinkedList<String>>();
for(Map.Entry<String, LinkedList<Integer>> entry : indexMap.entrySet()){
LinkedList<Integer> values = entry.getValue();
String key = entry.getKey();
LinkedList<String> temp = new LinkedList<String>();
for(int i = 0; i <= lineNumber; i++) {
if(values.contains(i)) {
if(!temp.contains(key))
temp.add(key);
myNewHashMap.put(i, temp);
}
}
}
for(Map.Entry<Integer, LinkedList<String>> entry : myNewHashMap.entrySet()){
Integer tester = entry.getKey();
LinkedList<String> temp2 = new LinkedList<String>();
temp2 = entry.getValue();
System.out.print(tester + " ");
for(int i = 0; i < temp2.size(); i++) {
System.out.print(temp2.get(i) + " ");
}
System.out.println();
}
rdr.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
However the problem with this is, for the example that we had above, it would print -
1 example
2 an
How could I reverse it so that it works perfectly with the expected output?

Just replace the first for loop in your main (the one that fills myNewHashMap) with the code below. I made some changes to your original code: the variable declarations are moved out of the loop, and the logic now checks whether a LinkedList<String> already exists for the line number. If it does, the word is added to that list; otherwise a new LinkedList<String> is created, stored in the map, and the word is added to it.
LinkedList<Integer> values = null;
String key = null;
LinkedList<String> temp = null;
for(Map.Entry<String, LinkedList<Integer>> entry : indexMap.entrySet())
{
    values = entry.getValue();
    key = entry.getKey();
    for(int value : values)
    {
        // look up the word list for this line number, creating it on first use
        temp = myNewHashMap.get(value);
        if(temp == null )
        {
            temp = new LinkedList<String>();
            myNewHashMap.put(value,temp);
        }
        // a word that occurs twice on the same line is listed only once
        if(!temp.contains(key))
            temp.add(key);
    }
}
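For reference, here is a minimal self-contained sketch of the same inversion. It is not the asker's program: the class name InvertIndexSketch and the method name invert are made up for illustration, it uses List/ArrayList instead of LinkedList, and it assumes a TreeMap is acceptable so that the line numbers come out in ascending order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name, used only for this sketch.
class InvertIndexSketch {

    // Inverts word -> line numbers into line number -> words.
    // A TreeMap keeps the line numbers in ascending order when printed.
    static Map<Integer, List<String>> invert(Map<String, List<Integer>> index) {
        Map<Integer, List<String>> byLine = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : index.entrySet()) {
            for (int line : entry.getValue()) {
                // Fetch the word list for this line, creating it on first use.
                List<String> words = byLine.computeIfAbsent(line, k -> new ArrayList<>());
                // A word repeated on one line is listed only once.
                if (!words.contains(entry.getKey())) {
                    words.add(entry.getKey());
                }
            }
        }
        return byLine;
    }

    public static void main(String[] args) {
        // A small hand-built index; LinkedHashMap keeps insertion order for the demo.
        Map<String, List<Integer>> index = new LinkedHashMap<>();
        index.put("This", Arrays.asList(1));
        index.put("demonstrate", Arrays.asList(2));
        index.put("an", Arrays.asList(1, 2));
        index.put("is", Arrays.asList(1, 2, 2));
        System.out.println(invert(index)); // prints {1=[This, an, is], 2=[demonstrate, an, is]}
    }
}
```

The order of the words within each line follows the iteration order of the input map, so with a plain HashMap it may differ from the order in the file.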

Related

How do I count every unique item in an Arraylist?

I need to count every unique character in an ArrayList. I have already separated every single character.
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.ArrayList;
public class Aschenputtel {
public static void main(String args[]) {
ArrayList <String> txtLowCase = new ArrayList <String> ();
ArrayList <Character> car = new ArrayList <Character> ();
File datei = new File ("C:/Users/Thomas/Downloads/Aschenputtel.txt");
Scanner scan = null;
try {
scan = new Scanner (datei);
} catch (FileNotFoundException e) {
System.out.println("File not found.");
}
while (scan.hasNext()) {
String temp = scan.next().replace("„", "„").replace("“", "“").toLowerCase();
txtLowCase.add(temp);
for(int i = 0; i < temp.length(); i++) {
car.add(temp.charAt(i));
}
}
System.out.println(car);
}
}
That is my current code.
car currently gives every single character but the result should be something like:
a = 16, b = 7, c = 24,....
Is there a good way to do that?
Once you have your characters, you can do something like this in your for loop:
...
Map<Character, Integer> map2=new HashMap<Character, Integer>();
for (int i = 0; i < temp.length(); i++) {
map2.put(temp.charAt(i), map2.getOrDefault(temp.charAt(i), 0)+1);
}
System.out.println(map2);
...
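As a self-contained sketch of that getOrDefault idea: the class name CharCountSketch and the method name countChars are hypothetical, and a TreeMap is assumed here only so that the output is sorted by character.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class name, used only for this sketch.
class CharCountSketch {

    // Counts how often each character occurs in the given text.
    static Map<Character, Integer> countChars(String text) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Start from 0 for an unseen character, then add one.
            counts.put(c, counts.getOrDefault(c, 0) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countChars("abcabca")); // prints {a=3, b=2, c=2}
    }
}
```

In the original program you would keep one such map outside the while loop and update it for every word that is read, instead of building a new map per word.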
You've got the first part of the algorithm, the data processing:
1. Process the data, from a text file to an ArrayList of characters, car.
2. Count the characters in the list.
You want to associate each character to a count. A HashMap would be great for that.
Here's a method for the second part, with some explanations:
/*
This will return something that looks like:
{ a: 16, b: 7, c: 24, ... }
*/
HashMap<Character, Integer> getCharacterCount(ArrayList<Character> charList) {
    // for each character we associate an Integer, its count.
    HashMap<Character, Integer> counter = new HashMap<>();
    for (Character car: charList) {
        // if the map doesn't contain our character, we've never seen it before.
        int currentCount = counter.containsKey(car) ? counter.get(car) : 0;
        // increment this character's current count
        counter.put(car, currentCount + 1);
    }
    return counter;
}
If you need to count the occurrences of each character in your car list, you could use a collection stream and the collector Collectors.groupingBy in conjunction with the downstream collector Collectors.counting.
This operation will yield a Map whose keys are the list's characters and their corresponding values their occurrences within the list.
Sample Code
List<Character> list = new ArrayList<>(List.of('a', 'a', 'a', 'b', 'c', 'b', 'b', 'b', 'a', 'c'));
Map<Character, Long> mapRes = list.stream().collect(Collectors.groupingBy(c -> c, Collectors.counting()));
System.out.println(mapRes);
Your Code
public class Aschenputtel {
public static void main(String args[]) {
ArrayList<String> txtLowCase = new ArrayList<String>();
ArrayList<Character> car = new ArrayList<Character>();
File datei = new File("C:/Users/Thomas/Downloads/Aschenputtel.txt");
Scanner scan = null;
try {
scan = new Scanner(datei);
} catch (FileNotFoundException e) {
System.out.println("File not found.");
}
while (scan.hasNext()) {
String temp = scan.next().replace("„", "„").replace("“", "“").toLowerCase();
txtLowCase.add(temp);
for (int i = 0; i < temp.length(); i++) {
car.add(temp.charAt(i));
}
}
Map<Character, Long> mapRes = car.stream().collect(Collectors.groupingBy(c -> c, Collectors.counting()));
System.out.println(mapRes);
}
}
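Here is the simplest approach: pass your ArrayList to a HashSet when initializing it. Note that this gives you the distinct characters and how many distinct characters there are, not how often each one occurs.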
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.ArrayList;
import java.util.HashSet;
public class Aschenputtel {
public static void main(String args[]) {
ArrayList <String> txtLowCase = new ArrayList <String> ();
ArrayList <Character> car = new ArrayList <Character> ();
File datei = new File ("C:/Users/Thomas/Downloads/Aschenputtel.txt");
Scanner scan = null;
try {
scan = new Scanner (datei);
} catch (FileNotFoundException e) {
System.out.println("File not found.");
}
while (scan.hasNext()) {
String temp = scan.next().replace("„", "„").replace("“", "“").toLowerCase();
txtLowCase.add(temp);
for(int i = 0; i < temp.length(); i++) {
car.add(temp.charAt(i));
}
}
HashSet<Character> hset = new HashSet<Character>(car); // A set only adds an element if it doesn't already exist.
System.out.println("ArrayList Unique Values is : " + hset); // All the unique values.
System.out.println("ArrayList Total Count Of Unique Values is : " + hset.size()); // Count of all unique items.
}
}
If you want to have more control/customizability you could also do something like this:
private static void calcCount(List<Character> chars) {
chars.sort(Comparator.naturalOrder());
Character prevChar = null;
int currentCount = 0;
for (Character aChar : chars) {
if (!aChar.equals(prevChar)) { // use equals: != would compare Character references, not values
if (prevChar != null) {
System.out.print(currentCount + " ");
currentCount = 0;
}
System.out.print(aChar + ":");
prevChar = aChar;
}
currentCount++;
}
System.out.print(currentCount);
}

How to insert a value into the same index of array?

So I have an array that can store several values. I let the user enter a key and a value 3 times using a for loop. The problem is that when the user enters the same key again, the earlier value is overwritten. How can I put the new value into the next index instead?
Here is my entire code so far:
import java.util.*;
import java.io.*;
public class low{
public static void main(String[] args)throws IOException {
BufferedReader sc = new BufferedReader(new InputStreamReader(System.in));
int key = 0;
// StringBuffer value = new StringBuffer();
String value = "";
ArrayList<String> hash = new ArrayList<String>();
String[] arr = new String[5];
for(int i = 0; i < 3; i++){
System.out.println("Key: ");
key = Integer.parseInt(sc.readLine());
System.out.println("Value: ");
value = sc.readLine();
key++;
arr[key] = value;
}
for(int x = 0; x < arr.length; x++){
System.out.println("Element at index " + x + " : "+ arr[x]);
}
}
}
You may need a Map rather than an array to store the values:
Scanner in = new Scanner(System.in);
Map<Integer,String> map = new HashMap<>();
for(int i=0; i<3; i++)
{
int key = Integer.parseInt(in.nextLine().trim());
if(key < 0 || key > 4) // it looks like you want the keys to stay in [0,5)
throw new IllegalArgumentException("invalid key");
String value = in.nextLine();
map.put(key,value);
}
for(Map.Entry<Integer,String> entry : map.entrySet())
{
System.out.printf("Element at index %d : %s\n",entry.getKey(),entry.getValue());
}
Instead of
key++;
arr[key] = value;
Do
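if(arr[key] == null)
    arr[key] = value; // the slot is free, use it
else
    arr[key + 1] = value; // the slot is taken, use the next index
But you’ll overwrite whatever is already at index key + 1, and key + 1 can run past the end of the array. You might want to rethink how you’re adding data.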
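One way to avoid overwriting anything is to keep stepping forward until a free slot turns up, which is the idea behind linear probing in hash tables. The sketch below is only an illustration under some assumptions: the class name NextFreeSlotSketch and the method name insert are made up, the key is assumed to be non-negative, and the search is assumed to wrap around to the start of the array.

```java
// Hypothetical class name, used only for this sketch.
class NextFreeSlotSketch {

    // Stores value at index key, or at the next free index after it.
    // Wraps around to the start of the array. Returns the index that was
    // used, or -1 when every slot is already taken.
    // Assumes key is non-negative.
    static int insert(String[] arr, int key, String value) {
        for (int step = 0; step < arr.length; step++) {
            int index = (key + step) % arr.length;
            if (arr[index] == null) {
                arr[index] = value;
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] arr = new String[5];
        insert(arr, 1, "first");
        insert(arr, 1, "second"); // index 1 is taken, so this lands at index 2
        for (int x = 0; x < arr.length; x++) {
            System.out.println("Element at index " + x + " : " + arr[x]);
        }
    }
}
```

With this scheme a value is no longer guaranteed to sit at its own key, so a lookup has to probe forward in the same way.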

Counting the number of unique values in a vector

I have a method which takes in parameters in the form of a vector from another vector. This vector can be of the size 2, 3 or 4 elements.
I want to count the frequency of every word in that vector. For example, if the vector contained the strings : "hello", "my" , "hello" , I want to output an array that is
[2, 1] where 2 is the frequency of hello and 1 is the frequency of my.
Here is my attempt after reading a few questions on this website:
int vector_length = query.size();
int [] tf_q = new int [vector_length];
int string_seen = 0;
for (int p = 0; p< query.size(); p++)
{
String temp_var = query.get(p);
for (int q = 0; q< query.size(); q++)
{
if (temp_var == query.get(q) )
{
if (string_seen == 0)
{
tf_q[p]++;
string_seen++;
}
else if (string_seen == 1)
{
tf_q[p]++;
string_seen = 0;
query.remove(p);
}
}
}
}
System.out.print(Arrays.toString(tf_q));
What is the right direction to go?
Use a HashMap of type <String, Integer> to track the unique string values you encounter and to count each word:
String[] vector = {"hello", "my", "hello"}; // your vector
Map<String, Integer> stringMap = new HashMap<String, Integer>();
for (int i = 0; i < vector.length; i++) {
    if (stringMap.containsKey(vector[i])) {
        // seen before: increment the stored count
        Integer wordCount = stringMap.get(vector[i]);
        stringMap.put(vector[i], wordCount + 1);
    }
    else {
        // first occurrence of this word
        stringMap.put(vector[i], 1);
    }
}
String[] input = {"Hello", "my", "Hello", "apple", "Hello"};
// use hashmap to track the number of strings
HashMap<String, Integer> map = new HashMap<String, Integer>();
// use arraylist to track the sequence of the output
ArrayList<String> list = new ArrayList<String>();
for (String str : input){
if(map.containsKey(str)){
map.put(str, map.get(str)+1);
} else{
map.put(str, 1);
list.add(str); // if the string never occurred before, add it to arraylist
}
}
int[] output = new int[map.size()];
int index = 0;
for (String str : list){
output[index] = map.get(str);
index++;
}
for (int i : output){
System.out.println(i);
}
This should be your answer! Result is in "int[] output"
If you want to maintain the relation between each word and the frequency of that word, then I suggest that you use a HashMap instead. For example:
Map<String,Integer> histogram = new HashMap<String,Integer>();
for (String word : query)
{
Integer count = histogram.get(word);
if (count == null)
histogram.put(word,1);
else
histogram.put(word,count+1);
}
At this point, you can (for example) print each word with the corresponding frequency:
for (String word : histogram.keySet())
System.out.println(word+" "+histogram.get(word));
Or you can obtain an array which contains only the frequencies, if that's all you want:
Integer[] array = histogram.values().toArray(new Integer[histogram.size()]);
Or even a collection, which is just as useful and convenient as any native array:
Collection<Integer> collection = histogram.values();
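If the frequencies have to come out in the order in which the words first appear (so that "hello", "my", "hello" gives [2, 1]), a plain HashMap is not enough, because its iteration order is unspecified. Here is a minimal sketch using a LinkedHashMap instead; the class name WordFrequencySketch and the method name frequencies are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class name, used only for this sketch.
class WordFrequencySketch {

    // Returns the frequency of each distinct word, in first-seen order.
    static int[] frequencies(List<String> query) {
        // LinkedHashMap iterates in insertion order, unlike HashMap.
        Map<String, Integer> histogram = new LinkedHashMap<>();
        for (String word : query) {
            // Insert 1 for a new word, otherwise add 1 to the stored count.
            histogram.merge(word, 1, Integer::sum);
        }
        int[] result = new int[histogram.size()];
        int i = 0;
        for (int count : histogram.values()) {
            result[i++] = count;
        }
        return result;
    }

    public static void main(String[] args) {
        int[] tf = frequencies(Arrays.asList("hello", "my", "hello"));
        System.out.println(Arrays.toString(tf)); // prints [2, 1]
    }
}
```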

Invert Concordance Using Java

Today I'm working with a client that creates a concordance from a text file using Java. All I need to do is invert the concordance to recreate the text from start to finish. The issue is that I don't know where to start or how to do each step. So far I have tried to create an array of words, iterate through my symbol table, and assign each key to the array, but then I just end up with a list of the words in the concordance. For some reason this problem makes me feel very stupid, because it seems like it should have a simple solution, yet I can't think of any valid ideas to get me started with recreating the story. I have included the source here:
public class InvertedConcordance {
public static ST<String, SET<Integer>> createConcordance (String[] words) {
ST<String, SET<Integer>> st = new ST<String, SET<Integer>>();
for (int i = 0; i < words.length; i++) {
String s = words[i];
if (!st.contains(s)) {
st.put(s, new SET<Integer>());
}
SET<Integer> set = st.get(s);
set.add(i);
}
return st;
}
public static String[] invertConcordance (ST<String, SET<Integer>> st) {
//This is what I have so far
//Here is what I have that doesnt work
for(String key : st.keys())
{
inv[i++] = key;
}
for(int z = 0; z< inv.length; z++)
{
System.out.println(inv[z]);
}
String[]inv = new String[st.size()];
return inv;
}
private static void saveWords (String fileName, String[] words) {
int MAX_LENGTH = 70;
Out out = new Out (fileName);
int length = 0;
for (String word : words) {
length += word.length ();
if (length > MAX_LENGTH) {
out.println ();
length = word.length ();
}
out.print (word);
out.print (" ");
length++;
}
out.close ();
}
public static void main(String[] args) {
String fileName = "data/tale.txt";
In in = new In (fileName);
String[] words = in.readAll().split("\\s+");
ST<String, SET<Integer>> st = createConcordance (words);
StdOut.println("Finished building concordance");
// write to a file and read back in (to check that serialization works)
//serialize ("data/concordance-tale.txt", st);
//st = deserialize ("data/concordance-tale.txt");
words = invertConcordance (st);
saveWords ("data/reconstructed-tale.txt", words);
}
}
First of all, why are you using unusual classes like:
SET
ST
instead of the built-in Java types:
Set
Map
which are what is needed here?
As for your problem, your code should not compile at all since you are declaring the variable inv AFTER using it:
public static String[] invertConcordance (ST<String, SET<Integer>> st) {
//This is what I have so far
//Here is what I have that doesnt work
for(String key : st.keys())
{
inv[i++] = key;
}
for(int z = 0; z< inv.length; z++)
{
System.out.println(inv[z]);
}
String[]inv = new String[st.size()];
return inv;
}
If I understand your idea correctly, the concordance simply maps each word to the set of indices at which it was found. If this interpretation is correct, then the inverse operation would be:
public static String[] invertConcordance (ST<String, SET<Integer>> st) {
//First - find the maximum index in the concordance; the document length is that index plus one
int document_length = 0;
for(String key : st.keys()){
for(Integer i : st.get(key)){
if(i>document_length){
document_length=i;
}
}
}
//Create the document
String[] document = new String[document_length+1];
//Reconstruct
for(String key : st.keys()){
for(Integer i : st.get(key)){
document[i] = key;
}
}
return document;
}
I assumed that indices are numbered from 0 to the document's length - 1. If they are actually stored from 1 to the document's length, you should change the line:
String[] document = new String[document_length+1];
to
String[] document = new String[document_length];
and
document[i] = key;
to
document[i-1] = key;
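For comparison, here is a round-trip sketch that uses only java.util types. It is not the ST/SET API from the question: the class name ConcordanceSketch and the method names create and invert are made up, positions are assumed to be 0-based, and the concordance is assumed to be complete, so the document length equals the total number of stored positions.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical class name, used only for this sketch.
class ConcordanceSketch {

    // Builds word -> set of 0-based positions at which the word occurs.
    static Map<String, Set<Integer>> create(String[] words) {
        Map<String, Set<Integer>> st = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {
            st.computeIfAbsent(words[i], k -> new TreeSet<>()).add(i);
        }
        return st;
    }

    // Rebuilds the original word sequence from the concordance.
    static String[] invert(Map<String, Set<Integer>> st) {
        // Every position belongs to exactly one word, so the document
        // length is the total number of positions stored.
        int length = 0;
        for (Set<Integer> positions : st.values()) {
            length += positions.size();
        }
        String[] document = new String[length];
        for (Map.Entry<String, Set<Integer>> entry : st.entrySet()) {
            for (int position : entry.getValue()) {
                document[position] = entry.getKey();
            }
        }
        return document;
    }

    public static void main(String[] args) {
        String[] words = "it was the best of times it was the worst of times".split("\\s+");
        String[] rebuilt = invert(create(words));
        System.out.println(String.join(" ", rebuilt));
        // prints it was the best of times it was the worst of times
    }
}
```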

Add to list at certain index

I'm having a problem with some list manipulation. I take the user's input and search through it: if I find an "=" sign, I assume that the string in front of it is the name of a variable, so on the line right above that variable I want to add a new string to the user's input (in this case it is called "tempVAR", though the name doesn't really matter). I first tried to do this with StringBuilder, without any success, so I am currently trying to do it with ArrayLists, but I am getting stuck at adding new elements to the list. Because of the way list.add(index, string) works, every element to the right of the one I am adding has its index shifted by 1. Is there a way to always know exactly which index I am looking for, even after an arbitrary number of strings has been added? Here is my code so far. If you run it you will see what I mean: instead of "tempVAR" or "tempVar1" being added above the name of the variable, they are added one or two positions off.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map.Entry;
public class ToTestStuff {
static List<String> referenceList = new ArrayList<String>();
public static final String SEMICOLUMN = ";";
public static final String BLANK = " ";
public static final String EMPTY = "";
public static final String LEFT_CURLY = "{";
public static final char CARRIAGE_RETURN = '\r';
public static final String CR_STRING = "CARRIAGE_RETURN_AND_NEW_LINE";
public static final char NEW_LINE = '\n';
public static void main(String[] args) {
List<String> test = new ArrayList<String>();
String x = "AGE_X";
String y = "AGE_Y";
String z = "AGE_YEARS";
String t = "P_PERIOD";
String w = "T_VALID";
referenceList.add(x);
referenceList.add(y);
referenceList.add(z);
referenceList.add(t);
referenceList.add(w);
String text2 = " if ( AGE_YEARS > 35 ) {\r\n"
+ " varX = P_PERIOD ;\r\n"
+ " }\r\n"
+ " if ( AGE_YEARS < 35 ) {\r\n"
+ " varY = T_VALID ;\r\n"
+ " varZ = AGE_Y ;\r\n"
+ " varA = AGE_X ;\r\n"
+ " }";
detectEquals(text2);
}
public static String detectEquals(String text) {
String a = null;
// text = text.trim();
// text = TestSplitting.addDelimiters(text);
String[] newString = text.split(" ");
List<String> test = Arrays.asList(newString);
StringBuilder strBuilder = new StringBuilder();
HashMap<String, List<Integer>> signs = new HashMap<String, List<Integer>>();
HashMap<String, List<Integer>> references = new HashMap<String, List<Integer>>();
HashMap<Integer, Integer> indexesOfStringAndList = new HashMap<Integer, Integer>();
List<String> testList = new ArrayList<String>();
List<Integer> lastList = new ArrayList<Integer>();
List<Integer> indexList = new ArrayList<Integer>();
List<Integer> refList = new ArrayList<Integer>();
List<String> keysList = new ArrayList<String>();
List<List> minList = new ArrayList<List>();
String previous = null;
int index = 0;
Object obj = new Object();
List<Integer> referenceValueList = new ArrayList<Integer>();
List<Integer> indexPosition = new ArrayList<Integer>();
String b = null;
int indexOfa = 0;
// System.out.println("a----> " + test);
List<String> anotherList = new ArrayList(test);
for (int i = 0; i < anotherList.size(); i++) {
a = anotherList.get(i).trim();
index = strBuilder.length();// - a.length();
// index = i;
strBuilder.append(a); // "=", 3 - if, 14 - while, 36 , "=", 15
testList.add(a);
if (a.equals("if") || a.equals("=")) {
lastList.add(i);
indexOfa = i;
indexesOfStringAndList.put(index, indexOfa);
refList.add(index);
indexPosition.add(index);
if (signs.containsKey(a)) {
signs.get(a).add(index);
} else {
signs.put(a, refList);
}
refList = new ArrayList<Integer>();
}
if (referenceList.contains(a)) {
indexList.add(index);
if (references.containsKey(a)) {
references.get(a).add(index);
} else {
references.put(a, indexList);
}
indexList = new ArrayList<Integer>();
}
}
for (String k : references.keySet()) {
keysList.add(k);
referenceValueList = references.get(k);
obj = Collections.min(referenceValueList);
int is = (Integer) obj;
ArrayList xx = new ArrayList();
xx.add(new Integer(is));
xx.add(k);
minList.add(xx);
}
for (List q : minList) {
Integer v = (Integer) q.get(0);
String ref = (String) q.get(1);
int x = closest(v, indexPosition);
int lSize = anotherList.size();
int sizeVar = lSize - test.size();
int indexOfPx = 0;
int px = 0;
if (x != 0) {
px = indexesOfStringAndList.get(x) - 1;
} else {
px = indexesOfStringAndList.get(x);
}
if (px == 0) {
System.out.println("previous when x=0 " +anotherList.get(px+sizeVar));
anotherList.add(px, "tempVar1=\r\n");
} else {
previous = anotherList.get(px + sizeVar);
System.out.println("previous is---> " + previous + " at position " + anotherList.indexOf(previous));
anotherList.add(anotherList.indexOf(previous) - 1, "\r\ntempVAR=");
}
}
strBuilder.setLength(0);
for (int j = 0; j < anotherList.size(); j++) {
b = anotherList.get(j);
strBuilder.append(b);
}
String stream = strBuilder.toString();
// stream = stream.replaceAll(CR_STRING, CARRIAGE_RETURN + EMPTY + NEW_LINE);
System.out.println("after ----> " + stream);
return stream;
}
public static int closest(int of, List<Integer> in) {
int min = Integer.MAX_VALUE;
int closest = of;
for (int v : in) {
final int diff = Math.abs(v - of);
if (diff < min) {
min = diff;
closest = v;
}
}
return closest;
}
}
I've mapped the positions of the "=" and "if" to their positions in the StringBuilder, but these are remnants from when i was trying to use a stringBuilder to do what i said above.
I have been struggling with this for a few days now and still haven't managed to do what i need, i am not sure where i am going wrong. At the moment i am hellbent on making this work as it is (with either lists or string builder) after which , if there is a better way i will look into that and adapt this accordingly.
The addDelimiters() method is a method i created to avoid writing the string as you see it in "String text2" but i took that out for this because it would only clutter my already chaotic code even more :), i don't think it has any relevance to why what i am trying to do is not working.
TL;DR: on the line above every varX, varY or other "var" I would like to add a string to the list, but I think my logic for finding the variable names or for adding to the list is wrong.
I think we both know that your code is messy and that you need many more abstractions to make it better. But you could make it work by maintaining an offset variable, let's say "int offset". Each time you insert a string after the initial pass, you increment it, and whenever you access the list you add it to the index: "list.get(index + offset);". Also read up on abstract syntax trees, which are a great way to parse and manipulate languages.
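A minimal sketch of the offset idea, assuming the insert positions were collected against the original list and are processed in ascending order. The class name OffsetInsertSketch, the method name insertBefore, and the token list in main are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this sketch.
class OffsetInsertSketch {

    // Inserts marker before each of the given positions.
    // The positions are indices into the ORIGINAL list and must be ascending.
    static List<String> insertBefore(List<String> tokens, List<Integer> positions, String marker) {
        List<String> result = new ArrayList<>(tokens);
        int offset = 0; // number of elements inserted so far
        for (int position : positions) {
            // Each earlier insert shifted the remaining elements right by one.
            result.add(position + offset, marker);
            offset++;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("varX", "=", "P_PERIOD", ";", "varY", "=", "T_VALID", ";");
        // varX is at index 0 and varY at index 4 in the original list.
        System.out.println(insertBefore(tokens, Arrays.asList(0, 4), "tempVAR"));
        // prints [tempVAR, varX, =, P_PERIOD, ;, tempVAR, varY, =, T_VALID, ;]
    }
}
```

An alternative that needs no offset at all is to process the positions from the highest to the lowest, since an insert never moves the elements in front of it.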
