Manipulating strings inside an ArrayList - Java

I have built an ArrayList of strings from two sources:
Path p1 = Paths.get("C:/Users/Green/documents/dictionary.txt");
Scanner sc = new Scanner(p1.toFile()).useDelimiter("\\s*-\\s*");
ArrayList al = new ArrayList();
while (sc.hasNext()) {
String word = (sc.next());
al.add(word);
al.add(Translate(word));
}
The list is made up of pairs: a word read from a text-file dictionary one line at a time, followed by its translation. Translate is a Java method that returns a String, so I am adding two strings to the ArrayList for every line in the dictionary.
I can print out the dictionary and the translations, but the printout is unhelpful as it prints all the words and then all the translations, which is not much use for a quick lookup.
for(int i=0;i<al.size();i++){
al.forEach(word ->{ System.out.println(word); });
}
Is there a way that I can either change how I add the strings to the ArrayList, or manipulate it afterwards, so that I can retrieve one word and its translation at a time?
Ideally I want to be able to sort the dictionary, as the file I receive is not in alphabetical order.

I am not sure whether you are required to use an ArrayList here.
I would suggest using a Map for this kind of dictionary data. A Map stores your data as key-value pairs, where the key is the original word and the value is its translation.
Here is a simple example:
Path p1 = Paths.get("C:/Users/Green/documents/dictionary.txt");
Scanner sc = new Scanner(p1.toFile()).useDelimiter("\\s*-\\s*");
Map<String, String> dic = new HashMap<String, String>();
while (sc.hasNext()) {
String word = (sc.next());
dic.put(word, Translate(word));
}
//print out from dictionary data
for(Map.Entry<String, String> entry: dic.entrySet()){
System.out.println(entry.getKey() + " - " + entry.getValue());
}
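Since the question also asks for alphabetical order, one option is to swap the HashMap for a TreeMap, which keeps its keys sorted. The following is a minimal sketch, not the asker's actual program: the translate method is a hypothetical stand-in for Translate, and the words are passed in as a list instead of being read from the dictionary file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortedDictionaryDemo {
    // Hypothetical stand-in for the asker's Translate method.
    static String translate(String word) {
        return word.toUpperCase();
    }

    // Builds a dictionary whose keys are kept in natural (alphabetical) order.
    static TreeMap<String, String> buildDictionary(List<String> words) {
        TreeMap<String, String> dictionary = new TreeMap<>();
        for (String word : words) {
            dictionary.put(word, translate(word));
        }
        return dictionary;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = buildDictionary(Arrays.asList("pear", "apple", "fig"));
        // Entries are visited in key order: apple, fig, pear.
        for (Map.Entry<String, String> entry : dictionary.entrySet()) {
            System.out.println(entry.getKey() + " - " + entry.getValue());
        }
        // Look up a single word and its translation.
        System.out.println("fig -> " + dictionary.get("fig"));
    }
}
```

Note that TreeMap orders String keys by their natural ordering, which is case-sensitive; a comparator such as String.CASE_INSENSITIVE_ORDER can be passed to the constructor if that matters.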

You can use an object like this:
class Word {
String original;
String translation;
public Word(String original, String translation) {
this.original = original;
this.translation = translation;
}
}
Put the words into the list (declare it as ArrayList<Word> al):
while (sc.hasNext()) {
String word = (sc.next());
al.add(new Word(word, Translate(word)));
}
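And then:
// al must be declared as ArrayList<Word> for this loop to compile
for (Word word : al) {
System.out.println(word.original + " - " + word.translation);
}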
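With word and translation held together in one object, sorting the list becomes a one-liner. The sketch below assumes a Word class like the one above; the sortedByOriginal helper and the sample words are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class WordListDemo {
    // Pairs a word with its translation, as in the answer above.
    static class Word {
        final String original;
        final String translation;

        Word(String original, String translation) {
            this.original = original;
            this.translation = translation;
        }
    }

    // Returns a copy of the list sorted alphabetically by the original word.
    static List<Word> sortedByOriginal(List<Word> words) {
        List<Word> copy = new ArrayList<>(words);
        copy.sort(Comparator.comparing((Word w) -> w.original));
        return copy;
    }

    public static void main(String[] args) {
        List<Word> al = Arrays.asList(
                new Word("pear", "poire"),
                new Word("apple", "pomme"));
        for (Word word : sortedByOriginal(al)) {
            System.out.println(word.original + " - " + word.translation);
        }
    }
}
```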

Related

Dictionary of Words: need to scan each element for word length and make searchable [duplicate]

This question already has answers here:
length of each element in an array Java
(4 answers)
Closed 5 years ago.
I have converted a file of words into a String array. I need to somehow convert the Array into a list of word lengths and make it searchable. In other words, I need to be able to enter a word length (say, 5) and be presented with only the words that have a word length of five. Help?
public static void main(String[] args) throws IOException {
String token1 = "";
Scanner scan = new Scanner(new File("No.txt"));
List<String> temps = new ArrayList<String>();
while (scan.hasNext()){
token1 = scan.next();
temps.add(token1);
}
scan.close();
String[] tempsArray = temps.toArray(new String[0]);
for (String s : tempsArray) {
You don't use an array for that. The things you need are collections, more precisely: Maps and Lists; as you want to use a Map<Integer, List<String>>.
Meaning: a map that uses "word length" as key; and the mapped entry is a list containing all those words with that length. Here is a bit of code to get you started:
Map<Integer, List<String>> wordsByLength = new HashMap<>();
// now you have to fill that map; let's assume tempsArray contains all your words
for (String s : tempsArray) {
List<String> listForCurrentLength = wordsByLength.get(s.length());
if (listForCurrentLength == null) {
listForCurrentLength = new ArrayList<>();
}
listForCurrentLength.add(s);
wordsByLength.put(s.length(), listForCurrentLength);
}
The idea is basically to iterate that array you already got; and for each string in there ... put it into that map; depending on its length.
( the above was just written down; neither compiled nor tested; as said it is meant as "pseudo code" to get you going )
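The same idea can be written a little more compactly on Java 8 or later with Map.computeIfAbsent, and the "searchable" part then becomes a plain map lookup. This is a sketch under that assumption; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordsByLengthDemo {
    // Groups the words into lists keyed by word length.
    static Map<Integer, List<String>> groupByLength(String[] words) {
        Map<Integer, List<String>> wordsByLength = new HashMap<>();
        for (String word : words) {
            // Creates the list for this length on first use, then appends to it.
            wordsByLength.computeIfAbsent(word.length(), k -> new ArrayList<>()).add(word);
        }
        return wordsByLength;
    }

    // Returns all words of the requested length, or an empty list if there are none.
    static List<String> wordsOfLength(Map<Integer, List<String>> wordsByLength, int length) {
        return wordsByLength.getOrDefault(length, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        String[] tempsArray = {"apple", "fig", "grape", "kiwi", "pear"};
        Map<Integer, List<String>> wordsByLength = groupByLength(tempsArray);
        System.out.println(wordsOfLength(wordsByLength, 5));
        System.out.println(wordsOfLength(wordsByLength, 9));
    }
}
```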

Grouping of words from a text file to Arraylist on the basis of length

public class JavaApplication13 {
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
// TODO code application logic here
BufferedReader br;
String strLine;
ArrayList<String> arr =new ArrayList<>();
HashMap<Integer,ArrayList<String>> hm = new HashMap<>();
try {
br = new BufferedReader( new FileReader("words.txt"));
while( (strLine = br.readLine()) != null){
arr.add(strLine);
}
} catch (FileNotFoundException e) {
System.err.println("Unable to find the file: fileName");
} catch (IOException e) {
System.err.println("Unable to read the file: fileName");
}
ArrayList<Integer> lengths = new ArrayList<>(); //List to keep lengths information
System.out.println("Total Words: "+arr.size()); //Total words read from file
int i=0;
while(i<arr.size()) //this loop iterates over all the words of the text file, which are now stored in arr
{
boolean already=false;
String s = arr.get(i);
//following for loop will check if that length is already in lengths list.
for(int x=0;x<lengths.size();x++)
{
if(s.length()==lengths.get(x))
already=true;
}
//already == true means we already have an ArrayList for the current string length in our map
if(already==true)
{
hm.get(s.length()).add(s); //adding that string according to its length in hm(hashmap)
}
else
{
hm.put(s.length(),new ArrayList<>()); //create a new element in hm and the adding the new length string
hm.get(s.length()).add(s);
lengths.add(s.length());
}
i++;
}
//Now Print the whole map
for(int q=0;q<hm.size();q++)
{
System.out.println(hm.get(q));
}
}
}
Is this approach right?
Explanation:
Load all the words into an ArrayList.
Then iterate through each index, check the length of the word, and add it to an ArrayList of strings of that length. These ArrayLists are stored in a HashMap keyed by the length of the words they contain.
Firstly, your code works only for files that contain one word per line, because you process whole lines as words. To make your code more general, you have to split each line into words:
String[] words = strLine.split("\\s+");
Secondly, you don't need any temporary data structures. You can add your words to the map right after you read the line from the file. The arr and lengths lists are actually useless here, as they serve no purpose except temporary storage. You're using the lengths list just to store the lengths that have already been added to the hm map. The same can be achieved by invoking hm.containsKey(s.length()).
And an additional comment on your code:
for(int x=0;x<lengths.size();x++) {
if(s.length()==lengths.get(x))
already=true;
}
When you have a loop like this, where you only need to find out whether some condition is true for any element, you don't need to keep looping once the condition has been met. Use the break keyword inside the if block to terminate the loop, e.g.
for(int x=0;x<lengths.size();x++) {
if(s.length()==lengths.get(x)) {
already=true;
break; // this will terminate the loop after setting the flag to true
}
}
But as I already mentioned you don't need it at all. That is just for educational purposes.
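Putting the first two points of this answer together, a single pass that splits each line and uses hm.containsKey might look roughly like the sketch below. To keep it self-contained it reads from a StringReader rather than words.txt; the class and method names are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupWhileReadingDemo {
    // Reads lines, splits them into words and groups the words by length in one pass.
    static Map<Integer, List<String>> groupByLength(BufferedReader reader) {
        Map<Integer, List<String>> hm = new HashMap<>();
        try {
            String strLine;
            while ((strLine = reader.readLine()) != null) {
                for (String word : strLine.trim().split("\\s+")) {
                    if (word.isEmpty()) {
                        continue; // a blank line yields a single empty token
                    }
                    // containsKey replaces the separate lengths list.
                    if (!hm.containsKey(word.length())) {
                        hm.put(word.length(), new ArrayList<>());
                    }
                    hm.get(word.length()).add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return hm;
    }

    public static void main(String[] args) {
        // In the original program this would wrap new FileReader("words.txt").
        BufferedReader br = new BufferedReader(new StringReader("a bb cc\n\nddd e"));
        System.out.println(groupByLength(br));
    }
}
```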
Your approach is long, confusing, hard to debug and from what I see it's not good performance-wise (check out the contains method). Check this:
String[] words = {"a", "ab", "ad", "abc", "af", "b", "dsadsa", "c", "ghh", "po"};
Map<Integer, List<String>> groupByLength =
Arrays.stream(words).collect(Collectors.groupingBy(String::length));
System.out.println(groupByLength);
This is just an example, but you get the point. I have an array of words, and then I use streams and Java8 magic to group them in a map by length (exactly what you're trying to do). You get the stream, then collect it to a map, grouping by length of the words, so it's gonna put every 1 letter word in a list under key 1 etc.
You can use the same approach, but you have your words in a list so remember to not use Arrays.stream() but just .stream() on your list.
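For the list case just mentioned, a minimal sketch could look like this. The wrapper class and method name are only there to make the example self-contained; the sample words are the ones from the array above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class StreamGroupingDemo {
    // Groups the words of a list by their length using the streams API.
    static Map<Integer, List<String>> groupByLength(List<String> words) {
        return words.stream().collect(Collectors.groupingBy(String::length));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", "ab", "ad", "abc", "af", "b", "dsadsa", "c", "ghh", "po");
        Map<Integer, List<String>> groupByLength = groupByLength(words);
        // Every one-letter word ends up in the list stored under key 1, and so on.
        System.out.println(groupByLength.get(1));
        System.out.println(groupByLength.get(2));
    }
}
```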

How can I retrieve the value in a Hashmap stored in an arraylist type hashmap?

I am a beginner in Java. Basically, I have loaded each text document and stored the individual words of each document in a HashMap. After that, I tried storing all the HashMaps in an ArrayList. Now I am stuck on how to retrieve all the words from the HashMaps that are in the ArrayList!
private static long numOfWords = 0;
private String userInputString;
private static long wordCount(String data) {
long words = 0;
int index = 0;
boolean prevWhiteSpace = true;
while (index < data.length()) {
//Initialise character variable that will be checked.
char c = data.charAt(index++);
//Determine whether it is a space.
boolean currWhiteSpace = Character.isWhitespace(c);
//If previous is a space and character checked is not a space,
if (prevWhiteSpace && !currWhiteSpace) {
words++;
}
//Assign current character's determination of whether it is a spacing as previous.
prevWhiteSpace = currWhiteSpace;
}
return words;
} //
public static ArrayList StoreLoadedFiles()throws Exception{
final File f1 = new File ("C:/Users/Admin/Desktop/dataFiles/"); //specify the directory to load files
String data=""; //reset the words stored
ArrayList<HashMap> hmArr = new ArrayList<HashMap>(); //array of hashmap
for (final File fileEntry : f1.listFiles()) {
Scanner input = new Scanner(fileEntry); //load files
while (input.hasNext()) { //while there are still words in the document, continue to load all the words in a file
data += input.next();
input.useDelimiter("\t"); //similar to split function
} //while loop
String textWords = data.replaceAll("\\s+", " "); //remove all found whitespaces
HashMap<String, Integer> hm = new HashMap<String, Integer>(); //Creates a Hashmap that would be renewed when next document is loaded.
String[] words = textWords.split(" "); //store individual words into a String array
for (int j = 0; j < numOfWords; j++) {
int wordAppearCount = 0;
if (hm.containsKey(words[j].toLowerCase().replaceAll("\\W", ""))) { //replace non-word characters
wordAppearCount = hm.get(words[j].toLowerCase().replaceAll("\\W", "")); //remove non-word character and retrieve the index of the word
}
if (!words[j].toLowerCase().replaceAll("\\W", "").equals("")) {
//Words stored in hashmap are in lower case and have special characters removed.
hm.put(words[j].toLowerCase().replaceAll("\\W", ""), ++wordAppearCount);//index of word and string word stored in hashmap
}
}
hmArr.add(hm);//stores every single hashmap inside an ArrayList of hashmap
} //end of for loop
return hmArr; //return hashmap ArrayList
}
public static void LoadAllHashmapWords(ArrayList m){
for(int i=0;i<m.size();i++){
m.get(i); //stuck here!
}
Firstly, your logic won't work correctly. In the StoreLoadedFiles() method you iterate through the words with for (int j = 0; j < numOfWords; j++) { . The numOfWords field is initialized to zero and never updated, so this loop won't execute at all. You should loop up to the length of the words array instead.
Having said that, to retrieve the values from a list of hashmaps, first iterate through the list and, for each hashmap, take its entry set. Map.Entry is basically the key-value pair that you store in the hashmap. When you invoke map.entrySet(), it returns a java.util.Set<Map.Entry<Key, Value>>. A set is returned because the keys are unique.
So a complete program will look like this:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map.Entry;
import java.util.Scanner;
public class FileWordCounter {
public static List<HashMap<String, Integer>> storeLoadedFiles() {
final File directory = new File("C:/Users/Admin/Desktop/dataFiles/");
List<HashMap<String, Integer>> listOfWordCountMap = new ArrayList<HashMap<String, Integer>>();
Scanner input = null;
StringBuilder data;
try {
for (final File fileEntry : directory.listFiles()) {
input = new Scanner(fileEntry);
input.useDelimiter("\t");
data = new StringBuilder();
while (input.hasNext()) {
data.append(input.next());
}
input.close();
String wordsInFile = data.toString().replaceAll("\\s+", " ");
HashMap<String, Integer> wordCountMap = new HashMap<String, Integer>();
for(String word : wordsInFile.split(" ")){
String strippedWord = word.toLowerCase().replaceAll("\\W", "");
int wordAppearCount = 0;
if(strippedWord.length() > 0){
if(wordCountMap.containsKey(strippedWord)){
wordAppearCount = wordCountMap.get(strippedWord);
}
wordCountMap.put(strippedWord, ++wordAppearCount);
}
}
listOfWordCountMap.add(wordCountMap);
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} finally {
if(input != null) {
input.close();
}
}
return listOfWordCountMap;
}
public static void loadAllHashmapWords(List<HashMap<String, Integer>> listOfWordCountMap) {
for(HashMap<String, Integer> wordCountMap : listOfWordCountMap){
for(Entry<String, Integer> wordCountEntry : wordCountMap.entrySet()){
System.out.println(wordCountEntry.getKey() + " - " + wordCountEntry.getValue());
}
}
}
public static void main(String[] args) {
List<HashMap<String, Integer>> listOfWordCountMap = storeLoadedFiles();
loadAllHashmapWords(listOfWordCountMap);
}
}
Since you are a beginner in Java programming, I would like to point out a few best practices that you could start using from the beginning.
1. Closing resources: In your loop that reads from files you open a Scanner with Scanner input = new Scanner(fileEntry); but you never close it. This leaks resources such as open file handles. You should always use a try-catch-finally block and close resources in the finally block.
2. Avoid unnecessary redundant calls: If an operation gives the same result on every pass through a loop, try moving it outside the loop. In your case, for example, setting the scanner delimiter with input.useDelimiter("\t"); is essentially a one-time operation after a scanner is initialized, so you could move it outside the while loop.
3. Use StringBuilder instead of String: Repeated string manipulations such as concatenation should be done with a StringBuilder (or StringBuffer when you need synchronization) instead of += or +. This is because String is immutable, meaning its value cannot be changed, so each concatenation creates a new String object. This results in a lot of unused instances in memory, whereas StringBuilder is mutable and its value can be changed in place.
4. Naming convention: The usual naming convention for Java methods and variables is to start with a lower-case letter and capitalize the first letter of each subsequent word. So it's standard practice to name a method storeLoadedFiles as opposed to StoreLoadedFiles. (This could be opinion based ;))
5. Give descriptive names: It's good practice to give descriptive names, as it helps with later code maintenance. For example, wordCountMap is a better name than hm. If someone goes through your code in the future, descriptive names will help them understand it better and faster. Again, opinion based.
6. Use generics as much as possible: This gives you compile-time type checking and avoids explicit casts.
7. Avoid repetition: Similar to point 2, if an operation produces the same output and is needed multiple times, store the result in a variable and use the variable. In your case you were calling words[j].toLowerCase().replaceAll("\\W", "") multiple times. The result is the same every time, but each call creates unnecessary instances and repetition. So you could assign it to a String once and use that String elsewhere.
8. Try using the for-each loop wherever possible: This relieves us from taking care of indexing.
These are just suggestions. I tried to include most of them in my code, but I won't say it's perfect. Since you are a beginner, if you start following these best practices now, they will become ingrained. Happy coding.. :)
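As an alternative to the explicit finally block, Java 7 and later offer try-with-resources, which closes the Scanner automatically, and Java 8's Map.merge can replace the containsKey/get/put sequence. The sketch below shows both under those assumptions; it counts words from any Readable (a StringReader here) rather than from the files in the directory, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

class WordCountDemo {
    // Counts how often each word occurs, ignoring case and non-word characters.
    static Map<String, Integer> countWords(Readable source) {
        Map<String, Integer> wordCountMap = new HashMap<>();
        // try-with-resources closes the Scanner even if an exception is thrown.
        try (Scanner input = new Scanner(source)) {
            while (input.hasNext()) {
                String strippedWord = input.next().toLowerCase().replaceAll("\\W", "");
                if (!strippedWord.isEmpty()) {
                    // Inserts 1 for a new word, otherwise adds 1 to the existing count.
                    wordCountMap.merge(strippedWord, 1, Integer::sum);
                }
            }
        }
        return wordCountMap;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords(new StringReader("The cat, the dog."));
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " - " + entry.getValue());
        }
    }
}
```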
for (HashMap<String, Integer> map : m) {
for(Entry<String,Integer> e:map.entrySet()){
//your code here
}
}
or, if using java 8 you can play with lambda
m.stream().forEach((map) -> {
map.entrySet().stream().forEach((e) -> {
//your code here
});
});
But before all you have to change method signature to public static void LoadAllHashmapWords(List<HashMap<String,Integer>> m) otherwise you would have to use a cast.
P.S. are you sure your extracting method works? I've tested it a bit and had list of empty hashmaps all the time.

Best way to create categories with words

I am currently working on a little "Hangman" project. I want to make it possible for the user to choose from different categories, such as "Countries" or "Food" and this got me thinking about what would be the best way to handle and sort those categories.
I saved the words to a little text file, which looks something like this:
Countries: Hungary Austria Argentina Canada;
Food: Donut Bread Hamburger;
For now, I created a multidimensional ArrayList that stores all the words, each category in an individual ArrayList in that ArrayList.
ArrayList< ArrayList<String> > words = new ArrayList< ArrayList<String> >();
// ... read words from .txt file and store it in the words-ArrayList ...
I know, that in each category the first word is the title of a category, so if I wanted to get all the titles of the categories, it would look something like this:
for( ArrayList list : words ) {
System.out.println( list.get(0) );
}
Now this method I'm using works perfectly fine; it just seems a bit too complex to me, and I was wondering if there are simpler methods to do that. Thanks in advance for any suggestions.
Better to use Map<String, List<String>> for my money. The Map can be a HashMap, and the word category would be the key while the List (an ArrayList in the concrete form) would be the related value.
Then to extract the categories, all you'd need to do would be to extract the key set and iterate through it. e.g.,
Map<String, List<String>> mapList = new HashMap<String, List<String>>();
// fill map here...
for (String key : mapList.keySet()) {
List<String> list = mapList.get(key);
System.out.printf("%s: %s%n", key, list);
}
If you want the keys to be in a certain order, then you'd need to use one of the other concrete implementations of Map such as a TreeMap.
For a simple example:
import java.io.InputStream;
import java.util.*;
public class MapList {
public static void main(String[] args) {
Map<String, List<String>> mapList = new HashMap<String, List<String>>();
String sourcePath = "MapListData.txt";
InputStream source = MapList.class.getResourceAsStream(sourcePath);
if (source == null) {
return;
}
Scanner scan = new Scanner(source);
while (scan.hasNextLine()) {
String line = scan.nextLine().trim();
if (!line.isEmpty()) {
line = line.replace(";", "");
String[] mainTokens = line.split("\\s*:\\s*");
if (mainTokens.length == 2) {
String key = mainTokens[0];
List<String> list = new ArrayList<String>();
String[] subTokens = mainTokens[1].split("\\s+");
for (String subToken : subTokens) {
list.add(subToken);
}
mapList.put(key, list);
}
}
}
if (scan != null) {
scan.close();
}
for (String key : mapList.keySet()) {
List<String> list = mapList.get(key);
System.out.printf("%s: %s%n", key, list);
}
}
}
For me this prints:
Beer: [Pilsner, Weiss, Brown_Ale, IPA]
Countries: [Hungary, Austria, Argentina, Canada]
Food: [Donut, Bread, Hamburger]
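A HashMap makes no promise about iteration order, so if the categories should always be listed alphabetically, the TreeMap variant mentioned earlier is the safer choice. Here is a rough sketch of the parsing step with a TreeMap; it takes the lines as a list instead of reading MapListData.txt, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class CategoryMapDemo {
    // Parses lines of the form "Category: word1 word2 ...;" into a sorted map.
    static Map<String, List<String>> parse(List<String> lines) {
        Map<String, List<String>> categories = new TreeMap<>();
        for (String line : lines) {
            String cleaned = line.trim().replace(";", "");
            String[] mainTokens = cleaned.split("\\s*:\\s*");
            if (mainTokens.length == 2) {
                List<String> words = new ArrayList<>(Arrays.asList(mainTokens[1].split("\\s+")));
                categories.put(mainTokens[0], words);
            }
        }
        return categories;
    }

    public static void main(String[] args) {
        Map<String, List<String>> categories = parse(Arrays.asList(
                "Food: Donut Bread Hamburger;",
                "Countries: Hungary Austria Argentina Canada;"));
        // The key set doubles as the list of category titles, in alphabetical order.
        for (String title : categories.keySet()) {
            System.out.printf("%s: %s%n", title, categories.get(title));
        }
    }
}
```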

string compare in java

I have an ArrayList, with elements something like:
[string,has,was,hctam,gnirts,saw,match,sah]
I would like to find the elements that repeat another element in reverse, such as string and gnirts, and delete the reversed one (gnirts). How do I go about achieving that?
Edit: I would like to rephrase the question:
Given an arrayList of strings, how does one go about deleting elements containing reversed strings?
Given the following input:
[string,has,was,hctam,gnirts,saw,match,sah]
How does one reach the following output:
[string,has,was,match]
Set<String> result = new HashSet<String>();
for(String word: words) {
if(result.contains(word) || result.contains(new StringBuffer(word).reverse().toString())) {
continue;
}
result.add(word);
}
// result
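A HashSet does not remember insertion order, so the result above may come out shuffled. If the order of the original list matters, a LinkedHashSet can be used instead. The sketch below is one way to do that, with hypothetical class and method names. It keeps whichever spelling appears first, so for the sample input it keeps hctam rather than match; choosing the "real" word would need extra information such as a word list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReverseDedupDemo {
    // Drops every word whose exact or reversed form has already been kept.
    static List<String> removeReversed(List<String> words) {
        Set<String> result = new LinkedHashSet<>(); // preserves insertion order
        for (String word : words) {
            String reversed = new StringBuilder(word).reverse().toString();
            if (result.contains(word) || result.contains(reversed)) {
                continue;
            }
            result.add(word);
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("string", "has", "was", "hctam", "gnirts", "saw", "match", "sah");
        System.out.println(removeReversed(words)); // [string, has, was, hctam]
    }
}
```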
You can use a comparator that sorts the characters before checking them for equality. This means that compare("string", "gnirts") will return 0. Then use this comparator as you traverse through the list and copy the matching elements to a new list.
Another option (if you have a really large list) is to create an Anagram class that wraps a String (String is final, so it cannot be extended). Override hashCode and equals so that anagrams compare equal and produce the same hash code, then use a HashMap (or HashSet) of Anagram objects to check your array list for anagrams.
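A lighter-weight variation on the same idea, sketched here as an assumption rather than as this answer's exact design, is to use the word's sorted characters as a canonical key instead of writing a dedicated class. Note that this treats every anagram as a duplicate, which is broader than the reversed-string check the question asks for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AnagramDedupDemo {
    // Two words are anagrams exactly when their sorted characters are equal.
    static String canonicalKey(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Keeps the first word seen for each canonical key and drops later anagrams.
    static List<String> removeAnagrams(List<String> words) {
        Set<String> seenKeys = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String word : words) {
            // Set.add returns false when the key was already present.
            if (seenKeys.add(canonicalKey(word))) {
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("string", "has", "was", "hctam", "gnirts", "saw", "match", "sah");
        System.out.println(removeAnagrams(words)); // [string, has, was, hctam]
    }
}
```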
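HashSet<String> set = new HashSet<String>();
ArrayList<String> newlst = new ArrayList<String>();
for (String str : arraylst)
{
// keep a word only if neither it nor its reverse has been seen already
String reversed = new StringBuilder(str).reverse().toString();
if(!set.contains(str) && !set.contains(reversed))
{
set.add(str);
newlst.add(str);
}
}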
To remove the duplicates you can use a HashMap in which the key is the sum of the character codes of a word and the value is the word itself. A word and its reverse contain the same letters, so they produce the same key. When a new word is added whose sum equals an existing key, it replaces the word stored under that key, so the map ends up holding one word per key. Note that the sum is not unique to a word: unrelated words can share it (for example "ad" and "bc" both sum to 197), so this approach can also drop words that are not reversals of each other.
As for "string" looking better than "gnirts" in the result: there may be cases where we cannot tell which form is the better one, so this code assumes that the surviving form does not matter, only that no duplicates remain.
ArrayList<String> mainList = new ArrayList<String>();
mainList.add("string,has,was,hctam,gnirts,saw,match,sah");
String[] listChar = mainList.get(0).split(",");
HashMap <Integer, String> hm = new HashMap<Integer, String>();
for (String temp : listChar) {
int sumStr=0;
for (int i=0; i<temp.length(); i++)
sumStr += temp.charAt(i);
hm.put(sumStr, temp);
}
mainList=new ArrayList<String>();
Set<Map.Entry<Integer, String>> set = hm.entrySet();
for (Map.Entry<Integer, String> temp : set) {
mainList.add(temp.getValue());
}
System.out.println(mainList);
UPD:
1) The text file needs to be saved in ANSI encoding.
First, I replaced Scanner with FileReader and BufferedReader:
String fileRStr = new String();
String stringTemp;
FileReader fileR = new FileReader("text.txt");
BufferedReader streamIn = new BufferedReader(fileR);
while ((stringTemp = streamIn.readLine()) != null)
fileRStr += stringTemp;
fileR.close();
mainList.add(fileRStr);
In addition, all the words in the file must be separated by commas, because the source line is split into words with split(",").
If your words are separated by another character, replace the comma with that character in the following line:
String[] listChar = mainList.get(0).split(",");
