string compare in java - java

I have an ArrayList with elements something like:
[string,has,was,hctam,gnirts,saw,match,sah]
I would like to delete the entries that repeat another entry in reverse, such as string and gnirts, keeping one and deleting the other (gnirts). How do I go about achieving this?
Edit: I would like to rephrase the question:
Given an ArrayList of strings, how does one go about deleting the elements that are the reverse of another element?
Given the following input:
[string,has,was,hctam,gnirts,saw,match,sah]
How does one reach the following output:
[string,has,was,match]

// LinkedHashSet keeps insertion order, so the surviving words stay in list order
Set<String> result = new LinkedHashSet<String>();
for(String word: words) {
if(result.contains(word) || result.contains(new StringBuilder(word).reverse().toString())) {
continue;
}
result.add(word);
}
// result

You can use a comparator that sorts the characters of each string before checking them for equality, so that compare("string", "gnirts") returns 0. Use this comparator as you traverse the list, and copy each element that matches nothing already copied to a new list. Note that this treats any two anagrams as equal, not only exact reversals.
Another option (if you have a really large list) is to create an Anagram class that wraps a String (String is final, so it cannot be extended). Override hashCode and equals so that anagrams compare equal and produce the same hash code, then use a HashSet or HashMap of Anagram objects to check your array list for anagrams.
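The sorted-characters idea above can be sketched without a custom class by using the sorted form of each word as a lookup key. This is only a minimal sketch: the names AnagramFilter, sortedKey and removeAnagrams are made up for illustration, and the code keeps whichever spelling appears first in the list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AnagramFilter {
    // Canonical key: the word's characters in sorted order,
    // so "string" and "gnirts" both map to "ginrst".
    static String sortedKey(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Keeps the first word seen for each key and drops later anagrams
    // (exact reversals are a special case of anagrams).
    static List<String> removeAnagrams(List<String> words) {
        Set<String> seenKeys = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            // Set.add returns false when the key is already present
            if (seenKeys.add(sortedKey(word))) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
                "string", "has", "was", "hctam", "gnirts", "saw", "match", "sah");
        System.out.println(removeAnagrams(words)); // prints [string, has, was, hctam]
    }
}
```

Because the first occurrence wins, "hctam" is kept here rather than "match". If only exact reversals should count as duplicates, compare against the reversed string instead of the sorted key.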

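HashSet<String> set = new HashSet<String>();
ArrayList<String> newlst = new ArrayList<String>();
for (String str : arraylst)
{
// keep str only if neither it nor its reverse has been seen already
String reversed = new StringBuilder(str).reverse().toString();
if(!set.contains(str) && !set.contains(reversed))
{
set.add(str);
newlst.add(str);
}
}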

To remove duplicate items, you can use a HashMap in which the key is the sum of the character codes of a word and the value is the word itself. When a word is added whose sum matches an existing key, it replaces the word stored under that key, so the HashMap ends up with one word per sum. This relies on the assumption that two unrelated words never have the same sum, which does not hold in general ("ad" and "bc" both sum to 197), so treat it as a heuristic for small inputs.
As for "string" looking better than "gnirts" in the result: there may be cases where we cannot tell which form is better, so I assumed that the final form of the word does not matter, as long as there are no duplicates.
ArrayList<String> mainList = new ArrayList<String>();
mainList.add("string,has,was,hctam,gnirts,saw,match,sah");
String[] listChar = mainList.get(0).split(",");
HashMap <Integer, String> hm = new HashMap<Integer, String>();
for (String temp : listChar) {
int sumStr=0;
for (int i=0; i<temp.length(); i++)
sumStr += temp.charAt(i);
hm.put(sumStr, temp);
}
mainList=new ArrayList<String>();
Set<Map.Entry<Integer, String>> set = hm.entrySet();
for (Map.Entry<Integer, String> temp : set) {
mainList.add(temp.getValue());
}
System.out.println(mainList);
UPD:
1) The text file needs to be saved in ANSI encoding.
First, I replaced Scanner with FileReader and BufferedReader:
String fileRStr = new String();
String stringTemp;
FileReader fileR = new FileReader("text.txt");
BufferedReader streamIn = new BufferedReader(fileR);
while ((stringTemp = streamIn.readLine()) != null)
fileRStr += stringTemp;
fileR.close();
mainList.add(fileRStr);
In addition, all the words in the file must be separated by commas, because the line is split into words by split(",").
If your words are separated by another character, replace the comma with that character in the following line:
String[] listChar = mainList.get(0).split(",");

Related

manipulating strings withinside an arraylist

I have built an ArrayList of strings from two sources:
Path p1 = Paths.get("C:/Users/Green/documents/dictionary.txt");
Scanner sc = new Scanner(p1.toFile()).useDelimiter("\\s*-\\s*");
ArrayList al = new ArrayList();
while (sc.hasNext()) {
String word = (sc.next());
al.add(word);
al.add(Translate(word));
}
The list is built from a dictionary text file that is read one line at a time: each word is followed by its translation. Translate is a Java method that returns a String, so I am adding two strings to the ArrayList for every line in the dictionary.
I can print the dictionary out and the translations....but the printout is unhelpful as it prints all the words and then all the translations....not much use to quickly look up.
for(int i=0;i<al.size();i++){
al.forEach(word ->{ System.out.println(word); });
}
Is there a way that I can change either how I add the strings to the ArrayList or how I process it afterwards, so that I can retrieve one word and its translation at a time?
Ideally I want to be able to sort the dictionary as the file I receive is not in alphabetic order.
I am not sure whether you are required to use an ArrayList here.
I would suggest using a Map for this kind of dictionary data. A Map stores your data as pairs: the key is the original word and the value is its translation.
Here is a simple example:
Path p1 = Paths.get("C:/Users/Green/documents/dictionary.txt");
Scanner sc = new Scanner(p1.toFile()).useDelimiter("\\s*-\\s*");
Map<String, String> dic = new HashMap<String, String>();
while (sc.hasNext()) {
String word = (sc.next());
dic.put(word, Translate(word));
}
//print out from dictionary data
for(Map.Entry<String, String> entry: dic.entrySet()){
System.out.println(entry.getKey() + " - " + entry.getValue());
}
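Since the question also asks for alphabetical order, one option is a TreeMap, which keeps its keys sorted. The sketch below is only illustrative: the translate method is a hypothetical stand-in for the asker's Translate (it just upper-cases the word), and the sample words are made up.

```java
import java.util.Map;
import java.util.TreeMap;

public class SortedDictionary {
    // Hypothetical stand-in for the asker's Translate method.
    static String translate(String word) {
        return word.toUpperCase();
    }

    // Builds a map whose keys are kept in natural String order
    // (case-sensitive, so upper-case words sort before lower-case ones).
    static Map<String, String> build(String[] words) {
        Map<String, String> dic = new TreeMap<>();
        for (String word : words) {
            dic.put(word, translate(word));
        }
        return dic;
    }

    public static void main(String[] args) {
        Map<String, String> dic = build(new String[] {"pear", "apple", "mango"});
        // Iteration follows key order: apple, mango, pear
        for (Map.Entry<String, String> entry : dic.entrySet()) {
            System.out.println(entry.getKey() + " - " + entry.getValue());
        }
        // Look up a single word and its translation
        System.out.println(dic.get("mango")); // prints MANGO
    }
}
```

For case-insensitive ordering, the map could be created with new TreeMap<>(String.CASE_INSENSITIVE_ORDER) instead.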
You can use an object like this:
class Word {
String original;
String translation;
public Word(String original, String translation) {
this.original = original;
this.translation = translation;
}
}
Put words to list:
while (sc.hasNext()) {
String word = (sc.next());
al.add(new Word(word, Translate(word)));
}
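And then (with al declared as ArrayList<Word>):
for (Word word : al) {
System.out.println(word.original + " - " + word.translation);
}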

arraylist of character arrays java

I originally have an arraylist of strings but I want to save it as an arraylist of those strings.toCharArray() instead. Is it possible to make an arraylist that stores char arrays? Here is how I tried to implement it.
String[] words = new String[]{"peter","month","tweet", "pete", "twee", "pet", "et"};
HashMap<Integer,ArrayList<Character[]>> ordered = new HashMap<>();
int length = 0;
int max = 0; //max Length of words left
for(String word: words){
if(ordered.containsKey(length) == false){ //if int length key doesnt exist yet
ordered.put(length, new ArrayList<Character[]>()); //put key in hashmap with value of arraylist with the one value
ordered.get(length).add(word.toCharArray());
}
}
Note that toCharArray() returns an array of primitives (char[]), and not an array of the boxing class (Character[] as you currently have). Additionally, you're only adding the given array to the map if the length of the array isn't in the map, which probably isn't the behavior you wanted (i.e., you should move the line ordered.get(length).add(word.toCharArray()); outside the if statement).
Also, note that Java 8's streams can do a lot of the heavy lifting for you:
String[] words = new String[]{"peter","month","tweet", "pete", "twee", "pet", "et"};
Map<Integer, List<char[]>> ordered =
Arrays.stream(words)
.map(String::toCharArray)
.collect(Collectors.groupingBy(x -> x.length));
EDIT:
As per the question in the comment, this is also entirely possible in Java 7 without streams:
String[] words = new String[]{"peter","month","tweet", "pete", "twee", "pet", "et"};
Map<Integer, List<char[]>> ordered = new HashMap<>();
for (String word: words) {
int length = word.length();
// if int length key doesnt exist in the map already
List<char[]> list = ordered.get(length);
if (list == null) {
list = new ArrayList<>();
ordered.put(length, list);
}
list.add(word.toCharArray());
}

Dictionary of Words: need to scan each element for word length and make searchable [duplicate]

I have converted a file of words into a String array. I need to somehow convert the Array into a list of word lengths and make it searchable. In other words, I need to be able to enter a word length (say, 5) and be presented with only the words that have a word length of five. Help?
public static void main(String[] args) throws IOException {
String token1 = "";
Scanner scan = new Scanner(new File("No.txt"));
List<String> temps = new ArrayList<String>();
while (scan.hasNext()){
token1 = scan.next();
temps.add(token1);
}
scan.close();
String[] tempsArray = temps.toArray(new String[0]);
for (String s : tempsArray) {
You don't use an array for that. The things you need are collections, more precisely: Maps and Lists; as you want to use a Map<Integer, List<String>>.
Meaning: a map that uses "word length" as key; and the mapped entry is a list containing all those words with that length. Here is a bit of code to get you started:
Map<Integer, List<String>> wordsByLength = new HashMap<>();
// now you have to fill that map; lets assume tempsArray contains all your words
for (String s : tempsArray) {
List<String> listForCurrentLength = wordsByLength.get(s.length());
if (listForCurrentLength == null) {
listForCurrentLength = new ArrayList<>();
}
listForCurrentLength.add(s);
wordsByLength.put(s.length(), listForCurrentLength);
}
The idea is basically to iterate that array you already got; and for each string in there ... put it into that map; depending on its length.
( the above was just written down; neither compiled nor tested; as said it is meant as "pseudo code" to get you going )
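For reference, here is a compilable sketch of the same idea, including the lookup by length that the question asks for. The class and method names (WordsByLength, index, lookup) and the sample words are assumptions made for illustration; computeIfAbsent and getOrDefault require Java 8 or later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordsByLength {
    // Groups the words by their length.
    static Map<Integer, List<String>> index(String[] words) {
        Map<Integer, List<String>> wordsByLength = new HashMap<>();
        for (String s : words) {
            // computeIfAbsent creates the list the first time a length is seen
            wordsByLength.computeIfAbsent(s.length(), k -> new ArrayList<>()).add(s);
        }
        return wordsByLength;
    }

    // Returns the words of the given length, or an empty list if there are none.
    static List<String> lookup(Map<Integer, List<String>> index, int length) {
        return index.getOrDefault(length, Collections.emptyList());
    }

    public static void main(String[] args) {
        String[] words = {"apple", "pear", "mango", "kiwi", "grape"};
        Map<Integer, List<String>> index = index(words);
        System.out.println(lookup(index, 5)); // prints [apple, mango, grape]
        System.out.println(lookup(index, 7)); // prints []
    }
}
```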

How to create vocabulary from Arrays of Strings

I have to build a vocabulary of the unique words in some texts. I have the texts converted to arrays of Strings. Now I want a list containing only the unique words. So the first step is to convert the first array of Strings to a List<String> (I guess?) with all duplicate words filtered out. How do I do this, and do I use a List<String> or another String[]?
Second, the next String[] I 'read-in' should update the vocabulary List<String> but ONLY add new words from the text.
It must look something like:
public List<String> makeVocabulary(String[] tokens){
List<String> vocabulary = new ArrayList<>();
//add unique words from 'tokens' to vocabulary
return vocabulary;
}
TL;DR: how do I convert a whole bunch of String[] to one List<String> with only the unique words from the String[]'s?
Upon review of your code, it appears that you would be clearing vocabulary each time you run this command, so it can only be done once. If you'd like to make it more modular, do something like this:
public class yourClass
{
private List<String> vocabulary = new ArrayList<String>();
public List<String> makeVocabulary(String[] tokens)
{
for( int i = 0; i < tokens.length; i++ )
if( !vocabulary.contains( tokens[i] ) )
vocabulary.add(tokens[i]);
return vocabulary;
}
}
For determining unique tokens, use a Set implementation...
public List<String> makeVocabulary(String[] tokens){
Set<String> uniqueTokens = new HashSet<String>();
for(String token : tokens) {
uniqueTokens.add(token);
}
List<String> vocabulary = new ArrayList<String>(uniqueTokens);
return vocabulary;
}
One way to achieve your goal is to make use of the Set class as opposed to a List of strings. You could look into that e.g. like the code below.
public List<String> makeVocabulary(String[] tokens){
Set<String> temp = new HashSet<>();
// a Set silently ignores duplicates, so each unique word is added once
temp.addAll(Arrays.asList(tokens));
List<String> vocabulary = new ArrayList<>();
vocabulary.addAll(temp);
return vocabulary;
}
If you can live with Set as the return type of makeVocabulary, you can just return temp.
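To cover the second part of the question (updating the vocabulary with each new String[]), the set can be kept as a field so that it accumulates across calls. This is a minimal sketch under that assumption; the class name Vocabulary and the methods addTokens and asList are invented for illustration. A LinkedHashSet is used so that words come back in the order they were first seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class Vocabulary {
    // LinkedHashSet ignores duplicates and remembers first-insertion order.
    private final Set<String> words = new LinkedHashSet<>();

    // Adds the tokens of one text; words already present are skipped.
    public void addTokens(String[] tokens) {
        words.addAll(Arrays.asList(tokens));
    }

    // Returns a copy of the vocabulary as a list.
    public List<String> asList() {
        return new ArrayList<>(words);
    }

    public static void main(String[] args) {
        Vocabulary vocabulary = new Vocabulary();
        vocabulary.addTokens(new String[] {"the", "cat", "the"});
        vocabulary.addTokens(new String[] {"cat", "sat"});
        System.out.println(vocabulary.asList()); // prints [the, cat, sat]
    }
}
```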

How to Count Unique Values in an ArrayList?

I have to count the number of unique words from a text document using Java. First I had to get rid of the punctuation in all of the words. I used the Scanner class to scan each word in the document and put in an String ArrayList.
So, the next step is where I'm having the problem! How do I create a method that can count the number of unique Strings in the array?
For example, if the array contains apple, bob, apple, jim, bob; the number of unique values in this array is 3.
public countWords() {
try {
Scanner scan = new Scanner(in);
while (scan.hasNext()) {
String words = scan.next();
if (words.contains(".")) {
words.replace(".", "");
}
if (words.contains("!")) {
words.replace("!", "");
}
if (words.contains(":")) {
words.replace(":", "");
}
if (words.contains(",")) {
words.replace(",", "");
}
if (words.contains("'")) {
words.replace("?", "");
}
if (words.contains("-")) {
words.replace("-", "");
}
if (words.contains("‘")) {
words.replace("‘", "");
}
wordStore.add(words.toLowerCase());
}
} catch (FileNotFoundException e) {
System.out.println("File Not Found");
}
System.out.println("The total number of words is: " + wordStore.size());
}
Are you allowed to use Set? If so, a HashSet may solve your problem. HashSet doesn't accept duplicates.
HashSet noDupSet = new HashSet();
noDupSet.add(yourString);
noDupSet.size();
size() method returns number of unique words.
If you really have to use only ArrayList, one way to do it is:
1) Create a temp ArrayList
2) Iterate original list and retrieve element
3) If tempArrayList doesn't contain element, add element to tempArrayList
Starting from Java 8 you can use Stream:
After you add the elements in your ArrayList:
long n = wordStore.stream().distinct().count();
It converts your ArrayList to a stream and then it counts only the distinct elements.
I would advise using a HashSet. It automatically filters out duplicates when you call the add method.
Although I believe a set is the easiest solution, you can still use your original solution and just add an if statement to check if value already exists in the list before you do your add.
if( !wordStore.contains( words.toLowerCase() ) )
wordStore.add(words.toLowerCase());
Then the number of words in your list is the total number of unique words (ie: wordStore.size() )
This general-purpose solution takes advantage of the fact that the Set abstract data type does not allow duplicates. The Set.add() method is especially useful here because it returns a boolean flag indicating whether the element was actually added. A HashMap is used to track the number of occurrences of each original element. The algorithm can be adapted for variations of this type of problem. It runs in O(n) time.
public static void main(String args[])
{
String[] strArray = {"abc", "def", "mno", "xyz", "pqr", "xyz", "def"};
System.out.printf("RAW: %s ; PROCESSED: %s \n",Arrays.toString(strArray), duplicates(strArray).toString());
}
public static HashMap<String, Integer> duplicates(String arr[])
{
HashSet<String> distinctKeySet = new HashSet<String>();
HashMap<String, Integer> keyCountMap = new HashMap<String, Integer>();
for(int i = 0; i < arr.length; i++)
{
if(distinctKeySet.add(arr[i]))
keyCountMap.put(arr[i], 1); // unique value or first occurrence
else
keyCountMap.put(arr[i], (Integer)(keyCountMap.get(arr[i])) + 1);
}
return keyCountMap;
}
RESULTS:
RAW: [abc, def, mno, xyz, pqr, xyz, def] ; PROCESSED: {pqr=1, abc=1, def=2, xyz=2, mno=1}
You can create a Hashtable or HashMap as well. The keys would be your input strings and each value would be the number of times that string occurs in your input array. This takes O(N) time and space.
Solution 2:
Sort the input list.
Similar strings would be next to each other.
Compare list(i) to list(i+1) and count the number of duplicates.
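The sort-and-compare steps above could look roughly like the following. It is a sketch only: the class and method names are made up, and it counts the distinct values (the number of duplicates is then the list size minus that count). Sorting makes this O(n log n) rather than O(n).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedUniqueCount {
    static int countUnique(List<String> input) {
        // Work on a copy so the caller's list is not reordered
        List<String> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        int unique = 0;
        for (int i = 0; i < sorted.size(); i++) {
            // After sorting, equal strings are adjacent, so a new value
            // starts wherever an element differs from its predecessor
            if (i == 0 || !sorted.get(i).equals(sorted.get(i - 1))) {
                unique++;
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "bob", "apple", "jim", "bob");
        int unique = countUnique(words);
        System.out.println(unique);                // prints 3
        System.out.println(words.size() - unique); // prints 2 (duplicate entries)
    }
}
```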
A shorthand way to do it is as follows:
ArrayList<String> duplicateList = new ArrayList<String>();
duplicateList.add("one");
duplicateList.add("two");
duplicateList.add("one");
duplicateList.add("three");
System.out.println(duplicateList); // prints [one, two, one, three]
HashSet<String> uniqueSet = new HashSet<String>();
uniqueSet.addAll(duplicateList);
System.out.println(uniqueSet); // prints [two, one, three]
duplicateList.clear();
System.out.println(duplicateList);// prints []
duplicateList.addAll(uniqueSet);
System.out.println(duplicateList);// prints [two, one, three]
public class UniqueinArrayList {
public static void main(String[] args) {
StringBuffer sb=new StringBuffer();
List al=new ArrayList();
al.add("Stack");
al.add("Stack");
al.add("over");
al.add("over");
al.add("flow");
al.add("flow");
System.out.println(al);
Set s=new LinkedHashSet(al);
System.out.println(s);
Iterator itr=s.iterator();
while(itr.hasNext()){
sb.append(itr.next()+" ");
}
System.out.println(sb.toString().trim());
}
}
3 distinct possible solutions:
Use HashSet as suggested above.
Create a temporary ArrayList and store only unique element like below:
public static int getUniqueElement(List<String> data) {
List<String> newList = new ArrayList<>();
for (String eachWord : data)
if (!newList.contains(eachWord))
newList.add(eachWord);
return newList.size();
}
Java 8 solution
long count = data.stream().distinct().count();
