First Unique String in a String array

Given a String array, how would you find the first unique String element in the array?
public static String UniqueString(String[] s) {
String str ="";
for(int i=0;i<s.length;i++) {
for(int j=i+1;j<s.length;j++) {
System.out.println(s[i]+" "+s[j]);
str = s[i];
if(str==s[j]) {
break;
}
}if(!(str==s[i+1])){
return str;
}
}
return str;
}
So a String array of {Dog, Cat, Dog, Wolf, lion} should return Cat.

Your approach grows quadratically with the size of the list. There's a better approach that is essentially linear in the list size, which is to use an ordered map from strings to the number of occurrences. Use one pass through the list to build the map and then one pass through the map to find the first element (if any) with a count of 1. You can use a LinkedHashMap to implement this.
public static String uniqueString(String[] list) {
Integer ZERO = 0; // to avoid repeated autoboxing below
final LinkedHashMap<String, Integer> map = new LinkedHashMap<>(list.length);
// build the map
for (String s : list) {
Integer count = map.getOrDefault(s, ZERO);
map.put(s, count + 1);
}
// find the first unique entry. Note that the map's iteration order is deterministic here.
for (Map.Entry<String, Integer> entry : map.entrySet()) {
if (entry.getValue() == 1) {
return entry.getKey();
}
}
// if we get this far, there was no unique string in the list
return "";
}
Note that you could use any kind of Map implementation (including HashMap) and forgo the ordering property of LinkedHashMap by replacing the second loop with a loop through the original list:
for (String s : list) {
if (map.get(s) == 1) {
return s;
}
}
However, if the list has lots of repeated strings, then iterating through the map will probably require significantly fewer iterations. So you might as well use the added functionality of LinkedHashMap, which you get for very little performance penalty compared to HashMap.

You were very close to a working solution. You need a flag to indicate whether you found the String again in s (not sure where you got names). Also, we compare Strings with .equals (not ==), and method names start with a lower case letter. Something like,
public static String uniqueString(String[] s) {
for (int i = 0; i < s.length; i++) {
boolean unique = true;
for (int j = i + 1; j < s.length; j++) {
if (s[j].equals(s[i])) {
s[j] = s[s.length - 1]; // <-- handle bug, ensure that dupes aren't
// found again.
unique = false;
break;
}
}
if (unique) {
return s[i];
}
}
return "";
}

Java 8
public static String uniqueString(String[] s) {
StringBuilder result = new StringBuilder();
Stream.of(s)
.collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
.entrySet()
.stream()
.filter(entry -> entry.getValue() == 1)
.findFirst()
.ifPresent(entry -> result.append(entry.getKey()));
return result.toString();
}
Update, after 2 years:
Not sure why I had used a StringBuilder when I could just do it all in a single statement:
public static String uniqueString(String[] s) {
return Stream.of(s)
.collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()))
.entrySet()
.stream()
.filter(entry -> entry.getValue() == 1)
.findFirst()
.map(Map.Entry::getKey)
.orElse(null);
}

Perhaps there is another solution that solves your problem in a more Java 8 way: use a map to record the count of each string, then traverse the array from the beginning. The first string whose count is 1 is the answer.
It could look like this:
public static void main(String... args) {
String[] arr = {"Dog", "Cat", "Dog", "Wolf", "lion"};
Map<String, Long> stringCountMap = Arrays.stream(arr)
.collect(Collectors.groupingBy(s -> s, Collectors.counting()));
for (String s : arr) {
if (stringCountMap.get(s) == 1) {
System.out.println("The first non-duplicate string: " + s);
break;
}
}
}
You can also use a LinkedHashMap, as others mentioned, to keep the insertion order and avoid traversing the original array again:
private static void another(String[] arr) {
Map<String, Long> stringCountMap = Arrays.stream(arr)
.collect(Collectors.groupingBy(s -> s, LinkedHashMap::new, Collectors.counting()));
for (String s : stringCountMap.keySet()) {
if (stringCountMap.get(s) == 1) {
System.out.println("The first non-duplicate string: " + s);
break;
}
}
}
The output will always be:
The first non-duplicate string: Cat

The above answer does not work in all cases.
For instance, {"Dog","Dog","Cat"} would return Dog. The problem is that it does not check the entire array for duplicates.
private static String getFirstUniqueString(String[] names) {
for(int x=0;x<names.length;x++)
{
if(countOccurences(names[x],names) == 1 )
return names[x];
}
return "No Unique Strings";
}
private static int countOccurences(String string, String[] names)
{
int count=0;
for(int y = 0; y<names.length;y++)
{
if(names[y].equals(string))
{
count++;
if(count >1 )
return count;
}
}
return count;
}
Instead, maybe break it into two pieces: one method to find the unique string and another to count occurrences. This way we count how many times the word appears in the entire array.
The if statement inside countOccurences returns as soon as a second occurrence is found, which saves run time when you only need to know whether there is more than one. Remove it if you want the exact count.

public static String FirstUniqueString(String[] strs) {
List<String> list = new LinkedList<>();
for (String s: strs) {
if (list.contains(s)) {
list.remove(s);
} else {
list.add(s);
}
}
return list.get(0);
}
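We can use a simple LinkedList to keep track of the duplicates. For example, if the input is new String[] {Dog, Cat, Dog, Wolf, Lion, Dog}, the first unique element would still be Cat. The list after the for-each loop will be in the following sequence: {Cat, Wolf, Lion, Dog}. Note that a string occurring an odd number of times (Dog here) is left in the list, so the result is only reliable when no string occurs more than twice, and list.get(0) throws if no string is left. The runtime is O(N^2), because the for-each loop is O(N) and each contains() call is also O(N).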
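The toggle idea above can be made robust against strings that occur three or more times by remembering every string already seen. The following is only a minimal sketch under that assumption; the class name FirstUniqueSketch, the method name firstUnique and the choice to return null when nothing is unique are all illustrative, not taken from the answers here.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

final class FirstUniqueSketch {

    // Returns the first string that occurs exactly once, or null if none does.
    static String firstUnique(String[] strs) {
        Set<String> seen = new HashSet<>();             // every string met so far
        Set<String> candidates = new LinkedHashSet<>(); // met exactly once, in first-seen order
        for (String s : strs) {
            if (seen.add(s)) {
                candidates.add(s);    // first occurrence: still a candidate
            } else {
                candidates.remove(s); // repeat: can never become unique again
            }
        }
        return candidates.isEmpty() ? null : candidates.iterator().next();
    }

    public static void main(String[] args) {
        String[] input = {"Dog", "Cat", "Dog", "Wolf", "Lion", "Dog"};
        System.out.println(firstUnique(input)); // prints Cat
    }
}
```

Because both sets are hash-based, each step is expected constant time, so the whole pass is expected O(N) instead of O(N^2).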

Related

How can I get the count of most duplicated value in a list after sorting it alphabetically?

What is the easiest way to get the most duplicated value in a list and sorted in descending order...
for example:
List<String> list = new ArrayList<>(List.of("Renault","BMW","Renault","Renault","Toyota","Rexon","BMW","Opel","Rexon","Rexon"));
"Renault" and "Rexon" are the most duplicated, and if sorted alphabetically in descending order I would like to get Rexon.

I think one of the most readable and elegant way would be to use the Streams API
strings.stream()
.collect(Collectors.groupingBy(x -> x, Collectors.counting()))
.entrySet().stream()
.max(Comparator.comparingLong((ToLongFunction<Map.Entry<String, Long>>) Map.Entry::getValue).thenComparing(Map.Entry::getKey))
.map(Map.Entry::getKey)
.ifPresent(System.out::println);

1. Create a map of names with their corresponding number of occurrences.
2. Get names and sort them in descending order.
3. Print the first name that has the highest number of occurrences.
class Scratch {
public static void main(String[] args) {
List<String> list = List.of("Renault","BMW","Renault","Renault","Toyota","Rexon","BMW","Opel","Rexon","Rexon");
Map<String, Integer> duplicates = new HashMap<>();
// 1. Create a map of names with their corresponding
// number of occurrences.
for (String s: list) {
duplicates.merge(s, 1, Integer::sum);
}
// 2. Get names and sort them in descending order.
List<String> newList = new ArrayList<String>(duplicates.keySet());
newList.sort(Collections.reverseOrder());
// 3. Print the first name that has the highest number of
// occurrences.
Integer max = Collections.max(duplicates.values());
newList.stream().filter(name -> duplicates.get(name).equals(max))
.findFirst()
.ifPresent(System.out::println);
}
}

After some time, this is what I came up with (I only tested it with your example and it worked):
public class Duplicated {
public static String MostDuplicated(String[] a) {
int position = -1;
int maxDup = 0;
for(int i = 0; i < a.length; i++) { //for every position
int dup = 0; //reset the counter for each position
for(int j = 0; j < a.length; j++){ //compare it to all
if(a[i].equals(a[j])) { dup++; } // and count how many times it is duplicated
}
if (dup > maxDup) { maxDup = dup; position = i;}
//if the number of duplications
//is greater than the maximum you have got so far, save this position.
else if (dup == maxDup) {
if( a[i].compareTo(a[position]) > 0 ){ position = i; }
//if it's the same, keep the position of the alphabetically last one
// (if you want the alphabetically first, just change the ">" to "<")
}
}
return a[position]; //return the position you saved
}
}

You are asking to sort the list and then find the most common item.
I would suggest that the easiest way to sort the list is using the sort method that is built into list.
I would then suggest finding the most common by looping with the for..each construct, keeping track of the current and longest streaks.
I like Yassin Hajaj's answer with streams but I find this way easier to write and easier to read. Your mileage may vary, as this is subjective. :)
import java.util.*;
public class SortingAndMostCommonDemo {
public static void main(String[] args) {
List<String> list = new ArrayList<>(List.of("Renault","BMW","Renault","Renault","Toyota","Rexon","BMW","Opel","Rexon","Rexon"));
list.sort(Comparator.reverseOrder());
System.out.println(list);
System.out.println("The most common is " + mostCommon(list) + ".");
}
private static String mostCommon(List<String> list) {
String mostCommon = null;
int longestStreak = 0;
String previous = null;
int currentStreak = 0;
for (String s : list) {
currentStreak = 1 + (s.equals(previous) ? currentStreak : 0);
if (currentStreak > longestStreak) {
mostCommon = s;
longestStreak = currentStreak;
}
previous = s;
}
return mostCommon;
}
}

The fast algorithm takes advantage of the fact that the list is sorted and finds the element with the most duplicates in a single O(n) pass, with n being the size of the list. Since the list is sorted, the duplicates sit together in consecutive positions:
private static String getMostDuplicates(List<String> list) {
if(!list.isEmpty()) {
list.sort(Comparator.reverseOrder());
String prev = list.get(0);
String found_max = prev;
int max_dup = 1;
int curr_max_dup = 0;
for (String s : list) {
if (!s.equals(prev)) {
if (curr_max_dup > max_dup) {
max_dup = curr_max_dup;
found_max = prev;
}
curr_max_dup = 0;
}
curr_max_dup++;
prev = s;
}
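// the last run has not been compared yet, so check it after the loop
if (curr_max_dup > max_dup) {
found_max = prev;
}
return found_max;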
}
return "";
}
Explanation:
We iterate through the list and keep track of the maximum number of duplicates found so far and of the previous element. If the current element is the same as the previous one, we increment the number of duplicates found so far. Otherwise, we check whether that number is bigger than the previous maximum; if it is, we update accordingly.
A complete running example:
import java.util.*;
public class Duplicates {
private static String getMostDuplicates(List<String> list) {
if(!list.isEmpty()) {
list.sort(Comparator.reverseOrder());
String prev = list.get(0);
String found_max = prev;
int max_dup = 1;
int curr_max_dup = 0;
for (String s : list) {
if (!s.equals(prev)) {
if (curr_max_dup > max_dup) {
max_dup = curr_max_dup;
found_max = prev;
}
curr_max_dup = 0;
}
curr_max_dup++;
prev = s;
}
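// the last run has not been compared yet, so check it after the loop
if (curr_max_dup > max_dup) {
found_max = prev;
}
return found_max;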
}
return "";
}
public static void main(String[] args) {
List<String> list = new ArrayList<>(List.of("Renault","BMW","Renault","Renault","Toyota","Rexon","BMW","Opel","Rexon","Rexon"));
String duplicates = getMostDuplicates(list);
System.out.println("----- Test 1 -----");
System.out.println(duplicates);
list = new ArrayList<>(List.of("Renault","BMW"));
duplicates = getMostDuplicates(list);
System.out.println("----- Test 2 -----");
System.out.println(duplicates);
list = new ArrayList<>(List.of("Renault"));
duplicates = getMostDuplicates(list);
System.out.println("----- Test 3 -----");
System.out.println(duplicates);
}
}
Output:
----- Test 1 -----
Rexon
----- Test 2 -----
Renault
----- Test 3 -----
Renault

Actually, I found a solution which works:
public static void main(String[] args) {
List<String> list = new ArrayList<>(List.of("Renault", "BMW", "BMW", "Renault", "Renault", "Toyota",
"Rexon", "BMW", "Opel", "Rexon", "Rexon"));
Map<String, Integer> soldProducts = new HashMap<>();
for (String s : list) {
soldProducts.put(s, soldProducts.getOrDefault(s, 0) + 1);
}
LinkedHashMap<String, Integer> sortedMap = soldProducts.entrySet()
.stream()
.sorted(VALUE_COMPARATOR.thenComparing(KEY_COMPARATOR_REVERSED))
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e2, LinkedHashMap::new));
String result = "";
for (Map.Entry<String, Integer> s : sortedMap.entrySet()) {
result = s.getKey();
}
System.out.println(result);
}
static final Comparator<Map.Entry<String, Integer>> KEY_COMPARATOR_REVERSED =
Map.Entry.comparingByKey(Comparator.naturalOrder());
static final Comparator<Map.Entry<String, Integer>> VALUE_COMPARATOR =
Map.Entry.comparingByValue();

Grouping elements from lists into sub lists without duplicates in Java

I am working on 'Grouping Anagrams'.
Problem statement: Given an array of strings, group anagrams together.
I could group the anagrams but I am not able to avoid the ones which are already grouped. I want to avoid duplicates. An element can only belong to one group. In my code, an element belongs to multiple groups.
Here is my code:
public class GroupAnagrams1 {
public static void main(String[] args) {
String[] input = {"eat", "tea", "tan", "ate", "nat", "bat"};
List<List<String>> result = groupAnagrams(input);
for(List<String> s: result) {
System.out.println(" group: ");
for(String x:s) {
System.out.println(x);
}
}
}
public static List<List<String>> groupAnagrams(String[] strs) {
List<List<String>> result = new ArrayList<List<String>>();
for(int i =0; i < strs.length; i++) {
Set<String> group = new HashSet<String>();
for(int j= i+1; j < strs.length; j++) {
if(areAnagrams(strs[i], strs[j])) {
group.add(strs[i]);
group.add(strs[j]);
}
}
if(group.size() > 0) {
List<String> aList = new ArrayList<String>(group);
result.add(aList);
}
}
return result;
}
Here comes the method to check if two string are anagrams.
private static boolean areAnagrams(String str1, String str2) {
char[] a = str1.toCharArray();
char[] b = str2.toCharArray();
int[] count1 = new int[256];
Arrays.fill(count1, 0);
int[] count2 = new int[256];
Arrays.fill(count2, 0);
for(int i = 0; i < a.length && i < b.length; i++) {
count1[a[i]]++;
count2[b[i]]++;
}
if(str1.length() != str2.length())
return false;
for(int k=0; k < 256; k++) {
if(count1[k] != count2[k])
return false;
}
return true;
}
}
expected output:
group:
tea
ate
eat
group:
bat
group:
tan
nat
actual output:
group:
tea
ate
eat
group:
tea
ate
group:
tan
nat
The order in which the groups are displayed does not matter. The way it is displayed does not matter.
Preference: Please feel free to submit solutions using HashMaps but I prefer to see solutions without using HashMaps and using Java8

I would also recommend using Java Streams for this. Because you don't want that, here is another solution:
public static List<List<String>> groupAnagrams(String[] strs) {
List<List<String>> result = new ArrayList<>();
for (String str : strs) {
boolean added = false;
for (List<String> r : result) {
if (areAnagrams(str, r.get(0))) {
r.add(str);
added = true;
break;
}
}
if (!added) {
List<String> aList = new ArrayList<>();
aList.add(str);
result.add(aList);
}
}
return result;
}
The problem in your solution is that you move one step ahead in each iteration, so you generate the incomplete group ["tea", "ate"] and never produce ["bat"].
My solution uses a different approach: it checks whether there is already a group whose first word is an anagram of the current word. If not, it creates a new group and moves on.
Because I would use Java Streams, as I said at the beginning, here is my initial solution using a stream:
List<List<String>> result = new ArrayList<>(Arrays.stream(words)
.collect(Collectors.groupingBy(w -> Stream.of(w.split("")).sorted().collect(Collectors.joining()))).values());
To generate the sorted string keys to group the anagrams you can look here for more solutions.
The result of both solutions I provided will be this:
[[eat, tea, ate], [bat], [tan, nat]]

I would have taken a slightly different approach using streams:
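import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map.Entry;
import java.util.stream.Collectors;
public class Scratch {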
public static void main(String[] args) {
String[] input = { "eat", "tea", "tan", "ate", "nat", "bat" };
List<List<String>> result = groupAnagrams(input);
System.out.println(result);
}
private static List<List<String>> groupAnagrams(String[] input) {
return Arrays.asList(input)
// create a list that wraps the array
.stream()
// stream that list
.map(Scratch::sortedToOriginalEntryFor)
// map each string we encounter to an entry containing
// its sorted characters to the original string
.collect(Collectors.groupingBy(Entry::getKey, Collectors.mapping(Entry::getValue, Collectors.toList())))
// create a map whose key is the sorted characters and whose
// value is a list of original strings that share the sorted
// characters: Map<String, List<String>>
.values()
// get all the values (the lists of grouped strings)
.stream()
// stream them
.collect(Collectors.toList());
// convert to a List<List<String>> per your req
}
// create an Entry whose key is a string of the sorted characters of original
// and whose value is original
private static Entry<String, String> sortedToOriginalEntryFor(String original) {
char c[] = original.toCharArray();
Arrays.sort(c);
String sorted = new String(c);
return new SimpleEntry<>(sorted, original);
}
}
This yields:
[[eat, tea, ate], [bat], [tan, nat]]
If you want to eliminate repeated strings (e.g. if "bat" appears twice in your input) then you can call toSet() instead of toList() in your Collectors.groupingBy call, and change the return type as appropriate.
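The toSet() variant mentioned above could look roughly like the following sketch. The class name AnagramSetSketch and the helper sortedKey are illustrative names, and the return type changes to List<Set<String>> as the answer suggests.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

final class AnagramSetSketch {

    // The sorted characters of a word serve as the grouping key.
    static String sortedKey(String word) {
        char[] chars = word.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Groups anagrams; each group is a Set, so a repeated input string is kept once.
    static List<Set<String>> groupAnagrams(String[] input) {
        return new ArrayList<>(Arrays.stream(input)
                .collect(Collectors.groupingBy(AnagramSetSketch::sortedKey, Collectors.toSet()))
                .values());
    }

    public static void main(String[] args) {
        // "bat" appears twice in the input but only once in its group
        String[] input = {"eat", "tea", "tan", "ate", "nat", "bat", "bat"};
        System.out.println(groupAnagrams(input));
    }
}
```

The order of the groups, and of the strings inside each set, is not guaranteed with this collector.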

Get each character from each string by column

I've been trying to get each character from every String by column, but I only get the first character of each string. I want to get every character, column by column, from every string.
For example:
I have three strings from ArrayList of Strings:
chi
llo
ut
What I want to happen must be like this, after getting each character by column from strings:
clu
hlt
io
So far, my current source code only gets the first characters of the first two strings, which is 'cl'. Here's my current source code:
List<String> New_Strings = new ArrayList<String>();
int Column_Place = 0;
for (String temp_str : Strings) {
try{ //For StringIndexOutOfBoundsException (handle last String)
if(Column_Place >= temp_str.length()){
Current_Character = temp_str.charAt(Column_Place);
New_Strings.add(Character.toString(Current_Character));
break;
}else if (Column_Place < temp_str.length()){
Current_Character = temp_str.charAt(Column_Place);
New_Strings.add(Character.toString(Current_Character));
}
}catch(Exception e){
continue;
}
Column_Place++;
}

You're adding string representations of the individual characters to the result list. Instead, you should accumulate the characters of each column into one result string. E.g.:
// the number of columns is the length of the longest string
int maxLength = 0;
for (String s : strings) {
maxLength = Math.max(maxLength, s.length());
}
List<String> result = new ArrayList<>(maxLength);
for (int i = 0; i < maxLength; ++i) {
StringBuilder sb = new StringBuilder();
for (String s : strings) {
if (i < s.length()) { // skip strings that are too short for this column
sb.append(s.charAt(i));
}
}
result.add(sb.toString());
}

Just call groupByColumn(Arrays.asList("chi", "llo", "ut")):
public static List<String> groupByColumn(List<String> words) {
if (words == null || words.isEmpty()) {
return Collections.emptyList();
}
return IntStream.range(0, longestWordLength(words))
.mapToObj(ind -> extractColumn(words, ind))
.collect(Collectors.toList());
}
public static String extractColumn(List<String> words, int columnInd) {
return words.stream()
.filter(word -> word.length() > columnInd)
.map(word -> String.valueOf(word.charAt(columnInd)))
.collect(Collectors.joining(""));
}
public static int longestWordLength(List<String> words) {
String longestWord = Collections.max(words, Comparator.comparing(String::length));
return longestWord.length();
}

You iterate over the List with an enhanced for (foreach) loop, so you visit each String only once. That is why only the first letters are handled.
With such an approach you should use a while loop with while(Column_Place < Strings.size()) as its condition.
As an alternative, you could do things in two distinct steps and use Java 8 features.
Note that in Java, variable names start with a lowercase letter. Please follow the conventions to make your code more readable and understandable.
In Java 8 you could do:
List<String> strings = new ArrayList<>(Arrays.asList("chi", "llo", "ut"));
int maxColumn = strings.stream()
.mapToInt(String::length)
.max()
.getAsInt(); // suppose that you have at least one element in the List
List<String> values =
// stream from 0 the max number of column
IntStream.range(0, maxColumn)
// for each column index : create the string by joining their
// String value or "" if index out of bound
.mapToObj(i -> strings.stream()
.map(s -> i < s.length() ? String.valueOf(
s.charAt(i)) : "")
.collect(Collectors.joining()))
.collect(Collectors.toList());

Just think of the list as a two-dimensional array. Take each item from the list and get the j-th character from it, if and only if the item's length is greater than the index j.
ArrayList<String> list = new ArrayList<String>();
list.add("chi");
list.add("llo");
list.add("ut");
int size = list.size();
int i=0, j=0,k=0;
while(size-- > 0){
for(i=0; i<list.size(); i++){
String temp = list.get(i);
if(j < temp.length()){
System.out.print(temp.charAt(j));
}
}
j++;
System.out.println();
}

Best way of scanning for letter combinations

So let's say I have a 32 character string like this:
GCAAAGCTTGGCACACGTCAAGAGTTGACTTT
My goal is to count all occurrences of specific substrings, such as 'AA' 'ATT' 'CGG' and so on. For this purpose, the 3rd through 5th characters above contain 2 occurrences of 'AA'. There are a total of 8 of these substrings, 6 that are 3 characters in length and 2 that are 2 characters in length, and I would want counts for all eight.
What would be the most efficient way of doing this in Java? My thoughts follow a couple lines:
1. Scan through character by character, checking and flagging for each substring. This seems intensive and inefficient.
2. Find some existing function that would do the work (not sure of efficiency of what function it would be, String.contains is a boolean, not a count).
3. Scan through the string multiple times, each sweep checking for a different substring.
The implementation of 3 is trivial, but 1 might give a few extra headaches and won't be very clean code.

I think this should answer your question.
The naive approach (checking for substring at each possible index)
runs in O(nk) where n is the length of the string and k is the length
of the substring. This could be implemented with a for-loop, and
something like haystack.substring(i).startsWith(needle).
More efficient algorithms exist though. You may want to have a look at
the Knuth-Morris-Pratt algorithm, or the Aho-Corasick algorithm. As
opposed to the naive approach, both of these algorithms behave well
also on input like "look for the substring of 100 'X' in a string of
10000 'X's".
Taken from stackoverflow.com/questions/4121875/count-of-substrings-within-string
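As a rough illustration of the naive approach quoted above, here is a minimal sketch that counts overlapping occurrences with String.indexOf. It is not an implementation of Knuth-Morris-Pratt or Aho-Corasick. The class name SubstringCountSketch and the decision to return 0 for an empty pattern are assumptions made for this example.

```java
final class SubstringCountSketch {

    // Counts occurrences of needle in haystack, overlapping ones included.
    static int countOccurrences(String haystack, String needle) {
        if (needle.isEmpty()) {
            return 0; // assumption: an empty pattern is treated as "no matches"
        }
        int count = 0;
        int from = haystack.indexOf(needle);
        while (from >= 0) {
            count++;
            // Advance by one character only, so overlapping matches are found.
            from = haystack.indexOf(needle, from + 1);
        }
        return count;
    }

    public static void main(String[] args) {
        String dna = "GCAAAGCTTGGCACACGTCAAGAGTTGACTTT";
        for (String pattern : new String[] {"AA", "ATT", "CGG"}) {
            System.out.println(pattern + ": " + countOccurrences(dna, pattern));
        }
    }
}
```

Calling it once per pattern corresponds to option 3 in the question: one sweep over the string for each substring.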

One approach is to essentially code up an NFA (http://en.wikipedia.org/wiki/Nondeterministic_finite_automaton)
and just run your input on the NFA.
Here's my attempt at coding an NFA. You'd probably want to convert to a DFA first before running it so that you don't have to manage a bunch of branches. With the branches it's basically as slow as O(nk), whereas if you convert to a DFA it would be O(n).
import java.util.*;
public class Test
{
public static void main (String[] args)
{
new Test();
}
private static final String input = "TAAATGGAGGTAATAGAGGAGGTGTAT";
private static final String[] substrings = new String[] { "AA", "AG", "GG", "GAG", "TA" };
private static final int[] occurrences = new int[substrings.length];
public Test()
{
ArrayList<Branch> branches = new ArrayList<Branch>();
// For each character, read it, create branches for each substring, and pass the current character
// to each active branch
for (int i = 0; i < input.length(); i++)
{
char c = input.charAt(i);
// Make a new branch, one for each substring that we are searching for
for (int j = 0; j < substrings.length; j++)
branches.add(new Branch(substrings[j], j, branches));
// Pass the current input character to each branch that is still alive
// Iterate in reverse order because the nextCharacter method may
// cause the branch to be removed from the ArrayList
for (int j = branches.size()-1; j >= 0; j--)
branches.get(j).nextCharacter(c);
}
for (int i = 0; i < occurrences.length; i++)
System.out.println(substrings[i]+": "+occurrences[i]);
}
private static class Branch
{
private String searchFor;
private int position, index;
private ArrayList<Branch> parent;
public Branch(String searchFor, int searchForIndex, ArrayList<Branch> parent)
{
this.parent = parent;
this.searchFor = searchFor;
this.position = 0;
this.index = searchForIndex;
}
public void nextCharacter(char c)
{
// If the current character matches the ith character of the string we are searching for,
// Then this branch will stay alive
if (c == searchFor.charAt(position))
position++;
// Otherwise the substring didn't match, so this branch dies
else
suicide();
// Reached the end of the substring, so the substring was found.
if (position == searchFor.length())
{
occurrences[index] += 1;
suicide();
}
}
private void suicide()
{
parent.remove(this);
}
}
}
output for this example is
AA: 3
AG: 4
GG: 4
GAG: 3
TA: 4

Do you want to find all possible substrings that are longer than 1 character?
In that case one approach is to use HashMaps.
This example outputs:
{AA=3, TT=4, AC=3, CTT=2, CAA=2, GCA=2, CAC=2, AG=3, TTG=2, AAG=2, GT=2, CT=2, TG=2, GA=2, GC=3, CA=4}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class Test {
public static void main(String[] args) {
String str = "GCAAAGCTTGGCACACGTCAAGAGTTGACTTT";
HashMap<String, Integer> map = countMatches(str);
System.out.println(map);
}
private static HashMap<String, List<Integer>> findOneLetterMatches(String str) {
ArrayList<Integer> list = new ArrayList<>();
for(int i = 0; i < str.length(); i++) list.add(i);
return extendMatches(str, list, 1);
}
private static HashMap<String, List<Integer>> extendMatches(String str, List<Integer> indices, int targetLength) {
HashMap<String, List<Integer>> map = new HashMap<>();
for(int index: indices) {
if(index+targetLength <= str.length()) {
String s = str.substring(index, index + targetLength);
List<Integer> list = map.get(s);
if(list == null) {
list = new ArrayList<>();
map.put(s, list);
}
list.add(index);
}
}
return map;
}
private static void addIfListLongerThanOne(HashMap<String, List<Integer>> source,
HashMap<String, List<Integer>> target) {
for(Map.Entry<String, List<Integer>> e: source.entrySet()) {
String s = e.getKey();
List<Integer> l = e.getValue();
if(l.size() > 1) target.put(s, l);
}
}
private static HashMap<String, List<Integer>> extendAllMatches(String str, HashMap<String, List<Integer>> map, int targetLength) {
HashMap<String, List<Integer>> result = new HashMap<>();
for(List<Integer> list: map.values()) {
HashMap<String, List<Integer>> m = extendMatches(str, list, targetLength);
addIfListLongerThanOne(m, result);
}
return result;
}
private static HashMap<String, Integer> countMatches(String str) {
HashMap<String, Integer> result = new HashMap<>();
HashMap<String, List<Integer>> matches = findOneLetterMatches(str);
for(int targetLength = 2; !matches.isEmpty(); targetLength++) {
HashMap<String, List<Integer>> m = extendAllMatches(str, matches, targetLength);
for(Map.Entry<String, List<Integer>> e: m.entrySet()) {
String s = e.getKey();
List<Integer> l = e.getValue();
result.put(s, l.size());
}
matches = m;
}
return result;
}
}

Find the most common String in ArrayList()

Is there a way to find the most common String in an ArrayList?
ArrayList<String> list = new ArrayList<>();
list.add("test");
list.add("test");
list.add("hello");
list.add("test");
Should find the word "test" from this list ["test","test","hello","test"]

Don't reinvent the wheel and use the frequency method of the Collections class:
public static int frequency(Collection<?> c, Object o)
Returns the number of elements in the specified collection equal to
the specified object. More formally, returns the number of elements e
in the collection such that (o == null ? e == null : o.equals(e)).
If you need to count the occurrences for all elements, use a Map and loop cleverly :)
Or put your list in a Set and loop on each element of the set with the frequency method above. HTH
EDIT / Java 8: If you fancy a more functional, Java 8 one-liner solution with lambdas, try:
Map<String, Long> occurrences =
list.stream().collect(Collectors.groupingBy(w -> w, Collectors.counting()));
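The Set-plus-frequency idea mentioned above can be sketched as follows. The class name FrequencySketch, the use of a LinkedHashSet (so that ties go to the element seen first) and the null return for an empty list are choices made for this example, not part of the answer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;

final class FrequencySketch {

    // Returns the most frequent element, or null for an empty list.
    // Ties go to the element that appears first in the list.
    static String mostCommon(List<String> list) {
        String best = null;
        int bestCount = 0;
        // LinkedHashSet drops duplicates but keeps first-seen order.
        for (String candidate : new LinkedHashSet<>(list)) {
            int count = Collections.frequency(list, candidate);
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("test", "test", "hello", "test");
        System.out.println(mostCommon(list)); // prints test
    }
}
```

Each frequency call scans the whole list, so this costs roughly (list size) times (number of distinct elements); the map-based counting is cheaper for large lists.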

In statistics, this is called the "mode". A vanilla Java 8 solution looks like this:
Stream.of("test","test","hello","test")
.collect(Collectors.groupingBy(s -> s, Collectors.counting()))
.entrySet()
.stream()
.max(Comparator.comparing(Entry::getValue))
.ifPresent(System.out::println);
Which yields:
test=3
jOOλ is a library that supports mode() on streams. The following program:
System.out.println(
Seq.of("test","test","hello","test")
.mode()
);
Yields:
Optional[test]
(disclaimer: I work for the company behind jOOλ)

As per the question, specifically to get just the word and not the number of times it occurs (i.e. the key, not the value):
String mostRepeatedWord
= list.stream()
.collect(Collectors.groupingBy(w -> w, Collectors.counting()))
.entrySet()
.stream()
.max(Comparator.comparing(Entry::getValue))
.get()
.getKey();

You can make a HashMap<String,Integer>. If the String already appears in the map, increment its value by one; otherwise, add it to the map.
For example:
put("someValue", 1);
Then, assume it's "someValue" again, you can do:
put("someValue", get("someValue") + 1);
Since the value of "someValue" is 1, now when you put it, the value will be 2.
After that you can easily go through the map and extract the key that has the highest value.
I didn't write a full solution, try to construct one, if you have problems post it in another question. Best practice is to learn by yourself.
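For readers who want to check their own attempt against something, here is one possible sketch of the counting map described above. It uses Map.merge in place of the explicit get and put calls; the class name CountMapSketch and both method names are illustrative only.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CountMapSketch {

    // Builds the word -> count map described above.
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // merge stores 1 for a new key, otherwise adds 1 to the stored value
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Returns the key with the highest count; assumes words is not empty.
    static String mostCommon(List<String> words) {
        Map<String, Integer> counts = countWords(words);
        return Collections.max(counts.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("test", "test", "hello", "test");
        System.out.println(countWords(list).get("test")); // prints 3
        System.out.println(mostCommon(list));              // prints test
    }
}
```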

I think the best way to do it is using maps containing counts.
Map<String, Integer> stringsCount = new HashMap<>();
And iterate over your array filling this map:
for(String s: list)
{
Integer c = stringsCount.get(s);
if(c == null) c = 0; // autoboxing; the Integer constructor is deprecated
c++;
stringsCount.put(s,c);
}
Finally, you can get the most repeated element iterating over the map:
Map.Entry<String,Integer> mostRepeated = null;
for(Map.Entry<String, Integer> e: stringsCount.entrySet())
{
if(mostRepeated == null || mostRepeated.getValue()<e.getValue())
mostRepeated = e;
}
And show the most common string:
if(mostRepeated != null)
System.out.println("Most common string: " + mostRepeated.getKey());

You could use a HashMap<String,Integer>. Looping through the array, you can check for each String if it is not already a Key of your HashMap, add it and set the value to 1, if it is, increase its value by 1.
Then you have a HashMap with all unique Strings and an associated number stating their amount in the array.

If somebody needs to find the most popular value in a plain String[] array (using Lists):
public String findPopular (String[] array) {
List<String> list = Arrays.asList(array);
Map<String, Integer> stringsCount = new HashMap<String, Integer>();
for(String string: list)
{
if (string.length() > 0) {
string = string.toLowerCase();
Integer count = stringsCount.get(string);
if(count == null) count = 0; // autoboxing; the Integer constructor is deprecated
count++;
stringsCount.put(string,count);
}
}
Map.Entry<String,Integer> mostRepeated = null;
for(Map.Entry<String, Integer> e: stringsCount.entrySet())
{
if(mostRepeated == null || mostRepeated.getValue()<e.getValue())
mostRepeated = e;
}
try {
return mostRepeated.getKey();
} catch (NullPointerException e) {
System.out.println("Cannot find most popular value at the List. Maybe all strings are empty");
return "";
}
}
This version is case-insensitive.

I know this takes more time to implement, but you can use a heap data structure, storing the count and the string in each node.
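One way to read the heap suggestion is to count the strings first and then push the (string, count) pairs into a max-heap. The sketch below does that with java.util.PriorityQueue; the class name HeapSketch and the null return for an empty list are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class HeapSketch {

    // Returns the most frequent string, or null for an empty list.
    static String mostCommon(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }
        // Max-heap: the entry with the largest count sits at the head.
        Comparator<Map.Entry<String, Integer>> byCount = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(byCount.reversed());
        heap.addAll(counts.entrySet());
        Map.Entry<String, Integer> top = heap.peek(); // null when the heap is empty
        return top == null ? null : top.getKey();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("test", "test", "hello", "test");
        System.out.println(mostCommon(list)); // prints test
    }
}
```

For a single maximum the heap adds little over one pass through the map; it becomes more useful when the k most common strings are needed, by polling the heap k times.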

You can use Guava's Multiset:
ArrayList<String> names = ...
// count names
HashMultiset<String> namesCounts = HashMultiset.create(names);
Set<Multiset.Entry<String>> namesAndCounts = namesCounts.entrySet();
// find one most common
Multiset.Entry<String> maxNameByCount = Collections.max(namesAndCounts, Comparator.comparing(Multiset.Entry::getCount));
// pick all with the same number of occurrences
List<String> mostCommonNames = new ArrayList<>();
for (Multiset.Entry<String> nameAndCount : namesAndCounts) {
if (nameAndCount.getCount() == maxNameByCount.getCount()) {
mostCommonNames.add(nameAndCount.getElement());
}
}

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
public class StringChecker {
public static void main(String[] args) {
ArrayList<String> string;
string = new ArrayList<>(Arrays.asList("Mah", "Bob", "mah", "bat", "MAh", "BOb"));
Map<String, Integer> wordMap = new HashMap<String, Integer>();
for (String st : string) {
String input = st.toUpperCase();
if (wordMap.get(input) != null) {
Integer count = wordMap.get(input) + 1;
wordMap.put(input, count);
} else {
wordMap.put(input, 1);
}
}
System.out.println(wordMap);
Object maxEntry = Collections.max(wordMap.entrySet(), Map.Entry.comparingByValue()).getKey();
System.out.println("maxEntry = " + maxEntry);
}
}

With this method, if there is more than one most common element in your ArrayList, you get all of them back, collected in a new ArrayList.
public static void main(String[] args) {
List <String> words = new ArrayList<>() ;
words.add("cat") ;
words.add("dog") ;
words.add("egg") ;
words.add("chair") ;
words.add("chair") ;
words.add("chair") ;
words.add("dog") ;
words.add("dog") ;
Map<String,Integer> count = new HashMap<>() ;
for (String word : words) { /* Counts the quantity of each
element */
if (! count.containsKey(word)) {
count.put(word, 1 ) ;
}
else {
int value = count.get(word) ;
value++ ;
count.put(word, value) ;
}
}
List <String> mostCommons = new ArrayList<>() ; /* Max elements */
int max = Collections.max(count.values()) ; /* The max value of count, computed once */
for ( Map.Entry<String,Integer> e : count.entrySet() ) {
if (e.getValue() == max) { /* compares int values, not Integer references */
mostCommons.add(e.getKey()) ;
}
}
System.out.println(mostCommons);
}

There are a lot of answers suggesting HashMaps. I really don't like them, because you have to iterate through them once again anyway. Rather, I would sort the List
Collections.sort(list);
and then loop through it. Something similar to
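String prev = null, mostCommon = null;
int num = 0, max = 0;
for (String str : list) {
if (str.equals(prev)) {
num++;
} else {
if (num > max) {
max = num;
mostCommon = prev; // the run that just ended belongs to prev
}
num = 1;
prev = str;
}
}
// the last run is never followed by a different string, so check it here
if (num > max) {
mostCommon = prev;
}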
should do it.
