Get values in TreeMap whose String keys start with a pattern - java

I have a TreeMap with Strings as keys. I want to get all the values whose keys start with the String search.
I think what I need to do here is something like:
myTreeMap.subMap(search.concat(X1), true, search.concat(X2), true);
where X1 and X2 are the lowest and highest possible characters.
Is there a better approach? If not, what are X1 and X2?
Thanks in advance.

Basically, you need the lexicographically next prefix as the second boundary:
public <T> Map<String, T> subMapWithKeysThatAreSuffixes(String prefix, NavigableMap<String, T> map) {
if ("".equals(prefix)) return map;
String lastKey = createLexicographicallyNextStringOfTheSameLenght(prefix);
return map.subMap(prefix, true, lastKey, false);
}
String createLexicographicallyNextStringOfTheSameLenght(String input) {
final int lastCharPosition = input.length()-1;
String inputWithoutLastChar = input.substring(0, lastCharPosition);
char lastChar = input.charAt(lastCharPosition) ;
char incrementedLastChar = (char) (lastChar + 1);
return inputWithoutLastChar+incrementedLastChar;
}

Hmmmm. I would say you should do myTreeMap.subMap(search, true, search2, false) where search2 isn't concatenated, but is instead "incremented". After all, if X2 was just a character, then your implementation would miss search.concat(X2).concat(X2).
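The "incremented" upper bound can be checked with a small self-contained sketch. The class name PrefixSearchDemo, the helper withPrefix and the sample keys are made up for illustration, and the sketch assumes a TreeMap with natural String ordering (no custom Comparator). Unlike the code in the answers here, it skips trailing '\uffff' characters before incrementing, which is one possible way to avoid the char overflow.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class PrefixSearchDemo {

    // Hypothetical helper: a view of all entries whose key starts with the prefix.
    // Assumes the map uses natural String ordering.
    static <T> NavigableMap<String, T> withPrefix(NavigableMap<String, T> map, String prefix) {
        // Trailing '\uffff' characters cannot be incremented, so skip them.
        int end = prefix.length();
        while (end > 0 && prefix.charAt(end - 1) == Character.MAX_VALUE) {
            end--;
        }
        if (end == 0) {
            // Empty prefix, or only '\uffff' characters: there is no finite upper bound.
            return map.tailMap(prefix, true);
        }
        // Increment the last incrementable character to get an exclusive upper bound.
        char bumped = (char) (prefix.charAt(end - 1) + 1);
        String upper = prefix.substring(0, end - 1) + bumped;
        return map.subMap(prefix, true, upper, false);
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> map = new TreeMap<>();
        map.put("apple", 1);
        map.put("applet", 2);
        map.put("application", 3);
        map.put("apricot", 4);
        map.put("banana", 5);
        // The upper bound for "app" is "apq", so "apricot" is excluded.
        System.out.println(withPrefix(map, "app")); // {apple=1, applet=2, application=3}
    }
}
```

The returned map is a live view backed by the original map, so no entries are copied.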

The problem is the partial key search you're trying to do.
myTreeMap.subMap(search.concat(X1), true, search.concat(X2), true);
Let's assume you have some key/value pairs:
fooBar -> Some value
fooBage -> Some other value
barBear -> Running out of value ideas
barTender -> Another value
Now you want to find all "foo*", in this example fooBar and fooBage. The key is treated as a single token, that happens to be a string in this case. There is no way to treat the key as a partial key. Even saying you want "fooA" through "fooZ" won't get you fooBar, or fooBage.
If you make the key class (I'll call it FractionalKey), and override the equals method, then you can define equals as "some regex", or "either the entire thing, or just the first part" etc. The problem with this is that if equals returns true, then the hash codes must also be equal, and this would break that rule, I think.
I think this is your only option, other than searching the list of keys for the one you want.

Since my edit to the answer above was rejected for being too original, I'll post it here. This answer fixes typos and handles int overflow which the original did not.
public <T> Map<String, T> subMapWithKeysThatAreSuffixes(String prefix, NavigableMap<String, T> map) {
if ("".equals(prefix)) return map;
String lastKey = createLexicographicallyNextStringOfTheSameLength(prefix);
return map.subMap(prefix, true, lastKey, false);
}
String createLexicographicallyNextStringOfTheSameLength(String input) {
final int lastCharPosition = input.length()-1;
String inputWithoutLastChar = input.substring(0, lastCharPosition);
char lastChar = input.charAt(lastCharPosition);
char incrementedLastChar = (char) (lastChar + 1);
// Handle int/char overflow. This wasn't done above.
if (incrementedLastChar == ((char) 0)) return input+incrementedLastChar;
return inputWithoutLastChar+incrementedLastChar;
}

Related

Finding the first Non-repeating Character in the given string, not able to pass a few test cases due to Timeout

I'm working on a Problem from CodeSignal:
Given a String s consisting of the alphabet only, return the first
non-repeated element. Otherwise, return '_'.
Example: input -
s="abacabad", output - 'c'.
I came up with the following code. It passes only 16/19 test cases. Is there a way to solve this problem in O(n) or O(1)?
My code:
public char solution(String s) {
ArrayList<Character> hs = new ArrayList<>();
for (char c:s.toCharArray()) {
hs.add(c);
}
for (int j=0; j<s.length(); j++) {
if ( 1 == Collections.frequency(hs, s.charAt(j))) {
return s.charAt(j);
}
}
return '_';
}
The minimal possible time complexity for this task is linear O(n), because we need to examine every character in the given string to find out whether a particular character is unique.
Your current solution runs in O(n^2): Collections.frequency() iterates over all the characters in the list, and it is called once for every character. That's basically a brute-force implementation.
We can generate a map Map<Character,Boolean>, which associates each character with a boolean value denoting whether it's repeated or not.
That avoids iterating over the given string multiple times.
Then we need to iterate over the entries of the map to find the first non-repeated character. LinkedHashMap is used as the Map implementation to ensure that the returned non-repeated character is the first one encountered in the given string.
To update the Map I've used the Java 8 method merge(), which expects three arguments: a key, a value, and a function responsible for merging the old value and the new one.
public char solution(String s) {
Map<Character, Boolean> isNonRepeated = getMap(s);
for (Map.Entry<Character, Boolean> entry: isNonRepeated.entrySet()) {
if (entry.getValue()) {
return entry.getKey();
}
}
return '_';
}
public Map<Character, Boolean> getMap(String s) {
Map<Character, Boolean> isNonRepeated = new LinkedHashMap<>();
for (int i = 0; i < s.length(); i++) {
isNonRepeated.merge(s.charAt(i), true, (v1, v2) -> false);
}
return isNonRepeated;
}
If you're comfortable with streams, this problem can be addressed in one statement (the algorithm remains the same and the time complexity is linear as well):
public char solution(String s) {
return s.chars()
.mapToObj(c -> (char) c)
.collect(Collectors.toMap( // creates intermediate Map<Character, Boolean>
Function.identity(), // key
c -> true, // value - first occurrence, character is considered to be non-repeated
(v1, v2) -> false, // resolving values, character is proved to be a duplicate
LinkedHashMap::new
))
.entrySet().stream()
.filter(Map.Entry::getValue)
.findFirst()
.map(Map.Entry::getKey)
.orElse('_');
}
Here is a slightly different approach using both a Set to account for duplicates, and a Queue to hold candidates before a possible duplicate is discovered.
Iterate over the characters of the string.
Try to add the character to the seen set. If it was not already there, also add it to the candidates queue.
Otherwise, if it has already been "seen", remove it from the candidates queue.
By the time this gets done, the head of the queue should contain the first, non-repeating character. If the queue is empty, return the default value as no unique character was found.
public char solution(String s) {
Queue<Character> candidates = new LinkedList<>();
Set<Character> seen = new HashSet<>();
for (char c : s.toCharArray()) {
if (seen.add(c)) {
candidates.add(c);
} else {
candidates.remove(c);
}
}
return candidates.isEmpty() ? '_' : candidates.peek();
}
I have done pretty extensive testing of this and it has yet to fail. It is also comparatively very efficient. But as can happen, I may have overlooked something.
One technique would be a 2 pass solution using a frequency/count array for each character.
public static char firstNonRepeatingChar(String s) {
int[] frequency = new int[26]; // this is O(1) space complexity because alphabet is finite of 26 letters
/* First Pass - Fill our frequency array */
for(int i = 0; i < s.length(); i++) {
frequency[s.charAt(i) - 'a']++;
}
/* Second Pass - Look up our frequency array */
for(int i = 0; i < s.length(); i++) {
if(frequency[s.charAt(i) - 'a'] == 1) {
return s.charAt(i);
}
}
/* Not Found */
return '_';
}
This solution runs in O(2n) -> O(n) time with O(1) space complexity, because we are using a finite alphabet (the 26 lowercase English letters). It wouldn't work for other alphabets.
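If the input is not limited to 'a' through 'z', one possible variation is to keep one counter per char value. This is only a sketch: the class name FirstUniqueChar is made up, and it counts UTF-16 code units, so characters outside the Basic Multilingual Plane (surrogate pairs) are not treated as single characters.

```java
public class FirstUniqueChar {

    // Same two-pass idea, but with one counter for every possible char value.
    static char firstNonRepeatingChar(String s) {
        // 65,536 counters: larger than 26, but still a constant amount of space.
        int[] frequency = new int[Character.MAX_VALUE + 1];

        // First pass: count every char.
        for (int i = 0; i < s.length(); i++) {
            frequency[s.charAt(i)]++;
        }
        // Second pass: return the first char that was counted exactly once.
        for (int i = 0; i < s.length(); i++) {
            if (frequency[s.charAt(i)] == 1) {
                return s.charAt(i);
            }
        }
        return '_'; // not found
    }

    public static void main(String[] args) {
        System.out.println(firstNonRepeatingChar("abacabad")); // c
        System.out.println(firstNonRepeatingChar("aabb"));     // _
    }
}
```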

Can I keep duplicate keys in a Map?

Why am I able to keep what look like duplicate keys in a Map?
I had heard that a Map can't contain duplicate keys.
import java.util.LinkedHashMap;
import java.util.HashMap;
class LinkedHasMapDemo
{
@SuppressWarnings("unchecked")
public static void main(String[] args)
{
LinkedHashMap l = new LinkedHashMap();
//{116=kumar, 116=kumar, kumar=kumar, 117=Ram charan, 105=Yash}
//HashMap l = new HashMap();
//{116=kumar, 117=Ram charan, 116=kumar, kumar=kumar, 105=Yash}
l.put("116","kumar"); //key is String Object
l.put(116,"kumar"); //key is Integer Object
l.put("kumar","kumar");
l.put(117,"Ram charan");
l.put(105,"Yash");
System.out.println(l);
}
}
But in this example I am able to keep duplicate keys in both the LinkedHashMap and the HashMap.
You are right, a Map does not hold duplicate keys (this only applies to keys, values can be equal). If you put a value under an already added key the previous value will be overridden. Therefore consider the following example:
HashMap<String, Integer> map = new HashMap<>();
map.put("key", 1);
System.out.println(map.get("key")); // Outputs 1
System.out.println(map.size()); // Outputs 1
map.put("key", 2);
System.out.println(map.get("key")); // Outputs 2
System.out.println(map.size()); // Still outputs 1
The problem with your counter-example is that you actually don't have duplicates in your map.
You put 116 (an int or Integer after boxing) and "116" (a String). Since both are of different type the map differentiates them, they are different objects. Consider the following example
HashMap<Object, Integer> map = new HashMap<>();
map.put("116", 1);
System.out.println(map.size()); // Outputs 1
map.put(116, 2);
System.out.println(map.size()); // Now outputs 2
System.out.println("116".equals(116)); // Outputs false
In general you should never use raw-types, that is using HashMap without specifying the generic type to use, like HashMap<String, Integer>. If you don't specify anything it will use HashMap<Object, Object>. By that it allows every Object to be put into the map. In many cases you can and want to restrict this to a specific type only.
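As a rough illustration of that advice, here is what the example from the question could look like with type parameters. The class name TypedMapDemo and the value "kumar again" are made up for this sketch. With a Map<String, String>, the Integer key is rejected at compile time, and converting it to a String shows the usual "one value per key" behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TypedMapDemo {

    static Map<String, String> build() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("116", "kumar");
        // m.put(116, "kumar");   // would not compile: an int is not a String
        m.put(String.valueOf(116), "kumar again"); // same key "116": the value is replaced
        m.put("117", "Ram charan");
        return m;
    }

    public static void main(String[] args) {
        System.out.println(build()); // {116=kumar again, 117=Ram charan}
    }
}
```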
Try the following:
String s123="123";
Integer i123=123;
System.out.println(s123.hashCode());
System.out.println(i123.hashCode());
System.out.println(i123.equals(s123)); // you can try s123.equals(i123) too
You can even test it online, just copy/type these lines to http://www.javarepl.com/term.html - you will see that the String has a hashcode of 48690, the Integer has 123, and they are not considered equal to each other.
(Of course it works with 116 too just I did not have that number in front of me while typing this answer)
You don't have duplicates. Integer and String objects are not the same type, so "116" and 116 are not equal, and they have different hash codes.
Objects that are equal must have the same hash code within a running process.
In the equals method, if the two objects are not of the same type, it returns false. See the Integer equals implementation:
public boolean equals(Object obj) {
if (obj instanceof Integer) {
return value == ((Integer)obj).intValue();
}
return false;
}
The hash codes will not be equal in your case either.
This is how the String hash code is calculated:
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
For Integer, the hash code is the integer value itself, so in your case it will be 116 for the Integer instance, which differs from the hash code of the String "116".
Please avoid raw types, that is, using HashMap without specifying the generic type. See the article what-is-a-raw-type-and-why-shouldnt-we-use-it for more details.

Why is the key not being found in the hashmap?

I print out the key being searched for and the keys in the map, and they match, but the lookup fails. I tested by populating the map with one object, iterating through it, and printing out the keys. The key I am referencing is there, so I can't see how temp is null.
Birds temp = (Birds)hint.get(input.substring(0, input.length()-1).trim());//the last char is being dropped off on purpose
if(temp == null)
{
System.out.println("failed to map key");
Iterator entries = hint.entrySet().iterator();
while (entries.hasNext()) {
Map.Entry thisEntry = (Map.Entry) entries.next();
System.out.println("Key1: "+
thisEntry.getKey()); // this and the next line print the same
System.out.println("key2: "+
input.substring(0, input.length()-1).trim());
}
}
I added the following lines to bird class but still the same problem
@Override public int hashCode()
{
return name.hashCode();
}
@Override
public boolean equals(Object obj) {
Bird b = (Bird)obj;
String str = b.name;
if(str.compareTo(this.name) == 0)
return true;
else
return false;
}
Turned out white space was screwing things up and I wasn't calling trim() often enough.
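A small sketch of that whitespace problem, assuming String keys. The class TrimKeyDemo, the helper normalize and the sample data are made up for illustration. The idea is to apply the same normalization both when storing and when looking up, and to print keys inside brackets while debugging so that stray whitespace becomes visible.

```java
import java.util.HashMap;
import java.util.Map;

public class TrimKeyDemo {

    // Hypothetical helper: one place that decides what a "clean" key looks like.
    static String normalize(String raw) {
        return raw.trim();
    }

    static String lookup(Map<String, String> hint, String input) {
        return hint.get(normalize(input));
    }

    public static void main(String[] args) {
        Map<String, String> hint = new HashMap<>();
        hint.put(normalize("finch "), "a small bird"); // stored under "finch"

        System.out.println(hint.get("finch "));        // null: the trailing space makes it a different key
        System.out.println(lookup(hint, " finch\n"));  // a small bird

        // Debugging aid: brackets make leading/trailing whitespace visible.
        for (String key : hint.keySet()) {
            System.out.println("[" + key + "]");       // [finch]
        }
    }
}
```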
When you call substring, keep in mind that the ending index is not included in the substring.
The substring begins at the specified beginIndex and extends to the character at index endIndex - 1
In your call
input.substring(0, input.length()-1)
you are actually taking the last character off of whatever is currently in input. So, if you had a key "finch", you are inadvertently looking up the key "finc".
I don't see a reason for the substring call at all; remove it:
Birds temp = (Birds) hint.get(input.trim());
Additionally, the cast to Birds would be unnecessary if you supplied generic type parameters to your HashMap, something like this:
Map<String, Birds> hint = new HashMap<>();
Then, when calling get, you no longer need the cast:
Birds temp = hint.get(input.trim());

Java: Composite key in hashmaps

I would like to store a group of objects in a HashMap, where the key is a composite of two string values. Is there a way to achieve this?
I can simply concatenate the two strings, but I'm sure there is a better way to do this.
You could have a custom object containing the two strings:
class StringKey {
private String str1;
private String str2;
}
Problem is, you need to determine the equality test and the hash code for two such objects.
Equality could be the match on both strings and the hashcode could be the hashcode of the concatenated members (this is debatable):
class StringKey {
private String str1;
private String str2;
@Override
public boolean equals(Object obj) {
if(obj != null && obj instanceof StringKey) {
StringKey s = (StringKey)obj;
return str1.equals(s.str1) && str2.equals(s.str2);
}
return false;
}
@Override
public int hashCode() {
return (str1 + str2).hashCode();
}
}
You don't need to reinvent the wheel. Simply use Guava's HashBasedTable<R,C,V> implementation of the Table<R,C,V> interface for your need. Here is an example:
Table<String, String, Integer> table = HashBasedTable.create();
table.put("key-1", "lock-1", 50);
table.put("lock-1", "key-1", 100);
System.out.println(table.get("key-1", "lock-1")); //prints 50
System.out.println(table.get("lock-1", "key-1")); //prints 100
table.put("key-1", "lock-1", 150); //replaces 50 with 150
public int hashCode() {
return (str1 + str2).hashCode();
}
This seems to be a terrible way to generate the hashCode: Creating a new string instance every time the hash code is computed is terrible! (Even generating the string instance once and caching the result is poor practice.)
There are a lot of suggestions here:
How do I calculate a good hash code for a list of strings?
public int hashCode() {
final int prime = 31;
int result = 1;
for ( String s : strings ) {
result = result * prime + s.hashCode();
}
return result;
}
For a pair of strings, that becomes:
return string1.hashCode() * 31 + string2.hashCode();
That is a very basic implementation. Lots of advice through the link to suggest better tuned strategies.
Why not create a (say) Pair object, which contains the two strings as members, and then use this as the key ?
e.g.
public class Pair {
private final String str1;
private final String str2;
// this object should be immutable to reliably perform subsequent lookups
}
Don't forget about equals() and hashCode(). See this blog entry for more on HashMaps and keys, including a background on the immutability requirements. If your key isn't immutable, then you can change its components and a subsequent lookup will fail to locate it (this is why immutable objects such as String are good candidates for a key)
You're right that concatenation isn't ideal. For some circumstances it'll work, but it's often an unreliable and fragile solution (e.g. is AB/C a different key from A/BC ?).
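To make the Pair idea concrete, here is a minimal sketch of an immutable two-string key with equals() and hashCode(). The wrapper class PairKeyDemo and the field names are made up for illustration, and Objects.hash is just one convenient way to combine the two hash codes. The example also shows the AB/C versus A/BC ambiguity of plain concatenation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class PairKeyDemo {

    // Immutable composite key: both fields are final and never change after construction.
    static final class Pair {
        private final String first;
        private final String second;

        Pair(String first, String second) {
            this.first = Objects.requireNonNull(first);
            this.second = Objects.requireNonNull(second);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Pair)) return false;
            Pair other = (Pair) o;
            return first.equals(other.first) && second.equals(other.second);
        }

        @Override
        public int hashCode() {
            return Objects.hash(first, second);
        }
    }

    public static void main(String[] args) {
        Map<Pair, Integer> map = new HashMap<>();
        map.put(new Pair("AB", "C"), 1);
        map.put(new Pair("A", "BC"), 2);

        System.out.println(map.size());                    // 2: the two keys stay distinct
        System.out.println(map.get(new Pair("AB", "C")));  // 1: an equal key finds the entry
        System.out.println(("AB" + "C").equals("A" + "BC")); // true: concatenated keys collide
    }
}
```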
I have a similar case. All I do is concatenate the two strings separated by a tilde ( ~ ).
So when the client calls the service function to get the object from the map, it looks like this:
MyObject getMyObject(String key1, String key2) {
String cacheKey = key1 + "~" + key2;
return map.get(cacheKey);
}
It is simple, but it works.
I see that many people use nested maps. That is, to map Key1 -> Key2 -> Value (I use the computer science, a.k.a. Haskell currying, notation for a (Key1 x Key2) -> Value mapping, which takes two arguments and produces a value), you first supply the first key. This returns a (partial) map Key2 -> Value, which you unfold in the next step.
For instance,
Map<File, Map<Integer, String>> table = new HashMap<>(); // maps (File, Integer) -> String
void add(File k1, Integer k2, String value) {
    // create the inner map on first use, then store the value under the second key
    table.computeIfAbsent(k1, k -> new HashMap<>()).put(k2, value);
}
String get(File k1, Integer k2) {
    Map<Integer, String> inner = table.get(k1);
    // no inner map means nothing was stored under the first key
    return inner == null ? null : inner.get(k2);
}
I am not sure whether this is better than the plain composite key construction. You may comment on that.
Reading about the spaghetti/cactus stack I came up with a variant which may serve for this purpose, including the possibility of mapping your keys in any order so that map.lookup("a","b") and map.lookup("b","a") return the same element. It also works with any number of keys, not just two.
I use it as a stack for experimenting with dataflow programming, but here is a quick and dirty version which works as a multi key map (it should be improved: Sets instead of arrays should be used to avoid looking up duplicated occurrences of a key)
public class MultiKeyMap <K,E> {
class Mapping {
E element;
int numKeys;
public Mapping(E element,int numKeys){
this.element = element;
this.numKeys = numKeys;
}
}
class KeySlot{
Mapping parent;
public KeySlot(Mapping mapping) {
parent = mapping;
}
}
class KeySlotList extends LinkedList<KeySlot>{}
class MultiMap extends HashMap<K,KeySlotList>{}
class MappingTrackMap extends HashMap<Mapping,Integer>{}
MultiMap map = new MultiMap();
public void put(E element, K ...keys){
Mapping mapping = new Mapping(element,keys.length);
for(int i=0;i<keys.length;i++){
KeySlot k = new KeySlot(mapping);
KeySlotList l = map.get(keys[i]);
if(l==null){
l = new KeySlotList();
map.put(keys[i], l);
}
l.add(k);
}
}
public E lookup(K ...keys){
MappingTrackMap tmp = new MappingTrackMap();
for(K key:keys){
KeySlotList l = map.get(key);
if(l==null)return null;
for(KeySlot keySlot:l){
Mapping parent = keySlot.parent;
Integer count = tmp.get(parent);
if(parent.numKeys!=keys.length)continue;
if(count == null){
count = parent.numKeys-1;
}else{
count--;
}
if(count == 0){
return parent.element;
}else{
tmp.put(parent, count);
}
}
}
return null;
}
public static void main(String[] args) {
MultiKeyMap<String,String> m = new MultiKeyMap<String,String>();
m.put("brazil", "yellow", "green");
m.put("canada", "red", "white");
m.put("USA", "red" ,"white" ,"blue");
m.put("argentina", "white","blue");
System.out.println(m.lookup("red","white")); // canada
System.out.println(m.lookup("white","red")); // canada
System.out.println(m.lookup("white","red","blue")); // USA
}
}
public static String fakeMapKey(final String... arrayKey) {
String[] keys = arrayKey;
if (keys == null || keys.length == 0)
return null;
if (keys.length == 1)
return keys[0];
String key = "";
for (int i = 0; i < keys.length; i++)
key += "{" + i + "}" + (i == keys.length - 1 ? "" : "{" + keys.length + "}");
keys = Arrays.copyOf(keys, keys.length + 1);
keys[keys.length - 1] = FAKE_KEY_SEPARATOR;
return MessageFormat.format(key, (Object[]) keys);
}
public static final String FAKE_KEY_SEPARATOR = "~";
INPUT:
fakeMapKey("keyPart1","keyPart2","keyPart3");
OUTPUT: keyPart1~keyPart2~keyPart3
I’d like to mention two options that I don’t think were covered in the other answers. Whether they are good for your purpose you will have to decide yourself.
Map<String, Map<String, YourObject>>
You may use a map of maps, using string 1 as key in the outer map and string 2 as key in each inner map.
I do not think it’s a very nice solution syntax-wise, but it’s simple and I have seen it used in some places. It’s also supposed to be efficient in time and memory, while this shouldn’t be the main reason in 99 % of cases. What I don’t like about it is that we’ve lost the explicit information about the type of the key: it’s only inferred from the code that the effective key is two strings, it’s not clear to read.
Map<YourObject, YourObject>
This is for a special case. I have had this situation more than once, so it’s not more special than that. If your objects contain the two strings used as key and it makes sense to define object equality based on the two, then define equals and hashCode in accordance and use the object as both key and value.
One would have wished to use a Set rather than a Map in this case, but a Java HashSet doesn’t provide any method to retrieve an object from a set based on an equal object. So we do need the map.
One liability is that you need to create a new object in order to do lookup. This goes for the solutions in many of the other answers too.
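The second option (the object serving as both key and value) might look roughly like this. The class Item, its fields and the helper labelFor are purely hypothetical. Equality is defined on the two strings only, so a throwaway probe object with the same two strings finds the stored object.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class SelfKeyDemo {

    // Hypothetical value class: identity is (code, region); label is the payload.
    static final class Item {
        final String code;
        final String region;
        final String label;

        Item(String code, String region, String label) {
            this.code = code;
            this.region = region;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Item)) return false;
            Item other = (Item) o;
            return code.equals(other.code) && region.equals(other.region);
        }

        @Override
        public int hashCode() {
            return Objects.hash(code, region); // label is deliberately left out
        }
    }

    static String labelFor(Map<Item, Item> items, String code, String region) {
        Item probe = new Item(code, region, null); // new object created only for the lookup
        Item found = items.get(probe);
        return found == null ? null : found.label;
    }

    public static void main(String[] args) {
        Map<Item, Item> items = new HashMap<>();
        Item stored = new Item("A1", "north", "first item");
        items.put(stored, stored); // the object is its own key

        System.out.println(labelFor(items, "A1", "north")); // first item
        System.out.println(labelFor(items, "A1", "south")); // null
    }
}
```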
Link
Jerónimo López: Composite key in HashMaps on the efficiency of the map of maps.

Java sort String array of file names by their extension

I have an array of filenames and need to sort that array by the extensions of the filename. Is there an easy way to do this?
Arrays.sort(filenames, new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
// the +1 is to avoid including the '.' in the extension and to avoid exceptions
// EDIT:
// We first need to make sure that either both files or neither file
// has an extension (otherwise we'll end up comparing the extension of one
// to the start of the other, or else throwing an exception)
final int s1Dot = s1.lastIndexOf('.');
final int s2Dot = s2.lastIndexOf('.');
if ((s1Dot == -1) == (s2Dot == -1)) { // both or neither
s1 = s1.substring(s1Dot + 1);
s2 = s2.substring(s2Dot + 1);
return s1.compareTo(s2);
} else if (s1Dot == -1) { // only s2 has an extension, so s1 goes first
return -1;
} else { // only s1 has an extension, so s1 goes second
return 1;
}
}
});
For completeness: java.util.Arrays and java.util.Comparator.
If I remember correctly, the Arrays.sort(...) takes a Comparator<> that it will use to do the sorting. You can provide an implementation of it that looks at the extension part of the string.
You can implement a custom Comparator of Strings. Make it sort them by the substring after the last index of '.'. Then pass in the comparator and your array into
Arrays.sort(stringArray, yourComparator);
// An implementation of the compare method
public int compare(String o1, String o2) {
// note: substring throws if a name contains no '.', because lastIndexOf then returns -1
return o1.substring(o1.lastIndexOf('.')).compareTo(o2.substring(o2.lastIndexOf('.')));
}
Comparators are often hard to get exactly right, and the comparison key has to be generated for every comparison, which for most sorting algorithms means O(n log n) key computations. Another approach is to create (key, value) pairs for each item you need to sort, put them in a TreeMap, and then ask for the values, as these are sorted according to the key.
For instance
import java.util.Arrays;
import java.util.TreeMap;
public class Bar {
public static void main(String[] args) {
TreeMap<String, String> m2 = new TreeMap<String, String>();
for (String string : Arrays.asList(new String[] { "#3", "#2", "#1" })) {
String key = string.substring(string.length() - 1);
String value = string;
m2.put(key, value);
}
System.out.println(m2.values());
}
}
prints out
[#1, #2, #3]
You should easily be able to adapt the key calculation to your problem.
This only calculates the key once per entry, hence O(n) - (but the sort is still O(n log n)). If the key calculation is expensive or n is large this might be quite measurable.
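One caveat when adapting this to file extensions: a TreeMap keeps only one value per key, so two files with the same extension would overwrite each other. A possible workaround, sketched here under the assumption that files without a '.' count as having an empty extension, is to collect a list of names per key. The class name ExtensionBuckets and its helpers are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ExtensionBuckets {

    // Text after the last '.', or "" when the name has no '.' at all.
    static String extension(String name) {
        int dot = name.lastIndexOf('.');
        return dot == -1 ? "" : name.substring(dot + 1);
    }

    static List<String> sortByExtension(String[] filenames) {
        // The key is computed once per file; the TreeMap keeps the keys sorted.
        Map<String, List<String>> buckets = new TreeMap<>();
        for (String name : filenames) {
            buckets.computeIfAbsent(extension(name), k -> new ArrayList<>()).add(name);
        }
        // Flatten the buckets; files sharing an extension keep their input order.
        List<String> result = new ArrayList<>();
        for (List<String> bucket : buckets.values()) {
            result.addAll(bucket);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] files = { "b.txt", "a.java", "README", "c.txt", "d.java" };
        System.out.println(sortByExtension(files)); // [README, a.java, d.java, b.txt, c.txt]
    }
}
```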
Create a Comparator and compare the string extensions. Take a look at the following
http://java.sun.com/j2se/1.4.2/docs/api/java/util/Comparator.html
Then pass your array of strings to Arrays.sort(array, comparator), or a List to Collections.sort(list, comparator).
Create your own Comparator that treats the strings as filenames and compares them based on the extensions. Then use Arrays.sort with the Comparator argument.
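With Java 8 or later, such a comparator can be written fairly compactly. The following is a sketch, not the only way to do it: the class name ExtensionSort and the helper extension are made up, names without a '.' are treated as having an empty extension (so they sort first), and ties are broken by the full file name.

```java
import java.util.Arrays;
import java.util.Comparator;

public class ExtensionSort {

    // Text after the last '.', or "" when the name has no '.' at all.
    static String extension(String name) {
        int dot = name.lastIndexOf('.');
        return dot == -1 ? "" : name.substring(dot + 1);
    }

    static String[] sorted(String[] filenames) {
        Comparator<String> byExtension = Comparator.comparing(ExtensionSort::extension);
        // Tie-breaker: names with the same extension are ordered alphabetically.
        Comparator<String> byExtensionThenName = byExtension.thenComparing(Comparator.naturalOrder());

        String[] copy = filenames.clone(); // leave the caller's array untouched
        Arrays.sort(copy, byExtensionThenName);
        return copy;
    }

    public static void main(String[] args) {
        String[] files = { "b.txt", "d.java", "README", "c.txt", "a.java" };
        System.out.println(Arrays.toString(sorted(files))); // [README, a.java, d.java, b.txt, c.txt]
    }
}
```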
If you just want to group the files by their extension and do not care about the actual alphabetical order, you can use this:
String DELIMETER = "\\."; // regex for a literal dot; File.separator + "." only works as a regex on Windows
List<String> orginalList = new CopyOnWriteArrayList<>(Arrays.asList(listOfFileNames));
Set<String> setOfuniqueExtension = new TreeSet<>();
for (String item : listOfFileNames) {
if (item.contains(".")) {
String[] split = item.split(DELIMETER);
String temp = "." + split[split.length - 1];
setOfuniqueExtension.add(temp);
}
}
List<String> finalListOfAllFiles = new LinkedList<>();
setOfuniqueExtension.stream().forEach((s1) -> {
for (int i = 0; i < orginalList.size(); i++) {
if (orginalList.get(i).contains(s1)) {
finalListOfAllFiles.add(orginalList.get(i));
orginalList.remove(orginalList.get(i));
i--;
}
}
});
orginalList.stream().filter((s1) -> (!finalListOfAllFiles.contains(s1))).forEach((s1) -> {
finalListOfAllFiles.add(s1);
});
return finalListOfAllFiles;
I think the simplest thing you can do that also works when the filename does not have a "." is to just reverse the names and compare them.
Arrays.sort(ary, new Comparator<String>() {
@Override
public int compare(String o1, String o2) {
String r1 = new StringBuffer(o1).reverse().toString();
String r2 = new StringBuffer(o2).reverse().toString();
return r1.compareTo(r2);
}
});
It's a shame that Java's String does not even have a reverse().
