In C++, I can look up a key in a map and insert it if it's not there for the cost of a single look up. Can I do the same in Java?
Update:
(For those of you who must see code.)
long id = 0xabba;
int version = 0xb00b;
for (List<Object> key : keys) {
    if (!index.containsKey(key)) {
        index.put(key, Maps.<Long,Integer>newHashMap());
    }
    index.get(key).put(id, version);
}
There are two look ups when the key is first inserted into the map. In C++, I could do it with a single look up.
Concurrent maps have an atomic putIfAbsent method, if this is what you mean.
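For what it's worth, on Java 8 or later the plain Map interface also has computeIfAbsent, which folds the "look up, insert if missing" step into one call. Below is a minimal sketch of the loop from the question written that way. The class and method names (SingleLookupIndex, record, versionsFor) are made up for illustration, and whether the call costs exactly one hash lookup internally is an implementation detail of the map you use.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical wrapper around the "index" map from the question.
final class SingleLookupIndex {
    // Maps each key to its own (id -> version) map.
    private final Map<List<Object>, Map<Long, Integer>> index = new HashMap<>();

    void record(List<List<Object>> keys, long id, int version) {
        for (List<Object> key : keys) {
            // computeIfAbsent returns the existing inner map, or creates,
            // stores, and returns a new one if the key was missing.
            index.computeIfAbsent(key, k -> new HashMap<>()).put(id, version);
        }
    }

    // Returns the inner map for the key, or null if the key was never recorded.
    Map<Long, Integer> versionsFor(List<Object> key) {
        return index.get(key);
    }
}
```

Note that putIfAbsent would still need a second get to reach the inner map when the key already exists, which is why computeIfAbsent fits this case better.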
I am not entirely familiar with the C++ implementation, but I doubt that it is truly a single operation in terms of performance/efficiency.
Even if it was, why would you necessarily need one in Java? Or even want one?
Assuming that it looks something like:
lookup(object) // side effect of object insertion
I wouldn't want something like this in Java for anything other than concurrency.
EDIT: clarification
This question already has answers here:
Class Object vs Hashmap
(3 answers)
Closed 3 years ago.
I have a piece of code that returns the minimum and maximum values from some input that it takes. What are the benefits of using a custom class that has minimum and maximum fields over using a map that holds these two values?
//this is the class that holds the min and max values
public class MaxAndMinValues {
private double minimum;
private double maximum;
//rest of the class code omitted
}
//this is the map that holds the min and max values
Map<String, Double> minAndMaxValuesMap
The most apparent answer would be object-oriented programming aspects, like the possibility to bundle data with functionality and the possibility to derive from that class.
But let's assume for the moment that this is not a major factor. Your example is so simple that I wouldn't use a Map either. What I would use is the Pair class from Apache Commons: https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/Pair.html
(ImmutablePair):
https://commons.apache.org/proper/commons-lang/javadocs/api-3.1/org/apache/commons/lang3/tuple/ImmutablePair.html
The Pair class is generic and has two type parameters, one for each field. You can basically define a Pair of something and get type safety, IDE support, autocompletion, and the big benefit of knowing what is inside. A Pair also offers things that a Map cannot. For example, a Pair is potentially Comparable. See also ImmutablePair if you want to use it as a key in another Map.
public Pair<Double, Double> foo(...) {
// ...
Pair<Double, Double> range = Pair.of(minimum, maximum);
return range;
}
The big advantage of this class is that the type you return exposes the contained types. So if you need to, you can return two values of different types from a single method call (without using a map or a complicated inner class).
e.g. Pair<String, Double> or Pair<String, List<Double>>...
In a simple situation where you just need to store the min and max values from user input, your custom class is a better fit than a Map. The reason is that in Java a Map can be a HashMap, a LinkedHashMap, or a TreeMap, and each of them takes some time to put your data into its structure and again when you get a value back out. So in a simple case like the one you described, just use your custom class. Moreover, you can write methods in your class to process the user input, which a Map cannot do for you.
I would look at this from the perspective of how a programming language is used. In any language there will be multiple ways to achieve the result (easy, bad, complicated, performant ...). In an object-oriented language like Java, this question is more about the design side of your solution.
Think of accessibility.
The values in a Map are effectively public: you can modify the contents as you like from any part of the code. If you had a condition that the min and max must be in the range [-100, 100] and some part of your code inserts 200 into the map, you have a bug. OK, we can cover that with validation, but how many validation checks would you write? With an object, encapsulation is always possible.
Think of re-use.
If you had the same requirement in another place in the code, you would have to rewrite the map logic again (probably with all the validations?). Doesn't look good, right?
Think of extensibility.
If you wanted one more value, like a median or an average, you would either have to pollute the map with ad hoc keys or create a new map. An object is always easy to extend.
So it all comes down to design. If you think it's a one-time usage, a map will probably do (though it is not a standard design anyway: a map should contain one kind of data, technically and functionally).
Last but not least, think of code readability and cognitive complexity. Objects with relevant responsibilities will always be better than unclear generic storage.
Hope I made some sense!
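To make the accessibility point concrete, here is a minimal sketch of the class from the question with the validation written once, in the constructor. The [-100, 100] bounds are only the example range used above, and the getter names and exception message are my own assumptions, not something from the original code.

```java
// Sketch of the question's MaxAndMinValues with validation in one place.
final class MaxAndMinValues {
    // Assumed bounds, taken from the [-100, 100] example in this answer.
    private static final double LOWER_BOUND = -100.0;
    private static final double UPPER_BOUND = 100.0;

    private final double minimum;
    private final double maximum;

    MaxAndMinValues(double minimum, double maximum) {
        // Written as a negated chain so that NaN inputs are rejected too.
        if (!(LOWER_BOUND <= minimum && minimum <= maximum && maximum <= UPPER_BOUND)) {
            throw new IllegalArgumentException(
                    "expected -100 <= minimum <= maximum <= 100 but got "
                            + minimum + " and " + maximum);
        }
        this.minimum = minimum;
        this.maximum = maximum;
    }

    double getMinimum() {
        return minimum;
    }

    double getMaximum() {
        return maximum;
    }
}
```

Because the fields are final and private, no other part of the code can slip a 200 in later, which is exactly what a bare Map cannot promise.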
The benefit is simple: it makes your code clearer and more robust.
The MaxAndMinValues name and its class definition (two fields) convey a min and a max value, but above all the class guarantees that it accepts only these two things, and its API makes it self-explanatory how to store values in it and get them back.
While Map<String, Double> minAndMaxValuesMap also conveys the idea that a min and a max value are stored in it, it has multiple drawbacks in terms of design:
We don't know how to retrieve values without looking at how they were added.
Related to that: how do we name the keys when we add entries to the map? The String type is too broad for a key. For example "MIN", "min", and "Minimum" would all be accepted. An enum would solve this issue, but not the others.
We cannot ensure that both values (min and max) were added (whereas a constructor with arguments can do that).
We can add any other value to the map, since it is a Map and not a fixed data structure.
Beyond the idea of clearer code in general, I would add that if MaxAndMinValues were used only as an implementation detail inside a specific method or in a lambda, using a Map or even an array {15F, 20F} would be acceptable. But if the data is passed between methods, you have to make its meaning as clear as possible.
We used a custom class over a HashMap so that we could sort the map based on its values.
I am using a hashmap to store objects with a key that evolves over time.
HashMap<String,Stuff> hm = new HashMap<String,Stuff>();
Stuff stuff = new Stuff();
hm.put("OriginalKey", stuff);
I didn't find anything better than removing "OriginalKey" and calling put() to add a new entry with the same object.
hm.remove("OriginalKey");
hm.put("NewKey", stuff);
remove() seems to be taking a significant CPU toll, hence my questions:
What is the actual memory cost of leaving duplicate entries (there is no overlapping risk)?
Am I just missing some neat swapKey() method?
What is the actual memory cost of leaving duplicate entries (there is no overlapping risk)?
Well, you've got an extra entry, and the key itself can't be garbage collected. If the key is "large", that could be a problem. It also means that you'll never be able to get an accurate count, you'll never be able to sensibly iterate over all the values, etc. It seems like a bad idea to me.
Am I just missing some neat swapKey() method?
There's no such thing - and it feels like a fairly rare requirement to me. Any such method would pretty much have to do what you're doing anyway - it has to find the old key, remove it from the data structure, and insert an entry for the new key. I can't easily imagine any optimizations possible just by knowing about both operations at once.
Swapping the key is not easily possible, since the key is used for hashing.
Changing the key means that the hash value is most probably different too. In this case, changing the key amounts to a deletion followed by a reinsertion.
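Since there is no built-in swapKey(), the remove-then-put step can at least be wrapped once. This is only a sketch of a helper you could write yourself; MapKeys and rekey are hypothetical names, not part of the JDK.

```java
import java.util.Map;

// Hypothetical utility class; nothing like this ships with java.util.
final class MapKeys {
    private MapKeys() {
    }

    // Moves the value stored under oldKey to newKey.
    // Returns false and leaves the map untouched if oldKey is absent.
    // Any value already stored under newKey is overwritten.
    static <K, V> boolean rekey(Map<K, V> map, K oldKey, K newKey) {
        if (!map.containsKey(oldKey)) {
            return false;
        }
        V value = map.remove(oldKey);
        map.put(newKey, value);
        return true;
    }
}
```

With the map from the question this would be called as MapKeys.rekey(hm, "OriginalKey", "NewKey"). It does the same work as the two separate calls, so it tidies the code but does not make remove() any cheaper.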
The following are some of the codes:
ENJOY08A,
AUTO09B,
PLAY06D,
SUMMER08W,
WINTER03S,
LEAF02A,
Each of these values corresponds to a specific area.
For example, ENJOY08A and AUTO09B correspond to DEPT_A
PLAY06D corresponds to DEPT_B
SUMMER08W, WINTER03S and LEAF02A correspond to DEPT_C
There are a fixed number of areas (5 areas), but unlimited codes. A code will correspond to only one area, but an area can have any number of codes.
None of the above will be stored in the database.
I need to create a Java class which will support the following operations:
Given the code, I need to know the corresponding area.
Given the area, I need to know all the corresponding codes.
What's the best way to go about designing the Java class?
Check out the Guava Multimap. I believe it provides the functionality you want.
I think the simplest method is to put the areas into a HashMap<String, Set<String>>, then add all the associated codes to the relevant set. Then you can get the Set of all codes for an area, or iterate over the sets to find which one contains the code you are searching for.
for (String k : areas.keySet()) {
    if (areas.get(k).contains(theCode))
        return k;
}
return "NO CORRESPONDING AREA FOUND";
Use a Multimap (Google Guava library) as a backend.
Fill the map using normal puts, either during construction time or from a properties file using the Java Properties class.
Offer two interface methods, regionForCode(String code) and codeForRegion(String region), which use the Multimap to retrieve the mapped values.
You might even consider putting region into an enum instead of a simple String, because your region values are fixed. Then the domain would be described a little bit more consistently.
Edited: I noticed that BiMap is for unique mappings. The answer with Multimap is correct, so I corrected my answer.
Since the number of codes is unlimited, you need to come up with a rule for mapping codes into departments. It will probably be something like: add up the code points of the code and take that sum modulo 5. There are an infinite number of choices for what this rule can be.
Something like
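public class DepartmentCoder {
    public static String toCode(Department department) {
        // TODO: Randomly generate a string having the desired property, such
        // as the sum of the string's code points modulo the number of departments
        // equaling the ordinal value of the department.
        throw new UnsupportedOperationException("not implemented yet");
    }
    // Assumes Department is an enum with one constant per department.
    public static Department toDepartment(String code) {
        // Sum the code points of the string and map the sum onto a department.
        int sum = code.codePoints().sum();
        Department[] departments = Department.values();
        return departments[sum % departments.length];
    }
}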
I think, though, that the notion of "code" (as opposed to compression or encryption) is that the code values are fixed. Any rule-based solution can be figured out more easily.
I would use a class with two Maps.
The first is a Map<String, Set<String>> for looking up the set (alternatively List) of codes for a given area. Somebody also suggested Guava's Multimap, which is essentially what this is.
The second would be a Map<String, String> for looking up the area of any given code.
Then just implement the appropriate String findAreaForCode(String code) and Set<String> listCodesInArea(String area) methods.
Finally, if the list of areas is small, finite and relatively static (doesn't need to grow dynamically at run-time) then I would consider using a Java enum in place of an ordinary String.
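A rough sketch of that two-map class, with the enum variant, might look like the following. The method names findAreaForCode and listCodesInArea are the ones proposed above; the register method, the AreaCodeIndex class name, and the DEPT_D and DEPT_E constants are assumptions of mine (the question only names three of its five areas).

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// DEPT_A to DEPT_C come from the question; DEPT_D and DEPT_E are assumed names.
enum Area { DEPT_A, DEPT_B, DEPT_C, DEPT_D, DEPT_E }

final class AreaCodeIndex {
    // Code -> area, for the "given the code" lookup.
    private final Map<String, Area> areaByCode = new HashMap<>();
    // Area -> codes, for the "given the area" lookup.
    private final Map<Area, Set<String>> codesByArea = new EnumMap<>(Area.class);

    // Hypothetical registration method. A code belongs to exactly one area,
    // so registering it again moves it to the new area.
    void register(String code, Area area) {
        Area previous = areaByCode.put(code, area);
        if (previous != null) {
            codesByArea.get(previous).remove(code);
        }
        codesByArea.computeIfAbsent(area, a -> new HashSet<>()).add(code);
    }

    // Returns the area for the code, or null if the code is unknown.
    Area findAreaForCode(String code) {
        return areaByCode.get(code);
    }

    // Returns a read-only view of the codes in the area (empty if none).
    Set<String> listCodesInArea(Area area) {
        Set<String> codes = codesByArea.get(area);
        return codes == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(codes);
    }
}
```

Keeping both maps behind one register method is what stops them from drifting out of sync, which is the main risk of the two-map approach.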
I've got an ArrayList called conveyorBelt, which stores orders that have been picked and placed on the conveyor belt. I've got another ArrayList called readyCollected which contains a list of orders that can be collected by the customer.
What I'm trying to do with the method I created is this: when an ordNum is entered, it returns true if the order is ready to be collected by the customer (thus removing the collected order from readyCollected). If the order hasn't even been picked yet, then it returns false.
I was wondering whether this is the right way to write the method...
public boolean collectedOrder(int ordNum)
{
int index = 0;
Basket b = new Basket(index);
if(conveyorBelt.isEmpty()) {
return false;
}
else {
readyCollected.remove(b);
return true;
}
}
I'm a little confused since you're not using ordNum at all.
If you want to confirm operation of your code and generally increase the reliability of what you're writing, you should check out unit testing and the Java frameworks available for this.
You can solve this problem using an ArrayList, but I think that this is fundamentally the wrong way to think about the problem. An ArrayList is good for storing a complete sequence of data without gaps where you are only likely to add or remove elements at the very end. It's inefficient to remove elements at other positions, and if you have just one value at a high index, then you'll waste a lot of space filling in all lower positions with null values.
Instead, I'd suggest using a Map that associates order numbers with the particular order. This more naturally encodes what you want: every order number is a key associated with the order. Maps, and particularly HashMaps, have very fast lookups (expected constant time) and use space roughly proportional to the number of orders stored, no matter how large the order numbers are. Moreover, the time to insert or remove an element from a HashMap is expected constant time, which is extremely fast.
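As a rough illustration of that suggestion, here is a sketch with readyCollected as a HashMap keyed by order number. The Basket class below is a stand-in, since the real one isn't shown in the question, and CollectionPoint and markReady are names I made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the asker's Basket class.
final class Basket {
    private final int ordNum;

    Basket(int ordNum) {
        this.ordNum = ordNum;
    }

    int getOrdNum() {
        return ordNum;
    }
}

final class CollectionPoint {
    // Orders that are ready for the customer, keyed by order number.
    private final Map<Integer, Basket> readyCollected = new HashMap<>();

    // Assumed helper: called when an order comes off the conveyor belt.
    void markReady(Basket basket) {
        readyCollected.put(basket.getOrdNum(), basket);
    }

    // Removes the order if it is ready and reports whether it was there.
    boolean collectedOrder(int ordNum) {
        // Map.remove returns the removed basket, or null if there was none.
        return readyCollected.remove(ordNum) != null;
    }
}
```

This version actually uses ordNum, and collecting the same order twice returns false the second time.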
As for your particular code, I agree with Brian Agnew on this one that you probably want to write some unit tests for it and find out why you're not using the ordNum parameter. That said, I'd suggest reworking the system to use HashMap instead of ArrayList before doing this; the savings in time and code complexity will really pay off.
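Based on your description, why isn't this sufficient:
public boolean collectedOrder(int ordNum) {
    // Assumes readyCollected is a Map keyed by order number; on an ArrayList,
    // remove(int) would treat ordNum as an index instead.
    return (readyCollected.remove(ordNum) != null);
}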
Why does the conveyorBelt ArrayList even need to be checked?
As already pointed out, you most likely need to be using ordNum.
Aside from that the best answer anyone can give with the code you've posted is "perhaps". Your logic certainly looks correct and ties in with what you've described, but whether it's doing what it should depends entirely on your implementation elsewhere.
As a general pointer (which may or may not be applicable in this instance) you should make sure your code deals with edge cases and incorrect values. So you might want to flag something's wrong if readyCollected.remove(b); returns false for instance, since that indicates that b wasn't in the list to remove.
As already pointed out, take a look at unit tests using JUnit for this type of thing. It's easy to use and writing thorough unit tests is a very good habit to get into.
So, I have a situation where I need to pass in three values into a serial BlockingQueue queue:
(SelectableChannel, ComponentSocketBasis, Integer).
They don't actually need to be hash mapped at all, and to use a HashMap is ridiculous as there will always be only one key for each entry; it'd be fine if they were just in some sort of ordered set. For lack of a known alternative, however, I used a HashMap in my implementation and produced this obfuscated generics composition:
private LinkedBlockingQueue<HashMap<HashMap<SelectableChannel, ComponentSocketBasis>, Integer>> deferredPollQueue = new LinkedBlockingQueue<HashMap<HashMap<SelectableChannel, ComponentSocketBasis>, Integer>>();
This seems really ridiculous. I must be a terrible n00b. Surely there is a better way to do this that doesn't require me to decompose the key when retrieving the values or waste the (theoretical--in practice, Java's always bloated :) algorithmic complexity on a useless hash computation I don't need because I have a key space of 1 and don't even want to relationally map the three references, but merely to group them? With this implementation, I have to pull out the values thusly:
while(deferredPollQueue.size() > 0) {
System.out.println("*** Draining new socket channel from queue");
HashMap<HashMap<SelectableChannel, ComponentSocketBasis>, Integer> p = deferredPollQueue.take();
SelectableChannel chan = null;
ComponentSocketBasis sock = null;
int ops = 0;
HashMap<SelectableChannel, ComponentSocketBasis> q = p.keySet().iterator().next();
chan = q.keySet().iterator().next();
sock = q.get(chan);
ops = p.get(q).intValue();
SelectionKey k = chan.register(selector, ops);
if(!channelSupervisorMap.containsKey(k))
channelSupervisorMap.put(k, sock);
}
I am pretty sure every being capable of sentient reason here probably thinks this is a ridiculous way to do it, so the question is - what's the right way? :) I can't find evidence of a java.util.Pair or java.util.Triplet anywhere.
I suppose an Orthodox Way(TM) would be to do a custom class or interface just for the purpose of housing this triplet, but for such a small task in such a large system this seems preposterously verbose and unnecessary--though, then again, that's Java itself.
By the same token, perhaps the values can be put onto an ArrayList or a Vector or derivative thereof, but in Java this does not yield a more terse way of addressing them than I'm getting out of this HashMap here, though it does solve the algorithmic complexity issue perhaps.
Back in Perl land, we'd just do this by using an array reference as a value inside an array:
push(@$big_queue_array, [$elem1, \%elem2, \@elem3]);
What's the best equivalent in Java?
Why not just create your own generic Pair or Triple classes? Pretty much every Java 5+ project ends up having them in their own util classes!
You say that a custom class to hold the triplet would be bloated and unnecessary, but this really is the way to do it, that's how object-oriented modelling works. A custom class is explicit and readable, and takes up no more runtime resources than a generic holder class would.
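For a sense of how small such a class can be, here is a sketch. DeferredPoll and its accessor names are invented for illustration, and the empty ComponentSocketBasis interface is only a placeholder for the asker's own type, which isn't shown in the question.

```java
import java.nio.channels.SelectableChannel;

// Placeholder for the asker's own type, which is not shown in the question.
interface ComponentSocketBasis {
}

// Hypothetical holder for the three values that travel through the queue.
final class DeferredPoll {
    private final SelectableChannel channel;
    private final ComponentSocketBasis socket;
    private final int ops;

    DeferredPoll(SelectableChannel channel, ComponentSocketBasis socket, int ops) {
        this.channel = channel;
        this.socket = socket;
        this.ops = ops;
    }

    SelectableChannel getChannel() {
        return channel;
    }

    ComponentSocketBasis getSocket() {
        return socket;
    }

    int getOps() {
        return ops;
    }
}
```

The queue then becomes a LinkedBlockingQueue<DeferredPoll>, and the drain loop reads p.getChannel(), p.getSocket(), and p.getOps() instead of unpicking nested key sets.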
Functional Java has pairs, triplets, and tuples up to arity 8. There's also a type called HList for arbitrary arity. So your type would be:
LinkedBlockingQueue<P3<SelectableChannel, ComponentSocketBasis, Integer>>
This is just a library, so drop the jar in your classpath and you're good to go.
You could just use an ArrayList to store the objects, since you know which object will be at which location. Creating a new class with SelectableChannel and ComponentSocketBasis members would probably be better.
If you're going to be doing this kind of thing a lot, creating a generic pair or tuple now will save you a lot of time, but if this is the only place you're going to be using it, then creating a new class will result in much easier to read code.
Whenever you see your classname in the code, you'll know exactly what it's for, whereas if you just see your generic amalgamation, it might be harder for you (or someone else) to make sense of what it's being used for.
It's a tradeoff between programming time and readability.