Store value in variable or call HashMap function? [duplicate] - java

When you need to call a HashMap's get on the same value a few times within a for loop, would it be more efficient to store it in a variable or to make the call two or three times?

Retrieving a value from a HashMap is an O(1) operation, assuming your keys have a reasonable implementation of hashCode().
If you're only retrieving this object a couple of times it may be a micro-optimization (read: premature optimization) to store it in a local variable, but you probably won't notice any difference either way. The real reason to store such an object in a local variable is to avoid duplicating boiler-plate code that checks the key really exists in the map, the value isn't null, etc.
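To illustrate the point about boilerplate, here is a minimal sketch of looking a value up once and doing the null check in a single place. The class name LookupOnce and the method totalLength are hypothetical and only serve the example; they are not from the question.

```java
import java.util.Map;

class LookupOnce {
    // Sums the lengths of the values found for the given keys.
    // Keys that are absent, or mapped to null, are skipped.
    static int totalLength(Map<String, String> map, Iterable<String> keys) {
        int total = 0;
        for (String key : keys) {
            String value = map.get(key); // single lookup per key
            if (value == null) {
                continue; // one check covers "absent" and "mapped to null"
            }
            total += value.length();
        }
        return total;
    }
}
```

With the value held in a local variable, the existence check is written once instead of being repeated around every get() call.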

Accessing data in a HashMap is O(1), so in general it's quite fast. However, initializing a variable with the value from the HashMap would be a little bit faster. Each time you access a HashMap with some key, the key's hashCode method is called first. Calling it once instead of several times is faster.
In my experience, preparing a variable for such cases is the better solution, not only for performance but also for refactoring. If you have to change the code later, you change one HashMap call instead of many on different lines, where it is easy to leave one line unchanged (which leads to a bug).

HashMap get runs in constant time, so from an efficiency point of view it doesn't matter. Storing the value in a variable is cleaner, though.

Calling hashmap.get() adds an indirection, so it will be slower than a direct variable reference. The fact that hashmap.get() has O(1) complexity has nothing to do with the answer to this question: O(1) only means that the cost of the operation does not grow with the number of elements. It says nothing about how many CPU cycles a single call takes, only that the cost is constant.
Storing the result in a variable would probably be the most performant.

I've made a simple test with a HashMap<String, String>, where both keys and values are randomly generated 64-character strings. It puts 1,000,000 records in the map and goes through each of them in two for-loops: the first calls get() once and saves the result to a variable; the second calls get() 3 times. This is done in 5 iterations.
Results (in milliseconds):
                       1    2    3    4    5 | avg
Store in a variable: 125  126  103  104  102 | 112
Call 3 times:        151  135  137  134  152 | 142
So, for this configuration (map of string-string), calling get() once and storing the result in a variable is faster.
Code of the test:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.concurrent.ThreadLocalRandom;
ArrayList<String> keys;
HashMap<String, String> data;
void run() {
    generateData(1_000_000);
    long start, end;
    for (int i = 0; i < 5; ++i) {
        // Variant 1: one lookup per key, result kept in a local variable
        start = System.nanoTime();
        for (String key : keys) {
            String value = data.get(key);
        }
        end = System.nanoTime();
        System.out.println("Store in a variable: " + ((end - start) / 1000 / 1000) + "ms");
        // Variant 2: three lookups per key
        start = System.nanoTime();
        for (String key : keys) {
            data.get(key);
            data.get(key);
            data.get(key);
        }
        end = System.nanoTime();
        System.out.println("Call 3 times: " + ((end - start) / 1000 / 1000) + "ms");
    }
}
void generateData(int size) {
    keys = new ArrayList<>(size);
    data = new HashMap<>(size);
    for (int i = 0; i < size; ++i) {
        String key = getRandomString(64);
        keys.add(key);
        data.put(key, getRandomString(64));
    }
}
String getRandomString(int length) {
    StringBuilder str = new StringBuilder();
    for (int i = 0; i < length; ++i) {
        str.append((char) ThreadLocalRandom.current().nextInt(128));
    }
    return str.toString();
}

Related

ConcurrentHashMap throws recursive update exception

Here is my Java code:
static Map<BigInteger, Integer> cache = new ConcurrentHashMap<>();
static Integer minFinder(BigInteger num) {
    if (num.equals(BigInteger.ONE)) {
        return 0;
    }
    if (num.mod(BigInteger.valueOf(2)).equals(BigInteger.ZERO)) {
        //focus on stuff thats happening inside this block, since with given inputs it won't reach last return
        return 1 + cache.computeIfAbsent(num.divide(BigInteger.valueOf(2)),
                n -> minFinder(n));
    }
    return 1 + Math.min(cache.computeIfAbsent(num.subtract(BigInteger.ONE), n -> minFinder(n)),
            cache.computeIfAbsent(num.add(BigInteger.ONE), n -> minFinder(n)));
}
I tried to memoize a function that returns a minimum number of actions such as division by 2, subtract by one or add one.
The problem I'm facing is when I call it with smaller inputs such as:
minFinder(new BigInteger("32"))
it works, but with bigger values like:
minFinder(new BigInteger("64"))
It throws a Recursive Update exception.
Is there any way to increase recursion size to prevent this exception or any other way to solve this?
From the API docs of Map.computeIfAbsent():
The mapping function should not modify this map during computation.
The API docs of ConcurrentHashMap.computeIfAbsent() make that stronger:
The mapping function must not modify this map during computation.
(Emphasis added)
You are violating that by using your minFinder() method as the mapping function. That it seems nevertheless to work for certain inputs is irrelevant. You need to find a different way to achieve what you're after.
Is there any way to increase recursion size to prevent this exception or any other way to solve this?
You could avoid computeIfAbsent() and instead do the same thing the old-school way:
BigInteger halfNum = num.divide(BigInteger.valueOf(2));
Integer cachedValue = cache.get(halfNum); // the cache maps BigInteger keys to Integer values
if (cachedValue == null) {
    cachedValue = minFinder(halfNum);
    cache.put(halfNum, cachedValue);
}
return 1 + cachedValue;
But that's not going to be sufficient if the computation loops. You could perhaps detect that by putting a sentinel value into the map before you recurse, so that you can recognize loops.
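Putting the get/put idea together for all three branches, a minimal sketch could look like the following. The class name MinSteps is hypothetical, the input is assumed to be a positive number, and no sentinel is used, on the assumption that this particular recursion always moves towards smaller values and therefore does not loop.

```java
import java.math.BigInteger;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class MinSteps {
    private static final Map<BigInteger, Integer> cache = new ConcurrentHashMap<>();

    // Minimum number of "halve", "subtract one" or "add one" steps to reach 1.
    // Assumes num >= 1.
    static int minFinder(BigInteger num) {
        if (num.equals(BigInteger.ONE)) {
            return 0;
        }
        Integer cached = cache.get(num); // plain lookup, no computeIfAbsent
        if (cached != null) {
            return cached;
        }
        int result;
        if (!num.testBit(0)) {
            // even: halve
            result = 1 + minFinder(num.shiftRight(1));
        } else {
            // odd: try both neighbours
            result = 1 + Math.min(minFinder(num.subtract(BigInteger.ONE)),
                                  minFinder(num.add(BigInteger.ONE)));
        }
        cache.put(num, result); // the map is only modified outside of any map callback
        return result;
    }
}
```

Because the recursive call never runs inside a mapping function, the map is not modified during a computeIfAbsent() computation, which is what the API docs forbid.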

compare three variables and get the variable with min/max value back (not value)

I have 3 variables with long values, timestamp1, timestamp2 and timestamp3, and an ArrayList timestampList. I want to compare them in an if/else. The three timestamps may have different values, in which case I want to add the minimum value to the list. I should also mention that these timestamps come in every 2 minutes.
When the three variables are the same, I could simply do
if(timestamp1 == timestamp2 && timestamp2 == timestamp3){
    timestampList.add(timestamp1); //since they are the same it doesn't matter which i add to the list
    .
    .
    .
}
Now, in the else or else if, I want to check the three timestamps and get the variable with the min value, not the value itself, because I need the variable for other variables further down in the code. Of course I also want to add the min value to the list. I can imagine adding more if/else branches in the else, like
else{
    if(timestamp1 < timestamp2){
        if(timestamp1 < timestamp3){
            ...
        }else{
            ...
        }
    }
}
but that would be too much and there is certainly a better way.
Try this:
long timestamp1 = 1;
long timestamp2 = 2;
long timestamp3 = 3;
long result = LongStream.of(timestamp1, timestamp2, timestamp3)
.min()
.getAsLong();
You can not really get a "pointer" to a variable in Java, as you could in C. The closest thing would be using a mutable type instead, so instead of assigning a new value to the variable you can modify an attribute of the existing instance, and the change will be reflected anywhere else in your code where you have a reference to that instance.
For example, you could wrap your long timestamps into AtomicLong instances:
// wrap in mutable AtomicLong instances
AtomicLong timestamp1 = new AtomicLong(123);
AtomicLong timestamp2 = new AtomicLong(456);
AtomicLong timestamp3 = new AtomicLong(789);
// get minimum or maximum, using streams or any other way
AtomicLong minTimeStamp = Stream.of(timestamp1, timestamp2, timestamp3)
.min(Comparator.comparing(AtomicLong::get)).get();
// modify value
System.out.println(minTimeStamp); // 123
timestamp1.set(1000); // change original variable
System.out.println(minTimeStamp); // 1000
Another way to keep a variable to a certain value would be to use Optional<Long>, as @tobias_k mentioned in his comment. An alternative to Stream would be to use a TreeSet. When creating the TreeSet we provide the function used to compare the elements. After all timestamps have been added, we can call TreeSet.first() to get the minimum element.
List<Optional<Long>> timestamps = Arrays.asList(Optional.of(1234L), Optional.of(2345L), Optional.of(1234L));
TreeSet<Optional<Long>> setOfTimestamps = new TreeSet<>(Comparator.comparing(Optional::get));
setOfTimestamps.addAll(timestamps);
Optional<Long> min = setOfTimestamps.first();
System.out.println(min.get());
So I went with the idea of @Amadan and created an array with the timestamps. But after thinking a little bit, I came to the conclusion that it would be possible without getting the variable.
long[] arrayDo = new long[3];
arrayDo[0] = eafedo5.getServiceInformation(eafedo5Count).getTimestamp();
arrayDo[1] = eafedo6.getServiceInformation(eafedo6Count).getTimestamp();
arrayDo[2] = eafedo7.getServiceInformation(eafedo7Count).getTimestamp();
Then I calculate the minValue of the array.
long minTimestamp = Math.min(arrayDo[0],Math.min(arrayDo[1],arrayDo[2]));
Then I ask if the timestamps are equal to minValue
if(!timestamps.contains(minTimestamp)){
    timestamps.add(minTimestamp);
}
if(eafedo5.getServiceInformation(eafedo5Count).getTimestamp() == minTimestamp){
    for(CHostNeighbor n : hostNeighborsEafedo5){
        msgsCountDo += n.getInboundMsgs();
    }
    eafedo5Count--;
}
if(eafedo6.getServiceInformation(eafedo6Count).getTimestamp() == minTimestamp){
    for(CHostNeighbor n : hostNeighborsEafedo6){
        msgsCountDo += n.getInboundMsgs();
    }
    eafedo6Count--;
}
if(eafedo7.getServiceInformation(eafedo7Count).getTimestamp() == minTimestamp){
    for(CHostNeighbor n : hostNeighborsEafedo7){
        msgsCountDo += n.getInboundMsgs();
    }
    eafedo7Count--;
}
msgsDo.add(msgsCountDo);
I mentioned in my question that I need the variable for later purposes. It was because I needed the name of the variable to decrement the count variable of the specific host. (the eafed... are hosts).
Thanks for all the answers!
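Since Java cannot hand back "the variable" itself, one alternative in the spirit of the array approach is to return the index of the minimum and use it to address parallel arrays (for example an array of per-host counters). This is only a sketch; the class MinIndex and the method indexOfMin are hypothetical names, not part of the code above.

```java
class MinIndex {
    // Returns the index of the smallest element.
    // Ties resolve to the first occurrence; an empty array is rejected.
    static int indexOfMin(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        int minIndex = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] < values[minIndex]) {
                minIndex = i;
            }
        }
        return minIndex;
    }
}
```

With the timestamps in a long[] and the counters in an int[] of the same length, counts[MinIndex.indexOfMin(timestamps)]-- would decrement the counter of the host with the smallest timestamp. Note that, unlike the code above, this picks only the first host when several share the minimum.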

Java : Is the get method of an Arraylist cached?

Does the Arraylist object store the last requested value in memory to access it faster the next time? Or do I need to do this myself?
Or more concretely, in terms of performance, is it better to do this :
for (int i = 0; i < myArray.size(); i++) {
    int value = myArray.get(i);
    int result = value + 2 * value - 5 / value;
}
Instead of doing this:
for (int i = 0; i < myArray.size(); i++) {
    // braces are required here: a bare declaration is not allowed as a loop body
    int result = myArray.get(i) + 2 * myArray.get(i) - 5 / myArray.get(i);
}
In terms of performance, it doesn't matter one bit. No, ArrayList doesn't cache anything, although the JITted end result could be a different issue.
If you're wondering which version to use, use the first one. It's clearer.
You can answer your (first) question yourself by looking into the actual source:
public E get(int index) {
    rangeCheck(index);
    return elementData(index);
}
So: no, there is no caching taking place, but you can also see that there is not much of an impact in terms of performance because the get method is essentially just an access to an array.
But it's still good to avoid multiple calls for some reasons:
int result = value + 2 * value - 5 / value is easier to understand (i.e. realizing that you use the same value three times in your calculation)
If you later decide to change the underlying list (e.g. to a LinkedList) you might end up with an impact on performance and then have to change your code to get around it.
As long as you don't synchronize access to the list, repeated calls of get(index) might actually return different values if a call of set(index, value) has taken place between two of them (this can happen even in small source blocks like this one - BTST).
The second point also has a consequence for how to access all values of a list: it leads to the decision to avoid list.get(i) altogether if you're going to iterate over all elements. In that case it's better to use the Iterator or streams.
Your code would then look like this:
Iterator<Integer> it = myArray.iterator();
while (it.hasNext()) {
    int value = it.next();
    int result = value + 2 * value - 5 / value;
}
LinkedList is very slow when accessing elements by index but can iterate quite fast from one element to the next, so the Iterator returned by LinkedList makes use of that, while the Iterator returned by ArrayList simply accesses the internal array (without the need for the repeated range-check calls you can see in the get method above).
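The same iteration can be written with the enhanced for loop, which uses the list's Iterator behind the scenes and so behaves well for both ArrayList and LinkedList. A minimal sketch, assuming a List<Integer> without zero or null elements (the formula divides by the value); the class and method names are hypothetical, and the results are summed only so that the loop produces something observable:

```java
import java.util.List;

class IterateOnce {
    // Applies the formula from the question to every element and sums the results.
    static long sumOfResults(List<Integer> myArray) {
        long sum = 0;
        for (int value : myArray) { // one read per element, via the Iterator
            sum += value + 2 * value - 5 / value;
        }
        return sum;
    }
}
```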

Calculate all permutations of a collection in parallel

I need to calculate all permutations of a collection and I have code for that, but the problem is that it runs sequentially and takes a lot of time.
public static <E> Set<Set<E>> getAllCombinations(Collection<E> inputSet) {
    List<E> input = new ArrayList<>(inputSet);
    Set<Set<E>> ret = new HashSet<>();
    int len = inputSet.size();
    // run over all numbers between 1 and 2^length (one number per subset). each bit represents an object
    // include the object in the set if the corresponding bit is 1
    for (int i = (1 << len) - 1; i > 0; i--) {
        Set<E> comb = new HashSet<>();
        for (int j = 0; j < len; j++) {
            if ((i & 1 << j) != 0) {
                comb.add(input.get(j));
            }
        }
        ret.add(comb);
    }
    return ret;
}
I am trying to make the computation run in parallel.
I thought of writing the logic using recursion and then executing the recursive calls in parallel, but I am not exactly sure how to do that.
Would appreciate any help.
There is no need to use recursion, in fact, that might be counter-productive. Since the creation of each combination can be performed independently of the others, it can be done using parallel Streams. Note that you don’t even need to perform the bit manipulations by hand:
public static <E> Set<Set<E>> getAllCombinations(Collection<E> inputSet) {
    // use inputSet.stream().distinct().collect(Collectors.toList());
    // to get only distinct combinations
    // (in case source contains duplicates, i.e. is not a Set)
    List<E> input = new ArrayList<>(inputSet);
    final int size = input.size();
    // sort out input that is too large. In fact, even lower numbers might
    // be way too large. But using <63 bits allows to use long values
    if(size>=63) throw new OutOfMemoryError("not enough memory for "
        +BigInteger.ONE.shiftLeft(input.size()).subtract(BigInteger.ONE)+" permutations");
    // the actual operation is quite compact when using the Stream API
    return LongStream.range(1, 1L<<size) /* .parallel() */
        .mapToObj(l -> BitSet.valueOf(new long[] {l}).stream()
            .mapToObj(input::get).collect(Collectors.toSet()))
        .collect(Collectors.toSet());
}
The inner stream operation, i.e. iterating over the bits, is too small to benefit from parallel operations, especially as it would have to merge the result into a single Set. But if the number of combinations to produce is sufficiently large, running the outer stream in parallel will already utilize all CPU cores.
The alternative is not to use a parallel stream, but to return the Stream<Set<E>> itself instead of collecting into a Set<Set<E>>, to allow the caller to chain the consuming operation directly.
By the way, hashing an entire Set (or lots of them) can be quite expensive, so the cost of the final merging step(s) is likely to dominate the performance. Returning a List<Set<E>> instead can dramatically increase the performance. The same applies to the alternative of returning a Stream<Set<E>> without collecting the combinations at all, as this also works without hashing the Sets.
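The stream-returning alternative could be sketched as follows. The class name Combinations and the method name combinations are hypothetical, and an IllegalArgumentException is used here for oversized input as an assumption of this sketch rather than the OutOfMemoryError from the code above.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.LongStream;
import java.util.stream.Stream;

class Combinations {
    // Lazily produces every non-empty subset of the input.
    // The caller decides whether to run it in parallel and how to collect it.
    static <E> Stream<Set<E>> combinations(Collection<E> inputSet) {
        List<E> input = new ArrayList<>(inputSet);
        int size = input.size();
        if (size >= 63) {
            throw new IllegalArgumentException("too many elements: " + size);
        }
        // each number in [1, 2^size) is a bit mask selecting the elements of one subset
        return LongStream.range(1, 1L << size)
                .mapToObj(bits -> BitSet.valueOf(new long[] {bits}).stream()
                        .mapToObj(input::get)
                        .collect(Collectors.toSet()));
    }
}
```

A caller could then write combinations(input).parallel().collect(Collectors.toList()) to get a List<Set<E>>, which avoids hashing the outer sets, or chain its own consuming operation instead of collecting.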

Parsing field access flags in java

I have an assignment wherein I have to parse the field access flags of a java .class file.
The specification for a .class file can be found here: Class File Format (page 26 & 27 have the access flags and hex vals).
This is fine, I can do this no worries.
My issue is that there is a large number of combinations.
I know the public, private and protected are mutually exclusive, which reduces the combinations somewhat. Final and transient are also mutually exclusive. The rest however are not.
At the moment, I have a large switch statement to do the comparison. I read in the hex value of the access flag and then increment a counter, depending on if it is public, private or protected. This works fine, but it seems quite messy to just have every combination listed in a switch statement. i.e. public static, public final, public static final, etc.
I thought of doing modulo on the access flag and the appropriate hex value for public, private or protected, but public is 0x0001, so that won't work.
Does anyone else have any ideas as to how I could reduce the amount of cases in my switch statement?
What is the problem? The specification says that it's a bit flag, that means that you should look at a value as a binary number, and that you can test if a specific value is set by doing a bitwise AND.
E.g.
/*
ACC_VOLATILE = 0x0040 = 01000000
ACC_PUBLIC   = 0x0001 = 00000001
Public and volatile   = 01000001
*/
// The parentheses are required: in Java, comparison operators bind tighter than &
publicCount += (flag & ACC_PUBLIC) != 0 ? 1 : 0;
volatileCount += (flag & ACC_VOLATILE) != 0 ? 1 : 0;
If you are trying to avoid a pattern like this one I just stole:
if ((access_flag & ACC_PUBLIC) != 0)
{
    publicCount++;
}
if ((access_flag & ACC_FINAL) != 0)
{
    finalCount++;
}
...
It's a great instinct. I make it a rule never to write code that looks redundant like that. Not only is it error-prone and more code in your class, but copy & paste code is really boring to write.
So the big trick is to make this access "Generic" and easy to understand from the calling class--pull out all the repeated crap and just leave "meat", push the complexity to the generic routine.
So an easy way to call a method would be something like this that gives an array of bitfields that contain many bit combinations that need counted and a list of fields that you are interested in (so that you don't waste time testing fields you don't care about):
int[] counts = sumUpBits(arrayOfFlagBitfields, ACC_PUBLIC | ACC_FINAL | ACC_...);
That's really clean, but then how do you access the return fields? I was originally thinking something like this:
System.out.println("Number of public classes="+counts[findBitPosition(ACC_PUBLIC)]);
System.out.println("Number of final classes="+counts[findBitPosition(ACC_FINAL)]);
Most of the boilerplate here is gone except the need to change the bitfields to their position. I think two changes might make it better--encapsulate it in a class and use a hash to track positions so that you don't have to convert bitPosition all the time (if you prefer not to use the hash, findBitPosition is at the end).
Let's try a full-fledged class. How should this look from the caller's point of view?
BitSummer bitSums=new BitSummer(arrayOfFlagBitfields, ACC_PUBLIC, ACC_FINAL);
System.out.println("Number of public classes="+bitSums.getCount(ACC_PUBLIC));
System.out.println("Number of final classes="+bitSums.getCount(ACC_FINAL));
That's pretty clean and easy--I really love OO! Now you just use the bitSums to store your values until they are needed (It's less boilerplate than storing them in class variables and more clear than using an array or a collection)
So now to code the class. Note that the constructor uses variable arguments now--less surprise/more conventional and makes more sense for the hash implementation.
By the way, I know this seems like it would be slow and inefficient, but it's probably not bad for most uses--if it is, it can be improved, but this should be much shorter and less redundant than the switch statement (which is really the same as this, just unrolled--however this one uses a hash & autoboxing which will incur an additional penalty).
import java.util.HashMap;
public class BitSummer {
    // sums will store the "sum" as <flag, count>
    private final HashMap<Integer, Integer> sums=new HashMap<Integer, Integer>();
    // Constructor does all the work, the rest is just an easy lookup.
    public BitSummer(int[] arrayOfFlagBitfields, int ... positionsToCount) {
        // Loop over each bitfield we want to count
        for(int bitfield : arrayOfFlagBitfields) {
            // and over each flag to check
            for(int flag : positionsToCount) {
                // Test to see if we actually should count this bitfield as having the flag set
                if((bitfield & flag) != 0) {
                    sums.merge(flag, 1, Integer::sum); // Increment the count (starts at 1 if absent)
                }
            }
        }
    }
    // Return the count for a given flag, or 0 if it was never seen
    public int getCount(int bit) {
        return sums.getOrDefault(bit, 0);
    }
}
I didn't test this but I think it's fairly close. I wouldn't use it for processing video packets in realtime or anything, but for most purposes it should be fast enough.
As for maintenance, the code may look "long" compared to the original example, but if you have more than 5 or 6 fields to check, this will actually be a shorter solution than the chained if statements, significantly less error-prone and more maintainable, and also more interesting to write.
If you really feel the need to eliminate the hashtable you could easily replace it with a sparse array with the flag position as the index (for instance the count of a flag 00001000/0x08 would be stored in the fourth array position). This would require a function like this to calculate the bit position for array access (both storing in the array and retrieving)
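private int findBitPosition(int flag) {
    int ret = 0;
    // Count the right shifts needed before the value becomes zero;
    // for a single-bit flag this is its zero-based bit position
    while( ( flag >>>= 1 ) != 0 )
        ret++;
    return ret;
}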
That was fun.
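For completeness, the array-based variant described above might look like the sketch below. It uses the standard Integer.numberOfTrailingZeros instead of a hand-written loop to find the bit position, and it assumes every flag passed in has exactly one bit set. The class name FlagCounter is hypothetical.

```java
class FlagCounter {
    // One slot per bit position of an int
    private final int[] counts = new int[Integer.SIZE];

    FlagCounter(int[] bitfields, int... flags) {
        for (int bitfield : bitfields) {
            for (int flag : flags) {
                if ((bitfield & flag) != 0) {
                    counts[position(flag)]++;
                }
            }
        }
    }

    // Returns how many bitfields had the given flag set
    int getCount(int flag) {
        return counts[position(flag)];
    }

    // Zero-based position of the single set bit; rejects anything else
    private static int position(int flag) {
        if (Integer.bitCount(flag) != 1) {
            throw new IllegalArgumentException("flag must have exactly one bit set: " + flag);
        }
        return Integer.numberOfTrailingZeros(flag);
    }
}
```

Usage would mirror BitSummer: new FlagCounter(arrayOfFlagBitfields, ACC_PUBLIC, ACC_FINAL).getCount(ACC_PUBLIC), with no hashing or autoboxing involved.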
I'm not sure that's what you're looking for, but I would use if-cases with binary AND to check if a flag is set:
if ((access_flag & ACC_PUBLIC) != 0)
{
    // class is public
}
if ((access_flag & ACC_FINAL) != 0)
{
    // class is final
}
....
