Why do we use the number 31 when calculating a hash code? [duplicate]

This question already has answers here:
Why does Java's hashCode() in String use 31 as a multiplier?
(13 answers)
Closed 2 years ago.
I have started learning Collections. When we generate hashCode() using Eclipse, the generated method contains the formula below:
final int prime = 31;
int result = 1;
result = prime * result + ((id == null) ? 0 : id.hashCode());
result = prime * result + ((pin == null) ? 0 : pin.hashCode());
I have searched and found that 31 is used because it is an odd prime, and that multiplying by a prime gives a good distribution of hash codes. But I haven't come across any concrete, layman's explanation of why we use the above formula and why exactly 31 is used. Can someone please elaborate on how exactly multiplying by 31 gives a better distribution of hash codes?

from Joshua Bloch, Effective Java, Chapter 3, Item 9
The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.
A few words on how a multiplication can be replaced by a shift. Multiplying by 2 is a very easy operation in binary arithmetic: you just shift the number one place to the left and append a 0. 4*2 = b100 << 1 = b1000 = 8. If a factor is a power of 2, you shift the binary number by the exponent. 4*8 = 4 * 2^3 = b100 << 3 = b100000 = 32.
The same logic works for dividing: 8/4 = 8/2^2 = b1000 >> 2 = b10 = 2
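The shift-and-subtract identity from the quote and the Eclipse-style accumulation can be checked directly. The following is only a sketch: the class name HashDemo and the helpers combine and times31 are made up for illustration, and the field names id and pin follow the question.

```java
import java.util.Objects;

class HashDemo {
    // Combines two nullable fields the same way the Eclipse-generated method does.
    static int combine(Object id, Object pin) {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((id == null) ? 0 : id.hashCode());
        result = prime * result + ((pin == null) ? 0 : pin.hashCode());
        return result;
    }

    // 31 * i written as a shift and a subtraction; overflow wraps identically on both sides.
    static int times31(int i) {
        return (i << 5) - i;
    }

    public static void main(String[] args) {
        System.out.println(times31(7) == 31 * 7);                                 // true
        System.out.println(times31(Integer.MAX_VALUE) == 31 * Integer.MAX_VALUE); // true
        // The order of the fields matters, so swapped values get different hashes here.
        System.out.println(combine("a", "b") + " vs " + combine("b", "a"));       // 4066 vs 4096
        // Objects.hash applies the same 31-based accumulation to its arguments.
        System.out.println(combine("a", "b") == Objects.hash("a", "b"));          // true
    }
}
```

Because 31 is odd, multiplying by it never shifts the low bit out of the result, which is the information loss the quote warns about for even multipliers.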

Related

Why does Integer.MAX_VALUE*2 return -2? [duplicate]

This question already has answers here:
Multiplication of Integer.MAX_VALUE in Java [duplicate]
(3 answers)
Closed 4 years ago.
I was looking for a value of i where i==i+1 should always be true. Some others used double and float to find a solution to the problem.
I was trying to answer to a question on stack overflow. And thought about integers, trying to see if that could be possible too.
But while manipulating Integer.MAX_VALUE and Integer.MIN_VALUE many times, I found this strange behavior:
Integer.MAX_VALUE + 1 == Integer.MIN_VALUE
But Integer.MAX_VALUE * 2 == -2 and Integer.MAX_VALUE * 4 == -4
This happens only if I use even values like (2,4,6,8,10,12...). When I use odd values except 1, this happens:
Integer.MAX_VALUE * n == Integer.MAX_VALUE - n + 1 where n != Integer.MIN_VALUE and n > Integer.MIN_VALUE
Why is this behaving so? Is there something I am missing?
EDIT: I saw someone flagging the question as duplicate but navigating to the referred link, my question is different.
"If it overflows, it goes back to the minimum value and continues from there. If it underflows, it goes back to the maximum value and continues from there."
I've carefully read answers from there but none of them explained why:
Integer.MAX_VALUE*x should be -x if x is an even number and Integer.MAX_VALUE*y should be Integer.MAX_VALUE-y+1 if y is an odd number.
That's exactly what is confusing me the most.
That's because Java's signed integers follow two's complement representation.
In a two's complement representation, negative numbers are represented as an N-bit complement of their corresponding positive value, i.e. their sum modulo 2^N is 0.
Integer.MAX_VALUE is 2147483647 (hex 0x7FFFFFFF).
When multiplied, it overflows, and what remains is the lowest 32 bits (i.e. the result modulo 2^32):
0x7FFFFFFF * 2 = 0x0FFFFFFFE (mod 2^32 = 0xFFFFFFFE = -2)
0x7FFFFFFF * 3 = 0x17FFFFFFD (mod 2^32 = 0x7FFFFFFD = 2147483645)
0x7FFFFFFF * 4 = 0x1FFFFFFFC (mod 2^32 = 0xFFFFFFFC = -4)
0x7FFFFFFF * 5 = 0x27FFFFFFB (mod 2^32 = 0x7FFFFFFB = 2147483643)
0x7FFFFFFF * 6 = 0x2FFFFFFFA (mod 2^32 = 0xFFFFFFFA = -6)
0x7FFFFFFF * 7 = 0x37FFFFFF9 (mod 2^32 = 0x7FFFFFF9 = 2147483641)
An interesting property of two's complement representation is that the highest bit corresponds to the sign of the value.
Notice how the leading hex digit of the truncated result alternates between F and 7, so bit 31 alternates between 1 and 0. That bit happens to control the sign of the result, hence the alternating sign.
0x7FFFFFFF * 2 is -2 because 0x7FFFFFFF read as a 31-bit two's complement value (the largest width at which the multiplication still overflows the same way) is -1. And -1 * 2 = -2.
You can achieve a similar result if you take Long.MAX_VALUE and cast the result to int:
long x = Long.MAX_VALUE;
for (int i = 2; i < 8; i++) {
System.out.println((int)(x * i));
}
Just prints:
-2
-3
-4
-5
-6
-7
Now bit 31 isn't alternating anymore so we get stable results.
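The even/odd pattern from the question follows from the truncation shown above: Integer.MAX_VALUE * n equals n * 2^31 - n, and n * 2^31 is 0 modulo 2^32 for even n and 2^31 for odd n. A small sketch to check it (the class name MaxValueTimes and the helper expected are illustrative only):

```java
class MaxValueTimes {
    // The wrapped product predicted by the rule in the question:
    // -n for even n, Integer.MAX_VALUE - n + 1 for odd n.
    static int expected(int n) {
        return (n % 2 == 0) ? -n : Integer.MAX_VALUE - n + 1;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 7; n++) {
            int product = Integer.MAX_VALUE * n;        // int arithmetic wraps modulo 2^32
            long exact = (long) Integer.MAX_VALUE * n;  // 64-bit arithmetic does not overflow here
            // (int) exact keeps the low 32 bits, which is the same value as product.
            System.out.println(n + ": " + product + " " + (int) exact + " " + expected(n));
        }
    }
}
```

All three columns agree for each n, alternating between a negative value for even n and a large positive value for odd n.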
2147483647 is the max value. Now try executing the instructions below:
int i = 2147483647;
i++;
System.out.println(i); // prints -2147483648 because the 32-bit limit is exceeded
binary equivalent of 2147483647 is 01111111111111111111111111111111
binary equivalent of 1 is 00000000000000000000000000000001
binary addition is 10000000000000000000000000000000
NOTE: The leftmost bit stands for the sign, int being a 32-bit signed integer
thus, the number becomes -2147483648
i++;
System.out.println(i); // prints -2147483647
Along the same lines, 2147483647 * 2 is actually 2147483647 + 2147483647
int j = 2147483647;
j += 2147483647;
System.out.println(j); // prints -2
and thus your answer.
now when you do 2147483647 * 3,
it is:
2147483647 + 2147483647 + 2147483647 = -2 + 2147483647 = 2147483645

Efficient Algorithm for Finding if a Very Big Number is Divisible by 7

So this was a question on one of the challenges I came across in an online competition, a few days ago.
Question:
Accept two inputs.
A big number of N digits,
The number of questions Q to be asked.
In each of the question, you have to find if the number formed by the string between indices Li and Ri is divisible by 7 or not.
Input:
First line contains the number consisting on N digits. Next line contains Q, denoting the number of questions. Each of the next Q lines contains 2 integers Li and Ri.
Output:
For each question, print "YES" or "NO", if the number formed by the string between indices Li and Ri is divisible by 7.
Constraints:
1 ≤ N ≤ 10^5
1 ≤ Q ≤ 10^5
1 ≤ Li, Ri ≤ N
Sample Input:
357753
3
1 2
2 3
4 4
Sample Output:
YES
NO
YES
Explanation:
For the first query, number will be 35 which is clearly divisible by 7.
Time Limit: 1.0 sec for each input file.
Memory Limit: 256 MB
Source Limit: 1024 KB
My Approach:
Now according to the constraints, the maximum length of the number, i.e. N, can be up to 10^5. A number this big cannot fit into a primitive numeric type, and I am pretty sure that's not the efficient way to go about it.
First Try:
I thought of this algorithm to apply the generic rules of division to each individual digit of the number. This would work to check divisibility amongst any two numbers, in linear time, i.e. O(N).
static String isDivisibleBy(String theIndexedNumber, int divisiblityNo){
int moduloValue = 0;
for(int i = 0; i < theIndexedNumber.length(); i++){
moduloValue = moduloValue * 10;
moduloValue += Character.getNumericValue(theIndexedNumber.charAt(i));
moduloValue %= divisiblityNo;
}
if(moduloValue == 0){
return "YES";
} else{
return "NO";
}
}
But in this case, the algorithm also has to loop through all Q questions, and Q can also be up to 10^5.
Therefore, the time taken to solve the problem becomes O(Q.N), which is effectively quadratic time. Hence, this exceeded the given time limit and was not efficient.
Second Try:
After that didn't work, I tried searching for a divisibility rule of 7. All the ones I found, involved calculations based on each individual digit of the number. Hence, that would again result in a Linear time algorithm. And hence, combined with the number of Questions, it would amount to Quadratic Time, i.e. O(Q.N)
I did find one algorithm named Pohlman–Mass method of divisibility by 7, which suggested
Using quick alternating additions and subtractions: 42,341,530
-> 530 − 341 = 189 + 42 = 231 -> 23 − (1×2) = 21 YES
But all that did was, make the time 1/3rd Q.N, which didn't help much.
Am I missing something here? Can anyone help me find a way to solve this efficiently?
Also, is there a chance this is a Dynamic Programming problem?
There are two ways to go through this problem.
1: Dynamic Programming Approach
Let the input be array of digits A[N].
Let N[L,R] be number formed by digits L to R.
Let another array be M[N] where M[i] = N[1,i] mod 7.
So M[i+1] = ((M[i] * 10) mod 7 + A[i+1] mod 7) mod 7
Pre-calculate array M.
Now consider the expression.
N[1,R] = N[1,L-1] * 10^(R-L+1) + N[L,R]
implies (N[1,R] mod 7) = ((N[1,L-1] mod 7) * (10^(R-L+1) mod 7) + (N[L,R] mod 7)) mod 7
implies N[L,R] mod 7 = (M[R] - M[L-1] * (10^(R-L+1) mod 7)) mod 7
N[L,R] mod 7 gives your answer and can be calculated in O(1), as all values on the right-hand side of the expression are already available.
For 10^(R-L+1) mod 7, you can pre-calculate modulo 7 for all powers of 10.
Time Complexity :
Precalculation O(N)
Overall O(Q) + O(N)
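The precalculation and the O(1) query could be sketched in Java as follows. The class name DivisibleBySeven and the arrays prefixMod and pow10Mod are illustrative names (prefixMod plays the role of M), and the sketch assumes 1-based inclusive indices with L ≤ R.

```java
class DivisibleBySeven {
    private final int[] prefixMod; // prefixMod[i] = (number formed by the first i digits) mod 7
    private final int[] pow10Mod;  // pow10Mod[k] = 10^k mod 7

    DivisibleBySeven(String digits) {
        int n = digits.length();
        prefixMod = new int[n + 1];
        pow10Mod = new int[n + 1];
        pow10Mod[0] = 1;
        for (int i = 0; i < n; i++) {
            prefixMod[i + 1] = (prefixMod[i] * 10 + (digits.charAt(i) - '0')) % 7;
            pow10Mod[i + 1] = (pow10Mod[i] * 10) % 7;
        }
    }

    // Remainder mod 7 of the number formed by digits l..r (1-based, inclusive).
    int modOf(int l, int r) {
        int raw = prefixMod[r] - prefixMod[l - 1] * pow10Mod[r - l + 1];
        return ((raw % 7) + 7) % 7; // Java's % can be negative, so normalise to 0..6
    }

    boolean isDivisible(int l, int r) {
        return modOf(l, r) == 0;
    }

    public static void main(String[] args) {
        DivisibleBySeven d = new DivisibleBySeven("357753");
        System.out.println(d.isDivisible(1, 2) ? "YES" : "NO"); // 35 -> YES
        System.out.println(d.isDivisible(2, 3) ? "YES" : "NO"); // 57 -> NO
        System.out.println(d.isDivisible(4, 4) ? "YES" : "NO"); // 7  -> YES
    }
}
```

The constructor does the O(N) work once; each query afterwards is a constant number of array reads and arithmetic operations.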
2: Divide and Conquer Approach
It's a segment tree solution.
On each tree node you store the mod 7 for the number formed by digits in that node.
And the expression given in first approach can be used to find the mod 7 of parent by combining the mod 7 values of two children.
The time complexity of this solution will be O(Q log N) + O(N log N)
Basically you want to be able to calculate the mod 7 of any run of digits, given the mod of the prefix of the number at any point.
What you can do is:
record the modulo at each point: O(N) for time and space. Uses up to 100 KB of memory.
take the modulo at the two points and determine how much the digits before the start contribute once they are shifted by the length of the range. The table of shifts costs O(N) time and space (once, not per query)
e.g. between 2 and 3 inclusive
357 % 7 = 0
3 % 7 = 3 and 300 % 7 = 6 (the distance between the start and end)
and 0 != 6 so the number is not a multiple of 7.
between 4 and 4 inclusive
3577 % 7 == 0
357 % 7 = 0 and 0 * 10 % 7 = 0
as 0 == 0 it is a multiple of 7.
You first build a list of prefix values modulo 7, starting with 0 at offset 0 (in your case, 0%7, 3%7, 35%7, 357%7...). Then for each query (a,b) grab digits[a-1] and digits[b], multiply digits[a-1] by the entry of the 1-3-2-6-4-5 sequence of 10^X modulo 7 selected by (1+b-a)%6, reduce the product modulo 7, and compare it with digits[b]. If these are equal, return YES, otherwise return NO. Pseudocode:
readString(big);
Array a=[0]; // initial value
Array tens=[1,3,2,6,4,5]; // quick multiplier lookup table
int d=0;
int l=big.length;
for (int i=0;i<l;i++) {
int c=((int)big[i])-48; // '0' -> 0, and "big" has characters
d=(3*d+c)%7;
a.push(d); // add to tail
}
readInt(q);
for (i=0;i<q;i++) {
readInt(li);
readInt(ri); // get question
int left=(a[li-1]*tens[(1+ri-li)%6])%7;
if (left==a[ri]) print("YES"); else print("NO");
}
A test example:
247761901
1
5 9
61901 % 7=0. Calculating:
a = [0 2 3 2 6 3 3 4 5 2]
li = 5
ri = 9
left=(a[5-1]*tens[(1+9-5)%6])%7 = (6*5)%7 = 30%7 = 2
a[ri]=2
Answer: YES

How Java processes integer overflow [duplicate]

This question already has answers here:
Why do these two multiplication operations give different results?
(2 answers)
Closed 9 years ago.
The max value of a signed int is 2,147,483,647, i.e. 2^31 - 1, since 1 bit is the sign bit. So
when I run long a = 2147483647 + 1;
it gives a = -2,147,483,648 as the answer. This holds good.
But 24*60*60*1000*1000 = 86400000000 (actually)...
In Java, 24*60*60*1000*1000 equals 500654080.
I understand that it is because of integer overflow, but what processing produced this value? What logic does Java use to get that number? I also referred here.
Multiplication is executed from left to right like this
int x = 24 * 60;
x = x * 60;
x = x * 1000;
x = x * 1000;
The first 3 operations produce 86400000, which is still below Integer.MAX_VALUE. But the last operation produces 86400000000, which is 0x141dd76000 in hex. The bits above the low 4 bytes (32 bits) are truncated and we get 0x1dd76000. If we print it
System.out.println(0x1dd76000);
the result will be
500654080
This is quite subtle: when writing long a = 2147483647 + 1, the right hand side is computed first using ints since you have supplied int literals. But that will clock round to a negative (due to overflow) before being converted to a long. So the promotion from int to long is too late for you.
To circumvent this behaviour, you need to promote at least one of the arguments to a long literal by suffixing an L.
This applies to all arithmetic operations using literals (i.e. also your multiplication): you need to promote one of them to a long type.
The fact that your multiplication answer is 500654080 can be seen by looking at
long n = 24L*60*60*1000*1000;
long m = n % 4294967296L; /* % is extracting the int part so m is 500654080
n.b. 4294967296L is 2^32 (using OP notation, not XOR). */
What's happening here is that you are going 'round and round the clock' with the int type. Yes, you are losing the carry bits but that doesn't matter with multiplication.
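To see the int and long versions side by side, a minimal sketch follows. The class name MicrosPerDay and its two helper methods are invented for illustration; Math.multiplyExact is a standard library method (Java 8 and later) that throws instead of wrapping, shown here only as one way to detect the problem.

```java
class MicrosPerDay {
    // All operands are int literals, so the product wraps modulo 2^32.
    static int wrapped() {
        return 24 * 60 * 60 * 1000 * 1000;
    }

    // The L suffix makes the first operand a long, so every step is done in 64 bits.
    static long exact() {
        return 24L * 60 * 60 * 1000 * 1000;
    }

    public static void main(String[] args) {
        System.out.println(wrapped());                  // 500654080
        System.out.println(exact());                    // 86400000000
        System.out.println((int) exact());              // 500654080, the low 32 bits
        System.out.println(Long.toHexString(exact()));  // 141dd76000
        try {
            Math.multiplyExact(24 * 60 * 60 * 1000, 1000); // int overload: throws on overflow
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```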
The range of int is -2,147,483,648 to 2,147,483,647.
So, when you keep on adding numbers and the sum exceeds the maximum limit, it starts again from the lowest number, i.e. -2,147,483,648, as it works as a cycle. You had already mentioned that in your question.
Similarly, when you compute 24*60*60*1000*1000, the result should be 86400000000 mathematically.
But what actually happens is roughly as follows:
86400000000 can be written as 4294967296 + 4294967296 + ... (20 times) + 500654080
Each full cycle of 4294967296 (2^32) wraps back around to 0, so after 20 cycles only 500654080 is left, which is the final result.
I hope it's clear to you.
Add an L to your multiplication. With the L, the multiplication is done in the long range; otherwise it is done in the int range, which overflows. Try multiplying like this:
24L*60*60*1000*1000
This gives you the right answer.
An int is 32 bits long. Let's take for example a number that is 4 bits long for the sake of simplicity.
Its max positive value would be:
0111 = 7 (first bit is for sign; 0 means positive, 1 means negative)
0000 = 0
Its min negative value would be:
1000 = -8 (first bit is for sign)
1111 = -1
Now, if we call this type fbit, fbit_max is equal to 7.
fbit_max + 1 = -8
because bitwise 0111 + 1 = 1000
Therefore, the span of fbit_min to fbit_max is 16. From -8 to 7.
If you would multiply something like 7*10 and store it in fbit, the result would be:
fbit number = 7 * 10 (actually 70)
fbit number = 7 (to get to from zero to max) + 16 (min to max) + 16 (min to max) + 16 (min to max) + 15 (the rest)
fbit number = 6
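The 4-bit type can be simulated with an ordinary int to check these numbers. This is only a sketch; FourBit and wrap are made-up names, and Java itself has no such type.

```java
class FourBit {
    // Wraps any int into the signed 4-bit range -8..7 by keeping only the low four bits.
    static int wrap(int value) {
        return ((value + 8) & 0xF) - 8;
    }

    public static void main(String[] args) {
        System.out.println(wrap(7 + 1));   // -8: 0111 + 1 = 1000
        System.out.println(wrap(7 * 10));  // 6: 70 = 4 * 16 + 6
        System.out.println(wrap(-8 - 1));  // 7: underflow wraps the other way
    }
}
```

The same reasoning applies to int with 2^32 in place of 16.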
24*60*60*1000*1000 = 86400000000
Using MOD as follows: 86400000000 % 4294967296 = 500654080 (the modulus is 2^32; a remainder of 2^31 or more would show up as a negative int)

Check if a number is present in a sequence

I am writing a program for a problem I found on a coding competition website. I have more or less figured out how to solve the problem, but I am stuck on a math part of it, so I am stripping the problem down and showing only what I need.
First, I need to check if a number is part of a sequence. My sequence is 2*a+1, where a is the previous element in the sequence, or 2^n-1 to get the nth item in the sequence. So it is 1,3,7,15,31,63...
I don't really want to create the whole sequence and check if a number is present, but I am not sure what a quicker method would be.
Second, if I am given a number, let's say 25, I want to figure out the next highest number in my sequence relative to this number. So for 25 it would be 31, for 47 it would be 63, and for 8 it would be 15.
How can I do these things without creating the whole sequence?
I have seen similar questions here with different sequences but I am still not sure how to solve this
Start by finding the explicit formula for any term in your sequence. I'm too lazy to write out a proof, so just add 1 to each term in your sequence:
1 + 1 = 2
3 + 1 = 4
7 + 1 = 8
15 + 1 = 16
31 + 1 = 32
63 + 1 = 64
...
You can clearly see that a_n = 2^n - 1.
To check if a particular number is in your sequence, assume that it is:
x = 2^n - 1
x + 1 = 2^n
From Wikipedia:
The binary representation of integers makes it possible to apply a
very fast test to determine whether a given positive integer x is a
power of two:
positive x is a power of two ⇔ (x & (x − 1)) equals to zero.
So to check, just do:
bool in_sequence(int n) {
return ((n + 1) & n) == 0;
}
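In Java the same test might look like the sketch below. The class name SequenceCheck is arbitrary, and the n > 0 guard is an addition on my part: without it, 0 and -1 would also pass the bit test even though they are not in the sequence 1, 3, 7, 15, ...

```java
class SequenceCheck {
    // True when n has the form 2^k - 1 with k >= 1, i.e. n + 1 is a power of two.
    static boolean inSequence(int n) {
        return n > 0 && ((n + 1) & n) == 0;
    }

    public static void main(String[] args) {
        System.out.println(inSequence(31)); // true
        System.out.println(inSequence(25)); // false
    }
}
```

For n = Integer.MAX_VALUE the addition n + 1 overflows to Integer.MIN_VALUE, but the AND with n is still 0, so 2^31 - 1 is correctly reported as a member.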
As @Blender already pointed out, your sequence is essentially 2^n - 1, so you can use this trick if you use an integer format to store it:
boolean inSequence(int value) {
for (int i = 0x7FFFFFFF; i != 0; i >>>= 1) {
if (value == i) {
return true;
}
}
return false;
}
Note that for every element in your sequence, its binary representation is a run of 0s followed by a run of 1s.
For example, 7 in binary is 00000000000000000000000000000111 and 63 in binary is 00000000000000000000000000111111.
This solution starts from 01111111111111111111111111111111 and uses an unsigned right shift, comparing each shifted value with yours.
Nice and simple.
How to find the next higher number :
For example, we get 19 ( 10011 ) , should return 31 (11111)
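int findNext(int n){
if(n == 0) return 1;
int ret = 2; // start from binary 10
while( (n >>= 1) > 0){ // shift n right until nothing is left; for 19 ret ends as 100000
ret <<= 1; // one more binary digit for every digit of n
}
return ret-1;
}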
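The same result can be obtained without a loop by using Integer.highestOneBit from the standard library. A sketch under the assumption that "next" means the smallest sequence element greater than or equal to n (so 31 maps to 31); the class and method names are illustrative.

```java
class NextInSequence {
    // Smallest value of the form 2^k - 1 (k >= 1) that is >= n.
    // Anything below 1 maps to the first element, 1.
    static int next(int n) {
        if (n <= 0) {
            return 1;
        }
        // highestOneBit(19) is 16 (10000); doubling gives 100000, minus 1 gives 11111.
        return (Integer.highestOneBit(n) << 1) - 1;
    }

    public static void main(String[] args) {
        System.out.println(next(19)); // 31
        System.out.println(next(25)); // 31
        System.out.println(next(47)); // 63
        System.out.println(next(8));  // 15
    }
}
```

For n of 2^30 or more the shift overflows to Integer.MIN_VALUE, and subtracting 1 wraps to Integer.MAX_VALUE, which is 2^31 - 1 and therefore still the right answer.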

Modulo gives unexpected result

I have a problem with numerator, denominator and modulo. 7 / 3 = 2.3333333333 gives me a modulo of 1!? Something must be wrong? I am taking a basic, non-object-oriented course, so my code is simple and I have simplified the code below.
Calling the method:
// Calls the method and presents the calculation of a fraction from numerator and denominator
int numerator = 7;
int denumerator = 3;
System.out.println("Fraction calculation with numerator " + numerator + " and denominator " + denumerator + " gives " + fraction(numerator,denumerator));
And the method:
// Method for calculating a fraction from numerator and denominator
public static String fraction(int numerator, int denumerator) {
// Calculation
int resultat1 = numerator / denumerator;
int resultat2 = numerator % denumerator;
return Integer.toString(resultat1) + " remainder " + Integer.toString(resultat2);
}
3 goes into 7 twice with 1 left over. The answer is supposed to be 1. That's what modulo means.
7 modulo 3 gives 1. Since 7 = 2*3 + 1.
7 % 3 = 1
Just as expected. If you want the .3333, you could take the modulo and divide it by your denominator as a floating-point division to get 1 / 3.0 = 0.3333
Or do (7.0 / 3.0) % 1 = 0.3333
Ehm 7 % 3 = 1
What would you expect?
Given two positive numbers, a (the dividend) and n (the divisor), a modulo n (abbreviated as a mod n) can be thought of as the remainder, on division of a by n. For instance, the expression "5 mod 4" would evaluate to 1 because 5 divided by 4 leaves a remainder of 1, while "9 mod 3" would evaluate to 0 because the division of 9 by 3 leaves a remainder of 0; there is nothing to subtract from 9 after multiplying 3 times 3. (Notice that doing the division with a calculator won't show you the result referred to here by this operation, the quotient will be expressed as a decimal.) When either a or n is negative, this naive definition breaks down and programming languages differ in how these values are defined. Although typically performed with a and n both being integers, many computing systems allow other types of numeric operands.
More info : http://en.wikipedia.org/wiki/Modulo_operation
You didn't ask a question!
And if your question is just:
"...gives me a modulo of 1!? Must be some wrong?"
No, nothing is wrong: 7/3 = 2, with a remainder of 1, since (3 * 2) + 1 = 7.
You are using integer operands so you get an integer result. That's how the language works.
A modulo operator gives you the remainder of a division. Therefore, it is normal that you get the number 1 as a result.
Also, note that you are using integers... 7/3 != 2.3333333333.
One last thing, be careful with that code. A division by zero would make your program crash. ;)
% for ints does not give the decimal fraction but the remainder from the division. Here it is measured from 6, which is the highest multiple of 3 not exceeding your number 7. 7-6 is 1.
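The points made in these answers can be tried out in a few lines. This is a sketch with an invented class name and helper (RemainderDemo, describe); Math.floorMod is a standard library method (Java 8 and later) and is included only to illustrate the remark that negative operands are treated differently across definitions.

```java
class RemainderDemo {
    // Writes a as q * n + r, using Java's integer division and remainder.
    static String describe(int a, int n) {
        int q = a / n; // integer division drops the fractional part
        int r = a % n; // remainder left over after q * n
        return a + " = " + q + " * " + n + " + " + r;
    }

    public static void main(String[] args) {
        System.out.println(describe(7, 3));        // 7 = 2 * 3 + 1
        System.out.println(7 / 3.0);               // floating-point division keeps the fraction
        System.out.println(-7 % 3);                // -1: in Java the sign follows the dividend
        System.out.println(Math.floorMod(-7, 3));  // 2: the sign follows the divisor
        try {
            System.out.println(7 / 0);             // integer division by zero
        } catch (ArithmeticException e) {
            System.out.println("division by zero");
        }
    }
}
```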
