Luhn checksum validation in Java


I have to replicate the Luhn algorithm in Java. The problem I face is how to implement it in an efficient and elegant way (not a requirement, but that is what I want).
The Luhn algorithm works like this:
You take a number, let's say 56789.
Loop over the following steps until there are no digits left:
You pick the left-most digit and add it to the total sum. sum = 5
You discard this digit and go to the next. number = 6789
You double this digit. If the result has more than one digit, you split it and add its digits separately to the sum. 2*6 = 12, so sum = 5 + 1 = 6 and then sum = 6 + 2 = 8.
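The steps above can be sketched as a small accumulator that is fed one digit at a time, left to right, so the length of the number is never needed up front. The names (`LuhnAccumulator`, `accept`, `sum`) are made up for illustration, and how the digits arrive is deliberately left open.

```java
// Illustrative sketch; the class and method names are hypothetical.
class LuhnAccumulator {
    private int sum = 0;
    private boolean doubleNext = false; // the left-most digit is added unchanged

    /** Consumes the next digit (0-9), reading left to right. */
    void accept(int digit) {
        if (doubleNext) {
            int doubled = digit * 2;
            // A two-digit result is split into its digits: 12 -> 1 + 2.
            sum += doubled / 10 + doubled % 10;
        } else {
            sum += digit;
        }
        doubleNext = !doubleNext;
    }

    int sum() {
        return sum;
    }

    public static void main(String[] args) {
        LuhnAccumulator acc = new LuhnAccumulator();
        for (int d : new int[] {5, 6, 7, 8, 9}) {
            acc.accept(d);
        }
        System.out.println(acc.sum()); // 5 + (1+2) + 7 + (1+6) + 9
    }
}
```

Keeping the state in two fields means each digit is fully processed before the next one is read, which is the restriction described below.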
Additional restrictions
For this particular problem I was required to read all digits one at a time and do computations on each of them separately before moving on. I also assume that all numbers are positive.
The problems I face and the questions I have
As said before, I am trying to solve this in an elegant and efficient way. That's why I don't want to invoke the toString() method on the number to access the individual digits, which requires a lot of converting. I also can't use the modulo approach because of the restriction above: once I read a digit I should do the computations on it right away. I could only use modulo if I knew the length of the number in advance, but that feels like I first have to count all digits one by one, which goes against the restriction. Right now I can think of only one way to do this, but it would also require a lot of computations and only ever yields the first digit*:
int firstDigit(int x) {
while (x > 9) {
x /= 10;
}
return x;
}
Found here: https://stackoverflow.com/a/2968068/3972558
*However, when I think about it, this is basically a different and weird way of using the length of a number: dividing it repeatedly until only one digit is left.
So basically I am stuck now. I think I must use the length of the number, which it does not really have as a property, so I have to find it by hand. Is there a good way to do this? Now I am thinking that I should use modulo in combination with the length of the number.
That way I know whether the total number of digits is odd or even, and then I can do the computations from right to left. Just for fun, I think I could use this for efficiency to get the length of a number: https://stackoverflow.com/a/1308407/3972558
This question appeared in the book Think Like a Programmer.

You can optimise it by unrolling the loop once (or as many times as you like). This will be close to twice as fast for large numbers, but it makes small numbers slower. If you have an idea of the typical range of your numbers, you can determine how much to unroll the loop.
int firstDigit(int x) {
while (x > 99)
x /= 100;
if (x > 9)
x /= 10;
return x;
}

Use org.apache.commons.validator.routines.checkdigit.LuhnCheckDigit: call LuhnCheckDigit.LUHN_CHECK_DIGIT.isValid(code), where code is the number as a String.
Maven Dependency:
<dependency>
<groupId>commons-validator</groupId>
<artifactId>commons-validator</artifactId>
<version>1.4.0</version>
</dependency>

Normally you would process the number from right to left, using division by 10 to shift the digits and modulo 10 to extract the last one. You can still use this technique when processing the number from left to right. Just divide by 1000000000 to extract the first digit and multiply by 10 to shift the number left:
0000056789
0000567890
0005678900
0056789000
0567890000
5678900000
6789000000
7890000000
8900000000
9000000000
Some of those numbers exceed maximum value of int. If you have to support full range of input, you will have to store the number as long:
static int checksum(int x) {
long n = x;
int sum = 0;
while (n != 0) {
long d = 1000000000L;
int digit = (int) (n / d);
n %= d;
n *= 10L;
// add digit to sum
}
return sum;
}
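One way to fill in the "add digit to sum" step, as a sketch under two assumptions that are not stated in the answer above: leading zeros of the ten-digit frame are skipped so that they do not shift the doubling pattern, and every second digit counted from the left is doubled, as in the question. `LeftToRightLuhn` is a hypothetical name.

```java
// Illustrative sketch; assumes x >= 0.
class LeftToRightLuhn {
    static int checksum(int x) {
        final long d = 1_000_000_000L; // weight of the left-most of 10 decimal places
        long n = x;
        int sum = 0;
        boolean started = false;       // becomes true at the first non-zero digit
        boolean doubleIt = false;      // the first real digit is added unchanged
        while (n != 0) {
            int digit = (int) (n / d);
            n = (n % d) * 10;          // drop the extracted digit and shift left
            if (!started && digit == 0) {
                continue;              // leading zero of the frame, not a real digit
            }
            started = true;
            int v = doubleIt ? digit * 2 : digit;
            sum += v / 10 + v % 10;    // adds v itself when v < 10
            doubleIt = !doubleIt;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(checksum(56789));
    }
}
```

The loop stops as soon as only zeros remain. That drops trailing zeros, which is harmless here because a zero contributes nothing to the sum whether it is doubled or not.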

As I understand it, you will eventually need to read every digit, so what is wrong with converting the number to a String (and therefore a char[]) first? Then you can easily implement the algorithm by iterating over that char array.
The JDK implementation of Integer.toString is rather optimized, so to beat it you would need to implement your own optimizations. For example, it uses lookup tables for the conversion, converts two chars at once, etc.
final static int [] sizeTable = { 9, 99, 999, 9999, 99999, 999999, 9999999,
99999999, 999999999, Integer.MAX_VALUE };
// Requires positive x
static int stringSize(int x) {
for (int i=0; ; i++)
if (x <= sizeTable[i])
return i+1;
}
This was just an example, but feel free to check the complete implementation :)
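Combining this length lookup with the modulo approach from the question gives a right-to-left pass: the digit count decides whether the right-most digit falls on a doubled position. A rough sketch, assuming non-negative input and the question's rule that every second digit counted from the left is doubled. `ParityLuhn` and `checksum` are hypothetical names.

```java
// Illustrative sketch; requires x >= 0.
class ParityLuhn {
    static final int[] SIZE_TABLE = {9, 99, 999, 9999, 99999, 999999, 9999999,
            99999999, 999999999, Integer.MAX_VALUE};

    // Number of decimal digits of x, same idea as the JDK helper shown above.
    static int stringSize(int x) {
        for (int i = 0; ; i++) {
            if (x <= SIZE_TABLE[i]) {
                return i + 1;
            }
        }
    }

    static int checksum(int x) {
        int len = stringSize(x);
        // Counting from the left (0-based), positions 1, 3, 5, ... are doubled.
        // The right-most digit sits at position len - 1.
        boolean doubleIt = (len - 1) % 2 == 1;
        int sum = 0;
        for (int i = 0; i < len; i++) {
            int digit = x % 10;
            x /= 10;
            int v = doubleIt ? digit * 2 : digit;
            sum += v / 10 + v % 10; // adds v itself when v < 10
            doubleIt = !doubleIt;
        }
        return sum;
    }
}
```

This does look at the length first, so it only fits if the "one digit at a time" restriction is read loosely.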

I would first convert the number to a kind of BCD (binary coded decimal). I'm not sure I could find a better optimisation than the JDK's Integer.toString() conversion, but as you said, you did not want to use it:
List<Byte> bcd(int i) {
List<Byte> l = new ArrayList<Byte>(10); // max size for an integer to avoid reallocations
if (i == 0) {
l.add((byte) i);
}
else {
while (i != 0) {
l.add((byte) (i % 10));
i = i / 10;
}
}
return l;
}
It is more or less what you proposed for getting the first digit, but now you have all your digits in one single pass (least significant digit first) and can use them for your algorithm.
I proposed byte because it is large enough, but as Java always converts to int for computations, it might be more efficient to use a List<Integer> directly, even if it wastes memory.

Related

Sum a series n^n for values 1 through n with no overflow? Only last digits of answer needed

I want to write a Java program that sums all the integers n^n from 1 through n. I only need the last 10 digits of this number, but the values given for n exceed 800.
I have already written a basic java program to calculate this, and it works fine for n < 16. But it obviously doesn't deal with such large numbers. I am wondering if there is a way to just gather the last 10 digits of a number that would normally overflow a long, and if so, what that method or technique might be.
I have no code to show, just because the code I wrote already is exactly what you'd expect. A for loop that runs i*i while i<=n and a counter that sums each iteration with the one before. It works. I just don't know how to approach the problem for bigger numbers, and need guidance.
Around n=16, the number overflows a long, and returns negative values. Will BigInteger help with this, or is that still too small a data type? Or could someone point me towards a technique for gathering the last 10 digits of a massive number? I could store it in an array and then sum them up if I could just get that far.
Anyhow, I don't expect a finished piece of code, but maybe some suggestions as to how I could look at this problem anew? Some techniques my n00b self is missing?
Thank you!

sums all the integers n^n from 1 through n. I only need the last 10 digits of this number
If you only need last 10 digits, that means you need sum % 10¹⁰.
The sum is 1¹ + 2² + 3³ + ... nⁿ.
According to the rules of modular arithmetic:
(a + b) % n = [(a % n) + (b % n)] % n
So you need to calculate iⁱ % 10¹⁰, for i=1 to n, sum them, and perform a last modulus on that sum.
According to the modular exponentiation article on Wikipedia, there are efficient ways to calculate aⁱ % m on a computer. You should read the article.
However, as the article also says:
Java's java.math.BigInteger class has a modPow() method to perform modular exponentiation
Combining all that to an efficient implementation in Java that doesn't use excessive amounts of memory:
static BigInteger calc(int n) {
final BigInteger m = BigInteger.valueOf(10_000_000_000L);
BigInteger sum = BigInteger.ZERO;
for (int i = 1; i <= n; i++) {
BigInteger bi = BigInteger.valueOf(i);
sum = sum.add(bi.modPow(bi, m));
}
return sum.mod(m);
}
Or the same using streams:
static BigInteger calc(int n) {
final BigInteger m = BigInteger.valueOf(10).pow(10);
return IntStream.rangeClosed(1, n).mapToObj(BigInteger::valueOf).map(i -> i.modPow(i, m))
.reduce(BigInteger.ZERO, BigInteger::add).mod(m);
}
Test
System.out.println(calc(800)); // prints: 2831493860
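If BigInteger is not wanted, the same sum also fits in plain long arithmetic, provided the modular multiplication is split so that intermediate products stay below 2^63. The following is only a sketch and not part of the answer above. `mulMod`, `powMod` and the class name are hypothetical helpers, and the split into 5-digit halves is specific to the modulus 10^10.

```java
// Illustrative sketch: last 10 digits of 1^1 + 2^2 + ... + n^n using only long.
class PowerSumLastDigits {
    static final long M = 10_000_000_000L; // 10^10

    // (a * b) % M for a, b in [0, M), without overflowing long.
    static long mulMod(long a, long b) {
        long bh = b / 100_000;   // upper 5 decimal digits of b
        long bl = b % 100_000;   // lower 5 decimal digits of b
        long high = (a * bh) % M;              // a * bh < 10^15
        return (high * 100_000 + a * bl) % M;  // both terms < 10^15
    }

    // Square-and-multiply modular exponentiation.
    static long powMod(long base, long exp) {
        long result = 1;
        base %= M;
        while (exp > 0) {
            if ((exp & 1) == 1) {
                result = mulMod(result, base);
            }
            base = mulMod(base, base);
            exp >>= 1;
        }
        return result;
    }

    static long calc(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum = (sum + powMod(i, i)) % M;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(calc(10)); // 1^1 + ... + 10^10 = 10405071317, so 405071317
    }
}
```

BigInteger.modPow remains the simpler choice; this variant only shows that the reduction rule keeps every intermediate value small.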

BigInteger would be suitable to work with these kinds of numbers. It's quite frankly what it's designed for.
Do note that instances of BigInteger are immutable and any operations you do on one will give you back a new BigInteger instance. You're going to want to store some of your results in variables.

How do I make this have a space complexity of O(1) instead of O(n)?

I'm trying to convert a decimal into binary number using iterative process. How can I make this have a space complexity of O(1) instead of O(n)?
int i = 0;
int j;
int bin[] = new int[n]; // n here is my parameter int n
while(n > 0) {
bin[i] = n % 2;
n /= 2;
i++;
}
//I'm reversing the order of index i with variable j to get right order (e.g. 26 has 11010, instead of 01011)
for(j = i -1; j >= 0; j--) {
System.out.print(bin[j]);
}

First, you don't need room for n bits if the value itself is n. You just need floor(log2(n)) + 1. Using n bits won't give you wrong results, but for big values of n the memory available to your Java process might not be enough.
And about O(1)... maybe not really what you were thinking of, but:
Java's int has a fixed value range, which guarantees that a (positive) int value needs at most 31 bits (if you have negative numbers too, the sign must be stored somewhere, which is bit 32).
With that information, strictly speaking, you can get O(1) just by rewriting your loops so that they loop exactly 31 times. Then your code takes exactly the same number of steps for each value of n, and that is O(1) by definition.
Going the bit-fiddling route won't help here. There are some useful shortcuts if your values fulfil certain conditions, but if you want your code to work with any int value, the normal loop as you have it here is likely the best you can get.
(Of course, CPU intrinsics may help, but not for Java...)
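One possible reading of the fixed-31-iterations idea, as a sketch: walk the bits from the most significant one downwards, so nothing has to be stored and reversed. It assumes n >= 0, and the `StringBuilder` parameter merely stands in for System.out so the result can be checked. `BinaryPrinter` and `writeBinary` are hypothetical names.

```java
// Illustrative sketch; working storage is a loop counter and one flag.
class BinaryPrinter {
    static void writeBinary(int n, StringBuilder out) {
        if (n == 0) {
            out.append('0');
            return;
        }
        boolean seenOne = false; // suppresses leading zeros
        for (int bit = 30; bit >= 0; bit--) { // bit 31 is the sign bit
            int b = (n >>> bit) & 1;
            if (b == 1) {
                seenOne = true;
            }
            if (seenOne) {
                out.append(b);
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        writeBinary(26, sb);
        System.out.println(sb); // 11010
    }
}
```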

Sorting by least significant digit

I am trying to write a program that accepts an array of five four-digit numbers and sorts the array based on the least significant digit. For example, if the numbers were 1224, 5432, 4597, and 8978, the array would first be sorted by the last digit, so the first pass would give 5432, 1224, 4597, 8978. After the next pass it would be 1224, 5432, 8978, 4597. And so on until it is fully sorted.
I have written the code for displaying the array and part of the code for sorting. I am not sure how to write the equations I need to compare each digit. This is my code for sorting by each digit so far:
public static void sortByDigit(int[] array, int size)
{
for(int i = 0; i < size; i++)
{
for(int j = 0; j < size; j++)
{
}
for(i = 0; i < size; i++)
{
System.out.println(array[i]);
}
}
}
I am not sure what to put in the nested for loop. I think I need to use the modulus.
I just wrote this to separate the digits but I don't know how to swap the numbers or compare them.
int first = array[i]%10;
int second = (array[i]%100)/10;
int third = (array[i]%1000)/10;
int fourth = (array[i]%10000)/10;
Would this go in the for loop?

It seems like your problem is mainly just getting the value of a digit at a certain index. Once you can do that, you should be able to formulate a solution.
Your hunch that you need modulus is absolutely correct. The modulo operator (%) returns the remainder on a given division operation. This means that saying 10 % 2 would equal 0, as there is no remainder. 10 % 3, however, would yield 1, as the remainder is one.
Given that quick background on modulus, we just need to figure out how to make a method that can grab a digit. Let's start with a general signature:
public int getValueAtIdx(int value, int idx){
}
So, if we call getValueAtIdx(145, 2), it should return 1 (assuming that the index starts at the least significant digit). If we call getValueAtIdx(562354, 3), it should return 2. You get the idea.
Alright, so let's start by figuring out how to do this for a simple case. Let's say we call getValueAtIdx(27, 0). Using modulus, we should be able to grab that 7. Our equation is 27 % x = 7, and we just need to determine x. So 27 divided by what will give us a remainder of 7? 10, of course! That makes our equation 27 % 10 = 7.
Now that's all fine and dandy, but how does 10 relate to 0? Well, let's try and grab the value at index 1 this time (2), and see if we can't figure it out. With what we did last time, we should have something like 27 % x = 27 (WARNING: There is a rabbit-hole here where you could think x should be 5, but upon further examination it can be found that only works in this case). What if we take the 10 we used earlier, but raise it to the power (index+1)? That would give us 27 % 100 = 27. Then all we have to do is divide by 10 and we're good.
So what would that look like in the function we are making?
public int getValueAtIdx(int value, int idx){
int modDivisor = (int) Math.pow(10, (idx+1));
int remainder = value % modDivisor;
int digit = remainder / (modDivisor / 10);
return digit;
}
Ok, so let's go back to the more complicated example: getValueAtIdx(562354, 3).
In the first step, modDivisor becomes 10^4, which equals 10000.
In the second step, remainder is set to 562354 % 10000, which equals 2354.
In the third and final step, digit is set to remainder / (10000 / 10). Breaking that down, we get remainder / 1000, which (using integer division) is equal to 2.
Our final step is to return the digit we have acquired.
EDIT: As for the sort logic itself, you may want to look here for a good idea.
The general process is to compare the two digits, and if they are equal move on to their next digit. If they are not equal, put them in the bucket and move on.
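As a rough sketch of how digit extraction and buckets can fit together, here is a least-significant-digit pass written as a stable counting sort. This is one common way to do it, not necessarily what the linked page describes. `LsdSortSketch`, `digitAt` and `sortByDigit` are hypothetical names, and the input is assumed to be non-negative.

```java
import java.util.Arrays;

// Illustrative sketch; assumes non-negative values.
class LsdSortSketch {
    // Digit of value at index idx, where index 0 is the least significant digit.
    static int digitAt(int value, int idx) {
        int divisor = 1;
        for (int i = 0; i < idx; i++) {
            divisor *= 10;
        }
        return (value / divisor) % 10;
    }

    // Stable sort of array by the digit at idx (counting sort with 10 buckets).
    static void sortByDigit(int[] array, int idx) {
        int[] count = new int[11];
        for (int v : array) {
            count[digitAt(v, idx) + 1]++;
        }
        for (int d = 0; d < 10; d++) {
            count[d + 1] += count[d]; // count[d] is now the first slot for digit d
        }
        int[] out = new int[array.length];
        for (int v : array) {
            out[count[digitAt(v, idx)]++] = v;
        }
        System.arraycopy(out, 0, array, 0, array.length);
    }

    // Full sort: one pass per digit, least significant first.
    static void sort(int[] array, int digits) {
        for (int idx = 0; idx < digits; idx++) {
            sortByDigit(array, idx);
        }
    }

    public static void main(String[] args) {
        int[] data = {1224, 5432, 4597, 8978};
        sort(data, 4);
        System.out.println(Arrays.toString(data));
    }
}
```

Each pass must be stable (equal digits keep their previous order), otherwise the work of earlier passes is lost.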

Which data type or data structure to choose to calculate factorial of 100?

I thought of writing a program to evaluate factorial of a given integer.
Following the basics, I wrote the code below in Java:
long fact(int num){
if(num == 1)
return 1;
else
return num*fact(num-1);
}
But then I realized that for many integer inputs the result may not be what is desired, so for testing I directly gave 100 as input.
My suspicion was right: the result I got was "0" (probably because the result is out of the range of long).
So I am curious to know how I can make my program work for inputs <= 150.
I would appreciate any valid solution in C programming language or Java.

BigInteger is your class. It can store integers of seemingly any size.
static BigInteger fact(BigInteger num) {
if (num.equals(BigInteger.ONE))
return BigInteger.ONE;
else
return num.multiply(fact(num.subtract(BigInteger.ONE)));
}
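An iterative variant may be easier to call, since it takes a plain int and also handles 0 (the recursive version above never reaches its base case for an input of 0). This is a sketch, with `FactorialSketch` as a hypothetical class name.

```java
import java.math.BigInteger;

// Illustrative sketch of an iterative BigInteger factorial.
class FactorialSketch {
    static BigInteger factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            // BigInteger is immutable, so the product must be reassigned.
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        BigInteger f = factorial(100);
        System.out.println(f);
        System.out.println(f.toString().length() + " decimal digits");
    }
}
```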

If you're not after a naive approach of factorial computation, you should do some research into the problem. Here's a good overview of some algorithms for computing factorials: http://www.luschny.de/math/factorial/conclusions.html
But like the other answers suggest, your current problem is that you need to use a large number implementation (e.g. BigInt) instead of fixed size integers.

In C, you can use an array to store the factorial of a large number.
My reference: Calculate the factorial of an arbitrarily large number, showing all the digits. It is a very helpful post.
I made small changes to the code to convert it to C.
#include <stdio.h>
#include <stdlib.h>
int max = 5000;
void factorial(int arr[], int n){//factorial in array
if (!n) return;
int carry = 0;
int i=max-1;
for (i=max-1; i>=0; --i){
arr[i] = (arr[i] * n) + carry;
carry = arr[i]/10;
arr[i] %= 10;
}
factorial(arr,n-1);
}
void display(int arr[]){// to print array
int ctr = 0;
int i=0;
for (i=0; i<max; i++){
if (!ctr && arr[i])
ctr = 1;
if(ctr)
printf("%d", arr[i]);
}
}
int main(){
int *arr = calloc(max, sizeof(int));
arr[max-1] = 1;
int num = 100;
printf("factorial of %d is: ",num);
factorial(arr,num);
display(arr);
free(arr);
return 0;
}
And it works for 100! See: here (Codepad)
I would like to give you links to two more useful posts.
1) How to handle arbitrarily large integers suggests GNU MP
2) C++ program to calculate large factorials

In Java you have BigInteger, which can store arbitrarily big integers. Unfortunately there is no equivalent in C. You either have to use a third-party library or implement big integers on your own. The typical approach is to have a dynamically allocated array that stores each of the digits of the given number in some numeral system (usually a base larger than 10 is chosen, to reduce the total number of digits you need).

A decimal (base 10) digit takes about 3.32 bits (exactly: log(10)/log(2)). 100! is 158 digits long, so you need about 158 * 3.32 ≈ 525 bits.
There is certainly no built in type in C that will do this. You need some form of special library if you want every digit in the factorial calculation to be "present".
Using double would give you an approximate result (this assumes that double is a 64-bit floating-point value that is IEEE-754 compatible, or has a similar range). The IEEE-754 double format gives about 16 decimal digits (53 bits of precision, divided by log(10)/log(2) as above). There are far more than 16 digits in this value, so you won't get an exact value, but it will calculate a number whose leading digits (at most about 16 of the 158) are correct.

Dealing with overflow in Java without using BigInteger

Suppose I have a method to calculate combinations of r items from n items:
public static long combi(int n, int r) {
if ( r == n) return 1;
long numr = 1;
for(int i=n; i > (n-r); i--) {
numr *=i;
}
return numr/fact(r);
}
public static long fact(int n) {
long rs = 1;
if(n <2) return 1;
for (int i=2; i<=n; i++) {
rs *=i;
}
return rs;
}
As you can see, it involves a factorial, which can easily overflow. For example, fact(200) in the factorial method returns zero. The question is: why do I get zero?
Secondly how do I deal with overflow in above context? The method should return largest possible number to fit in long if the result is too big instead of returning wrong answer.
One approach (but this could be wrong) is that if the result exceed some large number for example 1,400,000,000 then return remainder of result modulo
1,400,000,001. Can you explain what this means and how can I do that in Java?
Note that I do not guarantee that above methods are accurate for calculating factorial and combinations. Extra bonus if you can find errors and correct them.
Note that I can only use int or long and if it is unavoidable, can also use double. Other data types are not allowed.
I am not sure who marked this question as homework. This is NOT homework. I wish it was homework and i was back to future, young student at university. But I am old with more than 10 years working as programmer. I just want to practice developing highly optimized solutions in Java. In our times at university, Internet did not even exist. Today's students are lucky that they can even post their homework on site like SO.

Use the multiplicative formula, instead of the factorial formula.
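A sketch of what that might look like with long only: the product is built up as C(n-k+1, 1), C(n-k+2, 2), ..., so every division is exact and the running value never exceeds the final result by more than one factor. Overflow detection via Math.multiplyExact is an addition for illustration and not part of the answer. Note that the intermediate product can overflow even when the final result would still fit, so this rejects some borderline inputs. `MultiplicativeCombi` is a hypothetical name.

```java
// Illustrative sketch of the multiplicative formula with overflow detection.
class MultiplicativeCombi {
    static long combi(int n, int r) {
        if (n < 0 || r < 0 || r > n) {
            throw new IllegalArgumentException("need 0 <= r <= n");
        }
        int k = Math.min(r, n - r); // C(n, r) == C(n, n - r)
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // Before this step result == C(n - k + i - 1, i - 1);
            // multiplying by (n - k + i) and dividing by i gives C(n - k + i, i).
            // multiplyExact throws ArithmeticException on long overflow.
            result = Math.multiplyExact(result, (long) (n - k + i)) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(combi(61, 30));
    }
}
```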

Since it's homework, I don't want to just give you a solution. However, here is a hint: instead of calculating two large numbers and dividing the result, try calculating both together. E.g. calculate the numerator until it's about to overflow, then calculate the denominator. In this last step you can choose to divide the numerator instead of multiplying the denominator. This stops both values from getting really large when the ratio of the two is relatively small.
I got this result before an overflow was detected.
combi(61,30) = 232714176627630544 which is 2.52% of Long.MAX_VALUE
The only "bug" I found in your code is not having any overflow detection, since you know it's likely to be a problem. ;)

To answer your first question (why did you get zero), the values of fact() as computed by modular arithmetic were such that you hit a result with all 64 bits zero! Change your fact code to this:
public static long fact(int n) {
long rs = 1;
if( n <2) return 1;
for (int i=2; i<=n; i++) {
rs *=i;
System.out.println(rs);
}
return rs;
}
Take a look at the outputs! They are very interesting.
Now onto the second question....
It looks like you want to give exact integer (er, long) answers for values of n and r that fit, and throw an exception if they do not. This is a fair exercise.
To do this properly you should not use factorial at all. The trick is to recognize that C(n,r) can be computed incrementally by adding terms. This can be done using recursion with memoization, or by the multiplicative formula mentioned by Stefan Kendall.
As you accumulate the results into a long variable that you will use for your answer, check the value after each addition to see if it goes negative. When it does, throw an exception. If it stays positive, you can safely return your accumulated result as your answer.
To see why this works consider Pascal's triangle
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
which is generated like so:
C(0,0) = 1 (base case)
C(1,0) = 1 (base case)
C(1,1) = 1 (base case)
C(2,0) = 1 (base case)
C(2,1) = C(1,0) + C(1,1) = 2
C(2,2) = 1 (base case)
C(3,0) = 1 (base case)
C(3,1) = C(2,0) + C(2,1) = 3
C(3,2) = C(2,1) + C(2,2) = 3
...
When computing the value of C(n,r) using memoization, store the results of recursive invocations as you encounter them in a suitable structure such as an array or hashmap. Each value is the sum of two smaller numbers. The numbers start small and are always positive. Whenever you compute a new value (let's call it a subterm) you are adding smaller positive numbers. Recall from your computer organization class that whenever you add two modular positive numbers, there is an overflow if and only if the sum is negative. It only takes one overflow in the whole process for you to know that the C(n,r) you are looking for is too large.
This line of argument could be turned into a nice inductive proof, but that might be for another assignment, and perhaps another StackExchange site.
ADDENDUM
Here is a complete application you can run. (I haven't figured out how to get Java to run on codepad and ideone).
/**
* A demo showing how to do combinations using recursion and memoization, while detecting
* results that cannot fit in 64 bits.
*/
public class CombinationExample {
/**
* Returns the number of combinations of r things out of n total.
*/
public static long combi(int n, int r) {
// Validate before allocating; a negative n or r would otherwise fail inside the array code.
if (n < 0 || r < 0 || r > n) {
throw new IllegalArgumentException("Nonsense args");
}
long[][] cache = new long[n + 1][n + 1];
return c(n, r, cache);
}
/**
* Recursive helper for combi.
*/
private static long c(int n, int r, long[][] cache) {
if (r == 0 || r == n) {
return cache[n][r] = 1;
} else if (cache[n][r] != 0) {
return cache[n][r];
} else {
cache[n][r] = c(n-1, r-1, cache) + c(n-1, r, cache);
if (cache[n][r] < 0) {
throw new RuntimeException("Woops too big");
}
return cache[n][r];
}
}
/**
* Prints out a few example invocations.
*/
public static void main(String[] args) {
String[] data = ("0,0,3,1,4,4,5,2,10,0,10,10,10,4,9,7,70,8,295,100," +
"34,88,-2,7,9,-1,90,0,90,1,90,2,90,3,90,8,90,24").split(",");
for (int i = 0; i < data.length; i += 2) {
int n = Integer.valueOf(data[i]);
int r = Integer.valueOf(data[i + 1]);
System.out.printf("C(%d,%d) = ", n, r);
try {
System.out.println(combi(n, r));
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
}
}
Hope it is useful. It's just a quick hack so you might want to clean it up a little.... Also note that a good solution would use proper unit testing, although this code does give nice output.

You can use the java.math.BigInteger class to deal with arbitrarily large numbers.

If you make the return type double, it can handle up to fact(170), but you'll lose some precision because of the nature of double (I don't know why you'd need exact precision for such huge numbers).
For input over 170, the result is infinity
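A quick way to see that limit, as a sketch (`DoubleFactorial` is a hypothetical name):

```java
// Illustrative sketch: factorial with double, which is approximate but has a wide range.
class DoubleFactorial {
    static double fact(int n) {
        double rs = 1.0;
        for (int i = 2; i <= n; i++) {
            rs *= i; // overflows to Infinity instead of wrapping around
        }
        return rs;
    }

    public static void main(String[] args) {
        System.out.println(fact(170)); // still finite
        System.out.println(fact(171)); // Infinity
    }
}
```

Unlike long, a double never silently wraps to a small or negative value, so Double.isInfinite can serve as the "too big" check.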

Note that java.lang.Long includes constants for the min and max values for a long.
When you add together two signed 2s-complement positive values of a given size, and the result overflows, the result will be negative. Bit-wise, it will be the same bits you would have gotten with a larger representation, only the high-order bit will be truncated away.
Multiplying is a bit more complicated, unfortunately, since you can overflow by more than one bit.
But you can multiply in parts. Basically you break the two factors into low and high halves (or more than that, if you already have an "overflowed" value), perform the four possible multiplications between the four halves, then recombine the results. (It's really just like doing decimal multiplication by hand, but each "digit" is, say, 32 bits.)
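A sketch of that idea for two non-negative long values, producing the full 128-bit product as a high and a low 64-bit half. The class and method names are hypothetical. On Java 9 and later, Math.multiplyHigh computes the high half directly, and Math.multiplyExact is enough if all you need is an exception on overflow.

```java
// Illustrative sketch; assumes a >= 0 and b >= 0.
class WideMultiply {
    /** Returns {high, low}: the upper and lower 64 bits of a * b. */
    static long[] multiply(long a, long b) {
        final long mask = 0xFFFFFFFFL;
        long aH = a >>> 32, aL = a & mask; // 32-bit "digits" of a
        long bH = b >>> 32, bL = b & mask; // 32-bit "digits" of b

        long ll = aL * bL; // may use all 64 bits; treated as unsigned below
        long lh = aL * bH;
        long hl = aH * bL;
        long hh = aH * bH;

        // Middle column plus the carry out of the low column.
        // For non-negative inputs this stays below 2^64.
        long mid = lh + hl + (ll >>> 32);
        long low = (mid << 32) | (ll & mask);
        long high = hh + (mid >>> 32);
        return new long[] {high, low};
    }

    /** True if the product fits in a signed long. */
    static boolean fitsInLong(long[] product) {
        return product[0] == 0 && product[1] >= 0;
    }
}
```

A caller could return Long.MAX_VALUE whenever fitsInLong is false, which matches the "largest possible number" behaviour asked for in the question.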

You can copy the code from java.math.BigInteger to deal with arbitrarily large numbers. Go ahead and plagiarize.
