As a school assignment I have a really big for loop (it runs from 1 to a 13-digit number) with a few BigInteger operations inside. To shorten the loop I can skip all the even numbers, and to avoid unnecessary BigInteger operations I can check for multiples of 3, 5, 7 and 11. Here is an example:
for (long i = min; i < max; i += 2) {
    if (i % 3 != 0 && i % 5 != 0 && i % 7 != 0 && i % 11 != 0) {
        BigInteger aux = new BigInteger(Long.toString(i));
        BigInteger[] aux2 = k.divideAndRemainder(aux);
        if (aux2[1].longValueExact() == 0) {
            list.add(aux);
            list.add(aux2[0]);
        }
    }
}
Now, this loop would take months to finish, so I thought of splitting the for loop across multiple threads, each one covering a window of the original 13-digit range.
My first question is: with an i7-3770 processor, how many threads should I use to be as fast as possible?
Edit: the assignment is to optimize a problem that requires a lot of CPU. In this case, to find all divisors of a 23-digit number (the "k" in k.divideAndRemainder(aux)); that's why I'm using BigIntegers. The 13-digit number used by the for loop is the square root of k.
Java is fine for multithreading. BigInteger is not exactly performant, but that's not the problem.
On that processor, up to 8 threads can utilize the cores completely. I doubt you run a custom OS that allows you to cede full CPU control to an application, so you want to keep at least 1 logical core for the OS and associated services, as well as 1 for Java misc threads.
If your program takes months to complete on a single thread, then using 4 threads efficiently will make it take weeks to complete. While that's a big speedup, I don't think that's quite enough for you, is it?
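If you do go parallel anyway, splitting the range is the easy part. A rough sketch, assuming a hypothetical searchRange(lo, hi) method containing the loop from the question (collecting results would additionally require a thread-safe list):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: split [min, max) into one chunk per worker thread.
static void searchInParallel(long min, long max, int threads) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    long chunk = (max - min) / threads;
    for (int t = 0; t < threads; t++) {
        long lo = min + t * chunk;
        long hi = (t == threads - 1) ? max : lo + chunk; // last chunk takes the remainder
        pool.submit(() -> searchRange(lo, hi));          // searchRange: the loop from the question
    }
    pool.shutdown();
    pool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
}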
This is a school assignment, as you've said yourself. Just like with Project Euler, there are ways to do it, and there are ways to do it right. The first ones take days of computing on powerful machines after 10 minutes of programming; the latter take days of thinking and seconds of computing. Your problem is not the hardware or the language, it's the algorithm.
Now for the actual stuff that you can do better. From obvious to less so.
BigInteger aux = new BigInteger(Long.toString(i)); Are you mad? Why would you convert a long to a String and then parse it back into a number?! This is stupidly slow. Just use BigInteger.valueOf(long val).
BigInteger is slower than primitives. Why do you need it if your values (up to 13 digits, as you've said yourself) fit comfortably in a long? If the only reason is the divideAndRemainder method - write one yourself! It's going to be 2 lines of code (see the first sketch after this list)...
You seem to be looking for divisors of a particular number. (The filtering of multiples of 3, 5, 7, 11 makes me think you're looking for prime ones and will do some filtering later.) There are a number of speedups for such algorithms, the main one being (and it's not clear whether you're using it) to cap the checked candidates at the square root of the target, rather than the number itself, half of it, or whatever arbitrary bound people make up (see the second sketch below). Other ones can be found in the wiki or the more complex wiki.
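To illustrate both points: first, a minimal divideAndRemainder with primitives (this assumes both operands fit in a long, which holds for the 13-digit candidates but not for the 23-digit k itself):

// Sketch: same contract as BigInteger.divideAndRemainder - returns [quotient, remainder].
static long[] divideAndRemainder(long dividend, long divisor) {
    return new long[] { dividend / divisor, dividend % divisor };
}

And second, a sketch of a divisor search capped at the square root (divisors is a made-up helper name; it assumes the target fits in a long):

import java.util.ArrayList;
import java.util.List;

// Every divisor d <= sqrt(k) pairs with the divisor k / d >= sqrt(k),
// so checking candidates up to sqrt(k) finds all divisor pairs.
static List<long[]> divisors(long k) {
    List<long[]> pairs = new ArrayList<>();
    for (long d = 1; d * d <= k; d++) {
        if (k % d == 0) {
            pairs.add(new long[] { d, k / d });
        }
    }
    return pairs;
}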
I found a simple Java exercise and I answered it, yet there seems to be an issue with my code and I can't seem to find the problem. Please point me towards the issue:
The problem is:
We want to make a row of bricks that is goal inches long. We have a number of small bricks (1 inch each) and big bricks (5 inches each). Return true if it is possible to make the goal by choosing from the given bricks. This is a little harder than it looks and can be done without any loops.
And I made this function as an answer:
public boolean makeBricks(int small, int big, int goal) {
    if (small >= goal) return true;
    if ((goal >= 5) && (big >= 1)) { makeBricks(small, big - 1, goal - 5); }
    return false;
}
Yet when running it on https://codingbat.com/prob/p183562 it says that it's wrong, even though it all looks correct to me.
Adding a return statement fixes your technical problem of being unable to determine the truth value of calls further down the stack, but that's a linear solution to a problem that can be solved in constant time with basic math:
public boolean makeBricks(int small, int big, int goal) {
    return big * 5 + small >= goal && goal % 5 <= small;
}
The idea here is to first determine whether all of our bricks combined meet or exceed the goal: big * 5 + small >= goal. If we can't satisfy this inequality, we're definitely out of luck.
However, this is overly optimistic and does not account for cases when we have sufficient blocks to exceed the goal but not enough small blocks to remove some number of larger blocks and meet the goal. Testing goal % 5 <= small ensures that we have enough small blocks to bridge the gap of 5 that will be left as each large block is removed.
If that's still not clear, let's examine an edge case: makeBricks(3, 2, 9). Our goal is 9 and we have 3 small blocks and 2 large ones. Combining our entire arsenal gives a total of 13, which seems sufficient to meet the goal. However, if we omit one of our large blocks, the closest we can get is 8. If we omit all of our small blocks, the closest we can get is 10. No matter what we do, the goal is one block out of reach.
Let's check that against our formula: 9 mod 5 == 4, which is 1 more than our number of small blocks, 3, and matches our hand computation. We should return false on this input. On the other hand, if we had 1 extra small block, 9 % 5 <= small would hold, and we'd have just enough blocks to bridge the gap.
Put a return in front of your recursive call:
return makeBricks(small,big-1,goal-5);
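With that fix in place, the whole method would read (a sketch of the corrected recursion; it returns false for makeBricks(3, 2, 9), matching the hand computation in the previous answer):

public boolean makeBricks(int small, int big, int goal) {
    if (small >= goal) return true; // enough small bricks alone
    if ((goal >= 5) && (big >= 1)) {
        // use one big brick and recurse on the reduced goal
        return makeBricks(small, big - 1, goal - 5);
    }
    return false;
}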
Using Java, I am trying to work my way through a list of problems like this:
Write a program that prints all prime numbers. (Note: if your programming language does not support arbitrary size numbers, printing all primes up to the largest number you can easily represent is fine too.)
Do they mean "all prime numbers up to n"? How can I know whether Java supports arbitrary size numbers or not, and if yes, what is it?
Java's primitives have well-defined ranges. ints, for example, range between -2^31 and 2^31 - 1. You can see the full details in Java's tutorial. If you want to represent integers larger than the long primitive allows (i.e., larger than 2^63 - 1), you'll have to resort to using the BigInteger class.
A program that printed all primes would be a non-halting program. There are infinitely many primes, so you could run the program forever.
The basic number types in most languages have a defined amount of memory space, and therefore a limited range of numbers they can express. For example, a Java int is stored in 4 bytes. That would make the non-halting primes program impossible, since you would eventually get to a prime larger than the largest number you could store.
As well as primitive number types, many languages have arbitrary size number types. The amount of memory they use will grow as you store larger numbers. Java has BigInteger for this purpose, while Python just stores all numbers that way by default.
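As a minimal sketch of what "print all primes" can look like with an arbitrary-size type (it never halts by design; nextProbablePrime is probabilistic, though the error odds are negligible):

import java.math.BigInteger;

// Sketch: prints primes indefinitely using BigInteger's arbitrary precision.
public class PrimePrinter {
    public static void main(String[] args) {
        BigInteger p = BigInteger.valueOf(2);
        while (true) {
            System.out.println(p);
            p = p.nextProbablePrime(); // first (probable) prime greater than p
        }
    }
}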
Even these arbitrary-size numbers, though, are eventually limited by the amount of memory the program can access. In reality, it is impossible to print all the primes.
What the problem statement you are trying to solve is really saying is that they don't want you to bother about any of this! Just use the standard number type in your language. No system is capable of calculating infinitely many primes. What they want you to do is to write the algorithm that would calculate infinitely many primes given a system and data type capable of doing so.
Perhaps they could have worded it better. In short: don't worry about it.
In a past examination paper, there was a question that gave two methods to check whether int[] A contained all the same values as int[] B (two unsorted arrays of size N); an engineer had to decide which implementation he would use.
The first method used a single for loop with a nested call to linear search; its asymptotic run time was calculated to be theta(n^2), and the additional memory usage for creating copies of A and B is around (N*4N + 4N) bytes.
The second method used an additional int[] C, a copy of B, which is sorted. A for loop is used with a nested binary search; its asymptotic run time was calculated as theta(n log(n^2)), and the additional memory usage for creating copies of A and B is around (4N + 4N + N*4N + 4N) bytes (the first two 4N's are due to C being a copy of B, and to the subsequent copy of C created inside the sort(C) function).
The final question asks which implementation the engineer should use. I believe the faster algorithm is the better option, as for larger inputs it would cut the computation time drastically, although the drawback is that with larger inputs he runs the risk of an OutOfMemoryError. I understand one could argue for either method depending on the size of the arrays, but my question is: in the majority of cases, which implementation is the better one?
The first alg. has complexity of theta(n^2) and the second of theta(n log(n^2)).
Hence, the second one is much faster for large enough n.
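For reference, the second method might look roughly like this (a sketch under one reading of "contains all the same values"; containsAll is a made-up name, not from the exam paper):

import java.util.Arrays;

// Sketch of the second method: copy B, sort the copy, then binary-search
// each element of A. The sort is O(n log n); the n searches add another O(n log n).
static boolean containsAll(int[] a, int[] b) {
    int[] c = Arrays.copyOf(b, b.length); // the extra copy of B
    Arrays.sort(c);                       // the sort may copy again internally
    for (int x : a) {
        if (Arrays.binarySearch(c, x) < 0) {
            return false; // some value of A does not occur in B
        }
    }
    return true;
}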
As you mentioned, memory usage comes into account, and one might argue that the second one consumes a lot more memory, but that is not true.
The first alg. consumes: n*4n + 4n = 4n^2 + 4n bytes.
The second alg. consumes: 4n + 4n + 4n + n*4n = 4n^2 + 12n bytes.
Suppose n is equal to 1000; then the first alg. consumes 4,004,000 bytes and the second 4,012,000 bytes. So there is no big difference in memory consumption between the two algorithms.
So, from a memory consumption perspective it does not really matter which alg. you choose; they both consume theta(n^2) bytes of memory.
A relevant question to many a programmer, but the answer depends entirely on the context. If the software runs on a system with little memory, then the algorithm with the smaller footprint will likely be better. On the other hand, a real-time system needs speed, so the faster algorithm would likely be better.
It is important to also recognize the more subtle issues that arise during execution, e.g. when the additional memory requirements of the faster algorithm force the system to use paging and, in turn, slow down execution.
Thus, it is important to understand in-context the benefit-cost trade off for each algorithm. And as always, emphasize code readability and sound algorithm design/integration.
I have a Rectangle class. It has a length, breadth and area (all ints). I want to hash it such that every rectangle with the same length and breadth hashes to the same value. What is a way to do this?
EDIT: I understand it's a broad question. That's why I asked for "a" way to do it, not the best way.
A good and simple scheme is to calculate the hash for a pair of integers as follows:
hash = length * CONSTANT + width
Empirically, you will get the best results (i.e. the fewest collisions) if CONSTANT is a prime number. A lot of people1 recommend a value like 31, but the best choice depends on the most likely range of the length and width values. If they are strictly bounded, and small enough, then you could do better than 31.
However, 31 is probably good enough for practical purposes2. A few collisions at this level is unlikely to make a significant performance difference, and even a perfect hashing function does not eliminate collisions at the hash table level ... where you use the modulus of the hash value.
1 - I'm not sure where this number comes from, or whether there are empirical studies to back it up ... in the general case. I suspect it comes from hashing of (ASCII) strings. But 31 is prime ... and it is a Mersenne prime (2^5 - 1), which means it can be computed using a shift and a subtraction if hardware multiply is slow.
2 - I'm excluding cases where you need to worry about someone deliberately creating hash function collisions in an attempt to "break" something.
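Applied to the Rectangle from the question, a minimal sketch of this scheme (the field names length and breadth are assumed from the question):

// Sketch: hashCode using the prime-multiplier scheme above.
// Rectangles with equal length and breadth get equal hash codes.
@Override
public int hashCode() {
    return 31 * length + breadth;
}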
You can use the Apache Commons library, which has a HashCodeBuilder class. Assuming you have a Rectangle class with a width and a height, you can add the following method:
@Override
public int hashCode() {
    // HashCodeBuilder lives in org.apache.commons.lang3.builder
    return new HashCodeBuilder().append(width).append(height).toHashCode();
}
What you want (as clarified in your comment on the question) is not possible. There are N possible hashCodes, one for each int, where N is approximately 4.2 billion. Assuming rectangles must have positive dimensions, each dimension can take about N/2 values, giving ((N * N) / 4) possible rectangles. How do you propose to make them fit into N hashCodes? Since N^2/4 > N whenever N > 4, you have more possible rectangles than hashCodes.
I did an algorithms refresher and (re)read about a sorting algorithm that runs in linear time, namely counting sort.
To be honest, I had forgotten about it.
I understand the structure and logic, and the fact that it runs in linear time is a very attractive quality.
But I have the following question:
As I understand it, the concrete implementation of the algorithm relies on 2 things:
1) The range of input numbers is small (otherwise the intermediate array will be huge and with many gaps).
2) We actually know the range of numbers.
Assuming these 2 points are correct (please correct me otherwise), I was wondering what the best application domain for this algorithm is.
I mean, specifically in Java, is an implementation like the following (Java/Counting sort) sufficient:
public static void countingSort(int[] a, int low, int high)
{
    // requires: import java.util.Arrays;
    int[] counts = new int[high - low + 1]; // holds a counter for every possible value, from low to high
    for (int x : a)
        counts[x - low]++; // - low so the lowest possible value maps to index 0
    int current = 0;
    for (int i = 0; i < counts.length; i++)
    {
        Arrays.fill(a, current, current + counts[i], i + low); // write counts[i] copies of value i + low starting at current
        current += counts[i]; // leap forward by counts[i] steps
    }
}
or is it not a trivial matter to come up with high and low?
Is there a specific application in Java that counting sort is best suited for?
I assume there are subtleties like these, otherwise why would anyone bother with all the O(n log n) algorithms?
Algorithms are not about the language, so this is language-agnostic. As you have said - use counting sort when the domain is small. If you have only three distinct values - 1, 2, 3 - it is far better to sort them with counting sort than with quicksort, heapsort or any other O(n log n) algorithm. If you have a specific question, feel free to ask.
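As for finding high and low: if they are not known up front, a single linear scan recovers them, so the whole procedure stays linear. A minimal sketch wrapping the countingSort from the question (countingSortAuto is a made-up name):

// Sketch: derive low/high in one pass, then delegate to the question's countingSort.
static void countingSortAuto(int[] a) {
    if (a.length == 0) return;
    int low = a[0], high = a[0];
    for (int x : a) {
        low = Math.min(low, x);
        high = Math.max(high, x);
    }
    // the span high - low must still be small enough for new int[high - low + 1]
    countingSort(a, low, high);
}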
It is incorrect to say counting sort is O(n) in the general case; the "small range of elements" condition is not a recommendation but a key assumption.
Counting sort assumes that each of the elements is an integer in the range 1 to k, for some integer k. When k = O(n), counting sort runs in O(n) time.
In the general situation, the range of keys k is independent of the number of elements n and can be arbitrarily large. For example, consider the following array:
{1, 1000000000000000000000000000000000000000000000000000000000000000000000000000}
As for the exact break-even point of k and n where counting sort outperforms a traditional sort, it is hugely dependent on the implementation and is best determined via benchmarking (trying both out).
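A rough timing sketch of such a benchmark (a rigorous comparison would use a harness like JMH; this assumes the countingSort(int[], int, int) method from the question is in scope, and n and k are parameters to tune):

import java.util.Arrays;
import java.util.Random;

public static void main(String[] args) {
    int n = 1_000_000, k = 1_000; // element count and key range
    int[] data = new Random(42).ints(n, 0, k).toArray();

    int[] a = data.clone();
    long t0 = System.nanoTime();
    countingSort(a, 0, k - 1);
    System.out.println("counting sort: " + (System.nanoTime() - t0) / 1_000_000 + " ms");

    int[] b = data.clone();
    long t1 = System.nanoTime();
    Arrays.sort(b); // dual-pivot quicksort, O(n log n)
    System.out.println("Arrays.sort:   " + (System.nanoTime() - t1) / 1_000_000 + " ms");
}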