Performance of nested loop vs hard coded matrix multiplication - java

I am reading a book on 2D game programming and am being walked through a 3x3 matrix class for linear transformations. The author has written a method for multiplying two 3x3 matrices as follows.
public Matrix3x3f mul(Matrix3x3f m1)
{
return new Matrix3x3f(new float[][]
{
{
this.m[0][0] * m1.m[0][0] // M[0,0]
+ this.m[0][1] * m1.m[1][0]
+ this.m[0][2] * m1.m[2][0],
this.m[0][0] * m1.m[0][1] // M[0,1]
+ this.m[0][1] * m1.m[1][1]
+ this.m[0][2] * m1.m[2][1],
this.m[0][0] * m1.m[0][2] // M[0,2]
+ this.m[0][1] * m1.m[1][2]
+ this.m[0][2] * m1.m[2][2],
},
{
this.m[1][0] * m1.m[0][0] // M[1,0]
+ this.m[1][1] * m1.m[1][0]
+ this.m[1][2] * m1.m[2][0],
this.m[1][0] * m1.m[0][1] // M[1,1]
+ this.m[1][1] * m1.m[1][1]
+ this.m[1][2] * m1.m[2][1],
this.m[1][0] * m1.m[0][2] // M[1,2]
+ this.m[1][1] * m1.m[1][2]
+ this.m[1][2] * m1.m[2][2],
},
{
this.m[2][0] * m1.m[0][0] // M[2,0]
+ this.m[2][1] * m1.m[1][0]
+ this.m[2][2] * m1.m[2][0],
this.m[2][0] * m1.m[0][1] // M[2,1]
+ this.m[2][1] * m1.m[1][1]
+ this.m[2][2] * m1.m[2][1],
this.m[2][0] * m1.m[0][2] // M[2,2]
+ this.m[2][1] * m1.m[1][2]
+ this.m[2][2] * m1.m[2][2],
},
});
}
If I personally needed to write a method to do the same, I would have come up with a nested loop that did all of these calculations automatically. I am assuming the author wrote it out this way so that people with little math background can follow along more easily.
Does this sound like a fair assumption, or could a nested-loop version of this method cause performance issues when used heavily in a loop where performance is vital?
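For reference, the nested-loop version I have in mind would look something like the sketch below (my own code, assuming the same m field and float[][] constructor as the book's class):
public Matrix3x3f mul(Matrix3x3f m1)
{
    float[][] result = new float[3][3];
    for (int row = 0; row < 3; row++) {
        for (int col = 0; col < 3; col++) {
            float sum = 0.0f;
            for (int k = 0; k < 3; k++) {
                sum += this.m[row][k] * m1.m[k][col]; // same dot product as the hardcoded version
            }
            result[row][col] = sum;
        }
    }
    return new Matrix3x3f(result);
}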

I think this is a performance issue.
If you use a loop, the generated code contains extra jump instructions, since every iteration has to check a condition ("if cond goto ___"). You should read this post on Branch Prediction, and also learn a bit about computer architecture to understand how instruction flow affects performance; in this case I think you might also find caching interesting.

From the looks of it, I think it's for clarity's sake, not for performance's sake. Consider the fact that it's Java code. There's object allocation in the return statement. If it were so performance critical that the conditional jump of a for-loop can't be afforded, the result would be written into a mutable instance.
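To illustrate that last point, an allocation-free variant might take a destination matrix and write into it, along these lines (just a sketch; the two-parameter signature is my own illustration, not from the book, and out must be a different object from this and m1):
public Matrix3x3f mul(Matrix3x3f m1, Matrix3x3f out)
{
    // Write the product into a caller-supplied matrix instead of allocating a new one.
    for (int row = 0; row < 3; row++) {
        for (int col = 0; col < 3; col++) {
            out.m[row][col] = this.m[row][0] * m1.m[0][col]
                            + this.m[row][1] * m1.m[1][col]
                            + this.m[row][2] * m1.m[2][col];
        }
    }
    return out;
}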

If the hardcoded operations are exactly the same as the operations performed by a loop, I see no reason why the loop would be less efficient (or at least, not in any considerable way). Actually, for large loops (which is not the case here) a loop beats hardcoding by far because:
some optimizations can be applied by the compiler and the JVM at runtime
(they give clearer code and a smaller binary)
I have heard that it can sometimes be better to hardcode the operations when the loop iterates over a tiny range, but I don't think it is really worth doing.
Finally, for multiplying matrices, using a loop or not won't change much; what could speed up your calculations is using dynamic programming. I don't know whether it's worth it for such small matrices, but if I were you I would give it a try.

This is definitely a performance concern. Nested loops that have to increment the loop index and check whether the loop has ended always make for a slower implementation. For computer graphics and CAD/CAM software, a 3x3 or 4x4 matrix multiplication is done for every rendering action, so the multiplication can easily run millions of times. Therefore, implementing 3x3 or 4x4 matrix multiplication without nested loops is a common practice, especially in the older days when there was no such thing as a GPU. For matrices with more than 4 rows/columns, the nested-loop approach is still used.

Related

Behind the scenes of recursion? [duplicate]

One of the topics that seems to come up regularly on mailing lists and online discussions is the merits (or lack thereof) of doing a Computer Science Degree. An argument that seems to come up time and again for the negative party is that they have been coding for some number of years and they have never used recursion.
So the question is:
What is recursion?
When would I use recursion?
Why don't people use recursion?
There are a number of good explanations of recursion in this thread; this answer is about why you shouldn't use it in most languages.* In the majority of major imperative language implementations (i.e. every major implementation of C, C++, Basic, Python, Ruby, Java, and C#) iteration is vastly preferable to recursion.
To see why, walk through the steps that the above languages use to call a function:
space is carved out on the stack for the function's arguments and local variables
the function's arguments are copied into this new space
control jumps to the function
the function's code runs
the function's result is copied into a return value
the stack is rewound to its previous position
control jumps back to where the function was called
Doing all of these steps takes time, usually a little bit more than it takes to iterate through a loop. However, the real problem is in step #1. When many programs start, they allocate a single chunk of memory for their stack, and when they run out of that memory (often, but not always due to recursion), the program crashes due to a stack overflow.
So in these languages recursion is slower and it makes you vulnerable to crashing. There are still some arguments for using it though. In general, code written recursively is shorter and a bit more elegant, once you know how to read it.
There is a technique that language implementers can use called tail call optimization which can eliminate some classes of stack overflow. Put succinctly: if a function's return expression is simply the result of a function call, then you don't need to add a new level onto the stack; you can reuse the current one for the function being called. Regrettably, few imperative language implementations have tail-call optimization built in.
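For illustration (my own toy example; note the JVM, for instance, does not perform tail-call optimization, so the recursive version below can still blow the stack for very large n):
// Tail-recursive sum 1..n: the recursive call is the last thing evaluated.
static long sumRec(long n, long acc) {
    if (n == 0) return acc;          // base case
    return sumRec(n - 1, acc + n);   // tail call: nothing left to do afterwards
}

// The loop a tail-call-optimizing implementation could effectively produce.
static long sumLoop(long n) {
    long acc = 0;
    while (n != 0) {
        acc += n;
        n -= 1;
    }
    return acc;
}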
* I love recursion. My favorite static language doesn't use loops at all, recursion is the only way to do something repeatedly. I just don't think that recursion is generally a good idea in languages that aren't tuned for it.
** By the way Mario, the typical name for your ArrangeString function is "join", and I'd be surprised if your language of choice doesn't already have an implementation of it.
Simple english example of recursion.
A child couldn't sleep, so her mother told her a story about a little frog,
who couldn't sleep, so the frog's mother told her a story about a little bear,
who couldn't sleep, so the bear's mother told her a story about a little weasel...
who fell asleep.
...and the little bear fell asleep;
...and the little frog fell asleep;
...and the child fell asleep.
In the most basic computer science sense, recursion is a function that calls itself. Say you have a linked list structure:
struct Node {
Node* next;
};
And you want to find out how long a linked list is you can do this with recursion:
int length(const Node* list) {
if (!list) {
return 0; // base case: empty list
} else {
return 1 + length(list->next);
}
}
(This could of course be done with a for loop as well, but is useful as an illustration of the concept)
Whenever a function calls itself, creating a loop, then that's recursion. As with anything there are good uses and bad uses for recursion.
The most simple example is tail recursion where the very last line of the function is a call to itself:
int FloorByTen(int num)
{
if (num % 10 == 0)
return num;
else
return FloorByTen(num-1);
}
However, this is a lame, almost pointless example because it can easily be replaced by more efficient iteration. After all, recursion suffers from function call overhead, which in the example above could be substantial compared to the operation inside the function itself.
So the whole reason to do recursion rather than iteration should be to take advantage of the call stack to do some clever stuff. For example, if you call a function multiple times with different parameters inside the same loop then that's a way to accomplish branching. A classic example is the Sierpinski triangle.
You can draw one of those very simply with recursion, where the call stack branches in 3 directions:
private void BuildVertices(double x, double y, double len)
{
if (len > 0.002)
{
mesh.Positions.Add(new Point3D(x, y + len, -len));
mesh.Positions.Add(new Point3D(x - len, y - len, -len));
mesh.Positions.Add(new Point3D(x + len, y - len, -len));
len *= 0.5;
BuildVertices(x, y + len, len);
BuildVertices(x - len, y - len, len);
BuildVertices(x + len, y - len, len);
}
}
If you attempt to do the same thing with iteration I think you'll find it takes a lot more code to accomplish.
Other common use cases might include traversing hierarchies, e.g. website crawlers, directory comparisons, etc.
Conclusion
In practical terms, recursion makes the most sense whenever you need iterative branching.
Recursion is a method of solving problems based on the divide and conquer mentality.
The basic idea is that you take the original problem and divide it into smaller (more easily solved) instances of itself, solve those smaller instances (usually by using the same algorithm again) and then reassemble them into the final solution.
The canonical example is a routine to generate the Factorial of n. The Factorial of n is calculated by multiplying all of the numbers between 1 and n. An iterative solution in C# looks like this:
public int Fact(int n)
{
int fact = 1;
for( int i = 2; i <= n; i++)
{
fact = fact * i;
}
return fact;
}
There's nothing surprising about the iterative solution and it should make sense to anyone familiar with C#.
The recursive solution is found by recognising that the nth Factorial is n * Fact(n-1). Or to put it another way, if you know what a particular Factorial number is you can calculate the next one. Here is the recursive solution in C#:
public int FactRec(int n)
{
if( n < 2 )
{
return 1;
}
return n * FactRec( n - 1 );
}
The first part of this function is known as a Base Case (or sometimes Guard Clause) and is what prevents the algorithm from running forever. It just returns the value 1 whenever the function is called with a value of 1 or less. The second part is more interesting and is known as the Recursive Step. Here we call the same method with a slightly modified parameter (we decrement it by 1) and then multiply the result with our copy of n.
When first encountered this can be kind of confusing so it's instructive to examine how it works when run. Imagine that we call FactRec(5). We enter the routine, are not picked up by the base case and so we end up like this:
// In FactRec(5)
return 5 * FactRec( 5 - 1 );
// which is
return 5 * FactRec(4);
If we re-enter the method with the parameter 4 we are again not stopped by the guard clause and so we end up at:
// In FactRec(4)
return 4 * FactRec(3);
If we substitute this return value into the return value above we get
// In FactRec(5)
return 5 * (4 * FactRec(3));
This should give you a clue as to how the final solution is arrived at so we'll fast track and show each step on the way down:
return 5 * (4 * FactRec(3));
return 5 * (4 * (3 * FactRec(2)));
return 5 * (4 * (3 * (2 * FactRec(1))));
return 5 * (4 * (3 * (2 * (1))));
That final substitution happens when the base case is triggered. At this point we have a simple algebraic formula to solve which equates directly to the definition of Factorials in the first place.
It's instructive to note that every call into the method results in either a base case being triggered or a call to the same method where the parameters are closer to a base case (often called a recursive call). If this is not the case then the method will run forever.
Recursion is solving a problem with a function that calls itself. A good example of this is a factorial function. Factorial is a math problem where factorial of 5, for example, is 5 * 4 * 3 * 2 * 1. This function solves this in C# for positive integers (not tested - there may be a bug).
public int Factorial(int n)
{
if (n <= 1)
return 1;
return n * Factorial(n - 1);
}
Recursion refers to a method which solves a problem by solving a smaller version of the problem and then using that result plus some other computation to formulate the answer to the original problem. Often times, in the process of solving the smaller version, the method will solve a yet smaller version of the problem, and so on, until it reaches a "base case" which is trivial to solve.
For instance, to calculate a factorial for the number X, one can represent it as X times the factorial of X-1. Thus, the method "recurses" to find the factorial of X-1, and then multiplies whatever it got by X to give a final answer. Of course, to find the factorial of X-1, it'll first calculate the factorial of X-2, and so on. The base case would be when X is 0 or 1, in which case it knows to return 1 since 0! = 1! = 1.
Consider an old, well known problem:
In mathematics, the greatest common divisor (gcd) … of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.
The definition of gcd is surprisingly simple:
gcd(m, 0) = m
gcd(m, n) = gcd(n, m mod n)
where mod is the modulo operator (that is, the remainder after integer division).
In English, this definition says the greatest common divisor of any number and zero is that number, and the greatest common divisor of two numbers m and n is the greatest common divisor of n and the remainder after dividing m by n.
If you'd like to know why this works, see the Wikipedia article on the Euclidean algorithm.
Let's compute gcd(10, 8) as an example. Each step is equal to the one just before it:
gcd(10, 8)
gcd(8, 10 mod 8)
gcd(8, 2)
gcd(2, 8 mod 2)
gcd(2, 0)
2
In the first step, 8 does not equal zero, so the second part of the definition applies. 10 mod 8 = 2 because 8 goes into 10 once with a remainder of 2. At step 3, the second part applies again, but this time 8 mod 2 = 0 because 2 divides 8 with no remainder. At step 5, the second argument is 0, so the answer is 2.
Did you notice that gcd appears on both the left and right sides of the equals sign? A mathematician would say this definition is recursive because the expression you're defining recurs inside its definition.
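If you want to see the definition as runnable code, a direct translation into Java might look like this sketch (assuming non-negative arguments):
// gcd(m, 0) = m;  gcd(m, n) = gcd(n, m mod n)
static int gcd(int m, int n) {
    if (n == 0) {
        return m;          // base case
    }
    return gcd(n, m % n);  // recursive case
}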
Recursive definitions tend to be elegant. For example, a recursive definition for the sum of a list is
sum l =
if empty(l)
return 0
else
return head(l) + sum(tail(l))
where head is the first element in a list and tail is the rest of the list. Note that sum recurs inside its definition at the end.
Maybe you'd prefer the maximum value in a list instead:
max l =
if empty(l)
error
elsif length(l) = 1
return head(l)
else
tailmax = max(tail(l))
if head(l) > tailmax
return head(l)
else
return tailmax
You might define multiplication of non-negative integers recursively to turn it into a series of additions:
a * b =
if b = 0
return 0
else
return a + (a * (b - 1))
If that bit about transforming multiplication into a series of additions doesn't make sense, try expanding a few simple examples to see how it works.
Merge sort has a lovely recursive definition:
sort(l) =
if empty(l) or length(l) = 1
return l
else
(left,right) = split l
return merge(sort(left), sort(right))
Recursive definitions are all around if you know what to look for. Notice how all of these definitions have very simple base cases, e.g., gcd(m, 0) = m. The recursive cases whittle away at the problem to get down to the easy answers.
With this understanding, you can now appreciate the other algorithms in Wikipedia's article on recursion!
A function that calls itself
When a function can be (easily) decomposed into a simple operation plus the same function on some smaller portion of the problem. I should say, rather, that this makes it a good candidate for recursion.
They do!
The canonical example is the factorial which looks like:
int fact(int a)
{
if (a <= 1) // base case (also guards against a == 0, which would otherwise recurse forever)
return 1;
return a * fact(a - 1);
}
In general, recursion isn't necessarily fast (function call overhead tends to be high because recursive functions tend to be small, see above) and can suffer from some problems (stack overflow anyone?). Some say they tend to be hard to get 'right' in non-trivial cases but I don't really buy into that. In some situations, recursion makes the most sense and is the most elegant and clear way to write a particular function. It should be noted that some languages favor recursive solutions and optimize them much more (LISP comes to mind).
A recursive function is one which calls itself. The most common reason I've found to use it is traversing a tree structure. For example, if I have a TreeView with checkboxes (think installation of a new program, "choose features to install" page), I might want a "check all" button which would be something like this (pseudocode):
function cmdCheckAllClick {
checkRecursively(TreeView1.RootNode);
}
function checkRecursively(Node n) {
n.Checked = True;
foreach ( n.Children as child ) {
checkRecursively(child);
}
}
So you can see that the checkRecursively first checks the node which it is passed, then calls itself for each of that node's children.
You do need to be a bit careful with recursion. If you get into an infinite recursive loop, you will get a Stack Overflow exception :)
I can't think of a reason why people shouldn't use it, when appropriate. It is useful in some circumstances, and not in others.
I think that because it's an interesting technique, some coders perhaps end up using it more often than they should, without real justification. This has given recursion a bad name in some circles.
Recursion is an expression directly or indirectly referencing itself.
Consider recursive acronyms as a simple example:
GNU stands for GNU's Not Unix
PHP stands for PHP: Hypertext Preprocessor
YAML stands for YAML Ain't Markup Language
WINE stands for Wine Is Not an Emulator
VISA stands for Visa International Service Association
More examples on Wikipedia
Recursion works best with what I like to call "fractal problems", where you're dealing with a big thing that's made of smaller versions of that big thing, each of which is an even smaller version of the big thing, and so on. If you ever have to traverse or search through something like a tree or nested identical structures, you've got a problem that might be a good candidate for recursion.
People avoid recursion for a number of reasons:
Most people (myself included) cut their programming teeth on procedural or object-oriented programming as opposed to functional programming. To such people, the iterative approach (typically using loops) feels more natural.
Those of us who cut our programming teeth on procedural or object-oriented programming have often been told to avoid recursion because it's error prone.
We're often told that recursion is slow. Calling and returning from a routine repeatedly involves a lot of stack pushing and popping, which is slower than looping. I think some languages handle this better than others, and those languages are most likely not those where the dominant paradigm is procedural or object-oriented.
For at least a couple of programming languages I've used, I remember hearing recommendations not to use recursion if it gets beyond a certain depth because its stack isn't that deep.
A recursive statement is one in which you define the process of what to do next as a combination of the inputs and what you have already done.
For example, take factorial:
factorial(6) = 6*5*4*3*2*1
But it's easy to see factorial(6) also is:
6 * factorial(5) = 6*(5*4*3*2*1).
So generally:
factorial(n) = n*factorial(n-1)
Of course, the tricky thing about recursion is that if you want to define things in terms of what you have already done, there needs to be some place to start.
In this example, we just make a special case by defining factorial(1) = 1.
Now we see it from the bottom up:
factorial(6) = 6*factorial(5)
= 6*5*factorial(4)
= 6*5*4*factorial(3)
= 6*5*4*3*factorial(2)
= 6*5*4*3*2*factorial(1)
= 6*5*4*3*2*1
Since we defined factorial(1) = 1, we reach the "bottom".
Generally speaking, recursive procedures have two parts:
1) The recursive part, which defines some procedure in terms of new inputs combined with what you've "already done" via the same procedure. (i.e. factorial(n) = n*factorial(n-1))
2) A base part, which makes sure that the process doesn't repeat forever by giving it some place to start (i.e. factorial(1) = 1)
It can be a bit confusing to get your head around at first, but just look at a bunch of examples and it should all come together. If you want a much deeper understanding of the concept, study mathematical induction. Also, be aware that some languages optimize for recursive calls while others do not. It's pretty easy to make insanely slow recursive functions if you're not careful, but there are also techniques to make them performant in most cases.
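One of those techniques is memoization: cache values you have already computed so repeated recursive calls become cheap. A rough sketch (my own example) using the classic Fibonacci case:
import java.util.HashMap;
import java.util.Map;

class MemoFib {
    private static final Map<Integer, Long> cache = new HashMap<>();

    // Naive recursive Fibonacci is exponential; with the cache each value is
    // computed only once, so the whole computation becomes linear.
    static long fib(int n) {
        if (n <= 1) return n;
        Long cached = cache.get(n);
        if (cached != null) return cached;
        long value = fib(n - 1) + fib(n - 2);
        cache.put(n, value);
        return value;
    }
}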
Hope this helps...
I like this definition:
In recursion, a routine solves a small part of a problem itself, divides the problem into smaller pieces, and then calls itself to solve each of the smaller pieces.
I also like Steve McConnell's discussion of recursion in Code Complete, where he criticises the examples used in computer-science books on recursion.
Don't use recursion for factorials or Fibonacci numbers
One problem with computer-science textbooks is that they present silly examples of recursion. The typical examples are computing a factorial or computing a Fibonacci sequence. Recursion is a powerful tool, and it's really dumb to use it in either of those cases. If a programmer who worked for me used recursion to compute a factorial, I'd hire someone else.
I thought this was a very interesting point to raise and may be a reason why recursion is often misunderstood.
EDIT:
This was not a dig at Dav's answer - I had not seen that reply when I posted this
1.)
A method is recursive if it can call itself; either directly:
void f() {
... f() ...
}
or indirectly:
void f() {
... g() ...
}
void g() {
... f() ...
}
2.) When to use recursion
Q: Does using recursion usually make your code faster?
A: No.
Q: Does using recursion usually use less memory?
A: No.
Q: Then why use recursion?
A: It sometimes makes your code much simpler!
3.) People use recursion only when it is very complex to write iterative code. For example, tree traversal techniques like preorder and postorder can be written both iteratively and recursively. But usually we use the recursive form because of its simplicity.
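For illustration, a recursive preorder traversal is only a few lines (a sketch with a made-up Node class, not from any particular library):
class Node {
    int value;
    Node left, right;

    // Preorder: visit the node itself, then the left subtree, then the right subtree.
    static void preorder(Node node) {
        if (node == null) return;        // base case: empty subtree
        System.out.println(node.value);  // "visit"
        preorder(node.left);
        preorder(node.right);
    }
}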
Here's a simple example: how many elements in a set. (there are better ways to count things, but this is a nice simple recursive example.)
First, we need two rules:
if the set is empty, the count of items in the set is zero (duh!).
if the set is not empty, the count is one plus the number of items in the set after one item is removed.
Suppose you have a set like this: [x x x]. let's count how many items there are.
the set is [x x x] which is not empty, so we apply rule 2. the number of items is one plus the number of items in [x x] (i.e. we removed an item).
the set is [x x], so we apply rule 2 again: one + number of items in [x].
the set is [x], which still matches rule 2: one + number of items in [].
Now the set is [], which matches rule 1: the count is zero!
Now that we know the answer in step 4 (0), we can solve step 3 (1 + 0)
Likewise, now that we know the answer in step 3 (1), we can solve step 2 (1 + 1)
And finally now that we know the answer in step 2 (2), we can solve step 1 (1 + 2) and get the count of items in [x x x], which is 3. Hooray!
We can represent this as:
count of [x x x] = 1 + count of [x x]
= 1 + (1 + count of [x])
= 1 + (1 + (1 + count of []))
= 1 + (1 + (1 + 0))
= 1 + (1 + (1))
= 1 + (2)
= 3
When applying a recursive solution, you usually have at least 2 rules:
the basis, the simple case which states what happens when you have "used up" all of your data. This is usually some variation of "if you are out of data to process, your answer is X"
the recursive rule, which states what happens if you still have data. This is usually some kind of rule that says "do something to make your data set smaller, and reapply your rules to the smaller data set."
If we translate the above to pseudocode, we get:
numberOfItems(set)
if set is empty
return 0
else
remove 1 item from set
return 1 + numberOfItems(set)
There's a lot more useful examples (traversing a tree, for example) which I'm sure other people will cover.
Well, that's a pretty decent definition you have. And wikipedia has a good definition too. So I'll add another (probably worse) definition for you.
When people refer to "recursion", they're usually talking about a function they've written which calls itself repeatedly until it is done with its work. Recursion can be helpful when traversing hierarchies in data structures.
An example: A recursive definition of a staircase is:
A staircase consists of:
- a single step and a staircase (recursion)
- or only a single step (termination)
To recurse on a solved problem: do nothing, you're done.
To recurse on an open problem: do the next step, then recurse on the rest.
In plain English:
Assume you can do 3 things:
Take one apple
Write down tally marks
Count tally marks
You have a lot of apples in front of you on a table and you want to know how many apples there are.
start
Is the table empty?
yes: Count the tally marks and cheer like it's your birthday!
no: Take 1 apple and put it aside
Write down a tally mark
goto start
The process of repeating the same thing till you are done is called recursion.
I hope this is the "plain english" answer you are looking for!
A recursive function is a function that contains a call to itself. A recursive struct is a struct that contains an instance of itself. You can combine the two as a recursive class. The key part of a recursive item is that it contains an instance/call of itself.
Consider two mirrors facing each other. We've seen the neat infinity effect they make. Each reflection is an instance of a mirror, which is contained within another instance of a mirror, etc. The mirror containing a reflection of itself is recursion.
A binary search tree is a good programming example of recursion. The structure is recursive with each Node containing 2 instances of a Node. Functions to work on a binary search tree are also recursive.
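As a sketch (with a made-up node class), a recursive search mirrors the recursive structure of the tree directly:
class BstNode {
    int value;
    BstNode left, right;

    // Recursive membership test on a binary search tree.
    static boolean contains(BstNode node, int target) {
        if (node == null) return false;            // base case: ran off the tree
        if (target == node.value) return true;
        return target < node.value
                ? contains(node.left, target)      // smaller values live on the left
                : contains(node.right, target);    // larger values live on the right
    }
}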
This is an old question, but I want to add an answer from a logistical point of view (i.e. not from an algorithm-correctness or performance point of view).
I use Java for work, and Java doesn't support nested functions. As such, if I want to use recursion, I may have to define an external helper method (which exists only because my code bumps against Java's bureaucratic rules), or I may have to refactor the code altogether (which I really hate to do).
Thus, I often avoid recursion and use an explicit stack instead, because recursion itself is essentially a stack operation.
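For example, a depth-first walk that would normally be recursive can be driven by an explicit java.util.ArrayDeque instead; a sketch (with a made-up TreeNode class):
import java.util.ArrayDeque;
import java.util.Deque;

class TreeNode {
    int value;
    TreeNode left, right;

    // Depth-first (preorder) walk without recursion: the Deque plays the role
    // the call stack would otherwise play.
    static void visitAll(TreeNode root) {
        Deque<TreeNode> stack = new ArrayDeque<>();
        if (root != null) stack.push(root);
        while (!stack.isEmpty()) {
            TreeNode node = stack.pop();
            System.out.println(node.value);                 // "visit"
            if (node.right != null) stack.push(node.right); // pushed first => visited last
            if (node.left != null) stack.push(node.left);
        }
    }
}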
You want to use it anytime you have a tree structure. It is very useful in reading XML.
Recursion as it applies to programming is basically calling a function from inside its own definition (inside itself), with different parameters so as to accomplish a task.
"If I have a hammer, make everything look like a nail."
Recursion is a problem-solving strategy for huge problems, where at every step just, "turn 2 small things into one bigger thing," each time with the same hammer.
Example
Suppose your desk is covered with a disorganized mess of 1024 papers. How do you make one neat, clean stack of papers from the mess, using recursion?
Divide: Spread all the sheets out, so you have just one sheet in each "stack".
Conquer:
Go around, putting each sheet on top of one other sheet. You now have stacks of 2.
Go around, putting each 2-stack on top of another 2-stack. You now have stacks of 4.
Go around, putting each 4-stack on top of another 4-stack. You now have stacks of 8.
... on and on ...
You now have one huge stack of 1024 sheets!
Notice that this is pretty intuitive, aside from counting everything (which isn't strictly necessary). You might not go all the way down to 1-sheet stacks, in reality, but you could and it would still work. The important part is the hammer: With your arms, you can always put one stack on top of the other to make a bigger stack, and it doesn't matter (within reason) how big either stack is.
Recursion is the process where a method calls itself to perform a certain task. It reduces redundancy in code. Most recursive functions or methods must have a condition to break the recursive call, i.e. stop it from calling itself once the condition is met; this prevents the creation of an infinite loop. Not all functions are suited to being used recursively.
Hey, sorry if my opinion agrees with someone; I'm just trying to explain recursion in plain English.
Suppose you have three managers - Jack, John and Morgan.
Jack manages 2 programmers, John 3, and Morgan 5.
You are going to give every manager $300 and want to know what it would cost.
The answer is obvious - but what if 2 of Morgan's employees are also managers?
HERE comes the recursion.
You start from the top of the hierarchy. The total cost is $0.
You start with Jack,
then check whether he has any managers among his employees. If you find any, check whether they in turn have any managers among their employees, and so on. Add $300 to the total cost every time you find a manager.
When you are finished with Jack, go to John and his employees, and then to Morgan.
You'll never know how many cycles you'll go through before getting an answer, though you do know how many managers you have and how much budget you can spend.
Recursion is a tree, with branches and leaves, called parents and children respectively.
When you use a recursion algorithm, you more or less consciously are building a tree from the data.
In plain English, recursion means repeating something again and again.
In programming, one example is calling a function within itself.
Look at the following example of calculating the factorial of a number:
public int fact(int n)
{
if (n == 0) return 1;
else return n * fact(n - 1);
}
Any algorithm exhibits structural recursion on a datatype if it basically consists of a switch-statement with a case for each case of the datatype.
for example, when you are working on a type
tree = null
| leaf(value:integer)
| node(left: tree, right:tree)
a structural recursive algorithm would have the form
function computeSomething(x : tree) =
if x is null: base case
if x is leaf: do something with x.value
if x is node: do something with x.left,
do something with x.right,
combine the results
this is really the most obvious way to write any algorithm that works on a data structure.
now, when you look at the integers (well, the natural numbers) as defined using the Peano axioms
integer = 0 | succ(integer)
you see that a structural recursive algorithm on integers looks like this
function computeSomething(x : integer) =
if x is 0 : base case
if x is succ(prev) : do something with prev
the too-well-known factorial function is about the most trivial example of
this form.
A function that calls itself or uses its own definition.

Why does my processing time drop when running the same function over and over again (with incremented values)?

I was testing a new method to replace my old one and did some speed testing.
When I now look at the graph I see that the time it takes per iteration drops drastically.
Now I'm wondering why that might be.
My guess would be that my graphics card takes over the heavy work, but the first function iterates n times and the second (the blue one) doesn't have a single iteration, just "heavy" calculation work with doubles.
In case system details are needed:
OS: Mac OS X 10.10.4
Core: 2.8 GHz Intel Core i7 (4x)
GPU: AMD Radeon R9 M370X 2048 MB
If you need the two functions:
New One:
private static int sumOfI(int i) {
int factor;
float factor_ = (i + 1) / 2;
factor = (int) factor_;
return (i % 2 == 0) ? i * factor + i / 2 : i * factor;
}
Old One:
private static int sumOfIOrdinary(int j) {
int result = 0;
for (int i = 1; i <= j; i++) {
result += i;
}
return result;
}
To clarify my question:
Why does the processing time drop that drastically?
Edit:
I understand at least a little bit about cost and such. I probably didn't explain my test method well enough. I have a simple for loop which in this test counted from 0 to 1000, and I fed each value to one method and recorded the time it took (for the whole loop to execute); then I did the same with the other method.
So after the loop reached about 500 the same method took significantly less time to execute.
Java does not calculate anything on the graphics card (without help from other frameworks or classes). Also, what you think is a "heavy" calculation is actually easy for a CPU these days (even if division is a bit tricky). So speed depends on the bytecode generated and the optimisations the JVM applies when running a program, and mostly on the Big-O behaviour.
Your method sumOfI is just x statements to execute, so it is O(1); regardless of how large your i is, it's always only those x statements. But sumOfIOrdinary uses a loop and is O(n); it will execute y statements plus i statements, depending on the input.
So in theory, and in the worst case, sumOfI is always faster than sumOfIOrdinary.
You can also see this in the bytecode view. sumOfI is only some load, add and multiply instructions for the CPU. But for a loop, the bytecode also uses a goto and needs to jump back to an earlier address and execute lines again, and this costs time.
On my VM, with i=500000 the first method needs <1 millisecond and the second method, because of the loop, takes 2-4 milliseconds.
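As an aside, the constant-time method is just the closed-form sum 1 + 2 + ... + i = i*(i+1)/2, which could be written more directly (my own variant; it uses long arithmetic internally to delay overflow):
private static int sumClosedForm(int i) {
    // Gauss' formula: 1 + 2 + ... + i = i * (i + 1) / 2.
    return (int) ((long) i * (i + 1) / 2);
}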
Links to explain Big-O-Notation:
Simple Big O Notation
A beginner's guide to Big O notation

What's wrong with using associativity by compilers?

Sometimes associativity can be used to relax data dependencies, and I was curious how much it can help. I was rather surprised to find out that I can get nearly a factor-of-4 speed-up by manually unrolling a trivial loop, both in Java (build 1.7.0_51-b13) and in C (gcc 4.4.3).
So either I'm doing something pretty stupid or the compilers ignore a powerful tool. I started with
int a = 0;
for (int i=0; i<N; ++i) a = M1 * a + t[i];
which computes something close to String.hashCode() (set M1=31 and use a char[]). The computation is pretty trivial and for t.length=1000 takes about 1.2 microseconds on my i5-2400 @ 3.10 GHz (both in Java and C).
Observe that each two steps a gets multiplied by M2 = M1*M1 and added something. This leads to this piece of code
int a = 0;
int i = 0;
for (; i + 1 < N; i += 2) {
a = M2 * a + (M1 * t[i] + t[i+1]); // <-- note the parentheses!
}
if (i < N) a = M1 * a + t[i]; // Handle odd length.
This is exactly twice as fast as the first snippet. Strangely, leaving out the parentheses eats 20% of the speed-up. Funnily enough, this can be repeated and a factor of 3.8 can be achieved.
Unlike Java, gcc -O3 chooses not to unroll the loop. It's a wise choice, since it wouldn't help anyway (as -funroll-all-loops shows).
So my question1 is: What prevents such an optimization?
Googling didn't work, I got "associative arrays" and "associative operators" only.
Update
I polished up my benchmark a little bit and can provide some results now. There's no speedup beyond unrolling 4 times, probably because of multiplication and addition together taking 4 cycles.
Update 2
As Java already unrolls the loop, all the hard work is done. What we get is something like
...pre-loop
for (int i=0; i<N; i+=2) {
a2 = M1 * a + t[i];
a = M1 * a2 + t[i+1];
}
...post-loop
where the interesting part can be rewritten like
a = M1 * ((M1 * a) + t[i]) + t[i+1]; // latency 2mul + 2add
This reveals that there are 2 multiplications and 2 additions, all of them to be performed sequentially, thus needing 8 cycles on a modern x86 CPU. All we need now is some primary school math (working for ints even in case of overflow or whatever, but not applicable to floating point).
a = ((M1 * (M1 * a)) + (M1 * t[i])) + t[i+1]; // latency 2mul + 2add
So far we gained nothing, but it allows us to fold the constants
a = ((M2 * a) + (M1 * t[i])) + t[i+1]; // latency 1mul + 2add
and gain even more by regrouping the sum
a = (M2 * a) + ((M1 * t[i]) + t[i+1]); // latency 1mul + 1add
Here is how I understand your two cases: In the first case, you have a loop that takes N steps; in the second case you manually merged two consecutive iterations of the first case into one, so you only need to do N/2 steps in the second case. Your second case runs faster and you are wondering why the dumb compiler couldn't do it automatically.
There is nothing that would prevent the compiler from doing such an optimization. But please note that this re-write of the original loop leads to larger executable size: You have more instructions inside the for loop and the additional if after the loop.
If N=1 or N=3, the original loop is likely to be faster (less branching and better caching/prefetching/branch prediction). It made things faster in your case, but it may make things slower in other cases. It is not clear-cut whether this optimization is worth doing, and it can be highly nontrivial to implement in a compiler.
By the way, what you have done is very similar to loop vectorization but in your case, you did the parallel step manually and plugged-in the result. Eric Brumer's Compiler Confidential talk will give you insight why rewriting loops in general is tricky and what drawbacks / disadvantages there are (larger executable size, potentially slower in some cases). So compiler writers are very well aware of this optimization possibility and are actively working on it but it is highly nontrivial in general and can also make things slower.
Please try something for me:
int a = 0;
for (int i=0; i<N; ++i)
a = ((a<<5) - a) + t[i];
assuming M1=31. In principle, the compiler should be smart enough to rewrite 31*a into (a<<5)-a but I am curious if it really does that.

Which is the best way to implement prime number finding algorithms in Java? How do we make library classes and use then in Java?

I want to make library classes in Java and use them in my future programs. I want these library classes to find prime numbers upto a certain number or even the next prime number or you can say solve most of the basic things related to prime numbers.
I have never made a Java library class. I aim to learn by doing this. Please help me with that by pointing out a tutorial or something. I am familiar with the NetBeans IDE.
I found a few algorithms, like the Sieve of Eratosthenes and the Sieve of Atkin. It would be great if you could point out a few more such efficient algorithms. I don't want them to be the best, but at least good enough. My aim is to learn a few things by implementing them. Because I have little practical coding experience, I want to do this to improve my skills.
My friend suggested that I use stream classes, and he was talking about implementing it by giving the output of one file as an input to another to make my code clean. I didn't understand him very well. Please pardon me if I said anything wrong. What I want to ask at this point is: is that an efficient and OO way of doing what I want to do? If yes, please tell me how to do it, and if not, please point out some other way.
I have basic knowledge of the Java language. What I want to accomplish through this venture is gain coding experience because that is what everyone out here suggested, "to take up small things like these and learn on my own"
thanks to all of you in advance
regards
shahensha
EDIT:
In the Sieve of Eratosthenes and others we are required to store the numbers from 2 to n in a data structure. Where should I store them? I know I can use a dynamic Collection, but just a small question... If I want to find primes in the order of billions or even more (I will use BigInteger, no doubt), all of this will get stored on the heap, right? Is there a fear of overflow? Even if there isn't, is it good practice? Or would it be better to store the numbers or the list (on which we will perform actions depending on the algorithm we use) in a file and access it there? Sorry if my question was too noobish...
"Sieve of Eratosthenes" is good algorithm to find the prime numbers. If you will use google you can find ready implementation in java.
I'll add some thoughts to this:
There's nothing technically different about a library class; it's simply how you use it. To my mind, the most important thing is that you think hard about your public API. Make it big enough to be useful to your prospective callers, keep it small enough that you have freedom to change the internal implementation as you see fit, and ensure that you have a good understanding of what your library does do and what it doesn't do. Don't try to do everything, just do one thing well. (And the API generally extends to documentation too; make sure you write decent Javadocs.)
Start with either of these as they are fine. If you design your API well, you can change this at any time and roll out version 1.1 that uses a different algorithm (or even uses JNI to call a native C library), and your callers can just drop in the new JAR and use your code without even recompiling. Don't forget that premature optimisation is the root of all evil; don't worry too much about making your first version fast, but focus on making it correct and making it clean.
I'm not sure why your friend was suggesting streams. Streams are a way of dealing with input and output of raw bytes - useful when reading from files or network connections, but generally not a good way to call another Java method. Your library shouldn't worry about input and output, it just needs to offer some methods for numerical calculations. So you should implement methods that take integers (or whatever is appropriate) and return integers.
For instance, you might implement:
/**
* Calculates the next prime number after a given point.
*
* Implementation detail: callers may assume that prime numbers are
* calculated deterministically, such that the efficiency of calling
* this method with a large parameter is not dramatically worse than
* calling it with a small parameter.
*
* @param x The lower bound (exclusive) of the prime number to return.
* Must be strictly positive.
* @return Colloquially, the "next" prime number after the given parameter.
* More formally, this number will be prime and there are no prime numbers
* less than this value and greater than <code>x</code> that are also
* prime.
* @throws IllegalArgumentException if <code>x</code> is not strictly
* positive.
*/
public long smallestPrimeGreaterThan(long x);
/**
* Returns all prime numbers within a given range, in order.
*
* @param lowerBound The lower bound (exclusive) of the range.
* @param upperBound The upper bound (exclusive) of the range.
* @return A List of the prime numbers that are strictly between the
* given parameters. This list is in ascending order. The returned
* value is never null; if no prime numbers exist within the given
* range, then an empty list is returned.
*/
public List<Long> primeNumbersBetween(long lowerBound, long upperBound);
No streams in sight! Uses of streams, such as outputting to the console, should be handled by applications that use your library and not by your library itself. This is what I meant in my first point about being clear of what your library does and doesn't do. You just generate the prime numbers; it's up to the caller to then do something cool with them.
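A first, deliberately naive implementation of that API might look like the sketch below (trial division only; it assumes these methods live in the same class as the Javadoc above, with java.util.List and java.util.ArrayList imported, and you can swap in a sieve later without changing the signatures):
public long smallestPrimeGreaterThan(long x) {
    if (x < 1) throw new IllegalArgumentException("x must be strictly positive");
    long candidate = x + 1;
    while (!isPrime(candidate)) {
        candidate++;
    }
    return candidate;
}

public List<Long> primeNumbersBetween(long lowerBound, long upperBound) {
    List<Long> primes = new ArrayList<>();
    for (long n = lowerBound + 1; n < upperBound; n++) {
        if (isPrime(n)) primes.add(n);
    }
    return primes;
}

// Simple trial division; fine for a first version, replace with a sieve later.
private boolean isPrime(long n) {
    if (n < 2) return false;
    for (long d = 2; d * d <= n; d++) {
        if (n % d == 0) return false;
    }
    return true;
}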
But when you compare, the sieve of Atkin is faster than the sieve of Eratosthenes:
http://en.wikipedia.org/wiki/Prime_number_counting_function Also refer to this link where different functions are explained clearly :)
Good luck..
There is no such thing as a "library class". I suppose you mean to write a class in such a way that it does its job in a reusable way. The way to do this is to have a clean interface - with minimal (if any) bindings to other libraries or to your execution environment (your main class etc.).
The two you mention are "good enough". For your purpose you don't need to look any further.
Just read from System.in and write to System.out and that's it. Though, in your case, there is nothing to read.
To achieve what I think is your goal, you need to write a main class that handles the execution environment - the main function, initializing your algorithm, iteratively looking for the next prime, and writing it to System.out. Of course, you'll need another class to implement the algorithm. It should contain the internal state and provide a method for finding the next prime.
IMO, keep aside the thought that you're making a library (a .jar file, according to my interpretation of this question).
Focus on creating a simple Java class first, like this:
// PrimeSieve.java (Sieve of Eratosthenes)
public class PrimeSieve{
public static void main(String args[])
{
int N = Integer.parseInt(args[0]);
// initially assume all integers are prime
boolean[] isPrime = new boolean[N + 1];
for (int i = 2; i <= N; i++) {
isPrime[i] = true;
}
// mark non-primes <= N using Sieve of Eratosthenes
for (int i = 2; i*i <= N; i++) {
// if i is prime, then mark multiples of i as nonprime
// suffices to consider multiples i, i+1, ..., N/i
if (isPrime[i]) {
for (int j = i; i*j <= N; j++) {
isPrime[i*j] = false;
}
}
}
// count primes
int primes = 0;
for (int i = 2; i <= N; i++) {
if (isPrime[i]) primes++;
}
System.out.println("The number of primes <= " + N + " is " + primes);
}
}
Now, the next step; Implementing it for larger values, you can always use BigInteger. SO questions pertaining to the same:
Java BigInteger Prime numbers
Problems with java.math.BigInteger
BigNums Implementation
Try reading all questions related to BigInteger class on SO, BigInteger Tagged questions.
Hope this helps.

Fast 4x4 matrix multiplication in Java with NIO float buffers

I know there are a LOT of questions like this, but I can't find one specific to my situation. I have 4x4 matrices implemented as NIO float buffers (these matrices are used for OpenGL). Now I want to implement a multiply method which multiplies matrix A with matrix B and stores the result in matrix C. So the code may look like this:
class Matrix4f
{
private FloatBuffer buffer = FloatBuffer.allocate(16);
public Matrix4f multiply(Matrix4f matrix2, Matrix4f result)
{
{{{result = this * matrix2}}} <-- I need this code
return result;
}
}
What is the fastest possible code to do this multiplication? Some OpenGL implementations (like the OpenGL ES stuff in Android) provide native code for this, but others don't. So I want to provide a generic multiplication method for these implementations.
The real answer is of course to test different implementations and check which one is fastest.
My guess, without testing, would be that as the matrices are so small, expanding the loops by hand would result in the fastest code. E.g. something like
result[0][0] = this[0][0] * matrix2[0][0] + this[0][1] * matrix2[1][0]
+ this[0][2] * matrix2[2][0] + this[0][3] * matrix2[3][0];
result[0][1] = // ... and so forth
or then maybe just unroll the innermost loop, and retain the two outermost ones to save some typing as well as I$.
Go through the FloatBuffer.array() if that operation is supported. Then just perform the necessary multiplications through that array, and return the resulting matrix.
Have a look at GameDev.net - Matrix Math for the exact computations.
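A plain-Java fallback could look like this sketch (my own code; it assumes the OpenGL-style column-major layout, index = column*4 + row, and uses absolute get/put so it also works for direct buffers that don't support array(); result must be a different object from this and matrix2):
public Matrix4f multiply(Matrix4f matrix2, Matrix4f result)
{
    FloatBuffer a = this.buffer;
    FloatBuffer b = matrix2.buffer;
    FloatBuffer c = result.buffer;
    for (int col = 0; col < 4; col++) {
        for (int row = 0; row < 4; row++) {
            float sum = 0.0f;
            for (int k = 0; k < 4; k++) {
                // element (row, k) of A times element (k, col) of B
                sum += a.get(k * 4 + row) * b.get(col * 4 + k);
            }
            c.put(col * 4 + row, sum);
        }
    }
    return result;
}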
If you want to optimize it further, you could try Strassen's algorithm. You wouldn't even need to pad your matrices, since they are square and of a size that is a power of 2.
