Matrix multiplication using Jblas: Matrices need to have same length

I am using Java + Jblas (first time user) and am trying to multiply two matrices. One is a 163x4 and the other is 4x1 matrix. I would expect the result of such a multiplication to be a 163x1 matrix. However using:
FloatMatrix a = b.mmuli(c);
I am getting the error:
Matrices must have same length (is: 652 and 4)
While I assume this makes perfect sense to the program, I am confused. The same multiplication worked fine in Octave (which of course applies some magic). To get this working, I need to know what kind of sorcery that is.
EDIT
So here is what the Octave Documentation says about broadcasting (the sorcery):
In case all dimensions are equal, no broadcasting occurs and ordinary
element-by-element arithmetic takes place. For arrays of higher
dimensions, if the number of dimensions isn’t the same, then missing
trailing dimensions are treated as 1. When one of the dimensions is 1,
the array with that singleton dimension gets copied along that
dimension until it matches the dimension of the other array.
So this means I just copy the 4x1 matrix 163 times. Then I can execute the multiplication, but instead of the 163x1 matrix I wanted, I now have a 163x4 matrix. Which for me is strange. What is my solution now?

So I finally figured it out. And it's one of those mistakes... It has to be
FloatMatrix a = b.mmul(c);
The element wise multiplication was the error here.
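To make the shape difference concrete, here is a minimal plain-Java sketch (it does not use jblas; the class and method names are made up for illustration). It contrasts a matrix product, which is what `b.mmul(c)` from the fix above computes, with an element-wise product, which needs both operands to have the same shape.

```java
// Plain-Java illustration only; MatrixProductSketch and its methods are hypothetical names.
final class MatrixProductSketch {

    /** Matrix product: (n x k) times (k x m) gives (n x m). */
    static float[][] matMul(float[][] a, float[][] b) {
        int n = a.length, k = b.length, m = b[0].length;
        if (a[0].length != k) {
            throw new IllegalArgumentException("inner dimensions differ");
        }
        float[][] out = new float[n][m];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                float sum = 0f;
                for (int p = 0; p < k; p++) {
                    sum += a[i][p] * b[p][j];
                }
                out[i][j] = sum;
            }
        }
        return out;
    }

    /** Element-wise product: both operands must have exactly the same shape. */
    static float[][] elementWise(float[][] a, float[][] b) {
        if (a.length != b.length || a[0].length != b[0].length) {
            throw new IllegalArgumentException("shapes differ");
        }
        float[][] out = new float[a.length][a[0].length];
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[0].length; j++) {
                out[i][j] = a[i][j] * b[i][j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] b = new float[163][4];
        float[][] c = new float[4][1];
        float[][] a = matMul(b, c);
        System.out.println(a.length + "x" + a[0].length); // 163x1
    }
}
```

With a 163x4 and a 4x1 operand, `matMul` returns a 163x1 result, while `elementWise` rejects the pair because the shapes differ.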

Related

Change Code from DCT to Inverse Discrete Cosine Transformation?

For a project I have to demonstrate JPEG compression and therefore the conversion with DCT-II and IDCT. I have no idea how to implement this formula. But I found a website that provides the Java code and an online IDE for testing.
https://ide.geeksforgeeks.org/FnC3bRJEAr here you can see the code.
(formulas from Wikipedia/JPEG)
So, what changes have to be made to the code?
I tried switching the for-loops and the variables in the formula, but the values I got were definitely wrong, and other attempts led to error messages.

The only difference between the DCT and the IDCT is where the coefficients are taken into account.
You should replace line 46 in your code by
sum = sum + ck*cl*dct1;
where ck and cl are computed as in lines 24-34, but for k and l.
Also remove ci*cj in line 49.
BTW, this Java code is exceptionally inefficient. Precompute Math.sqrt(2) and Math.sqrt(n), and put your cosines in a table, and it will be at least 3 times faster.

Your summations are doing a matrix multiplication. You are multiplying an 8x8 data matrix by an 8x8 DCT matrix.
The DCT matrix is orthonormal so its inverse is its transpose.
You should therefore be able to invert by exchanging u and v.
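As a rough sketch of the "inverse is the transpose" idea (this is not the code from the linked IDE; the class name, method names and the 8x8 block size are assumptions for illustration), the orthonormal DCT-II matrix can be built once from precomputed cosines and then used in both directions:

```java
// Hypothetical sketch: forward and inverse 2D DCT via an orthonormal 8x8 DCT-II matrix.
final class DctSketch {
    static final int N = 8;

    /** Orthonormal DCT-II matrix; row k is the k-th basis vector. Computed once. */
    static final double[][] C = dctMatrix();

    static double[][] dctMatrix() {
        double[][] c = new double[N][N];
        for (int k = 0; k < N; k++) {
            double scale = (k == 0) ? Math.sqrt(1.0 / N) : Math.sqrt(2.0 / N);
            for (int x = 0; x < N; x++) {
                c[k][x] = scale * Math.cos((2 * x + 1) * k * Math.PI / (2.0 * N));
            }
        }
        return c;
    }

    static double[][] multiply(double[][] a, double[][] b) {
        double[][] out = new double[N][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) {
                double sum = 0;
                for (int p = 0; p < N; p++) sum += a[i][p] * b[p][j];
                out[i][j] = sum;
            }
        return out;
    }

    static double[][] transpose(double[][] a) {
        double[][] out = new double[N][N];
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++) out[j][i] = a[i][j];
        return out;
    }

    /** Forward 2D DCT: C * block * C^T. */
    static double[][] forward(double[][] block) {
        return multiply(multiply(C, block), transpose(C));
    }

    /** Inverse 2D DCT: C^T * coefficients * C, valid because C is orthonormal. */
    static double[][] inverse(double[][] coeffs) {
        return multiply(multiply(transpose(C), coeffs), C);
    }
}
```

Calling `inverse(forward(block))` should reproduce the original block up to floating-point rounding.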

Normalized Iteration Count does not work. What am I doing wrong?

As you can see from the title, I'm busy writing a little program for visualizing fractals in Java. Anybody who deals with fractals eventually looks for a way to get rid of those stupid "bands" that appear when you just colour a pixel by the number of iterations it took to escape.
So I searched for a more advanced colouring algorithm, finding the "normalized iteration count". The formula I'm using is:
float loc = (float) 1 - Math.log(Math.log(c.abs())) / Math.log(2);
Everybody on the Internet is so happy about this algorithm, everybody uses it, everybody gets great results. Except me. I thought this algorithm should provide a float between 0 and 1, but that doesn't happen. I did some calculations and came to the conclusion that this formula only yields a value in that range for c.abs() >= Math.E && c.abs() <= Math.exp(2) (that is Math.E * Math.E).
In numbers this means, my input into this equation has to be between about 2.718 and 7.389.
But a complex number c is considered to tend towards infinity once its magnitude exceeds 2. For any input smaller than Math.E, I get a value greater than one, and for any input greater than Math.exp(2), the value is negative. That is the case if a complex number escapes really fast.
So please tell me: what am I doing wrong. I'm desperate.
Thanks.
EDIT:
I was wrong: the code I posted is correct, I just
1. used it the wrong way and so it didn't provide the right output.
2. had to set the bailout value of the mandelbrot/julia algorithm to 10, otherwise I would've got stupid bands again.
Problem solved!

As you've already discovered, you need to increase the bailout radius before smoothing will look right.
Two is the minimum length that a coordinate can have such that when you square it and add the initial value, it cannot result in a smaller length. If the previous length was 2.0, and you squared it, you'd have a length of 4.0 (pointing in whichever direction), and the most that any value of c could reduce that by is 2.0 (by pointing in precisely the opposite direction). If c were larger than that then it would start to escape right away.
Now, to estimate the fractional part of the number of iterations we look at the final |z|. If z had simply been squared and c not added to it, then it would have a length between 2.0 and 4.0 (the new value must be larger than 2.0 to bail out, and the old value must have been less than 2.0 to have not bailed out earlier).
Without c, taking |z|'s proportional position between 2 and 4 gives us a fractional part of the number of iterations. If |z| is close to 4 then the previous length must have been close to 2, so it was already close to bailing out in the previous iteration and the smoothed result should be close to the previous iteration count to represent that. If it's close to 2, then the previous iteration was further from bailing out, and so the smoothed result should be closer to the new iteration count.
Unfortunately c messes that up. The larger c is, the larger the potential error is in that simple relationship. Even if the old length was nearly at 2.0, it might have landed such that c's influence made it look like it must have been smaller.
Increasing the bailout mitigates the effect of adding c. If the bailout is 64 then the resulting length will be between 64 and 4096, and c's maximum offset of 2 has a proportionally much smaller impact on the result.

You have left out the iteration value, try this:
float loc = <iteration_value> + (float) 1 - Math.log(Math.log(c.abs())) / Math.log(2);
The iteration_value is the number of iterations which yielded c in the formula.
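Putting the pieces of this thread together, a minimal sketch could look like the following. The class name, the method signature and the choice to count iterations from 0 are assumptions for illustration; the variable the question calls c is the iterated value z here, and the bailout is passed in so that a larger radius (10, 64, ...) can be tried.

```java
// Hypothetical sketch of smooth (normalized) iteration counting for the Mandelbrot set.
final class SmoothColoringSketch {

    /**
     * Returns a fractional iteration count for the point cRe + i*cIm,
     * or maxIter if the point did not escape within maxIter iterations.
     */
    static double smoothIterations(double cRe, double cIm, int maxIter, double bailout) {
        double zr = 0, zi = 0;
        for (int n = 0; n < maxIter; n++) {
            double t = zr * zr - zi * zi + cRe; // real part of z^2 + c
            zi = 2 * zr * zi + cIm;             // imaginary part of z^2 + c
            zr = t;
            double mag = Math.hypot(zr, zi);
            if (mag > bailout) {
                // iteration count plus the fractional correction from the thread
                return n + 1 - Math.log(Math.log(mag)) / Math.log(2);
            }
        }
        return maxIter;
    }

    public static void main(String[] args) {
        System.out.println(smoothIterations(2.0, 0.0, 100, 10.0));
    }
}
```

The returned value is not confined to the range 0 to 1; it varies smoothly across neighbouring escape counts, so it still has to be mapped onto a palette by the caller.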

Hilbert sort by divide and conquer algorithm?

I'm trying to sort d-dimensional data vectors by their Hilbert order, for bulk-loading a spatial index.
However, I do not want to compute the Hilbert value for each point explicitly, which in particular requires fixing a precision in advance. For high-dimensional data this means something like 32*d bits, which becomes quite messy to do efficiently. When the data is distributed unevenly, some of these calculations are unnecessary, while other parts of the data set need extra precision.
Instead, I'm trying to do a partitioning approach. When you look at the 2D first order hilbert curve
1 4
| |
2---3
I'd split the data along the x-axis first, so that the first part (not necessarily containing half of the objects!) will consist of 1 and 2 (not yet sorted) and the second part will have objects from 3 and 4 only. Next, I'd split each half again, on the Y axis, but reverse the order in 3-4.
So essentially, I want to perform a divide-and-conquer strategy (closely related to QuickSort - on evenly distributed data this should even be optimal!), and only compute the necessary "bits" of the hilbert index as needed. So assuming there is a single object in "1", then there is no need to compute the full representation of it; and if the objects are evenly distributed, partition sizes will drop quickly.
I do know the usual textbook approach of converting to long, gray-coding, dimension interleaving. This is not what I'm looking for (there are plenty of examples of this available). I explicitly want a lazy divide-and-conquer sorting only. Plus, I need more than 2D.
Does anyone know of an article or hilbert-sorting algorithm that works this way? Or a key idea how to get the "rotations" right, which representation to choose for this? In particular in higher dimensionalities... in 2D it is trivial; 1 is rotated +y, +x, while 4 is -y,-x (rotated and flipped). But in higher dimensionalities this gets more tricky, I guess.
(The result should of course be the same as when sorting the objects by their hilbert order with a sufficiently large precision right away; I'm just trying to save the time computing the full representation when not needed, and having to manage it. Many people keep a hashmap "object to hilbert number" that is rather expensive.)
Similar approaches should be possible for Peano curves and Z-curve, and probably a bit easier to implement... I should probably try these first (Z-curve is already working - it indeed boils down to something closely resembling a QuickSort, using the appropriate mean/grid value as virtual pivot and cycling through dimensions for each iteration).
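For readers who want to see the Z-curve variant spelled out, here is a rough sketch of the idea as described above. It is not the asker's actual code: the class name, the explicit bounding box arguments and the depth limit are assumptions for illustration. Each recursion level partitions around the midpoint of the current cell in one dimension and then moves on to the next dimension.

```java
// Hypothetical sketch: lazy Z-order sort by recursive midpoint partitioning.
final class ZOrderSortSketch {
    private static final int MAX_DEPTH = 128; // guards against endless recursion on duplicate points

    /** Sorts pts in place into Z-order within the bounding box [min, max]. */
    static void sort(double[][] pts, double[] min, double[] max) {
        sort(pts, 0, pts.length, min.clone(), max.clone(), 0, 0);
    }

    private static void sort(double[][] pts, int lo, int hi,
                             double[] min, double[] max, int dim, int depth) {
        if (hi - lo < 2 || depth >= MAX_DEPTH) {
            return; // zero or one object: no further bits of the curve index are needed
        }
        double mid = 0.5 * (min[dim] + max[dim]); // virtual pivot: centre of the cell
        int split = partition(pts, lo, hi, dim, mid);
        int next = (dim + 1) % min.length;

        double oldMax = max[dim];
        max[dim] = mid;                            // lower half of the cell
        sort(pts, lo, split, min, max, next, depth + 1);
        max[dim] = oldMax;

        double oldMin = min[dim];
        min[dim] = mid;                            // upper half of the cell
        sort(pts, split, hi, min, max, next, depth + 1);
        min[dim] = oldMin;
    }

    /** Moves all points with coordinate below pivot to the front; returns the split index. */
    private static int partition(double[][] pts, int lo, int hi, int dim, double pivot) {
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (pts[j][dim] < pivot) {
                double[] tmp = pts[i];
                pts[i] = pts[j];
                pts[j] = tmp;
                i++;
            }
        }
        return i;
    }
}
```

A Hilbert version would additionally have to track the orientation of each cell (which half comes first, and which dimension is split next), which this sketch does not attempt.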
Edit: see below for how I solved it for Z and peano curves. It is also working for 2D Hilbert curves already. But I do not have the rotations and inversion right yet for Hilbert curves.

Use radix sort. Split each 1-dimensional index to d .. 32 parts, each of size 1 .. 32/d bits. Then (from high-order bits to low-order bits) for each index piece compute its Hilbert value and shuffle objects to proper bins.
This should work well with both evenly and unevenly distributed data, both Hilbert ordering or Z-order. And no multi-precision calculations needed.
One detail about converting index pieces to Hilbert order:
first extract necessary bits,
then interleave bits from all dimensions,
then convert 1-dimensional indexes to inverse Gray code.
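The two bit-level building blocks named in this list can be sketched as follows. The class and method names are hypothetical, and this covers only interleaving and the (inverse) Gray code; the rotations and inversions needed for a full Hilbert index are not included.

```java
// Hypothetical helpers for the bit manipulation steps; not a complete Hilbert index.
final class GrayCodeSketch {

    /** Binary-reflected Gray code of x. */
    static int gray(int x) {
        return x ^ (x >>> 1);
    }

    /** Inverse Gray code: undoes gray() by a prefix XOR over all 32 bits. */
    static int inverseGray(int g) {
        for (int shift = 1; shift < 32; shift <<= 1) {
            g ^= g >>> shift;
        }
        return g;
    }

    /**
     * Interleaves the low `bits` bits of every coordinate, highest bit first.
     * Within each group of bits, coords[0] ends up most significant.
     * Assumes bits * coords.length <= 64.
     */
    static long interleave(int[] coords, int bits) {
        long out = 0;
        for (int b = bits - 1; b >= 0; b--) {
            for (int c : coords) {
                out = (out << 1) | ((c >>> b) & 1);
            }
        }
        return out;
    }
}
```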
If the indexes are stored in doubles:
If indexes may be negative, add some value to make everything positive and thus simplify the task.
Determine the smallest integer power of 2 that is greater than all the indexes, and divide all indexes by this value
Multiply the index by 2^(necessary number of bits for current sorting step).
Truncate the result, convert it to integer, and use it for Hilbert ordering (interleave and compute the inverse Gray code)
Subtract the result, truncated on previous step, from the index: index = index - i
Coming to your variant of radix sort, I'd suggest extending zsort (to make hilbertsort out of zsort) with two binary arrays of size d (one used mostly as a stack, the other used to invert index bits) and the rotation value (used to rearrange dimensions).
If top value in the stack is 1, change pivotize(... ascending) to pivotize(... descending), and then for the first part of the recursion, push this top value to the stack, for second one - push the inverse of this value. This stack should be restored after each recursion. It contains the "decision tree" of last d recursions of radix sort procedure (in inverse Gray code).
After d recursions this "decision tree" stack should be used to recalculate both the rotation value and the array of inversions. The exact way how to do it is non-trivial. It may be found in the following links: hilbert.c or hilbert.c.

You can compute the Hilbert curve from f(x)=y directly without using recursion, L-systems, or divide and conquer. Basically it's a Gray code or Hamiltonian path traversal. You can find a good description at Nick's spatial index Hilbert curve quadtree blog or in the book Hacker's Delight. Or take a look at monotonic n-ary Gray codes. I've written an implementation in PHP, including a Moore curve.

I already answered this question (and others) but my answer(s) mysteriously disappeared. The Compact Hilbert Index implementation from http://code.google.com/p/uzaygezen/source/browse/trunk/core/src/main/java/com/google/uzaygezen/core/CompactHilbertCurve.java (method index()) already allows one to limit the number of Hilbert index bits computed up to a given level. Each iteration of the loop in that method computes a number of bits equal to the dimensionality of the space. You can easily refactor the for loop to compute just one level (i.e., a number of bits equal to the dimensionality of the space) at a time, going only as deep as needed to compare two numbers lexicographically by their Compact Hilbert Index.

Java Array Manipulation

I have a function named resize, which takes a source array and resizes it to a new width and height. The method I'm using is, I think, inefficient. I heard there's a better way to do it. Anyway, the code below works when scale is an int. However, there's a second function called half, which uses resize to shrink an image in half. So I made scale a double, and used a typecast to convert it back to an int. This method is not working, and I don't know what the error is (the teacher uses his own grading and tests on these functions, and it's not passing them). Can you spot the error, or is there a more efficient way to make a resize function?
public static int[][] resize(int[][] source, int newWidth, int newHeight) {
    int[][] newImage=new int[newWidth][newHeight];
    double scale=newWidth/(source.length);
    for(int i=0;i<newWidth/scale;i++)
        for(int j=0;j<newHeight/scale;j++)
            for (int s1=0;s1<scale;s1++)
                for (int s2=0;s2<scale;s2++)
                    newImage[(int)(i*scale+s1)][(int)(j*scale+s2)] =source[i][j];
    return newImage;
}
/**
* Half the size of the image. This method should be just one line! Just
* delegate the work to resize()!
*/
public static int[][] half(int[][] source) {
    int[][] newImage=new int[source.length/2][source[0].length/2];
    newImage=resize(source,source.length/2,source[0].length/2);
    return newImage;
}

So one scheme for changing the size of an image is to resample it (technically this is really the only way, every variation is really just a different kind of resampling function).
Cutting an image in half is super easy, you want to read every other pixel in each direction, and then load that pixel into the new half sized array. The hard part is making sure your bookkeeping is strong.
static int[][] halfImage(int[][] orig){
    int[][] hi = new int[orig.length/2][orig[0].length/2];
    // Bound the loops by the half-size array so odd dimensions do not overrun it.
    for(int r = 0, newr = 0; newr < hi.length; r += 2, newr++){
        for(int c = 0, newc = 0; newc < hi[0].length; c += 2, newc++){
            hi[newr][newc] = orig[r][c];
        }
    }
    return hi;
}
In the code above I'm indexing into the original image reading every other pixel in every other row starting at the 0th row and 0th column (assuming images are row major, here). Thus, r tells us which row in the original image we're looking at, and c tells us which column in the original image we're looking at. orig[r][c] gives us the "current" pixel.
Similarly, newr and newc index into the "half-image" matrix designated hi. For each increment in newr or newc we increment r and c by 2, respectively. By doing this, we skip every other pixel as we iterate through the image.
Writing a generalized resize routine that doesn't operate on nice fractional quantities (like 1/2, 1/4, 1/8, etc.) is really pretty hard. You'd need to define a way to determine the value of a sub-pixel -- a point between pixels -- for more complicated factors, like 0.13243, for example. This is, of course, easy to do, and you can develop a very naive linear interpolation principle, where when you need the value between two pixels you simply take the surrounding pixels, construct a line between their values, then read the sub-pixel point from the line. More complicated versions of interpolation might be a sinc based interpolation...or one of many others in widely published literature.
Blowing up the size of the image involves something a little different than we've done here (and if you do in fact have to write a generalized resize function you might consider splitting your function to handle upscaling and downscaling differently). You need to somehow create more values than you have originally -- those interpolation functions work for that too. A trivial method might simply be to repeat a value between points until you have enough, and slight variations on this as well, where you might take so many values from the left and so many from the right for a particular position.
What I'd encourage you to think about -- and since this is homework I'll stay away from the implementation -- is treating the scaling factor as something that causes you to make observations on one image, and writes on the new image. When the scaling factor is less than 1 you generally sample from the original image to populate the new image and ignore some of the original image's pixels. When the scaling factor is greater than 1, you generally write more often to the new image and might need to read the same value several times from the old image. (I'm doing a poor job highlighting the difference here, hopefully you see the dualism I'm getting at.)

What you have is pretty understandable, and I think it IS an O(n^4) algorithm. Ouchies!
You can improve it slightly by pushing the i*scale and j*scale out of the inner two loops - they are invariant where they are now. The optimizer might be doing it for you, however. There are also some other similar optimizations.
Regarding the error, run it twice, once with an input array that's got an even length (6x6) and another that's odd (7x7). And 6x7 and 7x6 while you're at it.

Based on your other question, it seems like you may be having trouble with mixing of types - with numeric conversions. One way to do this, which can make your code more debuggable and more readable to others not familiar with the problem space, would be to split the problematic line into multiple lines. Each minor operation would be one line, until you reach the final value. For example,
newImage[(int)(i*scale+s1)][(int)(j*scale+s2)] =source[i][j];
would become
double x = i * scale;
x += s1;
double y = j * scale;
y += s2;
newImage[(int) x][(int) y] = source[i][j];
Now, you can run the code in a debugger and look at the values of each item after each operation is performed. When a value doesn't match what you think it should be, look at it and figure out why.
Now, back to the suspected problem: I expect that you need to use doubles somewhere, not ints - in your other question you talked about scaling factors. Is the factor less than 1? If so, when it's converted to an int, it'll be 0, and you'll get the wrong result.
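The truncation described above is easy to reproduce. The sketch below first shows how an int division loses a factor smaller than 1, and then shows one possible way around it: mapping each destination pixel back to a source pixel with integer arithmetic only (nearest-neighbour sampling). This is a swapped-in technique for illustration, not the asker's method or whatever the grader expects, and the class name is made up.

```java
// Hypothetical sketch: int-division pitfall and a nearest-neighbour resize without a scale factor.
final class ResizeSketch {

    /** Every destination pixel reads exactly one source pixel; works for shrinking and enlarging. */
    static int[][] resize(int[][] source, int newWidth, int newHeight) {
        int srcW = source.length;
        int srcH = source[0].length;
        int[][] out = new int[newWidth][newHeight];
        for (int x = 0; x < newWidth; x++) {
            int sx = x * srcW / newWidth;       // always < srcW because x < newWidth
            for (int y = 0; y < newHeight; y++) {
                int sy = y * srcH / newHeight;  // always < srcH because y < newHeight
                out[x][y] = source[sx][sy];
            }
        }
        return out;
    }

    static int[][] half(int[][] source) {
        return resize(source, source.length / 2, source[0].length / 2);
    }

    public static void main(String[] args) {
        int newWidth = 2, oldWidth = 4;
        double truncated = newWidth / oldWidth;       // int division happens first: 0.0
        double exact = (double) newWidth / oldWidth;  // 0.5
        System.out.println(truncated + " vs " + exact);
    }
}
```

Because the index mapping multiplies before it divides, no fractional scale factor is ever stored, so there is nothing to truncate to 0.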

Fast counting of 2D sub-matrices within a large, dense 2D matrix?

What's a good algorithm for counting submatrices within a larger, dense matrix? If I had a single line of data, I could use a suffix tree, but I'm not sure if generalizing a suffix tree into higher dimensions is exactly straightforward or the best approach here.
Thoughts?
My naive solution to index the first element of the dense matrix and eliminate full-matrix searching provided only a modest improvement over full-matrix scanning.
What's the best way to solve this problem?
Example:
Input:
Full matrix:
123
212
421
Search matrix:
12
21
Output:
2
This sub-matrix occurs twice in the full matrix, so the output is 2. The full matrix could be 1000x1000, however, with a search matrix as large as 100x100 (variable size), and I need to process a number of search matrices in a row. Ergo, a brute force of this problem is far too inefficient to meet my sub-second search time for several matrices.

For an algorithms course, I once worked an exercise in which the Rabin-Karp string-search algorithm had to be extended slightly to search for a matching two-dimensional submatrix in the way you describe.
I think if you take the time to understand the algorithm as it is described on Wikipedia, the natural way of extending it to two dimensions will be clear to you. In essence, you just make several passes over the matrix, creeping along one column at a time. There are some little tricks to keep the time complexity as low as possible, but you probably won't even need them.
Searching an N×N matrix for a M×M matrix, this approach should give you an O(N²⋅M) algorithm. With tricks, I believe it can be refined to O(N²).
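One possible way to extend Rabin-Karp to two dimensions is sketched below; it is not necessarily the exercise solution mentioned above. It hashes every row window of the pattern's width with a rolling hash, then rolls a second hash down each column of those row hashes, and verifies every hash hit by direct comparison. The class name, the modulus and the two bases are arbitrary choices for illustration.

```java
// Hypothetical sketch: counting occurrences of a pattern matrix with a 2D rolling hash.
final class SubmatrixCountSketch {
    private static final long P = 1_000_000_007L; // modulus (arbitrary prime)
    private static final long B1 = 257L;          // base for hashing along a row
    private static final long B2 = 1_000_003L;    // base for combining row hashes vertically

    static int count(int[][] text, int[][] pat) {
        int n = text.length, m = text[0].length;
        int h = pat.length, w = pat[0].length;
        if (h > n || w > m) return 0;

        long[][] rows = rowHashes(text, w);       // rows[i][j]: hash of text[i][j..j+w-1]
        long[][] patRows = rowHashes(pat, w);     // one hash per pattern row
        long target = 0;
        for (int r = 0; r < h; r++) target = (target * B2 + patRows[r][0]) % P;

        long topPow = pow(B2, h - 1);
        int count = 0;
        for (int j = 0; j + w <= m; j++) {
            long cur = 0;
            for (int i = 0; i < n; i++) {
                if (i >= h) {                     // drop the row that leaves the window
                    cur = (cur - rows[i - h][j] * topPow % P + P) % P;
                }
                cur = (cur * B2 + rows[i][j]) % P;
                if (i >= h - 1 && cur == target && matchesAt(text, pat, i - h + 1, j)) {
                    count++;
                }
            }
        }
        return count;
    }

    private static long[][] rowHashes(int[][] a, int w) {
        long[][] out = new long[a.length][a[0].length - w + 1];
        long topPow = pow(B1, w - 1);
        for (int r = 0; r < a.length; r++) {
            long cur = 0;
            for (int c = 0; c < a[0].length; c++) {
                if (c >= w) cur = (cur - val(a[r][c - w]) * topPow % P + P) % P;
                cur = (cur * B1 + val(a[r][c])) % P;
                if (c >= w - 1) out[r][c - w + 1] = cur;
            }
        }
        return out;
    }

    private static long val(int v) { return Math.floorMod((long) v, P); }

    private static long pow(long base, int exp) {
        long r = 1;
        for (int i = 0; i < exp; i++) r = r * base % P;
        return r;
    }

    /** Direct comparison, used to rule out hash collisions. */
    private static boolean matchesAt(int[][] text, int[][] pat, int top, int left) {
        for (int r = 0; r < pat.length; r++)
            for (int c = 0; c < pat[0].length; c++)
                if (text[top + r][left + c] != pat[r][c]) return false;
        return true;
    }
}
```

For the example in the question, `count` applied to the 3x3 full matrix and the 2x2 search matrix gives 2. The hashing passes touch each cell a constant number of times; the verification step adds work proportional to the pattern size for every hash hit.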

Algorithms and Theory of Computation Handbook suggests what is an N^2 * log(Alphabet Size) solution. Given a sub-matrix to search for, first of all de-dupe its rows. Now note that if you search the large matrix row by row at most one of the de-duped rows can appear at any position. Use Aho-Corasick to search this in time N^2 * log(Alphabet Size) and write down at each cell in the large matrix either null or an identifier for the matching row of the sub-matrix. Now use Aho-Corasick again to search down the columns of this matrix of row matches and signal a match where all the rows are present below each other.

This sounds similar to template matching. If motivated you could probably transform your original array with the FFT and drop a log from a brute force search. (Nlog(M)) instead of (NM)

I don't have a ready answer but here's how I would start:
-- You want very fast lookup, how much (time) can you spend on building index structures? When brute-force isn't fast enough you need indexes.
-- What do you know about your data that you haven't told us? Are all the values in all your matrices single-digit integers?
-- If they are single-digit integers (or anything else you can represent as a single character or index value), think about linearising your 2D structures. One way to do this would be to read the matrix along a diagonal running top-right to bottom-left and scanning from top-left to bottom-right. Difficult to explain in words, but read the matrix:
1234
5678
90ab
cdef
as 125369470c8adbef
(get it?)
Now you can index your super-matrix to whatever depth your speed and space requirements demand; in my example key 1253... points to element (1,1), key abef points to element (3,3). Not sure if this works for you, and you'll have to play around with the parameters to your solution. Choose your favourite method for storing the key-value pairs: a hash, a list, or even build some indexes into the index if things get wild.
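In case the diagonal reading order is hard to picture, here is a small sketch of it (the class and method names are invented for illustration). It walks the anti-diagonals from the top-left corner to the bottom-right corner and reads each one from top-right to bottom-left.

```java
// Hypothetical sketch of the diagonal linearisation described above.
final class DiagonalLinearizeSketch {

    static String linearize(char[][] m) {
        int rows = m.length, cols = m[0].length;
        StringBuilder sb = new StringBuilder(rows * cols);
        // d = row + column is constant along one anti-diagonal
        for (int d = 0; d < rows + cols - 1; d++) {
            int rStart = Math.max(0, d - (cols - 1));
            int rEnd = Math.min(rows - 1, d);
            for (int r = rStart; r <= rEnd; r++) {
                sb.append(m[r][d - r]); // row increases, column decreases
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[][] m = {
            "1234".toCharArray(),
            "5678".toCharArray(),
            "90ab".toCharArray(),
            "cdef".toCharArray()
        };
        System.out.println(linearize(m)); // 125369470c8adbef
    }
}
```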
Regards
Mark
