I'm trying to get the most representative frequency (or first harmonic) from an audio file using the Noise FFT library (https://github.com/paramsen/noise). I have an input array of size x, and the output array's size is x+2. I'm not familiar with the Fourier transform, so maybe I'm missing something, but from my understanding the output should represent the frequencies and store the magnitude (or in this case a complex number from which to calculate it) of each one.
The thing is: since each position in the array should be a frequency, how can I know the range of the output frequencies, what frequency is each position or something like that?
Edit: This is part of the code I'm using
float[] mono = new float[size];
// I fill the array with the appropriate values
Noise noise = Noise.real(size);
float[] dst = new float[size + 2];
float[] fft = noise.fft(mono, dst);
// The result array has the pairs of real+imaginary floats in a one dimensional array; even indices
// are real, odd indices are imaginary. DC bin is located at index 0, 1, nyquist at index n-2, n-1
double greatest = 0;
int greatestIdx = 0;
for(int i = 0; i < fft.length / 2; i++) {
float real = fft[i * 2];
float imaginary = fft[i * 2 + 1];
double magnitude = Math.sqrt(real*real+imaginary*imaginary);
if (magnitude > greatest) {
greatest = magnitude;
greatestIdx = i;
}
System.out.printf("index: %d, real: %.5f, imaginary: %.5f\n", i, real, imaginary);
}
I just noticed something I had overlooked. The comment just before the for loop (which is from the sample code provided on GitHub) says that Nyquist is located at the last pair of values of the array. From what I found, Nyquist is 22050 Hz, so... to know the frequency corresponding to greatestIdx, should I map the range [0, size+2] to the range [0, 22050] and calculate the new value? It seems like a pretty imprecise measure.
Taking the prior things into account, maybe I should use another library for more precision? If that is the case, what would be one that lets me specify the output frequency range or that gives me approximately the human hearing range by default?
I believe the answer to your question is here, if I understand it correctly: https://stackoverflow.com/a/4371627/9834835
To determine the frequency for each FFT bin you may use the formula
F = i * sample / nFft
where:
i = the FFT bin index
sample = the sample rate
nFft = your FFT size
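As a minimal sketch of the formula above: the class and method names are hypothetical, and the 44100 Hz sample rate and FFT size of 4096 are assumed example values (44100 Hz is what a 22050 Hz Nyquist frequency implies).

```java
class BinFrequency {
    /** Centre frequency in Hz of FFT bin i, for an FFT computed over nFft input samples. */
    static double binToFrequency(int i, double sampleRate, int nFft) {
        return i * sampleRate / nFft;
    }

    public static void main(String[] args) {
        double sampleRate = 44100.0; // assumed; use the actual rate of your audio file
        int nFft = 4096;             // assumed; the number of input samples fed to the FFT
        // Bin 0 is DC and bin nFft / 2 is the Nyquist frequency (sampleRate / 2).
        System.out.println("DC:         " + binToFrequency(0, sampleRate, nFft));
        System.out.println("Nyquist:    " + binToFrequency(nFft / 2, sampleRate, nFft));
        // The spacing between neighbouring bins is the frequency resolution.
        System.out.println("Resolution: " + binToFrequency(1, sampleRate, nFft));
    }
}
```

With the code in the question, i would be greatestIdx and nFft would presumably be size (the number of input samples), not size + 2. A larger FFT size gives finer spacing between bins.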
This is not a homework problem, I am too old to get homework :)
So, ideally I am trying to convert a number in a given base to another given base.
Can someone please share the logic, then probably I can write the code myself. Not able to find anything online surprisingly.
The answer depends on whether or not you can use a primitive, such as an int or long for your representation.
If you can, the algorithm is reasonably simple: convert the number in base X to a primitive representation, then convert that representation to base Y.
To convert a number to primitive, use this algorithm:
Make a running total res, and set it to zero
Go through the string representing the number in base X left-to-right
Convert each "digit" (which may be represented by a letter) to its numeric value
Multiply running total by X, then add the numeric value of the digit to it
To convert back, use this algorithm:
Make a string builder
Obtain the value of the last digit: digit = num % Y
Convert the digit value to digit character (it may be a letter)
Append the digit character to the string builder
Drop the last digit from representation by using num /= Y
Continue while num is not zero
Reverse the string in the string builder
If your number is too big for a primitive, such as int or long, you need to build a class for holding numbers greater than primitives. I would recommend using BigInteger initially, and then replacing it with your own implementation.
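The two algorithms above can be sketched as follows. This is only an illustration under some assumptions: the class and method names are hypothetical, the digits are 0-9 followed by a-z (so bases 2 to 36), the input is non-negative, and overflow of the long is not checked.

```java
class BaseConverter {
    private static final String DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz";

    /** Parses a non-negative number written in the given base (2..36) into a long. */
    static long toLong(String number, int base) {
        long res = 0;
        for (char c : number.toLowerCase().toCharArray()) {
            int digit = DIGITS.indexOf(c);
            if (digit < 0 || digit >= base) {
                throw new IllegalArgumentException("Invalid digit '" + c + "' for base " + base);
            }
            res = res * base + digit; // shift the running total one place, then add the digit
        }
        return res;
    }

    /** Writes a non-negative long in the given base (2..36). */
    static String fromLong(long num, int base) {
        if (num == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (num != 0) {
            sb.append(DIGITS.charAt((int) (num % base))); // value of the last digit
            num /= base;                                  // drop the last digit
        }
        return sb.reverse().toString(); // digits were produced least significant first
    }

    static String convert(String number, int fromBase, int toBase) {
        return fromLong(toLong(number, fromBase), toBase);
    }

    public static void main(String[] args) {
        System.out.println(convert("ff", 16, 2));
    }
}
```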
You could try this:
String input = /* your input */;
int inputBase = /* your input base */;
int outputBase = /* wished output base */;
int inputInt = Integer.valueOf(input, inputBase);
String output = Integer.toString(inputInt, outputBase);
Is the number an integer, or does it have a fractional part?
Extra work is required to convert the fractional part.
And it is not easy to cheat there: unlike BigInteger.toString(),
BigDecimal.toString() does not take a radix.
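For the integer case, the BigInteger shortcut mentioned above could look like this. It is a small sketch with a hypothetical class name; it relies only on the BigInteger(String, int) constructor and BigInteger.toString(int), and does not handle fractional parts.

```java
import java.math.BigInteger;

class BigBaseConverter {
    /** Converts an integer of arbitrary size between bases 2..36. */
    static String convert(String number, int fromBase, int toBase) {
        return new BigInteger(number, fromBase).toString(toBase);
    }

    public static void main(String[] args) {
        System.out.println(convert("zz", 36, 10));
    }
}
```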
The general algorithm is:
Say you get a number in base x. It means Σ a_i * x^i.
First compute the number n as a long (or a BigInteger if it is too large) by iteratively evaluating the formula above: n = a_0 + a_1 * x + a_2 * x^2 + ...
Then decompose n in base y: n_0 = n, b_0 = n_0 % y,
then iteratively: n_i = (n_(i-1) - b_(i-1)) / y, b_i = n_i % y.
And you get the base-y representation: Σ b_i * y^i
I have been thinking about it but have run out of ideas. I have 10 arrays, each of length 18 and holding 18 double values. These 18 values are features of an image. Now I have to apply k-means clustering on them.
For implementing k-means clustering I need a unique computational value for each array. Is there any mathematical, statistical, or other logic that would help me create a computational value for each array that is unique to it, based upon the values inside it? Thanks in advance.
Here is an example of one of my arrays. I have more like it.
[0.07518284315321135
0.002987851573676068
0.002963866526639678
0.002526139418225552
0.07444872939213325
0.0037219653347541617
0.0036979802877177715
0.0017920256571474585
0.07499695903867931
0.003477831820276616
0.003477831820276616
0.002036159171625004
0.07383539747505984
0.004311312204791184
0.0043352972518275745
0.0011786937400740452
0.07353130134299131
0.004339580295941216]
Did you check Arrays.hashCode in Java 7?
/**
* Returns a hash code based on the contents of the specified array.
* For any two <tt>double</tt> arrays <tt>a</tt> and <tt>b</tt>
* such that <tt>Arrays.equals(a, b)</tt>, it is also the case that
* <tt>Arrays.hashCode(a) == Arrays.hashCode(b)</tt>.
*
* <p>The value returned by this method is the same value that would be
* obtained by invoking the {@link List#hashCode() <tt>hashCode</tt>}
* method on a {@link List} containing a sequence of {@link Double}
* instances representing the elements of <tt>a</tt> in the same order.
* If <tt>a</tt> is <tt>null</tt>, this method returns 0.
*
* @param a the array whose hash value to compute
* @return a content-based hash code for <tt>a</tt>
* @since 1.5
*/
public static int hashCode(double a[]) {
if (a == null)
return 0;
int result = 1;
for (double element : a) {
long bits = Double.doubleToLongBits(element);
result = 31 * result + (int)(bits ^ (bits >>> 32));
}
return result;
}
I don't understand why @Marco13 mentioned "this is not returning unique values for arrays".
UPDATE
See @Marco13's comment for the reason why it cannot be unique.
UPDATE
If we draw a graph of your input points (18 elements), it shows one spike followed by three low values, and the pattern repeats.
If that is true, you can compute the mean of the peaks (indices 0, 4, 8, 12, 16) and the mean of the remaining low values.
You will then have a peak mean and a low mean, and you can compute a unique number that represents the pair while preserving both values, using the bijective algorithm described here.
This algorithm also provides formulas for the reverse direction, i.e. recovering the peak and low means from the unique value.
To find unique pair < x; y >= x + (y + ( (( x +1 ) /2) * (( x +1 ) /2) ) )
Also refer to Exercise 1 on page 2 of the PDF to recover x and y.
Code for finding the means and the pairing value:
public static double mean(double[] array){
    double peakSum = 0;
    double lowSum = 0;
    int peakCount = 0;
    int lowCount = 0;
    for (int i = 0; i < array.length; i++) {
        // In the sample data every fourth value, starting at index 0, is a peak
        if (i % 4 == 0){
            peakSum += array[i];
            peakCount++;
        }else{
            lowSum += array[i];
            lowCount++;
        }
    }
    // Divide by the actual counts instead of hard-coding 5 and 13
    double peakMean = peakSum / peakCount;
    double lowMean = lowSum / lowCount;
    return bijective(lowMean, peakMean);
}
public static double bijective(double x,double y){
    double tmp = ( y + ((x+1)/2));
    return x + ( tmp * tmp);
}
For testing:
public static void main(String[] args) {
    // All 18 values from the question, in their original order
    double[] arrays = {0.07518284315321135,0.002987851573676068,0.002963866526639678,0.002526139418225552,0.07444872939213325,0.0037219653347541617,0.0036979802877177715,0.0017920256571474585,0.07499695903867931,0.003477831820276616,0.003477831820276616,0.002036159171625004,0.07383539747505984,0.004311312204791184,0.0043352972518275745,0.0011786937400740452,0.07353130134299131,0.004339580295941216};
    System.out.println(mean(arrays));
}
You can use these peak and low values to find similar images.
You can simply sum the values using double precision; the resulting value will be unique most of the time. On the other hand, if the position of each value is relevant, you can compute a sum that uses the index as a multiplier.
The code could be as simple as:
public static double sum(double[] values) {
double val = 0.0;
for (double d : values) {
val += d;
}
return val;
}
public static double hash_w_order(double[] values) {
double val = 0.0;
for (int i = 0; i < values.length; i++) {
val += values[i] * (i + 1);
}
return val;
}
public static void main(String[] args) {
double[] myvals =
{ 0.07518284315321135, 0.002987851573676068, 0.002963866526639678, 0.002526139418225552, 0.07444872939213325, 0.0037219653347541617, 0.0036979802877177715, 0.0017920256571474585, 0.07499695903867931, 0.003477831820276616,
0.003477831820276616, 0.002036159171625004, 0.07383539747505984, 0.004311312204791184, 0.0043352972518275745, 0.0011786937400740452, 0.07353130134299131, 0.004339580295941216 };
System.out.println("Computed value based on sum: " + sum(myvals));
System.out.println("Computed value based on values and its position: " + hash_w_order(myvals));
}
The output for that code, using your list of values is:
Computed value based on sum: 0.41284176550504803
Computed value based on values and its position: 3.7396448842464496
Well, here's a method that works for any number of doubles.
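import java.math.BigInteger;

public BigInteger uniqueID(double[] array) {
    // 2^64: the number of distinct 64-bit patterns, used as the base
    final BigInteger twoToTheSixtyFour = BigInteger.ONE.shiftLeft(64);
    BigInteger count = BigInteger.ZERO;
    for (double d : array) {
        long bitRepresentation = Double.doubleToRawLongBits(d);
        // Read the bits as an unsigned digit in [0, 2^64); negative doubles have the sign bit set
        BigInteger digit = new BigInteger(Long.toUnsignedString(bitRepresentation));
        count = count.multiply(twoToTheSixtyFour).add(digit);
    }
    return count;
}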
Explanation
Each double is a 64-bit value, which means there are 2^64 different possible double values. Since a long is easier to work with for this sort of thing, and it's the same number of bits, we can get a 1-to-1 mapping from doubles to longs using Double.doubleToRawLongBits(double).
This is awesome, because now we can treat this like a simple combinations problem. You know how you know that 1234 is a unique number? There's no other number with the same value. This is because we can break it up by its digits like so:
1234 = 1 * 10^3 + 2 * 10^2 + 3 * 10^1 + 4 * 10^0
The powers of 10 would be "basis" elements of the base-10 numbering system, if you know linear algebra. In this way, base-10 numbers are like arrays consisting of only values from 0 to 9 inclusively.
If we want something similar for double arrays, we can discuss the base-(2^64) numbering system. Each double value would be a digit in a base-(2^64) representation of a value. If there are 18 digits, there are (2^64)^18 unique values for a double[] of length 18.
That number is gigantic, so we're going to need to represent it with a BigInteger data-structure instead of a primitive number. How big is that number?
(2^64)^18 = 61172327492847069472032393719205726809135813743440799050195397570919697796091958321786863938157971792315844506873509046544459008355036150650333616890210625686064472971480622053109783197015954399612052812141827922088117778074833698589048132156300022844899841969874763871624802603515651998113045708569927237462546233168834543264678118409417047146496
There are that many unique configurations of 18-length double arrays and this code lets you uniquely describe them.
I'm going to suggest three methods, with different pros and cons which I will outline.
Hash Code
This is the obvious "solution", though it has been correctly pointed out that it will not be unique. However, it will be very unlikely that any two arrays will have the same value.
Weighted Sum
Your elements appear to be bounded; perhaps they range from a minimum of 0 to a maximum of 1. If this is the case, you can multiply the first number by N^0, the second by N^1, the third by N^2 and so on, where N is some large number (ideally the inverse of your precision). This is easily implemented, particularly if you use a matrix package, and very fast. We can make this unique if we choose.
Euclidean Distance from Mean
Subtract the mean of your arrays from each array, square the results, sum the squares. If you have an expected mean, you can use that. Again, not unique, there will be collisions, but you (almost) can't avoid that.
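A minimal sketch of the third method, assuming all arrays have the same length and that the mean is taken element-wise over all arrays (class and method names are hypothetical):

```java
class DistanceFromMean {
    /** Element-wise mean of equally long feature arrays. */
    static double[] meanVector(double[][] arrays) {
        double[] mean = new double[arrays[0].length];
        for (double[] a : arrays) {
            for (int i = 0; i < mean.length; i++) {
                mean[i] += a[i] / arrays.length;
            }
        }
        return mean;
    }

    /** Sum of squared differences between one array and the mean vector. */
    static double squaredDistance(double[] a, double[] mean) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - mean[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[][] arrays = { {0.0, 0.0}, {2.0, 2.0} };
        double[] mean = meanVector(arrays);
        for (double[] a : arrays) {
            System.out.println(squaredDistance(a, mean));
        }
    }
}
```

Both arrays in this toy example get the same value even though they differ, which illustrates why this number is not unique.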
The difficulty of uniqueness
It has already been explained that hashing will not give you a unique solution. A unique number is possible in theory, using the Weighted Sum, but we have to use numbers of a very large size. Let's say your numbers are 64 bits in memory. That means that there are 2^64 possible numbers they can represent (slightly less using floating point). Eighteen such numbers in an array could represent 2^(64*18) different numbers. That's huge. If you use anything less, you will not be able to guarantee uniqueness due to the pigeonhole principle.
Let's look at a trivial example. If you have four letters, a, b, c and d, and you have to number them each uniquely using the numbers 1 to 3, you can't. That's the pigeonhole principle. You have 2^(18*64) possible numbers. You can't number them uniquely with less than 2^(18*64) numbers, and hashing doesn't give you that.
If you use BigDecimal, you can represent (almost) arbitrarily large numbers. If the largest element you can get is 1 and the smallest 0, then you can set N = 1/(precision) and apply the Weighted Sum mentioned above. This will guarantee uniqueness. The smallest positive double in Java is Double.MIN_VALUE, which can serve as the precision. Note that the array of weights needs to be stored as BigDecimals!
That satisfies this part of your question:
create a computational value for each array, which is unique to it
based upon values inside it
However, there is a problem:
1 and 2 suck for K Means
I am assuming from your discussion with Marco13 that you are performing the clustering on the single values, not the length 18 arrays. As Marco13 has already mentioned, hashing sucks for K Means. The whole idea of a hash is that the smallest change in the data results in a large change in the hash value. That means that two images which are similar, and so produce two very similar arrays, produce two very different "unique" numbers. Similarity is not preserved. The result will be pseudo random!!!
Weighted Sums are better, but still bad. It will basically ignore all the elements except for the last one, unless the last element is the same. Only then will it look at the next to last, and so on. Similarity is not really preserved.
Euclidean distance from the mean (or at least some point) will at least group things together in a sort of sensible way. Direction will be ignored, but at least things that are far from the mean won't be grouped with things that are close. Similarity of one feature is preserved, the other features are lost.
In summary
1 is very easy, but is not unique and doesn't preserve similarity.
2 is easy, can be unique and doesn't preserve similarity.
3 is easy, but is not unique and preserves some similarity.
Implementation of Weighted Sum. Not really tested.
import java.math.BigDecimal;
public class Array2UniqueID {
private final double min;
private final double max;
private final double prec;
private final int length;
/**
* Used to provide a {@code BigDecimal} that is unique to the given array.
* <p>
* This uses weighted sum to guarantee that two IDs match if and only if
* every element of the array also matches. Similarity is not preserved.
*
* @param min smallest value an array element can possibly take
* @param max largest value an array element can possibly take
* @param prec smallest difference possible between two array elements
* @param length length of each array
*/
public Array2UniqueID(double min, double max, double prec, int length) {
this.min = min;
this.max = max;
this.prec = prec;
this.length = length;
}
/**
* A convenience constructor which assumes the array consists of doubles of
* full range.
* <p>
* This will result in very large IDs being returned.
*
* @see Array2UniqueID#Array2UniqueID(double, double, double, int)
* @param length length of each array
*/
public Array2UniqueID(int length) {
this(-Double.MAX_VALUE, Double.MAX_VALUE, Double.MIN_VALUE, length);
}
public BigDecimal createUniqueID(double[] array) {
// Validate the data
if (array.length != length) {
throw new IllegalArgumentException("Array length must be "
+ length + " but was " + array.length);
}
for (double d : array) {
if (d < min || d > max) {
throw new IllegalArgumentException("Each element of the array"
+ " must be in the range [" + min + ", " + max + "]");
}
}
// Do the range arithmetic in BigDecimal so that max - min cannot overflow a double
BigDecimal lo = BigDecimal.valueOf(min);
BigDecimal step = BigDecimal.valueOf(prec);
/* maxNums is the maximum number of numbers that could possibly exist
* between max and min.
* The ID will be in the range 0 to maxNums^length.
* maxNums = range / prec + 1
* Stored as a BigDecimal for convenience, but is an integer
*/
BigDecimal maxNums = BigDecimal.valueOf(max).subtract(lo)
.divideToIntegralValue(step)
.add(BigDecimal.ONE);
BigDecimal id = BigDecimal.ZERO;
// id = sum of digit_i * maxNums^i, where digit_i = (array[i] - min) / prec
for (int i = 0; i < array.length; i++) {
BigDecimal digit = BigDecimal.valueOf(array[i]).subtract(lo)
.divideToIntegralValue(step);
id = id.add(digit.multiply(maxNums.pow(i)));
}
return id;
}
}
As I understand it, you are going to do k-means clustering based on the double values.
Why not just wrap each double value in an object, together with an array identifier and a position identifier, so you know which cluster it ended up in?
Something like:
public class Element {
final public double value;
final public int array;
final public int position;
public Element(double value, int array, int position) {
this.value = value;
this.array = array;
this.position = position;
}
}
If you need to cluster each array as a whole:
You can transform the original arrays of length 18 into arrays of length 19, with the last or first element being a unique id that you ignore during clustering but can refer to after clustering has finished. This has a small memory footprint (8 additional bytes per array) and gives an easy association with the original values.
If space is absolutely a problem, and all values in an array are less than 1, you can add a unique integer id, greater than or equal to 1, to each value of an array and cluster based on the remainder of division by 1: 0.07518284315321135 stays 0.07518284315321135 for the 1st array and becomes 1.07518284315321135 for the 2nd, although this increases the complexity of the computation during clustering.
First of all, let's try to understand what you need mathematically:
Uniquely mapping an array of m real numbers to a single number is in fact a bijection between R^m and R, or at least N.
Since floating-point values are in fact rational numbers, your problem is to find a bijection between Q^m and N, which can be reduced to mapping N^m to N, because you know your values will always be greater than 0 (just scale your values by the precision to get integers).
Thus you need to map N^m to N. Take a look at the Cantor pairing function for some ideas.
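One way this could be sketched is to fold the Cantor pairing function pair(a, b) = (a + b)(a + b + 1)/2 + b over the array. The class and method names are hypothetical, and the sketch assumes the feature values have already been scaled to non-negative integers and that every array has the same length.

```java
import java.math.BigInteger;

class CantorPairing {
    /** Cantor pairing of two non-negative integers: (a + b)(a + b + 1) / 2 + b. */
    static BigInteger pair(BigInteger a, BigInteger b) {
        BigInteger s = a.add(b);
        // s * (s + 1) is always even, so the shift is an exact division by 2
        return s.multiply(s.add(BigInteger.ONE)).shiftRight(1).add(b);
    }

    /** Folds the pairing over a fixed-length array of non-negative integers. */
    static BigInteger pairAll(long[] values) {
        BigInteger acc = BigInteger.valueOf(values[0]);
        for (int i = 1; i < values.length; i++) {
            acc = pair(acc, BigInteger.valueOf(values[i]));
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(pairAll(new long[] {1, 1, 1}));
    }
}
```

BigInteger is used because the result grows very quickly with the array length.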
A guaranteed way to generate a unique result based on the array is to convert it to one big string, and use that for your computational value.
It may be slow, but it will be unique based on the array's values.
Implementation examples:
Best way to convert an ArrayList to a string
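For a plain double[] (rather than an ArrayList), a short sketch of the string approach could use Arrays.toString; the class and method names here are hypothetical.

```java
import java.util.Arrays;

class ArrayKey {
    /** Builds a string key from the array's contents, e.g. "[0.5, 0.25]". */
    static String key(double[] array) {
        return Arrays.toString(array);
    }

    public static void main(String[] args) {
        System.out.println(key(new double[] {0.5, 0.25}));
    }
}
```

Such a key can identify an array (for example as a map key), but it is not a number and carries no notion of distance between arrays.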
I'm confused with converting the RGB values to YCbCr color scheme. I used this equation:
int R, G, B;
double Y = 0.299 * R + 0.587 * G + 0.114 * B;
double Cb = -0.168 * R - 0.3313 * G + 0.5 * B + 128;
double Cr = 0.5 * R - 0.4187 * G - 0.0813 * B + 128;
The expected output of YCbCr is normalized to the range 0-255, but I'm confused because one of my sources says it is normalized to the range 0-1.
That part is going well, but I am having a problem when computing the LipMap to isolate/detect the lips of the face. I implemented this:
double LipMap = Cr*Cr*(Cr*Cr-n*(Cr/Cb))*(Cr*Cr-n*(Cr/Cb));
n should return 0-255; the equation for n is: n = 0.95*(summation(Cr*Cr)/summation(Cr/Cb))
but another source says: n = 0.95*(((1/k)*summation(Cr*Cr))/((1/k)*summation(Cr/Cb)))
where k is equal to the number of pixels in the face image.
My sources say that it will return a result in the range 0-255, but in my program it always returns large numbers, nowhere near 0-255.
So can anyone help me implement this and solve my problem?
From the sources you linked in your comments, it looks like either the equations or the descriptions in the first source are wrong:
If you use RGB values in the range [0,255] and the given conversion (your Cb conversion differs from that, by the way), you should get Cr and Cb values in the same range.
Now if you calculate n = 0.95 * (ΣCr^2 / Σ(Cr/Cb)), you'll notice that the values for Cr^2 range over [0,65025], whereas Cr/Cb is in the range [0,255] (assuming Cb = 0 is not possible, so the highest value would be 255/1 = 255).
If you further assume an image with quite high red and low blue components, you'll get far higher values for n than what is stated in that paper:
Constant η fits final value in range 0..255
The second paper states this, which makes much more sense IMHO (although I don't know whether they normalize Cr and Cb to the range [0,1] before the calculation, or whether they normalize the result, which might produce a larger difference between Cr^2 and Cr/Cb):
Where (Cr)^2, (Cr/Cb) all are normalized to the range [0 1].
Note that in order to normalize Cr and Cb to range [0,1] you'd either need to divide the result of your equations by 255 or simply use RGB in range [0,1] and add 0.5 instead of 128:
//assumes RGB are in range [0,1]
double Cb = -0.168 * R - 0.3313 * G + 0.5 * B + 0.5;
double Cr = 0.5 * R - 0.4187 * G - 0.0813 * B + 0.5;
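As a rough sketch of how n and the LipMap from the question could be computed once Cr and Cb are available: this assumes Cr and Cb are already normalized to [0,1] with Cb greater than 0, which is only one possible reading of the second paper, and the class and method names are hypothetical.

```java
class LipMap {
    /** n = 0.95 * summation(Cr*Cr) / summation(Cr/Cb), over all pixels of the face region. */
    static double eta(double[] cr, double[] cb) {
        double sumCr2 = 0.0;
        double sumRatio = 0.0;
        for (int i = 0; i < cr.length; i++) {
            sumCr2 += cr[i] * cr[i];
            sumRatio += cr[i] / cb[i]; // assumes cb[i] > 0
        }
        // The 1/k factors of the second formula cancel, so no division by k is needed.
        return 0.95 * sumCr2 / sumRatio;
    }

    /** LipMap = Cr^2 * (Cr^2 - n * Cr/Cb)^2 for a single pixel. */
    static double lipMap(double cr, double cb, double eta) {
        double cr2 = cr * cr;
        double diff = cr2 - eta * (cr / cb);
        return cr2 * diff * diff;
    }

    public static void main(String[] args) {
        double[] cr = {0.5, 0.5};
        double[] cb = {0.5, 0.5};
        double n = eta(cr, cb);
        System.out.println(n);
        System.out.println(lipMap(cr[0], cb[0], n));
    }
}
```

Because the 1/k factors cancel, the two formulas for n quoted in the question give the same value; the difference in output range comes from how Cr and Cb are scaled.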
I want to write a program in Java that raises a number to a power, but without using Math.pow. The program should be generic enough to handle fractions as well.
The loop method increments by 1, which is okay for integers but not for fractions. Please suggest a generic method that would be helpful to me.
First, observe that pow(a,x) = exp(x * log(a)).
You can implement your own exp() function using the Taylor series expansion for e^x:
e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + x^5/5! + ...
This will work for non-integer values of x. The more terms you include, the more accurate the result will be.
Note that by using some algebraic identities, you only need to resort to the series expansion for x in the range 0 < x < 1. exp(int + frac) = exp(int)*exp(frac), and there's no need to use a series expansion for exp(int). (You just multiply it out, since it's an integer power of e = 2.71828...)
Similarly, you can implement log(x) using one of these series expansions:
log(1+x) = x - x^2/2 + x^3/3 - x^4/4 + ...
or
log(1-x) = -1 * (x + x^2/2 + x^3/3 + x^4/4 + ... )
But these series only converge for x in the interval -1 < x < 1. So for values of a outside this range, you might have to use the identity
log(p*q) = log(p) + log(q)
and do some repeated divisions by e (= 2.71828...) to bring a down into a range where the series expansion converges. For example, if a=4, you'd have to take x=3 to use the first formula, but 3 is outside the range of convergence. So we start dividing out factors of e:
4/e = 1.47151...
log(4) = log(e*1.47151...) = 1 + log(1.47151...)
Now we can take x=.47151..., which is within the range of convergence, and evaluate log(1+x) using the series expansion.
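Putting the pieces above together, a sketch might look like the following. The class name, the number of series terms, and the thresholds 0.5 and 1.5 used for the range reduction are my own choices for illustration, and the base a is assumed to be positive.

```java
class SeriesPow {
    static final double E = 2.718281828459045;

    /** exp(x) using exp(int + frac) = e^int * exp(frac); the series is only used for frac in [0, 1). */
    static double exp(double x) {
        if (x < 0) {
            return 1.0 / exp(-x);
        }
        int whole = (int) x;
        double frac = x - whole;
        double result = 1.0;
        for (int i = 0; i < whole; i++) {
            result *= E; // integer power of e, multiplied out
        }
        double term = 1.0;
        double sum = 1.0;
        for (int n = 1; n <= 30; n++) {
            term *= frac / n; // term is now frac^n / n!
            sum += term;
        }
        return result * sum;
    }

    /** log(a) for a > 0: move a into [0.5, 1.5] with factors of e, then use the log(1+x) series. */
    static double log(double a) {
        if (a <= 0) {
            throw new IllegalArgumentException("log requires a > 0");
        }
        int k = 0;
        while (a > 1.5) { a /= E; k++; }
        while (a < 0.5) { a *= E; k--; }
        double x = a - 1.0; // |x| <= 0.5, well inside the range of convergence
        double power = x;
        double sum = 0.0;
        for (int n = 1; n <= 60; n++) {
            sum += (n % 2 == 1 ? power : -power) / n; // alternating signs: x - x^2/2 + x^3/3 - ...
            power *= x;
        }
        return k + sum;
    }

    /** pow(a, x) = exp(x * log(a)), for a > 0. */
    static double pow(double a, double x) {
        return exp(x * log(a));
    }

    public static void main(String[] args) {
        System.out.println(pow(4.0, 1.5));
        System.out.println(pow(2.0, 0.5));
    }
}
```

The results are approximations, so expect small rounding differences compared with Math.pow.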
Think about what a power function should do.
Mathematically: x^5 = x * x * x * x * x, or ((((x*x)*x)*x)*x)
Within your for loop, you can use the *= operator to achieve the operation that happens above.
How are you handling fractions? Java has no built-in fraction type; it stores decimals that would calculate the same way as integers (in other words, x * x works with both types). If you have a special class for fractions, your loop just needs two steps: one to multiply the numerator and one to multiply the denominator.
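For integer exponents, the loop described above could be sketched like this (hypothetical class name; the base may be any double, including a decimal fraction, and negative exponents are handled by taking the reciprocal):

```java
class LoopPow {
    /** Raises base to an integer exponent by repeated multiplication. */
    static double pow(double base, int exponent) {
        // Use a long so that negating Integer.MIN_VALUE does not overflow
        long n = exponent < 0 ? -(long) exponent : exponent;
        double result = 1.0;
        for (long i = 0; i < n; i++) {
            result *= base;
        }
        return exponent < 0 ? 1.0 / result : result;
    }

    public static void main(String[] args) {
        System.out.println(pow(2.0, 10));
        System.out.println(pow(0.5, 2));
        System.out.println(pow(2.0, -2));
    }
}
```

This covers fractional bases but not fractional exponents.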
While reading up on powers on Wikipedia:
a^x = exp( x ln(a) ) for any real number x
Is this cheating?