I'm not really sure what's the right title for my question
So here's the question
Suppose I have N number of samples, eg:
1
2
3
4
.
.
.
N
Now I want to "reduce" the size of the sample from N to M, by dumping (N-M) data from the N samples.
I want the dumping to be as "distributed" as possible,
so like if I have 100 samples and want to compress it to 50 samples, I would throw away every other sample. Another example, say the data is 100 samples and I want to compress it to 25 samples. I would throw away 1 sample in the each group of 100/25 samples, meaning I iterate through each sample and count, and every time my count reaches 4 I would throw away the sample and restart the count.
The problem is how do I do this if the 4 above was to be 2.333 for example. How do I treat the decimal point to throw away the sample distributively?
Thanks a lot..
The terms you are looking for are resampling, downsampling and decimation. Note that in the general case you can't just throw away a subset of your data without risking aliasing. You need to low pass filter your data first, prior to decimation, so that there is no information above your new Nyquist rate which would be aliased.
When you want to downsample by a non-integer value, e.g. 2.333 as per your example above you would normally do this by upsampling by an integer factor M and then downsampling by a different integer factor N, where the fraction M/N gives you the required resampling factor. In your example M = 3 and N = 7, so you would upsample by a factor of 3 and then downsample by a factor of 7.
You seem to be talking about sampling rates and digital signal processing
Before you reduce, you normally filter the data to make sure high frequencies in your sample are not aliased to lower frequencies. For instance, in your (take every fourth value), a frequency of that repeats every four samples will alias to the "DC" or zero cycle frequency (for example "234123412341" starting with the first of every grouping will get "2,2,2,2", which might not be what you want. (a 3 cycle would also alias to a cycle like itself (231231231231) => 231... (unless I did that wrong because I'm tired). Filtering is a little beyond what I would like to discuss right now as it's a pretty advanced topic.
If you can represent your "2.333" as some sort of fraction, lets see, that's 7/3. you were talking 1 out of every 4 samples (1/4) sou I would say you're taking 3 out of every 7 samples. so you might (take, drop, take, drop, take, drop, drop). but there might be other methods.
For audio data that you want to sound decent (as opposed to aliased and distorted in the frequency domain), see Paul R.'s answer involving resampling. One method of resampling is interpolation, such as using a windowed-Sinc interpolation kernel which will properly low-pass filter the data as well as allow creating interpolated intermediate values.
For non-sampled and non-audio data, where you just want to throw away some samples in a close-to-evenly distributed manner, and don't care about adding frequency domain noise and distortion, something like this might work:
float myRatio = (float)(N-1) / (float)(M-1); // check to make sure M > 1 beforehand
for (int i=0; i < M; i++) {
int j = (int)roundf(myRatio * (float)i); // nearest bin decimation
myNewArrayLengthM[i] = myOldArrayLengthN[j];
}
Related
Edit: Typos fixed and ambiguity tried to fix.
I have a list of five digit integers in a text file. The expected amount can only be as large as what a 5-digit integer can store. Regardless of how many there are, the FIRST line in this file tells me how many integers are present, so resizing will never be necessary. Example:
3
11111
22222
33333
There are 4 lines. The first says there are three 5-digit integers in the file. The next three lines hold these integers.
I want to read this file and store the integers (not the first line). I then want to be able to search this data structure A LOT, nothing else. All I want to do, is read the data, put it in the structure, and then be able to determine if there is a specific integer in there. Deletions will never occur. The only things done on this structure will be insertions and searching.
What would you suggest as an appropriate data structure? My initial thought was a binary tree of sorts; however, upon thinking, a HashTable may be the best implementation. Thoughts and help please?
It seems like the requirements you have are
store a bunch of integers,
where insertions are fast,
where lookups are fast, and
where absolutely nothing else matters.
If you are dealing with a "sufficiently small" range of integers - say, integers up to around 16,000,000 or so, you could just use a bitvector for this. You'd store one bit per number, all initially zero, and then set the bits to active whenever a number is entered. This has extremely fast lookups and extremely fast setting, but is very memory-intensive and infeasible if the integers can be totally arbitrary. This would probably be modeled with by BitSet.
If you are dealing with arbitrary integers, a hash table is probably the best option here. With a good hash function you'll get a great distribution across the table slots and very, very fast lookups. You'd want a HashSet for this.
If you absolutely must guarantee worst-case performance at all costs and you're dealing with arbitrary integers, use a balanced BST. The indirection costs in BSTs make them a bit slower than other data structures, but balanced BSTs can guarantee worst-case efficiency that hash tables can't. This would be represented by TreeSet.
Given that
All numbers are <= 99,999
You only want to check for existence of a number
You can simply use some form of bitmap.
e.g. create a byte[12500] (it is 100,000 bits which means 100,000 booleans to store existence of 0-99,999 )
"Inserting" a number N means turning the N-th bit on. Searching a number N means checking if N-th bit is on.
Pseduo code of the insertion logic is:
bitmap[number / 8] |= (1>> (number %8) );
searching looks like:
bitmap[number/8] & (1 >> (number %8) );
If you understand the rationale, then a even better news for you: In Java we already have BitSet which is doing what I was describing above.
So code looks like this:
BitSet bitset = new BitSet(12500);
// inserting number
bitset.set(number);
// search if number exists
bitset.get(number); // true if exists
If the number of times each number occurs don't matter (as you said, only inserts and see if the number exists), then you'll only have a maximum of 100,000. Just create an array of booleans:
boolean numbers = new boolean[100000];
This should take only 100 kilobytes of memory.
Then instead of add a number, like 11111, 22222, 33333 do:
numbers[11111]=true;
numbers[22222]=true;
numbers[33333]=true;
To see if a number exists, just do:
int whichNumber = 11111;
numberExists = numbers[whichNumber];
There you are. Easy to read, easier to mantain.
A Set is the go-to data structure to "find", and here's a tiny amount of code you need to make it happen:
Scanner scanner = new Scanner(new FileInputStream("myfile.txt"));
Set<Integer> numbers = Stream.generate(scanner::nextInt)
.limit(scanner.nextInt())
.collect(Collectors.toSet());
Initially I have an array of time and an array of voltage and I have applied FFT and converted that time domain into frequency domain.After applying FFT I got an array of frequencies. Now I have cut off frequency and I need to implement Low pass filter on the same. I need to do this using JAVA. Could someone please refer me if there is any open source available or any idea of implementing the same. Any references that would be implemented using frequency values and cut off frequency would help.
I am completely new to this topic so my approach of question might be little weird. Thanks in advance for support!!!
Since you already have a array with the FFT values, you can implement a very crude low pass filter by just setting those FFT coefficients that correspond to frequencies over your cut-off value to zero.If you need a nicer filter you can implement a digital filter or find an LPF implementation online and just use that.
EDIT: After computing the FFT you don't get an array of frequencies, you get an array of complex numbers representing the magnitude and phase of the data.You should be able to know what frequency each and every complex number in the array corresponds to because the FFT result will correspond to evenly spaced frequencies ranging from 0 to f_s, where f_s is the sampling frequency you used to get your data.
A useful exercise might be to first try and plot a frequency spectrum, because after plotting it, it will be clear how you can discard high frequencies thus realising a LPF.This slightly similar post might help you: LINK
EDIT: 1) First you need to find the sampling frequency (f_s) of your data, this is the number of samples have taken every second.It can be computed using f_s = 1/T, where T is the time interval between any two consecutive samples in the time domain.
2) After this you divide f_c by f_s, where f_c is the cut-off frequency to get a constant k.
3) You then set all COMPLEX numbers above index ( k times N) in your array to zero, where N is the number of elements in your array, simple as that, that will give you a basic Low pass filter (LPF).
Rough, indicative (pseudo)code below:
Complex[] fftData = FFT(myData);
int N = fftData.Length;
float T = 0.001;
float f_c = 500; //f_c = 500Hz
float f_s = 1/T; //f_s = 1000Hz
float k = f_c/f_s;
int index = RoundToNextLargestInteger(k * N);
//Low pass filter
for(int i = index; index < N; index++)
fftData[i] = 0;
The fftData you receive in your case will not already be in the form of elements from the Complex class, so make sure you know how your data is represented and which data elements to set to zero.
This is not really a good way to do it though as a single frequency in your data can be spread over several bins because of leakage so the results would be nasty in that case.Ideally you would want to design a proper digital filter or just use some software library.So if you need a very accurate LPF, you can go through the normal process of designing analog LPF and then warping it to a digital filter as discussed in THIS document.
My goal is to be able to process a single note from a guitar (or other instrument), and convert it into a frequency value.
This does not have to be in real time- I figured it would be much easier to record a one-second sound and analyse the data afterwards.
I understand that to do this I need to use a Fourier transform (and have a class that will perform a FFT). However, I don't really understand the input / output of a FFT algorithm- the class I am using seems to use a complex vector input and give a complex vector output. What do these represent?
Also, could anyone recommend any Java classes that can detect and record an input (and if possible, give frequency or values that can be plugged into FFT?)?
Thanks in advance.
Input to your FFT will be a time-domain signal representing the audio. If you record some sound for a second from the mic, this will really contain a wave that is made up of various frequencies at different amounts - hopefully mostly the frequency/frequencies corresponding to the note which you are playing, plus some outside noise and noise introduced by the microphone and electronics. If in that 1 second you happen to have, say, 512 time points (so the mic can sample at 512 times a second), then each of those time points represents the intensity picked up by the mic. These sound intensity values can be turned from their time-domain representation to a frequency-domain representation using the FFT.
If you now give this to the FFT, as it is a real-valued input, you will get a symmetric complex output (symmetric around the central value) and can ignore the second half of the complex vector output and use only the first half - i.e. the second half will be symmetric (and thus "identical") to the first half. The output represents the contributions of each frequency to the input waveform - in essence, each "bin" or array index contains information about that frequency's amplitude. To extract the amplitude you want to do:
magnitudeFFTData[i] = Math.sqrt((real * real) + (imaginary * imaginary));
where real and imaginary are the real and imaginary parts of the complex number at that frequency bin. To get the frequency corresponding to a given bin, you need the following:
frequency = i * Fs / N;
where i is bin or array index number, Fs the sampling frequency and N the number of data points. From a project of mine wherein I recently used the FFT:
for (int i = (curPersonFFTData.length / 64); i < (curPersonFFTData.length / 40); i++) {
double rr = (curPersonFFTData[i].getReal());
double ri = (curPersonFFTData[i].getImaginary());
magnitudeCurPersonFFTData[i] = Math.sqrt((rr * rr) + (ri * ri));
ds.addValue(magnitudeCurPersonFFTData[i]);
}
The divisions by 64 and 40 are arbitrary and useful for my case only, to only get certain frequency components, as opposed to all frequencies, which you might want. You can easily do all this in real time.
Hi I am building a simple multilayer network which is trained using back propagation. My problem at the moment is that some attributes in my dataset are nominal (non numeric) and I have to normalize them. I wanted to know what the best approach is. I was thinking along the lines of counting up how many distinct values there are for each attribute and assigning each an equal number between 0 and 1. For example suppose one of my attributes had values A to E then would the following be suitable?:
A = 0
B = 0.25
C = 0.5
D = 0.75
E = 1
The second part to my question is denormalizing the output to get it back to a nominal value. Would I first do the same as above to each distinct output attribute value in the dataset in order to get a numerical representation? Also after I get an output from the network, do I just see which number it is closer to? For example if I got 0.435 as an output and my output attribute values were assigned like this:
x = 0
y = 0.5
z = 1
Do I just find the nearest value to the output (0.435) which is y (0.5)?
You can only do what you are proposing if the variables are ordinal and not nominal, and even then it is a somewhat arbitrary decision. Before I suggest a solution, a note on terminology:
Nominal vs ordinal variables
Suppose A, B, etc stand for colours. These are the values of a nominal variable and can not be ordered in a meaningful way. You can't say red is greater than yellow. Therefore, you should not be assigning numbers to nominal variables .
Now suppose A, B, C, etc stand for garment sizes, e.g. small, medium, large, etc. Even though we are not measuring these sizes on an absolute scale (i.e. we don't say that small corresponds to 40 a chest circumference), it is clear that small < medium < large. With that in mind, it is still somewhat arbitrary whether you set small=1, medium=2, large=3, or small=2, medium=4, large=8.
One-of-N encoding
A better way to go about this is to to use the so called one-out-of-N encoding. If you have 5 distinct values, you need five input units, each of which can take the value 1 or 0. Continuing with my garments example, size extra small can be encoded as 10000, small as 01000, medium as 00100, etc.
A similar principle applies to the outputs of the network. If we treat garment size as output instead of input, when the network output the vector [0.01 -0.01 0.5 0.0001 -.0002], you interpret that as size medium.
In reply to your comment on #Daan's post: if you have 5 inputs, one of which takes 20 possible discrete values, you will need 24 input nodes. You might want to normalise the values of your 4 continuous inputs to the range [0, 1], because they may end out dominating your discrete variable.
It really depends on the meaning of the attributes you're trying to normalize, and the functions used inside your NN. For example, if your attributes are non-linear, or if you're using a non-linear activation function, then linear normalization might not end up doing what you want it to do.
If the ranges of attribute values are relatively small, splitting the input and output into sets of binary inputs and outputs will probably be simpler and more accurate.
EDIT:
If the NN was able to accurately perform it's function, one of the outputs will be significantly higher than the others. If not, you might have a problem, depending on when you see inaccurate results.
Inaccurate results during early training are expected. They should become less and less common as you perform more training iterations. If they don't, your NN might not be appropriate for the task you're trying to perform. This could be simply a matter of increasing the size and/or number of hidden layers. Or it could be a more fundamental problem, requiring knowledge of what you're trying to do.
If you've succesfully trained your NN but are seeing inaccuracies when processing real-world data sets, then your training sets were likely not representative enough.
In all of these cases, there's a strong likelihood that your NN did something entirely different than what you wanted it to do. So at this point, simply selecting the highest output is as good a guess as any. But there's absolutely no guarantee that it'll be a better guess.
I am implementing MFCC algorithm in Java.
There is a sample code here: http://www.ee.columbia.edu/~dpwe/muscontent/practical/mfcc.m at Matlab. However I have some problems with mel filter banking process. How to generate triangular windows and how to use them?
PS1: An article which has a part that describes MFCC: http://arxiv.org/pdf/1003.4083
PS2: If there is a document about MFCC algorithms steps basically, it will be good.
PS3: My main question is related to that: MFCC with Java Linear and Logarithmic Filters some implementations use both linear and logarithmic filter and some of them not. What is that filters and what is the center frequent concept. I follow that code:MFCC Java , what is the difference of it between that code: MFCC Matlab
Triangular windows as frequency band filters aren't hard to implement. You basically want to integrate the FFT data within each band (defined as the frequency space between center frequency i-1 and center frequency i+1).
You're basically looking for something like,
for(int bandIdx = 0; bandIdx < numBands; bandIdx++) {
int startFreqIdx = centerFreqs[bandIdx-1];
int centerFreqIdx = centerFreqs[bandIdx];
int stopFreqIdx = centerFreqs[bandIdx+1];
for(int freq = startFreqIdx; i < centerFreqIdx; i++) {
magnitudeScale = centerFreqIdx-startFreqIdx;
bandData[bandIdx] += fftData[freq]*(i-startFreqIdx)/magnitudeScale;
}
for(int freq = centerFreqIdx; i <= stopFreqIdx; i++) {
magnitudeScale = centerFreqIdx-stopFreqIdx;
bandData[bandIdx] += fftData[freq]*(i-stopFreqIdx)/magnitudeScale;
}
}
If you do not understand the concept of a "center frequency" or a "band" or a "filter," pick up an elementary signals textbook--you shouldn't be implementing this algorithm without understanding what it does.
As for what the exact center frequencies are, it's up to you. Experiment and pick (or find in publications) values that capture the information you want to isolate from the data. The reason that there are no definitive values, or even scale for values, is because this algorithm tries to approximate a human ear, which is a very complicated listening device. Whereas one scale may work better for, say, speech, another may work better for music, etc. It's up to you to choose what is appropriate.
Answer for the second PS: I found this tutorial that really helped me computing the MFCCs.
As for the triangular windows and the filterbanks, from what I understood, they do overlap, they do not extend to negative frequences and the whole process of computing them from the FFT spectrum and applying them back to it goes something like this:
Choose a minimum and a maximum frequency for the filters (for example, min freq = 300Hz - the minimum voice frequency and max frequency = your sample rate / 2. Maybe this is where you should choose the 1000Hz limit you were talking about)
Compute the mel values from the min and max chosen frequences. Formula here.
Compute N equally distanced values between these two mel values. (I've seen examples of different values for N, you can even find a efficiency comparison for different of values in this work, for my tests I've picked 26)
Convert these values back to Hz. (you can find the formula on the same wiki page) => array of N + 2 filter values
Compute a filterbank (filter triangle) for each three consecutive values, either how Thomas suggested above (being careful with the indexes) or like in the turorial recommended at the beginning of this post) => an array of arrays, size NxM, asuming your FFT returned 2*M values and you only use M.
Pass the whole power spectrum (M values obtained from FFT) through each triangular filter to get a "filterbank energy" for each filter (for each filterbank (N loop), multiply each magnitude obtained after FFT to each value in the corresponding filterbank (M loop) and add the M obtained values) => N-sized array of energies.
These are your filterbank energies that you can further apply a log to, apply the DCT and extract the MFCCs...