Avoid overmodulation/distorsion when applying gain to PCM - java

I work on an audio recorder (AudioRec on Google Play).
I have the option to adjust the gain with [-20dB, + 20dB] range.
It works pretty well on my phone, but an user using a professional microphone attached to his device had complained about the gain because when selecting -20dB, the output is distorted.
See below how I impl. gain function:
for(int frameIndex=0; frameIndex<numFrames; frameIndex++){
for(int c=0; c<nChannels; c++){
if(rGain != 1){
// gain
long accumulator=0;
for(int b=0; b<bytesPerSample; b++){
accumulator+=((long)(source[byteIndex++]&0xFF))<<(b*8+emptySpace);
}
double sample = ((double)accumulator/(double)Long.MAX_VALUE);
sample *= rGain;
int intValue = (int)((double)sample*(double)Integer.MAX_VALUE);
for(int i=0; i<bytesPerSample; i++){
source[i+byteIndex2]=(byte)(intValue >>> ((i+2)*8) & 0xff);
}
byteIndex2 += bytesPerSample;
}
}//end for(channel)
}//end for(frameIndex)
Maybe I should apply some low/high filter after samle *= rGain; ? Something like if(sample < MINIMUM_VALUE || sample > MAXIMUM_VALUE) ? in this case, please let me know what are these min max values...

Simply clipping values above a threshold will most certainly cause distortion. If you can picture a pure sine wave, as you lop the top off it will begin to resemble a square wave.
That said, if you have an input signal and you are multiplying it by a value smaller than one, there is no way that you are introducing any (significant) distortion. You need to look further back in the signal path. Perhaps clipping is occurring at the input.

I would try to simplify your logic. It appears you are using 32-bit wave form but the code is far more complex than needed. This will make it harder to work out how to avoid clipping.
IntBuffer ints = ByteBuffer.wrap(source).order(ByteBuffer.nativeOrder()).asIntBuffer();
for(int i = 0; i < ints.limit(); i++) {
int signal = ints.get(i);
double gained = signal * gain;
if (gained > Integer.MAX_VALUE) {
// do something.
} else if (gained < Integer.MIN_VALUE) {
// do something
}
ints.put(i, (int) gained);
}
A simple approach is to let the values overflow, but as you say this can result in an apparent distortion. Just clipping the data could lead to long period of effective silence.
What you may have to do is a FFT and produce a signal which increases the strength of audible frequencies as the cost of lower frequencies when the gain is too high. i.e. it is the low frequencies which result in the signal being too high or too low so you can't amplify these as much if you want to stay in bounds.

Related

Coefficient Correlation Over a Large Binary Image Data-Set - Slow Performance

I am trying to build an OCR by calculating the Coefficient Correlation between characters extracted from an image with every character I have pre-stored in a database. My implementation is based on Java and pre-stored characters are loaded into an ArrayList upon the beginning of the application, i.e.
ArrayList<byte []> storedCharacters, extractedCharacters;
storedCharacters = load_all_characters_from_database();
extractedCharacters = extract_characters_from_image();
// Calculate the coefficent between every extracted character
// and every character in database.
double maxCorr = -1;
for(byte [] extractedCharacter : extractedCharacters)
for(byte [] storedCharacter : storedCharactes)
{
corr = findCorrelation(extractedCharacter, storedCharacter)
if (corr > maxCorr)
maxCorr = corr;
}
...
...
public double findCorrelation(byte [] extractedCharacter, byte [] storedCharacter)
{
double mag1, mag2, corr = 0;
for(int i=0; i < extractedCharacter.length; i++)
{
mag1 += extractedCharacter[i] * extractedCharacter[i];
mag2 += storedCharacter[i] * storedCharacter[i];
corr += extractedCharacter[i] * storedCharacter[i];
} // for
corr /= Math.sqrt(mag1*mag2);
return corr;
}
The number of extractedCharacters are around 100-150 per image but the database has 15600 stored binary characters. Checking the coefficient correlation between every extracted character and every stored character has an impact on the performance as it needs around 15-20 seconds to complete for every image, with an Intel i5 CPU.
Is there a way to improve the speed of this program, or suggesting another path of building this bringing similar results. (The results produced by comparing every character with such a large dataset is quite good).
Thank you in advance
UPDATE 1
public static void run() {
ArrayList<byte []> storedCharacters, extractedCharacters;
storedCharacters = load_all_characters_from_database();
extractedCharacters = extract_characters_from_image();
// Calculate the coefficent between every extracted character
// and every character in database.
computeNorms(charComps, extractedCharacters);
double maxCorr = -1;
for(byte [] extractedCharacter : extractedCharacters)
for(byte [] storedCharacter : storedCharactes)
{
corr = findCorrelation(extractedCharacter, storedCharacter)
if (corr > maxCorr)
maxCorr = corr;
}
}
}
private static double[] storedNorms;
private static double[] extractedNorms;
// Correlation between to binary images
public static double findCorrelation(byte[] arr1, byte[] arr2, int strCharIndex, int extCharNo){
final int dotProduct = dotProduct(arr1, arr2);
final double corr = dotProduct * storedNorms[strCharIndex] * extractedNorms[extCharNo];
return corr;
}
public static void computeNorms(ArrayList<byte[]> storedCharacters, ArrayList<byte[]> extractedCharacters) {
storedNorms = computeInvNorms(storedCharacters);
extractedNorms = computeInvNorms(extractedCharacters);
}
private static double[] computeInvNorms(List<byte []> a) {
final double[] result = new double[a.size()];
for (int i=0; i < result.length; ++i)
result[i] = 1 / Math.sqrt(dotProduct(a.get(i), a.get(i)));
return result;
}
private static int dotProduct(byte[] arr1, byte[] arr2) {
int dotProduct = 0;
for(int i = 0; i< arr1.length; i++)
dotProduct += arr1[i] * arr2[i];
return dotProduct;
}
Nowadays, it's hard to find a CPU with a single core (even in mobiles). As the tasks are nicely separated, you can do it with a few lines only. So I'd go for it, though the gain is limited.
In case you really mean cross-correlation, then a transform like DFT or DCT could help. They surely do for big images, but with yours 12x16, I'm not sure.
Maybe you mean just a dot product? And maybe you should tell us?
Note that you actually don't need to compute the correlation, most of the time you only need is find out if it's bigger than a threshold:
corr = findCorrelation(extractedCharacter, storedCharacter)
..... more code to check if this is the best match ......
This may lead to some optimizations or not, depending on how the images look like.
Note also that a simple low level optimization can give you nearly a factor of 4 as in this question of mine. Maybe you really should tell us what you're doing?
UPDATE 1
I guess that due to the computation of three products in the loop, there's enough instruction level parallelism, so a manual loop unrolling like in my above question is not necessary.
However, I see that those three products get computed some 100 * 15600 times, while only one of them depends on both extractedCharacter and storedCharacter. So you can compute
100 + 15600 + 100 * 15600
dot products instead of
3 * 100 * 15600
This way you may get a factor of three pretty easily.
Or not. After this step there's a single sum computed in the relevant step and the problem linked above applies. And so does its solution (unrolling manually).
Factor 5.2
While byte[] is nicely compact, the computation involves extending them to ints, which costs some time as my benchmark shows. Converting the byte[]s to int[]s before all the correlations gets computed saves time. Even better is to make use of the fact that this conversion for storedCharacters can be done beforehand.
Manual loop unrolling twice helps but unrolling more doesn't.

Light/Ray casting in java?

I've been trying to make a dynamic light system in java, without using libraries. For some reason, though, it seems I can't get light to run efficiently. It flickers and lags a ton. I'm doing this with no previous understanding of lighting engines in games, so I'm open to suggestions. Here is my current update method:
public void updateLight( ArrayList<Block> blocks )
{
//reset light
light.reset();
//add the x and y of this light
light.addPoint( x, y );
//precision for loops
int ires = 1;
int jres = 2;
for( int i = 0; i < width; i += ires )
{
//get radians of current angle
float rdir = (float)Math.toRadians( dir + i - width/2 );
//set up pixel vars
int px, py;
for( int j = 0; j < length; j += jres )
{
//get position of pixel
px = (int)ZZmath.getVectorX( x, rdir, j );
py = (int)ZZmath.getVectorY( y, rdir, j );
//if point gets found
boolean foundpoint = false;
for( int n = 0; n < blocks.size(); n ++ )
{
//check if block is solid
//also check that collision is possible really quickly for efficiency
if( blocks.get( n ).solid )
{
//get info on block
int bx = blocks.get( n ).x;
int by = blocks.get( n ).y;
//quick trim
if( Math.abs( bx - px ) <= 32 && Math.abs( by - py ) <= 32 )
{
int bw = blocks.get( n ).w;
int bh = blocks.get( n ).h;
if( ZZmath.pointInBounds( px, py, bx, by, bw, bh ) )
{
//add point to polygon
light.addPoint( px, py );
//found point
foundpoint = true;
}
}
}
}
//if a point is found, break
if( foundpoint )
{
break;
}
//if at end of loop, add point
//loose definition of "end" to prevent flickers
if( j >= length - jres*2 )
{
light.addPoint( px, py );
}
}
}
}
This modifies a polygon that displays for light. I'll change that later. Any idea of ways I can make this run better? Also, no, no libraries. I don't have anything against them, just don't want to use one now.
You implementation doesn't appear to use much of the stuff I see here:
http://www.cs.utah.edu/~shirley/books/fcg2/rt.pdf
I'd recommend digesting this completely. If your objective is to understand ray tracing deeply, that's how it should be done.
Maybe your objective was to learn by writing your own raytracer. In my experience I would end up rewriting this code several times and still not get it completely right. It's good to get your hands dirty but it's not necessarily the most effective way to go about things.
Overall it looks like you need to study (object oriented) programming concepts, and take a data structures and algorithms course.
The biggest thing is readability. Document your code, for your future self if no one else. This means Clear comments before and during updateLight(). The existing comments are alright (though they paraphrase the code more than justify it), but "really quickly for efficiency" is a lie.
For a small issue of readability that could be a tiny drag on performance, make a local variable for blocks.get(n). Name it something short but descriptive, save typing and only make one method call to retrieve it.
"if at end of loop": I have no idea which loop you mean, and the for loops have definite ends. A comment }//end for or }//end for width is often helpful.
Checking if the block is solid is unnecessary! Just store your blocks in two lists, and only go through the solid blocks. Even if you have some desire to have flickering blocks, one remove and add is cheaper than O(width*length*numbernotsolid) extra work.
There are many ways you could structure how the blocks are stored to facilitate quick testing. You only want or need to test blocks whose coordinates are near to a particular light. The basic strategy is divide the space into a grid, and sort the blocks based on what section of the grid they fall into. Then when you have light in a particular section of the grid, you know you only need to test blocks in that section (and possibly a neighboring section - there are details depending on their width and the light's).
I have no idea whether that is along the lines of the right approach or not. I don't know much about raytracing, although it is or used to be rather slow. It looks like you have a decent naive implementation. There might be a slightly different naive approach that is faster and some more difficult (to code to completion) algorithms that are moderately yet more fast.
Also, I see no need to do this breadth first. Why not solve for one line (you call them pixels?) at a time. Count the number of times this code calls Math.toRadians. It looks like it's just an extraneous line because you could work along the same angle until ready for the next.

How do I know that my neural network is being trained correctly

I've written an Adaline Neural Network. Everything that I have compiles, so I know that there isn't a problem with what I've written, but how do I know that I have to algorithm correct? When I try training the network, my computer just says the application is running and it just goes. After about 2 minutes I just stopped it.
Does training normally take this long (I have 10 parameters and 669 observations)?
Do I just need to let it run longer?
Hear is my train method
public void trainNetwork()
{
int good = 0;
//train until all patterns are good.
while(good < trainingData.size())
{
for(int i=0; i< trainingData.size(); i++)
{
this.setInputNodeValues(trainingData.get(i));
adalineNode.run();
if(nodeList.get(nodeList.size()-1).getValue(Constants.NODE_VALUE) != adalineNode.getValue(Constants.NODE_VALUE))
{
adalineNode.learn();
}
else
{
good++;
}
}
}
}
And here is my learn method
public void learn()
{
Double nodeValue = value.get(Constants.NODE_VALUE);
double nodeError = nodeValue * -2.0;
error.put(Constants.NODE_ERROR, nodeError);
BaseLink link;
int count = inLinks.size();
double delta;
for(int i = 0; i < count; i++)
{
link = inLinks.get(i);
Double learningRate = value.get(Constants.LEARNING_RATE);
Double value = inLinks.get(i).getInValue(Constants.NODE_VALUE);
delta = learningRate * value * nodeError;
inLinks.get(i).updateWeight(delta);
}
}
And here is my run method
public void run()
{
double total = 0;
//find out how many input links there are
int count = inLinks.size();
for(int i = 0; i< count-1; i++)
{
//grab a specific link in sequence
BaseLink specificInLink = inLinks.get(i);
Double weightedValue = specificInLink.weightedInValue(Constants.NODE_VALUE);
total += weightedValue;
}
this.setValue(Constants.NODE_VALUE, this.transferFunction(total));
}
These functions are part of a library that I'm writing. I have the entire thing on Github here. Now that everything is written, I just don't know how I should go about actually testing to make sure that I have the training method written correctly.
I asked a similar question a few months ago.
Ten parameters with 669 observations is not a large data set. So there is probably an issue with your algorithm. There are two things you can do that will make debugging your algorithm much easier:
Print the sum of squared errors at the end of each iteration. This will help you determine if the algorithm is converging (at all), stuck at a local minimum, or just very slowly converging.
Test your code on a simple data set. Pick something easy like a two-dimensional input that you know is linearly separable. Will your algorithm learn a simple AND function of two inputs? If so, will it lean an XOR function (2 inputs, 2 hidden nodes, 2 outputs)?
You should be adding debug/test mode messages to watch if the weights are getting saturated and more converged. It is likely that good < trainingData.size() is not happening.
Based on Double nodeValue = value.get(Constants.NODE_VALUE); I assume NODE_VALUE is of type Double ? If that's the case then this line nodeList.get(nodeList.size()-1).getValue(Constants.NODE_VALUE) != adalineNode.getValue(Constants.NODE_VALUE) may not really converge exactly as it is of type double with lot of other parameters involved in obtaining its value and your convergence relies on it. Typically while training a neural network you stop when the convergence is within an acceptable error limit (not a strict equality like you are trying to check).
Hope this helps

Android library to get pitch from WAV file

I have a list of sampled data from the WAV file. I would like to pass in these values into a library and get the frequency of the music played in the WAV file. For now, I will have 1 frequency in the WAV file and I would like to find a library that is compatible with Android. I understand that I need to use FFT to get the frequency domain. Is there any good libraries for that? I found that [KissFFT][1] is quite popular but I am not very sure how compatible it is on Android. Is there an easier and good library that can perform the task I want?
EDIT:
I tried to use JTransforms to get the FFT of the WAV file but always failed at getting the correct frequency of the file. Currently, the WAV file contains sine curve of 440Hz, music note A4. However, I got the result as 441. Then I tried to get the frequency of G4, I got the result as 882Hz which is incorrect. The frequency of G4 is supposed to be 783Hz. Could it be due to not enough samples? If yes, how much samples should I take?
//DFT
DoubleFFT_1D fft = new DoubleFFT_1D(numOfFrames);
double max_fftval = -1;
int max_i = -1;
double[] fftData = new double[numOfFrames * 2];
for (int i = 0; i < numOfFrames; i++) {
// copying audio data to the fft data buffer, imaginary part is 0
fftData[2 * i] = buffer[i];
fftData[2 * i + 1] = 0;
}
fft.complexForward(fftData);
for (int i = 0; i < fftData.length; i += 2) {
// complex numbers -> vectors, so we compute the length of the vector, which is sqrt(realpart^2+imaginarypart^2)
double vlen = Math.sqrt((fftData[i] * fftData[i]) + (fftData[i + 1] * fftData[i + 1]));
//fd.append(Double.toString(vlen));
// fd.append(",");
if (max_fftval < vlen) {
// if this length is bigger than our stored biggest length
max_fftval = vlen;
max_i = i;
}
}
//double dominantFreq = ((double)max_i / fftData.length) * sampleRate;
double dominantFreq = (max_i/2.0) * sampleRate / numOfFrames;
fd.append(Double.toString(dominantFreq));
Can someone help me out?
EDIT2: I manage to fix the problem mentioned above by increasing the number of samples to 100000, however, sometimes I am getting the overtones as the frequency. Any idea how to fix it? Should I use Harmonic Product Frequency or Autocorrelation algorithms?
I realised my mistake. If I take more samples, the accuracy will increase. However, this method is still not complete as I still have some problems in obtaining accurate results for piano/voice sounds.

for-loop very slow on Android device

I just ran into an issue while trying to write an bitmap-manipulating algo for an android device.
I have a 1680x128 pixel Bitmap and need to apply a filter on it. But this very simple code-piece actually took almost 15-20 seconds to run on my Android device (xperia ray with a 1Ghz processor).
So I tried to find the bottleneck and reduced as many code lines as possible and ended up with the loop itself, which took almost the same time to run.
for (int j = 0; j < 128; j++) {
for (int i = 0; i < 1680; i++) {
Double test = Math.random();
}
}
Is it normal for such a device taking so much time in a simple for-loop with no difficult operations?
I'm very new to programming on mobile devices so please excuse if this question may be stupid.
UPDATE: Got it faster now with some simpler operations.
But back to my main problem:
public static void filterImage(Bitmap img, FilterStrategy filter) {
img.prepareToDraw();
int height = img.getHeight();
int width = img.getWidth();
RGB rgb;
for (int j = 0; j < height; j++) {
for (int i = 0; i < width; i++) {
rgb = new RGB(img.getPixel(i, j));
if (filter.isBlack(rgb)) {
img.setPixel(i, j, 0);
} else
img.setPixel(i, j, 0xffffffff);
}
}
return;
}
The code above is what I really need to run faster on the device. (nearly immediate)
Do you see any optimizing potential in it?
RGB is only a class that calculates the red, green and blue value and the filter simply returns true if all three color parts are below 100 or any othe specified value.
Already the loop around img.getPixel(i,j) or setPixel takes 20 or more seconds. Is this such an expensive operation?
It may be because too many Objects of type Double being created.. thus it increase heap size and device starts freezing..
A way around is
double[] arr = new double[128]
for (int j = 0; j < 128; j++) {
for (int i = 0; i < 1680; i++) {
arr[i] = Math.random();
}
}
First of all Stephen C makes a good argument: Try to avoid creating a bunch of RGB-objects.
Second of all, you can make a huge improvement by replacing your relatively expensive calls to getPixel with a single call to getPixels
I made some quick testing and managed to cut to runtime to about 10%. Try it out. This was the code I used:
int[] pixels = new int[height * width];
img.getPixels(pixels, 0, width, 0, 0, width, height);
for(int pixel:pixels) {
// check the pixel
}
There is a disclaimer in the docs below for random that might be affecting performance, try creating an instance yourself rather than using the static version, I have highlighted the performance disclaimer in bold:
Returns a pseudo-random double n, where n >= 0.0 && n < 1.0. This method reuses a single instance of Random. This method is thread-safe because access to the Random is synchronized, but this harms scalability. Applications may find a performance benefit from allocating a Random for each of their threads.
Try creating your own random as a static field of your class to avoid synchronized access:
private static Random random = new Random();
Then use it as follows:
double r = random.nextDouble();
also consider using float (random.nextFloat()) if you do not need double precision.
RGB is only a class that calculates the red, green and blue value and the filter simply returns true if all three color parts are below 100 or any othe specified value.
One problem is that you are creating height * width instances of the RGB class, simply to test whether a single pizel is black. Replace that method with a static method call that takes the pixel to be tested as an argument.
More generally, if you don't know why some piece of code is slow ... profile it. In this case, the profiler would tell you that a significant amount of time is spent in the RGB constructor. And the memory profiler would tell you that large numbers of RGB objects are being created and garbage collected.

Categories