How do I combine/merge two wav files into one wav file?

How can I merge two wav files using java?
I tried this but it didn't work correctly. Is there any other way to do it?

If you work with the bytes of a WAV file directly, you can use the same strategy in any programming language. For this example I'll assume the two source files have the same sample rate, bit depth and number of channels, and are the same length/size.
(If not, you can probably edit them before starting the merge.)
First look over the WAV specification; I found a good one at a Stanford course website.
Common header lengths are 44 or 46 bytes.
If you want to concatenate two files (i.e. play one WAV, then the other, in a single file):
1. Find out what format your WAV files are.
2. Chop off the first 44/46 bytes, which are the header; the remainder of the file is the data.
3. Create a new file and put one of the headers in it.
new wav file = {header} = {44/46} bytes long
4. Append the two data parts from the original files.
new wav file = {header + data1 + data2} = {44/46 + size(data1) + size(data2)} bytes long
5. Modify the header in two places to reflect the new file's length.
a. Modify bytes 4+4 (i.e. the 4 bytes starting at offset 4).
The new value should be the size of the new WAV file in bytes minus 8, i.e. {44/46 + size(data1) + size(data2)} - 8, stored as a 32-bit little-endian integer.
b. Modify bytes 40+4 or 42+4 (the 4 bytes starting at offset 40 or 42, depending on whether you have a 44-byte or a 46-byte header).
The new value should be the total size of the audio data only, i.e. {size(data1) + size(data2)}, also stored as a 32-bit little-endian integer.
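The concatenation steps above can be sketched in Java as follows. This is a minimal sketch, not a general WAV parser: it assumes both inputs are canonical PCM files with the same format and a fixed 44-byte header, and the class and method names (`WavConcat`, `concat`) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class WavConcat {
    // Assumption: canonical PCM WAV with a 44-byte header and no extra chunks.
    static final int HEADER_SIZE = 44;

    /** Concatenates two in-memory WAV files that share the same audio format. */
    static byte[] concat(byte[] wav1, byte[] wav2) {
        int data1 = wav1.length - HEADER_SIZE;
        int data2 = wav2.length - HEADER_SIZE;
        byte[] out = new byte[HEADER_SIZE + data1 + data2];

        // Header and data of the first file, then only the data of the second.
        System.arraycopy(wav1, 0, out, 0, HEADER_SIZE + data1);
        System.arraycopy(wav2, HEADER_SIZE, out, HEADER_SIZE + data1, data2);

        // Patch the two size fields; WAV stores them as 32-bit little-endian.
        ByteBuffer header = ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN);
        header.putInt(4, out.length - 8);            // RIFF chunk size: file size minus 8
        header.putInt(HEADER_SIZE - 4, data1 + data2); // data chunk size: audio bytes only
        return out;
    }
}
```

The input arrays could come from `java.nio.file.Files.readAllBytes`, and the result could be written back with `Files.write`. Files with extra chunks before the data would need real header parsing instead of a fixed offset.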
If you instead want to merge or mix the two files (so that they both play at the same time):
You won't have to edit the header if both files are the same length.
Starting at byte 44/46, you have to set each sample to the value in data1 + the value in data2.
For example, if your sample size (bits per sample) is 8 bits you modify 1 byte at a time; if it is 16 bits you modify 2 bytes.
The rest of the file is just samples of 1 or 2 bytes, each storing an integer value representing the waveform of the sound at that time. In PCM WAV files, 8-bit samples are unsigned and 16-bit samples are signed little-endian.
a. For each sample, read the 1 or 2 bytes from both data1 and data2 and convert them to integer values.
b. Add the two integers together,
then convert the result back to 1 or 2 bytes and use that value in your output file.
c. You normally have to divide the sum by 2 to get an average value that fits back in the original 1/2-byte sample block. I was getting distortion when I tried it in Objective-C (probably related to signed vs. unsigned ints) and just skipped the division, since overflow is only likely to be a problem if you are merging very loud sounds together.
That is, when data1 + data2 is too large to fit in 1/2 bytes, the sound will clip. There was a discussion about the clipping issue here and you may want to try one of those clipping techniques.
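A minimal Java sketch of the mixing steps, assuming 16-bit signed little-endian PCM, a 44-byte header and two files of identical length. Instead of dividing by 2, it clamps the sum to the 16-bit range, which is one simple way to handle overflow; `WavMix` and `mix` are hypothetical names.

```java
class WavMix {
    // Assumption: canonical 44-byte header, 16-bit signed little-endian samples.
    static final int HEADER_SIZE = 44;

    /** Mixes two equal-length WAV files sample by sample, clamping on overflow. */
    static byte[] mix(byte[] wav1, byte[] wav2) {
        if (wav1.length != wav2.length) {
            throw new IllegalArgumentException("files must be the same length");
        }
        byte[] out = wav1.clone(); // reuse the header of the first file unchanged
        for (int i = HEADER_SIZE; i + 1 < out.length; i += 2) {
            // Low byte is unsigned, high byte carries the sign.
            int s1 = (wav1[i] & 0xff) | (wav1[i + 1] << 8);
            int s2 = (wav2[i] & 0xff) | (wav2[i + 1] << 8);
            int sum = s1 + s2;
            // Hard-clip to the range of a 16-bit sample.
            sum = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
            out[i] = (byte) sum;            // low byte
            out[i + 1] = (byte) (sum >> 8); // high byte
        }
        return out;
    }
}
```

Hard clipping still distorts when both inputs are loud; averaging (`sum / 2`) avoids that at the cost of halving the volume.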

Merge implies mixing, but it sounds like you mean concatenation here.
To concatenate with silence in the middle you need to insert a number of frames of silence into the file. A silent frame is one where every channel has a "0" - if you are using signed samples this is literally a 0, for unsigned, it is maxvalue/2.
Each frame will have one sample for each channel. So to generate one second of silence in CD format, you would insert 44100 (hz) * 2 (channels per frame) = 88200 16 bit signed ints with a value of 0 each. I am not sure how to access the raw file abstracted by the Java audio abstractions, but that is the data to insert.
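As a rough illustration of the silence calculation, the sketch below builds the raw bytes to insert between the two data sections. It works on plain byte arrays rather than the Java Sound API, and the names `Silence`, `signedSilence` and `unsigned8BitSilence` are hypothetical.

```java
import java.util.Arrays;

class Silence {
    /** Silence for signed PCM: every sample of every channel is 0. */
    static byte[] signedSilence(int sampleRate, int channels, int bytesPerSample, int seconds) {
        int frames = sampleRate * seconds;
        // Java zero-fills new arrays, which is exactly signed silence.
        return new byte[frames * channels * bytesPerSample];
    }

    /** Silence for unsigned 8-bit PCM: every sample sits at the midpoint, 128. */
    static byte[] unsigned8BitSilence(int sampleRate, int channels, int seconds) {
        byte[] out = new byte[sampleRate * seconds * channels];
        Arrays.fill(out, (byte) 0x80);
        return out;
    }
}
```

For one second of CD-format audio, `signedSilence(44100, 2, 2, 1)` yields 88200 samples of 2 bytes each, i.e. 176400 bytes.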

Related

What happens if a file doesn't end exactly at the last byte?

For example, if a file is 100 bits, it would be stored as 13 bytes. This means that the first 4 bits of the last byte belong to the file and the last 4 do not (useless data).
So how is this handled when reading a file using the FileInputStream.read() function in Java, or similar functions in other programming languages?

You'll notice, if you ever use assembly, that there's no way to directly read a specific bit. The smallest addressable unit of memory is a byte; memory addresses refer to a specific byte in memory. To access a specific bit, you have to use bitwise operators like |, & and ^. So in this situation, if you store 100 bits, you're actually storing a minimum of 13 bytes, and the few unused bits just default to 0, so the results are the same.

Current file systems mostly store files that are an integral number of bytes, so the issue does not arise. You cannot write a file that is exactly 100 bits long. The reason for this is simple: the file metadata holds the length in bytes and not the length in bits.
This is a conscious design choice by the designers of the file system. They presumably chose this design because there's very little need for files that are an arbitrary number of bits long.
Those cases that do need a file to contain a non-integral number of bytes can (and need to) make their own arrangements. Perhaps the 100-bit case could insert a header that says, in effect, that only the first 100 bits of the following 13 bytes have useful data. This would of course need special handling, either in the application or in some library that handled that sort of file data.
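One possible shape for such an arrangement, sketched in Java: a 4-byte header holds the bit count, followed by the bits packed into whole bytes. The format and the names (`BitFile`, `pack`, `unpack`) are invented for this example and are not a standard.

```java
import java.nio.ByteBuffer;

class BitFile {
    /** Returns a 4-byte bit count followed by the bits packed MSB-first into bytes. */
    static byte[] pack(boolean[] bits) {
        byte[] packed = new byte[(bits.length + 7) / 8]; // round up to whole bytes
        for (int i = 0; i < bits.length; i++) {
            if (bits[i]) {
                packed[i / 8] |= (byte) (0x80 >> (i % 8));
            }
        }
        // Unused trailing bits of the last byte stay 0.
        return ByteBuffer.allocate(4 + packed.length).putInt(bits.length).put(packed).array();
    }

    /** Reads the bit count from the header and returns exactly that many bits. */
    static boolean[] unpack(byte[] file) {
        int count = ByteBuffer.wrap(file).getInt();
        boolean[] bits = new boolean[count];
        for (int i = 0; i < count; i++) {
            bits[i] = (file[4 + i / 8] & (0x80 >> (i % 8))) != 0;
        }
        return bits;
    }
}
```

With 100 bits, `pack` produces 4 + 13 = 17 bytes, and `unpack` ignores the 4 padding bits in the last byte because the header says only 100 bits are meaningful.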
Comments about bit-lengthed files not being possible because of the size of a boolean, etc., seem to me to miss the point. Certainly disk storage granularity is not the issue: we can store a "100 byte" file on a device that can only handle units of 256 bytes - all it takes is for the file system to note that the file size is 100, not 256, even though 256 bytes are allocated to the file. It could equally well track that the size was 100 bits, if that were useful. And, of course, we'd need I/O syscalls that expressed the transfer length in bits. But that's not hard. The in-memory buffer would need to be slightly larger, because neither the language nor the OS allocates RAM in arbitrary bit-lengths, but that's not tied tightly to file size.

Storing a (string,integer) tuple more efficiently and apply binary search

Introduction
We store tuples (string,int) in a binary file. The string represents a word (no spaces nor numbers). In order to find a word, we apply binary search algorithm, since we know that all the tuples are sorted with respect to the word.
In order to store this, we use writeUTF for the string and writeInt for the integer. Other than that, let's assume for now there are no ways to distinguish between the start and the end of the tuple unless we know them in advance.
Problem
When we apply binary search, we get a position (i.e. (a+b)/2) in the file, which we can read using the methods of RandomAccessFile, i.e. we can read the byte at that place. However, since we may be in the middle of a word, we cannot know where this word starts or finishes.
Solution
Here're two possible solutions we came up with, however, we're trying to decide which one will be more space efficient/faster.
Method 1: Instead of storing the integer as a number, we thought of storing it as a string (using e.g. writeChars or writeUTF), because in that case we can insert a null character at the end of the tuple. That is, we can be sure that none of the methods used to serialize the data will produce the null character, since the information we store (letters and digits) has higher ASCII values.
Method 2: We keep the same structure, but instead we separate each tuple with 6-8 (or less) bytes of random noise (same across the file). In this case, we assume that words have a low entropy, so it's very unlikely they will have any signs of randomness. Even if the integer may get 4 bytes that are exactly the same as those in the random noise, the additional two bytes that follow will not (with high probability).
Which of these methods would you recommend? Is there a better way to store this kind of information. Note, we cannot serialize the entire file and later de-serialize it into memory, since it's very big (and we are not allowed to).

I assume you're trying to optimize for speed & space (in that order).
I'd use a different layout, built from 2 files:
Integer + Index file
Each "record" is exactly 8 bytes long, the lower 4 are the integer value for the record, and the upper 4 bytes are an integer representing the offset for the record in the other file (the characters file).
Characters file
Contiguous file of characters (UTF-8 encoding or anything you choose). "Records" are not separated, not terminated in any way, simple 1 by 1 characters. For example, the records Good, Hello, Morning will look like GoodHelloMorning.
To iterate the dataset, you iterate the integer/index file with direct access (recordNum * 8 is the byte offset of the record), read the integer and the characters offset, plus the character offset of the next record (which is the 4 byte integer at recordNum * 8 + 12), then read the string from the characters file between the offsets you read from the index file. Done!
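Here is a minimal in-memory sketch of that two-file layout with binary search on top. The two "files" are plain byte buffers here; on disk, `RandomAccessFile.seek` plus `readInt` would play the same role. The class name `WordIndex` is hypothetical, the words are assumed to be sorted in `String.compareTo` order, and the last record uses the end of the characters file as its end offset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class WordIndex {
    private final ByteBuffer index; // 8 bytes per record: int value, then int offset into chars
    private final byte[] chars;     // all words back to back, no separators
    private final int count;

    WordIndex(String[] sortedWords, int[] values) {
        count = sortedWords.length;
        index = ByteBuffer.allocate(count * 8);
        ByteArrayOutputStream text = new ByteArrayOutputStream();
        for (int i = 0; i < count; i++) {
            index.putInt(values[i]).putInt(text.size());
            byte[] word = sortedWords[i].getBytes(StandardCharsets.UTF_8);
            text.write(word, 0, word.length);
        }
        chars = text.toByteArray();
    }

    /** Reads the word of a record using its offset and the next record's offset. */
    private String wordAt(int record) {
        int start = index.getInt(record * 8 + 4);
        int end = (record + 1 < count) ? index.getInt(record * 8 + 12) : chars.length;
        return new String(chars, start, end - start, StandardCharsets.UTF_8);
    }

    /** Binary search over fixed-size records; returns null if the word is absent. */
    Integer find(String word) {
        int lo = 0, hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = wordAt(mid).compareTo(word);
            if (cmp == 0) return index.getInt(mid * 8);
            if (cmp < 0) lo = mid + 1; else hi = mid - 1;
        }
        return null;
    }
}
```

Because every index record is exactly 8 bytes, the search never lands in the middle of a word, which is the problem the question describes.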

it's less than 200MB. Max 20 chars for a word.
So why bother? Unless you work on some severely restricted system, load everything into a Map<String, Integer> and get a few orders of magnitude speed up.
But let's say, I'm overlooking something and let's continue.
Method 1: Instead of storing the integer as a number, we thought to store it as a string (using eg. writeChars or writeUTF), because in that case, we can insert a null character
You don't have to as you said that your word contains no numbers. So you can always parse things like 0124some456word789 uniquely.
The efficiency depends on the distribution. You may win a factor of 4 (single digit numbers) or lose a factor of 2.5 (10-digit numbers). You could save something by using a higher base. But there's the storage for the string and it may dominate.
Method 2: We keep the same structure, but instead we separate each tuple with 6-8 (or less) bytes of random noise (same across the file).
This is too wasteful. Using four zero bytes between the data bytes would do:
Find a sequence of at least four zeros.
Find the last zero.
That's the last separator byte.
Method 3: Using some hacks, you could ensure that the number contains no zero byte (either assuming that it doesn't use the whole range or representing it with five bytes). Then a single zero byte would do.
Method 4: As disk is organized in blocks, you should probably split your data into 4 KiB blocks. Then you can add some tiny header allowing quick access to the data (start indexes for the 8th, 16th, etc. piece of data). The range between, e.g., the 8th and 16th piece should be scanned sequentially, as that's both simpler and faster than binary search.

Combining text- and bit-information in a file in Java?

Alright, so we need to store a list of words and their respective position in a much bigger text. We've been asked if it's more efficient to save the position represented as text or represented as bits (data streams in Java).
I think that a binary representation is best, since the text "1024" takes up 4*8=32 bits while the number needs only 11 bits in binary.
The follow-up question is whether the index should be saved in one or two files. Here I thought "perhaps you can't combine text and binary representation in one file?" and that's the reason you'd need two files?
So the question first and foremost is: can I store text information (the word) combined with binary information (its position) in one file?

Too vague in terms of what's really needed.
If you have up to a few million words + positions, don't even bother thinking about it. Store them in whatever format is the simplest to implement; space would only be an issue if you need to send the data over a low-bandwidth network.
Then there is general data compression available, by just wrapping your Input/OutputStreams with deflater or gzip (already built in the JRE) you will get reasonably good compression (50% or more for text). That easily beats what you can quickly write yourself. If you need better compression there is XZ for java (implements LZMA compression), open source.
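As a small sketch of the stream-wrapping idea, the example below compresses a text index with the JRE's built-in `GZIPOutputStream` and reads it back with `GZIPInputStream`. It works on in-memory byte arrays to stay self-contained; with files you would wrap a `FileOutputStream` / `FileInputStream` the same way. `GzipIndex` is a made-up name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipIndex {
    /** Writes the text through a gzip stream and returns the compressed bytes. */
    static byte[] compress(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(new GZIPOutputStream(bytes), StandardCharsets.UTF_8)) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads gzip-compressed bytes back into the original text. */
    static String decompress(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The actual ratio depends on the data; repetitive word lists tend to compress well, but this sketch makes no promise about a specific percentage.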
If you need random access, you're on the wrong track, you will want to carefully design the data layout for the access patterns and storage should be only of tertiary concern.

The number 1024 would take at least 2-4 bytes (so 16-32 bits), as you need to know where the number starts and ends, and so it must have a fixed size. If your positions are very big, like 124058936, you would need to use 4 bytes per number (which would be better than 9 bytes as a string representation).
Using binary files, you'll also need a way to know where the string starts and ends. You can do this by storing a byte with its length before it, and reading the string like this:
int length = in.readUnsignedByte(); // in is a DataInputStream / RandomAccessFile; use length * 2 if the string is encoded in 16 bits
byte[] arr = new byte[length];
in.readFully(arr); // unlike read(arr), readFully does not return until the array is full
String yourString = new String(arr, "US-ASCII");
The other possibility would be terminating your string with a null character (00), but you would need to create your own implementation for that, as no readers support it by default (AFAIK).
Now, is it really worth storing it as binary data? That really depends on how big your positions are (because the strings, if in the text version are separated from their position with a space, would take the same amount of bytes).
My recommendation is that you use the text version, as it will probably be easier to parse and more readable.
About using one or two files, it doesn't really matter. You can combine text and binary in the same file, and it would take the same space (though making it in two separated files will always take a bit more space, and it might make it more messy to edit).
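To illustrate that text and binary data can share one file, here is a minimal sketch using `DataOutputStream.writeUTF` (which writes a 2-byte length prefix followed by the encoded string) and `writeInt` for the position. It uses an in-memory byte array in place of a real file, and `WordPositionFile` is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WordPositionFile {
    /** Writes each record as a length-prefixed word followed by a 4-byte position. */
    static byte[] write(String[] words, int[] positions) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (int i = 0; i < words.length; i++) {
                out.writeUTF(words[i]);
                out.writeInt(positions[i]);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Scans the records sequentially; returns -1 if the word is not found. */
    static int positionOf(byte[] file, String word) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
            while (in.available() > 0) {
                String w = in.readUTF();
                int pos = in.readInt();
                if (w.equals(word)) return pos;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return -1;
    }
}
```

Each record costs 2 bytes of length prefix, the encoded word, and 4 bytes for the position, regardless of how many digits the position has.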

How do I combine two AudioInputStream?

The file format is "PCM_SIGNED 44100.0 Hz, 16 bit, stereo, 4 bytes/frame, little-endian", and I want to add the two files together while amplifying one of them. I plan to read the two WAV files into two AudioInputStream instances, copy their contents into two byte[] arrays, manipulate the arrays, and return the result as another AudioInputStream instance.
I have done a lot of research but have found no good results.
I know there is a class from www.jsresources.org for mixing two AudioInputStreams, but it doesn't allow me to modify either of the two streams before mixing, and I want to attenuate one of the streams before mixing them. What do you think I should do?

To do this, you can convert the streams to PCM data, multiply the channel whose volume you wish to change by the desired factor, add the PCM data from the results together, then convert back to bytes.
To access the AudioStreams on a per-byte basis, check out the first extended code fragment at the Java Tutorials section on Using Files and Format Converters. This shows how to get an array of sound byte data. There is a comment that reads:
// Here, do something useful with the audio data that's
// now in the audioBytes array...
At this point, iterate through the bytes, converting to PCM. A set of commands based on the following should work:
for (int i = 0; i < numBytes; i += 2)
{
// little-endian: low byte first, high byte second
pcmA[i/2] = ( audioBytesA[i] & 0xff ) | ( audioBytesA[i + 1] << 8 );
pcmB[i/2] = ( audioBytesB[i] & 0xff ) | ( audioBytesB[i + 1] << 8 );
}
In the above, audioBytesA and audioBytesB are the byte arrays read from the two input streams (names based on the code from the example), and pcmA and pcmB could be either int arrays or short arrays, holding values that fit within the range of a short. It might be best to make the pcm arrays floats, since you will be doing some math that will result in fractions. Using floats as in the example below only adds one place worth of accuracy (better rounding than when using int), and int would perform faster. I think using floats is more often done if the audio data gets normalized for use with additional processing.
From there, the best way to change volume is to multiply every PCM value by the same amount. For example, to increase volume by 25%,
pcmA[i] = pcmA[i] * 1.25f;
Then, add pcmA and pcmB, and convert back to bytes. You might also want to put in min or max functions to ensure that the volume & merging do not exceed values that can fit in the format's 16 bits.
I use the following to convert back to bytes:
for (int i = 0; i < numBytes / 2; i++) // one iteration per sample, two bytes each
{
int sample = (int) pcmCombined[i];
outBuffer[i*2] = (byte) sample; // low byte
outBuffer[(i*2) + 1] = (byte) (sample >> 8); // high byte
}
Above assumes pcmCombined[] is a float array. The conversion code can be a bit simpler if it is a short[] or int[] array.
I cut and pasted the above from dev work I did for programs posted at my website, and edited it for your scenario, so if there is a typo or bug crept in, please let me know in the comments and I will fix it.
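Putting the pieces together, a minimal end-to-end sketch might look like the following. It assumes both byte arrays hold 16-bit signed little-endian PCM (as in the question's format), scales the first stream by a gain factor, adds the second, clamps to the 16-bit range, and wraps the result in a new `AudioInputStream`. The names `GainMixer`, `mix` and `toStream` are hypothetical and are not part of the Java Sound API.

```java
import java.io.ByteArrayInputStream;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;

class GainMixer {
    /** Mixes two 16-bit little-endian PCM buffers, scaling the first by gainA. */
    static byte[] mix(byte[] a, byte[] b, float gainA) {
        int n = Math.min(a.length, b.length) & ~1; // only whole 2-byte samples
        byte[] out = new byte[n];
        for (int i = 0; i < n; i += 2) {
            int sa = (a[i] & 0xff) | (a[i + 1] << 8);
            int sb = (b[i] & 0xff) | (b[i + 1] << 8);
            int mixed = Math.round(sa * gainA) + sb;
            // Keep the result inside the range of a 16-bit sample.
            mixed = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, mixed));
            out[i] = (byte) mixed;
            out[i + 1] = (byte) (mixed >> 8);
        }
        return out;
    }

    /** Wraps raw PCM bytes in an AudioInputStream; the length is given in frames. */
    static AudioInputStream toStream(byte[] pcm, AudioFormat format) {
        long frames = pcm.length / format.getFrameSize();
        return new AudioInputStream(new ByteArrayInputStream(pcm), format, frames);
    }
}
```

For the format in the question, the `AudioFormat` would be `new AudioFormat(44100f, 16, 2, true, false)`, and a gain below 1.0 (for example `0.5f`) attenuates the first stream before mixing.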

Difference between storing images in byte array and binary (BLOB) and which one is faster

I want to insert and select images from SQL Server using JDBC. I am confused about whether BLOB and byte array are the same thing or different. I have used Blob in my code, and the application loads slowly because it has to select the images stored as Blob and convert them pixel by pixel. I want to use a byte array, but I don't know whether they are the same or different. My main aim is to load the image faster.
Thank you

Before going further, it helps to recall the basic concepts of bit, byte, binary and BLOB.
Bit: Abbreviation of binary digit. It is the smallest storage unit. A bit can take a value of 0 or 1.
Byte: The second smallest storage unit in common use (the nibble is not mentioned, since it is not a very common term). It consists of eight bits.
Binary: A numbering scheme in which each digit of a number can take a value of 0 or 1.
BLOB: A set of binary data stored in a database. Also, the type of a column which stores binary data.
To sum up the definitions: binary format is a scheme made up of bits.
To make it more concrete, we can observe results with the code below.
public class TestByteAndBinary {
    public static void main(String[] args) {
        String s = "test"; // a string, i.e. a series of chars
        System.out.println(s);
        System.out.println();
        // these ASCII chars encode to 1 byte each, so the array has 4 elements
        byte[] bytes = s.getBytes();
        for (byte b : bytes) {
            System.out.println(b); // decimal value of each byte
        }
        System.out.println();
        for (byte b : bytes) {
            // each element is printed in its 8-digit binary format
            String c = String.format("%8s", Integer.toBinaryString(b)).replace(' ', '0');
            System.out.println(c);
        }
    }
}
Output:
$javac TestByteAndBinary.java
$java -Xmx128M -Xms16M TestByteAndBinary
test
116
101
115
116
01110100
01100101
01110011
01110100
Let's go back to the question:
If you really want to store an image inside a database, you have to use the BLOB type.
BUT! It is not the best practice.
Because databases are designed to store data and filesystems are designed to store files.
Reading an image from disk is a simple thing. But reading an image from the database needs more time (querying the data, transforming it to an array and vice versa).
While an image is being read, the database will suffer from lower performance, since it is not a simple textual or numerical read.
An image file doesn't benefit from the characteristic features of a database (like indexing).
At this point, it is best practice to store the image on a server and store its path in the database.
As far as I can see in enterprise-level projects, images are very rarely stored inside the database. When they were, it was because those images needed to be stored encrypted, since they contained very sensitive data. In my humble opinion, even in that situation, the data did not have to be stored in a database.

BLOB simply means Binary Large Object, and it's the way a database stores a byte array.
Hope this is simple and answers your question.
