I am developing a speech dictation app on Android that sends the recorded audio file through email. It is difficult to send large WAV files, so I am thinking about converting them to a format that can be emailed more easily.
After googling I found that .dss files are much smaller and can be sent easily, but I don't know how to convert WAV files into DSS format. Any answers would be very helpful.
I would recommend using Speex, since it is a free audio codec.
There is also a free Java library, which should be easy to use on Android:
http://jspeex.sourceforge.net/
Also, there is the JSpeex SVN repo, which should get you started. It has some code examples for a player and a recorder:
http://jspeex.svn.sourceforge.net/viewvc/jspeex/main/trunk/player/src/main/java/org/xiph/speex/player/
As well as the Javadoc: http://jspeex.sourceforge.net/doc/index.html
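A rough sketch of what the encoding side could look like with JSpeex (the SpeexEncoder calls mirror the usage shown further down in this thread; the mode/quality values, the buffer sizing and the assumption that getProcessedData returns the number of bytes written are mine, so double-check them against the Javadoc):

import org.xiph.speex.SpeexEncoder;

public class SpeexEncodeSketch {
    // Encode 16-bit mono PCM recorded at 8000 Hz into raw Speex data.
    public static byte[] encodePcm(byte[] pcm) {
        SpeexEncoder enc = new SpeexEncoder();
        // mode 0 = narrowband, quality 8, 8000 Hz, 1 channel
        enc.init(0, 8, 8000, 1);
        enc.processData(pcm, 0, pcm.length);

        byte[] buffer = new byte[pcm.length];        // generous upper bound
        int size = enc.getProcessedData(buffer, 0);  // bytes actually produced
        byte[] encoded = new byte[size];
        System.arraycopy(buffer, 0, encoded, 0, size);
        return encoded;
    }
}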
I set up the NDK and used Speex in my project. I am able to encode the WAV file successfully, but when I try to decode it back, the output file keeps growing to a very large size. I recorded the audio at a sample rate of 8000 Hz, 16-bit, mono.
The code for decoding is given below:
#include <jni.h>
#include <stdio.h>
#include "speex/speex.h"
#define FRAME_SIZE 160
void Java_com_m2_iSmartDm_ISmartDMActivity_spxDec(JNIEnv *env, jobject jobj,
                                                  jstring dir1, jstring dir2)
{
    const char *inFile = (*env)->GetStringUTFChars(env, dir1, 0);
    const char *outFile = (*env)->GetStringUTFChars(env, dir2, 0);
    FILE *fin;
    FILE *fout;
    /* Holds the audio that will be written to file (16 bits per sample) */
    short out[FRAME_SIZE];
    /* Speex handles samples as float, so we need an array of floats */
    float output[FRAME_SIZE];
    char cbits[200];
    int nbBytes;
    /* Holds the state of the decoder */
    void *state;
    /* Holds bits so they can be read and written to by the Speex routines */
    SpeexBits bits;
    int i, tmp;

    /* Create a new decoder state in narrowband mode */
    state = speex_decoder_init(&speex_nb_mode);
    /* Set the perceptual enhancement on */
    tmp = 1;
    speex_decoder_ctl(state, SPEEX_SET_ENH, &tmp);

    fin = fopen(inFile, "r");
    fout = fopen(outFile, "w");

    speex_bits_init(&bits);
    while (1)
    {
        /* Read the size encoded by sampleenc; this part will likely be
           different in your application */
        fread(&nbBytes, sizeof(int), 1, fin);
        if (feof(stdin))
            break;
        /* Read the "packet" encoded by sampleenc */
        fread(cbits, 1, nbBytes, fin);
        /* Copy the data into the bit-stream struct */
        speex_bits_read_from(&bits, cbits, nbBytes);
        /* Decode the data */
        speex_decode(state, &bits, output);
        /* Copy from float to short (16 bits) for output */
        for (i = 0; i < FRAME_SIZE; i++)
            out[i] = output[i];
        /* Write the decoded audio to file */
        fwrite(out, sizeof(short), FRAME_SIZE, fout);
    }
    /* Destroy the decoder state */
    speex_decoder_destroy(state);
    /* Destroy the bit-stream struct */
    speex_bits_destroy(&bits);
    fclose(fout);
    fclose(fin);
}
Is there anything wrong in my code? Why does it produce such a large file?
I'm trying to implement a BLE OTA update system with an ESP32.
I need to read a binary file into a byte array and pass it over BLE to my ESP32 characteristic.
The ESP32 works fine and accepts an update from a simple Python 3 script; the basics are below:
# compute the packet size
packet_size = 512

# split the firmware into packets
with open(file_path, "rb") as file:
    while chunk := file.read(packet_size):
        firmware.append(chunk)

# sequentially write all packets to OTA data
for i, pkg in enumerate(firmware):
    print(f"Sending packet {i}/{len(firmware)-1}")
    await client.write_gatt_char(
        OTA_DATA_UUID,
        pkg,
        response=True
    )
Trying the same in Android Java produces an array of unsigned ints, which does not seem to be compatible with the type of bytes the ESP32 is expecting.
Here's a cut-down example of what I'm trying:
String path = ContentUriUtils.INSTANCE.getFilePath(firmwareSelectActivity.this, chosenFile);
File file = new File(path);
byte[] fileByteArray = readFile(file);
byte[][] test = splitArray(fileByteArray, 512);
for (int i = 0; i < test.length; i++) {
    mBluetoothLeService.addBleOtaWriteByteCommandToQueue(
            BluetoothLeService.getOtaTXcharacteristic(), test[i]);
}
The ESP32 OTA library is expecting the first byte to be 0xE9 to start the upgrade process, but I seem to be sending 0x13.
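For comparison, here is a minimal Java sketch of the same read-and-chunk step, using only java.nio; the 0xE9 check is just a sanity test based on what the ESP32 library expects, and the class/method names are hypothetical:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FirmwareChunker {
    private static final int PACKET_SIZE = 512;

    // Read the whole firmware image and split it into 512-byte packets,
    // mirroring what the Python script does with file.read(packet_size).
    public static List<byte[]> chunkFirmware(String filePath) throws IOException {
        byte[] firmware = Files.readAllBytes(Paths.get(filePath));
        if (firmware.length > 0 && (firmware[0] & 0xFF) != 0xE9) {
            // The image is already wrong before BLE is involved.
            throw new IOException("Unexpected first byte: 0x"
                    + Integer.toHexString(firmware[0] & 0xFF));
        }
        List<byte[]> packets = new ArrayList<>();
        for (int offset = 0; offset < firmware.length; offset += PACKET_SIZE) {
            int end = Math.min(offset + PACKET_SIZE, firmware.length);
            packets.add(Arrays.copyOfRange(firmware, offset, end));
        }
        return packets;
    }
}

Each packet could then be queued with the helper from the question, e.g. mBluetoothLeService.addBleOtaWriteByteCommandToQueue(BluetoothLeService.getOtaTXcharacteristic(), packet). If the first byte of packets.get(0) is already not 0xE9 at this point, the problem is in how the file is read (e.g. the content URI resolution), not in the BLE write itself.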
I'm trying to use Xuggler (which I believe uses ffmpeg under the hood) to do the following:
Accept a raw MJPEG video bitstream (from a small TTL serial camera) and encode/transcode it to H.264; and
Accept a raw audio bitstream (from a microphone) and encode it to AAC; then
Mux the two (audio and video) bitstreams together into an MPEG-TS container.
I've watched/read some of their excellent tutorials, and so far here's what I've got:
// I'll worry about implementing this functionality later, but
// involves querying native device drivers.
byte[] nextMjpeg = getNextMjpegFromSerialPort();
// I'll also worry about implementing this functionality as well;
// I'm simply providing these for thoroughness.
BufferedImage mjpeg = MjpegFactory.newMjpeg(nextMjpeg);
// Specify a h.264 video stream (how?)
String h264Stream = "???";
IMediaWriter writer = ToolFactory.makeWriter(h264Stream);
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264);
writer.encodeVideo(0, mjpeg);
For one, I think I'm close here, but it's still not correct; and I've only gotten this far by reading the video code examples (not the audio - I can't find any good audio examples).
Literally, I'll be getting byte-level access to the raw video and audio feeds coming into my Xuggler implementation. But for the life of me I can't figure out how to get them into an h.264/AAC/MPEG-TS format. Thanks in advance for any help here.
Looking at this Xuggler sample code, the following should work to encode video as H.264 and mux it into an MPEG2 TS container:
IMediaWriter writer = ToolFactory.makeWriter("output.ts");
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264, width, height);
for (...)
{
    BufferedImage mjpeg = ...;
    writer.encodeVideo(0, mjpeg);
}
The container type is guessed from the file extension, the codec is specified explicitly.
To mux audio and video, you would do something like this:
writer.addVideoStream(videoStreamIndex, 0, videoCodec, width, height);
writer.addAudioStream(audioStreamIndex, 0, audioCodec, channelCount, sampleRate);
while (... have more data ...)
{
    BufferedImage videoFrame = ...;
    long videoFrameTime = ...; // this is the time to display this frame
    writer.encodeVideo(videoStreamIndex, videoFrame, videoFrameTime, DEFAULT_TIME_UNIT);

    short[] audioSamples = ...; // the size of this array should be number of samples * channelCount
    long audioSamplesTime = ...; // this is the time to play back this bit of audio
    writer.encodeAudio(audioStreamIndex, audioSamples, audioSamplesTime, DEFAULT_TIME_UNIT);
}
In this case I believe your code is responsible for interleaving the audio and video: on each pass through the loop you want to call either encodeAudio() or encodeVideo(), depending on which available data (a chunk of audio samples or a video frame) has the earlier timestamp.
There is another, lower-level API you may end up using, based on IStreamCoder, which gives more control over various parameters. I don't think you will need to use that.
To answer the specific questions you asked:
(1) "Encode a BufferedImage (M/JPEG) into a h.264 stream" - you already figured that out, writer.addVideoStream(..., ICodec.ID.CODEC_ID_H264) makes sure you get the H.264 codec. To get a transport stream (MPEG2 TS) container, simply call makeWriter() with a filename with a .ts extension.
(2) "Figure out what the "BufferedImage-equivalent" for a raw audio feed is" - that is either a short[] or an IAudioSamples object (both seem to work, but IAudioSamples has to be constructed from an IBuffer which is much less straightforward).
(3) "Encode this audio class into an AAC audio stream" - call writer.addAudioStream(..., ICodec.ID.CODEC_ID_AAC, channelCount, sampleRate)
(4) "multiplex both stream into the same MPEG-TS container" - call makeWriter() with a .ts filename, which sets the container type. For correct audio/video sync you probably need to call encodeVideo()/encodeAudio() in the correct order.
P.S. Always pass the earliest audio/video available first. For example, if you have audio chunks which are 440 samples long (at 44000 Hz sample rate, 440 / 44000 = 0.01 seconds), and video at exactly 25fps (1 / 25 = 0.04 seconds), you would give them to the writer in this order:
video0 # 0.00 sec
audio0 # 0.00 sec
audio1 # 0.01 sec
audio2 # 0.02 sec
audio3 # 0.03 sec
video1 # 0.04 sec
audio4 # 0.04 sec
audio5 # 0.05 sec
... and so forth
Most playback devices are probably ok with the stream as long as the consecutive audio/video timestamps are relatively close, but this is what you'd do for a perfect mux.
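To make that interleaving rule concrete, here is a rough sketch of such a loop. The haveVideo()/haveAudio(), nextVideoFrame()/nextAudioChunk(), nextVideoTime()/nextAudioTime() and advanceVideo()/advanceAudio() helpers are hypothetical placeholders for however you pull data and timestamps from the camera and microphone:

// Always encode whichever pending piece of media has the earlier timestamp.
// The helper methods below are placeholders, not part of the Xuggler API.
while (haveVideo() || haveAudio()) {
    boolean videoFirst = haveVideo()
            && (!haveAudio() || nextVideoTime() <= nextAudioTime());
    if (videoFirst) {
        writer.encodeVideo(videoStreamIndex, nextVideoFrame(),
                nextVideoTime(), DEFAULT_TIME_UNIT);
        advanceVideo();   // consume the frame we just encoded
    } else {
        writer.encodeAudio(audioStreamIndex, nextAudioChunk(),
                nextAudioTime(), DEFAULT_TIME_UNIT);
        advanceAudio();   // consume the audio chunk we just encoded
    }
}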
P.P.S. There are a few docs you may want to refer to: the Xuggler class diagram, ToolFactory, IMediaWriter, ICodec.
I think you should look at GStreamer: http://gstreamer.freedesktop.org/ You would have to look for a plugin that can capture the camera input, pipe it to the libx264 and AAC plugins, and then pass them through an MPEG-TS muxer.
A pipeline in gstreamer would look like:
v4l2src queue-size=15 ! video/x-raw,framerate=25/1,width=384,height=576 ! \
avenc_mpeg4 name=venc \
alsasrc ! audio/x-raw,rate=48000,channels=1 ! audioconvert ! lamemp3enc name=aenc \
avimux name=mux ! filesink location=rec.avi venc. ! mux. aenc. ! mux.
In this pipeline, MPEG-4 and MP3 encoders are used and the stream is muxed into an AVI file. You should be able to find plugins for libx264 and AAC. Let me know if you need further pointers.
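As an untested sketch of the H.264/AAC/MPEG-TS variant of the same pipeline (assuming the x264enc, voaacenc and mpegtsmux plugins are installed; element names and caps may need tweaking for your GStreamer version):

v4l2src ! video/x-raw,framerate=25/1,width=384,height=576 ! videoconvert ! \
    x264enc name=venc \
alsasrc ! audio/x-raw,rate=48000,channels=1 ! audioconvert ! voaacenc name=aenc \
mpegtsmux name=mux ! filesink location=rec.ts venc. ! mux. aenc. ! mux.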
I'm using the JSpeex library for audio encoding.
The encoding seems to work fine, but decoding doesn't (i.e. I get all zeros as decoded data).
// Encoding
SpeexEncoder enc = new SpeexEncoder();
// if I use 1 channel instead of 2, even encoding doesn't work
enc.init(mode, quality, 44100, 2);
enc.processData(b, 0, b.length); // b is the byte array I'm trying to encode & then decode
enc.getProcessedData(temp, 0);   // save encoded data to temp (temp is a byte array)

// Decoding
SpeexDecoder dec = new SpeexDecoder();
dec.init(mode, 44100, 2, true);
dec.processData(temp, 0, temp.length);
dec.getProcessedData(decoded, 0); // decoded is the output byte array, which comes out all zeros
If anyone has any info on this, please reply.
Thanks
I realize this post is a bit old, but I ran into a similar problem with speex.js (a JavaScript port).
Not sure if the issue is the same for you, but I found that there was an implicit conversion from Float32Array to Int16Array that didn't actually rescale the data. This meant that all of the (-1.0, 1.0) float data was truncated to integer zeros and encoded as such.
I just needed to do the conversion to Int16Array before passing in the data (so the library wouldn't need to do any data conversion internally), and the output sprang to life :)
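The same fix expressed in Java terms, since that's what the original question uses (a minimal sketch; 32767 maps the normalized (-1.0, 1.0) range onto 16-bit signed PCM):

// Convert normalized float samples (-1.0 .. 1.0) into 16-bit signed PCM.
// A plain cast would truncate everything to 0 or +/-1, which is exactly
// the "all zeros" symptom described above.
static short[] floatToPcm16(float[] samples) {
    short[] pcm = new short[samples.length];
    for (int i = 0; i < samples.length; i++) {
        float clamped = Math.max(-1.0f, Math.min(1.0f, samples[i]));
        pcm[i] = (short) Math.round(clamped * 32767f);
    }
    return pcm;
}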
Hope that helps. cheers!
I have to send an image over a socket, so I (the client) convert the image to a byte array and send it as such.
The server, being the applet, receives the byte array. Fine till here.
I must use Java 1.3, so no ImageIO or related image-reading classes.
And since we have an applet, no FileOutputStream.
Any other ways?
Toolkit.getDefaultToolkit().createImage(byte[] imagedata,
int imageoffset,
int imagelength)
It returns an Image and has been available since JDK 1.1, according to the Javadoc.
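A minimal usage sketch, assuming the byte array has already been read from the socket (the MediaTracker wait is optional, but it ensures the image is fully decoded before painting; class and method names here are illustrative):

import java.applet.Applet;
import java.awt.Graphics;
import java.awt.Image;
import java.awt.MediaTracker;
import java.awt.Toolkit;

public class ImageApplet extends Applet {
    private Image image;

    // Call this once the full image byte array has arrived over the socket.
    void setImageBytes(byte[] imageData) {
        image = Toolkit.getDefaultToolkit().createImage(imageData, 0, imageData.length);
        MediaTracker tracker = new MediaTracker(this);
        tracker.addImage(image, 0);
        try {
            tracker.waitForID(0);   // block until the image is decoded
        } catch (InterruptedException ignored) {
        }
        repaint();
    }

    public void paint(Graphics g) {
        if (image != null) {
            g.drawImage(image, 0, 0, this);
        }
    }
}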
I have a text file, and it can be ANSI (with the ISO-8859-2 charset), UTF-8, or UCS-2 big- or little-endian.
Is there any way to detect the encoding of the file so I can read it properly?
Or is it possible to read a file without giving the encoding (so it reads the file as it is)?
(There are several programs that can detect and convert the encoding/format of text files.)
Yes, there are a number of methods for character encoding detection, specifically in Java. Take a look at jchardet, which is based on the Mozilla algorithm. There are also cpdetector and a project by IBM called ICU4j. I'd take a look at the last one, as it seems to be more reliable than the other two. They work based on statistical analysis of the binary file; ICU4j will also provide a confidence level for the character encoding it detects, so you can use this in the case above. It works pretty well.
UTF-8 and UCS-2/UTF-16 can be distinguished reasonably easily via a byte order mark at the start of the file. If this exists then it's a pretty good bet that the file is in that encoding - but it's not a dead certainty. You may well also find that the file is in one of those encodings, but doesn't have a byte order mark.
I don't know much about ISO-8859-2, but I wouldn't be surprised if almost every file is a valid text file in that encoding. The best you'll be able to do is check it heuristically. Indeed, the Wikipedia page talking about it would suggest that only byte 0x7f is invalid.
There's no way of reading a file "as it is" and yet getting text out: a file is a sequence of bytes, so you have to apply a character encoding in order to decode those bytes into characters.
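For illustration, a minimal sketch of a BOM check along those lines (it only covers UTF-8 and the two UTF-16/UCS-2 byte orders mentioned in the question, and falls back to a caller-supplied default when no BOM is present):

import java.io.IOException;
import java.io.PushbackInputStream;

public class BomSniffer {
    // UTF-8 BOM: EF BB BF; UTF-16BE: FE FF; UTF-16LE: FF FE.
    // The stream must be created with a pushback buffer of at least 3 bytes,
    // e.g. new PushbackInputStream(rawStream, 3).
    public static String detectCharset(PushbackInputStream in, String fallback) throws IOException {
        byte[] bom = new byte[3];
        int n = in.read(bom, 0, 3);
        if (n >= 3 && (bom[0] & 0xFF) == 0xEF && (bom[1] & 0xFF) == 0xBB && (bom[2] & 0xFF) == 0xBF) {
            return "UTF-8";                   // BOM consumed
        }
        if (n >= 2 && (bom[0] & 0xFF) == 0xFE && (bom[1] & 0xFF) == 0xFF) {
            if (n == 3) in.unread(bom[2]);    // give back the extra byte
            return "UTF-16BE";
        }
        if (n >= 2 && (bom[0] & 0xFF) == 0xFF && (bom[1] & 0xFF) == 0xFE) {
            if (n == 3) in.unread(bom[2]);
            return "UTF-16LE";
        }
        if (n > 0) in.unread(bom, 0, n);      // no BOM: push everything back
        return fallback;
    }
}

The fallback value (e.g. "ISO-8859-2" for the files in the question) is where a statistical detector such as ICU4j would come in.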
You can use ICU4J (http://icu-project.org/apiref/icu4j/)
Here is my code:
String charset = "ISO-8859-1"; // default charset, put whatever you want
byte[] fileContent = null;
FileInputStream fin = null;

// create FileInputStream object
fin = new FileInputStream(file.getPath());

/*
 * Create a byte array large enough to hold the content of the file.
 * Use File.length to determine the size of the file in bytes.
 */
fileContent = new byte[(int) file.length()];

/*
 * To read the content of the file into the byte array, use the
 * int read(byte[] byteArray) method of the java FileInputStream class.
 */
fin.read(fileContent);

byte[] data = fileContent;

CharsetDetector detector = new CharsetDetector();
detector.setText(data);
CharsetMatch cm = detector.detect();

if (cm != null) {
    int confidence = cm.getConfidence();
    System.out.println("Encoding: " + cm.getName() + " - Confidence: " + confidence + "%");
    // Here you have the encoding name and the confidence.
    // In my case, if the confidence is > 50 I return the encoding, else I return the default value.
    if (confidence > 50) {
        charset = cm.getName();
    }
}
Remember to add all the try/catch handling it needs.
I hope this works for you.
If your text file is a properly created Unicode text file, then the byte order mark (BOM) should tell you all the information you need. See here for more details about BOMs.
If it's not, then you'll have to use an encoding-detection library.