What do MediaCodec, MediaExtractor and MediaMuxer mean in Android? I am not a video person, but I do know what encoding and decoding mean, at a basic level. I need to know what the function of each class is and in which use cases they are used. I would also like to know:
If I want to extract frames from a camera preview and create a video file along with some editing (like speed), which classes should I use and how do they work together?
If I want to create a video player like ExoPlayer (not all of its functions, just a simple DASH adaptive streaming player), which classes should I use and how do they work together?
Hope you will answer. Thank You.
Let me start off by saying that it is hard to understand these APIs if you don't understand how video encoding/decoding works. I would suggest doing some research on how encoders/decoders work first.
I will provide an oversimplified explanation of each.
MediaCodec:
MediaCodec class can be used to access low-level media codecs, i.e. encoder/decoder components. It is part of the Android low-level multimedia support infrastructure
So MediaCodec handles the decoding or encoding of the video packets/buffers and is responsible for the interaction with the codec.
Here is an example of how to Initialize MediaCodec:
// Create a MediaCodec instance by passing a MIME type. It will select the best codec for this MIME type
MediaCodec mDecoder = MediaCodec.createDecoderByType(mimeType);
// Pass an instance of MediaFormat and the output/rendering Surface
mDecoder.configure(format, surface, null, 0);
mDecoder.start();
You would then start passing buffers to MediaCodec, like this:
ByteBuffer[] inputBuffers = mDecoder.getInputBuffers();
int index = mDecoder.dequeueInputBuffer(timeout);
// Check if buffers are available
if (index >= 0) {
// Get dequeued buffer
ByteBuffer buffer = inputBuffers[index];
// Get sample data size to determine if we should keep queuing more buffers or signal end of stream
int sampleSize = mExtractor.readSampleData(buffer, 0);
if (sampleSize < 0) {
// Signal EOS; this happens when you reach the end of the video, for example
mDecoder.queueInputBuffer(index, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
} else {
// Queue the dequeued buffer and pass the extractor's sample time
mDecoder.queueInputBuffer(index, 0, sampleSize, mExtractor.getSampleTime(), 0);
mExtractor.advance();
}
}
You then dequeue the output buffer and release it to your surface:
BufferInfo frameInfo = new BufferInfo();
int index = mDecoder.dequeueOutputBuffer(frameInfo, timeout);
switch (index) {
    case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
        break;
    case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
        MediaFormat newFormat = mDecoder.getOutputFormat();
        break;
    case MediaCodec.INFO_TRY_AGAIN_LATER:
        break;
    default:
        break;
}
// You can now push the frames to the surface
// This is where you can control the playback speed, e.g. by letting your thread sleep momentarily
if (index >= 0) {
    mDecoder.releaseOutputBuffer(index, true);
}
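As a rough illustration of the speed control mentioned in the comment above, you could pace how quickly frames are released against each buffer's presentation time. This is only a sketch; mStartMs (wall-clock time at playback start) and speedFactor are assumed fields that are not part of the original code:
// Sketch only: before releasing the buffer to the Surface, sleep until the
// frame's (scaled) presentation time. mStartMs and speedFactor are assumed fields.
if (index >= 0) {
    long presentationMs = frameInfo.presentationTimeUs / 1000;
    long renderAtMs = mStartMs + (long) (presentationMs / speedFactor);
    long delayMs = renderAtMs - System.currentTimeMillis();
    if (delayMs > 0) {
        try {
            Thread.sleep(delayMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
    mDecoder.releaseOutputBuffer(index, true); // true = render to the Surface
}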
MediaExtractor:
MediaExtractor facilitates extraction of demuxed, typically encoded, media data from a data source.
The documentation description is self-explanatory.
Have a look below, I've added comments to make it more understandable:
// Initialize the extractor
MediaExtractor mExtractor = new MediaExtractor();
mExtractor.setDataSource(mSource);
// Select/set the video track (if available)
int trackIndex = selectVideoTrack(mExtractor);
if (trackIndex < 0)
    throw new IOException("Can't find a video track");
mExtractor.selectTrack(trackIndex);
// The extractor is now ready to be used
// Get the track format
mFormat = mExtractor.getTrackFormat(trackIndex);
// Read the current sample into a buffer and get its size
// MediaCodec uses this value to decide whether to keep queuing buffers or signal end of stream
int sampleSize = mExtractor.readSampleData(buffer, 0);
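The selectVideoTrack(...) helper above is not a framework method; a minimal sketch of it, assuming you simply want the first track whose MIME type starts with "video/", could look like this:
// Sketch of the selectVideoTrack() helper used above (not a framework method):
// returns the index of the first video track, or -1 if none is found.
private static int selectVideoTrack(MediaExtractor extractor) {
    for (int i = 0; i < extractor.getTrackCount(); i++) {
        MediaFormat format = extractor.getTrackFormat(i);
        String mime = format.getString(MediaFormat.KEY_MIME);
        if (mime != null && mime.startsWith("video/")) {
            return i;
        }
    }
    return -1;
}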
MediaMuxer:
MediaMuxer facilitates muxing elementary streams. Currently MediaMuxer supports MP4, Webm and 3GP file as the output. It also supports muxing B-frames in MP4 since Android Nougat.
This is self-explanatory once again. It's used to create a video/audio file; for example, muxing an encoded video track and an audio track into an MP4, or merging the tracks of two files into one.
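Here is a rough, hedged sketch of the typical flow, re-muxing the video track selected by the MediaExtractor above into a new MP4. outputPath is an assumed variable, and the buffer size and flag handling are simplified:
// Rough sketch: copy one track from an extractor into an MP4 using MediaMuxer.
MediaMuxer muxer = new MediaMuxer(outputPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
int outTrack = muxer.addTrack(mExtractor.getTrackFormat(trackIndex));
muxer.start();

ByteBuffer buffer = ByteBuffer.allocate(1024 * 1024);
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
int size;
while ((size = mExtractor.readSampleData(buffer, 0)) >= 0) {
    info.offset = 0;
    info.size = size;
    info.presentationTimeUs = mExtractor.getSampleTime();
    info.flags = mExtractor.getSampleFlags();
    muxer.writeSampleData(outTrack, buffer, info);
    mExtractor.advance();
}
muxer.stop();
muxer.release();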
Related
I'm likely dense, but I cannot seem to find a solution to my issue.
(NOTE: I CAN find lots of people reporting this issue; it seems to have appeared with a newer Java release (possibly 1.5?). Perhaps SAMPLE_RATE is no longer supported? I am unable to find any solution.)
I'm trying to adjust the SAMPLE_RATE to speed up/slow down the song. I can successfully play a .wav file without issue, so I looked into FloatControl, which worked for adjusting volume:
public void adjustVolume(String audioType, float gain) {
FloatControl gainControl = null;
gainControl = (FloatControl) clipSFX.getControl(FloatControl.Type.MASTER_GAIN);
if(gain > MAX_VOLUME)
gain = MAX_VOLUME;
if(gain < MIN_VOLUME)
gain = MIN_VOLUME;
//set volume
gainControl.setValue(gain);
}
But when trying to translate this principle to SAMPLE_RATE, I get an error very early on at this stage:
public void adjustVolume(String audioType, float gain) {
FloatControl gainControl = null;
gainControl = (FloatControl) clipSFX.getControl(FloatControl.Type.SAMPLE_RATE);
//ERROR: Exception in thread "Thread-3" java.lang.IllegalArgumentException: Unsupported control type: Sample Rate
//I haven't gotten this far yet since the above breaks, but in theory will then set value?
gainControl.setValue(gain);
}
Everything I've found online seems to be related to taking input from a mic or some external line and doesn't seem to translate to using an audio file, so I'm unsure what I'm missing. Any help would be appreciated! Thanks!
Here we have a method that changes the speed - by doubling the sample rate. Basically the steps are as follows:
open the audio stream of the file
get the format
create a new format with the sample rate changed
open a data line with that format
read from the file/audio stream and play onto the line
The concepts here are SourceDataLine, AudioFormat and AudioInputStream. If you look at the javax.sound tutorial you will find them, or look at the pages of those classes. You can then create your own method (like adjust(factor)) that just builds the new format while everything else stays the same; a sketch of this follows the play() method below.
public void play() {
    try {
        File fileIn = new File("...");
        AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(fileIn);
        AudioFormat formatIn = audioInputStream.getFormat();
        // Same format as the file, but with the sample rate doubled
        AudioFormat format = new AudioFormat(formatIn.getSampleRate() * 2,
                formatIn.getSampleSizeInBits(), formatIn.getChannels(), true, formatIn.isBigEndian());
        System.out.println(formatIn.toString());
        System.out.println(format.toString());
        byte[] data = new byte[1024];
        DataLine.Info dinfo = new DataLine.Info(SourceDataLine.class, format);
        SourceDataLine line = (SourceDataLine) AudioSystem.getLine(dinfo);
        if (line != null) {
            line.open(format);
            line.start();
            while (true) {
                int k = audioInputStream.read(data, 0, data.length);
                if (k < 0) break;
                line.write(data, 0, k);
            }
            line.stop();
            line.close();
        }
    }
    catch (Exception ex) { ex.printStackTrace(); }
}
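As a sketch of the adjust(factor) idea mentioned above (this helper is my own illustration, not code from the original answer):
// Sketch: build a playback format whose sample rate is scaled by `factor`.
// A factor of 2.0 doubles the speed, 0.5 halves it.
private static AudioFormat adjust(AudioFormat formatIn, double factor) {
    return new AudioFormat((float) (formatIn.getSampleRate() * factor),
            formatIn.getSampleSizeInBits(), formatIn.getChannels(),
            true, formatIn.isBigEndian());
}
play() would then use adjust(formatIn, factor) in place of the hard-coded formatIn.getSampleRate() * 2 format.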
It is also possible to vary the speed by using linear interpolation when progressing through the audio data.
Audio values are laid out in an array and the cursor normally goes from value to value. But you can set things up to progress an arbitrary amount, for example 1.5 frames, and create a weighted value where needed.
Suppose data is as follows:
0.5
0.8
0.2
-0.1
-0.5
-0.7
Your playback data (for 1.5 rate) would be
0.5
(0.8 + 0.2)/2
-0.1
(-0.5 + -0.7)/2
I know there have been posts that more fully explain this algorithm before on Stack Overflow. Forgive me for not tracking them down.
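Here is a minimal, self-contained sketch of that idea for mono data; the array and the 1.5 rate are simply the example values above, not code from AudioCue:
// Sketch: play back mono samples at an arbitrary rate using linear interpolation.
float[] data = {0.5f, 0.8f, 0.2f, -0.1f, -0.5f, -0.7f};
float rate = 1.5f;
for (float idx = 0; idx < data.length - 1; idx += rate) {
    int i = (int) idx;
    float frac = idx - i;
    // Weighted average of the two surrounding samples
    float value = data[i] * (1 - frac) + data[i + 1] * frac;
    System.out.println(value); // prints 0.5, 0.5, -0.1, -0.6 for the example above
}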
I use this method to allow real-time speed changes in .wav playback in the following open-source library: AudioCue. Feel free to check out the code and make use of the ideas in it.
Following is the method that creates a stereo pair of audio values from a spot that lies in between two audio frames (data is signed floats, ranging from -1 to 1). It's from an inner class AudioCuePlayer in AudioCue.java. Probably not the easiest to read. The sound data being read is in the array cue, and idx is the current "play head" location that is progressing through this array. 'intIndex' is the audio frame, and 'flatIndex' is the actual location of the frame in the array. I use frames to track the playhead's location and calculate the interpolation weights, and then use the flatIndex for getting the corresponding values from the array.
private float[] readFractionalFrame(float[] audioVals, float idx)
{
    // Whole-frame index, and its position in the interleaved stereo array
    final int intIndex = (int) idx;
    final int flatIndex = intIndex * 2;
    // Left channel: weighted blend of the current frame and the next frame
    audioVals[0] = cue[flatIndex + 2] * (idx - intIndex)
            + cue[flatIndex] * ((intIndex + 1) - idx);
    // Right channel: same blend, one position further in the interleaved data
    audioVals[1] = cue[flatIndex + 3] * (idx - intIndex)
            + cue[flatIndex + 1] * ((intIndex + 1) - idx);
    return audioVals;
}
I'd be happy to clarify if there are questions.
I want to get the frame rate of a video, but I don't want to use the FFMPEG or JavaCV libs.
Is it possible to get the frame rate of a video in Android?
I read about KEY_FRAME_RATE; it says, "Specifically, MediaExtractor provides an integer value corresponding to the frame rate information of the track if specified and non-zero."
But I don't know how to use it.
If you know how to get the frame rate from a video, please answer here.
MediaExtractor extractor = new MediaExtractor();
int frameRate = 24; // a sensible default
try {
    // Adjust the data source as required: file, URI, etc.
    extractor.setDataSource(...);
    int numTracks = extractor.getTrackCount();
    for (int i = 0; i < numTracks; ++i) {
        MediaFormat format = extractor.getTrackFormat(i);
        String mime = format.getString(MediaFormat.KEY_MIME);
        if (mime.startsWith("video/")) {
            if (format.containsKey(MediaFormat.KEY_FRAME_RATE)) {
                frameRate = format.getInteger(MediaFormat.KEY_FRAME_RATE);
            }
        }
    }
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // Release the extractor when done
    extractor.release();
}
Note: Run the above code on a worker thread, since setDataSource() may perform I/O.
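For example, a minimal sketch of pushing that work onto a worker thread; extractFrameRate() and showFrameRate() are hypothetical helpers wrapping the code above and the UI update, and runOnUiThread() assumes this runs inside an Activity:
// Sketch: keep the MediaExtractor work off the main thread.
// extractFrameRate() and showFrameRate() are hypothetical helpers; videoPath is an assumed source.
new Thread(() -> {
    final int frameRate = extractFrameRate(videoPath);
    runOnUiThread(() -> showFrameRate(frameRate));
}).start();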
Update 1: What KEY_FRAME_RATE is, and why it may be absent
KEY_FRAME_RATE
Added in API level 16
String KEY_FRAME_RATE
A key describing the frame rate of a video format in frames/sec. The associated value is normally an integer when the value is used by the platform, but video codecs also accept float configuration values. Specifically, MediaExtractor provides an integer value corresponding to the frame rate information of the track if specified and non-zero. Otherwise, this key is not present. MediaCodec accepts both float and integer values. This represents the desired operating frame rate if the KEY_OPERATING_RATE is not present and KEY_PRIORITY is 0 (realtime). For video encoders this value corresponds to the intended frame rate, although encoders are expected to support variable frame rate based on buffer timestamp. This key is not used in the MediaCodec input/output formats, nor by MediaMuxer.
Constant Value: "frame-rate"
Update 2: The code above now checks whether KEY_FRAME_RATE is present before reading it, to avoid a NullPointerException. See above.
I am using JavaCV to capture the video from my web camera using FrameRecorder.
I am working on a library utility class that would provide the webcam video as an 'avi' video InputStream. I am unable to do so because FrameRecorder does not provide any such facility; all it takes is a file name, and it persists the video on the filesystem.
What should I do to generate a java InputStream from FrameRecorder?
Following is the sample code for reference :
FrameGrabber frameGrabber = FrameGrabber.createDefault(1);
frameGrabber.start();
IplImage grabbedImage = frameGrabber.grab();
int width = grabbedImage.width();
int height = grabbedImage.height();
FrameRecorder frameRecorder = new FFmpegFrameRecorder("c:\\output.avi", width, height);
frameRecorder.setAudioChannels(frameGrabber.getAudioChannels());
frameRecorder.start();
int i = 0;
while ((grabbedImage = frameGrabber.grab()) != null && i <= 500) {
frameRecorder.record(grabbedImage);
i++;
}
frameRecorder.stop();
frameGrabber.stop();
I am open to any other alternatives too ...
thanks in advance
Ashish
According to http://code.google.com/p/javacv/source/browse/src/main/java/com/googlecode/javacv/cpp/avformat.java, it seems that the method called for writing the frame data is the following:
/**
* Write a packet to an output media file.
*
* The packet shall contain one audio or video frame.
* The packet must be correctly interleaved according to the container
* specification, if not then av_interleaved_write_frame must be used.
*
 * @param s media file handle
 * @param pkt The packet, which contains the stream_index, buf/buf_size,
 *            dts/pts, ...
 *            This can be NULL (at any time, not just at the end), in
 *            order to immediately flush data buffered within the muxer,
 *            for muxers that buffer up data internally before writing it
 *            to the output.
 * @return < 0 on error, = 0 if OK, 1 if flushed and there is no more data to flush
*/
public static native int av_write_frame(AVFormatContext s, AVPacket pkt);
As it is a native method, your best bet would be to redirect the native library's output somewhere other than a file,
or to use the returned image data to "build" your AVI yourself.
You could use a memory-mapped file, but since in your case you want continuous capture of video, I don't think it would be a good idea.
Why do you want to use a FrameRecorder if your goal is not to create a video file?
The most practical solution I can think of, would be to simply extend InputStream using a FrameGrabber as backend. Since JavaCV doesn't seem to provide that out of the box, then I'm afraid you'd have to do it yourself.
As described in the documentation for InputStream, subclasses of InputStream only have to implement public abstract int read(); however, keep in mind it will most probably be necessary to override other methods too.
A good bet would be to implement public int read(byte[] b).
For reference, a very simple inefficient and unsafe implementation follows.
Warning not tested!
public int read(byte[] data) throws IOException {
    try {
        // Grab the next frame and copy its raw bytes into the caller's buffer
        IplImage grabbedImage = frameGrabber.grab();
        if (grabbedImage == null)
            throw new IOException("No more frames available from the grabber");
        grabbedImage.getByteBuffer().get(data);
        return data.length;
    } catch (FrameGrabber.Exception e) {
        throw new IOException(e);
    }
}
I am currently using the GraphView from the developer jjoe64 on GitHub, and I was wondering how I would get the double I created in my BT connected thread class into the GraphView class. This is the original function that produces random data, but I want the serial data from my Bluetooth class instead.
The current function in this realtime graph is:
private double getRandom() {
double high = 3;
double low = 0.5;
return Math.random() * (high - low) + low;
}
In my Bluetooth class, I have the command ConnectedThread.read(), but it's not really working. Here it is:
public static double read() {
try {
byte[] buffer = new byte[1024];
double bytes = mmInStream.read(buffer);
return bytes;
} catch(IOException e) {
return 5;
}
}
I am not sure if it's just my phone that's too slow; it's running Android 2.3 (Desire HD), but my professor at school said it should work fine if I just call ConnectedThread.read() and assign it to a double. Any advice?
You haven't provided enough information for an out-of-the-box solution, but I'll give it a shot anyway.
First of all, I presume that mmInStream is an InputStream or its subclass. Look at the API of int InputStream.read(byte[] b):
Reads some number of bytes from the input stream and stores them into the buffer array b. The number of bytes actually read is returned as an integer. This method blocks until input data is available, end of file is detected, or an exception is thrown.
This means that what you're returning from your read() method is just the number of bytes that have been written to the buffer from mmInStream. That is probably not what you want to do. What you probably want to do is read just the value from this stream. To do that you should:
wrap your mmInStream in a DataInputStream just after the mmInStream is created:
mmInStream = yourMethodCreatingInputStream();
dataInStream = new DataInputStream(mmInStream);
read the double value from the dataInStream. But as in all computer systems you must be aware of the exact format that your input value comes in. You must refer to the specification of the device you're using to fetch the input data.
Now the dataInStream comes in handy because it abstracts the necessary low-level IO operations and lets you focus on the data. It will automatically translate your queries for the data to the IO operations. For example:
If your data is in double format (and I believe that is the case according to the words of your professor), your read() method is as simple as:
public static double read() {
return dataInStream.readDouble();
}
And in case the data is coming in the float format:
public static double read() {
return (double)dataInStream.readFloat();
}
But again, be sure to consult the specification of the device you're using for the exact format. Some devices may pass you data in exotic formats like for example: "first 2 bytes are the integer part of the resulting value, second 2 bytes are the fractional part". It is up to you as a consumer of the data to follow its format.
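As an illustration only, a sketch of reading such a hypothetical "2 bytes integer part, 2 bytes fractional part" format could look like this; the layout and the 1/65536 convention are assumptions for the example, not a real device spec:
// Sketch only: hypothetical "2 bytes integer part, 2 bytes fractional part" format.
public static double readFixedPoint(DataInputStream in) throws IOException {
    int integerPart = in.readUnsignedShort();    // first 2 bytes
    int fractionalPart = in.readUnsignedShort(); // next 2 bytes
    // Interpret the fraction as 1/65536ths (one possible convention)
    return integerPart + fractionalPart / 65536.0;
}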
I have an array of audio data I am passing to a reader:
recorder.read(audioData,0,bufferSize);
The instantiation is as follows:
AudioRecord recorder;
short[] audioData;
int bufferSize;
int samplerate = 8000;
//get the buffer size to use with this audio record
bufferSize = AudioRecord.getMinBufferSize(samplerate, AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT)*3;
//instantiate the AudioRecorder
recorder = new AudioRecord(AudioSource.MIC,samplerate, AudioFormat.CHANNEL_CONFIGURATION_MONO, AudioFormat.ENCODING_PCM_16BIT,bufferSize);
recording = true; //variable to use start or stop recording
audioData = new short [bufferSize]; //short array that pcm data is put into.
I have an FFT class I found online and a Complex class to go with it.
I have spent two days looking everywhere online, but I can't work out how to loop through the values stored in audioData and pass them to the FFT.
This is the FFT class I am using: http://www.cs.princeton.edu/introcs/97data/FFT.java
and this is the complex class to go with it: http://introcs.cs.princeton.edu/java/97data/Complex.java.html
Assuming the audioData array contains the raw audio data, you need to create a Complex[] object from the audioData array as such:
Complex[] complexData = new Complex[audioData.length];
for (int i = 0; i < complexData.length; i++) {
    complexData[i] = new Complex(audioData[i], 0);
}
Now you can pass your complexData object as a parameter to your FFT function:
Complex[] fftResult = FFT.fft(complexData);
Some of the details will depend on the purpose of your FFT.
The length of the FFT required depends on the frequency resolution and time accuracy you want in your analysis (the two are inversely related), which may or may not be anywhere near the length of an audio input buffer. Given those differences in length, you may have to combine multiple buffers, segment a single buffer, or some combination of the two, to get the FFT window length that meets your analysis requirements.
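For example, a hedged sketch of the segmenting case, assuming the FFT class linked above (which requires a power-of-two input length) and an arbitrarily chosen window of 1024 samples:
// Sketch: split the recorded buffer into fixed-size windows and FFT each one.
int windowSize = 1024; // power of two, as required by the linked FFT class
for (int start = 0; start + windowSize <= audioData.length; start += windowSize) {
    Complex[] window = new Complex[windowSize];
    for (int i = 0; i < windowSize; i++) {
        window[i] = new Complex(audioData[start + i], 0);
    }
    Complex[] spectrum = FFT.fft(window);
    // ... inspect the magnitudes in `spectrum` for this window ...
}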
PCM is just the technique used to encode the samples; it is not an obstacle to frequency analysis with an FFT. If you use Java to decode PCM-encoded data, you get raw audio samples which can then be passed to your FFT library.
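If your analysis code expects floating-point samples in the [-1, 1] range, a small sketch of normalizing the 16-bit PCM shorts from AudioRecord might look like this (the normalization is my own addition; the linked FFT class does not require it):
// Sketch: convert 16-bit PCM samples to normalized doubles before building Complex values.
double[] normalized = new double[audioData.length];
for (int i = 0; i < audioData.length; i++) {
    normalized[i] = audioData[i] / 32768.0; // 16-bit range -> roughly [-1, 1)
}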