Android: mpeg4/H.264 packetization example - java

I need to split an MPEG-4 video stream (actually from the Android video camera) to send it over RTP.
The specification is a little large for quick reference.
Is there any example or open source code for MPEG-4 packetization?
Thanks for any help!

The MPEG-4 file format (MP4) is specified in ISO/IEC 14496-14. Google it and you will find the specifications.
However, what you are trying to do (an RTP publisher) will be hard for the following reasons:
An MP4 file written by a recorder typically has its header at the end of the file, which means the header is written out only when the video stream is finished. Since you want to do real-time video streaming, you will need to guess where audio and video packets start and end. This will not be the same on all Android devices, as they might use different video sizes and codec parameters. So your code will be device-dependent and you'll need to support and test many different devices.
Some devices do not flush video data to the file at regular intervals. Some only flush once a minute or so. This will break your real-time stream.
There is no example code. I know because I looked. There are a few companies that do something similar, but mainly they skip RTP. Instead they progressively upload the file to their own server, implement the video/audio stream "chopping" there, and then feed the result into their video/transcoder backend. I used to work for one of those companies and that's how we did it. AFAIK the competition took similar approaches. The upside is that all the complexity is on the server and you do not need to update clients when something breaks or new devices arrive on the market.
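To make the "header at the end" point concrete: an MP4 file is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-byte type, and the index box (moov) may sit before or after the media data box (mdat). Below is a minimal sketch, not production code, that lists the top-level boxes so you can check where moov lands in a given recording. The names Mp4BoxLister, listTopLevelBoxes and headerAtEnd are made up for illustration, and it assumes the whole file (or the part written so far) fits in a byte array.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: walks the top-level boxes of an MP4 byte buffer. */
final class Mp4BoxLister {

    /** Returns entries of the form "type@offset+size", in file order. */
    static List<String> listTopLevelBoxes(byte[] file) {
        List<String> boxes = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.wrap(file); // ByteBuffer is big-endian by default
        int offset = 0;
        while (file.length - offset >= 8) {
            long size = buf.getInt(offset) & 0xFFFFFFFFL;
            String type = new String(file, offset + 4, 4, StandardCharsets.ISO_8859_1);
            if (size == 1) {
                // size == 1 means a 64-bit size follows the type field
                if (file.length - offset < 16) break;
                size = buf.getLong(offset + 8);
            } else if (size == 0) {
                // size == 0 means the box runs to the end of the file
                size = file.length - offset;
            }
            if (size < 8 || size > file.length - offset) {
                // box is not complete yet (for example, a file still being recorded)
                boxes.add(type + "@" + offset + "+truncated");
                break;
            }
            boxes.add(type + "@" + offset + "+" + size);
            offset += (int) size;
        }
        return boxes;
    }

    /** True if a complete moov box is present and comes after mdat. */
    static boolean headerAtEnd(byte[] file) {
        int mdat = -1;
        int moov = -1;
        List<String> boxes = listTopLevelBoxes(file);
        for (int i = 0; i < boxes.size(); i++) {
            if (boxes.get(i).startsWith("mdat@")) mdat = i;
            if (boxes.get(i).startsWith("moov@")) moov = i;
        }
        return mdat >= 0 && moov > mdat;
    }
}
```

Running this against a file that is still being recorded should show mdat with no moov after it, which is why a live reader has nothing to tell it where the samples start and end.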

Related

Is there a way to read Raw midi data in java?

Using javax.sound.midi I've managed to open my MIDI device for outputting a .midi file in the past, but the issue is that I need to be able to pick up the MIDI event in its raw form, as in "3C40"/"903C40".
I'm able to find documentation on opening transmitters/receivers/sequencers, but no code examples that use these to output the raw data from the MIDI device to, say, a string. The device in my case is a Yamaha YPT-240 digital keyboard.
Basically, the reason I want this raw data is to make some simple keybindings that would be triggered by the raw data from the MIDI device, almost like a Stream Deck or software that allows this.
ShortMessage seems like the way to go, but again I can't find any code using it the way I would like to.
Every example is mididevice||file.midi -> synth||mididevice.
I need a way to intercept the input Java is getting from the device...
Please help lol
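One possible direction, as a hedged sketch rather than a tested answer for the YPT-240: javax.sound.midi lets you plug your own Receiver into a device's Transmitter, and every MidiMessage exposes its raw bytes through getMessage() and getLength(). The class name HexDumpReceiver and its helper methods are invented for this example.

```java
import javax.sound.midi.MidiMessage;
import javax.sound.midi.Receiver;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical Receiver that records each incoming MIDI message as an upper-case hex string. */
final class HexDumpReceiver implements Receiver {
    // send() is called from the MIDI input thread, so use a synchronized list
    private final List<String> seen = Collections.synchronizedList(new ArrayList<>());

    /** Formats the first 'length' bytes as hex, e.g. status, note and velocity bytes. */
    static String toHex(byte[] data, int length) {
        StringBuilder sb = new StringBuilder(length * 2);
        for (int i = 0; i < length; i++) {
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }

    @Override
    public void send(MidiMessage message, long timeStamp) {
        String hex = toHex(message.getMessage(), message.getLength());
        seen.add(hex);
        System.out.println(hex); // replace with your keybinding lookup
    }

    @Override
    public void close() {
        // nothing to release in this sketch
    }

    /** Snapshot of everything received so far. */
    List<String> seen() {
        return new ArrayList<>(seen);
    }
}
```

To hook it up you would look the keyboard up with MidiSystem.getMidiDeviceInfo() and MidiSystem.getMidiDevice(info), pick the device whose getMaxTransmitters() is not 0, call open(), and then getTransmitter().setReceiver(new HexDumpReceiver()). Which entry in the device list is the keyboard's input port is something you will have to find out on your machine.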

Get SPS and PPS from h264 encoded video in JAVA

I'm a bit stuck on a problem and I really hope that someone can help me with it.
My problem is as follows:
I have a live USB camera and I'm encoding only the video in H.264 in order to send it with RTP over the network to a receiver (my receiver here, for test purposes, is Ekiga).
After encoding the video in H.264, I have a byte array.
Now I want to extract the SPS and PPS from this byte array, so that I can send the following sequence when sending frames to the receiver:
SPS => PPS => FRAME 1 (coded slice of an IDR picture) => FRAME 2 (coded slice of a non-IDR picture) => FRAME 3 (coded slice of a non-IDR picture) => and so on ...
How can I extract this information, and is there a Java library which can help me? (JCodec has no docs?!)
Thanks for your help.
Ronnie
It depends on your encoder. If it is producing an Annex B stream, the SPS and PPS are most likely the first and second NALUs, unless it is also producing access unit delimiters, in which case they will be the second and third. If it is not producing Annex B, this data will need to be obtained from the encoder's API another way. Either way you will need to parse the stream. You can see more details here:
Possible Locations for Sequence/Picture Parameter Set(s) for H.264 Stream
One more thing: a NALU is NOT the same thing as a frame. A frame can be made up of many NALUs.
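For the Annex B case, the parsing step can be sketched roughly like this. NAL units are separated by 00 00 01 (or 00 00 00 01) start codes, and the low five bits of the first byte after the start code give the nal_unit_type (7 = SPS, 8 = PPS, 5 = IDR slice, 1 = non-IDR slice, 9 = access unit delimiter). The class and method names are made up for the example, and it does not strip emulation prevention bytes, which only matters if you go on to decode the fields inside the SPS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: splits an H.264 Annex B byte stream into NAL units. */
final class AnnexBScanner {

    /** Returns each NAL unit without its start code, in stream order. */
    static List<byte[]> splitNalUnits(byte[] s) {
        List<byte[]> nals = new ArrayList<>();
        int start = -1; // index of the first byte of the current NAL unit, -1 before the first start code
        int i = 0;
        while (i + 2 < s.length) {
            if (s[i] == 0 && s[i + 1] == 0 && s[i + 2] == 1) {
                if (start >= 0) addTrimmed(nals, s, start, i);
                start = i + 3;
                i += 3;
            } else {
                i++;
            }
        }
        if (start >= 0) addTrimmed(nals, s, start, s.length);
        return nals;
    }

    /** Copies s[from, to) minus trailing zero bytes (e.g. the leading 00 of a 4-byte start code). */
    private static void addTrimmed(List<byte[]> nals, byte[] s, int from, int to) {
        while (to > from && s[to - 1] == 0) to--;
        if (to > from) nals.add(Arrays.copyOfRange(s, from, to));
    }

    /** nal_unit_type is the low 5 bits of the NAL header byte. */
    static int nalType(byte[] nal) {
        return nal[0] & 0x1F;
    }

    /** First NAL unit of the given type, or null if the stream has none. */
    static byte[] firstOfType(List<byte[]> nals, int type) {
        for (byte[] nal : nals) {
            if (nalType(nal) == type) return nal;
        }
        return null;
    }
}
```

With this, firstOfType(nals, 7) and firstOfType(nals, 8) would give you the SPS and PPS to send ahead of the first IDR slice, assuming your encoder actually emits them in-band.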
Thanks for your help and answer. My research and work continue (I'm a newbie in this domain) and I have been able to transmit my video to the receiver. The receiver here is Jitsi, which I'm using for test purposes.
I've also read the link you provided, and it cleared up many things that weren't totally clear to me.
Now, my actual problem is with the quality of the video I am receiving in Jitsi. I'm using Xuggler to encode my original video (streaming from my webcam) to H.264. When Xuggler encodes my video, I can now see the correct SPS, PPS and SEI headers, and you are right, I can also see that many NALs make up each frame to be transmitted over the network.
I think it would be better to use a library other than Xuggler, but there comes my real problem: JCodec has no documentation and, from what I have read, it's a bit slow at processing H.264 video.
Can you please guide me in the choice of a good library for encoding and decoding H.264 video streams?
Does anyone know a Java library which can do that, along with some documentation for it?
Thanks for your help.
Ronnie

Reading and editing .wav (or flac) details in Java

I have tons of ripped .wav files (I'm ready to convert them to FLAC if that's easier) whose details I want to insert into a MySQL database. When I right-click the .wav files in Windows Explorer (not the browser) and select Properties -> Details, I can see some details about the song, for example the artist, genre and duration. How can I read and edit these details in Java?
To get duration information, see this link: Java - reading, manipulating and writing WAV files
Essentially, a WAV file is broken up into chunks, which either contain audio data, or describe the audio data in some way, or provide information about it. If the reader doesn't understand one of those chunks it is able to skip it, which allows placing a lot of different kinds of information in the file. One of those chunks contains information like the samplerate, number of channels and total number of sample frames, from which you can calculate the length.
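As a rough sketch of that chunk walk (assuming a plain RIFF/WAVE file small enough to load into memory, and with class and field names made up for the example): every chunk is a 4-character ID, a 4-byte little-endian size, and then the payload padded to an even length. The byte rate sits at offset 8 of the "fmt " payload, and dividing the "data" chunk size by it gives the duration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: walks the chunks of a RIFF/WAVE byte buffer. */
final class WavChunkWalker {
    final List<String> chunkIds = new ArrayList<>(); // every chunk ID seen, in file order
    long byteRate = -1;  // bytes of audio per second, from the "fmt " chunk
    long dataBytes = -1; // size of the "data" chunk payload

    WavChunkWalker(byte[] wav) {
        if (wav.length < 12 || !id(wav, 0).equals("RIFF") || !id(wav, 8).equals("WAVE")) {
            throw new IllegalArgumentException("not a RIFF/WAVE file");
        }
        ByteBuffer buf = ByteBuffer.wrap(wav).order(ByteOrder.LITTLE_ENDIAN);
        int pos = 12;
        while (pos + 8 <= wav.length) {
            String chunkId = id(wav, pos);
            long size = buf.getInt(pos + 4) & 0xFFFFFFFFL;
            chunkIds.add(chunkId);
            if (chunkId.equals("fmt ") && pos + 8 + 12 <= wav.length) {
                byteRate = buf.getInt(pos + 8 + 8) & 0xFFFFFFFFL;
            } else if (chunkId.equals("data")) {
                dataBytes = Math.min(size, wav.length - pos - 8);
            }
            // Chunks we do not understand are simply skipped; payloads are padded to an even size.
            long next = pos + 8 + size + (size & 1);
            if (next > wav.length) break;
            pos = (int) next;
        }
    }

    /** Duration in seconds; only meaningful if both "fmt " and "data" were found. */
    double durationSeconds() {
        if (byteRate <= 0 || dataBytes < 0) {
            throw new IllegalStateException("missing fmt or data chunk");
        }
        return (double) dataBytes / byteRate;
    }

    private static String id(byte[] b, int offset) {
        return new String(b, offset, 4, StandardCharsets.ISO_8859_1);
    }
}
```

Printing chunkIds for one of your files is also a quick way to see whether it carries any metadata chunk at all besides "fmt " and "data".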
For artist, genre and so on... well, there's no single chunk that every tool uses for that. RIFF does define a LIST/INFO chunk with fields such as IART (artist) and IGNR (genre), so look there first; if the information is really in the file, and not in the Windows db somewhere, it may also be stored in ID3 tags embedded in the WAV. I don't know for sure what the chunk ID is for ID3, but it's probably "id3 " or "ID3 " (including the space). You could probably figure this out by searching for strings of length 4 or more in the file; usually the descriptive chunks are at the beginning and the audio is at the end. (On Unix/macOS I would use the "strings" command, maybe with "head".) ID3 tags are standard for MP3, and you can figure out how to parse them by googling. To get to them, you'll need to understand WAV files first, at least enough to know what chunks and chunk IDs are, how to skip chunks you don't care about, and so on.
I don't know of a library that will read ID3 tags in WAV files in Java, so you'll either have to write one, or wrap one written in another language. I suspect libsndfile will work, but it doesn't have an MP3 reader, so maybe not. You could also try SoX. You can also check out http://javamusictag.sourceforge.net/ which I've never used, but it came up in a search.
Good luck!
I ended up converting them into flac files and using JAudiotagger. Thanks for the responses, this time I ended up this way.
http://www.jthink.net/jaudiotagger/

forwarding and rewinding audio in xuggler

I have used Xuggler to play audio files other than WAV, AU and AIFF. Since Xuggler performs audio decoding at a low level, it is very hard to write a method that both forwards and rewinds the audio being played (while decoding, Xuggler analyzes each data packet and then sends it to play).
One way could be to read a bunch of packets at a time and then send only the next packet to play. This way the effect of forwarding audio can be felt. But I don't know how to implement this method, and it is probably not the best way to forward the data.
Are there any direct methods to forward and rewind audio? If not, what is the algorithm or what are the steps to do this?
Have you looked at the seekKeyFrame() method in IContainer? See here. On seek, you could just flush the dataline and then on execution of the method the container should jump to the given location.
If you want to do it by a percentage call, then getDuration() gets the entire length of the stream (if available.) You can then work out accurate timestamps from there.
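The percentage arithmetic might look like the sketch below. It is plain Java with no Xuggler dependency, and it rests on two assumptions you should check against the Xuggler docs: that getDuration() reports microseconds, and that seekKeyFrame() on a specific stream expects the timestamp in that stream's time base (numerator/denominator seconds per tick). SeekMath and fractionToTimestamp are names invented for the example.

```java
/** Hypothetical helper: turns "seek to X% of the file" into a stream timestamp. */
final class SeekMath {

    /**
     * @param fraction       position in the file, 0.0 (start) to 1.0 (end); values outside are clamped
     * @param durationMicros total duration in microseconds (assumed unit of the container duration)
     * @param timeBaseNum    numerator of the stream time base
     * @param timeBaseDen    denominator of the stream time base
     * @return timestamp in time-base ticks
     */
    static long fractionToTimestamp(double fraction, long durationMicros,
                                    int timeBaseNum, int timeBaseDen) {
        double clamped = Math.max(0.0, Math.min(1.0, fraction));
        long targetMicros = Math.round(clamped * durationMicros);
        // One tick lasts num/den seconds, i.e. 1_000_000 * num / den microseconds.
        return targetMicros * timeBaseDen / (timeBaseNum * 1_000_000L);
    }
}
```

The result would then be passed as the timestamp argument of seekKeyFrame(), after flushing the data line as described above so that stale audio is not played after the jump.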

Convert Data to sound and back

Are there libraries out there that can convert data (text files, etc) to sound and back to the original data?
The sound can be transmitted over any medium I wish, whether radio or something else. I just need to store data in sound files.
Scenario:
step1: Convert a .docx file with embedded images to .wav.
step2: Send over a radio wave.
step3: Convert this .wav back to the .docx file with the embedded images.
This concept can be applied to any data.
Technology:
.net or java
I think the medium is important, as are other factors such as the size of the files and the transmission time available. A simple algorithm would be to convert your files to text (UUENCODE should do the trick) and then convert that to Morse code: http://www.codeproject.com/KB/vb/morsecode.aspx
Morse gives you a simple alphabet able to survive transmission over a fairly noisy radio channel.
If your carrier is cleaner, converting your UUEncoded file into a series of frequencies, one per character, would probably also work and be easy enough to decode at the other end; see Frequency Analyzer in C#.
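Here is a toy version of the one-frequency-per-character idea in Java, mainly to show the arithmetic. Everything about it is an assumption chosen for the example: the class name ToneCodec, 8000 Hz sampling, 0.1 s per byte, and tones at 500 Hz + 10 Hz per byte value. The decoder uses the Goertzel algorithm to measure the energy at each candidate frequency and picks the strongest. It assumes the receiver already knows exactly where each symbol starts, which a real radio link would not give you for free.

```java
/** Hypothetical toy codec: one sine tone per byte value, decoded with the Goertzel algorithm. */
final class ToneCodec {
    static final int SAMPLE_RATE = 8000;   // samples per second
    static final int SYMBOL_SAMPLES = 800; // 0.1 s per byte, so analysis bins are 10 Hz apart
    static final int BASE_BIN = 50;        // byte value v uses bin 50 + v, i.e. 500 Hz + 10 Hz * v

    /** Produces samples in the range -1.0 to 1.0, SYMBOL_SAMPLES per input byte. */
    static double[] encode(byte[] data) {
        double[] samples = new double[data.length * SYMBOL_SAMPLES];
        for (int i = 0; i < data.length; i++) {
            int bin = BASE_BIN + (data[i] & 0xFF);
            for (int n = 0; n < SYMBOL_SAMPLES; n++) {
                samples[i * SYMBOL_SAMPLES + n] = Math.sin(2 * Math.PI * bin * n / SYMBOL_SAMPLES);
            }
        }
        return samples;
    }

    /** Recovers one byte per SYMBOL_SAMPLES samples by picking the strongest of the 256 tones. */
    static byte[] decode(double[] samples) {
        int count = samples.length / SYMBOL_SAMPLES;
        byte[] data = new byte[count];
        for (int i = 0; i < count; i++) {
            int best = 0;
            double bestPower = -1;
            for (int value = 0; value < 256; value++) {
                double power = goertzelPower(samples, i * SYMBOL_SAMPLES, BASE_BIN + value);
                if (power > bestPower) {
                    bestPower = power;
                    best = value;
                }
            }
            data[i] = (byte) best;
        }
        return data;
    }

    /** Squared magnitude of one DFT bin over a single symbol window. */
    static double goertzelPower(double[] x, int offset, int bin) {
        double coeff = 2 * Math.cos(2 * Math.PI * bin / SYMBOL_SAMPLES);
        double s1 = 0;
        double s2 = 0;
        for (int n = 0; n < SYMBOL_SAMPLES; n++) {
            double s0 = x[offset + n] + coeff * s1 - s2;
            s2 = s1;
            s1 = s0;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }
}
```

At 10 bytes per second this is far too slow for a .docx with images, which is really the point about file size and transmission time made above; it only shows that the decode side is simple when the channel is clean.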
You could try to use magnetic card technology for your files. I'm also trying to do this on Android.
Any data can be converted to bytes and then into a string of characters; that is very possible with Java and Android.
Then use the encoding mechanism of the Magnetic Cards API to encode the string to sound. To go back, do the reverse: convert the sound into a string, convert the string into bytes, and save the data. It takes time to convert both ways, but it is feasible. I'm trying to do this so that anyone with an unlimited voice plan can transfer files or, in the future, browse the internet just by calling another number. I hope I gave you some ideas.
The problem is that the data in a Word document doesn't necessarily make decent sound. If you pick a 1.8 kHz carrier and use the binary contents of the Word document to modulate the volume or the frequency (AM or FM), the result will be messy and hard to decode.
But if you save the document as a bitmap, you can use the pixel values to modulate the volume of the carrier wave.
We sent pictures this way over phone lines for many years before modems and the internet took off, not just black and white but also greyscale and color (as three separations of the image: R, G and B).
The fun part is that you can broadcast data this way. The sound can be received by more than one receiver at the same time. There's no error correction, but as you deal with visual data, you don't have to worry about a few pixels getting lost. It's similar to old fax protocols.
Does the audio file need to be convertible using lossy compressors (MP3 etc.)? If not, you can just add a WAV container around any binary data and you'll be fine. Otherwise it gets more difficult, and you need to ensure that the audio is audible (in a reasonable frequency range when played) and be tolerant enough on the frequency detection to match the output of lossy codecs.
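For the lossless case, wrapping arbitrary bytes in a WAV container can be sketched with javax.sound.sampled alone. The class name WavWrapper and the choice of 8-bit unsigned mono PCM at 8000 Hz are assumptions made for the example; 8-bit mono is used so that one payload byte equals one sample frame. It needs Java 9 or later for readAllBytes().

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper: stores raw bytes as the sample data of a WAV file and reads them back. */
final class WavWrapper {
    // 8-bit unsigned mono PCM: one payload byte per sample frame.
    private static final AudioFormat FORMAT = new AudioFormat(8000f, 8, 1, false, false);

    /** Returns a complete WAV file whose audio data is exactly the payload. */
    static byte[] wrap(byte[] payload) {
        try (AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(payload), FORMAT, payload.length)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Extracts the audio data of a WAV file produced by wrap(). */
    static byte[] unwrap(byte[] wav) {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new ByteArrayInputStream(wav))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (UnsupportedAudioFileException e) {
            throw new IllegalArgumentException("not a supported audio file", e);
        }
    }
}
```

The result plays back as noise and only survives bit-exact storage or transfer. As soon as a lossy codec or an analog radio link touches it, the payload is gone, which is where the harder approach described above comes in.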
Best way is to convert the audio file into binary and store in a file type you specify.
Try out the AudioInputStream Class in Java
To give what I think is a better response than all of the above: have a look at packet radio and the various protocols around it. AX.25 is a good example and there are a number of implementations of it. POCSAG is another good one. Both have libraries available for many different languages and have been around for quite a long time.
Other examples include WEFAX (weather fax), HFFax, SSTV (slow-scan TV), etc.
You can think of them all as being similar to the old-school phone line modem encoders and decoders that ran at around 300-2400 baud.
