I have a question about how I can learn to handle high data rates (perhaps with Java). My task is:
I will have a fluorescence microscopy camera producing around 1 gigabyte/s, at frame rates between 100/s and 1000/s.
The image data should be written uncompressed, as raw data, to disk. The storage system is not yet decided and should be dimensioned based on the required performance. During acquisition, a more or less live image should be shown.
Has somebody some suggestions for books or lecture notes for me?
Your question is pretty open ended, but I can give you advice based upon my past experience building multi-camera, real time data acquisition systems.
Typically these data acquisition systems require a video capture card (though you may have to purchase it separately). The cards typically buffer some number of frames, and the frame rate you can support depends on how long you need to run the acquisition and on the slowest transfer rate in the "camera -> capture card -> hard drive" chain. These cards usually come with a documented API (typically in a C variant; I've never seen a Java variant, but that doesn't mean it doesn't exist) and libraries you can compile against, so code using the documented API functions can record data to storage.
When I have worked on systems that just needed a full-motion video frame rate (~30 Hz), a Windows box with a capture card has sufficed just fine. I am pretty sure you can get cards that will sample in the 1 kHz range or higher (depending on your camera resolution), but you may be limited in how long you can sample (given the limited available storage) if you are acquiring data faster than the buffer can flush it to final storage.
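To make that buffering constraint concrete, here is a minimal Java sketch of the usual producer/consumer shape (illustrative only: the 1 MiB frame size, 256-slot queue and output file name are made-up numbers, and the capture-card callback is faked). As soon as the writer can no longer drain the queue as fast as frames arrive, you either drop frames or have to stop acquiring.
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FrameWriterSketch {
    static final int FRAME_BYTES = 1 << 20;                       // hypothetical 1 MiB raw frame
    static final BlockingQueue<ByteBuffer> QUEUE = new ArrayBlockingQueue<>(256);

    public static void main(String[] args) {
        new Thread(FrameWriterSketch::drainToDisk).start();       // writer thread
        for (int i = 0; i < 1000; i++) {                          // stand-in for the capture-card callback
            ByteBuffer frame = ByteBuffer.allocateDirect(FRAME_BYTES);
            // ... fill 'frame' from the capture card here ...
            if (!QUEUE.offer(frame)) {
                System.err.println("Buffer full, dropped frame " + i);
            }
        }
    }

    // Runs until the process is killed; a real tool would signal shutdown and join the thread.
    static void drainToDisk() {
        try (FileChannel out = FileChannel.open(Paths.get("frames.raw"),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            while (true) {
                ByteBuffer frame = QUEUE.take();                  // blocks until a frame is queued
                while (frame.hasRemaining()) {
                    out.write(frame);
                }
            }
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
    }
}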
Also, there is no reason for you to display more than ~30 Hz: no display system is going to support a 1 kHz refresh rate, and the human eye can't process much more than 30 Hz anyway.
Unfortunately, in my experience these systems are put together piecemeal because they are highly specialized, which limits the market and disincentivizes a standardized approach. The bottom line is that you are probably looking at either using a capture-card manufacturer's API (I'd advocate against wrapping it in Java, because you'll just be adding extra latency that you can't afford at the acquisition rates you are talking about) or having an electrical engineer custom-fit your solution. If I were in your shoes, I'd be searching for a capture card that meets my requirements, perhaps from the microscopy camera manufacturer.
I'm new to adaptive bit rate streaming. Basically I'm trying to write an app that shows information about the quality of the connection on an Android device.
Since Honeycomb (3.0), Android has supported adaptive bit rate streaming through HTTP Live Streaming (HLS). It seems like support for helping developers verify the quality of this connection on the device side is very limited.
What I would like to know is some low level information about the stream. Such as: the number of segments, the segment duration, number of requests to change bit rate, the bit rate the media player sees (to facilitate the change), etc.
I've been able to get some information about stream quality from the MediaPlayer, MediaController, MediaMetaDataRetriever, CamcorderProfile, MediaFormat, MediaExtractor classes. However, the stuff I'm looking for is even lower level. If possible I'd like to be able to actually see how the player is communicating with the server.
I just started looking at the MediaCodec class, but I can't figure out how to get the MediaCodec from a MediaPlayer. Or maybe I just don't know how to use it properly, as I can't find any good documentation or examples.
Does anyone know if it is possible to access the low-level information I'm looking for on Android? Is MediaCodec the way to go? If so, does anyone have any working examples of how I could get the currently used MediaCodec and extract the information I'm looking for out of it? (Or at least point me in the right direction.)
Really appreciate any help on this one.
Cheers
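One playlist-level route to some of the numbers asked about above (segment count and segment durations) is to fetch the HLS media playlist yourself and read the #EXTINF tags. This is only a sketch, and it assumes you know the .m3u8 URL the player was given (the URL below is a placeholder):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class HlsPlaylistProbe {
    public static void main(String[] args) throws Exception {
        String playlistUrl = args.length > 0 ? args[0] : "https://example.com/stream/index.m3u8";
        int segments = 0;
        double totalSeconds = 0;
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                new URL(playlistUrl).openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.startsWith("#EXTINF:")) {
                    // "#EXTINF:<duration>,<optional title>" -- duration in seconds, possibly fractional
                    String duration = line.substring("#EXTINF:".length()).split(",", 2)[0];
                    totalSeconds += Double.parseDouble(duration.trim());
                    segments++;
                }
            }
        }
        System.out.println(segments + " segments, ~" + totalSeconds + " s of media");
    }
}
The master playlist lists the available bit rates in its #EXT-X-STREAM-INF entries in the same way; which rendition the stock MediaPlayer actually picks at any moment is not exposed, which is why people end up either proxying the HTTP traffic or switching to a player they control.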
I'm trying to create an Android app which will get the lyrics of an MP3 from its ID3v2 tag. My question is: is it possible to have the lyrics highlighted automatically as the song plays, e.g. using speech processing or something similar? I've looked into previous similar questions, but all of them require manual input. I'd appreciate feedback as soon as possible. Thank you.
This kind of thing is possible on Hollywood movie sets, using technology similar to those image enhancements that reconstruct a face using a 4-pixel square as input.
Okay, so your request is theoretically more feasible, but no current phone technology I know of could do this on the fly. You might need a Delorean, flux capacitor and some plutonium.
Also, detecting vocals over music is a much harder problem than speaking a text message into your phone:
Sung lyrics do not usually follow natural speech rhythm;
The frequency spectrum of music tends to conflict with the frequency spectrum of voice;
The voice varies in pitch, making it much harder to isolate and detect phonetic features;
The vocals are often mixed at a level equal to all other musical instruments;
IwannahuhIwannahuhIwannahuhIwannahuhIwannaReallireallirealliwannaZigaZiggUHH.
You might take a look at the paper "LyricSynchronizer: Automatic Synchronization System Between Musical Audio Signals and Lyrics" for a possible solution. Nothing is implemented in Java for Android, but with the NDK you could take any C code and finagle it to work. ;-)
This paper describes a system that can automatically synchronize polyphonic musical audio signals with their corresponding lyrics. Although methods for synchronizing monophonic speech signals and corresponding text transcriptions by using Viterbi alignment techniques have been proposed, these methods cannot be applied to vocals in CD recordings because vocals are often overlapped by accompaniment sounds. In addition to a conventional method for reducing the influence of the accompaniment sounds, we therefore developed four methods to overcome this problem: a method for detecting vocal sections, a method for constructing robust phoneme networks, a method for detecting fricative sounds, and a method for adapting a speech-recognizer phone model to segregated vocal signals. We then report experimental results for each of these methods and also describe our music playback interface that utilizes our system for synchronizing music and lyrics.
Best of luck in your implementation!
I am investigating this field to achieve object detection in real time.
Video example:
http://www.youtube.com/watch?v=Bm5qUG-06V8
http://www.youtube.com/watch?v=aYd2kAN0Y20
But how can they extract SIFT keypoints and match them so fast?
SIFT extraction generally takes on the order of a second per frame.
I'm an OpenIMAJ developer and responsible for making the first video.
We're not doing anything particularly fancy to make the matching fast in that video, and the SIFT detection and extraction is carried out on the entirety of every frame. In fact that video was made well before we did any optimisation; the current version of that demo is much smoother. We do also have a version with a hybrid KLT-tracker that works even faster by not having to perform SIFT on every frame.
As suggested by @Mario, the image size has a big effect on the speed of extraction, so processing a smaller frame can be a big win. Secondly, in the original description of the difference-of-Gaussian interest point localisation in Lowe's SIFT paper, it was suggested that the input image first be doubled in size to increase the number of features. By not performing this double-sizing you also get a big performance boost, at the expense of having fewer features to match.
The code is open source (BSD license) and you can get it by following the links at http://www.openimaj.org. As stated in the video description, the image-processing code is pure Java; the only native code is a thin interface to the webcam. Tutorial number 7 in the current tutorial pdf document walks through the process of using SIFT in OpenIMAJ. Disabling the double-sizing can be achieved by doing:
DoGSIFTEngine engine = new DoGSIFTEngine();
engine.getOptions().setDoubleInitialImage(false);
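For context, a fuller sketch of that option in use, roughly as in the OpenIMAJ tutorials (package and class names quoted from memory, and the input file is just a placeholder, so check them against the release you actually build against):
import java.io.File;
import org.openimaj.feature.local.list.LocalFeatureList;
import org.openimaj.image.FImage;
import org.openimaj.image.ImageUtilities;
import org.openimaj.image.feature.local.engine.DoGSIFTEngine;
import org.openimaj.image.feature.local.keypoints.Keypoint;

public class SiftSketch {
    public static void main(String[] args) throws Exception {
        FImage frame = ImageUtilities.readF(new File("frame.png"));  // any greyscale frame
        DoGSIFTEngine engine = new DoGSIFTEngine();
        engine.getOptions().setDoubleInitialImage(false);            // skip the initial 2x upsampling
        LocalFeatureList<Keypoint> keypoints = engine.findFeatures(frame);
        System.out.println(keypoints.size() + " keypoints");
    }
}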
SIFT can be accelerated in several ways:
If you can afford approximations, you can switch to a related detector/descriptor called SURF, which is much faster (it uses integral images for most tasks);
You can use parallel implementations, at the CPU level (e.g. OpenCV uses Intel's TBB) or at the GPU level (search for "SIFT GPU" for related code and documentation).
Anyway, none of these is available in Java (AFAIK), so you'll have to use a Java wrapper to OpenCV or work it out yourself.
General and first idea: ask the video uploader(s). We can only guess at what is being done or how it's done. It would also help to know what you've done so far (e.g. your video resolution, your processing power, image preparation, etc.).
I haven't used SIFT specifically, but I did quite a bit of object/motion tracking over the last few years, so this is more general advice. You might have tried some of these points already; I don't know.
Reduce your image resolution: going from 640x480 to 320x240 cuts the data to 25%. Going down to 160x120 leaves 25% of that again (so only 6.25% of the original data) without necessarily hurting your algorithm (see the downscaling sketch after this list).
In a similar way, it might be useful to reduce the color depth of your image (not just 256-level grayscale, but maybe even fewer levels, like 64).
Try other methods to make features more obvious or faster to find, e.g. try running an edge detector over your image.
At least the second video mentions a tracking system, so you could try to predict the region where the tracked object should reappear in the next frame (using a simple alpha/beta filter or similar on its coordinates and possibly rotation), then run SIFT only on that sub-area (with some added margin). Only analyze the whole image if you can't find the object again. At around 40 or 50 seconds into the second video they lose the object and need quite a few tries to find it again.
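To illustrate the resolution point above, a small downscaling helper in plain Java 2D (nothing SIFT-specific; the bilinear interpolation and greyscale output are just one reasonable choice) that you would run on each frame before feature extraction:
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class Downscale {
    // Returns a scaled greyscale copy of 'src'; run feature extraction on the
    // result instead of the full-resolution frame.
    static BufferedImage scale(BufferedImage src, double factor) {
        int w = Math.max(1, (int) (src.getWidth() * factor));
        int h = Math.max(1, (int) (src.getHeight() * factor));
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return dst;
    }
}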
The goal is to get a simple 2d audio visualizer that is somewhat responsive to the music.
I've got the basics set up, where I have graphics that will respond to some data being fed in. Given a file, I load up an AudioInputStream for playback (this works fine) and have that running in a thread. In another thread, I would like to extract byte data at a rate close to playback (or perhaps faster, to allow for delay in processing that data). I then want to feed that to an FFT process and feed the resulting data to my graphics object, which will use it as a parameter for whatever the visualization is.
I have two questions for this process:
1) How can I get the byte data and process it at a rate that matches normal playback of the file? Is using an AudioInputStream the way to go here?
2) Once I do the FFT, what's a good way to get usable data (i.e. power spectrum? somehow filtering out certain frequencies? etc.)?
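For question (1), one common shape (a sketch using javax.sound.sampled; the WAV file name, 1024-frame chunk size and 16-bit little-endian PCM assumption are mine) is to open a second AudioInputStream on the same file, read fixed-size chunks, convert them to samples for the FFT thread, and pace the loop against the playback position rather than trying to read "at the same speed":
import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

public class SampleReaderSketch {
    public static void main(String[] args) throws Exception {
        AudioInputStream in = AudioSystem.getAudioInputStream(new File("song.wav"));
        AudioFormat fmt = in.getFormat();               // assumed 16-bit signed little-endian PCM
        int frameSize = fmt.getFrameSize();             // bytes per sample frame (all channels)
        byte[] chunk = new byte[1024 * frameSize];      // ~1024 sample frames per FFT window

        int read;
        while ((read = in.read(chunk)) > 0) {
            double[] samples = new double[read / frameSize];
            for (int i = 0; i < samples.length; i++) {
                // first channel only, 16-bit little-endian -> [-1, 1]
                int lo = chunk[i * frameSize] & 0xff;
                int hi = chunk[i * frameSize + 1];      // sign-extended high byte
                samples[i] = ((hi << 8) | lo) / 32768.0;
            }
            // hand 'samples' to the FFT / visualisation thread here
        }
        in.close();
    }
}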
Some considerations about (2), using the FFT to extract "features".
You could calculate short-term FFTs, for example over 512 points, whenever there are enough free CPU cycles to do so. For a visualisation it is not necessary to preserve all the information (i.e. to work with overlapping windows); instead you could calculate, say, a 100 ms FFT five times per second.
Then you should calculate the logarithmic power spectrum in dB (decibels).
This gives you a pretty good impression of the detailed frequency content of your sound.
Depending on what you want to visualize, you could for example combine some low-frequency FFT bins (calculate their RMS) to get the "bass" content of your sound, and so on.
See this post for details.
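A minimal sketch of the dB-spectrum / bass-RMS idea described above. It uses a deliberately naive DFT so it stays self-contained (in a real visualizer you would use a proper FFT library such as JTransforms), and the choice of which low bins count as "bass" is arbitrary:
public class SpectrumSketch {
    // Naive DFT of a real-valued window; returns per-bin power in dB.
    // O(N^2) -- fine for a sketch, replace with a real FFT in production.
    static double[] powerSpectrumDb(double[] x) {
        int n = x.length;
        double[] db = new double[n / 2];
        for (int k = 0; k < n / 2; k++) {
            double re = 0, im = 0;
            for (int t = 0; t < n; t++) {
                double angle = -2 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im += x[t] * Math.sin(angle);
            }
            db[k] = 10 * Math.log10(re * re + im * im + 1e-12);   // +1e-12 avoids log(0)
        }
        return db;
    }

    // Crude "bass" feature: RMS over the lowest few bins (in linear power), skipping DC.
    static double bassLevel(double[] powerDb, int bins) {
        double sum = 0;
        for (int k = 1; k <= bins; k++) {
            sum += Math.pow(10, powerDb[k] / 10);                 // back to linear power
        }
        return Math.sqrt(sum / bins);
    }
}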
So I want to make a new music player for Android. It's going to be open source, and if you think this idea is any good, feel free to let me know and maybe we can work on it.
I know it's possible to speed up and slow down a song while correcting the sound so that the voices and instruments still hit the same pitch.
I'd like to make a media player for Android aimed at joggers which will:
Beat match successive songs
Maintain a constant beat for running to
Beat can be established via accelerometer or manually
Alarms and notifications automatically at points in the run (geo-located or timer-based)
Now I know that this will fall down in many use cases (slow songs sounding stupid, beat changes within a song getting messed up), but I feel they can be overcome. What I really need to know is how to get started writing an application in C++ (using the Android NDK) which will perform the analysis and adjust the stream.
Will it be feasible to do this on the fly? What approach would you use? A server that streams to the phone? Maybe offline analysis of the songs on a desktop that gets synched to your device via tether?
If this is too many questions for one post, I am most interested in the easiest way of analysing the waveform of an MP3 to find the beat. On top of that, how to perform the manipulation to change the beat would be my next point of interest.
I had a tiny crappy mp3 player that could do double speed on the fly so I'm sure it can be done!
Gav
This is technologically feasible on a smartphone-class device, although it is extremely difficult to achieve good-sounding pitch-shifting and time-stretching effects even on a powerful PC working offline, never mind in real time.
Pitch-shifting and time-stretching can be achieved on a relatively powerful mobile device in realtime (I've done it in .Net CF on a Samsung i760 smartphone) without overly taxing the processor (the simple version is not much more expensive than ordinary MP3 playback). The effect is not great, although it doesn't sound too bad if the pitch and time changes are relatively small.
Automatic determination of a song's tempo might be too time-consuming to do in real time, but this part of the process could be performed in advance of playback, or it could be done on the next song well before the current song is finished playing. I've never done this myself, so I dunno.
Everything else you mentioned is relatively easy to do. However: I don't know how easy Android's API is regarding audio output, or even whether it allows the low-level access to audio playback that this project would require.
Actually, you'll have 2 problems:
Finding the tempo of a song is not easy. The most common method involves autocorrelation, which involves quite a bit of calculus, so I hope you've studied up. (A rough sketch of the idea follows this answer.)
Actually changing the tempo of a song without shifting its pitch is even harder, and it still leaves audible artifacts in the song. Typically it takes a long time to edit audio this way, and a lot of tinkering to get the song to sound good. Doing this in real time would be very, very hard. The actual process involves taking the Fourier transform of the audio, shifting the frequencies, and taking the inverse Fourier transform. More calculus, this time with imaginary numbers.
If you really want to work on this I suggest taking a class in signals and systems from an Electrical Engineering department.
Perhaps an easier idea: find the tempo of all the songs in a user's library, and just focus on playing songs whose beat is close to the jogger's pace. You still need to do #1, but you don't need to worry about #2.
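To make #1 above less abstract, here is a bare-bones sketch of the autocorrelation idea applied to a frame-energy envelope (mono samples in, rough BPM out). The 1024-sample frames, the 60-200 BPM search range, and the lack of onset detection, windowing or peak interpolation are all simplifications of my own:
public class TempoSketch {
    // Very rough tempo estimate from mono samples via autocorrelation of a
    // frame-energy envelope. Returns beats per minute.
    static double estimateBpm(double[] samples, int sampleRate) {
        int frame = 1024;                              // ~23 ms at 44.1 kHz
        int nFrames = samples.length / frame;
        double[] energy = new double[nFrames];
        for (int i = 0; i < nFrames; i++) {
            double e = 0;
            for (int j = 0; j < frame; j++) {
                double s = samples[i * frame + j];
                e += s * s;
            }
            energy[i] = e;
        }

        double envRate = (double) sampleRate / frame;  // envelope samples per second
        int minLag = (int) (envRate * 60 / 200);       // 200 BPM upper bound
        int maxLag = (int) (envRate * 60 / 60);        // 60 BPM lower bound
        int bestLag = minLag;
        double best = Double.NEGATIVE_INFINITY;
        for (int lag = minLag; lag <= maxLag && lag < nFrames; lag++) {
            double corr = 0;
            int count = 0;
            for (int i = 0; i + lag < nFrames; i++) {
                corr += energy[i] * energy[i + lag];
                count++;
            }
            corr /= Math.max(1, count);                // normalise so long lags aren't penalised
            if (corr > best) {
                best = corr;
                bestLag = lag;
            }
        }
        return 60.0 * envRate / bestLag;               // lag in envelope samples -> BPM
    }
}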
Changing the audio speed on the fly is definitely doable; I'm just not sure whether it's doable on the G1.
Rather than writing your own from scratch, I would recommend looking at the MythTV and/or mplayer source code. They both support speeding up video playback while compensating the audio.
http://picard.exceed.hu/tcpmp/test/
tcpmp did everything you're asking for on an itty-bitty Palm Centro... and more, including video! If it can be done on a Palm Centro, it sure as heck can be done on Android!!