I have 100+ channels of video streams to process all at the same time. I need to capture the video, generate thumbnails, and serve them out as a web service. For the generation of thumbnail, I can use JMF etc.(I noticed there is another post talking about how to generate and access: better quality thumbnails from larger image files). But my concern is: how to scale? Java EE EJB or simply Java SE Threads? What's the cons and pros? How to scale horizontally using EJB?
I am not that familiar with scalability issue, and I really appreciate your kind suggestions.
Thanks.
Agree... threads should help to scale on single machine. If you want to scale across different machines - use Terracotta.
Java SE Threads can help you scale on a single machine, but if you are going to need to scale horizontally across different machines, EJB would be one way to do it.
If it was me, I'd probably farm it out to a separate web service tier that could run on as many machines as needed, and then load balance between those machines.
I don't see a reason to use EJB in this situation. You have to ask yourself where is the bottleneck. My bet will be with the video processing. I would profile your application and see how many threads can be processing before they are spending more time waiting for their timeslice than they are processing. After a point adding more threads will not add more throughput. At that point you know what a machine will do and how many machines you will need to sustain a certain throughput. How you scale across machine is another question.
These are two distinct problems.
Capturing/processing sounds like a render farm-like problem. Those are scaled trivially horizontally. Most solutions involve a queue of jobs, you don't even need to do this in Java; just find a simple solution you like. "render farm ffmpeg" or stuff like that should yield results in Google.
Your 'serving out as a web service' part is somewhat undefined. If you want those videos to be accesible, you might just need to put them on an HTTP server- those can be load-balanced easily and thus be horizontally scalable- storage speed or network bandwidth would probably be your first bottlenecks there.
Leave the formal J2EE stack behind.
Rather, a nice message queue that talks JMS with X number of JVMs running Y number of threads as consumers.
Related
For the past few months, a developer and I have been working on a screensharing applet that streams to a media server like Wowza or Red5, but no matter what we do, we have about 5 seconds of latency, which is too long for a live application where people are interacting with each other. We've tried xuggle, different encoders, different players, different networks, different media servers, and even streaming locally, there's significant latency.
So, I'm beginning to wonder…
Is Java fast enough to do live screensharing?
I've seen lots of screen recording applets written in Java, but none of them are streaming live. Everything that's done live, such as GoToMeeting, seems to use C++. I'm thinking maybe there's a reason.
It's not a compression problem. Using ScreenVideo, we've compressed an hour-long stream down to about 100 MB, and we have plenty of bandwidth. The processor isn't overloaded doing the compression, either, but it seems to be taking too much time. We are getting the best results from some code pulled out of BigBlueButton, but still, the latency is terrible.
Streaming the WebCam, on the other hand, is nice and snappy. Almost no latency at all. So, the problem is the applet.
The only other idea I can think of is somehow emulating a WebCam with Java. Not sure if that would be faster or not.
Ideas? Or should I just give up on Java and do this in C++? I would hate to do that, because then I would have to create different versions for different platforms, but if it's the only way, it's the only way.
Many video streaming subsystems deliberately buffer so that a blip in connectivity doesn't impact the video, but that makes more sense in a recorded media scenario.
Make sure these systems have buffering turned off or turned down.
Also, while this isn't exactly scientific, you could run an app like wireshark on the outgoing and incoming computers, and try to see how long the traffic actually takes. If it's very fast, then I'd more seriously consider that buffering is the issue.
If you are on Windows, maybe just running Task Manager/Network tab would prove this or not (rather than installing something like wireshark, which isn't difficult... just trying to suggest a fast way to check)
tl;dr: Setting the polling-interval to 0 has given my performance a huge boost, but I am worried about possible problems down the line.
In my application, I am doing a fair amount of publishing from our java server to our flex client, publishing on a variety of topics and sub-topics.
Recently, we have been on a round of performance improvements system-wide, and the messaging layer was proving to be a big bottleneck.
A few minutes ago, I discovered that setting the <polling-interval-millis> property in our services-config.xml to 0 caused published messages, even when there are lots of them, to be recognized by the client almost instantly, instead of with the 3 second delay that is the default value for polling-interval-millis, which has obviously had a tremendous impact.
So, I'm pretty happy with the current performance, only thing is, I'm a bit nervous about unintended side-effects caused by this change. In particular, I am worried about our Flash client slowing way down, and of way too much unwanted traffic.
My preliminary testing has not borne out this fear, but before I commit the change to our repository, I was hoping that somebody with experience with this stuff would chime in.
Unfortunately your question is too general...there is no way to receive a specific answer. I'll write below some ideas, maybe they are helpful.
Decreasing the value from 3 to 0 means that you are receiving new data way faster. If your Flex client uses this data in order to make complex computations it is possible to slow your client or to show obsolete data (it is a known pattern, see http://help.adobe.com/en_US/LiveCycleDataServicesES/3.1/Developing/WS3a1a89e415cd1e5d1a8a18fb122bdc0aad5-8000Update.html ). You need to understand how the data is processed and probably to do some client benchmarking.
Also the server will have to handle more requests, and it would be good to identify what is the maximum requests per second which can be handled. For that, you will need to use a tool like Jmeter in order to detect the maximum capacity of your system, after that you can do some computations trying to figure out how many requests per second you will have after you reduced the interval from 3 to 0, taking into account that the number of clients is increasing with 10% per month etc etc.
The main idea is that you should do some performance testing for some API and save the scripts in order to see if your future modification are slowing down the system too much. Without having this it is quite hard to guess if it ok or not to change configuration parameters.
You might want to try out long-polling. For our Weblogic servers, we don't get any problems unless we let the poll request go to 5 minutes, so we keep it to 4, then give it a 1 second rest before starting again. We have a couple of hundred total users, with 60-70 on it hard core all day. The thing to keep in mind is that you're basically turning intermittent user requests into what amounts to almost always connected telnet sessions. Depending on the browser your users are using it can implications from that as well, but overall we've been very pleased.
I'm writing a large scale financial application that I use mostly Java for. Now, to get some data, I need to write a small script (<200 LOC) to download CSV files (over 20,000 of them) and store them to disk. I need this to be fast, but, a few minutes doesn't make a difference to me. I was planning to write it in Java which isn't very hard, but, I would be done a lot faster if I wrote it in Ruby, so I was wondering if there would be a large difference in speed between Ruby (or JRuby) and Java. The 20,000 files are all about 1/2 a megabyte, and the server I'm downloading from isn't keen to give away data (its completely legal, don't worry about that), so, my application has to randomly sleep in between, and, if the website denies a request, it has to sleep for 3 minutes.
Recommendations to any other easier-than-Java type of languages is welcome.
Use whatever makes you comfortable. Language implementation speed probably won't be an issue there, network speed and the sleeps you have to put in will be a bottleneck anyway.
Sounds like you app will be I/O bound, so the speed of the language is not terribly important
In a language like Ruby or Python, I would expect this to be more like 20 LOC or less. Especially since you have a limited request rate, there is no point using simultaneous connections to try to speed things up
If you have a bunch of machines with different ip addresses ( or one machine with several external addresses), you could split the job across those to speed things up since the rate limiting is usually by ip address
Where do your urls come from?
We have an Java ERP type of application. Communication between server an client is via RMI. In peak hours there can be up to 250 users logged in and about 20 of them are working at the same time. This means that about 20 threads are live at any given time in peak hours.
The server can run for hours without any problems, but all of a sudden response times get higher and higher. Response times can be in minutes.
We are running on Windows 2008 R2 with Sun's JDK 1.6.0_16. We have been using perfmon and Process Explorer to see what is going on. The only thing that we find odd is that when server starts to work slow, the number of handles java.exe process has opened is around 3500. I'm not saying that this is the acual problem.
I'm just curious if there are some guidelines I should follow to be able to pinpoint the problem. What tools should I use? ....
Can you access to the log configuration of this application.
If you can, you should change the log level to "DEBUG". Tracing the DEBUG logs of a request could give you a usefull information about the contention point.
If you can't, profiler tools are can help you :
VisualVM (Free, and good product)
Eclipse TPTP (Free, but more complicated than VisualVM)
JProbe (not Free but very powerful. It is my favorite Java profiler, but it is expensive)
If the application has been developped with JMX control points, you can plug a JMX viewer to get informations...
If you want to stress the application to trigger the problem (if you want to verify whether it is a charge problem), you can use stress tools like JMeter
Sounds like the garbage collection cannot keep up and starts "halt-the-world" collecting for some reason.
Attach with jvisualvm in the JDK when starting and have a look at the collected data when the performance drops.
The problem you'r describing is quite typical but general as well. Causes can range from memory leaks, resource contention etcetera to bad GC policies and heap/PermGen-space allocation. To point out exact problems with your application, you need to profile it (I am aware of tools like Yourkit and JProfiler). If you profile your application wisely, only some application cycles would reveal the problems otherwise profiling isn't very easy itself.
In a similar situation, I have coded a simple profiling code myself. Basically I used a ThreadLocal that has a "StopWatch" (based on a LinkedHashMap) in it, and I then insert code like this into various points of the application: watch.time("OperationX");
then after the thread finishes a task, I'd call watch.logTime(); and the class would write a log that looks like this: [DEBUG] StopWatch time:Stuff=0, AnotherEvent=102, OperationX=150
After this I wrote a simple parser that generates CSV out from this log (per code path). The best thing you can do is to create a histogram (can be easily done using excel). Averages, medium and even mode can fool you.. I highly recommend to create a histogram.
Together with this histogram, you can create line graphs using average/medium/mode (which ever represents data best, you can determine this from the histogram).
This way, you can be 100% sure exactly what operation is taking time. If you can't determine the culprit, binary search is your friend (fine grain the events).
Might sound really primitive, but works. Also, if you make a library out of it, you can use it in any project. It's also cool because you can easily turn it on in production as well..
Aside from the GC that others have mentioned, Try taking thread dumps every 5-10 seconds for about 30 seconds during your slow down. There could be a case where DB calls, Web Service, or some other dependency becomes slow. If you take a look at the tread dumps you will be able to see threads which don't appear to move, and you could narrow your culprit that way.
From the GC stand point, do you monitor your CPU usage during these times? If the GC is running frequently you will see a jump in your overall CPU usage.
If only this was a Solaris box, prstat would be your friend.
For acute issues like this a quick jstack <pid> should quickly point out the problem area. Probably no need to get all fancy on it.
If I had to guess, I'd say Hotspot jumped in and tightly optimised some badly written code. Netbeans grinds to a halt where it uses a WeakHashMap with newly created objects to cache file data. When optimised, the entries can be removed from the map straight after being added. Obviously, if the cache is being relied upon, much file activity follows. You probably wont see the drive light up, because it'll all be cached by the OS.
I'm developing a Java application that streams music via HTTP, and one problem I've come up against is that while the app is reading the audio file from disk and sending it to the client it usually maxes out the CPU at 90-100% (which can cause users problems running other apps).
Is it possible to control the thread doing this work to use less CPU, or does this need to be controlled by the OS? Are there any techniques for managing how intensive your application is at present?
I know you can start threads with a high/low priority, but this doesn't seem to have any effect for me in this scenario.
(I can't get my head past "I've asked the computer to do something, so it's obviously going to do it as fast as it can...")
Thanks!
rod.
That task (reading a file from the disk and sending it via HTTP) should not use any significant amount of CPU, especially at the bitrates required for music streaming (unless you're talking about multi-channel uncompressed PCM or something like that, but even then it should be I/O-bound and not use a lot of CPU).
You're probably doing the reading/writing in a very inefficient way. Do you read/write each byte separately or are you using some kind of buffer?
I would check how much buffering you are using. If you read/write one byte at a time you will consume a lot of CPU. However, if you are reading/writing blocks of say 4 kB it shouldn't use much CPU at all. If your network is the internet your CPU shouldn't be much over 10% of a single client.
One approximation for the buffer size is the bandwidth * delay. e.g. if you expect users to stream at 500 KB/s and there is a network latency of up to 0.1 sec, then the buffer size should be around 50 KB.
You can lower it's priority using methods in Thread (via Thread.currentThread() if necessary).
You can also put delays in it's processing loop (Thread.sleep()).
Other than that, let the O/S take care of it. If your program can use 100% CPU, and nothing else needs the CPU your app might as well use it rather than letting the O/S idle task have it.
It's also true that streaming data should be I/O bound, so you should definitely review what's being done between reading the data and sending it. Are you reading/sending byte by byte, unbuffered, for example?
EDIT: In response to marr75's comment, I am absolutely not advocating that you write poor, inefficient code which wastes CPU resources - There is an article on my web site which clearly conveys what I think about that mind-set. Rather, what I am saying is that if your code legitimately needs the CPU, and you've prioritized it to behave nicely if the user wants to do other things, then there is no point at all in artificially delaying the outcome just to avoid pegging the CPU - that only does the user the disservice of making them wait longer for the end result, which they presumably want as quickly as possible.
Do you have one or more of:
Software RAID
Compressed folder
Intrusive virus checker
Loopback file system
I don't think you can lower the priority without losing the functionality (stream music). Your program gets this much cpu from the OS, because it needs it. It's not like the OS is giving cpu-time away for no reason or because "it's in the mood for it".
If you think, you can do the task without using that much cpu-utilization, you can profile your app and find out, where this high cpu-utilization takes place and then try to improve your code.
I think you are doing the streaming in an inefficient way, but I say streaming CAN be a highly utilizing task.
I repeat, don't think about reducing the cpu-utilization by lowering the priority of the process or telling the OS "Don't give that much cpu-time to this process". That's the whole wrong intuition in my eyes. Reduce the cpu-utilization by improving the algorithms and code after profiling.
A good start in profiling java is this article: http://www.ibm.com/developerworks/edu/os-dw-os-ecl-tptp.html
In addition to the information given above: the JVM is free in how it uses OS threads. The Thread in your Java application might run in a seperate OS thread, or it might share that thread with other Threads. Check the documentation for the JVM you are using for additional information.
Ok, thanks for the advice guys! Looks like I'm just going to have to look into trying to improve the efficiency of the way my app is streaming (though not sure this is going to go far as I'm basically just reading the file from disk and writing it to the client...).
VisualVM is very easy to use to find out where your CPU time is being spent for Java applications, and it is included in the latest versions of the JDK (named jvisualvm.exe on Windows)
Follow up to my "well thought out buffers" comment, a good rule of thumb for TCP buffering,
buffer size = 2 * bandwidth * delay
So if you want to stream 214kbps music (around 27kB/s) and have, let's say 60ms of latency, you're looking at 3.24 kilobytes, and rounding off to a nice 4kB buffer will work very well for you on a wide range of systems.