Performance Wise, Python VS JAVA For File Based Processing

Performance Wise, Python VS JAVA For File Based Processing - java

I need to create daemon that will monitor certain directory and will process every file that's written to that particular path.
My choice is either java or python.
Did you guys have any experience using both technology? what is the best one?
EDIT 1: files that will be processed is simple text file (one line with tab separated fields).
I just need to move it to buffer and send to further to my php file.
EDIT 2: It's for freebsd server

Performance-wise, for an I/O - syscall bound task such as you're mentioning, it's going to be a wash, most likely, depending a bit on the platform. Java tends to have better CPU usage (partly because a JVM can effectively use multiple cores on a multicore CPU on different threads, with CPython having problems with that; partly because of strong JIT abilities), but typically pays for them with higher RAM footprints (no big deal if you have 64GB of RAM laying around and not much else to do on the machine, say, but often an issue in other circumstances).
If you specify the platform (Linux vs Windows vs ...), we might be able to offer more help.
Edit: with processing required as light as the OP's mentioned in the Q's edit, there's really nothing either way in the CPU-load part of the task. Unfortunately I don't know what freebsd offers for "directory watching" (like Linux's inotify, etc).

Related

JAVA Distributed processing on a single machine (Ironic i know)

I am creating a (semi) big data analysis app. I am utilizing apache-mahout. I am concerned about the fact that with java, I am limited to 4gb of memory. This 4gb limitation seems somewhat wasteful of the memory modern computers have at their disposal. As a solution, I am considering using something like RMI or some form of MapReduce. (I, as of yet, have no experience with either)
First off: is it plausible to have multiple JVM's running on one machine and have them talk? and if so, am I heading in the right direction with the two ideas alluded to above?
Furthermore,
In attempt to keep this an objective question, I will avoid asking "Which is better" and instead will ask:
1) What are key differences (not necessarily in how they work internally, but in how they would be implemented by me, the user)
2) Are there drawbacks or benefits to one or the other and are there certain situations where one or the other is used?
3) Is there another alternative that is more specific to my needs?
Thanks in advance

First, re the 4GB limit, check out Understanding max JVM heap size - 32bit vs 64bit . On a 32 bit system, 4GB is the maximum, but on a 64 bit system the limit is much higher.
It is a common configuration to have multiple jvm's running and communicating on the same machine. Two good examples would be IBM Websphere and Oracle's Weblogic application servers. They run the administrative console in one jvm, and it is not unusual to have three or more "working" jvm's under its control.
This allows each JVM to fail independently without impacting the overall system reactiveness. Recovery is transparent to the end users because some fo the "working" jvm's are still doing their thing while the support team is frantically trying to fix things.
You mentioned both RMI and MapReduce, but in a manner that implies that they fill the same slot in the architecture (communication). I think that it is necessary to point out that they fill different slots - RMI is a communications mechanism, but MapReduce is a workload management strategy. The MapReduce environment as a whole typically depends on having a (any) communication mechanism, but is not one itself.
For the communications layer, some of your choices are RMI, Webservices, bare sockets, MQ, shared files, and the infamous "sneaker net". To a large extent I recommend shying away from RMI because it is relatively brittle. It works as long as nothing unexpected happens, but in a busy production environment it can present challenges at unexpected times. With that said, there are many stable and performant large scale systems built around RMI.
The direction the world is going this week for cross-tier communication is SOA on top of something like spring integration or fuse. SOA abstracts the mechanics of communication out of the equation, allowing you to hook things up on the fly (more or less).
MapReduce (MR) is a way of organizing batched work. The MR algorithm itself is essentially turn the input data into a bunch of maps on input, then reduce it to the minimum amount necessary to produce an output. The MR environment is typically governed by a workload manager which receives jobs and parcels out the work in the jobs to its "worker bees" splattered around the network. The communications mechanism may be defined by the MR library, or by the container(s) it runs in.
Does this help?

Limit resource utilization of JNA calls without changing dll

How can you prevent a JNA method-call from exceeding thresholds for CPU utilization, thread-counts, and memory limits?
Background:
I'm working on a safety critical application and one of the non-safety-critical features requires the use of a library written in C. The dlls have been given to me as a black-box and there's no chance that I'll get access to the source code beyond the java interface files. Is there a way to limit the CPU usage, thread-count, and memory used by the JNA code?

See ulimit and sysctl, which are applicable to your overall JVM process (or any other process, for that matter).
It's not readily possible to segment parts of your JVM which are making native accesses via JNA from those that aren't, though.
You should run some profiling while you exercise your shared library to figure out what resources it does use, so you can focus on setting limits around those (lsof or strace would be used on linux, I'm not sure of the equivalent on windows).

For most operating systems you must either call your C code from a new thread or new process. I would recommend calling it from a new process as then you can sandbox it easier and deeper. Typically on a Unix like system one switches to a new user set aside for the service and that has user resource limits on it. However, on Linux one can use user namespaces and cgroups for more dynamic and flexible sandboxing. On Microsoft Windows one typically uses Job objects for resource sandboxing but permissions based sandboxing is more complicated (a lot of Windows is easily sandboxable with access controls but the GUI and window messaging parts make things complicated and annoying).

How to make full use of multiple processors?

I am doing web crawling on a server with 32 virtual processors using Java. How can I make full of these processors? I've seen some suggestions on multi-threaded programming, but I wonder how that could ensure all processors would be taken advantage of since we can do multi-threaded programming on single processor machine as well.

There is no simple answer to this ... except the way to ensure all processors are used is to use multi-threading the right way. (Note: that is a circular answer!)
Basically, the way to get effective use of multiple processors is to:
ensure that there is work that can be done in parallel, and
reduce / eliminate contention points that force one thread to wait while another thread does something.
This is difficult enough when you are doing simple computation. For a web crawler, you've got the additional problems that the threads will be competing for network and (possibly) remove server bandwidth, and they will typically be attempting to put their results into a shared data structure or database.
That's about all that can be said at this level of generality ...
And as #veer correctly points, you can't "ensure" it.
... but using a load of threads will surely be quicker wall-time-wise because all the miserable network latency will happen in parallel ...
Actually, if you go overboard, a load of threads can reduce throughput because of contention. Just throwing lots of threads at the problem is rarely a good idea.

A computer or a program is only as fast as the slowest link in its processing chain. Just increasing the CPU capacity is not going to ensure a drastic performance peak. Leaving aside other issues like your cache-size, RAM, etc., there are two basic kinds of approach to your question about how to take advantage of all your processors:
[1] Using a Jit/just-in-time compiler/interpreter technology such as Java/.NET. I don't know much about java, but the .NET jitter is definitely designed to take advantage of all the available processors on the mahcine. In fact, this very feature makes a jitter stand out against other static language compilers like C/C++, because the jitter "knows" that it is sitting on 32 processors, it is in a much better position to take advantage of them than a program statically compiled on any other machine. (provided you have written a robust multi-threading code for it!)
[2] Programming in C/C++. This is the classic approach. If you compile your code on the same machine with 32 CPUs, and take proper care in your program such as memory-management, handling pointers, etc. the C/C++ program will be the most optimal and will perform better than its CLR/JVM counterpart (as it runs without the extra overhead of a garbage-collector or a VM).
But keep in mind that writing robust code is much easier in .NET/Java than C/C++. So, if you are not a "hard-core" programmer, I would suggest going with the former approach. Also remember to handle your multiple threads with care, such as locking variables when multiple threads try to change the same variables. However, excessive locking might make your code hang, if a variable behaves unexpectedly.

Processor management is implemented in native through the Virtual machine you are using i.e., JVM. You can have a look here Java Hotspot VM Options to optimize your machine if you are using Java Hotspot VM. If you are using a third party VM then your provider may help you with tuning it for your requirements.
Application performance in design practically depends on you.
If you would like to monitor your threads and memory usage to optimize your application, you can use any VM monitoring tools available to date. The Java virtual machine (JVM) has built-in instrumentation that enables you to monitor and manage it using JMX.
For details you can check Platform Monitoring and management using JMX. For third party VMs you have to contact the vendor I guess.

Best OS to deploy a low latency Java application?

We have a low latency trading system (feed handlers, analytics, order entry) written in Java. It uses TCP and UDP extensively, it does not use Infiniband or other non-standard networking.
Can anyone comment on the tradeoffs of various OSes or OS configurations to deploy this system? While throughput is obviously important to keep up with modern price feeds, latency is our #1 priority.
Solaris seems like a natural candidate since they created Java; should I use Sparc or x64 processors?
I've heard good things about RHEL and SLERT, are those the right versions of Linux to use in our benchmarking.
Has anyone tested Windows against the above OSes? Or is it assumed to not keep up?
I'd like to leave the Java vs C++ debate for a different thread.

Vendors love this kind of benchmark. You have code, right?
IBM, Sun/Oracle, HP will all love to run your app on their gear to demonstrate their advantages.
Make them do this. If you have code, make the vendors run a demonstration on their gear to show which is best for your needs.
It's easy, painless, free, and factual. The final decision will be easy and obvious. And you will know how to install and tune to maximize performance.
What I hate doing is predicting this kind of thing before the code is written. Too many customers have asked for a H/W and OS recommendation before we've finished identifying all the use cases. Asking for that kind of precognition is simple craziness.
But you have code. You can produce test cases that exercise your code. That's perfect.

For a trading environment, in addition to low latency you are probably concerned about consistency as well as latency so focusing on reducing the impact of GC pauses as much as possible may well give you more benefit than differnt OS choices.
The G1 garbage collector in recent versions of Suns Hotspot VM improves stop the world pauses a lot, in a similar way to the JRockit VM
For real performance guarantees though, Azul Systems version of the Hotspot compiler on their Java Appliance delivers the lowest guaranteed pauses available - also it scales to a massive size - 100s of GB stack and 100s of cores.
I'd discount Java Realtime - although you'd get guarantees of response, you'd sacrifice throughput to get those guarantees
However, if your planning on using your trading system in an environment where every microsecond counts, you're really going to have to live with the lack of consistency you will get from the current generation of VM's - none of them (except realtime) guarantees low microsecond GC pauses. Of course, at this level your going to run into the same issues from OS activity (process pre-emption, interrupt handling, page faults, etc.). In this case one of the real time variants of Linux is going to help you.

I wouldn't rule out Windows from this just because it's Windows. My expirience over the last few years has been that the Windows versions of the Sun JVM was usually the most mature performance wise in contrast to Linux or Soaris x86 on the same hardware. The JVM for Solaris SPARC may be good too, but I guess with Windows on x86 you'll get more power for less money.

I would strongly recommend that you look into an operating system you already have experience with. Solaris is a strange beast if you only know Linux, e.g.
Also I would strongly recommend to use a platform actually supported by Sun, as this will make it much easier to get professional assistance when you REALLY, REALLY need it.
http://java.sun.com/javase/6/webnotes/install/system-configurations.html

I'd probably worry about garbage collection causing latency well before the operating system; have you looked into tuning that at all?
If I were willing to spend the time to trial different OSs, I'd try Solaris 10 and NetBSD, and probably a Linux variant for good measure.
I'd experiment with 32-vs-64 bit architectures; 64 bit will give you a larger heap address space... but will take longer to address each bit of memory.
I'm assuming you've profiled your application and know where the bottlenecks are; by the comment about GC, you've done that. In that case, your application shouldn't be CPU-bound, and chip architecture shouldn't be a primary concern.

I don't think managed code environments and real-time processing go together very well. If you really care about latency, remove the layer imposed by the managed code. This is not a Java vs C++ argument, but a Java/C#/... vs C/C++/FORTRAN/... argument, and I believe that is a valid design discussion to have.
And yes, I do mean FORTRAN, we run a number of near real-time systems with a FORTRAN foundation.

One way to manage latency is to have several JVM's dividing the work with smaller heaps so that a stop the world garbage collection isn't as time consuming when it happens and affects less processes.
Another approach is to load up a cluster of JVM's with enough memory and allocate the processes to ensure there won't be a stop the world garbage collection during the hours you care about latency (if this isn't a 24/7 app), and restart JVMs on off hours.
You should also look at other JVM implementations as a possibility (such as JRocket). Of course if any of them are appropriate depends entirely on your specific application.
If any of the above matters to your approach, it will affect the choice of OS. For example, if you go with another JVM implementation, that might limit OS choices, and if you go with clustering or otherwise running a several JVM's for the application, that might require some better underlying OS tools to manage effectively, further influencing the OS choice.

The choice of operating system or configurable is completely redundant considering the availability of faster network fabrics.
Look at 10GigE with ToE NICs, or the faster solution of 4X QDR (40Gbs) InfiniBand but with IPoIB presenting a standard Ethernet interface and routing.

How does the Sun JVM map Java threads to Windows threads?

My application uses loads of Java threads. I am looking for a reliable understanding how the JVM (version 5 and 6) maps the Java threads to underlying Windows threads. I know there is a document for mapping to Solaris threads, but not Windows.
Why doesn't Sun publish this information?
I want to know if there's a 1:1 mapping, or if it varies by JVM, by -server option, by workload, etc, etc.
I know I am not "supposed" to care, I should write properly synchronisd code, but I am inheriting a large body of code...
Also, does anyone know how to give names to Windows threads?

Don't have a document for you, but from the Threads column in the task-manager you can pretty reliably guess that it maps 1:1 to native threads (you need to enable the Threads column in the task manager first).
Oh, almost forgot, you can download the jdk src here and look yourself.

The mapping is platform-dependent, however I found an interesting comparison between platform threads for the vm (although probably a bit old). The bottom line is: you don't need to know. What you probably are more interested is to know about green threads (if you don't know already).
As for the naming question: Doesn't the constructor allow you to name a thread? Or do you mean name them and view their name on some windows thread browser?

How to name a Win32 thread
Unfortunately, this seems like it's impossible or at least very hard to do inside the Windows JVM.

JVM specification doesn't say anything strictly in this regard. Its left upto the JVM implementors to map Java theads to platform theads( Windows, Linux etc). Also its hard to believe that there will be one to one mapping between Java threads and OS threads.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.