I am trying to pick the right web technology for both I/O-heavy and CPU-heavy tasks. Node.js is great at handling large loads, and it can also be scaled out. However, I am stuck on the CPU-heavy part. Is it possible to integrate another technology (e.g. Java) into Node, so that it runs my algorithms in other threads and the results can then be used in Node again? Is there any existing solution? Any other suggestions are very welcome.
You can integrate Node.js with Java using node-java.
As mentioned in a previous answer, you can use node-java, an npm module that talks to Java. You can also use J2V8, which wraps Node.js as a Java library and provides the Node.js API in Java.
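To give a feel for the J2V8 direction, here is a minimal sketch in Java. It assumes the platform-specific j2v8 artifact is on the classpath, and the script is just a stand-in for your real JavaScript:

```java
import com.eclipsesource.v8.V8;

public class EmbeddedJs {
    public static void main(String[] args) {
        // Spin up a V8 runtime inside the JVM.
        V8 runtime = V8.createV8Runtime();
        try {
            // Evaluate a JavaScript expression and read the result back in Java.
            int result = runtime.executeIntegerScript("var x = 6 * 7; x;");
            System.out.println("JS says: " + result);
        } finally {
            // V8 runtimes hold native resources and must be released explicitly.
            runtime.release();
        }
    }
}
```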
The answer is a lambda architecture.
Node.js is nice by itself: it handles fast queries in a lightweight manner without doing any extra computation on the data.
The CPU-heavy tasks can easily be delegated to specialized components based on the JVM (well, the most famous ones are on the JVM). This is nicely implemented using message brokers and microservices.
The result is an event-based architecture where Node.js is hooked up to databases like Cassandra or MongoDB, and to cluster-computing frameworks like Apache Spark (not necessarily, though; it depends on the problem) that handle the CPU-heavy parts of the system. Lightweight containers are the icing on the cake, providing nice isolated runtime environments for each of the components to live in.
That's my conclusion so far regarding this question.
I think the suggestions above largely eliminate the need to wrap Node under Java or another JVM-based solution for CPU-heavy tasks.
Node.js is based on the V8 JavaScript engine, which is written in C++.
It is therefore possible to write fully native C++ addons for Node.js. Check out some of these resources:
https://github.com/nodejs/node-addon-api
https://github.com/nodejs/node-addon-examples
Related
I am trying to set up a Kafka system. Since most of the existing code in my project is already in PHP, I will most probably write the producers in PHP as well. But I am much less constrained when it comes to choosing a language for the consumer. Now that there are so many clients to choose from, I am in a fix.
In order to choose the right tech here, what are the various factors that should be kept in mind?
I would especially like to apply this knowledge to choose between the Java client and the Node client (multithreaded model vs. async model).
Any help will be highly appreciated.
The Java client is the most advanced client and is officially supported by the Kafka project -- most other clients are third-party projects, and many do not implement all available features.
Thus, I would recommend using the Java client.
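For illustration, a minimal consumer sketch against the official Java client (the broker address, group id, and topic name are placeholders):

```java
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class SimpleConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "my-group");                // placeholder group
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (true) {
                // poll() blocks up to the timeout and returns any available records.
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("offset=%d key=%s value=%s%n",
                                      record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```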
Kafka is basically written in pure Java, and Kafka's native API is Java, so that is the only language where you are not using a third-party library. You always have an edge over writing in other languages, which come with additional overhead.
Node.js isn't optimized for high-throughput applications such as Kafka. So if you need the high processing rates that come standard with Kafka, stick with Java, or perhaps C++.
Also, I believe Kafka consumer clients written in Java have good community support. So it makes sense to implement the consumer in Java, as long as you don't have any other dependency preventing you from doing so.
Also, check this out for benchmarking results using various Kafka clients. The differences are striking.
Client type    Throughput (no. of messages)
Java           40,000 - 50,0000
Go             28,000 - 30,0000
Node           6,000 - 8,0000
Kafka-pixy     700 - 800
Logstash       250
As far as Kafka goes, I'd use any of the languages with an officially Confluent-supported client: JVM, C/C++, .NET, Python, Go.
I'm sure you can get others such as Node or PHP to work, and maybe those can use the C library underneath, but I would prefer something with official language support and a broader user base to ask questions of.
I have a Node.js application that has some expensive computations. I'm thinking of doing this part in Java so I can more easily take advantage of threading and math libraries. Is there an easy way to have Node.js talk to external Java libraries?
The Java library will contain a loop that frequently calls JavaScript functions. Will I see a big performance hit due to the two sides constantly cross-talking (rather than packaging the entire task, sending it to the JVM, and then getting a result back)?
It may be better to just create a Java server to do the computations and communicate with your Node.js application over a message queue. Here is an example that shows how to do that: http://blog.james-carr.org/2010/09/09/rabbitmq-nodejs-and-java-goodness/
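In the spirit of that example, a minimal sketch of the Java worker side using the RabbitMQ Java client (the queue name, host, and the squaring "computation" are all made up for illustration; it assumes the Node.js side sets a replyTo property on each request):

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class ComputeWorker {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder broker host

        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();
        channel.queueDeclare("compute", false, false, false, null);

        DeliverCallback callback = (consumerTag, delivery) -> {
            String input = new String(delivery.getBody(), "UTF-8");
            // Stand-in for the expensive computation Node.js delegates here.
            long n = Long.parseLong(input);
            String result = String.valueOf(n * n);
            // Send the result back to the queue named in the replyTo property.
            channel.basicPublish("", delivery.getProperties().getReplyTo(),
                                 null, result.getBytes("UTF-8"));
        };
        channel.basicConsume("compute", true, callback, consumerTag -> { });
    }
}
```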
You might want to take a look at Vert.x, which will let you mix and match JavaScript and Java as you see fit and communicate via a local message bus.
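As a rough sketch of that model (the "compute" address and the squaring work are made up), a Java verticle can register a handler on the event bus, and JavaScript verticles deployed in the same Vert.x instance can send messages to it:

```java
import io.vertx.core.AbstractVerticle;
import io.vertx.core.Vertx;

public class ComputeVerticle extends AbstractVerticle {
    @Override
    public void start() {
        // Handle requests sent to the "compute" address from any language.
        vertx.eventBus().<Integer>consumer("compute", message -> {
            int n = message.body();
            message.reply(n * n); // stand-in for the CPU-heavy work
        });
    }

    public static void main(String[] args) {
        Vertx.vertx().deployVerticle(new ComputeVerticle());
    }
}
```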
I'm looking into a NoSQL database for use with Vert.x.
Based on the not-so-favorable results, MongoDB is out, so I'm looking at CouchDB/Couchbase, not least since some of our data collection runs on a Raspberry Pi fed by Arduino I/O (with a Raspberry Pi CouchDB instance for offline collection).
What Java library would be suitable/best for use with CouchDB and Vert.x?
I don't know a lot about Vert.x, but it appears to run on the JVM, so you should just be able to use Ektorp, which is pretty much the standard Java library for CouchDB nowadays. It covers all the core functionality, it's fairly well thought out, and the maintainer has been reasonably responsive to pull requests etc., as far as I've seen.
There's more documentation on Ektorp here.
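For orientation, a minimal Ektorp sketch (the CouchDB URL, database name, and document contents are placeholders):

```java
import org.ektorp.CouchDbConnector;
import org.ektorp.CouchDbInstance;
import org.ektorp.http.HttpClient;
import org.ektorp.http.StdHttpClient;
import org.ektorp.impl.StdCouchDbInstance;

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class CouchDbExample {
    public static void main(String[] args) throws Exception {
        HttpClient httpClient = new StdHttpClient.Builder()
                .url("http://localhost:5984") // placeholder CouchDB URL
                .build();
        CouchDbInstance dbInstance = new StdCouchDbInstance(httpClient);

        // Connects to "sensors", creating the database if it does not exist.
        CouchDbConnector db = dbInstance.createConnector("sensors", true);

        // Store a simple document; any Jackson-serializable object works.
        Map<String, Object> doc = new HashMap<>();
        doc.put("temperature", 21.5);
        db.create(UUID.randomUUID().toString(), doc);
    }
}
```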
I am new to this topic. I have decided to develop a parallel-processing framework for cloud data-processing applications in Java for my project. The framework has to divide the given sequential Java code and process the resulting sub-tasks in different virtual machines in the cloud. The framework also has to dynamically allocate and deallocate resources according to the load. My problem is how to develop this framework.
Are there any libraries available to schedule Java code onto different virtual machines in the cloud? Please let me know if anything is available.
Terracotta and GridGain are excellent solutions. Those cited by yerlikayaoglu (Hadoop and Hazelcast) are excellent too in their domains, but all four are very different, and the choice depends on the use case. That covers the map/reduce problem.
Another issue is the allocation/deallocation of virtual machines. It depends on your cloud provider, among other things. You can have a look at jClouds.
There are solutions such as Hazelcast, Hadoop, etc. You can look at these projects.
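To give a feel for Hazelcast in particular, here is a minimal sketch using its distributed executor service (the executor name and the task are made up, cluster discovery is left at its defaults, and the Hazelcast 3.x-style API is assumed):

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IExecutorService;

import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;

public class DistributedSquare {

    // Tasks must be Serializable so Hazelcast can ship them to another member.
    static class Square implements Callable<Integer>, Serializable {
        private final int n;
        Square(int n) { this.n = n; }
        @Override
        public Integer call() { return n * n; }
    }

    public static void main(String[] args) throws Exception {
        // Starts (or joins) a cluster; run the same program on several VMs
        // and they discover each other via multicast by default.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IExecutorService executor = hz.getExecutorService("workers");

        // The callable runs on some member of the cluster, not necessarily here.
        Future<Integer> result = executor.submit(new Square(7));
        System.out.println("7^2 = " + result.get());

        hz.shutdown();
    }
}
```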
Have a look at Hadoop, a framework that does basically the same thing and supports automatic code deployment over the cluster.
If you want to do real-time processing, you can take a look at Storm.
Akka also provides a nice remote-actors API for Scala and Java.
In Delphi, I am trying to call a function from an external Java program. Is there any way to do it?
The standard process to call native code is via JNI. A search on JNI and Delphi will reveal multiple pages that detail how this is done, like this and this.
Whether it is more desirable to set up an out-of-process server (as Peter already detailed, so I skipped that) or to use JNI to call a library depends on how often (and how real-time) you need the calls to be, and on the allowable installation/configuration complexity.
If it is a running Java application, you will need to expose access to that function. There are myriad possible solutions.
If it is only one function or very limited functionality, then listening on the humble socket or a named pipe is a solution that is currently undervalued and somewhat forgotten.
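A sketch of what the Java side of such a socket listener might look like (the port and the one-line-in, one-line-out protocol are invented for illustration; the Delphi side would simply open a TCP connection, write a line, and read the reply):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class FunctionServer {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(9000)) { // placeholder port
            while (true) {
                // Handle one request per connection: read a line, reply a line.
                try (Socket client = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(client.getInputStream()));
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                    String request = in.readLine();
                    // Stand-in for the real Java function being exposed.
                    out.println("echo:" + request);
                }
            }
        }
    }
}
```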
For the next step up in integration, I would look at asynchronous message passing. It is easy to embed an ActiveMQ server or similar, or to start one in a separate process. This has a number of advantages: requests are easily synchronized in the Java process by simply using one listening thread, the behavior is well defined when the Java program or the Delphi one is unavailable, it is very easy to manage, and you get the instrumentation for free.
An embedded Jetty web server is another easy, reliable solution: implement a servlet to do your bidding. Again, a lot of the complexity is handled by using ubiquitous, standard protocols.
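A minimal embedded-Jetty sketch (the port, path, and the servlet's behavior are placeholders; it assumes Jetty 9 and the javax.servlet API on the classpath):

```java
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletHandler;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;

public class EmbeddedServer {
    // The servlet that "does your bidding"; here it just echoes a parameter.
    public static class ComputeServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            resp.setContentType("text/plain");
            resp.getWriter().println("result for: " + req.getParameter("input"));
        }
    }

    public static void main(String[] args) throws Exception {
        Server server = new Server(8080); // placeholder port
        ServletHandler handler = new ServletHandler();
        handler.addServletWithMapping(ComputeServlet.class, "/compute");
        server.setHandler(handler);
        server.start();
        server.join();
    }
}
```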
Then there are the synchronous RPC methods such as COM, CORBA, and SOAP, which I personally find much too complex, error-prone, and maintenance-unfriendly to use for ad-hoc communication between processes. If you want to build a complete infrastructure of components talking to each other, it might be worth it, but not just to get two programs talking.