Thinking in node if Java background [closed] - java

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I am a Java developer, used to everything running sequentially (or concurrently with multiple threads), one step after another. It feels logical to lay things out sequentially.
But Node handles work concurrently on a single thread. How can that be beneficial if it only has one thread?
Frankly, I don't get the concept of a single thread in Node. Does one thread really handle everything?
Any advice on how I can start thinking in Node would be helpful.

Synchronous Programming (Java)
If you are familiar with synchronous programming (writing code that does one thing after another), as in Java or .NET, take the following code.
For example:
var fs = require('fs');
var content = fs.readFileSync('simpleserver1.js','utf-8');
console.log('File content: ');
console.log(content);
It writes out the code for a simple web server to the console. The code works sequentially, executing one line after another. The next line is not executed until the previous line finishes executing.
Although this works well, what if the file in this example were really large and took minutes to read?
How could other operations be executed while that long operation is running?
These questions do not arise in Java, because you have many threads working for you (to serve multiple requests).
Asynchronous Programming (Node.js)
But when you are using Node, you have just a single thread, which serves all requests.
This is where asynchronous programming comes in to help you in JavaScript (Node).
To execute operations while other long operations are running, we use callback functions. The code below shows how to use an asynchronous callback function:
var fs = require('fs');
// Start the read and return immediately; the callback runs once the file has been read.
fs.readFile('simpleserver1.js', 'utf-8', function (err, data) {
  if (err) {
    throw err;
  }
  console.log("executed after the file finishes reading");
});
// xyz operation
Notice that the line "executed after the file finishes reading" is printed only once the read has completed. While the file is being read, the thread is free to perform other operations.
Now look at the //xyz operation in the code. While the file is being read, the server does not wait for the read to complete. It just starts executing the //xyz operation and gets back to the callback function passed to fs.readFile( when the file is ready.
So that's how asynchronous programming works in Node.
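To see that ordering for yourself, here is a minimal sketch. It assumes a scratch file in the OS temp directory (the file name is made up for this example) so it can run on its own, and it records the order in which things happen:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only so the sketch is self-contained.
const file = path.join(os.tmpdir(), 'async-order-demo.txt');
fs.writeFileSync(file, 'hello from disk');

const order = [];

order.push('1: before readFile');
fs.readFile(file, 'utf-8', function (err, data) {
  if (err) throw err;
  // Runs later, after the synchronous code below has finished.
  order.push('3: callback received ' + data.length + ' characters');
  console.log(order.join('\n'));
});
// This plays the role of the "xyz operation": it runs without waiting for the read.
order.push('2: after readFile');
```

The entry numbered 2 is always recorded before the entry numbered 3, because the callback cannot run until the currently executing code has returned to the event loop.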
Also, if you want to compare Java and Node, you can read this article.
EDIT:
How is Node.js single-threaded?
Let's take a scenario where clients send requests to a server.
Assumptions:
1) There is a single server process, say serverProcess.
2) There are 2 clients requesting the server, say clientA and clientB.
3) clientA's request requires a file operation (like the one shown above using fs).
Here is what happens.
Flow:
1) clientA sends a request to serverProcess. The server gets the request and starts the file operation. It now has to wait until the file is ready to read (the callback has not been invoked yet).
2) clientB sends a request to serverProcess. The server is free right now, as it is not busy serving clientA, so it serves clientB. In the meantime, the callback from fs.readFile notifies the server that the file data is ready and that it can operate on it.
3) Now the server resumes serving clientA.
As you can see, a single server thread handled both client requests.
What would have happened in Java? You would have created another server thread to serve clientB while the first thread was serving clientA and waiting for the file to be read. So this is how Node is single-threaded: a single thread in a single process handles all the requests.
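The flow above can be simulated in a few lines. This is only a sketch: slowFileRead is a hypothetical stand-in that uses a timer instead of a real file, and serve is a made-up handler, not an actual HTTP server.

```javascript
const log = [];

// Hypothetical stand-in for a slow file read; a timer plays the role of the disk.
function slowFileRead(callback) {
  setTimeout(function () {
    callback(null, 'file data');
  }, 50);
}

// Made-up request handler: clients that need the file wait for the callback,
// everyone else is answered right away on the same thread.
function serve(client, needsFile) {
  if (!needsFile) {
    log.push(client + ': served immediately');
    return;
  }
  log.push(client + ': file read started');
  slowFileRead(function (err, data) {
    if (err) throw err;
    log.push(client + ': served with ' + data);
    console.log(log.join('\n'));
  });
}

serve('clientA', true);  // step 1: starts the file operation and returns
serve('clientB', false); // step 2: served while clientA's read is pending
```

clientB's entry lands in the log before clientA's final entry, even though clientA asked first and no second thread was created.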
Question:
If something else is invoked to prepare the data from the file system, how can you say Node is single-threaded?
The I/O itself (files/database) is carried out outside the thread that runs your code. The difference is this:
1) Node does not wait for everything to be ready (as the Java approach described above does). Instead, it just starts its next piece of work (or serves other requests). Whatever happens, Node will not create a different thread to serve the rest of the requests (unless you do so explicitly, which is not recommended).
2) Java will itself create another thread to serve new requests.

This has been said a million times, but let me give you a short answer with respect to Java.
In Java, you create a separate thread if you want to read a long file without blocking the main thread.
In JavaScript, you just read the file using callbacks.
The main differences between the two:
It is easier to screw up the code with multiple threads (race conditions, etc.).
You do not really need the power of the CPU's second core to read a file; the bottleneck is slow I/O, not intensive computation.
With callbacks there is a single thread, as you said. It just asks the underlying system to read the file and continues executing your code. Once the file has been read, JavaScript finishes the code it is currently executing and then comes back to run your callback.
Sometimes you also have to do computationally intensive work in JavaScript. In that case you can spawn a new process; look into the cluster module. But usually the computationally or I/O-heavy operations are already implemented for you, and you just use them through callbacks.
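As a rough sketch of pushing CPU-heavy work into another process: this uses child_process.execFile from the standard library rather than the cluster module mentioned above, simply because it fits in a few lines. The function name sumInChildProcess and the toy workload are made up for illustration.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: runs a CPU-bound loop in a separate Node process
// and hands the result back through a callback.
function sumInChildProcess(n, callback) {
  const script =
    'let s = 0; for (let i = 1; i <= ' + Number(n) + '; i++) s += i; console.log(s);';
  // process.execPath is the path of the Node binary that is currently running.
  execFile(process.execPath, ['-e', script], function (err, stdout) {
    if (err) return callback(err);
    callback(null, Number(stdout.trim()));
  });
}

sumInChildProcess(1000000, function (err, total) {
  if (err) throw err;
  console.log('sum computed in child process:', total);
});
// Printed first: the main thread does not wait for the child.
console.log('main thread is free while the child computes');
```

Starting a process has a noticeable cost, so this only pays off when the computation is much heavier than the toy loop here.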

OK, here is a head start. It is not about threads, it's about tasks per second. In a threaded model, threads block when they wait for something.
In a non-blocking design, every time you wait for something you just give the thread back, and you are woken up when the event you are waiting for has occurred. Those events are known as futures, as in: in the future I want to do this once this and that have happened (or, in case of failure, do this other thing). That's basically it.
This is not specific to Node or JavaScript. It is popular in Scala too, and there are plenty of other languages. If you are a Java guy, look for async processing. Jetty provides it. Vert.x is known for its share-nothing architecture.
So have fun with this. I use it regularly. I have a server storing 20GB of data in a custom datastore. Want to know how we scaled? We bought 512GB for the server and ran 20 of those stores in parallel, sharing nothing. It's like having 20 servers in one machine with no noticeable latency, and you scale with the cores. That's how we do business in today's world.
Hardware is cheap, so why fiddle with concurrency at the lowest level?

Related

Alternative to Thread.sleep() for better performance

I have developed a REST service. The service has one API endpoint: v1/customer. This API does two things:
1) It executes the business logic in the main thread.
2) It spawns a child thread to perform the non-critical DB writes. The main thread returns the response to the client immediately, whereas the child thread writes to the DB asynchronously.
Because steps 1 and 2 are not synchronous, it is becoming increasingly challenging to test both scenarios.
Say I try to test the API. I am testing two things (the API response and the DB writes).
As the DB writes happen asynchronously, I have to use Thread.sleep(2000). This approach is not scalable and doesn't yield reliable results. Soon I might have 1000 test cases to run, and the total time to run them all will increase enormously.
What design technique should I use to test the DB writes, keeping performance and execution time in mind?
I would suggest changing your API design if possible. One possible solution is to have your first API call respond with HTTP 202 Accepted and return some kind of job ID to the client. With this job ID, the client can check the progress via a GET on another endpoint. This would allow you to poll in your test without hardcoding sleep values.
Here is an example that shows the process in a bit more detail:
https://restfulapi.net/http-status-202-accepted/

Efficiently insert data into database in Java [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
I'm building an application that has a database used solely for logging. We log the incoming transaction ID and its start and end time. The application itself makes no use of this database. Hence I want to execute the insert query as efficiently as possible without affecting the application. My idea is to execute the whole database insert code in a separate thread, so that the insert runs without interfering with the actual work. I would like to know whether there is a design pattern for this kind of scenario, or whether my thinking is correct.
Your thinking is right. Post the generated data from your main thread(s) into a thread-safe blocking queue, and have the logging thread loop: block until a message appears in the queue, send that message to the database, and repeat.
If there is a chance, however small, that your application may be generating messages faster than your logging thread can process them, then consider giving the queue a maximum capacity, so that the application gets blocked when trying to enqueue a message in the event that the maximum capacity is reached. This will incur a performance penalty, but at least it will be controlled, whereas allowing the queue to grow without a limit may lead to degraded performance in all sorts of other unexpected and nasty ways, and even to out-of-memory errors.
Be advised, however, that plain insert operations (with no cursors and no returned fields) are quite fast as they are, so the gains from using a separate thread might be negligible.
Try running a benchmark while doing your logging a) from a separate logging thread as per your plan, and b) from within your main thread, and see whether it makes any difference. (And post your results here if you can, they would be interesting for others to see.)
From my point of view, the best idea is a Java + RabbitMQ broker + background process architecture.
For example:
The Java process enqueues a JSON message in a RabbitMQ queue. This step can be done asynchronously through the ExecutorService class if you want a thread pool. That said, it can also be done synchronously, thanks to RabbitMQ's high enqueue speed.
The background process connects to the queue that contains the messages and starts consuming them. Its task is to read and interpret the messages one by one and insert their content into the database.
This way, you will have two separate processes, and database operations won't affect the main process.

Node JS server handling 10000 websockets

I am designing a Node.js program for a real-time system that has 10,000 sockets on the data input side and some on the client app side (the number is dynamic, as client apps/web apps might not be running).
I transform input data into a readable output form, e.g. an analog temperature sensor reading converted to the Celsius scale.
I will be hosting this on Google Cloud Platform.
My question is whether the Node.js server will be able to handle the following tasks in parallel:
1) registering web sockets
2) fixing/repairing web sockets
2.1) updating data in memory
2.2) accepting incoming data
3) transforming data
3.1) sending transformed data
4) dumping data to a database every 5 minutes
My question is whether Node.js is an appropriate technology, or whether I need a multithreaded technology like Java.
My question is whether Node JS is appropriate technology
To be short, yes, Node will work.
Node.js, like most modern JavaScript frameworks, supports asynchronous programming. What does it mean for a program to be "asynchronous"? To understand that, it's best to first understand what it means to be "synchronous". Taken from Eloquent JavaScript, a book by Marijn Haverbeke, available here:
In a synchronous programming model, things happen one at a time. When you call a function that performs a long-running action, it returns only when the action has finished and it can return the result. This stops your program for the time the action takes.
In other words, operations happen one at a time. If I used a synchronous program to run a ticket counter at a county fair, customer 1 would be served first, then customer 2, then customer 3, and so on. Each person at the front of the line would add wait time for everyone behind them.
An asynchronous model allows multiple things to happen at the same time. When you start an action, your program continues to run. When the action finishes, the program is informed and gets access to the result.
Going back to the ticket counter example, if done asynchronously, all persons in line would be served at the same time, and it would matter little to any given person whether there are others in line.
Hopefully that makes sense. With that idea fresh, let's consider how to implement an asynchronous program. As mentioned earlier, Node does support asynchronous programs. However, framework support isn't enough; you will need to deliberately build your program asynchronously.
I can provide some in-depth examples of how this can be accomplished in Node, but I'm not sure what requirements/constraints you have. Feel free to add more details in a comment on this response and I can assist you more. If you need something to get started, take some time reviewing promises and callback functions.
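As a starting point, here is a minimal sketch of steps 2.1, 2.2, 3 and 4 from the question with a promise-based flush. Everything in it is hypothetical: toCelsius assumes a 10-bit analog reading (0 to 1023) that maps linearly onto 0 to 100 degrees Celsius, and saveBatch stands in for whatever database client ends up being used. No WebSocket library is involved.

```javascript
// Latest transformed value per sensor, kept in memory (step 2.1).
const latest = new Map();

// Hypothetical conversion: assumes a 10-bit analog reading (0..1023)
// that maps linearly onto 0..100 degrees Celsius.
function toCelsius(raw) {
  return (raw / 1023) * 100;
}

// Called for each incoming message; cheap and synchronous (steps 2.2 and 3).
function handleReading(sensorId, raw) {
  latest.set(sensorId, toCelsius(raw));
}

// Hands a snapshot to an async save function (step 4). While the save is
// pending, the event loop keeps accepting new readings.
// saveBatch is a placeholder for a real database call that returns a promise.
async function flushTo(saveBatch) {
  const batch = Array.from(latest.entries());
  await saveBatch(batch);
  return batch.length;
}

handleReading('sensor-1', 1023);
handleReading('sensor-2', 0);
flushTo(async function (batch) {
  console.log('would write', batch.length, 'rows');
}).then(function (count) {
  console.log('flushed', count, 'readings');
});
```

In a real server, a timer such as setInterval firing every 5 minutes would call flushTo, and each socket's message handler would call handleReading.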

Nodejs performance event loop

There are articles claiming superior nodejs performance due to its single threaded event loop. I'm not asking for opinions, I'm asking for a mechanics explanation.
A thread starts to process a request, computes a little, and finds out that it needs to read from a database. This gets done asynchronously. No delay involved and the thread can continue... but what should it do without the data?
A1 Answer "don't know yet"?
A2 Grab another request?
A1 makes little sense to me. I can imagine a client issuing other requests in the meantime (like loading multiple resources on first site access), but in general, no.
A2 When it grabs another request, then it loses the whole context. This context gets saved in the promise which will get fulfilled when the data arrive, but which thread does process this promise?
B1 The same thread later
B2 A different thread.
In case B1 you may be lucky and some relevant data may still be in the thread's cache, but given that a DB request takes a few milliseconds, the gain is IMHO low.
Isn't case B2 practically equivalent to a context switch?
A: Node.js will not respond to any request unless you write code that actively sends a response. It doesn't matter whether that code runs synchronously or asynchronously.
The client (or even the server's networking stack) cannot know or care whether asynchrony happened in the meantime.
B: There is only one Node.js thread, period.
When a response arrives for an asynchronous operation kicked off in Node.js code, an event is raised in the Node.js event loop thread, and the appropriate callback/handler is called.
Node.js is based on the libuv C library.
Threads are used internally to fake the asynchronous nature of all the system calls. libuv also uses threads to allow you, the application, to perform a task asynchronously that is actually blocking, by spawning a thread and collecting the result when it is done.
A thread starts to process a request, computes a little, and finds out that it needs to read from a database. This gets done asynchronously. No delay involved and the thread can continue... but what should it do without the data?
Pass a callback to the DB module's method and return from the current function, which was itself invoked as an event listener. The event loop will then continue with the next event in the queue.
The context is accessible inside the callback through the function's closure.
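A small sketch of that last point. queryDb is a hypothetical stand-in for a database driver call (a timer, not a real database), and handleRequest is a made-up listener:

```javascript
const log = [];

// Hypothetical stand-in for a DB driver method; a timer simulates the round trip.
function queryDb(sql, callback) {
  setTimeout(function () {
    callback(null, 'rows for: ' + sql);
  }, 10);
}

// Made-up request listener. It returns right after starting the query.
function handleRequest(requestId) {
  queryDb('SELECT ' + requestId, function (err, rows) {
    if (err) throw err;
    // requestId is still available here through the closure,
    // even though handleRequest returned long ago.
    log.push('request ' + requestId + ' resumed with ' + rows);
    console.log(log[log.length - 1]);
  });
  log.push('request ' + requestId + ' handler returned');
}

handleRequest(1);
handleRequest(2);
```

Both handlers return before either callback runs, and each callback still sees its own requestId. Nothing needs to be saved or restored by the thread; the closure keeps the context alive.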

applicability of multithreading to a specific scenario in a java program

I am confused about the applicability of multithreading in general...
I am creating an application which executes some code that has been saved in XML format. The job is to use the Apache HTTP client to retrieve some data from websites. More than one website can be visited by one block of code in XML.
Now, if 2 users have each created their own code and saved it in XML, I want each user's 'job' (i.e. block of code in XML format) to run in a separate thread.
I already have code to execute one user's code. Now I want multiple users' code to run in parallel. But I have some doubts:
(1) The Apache HTTP client provides a way of doing multithreaded communication. Currently I am simply using the default HTTP client; this same client can be made to visit multiple websites, one after the other, as per the code block in XML. Am I correct in thinking that I do not need to change my code to use the recommended multithreaded communication?
(2) I am thinking of creating a servlet that, when invoked, executes one block of XML code. So to execute 2 blocks of code given by 2 different users, I will have to invoke this servlet twice. I am going to deploy this application using Amazon Elastic Beanstalk, so what I am confused about is: do I need to use multithreading at all in my program? Can I not simply invoke the existing code (which executes one block of code at a time) from the servlet? I do want to keep the processing of the different blocks of XML code separate from each other, so I don't think I should use multithreading here. Am I correct in my assumption?
Running the jobs one after the other, as in your first option, would not be considered 'concurrent'.
As for the servlet approach, the way you describe it will work concurrently, but you also need to think about how many concurrent users there will be. Since each user sends a separate request, there will be some network latency across the multiple calls. You need to think about all these factors before going ahead with this option.
Since you have the code for one user's job, you can define a thread class which has the user ID as an attribute. In the run() method, call the code for that particular user's job.
Now create two threads, set the appropriate user ID on each, and start them.
If the number of users is larger, you can look at using Java's ThreadPoolExecutor.
Since you are going to use a servlet container, it is going to manage multithreading for you. Every servlet request will be executed in a different thread. In that scenario, one servlet call would execute one block of code from the provided XML in a single-threaded manner. If several sites are declared per block of code, they would be visited serially. Another user may call the same server at the same time with another block of code, which runs in parallel with the first one.
