Easiest way to parallelize Camel routes - java

Currently I have a Camel route that makes a few REST API calls and puts the results in a database, and it does this serially:
Enter route
Setup task
Call/insert 1
Call/insert 2
Call/insert 3
Finalizing task
I would like to make these calls happen in parallel.
Enter route
Setup task
Make external calls in parallel manner:
Call/insert 1
Call/insert 2
Call/insert 3
Finalizing task
Each step in the serial mode is currently its own Camel route, so things are already fairly modular. My two main questions are:
What is the best way to call the call/insert routes in parallel?
I have seen examples that split up the body and send each part to a different thread, but that's not what I need here.
I have also seen a setup using .multicast().parallelProcessing().to("direct:id1", "direct:id2"); Is it as easy as that? Can I just specify the separate routes to parallelize? Is this best practice? Is there anything to know or note about this?
How to wait for all the threads to finish? Error handling?
Currently each external call route goes to the next. Where should they go when they are made parallel? Can I just not specify the next step?
How to handle potential errors that occur in one or more of the threads? How can I make sure these are handled? Currently each step checks for errors and goes to a generic error handler.


Alternative to Thread.sleep() for better performance

I have developed a REST service with one API endpoint: v1/customer. This API does two things:
It executes the business logic in the main thread.
It spawns a child thread to perform the non-critical DB writes. The main thread returns the response to the client immediately, whereas the child thread writes to the DB asynchronously.
Because steps 1 and 2 do not run synchronously, it is becoming increasingly challenging to test both of them.
Say I want to test the API. I am testing two things: the API response and the DB writes.
Because the DB writes happen asynchronously, I have to use Thread.sleep(2000). This approach is not scalable and does not yield reliable results. Soon I might have 1000 test cases to run, and the total time to run them all will increase enormously.
What design technique should I use to test the DB writes, keeping performance and execution time in mind?
I would suggest changing your API design if possible. One possible solution is to have your first API call respond with HTTP 202 Accepted and return some kind of job ID to the client. With this job ID the client can check the progress via a GET on another endpoint. This would allow you to poll in your tests without hardcoding sleep values.
Here is an example that shows the process in a bit more detail.
https://restfulapi.net/http-status-202-accepted/
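On the test side, polling could look roughly like the sketch below. It is only an illustration: `PollingWait`, `waitUntil`, and the in-memory `fakeDb` map are made-up names, and in a real test the condition would be a GET on the status endpoint or a query against the database. The point is that the test returns as soon as the write is visible and only pays the full timeout when something is actually wrong.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical test helper, not part of any framework.
final class PollingWait {

    /**
     * Polls the condition until it is true or the timeout elapses.
     * Returns true as soon as the condition holds, false on timeout.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(interval.toMillis()); // short pause between checks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for the database: the "job" writes its row after about 200 ms.
        ConcurrentHashMap<String, String> fakeDb = new ConcurrentHashMap<>();
        CompletableFuture.runAsync(
                () -> fakeDb.put("job-1", "DONE"),
                CompletableFuture.delayedExecutor(200, TimeUnit.MILLISECONDS));

        boolean written = waitUntil(
                () -> "DONE".equals(fakeDb.get("job-1")),
                Duration.ofSeconds(2),
                Duration.ofMillis(20));
        System.out.println("row written: " + written);
    }
}
```

With 1000 test cases, each one then waits only as long as its own write takes, instead of a fixed two seconds.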

Queueing tasks via JMS

I would like to ask the community a question and get as much feedback as possible about a strategy I have been thinking about, aimed at resolving some performance issues in my project.
The context:
We have an important process that performs 4 steps.
An entity status change and its persistence.
If step 1 succeeds, the entity is exported into a CSV file.
If step 2 succeeds, the entity is exported into another CSV, this one with much more information.
If step 3 succeeds, the last CSV is sent by mail.
Steps 1 and 2 are linked and they are critical.
Steps 3 and 4 are not critical. It doesn't even matter whether they end successfully.
Performance of steps 1-2 is fine, but steps 3-4 are insanely slow in some scenarios, mostly because of step 3.
If we execute all the steps in sequence, step 3 sometimes causes a timeout. The client does not get any response about steps 1 and 2 (the important ones) and the user doesn't know what's going on.
This case made me think of JMS queues in order to delegate the last 2 steps to another app/process, decoupling the notification from the business logic. The second export and the mailing would be processed when possible, and probably in parallel. I could also split it into 2 queues: exports and mail notification.
Our webapp runs on a WebLogic 11 cluster, so I could use its implementation.
What do you think about the strategy? Is the WebLogic JMS implementation any good? Should I look at another implementation, such as ActiveMQ or RabbitMQ?
I have also been thinking about implementing a ticketing system with spring-tasks.
At this point I have to mention spring-batch. Its usage is limited for us. We already have many jobs focused on important data consolidation processes, and the time window for scheduling more jobs is limited. There is also the impact of trying to process all items at once.
Maybe we could if we found a way to use the multithreading of spring-batch, but we have not yet found a way to fit our requirements into such a strategy.
Thank you in advance and excuse my English. I promise to keep working hard on it :-).
One problem to consider is data integrity. If step n fails, does step n-1 need to be reversed? Are there any ordering dependencies that you need to be aware of? And are you writing to the same or different CSV files? If the same, you might have contention issues.
Now, back to the original problem. I would consider Java executors, using 4 fixed-sized pools and move the task through the pools as successes occur:
Submit step 1 to pool 1, getting a Future back, which will be used to check for completion.
When step 1 completes, you submit step 2 to pool 2.
When step 2 completes, you now can return a result to the caller. The call to this point has been waiting (likely with a timeout so it doesn't hang around forever) but now the critical tasks are done.
After returning to the client, submit step 3 to pool 3.
When step 3 completes, submit step 4 to pool 4.
The pools themselves, while fixed sized, could be larger for pool 1/2 to get maximum throughput (and to get back to your client as quickly as possible) and pool 3/4 could be smaller but still large enough to get the work done.
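The hand-off between pools described above could be sketched as follows. This is a minimal illustration and not a definitive implementation: the class name `StagedPipeline`, the step names, and the pool sizes are assumptions, and the step bodies only record that they ran. It uses `CompletableFuture` chaining in place of checking each `Future` by hand, and it starts step 3 as soon as step 2 completes rather than strictly after the response has been sent.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the four-pool hand-off; sizes and names are illustrative.
final class StagedPipeline {
    // Larger pools for the critical steps, smaller ones for the slow, non-critical steps.
    private final ExecutorService pool1 = Executors.newFixedThreadPool(8);
    private final ExecutorService pool2 = Executors.newFixedThreadPool(8);
    private final ExecutorService pool3 = Executors.newFixedThreadPool(2);
    private final ExecutorService pool4 = Executors.newFixedThreadPool(2);

    /** Records which steps ran, in order, so the flow can be inspected. */
    final List<String> log = new CopyOnWriteArrayList<>();

    /** Steps 1 and 2: the caller waits on this future before responding to the client. */
    CompletableFuture<String> criticalSteps(String entityId) {
        return CompletableFuture
                .supplyAsync(() -> step("persist", entityId), pool1)
                .thenApplyAsync(id -> step("export-basic", id), pool2);
    }

    /** Steps 3 and 4: chained after the critical part and never awaited by the caller. */
    CompletableFuture<Void> nonCriticalSteps(CompletableFuture<String> critical) {
        return critical
                .thenApplyAsync(id -> step("export-full", id), pool3)
                .thenAcceptAsync(id -> step("mail", id), pool4)
                .exceptionally(ex -> {
                    // Reached if any earlier stage failed; the caller is not affected.
                    log.add("background steps skipped or failed: " + ex);
                    return null;
                });
    }

    // Placeholder for real work: only records that the step ran.
    private String step(String name, String entityId) {
        log.add(name + ":" + entityId);
        return entityId;
    }

    void shutdown() {
        pool1.shutdown();
        pool2.shutdown();
        pool3.shutdown();
        pool4.shutdown();
    }

    public static void main(String[] args) throws Exception {
        StagedPipeline pipeline = new StagedPipeline();
        CompletableFuture<String> critical = pipeline.criticalSteps("42");
        CompletableFuture<Void> background = pipeline.nonCriticalSteps(critical);

        // Wait (with a timeout) only for the critical part, then answer the client.
        String id = critical.get(5, TimeUnit.SECONDS);
        System.out.println("respond to client for entity " + id);

        background.join(); // only so the demo prints a complete log
        System.out.println(pipeline.log);
        pipeline.shutdown();
    }
}
```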
You could do something similar with JMS, but the issues are similar: you need to have multiple listeners or multiple threads per listener so that you can process at an appropriate speed. You could do steps 1/2 synchronously without a pool, but then you don't get some of the thread management that executors give you. You still need to "schedule" steps 3/4 by putting them on the JMS queue and still have listeners to process them.
The ability to recover from a server going down is key here, but Executors/ExecutorService have no persistence, so in that case I'd definitely be looking at JMS (and then I'd be queuing absolutely everything up, even the first 2 steps). Depending on your use case it might be overkill.
Yes, an event-driven approach where a message bus does the integration sounds good. Messages are processed asynchronously, so you will not have timeouts. Of course you will need to use a Topic. WLS has some memory issues when you have too many messages in the server, so a different server might work better for separation of concerns and resources.

Thinking in node if Java background [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I am a Java developer, used to everything working sequentially (or concurrently with multiple threads), one step after another. It feels logical to lay things out sequentially.
But Node works concurrently with a single thread. How can that be beneficial if it is running on only one thread?
Frankly, I don't get the concept of a single thread in Node. Does only one thread handle everything?
Any advice on how I can start thinking in Node would be helpful.
Synchronous Programming (Java)
If you are familiar with synchronous programming (writing code that does one thing after the other), as in Java or .NET, take the following code as an example:
var fs = require('fs');
var content = fs.readFileSync('simpleserver1.js','utf-8');
console.log('File content: ');
console.log(content);
It writes out the code for a simple web server to the console. The code works sequentially, executing one line after another. The next line is not executed until the previous line finishes executing.
Although this works well:
What if the file in this example were really large and took minutes to read?
How could other operations be executed while that long operation is running?
These questions do not arise if you are working in Java, because you have many threads to do the work for you (to serve multiple requests).
Asynchronous Programming (Node.js)
But when you are using Node you have just a single thread, which serves all requests.
This is where asynchronous programming comes in to help you in JavaScript (Node).
To execute operations while other long operations are running, we use callback functions. The code below shows how to use an asynchronous callback function:
var fs = require('fs');
fs.readFile('simpleserver1.js', 'utf-8', function (err, data) {
    if (err) {
        throw err;
    }
    console.log("executed from the file finishes reading");
});
//xyz operation
Notice that the line "executed from the file finishes reading" is printed only once the file has been read. The callback runs later, which allows us to perform other operations while the file is still being read.
Now look at the //xyz operation in the code. While the file is being read, the server does not wait for the read to complete. It just starts executing //xyz operation and gets back to the callback function passed to fs.readFile( when the file is ready.
So that's how asynchronous programming works in Node.
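For readers coming from Java, the same callback shape can be approximated with `CompletableFuture`. This is only a rough analogy under stated assumptions: `AsyncRead` and `readFileAsync` are made-up names, and unlike Node this version runs the read on a worker thread from the common pool instead of handing it to an event loop.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper illustrating "register a callback and carry on".
final class AsyncRead {

    /** Starts the read on a background thread and returns immediately. */
    static CompletableFuture<String> readFileAsync(Path path) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return Files.readString(path);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Temporary file standing in for simpleserver1.js.
        Path file = Files.createTempFile("simpleserver1", ".js");
        Files.writeString(file, "console.log('hello');");

        // The lambda plays the role of the Node callback.
        CompletableFuture<Void> done = readFileAsync(file)
                .thenAccept(data -> System.out.println("callback: read " + data.length() + " chars"));

        // This is the "xyz operation": it does not wait for the read.
        System.out.println("xyz operation runs without waiting for the read");

        done.join(); // keep the demo alive until the callback has run
        Files.delete(file);
    }
}
```

The two messages may appear in either order, because the main thread and the read run independently.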
Also, if you want to compare Java and Node you can read this article.
EDIT:
How is Node.js single-threaded?
Let's take a scenario where clients send requests to a server.
Assumptions:
1) There is a single server process, say serverProcess.
2) There are 2 clients requesting the server, say clientA and clientB.
3) clientA is going to require a file operation (like the one shown above using fs).
What happens here?
Flow:
1) clientA requests serverProcess. The server gets the request and starts the file operation. Now it waits until the file is ready to read (the callback has not been invoked yet).
2) clientB requests serverProcess. The server is free right now, as it is not busy serving clientA, so it serves clientB. In the meantime, the callback from fs.read notifies the server that the file data is ready and that it can perform operations on it.
3) Now the server goes back to serving clientA.
Now you see, there was just one server thread, which handled both client requests, right?
Now what would have happened if this were Java? You would have created another server thread to serve clientB while clientA was being served by the first thread, which was waiting for the file to be read. So this is how Node is single-threaded, meaning a single thread in a single process handles all the requests.
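The clientA/clientB flow can be imitated in Java with a single-thread executor playing the role of Node's one server thread. This is a toy model, not how Node is implemented: `EventLoopDemo` is a made-up name, the second pool merely stands in for the operating system doing the I/O, and a latch forces the file read to finish only after clientB has been served so that the order of events is predictable.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model of a single-threaded server loop; all names are illustrative.
final class EventLoopDemo {

    static List<String> run() {
        List<String> events = new CopyOnWriteArrayList<>();
        ExecutorService loop = Executors.newSingleThreadExecutor(); // the one "server" thread
        ExecutorService io = Executors.newCachedThreadPool();       // stands in for OS-level I/O
        CountDownLatch clientBServed = new CountDownLatch(1);
        try {
            // clientA: the loop thread starts the read and is immediately free again;
            // the callback is queued back onto the loop thread once the data is ready.
            CompletableFuture<Void> clientA = CompletableFuture
                    .runAsync(() -> events.add("A: start file read"), loop)
                    .thenApplyAsync(v -> slowRead(clientBServed), io)
                    .thenAcceptAsync(data -> events.add("A: callback with " + data), loop);

            // clientB: served by the same loop thread while A's read is still pending.
            CompletableFuture<Void> clientB = CompletableFuture
                    .runAsync(() -> events.add("B: served"), loop)
                    .thenRun(clientBServed::countDown);

            CompletableFuture.allOf(clientA, clientB).join();
        } finally {
            loop.shutdown();
            io.shutdown();
        }
        return events;
    }

    // Simulated slow read: does not finish until clientB has been served.
    private static String slowRead(CountDownLatch gate) {
        try {
            gate.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "file data";
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Every handler that touches a request runs on the single `loop` thread; only the waiting happens elsewhere.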
Question:
If another process is invoked to prepare the data from the file system, how can you say Node is single-threaded?
See, I/O (files/database) is itself handled by a different process. The difference here is:
1) Node does not wait for everything to be ready (like Java does). Instead it just starts its next piece of work (or serves other requests). Whatever happens, Node will not create a different thread to serve the rest of the requests (unless this is done explicitly, which is not recommended).
2) Java, in contrast, will itself create another thread for serving new requests.
This has been said a million times, but let me give you a short answer with respect to Java.
You create a separate Thread in Java if you want to read a long file without blocking the main thread.
In JavaScript, you just read the file using callbacks.
The main differences between the two:
It is easier to mess up the code with multiple threads (race conditions, etc.).
You do not actually need the power of the CPU's second core to read the file. It is a question of slow I/O, not intensive computation.
With callbacks there is a single thread, as you said. It just asks the underlying system to read the file and continues executing your code. Once the file is read, JavaScript finishes the code it is currently executing and then comes back to run your callback.
Sometimes you also have to do computationally intensive work in JavaScript. In that case you can spawn a new process (look into the cluster module). But usually the computationally heavy or I/O-heavy operations are already done for you, and you just use them via callbacks.
OK, here is a head start. It is not about threads, it is about tasks per second. In a threaded model, threads block when they wait for something.
In a non-blocking design, every time you wait for something you give the thread back and are woken up when the event you are waiting for has occurred. Those events are known as futures. So it is like saying: in the future I want to do this when this and that has happened (or, in a failure case, do this other thing). That's basically it.
This is not specific to Node or JavaScript. It is well known in Scala too, and there are plenty of other languages. If you are a Java guy, look for async processing. Jetty provides it. Vert.x is well known for its share-nothing architecture.
So have fun with this. I use it regularly. I have a server storing 20GB of data in a custom datastore. Want to know how we scaled? We bought 512GB for the server and ran 20 of those stores in parallel, sharing nothing. It's like having 20 servers in one machine with no noticeable latency, and you scale with the cores. That's how we do business in today's world.
Hardware is cheap, so why fiddle with concurrency at the lowest level?

using different threads to do processing in web application

I have a Java EE web application. When a particular request comes in (say, for the /xyz URL pattern) I want to do complex processing as follows.
Each of the first 3 steps below is complex and takes time.
Get data from one table in the DB. The table holds a huge amount of data and querying it takes time.
Make a web service call to another web service A and get its data.
Make another web service call to another web service B and get its data.
Do some processing using the output of 1, 2, and 3.
1, 2, and 3 are independent of each other so can be called in parallel.
Now the questions are:
Can I do operations 1, 2, and 3 in three separate threads?
Is it advisable to create 3 threads for each request?
Should I use thread pooling?
To address your first question, I will go through the four steps:
Yes, if the database driver you are using allows concurrent access, that is, if it is safe to use from different threads.
A web service is normally designed to deal with different requests at the same time, so this should work as well. The question here is how many threads you want to use (and how long it takes to process one request) and whether the web service will guard itself against too many requests at once.
The same applies here.
Yes, but you have to do synchronization here, as in: wait until all threads have received their results. You can realize this with a java.util.concurrent.CyclicBarrier.
Second question
That depends on your data and especially on how fast the web services answer. You should try it out.
Third question
Definitely, that's what they are for. This will also help you to structure your application.
1) Can I do operations 1, 2, and 3 in three separate threads?
Yes, you can.
2) Is it advisable to create 3 threads for each request?
As long as these things don't depend on each other, and as long as you're not depending on getting these in the same transaction, then it seems like it should be ok. You will have to handle the case where one or more threads don't succeed, of course. You'll need a separate watchdog thread to cancel the threads if they take too long or if one comes back with a failure.
3) Should I use thread pooling?
Regardless of what else you do, whenever you use threads you should use a pool. That way if there's a problem where threads don't complete or go into some bad state or otherwise become unavailable, you protect your application from running out of threads.
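As a rough sketch of the advice above, the three independent calls can be submitted to one shared, fixed-size pool and collected with a timeout. The names `ParallelFetch` and `handleRequest`, the pool size, and the string results are all assumptions for illustration. `ExecutorService.invokeAll` with a timeout is used here in place of a separate watchdog thread: it waits for all tasks and cancels any that are still running when the timeout expires.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical request handler; pool size and result types are illustrative.
final class ParallelFetch {
    // One pool shared by all requests, not one per request.
    // Daemon threads are used only so this demo exits without an explicit shutdown.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(12, runnable -> {
        Thread thread = new Thread(runnable);
        thread.setDaemon(true);
        return thread;
    });

    /** Runs steps 1-3 in parallel, then combines their results (step 4). */
    static String handleRequest(Callable<String> dbQuery,
                                Callable<String> serviceA,
                                Callable<String> serviceB,
                                long timeoutMillis) {
        try {
            // Blocks until all tasks are done or the timeout expires;
            // tasks that have not completed by then are cancelled.
            List<Future<String>> results = POOL.invokeAll(
                    List.of(dbQuery, serviceA, serviceB), timeoutMillis, TimeUnit.MILLISECONDS);

            StringBuilder combined = new StringBuilder();
            for (Future<String> result : results) {
                combined.append(result.get()).append(';'); // throws if the task failed or was cancelled
            }
            return combined.toString();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException | CancellationException e) {
            throw new IllegalStateException("one of the parallel calls failed or timed out", e);
        }
    }

    public static void main(String[] args) {
        String output = handleRequest(() -> "rows", () -> "dataA", () -> "dataB", 1000);
        System.out.println(output);
    }
}
```

Whether this actually helps depends on how slow each call is, as the answers above note, so it is worth measuring before and after.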

applicability of multi threading to a specific scenario in a java program

I am confused about the applicability of multi threading in general...
I am creating an application which executes some code that has been saved in XML format. The work is to use the Apache HTTP client to retrieve some data from websites. More than one website can be visited by one block of code in XML.
Now I want each user's 'job' (i.e. block of code in XML format) to run in a separate thread when 2 users have created their own respective code and saved it in XML.
I have code to execute one user's code. Now I want multiple persons' code to run in parallel. But I have some doubts:
(1) The Apache HTTP client provides a way of doing multithreaded communication. Currently I am simply using the default HTTP client. This same client can be made to visit multiple websites, one after the other, as per the code block in XML. Am I correct in thinking that I do not need to change my code so that it uses the recommended multithreaded communication?
(2) I am thinking of creating a servlet that, when invoked, executes one block of XML code. So to execute 2 blocks of code given by 2 different users, I will have to invoke this servlet twice. I am going to deploy this application using Amazon Elastic Beanstalk, so what I am confused about is: do I need to use multithreading at all in my program? Can I not simply invoke the existing code (which executes one block of code at a time) from the servlet? I do want to keep the processing of the different blocks of XML code separate from each other, so I don't think I should use multithreading here. Am I correct in my assumption?
Running the jobs one after the other, as per your first option, is not 'concurrent'.
Coming to the servlet method, the way you describe it will work concurrently, but you also need to think about how many concurrent users there will be. Since there would be a separate request for each user, there would be some network latency involved for multiple calls. You need to think about all these factors before going ahead with this option.
Since you have the code for one user's job, you can define a thread class which has the user ID as an attribute. In the run() method, call the code for that particular user's job.
Now create two threads, set the appropriate user ID for each thread, and start them.
If the number of users is larger, you can look at using Java's ThreadPoolExecutor.
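A minimal sketch of that suggestion follows. The names `UserJobs`, `UserJob`, and `runAll` are hypothetical, and the body of `run()` is only a placeholder for the existing single-user code. A fixed-size pool from `Executors` (which is backed by a `ThreadPoolExecutor`) is used instead of creating one raw thread per user.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one Runnable per user, executed on a shared pool.
final class UserJobs {

    /** One user's job; run() is where the existing single-user code would be called. */
    static final class UserJob implements Runnable {
        private final String userId;
        private final Map<String, String> results;

        UserJob(String userId, Map<String, String> results) {
            this.userId = userId;
            this.results = results;
        }

        @Override
        public void run() {
            // Placeholder: records which thread handled this user's job.
            results.put(userId, "finished on " + Thread.currentThread().getName());
        }
    }

    /** Runs one job per user on a pool of the given size and waits for all of them. */
    static Map<String, String> runAll(List<String> userIds, int poolSize) {
        Map<String, String> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (String userId : userIds) {
            pool.execute(new UserJob(userId, results));
        }
        pool.shutdown(); // accept no new jobs; already submitted ones still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        runAll(List.of("alice", "bob"), 2)
                .forEach((user, status) -> System.out.println(user + ": " + status));
    }
}
```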
Since you are going to use a servlet container, it is going to manage multithreading for you. Every servlet request is executed in a different thread. In that scenario, one servlet call would execute one block of code from the provided XML in a single-threaded manner. If several sites are declared per block of code, they would be visited serially. Another user may call the same server at the same time with another block of code, which would run in parallel with the first one.
