I have the following for loop in a function that I intend to parallelize, but I'm not sure whether the overhead of multiple threads will outweigh the benefit of concurrency.
All I need is to send different log files to their corresponding receivers. For the time being, let's say the number of receivers won't be more than 10. Instead of sending the log files back to back, is it more efficient to send them all in parallel?
for(int i=0; i < receiversList.size(); i++)
{
String receiverURL = serverURL + receiversList.get(i);
HttpPost method = new HttpPost(receiverURL);
String logPath = logFilesPath + logFilesList.get(i);
messagesList = readMsg(logPath);
for (String message : messagesList) {
StringEntity entity = new StringEntity(message);
log.info("Sending message:");
log.info(message + "\n");
method.setEntity(entity);
if (receiverURL.startsWith("https")) {
processAuthentication(method, username, password);
}
httpClient.execute(method).getEntity().getContent().close();
}
Thread.sleep(500); // Waiting time for the message to be sent
}
Also, if it is worth doing, how can I make it parallel? Should I manage the threads manually or use ExecutorService?
All I need is to send different log files to corresponding receivers. For the time being lets say number of receivers won't be more than 10. Instead of sending log files back to back, is it more efficient if I send them all parallel?
There are a lot of questions to be asked before we can determine if doing this in parallel will buy you anything. You mentioned "receivers" but are you really talking about different receiving servers on different web addresses or are all threads sending their log files to the same server? If it is the latter then chances are you will get very little improvement in speed with concurrency. A single thread should be able to fill the network pipeline just fine.
Also, you probably would get no speed up if the messages are small. Only large messages would take any time and give you any true savings if they were sent in parallel.
I'm most familiar with the ExecutorService classes. You could do something like:
ExecutorService threadPool = Executors.newFixedThreadPool(10);
...
threadPool.submit(new Runnable() {
    // you could create your own Runnable class if each one needs its own httpClient
    public void run() {
        try {
            StringEntity entity = new StringEntity(message);
            ...
            // we assume that the client is some sort of pooling client
            httpClient.execute(method).getEntity().getContent().close();
        } catch (IOException e) {
            // run() cannot throw checked exceptions, so handle or log them here
            log.error("Failed to send message", e);
        }
    }
});
Where a thread pool really pays off is when you want to queue up these messages and send them in a background thread so that sending does not slow down your program. Then you can submit the messages to the threadPool and keep on moving. Or you could put them in a BlockingQueue<String> and have a thread take from the BlockingQueue and call httpClient.execute(...).
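A minimal sketch of the BlockingQueue variant, under a few assumptions: BackgroundSender, enqueue and shutdownAndWait are hypothetical names, and the Consumer<String> passed to the constructor stands in for whatever wraps httpClient.execute(...) in your code. It only illustrates the hand-off between the producer and a single sending thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical helper: queues messages and hands them to a sender on one background thread.
class BackgroundSender {
    // Unique sentinel instance; compared by identity so no real message can match it.
    private static final String POISON = new String("__STOP__");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    BackgroundSender(Consumer<String> sender) {
        worker = new Thread(() -> {
            try {
                while (true) {
                    String message = queue.take();   // blocks until a message is available
                    if (message == POISON) {
                        return;                      // everything queued before the sentinel was sent
                    }
                    sender.accept(message);          // assumption: this wraps the HTTP call
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt and stop
            }
        }, "background-sender");
        worker.start();
    }

    // Returns immediately; the caller never waits for the network.
    void enqueue(String message) {
        queue.add(message);
    }

    // Lets the worker drain what is already queued, then waits for it to stop.
    void shutdownAndWait() {
        queue.add(POISON);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a single consumer thread the messages leave in the order they were queued. A sender that throws an unchecked exception would end the worker in this sketch, so real code would probably catch and log inside the loop.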
More implementation details from this good ExecutorService tutorial.
Lastly, how about putting all of your messages into one entity and splitting them on the server? That would be the most efficient, although you might not control the server handler code.
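As a rough illustration of the single-entity idea, assuming individual messages never contain a newline (if they can, you would need escaping or a length-prefixed format instead). MessageBatch, join and split are hypothetical names; join would run on the client before building the StringEntity and split in the server handler.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for packing several messages into one request body.
class MessageBatch {
    // Client side: one body, one message per line.
    static String join(List<String> messages) {
        return String.join("\n", messages);
    }

    // Server side: recover the individual messages.
    // The -1 limit keeps trailing empty messages instead of silently dropping them.
    static List<String> split(String body) {
        return Arrays.asList(body.split("\n", -1));
    }
}
```

Note that an empty list joins to an empty body, which splits back into one empty message, so the server should treat an empty body as "nothing sent".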
Hello. ExecutorService is certainly an option. You have four ways to do it in Java:
Using raw Threads (exposes too many details, so it is easy to make mistakes)
ExecutorService, as you have already mentioned. It has been available since Java 5.
Here is a tutorial demonstrating ExecutorService: http://tutorials.jenkov.com/java-util-concurrent/executorservice.html
The Fork/Join framework, available since Java 7
Parallel streams, available since Java 8. Below is a solution using parallel streams.
Going for a higher-level API will spare you some errors you might otherwise make.
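// requires: import java.util.stream.IntStream;
// Iterate over indices so each receiver stays paired with its log file.
IntStream.range(0, receiversList.size()).parallel().forEach(i -> {
    String receiverURL = serverURL + receiversList.get(i);
    try {
        String logPath = logFilesPath + logFilesList.get(i);
        for (String message : readMsg(logPath)) {
            // a fresh HttpPost per message, so nothing mutable is shared between threads
            HttpPost method = new HttpPost(receiverURL);
            method.setEntity(new StringEntity(message));
            log.info("Sending message:");
            log.info(message + "\n");
            if (receiverURL.startsWith("https")) {
                processAuthentication(method, username, password);
            }
            // httpClient must be safe for concurrent use (e.g. a pooling client)
            httpClient.execute(method).getEntity().getContent().close();
        }
    } catch (Exception e) {
        // a lambda passed to forEach cannot throw checked exceptions
        log.error("Failed to send to " + receiverURL, e);
    }
});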
I'm writing a Spring Boot based project where I have some synchronous (e.g. REST API calls) and asynchronous (JMS) pieces of code (the broker I use is a dockerized instance of ActiveMQ, in case there's some kind of trick or workaround).
One of the problems I'm currently struggling with is this: my application receives a REST API call (I'll call it "a sync call"), does some processing, and then sends a JMS message to a queue (async), where the message is then handled and processed (let's say I have a heavy load to perform, which is why I want it to be async).
Everything works fine when running the application; async messages are enqueued and dequeued as expected.
When I'm writing tests (and I'm testing the whole service, which includes the sync and async calls in rapid succession), the test code is too fast and the message is still waiting to be dequeued (we are talking about milliseconds, but that's the problem).
Basically, when I receive the response from the API call, the message is still in the queue, so if, for example, I make a query to check for its existence, ka-boom: the test fails because (obviously) it doesn't find the object (which is probably being processed and created in the meantime).
Is there any way, or any pattern, I can use to make my test wait for that async message to be dequeued? I can attach the code of my implementation if needed. It's a bachelor's degree thesis project.
One obvious solution I'm temporarily using is adding a hundred-millisecond sleep between the method call and the assert section (hoping everything is done and persisted), but honestly I dislike this solution because it seems so non-deterministic to me. Also, creating a latch shared between production code and test code doesn't sound good to me.
Here's the code I use as an entry point to all the mess I explained before:
public TransferResponseDTO transfer(Long userId, TransferRequestDTO transferRequestDTO) {
//Preconditions.checkArgument(transferRequestDTO.amount.compareTo(BigDecimal.ZERO) < 0);
Preconditions.checkArgument(userHelper.existsById(userId));
Preconditions.checkArgument(walletHelper.existsByUserIdAndSymbol(userId, transferRequestDTO.symbol));
TransferMessage message = new TransferMessage();
message.userId = userId;
message.symbol = transferRequestDTO.symbol;
message.destination = transferRequestDTO.destination;
message.amount = transferRequestDTO.amount;
messageService.send(message);
TransferResponseDTO response = new TransferResponseDTO();
response.status = PENDING;
return response;
}
And here's the code that receives the message (although you wouldn't need it):
public void handle(TransferMessage transferMessage) {
Wallet source = walletHelper.findByUserIdAndSymbol(transferMessage.userId, transferMessage.symbol);
Wallet destination = walletHelper.findById(transferMessage.destination);
try {
walletHelper.withdraw(source, transferMessage.amount);
} catch (InsufficientBalanceException ex) {
String u = userHelper.findEmailByUserId(transferMessage.userId);
EmailMessage email = new EmailMessage();
email.subject = "Insufficient Balance in your account";
email.to = u;
email.text = "Your transfer of " + transferMessage.amount + " " + transferMessage.symbol + " has been DECLINED due to insufficient balance.";
messageService.send(email);
// stop here: without this return the declined transfer would still be deposited and reported as accepted
return;
}
walletHelper.deposit(destination, transferMessage.amount);
String u = userHelper.findEmailByUserId(transferMessage.userId);
EmailMessage email = new EmailMessage();
email.subject = "Transfer executed";
email.to = u;
email.text = "Your transfer of " + transferMessage.amount + " " + transferMessage.symbol + " has been ACCEPTED.";
messageService.send(email);
}
I'm sorry if the code looks "a lil sketchy or wrong"; it's a primordial implementation.
I'm willing to write a utility to share with you all if that's the case, but, as you've probably noticed, I'm low on ideas right now.
I'm an ActiveMQ developer working mainly on ActiveMQ Artemis (the next-gen broker from ActiveMQ). We run into this kind of problem all the time in our test-suite given the asynchronous nature of the broker, and we developed a little utility class that automates & simplifies basic polling operations.
For example, starting a broker is asynchronous so it's common for our tests to include an assertion to ensure the broker is started before proceeding. Using old-school Java 6 syntax it would look something like this:
Wait.assertTrue(new Condition() {
@Override
public boolean isSatisfied() throws Exception {
return server.isActive();
}
});
Using a Java 8 lambda would look like this:
Wait.assertTrue(() -> server.isActive());
Or using a Java 8 method reference:
Wait.assertTrue(server::isActive);
The utility is quite flexible as the Condition you use can test anything you want as long as it ultimately returns a boolean. Furthermore, it is deterministic unlike using Thread.sleep() (as you noted) and it keeps testing code separate from the "product" code.
In your case you can check to see if the "object" being created by your JMS process can be found. If it's not found then it can keep checking until either the object is found or the timeout elapses.
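For readers who do not have the Artemis test utilities on their classpath, the polling idea can be sketched in a few lines. This is not the Artemis Wait class: Poll and waitFor are hypothetical names, and the timeout and interval are whatever suits your test.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper for tests.
class Poll {
    // Re-checks the condition every intervalMillis until it holds or timeoutMillis has elapsed.
    // Returns true as soon as the condition is satisfied, false on timeout or interruption.
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                        // timed out without the condition holding
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

In a test this would be wrapped in an assertion, for example asserting that waitFor returns true for a condition that queries your repository for the object the JMS handler is expected to create. The test then finishes as soon as the object appears and only pays the full timeout when something is actually broken.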
I have a Spring Boot 1.3.5 web application (running on Tomcat 8). One of its features is to contact a third-party API through REST and launch many lengthy jobs (from 1 to around maybe 30 depending on the user input, each one with its own REST call in a for loop). I have all this logic in a controller that is called using a POST with some parameters.
What I need is to launch a background task after each job has been acknowledged by the API. The task would be passed some parameter (the job ID) and would periodically (~30 s) poll another API to fetch the job output (again, these jobs may take from several seconds up to an hour, and fetching the output takes about 3-4 seconds plus parsing a long string) and do some business logic based on the job status (updating a DB record for now).
However, I'm not sure which, if any, TaskExecutor to use, or whether I should use Java's Future structures for this. I might benefit from a thread pool that runs only X threads in parallel and queues the others so as not to overload the server. Is there an example I can learn from to get started?
Sample of my existing code:
@RequestMapping(value={"/job/launch"}, method={RequestMethod.POST})
public ResponseEntity<String> runJob(HttpServletRequest req) {
for (int deployments=1; deployments <= deployments_required; deployments++) {
httpPost.setEntity((HttpEntity)new StringEntity(jsonInput));
CloseableHttpResponse response = httpclient.execute(httpPost);
HttpEntity entity = response.getEntity();
responseString = EntityUtils.toString(entity, "UTF-8");
JsonObject jsonObject = new JsonParser().parse(responseString).getAsJsonObject();
if (response.getStatusLine().getStatusCode() != 200) {
resultsNotOk.add(new ResponseEntity<String>(jsonObject.get("message").getAsString(), HttpStatus.INTERNAL_SERVER_ERROR));
continue;
}
String deploymentId;
deploymentId = jsonObject.get("id").getAsString();
// Start background task to keep checking the job every few seconds and find created instance IP addresses
start_checking_execution(deploymentId);
}
}
(Yes, this code may be better put in a Service but it was originally built as is so I haven't moved it yet. It may be a good time to do it now)
I would say this is a job for Spring Batch.
You can define a Reader, a Processor (to convert the objects read from the source into the objects to be written) and a Writer to handle the logic.
You can use JobOperator to get the job state. See the job status transitions.
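If Spring Batch turns out to be heavier than you need, the "poll every ~30 s on a bounded pool" part of the question can also be sketched with a plain ScheduledExecutorService. This is only a sketch under assumptions: JobPoller and watch are hypothetical names, and the Predicate<String> stands in for the REST call that checks whether a given job ID has finished.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical poller: re-checks each job after a fixed delay on a bounded thread pool.
class JobPoller {
    private final ScheduledExecutorService scheduler;

    // At most maxThreads status checks run at the same time; the rest wait in the scheduler's queue.
    JobPoller(int maxThreads) {
        this.scheduler = Executors.newScheduledThreadPool(maxThreads);
    }

    // Returns a future that completes with the job ID once isFinished reports true.
    CompletableFuture<String> watch(String jobId, Predicate<String> isFinished, long delay, TimeUnit unit) {
        CompletableFuture<String> done = new CompletableFuture<>();
        scheduleCheck(jobId, isFinished, delay, unit, done);
        return done;
    }

    private void scheduleCheck(String jobId, Predicate<String> isFinished,
                               long delay, TimeUnit unit, CompletableFuture<String> done) {
        scheduler.schedule(() -> {
            try {
                if (isFinished.test(jobId)) {
                    done.complete(jobId);            // stop polling this job
                } else {
                    scheduleCheck(jobId, isFinished, delay, unit, done); // check again after the delay
                }
            } catch (RuntimeException e) {
                done.completeExceptionally(e);       // surface failures instead of polling forever
            }
        }, delay, unit);
    }

    void shutdown() {
        scheduler.shutdownNow();
    }
}
```

In the controller, start_checking_execution(deploymentId) could then call watch(deploymentId, ..., 30, TimeUnit.SECONDS) and attach the database update with thenAccept. Rescheduling after each check, rather than using a fixed-rate task, means a slow status call never overlaps with the next one for the same job.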
I have a working Thrift client in the below snippet.
TTransport transport = new THttpClient("http://localhost:8080/api/");
TProtocol protocol = new TBinaryProtocol(transport);
TMultiplexedProtocol mp = new TMultiplexedProtocol(protocol, "UserService");
UserService.Client userServiceClient = new UserService.Client(mp);
System.out.println(userServiceClient.getUserById(100));
When running the client in a multi-threaded environment:
threads[i] = new Thread(new Runnable() {
@Override
public void run() {
System.out.println(userServiceClient.getUserById(someId));
}
});
I got an exception: out of sequence response
org.apache.thrift.TApplicationException: getUserById failed: out of sequence response
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
I guess the reason is that Thrift generated Client is not thread safe.
But if I want multi-clients to call the same method getUserById() simultaneously, how can I make it?
Thrift clients are not designed to be shared across threads. If you need multiple client threads, set up one Thrift client per thread.
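One way to get "one client per thread" without passing clients around by hand is a ThreadLocal. The sketch below is deliberately generic so that it runs without Thrift on the classpath: PerThreadClient is a hypothetical name, and the factory you pass in is assumed to build a fresh transport, protocol and UserService.Client the same way the question's snippet does.

```java
import java.util.function.Supplier;

// Hypothetical holder: lazily creates one client per thread and reuses it on that thread.
class PerThreadClient<C> {
    private final ThreadLocal<C> local;

    // The factory runs once per thread, the first time that thread calls get().
    PerThreadClient(Supplier<C> factory) {
        this.local = ThreadLocal.withInitial(factory);
    }

    // Always returns the calling thread's own client, never one owned by another thread.
    C get() {
        return local.get();
    }
}
```

Each worker thread would then call get().getUserById(someId) instead of touching a shared client. With many short-lived threads this opens many connections, so it fits best with a fixed-size thread pool.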
But if I want multi-clients to call the same method getUserById() simultaneously, how can I make it?
We don't know much about the context, so I have to guess a bit. If the issue is that there are a lot of such calls coming in at a time, a possible solution could be to group calls to save roundtrip time:
service wtf {
list<string> getUsersById( 1 : list<i32> userIds)
}
That's just a short idea. Maybe you want to return list<user_data_struct> instead. For practical reasons I would also recommend to wrap the returned list into a struct, so the whole thing becomes extensible.
I'm working on a Java client/server application with a pretty specific set of rules as to how I have to develop it. The server creates a ClientHandler instance that has input and output streams to the client socket, and any input and output between them is triggered by events in the client GUI.
I have now added in functionality server-side that will send out periodic updates to all connected clients (done by storing each created PrintWriter object from the ClientHandlers in an ArrayList<PrintWriter>). I need an equivalent mechanism client-side to process these messages, and have been told this needs to happen in a second client-side thread whose run() method uses a do...while(true) loop until the client disconnects.
This all makes sense to me so far, what I am struggling with is the fact that the two threads will have to share the one input stream, and essentially 'ignore' any messages that aren't of the type that they handle. In my head, it should look something like this:
Assuming that every message from server sends a boolean of value true on a message-to-all, and one of value false on a message to an individual client...
Existing Client Thread
//method called from actionPerformed(ActionEvent e)
//handles server response to bid request
public void receiveResponse()
{
//thread should only process to-specific-client messages
if (networkInput.nextBoolean() == false)
{
//process server response...
}
}
Second Client-side Thread
//should handle all messages sent to all clients
public void run()
{
do {
if (networkInput.nextBoolean() == true)
{
//process broadcasted message...
}
} while (true);
}
As they need to use the same input stream, I would obviously be adding some synchronized, wait/notify calls, but generally, is what I'm looking to do here possible? Or will the two threads trying to read in from the same input stream interfere with each other too much?
Please let me know what you think!
Thanks,
Mark
You can do it, though it will be complicated to test and get right. How much is "too much" depends on you. A simpler solution is to have a reader thread pass messages to the two worker threads.
ExecutorService thread1 = Executors.newSingleThreadExecutor();
ExecutorService thread2 = Executors.newSingleThreadExecutor();
while (running) {
    Message message = input.readMessage();
    if (message.isTypeOne()) {
        thread1.submit(() -> process(message));
    } else if (message.isTypeTwo()) {
        thread2.submit(() -> process(message));
    } else {
        // do something else.
    }
}
thread1.shutdown();
thread2.shutdown();
I would like to execute a Hive query on the server in an asynchronous manner. The Hive query will likely take a long time to complete, so I would prefer not to block on the call. I am currently using Thrift to make a blocking call (it blocks on client.execute()), but I have not seen an example of how to make a non-blocking call. Here is the blocking code:
TSocket transport = new TSocket("hive.example.com", 10000);
transport.setTimeout(999999999);
TBinaryProtocol protocol = new TBinaryProtocol(transport);
Client client = new ThriftHive.Client(protocol);
transport.open();
client.execute(hql); // Omitted HQL
List<String> rows;
while ((rows = client.fetchN(1000)) != null) {
for (String row : rows) {
// Do stuff with row
}
}
transport.close();
The code above is missing try/catch blocks to keep it short.
Does anyone have any ideas how to do an async call? Can Hive/Thrift support it? Is there a better way?
Thanks!
AFAIK, at the time of writing Thrift does not generate asynchronous clients. The reason as explained in this link here (search text for "asynchronous") is that Thrift was designed for the data centre where latency is assumed to be low.
Unfortunately as you know the latency experienced between call and result is not always caused by the network, but by the logic being performed! We have this problem calling into the Cassandra database from a Java application server where we want to limit total threads.
Summary: for now all you can do is make sure you have sufficient resources to handle the required numbers of blocked concurrent threads and wait for a more efficient implementation.
It is now possible to make an asynchronous call in a Java thrift client after this patch was put in:
https://issues.apache.org/jira/browse/THRIFT-768
Generate the async java client using the new thrift and initialize your client as follows:
TNonblockingTransport transport = new TNonblockingSocket("127.0.0.1", 9160);
TAsyncClientManager clientManager = new TAsyncClientManager();
TProtocolFactory protocolFactory = new TBinaryProtocol.Factory();
Hive.AsyncClient client = new Hive.AsyncClient(protocolFactory, clientManager, transport);
Now you can execute methods on this client as you would on a synchronous interface. The only change is that all methods take an additional parameter of a callback.
I know nothing about Hive, but as a last resort, you can use Java's concurrency library:
Callable<SomeResult> c = new Callable<SomeResult>(){public SomeResult call(){
// your Hive code here
}};
Future<SomeResult> result = executorService.submit(c);
// when you need the result, this will block
result.get();
Or, if you do not need to wait for the result, use Runnable instead of Callable.
After talking to the Hive mailing list, I learned that Hive does not support async calls using Thrift.
I don't know about Hive in particular, but any blocking call can be turned into an async call by spawning a new thread and using a callback. You could look at java.util.concurrent.FutureTask, which was designed to allow easy handling of such asynchronous operations.
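A minimal sketch of that thread-plus-callback idea built on FutureTask. CallbackTask and runAsync are hypothetical names, and the Callable is assumed to wrap the blocking Hive code from the question. It relies on FutureTask's protected done() hook, which is invoked once the task has finished.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.function.Consumer;

// Hypothetical wrapper: runs a blocking call on its own thread and reports back through callbacks.
class CallbackTask<T> extends FutureTask<T> {
    private final Consumer<T> onSuccess;
    private final Consumer<Throwable> onFailure;

    CallbackTask(Callable<T> blockingCall, Consumer<T> onSuccess, Consumer<Throwable> onFailure) {
        super(blockingCall);
        this.onSuccess = onSuccess;
        this.onFailure = onFailure;
    }

    // Called by FutureTask on the worker thread when the call has completed.
    @Override
    protected void done() {
        if (isCancelled()) {
            return;
        }
        try {
            onSuccess.accept(get());                 // does not block: the task is already complete
        } catch (ExecutionException e) {
            onFailure.accept(e.getCause());          // the exception thrown by the blocking call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Starts the blocking call on a new thread and returns immediately.
    static <T> CallbackTask<T> runAsync(Callable<T> blockingCall,
                                        Consumer<T> onSuccess, Consumer<Throwable> onFailure) {
        CallbackTask<T> task = new CallbackTask<>(blockingCall, onSuccess, onFailure);
        new Thread(task, "blocking-call").start();
        return task;
    }
}
```

The callbacks run on the worker thread, not on the caller's thread, so anything they touch must be safe to use from there. This does not reduce the number of blocked threads; it only moves the blocking off the calling thread.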
We fire off asynchronous calls to AWS Elastic MapReduce. AWS MapReduce can run hadoop/hive jobs on Amazon's cloud with a call to the AWS MapReduce web services.
You can also monitor the status of your jobs and grab the results off S3 once the job is completed.
Since the calls to the web services are asynchronous in nature, we never block our other operations. We continue to monitor the status of our jobs in a separate thread and grab the results when the job is complete.