I am trying to get information about cars from the Tesla server through its API, and I want to do it concurrently, i.e. fetch the information for multiple cars in parallel using Akka actors.
My Approach:
(1) First get the total number of cars.
(2) Create one actor per car.
(3) Inside each actor, call the REST API to get that car's information, i.e. each actor is provided with a URL containing its car's ID, so the calls happen in parallel.
Am I doing it right regarding my approach?
Specifically, in point number 3, I make the call to the Tesla server inside each actor using AsyncHttpClient from com.ning. Will using AsyncHttpClient inside each actor ensure that each actor sends its request to the server asynchronously, without blocking the other actors?
I will provide further information if needed. I am a beginner in Akka and have looked at a lot of threads, but could not find exactly what I was looking for.
Specifically for point number 3: as long as you use a Future-based API in your actors, the actors will not block.
In general it is hard to tell much more about your approach without knowing why you chose to use one actor per car.
Consider this question: why couldn't you simply create a listOfCars: List[String] of URLs and use Future.traverse(listOfCars)(downloadCarDataForUrl _)?
Finally, I don't know how AsyncHttpClient behaves, but I would double-check that, if you have a list of thousands of cars, AsyncHttpClient does not try to download all of them concurrently; if it does, you risk being blocked quite quickly by the API provider. If this becomes a problem, you could look into akka-http, which only opens a limited number of connections to a given host.
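For comparison, the Future.traverse idea has a plain-Java analogue using CompletableFuture on a bounded pool. This is only a sketch: fetchCarData is a hypothetical stand-in for the real HTTP call, and the pool size of 4 is an arbitrary example of capping concurrency.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class CarFetchSketch {
    // Hypothetical stand-in for the real HTTP call to the Tesla API.
    static String fetchCarData(String url) {
        return "data-for-" + url;
    }

    // Fetch every URL concurrently, but on a bounded pool so that at most
    // 4 requests are in flight at once and the API provider is not flooded.
    static List<String> fetchAll(List<String> urls) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<String>> futures = urls.stream()
                    .map(u -> CompletableFuture.supplyAsync(() -> fetchCarData(u), pool))
                    .collect(Collectors.toList());
            // Wait for all downloads and collect the results in order.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(List.of("car/1", "car/2", "car/3")));
    }
}
```

The point is the shape, not the pool size: the list of URLs is mapped to futures, and the caller only blocks once, when collecting all the results.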
Related
First of all, I'm gathering information about this question so that I can implement this feature in a more elegant way.
Let's look at the picture below
The target server (green circle)
This is an api server that I use to fetch some data.
Features:
HTTPS connections only
Responds in JSON format
Accepts GET requests like [ https://api.server.com/user=1&option&api_key=? ]
Proxy controller (blue square)
It's a simple server that stores a list of proxies and sends and receives some data. I want to talk about the software that I will run on top of it.
Features:
Proxy list
Api keys list
I think it should be a hashmap that stores an ip => token list, or a database table if I want to scale my application.
Workers
They just analyze a JSON response and pass the data to the DB.
Let's go closer to the proxy controller server.
The first idea:
Create an executor with Executors.newFixedThreadPool
Pass the url/token to a worker: server.submit(new Worker(url, token, proxy))
The worker analyzes the data and passes it to the DB.
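A minimal sketch of the first idea. Worker, url, token and proxy follow the naming above; the actual fetch/parse step and the database are stubbed out, since only the thread-pool structure is of interest here.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ProxyControllerSketch {
    // Stub for the database; a real implementation would write to a table.
    static final List<String> db = new CopyOnWriteArrayList<>();

    // Hypothetical worker: fetch through the given proxy, parse the JSON
    // response, store the result. Fetching and parsing are simulated.
    static class Worker implements Runnable {
        final String url, token, proxy;
        Worker(String url, String token, String proxy) {
            this.url = url;
            this.token = token;
            this.proxy = proxy;
        }
        public void run() {
            db.add("parsed:" + url + ":" + token + "@" + proxy);
        }
    }

    public static void main(String[] args) {
        ExecutorService server = Executors.newFixedThreadPool(8);
        server.submit(new Worker("https://api.server.com/user=1", "key1", "10.0.0.1"));
        server.submit(new Worker("https://api.server.com/user=2", "key2", "10.0.0.2"));
        server.shutdown();
        try {
            server.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.println(db.size());
    }
}
```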
But in my opinion this solution is quite big and hard to maintain; I want to add an endpoint that gathers stats, kills or spawns workers, and so on.
The second idea:
A worker generates a request like https://host/user=1&option=1
Passes it to the proxy controller
The proxy controller assigns an API key and a proxy server to the request
Execute the request
Accept the response
Pass it back to a worker (I think that the best idea is to put a load balancer between workers and proxy controller).
This solution seems quite hacky to me. For example, if a worker dies, the proxy controller sends a bunch of requests to the dead worker, and that could lead to data loss.
The third idea:
The same as the second, but instead of sending data directly to the worker, the proxy controller passes it to some bus. I found some information about Apache Camel that would let me organize this solution. In this case a dead worker is just a dead worker, and data loss equals zero (maybe).
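The bus in the third idea can be approximated with a simple shared queue (Apache Camel would add routing, retries and so on on top of this). The proxy controller puts responses on the queue and any live worker takes them, so a worker that dies before taking a message leaves it on the bus for someone else. This is only a single-process sketch of the idea.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BusSketch {
    static String demo() {
        // The "bus": responses wait here until some worker takes them.
        BlockingQueue<String> bus = new LinkedBlockingQueue<>();

        // Proxy controller side: publish two responses.
        bus.offer("{\"user\":1}");
        bus.offer("{\"user\":2}");

        // Worker side: a worker that dies before polling simply leaves
        // its message on the bus, so no data is lost with it.
        return bus.poll() + " " + bus.poll();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```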
Of course, none of the three cases handles errors. Some errors can be solved by resending the request with additional data; some can be solved by re-spawning the workers.
So, in your opinion, what is the best solution in this case? Am I missing some hidden problems that could appear later? Which tools should I use?
Thanks
What are you trying to achieve?
Maybe consider using this architecture:
NGINX (proxy + load balance) -> WORKER SERVERS -> DB SERVER (maybe use some NoSQL like Cassandra)
I'm working on a twitter app right now using twitter4j. Basically I am allowing users to create accounts, add their personal keywords, and start their own stream.
I'm trying to play as nicely as possible with the Twitter API: avoid rate limits, don't connect the same account over and over, etc. So what I think I need is some object that contains a list of all the active TwitterStream objects, but I don't know how to approach this. This is the controller that starts the stream.
public static Result startStream() {
    ObjectNode result = Json.newObject();
    // openStreams is a Map<Long, TwitterStream> in the TwitterListener class
    if (TwitterListener.openStreams.containsKey(Long.parseLong(session().get("id")))) {
        result.put("status", "running");
        return ok(result);
    }
    Cache.set("twitterStream",
            TwitterListener.listener(
                    Person.find.byId(Long.parseLong(session().get("id")))));
    result.put("status", "OK");
    return ok(result);
}
As you can see, I am putting them in the Cache right now, but I'd like to keep streams open for long periods of time, so the cache won't suffice.
What is the most appropriate way to structure my application for this purpose?
Should I be using Akka?
How could I implement Play's Global object to do this?
As soon as you start to think about introducing global state in your application, you have to ask yourself: is there any possibility that I might want to scale to multiple nodes, or have multiple nodes for redundancy? If there's even the slightest chance that the answer is yes, then you should use Akka, because Akka will let you easily adapt your code to a multi-node environment with Akka clustering, simply by introducing a consistent hashing router. If you don't use Akka, you'll practically have to redesign your application when the requirement for multiple nodes comes in.
So I'm going to assume that you want to future proof your application, and explain how to use Akka (Akka is a nice way of managing global state anyway even if you don't need multiple nodes).
So in Akka, what you want is a stream manager actor. It will be responsible for creating stream actors as its children if they don't already exist. The stream actors will then be responsible for handling the stream, sending it to subscribers, tracking how many connections are subscribed to them, and shutting down when there are no longer any subscribers.
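Stripped of the actor machinery, the structure described above looks roughly like this sketch. StreamManagerSketch and StreamHandler are hypothetical names; in real Akka the manager and handlers would be actors exchanging messages rather than objects with method calls, but the create-on-demand and count-subscribers logic is the same.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class StreamManagerSketch {
    // One handler per user id, holding the live stream and its subscriber count.
    static class StreamHandler {
        final long userId;
        final AtomicInteger subscribers = new AtomicInteger();
        StreamHandler(long userId) { this.userId = userId; }
    }

    static final Map<Long, StreamHandler> streams = new ConcurrentHashMap<>();

    // Create the handler if it does not exist yet, then add a subscriber.
    static StreamHandler subscribe(long userId) {
        StreamHandler h = streams.computeIfAbsent(userId, StreamHandler::new);
        h.subscribers.incrementAndGet();
        return h;
    }

    // Drop a subscriber; shut the stream down when nobody is listening.
    static void unsubscribe(long userId) {
        StreamHandler h = streams.get(userId);
        if (h != null && h.subscribers.decrementAndGet() == 0) {
            streams.remove(userId); // here the real TwitterStream would be closed
        }
    }

    public static void main(String[] args) {
        subscribe(42L);
        subscribe(42L);
        unsubscribe(42L);
        System.out.println(streams.containsKey(42L)); // one subscriber left
        unsubscribe(42L);
        System.out.println(streams.containsKey(42L)); // stream shut down
    }
}
```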
I have a general question about a best practice or pattern to solve a problem.
Consider that you have three programs running on separate JVMs: Server, Client1 and Client2.
All three processes make changes to an object. When the object is changed in either client, the change in the object (not the new object) must be sent to the server. It is not possible just to send the new object from the client to the server because both clients might update the object at the same time, so we need the delta, and not the result.
I'm not so worried about reflecting changes on the server back to the clients at this point, but lets consider that a bonus question.
What would be the best practice for implementing this with X amount of processes and Y amount of object classes that may be changed?
The best way I can think of is consistently using the Command pattern to change the object on the client and the server at the same time, but surely there is a better way?
One of the possible ways to solve that is the Remote Method Invocation system in Java. Keep all the data values on the Server, then have the clients use remote calls to query them.
This would, however, require some smart caching to reduce the number of pointless calls. In the end you would end up with something similar to the Command pattern.
Modern games try to solve this issue with something I'd call an Execute-Then-Verify pattern, where every client has a local copy of the game world that allows it to come to the same conclusion for each action as the server would. The player's actions are applied to the local copy of the game world on the assumption that they are correct; then they are sent to the server, which, as the ultimate authority, either accepts them or revokes them later on.
The benefit of this variant of local caching is that most players do not experience much lag; however, in the case of contradictory actions they might experience the well-known roll-backs.
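A toy version of that Execute-Then-Verify flow. The "move" action, the position state, and the server's acceptance rule are all invented for illustration: the client applies the action locally first, then asks the server, and rolls back on rejection.

```java
public class ExecuteThenVerifySketch {
    // Local copy of one piece of game state: the player's position.
    static int localPosition = 0;

    // The server's authoritative rule: moves of more than 2 are rejected.
    static boolean serverAccepts(int delta) {
        return Math.abs(delta) <= 2;
    }

    static void move(int delta) {
        int before = localPosition;
        localPosition += delta;          // optimistic: apply locally first
        if (!serverAccepts(delta)) {     // then verify with the server
            localPosition = before;      // rejected: the well-known roll-back
        }
    }

    public static void main(String[] args) {
        move(2);   // accepted
        move(5);   // rejected and rolled back
        System.out.println(localPosition); // 2
    }
}
```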
In the end it very much depends on what you are trying to do and what is more important for you: control over actions or client action flow.
I'm writing a Java web service with CXF and I have the following problem: a client calls a method from the web service. The web service has to do two things in parallel, so it starts two threads. One of the threads needs some additional information from the client. It is not possible to supply this information in the original web service call, because it depends on a calculation done in the web service. I cannot redesign the web service, because it is part of a course assignment and the assignment states that I have to do it this way. I want to pause the thread and notify it when the client delivers the additional information. Unfortunately, it is not possible in Java to notify a particular thread, and I can't find any other way to solve my problem.
Has anybody a suggestion?
I've edited my answer after thinking about this some more.
You have a fairly complex architecture and if your client requires information from the server in order to complete the request then I think you need to publish one or more 'helper' methods.
For example, you could publish (without all the web service annotations):
MyData validateMyData(MyData data);
boolean processMyData(MyData data);
The client would then call validateMyData() as many times as it liked, until it knew it had complete information. The server can modify (through calculation, database look-up, or whatever) the variables in MyData in order to help complete the information and pass it back to the client (for updating the UI, if there is one).
Once the information is complete the client can then call processMyData() to process the complete request.
This has the advantage that the server methods can be implemented without the need for background threads as they should be able to do their thing using the request-thread supplied by the server environment.
The only caveat to this is if MyData can get very large and you don't want to keep passing it back and forth between client and server. In that case you would need to come up with a smaller class that just contains the changes the server wants to make to MyData and exclude data that doesn't need correcting.
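The validate-then-process loop described above could look like this sketch on the client side. MyData, validateMyData and processMyData are the hypothetical names from the answer, with the server side stubbed out locally (here it just fills in a missing field and marks the data complete).

```java
public class ValidateLoopSketch {
    // Hypothetical request payload from the answer above.
    static class MyData {
        String name;
        boolean complete;
    }

    // Server stub: fills in missing fields and marks the data complete.
    static MyData validateMyData(MyData d) {
        if (d.name == null) d.name = "filled-by-server";
        d.complete = true;
        return d;
    }

    // Server stub: processes a request once its information is complete.
    static boolean processMyData(MyData d) {
        return d.complete;
    }

    public static void main(String[] args) {
        MyData data = new MyData();
        // Call validate until the server says the information is complete.
        int guard = 0;
        while (!data.complete && guard++ < 10) {
            data = validateMyData(data);
        }
        System.out.println(processMyData(data) + " " + data.name);
    }
}
```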
IMO it's pretty odd for a web service request to effectively be incomplete. Why can't the request pass all the information in one go? I would try to redesign your service like that, and make it fail if you don't pass in all the information required to process the request.
EDIT: Okay, if you really have to do this, I wouldn't actually start a new thread when you receive the first request. I would store the information from the first request (whether in a database or just in memory if this is just a dummy one) and then when the second request comes in, launch the thread.
Hey guys,
I'm using GWT to code a simple multiplayer board game.
While I was coding, a question came to my mind.
At first I thought my client could simply communicate with the server via RemoteService calls, so if a client wanted to connect to a game it could do as follows:
joinGame (String playerName, String gameName)
And the server implementation would do the necessary processing with the argument's data.
In other words, I would have lots of RemoteService methods, one for each type of message in the worst case.
I thought of another way, which would be creating a Message class and sub-classing it as needed.
This way, a single remoteService method would be enough:
sendMessage (Message m)
The message building and interpreting would also be done by specialized classes.
In particular, the building class could even be put in the gwt-app's shared package.
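A rough sketch of the second option. Message and JoinGameMessage are hypothetical names; the point is that a single sendMessage entry point replaces the many RemoteService methods, and each message subclass knows how to interpret itself.

```java
public class MessageSketch {
    // Base class for everything sent through the single sendMessage entry point.
    static abstract class Message {
        abstract String handle();
    }

    // One subclass per message type, replacing one RemoteService method each.
    static class JoinGameMessage extends Message {
        final String playerName, gameName;
        JoinGameMessage(String playerName, String gameName) {
            this.playerName = playerName;
            this.gameName = gameName;
        }
        String handle() {
            return playerName + " joined " + gameName;
        }
    }

    // The single remote method: it only dispatches to the message itself.
    static String sendMessage(Message m) {
        return m.handle();
    }

    public static void main(String[] args) {
        System.out.println(sendMessage(new JoinGameMessage("alice", "chess")));
    }
}
```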
That said,
I can't see the benefits of one over the other, so I'm not sure which way to go, or whether to take a completely different approach.
One versus the other: which do you think is better (has more benefits in the given situation)?
EDIT: One thing I forgot to mention is that a factor that made me consider the second (sendMessage) option is that my application has a CometServlet that queries game instances to see whether there are unsent messages for the client in its own message queue (each client has a message queue).
I prefer the command pattern in this case (something like your sendMessage() concept).
If you have one remote service method that accepts a Command, caching becomes very simple. Batching is also easier to implement in this case. You can also add undo functionality, if that's something you think you may need.
The gwt-dispatch project is a great framework that brings this pattern to GWT.
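To illustrate the undo point, a command in that style might carry both directions. The names here are illustrative, not gwt-dispatch's actual API: each command knows how to execute and how to reverse itself, and a history stack makes undo trivial.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class UndoCommandSketch {
    interface Command {
        void execute();
        void undo();
    }

    // Toy application state and the executed-command history.
    static final StringBuilder state = new StringBuilder();
    static final Deque<Command> history = new ArrayDeque<>();

    static void run(Command c) {
        c.execute();
        history.push(c);
    }

    static void undoLast() {
        if (!history.isEmpty()) history.pop().undo();
    }

    public static void main(String[] args) {
        Command append = new Command() {
            public void execute() { state.append("x"); }
            public void undo() { state.deleteCharAt(state.length() - 1); }
        };
        run(append);
        run(append);
        undoLast();
        System.out.println(state); // "x"
    }
}
```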
Messaging takes more programmer time and creates a more obfuscated interface. Using remote service methods is cleaner and faster. If you think there are too many, you can split your service into multiple services: one for high scores, one for player records, and one for the actual game.
The only advantage I can see with messaging is that it could be slightly more portable if you were to move away from a Java RPC environment, but that would be a fairly drastic shift.