My question may sound very basic, but I am a bit confused: Jackson's writeValueAsString() is called serialization, and the conversion of a Java object to a byte stream is also called serialization.
Can anyone help me understand; how both of them are different?
The reason I am asking this question is that I ran into a scenario where I call a REST service. The REST service responds with JSON in 10 seconds. However, if I log the time taken by writeValueAsString() on the server side, it hardly takes a second.
UPDATE 1: This is what I am observing
Last log on the invoked REST service (returning a collection) prints at --> 9:10:10 UTC, and data simultaneously starts streaming in my machine's Git Bash, as I am using curl to call the service.
After 10 seconds, the last log in my Servlet filter (which intercepts requests to the REST API URI) prints at --> 9:10:20 UTC, and at that very moment the data streaming stops in Git Bash (nearly 35 MB downloaded). So, what could be the reason for this behavior?
Has Jackson started sending bytes over the network while serialization is still in progress?
Is Jackson serialization slow, or is the network bandwidth low?
Note that I tried running my serialization and deserialization alone, using only the writeValueAsString(..)/readValue(..) operations without any network call, through JUnit with the same set of data, and they execute within a second.
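For reference, a minimal, self-contained sketch of that kind of timing test (the data shape and size here are illustrative, not the actual payload):

import java.util.Collections;
import java.util.List;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JacksonTimingDemo {
    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        // Stand-in for the real collection returned by the REST service.
        List<String> data = Collections.nCopies(1_000_000, "row");

        long start = System.currentTimeMillis();
        String json = mapper.writeValueAsString(data);     // serialize to JSON text
        List<?> back = mapper.readValue(json, List.class); // deserialize it back
        long elapsed = System.currentTimeMillis() - start;

        System.out.println(json.length() + " chars round-tripped in " + elapsed + " ms");
    }
}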
Thanks
A server response time of 10 seconds is not just the serialization time; it includes:
the total time for the request to reach the REST service server over the network
the internal processing in the REST service app
the time for the response to reach your application over the network
(Additionally, there is time taken at various other layers, not included here for the sake of simplicity.)
And for serialization, adding the comment from @Lino here:
In computing, serialization (or serialisation) is the process of translating data structures or object state into a format that can be stored (for example, in a file or memory buffer) or transmitted (for example, across a network connection link) and reconstructed later (possibly in a different computer environment)
Source: Wikipedia
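To make the distinction concrete, here is a minimal sketch (the User class is illustrative) contrasting Java's built-in byte-stream serialization with Jackson's JSON serialization; both are "serialization" in the Wikipedia sense, they just target different formats:

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import com.fasterxml.jackson.databind.ObjectMapper;

public class SerializationDemo {
    // Serializable is needed only for the byte-stream form.
    public static class User implements Serializable {
        public String name = "Alice";
        public int age = 30;
    }

    public static void main(String[] args) throws Exception {
        User user = new User();

        // 1) Built-in Java serialization: object -> opaque byte stream.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(user);
        }
        System.out.println("Byte-stream form: " + bytes.size() + " bytes");

        // 2) Jackson serialization: object -> human-readable JSON text.
        String json = new ObjectMapper().writeValueAsString(user);
        System.out.println("JSON form: " + json); // {"name":"Alice","age":30}
    }
}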
Related
I am designing a Node.js program for a real-time system that has 10,000 sockets on the data-input side and some on the client-app side (dynamic, as client apps/web apps might not be running).
I transform input data into a readable output form, e.g. an analog temperature sensor reading converted to the Celsius scale.
I will be hosting this on Google Cloud Platform.
My question is whether the Node.js server will be able to handle the following tasks in parallel:
1) registering web sockets
2) fixing/repairing web sockets
2.1) updating data in memory
2.2) accepting incoming data
3) transforming data
3.1) sending transformed data
4) dumping data to a database every 5 minutes
My question is whether Node.js is an appropriate technology, or whether I need a multi-threaded technology like Java.
My question is whether Node.js is an appropriate technology
In short, yes, Node will work.
Node.js, like most modern JavaScript frameworks, supports asynchronous programming. What does it mean for a program to be "asynchronous"? Well, to understand that, it's best to first understand what it means to be "synchronous". Taken from Eloquent JavaScript, a book by Marijn Haverbeke, available here:
In a synchronous programming model, things happen one at a time. When you call a function that performs a long-running action, it returns only when the action has finished and it can return the result. This stops your program for the time the action takes.
In other words, operations happen one at a time. If I used a synchronous program to run a ticket counter at a county fair, customer 1 would be served first, then customer 2, then customer 3, and so on. Each person at the front of the line would add wait time for everyone behind them.
An asynchronous model allows multiple things to happen at the same time. When you start an action, your program continues to run. When the action finishes, the program is informed and gets access to the result.
Going back to the ticket counter example, if done asynchronously, everyone in line would be served at the same time, and the presence of other people in line would have little effect on any given person.
Hopefully that makes sense. With that idea fresh, let's consider how to implement an asynchronous program. As mentioned earlier, Node does support asynchronous programs; however, framework support isn't enough: you will need to deliberately build your program asynchronously.
I can provide some in-depth examples of how this can be accomplished in Node, but I'm not sure what requirements/constraints you have. Feel free to add more details in a comment on this answer and I can assist you further. If you need something to get started, take some time to review promises and callback functions.
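For a flavor of the asynchronous model itself, here is a minimal sketch. It is written in Java (to match the other code in this thread) rather than Node, but the same shape carries over directly to promises and callbacks:

import java.util.concurrent.CompletableFuture;

public class AsyncDemo {
    public static void main(String[] args) {
        // Start a long-running action without blocking the caller.
        CompletableFuture<String> reading = CompletableFuture.supplyAsync(() -> {
            try { Thread.sleep(1000); } catch (InterruptedException ignored) { }
            return "23.5 C"; // pretend we read a sensor over the network
        });

        // The program keeps running while the action is in flight...
        System.out.println("serving the next customer in line");

        // ...and is informed when the result arrives (like a promise callback).
        CompletableFuture<Void> done =
                reading.thenAccept(value -> System.out.println("sensor says " + value));

        done.join(); // block here only so the demo doesn't exit early
    }
}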
I have an application, call it Service 1, that potentially makes a lot of the same requests to another application, call it Service 2. As an example, x number of people use Service 1 and that results in x requests (which are the exact same request) to Service 2. Each response is cached in Service 1.
Currently, we have a synchronized method that checks whether the same request has been made within a certain time threshold. The problem we are having is that when the server is under heavy load, that synchronized method locks up the threads, Kubernetes can't perform its liveness checks, and so Kubernetes restarts the service. The reason we want to prevent duplicate requests is twofold: 1) we don't want to hammer Service 2, and 2) if we are already making the request, we don't want to make it again; we just want to wait for the result that is already coming back.
What is the fastest, most scalable way to avoid making duplicate requests without locking up and taking down the server?
FWIW, my experience with RxJava specifically is very limited, so I'm not entirely confident how applicable this is to your case. This is a solution I've used several times with Scala, and I know Java itself has analogous constructs that allow the same approach.
A solution I have used in the past that has worked very well for me involves Futures. It does not eliminate duplication entirely, but it does remove duplication per requesting server. The approach involves a TTL cache in which we store the Future object that does, or will, contain the result of a request we want to deduplicate. It is stored under a key that determines the uniqueness of the request, such as the different parameters that might be applicable.
So let's say you have a method that you call to fetch the response from Service 2, and it returns the response as a Future. As an example, we'll say getPage, which has one parameter, an integer: the page you'd like to fetch.
When a request begins and we're about to call getPage with page number 2, we check the cache for a key like "getPage:2". This won't contain anything for the first request, so we call getPage(2), which returns a Future[SomeResponseObject], and we set "getPage:2" in the TTL cache to that Future object. When another request comes in that might spawn a duplicate request, the same cache check happens; however, there is already a Future object in the cache. We get this Future and add a response listener to be invoked when the response is available, or in Scala, simply .map() on it.
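A minimal Java sketch of this pattern (class and method names are illustrative, and the plain map stands in for a real TTL cache such as Guava's or Caffeine's):

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

public class RequestDeduplicator {
    private final ConcurrentHashMap<String, CompletableFuture<String>> inFlight =
            new ConcurrentHashMap<>();

    public CompletableFuture<String> getPage(int page) {
        String key = "getPage:" + page;
        CompletableFuture<String> existing = inFlight.get(key);
        if (existing != null) {
            return existing; // duplicate request: share the in-flight result
        }
        CompletableFuture<String> fresh = new CompletableFuture<>();
        existing = inFlight.putIfAbsent(key, fresh);
        if (existing != null) {
            return existing; // another thread won the race to start the call
        }
        // We are the single requester: make the real call to Service 2.
        fetchFromService2(page).whenComplete((result, error) -> {
            if (error != null) fresh.completeExceptionally(error);
            else fresh.complete(result);
            // Evict on completion so the sketch doesn't grow without bound;
            // a TTL cache would instead keep the entry for the validity window.
            inFlight.remove(key, fresh);
        });
        return fresh;
    }

    // Illustrative stand-in for the actual Service 2 call.
    private CompletableFuture<String> fetchFromService2(int page) {
        return CompletableFuture.supplyAsync(() -> "response for page " + page);
    }
}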
This has a few advantages. If your request is slow or there are highly duplicative requests even in a small time frame, many requests to Service 1 are serviced by a single response from Service 2.
Secondarily, once the request to Service 2 has come back, assuming you have a window in which the response is still valid, the response is already available immediately and no request is necessary at all.
If your Service 2 request takes 50 ms, and your response can be considered valid for 5 seconds, all requests arriving at the same server in the first 50 ms are serviced at ms 50, when the response is returned, and from that point forward, for the remaining 4950 ms, callers already have access to the response.
As I alluded to earlier, the effectiveness here is tied to how many instances of Service 1 are running: the number of duplicate requests at any time is linear in the number of servers running.
This is a mostly lock-free way to achieve this. I say mostly because some synchronization is necessary in the TTL cache itself to make sure the request is only started once, but that has never been a performance issue in my experience.
As an extension of this, you can potentially use something like Redis to cache responses from Service 2 if it has long-ish response times, and have your getPage equivalent first check a Redis cache for the serialized response (and write an expiring value if one wasn't there). This lets you reduce requests to Service 2 even further by having a more globally shared cached value, but a second caching layer does add complexity and potential for issues.
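As a sketch of that second layer (assuming the Jedis client; the key scheme and the 5-second window are illustrative), the getPage equivalent would check Redis before falling back to Service 2:

import java.util.concurrent.CompletableFuture;
import redis.clients.jedis.Jedis;

public class GloballyCachedFetcher {
    private final Jedis redis = new Jedis("localhost"); // illustrative connection

    public CompletableFuture<String> getPage(int page) {
        String key = "getPage:" + page;
        String cached = redis.get(key);
        if (cached != null) {
            // Some other instance of Service 1 already fetched this page.
            return CompletableFuture.completedFuture(cached);
        }
        return fetchFromService2(page).thenApply(response -> {
            redis.setex(key, 5, response); // expire after the validity window
            return response;
        });
    }

    // Illustrative stand-in for the actual Service 2 call.
    private CompletableFuture<String> fetchFromService2(int page) {
        return CompletableFuture.supplyAsync(() -> "response for page " + page);
    }
}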
I use AWS API Gateway integrated with AWS Lambda (Java), but I'm seeing some serious problems with this approach. The concept of removing the server and having your app scale out of the box is really nice, but here are the problems I'm facing. My Lambda does two simple things: it validates the payload received from the client and then sends it to a Kinesis stream for further processing by another Lambda. (You will ask why I don't send directly to the stream and use only one Lambda for all of the operations. Let's just say that I want to separate the logic and have a layer of abstraction, and also to be able to tell the client when they're sending invalid data.)
In the implementation of the Lambda I integrated Spring DI. So far so good. I started performance testing: I simulated 50 concurrent users making 4 requests each, with 5 seconds between requests. So what happened: on the Lambda's cold start I initialize Spring's application context, but it seems that having so many simultaneous requests while the Lambda was not yet started does some strange things. Here's a screenshot of the times the context took to initialize.
What we can see from the screenshot is that the context initialization times vary widely. My assumption about what is happening is that when so many requests are received and there is no "active" Lambda, it initializes a Lambda container for each of them, and at the same time it "blocks" some of them (the ones with the big times of 18 s) until the ones already started are ready. So maybe it has some internal limit on the number of containers it can start at the same time. The problem is that if your traffic is not evenly distributed, this will happen from time to time and some of the requests will time out. We don't want that to happen.
So the next step was to run some tests without the Spring container, as my thought was, "OK, the initialization is heavy; let's just initialize plain old Java objects." Unfortunately, the same thing happened (maybe it just shaved the 3 s of container initialization off some of the requests). Here is a more detailed screenshot of the test data:
So I logged the whole Lambda execution time (from construction to the end), the Kinesis client initialization, and the actual sending of the data to the stream, as these are the heaviest operations in the Lambda. We still have these big times of 18 s or so, but the interesting thing is that the times are somehow proportional. If the whole Lambda takes 18 s, around 7-8 s is the client initialization, 6-7 s is sending the data to the stream, and the remaining 4-5 s go to the other operations in the Lambda, which at the moment is only validation. On the other hand, if we take one of the small times (which means it reuses an already started Lambda), e.g. 820 ms, the Kinesis client initialization takes 100 ms, the data sending 340 ms, and the validation 400 ms. So this again pushes me toward the thought that internally it sleeps because of some limits. The next screenshot shows what happens on the next round of requests, when the Lambda is already started:
So we don't have these big times; yes, we still have a relatively big delta in some of the requests (which is also strange to me), but things look much better.
So I'm looking for clarification from someone who actually knows what is happening under the hood, because this is not good behavior for a serious application that uses the cloud for its "unlimited" possibilities.
And another question relates to another Lambda limit: 200 concurrent invocations across all Lambdas within an account in a region. For me this is also a big limitation for a big application with lots of traffic. Since my business case at the moment (I don't know about the future) is more or less fire-and-forget, I'm starting to think of changing the logic so that the gateway sends the data directly to the stream and the other Lambda takes care of the validation and the further processing. Yes, I'm losing the current abstraction (which I don't need at the moment), but I'm increasing the application's availability many times over. What do you think?
The Lambda execution time spikes to 18 s because AWS launches new containers with your code to handle the incoming requests. The bootstrap time is ~18 s.
Assigning more RAM can significantly improve the performance of your Lambda function, because with more RAM you also get more CPU and networking throughput!
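One common mitigation, sketched below under the assumption of the AWS SDK v1 Kinesis client, is to build heavy objects once in static state so that every warm invocation in the same container reuses them instead of paying the initialization cost again:

import com.amazonaws.services.kinesis.AmazonKinesis;
import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

public class IngestHandler implements RequestHandler<String, String> {
    // Built once per container during cold start; reused on warm invocations.
    private static final AmazonKinesis KINESIS =
            AmazonKinesisClientBuilder.defaultClient();

    @Override
    public String handleRequest(String payload, Context context) {
        // validation and KINESIS.putRecord(...) would go here
        return "ok";
    }
}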
And another question relates to another Lambda limit: 200 concurrent invocations across all Lambdas within an account in a region.
You can ask AWS Support to increase that limit. I asked to have it raised to 10,000 invocations/second and AWS Support did it quickly!
You can proxy straight to the Kinesis stream via API Gateway. You would lose some control in terms of validation and transformation, but you won't have the cold-start latency you're seeing from Lambda.
You can use an API Gateway mapping template to transform the data, and if validation is important, you could potentially do it in the processing Lambda on the other side of the stream.
My App Engine project retrieves XML data from a particular link using the GAE URL Fetch API. I have used the sample from here, except that mine looks like this:
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

URLConnection connection = new URL(url).openConnection();
connection.setConnectTimeout(0); // 0 means no timeout at the connection level...
connection.setReadTimeout(0);    // ...but GAE still enforces its own request deadline
InputStream stream = connection.getInputStream();
This takes more than 60 seconds (the maximum allowed by the API) and hence causes a DeadlineExceededException. Using task queues for this purpose is also not an option, as mentioned here.
Is there any other way someone might have achieved this until now?
Thanks!
Task queues can be active longer than the App Engine automatic-scaling request deadline of 1 minute. Under automatic scaling, a task can run for 10 minutes; under basic or manual scaling, it can run for 24 hours. See the docs here. (Note that although the linked docs are for Python, the same is true for Java, Go, and PHP on GAE.)
Finally, I have to echo what was said by the other users: the latency is almost certainly caused by the endpoint of your URL fetch, not by the network or App Engine. You can check this for sure by looking at your App Engine log lines for the failing requests. The cpu_millis field tells you how long the actual GAE-side process worked on the request, while the millis field is the total time for the request. If the total time is much higher than the CPU time, the cost was elsewhere in the network.
It might be related to bandwidth exhaustion from multiple connections relative to the endpoint's limited resources. If the endpoint is muc2014.communitymashup.net/x3/mashup, as you added in a comment, it might help to know that at the time I posted this comment (approx. 1424738921 in Unix time), the average latency on that endpoint (including the full response, not just the time until the response begins) was ~6 seconds, although that could feasibly go above 60 s under heavy load if no scaling system is set up for the endpoint. The observed latency is already quite high, but it might vary according to what kind of work needs to be done server-side, what volume of requests/data is being handled, etc.
The problem lay in the stream being consumed by a function from the EMF library, which took a lot of time (this wasn't the case previously).
Instead, loading the contents from the URL into a StringBuilder, converting that into a separate InputStream, and passing it to the function worked, with all of this done in a cron job.
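A rough sketch of that workaround (variable names are illustrative; the original charset may differ):

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Read the whole response into memory first...
StringBuilder content = new StringBuilder();
try (BufferedReader reader = new BufferedReader(new InputStreamReader(
        new URL(url).openStream(), StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        content.append(line).append('\n');
    }
}

// ...then hand the EMF function a separate, in-memory stream.
InputStream forEmf = new ByteArrayInputStream(
        content.toString().getBytes(StandardCharsets.UTF_8));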
I'm testing a Google Web Toolkit application and I'm having some performance issues with multiple RPC calls. The structure of the app is:
User submits a query
Initial query is serviced by a single server-side servlet
Once the initial reply is received, multiple components are subsequently updated by iterating over each component and calling an update method, passing it the results of the initial query
Each component's update method does some work on the data passed to it, in addition to potentially calling other server-side services
On success of these calls, the component is updated in the UI.
With the initial query service and one component (effectively running sequentially), the response time is fast. However, adding any other components (e.g. the initial query service + 2 components, with these 2 components calling asynchronously) hugely impacts the response time.
Is there any way to improve / rectify this?
Example: (IQS = initial query service, C1 = component 1, C2 = component 2, C1S = component 1 service, C2S = component 2 service)
Initial Query + 1 component
IQS, returned - propagating results, 1297273015477
C1, Sending server request,1297273015477
C1S, Sending back..., 1297273016486
C1, Receiving Complete, 1297273016522 (total time from initial call - 1045ms)
Initial Query + 2 components
IQS, returned - propagating results, 1297272667185
C1, Sending server request,1297272667185
C2, Sending server request,1297272668132
C1S, Sending back..., 1297272668723
C2S, Sending back..., 1297272669371
C1, Back at client to process, 1297272671077 (total time from initial call - 3892ms)
C2, Back at client to process, 1297272674518 (total time from initial call - 6386ms)
Thanks in advance.
Paul
I think you need to make your analysis more fine-grained: from the data provided, you have established that the client started the 2nd component call and got a response back 6386 ms later. Some of this time was spent:
Going over the wire
Being received at the server
Being processed at the server (this could be broken down further, as well).
Being sent back over the wire.
The GWT-RPC service really only has to do with 1 and 4. Do you know how long each step takes?
Well, I think your problem is not directly related to GWT, because I have used multiple RPC calls at the same time and my application's performance did not degrade. I think you may have server-side synchronization issues.
The overhead of HTTP with cookies, and the sequencing of some of these calls (rather than firing all the requests when the user switches to another part of the application), is part of the reason they seem to slow things down. E.g. a user requests a page; once that page's widgets are in place, they fire requests for the data they're supposed to show, possibly making decisions to add more widgets based on that data (but hopefully passing the data into those widgets).
You might look for tools that help you create batched RPC calls, like gwt-dispatch. I don't think there's anything automatic.
A low-tech way to get more information is to put basic timing logging into every RPC to see how long each one takes. Create a new Date() at the top, subtract its ms from a new Date()'s ms at the end, and print the result to stdout or use Log.info() or whatever.
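A sketch of that kind of timing wrapper inside a service method (names and the log destination are illustrative):

import java.util.Date;

public class TimedService {
    public String serviceCall(String query) {
        long start = new Date().getTime(); // timestamp at the top

        String result = doRealWork(query); // the actual RPC work goes here

        long elapsed = new Date().getTime() - start;
        System.out.println("serviceCall took " + elapsed + " ms");
        return result;
    }

    // Illustrative stand-in for the real server-side logic.
    private String doRealWork(String query) {
        return "result for " + query;
    }
}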
For something more industrial-strength, I used SpringSource tc Server combined with Chrome's Speed Tracer to get a full-stack view of which calls were taking what amount of time and what was actually able to happen in parallel. It's not trivial to set up, but once I did, I was able to zero in on the real issues (in my case, getting tons of unnecessary information from Hibernate queries) very quickly.
Here's the basic info we used:
Download the tc Server Developer Edition (free)
http://www.springsource.com/products/tc-server-developer-edition-preview
NOTE: Do not even THINK about installing in a directory structure that has spaces.....
Installing tc Server: Main Steps
http://static.springsource.com/projects/tc-server/2.1/getting-started/html/ch06s02.html#install-developer-edition
Viewing Spring Insight Data In Google Speed Tracer
http://static.springsource.com/projects/tc-server/2.0/devedition/html/ch04s04.html
The URL is now localhost:8080 instead of the old port address for the other installation of Tomcat.
One more detail: you'll need to make a .war file and deploy it to the Tomcat directory. (You're not getting perf data in dev mode, but rather from a locally compiled GWT release.)
-- Andrew # learnvc.com