Apache Camel, Netty4 endpoint as client - memory leakage

Apache Camel, Netty4 endpoint as client - memory leakage - java

I'm quite new to Apache Camel and trying to bring some routes into action.
I have a TCP server which serves large JSON-Messages (up to ~30-50kB in size, where i do not have any control about the source size) that contain lots of measurement data which i want to process using certain additional routes that work fine.
I'm using camel 2.20 within spring-boot environment 1.5.7.
I faced the problem that if i commented out every other routes except the incoming reduced netty4 route (only from and to a counter), see below
#Bean
public RouteBuilder getRoute() {
String fromSource = String.format("netty4:tcp://%s:%d?clientMode=true&textline=true&receiveBufferSize=64000&decoderMaxLineLength=64000",sourceIp,sourcePort);
return new RouteBuilder() {
from(fromSource)
.to("metrics:counter:incomingCounter");
};
}
The route works nearly fine but consumes more and more heap-space (around 2MB every second, where there are messages served with a frequency of around 20-30Hz) until java throws java.lang.OutOfMemoryError: Java heap space.
Without any route no memory-leak was registered, as i can focus the problem to the netty-route
Any help will be appreciated.
Thanks in advance.

I found the resolution myself by debugging the code.
I forgot to set property sync=false in netty4-camel endpoint as i don't want to process message and send an answer back to the server after processing, just consuming - while sync=true (default settings) buffers all incoming data for later response which caused my "memory-leak".
The behavior of "sync" was not totally clear from the netty4-camel documentation (http://camel.apache.org/netty4.html) - i'll suggest an improvement of the documentation (will write a mail with a proposal) to make the usage a little more clearly.
Maybe this helps someone another having a similar problem.
Best

Related

Http Websocket as Akka Stream Source

I'd like to listen on a websocket using akka streams. That is, I'd like to treat it as nothing but a Source.
However, all official examples treat the websocket connection as a Flow.
My current approach is using the websocketClientFlow in combination with a Source.maybe. This eventually results in the upstream failing due to a TcpIdleTimeoutException, when there are no new Messages being sent down the stream.
Therefore, my question is twofold:
Is there a way – which I obviously missed – to treat a websocket as just a Source?
If using the Flow is the only option, how does one handle the TcpIdleTimeoutException properly? The exception can not be handled by providing a stream supervision strategy. Restarting the source by using a RestartSource doesn't help either, because the source is not the problem.
Update
So I tried two different approaches, setting the idle timeout to 1 second for convenience
application.conf
akka.http.client.idle-timeout = 1s
Using keepAlive (as suggested by Stefano)
Source.<Message>maybe()
.keepAlive(Duration.apply(1, "second"), () -> (Message) TextMessage.create("keepalive"))
.viaMat(Http.get(system).webSocketClientFlow(WebSocketRequest.create(websocketUri)), Keep.right())
{ ... }
When doing this, the Upstream still fails with a TcpIdleTimeoutException.
Using RestartFlow
However, I found out about this approach, using a RestartFlow:
final Flow<Message, Message, NotUsed> restartWebsocketFlow = RestartFlow.withBackoff(
Duration.apply(3, TimeUnit.SECONDS),
Duration.apply(30, TimeUnit.SECONDS),
0.2,
() -> createWebsocketFlow(system, websocketUri)
);
Source.<Message>maybe()
.viaMat(restartWebsocketFlow, Keep.right()) // One can treat this part of the resulting graph as a `Source<Message, NotUsed>`
{ ... }
(...)
private Flow<Message, Message, CompletionStage<WebSocketUpgradeResponse>> createWebsocketFlow(final ActorSystem system, final String websocketUri) {
return Http.get(system).webSocketClientFlow(WebSocketRequest.create(websocketUri));
}
This works in that I can treat the websocket as a Source (although artifically, as explained by Stefano) and keep the tcp connection alive by restarting the websocketClientFlow whenever an Exception occurs.
This doesn't feel like the optimal solution though.

No. WebSocket is a bidirectional channel, and Akka-HTTP therefore models it as a Flow. If in your specific case you care only about one side of the channel, it's up to you to form a Flow with a "muted" side, by using either Flow.fromSinkAndSource(Sink.ignore, mySource) or Flow.fromSinkAndSource(mySink, Source.maybe), depending on the case.
as per the documentation:
Inactive WebSocket connections will be dropped according to the
idle-timeout settings. In case you need to keep inactive connections
alive, you can either tweak your idle-timeout or inject ‘keep-alive’
messages regularly.
There is an ad-hoc combinator to inject keep-alive messages, see the example below and this Akka cookbook recipe. NB: this should happen on the client side.
src.keepAlive(1.second, () => TextMessage.Strict("ping"))

I hope I understand your question correctly. Are you looking for asSourceOf?
path("measurements") {
entity(asSourceOf[Measurement]) { measurements =>
// measurement has type Source[Measurement, NotUsed]
...
}
}

Java REST service answer takes too much time

This is a problem i've been trying to deal with for almost a week without finding a real solution , here's the problem .
On my Angular client's side I have a button to generate a CSV file which works this way :
User clicks a button.
A POST request is sent to a REST JAX-RS webservice.
Webservice launches a database query and returns a JSON with all the lines needed to the client.
The AngularJS client receives a JSON processes it and generates the CSV.
All good here when there's a low volume of data to return , problems start when I have to return big amounts of data .Starting from 2000 lines I fell like the JBOSS server starts to struggle to send the data like i've reached a certain limit in data capacities (my eclipse where the server is running becomes very slow until the end of the data transmission )
The thing is that after testing i've found out it's not the Database query or the formating of the data that takes time but rather the sending of the data (3000 lines that are 2 MB in size take around 1 minute to reach the client) even though on my developper setup both the ANGULAR client And the JBOSS server are running on the same machine .
This is my Server side code :
#POST
#GZIP
#Path("/{id_user}/transactionsCsv")
#Produces(MediaType.APPLICATION_JSON)
#ApiOperation(value = "Transactions de l'utilisateur connecté sous forme CSV", response = TransactionDTO.class, responseContainer = "List")
#RolesAllowed(value = SecurityRoles.PORTAIL_ACTIVITE_RUBRIQUE)
public Response getOperationsCsv(#PathParam("id_user") long id_user,
#Context HttpServletRequest request,
#Context HttpServletResponse response,
final TransactionFiltreDTO filtre) throws IOException {
final UtilisateurSession utilisateur = (UtilisateurSession) request.getSession().getAttribute(UtilisateurSession.SESSION_CLE);
if (!utilisateur.getId().equals(id_user)) {
return genererReponse(new ResultDTO(Status.UNAUTHORIZED, null, null));
}
//database query
transactionDAO.getTransactionsDetailLimite(utilisateur.getId(), filtre);
//database query
List<Transaction> resultat = detailTransactionDAO.getTransactionsByUtilisateurId(utilisateur.getId(), filtre);
// To format the list to the export format
List<TransactionDTO> liste = Lists.transform(resultat, TransactionDTO.transactionToDTO);
return Response.ok(liste).build();
}
Do you guys have any idea about what is causing this problem or know another way to do things that might not cause this problem ? I would be grateful .
thank you :)
Here's the link for the JBOSS thread Dump :
http://freetexthost.com/y4kpwbdp1x

I've found in other contexts (using RMI) that the more local you are, the less worth it compression is. Your machine is probably losing most of its time on the processing work that compression and decompression require. The larger the amount of data, the greater the losses here.
Unless you really need to send this as one list, you might consider sending lists of entries. Requesting them page-wise to reduce the amount of data sent with one response. Even if you really need a single list on the client-side, you could assemble it after transport.

I'm convinced that the problem comes from the server trying to send big amount of data at once . Is there a way i can send the http answer in several small chunks instead of a single big one ?

To measure performance, we need to check the complete trace.
Many ways to do it, one of the way I find it easier.
Compress the output to ZIP, this reduces the data transfer over the network.
Index the column in Database, so that the query execution time decreases.
Check the processing time between several modules if any between different layers of code (REST -> Service -> DAO -> DB and vice versa)
If there wouldnt be much changes in the database, then you can introduce secondary caching mechanism and lower the cache eviction time or prefer the cache eviction policy as per your requirement.
To find the exact reason:
Collect the thread dump from a single run of the process.From that thread dump, we can check the exact time consumption of layers and pinpoint the problem.
Hope that helps !
[EDIT]
You should analyse the stack trace in dump and not the one added in the link.
If the larger portion of data is not able to process by the request,
Pagination, page size with number of pages might help(Only in case of non CSV file)
Limit, number of lines that can be processed.
Additional Query criteria like dates, users etc.
Sample REST URL :
http://localhost:8080/App/{id_user}/transactionCSV?limit=1000
http://localhost:8080/App/{id_user}/transactionCSV?fromDate=2011-08-01&toDate=2016-08-01
http://localhost:8080/App/{id_user}/transactionCSV?user=Admin

Zipkin, using existing libraries to handle tracing in microservices connected with Apache Kafka

I would like to implement tracing in my microservices architecture. I am using Apache Kafka as message broker and I am not using Spring Framework. Tracing is a new concept for me. At first I wanted to create my own implementation, but now I would like to use existing libraries. Brave looks like the one I will want to use. I would like to know if there are some guides, examples or docs on how to do this. Documentation on Github page is minimal, and I find it hard to start using Brave. Or maybe there is better library with proper documentation, that is easier to use. I will be looking at Apache HTrace because it looks promising. Some getting started guides will be nice.

There are a bunch of ways to answer this, but I'll answer it from the "one-way" perspective. The short answer though, is I think you have to roll your own right now!
While Kafka can be used in many ways, it can be used as a transport for unidirectional single producer single consumer messages. This action is similar to normal one-way RPC, where you have a request, but no response.
In Zipkin, an RPC span is usually request-response. For example, you see timing of the client sending to the server, and also the way back to the client. One-way is where you leave out the other side. The span starts with a "cs" (client send) and ends with a "sr" (server received).
Mapping this to Kafka, you would mark client sent when you produce the message and server received when the consumer receives it.
The trick to Kafka is that there is no nice place to stuff the trace context. That's because unlike a lot of messaging systems, there are no headers in a Kafka message. Without a trace context, you don't know which trace (or span for that matter) you are completing!
The "hack" approach is to stuff trace identifiers as the message key. A less hacky way would be to coordinate a body wrapper which you can nest the trace context into.
Here's an example of the former:
https://gist.github.com/adriancole/76d94054b77e3be338bd75424ca8ba30

I meet the same problem too.Here is my solution, a less hacky way as above said.
ServerSpan serverSpan = brave.serverSpanThreadBinder().getCurrentServerSpan();
TraceHeader traceHeader = convert(serverSpan);
//in kafka producer,user KafkaTemplete to send
String wrapMsg = "wrap traceHeader with originMsg ";
kafkaTemplate.send(topic, wrapMsg).get(10, TimeUnit.SECONDS);// use synchronization
//then in kafka consumer
ConsumerRecords<String, String> records = consumer.poll(5000);
// for loop
for (ConsumerRecord<String, String> record : records) {
String topic = record.topic();
int partition = record.partition();
long offset = record.offset();
String val = record.value();
//parse val to json
Object warpObj = JSON.parseObject(val);
TraceHeader traceHeader = warpObj.getTraceHeader();
//then you can do something like this
MyRequest myRequest = new MyRequest(traceHeader, "/esb/consumer", "POST");
brave.serverRequestInterceptor().handle(new HttpServerRequestAdapter(new MyHttpServerRequest(myRequest), new DefaultSpanNameProvider()));
//then some httprequest within brave-apache-http-interceptors
//http.post(url,content)
}
you must implements MyHttpServerRequest and MyRequest.It is easy,you just return something a span need,such as uri,header,method.
This is a rough and ugly code example,just offer an idea.

What causes "java.io.IOException: stream was reset: CANCEL" with okhttp and spdy?

I'm experimenting with OKHttp (version 2.0.0-RC2) and SPDY and seeing IOException: stream was reset: CANCEL quite a lot, maybe 10% or more of all requests in some preliminary testing. When using Apache HttpClient and regular https we were not seeing any equivalent issue as far as I'm aware. I'm pretty sure we also don't see anything equivalent with OkHttp when SPDY is disabled (client.setProtocols(ImmutableList.of(Protocol.HTTP_1_1))) but I haven't done enough testing to be 100% confident.
This previous question sees these exceptions among others and the advice there is to ignore them, but this seems crazy: we get an exception while reading data from the server, so we abort the data processing code (which using Jackson). We need to do something in such cases. We could retry the request, of course, but sometimes it's a POST request which is not retry-able, and if we've already started receiving data from the server then it's a good bet that the server as already taken the requested action.
Ideally there is some configuration of the client and/or the server that we can do in order to reduce the incidence of these exceptions, but I don't understand SPDY well enough to know even where to start looking or to advise our server-admin team to start looking.
Stack trace, in case it's helpful:
java.io.IOException: stream was reset: CANCEL
at com.squareup.okhttp.internal.spdy.SpdyStream$SpdyDataSource.checkNotClosed(SpdyStream.java:442)
at com.squareup.okhttp.internal.spdy.SpdyStream$SpdyDataSource.read(SpdyStream.java:344)
at com.squareup.okhttp.internal.http.SpdyTransport$SpdySource.read(SpdyTransport.java:273)
at okio.RealBufferedSource.exhausted(RealBufferedSource.java:60)
at okio.InflaterSource.refill(InflaterSource.java:96)
at okio.InflaterSource.read(InflaterSource.java:62)
at okio.GzipSource.read(GzipSource.java:80)
at okio.RealBufferedSource$1.read(RealBufferedSource.java:227)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.loadMore(UTF8StreamJsonParser.java:174)
at com.fasterxml.jackson.core.base.ParserBase.loadMoreGuaranteed(ParserBase.java:431)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2111)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2092)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:275)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:205)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:230)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:202)
at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:58)
at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15)
at com.fasterxml.jackson.databind.ObjectMapper._readValue(ObjectMapper.java:2765)
at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:1546)
at com.fasterxml.jackson.core.JsonParser.readValueAsTree(JsonParser.java:1363)
at (application-level code...)

Your best bet is to set a breakpoint in the two places where the CANCEL error code is assigned: that's SpdyStream#closeInternal (line 246) and SpdyStream#receiveRstStream (line 304). If you can put a breakpoint here, you can capture who is canceling your stream and that'll shed light on the problem.
If for whatever reason you cannot attach a debugger, you can instrument the code to print a stacktrace when those lines are reached:
new Exception("SETTING ERROR CODE TO " + errorCode).printStackTrace();
In either case, I'm the author of that code and I'd love to help you resolve this problem.

Had the same problem and this was a result of network connection timeout, this was a result of downloading a large file from the web service
i had my timeout set to 2-min so i changed it to 5-min and it solved my problem
val okkHttpclient = OkHttpClient.Builder()
.connectTimeout(5, TimeUnit.MINUTES)
.writeTimeout(5, TimeUnit.MINUTES) // write timeout
.readTimeout(5, TimeUnit.MINUTES) // read timeout
.addInterceptor(networkConnectionInterceptor)
.build()

We had this issue because of broken http headers. The android Base64 encoder by default adds newlines which broke our Authorization headers.

Can a web service return a stream?

I've been writing a little application that will let people upload & download files to me. I've added a web service to this applciation to provide the upload/download functionality that way but I'm not too sure on how well my implementation is going to cope with large files.
At the moment the definitions of the upload & download methods look like this (written using Apache CXF):
boolean uploadFile(#WebParam(name = "username") String username,
#WebParam(name = "password") String password,
#WebParam(name = "filename") String filename,
#WebParam(name = "fileContents") byte[] fileContents)
throws UploadException, LoginException;
byte[] downloadFile(#WebParam(name = "username") String username,
#WebParam(name = "password") String password,
#WebParam(name = "filename") String filename) throws DownloadException,
LoginException;
So the file gets uploaded and downloaded as a byte array. But if I have a file of some stupid size (e.g. 1GB) surely this will try and put all that information into memory and crash my service.
So my question is - is it possible to return some kind of stream instead? I would imagine this isn't going to be terribly OS independent though. Although I know the theory behind web services, the practical side is something that I still need to pick up a bit of information on.
Cheers for any input,
Lee

Yes, it is possible with Metro. See the Large Attachments example, which looks like it does what you want.
JAX-WS RI provides support for sending and receiving large attachments in a streaming fashion.
Use MTOM and DataHandler in the programming model.
Cast the DataHandler to StreamingDataHandler and use its methods.
Make sure you call StreamingDataHandler.close() and also close the StreamingDataHandler.readOnce() stream.
Enable HTTP chunking on the client-side.

Stephen Denne has a Metro implementation that satisfies your requirement. My answer is provided below after a short explination as to why that is the case.
Most Web Service implementations that are built using HTTP as the message protocol are REST compliant, in that they only allow simple send-receive patterns and nothing more. This greatly improves interoperability, as all the various platforms can understand this simple architecture (for instance a Java web service talking to a .NET web service).
If you want to maintain this you could provide chunking.
boolean uploadFile(String username, String password, String fileName, int currentChunk, int totalChunks, byte[] chunk);
This would require some footwork in cases where you don't get the chunks in the right order (Or you can just require the chunks come in the right order), but it would probably be pretty easy to implement.

When you use a standardized web service the sender and reciever do rely on the integrity of the XML data send from the one to the other. This means that a web service request and answer only are complete when the last tag was sent. Having this in mind, a web service cannot be treated as a stream.
This is logical because standardized web services do rely on the http-protocol. That one is "stateless", will say it works like "open connection ... send request ... receive data ... close request". The connection will be closed at the end, anyway. So something like streaming is not intended to be used here. Or he layers above http (like web services).
So sorry, but as far as I can see there is no possibility for streaming in web services. Even worse: depending on the implementation/configuration of a web service, byte[] - data may be translated to Base64 and not the CDATA-tag and the request might get even more bloated.
P.S.: Yup, as others wrote, "chuinking" is possible. But this is no streaming as such ;-) - anyway, it may help you.

I hate to break it to those of you who think a streaming web service is not possible, but in reality, all http requests are stream based. Every browser doing a GET to a web site is stream based. Every call to a web service is stream based. Yes, all. We don't notice this at the level where we are implementing services or pages because lower levels of the architecture are dealing with this for you - but it is being done.
Have you ever noticed in a browser that sometimes it can take a while to fetch a page - the browser just keeps cranking away showing the hourglass? That is because the browser is waiting on a stream.
Streams are the reason mime/types have to be sent before the actual data - it's all just a byte stream to the browser, it wouldn't be able to identify a photo if you didn't tell it what it was first. It's also why you have to pass the size of a binary before sending - the browser won't be able to tell where the image stops and the page picks up again.
It's all just a stream of bytes to the client. If you want to prove this for yourself, just get a hold of the output stream at any point in the processing of a request and close() it. You will blow up everything. The browser will immediately stop showing the hourglass, and will display a "cannot find" or "connection reset at server" or some other such message.
That a lot of people don't know that all of this stuff is stream based shows just how much stuff has been layered on top of it. Some would say too much stuff - I am one of those.
Good luck and happy development - relax those shoulders!

For WCF I think its possible to define a member on a message as stream and set the binding appropriately - I've seen this work with wcf talking to Java web service.
You need to set the transferMode="StreamedResponse" in the httpTransport configuration and use mtomMessageEncoding (need to use a custom binding section in the config).
I think one limitation is that you can only have a single message body member if you want to stream (which kind of makes sense).

Apache CXF supports sending and receiving streams.

One way to do it is to add a uploadFileChunk(byte[] chunkData, int size, int offset, int totalSize) method (or something like that) that uploads parts of the file and the servers writes it the to disk.

Keep in mind that a web service request basically boils down to a single HTTP POST.
If you look at the output of a .ASMX file in .NET , it shows you exactly what the POST request and response will look like.
Chunking, as mentioned by #Guvante, is going to be the closest thing to what you want.
I suppose you could implement your own web client code to handle the TCP/IP and stream things into your application, but that would be complex to say the least.

I think using a simple servlet for this task would be a much easier approach, or is there any reason you can not use a servlet?
For instance you could use the Commons open source library.

The RMIIO library for Java provides for handing a RemoteInputStream across RMI - we only needed RMI, though you should be able to adapt the code to work over other types of RMI . This may be of help to you - especially if you can have a small application on the user side. The library was developed with the express purpose of being able to limit the size of the data pushed to the server to avoid exactly the type of situation you describe - effectively a DOS attack by filling up ram or disk.
With the RMIIO library, the server side gets to decide how much data it is willing to pull, where with HTTP PUT and POSTs, the client gets to make that decision, including the rate at which it pushes.

Yes, a webservice can do streaming. I created a webservice using Apache Axis2 and MTOM to support rendering PDF documents from XML. Since the resulting files could be quite large, streaming was important because we didn't want to keep it all in memory. Take a look at Oracle's documentation on streaming SOAP attachments.
Alternately, you can do it yourself, and tomcat will create the Chunked headers. This is an example of a spring controller function that streams.
#RequestMapping(value = "/stream")
public void hellostreamer(HttpServletRequest request, HttpServletResponse response) throws CopyStreamException, IOException
{
response.setContentType("text/xml");
OutputStreamWriter writer = new OutputStreamWriter (response.getOutputStream());
writer.write("this is streaming");
writer.close();
}

It's actually not that hard to "handle the TCP/IP and stream things into your application". Try this...
class MyServlet extends HttpServlet
{
public void doGet(HttpServletRequest request, HttpServletResponse response)
{
response.getOutputStream().println("Hello World!");
}
}
And that is all there is to it. You have, in the above code, responded to an HTTP GET request sent from a browser, and returned to that browser the text "Hello World!".
Keep in mind that "Hello World!" is not valid HTML, so you may end up with an error on the browser, but that really is all there is to it.
Good Luck in your development!
Rodney

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.