At work we use Netflix's Feign client to help with requests between services. However, I'm confused by its apparent inability to stream data, especially given Netflix's well-known business of streaming video. I'm clearly missing something here.
To explain, say Service A asks the Feign Client of Service B for a stream of data and Service B sends the stream in the response. At this point, the execute() method in the Feign Client gets called:
@Override
public Response execute(Request request, Options options) throws IOException {
    HttpURLConnection connection = convertAndSend(request, options);
    return convertResponse(connection);
}

HttpURLConnection convertAndSend(Request request, Options options) throws IOException {
    final HttpURLConnection connection = (HttpURLConnection) new URL(request.url()).openConnection();
    /** SNIP **/
    if (request.body() != null) {
        if (contentLength != null) {
            connection.setFixedLengthStreamingMode(contentLength);
        } else {
            connection.setChunkedStreamingMode(8196);
        }
        connection.setDoOutput(true);
        OutputStream out = connection.getOutputStream();
        if (gzipEncodedRequest) {
            out = new GZIPOutputStream(out);
        }
        try {
            out.write(request.body()); // PROBLEM
        } finally {
            try {
                out.close();
            } catch (IOException suppressed) {
            }
        }
    }
    return connection;
}
The line labelled PROBLEM is what confuses me.
The request object doesn't even have any sort of stream to read, just a byte[] body.
On the outgoing end, the entire body is written into the OutputStream at once. Shouldn't it chunk the data instead?
For example
// roughly what I'd expect instead: write the body in fixed-size chunks
byte[] body = request.body();
int location = 0;
final int bufferSize = 2048;
while (location < body.length) {
    int chunk = Math.min(bufferSize, body.length - location);
    out.write(body, location, chunk);
    location += chunk;
}
If the request had a stream instead of just byte[] body, you could improve that even further to send data as it becomes available.
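For instance, something like this hypothetical sketch (Feign's Request has no such bodyStream() accessor; it only exposes the byte[] body):
// Hypothetical: if Request exposed an InputStream, data could be forwarded
// as it arrives instead of only after it is fully buffered in memory.
InputStream bodyStream = request.bodyStream(); // hypothetical accessor
byte[] buffer = new byte[2048];
int read;
while ((read = bodyStream.read(buffer)) != -1) {
    out.write(buffer, 0, read);
}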
I'm very new to this area of service architecture. What am I missing?
Feign was designed for control plane APIs, which often don't benefit from streaming upwards. Streaming downwards is supported, though.
I have no objection to making the buffering more efficient (e.g. an alternative to the byte array). Just bear in mind that most of Feign's design revolves around templating forms (JSON or XML) and reusing them as much as possible (e.g. on retransmit, buffered + fixed length is easy and predictable).
I think I'd be happiest with a "streaming" design if it were coupled to the HTTP client. In other words, a subtype that addresses streaming in a way that makes sense for the transport: for example, InputStream for regular Java, an Okio Buffer for OkHttp, a Netty buffer for Netty, etc.
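A hypothetical sketch of the shape of that idea (none of these types exist in Feign today; this is purely illustrative):
// Hypothetical: a streaming body contract that each transport binds natively.
public interface StreamingBody {
    // The plain JDK client would consume this as an InputStream;
    // an OkHttp-backed client might instead wrap it in an Okio Source,
    // and a Netty-backed one in a Netty buffer.
    InputStream asInputStream() throws IOException;
}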
Spencer opened this for the investigation https://github.com/Netflix/feign/issues/220
Related
I am new to Netty, and I followed this example to write a static file server using it. But whenever the server serves a large JS file, it runs into a ClosedChannelException.
The following is my code where I write chunkedFile as http response.
When a large JS file is being served, I get a ClosedChannelException and the raf file is also closed.
Could you help me figure out what I have done wrong here? Also, is there a simple tutorial where I can understand the basic flow of control in Netty?
// Write the content.
ChannelFuture writeFuture = null;
try {
    long fileLength = raf.length();
    HttpResponse response = new DefaultHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK);
    response.setHeader(HttpHeaders.Names.CONTENT_LENGTH, fileLength);
    Channel c = ctx.getChannel();
    // Write the initial line and the header.
    c.write(response);
    writeFuture = c.write(new ChunkedFile(raf, 0, fileLength, 8192));
} finally {
    raf.close();
    if (writeFuture != null)
        writeFuture.addListener(ChannelFutureListener.CLOSE);
}
Calling raf.close() in the finally block is wrong, as the file may not have been fully written yet. In fact, Netty will take care of closing it after the write is complete.
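A minimal sketch of the corrected write path, assuming the Netty 3 API from the question (ChunkedFile closes the underlying RandomAccessFile once the transfer completes):
long fileLength = raf.length();
HttpResponse response = new DefaultHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK);
response.setHeader(HttpHeaders.Names.CONTENT_LENGTH, fileLength);

Channel c = ctx.getChannel();
c.write(response);

// No raf.close() here: the ChunkedWriteHandler reads from the file lazily,
// so it must stay open until the last chunk has been sent.
ChannelFuture writeFuture = c.write(new ChunkedFile(raf, 0, fileLength, 8192));
writeFuture.addListener(ChannelFutureListener.CLOSE); // close the channel when done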
I'm trying to write a Java HTTP proxy tunnelling program, and I need an expert's advice about the best and fastest stream to use for the communication.
I've implemented the basic functionality and everything works fine. The only concern is communication speed and performance. My HTTP proxy system consists of a server program running on a remote server and a client program running on the local machine. So far, the program looks like this:
Listener.java :
/**
 * Listens and accepts connection requests from the browser
 */
ServerSocket listener = null;
try {
    listener = new ServerSocket(port, 128);
} catch (IOException ex) {
    ex.printStackTrace(System.err);
}
ExecutorService executor = Executors.newCachedThreadPool();
Socket connection;
while (!shutdown) {
    try {
        connection = listener.accept();
        executor.execute(new ProxyTunnel(connection));
    } catch (IOException ex) {
        ex.printStackTrace(System.err);
    }
}
ProxyTunnel.java :
try {
    byte[] buffer = new byte[8192]; // 8-KB buffer
    InputStream browserInput = browser.getInputStream();
    OutputStream browserOutput = browser.getOutputStream();
    // Reading browser request ...
    StringBuilder request = new StringBuilder(2048);
    int read;
    do {
        read = browserInput.read(buffer);
        logger.log(read + " bytes read from browser.");
        if (read > 0) {
            request.append(new String(buffer, 0, read));
        }
    } while (browserInput.available() > 0 && read > 0);
    // Connecting to proxy server ...
    Socket server = new Socket(SERVER_IP, SERVER_PORT);
    server.setSoTimeout(5000); // Setting 5 sec read timeout
    OutputStream serverOutput = server.getOutputStream();
    InputStream serverInput = server.getInputStream();
    // Sending request to server ...
    serverOutput.write(request.toString().getBytes());
    serverOutput.flush();
    // Waiting for server response ...
    StringBuilder response = new StringBuilder(16384);
    do {
        try {
            read = serverInput.read(buffer);
        } catch (SocketTimeoutException ex) {
            break; // Timeout!
        }
        if (read > 0) {
            // Sending response to browser ...
            response.append(new String(buffer, 0, read));
            browserOutput.write(buffer, 0, read);
            browserOutput.flush();
        }
    } while (read > 0);
    // Closing connections ...
    server.close();
} catch (IOException ex) {
    ex.printStackTrace(System.err);
} finally {
    try {
        browser.close();
    } catch (IOException ex) {
        ex.printStackTrace(System.err);
    }
}
The server program uses a similar fashion and sends the HTTP request to the destination server (e.g. www.stackoverflow.com) and forwards the response to the client program, where the client program forwards the response to the local browser.
How can I improve the performance of these TCP/HTTP communications?
Does using buffered streams such as BufferedInputStream and BufferedOutputStream improve communication performance?
Will I gain any performance improvement if I use java.nio Channels and Buffers instead of java.net Sockets and java.io Streams?
Don't do it yourself
Advice 0: there are plenty of proxy servers out there, much more scalable, stable and mature. Do you really need to write your own?
Don't use StringBuilder/String to buffer request
byte[] buffer = new byte[8192]; // 8-KB buffer
//...
browserInput.read(buffer);
//...
request.append(new String(buffer, 0, read));
//...
serverOutput.write(request.toString().getBytes());
This is flawed for several reasons:
you are assuming your HTTP calls are text (ASCII) only; binary data will be malformed after transforming to String and back to byte[], see: String, byte[] and compression
even if the protocol is text-based, you are using system's default encoding. I bet this is not what you want
finally, the most important part: do not buffer the whole request. Read a chunk of data from the incoming request and forward it immediately to the target server in the same iteration. There is absolutely no need for the extra memory overhead and latency. Immediately after receiving a few bytes, dispatch them and forget about them (see the sketch after this list).
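A minimal sketch of that forwarding loop, assuming the browserInput and serverOutput streams from the question:
// Forward each chunk as raw bytes the moment it arrives: no String round-trip,
// no whole-request buffering.
byte[] buffer = new byte[8192];
int read;
while ((read = browserInput.read(buffer)) != -1) {
    serverOutput.write(buffer, 0, read);
    serverOutput.flush();
}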
Don't use Executors.newCachedThreadPool()
This pool can grow infinitely, creating thousands of threads during peaks. Essentially you create one thread per connection (except that the pool reuses free threads, but creates new ones if none are available). Consider Executors.newFixedThreadPool(100); 100-200 threads should be enough in most cases. Above that you'll most likely burn CPU on context switching without doing much real work. Don't be afraid of latency, scale out.
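For instance (a minimal sketch; the pool size of 100 is just a starting point to tune for your workload):
// A bounded pool caps the number of threads under load.
ExecutorService executor = Executors.newFixedThreadPool(100);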
Use non-blocking netty stack
Which brings us to the final advice. Drop blocking sockets altogether. They are handy, but they don't scale well due to the thread-per-connection requirement: too much memory is spent holding stacks, too much CPU is wasted on context switching. Netty is great, and it builds a powerful abstraction over NIO.
Check out the examples; they include HTTP client/server code. There is a bit of a learning curve, but you can expect performance growth of several orders of magnitude.
I am using Apache HTTPClient 4 to connect to twitter's streaming api with default level access. It works perfectly well in the beginning but after a few minutes of retrieving data it bails out with this error:
2012-03-28 16:17:00,040 DEBUG org.apache.http.impl.conn.SingleClientConnManager: Get connection for route HttpRoute[{tls}->http://myproxy:80->https://stream.twitter.com:443]
2012-03-28 16:17:00,040 WARN com.cloudera.flume.core.connector.DirectDriver: Exception in source: TestTwitterSource
java.lang.IllegalStateException: Invalid use of SingleClientConnManager: connection still allocated. Make sure to release the connection before allocating another one.
at org.apache.http.impl.conn.SingleClientConnManager.getConnection(SingleClientConnManager.java:216)
at org.apache.http.impl.conn.SingleClientConnManager$1.getConnection(SingleClientConnManager.java:190)
I understand why I am facing this issue. I am trying to use this HttpClient in a flume cluster as a flume source. The code looks like this:
public Event next() throws IOException, InterruptedException {
    try {
        HttpHost target = new HttpHost("stream.twitter.com", 443, "https");
        new BasicHttpContext();
        HttpPost httpPost = new HttpPost("/1/statuses/filter.json");
        StringEntity postEntity = new StringEntity("track=birthday", "UTF-8");
        postEntity.setContentType("application/x-www-form-urlencoded");
        httpPost.setEntity(postEntity);
        HttpResponse response = httpClient.execute(target, httpPost, new BasicHttpContext());
        BufferedReader reader = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
        String line = null;
        StringBuffer buffer = new StringBuffer();
        while ((line = reader.readLine()) != null) {
            buffer.append(line);
            if (buffer.length() > 30000) break;
        }
        return new EventImpl(buffer.toString().getBytes());
    } catch (IOException ie) {
        throw ie;
    }
}
I am trying to buffer 30,000 characters from the response stream in a StringBuffer and then return this as the data received. I am obviously not closing the connection - but I do not want to close it just yet, I guess. Twitter's dev guide talks about this here. It reads:
Some HTTP client libraries only return the response body after the
connection has been closed by the server. These clients will not work
for accessing the Streaming API. You must use an HTTP client that will
return response data incrementally. Most robust HTTP client libraries
will provide this functionality. The Apache HttpClient will handle
this use case, for example.
It clearly tells you that HttpClient will return response data incrementally. I've gone through the examples and tutorials, but I haven't found anything that comes close to doing this. If you guys have used a httpclient (if not apache) and read the streaming api of twitter incrementally, please let me know how you achieved this feat. Those who haven't, please feel free to contribute to answers. TIA.
UPDATE
I tried doing this: 1) I moved obtaining the stream handle to the open method of the flume source. 2) I am now using a simple InputStream and reading data into a byte buffer. So here is what the method body looks like now:
byte[] buffer = new byte[30000];
while (true) {
    int count = instream.read(buffer);
    if (count == -1)
        continue;
    else
        break;
}
return new EventImpl(buffer);
This works to an extent - I get tweets, and they are nicely being written to a destination. The problem is with the instream.read(buffer) return value. Even when there is no data on the stream, the buffer still holds its default \u0000 bytes, all 30,000 of them, and they get written to the destination. So the destination file looks like this: " tweets..tweets..tweets.. \u0000\u0000\u0000\u0000\u0000\u0000\u0000...tweets..tweets... ". I understand the count won't return -1 because this is a never-ending stream, so how do I figure out how much of the buffer was actually filled by the read call?
The problem is that your code is leaking connections. Please make sure that, no matter what, you either close the content stream or abort the request.
InputStream instream = response.getEntity().getContent();
try {
    BufferedReader reader = new BufferedReader(new InputStreamReader(instream));
    String line = null;
    StringBuffer buffer = new StringBuffer();
    while ((line = reader.readLine()) != null) {
        buffer.append(line);
        if (buffer.length() > 30000) {
            httpPost.abort();
            // connection will not be re-used
            break;
        }
    }
    return new EventImpl(buffer.toString().getBytes());
} finally {
    // if request is not aborted the connection can be re-used
    try {
        instream.close();
    } catch (IOException ex) {
        // log or ignore
    }
}
It turns out that it was a Flume issue. Flume is optimized to transfer events of size 32KB; anything beyond 32KB, Flume bails out. (The workaround is to tune the event size to be greater than 32KB.) So I've changed my code to buffer at least 20,000 characters. It kind of works, but it is not foolproof: this can still fail if the buffer length exceeds 32KB. However, it hasn't failed so far in an hour of testing - I believe that has to do with the fact that Twitter doesn't send a lot of data on its public stream.
while ((line = reader.readLine()) != null) {
    buffer.append(line);
    if (buffer.length() > 20000) break;
}
I'm new to Java development so please bear with me. Also, I hope I'm not the champion of tl;dr :).
I'm using HttpClient to make requests over HTTP (duh!) and I'd gotten it to work for a simple servlet that receives a URL as a query string parameter. I realized that my code could use some refactoring, so I decided to make my own HttpResponseHandler, to clean up the code, make it reusable and improve exception handling.
I currently have something like this:
public class HttpResponseHandler implements ResponseHandler<InputStream> {
    public InputStream handleResponse(HttpResponse response)
            throws ClientProtocolException, IOException {
        int statusCode = response.getStatusLine().getStatusCode();
        InputStream in = null;
        if (statusCode != HttpStatus.SC_OK) {
            throw new HttpResponseException(statusCode, null);
        } else {
            HttpEntity entity = response.getEntity();
            if (entity != null) {
                in = entity.getContent();
                // This works
                // for (int i; (i = in.read()) >= 0;) System.out.print((char) i);
            }
        }
        return in;
    }
}
And in the method where I make the actual request:
HttpClient httpclient = new DefaultHttpClient();
HttpGet httpget = new HttpGet(target);
ResponseHandler<InputStream> httpResponseHandler = new HttpResponseHandler();
try {
    InputStream in = httpclient.execute(httpget, httpResponseHandler);
    // This doesn't work
    // for (int i; (i = in.read()) >= 0;) System.out.print((char) i);
    return in;
} catch (HttpResponseException e) {
    throw new HttpResponseException(e.getStatusCode(), null);
}
The problem is that the input stream returned from the handler is closed. I don't have any idea why, but I've checked it with the prints in my code (and no, I haven't used them both at the same time :). While the first print works, the other one gives a closed stream error.
I need InputStreams, because all my other methods expect an InputStream and not a String. Also, I want to be able to retrieve images (or maybe other types of files), not just text files.
I can work around this pretty easily by giving up on the response handler (I have a working implementation that doesn't use it), but I'm pretty curious about the following:
Why does it do what it does?
How do I open the stream, if something closes it?
What's the right way to do this, anyway :)?
I've checked the docs and I couldn't find anything useful regarding this issue. To save you a bit of Googling, here's the Javadoc and here's the HttpClient tutorial (Section 1.1.8 - Response handlers).
Thanks,
Alex
It closes the stream because the ResponseHandler must handle the whole response. Even if you got an open stream back, it would already be positioned at the end of the stream.
The stream is closed by BasicHttpEntity's consumeContent() call to ensure you don't read from the stream again.
In your case, you don't really need ResponseHandler.
The automatic resource management that kicks in closes the stream for you, to make sure all resources are freed and ready for the next task.
If you want streams, then your best bet is to copy the content to a byte array and return a ByteArrayInputStream, provided the content is relatively modest.
If the content is not modest, then you'll have to do the resource management yourself and not use the ResponseHandler.
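A minimal sketch of that buffering approach, assuming the content fits comfortably in memory (EntityUtils is org.apache.http.util.EntityUtils; toByteArray reads the entity fully before the connection is released):
public class BufferingResponseHandler implements ResponseHandler<InputStream> {
    public InputStream handleResponse(HttpResponse response)
            throws ClientProtocolException, IOException {
        int statusCode = response.getStatusLine().getStatusCode();
        if (statusCode != HttpStatus.SC_OK) {
            throw new HttpResponseException(statusCode, null);
        }
        HttpEntity entity = response.getEntity();
        if (entity == null) {
            return null;
        }
        // Copy the whole entity into memory while the connection is still open;
        // the returned stream stays readable after HttpClient cleans up.
        byte[] bytes = EntityUtils.toByteArray(entity);
        return new ByteArrayInputStream(bytes);
    }
}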
Looking for a bit of help: I have currently written an HTTP server. It handles GET requests fine. However, while handling a POST, the BufferedReader seems to hang. When the request is stopped, the rest of the input stream is read via the BufferedReader. I have found a few things on Google, and I have tried changing the CRLF and the protocol version from 1.1 to 1.0 (browsers automatically make requests as 1.1). Any ideas or help would be appreciated. Thanks
I agree with Hans that you should use a standard and well-tested library to do this. However, if you are writing a server to learn about HTTP, here's some info on doing what you want to do.
You really can't use a BufferedReader because it buffers the input and might read too many bytes from the socket. That's why your code is hanging, the BufferedReader is trying to read more bytes than are available on the socket (since the POST data doesn't have an end of line), and it is waiting for more bytes (which will never be available).
The process to simply parse a POST request is to use the InputStream directly (a sketch follows these steps):
For each line in the header
read a byte at a time until you get a '\r' and then a '\n'
Look for a line that starts with "Content-Length: ", extract the number at the end of that line.
When you get a header line that is empty, you're done with headers.
Now read exactly the # of bytes that came from the Content-Length header.
Now you can write your response.
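A minimal sketch of those steps, assuming ASCII headers and a hand-rolled readLine helper over the raw InputStream:
// Scan the headers line by line, looking for Content-Length.
int contentLength = 0;
String line;
while (!(line = readLine(in)).isEmpty()) {
    if (line.toLowerCase().startsWith("content-length:")) {
        contentLength = Integer.parseInt(line.substring("content-length:".length()).trim());
    }
}

// Read exactly Content-Length bytes of POST data.
byte[] body = new byte[contentLength];
int off = 0;
while (off < contentLength) {
    int n = in.read(body, off, contentLength - off);
    if (n == -1) break; // connection closed early
    off += n;
}

// Reads a single CRLF-terminated line, one byte at a time,
// so no bytes beyond the line are consumed from the socket.
static String readLine(InputStream in) throws IOException {
    StringBuilder sb = new StringBuilder();
    int b;
    while ((b = in.read()) != -1 && b != '\n') {
        if (b != '\r') sb.append((char) b);
    }
    return sb.toString();
}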
Wouldn't write my own implementation. Look at the following existing components, if you want:
a HTTP client: Apache HttpClient
a HTTP server implementation: Apache HttpComponents core (as mentioned by Bombe)
This is not safe! But it shows how to get the POST data from an InputStream after the initial HTTP headers.
This also only works for POST data coming in as "example=true&bad=false" etc.
private HashMap hashMap = new HashMap();
private StringBuffer buff = new StringBuffer();
private int c = 0;
private String[] post;

public PostInputStream(InputStream in) {
    try {
        // Initializes buff from the available bytes
        if (in.available() != 0) {
            this.buff.appendCodePoint((this.c = in.read()));
            while (0 != in.available()) {
                //Console.output(buff.toString());
                buff.appendCodePoint((this.c = in.read()));
            }
            this.post = buff.toString().split("&");
            for (int i = 0; i < this.post.length; i++) {
                String[] n = this.post[i].split("=");
                if (n.length == 2) {
                    hashMap.put(URLDecoder.decode(n[0], "UTF-8"), URLDecoder.decode(n[1], "UTF-8"));
                } else {
                    Console.error("Malformed Post Request.");
                }
            }
        } else {
            Console.error("No POST Data");
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
As karoroberts said, you have to check the length of the content sent in the POST. But you can still use a BufferedReader.
Just check the Content-Length header for the size; after finishing reading all the headers, you can allocate a char array of that size and read the POST content into it:
char[] buffer = new char[contentLength];
request.read(buffer);
Where request is the BufferedReader.
If you need the POST content in a string, you can use: String.valueOf(buffer);
Note: BufferedReader.read returns an int with the number of characters read, so you can check there for inconsistencies with the Content-Length header.
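A sketch of such a check, looping because a single read may return fewer characters than requested (contentLength is assumed to have been parsed from the Content-Length header):
char[] buffer = new char[contentLength];
int off = 0;
while (off < contentLength) {
    int n = request.read(buffer, off, contentLength - off);
    if (n == -1) break; // stream ended before Content-Length was reached
    off += n;
}
String body = String.valueOf(buffer, 0, off);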