Java Socket not receiving body of http request - java

I'm trying to read an HTTP request using only the Socket and BufferedReader classes in Java. The problem is that I can't reach the body of the request. The BufferedReader gives me only the request line and the headers. Here is part of the code:
bufferedReader = new BufferedReader(new InputStreamReader(socket.getInputStream()));
String comando = "";
while((msgDoSocket = bufferedReader.readLine()) != null){
//telaOutput.adicionaFim(msgDoSocket);
try {
comando += msgDoSocket + " ";
//System.out.println(comando);
if(msgDoSocket.isEmpty()){
processaInput(comando);
}
} catch (Exception ex) {
Logger.getLogger(ServerThread.class.getName()).log(Level.SEVERE, null, ex);
}
}
Here is a Wireshark capture showing that the POST body is being sent. My program is running on port 15000 and the data is just the string "teste12345". I'm using the Postman app for Google Chrome to send the requests.
I'm having exactly the same problem described in this thread, but the solutions proposed there didn't work. The request is still read only up to the last header and no further. Thanks in advance.
Edit: Problem Solved!
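Following the suggestion proposed in the answer, I changed the reading to:
reader = new DataInputStream(socket.getInputStream());
String comando = "";
int dt; // read() returns 0-255, or -1 at end of stream (readByte() would throw EOFException instead)
while ((dt = reader.read()) != -1) {
    comando += (char) dt; // append the character, not the numeric byte value
    //... do the rest of the stuff
}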
Reading it as binary made it possible to reach the body part of the request.

I'm far from being a Java guru, but I bet the problem is that readLine only returns once it has seen a line terminator (\n, \r, or \r\n) or the end of the stream. Since your body is not terminated with \r\n and the client keeps the connection open, readLine never returns for the body. Try manually adding that character sequence to your body and see what happens, or alternatively, read the body from the raw InputStream as a byte array.
Nevertheless, you can't expect an HTTP body to be a string. It can also be a binary sequence that knows nothing about \r\n.
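As a rough illustration of reading the body as bytes, here is a minimal sketch. It assumes the headers have already been consumed and that the body length is known from the Content-Length header; the class and method names (BodyReader, readBody) are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class BodyReader {
    // Hypothetical helper: reads exactly contentLength bytes of body.
    // Assumes the stream is positioned right after the blank line that ends the headers.
    static byte[] readBody(InputStream in, int contentLength) throws IOException {
        byte[] body = new byte[contentLength];
        // readFully blocks until the array is full and throws EOFException
        // if the peer closes the connection before that.
        new DataInputStream(in).readFully(body);
        return body;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for socket.getInputStream(), already past the headers.
        byte[] wire = "teste12345".getBytes(StandardCharsets.US_ASCII);
        byte[] body = readBody(new ByteArrayInputStream(wire), wire.length);
        System.out.println(new String(body, StandardCharsets.US_ASCII));
    }
}
```

Because the result is a byte array, it works the same for text and binary bodies; decoding to a String is left to the caller.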

Related

Unsuccessful in trying to reuse Java client socket

I have a software driver which communicates with a third-party controller; I have an API for using the latter but no visibility of its source code, and the supplier is not co-operative in trying to improve things!
The situation is as follows.
To send a request to the controller, I send an XML packet as the content of an HTTP POST to a servlet, which then sends me the response. The original code, implemented by a previous developer, works stably using java.net.Socket. However, our driver is implemented such that a new socket is created for EVERY request sent and, if the driver gets busy, the third-party controller struggles to keep up in terms of socket handling. In fact, their support guy said to me: "You really need to leave 5 seconds between each request...". This simply isn't commercially acceptable.
To improve performance, I wanted to try leaving our end of the socket open and reusing the socket pretty much indefinitely (given that connections can drop unexpectedly of course, but that's the least of my concerns and is manageable). However, whatever I seem to do, the effect is that if I use Comms.getSocket(false), a new socket is created for each request and everything works OK but bottlenecks when busy. If I use Comms.getSocket(true), the following happens:
Controller is sent first request
Controller responds to first request
Controller is sent second request (maybe 5 seconds later)
Controller never responds to second request or anything after it
postRequest() keeps getting called: for the first 12 seconds, the console outputs "Input shut down ? false" but, after that, the code no longer reaches there and doesn't get past the bw.write() and bw.flush() calls.
The controller allows both HTTP 1.0 and 1.1 but their docs say zilch about keep-alive. I've tried both and the code below shows that I've added Keep-Alive headers as well but the controller, as server, I'm guessing is ignoring them -- I don't think I have any way of knowing, do I ? When in HTTP 1.0 mode, the controller certainly returns a "Connection: close" but doesn't do that in HTTP 1.1 mode.
The likelihood is then that the server side is insisting on a "one socket per request" approach.
However, I wondered if I might be doing anything wrong (or missing something) in the following code to achieve what I want:
private String postRequest() throws IOException {
String resp = null;
String logMsg;
StringBuilder sb = new StringBuilder();
StringBuilder sbWrite = new StringBuilder();
Comms comms = getComms();
Socket socket = comms.getSocket(true);
BufferedReader br = comms.getReader();
BufferedWriter bw = comms.getWriter();
if (null != socket) {
System.out.println("Socket closed ? " + socket.isClosed());
System.out.println("Socket bound ? " + socket.isBound());
System.out.println("Socket connected ? " + socket.isConnected());
// Write the request
sbWrite
.append("POST /servlet/receiverServlet HTTP/1.1\r\n")
.append("Host: 192.168.200.100\r\n")
.append("Connection: Keep-Alive\r\n")
.append("Keep-Alive: timeout=10\r\n")
.append("Content-Type: text/xml\r\n")
.append("Content-Length: " + requestString.length() + "\r\n\r\n")
.append(requestString);
System.out.println("Writing:\n" + sbWrite.toString());
bw.write(sbWrite.toString());
bw.flush();
// Read the response
System.out.println("Input shut down ? " + socket.isInputShutdown());
String line;
boolean flag = false;
while ((line = br.readLine()) != null) {
System.out.println("Line: <" + line + ">");
if (flag) sb.append(line);
if (line.isEmpty()) flag = true;
}
resp = sb.toString();
}
else {
System.out.println("Socket not available");
}
return resp; // Another method will parse the response
}
To ease testing, I provide the socket using an extra Comms helper class and a method called getSocket(boolean reuse) where I can choose to always create a new socket or reuse the one that Comms creates for me, as follows:
public Comms(String ip, int port) {
this.ip = ip;
this.port = port;
initSocket();
}
private void initSocket() {
try {
socket = new Socket(ip, port);
socket.setKeepAlive(true);
socket.setPerformancePreferences(1, 0, 0);
socket.setReuseAddress(true);
bw = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8));
br = new BufferedReader(new InputStreamReader(socket.getInputStream(), StandardCharsets.UTF_8));
System.out.println("### CREATED NEW SOCKET");
}
catch (UnknownHostException uhe) {
System.out.println("### UNKNOWN HOST FOR SOCKET");
}
catch (IOException ioe) {
System.out.println("### SOCKET I/O EXCEPTION");
}
}
public BufferedReader getReader() { return br; }
public BufferedWriter getWriter() { return bw; }
public Socket getSocket(boolean reuse) {
if (! reuse) initSocket();
return socket;
}
Can anyone help ?
Assuming keep-alive is working as expected, I think the line while ((line = br.readLine()) != null) is the faulty one, as it is effectively an infinite loop.
readLine() returns null only when there is no more data to read, i.e. at EOF, when the other side closes the connection. That defeats your socket-reuse approach: on a stream that stays open, readLine() never returns null; it just blocks.
You need to fix the algorithm for reading the response (why not use an existing HTTP client?): check the Content-Length header and, once you have read that many bytes of body, return and keep the socket alive for the next request.
After setting flag to true, you need to know what kind of data to read (consider the MIME/Content-Type) as well as its length, so reading the body with readLine() may not be good practice here.
Also make sure the server allows persistent connections by checking whether it responds with the same Connection: keep-alive header.
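One way to check whether the server intends to keep the connection open is to inspect the status line and the Connection header of its response. The sketch below is only an illustration: KeepAliveCheck and serverKeepsConnectionOpen are hypothetical names, and the header matching is simplified (it looks for the tokens "close" and "keep-alive" anywhere in the value). It relies on the rule that HTTP/1.1 connections are persistent unless "Connection: close" is sent, while HTTP/1.0 connections are not persistent unless keep-alive is explicitly agreed.

```java
import java.util.List;
import java.util.Locale;

final class KeepAliveCheck {
    // statusLine is e.g. "HTTP/1.1 200 OK"; headerLines are raw "Name: value" lines.
    static boolean serverKeepsConnectionOpen(String statusLine, List<String> headerLines) {
        boolean http11 = statusLine.startsWith("HTTP/1.1");
        String connection = null;
        for (String header : headerLines) {
            int colon = header.indexOf(':');
            // Header names are case-insensitive.
            if (colon > 0 && header.substring(0, colon).trim().equalsIgnoreCase("Connection")) {
                connection = header.substring(colon + 1).trim().toLowerCase(Locale.ROOT);
            }
        }
        if (connection == null) {
            return http11;            // 1.1 defaults to persistent, 1.0 does not
        }
        if (connection.contains("close")) {
            return false;
        }
        if (connection.contains("keep-alive")) {
            return true;
        }
        return http11;                // unrelated Connection token: fall back to the default
    }
}
```

If this returns false for the controller's responses, the socket has to be reopened for the next request no matter how the client reads the response.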

Incrementally handling twitter's streaming api using apache httpclient?

I am using Apache HTTPClient 4 to connect to twitter's streaming api with default level access. It works perfectly well in the beginning but after a few minutes of retrieving data it bails out with this error:
2012-03-28 16:17:00,040 DEBUG org.apache.http.impl.conn.SingleClientConnManager: Get connection for route HttpRoute[{tls}->http://myproxy:80->https://stream.twitter.com:443]
2012-03-28 16:17:00,040 WARN com.cloudera.flume.core.connector.DirectDriver: Exception in source: TestTwitterSource
java.lang.IllegalStateException: Invalid use of SingleClientConnManager: connection still allocated.
Make sure to release the connection before allocating another one.
at org.apache.http.impl.conn.SingleClientConnManager.getConnection(SingleClientConnManager.java:216)
at org.apache.http.impl.conn.SingleClientConnManager$1.getConnection(SingleClientConnManager.java:190)
I understand why I am facing this issue. I am trying to use this HttpClient in a flume cluster as a flume source. The code looks like this:
public Event next() throws IOException, InterruptedException {
try {
HttpHost target = new HttpHost("stream.twitter.com", 443, "https");
new BasicHttpContext();
HttpPost httpPost = new HttpPost("/1/statuses/filter.json");
StringEntity postEntity = new StringEntity("track=birthday",
"UTF-8");
postEntity.setContentType("application/x-www-form-urlencoded");
httpPost.setEntity(postEntity);
HttpResponse response = httpClient.execute(target, httpPost,
new BasicHttpContext());
BufferedReader reader = new BufferedReader(new InputStreamReader(
response.getEntity().getContent()));
String line = null;
StringBuffer buffer = new StringBuffer();
while ((line = reader.readLine()) != null) {
buffer.append(line);
if(buffer.length()>30000) break;
}
return new EventImpl(buffer.toString().getBytes());
} catch (IOException ie) {
throw ie;
}
}
I am trying to buffer 30,000 characters from the response stream in a StringBuffer and then return this as the data received. I am obviously not closing the connection, but I guess I do not want to close it just yet. Twitter's dev guide talks about this here. It reads:
Some HTTP client libraries only return the response body after the
connection has been closed by the server. These clients will not work
for accessing the Streaming API. You must use an HTTP client that will
return response data incrementally. Most robust HTTP client libraries
will provide this functionality. The Apache HttpClient will handle
this use case, for example.
It clearly tells you that HttpClient will return response data incrementally. I've gone through the examples and tutorials, but I haven't found anything that comes close to doing this. If you guys have used a httpclient (if not apache) and read the streaming api of twitter incrementally, please let me know how you achieved this feat. Those who haven't, please feel free to contribute to answers. TIA.
UPDATE
I tried doing this: 1) I moved obtaining the stream handle to the open method of the flume source. 2) I now use a plain InputStream and read data into a byte buffer. So here is what the method body looks like now:
byte[] buffer = new byte[30000];
while (true) {
int count = instream.read(buffer);
if (count == -1)
continue;
else
break;
}
return new EventImpl(buffer);
This works to an extent: I get tweets, and they are nicely written to a destination. The problem is with the return value of instream.read(buffer). Even when the stream delivers only a little data, the whole 30,000-byte buffer is written to the destination, including the untouched default \u0000 bytes. So the destination file looks like this: " tweets..tweets..tweeets.. \u0000\u0000\u0000\u0000\u0000\u0000\u0000...tweets..tweets... ". I understand that count won't be -1 because this is a never-ending stream, so how do I figure out how much of the buffer holds new content after a read?
The problem is that your code is leaking connections. Please make sure that no matter what you either close the content stream or abort the request.
InputStream instream = response.getEntity().getContent();
try {
BufferedReader reader = new BufferedReader(
new InputStreamReader(instream));
String line = null;
StringBuffer buffer = new StringBuffer();
while ((line = reader.readLine()) != null) {
buffer.append(line);
if (buffer.length()>30000) {
httpPost.abort();
// connection will not be re-used
break;
}
}
return new EventImpl(buffer.toString().getBytes());
} finally {
// if request is not aborted the connection can be re-used
try {
instream.close();
} catch (IOException ex) {
// log or ignore
}
}
It turns out that it was a Flume issue. Flume is optimized to transfer events of up to 32 KB; for anything beyond 32 KB, Flume bails out. (The workaround is to tune the event size to be greater than 32 KB.) So I've changed my code to buffer at least 20,000 characters. It kind of works, but it is not foolproof. It can still fail if the buffer length exceeds 32 KB. However, it hasn't failed so far in an hour of testing; I believe that is because Twitter doesn't send a lot of data on its public stream.
while ((line = reader.readLine()) != null) {
buffer.append(line);
if(buffer.length()>20000) break;
}
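For the zero-byte padding described in the update above, the value returned by read is the number of bytes actually placed in the buffer, so only that prefix should be passed on. A minimal sketch, with ChunkReader and readChunk as made-up names:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class ChunkReader {
    // Returns only the bytes actually read, or null once the stream has ended.
    static byte[] readChunk(InputStream in, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        // Blocks until at least one byte is available, then returns how many were stored.
        int count = in.read(buffer);
        if (count == -1) {
            return null; // end of stream
        }
        // Trim the buffer so untouched zero bytes are never handed on.
        return Arrays.copyOf(buffer, count);
    }
}
```

In the flume source this would mean something like new EventImpl(readChunk(instream, 30000)), under the assumption that EventImpl accepts a byte array as in the question. Note that a chunk can end in the middle of a tweet, or even in the middle of a multi-byte character, so splitting on record boundaries still has to happen somewhere downstream.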

HTTP GET request not working in java when HTTP is 1.1?

So I wrote a little program that downloads 4chan pages. I get the raw HTML page and parse it for my needs. The code below was working fine, but it suddenly stopped working. When I run it, the server does not accept my request; it seems to be waiting for something more. However, as far as I know the HTTP request should look like this:
GET /ck HTTP/1.1
Host: boards.4chan.org
(extra new line)
If I change this format in any way, I receive a "400 Bad Request" status code. But if I change HTTP/1.1 to 1.0, the server responds with "200 OK" and I get the whole page. This makes me think the error is in the Host line, since that became mandatory in HTTP/1.1, but I still cannot figure out what exactly needs to be changed.
The calling function is simply this, to get one whole board:
downloadHTMLThread( "ck", -1);
Or, for a specific thread, you just change -1 to the thread number. For example, for the link below the call looks like this:
//http://boards.4chan.org/ck/res/3507158
//url.getDefaultPort() is 80
//url.getHost() is boards.4chan.org
//url.getFile() is /ck/res/3507158
downloadHTMLThread( "ck", 3507158);
Any advice would be appreciated, thanks.
public static final String BOARDS = "boards.4chan.org";
public static final String IMAGES = "images.4chan.org";
public static final String THUMBS = "thumbs.4chan.org";
public static final String RES = "/res/";
public static final String HTTP = "http://";
public static final String SLASH = "/";
public String downloadHTMLThread( String board, int thread) {
BufferedReader reader = null;
PrintWriter out = null;
Socket socket = null;
String str = null;
StringBuilder input = new StringBuilder();
try {
URL url = new URL(HTTP+BOARDS+SLASH+board+(thread==-1?SLASH:RES+thread));
socket = new Socket( url.getHost(), url.getDefaultPort());
reader = new BufferedReader( new InputStreamReader( socket.getInputStream()));
out = new PrintWriter(socket.getOutputStream(), true);
out.println( "GET " +url.getFile()+ " HTTP/1.1");
out.println( "HOST: " + url.getHost());
out.println();
long start = System.currentTimeMillis();
while ((str = reader.readLine()) != null) {
input.append( str).append("\r\n");
}
long end = System.currentTimeMillis();
System.out.println( input);
System.out.println( "\nTime: " +(end-start)+ " milliseconds");
} catch (Exception ex) {
ex.printStackTrace();
input = null;
} finally {
if( reader!=null){
try {
reader.close();
} catch (IOException ioe) {
// nothing to see here
}
}
if( socket!=null){
try {
socket.close();
} catch (IOException ioe) {
// nothing to see here
}
}
if( out!=null){
out.close();
}
}
return input==null? null: input.toString();
}
Try using Apache HttpClient instead of rolling your own:
static String getUriContentsAsString(String uri) throws IOException {
HttpClient client = new DefaultHttpClient();
HttpResponse response = client.execute(new HttpGet(uri));
return EntityUtils.toString(response.getEntity());
}
If you are doing this to really learn the internals of HTTP client requests, then you might start by playing with curl from the command line. This will let you get all your headers and request body squared away. Then it will be a simple matter of adjusting your request to match what works in curl.
From the code, I see that you are sending 'HOST' instead of 'Host'. Header names are supposed to be case-insensitive, so a compliant server should accept either, but since this header is compulsory in HTTP/1.1 and not required in HTTP/1.0, it is worth trying the conventional spelling.
Anyway, you could use a program to capture the packets sent (e.g. Wireshark), just to make sure.
Using println is convenient, but the line separator it appends depends on the system property line.separator, whereas HTTP requires each line to end with '\r\n'. If you're capturing the packets, it would be a good idea to check that each line sent ends with '\r\n' (bytes 0x0D 0x0A), in case your OS line separator is different.
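The point about println and line.separator can be sketched as follows: build the request with explicit CRLF line endings and write the bytes directly. RequestBuilder and buildGet are hypothetical names. The "Connection: close" header is an addition that is not in the question; it is an assumption that it helps here, on the grounds that HTTP/1.1 connections stay open by default, so a loop that reads until readLine() returns null would otherwise wait for the server to time out.

```java
import java.nio.charset.StandardCharsets;

final class RequestBuilder {
    private static final String CRLF = "\r\n";

    // Builds a minimal GET request whose line endings do not depend on the platform.
    static byte[] buildGet(String host, String path) {
        String request = "GET " + path + " HTTP/1.1" + CRLF
                + "Host: " + host + CRLF
                + "Connection: close" + CRLF   // assumption: ask the server to close after responding
                + CRLF;                        // blank line ends the header section
        return request.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Print the request with the control characters made visible.
        String text = new String(buildGet("boards.4chan.org", "/ck/"), StandardCharsets.US_ASCII);
        System.out.println(text.replace("\r", "\\r").replace("\n", "\\n\n"));
    }
}
```

With a socket, the bytes would be sent with socket.getOutputStream().write(buildGet(host, path)) followed by flush(), instead of going through a PrintWriter.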
Use www.4chan.org as the host instead. Since boards.4chan.org is a 302 redirect to www.4chan.org, you won't be able to scrape anything from boards.4chan.org.

Small http server using java?

I have created the following test server using java:
import java.io.*;
import java.net.*;
class tcpServer{
public static void main(String args[]){
ServerSocket s = null;
try{
s = new ServerSocket(7896);
//right now the stream is open.
while(true){
Socket clientSocket = s.accept();
Connection c = new Connection(clientSocket);
//now the connection is established
}
}catch(IOException e){
System.out.println("Unable to read: " + e.getMessage());
}
}
}
class Connection extends Thread{
Socket clientSocket;
BufferedReader din;
OutputStreamWriter outWriter;
public Connection(Socket clientSocket){
try{
this.clientSocket = clientSocket;
din = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(), "ASCII"));
outWriter = new OutputStreamWriter(clientSocket.getOutputStream());
this.start();
}catch(IOException e){
System.out.println("Connection: " + e.getMessage());
}
}
public void run(){
try{
String line = null;
while((line = din.readLine())!=null){
System.out.println("Read" + line);
if(line.length()==0)
break;
}
//here write the content type etc details:
System.out.println("Someone connected: " + clientSocket);
outWriter.write("HTTP/1.1 200 OK\r\n");
outWriter.write("Date: Tue, 11 Jan 2011 13:09:20 GMT\r\n");
outWriter.write("Expires: -1\r\n");
outWriter.write("Cache-Control: private, max-age=0\r\n");
outWriter.write("Content-type: text/html\r\n");
outWriter.write("Server: vinit\r\n");
outWriter.write("X-XSS-Protection: 1; mode=block\r\n");
outWriter.write("<html><head><title>Hello</title></head><body>Hello world from my server</body></html>\r\n");
}catch(EOFException e){
System.out.println("EOF: " + e.getMessage());
}
catch(IOException e){
System.out.println("IO at run: " + e.getMessage());
}finally{
try{
outWriter.close();
clientSocket.close();
}catch(IOException e){
System.out.println("Unable to close the socket");
}
}
}
}
Now I want this server to respond to my browser, so I opened the URL http://localhost:7896
and this is what I receive on the server side:
ReadGET / HTTP/1.1
ReadHost: localhost:7896
ReadConnection: keep-alive
ReadCache-Control: max-age=0
ReadAccept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
ReadUser-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10
ReadAccept-Encoding: gzip,deflate,sdch
ReadAccept-Language: en-US,en;q=0.8
ReadAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
ReadCookie: test_cookie=test cookie
Read
Someone connected: Socket[addr=/0:0:0:0:0:0:0:1,port=36651,localport=7896]
In the browser (Google Chrome) I get a blank white screen, and the page source is also blank.
Can anyone please tell me where I'm going wrong? I am new to this, so please correct me.
Thanks in advance.
You almost certainly don't want to be using DataOutputStream on the response - and writeUTF certainly isn't going to do what you want. DataOutputStream is designed for binary protocols, basically - and writeUTF writes a length-prefixed UTF-8 string, whereas HTTP just wants CRLF-terminated lines of ASCII text.
You want to write headers out a line at a time - so create an OutputStreamWriter around the socket output stream, and write to that:
writer.write("HTTP/1.1 200 OK\r\n");
writer.write("Date: Tue, 11 Jan 2011 13:09:20 GMT\r\n");
etc.
You may want to write your own writeLine method to write out a line including the CRLF at the end (don't use the system default line terminator), to make the code cleaner.
Add a blank line between the headers and the body as well, and then you should be in reasonable shape.
EDIT: Two more changes:
Firstly, you should read the request from the client. For example, change din to a BufferedReader, and initialize it like this:
din = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(),
"ASCII"));
then before you start to write the output, read the request like this:
String line;
while ((line = din.readLine()) != null) {
System.out.println("Read " + line);
if (line.length() == 0) {
break;
}
}
EDIT: As noted in comments, this wouldn't be appropriate for a full HTTP server, as it wouldn't handle binary PUT/POST data well (it may read the data into its buffer, meaning you couldn't then read it as binary data from the stream). It's fine for the test app though.
Finally, you should also either close the output writer or at least flush it - otherwise it may be buffering the data.
After making those changes, your code worked for me.
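Putting those suggestions together (a writeLine helper with an explicit CRLF, a blank line between headers and body, and a flush at the end), a response writer could look roughly like the sketch below. ResponseWriter, writeLine and writeHtml are illustrative names, and the Content-Length and Connection headers are additions that go beyond what this answer describes.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class ResponseWriter {
    // Writes one header line terminated by CRLF, independent of the platform line separator.
    static void writeLine(OutputStream out, String line) throws IOException {
        out.write((line + "\r\n").getBytes(StandardCharsets.US_ASCII));
    }

    // Writes a complete 200 response carrying the given HTML document.
    static void writeHtml(OutputStream out, String html) throws IOException {
        byte[] body = html.getBytes(StandardCharsets.UTF_8);
        writeLine(out, "HTTP/1.1 200 OK");
        writeLine(out, "Content-Type: text/html; charset=utf-8");
        writeLine(out, "Content-Length: " + body.length); // length in bytes, not characters
        writeLine(out, "Connection: close");
        writeLine(out, "");                               // blank line separates headers from body
        out.write(body);
        out.flush();                                      // make sure nothing stays buffered
    }
}
```

In the Connection class from the question this would be called as ResponseWriter.writeHtml(clientSocket.getOutputStream(), html) after the request headers have been read.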
If you're interested in learning the design and development of network servers like HTTP servers in Java, you might also have a look at this repo:
https://github.com/berb/java-web-server
It's a small HTTP server in Java that I started for educational purposes. It shouldn't be used in production or for serious use cases yet, and I'm still adding new features. It currently provides multi-threading, static file handling, Basic Authentication, logging and an in-memory cache.
EDIT
An obvious error in your code is the missing \r\n between your response headers and your HTML. Just append an additional \r\n to your last header. Additionally, you must provide the content length, unless you're using chunked encoding:
String out = "<html><head><title>Hello</title></head><body>Hello world from my server</body></html>\r\n";
outWriter.write("Content-Length: "+out.getBytes().length+"\r\n\r\n");
outWriter.write(out);
The HTTP protocol is ASCII-based, except for the body, whose encoding depends on the Content-Type header. So, no UTF-8 headers!
Headers and body must be separated by an empty line.
Why do you set your Transfer-Encoding to chunked? Your body is not.
Check this out, it's already done for you:
http://www.mcwalter.org/technology/java/httpd/tiny/index.html
I'm not sure writeUTF is the right call here; you may need to use writeBytes instead. Also, you need to terminate each line with a line break (HTTP expects '\r\n').

Buffered Reader HTTP POST

Looking for a bit of help. I have written an HTTP server, and it currently handles GET requests fine. However, with POST the BufferedReader seems to hang. When the request is stopped, the rest of the input stream is read via the BufferedReader. I have found a few things on Google, and I have tried changing the CRLF and the protocol version from 1.1 to 1.0 (browsers automatically make requests as 1.1). Any ideas or help would be appreciated. Thanks.
I agree with Hans that you should use a standard and well-tested library to do this. However, if you are writing a server to learn about HTTP, here's some info on doing what you want to do.
You really can't use a BufferedReader because it buffers the input and might read too many bytes from the socket. That's why your code is hanging: the BufferedReader is trying to read more bytes than are available on the socket (since the POST data doesn't end with a line terminator), and it is waiting for more bytes (which will never arrive).
The process to simply parse a POST request is to use the InputStream directly:
1. For each line in the header, read a byte at a time until you get a '\r' and then a '\n'.
2. Look for a line that starts with "Content-Length: " and extract the number at the end of that line.
3. When you get a header line that is empty, you're done with headers.
4. Now read exactly the number of bytes given by the Content-Length header.
5. Now you can write your response.
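The steps above can be sketched as follows. This is a simplified illustration, not a complete parser: PostParser, readLine and readPostBody are made-up names, a missing Content-Length is treated as an empty body, and chunked transfer encoding is not handled.

```java
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class PostParser {
    // Reads one line a byte at a time. CR bytes are dropped and LF ends the line.
    // Returns null if the stream ends before any byte of a new line is read.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            if (b != '\r') {
                line.write(b);
            }
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Consumes the request line and headers, then reads exactly Content-Length bytes of body.
    static byte[] readPostBody(InputStream in) throws IOException {
        int contentLength = 0; // assumption: no Content-Length header means no body
        String line;
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                contentLength = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        byte[] body = new byte[contentLength];
        int offset = 0;
        while (offset < contentLength) {
            // read may return fewer bytes than requested, so keep going until the body is complete.
            int n = in.read(body, offset, contentLength - offset);
            if (n == -1) {
                throw new EOFException("connection closed before the full body arrived");
            }
            offset += n;
        }
        return body;
    }
}
```

Because nothing is read beyond the declared body length, the stream is left positioned at the start of the next request, which matters if the connection is kept alive.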
Wouldn't write my own implementation. Look at the following existing components, if you want:
a HTTP client: Apache HttpClient
a HTTP server implementation: Apache HttpComponents core (as mentioned by Bombe)
This is not safe! But shows how to get the POST data during an Input Stream after the initial HTTP Headers.
This also only works for POST data coming in as "example=true&bad=false" etc.
private HashMap hashMap = new HashMap();
private StringBuffer buff = new StringBuffer();
private int c = 0;
private String[] post;
public PostInputStream(InputStream in) {
try {
//Initalizes avaliable buff
if (in.available() != 0) {
this.buff.appendCodePoint((this.c = in.read()));
while (0 != in.available()) {
//Console.output(buff.toString());
buff.appendCodePoint((this.c = in.read()));
}
this.post = buff.toString().split("&");
for (int i = 0; i < this.post.length; i++) {
String[] n = this.post[i].split("=");
if (n.length == 2) {
hashMap.put(URLDecoder.decode(n[0], "UTF-8"), URLDecoder.decode(n[1], "UTF-8"));
} else {
Console.error("Malformed Post Request.");
}
}
} else {
Console.error("No POST Data");
}
} catch (Exception e) {
e.printStackTrace();
}
}
As karoroberts said, you have to check the length of the content sent in the POST. But you can still use BufferedReader.
You just have to read the Content-Length header to get the size, and after you have finished reading all the headers you can allocate a char array of that size and read the POST content into it:
char[] buffer = new char[contentLength];
request.read(buffer);
Where request is the BufferedReader.
If you need the POST content in a string, you can use: String.valueOf(buffer);
Note: BufferedReader.read returns the number of characters actually read as an int, which can be less than the array size, so you can check there for inconsistencies with the Content-Length header.
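Since a single read call may return fewer characters than requested, a loop is safer. The sketch below uses made-up names (CharBodyReader, readBody) and assumes a single-byte encoding such as ASCII or ISO-8859-1, because Content-Length counts bytes while the reader counts characters.

```java
import java.io.BufferedReader;
import java.io.IOException;

final class CharBodyReader {
    // Reads up to contentLength characters of body from a reader that has
    // already consumed the request line, the headers and the blank line.
    static String readBody(BufferedReader request, int contentLength) throws IOException {
        char[] buffer = new char[contentLength];
        int offset = 0;
        while (offset < contentLength) {
            int n = request.read(buffer, offset, contentLength - offset);
            if (n == -1) {
                break; // stream ended early: the body is shorter than Content-Length claimed
            }
            offset += n;
        }
        // Only the characters actually read are returned.
        return new String(buffer, 0, offset);
    }
}
```

Comparing the length of the returned string with contentLength is then the inconsistency check mentioned in the note above.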
