I'm trying to write an HTTP proxy in Java using only the Socket class. I had attempted to construct one earlier, and I was successfully sending a request by writing to the socket's output stream, but I am having a hard time reading the response. The research I have conducted suggests that I should use the input stream and read it line by line, but I have not been able to read any web pages successfully using this method. Would anyone have any suggestions as to where I could go from here?
My code actually uses a byte buffer to read the page from the input stream as bytes:
InputStream input = clientSocket.getInputStream();
byte[] buffer = new byte[48 * 1024];
byte[] redData;
StringBuilder clientData = new StringBuilder();
String redDataText;
int red;
// read() returns -1 at end of stream; anything >= 0 is the number of bytes read
while ((red = input.read(buffer)) > -1) {
    redData = new byte[red];
    System.arraycopy(buffer, 0, redData, 0, red);
    redDataText = new String(redData, "UTF-8");
    System.out.println("Got message!! " + redDataText);
    clientData.append(redDataText);
}
If you are asking for a way to read an InputStream line by line, this may serve you:
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(input, "UTF-8"));
String line;
StringBuilder clientData = new StringBuilder();
while ((line = bufferedReader.readLine()) != null) {
    clientData.append(line);
}
You have to be careful not to read an InputStream in this fashion unless you are sure a priori that it contains only plain text (and not binary data). Also note that readLine() strips the line terminators, which matters if you need to forward the exact bytes, as a proxy does.
BTW: For the sake of efficiency, I recommend pre-sizing clientData with an initial capacity close to the expected final size (otherwise it starts with a small default capacity of 16 characters and has to be resized several times as it grows).
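A minimal illustration of that tip (the 48 KB figure is just the read-buffer size from the snippet above, standing in for whatever estimate of the final size you have):

// pre-sizing avoids the builder repeatedly growing and copying its internal array
StringBuilder clientData = new StringBuilder(48 * 1024);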
I am reading this file: https://www.reddit.com/r/tech/top.json?limit=100 into a BufferedReader from an HttpURLConnection. I've got it to read some of the file, but it only reads about a tenth of what it should. Changing the size of the input buffer doesn't change anything; it prints the same content, just in smaller chunks:
try {
    URL url = new URL(urlString);
    HttpURLConnection connection = (HttpURLConnection) url.openConnection();
    BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
    StringBuilder sb = new StringBuilder();
    int charsRead;
    char[] inputBuffer = new char[500];
    while (true) {
        charsRead = reader.read(inputBuffer);
        if (charsRead < 0) {
            break;
        }
        if (charsRead > 0) {
            sb.append(String.copyValueOf(inputBuffer, 0, charsRead));
            Log.d(TAG, "Value read " + String.copyValueOf(inputBuffer, 0, charsRead));
        }
    }
    reader.close();
    return sb.toString();
} catch (Exception e) {
    e.printStackTrace();
}
I believe the issue is that the text is all on one line since it's not formatted in json correctly, and BufferedReader can only take a line so long. Is there any way around this?
read() should continue to read as long as charsRead > 0. Each call to read() picks up where the previous one left off and continues until there is no more data. There is no limit to the total amount it can read; the only limit is the size of the array per call, not the overall size of the file.
You could try the following:
try (InputStream is = connection.getInputStream();
     ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
    int read = 0;
    byte[] buffer = new byte[4096];
    while ((read = is.read(buffer)) > 0) {
        baos.write(buffer, 0, read);
    }
    return new String(baos.toByteArray(), StandardCharsets.UTF_8);
} catch (Exception ex) {
    ex.printStackTrace(); // don't swallow the exception silently
}
The above reads the raw bytes from the stream into a ByteArrayOutputStream and then creates the string from those bytes.
I suggest using a third-party HTTP client. It could literally reduce your code to just a few lines, and you don't have to worry about all those little details. The bottom line is that someone has already written the code that you are trying to write, and it works and is well tested. A few suggestions:
Apache HttpClient - a well-known and popular HTTP client, but it might be a bit bulky and complicated for a simple case like yours.
OkHttp - another well-known HTTP client (a minimal sketch with it appears after the MgntUtils example below).
And finally, my favorite (because it is written by me): the MgntUtils open-source library, which includes an HTTP client. The Maven artifacts can be found here; the GitHub repository, which includes the library itself as a jar file, the source code, and the Javadoc, can be found here; and the Javadoc is here.
Just to demonstrate the simplicity of what you want to do, here is the code using the MgntUtils library (I tested the code and it works like a charm):
private static void testHttpClient() {
    HttpClient client = new HttpClient();
    client.setContentType("application/json; charset=utf-8");
    client.setConnectionUrl("https://www.reddit.com/r/tech/top.json?limit=100");
    String content = null;
    try {
        content = client.sendHttpRequest(HttpMethod.GET);
    } catch (IOException e) {
        content = client.getLastResponseMessage() + TextUtils.getStacktrace(e, false);
    }
    System.out.println(content);
}
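For comparison, here is a minimal sketch of the same request with OkHttp, mentioned above (untested here; it assumes the okhttp3 artifact is on the classpath):

import java.io.IOException;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;

private static void testOkHttp() {
    OkHttpClient client = new OkHttpClient();
    Request request = new Request.Builder()
            .url("https://www.reddit.com/r/tech/top.json?limit=100")
            .build();
    try (Response response = client.newCall(request).execute()) {
        System.out.println(response.body().string());
    } catch (IOException e) {
        e.printStackTrace();
    }
}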
My wild guess is that your default platform charset was UTF-8 and encoding problems arose. For remote content the encoding should be specified explicitly, not assumed to be equal to the default encoding on your machine.
The charset of the response data must be correct; for that, the headers must be inspected. The default should be Latin-1 (ISO-8859-1), but browsers interpret that as Windows Latin-1 (Cp1252).
String charset = connection.getContentType().replaceFirst("^.*?(charset=|$)", "");
if (charset.isEmpty()) {
    charset = "Windows-1252"; // Windows Latin-1
}
Then it is better to read bytes, as there is no exact correspondence between the number of bytes read and the number of chars read. If the first char of a surrogate pair (two UTF-16 chars that together form a code point above U+FFFF) lands at the end of a buffer, I do not know how efficient the underlying "repair" is.
BufferedInputStream in = new BufferedInputStream(connection.getInputStream());
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buffer = new byte[512];
while (true) {
    int bytesRead = in.read(buffer);
    if (bytesRead < 0) {
        break;
    }
    if (bytesRead > 0) {
        out.write(buffer, 0, bytesRead);
    }
}
return out.toString(charset);
And indeed it is safe to do:
sb.append(inputBuffer, 0, charsRead);
(Taking a copy was probably a repair attempt.)
By the way char[500] takes almost twice the memory of byte[512].
I saw in my browser that the site uses gzip compression. That makes sense for text such as JSON. I mimicked it by setting the request header Accept-Encoding: gzip.
URL url = new URL("https://www.reddit.com/r/tech/top.json?limit=100");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestProperty("Accept-Encoding", "gzip");
try (InputStream rawIn = connection.getInputStream()) {
    String charset = connection.getContentType().replaceFirst("^.*?(charset=|$)", "");
    if (charset.isEmpty()) {
        charset = "Windows-1252"; // Windows Latin-1
    }
    boolean gzipped = "gzip".equals(connection.getContentEncoding());
    System.out.println("gzip=" + gzipped);
    try (InputStream in = gzipped ? new GZIPInputStream(rawIn)
                                  : new BufferedInputStream(rawIn)) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[512];
        while (true) {
            int bytesRead = in.read(buffer);
            if (bytesRead < 0) {
                break;
            }
            if (bytesRead > 0) {
                out.write(buffer, 0, bytesRead);
            }
        }
        return out.toString(charset);
    }
}
It might be that, for clients that do not announce gzip support, the response erroneously reports the content length of the compressed content, which would be a bug.
I believe the issue is that the text is all on one line since it's not formatted in json correctly, and BufferedReader can only take a line so long.
This explanation is not correct:
You are not reading a line at a time, and BufferedReader is not treating the text as line based.
Even when you do read from a BufferedReader a line at a time (i.e. using readLine()) the only limits on the length of a line are the inherent limits of a Java String length (2^31 - 1 characters), and the size of your heap.
Also, note that "correct" JSON formatting is subjective. The JSON specification says nothing about formatting. It is common for JSON emitters not to waste CPU cycles and network bandwidth on formatting JSON that a human will only rarely read. Application code that consumes JSON needs to be able to cope with this.
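A JSON parser does not care about that formatting at all. As a hedged illustration using the org.json classes bundled with Android (the "data"/"children"/"title" field names follow reddit's listing format and are an assumption here; sb is the StringBuilder from the question's code):

import org.json.JSONArray;
import org.json.JSONException;
import org.json.JSONObject;

try {
    JSONObject listing = new JSONObject(sb.toString()); // works whether or not the JSON is pretty-printed
    JSONArray posts = listing.getJSONObject("data").getJSONArray("children");
    for (int i = 0; i < posts.length(); i++) {
        JSONObject post = posts.getJSONObject(i).getJSONObject("data");
        Log.d(TAG, post.optString("title"));
    }
} catch (JSONException e) {
    e.printStackTrace();
}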
So what is actually going on?
Unclear, but here are some possibilities:
A StringBuilder also has an inherent limit of 2^31 - 1 characters. However, with (at least) some implementations, if you attempt to grow a StringBuilder beyond that limit, it will throw an OutOfMemoryError. (This behavior doesn't appear to be documented, but it is clear from reading the source code in Java 8.)
Maybe you are reading the data too slowly (e.g. because your network connection is too slow) and the server is timing out the connection.
Maybe the server has a limit on the amount of data that it is willing to send in a response.
Since you haven't mentioned any exceptions and you always seem to get the same amount of data, I suspect the 3rd explanation is the correct one.
I am making a system with Lua clients and a Java server.
I need some data to be compressed in order to reduce the data flow.
To do this I use LibDeflate to compress the data on the client side:
local config = {level = 1}
local compressed = LibDeflate:CompressDeflate(data, config)
UDP.send("21107"..compressed..serverVehicleID) -- Send data
On the server I use this to receive the packet (TCP):
out = new PrintWriter(clientSocket.getOutputStream(), true);
in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(), "UTF-8"));

String inputLine;
while ((inputLine = in.readLine()) != null) { // wait for data
    Log.debug(inputLine); // this is what gets printed in the example below
    String[] processedInput = processInput(inputLine);
    onDataReceived(processedInput);
}
I already tried sending it using UDP and TCP; the problem is the same.
I tried using LibDeflate:CompressDeflate and LibDeflate:CompressZlib.
I tried tweaking the config.
Nothing works :/
I expect to receive one packet with the whole string, but instead I receive several packets, each containing part of the compressed characters. Example (each line is what the server thinks is a new packet):
(example screenshot not reproduced here; source: noelshack.com)
After a lot of research I finally managed to fix my problem!
I used this:
DataInputStream in = new DataInputStream(new BufferedInputStream(clientSocket.getInputStream()));

int count;
byte[] buffer = new byte[8192]; // or 4096, or more
while ((count = in.read(buffer)) > 0) {
    String data = new String(buffer, 0, count);
    // do something with data ...
}
I still haven't tested whether the received compressed string works; I'll update my post when I try it out.
EDIT: It seems to work
The only problem now is that I don't know what to do when the packet is bigger than the buffer size.
I want something that works in every situation, and since some packets are bigger than 8192 bytes, they just get cut off.
Assuming that the client side sends a single compressed "document", your server-side code should look something like this (TCP version):
is = new InflaterInputStream(clientSocket.getInputStream());
in = new BufferedReader(new InputStreamReader(is, "UTF-8"));

String inputLine;
while ((inputLine = in.readLine()) != null) {
    ...
}
The above is untested, and also needs exception handling and code to ensure that the streams always get closed.
The trick is that your input pipeline needs to decompress the data stream before you attempt to read / process it as lines of text.
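Note that LibDeflate:CompressDeflate produces a raw DEFLATE stream without the zlib header, while CompressZlib produces zlib-wrapped data. If the raw variant is used, the Inflater has to be created in "nowrap" mode. A sketch of that assumption, untested:

import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

// raw deflate (LibDeflate:CompressDeflate) -> nowrap = true
// zlib data   (LibDeflate:CompressZlib)    -> a plain new InflaterInputStream(stream) is enough
InputStream is = new InflaterInputStream(clientSocket.getInputStream(), new Inflater(true));
BufferedReader in = new BufferedReader(new InputStreamReader(is, "UTF-8"));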
I have to send a short string as text from the client to the server, and after that send a binary file.
How would I send both the binary file and the string using the same socket connection?
The server is a Java desktop application and the client is an Android tablet. I have already set it up to send text messages between the client and server in both directions. I have not yet done the binary file sending part.
One idea is to set up two separate servers running at the same time. I think this is possible if I use two different port numbers and set up the servers on two different threads in the application, and I would have to set up two concurrent clients running in two services in the Android app.
The other idea is to somehow use an if-else statement to determine which of the two types of files is being sent, either text or binary, and use the appropriate method to receive the file for that type.
Example code for sending text:
PrintWriter out;
BufferedReader in;

out = new PrintWriter(new BufferedWriter(
        new OutputStreamWriter(socket.getOutputStream())), true);
in = new BufferedReader(new InputStreamReader(socket.getInputStream()));

out.println("test out");
String message = in.readLine();
Example code for sending a binary file:
BufferedOutputStream out;
BufferedInputStream in;
byte[] buffer = new byte[8192]; // the original snippet omitted the array size
int length = 0;

out = new BufferedOutputStream(new FileOutputStream("test.pdf"));
in = new BufferedInputStream(new FileInputStream("replacement.pdf"));

while ((length = in.read(buffer)) > 0) {
    out.write(buffer, 0, length);
}
I don't think using two threads would be necessary in your case. Simply use the socket's InputStream and OutputStream in order to send binary data after you have sent your text messages.
Server Code
OutputStream stream = socket.getOutputStream();
PrintWriter out = new PrintWriter(
        new BufferedWriter(
                new OutputStreamWriter(stream)));

out.println("test output");
out.flush(); // ensure that the string is not buffered by the BufferedWriter

byte[] data = getBinaryDataSomehow();
stream.write(data);
Client Code
InputStream stream = socket.getInputStream();
String message = readLineFrom(stream);

int dataSize = getSizeOfBinaryDataSomehow();
int totalBytesRead = 0;
byte[] data = new byte[dataSize];
while (totalBytesRead < dataSize) {
    int bytesRemaining = dataSize - totalBytesRead;
    int bytesRead = stream.read(data, totalBytesRead, bytesRemaining);
    if (bytesRead == -1) {
        return; // socket has been closed
    }
    totalBytesRead += bytesRead;
}
In order to determine the correct dataSize on the client side you have to transmit the size of the binary block somehow. You could send it as a String right before out.flush() in the Server Code or make it part of your binary data. In the latter case the first four or eight bytes could hold the actual length of the binary data in bytes.
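For the "make it part of your binary data" variant, a minimal sketch could use DataOutputStream/DataInputStream for the four-byte prefix, after the text part has been sent and flushed (untested; socket and getBinaryDataSomehow() are the placeholders from the code above):

// sender: four-byte length prefix followed by the payload
DataOutputStream dataOut = new DataOutputStream(socket.getOutputStream());
byte[] data = getBinaryDataSomehow();
dataOut.writeInt(data.length);
dataOut.write(data);
dataOut.flush();

// receiver: read the prefix, then exactly that many bytes
DataInputStream dataIn = new DataInputStream(socket.getInputStream());
int dataSize = dataIn.readInt();
byte[] received = new byte[dataSize];
dataIn.readFully(received); // loops internally until the whole buffer is filled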
Hope this helps.
Edit
As #EJP correctly pointed out, using a BufferedReader on the client side will probably result in corrupted or missing binary data because the BufferedReader "steals" some bytes from the binary data to fill its buffer. Instead you should read the string data yourself and either look for a delimiter or have the length of the string data transmitted by some other means.
/* Reads all bytes from the specified stream until it finds a line feed character (\n).
* For simplicity's sake I'm reading one character at a time.
* It might be better to use a PushbackInputStream, read more bytes at
* once, and push the surplus bytes back into the stream...
*/
private static String readLineFrom(InputStream stream) throws IOException {
InputStreamReader reader = new InputStreamReader(stream);
StringBuffer buffer = new StringBuffer();
for (int character = reader.read(); character != -1; character = reader.read()) {
if (character == '\n')
break;
buffer.append((char)character);
}
return buffer.toString();
}
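The PushbackInputStream variant hinted at in the comment above could look roughly like this (an untested sketch; the stream must be constructed with a pushback buffer at least as large as the chunk, e.g. new PushbackInputStream(in, 256)):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

private static String readLineFrom(PushbackInputStream stream) throws IOException {
    ByteArrayOutputStream line = new ByteArrayOutputStream();
    byte[] chunk = new byte[256];
    int n;
    while ((n = stream.read(chunk)) != -1) {
        for (int i = 0; i < n; i++) {
            if (chunk[i] == '\n') {
                line.write(chunk, 0, i);
                // give the surplus bytes back so the binary part can still be read
                stream.unread(chunk, i + 1, n - i - 1);
                return new String(line.toByteArray(), StandardCharsets.UTF_8);
            }
        }
        line.write(chunk, 0, n);
    }
    return new String(line.toByteArray(), StandardCharsets.UTF_8);
}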
You can read about how the HTTP protocol works: it essentially sends 'ASCII and human-readable' headers (so to speak), and after that any content can be added with an appropriate encoding, for example base64. You may create something similar yourself.
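A minimal sketch of that base64 idea (hedged: the "FILE " marker is made up for illustration, the file name is taken from the question, and out/in are the PrintWriter and BufferedReader from the example code above; java.util.Base64 needs Java 8+, or API 26+ on Android, otherwise android.util.Base64 can be used):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// sender: encode the binary payload as text so it can follow a plain-text header line
byte[] fileBytes = Files.readAllBytes(Paths.get("replacement.pdf"));
out.println("FILE " + Base64.getEncoder().encodeToString(fileBytes));
out.flush();

// receiver: strip the marker and decode back to bytes
String receivedLine = in.readLine();
byte[] decoded = Base64.getDecoder().decode(receivedLine.substring("FILE ".length()));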
You need to first send the String, then the size of the byte array, then the byte array; use the String.startsWith() method to check what is being sent.
Hello all my friends,
I am trying to send a long string through a socket connection, but I receive it in two parts, so I get an error while doing my processing.
In the client I am sending the file:
BufferedWriter bufferedOut = null;
BufferedReader in = null;

socket = new Socket("192.168.0.15", 4444);
bufferedOut = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
in = new BufferedReader(new InputStreamReader(socket.getInputStream()));

bufferedOut.write(xmlInString, 0, xmlInString.length());

/**
 * wait for response
 */
byte[] buf = new byte[10000];
int actualNumberOfBytesRead = socket.getInputStream().read(buf);
String responseLine = new String(buf, 0, actualNumberOfBytesRead);
In the server,
BufferedReader in = null;
PrintWriter out = null;

in = new BufferedReader(new InputStreamReader(client.getInputStream()));
out = new PrintWriter(client.getOutputStream(), true);

// get the input
byte[] buf = new byte[10000];
int actualNumberOfBytesRead = client.getInputStream().read(buf);
line = new String(buf, 0, actualNumberOfBytesRead);

// send back
out.println(result);
How can I get my string as one part? Can you please show me where my mistake is in the code?
Thank you all
You will need a loop to repeatedly read from the input stream, concatenating the read data together each time, until you reach the end of the string.
Edit - a little more detail. If you are looking at transmitting multiple such strings/files, then see arnaud's answer. If all you're looking to do is send one big string, then:
On the sender side, create the output stream, send the data (as you have done), and then don't forget to close the stream again (this also performs a flush, which ensures the data gets sent over the wire and informs the other end that there is no more data to come).
On the recipient side, read the data in a loop until the input stream ends (read(buf) returns -1), concatenating the data together each time into one big buffer, then close the input stream.
Also, please read my comment about sending a file as bytes rather than a string. This is particularly important for XML files, which have rather special rules for encoding detection.
When using a TCP socket, you are handling "streams". That is, there is no delimitation between messages by default. By proceeding as you do, you may read part of a message, or worse, read more than a message.
The most common way to proceed is to delimit your messages. You can use DataInputStream/DataOutputStream, which encode strings into bytes and prefix them with their length. That way, the receiver knows how many bytes it should read.
DataOutputStream out = null;
DataInputStream in = null;

Socket socket = new Socket("192.168.0.15", 4444);
out = new DataOutputStream(new BufferedOutputStream(socket.getOutputStream()));
in = new DataInputStream(new BufferedInputStream(socket.getInputStream()));

out.writeUTF(xmlInString);
out.flush(); // to ensure everything is sent and nothing is kept in the buffer

// wait for response
String responseLine = in.readUTF();
Then, adjust the server code accordingly.
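For the server side, the counterpart might look something like this (a sketch; client is the accepted Socket and process() is a placeholder for your own handling):

DataInputStream in = new DataInputStream(new BufferedInputStream(client.getInputStream()));
DataOutputStream out = new DataOutputStream(new BufferedOutputStream(client.getOutputStream()));

String line = in.readUTF();     // reads the complete string written by writeUTF()
String result = process(line);  // process() is a placeholder for your own handling
out.writeUTF(result);
out.flush();

// note: writeUTF()/readUTF() are limited to strings of at most 65535 bytes of
// modified UTF-8; for larger XML, write an int length followed by the raw bytes.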
When using buffered outputs with sockets, which is recommended for performance reasons, remember to flush() after you have written the message, to ensure that everything is actually sent over the network and nothing is kept in the buffer.
Your initial problem probably occurred because your message requires several TCP/IP packets, and your server read only the first one(s) that had just arrived.
I am writing a java TCP client that talks to a C server.
I have to alternate sends and receives between the two.
Here is my code.
1. The server sends the length of the binary msg (len) to the client (Java).
2. The client sends an "ok" string.
3. The server sends the binary and the client allocates a byte array of len bytes to receive it.
4. The client again sends back an "ok".
Step 1 works: I get the len value. However, the client then blocks on its send and the server waits to receive data.
Can anybody take a look?
In the try block I have defined:
Socket echoSocket = new Socket("192.168.178.20", 2400);
OutputStream os = echoSocket.getOutputStream();
InputStream ins = echoSocket.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(ins));
String fromPU = null;

if ((fromPU = br.readLine()) != null) {
    System.out.println("Pu returns as=" + fromPU);
    len = Integer.parseInt(fromPU.trim());
    System.out.println("value of len from PU=" + len);

    byte[] str = "Ok\n".getBytes();
    os.write(str, 0, str.length);
    os.flush();

    byte[] buffer = new byte[len];
    int bytes;
    StringBuilder curMsg = new StringBuilder();
    bytes = ins.read(buffer);
    System.out.println("bytes=" + bytes);
    curMsg.append(new String(buffer, 0, bytes));
    System.out.println("ciphertext=" + curMsg);

    os.write(str, 0, str.length);
    os.flush();
}
UPDATED:
Here is my code. At the moment, there is no receive or send blocking on either side. However, both with BufferedReader and DataInputStream, I am unable to send the "ok" msg. At the server end, I get a large number of bytes instead of the 2 bytes for "ok".
Socket echoSocket = new Socket("192.168.178.20", 2400);
OutputStream os = echoSocket.getOutputStream();
InputStream ins = echoSocket.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(ins));
DataInputStream dis = new DataInputStream(ins);
DataOutputStream dos = new DataOutputStream(os);

if ((fromPU = dis.readLine()) != null) {   // note: DataInputStream.readLine() is deprecated
// if ((fromPU = br.readLine()) != null) {
    System.out.println("PU Server returns length as=" + fromPU);
    len = Integer.parseInt(fromPU.trim());

    byte[] str = "Ok".getBytes();
    System.out.println("str.length=" + str.length);
    dos.writeInt(str.length);
    if (str.length > 0) {
        dos.write(str, 0, str.length);
        System.out.println("sent ok");
    }

    byte[] buffer = new byte[len];
    int bytes;
    StringBuilder curMsg = new StringBuilder();
    bytes = ins.read(buffer);
    System.out.println("bytes=" + bytes);
    curMsg.append(new String(buffer, 0, bytes));
    System.out.println("binarytext=" + curMsg);

    dos.writeInt(str.length);
    if (str.length > 0) {
        dos.write(str, 0, str.length);
        System.out.println("sent ok");
    }
Using a BufferedReader around a stream and then trying to read binary data from the stream is a bad idea. I wouldn't be surprised if the server has actually sent all the data in one go, and the BufferedReader has read the binary data as well as the line that it's returned.
Are you in control of the protocol? If so, I suggest you change it to send the length of data as binary (e.g. a fixed 4 bytes) so that you don't need to work out how to switch between text and binary (which is basically a pain).
If you can't do that, you'll probably need to just read a byte at a time to start with until you see the byte representing \n, then convert what you've read into text, parse it, and then read the rest as a chunk. That's slightly inefficient (reading a byte at a time instead of reading a buffer at a time) but I'd imagine the amount of data being read at that point is pretty small.
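A sketch of that byte-at-a-time approach (untested; it reuses ins from the question's code and assumes the length line is ASCII digits terminated by \n):

// read single bytes until '\n' so nothing beyond the length line is consumed
StringBuilder lenLine = new StringBuilder();
int b;
while ((b = ins.read()) != -1 && b != '\n') {
    lenLine.append((char) b);
}
int len = Integer.parseInt(lenLine.toString().trim());

// now switch to binary and block until exactly len bytes have arrived
byte[] body = new byte[len];
new DataInputStream(ins).readFully(body);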
Several thoughts:
len = Integer.parseInt(fromPU.trim());
You should check the given size against a maximum that makes some sense. Your server is unlikely to send a two gigabyte message to the client. (Maybe it will, but there might be a better design. :) You don't typically want to allocate however much memory a remote client asks you to allocate. That's a recipe for easy remote denial of service attacks.
BufferedReader br = new BufferedReader(new InputStreamReader(ins));
/* ... */
bytes = ins.read(buffer);
Maybe your BufferedReader has sucked in too much data? (Does the server wait for the Ok before continuing?) Are you sure that you're allowed to read from the underlying InputStream after attaching a BufferedReader to it?
Note that TCP is free to deliver your data in ten byte chunks over the next two weeks :) -- because encapsulation, differing hardware, and so forth makes it very difficult to tell the size of packets that will eventually be used between two peers, most applications that are looking for a specific amount of data will instead populate their buffers using code somewhat like this (stolen from Advanced Programming in the Unix Environment, an excellent book; pity the code is in C and your code is in Java, but the principle is the same):
ssize_t     /* Read "n" bytes from a descriptor */
readn(int fd, void *ptr, size_t n)
{
    size_t  nleft;
    ssize_t nread;

    nleft = n;
    while (nleft > 0) {
        if ((nread = read(fd, ptr, nleft)) < 0) {
            if (nleft == n)
                return(-1);  /* error, return -1 */
            else
                break;       /* error, return amount read so far */
        } else if (nread == 0) {
            break;           /* EOF */
        }
        nleft -= nread;
        ptr += nread;
    }
    return(n - nleft);       /* return >= 0 */
}
The point to take away is that filling your buffer might take one, ten, or one hundred calls to read(), and your code must be resilient against slight changes in network capabilities.
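In Java, the same idea is available as DataInputStream.readFully(), or it can be written as a small loop. A sketch of the equivalent (not from the book, just an illustration):

// Java equivalent of the C readn() above: keep reading until the buffer is full or EOF
static int readn(InputStream in, byte[] buf) throws IOException {
    int off = 0;
    while (off < buf.length) {
        int n = in.read(buf, off, buf.length - off);
        if (n == -1) {
            break; // EOF before the buffer was filled
        }
        off += n;
    }
    return off; // number of bytes actually read (may be less than buf.length at EOF)
}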