I need to receive a Unicode (UTF-8) string sent by a client on the server side. The length of the string is, of course, unknown.
ServerSocket serverSocket = new ServerSocket(567);
Socket clientSocket = serverSocket.accept();
PrintWriter out = new PrintWriter(clientSocket.getOutputStream(), true);
BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
I can read bytes using in.read() (until it returns -1), but the problem is that the string is Unicode; in other words, every character is represented by two bytes. So converting the result of read(), which would work with normal ASCII characters, makes no sense.
Update
As per the suggestions below, I created the reader as follows:
BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(),"UTF-8"));
I've changed the client side to send a newline (#10#13) after each string.
But the new problem is that I get garbage instead of the real string if I call:
in.readLine();
When I print the result, I get some nonsense string (I cannot even copy it here), although I am not dealing with non-Latin characters or anything like that.
To see what's going on, I introduced the following code:
int j = 0;
while (j < 255) {
    j++;
    // read() on a Reader returns one decoded char (as an int), not a raw byte
    System.out.print(in.read() + ", ");
}
So here I just print all bytes received. If I send "ab" I get:
97, 0, 98, 0, 10, 13,
This is what one would expect, but then why doesn't the readLine method produce "good" results?
Anyway, if we can't find the actual answer, I should probably collect the bytes (like above) and create my string from them? How do I do that?
P.S. Just a quick note: I am on Windows.
Use new InputStreamReader(clientSocket.getInputStream(), "UTF-8") to set the name of the charset to use when reading the InputStream coming from your client.
When creating InputStreamReader you can set encoding like this:
BufferedReader in =
new BufferedReader(
new InputStreamReader(clientSocket.getInputStream(), "UTF-8")
);
Try this way:
Reader in = new BufferedReader(
new InputStreamReader(
clientSocket.getInputStream(), StandardCharsets.UTF_8));
Note the StandardCharsets class. It has been available since Java 1.7 and provides a more elegant way to specify a standard encoding like UTF-8.
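One thing worth checking before settling on "UTF-8": the dump in the question (97, 0, 98, 0 for "ab") has a zero after every letter, which is what little-endian UTF-16 looks like, not UTF-8. The following is only a sketch under the assumption that the client really sends UTF-16LE; the class and method names are made up for illustration. It decodes the same four bytes both ways so the difference is visible.

```java
import java.nio.charset.StandardCharsets;

final class Utf16Check {
    // Decodes the raw bytes as little-endian UTF-16 (two bytes per char, low byte first).
    static String decodeUtf16Le(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    // Decodes the same bytes as UTF-8; every 0 byte survives as a NUL character.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

With the bytes {97, 0, 98, 0}, decodeUtf16Le gives "ab", while decodeUtf8 gives a four-character string with embedded NULs, which would explain the unprintable output of readLine(). If that matches what you see, passing StandardCharsets.UTF_16LE to the InputStreamReader (or fixing the client to send real UTF-8) may be the actual fix.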
Related
I'm getting an unknown character when reading from a socket using DataInputStream. When I send "Hello", the output on the server side is "Hello" plus an unknown character, and that character is the issue. I'm new to socket programming, so I don't know what the issue could be. I tried using a BufferedReader and PrintWriter too, but then the server, when using readLine(), does not print the text sent but rather java.io.BufferedReader.
Server Side:
//BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
DataInputStream is= new DataInputStream(socket.getInputStream());
String userinput= is.readLine();
System.out.println("Client message: "+userinput);
Client Side:
//BufferedReader std= new BufferedReader(new InputStreamReader(System.in));
String userInput;
DataOutputStream out;
while((userInput=std.readLine()) !=null){
Socket socketClient= new Socket("localhost",5000);
OutputStream os= socketClient.getOutputStream();
out=new DataOutputStream(os);
out.writeUTF(userInput);
out.flush();
socketClient.close();
}
You're using writeUTF on the client side, are you sure readLine on the server side is compatible with it? I had never seen that before, but the javadoc mentions it's a "modified" UTF-8 encoding. Try using readUTF instead, or simply use the regular write method.
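If the reader decodes UTF-8, the writer can send either UTF-8 or plain ASCII characters (ASCII is a subset of UTF-8); if the reader decodes ASCII, the writer must send only ASCII characters.
Try using out.write(userInput.getBytes(StandardCharsets.UTF_8)); instead of out.writeUTF(userInput); (DataOutputStream has no write(String) overload, so the string has to be converted to bytes first).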
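To see why readLine() shows extra characters in front of the text, it helps to look at what writeUTF actually puts on the wire. This is a small in-memory sketch (no sockets; the class and method names are invented for the example) that captures the bytes and reads them back with the matching readUTF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class WriteUtfDemo {
    // Returns the exact bytes that writeUTF emits for the given text.
    static byte[] encode(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeUTF(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Reads one string back with readUTF, the counterpart of writeUTF.
    static String decode(byte[] wire) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For "Hello", encode returns seven bytes: a two-byte length (0, 5) followed by the five letters. A line-based reader treats those two length bytes as characters, which is the most likely source of the unknown character.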
I'm trying to write an HTTP proxy in Java using only the Socket class. I attempted to construct one earlier and was successfully sending a request by writing to the socket's output stream, but I am having a hard time reading the response. The research I have done suggests that I should use the input stream and read it line by line, but I have not been able to read any web pages successfully using this method. Does anyone have suggestions as to where I could go from here?
My code actually uses a byte buffer to read from the input stream in order to read the page in bytes:
InputStream input = clientSocket.getInputStream();
byte[] buffer = new byte[48*1024];
byte[] redData;
StringBuilder clientData = new StringBuilder();
String redDataText;
int red;
while((red = input.read(buffer)) > -1) {
redData = new byte[red];
System.arraycopy(buffer, 0, redData, 0, red);
redDataText = new String(redData, "UTF-8");
System.out.println("Got message!! " + redDataText);
clientData.append(redDataText);
}
If you are asking for a way to read an InputStream by lines, this one may serve you:
BufferedReader bufferedReader=new BufferedReader(new InputStreamReader(input, "UTF-8"));
String line;
StringBuilder clientData=new StringBuilder();
while ((line=bufferedReader.readLine()) != null)
{
clientData.append(line);
}
You have to be careful not to read an InputStream in this fashion unless you are a priori sure that it contains just plain text (and not binary data).
BTW: For the sake of efficiency, I recommend pre-sizing clientData with an initial capacity close to the expected final size (otherwise it starts from the default capacity of 16 and has to be resized several times).
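Two details of the snippets above are easy to miss: readLine() strips the line terminators, so appending the lines loses them, and decoding each byte chunk separately (as in the question) can split a multi-byte UTF-8 character across two reads. A possible alternative, sketched here with made-up names and assuming the stream is UTF-8 text that ends when the peer closes the connection, is to let one InputStreamReader do the decoding and copy characters in chunks.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class TextDrain {
    // Reads the whole stream as UTF-8 text, keeping line terminators intact.
    static String readAll(InputStream input) {
        StringBuilder text = new StringBuilder(8 * 1024); // pre-sized; 8 KiB is an arbitrary guess
        char[] chunk = new char[4096];
        try {
            // The reader keeps decoder state between reads, so a character
            // split across two network reads is still decoded correctly.
            Reader reader = new InputStreamReader(input, StandardCharsets.UTF_8);
            int n;
            while ((n = reader.read(chunk)) != -1) {
                text.append(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }
}
```

This still only makes sense for text; for a proxy that may relay images or compressed bodies, copying raw bytes is the safer choice.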
I'm writing a simple server for experimenting with a server socket and telnet.exe:
public static void main(String[] args) throws IOException {
String line;
ServerSocket ss = new ServerSocket(5555);
Socket socket = ss.accept();
System.out.println("Waiting for a client...");
InputStream sin = socket.getInputStream();
OutputStream sout = socket.getOutputStream();
DataInputStream in = new DataInputStream(sin);
DataOutputStream out = new DataOutputStream(sout);
out.writeUTF("\u001B[2J");
out.writeUTF("Hello client\r\n");
line = in.readUTF();
System.out.println("The dumb client just sent me this line : " + line);
System.out.println("I'm sending it back...");
out.writeUTF(line);
out.flush();
System.out.println("Waiting for the next line...");
System.out.println();
}
Now I run this server and connect to it via telnet.exe. That works, but when I send a message to the server, I don't receive it back.
Why doesn't it work?
A telnet client terminates each input-line with a newline. But a DataInputStream does not recognize this as a terminator for input-strings, because a DataInputStream is for binary data.
Wrap your input stream with an InputStreamReader to handle it as a character-based input stream.
Then wrap this one in a BufferedReader. Buffering means the socket is read in larger chunks rather than one character at a time, and the class also provides some handy utility methods like the following.
Use the readLine method, which reads data until a newline is found.
Like this:
BufferedReader inputReader = new BufferedReader(new InputStreamReader(in));
[...]
line = inputReader.readLine();
For text-based output of newline-terminated messages back to the client, you should use the output analogue of the BufferedReader, which is the PrintWriter.
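Put together, the reading and writing halves might look like the sketch below. It works on plain InputStream/OutputStream so it can be tried without a socket; on the server you would pass socket.getInputStream() and socket.getOutputStream(). The class name, the method name and the choice of US-ASCII are assumptions made for this example, not something telnet.exe mandates.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class LineEcho {
    // Reads one newline-terminated line and writes it back followed by CRLF.
    // Returns the line, or null if the client closed the connection first.
    static String echoOneLine(InputStream rawIn, OutputStream rawOut) {
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(rawIn, StandardCharsets.US_ASCII));
            PrintWriter writer = new PrintWriter(
                    new OutputStreamWriter(rawOut, StandardCharsets.US_ASCII));
            String line = reader.readLine(); // terminator is stripped
            if (line != null) {
                writer.print(line);
                writer.print("\r\n"); // explicit CRLF instead of the platform default
                writer.flush();       // push the reply out immediately
            }
            return line;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The explicit flush() matters here because this PrintWriter is created without auto-flush.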
DataInput is for reading binary.
From the Javadoc for readUTF()
Reads in a string that has been encoded using a modified UTF-8 format. The general contract of readUTF is that it reads a representation of a Unicode character string encoded in modified UTF-8 format; this string of characters is then returned as a String.
First, two bytes are read and used to construct an unsigned 16-bit integer in exactly the manner of the readUnsignedShort method. This integer value is called the UTF length and specifies the number of additional bytes to be read. These bytes are then converted to characters by considering them in groups. The length of each group is computed from the value of the first byte of the group. The byte following a group, if any, is the first byte of the next group.
This means the first two bytes must contain, in binary, the length of the UTF string that follows.
You are typing something like k and j for the first two bytes, so the length is read as roughly 27,000 bytes; you haven't typed that much, which is why readUTF doesn't return.
What you want instead is to be able to read/write text using classes like BufferedReader and PrintWriter.
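The length arithmetic is easy to check. This sketch (class and method names are made up) feeds two typed characters to readUnsignedShort, which is how readUTF interprets the first two bytes it receives.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class UtfLengthPrefix {
    // Interprets two typed ASCII characters the way readUTF reads its length prefix.
    static int utfLength(char first, char second) {
        byte[] typed = {(byte) first, (byte) second};
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(typed))) {
            return in.readUnsignedShort(); // big-endian: first byte is the high byte
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For 'k' (0x6B) and 'j' (0x6A) this gives 0x6B6A = 27498, so readUTF would block until that many further bytes had arrived.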
Hello all my friends,
I am trying to send a long string through a socket connection, but it arrives in two parts, so I get an error during processing.
In the client, I am sending the file:
BufferedWriter bufferedOut = null;
BufferedReader in = null;
socket = new Socket("192.168.0.15",4444);
bufferedOut = new BufferedWriter(new OutputStreamWriter(socket.getOutputStream()));
in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
bufferedOut.write(xmlInString, 0, xmlInString.length());
/**
* wait for response
*/
byte[] buf = new byte[10000];
int actualNumberOfBytesRead = socket.getInputStream().read(buf);
String responseLine = new String(buf, 0, actualNumberOfBytesRead);
In the server,
BufferedReader in = null;
PrintWriter out = null;
in = new BufferedReader(new InputStreamReader(client.getInputStream()));
out = new PrintWriter(client.getOutputStream(), true);
//get the input
byte[] buf = new byte[10000];
int actualNumberOfBytesRead = client.getInputStream().read(buf);
line = new String(buf, 0, actualNumberOfBytesRead);
//send back
out.println(result);
How can I get my string in one piece? Can you please show me where the mistake in my code is?
Thank you all
You will need a loop to repeatedly read from the input stream, concatenating the read data together each time, until you reach the end of the string.
Edit - a little more detail. If you are looking at transmitting multiple such strings/files, then see @arnaud's answer. If all you're looking to do is send one big string, then:
On the sender side, create the output stream, send the data (as you have done), and then don't forget to close the stream again (this also performs a flush, which ensures the data gets sent over the wire, and informs the other end that there is no more data to come).
On the recipient side, read the data in a loop until the input stream ends (read(buf) returns -1), concatenating the data together each time in one big buffer, then close the input stream.
Also, please read my comment about sending a file as bytes rather than a string. This is particularly important for XML files, which have rather special rules for encoding detection.
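The read loop described above could be sketched like this. It assumes the sender closes its side of the connection when it is done; the class and method names are made up, and the 10000-byte buffer just mirrors the question's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class StreamDrain {
    // Collects every byte from the stream until the sender closes it.
    static byte[] readUntilEnd(InputStream input) {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buf = new byte[10000];
        try {
            int n;
            // A single read may return only part of the message; keep going until -1.
            while ((n = input.read(buf)) != -1) {
                collected.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return collected.toByteArray();
    }
}
```

On the server this would be called as readUntilEnd(client.getInputStream()); the bytes can then be handed to an XML parser directly, or decoded once with an explicit charset.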
When using a TCP socket, you are handling "streams". That is, there is no delimitation between messages by default. By proceeding as you do, you may read part of a message, or worse, read more than a message.
The most common way to proceed is to delimit your messages. You can use DataInputStream/DataOutputStream, which encode strings into bytes and use the first bytes to indicate the length. That way, the receiver knows how many bytes it should read.
DataOutputStream out = null;
DataInputStream in = null;
Socket socket = new Socket("192.168.0.15",4444);
out = new DataOutputStream(new BufferedOutputStream(socket.getOutputStream()));
in = new DataInputStream(new BufferedInputStream(socket.getInputStream()));
out.writeUTF(xmlInString);
out.flush(); // to ensure everything is sent and nothing is kept in the buffer.
// wait for response
String responseLine = in.readUTF();
Then, adjust the server code accordingly.
When using buffered output with sockets, which is advisable for performance reasons, call flush() after writing each message to ensure that everything is actually sent over the network and nothing is left in the buffer.
Your initial problem probably occurred because your message requires several TCP/IP packets and in your server, you read only the first one(s) which just arrived.
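One caveat for long strings: writeUTF stores the length in two bytes, so it throws UTFDataFormatException once the encoded string exceeds 65535 bytes. If the XML can be larger than that, the same length-prefix idea can be done by hand with a four-byte length. This is a sketch with hypothetical names (Framing, writeMessage, readMessage), not part of any standard API.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class Framing {
    // Sends a 4-byte big-endian length followed by the UTF-8 bytes of the message.
    static void writeMessage(DataOutputStream out, String message) {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        try {
            out.writeInt(payload.length);
            out.write(payload);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads exactly one message written by writeMessage.
    static String readMessage(DataInputStream in) {
        try {
            int length = in.readInt();
            byte[] payload = new byte[length];
            in.readFully(payload); // blocks until all 'length' bytes have arrived
            return new String(payload, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

readFully is what makes the receiver independent of how TCP happened to split the data. A production version would also want to reject negative or implausibly large lengths before allocating the array.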
I want to know how to receive the string from a file in Java which has different language letters.
I used UTF-8 format. This can receive some language letters correctly, but Latin letters can't be displayed correctly.
So, how can I receive all language letters?
Alternatively, is there any other format which will allow me to receive all language letters.
Here's my code:
URL url = new URL("http://google.cm");
URLConnection urlc = url.openConnection();
BufferedReader buffer = new BufferedReader(new InputStreamReader(urlc.getInputStream(), "UTF-8"));
StringBuilder builder = new StringBuilder();
int byteRead;
while ((byteRead = buffer.read()) != -1)
{
builder.append((char) byteRead);
}
buffer.close();
text=builder.toString();
If I display the "text", the letters can't be displayed correctly.
Reading a UTF-8 file is fairly simple in Java:
Reader r = new InputStreamReader(new FileInputStream(filename), "UTF-8");
If that isn't working, the issue lies elsewhere.
EDIT: According to iconv, Google Cameroon is serving invalid UTF-8. It seems to actually be iso-8859-1.
EDIT2: Actually, I was wrong. It serves (and declares) valid UTF-8 if the user agent contains "Mozilla/5.0" (or higher), but valid iso-8859-1 in (some) other cases. Obviously, the best bet is to use getContentType to check before decoding.
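A minimal sketch of that check, assuming the server puts a charset parameter in its Content-Type header. The helper name and the fallback behaviour are my own choices for illustration; the parsing is deliberately simple and does not cover every header form.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

final class ContentTypeCharset {
    // Extracts the charset from a header such as "text/html; charset=ISO-8859-1".
    // Returns the fallback when the header is missing or names no usable charset.
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = param.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
                    return fallback;
                }
            }
        }
        return fallback;
    }
}
```

In the question's code, the result of fromContentType(urlc.getContentType(), StandardCharsets.UTF_8) would be passed to the InputStreamReader in place of the hard-coded "UTF-8".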