DataInputStream and readLine() with UTF8 - java

I've got some trouble with sending a UTF8 string from a c socket to a java socket.
The following method works fine:
BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream(), "UTF8"));
main.title = in.readLine();
but then I need an int java.io.InputStream.read(byte[] b, int offset, int length) method, which does not exist for a BufferedReader. So I tried a DataInputStream instead:
DataInputStream in2 = new DataInputStream(socket.getInputStream());
but everything it reads is just rubbish.
Then I tried to use the readLine() method from DataInputStream but this doesn't give me the correct UTF8 string.
You see my dilemma. Can't I use two readers for one InputStream? Or can I take the DataInputStream.readLine() result and convert it to UTF8?
Thanks,
Martin

We know from the design of the UTF-8 encoding that the only usage of the value 0x0A is the LINE FEED ('\n'). Therefore, you can read until you hit it:
/** Reads UTF-8 character data; lines are terminated with '\n' */
public static String readLine(InputStream in) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    while (true) {
        int b = in.read();
        if (b < 0) {
            throw new IOException("Data truncated");
        }
        if (b == 0x0A) {
            break;
        }
        buffer.write(b);
    }
    return new String(buffer.toByteArray(), "UTF-8");
}
I am making the assumption that your protocol uses \n as a line terminator. If it doesn't - well, it is generally useful to point out the constraints you're writing to.

Do NOT use BufferedReader and DataInputStream on the same InputStream! I did that and spent days trying to figure out why my code broke. BufferedReader can read more into its buffer than you extract from it, so the data I was supposed to read with the DataInputStream was already "in the BufferedReader". That data never reached the DataInputStream, which caused my program to "hang" waiting for it to arrive.
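The read-ahead is easy to reproduce in memory. This is a minimal sketch, assuming a ByteArrayInputStream stands in for the socket stream; the class name ReadAheadDemo and the sample bytes are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class ReadAheadDemo {
    /** Reads one line through a BufferedReader, then reports how many bytes remain in the raw stream. */
    static int bytesLeftAfterReadLine(byte[] wire) throws IOException {
        InputStream raw = new ByteArrayInputStream(wire);
        BufferedReader reader = new BufferedReader(new InputStreamReader(raw, StandardCharsets.UTF_8));
        reader.readLine();       // we only ask for the first line...
        return raw.available();  // ...but the reader may already have pulled more into its buffer
    }

    public static void main(String[] args) throws IOException {
        // "title\n" followed by four bytes that were meant for a binary reader
        byte[] wire = {'t', 'i', 't', 'l', 'e', '\n', 1, 2, 3, 4};
        System.out.println("bytes left for a second reader: " + bytesLeftAfterReadLine(wire));
    }
}
```

If fewer than four bytes are reported as remaining, the binary payload is no longer fully available to a second reader on the same stream.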

I believe that you should not mix BufferedReader and DataInputStream here. DataInputStream has readLine() too, so use it.
And yet another comment. I am not sure it is a problem but avoid multiple calls of socket.getInputStream(). Do it once and then wrap it as you want using other streams and readers.

Am I understanding it correctly that you are sending both text and binary data on the same socket, in the same "conversation"? There should be no problem creating two readers for the same InputStream. The problem is knowing when (and how much) to read from which reader. They will both consume (and advance) the underlying stream when you read from them, which matters because you have mixed types of data. You could just read the stream as bytes and then convert the bytes explicitly in your code (new String(bytes, "UTF-8") etc). Or you could split your communication onto two different sockets.
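Reading everything as bytes through a single stream could look like the sketch below. It assumes a protocol of one '\n'-terminated UTF-8 line followed by a binary block of known length; that protocol, the class name MixedStreamReader and its method names are assumptions for illustration, not something stated in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class MixedStreamReader {
    /** Reads bytes up to '\n' and decodes them as UTF-8; the terminator is consumed but not returned. */
    static String readUtf8Line(DataInputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != '\n') {
            if (b < 0) {
                throw new EOFException("stream ended before the line terminator");
            }
            line.write(b);
        }
        return new String(line.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Reads exactly `length` bytes of binary data from the same stream. */
    static byte[] readBlock(DataInputStream in, int length) throws IOException {
        byte[] block = new byte[length];
        in.readFully(block);
        return block;
    }

    public static void main(String[] args) throws IOException {
        // Simulated wire data: a UTF-8 title line, then three binary bytes
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        wire.write("Gr\u00fc\u00dfe\n".getBytes(StandardCharsets.UTF_8));
        wire.write(new byte[] {7, 8, 9});

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readUtf8Line(in));
        System.out.println(readBlock(in, 3).length + " binary bytes");
    }
}
```

Because only one stream object ever touches the socket data, nothing can be buffered away by a second reader.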

Related

Compare DataOutputStream to String in Java

I have a DataOutputStream I would like to copy into a string. I've found a lot of tutorials on converting DataOutputStreams by setting it to a new ByteArrayOutputStream, but I just want to read the string it sends when it flushes, and my DataOutputStream is already assigned to an output stream through a socket.
output.writeUTF(input.readLine());
output.flush();
If the context is helpful, I'm trying to read the output stream of a server and compare it to a string.
The flush method will flush, i.e. force the write of, anything that is buffered but not yet written.
In the code below, try putting a breakpoint on the second call to writeUTF - if you navigate to your file system you should see the created file, and it will contain "some string". If you put the break point on flush, you can verify that the content has already been written to file.
public static void test() throws IOException {
File file = new File("/Users/Hervian/tmp/fileWithstrings.txt");
DataOutputStream dos = null;
try {
dos = new DataOutputStream(new FileOutputStream(file));
dos.writeUTF("some string");
dos.writeUTF("some other string");
dos.flush();//Flushes this data output stream. This forces any buffered output bytes to be written out to the stream.
} finally {
if (dos!=null) dos.close();
}
}
As such, you cannot extract the data from the DataOutputStream object, but in the example above we of course have those strings in the write calls.
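What can be done is decoding the bytes on the receiving side, or in a test, with DataInputStream.readUTF and comparing the result there. The following is a minimal sketch, assuming an in-memory ByteArrayOutputStream stands in for the socket; the class name WriteUtfRoundTrip is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

final class WriteUtfRoundTrip {
    /** Writes text with writeUTF into memory and returns the raw bytes that would go on the wire. */
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(sink)) {
            dos.writeUTF(text);
        }
        return sink.toByteArray();
    }

    /** Decodes bytes produced by writeUTF back into a String with readUTF. */
    static String decode(byte[] wire) throws IOException {
        try (DataInputStream dis = new DataInputStream(new ByteArrayInputStream(wire))) {
            return dis.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encode("some string");
        // writeUTF puts a two-byte length in front of the encoded characters
        System.out.println(wire.length + " bytes on the wire, decoded: " + decode(wire));
        System.out.println("matches: " + decode(wire).equals("some string"));
    }
}
```

Note that writeUTF output is not plain UTF-8 text: it carries a length prefix, so the reader has to use readUTF rather than a text reader.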

Make a client & server connection stable in Java

I'm creating a server application in Java, but when a client connects to the server and opens a stream, the connection is lost as soon as the stream comes to an end. What I need is to keep that connection alive even when the stream has ended. Here is a code example to better explain what I'm saying:
diSTR = new DataInputStream(Conexao.getInputStream());
doSTR = new DataOutputStream(Conexao.getOutputStream());
conectado = true;
while (diSTR.available() > 0)
{
byte[] buffer = new byte[size];
diSTR.readFully(buffer);
String str = new String(buffer, "UTF-8");
log(str);
}
So when diSTR.available() returns 0 the method returns and the connection is over, how can I solve this problem?
The solution is to NOT use available().
That method tells you how many bytes are available to read right now without blocking. If you use this to tell you "the connection is over", then you will get a premature end if the other end or the network cannot keep up with the rate at which you can read and process the data. Even if the other end can keep up, all it takes is a brief networking disruption for the reader to catch up, and the connection to be "over" ... according to your criterion.
The correct way to do this is to just read on the input stream until the read call returns -1. That means "end of stream" and indicates that the other end has closed, and there won't be any more data.
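The difference between "nothing available right now" and "end of stream" can be observed without a network. This is a small sketch, assuming a PipedOutputStream/PipedInputStream pair stands in for the two ends of the socket; the class name AvailableVersusEof is made up for illustration.

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

final class AvailableVersusEof {
    /** Returns {available before write, available after write, available after draining, read() after close}. */
    static int[] observe() throws IOException {
        PipedOutputStream sender = new PipedOutputStream();
        PipedInputStream receiver = new PipedInputStream(sender);

        int beforeWrite = receiver.available();  // nothing sent yet, but the stream is open
        sender.write(new byte[] {1, 2, 3});
        int afterWrite = receiver.available();

        receiver.read(new byte[3]);              // drain what was sent
        int afterDrain = receiver.available();   // nothing buffered again, still not end of stream

        sender.close();                          // only now has the stream ended
        int atEnd = receiver.read();             // end of stream is signalled by -1

        receiver.close();
        return new int[] {beforeWrite, afterWrite, afterDrain, atEnd};
    }

    public static void main(String[] args) throws IOException {
        int[] r = observe();
        System.out.println("available: " + r[0] + ", " + r[1] + ", " + r[2] + "; read() after close: " + r[3]);
    }
}
```

available() is zero both before any data arrives and after the data has been drained, so it cannot tell you that the other end is done; only the -1 from read() does.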
You should probably use the java.net package, here's documentation for socket connections:
http://docs.oracle.com/javase/tutorial/networking/sockets/
You are misusing InputStream.available(). The available() call only tells you how many bytes you can read without blocking. It doesn't tell you if you have reached the end of the stream. It is common that an inputstream may have 0 bytes to read immediately but still be open.
Your while loop can be reconstructed like this
int count;
byte[] buffer = new byte[4096];
ByteArrayOutputStream baos = new ByteArrayOutputStream();
while((count = diSTR.read(buffer)) != -1){
baos.write(buffer, 0, count);
}
String str = new String(baos.toByteArray(), "UTF-8");
log(str);
InputStream.read(byte[]) will read bytes and return the number of bytes read or -1 when the end of stream is reached. Each time read() returns, the contents of the buffer are written to a ByteArrayOutputStream. Once all the bytes have been read (read(byte[]) returns -1) the contents of the stream can then be interpreted as a UTF-8 encoded String.

write a string to a serial port in java

I want to write a string to a serial port, but the serial port's write method only accepts a byte array. How can I send a whole string to the port? Here is my code:
serialPort.setSerialPortParams(300,SerialPort.DATABITS_8,SerialPort.STOPBITS_1,SerialPort.PARITY_NONE);
OutputStream mOutputToPort = serialPort.getOutputStream();
String mValue = "ABCDEFG";
System.out.println("beginning to Write . \r\n");
mOutputToPort.write(mValue.getBytes());
System.out.println("AT Command Written to Port. \r\n");
mOutputToPort.flush();
I don't want to send it one char at a time; I want to send the whole string at once. Thanks in advance.
Your code works (it does write the whole string at once), but it is not nice. If this is what you intend to do, the "clean" way to do it is:
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(mOutputToPort));
bw.write(mValue);
// probably "write" some more here to the buffer
bw.flush(); // now ensure accumulated data is actually written
If you are only writing one string and not more you might as well use an OutputStreamWriter directly and not use a buffer:
OutputStreamWriter osw = new OutputStreamWriter(mOutputToPort);
osw.write(mValue, 0, mValue.length());
osw.flush();
(In Java, Writers deal with writing characters to streams instead of bytes.)
If you want to ensure that no buffering occurs (and I doubt there is any reason for it, it will only increase system call overhead since the serial port will buffer the data anyway and send it out slower than your code delivers it), 123456789 provided a suitable answer. You should be careful with calling getBytes() though, as this will use the system's default character encoding (usually UTF-8 or ISO-8859-1, both suitable for writing pure ASCII) to convert from characters to bytes. If you want a particular encoding then specify it in the call to getBytes(), e.g.
try {
byte[] bytes = someString.getBytes("US-ASCII");
for (int i=0; i<bytes.length; i++)
mOutputToPort.write(bytes[i]); }
catch (UnsupportedEncodingException e) {} // well, this one is always supported
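Since Java 7 the same conversion can be written with a StandardCharsets constant, which removes the checked exception, and the whole array can be handed to the stream in one call. This is a minimal sketch, assuming a ByteArrayOutputStream stands in for the serial port's output stream; PortWriter and writeAscii are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class PortWriter {
    /** Encodes text as US-ASCII and passes the whole array to the stream in a single write call. */
    static void writeAscii(OutputStream port, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII); // explicit charset, no checked exception
        port.write(bytes);
        port.flush();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for serialPort.getOutputStream()
        ByteArrayOutputStream fakePort = new ByteArrayOutputStream();
        writeAscii(fakePort, "ABCDEFG");
        System.out.println(fakePort.size() + " bytes written");
    }
}
```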
Try to use getBytes() method which returns byte array.
OutputStream mOutputToPort = serialPort.getOutputStream();
String mValue = "ABCDEFG";
System.out.println("beginning to Write . \r\n");
byte[] data = mValue.getBytes();
for(int i=0;i<data.length;i++){
mOutputToPort.write(data[i]);}
System.out.println("AT Command Written to Port. \r\n");
mOutputToPort.flush();

Write a file in UTF-8 using FileWriter (Java)?

I have the following code; however, I want it to write a UTF-8 file so that it handles foreign characters. Is there a way of doing this? Is there some parameter I need to pass?
I would really appreciate your help with this. Thanks.
try {
BufferedReader reader = new BufferedReader(new FileReader("C:/Users/Jess/My Documents/actresses.list"));
writer = new BufferedWriter(new FileWriter("C:/Users/Jess/My Documents/actressesFormatted.csv"));
while( (line = reader.readLine()) != null) {
//If the line starts with a tab then we just want to add a movie
//using the current actor's name.
if(line.length() == 0)
continue;
else if(line.charAt(0) == '\t') {
readMovieLine2(0, line, surname.toString(), forename.toString());
} //Else we've reached a new actor
else {
readActorName(line);
}
}
} catch (IOException e) {
e.printStackTrace();
}
Safe Encoding Constructors
Getting Java to properly notify you of encoding errors is tricky. You must use the most verbose and, alas, the least used of the four alternate constructors for each of InputStreamReader and OutputStreamWriter to receive a proper exception on an encoding glitch.
For file I/O, always make sure to use the fancy encoder (or, for InputStreamReader, decoder) argument as the second argument to both OutputStreamWriter and InputStreamReader:
Charset.forName("UTF-8").newEncoder()
There are other even fancier possibilities, but none of the three simpler possibilities work for exception handling. These do:
OutputStreamWriter char_output = new OutputStreamWriter(
new FileOutputStream("some_output.utf8"),
Charset.forName("UTF-8").newEncoder()
);
InputStreamReader char_input = new InputStreamReader(
new FileInputStream("some_input.utf8"),
Charset.forName("UTF-8").newDecoder()
);
As for running with
$ java -Dfile.encoding=utf8 SomeTrulyRemarkablyLongcLassNameGoeShere
The problem is that that will not use the full encoder argument form for the character streams, and so you will again miss encoding problems.
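The difference between the constructors can be seen on a single invalid byte. This is a small sketch, assuming in-memory input; the class name StrictDecodingDemo and its method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

final class StrictDecodingDemo {
    /** Charset-only constructor: bad input is silently replaced, no exception is raised. */
    static int readLenient(byte[] data) throws IOException {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            return r.read();
        }
    }

    /** Explicit decoder constructor: returns true if bad input is reported as an exception. */
    static boolean strictRejects(byte[] data) throws IOException {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(data),
                StandardCharsets.UTF_8.newDecoder())) {
            r.read();
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] invalid = {(byte) 0xFF}; // 0xFF never occurs in well-formed UTF-8
        System.out.println("lenient read gives U+" + Integer.toHexString(readLenient(invalid)).toUpperCase());
        System.out.println("strict reader rejects input: " + strictRejects(invalid));
    }
}
```

The lenient reader hands back the replacement character U+FFFD, so the corruption goes unnoticed unless you look for it; the decoder-based reader fails at the point of the bad byte.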
Longer Example
Here’s a longer example, this one managing a process instead of a file, where we promote two different input bytes streams and one output byte stream all to UTF-8 character streams with full exception handling:
// this runs a perl script with UTF-8 STD{IN,OUT,ERR} streams
Process
slave_process = Runtime.getRuntime().exec("perl -CS script args");
// fetch his stdin byte stream...
OutputStream
__bytes_into_his_stdin = slave_process.getOutputStream();
// and make a character stream with exceptions on encoding errors
OutputStreamWriter
chars_into_his_stdin = new OutputStreamWriter(
__bytes_into_his_stdin,
/* DO NOT OMIT! */ Charset.forName("UTF-8").newEncoder()
);
// fetch his stdout byte stream...
InputStream
__bytes_from_his_stdout = slave_process.getInputStream();
// and make a character stream with exceptions on encoding errors
InputStreamReader
chars_from_his_stdout = new InputStreamReader(
__bytes_from_his_stdout,
/* DO NOT OMIT! */ Charset.forName("UTF-8").newDecoder()
);
// fetch his stderr byte stream...
InputStream
__bytes_from_his_stderr = slave_process.getErrorStream();
// and make a character stream with exceptions on encoding errors
InputStreamReader
chars_from_his_stderr = new InputStreamReader(
__bytes_from_his_stderr,
/* DO NOT OMIT! */ Charset.forName("UTF-8").newDecoder()
);
Now you have three character streams that all raise exception on encoding errors, respectively called chars_into_his_stdin, chars_from_his_stdout, and chars_from_his_stderr.
This is only slightly more complicated than what you need for your problem, whose solution I gave in the first half of this answer. The key point is this is the only way to detect encoding errors.
Just don’t get me started about PrintStreams eating exceptions.
Ditch FileWriter and FileReader, which are useless exactly because they do not allow you to specify the encoding. Instead, use
new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8)
and
new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8);
You need to use the OutputStreamWriter class as the writer parameter for your BufferedWriter. It does accept an encoding. Review javadocs for it.
Somewhat like this:
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("jedis.txt"), "UTF-8"
));
Or you can set the current system encoding with the system property file.encoding to UTF-8.
java -Dfile.encoding=UTF-8 com.jediacademy.Runner arg1 arg2 ...
You may also set it as a system property at runtime with System.setProperty(...) if you only need it for this specific file, but in a case like this I think I would prefer the OutputStreamWriter.
By setting the system property you can use FileWriter and expect that it will use UTF-8 as the default encoding for your files. In this case for all the files that you read and write.
EDIT
Starting from Java 7 (Android API 19), you can replace the String "UTF-8" with StandardCharsets.UTF_8
As suggested in the comments below by tchrist, if you intend to detect encoding errors in your file you would be forced to use the OutputStreamWriter approach and use the constructor that receives a charset encoder.
Somewhat like
CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
encoder.onMalformedInput(CodingErrorAction.REPORT);
encoder.onUnmappableCharacter(CodingErrorAction.REPORT);
BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("jedis.txt"),encoder));
You may choose between actions IGNORE | REPLACE | REPORT
Also, this question was already answered here.
Since Java 11 you can do:
FileWriter fw = new FileWriter("filename.txt", Charset.forName("utf-8"));
Since Java 7 there is an easy way to handle character encoding of BufferedWriter and BufferedReaders. You can create a BufferedWriter directly by using the Files class instead of creating various instances of Writer. You can simply create a BufferedWriter, which considers character encoding, by calling:
Files.newBufferedWriter(file.toPath(), StandardCharsets.UTF_8);
You can find more about it in JavaDoc:
Files class
Files#newBufferedWriter
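A round trip through a temporary file shows the Files approach end to end. This is a minimal sketch, assuming the default temp directory is writable; Utf8FileRoundTrip and writeThenRead are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class Utf8FileRoundTrip {
    /** Writes one line as UTF-8 to a temp file, reads it back as UTF-8, and deletes the file. */
    static String writeThenRead(String line) throws IOException {
        Path file = Files.createTempFile("utf8-demo", ".txt");
        try {
            try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                writer.write(line);
                writer.newLine();
            }
            try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
                return reader.readLine();
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        String original = "Ren\u00e9e, Zo\u00eb";
        System.out.println("round trip ok: " + original.equals(writeThenRead(original)));
    }
}
```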
With Chinese text, I tried the UTF-16 charset and luckily it worked.
Hope this could help!
PrintWriter out = new PrintWriter( file, "UTF-16" );
OK it's 2019 now, and from Java 11 you have a constructor with Charset:
FileWriter(String fileName, Charset charset)
Unfortunately, we still cannot modify the byte buffer size, and it's set to 8192. (https://www.baeldung.com/java-filewriter)
Use an OutputStreamWriter instead of a FileWriter to set the encoding type:
// file is your File object where you want to write you data
OutputStream outputStream = new FileOutputStream(file);
OutputStreamWriter outputStreamWriter = new OutputStreamWriter(outputStream, "UTF-8");
outputStreamWriter.write(json); // json is your data
outputStreamWriter.flush();
outputStreamWriter.close();
In my opinion:
If you want to write the file as UTF-8, you should create a byte array first. You can do that as follows:
byte[] by = ("<?xml version=\"1.0\" encoding=\"utf-8\"?>" + "Your string").getBytes(StandardCharsets.UTF_8);
Then you can write each byte into the file you created.
Example:
OutputStream f = new FileOutputStream(xmlfile);
// pass the charset explicitly; a bare getBytes() would use the platform default
byte[] by = ("<?xml version=\"1.0\" encoding=\"utf-8\"?>" + "Your string").getBytes(StandardCharsets.UTF_8);
for (int i = 0; i < by.length; i++) {
    byte b = by[i];
    f.write(b);
}
f.close();

BufferedReader to BufferedWriter

How can I obtain a BufferedWriter from a BufferedReader?
I'd like to be able to do something like this:
BufferedReader read = new BufferedReader(new InputStreamReader(...));
BufferedWriter write = new BufferedWriter(read);
You can use the following from Apache commons io:
IOUtils.copy(reader, writer);
site here
Java 10 update
Since Java 10, Reader provides a method called transferTo with the following signature:
public long transferTo(Writer out) throws IOException
As the documentation states, transferTo will:
Reads all characters from this reader and writes the characters to the given writer in the order that they are read. On return, this reader will be at end of the stream. This method does not close either reader or writer.
This method may block indefinitely reading from the reader, or writing to the writer. The behavior for the case where the reader and/or writer is asynchronously closed , or the thread interrupted during the transfer, is highly reader and writer specific, and therefore not specified.
If an I/O error occurs reading from the reader or writing to the writer, then it may do so after some characters have been read or written. Consequently the reader may not be at end of the stream and one, or both, streams may be in an inconsistent state. It is strongly recommended that both streams be promptly closed if an I/O error occurs.
So in order to write contents of a Java Reader to a Writer, you can write:
reader.transferTo(writer);
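A complete, if tiny, usage example: the sketch below assumes a JDK recent enough to provide Reader.transferTo and uses StringReader and StringWriter as stand-ins for real sources and sinks. The class name TransferToDemo is made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

final class TransferToDemo {
    /** Copies everything from a reader over `text` into a writer and returns what the writer received. */
    static String copyAll(String text) throws IOException {
        // transferTo does not close either side, so try-with-resources handles that
        try (Reader reader = new StringReader(text); Writer writer = new StringWriter()) {
            reader.transferTo(writer);
            return writer.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copyAll("line one\nline two\n"));
    }
}
```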
If you want to know what happens:
All input from the reader is copied to the writer.
Something similar to:
private final void copyInputStream( InputStreamReader in, OutputStreamWriter out ) throws IOException
{
char[] buffer=new char[1024];
int len;
while ( ( len=in.read(buffer) ) >= 0 )
{
out.write(buffer, 0, len);
}
}
More on input and output on The Really big Index
The BufferedWriter constructor is not overloaded to accept readers, right? What Buhb said was correct.
BufferedWriter writer = new BufferedWriter(
new FileWriter("filename_towrite"));
IOUtils.copy(new InputStreamReader(new FileInputStream("filename_toread")), writer);
writer.flush();
writer.close();
You could use Piped Read/Writers (link). This is exactly what they're designed for. Not sure you could retcon them onto an existing buffered reader you got passed tho'. You'd have to construct the buf reader yourself around it deliberately.
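A sketch of the piped approach follows, with a producer thread on the writing end and the caller on the reading end. It assumes you construct both ends yourself, as noted above; PipedCopyDemo and sendThroughPipe are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.UncheckedIOException;

final class PipedCopyDemo {
    /** A producer thread writes one line into the pipe; the caller reads it from the other end. */
    static String sendThroughPipe(String line) throws IOException, InterruptedException {
        PipedWriter pipeIn = new PipedWriter();
        PipedReader pipeOut = new PipedReader(pipeIn); // connects the two ends

        Thread producer = new Thread(() -> {
            // Closing the writer signals end of stream to the reading side
            try (BufferedWriter writer = new BufferedWriter(pipeIn)) {
                writer.write(line);
                writer.newLine();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        producer.start();

        try (BufferedReader reader = new BufferedReader(pipeOut)) {
            String received = reader.readLine(); // blocks until the producer has written the line
            producer.join();
            return received;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sendThroughPipe("hello through the pipe"));
    }
}
```

The two ends are meant to be used from different threads; the pipe's internal buffer is small, so a single thread that writes a lot before reading can block.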
