As in subject. How to rewrite the file with different charset?
Where can I find the available encodings - are they final static ints?
FileInputStream fis = new FileInputStream(inputFile);
InputStreamReader isr = new InputStreamReader(fis, inputEncoding);
BufferedReader in = new BufferedReader(isr);
FileOutputStream fos = new FileOutputStream(outputFile);
OutputStreamWriter osw = new OutputStreamWriter(fos, outputEncoding);
BufferedWriter out = new BufferedWriter(osw);
String line = in.readLine();
out.write(line);
As in subject. How to rewrite the file with different charset?
I'm not sure why you asked this question as your code seems legit, although it copies only 1 line (and swallows newlines). I wouldn't have used readLine(), but just read() in a loop, maybe with a buffer. This way you copy everything without modifying/swallowing newlines.
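The read() loop described above could look roughly like this. It is a minimal sketch, not the asker's code: the class name CharsetCopy, the method name transcode and the 8192-char buffer size are assumptions chosen for illustration. It works on streams, so wrapping a FileInputStream and FileOutputStream gives the file-to-file case.

```java
import java.io.*;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetCopy {
    // Decodes everything from `in` using `from` and re-encodes it to `out` using `to`.
    // Characters are copied in blocks, so line terminators pass through unchanged.
    // Closing the reader and writer also closes the underlying streams.
    public static void transcode(InputStream in, Charset from,
                                 OutputStream out, Charset to) throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(in, from));
             Writer writer = new BufferedWriter(new OutputStreamWriter(out, to))) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-ins for the input and output files.
        byte[] latin1 = "caf\u00e9\r\nna\u00efve\n".getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(latin1), StandardCharsets.ISO_8859_1,
                  sink, StandardCharsets.UTF_8);
        System.out.println(latin1.length + " bytes in, " + sink.size() + " bytes out");
    }
}
```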
Where can I find the available encodings - are they final static ints?
By Charset#availableCharsets().
SortedMap<String, Charset> availableCharsets = Charset.availableCharsets();
// ...
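Charsets are identified by name (a String) rather than by int constants. As a small sketch of how the map above might be used, the following prints the canonical names and looks one up with a fallback. The class name CharsetLookup and the helper lookupOrDefault are hypothetical, not part of the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;

class CharsetLookup {
    // Returns the named charset, or `fallback` if this JVM does not support
    // the name or the name is syntactically illegal.
    public static Charset lookupOrDefault(String name, Charset fallback) {
        try {
            return Charset.isSupported(name) ? Charset.forName(name) : fallback;
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException (a subclass) is thrown for malformed names.
            return fallback;
        }
    }

    public static void main(String[] args) {
        SortedMap<String, Charset> available = Charset.availableCharsets();
        // The keys are the canonical charset names, sorted case-insensitively.
        for (String name : available.keySet()) {
            System.out.println(name);
        }
        System.out.println(lookupOrDefault("windows-1251", StandardCharsets.UTF_8));
    }
}
```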
The supported encoding formats are specified in the JDK Documentation.
As for the conversion itself, see:
Supported Encodings
Read Encoded Data
Write Encoded Data
Converting between String of different character sets
List all available character set converters
I need to receive a unicode (UTF-8) string sent by client on a server side. The length of the string is of course unknown.
ServerSocket serverSocket = new ServerSocket(567);
Socket clientSocket = serverSocket.accept();
PrintWriter out = new PrintWriter(clientSocket.getOutputStream(), true);
BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
I can read bytes using in.read() (until it returns -1) but the problem is that the string is unicode, in other words, every character is represented by two bytes. So converting the result of read() which would work with normal ascii characters makes no sense.
update
As per the suggestions below, I created the reader as follows:
BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream(),"UTF-8"));
I've changed the client side to send a newline (#10#13) after each string.
But the new problem is that I get garbage instead of the real string if I call:
in.readLine();
When I print the result, I get a nonsense string (I cannot even copy it here), although I am not dealing with non-Latin characters or anything else.
To see what's going on I introduced following code:
int j = 0;
while (j < 255){
j++;
System.out.print(in.read()+", ");
}
So here I just print all bytes received. If I send "ab" I get:
97, 0, 98, 0, 10, 13,
This is what one would expect, but then why doesn't the readLine method produce "good" results?
Anyway, if we couldn't find the actual answer, I should probably collect the bytes (like above) and create my string from them? How to do that?
P.S. Just a quick note - I am on Windows.
Use new InputStreamReader(clientSocket.getInputStream(), "UTF-8") to specify the charset used to decode the InputStream coming from your client.
When creating InputStreamReader you can set encoding like this:
BufferedReader in =
new BufferedReader(
new InputStreamReader(clientSocket.getInputStream(), "UTF-8")
);
Try this way:
Reader in = new BufferedReader(
new InputStreamReader(
clientSocket.getInputStream(), StandardCharsets.UTF_8));
Note the StandardCharsets class. It is available since Java 7 and provides a more elegant way to specify a standard encoding like UTF-8.
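One thing worth checking, offered as a guess rather than a diagnosis: a dump of 97, 0, 98, 0 for "ab" is what UTF-16LE produces, not UTF-8 (in UTF-8, "ab" is just 97, 98). The sketch below decodes those four values both ways to show the difference. The class DecodeDump and its decode helper are hypothetical names for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecodeDump {
    // Turns a dump of unsigned byte values (0..255) into a String using the given charset.
    public static String decode(int[] values, Charset charset) {
        byte[] raw = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            raw[i] = (byte) values[i];
        }
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        int[] dump = {97, 0, 98, 0};
        // Read as UTF-16LE, the four bytes form the two characters "ab".
        System.out.println(decode(dump, StandardCharsets.UTF_16LE));
        // Read as UTF-8, each byte becomes its own character, including two NULs.
        System.out.println(decode(dump, StandardCharsets.UTF_8).length());
    }
}
```

If the client really does send UTF-16LE, passing StandardCharsets.UTF_16LE to the InputStreamReader instead of UTF-8 would be the corresponding change on the server side.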
I'm writing a Java server for an assignment, and I have observed some strange behaviour when I write into both a wrapped stream and its wrapper stream. Can this cause any problems? As far as I can see, it can, but how? Please enlighten me.
as an example:
OutputStream os = new OutputStream(...);
PrintWriter pw = new PrintWriter(os);
And I want to write both in the PrintWriter, and the OutputStream.
To write characters to a byte-based stream, you need to use OutputStreamWriter:
An OutputStreamWriter is a bridge from character streams to byte streams.
So that would be:
OutputStream os = ...
OutputStreamWriter osw = new OutputStreamWriter(os, "UTF-8");
PrintWriter pw = new PrintWriter(osw);
I think the problem is that you need to specify the encoding, since the constructor PrintWriter(OutputStream out) uses the default encoding which might not be correct for your data input:
OutputStream os = ...
PrintWriter pw = new PrintWriter(new OutputStreamWriter(os, "UTF-8"));
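Separately from the encoding, one possible source of the strange behaviour is buffering: characters written to the PrintWriter only reach the underlying OutputStream when the writer is flushed, so direct writes to the stream can overtake them. A minimal sketch, using a ByteArrayOutputStream as a stand-in for the real stream; the class MixedWrites and its method are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

class MixedWrites {
    // Writes "A" through the PrintWriter and "B" directly to the underlying
    // stream, optionally flushing the writer in between, and returns what
    // ended up in the stream.
    public static String write(boolean flushBetween) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(os, StandardCharsets.UTF_8));
        pw.print("A");          // held in the writer's buffer for now
        if (flushBetween) {
            pw.flush();         // pushes "A" down to os before the direct write
        }
        os.write('B');          // bypasses the writer entirely
        pw.flush();
        return new String(os.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(write(false)); // prints "BA"
        System.out.println(write(true));  // prints "AB"
    }
}
```

Flushing the wrapper before every direct write to the wrapped stream keeps the output in order.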
I need to share a stream of data between two instances, as below:
// get EClasses which should be connected
final uk.man.xman.xcore.Parameter source = getParameter(sourceAnchor);
final uk.man.xman.xcore.Parameter target = getParameter(targetAnchor);
// Set data channels
//Output stream
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
DataOutputStream dataOutputStream = new DataOutputStream(new BufferedOutputStream(outputStream));
source.setOutputStream(dataOutputStream);
//Input stream
DataInputStream inpuDataStream = new DataInputStream(new BufferedInputStream(new ByteArrayInputStream(outputStream.toByteArray())));
target.setInputStream(inpuDataStream);
Everything works if I write within those lines of code. Strangely, when I use the data channel to write something from another class, like here:
DataOutputStream dataOutputStream = (DataOutputStream) inputParameter.getOutputStream();
System.out.println("WRITE:" + attributes.getValue("value"));
dataOutputStream.writeUTF(attributes.getValue("value"));
dataOutputStream.flush();
I am not able to read, and I really do not know why. Am I missing something?
Thanks for your time
Not sure if that's what you're asking, but you're creating an InputStream that reads from an empty byte array. That doesn't make much sense:
// create an Output stream that will write in memory
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
...
// transform what has been written to the output stream into a byte array.
// Since nothing has been written yet, outputStream.toByteArray() returns
// an empty array
DataInputStream inpuDataStream = new DataInputStream(new BufferedInputStream(new ByteArrayInputStream(outputStream.toByteArray())));
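To make the point concrete: toByteArray() returns a copy of what has been written so far, so the input stream has to be created after the write and flush. This is a minimal sketch under that assumption; the class SnapshotDemo and its methods are hypothetical and do not use the asker's Parameter type.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class SnapshotDemo {
    // A fresh ByteArrayOutputStream has nothing to hand to a reader.
    public static int bytesBeforeAnyWrite() {
        return new ByteArrayOutputStream().toByteArray().length;
    }

    // Writes first, flushes the buffered wrapper, and only then takes the
    // snapshot that the input stream reads from.
    public static String writeThenRead(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(buffer));
        out.writeUTF(value);
        out.flush(); // without this the bytes may still sit in the BufferedOutputStream
        DataInputStream in =
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
        return in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(bytesBeforeAnyWrite());
        System.out.println(writeThenRead("hello"));
    }
}
```

If the reader must see data as it is written rather than after the fact, a connected PipedOutputStream and PipedInputStream pair used from separate threads is one alternative.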
I open a text file using windows-1251 encoding
FileInputStream is = new FileInputStream(path);
BufferedReader br = new BufferedReader(new InputStreamReader(is,
"windows-1251"));
and later write the changes like:
RandomAccessFile file = new RandomAccessFile(new File(path), "rw");
try {
file.write(etMainView.getText().toString().getBytes());
file.close();
Toast.makeText(this, "Changes saved", Toast.LENGTH_SHORT)
.show();
//..... Exception handling
The problem is that it messes up all the non-latin letters in the file and when I open it again, all such letters are replaced with some unreadable characters. I guess the RandomAccessFile uses UTF-8 by default which is causing troubles. How can I save the file keeping the encoding I used to open it?
Use .getBytes("windows-1251") instead of .getBytes(); .getBytes() uses the default JVM encoding.
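A short sketch of why the charset argument matters, assuming the JVM provides windows-1251 (typical JDKs do, but it is not one of the charsets every Java platform is required to support). The class SameCharsetSave and its helpers are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class SameCharsetSave {
    // Assumption: this JVM supports windows-1251; forName throws otherwise.
    static final Charset CP1251 = Charset.forName("windows-1251");

    // Encodes text with the same charset that was used to read the file.
    public static byte[] encode(String text) {
        return text.getBytes(CP1251);
    }

    public static String decode(byte[] raw) {
        return new String(raw, CP1251);
    }

    public static void main(String[] args) {
        String text = "\u041f\u0440\u0438\u0432\u0435\u0442"; // a six-letter Cyrillic word
        // One byte per Cyrillic letter in windows-1251, two in UTF-8.
        System.out.println(encode(text).length);
        System.out.println(text.getBytes(StandardCharsets.UTF_8).length);
        // Reading UTF-8 bytes back as windows-1251 does not restore the text.
        System.out.println(decode(text.getBytes(StandardCharsets.UTF_8)).equals(text));
    }
}
```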
If you want to use the stream APIs, you can do it this way:
RandomAccessFile file = ....;
FileChannel fc = file.getChannel();
OutputStream os = Channels.newOutputStream(fc);
OutputStreamWriter osw = new OutputStreamWriter(os, "windows-1251");
osw.write("Some string");
osw.flush();
file.close();
How can I transform a String value into an InputStreamReader?
ByteArrayInputStream also does the trick:
InputStream is = new ByteArrayInputStream( myString.getBytes( charset ) );
Then convert to reader:
InputStreamReader reader = new InputStreamReader(is, charset);
I also found the Apache Commons IOUtils class, so:
InputStreamReader isr = new InputStreamReader(IOUtils.toInputStream(myString));
Does it have to be specifically an InputStreamReader? How about using StringReader?
Otherwise, you could use StringBufferInputStream, but it's deprecated because of character conversion issues (which is why you should prefer StringReader).
Same question as @Dan - why not StringReader?
If it has to be InputStreamReader, then:
String charset = ...; // your charset
byte[] bytes = string.getBytes(charset);
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
InputStreamReader isr = new InputStreamReader(bais, charset);
Are you trying to get a) Reader functionality out of InputStreamReader, or b) InputStream functionality out of InputStreamReader? You won't get b). InputStreamReader is not an InputStream.
The purpose of InputStreamReader is to take an InputStream - a source of bytes - and decode the bytes to chars in the form of a Reader. You already have your data as chars (your original String). Encoding your String into bytes and decoding the bytes back to chars would be a redundant operation.
If you are trying to get a Reader out of your source, use StringReader.
If you are trying to get an InputStream (which only gives you bytes), use apache commons IOUtils.toInputStream(..) as suggested by other answers here.
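The two cases above can be sketched with the JDK alone. The class StringSources and its helper names are hypothetical; asStream uses ByteArrayInputStream in place of the Apache Commons call so the example has no external dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StringSources {
    // Case a): a Reader straight from the String, with no encode/decode round trip.
    public static Reader asReader(String s) {
        return new StringReader(s);
    }

    // Case b): an InputStream of the String's bytes in an explicit charset.
    public static InputStream asStream(String s, Charset charset) {
        return new ByteArrayInputStream(s.getBytes(charset));
    }

    // Drains a Reader into a String.
    public static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = "gr\u00fc\u00df dich";
        System.out.println(readAll(asReader(text)).equals(text));
        // If an InputStreamReader is really required, decode with the same charset.
        Reader viaBytes = new InputStreamReader(
                asStream(text, StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        System.out.println(readAll(viaBytes).equals(text));
    }
}
```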
You can try Cactoos:
InputStream stream = new InputStreamOf(str);
Then, if you need a Reader:
Reader reader = new ReaderOf(stream);