Reading InputStream as UTF-8 - java

I'm trying to read from a text/plain file over the internet, line-by-line. The code I have right now is:
URL url = new URL("http://kuehldesign.net/test.txt");
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
LinkedList<String> lines = new LinkedList<>();
String readLine;
while ((readLine = in.readLine()) != null) {
lines.add(readLine);
}
for (String line : lines) {
out.println("> " + line);
}
The file, test.txt, contains ¡Hélló!, which I am using in order to test the encoding.
When I review the OutputStream (out), I see it as > ¬°H√©ll√≥!. I don't believe this is a problem with the OutputStream since I can do out.println("é"); without problems.
Any ideas for reading from the InputStream as UTF-8? Thanks!

Solved my own problem. This line:
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
needs to be:
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), "UTF-8"));
or since Java 7:
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream(), StandardCharsets.UTF_8));
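To see why the charset argument matters, here is a minimal sketch. It assumes a ByteArrayInputStream as a stand-in for url.openStream() so that it runs offline; the class and method names are made up for illustration. The same UTF-8 bytes are decoded once with the right charset and once with a wrong one.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class Utf8ReadDemo {

    // Reads every line from the stream, decoding the bytes with the given charset.
    static List<String> readLines(InputStream stream, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(stream, charset))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes of a UTF-8 encoded test.txt (unicode escapes keep
        // the example independent of the source file's own encoding).
        byte[] utf8Bytes = "\u00A1H\u00E9ll\u00F3!".getBytes(StandardCharsets.UTF_8);

        List<String> right = readLines(new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8);
        List<String> wrong = readLines(new ByteArrayInputStream(utf8Bytes), StandardCharsets.ISO_8859_1);

        System.out.println("decoded as UTF-8:      " + right.get(0));
        System.out.println("decoded as ISO-8859-1: " + wrong.get(0));
    }
}
```

With the wrong charset every multi-byte character turns into two unrelated characters, which is the kind of garbling shown in the question.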

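StringBuilder file = new StringBuilder();
int BUFFER_SIZE = 8192;
// try-with-resources closes the reader (and the underlying stream) automatically
try (BufferedReader br = new BufferedReader(
new InputStreamReader(new FileInputStream(filename), StandardCharsets.UTF_8), BUFFER_SIZE)) {
String str;
while ((str = br.readLine()) != null) {
// readLine() strips the line terminator, so add it back
file.append(str).append('\n');
}
} catch (IOException e) {
// report the failure instead of swallowing it silently
e.printStackTrace();
}
Try this. :-)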

I ran into the same problem: every time it finds a special character, it marks it as ��. To solve this, I tried using the ISO-8859-1 encoding:
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream("txtPath"), "ISO-8859-1"));
String line;
while ((line = br.readLine()) != null) {
// process line
}
I hope this can help anyone who sees this post.

If you use the constructor InputStreamReader(InputStream in, Charset cs), bad characters are silently replaced. To change this behaviour, use a CharsetDecoder:
public static Reader newReader(InputStream is) {
return new InputStreamReader(is,
StandardCharsets.UTF_8.newDecoder()
.onMalformedInput(CodingErrorAction.REPORT)
.onUnmappableCharacter(CodingErrorAction.REPORT)
);
}
Then catch java.nio.charset.CharacterCodingException.
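A minimal sketch of how that could look in use, assuming in-memory byte arrays as input. The class name and the isValidUtf8 helper are made up for illustration; only the decoder setup comes from the answer above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictUtf8Demo {

    // Reader that reports malformed or unmappable input instead of replacing it.
    static Reader newStrictReader(InputStream is) {
        return new InputStreamReader(is,
                StandardCharsets.UTF_8.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT));
    }

    // Hypothetical helper: true if the bytes decode cleanly as UTF-8,
    // false if the strict decoder reports a coding error.
    static boolean isValidUtf8(byte[] bytes) throws IOException {
        try (BufferedReader reader =
                     new BufferedReader(newStrictReader(new ByteArrayInputStream(bytes)))) {
            while (reader.read() != -1) {
                // consume the whole stream; only decoding errors matter here
            }
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] good = "H\u00E9ll\u00F3".getBytes(StandardCharsets.UTF_8);
        // 0xE9 is "é" in ISO-8859-1 but an incomplete sequence in UTF-8.
        byte[] bad = {'H', (byte) 0xE9, 'l', 'l', 'o'};

        System.out.println("good bytes valid: " + isValidUtf8(good));
        System.out.println("bad bytes valid:  " + isValidUtf8(bad));
    }
}
```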

Related

How To Read Second Line In Text With Java

I am trying to create a simple command-line program that will determine whether a playlist is a media playlist or a master playlist based on the tag returned. Unfortunately, the first-line tag is the same for both types of playlist, so I was wondering: is there a way I could adjust my code to start reading the text at the second line?
private static String getPlaylistUrl(String theUrl) throws
FileNotFoundException, MalformedURLException, IOException{
String content = "";
//Creates a url variable
URL url = new URL(theUrl);
//Creates a urlConnection variable
URLConnection urlConnection = (HttpURLConnection) url.openConnection();
//Wraps the urlConnection in a BufferedReader
BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
String line;
while ((line = bufferedReader.readLine()) != null) {
content += line + "\n";
}
bufferedReader.close();
return content;
}
Just read the first line before the loop starts.
private static String getPlaylistUrl(String theUrl) throws IOException {
try (InputStream is = new URL(theUrl).openConnection().getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
Stream<String> stream = reader.lines()) {
return stream
// skip the first line
.skip(1)
// join all other lines using a new line delimiter
.collect(Collectors.joining("\n"));
}
}
Skip the header like this
String line;
boolean isHeader = true;
while ((line = bufferedReader.readLine()) != null) {
if (isHeader) {
isHeader = false; // skip header
}else{
content += line + "\n";
}
}
bufferedReader.close();
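For the "read the first line before the loop starts" suggestion, a minimal sketch could look like the following. It assumes a StringReader with made-up playlist text in place of the URL connection; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class SkipFirstLineDemo {

    // Discards the first line, then collects the remaining lines.
    static String readAllButFirstLine(BufferedReader reader) throws IOException {
        StringBuilder content = new StringBuilder();
        reader.readLine(); // first line is thrown away (null here just means empty input)
        String line;
        while ((line = reader.readLine()) != null) {
            content.append(line).append('\n');
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        // Illustrative playlist text only, not taken from the question.
        String playlist = "#EXTM3U\n#EXT-X-STREAM-INF:BANDWIDTH=1280000\nlow.m3u8\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(playlist))) {
            System.out.print(readAllButFirstLine(reader));
        }
    }
}
```

Using a StringBuilder also avoids the repeated string concatenation of content += line inside the loop.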

Cannot read large JSON request data from input stream even after using few variations

I am using Postman to send the JSON request. Then I get the input stream using getInputStream().
InputStream inputStream = request.getInputStream();
I have a JSON request with 2032 characters, and it might grow depending on the scenario. I tried a few suggestions for similar issues, but with all of them I am only able to read 1011 characters.
Below are the ways which I tried.
Declarations:
BufferedReader bufferedReader = null;
StringBuilder stringBuilder = new StringBuilder();
// stringBuilder.ensureCapacity(1048576);
JSONObject jObj = null;
InputStream inputStream = request.getInputStream();
1)
bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
char[] charBuffer = new char[1048576];
int bytesRead = -1;
while ((bytesRead = bufferedReader.read(charBuffer)) > 0) {
stringBuilder.append(charBuffer, 0, bytesRead);
}
2)
bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
String line = "";
String result = "";
while ((line = bufferedReader.readLine()) != null)
result += line;
inputStream.close();
3)
String line;
try {
bufferedReader = new BufferedReader(new InputStreamReader(inputStream));
while ((line = bufferedReader.readLine()) != null) {
stringBuilder.append(line);
}
} catch (Exception e) {
// TODO: handle exception
}
4)
stringBuilder.ensureCapacity(1048576);
BoundedInputStream boundedInputStream = new BoundedInputStream(inputStream);
bufferedReader = new BufferedReader(new InputStreamReader(boundedInputStream, "UTF-8"));
// StringBuilder builder= new StringBuilder();
StringBuilderWriter bufferedwriter = new StringBuilderWriter(stringBuilder);
IOUtils.copy(bufferedReader, bufferedwriter);
5)
bufferedReader = request.getReader();
char[] charBuffer = new char[1048576];
int bytesRead = -1;
while ((bytesRead = bufferedReader.read(charBuffer)) > 0) {
stringBuilder.append(charBuffer, 0, bytesRead);
}
Final consumption (the result of the second variation was my latest try):
// jObj = new JSONObject(stringBuilder.toString());
// jObj = new JSONObject(bufferedwriter.toString());
jObj = new JSONObject(result.toString());
Note: I increased the char buffer capacity to 1048576 just to check whether that would solve it, but it had no effect on how much is read from the input stream.
Could anyone please advise me on how to read large JSON input? Also, let me know if I am doing it wrong.
Thanks in advance.
You seem to want to convert the JSON into a String. With Java 8 this has become a bit simpler.
// (1)
try (BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream))) {
// (2)
String json = reader.lines().collect(Collectors.joining("\n"));
// do something with `json`...
}
Explained:
(1) Create a BufferedReader from the input stream. Using try-with-resources means that the reader is closed automatically when leaving the try {} block.
(2) BufferedReader has a lines() method which returns a Stream<String>. You can simply join all the strings using the joining collector.
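If the body should be kept exactly as sent (joining lines with "\n" normalizes the line terminators), a plain read loop with an explicit charset is an alternative. This is only a sketch: it assumes the body is UTF-8 and uses a ByteArrayInputStream in place of request.getInputStream(); the class and method names are made up.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequestBodyDemo {

    // Reads the stream to the end, decoding with the given charset.
    static String readFully(InputStream in, Charset charset) throws IOException {
        StringBuilder body = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buffer = new char[8192];
            int n;
            // read() may return fewer chars than the buffer holds; only -1 means end of stream
            while ((n = reader.read(buffer)) != -1) {
                body.append(buffer, 0, n);
            }
        }
        return body.toString();
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"name\":\"Jos\u00E9\"}";
        InputStream fakeRequestBody =
                new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8));
        System.out.println(readFully(fakeRequestBody, StandardCharsets.UTF_8));
    }
}
```

The buffer size does not limit how much can be read in total; the loop simply runs more often for larger bodies.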

How can I print every line available from one BufferedReader before checking another BufferedReader?

PrintWriter out = new PrintWriter(DoDSocket.getOutputStream(), true);
BufferedReader in = new BufferedReader(new InputStreamReader(DoDSocket.getInputStream()));
BufferedReader stdIn = new BufferedReader(new InputStreamReader(System.in));
String fromServer;
String fromUser;
while ((fromServer = in.readLine()) != null)
{
System.out.println(fromServer);
if ((fromUser= stdIn.readLine()) != null)
{
//System.out.println(fromUser);
out.println(fromUser);
}
}
In this client code, I've created a PrintWriter and a BufferedReader which communicate with a server. I also have a separate reader which reads System.in from the command line.
My problem at the moment is that if the server sends the client a multi-line string, I have to press Enter to receive each line. How can I edit this code so that every line from the server's BufferedReader is printed before it checks what the user has typed, rather than checking after every individual line?
Why not do one loop after the other?:
PrintWriter out = new PrintWriter(DoDSocket.getOutputStream(), true);
BufferedReader in = new BufferedReader(new InputStreamReader(DoDSocket.getInputStream()));
BufferedReader stdIn = new BufferedReader(new InputStreamReader(System.in));
String fromServer;
String fromUser;
while ((fromServer = in.readLine()) != null)
{
System.out.println(fromServer);
}
while ((fromUser = stdIn.readLine()) != null)
{
out.println(fromUser);
}
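Note that the first loop above only ends when the server closes the connection. A different technique, not from the answer, is to drain whatever the server has already sent by checking ready() before each readLine(). A rough sketch, with a made-up class and method name and a StringReader standing in for the socket stream:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class DrainReaderDemo {

    // Collects lines for as long as the reader reports that input is ready.
    // Caveat: ready() only says that some input is available, not that a whole
    // line is, so readLine() can still block on a partially sent line.
    static List<String> drainAvailableLines(BufferedReader in) throws IOException {
        List<String> lines = new ArrayList<>();
        while (in.ready()) {
            String line = in.readLine();
            if (line == null) {
                break; // end of stream
            }
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the reader wrapped around the socket's input stream.
        BufferedReader fromServer = new BufferedReader(new StringReader("line one\nline two\n"));
        for (String line : drainAvailableLines(fromServer)) {
            System.out.println(line);
        }
        // ...only now would the client go on to read a line from System.in
    }
}
```

If the server's output can arrive with delays, a separate thread for reading the socket is the more robust design, but that is a bigger change than the question asks for.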

Skip creating file in FileOutputStream when there is no data in Inputstream

This is a logging function which logs the error stream from the execution of an external program. Everything works fine, but I do not want to generate the log file when there is no data in the error stream. Currently it creates a zero-size file. Please help.
FileOutputStream fos = new FileOutputStream(logFile);
PrintWriter pw = new PrintWriter(fos);
Process proc = Runtime.getRuntime().exec(externalProgram);
InputStreamReader isr = new InputStreamReader(proc.getErrorStream());
BufferedReader br = new BufferedReader(isr);
String line=null;
while ( (line = br.readLine()) != null)
{
if (pw != null){
pw.println(line);
pw.flush();
}
}
Thank you.
Simply defer the creating of the FileOutputStream and PrintWriter until you need it:
PrintWriter pw = null;
Process proc = Runtime.getRuntime().exec(externalProgram);
InputStreamReader isr = new InputStreamReader(proc.getErrorStream());
BufferedReader br = new BufferedReader(isr);
String line;
while ( (line = br.readLine()) != null)
{
if (pw == null)
{
pw = new PrintWriter(new FileOutputStream(logFile));
}
pw.println(line);
pw.flush();
}
Personally I'm not a big fan of PrintWriter - the fact that it just swallows all exceptions concerns me. I'd also use OutputStreamWriter so that you can explicitly specify the encoding. Anyway, that's aside from the real question here.
The obvious thing to do is to change
FileOutputStream fos = new FileOutputStream(logFile);
PrintWriter pw = new PrintWriter(fos);
....
if (pw != null){
...
}
to
FileOutputStream rawLog = null;
try {
PrintWriter log = null;
....
if (log == null) {
rawLog = new FileOutputStream(logFile);
log = new PrintWriter(new OutputStreamWriter(rawLog, "UTF-8"));
}
...
} finally {
// Thou shalt close thy resources.
// Icky null check - might want to split this using the Execute Around idiom.
if (rawLog != null) {
rawLog.close();
}
}
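Putting the pieces together, a self-contained sketch of the deferred-creation idea might look like this. It assumes a BufferedReader over a string instead of the process's error stream, and it uses a BufferedWriter rather than PrintWriter so that write errors surface as exceptions; the class and method names are made up.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LazyLogDemo {

    // Copies every line from errors into logFile.
    // The file is only created once the first line shows up.
    // Returns true if a log file was written.
    static boolean logIfAny(BufferedReader errors, Path logFile) throws IOException {
        BufferedWriter log = null;
        try {
            String line;
            while ((line = errors.readLine()) != null) {
                if (log == null) {
                    // first line seen: create the file now
                    log = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8);
                }
                log.write(line);
                log.newLine();
            }
        } finally {
            if (log != null) {
                log.close(); // flushes buffered lines as well
            }
        }
        return log != null;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("lazylog");
        Path logFile = dir.resolve("errors.log");

        // Empty "error stream": no file should appear.
        logIfAny(new BufferedReader(new StringReader("")), logFile);
        System.out.println("file exists after empty input: " + Files.exists(logFile));

        // Non-empty "error stream": the file is created on demand.
        logIfAny(new BufferedReader(new StringReader("something failed\n")), logFile);
        System.out.println("file exists after real input:  " + Files.exists(logFile));
    }
}
```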

java: how to convert a file to utf8

I have a file that contains some non-UTF-8 characters (it is encoded in something like ISO-8859-1), so I want to convert that file to UTF-8 (or read it as such). How can I do it?
The code is like this:
File file = new File("some_file_with_non_utf8_characters.txt");
/* some code to convert the file to an utf8 file */
...
Edit: added an encoding example.
The following code converts a file from srcEncoding to tgtEncoding:
public static void transform(File source, String srcEncoding, File target, String tgtEncoding) throws IOException {
BufferedReader br = null;
BufferedWriter bw = null;
try{
br = new BufferedReader(new InputStreamReader(new FileInputStream(source),srcEncoding));
bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(target), tgtEncoding));
char[] buffer = new char[16384];
int read;
while ((read = br.read(buffer)) != -1)
bw.write(buffer, 0, read);
} finally {
try {
if (br != null)
br.close();
} finally {
if (bw != null)
bw.close();
}
}
}
--EDIT--
Using Try-with-resources (Java 7):
public static void transform(File source, String srcEncoding, File target, String tgtEncoding) throws IOException {
try (
BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(source), srcEncoding));
BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(target), tgtEncoding)); ) {
char[] buffer = new char[16384];
int read;
while ((read = br.read(buffer)) != -1)
bw.write(buffer, 0, read);
}
}
String charset = "ISO-8859-1"; // or what corresponds
BufferedReader in = new BufferedReader(
new InputStreamReader (new FileInputStream(file), charset));
String line;
while( (line = in.readLine()) != null) {
....
}
There you have the text decoded. You can write it with the symmetric Writer/OutputStream methods, in the encoding you prefer (e.g. UTF-8).
You need to know the encoding of the input file. For example, if the file is in Latin-1, you would do something like this:
FileInputStream fis = new FileInputStream("test.in");
InputStreamReader isr = new InputStreamReader(fis, "ISO-8859-1");
Reader in = new BufferedReader(isr);
FileOutputStream fos = new FileOutputStream("test.out");
OutputStreamWriter osw = new OutputStreamWriter(fos, "UTF-8");
Writer out = new BufferedWriter(osw);
int ch;
while ((ch = in.read()) > -1) {
out.write(ch);
}
out.close();
in.close();
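The same conversion can also be sketched with the java.nio.file API (Java 7 and later), which takes Charset objects instead of charset names. The class and method names below are made up, and the temp files exist only to make the example runnable.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class TranscodeDemo {

    // Decodes source with srcCharset and re-encodes the text into target with tgtCharset.
    static void transcode(Path source, Charset srcCharset, Path target, Charset tgtCharset)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(source, srcCharset);
             BufferedWriter writer = Files.newBufferedWriter(target, tgtCharset)) {
            char[] buffer = new char[8192];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("latin1", ".txt");
        Path target = Files.createTempFile("utf8", ".txt");

        // Write a small Latin-1 file to convert.
        Files.write(source, "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1));

        transcode(source, StandardCharsets.ISO_8859_1, target, StandardCharsets.UTF_8);

        System.out.println("source bytes: " + Files.size(source));
        System.out.println("target bytes: " + Files.size(target));
    }
}
```

Unlike InputStreamReader with a charset name, Files.newBufferedReader throws an IOException on malformed input instead of silently replacing it, so a wrong source charset is more likely to be noticed.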
You only want to read it as UTF-8?
What I did recently, given a similar problem, was to start the JVM with -Dfile.encoding=UTF-8 and then read/print as normal. I don't know if that is applicable in your case.
With that option:
System.out.println("á é í ó ú");
prints the characters correctly. Otherwise it prints a ? symbol.
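If changing the JVM option is not possible, another way to get the same effect for output is to wrap the stream in a PrintStream with an explicit encoding. A small sketch, with a made-up class and helper name; whether the characters then display correctly still depends on the terminal expecting UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8PrintDemo {

    // Hypothetical helper: returns the bytes a UTF-8 PrintStream emits for the text.
    static byte[] printAsUtf8(String text) throws UnsupportedEncodingException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream out = new PrintStream(sink, true, "UTF-8");
        out.print(text);
        out.flush();
        return sink.toByteArray();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Encode explicitly as UTF-8 regardless of the file.encoding setting.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("\u00E1 \u00E9 \u00ED \u00F3 \u00FA");

        System.out.println("bytes for one accented char: " + printAsUtf8("\u00E1").length);
    }
}
```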
