I'm getting a strange issue with a loop that reads from a BufferedReader and never ends...
This is the code:
BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
int b;
StringBuilder buff = new StringBuilder();
while ((b = in.read()) != -1 ) {
buff.append((char) b);
}
System.out.println(buff.toString());
But it never reaches the last line that prints buff.toString().
Is there anything wrong with this code?
Thanks.
Can you try changing the while condition like this?
while ((b = in.read()) > -1 ) {
buff.append((char) b);
}
Your loop is trying to read until EOF (that is the only reason for an input stream/reader to return -1 for the read() method).
The problem is that your HTTP connection (and your socket) might be left open (for a while), also leaving your input stream/reader open. So instead of reaching the end condition for your loop, the in.read() call will just block, waiting for input.
If you control the "other side" of the connection, you could close it, to see what happens. But I doubt that would work for the use case in general. Instead, you need to parse the message according to the HTTP protocol (see HTTP/1.1 RFC2616).
If you only need to parse the headers, then you could use your BufferedReader, and read only full lines, until you find an empty line. This will work, because HTTP uses linefeeds (linefeed being CRLF in this case) after each header name/value pair, and end the header part with exactly two linefeeds. Everything after that will be the message body.
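A minimal sketch of that header-reading approach (the map and variable names here are illustrative, not taken from your code):
BufferedReader in = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
String requestLine = in.readLine();                 // e.g. "GET /index.html HTTP/1.1"
Map<String, String> headers = new HashMap<>();      // java.util imports assumed
String line;
while ((line = in.readLine()) != null && !line.isEmpty()) {
    int colon = line.indexOf(':');
    if (colon > 0) {
        headers.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
    }
}
// the reader is now positioned at the start of the message body (if any)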
PS: This is the easy/happy case. Note that a single connection/socket may be re-used for multiple HTTP requests/responses. You may have to handle this as well, depending on your requirements.
Related
I'm trying to write a simple HTTP server but can't figure out how to read the body segment of a POST-request. I'm having trouble reading beyond the empty line after the headers.
Here's what I do:
BufferedReader br = new BufferedReader(new InputStreamReader(client.getInputStream()));
StringBuilder request = new StringBuilder();
String line;
while(!(line = br.readLine()).isEmpty()) {
request.append(line).append(CRLF);
System.out.println(line);
}
// read body ?
So this basically loads the Request and headers in a String. But I can't figure out how to skip that one line that separates the headers from the body.
I've tried readLine() != null and manually reading two more lines after the loop terminates, but that results in the loop never finishing.
Try parsing the content-length header to get the number of bytes. After the blank line you'll want to read exactly that many bytes. Using readLine() won't work because the body isn't terminated by a CRLF.
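A rough sketch of that idea, continuing from the header loop above; contentLength is assumed to have been parsed out of the Content-Length header value:
char[] body = new char[contentLength];
int read = 0;
while (read < contentLength) {
    int n = br.read(body, read, contentLength - read);   // keep reading until we have it all
    if (n == -1) {
        break;   // peer closed the connection early
    }
    read += n;
}
String requestBody = new String(body, 0, read);
One caveat: Content-Length counts bytes, so reading chars through a Reader is only exact for single-byte encodings; reading the body straight from the InputStream avoids that mismatch.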
I am trying to download a web page with all of its resources. First I download the HTML, and to be sure the file stays formatted I use the function below.
There is an issue: I find the number 10 in the final file, and I found that 10 is the character code of LF (line feed). This causes trouble for my JavaScript functions.
Example of the final result:
<!DOCTYPE html>10<html lang="fr">10 <head>10 <meta http-equiv="content-type" content="text/html; charset=UTF-8" />10
Can someone help me find the real issue?
public static String scanfile(File file) {
    StringBuilder sb = new StringBuilder();
    try {
        BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
        while (true) {
            String readLine = bufferedReader.readLine();
            if (readLine != null) {
                sb.append(readLine);
                sb.append(System.lineSeparator());
                Log.i(TAG, sb.toString());
            } else {
                bufferedReader.close();
                return sb.toString();
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
        return null;
    }
}
There are multiple problems with your code.
Charset error
BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
This isn't going to work reliably, and it fails in tricky ways.
Files (and, for that matter, data given to you by webservers) come in bytes: a stream of numbers, each number being between 0 and 255.
So, if you are a webserver and you want to send the character ö, what byte(s) do you send?
The answer is complicated. The mapping that explains how some character is rendered in byte(s)-form is called a character set encoding (shortened to 'charset').
Anytime bytes are turned into characters or vice versa, there is always a charset involved. Always.
So, you're reading a file (that'd be bytes), and turning it into a Reader (which is chars). Thus, charset is involved.
Which charset? The API of new FileReader(path) explains which one: "The system default". You do not want that.
Thus, this code is broken. You want one of two things:
Option 1 - write the data as is
When doing the job of querying the webserver for the data and relaying this information onto disk, you'd want to just store the bytes (after all, webserver gives bytes, and disks store bytes, that's easy), but the webserver also sends the encoding, in a header, and you need to save this separately. Because to read that 'sack of bytes', you need to know the charset to turn it into characters.
How would you do this? Well, up to you. You could for example decree that the data file starts with the name of a charset encoding (as sent via that header), then a 0 byte, and then the data, unmodified. I think you should go with option 2, however
Option 2
Another, better option for text-based documents (which HTML is), is this: When reading the data, convert it to characters, using the encoding as that header tells you. Then, to save it to disk, turn the chars back to bytes, using UTF-8, which is a great encoding and an industry standard. That way, when reading, you just know it's UTF-8, period.
To read a UTF-8 text file, you do:
Files.newBufferedReader(file.toPath());
The reason this works, is that the Files API, unlike most other APIs (and unlike FileReader, which you should never ever use), defaults to UTF_8 and not to platform-default. If you want, you can make it more readable:
Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8);
same thing - but now in the code it is clear what's happening.
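A tiny sketch of the "write it back out as UTF-8" half, where html is assumed to hold the page text already decoded with the charset the webserver declared, and the target path is just a placeholder:
Path target = Paths.get("page.html");                          // placeholder path
Files.write(target, html.getBytes(StandardCharsets.UTF_8));    // bytes on disk are now UTF-8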
Broken exception handling
} catch (IOException e) {
e.printStackTrace();
return null;
}
This is not okay - if you catch an exception, either [A] throw something else, or [B] handle the problem. And 'log it and keep going' is definitely not 'handling' it. With this exception-handling strategy, one error results in a thousand things going wrong with a thousand stack traces, all of them except the first undesired and irrelevant, which is why this is horrible code and you should never write it this way.
The easy solution is to just put throws IOException on your scanFile method. The method inherently interacts with files, it SHOULD be throwing that. Note that your psv main(String[] args) method can, and usually should, be declared to throws Exception.
It also makes your code simpler and shorter, yay!
Resource Management failure
A FileReader is a resource. You MUST close it, no matter what happens. You are not doing that: if .readLine() throws an exception, then your code will jump to the catch handler and bufferedReader.close() is never executed.
The solution is to use the ARM (Automatic Resource Management) construct:
try (var br = Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
// code goes here
}
This construct ensures that close() is invoked, regardless of how the 'code goes here' block exits. Even if it 'exits' via an exception or a return statement.
The problem
Apart from the three items above, your 'read a file and print it' code is mostly fine. The problem is that the HTML file on disk is corrupted; the error lies in your code that reads the data from the web server and saves it to disk. You did not paste that code.
Specifically, System.lineSeparator() returns the actual separator string, not the number 10. Thus, assuming the code you pasted really is the code you are running, if you are seeing a literal '10' show up, then the HTML file on disk already has it in there. It's not the read code.
Closing thoughts
More generally the job of 'just print a file on disk with a known encoding' can be done in far fewer lines of code:
public static String scanFile(String path) throws IOException {
return Files.readString(Paths.get(path));
}
You should just use the above code instead. It's simple, short, doesn't have any bugs, cannot leak resources, has proper exception handling, and will use UTF-8.
Actually, there is no problem with this function. I was mistakenly adding the 10 in another function in my code.
The code below gets a byte array from an HTTP request and saves it in bytes[]; the final data will be saved in message[].
I check whether it contains a header by converting it to a String. If it does, I read some information from the header, then cut the header off by saving the bytes after it to message[].
I then try to output message[] to a file using FileOutputStream. It sort of works, but it only saves 10KB of information, i.e. one iteration of the while loop (it seems to be overwriting). If I use FileOutputStream(file, true) to append, it works... once; the next time I run the program the file just gets added on to, which isn't what I want. How do I write multiple chunks of bytes to the same file across iterations of the loop, but still completely overwrite the file when I run the program again?
byte bytes[] = new byte[(10 * 1024)];
while (dis.read(bytes) > 0) {
    // Set all the bytes to the message
    byte message[] = bytes;
    String string = new String(bytes, "UTF-8");
    // Does bytes contain header?
    if (string.contains("\r\n\r\n")) {
        String theByteString[] = string.split("\r\n\r\n");
        String theHeader = theByteString[0];
        String[] lmTemp = theHeader.split("Last-Modified: ");
        String[] lm = lmTemp[1].split("\r\n");
        String lastModified = lm[0];
        // Cut off the header and save the rest of the data after it
        message = theByteString[1].getBytes("UTF-8");
        // cache
        hm.put(url, lastModified);
    }
    // Output message[] to file.
    File f = new File(hostName + path);
    f.getParentFile().mkdirs();
    f.createNewFile();
    try (FileOutputStream fos = new FileOutputStream(f)) {
        fos.write(message);
    } catch (IOException ioe) {
        ioe.printStackTrace();
    }
}
}
You're opening a new FileOutputStream on each iteration of the loop. Don't do that. Open it outside the loop, then loop and write as you are doing, then close at the end of the loop. (If you use a try-with-resources statement with your while loop inside it, that'll be fine.)
That's only part of the problem though - you're also doing everything else on each iteration of the loop, including checking for headers. That's going to be a real problem if the byte array you read contains part of the set of headers, or indeed part of the header separator.
Additionally, as noted by EJP, you're ignoring the return value of read apart from using it to tell whether or not you're done. You should always use the return value of read to know how much of the byte array is actually usable data.
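Putting the first and third points together, a rough sketch of the copy loop (this deliberately ignores the header-splitting problem described above, which still needs a proper fix):
File f = new File(hostName + path);
f.getParentFile().mkdirs();
try (FileOutputStream fos = new FileOutputStream(f)) {   // opened once; overwrites on a fresh run
    byte[] bytes = new byte[10 * 1024];
    int n;
    while ((n = dis.read(bytes)) > 0) {
        fos.write(bytes, 0, n);                           // write only what was actually read
    }
}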
Fundamentally, you either need to read the whole response into a byte array to start with - which is easy to do, but potentially inefficient in memory - or accept the fact that you're dealing with a stream, and write more complex code to detect the end of the headers.
Better though, IMO, would be to use an HTTP library which already understands all this header processing, so that you don't need to do it yourself. Unless you're writing a low-level HTTP library yourself, you shouldn't be dealing with low-level HTTP details, you should rely on a good library.
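For instance, even the JDK's built-in HttpURLConnection already does the header handling for you (URL and header name are shown only as an example, inside a method that throws IOException):
HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
String lastModified = conn.getHeaderField("Last-Modified");    // header parsing done for you
try (InputStream body = conn.getInputStream();
     FileOutputStream fos = new FileOutputStream(new File(hostName + path))) {
    byte[] chunk = new byte[8192];
    int n;
    while ((n = body.read(chunk)) > 0) {
        fos.write(chunk, 0, n);    // body only; headers never end up in the file
    }
}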
Open the file ahead of the loop.
NB you need to store the result of read() in a variable, and pass that variable to new String() as the length. Otherwise you are converting junk in the buffer beyond what was actually read.
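In other words, something along these lines:
int n = dis.read(bytes);
if (n > 0) {
    // convert only the n bytes that were actually read this time
    String string = new String(bytes, 0, n, "UTF-8");
}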
There is an issue with reading the data - you read only part of the response (because at that moment not all of the data had been transferred to you yet) - so obviously you write only that part.
Check this answer for how to read the full data from the InputStream:
Convert InputStream to byte array in Java
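The gist of that linked answer is a sketch like this, which keeps reading until the stream reports end-of-data:
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
byte[] chunk = new byte[8192];
int n;
while ((n = dis.read(chunk)) != -1) {
    buffer.write(chunk, 0, n);
}
byte[] fullResponse = buffer.toByteArray();   // complete response: headers plus body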
I'm working with Netty and it seems that a FrameDecoder in a ChannelPipeline isn't invoked unless/until a carriage return is received. For example, I have the following decoder that I've written to attempt to detect when a complete JSON string has been received:
public class JsonDecoder extends FrameDecoder {
    @Override
    protected Object decode(ChannelHandlerContext ctx, Channel channel, ChannelBuffer buf) {
        char inChar = 0;
        ChannelBuffer origBuffer = buf.copy();
        StringBuilder json = new StringBuilder();
        int ctr = 0;
        while (buf.readable()) {
            inChar = (char) buf.readByte();
            json.append(inChar);
            if (inChar == '{') {
                ctr++;
            } else if (inChar == '}') {
                ctr--;
            }
        }
        if (json.length() > 0 && ctr == 0) {
            return origBuffer;
        }
        buf.resetReaderIndex();
        return null;
    }
}
(Please pardon the somewhat sloppy code - this is my first attempt using Netty and a bit of a learning experience.)
What I see is that this works fine when I test it by connecting to the server using telnet, pasting in some valid JSON, and pressing return. However, if I do not press return after the final closing '}' in the JSON string, the decoder never gets called with an updated buffer.
Is there a way to configure the channel pipeline to work differently? I've Googled for this and looked through the Netty documentation. I feel like I'm missing something basic and I just am not looking in the right place or searching for the right thing. Thanks for any help.
Is your telnet client reverting to 'old line by line' mode whereby only completed lines are sent to the server (telnet man page)? Try writing a simple Java client to send the message instead.
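For example, a bare-bones test client along these lines (host, port and payload are placeholders; run it from a main declared 'throws Exception') sends the JSON with no trailing newline:
try (Socket socket = new Socket("localhost", 8080)) {
    OutputStream out = socket.getOutputStream();
    out.write("{\"name\":\"test\"}".getBytes(StandardCharsets.UTF_8));   // no newline appended
    out.flush();
    Thread.sleep(2000);   // keep the connection open briefly so the server can react
}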
I guess reading a JSON stream is more akin to reading an HTTP stream, since you will have to keep track of the opening and closing braces (and brackets as well, should the JSON string be an array). If you look at the source for the HTTP decoder, you'll see that it is using a ReplayingDecoder. Using a replaying decoder is not necessary, but it helps a lot if the entire message is split in more than one buffer.
FrameDecoders are meant for reading messages that are "framed" by special characters (hence the name of the decoder) or prepended with a length field.
I would also highly recommend using the DecoderEmbedder helper class so that you can unit test your JSON decoder without doing actual I/O.
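As a rough sketch of such a unit test, assuming Netty 3.x (DecoderEmbedder lives in org.jboss.netty.handler.codec.embedder; the expected results in the comments are what the decoder above should produce):
DecoderEmbedder<ChannelBuffer> embedder = new DecoderEmbedder<ChannelBuffer>(new JsonDecoder());
embedder.offer(ChannelBuffers.copiedBuffer("{\"key\":", CharsetUtil.UTF_8));
ChannelBuffer partial = embedder.poll();    // expected: null, braces not balanced yet
embedder.offer(ChannelBuffers.copiedBuffer("\"value\"}", CharsetUtil.UTF_8));
ChannelBuffer frame = embedder.poll();      // expected: the complete JSON frame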
Hello, I am currently working with sockets and input/output streams. I have a strange problem with the loop I use to send bytes. For some reason it gets stuck when it tries to read from the InputStream when it is supposed to stop. Does anyone have any idea what's wrong?
int bit;
final byte[] request = new byte[1024];
if (response instanceof InputStream) {
    while ((bit = response.read(request)) > 0) { // <-- Stuck here
        incoming.getOutputStream().write(request, 0, bit);
        incoming.getOutputStream().flush();
    }
}
incoming.close();
InputStream.read blocks until input data is available, end of file is detected, or an exception is thrown.
You don't catch the exception, and don't check for EOF.
What I've done in the past to leave each side open is to add a termination character to the end of each message that you wouldn't expect to see in the message. If you are building the messages yourself then you could use a character such as a ; or maybe double pipes or something ||. Then just check for that character on the receiving end. Just a workaround. Not a solution. It was necessary in my case but may not be for you.
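A sketch of that workaround on the receiving side, with ';' as the agreed-upon terminator and in standing for whatever Reader or InputStream wraps the connection:
StringBuilder message = new StringBuilder();
int c;
while ((c = in.read()) != -1) {
    if (c == ';') {                 // terminator reached: one complete message
        break;
    }
    message.append((char) c);
}
// the stream stays open, ready for the next message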