I'm playing around setting up my own java http server to better understand http servers and what goes on under the hood of the web. I've developed a pretty simple server and have been able to serve both html pages as well as data in JSON form. Then I saw the browser (I'm using chrome but assuming it's the same for others) was sending a request for favicon.ico. I'm able to identify that request on my server, so I'm trying to serve up a random icon I downloaded and resized to 16x16 pixels in png format, as that's what the internet says the size needs to be. Here's my code, note it's not supposed to be anything professional, just something that will work for my basic educational purposes:
[set up ServerSocket and listen]
public static String err_header = "HTTP/1.1 500 ERR\nAccess-Control-Allow-Origin: *";
public static String success_header = "HTTP/1.1 200 OK\nAccess-Control-Allow-Origin: *";
public static String end_header = "\r\n\r\n";
while(true){
try{
System.out.println("Listening for new connections");
clientSocket = server.accept();
System.out.println("Connection established");
InputStreamReader isr = new InputStreamReader(clientSocket.getInputStream());
BufferedReader reader = new BufferedReader(isr);
String getLine = reader.readLine();//first line of HTTP request
handleRequest(getLine,clientSocket);
}//end of try
catch(Exception e){
[error stuff]
}//end of catch
}//end of while
HandleRequest method:
public static void handleRequest(String getLine,Socket clientSocket) throws Exception{
if(getLine.substring(5,16).equals("favicon.ico")){
List<String> iconTag = new ArrayList<String>();
iconTag.add("\nContent-Type: image/png");
handleFileRequest("[file]",iconTag,clientSocket);
}//end of if
else{
handleFileRequest("[file]",clientSocket);
}//end of else
}//end of handleRequest
handleFileRequest for images:
public static void handleFileRequest(String fileName,List<String> headerTags,Socket clientSocket) throws Exception{
OutputStream out = clientSocket.getOutputStream();
BufferedReader read = new BufferedReader(new FileReader(fileName));
out.write(success_header.getBytes("UTF-8"));
Iterator<String> itr = headerTags.iterator();
while(itr.hasNext()){
out.write(itr.next().getBytes("UTF-8"));
}//end of while
out.write(end_header.getBytes("UTF-8"));
String readLine = "";
while((readLine = read.readLine())!=null){
out.write(readLine.getBytes("UTF-8"));
}//end of while
out.flush();
out.close();
}//end of handleFileRequest
And it appears to work, as the server sends the file, the browser shows the 200 OK response, but there's no favicon and when I filter network requests to just images, there is one image requested by the page being served but the favicon request is not listed there (the favicon request is in the "other" section). Similarly when clicking on the other image the image shows up on the preview, whereas that's not the case with the favicon request. Screenshot:
Meanwhile here's what the other image looks like, and it shows up in the page just fine:
I also tried including the Content-Length header, but that didn't seem to make a difference. Am I missing something obvious?
Also just to clarify, I know I can include the favicon in the actual html page, the goal isn't to do it, but to understand how it works.
Reading binary files
It seems the content of the favicon is not served correctly.
I suspect this is most likely due to the way you read its content:
while((readLine = read.readLine())!=null){
out.write(readLine.getBytes("UTF-8"));
}
Reading binary content line by line is inappropriate,
because the concept of lines, and also UTF-8 encoding,
don't make sense in the context of binary files.
A Reader decodes the bytes into characters, which already alters any byte sequence that is not valid text in that charset.
On top of that, the readLine method of a BufferedReader doesn't return the full line, because it strips the line terminator from the end.
You cannot manually add the terminator back because you cannot know whether it was \n, \r or \r\n.
Here's a simpler and correct way to read the content of a binary file:
byte[] bytes = Files.readAllBytes(Paths.get("/path/to/file"));
Once you have this, it's easy to produce a correct file header with the content length, using the value of bytes.length.
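To make the corruption concrete, here is a minimal sketch that pushes the 8-byte PNG signature through the same kind of readLine loop as in the question. The class and method names are made up for illustration, and UTF-8 is assumed as the reader's charset (the question's FileReader would use the platform default).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LineReadCorruptionDemo {

    // The fixed 8-byte signature that starts every PNG file.
    static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'
    };

    // Mimics the loop from the question: decode as text, read line by line,
    // re-encode each line as UTF-8 and write it out without any terminator.
    static byte[] copyLineByLine(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(input), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                byte[] encoded = line.getBytes(StandardCharsets.UTF_8);
                out.write(encoded, 0, encoded.length);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] copied = copyLineByLine(PNG_SIGNATURE);
        System.out.println("original:  " + Arrays.toString(PNG_SIGNATURE));
        System.out.println("copied:    " + Arrays.toString(copied));
        System.out.println("identical: " + Arrays.equals(PNG_SIGNATURE, copied));
    }
}
```

The copied bytes no longer contain the \r and \n bytes of the signature, and the leading 0x89 byte (not valid UTF-8 on its own) does not survive the decode step either, so an image viewer or browser would reject the result.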
What happens when you visit a page in a browser
It seems it will be good for you if we clarify a few things.
When you open a URL in a browser,
the browser sends a GET request to the web server to download the content of the original URL that you have specified.
Once it has the page content, it will send further GET requests:
Fetch a favicon if it doesn't have one already. The location of this may be specified in the HTML document, or else the browser will try to fetch SERVERNAME/favicon.ico by default
Fetch the images specified in src attribute of any (valid) <img/> tags in the document
Fetch the style sheets specified in href attribute of any (valid) <link rel="stylesheet"/> tags in the document
... and similarly for <script/> tags, and so on...
The favicon is purely cosmetic, shown in browser tab titles;
the other resources are essential for rendering a page.
They are not essential in text-based browsers like lynx,
so such browsers will not fetch these resources.
This is the explanation for why the favicon is requested, and how.
How does a web server serve files?
In the most basic case, serving a file has two important components:
Produce an appropriate HTTP header: a status line, followed by header lines in name: value format. Each line must end with \r\n (CRLF), although many clients also tolerate a bare \n.
There should be at least a Content-Type header.
The header must be terminated by a blank line.
After the blank line that terminates the header,
the content can be anything, even binary.
To illustrate with an example,
consider the curl command, which dumps the content of a url to standard output.
If you run curl url-to-some-html-file,
you will see the content of the html file.
If you run curl url-to-some-image-file,
you will see the content of the image file.
It will be unreadable, and your terminal will probably make funny noises.
You can redirect the output to a file with curl url-to-some-image-file > image.png,
and that will give you an image file,
binary content,
that you can open in any image viewer tool.
In short, serving files is really just writing a header to the client connection,
then writing a blank line to terminate the header,
then writing the content.
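As a rough sketch of that sequence in Java: the helper below assembles a status line, a few headers, the terminating blank line and the untouched body bytes into one byte array. The class name RawHttpResponse, the method build and the choice of a Connection: close header are assumptions for illustration, not something taken from the question's code.

```java
import java.nio.charset.StandardCharsets;

final class RawHttpResponse {

    // Builds: status line, headers, blank line, then the body bytes exactly as given.
    // Every header line ends with CRLF; the body is never decoded as text.
    static byte[] build(String contentType, byte[] body) {
        String head = "HTTP/1.1 200 OK\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
        byte[] headBytes = head.getBytes(StandardCharsets.US_ASCII);
        byte[] response = new byte[headBytes.length + body.length];
        System.arraycopy(headBytes, 0, response, 0, headBytes.length);
        System.arraycopy(body, 0, response, headBytes.length, body.length);
        return response;
    }

    public static void main(String[] args) {
        byte[] fakeImage = {(byte) 0x89, 'P', 'N', 'G'};
        byte[] response = build("image/png", fakeImage);
        System.out.println(response.length + " bytes in total");
    }
}
```

In a socket-based server like the one in the question, the result would be written in one go, for example clientSocket.getOutputStream().write(RawHttpResponse.build("image/png", Files.readAllBytes(Paths.get("/path/to/file")))), followed by a flush.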
Debugging the serving of an image
An easy way to debug that an image is correctly served is to save the URL to a file using curl,
and then verify that the saved file and the original file are identical,
for example using the cmp command:
curl -o file url-to-favicon
cmp file /path/to/original
The output of cmp should be empty.
This command only produces output if it finds a difference in the two files.
Implementing a simple HTTP server
Instead of using a ServerSocket,
here's a drastically simpler way to implement an HTTP server:
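// Uses the JDK's built-in com.sun.net.httpserver.HttpServer; imports from
// java.io, java.net, java.nio.charset, java.nio.file and java.util are omitted for brevity.
HttpServer server = HttpServer.create(new InetSocketAddress(1234), 0);
// Serve the icon as raw bytes, announcing the exact body length.
server.createContext("/favicon.ico", t -> {
    byte[] bytes = Files.readAllBytes(Paths.get("/path/to/favicon"));
    t.sendResponseHeaders(200, bytes.length);
    try (OutputStream os = t.getResponseBody()) {
        os.write(bytes);
    }
});
// Serve the text page line by line.
server.createContext("/", t -> {
    Charset charset = StandardCharsets.UTF_8;
    List<String> lines = Files.readAllLines(Paths.get("/path/to/index"), charset);
    // A length of 0 means "unknown": the body is sent with chunked transfer encoding.
    t.sendResponseHeaders(200, 0);
    try (OutputStream os = t.getResponseBody()) {
        for (String line : lines) {
            os.write((line + "\n").getBytes(charset));
        }
    }
});
server.start();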
Related
I'm trying to implement a simple servlet that returns a zip file that is bundled inside the application (simple resource)
So I've implemented the following method in the server side:
@GET
@Path("{path}/{zipfile}")
@Produces("application/zip")
public Response getZipFile(
        @PathParam("path") String pathFolder,
        @PathParam("zipfile") String zipFile) throws IOException {
    String fullPath = String.format("/WEB-INF/repository/%s/%s",
            pathFolder, zipFile);
    String realPath = ServletContextHolder.INSTANCE.getServletContext()
            .getRealPath(fullPath);
    File file = new File(realPath);
    ResponseBuilder response = Response.ok((Object) file);
    return response.build();
}
When I call this method from the browser, the zip file is downloaded and its size is the same number of bytes as the original zip on the server.
However, when I call this using a simple XMLHttpRequest from my client side code:
var oXHR = new XMLHttpRequest();
var sUrl = "http://localhost:8080/path/file.zip"
oXHR.open('GET', sUrl);
oXHR.responseType = 'application/zip';
oXHR.send();
I can see in the Network tab of the Developer tools in chrome that the content size is bigger, and I'm unable to process this zip file (for instance JSzip doesn't recognize it).
It seems like somewhere between my response and the final response from org.glassfish.jersey.servlet.ServletContainer, some extra bytes are written/ some encoding is done on the file.
Can you please assist?
Best Regards,
Maxim
When you use an Ajax request, the browser expects text by default and will try to decode the response as UTF-8, corrupting your binary data.
Try oXHR.responseType = "arraybuffer"; instead: that way the browser won't alter the data and will give you the raw content (available in oXHR.response).
This solution won't work in IE 6-9. If you need to support those, check the JSZip documentation: http://stuk.github.io/jszip/documentation/howto/read_zip.html
If that doesn't fix it, try downloading the zip file directly (without any JS code involved) to check whether the issue comes from the JS side or from the Java side.
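The size increase can be reproduced outside the browser. The sketch below uses Java only to illustrate the effect, not to mirror what the browser does internally: it decodes a few bytes as UTF-8 text and encodes them again. The sample bytes after the zip signature are invented, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class TextDecodeGrowthDemo {

    // Decodes arbitrary bytes as UTF-8 text and encodes that text again,
    // which is roughly what happens when binary data is forced through a text channel.
    static byte[] throughUtf8Text(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "PK" 0x03 0x04 is the usual zip local-file-header signature;
        // the remaining bytes are made-up values that are not valid UTF-8.
        byte[] raw = {'P', 'K', 3, 4, (byte) 0xFF, (byte) 0xFE, (byte) 0x80};
        byte[] mangled = throughUtf8Text(raw);
        System.out.println(raw.length + " bytes in, " + mangled.length + " bytes out");
    }
}
```

Bytes that are not valid UTF-8 are replaced by the replacement character U+FFFD, which takes more than one byte when encoded, so the data both grows and becomes unrecoverable. That matches the symptom of a larger, unreadable zip.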
Today I'm developing a Java RMI server (and also the client) that gets info from a page and returns what I want. The code is below. The problem is that the URL I pass to the method sometimes throws an IOException saying that the given URL produces a 503 HTTP error. It would be easy if it always failed, but it only happens some of the time.
I have this method structure because the page I parse is from a weather company and I want info for many cities, not only one. Some cities work perfectly on the first try and others fail. Any suggestions?
public ArrayList<Medidas> parse(String url){
medidas = new ArrayList<Medidas>();
int v=0;
String sourceLine;
String content = "";
try{
// The URL address of the page to open.
URL address = new URL(url);
// Open the address and create a BufferedReader with the source code.
InputStreamReader pageInput = new InputStreamReader(address.openStream());
BufferedReader source = new BufferedReader(pageInput);
// Append each new HTML line into one string. Add a tab character.
while ((sourceLine = source.readLine()) != null){
if(sourceLine.contains("<tbody>")) v=1;
else if (sourceLine.contains("</tbody>"))
break;
else if(v==1)
content += sourceLine + "\n";
}
........................
........................ NOW THE PARSING CODE, NOT IMPORTANT
}
HTTP 5xx errors reflect server errors, so this likely has nothing to do with your client code.
You would get a 4xx error (such as 400) if you were passing invalid parameters in your request.
503 is "Service Unavailable" and may be sent by the server when it is overloaded and cannot process your request. From a publicly accessible server, that could explain the erratic behavior.
Edit
Build a retry handler in your code when you detect a 503. Apache HTTPClient can do that automatically for you.
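A hand-rolled retry handler could look roughly like the sketch below, which uses only the JDK. The class RetryOn503, its two methods, and the doubling delay are illustrative assumptions; this is not how Apache HttpClient implements its retry support.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.function.IntSupplier;

final class RetryOn503 {

    // Calls `attempt` until it yields a status other than 503 or maxAttempts is reached,
    // doubling the pause between tries. At least one attempt is always made.
    // Returns the last status code seen.
    static int withRetry(IntSupplier attempt, int maxAttempts, long initialDelayMillis) {
        long delay = initialDelayMillis;
        int status = attempt.getAsInt();
        for (int tries = 1;
             status == HttpURLConnection.HTTP_UNAVAILABLE && tries < maxAttempts;
             tries++) {
            try {
                Thread.sleep(delay);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return status; // give up early if the thread is interrupted
            }
            delay *= 2; // simple exponential backoff
            status = attempt.getAsInt();
        }
        return status;
    }

    // One real attempt: returns the HTTP status code of a GET on the given URL.
    static int statusOf(String url) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as RetryOn503.withRetry(() -> RetryOn503.statusOf(url), 4, 500) would try up to four times, pausing 500 ms, 1 s and 2 s in between. In the parse method from the question, the same loop could wrap the code that opens the stream, so the page is not requested an extra time just to check the status.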
List of HTTP Status Codes
Check that the IOException is really not a MalformedURLException. Try printing out the URLs to verify a bad URL is not causing the IOException.
How large is the file you are parsing? Perhaps your JVM is running out of memory.
I have written a Servlet that should act like a web-proxy. But some of the Javascript GET calls only return part of the original content when I am loading a page, like localhost:8080/Proxy?requestURL=example.com.
When printing the content of the JavaScript files to the console, they are complete.
But the response at the browser is truncated.
I am writing like this:
ServletOutputStream sos = resp.getOutputStream();
OutputStreamWriter writer = new OutputStreamWriter(sos);
..
String str = content_of_get_request
..
writer.write(str);
writer.flush();
writer.close();
The strange thing is, when I request directly the Javascript that was loaded during the page request like this:
localhost:8080/Proxy?requestURL=anotherexaple.com/needed.js
The whole content is returned to the browser.
It would be great if someone had an idea.
Regards
UPDATE:
The problem was the way I created the response String:
while ((line = rd.readLine()) != null)
{
response.append(line);
}
I read one line from a Stream and appended it on a StringBuffer, but it appears that firefox and chrome had a problem with that.
It seems that some browsers implement a maximum line length for JavaScript, however there is no maximum line length mentioned in the RFC HTTP 1.1 standard.
Fix:
Just adding a "\n" to the line fixes the issue.
response.append(line+"\n");
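One plausible explanation, which is my reading and not something the post confirms, is that dropping the line breaks changes the script itself: a // comment then swallows the statement that used to be on the next line. The sketch below shows the effect on a two-line snippet and an alternative that copies characters in blocks so line terminators pass through unchanged. The class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class KeepLineBreaks {

    // The approach from the question: readLine() strips the terminator
    // and nothing puts it back.
    static String joinLines(Reader in) {
        StringBuilder sb = new StringBuilder();
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Copies characters in blocks, so the original line terminators are preserved.
    static String readAll(Reader in) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String script = "var a = 1; // set a\nvar b = 2;\n";
        System.out.println(joinLines(new StringReader(script)));
        System.out.print(readAll(new StringReader(script)));
    }
}
```

With joinLines, the declaration of b ends up inside the comment. Appending "\n" after each line, as in the fix above, avoids that, while readAll also keeps the original terminators (\r\n stays \r\n).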
That's because you are only reading the HTML response; you are not actually fetching the other resources referenced in the HTML, such as images, JS, etc.
You can observe this when you monitor how the browser renders the HTML through Firebug for Firefox.
1) The browser receives the HTML response.
2) Then it parses it for referenced resources and makes a separate GET call for each of those.
So in order for the proxy to work, you need to mimic this browser behavior.
My advice is to use an already available open source library such as HtmlUnit.
Everything works fine, but only if the file is small, about 1 MB. When I try bigger files, like 20 MB, my browser displays the file instead of forcing a download. I have tried many headers so far; my code now looks like this:
PrintWriter out = response.getWriter();
String fileName = request.getParameter("filename");
File f= new File(fileName);
InputStream in = new FileInputStream(f);
BufferedInputStream bin = new BufferedInputStream(in);
DataInputStream din = new DataInputStream(bin);
while(din.available() > 0){
out.print(din.readLine());
out.print("\n");
}
response.setContentType("application/force-download");
response.setContentLength((int)f.length());
response.setHeader("Content-Transfer-Encoding", "binary");
response.setHeader("Content-Disposition","attachment; filename=\"" + "xxx\"");//fileName);
in.close();
bin.close();
din.close();
You are setting the response headers after writing the contents of the file to the output stream. This is quite late in the response lifecycle to be setting headers. The correct sequence of operations should be to set the headers first, and then write the contents of the file to the servlet's outputstream.
Therefore, your method should be written as follows (this won't compile as it is a mere representation):
response.setContentType("application/force-download");
response.setContentLength((int)f.length());
//response.setContentLength(-1);
response.setHeader("Content-Transfer-Encoding", "binary");
response.setHeader("Content-Disposition","attachment; filename=\"" + "xxx\"");//fileName);
...
...
File f= new File(fileName);
InputStream in = new FileInputStream(f);
BufferedInputStream bin = new BufferedInputStream(in);
DataInputStream din = new DataInputStream(bin);
while(din.available() > 0){
out.print(din.readLine());
out.print("\n");
}
The reason for the failure is that the actual headers sent by the servlet may differ from the ones you intend to send. Headers appear before the body in the HTTP response, so if the container has to start sending the body before you have set them, it will fill in headers of its own to keep the response valid. Setting the headers after the file has been written is therefore futile, as the container may already have sent them. You could confirm this by looking at the network traffic using Wireshark or an HTTP debugging proxy like Fiddler or WebScarab.
You may also refer to the Java EE API documentation for ServletResponse.setContentType to understand this behavior:
Sets the content type of the response being sent to the client, if the response has not been committed yet. The given content type may include a character encoding specification, for example, text/html;charset=UTF-8. The response's character encoding is only set from the given content type if this method is called before getWriter is called.
This method may be called repeatedly to change content type and character encoding. This method has no effect if called after the response has been committed.
...
Set content-type and other headers before you write the file out. For small files the content is buffered, and the browser gets the headers first. For big ones the data come first.
This is from a php script which solves the problem perfectly with every browser I've tested (FF since 3.5, IE8+, Chrome)
header("Content-Disposition: attachment; filename=\"".$fname_local."\"");
header("Content-Type: application/force-download");
header("Content-Transfer-Encoding: binary");
header("Content-Length: ".filesize($fname));
So as far as I can see, you're doing everything correctly. Have you checked your browser settings?
I have a GWT page where user enter data (start date, end date, etc.), then this data goes to the server via RPC call. On the server I want to generate Excel report with POI and let user save that file on their local machine.
This is my test code to stream file back to the client but for some reason I think it does not know how to stream file to the client when I'm using RPC:
public class ReportsServiceImpl extends RemoteServiceServlet implements ReportsService {
public String myMethod(String s) {
File f = new File("/excelTestFile.xls");
String filename = f.getName();
int length = 0;
try {
HttpServletResponse resp = getThreadLocalResponse();
ServletOutputStream op = resp.getOutputStream();
ServletContext context = getServletConfig().getServletContext();
resp.setContentType("application/octet-stream");
resp.setContentLength((int) f.length());
resp.setHeader("Content-Disposition", "attachment; filename*=\"utf-8''" + filename + "");
byte[] bbuf = new byte[1024];
DataInputStream in = new DataInputStream(new FileInputStream(f));
while ((in != null) && ((length = in.read(bbuf)) != -1)) {
op.write(bbuf, 0, length);
}
in.close();
op.flush();
op.close();
}
catch (Exception ex) {
ex.printStackTrace();
}
return "Server says: " + filename;
}
}
I've read somewhere on internet that you can't do file stream with RPC and I have to use Servlet for that. Is there any example of how to use Servlet and how to call that servlet from ReportsServiceImpl. Do I really need to make a servlet or it is possible to stream it back with my RPC?
You have to make a regular Servlet, you cannot stream binary data from ReportsServiceImpl. Also, there is no way to call the servlet from ReportsServiceImpl - your client code has to directly invoke the servlet.
On the client side, you'd have to create a normal anchor link with the parameters passed via the query string. Something like <a href="http://myserver.com/myservlet?parm1=value1&..">...</a>.
On the server side, move your code to a standard Servlet, one that does NOT inherit from RemoteServiceServlet. Read the parameters from the request object, create the excel and send it back to the client. The browser will automatically popup the file download dialog box.
You can do that just using GWT RPC and Data URIs:
In your example, make your myMethod return the file content.
On the client side, format a Data URI with the file content received.
Use Window.open to open a file save dialog passing the formatted DataURI.
Take a look at this reference, to understand the Data URI usage:
Export to csv in jQuery
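As a small sketch of the formatting step, assuming the RPC method returns the finished Data URI string instead of the raw bytes: the helper name toDataUri and the sample CSV content are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class DataUriSketch {

    // Builds "data:<mime type>;base64,<payload>" from raw file bytes.
    static String toDataUri(String mimeType, byte[] content) {
        return "data:" + mimeType + ";base64,"
                + Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) {
        byte[] csv = "a,b\n1,2\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(toDataUri("text/csv", csv));
    }
}
```

The client would then pass the returned string to Window.open. Since the whole file travels as one Base64 string, which is about a third larger than the original, this is probably only practical for small reports.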
It's possible to get the binary data you want back through the RPC channel in a number of ways... uuencode, for instance. However, you would still have to get the browser to handle the file as a download.
And, based on your code, it appears that you are trying to trigger the standard browser mechanism for handling the given MIME type by modifying the response on the server so the browser will recognize it as a download (and open a save dialog, for instance). To do that, you need to get the browser to make the request for you, and you need a servlet there to handle the request. It can be done with REST URLs, but ultimately you will need a servlet to do even that.
You need, in effect, to set a browser window URL to the URL that sends back the modified response object.
So this question (about streaming) is not really compatible with the code sample. One or the other (communication protocols or server-modified response object) approach has to be adjusted.
The easiest one to adjust is the communication method.