I can't print the whole response from the server on the console!
There are three ways to work around this:
Add the header Connection: close
Replace HTTP/1.1 with HTTP/1.0
Call s.close(); // Socket.close();
I can't close the connection because I want to send more than one request over the same connection; I just want to print the whole response without closing it.
String content = "GET /Zuck HTTP/1.1\r\nHost: www.facebook.com\r\nuser-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36\r\n\r\n";
Executing your code returns the following:
HTTP/1.1 302 Found
Location: https://www.facebook.com/zuck
Strict-Transport-Security: max-age=15552000; preload
Content-Type: text/html; charset="utf-8"
X-FB-Debug: NHDnNLmTeg5PBPiSL7++1dz/ZdRbnlnKy1gpdfBbLFkvrhbJMJT+nLJd1VYpmEkkkUtmvXsjgLvFEeML/82WUA==
Date: Thu, 18 Jun 2020 15:36:24 GMT
Alt-Svc: h3-27=":443"; ma=3600
Connection: keep-alive
Content-Length: 0
HTTP response status code 302 indicates a redirect to the Location: https://www.facebook.com/zuck. Either handle redirects in your code or, to get your example running, simply replace Zuck with zuck in your content string.
Since you're operating on a raw socket, you cannot in general tell when you have received the whole response. With a protocol like HTTP, however, you can in some cases.
In your example you receive Content-Length: 0, which tells you the number of bytes (0) in the message body.
You can also send the header Connection: close, which closes the connection after the full response has been sent, but I think that is not what you're looking for.
You can also just do read/write operations on two separate threads.
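To illustrate the Content-Length approach, here is a minimal sketch of reading exactly one response off the socket, so the connection can stay open for the next request. Header parsing is simplified and chunked transfer encoding is not handled; names are illustrative:
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class HttpReader {
    // Reads one complete HTTP response from the stream without closing the socket.
    public static String readResponse(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        StringBuilder headers = new StringBuilder();
        int contentLength = 0;
        String line;
        // Headers end at the first empty line (CRLF CRLF).
        while (!(line = readLine(din)).isEmpty()) {
            headers.append(line).append("\r\n");
            if (line.toLowerCase().startsWith("content-length:")) {
                contentLength = Integer.parseInt(line.substring(15).trim());
            }
        }
        // Read exactly Content-Length body bytes; the connection stays open afterwards.
        byte[] body = new byte[contentLength];
        din.readFully(body);
        return headers + "\r\n" + new String(body);
    }

    // Reads a single CRLF-terminated line, byte by byte, without read-ahead.
    private static String readLine(DataInputStream din) throws IOException {
        StringBuilder sb = new StringBuilder();
        int b;
        while ((b = din.read()) != -1 && b != '\n') {
            if (b != '\r') sb.append((char) b);
        }
        return sb.toString();
    }
}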
I want to compress the response body in a javax.servlet.Filter. Here is my code:
byte[] bytes = ...; // compressing response body (code elided)
response.addHeader("Content-Encoding", "gzip");
response.addHeader("Content-Length", String.valueOf(bytes.length));
response.setContentLength(bytes.length);
response.setBufferSize(bytes.length * 2);
ServletOutputStream output = response.getOutputStream();
output.write(bytes);
output.flush();
output.close();
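For context, here is a minimal, self-contained sketch of the pattern such a filter typically follows, with the downstream response captured by a buffering wrapper before being gzipped. The class names and the wrapper are illustrative, not the asker's actual code:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;
import javax.servlet.*;
import javax.servlet.http.HttpServletResponse;
import javax.servlet.http.HttpServletResponseWrapper;

public class GzipFilter implements Filter {
    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        HttpServletResponse response = (HttpServletResponse) res;
        BufferingWrapper wrapper = new BufferingWrapper(response);
        chain.doFilter(req, wrapper); // downstream writes into the buffer

        // Gzip the captured body.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
            gzip.write(wrapper.getBody());
        }
        byte[] bytes = compressed.toByteArray();

        response.addHeader("Content-Encoding", "gzip");
        response.setContentLength(bytes.length);
        response.getOutputStream().write(bytes);
        response.getOutputStream().flush();
    }

    @Override public void init(FilterConfig config) {}
    @Override public void destroy() {}

    // Captures everything written via getOutputStream() (getWriter() omitted for brevity).
    static class BufferingWrapper extends HttpServletResponseWrapper {
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        BufferingWrapper(HttpServletResponse response) { super(response); }
        @Override public ServletOutputStream getOutputStream() {
            return new ServletOutputStream() {
                @Override public void write(int b) { buffer.write(b); }
                @Override public boolean isReady() { return true; }        // Servlet 3.1 plumbing
                @Override public void setWriteListener(WriteListener l) {} // Servlet 3.1 plumbing
            };
        }
        byte[] getBody() { return buffer.toByteArray(); }
    }
}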
But the actual response I see in Chrome DevTools is:
Accept-Ranges: bytes
Cache-Control: max-age=2592000
Content-Type: application/javascript;charset=UTF-8
Date: Fri, 14 Dec 2018 15:34:25 GMT
Last-Modified: Tue, 09 Oct 2018 13:42:54 GMT
Server: Apache-Coyote/1.1
Transfer-Encoding: chunked
I do not expect Transfer-Encoding: chunked, because I declare "Content-Length". I wrote a simple test in Java:
URLConnection connection = new URL("http://127.0.0.1:8081/js/ads.js").openConnection();
connection.addRequestProperty("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8");
connection.addRequestProperty("Accept-Encoding", "gzip, deflate");
connection.addRequestProperty("Accept-Language", "ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7");
connection.addRequestProperty("Cache-Control", "no-cache");
connection.addRequestProperty("Connection", "keep-alive");
connection.addRequestProperty("Host", "127.0.0.1:8081");
connection.addRequestProperty("Pragma", "no-cache");
connection.addRequestProperty("Upgrade-Insecure-Requests", "1");
connection.addRequestProperty("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36");
connection.connect();
connection.getHeaderFields().forEach((s, strings) ->
System.out.println(s + ":" + String.join(",", strings)));
and here is what I found:
if I comment out setting the "User-Agent" header, or change "User-Agent" to any other value, then I get a response with "Content-Length"
if "User-Agent" points at Chrome, then I get Transfer-Encoding: chunked.
I debugged all the way to the sun.nio.ch.SocketChannel#write method, and it receives the correct ByteBuffers with the Content-Length header values.
I can't understand where this magic transformation to chunked is happening.
Update
The strange thing is that I write gzipped bytes into the Socket (I'm sure, as I debugged down to the call of the native write method in the SocketChannel implementation). But URLConnection returns my byte array unzipped when I use Chrome's User-Agent, and the correct gzipped byte array if I don't specify a User-Agent header or use some random string.
So it seems like the magic is happening somewhere in the Windows socket implementation.
I have a Twitter shortened URL (t.co) and I'm trying to use jsoup to send a request and parse its response. There should be three redirect hops before reaching the final URL. This is not the case when using jsoup, even after setting followRedirects to true.
My code:
public static void main(String[] args) {
    try {
        Response response = Jsoup.connect("https://t. co/sLMy6zi4Yw").followRedirects(true).execute(); // Space intentional to avoid SOF shortened errors
        System.out.println(response.statusCode()); // prints 200
    } catch (IOException e) {
        System.out.println(e.getMessage());
    }
}
However, using Python's Requests library, I can get the right response:
response = requests.get('https://t. co/sLMy6zi4Yw', allow_redirects=False)
print(response.status_code)
301
I'm using jsoup version 1.11.2 and Requests version 2.18.4 with Python 3.5.2.
Anybody have any insight on the matter?
To overcome this special case, you can remove the User-Agent header, which Jsoup sets by default (for some unknown/undocumented reason):
Connection connection = Jsoup.connect(url).followRedirects(true);
connection.request().removeHeader("User-Agent");
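A complete sketch of the workaround, using the same shortened URL as in the question (space kept intentionally):
import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;

public class Main {
    public static void main(String[] args) throws IOException {
        Connection connection = Jsoup.connect("https://t. co/sLMy6zi4Yw").followRedirects(true); // Space intentional, as above
        // Without Jsoup's default User-Agent, t.co answers with a plain 301 redirect
        // that Jsoup can follow, instead of the 200 + meta-refresh page.
        connection.request().removeHeader("User-Agent");
        Connection.Response response = connection.execute();
        System.out.println(response.statusCode()); // 200 at the final URL after the redirects
    }
}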
Let's examine the raw requests and observe the server's behavior.
A request with a user agent (to simulate a browser) returns:
status code 200
a meta refresh, which is a method of instructing a web browser to automatically refresh the current web page or frame after a given time interval; in this case 0 seconds, with the URL http://bit. ly/2n3VDpo
JavaScript code which replaces location with the same URL (google "meta refresh is deprecated" / "drawbacks of using meta refresh")
Curl example
curl --include --raw "https://t. co/sLMy6zi4Yw" --user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36"
Response
HTTP/1.1 200 OK
cache-control: private,max-age=300
content-length: 257
content-security-policy: referrer always;
content-type: text/html; charset=utf-8
referrer-policy: unsafe-url
server: tsa_b
strict-transport-security: max-age=0
vary: Origin
x-response-time: 20
x-xss-protection: 1; mode=block; report=https://twitter.com/i/xss_report
<head><meta name="referrer" content="always"><noscript><META http-equiv="refresh" content="0;URL=http://bit. ly/2n3VDpo"></noscript><title>http://bit. ly/2n3VDpo</title></head><script>window.opener = null;location.replace("http:\/\/bit. ly\/2n3VDpo")</script>
A request without a user agent returns:
status code 301
header "location" with the redirect url
Curl example
curl --include --raw "https://t. co/sLMy6zi4Yw"
HTTP/1.1 301 Moved Permanently
cache-control: private,max-age=300
content-length: 0
location: http://bit. ly/2n3VDpo
server: tsa_b
strict-transport-security: max-age=0
vary: Origin
x-response-time: 9
I am currently trying to authenticate with an IIS/6.0 data server. With the code below, how do I retrieve the challenge from the server? Currently I am sending the first GET request to the server:
//Part 1: The Request
pw.println("GET /dashboard/ HTTP/1.1");
pw.println("Host: MyServer.net");
pw.println("User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0");
pw.println("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
pw.println("Accept-Language: en-US,en;q=0.5");
pw.println("Accept-Encoding: gzip, deflate");
pw.println("Connection: keep-alive");
pw.println("WWW-Authenticate: Negotiate");
pw.println();
pw.flush();
//Part 1: The Response
HTTP/1.1 401 Unauthorized
Content-Length: 1656
Content-Type: text/html
Server: Microsoft-IIS/6.0
WWW-Authenticate: Negotiate
WWW-Authenticate: NTLM
X-Powered-By: ASP.NET
Date: Mon, 14 Sep 2015 19:28:16 GMT
Then I send the next request
//Part 2: The Request
pw.println("GET /dashboard/ HTTP/1.1");
pw.println("Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
pw.println("Referer: http://MyServer.net/dashboard/");
pw.println("Accept-Language: en-US,en;q=0.5");
pw.println("User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0");
pw.println("Accept-Encoding: gzip, deflate");
pw.println("Host: MyServer.net");
pw.println("Connection: keep-alive");
pw.println("Authorization: Negotiate");
pw.println();
pw.flush();
//Part 2: The Response
HTTP/1.1 401 Unauthorized
Content-Length: 1539
Content-Type: text/html
Server: Microsoft-IIS/6.0
WWW-Authenticate: Negotiate YF0GBisGAQUFAqBTMFGgMDAuBgkqhkiC9xIBAgIGCSqGSIb3EgECAgYKKoZIhvcSAQICAwYKKwYBBAGCNwICCqMdMBugGRsXcGFlbXMxOTYkQFNUQVJCVUNLUy5ORVQ=
X-Powered-By: ASP.NET
Date: Mon, 14 Sep 2015 19:28:16 GMT
There are two things I think I have done wrong here:
The WWW-Authenticate header field appears to be wrong in "Part 2: The Response". I think it is because I am not using NTLM (which is what I want to use).
I have not sent my Active Directory credentials yet, and I do not know what I need to do next.
I found a really helpful document, Responding to the Challenge, which explains how to encode the Active Directory credentials.
What steps do I need to take in order to completely authenticate with the server so that I can poll data from it?
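Not an answer for the raw-socket code above, but for comparison: the JDK's HttpURLConnection can carry out the whole Negotiate/NTLM handshake itself once a java.net.Authenticator is registered. A minimal sketch, with placeholder host and credentials:
import java.net.Authenticator;
import java.net.HttpURLConnection;
import java.net.PasswordAuthentication;
import java.net.URL;

public class NtlmClient {
    public static void main(String[] args) throws Exception {
        // Supplies the Active Directory credentials whenever the server sends a 401 challenge.
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                return new PasswordAuthentication("DOMAIN\\user", "password".toCharArray());
            }
        });
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://MyServer.net/dashboard/").openConnection();
        System.out.println(conn.getResponseCode()); // 200 once authentication succeeds
    }
}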
I have a Jersey app that has been run through our corporation's website vulnerability tool. It came back with a vulnerability that is quite odd. If you send in this header:
"*/*'"!#$^*\/:;.,?{}[]`~-_<sCrIpT>alert(81363)</sCrIpT>"
you actually get it back in the response and it is not escaped. Anyone think this is a problem?
Here is the actual response:
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Accept: */*'"!#$^*\/:;.,?{}[]`~-_<sCrIpT>alert(81363)</sCrIpT>
Pragma: no-cache
...
And one more thing: I just upgraded to Jersey 1.14 and it still does this...
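For reference, a way to reproduce the report with plain HttpURLConnection; the URL is a placeholder for wherever the Jersey app is deployed:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

public class HeaderRepro {
    public static void main(String[] args) throws Exception {
        HttpURLConnection conn = (HttpURLConnection)
                new URL("http://localhost:8080/myapp/resource").openConnection();
        // The unparseable Accept header from the vulnerability report.
        conn.setRequestProperty("Accept",
                "*/*'\"!#$^*\\/:;.,?{}[]`~-_<sCrIpT>alert(81363)</sCrIpT>");
        System.out.println(conn.getResponseCode()); // expect 400 Bad Request
        // The error body is where the header value comes back unescaped.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getErrorStream()))) {
            in.lines().forEach(System.out::println);
        }
    }
}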
You mean in the produced error message? Something like:
< HTTP/1.1 400 Bad Request
< Content-Type: text/plain
< Date: Thu, 27 Sep 2012 14:30:30 GMT
< Connection: close
< Transfer-Encoding: chunked
<
* Closing connection #0
The HTTP header field "Accept" with value "..." could not be parsed
?
If so, we can definitely do something about it; please report this as a new RFE at http://java.net/jira/browse/JERSEY. (I wasn't able to reproduce anything else related to this issue.)
Thanks!
I don't know if this is a bug or a feature of the HTTP spec, or whether I am not understanding things correctly.
I have a resource that changes at most once a week, at the week's beginning. If it didn't change, then the previous week's resource continues to be valid for the whole week.
(For all our tests we shortened the one-week period to five minutes, but I think our observations are still valid.)
First we send the resource with the header Expires: next Monday. For the whole week the browser serves it from its cache. If on Monday we have a new resource, it is retrieved with its new headers and everything is fine.
The problem occurs when the resource is not renewed. In response to the conditional GET, our app (Java + Tomcat) sends new headers with Expires: next Monday but without the body. But our frontend server (Apache) removes this header, because the spec says you should not send new headers if the resource did not change. So now, forever (until the resource changes), the browser will send a conditional GET when we would like it to keep serving straight from the cache.
Is there a spec-compliant way to update the headers without updating the body (or sending it again)?
And a subquestion: how do we make Apache pass along Tomcat's headers?
Just an Expires header is not enough. According to RFC 2616 section 13.3.4, a server needs to respond with two headers, Last-Modified and ETag, to do conditional GET right:
In other words, the preferred behavior for an HTTP/1.1 origin server is to send both a strong entity tag and a Last-Modified value.
And if the client is HTTP/1.1 compliant, it should send If-Modified-Since. Then the server is supposed to respond as follows (quoted from Roy Fielding's proposal to add conditional GET):
If resource is inaccessible (for whatever reason), then the server should return a 4XX message just like it does now.
If resource no longer exists, the server should return a 404 Not Found response (i.e. same as now).
If resource is accessible but its last modification date is earlier (less than) or equal to the date passed, the server should return a 304 Not Modified message (with no body).
If resource is accessible and its last modification date is later than the date passed, the server should return a 200 OK message (i.e. same as now) with body.
So, I guess you don't need to configure Apache and/or Tomcat the way you described. You need to make your application HTTP/1.1 compliant.
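A sketch of what compliant handling could look like in a servlet; the resource body and the way its modification time is tracked are placeholders:
import java.io.IOException;
import java.time.DayOfWeek;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class WeeklyResourceServlet extends HttpServlet {
    // Placeholder: in reality this tracks when the resource last changed.
    private volatile long lastModified = System.currentTimeMillis();

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // Re-send the caching headers on every response, including the 304.
        resp.setDateHeader("Last-Modified", lastModified);
        resp.setDateHeader("Expires", nextMonday());

        long ifModifiedSince = req.getDateHeader("If-Modified-Since"); // -1 if absent
        // Compare at second granularity: HTTP dates carry no milliseconds.
        if (ifModifiedSince != -1 && lastModified / 1000 <= ifModifiedSince / 1000) {
            resp.setStatus(HttpServletResponse.SC_NOT_MODIFIED); // 304, no body
            return;
        }
        resp.setContentType("text/plain");
        resp.getWriter().print("resource body"); // placeholder body
    }

    private static long nextMonday() {
        return ZonedDateTime.now(ZoneOffset.UTC)
                .truncatedTo(ChronoUnit.DAYS)
                .with(TemporalAdjusters.next(DayOfWeek.MONDAY))
                .toInstant().toEpochMilli();
    }
}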
Try sending a valid HTTP-Date for the Expires header?
One way to solve the problem is to use separate URIs for each week. The canonical URL redirects to the appropriate URL for the week and instructs the browser to cache the redirect for a week. Also, URLs that have a date in them will instruct the browser to cache forever.
Canonical URL : /path/to/resource
Status Code : 301
Location : /path/to/resource/12-dec or /path/to/resource/19-dec
Expires : Next Monday
Week 1 : /path/to/resource/12-dec
Status code : 200
Expires : Never
Week 2 : /path/to/resource/19-dec
Status code : 200
Expires : Never
When the cache expires on Monday, you just send a redirect response. You send either last week's URL or this week's, but you never send the entire response body.
With this approach you have eliminated conditional GETs. You have also made your resources "unmodifiable-once-published", and you get versioned resources as well.
The only caveat: redirects aren't cached by all browsers, even though the HTTP spec requires it. Notably, IE8 and below don't cache them. For details, look at the "cache browser redirects" column in Browserscope.
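A sketch of the canonical-URL handler under this scheme; the path layout and the one-week max-age (an approximation of "expires next Monday") are illustrative:
import java.io.IOException;
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class CanonicalRedirectServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // Date-stamp the target with this week's Monday, e.g. /path/to/resource/12-dec.
        LocalDate monday = LocalDate.now().with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        String stamp = monday.format(DateTimeFormatter.ofPattern("dd-MMM", Locale.ENGLISH))
                .toLowerCase(Locale.ENGLISH);
        resp.setStatus(HttpServletResponse.SC_MOVED_PERMANENTLY); // 301
        resp.setHeader("Location", "/path/to/resource/" + stamp);
        // Let the browser cache the redirect itself for up to a week.
        resp.setHeader("Cache-Control", "max-age=604800");
    }
}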
The Expires header was basically deprecated with HTTP 1.1; use Cache-Control: max-age instead.
Make sure you are including Last-Modified.
It's optional, but you may also want to specify Cache-Control: must-revalidate, to make sure intermediate proxies don't deliver potentially stale content.
You don't need to set ETag.
Example request:
GET http://localhost/images/logo.png HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://localhost/default.aspx
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Host: localhost
Connection: Keep-Alive
The response includes the requested content:
HTTP/1.1 200 OK
Cache-Control: max-age=10
Content-Type: image/png
Last-Modified: Sat, 21 Feb 2009 11:28:18 GMT
Accept-Ranges: bytes
Date: Sun, 18 Dec 2011 05:48:34 GMT
Content-Length: 2245
Requests made before the 10-second timeout are resolved from the cache, with no HTTP request. After the timeout:
GET http://localhost/images/logo.png HTTP/1.1
Accept: image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5
Referer: http://localhost/default.aspx
Accept-Language: en-US
User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
If-Modified-Since: Sat, 21 Feb 2009 11:28:18 GMT
Host: localhost
The response is just headers, without content:
HTTP/1.1 304 Not Modified
Cache-Control: max-age=10
Last-Modified: Sat, 21 Feb 2009 11:28:18 GMT
Accept-Ranges: bytes
Date: Sun, 18 Dec 2011 05:49:04 GMT
Subsequent requests are again resolved from the browser's cache until the specified cache expiration time.