We use AWS to store audio/video content for our website.
We use signed cookies with a canned policy:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/private-content-setting-signed-cookie-canned-policy.html
So we have 3 cookies set for each request to retrieve the data:
CloudFront-Policy;
CloudFront-Signature;
CloudFront-Key-Pair-Id;
They are used to access resource URLs like http://cloudfront.org_name.com/2016%2F7%2F1%2FStanding+Meditation_updated+91615.mp3
All three cookies are set anew by the (Java-based) server for each request, to the correct pre-set values.
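For reference, a minimal sketch of the kind of server-side code involved, assuming a standard servlet environment; the policy, signature and key-pair-ID values, and the cookie domain, are placeholders for what our signing code actually produces:

import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletResponse;

public class CloudFrontCookieHelper {
    // Hypothetical helper: policy, signature and keyPairId are assumed to come
    // from our existing signing code for the requested resource.
    public static void setCloudFrontCookies(HttpServletResponse response,
                                            String policy, String signature, String keyPairId) {
        String[][] cookies = {
            {"CloudFront-Policy", policy},
            {"CloudFront-Signature", signature},
            {"CloudFront-Key-Pair-Id", keyPairId}
        };
        for (String[] c : cookies) {
            Cookie cookie = new Cookie(c[0], c[1]);
            cookie.setDomain(".org_name.com"); // must cover the CloudFront host (placeholder domain)
            cookie.setPath("/");
            response.addCookie(cookie);
        }
    }
}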
It all works most of the time for most of the content, but for some resources it just fails with a 403 Forbidden error.
If I open two contents (one working, one not) in separate browser tabs, all the cookies and the rest look exactly the same, except for the resource URL.
And yet - one works, while the other does not.
What is even more confusing, sometimes the same resource requested from the same physical client machine, once in FF and another time in Chrome, works in one browser but fails in the other.
Also, sometimes clearing the user's browser cookies helps, other times it doesn't, with no discernible pattern.
It's been driving me insane as I struggle to see what's wrong.
Can anyone provide any insight as to what the reason could be and what remedies could be tried?
Okay, the answer is in my reply to Michael:
I noticed later on that the resource URLs for the working and failing content were different. Close enough not to spot the difference at first sight, but different. Everything else was the same: cookies, headers, other parameters. But I was comparing two different pieces of content. The first URL always worked, the second always failed.
Lesson learned: carefully curl the two resources and analyse the URLs to see what actually is different.
A tip: use Chrome's developer tools to derive curl commands:
Right click on the failing URL -> Copy -> Copy as cURL. Then paste it into the command line to test.
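One way to make a subtle difference visible is to decode both URLs and print them side by side. A tiny sketch; the second URL is a made-up variant purely for illustration:

import java.net.URLDecoder;

public class CompareUrls {
    public static void main(String[] args) throws Exception {
        String working = "http://cloudfront.org_name.com/2016%2F7%2F1%2FStanding+Meditation_updated+91615.mp3";
        String failing = "http://cloudfront.org_name.com/2016/7/1/Standing+Meditation_updated+91615.mp3"; // hypothetical
        // Decoding makes differences such as %2F vs. / stand out immediately
        System.out.println(URLDecoder.decode(working, "UTF-8"));
        System.out.println(URLDecoder.decode(failing, "UTF-8"));
    }
}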
BTW, we just re-uploaded the failing resource and updated the referring web page - everything works again.
What differentiates these two requests and causes them to get different results/responses from the server, although they should be the same?
A request initiated by Chrome after a simple click/navigation succeeds (the response code is 302).
I simply copied that request as cURL and imported it into Postman, and then Postman hung.
I did the same with Java's HttpURLConnection (mimicking all the request headers and cookies that Chrome sent), but it hung and waited forever. Is this simply because of server logic that doesn't accept non-browser clients?
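For context, the Java attempt looked roughly like this (a sketch; the header and cookie values are placeholders for what was copied from Chrome's network tab):

import java.net.HttpURLConnection;
import java.net.URL;

public class MimicChrome {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://www.tokopedia.com/p/handphone-tablet/handphone"); // stand-in for the copied request
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setInstanceFollowRedirects(false); // the browser got a 302 here
        conn.setRequestProperty("User-Agent", "<User-Agent copied from Chrome>");
        conn.setRequestProperty("Accept", "text/html,*/*");
        conn.setRequestProperty("Cookie", "<cookies copied from Chrome>");
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000); // without this the read can wait forever
        System.out.println(conn.getResponseCode()); // this is where it hangs for me
    }
}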
Here are the steps that I tried:
1. Visited this link: https://www.tokopedia.com/p/handphone-tablet/handphone
2. I opened the inspector and opened the Network - All tab
3. I clicked one of the products
4. I clicked the top request from the Network - All tab
5. I copied it as cURL bash
6. I imported it to Postman
7. I ran that request
8. Postman hung
Actually, the problem may go even deeper than the other answers suggest.
So neither the User-Agent request header nor telnet may solve the problem (unless you also initiate the TLS handshake with telnet manually, which is nearly impossible to do).
TLS fingerprinting
If the connection is an SSL/TLS connection, the server can see which ciphers and extensions the client offers during the handshake, and most applications have their own specific signature / cipher list.
So by the TLS handshake alone you can tell Chrome from Postman, Firefox or Java. Java usually - unless a JVM implementation REALLY wants to go off-road - has the same signature across all platforms, using the same cipher list across all implementations.
The technique is known as JA3 fingerprinting. Salesforce published an article about JA3 analysis; they describe the technique and show a list of signatures and applications so you can guesstimate which app you're talking to, without even needing to decrypt the data: https://engineering.salesforce.com/tls-fingerprinting-with-ja3-and-ja3s-247362855967
My Solution
I had the same problem: I wanted to scan the NVIDIA and AMD servers for graphics card availability. It did not work from Java, so after a lot of research (and finding the project mentioned above), I simply used Selenium to control Firefox, which got the proper server responses and achieved my goal.
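A minimal sketch of that approach (assuming geckodriver is on the PATH; the URL is a placeholder):

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class FetchWithRealBrowser {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver(); // a real Firefox, so a real browser TLS fingerprint
        try {
            driver.get("https://example.com/product-availability"); // placeholder URL
            System.out.println(driver.getPageSource());
        } finally {
            driver.quit();
        }
    }
}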
The only way to be sure that the exact same data is sent is to send it yourself manually, through something like telnet. I had a similar problem once: it turned out that the browser was sending the data in one big chunk, while my code was sending it line by line. No site should have this problem, but it's possible that it exists.
The server might be checking the User-Agent request header and blocking traffic that does not originate from a browser. Try setting the header in curl or your Java code to a value corresponding to (any) browser. I've encountered such behavior on some e-shops and commercial websites.
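For example, a minimal sketch in Java; the User-Agent string is just one typical browser value and the URL is a placeholder:

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class BrowserLikeRequest {
    public static void main(String[] args) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL("https://example.com/").openConnection(); // placeholder URL
        // Present ourselves as a browser; some servers reject unknown clients
        conn.setRequestProperty("User-Agent",
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36");
        try (InputStream in = conn.getInputStream()) {
            System.out.println("HTTP " + conn.getResponseCode());
        }
    }
}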
I tried stress testing a web site with JMeter, as it crashed after an SMS campaign. The site has since been moved to a physical server.
I tested multiple times by adding threads; it worked with a few errors above 1000 threads, and worked for 400 threads with no errors. So I tried distributed testing with 4 PCs, including my own.
Afterwards I tried again with only my PC sending requests to the site, with 400 threads (ramp-up = 1, loop = 1). But every single request gives an error. Then I tried using 1 thread. The same error was given.
I checked my network connection, and there is no problem. Then I browsed the web site "http://www.myjobs.lk/", and it works fine.
These are the values I have given in testing.
Under this condition, I cannot perform the testing because it always gives errors. How can I overcome this problem?
You're using incorrect JMeter configuration, change it as follows:
Remove http:// from "Server Name or IP" input
Put http into the "Protocol" input
It is also possible to put the full URL (e.g. http://www.myjobs.lk/) into the "Path" field, but using "http://" in "Server Name or IP" won't work.
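For example, with the site from the question, the HTTP Request fields would look roughly like this:

Server Name or IP: www.myjobs.lk
Protocol [http]:   http
Path:              /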
Also, once you have defined hostname, port, path, etc. in HTTP Request Defaults, they will be automatically applied to all HTTP Request samplers. You will be able to override an option for a particular sampler, but if you don't, the default value will be used. See Why It's SO Important To Use JMeter's HTTP Request Defaults for a more detailed explanation and some use cases.
For me it was helpful to set up a proxy server.
Looks like JMeter tries to connect to myjobs.lk, while you browse to www.myjobs.lk. Try changing it so that JMeter also connects to www.myjobs.lk.
I want to download the source of a webpage to a file (*.htm) (i.e. the entire content, with all HTML markup) from this URL:
http://isap.sejm.gov.pl/DetailsServlet?id=WDU20061831353
which works perfectly fine with the FileUtils.copyURLToFile method.
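Roughly like this; a minimal sketch using Apache Commons IO (the output file name is arbitrary):

import java.io.File;
import java.net.URL;
import org.apache.commons.io.FileUtils;

public class DownloadPage {
    public static void main(String[] args) throws Exception {
        URL source = new URL("http://isap.sejm.gov.pl/DetailsServlet?id=WDU20061831353");
        // Copies the raw response body (the page's HTML source) to a local file
        FileUtils.copyURLToFile(source, new File("details.htm"));
    }
}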
However, the said URL also has some links, for instance one which I'm very interested in:
http://isap.sejm.gov.pl/RelatedServlet?id=WDU20061831353&type=9&isNew=true
This link works perfectly fine if I open it with a regular browser, but when I try to download it in Java by means of FileUtils, I get only a page with no content and the single message "trwa ladowanie danych" (which means "loading data..."), but then nothing happens; the target page is not loaded.
Could anyone help me with this? From the URL I can see that the page uses Servlets -- is there a special way to download pages created with servlets?
This isn't a servlet issue - that just happens to be the technology used to implement the server, but generally clients don't need to care about that. I strongly suspect it's just that the server is responding with different data depending on the request headers (e.g. User-Agent). I see a very different response when I fetch it with curl compared to when I load it in Chrome, for example.
I suggest you experiment with curl, making a request which looks as close as possible to a request from a browser, and then fiddling until you can find out exactly which headers are involved. You might want to use Wireshark or Fiddler to make it easy to see the exact requests/responses involved.
Of course, even if you can fetch the original HTML correctly, there's still all the Javascript - it would be entirely feasible for the HTML to contain none of the data, but for it to include Javascript which does the actual data fetching. I don't believe that's the case for this particular page, but you may well find it happens for other pages.
Try using Selenium WebDriver to load the main page:
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.htmlunit.HtmlUnitDriver;

HtmlUnitDriver driver = new HtmlUnitDriver(true); // true = enable JavaScript
driver.manage().timeouts().implicitlyWait(30, TimeUnit.SECONDS);
driver.get(baseUrl);
and then navigate to the link
driver.findElement(By.linkText("text of the link")).click(); // org.openqa.selenium.By; linkText matches the visible link text
UPDATE: I checked the following: if I turn off the cookies in Firefox and then try to load my page:
http://isap.sejm.gov.pl/RelatedServlet?id=WDU20061831353&type=9&isNew=true
then I get the incorrect result, just like in my Java app (i.e. a page with the "loading data" message instead of the proper content).
Now, how can I manage the cookies in Java so I can download this page properly?
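If cookies really are the missing piece, one minimal sketch would be to install a default CookieManager so that all URL connections in the JVM share cookies, then fetch the main page before the related one (assuming the first request sets the session cookie that the second one needs):

import java.io.File;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URL;
import org.apache.commons.io.FileUtils;

public class DownloadWithCookies {
    public static void main(String[] args) throws Exception {
        // All subsequent URL connections in this JVM will now store and send cookies
        CookieHandler.setDefault(new CookieManager(null, CookiePolicy.ACCEPT_ALL));

        // First request: lets the server set its session cookie
        FileUtils.copyURLToFile(
                new URL("http://isap.sejm.gov.pl/DetailsServlet?id=WDU20061831353"),
                new File("details.htm"));

        // Second request: should now carry the cookie from the first one
        FileUtils.copyURLToFile(
                new URL("http://isap.sejm.gov.pl/RelatedServlet?id=WDU20061831353&type=9&isNew=true"),
                new File("related.htm"));
    }
}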
I have some client-server interaction, and GWT is updating the webpage and making it look dynamic. This works in Chrome and Firefox; however, IE (8, 9, 10) is caching the responses. I am able to tell that it's caching because I used HttpWatch to view the exchange.
http://i.imgur.com/qi6mP4n.png
As you can see, these responses are being cached. How can I stop IE from caching so aggressively, so that it behaves like Chrome and Firefox?
The browser is allowed to cache a) any GET request with b) the same url unless c) the server specifies otherwise. With those three criteria, you have three options:
Stop using GET, and use a POST instead. This may not make sense for your use case or your server, but without any further context in your question, it is hard to be more specific
Change the url each time the resource is requested. This 'cache-busting' strategy is often used to let the same file be loaded and not need to worry if it changed on the server or not, but to always get a fresh copy
Specify headers from the server whether or not the file should be cached, and if so, for how long.
If you are dealing with the <module>.nocache.js and <hash>.cache.html files, these typically should get a header set on them, usually via a filter (as is mentioned in the how to clear cache in gwt? link in the comments). The *.cache.* files should be kept around, because their name will change automatically (see bullet #2 above), while the *.nocache.* should be reloaded every time, since their contents might have changed.
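A minimal sketch of such a filter, assuming it is mapped to the GWT module's URLs in web.xml; the exact header values are typical choices, not the only valid ones:

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class GwtCacheFilter implements Filter {
    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        String uri = ((HttpServletRequest) req).getRequestURI();
        HttpServletResponse response = (HttpServletResponse) res;
        if (uri.contains(".nocache.")) {
            // Never cache the bootstrap script: the browser must re-fetch it every time
            response.setHeader("Cache-Control", "no-cache, no-store, must-revalidate");
            response.setHeader("Pragma", "no-cache");
            response.setDateHeader("Expires", 0);
        } else if (uri.contains(".cache.")) {
            // Content-hashed files never change, so they can be cached for a long time
            response.setHeader("Cache-Control", "public, max-age=31536000");
        }
        chain.doFilter(req, res);
    }

    @Override public void init(FilterConfig config) { }
    @Override public void destroy() { }
}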
I have a URL shortener app (similar to tinyurl.com, bit.ly etc) which redirects to file:// URLs as well.
Internally, this is a servlet-based web app, and all I do is retrieve the targetURL and do a response.sendRedirect(targetURL) from the server side.
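Roughly like this; a minimal sketch, with the lookup of the target URL left as a hypothetical helper:

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class RedirectServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // lookupTargetUrl is a hypothetical helper that resolves the short code
        String targetUrl = lookupTargetUrl(request.getPathInfo());
        response.sendRedirect(targetUrl); // may be an http://, https:// or file:// URL
    }

    private String lookupTargetUrl(String shortCode) {
        // placeholder: the real implementation reads from the shortener's data store
        return "file://foo.txt";
    }
}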
This works fine for file:// URLs too. However, recently, this has stopped working in Chrome. When I try to redirect to file://foo.txt (via response.sendRedirect("file://foo.txt")), things simply fail (the Chrome debugger says "Cancelled").
Things work fine in FF and IE however. Any clues ?
I'd say this is a bad idea, and I'm glad at least Chrome denies this (although I would suspect that other browsers would as well). It would be a pretty big security hole if you could instruct someone else's browser to open an arbitrary file.
Second, why would you want to do this? It would require that the user actually has the same file, at the same location, on their computer. Seems like a pretty narrow use case. I tested your use case with bit.ly, and if you try to add a file:/// URL there, it's regarded as an invalid URL and cannot be shortened.
Edit: There's a very good answer covering the same topic here. It references this useful resource about security restrictions with redirection.
You also specify that this is for an internal app. If you're attempting to do some sort of document sharing, I'd say you should look into dedicated systems for this. Another option is to extend your service with a "Dropbox light", where your users can upload the file in question to a storage service, and you can generate a shortened URL that serves the file from your storage via regular http/https.