Anonymize Amazon public URL, decode using a script on Nginx - java

I have been wondering if it's possible to anonymize a public URL. When a user makes a request with this anonymized public URL, Nginx should decode it, then fetch and serve the real URL.
Example
Public URL http://amazon.server.com/location/file.html
Anonymized URL https://amazon.server.com/09872340-932872389-390643289/983724.html
Nginx decodes 09872340-932872389-390643289/983724.html to location/file.html
Added image below for further clarification. Nginx has a reverse logic to decode, whereas Remote Server has the logic to Anonymize URL.
Question
All I need to know is how Nginx would decode the anonymized URL. Nginx receives the anonymized URL request; there has to be a way to decode it.

This is an answer to the updated question:
Question: All I need to know is how Nginx would decode the anonymized URL. Nginx receives the anonymized URL request; there has to be a way to decode it.
Nginx would make a request to a script, e.g., either through proxy_pass or fastcgi_pass et al.
The script could decode the URL and provide the actual URL through a Location HTTP Response Header with a 302 Found HTTP Status.
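Nginx would then have the decoded URL stored in the $upstream_http_location variable. It could subsequently be used in another proxy_pass et al within a named location @named, to which you could redirect the processing of the original request from the user through error_page 302 = @named (with proxy_intercept_errors or fastcgi_intercept_errors enabled, so that nginx handles the upstream 302 itself instead of passing it to the client).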
In all, each user request would be processed twice within nginx, but it'll all be transparent to the user -- they simply receive the resource through the original URL, with all redirects being done internally within nginx.
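As a rough illustration of the decoding script described above, here is a minimal sketch in Python using only the standard library. The lookup table, the names decode_path, DecodeHandler and run, and port 8081 are all assumptions made for this example; the single mapping entry is the example pair from the question. A real script might decrypt the token or query a database instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical lookup table (assumption); the entry is the example from the question.
ANONYMIZED_TO_REAL = {
    "/09872340-932872389-390643289/983724.html": "/location/file.html",
}


def decode_path(anonymized_path):
    """Return the real path for an anonymized path, or None if it is unknown."""
    return ANONYMIZED_TO_REAL.get(anonymized_path)


class DecodeHandler(BaseHTTPRequestHandler):
    """Answers each request with a 302 whose Location header is the decoded path."""

    def do_GET(self):
        real_path = decode_path(self.path)
        if real_path is None:
            self.send_error(404, "Unknown anonymized URL")
            return
        self.send_response(302)
        self.send_header("Location", real_path)
        self.end_headers()


def run(host="127.0.0.1", port=8081):
    """Start the decoder; nginx would proxy_pass the first request to this address."""
    HTTPServer((host, port), DecodeHandler).serve_forever()
```

Calling run() starts the listener. Nginx would pass the original request to it, read the Location value from $upstream_http_location, and fetch the real resource in the second, internal step.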

Define "anonymize" for a URL. You can use any of the same methods as URL shorteners such as http://bitly.com. But that is not truly anonymous, since there is a definite mapping between the shortened URL and the target public URL. If you make this per user, there is still a mapping, but it is user-based.
It looks like what you are suggesting is a variation on the above scheme where, instead of sending the user to the target URL via a redirect, you want your server to actually fetch the content and return it to the user. You need to be aware of the linked content in the public URL, such as style sheets and images, and adjust the links accordingly. Many of the standard proxies have this kind of functionality built in. Also take a look at
https://github.com/jenssegers/php-proxy
http://search.cpan.org/~book/HTTP-Proxy-0.304/lib/HTTP/Proxy.pm.
If you are planning to build your own these can serve as a base.

I think what you want to do here is somewhat similar to another question I've answered in the past, where for each request by the client, you effectively want to make two requests to two different upstreams under the hood (first one to an upstream capable of decoding the URL, second one to actually fetch said decoded URL), but, of course, only return one result.
https://serverfault.com/questions/202011/nginx-and-2-upstreams/485044#485044
As mentioned on serverfault, you could use error_page to process another request, after the first one is complete. You could then use $upstream_http_ to make the subsequent request based on the original one, for example, using $upstream_http_location.
You might also want to look into the X-Accel-Redirect header, which is introduced in this context in the documentation for proxy_ignore_headers.

Related

Open an authenticated image served by django from java using Apache http client

I am serving an authenticated image using Django. The image is behind a view which requires login, and in the end I have to check more things than just the authentication.
For a reason too complicated to explain here, I cannot use the real URL to the image, so I am serving it with a custom URL leading to the authenticated view.
The image must be reachable from Java, to save or display. For this part I use Apache HttpClient.
With Apache HttpClient I tried a lot of things (every example and combination of examples...) but can't seem to get it working.
For other parts of the webapp I use django-rest-framework, which I successfully connected to from Java (and C and curl).
I use the login_required decorator in Django, which makes any attempt to reach the URL redirect to a login page first.
Trying the link and the login in a web viewer, I see the 200 code (OK) in the server console.
Trying the link with HttpClient, I get a 302 Found in the console (looking up 302, it means a redirect).
This is what I do in Django.
In urls.py:
    url(r'^photolink/(?P<filename>.*)$', 'myapp.views.photolink', name='photolink'),
In views.py:
    import logging
    import mimetypes
    import os

    from django.conf import settings
    from django.contrib.auth.decorators import login_required
    from django.http import HttpResponse

    logger = logging.getLogger(__name__)

    @login_required
    def photolink(request, filename):
        # From the filename I get the image object; not interesting for this question.
        # There is a good reason for this complicated way to reach a photo, but that is not the point here.
        # (some_image_object and mac come from code omitted here.)
        filename_photo = some_image_object.url
        base_filename = os.path.basename(filename_photo)
        # Then this is the real path and filename of the photo:
        path_filename = os.path.join(settings.MEDIA_ROOT, 'photos', mac, base_filename)
        mime = mimetypes.guess_type(filename_photo)[0]
        logger.debug("mimetype response = %s" % mime)
        # Read the file in binary mode and close it promptly.
        with open(path_filename, 'rb') as image_file:
            image_data = image_file.read()
        return HttpResponse(image_data, content_type=mime)
By the way, if I get this working I will need another decorator to pass some other tests, but I first need to get this part working.
For now it's not a secured URL, just plain HTTP.
In Java I tried a lot of things using Apache's HttpClient 4.2.1: proxy, cookies, authentication negotiation, following redirects, and so on.
Am I overlooking some basic thing here?
It seems the login of the website client is not suitable for automated login, so the problem could be in my Django code or in the Java code.
In the end the problem was the use of HTTP authorization, which the login_required decorator does not use by default.
Adding a custom decorator that checks for HTTP authorization did the trick; see this example: http://djangosnippets.org/snippets/243/
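The core of such a decorator is reading the credentials from the Authorization header, which Django exposes as request.META['HTTP_AUTHORIZATION']. The following is a minimal standard-library sketch of that parsing step only. It assumes HTTP Basic authentication, parse_basic_auth is a hypothetical helper name, and it is not the code from the linked snippet.

```python
import base64
import binascii


def parse_basic_auth(header_value):
    """Return (username, password) from a Basic Authorization header, or None."""
    if not header_value:
        return None
    parts = header_value.split(None, 1)
    if len(parts) != 2 or parts[0].lower() != "basic":
        return None
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        decoded = base64.b64decode(parts[1].strip(), validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None
    username, separator, password = decoded.partition(":")
    if not separator:
        return None
    return username, password
```

A decorator built on this would presumably pass the pair to django.contrib.auth.authenticate and return a 401 response when the header is missing or the credentials are rejected, instead of redirecting to the login page.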

serving GWT permutations from appengine blob store - XSRF not found

In trying to serve GWT permutations out of the blob store in order to escape the AppEngine hard limit of 150 MB for static files, I've succeeded in doing so for HTML and image files (JPEG, PNG, etc.) and other .rpc calls, but am hung up on XSRF calls.
In the server logs, I see:
The serialization policy file '/theapplication/CCA65B31464BDB27545C23C142FEEEF8.gwt.rpc' was not found;
My upload log shows it was uploaded /CCA65B31464BDB27545C23C142FEEEF8.gwt.rpc : HTTP/1.1 200 OK
The request url shows http://14.applicationXYZ.appspot.com/xsrf
the RequestPayload shows: http://14.applicationXYZ.appspot.com/theapplication/|CCA65B31464BDB27545C23C142FEEEF8|com.google.gwt.user.client.rpc.XsrfTokenService|getNewXsrfToken|1|2|3|4|0|
Other RPC calls are resolving (via a server filter that looks for /theapplication and maps the requests to a blob to serve), as in the following case where an RPC call is made without an XSRF request (as the user is not logged in yet):
req url -- http://14.applicationXYZ.appspot.com/someRPCCall
RequestPayload -- http://14.applicationXYZ.appspot.com/theapplication/|62D7E6737056C685E10947B640409549|com.abc.client.rpc.Service|doWork|java.lang.String/2004016611|java.lang.Boolean/476441737|wwwerr|1|2|3|4|3|5|5|6|7|7|6|0|
So, I have three questions:
1) Why is the XSRF call failing to return the appropriate blob, i.e. why doesn't the xsrf call get handled by the filter the way other URL calls to /theapplication/* do?
2) What can I do about it?
3) Also, I tried setting the content type to "text/x-gwt-rpc; charset=UTF-8" and also leaving it unspecified when I uploaded the blob. Does anyone know what the content type should be for *.gwt.rpc in case I do get the XSRF call working? Could having the wrong content type be causing the trouble?
Note: applicationXYZ is not the real name, so the links won't work.
OK, /xsrf is mapped to a servlet as well, so if the filter returns a blob without passing the request on down the filter chain, it seems it won't reach the servlet.
Anyway, it's easy enough just to upload the few .rpc files as normal and not serve them as blobs.

Options for passing data across HTTP redirects

I am working on a Web application and need to pass data across HTTP redirects. For example:
http://foo.com/form.html
POSTs to
http://foo.com/form/submit.html
If there is an issue with the data, the Response is redirected back to
http://foo.com/form.html?error=Some+error+message
and the query param "error"'s value is displayed on the page.
Is there any other reliable way to pass data across redirects (e.g. HTTP headers)?
Passing the data as query params works but isn't ideal because:
it's cleartext (and in the query string, so SSL can't be relied on to encrypt it), so I wouldn't want to pass sensitive data
URIs are limited in length by the browser (albeit the length is generally fairly long).
IMPORTANT: This platform is state-less and distributed across many app servers, so I can't track the data in a server-side session object.
From the client-server interaction point of view, this is a server internal dispatch issue.
Browsers are not meant to re-post the entity of the initial request automatically according to the HTTP specification: "The action required MAY be carried out by the user agent without interaction with the user if and only if the method used in the second request is GET or HEAD."
If it's not already the case, make form.html dynamic rather than a static HTML file. Send the POST request to itself and pre-fill the values in case of error. Alternatively, you could make submit.html use the same template as form.html if there is a problem.
it's cleartext (and in the query string, so SSL can't be relied on to encrypt it), so I wouldn't want to pass sensitive data
I'm not sure what the issue is here. You're submitting everything over plain HTTP anyway. Cookie, query parameters and request entity will all be visible. Using HTTPS would actually protect all this, although query parameters can still be an issue with browser history and server logs (that's not part of the connection, which is what TLS protects).
I think using cookies would be a reasonable solution, depending on the amount of data, since you can't track it on the server side (by using a session, for example, which would be much simpler).
You can store the error message in a database on the server and refer to it by ID:
http://foo.com/form.html?error_id=42
If the error texts are fixed, you don't even need to use a database.
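The fixed-text variant of the error_id idea can be sketched as follows. This is only an illustration in Python: the table contents and the names ERROR_MESSAGES, redirect_url and message_from_url are assumptions, and every app server would need to ship the same table.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical fixed table of error texts (assumption), identical on every app server.
ERROR_MESSAGES = {
    42: "Some error message",
}


def redirect_url(base_url, error_id):
    """Build the redirect target that carries only the error ID."""
    return base_url + "?" + urlencode({"error_id": error_id})


def message_from_url(url):
    """Return the error text for the error_id in the URL, or None if absent or unknown."""
    query = parse_qs(urlparse(url).query)
    try:
        error_id = int(query["error_id"][0])
    except (KeyError, ValueError):
        return None
    return ERROR_MESSAGES.get(error_id)
```

Because only an opaque number travels in the query string, the message text itself never appears in browser history or server logs, and no per-user server-side state is needed.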
Also, you can use Web Storage. Instead of redirecting with a "Location" header, you can display an output page with this JavaScript:
    var error_message = "Something is wrong";
    if (typeof(Storage) !== "undefined") {
        localStorage.error_message = error_message;
    } else {
        // fallback for IE < 8
        alert(error_message);
    }
    location.href = "new url";
After the redirection, you can read localStorage.error_message using JavaScript and display the message.

Design solution for URL encoding

I am planning a URL rewriter/encoder (maybe rewriter is a better term). The main purpose is to hide the exact URL from the client, since if he is smart enough, he can figure out how to mess up the application.
The URL encoder would be an injective function f(x) = y. The decoder would be the inverse function of f, say g such that g(y) = x. This way I can encode and decode my URLs.
A URL like:
http://www.myapp.com/servlet/myapp/template/MyScreen.vm/action/MyAction
would be encoded to something like:
http://www.myapp.com/uyatsd6787asv6dyuasgbdxuasydgb876876v
It does not matter what is in the encoded URL as far as it is not understandable.
The problem is that I do not know how to manipulate the URL that the browser displays. I am using JBoss as a servlet container and Turbine servlet as the web application framework.
I would need a module that receives the encoded URL, decodes it, passes it to Turbine, then it modifies the response's URL to show the encoded URL again.
Previous attempts to solve the problem:
I have created a servlet filter, but I cannot access the URL since the filter receives a ServletRequest that is a JBoss implementation. As far as I have read, it seems that a servlet filter is not a good choice for manipulating the URL.
Maybe you could do something like write a servlet that accepts the initial request, decodes the URL, and then internally forwards to your existing servlet.
For example, have a servlet that will accept:
www.myapp.com/enc/uyatsd6787asv6dyuasgbdxuasydgb876876v
This servlet could be set to handle requests that begin with /enc/ or some other marker to indicate that the URL needs to go to the decoder servlet. It would decode the URL to:
/servlet/myapp/template/MyScreen.vm/action/MyAction
and then internally forward to this URL on your existing servlet using something like:
getServletContext().getRequestDispatcher(decoded_url).forward(req, res);
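The encode/decode pair itself (the f and g from the question) is independent of the servlet plumbing. A minimal sketch, written in Python for brevity and using URL-safe Base64 purely as a stand-in: Base64 is injective and invertible, but anyone can reverse it, so a real deployment would more likely use a keyed cipher or a server-side lookup table. The names encode_path and decode_token are assumptions.

```python
import base64


def encode_path(path):
    """f(x): map a path to an opaque-looking, URL-safe token (padding stripped)."""
    token = base64.urlsafe_b64encode(path.encode("utf-8")).decode("ascii")
    return token.rstrip("=")


def decode_token(token):
    """g(y): restore the padding and recover the original path."""
    padded = token + "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8")
```

The decoder servlet would apply the equivalent of decode_token to the part of the request path after /enc/ and forward to the result, while links written into the response would go through the equivalent of encode_path.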

AS2: Does xml.sendAndLoad use POST or GET?

All,
I'm trying to find out, unambiguously, what method (GET or POST) Flash/AS2 uses with XML.sendAndLoad.
Here's what the help/docs (http://livedocs.adobe.com/flash/9.0/main/wwhelp/wwhimpl/common/html/wwhelp.htm?context=LiveDocs_Parts&file=00002340.html) say about the function
Encodes the specified XML object into an XML document, sends it to the specified URL using the POST method, downloads the server's response, and loads it into the resultXMLobject specified in the parameters.
However, I'm using this method to send XML data to a Java Servlet developed and maintained by another team of developers. And they're seeing log entries that look like this:
GET /portal/delegate/[someService]?svc=setPayCheckInfo&XMLStr=[an encoded version of the XML I send]
After a Google search to figure out why the POST shows up as a GET in their log, I found this Adobe technote (http://kb2.adobe.com/cps/159/tn_15908.html). Here's what it says:
When loadVariables or getURL actions are used to send data to Java servlets it can appear that the data is being sent using a GET request, when the POST method was specified in the Flash movie.
This happens because Flash sends the data in a GET/POST hybrid format. If the data were being sent using a GET request, the variables would appear in a query string appended to the end of the URL. Flash uses a GET server request, but the Name/Value pairs containing the variables are sent in a second transmission using POST.
Although this causes the servlet to trigger the doGet() method, the variables are still available in the server request.
I don't really understand that. What is a "GET/POST hybrid format"?
Why does the method Flash uses (POST or GET) depend on whether the data is sent to a Java servlet or elsewhere (e.g., a PHP page?)
Can anyone make sense of this? Many thanks in advance!
Cheers,
Matt
Have you tried doing something like this:
    var sendVar = new LoadVars();
    var xml = new XML("<r>test</r>");
    sendVar.xml = xml;
    sendVar.svc = "setPayCheckInfo";
    var receiveXML = new XML();
    function onLoad(success) {
        if (success) {
            trace("receive:" + receiveXML);
        } else {
            trace('error');
        }
    }
    receiveXML.onLoad = onLoad;
    sendVar.sendAndLoad("http://mywebserver", receiveXML, "POST");
The hybrid format is just a term Macromedia invented to paint over its misuse of HTTP.
HTTP is very vague on what you can do with GET and POST. But the convention is that no message body is used in GET. Adobe violates this convention by sending parameters in the message body.
Flash sends the same request regardless of the server. You have a problem with servlets because most implementations (like Tomcat) ignore the message body for GET. PHP doesn't care about the verb and processes the message body for GET too.
