I serve up files from a Jetty webserver which presently get downloaded via regular HTTP GET.
However, I am interested in a P2P model where users can download files via webseeding. How would this be implemented in the context of a Jetty server with libtorrent?
Second, I don't want to "seed" ALL files on the Jetty webserver forever; instead I only want to be able to seed files "on demand".
For example, rather than blindly seeding a torrent, I would like the file to be available on demand IF a request comes in for it (via GET or webseeding or whatever), at which point it can be "seeded".
I want to seed or upload on demand because I have a multitude of files and do not know if I will be able to seed tens of thousands of files concurrently. Btw, would anyone know what the upper limit is for the number of files that can be seeded concurrently?
The relevant documentation about the libtorrent part is here: http://www.rasterbar.com/products/libtorrent/manual.html#http-seeding and the specs are http://bittorrent.org/beps/bep_0019.html and http://bittorrent.org/beps/bep_0017.html (both being supported by libtorrent, as "url seeds" and "http seeds").
IIRC, BEP19 (webseeds, or urlseeds) is rather straightforward from the server POV, and you don't need to do anything special there - you just serve the files as you would for a normal HTTP request for that file (so, the second part of your question doesn't quite make sense here).
With BEP17, you instead use a single HTTP endpoint and pass it GET parameters to specify what the client wants (which, for example, allows for better throttling control and range selection), e.g.: http://example.com/seed/?info_hash=X&piece=Y&ranges=Z.
This second approach is more flexible if you intend to have more (programmatic) control over what is downloaded, but it obviously requires writing a lot more code to handle the requests.
Again, from the server POV, this is not that different from regular HTTP transactions, and there is nothing special about "seeding" here. You just serve files (each with its own url, either directly, or via a handler).
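To illustrate how plain this is on the Jetty side, here is a minimal embedded-Jetty sketch (Jetty 9.x assumed; the port, path spec, and resource base are placeholders). The only thing worth checking is that Range requests are enabled, since BEP19 clients fetch pieces via HTTP ranges:

    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.servlet.DefaultServlet;
    import org.eclipse.jetty.servlet.ServletContextHandler;
    import org.eclipse.jetty.servlet.ServletHolder;

    public class SeedServer {
        public static void main(String[] args) throws Exception {
            Server server = new Server(8080);
            ServletContextHandler context =
                    new ServletContextHandler(ServletContextHandler.NO_SESSIONS);
            context.setContextPath("/");

            // DefaultServlet already speaks HTTP/1.1 Range requests, which is
            // all a BEP19 (webseeding) client needs to fetch pieces.
            ServletHolder files = new ServletHolder("seeds", DefaultServlet.class);
            files.setInitParameter("resourceBase", "/var/seeds"); // placeholder directory
            files.setInitParameter("pathInfoOnly", "true");       // map /seeds/x -> resourceBase/x
            files.setInitParameter("acceptRanges", "true");
            files.setInitParameter("dirAllowed", "false");
            context.addServlet(files, "/seeds/*");

            server.setHandler(context);
            server.start();
            server.join();
        }
    }

For the "on demand" part, you could equally register your own servlet at that path and decide per request whether to serve the file - to the torrent client it is just an HTTP download either way.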
As for the metadata part, with BEP19 you add a "url-list" extension (with the full URL of your file: http://example.com/seeds/SOMEFILE.txt - watch out for multi-file torrents), whereas BEP17 uses the key "httpseeds" (with your endpoint, e.g.: http://example.com/seed/).
Depending on whether your Jetty server also handles the metadata generation or not, you might prefer BEP19 over BEP17, so that your URLs are more predictable and the metadata generation simpler...
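If the metadata generation does end up on the Jetty side (say, you create the .torrent lazily the first time a file is requested), a single-file BEP19 torrent is simple enough to bencode by hand. Below is a minimal, hand-rolled sketch, assuming a 256 KiB piece size and a placeholder web seed URL; note it writes no "announce" key (add a tracker or rely on DHT for real use), and per BEP19 "url-list" may hold a single string or a list of strings:

    import java.io.*;
    import java.security.MessageDigest;

    // Minimal sketch: bencodes a single-file torrent with a BEP19 "url-list" key.
    public class WebSeedTorrent {
        static final int PIECE_LEN = 256 * 1024;

        public static void main(String[] args) throws Exception {
            File file = new File("SOMEFILE.txt");                      // placeholder
            String webSeed = "http://example.com/seeds/SOMEFILE.txt";  // placeholder
            try (OutputStream out = new BufferedOutputStream(
                    new FileOutputStream(file.getName() + ".torrent"))) {
                out.write('d');                                 // top-level dictionary
                str(out, "info");                               // keys must be sorted
                out.write('d');
                str(out, "length");       num(out, file.length());
                str(out, "name");         str(out, file.getName());
                str(out, "piece length"); num(out, PIECE_LEN);
                str(out, "pieces");       bytes(out, pieceHashes(file));
                out.write('e');
                str(out, "url-list");     str(out, webSeed);    // the BEP19 extension
                out.write('e');
            }
        }

        // Concatenated SHA-1 hashes of each fixed-size piece of the file.
        static byte[] pieceHashes(File f) throws Exception {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            ByteArrayOutputStream hashes = new ByteArrayOutputStream();
            try (InputStream in = new BufferedInputStream(new FileInputStream(f))) {
                byte[] piece = new byte[PIECE_LEN];
                int filled = 0, n;
                while ((n = in.read(piece, filled, PIECE_LEN - filled)) != -1) {
                    filled += n;
                    if (filled == PIECE_LEN) {
                        hashes.write(sha1.digest(piece));
                        filled = 0;
                    }
                }
                if (filled > 0) {                    // trailing partial piece
                    sha1.update(piece, 0, filled);
                    hashes.write(sha1.digest());
                }
            }
            return hashes.toByteArray();
        }

        static void str(OutputStream o, String s) throws IOException {
            bytes(o, s.getBytes("UTF-8"));
        }

        static void bytes(OutputStream o, byte[] b) throws IOException {
            o.write(Integer.toString(b.length).getBytes("US-ASCII"));
            o.write(':');
            o.write(b);
        }

        static void num(OutputStream o, long n) throws IOException {
            o.write(("i" + n + "e").getBytes("US-ASCII"));
        }
    }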
Hope that helps.
Related
What I want to do is build a web application (a proxy) that a user uses to request the web page he wants; my application forwards the request to the main server, modifies the HTML code, and sends the modified version back to the client.
The question now is: how do I keep my application between the client and the main server (for example, when the user clicks any link inside the modified page, makes an AJAX request, submits a form, and so on)?
In other words, how do I guarantee that every request from the client (after the first URL request) is sent to my proxy, and that every response comes to my proxy first?
The question is: why do you need a proxy? Why do you want to build it - why not use an already existing one like HAProxy?
EDIT: sorry, I didn't read your whole post correctly. You can start with:
http://www.jtmelton.com/2007/11/27/a-simple-multi-threaded-java-http-proxy-server/
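To give a feel for the shape of such a proxy, here is a bare-bones threaded forwarding sketch (the port and upstream host are placeholders; no keep-alive, no HTTPS CONNECT, and no HTML rewriting - the rewriting is where the real work is):

    import java.io.*;
    import java.net.ServerSocket;
    import java.net.Socket;

    // Bare-bones forwarding proxy: one thread pair per client connection,
    // blindly piping bytes in both directions. A real proxy would parse the
    // request line to pick the upstream host and rewrite the response body.
    public class TinyProxy {
        public static void main(String[] args) throws IOException {
            try (ServerSocket listener = new ServerSocket(8888)) {
                while (true) {
                    Socket client = listener.accept();
                    new Thread(() -> handle(client)).start();
                }
            }
        }

        static void handle(Socket client) {
            try (Socket c = client;
                 Socket origin = new Socket("example.com", 80)) { // placeholder upstream
                Thread up = pipe(c.getInputStream(), origin.getOutputStream());
                Thread down = pipe(origin.getInputStream(), c.getOutputStream());
                up.join();
                down.join();
            } catch (Exception ignored) {
            }
        }

        // Copies bytes from in to out on its own thread until EOF.
        static Thread pipe(InputStream in, OutputStream out) {
            Thread t = new Thread(() -> {
                byte[] buf = new byte[8192];
                int n;
                try {
                    while ((n = in.read(buf)) != -1) {
                        out.write(buf, 0, n);
                        out.flush();
                    }
                } catch (IOException ignored) {
                }
            });
            t.start();
            return t;
        }
    }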
If the user is willing to, or can be forced1 to, configure his clients (e.g. web browser) to use a web proxy, then your problem is already solved. Another way to do this (assuming that the user is cooperative) is to get them to install a trusted browser plugin that dynamically routes selected URLs through your proxy. But you can't do this using an untrusted webapp: the browser sandbox won't (shouldn't) let you.
Doing it without the user's knowledge and consent requires some kind of interference at the network level. For example, a "smart" switch could recognize TCP/IP packets on port 80 and deliberately route them to your proxy instead of the IP address that the client's browser specified. This kind of thing is known as "deep packet inspection". It would be very difficult to implement yourself, and it requires significant compute power in your network switch if you are going to achieve high network rates through the switch.
The second problem is that making meaningful on-the-fly modifications to arbitrary HTML + Javascript responses is a really difficult problem.
The final problem is that this is only going to work with HTTP. HTTPS protects against "man in the middle" attacks ... such as this ... that monitor or interfere with the requests and responses. The best you could hope to do would be to capture the encrypted traffic between the client and the server.
1 - The normal way to force a user to do this is to implement a firewall that blocks all outgoing HTTP connections apart from those made via your proxy.
UPDATE
The problem now is: what should I change in the HTML code to force the client to request everything through my app? For example, a link's href attribute might become www.aaaa.com?url=www.google.com, but what should I do for AJAX and forms?
Like I said, it is a difficult task. You have to deal with the following problems:
Finding and updating absolute URLs in the HTML. (Not hard - see the sketch after this list.)
Finding and dealing with the base URL (if any). (Not hard)
Dealing with the URLs that you don't want to change; e.g. links to CSS, javascript (maybe), etc. (Harder ...)
Dealing with HTML that is syntactically invalid ... but not to the extent that the browser can't cope. (Hard)
Dealing with cross-site issues. (Uncertain ...)
Dealing with URLs in requests being made by javascript embedded in / called from the page. This is extremely difficult, given the myriad ways that javascript could assemble the URL.
Dealing with HTTPS. (Impossible to do securely; i.e. it requires the user to trust the proxy with private info such as passwords, credit card numbers, etc. that are normally sent securely.)
and so on.
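For the easier items at the top of the list, here is a minimal sketch using jsoup (an assumed third-party dependency; proxyBase is a placeholder for something like http://yourproxy/fetch):

    import java.io.UnsupportedEncodingException;
    import java.net.URLEncoder;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    // Minimal sketch: rewrites link, form, and image targets so they route
    // back through the proxy. Covers absolute and base-relative URLs only.
    public class HtmlRewriter {

        public static String rewrite(String html, String pageUrl, String proxyBase)
                throws UnsupportedEncodingException {
            Document doc = Jsoup.parse(html, pageUrl); // pageUrl resolves relative URLs
            rewriteAttr(doc, "a[href]", "href", proxyBase);
            rewriteAttr(doc, "form[action]", "action", proxyBase);
            rewriteAttr(doc, "img[src]", "src", proxyBase);
            return doc.outerHtml();
        }

        private static void rewriteAttr(Document doc, String selector, String attr,
                                        String proxyBase) throws UnsupportedEncodingException {
            for (Element el : doc.select(selector)) {
                String absolute = el.attr("abs:" + attr); // resolved against the base URL
                if (!absolute.isEmpty()) {
                    el.attr(attr, proxyBase + "?url=" + URLEncoder.encode(absolute, "UTF-8"));
                }
            }
        }
    }

This handles URLs that appear in the markup; URLs assembled by javascript at runtime are exactly the part it cannot touch.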
I have some client-server interaction, and GWT is updating the webpage to make it look dynamic. This works on Chrome and Firefox; however, IE (8, 9, 10) is caching the responses. I can tell it's caching because I used HttpWatch to view the exchange.
http://i.imgur.com/qi6mP4n.png
As you can see, these responses are being cached. How can I stop IE from caching so aggressively and have it behave like Chrome and Firefox?
The browser is allowed to cache a) any GET request with b) the same url unless c) the server specifies otherwise. With those three criteria, you have three options:
Stop using GET and use a POST instead. This may not make sense for your use case or your server, but without any further context in your question, it is hard to be more specific.
Change the url each time the resource is requested. This "cache-busting" strategy is often used so that the same file can be requested without worrying about whether it changed on the server - the browser always fetches a fresh copy.
Specify headers from the server whether or not the file should be cached, and if so, for how long.
If you are dealing with the <module>.nocache.js and <hash>.cache.html files, these typically should get a header set on them, usually via a filter (as is mentioned in the "how to clear cache in gwt?" link in the comments). The *.cache.* files should be kept around, because their names change automatically (see bullet #2 above), while the *.nocache.* files should be reloaded every time, since their contents might have changed.
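A common way to set those headers is a servlet filter keyed off GWT's naming convention - a minimal sketch below; the same no-cache branch can also be applied to your GWT-RPC endpoint to stop IE caching the ajax responses:

    import java.io.IOException;
    import javax.servlet.*;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Sets cache headers based on GWT's file-naming convention:
    // *.nocache.* must never be cached, *.cache.* may be cached "forever".
    public class GwtCacheFilter implements Filter {
        public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
                throws IOException, ServletException {
            String uri = ((HttpServletRequest) req).getRequestURI();
            HttpServletResponse response = (HttpServletResponse) resp;
            if (uri.contains(".nocache.")) {
                response.setHeader("Cache-Control", "no-cache, no-store, must-revalidate");
                response.setHeader("Pragma", "no-cache"); // for older IE / HTTP 1.0 proxies
                response.setDateHeader("Expires", 0);
            } else if (uri.contains(".cache.")) {
                long oneYear = 365L * 24 * 60 * 60 * 1000;
                response.setDateHeader("Expires", System.currentTimeMillis() + oneYear);
            }
            chain.doFilter(req, resp);
        }
        public void init(FilterConfig config) { }
        public void destroy() { }
    }

Map the filter to /* (or just to your GWT module's path) in web.xml.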
I have several thousand files; some of them contain an HTTP request and some contain the corresponding HTTP response. I want to create a script which spawns hundreds or thousands of threads, where each thread takes an HTTP request, sends it to the server, and compares the server's response with the corresponding response file.
I'm not sure if I have to create a custom sampler with configuration, or if I can build one from existing JMeter functionality?
Also, I was not able to find reliable documentation on how to extend JMeter with new samplers. This one seems to be outdated: http://www.jajakarta.org/jmeter/1.7/en/extending/JMeter%20Extension%20Scenario.html
Maybe somebody could advise where I can find an up-to-date guide covering the creation of samplers? For example, how do I create a sampler which takes a directory as an argument and iterates over every file in that directory, then makes the request, compares the response, and tells JMeter whether that request was correctly processed by the web server, along with timings?
I think you should just use a regular JMeter scenario. Implement the list of files with a CSV Data Set and spawn some threads over this set with a Thread Group. Each thread gets the name of a request/response file pair from the CSV Data Set, uses an HTTP/TCP sampler to send the request, and a Response Assertion to verify the response.
An MD5Hex Assertion can be an even faster way to check the response.
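If you do still want your own sampler (for example, to iterate over the directory yourself), the usual extension point is AbstractJavaSamplerClient, which shows up in the GUI as the "Java Request" sampler. A minimal sketch; the transport in send() is deliberately left as a placeholder:

    import org.apache.jmeter.config.Arguments;
    import org.apache.jmeter.protocol.java.sampler.AbstractJavaSamplerClient;
    import org.apache.jmeter.protocol.java.sampler.JavaSamplerContext;
    import org.apache.jmeter.samplers.SampleResult;

    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    // Minimal custom sampler: reads a request file and its expected-response
    // file, replays the request, and marks the sample pass/fail.
    public class FileReplaySampler extends AbstractJavaSamplerClient {

        @Override
        public Arguments getDefaultParameters() {
            Arguments args = new Arguments();
            args.addArgument("requestFile", "");
            args.addArgument("expectedFile", "");
            return args;
        }

        @Override
        public SampleResult runTest(JavaSamplerContext ctx) {
            SampleResult result = new SampleResult();
            result.sampleStart();
            try {
                String request = new String(Files.readAllBytes(
                        Paths.get(ctx.getParameter("requestFile"))), StandardCharsets.UTF_8);
                String expected = new String(Files.readAllBytes(
                        Paths.get(ctx.getParameter("expectedFile"))), StandardCharsets.UTF_8);
                String actual = send(request);
                result.setResponseData(actual, "UTF-8");
                result.setSuccessful(expected.equals(actual));
            } catch (Exception e) {
                result.setSuccessful(false);
                result.setResponseMessage(e.toString());
            } finally {
                result.sampleEnd(); // timing is recorded between start and end
            }
            return result;
        }

        private String send(String rawRequest) {
            // Placeholder: open a socket / HttpURLConnection and return the raw response.
            throw new UnsupportedOperationException("wire up your transport here");
        }
    }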
My program needs to download object definitions (basically xml files, maybe binary files) on demand via the net. The program will request objects from my server during runtime. The only thing the program has to send the server is a string that identifies the object it needs (e.g. RedCubeIn3DSpace23). So a basic Key, Value system. My app also has to have some basic authentication mechanism to make sure only legitimate programs access my server’s info. Maybe send the license number and a password.
What is the best way to go about implementing this? I have 0 web knowledge so I'm not sure exactly what technologies I need. I have implemented socket programs in college so maybe that is what I need? Are there frameworks for this type of thing? There could be thousands of users/clients simultaneously; maybe more but I don’t know.
One super important requirement is that I need security to be flawless on the server side. That is, I can't have some hacker replacing object definitions with malicious ones that clients then download. That would be disastrous.
My first thoughts:
-Set up an FTP server and have each XML file named by its key value. The program logs in with its product_id and fixed password and just downloads. If I use a good FTP server, that is pretty impervious to a hacker modifying definitions. The drawback is that it's neither expandable nor flexible.
-RESTful-type system. I just learned about this when searching Stack Overflow. I can make categories of objects using URLs, but how do I do authentication and other actions? It might be hard to program, but is this a better approach? Is there a prebuilt library for this?
-Sockets using Java/C#. Java/C# would protect me from overflow attacks, and then it is just a matter of spawning a thread for each connection and setting up a simple messaging protocol and file transfers.
-SOAP. Just learned about it while searching. Don't know much.
-EC2. I think it (and other cloud services?) would add a database layer over this.
That's what I can come up with, what do you think given my requirements? I just need a little guidance.
HTTP seems a better fit than ftp, since you only want to download stuff. That is, you would set up a web server (e.g. Apache), configure it for whatever authentication scheme you need, and have it serve that content.
SOAP is clearly overkill for this, and using raw sockets would be reinventing the wheel (i.e. a web server).
I'd do security at the socket level, using HTTPS. That way, the client will verify the identity of the server when establishing the connection, and nobody can intercept the password sent to the server. Again, a decent web server will support this out of the box; you just need to configure it properly.
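For the client side, here is a minimal sketch with plain HttpURLConnection over HTTPS and Basic auth (the URL layout and credentials are placeholders, not a finished design):

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;
    import java.util.Base64;

    // Fetches one object definition by key over HTTPS with Basic auth.
    // Hypothetical endpoint: https://example.com/objects/<key>
    public class ObjectFetcher {
        public static void fetch(String key, String user, String password) throws Exception {
            URL url = new URL("https://example.com/objects/" + key);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            String credentials = Base64.getEncoder()
                    .encodeToString((user + ":" + password).getBytes("UTF-8"));
            conn.setRequestProperty("Authorization", "Basic " + credentials);
            if (conn.getResponseCode() != 200) {
                throw new RuntimeException("Server returned " + conn.getResponseCode());
            }
            try (InputStream in = conn.getInputStream()) {
                Files.copy(in, Paths.get(key + ".xml"), StandardCopyOption.REPLACE_EXISTING);
            }
        }

        public static void main(String[] args) throws Exception {
            fetch("RedCubeIn3DSpace23", "product_id", "secret"); // placeholder credentials
        }
    }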
This problem relates to the Restlet framework and Java.
When a client wants to discover the resources available on a server, it must send an HTTP request with OPTIONS as the request type. This is fine, I guess, for non-human-readable clients - i.e. code rather than a browser.
The problem I see here is that browsers (human-readable), using GET, will NOT be able to quickly discover the resources available to them and find some extra help documentation etc., because they do not use OPTIONS as a request type.
Is there a way to make a browser send an OPTIONS/GET request so the server can fire back formatted XML to the client (as this is what happens in Restlet - i.e. the server response is to send all information back as XML), and display this in the browser?
Or have I got my thinking all wrong - i.e. is the point of OPTIONS that it is meant to be used inside a client's code and not to be read via a browser?
Use the TunnelService (which by default is already enabled) and simply add the method=OPTIONS query parameter to your URL.
(The Restlet FAQ Q19 is a similar question.)
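In code, that looks roughly like this (Restlet 2.x assumed; method tunneling is on by default, so the setters below only make the behaviour explicit):

    import org.restlet.Application;
    import org.restlet.Restlet;
    import org.restlet.routing.Router;

    public class MyApplication extends Application {
        public MyApplication() {
            // Method tunneling lets a browser issue e.g. GET /resource?method=OPTIONS
            // and have Restlet treat it as an OPTIONS request.
            getTunnelService().setEnabled(true);
            getTunnelService().setMethodTunnel(true);
        }

        @Override
        public Restlet createInboundRoot() {
            return new Router(getContext()); // attach your resources here
        }
    }

With that in place, a browser request to http://host/app/myResource?method=OPTIONS is handled as if it were a real OPTIONS request.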
I think OPTIONS is not designed to be 'user-visible'.
How would you dispatch an OPTIONS request from the browser? (Note that the form element only allows GET and POST.)
You could send it using XMLHttpRequest and then get back XML in your JavaScript callback and render it appropriately. But I'm not convinced this is something that your user should really know about!