Using Tomcat or JBoss, how do I transfer and save dynamic image content from an image repository on one server to a number of other servers (machines) on the same network without writing a Client/Server application?
The web site I am building contains a large number of images that will be stored and shared from a single machine. All of the web app servers need to be able to access these files.
Kermit Love, here are a few suggestions based on your requirements. Note that none of these are Java-based or specific to Tomcat / JBoss.
Option 1: Using a NAS or a shared directory
Using dedicated hardware (a NAS), a file server (Samba), or a simple shared directory would allow all your machines to access the content over the network (a minimal servlet sketch follows the pros/cons below).
Pros:
This solution can scale
Setup is easy
Cons:
The network overhead may slow down your overall solution depending on your needs (access frequency, image sizes, ...)
No high-availability (fault tolerance)
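Whichever variant you pick, the application code on each node can stay identical: mount the share at the same path everywhere and stream files from it. Here is a minimal servlet sketch of that idea; the /mnt/images mount point and the name request parameter are assumptions for illustration, not anything your setup requires.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

// Each app server streams images from the shared mount, so the code does not
// care whether the directory is local, a NAS export, or a Samba share.
public class ImageServlet extends HttpServlet {
    // Hypothetical mount point of the share on every node.
    private static final Path IMAGE_ROOT = Paths.get("/mnt/images");

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        String name = req.getParameter("name");   // e.g. GET /images?name=logo.png
        if (name == null) {
            resp.sendError(HttpServletResponse.SC_BAD_REQUEST);
            return;
        }
        Path image = IMAGE_ROOT.resolve(name).normalize();
        // Reject path traversal and missing files.
        if (!image.startsWith(IMAGE_ROOT) || !Files.isRegularFile(image)) {
            resp.sendError(HttpServletResponse.SC_NOT_FOUND);
            return;
        }
        resp.setContentType(Files.probeContentType(image));
        resp.setContentLengthLong(Files.size(image));
        try (OutputStream out = resp.getOutputStream()) {
            Files.copy(image, out);
        }
    }
}
```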
Option 2: Using an upfront server dedicated to load balancing / reverse proxying / serving files
You can use a simple Nginx / Apache server to deliver "static" content while routing traffic to your application servers.
Pros:
Efficient way to serve images to clients.
You separate the concerns (business logic vs serving files).
Cons:
No high-availability (fault tolerance)
Option 3: Using rsync to synchronise file systems
You can define a cron-based rule to run an rsync command every N seconds/minutes to ensure all your machines have the data available (a sketch of the same idea driven from the JVM follows the pros/cons below).
Pros:
Easy to setup
Free (no extra hardware needed, for now)
(slightly) better performance in the long term than option 1 (no repeated network overhead).
Cons:
Yet another process to maintain
Doesn't scale horizontally
Files won't be immediately accessible
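If you would rather drive the synchronisation from inside the JVM than from cron, here is a rough sketch of the same idea using ProcessBuilder; the source host, paths, and the 5-minute interval are placeholders for your own layout.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Periodically pulls the image directory from a (hypothetical) master host,
// mirroring what a cron entry running rsync would do.
public class RsyncScheduler {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                // -a preserves attributes, --delete removes files gone from the source
                new ProcessBuilder("rsync", "-a", "--delete",
                        "imagehost:/srv/images/", "/srv/images/")
                        .inheritIO()
                        .start()
                        .waitFor();
            } catch (IOException | InterruptedException e) {
                e.printStackTrace();   // a real setup would log and alert
            }
        }, 0, 5, TimeUnit.MINUTES);    // every 5 minutes; tune to your freshness needs
    }
}
```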
The theme of my project is to implement a distributed server that provides several files for several clients to download. The server hosts several files, and we want it to implement some of the best algorithms to let clients download data from it quickly.
My idea of implementation of project:
Just as a client typically downloads files using a download manager, there must be some server-side manager/code/algorithm that uploads/seeds the file quickly so the client can download it. The client should not have to do anything except select the file to be downloaded!
How should I write the code for such a server on the back end, analogous to the multi-threading-based download managers clients use on the front end?
How should the server seed/make the file available to the client if the client only sends the path as a String to the server in Java?
Or, if I am missing something or my idea is totally wrong, please enlighten me with an alternative process/algorithm that I should implement on the server side. Please remember that the whole purpose of this question is the back-end server seeding algorithm, or equivalent algorithms/methods.
I assume this server of yours has a good internet connection with broad upstream bandwidth. If that is the case, then the limiting factor when only a few clients are downloading a few files is the bandwidth of those clients. You will at most get as fast as your clients' downstream bandwidth, so simply taking an off-the-shelf HTTP server library to serve the downloads should be sufficient.
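For that simple case, even the HTTP server bundled with the JDK is enough. A minimal sketch, assuming a hypothetical /srv/files directory and port 8080:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SimpleDownloadServer {
    private static final Path ROOT = Paths.get("/srv/files");  // placeholder root

    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/files/", exchange -> {
            // Map /files/<name> onto the root directory, rejecting path traversal.
            String name = exchange.getRequestURI().getPath().substring("/files/".length());
            Path file = ROOT.resolve(name).normalize();
            if (!file.startsWith(ROOT) || !Files.isRegularFile(file)) {
                exchange.sendResponseHeaders(404, -1);   // -1 = no response body
                exchange.close();
                return;
            }
            exchange.getResponseHeaders().set("Content-Type", "application/octet-stream");
            exchange.sendResponseHeaders(200, Files.size(file));
            try (OutputStream out = exchange.getResponseBody()) {
                Files.copy(file, out);
            }
        });
        server.start();
    }
}
```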
Where your backend implementation really matters, and is able to improve download performance, is when many users are connecting to your server and downloading many files. First off, there are the following points to consider:
TCP has a start-up phase (slow start). When you first open a connection, the download rate slowly increases until it hits the maximum. To minimize this time, when downloading multiple files the connection opened for one file download should be reused for the next file.
Downloading many files at once (on the client side) is not reasonable when bandwidth is the limiting factor, because the client has to start many TCP connections and the data will either be fragmented when written to disk, or (when allocated beforehand) the disk will be busy jumping between sectors.
Your server should generally use a non-blocking IO library (e.g. java.nio) and refrain from creating a thread per incoming connection, since this leads to thrashing, which drastically decreases your server's performance (see the sketch below).
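To illustrate that last point, here is a stripped-down, single-threaded java.nio sketch. It serves one hypothetical file to every client that connects, with no thread per connection; FileChannel.transferTo hands the copying to the OS (zero-copy where supported).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.Iterator;

public class NioFileServer {
    // Per-connection transfer state, attached to the selection key.
    static final class Transfer {
        final FileChannel file;
        long position;
        Transfer(FileChannel file) { this.file = file; }
    }

    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9000));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                    // block until something is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    FileChannel file = FileChannel.open(
                            Paths.get("/srv/files/demo.bin"),  // placeholder file
                            StandardOpenOption.READ);
                    client.register(selector, SelectionKey.OP_WRITE, new Transfer(file));
                } else if (key.isWritable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    Transfer t = (Transfer) key.attachment();
                    // Send the next chunk; transferTo is zero-copy where the OS supports it.
                    t.position += t.file.transferTo(t.position, 64 * 1024, client);
                    if (t.position >= t.file.size()) {
                        t.file.close();
                        client.close();
                    }
                }
            }
        }
    }
}
```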
If you have a really large number of clients simultaneously downloading from your server, the limit you hit will probably be either:
The upstream limit of your provider
The read speed of your hard drive (SSDs reach ~500 MB/s, as far as I'm informed)
Your server can try to hold the most commonly requested files in memory and serve the content from there (DDR3 RAM reaches speeds of 17 GB/s). I doubt you have few enough files on your server that you could cache them all in your server's RAM.
So the main engineering task lies in the clever selection of which content should be cached and which not. This could be done on a priority basis, by assigning higher priorities to certain files, or by a metric which encodes the probability that a given file will be downloaded in the next few minutes. Or simply the files which are being downloaded by the most clients at that point in time.
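For the "downloaded by the most clients right now" variant, a minimal least-recently-used cache can be built on LinkedHashMap in access order. A sketch; the 128-file capacity is an arbitrary placeholder:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Keeps the byte content of the most recently requested files in memory.
// LinkedHashMap with accessOrder=true gives LRU eviction almost for free.
public class FileCache {
    private static final int MAX_FILES = 128;   // placeholder capacity

    private final Map<Path, byte[]> cache =
            new LinkedHashMap<Path, byte[]>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Path, byte[]> eldest) {
                    return size() > MAX_FILES;  // drop the least recently used file
                }
            };

    public synchronized byte[] get(Path file) throws IOException {
        byte[] data = cache.get(file);
        if (data == null) {                     // miss: load from disk once
            data = Files.readAllBytes(file);
            cache.put(file, data);
        }
        return data;
    }
}
```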
With such considerations you are able to push the limits of your download server up to a certain point, beyond which the only improvement is achieved by distributing or replicating your files onto many servers.
If you are going in a direction where serving millions of clients simultaneously must be possible, you should consider buying such a service from a CDN. CDNs are specialized in fast delivery and have many upstream servers in most ASes, so that every client can download its files from a nearby regional CDN server.
I know I haven't given a complete algorithm, and I didn't intend to answer this question exhaustively. I just wanted to give you some important guidelines and thoughts on the topic. I hope you can at least use some of these thoughts for your project.
Similar to How to have 2 JVMs talk to one another, but with the added requirement that there be per-user authentication between the 2 JVMs. I have one Windows machine that multiple users will utilize, each using 2 java applications (essentially a client process and server process). The communication should be isolated, such that no user can see or access the data moving between another user's client/server.
I'm looking at named pipes and Netty at the moment, but I'm not sure how well each supports the above requirement, or which is more reliable / easier to set up.
We have a web application (with Spring, Hibernate, and MySQL as the database) in which multiple users can store large videos (pre-recorded, or recorded from the application itself) on the server at the same time.
In that scenario, the server load will certainly be high. We are assuming there would be 500-2000 users in the application.
So what strategy should I use to reduce the load on the server and make the response time faster?
1) Storing the videos on our own server (with large disk space), and using ActiveMQ/RabbitMQ queues for file upload and download.
2) Storing the videos on some third-party service (like YouTube, Vimeo, etc.) that uploads all the videos to one central account. I recently checked this with YouTube and Vimeo, but they require the end user's login credentials for each upload, and I don't want end users of my application to have to provide their credentials before each upload.
Is there any other way to reduce the workload and improve the response time for simultaneous uploads to the server? If so, please guide me.
Thanks In Advance,
Arun
Multiple servers can help.
On a single server:
If you use a single-core processor, only ONE client will be served at a time.
If you use a multi-core processor and open a new thread per connection, only #ofCores clients will be served at once, and even that is optimistic, because your local memory might run out before your OS saves the data to your local hard disk (which has one bus). So serving 500-2000 clients leads you to a multi-server solution.
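One common middle ground is to accept connections on a single thread and hand each upload to a bounded pool, so that 500-2000 clients queue up instead of exhausting memory. A raw-socket sketch of that idea; the port, pool size, and target directory are assumptions for illustration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BoundedUploadServer {
    public static void main(String[] args) throws IOException {
        // Bounded pool: excess clients wait in the queue instead of spawning threads.
        ExecutorService pool = Executors.newFixedThreadPool(
                Runtime.getRuntime().availableProcessors() * 2);
        try (ServerSocket server = new ServerSocket(8443)) {
            while (true) {
                Socket client = server.accept();
                pool.submit(() -> {
                    try (InputStream in = client.getInputStream()) {
                        // Stream the body straight to disk; never buffer a whole video in memory.
                        Files.copy(in, Paths.get("/srv/videos/upload-" + System.nanoTime() + ".bin"));
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                });
            }
        }
    }
}
```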
We currently have a centralised web app and database (running on glassfish and oracle) which is accessed from multiple stations distributed about the country.
At the stations there is data entered into and read from the system (through the browser).
When the (external) connection between a station and the centralized web app goes down, we would like the stations to continue to function: store and present data, then, when the connection returns, push the data back into the central server while maintaining database integrity.
Given that we would be willing to change our app server or database if it was worth it, how is this best handled? Is there any out-of-the-box solution for this?
Install the servers at the individual locations, replicate what you want to share across them "routinely", and leave all of the other centralized, but non-vital tasks (like, say, reporting) on the central system.
There is not "out of the box" solution. You system is centralized for whatever reason it's centralized. You're asking for it to be decentralized. By doing so you need to reconsider why it's centralized in the first place, and what dependencies there are because of that centralization (such as each site having instant access to data at all of the other sites).
Address those issues of what you can do without, for how long, and how to share it, and then you can set up autonomous sites. The magnitude and complexity of this process is dependent upon you application and the services it supplies to the remote users.
If you can tolerate losing the current sessions, I would point you towards a distributed database (replication); Oracle probably supports it. In each office you would have a Glassfish server.
But it is going to cost a lot:
Licences
Hardware (servers)
Properly securing the server
(Lots of) tuning/rewriting to avoid new bottlenecks
Maybe it would be easier / cheaper if you chose to just use redundant internet access for all of your offices.
If you are willing to go cutting edge, then look into HTML 5 with Local Storage. Note that the local storage specification in HTML 5 is still in transition. The second link I included has a good fallback option for when HTML 5 local storage is unavailable. With the fallback option of Store.js, you won't even need to require your clients to use a modern browser, though it definitely helps.
Another option, if you are open to moving in that direction, is to use Adobe Flex 3 for your UI, talking through LiveCycle to your application hosted on Glassfish. There will be more moving parts and a steeper learning curve though.
We are trying to improve application performance. We are using Struts 2, JSP, and web services, retrieving data from a service and displaying the response in a JSP page using Ajax.
This is taking approximately 15 seconds, and my client is asking us to reduce this time, so we are implementing parallel downloads across hostnames.
How can we load the images and js files from different sub host names?
Please suggest
How can we load the images and js files from different sub host names?
You use subdomains pointing to different servers.
You can, for example, configure your DNS so that images.example.org points to Amazon S3 while js.example.org points to a dedicated server in Timbuktu, while all the other resources are downloaded from your "main" server(s) (whatever that is).
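To keep each asset on a stable subdomain (so browser caches stay effective), you can shard deterministically by hashing the asset path. A small helper sketch; the subdomain names are placeholders for hosts you would set up in DNS as described above:

```java
// Maps each asset path to one of a fixed set of subdomains, so browsers
// parallelize downloads while every asset keeps a stable, cacheable URL.
public final class AssetUrlShards {
    private static final String[] HOSTS = {     // placeholder subdomains: point
            "http://images1.example.org",       // each at a different server
            "http://images2.example.org",       // (or S3, a CDN, ...) via DNS
            "http://js.example.org"
    };

    private AssetUrlShards() {}

    public static String shardedUrl(String assetPath) {
        // floorMod keeps the index non-negative even for negative hash codes.
        int shard = Math.floorMod(assetPath.hashCode(), HOSTS.length);
        return HOSTS[shard] + assetPath;
    }
}
```

In a JSP you would then emit, for example, <img src="<%= AssetUrlShards.shardedUrl("/images/logo.png") %>" />.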