Server loading static resources too slowly - what server optimizations can I make?
Images and CSS files (relatively small ones at that) are loading far too slowly, taking over 1 second each. What server-side optimizations can I make to reduce those load times, other than increasing server processing power or network speed?
The server is WebSphere.
There are plenty of possibilities (sorted by importance):
Set proper Expires and Last-Modified headers for all static resources. This can dramatically reduce the number of requests for static resources, and thus the server load. The fastest request is the one that is never made.
Serve static resources from a separate cookie-less (sub-)domain.
Use CSS sprites: combine frequently used graphics like logos and icons into one single large image.
Combine all your CSS in a single or just a few files. This reduces overall request count and increases frontend performance, too.
Optimize your image sizes losslessly with tools like PngOut.
Pre-gzip your CSS (and JS) files and serve them directly from memory. Do not read them from disk and compress them on the fly.
Use a library like jawr if you do not want to do all of this yourself. jawr can handle many of these things for you without negatively affecting your development workflow.
Let an Apache web server serve this static content for you.
Use something like mod_proxy that relies on your caching headers to serve the content for you. Apache is faster at serving static resources and, more importantly, it can run on another system in front of your WebSphere server.
Use a CDN for serving your static content.
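As a rough illustration of the first item in the list above, the header values could be computed along these lines. This is only a sketch: the class name StaticCacheHeaders, the method forResource and the 7-day lifetime are made up for the example, and the Cache-Control header is an addition that the list does not mention.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: computes caching headers for one static resource. */
final class StaticCacheHeaders {

    // HTTP dates are always written in GMT, in a fixed English format.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
            .withZone(ZoneOffset.UTC);

    static Map<String, String> forResource(Instant now, Instant lastModified, Duration maxAge) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Expires", HTTP_DATE.format(now.plus(maxAge)));
        headers.put("Last-Modified", HTTP_DATE.format(lastModified));
        // Assumption: not part of the original advice, but commonly sent alongside Expires.
        headers.put("Cache-Control", "public, max-age=" + maxAge.getSeconds());
        return headers;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        forResource(now, now.minus(Duration.ofDays(30)), Duration.ofDays(7))
                .forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

In a servlet filter, each entry would then be passed to response.setHeader(name, value) for requests that match your static resource paths.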
Is it possible to wrap these file resources in a .jar file, then use the Java Zip and/or Java Jar APIs to read them?
If you employ a gzip filter to compress the output or static resources, make sure to exclude images. Formats such as JPEG, PNG and GIF are already compressed, so gzipping them on the server costs CPU time and delays the response for little or no reduction in size.
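A minimal sketch of combining both gzip suggestions (compress text resources once and keep the result in memory, leave images alone) might look like the following. The class PreGzippedCache and its methods are hypothetical, and the list of compressible content types is an assumption that you would adapt to your site.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.zip.GZIPOutputStream;

/** Hypothetical in-memory cache of pre-compressed static resources. */
final class PreGzippedCache {

    private final Map<String, byte[]> compressed = new ConcurrentHashMap<>();

    /** Assumed rule of thumb: compress text formats, skip images and other binary formats. */
    static boolean isCompressible(String contentType) {
        return contentType.startsWith("text/")
                || contentType.equals("application/javascript")
                || contentType.equals("application/json");
    }

    /** Returns the bytes to send: gzipped (and cached) for text, untouched for everything else. */
    byte[] bodyFor(String path, String contentType, byte[] raw) {
        if (!isCompressible(contentType)) {
            return raw;
        }
        // Compression happens once per path; later calls reuse the cached array.
        return compressed.computeIfAbsent(path, p -> gzip(raw));
    }

    static byte[] gzip(byte[] raw) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buffer)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

When the gzipped bytes are sent, the response also needs a Content-Encoding: gzip header, and only clients that announce gzip support in Accept-Encoding should receive them.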
You may want to read "Using IBM HTTP Server diagnostic capabilities with WebSphere"
and "WebSphere tuning for the impatient: How to get 80% of the performance improvement with 20% of the effort".
Make sure keep-alive is on and functioning. It reduces the overall network overhead required. Please refer to this.
Also, make sure you have enough memory allocated to the VM running the server. Logging GC statistics to track memory usage and collections is a good idea, e.g. add these to the Java VM options:
-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails
I am running a Java web app.
A user uploads a file (max 1 MB) and I would like to store that file until the user completes an entire process (which consists of multiple requests).
Is it ok to store the file as a byte array in the session until the user completes the entire process? Or is this expensive in terms of resources used?
The reason I am doing this is because I ultimately store the file on an external server (eg aws s3) but I only want to send it to that server if the whole process is completed.
Another option would be to just write the file to a temporary file on my server. However, this means I would need to remove the file in case the user exits the website. But it seems excessive for me to add code to the SessionDestroyed method in my SessionListener which removes the file if it’s just for this one particular case (ie: sessions are created throughout my entire application where I don’t need to check for temp files).
Thanks.
Maybe Yes, maybe No
Certainly it is reasonable to store such data in memory in a session if that fits your deployment constraints.
Remember that each user has their own session. So if all of your users have such a file in their session, then you must multiply to calculate the approximate impact on memory usage.
If you exceed the amount of memory available at runtime, there will be consequences. Your Servlet container may serialize less-used sessions to storage, which is a problem if you’ve not programmed all of your objects to support serialization. The JVM and OS may use a swap file to move contents out of real memory as part of the virtual memory system. That swapping may impact or even cripple performance.
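To illustrate the serialization point: if the container may write sessions to storage, whatever you put in the session should be serializable. A sketch of such a session attribute, with the class name PendingUpload and its fields invented for this example, could be:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical session attribute holding an uploaded file until the process completes. */
final class PendingUpload implements Serializable {

    private static final long serialVersionUID = 1L;

    private final String fileName;
    private final byte[] content; // byte[] is itself serializable

    PendingUpload(String fileName, byte[] content) {
        this.fileName = fileName;
        this.content = content.clone();
    }

    String fileName() {
        return fileName;
    }

    byte[] content() {
        return content.clone();
    }

    /** Simulates what a container does when it passivates and restores a session. */
    static PendingUpload roundTrip(PendingUpload upload) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(upload);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PendingUpload) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The roundTrip method is only there to check that the object survives serialization; in the web app you would simply store the instance with session.setAttribute.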
You must consider your runtime deployment constraints, which you did not disclose. Are you running on a Raspberry Pi or inexpensive little cloud server with little memory available? Or will you run on an enterprise-class server with half a terabyte of RAM? Do you have 3 users, 300, or 30,000? You need to crunch the numbers and determine your needs, and maybe do some runtime profiling to see actual usage.
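Crunching the numbers can start with something as simple as the sketch below. It assumes every concurrent session holds one file at the 1 MB maximum from the question and ignores everything else stored in the session, so treat the result as a lower bound. The class and method names are made up.

```java
/** Hypothetical back-of-the-envelope estimate of session memory used by uploads. */
final class SessionMemoryEstimate {

    /** Total bytes if every concurrent session holds one upload of the given size. */
    static long totalBytes(long concurrentSessions, long bytesPerSession) {
        // multiplyExact fails loudly instead of silently overflowing.
        return Math.multiplyExact(concurrentSessions, bytesPerSession);
    }

    public static void main(String[] args) {
        long oneMiB = 1024L * 1024L;
        for (long users : new long[] {3, 300, 30_000}) {
            double gib = totalBytes(users, oneMiB) / (1024.0 * 1024.0 * 1024.0);
            System.out.printf("%,d sessions x 1 MiB = %.2f GiB%n", users, gib);
        }
    }
}
```

With 30,000 concurrent sessions that is roughly 29.3 GiB for the uploads alone, which is why the user count matters so much more than the 1 MB per file.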
For example… I write web apps using the Vaadin Framework, a sophisticated package for creating desktop-style apps within a web browser. Being Servlet-based, Vaadin maintains a complete representation of each user’s entire work data on the server-side in the Servlet session. Multiplied by the number of users, and depending on the complexity of the app, this may require much memory. So I need to account for this and run my server on sufficient hardware with 64-bit Java tuned to run with a large amount of memory. Or take other approaches such as load-balancing across multiple servers with sticky sessions.
Fortunately, RAM is quite cheap nowadays. And 64-bit hardware with large physical support for RAM modules, 64-bit operating systems, and 64-bit JVM implementations ( Azul, others ) are all readily available.
What is the right way to do this? I have an nginx server (listening on port 80) which proxies to a Tomcat server (for example on port 8080). I need to serve static images in my Spring app. The options I have found are:
1) map images on the Tomcat server (aliases) or via Context docBase
2) map static content on the nginx server
3) create another subdomain, for example images.mysite.com, and work with that.
And which would be better?
There is no universal right way.
If you have a low traffic site: Use what you can set up quickest. Don't worry, if you are running into performance problems, they won't be due to this decision but due to other aspects of your solution.
If you have a high traffic site: Start with the easiest setup (same as before). Then measure where your performance problems are. Again, most likely they won't be due to the delivery of static content, but whatever your biggest performance problem is: Fix it, rinse, repeat. If static content delivery accounts for a performance improvement of 0.5%, while another factor accounts for 20%, guess where you should invest your time (hint: it's not static content delivery).
In this regard I'm totally with Klaus Groenbaek's comment: Building a complex system that's harder to maintain without having some justification (measurements) showing the advantage of the complexity is preposterous.
Unless you identify an actual performance bottleneck in your own system, optimize for maintainability: build the simplest possible system.
Performance:
Nginx is a great web server and is currently one of the best at serving static content. You can refer to benchmarks available online, or benchmark it yourself.
Subdomain/Separate domain for static content:
By using a subdomain or a separate domain for static content, you eliminate cookies on static content requests, reduce the HTTP request/response size and get better performance.
You will also increase the number of parallel downloads that the browser can perform. This will reduce your page load time.
This will increase your costs if you have SSL enabled, since you need a certificate for your subdomain or separate domain too.
This is somewhat related to my previous question, Solaris: Mounting a file system on an application's handlers, except that this question has a different purpose and is simpler: there is no open/close/lock, just a fixed-length block of bytes with read/write operations.
Is there any way I can create a virtual slice, something like a RAM disk or an SVM slice, but with the reads and writes going through my app?
I am planning to use ZFS to take multiple of these virtual slices/disks and make them into one larger one for distributed backup storage with snapshots. I really like the compression and stacking that ZFS offers. If necessary I can guarantee that there is only one instance of ZFS accessing these virtual disks at a time (to prevent cache conflicts and such). If the one instance goes down, we can make sure it won't start back up and then we can start another instance of that ZFS.
I am planning to have those disks in chunks of about 4 GB or so. Then I can move each chunk around and decide where to store it (mirrored multiple times, of course), and have ZFS access the chunks and put them together into larger chunks for actual use. ZFS would also permit adding more of these small chunks, if necessary, to increase the size of the larger chunk.
I am aware there would be extra latency / network traffic if we used my own app in Java, but this is just for backup storage. The production storage is entirely different configuration that does not relate.
Edit: We have a system that uses all the space available and basically when there is not enough space it will remove old snapshots and increase the gaps between old snapshots. The purpose of my proposal is to allow the unused space from production equipment to be put to use at no extra cost. At different times different units of our production equipment will have free space. Also the system I am describing should eliminate any single point of failure when attempting to access data. I am hoping to not have to buy two large units and keep them synchronized. I would prefer just to have two access points and then we can mix large/small units in any way we want and move data around seamlessly.
This is a cross post because this is more software related than sysadmin related. The original question is here: https://serverfault.com/questions/212072. It may be a good idea for the original to be closed.
One way would be to write a Solaris device driver, specifically a block device driver that emulates a real disk but communicates with your application instead.
Start by reading the Device Driver Tutorial, then have a look at the OpenSolaris source code for real driver code.
Alternatively, you might investigate modifying Solaris iSCSI target to be the interface with your application. Again, looking at OpenSolaris COMSTAR will be a good start.
It seems that any fixed-length file on any file system will do as a block device for use with ZFS. I am not sure how reboots work, but I am sure we can write some boot-up commands to work that out.
Edit: The fixed length file would be on a network file system such as NFS.
I have a Tomcat instance which is exhibiting the following behaviour:
Accept a single http incoming request.
Issue one request to a backend server and get back about 400 KB of XML.
Pass through this XML and transform it into about 400 KB of JSON.
Return the JSON response.
The problem is that in the course of handling the 400 KB request my webapp generates about 100 MB of garbage, which fills up the Eden space and triggers a young generation collection.
I have tried to use the built-in Java hprof functionality to do allocation site profiling, but Tomcat didn't seem to start up properly with that in place. It is possible that I was just a bit impatient, as I imagine memory allocation profiling has a high overhead and Tomcat startup might therefore take a long time.
What are the best tools to use to do java memory profiling of very young objects/garbage? I can't use heap dumps because the objects I'm interested in are garbage.
As to the actual problem: XML parsing can be a real memory hog when using a DOM-based parser. Consider using a SAX or binary XML based parser (VTD-XML is a Java API based on that).
Actually, if the XML->JSON mapping is purely 1:1, then you can also consider just reading the XML and writing the JSON on the fly, piece by piece, using a small stack.
Back to the question: I suggest using VisualVM for this. You can find here a blog article on how to get it to work with Tomcat.
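A rough sketch of that idea, using the StAX pull parser that ships with the JDK instead of SAX, is shown below. It is written under strong assumptions: attributes are ignored, repeated sibling elements are not turned into JSON arrays, mixed content is dropped, and only quotes and backslashes are escaped. The class StreamingXmlToJson is made up for this example; a real mapping would need more care.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

/** Hypothetical converter: writes JSON while reading XML events, without building a DOM. */
final class StreamingXmlToJson {

    /** State kept on the stack for each element that is currently open. */
    private static final class Frame {
        boolean hasChildren;                              // set once a child element is seen
        final StringBuilder text = new StringBuilder();   // text content of a leaf element
    }

    static String convert(String xml) {
        StringBuilder out = new StringBuilder("{");
        Deque<Frame> stack = new ArrayDeque<>();
        Frame root = new Frame();                         // stands for the outer JSON object
        stack.push(root);
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT: {
                        Frame parent = stack.peek();
                        if (parent.hasChildren) {
                            out.append(',');              // separate from the previous sibling
                        } else if (parent != root) {
                            out.append('{');              // first child: parent becomes an object
                        }
                        parent.hasChildren = true;
                        out.append('"').append(escape(reader.getLocalName())).append("\":");
                        stack.push(new Frame());
                        break;
                    }
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        stack.peek().text.append(reader.getText());
                        break;
                    case XMLStreamConstants.END_ELEMENT: {
                        Frame done = stack.pop();
                        if (done.hasChildren) {
                            out.append('}');
                        } else {
                            out.append('"').append(escape(done.text.toString().trim())).append('"');
                        }
                        break;
                    }
                    default:
                        break;                            // comments, processing instructions, etc.
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
        return out.append('}').toString();
    }

    /** Minimal escaping only; control characters are not handled in this sketch. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

For a 400 KB document you would write to the response's Writer instead of a StringBuilder, so that neither the XML tree nor the full JSON string has to exist in memory at once.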
You can use the profiler in jvisualvm in the JDK to do memory profiling.
Also have a look at Templates to cache the XSLT transformer.
http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/transform/Templates.html
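If an XSLT stylesheet is involved, caching could look roughly like this: the stylesheet is compiled into a Templates object once, and each request only creates a cheap Transformer from it. Templates objects are safe to share between threads, while Transformer instances are not. The class name CachedStylesheet is hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical wrapper that compiles a stylesheet once and reuses it for every request. */
final class CachedStylesheet {

    private final Templates templates; // compiled stylesheet, shareable across threads

    CachedStylesheet(String xslt) {
        try {
            templates = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Invalid stylesheet", e);
        }
    }

    String apply(String xml) {
        try {
            StringWriter out = new StringWriter();
            // A Transformer is not thread-safe, so create a fresh one per call.
            templates.newTransformer()
                    .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }
}
```

An instance would typically be created once at application startup and kept in a static field or the servlet context.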
You should be able to get heap dumps to work anyway by debugging the app, placing breakpoints at key points of the code and creating a heap dump while the app is paused at each breakpoint.
You might want to try LambdaProbe, which is a profiler for Tomcat.
It supports the following:
Overview
Lambda Probe (formerly Tomcat Probe) is a self-sufficient web application, which helps to visualize various parameters of an Apache Tomcat instance in real time. Lambda Probe is designed to work specifically with Tomcat, so it is able to access far more information than is normally available to JMX agents. Here is a list of features available through Lambda Probe:
New! Comprehensive JVM memory usage monitor.
JBoss compatibility
Display of deployed applications, their status, session count, session object count, context object count, datasource usage etc.
Start, stop, restart, deploy and undeploy of applications
Ability to view deployed JSP files
Ability to compile all or selected JSP files at any time.
Ability to pre-compile JSP files on application deployment.
New! Ability to view auto-generated JSP servlets
Display of list of sessions for a particular application
Display of session attributes and their values for a particular application. Ability to remove session attributes.
Ability to view application context attributes and their values.
Ability to expire selected sessions
Graphical display of datasource details including maximum number of connections, number of busy connections and configuration details
New! Ability to group datasource properties by URL to help visualizing impact on the databases
Ability to reset data sources in case of applications leaking connections
Display of system information including System.properties, memory usage bar and OS details
Display of JK connector status including the list of requests pending execution
Real-time connector usage charts and statistics.
Real-time cluster monitoring and cluster traffic charts
New! Real time OS memory usage, swap usage and CPU utilisation monitoring
Ability to show information about log files and download selected files
Ability to tail log files in real time from a browser.
Ability to interrupt execution of "hang" requests without server restart
New! Ability to restart Tomcat/JVM via Java Service Wrapper.
Availability "Quick check"
Support for DBCP, C3P0 and Oracle datasources
Support for Tomcat 5.0.x and 5.5.x
Support for Java 1.4 and Java 1.5
https://github.com/mchr3k/org.inmemprofiler/wiki (http://mchr3k.github.io/org.inmemprofiler/)
InMemProfiler can be used to identify which objects are collected after a very short time.
I am trying to create 100 files using FileOutputStream/BufferedOutputStream.
I can see that CPU utilization is at 100% for 5 to 10 seconds. The directory I am writing to is empty. I am creating PDF files through iText. Each file is around 1 MB. I am running on Linux.
How can I rewrite the code to minimize the CPU utilization?
Don't guess: profile your application.
If the numbers show that a lot of time is spent in / within write calls, then look at ways to do faster I/O. But if most time is spent in formatting stuff for output (e.g. iText rendering), then that's where you need to focus your efforts.
Is this in a directory which already contains a lot of files? If so, you may well just be seeing the penalty for having a lot of files in a directory - this varies significantly by operating system and file system.
Otherwise, what are you actually doing while you're creating the files? Where does the data come from? Are they big files? One thing you might want to do is try writing to a ByteArrayOutputStream instead - that way you can see how much of the activity is due to the file system and how much is just how you're obtaining/writing the data.
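A sketch of that experiment: run the same generation code once against a ByteArrayOutputStream and once against a real file, and compare the timings. The Producer interface below is a hypothetical stand-in for your actual generation code (for example the iText rendering), and the dummy 1 MB payload in main is only there to make the example runnable.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical probe: separates the cost of producing data from the cost of writing it to disk. */
final class WriteCostProbe {

    /** Stand-in for the real code that generates a document into a stream. */
    interface Producer {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Returns the elapsed nanoseconds for producing the data into the given stream. */
    static long timeNanos(Producer producer, OutputStream out) {
        long start = System.nanoTime();
        try {
            producer.writeTo(out);
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1024 * 1024];           // dummy data instead of a real PDF
        Producer producer = out -> out.write(payload);

        long inMemory = timeNanos(producer, new ByteArrayOutputStream());

        Path file = Files.createTempFile("probe", ".pdf");
        long onDisk;
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file))) {
            onDisk = timeNanos(producer, out);
        } finally {
            Files.deleteIfExists(file);
        }
        System.out.printf("in memory: %d us, on disk: %d us%n", inMemory / 1000, onDisk / 1000);
    }
}
```

If the in-memory run is nearly as slow as the file run, the time is going into generating the data rather than into the file system. A single run like this is noisy, so repeat it a few times before drawing conclusions.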
It's a long shot guess, but even if you're using buffered streams make sure you're not writing out a single byte at a time.
The .read(int) and .write(int) methods are CPU killers. You should be using .read(byte[]...) and .write(byte[], int, int) for certain.
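For illustration, a block-wise copy loop using the array variants of read and write might look like this. The class name BulkCopy and the 8 KB buffer size are arbitrary choices for the sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper: copies a stream in blocks instead of one byte per call. */
final class BulkCopy {

    /** Copies everything from in to out and returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            // read(byte[]) returns the number of bytes actually read, or -1 at end of stream.
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);   // write only the part of the buffer that was filled
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```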
A 1 MB file is large enough to use a java.nio FileChannel and see large performance improvements over java.io. Rewrite your code, and measure it against the old version. I predict a 2x improvement, at a minimum.
You're unlikely to be able to reduce the CPU load for your task, especially on a Windows system. Java on Linux does support asynchronous file I/O; however, this can seriously complicate your code. I suspect you are running on Windows, as file I/O generally takes much more time on Windows than it does on Linux. I've even heard of improvements from running Java in a Linux VM on Windows.
Take a look at your Task Manager when the process is running, and turn on Show Kernel Times. The CPU time spent in user space can generally be optimized, but the CPU time in kernel space can usually only be reduced by making more efficient calls.
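A minimal sketch of writing an in-memory document through a FileChannel is shown below, assuming the whole file (for example a rendered PDF) is already available as a byte array. The class ChannelWriter and both of its methods are invented for this example, and whether it is actually faster than a buffered stream on your system is something to measure rather than assume.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: writes a complete byte array to a file through a FileChannel. */
final class ChannelWriter {

    static void write(Path target, byte[] data) {
        try (FileChannel channel = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buffer = ByteBuffer.wrap(data);
            // A single write call may not drain the buffer, so loop until it is empty.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo only: writes to a temporary file, returns its size on disk and deletes it again. */
    static long writeTempAndMeasure(byte[] data) {
        try {
            Path file = Files.createTempFile("channel-demo", ".bin");
            try {
                write(file, data);
                return Files.size(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```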
Update -
JSR 203 specifically addresses the need for asynchronous, multiplexed, scatter/gather file IO:
The multiplexed, non-blocking facility introduced by JSR-51 solved much of that problem for network sockets, but it did not do so for filesystem operations.
Until JSR-203 becomes part of Java, you can get true asynchronous IO with the Apache MINA project on Linux.
Java NIO (1) allows you to do Channel-based I/O. This is an improvement in performance, but you're only doing a buffer of data at a time, and not true async and multiplexed I/O.