Storing 1 MB byte array as session attribute - java

I am running a Java web app.
A user uploads a file (max 1 MB) and I would like to store that file until the user completes an entire process (which consists of multiple requests).
Is it ok to store the file as a byte array in the session until the user completes the entire process? Or is this expensive in terms of resources used?
The reason I am doing this is that I ultimately store the file on an external server (e.g. AWS S3), but I only want to send it to that server once the whole process is completed.
Another option would be to write the file to a temporary file on my server. However, this means I would need to remove the file in case the user exits the website, and it seems excessive to add code to the sessionDestroyed method of my SessionListener just for this one particular case (i.e. sessions are created throughout my entire application in places where I never need to check for temp files).
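For reference, the cleanup I mean would be roughly this (a sketch, assuming the javax.servlet API; the "tempFilePath" attribute name is just illustrative):

    import java.io.File;
    import javax.servlet.annotation.WebListener;
    import javax.servlet.http.HttpSessionEvent;
    import javax.servlet.http.HttpSessionListener;

    @WebListener
    public class TempFileCleanupListener implements HttpSessionListener {
        @Override
        public void sessionCreated(HttpSessionEvent se) {
            // no-op
        }

        @Override
        public void sessionDestroyed(HttpSessionEvent se) {
            // Remove the orphaned upload, if this session ever stored one.
            String path = (String) se.getSession().getAttribute("tempFilePath");
            if (path != null) {
                new File(path).delete();
            }
        }
    }

Since the listener is a no-op for sessions that never stored the attribute, it would not actually affect the rest of the application, but it still feels like a lot of machinery for one case.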
Thanks.

Maybe Yes, maybe No
Certainly it is reasonable to store such data in memory in a session if that fits your deployment constraints.
Remember that each user has their own session. So if all of your users have such a file in their session, then you must multiply to calculate the approximate impact on memory usage.
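As a back-of-the-envelope figure: 1 MB per session times 10,000 concurrent sessions is roughly 10 GB of heap for these files alone. For concreteness, a minimal sketch of the pattern in question, assuming the plain Servlet API (the attribute name and helper methods are illustrative):

    // In the upload handler: park the bytes in the session for now.
    byte[] fileBytes = readUpload(request);  // illustrative helper
    request.getSession().setAttribute("pendingUpload", fileBytes);

    // In the final step of the process: push to S3, then free the memory.
    HttpSession session = request.getSession();
    byte[] pending = (byte[]) session.getAttribute("pendingUpload");
    if (pending != null) {
        uploadToS3(pending);                 // illustrative helper
        session.removeAttribute("pendingUpload");
    }

Removing the attribute as soon as the upload succeeds keeps the 1 MB from lingering for the rest of the session's lifetime.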
If you exceed the amount of memory available at runtime, there will be consequences. Your Servlet container may serialize less-used sessions to storage, which is a problem if you’ve not programmed all of your objects to support serialization. The JVM and OS may use a swap file to move contents out of real memory as part of the virtual memory system. That swapping may impact or even cripple performance.
You must consider your runtime deployment constraints, which you did not disclose. Are you running on a Raspberry Pi or inexpensive little cloud server with little memory available? Or will you run on an enterprise-class server with half a terabyte of RAM? Do you have 3 users, 300, or 30,000? You need to crunch the numbers and determine your needs, and maybe do some runtime profiling to see actual usage.
For example: I write web apps using the Vaadin Framework, a sophisticated package for creating desktop-style apps within a web browser. Being Servlet-based, Vaadin maintains a complete representation of each user's entire work data on the server side in the Servlet session. Multiplied by the number of users, and depending on the complexity of the app, this may require a great deal of memory. So I need to account for this and run my server on sufficient hardware, with 64-bit Java tuned to run with a large amount of memory, or take other approaches such as load-balancing across multiple servers with sticky sessions.
Fortunately, RAM is quite cheap nowadays. And 64-bit hardware with large physical capacity for RAM modules, 64-bit operating systems, and 64-bit JVM implementations (Azul, among others) are all readily available.


Why does a Spring Boot web app take so long to respond after sitting idle?

I usually use Spring Boot + JPA + Hibernate + Postgres.
At the end of development of a web application I package it as a JAR, run it directly with Java, and then reverse-proxy to it with Apache (httpd).
I have noticed that at startup there are no problems or latency; when accessing the website, it works very quickly. But when several hours pass without anyone making a request to the server and I then try to access it, I must wait at least 20 seconds until the server responds; after that I can continue to access the site normally.
Why does this happen? It is as if Spring went into a standby mode whenever it detects that it has no request load, but I am not sure whether that is actually the case or whether it is a problem. If it is some native Spring functionality, how can I disable it?
Even if it means using a little more memory while idle, I want responses to be fast regardless of whether the server is under load or not.
Without knowing more, it is likely that while your webapp is sitting idle, other programs on your server are using memory and causing the JVM's memory to be swapped to disk.
When you then access the webapp again, the OS has to swap that JVM memory back into RAM, one page at a time. That takes time, but once the memory is back in RAM, your webapp will run normally.
Unfortunately, because of the way Java memory works, swapping JVM memory to disk is very bad for performance: the garbage collector periodically walks large parts of the heap, touching pages that would otherwise stay cold on disk. That is an issue for most languages that rely on garbage collectors to free memory. Languages with manual memory management, e.g. C++, will usually not be hit as badly when memory is swapped to disk, because memory use is more "focused" in those languages.
Solution: If my guess at the cause of your problem is correct, reconfigure your server so the JVM memory won't be swapped to disk.
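As a concrete illustration (assuming Linux and a HotSpot JVM; verify the flags against your JVM version), you can pre-touch a fixed-size heap at startup so its pages stay resident, and make the kernel less eager to swap:

    # Fixed heap size, with every page touched (and thus resident) at startup:
    java -Xms1g -Xmx1g -XX:+AlwaysPreTouch -jar app.jar

    # Make the Linux kernel much less willing to swap out anonymous memory:
    sysctl vm.swappiness=10

The 1 GB heap here is just a placeholder; size it to your application.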
Note that when I say server, I mean the physical machine. The "other programs" that your JVM is fighting for memory might be running in different VMs, i.e. not in the same OS.

Write-back strategy for Memcache on GAE

My App Engine (Java) application is planned to work on a data structure that needs frequent updates to many items. The amount of data is not expected to exceed 1000 records per client, but the number of clients is unbounded, so I'm not willing to do 1000 reads and 1000 writes every second just to update some counters.
Naturally I'm thinking about utilizing the Memcache. Ideally the data should be in memory all the time so I can read and update it frequently. It should only be written to the data storage if the cache is full or the VM is being shut down (my biggest concern). Can I implement some sort of write-back strategy where the cache is only written to the storage when it needs to?
In particular my two questions are:
How do I know when an item is deleted from the cache?
How do I know when the VM is being shut down, so I can persist the content of the cache?
Short answer: No.
Longer answer: Memcache offers no guarantees.
Useful answer: Look at https://developers.google.com/appengine/articles/scaling/memcache#transient. If occasionally losing data is acceptable, you can rely on memcache (knowing that some entries may be evicted at any time).
Don't worry about the VM being shut down, though: Memcache runs outside of the instance VMs and is shared between all of the app's instance VMs.
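For illustration, the usual compromise is a read-through cache plus bounded-loss persistence: treat memcache as the fast path, but write to the datastore every N updates so an eviction never loses more than N increments. A sketch using the App Engine Java Memcache API (the datastore helpers are placeholders):

    import com.google.appengine.api.memcache.MemcacheService;
    import com.google.appengine.api.memcache.MemcacheServiceFactory;

    public class CounterStore {
        private final MemcacheService cache =
                MemcacheServiceFactory.getMemcacheService();

        // Read-through: fall back to the datastore on a cache miss.
        public long getCounter(String key) {
            Long cached = (Long) cache.get(key);
            if (cached != null) {
                return cached;
            }
            long value = loadFromDatastore(key);
            cache.put(key, value);
            return value;
        }

        // Fast path in memcache; persist every 100th update so an
        // eviction loses at most a bounded number of increments.
        public void increment(String key) {
            long value = cache.increment(key, 1L, 0L); // re-seeds at 0 if evicted
            if (value % 100 == 0) {
                writeToDatastore(key, value);
            }
        }

        private long loadFromDatastore(String key) { return 0L; } // placeholder
        private void writeToDatastore(String key, long value) { } // placeholder
    }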

How can I set CPU and memory limits for a process in Java?

Hi
I want to make a website using Java and the Tomcat server.
The website should set CPU and memory limits for each user, covering all of that user's processes on the server: for example, a 3% CPU and 100 MB memory limit per user.
How can I do this?
Thanks
It is not possible with one Tomcat process. You need to spawn multiple child processes, each with its own memory cap (-Xmx100m or an OS-specific memory quota) and OS-specific settings to control the CPU quota.
Depending on how malicious your users may become, you may also want to restrict the number of available file descriptors, port ranges, disk quotas, etc. In the end it may be worth placing each user's process into a VM and/or a jail.
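A sketch of the child-process approach (the jar and class names are hypothetical, and the CPU quota still has to be applied separately through OS facilities such as cgroups on Linux or Solaris projects):

    import java.io.IOException;

    public class WorkerLauncher {
        // Launch one JVM per user with a hard 100 MB heap cap.
        public static Process launch(String userId) throws IOException {
            ProcessBuilder pb = new ProcessBuilder(
                    "java", "-Xmx100m",
                    "-cp", "worker.jar",       // hypothetical worker jar
                    "com.example.UserWorker",  // hypothetical main class
                    userId);
            pb.inheritIO(); // forward the child's stdout/stderr to ours
            return pb.start();
        }
    }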
There is no mechanism available in Java for enforcing limits on a thread's resource usage. The only way to do this is to get the underlying operating system to do it, which will most likely require you to spawn a separate JVM for each restriction, and it will not be portable across platforms.
Note that you can put a timeout limit on a function call (using an Executor) and have most operations interrupted. This might be good enough for you. If you need more, you will again need the operating system to step in.
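For example, a wall-clock budget on a unit of work might look like this (a sketch; doUserWork is a placeholder, and cancel(true) only helps if the task actually responds to interruption):

    import java.util.concurrent.*;

    public class TimeLimited {
        public static void main(String[] args) throws InterruptedException {
            ExecutorService pool = Executors.newSingleThreadExecutor();
            Future<String> future = pool.submit(TimeLimited::doUserWork);
            try {
                System.out.println(future.get(5, TimeUnit.SECONDS)); // 5 s budget
            } catch (TimeoutException e) {
                future.cancel(true); // interrupts the task, if it checks interruption
            } catch (ExecutionException e) {
                e.getCause().printStackTrace();
            } finally {
                pool.shutdown();
            }
        }

        private static String doUserWork() { return "done"; } // placeholder task
    }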

Solaris: virtual slices/disks for use with ZFS

This is a little related to my previous question, Solaris: Mounting a file system on an application's handlers, except this question is for a different purpose and is simpler, as there is no open/close/lock: it is just a fixed-length block of bytes with read/write operations.
Is there any way I can create a virtual slice, kind of like a RAM disk or an SVM slice, but where the reads and writes go through my app?
I am planning to use ZFS to take several of these virtual slices/disks and combine them into one larger one for distributed backup storage with snapshots. I really like the compression and stacking that ZFS offers. If necessary, I can guarantee that only one instance of ZFS accesses these virtual disks at a time (to prevent cache conflicts and the like). If that instance goes down, we can make sure it won't start back up, and then we can start another instance of ZFS.
I am planning to have those disks in chunks of about 4 GB, so I can move each chunk around and decide where to store it (mirrored multiple times, of course), and then have ZFS access the chunks and put them together into larger volumes for actual use. ZFS would also permit adding these small chunks if necessary to increase the size of the larger volume.
I am aware there would be extra latency / network traffic if we used my own app in Java, but this is just for backup storage. The production storage is an entirely different configuration that is unrelated.
Edit: We have a system that uses all the space available and basically when there is not enough space it will remove old snapshots and increase the gaps between old snapshots. The purpose of my proposal is to allow the unused space from production equipment to be put to use at no extra cost. At different times different units of our production equipment will have free space. Also the system I am describing should eliminate any single point of failure when attempting to access data. I am hoping to not have to buy two large units and keep them synchronized. I would prefer just to have two access points and then we can mix large/small units in any way we want and move data around seamlessly.
This is a cross-post because this is more software-related than sysadmin-related. The original question is here: https://serverfault.com/questions/212072. It may be a good idea for the original to be closed.
One way would be to write a Solaris device driver, specifically a block device driver emulating a real disk, but one that communicates back to your application instead.
Start by reading the Device Driver Tutorial, then have a look at the OpenSolaris source code for real driver code.
Alternatively, you might investigate modifying the Solaris iSCSI target to act as the interface to your application. Again, looking at OpenSolaris COMSTAR would be a good start.
It seems that any fixed-length file on any file system will do as a block device for use with ZFS. I'm not sure how reboots would work, but I am sure we could write some boot-up commands to sort that out.
Edit: The fixed-length file would be on a network file system such as NFS.
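For what it's worth, pre-allocating such a fixed-length backing file from Java is nearly a one-liner (a sketch; the path is illustrative, and the resulting file could then be handed to zpool create as a file vdev):

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class ChunkAllocator {
        public static void main(String[] args) throws IOException {
            // Create (or extend) a 4 GB fixed-length chunk on the NFS mount.
            try (RandomAccessFile chunk =
                     new RandomAccessFile("/net/backup/chunk0.img", "rw")) {
                chunk.setLength(4L * 1024 * 1024 * 1024);
            }
        }
    }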

Response time differs across multiple executions of the application with the same request (performance problem)

My Java application's purpose is to provide reference data: it loads lots of data from XML files into hash maps, and we request an item from a hash map by ID; there are multiple such hash maps for different sets of business data. The problem is that when I execute the application multiple times for the same request, the response times differ: 31 ms, 48 ms, 72 ms, 120 ms, 63 ms, and so on, so there is a considerable gap between the minimum and maximum execution times. Ideally, I would expect response times like 63 ms, 65 ms, 61 ms, 70 ms, 61 ms, but in my case the response time for the same request varies hugely. I used an open-source profiler to check whether there were any extra method executions or a memory leak, but as far as I can tell there was no problem. Please let me know what the reasons could be and how I can address this problem.
There could be many causes:
Is your Java application restarted for each run? If so, JVM startup and warm-up time could be responsible for the variations. If not, it could be that the garbage collector kicks in at an unfortunate time.
Is anything else running on that machine?
Is the disk cache "warmed up" in some cases, but not in others? That is, have the files been recently accessed so that they are still in memory?
If this is a networked application, is there any network activity during the measurements?
If there is a remote machine involved (e.g. a database server or a file server), do the above apply to that machine as well?
Use a profiler to find out which piece of code is responsible for the variations in time.
If you are not running a real-time system, then you can never be sure code will execute within a certain time.
Operating systems constantly do other things, mostly housekeeping tasks and providing services to the rest of the system. These can easily slow down everything else by 50 ms at a time.
There may also be time spent waiting for I/O, such as hard disks or network communication.
Beyond that, the JVM makes no real-time promises either. In particular, the garbage collector can run at any moment. Its effect is very small in a normal application, but it can be large if you create and discard lots of objects (as you might when loading many or large files).
Finally, it could be your algorithm: do you run the same data each time? With different data, you can get different execution times.
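One way to separate JIT warm-up and GC noise from the cost of your own code is a simple repeated-measurement harness (a sketch; lookup stands in for your hashmap request):

    public class TimingHarness {
        public static void main(String[] args) {
            // Warm-up: let the JIT compile the hot path before measuring.
            for (int i = 0; i < 10_000; i++) {
                lookup("someId");
            }
            // Measure each run individually so outliers (GC pauses, etc.)
            // stay visible instead of being averaged away.
            for (int run = 0; run < 10; run++) {
                long start = System.nanoTime();
                lookup("someId");
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                System.out.println("run " + run + ": " + elapsedMs + " ms");
            }
        }

        private static String lookup(String id) { return id; } // placeholder
    }

If the first few measured runs are still slower, the warm-up loop was too short; if isolated runs spike, suspect GC or other processes on the machine.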
