We are developing a social networking site which has to maintain a lot of notification data ("unread" notifications, i.e. notifications for activities that happened while the user was not logged in).
Once the user logs in, we fetch a lot of data in the form of JSON. There will be many JSON objects, each consisting of at least 1000 characters, and this data must be available on every page the user navigates to, so we are keeping all of it in the session.
How feasible is it to keep this much data in the session?
Is there a limit on the amount of data we can store in the session?
Keeping huge amounts of data there may hinder the performance of the application.
If this is not the proper approach, what is the most optimized way of handling such data?
Is there a limit on the amount of data we can store in the session?
There is no limit on how much data you can put in the session, other than the memory available on your server.
However, putting a lot of data in the session degrades system performance.
One thing you can do is use a temporary database table that holds only the values the user needs as they navigate through the application, and populate it each time the user visits the relevant page. In your scenario, I think that round-trip is no more costly than bloating the session.
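The idea above — keep the data keyed per user and page, and fetch only what the current page needs — can be sketched as follows. This is a minimal illustration: the class and method names are made up, and a `HashMap` stands in for the temporary database table.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: instead of keeping every JSON blob in the HTTP session, store it
// keyed by (userId, page) -- here a HashMap stands in for the temporary
// database table -- and fetch only what the current page needs.
public class NotificationTable {
    private final Map<String, String> rows = new HashMap<>();

    private static String key(long userId, String page) {
        return userId + ":" + page;
    }

    // Populate at login (one row per page's worth of notifications).
    public void put(long userId, String page, String payloadJson) {
        rows.put(key(userId, page), payloadJson);
    }

    // One small lookup per page visit, instead of a megabyte-sized session.
    public String loadForPage(long userId, String page) {
        return rows.get(key(userId, page));
    }
}
```

The session then only needs to hold the user id; each page does one cheap lookup for its own slice of the data.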
Related
In a shopping cart application, instead of maintaining the cart items in the session, we could store each item in the database whenever the user adds it to the cart, right? Why is using the session recommended in most places?
Shopping cart applications usually do not require a user to create an account in order to add items to the cart. Imagine a user with malicious intent: they could write a script that frequently adds and removes cart items, which would eventually overload your database and bring the whole application down. So it is a security vulnerability.
Also, genuine users themselves may add or remove items frequently, which could overload the system. They may even abandon their shopping midway with items still in the cart, in which case the database keeps holding unnecessary data and you would need to write invalidation logic. So it makes sense to update the database only when the user actually checks out their cart items.
It's Faster
Every update the user makes can be applied immediately in the browser, instead of paying for an HTTP request to the server plus validation plus database time.
Less load on the server
Users can add, remove, or update their shopping cart as many times as they like, and you don't need to scale the server for that.
Less DB storage size
You don't need to store data for all the "in-flight", "just browsing" users. If a user abandons the site, the data can stay in their browser without you allocating database storage for it.
Smaller DB + Smaller server = cheaper costs to the company.
Faster site responses = better user experience.
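The pattern both answers converge on — keep cart mutations in memory and touch the database only once, at checkout — can be sketched with a plain class (names are illustrative; in a servlet app this object would live in the `HttpSession`):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative session-scoped cart: adds and removes stay in memory, and
// the database is involved only once, at checkout.
public class SessionCart {
    private final Map<String, Integer> items = new LinkedHashMap<>(); // sku -> quantity

    public void add(String sku, int qty) {
        items.merge(sku, qty, Integer::sum); // accumulate quantities per SKU
    }

    public void remove(String sku) {
        items.remove(sku);
    }

    public int quantityOf(String sku) {
        return items.getOrDefault(sku, 0);
    }

    // The single point where persistence happens: snapshot the cart,
    // clear it, and hand the snapshot to the order/database layer.
    public Map<String, Integer> checkout() {
        Map<String, Integer> order = new LinkedHashMap<>(items);
        items.clear();
        return order;
    }
}
```

A malicious or indecisive user hammering `add`/`remove` only churns cheap in-memory state; the database sees one write per completed order.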
I want to know whether it is useful to use a ConcurrentHashMap for user data. I have the user data saved in a MySQL database and retrieve it when a user logs in (or when someone edits the user). Every time the user goes to another page, this user data is refreshed. Should I keep a map in my application and save changes there, with the database in the background, or should I fetch directly from the database each time? I want to make the application as performant as possible.
What you are describing is a cache. Suppose calls to the database are expensive because there is a lot of data to load, or because the query used to extract it is complex and slow. This is where a cache comes into play: it is essentially in-memory storage, which is much faster than querying the database, precisely because the data is already loaded in memory.
Filling the cache takes roughly the same time as querying the database for the data (generally a bit more, but of the same order), so a cache only makes sense if it saves time overall. There is a trade-off, though: speed vs. freshness of data. Depending on your use case you must find the right balance between the two, and then verify that the cache is actually worth it.
As you describe it (user updates that need to be saved and displayed), using a cache seems like overkill IMO, unless you have a lot of registered users and many of them use the system simultaneously. If you do use one, keep in mind the concurrency issues that may arise. A ConcurrentHashMap saves you from many hazards, but at some cost in performance.
If performance is the priority, I think you should keep the logged-in users in memory.
That way, read requests are fast because you don't need to query the database. However, you would need to update the map whenever a logged-in user is edited.
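A minimal sketch of that approach, using `ConcurrentHashMap`: the `dbLookup` function stands in for the MySQL query, and `invalidate` is the "update the map when a user is edited" hook (all names here are illustrative).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch of an in-memory cache of logged-in users backed by a
// ConcurrentHashMap; "dbLookup" stands in for the MySQL query.
public class UserCache {
    private final Map<Long, String> cache = new ConcurrentHashMap<>();
    private final Function<Long, String> dbLookup;

    public UserCache(Function<Long, String> dbLookup) {
        this.dbLookup = dbLookup;
    }

    // Hits the database only on a cache miss; computeIfAbsent is atomic,
    // so concurrent requests for the same user trigger a single lookup.
    public String get(long userId) {
        return cache.computeIfAbsent(userId, dbLookup);
    }

    // Must be called whenever the user is edited, or reads will go stale.
    public void invalidate(long userId) {
        cache.remove(userId);
    }
}
```

The invalidation call is the weak point: every code path that edits a user has to remember to call it, which is exactly the speed-vs-freshness trade-off described above.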
A human cannot tell the difference between a 1 ms delay and a 50 ms delay, so it is overkill to optimize beyond "good enough".
MySQL already does its own flavor of caching; adding another cache on top may actually slow down response times.
I have a search page which populates search results.
I can redirect to other pages from search screen.
When a user wants to return to the search screen, I want to show the same search results.
The obvious option that comes to mind is saving it in the session. Is that a good design?
Is it possible to store that amount of data in the session in Liferay? How? Any pointers are much appreciated!
Is it possible? Yes. Just allocate enough memory for the session size times the number of concurrent users. Of course, calculate first whether this is feasible, and optimize if necessary. Also keep in mind that two searches running in separate tabs at the same time could easily interfere with each other.
My recommendation would be to first investigate whether you can store just the search terms (and perhaps the results page number) and re-execute the search when/if required. Or keep track of all of a user's previous searches. That way you don't even need the session: you can store it in the database, keyed by the user id (for logged-in users) or the session id (for anonymous users).
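Storing the query per tab rather than per session also fixes the two-tabs problem mentioned above. A sketch, with made-up names and the real search backend abstracted as a function:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch: remember only the last search terms per tab,
// not the result set, and re-run the search when the user comes back.
public class SearchMemory {
    private final Map<String, String> lastQueryByTab = new HashMap<>();
    private final Function<String, List<String>> search; // the real search backend

    public SearchMemory(Function<String, List<String>> search) {
        this.search = search;
    }

    public List<String> run(String tabId, String query) {
        lastQueryByTab.put(tabId, query); // a few bytes, not the full results
        return search.apply(query);
    }

    // Re-execute on return; two tabs no longer clobber each other.
    public List<String> restore(String tabId) {
        String q = lastQueryByTab.get(tabId);
        return q == null ? List.of() : search.apply(q);
    }
}
```

The map could just as well live in the database keyed by user id or session id, as the answer suggests; only the stored query string needs to survive navigation.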
I'm new to open-source stacks and have been playing with Hibernate/JPA/JDBC and memcached. I have a large data set per JDBC query, and possibly a number of these large data sets, which I eventually bind to a chart.
However, I'm very focused on performance, and want to avoid hitting the database on every page load just to display the chart.
Are there examples of how (memcached, Redis, local or distributed) and where to cache this data (JSON or raw result data) to load it into memory? I also need to figure out how to refresh the cache, unless it uses time-based eviction (e.g. entries expire after 30 minutes, so fresh data comes from a database query instead of the cache), or perhaps an automated feed of data into the cache every x hours/minutes.
Thanks!
This is a typical problem, and the solution is not straightforward. Many factors determine the right design. Here is what we did some time ago.
Since our queries to extract the data were fairly complex (taking around a minute to execute) and the data set was large, we populated memcached from a batch job that pulled data from the database every hour and pushed it to memcached. By keeping the cache expiry longer than the batch interval, we made sure there would always be data in the cache.
There was another use case for dynamic caching: on receiving a request for data, we first checked memcached; if the data was not found, we queried the database, fetched the data, pushed it to memcached, and returned the results. But I would advise this approach only when your database queries are simple and fast enough not to hurt the overall response time.
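The dynamic-caching pattern described in that paragraph (check cache, fall back to the database, honour a time-based expiry) can be sketched as follows. A `ConcurrentHashMap` stands in for memcached, and `dbQuery` for the real database call; this is a sketch, not a production cache (expired entries are replaced lazily, never swept).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through cache with time-based expiry: check the cache first,
// fall back to the database on a miss, and refresh expired entries.
public class ReadThroughCache<K, V> {
    private record Entry<V>(V value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final Function<K, V> dbQuery;
    private final long ttlMillis;

    public ReadThroughCache(Function<K, V> dbQuery, long ttlMillis) {
        this.dbQuery = dbQuery;
        this.ttlMillis = ttlMillis;
    }

    public V get(K key) {
        long now = System.currentTimeMillis();
        Entry<V> e = store.get(key);
        if (e == null || e.expiresAtMillis() < now) {     // miss or expired
            V v = dbQuery.apply(key);                     // hit the database
            store.put(key, new Entry<>(v, now + ttlMillis));
            return v;
        }
        return e.value();                                 // served from cache
    }
}
```

The batch-populated variant from the previous paragraph is the same structure with `put` driven by a scheduled job instead of by misses.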
You can also use Hibernate's second-level cache. Whether this feature can be used efficiently depends on your database schema, queries, etc.
Hibernate has built-in support for 2nd level caching. Take a look at EhCache for example.
Also see: http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/performance.html#performance-cache
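For Hibernate 3.3 (the version of the linked docs), enabling the second-level cache with EhCache amounts to a couple of configuration properties — the factory class below is specific to that version line, so verify it against the version you actually use:

```xml
<!-- hibernate.cfg.xml: enable the second-level cache with EhCache
     (Hibernate 3.3.x; newer versions use different factory classes) -->
<property name="hibernate.cache.use_second_level_cache">true</property>
<property name="hibernate.cache.region.factory_class">
    net.sf.ehcache.hibernate.EhCacheRegionFactory
</property>
```

Each entity you want cached must also be marked cacheable, e.g. with `<cache usage="read-write"/>` in its mapping file or `@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)` on the class.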
I'm in the early stages of a web project which will require working with arrays containing around 500 elements of a custom object type. The objects will likely contain between 10 and 40 fields (based on user input), mostly booleans, strings, and floats. I'm going to use PHP for this project, but I'm also interested in how to treat this problem in Java.
I know that "premature optimization is the root of all evil", but I think I need to decide now how to handle these arrays. Do I keep them in the session object, or do I store them in the database (MySQL) and keep just a minimal set of keys in the session? Keeping the data in the session would make the application faster, but as visitor numbers grow I risk using up too much memory. On the other hand, reading from and writing to the database all the time will degrade performance.
I'd like to know where the line between those two approaches is. How do I decide when there is too much data to keep in the session?
When I face a problem like this, I try to estimate the size of the per-user data that I want to keep fast.
In your case, suppose you have 500 elements with 40 fields each, and each field averages 50 bytes (averaging across texts, numbers, dates, etc.). Then you have to keep about 1 MB in memory per user for this storage, so about 1 GB for every 1000 users, just for this cache.
Depending on your server's resources you may find bottlenecks: 1000 users consume CPU, memory, database, and disk accesses; in this scenario, is that 1 GB the problem? If yes, keep the data in the DB; if no, keep it in memory.
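The back-of-the-envelope arithmetic above, written out as a checkable snippet (the question mentions Java as well, so the sketch is in Java):

```java
// Back-of-envelope session sizing from the estimate above:
// elements x fields x average bytes per field, per user.
public class SessionSizeEstimate {
    public static long bytesPerUser(int elements, int fields, int bytesPerField) {
        return (long) elements * fields * bytesPerField;
    }

    public static long bytesForUsers(int users, long bytesPerUser) {
        return users * bytesPerUser;
    }
}
```

With the question's numbers (500 elements, 40 fields, ~50 bytes each) this gives 1,000,000 bytes, roughly 1 MB per user, and therefore roughly 1 GB per 1000 concurrent users.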
Another option is to use an in-memory DB or a distributed cache solution that does all of this for you, at some cost:
architectural complexity
possibly licence costs
I would be surprised if you had that amount of unique data for each user. Ideally, some of this data would be shared across users, and you could have some kind of application-level cache that stores the most recently used entries, and transparently fetches them from the database if they're missing.
This kind of design is relatively straightforward to implement in Java, but somewhat more involved (and possibly less efficient) with PHP since it doesn't have built-in support for application state.