I have a search page which populates search results.
I can redirect to other pages from the search screen.
When a user wants to return to the search screen, I want to show the same search results.
The obvious option that comes to mind is saving the results in the session. Is that good design?
Is it possible to store that amount of data in the session in Liferay? How? Any pointers are much appreciated!
Is it possible? Yes. Just allocate enough memory for the session size times the number of concurrent users; for example, 500 KB of results per session across 2,000 concurrent users already needs about 1 GB of heap. Of course, calculate first whether this is feasible, and optimize if necessary. Also keep in mind that two searches running in separate browser tabs at the same time could easily interfere with each other.
My recommendation would be to first investigate whether you can store just the search terms (and maybe the results page number) and execute the search again when/if required. Or keep track of all of a user's previous searches. That way you don't even need the session: you can store the searches in the database, keyed by the userid (for logged-in users) or the sessionid (for anonymous users).
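A minimal sketch of the first idea with plain servlet sessions; SearchCriteria, SearchService and Product are illustrative names, not Liferay APIs:

    import java.io.Serializable;
    import java.util.List;
    import javax.servlet.http.HttpServletRequest;

    // Illustrative criteria holder; Serializable so it can live in the session.
    class SearchCriteria implements Serializable {
        final String terms;
        final int page;
        SearchCriteria(String terms, int page) { this.terms = terms; this.page = page; }
    }

    interface SearchService { List<Product> search(String terms, int page); }
    class Product { /* your domain object */ }

    class SearchScreen {
        private SearchService searchService;

        void rememberSearch(HttpServletRequest request, String terms, int page) {
            // Store only the lightweight criteria, never the result list itself.
            request.getSession().setAttribute("lastSearch", new SearchCriteria(terms, page));
        }

        List<Product> restoreSearch(HttpServletRequest request) {
            SearchCriteria last =
                    (SearchCriteria) request.getSession().getAttribute("lastSearch");
            // Re-execute the query on return instead of caching its results.
            return last == null ? null : searchService.search(last.terms, last.page);
        }
    }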
I want to know whether it is useful to use a ConcurrentHashMap for user data. I have the user data saved in a MySQL database and retrieve it when a user logs in (or when someone edits the user). Every time the user navigates to another page, this user data is refreshed. Should I use a map and save changes from my application there, with the database in the background, or should I read the data directly from the DB? I want to make the application as performant as possible.
What you are describing is a cache. Suppose the calls to the database are expensive because there is a lot of data to load, or because the query used to extract it is complex and slow. This is where the cache data structure comes into play: it is basically in-memory storage, which is much faster than querying the database because the data is already loaded in memory.
Filling the cache takes about as long as querying the DB for the data (generally a bit longer, but of the same order), so a cache only makes sense if it actually saves time overall, i.e. if entries are read much more often than they are loaded. There is a compromise, though: speed versus freshness of data. Depending on your use case you must find the right balance between the two, and only then can you tell whether a cache is really worthwhile.
As you describe it, i.e. user updates that need to be saved and displayed, using a cache seems a bit of an overkill IMO, unless you have a lot of registered users and many of them use the system simultaneously. If you do decide to use one, keep in mind the concurrency issues that may arise. A ConcurrentHashMap saves you from many hazards, but at some cost in performance.
If performance is the priority, I think you should keep the logged-in users in memory.
That way, read requests are fast because you would not need to query the database. However, you would need to update the map whenever one of the logged-in users is edited.
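A minimal sketch of that approach; UserDao and User stand in for your own persistence classes:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    interface UserDao { User findById(long id); void save(User user); }
    class User { long id; long getId() { return id; } }

    // In-memory map of logged-in users in front of the database.
    class UserCache {
        private final Map<Long, User> users = new ConcurrentHashMap<>();
        private final UserDao dao;

        UserCache(UserDao dao) { this.dao = dao; }

        // Hit the database only on a cache miss.
        User get(long userId) {
            return users.computeIfAbsent(userId, dao::findById);
        }

        // On edit, write through to the DB and refresh the cached copy.
        void update(User user) {
            dao.save(user);
            users.put(user.getId(), user);
        }

        // Evict on logout so only logged-in users stay in memory.
        void evict(long userId) {
            users.remove(userId);
        }
    }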
A human cannot tell the difference between a 1ms delay and a 50ms delay. So it is overkill to optimize beyond "good enough".
MySQL already does a flavor of caching; your addition of another cache may actually slow down the response time.
We are developing a social networking site which has to maintain a lot of notification data ("unreal" notifications, i.e. notifications for activities that happened while the user was not logged in).
Once the user logs in we fetch a lot of data in the form of JSON. There will be a large number of JSON objects, each consisting of at least 1000 characters, and this data must be available on every page the user navigates to, so we are keeping it all in the session.
How feasible is it to keep such data in the session?
Is there any limit to the amount of data we can store in the session?
Keeping huge amounts of data in the session may hinder the performance of the application.
What is the most optimized way of handling such data, if this is not the proper one?
Is there any limit to the amount of data we can store in the session?
There is no fixed limit on how much data you can put in the session; the only limit is the memory available on your server.
But putting a lot of data in the session leads to performance degradation of the system.
One thing you can do is use a temporary database table that keeps only the values the user needs while navigating through the application, and populate each page from it when the user visits that particular page. I think the round-trip is no more costly than the session in your scenario.
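A rough sketch of that working-table idea, assuming MySQL (REPLACE INTO); the table and column names are made up for illustration:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    // Illustrative schema:
    //   CREATE TABLE pending_notification (
    //       user_id BIGINT NOT NULL,
    //       page    VARCHAR(50) NOT NULL,
    //       payload TEXT,
    //       PRIMARY KEY (user_id, page));
    class NotificationStore {

        // Park the JSON for one page in the table instead of the session.
        void save(Connection con, long userId, String page, String json) throws SQLException {
            try (PreparedStatement ps = con.prepareStatement(
                    "REPLACE INTO pending_notification (user_id, page, payload) VALUES (?, ?, ?)")) {
                ps.setLong(1, userId);
                ps.setString(2, page);
                ps.setString(3, json);
                ps.executeUpdate();
            }
        }

        // Fetch only the JSON the current page needs.
        String load(Connection con, long userId, String page) throws SQLException {
            try (PreparedStatement ps = con.prepareStatement(
                    "SELECT payload FROM pending_notification WHERE user_id = ? AND page = ?")) {
                ps.setLong(1, userId);
                ps.setString(2, page);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString(1) : null;
                }
            }
        }
    }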
I have a datastore that stores the cab booking details of customers. In the admin console I need to display statistics to the admin, like the busiest location, peak hours, and total bookings in a particular location on a particular day. For the busiest location I need to find the location from which the most cabs have been booked. Should I iterate through the entire datastore and keep a count, or is there a method to find which location has the maximum and minimum number of duplicates?
I am using an AJAX call to a Java servlet, which should return the busiest location.
I would also like a suggestion for maintaining such a stats page. Should I keep a separate entity kind just for counters and stats and update it every time a customer books a cab, or is iterating through the entire datastore the right approach for the stats page? Thanks in advance.
There are too many unknowns about your data model and usage patterns to offer a specific solution, but I can offer a few tips.
Updating a counter every time you create a new record will increase your writing costs by 2 write operations, which may or may not be significant.
Using keys-only queries is very cheap and fast. It is the preferred method for counting something, so you should try to model your data in such a way that a keys-only query can give you an answer. For example, if a "trip" entity has a property for "id of a starting point", and this property is indexed, you can loop through your locations using a keys-only query to count the number of trips that started from each location.
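For example, a keys-only count with the low-level App Engine Datastore API might look like the following; the kind name "Trip" and the property "startLocationId" are assumptions about your model, and the property must be indexed:

    import com.google.appengine.api.datastore.DatastoreService;
    import com.google.appengine.api.datastore.DatastoreServiceFactory;
    import com.google.appengine.api.datastore.FetchOptions;
    import com.google.appengine.api.datastore.Query;
    import com.google.appengine.api.datastore.Query.FilterOperator;
    import com.google.appengine.api.datastore.Query.FilterPredicate;

    class TripStats {
        private final DatastoreService ds = DatastoreServiceFactory.getDatastoreService();

        // Keys-only count of trips that started from one location; only keys
        // are fetched, so the query is cheap and fast.
        int countTripsFrom(String locationId) {
            Query q = new Query("Trip")
                    .setFilter(new FilterPredicate(
                            "startLocationId", FilterOperator.EQUAL, locationId))
                    .setKeysOnly();
            return ds.prepare(q).countEntities(FetchOptions.Builder.withDefaults());
        }
    }

Looping this over your locations and taking the maximum gives you the busiest location.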
Assuming that you record a lot of trips and that the admin page will not be visited/refreshed very frequently, the keys-only queries approach is the way to go. If the admin page is visited/refreshed many times per hour, you may be better off with the counters.
Part of my project requires that we maintain stats for our customers' products. More or less, we want to show our customers how often their products have been viewed on the site.
Therefore we want to create some form of product impressions counter. I do not just mean a counter for landing on the specific product page, but also for when the product appears in search results and in our product directory lists.
I was thinking that after calling the DB I would extract the specific product ids and pass them to a service that would then insert them into the stats tables. Another option is some form of singleton buffer writer which writes to the DB after it reaches a certain size.
Has anyone encountered this in their projects, and do you have any ideas you would like to share?
And / or does anyone know of any framework or tools that could aid this development?
Any input would be really appreciated.
As long as you don't have performance problems, do not over-engineer your design. On the other hand, depending on how big the site is, it seems you are going to have performance problems due to the huge amount of writes.
I think real-time updates would have a huge performance impact. It is also very likely that you would update the same data multiple times in a short period. Another thing is that, although interesting, storing these statistics is not mission-critical and shouldn't affect normal system operation. A final thought: inconsistencies and minor inaccuracies are IMHO acceptable in this use case.
Taking all this into account, I would temporarily hold the statistics in memory and flush them periodically, as you've suggested. This has the additional benefit of merging events for the same product: if a product was visited 10 times between two flushes, you only perform one update, not 10.
Technically, you can use a properly synchronized singleton with a background thread (a lot of handcrafting) or an intelligent cache with write-behind support.
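A hand-rolled sketch of the first option; StatsDao stands in for whatever writes to your stats tables:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.LongAdder;

    interface StatsDao { void addImpressions(long productId, long delta); }

    // Buffers impression counts in memory and flushes them once a minute.
    class ImpressionBuffer {
        private final ConcurrentHashMap<Long, LongAdder> counts = new ConcurrentHashMap<>();
        private final ScheduledExecutorService flusher =
                Executors.newSingleThreadScheduledExecutor();
        private final StatsDao statsDao;

        ImpressionBuffer(StatsDao statsDao) {
            this.statsDao = statsDao;
            flusher.scheduleAtFixedRate(this::flush, 1, 1, TimeUnit.MINUTES);
        }

        // Cheap and thread-safe; call from product pages, search results, lists.
        void recordImpression(long productId) {
            counts.computeIfAbsent(productId, id -> new LongAdder()).increment();
        }

        // One UPDATE per product per interval, no matter how many views occurred.
        // An increment racing with sumThenReset can occasionally be dropped,
        // which matches the "minor inaccuracy is acceptable" point above.
        void flush() {
            for (Map.Entry<Long, LongAdder> e : counts.entrySet()) {
                long delta = e.getValue().sumThenReset();
                if (delta > 0) {
                    statsDao.addImpressions(e.getKey(), delta);
                }
            }
        }
    }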
I am currently thinking about caching strategies and, more importantly, about avoiding any duplication of data inside the cache. My query is language agnostic but very much programming related.
My question is about the efficient caching of paged or filtered data, and more than that, about distributed caching. For the latter I have decided to go with memcached, more specifically a .NET port of it. I have seen another, commercial option in the form of NCache, but memcached seems perfectly acceptable to me and is apparently used on Facebook, MySpace, etc...
What I am after is a strategy by which you can keep objects in the cache and also reference them from paged data. If I have 100 items and I page them, I could cache the ids of products 1-10 and cache each product separately. If I were to sort the items descending, items 1-10 would be different products, so I would not want to store the actual objects each time the paging/sorting/filtering changed; instead I would store the ids of the objects so I could then perform a transactional lookup in the database for any that do not already exist in the cache or are invalid.
My initial idea was this for a cache key.
paged_<pageNumber><pageSize><sort><sortDirection>[<filter>]
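To make it concrete, the lookup I have in mind is roughly the following (a sketch; Cache stands in for the memcached client, and ProductDao/Product are illustrative):

    import java.util.ArrayList;
    import java.util.List;

    interface Cache {
        Object get(String key);
        void set(String key, Object value);
    }

    interface ProductDao {
        List<Long> findIds(int page, int size, String sort, String dir, String filter);
        Product findById(long id);
    }

    class Product { /* your domain object */ }

    class PagedProductLookup {
        private final Cache cache;
        private final ProductDao dao;

        PagedProductLookup(Cache cache, ProductDao dao) {
            this.cache = cache;
            this.dao = dao;
        }

        @SuppressWarnings("unchecked")
        List<Product> getPage(int page, int size, String sort, String dir, String filter) {
            // The page key holds only ids, so changing the sorting or
            // filtering never duplicates the product objects themselves.
            String pageKey = "paged_" + page + "_" + size + "_" + sort + "_" + dir + "_" + filter;
            List<Long> ids = (List<Long>) cache.get(pageKey);
            if (ids == null) {
                ids = dao.findIds(page, size, sort, dir, filter);
                cache.set(pageKey, ids);
            }
            List<Product> products = new ArrayList<>();
            for (Long id : ids) {
                Product p = (Product) cache.get("product_" + id);
                if (p == null) {              // miss or invalidated: fall back to the DB
                    p = dao.findById(id);
                    cache.set("product_" + id, p);
                }
                products.add(p);
            }
            return products;
        }
    }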
I would then iterate through the cache keys and remove any which start with "paged_". My question ultimately is whether anyone knows of any patterns or ideas about strategies for caching such data, such as paged data, while making sure that objects are not cached more than once.
memcached is native code and would not have a problem clearing the cache in the way I have stated above, but it is an obvious fact that the more items there are in the cache, the more time it would take. I am interested to hear if anyone knows of a solution or theory for this type of problem that is currently being employed. I am sure there will be one. Thank you for your time.
TIA
Andrew
I once tried what I think is a similar caching strategy and found it unwieldy. I eventually ended up just caching the objects that make up the pages and generating the pages for every request. Ten cache hits to construct a page should give (hopefully) sub-second response time, pretty much instant to the users of your service.
If you must cache entire pages (I think of them as result sets), then perhaps you could run the user request through a hash and use that as your cache key. It's a hard problem to visualize without a concrete example or code (for me at least).
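In that spirit, a small sketch of hashing the request parameters into a cache key; the "rs_" prefix and the parameter set are illustrative:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    class ResultSetKeys {
        // Normalize the paging/sort/filter parameters into one string and
        // digest it, so any combination maps to a fixed-length key.
        static String keyFor(int page, int size, String sort, String dir, String filter)
                throws NoSuchAlgorithmException {
            String raw = page + "|" + size + "|" + sort + "|" + dir + "|" + filter;
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(raw.getBytes(StandardCharsets.UTF_8));
            StringBuilder key = new StringBuilder("rs_");
            for (byte b : digest) {
                key.append(String.format("%02x", b));
            }
            return key.toString();
        }
    }

Note that with opaque hashed keys you can no longer find all page keys by prefix, so stale result sets are usually handled with a short expiry time rather than explicit removal.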