Pagination in SAS based software - java

I am developing a SAS based system using JSP, servlets and Java, and I am unsure whether to use client-side pagination (fetching all results in one go) or server-side pagination (fetching a page with every click).
If I use client-side pagination, how much data is acceptable? And what is the best way to implement pagination: JavaScript, Ajax, jQuery, etc.?

It really depends on the volume of your data and the probability of the user loading that page in their session.
If the dataset is limited to, say, a maximum of 100* rows or so, and each record has a few columns with small data size, you could go for client-side pagination. But if the maximum size of the dataset is unknown or it is going to grow gradually, it's best to go for server-side pagination.
Ajax with jQuery is definitely the way to go. Every jQuery grid plugin has its own mechanism for pagination, but the basic logic is similar:
You need to design your backend APIs such that they accept maxResults and currentPage as params, along with other params.
The API which interacts with your DB fetches at most maxResults rows, and the first row starts from (currentPage - 1) * maxResults.
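The offset arithmetic described above can be sketched in Java roughly as follows. The class and method names (`PageWindow`, `offsetFor`, `slice`) are made up for illustration, and an in-memory list stands in for the database; a real API would hand the offset and limit to its query instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: the names are illustrative and not from any library.
final class PageWindow {

    // Zero-based index of the first row of a 1-based page:
    // (currentPage - 1) * maxResults.
    static int offsetFor(int currentPage, int maxResults) {
        if (currentPage < 1 || maxResults < 1) {
            throw new IllegalArgumentException("currentPage and maxResults must be >= 1");
        }
        return (currentPage - 1) * maxResults;
    }

    // Returns at most maxResults rows of the requested page.
    // Assumption: the list stands in for a database table in this sketch.
    static <T> List<T> slice(List<T> rows, int currentPage, int maxResults) {
        int from = offsetFor(currentPage, maxResults);
        if (from >= rows.size()) {
            return Collections.emptyList();
        }
        // Widen to long so a very large maxResults cannot overflow.
        int to = (int) Math.min((long) from + maxResults, rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }
}
```

A page past the end of the data comes back empty rather than failing, which is usually what a grid expects.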
I have been using jQgrid and found it very well documented and simple to implement.
Helpful post :
https://stackoverflow.com/questions/159025/jquery-grid-recommendations
NB: *100 is just an example; don't take it literally :)

I reckon JS/jQuery-based pagination with Ajax for fetching the data is great. You need to consider whether you need sorting or not. I implemented jQuery-based pagination 7 months back, and at that time the pagination became really slow (1000 rows, 10 rows per page) because of too much data. So please make sure you implement Ajax-based pagination.
I used this: http://tablesorter.com/docs/
and this: http://tablesorter.com/docs/example-ajax.html
Also, another helpful link:
http://www.xarg.org/2011/09/jquery-pagination-revised/
P.S. - Be very careful about syntax and class names when implementing pagination. One spelling mistake and you might end up going in circles.
Extra information: If you are thinking of using a different language, try using Ruby on Rails. You can use will_paginate or Kaminari gem for simple implementation of pagination.

I'd certainly look at a combination of server side pagination and client side pagination.
There is no sense in returning 1000 (or 10000+) rows of data, if all you are going to do is display 10 at a time.
If you are going to be displaying data in a grid, then I'd suggest you look at Datatables.net. They have some great examples including pagination and pipelining your data from the server (ie. returning a few more records than you actually display so you have fewer calls to retrieve more data).

Related

Set pagination off while searching documents in marklogic for a specified criteria

I am doing a search in the marklogic using JsonDocumentManager by providing the StructuredQuery Definition. As a result I am getting a DocumentPage, defaults to 50 records (page length defaulted in JsonDocumentManager). But I want to retrieve all the documents in one go?
I can see two options to solve this: either increase the page length to a limit which cannot be exceeded for the criteria I am supplying, or provide the page offset in jsonDocumentManager.search(queryDefinition, pageOffset) in a loop until documentPage.isLastPage returns true.
Could someone please let me know of further options, if any? Is there any pagination parameter which I can switch to false so that MarkLogic does not do a paginated search?
As stated by @grtjn, it's always best to paginate, and even faster if you can run requests in parallel. For that reason, the Java API doesn't have a flag to get all results. Nor do the layers it builds on: the REST API and the search:search API.
The layer those build on, cts:search, uses server-side lazy evaluation to efficiently paginate under the hood while it appears to get all results. With that said, if you must have another option besides those you already know about, consider creating a Resource extension and have it call the cts:search API directly.
For what it's worth, in MarkLogic 9 we'll be providing the Data Movement SDK which will do all the pagination and parallelization for you under the hood on the client side. It is specifically designed for long-running data movement applications that need to export or manipulate large datasets. If that's of interest, please consider joining the early access program and you can try it out.

Pagination and sorting

I am using Spring MVC, IoC and Hibernate in my project. I am reading records from the database and displaying them in a grid.
I need sorting and pagination on the table records.
I used jQuery tablesorter for sorting. My problem is that I want to implement server-side pagination, not client-side. With tablesorter, when somebody sorts a column and then clicks on the next page, the client-side sorting will fail.
Is there a library or API for implementing server-side pagination and sorting?
Thanks
Ramandeep Singh
If it is an option for you to use Spring Data for your repository layer then this supports server side sort and page out of the box with integrations for your controllers.
http://docs.spring.io/spring-data/jpa/docs/1.4.2.RELEASE/reference/html/repositories.html#web-pagination
I have used it with DisplayTag without any issues, but it should work with any table component. Configure either the component or Spring Data so the sort/page param names match up.
It also removes much of the boilerplate code around creating JPA repos. Much of the time all you need to do is create an interface and leave the implementation to the library.
Well worth a look.
For this purpose, you have to send these values from the JSP page to your action method: (1) current page number, (2) column to sort, (3) number of records you want to see per page.
Then you have to apply some logic on the server side to calculate the starting and ending record.
After that you can use the Hibernate query's (1) setFirstResult and (2) setMaxResults functions.
You can use "order by" in your query for sorting.
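A minimal sketch of that server-side logic, assuming a hypothetical `Employee` entity and made-up sortable column names. It only computes the values; in a real action method they would be passed to the Hibernate query's setFirstResult and setMaxResults. The sort column is checked against a whitelist because it is concatenated into the query text.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical request holder: entity and column names are assumptions.
final class GridRequest {

    // Only these columns may appear in the "order by" clause.
    private static final Set<String> SORTABLE =
            new HashSet<>(Arrays.asList("name", "department", "hireDate"));

    final int firstResult; // value for setFirstResult
    final int maxResults;  // value for setMaxResults
    final String orderBy;  // clause appended to the query text

    GridRequest(int pageNumber, int recordsPerPage, String columnToSort, boolean ascending) {
        if (pageNumber < 1 || recordsPerPage < 1) {
            throw new IllegalArgumentException("pageNumber and recordsPerPage must be >= 1");
        }
        if (!SORTABLE.contains(columnToSort)) {
            throw new IllegalArgumentException("Column is not sortable: " + columnToSort);
        }
        this.firstResult = (pageNumber - 1) * recordsPerPage;
        this.maxResults = recordsPerPage;
        this.orderBy = " order by e." + columnToSort + (ascending ? " asc" : " desc");
    }

    // Query text for the assumed Employee entity.
    String hql() {
        return "from Employee e" + orderBy;
    }
}
```

Rejecting unknown column names up front keeps user input out of the query string, since "order by" targets cannot be bound as ordinary parameters.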

how to cache the objects for display tags in jsp JSTL

I am using the displaytag for the pagination purpose.
I have millions of records in the DB, and going from one page to another takes quite a long time.
Is there a way to cache the objects that need to be shown, so that moving between pages is faster?
Requirement: We are querying and displaying the files in directories in a Linux environment. Each folder has thousands of files.
How are you reading from the DB? It would be good to see some more of your implementation.
As a general guideline:
If you read all your data into a list from the DB and only display a page, you will be wasting resources (processing and memory). This can kill your app. Try an approach that will just go for the page you're needing.
If you are using a framework like Hibernate, you can implement caching and paging without much trouble.
If you are using direct JDBC, you will have to limit the number of rows returned by your query. Here the proper technique might depend on the database engine you're using. Please provide this information.
Be aware that your problem might be the amount of read information rather than a caching problem (just depends on the implementation).
As a sample, in Oracle, you would need to know the page and the pagesize. With both, you could limit the query with "where rownum <= pagesize * page" (or something similar depending on how you index), and navigate to the first record you need with the absolute(int) method of ResultSet. On other engines it might be more efficient.
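As a rough sketch, the row bounds and a wrapped query could be built like this. Note that this is a common variant rather than exactly what is described above: it skips the leading rows inside the SQL with a nested ROWNUM query instead of calling absolute(int) on the ResultSet. The nesting is needed because ROWNUM is assigned before ORDER BY is applied at the same query level. The class name and the inner query used in the comments are hypothetical.

```java
// Hypothetical helper: names are illustrative only.
final class OracleRownumPaging {

    // Rows up to and including this ROWNUM belong to earlier pages.
    static int firstRowExclusive(int page, int pageSize) {
        return (page - 1) * pageSize;
    }

    // Last ROWNUM that belongs to the requested 1-based page.
    static int lastRowInclusive(int page, int pageSize) {
        return page * pageSize;
    }

    // Wraps an already ordered query, e.g. "SELECT * FROM files ORDER BY id".
    // Bind parameter 1 is lastRowInclusive, parameter 2 is firstRowExclusive.
    static String wrap(String orderedQuery) {
        return "SELECT * FROM ("
                + " SELECT q.*, ROWNUM rn FROM (" + orderedQuery + ") q"
                + " WHERE ROWNUM <= ?"
                + ") WHERE rn > ?";
    }
}
```

With a PreparedStatement, the two bounds would be set with setInt in that order before executing the wrapped query.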
Now, if you're paginating with some framework, normally they support some implementation of a "DataProvider" so you can control how to fetch results for each page.

way(client side or server side) to go for pagination /sortable columns?

I have 3000 records in an employee table which I have fetched from my database with a single query. I can show 20 records per page, so there will be 150 pages, each showing 20 records. I have two questions on the pagination and sortable column approach:
1) If I implement simple pagination without sortable columns, should I send all 3000 records to the client and do the pagination client side using JavaScript or jQuery? Then if the user clicks the second page, no call goes to the server and it will be faster. But I am not sure what the impact of sending 3000 or more records to the browser/client will be. So what is the best approach: sending all the records to the client in a single go and doing the paging there, or sending a call to the server on each page click and returning just that specific page of results?
2) In this scenario, I need to provide pagination along with sortable columns (6 columns). So the user can click any column, like employee name or department name, and the rows should be arranged in ascending or descending order. Again I want to know the best approach in terms of response time and memory.
Sending data to your client is almost certainly going to be your bottleneck (especially for mobile clients), so you should always strive to send as little data as possible. With that said, it is almost definitely better to do your pagination on the server side. This is a much more scalable solution. It is likely that the amount of data will grow, so it's a safer bet for the future to just do the pagination on the server.
Also, remember that it is fairly unlikely that any user will actually bother looking through hundreds of result pages, so transferring all the data is likely wasteful as well. This may be a relevant read for you.
I assume you have a bean class representing records in this table, with instances loaded from whatever ORM you have in place.
If you haven't already, you should implement caching of these beans in your application. This can be done locally, perhaps using Guava's CacheBuilder, or remotely using calls to Memcached for example (the latter would be necessary for multiple app servers/load balancing). The cache for these beans should be keyed on a unique id, most likely the mapping to the primary key column of the corresponding table.
Getting to the pagination: simply write your queries to return only IDs of the selected records. Include LIMIT and OFFSET or your DB language's equivalent to paginate. The caller of the query can also filter/sort at will using WHERE, ORDER BY etc.
Back in the Java layer, iterate through the resulting IDs of these queries and build your List of beans by calling the cache. If the cache misses, it will call your ORM to individually query and load that bean. Once the List is built, it can be processed/serialized and sent to the UI.
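The ID-then-cache step could look roughly like this. A plain `ConcurrentHashMap` stands in for Guava's cache or Memcached here, and `BeanPageLoader` and its loader function are hypothetical names; the loader represents the ORM call that fetches one bean by primary key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: a map-backed cache keyed on the primary key.
final class BeanPageLoader<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // assumed ORM lookup by id

    BeanPageLoader(Function<K, V> loader) {
        this.loader = loader;
    }

    // Takes the IDs returned by the paginated query (already filtered,
    // sorted and limited by the database) and resolves each to a bean.
    List<V> load(List<K> pageOfIds) {
        List<V> beans = new ArrayList<>(pageOfIds.size());
        for (K id : pageOfIds) {
            // The loader runs only on a cache miss.
            beans.add(cache.computeIfAbsent(id, loader));
        }
        return beans;
    }
}
```

The order of the returned list follows the order of the IDs, so the sort applied by the query is preserved. This sketch never evicts entries; a real cache would need an expiry or size limit.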
I know this doesn't directly answer the client vs server side pagination, but I would recommend using DataTables.net to both display and paginate your data. It provides a very nice display, allows for sorting and pagination, built in search function, and a lot more. The first time I used it was for the first web project I worked on, and I, as a complete noobie, was able to get it to work. The forums also provide very good information/help, and the creator will answer your questions.
DataTables can be used both client-side and server-side, and can support thousands of rows.
As for speed, I only had a few hundred rows, but used the client-side processing and never noticed a delay.
USE SERVER PAGINATION!
Sure, you could probably get away with sending down a JSON array of 3000 elements and using JavaScript to page/sort on the client. But a good web programmer should know how to page and sort records on the server. (They should really know a couple ways). So, think of it as good practice :)
If you want a slick user interface, consider using a JavaScript grid component that uses AJAX to fetch data. Typically, these components pass back the following parameters (or some variant of them):
Start Record Index
Number of Records to Return
Sort Column
Sort Direction
Columns to Fetch (sometimes)
It is up to the developer to implement a handler or interface that returns a result set based on these input parameters.
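A handler for those parameters might be sketched as below. Everything here is an assumption made for illustration: the `Employee` type, the column names, and the in-memory list that stands in for the data store. With a real database the sort and the range would normally be pushed into the query rather than done in Java.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical handler: types and column names are illustrative.
final class GridHandler {

    static final class Employee {
        final String name;
        final String department;

        Employee(String name, String department) {
            this.name = name;
            this.department = department;
        }
    }

    // Maps the "sort column" parameter to a comparator; unknown names are rejected.
    private static final Map<String, Comparator<Employee>> SORTS = new HashMap<>();
    static {
        SORTS.put("name", Comparator.comparing((Employee e) -> e.name));
        SORTS.put("department", Comparator.comparing((Employee e) -> e.department));
    }

    // start: zero-based start record index; count: number of records to return.
    static List<Employee> fetch(List<Employee> all, int start, int count,
                                String sortColumn, boolean ascending) {
        Comparator<Employee> cmp = SORTS.get(sortColumn);
        if (cmp == null) {
            throw new IllegalArgumentException("Unknown sort column: " + sortColumn);
        }
        if (start < 0 || count < 0) {
            throw new IllegalArgumentException("start and count must be >= 0");
        }
        List<Employee> sorted = new ArrayList<>(all); // leave the input untouched
        sorted.sort(ascending ? cmp : cmp.reversed());
        int from = Math.min(start, sorted.size());
        int to = (int) Math.min((long) from + count, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

Sorting before slicing matters: the sort has to apply to the whole result set, not just the rows of the current page.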

How to handle huge result sets from database

I'm designing a multi-tiered database driven web application – SQL relational database, Java for the middle service tier, web for the UI. The language doesn't really matter.
The middle service tier performs the actual querying of the database. The UI simply asks for certain data and has no concept that it's backed by a database.
The question is how to handle large data sets? The UI asks for data but the results might be huge, possibly too big to fit in memory. For example, a street sign application might have a service layer of:
StreetSign getStreetSign(int identifier)
Collection<StreetSign> getStreetSigns(Street street)
Collection<StreetSign> getStreetSigns(LatLonBox box)
The UI layer asks to get all street signs meeting some criteria. Depending on the criteria, the result set might be huge. The UI layer might divide the results into separate pages (for a browser) or just present them all (serving up to Google Earth). The potentially huge result set could be a performance and resource problem (out of memory).
One solution is not to return fully loaded objects (StreetSign objects). Rather return some sort of result set or iterator that lazily loads each individual object.
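One possible shape for such a lazily loading iterator is sketched below. It is an illustration under assumptions, not an established API: `ChunkedIterator` is a made-up name, and the fetch function, which takes an offset and a limit and returns that slice of rows, stands in for whatever query the service tier would run.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

// Hypothetical sketch: pulls rows from the data source one chunk at a time.
final class ChunkedIterator<T> implements Iterator<T> {

    private final BiFunction<Integer, Integer, List<T>> fetchChunk; // (offset, limit) -> rows
    private final int chunkSize;
    private Iterator<T> current = Collections.emptyIterator();
    private int offset = 0;
    private boolean exhausted = false;

    ChunkedIterator(BiFunction<Integer, Integer, List<T>> fetchChunk, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be >= 1");
        }
        this.fetchChunk = fetchChunk;
        this.chunkSize = chunkSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next chunk only when the current one is used up.
        while (!current.hasNext() && !exhausted) {
            List<T> chunk = fetchChunk.apply(offset, chunkSize);
            offset += chunk.size();
            exhausted = chunk.size() < chunkSize; // a short chunk means no more rows
            current = chunk.iterator();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

Only one chunk is held in memory at a time, so the caller can walk a very large result without the service tier ever materializing all of it.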
Another solution is to change the service API to return a subset of the requested data:
Collection<StreetSign> getStreetSigns(LatLonBox box, int pageNumber, int resultsPerPage)
Of course the UI can still request a huge result set:
getStreetSigns(box, 1, 1000000000)
I'm curious what is the standard industry design pattern for this scenario?
The very first question should be:
Does the user need to, or is the user even able to, manage this amount of data?
Although the result set should be paged, if its potential size is that huge, the answer will be "probably not", so the UI shouldn't try to show it.
I worked on J2EE projects for health care systems that deal with an enormous amount of stored data, literally millions of patients, visits, forms, etc., and the general rule is not to show more than 100 or 200 rows for any user search, advising the user that the set of criteria produces more information than he can make sense of.
The way to implement this varies from one project to another. It is possible to force the UI to ask the service tier for the size of a query's result before launching it, or to throw an Exception from the service tier if the result set grows too large (however, this couples the service tier to the limitations of one UI implementation).
Be careful! This does not mean that every method on the service tier must throw an Exception if its result exceeds 100 rows. This general rule only applies to result sets that are shown to the user directly, which is a good reason to place the control in the UI instead of in the service tier.
The most frequent pattern I've seen for this situation is some sort of paging, usually done server-side to reduce the amount of information sent over the wire.
Here's a SQL Server 2000 example using a table variable (generally faster than a temp table) together with your street signs example:
CREATE PROCEDURE GetPagedStreetSigns
(
    @Page int = 1,
    @PageSize int = 10
)
AS
SET NOCOUNT ON
-- This memory-variable table will control paging
DECLARE @TempTable TABLE (RowNumber int identity, StreetSignId int)
INSERT INTO @TempTable
(
    StreetSignId
)
SELECT [Id]
FROM StreetSign
ORDER BY [Id]
-- select only those rows belonging to the requested page
SELECT SS.*
FROM StreetSign SS
INNER JOIN @TempTable TT ON TT.StreetSignId = SS.[Id]
WHERE TT.RowNumber BETWEEN ((@Page - 1) * @PageSize + 1)
    AND (@Page * @PageSize)
In SQL Server 2005, you can get more clever with stuff like Common Table Expressions and the new SQL Ranking functions. But the general theme is that you use the server to return only the information belonging to the current page.
Be aware that this approach can get messy if you're allowing the end-user to apply on-the-fly filters to the data that s/he's seeing.
I would say if the potential exists for a large set of data, then go the paging route.
You can still set a MAX that you do not want them to go over.
E.G. SO uses page sizes of 15, 30, 50...
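Enforcing such a maximum on the server can be as small as the sketch below, which also covers a caller passing a huge value such as 1000000000. The class name is hypothetical, and the default of 15 and cap of 50 are only borrowed from the page sizes mentioned above.

```java
// Hypothetical policy: the constants are example values, not recommendations.
final class PageSizePolicy {

    static final int DEFAULT_SIZE = 15;
    static final int MAX_SIZE = 50;

    // Falls back to the default for a missing or non-positive request
    // and never returns more than MAX_SIZE.
    static int effectiveSize(Integer requested) {
        if (requested == null || requested < 1) {
            return DEFAULT_SIZE;
        }
        return Math.min(requested, MAX_SIZE);
    }
}
```

Clamping silently keeps the API forgiving; throwing an exception instead would make the limit more visible to callers.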
One thing to be wary of when working with home-grown row-wrapper classes like you (apparently) have, is code that makes additional calls to the database without you (the developer) being aware of it. For example, you might call a method that returns a collection of Person objects and think that the only thing going on under the hood is a single "SELECT * FROM PERSONS" call. In actuality, the method you're calling might iterate through the returned collection of Person objects and make additional DB calls to populate each Person's Orders collection.
As you say, one of your solutions is to not return fully-loaded objects, so you're probably aware of this potential problem. One of the reasons I tend to avoid using row wrappers is that they invariably make it difficult to tune your application and minimize the size and frequency of database traffic.
In ASP.NET I would use server-side paging, where you only retrieve the page of data the user has requested from the data store. This is opposed to retrieving the entire result set, putting it into memory and paging through it on request.
JSF or JavaServerFaces has widgets for chunking large result sets to the browser. It can be parameterized as you suggest. I wouldn't call it a "standard industry design pattern" by any means, but it is worth a look to see how someone else solved the problem.
When I deal with this type of issue, I usually chunk the data sent to the browser (or thin/thick client, whichever is more appropriate for your situation) as regardless of the actual total size of the data that meets some certain criteria, only a small portion is really usable in any UI at one time.
I live in a Microsoft world, so my primary environment is ASP.Net with SQL Server. Here are two articles about paging (which mention some techniques for paging through result sets) that may be helpful:
Paging through lots of data efficiently (and in an Ajax way) with ASP.NET 2.0
Efficient Data Paging with the ASP.NET 2.0 DataList Control and ObjectDataSource
Another mechanism that Microsoft has shipped lately is their idea of "Dynamic Data" - you might be able to check out the guts of this for some guidance as to how they're dealing with this issue.
I've done similar things on two different products. In one case the data source is optionally paginated -- for java, implements a Pageable interface similar to:
public interface Pageable
{
    public void setStartIndex( int index );
    public int getStartIndex();
    public int getRowsPerPage() throws Exception;
    public void setRowsPerPage( int rowsPerPage );
}
The data source implements another method for get() of items, and the implementation of a paginated data source just returns the current page. So you can set your start index, and grab a page in your controller.
One thing to consider will be to cache your cursors server side. For a web app you'll have to expire them, but they'll really help performance wise.
The fedora digital repository project returns a maximum number of results with a result-set-id. You then get the rest of the result by asking for the next chunk supplying the result-set-id in the subsequent query. It works ok as long as you don't want to do any searching or sorting outside of the query.
From the data retrieval layer, the standard design pattern is to have two method interfaces, one for all and one for a block size.
If you wish, you can layer components that do paging over it.
