I have a MongoDB database that contains a Poll collection.
The Poll collection has a number of Poll documents. This could be a large number of documents.
I am using Java Servlet for serving HTTP requests.
How can I implement a feed kind of retrieval mechanism at the server side?
For example, in the first request I want to retrieve documents 1 to 10, then 11 to 20, and so on.
As the view scrolls, I want to fetch the next batch of data from the server and send it to the client.
Does MongoDB provide a way to do this?
I think what you are looking for is pagination. You can use the skip and limit cursor methods with your find query.
First request
db.Poll.find().skip(0).limit(10)
Second request
db.Poll.find().skip(10).limit(10)
...
...
Note: You should also sort your find on some field, so that the order of the pages stays the same from one request to the next.
db.Poll.find().skip(10).limit(10).sort({_id:-1})
For more info on the cursor methods you could look here: http://docs.mongodb.org/manual/reference/method/js-cursor/
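On the Java side, the servlet only has to turn a page number into a skip value. Here is a minimal sketch of that arithmetic, run against an in-memory list of ids rather than a real collection so that it stands alone. The class PollFeed and its methods are hypothetical names, and page numbers are assumed to start at 1. With the MongoDB Java driver, the same values would be passed to the sort, skip and limit methods of the object returned by find().

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: mirrors db.Poll.find().sort({_id:-1}).skip(n).limit(m)
// against an in-memory list so the paging arithmetic can be checked in isolation.
final class PollFeed {
    private PollFeed() {}

    // Page numbers are assumed to start at 1 ("first request", "second request").
    static long skipFor(int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        return (long) (pageNumber - 1) * pageSize;
    }

    // Sort newest-first (descending id), then skip, then limit.
    static List<Integer> page(List<Integer> ids, int pageNumber, int pageSize) {
        return ids.stream()
                .sorted(Comparator.reverseOrder())
                .skip(skipFor(pageNumber, pageSize))
                .limit(pageSize)
                .collect(Collectors.toList());
    }
}
```

A request past the last page simply returns an empty list, which the client can treat as the end of the feed.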
Related
I have a requirement to read a large data set from a postgres database which needs to be accessible via a rest api endpoint. The client consuming the data will then need to transform the data into csv format(might need to support json and xml later on).
On the server side we are using Spring Boot v2.1.6.RELEASE and spring-jdbc v5.1.8.RELEASE.
I tried using paging and loop through all the pages and store the result into a list and return the list but resulted in OutOfMemory error as the data set does not fit into memory.
Streaming the large data set looks like a good way to handle memory limits.
Is there any way that I can just return a Stream of all the database entities and also have the rest api return the same to the client? How would the client deserialize this stream?
Are there any other alternatives other than this?
If your data is so huge that it doesn't fit into memory - I'm thinking gigabytes or more - then it's probably too big to reasonably provide as single HTTP response. You will hold the connection open for a very long time. If you have a problem mid-way through, the client will need to start all over at the beginning, potentially minutes ago.
A more user-friendly API would introduce pagination. Your caller could specify a page size and the index of the page to fetch as part of their request.
For example
/my-api/some-collection?size=100&page=50
This would represent fetching 100 items starting at offset 5000 (items 5000 to 5099), assuming page indexes start at 0.
Perhaps you could place some reasonable constraints on the page size based on what you are able to load into memory at one time.
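As a rough sketch of such a constraint, the request parameters could be normalized before they reach the database. PageRequest and MAX_SIZE are hypothetical names, the cap of 500 is an arbitrary assumption, and page indexes are assumed to be zero-based.

```java
// Hypothetical holder for the size and page query parameters.
final class PageRequest {
    static final int MAX_SIZE = 500; // assumed cap; tune to what fits in memory

    final int page; // zero-based page index, never negative
    final int size; // items per page, clamped to [1, MAX_SIZE]

    PageRequest(int page, int size) {
        this.page = Math.max(0, page);
        this.size = Math.min(Math.max(1, size), MAX_SIZE);
    }

    // Index of the first item on the page (inclusive).
    long offset() {
        return (long) page * size;
    }

    // Index of the last item on the page (inclusive).
    long lastIndex() {
        return offset() + size - 1;
    }
}
```

Clamping silently is only one option; rejecting an oversized request with a 400 response would make the limit more visible to callers.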
I'm trying to generate a CSV file based on a list of objects returned by a web service method.
The problem is that I want to retrieve all of the objects available, but the call will 'fail' if I try to get more than 100 entries (the method has 2 parameters which give me the possibility to specify the interval of objects I want to retrieve, ex: from 10 to 50, from 45 to 120, etc.).
I thought of making sequential calls while incrementing the two indexes which represent the interval, but someone suggested that I should use batch processing for this. As far as I searched the internet I only found examples on how to export database data or xml files into csv, using Spring Batch.
Could someone explain how I should handle this situation? Or at least point me to an example/tutorial similar to what I need? Thank you very much!!
If you try to load all the data through a single web service request, you risk a memory or timeout error because the response is too large. Instead, make several calls to the web service, like paginated requests, and after each response insert the returned rows into your local database.
When all calls are over, call a process and build your csv file.
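The sequential calls can be sketched as a simple loop. This is a minimal sketch, not the web service's real API: fetchRange stands in for the service method, its interval is assumed to be (fromInclusive, toExclusive), and it is assumed to return a short or empty list once the data runs out. For brevity the rows are collected in memory instead of a local database, which is only reasonable if the full result fits there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

final class BatchedExport {
    static final int MAX_PER_CALL = 100; // limit stated in the question

    private BatchedExport() {}

    // fetchRange is a stand-in for the real web service call.
    static List<String> fetchAll(BiFunction<Integer, Integer, List<String>> fetchRange) {
        List<String> all = new ArrayList<>();
        int from = 0;
        while (true) {
            List<String> batch = fetchRange.apply(from, from + MAX_PER_CALL);
            all.addAll(batch);
            if (batch.size() < MAX_PER_CALL) {
                return all; // short or empty batch: nothing left to fetch
            }
            from += MAX_PER_CALL;
        }
    }

    // Joins rows into CSV text, one line per row; rows are assumed already escaped.
    static String toCsv(String header, List<String> rows) {
        StringBuilder sb = new StringBuilder(header).append('\n');
        for (String row : rows) {
            sb.append(row).append('\n');
        }
        return sb.toString();
    }
}
```

Spring Batch follows the same read-in-chunks, write-in-chunks pattern, so a loop like this may be enough unless restartability or scheduling is also needed.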
regards.
I have an android app which uses java rest api, which gets the data from mysql database.
After I open the app, I download a list of some items. Then, I make another 5 requests to rest api to get other connected resources, so:
First I download a list of shops.
Then, using the shop ids, I make 5 HTTP requests to the REST API, which fetches (from MySQL) lists of photos (not actual images, just URLs and ids), ratings, opening hours and 2 more things.
This makes 6 api (and db) calls and it takes a long time.
Is there a better way to do it?
You might rewrite the multiple queries into a single JOIN and fetch them in a single network round trip.
Be sure you understand where the time is being spent and why. Your queries might be slow if you have WHERE clauses that don't use indexes. Run EXPLAIN on each query (EXPLAIN PLAN is the Oracle form; MySQL uses plain EXPLAIN) and check for a full table scan, which MySQL reports as type ALL. If you see one, you can usually improve performance by adding appropriate indexes on the columns in the WHERE clauses.
Why don't you retrieve all of that data in the first call? The response of the REST API can be JSON or XML.
REQUEST: GET /shop/list
RESPONSE XML:
<shops>
<shop name="shop1" img="../../img1.jpg" ></shop>
<shop name="shop2" img="../../img2.jpg" ></shop>
<shop name="shop3" img="../../img3.jpg" ></shop>
</shops>
RESPONSE JSON:
[{"name":"shop1","img":"../../img1.jpg"},
{"name":"shop2","img":"../../img2.jpg"},
{"name":"shop3","img":"../../img3.jpg"}]
You could handle these responses in your Java/Android application
I want to build a reports-based application that retrieves a very large amount of data from an Oracle DB and displays it to the user, so my solution was to add a Java-based web service that returns a large amount of data. Is there a standard way to stream a response rather than trying to return a huge chunk of data at once?
You can consider a paging mechanism: display only the required set of rows at once, then move to the next set of rows on request.
At the database end, you can fetch a limited number of rows at a time. Oracle has no LIMIT keyword; before 12c this is usually done by filtering on ROWNUM in a subquery.
If you are on 12c or later, the row limiting clause (OFFSET n ROWS FETCH NEXT m ROWS ONLY) is readily available.
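Assuming Oracle 12c or later, one way to express a page is the row limiting clause, OFFSET ... ROWS FETCH NEXT ... ROWS ONLY, appended to a query that already has an ORDER BY. The helper below is a hypothetical sketch that only builds the SQL text; the table name report_rows in the usage note is made up. The values are validated numbers inlined for readability, and in real JDBC code bind variables would be the safer choice.

```java
// Hypothetical helper that appends Oracle's 12c row limiting clause.
final class OraclePaging {
    private OraclePaging() {}

    // orderedQuery is assumed to end with an ORDER BY clause, so that
    // the pages are stable from one request to the next.
    static String paged(String orderedQuery, long offset, int pageSize) {
        if (offset < 0 || pageSize < 1) {
            throw new IllegalArgumentException("offset must be >= 0 and pageSize >= 1");
        }
        return orderedQuery
                + " OFFSET " + offset + " ROWS"
                + " FETCH NEXT " + pageSize + " ROWS ONLY";
    }
}
```

For example, paged("SELECT id FROM report_rows ORDER BY id", 20, 10) asks for rows 21 to 30 of the ordered result.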
I have 3000 records in an employee table, which I fetch from my database with a single query. I can show 20 records per page, so there will be 150 pages, each showing 20 records. I have two questions on pagination and the sortable-column approach:
1) If I implement simple pagination without sortable columns, should I send all 3000 records to the client and paginate there using JavaScript or jQuery? Then, if the user clicks the second page, no call goes to the server and it will be faster. However, I am not sure what the impact of sending 3000 or more records to the browser will be. So which is the better approach: sending all the records to the client in one go and paginating there, or calling the server on each page click and returning only that page's results?
2) In this scenario, I need to provide pagination along with sortable columns (6 columns). The user can click any column, such as employee name or department name, and the rows should then be arranged in ascending or descending order. Again, I want to know the best approach in terms of response time and memory.
Sending data to your client is almost certainly going to be your bottleneck (especially for mobile clients), so you should always strive to send as little data as possible. With that said, it is almost definitely better to do your pagination on the server side. This is a much more scalable solution. It is likely that the amount of data will grow, so it's a safer bet for the future to just do the pagination on the server.
Also, remember that it is fairly unlikely that any user will actually bother looking through hundreds of result pages, so transferring all the data is likely wasteful as well. This may be a relevant read for you.
I assume you have a bean class representing records in this table, with instances loaded from whatever ORM you have in place.
If you haven't already, you should implement caching of these beans in your application. This can be done locally, perhaps using Guava's CacheBuilder, or remotely using calls to Memcached for example (the latter would be necessary for multiple app servers/load balancing). The cache for these beans should be keyed on a unique id, most likely the mapping to the primary key column of the corresponding table.
Getting to the pagination: simply write your queries to return only IDs of the selected records. Include LIMIT and OFFSET or your DB language's equivalent to paginate. The caller of the query can also filter/sort at will using WHERE, ORDER BY etc.
Back in the Java layer, iterate through the resulting IDs of these queries and build your List of beans by calling the cache. If the cache misses, it will call your ORM to individually query and load that bean. Once the List is built, it can be processed/serialized and sent to the UI.
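A minimal sketch of that lookup step, using a plain ConcurrentHashMap in place of Guava or Memcached so that it stands alone. BeanCache is a hypothetical name, and the loader function stands in for the ORM call that loads one bean by primary key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache keyed on the primary key of the table.
final class BeanCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stand-in for the ORM lookup by id

    BeanCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Resolves one page of IDs to beans, keeping the order the query returned.
    List<V> resolve(List<K> pageOfIds) {
        List<V> beans = new ArrayList<>(pageOfIds.size());
        for (K id : pageOfIds) {
            beans.add(cache.computeIfAbsent(id, loader)); // loader runs only on a miss
        }
        return beans;
    }
}
```

This sketch never evicts or refreshes entries; a real cache would need a size bound and some form of invalidation when rows change.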
I know this doesn't directly answer the client vs server side pagination, but I would recommend using DataTables.net to both display and paginate your data. It provides a very nice display, allows for sorting and pagination, built in search function, and a lot more. The first time I used it was for the first web project I worked on, and I, as a complete noobie, was able to get it to work. The forums also provide very good information/help, and the creator will answer your questions.
DataTables can be used both client-side and server-side, and can support thousands of rows.
As for speed, I only had a few hundred rows, but used the client-side processing and never noticed a delay.
USE SERVER PAGINATION!
Sure, you could probably get away with sending down a JSON array of 3000 elements and using JavaScript to page/sort on the client. But a good web programmer should know how to page and sort records on the server. (They should really know a couple ways). So, think of it as good practice :)
If you want a slick user interface, consider using a JavaScript grid component that uses AJAX to fetch data. Typically, these components pass back the following parameters (or some variant of them):
Start Record Index
Number of Records to Return
Sort Column
Sort Direction
Columns to Fetch (sometimes)
It is up to the developer to implement a handler or interface that returns a result set based on these input parameters.
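As a rough illustration of such a handler, the sketch below takes the start index, record count, sort column and sort direction and returns the matching slice. GridHandler is a hypothetical name, rows are modelled as plain maps of column name to value, and every row is assumed to contain the sort column. It sorts in memory only to stay self-contained; a real handler would push the sort and range into the database query.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class GridHandler {
    private GridHandler() {}

    // start: zero-based index of the first record; count: number of records to return.
    static List<Map<String, String>> fetch(List<Map<String, String>> rows,
                                           int start, int count,
                                           String sortColumn, boolean ascending) {
        Comparator<Map<String, String>> byColumn =
                Comparator.comparing((Map<String, String> row) -> row.get(sortColumn));
        if (!ascending) {
            byColumn = byColumn.reversed();
        }
        return rows.stream()
                .sorted(byColumn)
                .skip(start)
                .limit(count)
                .collect(Collectors.toList());
    }
}
```

Because the column name comes from the client, a database-backed version should check it against a fixed list of allowed columns before using it in an ORDER BY.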