Best way to query multiple data from webservice and mysql?

I have an Android app that uses a Java REST API, which gets its data from a MySQL database.
After I open the app, I download a list of items. Then I make another 5 requests to the REST API to get other connected resources:
First I download a list of shops.
Then, using the shop IDs, I make 5 HTTP requests to the REST API, which fetches (from MySQL) lists of photos (not actual images, just URLs and IDs), ratings, opening hours and 2 more things.
This makes 6 API (and DB) calls and it takes a long time.
Is there a better way to do it?

You might rewrite the multiple queries as a single JOIN and fetch the results in a single network round trip.
Be sure you understand where the time is being spent and why. Your queries might be slow if you have WHERE clauses that don't use indexes. Run EXPLAIN on each query and see whether a full table scan is being executed. If you see one, you can often improve performance right away by adding appropriate indexes on the columns in the WHERE clauses.

Why don't you retrieve all of that data in the first call? The response of the REST API can be JSON or XML.
REQUEST: GET /shop/list
RESPONSE XML:
<shops>
<shop name="shop1" img="../../img1.jpg" ></shop>
<shop name="shop2" img="../../img2.jpg" ></shop>
<shop name="shop3" img="../../img3.jpg" ></shop>
</shops>
RESPONSE JSON:
[{"name":"shop1","img":"../../img1.jpg"},
{"name":"shop2","img":"../../img2.jpg"},
{"name":"shop3","img":"../../img3.jpg"}]
You could handle these responses in your Java/Android application
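A minimal sketch of how such a combined response could be assembled on the server, assuming the shops and their photos were fetched with one JOIN and arrive as flat rows. The class names ShopRow, Shop and ShopAssembler and their fields are hypothetical and do not come from the question; the same grouping would apply to ratings or opening hours.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical flat row, as a JOIN of shops and photos might return it.
final class ShopRow {
    final long shopId;
    final String shopName;
    final String photoUrl; // null when the shop has no photo (e.g. LEFT JOIN)

    ShopRow(long shopId, String shopName, String photoUrl) {
        this.shopId = shopId;
        this.shopName = shopName;
        this.photoUrl = photoUrl;
    }
}

// Hypothetical nested shape that would be serialized to JSON or XML once.
final class Shop {
    final long id;
    final String name;
    final List<String> photoUrls = new ArrayList<>();

    Shop(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class ShopAssembler {
    // Groups flat rows by shop id, keeping the order in which shops first appear.
    static List<Shop> assemble(List<ShopRow> rows) {
        Map<Long, Shop> byId = new LinkedHashMap<>();
        for (ShopRow row : rows) {
            Shop shop = byId.computeIfAbsent(row.shopId, id -> new Shop(id, row.shopName));
            if (row.photoUrl != null) {
                shop.photoUrls.add(row.photoUrl);
            }
        }
        return new ArrayList<>(byId.values());
    }
}
```

The client then receives one list of shops with their photo URLs nested inside, instead of making a separate request per resource type.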

Related

Read large data from database using JdbcTemplate and expose via api?

I have a requirement to read a large data set from a postgres database which needs to be accessible via a rest api endpoint. The client consuming the data will then need to transform the data into csv format(might need to support json and xml later on).
On the server side we are using Spring Boot v2.1.6.RELEASE and spring-jdbc v5.1.8.RELEASE.
I tried using paging and loop through all the pages and store the result into a list and return the list but resulted in OutOfMemory error as the data set does not fit into memory.
Streaming the large data set looks like a good way to handle memory limits.
Is there any way that I can just return a Stream of all the database entities and also have the rest api return the same to the client? How would the client deserialize this stream?
Are there any other alternatives other than this?

If your data is so huge that it doesn't fit into memory (I'm thinking gigabytes or more), then it's probably too big to reasonably provide as a single HTTP response. You will hold the connection open for a very long time. If you have a problem midway through, the client will need to start all over at the beginning, potentially minutes ago.
A more user-friendly API would introduce pagination. Your caller could specify a page size and the index of the page to fetch as part of their request.
For example:
/my-api/some-collection?size=100&page=50
This would represent fetching 100 items starting at index 5000 (items 5000 to 5099), assuming page numbers start at 0.
Perhaps you could place some reasonable constraints on the page size based on what you are able to load into memory at one time.
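As a rough sketch of that constraint, the helper below turns the size and page parameters into a bounded row window. The class name PageRequest, the cap of 500 and the zero-based page index are assumptions made for illustration, not part of the original API.

```java
// Hypothetical helper that turns ?size=...&page=... into a bounded row window.
final class PageRequest {
    // Assumed server-side cap; choose whatever comfortably fits in memory.
    static final int MAX_SIZE = 500;

    final int size;
    final long offset;

    private PageRequest(int size, long offset) {
        this.size = size;
        this.offset = offset;
    }

    // page is zero-based; size is clamped to the range [1, MAX_SIZE].
    static PageRequest of(int page, int size) {
        if (page < 0) {
            throw new IllegalArgumentException("page must be >= 0");
        }
        int bounded = Math.max(1, Math.min(size, MAX_SIZE));
        return new PageRequest(bounded, (long) page * bounded);
    }

    // Index of the last item on this page, inclusive.
    long lastIndex() {
        return offset + size - 1;
    }
}
```

The resulting size and offset could then be passed to the query as LIMIT and OFFSET, so only one page is ever held in memory.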

AJAX/JavaScript search performance better than Java/Oracle

I work with a very large, enterprise application written in Java which queries an Oracle SQL database. We use JavaScript on the front end, and are always looking for ways to improve upon the performance of the application with increased use.
The issue we're having right now is that we are sending a query, via Java, that results in 39,000 records. This is putting a significant load on the server and causes the browser to hang. I should mention that the data is relatively static (only changes about once a year) and we could use an xml map or something similar (flat file) since we know the exact results that will be returned each time.
The query, however, is still taking 1.5 - 2 minutes to load, which is unacceptable. I wanted to see if there were any suggestions as to how this scenario can be optimized, especially if it can be done any quicker with JavaScript (or jQuery) and using AJAX for the db connection. Or, are we going about this problem all wrong?

You want to determine whether the slowness is due to:
the query executing in the database
the network returning 39k records
the JavaScript working with the 39k records after the AJAX call completes
If you can run the query in sqlplus or Toad, this will eliminate the web tier and network altogether. If this is slow, then tune the query by checking indexes.
If after adding the appropriate indexes, the query is still slow, then you could prebuild the query's results and store the results in a table or you could create a materialized view.
Once you have the query performing well from sqlplus, then add the network back into the equation. Run it from your web browser and see what overhead is being added.
If it is still slow, then you need to determine if the problem is the act of ajaxing the data or if the slowness occurs after the page does something with the data (ie. populating a data grid via javascript).
If the slowness is because the browser is waiting for the data, then you want to make sure it's only ever fetched once. You can do this by setting cache headers on the AJAX response so that the result is cached for 1 year. Or you can store the results in localStorage.
If the slowness is due to the browser working with the 39k rows (ie. moving the data into a data grid), then you have a few options.
find a better approach or library
use pagination
You may find performance issues in each of these areas. Most likely the query just needs to be tuned, and adding indexes or pre-querying the data and storing it will solve the problem.
Another thing to consider is if you really need 39k rows at one time. If you can, paginate at the db level so you're returning 100 rows per page.

How to fetch data dynamically from web server?

I have a Mongodb database that contains a Poll Collection.
The Poll collection has a number of Poll documents. This could be a large number of documents.
I am using Java Servlet for serving HTTP requests.
How can I implement a feed kind of retrieval mechanism at the server side?
For example, in the first request I want to retrieve documents 1 to 10, then 11 to 20, and so on.
As there is a scroll in the view, I want to get the data from the server and send it to the client.
Does Mongodb provide a way to do this?

I think what you are looking for is pagination. You could use the limit and skip methods with your find query.
First request
db.Poll.find().skip(0).limit(10)
Second request
db.Poll.find().skip(10).limit(10)
...
...
Note: You should also be sorting your find with some field.
db.Poll.find().skip(10).limit(10).sort({_id:-1})
For more info on the cursor methods you could look here: http://docs.mongodb.org/manual/reference/method/js-cursor/

way(client side or server side) to go for pagination /sortable columns?

I have 3000 records in an employee table which I have fetched from my database with a single query. I can show 20 records per page, so there will be 150 pages, each showing 20 records. I have two questions on the pagination and sortable-column approach:
1) If I implement a simple pagination without sortable columns, should I send all 3000 records to client and do the pagination client side using javascript or jquery. So if user clicks second page, call will not go to server side and it will be faster. Though I am not sure what will be the impact of sending 3000 or more records on browser/client side? So what is the best approach either sending all the records to client in single go and do the sorting there or on click of page send the call to server side and then just return that specific page results?
2) In this scenario, I need to provide the pagination along with sortable columns (6 columns). So here user can click any column like employee name or department name, then names should be arranged in ascending or descending order. Again I want to know the best approach in terms of time response/memory?

Sending data to your client is almost certainly going to be your bottleneck (especially for mobile clients), so you should always strive to send as little data as possible. With that said, it is almost definitely better to do your pagination on the server side. This is a much more scalable solution. It is likely that the amount of data will grow, so it's a safer bet for the future to just do the pagination on the server.
Also, remember that it is fairly unlikely that any user will actually bother looking through hundreds of result pages, so transferring all the data is likely wasteful as well. This may be a relevant read for you.

I assume you have a bean class representing records in this table, with instances loaded from whatever ORM you have in place.
If you haven't already, you should implement caching of these beans in your application. This can be done locally, perhaps using Guava's CacheBuilder, or remotely using calls to Memcached for example (the latter would be necessary for multiple app servers/load balancing). The cache for these beans should be keyed on a unique id, most likely the mapping to the primary key column of the corresponding table.
Getting to the pagination: simply write your queries to return only IDs of the selected records. Include LIMIT and OFFSET or your DB language's equivalent to paginate. The caller of the query can also filter/sort at will using WHERE, ORDER BY etc.
Back in the Java layer, iterate through the resulting IDs of these queries and build your List of beans by calling the cache. If the cache misses, it will call your ORM to individually query and load that bean. Once the List is built, it can be processed/serialized and sent to the UI.
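The ID-then-cache flow described above could look roughly like this. The sketch uses a plain ConcurrentHashMap as a stand-in for a real cache such as Guava or Memcached, and the class name BeanCache and the loader function (which would call the ORM on a miss) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Stand-in for a real cache: a map keyed on the primary key of the record.
final class BeanCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    // Hypothetical loader; in a real application it would ask the ORM for one bean.
    private final Function<K, V> loader;

    BeanCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached bean, invoking the loader only on a cache miss.
    V get(K id) {
        return cache.computeIfAbsent(id, loader);
    }

    // Resolves one page of IDs (e.g. from a LIMIT/OFFSET query) into beans, in order.
    List<V> loadPage(List<K> pageOfIds) {
        List<V> beans = new ArrayList<>(pageOfIds.size());
        for (K id : pageOfIds) {
            beans.add(get(id));
        }
        return beans;
    }
}
```

With this shape, overlapping pages or repeated sorts only hit the loader for IDs that have not been seen before; a real cache would additionally need size limits and expiry.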

I know this doesn't directly answer the client vs server side pagination, but I would recommend using DataTables.net to both display and paginate your data. It provides a very nice display, allows for sorting and pagination, built in search function, and a lot more. The first time I used it was for the first web project I worked on, and I, as a complete noobie, was able to get it to work. The forums also provide very good information/help, and the creator will answer your questions.
DataTables can be used both client-side and server-side, and can support thousands of rows.
As for speed, I only had a few hundred rows, but used the client-side processing and never noticed a delay.

USE SERVER PAGINATION!
Sure, you could probably get away with sending down a JSON array of 3000 elements and using JavaScript to page/sort on the client. But a good web programmer should know how to page and sort records on the server. (They should really know a couple ways). So, think of it as good practice :)
If you want a slick user interface, consider using a JavaScript grid component that uses AJAX to fetch data. Typically, these components pass back the following parameters (or some variant of them):
Start Record Index
Number of Records to Return
Sort Column
Sort Direction
Columns to Fetch (sometimes)
It is up to the developer to implement a handler or interface that returns a result set based on these input parameters.
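A minimal sketch of such a handler, assuming the grid sends a start index, a record count, a sort column and a sort direction. To stay self-contained it sorts an in-memory list; in a real application the same parameters would more likely be translated into ORDER BY and LIMIT/OFFSET in the database query. The names Employee and EmployeeGridHandler and the two sortable columns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical record type with two of the sortable columns.
final class Employee {
    final String name;
    final String department;

    Employee(String name, String department) {
        this.name = name;
        this.department = department;
    }
}

final class EmployeeGridHandler {
    // Whitelist of sortable columns; unknown column names are rejected.
    private static final Map<String, Comparator<Employee>> SORTABLE = new HashMap<>();
    static {
        SORTABLE.put("name", Comparator.comparing((Employee e) -> e.name));
        SORTABLE.put("department", Comparator.comparing((Employee e) -> e.department));
    }

    // Returns up to count records starting at index start, in the requested order.
    static List<Employee> page(List<Employee> all, int start, int count,
                               String sortColumn, boolean ascending) {
        Comparator<Employee> order = SORTABLE.get(sortColumn);
        if (order == null) {
            throw new IllegalArgumentException("Unsupported sort column: " + sortColumn);
        }
        if (start < 0 || count < 0) {
            throw new IllegalArgumentException("start and count must be >= 0");
        }
        List<Employee> sorted = new ArrayList<>(all); // leave the caller's list untouched
        sorted.sort(ascending ? order : order.reversed());
        int from = Math.min(start, sorted.size());
        int to = (int) Math.min((long) from + count, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

Restricting the sort column to a whitelist matters even more when the parameter ends up in SQL, since a column name cannot be bound as a query parameter.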

Is it a good idea to process a large amount of data directly in the database?

I have a database with a lot of web pages stored.
I need to process all of the data, so I have two options: retrieve the data into the program, or process it directly in the database with functions I will create.
What I want to know is:
Is doing some processing in the database, and not in the application, a good idea?
When is this recommended and when not?
What are the pros and cons?
Is it possible to extend the language with new features (external APIs/libraries)?
I tried retrieving the content into the application (it worked), but it was too slow and messy. My concern was that I can't do in the database what I can do in Java, but I don't know if this is true.
Just an example: I have a table called Token. At the moment it has 180,000 rows, but this will grow to over 10 million rows. I need to do some processing to determine whether a word between two tokens classified as "Proper Name" is part of the name or not.
I will need to process all the data. In this case, is doing it directly in the database better than retrieving it into the application?

My concern was that I can't do in the database what I can do in Java, but I don't know if this is true.
No, that is not a correct assumption. There are valid circumstances for using the database to process data. For example, if the work involves calling a lot of disparate SQL statements that can be combined in a stored procedure, then you should do the processing in the stored procedure and call it from your Java application. This way you avoid making several network trips to the database server.
I do not know what you are processing, though. Are you parsing XML data stored in your database? Then perhaps you should use XQuery; many modern databases support it.
Just an example: I have a table called Token. At the moment it has 180,000 rows, but this will grow to over 10 million rows. I need to do some processing to determine whether a word between two tokens classified as "Proper Name" is part of the name or not.
Is there some indicator in the data that tells you it's a proper name? Fetching 10 million rows (highly susceptible to OutOfMemoryError) and then going through them is not a good idea. If there are certain parameters about the data that can be put in a WHERE clause to limit the amount of data being fetched, that is the way to go in my opinion. You will certainly need to run EXPLAIN on your SQL and check that the correct indexes are in place, the index cluster ratio and the type of index; all of that will make a difference. Now, if you can't fully eliminate all "improper names", then you should try to get rid of as many as you can with SQL and then process the rest in your application. I am assuming this is a batch application, right? If it is a web application, then you definitely want to create a batch application to stage the data before the web application queries it.
I hope my explanation makes sense. Please let me know if you have questions.

Interacting directly with the DB for every single operation is tedious and hurts performance. There are several ways around this: you can use indexing, caching, or tools such as Hibernate, whose caches can keep frequently used data in memory so that you don't need to query the DB for every operation. Indexing tools such as Lucene are also very popular and could reduce how often you hit the DB.
