GWT: Populating a page from datastore using RPC is too slow - java

Is there a way to speed up the population of a page with GWT's UI elements which are generated from data loaded from the datastore? Can I avoid making the unnecessary RPC call when the page is loaded?
More details about the problem I am experiencing: There is a page on which I generate a table with names and buttons for a list of entities loaded from the datastore. There is an EntryPoint for the page and in its onModuleLoad() I do something like this:
final FlexTable table = new FlexTable();
rpcAsyncService.getAllCandidates(new AsyncCallback<List<Candidate>>() {
    public void onSuccess(List<Candidate> candidates) {
        int row = 0;
        for (Candidate person : candidates) {
            table.setText(row, 0, person.getName());
            table.setWidget(row, 1, new ToggleButton("Yes"));
            table.setWidget(row, 2, new ToggleButton("No"));
            row++;
        }
    }
    ...
});
This works, but takes more than 30 seconds to load the page with buttons for 300 candidates. This is unacceptable.
The app is running on Google App Engine and using the app engine's datastore.

You could do a lot of things; I will list them in order of expected impact, biggest first.
FlexTable is not meant for 300 rows. Since your table is so simple, you should consider generating the HTML by hand and then using a simple HTML widget. Also, 300 rows is a lot of information, so consider using pagination. The DynaTable sample app shows you how to do this.
It looks like you are using one GWT module per page. That is the wrong approach to GWT. Loading a GWT module has a non-trivial cost. To understand what I mean, compare a browser refresh on Gmail with the refresh link that Gmail provides. That is the same cost you pay when every page in your website has a distinct GWT module.
If the list of candidates is needed across views, you can send it along with the HTML as a JSON object, and then use the Dictionary class in GWT to read it. This saves you the RPC call that you are making. This approach is only recommended if the data is going to be useful across multiple views/screens (like the logged-in user's info).
Check how long your RPC method call is taking. You can enable stats in GWT to figure out where your application is taking time.
You can also run Speed Tracer to identify where the bottleneck is. This is last only because it is obvious FlexTable is performing a lot of DOM manipulations. In general, if you don't know where to start, Speed Tracer is a great tool.
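As a rough illustration of the "generate the HTML by hand" suggestion, here is a minimal sketch in plain Java with no GWT dependency. The class name CandidateTableHtml, its methods, and the data-row / data-vote attributes are all hypothetical. In a GWT client the resulting string would presumably be handed to an HTML widget in one step instead of filling the table cell by cell, and the click handling for the buttons would have to be attached separately (for example, a single handler on the enclosing widget that inspects the clicked element).

```java
import java.util.List;

// Hypothetical helper: builds the markup for the whole table in one pass,
// so the browser only has to parse and lay out the table once.
final class CandidateTableHtml {

    // Escape the characters that are significant in HTML text and attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // One row per candidate name: the name plus a Yes and a No button.
    // The data-* attributes are an assumption; they let a single click
    // handler work out which row and which vote was clicked.
    static String render(List<String> names) {
        StringBuilder sb = new StringBuilder("<table>");
        for (int row = 0; row < names.size(); row++) {
            sb.append("<tr><td>").append(escape(names.get(row))).append("</td>")
              .append("<td><button data-row=\"").append(row)
              .append("\" data-vote=\"yes\">Yes</button></td>")
              .append("<td><button data-row=\"").append(row)
              .append("\" data-vote=\"no\">No</button></td></tr>");
        }
        return sb.append("</table>").toString();
    }
}
```

Escaping the names matters here because they come from the datastore and are being concatenated straight into markup.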

The significant thing here is how you're retrieving the list of candidates, which you haven't shown. 30 seconds is extremely high, and it's unlikely that it's due to the datastore alone.
Have you tried using appstats to profile your app?

Like sri suggested - pagination is easiest and (I think) best solution (along with switching to Grid or just <table>). But in case you wanted for some reason to show/render many rows at once, the GWT Incubator project has a nice wiki page about it - along with some benchmarks showing how FlexTable sucks at large row count. Check out their other tables too ;)

Your problem is that every time you add something to the FlexTable, the browser has to re-render and repaint the whole page. Try creating a new FlexTable, populating it, and only once it is fully populated replacing the old one with it.

Related

How to refresh jsp page?

I'm new to Google App Engine and I'm having a little problem that I can't seem to be able to find the solution to.
Whenever I create/delete/update something from the Datastore, in the end I do this:
resp.sendRedirect("/view_list.jsp");
And the page doesn't get updated.
For instance, if I have a page with a list of 2 items, then I create another item and I redirect to that page with the list, and instead of showing 3 items, it shows 2 items, until I change page and come back.
So how can I make sure that the page refreshes after my changes to the Datastore?
A couple of points that are relevant:
The datastore is HRD (High Replication Datastore) and, as per the documentation, the delay from the time a write is committed until it becomes visible in all datacenters means that queries across multiple entity groups (non-ancestor queries) can only guarantee eventually consistent results. Consequently, the results of such queries may sometimes fail to reflect recent changes to the underlying data. Please refer to the documentation for more details.
In short, to get consistent reads, use get as much as you can. If you use a query, there could be a delay due to the indexing.
Hope this helps. I also suggest framing the question title better. The question is a good one but could get lost when it says "refresh the jsp page".

Aside from RPC calls, what could be taking my App Engine Program so long

I'm trying to performance optimize a page load in GAE, and I'm a bit stumped what is taking so long to serve the page.
When I first got appstats running I found the page was making about 500-600 RPC calls. I've now got that down to 3.
However, I'm still seeing a massive load of extra time in App Stats. Another page on my site (using the same django framework + templating) loads in about 60ms, doing a small query to a small data set.
Question is, what is this overhead, and where should I be looking for trouble points?
The data in the request has about 350 records, and about 30 properties per record. I'm cool with the data call itself taking the datastore api time, but it's the other time I'm confused about. The data does get stepped through a biggish iterator, and I've now used fetch on most of these requests to keep the RPC call down, and make sure things are in memory rather than being queried as they go.
Slow Request - Look at all the extra blue
Fast Request , RPC blue is matched against overall blue
EDIT
OK, so I have created a new model called FastModel, and copied the bare minimum items needed for the page to it, so it can load as quickly as possible, and it does make a big difference. Seems there are things on the Model that slow it all down. Will investigate further.
Deserializing 350 records, especially large ones, takes a long time. That's probably what's taking up the bulk of your execution time.

Best way to store small, alternating, public data that updates every couple of hours?

The essence of my problem is that there are too many solutions, and I would like to find which one wins out in pros and cons before I build an infrastructure around it.
(Simplified for the purpose of this forum) This is an auction site where five auctions are stored in a rank #1-5, #1 being the currently featured auction. The other four are simply "on deck." After either a couple hours or the completion of that auction, #2-5 move up to #1-4 and a new one is chosen to be #5
I'm using a dedicated server and I've been considering just storing the data in the servlet or maybe adding a column in the database as a boolean for each auction...like "isFeatured = 1"
Suffice it to say the data is read about 5 times+ more often than it is written, which is why I'm leaning towards good old SQL.
When you can retrieve the relevant auctions from DB with a simple query with ORDER BY and TOP or something similar then try this. If no performance issues occur then KISS and you're done.
Otherwise, if these 5 auctions stay valid for a while, cache them in memory. Have a singleton holding these auctions and provide methods for updating them, for example. Maybe you want to use a caching library instead. Update this top 5 whenever necessary, but serve it directly out of memory without hitting a DB or something similarly expensive.
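A minimal sketch of the singleton idea, assuming a single application server and auctions identified by plain string IDs (both are assumptions, as are the class and method names). The database stays the source of truth; this only holds the current ranking in memory so reads never touch it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory holder for the ranked auctions.
// Index 0 is the featured auction, the rest are "on deck".
final class FeaturedAuctions {
    private static final FeaturedAuctions INSTANCE = new FeaturedAuctions();

    // Readers always see a complete, immutable snapshot of the ranking.
    private final AtomicReference<List<String>> ranked =
            new AtomicReference<List<String>>(Collections.<String>emptyList());

    private FeaturedAuctions() {}

    static FeaturedAuctions getInstance() {
        return INSTANCE;
    }

    // Cheap read path: no locking, no database access.
    List<String> current() {
        return ranked.get();
    }

    // Called after loading the ranking from the database (e.g. at startup).
    void replaceAll(List<String> auctionIds) {
        ranked.set(Collections.unmodifiableList(new ArrayList<String>(auctionIds)));
    }

    // Drop the featured auction, move the others up one rank,
    // and append the newly chosen auction at the end.
    void rotate(String nextAuctionId) {
        while (true) {
            List<String> old = ranked.get();
            int from = Math.min(1, old.size());
            List<String> next = new ArrayList<String>(old.subList(from, old.size()));
            next.add(nextAuctionId);
            if (ranked.compareAndSet(old, Collections.unmodifiableList(next))) {
                return;
            }
        }
    }
}
```

With more than one application server this would not be enough on its own, since each server would hold its own copy of the ranking.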
What kind of scale are you looking for? How many application servers need access to the data?
I think you're probably making this more complicated than it is. Just use a database, take a hit of ACID, and move onto whatever else you need to work on. :P
Have you taken a look at SQLite? It allows for "good old SQL" without all of the hassles of setting up a separate database server. As long as the data isn't too huge (to be fair, I haven't tested the size limits, but I've skimmed blog entries mentioning the use of SQLite to process files of several dozen MB in size quickly and with no problems), you should be fine.
It isn't a perfect solution for all needs (frankly, I sometimes find the dynamic typing to be a pain), but since it relies on locally stored files, reads will be much faster than firing up a network connection to talk to a more "traditional" RDBMS.

Deciding on a strategy for paginating Book listings without SQL

I have an ArrayList of Books pulled from different Merchants and sorted in Java, depending on user preferences, according to price or customer reviews:
List<Book> books = new ArrayList<Book>();
This requires me to keep a large chunk of data in memory stored as Java objects at all times. Now that I need to paginate this data into listings that span multiple web pages and allow a user to click on a numbered page link to hop to that segment of the data, what's the best way to do this?
My idea was to have maybe 25 book listings per page and rather than use hyperlinks that submit the form data as a GET request of URL parameters, the page number hyperlinks would simply resubmit the form, passing the requested page number as an additional form POST parameter.
<input type="hidden" id="pageNumber" value="0">
Page 5
In that case page 5 (counting pages from 0) would simply be a set of 25 records starting at index 125 (5 * 25) in the ArrayList and ending at index 149.
Is there a better way to do this?
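For reference, the slicing arithmetic described above can be sketched as follows. The Pager class and its method names are hypothetical; page numbers are zero-based, matching the 5 * 25 = 125 example, and out-of-range pages yield an empty list instead of an exception.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for slicing an in-memory list into pages.
final class Pager {

    // Returns a copy of the requested page; pageNumber is zero-based.
    static <T> List<T> page(List<T> items, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so a huge page number cannot overflow.
        long from = (long) pageNumber * pageSize;
        if (from >= items.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + pageSize, (long) items.size());
        // Copy the view so the page does not keep a reference into the full list.
        return new ArrayList<T>(items.subList((int) from, to));
    }

    // Number of pages needed to show itemCount items (rounds up).
    static int pageCount(int itemCount, int pageSize) {
        return (itemCount + pageSize - 1) / pageSize;
    }
}
```

The last page may be shorter than pageSize, which is why the end index is clamped to the list size.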
Refactor your application to let e.g. Hibernate pull out data from the underlying database.
Hibernate can do all the sorting and pagination without you having to keep it all in memory at all times.
IMO the request being a GET or POST shouldn't make much difference, so I'd say do whatever floats your boat (shielding head from RESTful rebuttals). The big thing I'd still be concerned about is keeping that list in memory. Pulling it from separate merchants seems like a good argument for not re-retrieving it each time a page is requested, but personally I'd consider storing those results in a local database anyways, even temporarily. Keeping that much data in memory on your app server will have consequences when you have a lot of concurrent users.
How many results pages do users generally look at? How big is the data (total or per record)?
Is this big list always around (static), or created per-query ?
Instead of returning a page with 25 results, can you output (say) 200 in a JSON array, and use javascript to display n .. n+24 results. If you have all the results on the page, you can also do the sorting there as well. Request a 1x1.gif?user=u1&action=whatever if you want to do user tracking (update ads, etc.) when displaying another page.
Depending on your record size, traffic, and user behavior, sending JSON could be more compact than the HTML generated on the server, so you get:
less bandwidth used
fewer queries made to the server
a better response for the user, because pages update quicker (and the server is doing fewer queries)
but you will need to do some analysis to see if this will help you. For example, if more than half of people always look at the second page of results, you might as well ship with that first page results 26-50 as well.
On the other hand, you wouldn't want to send 500 results if no one looks past page 3. Unless you wanted to inflate your traffic numbers. Or you could do something dynamic, like send out smaller pages when there is less bandwidth available. God we live in primitive times.
I have an open source library that deals with pagination in Java web applications. Here is the link:
http://www.hdpagination.org
It may be an option for you to think about.
If you have any question, feel free to ask.

Prefuse: Reloading of XML files

I am new to the prefuse visualization toolkit and have a couple of general questions. For my purpose, I would like to perform an initial visualization using prefuse (graphview / graphml). Once rendered, upon a user click of a node, I would like to completely reload a new xml file for a new visualization. I want to do this in order to allow me to "pre-package" graphs for display.
For example. If I search for Ted. I would like to have an xml file relating to Ted load and render a display. Now in the display I see that Ted has nodes associated called Bill and Joe. When I click Joe, I would like to clear the display and load an xml file associated with Joe. And so on.
I have looked into loading one very large xml file containing all node and node relationship info and allowing prefuse to handle this using the hops from one level to another. However, eventually I am sure that system performance issues will arise due to the size of data.
Thanks in advance for any help,
John
Of course, as you said, one option is to load all nodes and then set the nodes you don't need to be invisible. Prefuse scales fairly well, but of course it has its limits. The second option is to just create a brand new panel and replace the old panel. I've used option 2 and it works quite well.
I'm far from an expert on Prefuse's performance issues, but I think it is definitely more resource intensive to have a huge xml file loaded at once than to do the processing to only re-load the necessary nodes.
I don't know what kind of graph you are using, but I would place a 'refreshGraph' that removes the graph from the Visualization object, cancels Activity, cancels Layout, refreshes the ActionList and re-starts over. It would probably turn out something like this:
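// The parameter type is an assumption; the original snippet did not declare one.
public void refresh(VisualItem clickedNode) {
    // Remove the current graph and aggregate groups from the visualization
    visualization.removeGroup(GRAPH);
    visualization.removeGroup(AGGR);
    // Stop running activities and actions before resetting
    activity.cancel();
    actionList.cancel();
    visualization.reset();
    // process the XML and reload your graph here
}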
