I am trying to use the MusicBrainz API to get the discography of an artist using the following request: http://www.musicbrainz.org/ws/2/release/?query=artist:eminem but I get a lot of unsorted and repeated data. I know that I can use keywords (AND, OR, etc.), but I do not really know how to sort the results by date or filter out repeated data. Is there a way to do this in the REST call, or do I have to implement the sorting in my own code?
About repeated data:
It sounds like what you want are release groups, not releases. For example, a given album is one release group on MusicBrainz, but that group usually contains several releases because the album exists in different editions.
About sorting:
Unfortunately you can't sort the data while querying the web service yet. There is a ticket for that: MBS-5636.
If you do want to sort the data, you have to fetch all of it and sort it yourself.
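As a rough sketch of the "fetch everything and sort it yourself" step: the snippet below assumes you have already fetched and parsed the release groups into a small holder type. The ReleaseGroup record and its two fields are hypothetical names for illustration, not part of any MusicBrainz client, and the dates are assumed to arrive as "YYYY", "YYYY-MM" or "YYYY-MM-DD" strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class ReleaseGroupSorter {

    // Hypothetical holder for one parsed result (not a MusicBrainz client type).
    record ReleaseGroup(String title, String firstReleaseDate) {}

    // Keeps the first entry seen per title (case-insensitive), then sorts by date.
    // Because the dates are assumed to be ISO-style prefixes, plain string
    // comparison orders them chronologically. Entries with an empty date go last.
    static List<ReleaseGroup> dedupeAndSort(List<ReleaseGroup> fetched) {
        Map<String, ReleaseGroup> byTitle = new LinkedHashMap<>();
        for (ReleaseGroup rg : fetched) {
            byTitle.putIfAbsent(rg.title().toLowerCase(Locale.ROOT), rg);
        }
        List<ReleaseGroup> result = new ArrayList<>(byTitle.values());
        result.sort(Comparator
                .comparing((ReleaseGroup rg) -> rg.firstReleaseDate().isEmpty())
                .thenComparing(ReleaseGroup::firstReleaseDate));
        return result;
    }

    public static void main(String[] args) {
        // Placeholder data standing in for parsed web service results.
        List<ReleaseGroup> fetched = List.of(
                new ReleaseGroup("Album B", "2002-05-26"),
                new ReleaseGroup("Album A", "1999"),
                new ReleaseGroup("album b", "2002-06-01"),
                new ReleaseGroup("Album C", ""));
        dedupeAndSort(fetched).forEach(System.out::println);
    }
}
```

Deduplicating by title is only one possible rule; if you query release groups instead of releases you may not need it at all.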
I am writing Couchbase DAO using Java API. I store all documents for one entity in particular bucket. I wonder what is the best way to get all documents from this bucket?
Thanks in advance!
First: do you plan to store each entity type in its own bucket? That will probably not work in the long run, unless you never expect to have more than 10 entity types in total. Buckets are not made to organize data like that: they are meant to store a variety of different types of data.
Second: do you really want to get all data from a bucket? That seems like a very uncommon use case. It's almost like asking "how do I query all data from all tables in a relational database"
That being said, I could imagine a very specialized situation where you'd want to do this. So, you could:
Create a PRIMARY index and execute a N1QL query like SELECT * FROM mybucket;
Create a very simple map/reduce view index of the data.
Both of these things can be done with the Java SDK.
I'm writing this on the fly on my phone, so forgive the crappy code samples.
I have entities with a many-to-many relationship:
@ManyToMany
@JoinTable(name = "foo",
           joinColumns = @JoinColumn(name = "..."),
           inverseJoinColumns = @JoinColumn(name = "..."))
List list = new ArrayList();
I want their data to be retrieved in a paginated way.
I know about setFirstResult and setMaxResults. Is there a way to use this with the mapping? As in, I retrieve the object and get the list filled with contents equal to the amount of records for a single page, with the appropriate offset.
I guess I'm just unclear on the best way to do this. I could manually use Hibernate criteria to get the effect, but I feel that misses the point of the API. I have this mapping, and I want to see if there's a way to use it in a paginated way.
PS. If this is impractical, just say so. Also, if it is, can I still use the mapping to add new entries to the join table? That is, if the entity is already persisted in the DB but I haven't fetched the many-to-many list, can I add something new to it so that, when it's persisted with cascade all, the new entry is added to the join table without clearing the other entries?
The type of the relationship between entities that are part of your query isn't that important. There are a couple of ways to tackle this.
If your database supports the LIMIT keyword in its queries, you can use it to fetch the data one page at a time, assuming you sort your data. Note that if your data changes while your user is navigating between pages, you might see some duplication or miss some records. You'll also be stuck rewriting your queries if you ever switch to a database that doesn't have the LIMIT keyword.
If you need to freeze the data at the point of the original query, you need to use a 3rd party framework or write your own code to fetch a list of IDs for your query, then split up that list and fetch by ID in subsets for pagination. This is more reliable and can be made to work for any database.
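A minimal sketch of the "fetch the IDs once, then slice them per page" idea, assuming the ID list has already been loaded by the original query. The class and method names are made up for illustration; fetching the actual rows for the returned IDs is left to whatever data access code you already have.

```java
import java.util.List;

class IdSnapshotPager {

    // Returns the ids belonging to one page (0-based) of a snapshot that was
    // taken when the query first ran. Pages past the end come back empty.
    static <T> List<T> pageOf(List<T> idSnapshot, int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        // long arithmetic avoids int overflow for large page numbers
        int from = (int) Math.min((long) page * pageSize, idSnapshot.size());
        int to = (int) Math.min((long) from + pageSize, idSnapshot.size());
        return List.copyOf(idSnapshot.subList(from, to));
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(11, 12, 13, 14, 15, 16, 17);
        System.out.println(pageOf(ids, 0, 3));
        System.out.println(pageOf(ids, 2, 3));
    }
}
```

Each page's IDs can then be loaded with a "fetch by ID" query, so rows inserted after the snapshot was taken do not shift the pages.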
Displaytag is a data paging framework I've used and that I therefore can tell you works well for large datasets. It's also one of the older solutions for this problem and is not part of an extended framework.
http://displaytag.sourceforge.net/11/tut_externalSortAndPage.html
Tablesorter is another one I came across. This one uses jQuery and fetches the entire data set in one query, so strictly speaking it doesn't meet your "fetches the data in a paginated way" criterion. (This might not be appropriate for large sets.)
http://tablesorter.com/docs/
This tutorial might be helpful:
http://theopentutorials.com/examples/java-ee/jsp/pagination-in-servlet-and-jsp/
If you're already using a framework take a look at whether that framework has tackled pagination:
Spring MVC provides a data pager
http://blog.fawnanddoug.com/2012/05/pagination-with-spring-mvc-spring-data.html
GWT provides a data pager:
http://www.gwtproject.org/javadoc/latest/com/google/gwt/user/cellview/client/SimplePager.html
The following references might be helpful too:
JDBC Pagination
which also points to:
http://java.avdiel.com/Tutorials/JDBCPaging.html
I have a list of records containing country, city, district and building name information (more than 50,000 records) where building name is unique for every record.
I want to search building, district & city. But I want to get a list of cities if I pass the country to a method, e.g. get(String country). Or, get a list of districts if I pass country and city to the method, e.g. get(String country, String city).
Is there any existing collection/library/data structure to do something like this? I am thinking of a tree-like structure / Map. I tried MultiKeyMap, but it does not return a list of values and it is not thread-safe. Also, I don't want to use database for doing this.
Thanks in advance for your help.
Solr might do the job you are after:
Solr is the popular, blazing fast open source enterprise search
platform from the Apache Lucene project. Its major features include
powerful full-text search, hit highlighting, faceted search, dynamic
clustering, database integration, rich document (e.g., Word, PDF)
handling, and geospatial search. Solr is highly scalable, providing
distributed search and index replication, and it powers the search and
navigation features of many of the world's largest internet sites...
It should allow you to create queries which will in turn allow you to search through your records.
You can also interact with Solr through Solrj:
Solrj is a java client to access solr. It offers a java interface to
add, update, and query the solr index.
You can use HashMap like
HashMap<country,HashMap<City,HashMap<district,HashMap<building,value>>>>
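A rough sketch of that nested-map idea with the two lookups from the question. Since the asker wanted thread safety, this version swaps HashMap for ConcurrentSkipListMap, which is my substitution rather than part of the suggestion above. The LocationIndex class, its method names and the String value type are assumptions for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

class LocationIndex {

    // country -> city -> district -> building -> value
    private final Map<String, Map<String, Map<String, Map<String, String>>>> tree =
            new ConcurrentSkipListMap<>();

    // Creates the intermediate levels on demand, then stores the building.
    void add(String country, String city, String district, String building, String value) {
        tree.computeIfAbsent(country, k -> new ConcurrentSkipListMap<>())
            .computeIfAbsent(city, k -> new ConcurrentSkipListMap<>())
            .computeIfAbsent(district, k -> new ConcurrentSkipListMap<>())
            .put(building, value);
    }

    // All cities of a country, in sorted order (empty if the country is unknown).
    List<String> get(String country) {
        var cities = tree.get(country);
        return cities == null ? List.of() : List.copyOf(cities.keySet());
    }

    // All districts of a city, in sorted order (empty if either key is unknown).
    List<String> get(String country, String city) {
        var cities = tree.get(country);
        var districts = cities == null ? null : cities.get(city);
        return districts == null ? List.of() : List.copyOf(districts.keySet());
    }

    public static void main(String[] args) {
        LocationIndex index = new LocationIndex();
        index.add("CountryX", "CityB", "District2", "Building1", "v1");
        index.add("CountryX", "CityA", "District1", "Building2", "v2");
        System.out.println(index.get("CountryX"));
        System.out.println(index.get("CountryX", "CityB"));
    }
}
```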
You could take a look at CollectionUtils in Apache Commons Collections. It has a "select" method that does what you want.
A more offbeat way may be to use a .properties file per country that refers to a subset of localities, each in its own .properties file, which in turn refers to cities, which refer to a .properties file containing buildings.
Another option may be a class hierarchy: a base class, e.g. GeographicLocation, whose constructor is fed an index and loads an abstract class that represents a Region, or returns a list of regions if none is indicated, depending on which of two overloaded methods is called. That in turn automatically loads the next abstract class layer, for cities, on top of it.
Inside GeographicLocation class ....
CountryMap cntry = (CountryMap) this;
RegionMap rgion = (RegionMap) cntry;
CityMap cty = (CityMap) rgion;
....e.t.c.
Why not simply use three hash tables (e.g. of the type HashMap<String, List<Record>>): one keyed by building, one keyed by city and one keyed by district? Sure, you'll be using about three times as much memory, but 50,000 records really isn't that much. Furthermore, lookups will be really fast and simple. I'd recommend trying this and seeing how it performs.
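Here is one way the three tables could be built, as a sketch only. The BuildingRecord type and its field names are assumptions standing in for your own record class, and the indexes are built once up front with Collectors.groupingBy.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class RecordIndexes {

    // Hypothetical record type; substitute your own class.
    record BuildingRecord(String country, String city, String district, String building) {}

    final Map<String, List<BuildingRecord>> byBuilding;
    final Map<String, List<BuildingRecord>> byCity;
    final Map<String, List<BuildingRecord>> byDistrict;

    // Builds all three lookup tables in one pass each over the records.
    RecordIndexes(List<BuildingRecord> records) {
        byBuilding = records.stream().collect(Collectors.groupingBy(BuildingRecord::building));
        byCity = records.stream().collect(Collectors.groupingBy(BuildingRecord::city));
        byDistrict = records.stream().collect(Collectors.groupingBy(BuildingRecord::district));
    }

    public static void main(String[] args) {
        RecordIndexes indexes = new RecordIndexes(List.of(
                new BuildingRecord("CountryX", "CityA", "District1", "Building1"),
                new BuildingRecord("CountryX", "CityA", "District2", "Building2"),
                new BuildingRecord("CountryX", "CityB", "District3", "Building3")));
        System.out.println(indexes.byCity.get("CityA"));
    }
}
```

If the maps are never modified after construction, reading them from several threads should be fine. Records that change at runtime would need concurrent maps or locking instead.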
I have 3000 records in an employee table which I have fetched from my database with a single query. I can show 20 records per page, so there will be 150 pages, each showing 20 records. I have two questions on the pagination and sortable column approach:
1) If I implement simple pagination without sortable columns, should I send all 3000 records to the client and do the pagination client-side using JavaScript or jQuery? Then if the user clicks the second page, no call goes to the server and it will be faster. However, I am not sure what the impact of sending 3000 or more records to the browser will be. So which is the better approach: sending all the records to the client in one go and paginating there, or calling the server on each page click and returning only that page's results?
2) In this scenario, I need to provide the pagination along with sortable columns (6 columns). So here user can click any column like employee name or department name, then names should be arranged in ascending or descending order. Again I want to know the best approach in terms of time response/memory?
Sending data to your client is almost certainly going to be your bottleneck (especially for mobile clients), so you should always strive to send as little data as possible. With that said, it is almost definitely better to do your pagination on the server side. This is a much more scalable solution. It is likely that the amount of data will grow, so it's a safer bet for the future to just do the pagination on the server.
Also, remember that it is fairly unlikely that any user will actually bother looking through hundreds of result pages, so transferring all the data is likely wasteful as well. This may be a relevant read for you.
I assume you have a bean class representing records in this table, with instances loaded from whatever ORM you have in place.
If you haven't already, you should implement caching of these beans in your application. This can be done locally, perhaps using Guava's CacheBuilder, or remotely using calls to Memcached for example (the latter would be necessary for multiple app servers/load balancing). The cache for these beans should be keyed on a unique id, most likely the mapping to the primary key column of the corresponding table.
Getting to the pagination: simply write your queries to return only IDs of the selected records. Include LIMIT and OFFSET or your DB language's equivalent to paginate. The caller of the query can also filter/sort at will using WHERE, ORDER BY etc.
Back in the Java layer, iterate through the resulting IDs of these queries and build your List of beans by calling the cache. If the cache misses, it will call your ORM to individually query and load that bean. Once the List is built, it can be processed/serialized and sent to the UI.
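The "iterate through the IDs and ask the cache" step might look roughly like this. It is a sketch under assumptions: a plain ConcurrentHashMap stands in for Guava or Memcached, and the loadFromDb function is a hypothetical stand-in for your ORM's load-by-ID call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class CachedBeanLoader<K, V> {

    // Simple in-process cache keyed on the primary key.
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the ORM call that loads one bean by id.
    private final Function<K, V> loadFromDb;

    CachedBeanLoader(Function<K, V> loadFromDb) {
        this.loadFromDb = loadFromDb;
    }

    // Builds the page's beans in the order of the ids returned by the paging
    // query. A cache miss falls through to the loader exactly once per id.
    List<V> beansFor(List<K> pageOfIds) {
        List<V> beans = new ArrayList<>(pageOfIds.size());
        for (K id : pageOfIds) {
            beans.add(cache.computeIfAbsent(id, loadFromDb));
        }
        return beans;
    }

    public static void main(String[] args) {
        CachedBeanLoader<Integer, String> loader =
                new CachedBeanLoader<>(id -> "employee-" + id);
        System.out.println(loader.beansFor(List.of(3, 1, 2)));
    }
}
```

This version never evicts anything, so a real cache would also need size or time limits.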
I know this doesn't directly answer the client vs server side pagination, but I would recommend using DataTables.net to both display and paginate your data. It provides a very nice display, allows for sorting and pagination, built in search function, and a lot more. The first time I used it was for the first web project I worked on, and I, as a complete noobie, was able to get it to work. The forums also provide very good information/help, and the creator will answer your questions.
DataTables can be used both client-side and server-side, and can support thousands of rows.
As for speed, I only had a few hundred rows, but used the client-side processing and never noticed a delay.
USE SERVER PAGINATION!
Sure, you could probably get away with sending down a JSON array of 3000 elements and using JavaScript to page/sort on the client. But a good web programmer should know how to page and sort records on the server. (They should really know a couple ways). So, think of it as good practice :)
If you want a slick user interface, consider using a JavaScript grid component that uses AJAX to fetch data. Typically, these components pass back the following parameters (or some variant of them):
Start Record Index
Number of Records to Return
Sort Column
Sort Direction
Columns to Fetch (sometimes)
It is up to the developer to implement a handler or interface that returns a result set based on these input parameters.
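To make those parameters concrete, here is a small sketch of such a handler. It sorts and slices an in-memory list purely for illustration; on a real server you would normally push the same parameters into the database query (ORDER BY plus LIMIT/OFFSET or the equivalent). The Employee record, the column names and the method signature are assumptions.

```java
import java.util.Comparator;
import java.util.List;

class EmployeePageHandler {

    // Hypothetical row type with two of the sortable columns.
    record Employee(String name, String department) {}

    // Returns `count` records starting at index `start`, ordered by the
    // requested column and direction. Unknown columns are rejected so that
    // client input never reaches the sort logic unchecked.
    static List<Employee> page(List<Employee> all, int start, int count,
                               String sortColumn, boolean ascending) {
        Comparator<Employee> order = switch (sortColumn) {
            case "name" -> Comparator.comparing(Employee::name);
            case "department" -> Comparator.comparing(Employee::department);
            default -> throw new IllegalArgumentException("Unknown sort column: " + sortColumn);
        };
        if (!ascending) {
            order = order.reversed();
        }
        return all.stream()
                .sorted(order)
                .skip(Math.max(start, 0))
                .limit(Math.max(count, 0))
                .toList();
    }

    public static void main(String[] args) {
        List<Employee> all = List.of(
                new Employee("Cid", "HR"),
                new Employee("Ann", "Sales"),
                new Employee("Dee", "IT"),
                new Employee("Bob", "IT"));
        System.out.println(page(all, 1, 2, "name", true));
    }
}
```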
I've been using the low level datastore API for App Engine in Java for a while now and I'm trying to figure out the best way to handle one to many relationships. Imagine a one to many relationship like "Any one student can have zero or more computers, but every computer is owned by exactly one student".
The two options are to:
have the student entity store a list of Keys of the computers associated with the student
have the computer entity store a single Key of the student who owns the computer
I have a feeling option two is better but I am curious what other people think.
The advantage of option one is that you can get all the 'manys' back without using a Query. One can ask the datastore for all entities using get() and passing in the stored list of keys. The problem with this approach is that you cannot have the datastore do any sorting of the values that get returned from get(). You must do the sorting yourself. Plus, you have to manage a list rather than a single Key.
Option two seems nice because there is no list to maintain. Also, you can sort by properties of the computer as long as there is an index for that property. Imagine trying to get all the computers for a student where the results are sorted by purchase date. With approach two it is a simple query, and no sorting is done in our code (the datastore's index takes care of it).
Sorting is not really hard, but it is a little more time-consuming (~O(n log n) for a sort) than reading from a sorted index (~O(n) for going through the index). The tradeoff is an index (space in the datastore) versus processing time. As I said, my instinct tells me option two is the better general solution because it gives the developer a little more flexibility in getting results back in order, at the cost of additional indexes (which, with the Google pricing model, are pretty cheap). Does anyone agree, disagree, or have comments?
Both approaches are valid in different situations, though option two - storing a single reference on the 'many' side - is the more common approach. Which you use depends on how you need to access your data.
Have you considered doing both? Then you could quickly get a list of computers a student owns by key OR use a query which returns results in some sorted order. I don't think maintaining a list of keys on the student model is as intimidating as you think.
Don't underestimate the benefit of fetching entities directly by keys. According to this article, this can be 4-5x faster than queries.