Retrieving data from an XML file with a limit in Java - java

I am using an XML file in my project as the data store for insert/update/delete operations.
Currently I am using XPath to perform these operations from my Java application.
I am facing a problem while retrieving data from the XML. If there are 1000 records in the XML file, I want to fetch them with a limit on the number of rows (like the LIMIT clause in a MySQL SELECT query) in order to implement pagination in the view page. I want to display 100 records at a time, so that the end user can click a next button to page through all 1000 records.
Can anyone tell me the best way to fulfill this requirement?
Yes, we can do it with the position() function, but the problem is that I want to get the data in sorted order. position() returns nodes from the XML file by their document position, and in the XML file the data may not be in order. So I want to read the data sorted as well as paginated, and I cannot find an XPath query that does both sorting and pagination.

You can consider using JAXB instead of direct XML manipulation.

As you are using XPath to access your XML data, one possibility is the position() function to get "paginated" data from the XML, like:
/path/to/some/element[position() >= 100 and position() <= 200]
Of course you then have to store the boundaries (e.g. 100 - 200 in this example) between user requests.
OK, if you need sorted output as well... as far as I know there is no sort function in pure XPath (1.0/2.0). Maybe you are using a library that offers one as an extension. Or you may be able to use XSLT with xsl:sort. Or you use XML binding as described in the other answer.
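Since plain XPath cannot sort, one workaround is to select all matching nodes with XPath, sort them in Java with a Comparator, and then take a sublist for the requested page. This is a minimal sketch using only the JDK's javax.xml.xpath API; the element and attribute names are placeholders for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathPagination {
    public static void main(String[] args) throws Exception {
        // Sample data; in the XML file the records are not in order.
        String xml = "<records>"
                + "<record id=\"3\" name=\"c\"/>"
                + "<record id=\"1\" name=\"a\"/>"
                + "<record id=\"2\" name=\"b\"/>"
                + "</records>";

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // XPath 1.0 has no sort function, so select everything first...
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                "/records/record", doc, XPathConstants.NODESET);

        List<Element> rows = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            rows.add((Element) nodes.item(i));
        }

        // ...then sort in Java (here by the numeric "id" attribute)...
        rows.sort(Comparator.comparingInt(
                (Element e) -> Integer.parseInt(e.getAttribute("id"))));

        // ...and take one page of results (page size 2, first page).
        int pageSize = 2, page = 0;
        int from = page * pageSize;
        int to = Math.min(from + pageSize, rows.size());
        for (Element row : rows.subList(from, to)) {
            System.out.println(row.getAttribute("id") + " " + row.getAttribute("name"));
        }
    }
}
```

For 1000 records this is perfectly workable; the whole document is parsed once, and only the page bounds change between requests.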

Related

MarkLogic get data from collection

Is it possible to get data from a MarkLogic XML database using the MarkLogic API for Java?
I have read the documentation, but it only shows how to add an XML document to a collection or delete it; it doesn't show how to get all XML documents from one selected collection.
This looks like it will do the job
https://docs.marklogic.com/javadoc/client/index.html?com/marklogic/client/query/StructuredQueryBuilder.html
StructuredQueryBuilder.CollectionConstraintQuery collectionConstraint(String constraintName, String... uris)
Matches documents belonging to at least one of the criteria collections with the specified constraint.

How to index a XML file with Lucene [duplicate]

I am new to Lucene. I want to index large XML files (15 GB) that contain plain text as well as attributes and many XML tags.
How do I parse and index such a huge XML file using Lucene? Any sample or links would help me understand the process. Also, if I use Lucene, will I need a database? So far I have only done indexing against databases.
Your indexing would be built just as if you were using a database: iterate through all the data you want to index and write it to the index. Use a forward-only streaming parser (such as StAX's XMLStreamReader) to parse your XML. Just as with a database, you will need to index some kind of primary key so you know what a search result represents.
A database helps when it comes to looking up the indexed data from the primary key. It would be messy to read the data for a primary key if you had to iterate a 15 GiB XML file on every request.
A database is not required, but it helps a lot. I would build this as an import tool that reads your XML, dumps it into your database, and then reuses the "normal" database indexing code you've built before.
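The forward-only parsing step can be sketched with the JDK's StAX API, which never holds the whole file in memory. This is a hedged example: the element names (`doc`, `id`) are invented for illustration, and the actual Lucene IndexWriter calls are only indicated in comments since they depend on the Lucene version in use:

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StreamingXmlIndexer {
    public static void main(String[] args) throws Exception {
        // In practice this would be a Reader over the 15 GB file,
        // e.g. new InputStreamReader(new FileInputStream(path), UTF_8).
        String xml = "<docs>"
                + "<doc id=\"1\">first text</doc>"
                + "<doc id=\"2\">second text</doc>"
                + "</docs>";

        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));

        while (reader.hasNext()) {
            if (reader.next() == XMLStreamConstants.START_ELEMENT
                    && "doc".equals(reader.getLocalName())) {
                String id = reader.getAttributeValue(null, "id");
                String text = reader.getElementText();
                // At this point you would build and add a Lucene Document, e.g.
                // doc.add(new StringField("id", id, Field.Store.YES));
                // doc.add(new TextField("content", text, Field.Store.NO));
                // writer.addDocument(doc);
                System.out.println(id + ": " + text);
            }
        }
        reader.close();
    }
}
```

Because StAX only keeps the current event in memory, the 15 GB file is processed record by record, and each record is handed to the index writer as it is read.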
You might like to look at Michael Sokolov's Lux product, which combines Lucene and Saxon:
http://www.mail-archive.com/solr-user#lucene.apache.org/msg84102.html
I haven't used it myself and can't claim to fully understand its capabilities.

How to store java objects on Solr

I want to store java objects as part of the Solr document.
They don't need to be parsed or searched, only be returned as part of the document.
I can convert them to JSON or XML and store the text, but I would prefer something more efficient.
If I could use Java serialization and then add the binary blob to the document, that would be ideal.
I'm aware of the option to encode the binary blob with Base64, but I was wondering if there is a more efficient way.
I do not share the opinions of the first two answers.
An additional database call can in some scenarios be completely unnecessary; Solr can act as a NoSQL database, too.
It can even compress some fields, which costs CPU but saves cache memory for some kinds of binary data.
Take a look at BinaryField and the lazy loading field declarations within your schema.xml.
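The serialize-and-encode path mentioned in the question can be done entirely with the JDK. This is a minimal sketch of the round trip; the `Payload` class is a hypothetical stand-in for your object, and the Solr client calls that would store and retrieve the field are omitted (a BinaryField could take the raw bytes directly, skipping the Base64 step):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Base64;

public class BlobFieldDemo {
    // A stand-in for the object you want to store alongside the document.
    static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        final String data;
        Payload(String data) { this.data = data; }
    }

    public static void main(String[] args) throws Exception {
        // Serialize the object to a byte array.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Payload("hello"));
        }

        // Base64-encode for storage in a text field.
        String encoded = Base64.getEncoder().encodeToString(bytes.toByteArray());

        // Round-trip: decode and deserialize when the document comes back.
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            Payload restored = (Payload) in.readObject();
            System.out.println(restored.data);
        }
    }
}
```

Note that Base64 inflates the payload by roughly a third, which is why storing raw bytes in a BinaryField is the more efficient option when the field type allows it.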
Since you can construct an id in Solr to pass with any document, you can store this object some other way (in a database, for example) and query it with the id you get back from Solr.
For example, we store web pages in Solr. When we index one, we create an id that matches the id of a WebPage object created by the ORM in the database.
When a search is performed, we get the id back and load the Java object from the database.
There is no need to store it in Solr (which was made to store and index documents).

Which Java structure should I use to store XML records in?

Ok, so I am still relatively new to Java and I'm making pretty good progress. My task is this:
Build a SOAP request to initiate communication with a web services server (done)
Retrieve the resulting XML which contains a unique session ID which must be used in step 3 (done)
Create another SOAP request using the unique session ID that returns another set of XML containing 100 rows of records (done)
Extract specific data from these results (in progress)
My question is, what is the best way to store this data in Java so that I can sort through it easily? The data portion of my XML looks like this:
<RawData>
<item value="1" anothervalue="2" yetanothervalue="3"/>
<item value="4" anothervalue="5" yetanothervalue="6"/>
<item value="7" anothervalue="8" yetanothervalue="9"/>
</RawData>
I have no problem using XPath and SimpleXPathEngine to retrieve specific values from the XML. But I would really love to be able to store it in some sort of ResultSet type structure so that I could easily retrieve and manipulate it. I've used ResultSet with SQL queries so I'm familiar and comfortable with it, however, I'm not sure how to use it outside of an actual DB connection and query. What would be the best way to handle this?
Thanks in advance!
If the document is not that big, you can use a DOM parser to get all the data in memory. That's either org.w3c.dom, dom4j, or JDOM.
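Applied to the <RawData> sample above, the DOM approach might look like the following sketch: each <item> is mapped to a small value class, and the resulting List can then be sorted and filtered much like a ResultSet. The `Item` class name is invented for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class RawDataParser {
    // One row of the result, mirroring the attributes of <item>.
    static class Item {
        final int value, anothervalue, yetanothervalue;
        Item(int v, int a, int y) { value = v; anothervalue = a; yetanothervalue = y; }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<RawData>"
                + "<item value=\"7\" anothervalue=\"8\" yetanothervalue=\"9\"/>"
                + "<item value=\"1\" anothervalue=\"2\" yetanothervalue=\"3\"/>"
                + "<item value=\"4\" anothervalue=\"5\" yetanothervalue=\"6\"/>"
                + "</RawData>";

        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Collect each <item> into a plain Java object.
        List<Item> items = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagName("item");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element e = (Element) nodes.item(i);
            items.add(new Item(
                    Integer.parseInt(e.getAttribute("value")),
                    Integer.parseInt(e.getAttribute("anothervalue")),
                    Integer.parseInt(e.getAttribute("yetanothervalue"))));
        }

        // Once in a List, the rows can be sorted or filtered like a ResultSet.
        items.sort(Comparator.comparingInt((Item it) -> it.value));
        for (Item it : items) {
            System.out.println(it.value + "," + it.anothervalue + "," + it.yetanothervalue);
        }
    }
}
```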
Well, since your data seems simple and regular, I'd personally use JAXB; you can either create the classes by hand or use a tool to generate them from the XML schema.
That way you can easily work with a List of your classes and use your usual Java to read or manipulate the data. You can also ignore fields you don't need in your Java representation, so your classes only cover the parts of the XML file you're really interested in.
You can store the data in a memory-only database.
hsqldb supports this functionality.

Display 1000s of records in a JSP page

We are getting 1000s of records from services and need to display all of them in a JSP page. We set the data on objects and store them in Java collections. How can we get that collection in JavaScript using Ajax and display 10 records at a time, loading another 10 on each scroll until all records have been shown?
Please suggest a compatible technology.
At this time we are using Struts 2 and jQuery.
It sounds like you want something along the lines of SlickGrid. It is very fast, and is the data grid that powers SEDE result tables.
Another option, which I have used before with great results, is a YUI DataTable with pagination (server-side or client-side). With client-side pagination — which is typically faster, since all the data is already in the browser — I've created YUI data tables that work with more data than the browser can parse at once, with minimal performance degradation.
You can try implementing a simple pagination technique:
int totalRecords;
int maxRecordsPerPage;
int totalPages = (totalRecords + maxRecordsPerPage - 1) / maxRecordsPerPage;
int displayRecordFrom;
int displayRecordTo;
Total records: the number of records fetched.
Max records per page: how many records to show on one page.
Total pages: this is optional; you can either display a page list the way Google does, or just put a next button or link. Note the rounding-up division above, so a partial last page is still counted.
Display record from and to: as you are storing records in a collection, they can be fetched using get(index).
After fetching results, use Jettison or another Java JSON library to output the results as JSON. Instead of working from scratch, it's better to use pre-tested third-party JavaScript components based on jQuery or another library.
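The page-bound arithmetic described above can be sketched as follows, assuming the records are already in a Java List (the 25-record sample and page size of 10 are made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

public class PaginationDemo {
    public static void main(String[] args) {
        // Stand-in for the collection populated from the service call.
        List<Integer> records = new ArrayList<>();
        for (int i = 1; i <= 25; i++) records.add(i);

        int totalRecords = records.size();
        int maxRecordsPerPage = 10;
        // Ceiling division so a partial last page is counted.
        int totalPages = (totalRecords + maxRecordsPerPage - 1) / maxRecordsPerPage;

        int page = 2; // zero-based: the third page
        int displayRecordFrom = page * maxRecordsPerPage;
        int displayRecordTo = Math.min(displayRecordFrom + maxRecordsPerPage, totalRecords);

        System.out.println(totalPages);
        System.out.println(records.subList(displayRecordFrom, displayRecordTo));
    }
}
```

The sublist for the requested page is what would then be serialized to JSON and returned to the Ajax call.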
