How to use a SELECT query on XML using Java - java

I have an XML file that contains data from several tables.
These tables are linked to each other.
I want to access records from the XML.
Can I write a SQL SELECT query against the XML file?

No, you cannot use SQL for XML files. Either move the data to a relational store or use a hierarchical query language.
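As a rough illustration of the "hierarchical query language" route, here is a minimal sketch using XPath from the JDK's javax.xml.xpath package. The sample document, its element and attribute names (db, customers, orders, customerId) and the class name are all made up for this example, since the question does not show the real XML. The expression plays the role of a join between two linked "tables".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSelectSketch {
    // Hypothetical data: two "tables" (customers, orders) linked by customerId.
    static final String SAMPLE_XML =
          "<db>"
        + "<customers>"
        + "<customer id=\"1\"><name>Ann</name></customer>"
        + "<customer id=\"2\"><name>Bob</name></customer>"
        + "</customers>"
        + "<orders>"
        + "<order id=\"10\" customerId=\"1\"/>"
        + "<order id=\"11\" customerId=\"2\"/>"
        + "<order id=\"12\" customerId=\"1\"/>"
        + "</orders>"
        + "</db>";

    // Roughly equivalent to:
    //   SELECT o.id FROM orders o JOIN customers c ON o.customerId = c.id
    //   WHERE c.name = :name
    static List<String> orderIdsForCustomer(String xml, String customerName) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind $name as a variable instead of concatenating it into the expression.
        xpath.setXPathVariableResolver(
            qname -> "name".equals(qname.getLocalPart()) ? customerName : null);
        String expr =
            "/db/orders/order[@customerId = /db/customers/customer[name = $name]/@id]/@id";
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> ids = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                ids.add(nodes.item(i).getNodeValue()); // value of each matched @id
            }
            return ids;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("XPath evaluation failed", e);
        }
    }
}
```

Calling XPathSelectSketch.orderIdsForCustomer(XPathSelectSketch.SAMPLE_XML, "Ann") returns the ids of Ann's orders. For large files or complex joins, loading the data into a relational store is likely the more practical option.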
All you can expect after such a vague question is a bunch of random keywords (like XQuery, eXist, XPath, Oracle XML Db, MarkLogic, Jaxen). Probably none of them is relevant to whatever problem you might have at hand.

Related

Export table from DB to an XML file with Java

I need to read every row of a particular table stored in a MySQL DB, then write an XML file from each row.
What is the best practice for achieving this? I should use some Java libraries, but I don't know how to choose the right ones.
Look at this:
http://examples.javacodegeeks.com/core-java/xml/bind/jaxb-marshal-example/
If you read from the DB with an ORM such as Hibernate, you get your rows as objects, which you can then transform to XML using JAXB.
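JAXB has not shipped with the JDK since Java 11 and needs a separate dependency, so the sketch below swaps in a different technique: the JDK's StAX XMLStreamWriter. It assumes each row has already been read (for example via JDBC) into a column-name-to-value map and that the column names are valid XML element names. The class and method names are hypothetical.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class RowXmlWriterSketch {
    // Turns one table row into <rowElement><col>value</col>...</rowElement>.
    // Assumption: keys of 'columns' are valid XML element names.
    static String rowToXml(String rowElement, Map<String, String> columns) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement(rowElement);
            for (Map.Entry<String, String> e : columns.entrySet()) {
                w.writeStartElement(e.getKey());
                // writeCharacters escapes &, < and > in the value.
                w.writeCharacters(e.getValue() == null ? "" : e.getValue());
                w.writeEndElement();
            }
            w.writeEndElement();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write row as XML", e);
        }
        return out.toString();
    }
}
```

Passing a LinkedHashMap keeps the columns in the order they were read. Each returned string could then be written to its own file.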

How to index an XML file with Lucene [duplicate]

I am new to Lucene. I want to index large XML files (15 GB) that contain plain text as well as attributes and many XML tags. How do I parse and index such a file with Lucene? Is there a sample? And if we use Lucene, do we need a database?
How do I parse and index a huge XML file using Lucene? Any sample or links would help me understand the process. Also, if I use Lucene, will I need a database? So far I have only seen and done indexing with databases.
Your index would be built just as it would be with a database: iterate through all the data you want to index and write it to the index. Use the XmlReader class to parse your XML in a forward-only fashion. Just as with a database, you will need to index some kind of primary key so you know what each search result represents.
A database helps when it comes to looking up the indexed data from the primary key. Reading the data for a primary key will be messy if you need to iterate over a 15 GiB XML file on every request.
A database is not required, but it helps a lot. I would build this as an import tool that reads your XML, dumps it into your database, and then reuses the "normal" database indexing code you've built before.
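XmlReader is a .NET class; in Java the closest JDK counterpart for forward-only parsing is StAX (XMLStreamReader). The following is a minimal sketch of that approach, assuming a hypothetical layout in which each indexable unit is a record element with an id attribute serving as the primary key. The Lucene call itself is left to the caller through a callback, so no Lucene API is shown here.

```java
import java.io.Reader;
import java.util.function.BiConsumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StreamingXmlSketch {
    // Streams through the XML once and hands (id, text) of every <record>
    // to 'sink'. Only the current record is held in memory.
    // Assumption: <record id="..."> elements are not nested in each other.
    static int forEachRecord(Reader xml, BiConsumer<String, String> sink) {
        int count = 0;
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            XMLStreamReader r = factory.createXMLStreamReader(xml);
            String id = null;
            StringBuilder text = null; // non-null while inside a <record>
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "record".equals(r.getLocalName())) {
                    id = r.getAttributeValue(null, "id");
                    text = new StringBuilder();
                } else if ((event == XMLStreamConstants.CHARACTERS
                        || event == XMLStreamConstants.CDATA) && text != null) {
                    text.append(r.getText()); // includes text of child elements
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "record".equals(r.getLocalName()) && text != null) {
                    sink.accept(id, text.toString().trim()); // e.g. add to the index here
                    text = null;
                    count++;
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return count;
    }
}
```

For a 15 GB file the Reader would wrap a buffered file stream, and the callback would build one index document per record, storing the id so that a search hit can be traced back to its source.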
You might like to look at Michael Sokolov's Lux product, which combines Lucene and Saxon:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg84102.html
I haven't used it myself and can't claim to fully understand its capabilities.

Is there a clean way to read embedded SQL resource files?

To avoid creating SQL statements as strings in a class I've placed them as .sql files in the same package and read the contents to a string in the static constructor. The reason for this is the SQL is very complex due to an ERP system that the SQL is querying.
This method works, but the reading mechanism is very simple: it drops lines that are entirely comments (i.e. lines beginning with --), strips excess whitespace, and removes newlines, so the whole file ends up on one line. A comment at the end of a line can therefore break the statement, because once the newlines are gone it comments out everything that follows.
I could enhance the reader to strip such comments as well, but I have to wonder whether something already exists that can read an SQL file and clean it up.
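One way the reader could be enhanced is sketched below: it cuts off everything from -- to the end of each line, unless the -- sits inside a single-quoted string literal, and then joins the remaining lines with single spaces. This is only a sketch under stated assumptions. It does not handle /* ... */ block comments or string literals that span several lines, and the class and method names are hypothetical.

```java
class SqlCommentStripperSketch {
    // Removes "--" line comments (full-line and trailing) and joins the
    // remaining lines with single spaces.
    // Limitations: no /* */ handling, no multi-line string literals.
    static String stripLineComments(String sql) {
        StringBuilder out = new StringBuilder();
        for (String line : sql.split("\\R")) {
            boolean inString = false;
            int cut = line.length();
            for (int i = 0; i < line.length(); i++) {
                char c = line.charAt(i);
                if (c == '\'') {
                    // An escaped quote ('') toggles twice, which nets out.
                    inString = !inString;
                } else if (!inString && c == '-'
                        && i + 1 < line.length() && line.charAt(i + 1) == '-') {
                    cut = i; // comment starts here
                    break;
                }
            }
            String kept = line.substring(0, cut).trim();
            if (!kept.isEmpty()) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(kept);
            }
        }
        return out.toString();
    }
}
```

Because comments are removed before the lines are joined, a trailing comment can no longer swallow the rest of the statement.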
I've seen this same problem solved in a project I've worked on by storing queries in XML, and loading the XML into a custom StoredQueriesCache object at runtime. To get a query, we would call a method on the StoredQueriesCache object and just pass the query name (which is defined in the XML), and it would return the query.
Writing something like this is fairly simple. The XML would look something like this below...
<Query>
  <Name>SomeUniqueQueryName</Name>
  <SQL>
    SELECT someColumn FROM someTable WHERE somePredicate
  </SQL>
</Query>
You would have one element for every stored query. The XML would be loaded into memory from file at application startup, or, depending on your needs, it could be lazy-loaded from file. Then your StoredQueriesCache object that holds the XML would have methods to return individual queries by name. In my experience, comments in the query have never caused any issue, since line breaks are part of the XML node's inner text; but if you want, the StoredQueriesCache methods that retrieve the queries could strip the comments out.
I've found this to be the most organized way of storing queries without embedding them in code, and without using stored procedures. There should honestly be a library that does this for you; maybe I'll write one!
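A minimal sketch of what such a StoredQueriesCache could look like in Java, using the JDK's DOM parser. It is not the implementation from the project mentioned above. The wrapping root element (called Queries here), the eager loading, and the method names are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class StoredQueriesCache {
    private final Map<String, String> queries = new HashMap<>();

    // Eagerly parses all <Query> elements; assumes each one has exactly
    // one <Name> and one <SQL> child, as in the XML shown above.
    static StoredQueriesCache fromString(String xml) {
        StoredQueriesCache cache = new StoredQueriesCache();
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("Query");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element q = (Element) nodes.item(i);
                String name = q.getElementsByTagName("Name").item(0).getTextContent().trim();
                String sql = q.getElementsByTagName("SQL").item(0).getTextContent().trim();
                cache.queries.put(name, sql);
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not load stored queries", e);
        }
        return cache;
    }

    // Returns the SQL text registered under the given name.
    String get(String name) {
        String sql = queries.get(name);
        if (sql == null) {
            throw new IllegalArgumentException("Unknown query: " + name);
        }
        return sql;
    }
}
```

Line breaks inside the SQL element are preserved in the returned text, so -- comments inside a query keep working. SQL containing < or & would need to be escaped or wrapped in a CDATA section.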

JAVA : file exists Vs searching large xml db

I'm quite new to Java programming and am writing my first desktop app. The app takes a unique ISBN and first checks whether it is already held in the local DB. If it is, the app just reads from the local DB; if not, it requests the data from isbndb.com and enters it into the local DB, which is in XML format. What I'm wondering is which of the following two methods would create the least overhead when checking whether an entry already exists.
Method 1.) File Exists.
On creating a DB entry, the app would create a separate file for every ISBN, named after the ISBN (e.g. 3846504937540.xml), and when checking would use the file-exists method to see whether an entry already exists for the user-provided ISBN.
Method 2.) SAX XML Parser.
All entries would be stored in a single large XML file. When checking for existing entries, the SAX XML parser would parse the file, and the user-provided ISBN would be checked against those in the XML DB for a match.
Note :
The resulting entries could number in the thousands over time.
Any information would be greatly appreciated.
I don't think either of your methods is all that great. I strongly suggest using a DBMS to store the data. If you don't have a DBMS on the system, or if you want an app that can run on systems without an installed DBMS, take a look at using SQLite. You can use it from Java with SQLiteJDBC by David Crawshaw.
As far as your two methods are concerned, the first will generate a huge amount of file clutter, not to mention maintenance and consistency headaches. The second method will be slow once you have a sizable number of entries, because you basically have to read (on average) half the database for every query. With a DBMS, you can avoid this by defining indexes for the info you need to look up quickly. The DBMS will automatically maintain the indexes.
I don't much like the idea of relying on the file system for this task. I don't know how critical your application is, but many things can happen to these XML files :) Plus, if the folder gets very big, you would need to think about splitting the files into some hierarchical folder structure to get decent performance.
On the other hand, I don't see why you would use an XML file as a database if you need to update it frequently.
I would use a relational database, and add a new record in a table for each entry, with an index on the isbn_number column.
If you are dealing with thousands of records, you may very well go with SQLite, and you can replace it with a more powerful non-embedded DB if you ever need to, with no (or little :) ) code modification.
I think you'd be better off using a DBMS instead of either of your two methods.
If you want the least overhead just for checking existence, then option 1 is probably what you want, since it's a direct lookup. Parsing XML for each check requires you to pass through the whole XML file in the worst case. You could add caching to option 2, but that gets more complicated than option 1.
With option 1, though, beware that file systems limit how many files a directory can hold, and very large directories get slow, so you will probably have to store the XML files in multiple layers (for example /xmldb/38/46/3846504937540.xml).
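The layered layout above can be sketched in a few lines with java.nio.file. The two-level split on the first four digits follows the example path. The class name, the method names and the ISBN format check are assumptions added for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;

class IsbnFileStoreSketch {
    // Maps e.g. 3846504937540 to <root>/38/46/3846504937540.xml.
    static Path pathFor(Path root, String isbn) {
        // Assumed format: ISBN-10 (last character may be X) or ISBN-13, no hyphens.
        if (!isbn.matches("[0-9]{9}[0-9Xx]|[0-9]{13}")) {
            throw new IllegalArgumentException("Not an ISBN: " + isbn);
        }
        return root.resolve(isbn.substring(0, 2))
                   .resolve(isbn.substring(2, 4))
                   .resolve(isbn + ".xml");
    }

    // The "file exists" check from method 1, applied to the layered layout.
    static boolean exists(Path root, String isbn) {
        return Files.exists(pathFor(root, isbn));
    }
}
```

Since ISBN-13 prefixes are not evenly distributed (most start with 978 or 979), splitting on the last digits instead of the first would probably spread the files more evenly.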
That said, neither of your options is a good way to store data in the long run; you will find they become quite restrictive and hard to manage as the data grows.
People have already recommended using a DBMS, and I agree. On top of that, I would suggest looking into a document-based database such as MongoDB.
Extend your db table to not only include the XML string but also the ISBN number.
Then you select the XML column based on the ISBN column.
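Query (as a Java string): "select XMLString from cacheTable where isbn = ?", with the ISBN bound as a statement parameter rather than concatenated into the string, which would invite SQL injection.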
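A minimal JDBC sketch of that lookup, using a PreparedStatement so the ISBN is passed as a parameter. The table and column names come from the query above. The class name, the method name and the way the Connection is obtained are assumptions.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Optional;

class IsbnCacheDaoSketch {
    static final String SELECT_XML_BY_ISBN =
        "SELECT XMLString FROM cacheTable WHERE isbn = ?";

    // Returns the cached XML for the ISBN, or empty if there is no such row.
    // 'conn' is assumed to be an open connection to the cache database.
    static Optional<String> findXml(Connection conn, String isbn) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(SELECT_XML_BY_ISBN)) {
            ps.setString(1, isbn); // bound by the driver, never spliced into the SQL
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? Optional.ofNullable(rs.getString(1)) : Optional.empty();
            }
        }
    }
}
```

An empty result tells the app that the entry is not cached yet and has to be fetched from the remote service.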
A different approach could be to use an ORM like Hibernate.
With an ORM, instead of saving the whole XML document in one column, you use separate columns for each element and attribute, and you could even split your document over several tables for a simpler long-term design.

Binding Sql Server and XML

I have a SQL Server database (version 2008 running databases in 2000 mode) and I want to generate some XML files using data from this DB. I have an XML schema for this XML.
I've thought of 3 ways to do it.
1. SQL Server SELECT ... FOR XML EXPLICIT
The query would be a little messy, but I am familiar with writing SQL statements. The problem is that the generated XML will require additional operations, like changing some enum ints to strings. So it will look like:
SQL Server -(XML)-> Java application
XML creation -> post-processing -> schema validation -> XML file
The dependencies are stored in the SQL query in a form like Cases!Case!titles!title!name (pretty messy) and in additional operations in Java.
2. JAXB-generated classes and custom SQL queries
The XML is created in Java, in application logic. I would manually write queries to retrieve the data and put it in the right tag/attribute.
The dependencies are stored in Java code:
in select queries (select number,... from ...)
aCase.setNumber(rs.getInt("number"));
3. JAXB-generated classes with fewer queries
I noticed that in case 2 I have the same information in two places, so I want to store the field=column bindings once. Then I can generate the select queries and the copy loops using reflection.
Ways of storing the bindings:
- a HashMap<String,String> of field to column
- annotations on the fields generated from the XML schema using the Annotate plugin; then I read the annotations for each field in the class and generate the query.
Maybe there is another way I have not considered yet.
I want to do this in, let's say, a professional way, to practise something new during a quite simple task.
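To make the third option more concrete, here is a minimal sketch of the annotation variant: a field-level annotation records the column name, and reflection collects those names to build the SELECT list. The @Column annotation, the CaseRecord class and the table name are hypothetical stand-ins for the JAXB-generated classes and the real schema.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

class ColumnBindingSketch {
    // Hypothetical binding annotation: field = column.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        String value();
    }

    // Stand-in for a JAXB-generated class with added annotations.
    static class CaseRecord {
        @Column("number") int number;
        @Column("title_name") String title;
        String notMapped; // no annotation, so it is not selected
    }

    // Builds "SELECT col1, col2 FROM table" from the annotated fields.
    static String selectFor(Class<?> type, String table) {
        List<String> columns = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            Column c = f.getAnnotation(Column.class);
            if (c != null) {
                columns.add(c.value());
            }
        }
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("No @Column fields on " + type.getName());
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }
}
```

The same loop over annotated fields could drive the copy step, reading each column from the ResultSet and setting the matching field. Note that the order returned by getDeclaredFields is not guaranteed by the specification, so reading columns by name is safer than reading them by position.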
