Query eXist-db from Java

I want to query eXist-db from Java. I know there are samples, but where can I get the necessary packages to run the examples?
In the samples:
import javax.xml.transform.OutputKeys;
import org.exist.storage.serializers.EXistOutputKeys;
import org.exist.xmldb.EXistResource;
import org.xmldb.api.DatabaseManager;
import org.xmldb.api.base.Collection;
import org.xmldb.api.base.Database;
import org.xmldb.api.modules.XMLResource;
Where can I get these?
And what is the standard connection string for eXist-db (host, port number, etc.)?
And yes, I have tried to read the eXist-db documentation, but it is not really understandable for beginners; it is confusing.
All I want to do is write a Java class in Eclipse that can connect to an eXist-db instance and query an XML document.

Your question is badly written, and I think you are really not explaining what you are trying to do very well.
If you want the JAR files as direct dependencies for some project, then you can download eXist and get them from there. As already covered several times here, the JAR files you need as dependencies are documented on the eXist website, and links to that documentation have already been posted in this thread.
I wanted to add that if you want a series of simple Java examples that use Maven to resolve the dependencies (which takes away the hard work), we provided just that in the Integration chapter when we wrote the eXist book. It shows you how to use each of eXist's different APIs from Java for storing, querying, updating, etc. You can find the code from that book chapter here: https://github.com/eXist-book/book-code/tree/master/chapters/integration. Included are the Maven project files to resolve all the dependencies and to build and run the examples.
If the code is not enough for you, you might also want to consider purchasing the book and reading the Integration chapter carefully; that should answer all of your questions.

I ended up with a Maven project and imported some missing JARs (like ws.commons, etc.) by manually installing them into my local Maven repository (see the sketch below).
I copied the missing JARs from the eXist-db installation path on my local system.
Then I got it to work.
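For reference, manually installing a JAR into the local Maven repository is done with install:install-file. The file name and coordinates below are only placeholders; substitute whatever matches the JAR you copied from the eXist installation:

mvn install:install-file \
    -Dfile=ws-commons-util-1.0.2.jar \
    -DgroupId=org.apache.ws.commons.util \
    -DartifactId=ws-commons-util \
    -Dversion=1.0.2 \
    -Dpackaging=jar

After that the JAR can be referenced from the pom.xml like any other dependency, using whatever groupId/artifactId/version you chose.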

from: http://exist-db.org/exist/apps/doc/devguide_xmldb.xml
There are several XML:DB examples provided in eXist's samples directory. To start an example, use the start.jar file and pass the name of the example class as the first parameter, for instance:
java -jar start.jar org.exist.examples.xmldb.Retrieve [other options]
Example: Retrieving a Document with XML:DB
import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;
import org.xmldb.api.*;
import javax.xml.transform.OutputKeys;
import org.exist.xmldb.EXistResource;

public class RetrieveExample {

    private static String URI = "xmldb:exist://localhost:8080/exist/xmlrpc";

    /**
     * args[0] Should be the name of the collection to access
     * args[1] Should be the name of the resource to read from the collection
     */
    public static void main(String args[]) throws Exception {
        final String driver = "org.exist.xmldb.DatabaseImpl";

        // initialize database driver
        Class cl = Class.forName(driver);
        Database database = (Database) cl.newInstance();
        database.setProperty("create-database", "true");
        DatabaseManager.registerDatabase(database);

        Collection col = null;
        XMLResource res = null;
        try {
            // get the collection
            col = DatabaseManager.getCollection(URI + args[0]);
            col.setProperty(OutputKeys.INDENT, "no");
            res = (XMLResource) col.getResource(args[1]);
            if (res == null) {
                System.out.println("document not found!");
            } else {
                System.out.println(res.getContent());
            }
        } finally {
            // don't forget to clean up!
            if (res != null) {
                try { ((EXistResource) res).freeResources(); } catch (XMLDBException xe) { xe.printStackTrace(); }
            }
            if (col != null) {
                try { col.close(); } catch (XMLDBException xe) { xe.printStackTrace(); }
            }
        }
    }
}
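The example above retrieves a whole document. To run an XPath/XQuery query against a collection instead, the XML:DB API provides a query service. Here is a minimal sketch under the same assumptions as the example above (the collection name and the query expression are placeholders):

import org.xmldb.api.DatabaseManager;
import org.xmldb.api.base.Collection;
import org.xmldb.api.base.Database;
import org.xmldb.api.base.Resource;
import org.xmldb.api.base.ResourceIterator;
import org.xmldb.api.base.ResourceSet;
import org.xmldb.api.modules.XPathQueryService;

public class QueryExample {
    public static void main(String args[]) throws Exception {
        // register the eXist driver, exactly as in RetrieveExample
        Database database = (Database) Class.forName("org.exist.xmldb.DatabaseImpl").newInstance();
        DatabaseManager.registerDatabase(database);

        Collection col = DatabaseManager.getCollection(
                "xmldb:exist://localhost:8080/exist/xmlrpc/db/mycollection");
        try {
            XPathQueryService service =
                    (XPathQueryService) col.getService("XPathQueryService", "1.0");
            ResourceSet result = service.query("//someElement[@id = 'x']");
            ResourceIterator it = result.getIterator();
            while (it.hasMoreResources()) {
                Resource r = it.nextResource();
                System.out.println(r.getContent());
            }
        } finally {
            col.close();
        }
    }
}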

A list of dependencies is included on the page http://exist-db.org/exist/apps/doc/deployment.xml#D2.2.6; unfortunately there is no link to this page from http://exist-db.org/exist/apps/doc/devguide_xmldb.xml (it should be added).
The latest xmldb.jar documentation can be found at http://xmldb.exist-db.org/
All the JAR files can be obtained by installing eXist-db from the installer JAR; the files are all in EXIST_HOME/lib/core.

If you work with a Maven project, try adding this to your pom.xml:
<dependency>
    <groupId>xmldb</groupId>
    <artifactId>xmldb-api</artifactId>
    <version>20021118</version>
</dependency>
Be aware that the release date is 2002.
Otherwise, you can query eXist-db via XML-RPC.

Related

Hadoop Hive UDF with external library

I'm trying to write a UDF for Hadoop Hive that parses user agents. The following code works fine on my local machine, but on Hadoop I get:
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public java.lang.String MyUDF .evaluate(java.lang.String) throws org.apache.hadoop.hive.ql.metadata.HiveException on object MyUDF#64ca8bfb of class MyUDF with arguments {All Occupations:java.lang.String} of size 1',
Code:
import java.io.IOException;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.*;
import com.decibel.uasparser.OnlineUpdater;
import com.decibel.uasparser.UASparser;
import com.decibel.uasparser.UserAgentInfo;

public class MyUDF extends UDF {

    public String evaluate(String i) {
        UASparser parser = null;
        parser = new UASparser();

        String key = "";
        OnlineUpdater update = new OnlineUpdater(parser, key);

        UserAgentInfo info = null;
        info = parser.parse(i);
        return info.getDeviceType();
    }
}
Some facts that come to mind that I should mention:
I'm compiling with Eclipse using "Export runnable JAR file" with the "extract required libraries into generated JAR" option.
I'm uploading this "fat jar" file with Hue.
The minimal working example I managed to run:
public String evaluate(String i) {
    return "hello" + i.toString();
}
I guess the problem lies somewhere around that library (downloaded from https://udger.com) I'm using, but I have no idea where.
Any suggestions?
Thanks, Michal
It could be a few things. The best thing is to check the logs, but here's a list of a few quick things you can check in a minute.
The JAR does not contain all dependencies. I am not sure how Eclipse builds a runnable JAR, but it may not include all dependencies. You can run
jar tf your-udf-jar.jar
to see what was included. You should see stuff from com.decibel.uasparser. If not, you have to build the JAR with the appropriate dependencies (usually you do that using Maven; see the pom sketch at the end of this answer).
A different JVM version. If you compile with JDK 8 and the cluster runs JDK 7, it will also fail.
The Hive version. Sometimes the Hive APIs change slightly, enough to be incompatible. Probably not the case here, but make sure to compile the UDF against the same version of Hadoop and Hive that you have in the cluster.
You should always check whether info is null after the call to parse().
It looks like the library uses a key, meaning that it actually gets data from an online service (udger.com), so it may not work without an actual key. Even more importantly, the library updates online, contacting the online service for each record. This means, looking at the code, that it will create one update thread per record. You should change the code to do that only once, in the constructor.
Here's how to change it:
public class MyUDF extends UDF {

    UASparser parser = new UASparser();

    public MyUDF() {
        super();
        String key = "PUT YOUR KEY HERE";
        // update only once, when the UDF is instantiated
        OnlineUpdater update = new OnlineUpdater(parser, key);
    }

    public String evaluate(String i) {
        UserAgentInfo info = parser.parse(i);
        // return null if the user agent is unparseable;
        // otherwise one bad record will stop your processing with an exception
        if (info != null) {
            return info.getDeviceType();
        } else {
            return null;
        }
    }
}
But to know for sure, you have to look at the logs: the YARN logs, but also the Hive logs on the machine you're submitting the job from (probably in /var/log/hive, but it depends on your installation).
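Regarding the first point, a common way to build a self-contained ("fat") UDF JAR with Maven is the shade plugin. This is only a minimal sketch: the plugin version is an example, and the Hive/Hadoop and uasparser dependencies would still be declared in the same pom (Hive and Hadoop typically with provided scope):

<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <version>3.2.4</version>
            <executions>
                <execution>
                    <!-- bundle the runtime dependencies into the UDF jar at package time -->
                    <phase>package</phase>
                    <goals>
                        <goal>shade</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>

Running jar tf on the resulting artifact should then show the com.decibel.uasparser classes.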
Such a problem can probably be solved with these steps:
1. Override the method UDF.getRequiredJars(), making it return a list of HDFS file paths whose values depend on where you put the xxx_lib folder (from step 2) in HDFS. Note that the list must contain each JAR's full HDFS path string, such as hdfs://yourcluster/some_path/xxx_lib/some.jar.
2. Export your UDF code using the "Runnable JAR file" export wizard (choose "copy required libraries into a sub-folder next to the generated JAR"). This step produces an xxx.jar and a library folder xxx_lib next to xxx.jar.
3. Put xxx.jar and the xxx_lib folder into your HDFS filesystem, matching the paths you returned in step 1.
4. Create the UDF using: add jar ${the-xxx.jar-hdfs-path}; create function your-function as '${qualified name of udf class}';
Try it. I tested this and it works.

How to use ROME in Intellij?

How can I set up my project in IntelliJ to use the ROME library to read an RSS feed?
So far, I've developed the following:
import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;
import java.net.URL;

public class ReadRSS {

    public static void main(String[] args) {
        String urlString = "http://news.ycombinator.com/";
        boolean ok = false;
        if (args.length == 1) {
            try {
                URL feedUrl = new URL(urlString);
                SyndFeedInput input = new SyndFeedInput();
                SyndFeed feed = input.build(new XmlReader(feedUrl));
                System.out.println(feed);
                ok = true;
            } catch (Exception ex) {
                ex.printStackTrace();
                System.out.println("ERROR: " + ex.getMessage());
            }
        }
        if (!ok) {
            System.out.println();
            System.out.println("FeedReader reads and prints any RSS/Atom feed type.");
            System.out.println("The first parameter must be the URL of the feed to read.");
            System.out.println();
        }
    }
}
But I get multiple errors when running my code, mainly of this variety:
.. java:package com.sun.syndication.feed.synd does not exist..
How do I import the package in IntelliJ? I managed to import it by adding the JAR in my project structure.
But the next problem is: I can't access org.jdom.Document, even though I have installed JDOM in my project structure. The error I get is
Error:(16, 38) java: cannot access org.jdom.Document
class file for org.jdom.Document not found
How can I resolve this?
If you're using Maven or Gradle, add the dependency to your configuration file (e.g. pom.xml in Maven) and do a build/install to download your dependencies. It should work fine after that. Dependency info is here: http://mvnrepository.com/artifact/rome/rome/0.9
Otherwise add the jar (downloadable from the link above) manually to your project. Look at the first answer in this question to see how to do this: Correct way to add external jars (lib/*.jar) to an IntelliJ IDEA project
I'm a developer on the ROME team. The latest version is ROME 1.5. It can be obtained from the central Maven repository: http://search.maven.org/#artifactdetails%7Ccom.rometools%7Crome%7C1.5.1%7Cjar
The groupId has changed to com.rometools as of v1.5.0.
I highly recommend using Maven, Gradle, or another build tool that can resolve transitive dependencies, so you won't have to collect all the dependencies manually.
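For example, the Maven dependency matching the Maven Central page linked above would look like this (double-check the latest version number):

<dependency>
    <groupId>com.rometools</groupId>
    <artifactId>rome</artifactId>
    <version>1.5.1</version>
</dependency>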

Java: CaptureDeviceManager#getDeviceList() is empty?

I am trying to print out all of the capture devices that are supported using the #getDeviceList() method in the CaptureDeviceManager class and the returned Vector has a size of 0.
Why is that? I have a webcam that works - so there should be at least one. I am running Mac OS X Lion - using JMF 2.1.1e.
Thanks!
CaptureDeviceManager.getDeviceList(Format format) does not detect devices. Instead it reads from the JMF registry which is the jmf.properties file. It searches for the jmf.properties file in the classpath.
If your JMF install has succeeded, then the classpath would have been configured to include all the relevant JMF jars and directories. The JMF install comes with a jmf.properties file included in the 'lib' folder under the JMF installation directory. This means the jmf.properties would be located by JMStudio and you would usually see the JMStudio application executing correctly. (If your JMF install is under 'C:\Program Files', then run as administrator to get around UAC)
When you create your own application to detect the devices, the problem you described above might occur. I have seen a few questions related to the same problem. This is because your application's classpath might be different and might not include the environment classpath. Check out your IDE's properties here. The problem is that CaptureDeviceManager cannot find the jmf.properties file because it is not there.
As you have found out correctly, you can copy the jmf.properties file from the JMF installation folder. It would contain the correct device list since JMF detects it during the install (Check it out just to make sure anyway).
If you want do device detection yourself, then create an empty jmf.properties file and put it somewhere in your classpath (it might throw a java.io.EOFException initially during execution but that's properly handled by the JMF classes). Then use the following code for detecting webcams...
import javax.media.*;
import java.util.*;

// the wrapper class is not in the original snippet and is only added here so it compiles;
// name it whatever you like, and keep it in the same package as VFWAuto (see below)
public class DeviceLister {

    public static void main(String[] args) {
        VFWAuto vfwObj = new VFWAuto();
        Vector devices = CaptureDeviceManager.getDeviceList(null);
        Enumeration deviceEnum = devices.elements();
        System.out.println("Device count : " + devices.size());
        while (deviceEnum.hasMoreElements()) {
            CaptureDeviceInfo cdi = (CaptureDeviceInfo) deviceEnum.nextElement();
            System.out.println("Device : " + cdi.getName());
        }
    }
}
The code for the VFWAuto class is given below. It is part of the JMStudio source code. You can get a good idea of how the devices are detected and recorded in the registry. Put both classes in the same package when you test. Disregard the main method in the VFWAuto class.
import com.sun.media.protocol.vfw.VFWCapture;
import java.util.*;
import javax.media.*;

public class VFWAuto {

    public VFWAuto() {
        Vector devices = (Vector) CaptureDeviceManager.getDeviceList(null).clone();
        // note: the original JMStudio source names this variable "enum",
        // which is a reserved word in Java 5+, so it is renamed here
        Enumeration deviceEnum = devices.elements();
        while (deviceEnum.hasMoreElements()) {
            CaptureDeviceInfo cdi = (CaptureDeviceInfo) deviceEnum.nextElement();
            String name = cdi.getName();
            if (name.startsWith("vfw:"))
                CaptureDeviceManager.removeDevice(cdi);
        }

        int nDevices = 0;
        for (int i = 0; i < 10; i++) {
            String name = VFWCapture.capGetDriverDescriptionName(i);
            if (name != null && name.length() > 1) {
                System.err.println("Found device " + name);
                System.err.println("Querying device. Please wait...");
                com.sun.media.protocol.vfw.VFWSourceStream.autoDetect(i);
                nDevices++;
            }
        }
    }

    public static void main(String[] args) {
        VFWAuto a = new VFWAuto();
        System.exit(0);
    }
}
Assuming you are on a Windows platform and you have a working webcam, this code should detect the device and populate the jmf.properties file. On the next run you can also comment out the VFWAuto section and its object references, and you can see that CaptureDeviceManager reads from the jmf.properties file.
The VFWAuto class is part of jmf.jar. You can also see the DirectSoundAuto and JavaSoundAuto classes for detecting audio devices in the JMStudio sample source code. Try it out the same way as you did for VFWAuto.
My configuration was Windows 7 64 bit + JMF 2.1.1e windows performance pack + a web-cam.
I had the same issue, and I solved it by invoking flush() on my ObjectOutputStream object (ObjectInputStream has no flush() method).
According to the API documentation for ObjectInputStream's constructor:
The stream header containing the magic number and version number are read from the stream and verified. This method will block until the corresponding ObjectOutputStream has written and flushed the header.
This is a very important point to be aware of when trying to send objects in both directions over a socket because opening the streams in the wrong order will cause deadlock.
Consider for example what would happen if both client and server tried to construct an ObjectInputStream from a socket's input stream, prior to either constructing the corresponding ObjectOutputStream. The ObjectInputStream constructor on the client would block, waiting for the magic number and version number to arrive over the connection, while at the same time the ObjectInputStream constructor on the server side would also block for the same reason. Hence, deadlock.
Because of this, you should always make it a practice in your code to open the ObjectOutputStream and flush it first, before you open the ObjectInputStream. The ObjectOutputStream constructor will not block, and invoking flush() will force the magic number and version number to travel over the wire. If you follow this practice in both your client and server, you shouldn't have a problem with deadlock.
Credit goes to Tim Rohaly and his explanation here.
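A minimal sketch of the recommended ordering over a socket (the host and port are placeholders; the same pattern applies on both ends of the connection):

import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.Socket;

public class StreamOrderExample {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("localhost", 9000)) {
            // open and flush the output stream first, so the stream header is on
            // the wire before the peer's ObjectInputStream constructor blocks on it
            ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
            out.flush();

            // only now open the input stream; its constructor reads the
            // header written by the other side
            ObjectInputStream in = new ObjectInputStream(socket.getInputStream());

            out.writeObject("hello");
            out.flush();
            System.out.println(in.readObject());
        }
    }
}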
Before calling CaptureDeviceManager.getDeviceList(), the available devices must first be loaded into memory.
You can do this manually by running JMFRegistry after installing JMF.
Or you can do it programmatically with the help of the extension library FMJ (Free Media in Java). Here is the code:
import java.lang.reflect.Field;
import java.util.Vector;
import javax.media.*;
import javax.media.format.RGBFormat;
import net.sf.fmj.media.cdp.GlobalCaptureDevicePlugger;

public class FMJSandbox {

    static {
        System.setProperty("java.library.path", "D:/fmj-sf/native/win32-x86/");
        try {
            final Field sysPathsField = ClassLoader.class.getDeclaredField("sys_paths");
            sysPathsField.setAccessible(true);
            sysPathsField.set(null, null);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String args[]) {
        GlobalCaptureDevicePlugger.addCaptureDevices();
        Vector deviceInfo = CaptureDeviceManager.getDeviceList(new RGBFormat());
        System.out.println(deviceInfo.size());
        for (Object obj : deviceInfo) {
            System.out.println(obj);
        }
    }
}
Here is the output:
USB2.0 Camera : civil:\\?\usb#vid_5986&pid_02d3&mi_00#7&584a19f&0&0000#{65e8773d-8f56-11d0-a3b9-00a0c9223196}\global
RGB, -1-bit, Masks=-1:-1:-1, PixelStride=-1, LineStride=-1

Ant script to Find a jar file given the class name?

How do I find the name of the JAR which contains a specific class through an Ant script?
We are using the following Ant code to find the jar containing our main class. It requires Ant 1.8+.
<whichresource class="${main.class}" property="main.class.url">
    <classpath>
        <fileset dir="${jar.dir}" includes="*.jar" />
    </classpath>
</whichresource>

<pathconvert property="jar.file">
    <url url="${main.class.url}" />
    <regexpmapper from="jar:file:/(.*)!.*" to="\1" />
</pathconvert>

<echo>Jar file containing main class: ${jar.file}</echo>
I hope this answers your question.
Maarten
Try jarscan. It is a command-line Java tool; it should be easy to integrate it through Ant as well.
I think this problem can be better solved by a 1-line (ok, 5-line) C-shell script. Suppose you are trying to find a list of jar files in some directory that contain a certain file. Try this at your csh prompt:
% cd directory_where_your_jar_files_reside
% set f2Search = filename_you_are_looking_for
% foreach jarFile (*.jar)
? (jar tvf $jarFile | grep $f2Search > /dev/null) || echo $jarFile
? end
You can obviously redirect the output to some other file if required. This is a Unix solution; I don't know how to do this on Windows, sorry. Apologies for not answering the Ant question, but others have answered it already. Hope this helps, - M.S.
That's not really what Ant is for.
Ant is primarily a build tool, and its core functionality revolves around that. Yes, it can be used for additional functions, but it's still best to use it for what it was intended. Finding a class inside a JAR file isn't what it's good at.
You may be better off just doing this in Java. Java offers functionality to open up JAR files and inspect the contents. You can use that to find the class you are looking for. Or use another tool that's intended to do that, such as the Jarscan that Biju mentions.
I agree with rfeak that this is not what Ant is designed for. Having said that, it can be very frustrating to determine how to resolve missing class dependencies.
I use the findjar website. It indexes most of the available open source libraries.
If you really need to do this, one approach that might work is to loop through all the JAR files and pass each one to the available task as the lone classpath path element. The class you're interested in would be used for the classname attribute.
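A minimal sketch of that idea, checking a single JAR with the available task (the property and path names are only illustrative; looping over a whole directory of JARs would need something like ant-contrib's for task):

<available classname="${main.class}" property="class.found">
    <classpath>
        <pathelement location="${jar.to.check}" />
    </classpath>
</available>
<echo>Found in ${jar.to.check}: ${class.found}</echo>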
Edit: Since it sounds like you're now entertaining non-Ant solutions, here's a variation of the find-my-class code:
import java.net.URL;

public class ClassFinder {
    public static void main(String[] args) throws ClassNotFoundException {
        Class<?> c = Class.forName(args[0]);
        URL url = c.getResource(c.getSimpleName() + ".class");
        System.out.println("location of " + args[0] + ": " + url);
    }
}
With Java 6, you can use classpath wildcards (see "Understanding class path wildcards"), so it should be easy to include your directory of JARs and see how fast it is.
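For example (the directory and class name are only placeholders), compiling and running it against a directory of JARs might look like:

javac ClassFinder.java
java -cp "lib/*:." ClassFinder org.exist.xmldb.EXistResource

On Windows, use ';' instead of ':' as the classpath separator.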
Edit2: And if you want to find multiple locations...
import java.io.IOException;
import java.net.JarURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.Enumeration;

public class ClassFinder {
    public static void main(String[] args) throws IOException {
        String classResourceName = args[0].replaceAll("\\.", "/") + ".class";
        ClassLoader loader = ClassFinder.class.getClassLoader();
        Enumeration<URL> classResources = (loader == null)
                ? ClassLoader.getSystemResources(classResourceName)
                : loader.getResources(classResourceName);
        if (classResources.hasMoreElements()) {
            System.out.println("Locations of " + args[0] + ":");
            while (classResources.hasMoreElements()) {
                URL url = classResources.nextElement();
                URLConnection conn = url.openConnection();
                String loc = (conn instanceof JarURLConnection)
                        ? ((JarURLConnection) conn).getJarFile().getName()
                        : url.toString();
                System.out.println(loc);
            }
        } else {
            System.out.println("No locations found for " + args[0]);
        }
    }
}

Runtime error using the Eclipse Abstract Syntax Tree

I'm trying to use the AST parser in a non-plug-in environment. The code compiles, but I get the following runtime error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/eclipse/core/resources/IResource
at org.eclipse.jdt.core.dom.ASTParser.<init>(ASTParser.java:189)
at org.eclipse.jdt.core.dom.ASTParser.newParser(ASTParser.java:118)
Here is the code I'm running:
import org.eclipse.core.runtime.IProgressMonitor;
import org.eclipse.jdt.core.dom.*;

public class TestAST
{
    private void runTest()
    {
        String helloStr = "\n" +
            "public class HelloWorld {\n" +
            "\n" +
            " private String name=\"\"\n\n" +
            " /**\n" +
            " * \n" +
            " */\n" +
            " public void sayHello() {\n" +
            " System.out.println(\"Hello \"+name+\"!\");\n" +
            " }\n" +
            "\n" +
            "}";

        ASTParser parser = ASTParser.newParser(AST.JLS3);
        parser.setKind(ASTParser.K_COMPILATION_UNIT);
        parser.setSource(helloStr.toCharArray());
        parser.setResolveBindings(true);
        ASTNode tree = parser.createAST(null);
        tree.toString();
    }

    public static void main(String args[])
    {
        TestAST ast = new TestAST();
        ast.runTest();
    }
}
Does anyone know why this is happening?
Thanks in advance,
Shirley
I recently ran into a similar issue. I slowly stepped through it, fixing one dependency at a time, and here is the list of required dependencies that I came up with. I hope this saves some time for people who attempt the same task:
List of required plug-ins:
ContentType (org.eclipse.core.contenttype)
Jobs (org.eclipse.core.jobs)
Resources (org.eclipse.core.resources)
Runtime (org.eclipse.core.runtime)
Equinox Common (org.eclipse.equinox.common)
Equinox Preferences (org.eclipse.equinox.preferences)
JDT (org.eclipse.jdt)
JDT Core (org.eclipse.jdt.core)
OSGI (org.eclipse.osgi)
OSGI Services (org.eclipse.osgi.services)
OSGI Util (org.eclipse.osgi.util)
All these JARs will likely already be contained in your Eclipse plugins directory and you can find and add them to the build path by adding them as external JARs.
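If you prefer resolving these through Maven instead of copying JARs from the plugins directory, newer JDT Core releases are also published to Maven Central and pull in most of the above as transitive dependencies. A hedged sketch (the version is only an example; pick one that matches your Eclipse level):

<dependency>
    <groupId>org.eclipse.jdt</groupId>
    <artifactId>org.eclipse.jdt.core</artifactId>
    <!-- example version only; check Maven Central for one matching your Eclipse -->
    <version>3.13.0</version>
</dependency>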
The IResource class is not on your classpath when you start the application.
If you're not using Eclipse (or some other tool) to manage the dependencies, you're going to have to track down every jar file that the Abstract Syntax Tree classes require and manually include them on your classpath. I'm not sure exactly how many this might be, but Eclipse is made up of many dozens of plugins, and manually working out the build dependencies will be a chore.
Edit: To add IResource to the classpath, the particular JAR file you're looking for will be called something like org.eclipse.core.resources_3.5.0.v20090512.jar, depending on your version of Eclipse. But I don't think it will be the only one you'll need...
I had the same problem. I solved it by adding the JARs to the required dependencies in the plugin.xml. You can find this in the Dependencies tab of the plugin.xml editor.
