How to describe classes and properties using RDFS in Java

I am quite new to Sesame in Java. I am following this tutorial: http://openrdf.callimachus.net/sesame/2.7/docs/users.docbook?view . I know how to create statements and add them to a Sesame repository. Now I am trying to describe the classes and properties for the statements I am going to add. For example, I have the following:
:Book rdf:type rdfs:Class .
:bookTitle rdf:type rdf:Property .
:bookTitle rdfs:domain :Book .
:bookTitle rdfs:range rdfs:Literal .
:MyBook rdf:type :Book .
:MyBook :bookTitle "Open RDF" .
As shown, Book is defined as a class and bookTitle as a property. My question is: how can I do this in Java with OpenRDF, using org.openrdf.model.vocabulary.RDFS? To clarify, here is another example:
con.add(alice, RDF.TYPE, person);
Here alice is of type person. How can I define person as a class using org.openrdf.model.vocabulary.RDFS? Your assistance would be very much appreciated.

You'd do this in exactly the same way as you're describing alice as a person. Like this:
con.add(person, RDF.TYPE, RDFS.CLASS);
Similarly for the other things you want to add (assuming you've created a URI for bookTitle):
con.add(bookTitle, RDF.TYPE, RDF.PROPERTY);
con.add(bookTitle, RDFS.DOMAIN, book);
con.add(bookTitle, RDFS.RANGE, RDFS.LITERAL);
and so on for the remaining statements.
I should point out that although it is of course possible to create your schema or ontology in this fashion, it might be easier to instead create a file containing your ontology (e.g. in Turtle or N-Triples syntax), and then simply upload that file to your Sesame repository.


Using java, how to change the default operator of match query to AND in elasticsearch?

When using the Java API as below:
query.must(matchQuery("name", object.getName()));
The resulting Elasticsearch query is:
"bool":{
"must":[
{"match":{"name":{"query":"De Michael Schuster","operator":"OR","boost":1.0}}}
.....
Right now I am getting back documents whose name contains De OR Michael OR Schuster, as expected.
I want to change the operator to AND so that the whole string has to match.
I know I can use a term query, but that is not an option in my scenario.
I came across this thread, but it has no answer: https://discuss.elastic.co/t/changing-the-default-operator-for-search-api/47033
How can I achieve this using Java?
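Simply set the operator on the match query, like this (in recent versions of the Java client, Operator is org.elasticsearch.index.query.Operator):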
query.must(matchQuery("name", object.getName()).operator(Operator.AND));
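To illustrate the difference between the two operators, here is a rough, library-free sketch of how OR and AND treat the terms of a query. It assumes simple whitespace tokenisation and lower-casing, which is much cruder than what Elasticsearch's analyzers really do, and the class and method names are made up for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class MatchOperatorSketch {

    // Hypothetical helper: split on whitespace and lower-case, a crude stand-in for an analyzer.
    static Set<String> terms(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+")));
    }

    // OR: the field matches if it contains at least one query term.
    static boolean matchesOr(String field, String query) {
        Set<String> fieldTerms = terms(field);
        return terms(query).stream().anyMatch(fieldTerms::contains);
    }

    // AND: the field matches only if it contains every query term.
    static boolean matchesAnd(String field, String query) {
        return terms(field).containsAll(terms(query));
    }

    public static void main(String[] args) {
        String query = "De Michael Schuster";
        System.out.println(matchesOr("Michael Jordan", query));       // true
        System.out.println(matchesAnd("Michael Jordan", query));      // false
        System.out.println(matchesAnd("De Michael Schuster", query)); // true
    }
}
```

Note that AND only requires all terms to be present, in any order; it does not by itself enforce an exact phrase.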

ELDA - Linked Data API, define sorting option

I have an endpoint that lists all games whose name matches the {game} parameter, and I now want to let the user choose how the results are ordered.
games?name={game}
Something like:
games?name={game}&order={order}
Below is a partial implementation of my endpoint. Currently api:orderBy is hard-coded.
api:selector [
api:where " ?item a epic:Game . ?item epic:Name ?name . FILTER (regex(?name, ?game, 'i')) " ;
api:orderBy "DESC(?name)"
]
.
I am using ELDA.

OpenRdf Exception when parsing data from DBPedia

I use OpenRDF with SPARQL to gather data from DBpedia, but I get an error on the following query, run against the DBpedia SPARQL endpoint:
CONSTRUCT{
?battle ?relation ?data .
}
WHERE{
?battle rdf:type yago:Battle100953559 ;
?relation ?data .
FILTER(?relation != owl:sameAs)
}
LIMIT 1
OFFSET 18177
I modified the LIMIT and OFFSET to point out the specific result that provokes the problem.
The response is:
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ns1: <http://en.wikipedia.org/wiki/> .
<http://dbpedia.org/resource/Mongol%E2%80%93Jin_Dynasty_War> foaf:isPrimaryTopicOf ns1:Mongol–Jin_Dynasty_War .
The problem is that the local name in ns1:Mongol–Jin_Dynasty_War contains a dash character (an en dash, –, rather than a plain hyphen), so I get the following exception when running this query inside a Java application using OpenRDF:
org.openrdf.query.QueryEvaluationException: org.openrdf.rio.RDFParseException: Expected '.', found '–' [line 3]
Is there any way to work around this problem?
Thanks !
To help other users who might encounter the same problem, here is how to set the preferred output format for graph queries using OpenRDF v2.7.x.
You need to create a subclass of SPARQLRepository to access the HTTPClient (for some reason, it is protected).
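import org.openrdf.repository.sparql.SPARQLRepository;
import org.openrdf.rio.RDFFormat;

// Subclass whose only purpose is to request N-Triples instead of the default format.
public class NtripleSPARQLRepository extends SPARQLRepository {
    public NtripleSPARQLRepository(String endpointUrl) {
        super(endpointUrl);
        this.getHTTPClient().setPreferredRDFFormat(RDFFormat.NTRIPLES);
    }
}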
Then you just need to create a new instance of this class:
NtripleSPARQLRepository repository = new NtripleSPARQLRepository(service);
RepositoryConnection connection = new SPARQLConnection(repository);
Query query = connection.prepareQuery(QueryLanguage.SPARQL, "YOUR_QUERY");
If you are querying a Virtuoso server, then you are probably encountering sloppiness in the implementation of Virtuoso. I have seen this when getting XML results (vertical tab in output but only XML 1.0) and most recently in JSON results (\U escape for characters not in Basic Multilingual Plane).
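The character in the exception message is worth a closer look: it is an en dash (U+2013), not the ASCII hyphen-minus (U+002D), and it is the same character that appears percent-encoded as %E2%80%93 in the resource URI of the response. N-Triples writes every URI out in full, so requesting it avoids the abbreviated ns1: form altogether. A small standard-library check (the class and method names are made up for this sketch) that shows what the character is:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class DashCheck {

    // Percent-encode a string as UTF-8, the form used inside the DBpedia resource URI.
    static String percentEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Local name from the response, with the dash written as a Unicode escape.
        String localName = "Mongol\u2013Jin_Dynasty_War";
        char dash = localName.charAt(6);
        System.out.println(Integer.toHexString(dash)); // 2013
        System.out.println(dash == '-');               // false
        System.out.println(percentEncode(localName));  // Mongol%E2%80%93Jin_Dynasty_War
    }
}
```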

Aggregating within Apache Jena

I'm using the Java API of Apache Jena to store and retrieve documents and the words within them. For this I decided to set up the following data structure:
_dataset = TDBFactory.createDataset("./database");
_dataset.begin(ReadWrite.WRITE);
Model model = _dataset.getDefaultModel();
Resource document= model.createResource("http://name.space/Source/DocumentA");
document.addProperty(RDF.value, "Document A");
Resource word = model.createResource("http://name.space/Word/aword");
word.addProperty(RDF.value, "aword");
Resource resource = model.createResource();
resource.addProperty(RDF.value, word);
resource.addProperty(RSS.items, "5");
document.addProperty(RDF.type, resource);
_dataset.commit();
_dataset.end();
The code example above represents a document ("Document A") in which the word "aword" occurs five times. The occurrences of a word in a document are counted and stored as a property. A word can also occur in other documents, so the occurrence count for a specific word in a specific document is attached to a blank node that links the two. (I'm not entirely sure whether this structure makes sense, as I'm fairly new to this way of storing information, so please feel free to suggest better solutions!)
My main question is: how can I get a list of all distinct words and the sum of their occurrences over all documents?
Your data model is a bit unconventional, in my opinion. With your code, you'll end up with data that looks like this (in Turtle notation), and which uses rdf:type and rdf:value in unconventional ways:
:doc rdf:value "document a" ;
rdf:type :resource .
:resource rdf:value :word ;
:items 5 .
:word rdf:value "aword" .
It's unusual, because usually you wouldn't have such complex information on the type attribute of a resource. From the SPARQL standpoint though, rdf:type and rdf:value are properties just like any other, and you can still retrieve the information you're looking for with a simple query. It would look more or less like this (though you'll need to define some prefixes, etc.):
select ?word (sum(?n) as ?nn) where {
?document rdf:type ?type .
?type rdf:value/rdf:value ?word ;
:items ?n .
}
group by ?word
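That query produces one result per word, together with the sum of the values of the :items properties associated with that word. Note that sum expects numeric literals; the code in the question stores the count as the string "5", so it would likely need to be stored as a number instead (for example with resource.addLiteral(RSS.items, 5)). There are lots of questions on Stack Overflow with examples of running SPARQL queries with Jena, e.g. (the first one that I found with Google): Query Jena TDB store.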
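As a plain-Java illustration of what the group by and sum in that query compute, the same aggregation over in-memory data could be sketched as follows. This is not Jena API; the Occurrence class, the second document and the sample numbers are invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordTotals {

    // One (document, word, count) entry, mirroring one blank node in the question's model.
    static final class Occurrence {
        final String document;
        final String word;
        final int count;

        Occurrence(String document, String word, int count) {
            this.document = document;
            this.word = word;
            this.count = count;
        }
    }

    // Rough equivalent of: select ?word (sum(?n) as ?nn) ... group by ?word
    static Map<String, Integer> totalsByWord(List<Occurrence> occurrences) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Occurrence o : occurrences) {
            // Add this count to the running total for the word.
            totals.merge(o.word, o.count, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Occurrence> data = Arrays.asList(
                new Occurrence("DocumentA", "aword", 5),
                new Occurrence("DocumentB", "aword", 2),
                new Occurrence("DocumentB", "other", 1));
        System.out.println(totalsByWord(data)); // {aword=7, other=1}
    }
}
```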

How can one extract rdf:about or rdf:ID properties from triples using SPARQL?

It seemed a trivial matter at the beginning but so far I have not managed to get the unique identifier for a given resource using SPARQL. What I mean is given, e.g., rdf:Description rdf:about="http://..." and then some properties identifying this resource, what I want to do is to first find this very resource and then retrieve all the triples given some URI.
I have tried naïve approaches by writing statements in a WHERE clause such as:
?x rdf:about ?y and ?x rdfs:about ?y
I hope I am being precise.
You're making a classic mistake: confusing RDF (which is what SPARQL queries) with one of its serialisations, namely RDF/XML. rdf:about (like rdf:ID, rdf:Description and rdf:resource) is part of RDF/XML, which is one way of writing RDF down. You can play around with the RDF Validator to see which RDF triples result from a piece of RDF/XML.
In your case let's start with:
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/terms/">
<rdf:Description rdf:about="http://www.example.org/">
<dc:title>Example for Donal Fellows</dc:title>
</rdf:Description>
</rdf:RDF>
Plug that into the validator and you get:
Number Subject Predicate Object
1 http://www.example.org/ http://purl.org/dc/terms/title "Example for Donal Fellows"
(you can also ask for a pictorial representation)
Notice that rdf:about is not present: its value provides the subject for the triple.
How do I write a query to find the properties associated with http://www.example.org/? Like this:
select * {
<http://www.example.org/> ?predicate ?object
}
You'll get:
?predicate ?object
<http://purl.org/dc/terms/title> "Example for Donal Fellows"
You'll notice that the query is a triple match with variables (?v) in places where we want to find values. We could also ask what predicate links http://www.example.org/ with "Example for..." by asking:
select * {
<http://www.example.org/> ?predicate "Example for Donal Fellows"
}
This pattern matching is the heart of SPARQL.
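To make the idea concrete, here is a minimal sketch of triple pattern matching in plain Java. It is only an illustration of the principle, not how a SPARQL engine is implemented: triples are three strings, null in a pattern stands for a variable, and the class names and the second triple in the sample graph are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TriplePatternSketch {

    // A triple is just three strings here.
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // A null pattern term acts like a ?variable and fits any value.
    static boolean fits(String patternTerm, String value) {
        return patternTerm == null || patternTerm.equals(value);
    }

    // Return every triple in the graph that fits the (s, p, o) pattern.
    static List<Triple> match(List<Triple> graph, String s, String p, String o) {
        List<Triple> results = new ArrayList<>();
        for (Triple t : graph) {
            if (fits(s, t.subject) && fits(p, t.predicate) && fits(o, t.object)) {
                results.add(t);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Triple> graph = Arrays.asList(
                new Triple("http://www.example.org/", "http://purl.org/dc/terms/title", "Example for Donal Fellows"),
                new Triple("http://www.example.org/other", "http://purl.org/dc/terms/title", "Something else"));
        // Like: select * { <http://www.example.org/> ?predicate ?object }
        for (Triple t : match(graph, "http://www.example.org/", null, null)) {
            System.out.println(t.predicate + " " + t.object);
        }
    }
}
```

Passing the title string as the third argument and null as the second corresponds to the second query above, which asks for the predicate.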
RDF/XML is a tricky beast, and you might find it easier to work with N-Triples, which is very verbose but clear, or Turtle, which is like N-Triples with a large number of shorthands and abbreviations. Turtle is often preferred by the RDF community.
P.S. rdfs:about doesn't exist anywhere.
