Add basic value to Ontology individuals #Jena

Add basic value to Ontology individuals #Jena - java

I have an Ontology with some Classes and everything setup to run. What is a good way to fill it up with Individuals and Data?? In Short do a one-way Mapping from Database (as Input) to an Ontology.
public class Main {
static String SOURCE = "http://www.umingo.de/ontology/bento.owl";
static String NS = SOURCE+"#";
public static void main(String[] args) throws Exception {
OntModel model = ModelFactory.createOntologyModel( OntModelSpec.OWL_MEM );
// read the RDF/XML file
model.read(SOURCE);
OntologyPreLoader loader = new OntologyPreLoader();
model = loader.init(model);
model.write(System.out,"RDF/XML");
}
}
My Preloader has a Method init with the goal to copy data from a database into the ontology. Here is the Excerpt.
public OntModel init(OntModel model) throws SQLException{
Resource r = model.getResource( Main.NS + "Tag" );
Property tag_name = model.createProperty(Main.NS + "Tag_Name");
OntClass tag = r.as( OntClass.class );
// statements allow to issue SQL queries to the database
statement = connect.createStatement();
// resultSet gets the result of the SQL query
resultSet = statement
.executeQuery("select * from niuu.tags");
// resultSet is initialised before the first data set
while (resultSet.next()) {
// it is possible to get the columns via name
// also possible to get the columns via the column number
// which starts at 1
// e.g., resultSet.getSTring(2);
String id = resultSet.getString("id");
String name = resultSet.getString("name");
Individual tag_tmp = tag.createIndividual(Main.NS+"Tag_"+id);
tag_tmp.addProperty(tag_name,name);
System.out.println("id: " + id);
System.out.println("name: " + name);
}
return model;
}
Everything is working, but I feel really unsure about this way to preload ontologies. Also every Individual should get its own ID so that i can match it with the database at a later point.
Can i simply define a Property ID and add it to every Individual?
I thought about Adding ID to "Thing" as it is the most basic Type in OWL ontologies.

At first sight it seems ok. One tip is to try convert the Jena model into a RDF serialization and run it through Protégé to get a more clear picture on how your ontology mapping looks like.
You can definitely make your own property to describe the id of every individual.
Beneath is an example on how you can create a similar property in turtle format.(I did not add the prefixes for OWL and rdfs since they are some common)
You can add this in Jena aswell if needed. (or load this into your model in Jena.)
#prefix you: <your domain> .
you:dbIdentificator a owl:DatatypeProperty .
you:dbIdentificator rdfs:label "<Your database identifcator>"#en .
you:dbIdentificator rdfs:comment "<Some valuable information if needed>"#en .
you:dbIdentificator rdfs:isDefinedBy <your domain> .
you:dbIdentificator rdfs:domain owl:Thing .
You could also add owl:Thing to every resource, but that is not the best practice because it is a vague definition of a resource. I would look around for vocabularies that defines more what the resource is. Take a look at GoodRelations. It is a very good defined vocabulary that can describe information even though it is not for commercial use. Especially check out the classes there.
Hope that answered some of your question.

Programatically generating URIs is always somewhat unsettling. If you have Guava, use Preconditions to make some fail-fast assertions about what's coming out of the database (so that your code will let you know if it gets out of alignment with your schema). Use the JDK's URLEncoder to ensure that the id you get from the database is converted to a URI-friendly format (Note that if your data contains characters that cannot be printed in xml and have no percent encoding, you'll need to manually handle them).
For your property/column values, use explicitly create the literal. This makes it very clear whether you are using plain literals, language literals, or typed literals:
// If things can have multiple names in multiple languages, for example
tag_tmp.addProperty(tag_name,model.createTypedLiteral(name, "en"));
Note that you may not wish to define your schema so that it implies things about owl:Thing, because that would have implications outside of your domain. Instead, define a domain-specific notion like a :DatabaseResource. Set the domains of your properties to be that and it's subclasses rather than thing. This way the use of your property implies that the subject with within your domain, rather than simply an owl individual (which is implied by the domain of owl:DatatypeProperty anyway).
EDIT: It's absolutely acceptable to create a representation of the database's unique ID and place it into the RDF model. If you are using owl2, you can define an OWL-2 Key on that property for your :DatabaseResources and keep the same semantics that you had in the database.
EDIT: Noting a portion of your post on the Jena mailing list:
I have a huge MYSQL-Database for read only purpose and want to extract some Data into the Ontology.
I would highly recommend using the TDB Java API to construct a Dataset that backed by your disk. I've worked on very large database exports before, and it's quite possible that your data size won't be tractable otherwise. TDB's indexing requires a lot of disk space, but the memory-mapped IO makes it very difficult to kill due to OOM errors. Finally, once you have constructed the database on disk, you won't have to perform this expensive import operation again (or could at least optimize it).
If you find database creation times to be prohibitive, then you may with to utilize the bulk loader in creative ways. This answer has an example of using the bulk loader from java.

Related

Collect all VersionOne assets and all their attributes using Java

I'm working on a task that requires me to export all assets and all their attribute values into a CSV file. I know that there is an option to export into Excel, but that one has its problems and we decide to give a chance to an API.
The problem I faced is that while I can get all assets of a specific type with the code
IServices services = new Services(connector);
IAssetType requestType = services.getMeta().getAssetType("Request");
Query query = new Query(requestType);
it isn't clean how to return all asset's attributes. There is a getAttributes() for the Asset object
QueryResult result = services.retrieve(query);
for (Asset asset : result.getAssets()) {
Map<String, Attribute> attributes = asset.getAttributes();
System.out.println(attributes.toString());
}
but it doesn't seem to return an attribute unless it is explicitly added into a query eg.
…
Query query = new Query(requestType);
IAttributeDefinition nameAttribute = requestType.getAttributeDefinition("Name");
IAttributeDefinition numberAttribute = requestType.getAttributeDefinition("Number");
query.getSelection().add(nameAttribute);
query.getSelection().add(numberAttribute);
QueryResult result = services.retrieve(query);
…
which doesn't make sense to me, since I may not even know all possible object's attributes!
I feel like getAttributes() method may not be suitable for this purpose, but what else to use then? Any ideas on how I can collect the data I need?

You can use Meta API query to retrieve metadata for specific asset type:
<Server Base URI>/meta.v1/Request

In general VersionOne APIs don't return all available attributes at once as a default. When using the VersionOne Rest api, the most important subset of attributes are returned and no custom fields. The VersionOne sdks are a wrapper around this api so it stands to reason that api business rules are fulfilled in the sdk. You will have to know the names of all possible attributes of an asset and explicitly request them. This includes custom fields (Custom_AttributeName). This can be queried by doing a meta query YourVersionOneInstance/meta.v1/YourAssetName. You then will have to iterate through this xml tree and get the attribute names and wrap the proper query plumbing around each attribute.

How do I query OrientDB Vertex graph object by Record ID in Java?

How do I retrieve an OrientDB Document/Object or Graph object using its Record ID? (Language: Java)
I'm referring to http://orientdb.com/docs/2.0/orientdb.wiki/Tutorial-Record-ID.html and Vertex.getId() / Edge.getId() methods.
It is just like an SQL query "SELECT * from aTable WHERE ID = 1".
Usage/purpose description: I want to store the generated ID after it is created by OrientDB, and later retrieve the same object using the same ID.

(1) I'd suggest using OrientDB 2.1, and its documentation, e.g. http://orientdb.com/docs/2.1/Tutorial-Record-ID.html
(2) From your post, it's unclear to me whether you need help obtaining the RID from the results of a query, or retrieving an object given its RID, so let me begin by mentioning that the former can be accomplished as illustrated by this example (in the case of an INSERT query):
ODocument result=db.command(new OCommandSQL(<INSERTQUERY>)).execute();
System.out.println(result.field("#rid"));
Going the other way around, there are several approaches. I have verified that the following does work using Version 2.1.8:
OrientGraph graph = new OrientGraph("plocal:PATH_TO_DB", "admin", "admin");
Vertex v = graph.getVertex("#16:0");
An alternative and more generic approach is to construct and execute a SELECT query of the form SELECT FROM :RID, along the lines of this example:
List<ODocument> results = db.query(new OSQLSynchQuery<ODocument>("select from " + rid));
for (ODocument aDoc : results) {
System.out.println(aDoc.field("name"));
}
(3) In practice, it will usually be better to use some other "handle" on OrientDB vertices and edges in Java code, or indeed when using any of the supported programming languages. For example, once one has a vertex as a Java Vertex, as in the "Vertex v" example above, one can usually use it.

How to prevent SQL injection when dealing with dynamic table/column names?

I am using jdbc PreparedStatement for data insertion.
Statement stmt = conn.prepareStatement(
"INESRT INTO" + tablename+ "("+columnString+") VALUES (?,?,?)");
tablename and columnString are something that is dynamically generated.
I've tried to parameterise tablename and columnString but they will just resolve to something like 'tablename' which will violate the syntax.
I've found somewhere online that suggest me to lookup the database to check for valid tablename/columnString, and cache it somewhere(a Hashset perhaps) for another query, but I'm looking for better performance/ quick hack that will solve the issue, perhaps a string validator/ regex that will do the trick.
Have anyone came across this issue and how do you solve it?

I am not a java-guy, so, only a theory.
You can either format dynamically added identifiers or white-list them.
Second option is way better. Because
most developers aren't familiar enough with identifiers to format them correctly. Say, to quote an identifier, which is offered in the first comment, won't make it protected at all.
there could be another attack vector, not entirely an injection, but similar: imagine there is a column in your table, an ordinary user isn't allowed to - say, called "admin". With dynamically built columnString using data coming from the client side, it's piece of cake to forge a privilege escalation.
Thus, to list all the possible (and allowed) variants in your code beforehand, and then to verify entered value against it, would be the best.
As of columnString - is consists of separate column names. Thus, to protect it, one have to verify each separate column name against a white list, and then assemble a final columnString from them.

Create a method that generates the sql string for you:
private static final String template = "insert into %s (%s) values (%s)";
private String buildStmt(String tblName, String ... colNames) {
StringJoiner colNamesJoiner = new StringJoiner(",");
StringJoiner paramsJoiner = new StringJoiner(",");
Arrays.stream(colNames).forEach(colName -> {
colNamesJoiner.add(colName);
paramsJoiner.add("?");
});
return String.format(template, tblName, colNamesJoiner.toString(), paramsJoiner.toString());
}
Then use it...
Statement stmt = conn.prepareStatement(buildStmt(tablename, [your column names]));

As an elaboration on #Anders' answer, don't use the input parameter as the name directly, but keep a properties file (or database table) that maps a set of allowed inputs to actual table names.
That way any invalid name will not lead to valid SQL (and can be caught before any SQL is generated) AND the actual names are never known outside the application, thus making it far harder to guess what would be valid SQL statements.

I think, the best approach is to get table and columns names from database or other non user input, and use parameters in prepared statement for the rest.

There are multiple solutions we can apply.
1) White List Input Validation
String tableName;
switch(PARAM):
case "Value1": tableName = "fooTable";
break;
case "Value2": tableName = "barTable";
break;
...
default : throw new InputValidationException("unexpected value provided for table name");
By doing this input validation on tableName, will allows only specified tables in the query, so it will prevents sql injection attack.
2) Bind your dynamic columnName(s) or tableName(s) with special characters as shown below
eg:
For Mysql : use back codes (`)
Select `columnName ` from `tableName `;
For MSSQL : Use double codes(" or [ ] )
select "columnName" from "tableName"; or
select [columnName] from [tableName];
Note: Before doing this you should sanitize your data with this special characters ( `, " , [ , ] )

What is the best way to import an XML string into a SQL Server table

I am working with a 3rd product called JPOS and it has an XMLPackager whereby I get a string from this packager that contains a record in an XML format such as:
<MACHINE><B000>STRING_VALUE</B000><B002>STRING_VALUE</B002><B003>STRING_VALUE</B003><B004>STRING_VALUE</B004><B007>STRING_VALUE</B007><B011>STRING_VALUE</B011><B012>STRING_VALUE</B012><B013>STRING_VALUE</B013><B015>STRING_VALUE</B015><B018>STRING_VALUE</B018><B028>STRING_VALUE</B028><B032>STRING_VALUE</B032><B035>STRING_VALUE</B035><B037>STRING_VALUE</B037><B039>STRING_VALUE</B039><B041>STRING_VALUE</B041><B043>STRING_VALUE</B043><B048>STRING_VALUE</B048><B049>STRING_VALUE</B049><B058>STRING_VALUE</B058><B061>STRING_VALUE</B061><B063>STRING_VALUE</B063><B127>STRING_VALUE</B127></MACHINE>
I have a SQL server table that contains a column for each of the listed. Not that it matters but I could potentially have thru defined with specific STRING_VALUEs. I'm not sure what is the best way to go about this in Java. My understanding is that SQL Server can take an XML string (not document) and do an insert. Is it best to parse each value and then put into a list that populate each value into? This is the first time I've used an XML file and therefore trying to get some help/direction.
Thanks.

Sorry, one of my colleagues was able to help and provide a quick answer. I'll try it from my Java code and it looks like it should work great. Thanks anyway.
Here is the SP that she created whereby I can pass in my XML string and bit value:
CREATE PROCEDURE [dbo].[sbssp_InsertArchivedMessages]
(
#doc varchar(max),
#fromTo bit
)
AS
BEGIN
DECLARE #idoc int, #lastId int
EXEC sp_xml_preparedocument #idoc OUTPUT, #doc
INSERT INTO [dbo].[tblArchivedMessages]
SELECT * FROM OPENXML(#idoc, '/MACHINE', 2) WITH [dbo].[tblArchivedMessages]
SET #lastId = (SELECT IDENT_CURRENT('tblArchivedMessages'))
UPDATE [dbo].[tblArchivedMessages]
SET FromToMach = #fromTo
WHERE ID = #lastId
END
GO
Regards.

Cleanest way to build an SQL string in Java

I want to build an SQL string to do database manipulation (updates, deletes, inserts, selects, that sort of thing) - instead of the awful string concat method using millions of "+"'s and quotes which is unreadable at best - there must be a better way.
I did think of using MessageFormat - but its supposed to be used for user messages, although I think it would do a reasonable job - but I guess there should be something more aligned to SQL type operations in the java sql libraries.
Would Groovy be any good?

First of all consider using query parameters in prepared statements:
PreparedStatement stm = c.prepareStatement("UPDATE user_table SET name=? WHERE id=?");
stm.setString(1, "the name");
stm.setInt(2, 345);
stm.executeUpdate();
The other thing that can be done is to keep all queries in properties file. For example
in a queries.properties file can place the above query:
update_query=UPDATE user_table SET name=? WHERE id=?
Then with the help of a simple utility class:
public class Queries {
private static final String propFileName = "queries.properties";
private static Properties props;
public static Properties getQueries() throws SQLException {
InputStream is =
Queries.class.getResourceAsStream("/" + propFileName);
if (is == null){
throw new SQLException("Unable to load property file: " + propFileName);
}
//singleton
if(props == null){
props = new Properties();
try {
props.load(is);
} catch (IOException e) {
throw new SQLException("Unable to load property file: " + propFileName + "\n" + e.getMessage());
}
}
return props;
}
public static String getQuery(String query) throws SQLException{
return getQueries().getProperty(query);
}
}
you might use your queries as follows:
PreparedStatement stm = c.prepareStatement(Queries.getQuery("update_query"));
This is a rather simple solution, but works well.

For arbitrary SQL, use jOOQ. jOOQ currently supports SELECT, INSERT, UPDATE, DELETE, TRUNCATE, and MERGE. You can create SQL like this:
String sql1 = DSL.using(SQLDialect.MYSQL)
.select(A, B, C)
.from(MY_TABLE)
.where(A.equal(5))
.and(B.greaterThan(8))
.getSQL();
String sql2 = DSL.using(SQLDialect.MYSQL)
.insertInto(MY_TABLE)
.values(A, 1)
.values(B, 2)
.getSQL();
String sql3 = DSL.using(SQLDialect.MYSQL)
.update(MY_TABLE)
.set(A, 1)
.set(B, 2)
.where(C.greaterThan(5))
.getSQL();
Instead of obtaining the SQL string, you could also just execute it, using jOOQ. See
http://www.jooq.org
(Disclaimer: I work for the company behind jOOQ)

One technology you should consider is SQLJ - a way to embed SQL statements directly in Java. As a simple example, you might have the following in a file called TestQueries.sqlj:
public class TestQueries
{
public String getUsername(int id)
{
String username;
#sql
{
select username into :username
from users
where pkey = :id
};
return username;
}
}
There is an additional precompile step which takes your .sqlj files and translates them into pure Java - in short, it looks for the special blocks delimited with
#sql
{
...
}
and turns them into JDBC calls. There are several key benefits to using SQLJ:
completely abstracts away the JDBC layer - programmers only need to think about Java and SQL
the translator can be made to check your queries for syntax etc. against the database at compile time
ability to directly bind Java variables in queries using the ":" prefix
There are implementations of the translator around for most of the major database vendors, so you should be able to find everything you need easily.

I am wondering if you are after something like Squiggle (GitHub). Also something very useful is jDBI. It won't help you with the queries though.

I would have a look at Spring JDBC. I use it whenever I need to execute SQLs programatically. Example:
int countOfActorsNamedJoe
= jdbcTemplate.queryForInt("select count(0) from t_actors where first_name = ?", new Object[]{"Joe"});
It's really great for any kind of sql execution, especially querying; it will help you map resultsets to objects, without adding the complexity of a complete ORM.

I tend to use Spring's Named JDBC Parameters so I can write a standard string like "select * from blah where colX=':someValue'"; I think that's pretty readable.
An alternative would be to supply the string in a separate .sql file and read the contents in using a utility method.
Oh, also worth having a look at Squill: https://squill.dev.java.net/docs/tutorial.html

I second the recommendations for using an ORM like Hibernate. However, there are certainly situations where that doesn't work, so I'll take this opportunity to tout some stuff that i've helped to write: SqlBuilder is a java library for dynamically building sql statements using the "builder" style. it's fairly powerful and fairly flexible.

I have been working on a Java servlet application that needs to construct very dynamic SQL statements for adhoc reporting purposes. The basic function of the app is to feed a bunch of named HTTP request parameters into a pre-coded query, and generate a nicely formatted table of output. I used Spring MVC and the dependency injection framework to store all of my SQL queries in XML files and load them into the reporting application, along with the table formatting information. Eventually, the reporting requirements became more complicated than the capabilities of the existing parameter mapping frameworks and I had to write my own. It was an interesting exercise in development and produced a framework for parameter mapping much more robust than anything else I could find.
The new parameter mappings looked as such:
select app.name as "App",
${optional(" app.owner as "Owner", "):showOwner}
sv.name as "Server", sum(act.trans_ct) as "Trans"
from activity_records act, servers sv, applications app
where act.server_id = sv.id
and act.app_id = app.id
and sv.id = ${integer(0,50):serverId}
and app.id in ${integerList(50):appId}
group by app.name, ${optional(" app.owner, "):showOwner} sv.name
order by app.name, sv.name
The beauty of the resulting framework was that it could process HTTP request parameters directly into the query with proper type checking and limit checking. No extra mappings required for input validation. In the example query above, the parameter named serverId
would be checked to make sure it could cast to an integer and was in the range of 0-50. The parameter appId would be processed as an array of integers, with a length limit of 50. If the field showOwner is present and set to "true", the bits of SQL in the quotes will be added to the generated query for the optional field mappings. field Several more parameter type mappings are available including optional segments of SQL with further parameter mappings. It allows for as complex of a query mapping as the developer can come up with. It even has controls in the report configuration to determine whether a given query will have the final mappings via a PreparedStatement or simply ran as a pre-built query.
For the sample Http request values:
showOwner: true
serverId: 20
appId: 1,2,3,5,7,11,13
It would produce the following SQL:
select app.name as "App",
app.owner as "Owner",
sv.name as "Server", sum(act.trans_ct) as "Trans"
from activity_records act, servers sv, applications app
where act.server_id = sv.id
and act.app_id = app.id
and sv.id = 20
and app.id in (1,2,3,5,7,11,13)
group by app.name, app.owner, sv.name
order by app.name, sv.name
I really think that Spring or Hibernate or one of those frameworks should offer a more robust mapping mechanism that verifies types, allows for complex data types like arrays and other such features. I wrote my engine for only my purposes, it isn't quite read for general release. It only works with Oracle queries at the moment and all of the code belongs to a big corporation. Someday I may take my ideas and build a new open source framework, but I'm hoping one of the existing big players will take up the challenge.

Why do you want to generate all the sql by hand? Have you looked at an ORM like Hibernate Depending on your project it will probably do at least 95% of what you need, do it in a cleaner way then raw SQL, and if you need to get the last bit of performance you can create the SQL queries that need to be hand tuned.

You can also have a look at MyBatis (www.mybatis.org) . It helps you write SQL statements outside your java code and maps the sql results into your java objects among other things.

Google provides a library called the Room Persitence Library which provides a very clean way of writing SQL for Android Apps, basically an abstraction layer over underlying SQLite Database. Bellow is short code snippet from the official website:
#Dao
public interface UserDao {
#Query("SELECT * FROM user")
List<User> getAll();
#Query("SELECT * FROM user WHERE uid IN (:userIds)")
List<User> loadAllByIds(int[] userIds);
#Query("SELECT * FROM user WHERE first_name LIKE :first AND "
+ "last_name LIKE :last LIMIT 1")
User findByName(String first, String last);
#Insert
void insertAll(User... users);
#Delete
void delete(User user);
}
There are more examples and better documentation in the official docs for the library.
There is also one called MentaBean which is a Java ORM. It has nice features and seems to be pretty simple way of writing SQL.

Read an XML file.
You can read it from an XML file. Its easy to maintain and work with.
There are standard STaX, DOM, SAX parsers available out there to make it few lines of code in java.
Do more with attributes
You can have some semantic information with attributes on the tag to help do more with the SQL. This can be the method name or query type or anything that helps you code less.
Maintaince
You can put the xml outside the jar and easily maintain it. Same benefits as a properties file.
Conversion
XML is extensible and easily convertible to other formats.
Use Case
Metamug uses xml to configure REST resource files with sql.

If you put the SQL strings in a properties file and then read that in you can keep the SQL strings in a plain text file.
That doesn't solve the SQL type issues, but at least it makes copying&pasting from TOAD or sqlplus much easier.

How do you get string concatenation, aside from long SQL strings in PreparedStatements (that you could easily provide in a text file and load as a resource anyway) that you break over several lines?
You aren't creating SQL strings directly are you? That's the biggest no-no in programming. Please use PreparedStatements, and supply the data as parameters. It reduces the chance of SQL Injection vastly.

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.