Retrieval of synonyms of an instance from whole ontology

Individual ind = model.createIndividual("http://www.semanticweb.org/ontologies/Word#Human", isSynonymOf);
System.out.println( "Synonyms of given instance are:" );
StmtIterator it =ind.listProperties(isSynonymOf);
while( it.hasNext() ) {
Statement stmt = ((StmtIterator) it).nextStatement();
System.out.println( " * "+stmt.getObject());
}
Output
Synonyms of given instance are:
http://www.semanticweb.org/ontologies/Word#Human
http://www.semanticweb.org//ontologies/Word#Mortal
http://www.semanticweb.org/ontologies/Word#Person
Problem 1: My output shows the whole URI, but I need the output as follows:
Synonyms of given instance are:
Human
Mortal
Person
Problem 2: I have 26 instances, and each time I have to specify an instance's URI to show its synonyms. How can I show the synonyms of every instance in the whole ontology model without specifying the URIs one by one? I am using Eclipse Mars 2.0 and the Jena API.

For the first problem, you can use a regular expression or plain Java string operations to extract the substring after #. Note that best practice is to provide human-readable labels for resources separately rather than encoding them in the URI. For instance, rdfs:label is a common property for doing that.
For the second problem, simply iterate over all individuals of the ontology, which are returned by
model.listIndividuals()
Some comments:
You're not using the method createIndividual as intended. The second argument denotes a class, and you're passing it a property. Please consult the Javadoc in the future.
I don't understand why you're casting it to StmtIterator. It is already declared as one, so the cast does nothing.
Using listPropertyValues is more convenient, since you're only interested in the values.
Use Java 8 to make the code more compact:
model.listIndividuals().forEachRemaining(ind -> {
System.out.println("Synonyms of instance " + ind + " are:");
ind.listPropertyValues(isSynonymOf).forEachRemaining(val -> {
System.out.println(" * " + val);
});
});
Java 6 compatible version:
ExtendedIterator<Individual> indIter = model.listIndividuals();
while(indIter.hasNext()) {
Individual ind = indIter.next();
System.out.println("Synonyms of instance " + ind + " are:");
NodeIterator valueIter = ind.listPropertyValues(isSynonymOf);
while(valueIter.hasNext()) {
RDFNode val = valueIter.next();
System.out.println(" * " + val);
}
}
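For the first problem, the substring approach could look like the following minimal sketch. It is plain Java with no Jena dependency. The class and method names (LocalNames, localName) are made up for illustration, and the sketch assumes the readable name always follows the last # in the URI.

```java
// Hypothetical helper, not part of Jena: keeps only the text after the last '#'.
class LocalNames {
    static String localName(String uri) {
        int hash = uri.lastIndexOf('#');
        // No '#' present: return the URI unchanged rather than guessing.
        return hash >= 0 ? uri.substring(hash + 1) : uri;
    }

    public static void main(String[] args) {
        // prints Human
        System.out.println(localName("http://www.semanticweb.org/ontologies/Word#Human"));
    }
}
```

In the loops above, this would be applied to the string form of each value, for example localName(val.toString()).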

Related

Match a string from a list and extract values

What would be the most efficient (lowest CPU time) way of achieving the following in Java?
Let us say we have a list of strings as follows:
1.T.methodA(p1).methodB(p2,p3).methodC(p4)
2.T.methodX.methodY(p5,p6).methodZ()
3 ...
At runtime we get strings as follows that may match one of the strings in our list :
a.T.methodA(p1Value).methodB(p2Value,p3Value).methodC(p4Value) // Matches 1
b.T.methodM().methodL(p10) // No Match
c.T.methodX.methodY(p5Value,p6Value).methodZ() // Matches 2
I would like to match (a) to (1) and extract the values of p1,p2,p3 and p4
where:
p1Value = p1, p2Value = p2, p3Value = p3 and so on.
Similarly for the other matches like c to 2 for example.

The first method that comes to mind is, of course, a regular expression.
But that could be complicated to update in the future or to extend to edge cases.
Instead, you can try the Nashorn engine, which allows you to execute JavaScript code in the JVM.
You then just need to create a special JavaScript object that handles all your methods:
private static final String jsLib = "var T = {" +
"results: new java.util.HashMap()," +
"methodA: function (p1) {" +
" this.results.put('p1', p1);" +
" return this;" +
"}," +
"methodB: function (p2, p3) {" +
" this.results.put('p2', p2);" +
" this.results.put('p3', p3);" +
" return this;" +
"}," +
"methodC: function (p4) {" +
" this.results.put('p4', p4);" +
" return this.results;" +
"}}";
This is kept as a string for simplicity, and it handles your first case.
You can also write the code in a .js file and load it easily.
The JavaScript object has a special attribute that is a Java HashMap, so the result of the evaluation is that map, with all the values stored by name.
Then you just eval the input:
ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
final String inputSctipt = "T.methodA('p1Value').methodB('p2Value','p3Value').methodC('p4Value')";
try {
engine.eval(jsLib);
Map<String, Object> result = (Map<String, Object>)engine.eval(inputSctipt);
System.out.println("Script result:\n" + result.get("p1"));
} catch (ScriptException e) {
e.printStackTrace();
}
And you get:
Script result:
p1Value
You can get all the other values in the same way.
You need to ignore script errors, as they should correspond to paths that are not implemented.
Just remember to reset the script context before each evaluation, to avoid mixing in previous values.
The advantage of this solution over regular expressions is that it is easy to understand and easy to update when needed.
The only disadvantages I can see are that it requires some knowledge of JavaScript, of course, and the performance.
You didn't mention performance as an issue, so you can try this approach if it is good enough for your needs.
If you need better performance, then you should look at regular expressions.
UPDATE
To have a more complete answer, here is the same example with regular expressions:
Pattern p = Pattern.compile("^T\\.methodA\\(['\"]?(.+?)['\"]?\\)\\.methodB\\(['\"]?([^,]+?)['\"]?,['\"]?(.+?)['\"]?\\)\\.methodC\\(['\"]?(.+?)['\"]?\\)$");
Matcher m = p.matcher(inputSctipt);
if (m.find()) {
System.out.println("With regexp:\n" + m.group(1));
}
Please be aware that this expression doesn't handle edge cases, and you will need one regular expression for each string you want to parse in order to grab its parameter values.
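One way to avoid hand-writing a regular expression per string is to generate the pattern from the template itself. The following is only a sketch of that idea, not part of the answer above. The class name TemplateMatcher is hypothetical, and it assumes parameter lists are not nested and that values contain no commas, parentheses or quotes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns "T.methodA(p1).methodB(p2,p3)" into a compiled pattern
// with one capture group per parameter name.
class TemplateMatcher {
    private final List<String> names = new ArrayList<>();
    private final Pattern pattern;

    TemplateMatcher(String template) {
        StringBuilder regex = new StringBuilder();
        // Finds each "(...)" argument list in the template (no nesting assumed).
        Matcher args = Pattern.compile("\\(([^()]*)\\)").matcher(template);
        int last = 0;
        while (args.find()) {
            // Literal text before the argument list is matched verbatim.
            regex.append(Pattern.quote(template.substring(last, args.start())));
            regex.append("\\(");
            String inside = args.group(1).trim();
            if (!inside.isEmpty()) {
                String[] params = inside.split(",");
                for (int i = 0; i < params.length; i++) {
                    names.add(params[i].trim());
                    if (i > 0) regex.append(",");
                    regex.append("([^,()]*)"); // one group per parameter value
                }
            }
            regex.append("\\)");
            last = args.end();
        }
        regex.append(Pattern.quote(template.substring(last)));
        pattern = Pattern.compile(regex.toString());
    }

    // Returns the values keyed by parameter name, or empty if the input does not match.
    Optional<Map<String, String>> match(String input) {
        Matcher m = pattern.matcher(input);
        if (!m.matches()) return Optional.empty();
        Map<String, String> values = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            values.put(names.get(i), m.group(i + 1));
        }
        return Optional.of(values);
    }

    public static void main(String[] args) {
        TemplateMatcher t = new TemplateMatcher("T.methodA(p1).methodB(p2,p3).methodC(p4)");
        System.out.println(t.match("T.methodA(p1Value).methodB(p2Value,p3Value).methodC(p4Value)"));
        System.out.println(t.match("T.methodM().methodL(p10)"));
    }
}
```

Each template is compiled once up front, so at runtime only the match itself is paid for. With many templates, the input would be tried against each matcher in turn.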

Java: Issue when replacing Strings on loop

I'm building a small app which auto translates boolean queries in Java.
This is the code to find if the query string contains a certain word and if so, it replaces it with the translated value.
int howmanytimes = originalValues.size();
for (int y = 0; y < howmanytimes; y++) {
String originalWord = originalValues.get(y);
System.out.println("original Word = " + originalWord);
if (toReplace.contains(" " + originalWord.toLowerCase() + " ")
|| toCheck.contains('"' + originalWord.toLowerCase() + '"')) {
toReplace = toReplace.replace(originalWord, translatedValues.get(y).toLowerCase());
System.out.println("replaced " + originalWord + " with " + translatedValues.get(y).toLowerCase());
}
System.out.println("to Replace inside loop " + toReplace);
}
The problem is when a query has, for example, '(mykeyword OR "blue mykeyword")' and the translated values are different, for example, mykeyword translates to elpalavra and "blue mykeyword" translates to "elpalavra azul". What happens in this case is that the result string will be '(elpalavra OR "blue elpalavra")' when it should be '(elpalavra OR "elpalavra azul")' . I understand that in the first loop it replaces all keywords and in the second it no longer contains the original value it should for translation.
How can I fix this?
Thank you

You can sort originalValues by length in descending order and then loop through them.
This way you replace "blue mykeyword" first, and only afterwards replace "mykeyword".

It is not explained what the "toCheck" variable is for, and in any case the way it is used looks odd (to me at least).
Setting that aside, one way to do what you ask (based only on the requirements you specified) is this:
Sort your originalValues so that the entries with more words come first. Entries with the same number of words should be ordered from longest to shortest.
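The ordering idea can be sketched as follows. This is an illustration under assumptions, not the asker's code: it keeps each original phrase paired with its translation in a Map (so sorting cannot break the pairing that two parallel lists rely on), sorts by length only, and uses a made-up class name, OrderedReplace.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: replaces longer phrases before the shorter ones they contain.
class OrderedReplace {
    static String translate(String query, Map<String, String> dictionary) {
        List<Map.Entry<String, String>> entries = new ArrayList<>(dictionary.entrySet());
        // Longest original phrase first, so "blue mykeyword" is handled before "mykeyword".
        entries.sort((a, b) -> Integer.compare(b.getKey().length(), a.getKey().length()));
        String result = query;
        for (Map.Entry<String, String> e : entries) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new HashMap<>();
        dictionary.put("mykeyword", "elpalavra");
        dictionary.put("blue mykeyword", "elpalavra azul");
        // prints (elpalavra OR "elpalavra azul")
        System.out.println(translate("(mykeyword OR \"blue mykeyword\")", dictionary));
    }
}
```

Note that this still goes wrong if a translated value itself contains one of the remaining original words, because later passes would translate it again.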

Apache Calcite to Find Selected Columns in an SQL String

I have a use case where I want to know which columns are selected in an SQL string. For instance, if the SQL is like this:
SELECT name, age*5 as intelligence FROM bla WHERE bla=bla
Then, after parsing the above String, I just want the output to be: name, intelligence.
Firstly, is it possible through Calcite?
Any other option is also welcome.
PS: I want to know this before actually running the query on database.

This is definitely doable with Calcite. You'll want to start by creating an instance of SqlParser and parsing the query:
SqlParser parser = SqlParser.create(query);
SqlNode parsed = parser.parseQuery();
From there, you'll probably have the most success implementing the SqlVisitor interface. You'll want to first find a SqlSelect instance and then visit each expression being selected by calling visit on each element of getSelectList.
From there, how you proceed will depend on the complexity of expressions you want to support. However, it's probably sufficient to recursively visit all SqlCall nodes and their operands and then collect any SqlIdentifier values that you see.

It can be as simple as:
SqlParser parser = SqlParser.create(yourQuery);
SqlSelect selectNode = (SqlSelect) parser.parseQuery();
SqlNodeList list = selectNode.getSelectList();
for (int i = 0; i < list.size(); i++) {
System.out.println("Column " + (i + 1) + ": " + list.get(i).toString());
}

How to loop elements in 2d array to construct an string in "Ruby functional Style"

Ruby makes heavy use of functions from the functional programming toolbox, such as map and each. They depend on a self-contained function, which is called a block in Ruby.
It is very common to loop through a 2D array and build a string from its elements.
In Java, it may look like this:
public String toString(){
String output = "[";
for (int i =0; i<array.length; i++) {
output+= "Row "+(i+1)+" : ";
for (int j=0; j<array[0].length;j++ ) {
output += array[i][j]+", ";
}
output += "\n";
}
return output += "]";
}
I tried to rewrite this in "Ruby functional style", but I think there is still room for improvement. For example, I want to remove the mutable variable output.
def to_s
output = "[\n"
@data.each_with_index do |row,i|
output << "Row #{i+1} : "
row.each { |num| output << "#{num}," }
output << "\n"
end
output+"]"
end

Whenever you see the pattern:
initialize an accumulator (in your case output)
on each iteration of some collection modify the accumulator (in your case append to it)
return the accumulator
that's a fold, or in Ruby terms an inject.
Actually, that's a bit of a tautology. A fold is a universal method of iteration: everything that can be expressed by iterating over the elements of a collection can also be expressed as a fold over the collection. In other words: all methods on Enumerable (including each!) could also be defined in terms of inject as the primitive method instead of each.
Think about it this way: a collection can either be empty or there can be a current element. There's no third option, if you cover those two cases, then you have covered everything. Well, fold takes two arguments: one which tells it what to do when the collection is empty, and one which tells it what to do with the current element. Or, put yet another way: you can see a collection as a series of instructions and fold is an interpreter for those instructions. There are only two kinds of instructions: the END instruction and a VALUE(el) instruction. And you can supply the interpreter code for both those instructions to the fold.
In Ruby, the second argument is not part of the argument list, it is the block.
So, what's it look like as a fold?
def to_s
@data.each_with_index.inject("[\n") do |acc, (row, i)|
acc + "Row #{i+1} : #{row.join(',')}\n"
end + ']'
end
If you're curious about whether or not the each_with_index may infect your code with some non-functional impurity, rest assured that you can just as easily get rid of it by including the index in the accumulator:
def to_s
@data.inject(["[\n", 1]) do |(s, i), row|
[s + "Row #{i} : #{row.join(',')}\n", i+1]
end.first + ']'
end
Also note that in the first case, with the each_with_index, we're not actually doing anything "interesting" with the accumulator, unlike the second case, where we are using it to keep count of the index. In fact, the first case is actually a restricted form of fold, it doesn't use all of its power. It really is just a map:
def to_s
"[\n" + @data.map.with_index(1) do |row, i|
"Row #{i} : #{row.join(',')}\n"
end.join + ']'
end
In my personal opinion, it would actually be perfectly okay to use (mutable) string appending here instead of string concatenation:
def to_s
"[\n" << @data.map.with_index(1) do |row, i|
"Row #{i} : #{row.join(',')}\n"
end.join << ']'
end
This saves us from creating a couple of unnecessary string objects, but more importantly: it is more idiomatic. The real problem is shared mutable state, but we're not sharing our mutable string here: when to_s returns its caller does get access to the string, but to_s itself has returned and thus no longer has access to it.
If you want to get real fancy, you could even use string interpolation:
def to_s
%Q<[\n#{@data.map.with_index(1) do |row, i|
"Row #{i} : #{row.join(',')}\n"
end.join}]>
end
Unfortunately, this not only breaks IRb's syntax highlighting, but also my brain's ;-)
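For comparison with the Java loop in the question, the same fold can be sketched in Java. Java has no built-in left fold over an Iterable, so foldLeft below is a hand-written helper, and FoldToString, render and joinRow are hypothetical names. The sketch folds over the row indices so that the row number is available without a separate counter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiFunction;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class FoldToString {
    // A left fold: 'initial' covers the empty collection, 'step' covers each element.
    static <T, A> A foldLeft(Iterable<T> items, A initial, BiFunction<A, T, A> step) {
        A acc = initial;
        for (T item : items) {
            acc = step.apply(acc, item);
        }
        return acc;
    }

    static String joinRow(int[] row) {
        return Arrays.stream(row).mapToObj(String::valueOf).collect(Collectors.joining(","));
    }

    static String render(int[][] data) {
        List<Integer> indices = IntStream.range(0, data.length).boxed().collect(Collectors.toList());
        // Each step returns a new string; the accumulator is never modified in place.
        return foldLeft(indices, "[\n",
                (acc, i) -> acc + "Row " + (i + 1) + " : " + joinRow(data[i]) + "\n") + "]";
    }

    public static void main(String[] args) {
        System.out.println(render(new int[][] {{1, 2}, {3, 4}}));
    }
}
```

The only mutable variable left is the local acc inside foldLeft, which plays the role that inject plays in the Ruby versions.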

Here's a method with no mutable vars:
def to_s
(
[ "[" ] +
@data.map.with_index { |row,i| "Row #{i+1} : #{row * ','}" } +
[ "]" ]
).join("\n")
end

Same thing, but shorter and with fewer blocks:
def to_s
output = "[\n"
@data.each_with_index do |row,i|
output << "Row #{i+1} : #{row.join(',')}\n"
end
output+"]"
end

What is the effective method to handle word contractions using Java?

I have a list of words in a file. They might include contractions such as who's, didn't, etc. When reading them, I need to expand them into their full forms, such as "who is" and "did not". This has to be done in Java, and it needs to be fast.
This is actually for handling such queries during a search that uses Solr.
Below is sample code I tried, using a hash map:
Map<String, String> con = new HashMap<String, String>();
con.put("'s", " is");
con.put("'d", " would");
con.put("'re", " are");
con.put("'ll", " will");
con.put("n't", " not");
con.put("'nt", " not");
String temp = null;
String str = "where'd you're you'll would'nt hello";
String[] words = str.split(" ");
int index = -1 ;
for(int i = 0;i<words.length && (index =words[i].lastIndexOf('\''))>-1;i++){
temp = words[i].substring(index);
if(con.containsKey(temp)){
temp = con.get(temp);
}
words[i] = words[i].substring(0, index)+temp;
System.out.println(words[i]);
}

If you are worried about whether queries containing, e.g., "who's" will find documents containing, e.g., "who is", then you should look at using a stemmer, which is designed exactly for this purpose.
You can easily add a stemmer by configuring it as a filter in your Solr config. See http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
Edit:
A SnowballPorterFilterFactory will probably do the job for you.

Following on from @James Jithin's last remark:
the "'s" -> " is" transform is incorrect if the word is a possessive form.
the "'d" -> " would" transform is incorrect in archaic forms, where the "'d" can be a contraction of "ed".
the "'nt" -> " not" transform is not correct because this is really just a mis-spelling of the "n't" contraction. (I mean "wo'nt" is just plain wrong ... isn't it.)
So, to my mind, the best way to implement this would be to enumerate the small number of contractions that are common and valid, and leave the rest alone. This also has the advantage that you can implement it with a simple string match rather than a suffix match.
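A minimal sketch of that whole-word approach is shown below. The class name Contractions and the particular entries in the table are assumptions for illustration; a real list would be chosen by hand. It also assumes words are separated by single spaces, as in the question's sample.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table of complete contractions; anything not listed is left alone.
class Contractions {
    private static final Map<String, String> EXPANSIONS = new HashMap<>();
    static {
        EXPANSIONS.put("didn't", "did not");
        EXPANSIONS.put("wouldn't", "would not");
        EXPANSIONS.put("you're", "you are");
        EXPANSIONS.put("you'll", "you will");
        EXPANSIONS.put("who's", "who is");
    }

    static String expand(String text) {
        StringBuilder out = new StringBuilder();
        for (String word : text.split(" ")) {
            if (out.length() > 0) out.append(' ');
            // Whole-word match: possessives such as "script's" are not in the table.
            out.append(EXPANSIONS.getOrDefault(word.toLowerCase(), word));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints who is there you are script's
        System.out.println(expand("who's there you're script's"));
    }
}
```

Because each word costs one hash lookup, this stays cheap even for long word lists.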

The code can be written as
Map<String, String> con = new HashMap<String, String>();
con.put("'s", " is");
con.put("'d", " would");
con.put("'re", " are");
con.put("'ll", " will");
con.put("n't", " not");
con.put("'nt", " not");
String str = "where'd you're you'll would'nt hello";
for(String key : con.keySet()) {
str = str.replaceAll(key + "\\b" , con.get(key));
}
using the logic you already have. But suppose the word is a possessive such as script's: changing it to "script is" alters the meaning.
