The following Python code passes ["hello", "world"] into the Universal Sentence Encoder and returns an array of floats denoting their encoded representations.
import tensorflow as tf
import tensorflow_hub as hub
module = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4")
model = tf.keras.Sequential(module)
print("model: ", model(["hello", "world"]))
This code works but I'd now like to do the same thing using the Java API. I've successfully loaded the module, but I am unable to pass inputs into the model and extract the output. Here is what I've got so far:
import org.tensorflow.Graph;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.Tensors;
import org.tensorflow.framework.ConfigProto;
import org.tensorflow.framework.GPUOptions;
import org.tensorflow.framework.GraphDef;
import org.tensorflow.framework.MetaGraphDef;
import org.tensorflow.framework.NodeDef;
import org.tensorflow.util.SaverDef;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
public final class NaiveBayesClassifier
{
public static void main(String[] args)
{
new NaiveBayesClassifier().run();
}
protected SavedModelBundle loadModule(Path source, String... tags) throws IOException
{
return SavedModelBundle.load(source.toAbsolutePath().normalize().toString(), tags);
}
public void run()
{
try (SavedModelBundle module = loadModule(Paths.get("universal-sentence-encoder"), "serve"))
{
Graph graph = module.graph();
try (Session session = new Session(graph, ConfigProto.newBuilder().
setGpuOptions(GPUOptions.newBuilder().setAllowGrowth(true)).
setAllowSoftPlacement(true).
build().toByteArray()))
{
Tensor<String> input = Tensors.create(new byte[][]
{
"hello".getBytes(StandardCharsets.UTF_8),
"world".getBytes(StandardCharsets.UTF_8)
});
List<Tensor<?>> result = session.runner().feed("serving_default_inputs", input).
addTarget("???").run();
}
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
I used https://stackoverflow.com/a/51952478/14731 to scan the model for possible input/output nodes. I believe the input node is "serving_default_inputs" but I can't figure out the output node. More importantly, I don't have to specify any of these values when invoking the code in Python through Keras, so is there a way to do the same using the Java API?
UPDATE: Thanks to roywei I can now confirm that the input node is serving_default_input and the output node is StatefulPartitionedCall_1, but when I plug these names into the aforementioned code I get:
2020-05-22 22:13:52.266287: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:809 : Failed precondition: Table not initialized.
Exception in thread "main" java.lang.IllegalStateException: [_Derived_]{{function_node __inference_pruned_6741}} {{function_node __inference_pruned_6741}} Error while reading resource variable EncoderDNN/DNN/ResidualHidden_0/dense/kernel/part_25 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/EncoderDNN/DNN/ResidualHidden_0/dense/kernel/part_25/class tensorflow::Var does not exist.
[[{{node EncoderDNN/DNN/ResidualHidden_0/dense/kernel/ConcatPartitions/concat/ReadVariableOp_25}}]]
[[StatefulPartitionedCall_1/StatefulPartitionedCall]]
at libtensorflow#1.15.0/org.tensorflow.Session.run(Native Method)
at libtensorflow#1.15.0/org.tensorflow.Session.access$100(Session.java:48)
at libtensorflow#1.15.0/org.tensorflow.Session$Runner.runHelper(Session.java:326)
at libtensorflow#1.15.0/org.tensorflow.Session$Runner.run(Session.java:276)
Meaning, I still cannot invoke the model. What am I missing?
I figured it out after roywei pointed me in the right direction.
I needed to use SavedModelBundle.session() instead of constructing my own instance. This is because the loader initializes the graph variables.
Instead of passing a ConfigProto to the Session constructor, I passed it into the SavedModelBundle loader instead.
I needed to use fetch() instead of addTarget() to retrieve the output tensor.
Here is the working code:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Shape;
import org.tensorflow.Tensor;
import org.tensorflow.TensorFlowException;
import org.tensorflow.Tensors;
import org.tensorflow.framework.ConfigProto;
import org.tensorflow.framework.GPUOptions;
import org.tensorflow.framework.MetaGraphDef;
import org.tensorflow.framework.SignatureDef;
import org.tensorflow.framework.TensorInfo;
import org.tensorflow.framework.TensorShapeProto;
import org.tensorflow.framework.TensorShapeProto.Dim;
public final class NaiveBayesClassifier
{
public static void main(String[] args)
{
new NaiveBayesClassifier().run();
}
public void run()
{
try (SavedModelBundle module = loadModule(Paths.get("universal-sentence-encoder"), "serve"))
{
try (Tensor<String> input = Tensors.create(new byte[][]
{
"hello".getBytes(StandardCharsets.UTF_8),
"world".getBytes(StandardCharsets.UTF_8)
}))
{
MetaGraphDef metadata = MetaGraphDef.parseFrom(module.metaGraphDef());
Map<String, Shape> nameToInput = getInputToShape(metadata);
String firstInput = nameToInput.keySet().iterator().next();
Map<String, Shape> nameToOutput = getOutputToShape(metadata);
String firstOutput = nameToOutput.keySet().iterator().next();
System.out.println("input: " + firstInput);
System.out.println("output: " + firstOutput);
System.out.println();
List<Tensor<?>> result = module.session().runner().feed(firstInput, input).
fetch(firstOutput).run();
for (Tensor<?> tensor : result)
{
long[] shape = tensor.shape();
float[][] array = new float[(int) shape[0]][(int) shape[1]];
tensor.copyTo(array);
System.out.println(Arrays.deepToString(array));
}
}
}
catch (IOException e)
{
e.printStackTrace();
}
}
/**
* Loads a saved model bundle from a directory.
*
* @param source the directory to load the model from
* @param tags the model variant(s) to load
* @return the model bundle
* @throws NullPointerException if any of the arguments are null
* @throws IOException if an error occurs while reading the model
*/
protected SavedModelBundle loadModule(Path source, String... tags) throws IOException
{
// https://stackoverflow.com/a/43526228/14731
try
{
return SavedModelBundle.loader(source.toAbsolutePath().normalize().toString()).
withTags(tags).
withConfigProto(ConfigProto.newBuilder().
setGpuOptions(GPUOptions.newBuilder().setAllowGrowth(true)).
setAllowSoftPlacement(true).
build().toByteArray()).
load();
}
catch (TensorFlowException e)
{
throw new IOException(e);
}
}
/**
* @param metadata the graph metadata
* @return the first signature, or null
*/
private SignatureDef getFirstSignature(MetaGraphDef metadata)
{
Map<String, SignatureDef> nameToSignature = metadata.getSignatureDefMap();
if (nameToSignature.isEmpty())
return null;
return nameToSignature.get(nameToSignature.keySet().iterator().next());
}
/**
* @param metadata the graph metadata
* @return the serving signature, falling back to the first signature if none is named "serving_default"
*/
private SignatureDef getServingSignature(MetaGraphDef metadata)
{
return metadata.getSignatureDefOrDefault("serving_default", getFirstSignature(metadata));
}
/**
* @param metadata the graph metadata
* @return a map from an output name to its shape
*/
protected Map<String, Shape> getOutputToShape(MetaGraphDef metadata)
{
Map<String, Shape> result = new HashMap<>();
SignatureDef servingDefault = getServingSignature(metadata);
for (Map.Entry<String, TensorInfo> entry : servingDefault.getOutputsMap().entrySet())
{
TensorShapeProto shapeProto = entry.getValue().getTensorShape();
List<Dim> dimensions = shapeProto.getDimList();
long firstDimension = dimensions.get(0).getSize();
long[] remainingDimensions = dimensions.stream().skip(1).mapToLong(Dim::getSize).toArray();
Shape shape = Shape.make(firstDimension, remainingDimensions);
result.put(entry.getValue().getName(), shape);
}
return result;
}
/**
* @param metadata the graph metadata
* @return a map from an input name to its shape
*/
protected Map<String, Shape> getInputToShape(MetaGraphDef metadata)
{
Map<String, Shape> result = new HashMap<>();
SignatureDef servingDefault = getServingSignature(metadata);
for (Map.Entry<String, TensorInfo> entry : servingDefault.getInputsMap().entrySet())
{
TensorShapeProto shapeProto = entry.getValue().getTensorShape();
List<Dim> dimensions = shapeProto.getDimList();
long firstDimension = dimensions.get(0).getSize();
long[] remainingDimensions = dimensions.stream().skip(1).mapToLong(Dim::getSize).toArray();
Shape shape = Shape.make(firstDimension, remainingDimensions);
result.put(entry.getValue().getName(), shape);
}
return result;
}
}
There are two ways to get the names:
1) Using Java:
You can read the input and output names from the org.tensorflow.proto.framework.MetaGraphDef stored in the saved model bundle.
Here is an example of how to extract the information:
https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-engine/src/main/java/ai/djl/tensorflow/engine/TfSymbolBlock.java#L149
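For illustration, here is a minimal sketch of that idea using the TF 1.x Java protobuf classes already imported in the question; it assumes module is the SavedModelBundle loaded as shown above and that a "serving_default" signature exists:

// Sketch: list the tensor names behind the "serving_default" signature.
// MetaGraphDef, SignatureDef and TensorInfo come from org.tensorflow.framework.
MetaGraphDef metadata = MetaGraphDef.parseFrom(module.metaGraphDef());
SignatureDef serving = metadata.getSignatureDefOrThrow("serving_default");
serving.getInputsMap().forEach((alias, info) ->
    System.out.println("input:  " + alias + " -> " + info.getName()));
serving.getOutputsMap().forEach((alias, info) ->
    System.out.println("output: " + alias + " -> " + info.getName()));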
2) Using Python:
Load the saved model in TensorFlow Python and print the names:
loaded = tf.saved_model.load("path/to/model/")
print(list(loaded.signatures.keys()))
infer = loaded.signatures["serving_default"]
print(infer.structured_outputs)
I recommend taking a look at the Deep Java Library; it automatically handles the input and output names.
It supports TensorFlow 2.1.0 and allows you to load Keras models as well as TF Hub SavedModels. Take a look at the documentation.
Feel free to open an issue if you have problems loading your model.
You can load the TF model with the Deep Java Library:
System.setProperty("ai.djl.repository.zoo.location", "https://storage.googleapis.com/tfhub-modules/google/universal-sentence-encoder/1.tar.gz?artifact_id=encoder");
Criteria<NDList, NDList> criteria =
Criteria.builder()
.setTypes(NDList.class, NDList.class)
.optArtifactId("ai.djl.localmodelzoo:encoder")
.build();
ZooModel<NDList, NDList> model = ModelZoo.loadModel(criteria);
See https://github.com/awslabs/djl/blob/master/docs/load_model.md#load-model-from-a-url for details.
I need to do the same, but there still seem to be a lot of missing pieces regarding DJL usage. For example, what do I do after this?
ZooModel<NDList, NDList> model = ModelZoo.loadModel(criteria);
I finally found an example in the DJL source code. The key takeaway is not to use NDList for the input/output at all:
Criteria<String[], float[][]> criteria =
Criteria.builder()
.optApplication(Application.NLP.TEXT_EMBEDDING)
.setTypes(String[].class, float[][].class)
.optModelUrls(modelUrl)
.build();
try (ZooModel<String[], float[][]> model = ModelZoo.loadModel(criteria);
Predictor<String[], float[][]> predictor = model.newPredictor()) {
return predictor.predict(inputs.toArray(new String[0]));
}
See https://github.com/awslabs/djl/blob/master/examples/src/main/java/ai/djl/examples/inference/UniversalSentenceEncoder.java for the complete example.
Related
I have a method called printStatements(Model m, Resource s, Property p, Resource o). If I try to find all predicates and objects attached to 'adam', Eclipse throws 'adam cannot be resolved as a variable'. How can I use printStatements to interrogate a subject, predicate or object in the model? Following is the code:
package jenaHelloWorld;
import java.util.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.Property;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.rdf.model.StmtIterator;
import org.apache.jena.util.PrintUtil;
import org.apache.jena.rdf.model.*;
/**
* A small family tree held in a Jena Model
*/
public class FamilyModel {
// Namespace declarations
static final String familyUri = "http://family/";
static final String relationshipUri = "http://purl.org/vocab/relationship/";
// Jena model representing the family
private Model model = ModelFactory.createDefaultModel();
/**
* Creates a model and populates it with family members and their
* relationships
*/
private FamilyModel() {
// Create an empty Model
// Create the types of Property we need to describe relationships
// in the model
Property childOf = model.createProperty(relationshipUri,"childOf");
Property parentOf = model.createProperty(relationshipUri,"parentOf");
Property siblingOf = model.createProperty(relationshipUri,"siblingOf");
Property spouseOf = model.createProperty(relationshipUri,"spouseOf");
// Create resources representing the people in our model
Resource adam = model.createResource(familyUri+"adam");
Resource beth = model.createResource(familyUri+"beth");
Resource chuck = model.createResource(familyUri+"chuck");
Resource dotty = model.createResource(familyUri+"dotty");
Resource edward = model.createResource(familyUri+"edward");
Resource fran = model.createResource(familyUri+"fran");
Resource greg = model.createResource(familyUri+"greg");
Resource harriet = model.createResource(familyUri+"harriet");
// Add properties describing the relationships between them
adam.addProperty(siblingOf,beth);
adam.addProperty(spouseOf,dotty);
adam.addProperty(parentOf,edward);
adam.addProperty(parentOf,fran);
beth.addProperty(siblingOf,adam);
beth.addProperty(spouseOf,chuck);
chuck.addProperty(spouseOf,beth);
dotty.addProperty(spouseOf,adam);
dotty.addProperty(parentOf,edward);
dotty.addProperty(parentOf,fran);
// Statements can also be directly created ...
Statement statement1 = model.createStatement(edward,childOf,adam);
Statement statement2 = model.createStatement(edward,childOf,dotty);
Statement statement3 = model.createStatement(edward,siblingOf,fran);
// ... then added to the model:
model.add(statement1);
model.add(statement2);
model.add(statement3);
// Arrays of Statements can also be added to a Model:
Statement statements[] = new Statement[5];
statements[0] = model.createStatement(fran,childOf,adam);
statements[1] = model.createStatement(fran,childOf,dotty);
statements[2] = model.createStatement(fran,siblingOf,edward);
statements[3] = model.createStatement(fran,spouseOf,greg);
statements[4] = model.createStatement(fran,parentOf,harriet);
model.add(statements);
// A List of Statements can also be added
List list = new ArrayList();
list.add(model.createStatement(greg,spouseOf,fran));
list.add(model.createStatement(greg,parentOf,harriet));
list.add(model.createStatement(harriet,childOf,fran));
list.add(model.createStatement(harriet,childOf,greg));
model.add(list);
}
/**
* Creates a FamilyModel and dumps the content of its RDF representation
*/
public static void main(String args[]) {
// Create a model representing the family
FamilyModel theFamily = new FamilyModel();
// Dump out a String representation of the model
//System.out.println(theFamily.model);
//StmtIterator stmts = theFamily.listStatements( adam, null, (RDFNode) null );
printStatements(theFamily.model, adam,null, null);
}
private static void printStatements(Model m,Resource s,Property p,Resource o) {
for (StmtIterator i = m.listStatements(s,p,o); i.hasNext(); ) {
Statement stmt = i.nextStatement();
System.out.println(PrintUtil.print(stmt));
}
}
}
I tried the following:
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.spi.FileTypeDetector;
import org.apache.tika.Tika;
import org.apache.tika.mime.MimeTypes;
/**
*
* @author kiriti.k
*/
public class TikaFileTypeDetector {
private final Tika tika = new Tika();
public TikaFileTypeDetector() {
super();
}
public String probeContentType(Path path) throws IOException {
// Try to detect based on the file name only for efficiency
String fileNameDetect = tika.detect(path.toString());
if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileNameDetect;
}
// Then check the file content if necessary
String fileContentDetect = tika.detect(path.toFile());
if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileContentDetect;
}
// Specification says to return null if we could not
// conclusively determine the file type
return null;
}
public static void main(String[] args) throws IOException {
Tika tika = new Tika();
// expects file path as the program argument
if (args.length != 1) {
printUsage();
return;
}
Path path = Paths.get(args[0]);
TikaFileTypeDetector detector = new TikaFileTypeDetector();
// Analyse the file - first based on file name for efficiency.
// If cannot determine based on name and then analyse content
String contentType = detector.probeContentType(path);
System.out.println("File is of type - " + contentType);
}
public static void printUsage() {
System.out.print("Usage: java -classpath ... "
+ TikaFileTypeDetector.class.getName()
+ " ");
}
}
The above program is checking based on the file extension only. How do I make it check the content (MIME) type as well and then determine the type? I am using tika-app-1.8.jar in NetBeans 8.0.2. What am I missing?
The code checks the file extension first and returns the MIME type based on that, if it finds a result. If you want it to check the content first, just switch the two statements:
public String probeContentType(Path path) throws IOException {
// Check contents first
String fileContentDetect = tika.detect(path.toFile());
if (!fileContentDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileContentDetect;
}
// Try file name only if content search was not successful
String fileNameDetect = tika.detect(path.toString());
if (!fileNameDetect.equals(MimeTypes.OCTET_STREAM)) {
return fileNameDetect;
}
// Specification says to return null if we could not
// conclusively determine the file type
return null;
}
Be aware that this may have a huge performance impact.
You can use Files.probeContentType(path)
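For completeness, here is a minimal sketch of that approach; note that Files.probeContentType relies on the FileTypeDetector implementations installed on the platform and may return null if the type cannot be determined:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ProbeDemo {
    public static void main(String[] args) throws IOException {
        // Expects the file path as the program argument, like the Tika example above.
        Path path = Paths.get(args[0]);
        String contentType = Files.probeContentType(path);
        System.out.println("File is of type - " + contentType);
    }
}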
I am new to Lucene, and I want to implement auto-suggest, just like Google: when I type a character like 'G', it gives me a list of suggestions (you can try it yourself).
I have searched the whole net. Nobody has done this, although Lucene gives us some new tools in the suggest package.
But I need an example to show me how to do it.
Can anyone help?
I'll give you a pretty complete example that shows you how to use AnalyzingInfixSuggester. In this example we'll pretend that we're Amazon, and we want to autocomplete a product search field. We'll take advantage of features of the Lucene suggestion system to implement the following:
Ranked results: We will suggest the most popular matching products first.
Region-restricted results: We will only suggest products that we sell in the customer's country.
Product photos: We will store product photo URLs in the suggestion index so we can display them in the search results, without having to do an additional database lookup.
First I'll define a simple class to hold information about a product in Product.java:
import java.util.Set;
class Product implements java.io.Serializable
{
String name;
String image;
String[] regions;
int numberSold;
public Product(String name, String image, String[] regions,
int numberSold) {
this.name = name;
this.image = image;
this.regions = regions;
this.numberSold = numberSold;
}
}
To index records with the AnalyzingInfixSuggester's build method, you need to pass it an object that implements the org.apache.lucene.search.suggest.InputIterator interface. An InputIterator gives access to the key, contexts, payload and weight for each record.
The key is the text you actually want to search on and autocomplete against. In our example, it will be the name of the product.
The contexts are a set of additional, arbitrary data that you can use to filter records against. In our example, the contexts are the set of ISO codes for the countries we will ship a particular product to.
The payload is additional arbitrary data you want to store in the index for the record. In this example, we will actually serialize each Product instance and store the resulting bytes as the payload. Then when we later do lookups, we can deserialize the payload and access information in the product instance like the image URL.
The weight is used to order suggestion results; results with a higher weight are returned first. We'll use the number of sales for a given product as its weight.
Here's the contents of ProductIterator.java:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import org.apache.lucene.search.suggest.InputIterator;
import org.apache.lucene.util.BytesRef;
class ProductIterator implements InputIterator
{
private Iterator<Product> productIterator;
private Product currentProduct;
ProductIterator(Iterator<Product> productIterator) {
this.productIterator = productIterator;
}
public boolean hasContexts() {
return true;
}
public boolean hasPayloads() {
return true;
}
public Comparator<BytesRef> getComparator() {
return null;
}
// This method needs to return the key for the record; this is the
// text we'll be autocompleting against.
public BytesRef next() {
if (productIterator.hasNext()) {
currentProduct = productIterator.next();
try {
return new BytesRef(currentProduct.name.getBytes("UTF8"));
} catch (UnsupportedEncodingException e) {
throw new Error("Couldn't convert to UTF-8");
}
} else {
return null;
}
}
// This method returns the payload for the record, which is
// additional data that can be associated with a record and
// returned when we do suggestion lookups. In this example the
// payload is a serialized Java object representing our product.
public BytesRef payload() {
try {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream out = new ObjectOutputStream(bos);
out.writeObject(currentProduct);
out.close();
return new BytesRef(bos.toByteArray());
} catch (IOException e) {
throw new Error("Well that's unfortunate.");
}
}
// This method returns the contexts for the record, which we can
// use to restrict suggestions. In this example we use the
// regions in which a product is sold.
public Set<BytesRef> contexts() {
try {
Set<BytesRef> regions = new HashSet();
for (String region : currentProduct.regions) {
regions.add(new BytesRef(region.getBytes("UTF8")));
}
return regions;
} catch (UnsupportedEncodingException e) {
throw new Error("Couldn't convert to UTF-8");
}
}
// This method helps us order our suggestions. In this example we
// use the number of products of this type that we've sold.
public long weight() {
return currentProduct.numberSold;
}
}
In our driver program, we will do the following things:
Create an index directory in RAM.
Create a StandardTokenizer.
Create an AnalyzingInfixSuggester using the RAM directory and tokenizer.
Index a number of products using ProductIterator.
Print the results of some sample lookups.
Here's the driver program, SuggestProducts.java:
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester;
import org.apache.lucene.search.suggest.Lookup;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;
public class SuggestProducts
{
// Get suggestions given a prefix and a region.
private static void lookup(AnalyzingInfixSuggester suggester, String name,
String region) {
try {
List<Lookup.LookupResult> results;
HashSet<BytesRef> contexts = new HashSet<BytesRef>();
contexts.add(new BytesRef(region.getBytes("UTF8")));
// Do the actual lookup. We ask for the top 2 results.
results = suggester.lookup(name, contexts, 2, true, false);
System.out.println("-- \"" + name + "\" (" + region + "):");
for (Lookup.LookupResult result : results) {
System.out.println(result.key);
Product p = getProduct(result);
if (p != null) {
System.out.println(" image: " + p.image);
System.out.println(" # sold: " + p.numberSold);
}
}
} catch (IOException e) {
System.err.println("Error");
}
}
// Deserialize a Product from a LookupResult payload.
private static Product getProduct(Lookup.LookupResult result)
{
try {
BytesRef payload = result.payload;
if (payload != null) {
ByteArrayInputStream bis = new ByteArrayInputStream(payload.bytes);
ObjectInputStream in = new ObjectInputStream(bis);
Product p = (Product) in.readObject();
return p;
} else {
return null;
}
} catch (IOException|ClassNotFoundException e) {
throw new Error("Could not decode payload :(");
}
}
public static void main(String[] args) {
try {
RAMDirectory index_dir = new RAMDirectory();
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_48);
AnalyzingInfixSuggester suggester = new AnalyzingInfixSuggester(
Version.LUCENE_48, index_dir, analyzer);
// Create our list of products.
ArrayList<Product> products = new ArrayList<Product>();
products.add(
new Product(
"Electric Guitar",
"http://images.example/electric-guitar.jpg",
new String[]{"US", "CA"},
100));
products.add(
new Product(
"Electric Train",
"http://images.example/train.jpg",
new String[]{"US", "CA"},
100));
products.add(
new Product(
"Acoustic Guitar",
"http://images.example/acoustic-guitar.jpg",
new String[]{"US", "ZA"},
80));
products.add(
new Product(
"Guarana Soda",
"http://images.example/soda.jpg",
new String[]{"ZA", "IE"},
130));
// Index the products with the suggester.
suggester.build(new ProductIterator(products.iterator()));
// Do some example lookups.
lookup(suggester, "Gu", "US");
lookup(suggester, "Gu", "ZA");
lookup(suggester, "Gui", "CA");
lookup(suggester, "Electric guit", "US");
} catch (IOException e) {
System.err.println("Error!");
}
}
}
And here is the output from the driver program:
-- "Gu" (US):
Electric Guitar
image: http://images.example/electric-guitar.jpg
# sold: 100
Acoustic Guitar
image: http://images.example/acoustic-guitar.jpg
# sold: 80
-- "Gu" (ZA):
Guarana Soda
image: http://images.example/soda.jpg
# sold: 130
Acoustic Guitar
image: http://images.example/acoustic-guitar.jpg
# sold: 80
-- "Gui" (CA):
Electric Guitar
image: http://images.example/electric-guitar.jpg
# sold: 100
-- "Electric guit" (US):
Electric Guitar
image: http://images.example/electric-guitar.jpg
# sold: 100
Appendix
There's a way to avoid writing a full InputIterator that you might find easier. You can write a stub InputIterator that returns null from its next, payload and contexts methods. Pass an instance of it to AnalyzingInfixSuggester's build method:
suggester.build(new ProductIterator(new ArrayList<Product>().iterator()));
Then for each item you want to index, call the AnalyzingInfixSuggester add method:
suggester.add(text, contexts, weight, payload)
After you've indexed everything, call refresh:
suggester.refresh();
If you're indexing large amounts of data, it's possible to significantly speed up indexing using this method with multiple threads: call build, then use multiple threads to add items, then finally call refresh.
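Here is a rough sketch of that pattern, assuming the same Product and ProductIterator classes as above; the thread count and timeout are illustrative:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester;
import org.apache.lucene.util.BytesRef;

class ConcurrentProductIndexer {
    // Builds the suggester index from the product list using several threads,
    // then refreshes once so the added entries become visible to lookups.
    static void buildConcurrently(final AnalyzingInfixSuggester suggester,
                                  List<Product> products)
            throws IOException, InterruptedException {
        // An empty iterator is enough to initialize the index.
        suggester.build(new ProductIterator(new ArrayList<Product>().iterator()));
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (final Product product : products) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        // Same conversions as ProductIterator performs above.
                        Set<BytesRef> contexts = new HashSet<BytesRef>();
                        for (String region : product.regions) {
                            contexts.add(new BytesRef(region.getBytes("UTF8")));
                        }
                        ByteArrayOutputStream bos = new ByteArrayOutputStream();
                        ObjectOutputStream out = new ObjectOutputStream(bos);
                        out.writeObject(product);
                        out.close();
                        suggester.add(new BytesRef(product.name.getBytes("UTF8")),
                                      contexts,
                                      product.numberSold,
                                      new BytesRef(bos.toByteArray()));
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        // Make the newly added entries visible to lookups.
        suggester.refresh();
    }
}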
[Edited 2015-04-23 to demonstrate deserializing info from the LookupResult payload.]
My goal is to write a Java applet that writes a Word document (fetched from the DB) to a temporary directory on the client machine and opens that document using Jacob.
Through Jacob I need to keep a handle to the opened document, so that after the user closes the document I can save it back to the DB with the changes.
That said, the first thing I want to know is how to capture a close/exit event through Jacob when the user closes/exits the MS Word document. How can I achieve this?
I tried the code below, which is based on the code I saw in this answer: https://stackoverflow.com/a/12332421/3813385 but it only opens the document and does not listen for the close event...
package demo;
import com.jacob.activeX.ActiveXComponent;
import com.jacob.com.Dispatch;
import com.jacob.com.DispatchEvents;
import com.jacob.com.Variant;
public class WordEventTest {
public static void main(String[] args) {
WordEventTest wordEventTest = new WordEventTest();
wordEventTest.execute();
}
public void execute() {
String strDir = "D:\\fabricasw\\workspace\\jacob\\WebContent\\docs\\";
String strInputDoc = strDir + "file_in.doc";
String pid = "Word.Application";
ActiveXComponent axc = new ActiveXComponent(pid);
axc.setProperty("Visible", new Variant(true));
Dispatch oDocuments = axc.getProperty("Documents").toDispatch();
Dispatch oDocument = Dispatch.call(oDocuments, "Open", strInputDoc).toDispatch();
WordEventHandler w = new WordEventHandler();
new DispatchEvents(oDocument, w);
}
public class WordEventHandler {
public void Close(Variant[] arguments) {
System.out.println("closed word document");
}
}
}
I would appreciate it if you could post some Java code showing how: at least how to obtain the contents of a Microsoft Word document and how to detect the application closing event.
For handling events, I got help from this site:
http://danadler.com/jacob/
Here is my solution, which works:
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import com.jacob.com.DispatchEvents;
import com.jacob.com.Variant;
import com.microsoft.word.WordApplication;
import com.microsoft.word.WordDocument;
import com.microsoft.word.WordDocuments;
public class WordEventDemo {
private WordApplication wordApp = null;
private WordDocuments wordDocs = null;
private WordDocument wordDoc = null;
private WordAppEventListener wordAppEventListener = null;
private WordDocEventListener wordDocEventListener = null;
private List<DispatchEvents> dispatchEvents = new ArrayList<DispatchEvents>();
public WordEventDemo() {
}
/**
* Start Word, open the document and register listeners for Word events.
*
* @param filename
* The path of the document (retrieved from the database)
*/
public void start(String filename) throws Exception {
// get document from DB
File fFile = new File(filename); // replace by your code to retrieve file from your DB
// open document
// create WORD instance
wordApp = new WordApplication();
// get document list
wordDocs = wordApp.getDocuments();
Object oFile = fFile.getAbsolutePath();
Object oConversion = new Boolean(false);
Object oReadOnly = new Boolean(false);
wordDoc = wordDocs.Open(oFile, oConversion, oReadOnly);
wordDoc.Activate();
wordApp.setVisible(true);
// register listeners for the word app and document
wordAppEventListener = new WordAppEventListener();
dispatchEvents.add(new DispatchEvents(wordApp, wordAppEventListener));
wordDocEventListener = new WordDocEventListener();
dispatchEvents.add(new DispatchEvents(wordDoc, wordDocEventListener));
}
// This is the event interface for the word application
public class WordAppEventListener {
public WordAppEventListener() {
}
/**
* Triggered when the Word Application is closed.
*/
public void Quit(Variant[] args) {
// Perform operations on "Quit" event
System.out.println("quitting Word!");
}
/**
* Event called by Word Application when it attempt to save a file.<br>
* For Microsoft API reference, see <a
* href="http://msdn.microsoft.com/en-us/library/ff838299%28v=office.14%29.aspx"
* >http://msdn.microsoft.com/en-us/library/ff838299%28v=office.14%29.aspx</a>
*
* @param args
* An array of 3 Variants (WARNING, they are not in the same order indicated in the msdn link):
* [0] <b>Cancel</b> : False when the event occurs. If the event procedure sets this argument to
* True, the document is not saved when the procedure is finished.
* [1] <b>SaveAsUI</b> : True to display the Save As dialog box.
* [2] <b>Doc</b> : The document that is being saved.
*/
public void DocumentBeforeSave(Variant[] args) {
// Perform operations on "DocumentBeforeSave" event
System.out.println("saving Word Document");
}
}
// This is the event interface for a word document
public class WordDocEventListener {
/**
* Triggered when a Word Document is closed.
*
* @param args
*/
public void Close(Variant[] args) {
// Perform operations on "Close" event
System.out.println("closing document");
}
}
}
Then I call it simply like the following:
WordEventDemo fixture = new WordEventDemo();
fixture.start("path/to/file.docx");
// add a waiting mechanism (could be linked to the close or quit event), to make it simple here
Thread.sleep(20000);
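For reference, here is a sketch of one way to replace the fixed sleep with a wait that is actually tied to the Close event; the class below is illustrative and not part of the original solution:

import java.util.concurrent.CountDownLatch;
import com.jacob.com.Variant;

// Hypothetical variant of the document event listener above: it counts down a
// latch when Word fires the Close event, so the caller can wait on that latch
// instead of sleeping for a fixed time.
public class LatchedWordDocListener {
    private final CountDownLatch closed = new CountDownLatch(1);

    // Invoked by Jacob's DispatchEvents when the document is closed.
    public void Close(Variant[] args) {
        System.out.println("closing document");
        closed.countDown();
    }

    // Blocks until the Close event has fired.
    public void awaitClose() throws InterruptedException {
        closed.await();
    }
}

You would register it with new DispatchEvents(wordDoc, listener) just like WordDocEventListener above, and then call listener.awaitClose() instead of Thread.sleep(20000).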
I have the class below; I built it following the examples given on the wiki and in a thesis. Why can't SimpleKMeans handle the data? The class can print the DataSource dados, so there is nothing wrong with processing the file; the error occurs in the build.
package slcct;
import weka.clusterers.ClusterEvaluation;
import weka.clusterers.SimpleKMeans;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
public class Cluster {
public String path;
public Instances dados;
public String[] options = new String[2];
public Cluster(String caminho, int nclusters, int seed ){
this.path = caminho;
this.options[0] = String.valueOf(nclusters);
this.options[1] = String.valueOf(seed);
}
public void ledados() throws Exception{
DataSource source = new DataSource(path);
dados = source.getDataSet();
System.out.println(dados);
if(dados.classIndex()==-1){
dados.setClassIndex(dados.numAttributes()-1);
}
}
public void imprimedados(){
for(int i=0; i<dados.numInstances();i++)
{
Instance actual = dados.instance(i);
System.out.println((i+1) + " : "+ actual);
}
}
public void clustering() throws Exception{
SimpleKMeans cluster = new SimpleKMeans();
cluster.setOptions(options);
cluster.setDisplayStdDevs(true);
cluster.getMaxIterations();
cluster.buildClusterer(dados);
Instances ClusterCenter = cluster.getClusterCentroids();
Instances SDev = cluster.getClusterStandardDevs();
int[] ClusterSize = cluster.getClusterSizes();
ClusterEvaluation eval = new ClusterEvaluation();
eval.setClusterer(cluster);
eval.evaluateClusterer(dados);
for(int i=0;i<ClusterCenter.numInstances();i++){
System.out.println("Cluster#"+( i +1)+ ": "+ClusterSize[i]+" dados .");
System.out.println("CentrĂ³ide:"+ ClusterCenter.instance(i));
System.out.println("STDDEV:" + SDev.instance(i));
System.out.println("Cluster Evaluation:"+eval.clusterResultsToString());
}
}
}
The error:
weka.core.WekaException: weka.clusterers.SimpleKMeans: Cannot handle any class attribute!
at weka.core.Capabilities.test(Capabilities.java:1097)
at weka.core.Capabilities.test(Capabilities.java:1018)
at weka.core.Capabilities.testWithFail(Capabilities.java:1297)
at weka.clusterers.SimpleKMeans.buildClusterer(SimpleKMeans.java:228)
at slcct.Cluster.clustering(Cluster.java:53)//Here.
at slcct.Clustering.jButton1ActionPerformed(Clustering.java:104)
I believe you need not set the class index, as you are doing clustering and not classification. Try following this guide for programmatic Java clustering.
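For reference, here is a minimal sketch of programmatic clustering without setting a class index; the file name, cluster count and seed below are illustrative:

import weka.clusterers.SimpleKMeans;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class KMeansSketch {
    public static void main(String[] args) throws Exception {
        // Load the data set; no class index is set because clustering is
        // unsupervised and SimpleKMeans cannot handle a class attribute.
        Instances data = new DataSource("data.arff").getDataSet();

        SimpleKMeans kmeans = new SimpleKMeans();
        kmeans.setNumClusters(3);
        kmeans.setSeed(10);
        kmeans.buildClusterer(data);

        // Print the cluster assigned to each instance.
        for (int i = 0; i < data.numInstances(); i++) {
            System.out.println((i + 1) + " -> cluster " + kmeans.clusterInstance(data.instance(i)));
        }
    }
}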
In your "ledados()" function just remove the code block given below. It will work. Because you have no defined class in your data.
if(dados.classIndex()==-1){
dados.setClassIndex(dados.numAttributes()-1);
}
Your new function:
public void ledados() throws Exception{
DataSource source = new DataSource(path);
dados = source.getDataSet();
System.out.println(dados);
}
You do not need a class attribute in the data when doing k-means clustering.