How can I copy attribute in java weka? - java

I am using Weka libraries for feature selection problem solution. I have read data as follows : The data set is in arff format
BufferedReader reader = new BufferedReader(new FileReader("D:\\Exp\\golf.arff"));
Instances data = new Instances(reader);
reader.close();
data.setClassIndex(data.numAttributes() - 1);
For example, attributes are Temperature, humidity, Windy, Outlook and the last one is class.
Now I want to store Temperature attribute as instances. The new data will consists only temperature (and it should be instance type because in further processing I have to use methods of instance)

Quick google search brought this:
Instead of picking the features just remove all features that you don't need.
import java.io.File;
import weka.core.Instances;
import weka.core.converters.ArffLoader;
import weka.core.converters.ArffSaver;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.Remove;
public class Convert4 {
public static void main(String[] args) {
// TODO Auto-generated method stub
try
{
ArffLoader loader2= new ArffLoader();
loader2.setSource(new File("C:/Users/RAHUL/Desktop/stack.arff"));
Instances data2= loader2.getDataSet();
//Load Arff
String[] options = new String[2];
options[0] = "-R"; // "range"
options[1] = "1"; // first attribute
Remove remove = new Remove(); // new instance of filter
remove.setOptions(options); // set options
remove.setInputFormat(data2); // inform filter about dataset **AFTER** setting options
Instances newData2 = Filter.useFilter(data2, remove); // apply filter
ArffSaver saver = new ArffSaver();
saver.setInstances(newData2);
saver.setFile(new File("C:/Users/RAHUL/Desktop/stack2.arff"));
saver.writeBatch();
}
catch (Exception e)
{}
}
}
Answer comes from:
How to remove particular attributes from arff file and produce modified arff?

Related

How to combine multiple mp4 videos into one video using jcodec in Java?

I have multiple mp4 files which are parts of a whole mp4 file. They have been just split up to smaller files.
I want to combine those files programmatically into one mp4 file using jcodec in Java. Currently, I'm using the following code to do this:
import org.jcodec.common.DemuxerTrack;
import org.jcodec.common.MuxerTrack;
import org.jcodec.common.io.SeekableByteChannel;
import org.jcodec.common.model.Packet;
import org.jcodec.containers.mp4.Brand;
import org.jcodec.containers.mp4.demuxer.MP4Demuxer;
import org.jcodec.containers.mp4.muxer.MP4Muxer;
import java.io.File;
import java.util.List;
import static org.jcodec.common.io.NIOUtils.readableChannel;
import static org.jcodec.common.io.NIOUtils.writableChannel;
public class Main {
public static void main(String[] args) throws Exception {
File outputFile = new File("file_path_output.mp4");
outputFile.mkdirs();
outputFile.delete();
File[] inputFiles = new File[]{
new File("file_path_1.mp4"),
new File("file_path_2.mp4"),
new File("file_path_3.mp4"),
};
SeekableByteChannel output = writableChannel(outputFile);
MP4Muxer mp4Muxer = MP4Muxer.createMP4Muxer(output, Brand.MP4);
MuxerTrack mixedVideoTrack = null;
MuxerTrack mixedAudioTrack = null;
for (File input : inputFiles) {
SeekableByteChannel byteChannel = readableChannel(input);
MP4Demuxer demuxer = MP4Demuxer.createMP4Demuxer(byteChannel);
List<DemuxerTrack> videoTracks = demuxer.getVideoTracks();
List<DemuxerTrack> audioTracks = demuxer.getAudioTracks();
if (mixedVideoTrack == null) {
mixedVideoTrack = mp4Muxer.addVideoTrack(videoTracks.get(0).getMeta().getCodec(), videoTracks.get(0).getMeta().getVideoCodecMeta());
}
if (mixedAudioTrack == null) {
mixedAudioTrack = mp4Muxer.addAudioTrack(audioTracks.get(0).getMeta().getCodec(), audioTracks.get(0).getMeta().getAudioCodecMeta());
}
for (DemuxerTrack videoTrack : videoTracks) {
Packet packet;
while ((packet = videoTrack.nextFrame()) != null) {
mixedVideoTrack.addFrame(packet);
}
}
for (DemuxerTrack audioTrack : audioTracks) {
Packet packet;
while ((packet = audioTrack.nextFrame()) != null) {
mixedAudioTrack.addFrame(packet);
}
}
demuxer.close();
}
mp4Muxer.finish();
output.close();
}
}
Unfortunately, it does not work as the video generated does not contain audio files nor correct frames. For example, all visual frames I get are like one below:
What is the problem and how can I fix it? Thank you very much.
You can do this with ffmpeg.
First you will need to write the file paths to a txt file
txt file with :
file '/sdcard/video/video1.mp4'
file '/sdcard/video/video2.mp4'
file '/sdcard/video/video3.mp4'
file '/sdcard/video/video4.mp4'
You can use following command.
ffmpeg -f concat -i mp4list.txt -c:v copy -c:a copy merging.mp4
You can write a wrapper over the Java code. Additionally, you can use the code below.
Resource to help : link
String[] cmd = new String[]{"-f", "concat", "-safe", "0", "-i", "mp4list.txt", "-c", "copy", "-preset", "ultrafast", outFile};
ffmpeg.execute(cmd, new ExecuteBinaryResponseHandler(){
#Override
public void onFailure(String message){
}
#Override
public void onSuccess(String message){
}
#Override
public void onProgress(String message){
}
#Override
public void onStart(){
}
#Override
public void onFinish(){
}
}
I don't really know jcodec, but I've used MP4parser in the past. If you need to merge mp4 files, regardless of the library, then I suggest you this one. I'm just pitching it to you because I don't know if you're having problems in merging the files due to the complexity of the library. Might be worth to give it a shot. MP4Parser is quite easy and straightforward. I'll also leave you the link to the maven repository.
https://search.maven.org/artifact/com.googlecode.mp4parser/isoparser/1.1.22/jar
Here there's an implementation I've done to show you how to merge 2 files from the same original video (to replicate your exact case). Basically, you have a Movie class, which represents your mp4 file, containing a List of Track (only 2); one representing the video while the other the audio.
First, I've collected from the movies all the video and audio tracks in two separate arrays.
Then, I've created a List with the merged audio Track and the merged video Track in order to set it to the output file.
Finally, I've built the Movie inside a Container and saved it with a FileOutputStream.
public class Main {
public static void main(String[] args) throws IOException {
MovieCreator mc = new MovieCreator();
Movie movie1 = mc.build("./test1.mp4");
Movie movie2 = mc.build("./test2.mp4");
//Fetching the video tracks from the movies and storing them into an array
Track[] vetTrackVideo = new Track[0];
vetTrackVideo = Stream.of(movie1, movie2)
.flatMap(movie -> movie.getTracks().stream())
.filter(movie -> movie.getHandler().equals("vide"))
.collect(Collectors.toList())
.toArray(vetTrackVideo);
//Fetching the audio tracks from the movies and storing them into an array
Track[] vetTrackAudio = new Track[0];
vetTrackAudio = Stream.of(movie1, movie2)
.flatMap(movie -> movie.getTracks().stream())
.filter(movie -> movie.getHandler().equals("soun"))
.collect(Collectors.toList())
.toArray(vetTrackAudio);
//Creating the output movie by setting a list with both video and audio tracks
Movie movieOutput = new Movie();
List<Track> listTracks = new ArrayList<>(List.of(new AppendTrack(vetTrackVideo), new AppendTrack(vetTrackAudio)));
movieOutput.setTracks(listTracks);
//Building the output movie and storing it into a Container
DefaultMp4Builder mp4Builder = new DefaultMp4Builder();
Container c = mp4Builder.build(movieOutput);
//Writing the output file
FileOutputStream fos = new FileOutputStream("output.mp4");
c.writeContainer(fos.getChannel());
fos.close();
}
}
Here, I'll leave you also a link to my github repository with the test mp4 files so that you can test the merging sample.
https://github.com/daniele-aveta/MP4Merger

java Gherkin parser stream does not release file locks

I am using Gherkin parser to parse feature files and returning the list of Gherkin documents see the function below:
import io.cucumber.gherkin.Gherkin;
import io.cucumber.messages.IdGenerator;
import io.cucumber.messages.Messages;
import io.cucumber.messages.Messages.Envelope;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public class GherkinUtils {
private static final Logger LOG = LogManager.getLogger(GherkinUtils.class);
public static ArrayList<Messages.GherkinDocument> getGherkinDocumentsFromFiles() {
IdGenerator idGenerator = new IdGenerator.Incrementing();
ArrayList<Messages.GherkinDocument> listOfGherkinDocuments = new ArrayList<>();
String pathFolderFrameworkFeatures = SettingsUtils.getPathFolderFrameworkFeatures();
List<String> listOfPathsForFeatureFiles = FileUtils.getAllFilePathsFromFolder(pathFolderFrameworkFeatures);
try (Stream<Envelope> dataStream = Gherkin.fromPaths(listOfPathsForFeatureFiles, false, true, false, idGenerator)){
List<Envelope> envelopes = dataStream.collect(Collectors.toList());
for (Envelope env : envelopes) {
Messages.GherkinDocument gherkinDocument = env.getGherkinDocument();
listOfGherkinDocuments.add(gherkinDocument);
}
} catch (Exception e) {
LOG.error("Error occurred while trying to read the feature files", new Exception(e));
}
FileUtils.renameAllFeatureFiles("b");
return listOfGherkinDocuments;
}
}
Just before the return statement, you can see the function that will update the name for all feature files just to check if they are not locked.
The problem is that only the first file is always renamed and the rest of them are always locked.
If I will place the rename function at the top, then all the files are successfully renamed...
My understanding is that the try statement will automatically close the stream. Also, I tried to close it manually inside the try block but the results are the same.
What am I missing? How can I make it to release the file locks?
Update 1:
This exact line is making the files (except the first one to be locked):
List<Envelope> envelopes = dataStream.collect(Collectors.toList());
Here is the file name update function definition in case you want to test it:
public static void renameAllFeatureFiles(String fileName) {
String pathFeaturesFolder = SettingsUtils.getPathFolderFrameworkFeatures();
List<String> pathList = FileUtils.getAllFilePathsFromFolder(pathFeaturesFolder);
int counter = 0;
for (String path : pathList) {
counter ++;
File file = new File(path);
File newFile = new File(pathFeaturesFolder + "\\" + fileName +counter+".feature");
System.out.println("File: " + path + " locked: " + !file.renameTo(newFile));
}
}
And here is a sample feature file content:
Feature: Test
Scenario: test 1
Given User will do something
And User will do something
Update 2:
Tried with separate thread using javafx Task, still the same issue :(
Except for one file (this is really strange) all files are locked...
public static void runInNewThread() {
// define the execution task that will run in a new thread
Task<Void> newTask = new Task<>() {
#Override
protected Void call() {
ArrayList<Messages.GherkinDocument> listOfGherkinDocuments = GherkinUtils.getGherkinDocumentsFromFiles();
return null;
}
};
// run the task in a new thread
Thread th = new Thread(newTask);
th.setDaemon(true);
th.start();
}
For now, I have used workaround with creating copies of the specific files and using parser on the copies to prevent locking of the original versions...

How to invoke model from TensorFlow Java?

The following python code passes ["hello", "world"] into the universal sentence encoder and returns an array of floats denoting their encoded representation.
import tensorflow as tf
import tensorflow_hub as hub
module = hub.KerasLayer("https://tfhub.dev/google/universal-sentence-encoder/4")
model = tf.keras.Sequential(module)
print("model: ", model(["hello", "world"]))
This code works but I'd now like to do the same thing using the Java API. I've successfully loaded the module, but I am unable to pass inputs into the model and extract the output. Here is what I've got so far:
import org.tensorflow.Graph;
import org.tensorflow.SavedModelBundle;
import org.tensorflow.Session;
import org.tensorflow.Tensor;
import org.tensorflow.Tensors;
import org.tensorflow.framework.ConfigProto;
import org.tensorflow.framework.GPUOptions;
import org.tensorflow.framework.GraphDef;
import org.tensorflow.framework.MetaGraphDef;
import org.tensorflow.framework.NodeDef;
import org.tensorflow.util.SaverDef;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
public final class NaiveBayesClassifier
{
public static void main(String[] args)
{
new NaiveBayesClassifier().run();
}
protected SavedModelBundle loadModule(Path source, String... tags) throws IOException
{
return SavedModelBundle.load(source.toAbsolutePath().normalize().toString(), tags);
}
public void run()
{
try (SavedModelBundle module = loadModule(Paths.get("universal-sentence-encoder"), "serve"))
{
Graph graph = module.graph();
try (Session session = new Session(graph, ConfigProto.newBuilder().
setGpuOptions(GPUOptions.newBuilder().setAllowGrowth(true)).
setAllowSoftPlacement(true).
build().toByteArray()))
{
Tensor<String> input = Tensors.create(new byte[][]
{
"hello".getBytes(StandardCharsets.UTF_8),
"world".getBytes(StandardCharsets.UTF_8)
});
List<Tensor<?>> result = session.runner().feed("serving_default_inputs", input).
addTarget("???").run();
}
}
catch (IOException e)
{
e.printStackTrace();
}
}
}
I used https://stackoverflow.com/a/51952478/14731 to scan the model for possible input/output nodes. I believe the input node is "serving_default_inputs" but I can't figure out the output node. More importantly, I don't have to specify any of these values when invoking the code in python through Keras so is there a way to do the same using the Java API?
UPDATE: Thanks to roywei I can now that confirm the input node is serving_default_input and output node is StatefulPartitionedCall_1 but when I plug these names into the aforementioned code I get:
2020-05-22 22:13:52.266287: W tensorflow/core/framework/op_kernel.cc:1651] OP_REQUIRES failed at lookup_table_op.cc:809 : Failed precondition: Table not initialized.
Exception in thread "main" java.lang.IllegalStateException: [_Derived_]{{function_node __inference_pruned_6741}} {{function_node __inference_pruned_6741}} Error while reading resource variable EncoderDNN/DNN/ResidualHidden_0/dense/kernel/part_25 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/EncoderDNN/DNN/ResidualHidden_0/dense/kernel/part_25/class tensorflow::Var does not exist.
[[{{node EncoderDNN/DNN/ResidualHidden_0/dense/kernel/ConcatPartitions/concat/ReadVariableOp_25}}]]
[[StatefulPartitionedCall_1/StatefulPartitionedCall]]
at libtensorflow#1.15.0/org.tensorflow.Session.run(Native Method)
at libtensorflow#1.15.0/org.tensorflow.Session.access$100(Session.java:48)
at libtensorflow#1.15.0/org.tensorflow.Session$Runner.runHelper(Session.java:326)
at libtensorflow#1.15.0/org.tensorflow.Session$Runner.run(Session.java:276)
Meaning, I still cannot invoke the model. What am I missing?
I figured it out after roywei pointed me in the right direction.
I needed to use SavedModuleBundle.session() instead of constructing my own instance. This is because the loader initializes the graph variables.
Instead of passing a ConfigProto to the Session constructor, I passed it into the SavedModelBundle loader instead.
I needed to use fetch() instead of addTarget() to retrieve the output tensor.
Here is the working code:
public final class NaiveBayesClassifier
{
public static void main(String[] args)
{
new NaiveBayesClassifier().run();
}
public void run()
{
try (SavedModelBundle module = loadModule(Paths.get("universal-sentence-encoder"), "serve"))
{
try (Tensor<String> input = Tensors.create(new byte[][]
{
"hello".getBytes(StandardCharsets.UTF_8),
"world".getBytes(StandardCharsets.UTF_8)
}))
{
MetaGraphDef metadata = MetaGraphDef.parseFrom(module.metaGraphDef());
Map<String, Shape> nameToInput = getInputToShape(metadata);
String firstInput = nameToInput.keySet().iterator().next();
Map<String, Shape> nameToOutput = getOutputToShape(metadata);
String firstOutput = nameToOutput.keySet().iterator().next();
System.out.println("input: " + firstInput);
System.out.println("output: " + firstOutput);
System.out.println();
List<Tensor<?>> result = module.session().runner().feed(firstInput, input).
fetch(firstOutput).run();
for (Tensor<?> tensor : result)
{
{
float[][] array = new float[tensor.numDimensions()][tensor.numElements() /
tensor.numDimensions()];
tensor.copyTo(array);
System.out.println(Arrays.deepToString(array));
}
}
}
}
catch (IOException e)
{
e.printStackTrace();
}
}
/**
* Loads a graph from a file.
*
* #param source the directory containing to load from
* #param tags the model variant(s) to load
* #return the graph
* #throws NullPointerException if any of the arguments are null
* #throws IOException if an error occurs while reading the file
*/
protected SavedModelBundle loadModule(Path source, String... tags) throws IOException
{
// https://stackoverflow.com/a/43526228/14731
try
{
return SavedModelBundle.loader(source.toAbsolutePath().normalize().toString()).
withTags(tags).
withConfigProto(ConfigProto.newBuilder().
setGpuOptions(GPUOptions.newBuilder().setAllowGrowth(true)).
setAllowSoftPlacement(true).
build().toByteArray()).
load();
}
catch (TensorFlowException e)
{
throw new IOException(e);
}
}
/**
* #param metadata the graph metadata
* #return the first signature, or null
*/
private SignatureDef getFirstSignature(MetaGraphDef metadata)
{
Map<String, SignatureDef> nameToSignature = metadata.getSignatureDefMap();
if (nameToSignature.isEmpty())
return null;
return nameToSignature.get(nameToSignature.keySet().iterator().next());
}
/**
* #param metadata the graph metadata
* #return the output signature
*/
private SignatureDef getServingSignature(MetaGraphDef metadata)
{
return metadata.getSignatureDefOrDefault("serving_default", getFirstSignature(metadata));
}
/**
* #param metadata the graph metadata
* #return a map from an output name to its shape
*/
protected Map<String, Shape> getOutputToShape(MetaGraphDef metadata)
{
Map<String, Shape> result = new HashMap<>();
SignatureDef servingDefault = getServingSignature(metadata);
for (Map.Entry<String, TensorInfo> entry : servingDefault.getOutputsMap().entrySet())
{
TensorShapeProto shapeProto = entry.getValue().getTensorShape();
List<Dim> dimensions = shapeProto.getDimList();
long firstDimension = dimensions.get(0).getSize();
long[] remainingDimensions = dimensions.stream().skip(1).mapToLong(Dim::getSize).toArray();
Shape shape = Shape.make(firstDimension, remainingDimensions);
result.put(entry.getValue().getName(), shape);
}
return result;
}
/**
* #param metadata the graph metadata
* #return a map from an input name to its shape
*/
protected Map<String, Shape> getInputToShape(MetaGraphDef metadata)
{
Map<String, Shape> result = new HashMap<>();
SignatureDef servingDefault = getServingSignature(metadata);
for (Map.Entry<String, TensorInfo> entry : servingDefault.getInputsMap().entrySet())
{
TensorShapeProto shapeProto = entry.getValue().getTensorShape();
List<Dim> dimensions = shapeProto.getDimList();
long firstDimension = dimensions.get(0).getSize();
long[] remainingDimensions = dimensions.stream().skip(1).mapToLong(Dim::getSize).toArray();
Shape shape = Shape.make(firstDimension, remainingDimensions);
result.put(entry.getValue().getName(), shape);
}
return result;
}
}
There are two ways to get the names:
1) Using Java:
You can read the input and output names from the org.tensorflow.proto.framework.MetaGraphDef stored in saved model bundle.
Here is an example on how to extract the information:
https://github.com/awslabs/djl/blob/master/tensorflow/tensorflow-engine/src/main/java/ai/djl/tensorflow/engine/TfSymbolBlock.java#L149
2) Using python:
load the saved model in tensorflow python and print the names
loaded = tf.saved_model.load("path/to/model/")
print(list(loaded.signatures.keys()))
infer = loaded.signatures["serving_default"]
print(infer.structured_outputs)
I recommend to take a look at Deep Java Library, it automatically handle the input, output names.
It supports TensorFlow 2.1.0 and allows you to load Keras models as well as TF Hub Saved Model. Take a look at the documentation here and here
Feel free to open an issue if you have problem loading your model.
You can load TF model with Deep Java Library
System.setProperty("ai.djl.repository.zoo.location", "https://storage.googleapis.com/tfhub-modules/google/universal-sentence-encoder/1.tar.gz?artifact_id=encoder");
Criteria.Builder<NDList, NDList> builder =
Criteria.builder()
.setTypes(NDList.class, NDList.class)
.optArtifactId("ai.djl.localmodelzoo:encoder")
.build();
ZooModel<NDList, NDList> model = ModelZoo.loadModel(criteria);
See https://github.com/awslabs/djl/blob/master/docs/load_model.md#load-model-from-a-url for detail
I need to do the same, but seems still lots of missing pieces RE DJL usage. E.g., what to do after this?:
ZooModel<NDList, NDList> model = ModelZoo.loadModel(criteria);
I finally found an example in the DJL source code. The key take-away is to not use NDList for the input/output at all:
Criteria<String[], float[][]> criteria =
Criteria.builder()
.optApplication(Application.NLP.TEXT_EMBEDDING)
.setTypes(String[].class, float[][].class)
.optModelUrls(modelUrl)
.build();
try (ZooModel<String[], float[][]> model = ModelZoo.loadModel(criteria);
Predictor<String[], float[][]> predictor = model.newPredictor()) {
return predictor.predict(inputs.toArray(new String[0]));
}
See https://github.com/awslabs/djl/blob/master/examples/src/main/java/ai/djl/examples/inference/UniversalSentenceEncoder.java for the complete example.

how to fetch public data and images from flicker using java?

I am trying to get the Flickr data by using API key and secret key provided by flickr . I have written a java code for it.
`package com.flickr.project;
import java.io.IOException;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import com.aetrion.flickr.Flickr;
import com.aetrion.flickr.FlickrException;
import com.aetrion.flickr.REST;
import com.aetrion.flickr.auth.Permission;
import com.aetrion.flickr.photos.SearchParameters;
import com.aetrion.flickr.photos.PhotoList;
import com.aetrion.flickr.photos.PhotosInterface;
import com.aetrion.flickr.photos.Photo;
import com.flickr4java.flickr.Auth;
import com.flickr4java.flickr.RequestContext;
public class SampleProgram{
public static void main(String[] args) {
try {
searchImages();
} catch (Exception e) {
e.printStackTrace();
}
}
public static void searchImages() {
// Search photos with tag keywords and get result
try{
//Set api key
String key="";
String svr="www.flickr.com";
String secret="";
RequestContext requestContext = RequestContext.getRequestContext();
Auth auth = new Auth();
auth.setPermission(Permission.READ);
auth.setToken("");
auth.setTokenSecret("");
requestContext.setAuth(auth);
REST rest=new REST();
rest.setHost(svr);
//initialize Flickr object with key and rest
Flickr flickr=new Flickr(key,secret,rest);
Flickr.debugRequest = false;
Flickr.debugStream = false;
Flickr.debugStream=false;
//initialize SearchParameter object, this object stores the search keyword
SearchParameters searchParams=new SearchParameters();
searchParams.setSort(SearchParameters.INTERESTINGNESS_DESC);
//Create tag keyword array
String[] tags=new String[]{"Dog","Doberman"};
searchParams.setTags(tags);
//Initialize PhotosInterface object
PhotosInterface photosInterface=flickr.getPhotosInterface();
//Execute search with entered tags
// PhotoList photoList=null;
PhotoList photoList=photosInterface.search(searchParams,20,1);
System.out.println("here");
//get search result and fetch the photo object and get small square imag's url
if(photoList!=null){
//Get search result and check the size of photo result
for(int i=0;i<photoList.size();i++){
//get photo object
Photo photo=(Photo)photoList.get(i);
//Get small square url photo
StringBuffer strBuf=new StringBuffer();
strBuf.append("<a href=\"\">");
strBuf.append("<img border=\"0\" src=\""+photo.getSmallSquareUrl()+"\">");
strBuf.append("</a>\n");
// ....
}
}
}catch(Exception e){
e.printStackTrace();
}
}
public void userAuthentication(){
/*InputStream in = null;
try {
in = getClass().getResourceAsStream("/setup.properties");
// properties = new Properties();
// properties.load(in);
} finally {
IOUtilities.close(in);
}
f = new Flickr(properties.getProperty("apiKey"), properties.getProperty("secret"), new REST());
requestContext = RequestContext.getRequestContext();
Auth auth = new Auth();
auth.setPermission(Permission.READ);
auth.setToken(properties.getProperty("token"));
auth.setTokenSecret(properties.getProperty("tokensecret"));
requestContext.setAuth(auth);
Flickr.debugRequest = false;
Flickr.debugStream = false;*/
}
} `
I need to fetch all data including images from flickr using the key words i mentioned in the program.
From the look of things, Photo has several methods like the following:
"getSmallSquareAsInputStream" - This returns an input stream, this can be used to fetch the image data.
The full API's list for Photo class is available here

How to implement auto suggest using Lucene's new AnalyzingInfixSuggester API?

I am a greenhand on Lucene, and I want to implement auto suggest, just like google, when I input a character like 'G', it would give me a list, you can try your self.
I have searched on the whole net.
Nobody has done this , and it gives us some new tools in package suggest
But i need an example to tell me how to do that
Is there anyone can help ?
I'll give you a pretty complete example that shows you how to use AnalyzingInfixSuggester. In this example we'll pretend that we're Amazon, and we want to autocomplete a product search field. We'll take advantage of features of the Lucene suggestion system to implement the following:
Ranked results: We will suggest the most popular matching products first.
Region-restricted results: We will only suggest products that we sell in the customer's country.
Product photos: We will store product photo URLs in the suggestion index so we can display them in the search results, without having to do an additional database lookup.
First I'll define a simple class to hold information about a product in Product.java:
import java.util.Set;
class Product implements java.io.Serializable
{
String name;
String image;
String[] regions;
int numberSold;
public Product(String name, String image, String[] regions,
int numberSold) {
this.name = name;
this.image = image;
this.regions = regions;
this.numberSold = numberSold;
}
}
To index records in with the AnalyzingInfixSuggester's build method you need to pass it an object that implements the org.apache.lucene.search.suggest.InputIterator interface. An InputIterator gives access to the key, contexts, payload and weight for each record.
The key is the text you actually want to search on and autocomplete against. In our example, it will be the name of the product.
The contexts are a set of additional, arbitrary data that you can use to filter records against. In our example, the contexts are the set of ISO codes for the countries we will ship a particular product to.
The payload is additional arbitrary data you want to store in the index for the record. In this example, we will actually serialize each Product instance and store the resulting bytes as the payload. Then when we later do lookups, we can deserialize the payload and access information in the product instance like the image URL.
The weight is used to order suggestion results; results with a higher weight are returned first. We'll use the number of sales for a given product as its weight.
Here's the contents of ProductIterator.java:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UnsupportedEncodingException;
import java.util.Comparator;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import org.apache.lucene.search.suggest.InputIterator;
import org.apache.lucene.util.BytesRef;
class ProductIterator implements InputIterator
{
private Iterator<Product> productIterator;
private Product currentProduct;
ProductIterator(Iterator<Product> productIterator) {
this.productIterator = productIterator;
}
public boolean hasContexts() {
return true;
}
public boolean hasPayloads() {
return true;
}
public Comparator<BytesRef> getComparator() {
return null;
}
// This method needs to return the key for the record; this is the
// text we'll be autocompleting against.
public BytesRef next() {
if (productIterator.hasNext()) {
currentProduct = productIterator.next();
try {
return new BytesRef(currentProduct.name.getBytes("UTF8"));
} catch (UnsupportedEncodingException e) {
throw new Error("Couldn't convert to UTF-8");
}
} else {
return null;
}
}
// This method returns the payload for the record, which is
// additional data that can be associated with a record and
// returned when we do suggestion lookups. In this example the
// payload is a serialized Java object representing our product.
public BytesRef payload() {
try {
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream out = new ObjectOutputStream(bos);
out.writeObject(currentProduct);
out.close();
return new BytesRef(bos.toByteArray());
} catch (IOException e) {
throw new Error("Well that's unfortunate.");
}
}
// This method returns the contexts for the record, which we can
// use to restrict suggestions. In this example we use the
// regions in which a product is sold.
public Set<BytesRef> contexts() {
try {
Set<BytesRef> regions = new HashSet();
for (String region : currentProduct.regions) {
regions.add(new BytesRef(region.getBytes("UTF8")));
}
return regions;
} catch (UnsupportedEncodingException e) {
throw new Error("Couldn't convert to UTF-8");
}
}
// This method helps us order our suggestions. In this example we
// use the number of products of this type that we've sold.
public long weight() {
return currentProduct.numberSold;
}
}
In our driver program, we will do the following things:
Create an index directory in RAM.
Create a StandardTokenizer.
Create an AnalyzingInfixSuggester using the RAM directory and tokenizer.
Index a number of products using ProductIterator.
Print the results of some sample lookups.
Here's the driver program, SuggestProducts.java:
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester;
import org.apache.lucene.search.suggest.Lookup;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.Version;
public class SuggestProducts
{
// Get suggestions given a prefix and a region.
private static void lookup(AnalyzingInfixSuggester suggester, String name,
String region) {
try {
List<Lookup.LookupResult> results;
HashSet<BytesRef> contexts = new HashSet<BytesRef>();
contexts.add(new BytesRef(region.getBytes("UTF8")));
// Do the actual lookup. We ask for the top 2 results.
results = suggester.lookup(name, contexts, 2, true, false);
System.out.println("-- \"" + name + "\" (" + region + "):");
for (Lookup.LookupResult result : results) {
System.out.println(result.key);
Product p = getProduct(result);
if (p != null) {
System.out.println(" image: " + p.image);
System.out.println(" # sold: " + p.numberSold);
}
}
} catch (IOException e) {
System.err.println("Error");
}
}
// Deserialize a Product from a LookupResult payload.
private static Product getProduct(Lookup.LookupResult result)
{
try {
BytesRef payload = result.payload;
if (payload != null) {
ByteArrayInputStream bis = new ByteArrayInputStream(payload.bytes);
ObjectInputStream in = new ObjectInputStream(bis);
Product p = (Product) in.readObject();
return p;
} else {
return null;
}
} catch (IOException|ClassNotFoundException e) {
throw new Error("Could not decode payload :(");
}
}
public static void main(String[] args) {
try {
RAMDirectory index_dir = new RAMDirectory();
StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_48);
AnalyzingInfixSuggester suggester = new AnalyzingInfixSuggester(
Version.LUCENE_48, index_dir, analyzer);
// Create our list of products.
ArrayList<Product> products = new ArrayList<Product>();
products.add(
new Product(
"Electric Guitar",
"http://images.example/electric-guitar.jpg",
new String[]{"US", "CA"},
100));
products.add(
new Product(
"Electric Train",
"http://images.example/train.jpg",
new String[]{"US", "CA"},
100));
products.add(
new Product(
"Acoustic Guitar",
"http://images.example/acoustic-guitar.jpg",
new String[]{"US", "ZA"},
80));
products.add(
new Product(
"Guarana Soda",
"http://images.example/soda.jpg",
new String[]{"ZA", "IE"},
130));
// Index the products with the suggester.
suggester.build(new ProductIterator(products.iterator()));
// Do some example lookups.
lookup(suggester, "Gu", "US");
lookup(suggester, "Gu", "ZA");
lookup(suggester, "Gui", "CA");
lookup(suggester, "Electric guit", "US");
} catch (IOException e) {
System.err.println("Error!");
}
}
}
And here is the output from the driver program:
-- "Gu" (US):
Electric Guitar
image: http://images.example/electric-guitar.jpg
# sold: 100
Acoustic Guitar
image: http://images.example/acoustic-guitar.jpg
# sold: 80
-- "Gu" (ZA):
Guarana Soda
image: http://images.example/soda.jpg
# sold: 130
Acoustic Guitar
image: http://images.example/acoustic-guitar.jpg
# sold: 80
-- "Gui" (CA):
Electric Guitar
image: http://images.example/electric-guitar.jpg
# sold: 100
-- "Electric guit" (US):
Electric Guitar
image: http://images.example/electric-guitar.jpg
# sold: 100
Appendix
There's a way to avoid writing a full InputIterator that you might find easier. You can write a stub InputIterator that returns null from its next, payload and contexts methods. Pass an instance of it to AnalyzingInfixSuggester's build method:
suggester.build(new ProductIterator(new ArrayList<Product>().iterator()));
Then for each item you want to index, call the AnalyzingInfixSuggester add method:
suggester.add(text, contexts, weight, payload)
After you've indexed everything, call refresh:
suggester.refresh();
If you're indexing large amounts of data, it's possible to significantly speedup indexing using this method with multiple threads: Call build, then use multiple threads to add items, then finally call refresh.
[Edited 2015-04-23 to demonstrate deserializing info from the LookupResult payload.]

Categories