I have the following document in my database:
_id: ObjectId('63a73aec1afb1e4de760d9de')
uuid: "71e5db4e-ab05-4de2-9238-5660474c5156"
coins: 0
level: 1
currentXp: 1
upgrades: Object
    durability: 0
    luck: 1
Now I want to read the data from that embedded object. I tried to get the durability int like this:
public static int getDurabilityLevel(UUID uuid) {
    Document filter = new Document("uuid", uuid.toString());
    int durabilityLevel = Main.getInstance().getDataConnection().getCollection().find(filter).first().getInteger("upgrades.durability");
    return durabilityLevel;
}
I also want to change the value of the luck integer. But if I try to change it, the durability integer disappears. I used this to change the value:
public static void setLuckLevel(UUID uuid, int level) {
    Document filter = new Document("uuid", uuid.toString());
    Document foundDocument = Main.getInstance().getDataConnection().getCollection().find(filter).first();
    if (foundDocument != null) {
        Document updateValue = new Document("Upgrades", new Document("luck", level));
        Document updateOperation = new Document("$set", updateValue);
        Main.getInstance().getDataConnection().getCollection().updateOne(foundDocument, updateOperation);
    }
}
I hope someone can help me with this simple problem. Thanks!
I was able to fix the problems myself. Here are my solutions:
This is my way to get data from the object:
public static int getDurabilityLevel(UUID uuid) {
    Document filter = new Document("uuid", uuid.toString());
    Document document = Main.getInstance().getDataConnection().getCollection().find(filter).first();
    Document object = (Document) document.get("upgrades");
    int durabilityLevel = object.getInteger("durability");
    return durabilityLevel;
}
And this is my way to change data in the object without deleting the other values. The durability field disappeared because $set with a whole subdocument replaces the entire upgrades object; using dot notation updates only the one nested field:
public static void setDurabilityLevel(UUID uuid, int level) {
    Document filter = new Document("uuid", uuid.toString());
    Document foundDocument = Main.getInstance().getDataConnection().getCollection().find(filter).first();
    if (foundDocument != null) {
        Document updateValue = new Document("upgrades.durability", level);
        Document updateOperation = new Document("$set", updateValue);
        Main.getInstance().getDataConnection().getCollection().updateOne(foundDocument, updateOperation);
    }
}
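For reference, the same read and update can also be written with the driver's Filters and Updates helpers, which avoids fetching the document before updating it. A minimal sketch against the com.mongodb.client API (the UpgradeData wrapper class is just an illustration, not part of the original code):

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

import java.util.UUID;

public class UpgradeData {
    private final MongoCollection<Document> collection;

    public UpgradeData(MongoCollection<Document> collection) {
        this.collection = collection;
    }

    public int getDurabilityLevel(UUID uuid) {
        Document doc = collection.find(Filters.eq("uuid", uuid.toString())).first();
        if (doc == null) {
            return 0; // assumed default when the player document is missing
        }
        Document upgrades = doc.get("upgrades", Document.class);
        return upgrades == null ? 0 : upgrades.getInteger("durability", 0);
    }

    public void setLuckLevel(UUID uuid, int level) {
        // Dot notation updates only the nested field and leaves its siblings intact
        collection.updateOne(Filters.eq("uuid", uuid.toString()),
                Updates.set("upgrades.luck", level));
    }
}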
I want to loop this code; on every loop iteration the data is saved to the document variable. How can I add more data to the document? I have a problem when the loop runs more than once. Can you give me an idea how to do it? Thank you.
private Document getProcessInstances(String status, int page, int size, String sort) {
    StringBuilder url = new StringBuilder();
    Document processinstancelist = null;
    Integer totalItems = this.getTotalItems(status, page, size, sort);
    Integer totalPages = totalItems / size;
    try {
        while (page <= totalPages) {
            url.append(activitiqueryhost).append("/v1/process-instances?status=").append(status).append("&page=").append(page).append("&size=").append(size).append("&sort=").append(sort);
            ResponseEntity<String> processinstancestring = this.get(url.toString());
            Document processinstance = Document.parse(processinstancestring.getBody());
            processinstancelist = (Document) processinstance.get("list");
            System.out.println("==== totalItems " + totalItems);
            System.out.println("==== totalPages " + totalPages);
            System.out.println("==== current page " + page);
            page++;
        }
        return processinstancelist;
    } catch (Exception e) {
        return null;
    }
}
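One idea: the code reuses a single StringBuilder, so each iteration appends a second URL onto the first, and processinstancelist is overwritten each time, so only the last page survives. A sketch that builds the URL fresh per iteration and collects every page's list (activitiqueryhost, get(...) and getTotalItems(...) are the same members used above):

private List<Document> getProcessInstances(String status, int page, int size, String sort) {
    List<Document> pages = new ArrayList<Document>();
    Integer totalItems = this.getTotalItems(status, page, size, sort);
    Integer totalPages = totalItems / size;
    try {
        while (page <= totalPages) {
            // Build the URL fresh each iteration instead of appending to the old one
            String url = activitiqueryhost + "/v1/process-instances?status=" + status
                    + "&page=" + page + "&size=" + size + "&sort=" + sort;
            ResponseEntity<String> response = this.get(url);
            Document processinstance = Document.parse(response.getBody());
            // Collect each page's list instead of overwriting a single variable
            pages.add((Document) processinstance.get("list"));
            page++;
        }
        return pages;
    } catch (Exception e) {
        return null;
    }
}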
I have a MongoDB field at the path main.inner.leaf, and any field along the path may be absent.
To avoid a NullPointerException in Java, I have to write:
String leaf = "";
if (document.get("main") != null &&
document.get("main", Document.class).get("inner") != null) {
leaf = document.get("main", Document.class)
.get("inner", Document.class).getString("leaf");
}
In this simple example I set only 3 levels (main, inner and leaf), but my documents are deeper.
So is there a way to avoid writing all these null checks?
Like this:
String leaf = document.getString("main.inner.leaf", "");
// "" is the deafult value if one of the levels doesn't exist
Or using a third party library:
String leaf = DocumentUtils.getNullCheck("main.inner.leaf", "", document);
Many thanks.
Since the intermediate attributes are optional, you really do have to access the leaf value in a null-safe manner.
You could do this yourself using an approach like ...
if (document.containsKey("main")) {
    Document _main = document.get("main", Document.class);
    if (_main.containsKey("inner")) {
        Document _inner = _main.get("inner", Document.class);
        if (_inner.containsKey("leaf")) {
            leafValue = _inner.getString("leaf");
        }
    }
}
Note: this could be wrapped up in a utility to make it more user-friendly.
Or use a third-party library such as Commons BeanUtils.
But you cannot avoid null-safe checks, since the document structure is such that the intermediate levels might be null. All you can do is ease the burden of handling null safety.
Here's an example test case showing both approaches:
@Test
public void readNestedDocumentsWithNullSafety() throws IllegalAccessException, NoSuchMethodException, InvocationTargetException {
    Document inner = new Document("leaf", "leafValue");
    Document main = new Document("inner", inner);
    Document fullyPopulatedDoc = new Document("main", main);
    assertThat(extractLeafValueManually(fullyPopulatedDoc), is("leafValue"));
    assertThat(extractLeafValueUsingThirdPartyLibrary(fullyPopulatedDoc, "main.inner.leaf", ""), is("leafValue"));

    Document emptyPopulatedDoc = new Document();
    assertThat(extractLeafValueManually(emptyPopulatedDoc), is(""));
    assertThat(extractLeafValueUsingThirdPartyLibrary(emptyPopulatedDoc, "main.inner.leaf", ""), is(""));

    Document emptyInner = new Document();
    Document partiallyPopulatedMain = new Document("inner", emptyInner);
    Document partiallyPopulatedDoc = new Document("main", partiallyPopulatedMain);
    assertThat(extractLeafValueManually(partiallyPopulatedDoc), is(""));
    assertThat(extractLeafValueUsingThirdPartyLibrary(partiallyPopulatedDoc, "main.inner.leaf", ""), is(""));
}

private String extractLeafValueUsingThirdPartyLibrary(Document document, String path, String defaultValue) {
    try {
        Object value = PropertyUtils.getNestedProperty(document, path);
        return value == null ? defaultValue : value.toString();
    } catch (Exception ex) {
        return defaultValue;
    }
}

private String extractLeafValueManually(Document document) {
    Document inner = getOrDefault(getOrDefault(document, "main"), "inner");
    return inner.get("leaf", "");
}

private Document getOrDefault(Document document, String key) {
    if (document.containsKey(key)) {
        return document.get(key, Document.class);
    } else {
        return new Document();
    }
}
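If your driver version is recent enough, org.bson.Document also has a getEmbedded method that walks a dotted path and falls back to a default value, which is very close to what the question asked for. A one-line sketch (check that your driver version includes getEmbedded before relying on it):

String leaf = document.getEmbedded(java.util.Arrays.asList("main", "inner", "leaf"), "");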
I'm working with the Lucene library, and I have found the required documents after executing a BooleanQuery.
I loop over the hits in the searcher, and each time I would like to put the Document into a HashMap.
int docId = hits[i].doc;
Document doc = searcher.doc(docId);
HashMap<String, String> X = new HashMap<String, String>();
Now I want to know how to fill the HashMap X with each field's name and value from the document.
You can iterate over document fields like this:
for (IndexableField field : doc.getFields()) {
    X.put(field.name(), field.stringValue());
}
But it will only work for fields that are stored in the index (those added with the Field.Store.YES flag). Also, if a field has several values in the same document, this code has to be modified.
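For example, a small sketch that handles multi-valued fields by collecting every stored value per field name:

Map<String, List<String>> valuesByField = new HashMap<String, List<String>>();
for (IndexableField field : doc.getFields()) {
    List<String> values = valuesByField.get(field.name());
    if (values == null) {
        values = new ArrayList<String>();
        valuesByField.put(field.name(), values);
    }
    // stringValue() is null for non-string (e.g. binary) fields
    if (field.stringValue() != null) {
        values.add(field.stringValue());
    }
}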
You could extend Lucene's Collector and then add the documents to the map however you want.
final IndexSearcher searcher = new IndexSearcher(indexReader);
final Map<String, String> docs = new HashMap<String, String>();
searcher.search(query, new Collector() {
    private int docBase;

    // ignore scorer
    public void setScorer(Scorer scorer) {
    }

    // accept docs out of order (for a BitSet it doesn't matter)
    public boolean acceptsDocsOutOfOrder() {
        return true;
    }

    public void collect(int docNum) throws IOException {
        // translate the segment-local doc number to a global one
        Document luceneDoc = searcher.doc(docNum + docBase);
        docs.put(luceneDoc.get(name_Field), luceneDoc.get(value_Field));
    }

    public void setNextReader(AtomicReaderContext context) {
        this.docBase = context.docBase;
    }
});
I'm trying to create a term-document matrix for a small corpus to experiment further with LSI. However, I couldn't find a way to do it with Lucene 4.4.
I know how to get the TermVector for each document, as follows:
//create boolean query to search for a specific document (not shown)
TopDocs hits = searcher.search(query, 1);
Terms termVector = reader.getTermVector(hits.scoreDocs[0].doc, "contents");
System.out.println(termVector.size()); //just testing
I thought I could just join all the term vectors together as columns of a matrix. However, the term vectors of different documents have different sizes, and I don't know how to pad zeros into them, so this method certainly does not work.
Hence, I wonder if someone can show me how to create a term-document matrix with Lucene 4.4 (if possible, with sample code)?
If Lucene does not support this, what other way would you recommend?
Many thanks,
I found the solution to my problem here. A very detailed example is given by Mr. Sujit, although the code is written for an older version of Lucene, so many things have to be changed. I'll update the details when I finish my code.
Here is my solution, which works on Lucene 4.4:
public class BuildTermDocumentMatrix {

    private final IndexReader reader;
    private final IndexSearcher searcher;
    private final File corpus;
    private final Map<String, Integer> termIdMap;

    public BuildTermDocumentMatrix(File index, File corpus) throws IOException {
        reader = DirectoryReader.open(FSDirectory.open(index));
        searcher = new IndexSearcher(reader);
        this.corpus = corpus;
        termIdMap = computeTermIdMap(reader);
    }

    /**
     * Map each term to a fixed integer so that we can build the document matrix later.
     * It is used to assign a term to a specific row of the term-document matrix.
     */
    private Map<String, Integer> computeTermIdMap(IndexReader reader) throws IOException {
        Map<String, Integer> termIdMap = new HashMap<String, Integer>();
        int id = 0;
        Fields fields = MultiFields.getFields(reader);
        Terms terms = fields.terms("contents");
        TermsEnum itr = terms.iterator(null);
        BytesRef term = null;
        while ((term = itr.next()) != null) {
            String termText = term.utf8ToString();
            if (termIdMap.containsKey(termText))
                continue;
            termIdMap.put(termText, id++);
        }
        return termIdMap;
    }

    /**
     * Build the term-document matrix for the given directory.
     */
    public RealMatrix buildTermDocumentMatrix() throws IOException {
        // iterate through the directory to work with each doc
        int col = 0;
        int numDocs = countDocs(corpus);    // number of documents
        int numTerms = termIdMap.size();    // total number of terms
        RealMatrix tdMatrix = new Array2DRowRealMatrix(numTerms, numDocs);

        for (File f : corpus.listFiles()) {
            if (!f.isHidden() && f.canRead()) {
                // I build the term-document matrix for a subset of the corpus, so
                // I need to look up each document by path name.
                // If you build it for the whole corpus, just iterate through all documents.
                String path = f.getPath();
                BooleanQuery pathQuery = new BooleanQuery();
                pathQuery.add(new TermQuery(new Term("path", path)), BooleanClause.Occur.SHOULD);
                TopDocs hits = searcher.search(pathQuery, 1);

                // get the term vector
                Terms termVector = reader.getTermVector(hits.scoreDocs[0].doc, "contents");
                TermsEnum itr = termVector.iterator(null);
                BytesRef term = null;

                // compute each term's weight
                while ((term = itr.next()) != null) {
                    String termText = term.utf8ToString();
                    int row = termIdMap.get(termText);
                    long termFreq = itr.totalTermFreq();
                    long docCount = itr.docFreq();
                    double weight = computeTfIdfWeight(termFreq, docCount, numDocs);
                    tdMatrix.setEntry(row, col, weight);
                }
                col++;
            }
        }
        return tdMatrix;
    }
}
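The countDocs and computeTfIdfWeight helpers are not shown above. A hypothetical computeTfIdfWeight, assuming a standard tf-idf weighting (this is my guess at a formula, not the author's actual implementation):

// Hypothetical helper, not from the original post: standard tf-idf,
// (1 + log tf) * log(N / (df + 1)), with a guard for zero frequencies.
private double computeTfIdfWeight(long termFreq, long docCount, int numDocs) {
    double tf = termFreq > 0 ? 1.0 + Math.log(termFreq) : 0.0;
    double idf = Math.log((double) numDocs / (docCount + 1));
    return tf * idf;
}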
One can also refer to this code; with the latest Lucene version it is quite easy.
public void testSparseFreqDoubleArrayConversion() throws Exception {
    Terms fieldTerms = MultiFields.getTerms(index, "text");
    if (fieldTerms != null && fieldTerms.size() != -1) {
        IndexSearcher indexSearcher = new IndexSearcher(index);
        for (ScoreDoc scoreDoc : indexSearcher.search(new MatchAllDocsQuery(), Integer.MAX_VALUE).scoreDocs) {
            Terms docTerms = index.getTermVector(scoreDoc.doc, "text");
            Double[] vector = DocToDoubleVectorUtils.toSparseLocalFreqDoubleArray(docTerms, fieldTerms);
            assertNotNull(vector);
            assertTrue(vector.length > 0);
        }
    }
}
I'm literally struggling with this new API and the lack of examples for core things like the NRT Manager.
I followed this example and here is the final result:
This is how the NRT Manager is built:
analyzer = new StopAnalyzer(Version.LUCENE_40);
config = new IndexWriterConfig(Version.LUCENE_40, analyzer);
writer = new IndexWriter(FSDirectory.open(new File(ConfigUtil.getProperty("lucene.directory"))), config);
mgrWriter = new NRTManager.TrackingIndexWriter(writer);
ReferenceManager<IndexSearcher> mgr = new NRTManager(mgrWriter, new SearcherFactory(), true);
Adding a new element to the NRT Manager's writer:
long gen = -1;
try {
    Document userDoc = DocumentManager.getDocument(user);
    gen = mgrWriter.addDocument(userDoc);
} catch (Exception e) {}
return gen;
After some small amount of time I need to update the previous document:
// Acquire a searcher from the NRTManager, using the generation obtained in the creation step
((NRTManager) mgr).waitForGeneration(gen);
searcher = mgr.acquire();

// Search for the document based on some user id
Term idTerm = new Term(USER_ID, Integer.toString(userId));
Query idTermQuery = new TermQuery(idTerm);
TopDocs result = searcher.search(idTermQuery, 1);
if (result.totalHits > 0)
    resultDoc = searcher.doc(result.scoreDocs[0].doc);
else
    resultDoc = null;
The problem is that resultDoc is always null. What am I missing? I should not have to use commit() or flush() in order to see those changes.
I am using an NRTManagerReopenThread, as exemplified here.
Later edit: here is how userDoc is created:
public static Document getDocument(User user) {
    Document doc = new Document();
    FieldType storedType = new FieldType();
    storedType.setStored(true);
    storedType.setIndexed(false);

    // Store user data
    doc.add(new Field(USER_ID, user.getId().toString(), storedType));
    doc.add(new Field(USER_NAME, user.getFirstName() + user.getLastName(), storedType));

    FieldType unstoredType = new FieldType();
    unstoredType.setStored(false);
    unstoredType.setIndexed(true);

    Field field = null;

    // Analyze Location
    String tokens = "";
    if (user.getLocation() != null && !user.getLocation().isEmpty()) {
        for (Tag location : user.getLocation()) tokens += location.getName() + " ";
        field = new Field(USER_LOCATION, tokens, unstoredType);
        field.setBoost(Constants.LOCATION);
        doc.add(field);
    }

    // Analyze Language
    if (user.getLanguage() != null && !user.getLanguage().isEmpty()) {
        // Same as Location
    }

    // Analyze Career
    if (user.getCareer() != null && !user.getCareer().isEmpty()) {
        // Same as Location
    }
    return doc;
}
Your problem is not NRT-related. You are searching against the USER_ID field although it has not been indexed; that can't work. If you don't want your ID field to be tokenized, just call FieldType#setTokenized(false) (or simply use StringField, which does exactly that by default: indexed but not tokenized).
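A minimal sketch of that fix, assuming Lucene 4.x and the same USER_ID constant as above:

// StringField is indexed but not tokenized, so an exact-match TermQuery on the id works
doc.add(new StringField(USER_ID, user.getId().toString(), Field.Store.YES));

// or, keeping the explicit FieldType style from the question:
FieldType idType = new FieldType();
idType.setStored(true);
idType.setIndexed(true);
idType.setTokenized(false);
doc.add(new Field(USER_ID, user.getId().toString(), idType));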