How to store dictionaries (maps) in FlatBuffers in Java - java

I was learning FlatBuffers from this link, but there was no example demonstrating how to store a dictionary (map). There was a mention of "Storing dictionaries in Java/C#" in this link, but I did not understand much from it. I come from a Java background. An example of how to store a dictionary/map in FlatBuffers in Java would be helpful.

I realize this is an old question but I came across it when I was trying to figure out the same thing. Here is what I did to get a "dictionary/map"
Schema File
namespace com.dictionary;
table DataItem {
  keyValue:string (key);
  name:string;
  age:int;
  codes:[string];
}
table DictionaryRoot {
  items:[DataItem];
}
root_type DictionaryRoot;
When you run this through the FlatBuffers compiler with flatc -j schema.fbs, it will produce two Java files, one named DictionaryRoot.java and the other named DataItem.java.
In your Java application
Using those two generated Java files you will need to construct the buffer. This has to be done from the innermost data to the outermost. So you need to construct your DataItems (and keep track of their offsets) before your DictionaryRoot.
In this example, let's assume that you have a map of Objects in Java that you need to create the buffer from.
List<Integer> offsets = new ArrayList<>();
FlatBufferBuilder builder = new FlatBufferBuilder(1024);
for (Entry<String, DataObj> entry : map.entrySet()) {
    DataObj dataObj = entry.getValue();
    // use the builder to create the string data and keep track of the offsets
    int keyValueOffset = builder.createString(entry.getKey());
    int nameOffset = builder.createString(dataObj.getName());
    // age is a scalar, so it is written inline rather than as an offset
    int age = dataObj.getAge();
    int[] codesOffsets = dataObj.getCodes().stream()
            .mapToInt(builder::createString)
            .toArray();
    // use the builder to create a vector using the offsets from above
    int codesVectorOffset = DataItem.createCodesVector(builder, codesOffsets);
    // now with all the inner data offsets, create the DataItem
    DataItem.startDataItem(builder);
    DataItem.addKeyValue(builder, keyValueOffset);
    DataItem.addName(builder, nameOffset);
    DataItem.addAge(builder, age);
    DataItem.addCodes(builder, codesVectorOffset);
    // ensure you 'end' the DataItem to get its offset
    int dataItemOffset = DataItem.endDataItem(builder);
    // track the offsets
    offsets.add(dataItemOffset);
}
// use the builder to create a sorted vector from your offsets; sorting by the
// key field is critical, since it is what makes lookups by key possible later
int sortedVectorOffset = builder.createSortedVectorOfTables(new DataItem(),
        offsets.stream().mapToInt(Integer::intValue).toArray());
// now with your sorted vector, create the DictionaryRoot
DictionaryRoot.startDictionaryRoot(builder);
DictionaryRoot.addItems(builder, sortedVectorOffset);
int dictRootOffset = DictionaryRoot.endDictionaryRoot(builder);
// signal to the builder that you are done
builder.finish(dictRootOffset);
// write the finished buffer to a file
try (FileOutputStream outputStream = new FileOutputStream("output.bin")) {
    outputStream.write(builder.sizedByteArray());
}
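To show why the sorted vector matters: because keyValue is marked with the (key) attribute, flatc also generates a binary-search lookup on the items vector. Here is a minimal read-back sketch; the method names follow flatc's usual naming conventions for the schema above, so verify them against your generated files:
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Paths;

byte[] bytes = Files.readAllBytes(Paths.get("output.bin"));
DictionaryRoot root = DictionaryRoot.getRootAsDictionaryRoot(ByteBuffer.wrap(bytes));
// the (key) attribute plus the sorted vector enable this binary-search lookup
DataItem item = root.itemsByKey("someKey");
if (item != null) {
    System.out.println(item.name() + " is " + item.age());
}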
I hope that will help someone else along their journey using FlatBuffers.

Related

How to read a list from a YAMLConfiguration

I have a YAML file that looks like this:
foo:
  bar:
    - entry1: 1
      entry2: a
    - entry1: 2
      entry2: b
(Where the actual list is much longer.) I'm reading this file using Apache Configuration2's YAMLConfiguration. I can see the data in the internal data structures used in Apache Configuration2, but I can't figure out how to get this list out. I actually have a class that matches the structure of the list elements, which is what I'd really like to read into:
class MyListEntry {
    public int entry1;
    public String entry2;
}
How can I get the data from the YAMLConfiguration into a List<MyListEntry>?
Here's the solution I found (note this works for any HierarchicalConfiguration, not just YAMLConfiguration):
// configurationsAt returns a List<HierarchicalConfiguration<ImmutableNode>>,
// one entry for each element of the list
var subConfigList = hierarchicalConfig.configurationsAt("foo.bar");
List<MyListEntry> myListEntries = new ArrayList<>(subConfigList.size());
// iterate over the subconfigs and pull out the specific values of interest
for (var subConfig : subConfigList) {
    MyListEntry myListEntry = new MyListEntry();
    myListEntry.entry1 = subConfig.getInt("entry1");
    myListEntry.entry2 = subConfig.getString("entry2");
    myListEntries.add(myListEntry);
}
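For completeness, here is a minimal sketch of loading the file into a YAMLConfiguration in the first place ("config.yml" is a hypothetical file name; read(Reader) is part of Commons Configuration2):
import java.io.FileReader;
import org.apache.commons.configuration2.YAMLConfiguration;

YAMLConfiguration hierarchicalConfig = new YAMLConfiguration();
try (FileReader reader = new FileReader("config.yml")) {
    hierarchicalConfig.read(reader);  // throws ConfigurationException on bad YAML
}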

How to process/read Java objects from XML-RPC received by a server

First, I'm learning Java, so I am totally new to it. I am making a request to a Python function with XML-RPC; Python sends a dictionary which contains another dictionary inside, plus various ID lists, like this:
{
  country_ids=[1,2,3,4,6,7,8],
  state_ids=[23,22,12,12,56,12,56,72,23],
  config={GLOBAL_DC=true, MAX_GLOBAL_DC=1, RET=5, COMP=1, VER=1.0}
}
So I'm getting this in Java with:
HashMap<String, Object> data = (HashMap<String, Object>) xmlrpc.call...
and I am getting something like this:
{
  country_ids=[Ljava.lang.Object;@7e0aa6f,
  state_ids=[Ljava.lang.Object;@dc6c405,
  config={GLOBAL_DC=true, MAX_GLOBAL_DC=1, RET=5, COMP=1, VER=1.0}
}
I know how to read a value from the HashMap with data.get("country_ids"), but I don't know how to map/read/convert this object to get the IDs inside of it.
Just in case someone comes here wondering about the same or a similar question, I found out how it has to be done:
Object[] country_ids = (Object[]) data.get("country_ids");
// to read the elements, something like this
for (int i = 0; i < country_ids.length; i++) {
    Log.d("Element value of " + i, country_ids[i].toString());
}
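If you would rather work with a typed list, here is a small sketch (it assumes the XML-RPC library deserialized the elements as Integer, which is worth verifying for your client):
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

Object[] raw = (Object[]) data.get("country_ids");
// cast each element; this throws ClassCastException if the element type differs
List<Integer> countryIds = Arrays.stream(raw)
        .map(o -> (Integer) o)
        .collect(Collectors.toList());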

Convert list to hashmap

The title of the question may give you the impression that it is a duplicate, but in my opinion it is not.
I am just a few months into Java and a month into MongoDB, Spring Boot and REST.
I have a Mongo collection with 3 fields in each document: _id (default field), appName and appKey. I am using a list to iterate through all the documents and find the one whose appName and appKey match the ones passed in. This collection right now has only 4 entries, and thus it runs smoothly. But I was reading a bit about collections and found that with a larger number of documents in a collection, a lookup through a list will be much slower than through a HashMap.
But as I have already said, I am quite new to Java and am having a bit of trouble converting my code to use a HashMap, so I was hoping someone could guide me through this.
I am also attaching my code for reference.
public List<Document> fetchData() {
    // Collection that stores appName and appKey
    MongoCollection<Document> collection = db.getCollection("info");
    List<Document> nameAndKeyList = new ArrayList<Document>();
    // Getting the list of appName and appKey from the info DB
    AggregateIterable<Document> output = collection
            .aggregate(Arrays.asList(new BasicDBObject("$group", new BasicDBObject("_id",
                    new BasicDBObject("_id", "$id").append("appName", "$appName").append("appKey", "$appKey"))
            )));
    for (Document doc : output) {
        nameAndKeyList.add((Document) doc.get("_id"));
    }
    return nameAndKeyList;
} // End of method
And then I am calling it in another method of the same class:
List<Document> nameAndKeyList = new ArrayList<>();
// InfoController is the name of the class
InfoController obj1 = new InfoController();
nameAndKeyList = obj1.fetchData();
// Fetching and checking whether the appName & appKey pair
// is present in the DB, one by one.
// If appName & appKey mismatch, it increments the value
// of 'i' and checks them against the other values in the DB
for (int i = 0; i < nameAndKeyList.size(); i++) {
    "followed by my code"
And if I am not wrong, then there will be no need for the above loop either.
Thanks in advance.
You just need a simple find query to get the record you need directly from MongoDB.
Document document = collection
        .find(new Document("appName", someappname).append("appKey", someappkey))
        .first();
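Note that first() returns null when nothing matches, which maps directly onto the mismatch case you are handling:
if (document == null) {
    // no matching appName/appKey pair in the DB; handle the mismatch here
}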
First of all, a list is not much slower or faster than a HashMap. A HashMap is commonly used to store key-value pairs such as "ID" to "Name" or something like that. In your case I see you are using an ArrayList without a specified initial size; better to use a LinkedList when you do not know the size, because an ArrayList holds an array behind the scenes and extends it by copying. If you want to generate a HashMap out of the list, or use a HashMap, you need to map an ID and a value to each record:
HashMap<String /* type of the identifier */, String /* type of value */> map = new HashMap<String, String>();
for (Document doc : output) {
    map.put(doc.getString("_id"), doc.getString("_value"));
}
First, avoid premature optimization (look up the expression if you don't know what it is). Put a realistic number of thousands of items containing near-realistic data in your list. Try to retrieve an item that isn't there; this will force your for loop to traverse the entire list. See how long it takes. Try a number of times to get an impression of whether you get impatient. If you don't, you're done.
If you find out that you need a speed-up, I agree that HashMap is one of the obvious solutions to try. One of the first things to consider is a key type for your HashMap. As I understand it, what you need to search for is an item where appName and appKey are both right. A good solution is to write a simple class with these two fields and equals and hashCode methods (I'll call it DocumentHashMapKey for now; think of a better name). For hashCode(), try Objects.hash(appName, appKey). If it doesn't give satisfactory performance with the data you have, consider alternatives. Now you are ready to build your HashMap<DocumentHashMapKey, Document>.
If you’re lazy or just want a first impression of how a HashMap performs, you may also build your keys by concatenating appName + "$##" + appKey (where the string in the middle is something that is unlikely to be part of a name or key) and use HashMap<String, Document>.
Everything I said can be refined depending on your needs. This was just to get you started.
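Here is a minimal sketch of the composite key class described above (the class and field names are placeholders):
import java.util.Objects;

public final class DocumentHashMapKey {
    private final String appName;
    private final String appKey;

    public DocumentHashMapKey(String appName, String appKey) {
        this.appName = appName;
        this.appKey = appKey;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DocumentHashMapKey)) return false;
        DocumentHashMapKey other = (DocumentHashMapKey) o;
        return Objects.equals(appName, other.appName)
                && Objects.equals(appKey, other.appKey);
    }

    @Override
    public int hashCode() {
        return Objects.hash(appName, appKey);
    }
}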
Thanks everyone for your help, without which I would not have got to a solution.
public HashMap<String, String> fetchData() {
    // Collection that stores appName and appKey
    MongoCollection<Document> collection = db.getCollection("info");
    HashMap<String, String> appKeys = new HashMap<String, String>();
    // Getting the list of appName and appKey from the info DB
    AggregateIterable<Document> output = collection
            .aggregate(Arrays.asList(new BasicDBObject("$group", new BasicDBObject("_id",
                    new BasicDBObject("_id", "$id").append("appName", "$appName").append("appKey", "$appKey"))
            )));
    String appName = null;
    String appKey = null;
    for (Document doc : output) {
        Document temp = (Document) doc.get("_id");
        appName = (String) temp.get("appName");
        appKey = (String) temp.get("appKey");
        appKeys.put(appName, appKey);
    }
    return appKeys;
}
Calling the above method in another method of the same class:
InfoController obj = new InfoController();
// Fetching the values of 'appName' & 'appKey' from the 'info' DB
HashMap<String, String> appKeys = obj.fetchData();
storedAppkey = appKeys.get(appName);
// Handling the case of a mismatch
if (storedAppkey == null || storedAppkey.compareTo(appKey) != 0) {
    // then the response and further processing that I need to do
What the HashMap has done is make my code more readable, and the 'for' loop I was using for iterating is gone, although it might not make much difference in performance as of now.
Thanks once again to everyone for your help and support.

Does the Java API of RocksDB support prefix scan?

I have a huge data set (key-value) in RocksDB and I have to search for keys based on a prefix of the key in hand. I do not want to scan the whole data set to filter out keys based on the prefix. Is there any way to do that?
You can use something like this: RocksIterator exposes an API that lets you seek to the prefix bytes; from there, keep any key that starts with the prefix and stop as soon as one does not.
Please find the sample code below.
List<String> result = new ArrayList<String>();
RocksIterator iterator = db.newIterator();
for (iterator.seek(prefix.getBytes()); iterator.isValid(); iterator.next()) {
    String key = new String(iterator.key());
    if (!key.startsWith(prefix)) {
        break;
    }
    result.add(key);
}
Hope it will help you.
@Pramatha V's answer works pretty well, although I made some improvements to the code. I am not deserializing the iterator key in every iteration. I am using Bytes.increment() from the Kafka common utils (you can extract this class and use it in your code directly). This function increments the underlying byte array by one, which gives me the next key bigger than any key starting with my prefix. I am using the BYTES_LEXICO_COMPARATOR (also from the same class) for the comparison, but you are free to implement and use your own comparator. Moreover, the function returns a map of byte arrays, which you can deserialize later in your code.
public Map<byte[], byte[]> prefixScan(final byte[] prefix) {
    final Map<byte[], byte[]> result = new HashMap<>();
    RocksIterator iterator = db.newIterator();
    byte[] rawLastKey = increment(prefix);
    for (iterator.seek(prefix); iterator.isValid(); iterator.next()) {
        if (Bytes.BYTES_LEXICO_COMPARATOR.compare(iterator.key(), rawLastKey) >= 0) {
            break;
        }
        result.put(iterator.key(), iterator.value());
    }
    iterator.close();
    return result;
}
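For reference, here is a minimal sketch of the increment helper in the spirit of Kafka's Bytes.increment; it treats the array as an unsigned big-endian number, but check the original class if you extract it:
// Returns a byte array strictly greater than every array that starts with
// 'input'. Fails if the input is all 0xFF bytes, since no same-length
// successor exists in that case.
static byte[] increment(byte[] input) {
    byte[] result = input.clone();
    for (int i = result.length - 1; i >= 0; i--) {
        if (result[i] != (byte) 0xFF) {
            result[i]++;          // no carry needed: done
            return result;
        }
        result[i] = 0x00;         // 0xFF rolls over; carry into the next byte
    }
    throw new IllegalArgumentException("input is all 0xFF; cannot increment");
}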
A note on performance: seek is working very slowly for me, 5.35 seconds on an SSD disk with 1 billion records. The keys are fixed at 16 bytes (two longs, [xx,xx]), and I searched for the first long as an 8-byte prefix. Consider using a ColumnFamily for mapping keys.
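If seeks are slow at that scale, it may be worth enabling RocksDB's native prefix support. A hedged sketch follows: useFixedLengthPrefixExtractor and setPrefixSameAsStart are real RocksJava APIs, but the path, the 8-byte prefix length, and the process() callback are placeholders, and the performance effect depends on your version and workload:
import org.rocksdb.Options;
import org.rocksdb.ReadOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksIterator;

Options options = new Options()
        .setCreateIfMissing(true)
        .useFixedLengthPrefixExtractor(8);  // first 8 bytes form the prefix
RocksDB db = RocksDB.open(options, "/path/to/db");

try (ReadOptions readOptions = new ReadOptions().setPrefixSameAsStart(true);
     RocksIterator iterator = db.newIterator(readOptions)) {
    for (iterator.seek(prefix); iterator.isValid(); iterator.next()) {
        // the iterator stays within keys sharing the 8-byte prefix
        process(iterator.key(), iterator.value());  // process() is a placeholder
    }
}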

Copy table in HBase from Java

I want to copy data from one HBase table to another using the Java API, but I am not able to find a way to do it. Is there any Java API for this?
Thanks.
The following is far from the most optimized way, but from the tone of the question it seems performance is not the critical factor here.
First, you need to set up your HBaseConfiguration and your input/output tables:
Configuration config = HBaseConfiguration.create();
HTable inputTable = new HTable(config, "input_table");
HTable outputTable = new HTable(config, "output_table");
What you want is a "Scan", which allows a range scan to be performed. You define the query parameters by adding columns to a Scan object.
Scan scan = new Scan(Bytes.toBytes("smith-"));
scan.addColumn(Bytes.toBytes("personal"), Bytes.toBytes("givenName"));
scan.addColumn(Bytes.toBytes("contactinfo"), Bytes.toBytes("email"));
scan.setFilter(new PageFilter(25));
Now you are ready to invoke the scan object and retrieve results:
ResultScanner scanner = inputTable.getScanner(scan);
for (Result result : scanner) {
    putToOutputTable(result);
}
Now to save to the second table, you will either do Puts within the for loop, or aggregate the results into a List/Array or similar for a bulk put (see the sketch after the method below).
protected void putToOutputTable(Result result) throws IOException {
    // Retrieve the map of families to their most recent qualifiers and values.
    NavigableMap<byte[], NavigableMap<byte[], byte[]>> map = result.getNoVersionMap();
    // One Put per source row, reusing the source row key. The column family
    // must already exist in the output table's schema; the qualifier can be
    // anything. All values are byte arrays, as HBase is all about byte arrays.
    Put p = new Put(result.getRow());
    for (Map.Entry<byte[], NavigableMap<byte[], byte[]>> family : map.entrySet()) {
        for (Map.Entry<byte[], byte[]> column : family.getValue().entrySet()) {
            p.add(family.getKey(), column.getKey(), column.getValue());
        }
    }
    outputTable.put(p);
}
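For the bulk variant mentioned above, here is a small sketch that collects the Puts and sends them in one call; HTable.put(List<Put>) batches them client-side:
List<Put> puts = new ArrayList<>();
for (Result result : scanner) {
    Put p = new Put(result.getRow());
    for (Map.Entry<byte[], NavigableMap<byte[], byte[]>> family
            : result.getNoVersionMap().entrySet()) {
        for (Map.Entry<byte[], byte[]> column : family.getValue().entrySet()) {
            p.add(family.getKey(), column.getKey(), column.getValue());
        }
    }
    puts.add(p);
}
// one batched call instead of one RPC per row
outputTable.put(puts);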
If instead you want a more scalable version, take a look at how to use map/reduce to read from input HDFS files and write to output HBase tables here: Hbase Map/Reduce
