Using the HBase API (Get/Put) or the HBQL API, is it possible to retrieve the timestamp of a particular column?
Assuming your client is configured and you have a table set up, doing a get returns a Result:
Get get = new Get(Bytes.toBytes("row_key"));
Result result_foo = table.get(get);
A Result is backed by KeyValues, and each KeyValue contains a timestamp. You can get either a list of KeyValues with list() or an array with raw(). A KeyValue has a getTimestamp() method.
result_foo.raw()[0].getTimestamp()
I think the following will be better:
KeyValue kv = result.getColumnLatest(family, qualifier);
String status = Bytes.toString(kv.getValue());
Long timestamp = kv.getTimestamp();
since Result#getValue(family, qualifier) is implemented as
public byte[] getValue(byte[] family, byte[] qualifier) {
KeyValue kv = this.getColumnLatest(family, qualifier);
return kv == null ? null : kv.getValue();
}
@codingFoo's answer assumes all timestamps are the same for all cells, but the OP's question was about a specific column. In that respect, similar to @peibin wang's answer, I would propose the following if you would like the latest timestamp for your column:
Use the getColumnLatestCell method on your Result object, and then call the getTimestamp method, like so:
Result res = ...
res.getColumnLatestCell(Bytes.toBytes("column_family"), Bytes.toBytes("column_qualifier")).getTimestamp();
If you want access to a specific timestamp, you could use getColumnCells, which returns all cells for the specified column. You then have to pick one of the cells with get(int index) and call getTimestamp() on it.
result_foo.rawCells()(0).getTimestamp
is good style (this is Scala syntax; in Java it would be result_foo.rawCells()[0].getTimestamp()).
I am aware that BigTable supports operations append and increment using ReadModifyWriteRow requests, but I'm wondering if there is support or an alternative way to use more generic mapping functions where the value from the cell can be accessed and modified within some sort of closure? For instance, bitwise ANDing a long value in a cell:
Function<Long, Long> modifyFunc = f -> f & 10L;
ReadModifyWriteRow
.create("tableName", "rowKey")
.apply("family", "qualifier", modifyFunc);
A mapping like this is not supported by Bigtable, so here is an option you could try. It only works with single-cluster instances, because of the consistency it requires.
You could add a column to keep track of row version (in addition to the existing row versions) and then you can read the data and version, modify it in memory and then do a checkAndMutate with the version and new value. Something like this:
Row row = dataClient.readRow(tableId, rowkey);
List<RowCell> cells = row.getCells();
// Get the value and timestamp/version from the cell you are targeting.
RowCell cell = cells.get(...);
long version = cell.getTimestamp();
ByteString value = cell.getValue();
// Do your mapping to the new value.
ByteString newValue = ...;
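// NOTE: `timestamp` is not defined above; set it to the timestamp (in microseconds) to write the new cell at.
Mutation mutation =
Mutation.create().setCell(COLUMN_FAMILY_NAME, COLUMN_NAME, timestamp, newValue);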
// Filter on a column that tracks the version to do validation.
Filter filter =
FILTERS
.chain()
.filter(FILTERS.family().exactMatch(COLUMN_FAMILY_NAME))
.filter(FILTERS.qualifier().exactMatch(VERSION_COLUMN))
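// exactMatch takes a String or ByteString; this assumes the version column stores the version as a decimal string.
.filter(FILTERS.value().exactMatch(String.valueOf(version)));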
ConditionalRowMutation conditionalRowMutation =
ConditionalRowMutation.create(tableId, rowkey).condition(filter).then(mutation);
boolean success = dataClient.checkAndMutateRow(conditionalRowMutation);
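The snippet above makes a single attempt. When checkAndMutateRow returns false, another writer changed the version between the read and the write, and the usual response is to read again and retry. A minimal sketch of that loop follows, using an in-memory ConcurrentHashMap as a stand-in for the table. The class OptimisticUpdate and its methods are hypothetical and are not part of the Bigtable client; the sketch also assumes the row already exists.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongUnaryOperator;

class OptimisticUpdate {
    /** A value plus the version it was written at (stand-in for the data cell and the version column). */
    static final class Versioned {
        final long version;
        final long value;

        Versioned(long version, long value) {
            this.version = version;
            this.value = value;
        }
    }

    // In-memory stand-in for the table: row key -> versioned value.
    private final ConcurrentHashMap<String, Versioned> table = new ConcurrentHashMap<>();

    void put(String rowKey, long value) {
        table.put(rowKey, new Versioned(0L, value));
    }

    long get(String rowKey) {
        return table.get(rowKey).value;
    }

    /** Applies fn to the stored value, retrying whenever another writer got in first. */
    long update(String rowKey, LongUnaryOperator fn) {
        while (true) {
            // Read the current value and version (assumes the row exists).
            Versioned current = table.get(rowKey);
            // Modify in memory and bump the version.
            Versioned next = new Versioned(current.version + 1, fn.applyAsLong(current.value));
            // Conditional write: succeeds only if the row still holds exactly what we read.
            if (table.replace(rowKey, current, next)) {
                return next.value;
            }
            // Otherwise someone else updated the row; loop and try again.
        }
    }
}
```

With this shape, the bitwise AND from the question becomes store.update("rowKey", v -> v & 10L). Against Bigtable, the replace call would correspond to the checkAndMutateRow request, and the mutation would also need to write the new version.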
I am trying to apply a ScrubFunction to each tuple and return the tuple with updated values.
But I am getting an exception like this:
Caused by: cascading.tuple.TupleException: failed to set a value, tuple may not be initialized with values, is zero length
Sample Code:
TupleEntry argument = functionCall.getArguments();
Tuple result = new Tuple();
result.setInteger(0, argument.getInteger(0));
result.setString(1, argument.getString(1).toUpperCase());
result.setString(2, argument.getString(2));
result.setString(3, argument.getString(3));
result.setString(4, argument.getString(4));
result.setString(5, argument.getString(5));
functionCall.getOutputCollector().add(result);
What if I want to update a few fields in a Tuple and return the updated values?
Can I update them directly in the TupleEntry and return it?
To your first question: a new Tuple() is zero length, so there is nothing to set yet. Don't set values on the tuple; add them instead.
Tuple result = new Tuple();
result.addInteger(argument.getInteger(0));
// ...
To your second question: yes. See the API doc here: TupleEntry.setObject
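On the first point, the exception is what you would expect from any empty positional container. A minimal sketch of the difference, using java.util.ArrayList as a stand-in (an analogy only, not Cascading's Tuple class; the class and method names are hypothetical): set needs an existing position, while add creates one.

```java
import java.util.ArrayList;
import java.util.List;

class SetVersusAdd {
    /** Builds a result by appending values; this works on an empty list. */
    static List<Object> buildWithAdd(int id, String name) {
        List<Object> result = new ArrayList<>();
        result.add(id);
        result.add(name.toUpperCase());
        return result;
    }

    /** Tries to overwrite position 0 of an empty list; returns true if that fails. */
    static boolean setOnEmptyFails() {
        List<Object> result = new ArrayList<>();
        try {
            result.set(0, 42); // there is no position 0 yet
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(buildWithAdd(1, "foo"));
        System.out.println(setOnEmptyFails());
    }
}
```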
Hope this helps :)
I have a bucket on riak in which I store simple Timestamp -> String values in this way:
val riakClient = RiakFactory.newClient(myHttpClusterConfig)
val myBucket = riakClient.fetchBucket(name).execute
myBucket.store(timestamp.toString, value).withoutFetch().w(1).execute
What I need to do now is to add an index on the keys. I tried defining a Java POJO in this way:
public class MyWrapper {
@RiakIndex(name="timestamp_index")
@RiakKey
public String timestamp;
public String value;
public MyWrapper(String timestamp, String value) {
this.timestamp = timestamp;
this.value = value;
}
}
and then running
myBucket.store(new MyWrapper(timestamp.toString, value)).withoutFetch().w(1).execute
The problem with this approach is that in Riak the actual value is stored as a JSON object:
{"value":"myvalue"}
while I simply need to store the myvalue string. Is there any way to achieve this? I can't see any index(name) method when executing store, and I can't see any annotation like @RiakKey for values.
You can create a RiakObject using a RiakObjectBuilder, and then add the index on that:
val obj = RiakObjectBuilder.newBuilder(bucketName, myKey)
.withValue(myValue)
.addIndex("timestamp_index", timestamp)
.build
myBucket.store(obj).execute
If I understand what you are trying to do, you want the key/value pair "1403909549"/"Some Value" to be indexed by timestamp_index="1403909549" so that you can query specific times or ranges of times.
If that is the case, you do not need to explicitly add an index, you can query the implicit index $KEY in the same manner you would any other index.
Since all keys that Riak stores in LevelDB are indexed implicitly, I don't think a method was exposed to index them again explicitly.
I have just started using MongoDB. Below is my data structure.
It has an array of skillIDs, each of which has an array of activeCampaigns, and each activeCampaign has an array of callsByTimeZone.
What I am looking for, in SQL terms, is:
Select activeCampaigns.callsByTimeZone.label,
activeCampaigns.callsByTimeZone.loaded
from X
where skillID=50296 and activeCampaigns.campaignId=11371940
and activeCampaigns.callsByTimeZone.label='PT'
The output I am expecting is:
{"label":"PT", "loaded":1 }
The command I used is:
db.cd.find({ "skillID" : 50296 , "activeCampaigns.campaignId" : 11371940,
"activeCampaigns.callsByTimeZone.label" :"PT" },
{ "activeCampaigns.callsByTimeZone.label" : 1 ,
"activeCampaigns.callsByTimeZone.loaded" : 1 ,"_id" : 0})
The output I am getting is everything under activeCampaigns.callsByTimeZone, while I am expecting just the PT entry.
Data structure:
{
  "skillID": 50296,
  "clientID": 7419,
  "voiceID": 1,
  "otherResults": 7,
  "activeCampaigns": [
    {
      "campaignId": 11371940,
      "campaignFileName": "Aaron.name.121.csv",
      "loaded": 259,
      "callsByTimeZone": [
        { "label": "CT", "loaded": 6 },
        { "label": "ET", "loaded": 241 },
        { "label": "PT", "loaded": 1 }
      ]
    }
  ]
}
I tried the same in Java.
QueryBuilder query = QueryBuilder.start().and("skillID").is(50296)
.and("activeCampaigns.campaignId").is(11371940)
.and("activeCampaigns.callsByTimeZone.label").is("PT");
BasicDBObject fields = new BasicDBObject("activeCampaigns.callsByTimeZone.label",1)
.append("activeCampaigns.callsByTimeZone.loaded",1).append("_id", 0);
DBCursor cursor = coll.find(query.get(), fields);
String campaignJson = null;
while(cursor.hasNext()) {
DBObject campaignDBO = cursor.next();
campaignJson = campaignDBO.toString();
System.out.println(campaignJson);
}
The value obtained is everything under the callsByTimeZone array. I am currently parsing the JSON obtained and extracting only the PT values. Is there a way to query just the PT fields inside activeCampaigns.callsByTimeZone?
Sorry if this question has already been raised in the forum; I have searched a lot and failed to find a proper solution.
Thanks in advance.
There are several ways of doing it, but you should not be using String manipulation (i.e. indexOf), the performance could be horrible.
The results in the cursor are nested Maps, representing the document in the database - a Map is a good Java-representation of key-value pairs. So you can navigate to the place you need in the document, instead of having to parse it as a String. I've tested the following and it works on your test data, but you might need to tweak it if your data is not all exactly like the example:
while (cursor.hasNext()) {
    DBObject campaignDBO = cursor.next();
    // Navigate: document -> first activeCampaigns entry -> callsByTimeZone list.
    List callsByTimezone = (List) ((DBObject) ((List) campaignDBO.get("activeCampaigns")).get(0)).get("callsByTimeZone");
    // Initialised to null so it can be read after the loop even if nothing matches.
    DBObject valuesThatIWant = null;
    for (Object o : callsByTimezone) {
        DBObject call = (DBObject) o;
        if (call.get("label").equals("PT")) {
            valuesThatIWant = call;
        }
    }
}
Depending upon your data, you might want to add protection against null values as well.
The thing you were looking for ({"label":"PT", "loaded":1 }) is in the variable valuesThatIWant. Note that this, too, is a DBObject, i.e. a Map, so if you want to see what's inside it you need to use get:
valuesThatIWant.get("label"); // will return "PT"
valuesThatIWant.get("loaded"); // will return 1
Because DBObject is effectively a Map of String to Object (i.e. Map<String, Object>), you need to cast the values that come out of it (hence the ugliness in the first bit of code in my answer). With numbers, the type depends on how the data was loaded into the database: it might come out as an int or as a double.
String theValueOfLabel = (String) valuesThatIWant.get("label"); // will return "PT"
double theValueOfLoaded = (Double) valuesThatIWant.get("loaded"); // will return 1.0 if the value was stored as a double
I'd also like to point out the following from my answer:
((List) campaignDBO.get("activeCampaigns")).get(0)
This assumes that "activeCampaigns" is a) a list and in this case b) only has one entry (I'm doing get(0)).
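If a document can hold more than one campaign, or some fields may be missing, the same navigation can loop over every entry instead of calling get(0). A minimal sketch, using plain java.util.Map and List as stand-ins for the driver's DBObject and list types (the class TimeZoneLookup and the helper findTimeZone are hypothetical names):

```java
import java.util.List;
import java.util.Map;

class TimeZoneLookup {
    /** Returns the callsByTimeZone entry with the given label for the given campaign, or null. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> findTimeZone(Map<String, Object> doc, int campaignId, String label) {
        Object campaigns = doc.get("activeCampaigns");
        if (!(campaigns instanceof List)) {
            return null; // field missing or not an array
        }
        for (Object c : (List<Object>) campaigns) {
            Map<String, Object> campaign = (Map<String, Object>) c;
            // The id may come back as an Integer, Long or Double depending on how it was stored.
            Object id = campaign.get("campaignId");
            if (!(id instanceof Number) || ((Number) id).intValue() != campaignId) {
                continue;
            }
            Object zones = campaign.get("callsByTimeZone");
            if (!(zones instanceof List)) {
                continue;
            }
            for (Object z : (List<Object>) zones) {
                Map<String, Object> zone = (Map<String, Object>) z;
                // label.equals(...) is safe even when the entry has no "label" field.
                if (label.equals(zone.get("label"))) {
                    return zone;
                }
            }
        }
        return null;
    }
}
```

Reading the number through Number rather than casting to Double or Integer sidesteps the int-versus-double question mentioned above.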
You will also have noticed that the field values you've set do not narrow the result down to the PT entry. As far as I can tell, a projection chooses which fields come back, not which array elements, so every entry of callsByTimeZone is returned. For the purpose of picking out PT, your code:
BasicDBObject fields = new BasicDBObject("activeCampaigns.callsByTimeZone.label",1)
.append("activeCampaigns.callsByTimeZone.loaded",1)
.append("_id", 0);
does not help any more than:
BasicDBObject fields = new BasicDBObject("activeCampaigns", 1).append("_id", 0);
I think some of the points that will help you to work with Java & MongoDB are:
When you query the database, it will return you the whole document of the thing that matches your query, i.e. everything from "skillID" downwards. If you want to select the fields to return, I think those will only be top-level fields. See the documentation for more detail.
To navigate the results, you need to know that DBObjects are returned, and that these are effectively a Map<String, Object> in Java. You can use get to navigate to the correct node, but you will need to cast the values into the correct shape.
Replacing the while loop in your Java code with the code below seems to give "PT" as output.
while(cursor.hasNext()) {
    DBObject campaignDBO = cursor.next();
    campaignJson = campaignDBO.get("activeCampaigns").toString();
    int labelInt = campaignJson.indexOf("PT", -1);
    String label = campaignJson.substring(labelInt, labelInt+2);
    System.out.println(label);
}
I have a Java Set of Result objects. My Result class definition looks like this:
private String url;
private String title;
private Set<String> keywords;
I have stored my information in a database table called Keywords which looks like this
Keywords = [id, url, title, keyword, date-time]
As you can see there isn't a one-to-one mapping between an object and a row in the database. I am using SQL (MySQL DB) to extract the values and have a suitable ResultSet object.
How do I check whether the Set already contains a Result with a given URL?
If the set already contains a Result object with the current URL I simply want to add the extra keyword to the Set of keywords, otherwise I create a new Result object for adding to the Set of Result objects.
When you iterate over the JDBC resultSet (to create your own set of Results) why don't you put them into a Map? To create the Map after the fact:
Map<String, List<Result>> map = new HashMap<String, List<Result>>();
for (Result r : resultSet) { // resultSet here is your own Set<Result>, not the JDBC ResultSet
if (map.containsKey(r.url)) {
map.get(r.url).add(r);
} else {
List<Result> list = new ArrayList<Result>();
list.add(r);
map.put(r.url, list);
}
}
Then just use map.containsKey(url) to check.
Normalization is your friend
http://en.wikipedia.org/wiki/Database_normalization
If it's possible, I suggest changing your database design to eliminate this problem. Your current design requires storing the id, url, title and date-time once per keyword, which could waste quite a bit of space if you have lots of keywords.
I would suggest having two tables. Assuming that the id field is guaranteed to be unique, the first table would store the id, url, title and date-time and would only have one row per id. The second table would store the id and a keyword. You would insert multiple rows into this table as required.
Is that possible / does that make sense?
You can use a Map with the URLs as the keys:
Map<String, Result> map = new HashMap<String, Result>();
for (Result r : results) {
if (map.containsKey(r.url)) {
map.get(r.url).keywords.addAll(r.keywords);
} else {
map.put(r.url, r);
}
}
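The same idea can be applied while reading the rows, so that duplicate Result objects are never created in the first place. A minimal sketch, assuming each database row has already been read into a String[] of {url, title, keyword} (a hypothetical stand-in for the JDBC ResultSet) and that Result has a (url, title) constructor, which the question does not show:

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class KeywordGrouping {
    /** Mirrors the fields from the question; the constructor is an assumption. */
    static class Result {
        final String url;
        final String title;
        final Set<String> keywords = new LinkedHashSet<>();

        Result(String url, String title) {
            this.url = url;
            this.title = title;
        }
    }

    /** Groups {url, title, keyword} rows into one Result per URL, keeping first-seen order. */
    static Collection<Result> group(List<String[]> rows) {
        Map<String, Result> byUrl = new LinkedHashMap<>();
        for (String[] row : rows) {
            // Reuse the Result for this URL if there is one, otherwise create it.
            Result r = byUrl.computeIfAbsent(row[0], url -> new Result(url, row[1]));
            r.keywords.add(row[2]);
        }
        return byUrl.values();
    }
}
```

With a real ResultSet, the loop body would read the three columns with getString inside a while (rs.next()) loop instead of indexing into an array.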
I think you need to override the equals() method of your Result class. In that method you put the logic that checks for what you are looking for.
N.B. When you override equals(), you also need to override hashCode().
For more on overriding equals() and hashCode(), you can look at this other question.
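As a minimal sketch of that approach, assuming two results should count as equal whenever their URLs match (the class name UrlResult and its constructor are hypothetical, chosen so as not to clash with the question's class):

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class UrlResult {
    final String url;
    final String title;
    final Set<String> keywords = new HashSet<>();

    UrlResult(String url, String title) {
        this.url = url;
        this.title = title;
    }

    // Two results are equal when they point at the same URL; title and keywords are ignored.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof UrlResult)) {
            return false;
        }
        return Objects.equals(url, ((UrlResult) other).url);
    }

    // Must agree with equals(): equal objects need equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hashCode(url);
    }
}
```

With this in place, set.contains(new UrlResult(url, null)) answers the "already present?" question. A Set has no method for fetching the stored element, though, so adding a keyword to the existing object still needs an iteration or a Map keyed by URL.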