CellUtil: Key type in createCell method - java

I am using the CellUtil class from org.apache.hadoop.hbase to create a Cell object. The method signature looks like this:
public static Cell createCell(byte[] row, byte[] family, byte[] qualifier, long timestamp, byte type, byte[] value)
What does the fifth argument, byte type, represent? I looked into the KeyValue class, and it has an enum called Type with the following definition:
public static enum Type {
Minimum((byte)0),
Put((byte)4),
Delete((byte)8),
DeleteFamilyVersion((byte)10),
DeleteColumn((byte)12),
DeleteFamily((byte)14),
// Maximum is used when searching; you look from maximum on down.
Maximum((byte)255);
private final byte code;
Type(final byte c) {
this.code = c;
}
public byte getCode() {
return this.code;
}
My question is: what do the types Minimum, Put, etc. have to do with the type of cell I want to create?

Sarin,
Please refer to section 69.7.6, KeyValue.
There are some scenarios in which you will use these enums. For example, when writing a coprocessor like the one below, I use KeyValue.Type.Put.getCode().
The other enum values can be used in the same way.
See the example coprocessor usage below:
package getObserver;
import java.io.IOException;
import java.util.List;
import java.util.NavigableSet;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
public class Observer extends BaseRegionObserver{
private boolean isOewc;
@Override
public void preGetOp(ObserverContext<RegionCoprocessorEnvironment> arg0,
Get arg1, List<Cell> arg2) throws IOException {
NavigableSet<byte[]> qset = arg1.getFamilyMap().get("colfam1".getBytes());
if(qset==null){//do nothing
}else{
String message = "qset.size() = "+String.valueOf(qset.size());
String m = "isOewc = "+String.valueOf(isOewc);
this.isOewc = true;
Cell cell = CellUtil.createCell(
"preGet Row".getBytes(),
m.getBytes(),
message.getBytes(),
System.currentTimeMillis(),
KeyValue.Type.Put.getCode(),
"preGet Value".getBytes());
arg2.add(cell);
}
}
@Override
public void postGetOp(ObserverContext<RegionCoprocessorEnvironment> arg0,
Get arg1, List<Cell> arg2) throws IOException {
String m = "isOewc = "+String.valueOf(isOewc);
Cell cell = CellUtil.createCell(
"postGet Row".getBytes(),
m.getBytes(),
"postGet Qualifier".getBytes(),
System.currentTimeMillis(),
KeyValue.Type.Put.getCode(),
"postGet Value".getBytes());
arg2.add(cell);
}
}
Similarly, the other enum types can be used, depending on which operation you are going to perform in the coprocessor event.
The programcreek examples show the usage clearly: Put and Delete prepare key-value pairs for a mutation, while Maximum and Minimum are used for range checks. The coprocessor example above uses Put.
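To make the role of the type byte concrete, here is a minimal plain-Java sketch that mirrors the codes quoted in the question. The names CellOp, fromCode, and CellOpDemo are hypothetical and not part of HBase; only the numeric codes come from the listing above. It illustrates that the byte is simply the numeric code of the operation a cell represents, and that (byte)255 is stored as -1 in a signed Java byte.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for KeyValue.Type; only the codes are taken from the question.
enum CellOp {
    Minimum((byte) 0),
    Put((byte) 4),
    Delete((byte) 8),
    DeleteFamilyVersion((byte) 10),
    DeleteColumn((byte) 12),
    DeleteFamily((byte) 14),
    Maximum((byte) 255);

    // Reverse lookup table from code to constant, filled once the constants exist.
    private static final Map<Byte, CellOp> BY_CODE = new HashMap<>();

    static {
        for (CellOp op : values()) {
            BY_CODE.put(op.code, op);
        }
    }

    private final byte code;

    CellOp(byte code) {
        this.code = code;
    }

    byte getCode() {
        return code;
    }

    // Maps a raw type byte back to the operation it stands for.
    static CellOp fromCode(byte code) {
        CellOp op = BY_CODE.get(code);
        if (op == null) {
            throw new IllegalArgumentException("Unknown type code: " + code);
        }
        return op;
    }
}

final class CellOpDemo {
    public static void main(String[] args) {
        byte type = CellOp.Put.getCode();
        System.out.println(type + " -> " + CellOp.fromCode(type));
        System.out.println("Maximum as signed byte: " + CellOp.Maximum.getCode());
    }
}
```

In the coprocessor above, KeyValue.Type.Put.getCode() plays the same role as CellOp.Put.getCode() here: it marks the created cell as a put rather than one of the delete markers.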

Related

Cannot read or serialize POJO with enumerations using Java MongoDB driver

I have an existing object that I want to serialize in MongoDB using Java and the POJO codec. For some reason the driver tries to create an instance of an enum instead of using valueOf:
org.bson.codecs.configuration.CodecConfigurationException: Failed to decode 'phase'. Failed to decode 'value'. Cannot find a public constructor for 'SimplePhaseEnumType'.
at org.bson.codecs.pojo.PojoCodecImpl.decodePropertyModel(PojoCodecImpl.java:192)
at org.bson.codecs.pojo.PojoCodecImpl.decodeProperties(PojoCodecImpl.java:168)
at org.bson.codecs.pojo.PojoCodecImpl.decode(PojoCodecImpl.java:122)
at org.bson.codecs.pojo.PojoCodecImpl.decode(PojoCodecImpl.java:126)
at com.mongodb.operation.CommandResultArrayCodec.decode(CommandResultArrayCodec.java:52)
The enumeration:
public enum SimplePhaseEnumType {
PROPOSED("Proposed"),
INTERIM("Interim"),
MODIFIED("Modified"),
ASSIGNED("Assigned");
private final String value;
SimplePhaseEnumType(String v) {
value = v;
}
public String value() {
return value;
}
public static SimplePhaseEnumType fromValue(String v) {
for (SimplePhaseEnumType c: SimplePhaseEnumType.values()) {
if (c.value.equals(v)) {
return c;
}
}
throw new IllegalArgumentException(v);
}}
And the class that uses the enumeration (only showing the relevant fields):
public class SpecificPhaseType {
protected SimplePhaseEnumType value;
protected String date;
public SimplePhaseEnumType getValue() {
return value;
}
public void setValue(SimplePhaseEnumType value) {
this.value = value;
}}
I was looking for a way to maybe annotate the class to tell the driver to use a different method to serialize / deserialize those fields when they are encountered. I know how to skip them during the serialization / deserialization but that doesn't fix the problem:
public class SpecificPhaseType {
@BsonIgnore
protected SimplePhaseEnumType value;
Any pointers on where I could look (code, documentation)? I already checked PojoQuickTour.java, MongoDB Driver Quick Start - POJOs, and POJOs - Plain Old Java Objects.
Thanks!
--Jose
I figured out what to do. You first need to write a custom Codec that reads and writes the enum as a String (an ordinal is another option if you want to save space, but a String was more than OK for me):
package com.kodegeek.cvebrowser.persistence.serializers;
import com.kodegeek.cvebrowser.entity.SimplePhaseEnumType;
import org.bson.BsonReader;
import org.bson.BsonWriter;
import org.bson.codecs.Codec;
import org.bson.codecs.DecoderContext;
import org.bson.codecs.EncoderContext;
public class SimplePhaseEnumTypeCodec implements Codec<SimplePhaseEnumType>{
@Override
public SimplePhaseEnumType decode(BsonReader reader, DecoderContext decoderContext) {
return SimplePhaseEnumType.fromValue(reader.readString());
}
@Override
public void encode(BsonWriter writer, SimplePhaseEnumType value, EncoderContext encoderContext) {
writer.writeString(value.value());
}
@Override
public Class<SimplePhaseEnumType> getEncoderClass() {
return SimplePhaseEnumType.class;
}
}
Then you need to register the new codec so MongoDB can handle the enum using your class:
/**
* MongoDB could not make this any simpler ;-)
* @return a Codec registry
*/
public static CodecRegistry getCodecRegistry() {
final CodecRegistry defaultCodecRegistry = MongoClient.getDefaultCodecRegistry();
final CodecProvider pojoCodecProvider = PojoCodecProvider.builder().register(packages).build();
final CodecRegistry cvePojoCodecRegistry = CodecRegistries.fromProviders(pojoCodecProvider);
final CodecRegistry customEnumCodecs = CodecRegistries.fromCodecs(
new SimplePhaseEnumTypeCodec(),
new StatusEnumTypeCodec(),
new TypeEnumTypeCodec()
);
return CodecRegistries.fromRegistries(defaultCodecRegistry, customEnumCodecs, cvePojoCodecRegistry);
}
Jackson makes it easier to register a custom serializer/deserializer with annotations like @JsonSerialize / @JsonDeserialize, while Mongo forces you to deal with the registry. Not a big deal :-)
You can peek at the full source code here. Hope this saves some time for anyone who has to deal with a similar issue.
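The codec above only works if value() and fromValue() are exact inverses of each other. Here is a small driver-free sketch that checks this round trip and also shows the ordinal alternative mentioned earlier. The enum is copied from the question so the example compiles on its own; the class EnumRoundTrip and its method names are hypothetical, not part of the MongoDB driver.

```java
// Copy of the enum from the question, so the sketch compiles without the MongoDB driver.
enum SimplePhaseEnumType {
    PROPOSED("Proposed"),
    INTERIM("Interim"),
    MODIFIED("Modified"),
    ASSIGNED("Assigned");

    private final String value;

    SimplePhaseEnumType(String v) {
        value = v;
    }

    String value() {
        return value;
    }

    static SimplePhaseEnumType fromValue(String v) {
        for (SimplePhaseEnumType c : values()) {
            if (c.value.equals(v)) {
                return c;
            }
        }
        throw new IllegalArgumentException(v);
    }
}

// Hypothetical helper mirroring what the codec's encode/decode methods do.
final class EnumRoundTrip {
    // String form: what encode() would hand to writer.writeString(...).
    static String toStored(SimplePhaseEnumType phase) {
        return phase.value();
    }

    // String form: what decode() would do with reader.readString().
    static SimplePhaseEnumType fromStored(String stored) {
        return SimplePhaseEnumType.fromValue(stored);
    }

    // Ordinal form: more compact, but it breaks if constants are reordered.
    static int toOrdinal(SimplePhaseEnumType phase) {
        return phase.ordinal();
    }

    static SimplePhaseEnumType fromOrdinal(int ordinal) {
        return SimplePhaseEnumType.values()[ordinal];
    }

    public static void main(String[] args) {
        for (SimplePhaseEnumType phase : SimplePhaseEnumType.values()) {
            System.out.println(phase + " -> \"" + toStored(phase) + "\" -> " + fromStored(toStored(phase)));
        }
    }
}
```

Storing the String keeps the documents readable and survives reordering of the enum constants, which is one reason to prefer it over the ordinal unless space really matters.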

Changing Java file to Stored procedure

I have a Java file named E2BXmlParser in which I read and manipulate XML data fetched from the database.
Now I am trying to execute the Java file using Oracle SQL Developer after changing the file like this:
CREATE AND COMPILE JAVA SOURCE NAMED "E2BXmlParser" AS
--(Rest of Code).
The rest of the code looks like this:
import oracle.jdbc.*;
import oracle.xdb.XMLType;
import oracle.xml.parser.v2.XMLDocument;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.*;
import java.sql.Connection;
import java.util.*;
import javax.xml.xpath.*;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.io.StringReader;
class Reaction {
}
public class E2BXmlParser {
//variables
public E2BXmlParser(int regReportId, int reportId) {
//connection
}
public static void parseXML(int regReportId, int reportId, int isBlinded, int reportFormid,int pi_is_r3_profile,int pi_max_length,String pi_risk_category) throws SQLException, XPathExpressionException, TransformerException {
//fetching data
}
private static Document getDocumentFromString(String xmlContent) throws Exception {
}
private String getStringByElementName(String tagName, Element element) {
}
private OracleConnection getConnecton() {
//oracle connection
}
private Document getXmlDocumentFromDb(int regReportId, int reportId) {
//fetching and manipulating data
}
private List<Reaction> getReactionIds() {
//logic
}
private void findById(Reaction reaction, String id) {
//xpath for finding nodes
}
private boolean checkNodeExists(Element el, String nodeName) {
NodeList list = el.getElementsByTagName(nodeName);
return list.getLength() > 0;
}
private void updateNode(Reaction reaction, Element el) {
//update xml
}
private void updateXmlInDB(int regReportId, int reportId) throws SQLException {
//update xml in db
}
private void updateDrugNode() {
Element rootElement = document.getDocumentElement();
//logic
}
private void updateDrugEventandDrugRelatedness(int reportFormid) {
//update xml
}
private void updateMedicinalActiveSubstance(int regReportId, int isBlinded, int reportFormid,int pi_is_r3_profile,int pi_max_length,String pi_risk_category) {
//update xml after fetching data and changing in DB
}
private Boolean compareStrings(String strOne, String strTwo) {
//logic
}
private void updateDosageInformation() {
//logic
}
private void updateActiveSubstanceName() {
//update active substance using XPath
}
private void RemoveDuplicateActiveSubstance(NodeList activesubstancenameList, List<String> names) {
// logic
}
}
Now it asks for multiple values (reactions, nodelist, node) that are used in the code.
This does not happen when I load the Java file from the command line like this:
loadjava -user username/password@DBalias -r E2BXmlParser.java
P.S. I have to change my E2BXmlParser.java file into an E2BXmlParser.sql file so that I can execute it from Oracle SQL Developer.
Please help.
The easiest solution is to wrap all the logic of your class in one static method of the class. Next, you have to publish this method to PL/SQL.
The publication of the static method will look (more or less) like this:
CREATE PROCEDURE parseXML (regReportId NUMBER, reportId NUMBER, isBlinded NUMBER, reportFormid NUMBER, pi_is_r3_profile NUMBER, pi_max_length NUMBER, pi_risk_category varchar2)
AS LANGUAGE JAVA
NAME 'E2BXmlParser.parseXML(int, int, int, int, int, int, java.lang.String)';
Note: the NAME clause lists only the Java parameter types, and object types must be fully qualified, for example java.lang.String instead of String.
Of course, Oracle also allows you to use a Java class in a more object-oriented way, but this is more complicated.
For more information, check this manual: https://docs.oracle.com/cd/E18283_01/java.112/e10588/toc.htm
Chapter 3 (Calling Java Methods in Oracle Database) - for basic solutions.
Chapter 6 (Publishing Java Classes With Call Specifications), paragraph "Writing Object Type Call Specifications" - for publishing a full Java class.

Spark the best way to load data set in java language

I have a data set like this:
Result categoricF1 categoricF2 categoricF3
N red a apple
P green b banana
....
I will then convert each element in each column into a bit representation.
For example, red will be 10000 and green will be 01000, and I will then store 10000 in a BigInteger array. I will do the same for each element in the data set.
What is the best way to load the data in this case (DataFrame, Dataset, or RDD)?
I need code in Java. Thanks in advance for your help.
Spark Datasets are similar to RDDs; however, instead of using Java serialization or Kryo, they use a specialized Encoder to serialize the objects for processing or transmitting over the network. While both encoders and standard serialization are responsible for turning an object into bytes, encoders are code generated dynamically and use a format that allows Spark to perform many operations like filtering, sorting and hashing without deserializing the bytes back into an object.
For example, you have a class ClassName which contains all parameters you require in your data.
import java.io.Serializable;
public class ClassName implements Serializable {
private String result;
private String categoricF1;
private String categoricF2;
private String categoricF3;
public String getResult() {
return result;
}
public String getCategoricF1() {
return categoricF1;
}
public String getCategoricF2() {
return categoricF2;
}
public String getCategoricF3() {
return categoricF3;
}
public void setResult(String result) {
this.result = result;
}
public void setCategoricF1(String categoricF1) {
this.categoricF1 = categoricF1;
}
public void setCategoricF2(String categoricF2) {
this.categoricF2 = categoricF2;
}
public void setCategoricF3(String categoricF3) {
this.categoricF3 = categoricF3;
}
}
Then to create Dataset of required data, you can code like this:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;
import java.util.ArrayList;
import java.util.List;
public class Test {
public static void main(String[] args) {
SparkSession spark = SparkSession
.builder()
.appName("Java Spark SQL basic example")
.master("local")
.getOrCreate();
// Create an instance of a Bean class
ClassName elem1 = new ClassName();
elem1.setResult("N");
elem1.setCategoricF1("red");
elem1.setCategoricF2("a");
elem1.setCategoricF3("apple");
ClassName elem2 = new ClassName();
elem2.setResult("P");
elem2.setCategoricF1("green");
elem2.setCategoricF2("b");
elem2.setCategoricF3("banana");
List<ClassName> obj = new ArrayList<>();
obj.add(elem1);
obj.add(elem2);
// Encoders are created for Java beans
Encoder<ClassName> classNameEncoder = Encoders.bean(ClassName.class);
Dataset<ClassName> javaBeanDS = spark.createDataset(obj, classNameEncoder);
javaBeanDS.show();
}
}
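The bit-encoding step described in the question (red as 10000, green as 01000) is independent of how the rows are loaded, so it can be sketched in plain Java. The class OneHotEncoder, its methods, and the extra category values beyond red and green are assumptions made for illustration; the sketch also assumes that the category at index i sets the i-th bit counted from the left of a fixed-width bit string.

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: maps each distinct category of one column to a one-hot BigInteger.
final class OneHotEncoder {
    private final Map<String, Integer> indexByCategory = new LinkedHashMap<>();
    private final int width;

    OneHotEncoder(List<String> categories, int width) {
        for (String category : categories) {
            // The first occurrence of a category fixes its index; duplicates are ignored.
            indexByCategory.putIfAbsent(category, indexByCategory.size());
        }
        if (indexByCategory.size() > width) {
            throw new IllegalArgumentException("width is too small for " + indexByCategory.size() + " categories");
        }
        this.width = width;
    }

    // Index 0 sets the leftmost of `width` bits, index 1 the next one, and so on.
    BigInteger encode(String category) {
        Integer index = indexByCategory.get(category);
        if (index == null) {
            throw new IllegalArgumentException("Unknown category: " + category);
        }
        return BigInteger.ZERO.setBit(width - 1 - index);
    }

    // BigInteger.toString(2) drops leading zeros, so pad back to the fixed width.
    String toBitString(BigInteger bits) {
        StringBuilder sb = new StringBuilder(bits.toString(2));
        while (sb.length() < width) {
            sb.insert(0, '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "blue", "yellow" and "black" are made-up values to fill the five positions.
        OneHotEncoder encoder = new OneHotEncoder(
                Arrays.asList("red", "green", "blue", "yellow", "black"), 5);
        for (String color : Arrays.asList("red", "green")) {
            System.out.println(color + " -> " + encoder.toBitString(encoder.encode(color)));
        }
    }
}
```

With a Dataset of beans like the one above, such an encoder could be applied per column value inside a map step, assuming the set of distinct categories for each column has been collected beforehand.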

storing custom object with different primitive types in ArrayList

I want to create a custom object with three ints and a String and store that object in an ArrayList, but I am having issues with it and haven't been able to find documentation for my exact issue online. I'm getting errors on the fac.add call. Here is the code:
**Made some changes to the code
package facility;
import dal.DataAccess;
public class FacilityInfo implements Facility {
private int buildingNo, roomNo, capacity;
private String type; //classroom, conference room, office, etc.
FacilityInfo(){}//default constructor
FacilityInfo(int b, int r, int c, String t){
this.buildingNo = b;
this.roomNo = r;
this.capacity = c;
this.type = t;
}
package dal;
import java.util.*;
import facility.FacilityInfo;
public class DataAccess {
List<FacilityInfo> fac = new ArrayList<FacilityInfo>();
fac.add(new FacilityInfo (1,2,10,conference));//changed code here
}
There are two main reasons for that.
First, 1,2,10,conference isn't a FacilityInfo object. You can't add the constructor arguments of a FacilityInfo to the List; you have to add an actual object. Note also that the type must be a quoted String literal, "conference".
Second, you can't have statements outside of a code block, and currently you are calling fac.add(...); directly in the class body.
Try something like:
public class DataAccess {
List<FacilityInfo> fac = new ArrayList<FacilityInfo>();
public void initializeFac() {
fac.add(new FacilityInfo(1,2,10,"conference"));
// etc.
}
}
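For completeness, here is a self-contained variant that compiles and runs on its own. It is only a sketch: FacilityInfo is simplified (the Facility interface from the question is omitted), the list is filled in a constructor instead of initializeFac(), and the second entry plus the getters are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the question's FacilityInfo (no Facility interface).
final class FacilityInfo {
    private final int buildingNo;
    private final int roomNo;
    private final int capacity;
    private final String type; // classroom, conference room, office, etc.

    FacilityInfo(int buildingNo, int roomNo, int capacity, String type) {
        this.buildingNo = buildingNo;
        this.roomNo = roomNo;
        this.capacity = capacity;
        this.type = type;
    }

    String getType() {
        return type;
    }

    int getCapacity() {
        return capacity;
    }

    @Override
    public String toString() {
        return type + " (building " + buildingNo + ", room " + roomNo + ", capacity " + capacity + ")";
    }
}

final class DataAccess {
    // A field declaration with an initializer is allowed directly in the class body.
    private final List<FacilityInfo> fac = new ArrayList<>();

    // Statements such as fac.add(...) must live in a constructor, method, or initializer block.
    DataAccess() {
        fac.add(new FacilityInfo(1, 2, 10, "conference"));
        fac.add(new FacilityInfo(1, 3, 30, "classroom")); // hypothetical second entry
    }

    List<FacilityInfo> getFacilities() {
        return fac;
    }

    public static void main(String[] args) {
        for (FacilityInfo facility : new DataAccess().getFacilities()) {
            System.out.println(facility);
        }
    }
}
```

Filling the list in the constructor means every DataAccess instance starts populated; a separate method such as initializeFac() is the better fit when the data should be loaded on demand.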

Junit-Quickcheck: Generate String matching a pattern

I am using pholser's port. I have to generate strings of length 40 that match a given pattern like [a-zA-Z0-9\\.\\-\\\\;\\:\\_\\#\\[\\]\\^/\\|\\}\\{]*
I extend the Generator class as:
public class InputGenerator extends Generator<TestData> {...}
It overrides a method:
public TestData generate(SourceOfRandomness random, GenerationStatus status) {...}
Now, random has functions like nextDouble(), nextInt() but there is nothing for strings! How can I generate random strings matching the above pattern?
Below is a snippet for a custom generator that implements the generate(..) method to return a random string matching your posted pattern.
public class MyCharacterGenerator extends Generator<String> {
private static final String LOWERCASE_CHARS = "abcdefghijklmnopqrstuvwxyz";
private static final String UPPERCASE_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
private static final String NUMBERS = "0123456789";
private static final String SPECIAL_CHARS = ".-\\;:_#[]^/|}{";
private static final String ALL_MY_CHARS = LOWERCASE_CHARS
+ UPPERCASE_CHARS + NUMBERS + SPECIAL_CHARS;
public static final int CAPACITY = 40;
public MyCharacterGenerator () {
super(String.class);
}
@Override
public String generate(SourceOfRandomness random, GenerationStatus status) {
StringBuilder sb = new StringBuilder(CAPACITY);
for (int i = 0; i < CAPACITY; i++) {
int randomIndex = random.nextInt(ALL_MY_CHARS.length());
sb.append(ALL_MY_CHARS.charAt(randomIndex));
}
return sb.toString();
}
}
Edit: a simple unit test to demonstrate the usage of the MyCharacterGenerator class.
import com.pholser.junit.quickcheck.ForAll;
import com.pholser.junit.quickcheck.From;
import static org.junit.Assert.assertTrue;
import org.junit.contrib.theories.Theories;
import org.junit.contrib.theories.Theory;
import org.junit.runner.RunWith;
@RunWith(Theories.class)
public class MyCharacterGeneratorTest {
@Theory
public void shouldHold(@ForAll @From(MyCharacterGenerator.class) String s) {
// here you should add your unit test which uses the generated output
//
// assertTrue(doMyUnitTest(s) == expectedResult);
// the below lines only for demonstration and currently
// check that the generated random has the expected
// length and matches the expected pattern
System.out.println("shouldHold(): " + s);
assertTrue(s.length() == MyCharacterGenerator.CAPACITY);
assertTrue(s.matches("[a-zA-Z0-9.\\-\\\\;:_#\\[\\]^/|}{]*"));
}
}
Sample output generated by shouldHold:
shouldHold(): MD}o/LAkW/hbJVWPGdI;:RHpwo_T.lGs^DOFwu2.
shouldHold(): IT_O{8Umhkz{#PY:pmK6}Cb[Wc19GqGZjWVa#4li
shouldHold(): KQwpEz.CW28vy_/WJR3Lx2.tRC6uLIjOTQtYP/VR
shouldHold(): pc2_T4hLdZpK78UfcVmU\RTe9WaJBSGJ}5v#z[Z\
...
There is no random.nextString(), but there is a way to generate random strings with the junit-quickcheck-generators library. You can access it when creating new generators using gen().type(String.class). However, it seems we don't have much control over the strings it produces.
Here is a silly example of a StringBuilder generator to demonstrate how to use the String generator:
import com.pholser.junit.quickcheck.generator.GenerationStatus;
import com.pholser.junit.quickcheck.generator.Generator;
import com.pholser.junit.quickcheck.random.SourceOfRandomness;
public class StringBuilderGenerator extends Generator<StringBuilder> {
public StringBuilderGenerator() {
super(StringBuilder.class);
}
@Override
public StringBuilder generate(SourceOfRandomness random, GenerationStatus status) {
String s = gen().type(String.class).generate(random, status);
return new StringBuilder(s);
}
}
I just made a library that is supposed to do what you want in a generic way: https://github.com/SimY4/coregex
Simple usage example:
import com.pholser.junit.quickcheck.Property;
import com.pholser.junit.quickcheck.runner.JUnitQuickcheck;
import org.junit.runner.RunWith;
import java.util.UUID;
import static org.junit.Assert.assertEquals;
@RunWith(JUnitQuickcheck.class)
public class CoregexGeneratorTest {
@Property
public void shouldGenerateMatchingUUIDString(
@Regex("[0-9a-f]{8}-[0-9a-f]{4}-[0-5][0-9a-f]{3}-[089ab][0-9a-f]{3}-[0-9a-f]{12}")
String uuid) {
assertEquals(uuid, UUID.fromString(uuid).toString());
}
}
