Kryo: Difference between readClassAndObject/readObject and writeClassAndObject/writeObject - java

I am trying to understand the following statement from the documentation:
If the concrete class of the object is not known and the object could be null:
kryo.writeClassAndObject(output, object);
Object object = kryo.readClassAndObject(input);
What exactly does "the concrete class is not known" mean?
I have the following code:
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.io.{Input, Output}
// Serializer package taken from the stack trace in this question
import com.romix.scala.serialization.kryo.{ScalaImmutableAbstractMapSerializer, ScalaProductSerializer}

case class RawData(modelName: String,
                   sourceType: String,
                   deNormalizedVal: String,
                   normalVal: Map[String, String])

object KryoSpike extends App {
  val kryo = new Kryo()
  kryo.setRegistrationRequired(false)
  kryo.addDefaultSerializer(classOf[scala.collection.Map[_,_]], classOf[ScalaImmutableAbstractMapSerializer])
  kryo.addDefaultSerializer(classOf[scala.collection.generic.MapFactory[scala.collection.Map]], classOf[ScalaImmutableAbstractMapSerializer])
  kryo.addDefaultSerializer(classOf[RawData], classOf[ScalaProductSerializer])
  //val testin = Map("id" -> "objID", "field1" -> "field1Value")
  val testin = RawData("model1", "Json", "", Map("field1" -> "value1", "field2" -> "value2"))
  // Serialize: the class is written first, then the object
  val outStream = new ByteArrayOutputStream()
  val output = new Output(outStream, 20480)
  kryo.writeClassAndObject(output, testin)
  output.close()
  // Deserialize: the class is read from the stream, then the object
  val input = new Input(new ByteArrayInputStream(outStream.toByteArray), 4096)
  val testout = kryo.readClassAndObject(input)
  input.close()
  println(testout.toString)
}
When I use readClassAndObject and writeClassAndObject, it works. However, if I use writeObject and readObject, it does not:
Exception in thread "main" com.esotericsoftware.kryo.KryoException:
Class cannot be created (missing no-arg constructor):
com.romix.scala.serialization.kryo.ScalaProductSerializer
I just don't understand why.
Earlier, using the same code but with a Map instead of my RawData class, it worked like a charm with writeObject and readObject. Hence I am confused.
Can someone help me understand this?

The difference is as follows:
- Use writeClassAndObject and readClassAndObject when you're using a serializer that:
  - serializes a base type: an interface, a class that has subclasses, or (in the case of Scala) a trait like Product,
  - and needs the type (i.e. the Class object) of the deserialized object to construct this object (without this type, it doesn't know what to construct).
  Example: ScalaProductSerializer.
- Use writeObject and readObject when you're using a serializer that:
  - serializes exactly one type (i.e. a class that can be instantiated; example: EnumSetSerializer),
  - or serializes more than one type, but the specific type can somehow be deduced from the serialized data (example: ScalaImmutableAbstractMapSerializer).
To sum this up for your specific case:
- When you deserialize your RawData:
  - ScalaProductSerializer needs to find out the exact type of Product to create an instance,
  - so it uses the typ: Class[Product] parameter to do it,
  - as a result, only readClassAndObject works.
- When you deserialize a Scala immutable map (scala.collection.immutable.Map imported as IMap):
  - ScalaImmutableAbstractMapSerializer doesn't need to find out the exact type: it uses IMap.empty to create an instance,
  - as a result, it doesn't use the typ: Class[IMap[_, _]] parameter,
  - as a result, both readObject and readClassAndObject work.
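The distinction above can be illustrated without Kryo at all. The following is a minimal, library-free sketch of the idea only: the class ClassTagSketch and its methods are hypothetical, they merely borrow Kryo's method names, and the byte layout is not Kryo's wire format. One pair of methods writes only the payload, so the reader has to be told the class; the other pair writes the class name first, so the reader can discover it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical illustration; not Kryo's API or format.
final class ClassTagSketch {
    // Payload only: the reader must already know which class to expect.
    static byte[] writeObject(Object value) {
        return write(value, false);
    }

    // Class name first, then the payload: the reader can discover the class.
    static byte[] writeClassAndObject(Object value) {
        return write(value, true);
    }

    // The caller supplies the class because the bytes do not contain it.
    static <T> T readObject(byte[] data, Class<T> type) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            return type.cast(readPayload(in, type));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // The class is recovered from the stream before the payload is read.
    static Object readClassAndObject(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            Class<?> type = Class.forName(in.readUTF());
            return readPayload(in, type);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] write(Object value, boolean withClass) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (withClass) {
                out.writeUTF(value.getClass().getName());
            }
            if (value instanceof Integer) {
                out.writeInt((Integer) value);
            } else if (value instanceof String) {
                out.writeUTF((String) value);
            } else {
                throw new IllegalArgumentException("unsupported type: " + value.getClass());
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Only Integer and String are supported in this sketch.
    private static Object readPayload(DataInputStream in, Class<?> type) throws IOException {
        if (type == Integer.class) {
            return in.readInt();
        }
        if (type == String.class) {
            return in.readUTF();
        }
        throw new IllegalArgumentException("unsupported type: " + type);
    }

    public static void main(String[] args) {
        System.out.println(readClassAndObject(writeClassAndObject("hello")));
        System.out.println(readObject(writeObject(42), Integer.class));
    }
}
```

In this sketch, a reader that needs the concrete class to build the object can only get it in one of two ways: from the stream (the class-and-object pair) or from the caller (the Class argument of readObject).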

Related

Jackson - deserialize json into generic type

I have a model that looks like this:
@JsonIgnoreProperties(ignoreUnknown = true)
public class InputMessage<T> {
    @JsonProperty("UUID")
    private String UUID;
    @JsonProperty("MessageType")
    private String messageType;
    @JsonProperty("KeyData")
    private T keyData;
    ...
    // getters/setters
}
This will be in a library that will be called from arbitrary clients, so the KeyData field has a generic type. If I try to read the key data from the client code, I get a ClassCastException: java.lang.ClassCastException: class java.util.LinkedHashMap cannot be cast to class model.KeyData
Edit:
I tried the constructParametricType() suggestion, but I still get an error.
ObjectMapper objectMapper = new ObjectMapper();
JavaType type = objectMapper.getTypeFactory().constructParametricType(InputMessage.class, KeyData.class);
KeyData keyData = objectMapper.readValue(inputMessage.getKeyData().toString(), type);
The json that I'm attempting to deserialize looks like this:
{
UUID: 9bae9a6a-5553-4716-8a85-995f36df7732,
KeyData: {
CNSM_ID: 2,
LGCY_SRC_ID: 123,
PARTN_NBR: 1,
PCD_EFF_DT: 2019-01-01,
SRC_CD: AB
},
MessageType: provider_selection,
Partition: 3,
Rows: [
{
Type: l_cov_prdt_pcd_w_srch,
SchemaID: 2,
Value: base64encoded value
}
]
}
The library I'm using to deserialize the json is com.fasterxml.jackson.core:jackson-databind:2.9.8
Any help with this would be greatly appreciated.
When you deserialize JSON into a generic class, Jackson cannot guess the generic type used, since that information is not present in the JSON.
And when it doesn't know how to deserialize a field, it uses a java.util.LinkedHashMap as the target.
What you want is a properly typed instance such as:
InputMessage<KeyData> inputMessage = ...;
KeyData keyData = inputMessage.getKeyData();
An elegant way to solve this is to define a Jackson JavaType for the class by specifying the expected generic type argument.
TypeFactory.constructParametricType(Class parametrized, Class... parameterClasses) allows that.
Supposing you want to deserialize to InputMessage<KeyData>, you can do:
JavaType type = mapper.getTypeFactory().constructParametricType(InputMessage.class, KeyData.class);
InputMessage<KeyData> inputMessage = mapper.readValue(json, type);
About your comment:
The library code with the generic type knows nothing about the
KeyData class, so I assume it belongs in the client code?
The library doesn't need to know this class. Clients, however, should pass the class to the library so that it can perform the deserialization correctly and return a properly parameterized instance rather than a raw type.
For example, clients could use the library in this way:
InputMessage<KeyData> inputMessage = myJsonLibrary.readValue("someValueIfNeeded", KeyData.class);
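To see why the ClassCastException from the question only appears in client code, here is a small library-free sketch of the erasure effect. It does not use Jackson: readWithoutTypeInfo is a hypothetical stand-in for a deserializer that has no type information for T, and KeyData and InputMessage are simplified stand-ins for the classes in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of type erasure; no Jackson involved.
final class ErasureSketch {
    static final class KeyData { }

    static final class InputMessage<T> {
        private T keyData;
        T getKeyData() { return keyData; }
        void setKeyData(T keyData) { this.keyData = keyData; }
    }

    // Mimics a deserializer that does not know T: it falls back to a
    // LinkedHashMap, and the unchecked cast is not verified at runtime.
    @SuppressWarnings("unchecked")
    static <T> InputMessage<T> readWithoutTypeInfo() {
        InputMessage<T> message = new InputMessage<T>();
        Map<String, Object> fallback = new LinkedHashMap<String, Object>();
        fallback.put("CNSM_ID", 2);
        message.setKeyData((T) fallback); // T is erased, so nothing fails here
        return message;
    }

    // Returns true if using the field as KeyData throws ClassCastException.
    static boolean failsAtUseSite() {
        InputMessage<KeyData> message = readWithoutTypeInfo();
        try {
            // The compiler inserts a cast to KeyData at this assignment.
            KeyData keyData = message.getKeyData();
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("fails at use site: " + failsAtUseSite());
    }
}
```

The object is created without complaint and only breaks where the client treats the field as KeyData, which is why the type has to be supplied up front (for example through a JavaType) rather than fixed afterwards.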

TypeReference<Map<String, String>>() { }

A few days ago I started working on a web service project. The project uses Jackson to marshal and unmarshal JSON objects. So my question is:
Why do I always have to put the {} when I create an instance of TypeReference? I know the constructor is protected, but why is it protected? It looks like a hack to make the constructor accessible by creating an anonymous subclass, which is possible because TypeReference is abstract. But what is the point of this?
String jsonString = "{\"firstName\":\"John\",\"lastName\":\"Chen\"}";
ObjectMapper objectMapper = new ObjectMapper();
// properties will store name and value pairs read from jsonString
Map<String, String> properties = objectMapper.readValue(
        jsonString, new TypeReference<Map<String, String>>() {
        });
TL;DR
Via subclassing it is possible for TypeReference to extract the actual generic type parameter. E.g:
TypeReference<String> ref = new TypeReference<String>(){};
System.out.println(ref.getType());
Prints:
class java.lang.String
This can be useful when you can't use a plain class literal. E.g. when this doesn't work:
// doesn't compile
Type type = ArrayList<String>.class;
You can still get that type by using a TypeReference:
// yields a ParameterizedType representing java.util.ArrayList<java.lang.String>
Type type = new TypeReference<ArrayList<String>>(){}.getType();
Detailed
When looking at the source code of TypeReference (using Jackson 2.8.5) you can see that the constructor body contains the following lines:
Type superClass = getClass().getGenericSuperclass();
if (superClass instanceof Class<?>) { // sanity check, should never happen
throw new IllegalArgumentException("Internal error: TypeReference constructed without actual type information");
}
_type = ((ParameterizedType) superClass).getActualTypeArguments()[0];
The interesting lines are the first and last. Let's take a closer look at the first line:
Type superClass = getClass().getGenericSuperclass();
For example, when you create a subclass by using an anonymous class:
TypeReference<SomeType> ref = new TypeReference<SomeType>(){};
Then getClass() returns the Class object of the anonymous subclass, and getGenericSuperclass() returns the generic superclass that this subclass extends, including its type arguments. In our case, superClass is a ParameterizedType representing TypeReference<SomeType>.
Now when looking at the last line from the constructor body:
_type = ((ParameterizedType) superClass).getActualTypeArguments()[0];
Since superClass represents TypeReference<SomeType>, we know that it carries a type argument, hence the cast to ParameterizedType. ParameterizedType has the method getActualTypeArguments(), which returns an array of all type arguments specified for that type. In our case there is just one, so [0] yields the first element: the actually specified type parameter, SomeType.
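The mechanism described above can be reproduced with the JDK alone. The sketch below is a minimal re-creation of the idea, not Jackson's actual class: TypeToken is a hypothetical name, and its constructor follows the same steps as the lines quoted from TypeReference.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;

// Hypothetical illustration of the "anonymous subclass" trick.
final class TypeTokenSketch {
    abstract static class TypeToken<T> {
        private final Type type;

        protected TypeToken() {
            // For an anonymous subclass this is TypeToken<Something>.
            Type superClass = getClass().getGenericSuperclass();
            if (superClass instanceof Class<?>) {
                // Happens when the subclass extends the raw type TypeToken.
                throw new IllegalArgumentException("constructed without actual type information");
            }
            this.type = ((ParameterizedType) superClass).getActualTypeArguments()[0];
        }

        Type getType() {
            return type;
        }
    }

    // The trailing { } creates the anonymous subclass that records T.
    static Type stringType() {
        return new TypeToken<String>() { }.getType();
    }

    static Type listOfStringType() {
        return new TypeToken<ArrayList<String>>() { }.getType();
    }

    public static void main(String[] args) {
        System.out.println(stringType());
        System.out.println(listOfStringType());
    }
}
```

Without the braces there would be no subclass, and therefore no generic superclass declaration from which the type argument could be read. Making the class abstract with a protected constructor forces callers to write the braces.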

Spark Scala Dynamic creation of Serializable object

I need to use a tester in a Scala Spark filter, where the tester implements Java's Predicate interface and its concrete class name is passed in as an argument.
I'm doing something like this:
val tester = Class.forName(qualifiedName).newInstance().asInstanceOf[Predicate[T]]
var filtered = rdd.filter(elem => tester.test(elem))
The problem is that at runtime I get a Spark "Task not serializable" exception, because my specific Predicate class is not Serializable.
If I do
val tester = Class.forName(qualifiedName).newInstance()
.asInstanceOf[Predicate[T] with Serializable]
var filtered = rdd.filter(elem => tester.test(elem))
I get the same error.
If I create the tester inside the rdd.filter call, it works:
var filtered = rdd.filter { elem =>
val tester = Class.forName(qualifiedName).newInstance()
.asInstanceOf[Predicate[T] with Serializable]
tester.test(elem)
}
But I would like to create a single object (maybe to broadcast) for testing. How can I resolve this?
You simply have to require that the class implements Serializable. Note that the asInstanceOf[Predicate[T] with Serializable] cast is a lie: it doesn't actually check that the value is Serializable, which is why the second case doesn't produce an error immediately during the cast, and the last one "succeeds".
But I would create a single object (maybe to broadcast) for testing.
You can't. Broadcast or not, deserialization will create new objects on worker nodes. But you can create only a single instance on each partition:
var filtered = rdd.mapPartitions { iter =>
val tester = Class.forName(qualifiedName).newInstance()
.asInstanceOf[Predicate[T]]
iter.filter(tester.test)
}
It will actually perform better than serializing the tester, sending it, and deserializing it would, since it's strictly less work.

Gson-like library for scala

I'm learning Scala. I'm trying to find an easy way to turn a JSON string into a Scala case class instance. Java has a wonderful library called Google Gson. It can turn a Java bean into JSON and back without any special coding; basically you can do it in a single line of code.
public class Example {
    private String firstField;
    private Integer secondIntField;
    // constructor
    // getters/setters here
}
// Bean instance to JSON string
String exampleAsJson = new Gson().toJson(new Example("hehe", 42));
// JSON string to bean instance
Example exampleFromJson = new Gson().fromJson(exampleAsJson, Example.class);
I'm reading https://www.playframework.com/documentation/2.5.x/ScalaJson and can't get the idea: why is it so complex in Scala? Why should I write readers/writers to serialize/deserialize plain, simple case class instances? Is there an easy way to convert case class instance -> JSON -> case class instance using the Play JSON API?
Let's say you have
case class Foo(a: String, b: String)
You can easily write a formatter for this in Play by doing
implicit val fooFormat = Json.format[Foo]
This will allow you to both serialize and deserialize to JSON.
val foo = Foo("1","2")
val js = Json.toJson(foo)(fooFormat) // Only include the specific format if it's not in scope.
val fooBack = js.as[Foo] // Now you have foo back!
Check out uPickle
Here's a small example:
case class Example(firstField: String, secondIntField: Int)
val ex = Example("Hello", 3)
write(ex) // { "firstField": "Hello", "secondIntField" : 3 }

Obtain Class from Jackson TypeReference

I have some code which needs to unpick a Jackson TypeReference to find out if it is a Collection. At the moment the best I can come up with is:
// Sample type reference - in reality this is an argument to the method
final TypeReference<List<String>> typeRef = new TypeReference<List<String>>(){};
// Obtain the Java reflection type from the TypeReference
final Type type = typeRef.getType() instanceof ParameterizedType ? ((ParameterizedType)typeRef.getType()).getRawType() : typeRef.getType();
// Obtain the name of the class (or interface)
final String typeName = type.toString().replace("class ", "").replace("interface ", "");
// And find out if it is a Collection
final boolean isCollection = Collection.class.isAssignableFrom(Class.forName(typeName));
But I would hope that there is a way to do this without string manipulation. Is there a better way to go from the Java Type to the Class, or indeed to check assignability directly from either the TypeReference or the Type?
This needs to work on Android so any features added in Java 8 can't be used.
Based on your line of code,
final Type type = typeRef.getType() instanceof ParameterizedType ? ((ParameterizedType)typeRef.getType()).getRawType() : typeRef.getType();
you can cast the result to a Class like this:
final Class<?> clazz = (Class<?>)(typeRef.getType() instanceof ParameterizedType ? ((ParameterizedType)typeRef.getType()).getRawType() : typeRef.getType());
To add a little more explanation:
In the first scenario (typeRef.getType() is an instance of ParameterizedType), you are retrieving the raw type, which is a Class.
In the second scenario (typeRef.getType() is not an instance of ParameterizedType), it is normally a regular Class because it is not parameterized. The cast would fail only for the less common kinds of Type, such as a generic array type or a type variable.
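The cast described above can be wrapped in a small helper that avoids string manipulation. This is a sketch under stated assumptions: RawClassSketch, rawClassOf and isCollection are hypothetical names, the helper takes a java.lang.reflect.Type (for example the result of typeRef.getType()), and it returns null for kinds of Type it does not handle. It uses no Java 8 features.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;

// Hypothetical helper; works on any java.lang.reflect.Type.
final class RawClassSketch {
    // Returns the raw Class behind a Type, or null if it is neither a
    // Class nor a ParameterizedType (e.g. a generic array or type variable).
    static Class<?> rawClassOf(Type type) {
        if (type instanceof Class<?>) {
            return (Class<?>) type;
        }
        if (type instanceof ParameterizedType) {
            Type raw = ((ParameterizedType) type).getRawType();
            return raw instanceof Class<?> ? (Class<?>) raw : null;
        }
        return null;
    }

    // Checks assignability directly on the Class, without parsing names.
    static boolean isCollection(Type type) {
        Class<?> raw = rawClassOf(type);
        return raw != null && Collection.class.isAssignableFrom(raw);
    }

    // Sample parameterized types obtained from anonymous subclasses,
    // standing in for what a TypeReference would provide.
    static Type sampleListType() {
        return new ArrayList<String>() { }.getClass().getGenericSuperclass();
    }

    static Type sampleMapType() {
        return new HashMap<String, String>() { }.getClass().getGenericSuperclass();
    }

    public static void main(String[] args) {
        System.out.println(isCollection(sampleListType()));
        System.out.println(isCollection(sampleMapType()));
        System.out.println(isCollection(String.class));
    }
}
```

With a Jackson TypeReference, the call would presumably be isCollection(typeRef.getType()).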
