We're using ChronicleMap to support off-heap persistence in a large number of different stores, but hit a bit of a problem with the simplest use case.
First of all, here's the helper I wrote to make creation easier:
import java.io.File
import java.util.concurrent.atomic.AtomicLong
import com.madhukaraphatak.sizeof.SizeEstimator
import net.openhft.chronicle.map.{ChronicleMap, ChronicleMapBuilder}
import scala.reflect.ClassTag
object ChronicleHelper {
def estimateSizes[Key, Value](data: Iterator[(Key, Value)], keyEstimator: AnyRef => Long = defaultEstimator, valueEstimator: AnyRef => Long = defaultEstimator): (Long, Long, Long) = {
println("Estimating sizes...")
val entries = new AtomicLong(1)
val keySum = new AtomicLong(1)
val valueSum = new AtomicLong(1)
var i = 0
val GroupSize = 5000
data.grouped(GroupSize).foreach { chunk =>
chunk.par.foreach { case (key, value) =>
entries.incrementAndGet()
keySum.addAndGet(keyEstimator(key.asInstanceOf[AnyRef]))
valueSum.addAndGet(valueEstimator(value.asInstanceOf[AnyRef]))
}
i += 1
println("Progress:" + i * GroupSize)
}
(entries.get(), keySum.get() / entries.get(), valueSum.get() / entries.get())
}
def defaultEstimator(v: AnyRef): Long = SizeEstimator.estimate(v)
def createMap[Key: ClassTag, Value: ClassTag](data: => Iterator[(Key, Value)], file: File): ChronicleMap[Key, Value] = {
val keyClass = implicitly[ClassTag[Key]].runtimeClass.asInstanceOf[Class[Key]]
val valueClass = implicitly[ClassTag[Value]].runtimeClass.asInstanceOf[Class[Value]]
val (entries, averageKeySize, averageValueSize) = estimateSizes(data)
val builder = ChronicleMapBuilder.of(keyClass, valueClass)
.entries(entries)
.averageKeySize(averageKeySize)
.averageValueSize(averageValueSize)
.asInstanceOf[ChronicleMapBuilder[Key, Value]]
val cmap = builder.createPersistedTo(file)
val GroupSize = 5000
println("Inserting data...")
var i = 0
data.grouped(GroupSize).foreach { chunk =>
chunk.par.foreach { case (key, value) =>
cmap.put(key, value)
}
i += 1
println("Progress:" + i * GroupSize)
}
cmap
}
def empty[Key: ClassTag, Value: ClassTag]: ChronicleMap[Key, Value] = {
val keyClass = implicitly[ClassTag[Key]].runtimeClass.asInstanceOf[Class[Key]]
val valueClass = implicitly[ClassTag[Value]].runtimeClass.asInstanceOf[Class[Value]]
ChronicleMapBuilder.of(keyClass, valueClass).create()
}
def loadMap[Key: ClassTag, Value: ClassTag](file: File): ChronicleMap[Key, Value] = {
val keyClass = implicitly[ClassTag[Key]].runtimeClass.asInstanceOf[Class[Key]]
val valueClass = implicitly[ClassTag[Value]].runtimeClass.asInstanceOf[Class[Value]]
ChronicleMapBuilder.of(keyClass, valueClass).createPersistedTo(file)
}
}
It uses https://github.com/phatak-dev/java-sizeof for object size estimation. Here's the kind of usage we want to support:
object TestChronicle {
def main(args: Array[String]) {
def dataIterator: Iterator[(String, Int)] = (1 to 5000).toIterator.zipWithIndex.map(x => x.copy(_1 = x._1.toString))
ChronicleHelper.createMap[String, Int](dataIterator, new File("/tmp/test.map"))
}
}
But it throws an exception:
[error] Exception in thread "main" java.lang.ClassCastException: Key must be a int but was a class java.lang.Integer
[error] at net.openhft.chronicle.hash.impl.VanillaChronicleHash.checkKey(VanillaChronicleHash.java:661)
[error] at net.openhft.chronicle.map.VanillaChronicleMap.queryContext(VanillaChronicleMap.java:281)
[error] at net.openhft.chronicle.map.VanillaChronicleMap.put(VanillaChronicleMap.java:390)
[error] at ...
I can see that it might have something to do with atomicity of Scala's Int as opposed to Java's Integer, but how do I bypass that?
Scala 2.11.7
Chronicle Map 3.8.0
It seems suspicious that in your test the data is an Iterator[(String, Int)] (rather than Iterator[(Int, String)]), so the key type is String and the value type is Int, while the error message is complaining about the key's type (int/Integer).
If the error message says "Key must be a %type%", it means that you configured that type in the initial ChronicleMapBuilder.of(keyType, valueType) call. In your case, you configured int.class (the Class object representing the primitive int type in Java), which is not allowed, and you are passing java.lang.Integer instances to the map's methods (you probably pass primitive ints, which become Integer through boxing), which is allowed. You should make sure that you pass java.lang.Integer.class (or some other Scala class) to the ChronicleMapBuilder.of(keyType, valueType) call.
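To see where the int class comes from in the question's helper: a ClassTag for Scala's Int reports the JVM primitive class as its runtimeClass, and that is what reaches ChronicleMapBuilder.of. The following is only a sketch of one way to substitute the boxed class first. BoxedClasses and boxedClass are hypothetical names, not part of Chronicle Map, and the sketch assumes that passing the boxed class to the builder is what you want for your key and value types.

```scala
import scala.reflect.{ClassTag, classTag}

// Hypothetical helper (not part of Chronicle Map): picks the class to hand to
// ChronicleMapBuilder.of, replacing JVM primitive classes with boxed ones.
object BoxedClasses {
  // classOf[Int] is the primitive `int` class, i.e. java.lang.Integer.TYPE.
  private val primitiveToBoxed: Map[Class[_], Class[_]] = Map(
    classOf[Boolean] -> classOf[java.lang.Boolean],
    classOf[Byte]    -> classOf[java.lang.Byte],
    classOf[Short]   -> classOf[java.lang.Short],
    classOf[Char]    -> classOf[java.lang.Character],
    classOf[Int]     -> classOf[java.lang.Integer],
    classOf[Long]    -> classOf[java.lang.Long],
    classOf[Float]   -> classOf[java.lang.Float],
    classOf[Double]  -> classOf[java.lang.Double]
  )

  // Returns the boxed class for primitives and the runtime class otherwise.
  def boxedClass[T: ClassTag]: Class[_] = {
    val runtime = classTag[T].runtimeClass
    primitiveToBoxed.getOrElse(runtime, runtime)
  }
}
```

In the helper from the question, keyClass and valueClass could be derived this way before the cast to Class[Key] and Class[Value].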
I don't know what size estimation this project gives: https://github.com/phatak-dev/java-sizeof, but in any case you should specify the size in bytes that the object will take in serialized form. The serialized form itself depends on the default serializers chosen for a specific type in Chronicle Map (which may change between Chronicle Map versions), or on custom serializers configured for the specific ChronicleMapBuilder. So configuring a Chronicle Map with key/value "sizes" obtained from anywhere other than Chronicle Map itself is fragile. You can use the following procedure to estimate sizes more reliably:
public static <V> double averageValueSize(Class<V> valueClass, Iterable<V> values) {
try (ChronicleMap<Integer, V> testMap = ChronicleMap.of(Integer.class, valueClass)
// doesn't matter, anyway not a single value will be written to a map
.averageValueSize(1)
.entries(1)
.create()) {
LongSummaryStatistics statistics = new LongSummaryStatistics();
for (V value : values) {
try (MapSegmentContext<Integer, V, ?> c = testMap.segmentContext(0)) {
statistics.accept(c.wrapValueAsData(value).size());
}
}
return statistics.getAverage();
}
}
You can find it in this test: https://github.com/OpenHFT/Chronicle-Map/blob/7aedfba7a814578a023f7975ef15ba88b4d435db/src/test/java/eg/AverageValueSizeTest.java
This procedure is hackish, but there are no better options right now.
Another recommendation:
If your keys or values are primitive-like (boxed ints, longs, doubles), or any other type that always has the same size, you shouldn't use the averageKey/averageValue/averageKeySize/averageValueSize methods; use the constantKeySizeBySample/constantValueSizeBySample methods instead. For java.lang.Integer, Long and Double specifically, even this is not needed: Chronicle Map already knows that those types are constantly sized.
In the example code below, I am trying to create case class objects with default values using runtime Scala reflection (required for my use case)!
First Approach
Define default values for case class fields
Create objects at runtime
Second Approach
Create a case class object in the companion object
Fetch that object using reflection
At first glance, the second approach seemed better because the object is created only once, but after profiling both approaches the second doesn't seem to add much value. Sampling confirms that only one object is created over the application's lifetime, yet it looks as though objects are still being created on every call when using reflection (correct me if I am wrong).
[Profiler screenshots: newDefault, newDefault2]
object TestDefault extends App {
case class XYZ(str: String = "Shivam")
object XYZ { private val default: XYZ = XYZ() }
case class ABC(int: Int = 99)
object ABC { private val default: ABC = ABC() }
def newDefault[A](implicit t: reflect.ClassTag[A]): A = {
import reflect.runtime.{universe => ru}
import reflect.runtime.{currentMirror => cm}
val clazz = cm.classSymbol(t.runtimeClass)
val mod = clazz.companion.asModule
val im = cm.reflect(cm.reflectModule(mod).instance)
val ts = im.symbol.typeSignature
val mApply = ts.member(ru.TermName("apply")).asMethod
val syms = mApply.paramLists.flatten
val args = syms.zipWithIndex.map {
case (p, i) =>
val mDef = ts.member(ru.TermName(s"apply$$default$$${i + 1}")).asMethod
im.reflectMethod(mDef)()
}
im.reflectMethod(mApply)(args: _*).asInstanceOf[A]
}
for (i <- 0 to 1000000000)
newDefault[XYZ]
// println(s"newDefault XYZ = ${newDefault[XYZ]}")
// println(s"newDefault ABC = ${newDefault[ABC]}")
def newDefault2[A](implicit t: reflect.ClassTag[A]): A = {
import reflect.runtime.{currentMirror => cm}
val clazz = cm.classSymbol(t.runtimeClass)
val mod = clazz.companion.asModule
val im = cm.reflect(cm.reflectModule(mod).instance)
val ts = im.symbol.typeSignature
val defaultMember = ts.members.filter(_.isMethod).filter(d => d.name.toString == "default").head.asMethod
val result = im.reflectMethod(defaultMember).apply()
result.asInstanceOf[A]
}
for (i <- 0 to 1000000000)
newDefault2[XYZ]
}
Is there any way to reduce the memory footprint? Any other better approach to achieve the same?
P.S. If you are trying to run this app, comment out one of the following loops at a time:
for (i <- 0 to 1000000000)
newDefault[XYZ]
for (i <- 0 to 1000000000)
newDefault2[XYZ]
EDIT
As per @Levi Ramsey's suggestion, I did try memoization, but it seems to make only a small difference!
val cache = new ConcurrentHashMap[universe.Type, XYZ]()
def newDefault2[A](implicit t: reflect.ClassTag[A]): A = {
import reflect.runtime.{currentMirror => cm}
val clazz = cm.classSymbol(t.runtimeClass)
val mod = clazz.companion.asModule
val im = cm.reflect(cm.reflectModule(mod).instance)
val ts = im.symbol.typeSignature
if (!cache.contains(ts)) {
val default = ts.members.filter(_.isMethod).filter(d => d.name.toString == "default").head.asMethod
cache.put(ts, im.reflectMethod(default).apply().asInstanceOf[XYZ])
}
cache.get(ts).asInstanceOf[A]
}
for (i <- 0 to 1000000000)
newDefault2[XYZ]
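Two details of the memoized version above may explain the small difference. ConcurrentHashMap.contains is a legacy method that tests whether some key maps to the given value (it behaves like containsValue), so contains(ts) is not a key lookup. The mirror and companion lookups also run on every call, before the cache is consulted. Below is a minimal sketch of a cache keyed by the runtime class and checked before any reflective work. DefaultCache and cached are hypothetical names, and the compute function stands in for the reflective lookup from the question.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.reflect.ClassTag

// Hypothetical sketch: memoize one default instance per runtime class.
object DefaultCache {
  private val cache = new ConcurrentHashMap[Class[_], AnyRef]()

  // `compute` is only invoked on a cache miss; pass the expensive
  // reflective lookup here.
  def cached[A <: AnyRef](compute: Class[_] => A)(implicit t: ClassTag[A]): A = {
    val cls = t.runtimeClass
    val hit = cache.get(cls) // key lookup; returns null on a miss
    if (hit != null) hit.asInstanceOf[A]
    else {
      val created = compute(cls)
      // Another thread may have won the race; keep whichever was stored first.
      val previous = cache.putIfAbsent(cls, created)
      (if (previous != null) previous else created).asInstanceOf[A]
    }
  }
}
```

With this shape, newDefault2 would reduce to something like DefaultCache.cached[A](cls => reflectiveDefault(cls)), where reflectiveDefault is a placeholder for the existing mirror-based code.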
I have a case class defined as below
case class ChooseBoxData[T](index:T, text:String)
Is it possible to declare a List so that it only accepts elements of type ChooseBoxData[String] and ChooseBoxData[Int]?
What I expected is something like:
val specialList:List[some type declaration] = List(
ChooseBoxData[String]("some string","some string"),/* allowed, because it is a ChooseBoxData[String] */
ChooseBoxData[Int](12,"some string"), /* also allowed, because it is a ChooseBoxData[Int] */
ChooseBoxData[Boolean](true,"some string")/* not allowed: only ChooseBoxData[String] or ChooseBoxData[Int] are accepted */
)
Something like this maybe:
trait AllowableBoxData
object AllowableBoxData {
private def of[T](cbd: ChooseBoxData[T]) = new ChooseBoxData(cbd.index, cbd.text)
with AllowableBoxData
implicit def ofInt(cbd: ChooseBoxData[Int]) = of(cbd)
implicit def ofString(cbd: ChooseBoxData[String]) = of(cbd)
}
Now you can do things like
val list: List[ChooseBoxData[_] with AllowableBoxData] = List(ChooseBoxData("foo", "bar"), ChooseBoxData(0, "baz"))
But not val list: List[AllowableBoxData] = List(ChooseBoxData(false, "baz"))
Also, if you were looking to declare a function argument rather than just a variable, there would be a bit more elegant solution:
trait CanUse[T]
implicit case object CanUseInt extends CanUse[Int]
implicit case object CanUseString extends CanUse[String]
def foo[T : CanUse](bar: List[ChooseBoxData[T]])
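As a self-contained sketch of the type-class idea above: the names CanUseDemo, IntOk, StringOk and indices are made up for illustration, and the instances are placed in the CanUse companion (an assumption of this sketch) so that they are found without an import. Note that this constrains a list of a single element type T; it does not by itself give a list mixing ChooseBoxData[Int] and ChooseBoxData[String].

```scala
object CanUseDemo {
  case class ChooseBoxData[T](index: T, text: String)

  // Evidence that T is an allowed index type. Sealed, so no further
  // instances can be added outside this file.
  sealed trait CanUse[T]
  object CanUse {
    implicit case object IntOk extends CanUse[Int]
    implicit case object StringOk extends CanUse[String]
  }

  // Compiles only when an implicit CanUse[T] exists, i.e. T is Int or String.
  def indices[T: CanUse](boxes: List[ChooseBoxData[T]]): List[T] =
    boxes.map(_.index)
}
```

A call such as CanUseDemo.indices(List(CanUseDemo.ChooseBoxData(true, "x"))) should be rejected at compile time because no CanUse[Boolean] instance is available.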
Here's what I came up with:
First, we create the following algebraic data type (ADT):
sealed trait StringInt
case class Stringy(s : String) extends StringInt
case class Inty(s : Int) extends StringInt
And define ChooseBoxData as follows:
case class ChooseBoxData(index : StringInt, text : String)
Then we define the following implicits to convert any Int or String in scope to the ADT defined above:
object CBImplicits {
implicit def conv(u : String) = Stringy(u)
implicit def conv2(u : Int) = Inty(u)
}
Now, we can enforce the requirement in the question. Here is an example:
import CBImplicits._
val list = List(ChooseBoxData("str", "text"),
ChooseBoxData(1, "text"),
ChooseBoxData(true, "text"))
Trying to run the above, the compiler will complain about type mismatch. But this will compile and run:
List(
ChooseBoxData("str", "text"),
ChooseBoxData(1, "text"),
ChooseBoxData(12, "text2"))
which results in:
a: List[ChooseBoxData] =
List(ChooseBoxData(Stringy(str),text), ChooseBoxData(Inty(1),text), ChooseBoxData(Inty(12),text2))
This preserves index type information (wrapped in StringInt supertype of course) which later can be easily extracted using pattern matching for individual elements.
It is also easy to remove the wrapper for all elements, but the index type then becomes Any, which is what we would expect because Any is the least upper bound of String and Int in Scala's class hierarchy.
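The pattern-matching extraction mentioned above could look like the sketch below. It repeats the ADT so that it stands alone; StringIntDemo, describe and rawIndex are illustrative names that do not appear in the answer.

```scala
object StringIntDemo {
  sealed trait StringInt
  case class Stringy(s: String) extends StringInt
  case class Inty(s: Int) extends StringInt

  case class ChooseBoxData(index: StringInt, text: String)

  // The trait is sealed, so the compiler can check that the match is exhaustive.
  def describe(c: ChooseBoxData): String = c.index match {
    case Stringy(s) => s"string index '$s'"
    case Inty(i)    => s"int index $i"
  }

  // Removing the wrapper loses the precise type: the result is Any.
  def rawIndex(c: ChooseBoxData): Any = c.index match {
    case Stringy(s) => s
    case Inty(i)    => i
  }
}
```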
EDIT: A Solution Using Shapeless
import shapeless._
import syntax.typeable._
case class ChooseBoxData[T](index : T, text : String)
val a = ChooseBoxData(1, "txt")
val b = ChooseBoxData("str", "txt")
val c = ChooseBoxData(true, "txt")
val list = List(a, b, c)
val `ChooseBoxData[Int]` = TypeCase[ChooseBoxData[Int]]
val `ChooseBoxData[String]` = TypeCase[ChooseBoxData[String]]
val res = list.map {
case `ChooseBoxData[Int]`(u) => u
case `ChooseBoxData[String]`(u) => u
case _ => None
}
//result
res: List[Product with Serializable] = List(ChooseBoxData(1,txt), ChooseBoxData(str,txt), None)
So it allows compilation, but will replace invalid instances with None (which then can be used to throw a runtime error if desired), or you can directly filter the instances you want using:
list.flatMap(x => x.cast[ChooseBoxData[Int]])
//results in:
List[ChooseBoxData[Int]] = List(ChooseBoxData(1,txt))
You can build extra constraint on top of your case class.
import language.implicitConversions
case class ChooseBoxData[T](index:T, text:String)
trait MySpecialConstraint[T] {
def get: ChooseBoxData[T]
}
implicit def liftWithMySpecialConstraintString(cbd: ChooseBoxData[String]) =
new MySpecialConstraint[String] {
def get = cbd
}
implicit def liftWithMySpecialConstraintInt(cbd: ChooseBoxData[Int]) =
new MySpecialConstraint[Int] {
def get = cbd
}
// Now we can just use this constraint for our list
val l1: List[MySpecialConstraint[_]] = List(ChooseBoxData("A1", "B1"), ChooseBoxData(2, "B2"))
Why can't you do it like this:
object solution extends App {
case class ChooseBoxData[T](index: T, text: String) extends GenericType[T]
trait GenericType[T] {
def getType(index: T, text: String): ChooseBoxData[T] = ChooseBoxData[T](index, text)
}
val specialList = List(
ChooseBoxData[String]("some string", "some string"),
ChooseBoxData[Int](12, "some string"),
ChooseBoxData[Boolean](true, "some string")
)
println(specialList)
}
//output: List(ChooseBoxData(some string,some string), ChooseBoxData(12,some string), ChooseBoxData(true,some string))
I have only seen examples where the result is a Java list of Scala doubles. I got as far as
def getDistance(): java.util.List[java.lang.Double] = {
val javadistance = distance.toList.asJava
javadistance
}
but this is still a Java list containing Scala doubles (distance is a member of the same class as getDistance).
One has to use the java boxed variant in a map:
def getDistance(): java.util.List[java.lang.Double] = {
distance.toList.map(Double.box).asJava
}
As an alternative to Scala's Double.box, you can use java.lang.Double.valueOf:
def getDistance(): java.util.List[java.lang.Double] = {
val javadistance = distance.toList.map(java.lang.Double.valueOf).asJava
javadistance
}
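For reference, here is a standalone version of the conversion that does not depend on the distance member. The mismatch in the question is one of static types: distance.toList.asJava is a java.util.List[Double] with Scala's Double as the element type, which the compiler does not accept where java.util.List[java.lang.Double] is declared. BoxedListDemo and toJavaDoubles are names invented for this sketch, and the import assumes Scala 2.11 or 2.12; on 2.13 and later, scala.jdk.CollectionConverters is the preferred import.

```scala
import scala.collection.JavaConverters._

object BoxedListDemo {
  // Box each element explicitly so the element type is java.lang.Double,
  // then wrap the Scala collection as a java.util.List.
  def toJavaDoubles(xs: Seq[Double]): java.util.List[java.lang.Double] =
    xs.map(Double.box).asJava
}
```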
Assume Scala 2.11. I'm writing a class that will persist a Scala value. It is intended to be used like this:
class ParentClass {
val instanceId: String = "aUniqueId"
val statefulString: Persisted[String] = persisted { "SomeState" }
onEvent {
case NewState(state) => statefulString.update(state)
}
}
Persisted is a class with a type parameter that is meant to persist that specific value like a cache, and Persisted handles all of the logic associated with persistence. However, to simplify the implementation, I'm hoping to retrieve information about its instantiation. For example, if its instance in the parent class is named statefulString, how can I access that name from within the Persisted class itself?
The purpose of doing this is to prevent collisions in automatic naming of persisted values while simplifying the API. I cannot rely on using type, because there could be multiple values of String type.
Thanks for your help!
Edit
This question may be helpful: How can I get the memory location of a object in java?
Edit 2
After reading the source code for ScalaCache, it appears there is a way to do this via WeakTypeTag. Can someone explain what exactly is happening in its macros?
https://github.com/cb372/scalacache/blob/960e6f7aef52239b85fa0a1815a855ab46356ad1/core/src/main/scala/scalacache/memoization/Macros.scala
I was able to do this with the help of Scala macros and reflection, and adapting some code from ScalaCache:
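import scala.concurrent.ExecutionContext
import scala.reflect.macros.blackbox
import scala.util.control.NonFatal
class Macros(val c: blackbox.Context) {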
import c.universe._
def persistImpl[A: c.WeakTypeTag, Repr: c.WeakTypeTag](f: c.Tree)(keyPrefix: c.Expr[ActorIdentifier], scalaCache: c.Expr[ScalaCache[Repr]], flags: c.Expr[Flags], ec: c.Expr[ExecutionContext], codec: c.Expr[Codec[A, Repr]]) = {
commonMacroImpl(keyPrefix, scalaCache, { keyName =>
q"""_root_.persistence.sync.caching($keyName)($f)($scalaCache, $flags, $ec, $codec)"""
})
}
private def commonMacroImpl[A: c.WeakTypeTag, Repr: c.WeakTypeTag](keyPrefix: c.Expr[ActorIdentifier], scalaCache: c.Expr[ScalaCache[Repr]], keyNameToCachingCall: (c.TermName) => c.Tree): Tree = {
val enclosingValSymbol = getValSymbol()
val valNameTree = getValName(enclosingValSymbol)
val keyName = createKeyName()
val scalacacheCall = keyNameToCachingCall(keyName)
val tree = q"""
val $keyName = _root_.persistence.KeyStringConverter.createKeyString($keyPrefix, $valNameTree)
$scalacacheCall
"""
tree
}
/**
* Get the symbol of the val that encloses the macro,
* or abort the compilation if we can't find one.
*/
private def getValSymbol(): c.Symbol = {
def getValSymbolRecursively(sym: Symbol): Symbol = {
if (sym == null || sym == NoSymbol || sym.owner == sym)
c.abort(
c.enclosingPosition,
"This persistence block does not appear to be inside a val. " +
"Memoize blocks must be placed inside vals, so that a cache key can be generated."
)
else if (sym.isTerm)
try {
val termSym = sym.asInstanceOf[TermSymbol]
if(termSym.isVal) termSym
else getValSymbolRecursively(sym.owner)
} catch {
case NonFatal(e) => getValSymbolRecursively(sym.owner)
}
else
getValSymbolRecursively(sym.owner)
}
getValSymbolRecursively(c.internal.enclosingOwner)
}
/**
* Convert the given val symbol to a tree representing the val's name.
*/
private def getValName(methodSymbol: c.Symbol): c.Tree = {
val methodName = methodSymbol.asMethod.name.toString
// return a Tree
q"$methodName"
}
private def createKeyName(): TermName = {
// We must create a fresh name for any vals that we define, to ensure we don't clash with any user-defined terms.
// See https://github.com/cb372/scalacache/issues/13
// (Note that c.freshName("key") does not work as expected.
// It causes quasiquotes to generate crazy code, resulting in a MatchError.)
c.freshName(c.universe.TermName("key"))
}
}
I would like to create a class in Java 8 which is able to recursively create an object which has a method that takes a function parameter based on the parameters I added.
For example, I would like to be able to do this:
new X().param(23).param("some String").param(someObject)
.apply((Integer a) -> (String b) -> (Object c) -> f(a,b,c))
The apply method would then apply the collected parameters to the given function.
I feel this should be possible without reflection while maintaining type-safety, but I can't quite figure out how. A solution in Scala is also welcome, if I can translate it to Java 8. If it's not possible, I'll also accept an answer that explains why.
What I have so far is essentially this:
class ParamCmd<A,X> {
final A param;
public ParamCmd(A param) {
this.param = param;
}
public<B> ParamCmd<B, Function<A,X>> param(B b) {
return new ParamCmd<>(b);
}
public void apply(Function<A,X> f) {
// this part is unclear to me
}
public static void main(String[] args) {
new ParamCmd<Integer,String>(0).param("oops").param(new Object())
// the constructed function parameters are reversed relative to declaration
.apply((Object c) -> (String b) -> (Integer a) ->
"args were " + a + " " + b + " " + c
);
}
}
As noted in the code comments, my problems are keeping the function parameters in the order of the calls of param(), and actually applying the parameters.
For an unlimited number of parameters, the only solution I could think of uses heterogeneous lists in Scala.
It probably isn't feasible in Java, as there is type-level computation going on with path-dependent types.
Using heterogeneous lists and path-dependent types:
import scala.language.higherKinds
object Main extends App {
val builder1 = HCons(23, HCons("Hello", HNil))
val builder2 = HCons(42L, builder1)
val res1:String = builder1.apply(i => s => i + s)
val res2:String = builder2.apply(l => i => s => (i+l) + s)
println(res1) // 23Hello
println(res2) // 65Hello
}
sealed trait HList {
type F[Res]
def apply[Res]: F[Res] => Res
}
case class HCons[Head, HTail <: HList](head: Head, tail: HTail) extends HList {
type F[Res] = Head => (tail.type)#F[Res]
def apply[Res]: F[Res] => Res = f => tail.apply(f(head))
}
case object HNil extends HList {
type F[Res] = Res
def apply[Res]: F[Res] => Res = identity
}
This code prints:
23Hello
65Hello
The second, more limited way of doing this, but which might work with Java, is to create multiple classes for each function length, which returns the next sized function length class wrapping the value, up to some maximal length - See the Applicative Builder in Scalaz: "Scalaz Applicative Builder"
This doesn't answer your question. However, maybe it helps someone to find a solution, or to explain why it isn't possible in Java and/or Scala.
It can be done in C++, with an arbitrary number of parameters, and without losing type-safety. The call site looks as follows. Unfortunately, the lambda syntax in C++ is quite verbose.
bar{}.param(23).param("some String").param(4.2).apply(
[](int i) {
return [=](std::string s) {
return [=](double d) {
std::cout << i << ' ' << s << ' ' << d << '\n';
};
};
});
The definitions of foo and bar follow. The implementation is straightforward. However, I doubt that it is possible to build something like this in Java, because of the way type parameters work in Java. Generics in Java can only be used to avoid type casts, and that's not enough for this use case.
template <typename Param, typename Tail>
struct foo {
Param _param;
Tail _tail;
template <typename P>
auto param(P p) {
return foo<P, foo>{p, *this};
}
template <typename Function>
auto apply(Function function) {
return _tail.apply(function)(_param);
}
};
struct bar {
template <typename P>
auto param(P p) {
return foo<P, bar>{p, *this};
}
template <typename Function>
auto apply(Function function) {
return function;
}
};
Sorry, I can only give some leads in Scala:
Perhaps it would help to have a look at http://www.scala-lang.org/api/2.10.4/index.html#scala.Function$
.apply((Integer a) -> (String b) -> (Object c) -> f(a,b,c))
pretty much looks like Function.uncurried
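A small sketch of what Function.uncurried does with a curried function shaped like the one in the question. UncurriedDemo and the value names are illustrative; the type arguments are spelled out only to make the choice of the three-argument overload explicit.

```scala
object UncurriedDemo {
  // Curried form, analogous to (Integer a) -> (String b) -> (Object c) -> f(a, b, c)
  val curried: Int => String => Double => String =
    a => b => c => s"$a $b $c"

  // Flattened into a single three-argument function.
  val flat: (Int, String, Double) => String =
    Function.uncurried[Int, String, Double, String](curried)
}
```

Applying flat to collected parameters then becomes an ordinary call, for example UncurriedDemo.flat(23, "some String", 4.2).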
param(23).param("some String").param(someObject)
could be implemented using a list as an accumulator if you don't care about type safety. If you want to keep the types, you could use the HList from Shapeless https://github.com/milessabin/shapeless, which comes with a handy tupled method.
Implementation of param():
import shapeless._
import HList._
import syntax.std.traversable._
class Method(val l : HList = HNil) {
def param(p: Any) = new Method( p :: l )
}
Example
scala> val m = new Method().param(1).param("test")
m: Method = Method@1130ad00
scala> m.l
res8: shapeless.HList = test :: 1 :: HNil