Convert Seq[Byte] in Scala to byte[] / InputStream - Java

I have a Seq[Byte] in Scala. How can I convert it to a Java byte[] or InputStream?

Wouldn't
val a: Seq[Byte] = List()
a.toArray
do the job?

You can copy the contents of a Seq with copyToArray.
val myseq: Seq[Byte] = ???
val myarray = new Array[Byte](myseq.size)
myseq.copyToArray(myarray)
Note that this will iterate through the Seq twice, which may be undesirable, impossible, or just fine, depending on your use.
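If the double traversal matters, here is a single-pass sketch (an alternative I'm adding for illustration, using a mutable array builder over the same myseq):
val builder = Array.newBuilder[Byte]
myseq.foreach(builder += _)          // traverse the Seq exactly once
val myarrayOnePass: Array[Byte] = builder.result()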

A sensible option:
val byteSeq: Seq[Byte] = ???
val byteArray: Array[Byte] = byteSeq.toArray
val inputStream = new java.io.ByteArrayInputStream(byteArray)
A less sensible option:
object HelloWorld {
  implicit class ByteSequenceInputStream(val byteSeq: Seq[Byte]) extends java.io.InputStream {
    private var pos = 0
    val size = byteSeq.size
    override def read(): Int = pos match {
      case `size` => -1 // backticks match against the value in the variable
      case _ =>
        val result = byteSeq(pos) & 0xFF // InputStream.read() must return 0-255, so mask the (possibly negative) byte
        pos = pos + 1
        result
    }
  }

  val testByteSeq: Seq[Byte] = List(1, 2, 3, 4, 5).map(_.toByte)

  def testConversion(in: java.io.InputStream): Unit = {
    var done = false
    while (!done) {
      val result = in.read()
      println(result)
      done = result == -1
    }
  }

  def main(args: Array[String]): Unit = {
    testConversion(testByteSeq)
  }
}

Related

Akka Framing by Size

How can I frame a Flow<ByteString, ByteString, NotUsed> by size? All the examples I have found assume that there is some delimiter, which is not my case; I just need to frame by length/size.
Framing via Framing.delimiter does require a designated delimiter, and there doesn't seem to be any built-in stream operator that does framing simply by a fixed chunk size. One of the challenges in coming up with a custom framing/chunking solution is to properly handle the last chunk of elements.
One solution would be to assemble a custom GraphStage like the "chunking" example illustrated in the Akka Streams cookbook:
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import akka.stream.{Attributes, Inlet, Outlet, FlowShape}
import akka.util.ByteString

class Chunking(val chunkSize: Int) extends GraphStage[FlowShape[ByteString, ByteString]] {
  val in = Inlet[ByteString]("Chunking.in")
  val out = Outlet[ByteString]("Chunking.out")
  override val shape = FlowShape.of(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
    private var buffer = ByteString.empty

    setHandler(in, new InHandler {
      override def onPush(): Unit = {
        val elem = grab(in)
        buffer ++= elem
        emitChunk()
      }

      override def onUpstreamFinish(): Unit = {
        if (buffer.isEmpty)
          completeStage()
        else {
          if (isAvailable(out)) emitChunk()
        }
      }
    })

    setHandler(out, new OutHandler {
      override def onPull(): Unit = {
        if (isClosed(in)) emitChunk()
        else pull(in)
      }
    })

    private def emitChunk(): Unit = {
      if (buffer.isEmpty) {
        if (isClosed(in)) completeStage() else pull(in)
      } else {
        val (chunk, nextBuffer) = buffer.splitAt(chunkSize)
        buffer = nextBuffer
        push(out, chunk)
      }
    }
  }
}
Note that emitChunk() handles the fixed-size chunking and onUpstreamFinish() is necessary for processing the last chunk of elements in the internal buffer.
Test-running with a sample text file "/path/to/file" which has the content below:
Millions of people worldwide are in for a disastrous future of hunger, drought and disease, according to a draft report from the United Nations' Intergovernmental Panel on Climate Change, which was leaked to the media this week.
import akka.actor.ActorSystem
import akka.stream.scaladsl._
import java.nio.file.Paths
implicit val system = ActorSystem("system")
implicit val executionContext = system.dispatcher
val chunkSize = 32
FileIO.fromPath(Paths.get("/path/to/file"))
  .via(new Chunking(chunkSize))
  .map(_.utf8String)
  .runWith(Sink.seq)
// res1: scala.concurrent.Future[Seq[String]] = Future(Success(Vector(
// "Millions of people worldwide are",
// " in for a disastrous future of h",
// "unger, drought and disease, acco",
// "rding to a draft report from the",
// " United Nations' Intergovernment",
// "al Panel on Climate Change, whic",
// "h was leaked to the media this w",
// "eek."
// )))
Something like this (in Scala; disclaimer: only mentally compiled), using statefulMapConcat, which allows:
- emitting zero or more frames per input element
- maintaining state from element to element of what's yet to be emitted
val frameSize: Int = ???
require(frameSize > 0, "frame size must be positive")

Flow[ByteString].statefulMapConcat { () =>
  var carry: ByteString = ByteString.empty

  { in =>
    val len = carry.length + in.length
    if (len < frameSize) {
      // append to carry and emit nothing
      carry = carry ++ in
      Nil
    } else if (len == frameSize) {
      if (carry.nonEmpty) {
        val frame = carry ++ in // build the frame before resetting the carry
        carry = ByteString.empty
        List(frame)
      } else List(in)
    } else {
      if (carry.isEmpty) {
        val frames = len / frameSize
        val (emit, suffix) = in.splitAt(frames * frameSize)
        carry = suffix
        emit.grouped(frameSize).toList // statefulMapConcat expects an immutable.Iterable
      } else {
        val (appendToCarry, inn) = in.splitAt(frameSize - carry.length)
        val first = carry ++ appendToCarry
        val frames = inn.length / frameSize
        if (frames > 0) {
          val (emit, suffix) = inn.splitAt(frames * frameSize)
          carry = suffix
          first :: emit.grouped(frameSize).toList
        } else {
          carry = inn
          List(first)
        }
      }
    }
  }
}
If in Java, note that carry ++ in can be expressed as carry.concat(in). It may be useful, in order to get around the restriction in Java around closing over non-final variables, to use a 1-element ByteString[] (e.g. ByteString[] carry = { ByteString.empty }).
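A quick usage sketch (assuming the flow above is bound to a value, here called frameBySize, and an ActorSystem is in scope; these bindings are illustrative only):
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import akka.util.ByteString

implicit val system: ActorSystem = ActorSystem("framing")

// assumed binding of the statefulMapConcat flow shown above
val frameBySize: Flow[ByteString, ByteString, NotUsed] = ???

Source(List(ByteString("Millions of "), ByteString("people world"), ByteString("wide")))
  .via(frameBySize)
  .map(_.utf8String)
  .runWith(Sink.seq)
  .foreach(println)(system.dispatcher)
// with frameSize = 8, frames of exactly 8 bytes are emitted; any trailing
// bytes smaller than a frame remain in the internal carry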

How can I pack a BigInt treating four characters as an unsigned long in network byte order as Ruby's .pack("N") does, in Scala?

Here is the Ruby equivalent:
[Digest::MD5.hexdigest("Data to pack").to_i(16)].pack("N")
Output: "\x1AP0\\"
The generated big integer (the MD5 hash) is:
321255238386231367014342192054081171548
In Scala I got the BigInt as follows:
import java.nio.charset.StandardCharsets

def md5Hash(input: String): String = md5HashArr(input.getBytes(StandardCharsets.UTF_8))

def md5HashArr(bytes: Array[Byte]): String = {
  val digest = java.security.MessageDigest.getInstance("MD5")
  digest.reset()
  digest.update(bytes)
  digest.digest().map(0xFF & _).map {
    "%02x".format(_)
  }.foldLeft("") {
    _ + _
  }
}

def hex2dec(hex: String): BigInt = {
  hex.toLowerCase().toList.map(
    "0123456789abcdef".indexOf(_)).map(
    BigInt(_)).reduceLeft(_ * 16 + _)
}

def bytes2hexStr(bytes: Array[Byte], sep: String = ""): String = bytes.map("0x%02x".format(_)).mkString(sep)

val result: BigInt = hex2dec(md5Hash("Data to pack"))
val resultAsByteArray: Array[Byte] = result.toByteArray
I then converted it to a hex string as follows:
val hexStr:String = bytes2hexStr(resultAsByteArray)
How can I do the same in Scala?
Thanks for the help.
It seems that the pack method, for a Bignum in Ruby, packs only the last 32 bits (4 bytes). So, for this hash f1af822290cfcac4ce476a691a50305c,
[Digest::MD5.hexdigest("Data to pack").to_i(16)]
.pack("N")
.each_byte.map { |b| sprintf("0x%02X ",b) }.join
shows 0x1A 0x50 0x30 0x5C, which are the last 4 bytes.
A Scala implementation with the same behaviour:
import java.security.MessageDigest

object MD5 {
  def main(args: Array[String]) = {
    val str = "Data to pack"
    val md5Bytes = calcMd5(str)
    val md5Str = md5Bytes.map(b => "%02x".format(b)).mkString
    println(s"$str => $md5Str")

    // Get the last 4 bytes
    val byteArr = md5Bytes.slice(12, 16)

    // Print them
    print(s"Last 4 bytes => ")
    byteArr.foreach { r =>
      print("%02X ".format(r))
    }
    println
  }

  def calcMd5(str: String): Array[Byte] = {
    MessageDigest.getInstance("MD5")
      .digest(str.getBytes)
  }
}
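Applied to the BigInt from the question, the same "take the low 32 bits" idea looks like this (a sketch reusing the asker's hex2dec and md5Hash helpers; the padding handles the case where the two's-complement representation is shorter than 4 bytes):
// Sketch: pack the low 32 bits of the BigInt, big-endian, like Ruby's pack("N").
val result: BigInt = hex2dec(md5Hash("Data to pack"))
val packedN: Array[Byte] = {
  val bytes = result.toByteArray                   // big-endian two's-complement
  val last4 = bytes.takeRight(4)
  Array.fill[Byte](4 - last4.length)(0) ++ last4   // left-pad if fewer than 4 bytes
}
println(packedN.map("0x%02X ".format(_)).mkString) // 0x1A 0x50 0x30 0x5C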

Scala error: NoClassDefFoundError, UnsatisfiedLinkError

Good morning. I am creating a Spark application with Scala (it runs with seven executors).
A shared library (libfreebayes.so) must be run in a distributed node environment. libfreebayes.so runs an external program written in C++ called freebayes. However, the following error occurs:
java.lang.UnsatisfiedLinkError: Native Library /usr/lib/libfreebayes.so already loaded in another classloader
The CreateFreebayesInput method must run on a partition-by-partition basis. Is there a problem with loading libfreebayes.so for each partition? The application works properly in Spark local mode. How do I get it to work in yarn-cluster mode? I cannot sleep because of this problem. Help me. :-<
import java.io.{File, FileReader, FileWriter, PrintWriter}
import org.apache.spark.rdd.RDD
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object sparkFreebayes {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("sparkFreebayes")
    val sc = new SparkContext(conf)
    val appId = sc.applicationId
    val appName = sc.appName
    val referencePath = "/mnt/data/hg38.fa"
    val input = sc.textFile(args(0))
    val executerNum = args(1).toInt
    val outputDir = args(2)
    val inputDir = "/mnt/partitionedSam/"
    val header = input.filter(x => x.startsWith("#"))
    val body = input.filter(x => !x.startsWith("#"))
    val partitioned = body.map { x => (x.split("\t")(2), x) }.repartitionAndSortWithinPartitions(new tmpPartitioner(executerNum)).persist()
    val cHeader = header.collect.mkString("\n")
    val sorted = partitioned.map(x => (x._2))

    CreateFreebayesInput(sorted)

    def CreateFreebayesInput(sortedRDD: RDD[String]) = {
      sortedRDD.mapPartitionsWithIndex { (idx, iter) =>
        val tmp = iter.toList
        val outputPath = outputDir + "/" + appId + "_Out_" + idx + ".vcf"
        val tmp2 = List(cHeader) ++ tmp
        val samString = tmp2.mkString("\n")
        val jni = new FreeBayesJni
        val file = new File(inputDir + "partitioned_" + idx + ".sam")
        val fw = new FileWriter(file)
        fw.write(samString)
        fw.close()
        if (file.exists() || file.length() != 0) {
          System.loadLibrary("freebayes")
          val freebayesParameter = Array("-f", "/mnt/data/hg38.fa", file.getPath, "-file", outputPath)
          jni.freebayes_native(freebayesParameter.length, freebayesParameter)
          //runFreebayes(file.getPath, referencePath, outputPath )
        }
        tmp2.productIterator
      }.collect()
    }
  }
}
The FreeBayesJni class is next:
class FreeBayesJni {
  @native def freebayes_native(argc: Int, args: Array[String]): Int
}
My spark-submit command:
spark-submit --class partitioning --master yarn-cluster ScalaMvnProject.jar FullOutput_sorted.sam 7 /mnt/OutVcf
thank you.
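One direction worth checking (a sketch only, not a confirmed fix): System.loadLibrary binds the .so to the classloader of the calling class, and the same native library cannot be loaded by two different classloaders in one JVM. Loading it once per classloader, outside the per-partition code, at least removes the repeated load attempts inside mapPartitionsWithIndex; the object and member names below are illustrative.
// Illustrative sketch: guard the native-library load so it runs at most once per
// classloader, instead of once per partition inside mapPartitionsWithIndex.
object FreeBayesLoader {
  // the body of a lazy val runs at most once; later accesses are no-ops
  lazy val ensureLoaded: Unit = System.loadLibrary("freebayes")
}

// inside mapPartitionsWithIndex, before calling the native method:
//   FreeBayesLoader.ensureLoaded
//   jni.freebayes_native(freebayesParameter.length, freebayesParameter)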

Scala/Java reflective programming: invoke constructor by typecasted arguments

In the library json4s, I intend to write a weakly typed deserializer for some malformed data (mostly the result of XML -> JSON conversions).
I want the dynamic program to get the type information of a given constructor (easy, e.g. 'Int'), apply it to a parsed string (e.g. "12.51"), automatically convert the string into that type (in this case 12.51 should be typecast to 13), and then call the constructor.
I come up with the following implementation:
import org.json4s.JsonAST.{JDecimal, JDouble, JInt, JString}
import org.json4s._

import scala.reflect.ClassTag

object WeakNumDeserializer extends Serializer[Any] {

  def cast[T](cc: Class[T], v: Any): Option[T] = {
    implicit val ctg: ClassTag[T] = ClassTag(cc)
    try {
      Some(v.asInstanceOf[T])
    } catch {
      case e: Throwable =>
        None
    }
  }

  override def deserialize(implicit format: Formats): PartialFunction[(TypeInfo, JValue), Any] = Function.unlift {
    tuple: (TypeInfo, JValue) =>
      tuple match {
        case (TypeInfo(cc, _), JInt(v)) =>
          cast(cc, v)
        case (TypeInfo(cc, _), JDouble(v)) =>
          cast(cc, v)
        case (TypeInfo(cc, _), JDecimal(v)) =>
          cast(cc, v)
        case (TypeInfo(cc, _), JString(v)) =>
          cast(cc, v.toDouble)
        case _ =>
          None
      }
  }
}
However, executing the above code on a real Double => Int case always yields an IllegalArgumentException. Debugging reveals that the line:
v.asInstanceOf[T]
does not convert the Double to an Int in memory; it remains a Double after type erasure, and when it is later used in reflection to call the constructor, it triggers the error.
How do I bypass this and make the reflective function figure this out?
Is there a way to tell the Java compiler to actually convert it into an Int type?
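To illustrate the difference (a minimal sketch, not from the original post): asInstanceOf only reinterprets the boxed reference under erasure, while what the deserializer needs is an actual numeric conversion such as toInt:
// Sketch: cast vs. numeric conversion.
val v: Any = 12.51

// A conversion produces a new Int value:
val converted: Int = v match {
  case d: Double => d.toInt // 12
  case i: Int    => i
}

// A cast does not convert: under erasure the value is still a boxed java.lang.Double,
// so it fails later, e.g. when handed to a reflectively invoked constructor expecting an Int.
// val broken: Int = v.asInstanceOf[Int] // ClassCastException at runtime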
UPDATE: to help validate your answer, I've posted my test cases:
case class StrStr(
  a: String,
  b: String
)

case class StrInt(
  a: String,
  b: Int
)

case class StrDbl(
  a: String,
  b: Double
)

case class StrIntArray(
  a: String,
  b: Array[Int]
)

case class StrIntSeq(
  a: String,
  b: Seq[Int]
)

case class StrIntSet(
  a: String,
  b: Set[Int]
)
class WeakSerializerSuite extends FunSuite with TestMixin {

  implicit val formats = DefaultFormats ++ Seq(StringToNumberDeserializer, ElementToArrayDeserializer)

  import org.json4s.Extraction._

  test("int to String") {
    val d1 = StrInt("a", 12)
    val json = decompose(d1)
    val d2 = extract[StrStr](json)
    d2.toString.shouldBe("StrStr(a,12)")
  }

  test("string to int") {
    val d1 = StrStr("a", "12")
    val json = decompose(d1)
    val d2 = extract[StrInt](json)
    d2.toString.shouldBe("StrInt(a,12)")
  }

  test("double to int") {
    val d1 = StrDbl("a", 12.51)
    val json = decompose(d1)
    val d2 = extract[StrInt](json)
    d2.toString.shouldBe("StrInt(a,12)")
  }

  test("int to int array") {
    val d1 = StrInt("a", 12)
    val json = decompose(d1)
    val d2 = extract[StrIntArray](json)
    d2.copy(b = null).toString.shouldBe("StrIntArray(a,null)")
  }

  test("int to int seq") {
    val d1 = StrInt("a", 12)
    val json = decompose(d1)
    val d2 = extract[StrIntSeq](json)
    d2.toString.shouldBe("StrIntSeq(a,List(12))")
  }

  test("int to int set") {
    val d1 = StrInt("a", 12)
    val json = decompose(d1)
    val d2 = extract[StrIntSet](json)
    d2.toString.shouldBe("StrIntSet(a,Set(12))")
  }

  test("string to int array") {
    val d1 = StrStr("a", "12")
    val json = decompose(d1)
    val d2 = extract[StrIntArray](json)
    d2.copy(b = null).toString.shouldBe("StrIntArray(a,null)")
  }

  test("string to int seq") {
    val d1 = StrStr("a", "12")
    val json = decompose(d1)
    val d2 = extract[StrIntSeq](json)
    d2.toString.shouldBe("StrIntSeq(a,List(12))")
  }

  test("string to int set") {
    val d1 = StrStr("a", "12")
    val json = decompose(d1)
    val d2 = extract[StrIntSet](json)
    d2.toString.shouldBe("StrIntSet(a,Set(12))")
  }
}
I've found a first solution. TL;DR: it's totally absurd and illogical, and absolutely full of boilerplate for an established strongly typed language. Please post your answer if you deem it any better:
abstract class WeakDeserializer[T: Manifest] extends Serializer[T] {
  // final val tpe = implicitly[Manifest[T]]
  // final val clazz = tpe.runtimeClass

  // cannot serialize
  override def serialize(implicit format: Formats): PartialFunction[Any, JValue] = PartialFunction.empty
}

object StringToNumberDeserializer extends WeakDeserializer[Any] {
  override def deserialize(implicit format: Formats): PartialFunction[(TypeInfo, JValue), Any] = {
    case (TypeInfo(cc, _), JString(v)) =>
      cc match {
        case java.lang.Byte.TYPE      => v.toByte
        case java.lang.Short.TYPE     => v.toShort
        case java.lang.Character.TYPE => v.toInt.toChar
        case java.lang.Integer.TYPE   => v.toInt
        case java.lang.Long.TYPE      => v.toLong
        case java.lang.Float.TYPE     => v.toFloat
        case java.lang.Double.TYPE    => v.toDouble
        case java.lang.Boolean.TYPE   => v.toBoolean
        //TODO: add boxed types
      }
  }
}

object ElementToArrayDeserializer extends WeakDeserializer[Any] {

  val listClass = classOf[List[_]]
  val seqClass = classOf[Seq[_]]
  val setClass = classOf[Set[_]]
  val arrayListClass = classOf[java.util.ArrayList[_]]

  override def deserialize(implicit format: Formats): PartialFunction[(TypeInfo, JValue), Any] = {
    case (ti @ TypeInfo(this.listClass, _), jv) =>
      List(extractInner(ti, jv, format))
    case (ti @ TypeInfo(this.seqClass, _), jv) =>
      Seq(extractInner(ti, jv, format))
    case (ti @ TypeInfo(this.setClass, _), jv) =>
      Set(extractInner(ti, jv, format))
    case (ti @ TypeInfo(this.arrayListClass, _), jv) =>
      import scala.collection.JavaConverters._
      new java.util.ArrayList[Any](List(extractInner(ti, jv, format)).asJava)
    case (ti @ TypeInfo(cc, _), jv) if cc.isArray =>
      val a = Array(extractInner(ti, jv, format))
      mkTypedArray(a, firstTypeArg(ti))
  }

  def mkTypedArray(a: Array[_], typeArg: ScalaType) = {
    import java.lang.reflect.Array.{newInstance => newArray}

    a.foldLeft((newArray(typeArg.erasure, a.length), 0)) { (tuple, e) => {
      java.lang.reflect.Array.set(tuple._1, tuple._2, e)
      (tuple._1, tuple._2 + 1)
    }}._1
  }

  def extractInner(ti: TypeInfo, jv: JValue, format: Formats): Any = {
    val result = extract(jv, firstTypeArg(ti))(format)
    result
  }

  def firstTypeArg(ti: TypeInfo): ScalaType = {
    val tpe = ScalaType.apply(ti)
    val firstTypeArg = tpe.typeArgs.head
    firstTypeArg
  }
}

Scala how to inject mock object to ScalatraFlatSpec

I have been stuck on a unit test in Scala for many days. I cannot inject a mock object into the unit test. The ScalatraFlatSpec calls the actual database, not my mock, and I have no idea what to do.
This is my API
class Dashboard extends Servlet {
  get("/:brand_code") {
    val start = System.currentTimeMillis
    val brandCode = params.get("brand_code").get
    var brandId = 0
    val sqlFind = "SELECT DISTINCT(id) FROM brands WHERE brand_code=?"
    val found: List[Map[String, Any]] = ConnectionModel.getExecuteQuery(sqlFind, List(brandCode))
    if (found.isEmpty) {
      halt(404, send("error", s"brand_code [$brandCode] not found."))
    } else {
      brandId = found(0).getOrElse("id", 0).toString.toInt
      send("Yeah55", brandId)
    }
  }
}
And this is the Servlet
abstract class Servlet extends ScalatraServlet with CorsSupport with JacksonJsonSupport {

  protected implicit lazy val jsonFormats: Formats = DefaultFormats.withBigDecimal

  protected override def transformResponseBody(body: JValue): JValue = body.underscoreKeys

  protected lazy val body = parsedBody.extract[Map[String, Any]]

  protected def send(message: String, data: Any = None) = Map("message" -> message, "data" -> data)

  options("/*") {
    response.setHeader(
      "Access-Control-Allow-Headers", request.getHeader("Access-Control-Request-Headers")
    )
  }

  before() {
    contentType = formats("json")
  }
}
And this is ConnectionModel and ConnectionModelAble
trait ConnectionModelAble {
  def getExecuteQuery(sql: String, parameters: List[Any]): List[Map[String, Any]]
}

object ConnectionModel extends ConnectionModelAble {
  var connection: Connection = {
    val url = "jdbc:mysql://localhost:3306/db"
    val username = "root"
    val password = ""
    Class.forName("com.mysql.jdbc.Driver")
    DriverManager.getConnection(url, username, password)
  }

  def getExecuteQuery(sql: String, parameters: List[Any]): List[Map[String, Any]] = {
    try {
      val statement = connection.createStatement()
      var preparedStatement: PreparedStatement = connection.prepareStatement(sql)
      var formatDate: DateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss")
      // Bind the parameters according to their runtime type
      for (i <- 0 until parameters.size) {
        parameters(i) match {
          case _: Int => preparedStatement.setInt(i + 1, parameters(i).toString.toInt)
          case _: Double => preparedStatement.setDouble(i + 1, parameters(i).toString.toDouble)
          case _: Date => preparedStatement.setDate(i + 1, new java.sql.Date(formatDate.parse(parameters(i).toString).getTime))
          case default => preparedStatement.setString(i + 1, parameters(i).toString)
        }
      }
      val resultSet = preparedStatement.executeQuery()
      val metaData: ResultSetMetaData = resultSet.getMetaData()
      val columnCount = metaData.getColumnCount()
      var ret: List[Map[String, Any]] = List()
      while (resultSet.next()) {
        var row: Map[String, Any] = Map[String, Any]()
        for (i <- 1 to columnCount) {
          val columnName = metaData.getColumnName(i)
          var obj = resultSet.getObject(i)
          row += columnName -> obj
        }
        ret = ret :+ row
      }
      ret
    } catch {
      case e: Exception =>
        e.printStackTrace()
        List()
    }
  }
}
And this is my unit test
class DashboardSpec extends ScalatraFlatSpec with MockitoSugar {
  addServlet(new Dashboard, "/v1/dashboard/*")

  it should "return get dashboard correctly" in {
    val brandCode = "APAAA"
    val brandId = 157
    get("/v1/dashboard/APAAA") {
      val connectModel = mock[ConnectionModelAble]
      val sqlFind = "SELECT DISTINCT(id) FROM brands WHERE brand_code=?"
      Mockito.when(connectModel.getExecuteQuery(sqlFind, List(brandCode))).thenReturn(
        List(Map("id" -> 150))
      )
      assert(status == 200)
      println(connectModel.getExecuteQuery(sqlFind, List(brandCode)))
      println(body)
    }
  }
}
I found that the body from the unit test is not from my mock data; it comes from the real database. What should I do?
Thank you.
You aren't injecting your mock into the Dashboard, so the Connection you're seeing in getExecuteQuery is the one provided by ConnectionModel.connection. You probably want to use a dependency injection framework or something like the Cake pattern to make sure your Dashboard is referring to your mock instance.
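For example, a minimal sketch of constructor injection (illustrative only; the real Dashboard and spec would need the corresponding changes): pass the ConnectionModelAble into Dashboard, defaulting to the real ConnectionModel, and mount a Dashboard built with the mock in the test:
// Sketch only: constructor-inject the data access dependency so a test can swap in a mock.
class Dashboard(connectionModel: ConnectionModelAble = ConnectionModel) extends Servlet {
  get("/:brand_code") {
    val brandCode = params.get("brand_code").get
    val sqlFind = "SELECT DISTINCT(id) FROM brands WHERE brand_code=?"
    val found = connectionModel.getExecuteQuery(sqlFind, List(brandCode)) // uses the injected instance
    if (found.isEmpty) halt(404, send("error", s"brand_code [$brandCode] not found."))
    else send("Yeah55", found(0).getOrElse("id", 0).toString.toInt)
  }
}

class DashboardSpec extends ScalatraFlatSpec with MockitoSugar {
  val connectModel = mock[ConnectionModelAble]
  Mockito.when(connectModel.getExecuteQuery(
    "SELECT DISTINCT(id) FROM brands WHERE brand_code=?", List("APAAA")
  )).thenReturn(List(Map("id" -> 150)))

  // mount the servlet that holds the mock, then exercise it over HTTP
  addServlet(new Dashboard(connectModel), "/v1/dashboard/*")

  it should "return get dashboard correctly" in {
    get("/v1/dashboard/APAAA") {
      assert(status == 200)
      println(body) // should now reflect the mocked id 150
    }
  }
}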
