How can I update the element at index i of a DenseVector object?
Is it possible? Well, it is:
scala> val vec = Vectors.dense(1, 2, 3)
vec: org.apache.spark.mllib.linalg.Vector = [1.0,2.0,3.0]
scala> vec.toArray(0) = 3.0
scala> vec
res28: org.apache.spark.mllib.linalg.Vector = [3.0,2.0,3.0]
I doubt this is intended behavior, though. It works only because DenseVector.toArray exposes the underlying array rather than a copy. Since Vectors don't implement an update method, they are clearly designed as immutable data structures.
In Java I have been using Gson to parse JSON such as [[1.2, 4.1], [3.4, 4.4]] into a primitive two-dimensional array, double[][].
The code looks like this (and works fine):
String json = "[[1.2, 4.1], [3.4, 4.4]]";
double[][] variable = new Gson().fromJson(json, double[][].class);
Is there a way to get double[][].class in Kotlin?
Is there a Kotlin equivalent of the declaration double[][] variable;?
Edit:
My goal is to achieve the same behavior with Gson in Kotlin. I have thousands of double arrays to parse.
I would like to do something like this in Kotlin:
val json = "[[1.1, 1.2], [2.1, 2.2, 2.3], [3.1, 3.2]]"
val variable:Double[][] = Gson().fromJson(json, Double[][]::class.java)
Answer to the Gson problem
For the class type in your use case, use Array<DoubleArray>::class.java.
Some additional Words on Multidimensional Arrays
Simply nest arrayOf inside another arrayOf, or nest doubleArrayOf inside it (less boxing overhead), to get something like Array<DoubleArray>:
val doubles : Array<DoubleArray> = arrayOf(doubleArrayOf(1.2), doubleArrayOf(2.3))
It's also possible to nest multiple Array initializers with the following constructor:
public inline constructor(size: Int, init: (Int) -> T)
A call can look like this:
val doubles2: Array<DoubleArray> = Array(2) { i ->
    DoubleArray(2) { j ->
        j + 1 * (i + 1).toDouble()
    }
}
//[[1.0, 2.0], [2.0, 3.0]]
In the future, you can try using the Kotlin converter. I took your code and ran it through the converter and got the following working code which agrees with the answer given.
internal var json = "[[1.2, 4.1], [3.4, 4.4]]"
internal var variable = Gson().fromJson(json, Array<DoubleArray>::class.java)
You can mix arrayOf and doubleArrayOf for that case.
arrayOf(
    doubleArrayOf(1.2, 4.1),
    doubleArrayOf(3.4, 4.4)
)
I am new to Scala and Spark. I have the case class A below:
case class A(uniqueId : String,
attributes: HashMap[String, List[String]])
Now I have a DataFrame of type A. I need to call a Java function on each row of that DataFrame, so I need to convert the HashMap to a Java HashMap and the List to a Java List.
How can I do that?
I am trying the following:
val rddCaseClass = RDD[A]
val a = rddCaseClass.toDF().map(x => {
  val rowData = x.getAs[java.util.HashMap[String, java.util.List[String]]]("attributes")
  callJavaMethod(rowData)
})
But this gives me the following error:
java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to java.util.List
Please help.
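You can convert a Scala WrappedArray to a Java List using scala.collection.JavaConversions:
import scala.collection.JavaConversions
import scala.collection.mutable.WrappedArray
val wrappedArray: WrappedArray[String] = WrappedArray.make(Array("Java", "Scala"))
val javaList = JavaConversions.mutableSeqAsJavaList(wrappedArray)
JavaConversions.asJavaList can also be used, but it is deprecated; use mutableSeqAsJavaList instead. (Note that JavaConversions as a whole is deprecated from Scala 2.12 in favor of JavaConverters.)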
I think you could use Seq instead of List for your parameters. That way it should work with most Seq implementations, with no need to convert sequences such as WrappedArray.
val rddCaseClass = RDD[A]
val a = rddCaseClass.toDF().map(x => {
  val rowData = x.getAs[java.util.HashMap[String, Seq[String]]]("attributes")
  callJavaMethod(rowData)
})
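If the Java method really does need a java.util.HashMap of java.util.List values, the conversion can also be done explicitly once the attributes are available as a Scala map of sequences. The following is only a minimal sketch without any Spark dependency; the object name AttributesToJava and the helper toJavaAttributes are hypothetical, and it assumes scala.collection.JavaConverters is available (Scala 2.8 to 2.13).

```scala
import java.util.{ArrayList => JArrayList, HashMap => JHashMap, List => JList}
import scala.collection.JavaConverters._

// Hypothetical helper, not part of Spark or of the original question.
object AttributesToJava {
  // Copies a Scala map of sequences into a java.util.HashMap of java.util.List values.
  def toJavaAttributes(attrs: scala.collection.Map[String, Seq[String]]): JHashMap[String, JList[String]] = {
    val out = new JHashMap[String, JList[String]]()
    attrs.foreach { case (key, values) =>
      // asJava only wraps the Seq; copying into an ArrayList gives a plain, mutable Java list.
      out.put(key, new JArrayList[String](values.asJava))
    }
    out
  }
}
```

The result could then be handed to something like callJavaMethod, assuming that method accepts a java.util.HashMap[String, java.util.List[String]].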
I have a set of items with different equality and sorting semantics. E.g.
class Item(
val uid: String, // equality
val score: Int // sorting
)
What I need is to have items in some collection sorted all the time by score.
A bonus would be quick lookup/membership checks by equality (as in a hash or tree).
Equal items can have different score, so I can not prefix equality with a score equality (i.e. use a kind of tree/hash map).
Any ideas on combinations of scala or java std collections to achieve this with minimum coding? :)
I would probably use a SortedSet, since it is always kept sorted. As Woot4Moo pointed out, you can create your own Comparable (although I would suggest using Scala's Ordering). If you pass that ordering as an argument to the SortedSet, the set will keep everything sorted for you.
NB: It's the implicit argument you will want so it might look something like this:
val ordering = Ordering[...]
val set = SortedSet(1, 2, 3, ... n)(ordering)
Note the last parameter given as the ordering
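As a concrete illustration of passing the ordering explicitly, here is a minimal sketch. The Item case class and the names SortedByScore and byScoreThenUid are hypothetical stand-ins for the question's class. One caveat worth keeping in mind: a SortedSet treats two elements that compare as equal under the ordering as duplicates, so this sketch breaks ties on uid; ordering by score alone would silently drop items that share a score.

```scala
import scala.collection.immutable.SortedSet

object SortedByScore {
  // Hypothetical stand-in for the Item class in the question.
  final case class Item(uid: String, score: Int)

  // Order by score first, then by uid so that equal scores do not collapse into one element.
  val byScoreThenUid: Ordering[Item] = Ordering.by((i: Item) => (i.score, i.uid))

  // The ordering is passed in the second (implicit) parameter list.
  val items: SortedSet[Item] =
    SortedSet(Item("a", 50), Item("b", 20), Item("c", 100))(byScoreThenUid)
}
```

Iterating over SortedByScore.items yields the items in ascending score order. Note that this ordering does not give uniqueness by uid alone; two items with the same uid but different scores are both kept.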
One possibility is to build your own Set of Item, wrapping both a SortedMap[Int, Set[Item]] (for ordering) and a HashSet[Item] (for access performance):
class MyOrderedSet(items: Set[Item], byPrice: collection.SortedMap[Int, Set[Item]]) extends Set[Item] {
  def contains(key: Item) = items contains key
  def iterator = byPrice map {_._2.iterator} reduceOption {_ ++ _} getOrElse Iterator.empty
  def +(elem: Item) =
    new MyOrderedSet(items + elem, byPrice + (elem.score -> (byPrice.getOrElse(elem.score, Set.empty) + elem)))
  def -(elem: Item) =
    new MyOrderedSet(items - elem, byPrice + (elem.score -> (byPrice.getOrElse(elem.score, Set.empty) - elem)))
  // override any other methods for your convenience
}
object MyOrderedSet {
  def empty = new MyOrderedSet(Set.empty, collection.SortedMap.empty)
  // add any other factory method
}
Modifying the set is painful because you have to keep two collections in sync, but all the features you want are there (at least I hope so).
A quick example:
scala> MyOrderedSet.empty + Item("a", 50) + Item("b", 20) + Item("c", 100)
res44: MyOrderedSet = Set(Item(b,20), Item(a,50), Item(c,100))
There is also a little drawback, which is actually not related to the proposed structure: You can check if an item is in the set, but you cannot get its value:
scala> res44 contains Item("a", 100)
res45: Boolean = true
Nothing in the API allows you to get Item("a", 50) as a result. If you want to do so, I suggest using a Map[String, Item] instead of a Set[Item] for items (and, of course, changing the code accordingly).
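The Map[String, Item] idea can be sketched on its own, independently of MyOrderedSet. In this minimal sketch the names ItemIndex, index and sortedByScore are hypothetical, and Item is a plain case class keyed by a uid field rather than the class used in this answer. Sorting is done on demand here, which is simpler but not equivalent to keeping the collection sorted at all times.

```scala
object ItemIndex {
  // Hypothetical stand-in for the Item class; equality here is the default case-class equality.
  final case class Item(uid: String, score: Int)

  // Index items by uid; a later item with the same uid replaces an earlier one.
  def index(items: Seq[Item]): Map[String, Item] =
    items.map(i => i.uid -> i).toMap

  // Items in ascending score order, computed on demand from the index.
  def sortedByScore(idx: Map[String, Item]): Seq[Item] =
    idx.values.toSeq.sortBy(_.score)
}
```

With such an index, idx.get("a") returns the stored item (including its current score) rather than just a Boolean.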
EDIT: For the curious, here is the quickly written version of Item I use:
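case class Item(id: String, score: Int) {
  // Equality is based on id only; score is ignored.
  override def equals(y: Any) =
    y != null && {
      PartialFunction.cond(y) {
        case Item(`id`, _) => true
      }
    }
  // Keep hashCode consistent with equals so that hash-based collections work.
  override def hashCode = id.hashCode
}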
Here's my case:
I created a table with DefaultTableModel
So when I use getDataVector I get a two-dimensional java.util.Vector.
When I use toSeq or any other converter I get something like
Buffer([5.0, 1.0, 50.0], [10.0, 1.5, 40.0], [2.0, 1.5, 90.0], [1.0, 1.0, 100.0], [6.0, 3.0, 100.0], [16.0, 3.5, 50.0])
The inner objects are returned as java.lang.Object (AnyRef in scala), and not as arrays
How can I convert them or access their contents?
Here is the code to test
import collection.mutable.{Buffer, ArrayBuffer}
import javax.swing.table._
import scala.collection.JavaConversions._
var data = Array(
Array("5.0", "1.0", "50.0"),
Array("10.0", "1.5", "40.0"),
Array("2.0", "1.5", "90.0"),
Array("1.0", "1.0", "100.0"),
Array("6.0", "3.0", "100.0"),
Array("16.0", "3.5", "50.0"))
val names = Array("K¹", "K²", "K³")
val m = new DefaultTableModel(data.asInstanceOf[Array[Array[AnyRef]]], names.asInstanceOf[Array[AnyRef]])
val t = m.getDataVector.toSeq
This is an older interface in Java, so it returns a pre-generic Vector (i.e. a Vector[_]). There are a variety of ways you could deal with this, but one is:
val jv = m.getDataVector.asInstanceOf[java.util.Vector[java.util.Vector[AnyRef]]]
val sv = jv.map(_.toSeq)
to first explicitly specify what the return type ought to be, and then convert it into Scala collections. If you prefer to convert to immutable collections, you can use
val sv = Vector() ++ jv.map(Vector() ++ _)
among other things. (These are now Scala immutable vectors, not java.util.Vectors.)
If you want to mutate the vectors that were returned, just use jv as-is, and rely upon the implicit conversions to do the work for you.
Edit: added a couple other ways to get immutable collections (possible, but I wouldn't say that they're better):
val sv = List(jv.map(v => List(v: _*)): _*)
val sv = Vector.tabulate(jv.length,jv(0).length)((i,j) => jv(i)(j))
Note that the second only works if the table is nonempty and rectangular.
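For newer Scala versions, where the implicit JavaConversions are deprecated, the same cast-then-convert approach can be written with explicit asScala calls. This is only a sketch under that assumption; the object name TableRows and the method rows are hypothetical, and it relies on scala.collection.JavaConverters (Scala 2.10 to 2.13).

```scala
import javax.swing.table.DefaultTableModel
import scala.collection.JavaConverters._

// Hypothetical helper illustrating the cast-then-convert approach described above.
object TableRows {
  // Returns the table data as immutable Scala vectors, one inner vector per row.
  def rows(m: DefaultTableModel): Vector[Vector[AnyRef]] = {
    // getDataVector is not fully generic, so state the element type explicitly.
    val jv = m.getDataVector.asInstanceOf[java.util.Vector[java.util.Vector[AnyRef]]]
    jv.asScala.map(_.asScala.toVector).toVector
  }
}
```

Unlike the Vector.tabulate variant, this does not require the table to be nonempty or rectangular.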
What is the equivalent Scala constructor (to create an immutable HashSet) to the Java
new HashSet<T>(c)
where c is of type Collection<? extends T>?
All I can find in the HashSet object is apply.
The most concise way to do this is probably to use the ++ operator:
import scala.collection.immutable.HashSet
val list = List(1,2,3)
val set = HashSet() ++ list
There are two parts to the answer. The first part is that Scala variable argument methods that take a T* are a sugaring over methods taking Seq[T]. You tell Scala to treat a Seq[T] as a list of arguments instead of a single argument using "seq : _*".
The second part is converting a Collection[T] to a Seq[T]. There's no general built-in way to do this in Scala's standard libraries just yet, but one very easy (if not necessarily efficient) way is to call toArray. Here's a complete example.
scala> val lst : java.util.Collection[String] = new java.util.ArrayList
lst: java.util.Collection[String] = []
scala> lst add "hello"
res0: Boolean = true
scala> lst add "world"
res1: Boolean = true
scala> Set(lst.toArray : _*)
res2: scala.collection.immutable.Set[java.lang.Object] = Set(hello, world)
Note that scala.Predef.Set is an alias for scala.collection.immutable.Set, whose default implementation for larger sets is scala.collection.immutable.HashSet.
From Scala 2.13 on, use the from method on the companion object:
import scala.collection.immutable.HashSet
val list = List(1,2,3)
val set = HashSet.from(list)
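When the source is a java.util.Collection rather than a Scala List, as in the question, the collection can first be viewed as a Scala Iterable and then copied into an immutable set. A minimal sketch, assuming scala.collection.JavaConverters is available (Scala 2.8 to 2.13); the object name FromJavaCollection and the method toImmutableSet are hypothetical.

```scala
import scala.collection.JavaConverters._

// Hypothetical helper, shown only to illustrate the conversion.
object FromJavaCollection {
  // asScala wraps the Java collection without copying; toSet then builds an immutable Set.
  def toImmutableSet[T](c: java.util.Collection[T]): Set[T] =
    c.asScala.toSet
}
```

Unlike the toArray approach shown earlier, this keeps the element type, so a Collection[String] becomes a Set[String] rather than a Set[java.lang.Object].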