I have an ArrayList in Java which holds objects of type Row, as follows:
List<Row> Table = new ArrayList<Row>();
I need to convert this Table to a Scala Seq in order to later convert it to a DataFrame. I tried the following without success:
var TableScala: Seq[Row] =
  JavaConverters.asScalaBufferConverter(getTable).asScala
val newList = TableScala.map(row => new Tuple5(row(0), row(1), row(2), row(3), row(4)))
spark.createDataFrame(newList).toDF("userID")
Since Scala 2.8.1, these conversions are made explicit through the “scala.collection.JavaConverters._” API. The following code shows the same conversion using this API.
First import
import scala.collection.JavaConverters._
The Java list (using Integer here so the squaring example below type-checks):
List<Integer> javaList = new ArrayList<Integer>();
Use “asScala” to convert the Java list to a Scala Buffer:
val scalaList = javaList.asScala
val squareList = scalaList.map(value => value * value)
println("square list is " + squareList)
reference : http://blog.madhukaraphatak.com/converting-java-collections-to-scala/
I'm trying to read in a .yaml file into my Scala code. Assume I have the following yaml file:
animals: ["dog", "cat"]
I am trying to read it using the following code:
val e = yaml.load(os.read("config.yaml")).asInstanceOf[java.util.Map[String, Any]]
val arr = e.getOrDefault("animals", new java.util.ArrayList[String]()) // arr is Any, but I know it contains a java.util.ArrayList<String>
arr.asInstanceOf[Buffer[String]] // ArrayList cannot be cast to Buffer
The ArrayList is type Any, so how do I cast to a Scala Collection e.g. Buffer? (Or Seq, List...)
SnakeYAML (assuming that's what you're using) can't give you a Scala collection like Buffer directly, but you can ask it for an ArrayList of String and then convert that to whatever you need.
import java.util
import scala.jdk.CollectionConverters._

val list = arr.asInstanceOf[util.ArrayList[String]].asScala
results in:
list: scala.collection.mutable.Buffer[String] = Buffer(dog, cat)
Another option you have is to define a model of your configuration, for example:
import scala.beans.BeanProperty

class Sample {
  @BeanProperty var animals = new java.util.ArrayList[String]()
}
The following will create an instance of Sample:
import java.io.StringReader
import org.yaml.snakeyaml.Yaml
import org.yaml.snakeyaml.constructor.Constructor

val input = new StringReader("animals: [\"dog\", \"cat\"]")
val yaml = new Yaml(new Constructor(classOf[Sample]))
val sample = yaml.load(input).asInstanceOf[Sample]
Then, using CollectionConverters in Scala 2.13, or JavaConverters in Scala 2.12 or prior, convert animals into a Scala structure:
val buffer = sample.getAnimals.asScala
I've tried the solution using withColumn specified here:
How to cast all columns of Spark dataset to string using Java
But that solution takes a performance hit for a huge number of columns (1k-6k): it runs for more than 6 hours and then gets aborted.
Alternatively, I'm trying to use map to cast, like below, but I get an error:
MapFunction<Column, Column> mapFunction = (c) -> {
return c.cast("string");
};
dataset = dataset.map(mapFunction, Encoders.bean(Column.class));
Error with above snippet:
The method map(Function1<Row,U>, Encoder<U>) in the type Dataset<Row> is not applicable for the arguments (MapFunction<Column,Column>, Encoder<Column>)
Import used:
import org.apache.spark.api.java.function.MapFunction;
Are you sure you mean 1k-6k columns, or do you mean rows? In any case, you can cast columns generically like this:
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StringType
import spark.implicits._

val df = Seq((1, 2), (2, 3), (3, 4)).toDF("a", "b")
val cols = for {
  a <- df.columns
} yield col(a).cast(StringType)
df.select(cols: _*)
I found the solution below, for anyone looking for this:
String[] strColNameArray = dataset.columns();
List<Column> colNames = new ArrayList<>();
for (String strColName : strColNameArray) {
    colNames.add(new Column(strColName).cast("string"));
}
dataset = dataset.select(JavaConversions.asScalaBuffer(colNames));
I am implementing a Kotlin interface in Java which expects me to return a Sequence<T>.
How can I convert a Java collection into a Kotlin Sequence? Conversely, how can I convert a Kotlin Sequence into a Java collection?
Here are some conversions:
val javaList = java.util.ArrayList<String>()
javaList.addAll(listOf("A", "B", "C"))
// From Java List to Sequence
val seq = sequenceOf(*javaList.toTypedArray())
// or
val seq2 = javaList.asSequence()
// Sequence to Kotlin List
val list = seq.toList()
// Kotlin List to Sequence
val seqFromList = sequenceOf(*list.toTypedArray())
// or
val seqFromList2 = list.asSequence()
// Sequence to Java List
val newJavaList = java.util.ArrayList<String>().apply { seqFromList.toCollection(this) }
// or
val newJavaList2 = java.util.ArrayList<String>()
newJavaList2.addAll(seqFromList)
Since the Kotlin code gets run from Java, it gets a bit trickier.
Let's try to recreate the scenario:
Kotlin:
interface SequenceInterface {
    fun foo(list: List<Int>): Sequence<Int>
}
If you inspect the Kotlin code, you will discover that there is no particular implementation of the Sequence interface. So, in your case, you need to implement it yourself (just like Kotlin does when calling asSequence):
public class Foo implements SequenceInterface {
    @NotNull
    public Sequence<Integer> foo(final List<Integer> list) {
        return () -> list.listIterator();
    }
}
and then you can start using it in Java:
new Foo().foo(Arrays.asList(42))
Keep in mind that all useful methods will be gone since they are implemented as Kotlin extensions.
Want to convert to a List? In plain Java, just iterate over it again:
ArrayList<Integer> list = new ArrayList<>();
seq.iterator().forEachRemaining(list::add);
Also, make sure that you absolutely need to return a Sequence in Kotlin's code. If you want to achieve better interoperability, returning a Stream would make more sense.
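To make the Stream suggestion concrete, here is a sketch in plain Java; the `StreamInterface` name is hypothetical, mirroring the Kotlin `SequenceInterface` above, and the key point is that Java callers keep the full Stream API without needing Kotlin extensions:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical Stream-returning counterpart of the Kotlin SequenceInterface above.
interface StreamInterface {
    Stream<Integer> foo(List<Integer> list);
}

public class StreamFoo implements StreamInterface {
    @Override
    public Stream<Integer> foo(final List<Integer> list) {
        // A Java List converts to a Stream trivially; no Kotlin runtime needed.
        return list.stream();
    }

    public static void main(String[] args) {
        // Callers get map/filter/collect for free, unlike with a bare Kotlin Sequence.
        List<Integer> doubled = new StreamFoo().foo(Arrays.asList(1, 2, 3))
                .map(x -> x * 2)
                .collect(Collectors.toList());
        System.out.println(doubled); // prints [2, 4, 6]
    }
}
```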
Like this:
val javaList = java.util.ArrayList<String>()
val kotlinSeq = javaList.asSequence()
val newJavaList = java.util.ArrayList<String>().apply { kotlinSeq.toCollection(this) }
I'm trying to convert a Java List to a Scala List[scala.Long]. I have seen Scala to Java, but not the other way around.
I have tried using:
import scala.collection.JavaConverters._

def convertJavaList2ScalaList[A](list: java.util.List[A]): List[A] = {
  val buffer = list.asScala
  buffer.toList
}
It works for other objects (e.g. Person), but it doesn't work when I try to convert java.lang.Long to scala.Long.
Thanks for the help.
import scala.collection.JavaConverters._
// given a Java List of Java Longs:
val jlist: java.util.List[java.lang.Long] = ???
val scalaList: List[Long] = jlist.asScala.toList.map(_.toLong)
I am new to Scala and Spark. I have the below case class A:
case class A(uniqueId: String,
             attributes: HashMap[String, List[String]])
Now I have a DataFrame of type A. I need to call a Java function on each row of that DF, which means converting the HashMap to a java.util.HashMap and the List to a java.util.List. How can I do that?
I am trying the following:
val rddCaseClass: RDD[A] = ...
val a = rddCaseClass.toDF().map(x => {
  val rowData = x.getAs[java.util.HashMap[String, java.util.List[String]]]("attributes")
  callJavaMethod(rowData)
})
But this gives me the error:
java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to java.util.List
Please help.
You can convert a Scala WrappedArray to a Java List using scala.collection.JavaConversions:
import scala.collection.JavaConversions
import scala.collection.mutable.WrappedArray

val wrappedArray: WrappedArray[String] = WrappedArray.make(Array("Java", "Scala"))
val javaList = JavaConversions.mutableSeqAsJavaList(wrappedArray)
JavaConversions.asJavaList can also be used, but it's deprecated; use mutableSeqAsJavaList instead.
I think you could use Seq instead of List for your parameters to work efficiently with lists. That way it should work with most Seq implementations, with no need to convert seqs like WrappedArray.
val rddCaseClass: RDD[A] = ...
val a = rddCaseClass.toDF().map(x => {
  val rowData = x.getAs[java.util.HashMap[String, Seq[String]]]("attributes")
  callJavaMethod(rowData)
})