I'm trying to use a Java 8 Stream from Scala code as below and am stuck with a compilation error.
Any help appreciated!
@throws[Exception]
def sendRecord(record: String): Unit
bufferedReader.lines().forEach(s => sendRecord(s))
Cannot resolve forEach with such signature, expect: Consumer[_ >: String], actual: (Nothing)
P.S. Though there is some indication that it should be almost straightforward (e.g. https://gist.github.com/adriaanm/892d6063dd485d7dd221), it doesn't seem to work. I'm running Scala 2.11.8.
You can convert the Java Stream to an iterator and iterate over it from Scala, like:
import scala.collection.JavaConverters._
bufferedReader.lines().iterator().asScala.foreach(s => sendRecord(s))
Look at the top of the file you linked in your question. It mentions the -Xexperimental flag. If you run the Scala compiler or REPL with this flag, Scala functions will be translated to their Java equivalents. Another option is to pass in a Java function manually:
scala> java.util.stream.Stream.of("asd", "dsa").map(new java.util.function.Function[String, String] { def apply(s: String) = s + "x" }).toArray
res0: Array[Object] = Array(asdx, dsax)
You can also create an (implicit) conversion to wrap Scala functions for you.
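For example, a minimal sketch of such a conversion (the helper name asJavaConsumer is mine, not from any library):

import java.util.function.Consumer
import scala.language.implicitConversions

// A sketch: wrap a Scala function as a java.util.function.Consumer so a
// Stream.forEach call compiles on Scala 2.11 without -Xexperimental.
implicit def asJavaConsumer[A](f: A => Unit): Consumer[A] =
  new Consumer[A] {
    override def accept(a: A): Unit = f(a)
  }

// With the conversion in scope (note the explicit parameter type):
// bufferedReader.lines().forEach((s: String) => sendRecord(s))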
You can also wait for Scala 2.12; with that version you won't need the flag anymore.
Update
Now that Scala 2.12 is out, the code in the question compiles normally.
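For reference, since 2.12 compiles Scala lambdas to implementations of Java functional interfaces, the line from the question works as-is:

bufferedReader.lines().forEach(s => sendRecord(s))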
The problem is that the lambda expression in Scala does not automatically implement the expected Java functional interface Consumer.
See these questions for details on how to solve it; in Scala 2.12 it will probably work without conversions:
"Lambdifying" scala Function in Java
Smooth way of using Function<A, R> java interface from scala?
In Scala 2.12 you can work with Java streams very easily:
import java.util.Arrays
val stream = Arrays.stream(Array(1, 2, 3, 4, 6, 7))
stream
  .map {
    case i: Int if i % 2 == 0 => i * 2
    case i: Int if i % 2 == 1 => i * 2 + 2
  }
  .forEach(println)
Related
I am wrapping a Java library in Clojure. Depending on the Java library version, some classes exist or not, so my library can fail to even compile if it can't find the Java classes. My idea was to use Reflector with the string names of the classes.
Example of what I'm trying to do:
(java.time.LocalDateTime/parse "2020-01-01")
would become
(if right-version?
  (clojure.lang.Reflector/invokeStaticMethod "java.time.LocalDateTime" "parse" (into-array ["2020-01-01"])))
This works but is slower by a factor of 20x. Is there a better way to achieve the same? Can I use a macro that will define the correct function at compile time, depending on the version of the underlying library?
Thanks,
I have been using a macro solution to this problem for 6+ years in the Tupelo Library. It allows you to write code like:
(defn base64-encoder []
  (if-java-1-8-plus
    (java.util.Base64/getEncoder)
    (throw (RuntimeException. "Unimplemented prior to Java 1.8: "))))
The macro itself is quite simple:
(defmacro if-java-1-11-plus
  "If JVM is Java 1.11 or higher, evaluates if-form into code. Otherwise, evaluates else-form."
  [if-form else-form]
  (if (is-java-11-plus?)
    `(do ~if-form)
    `(do ~else-form)))

(defmacro when-java-1-11-plus
  "If JVM is Java 1.11 or higher, evaluates forms into code. Otherwise, elides forms."
  [& forms]
  (when (is-java-11-plus?)
    `(do ~@forms)))
and the version-testing functions look like:
;-----------------------------------------------------------------------------
; Java version stuff

(s/defn version-str->semantic-vec :- [s/Int]
  "Returns the java version as a semantic vector of integers, like `11.0.17` => [11 0 17]"
  [s :- s/Str]
  (let [v1 (str/trim s)
        v2 (xsecond (re-matches #"([.0-9]+).*" v1)) ; remove any suffix like on `1.8.0-b097` or `1.8.0_234`
        v3 (str/split v2 #"\.")
        v4 (mapv #(Integer/parseInt %) v3)]
    v4))

(s/defn java-version-str :- s/Str
  [] (System/getProperty "java.version"))

(s/defn java-version-semantic :- [s/Int]
  [] (version-str->semantic-vec (java-version-str)))

(s/defn java-version-min? :- s/Bool
  "Returns true if the Java version is at least as great as the supplied string.
   Sort is by lexicographic (alphabetic) order."
  [tgt-version-str :- s/Str]
  (let [tgt-version-vec    (version-str->semantic-vec tgt-version-str)
        actual-version-vec (java-version-semantic)
        result             (increasing-or-equal? tgt-version-vec actual-version-vec)]
    result))

(when-not (java-version-min? "1.7")
  (throw (ex-info "Must have at least Java 1.7" {:java-version (java-version-str)})))

(defn is-java-8-plus?  [] (java-version-min? "1.8")) ; ***** NOTE: version string is still `1.8` *****
(defn is-java-11-plus? [] (java-version-min? "11"))
(defn is-java-17-plus? [] (java-version-min? "17"))
The advantage of using the macro version is that you can refer to a Java class normally via the symbol java.util.Base64. Without macros, this will crash the compiler for older versions of Java even if wrapped by an if or when, since the symbol will be unresolved before the if or when is evaluated.
Since Java doesn't have macros, the only workaround in that case is to use the string "java.util.Base64"
and then Class/forName, etc, which is awkward & ugly. Since Clojure has macros, we can take advantage of conditional code compilation to avoid needing the powerful (but awkward) Java Reflection API.
Instead of copying or re-writing these functions into your own code, just put
[tupelo "22.05.04"]
into your project.clj and away you go!
P.S.
You do not need to throw an exception if you detect an older version of Java. This example simply elides the code if the Java version is too old:
(t/when-java-1-11-plus
  (dotest
    (throws-not? (Instant/parse "2019-02-14T02:03:04.334Z"))
    (throws-not? (Instant/parse "2019-02-14T02:03:04Z"))
    (throws-not? (Instant/parse "0019-02-14T02:03:04Z")) ; can handle really old dates w/o throwing
    ...))
I could not tell from your question if your code uses Reflector on every call to the parse method. If so, you could instead define the Method once and reuse it many times:
(def right-version? true) ; set as appropriate

(def ^java.lang.reflect.Method parse-method
  (when right-version?
    (.getMethod (Class/forName "java.time.LocalDateTime")
                "parse"
                (into-array [java.lang.CharSequence]))))

(defn parse-local-date-time [s]
  (when parse-method
    (.invoke parse-method nil (into-array [s]))))

(parse-local-date-time "2020-01-01T14:30:00")
;; => #object[java.time.LocalDateTime 0x268fc120 "2020-01-01T14:30"]
I am new to Spark and GraphFrames.
When I wanted to learn about the shortestPaths method in GraphFrame, the GraphFrames documentation gave me sample code in Scala, but not in Java.
In their documentation, they provide the following (Scala code):
import org.graphframes.{examples, GraphFrame}
val g: GraphFrame = examples.Graphs.friends // get example graph
val results = g.shortestPaths.landmarks(Seq("a", "d")).run()
results.select("id", "distances").show()
and in Java, I tried:
import org.graphframes.GraphFrames;
import scala.collection.Seq;
import scala.collection.JavaConverters;
GraphFrame g = new GraphFrame(...,...);
Seq landmarkSeq = JavaConverters.collectionAsScalaIterableConverter(Arrays.asList((Object)"a",(Object)"d")).asScala().toSeq();
g.shortestPaths().landmarks(landmarkSeq).run().show();
or
g.shortestPaths().landmarks(new ArrayList<Object>(List.of((Object)"a",(Object)"d"))).run().show();
Casting to java.lang.Object was necessary since the API demands Seq<Object> or ArrayList<Object>, and I could not get it to compile when passing an ArrayList<String>.
After running the code, I saw the message:
Exception in thread "main" org.apache.spark.sql.AnalysisException: You're using untyped Scala UDF, which does not have the input type information. Spark may blindly pass null to the Scala closure with primitive-type argument, and the closure will see the default value of the Java type for the null argument, e.g. `udf((x: Int) => x, IntegerType)`, the result is 0 for null input. To get rid of this error, you could:
1. use typed Scala UDF APIs(without return type parameter), e.g. `udf((x: Int) => x)`
2. use Java UDF APIs, e.g. `udf(new UDF1[String, Integer] { override def call(s: String): Integer = s.length() }, IntegerType)`, if input types are all non primitive
3. set spark.sql.legacy.allowUntypedScalaUDF to true and use this API with caution;
To follow option 3, I added the code:
System.setProperty("spark.sql.legacy.allowUntypedScalaUDF","true");
but the situation did not change.
Since there is only a limited amount of sample code and few Stack Overflow questions about GraphFrames in Java, I could not find any useful information while searching around.
Could anyone experienced in this area help me solve this problem?
This seems to be a bug in GraphFrames 0.8.0.
See Issue #367 on GitHub.
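If you still want to try option 3, note that (as far as I can tell) this is a Spark SQL configuration, so it needs to be set on the SparkSession rather than as a JVM system property. A sketch in Scala syntax, assuming spark is your session (the Java equivalent is spark.conf().set(...)):

// Set the legacy flag on the session's runtime conf, not via System.setProperty.
spark.conf.set("spark.sql.legacy.allowUntypedScalaUDF", "true")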
I have the following method in Java:
protected <T> T getObjectFromNullableOptional(final Optional<T> opt) {
    return Optional.ofNullable(opt).flatMap(innerOptional -> innerOptional).orElse(null);
}
It takes a Java Optional that can itself be null (I know, this is really bad, and we're going to fix it eventually) and wraps it in another Optional, so it becomes either Some(Optional<T>) or None. That is then flatMapped, so we get back Optional<T> or None, and finally orElse() is applied to get T or null.
How do I write the same method with the same java.util.Optional in Scala?
protected def getObjectFromNullableOptional[T >: Null](opt: Optional[T]): T =
  ???
I tried
protected def getObjectFromNullableOptional[T >: Null](opt: Optional[T]): T =
  Optional.ofNullable(opt).flatMap(o => o).orElse(null)
But this gives me a Type mismatch error
Required: Function[_ >: Optional[T], Optional[NotInferedU]]
Found: Nothing => Nothing
I tried
protected def getObjectFromNullableOptional[T >: Null](opt: Optional[T]): T =
  Option(opt).flatMap(o => o).getOrElse(null)
But this gives me
Cannot resolve overloaded method 'flatMap'
Edit: I neglected to mention I'm using Scala 2.11. I believe @stefanobaghino's solution is for Scala 2.13, but it guided me towards the right path. I put my final solution in the comments under that solution.
The last error raises a few suspicions: it looks like you're wrapping a Java Optional in a Scala Option. I would instead have expected this to fail because you're trying to flatMap to a different type, with something like:
error: type mismatch;
found : java.util.Optional[T] => java.util.Optional[T]
required: java.util.Optional[T] => Option[?]
This seems to fulfill your requirement:
import java.util.Optional
def getObjectFromNullableOptional[T](opt: Optional[T]): T =
  Optional.ofNullable(opt).orElse(Optional.empty).orElse(null.asInstanceOf[T])
assert(getObjectFromNullableOptional(null) == null)
assert(getObjectFromNullableOptional(Optional.empty) == null)
assert(getObjectFromNullableOptional(Optional.of(1)) == 1)
You can play around with this here on Scastie.
Note that asInstanceOf is compiled to a cast, not to an actual method call, so this code will not throw a NullPointerException.
You can also go into something closer to your original solution by helping Scala's type inference a bit:
def getObjectFromNullableOptional[T](opt: Optional[T]): T =
  Optional.ofNullable(opt).flatMap((o: Optional[T]) => o).orElse(null.asInstanceOf[T])
Or alternatively using Scala's identity:
def getObjectFromNullableOptional[T](opt: Optional[T]): T =
  Optional.ofNullable(opt).flatMap(identity[Optional[T]]).orElse(null.asInstanceOf[T])
For a solution using Scala's Option you can do something very close:
def getObjectFromNullableOption[T](opt: Option[T]): T =
  Option(opt).getOrElse(None).getOrElse(null.asInstanceOf[T])
Note that going to your flatMap solution with Scala's Option allows you to avoid having to be explicit about the function type:
def getObjectFromNullableOption[T](opt: Option[T]): T =
  Option(opt).flatMap(identity).getOrElse(null.asInstanceOf[T])
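A quick sanity check of the Option-based versions, mirroring the asserts above (these checks are mine, not from the original answer):

assert(getObjectFromNullableOption[Integer](null) == null)
assert(getObjectFromNullableOption[Integer](None) == null)
assert(getObjectFromNullableOption(Some(1)) == 1)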
I'm not fully sure about the specifics, but I believe the issue is that, when using java.util.Optional, you are passing a Scala function to Optional.flatMap, which takes a Java Function. The Scala compiler can convert this automatically for you, but apparently you have to be explicit about the type for this to work, at least in this case.
A note about your original code: you required T to be a supertype of Null, but this is not necessary.
You have better context regarding what you are doing, but as general advice it's usually better to avoid letting nulls leak into Scala code as much as possible.
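For instance, a minimal sketch of the null-free alternative (my example, not from the question): convert the possibly-null Optional to a Scala Option at the boundary and let callers handle absence explicitly.

import java.util.Optional

// A sketch: never surface null; absence stays an Option for the caller.
def toOption[T](opt: Optional[T]): Option[T] =
  Option(opt).flatMap(o => if (o.isPresent) Some(o.get) else None)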
I'm migrating code from Python to Java and want to build an n-sampler for Dataset<Row>. It's been a bit frustrating, so I ended up cheating and making a very inefficient Scala function for it based on other posts. I then ran the function from my Java code, but even that hasn't worked.
N-Sample behaviour:
- Select N-rows randomly from dataset
- No repetitions (no replacement)
Current Solution (broken)
import scala.util.Random

object ScalaFunctions {
  def nSample(df: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row], n: Int): org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = {
    // inefficient! Shuffles the entire dataframe
    val output = Random.shuffle(df).take(n)
    return output.asInstanceOf[org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]]
  }
}
Error Message
Error:(6, 25) inferred type arguments [org.apache.spark.sql.Row,org.apache.spark.sql.Dataset] do not conform to method shuffle's type parameter bounds [T,CC[X] <: TraversableOnce[X]]
val output = Random.shuffle(df).take(n)
Error:(6, 33) type mismatch;
found : org.apache.spark.sql.Dataset[org.apache.spark.sql.Row]
required: CC[T]
val output = Random.shuffle(df).take(n)
I'm new to Java and Scala, so even though I understand the shuffle function doesn't seem to like Datasets, I have no idea how to fix it.
- Virtual beer if you have a solution that doesn't involve shuffling the entire dataframe (for me, this could be like 4M rows) for a small n sample (250)
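For what it's worth, a hedged sketch of one common approach (my suggestion, not from the post): oversample a small fraction with Dataset.sample, which avoids shuffling the full dataframe, then trim to exactly n rows.

import org.apache.spark.sql.{Dataset, Row}

// A sketch: sample slightly more than n rows without replacement, then
// limit to n. Assumes an extra pass for count() is acceptable; for tiny
// fractions you may need a larger oversampling factor.
def nSample(df: Dataset[Row], n: Int): Dataset[Row] = {
  val fraction = math.min(1.0, 1.2 * n / df.count()) // ~20% oversample
  df.sample(withReplacement = false, fraction = fraction).limit(n)
}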
I have a Scala data structure created with the following:
List(Map[String, AnyRef]("a" -> someFoo, "b" -> someBar))
I would like to implicitly convert it (using scala.collection.JavaConversions or scala.collection.JavaConverters) to a java.util.List<java.util.Map<String, Object>> to be passed to a Java method that expects the latter.
Is this possible?
I have already created the following method that does it, but I was wondering if it can be done automatically by the compiler.
import scala.collection.JavaConversions._

def convertToJava(listOfMaps: List[Map[String, AnyRef]]): java.util.List[java.util.Map[String, Object]] = {
  asJavaList(listOfMaps.map(asJavaMap(_)))
}
How about writing
implicit def convertToJava...
?
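Spelled out, that suggestion is just your own method marked implicit (reusing your original body):

import scala.collection.JavaConversions._

implicit def convertToJava(listOfMaps: List[Map[String, AnyRef]]): java.util.List[java.util.Map[String, Object]] =
  asJavaList(listOfMaps.map(asJavaMap(_)))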
You don't want this kind of multilevel conversion happening by magic. You can improve a little on your own conversion though, at least aesthetically.
import java.{ util => ju }
import scala.collection.JavaConversions._

implicit def convert[K, V](xs: List[Map[K, V]]): ju.List[ju.Map[K, V]] =
  xs map (x => x: ju.Map[K, V])
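Hypothetical usage (my example): with the conversion in scope, the Scala structure can be passed directly where the Java type is expected.

val xs: List[Map[String, AnyRef]] = List(Map("a" -> "foo", "b" -> "bar"))
val javaList: ju.List[ju.Map[String, AnyRef]] = xs // the implicit conversion applies here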