How to trigger source generation with sbt - java

I have an sbt sub-project which compiles messages.json files into new java sources. I've set the task up to run before running tests and before compiling the primary project, or run manually via a new command "gen-messages".
The problem is the message generation takes some time, it always generates all sources, and it is running too often. Some tasks like running tests with coverage end up generating and compiling the messages twice!
How can I monitor the sources to the generator, and only run the source generation if something has changed/or the expected output java files are missing?
Secondly how would I go about running the generator only on changed messages.json files?
Currently the sbt commands I'm using are:
lazy val settingsForMessageGeneration =
((test in Test) <<= (test in Test) dependsOn(messageGenerationCommand)) ++
((compile in Compile) <<= (compile in Compile) dependsOn(messageGenerationCommand)) ++
(messageGenerationCommand <<= messageGenerationTask) ++
(sourceGenerators in Compile += messageGenerationTask.taskValue)
lazy val messageGenerationCommand = TaskKey[scala.collection.Seq[File]]("gen-messages")
lazy val messageGenerationTask = (
sourceManaged,
fullClasspath in Compile in messageGenerator,
runner in Compile in messageGenerator,
streams
) map { (dir, cp, r, s) =>
lazy val f = getFileTree(new File("./subProjectWithMsgSources/src/")).filter(_.getName.endsWith("messages.json"))
f.foreach({ te =>
val messagePackagePath = te.getAbsolutePath().replace("messages.json", "msg").replace("./", "")
val messagePath = te.getAbsolutePath().replace("./", "")
val fi = new File(messagePackagePath)
if (!fi.exists()) {
fi.mkdirs()
}
val ar = List("-d", messagePackagePath, messagePath)
toError(r.run("com.my.MessageClassGenerator", cp.files, ar, s.log))
})
getFileTree(new File("./subProjectWithMsgSources/src/"))
.filter(_.getName.endsWith("/msg/*.java"))
.to[scala.collection.Seq]
}
The message generator creates a directory with the newly created java files - no other content will be in that directory.
Related Questions
sbt generate using project generator

You can use sbt.FileFunction.cached to run your source generator only when your input files or output files have been changed.
The idea is to factor your actual source generation to a function Set[File] => Set[File], and call it via FileFunction.cached.
lazy val settingsForMessageGeneration =
((test in Test) <<= (test in Test) dependsOn(messageGenerationCommand)) ++
((compile in Compile) <<= (compile in Compile) dependsOn(messageGenerationCommand)) ++
(messageGenerationCommand <<= messageGenerationTask) ++
(sourceGenerators in Compile += messageGenerationTask.taskValue)
lazy val messageGenerationCommand = TaskKey[scala.collection.Seq[File]]("gen-messages")
lazy val messageGenerationTask = (
sourceManaged,
fullClasspath in Compile in messageGenerator,
runner in Compile in messageGenerator,
streams
) map { (dir, cp, r, s) =>
lazy val f = getFileTree(new File("./subProjectWithMsgSources/src/")).filter(_.getName.endsWith("messages.json"))
def gen(sources: Set[File]): Set[File] = {
sources.foreach({ te =>
val messagePackagePath = te.getAbsolutePath().replace("messages.json", "msg").replace("./", "")
val messagePath = te.getAbsolutePath().replace("./", "")
val fi = new File(messagePackagePath)
if (!fi.exists()) {
fi.mkdirs()
}
val ar = List("-d", messagePackagePath, messagePath)
toError(r.run("com.my.MessageClassGenerator", cp.files, ar, s.log))
})
getFileTree(new File("./subProjectWithMsgSources/src/"))
.filter(_.getName.endsWith("/msg/*.java"))
.to[scala.collection.immutable.Set]
}
val func = FileFunction.cached(s.cacheDirectory / "gen-messages", FilesInfo.hash) { gen }
func(f.toSet).toSeq
}

Related

Groovy script reloading on runtime

I want to be a able to execute a groovy script from my Java application.
I want to reload the groovy script on runtime if needed. According to their tutorials, I can do something like that :
long now = System.currentTimeMillis();
for(int i = 0; i < 100000; i++) {
try {
GroovyScriptEngine groovyScriptEngine = new GroovyScriptEngine("");
System.out.println(groovyScriptEngine.run("myScript.groovy", new Binding()););
} catch (Exception e) {
e.printStackTrace();
}
}
long end = System.currentTimeMillis();
System.out.println("time " + (end - now));//24 secs
myScript.groovy
"Hello-World"
This works fine and the script is reloaded everytime i change a line in myScript.groovy.
The problem is that this is not time efficient, what it does is parsing the script from the file every time.
Is there any other alternative ? E.g something smarter that checks if the script is already parsed and if it did not change since the last parse do not parse it again.
<< edited due to comments >>
Like mentioned in one of the comments, separating parsing (which is slow) from execution (which is fast) is mandatory if you need performance.
For reactive reloading of the script source we can for example use the java nio watch service:
import groovy.lang.*
import java.nio.file.*
def source = new File('script.groovy')
def shell = new GroovyShell()
def script = shell.parse(source.text)
def watchService = FileSystems.default.newWatchService()
source.canonicalFile.parentFile.toPath().register(watchService, StandardWatchEventKinds.ENTRY_MODIFY)
boolean keepWatching = true
Thread.start {
while (keepWatching) {
def key = watchService.take()
if (key.pollEvents()?.any { it.context() == source.toPath() }) {
script = shell.parse(source.text)
println "source reloaded..."
}
key.reset()
}
}
def binding = new Binding()
def start = System.currentTimeMillis()
for (i=0; i<100; i++) {
script.setBinding(binding)
def result = script.run()
println "script ran: $result"
Thread.sleep(500)
}
def delta = System.currentTimeMillis() - start
println "took ${delta}ms"
keepWatching = false
The above starts a separate watcher thread which uses the java watch service to monitor the parent directory for file modifications and reloads the script source when a modification is detected. This assumes java version 7 or later. The sleep is just there to make it easier to play around with the code and should naturally be removed if measuring performance.
Storing the string System.currentTimeMillis() in script.groovy and running the above code will leave it looping twice a second. Making modifications to script.groovy during the loop results in:
~> groovy solution.groovy
script ran: 1557302307784
script ran: 1557302308316
script ran: 1557302308816
script ran: 1557302309317
script ran: 1557302309817
source reloaded...
script ran: 1557302310318
script ran: 1557302310819
script ran: 1557302310819
source reloaded...
where the source reloaded... lines are printed whenever a change was made to the source file.
I'm not sure about windows but I believe at least on linux that java uses the fsnotify system under the covers which should make the file monitoring part performant.
Should be noted that if we are really unlucky, the script variable will be reset by the watcher thread between the two lines:
script.setBinding(binding)
def result = script.run()
which would break the code as the reloaded script instance would not have the binding set. To fix this, we can for example use a lock:
import groovy.lang.*
import java.nio.file.*
import java.util.concurrent.locks.ReentrantLock
def source = new File('script.groovy')
def shell = new GroovyShell()
def script = shell.parse(source.text)
def watchService = FileSystems.default.newWatchService()
source.canonicalFile.parentFile.toPath().register(watchService, StandardWatchEventKinds.ENTRY_MODIFY)
lock = new ReentrantLock()
boolean keepWatching = true
Thread.start {
while (keepWatching) {
def key = watchService.take()
if (key.pollEvents()?.any { it.context() == source.toPath() }) {
withLock {
script = shell.parse(source.text)
}
println "source reloaded..."
}
key.reset()
}
}
def binding = new Binding()
def start = System.currentTimeMillis()
for (i=0; i<100; i++) {
withLock {
script.setBinding(binding)
def result = script.run()
println "script ran: $result"
}
Thread.sleep(500)
}
def delta = System.currentTimeMillis() - start
println "took ${delta}ms"
keepWatching = false
def withLock(Closure c) {
def result
lock.lock()
try {
result = c()
} finally {
lock.unlock()
}
result
}
which convolutes the code somewhat but keeps us safe against concurrency issues which tend to be hard to track down.

scala matchers works from jar not from sbt

I have a code blurb thats doing some reflection on scala class files and looking for annotations,something like
// Create a new class loader with the directory
val cl = new URLClassLoader(allpaths.toArray)
getTypeNamesFromPath(sourceDir).map(s => {
val cls = cl.loadClass(s)
cls.getAnnotations.map(an =>
an match {
case q: javax.ws.rs.Path => println("found annotation")
case _ =>
})
})
This code prints "found annotation" when assembling this in a jar and running using java -jar, but doesn't print anything when running from sbt run.
sbt version is 13.8,
scala version 2.11.7
for completeness
private def getTypeNamesFromPath(file: File, currentPath: mutable.Stack[String] = new mutable.Stack[String]()): List[String] = {
if (file.isDirectory) {
var list = List[String]()
for (f <- file.listFiles()) {
currentPath.push(f.getName)
list = list ++ getTypeNamesFromPath(f, currentPath)
}
return list
}
if (currentPath.isEmpty)
throw new IllegalArgumentException(file.getAbsolutePath)
currentPath.pop()
if (file.getName.endsWith(".class")) {
return List(currentPath.foldRight("")((s, b) => if (b.isEmpty) s else b + "." + s) + "." + file.getName.stripSuffix(".class"))
}
return List()
}
so it turns out i need two things...
use URLClassLoader from scala.tools.nsc.util.ScalaClassLoader.URLClassLoader, this will solve loading the right classes.
set the fork := true in build.sbt, because once you do step #1, you will run into issues with class name clashes because sbt runs with own class paths and own version of class loader, more on it here http://www.scala-sbt.org/0.13.0/docs/Detailed-Topics/Forking.html

JSON4s can't find constructor w/spark

I've run into an issue with attempting to parse json in my spark job. I'm using spark 1.1.0, json4s, and the Cassandra Spark Connector, with DSE 4.6. The exception thrown is:
org.json4s.package$MappingException: Can't find constructor for BrowserData org.json4s.reflect.ScalaSigReader$.readConstructor(ScalaSigReader.scala:27)
org.json4s.reflect.Reflector$ClassDescriptorBuilder.ctorParamType(Reflector.scala:108)
org.json4s.reflect.Reflector$ClassDescriptorBuilder$$anonfun$6.apply(Reflector.scala:98)
org.json4s.reflect.Reflector$ClassDescriptorBuilder$$anonfun$6.apply(Reflector.scala:95)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
My code looks like this:
case class BrowserData(navigatorObjectData: Option[NavigatorObjectData],
flash_version: Option[FlashVersion],
viewport: Option[Viewport],
performanceData: Option[PerformanceData])
.... other case classes
def parseJson(b: Option[String]): Option[String] = {
implicit val formats = DefaultFormats
for {
browserDataStr <- b
browserData = parse(browserDataStr).extract[BrowserData]
navObject <- browserData.navigatorObjectData
userAgent <- navObject.userAgent
} yield (userAgent)
}
def getJavascriptUa(rows: Iterable[com.datastax.spark.connector.CassandraRow]): Option[String] = {
implicit val formats = DefaultFormats
rows.collectFirst { case r if r.getStringOption("browser_data").isDefined =>
parseJson(r.getStringOption("browser_data"))
}.flatten
}
def getRequestUa(rows: Iterable[com.datastax.spark.connector.CassandraRow]): Option[String] = {
rows.collectFirst { case r if r.getStringOption("ua").isDefined =>
r.getStringOption("ua")
}.flatten
}
def checkUa(rows: Iterable[com.datastax.spark.connector.CassandraRow], sessionId: String): Option[Boolean] = {
for {
jsUa <- getJavascriptUa(rows)
reqUa <- getRequestUa(rows)
} yield (jsUa == reqUa)
}
def run(name: String) = {
val rdd = sc.cassandraTable("beehive", name).groupBy(r => r.getString("session_id"))
val counts = rdd.map(r => (checkUa(r._2, r._1)))
counts
}
I use :load to load the file into the REPL, and then call the run function. The failure is happening in the parseJson function, as far as I can tell. I've tried a variety of things to try to get this to work. From similar posts, I've made sure my case classes are in the top level in the file. I've tried compiling just the case class definitions into a jar, and including the jar in like this: /usr/bin/dse spark --jars case_classes.jar
I've tried adding them to the conf like this: sc.getConf.setJars(Seq("/home/ubuntu/case_classes.jar"))
And still the same error. Should I compile all of my code into a jar? Is this a spark issue or a JSON4s issue? Any help at all appreciated.

how to save a list of case classes in scala

I have a case class named Rdv:
case class Rdv(
id: Option[Int],
nom: String,
prénom: String,
sexe: Int,
telPortable: String,
telBureau: String,
telPrivé: String,
siteRDV: String,
typeRDV: String,
libelléRDV: String,
numRDV: String,
étape: String,
dateRDV: Long,
heureRDVString: String,
statut: String,
orderId: String)
and I would like to save a list of such elements on disk, and reload them later.
I tried with java classes (ObjectOutputStream, fileOutputStream, objectInputStream, fileInputStream) but I have an error in the retrieving step : the statement
val n2 = ois.readObject().asInstanceOf[List[Rdv]]
always get an error(classNotFound:Rdv), although the correct path is given in the imports place.
Do you know a workaround to save such an object?
Please provide a little piece of code!
thanks
olivier
ps: I have the same error while using the Marshall class, such as in this code:
object Application extends Controller {
def index = Action {
//implicit val Rdv2Writes = Json.writes[rdv2]
def rdvTordv2(rdv: Rdv): rdv2 = new rdv2(
rdv.nom,
rdv.prénom,
rdv.dateRDV,
rdv.heureRDVString,
rdv.telPortable,
rdv.telBureau,
rdv.telPrivé,
rdv.siteRDV,
rdv.typeRDV,
rdv.libelléRDV,
rdv.orderId,
rdv.statut)
val n = variables.manager.liste_locale
val out = new FileOutputStream("out")
out.write(Marshal.dump(n))
out.close
val in = new FileInputStream("out")
val bytes = Stream.continually(in.read).takeWhile(-1 !=).map(_.toByte).toArray
val bar: List[Rdv] = Marshal.load[List[Rdv]](bytes) <--------------
val n3 = bar.map(rdv =>
rdvTordv2(rdv))
println("n3:" + n3.size)
Ok(views.html.Application.olivier2(n3))
}
},
in the line with the arrow.
It seems that the conversion to the type List[Rdv] encounters problems, but why? Is it a play! linked problem?
ok, there's a problem with play:
I created a new scala project with this code:
object Test1 extends App {
//pour des fins de test
case class Person(name:String,age:Int)
val liste_locale=List(new Person("paul",18))
val n = liste_locale
val out = new FileOutputStream("out")
out.write(Marshal.dump(n))
out.close
val in = new FileInputStream("out")
val bytes = Stream.continually(in.read).takeWhile(-1 !=).map(_.toByte).toArray
val bar: List[Person] = Marshal.load[List[Person]](bytes)
println(s"bar:size=${bar.size}")
}
and the display is good ("bar:size=1").
then, I modified my previous code in the play project, in the controller class, such as this:
object Application extends Controller {
def index = Action {
//pour des fins de test
case class Person(name:String,age:Int)
val liste_locale=List(new Person("paul",18))
val n = liste_locale
val out = new FileOutputStream("out")
out.write(Marshal.dump(n))
out.close
val in = new FileInputStream("out")
val bytes = Stream.continually(in.read).takeWhile(-1 !=).map(_.toByte).toArray
val bar: List[Person] = Marshal.load[List[Person]](bytes)
println(s"bar:size=${bar.size}")
Ok(views.html.Application.olivier2(Nil))
}
}
and I have an error saying:
play.api.Application$$anon$1: Execution exception[[ClassNotFoundException: controllers.Application$$anonfun$index$1$Person$3]]
is there anyone having the answer?
edit: I thought the error could come from sbt, so I modified build.scala such as this:
import sbt._
import Keys._
import play.Project._
object ApplicationBuild extends Build {
val appName = "sms_play_2"
val appVersion = "1.0-SNAPSHOT"
val appDependencies = Seq(
// Add your project dependencies here,
jdbc,
anorm,
"com.typesafe.slick" % "slick_2.10" % "2.0.0",
"com.github.nscala-time" %% "nscala-time" % "0.6.0",
"org.xerial" % "sqlite-jdbc" % "3.7.2",
"org.quartz-scheduler" % "quartz" % "2.2.1",
"com.esotericsoftware.kryo" % "kryo" % "2.22",
"io.argonaut" %% "argonaut" % "6.0.2")
val mySettings = Seq(
(javaOptions in run) ++= Seq("-Dconfig.file=conf/dev.conf"))
val playCommonSettings = Seq(
Keys.fork := true)
val main = play.Project(appName, appVersion, appDependencies).settings(
Keys.fork in run := true,
resolvers += Resolver.sonatypeRepo("snapshots")).settings(mySettings: _*)
.settings(playCommonSettings: _*)
}
but without success, the error is still there (Class Person not found)
can you help me?
Scala Pickling has reasonable momentum and the approach has many advantages (lots of the heavy lifting is done at compile time). There is a plugable serialization mechanism and formats like json are supported.

How to link classes from JDK into scaladoc-generated doc?

I'm trying to link classes from the JDK into the scaladoc-generated doc.
I've used the -doc-external-doc option of scaladoc 2.10.1 but without success.
I'm using -doc-external-doc:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar#http://docs.oracle.com/javase/7/docs/api/, but I get links such as index.html#java.io.File instead of index.html?java/io/File.html.
Seems like this option only works for scaladoc-generated doc.
Did I miss an option in scaladoc or should I fill a feature request?
I've configured sbt as follows:
scalacOptions in (Compile,doc) += "-doc-external-doc:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar#http://docs.oracle.com/javase/7/docs/api"
Note: I've seen the Opts.doc.externalAPI util in the upcoming sbt 0.13. I think a nice addition (not sure if it's possible) would be to pass a ModuleID instead of a File. The util would figure out which file corresponds to the ModuleID.
I use sbt 0.13.5.
There's no out-of-the-box way to have the feature of having Javadoc links inside scaladoc. And as my understanding goes, it's not sbt's fault, but the way scaladoc works. As Josh pointed out in his comment You should report to scaladoc.
There's however a workaround I came up with - postprocess the doc-generated scaladoc so the Java URLs get replaced to form proper Javadoc links.
The file scaladoc.sbt should be placed inside a sbt project and whenever doc task gets executed, the postprocessing via fixJavaLinksTask task kicks in.
NOTE There are lots of hardcoded paths so use it with caution (aka do the polishing however you see fit).
import scala.util.matching.Regex.Match
autoAPIMappings := true
// builds -doc-external-doc
apiMappings += (
file("/Library/Java/JavaVirtualMachines/jdk1.8.0_11.jdk/Contents/Home/jre/lib/rt.jar") ->
url("http://docs.oracle.com/javase/8/docs/api")
)
lazy val fixJavaLinksTask = taskKey[Unit](
"Fix Java links - replace #java.io.File with ?java/io/File.html"
)
fixJavaLinksTask := {
println("Fixing Java links")
val t = (target in (Compile, doc)).value
(t ** "*.html").get.filter(hasJavadocApiLink).foreach { f =>
println("fixing " + f)
val newContent = javadocApiLink.replaceAllIn(IO.read(f), fixJavaLinks)
IO.write(f, newContent)
}
}
val fixJavaLinks: Match => String = m =>
m.group(1) + "?" + m.group(2).replace(".", "/") + ".html"
val javadocApiLink = """\"(http://docs\.oracle\.com/javase/8/docs/api/index\.html)#([^"]*)\"""".r
def hasJavadocApiLink(f: File): Boolean = (javadocApiLink findFirstIn IO.read(f)).nonEmpty
fixJavaLinksTask <<= fixJavaLinksTask triggeredBy (doc in Compile)
I took the answer by #jacek-laskowski and modified it so that it avoid hard-coded strings and could be used for any number of Java libraries, not just the standard one.
Edit: the location of rt.jar is now determined from the runtime using sun.boot.class.path and does not have to be hard coded.
The only thing you need to modify is the map, which I have called externalJavadocMap in the following:
import scala.util.matching.Regex
import scala.util.matching.Regex.Match
val externalJavadocMap = Map(
"owlapi" -> "http://owlcs.github.io/owlapi/apidocs_4_0_2/index.html"
)
/*
* The rt.jar file is located in the path stored in the sun.boot.class.path system property.
* See the Oracle documentation at http://docs.oracle.com/javase/6/docs/technotes/tools/findingclasses.html.
*/
val rtJar: String = System.getProperty("sun.boot.class.path").split(java.io.File.pathSeparator).collectFirst {
case str: String if str.endsWith(java.io.File.separator + "rt.jar") => str
}.get // fail hard if not found
val javaApiUrl: String = "http://docs.oracle.com/javase/8/docs/api/index.html"
val allExternalJavadocLinks: Seq[String] = javaApiUrl +: externalJavadocMap.values.toSeq
def javadocLinkRegex(javadocURL: String): Regex = ("""\"(\Q""" + javadocURL + """\E)#([^"]*)\"""").r
def hasJavadocLink(f: File): Boolean = allExternalJavadocLinks exists {
javadocURL: String =>
(javadocLinkRegex(javadocURL) findFirstIn IO.read(f)).nonEmpty
}
val fixJavaLinks: Match => String = m =>
m.group(1) + "?" + m.group(2).replace(".", "/") + ".html"
/* You can print the classpath with `show compile:fullClasspath` in the SBT REPL.
* From that list you can find the name of the jar for the managed dependency.
*/
lazy val documentationSettings = Seq(
apiMappings ++= {
// Lookup the path to jar from the classpath
val classpath = (fullClasspath in Compile).value
def findJar(nameBeginsWith: String): File = {
classpath.find { attributed: Attributed[File] => (attributed.data ** s"$nameBeginsWith*.jar").get.nonEmpty }.get.data // fail hard if not found
}
// Define external documentation paths
(externalJavadocMap map {
case (name, javadocURL) => findJar(name) -> url(javadocURL)
}) + (file(rtJar) -> url(javaApiUrl))
},
// Override the task to fix the links to JavaDoc
doc in Compile <<= (doc in Compile) map {
target: File =>
(target ** "*.html").get.filter(hasJavadocLink).foreach { f =>
//println(s"Fixing $f.")
val newContent: String = allExternalJavadocLinks.foldLeft(IO.read(f)) {
case (oldContent: String, javadocURL: String) =>
javadocLinkRegex(javadocURL).replaceAllIn(oldContent, fixJavaLinks)
}
IO.write(f, newContent)
}
target
}
)
I am using SBT 0.13.8.

Categories