I have a code snippet that does some reflection on Scala class files and looks for annotations, something like:
// Create a new class loader with the directory
val cl = new URLClassLoader(allpaths.toArray)
getTypeNamesFromPath(sourceDir).map(s => {
val cls = cl.loadClass(s)
cls.getAnnotations.map(an =>
an match {
case q: javax.ws.rs.Path => println("found annotation")
case _ =>
})
})
This code prints "found annotation" when I assemble it into a jar and run it with java -jar, but prints nothing when run with sbt run.
sbt version is 0.13.8, Scala version 2.11.7.
For completeness:
private def getTypeNamesFromPath(file: File, currentPath: mutable.Stack[String] = new mutable.Stack[String]()): List[String] = {
if (file.isDirectory) {
var list = List[String]()
for (f <- file.listFiles()) {
currentPath.push(f.getName)
list = list ++ getTypeNamesFromPath(f, currentPath)
}
return list
}
if (currentPath.isEmpty)
throw new IllegalArgumentException(file.getAbsolutePath)
currentPath.pop()
if (file.getName.endsWith(".class")) {
return List(currentPath.foldRight("")((s, b) => if (b.isEmpty) s else b + "." + s) + "." + file.getName.stripSuffix(".class"))
}
return List()
}
It turns out I needed two things:
1. Use URLClassLoader from scala.tools.nsc.util.ScalaClassLoader.URLClassLoader; this solves loading the right classes.
2. Set fork := true in build.sbt. Once you do step 1, you will run into class name clashes, because sbt runs your code with its own classpath and its own class loader. More on this here: http://www.scala-sbt.org/0.13.0/docs/Detailed-Topics/Forking.html
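When a type pattern such as case q: javax.ws.rs.Path never matches, one possible cause is that the annotation type was defined by a different class loader than the one that loaded the calling code, since the JVM identifies a class by its name together with its defining loader. The following is only a diagnostic sketch, not part of the original code: AnnotationCheck, hasAnnotationNamed and describeAnnotations are hypothetical names, and java.lang.Runnable is used as the sample class only because it carries a runtime annotation on Java 8 and later.

```scala
object AnnotationCheck {
  // Hypothetical helper: true if cls carries a runtime annotation whose type has
  // the given fully qualified name, regardless of which loader defined that type.
  def hasAnnotationNamed(cls: Class[_], annotationName: String): Boolean =
    cls.getAnnotations.exists(_.annotationType.getName == annotationName)

  // Hypothetical helper: lists each runtime annotation with the class loader that
  // defined its type (the bootstrap loader shows up as null).
  def describeAnnotations(cls: Class[_]): Seq[String] =
    cls.getAnnotations.toSeq.map { an =>
      val t = an.annotationType
      s"${t.getName} loaded by ${t.getClassLoader}"
    }
}

// Usage: inspect a class that has a runtime annotation on Java 8+
AnnotationCheck.describeAnnotations(classOf[Runnable]).foreach(println)
```

If the name-based check finds the annotation while the type pattern does not, the two sides are most likely seeing different Class objects for the same annotation name.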
I am using the function below to list directories. It works in Azure Databricks, but when I add it to my IntelliJ project, the "union" keyword cannot be resolved. Do I need to import anything here?
def listLeafDirectories(path: String): Array[String] =
dbutils.fs.ls(path).map(file => {
// Work around double encoding bug
val path = file.path.replace("%25", "%").replace("%25", "%")
if (file.isDir) listLeafDirectories(path)
else Array[String](path.substring(0,path.lastIndexOf("/")+1))
}).reduceOption(_ union _).getOrElse(Array()).distinct
Sorry, my mistake.
In the screenshot highlighting the error, the function takes a recurse flag that I was not passing in the recursive call (within the same function).
So the version below worked:
def listDirectories(dir: String, recurse: Boolean): Array[String] = {
dbutils.fs.ls(dir).map(file => {
val path = file.path.replace("%25", "%").replace("%25", "%")
if (file.isDir) listDirectories(path,recurse)
else Array[String](path.substring(0, path.lastIndexOf("/")+1))
}).reduceOption(_ union _).getOrElse(Array()).distinct
}
I have an sbt sub-project which compiles messages.json files into new java sources. I've set the task up to run before running tests and before compiling the primary project, or run manually via a new command "gen-messages".
The problem is the message generation takes some time, it always generates all sources, and it is running too often. Some tasks like running tests with coverage end up generating and compiling the messages twice!
How can I monitor the sources to the generator, and only run the source generation if something has changed/or the expected output java files are missing?
Secondly how would I go about running the generator only on changed messages.json files?
Currently the sbt commands I'm using are:
lazy val settingsForMessageGeneration =
((test in Test) <<= (test in Test) dependsOn(messageGenerationCommand)) ++
((compile in Compile) <<= (compile in Compile) dependsOn(messageGenerationCommand)) ++
(messageGenerationCommand <<= messageGenerationTask) ++
(sourceGenerators in Compile += messageGenerationTask.taskValue)
lazy val messageGenerationCommand = TaskKey[scala.collection.Seq[File]]("gen-messages")
lazy val messageGenerationTask = (
sourceManaged,
fullClasspath in Compile in messageGenerator,
runner in Compile in messageGenerator,
streams
) map { (dir, cp, r, s) =>
lazy val f = getFileTree(new File("./subProjectWithMsgSources/src/")).filter(_.getName.endsWith("messages.json"))
f.foreach({ te =>
val messagePackagePath = te.getAbsolutePath().replace("messages.json", "msg").replace("./", "")
val messagePath = te.getAbsolutePath().replace("./", "")
val fi = new File(messagePackagePath)
if (!fi.exists()) {
fi.mkdirs()
}
val ar = List("-d", messagePackagePath, messagePath)
toError(r.run("com.my.MessageClassGenerator", cp.files, ar, s.log))
})
getFileTree(new File("./subProjectWithMsgSources/src/"))
.filter(_.getName.endsWith("/msg/*.java"))
.to[scala.collection.Seq]
}
The message generator creates a directory with the newly created java files - no other content will be in that directory.
You can use sbt.FileFunction.cached to run your source generator only when your input files or output files have changed.
The idea is to factor your actual source generation into a function Set[File] => Set[File], and call it via FileFunction.cached:
lazy val settingsForMessageGeneration =
((test in Test) <<= (test in Test) dependsOn(messageGenerationCommand)) ++
((compile in Compile) <<= (compile in Compile) dependsOn(messageGenerationCommand)) ++
(messageGenerationCommand <<= messageGenerationTask) ++
(sourceGenerators in Compile += messageGenerationTask.taskValue)
lazy val messageGenerationCommand = TaskKey[scala.collection.Seq[File]]("gen-messages")
lazy val messageGenerationTask = (
sourceManaged,
fullClasspath in Compile in messageGenerator,
runner in Compile in messageGenerator,
streams
) map { (dir, cp, r, s) =>
lazy val f = getFileTree(new File("./subProjectWithMsgSources/src/")).filter(_.getName.endsWith("messages.json"))
def gen(sources: Set[File]): Set[File] = {
sources.foreach({ te =>
val messagePackagePath = te.getAbsolutePath().replace("messages.json", "msg").replace("./", "")
val messagePath = te.getAbsolutePath().replace("./", "")
val fi = new File(messagePackagePath)
if (!fi.exists()) {
fi.mkdirs()
}
val ar = List("-d", messagePackagePath, messagePath)
toError(r.run("com.my.MessageClassGenerator", cp.files, ar, s.log))
})
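// File.getName never contains "/", so match on the parent directory name and the extension instead
getFileTree(new File("./subProjectWithMsgSources/src/"))
.filter(f => f.getParentFile.getName == "msg" && f.getName.endsWith(".java"))
.to[scala.collection.immutable.Set]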
}
val func = FileFunction.cached(s.cacheDirectory / "gen-messages", FilesInfo.hash) { gen }
func(f.toSet).toSeq
}
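To illustrate the caching idea on its own, here is a minimal plain-Scala sketch that reruns a generator only when the content hashes of its inputs change or a previously produced output is missing. It is not sbt's implementation: CachedGenerator is a hypothetical class, it keeps its state in memory, whereas FileFunction.cached persists its state in the given cache directory.

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

// Hypothetical in-memory stand-in for the caching idea (not sbt's FileFunction.cached).
final class CachedGenerator(gen: Set[Path] => Set[Path]) {
  private var lastHashes: Option[Map[Path, String]] = None
  private var lastOutputs: Set[Path] = Set.empty

  // Hex-encoded SHA-1 of the file content
  private def sha1(p: Path): String =
    MessageDigest.getInstance("SHA-1")
      .digest(Files.readAllBytes(p))
      .map(b => "%02x".format(b))
      .mkString

  def apply(inputs: Set[Path]): Set[Path] = {
    val hashes = inputs.map(p => p -> sha1(p)).toMap
    val outputsIntact = lastOutputs.forall(p => Files.exists(p))
    // Regenerate on the first call, when any input changed, or when an output was deleted
    if (!lastHashes.contains(hashes) || !outputsIntact) {
      lastOutputs = gen(inputs)
      lastHashes = Some(hashes)
    }
    lastOutputs
  }
}
```

Hashing the content rather than comparing timestamps means that touching a messages.json file without changing it does not trigger regeneration, which corresponds to the FilesInfo.hash choice in the answer.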
I've run into an issue with attempting to parse json in my spark job. I'm using spark 1.1.0, json4s, and the Cassandra Spark Connector, with DSE 4.6. The exception thrown is:
org.json4s.package$MappingException: Can't find constructor for BrowserData
org.json4s.reflect.ScalaSigReader$.readConstructor(ScalaSigReader.scala:27)
org.json4s.reflect.Reflector$ClassDescriptorBuilder.ctorParamType(Reflector.scala:108)
org.json4s.reflect.Reflector$ClassDescriptorBuilder$$anonfun$6.apply(Reflector.scala:98)
org.json4s.reflect.Reflector$ClassDescriptorBuilder$$anonfun$6.apply(Reflector.scala:95)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
My code looks like this:
case class BrowserData(navigatorObjectData: Option[NavigatorObjectData],
flash_version: Option[FlashVersion],
viewport: Option[Viewport],
performanceData: Option[PerformanceData])
.... other case classes
def parseJson(b: Option[String]): Option[String] = {
implicit val formats = DefaultFormats
for {
browserDataStr <- b
browserData = parse(browserDataStr).extract[BrowserData]
navObject <- browserData.navigatorObjectData
userAgent <- navObject.userAgent
} yield (userAgent)
}
def getJavascriptUa(rows: Iterable[com.datastax.spark.connector.CassandraRow]): Option[String] = {
implicit val formats = DefaultFormats
rows.collectFirst { case r if r.getStringOption("browser_data").isDefined =>
parseJson(r.getStringOption("browser_data"))
}.flatten
}
def getRequestUa(rows: Iterable[com.datastax.spark.connector.CassandraRow]): Option[String] = {
rows.collectFirst { case r if r.getStringOption("ua").isDefined =>
r.getStringOption("ua")
}.flatten
}
def checkUa(rows: Iterable[com.datastax.spark.connector.CassandraRow], sessionId: String): Option[Boolean] = {
for {
jsUa <- getJavascriptUa(rows)
reqUa <- getRequestUa(rows)
} yield (jsUa == reqUa)
}
def run(name: String) = {
val rdd = sc.cassandraTable("beehive", name).groupBy(r => r.getString("session_id"))
val counts = rdd.map(r => (checkUa(r._2, r._1)))
counts
}
I use :load to load the file into the REPL, and then call the run function. As far as I can tell, the failure happens in the parseJson function. I've tried a variety of things to get this to work. Following similar posts, I've made sure my case classes are at the top level of the file. I've tried compiling just the case class definitions into a jar and including the jar like this: /usr/bin/dse spark --jars case_classes.jar
I've tried adding them to the conf like this: sc.getConf.setJars(Seq("/home/ubuntu/case_classes.jar"))
And still the same error. Should I compile all of my code into a jar? Is this a spark issue or a JSON4s issue? Any help at all appreciated.
I am writing the following (with Scala 2.10 and Java 6):
import java.io._
def delete(file: File) {
if (file.isDirectory)
Option(file.listFiles).map(_.toList).getOrElse(Nil).foreach(delete(_))
file.delete
}
How would you improve it? The code seems to work, but it ignores the return value of java.io.File.delete. Can it be done more easily with scala.io instead of java.io?
Using plain Scala and Java:
import scala.reflect.io.Directory
import java.io.File
val directory = new Directory(new File("/sampleDirectory"))
directory.deleteRecursively()
deleteRecursively() returns false on failure.
Try this code that throws an exception if it fails:
def deleteRecursively(file: File): Unit = {
if (file.isDirectory) {
// listFiles returns null on an I/O error, so guard against it
Option(file.listFiles).foreach(_.foreach(deleteRecursively))
}
if (file.exists && !file.delete) {
throw new Exception(s"Unable to delete ${file.getAbsolutePath}")
}
}
You could also fold or map over the delete if you want to return a value for all the deletes.
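As a sketch of that idea, the variant below collects the files that could not be deleted instead of throwing on the first failure. DeleteReport and deleteCollectingFailures are hypothetical names chosen for this illustration.

```scala
import java.io.File

object DeleteReport {
  // Deletes depth-first and returns every file or directory that could not be deleted.
  def deleteCollectingFailures(file: File): List[File] = {
    // listFiles returns null for non-directories and on I/O errors
    val children = Option(file.listFiles).map(_.toList).getOrElse(Nil)
    val childFailures = children.flatMap(deleteCollectingFailures)
    if (file.exists && !file.delete()) childFailures :+ file
    else childFailures
  }
}
```

An empty result means everything was removed; otherwise the caller can log or retry the returned paths.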
Using scala IO
import java.io.IOException
import scalax.file.Path
val path = Path.fromString("/tmp/testfile")
try {
path.deleteRecursively(continueOnFailure = false)
} catch {
case e: IOException => // some file could not be deleted
}
Or better, you could use a Try (scala.util.Try):
val path: Path = Path ("/tmp/file")
Try(path.deleteRecursively(continueOnFailure = false))
which will either result in a Success[Int] containing the number of files deleted, or a Failure[IOException].
From http://alvinalexander.com/blog/post/java/java-io-faq-how-delete-directory-tree
Using Apache Commons IO:
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.filefilter.WildcardFileFilter;
public void deleteDirectory(String directoryName)
throws IOException
{
try
{
FileUtils.deleteDirectory(new File(directoryName));
}
catch (IOException ioe)
{
// log the exception here
ioe.printStackTrace();
throw ioe;
}
}
The Scala one can just do this...
import java.io.File
import org.apache.commons.io.FileUtils
import org.apache.commons.io.filefilter.WildcardFileFilter
FileUtils.deleteDirectory(new File(outputFile))
Using the Java NIO.2 API:
import java.io.IOException
import java.nio.file.{Files, Paths, Path, SimpleFileVisitor, FileVisitResult}
import java.nio.file.attribute.BasicFileAttributes
def remove(root: Path): Unit = {
Files.walkFileTree(root, new SimpleFileVisitor[Path] {
override def visitFile(file: Path, attrs: BasicFileAttributes): FileVisitResult = {
Files.delete(file)
FileVisitResult.CONTINUE
}
override def postVisitDirectory(dir: Path, exc: IOException): FileVisitResult = {
Files.delete(dir)
FileVisitResult.CONTINUE
}
})
}
remove(Paths.get("/tmp/testdir"))
Really, it's a pity that the NIO.2 API has been with us for so many years and yet few people use it, even though it is far superior to the old File API.
Expanding on Vladimir Matveev's NIO2 solution:
object Util {
import java.io.IOException
import java.nio.file.{Files, Paths, Path, SimpleFileVisitor, FileVisitResult}
import java.nio.file.attribute.BasicFileAttributes
def remove(root: Path, deleteRoot: Boolean = true): Unit =
Files.walkFileTree(root, new SimpleFileVisitor[Path] {
override def visitFile(file: Path, attributes: BasicFileAttributes): FileVisitResult = {
Files.delete(file)
FileVisitResult.CONTINUE
}
override def postVisitDirectory(dir: Path, exception: IOException): FileVisitResult = {
// Always delete subdirectories; delete the root itself only when deleteRoot is set
if (deleteRoot || dir != root) Files.delete(dir)
FileVisitResult.CONTINUE
}
})
def removeUnder(string: String): Unit = remove(Paths.get(string), deleteRoot=false)
def removeAll(string: String): Unit = remove(Paths.get(string))
def removeUnder(file: java.io.File): Unit = remove(file.toPath, deleteRoot=false)
def removeAll(file: java.io.File): Unit = remove(file.toPath)
}
Using Java 6 without any dependencies, this is pretty much the only way to do it.
The problem with your function is that it returns Unit (which, by the way, I would declare explicitly with def delete(file: File): Unit = {).
I took your code and modified it to return a map from file name to deletion status.
def delete(file: File): Array[(String, Boolean)] = {
Option(file.listFiles).map(_.flatMap(f => delete(f))).getOrElse(Array()) :+ (file.getPath -> file.delete)
}
To add to Slavik Muz's answer:
def deleteFile(file: File): Boolean = {
def childrenOf(file: File): List[File] = Option(file.listFiles()).getOrElse(Array.empty).toList
@annotation.tailrec
def loop(files: List[File]): Boolean = files match {
case Nil ⇒ true
case child :: parents if child.isDirectory && child.listFiles().nonEmpty ⇒
loop((childrenOf(child) :+ child) ++ parents)
case fileOrEmptyDir :: rest ⇒
println(s"deleting $fileOrEmptyDir")
fileOrEmptyDir.delete()
loop(rest)
}
if (!file.exists()) false
else loop(childrenOf(file) :+ file)
}
This one uses java.io together with FileUtils from Commons IO. It deletes every directory whose name matches a regex, whether or not the directory has any content.
for (file <- new File("<path as String>").listFiles;
if file.getName.matches("[0-9]+")) FileUtils.deleteDirectory(file)
Directory structure, e.g.:
* A/1/, A/2/, A/300/ ... hence the regex [0-9]+, which matches purely numeric names. I couldn't find a file API in Scala that supports regexes (maybe I missed something).
It's getting a little lengthy, but here's one that combines the recursion of Garrette's solution with the NPE safety of the original question.
def deleteFile(path: String) = {
val penultimateFile = new File(path.split('/').take(2).mkString("/"))
def getFiles(f: File): Set[File] = {
Option(f.listFiles)
.map(a => a.toSet)
.getOrElse(Set.empty)
}
def getRecursively(f: File): Set[File] = {
val files = getFiles(f)
val subDirectories = files.filter(path => path.isDirectory)
subDirectories.flatMap(getRecursively) ++ files + penultimateFile
}
getRecursively(penultimateFile).foreach(file => {
if (getFiles(file).isEmpty && file.getAbsoluteFile().exists) file.delete
})
}
This is a recursive method that cleans everything in a directory and returns the count of deleted files:
def cleanDir(dir: File): Int = {
@annotation.tailrec
def loop(list: Array[File], deletedFiles: Int): Int = {
if (list.isEmpty) deletedFiles
else {
if (list.head.isDirectory && !list.head.listFiles().isEmpty) {
loop(list.head.listFiles() ++ list.tail ++ Array(list.head), deletedFiles)
} else {
val isDeleted = list.head.delete()
if (isDeleted) loop(list.tail, deletedFiles + 1)
else loop(list.tail, deletedFiles)
}
}
}
loop(dir.listFiles(), 0)
}
What I ended up with
def deleteRecursively(f: File): Boolean = {
if (f.isDirectory) f.listFiles match {
case files: Array[File] => files.foreach(deleteRecursively)
case null =>
}
f.delete()
}
os-lib makes it easy to delete recursively with a one-liner:
os.remove.all(os.pwd/"dogs")
os-lib uses java.nio under the hood; it just doesn't expose all the Java ugliness. See the os-lib documentation for more on how to use the library.
You can also do this by executing an external system command (Unix-like systems only):
import sys.process._
def delete(path: String) = {
s"""rm -rf ${path}""".!!
}
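A string command passed to scala.sys.process is split into arguments on whitespace, so a path containing spaces would reach rm as several separate arguments. A hedged variant, assuming a Unix-like system with rm on the PATH, passes the arguments as a Seq so the path stays one argument. ShellDelete is a hypothetical name for this sketch.

```scala
import scala.sys.process._

object ShellDelete {
  // Runs `rm -rf -- <path>` without going through a shell.
  // Returns true when rm exits with status 0.
  def delete(path: String): Boolean =
    Seq("rm", "-rf", "--", path).! == 0
}
```

Using ! instead of !! returns the exit code rather than throwing on a non-zero status, so the caller can decide how to handle a failed delete.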
I'm trying to link classes from the JDK into the scaladoc-generated doc.
I've used the -doc-external-doc option of scaladoc 2.10.1 but without success.
I'm using -doc-external-doc:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar#http://docs.oracle.com/javase/7/docs/api/, but I get links such as index.html#java.io.File instead of index.html?java/io/File.html.
It seems this option only works for scaladoc-generated docs.
Did I miss an option in scaladoc, or should I file a feature request?
I've configured sbt as follows:
scalacOptions in (Compile,doc) += "-doc-external-doc:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/rt.jar#http://docs.oracle.com/javase/7/docs/api"
Note: I've seen the Opts.doc.externalAPI util in the upcoming sbt 0.13. I think a nice addition (not sure if it's possible) would be to pass a ModuleID instead of a File. The util would figure out which file corresponds to the ModuleID.
I use sbt 0.13.5.
There's no out-of-the-box way to get Javadoc links inside scaladoc. As far as I understand, that's not sbt's fault but a consequence of how scaladoc works. As Josh pointed out in his comment, you should report it to the scaladoc project.
There is, however, a workaround I came up with: postprocess the generated scaladoc so that the Java URLs are rewritten into proper Javadoc links.
Place the file scaladoc.sbt inside an sbt project; whenever the doc task is executed, the postprocessing in the fixJavaLinksTask task kicks in.
NOTE: There are lots of hardcoded paths, so use it with caution (i.e. polish it however you see fit).
import scala.util.matching.Regex.Match
autoAPIMappings := true
// builds -doc-external-doc
apiMappings += (
file("/Library/Java/JavaVirtualMachines/jdk1.8.0_11.jdk/Contents/Home/jre/lib/rt.jar") ->
url("http://docs.oracle.com/javase/8/docs/api")
)
lazy val fixJavaLinksTask = taskKey[Unit](
"Fix Java links - replace #java.io.File with ?java/io/File.html"
)
fixJavaLinksTask := {
println("Fixing Java links")
val t = (target in (Compile, doc)).value
(t ** "*.html").get.filter(hasJavadocApiLink).foreach { f =>
println("fixing " + f)
val newContent = javadocApiLink.replaceAllIn(IO.read(f), fixJavaLinks)
IO.write(f, newContent)
}
}
val fixJavaLinks: Match => String = m =>
m.group(1) + "?" + m.group(2).replace(".", "/") + ".html"
val javadocApiLink = """\"(http://docs\.oracle\.com/javase/8/docs/api/index\.html)#([^"]*)\"""".r
def hasJavadocApiLink(f: File): Boolean = (javadocApiLink findFirstIn IO.read(f)).nonEmpty
fixJavaLinksTask <<= fixJavaLinksTask triggeredBy (doc in Compile)
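The link rewrite itself can be tried outside sbt. The sketch below is a variant of the idea, not the exact code above: it puts the surrounding quotes back around the rewritten URL and escapes the replacement string so that a $ in a nested class name is not read as a group reference. JavadocLinks and fixLinks are hypothetical names.

```scala
import scala.util.matching.Regex

object JavadocLinks {
  val apiIndex = "http://docs.oracle.com/javase/8/docs/api/index.html"

  // Matches a quoted link of the form "<apiIndex>#java.io.File"
  val link: Regex = ("\"(" + Regex.quote(apiIndex) + ")#([^\"]*)\"").r

  // Rewrites "...index.html#java.io.File" to "...index.html?java/io/File.html"
  def fixLinks(html: String): String =
    link.replaceAllIn(html, m =>
      Regex.quoteReplacement(
        "\"" + m.group(1) + "?" + m.group(2).replace(".", "/") + ".html\""))
}

// Usage on a sample anchor tag
println(JavadocLinks.fixLinks(
  "<a href=\"http://docs.oracle.com/javase/8/docs/api/index.html#java.io.File\">File</a>"))
```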
I took the answer by @jacek-laskowski and modified it so that it avoids hard-coded strings and can be used for any number of Java libraries, not just the standard one.
Edit: the location of rt.jar is now determined from the runtime using sun.boot.class.path and does not have to be hard coded.
The only thing you need to modify is the map, which I have called externalJavadocMap in the following:
import scala.util.matching.Regex
import scala.util.matching.Regex.Match
val externalJavadocMap = Map(
"owlapi" -> "http://owlcs.github.io/owlapi/apidocs_4_0_2/index.html"
)
/*
* The rt.jar file is located in the path stored in the sun.boot.class.path system property.
* See the Oracle documentation at http://docs.oracle.com/javase/6/docs/technotes/tools/findingclasses.html.
*/
val rtJar: String = System.getProperty("sun.boot.class.path").split(java.io.File.pathSeparator).collectFirst {
case str: String if str.endsWith(java.io.File.separator + "rt.jar") => str
}.get // fail hard if not found
val javaApiUrl: String = "http://docs.oracle.com/javase/8/docs/api/index.html"
val allExternalJavadocLinks: Seq[String] = javaApiUrl +: externalJavadocMap.values.toSeq
def javadocLinkRegex(javadocURL: String): Regex = ("""\"(\Q""" + javadocURL + """\E)#([^"]*)\"""").r
def hasJavadocLink(f: File): Boolean = allExternalJavadocLinks exists {
javadocURL: String =>
(javadocLinkRegex(javadocURL) findFirstIn IO.read(f)).nonEmpty
}
val fixJavaLinks: Match => String = m =>
m.group(1) + "?" + m.group(2).replace(".", "/") + ".html"
/* You can print the classpath with `show compile:fullClasspath` in the SBT REPL.
* From that list you can find the name of the jar for the managed dependency.
*/
lazy val documentationSettings = Seq(
apiMappings ++= {
// Lookup the path to jar from the classpath
val classpath = (fullClasspath in Compile).value
def findJar(nameBeginsWith: String): File = {
classpath.find { attributed: Attributed[File] => (attributed.data ** s"$nameBeginsWith*.jar").get.nonEmpty }.get.data // fail hard if not found
}
// Define external documentation paths
(externalJavadocMap map {
case (name, javadocURL) => findJar(name) -> url(javadocURL)
}) + (file(rtJar) -> url(javaApiUrl))
},
// Override the task to fix the links to JavaDoc
doc in Compile <<= (doc in Compile) map {
target: File =>
(target ** "*.html").get.filter(hasJavadocLink).foreach { f =>
//println(s"Fixing $f.")
val newContent: String = allExternalJavadocLinks.foldLeft(IO.read(f)) {
case (oldContent: String, javadocURL: String) =>
javadocLinkRegex(javadocURL).replaceAllIn(oldContent, fixJavaLinks)
}
IO.write(f, newContent)
}
target
}
)
I am using SBT 0.13.8.