Create makefile-like wildcard targets in Gradle

Use case: I have a bunch of images that have to be processed by a script before I build my app. In makefile I can simply define:
processed/%.png: original/%.png
	script/process.sh $< $@
How do I implement this in Gradle? Specifically, I want it to work like in Makefile, that is only the modified original images will be processed again.

You can implement this behaviour as an incremental task, using IncrementalTaskInputs as the input parameter. The API docs contain an example of how to use it, and there is another example in the Gradle documentation. Both of them do almost exactly what you need.
An incremental task action is one that accepts a single
IncrementalTaskInputs parameter. The task can then provide an action
to execute for all input files that are out of date with respect to
the previous execution of the task, and a separate action for all
input files that have been removed since the previous execution.
In the case where Gradle is unable to determine which input files need
to be reprocessed, then all of the input files will be reported as
IncrementalTaskInputs.outOfDate(org.gradle.api.Action).
Inside your task, call the script with project.exec. Your Gradle script could then look like this:
task processRawFiles(type: ProcessRawFiles)

class ProcessRawFiles extends DefaultTask {
    @InputDirectory
    File inputDir = project.file('src/raw')

    @OutputDirectory
    File outputDir = project.file('build/processed')

    @TaskAction
    void execute(IncrementalTaskInputs inputs) {
        if (!inputs.incremental)
            project.delete(outputDir.listFiles())

        inputs.outOfDate { InputFileDetails change ->
            File saveTo = new File(outputDir, change.file.name)
            project.exec {
                commandLine 'script/process.sh', change.file.absolutePath, saveTo.absolutePath
            }
        }

        inputs.removed { InputFileDetails change ->
            File toDelete = new File(outputDir, change.file.name)
            if (toDelete.exists())
                toDelete.delete()
        }
    }
}
This task looks for the images in src/raw. It will remove stale files from the build directory and call your script on any files that are out of date or newly added.
Your specific case might be more complicated if you have the images scattered across multiple directories. In that case you will have to use @InputFiles instead of @InputDirectory, but the incremental task should still work. A sketch of that variant follows.
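A minimal sketch of the @InputFiles variant, assuming the originals are split across two hypothetical folders (src/raw and vendor/raw — adjust to your layout):

class ProcessScatteredFiles extends DefaultTask {
    // hypothetical source folders; replace with wherever the originals live
    @InputFiles
    FileCollection originals = project.fileTree('src/raw') + project.fileTree('vendor/raw')

    @OutputDirectory
    File outputDir = project.file('build/processed')

    @TaskAction
    void execute(IncrementalTaskInputs inputs) {
        inputs.outOfDate { change ->
            File saveTo = new File(outputDir, change.file.name)
            project.exec {
                commandLine 'script/process.sh', change.file.absolutePath, saveTo.absolutePath
            }
        }
        inputs.removed { change ->
            new File(outputDir, change.file.name).delete()
        }
    }
}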

Related

Gradle - Write Task Output Into A File

I am working with Gradle 7.1, and I am trying to write some of the task results into a file.
Specifically, I would like to write the output of dependencies task into a file after each jar task execution.
Looking for some solutions, I understand that at first I need to have jar.finalizedBy(dependencies) in order for it to run.
However, I can't find how to redirect the dependencies's specific output into a file. All the solutions that I have found discuss Exec tasks, which dependencies isn't.
I am looking for something like dependencies.doFirst(///REDIRECT HERE).
You can make the dependencies task write to a file by attaching a StandardOutputListener:

tasks.named('dependencies').configure {
    it.logging.addStandardOutputListener(new StandardOutputListener() {
        @Override
        void onOutput(CharSequence charSequence) {
            project.file("$buildDir/dependencies_task_output.txt") << charSequence
        }
    })
}
This can also be done with any other Gradle task.
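To tie this to the jar task as the question describes, the finalizedBy wiring might look like this (a sketch):

tasks.named('jar').configure {
    finalizedBy 'dependencies'
}

With both pieces in place, every jar build triggers the dependencies task, and the listener appends its output to the file.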

Is there any way to automatically set a Windows path in a string in Groovy?

My project root directory is:
D:/Project/Node_Project
I am using a Gradle plugin to install Node.js temporarily in my project root directory so that some Node.js commands can run while the project builds. The plugin is configured as below:
plugins {
    id "com.github.node-gradle.node" version "2.2.4"
}

node {
    download = true
    version = "10.10.0"
    distBaseUrl = 'https://nodejs.org/dist'
    workDir = file("${project.buildDir}/nodejs")
}
So, nodejs is getting installed inside the project in the location:
D:/Project/Node_Project/build/nodejs/node-v10.10.0-win-x64
Now, I am using Groovy's String.execute(envp, dir) method — the first argument is the list of environment variables to set, the second is the working directory — to run a Windows command that depends on Node. Code below:
cmd = "node connect.js"
def process = cmd.execute(["PATH=${project.projectDir}/build/nodejs/node-v10.10.0-win-x64"],null)
In the above .execute method, is there a way to auto-populate the "build/nodejs/node-v10.10.0-win-x64" part of the string instead of hardcoding it into the method?
Something like:
def process = cmd.execute(["PATH=${project.projectDir}/.*"],null)
Syntax of .execute method:
https://docs.groovy-lang.org/latest/html/groovy-jdk/java/lang/String.html#execute(java.lang.String[],%20java.io.File)
All of this code is inside the build.gradle file. Please help!
I asked why you don't just write a task of type NodeTask, but I understand that you want to run it in the background, which you can't do with that.
You could list the content of a directory and use that as part of the command. But you could also just grab it from the extension provided by the plugin.
This is not documented and it might break in future releases of the plugin, but you can do something like this (Groovy DSL):
task connectJS {
    dependsOn nodeSetup
    doFirst {
        def connectProcess = "$node.variant.nodeExec $projectDir/src/js/connect.js".execute()
        // Blocking readers (if async, pipe to a log file instead)
        connectProcess.in.eachLine { logger.info(it) }
        connectProcess.err.eachLine { logger.error(it) }
    }
}
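The directory-listing approach mentioned earlier avoids relying on plugin internals entirely; a minimal sketch, assuming exactly one node-v* folder exists under build/nodejs:

def nodeDir = new File(project.buildDir, 'nodejs').listFiles()
        .find { it.isDirectory() && it.name.startsWith('node-v') }
def process = "node connect.js".execute(["PATH=$nodeDir.absolutePath"], null)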

Hadoop Hive UDF with external library

I'm trying to write a UDF for Hadoop Hive, that parses User Agents. Following code works fine on my local machine, but on Hadoop I'm getting:
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method public java.lang.String MyUDF.evaluate(java.lang.String) throws org.apache.hadoop.hive.ql.metadata.HiveException on object MyUDF@64ca8bfb of class MyUDF with arguments {All Occupations:java.lang.String} of size 1
Code:
import java.io.IOException;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.*;
import com.decibel.uasparser.OnlineUpdater;
import com.decibel.uasparser.UASparser;
import com.decibel.uasparser.UserAgentInfo;

public class MyUDF extends UDF {
    public String evaluate(String i) {
        UASparser parser = new UASparser();
        String key = "";
        OnlineUpdater update = new OnlineUpdater(parser, key);
        UserAgentInfo info = parser.parse(i);
        return info.getDeviceType();
    }
}
A few facts I should mention:
I'm compiling with Eclipse using "Export runnable JAR file" with the "extract required libraries into generated JAR" option
I'm uploading this "fat jar" file with Hue
Minimum working example I managed to run:
public String evaluate(String i) {
    return "hello" + i.toString();
}
I guess the problem lies somewhere around that library (downloaded from https://udger.com) I'm using, but I have no idea where.
Any suggestions?
Thanks, Michal
It could be a few things. Best thing is to check the logs, but here's a list of a few quick things you can check in a minute.
The jar does not contain all dependencies. I am not sure how Eclipse builds a runnable jar, but it may not include all dependencies. You can run
jar tf your-udf-jar.jar
to see what was included. You should see stuff from com.decibel.uasparser. If not, you have to build the jar with the appropriate dependencies (usually you do that using Maven; a Gradle alternative is sketched at the end of this answer).
A different version of the JVM. If you compile with JDK 8 and the cluster runs JDK 7, it would also fail.
Hive version. Sometimes the Hive APIs change slightly, enough to be incompatible. Probably not the case here, but make sure to compile the UDF against the same version of hadoop and hive that you have in the cluster
You should always check if info is null after the call to parse()
It looks like the library uses a key, meaning it actually gets data from an online service (udger.com), so it may not work without a valid key. Even more important, the library updates itself online, contacting the online service for each record. This means, looking at the code, that it will create one update thread per record. You should change the code to do the update only once, in the constructor, like the following:
public class MyUDF extends UDF {
    UASparser parser = new UASparser();

    public MyUDF() {
        super();
        String key = "PUT YOUR KEY HERE";
        // update only once, when the UDF is instantiated
        OnlineUpdater update = new OnlineUpdater(parser, key);
    }

    public String evaluate(String i) {
        UserAgentInfo info = parser.parse(i);
        if (info != null) return info.getDeviceType();
        // you want it to return null if it's unparseable,
        // otherwise one bad record will stop your processing
        // with an exception
        else return null;
    }
}
But to know for sure, you have to look at the logs: the YARN logs, but also the Hive logs on the machine you're submitting the job from (probably in /var/log/hive, but it depends on your installation).
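As an aside, since this page is Gradle-centric: a fat jar can also be produced with Gradle instead of Maven. A sketch, assuming the standard java plugin layout (the task name and classifier are arbitrary):

task fatJar(type: Jar) {
    archiveClassifier = 'all'
    from sourceSets.main.output
    // unpack every runtime dependency into the jar
    from {
        configurations.runtimeClasspath.collect { it.isDirectory() ? it : zipTree(it) }
    }
    duplicatesStrategy = DuplicatesStrategy.EXCLUDE
}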
Such a problem can probably be solved with these steps:
1. Override the method UDF.getRequiredJars(), making it return an HDFS file path list whose values are determined by where you put the xxx_lib folder in HDFS (a sketch follows these steps). Note that the list must contain each jar's full HDFS path string, such as hdfs://yourcluster/some_path/xxx_lib/some.jar.
2. Export your UDF code using the "Runnable JAR file" exporting wizard (choose "copy required libraries into a sub folder next to the generated jar"). This step results in an xxx.jar and a lib folder xxx_lib next to it.
3. Put xxx.jar and the xxx_lib folder into HDFS, matching the paths from step 1.
4. Create the UDF using: add jar ${the-xxx.jar-hdfs-path}; create function your-function as ${qualified name of udf class};
Try it; I tested this and it works.
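A sketch of the override from step 1 (the HDFS paths are placeholders; written in Groovy to match the rest of this page, but plain Java works the same way):

@Override
String[] getRequiredJars() {
    // must match where the xxx_lib folder was uploaded in step 3
    ['hdfs://yourcluster/some_path/xxx_lib/some.jar',
     'hdfs://yourcluster/some_path/xxx_lib/another.jar'] as String[]
}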

How do I define a filtered FileTree using Gradle's Java API?

I am building a Gradle plugin in Java because of some Java libraries I want to take advantage of. As part of the plugin, I need to list and process folders of files. I can find many examples of how to do this in gradle build files:
FileTree tree = fileTree(dir: stagingDirName)
tree.include '**/*.md'
tree.each { File file ->
    compileThis(file)
}
But how do I do this in Java using Gradle's Java API?
The underlying FileTree Java class has very flexible input parameters, which makes it very powerful, but it's devilishly difficult to figure out what kind of input will actually work.
Here's how I did this in my java-based gradle task:
public class MyPluginTask extends DefaultTask {
    @TaskAction
    public void action() throws Exception {
        // sourceDir can be a String or a File
        File sourceDir = new File(getProject().getProjectDir(), "src/main/html");
        // or:
        // String sourceDir = "src/main/html";
        ConfigurableFileTree cft = getProject().fileTree(sourceDir);
        cft.include("**/*.html");
        // Make sure we have some input. If not, throw an exception.
        if (cft.isEmpty()) {
            // Nothing to process. Input settings are probably bad. Warn user.
            throw new Exception("Error: No processable files found in sourceDir: " +
                    sourceDir.getPath());
        }
        Iterator<File> it = cft.iterator();
        while (it.hasNext()) {
            File f = it.next();
            System.out.println("File: " + f.getPath());
        }
    }
}
It's virtually the same, e.g. project.fileTree(someMap). There's even an overload of the fileTree method that takes just the base dir (instead of a map). Instead of each you can use a for-each loop, and instead of closures you can typically use anonymous inner classes implementing the Action interface (although fileTree seems to be missing these method overloads). The Gradle Build Language Reference has the details. PS: You can also take advantage of Java libraries from Groovy.

How to create an Ant listener for a specific task

We have around 80 jars in our application. All are created using the javac and jar tasks in Ant.
I would like to introduce FindBugs checks. One option was to create a single FindBugs-check Ant project that has all jars and all source paths defined in it. This works, but it requires a lot of space, and analysis of the result is not very straightforward: there are thousands of warnings to start with.
One option I am considering is to run Ant with a special listener on the javac task that extracts the source and class locations, then calls the FindBugs task with the source and class file information. Is there any other way to introduce FindBugs to a large project?
I tweaked taskFinished() as below; it works fine for my usage.
public class JavacListener implements BuildListener {
    public void taskFinished(BuildEvent be) {
        if (be.getTask() instanceof UnknownElement) {
            UnknownElement ue = (UnknownElement) be.getTask();
            ue.maybeConfigure();
            if (ue.getTask() instanceof Javac) {
                Javac task = (Javac) ue.getTask();
                final Path sourcepath = task.getSrcdir();
                FindBugsTask fbtask = new FindBugsTask();
                System.out.println("Trying FindBugs");
                fbtask.setSourcePath(sourcepath);
                fbtask.setAuxClasspath(task.getClasspath());
                Path destPath = new Path(task.getProject());
                destPath.setPath(task.getDestdir().getAbsolutePath());
                fbtask.setAuxAnalyzepath(destPath);
                fbtask.setOutputFile(getFileName(task.getProject()));
                fbtask.setProject(task.getProject());
                fbtask.setHome(new File("C:\\apps\\findbugs-1.3.0"));
                fbtask.execute();
            }
        } else {
            System.out.println(be.getTask().getClass().getName());
            System.out.println(be.getTask().getTaskName());
        }
    }
    // ... remaining BuildListener methods omitted
}
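For completeness (not part of the original answer): a listener like this is registered on the Ant command line, with the compiled listener and the FindBugs jars on Ant's classpath, e.g.:

ant -lib path/to/listener-and-findbugs-jars -listener JavacListener build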
