Update attributes based on content in NiFi - java

How can I add an attribute to the current flow file when developing a custom Apache NiFi processor?
What I want to do is add a new attribute (or at least update an existing attribute) on the current flow file with a calculated value.
Or is there an already built processor that I can use?

NiFi supports several methods of creating and updating attributes, depending on the data source you wish to use. Some general purpose processors include:
UpdateAttribute - Updates attributes on flow files using both static values and NiFi's expression language.
You can add as many properties as you like with one processor. I recommend scanning through the Apache NiFi Expression Language Guide to get a feel for what you can do with it.
ExtractText - Sets attribute values by applying regular expressions to the flow file content.
ExecuteScript - Runs custom script code, which can be used to update attributes however you wish.
And there are more for particular content formats, for example:
EvaluateJsonPath - for JSON
EvaluateXPath - for XML

I had a use case where I needed to load many attributes from a Java-style Properties file. This could be done with ExtractText but it required adding a property and regular expression for every property to be supported. I thought it would be nice to support whatever properties were in the file without having to configure the processor for each one.
The solution I came up with was to use the ExecuteGroovyScript processor with the following script:
def ff = session.get();
if (ff != null) {
    def properties = new Properties();
    def is = ff.read();
    properties.load(is);
    is.close();
    ff.putAllAttributes(properties);
    REL_SUCCESS << ff;
}
This variant reads the properties from a file on disk rather than from the flow file content:
def ff = session.get();
if (ff != null) {
    def properties = new Properties();
    def propertiesFile = new File('/Users/me/mydata/foo.properties')
    propertiesFile.withInputStream {
        properties.load(it)
    }
    ff.putAllAttributes(properties);
    REL_SUCCESS << ff;
}
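For a custom processor written in Java, the same idea can be sketched with only the JDK: load the content as java.util.Properties and turn it into a String-to-String map. The class and method names below are hypothetical, and the NiFi wiring is left out on purpose. In a real processor the stream would presumably come from ProcessSession.read(...) and the map would be handed to ProcessSession.putAllAttributes(flowFile, map), but that part is an assumption and is not exercised here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical helper: not part of NiFi, just the properties-to-map step.
class PropertiesToAttributes {

    /** Reads Java-style properties from a stream and returns them as a sorted String-to-String map. */
    static Map<String, String> toAttributes(InputStream in) {
        Properties properties = new Properties();
        try {
            properties.load(in); // parses key=value lines, skipping comments and blank lines
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> attributes = new TreeMap<>();
        for (String name : properties.stringPropertyNames()) {
            attributes.put(name, properties.getProperty(name));
        }
        return attributes;
    }

    public static void main(String[] args) {
        String content = "# sample content\nenv=test\nbatch.size=42\n";
        InputStream in = new ByteArrayInputStream(content.getBytes(StandardCharsets.ISO_8859_1));
        System.out.println(toAttributes(in));
    }
}
```

Every key found in the content becomes a map entry, so nothing has to be configured per property, which is the point of the script approach above.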

Related

standardized method for writing an arbitrary typesafe Config to a hocon file?

In a Scala research application, I load a HOCON file using PureConfig's ConfigSource.file() method, which represents the default configuration for a research experiment. I use this to build a batch of variations dynamically. After making a few modifications related to a specific experimental variation, I then parse the config into a case class structure using PureConfig's auto parser.
At this point, I would like to save the modified Config to my experiment directory as a HOCON file, so I can easily re-create this experiment in the future.
I have been looking around the Typesafe Config README.md and haven't seen anything on this. Clearly, I could write a function to pretty-print the config tree in HOCON format, but is there a way to do this hidden somewhere in the Typesafe Config API?
Here is a solution I came up with that depends only on the Typesafe Config library:
import com.typesafe.config.{ConfigFactory, ConfigRenderOptions}
val config = ConfigFactory.parseResources("application.conf")
val otherConfig = ConfigFactory.parseResources("other.conf")
val mergedConf = config.withFallback(otherConfig)
val options = ConfigRenderOptions
  .defaults()
  .setJson(false) // false: HOCON, true: JSON
  .setOriginComments(false) // true: add comments showing the origin of each value
  .setComments(true) // true: keep the original comments
  .setFormatted(true) // true: pretty-print the result
val result = mergedConf.root().render(options)
println(result)
With PureConfig this is straightforward:
import pureconfig._
import pureconfig.generic.auto._
val configValue = ConfigWriter[YourCaseClass].to(component)
val configString = configValue.render()
This creates a String of your configuration.
There is one big limitation: with the default render options, it renders JSON.
Here is the corresponding documentation: config-writer

Apache Commons Configuration : read a .properties file and rewrite it with no change

Since I do not know of a better solution, I am currently writing small Java classes to process .properties files: merging them, removing duplicate properties, overriding properties, etc. (I need to process many files and a huge number of properties).
org.apache.commons.configuration.PropertiesConfiguration works great for reading a properties file (using org.apache.commons.configuration.AbstractFileConfiguration.load(InputStream, String)). However, if I rewrite the file using org.apache.commons.configuration.AbstractFileConfiguration.save(File), I have two problems:
The original layout and comments are lost. I am going to try the PropertiesConfigurationLayout, which is supposed to help here (see How to overwrite one property in .properties without overwriting the whole file?) and post the results.
The properties are slightly modified. The accented characters é and è are rewritten as Unicode escapes (\u00E9), which I do not want. AFAIK .properties files are generally ISO-8859-1 (and I think mine are), so escaping shouldn't be necessary.
Specifying the encoding when calling org.apache.commons.configuration.AbstractFileConfiguration.load(InputStream, String) does not make a difference, because when it is not specified, the same encoding is used by default anyway (private static final String DEFAULT_ENCODING = "ISO-8859-1";). What can I do about that?
After some tests, I think you can do what you want using CombinedConfiguration plus an OverrideCombiner. The properties are merged automatically, and the trick for the layout is to take it from one of the loaded files:
CombinedConfiguration props = new CombinedConfiguration();
final PropertiesConfiguration defaultsProps = new PropertiesConfiguration(new File("/tmp/default.properties"));
final PropertiesConfiguration customProps = new PropertiesConfiguration(new File("/tmp/custom.properties"));
props.setNodeCombiner(new OverrideCombiner());
props.addConfiguration(customProps); // add the override values first
props.addConfiguration(defaultsProps); // then add the default values
PropertiesConfiguration finalFile = new PropertiesConfiguration();
finalFile.append(props);
PropertiesConfigurationLayout layout = new PropertiesConfigurationLayout(finalFile, defaultsProps.getLayout()); //here we copy the layout from the 'base file'
layout.save(new FileWriter(new File("/tmp/app.properties")));
As for the encoding issue, I don't know whether a solution exists.
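For comparison, here is a minimal sketch of the same merge done with the JDK's own java.util.Properties instead of Commons Configuration (a swapped-in technique, not the library discussed above). It relies on a documented difference in the JDK: Properties.store(OutputStream, ...) writes characters outside printable ASCII as \uxxxx escapes, while Properties.store(Writer, ...) writes them as they are. The class and method names are hypothetical, and this approach loses comments and ordering, so it only addresses the escaping problem.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper built on java.util.Properties only.
class MergeProperties {

    /** Parses .properties text; a Reader is used so characters arrive already decoded. */
    static Properties parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    /** Returns the defaults with the custom values overriding any duplicate keys. */
    static Properties merge(Properties defaults, Properties custom) {
        Properties merged = new Properties();
        merged.putAll(defaults);
        merged.putAll(custom); // later putAll wins, so custom overrides defaults
        return merged;
    }

    /** Renders through a Writer, which leaves non-ASCII characters unescaped. */
    static String render(Properties p) {
        StringWriter out = new StringWriter();
        try {
            p.store(out, null); // a date comment line is still written first
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Properties defaults = parse("greeting=hello\ncity=Orl\u00e9ans\n");
        Properties custom = parse("greeting=bonjour\n");
        System.out.print(render(merge(defaults, custom)));
    }
}
```

To write an actual ISO-8859-1 file this way, the StringWriter could be replaced by an OutputStreamWriter created with StandardCharsets.ISO_8859_1. Whether PropertiesConfiguration offers an equivalent switch is not something this sketch answers.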

Commons configuration library to add elements

I am using the Apache Commons Configuration library to read an XML configuration file and it works nicely. However, I am not able to modify the value of the elements or add new ones.
To read the xml I use the following code:
XMLConfiguration config = new XMLConfiguration(dnsXmlPath);
boolean enabled = config.getBoolean("enabled", true);
int size = config.getInt("size");
To write I am trying to use:
config.setProperty("newProperty", "valueNewProperty");
config.save();
If I call config.getString("newProperty"), I obtain "valueNewProperty", but the xml has not been changed.
Obviously it is not the right way or I am missing something, because it does not work.
Could anybody tell me how to do this?
Thanks in advance.
You're modifying the XML structure in memory.
The parsed document will be stored keeping its structure. The class also tries to preserve as much information from the loaded XML document as possible, including comments and processing instructions. These will be contained in documents created by the save() methods, too.
Like other file based configuration classes this class maintains the name and path to the loaded configuration file. These properties can be altered using several setter methods, but they are not modified by save() and load() methods. If XML documents contain relative paths to other documents (e.g. to a DTD), these references are resolved based on the path set for this configuration.
You need to use the XMLConfiguration.save(java.io.Writer) method.
For example, after you've made all your modifications, save them:
config.save(new PrintWriter(new File(dnsXmlPath)));
EDIT
As mentioned in the comments, calling config.load() before calling the setProperty() method fixes the issue.
I solved it with the following lines. I was missing the config.load().
XMLConfiguration config = new XMLConfiguration(dnsXmlPath);
config.load();
config.setProperty("newProperty", "valueNewProperty");
config.save();
It is true, though, that you can use the following line instead of config.save() and it works the same.
config.save(new PrintWriter(new File(dnsXmlPath)));
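The load, modify, save sequence described in these answers can also be sketched with the JDK's DOM and Transformer APIs. This is a different technique from Commons Configuration, shown only to make the three steps explicit. The class name and the sample XML are hypothetical, and the sketch works on strings rather than on the file at dnsXmlPath.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper using only JDK XML APIs.
class XmlAddElement {

    /** Parses the XML, appends <name>value</name> under the root, and returns the serialized result. */
    static String addElement(String xml, String name, String value) {
        try {
            // Load: bring the existing document into memory first.
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Modify: this changes only the in-memory tree.
            Element element = doc.createElement(name);
            element.setTextContent(value);
            doc.getDocumentElement().appendChild(element);

            // Save: the change reaches the output only when the tree is written out.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not update XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<configuration><enabled>true</enabled><size>10</size></configuration>";
        System.out.println(addElement(xml, "newProperty", "valueNewProperty"));
    }
}
```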

Velocity, different template paths

Does anyone know if it is possible to get templates from different paths with velocity? After initialization Velocity refuses to change the "file.resource.loader.path".
This is my code:
public Generator() {
    Properties p = new Properties();
    p.setProperty("resource.loader", "file");
    p.setProperty("file.resource.loader.class", "org.apache.velocity.runtime.resource.loader.FileResourceLoader");
    p.setProperty("file.resource.loader.path", "");
    Velocity.init(p);
}
The templates can be located in different locations (the user can select one with a file dialog), so I have this code for fetching the template from Velocity:
private Template fetch(String templatePath) {
    out_println("Initializing Velocity core...");
    int end = templatePath.lastIndexOf(File.separator);
    Properties p = new Properties();
    p.setProperty("file.resource.loader.path", templatePath.substring(0, end));
    Velocity.init(p);
    return Velocity.getTemplate(templatePath.substring(end+1));
}
This is not working. It seems that once Velocity is initialized it can't be reset with different properties. Any suggestions on how to solve this problem?
Possible program flow:
1. User selects the group that needs to be filled into the template
2. User selects a template to use (can be located anywhere on the HDD)
3. User presses generate
Velocity can be used in two ways: the singleton model or the separate instance model. You are currently using the singleton model in which only one instance of the Velocity engine in the JVM is allowed.
Instead, you should use the separate instance model which allows you to create multiple instances of Velocity in the same JVM in order to support different template directories.
VelocityEngine ve = new VelocityEngine();
ve.setProperty(RuntimeConstants.FILE_RESOURCE_LOADER_PATH, "path/to/templates");
ve.init();
Template t = ve.getTemplate("foo.vm");
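With one engine per selected template, the user-chosen path still has to be split into a directory (for the loader path) and a file name (for getTemplate). A minimal sketch of that step using java.nio.file follows; the class name TemplateLocation and its fields are hypothetical, and no Velocity call is made. Compared with lastIndexOf(File.separator) in the question, this also handles a bare file name, where lastIndexOf returns -1 and substring(0, -1) would throw.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical value class: splits a template path into directory and file name.
class TemplateLocation {
    final String directory; // candidate value for the file resource loader path
    final String fileName;  // candidate argument for getTemplate(...)

    private TemplateLocation(String directory, String fileName) {
        this.directory = directory;
        this.fileName = fileName;
    }

    static TemplateLocation of(String templatePath) {
        // Resolve against the working directory so a bare file name still has a parent.
        Path path = Paths.get(templatePath).toAbsolutePath().normalize();
        Path parent = path.getParent();
        Path name = path.getFileName();
        if (parent == null || name == null) {
            throw new IllegalArgumentException("Not a file path: " + templatePath);
        }
        return new TemplateLocation(parent.toString(), name.toString());
    }

    public static void main(String[] args) {
        TemplateLocation location = TemplateLocation.of("templates/report.vm");
        System.out.println(location.directory);
        System.out.println(location.fileName);
    }
}
```

The two values could then be passed to a fresh VelocityEngine as in the answer above, one engine per selected template.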
Adding to the points above:
Even with the non-singleton model (i.e. using a VelocityEngine object), multiple paths can be configured by giving comma-separated values to the path property:
file.resource.loader.path=path1,path2
In that case the Velocity engine will look for the template in path1 first and then in path2.
Instead of using the singleton Velocity class, consider creating and initializing a new VelocityEngine before step 3.
In my case I am using Velocity with Servlets in an Eclipse Dynamic Web Project.
I couldn't actually reset the path, but I could put a subdirectory under /WebContent folder and then organize my templates that way... and have nested subdirectories as well.
RequestDispatcher requestDispatcher =
request.getRequestDispatcher("/velocity_templates/index.vm");
This simple solution was all I needed ... didn't need to mess with velocity.properties in web.xml or setting them programmatically (in each case, neither approach worked for me unfortunately when I tried).
Note that when I do template includes with #parse(..) command, I need to use the same path prefix inside the template .vm file as I did in the example code for my servlet.

retrieve property values from message broker bar file with java api

I'm trying to read the property values from a bar file created by message broker.
I want to do this via java. The api is here: http://publib.boulder.ibm.com/infocenter/wmbhelp/v7r0m0/index.jsp?topic=%2Fcom.ibm.etools.mft.doc%2Fbe43410_.htm
However, I can only figure out how to get the names of the properties, NOT THEIR VALUES, by using the deployment descriptor. I can see how to override the value that a property has, but once again, not how to retrieve the value. In other words, I can see only how to write to the property, not how to read from it. I want to do both! Call me greedy ;)
If I use the command line based utility: http://publib.boulder.ibm.com/infocenter/wmbhelp/v7r0m0/index.jsp?topic=%2Fcom.ibm.etools.mft.doc%2Faf03900_.htm
I can get the property values no problem.
But I want to get them via java if at all possible.
Thanks in advance for any help on this!
The problem was that I was misunderstanding how the deployment descriptor worked. I thought that when the Java API referred to overridden properties it meant ones that were overridden in my Java code. But it actually meant all the properties that had a value in the bar file.
That being said, getting the values is not straightforward. You have to get all the identifiers and then pass them to getOverride():
BarFile b = BarFile.loadBarFile("C:\\BarParamTest\\myBar.bar");
DeploymentDescriptor d = b.getDeploymentDescriptor();
Enumeration<String> properties = d.getPropertyIdentifiers();
while (properties.hasMoreElements())
{
    String p = properties.nextElement();
    System.out.println(p + " = " + d.getOverride(p));
}
Or use the following to list only the properties that have values:
Enumeration<String> properties = d.getOverriddenPropertyIdentifiers();
For some reason, settings are not written to the file if they are not overridden or changed (the reason being that there is no need to store a property's default value :) ), so the way to get those properties is to know their default values. If you are able to connect to the broker, though, I would recommend using the com.ibm.mq.jar library to read the properties with the method
java.util.Properties MessageFlowProxy.Node.getProperties()
from the already deployed .bar.
