Java SnakeYaml - prevent dumping reference names

Java SnakeYaml - prevent dumping reference names - java

I have the following method which I use to get the object converted to yaml representation (which I can eg. print to console)
#Nonnull
private String outputObject(#Nonnull final ObjectToPrint packageSchedule) {
DumperOptions options = new DumperOptions();
options.setAllowReadOnlyProperties(true);
options.setPrettyFlow(true);
return new Yaml(new Constructor(), new JodaTimeRepresenter(), options).dump(ObjectToPrint);
}
All is good, but for some object contained within ObjectToPrint structure I get something like reference name and not the real object content, eg.
!!com.blah.blah.ObjectToPrint
businessYears:
- businessYearMonths: 12
ppiYear: &id001 {
endDate: 30-06-2013,
endYear: 2013,
startDate: 01-07-2012,
startYear: 2012
}
ppiPeriod:
ppiYear: *id001
endDate: 27-03-2014
startDate: 21-06-2013
units: 24.000
number: 1
As you can see from the example above I have ppiYear object printed (marked as $id001) and the same object is used in ppiPeriod but only the reference name is printed there, not the object content.
How to print the object content everytime I use that object within my structure, which I want to be converted to yaml (ObjectToPrint).
PS. It would be nice not to print the reference name at all (&id001) but thats not crucial

This is because you reference the same object at different places. To avoid this you need to create copies of those objects.
Yaml does not have a flag to switch this off, because you might get into endless loops in case of cyclic references.
However you might tweak Yaml source code to ignore double references:
look at Serializer line ~170 method serializeNode:
...
if ( this.serializedNodes.contains(node) ) {
this.emmitter.emit( new AliasEvent( ... ) );
} else {
serializedNodes.add(node); // <== Replace with myHook(serializedNodes,node);
...
void myHook(serializedNodes,node) {
if ( node's class != myClass(es) to avoid ) {
serializedNodes.add(node);
}
if you find a way to avoid Yaml to put nodes into the serializedNodes collection, your problem will be solved, however your programm will loop endless in case of cyclic references.
Best solution is to add a hook which avoids registering only the class you want to be written plain.

As a way to do this without changing the SnakeYAML source code, you can define:
public class NonAnchorRepresenter extends Representer {
public NonAnchorRepresenter() {
this.multiRepresenters.put(Map.class, new RepresentMap() {
public Node representData(Object data) {
return representWithoutRecordingDescendents(data, super::representData);
}
});
}
protected Node representWithoutRecordingDescendents(Object data, Function<Object,Node> worker) {
Map<Object,Node> representedObjectsOnEntry = new LinkedHashMap<Object,Node>(representedObjects);
try {
return worker.apply(data);
} finally {
representedObjects.clear();
representedObjects.putAll(representedObjectsOnEntry);
}
}
}
and use it as e.g. new Yaml(new SafeConstructor(), new NonAnchorRepresenter());
This only does maps, and it does use anchors when it has to, i.e. where a map refers to an ancestor. It would need similar extensions for sets and lists if required. (In my case it was empty maps that were the biggest offender.)
(It would be more easily handled in the SnakeYAML codebase on Representer.representData looking at an option e.g. setAllowAnchors defaulting to true, with the logic above to reset representedObjects after recursing if disallowed. I take the point at https://stackoverflow.com/a/18419489/109079 about the possibility of endless loops but it would be straightforward using this strategy to detect any reference to a parent map and fail fast.)

Another solution which i came by is using io.circle instead snake-yaml if it possible.
Scala code:
private[this] def removeAnchors(configYaml: String): String = {
val withoutAnchorsObj = io.circe.yaml.parser.parse(configYaml).valueOr(throw _)
val withoutAnchorString = io.circe.yaml.Printer(dropNullKeys = true, mappingStyle = Printer.FlowStyle.Block).pretty(withoutAnchorsObj)
logger.info(s"Removed Anchors from configYaml: $configYaml, result: $withoutAnchorString")
withoutAnchorString
}
build.sbt:
val circeVersion = "0.12.0"
libraryDependencies ++= Seq(
"io.circe" %% "circe-yaml" % circeVersion,
"io.circe" %% "circe-parser" % circeVersion,
)

Related

Access Lambda Arguments with Groovy and Spock Argument Capture

I am trying to unit test a Java class with a method containing a lambda function. I am using Groovy and Spock for the test. For proprietary reasons I can't show the original code.
The Java method looks like this:
class ExampleClass {
AsyncHandler asynHandler;
Component componet;
Component getComponent() {
return component;
}
void exampleMethod(String input) {
byte[] data = input.getBytes();
getComponent().doCall(builder ->
builder
.setName(name)
.data(data)
.build()).whenCompleteAsync(asyncHandler);
}
}
Where component#doCall has the following signature:
CompletableFuture<Response> doCall(Consumer<Request> request) {
// do some stuff
}
The groovy test looks like this:
class Spec extends Specification {
def mockComponent = Mock(Component)
#Subject
def sut = new TestableExampleClass(mockComponent)
def 'a test'() {
when:
sut.exampleMethod('teststring')
then:
1 * componentMock.doCall(_ as Consumer<Request>) >> { args ->
assert args[0].args$2.asUtf8String() == 'teststring'
return new CompletableFuture()
}
}
class TestableExampleClass extends ExampleClass {
def component
TestableExampleClass(Component component) {
this.component = component;
}
#Override
getComponent() {
return component
}
}
}
The captured argument, args, shows up as follows in the debug window if I place a breakpoint on the assert line:
args = {Arrays$ArrayList#1234} size = 1
> 0 = {Component$lambda}
> args$1 = {TestableExampleClass}
> args$2 = {bytes[]}
There are two points confusing me:
When I try to cast the captured argument args[0] as either ExampleClass or TestableExampleClass it throws a GroovyCastException. I believe this is because it is expecting Component$Lambda, but I am not sure how to cast this.
Accessing the data property using args[0].args$2, doesn't seem like a clean way to do it. This is likely linked to the casting issue mentioned above. But is there a better way to do this, such as with args[0].data?
Even if direct answers can't be given, a pointer to some documentation or article would be helpful. My search results discussed Groovy closures and Java lambdas comparisons separately, but not about using lambdas in closures.

Why you should not do what you are trying
This invasive kind of testing is a nightmare! Sorry for my strong wording, but I want to make it clear that you should not over-specify tests like this, asserting on private final fields of lambda expressions. Why would it even be important what goes into the lambda? Simply verify the result. In order to do a verification like this, you
need to know internals of how lambdas are implemented in Java,
those implementation details have to stay unchanged across Java versions and
the implementations even have to be the same across JVM types like Oracle Hotspot, OpenJ9 etc.
Otherwise, your tests break quickly. And why would you care how a method internally computes its result? A method should be tested like a black box, only in rare cases should you use interaction testing,where it is absolutely crucial in order to make sure that certain interactions between objects occur in a certain way (e.g. in order to verify a publish-subscribe design pattern).
How you can do it anyway (dont!!!)
Having said all that, just assuming for a minute that it does actually make sense to test like that (which it really does not!), a hint: Instead of accessing the field args$2, you can also access the declared field with index 1. Accessing by name is also possible, of course. anyway, you have to reflect on the lambda's class, get the declared field(s) you are interested in, make them accessible (remember, they are private final) and then assert on their respective contents. You could also filter by field type in order to be less sensitive to their order (not shown here).
Besides, I do not understand why you create a TestableExampleClass instead of using the original.
In this example, I am using explicit types instead of just def in order to make it easier to understand what the code does:
then:
1 * mockComponent.doCall(_ as Consumer<Request>) >> { args ->
Consumer<Request> requestConsumer = args[0]
Field nameField = requestConsumer.class.declaredFields[1]
// Field nameField = requestConsumer.class.getDeclaredField('arg$2')
nameField.accessible = true
byte[] nameBytes = nameField.get(requestConsumer)
assert new String(nameBytes, Charset.forName("UTF-8")) == 'teststring'
return new CompletableFuture()
}
Or, in order to avoid the explicit assert in favour of a Spock-style condition:
def 'a test'() {
given:
String name
when:
sut.exampleMethod('teststring')
then:
1 * mockComponent.doCall(_ as Consumer<Request>) >> { args ->
Consumer<Request> requestConsumer = args[0]
Field nameField = requestConsumer.class.declaredFields[1]
// Field nameField = requestConsumer.class.getDeclaredField('arg$2')
nameField.accessible = true
byte[] nameBytes = nameField.get(requestConsumer)
name = new String(nameBytes, Charset.forName("UTF-8"))
return new CompletableFuture()
}
name == 'teststring'
}

Jackson YAML Serialization Object Arrays Format

I am trying to format my Jackson Yaml serialization in a certain way.
employees:
- name: John
age: 26
- name: Bill
age: 17
But, when I serialize the object, this is the format that I get.
employees:
-
name: John
age: 26
-
name: Bill
age: 17
Is there any way to get rid of the newline at the start of an object in the array? This is purely a personal preference/human readability issue.
These are the properties I'm currently setting on the YAMLFactory:
YAMLFactory yamlFactory = new YAMLFactory()
.enable(YAMLGenerator.Feature.MINIMIZE_QUOTES) //removes quotes from strings
.disable(YAMLGenerator.Feature.WRITE_DOC_START_MARKER)//gets rid of -- at the start of the file.
.enable(YAMLGenerator.Feature.INDENT_ARRAYS);// enables indentation.
I've looked in the java docs for the YAMLGenerator in Jackson, and looked at the other questions on stackoverflow, but I can't find an option to do what I'm trying to do.
I've tried CANONICAL_OUTPUT, SPLIT_LINES and LITERAL_BLOCK_STYLE properties, the last one being automatically set when MINIMIZE_QUOTES is set.
CANONICAL_OUTPUT seems to add brackets around arrays.
SPLIT_LINES and LITERAL_BLOCK_STYLE are related to how multi-line strings are handled.

The short answer is there is currently no way to do this through Jackson. This is due to a bug within snakeyaml where if you set the indicatorIndent property, the whitespace is not handled properly, and therefore snakeyaml adds the new line.
I have found a workaround using snakeyaml directly.
//The representer allows us to ignore null properties, and to leave off the class definitions
Representer representer = new Representer() {
//ignore null properties
#Override
protected NodeTuple representJavaBeanProperty(Object javaBean, Property property, Object propertyValue, Tag customTag) {
// if value of property is null, ignore it.
if (propertyValue == null) {
return null;
}
else {
return super.representJavaBeanProperty(javaBean, property, propertyValue, customTag);
}
}
//Don't print the class definition
#Override
protected MappingNode representJavaBean(Set<Property> properties, Object javaBean) {
if (!classTags.containsKey(javaBean.getClass())){
addClassTag(javaBean.getClass(), Tag.MAP);
}
return super.representJavaBean(properties, javaBean);
}
};
DumperOptions dumperOptions = new DumperOptions();
//prints the yaml as nested blocks
dumperOptions.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK);
//indicatorIndent indents the '-' character for lists
dumperOptions.setIndicatorIndent(2);
//This is the workaround. Indent must be set to at least 2 higher than the indicator indent because of how whitespace is handled.
//If not set to 2 higher, then the newline is added.
dumperOptions.setIndent(4);
Yaml yaml = new Yaml(representer, dumperOptions);
//prints the object to a yaml string.
yaml.dump(object);
The workaround happens with setting the indent property on the DumperOptions. You need to set the indent to a value at least 2 higher than the indicatorIndent, or else the newline will be added. This is due to how whitespace is handled within snakeyaml.

Looking at the Jackson source code the snakeyaml dumper options are created like:
#Andrey does this look good to you?
protected DumperOptions buildDumperOptions(int jsonFeatures, int yamlFeatures,
org.yaml.snakeyaml.DumperOptions.Version version)
{
DumperOptions opt = new DumperOptions();
// would we want canonical?
if (Feature.CANONICAL_OUTPUT.enabledIn(_formatFeatures)) {
opt.setCanonical(true);
} else {
opt.setCanonical(false);
// if not, MUST specify flow styles
opt.setDefaultFlowStyle(FlowStyle.BLOCK);
}
// split-lines for text blocks?
opt.setSplitLines(Feature.SPLIT_LINES.enabledIn(_formatFeatures));
// array indentation?
if (Feature.INDENT_ARRAYS.enabledIn(_formatFeatures)) {
// But, wrt [dataformats-text#34]: need to set both to diff values to work around bug
// (otherwise indentation level is "invisible". Note that this should NOT be necessary
// but is needed up to at least SnakeYAML 1.18.
// Also looks like all kinds of values do work, except for both being 2... weird.
opt.setIndicatorIndent(1);
opt.setIndent(2);
}
// 14-May-2018: [dataformats-text#84] allow use of platform linefeed
if (Feature.USE_PLATFORM_LINE_BREAKS.enabledIn(_formatFeatures)) {
opt.setLineBreak(DumperOptions.LineBreak.getPlatformLineBreak());
}
return opt;
}

I ran into this and ended up writing this blog entry to describe the solution I came up with. In short, I made a custom subclass of YAMLGenerator and YAMLFactory and used that to configure Jackson's YAMLMapper. Not "clean" but not huge and reasonably effective. Let me set arbitrary DumperOptions
Source below, but also available in this gist.
Warning - I did this all in Kotlin, but it's trivial code, and should be easily back-portable to Java:
val mapper: YAMLMapper = YAMLMapper(MyYAMLFactory()).apply {
registerModule(KotlinModule())
setSerializationInclusion(JsonInclude.Include.NON_EMPTY)
}
class MyYAMLGenerator(
ctx: IOContext,
jsonFeatures: Int,
yamlFeatures: Int,
codec: ObjectCodec,
out: Writer,
version: DumperOptions.Version?
): YAMLGenerator(ctx, jsonFeatures, yamlFeatures, codec, out, version) {
override fun buildDumperOptions(
jsonFeatures: Int,
yamlFeatures: Int,
version: DumperOptions.Version?
): DumperOptions {
return super.buildDumperOptions(jsonFeatures, yamlFeatures, version).apply {
//
// NOTE: CONFIGURATION HAPPENS HERE!!
//
defaultScalarStyle = ScalarStyle.LITERAL;
defaultFlowStyle = FlowStyle.BLOCK
indicatorIndent = 2
nonPrintableStyle = ESCAPE
indent = 4
isPrettyFlow = true
width = 100
this.version = version
}
}
}
class MyYAMLFactory(): YAMLFactory() {
#Throws(IOException::class)
override fun _createGenerator(out: Writer, ctxt: IOContext): YAMLGenerator {
val feats = _yamlGeneratorFeatures
return MyYAMLGenerator(ctxt, _generatorFeatures, feats,_objectCodec, out, _version)
}
}

In today's environment using at least version 2.12.x of jackson-dataformat-yaml thanks to the introduction of YAMLGenerator.Feature.INDENT_ARRAYS_WITH_INDICATOR, you can simply just do:
public ObjectMapper yamlObjectMapper() {
final YAMLFactory factory = new YAMLFactory()
.enable(YAMLGenerator.Feature.INDENT_ARRAYS_WITH_INDICATOR)
.enable(YAMLGenerator.Feature.MINIMIZE_QUOTES)
.disable(YAMLGenerator.Feature.WRITE_DOC_START_MARKER);
return new ObjectMapper(factory);
}
Which will give you the output of:
employees:
- name: John
age: 26
- name: Bill
age: 17

List of regex results instead of first result in Kotlin

Using the following code, I can set a couple variables to my matches. I want to do the same thing, but populate a map of all instances of these results. I'm struggling and could use help.
val (dice, level) = Regex("""([0-9]*d[0-9]*) at ([0-9]*)""").matchEntire(text)?.destructured!!
This code works for one instance, none of my attempts at matching multiple are working.

Your solution is short and readable. Here are a few options the one you use is largely a matter of preference. You can get a Map directly by using the associate method as follows.
val diceLevels = levelMatches.associate { matched ->
val (diceTwo,levelTwo) = matched.destructured
(levelTwo to diceTwo)
}
Note: This creates an immutable map. If you want a MutableMap, you can use associateTo.
If you want to be concise, you can simplify out the destructuring to local variables and index the groups directly.
val diceLevels = levelMatches.associate {
(it.groupValues[2] to it.groupValues[1])
}
Or, using let, you can also avoid needing to declare levelMatches as a local variable if it isn't used elsewhere --
val diceLevels = Regex("([0-9]+d[0-9]+) at ([0-9]+)")
.findAll(text)
.let { levelMatches ->
levelMatches.associate {
(it.groupValues[2] to it.groupValues[1])
}
}

I realized this was no where near as complicated as I was making it. Here was my solution. Is there something more elegant?
val levelMatches = Regex("([0-9]+d[0-9]+) at ([0-9]+)").findAll(text)
levelMatches.forEach { matched ->
val (diceTwo,levelTwo) = matched.destructured
diceLevels[levelTwo] = diceTwo
}

Simple Code to Copy Same Name Properties?

I have an old question sustained in my mind for a long time. When I was writing code in Spring, there are lots dirty and useless code for DTO, domain objects. For language level, I am hopeless in Java and see some light in Kotlin. Here is my question:
Style 1 It is common for us to write following code (Java, C++, C#, ...)
// annot: AdminPresentation
val override = FieldMetadataOverride()
override.broadleafEnumeration = annot.broadleafEnumeration
override.hideEnumerationIfEmpty = annot.hideEnumerationIfEmpty
override.fieldComponentRenderer = annot.fieldComponentRenderer
Sytle 2 Previous code can be simplified by using T.apply() in Kotlin
override.apply {
broadleafEnumeration = annot.broadleafEnumeration
hideEnumerationIfEmpty = annot.hideEnumerationIfEmpty
fieldComponentRenderer = annot.fieldComponentRenderer
}
Sytle 3 Can such code be even simplified to something like this?
override.copySameNamePropertiesFrom (annot) { // provide property list here
broadleafEnumeration
hideEnumerationIfEmpty
fieldComponentRenderer
}
First Priority Requirments
Provide property name list only one time
The property name is provided as normal code, so as to we can get IDE auto complete feature.
Second Priority Requirements
It's prefer to avoid run-time cost for Style 3. (For example, 'reflection' may be a possible implementation, but it do introduce cost.)
It's prefer to generated code like style1/style2 directly.
Not care
The final syntax of Style 3.
I am a novice for Kotlin language. Is it possible to use Kotlin to define somthing like 'Style 3' ?

It should be pretty simple to write a 5 line helper to do this which even supports copying every matching property or just a selection of properties.
Although it's probably not useful if you're writing Kotlin code and heavily utilising data classes and val (immutable properties). Check it out:
fun <T : Any, R : Any> T.copyPropsFrom(fromObject: R, vararg props: KProperty<*>) {
// only consider mutable properties
val mutableProps = this::class.memberProperties.filterIsInstance<KMutableProperty<*>>()
// if source list is provided use that otherwise use all available properties
val sourceProps = if (props.isEmpty()) fromObject::class.memberProperties else props.toList()
// copy all matching
mutableProps.forEach { targetProp ->
sourceProps.find {
// make sure properties have same name and compatible types
it.name == targetProp.name && targetProp.returnType.isSupertypeOf(it.returnType)
}?.let { matchingProp ->
targetProp.setter.call(this, matchingProp.getter.call(fromObject))
}
}
}
This approach uses reflection, but it uses Kotlin reflection which is very lightweight. I haven't timed anything, but it should run almost at same speed as copying properties by hand.
Now given 2 classes:
data class DataOne(val propA: String, val propB: String)
data class DataTwo(var propA: String = "", var propB: String = "")
You can do the following:
var data2 = DataTwo()
var data1 = DataOne("a", "b")
println("Before")
println(data1)
println(data2)
// this copies all matching properties
data2.copyPropsFrom(data1)
println("After")
println(data1)
println(data2)
data2 = DataTwo()
data1 = DataOne("a", "b")
println("Before")
println(data1)
println(data2)
// this copies only matching properties from the provided list
// with complete refactoring and completion support
data2.copyPropsFrom(data1, DataOne::propA)
println("After")
println(data1)
println(data2)
Output will be:
Before
DataOne(propA=a, propB=b)
DataTwo(propA=, propB=)
After
DataOne(propA=a, propB=b)
DataTwo(propA=a, propB=b)
Before
DataOne(propA=a, propB=b)
DataTwo(propA=, propB=)
After
DataOne(propA=a, propB=b)
DataTwo(propA=a, propB=)

Property file based conditional patterns in java

I have a property file (a.txt) which has the values (Example values given below) like below
test1=10
test2=20
test33=34
test34=35
By reading this file, I need to produce an output like below
value = 35_20_34_10
which means => I have a pattern like test34_test2_test33_test1
Note, If the 'test33' has any value other than 34 then I need to produce the value like below
value = 35_20_10
which means => I have a pattern like test34_test2_test1
Now my problem is, every time when the customer is making the change in the logic, I am making the change in the code. So what I expect is, I want to keep the logic (pattern) in another property file so I will be sending the two inputs to the util (one input is the property file (A.txt) another input will be the 'pattern.txt'),
My util has to be compare the A.txt and the business logic 'pattern.txt' and produce the output like
value = 35_20_34_10 (or)
value = 35_20_10
If there an example for such pattern based logic as I expect?
Any predefined util / java class does this?
Any help would be Great.
thanks,
Harry

First of all, svasa's answer makes a lot of sense, but covers different level of
abstraction. I recommend you read his answer too, that pattern should
be useful.
You may wanna look at Apache Velocity and FreeMarker libraries to see how they structure their API.
Those are template engines - they usually have some abstraction of pattern or format, and abstraction of variable/value binding (or namespace, or source). You can render a template by binding it with a binding/namespace, which yields the result.
For example, you may wanna have a pattern "<a> + <b>", and binding that looks like a map: {a: "1", b: "2"}. By binding that binding to that pattern you'll get "1 + 2", when interpreting <...> as variables.
You basically load the pattern from your pattern.txt, then load your data file A.txt (for example, by treating it as properties and using Properties class) and construct binding based on these properties. You'll get your output and possibility to customize the pattern all the time.

You may call the sequences like test34_test2_test33_test1 as a pattern, let me call them as constraints when building something.
To me this problem best fits into a
builder pattern.
When building the value you want, you tell the builder that these are my constraints(pattern) and these are my original properties like below:
new MyPropertiesBuilder().setConstraints(constraints).setProperties(original).buildValue();
Details:
Set some constraints in a separate file where you specify the order of the properties and their values like :
test34=desiredvalue-could-be-empty
test2=desiredvalue-could-be-empty
test33=34
test1=desiredvalue-could-be-empty
The builder goes over the constraints in the order specified, but get the values from the original properties and build the desired string.
One way to achieve your requirement through builder pattern is to define classes like below :
Interface:
public interface IMyPropertiesBuilder
{
public void setConstraints( Properties properties );
public void setProperties( Properties properties );
public String buildValue();
}
Builder
public class MyPropertiesBuilder implements IMyPropertiesBuilder
{
private Properties constraints;
private Properties original;
#Override
public void setConstraints( Properties constraints )
{
this.constraints = constraints;
}
#Override
public String buildValue()
{
StringBuilder value = new StringBuilder();
Iterator it = constraints.keySet().iterator();
while ( it.hasNext() )
{
String key = (String) it.next();
if ( original.containsKey( key ) && constraints.getProperty( key ) != null && original.getProperty( key ).equals( constraints.getProperty( key ) ) )
{
value.append( original.getProperty( key ) );
value.append( "_" );
}
}
return value.toString();
}
#Override
public void setProperties( Properties properties )
{
this.original = properties;
}
}
User
public class MyPropertiesBuilderUser
{
private Properties original = new Properties().load(new FileInputStream("original.properties"));;
private Properties constraints = new Properties().load(new FileInputStream("constraints.properties"));
public String getValue()
{
String value = new MyPropertiesBuilder().setConstraints(constraints).setProperties(original).buildValue();
}
}

We Keep Coding

Java is a programming language and computing platform first released by Sun Microsystems in 1995.