Parsing picocli-based CLI usage output into structured data - java

I have a set of picocli-based applications whose usage output I'd like to parse into structured data. I've written three different output parsers so far and I'm not happy with any of them (fragility, complexity, difficulty in extending, etc.). Any thoughts on how to cleanly parse this type of semi-structured output?
The usage output generally looks like this:
Usage: taker-mvo-2 [-hV] [-C=file] [-E=file] [-p=payoffs] [-s=millis] PENALTY
                   (ASSET SPREAD)...
Submits liquidity-taking orders based on mean-variance optimization of multiple
assets.
      PENALTY                risk penalty for payoff variance
      (ASSET SPREAD)...      Spread for creating market above fundamental value
                               for assets
  -C, --credential=file      credential file
  -E, --endpoint=file        marketplace endpoint file
  -h, --help                 display this help message
  -p, --payoffs=payoffs      payoff states and probabilities (default: .fm/payoffs)
  -s, --sleep=millis         sleep milliseconds before acting (default: 2000)
  -V, --version              print product version and exit
I want to capture the program name and description, options, parameters, and parameter-groups along with their descriptions into an agent:
public class Agent {
private String name;
private String description = "";
private List<Option> options;
private List<Parameter> parameters;
private List<ParameterGroup> parameterGroups;
}
The program name is taker-mvo-2 and the (possibly multi-lined) description is after the (possibly multi-line) arguments list:
Submits liquidity-taking orders based on mean-variance optimization of multiple assets.
Options (in square brackets) should be parsed into:
public class Option {
private String shortName;
private String parameter;
private String longName;
private String description;
}
The parsed options' JSON is:
options: [ {
"shortName": "h",
"parameter": null,
"longName": "help",
"description": "display this help message"
}, {
"shortName": "V",
"parameter": null,
"longName": "version",
"description": "print product version and exit"
}, {
"shortName": "C",
"parameter": "file",
"longName": "credential",
"description": "credential file"
}, {
"shortName": "E",
"parameter": "file",
"longName": "endpoint",
"description": "marketplace endpoint file"
}, {
"shortName": "p",
"parameter": "payoffs",
"longName": "payoffs",
"description": "payoff states and probabilities (default: ~/.fm/payoffs)"
}]
Similarly for the parameters which should be parsed into:
public class Parameter {
private String name;
private String description;
}
and parameter-groups which are surrounded by ( and )... should be parsed into:
public class ParameterGroup {
private List<String> parameters;
private String description;
}
The first hand-written parser walked the buffer, capturing the data as it progressed. It works pretty well, but it looks horrible and is horrible to extend. The second hand-written parser uses regular expressions while walking the buffer. It looks better than the first but is still ugly and difficult to extend. The third parser uses regular expressions. It is probably the best looking of the bunch but still ugly and unmanageable.
I thought this text would be pretty simple to parse manually but now I'm wondering if ANTLR might be a better tool for this. Any thoughts or alternative ideas?

Model
It sounds like what you need is a model. An object model that describes the command, its options, option parameter types, option description, option names, and similar for positional parameters, argument groups, and potentially subcommands.
Then, once you have an object model of your application, it is relatively straightforward to render this as JSON or as some other format.
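As a rough illustration of the "model first, then render" idea, here is a minimal sketch that uses no JSON library and assumes Java 16+ for records. The names ModelJson, OptionModel, quote, optionToJson and optionsToJson are hypothetical; OptionModel simply mirrors the Option class from the question. The escaping only handles quotes and backslashes, so a real application would more likely hand the model to a library such as Jackson or Gson.

```java
import java.util.List;
import java.util.stream.Collectors;

class ModelJson {
    // Hypothetical model type mirroring the Option class from the question.
    record OptionModel(String shortName, String parameter, String longName, String description) {}

    // Quotes a string for JSON, or emits null; escapes only backslash and double quote.
    static String quote(String s) {
        if (s == null) {
            return "null";
        }
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Renders one option as a JSON object.
    static String optionToJson(OptionModel o) {
        return "{\"shortName\": " + quote(o.shortName())
                + ", \"parameter\": " + quote(o.parameter())
                + ", \"longName\": " + quote(o.longName())
                + ", \"description\": " + quote(o.description()) + "}";
    }

    // Renders a list of options as a JSON array.
    static String optionsToJson(List<OptionModel> options) {
        return options.stream()
                .map(ModelJson::optionToJson)
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        List<OptionModel> options = List.of(
                new OptionModel("h", null, "help", "display this help message"),
                new OptionModel("C", "file", "credential", "credential file"));
        System.out.println(optionsToJson(options));
    }
}
```

The point of the sketch is only that rendering becomes a mechanical walk over the model once the model exists; nothing here parses usage text.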
Picocli has an object model
You could build this yourself, but if you are using picocli anyway, why not leverage picocli's strengths and use picocli's built-in model?
CommandSpec
OptionSpec
PositionalParamSpec
ArgGroupSpec
and more...
Accessing picocli's object model
Commands can access their own model
Within a picocli-based application, a @Command-annotated class can access its own picocli object model by declaring a @Spec-annotated field. Picocli will inject the CommandSpec into that field.
For example:
@Command(name = "taker-mvo-2", mixinStandardHelpOptions = true, version = "taker-mvo-2 0.2")
class TakerMvo2 implements Runnable {
    // ...
    @Option(names = {"-C", "--credential"}, description = "credential file")
    File file;

    @Spec CommandSpec spec; // injected by picocli

    public void run() {
        for (OptionSpec option : spec.options()) {
            System.out.printf("%s=%s%n", option.longestName(), option.getValue());
        }
    }
}
The picocli user manual has a more detailed example that uses the CommandSpec to loop over all options in a command to see if the option was defaulted or whether a value was specified on the command line.
Creating a model of any picocli command
An alternative way to access picocli's object model is to construct a CommandLine instance with the @Command-annotated class (or an object of that class). You can do this outside of your picocli application.
For example:
class Agent {
    public static void main(String... args) {
        CommandLine cmd = new CommandLine(new TakerMvo2());
        CommandSpec spec = cmd.getCommandSpec();

        // get subcommands
        Map<String, CommandLine> subCmds = spec.subcommands();

        // get options as a list
        List<OptionSpec> options = spec.options();

        // get argument groups
        List<ArgGroupSpec> argGroups = spec.argGroups();
        // ...
    }
}

Related

Picocli: Is it possible to define options with a space in the name?

I googled around for a bit and also searched on StackOverflow and of course the Picocli docs but didn't come to any solution.
The company I work at uses a special format for command line parameters in batch programs:
-VAR ARGUMENT1=VALUE -VAR ARGUMENT2=VALUE2 -VAR BOOLEANARG=FALSE
(Don't ask me why this format is used, I already questioned it and didn't get a proper answer.)
Now I wanted to use Picocli for command line parsing. However, I can't get it to work with the parameter format we use, because the space makes Picocli think those are two separate arguments and thus it won't recognise them as the ones I defined.
This won't work, obviously:
@CommandLine.Option( names = { "-VAR BOOLEANARG" } )
boolean booleanarg = true;
Calling the program with -VAR BOOLEANARG=FALSE won't have any effect.
Is there any way to custom define those special option names containing spaces? Or how would I go about it? I also am not allowed to collapse multiple arguments as parameters into one -VAR option.
Help is much appreciated.
Thanks and best regards,
Rosa
Solution 1: Map Option
The simplest solution is to make -VAR a Map option. That could look something like this:
@Command(separator = " ")
class Simple implements Runnable {
    enum MyOption {ARGUMENT1, OTHERARG, BOOLEANARG}

    @Option(names = "-VAR",
            description = "Variable options. Valid keys: ${COMPLETION-CANDIDATES}.")
    Map<MyOption, String> options;

    @Override
    public void run() {
        // business logic here
    }

    public static void main(String[] args) {
        new CommandLine(new Simple()).execute(args);
    }
}
The usage help for this example would look like this:
Usage: <main class> [-VAR <MyOption=String>]...
-VAR <MyOption=String>
Variable options. Valid keys: ARGUMENT1, OTHERARG, BOOLEANARG.
Note that with this solution all values would have the same type (String in this example), and you may need to convert to the desired type (boolean, int, other...) in the application.
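The conversion step mentioned above could look roughly like the following sketch. It does not use picocli at all: the map is filled by hand as a stand-in for what the -VAR option would collect, and the class name MapOptionConversion and the helper getBoolean are hypothetical.

```java
import java.util.EnumMap;
import java.util.Map;

class MapOptionConversion {
    // Same keys as the MyOption enum in the example above.
    enum MyOption { ARGUMENT1, OTHERARG, BOOLEANARG }

    // Hypothetical helper: reads a key as a boolean, falling back to a default when the key is absent.
    static boolean getBoolean(Map<MyOption, String> options, MyOption key, boolean defaultValue) {
        String raw = options.get(key);
        // Boolean.parseBoolean is case-insensitive and treats anything other than "true" as false.
        return raw == null ? defaultValue : Boolean.parseBoolean(raw);
    }

    public static void main(String[] args) {
        // Stand-in for the map collected from: -VAR ARGUMENT1=VALUE -VAR BOOLEANARG=FALSE
        Map<MyOption, String> options = new EnumMap<>(MyOption.class);
        options.put(MyOption.ARGUMENT1, "VALUE");
        options.put(MyOption.BOOLEANARG, "FALSE");

        boolean booleanArg = getBoolean(options, MyOption.BOOLEANARG, true);
        System.out.println("ARGUMENT1=" + options.get(MyOption.ARGUMENT1));
        System.out.println("BOOLEANARG=" + booleanArg);
    }
}
```

Keeping the conversion in small helpers like this means the default for each key is stated in one place, which matches the booleanarg = true default in the question.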
However, this may not be acceptable given this sentence in your post:
I also am not allowed to collapse multiple arguments as parameters into one -VAR option.
Solution 2: Argument Groups
One idea for an alternative is to use argument groups: we can make ARGUMENT1, OTHERARG, and BOOLEANARG separate options, and put them in a group so that they must be preceded by the -VAR option.
The resulting usage help looks something like this:
Usage: group-demo [-VAR (ARGUMENT1=<arg1> | OTHERARG=<otherValue> |
                  BOOLEANARG=<bool>)]... [-hV]
      -VAR                   Option prefix. Must be followed by one of
                               ARGUMENT1, OTHERARG or BOOLEANARG
      ARGUMENT1=<arg1>       An arg. Must be preceded by -VAR.
      OTHERARG=<otherValue>  Another arg. Must be preceded by -VAR.
      BOOLEANARG=<bool>      A boolean arg. Must be preceded by -VAR.
  -h, --help                 Show this help message and exit.
  -V, --version              Print version information and exit.
And the implementation could look something like this:
@Command(name = "group-demo", mixinStandardHelpOptions = true,
        sortOptions = false)
class UsingGroups implements Runnable {

    static class MyGroup {
        @Option(names = "-VAR", required = true,
                description = "Option prefix. Must be followed by one of ARGUMENT1, OTHERARG or BOOLEANARG")
        boolean ignored;

        static class InnerGroup {
            @Option(names = "ARGUMENT1", description = "An arg. Must be preceded by -VAR.")
            String arg1;

            @Option(names = "OTHERARG", description = "Another arg. Must be preceded by -VAR.")
            String otherValue;

            @Option(names = "BOOLEANARG", arity = "1",
                    description = "A boolean arg. Must be preceded by -VAR.")
            Boolean bool;
        }

        // exclusive: only one of these options can follow a -VAR option
        // multiplicity=1: InnerGroup must occur once
        @ArgGroup(multiplicity = "1", exclusive = true)
        InnerGroup inner;
    }

    // non-exclusive means co-occurring, so if -VAR is specified,
    // then it must be followed by one of the InnerGroup options
    @ArgGroup(multiplicity = "0..*", exclusive = false)
    List<MyGroup> groupOccurrences;

    @Override
    public void run() {
        // business logic here
        System.out.printf("You specified %d -VAR options.%n", groupOccurrences.size());
        for (MyGroup group : groupOccurrences) {
            System.out.printf("ARGUMENT1=%s, OTHERARG=%s, BOOLEANARG=%s%n",
                    group.inner.arg1, group.inner.otherValue, group.inner.bool);
        }
    }

    public static void main(String[] args) {
        new CommandLine(new UsingGroups()).execute(args);
    }
}
Then, invoking with java UsingGroups -VAR ARGUMENT1=abc -VAR BOOLEANARG=true gives:
You specified 2 -VAR options.
ARGUMENT1=abc, OTHERARG=null, BOOLEANARG=null
ARGUMENT1=null, OTHERARG=null, BOOLEANARG=true
With this approach, you will get a MyGroup object for every time the end user specifies -VAR. This MyGroup object has an InnerGroup which has many fields, all but one of which will be null. Only the field that the user specified will be non-null. That is the disadvantage of this approach: in the application you would need to inspect all fields to find the non-null one that the user specified. The benefit is that by selecting the right type for the @Option-annotated field, the values will be automatically converted to the destination type.
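The "inspect all fields" step might be sketched as follows. This is plain Java with the picocli annotations left out, so InnerGroup here is only a stand-in for the annotated class above, and NonNullFieldDemo and describe are hypothetical names.

```java
class NonNullFieldDemo {
    // Plain stand-in for the annotated InnerGroup above (picocli annotations omitted).
    static class InnerGroup {
        String arg1;
        String otherValue;
        Boolean bool;
    }

    // Hypothetical helper: reports the first non-null field as NAME=value.
    static String describe(InnerGroup g) {
        if (g.arg1 != null) {
            return "ARGUMENT1=" + g.arg1;
        }
        if (g.otherValue != null) {
            return "OTHERARG=" + g.otherValue;
        }
        if (g.bool != null) {
            return "BOOLEANARG=" + g.bool;
        }
        return "(none)";
    }

    public static void main(String[] args) {
        // Simulates what one "-VAR BOOLEANARG=true" occurrence would leave behind.
        InnerGroup g = new InnerGroup();
        g.bool = Boolean.TRUE;
        System.out.println(describe(g));
    }
}
```

Because the group is exclusive, at most one branch should match per occurrence, so the order of the checks does not matter in practice.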

Multi dimensional dictionary?

I'm trying to find a simple/efficient way to store multiple values under each index for my application, for example:
1 = {54, "Some string", false, "Some other string"}
2 = {12, "Some string", true, "Some other string"}
3 = {18, "Some string", true, "Some other string"}
So that I can set this as a static variable which can then be accessed from various object instances via the single index value (the only variable within each object). Essentially, sort of like a "multi dimensional dictionary".
I have looked at 2D arrays, but they seem to be limited to a single data type (int, String, etc.). I also looked at hash maps, which seemed limited as well: storing more than two values would require a list, which again comes back to the single data type problem. Any advice on a simple solution for this please?
Define a class for those entries, and use an array of objects. So the class might be something like:
class Thingy {
private int someNumber;
private String someString;
private boolean someBool;
private String someOtherString;
public Thingy(int _someNumber, String _someString, boolean _someBool, String _someOtherString) {
this.someNumber = _someNumber;
this.someString = _someString;
this.someBool = _someBool;
this.someOtherString = _someOtherString;
}
public int getSomeNumber() {
return this.someNumber;
}
// ...setter if appropriate...
// ...add accessors for the others...
}
...and then you do:
Thingy[] thingies = new Thingy[] {
new Thingy(54, "Some string", false, "Some other string"),
new Thingy(12, "Some string", true, "Some other string"),
new Thingy(18, "Some string", true, "Some other string")
};
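If the entries should be reachable through the 1-based index values from the question rather than through zero-based array positions, a static map is one option. The sketch below is an assumption-laden variant of the answer above: ThingyLookup, Row and ENTRIES are hypothetical names, Row is a compact stand-in for Thingy, and records require Java 16+.

```java
import java.util.Map;

class ThingyLookup {
    // Compact stand-in for the Thingy class above.
    record Row(int someNumber, String someString, boolean someBool, String someOtherString) {}

    // Shared, immutable table keyed by the index value that each object holds.
    static final Map<Integer, Row> ENTRIES = Map.of(
            1, new Row(54, "Some string", false, "Some other string"),
            2, new Row(12, "Some string", true, "Some other string"),
            3, new Row(18, "Some string", true, "Some other string"));

    public static void main(String[] args) {
        Row row = ENTRIES.get(2);
        System.out.println(row.someNumber() + " " + row.someBool());
    }
}
```

Map.of returns an unmodifiable map, which suits a static lookup table that several object instances read but never change.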
Python relies heavily on dictionaries: most instances keep their attributes in a dictionary that can be inspected and modified through the __dict__ attribute. If your model frequently has to access a dozen or so dictionaries, writing it in Python along the following lines would avoid much of the Java boilerplate:
class ExampleObject:
    def __init__(self):
        # instance attributes are stored in obj.__dict__
        self.spam = "example"
        self.title = "email title"
        self.content = "some content"

obj = ExampleObject()
print(obj.spam)              # prints "example"
print(obj.__dict__["spam"])  # also prints "example"
Just throwing an alternative option out there for you.

How to set method name prefix in swagger codegen?

(Newbie to Swagger)
In the swagger specification file, the operationId is the name of the operation, corresponding to the HTTP methods.
For example,
"/pet/findByStatus": {
"get": {
"tags": [
"pet"
],
"summary": "Finds Pets by status",
"description": "Multiple status values can be provided with comma separated strings",
"operationId": "findPetsByStatus",
As seen above, operationId = findPetsByStatus. Suppose I want to generate a prefix for all get operations in my java code, with prefix = 'get_'.
For example, I would expect the swagger codegen to produce all operations corresponding to HTTP GET methods with a prefix = 'get_'. Specifically, above, it might generate: get_findPetsByStatus.
Is there a way to tell swagger codegen to prefix methods?
Please note that I want to use swagger-codegen itself and not APIMatic-like alternatives.
Extend AbstractJavaCodegen (or one of its subclasses) and override the postProcessOperations method to prepend prefixes to operations (the operationId property of the CodegenOperation class). See making-your-own-codegen-modules for instructions on building and running a custom codegen.
Pseudocode:
public class MyCodegen extends AbstractJavaCodegen { // or a subclass of it
    // [...]
    @Override
    public Map<String, Object> postProcessOperations(Map<String, Object> objs) {
        super.postProcessOperations(objs);
        Map<String, Object> operations = (Map<String, Object>) objs.get("operations");
        if (operations != null) {
            List<CodegenOperation> ops = (List<CodegenOperation>) operations.get("operation");
            for (CodegenOperation operation : ops) {
                // prefix only the operations that map to HTTP GET
                if ("GET".equals(operation.httpMethod)) {
                    operation.operationId = "get_" + operation.operationId;
                } // [...]
            }
        }
        return objs;
    }
}

Iterate through array of object in java

I am trying to loop over the array of objects in java. I'm posting this value from client side to server side which is java.
"userList": [{
"id": "id1",
"name": "name1"
},
{
"id": "id2",
"name": "name2"
}]
Now I want to get the value of each id and name. I tried the code below:
for (Object temp : userList) {
    System.out.print(temp);
    System.out.print(temp.getId());
}
But the output I get is:[object Object]
I'm sorry for this stupid question. But how will I get the value of id and name?
You're getting [object Object] because you didn't turn your JavaScript object into JSON on the client side before sending it to your server--you need to use something like JSON.stringify(object) in the browser.
Next, you will need to unpack your JSON into some sort of Java structure. The preferable way to do this is to let an existing tool such as Jackson or Gson map it onto a Java object that looks like:
class User {
String id;
String name;
}
How to do this will depend on your framework, but Spring MVC (for example) supports it mostly automatically.
Implement the toString method for your class according to how you want the printed output to look.
For example...
public class User {
    private String id;
    private String name;

    // Constructors, field accessors/mutators, etc...

    @Override
    public String toString() {
        return String.format("User {id: %s, name: %s}", this.id, this.name);
    }
}
Your question does not have complete information. You certainly are skipping steps.
Before you can call getId() on the elements in Java, you need to cast the list to the right element type.
ArrayList<User> convertedUserList = (ArrayList<User>) userList;
for (User temp : convertedUserList) {
    System.out.print(temp);
    System.out.print(temp.getId());
}
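Once the list really holds User objects (for example after a JSON mapper has populated it), the loop can call the getters directly. Here is a minimal, self-contained sketch of that end state; the list is built by hand as a stand-in for the deserialized userList, and UserListDemo and describe are hypothetical names.

```java
import java.util.List;

class UserListDemo {
    static class User {
        private final String id;
        private final String name;

        User(String id, String name) {
            this.id = id;
            this.name = name;
        }

        String getId() {
            return id;
        }

        String getName() {
            return name;
        }
    }

    // Hypothetical helper: formats one user as "id: name".
    static String describe(User user) {
        return user.getId() + ": " + user.getName();
    }

    public static void main(String[] args) {
        // Stand-in for the list that a JSON mapper would produce from the posted userList.
        List<User> userList = List.of(new User("id1", "name1"), new User("id2", "name2"));
        for (User user : userList) {
            System.out.println(describe(user));
        }
    }
}
```

Declaring the collection as List<User> from the start avoids both the cast and the Object-typed loop variable from the question.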

Parsing complex nested JSON data with Gson

I'm using Gson to parse a JSON string. I want to convert this to an object using a container class and embedded static classes. To some extent this has been possible, but I want to treat the content of stuff1 and stuff2 as arrays; for example, stuff1 is an array containing other_stuff1 and other_stuff2. This is so I can reference the object in a fashion like this: object.integer, object.stuff1.get("other_stuff1").name, or object.stuff2.get("other_stuff3").more. (For the last one, I might want to loop over more to get each item.)
For example, in PHP, I would use this:
<?php
echo "<pre>";
$object = json_decode(file_get_contents("THE JSON FILENAME"));
foreach($object->stuff1 as $name=>$data) {
echo $name . ":\n"; // other_stuff1 or other_stuff2
echo $data->description . "\n\n"; // Got lots of stuff or Got even more stuff.
}
?>
I want to be able to reference in a similar way, loading the JSON to an object to be used on the fly.
It is crucial that, while some degree of change can be made to the JSON, the names of the elements remain referable and retrievable.
I've included JSON, very similar to the one I'm using, below.
{
"integer":"12345",
"stuff1":{
"other_stuff1":{
"name":"a_name",
"description":"Got lots of stuff.",
"boolean":false
},
"other_stuff2":{
"name":"another_name",
"description":"Got even more stuff",
"boolean":true
}
},
"stuff2":{
"other_stuff3":{
"name":"a_name",
"description":"Got even more stuff",
"boolean":false,
"more":{
"option1":{
"name":"hello"
},
"option2":{
"name":"goodbye"
}
}
}
}
}
I've gone through a number of reference guides and tutorials, and I can't find a way to interpret this the way I'm trying to.
I'd really appreciate it if someone could give me a pointer. I can't find any tutorials that take into account that a) I want multiple objects in an array-style list, referable by the IDs (like with other_stuff1 and other_stuff2), and b) I want to also be able to loop over the items without providing the IDs.
You should define a Java class with fields named after the keys you need. You can use Maps (not arrays) to get the .get("key") behavior you describe. For example:
class Container {
    private int integer;
    private HashMap<String, Stuff> stuff1;
    private HashMap<String, Stuff> stuff2;

    public HashMap<String, Stuff> getStuff1() { return stuff1; }
    // ...getters for the other fields...
}
class Stuff {
    private String name;
    private String description;
    @SerializedName("boolean") private boolean bool;
    private HashMap<String, Option> more;

    public String getName() { return name; }
    // ...getters for the other fields...
}
class Option {
    private String name;
}
For the "boolean" field you need to use a different variable name, since boolean is a reserved keyword.
You can then do:
Container c = gson.fromJson(jsonString, Container.class);
for(Stuff s : c.getStuff1().values()) {
System.out.println(s.getName());
}
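To show why a Map covers both requirements from the question (lookup by ID and looping without knowing the IDs), here is a small sketch that builds the map by hand instead of going through Gson. StuffMapDemo, Stuff and sampleStuff1 are hypothetical names, the record is a simplified stand-in for the Stuff class above, and records require Java 16+.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StuffMapDemo {
    // Simplified stand-in for the Stuff class above.
    record Stuff(String name, String description, boolean bool) {}

    // Builds by hand what the "stuff1" object from the question's JSON would map to.
    static Map<String, Stuff> sampleStuff1() {
        Map<String, Stuff> stuff1 = new LinkedHashMap<>();
        stuff1.put("other_stuff1", new Stuff("a_name", "Got lots of stuff.", false));
        stuff1.put("other_stuff2", new Stuff("another_name", "Got even more stuff", true));
        return stuff1;
    }

    public static void main(String[] args) {
        Map<String, Stuff> stuff1 = sampleStuff1();

        // a) look an entry up by its ID
        System.out.println(stuff1.get("other_stuff1").name());

        // b) loop over all entries without knowing the IDs in advance
        for (Map.Entry<String, Stuff> entry : stuff1.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().description());
        }
    }
}
```

A LinkedHashMap is used here only so that iteration follows insertion order; whether that matters depends on the application.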
