How to cast explicitly in Clojure when interfacing with Java

I'm trying to use Weka from Clojure by converting this howto guide from the Weka wiki to Clojure, using Clojure's Java interop features.
This has worked well so far, except in one case, where the clojure reflection mechanism can't seem to find the right method to invoke - I have:
(def c-model (doto (NaiveBayes.) (.buildClassifier is-training-set)))
Later this will be invoked by the .evaluateModel method of the Evaluation class:
(.evaluateModel e-test c-model is-testing-set)
where e-test is of type weka.classifiers.Evaluation and, according to their API documentation, the method takes two parameters of types Classifier and Instances.
What I get from clojure though is IllegalArgumentException No matching method found: evaluateModel for class weka.classifiers.Evaluation clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:53) - I guess that this is because c-model is actually of type NaiveBayes, although it should also be a Classifier - which it is, according to instance?.
I tried casting with cast to no avail, and from what I understand this is more of a type assertion (and passes without problems, of course) than a real cast in clojure. Is there another way of explicitly telling clojure which types to cast to in java interop method calls? (Note that the original guide I linked above also uses an explicit cast from NaiveBayes to Classifier)
Full code here: http://paste.lisp.org/display/129250

The linked javadoc contradicts your claim that there is a method taking a Classifier and an Instances: what it documents is a method taking a Classifier, an Instances, and a variable number of Objects. As discussed in a number of SO questions (the only one I can find at the moment is Why Is String Formatting Causing a Casting Exception?), Clojure does not provide implicit support for varargs, which are basically a fiction created by the javac compiler. At the JVM level, the varargs part is simply an additional required parameter of type Object[]. If you pass an empty object array as a third argument, e.g. (.evaluateModel e-test c-model is-testing-set (object-array 0)), it will work fine.
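To make the "varargs are just an array parameter" point concrete, here is a minimal Java sketch using plain reflection. VarargsDemo and describe are hypothetical stand-ins (they are not part of Weka); describe only mimics the shape of a method with a trailing Object... parameter.

```java
import java.lang.reflect.Method;

class VarargsDemo {
    // Hypothetical method with the same shape as (Classifier, Instances, Object...).
    static String describe(String label, Object... extras) {
        return label + ":" + extras.length;
    }

    // Reflection sees two parameters: a String and an Object[].
    static Method lookup() {
        try {
            return VarargsDemo.class.getDeclaredMethod("describe", String.class, Object[].class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // A reflective caller has to supply the (possibly empty) array itself.
    static Object callWithNoExtras(String label) {
        try {
            return lookup().invoke(null, label, new Object[0]);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Leaving the array out is an argument-count mismatch, not an "empty varargs" call.
    static boolean rejectsCallWithoutArray(String label) {
        try {
            lookup().invoke(null, label);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup().isVarArgs());
        System.out.println(callWithNoExtras("model"));
        System.out.println(rejectsCallWithoutArray("model"));
    }
}
```

Clojure's reflective dispatch is in the same position as the reflective caller here: it has to be handed the array explicitly.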

IllegalArgumentException No matching method found is thrown whenever the arguments don't match any method of the class. That can happen because no method exists with that name and number of arguments, or because no method with that name exists in the called class at all. So also check the number and types of the arguments.
I basically always resort to repl-utils/show in these cases.

Related

Why is there no @ConformsTo for @FunctionalInterface in Java?

Java (1.8+) has a @FunctionalInterface annotation which (basically) suggests that you can pass a method reference in place of an interface implementation to another method call. The useful one I was playing with today is:
DateTimeFormatter.parse(String, TemporalQuery<T>)
It's nice as it lets you tell the formatter what kind of result to hand back to you. The javadoc even gives you a nice example:
The query is typically a method reference to a from(TemporalAccessor) method. For example:
LocalDateTime dt = parser.parse(str, LocalDateTime::from);
Once I got my head around what @FunctionalInterface is and means, I started to wonder how a consumer of an API is supposed to figure out what they can actually use in its place. The example above tells you what you can use, and if you trace through the java.time package, you can find other method references you can use. However, any contributors to the API need to read through the entire javadoc to make sure they don't break any implicit contracts mentioned in other places (of course they should, especially for the JDK, but that's not the purpose of javadoc!)
So, if a contributor to this API were to change the signature of LocalDateTime::from, there would be no compile-time check saying that this method no longer conforms to the functional interface TemporalQuery. This would obviously break any consumers of the API, and they could change their code to use an explicit lambda instead. I do understand that such a check is not required, but if an annotation similar to the optional @Override annotation were available, it would provide some compile-time checks as well as the possibility of introspecting/reflecting to discover available method references.
e.g.
@ConformsTo(TemporalQuery.class)
public static LocalDateTime from(TemporalAccessor temporal)
It would also then be possible to find, through introspection, any other method references that can be used for a FunctionalInterface.
So, to be clear, I understand that this is not necessary, but do think it seems to be an oversight not to include it as an optional Annotation. Is there any particular reason this could/should not exist?
The problems that arise from changing the signature or return type of a method, e.g. LocalDateTime::from, aren't limited to functional interfaces. Even before Java 8, changing those things risked breaking existing code that relied on them. That's why designing an API is always a challenge: changes to existing code can mean a lot of work.
Additionally, assuming the functional interface and the matching methods are part of different libraries, would you really want that they are closely coupled, i.e. both need to change when one changes? What if they are maintained by different organizations (let's say different open source projects or companies) - how should they coordinate?
As an example, take Comparator.comparing(Function<? super T, ? extends U> keyExtractor). That basically accepts a reference to any method that takes no parameters and returns something comparable. There are so many libraries that already provide such methods; would you want them all to have to add @ConformsTo?
That said, a @ConformsTo would at best be incomplete and might even be misleading/outdated.
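As a small illustration of the Comparator.comparing point, the sketch below passes String::length, a method that existed long before Java 8 and carries no annotation of any kind. ComparingDemo and sortByLength are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class ComparingDemo {
    // String::length was never written with Function or Comparator in mind,
    // yet it is accepted as a key extractor.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Comparator.comparing(String::length));
        return copy;
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>();
        input.add("ccc");
        input.add("a");
        input.add("bb");
        System.out.println(sortByLength(input));
    }
}
```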
Edit:
Let's tackle both annotations from the view of the compiler.
@FunctionalInterface tells the compiler that it should complain when the annotated type does not have exactly one abstract method, or when the annotation is used on something other than an interface.
That means that the requirements/contract definition ("this interface is a functional interface") and the implementation (the interface itself) are contained in the same file and thus have to be changed together anyways.
@ConformsTo could tell the compiler to check the requirements of the functional interface (or even interfaces) and see if that method satisfies them.
So far so good, but the problem arises when the interface changes: the annotation would couple the method and the interface, which could be part of different and otherwise totally unrelated libraries. And even if they were part of the same library, you could run into problems when the method itself isn't recompiled: the compiler might miss the incompatibility and thus defeat the purpose of the annotation (if it were only meant for humans, a simple comment would be sufficient as well).
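For what it's worth, the compiler already performs this check at every place where the method reference is used. A rough sketch, assuming you want the failure to show up in one well-known spot: assign the reference to a typed constant. ConformanceCheck, FROM and parse are hypothetical names, not anything from the JDK.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalQuery;

class ConformanceCheck {
    // This line compiles only while LocalDateTime.from(TemporalAccessor)
    // still fits the shape of TemporalQuery<LocalDateTime>.
    static final TemporalQuery<LocalDateTime> FROM = LocalDateTime::from;

    // Uses the typed constant where the javadoc example uses the method reference directly.
    static LocalDateTime parse(String text) {
        return DateTimeFormatter.ISO_LOCAL_DATE_TIME.parse(text, FROM);
    }

    public static void main(String[] args) {
        System.out.println(parse("2020-01-02T03:04:05"));
    }
}
```

Note that this check lives on the consumer's side, so it fires when the consumer recompiles, which is exactly the coupling question discussed above.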

Instrument intermediary local method call within a method body

I know that (at least using BCEL or ASM, for instance) it is possible to access the local variables of a method, but I need something more. What I would like is:
to get the type of such a variable (or a way to convert from the signature)
to know (distinguish) when this variable is used (either has its value assigned, or is passed as a parameter)
when this variable is used as a parameter, to know which method call it was passed to
to break "method chains" into their respective method calls and get their return values so I can manipulate them
The basic idea is that I would like to "instrument" methods a bit in the same way a debugger does (though limited to the first frame depth...).
Any pointer appreciated.
If more information is needed, feel free to ask.
This is only possible using a byte-code-level API. cglib does not expose such an API, so you have to choose between ASM, BCEL and Javassist. I would recommend ASM, which has the best documentation.
What you would need to do:
Parse the signature of the method; ASM offers utilities for that. You get each type by its internal name, and you need to map these types to their local variable indices.
Find any use of the variable stored at that index.
This is, however, quite a difficult task. In order to predict what your code does, you would have to emulate the method invocation. The JVM is a stack machine, so arguments can be placed on the operand stack as the result of an arbitrary chain of instructions. Therefore, you would effectively have to interpret every byte code instruction that you find. You will, more or less, need to write your own simplistic interpreter, which is quite a task.

Problems calling a variadic Java function from Clojure

I'm having a play with the Java NIO.2 API from JDK 7.
In particular, I want to call the method: Paths#get(String first, String... more)
This is a static method which takes in at least one string, and returns a Path object corresponding to it. There's an overloaded form: Paths#get(URI uri)
However, I can't seem to call the top method from Clojure. The nearest I can seem to get is this:
(Paths/get ^String dir-fq (object-array 0))
which fails with:
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to [Ljava.lang.String;
as you might expect. After all, we're passing in an Object[] to something that's expecting String[].
I've tried removing the (object-array) form - but that just causes Clojure to try to call the get(URI) method - both with and without the type hint.
Passing nil as the second argument to Paths#get(String, String...) causes the right method to be called, but Java 7 then fails with an NPE.
I can't seem to find a way in Clojure to express the type String[] - I'm guessing I either need to do that or provide a hint to the dispatch system.
Any ideas?
As you noticed, it doesn't want an Object[], it wants a String[]. object-array does exactly what it says: it makes an array of objects. If you want to create an array with a different component type, make-array and into-array are your friends. For example:
(Paths/get "foo" (into-array String ["bar" "baz"]))
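The String specifier is optional in this case: if you leave out the array's desired type, Clojure uses the type of the first element as the array's component type. With an empty sequence there is no first element to inspect, so pass the type explicitly, e.g. (into-array String []).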
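Seen from the Java side, the same thing looks roughly like the sketch below: the String... parameter is a String[], and an Object[] cannot stand in for it even when it is empty. PathsVarargsDemo and its methods are hypothetical names for illustration only.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathsVarargsDemo {
    // What javac lets you write.
    static Path viaVarargs() {
        return Paths.get("foo", "bar", "baz");
    }

    // What actually gets called: the extra parts travel as a String[].
    static Path viaExplicitArray() {
        return Paths.get("foo", new String[] {"bar", "baz"});
    }

    // The "no extra parts" case still needs an (empty) String[].
    static Path noExtraParts() {
        return Paths.get("foo", new String[0]);
    }

    // An Object[] is not a String[], so the cast fails at run time,
    // which is the ClassCastException from the question.
    static boolean objectArrayIsRejected() {
        Object extras = new Object[0];
        try {
            String[] asStrings = (String[]) extras;
            return asStrings == null; // not reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaVarargs().equals(viaExplicitArray()));
        System.out.println(objectArrayIsRejected());
    }
}
```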

Using type hints in Clojure for Java return values

I'm working on some Java / Clojure interoperability and came across a reflection warning for the following code:
(defn load-image [resource-name]
  (javax.imageio.ImageIO/read
    (.getResource
      (class javax.imageio.ImageIO)
      resource-name)))
=> Reflection warning, clojure/repl.clj:37 - reference to field read can't be resolved.
I'm surprised at this because getResource always returns a URL and I would therefore expect the compiler to use the appropriate static method in javax.imageio.ImageIO/read.
The code works fine BTW so it is clearly finding the right method at run time.
So two questions:
Why is this returning a reflection warning?
What type hint do I need to fix this?
AFAICS this has nothing to do with your code or its compilation. The warning comes from the source-fn function of the REPL:
...
(let [text (StringBuilder.)
      pbr (proxy [PushbackReader] [rdr]
            (read [] (let [i (proxy-super read)]
                       (.append text (char i))
                       i)))]
...
and used to display source code in the REPL shell, AFAICT.
For others who find this post (as I did) when wondering why they get reflection warnings when using proxy-super...
Every proxy method has an implicit this first arg, which, alas, is not type-hinted (presumably because there are a number of possible types being implemented by the proxy and the resultant proxy class is created later).
So, if you ever call methods on this from inside the proxy (which is what proxy-super ends up doing), then you'll see reflection warnings.
The simple solution is to just wrap your code in a let that uses type-hinting. E.g.:
(let [^SomeClass this this]
  (proxy-super foo)
  (.bar this))

How to determine which classes are referenced in a compiled .Net or Java application?

I wonder if there's an easy way to determine which classes from a library are "used" by a compiled .NET or Java application, and I need to write a simple utility to do that (so using any of the available decompilers won't do the job).
I don't need to analyze different inputs to figure out if a class is actually created for this or that input set - I'm only concerned whether or not the class is referenced in the application. Most likely the application would subclass from the class I look for and use the subclass.
I've looked through a bunch of .Net .exe's and Java .classes with a hex editor and it appears that the referenced classes are spelled out in plaintext, but I am not sure if it will always be the case - my knowledge of MSIL/Java bytecode is not enough for that. I assume that even though the application itself can be obfuscated, it'll still have to call the library classes by the original name?
Extending what overslacked said.
EDIT: For some reason I thought you asked about methods, not types.
Types
Like finding methods, this doesn't cover access through the Reflection API.
You have to locate the following in a Reflector plugin to identify referenced types and perform a transitive closure:
Method parameters
Method return types
Custom attributes
Base types and interface implementations
Local variable declarations
Evaluated sub-expression types
Field, property, and event types
If you parse the IL yourself, all you have to process from the main assembly is the TypeRef and TypeSpec metadata, which is pretty easy (of course, I'm speaking from the experience of parsing the entire byte code here). However, the transitive closure would still require you to process the full byte code of each referenced method in the referenced assembly (to get the subexpression types).
Methods
If you can write a plugin for Reflector to handle the task, it will definitely be the easiest way. Parsing the IL is non-trivial, though I've done it now so I would just use that code if I had to (just saying it's not impossible). :D
Keep in mind that you may have method dependencies you don't see on the first pass that neither method mentioned will catch. These are due to indirect dispatch via the callvirt (virtual and interface method calls) and calli (generally delegates) instructions. For each type T created with newobj and for each method M within the type, you'll have to check all callvirt, ldftn, and ldvirtftn instructions to see if the base definition for the target (if the target is a virtual method) is the same as the base method definition for M in T or M is in the type's interface map if the target is an interface method. This is not perfect, but it is about the best you can do for static analysis without a theorem prover. It is a superset of the actual methods that will be called outside of the Reflection API, and a subset of the full set of methods in the assembly(ies).
For .NET: it looks like there's an article on MSDN that should help you get started. For what it's worth, for .NET the magic Google words are ".net assembly references".
In Java, the best mechanism to find class dependencies (in a programmatic fashion) is through bytecode inspection. This can be done with libraries like BCEL or (preferably) ASM. If you wish to parse the class files with your own code, the class file structure is well documented in the Java VM specification.
Note that class inspection won't cover runtime dependencies (like classes loaded using the service API).
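If you go the "parse it yourself" route on the Java side, a rough sketch using only the JDK could look like the following. It reads the constant pool of a class file, whose layout is taken from the class file format in the JVM specification, and collects every CONSTANT_Class entry. ClassRefs and its methods are hypothetical names; for anything beyond a quick utility, ASM is probably the more robust choice.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Set;
import java.util.TreeSet;

class ClassRefs {
    // Returns the internal names (e.g. "java/util/List") found in the class's constant pool.
    // Array types show up as descriptors such as "[Ljava/lang/Object;".
    static Set<String> referencedClasses(Class<?> type) {
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream raw = type.getResourceAsStream(resource)) {
            if (raw == null) {
                throw new IllegalArgumentException("no class file found for " + type);
            }
            return parse(new DataInputStream(raw));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Set<String> parse(DataInputStream in) throws IOException {
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count];
        for (int i = 1; i < count; i++) { // constant pool indices start at 1
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                        // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break;    // Class
                case 8: case 16: case 19: case 20: skip(in, 2); break;        // String, MethodType, Module, Package
                case 15: skip(in, 3); break;                                  // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12:
                case 17: case 18: skip(in, 4); break;                         // Integer, Float, refs, NameAndType, (Invoke)Dynamic
                case 5: case 6: skip(in, 8); i++; break;                      // Long, Double occupy two slots
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int index : classNameIndex) {
            if (index != 0) {
                names.add(utf8[index]);
            }
        }
        return names;
    }

    private static void skip(DataInputStream in, int bytes) throws IOException {
        in.readFully(new byte[bytes]);
    }

    public static void main(String[] args) {
        referencedClasses(java.util.ArrayList.class).forEach(System.out::println);
    }
}
```

The result includes the class itself, its superclass and interfaces, and anything else named in the constant pool. As noted above, classes that are only loaded by name at run time will not appear.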
