I am implementing a Spark process in Java and want to create, from an RDD of the same parameterized type, a Dataset<Try<MyPojo>>, where MyPojo is a class of my own and Try is the Scala Try. In Scala the encoder would be derived implicitly, but in Java I need to provide it explicitly.
Now, I can get a working Encoder<MyPojo> using Encoders.bean(MyPojo.class). I expected there to be some code, used by the Scala implicit, that builds an Encoder<Try<T>> from an Encoder<T>, but I cannot find it.
[Note: I just tried in Scala and no implicit was found for type Try, so the question applies to Scala too.]
So, how am I supposed to do this?
After some searching I reached the conclusion that
it is not possible (or maybe it is, but it would be overly complicated)
and that this is because it is not the way Dataset is meant to be used.
At first, I considered Dataset to be a super, more generic version of RDD. But it is not. Actually, it is less generic with respect to type, because the type stored in a Dataset should be "flat" or "flattenable".
Traditional POJOs either have a flat structure (each field has a value type that can be represented by one column) or can be flattened when a field has a POJO type. On the other hand, there is no trivial way to "flatten" a type such as Try, which is basically either some type (MyPojo in the question) or an Exception.
That conclusion also applies to all non-POJO types, such as interfaces, which can have several implementations. Obviously this leads to a question: what about classes that are not POJOs, e.g. because they contain a field of type Try or of an interface type? Encoders.bean would probably fail at runtime. So much for type safety...
Well, in conclusion, to solve my problem, which is keeping track of failed items, I think I will add an "error" column, or something like that.
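A minimal sketch of what such an "error" column could look like as a plain bean. MyPojoResult, its fields, and the single name field on MyPojo are hypothetical, and the assumption (not verified here) is that Encoders.bean(MyPojoResult.class) would treat it like any other nested bean. The Spark call itself is left out so the snippet runs without Spark on the classpath.

```java
import java.io.Serializable;

// Hypothetical stand-in for the question's MyPojo; the field is illustrative only.
class MyPojo implements Serializable {
    private String name;

    public MyPojo() {}
    public MyPojo(String name) { this.name = name; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Hypothetical flat replacement for Try<MyPojo>: one of value/error is set.
class MyPojoResult implements Serializable {
    private MyPojo value;  // null when processing failed
    private String error;  // null when processing succeeded

    public MyPojoResult() {}

    static MyPojoResult success(MyPojo value) {
        MyPojoResult r = new MyPojoResult();
        r.setValue(value);
        return r;
    }

    static MyPojoResult failure(Throwable cause) {
        MyPojoResult r = new MyPojoResult();
        r.setError(cause.toString()); // keep only a flat, column-friendly description
        return r;
    }

    public MyPojo getValue() { return value; }
    public void setValue(MyPojo value) { this.value = value; }
    public String getError() { return error; }
    public void setError(String error) { this.error = error; }

    // Deliberately not named isSuccess(), so it is not picked up as a bean property.
    boolean succeeded() { return error == null; }
}
```

The exception is reduced to a string here because a Throwable is no easier to flatten than Try itself.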
Related
Is there any way to customise the accessor strategy used in clojure.java.data/from-java? from-java is part of the java.data function lib.
I recently updated a third-party Java library that used to follow the JavaBean get and set pattern. However, after the update they went from getProperty() to property()...
I guess this change renders the from-java function not suitable in this case, no surprise since the objects are no longer proper JavaBeans.
Is there any way of making from-java aware of this accessor pattern, or are there any other recursive mapping mechanisms that support this?
from-java is a multimethod, so you can override it for any class you like. There is no mechanism for teaching it an alternate naming convention (and if there were such a mechanism, I imagine it would have trouble with "every method with any name at all represents a property"). Therefore you'll have to write manual conversions, but at least the recursion will be handled for you.
It seems you will have to extend the multimethod to support the classes yourself, however, you can probably use reflection (slow, I know) to build something very generic:
For a given object instance, find its class, then the class's declared fields (getDeclaredFields), and from each field get its name and type.
For the same instance, use .getDeclaredMethod or .getDeclaredMethods to find methods for the given name that take no params (use an empty array for this). Those methods should be the new "getters" and you can call these in your instance to extract the values.
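The two reflection steps above could look roughly like this on the Java side. FluentAccessorMapper and Account are hypothetical names, the mapping is only one level deep (no recursion), and it assumes every accessor has exactly the same name as its backing field.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reads property()-style accessors via reflection.
class FluentAccessorMapper {

    // For each declared instance field, look for a no-arg method with the
    // same name and store its result under the field name.
    static Map<String, Object> toMap(Object instance) {
        Class<?> type = instance.getClass();
        Map<String, Object> result = new LinkedHashMap<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                // No parameter types given: only a zero-argument method matches.
                Method accessor = type.getDeclaredMethod(field.getName());
                accessor.setAccessible(true);
                result.put(field.getName(), accessor.invoke(instance));
            } catch (NoSuchMethodException e) {
                // Field without a same-named accessor: skip it.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot read " + field.getName(), e);
            }
        }
        return result;
    }
}

// Hypothetical class following the property() naming style.
class Account {
    private final String owner;
    private final int balance;

    Account(String owner, int balance) {
        this.owner = owner;
        this.balance = balance;
    }

    public String owner() { return owner; }
    public int balance() { return balance; }
}
```

A from-java method for the affected classes could then delegate to a helper like this one.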
Use the Cognitect aws-api instead :)
While working on a program, I wondered whether it is possible to add elements of multiple types, e.g. Integer, String, Long, to a List without making it accept everything of type Object.
I want to restrict the list to accept elements of only these three types. Is this possible?
There are a few solutions which I don't want to use:
1) We can create a Pojo having all these three types as elements and insert that pojo.
2) A base class implementing datatype specific wrapper classes. In this case, user will know this abstraction while creating objects of different classes.
Can this be done in a better and more generic way?
What you can do as an alternative, is implement your own type of collection. This could have 3 add-methods accepting the different kinds of types you want and the default add-method should throw an UnsupportedOperationException.
This however might not be the ideal solution, since it might introduce bugs if you don't have a full understanding of how the collection you are implementing/extending should work internally.
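A minimal sketch of that idea, assuming AbstractList as the base class (the answer does not name one) and a hypothetical class name. The generic add(Object) still exists because the List contract requires it, so the restriction is enforced partly by overload resolution at compile time and partly by an exception at runtime.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

// Hypothetical list that accepts only Integer, String and Long elements.
class IntStringLongList extends AbstractList<Object> {
    private final List<Object> items = new ArrayList<>();

    // One overload per permitted element type.
    public boolean add(Integer value) { return items.add(value); }
    public boolean add(String value)  { return items.add(value); }
    public boolean add(Long value)    { return items.add(value); }

    // Anything that is not statically one of the three types ends up here.
    @Override
    public boolean add(Object value) {
        throw new UnsupportedOperationException(
            "Only Integer, String and Long are accepted");
    }

    @Override
    public Object get(int index) { return items.get(index); }

    @Override
    public int size() { return items.size(); }
}
```

Calls such as list.add(1), list.add("two") and list.add(3L) pick the typed overloads, while list.add(2.5) compiles but throws. A value held in a variable of static type Object is rejected even if it is a String at runtime, which is one of the surprises the answer warns about.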
You have Number to use with, well, any kind of number.
But there is no relationship between String and Number.
So, List<Object> is your only choice there.
And keep in mind: when you really have to deal with different types of elements - maybe List isn't the correct abstraction to use?!
In other words: if you have values of different types that belong together, you should rather consider creating a specific class to wrap around those values.
A better solution would be to use Map<TypeOfData, List<Type>> rather than hacking List to achieve it.
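One possible reading of that suggestion, sketched with the Class token standing in for TypeOfData. TypedBuckets and its method names are hypothetical; the allowed types are the three from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical container: one list per permitted element type.
class TypedBuckets {
    private static final Set<Class<?>> ALLOWED = new HashSet<>(
        Arrays.<Class<?>>asList(Integer.class, String.class, Long.class));

    private final Map<Class<?>, List<Object>> buckets = new HashMap<>();

    // The Class token ties the compile-time type of the value to its bucket.
    <T> void add(Class<T> type, T value) {
        if (!ALLOWED.contains(type)) {
            throw new IllegalArgumentException("Type not allowed: " + type.getName());
        }
        buckets.computeIfAbsent(type, k -> new ArrayList<>()).add(value);
    }

    // Returns a typed copy of the bucket; empty if nothing was stored for the type.
    <T> List<T> get(Class<T> type) {
        List<T> out = new ArrayList<>();
        for (Object o : buckets.getOrDefault(type, Collections.emptyList())) {
            out.add(type.cast(o));
        }
        return out;
    }
}
```

The trade-off compared with a single list is that insertion order across different types is lost.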
Although there are probably some hacked-together workarounds that may help you do this, the simple answer to your question is "no, this can't be done". Part of programming is using the right tool for the right job. In this case, it's almost certain that a List is the wrong tool to help you do this job.
I'm heavily using the java.lang.Class.getField() method, which requires a String argument. The problem I'm facing is that when I rename a field that getField() refers to, Eclipse doesn't warn me that the argument now points nowhere (since it's just a String), and I end up with methods that silently work improperly.
So far I can see two ways out. One is wrapping every getField() call in a try-catch block and running the application to see which line throws next, fixing it, and watching out for the next exception. The other is using Find/Replace every time I change a field name to manually look for the String value and replace it. Is there a friendlier (i.e. automatic) way to update String parameters in such cases?
Maybe there's a method (which I fail to find) that accepts a full field path as a non-String argument and returns a Field object? Something like turnToFieldObject(car.speed) returning Field object corresponding to speed field so that Eclipse would automatically check if there's such a field car.speed.
PS
First of all, thank you for your replies.
I can see that a lot of you suggest that I'm using reflection too much. That's why I feel I need to add some extra explanation, and I would be glad to hear suggestions as well.
I'm doing research on modeling social evolution and I need the entities to evolve new features that they don't have at the start. It seemed to me that adding new fields to represent evolutionary changes is easier to understand than adding new elements to arrays or collections. And the task implies that I cannot know in advance which feature will evolve. That's why I rely so heavily on reflection.
AFAIK, there is no such method. You pass a reference (if it's an object) or value (if it's primitive); all data about the variables that they were originally assigned to is not available at runtime.
This is the huge downside of using reflection, and if you're "heavily" using this feature in such way, you're probably doing something wrong. Why not access the field directly, using getters and setters?
Don't get me wrong, reflection has its uses (for example, when you want to scan for fields with certain annotations and inject their values), but if you're referencing fields or methods by their name using a simple string, you could just as well access fields or methods directly. It implies that you know the field beforehand. If it's private, there is probably a reason why it's encapsulated. You're losing the content assist and refactoring possibilities by overusing reflection.
If you're modeling social evolution, I'd go with a more flexible solution. Adding new fields at runtime is (near?) impossible, so you are basically forced to implement a new class for each entity and create a new object each time the entity "evolves". That's why I suggest you go with one of these solutions:
Use Map<String, Object> to store entities' properties. This is a very flexible solution which will allow you to easily add and remove "fields" at the cost of losing their type data. Checking if the entity has a certain property will be a cheap containsKey call.
If you really want to stick to a million custom classes, use interfaces with getters and setters in addition to fields. For example, convert private String name to interface Named { String getName(); void setName(String name); }. This is much easier to refactor and does not rely on reflection. A class can implement as many interfaces as you want, so this is pretty much like the field solution, except it allows you to create custom getters/setters with extra logic if desperately needed. And determining if an entity has a certain property is an entity instanceof MyInterface call, which is still cheaper than reflection.
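Both options could be sketched as follows. EvolvingEntity, Villager and the feature names are hypothetical; only the Named interface comes from the text above.

```java
import java.util.HashMap;
import java.util.Map;

// Option 1: properties live in a map, so features can appear at runtime.
class EvolvingEntity {
    private final Map<String, Object> properties = new HashMap<>();

    void evolve(String feature, Object value) { properties.put(feature, value); }

    boolean has(String feature) { return properties.containsKey(feature); }

    Object get(String feature) { return properties.get(feature); }
}

// Option 2: one small interface per feature, checked with instanceof.
interface Named {
    String getName();
    void setName(String name);
}

class Villager implements Named {
    private String name;

    @Override
    public String getName() { return name; }

    @Override
    public void setName(String name) { this.name = name; }
}
```

With the map, a feature unknown at compile time is just a new key, e.g. entity.evolve("literacy", 0.3). With interfaces, the set of possible features has to be known when the code is written, which may not fit the research setting described in the question.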
I would suggest writing a single method that you use to get all your fields: supply it a string, and if the exception is thrown, notify whatever needs to be notified that the name was not valid; if no exception is thrown, return the field.
Although I do agree with the above that reflection should not be used heavily.
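A small sketch of such a method. FieldLookup and Car are hypothetical names, and "notifying" is modeled here as an unchecked exception with a descriptive message, which is only one of several ways to do it.

```java
import java.lang.reflect.Field;

// Hypothetical helper that centralizes every by-name field lookup.
class FieldLookup {

    static Field find(Class<?> type, String fieldName) {
        try {
            // Same lookup rules as Class.getField: public fields only.
            return type.getField(fieldName);
        } catch (NoSuchFieldException e) {
            // Single place to report a stale field name.
            throw new IllegalArgumentException(
                "No public field '" + fieldName + "' on " + type.getName(), e);
        }
    }
}

// Hypothetical class with a public field, as in the question's car.speed example.
class Car {
    public int speed;
}
```

This does not make renames safe, but it removes the try-catch around every call site and makes a stale name fail loudly with the class and field in the message.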
I'm using Java 6.
Suppose I have a file availableFruits.txt
APPLE
ORANGE
BANANA
Suppose I want an enum FruitType that contains values listed in availableFruits.txt, will I be able to do this?
You can't populate an enum type at execution time, no - at least, not without something like BCEL, or by calling the Java compiler.
You can write code to create a Java source file, of course, and build that when you build your app, if you don't need it to be changed afterwards.
Otherwise, I'd just create a wrapper class which is able to take a set of known values and reuse them. Exactly what you need to do will depend on how you wanted to use the enum, of course.
Well, the point of an Enum is to use it at compile time.
If you don't know at compile time what values your Enum has, then it's not an Enum, it's a collection.
If you do know and you just want to create a class file based on the values in the text file, then yes, it's possible by reading the text file and then generating the source code.
I expect it's possible, by writing your own ClassLoader subclass, creating the bytecode for the enum in a byte array, and using defineClass. Hard, maybe, but possible. I expect once you know the byte sequence for an enum, it's not that hard to custom-generate it from the info in the JVM spec.
Now, whether it's a good idea...well, I suspect only in a very small number of edge cases. (I can't think of one; I mean, having created it, you'd have to generate code to use it, right?) Otherwise, you're probably better off with a Map or similar.
No, not unless you generate the enum source file from the text file.
As everyone else said- no. It's not possible. Your best shot is to use the Registry pattern. Read in the values, store them in some sort of query-able map. Sort of like an Enum.
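A minimal sketch of such a registry, written without Java 7+ syntax since the question mentions Java 6. The class layout and method names are hypothetical; the names passed to register would be the lines read from availableFruits.txt.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical enum-like type whose instances are registered at runtime.
final class FruitType {
    private static final Map<String, FruitType> REGISTRY =
        new LinkedHashMap<String, FruitType>();

    private final String name;

    private FruitType(String name) { this.name = name; }

    // Registers each name once; repeated names keep their first instance.
    static synchronized void register(List<String> names) {
        for (String n : names) {
            if (!REGISTRY.containsKey(n)) {
                REGISTRY.put(n, new FruitType(n));
            }
        }
    }

    // Mirrors Enum.valueOf: unknown names are rejected.
    static synchronized FruitType valueOf(String name) {
        FruitType type = REGISTRY.get(name);
        if (type == null) {
            throw new IllegalArgumentException("Unknown fruit: " + name);
        }
        return type;
    }

    static synchronized Collection<FruitType> values() {
        return Collections.unmodifiableCollection(
            new ArrayList<FruitType>(REGISTRY.values()));
    }

    String name() { return name; }

    @Override
    public String toString() { return name; }
}
```

Because each name maps to exactly one instance, values can be compared with ==, as with a real enum. What is lost is compile-time checking of the constant names and use in switch statements.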
As everyone pointed out, it's not possible. However, you could create a Map where the key would be the name you read from your file (APPLE, ORANGE, BANANA) and the value would be an associated value (an int, for example).
This way you could basically achieve the same goal, without the type safety, of course.
int i = fruitsMap.get("BANANA"); // get the associated value
You can with dynamically generated code. e.g. Using the Compiler API. I have written a wrapper for that API so you can compile classes in memory. See the code below.
The problem you have is that it's not very useful, as you cannot use these values except in classes which were compiled AFTER your enum was compiled. You can use Enum.valueOf() etc., but a lot of the value of enums is lost.
As others have suggested, using a Map would be simpler and give the same benefit. I would only use the enum if you have a library that has to be passed an Enum (or if you plan more generated code).
public static Class generateEnum(String className, List<String> enums) {
    StringBuilder code = new StringBuilder();
    // The declaration takes the simple name; the package statement supplies "enums."
    code.append("package enums; public enum " + className + " {\n");
    for (String s : enums)
        code.append("\t" + s + ",\n");
    code.append("}");
    // The fully qualified name is only used when loading the compiled class.
    return CompilerUtils.CACHED_COMPILER
            .loadFromJava("enums." + className, code.toString());
}
One of the things I find useful with text-generated code is that you can write it to a file and debug it even at run time. (The library supports this.) If you use byte code generation, it's harder to debug.
The library is called Essence JCF. (And it doesn't require a custom class loader)
How would you do this in a dynamic language like JavaScript? It would be just a string with one of the values: "APPLE", "ORANGE", "BANANA".
Java types (classes, interfaces, enums) exist only for the compiler to do some optimizations and type checking, to make refactoring possible, etc. At runtime you need neither optimizations, type checking nor refactoring, so a normal "string" is OK, just like in JavaScript, where every object is either a number (Double in Java), a string (String in Java) or a complex object (Map in Java). That's all you need to do anything at runtime, even in Java.
After using C++ I got used to the concept of Identifier which can be used with a class for the type, provides type safety and has no runtime overhead (the actual size is the size of the primitive). I want to do something like that, so I will not make mistakes like:
personDao.find(book.getId());//I want compilation to fail
personDao.find(book.getOwnerId());//I want compilation to succeed
Possible solutions that I don't like:
For every entity have an entity id class wrapping the id primitive. I don't like the code bloat.
Create a generic Identifier class. Code like this will not compile:
void foo(Identifier<Book> book);
void foo(Identifier<Person> person);
Does anyone know of a better way?
Is there a library with a utility such as this?
Is implementing this an overkill?
And the best of all, can this be done in Java without the object overhead like in C++?
A more Java-esque and more object-oriented version would be:
personDao.findByBookOwner(book);
Inside each such method, the DAO would extract the id it needs. This is the most object-oriented way of creating an API.
When using an object relational mapper such as hibernate, objects not (yet) loaded from the database are usually represented with lazy loading proxies. These implement the interface of the actual entity, and transparently load the entity from the database when any method is invoked on it.
For instance, one could use these as follows:
Person person = session.get(Person.class, someId);
Person spouse = person.getSpouse(); // proxy, unless configured otherwise
Task t = new TalkToSpouseTask(spouse);
session.save(t);
With this code, the spouse is not loaded from the database.
"And the best of all, can this be done in Java without the object overhead like in C++?"
Nope, you always pay the object tax in Java (unless you use primitive types, which isn't type safe). However, I have yet to see a business application where that overhead matters.
Java is pretty restrictive when it comes to respecting OO, and I don't think there's any equivalent to typedef, which C++ seems to have inherited from C. The only way I can think of is using wrappers.
As for:
void foo(Identifier<Book> book);
void foo(Identifier<Person> person);
it won't work because generics are only used at the compiler level. They get erased after the compilation step, so both those functions would become:
void foo(Identifier param);
where Identifier is the raw type (without generics), and they would be indistinguishable.
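Erasure only rules out the two overloads; a generic identifier still gives the compile-time check from the question as long as each method has a distinct name or parameter count. A minimal sketch, in which all class and method bodies are hypothetical:

```java
// Hypothetical generic id; the type parameter exists only at compile time.
final class Identifier<T> {
    private final long value;

    private Identifier(long value) { this.value = value; }

    static <T> Identifier<T> of(long value) { return new Identifier<T>(value); }

    long value() { return value; }

    // Note: after erasure, ids of different entity types with the same
    // number compare equal at runtime.
    @Override
    public boolean equals(Object o) {
        return o instanceof Identifier && ((Identifier<?>) o).value == value;
    }

    @Override
    public int hashCode() { return Long.hashCode(value); }
}

class Person {}

class Book {
    private final Identifier<Book> id;
    private final Identifier<Person> ownerId;

    Book(Identifier<Book> id, Identifier<Person> ownerId) {
        this.id = id;
        this.ownerId = ownerId;
    }

    Identifier<Book> getId() { return id; }
    Identifier<Person> getOwnerId() { return ownerId; }
}

class PersonDao {
    // A single, non-overloaded method: no erasure clash.
    String find(Identifier<Person> id) {
        return "person#" + id.value(); // placeholder for a real lookup
    }
}
```

With these declarations, personDao.find(book.getOwnerId()) compiles, while personDao.find(book.getId()) is rejected by the compiler because Identifier<Book> is not an Identifier<Person>. Each id is still a heap object, so this does not remove the object overhead discussed above.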
The short answer is no, if you want type safety, you need the Object overhead (beyond varying your primitives if possible (e.g. one is a float and the other an int)).
When you program in Java you don't worry about Object overhead until it matters - that is when you see a problem profiling the implementation. Most of the time the JIT gets rid of problems, or it just isn't a problem to begin with.
Occasionally it does matter, but don't try too hard to guess prematurely where; you will most likely be wrong.
I see that there is no good way of doing what I wanted. Even if I use a class hierarchy, I have to manually integrate it with automatic id generators (like JPA annotations). Too much work. I will just have to be careful about that. A possible solution would be annotations plus inspection, something like what IntelliJ does with @Nullable and @NotNull, but I will not implement something like this myself.