I'm looking for a Java library that can help me parse a CSV file containing pipe-delimited records and create instances of my bean class from them.
I've looked into several alternatives such as SuperCSV, OpenCSV, BeanIO, JFileHelper, jsefa, ... but none of them seems to have what it takes.
Requirements of the library:
1. support records with a variable number of fields
2. provide iterator-style access so the file is never loaded entirely into memory
3. support mapping a field to its actual type, i.e. be able to take a date field and put it into my bean as a java.util.Date instead of a string
4. let me supply my own factory object to create the beans, instead of defaulting to Class.newInstance()
All the libraries I've looked into seem to lack requirement #4.
I can live with reflection, but the problem is that these libraries still create a new bean object for every line in the CSV file. Since the only thing I want to do with my bean at this point is pass it to my persistence layer and store it in the DB, it makes sense to put a couple of bean instances into a pool and create a factory that takes instances from this pool. This way I can re-use my instances, and parsing a 100,000-line CSV file won't result in 100,000 instances living in memory until the GC comes along.
Does anyone know of a library that can handle all these requirements?
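To make requirement #4 concrete, here is a minimal hand-rolled sketch of the pooling idea, not the API of any of the libraries mentioned. PersonBean, BeanPool and PooledParsingDemo are hypothetical names, and the pool is deliberately simple (single-threaded, unbounded). It reads one line at a time, so the file is never held in memory as a whole.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

/** Hypothetical bean; a real one would carry the typed fields of a record. */
class PersonBean {
    String name;
    String city;
}

/** Minimal, non-thread-safe pool: hands out recycled instances before creating new ones. */
class BeanPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> creator;
    private int created;

    BeanPool(Supplier<T> creator) {
        this.creator = creator;
    }

    T acquire() {
        T bean = idle.poll();
        if (bean == null) {
            created++;
            bean = creator.get();
        }
        return bean;
    }

    void release(T bean) {
        idle.push(bean);
    }

    int createdCount() {
        return created;
    }
}

class PooledParsingDemo {
    /** Reads pipe-delimited lines one at a time; returns how many beans were instantiated. */
    static int parse(BufferedReader reader, BeanPool<PersonBean> pool) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            String[] fields = line.split("\\|", -1); // -1 keeps trailing empty fields
            PersonBean bean = pool.acquire();
            bean.name = fields[0];
            bean.city = fields.length > 1 ? fields[1] : null; // records may be shorter
            // ... hand the bean to the persistence layer here ...
            pool.release(bean); // only safe once nothing else references the bean
        }
        return pool.createdCount();
    }

    public static void main(String[] args) throws IOException {
        BeanPool<PersonBean> pool = new BeanPool<>(PersonBean::new);
        BufferedReader reader =
                new BufferedReader(new StringReader("Ann|Oslo\nBob|Rome|extra\nCy"));
        System.out.println("beans created: " + parse(reader, pool));
    }
}
```

A library that accepts an external factory could call acquire() where it would otherwise call Class.newInstance(). The catch is the release step: it only works if the persistence layer copies the data out and does not keep a reference to the bean.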
This might be an alternative: https://github.com/org-tigris-jsapar/jsapar
It will probably fall short on requirement #4 though.
Here, you can find a more comprehensive list of alternatives: https://org-tigris-jsapar.github.io/jsapar/links
EDIT
As of jsapar version 1.8, it is possible to customize Java object creation in an external factory class, so I guess requirement #4 is now satisfied as well.
Related
I'm having a problem when serializing and deserializing my objects in my project. I'm writing the object to a name.dat file.
However, whenever I make a change in the Name class I can no longer deserialize it, since the old and new versions of the class no longer match.
Is there any way around this?
Your best options are:
1) Don't change your classes :-)
2) Throw away any serialized objects each time you change your classes.
3) Don't use Java object serialization.
Given that 1) and 2) are probably out of the question, option 3) should be given serious consideration. There are a variety of alternatives to Java serialization, depending on the nature of the data you are persisting. These include:
Using Java properties files
Storing the data in a classical database (using SQL and the JDBC API)
Using an object-relational database mapping such as Hibernate
Using XML or JSON and a "binding" technology so that you can serialize / deserialize POJOs.
Finally, it is possible to implement class versioning using Java object serialization. However, it is tricky. And if you are continually changing the classes, then it is not going to be pleasant. Start by reading Versioning of Serializable Objects.
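As a starting point for the versioning route, the sketch below pins serialVersionUID explicitly. The Name class here is a hypothetical stand-in for the one in the question, and NameStore writes to a byte array instead of name.dat only to keep the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for the Name class from the question. */
class Name implements Serializable {
    // Pinned explicitly. Without this line the JVM computes a value from the class
    // structure, so editing the class can change it, and old data is then rejected
    // with an InvalidClassException.
    private static final long serialVersionUID = 1L;

    private final String first;
    private final String last;
    // Imagine this field was added in a later version. Adding a field is a compatible
    // change: when older data is read, the field is left at its default value (null).
    private String nickname;

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    String getFirst() { return first; }
    String getLast() { return last; }
    String getNickname() { return nickname; }
    void setNickname(String nickname) { this.nickname = nickname; }
}

class NameStore {
    static byte[] write(Name name) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(name);
        }
        return buffer.toByteArray();
    }

    static Name read(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Name) in.readObject();
        }
    }
}
```

This only covers compatible changes such as adding fields. Removing fields, changing field types or moving the class still needs the techniques described in the versioning document.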
Just another Java problem (I'm a noob, I know): is it possible to use dynamic property binding in a Custom Control with a dynamic property getter in a Java bean?
I'll explain. I use this feature extensively in my Custom Controls:
<xp:inputTextarea id="DF_TiersM">
<xp:this.value><![CDATA[#{compositeData.dataSource[compositeData.fieldName]}]]></xp:this.value>
This is used in a control where both datasource and the name of the field are passed as parameters. This works, so far so good.
Now, in some cases, the datasource is a managed bean. When the above lines are interpreted, apparently code is generated to get or set the value of ... something. But what exactly?
I get this error: Error getting property 'SomeField' from bean of type com.sjef.AnyRecord which I guess is correct for there is no public getSomeField() in my bean. All properties are defined dynamically in the bean.
So how can I make XPages read the properties? Is there a universal getter (and setter) that allows me to use the name of a property as a parameter instead of the inclusion in a fixed method name? If XPages doesn't find getSomeField(), will it try something else instead, e.g. just get(String name) or so?
As always: I really appreciate your help and answers!
The way the binding works depends on whether or not your Java object implements a supported interface. If it doesn't (if it's just some random Java object), then any properties are treated as "bean-style" names, so that, if you want to call ".getSomeField()", then the binding would be like "#{obj.someField}" (or "#{obj['someField']}", or so forth).
If you want it to fall back to a common method, that's a job for either the DataObject or Map interfaces - Map is larger to implement, but is more standard (and you could inherit from AbstractMap if applicable), while DataObject is basically an XPages-ism but one I'm a big fan of (for reference, document data sources are DataObjects). Be warned, though: if you implement one of those, EL will only bind to the get or getValue method and will ignore normal setters and getters. If you want to use those when present, you'll have to write reflection code to do that (I recommend using Apache BeanUtils).
I have a post describing this in more detail on my blog: https://frostillic.us/f.nsf/posts/expanding-your-use-of-el-%28part-1%29
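A minimal sketch of the Map route, using only java.util so it runs outside XPages. AnyRecord stands in for the bean from the question and the backing LinkedHashMap is an assumption; a DataObject version would expose the same lookup through its own methods.

```java
import java.util.AbstractMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical record whose properties are defined at runtime rather than as getters. */
class AnyRecord extends AbstractMap<String, Object> {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    // For a Map, EL resolves obj['SomeField'] through get("SomeField")
    // instead of looking for getSomeField().
    @Override
    public Object get(Object key) {
        return fields.get(key);
    }

    // Called when a value is written back through the binding.
    @Override
    public Object put(String key, Object value) {
        return fields.put(key, value);
    }

    // The only method AbstractMap requires; size(), containsKey() etc. build on it.
    @Override
    public Set<Map.Entry<String, Object>> entrySet() {
        return fields.entrySet();
    }
}
```

With a bean like this as the data source, the existing expression #{compositeData.dataSource[compositeData.fieldName]} should end up in get() and put() with the field name as the key, which is the "universal getter" the question asks about.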
I have an application that saves its context to XML. In this application, there is a hierarchy of classes, that all implement a common interface, and that represent different settings. For instance, a first setting class may be made of 4 public float fields, another one can be made of a sole HashMap.
I am trying to determine what is the best way to handle writing and reading to XML in a generic way. I read on this site a lot about JAXB and XStream for instance, which are able to make a specific class instance from XML.
However, my question is related to the fact that the actual class can be anything that implements a given interface. When you read the XML file, how would you guess the actual class to instantiate from the XML data? How do you do that in your applications?
I thought that I could write the .class name in an XML attribute, read it and compare it to all possible class .class names, until I find a match. Is there a more sensible way?
Thanks
XStream should already take care of this and create an object of the correct type.
The tutorial seems to confirm that:
To reconstruct an object, purely from the XML:
Person newJoe = (Person)xstream.fromXML(xml);
If you don't know the type, you will have to first assign it to the common interface type:
CommonInterface newObject = (CommonInterface)xstream.fromXML(xml);
// now you can either check its type or call virtual methods
In my case I just have a kind of header that stores the name of the class that was serialized. When deserializing, I use the header value to figure out which class to deserialize the values into.
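The header approach can be sketched with plain reflection, leaving the XML reading and writing aside. Setting, VolumeSetting and SettingLoader are hypothetical names, and the Map of strings stands in for whatever values were parsed from the XML.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical common interface implemented by every settings class. */
interface Setting {
    Map<String, String> toValues();

    void fromValues(Map<String, String> values);
}

/** Hypothetical concrete setting with a single float field. */
class VolumeSetting implements Setting {
    float level;

    @Override
    public Map<String, String> toValues() {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("level", Float.toString(level));
        return values;
    }

    @Override
    public void fromValues(Map<String, String> values) {
        level = Float.parseFloat(values.get("level"));
    }
}

class SettingLoader {
    /**
     * Instantiates the class named in the header. asSubclass rejects any class that
     * does not implement Setting, so no comparison against a list of names is needed.
     */
    static Setting load(String className, Map<String, String> values)
            throws ReflectiveOperationException {
        Class<? extends Setting> type = Class.forName(className).asSubclass(Setting.class);
        Setting setting = type.getDeclaredConstructor().newInstance();
        setting.fromValues(values);
        return setting;
    }
}
```

When writing, the header would be setting.getClass().getName(). Since the class name comes from the file, this is only appropriate for files you trust.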
A best practice would be to use an established, well-documented XML parser/mapper. All of the serialization/deserialization work has already been done, so you can focus on your business logic instead. Castor and Apache Axiom are two APIs that I have used to marshal/unmarshal (serialize/deserialize) Java classes and XML.
http://www.castor.org
Apache Axiom
I'm working with an application that uses JBPM 3.1 and MySQL. The core problem is that there are process instances with variables that contain an older version of an external, non-JBPM Serializable class. When the main application is upgraded, these process instances cause an exception to be thrown by JBPM, since the SUID of a specific class has changed in the main application.
I believe I have a method for fixing the deserialization process using the technique described in the following:
How to deserialize an object persisted in a db now when the object has different serialVersionUID
However, my problem is figuring out where in MySQL JBPM stores process instance variables, so I can write a program that iterates over all the variables for all instances and reserializes them so that the offending class has the new SUID and JBPM can operate on the processes.
From my initial look at the JBPM tables, it appears that JBPM_BYTEARRAY and/or JBPM_BYTEBLOCK may be the tables to operate on. However, I'm unsure how to proceed. I'm guessing each process variable is stored in a wrapping container class. Is that class org.jbpm.context.exe.VariableInstance? Or is it something else?
I figure if I have the proper jar files in the class path, and I know what the main class is that JBPM uses to store process variables in MySQL, I can deserialize the class (which will fix the SUID problem with the embedded problem class instance), and reserialize it back. Since the JBPM documentation does mention converters, I'm unsure whether I have to replicate the conversion process JBPM performs when deserializing, or whether standard Java deserialization is enough.
Some analysis of JBPM indicates that binary data may be split across multiple records. This may not be the case for MySQL itself, but the JBPM code is written to support multiple RDBMSs, and some have limits on the size of binary records.
Since the question earned me a tumbleweed reward, I was not going to get a usable MySQL-based answer within the deadline I had to meet, so I reconsidered the core problem and the context in which it occurs, and came up with a solution that avoids the need to perform direct MySQL operations.
The main application in question already has some custom modifications to JBPM, so the solution I implemented alters the JBPM source that performs the deserialization of process instance variables. This avoids the need to deal with the JBPM logic that extracts the binary data from the RDBMS.
In the class org.jbpm.context.exe.converter.SerializableToByteArrayConverter, I modified the code to use a custom ObjectInputStream class that returns the latest SUID of a class. The technique of just replacing the descriptor with the latest version of the class, as described in the post referenced in the question, does not work if the new class includes new fields. Doing so causes an end-of-data exception, since the base deserialization code tries to access the "new" fields in the old, serialized version of the class.
Therefore, I just need to replace the SUID, but keep all other parts of the descriptor the same. Since the JDK does not make ObjectStreamClass extensible, I created a subclass of ObjectInputStream that returns the new SUID based upon a given calling pattern the Java library executes against ObjectInputStream when deserializing data.
The pattern: When reading the header of a deserialized object, the readUTF() function is called (to obtain the class name) followed by a readLong() call. Therefore, if this calling sequence occurs, and if the readUTF() returned the class name I want to change the SUID of, I return the newer SUID in the readLong() call.
The custom code reads a configuration file that specifies class names and associated SUIDs that should be mapped to the latest SUIDs for the classes listed. This allows mapping of alternate classes in the future w/o modifying the custom code.
Note that this approach is applicable to general deserialization operations where one needs to map old SUIDs to the latest SUIDs of specified classes while leaving the other parts of the serialized class descriptor alone, to avoid end-of-data problems if the newer class definition includes additional field declarations not present in the older class definition.
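The readUTF()/readLong() pattern described above could look roughly like the sketch below. It is a reconstruction from the description, not the code that went into JBPM: the class names are hypothetical, the mapping is passed in as a Map rather than read from a configuration file, and it depends on the JDK reading the class name and SUID through those two public methods. ProcessVariable and the byte patching in SuidRemapDemo exist only to fabricate "old" data for the demo.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.Map;

/** Replaces the SUID read from the stream for the class names listed in the map. */
class SuidRemappingInputStream extends ObjectInputStream {
    private final Map<String, Long> latestSuids; // class name -> SUID of the current class
    private String pendingClassName;             // set when readUTF() just returned a mapped name

    SuidRemappingInputStream(InputStream in, Map<String, Long> latestSuids) throws IOException {
        super(in);
        this.latestSuids = latestSuids;
    }

    @Override
    public String readUTF() throws IOException {
        String value = super.readUTF();
        pendingClassName = latestSuids.containsKey(value) ? value : null;
        return value;
    }

    @Override
    public long readLong() throws IOException {
        long value = super.readLong(); // always consume the 8 bytes from the stream
        String name = pendingClassName;
        pendingClassName = null;
        return name != null ? latestSuids.get(name) : value;
    }
}

/** Stand-in for the external class; 2L plays the role of the "new" SUID. */
class ProcessVariable implements Serializable {
    private static final long serialVersionUID = 2L;
    final String value;

    ProcessVariable(String value) {
        this.value = value;
    }
}

class SuidRemapDemo {
    /** Serializes a ProcessVariable, then overwrites the SUID with 1L to imitate old data. */
    static byte[] streamWithOldSuid(String value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new ProcessVariable(value));
        }
        byte[] bytes = buffer.toByteArray();
        // Layout: 4-byte header, TC_OBJECT, TC_CLASSDESC, 2-byte name length, name, 8-byte SUID.
        int suidOffset = 4 + 1 + 1 + 2 + ProcessVariable.class.getName().length();
        ByteBuffer.wrap(bytes).putLong(suidOffset, 1L);
        return bytes;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Long> latest =
                Collections.singletonMap(ProcessVariable.class.getName(), 2L);
        byte[] oldData = streamWithOldSuid("hello");
        try (ObjectInputStream in =
                new SuidRemappingInputStream(new ByteArrayInputStream(oldData), latest)) {
            System.out.println(((ProcessVariable) in.readObject()).value);
        }
    }
}
```

A plain ObjectInputStream rejects the patched bytes with an InvalidClassException, while the remapping stream accepts them because only the SUID is swapped and the rest of the descriptor is left as it was written.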
Do you know whether you made changes that break the contract, or did you simply add new fields? If you simply added new fields, just declare the prior serialVersionUID in the class. Otherwise, you will have to read all the variables that have a different serialVersionUID and save them under the new class, because you are the only person who knows how to convert them.
Is there a way to get properties files as strongly typed classes?
I guess there are code generators but doing it with annotations would be much cooler.
What I mean is;
foo.properties file
keyFoo = valuefoo
keyBar = valuebar
maybe with
#properties(file="foo.properties")
class foo { }
becomes
class foo {
String getKeyFoo() { }
String getKeyBar() { }
}
if not shall I start an open source project for that?
ADDITION TO QUESTION;
Suppose we have a foo.properties file with, let's say, more than 10 entries,
and it is used as a simple configuration file. What I believe is that these configuration entries should be provided to other parts of the design as a configuration class with related getXXX methods. Then the rest of the system accesses the configuration via the provided class instead of dealing with key names, and doesn't need to care where the configuration comes from. You can then replace this class with a mock when you are testing callers, and the dependency on the file system goes away. On the other hand, it is really nice to get all entries in a strongly typed fashion.
So this is a code generation issue behind the scenes; it has nothing to do with runtime. But code generation with an external tool instead of annotations didn't seem nice to me. Although I am not very familiar with annotations, I guess this could be achieved (but I'll keep in mind that annotation processing cannot modify existing classes, as McDowell points out).
There are countless frameworks that achieve that for XML with varying degrees of configuration needed. The standard one bundled with Java is JAXB, but it is not exactly a one-liner XML persistence framework ...
The problem is that a properties file only works better than XML (or JSON, ...) for the most trivial classes. When the class becomes a bit more complex, the properties file becomes a nightmare. Another problem is that for trivial classes there is not much difference between XML and properties.
That means the scope of the project will be rather limited: mostly useful for projects having loads of simple properties files.
In big applications I have worked with, strongly typed reading of properties files is done quite often using a simple factory method.
Foo foo = Foo.loadFrom("foo.properties");
class Foo {
    static Foo loadFrom(String fileName) {
        Properties props = new Properties();
        props.load(...);
        Foo foo = new Foo();
        // getProperty returns a String; keys are case-sensitive, so match the file
        foo.setKeyFoo(props.getProperty("keyFoo"));
        ...
        return foo;
    }
    ...
}
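A fuller, runnable version of that factory-method idea might look like the sketch below. FooSettings and the require helper are hypothetical names, the keys keyFoo and keyBar come from the foo.properties example in the question, and loading from a Reader instead of a file name is a choice made here so callers and tests do not need the file system.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical typed view of foo.properties. */
final class FooSettings {
    private final String keyFoo;
    private final String keyBar;

    private FooSettings(String keyFoo, String keyBar) {
        this.keyFoo = keyFoo;
        this.keyBar = keyBar;
    }

    /** Loads and validates the settings; fails fast if a key is missing. */
    static FooSettings loadFrom(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return new FooSettings(require(props, "keyFoo"), require(props, "keyBar"));
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value;
    }

    String getKeyFoo() { return keyFoo; }
    String getKeyBar() { return keyBar; }

    public static void main(String[] args) throws IOException {
        FooSettings settings =
                loadFrom(new StringReader("keyFoo = valuefoo\nkeyBar = valuebar\n"));
        System.out.println(settings.getKeyFoo() + " / " + settings.getKeyBar());
    }
}
```

For a real file, pass something like Files.newBufferedReader(Paths.get("foo.properties")) as the source. The rest of the application then depends only on the getters.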
There is a somewhat similar project for doing configuration as statically typed files. It requires you to declare an interface, but it fills in the implementation itself:
public interface AppConfig extends Config {
long getTimeout ();
URL getURL ();
Class getHandlerClass ();
}
The Annotation Processing Tool (apt) cannot modify classes (though it can create new ones). In order to modify the class at compile time, you'd probably need to edit the AST (as Project Lombok does). The simplest approach would probably be to generate the classes and then use the generated library as a dependency for other code.
Yet another way is to use a data binding framework that does this. Even one that does not seem to directly support that could work: for example, Jackson JSON processor would allow this to be done by something like:
ObjectMapper m = new ObjectMapper();
MyBean bean = m.convertValue(properties, MyBean.class);
// (note: requires latest code from trunk; otherwise need to write first, read back)
which works as long as entries in Properties map match logical bean properties, and String values can be converted to matching underlying values.
Something like JFig (ugly IMO), Commons Configuration or EasyConf?
If you want to do it statically, it's a code generation problem that may be solved quite easily (for each item in the file, produce a new getXXX method).
But if you want this at runtime, then you have the problem of your code referencing methods that did not exist at compile time; I don't think it can be done.
(Note that if you are looking for a project idea, the reverse can be done: an interface with accessor methods and annotations, and an implementation generated at runtime that relies on the annotated methods.)
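The reverse approach from the note above can be sketched with java.lang.reflect.Proxy from the JDK. The @Key annotation, FooConfig and PropertyBinder are hypothetical names invented for this example, and only String, long, int and boolean return types are handled.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Properties;

/** Hypothetical annotation naming the property behind an accessor. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Key {
    String value();
}

/** Hypothetical typed view of a properties file. */
interface FooConfig {
    @Key("keyFoo")
    String getKeyFoo();

    @Key("timeout")
    long getTimeout();
}

final class PropertyBinder {
    /** Returns a runtime-generated implementation of the interface backed by props. */
    static <T> T bind(Class<T> type, Properties props) {
        InvocationHandler handler = (self, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // Keep equals/hashCode/toString usable on the proxy itself.
                switch (method.getName()) {
                    case "equals": return self == args[0];
                    case "hashCode": return System.identityHashCode(self);
                    default: return type.getSimpleName() + " proxy";
                }
            }
            Key key = method.getAnnotation(Key.class);
            if (key == null) {
                throw new UnsupportedOperationException("No @Key on " + method.getName());
            }
            String raw = props.getProperty(key.value());
            if (raw == null) {
                throw new IllegalStateException("Missing property: " + key.value());
            }
            Class<?> returnType = method.getReturnType();
            if (returnType == String.class) return raw;
            if (returnType == long.class) return Long.parseLong(raw.trim());
            if (returnType == int.class) return Integer.parseInt(raw.trim());
            if (returnType == boolean.class) return Boolean.parseBoolean(raw.trim());
            throw new UnsupportedOperationException("Unsupported type: " + returnType);
        };
        Object instance =
                Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
        return type.cast(instance);
    }
}
```

Usage would be FooConfig config = PropertyBinder.bind(FooConfig.class, props); followed by config.getTimeout(). Callers see only the interface, so it can be mocked in tests without touching any file.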
The OP would like to map a property file to a Java API such that each named property in the file corresponds to a similarly named getter method in the API. I presume that an application would then use this API to get property values without having to use property name strings.
The conceptual problem is that a property file is fundamentally not a statically typed entity. Each time someone edits a property file they could add new properties, and hence change the "type" of the property file ... and by implication, the signature of the corresponding API. If we checked that there were no unexpected properties when the Java app loaded the properties file, then we've got an explicit dynamic type-check. If we don't check for unexpected (e.g. misnamed) properties, we've got a source of errors. Things get even messier if you want the types of property values to be something other than a String.
The only way you could do this properly would be to invent the concept of a schema for a property file that specified the property names and the types of the property values. Then implement a property file editor that ensures that the user cannot add properties that conflict with the schema.
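To illustrate what such a schema check might involve, here is a minimal sketch that validates at load time rather than in an editor. PropertySchema and its methods are hypothetical; each declared key carries a parser function that doubles as the type check.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;
import java.util.function.Function;

/** Hypothetical schema: the allowed property names plus a parser for each value. */
final class PropertySchema {
    private final Map<String, Function<String, ?>> parsers = new LinkedHashMap<>();

    PropertySchema declare(String key, Function<String, ?> parser) {
        parsers.put(key, parser);
        return this;
    }

    /** Returns a list of problems; an empty list means the file conforms to the schema. */
    List<String> validate(Properties props) {
        List<String> problems = new ArrayList<>();
        // Sorted so that the report order does not depend on hash ordering.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            if (!parsers.containsKey(key)) {
                problems.add("unexpected property: " + key);
            }
        }
        for (Map.Entry<String, Function<String, ?>> entry : parsers.entrySet()) {
            String raw = props.getProperty(entry.getKey());
            if (raw == null) {
                problems.add("missing property: " + entry.getKey());
                continue;
            }
            try {
                entry.getValue().apply(raw.trim()); // parsing is the type check
            } catch (RuntimeException e) {
                problems.add("bad value for " + entry.getKey() + ": " + raw);
            }
        }
        return problems;
    }
}
```

A schema would be built with calls such as new PropertySchema().declare("timeout", Long::parseLong). A misspelled key then shows up as both an unexpected property and a missing one, which is exactly the class of error the paragraph above worries about.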
And at this point we should recognize that a better solution would be to use XML as the property file representation, an XML schema driven editor for editing property files, and JAXP or something like it to map the property file to Java APIs.
I think this will solve your problem
I have been working on this property framework for the last year.
It provides multiple ways to load properties, and they are strongly typed as well.
Have a look at http://sourceforge.net/projects/jhpropertiestyp/
It is open source and fully documented.
Here is my short description from SourceForge:
JHPropertiesTyped will give the developer strongly typed properties. Easy to integrate in existing projects. Handled by a large series for property types. Gives the ability to one-line initialize properties via property IO implementations. Gives the developer the ability to create own property types and property io's. Web demo is also available, screenshots shown above. Also have a standard implementation for a web front end to manage properties, if you choose to use it.
Complete documentation, tutorial, javadoc, FAQ etc. are available on the project webpage.