Why does StringBufferInputStream doc recommend StringReader for String to Stream conversion? - java

As has been answered repeatedly, you can easily convert a String to an InputStream.
When browsing the Java 8 documentation I came across the long-deprecated StringBufferInputStream class, which states that
As of JDK 1.1, the preferred way to create a stream from a string is via the StringReader class.
What way is this referring to? There are several methods requiring classes in non-default libraries such as the error-prone ReaderInputStream from Apache Commons IO, but I'm looking for the preferred way mentioned in the documentation. The solutions referenced in other questions are sufficient for my use cases, but I'd still like to know what the documentation is referencing.
Update
Apparently this is a 16-year-old bug that hasn't been fixed. The proposed solution in the link is to use the deprecated class. I can't imagine that is what the documentation intends.

I don't know the answer to the javadoc part, but the first answer you pointed to is a reasonable one: just use String.getBytes(encoding) to get byte array, then use ByteArrayInputStream.
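The getBytes plus ByteArrayInputStream approach described above can be sketched as follows. This is a minimal illustration, not the "preferred way" the Javadoc hints at; the class and method names are made up for the example, and UTF-8 is an assumed encoding choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class StringToStream {
    // Encode the string with an explicit charset, then wrap the bytes in a stream.
    // UTF-8 is an assumption here; pass whatever encoding the consumer expects.
    static InputStream toInputStream(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new ByteArrayInputStream(bytes);
    }

    public static void main(String[] args) throws IOException {
        int count = 0;
        try (InputStream in = toInputStream("h\u00e9llo")) {
            while (in.read() != -1) {
                count++;
            }
        }
        // The accented character takes two bytes in UTF-8.
        System.out.println(count); // prints 6
    }
}
```

Naming the charset explicitly avoids depending on the platform default, which is the usual source of surprises with the no-argument getBytes().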
But usually the more important question is this: why on earth do you NEED such conversion? In a well designed system, you should never need to go in this direction: it is against the normal flow of things where within JDK you deal with chars and Strings, and outside with bytes and Streams. So conversions in this direction are quite rare and it does not seem necessary for JDK to have explicit support.

Related

Why do so many of the Java libraries take `String` where `CharSequence` would do?

I was frustrated recently in this question where OP wanted to change the format of the output depending on a feature of the number being formatted.
The natural mechanism would be to construct the format dynamically but because PrintStream.format takes a String instead of a CharSequence the construction must end in the construction of a String.
It would have been so much more natural and efficient to build a class that implemented CharSequence that provided the dynamic format on the fly without having to create yet another String.
This seems to be a common theme in the Java libraries where the default seems to be to require a String even though immutability is not a requirement. I am aware that keys in Maps and Sets should generally be immutable for obvious reasons but as far as I can see String is used far too often where a CharSequence would suffice.
There are a few reasons.
In a lot of cases, immutability is a functional requirement. For example, you've identified that a lot of collections / collection types will "break" if an element or key is mutated.
In a lot of cases, immutability is a security requirement. For instance, in an environment where you are running untrusted code in a sandbox, any case where untrusted code could pass a StringBuilder instead of a String to trusted code is a potential security problem (see note 1 below).
In a lot of cases, the reason is backwards compatibility. The CharSequence interface was introduced in Java 1.4. Java APIs that predate Java 1.4 do not use it. Furthermore, changing a preexisting method that uses String to use CharSequence risks binary compatibility issues; i.e. it could prevent old Java code from running on a newer JVM.
In the remainder it could simply be - "too much work, too little time". Making changes to existing standard APIs involves a lot of effort to make sure that the change is going to be acceptable to everyone (e.g. checking for the above), and convincing everyone that it will all be OK. Work has to be prioritized.
So while you find this frustrating, it is unavoidable.
1 - This would leave the Java API designer with an awkward choice. Do they write the API to make (expensive) defensive copies whenever it is passed a mutable "string", possibly changing the semantics of the API (from the user's perspective!)? Or do they label the API as "unsafe for untrusted code" and hope that developers notice and understand?
Of course, when you are designing your own APIs for your own reasons, you can make the call that security is not an issue. The Java API designers are not in that position. They need to design APIs that work for everyone. Using String is the simplest / least risky solution.
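To make the security point concrete, here is a small sketch of the check-then-use problem with a mutable CharSequence, and the defensive copy mentioned in the footnote. The class, the method names and the ".." rule are all hypothetical and only serve the illustration.

```java
final class MutableCharSequenceDemo {
    // Hypothetical "trusted" validation: rejects names containing "..".
    static boolean looksSafe(CharSequence name) {
        return !name.toString().contains("..");
    }

    // Defensive copy: take an immutable snapshot first, then validate
    // and use only the snapshot, never the caller's object.
    static String snapshotIfSafe(CharSequence name) {
        String copy = name.toString();
        if (!looksSafe(copy)) {
            throw new IllegalArgumentException("unsafe name: " + copy);
        }
        return copy;
    }

    public static void main(String[] args) {
        StringBuilder name = new StringBuilder("report.txt");

        boolean passedCheck = looksSafe(name); // true at the time of the check
        name.insert(0, "../");                 // caller mutates it afterwards
        System.out.println(passedCheck + " -> " + name); // prints "true -> ../report.txt"

        StringBuilder other = new StringBuilder("report.txt");
        String safe = snapshotIfSafe(other);
        other.insert(0, "../");                // no effect on the snapshot
        System.out.println(safe);              // prints "report.txt"
    }
}
```

With a String parameter the first scenario cannot happen at all, which is the "simplest / least risky" option the answer refers to; the snapshot version works but pays for a copy on every call.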
See http://docs.oracle.com/javase/6/docs/api/java/lang/CharSequence.html
Do you notice the part that explains that it has been around since 1.4? Previously all the API methods used String (which has been around since 1.0)

Serialization framework (no no-arg constructor)

I'm looking for some info on the best approach to serialize a graph of objects based on the following (Java):
Two objects of the same class must serialize to identical bytes (bit by bit) if their state is equal. (This must not depend on JVM field ordering.)
Collections are only modeled with arrays (nothing from the Collections framework).
All instances are immutable
Serialization format should be in byte[] format instead of text based.
I am in control of all the classes in the graph.
I don't want to put an empty constructor in the classes just to support serialization.
I have looked at implementing a solution based on my own traversal and on Objenesis, but my problem does not seem that unique, so it is better to check for an existing, complete solution first.
Updated details:
First, thanks for your help!
Objects must serialize to exactly the same bit sequence based on the object's state. This is important since the binary content will be digitally signed. The serialized form will be regenerated from the state of the object; the original bits are not stored.
Interoperability between different technologies is important. I do see the software running on e.g. .NET in the future. No Java flavour in the serialized format.
Note on comments of immutability: The values of the arrays are copied from the argument to the inner fields in the constructor. Less important.
Best regards,
Niclas Lindberg
You could write the data yourself, using reflection or hand-coded methods. I use methods which look hand-coded, except they are generated. (You get the performance of hand-coded methods, and the convenience of not having to rewrite the code when it changes.)
Often developers talk about the builtin java serialization, but you can have a custom serialization to do whatever you want, any way you want.
To give you a more detailed answer, it would depend on what you want to do exactly.
BTW: You can serialize your data into byte[] and still make it human readable/text like/editable in a text editor. All you have to do is use a binary format which looks like text. ;)
Maybe you want to familiarize yourself with the serialization frameworks available for Java. A good starting point for that is the thrift-protobuf-compare project, whose name is misleading: it compares the performance of more than 10 ways of serializing data using Java.
It seems that the hardest constraint you have is interoperability between different technologies. I know that Google's Protocol Buffers and Thrift deliver here. Avro might also fit.
The important thing to know about serialization is that it is not guaranteed to be consistent across multiple versions of Java. It's not meant as a way to store data on a disk or anywhere permanent.
It's used internally to send classes from one JVM to another during RMI or some other network protocol. These are the types of applications that you should use Serialization for. If this describes your problem - short term communication between two different JVMs - then you should try to get Serialization going.
If you're looking for a way to store the data more permanently or you will need the data to survive in forward versions of Java, then you should find your own solution. Given your requirements, you should create some sort of method of converting each object into a byte stream yourself and reading it back into objects. You will then be responsible for making sure the format is forward compatible with future objects and features.
I highly recommend Chapter 11 of Effective Java by Joshua Bloch.
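As a rough illustration of "converting each object into a byte stream yourself", the sketch below writes fields in a fixed order with DataOutputStream (big-endian ints, length-prefixed UTF-8) and rebuilds the object through its normal constructor, so no no-arg constructor is needed. The Measurement class and its wire layout are invented for this example; a real format would also need versioning and a documented specification for other platforms.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical immutable value class without a no-arg constructor.
final class Measurement {
    private final String label;
    private final int[] samples;

    Measurement(String label, int[] samples) {
        this.label = label;
        this.samples = samples.clone(); // copy so later changes by the caller are not seen
    }

    // Fields are written in a fixed order, so equal state always yields equal bytes
    // regardless of how the JVM lays out or enumerates fields.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            byte[] utf8 = label.getBytes(StandardCharsets.UTF_8);
            out.writeInt(utf8.length);
            out.write(utf8);
            out.writeInt(samples.length);
            for (int sample : samples) {
                out.writeInt(sample);
            }
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
    }

    // Reads the same layout back and goes through the regular constructor.
    static Measurement fromBytes(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            byte[] utf8 = new byte[in.readInt()];
            in.readFully(utf8);
            int[] samples = new int[in.readInt()];
            for (int i = 0; i < samples.length; i++) {
                samples[i] = in.readInt();
            }
            return new Measurement(new String(utf8, StandardCharsets.UTF_8), samples);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Measurement original = new Measurement("run-1", new int[] {1, 2, 3});
        Measurement restored = fromBytes(original.toBytes());
        System.out.println(Arrays.equals(original.toBytes(), restored.toBytes())); // prints "true"
    }
}
```

Because the bytes are derived only from the object's state, re-serializing a restored object gives the same bytes, which is the property a digital signature over the content relies on.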
Is the Externalizable interface what you're looking for? You fully control the way your objects are persisted, and you do that OO-style, with methods that are inherited (unlike the private readObject/writeObject methods used with Serializable). But still, you cannot get rid of the requirement for an accessible no-arg constructor.
The only way you would get this is:
A/ Use UTF-8 text, i.e. XML or JSON, with binary data converted to Base64 (the HTTP/XML-safe variety).
B/ Enforce UTF-8 binary ordering of all data.
C/ Pack the contents, excluding all unescaped whitespace.
D/ Hash the content and provide that hash in a positionally standard location in the file.

Where to find an overview of backed Collection methods/classes

I am trying to find an overview of all methods in the java.util package returning backed Collections (and Maps). The only ones easy to find are the synchronizedXX and unmodifiableXX methods. But there are others like subMap(). Is there a more comfortable way to find out about all util methods returning backed collections than to actually read the docs? A visual overview maybe?
The tutorial on wrapper classes (which has been proposed twice as an answer) at http://download.oracle.com/javase/tutorial/collections/implementations/wrapper.html is oblivious of the NavigableSet/NavigableMap interfaces and therefore does not provide an overview of methods returning backed Collections.
I know this doesn't exactly answer your question (and I risk being down-voted), but I will try anyway.
You should try to study the collections API as much as you can. In general, it is good advice for any programming language or platform to invest some time and learn the basics.
When studying Java collections you will also notice some oddities in the design, and will also realize that there are many things that are not provided that you either have to build your own or get them from somewhere else (such as Apache commons).
In any case, using a modern IDE (such as IntelliJ IDEA or Eclipse) will make things a lot easier on you. Both have ways of searching for symbols with a few keystrokes and also let you navigate the collections API (and any source code you throw at them) making it a lot easier to figure out what is available and how you might take advantage of it.
Try this mnemonic to understand some methods from TreeSet and TreeMap.
It's a bit tricky, but consider a numeric TreeSet (1 2 3 4 5 6 7 8 9 10). It's easy to remember that the headSet() and headMap() methods work with the "head" of the collection.
Also the mnemonic describes that there are two cases of using headSet with different results:
headSet(element)
headSet(element, inclusive).
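The two headSet variants on the 1 to 10 TreeSet can be tried out with a short sketch like the one below, which also shows that the returned sets are views backed by the original. The class and helper names are only for illustration.

```java
import java.util.NavigableSet;
import java.util.SortedSet;
import java.util.TreeSet;

final class HeadSetDemo {
    // Builds the numeric TreeSet 1..10 used in the mnemonic.
    static TreeSet<Integer> oneToTen() {
        TreeSet<Integer> set = new TreeSet<>();
        for (int i = 1; i <= 10; i++) {
            set.add(i);
        }
        return set;
    }

    public static void main(String[] args) {
        TreeSet<Integer> numbers = oneToTen();

        SortedSet<Integer> exclusive = numbers.headSet(5);          // strictly less than 5
        NavigableSet<Integer> inclusive = numbers.headSet(5, true); // less than or equal to 5
        System.out.println(exclusive); // prints [1, 2, 3, 4]
        System.out.println(inclusive); // prints [1, 2, 3, 4, 5]

        // Both results are views backed by the original set, so changes write through.
        numbers.remove(2);
        System.out.println(exclusive); // prints [1, 3, 4]
        inclusive.add(0);
        System.out.println(numbers.first()); // prints 0
    }
}
```

The same pattern applies to tailSet, subSet and the corresponding headMap, tailMap and subMap methods on TreeMap.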
The tutorial has a page on wrapper classes.

Are Java Properties effectively deprecated?

Java's Properties object hasn't changed much since pre-Java 5, and it hasn't got Generics support, or very useful helper methods (defined pattern to plug in classes to process properties or help to load all properties files in a directory, for example).
Has development of Properties stopped? If so, what's the current best practice for this kind of properties saving/loading?
Or have I completely missed something?
A lot of the concepts around Properties are definitely ancient and questionable. It has very poor internationalization. It extends Hashtable, which is itself generally out of use, since its synchronization is of limited value and it has methods which are not in harmony with the Collections classes introduced in 1.2. And many of the methods added to the Properties class essentially provide the kind of type safety that today would be accomplished via a generic type.
If implemented today it would probably be a special implementation of a Map<String, String>, and certainly support better encoding in the properties file.
That being said, there isn't really a replacement that doesn't add complexity. Sure the java.util.prefs.Preferences api is the "new and improved" but it adds a layer of complexity that is well beyond what is needed for many use cases. Just using XML is also an option (which at least fixes the internationalization issues) but a properties object often fits the needs just fine, at which point use it.
It's still a viable solution for simple configuration requirements. It doesn't need generics support because property keys and values are inherently Strings; that is, they are stored in flat ASCII files. If you need marshalling/unmarshalling or serialization of objects, Properties isn't the right approach. For even moderately sophisticated configuration needs, the preferred method is now java.util.prefs.Preferences.
It does what it needs to do. It's not that hard to write support for reading in all the properties files in a directory. I would say that's not a common use-case, so I don't see that as something that needs to be in the JDK.
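For what it's worth, reading all the properties files in a directory can be sketched in a few lines with java.nio.file. The class name and the "last file wins on duplicate keys" behaviour are assumptions of this example, not anything the JDK prescribes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class PropertiesDirectoryLoader {
    // Loads every *.properties file in the directory into one Properties object.
    // Directory iteration order is unspecified, so which file wins on a
    // duplicate key is not defined by this sketch.
    static Properties loadAll(Path directory) throws IOException {
        Properties merged = new Properties();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(directory, "*.properties")) {
            for (Path file : files) {
                try (InputStream in = Files.newInputStream(file)) {
                    merged.load(in);
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("props-demo");
        Files.write(dir.resolve("db.properties"), "db.host=localhost\n".getBytes(StandardCharsets.ISO_8859_1));
        Files.write(dir.resolve("ui.properties"), "ui.theme=dark\n".getBytes(StandardCharsets.ISO_8859_1));
        Files.write(dir.resolve("notes.txt"), "ignored=true\n".getBytes(StandardCharsets.ISO_8859_1));

        Properties all = loadAll(dir);
        System.out.println(all.getProperty("db.host")); // prints "localhost"
        System.out.println(all.getProperty("ignored")); // prints "null"
    }
}
```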
Also, it has changed slightly since pre-Java 5, as the Javadoc says that it extends Hashtable<Object, Object> and implements Map<Object, Object>.
"it hasn't got Generics support,"
Why does it need generics support? It deals with string keys and string values.
I would not consider Java properties deprecated. It is a mature library - that's all.
The dictionary structure is one of the oldest and most used structures in most programming languages (see http://en.wikipedia.org/wiki/Associative_array). I doubt it would be deprecated.
Even if it were to be removed, there would soon be new implementations outside of the core.
There already are external extensions. Apache Commons is a great resource that I think has helped to shape Java over the years; see http://commons.apache.org/configuration/howto_properties.html.

XStream or Simple

I need to decide on which one to use. My case is pretty simple. I need to convert a simple POJO/Bean to XML, and then back. Nothing special.
One thing I am looking for is it should include the parent properties as well. Best would be if it can work on super type, which can be just a marker interface.
I'd appreciate it if anyone could compare these two, with pros and cons, and say what is missing in each. I know that XStream supports JSON too; that's a plus. But Simple looked simpler at a glance, if we set JSON aside. What's the future of Simple in terms of development and community? XStream is quite popular, I believe; even the word "XStream" hits many threads on SO.
Thanks.
Just from reading the documentation (I'm facing down the same problem you are, but haven't tried either way yet; take this with a grain of salt):
XSTREAM
Very, very easy to Google. Examples, forum posts, and blog posts about it are trivial to find.
Works out of the box. (May need more tweaking, of course, but it'll give you something immediately.)
Converting a variable to an attribute requires creating a separate converter class, and registering that with XStream. (It's not hard for simple values, but it is a little extra work.)
Doesn't handle versioning at all, unless you add in XMT (another library); if the XML generated by your class changes, it won't deserialize at all. (Once you add XMT, you can alter your classes however you like, and have XStream handle it fine, as long as you create an increasing line of incremental versioning functions.)
All adjustments require you to write code, either to implement your own (de)serialization functions, or calling XStream functions to alter the (de)serialization techniques used.
Trivial syntax note: you need to cast the output of the deserializer to your class.
SIMPLE
Home page is the only reliable source of information; it lists about a half-dozen external articles, and there's a mailing list, but you can't find it out in the wild Internet.
Requires annotating your code before it works.
It's easy to make a more compact XML file using attributes instead of XML nodes for every property.
Handles versioning by being non-strict in parsing whenever the class is right, but the version is different. (i.e., if you added two fields and removed one since the last version, it'll ignore the removed field and not throw an exception, but won't set the added fields.) Like XStream, it doesn't seem to have a way to migrate data from one version to the next, but unlike XStream, there's no external library to step in and handle it. Presumably, the way to handle this is with some external function (and maybe a "version" variable in your class?), so you do
Stuff myRestoredStuff = serializer.read(Stuff.class, file);
myRestoredStuff.sanityCheck();
Commonly-used (de)serializing adjustments are made by adding/editing annotations, but there's support for writing your own (de)serialization functions to override the standard methods if you need to do something woolly.
Trivial syntax note: you need to pass the restored object's class into the deserializer (but you don't need to cast the result).
Why not use JAXB instead?
100% schema coverage
Huge user base
Multiple implementations (in case you hit a bug in one)
Included in Java SE 6, compatible with JDK 1.5
Binding layer for JAX-WS (Web Services)
Binding layer for JAX-RS (Rest)
Compatible with JSON (when used with libraries such as Jettison)
Useful resources:
Comparison, JAXB & XStream
Comparison, JAXB & Simple
I'd recommend that you take a look at Simple
I would also suggest Simple; take a look at the tutorial there and decide for yourself. The mailing list is very responsive and you will always get a prompt answer to any queries.
I have not used the Simple framework so far.
Based on my experience with XStream, it worked well with XML. However, for JSON, the result was not as precise as expected when I attempted to serialize a bean that contains a List of Hashtable.
Thought I'd share this here.
To get XStream to ignore missing fields (when you have removed a property):
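import com.thoughtworks.xstream.XStream;
import com.thoughtworks.xstream.mapper.MapperWrapper;

XStream xstream = new XStream() {
    @Override
    protected MapperWrapper wrapMapper(MapperWrapper next) {
        return new MapperWrapper(next) {
            @Override
            public boolean shouldSerializeMember(Class definedIn, String fieldName) {
                // Fields that no longer exist are reported with Object.class
                // as the defining class; skip them instead of failing.
                if (definedIn == Object.class) {
                    return false;
                }
                return super.shouldSerializeMember(definedIn, fieldName);
            }
        };
    }
};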
This can also be extended to handle versions and property renames.
Credit to Peter Voss: https://pvoss.wordpress.com/2009/01/08/xstream
One "simple" (pun intended) disadvantage of Simple and JAXB is that they require annotating your objects before they can be serialized to XML. What happens the day you quickly want to serialize someone else's code with objects that are not annotated? If you can see that happening one day, XStream is a better fit. (Sometimes it really just boils down to simple requirements like this to drive your decisions.)
I was taking a quick look at Simple while reading Stack Overflow; as an amendment to Paul Marshall's helpful post, I thought I'd mention that Simple does seem to support versioning through annotations:
http://simple.sourceforge.net/download/stream/doc/tutorial/tutorial.php#version
Simple is much slower than XStream (when serializing objects to XML):
http://pronicles.blogspot.com/2011/03/xstream-vs-simple.html
