Java equivalent of Smalltalk's become:


Is there a way to swap myself (this) with some other object in Java?
In Smalltalk we could write
Object subclass:myClass [
"in my method I swap myself with someone else"
swapWith:anObject [
self become:anObject.
^nil
]
]
myClass subclass:subClass [
]
obj := myClass new.
obj swapWith:subClass new.
obj inspect.
The result is an instance of subClass, obviously.
I need to do the following in Java:
I am in a one-directional hierarchy (a directed acyclic graph).
In one of my methods (an event listener method, to be exact) I decide that I am not the best-suited object to be here, so:
I create a new object (from a subclass of my class, to be exact), swap myself with it, and let myself be garbage-collected in the near future.
So, in short, how can I achieve self become: (someClass new:someParameters) in Java? Are there known design patterns I could use?

In its most general form, arbitrary object swapping is impossible to reconcile with static typing. The two objects might have different interfaces, so swapping them could compromise type safety. If you impose constraints on how objects can be swapped, such a feature can be made type safe. It never became mainstream, but it has been investigated in research; look at Gilgul, for instance.
Closely related is reclassification, the ability to change the class of an object dynamically. This is possible in Smalltalk with some primitives. Again, it puts type safety at risk and never became mainstream, but it has been investigated in research. Look at Wide Classes, Fickle, or Plaid.
A poor man's solution to object swapping is a proxy that you interpose between the client and the object to swap, or the use of the State and Strategy design patterns.
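To make the proxy idea concrete, here is a minimal sketch. All names in it (Node, BasicNode, SpecialNode, NodeHandle) are made up for illustration and do not come from the question; it assumes every client holds the handle rather than the real object.

```java
// Hypothetical names throughout: Node, BasicNode, SpecialNode and NodeHandle
// are illustrative only.
interface Node {
    String describe();
}

class BasicNode implements Node {
    @Override
    public String describe() { return "basic"; }
}

class SpecialNode extends BasicNode {
    @Override
    public String describe() { return "special"; }
}

// Clients keep a reference to the handle, never to the target, so replacing
// the target looks to them as if the object had become another one.
final class NodeHandle implements Node {
    private Node target;

    NodeHandle(Node initial) { this.target = initial; }

    // Closest analogue of `self become: other` in this sketch.
    void become(Node replacement) { this.target = replacement; }

    @Override
    public String describe() { return target.describe(); }
}

class BecomeDemo {
    public static void main(String[] args) {
        NodeHandle handle = new NodeHandle(new BasicNode());
        System.out.println(handle.describe()); // basic
        handle.become(new SpecialNode());
        System.out.println(handle.describe()); // special
    }
}
```

The old target becomes garbage once nothing else refers to it. Unlike become:, this only works for references that go through the handle.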

Here is an interesting thread on the official forum. I believe that object encapsulation in combination with strong typing makes this feature unworkable in Java. Plus, for an already slow JVM, it could lead to disaster...

this is a reserved word in Java, and you cannot reassign it.
What you're trying to do can be implemented with a simple reference. You just iterate (or walk through your graph) and change the reference to point at whichever object you want to be active.
Consider this:
List<String> stringList = new ArrayList<String>();
// fill your list
String longestWord = "";
for (String s : stringList) {
if (longestWord.length() < s.length()) {
longestWord = s;
}
}
longestWord is pointing to another object now.


What is wrong in sharing Mutable State? [duplicate]

In chapter 3 of Java Concurrency in Practice, the author advises against sharing mutable state. He adds that the code below is not a good way to share state.
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL"
};
public String[] getStates() {
return states;
}
}
From the book:
Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
My question is: we often use getters and setters to access class-level private mutable variables. If that is not the correct way, what is the correct way to share state? What is the proper way to encapsulate it?

For primitive types, int, float etc, using a simple getter like this does not allow the caller to set its value:
someObj.getSomeInt() = 10; // error!
However, with an array, you could change its contents from the outside, which might be undesirable depending on the situation:
someObj.getSomeArray()[0] = newValue; // perfectly fine
This could lead to problems where a field is unexpectedly changed by other parts of code, causing hard-to-track bugs.
What you can do instead, is to return a copy of the array:
public String[] getStates() {
return Arrays.copyOf(states, states.length);
}
This way, even if the caller changes the contents of the returned array, the array held by the object won't be affected.

With what you have it is possible for someone to change the content of your private array just through the getter itself:
public static void main(String[] args) {
UnsafeStates us = new UnsafeStates();
us.getStates()[0] = "VT";
System.out.println(Arrays.toString(us.getStates()));
}
Output:
[VT, AL]
If you want to encapsulate your States and make it so they cannot change then it might be better to make an enum:
public enum SafeStates {
AR,
AL
}
Creating an enum gives a couple of advantages. It fixes the exact values that people can use. They can't be modified, they are easy to test against, and you can easily switch on them. The only downside of an enum is that the values have to be known ahead of time, i.e. you code them in; they cannot be created at run time.

This question seems to be asked with respect to concurrency in particular.
Firstly, of course, there is the possibility of modifying non-primitive objects obtained via simple-minded getters; as others have pointed out, this is a risk even with single-threaded programs. The way to avoid this is to return a copy of an array, or an unmodifiable instance of a collection: see for example Collections.unmodifiableList.
However, for programs using concurrency, there is risk of returning the actual object (i.e., not a copy) even if the caller of the getter does not attempt to modify the returned object. Because of concurrent execution, the object could change "while he is looking at it", and in general this lack of synchronization could cause the program to malfunction.
It's difficult to turn the original getStates example into a convincing illustration of my point, but imagine a getter that returns a Map instead. Inside the owning object, correct synchronization may be implemented. However, a getTheMap method that returns just a reference to the Map is an invitation for the caller to call Map methods (even if just map.get) without synchronization.
There are basically two options to avoid the problem: (1) return a deep copy; an unmodifiable wrapper will not suffice in this case, and it should be a deep copy otherwise we just have the same problem one layer down, or (2) do not return unmediated references; instead, extend the method repertoire to provide exactly what is supportable, with correct internal synchronization.
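As a rough illustration of option (2), here is a sketch of a class that never hands out its map and routes every access through synchronized methods instead. StateRegistry and its method names are invented for this example; the snapshot method is a shallow copy, which is only enough here because String keys and values are immutable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class: StateRegistry is not from the book or the question.
final class StateRegistry {
    private final Map<String, String> capitals = new HashMap<>();

    // No method returns the map itself; every access holds the same lock.
    synchronized void putCapital(String state, String capital) {
        capitals.put(state, capital);
    }

    synchronized String capitalOf(String state) {
        return capitals.get(state);
    }

    // Callers who need everything get a copy they own. A shallow copy
    // suffices only because the keys and values are immutable Strings.
    synchronized Map<String, String> snapshot() {
        return new HashMap<>(capitals);
    }
}

class StateRegistryDemo {
    public static void main(String[] args) {
        StateRegistry registry = new StateRegistry();
        registry.putCapital("AK", "Juneau");

        Map<String, String> copy = registry.snapshot();
        copy.put("AK", "changed by caller");

        System.out.println(registry.capitalOf("AK")); // Juneau
    }
}
```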

Regarding two lines of java code

I am trying to learn a Java-based program, but I am pretty new to Java. I am quite confused by the following two lines of Java code. I think my confusion comes from concepts such as “class” and “cast”, but I do not know how to analyze them.
For this one
XValidatingObjectCorpus<Classified<CharSequence>> corpus
= new XValidatingObjectCorpus<Classified<CharSequence>>(numFolds);
What is <Classified<CharSequence>> used for in terms of Java programming? How should I understand its relationship with XValidatingObjectCorpus and corpus?
For the second one
LogisticRegressionClassifier<CharSequence> classifier
= LogisticRegressionClassifier.<CharSequence>train(para1, para2, para3)
How should I understand the right side, LogisticRegressionClassifier.<CharSequence>train? What is the difference between LogisticRegressionClassifier.<CharSequence>train and LogisticRegressionClassifier<CharSequence> classifier?

These are called generics. They tell Java to make an instance of the outer class - either XValidatingObjectCorpus or LogisticRegressionClassifier - using the type of the inner object.
Normally, these are used for collections, such as ArrayList or HashMap.
What is the relationship between XValidatingObjectCorpus and corpus?
corpus is just a name given to the new XValidatingObjectCorpus object that you make with that statement (hence the = new... part).
What does LogisticRegressionClassifier.<CharSequence>train mean?
I have no idea, really. I suggest looking at the API for that (I think this is the right class).
What is the difference between LogisticRegressionClassifier.<CharSequence>train and LogisticRegressionClassifier<CharSequence> classifier?
You can't really compare these two. The one on the left of the = is the object identifier, and the one on the right is the allocator (probably the wrong word, but it is what it does, kind of).
Together, the two define an instance of LogisticRegressionClassifier, saying to create that type of object, call it classifier, and then give it the value returned by the train() method. Again, look at the API to understand it more.
By the way, these look like wretched examples to be learning Java with. Start with something simple, or at least an easier part of the code. It looks like someone had way too much fun with long names (the API has even longer names). Seriously though, I only just got to fully understanding this, and Java was my main language for quite a while (It gets really confusing when you try and do simple things). Anyways, good luck!

public class Sample<T> { // T implies Generic implementation, T can be substituted with any object.
static <T> Sample<T> train(int par1, int par2, int par3){
return new Sample<T>(); // you are calling the Generic method to return Sample object which works with a particular type of generic object, may it be an Integer or a CharSequence. --> see the main method.
}
public static void main(String ... a)
{
int par1 = 0, par2 = 0, par3 = 1;
// Here you are returning Sample object which works with a sequence of characters.
Sample<CharSequence> sample = Sample.<CharSequence>train(par1, par2, par3);
// Here you are returning Sample object which works with Integer values.
Sample<Integer> sample1 = Sample.<Integer>train(par1, par2, par3);
}
}

<Classified<CharSequence>> is a generic parameter.
LogisticRegressionClassifier<CharSequence> is a generic type.
LogisticRegressionClassifier.<CharSequence>train is a generic method.
Java Generics Tutorial
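To see the three terms side by side, here is a small sketch. Model is a hypothetical stand-in for a class like LogisticRegressionClassifier; its train method and fields are invented here and say nothing about the real library's API.

```java
import java.util.ArrayList;
import java.util.List;

// Generic type: Model<E> is parameterized by the element type E.
// Hypothetical class, not part of any real library.
class Model<E> {
    private final List<E> examples = new ArrayList<>();

    // Generic method: the <T> before the return type is the method's own
    // type parameter, independent of the class-level E.
    static <T> Model<T> train(List<? extends T> data) {
        Model<T> model = new Model<>();
        model.examples.addAll(data);
        return model;
    }

    int size() { return examples.size(); }
}

class GenericsDemo {
    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        words.add("yes");
        words.add("no");

        // Explicit type argument: T is stated to be CharSequence.
        Model<CharSequence> explicit = Model.<CharSequence>train(words);
        // Same call with T inferred from the declared type on the left.
        Model<CharSequence> inferred = Model.train(words);

        System.out.println(explicit.size() + " " + inferred.size()); // 2 2
    }
}
```

The explicit form is mostly needed when the compiler cannot infer T from the context.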

set/get methods in C++

Java programmers and APIs seem to favor explicit set/get methods.
However, I got the impression that the C++ community frowns upon this practice.
If so, is there a particular reason (besides more lines of code) why?
On the other hand, why does the Java community choose to use methods rather than direct access?
Thank you

A well-designed class should ideally not have too many gets and sets. In my opinion, too many gets and sets are basically an indication that someone else (and potentially many others) needs my data to achieve their purpose. In that case, why does that data belong to me in the first place? This violates the basic principle of encapsulation (data + operations in one logical unit).
So, while there is no technical restriction on 'set' and 'get' methods (in fact they are abundant), I would say that you should pause and re-inspect your design if you want too many of them in a class interface used by too many other entities in your system.
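A small sketch of what "data + operations in one logical unit" can look like, written in Java for brevity. Account and its methods are hypothetical; the point is only that callers ask for an operation instead of reading the balance, computing, and writing it back through a setter.

```java
// Hypothetical class for illustration.
final class Account {
    private long balanceInCents;

    // Higher-level operations replace a raw setBalance(long).
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceInCents += cents;
    }

    void withdraw(long cents) {
        if (cents <= 0 || cents > balanceInCents) {
            throw new IllegalArgumentException("invalid withdrawal");
        }
        balanceInCents -= cents;
    }

    // A read-only accessor is still fine; there is simply no setter.
    long balanceInCents() { return balanceInCents; }
}

class AccountDemo {
    public static void main(String[] args) {
        Account account = new Account();
        account.deposit(500);
        account.withdraw(200);
        System.out.println(account.balanceInCents()); // 300
    }
}
```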

There are occasions when getters/setters are appropriate, but an abundance of getters/setters typically indicates that your design fails to achieve any higher level of abstraction.
Typically it's better (with regard to encapsulation) to expose higher-level operations on your objects that do not make the implementation obvious to the user.
Some other possible reasons why it's not as common in C++ as in Java:
The Standard Library does not use it.
Bjarne Stroustrup expresses his dislike towards it (last paragraph):
I particularly dislike classes with a
lot of get and set functions. That is
often an indication that it shouldn't
have been a class in the first place.
It's just a data structure. And if it
really is a data structure, make it a
data structure.

The usual argument against get/set methods is that if you have both and they're just trivial return x; and x = y; then you haven't actually encapsulated anything at all; you may as well just make the member public which saves a whole lot of boilerplate code.
Obviously there are cases where they still make sense; if you need to do something special in them, or you need to use inheritance or, particularly, interfaces.
There is the advantage that if you implement getters/setters you can change their implementation later without having to alter code that uses them. I suppose the frowning on it you refer to is kind of a YAGNI thing that if there's no expectation of ever altering the functions that way, then there's little benefit to having them. In many cases you can just deal with the case of altering the implementation later anyway.
I wasn't aware that the C++ community frowned on them any more or less than the Java community; my impression is that they're rather less common in languages like Python, for example.

I think the reason the C++ community frowns on getters and setters is that C++ offers far better alternatives. For example:
template <class T>
class DefaultPredicate
{
public:
static bool CheckSetter (T value)
{
return true;
}
static void CheckGetter (T value)
{
}
};
template <class T, class Predicate = DefaultPredicate <T>>
class Property
{
public:
operator T ()
{
Predicate::CheckGetter (m_storage);
return m_storage;
}
Property <T, Predicate> &operator = (T rhs)
{
if (Predicate::CheckSetter (rhs))
{
m_storage = rhs;
}
return *this;
}
private:
T m_storage;
};
which can then be used like this:
class Test
{
public:
Property <int> TestData;
Property <int> MoreTestData;
};
int main ()
{
Test
test;
test.TestData = 42;
test.MoreTestData = 24;
int value = test.TestData;
bool check = test.TestData == test.MoreTestData;
}
Notice that I added a predicate parameter to the property class. With this, we can get creative, for example, a property to hold an integer colour channel value:
class NoErrorHandler
{
public:
static void SignalError (const char *const error)
{
}
};
class LogError
{
public:
static void SignalError (const char *const error)
{
std::cout << error << std::endl;
}
};
class Exception
{
public:
Exception (const char *const message) :
m_message (message)
{
}
operator const char *const ()
{
return m_message;
}
private:
const char
*const m_message;
};
class ThrowError
{
public:
static void SignalError (const char *const error)
{
throw new Exception (error);
}
};
template <class ErrorHandler = NoErrorHandler>
class RGBValuePredicate : public DefaultPredicate <int>
{
public:
static bool CheckSetter (int rhs)
{
bool
setter_ok = true;
if (rhs < 0 || rhs > 255)
{
ErrorHandler::SignalError ("RGB value out of range.");
setter_ok = false;
}
return setter_ok;
}
};
and it can be used like this:
class Test
{
public:
Property <int, RGBValuePredicate <> > RGBValue1;
Property <int, RGBValuePredicate <LogError> > RGBValue2;
Property <int, RGBValuePredicate <ThrowError> > RGBValue3;
};
int main ()
{
Test
test;
try
{
test.RGBValue1 = 4;
test.RGBValue2 = 5;
test.RGBValue3 = 6;
test.RGBValue1 = 400;
test.RGBValue2 = 500;
test.RGBValue3 = -6;
}
catch (Exception *error)
{
std::cout << "Exception: " << *error << std::endl;
}
}
Notice that I made the handling of bad values a template parameter as well.
Using this as a starting point, it can be extended in many different ways.
For example, allow the storage of the property to be different to the public type of the value - so the RGBValue above could use an unsigned char for storage but an int interface.
Another example is to change the predicate so that it can alter the setter value. In the RGBValue above this could be used to clamp values to the range 0 to 255 rather than generate an error.

Properties as a general language concept technically predate C++, e.g. in Smalltalk, but they weren't ever part of the standard. Getters and setters were a concept used in C++ when it was used for development of UI's, but truth be told, it's an expensive proposition to develop UI's in what is effectively a systems language. The general problem with getters and setters in C++ was that, since they weren't a standard, everybody had a different standard.
And in systems languages, where efficiency concerns are high, then it's just easier to make the variable itself public, although there's a lot of literature that frowns mightily on that practice. Often, you simply see richer exchanges of information between C++ object instances than simple items.
You'll probably get a lot of viewpoints in response to this question, but in general, C++ was meant to be C that did objects, making OOP accessible to developers that didn't know objects. It was hard enough to get virtuals and templates into the language, and I think that it's been kind of stagnant for a while.
Java differs because in the beginning, with what Java brought in areas like garbage collection, it was easier to promote the philosophy of robust encapsulation, i.e. external entities should keep their grubby little paws off of internal elements of a class.
I admit this is pretty much opinion. At this time I use C++ for highly optimized stuff like 3D graphics pipelines, and I already have to manage all my object memory, so I'd take a dim view of fundamentally useless code that just serves to wrap storage access up in additional functions. That said, the basic performance capabilities of runtimes like the MSFT .NET runtime make that a position that can be difficult to defend at times.
Purely my 2c

There's nothing unusual about having explicit set/get methods in C++. I've seen it in plenty of C++, it can be very useful to not allow direct access to data members.

Check out this question for an explanation of why Java tends to prefer them and the reasons for C++ are the same. In short: it allows you to change the way data members are accessed without forcing client code (code that uses your code) to recompile. It also allows you to enforce a specific policy for how to access data and what to do when that data is accessed.
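A sketch of both points, in Java. Thermostat is a hypothetical class: its accessors speak Celsius, the stored representation is Kelvin (imagine it was changed after the class shipped), and the setter enforces a policy in one place.

```java
// Hypothetical class for illustration.
final class Thermostat {
    // Imagine this started out as `private double celsius;`.
    // Switching the representation to kelvin did not affect callers,
    // because they only ever used the accessors below.
    private double kelvin = 293.15;

    double getCelsius() {
        return kelvin - 273.15;
    }

    void setCelsius(double celsius) {
        // Access policy enforced in a single place.
        if (celsius < -273.15) {
            throw new IllegalArgumentException("below absolute zero");
        }
        kelvin = celsius + 273.15;
    }
}

class ThermostatDemo {
    public static void main(String[] args) {
        Thermostat thermostat = new Thermostat();
        thermostat.setCelsius(25.0);
        System.out.println(Math.round(thermostat.getCelsius())); // 25
    }
}
```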

By mandating the use of set/get methods, one can implement useful side-effects in the getter/setter (for example, when the argument to get/set is an object).

I am surprised nobody has mentioned Java introspection and beans yet.
Using get.../set... naming convention combined with introspection allows all sorts of clever trickery with utility classes.
I personally feel that the "public" keyword should have been enough to trigger the bean magic, but I am not James Gosling.
My take on this is that in C++ it is a rather pointless exercise. You are adding at least six lines of code to test and maintain which serve no purpose and will for the most part be ignored by the compiler. It doesn't really protect your class from misuse and abuse unless you add a lot more code.
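For anyone who has not seen the introspection trick mentioned above, here is a minimal sketch using the standard java.beans.Introspector. The Person bean is hypothetical; the only thing that makes "name" and "age" properties is the get.../set... naming convention.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BeanDemo {
    // Hypothetical bean used only for this example.
    public static class Person {
        private String name;
        private int age;

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public int getAge() { return age; }
        public void setAge(int age) { this.age = age; }
    }

    // Lists the property names the Introspector derives from the accessors.
    static List<String> propertyNames(Class<?> beanClass) {
        List<String> names = new ArrayList<>();
        try {
            // Stop at Object so that getClass() is not reported as a property.
            PropertyDescriptor[] descriptors =
                    Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors();
            for (PropertyDescriptor descriptor : descriptors) {
                names.add(descriptor.getName());
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
        Collections.sort(names); // sorted here so the output order is fixed
        return names;
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(Person.class)); // [age, name]
    }
}
```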

I don't think the C++ community frowned on using getters and setters. They are almost always a good idea.

It has to do with the basics of object oriented programming - hiding the internals of an object from its users. The users of an object should not need to know (nor should they care) about the internals of an object.
It also gives you control over what is done whenever a user of your object tries to read/write to it. In effect, you expose an interface to the object's users. They have to use that interface and you control what happens when methods in that interface are called - the getters and setters would be part of the interface.
It just makes things easier when debugging. A typical scenario is when your object lands up in a weird state and you're debugging to find out how it got there. All you do is set breakpoints in your getters and setters and assuming all else is fine, you're able to see how your object gets to the weird state. If your object's users are all directly accessing its members, figuring out when your object's state changes becomes a lot harder (though not impossible)

I would argue that C++ needs getters/setters more than Java.
In Java, if you start with naked field access and later change your mind and want a getter/setter instead, it is extremely easy to find all the usages of the field and refactor them into getter/setter calls.
In C++, this is not that easy. The language is too complex; IDEs simply can't reliably do that.
So in C++, you had better get it right the first time. In Java, you can be more adventurous.

There were gets/sets long before Java. There are many reasons to use them, especially if you have to recalculate something when a value changes. So the first big advantage is that you can watch for value changes. But IMHO it's bad to ALWAYS implement both get and set; often a get is enough. Another point is that class changes will directly affect your customers. With public members, you can't change member names without forcing clients to refactor their code. Let's say you have an object with a length and you change this member's name... uh. With a getter, you just change your side of the code and the client can sleep well. Adding gets/sets for members that should be hidden is of course nonsense.

Why does Java toString() loop infinitely on indirect cycles?

This is more a gotcha I wanted to share than a question: when printing with toString(), Java will detect direct cycles in a Collection (where the Collection refers to itself), but not indirect cycles (where a Collection refers to another Collection which refers to the first one - or with more steps).
import java.util.*;
public class ShonkyCycle {
static public void main(String[] args) {
List a = new LinkedList();
a.add(a); // direct cycle
System.out.println(a); // works: [(this Collection)]
List b = new LinkedList();
a.add(b);
b.add(a); // indirect cycle
System.out.println(a); // shonky: recurses endlessly (ends in StackOverflowError)!
}
}
This was a real gotcha for me, because it occurred in debugging code to print out the Collection (I was surprised when it caught a direct cycle, so I assumed incorrectly that they had implemented the check in general). There is a question: why?
The explanation I can think of is that it is very inexpensive to check for a collection that refers to itself, as you only need to store the collection (which you have already), but for longer cycles, you need to store all the collections you encounter, starting from the root. Additionally, you might not be able to tell for sure what the root is, and so you'd have to store every collection in the system - which you do anyway - but you'd also have to do a hash lookup on every collection element. It's very expensive for the relatively rare case of cycles (in most programming). (I think) the only reason it checks for direct cycles is because it so cheap (one reference comparison).
OK... I've kinda answered my own question - but have I missed anything important? Anyone want to add anything?
Clarification: I now realize the problem I saw is specific to printing a Collection (i.e. the toString() method). There's no problem with cycles per se (I use them myself and need to have them); the problem is that Java can't print them. Edit: Andrzej Doyle points out it's not just collections, but any object whose toString is called.
Given that it's constrained to this method, here's an algorithm to check for it:
the root is the object that the first toString() is invoked on (to determine this, you need to maintain state on whether a toString is currently in progress or not; so this is inconvenient).
as you traverse each object, you add it to an IdentityHashMap, along with a unique identifier (e.g. an incremented index).
but if this object is already in the Map, write out its identifier instead.
This approach also correctly renders multirefs (a node that is referred to more than once).
The memory cost is the IdentityHashMap (one reference and index per object); the complexity cost is a hash lookup for every node in the directed graph (i.e. each object that is printed).
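The algorithm above can be sketched as a standalone helper. CycleSafePrinter and its "#id" output format are my own invention for illustration, and the sketch only descends into java.util.Collection instances. It sidesteps the "is a toString in progress" question by passing the IdentityHashMap along explicitly.

```java
import java.util.Collection;
import java.util.IdentityHashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the JDK.
final class CycleSafePrinter {
    static String print(Object root) {
        StringBuilder out = new StringBuilder();
        print(root, new IdentityHashMap<>(), out);
        return out.toString();
    }

    private static void print(Object obj, Map<Object, Integer> seen, StringBuilder out) {
        if (!(obj instanceof Collection)) {
            out.append(obj); // leaf: ordinary toString()
            return;
        }
        Integer id = seen.get(obj); // identity lookup, never calls hashCode()
        if (id != null) {
            out.append('#').append(id); // seen before: write the identifier only
            return;
        }
        id = seen.size(); // incremented index used as the unique identifier
        seen.put(obj, id);
        out.append('#').append(id).append("=[");
        boolean first = true;
        for (Object element : (Collection<?>) obj) {
            if (!first) {
                out.append(", ");
            }
            first = false;
            print(element, seen, out);
        }
        out.append(']');
    }
}

class CycleSafePrinterDemo {
    public static void main(String[] args) {
        List<Object> a = new LinkedList<>();
        a.add(a); // direct cycle
        List<Object> b = new LinkedList<>();
        a.add(b);
        b.add(a); // indirect cycle
        System.out.println(CycleSafePrinter.print(a)); // #0=[#0, #1=[#0]]
    }
}
```

Using IdentityHashMap matters for more than speed: an ordinary HashMap would call hashCode() on the cyclic lists, which recurses just like toString() does.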

I think fundamentally it's because while the language tries to stop you from shooting yourself in the foot, it shouldn't really do so in a way that's expensive. So while it's almost free to compare object pointers (e.g. does obj == this) anything beyond that involves invoking methods on the object you're passing in.
And at this point the library code doesn't know anything about the objects you're passing in. For one, the generics implementation doesn't know if they're instances of Collection (or Iterable) themselves, and while it could find this out via instanceof, who's to say whether it's a "collection-like" object that isn't actually a collection, but still contains a deferred circular reference? Secondly, even if it is a collection there's no telling what its actual implementation and thus behaviour is like. Theoretically one could have a collection containing all the Longs which is going to be used lazily; but since the library doesn't know this it would be hideously expensive to iterate over every entry. Or in fact one could even design a collection with an Iterator that never terminated (though this would be difficult to use in practice because so many constructs/library classes assume that hasNext will eventually return false).
So it basically comes down to an unknown, possibly infinite cost in order to stop you from doing something that might not actually be an issue anyway.

I'd just like to point out that this statement:
when printing with toString(), Java will detect direct cycles in a collection
is misleading.
Java (the JVM, the language itself, etc) is not detecting the self-reference. Rather this is a property of the toString() method/override of java.util.AbstractCollection.
If you were to create your own Collection implementation, the language/platform wouldn't automatically save you from a self-reference like this. Unless you extend AbstractCollection, you would have to make sure you cover this logic yourself.
I might be splitting hairs here but I think this is an important distinction to make. Just because one of the foundation classes in the JDK does something doesn't mean that "Java" as an overall umbrella does it.
Here is the relevant source code in AbstractCollection.toString(), with the key line commented:
public String toString() {
Iterator<E> i = iterator();
if (! i.hasNext())
return "[]";
StringBuilder sb = new StringBuilder();
sb.append('[');
for (;;) {
E e = i.next();
// self-reference check:
sb.append(e == this ? "(this Collection)" : e);
if (! i.hasNext())
return sb.append(']').toString();
sb.append(", ");
}
}

The problem with the algorithm that you propose is that you need to pass the IdentityHashMap to all Collections involved. This is not possible using the published Collection APIs. The Collection interface does not define a toString(IdentityHashMap) method.
I imagine that whoever at Sun put the self reference check into the AbstractCollection.toString() method thought of all of this, and (in conjunction with his colleagues) decided that a "total solution" is over the top. I think that the current design / implementation is correct.
It is not a requirement that Object.toString implementations be bomb-proof.

You are right, you already answered your own question. Checking for longer cycles (especially really long ones like period length 1000) would be too much overhead and is not needed in most cases. If someone wants it, he has to check it himself.
The direct cycle case, however, is easy to check and will occur more often, so it's done by Java.

You can't really detect indirect cycles in general: an arbitrary toString() may build its contents lazily or never terminate, so predicting what it will do runs into the halting problem.

Scala collection standard practice

Coming from a Java background, I'm used to the common practice of dealing with collections: obviously there would be exceptions but usually code would look like:
public class MyClass {
private Set<String> mySet;
public void init() {
Set<String> s = new LinkedHashSet<String>();
s.add("Hello");
s.add("World");
mySet = Collections.unmodifiableSet(s);
}
}
I have to confess that I'm a bit befuddled by the plethora of options in Scala. There is:
scala.List (and Seq)
scala.collection.Set (and Map)
scala.collection.immutable.Set (and Map, Stack but not List)
scala.collection.mutable.Set (and Map, Buffer but not List)
scala.collection.jcl
So questions!
Why are List and Seq defined in package scala and not scala.collection (even though implementations of Seq are in the collection sub-packages)?
What is the standard mechanism for initializing a collection and then freezing it (which in Java is achieved by wrapping in an unmodifiable)?
Why are some collection types (e.g. MultiMap) only defined as mutable? (There is no immutable MultiMap)?
I've read Daniel Spiewak's excellent series on scala collections and am still puzzled by how one would actually use them in practice. The following seems slightly unwieldy due to the enforced full package declarations:
class MyScala {
var mySet: scala.collection.Set[String] = null
def init(): Unit = {
val s = scala.collection.mutable.Set.empty[String]
s += "Hello"
s += "World"
mySet = scala.collection.immutable.Set(s : _ *)
}
}
Although arguably this is more correct than the Java version as the immutable collection cannot change (as in the Java case, where the underlying collection could be altered underneath the unmodifiable wrapper)
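The parenthetical point about the Java wrapper can be demonstrated directly. This is a small sketch with made-up variable names, comparing an unmodifiable view of a set with an unmodifiable wrapper around a private copy of it.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

class UnmodifiableViewDemo {
    public static void main(String[] args) {
        Set<String> backing = new LinkedHashSet<>();
        backing.add("Hello");

        // A read-only view: it still reflects later changes to `backing`.
        Set<String> view = Collections.unmodifiableSet(backing);
        // A wrapper around a private copy: nobody else can reach the copy.
        Set<String> frozen = Collections.unmodifiableSet(new LinkedHashSet<>(backing));

        backing.add("World");

        System.out.println(view);   // [Hello, World]
        System.out.println(frozen); // [Hello]
    }
}
```

In the original Java snippet the local variable s goes out of scope after init(), so nothing can modify the set afterwards; the difference only bites when the backing collection stays reachable.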

Why are List and Seq defined in package scala and not scala.collection (even though implementations of Seq are in the collection sub-packages)?
Because they are deemed so generally useful that they are automatically imported into all programs via synonyms in scala.Predef.
What is the standard mechanism for initializing a collection and then freezing it (which in Java is achieved by wrapping in an unmodifiable)?
Java doesn't have a mechanism for freezing a collection. It only has an idiom for wrapping the (still modifiable) collection in a wrapper that throws an exception. The proper idiom in Scala is to copy a mutable collection into an immutable one - probably using :_*
Why are some collection types (e.g. MultiMap) only defined as mutable? (There is no immutable MultiMap)?
The team/community just hasn't gotten there yet. The 2.7 branch saw a bunch of additions and 2.8 is expected to have a bunch more.
The following seems slightly unwieldy due to the enforced full package declarations:
Scala allows import aliases so it's always less verbose than Java in this regard (see for example java.util.Date and java.sql.Date - using both forces one to be fully qualified)
import scala.collection.{Set => ISet}
import scala.collection.mutable.{Set => MSet}
class MyScala {
var mySet: ISet[String] = null
def init(): Unit = {
val s = MSet.empty[String]
s += "Hello"
s += "World"
mySet = Set(s : _ *)
}
}
Of course, you'd really just write init as def init() { mySet = Set("Hello", "World")} and save all the trouble or better yet just put it in the constructor var mySet : ISet[String] = Set("Hello", "World")

Mutable collections are useful occasionally (though I agree that you should always look at the immutable ones first). If using them, I tend to write
import scala.collection.mutable
at the top of the file, and (for example):
val cache = new mutable.HashMap[String, Int]
in my code. It means you only have to write “mutable.HashMap”, not “scala.collection.mutable.HashMap”. As the commentator above mentioned, you could remap the name in the import (e.g., “import scala.collection.mutable.{HashMap => MMap}”), but:
I prefer not to mangle the names, so that it’s clearer what classes I’m using, and
I use ‘mutable’ rarely enough that having “mutable.ClassName” in my source is not an undue burden.
(Also, can I echo the ‘avoid nulls’ comment too. It makes code so much more robust and comprehensible. I find that I don’t even have to use Option as much as you’d expect either.)

A couple of random thoughts:
I never use null; I use Option, which would then toss a decent error. This practice has gotten rid of a ton of NullPointerException opportunities, and forces people to write decent errors.
Try to avoid looking into the "mutable" stuff unless you really need it.
So, my basic take on your scala example, where you have to initialize the set later, is
class MyScala {
private var lateBoundSet:Option[ Set[ String ] ] = None
def mySet = lateBoundSet.getOrElse( error("You didn't call init!") )
def init {
lateBoundSet = Some( Set( "Hello", "World" ) )
}
}
I've been on a tear recently around the office. "null is evil!"

Note that there might be some inconsistencies in the Scala collections API in the current version; for Scala 2.8 (to be released later in 2009), the collections API is being overhauled to make it more consistent and more flexible.
See this article on the Scala website: http://www.scala-lang.org/node/2060
To add to Tristan Juricek's example with a lateBoundSet: Scala has a built-in mechanism for lazy initialization, using the "lazy" keyword:
class MyClass {
lazy val mySet = Set("Hello", "World")
}
By doing this, mySet will be initialized on first use, instead of immediately when creating a new MyClass instance.
