Why don't compilers use asserts to optimize? [closed] - java

Consider the following pseudo-C++ code:
std::vector<int> v;
// ... fill the vector and do other work here ...
assert(std::is_sorted(v.begin(), v.end()));
auto x = std::find(v.begin(), v.end(), elementToSearchFor);
find has linear runtime because it is called on a vector, which may be unsorted. But at that line in that specific program we know that either the program is incorrect (i.e. it does not run to the end if the assertion fails) or the vector being searched is sorted, which would allow a binary search in O(log n). A good compiler should be able to optimize it into a binary search.
This is only the simplest case I have found so far (more complex assertions may allow even more optimization).
Do some compilers do this? If yes, which ones? If not, why don't they?
Appendix: Some higher-level languages (especially functional ones) may do this easily, so this is more about C/C++/Java and similar languages.

Rice's Theorem basically states that non-trivial properties of code cannot be computed in general.
The relationship between is_sorted being true and a faster-than-linear search being valid is a non-trivial property of the program, even after is_sorted is asserted.
You can arrange for explicit connections between is_sorted and the ability to use various faster algorithms. The way you communicate this information in C++ to the compiler is via the type system. Maybe something like this:
template<typename C>
struct container_is_sorted {
  C c;
  // forward a bunch of methods to `c`.
};
then, you'd invoke a container-based algorithm that would either use a linear search on most containers, or a sorted search on containers wrapped in container_is_sorted.
This is a bit awkward in C++. In a system where variables could carry different compiler-known type-like information at different points in the same stream of code (types that mutate under operations) this would be easier.
I.e., suppose types in C++ had a sequence of tags, like int{positive, even}, that you could attach to them and change:
int x;
make_positive(x);
Operations on a type that did not actively preserve a tag would automatically discard it.
Then assert( {is sorted}, foo ) could attach the tag {is sorted} to foo. Later code could then consume foo and have that knowledge. If you inserted something into foo, it would lose the tag.
Such tags might be run time (that has cost, however, so unlikely in C++), or compile time (in which case, the tag-state of a given variable must be statically determined at a given location in the code).
In C++, due to the awkwardness of such stuff, we instead by habit simply note it in comments and/or use the full type system to tag things (rvalue vs lvalue references are an example that was folded into the language proper).
So the programmer is expected to know it is sorted, and invoke the proper algorithm given that they know it is sorted.
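For instance, in Java (which the question also mentions), the explicit version might look like the sketch below; the assertion documents the precondition, but it is the programmer who picks the O(log n) algorithm (the isSorted helper is made up for illustration):
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedLookup {
    public static void main(String[] args) {
        List<Integer> v = Arrays.asList(1, 3, 5, 8, 13);

        // Documents (and, when run with -ea, checks) the precondition...
        assert isSorted(v) : "v must be sorted";

        // ...but the programmer, not the compiler, chooses the O(log n)
        // algorithm on the strength of that knowledge.
        int index = Collections.binarySearch(v, 8);
        System.out.println(index); // prints 3
    }

    static boolean isSorted(List<Integer> v) {
        for (int i = 1; i < v.size(); i++) {
            if (v.get(i - 1) > v.get(i)) return false;
        }
        return true;
    }
}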

Well, there are two parts to the answer.
First, let's look at assert:
7.2 Diagnostics <assert.h>
1 The header defines the assert and static_assert macros and refers to another macro, NDEBUG, which is not defined by <assert.h>. If NDEBUG is defined as a macro name at the point in the source file where <assert.h> is included, the assert macro is defined simply as
#define assert(ignore) ((void)0)
The assert macro is redefined according to the current state of NDEBUG each time that <assert.h> is included.
2 The assert macro shall be implemented as a macro, not as an actual function. If the macro definition is suppressed in order to access an actual function, the behavior is undefined.
Thus, there is nothing left in release-mode to give the compiler any hint that some condition can be assumed to hold.
Still, there is nothing stopping you from redefining assert with an implementation-defined __assume in release-mode yourself (take a look at __builtin_unreachable() in clang / gcc).
Let's assume you have done so. Now, the condition tested could be really complicated and expensive. Thus, you really want to annotate it so it does not ever result in any run-time work. Not sure how to do that.
Let's grant that your compiler even allows that, for arbitrary expressions.
The next hurdle is recognizing what the expression actually tests, and how that relates to the code as written and any potentially faster, but under the given assumption equivalent, code.
This last step results in an immense explosion of compiler-complexity, by either having to create an explicit list of all those patterns to test or building a hugely-complicated automatic analyzer.
That's no fun, and just about as complicated as building SkyNET.
Also, you really do not want to use an asymptotically faster algorithm on a data-set which is too small for asymptotic time to matter. That would be a pessimization, and you just about need precognition to avoid such.

Assertions are (usually) compiled out in the final code. Meaning, among other things, that the code could (silently) fail (by retrieving the wrong value) due to such an optimization, if the assertion was not satisfied.
If the programmer (who put the assertion there) knew that the vector was sorted, why didn't he use a different search algorithm? What's the point in having the compiler second-guess the programmer in this way?
How does the compiler know which search algorithm to substitute for which, given that they all are library routines, not a part of the language's semantics?

You said "the compiler". But compilers are not there for the purpose of writing better algorithms for you. They are there to compile what you have written.
What you might have asked is whether the library function std::find could be implemented to detect when it can do better than a linear search. In practice that might be possible if the user passes in std::set (or even std::unordered_set) iterators and the STL implementer knows the details of those iterators and can make use of them, but not in general and not for vector.
assert itself only applies in debug mode, while optimisations are normally wanted for release mode. Also, a failed assert aborts the program; it does not switch to a different algorithm.
Essentially, there are collections provided for faster lookup and it is up to the programmer to choose it and not the library writer to try to second guess what the programmer really wanted to do. (And in my opinion even less so for the compiler to do it).

In the narrow sense of your question, the answer is that they do when they can, but mostly they can't, because the language isn't designed for it and assert expressions are too complicated.
If assert() is implemented as a macro (as it is in C++), has not been disabled (by defining NDEBUG in C++), and the expression can be evaluated at compile time (or its data can be traced), then the compiler will apply its usual optimisations. That doesn't happen often.
In most cases (and certainly in the example you gave) the relationship between the assert() and the desired optimisation is far beyond what a compiler can do without assistance from the language. Given the very low level of meta-programming capability in C++ (and Java) the ability to do this is quite limited.
In the wider sense I think what you're really asking for is a language in which the programmer can make assertions about the intention of the code, from which the compiler can choose between different translations (and algorithms). There have been experimental languages attempting to do that, and Eiffel had some features in that direction, but I'm not aware of any mainstream compiled languages that can do it.

Optimizing it into a binary search should be done by a good compiler.
No! A linear search results in a much more predictable branch. If the array is short enough, linear search is the right thing to do.
Apart from that, even if the compiler wanted to, the list of ideas and notions it would have to know about would be immense and it would have to do nontrivial logic on them. This would get very slow. Compilers are engineered to run fast and spit out decent code.
You might spend some time playing with formal verification tools whose job is to figure out everything they can about the code they're fed in, which asserts can trip, and so forth. They're often built without the same speed requirements compilers have and consequently they're much better at figuring things out about programs. You'll probably find that reasoning rigorously about code is rather harder than it looks at first sight.

Related

Why isn't overflow checked by default

I found some questions on SO about checking operations before execution for over/underflow behavior. It seems that there are ways to do this quite easily. So why isn't there an option to automatically check each arithmetic operation before execution, and why is there no exception for over/underflow of arithmetic operations? Or, phrased differently: in what scenario would it be useful to allow operations to overflow unnoticed?
Is it maybe a matter of run-time cost? Or does overflow mainly occur during non-mathematical operations?
Actually, for C there are checking options, see here: http://danluu.com/integer-overflow/
As for Java, adding integer overflow checks would open a can of worms. Since Java does not offer unsigned types, unsigned math is often done in plain int or long variables; obviously the VM will not be magically aware of the unsigned nature of the intended operation, so you would either need to add unsigned types or the programmer would have to pay a lot of attention to turning the checks on and off. An example of unsigned math with signed types can be found in Arrays.binarySearch. On a side note, Java defines exactly what the result of an overflow is, so relying on overflow behavior is legal use of defined behavior.
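That binarySearch trick looks roughly like this (a simplified sketch, not the actual JDK source); the midpoint computation deliberately lets the signed addition wrap and then repairs it with an unsigned shift, so a blanket overflow check would reject perfectly correct code:
public class UnsignedMidpoint {
    public static void main(String[] args) {
        int low = 1_500_000_000;
        int high = 2_000_000_000;

        // The signed sum overflows int and becomes negative...
        System.out.println((low + high) / 2);   // wrong: a negative "midpoint"

        // ...but an unsigned right shift reinterprets the wrapped sum,
        // yielding the correct midpoint, as in Arrays.binarySearch.
        System.out.println((low + high) >>> 1); // 1750000000
    }
}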
As briefly analyzed in the C link above, these checks can have a severe impact on performance in practice, due to a combination of crude implementation and/or by interfering with other code optimizations.
Also, while most CPUs can detect overflow (usually via the C and V flags), they do it simultaneously for signed and unsigned (common CPU ISAs do not make a distinction between signed and unsigned operations for add/sub). It's up to the program to respond to these flags, which means inserting additional instructions into the code. Again this means the programmer/compiler has to be aware of whether the operation is intended to be signed or unsigned to make the correct choice.
So overflow detection does come with a cost, albeit one that could be made reasonably small with good compiler support.
But in many cases overflows are either not possible by design (e.g. the valid input parameters of a function cannot produce one), desired (e.g. wrap-around counters), or, when they do happen, are caught by other means when the result is used (e.g. by array bounds checking).
I have to think hard for instances where I actually ever felt the need for overflow checking. Usually you're far more concerned with validating the value range at specific points (e.g. function arguments). But those are arbitrary checks for a function-specific value range, which the compiler cannot even know (well, in some languages it could, because it's explicitly expressed, but neither Java nor C falls into this category).
So overflow checking is not universally useful. That doesn't mean there aren't potential bugs it could prevent, but compared to other bug types, overflow isn't really a common issue. I can't remember when I last saw a bug caused by integer overflow; off-by-one bugs are far more common, for example. On the other hand, there are some micro-optimizations that explicitly rely on overflow wraparound (e.g. an old question of mine, see the accepted answer: Performance: float to int cast and clipping result to range).
With the situation as described, forcing C/Java to check and respond to integer overflow would make them worse languages. They would be slower, and/or the programmer would simply deactivate the feature because it gets in the way more than it is useful. That doesn't mean overflow checking as a language feature would generally be bad; but to really get something out of it, the environment also needs to fit (e.g. as mentioned above, Java would need unsigned types).
TL;DR It could be useful, but it requires much deeper language support than just a switch to be useful.
I can offer two potential factors as to why unchecked arithmetic is the default:
Sense of familiarity: Arithmetic in C and C++ is unchecked by default and people who got used to those languages would not expect the program to throw, but to silently continue. This is a misconception, as both C and C++ have undefined behavior on signed integer overflow/underflow. But nonetheless, it has created a certain expectation in many people's minds and new languages in the same family tend to shy away from visibly breaking established conventions.
Benchmark performance: Detecting overflow/underflow usually requires the execution of more instructions than you would need if you decided to ignore it. Imagine how a new language would look if a person not familiar with it wrote a math-heavy benchmark (as often happens) and "proved" that the language is dramatically slower than C and C++ even for the simplest mathematical operations. This would damage people's perception of the language's performance and could hinder its adoption.
The Java language just does not have this feature built in as a keyword or mechanism to apply directly to the +, - and * operators. For example, C# has the checked and unchecked keywords for this. However, such checks can be costly and hard to implement when there is no native support in the language. As of Java 8, the methods addExact, subtractExact and multiplyExact have been added to the API to provide this feature, as pointed out by @Tom in the comments.
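A small illustration of the difference, assuming Java 8 or later:
public class OverflowCheck {
    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;

        // Unchecked: the addition silently wraps around to a negative value.
        System.out.println(max + 1);              // -2147483648

        // Checked: Math.addExact (since Java 8) throws on overflow.
        try {
            System.out.println(Math.addExact(max, 1));
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}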
Why is this not done automatically even if the language supports it? The simple answer is that, in general, over- and underflow can be accepted or even wanted behaviour, or they simply do not occur because of a sophisticated and well-executed design, as it should be. I would say that exploiting over- and underflow is more of a low-level or hardware programming concern, used to avoid additional operations for performance reasons.
Overall, your application design should either explicitly state the sensible use of arithmetic over- and underflow or, better, not need it at all, because it can lead to confusion, unintuitive behaviour or critical bugs. In the first case you don't check, in the second case the check would be useless. An automatic check would be superfluous and only cost performance.
A contrived example of a wanted overflow could be a counter. Say you have an unsigned short and keep incrementing it. After 65535 it wraps back to zero because of the overflow, which can be convenient.
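A minimal sketch of such a counter; Java has no unsigned short, so char (an unsigned 16-bit type) stands in for it here:
public class WrapCounter {
    public static void main(String[] args) {
        char counter = 65535;              // maximum 16-bit value
        counter++;                         // wraps around instead of failing
        System.out.println((int) counter); // prints 0
    }
}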

Why does Guava's Optional use abstract classes when Java 8's uses nulls?

When Java 8 was released, I was expecting to find its implementation of Optional to be basically the same as Guava's. And from a user's perspective, they're almost identical. But Java 8's Optional uses null internally to mark an empty Optional, rather than making Optional abstract and having two implementations. Aside from Java 8's version feeling wrong (you're avoiding nulls by just hiding the fact that you're really still using them), isn't it less efficient to check if your reference is null every time you want to access it, rather than just invoke an abstract method? Maybe it's not, but I'm wondering why they chose this approach.
Perhaps the developers of Google Guava wanted to develop an idiom closer to those of the functional world:
datatype 'a option = NONE | SOME of 'a
In which case you use pattern matching to check the true nature of an instance of type option:
case x of
    NONE   => (* handle the absent case here *)
  | SOME y => (* do something with y here *)
By declaring Optional as an abstract class, Google Guava follows this approach: Optional plays the role of the type 'a option, and the subclasses behind of and absent represent the particular cases of this type (SOME 'a and NONE).
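A minimal sketch of that abstract-class design (illustrative only, not Guava's actual source; the class names here are invented), mirroring the ML datatype above:
abstract class Opt<T> {
    abstract boolean isPresent();
    abstract T get();

    static <T> Opt<T> of(T value) { return new Present<>(value); }

    @SuppressWarnings("unchecked")
    static <T> Opt<T> absent() { return (Opt<T>) Absent.INSTANCE; }
}

// The SOME case: a value is present.
final class Present<T> extends Opt<T> {
    private final T value;
    Present(T value) { this.value = value; }
    boolean isPresent() { return true; }
    T get() { return value; }
}

// The NONE case: a single shared "absent" instance.
final class Absent<T> extends Opt<T> {
    static final Absent<Object> INSTANCE = new Absent<>();
    boolean isPresent() { return false; }
    T get() { throw new IllegalStateException("absent"); }
}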
The design of Option was thoroughly discussed in the lambda mailing list. In the words of Brian Goetz:
The problem is with the expectations. This is a classic "blind men and elephant" problem; the thing called Optional has different "essential natures" to different viewpoints, and the problem is not that each is not valid, the problem is that we're all using the same word to describe different concepts (more precisely, assuming that the goals of the JDK team are the same as the goals of the people you condescendingly refer to as "those familiar with the concept").

There is a narrow design scope of what Optional is being used for in the JDK. The current design mostly meets that; it might be extended in small ways, but the goal is NOT to create an option monad or solve the problems that the option monad is intended to solve. (Even if we did, the result would still likely not be satisfactory; without the rest of the class library following the same monadic API conventions, without higher-kinded generics to abstract over different kinds of monads, without linguistic support for flatmap in the form of the <- operator, without pattern matching, etc, etc, the value of turning Optional into a monad is greatly decreased.) Given that this is not our goal here, we're stopping where it stops adding value according to our goals. Sorry if people are upset that we're not turning Java into Scala or Haskell, but we're not.

On a purely practical note, the discussions surrounding Optional have exceeded its design budget by several orders of magnitude. We've carefully considered the considerable input we've received, spent no small amount of time thinking about it, and have concluded that the current design center is the right one for the current time. What is surely meant as well-intentioned input is in fact rapidly turning into a denial-of-service attack. We could spend endless time arguing this back and forth, and there'd be no JDK 8 as a result. I'm sure no one wants that.

So, let's keep our input on the subject to that which is within the design center of the current implementation, rather than trying to convince us to change the design center.
I would expect virtual method invocation to be more expensive: you have to load the virtual function table, look up an offset, and then invoke the method. A null check is a single bytecode that reads from a register rather than from memory.
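For comparison, the null-sentinel design being discussed looks roughly like this (a simplified sketch, not the actual java.util.Optional source); the isPresent check is the plain null comparison mentioned above:
final class NullSentinelOptional<T> {
    private static final NullSentinelOptional<?> EMPTY =
            new NullSentinelOptional<>(null);

    private final T value; // null marks the empty Optional

    private NullSentinelOptional(T value) {
        this.value = value;
    }

    @SuppressWarnings("unchecked")
    static <T> NullSentinelOptional<T> empty() {
        return (NullSentinelOptional<T>) EMPTY;
    }

    static <T> NullSentinelOptional<T> of(T value) {
        if (value == null) throw new NullPointerException();
        return new NullSentinelOptional<>(value);
    }

    boolean isPresent() {
        return value != null;  // a null check instead of virtual dispatch
    }

    T get() {
        if (value == null) throw new IllegalStateException("no value present");
        return value;
    }
}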

What are the subphases of the semantics analysis compiler phase?

I took an interest in finding out how a compiler really works. I looked through several books and all of them agree that the compiler phases are roughly as follows (correct me if I'm wrong): lexical analysis, syntax analysis, semantic analysis, intermediate code, code optimization, code generation. The lexical and syntax phases look pretty clear and straightforward as methods (but this does not mean easy, of course). However, I'm still not able to find out what the semantic phase really consists of. For one, I know that there should be some subphases like scope checking, declaration checking and type checking, but the question that has been bothering me is: are there other things that have to be done? Can you tell me what the mandatory steps are during this phase? I know this strongly depends on the programming language and the compiler implementation, but could you give me some examples concerning C/C++ or Java? And could you please point me to a book/page/article where I can read about those things in depth? Thanks.
Edit:
The books I looked through were "Compilers: Principles, Techniques, and Tools" (Aho) and "Modern Compiler Design" (Grune, Reeuwijk). I haven't been able to answer this question using them. If you find this question too broad, could you please give an answer considering a compiler implementation of your choice for either C, C++ or Java?
There are typical "semantic analysis" phases that many compilers go through in one form or another. After lexing and parsing, the following actions typically occur in this order:
Name and type resolution. Determines lexical scopes, identifiers declared in such scopes, the type information for those identifiers, and for each non-declaration use of an identifier, the declaration to which it refers
Control flow analysis. The construction of a control flow graph over the computations that are explicit and/or implied (e.g., constructors) in the code.
Data flow analysis. Determines where variables receive new values, and where those values are read by other parts of the program. (This often involves a local analysis within procedures, possibly followed by one across procedures.)
Also often done, as part of data flow analysis:
Points-to analysis. Determination, for each pointer at each location in the code, of which entities that pointer might reference.
Call graph. Construction of a call graph across the procedures, often taking into account indirect function pointers whose estimated values occur during the points-to analysis.
As a practical matter, some of these need to be interleaved to produce better results.
Beyond this, there are many analyses used to support various optimizations and code generation passes. If you really want to know more, consult any decent compiler book.
As already mentioned by templatetypedef, semantic analysis is language specific. For C++ it would among other things involve what template instantiations are required (the C++ language tends towards more and more semantic analysis), and for Java there would need to be some checked exception analysis.
Even for C, the GNU C compiler can be configured to check the arguments given to printf-style format strings. I guess there are hundreds of semantic-analysis-related options for GCC to choose from. If you are doing a paper on the subject, you could spend an afternoon counting them :)
Besides availability, I find that the semantic analysis is what differentiates the statically typed imperative object-oriented languages of today.
You can't necessarily divide it into sub-phases at all. There are a number of things that have to be done, but at least conceptually they are all done while walking the parse tree from top to bottom and back up again. What exactly they are and how exactly it all happens depends on the language, the statement being processed, the specific compiler writer, ...
You could start to make a list:
Build symbol table.
Find the declarations of variables referenced.
Check compatibility of variable datatypes.
Establish subexpression types.
...
You can see that even these must be somewhat intermingled in practice, rather than constituting separable sub-phases.
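To make the first couple of items a little more concrete, here is a toy sketch (class and method names invented for illustration) of the kind of scoped symbol table that name resolution and declaration checking typically rely on:
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy scoped symbol table: a stack of maps, one map per lexical scope.
// declare() adds a name to the innermost scope; lookup() walks outward.
public class SymbolTable {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void enterScope() { scopes.push(new HashMap<>()); }
    void exitScope()  { scopes.pop(); }

    void declare(String name, String type) {
        scopes.peek().put(name, type);
    }

    // Returns the declared type, or null if the name is undeclared.
    String lookup(String name) {
        for (Map<String, String> scope : scopes) {
            String type = scope.get(name);
            if (type != null) return type;
        }
        return null;
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        table.enterScope();                    // global scope
        table.declare("x", "int");
        table.enterScope();                    // inner block
        table.declare("x", "double");          // shadows the outer x
        System.out.println(table.lookup("x")); // double
        table.exitScope();
        System.out.println(table.lookup("x")); // int
    }
}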

Java Program Specialization - What is it? I don't understand it

I'm reading about program specialization - specifically for Java - and I don't think I quite understand it, to be honest. So far what I understand is that it is a method for optimizing the efficiency of programs by constraining parameters or inputs? How is that actually done? Can someone maybe explain to me how it helps, and maybe give an example of what it actually does and how it's done?
Thanks
I have been reading:
Program Specialization - java
Program specialization is the process of specializing a program when you know in advance what arguments it is going to receive.
One example: if you have a test and you know that, with your arguments, the block will never be entered, you can eliminate the test.
You create a specialized version of the program for a certain kind of input.
Basically, it helps to get rid of work that is useless for your input. However, with modern architectures and compilers (at least in C), you're not going to win a lot in terms of performance.
From the same authors, I would recommend the Tempo work.
EDIT
From the TOPLAS paper:
Program specialization is a program transformation technique that optimizes a program fragment with respect to information about a context in which it is used, by generating an implementation dedicated to this usage context. One approach to automatic program specialization is partial evaluation, which performs aggressive inter-procedural constant propagation of values of all data types, and performs constant folding and control-flow simplifications based on this information [Jones et al. 1993]. Partial evaluation thus adapts a program to known (static) information about its execution context, as supplied by the user (the programmer). Only the program parts controlled by unknown (dynamic) data are reconstructed. Partial evaluation has been extensively investigated for functional languages [Bondorf 1990; Consel 1993], logic languages [Lloyd and Shepherdson 1991], and imperative languages [Andersen 1994; Baier et al. 1994; Consel et al. 1996].
Interesting.
It's not a very common term, at least I haven't come across it before.
I don't have time to read the whole paper, but it seems to refer to the potential to optimise a program depending on the context in which it will be run. An example in the paper shows an abstract "power" operation being optimised through adding a hard-coded "cube" operation. These optimisations can be done automatically, or may require programmer "hints".
It's probably worth pointing out that specialization isn't specific to Java, although the paper you link to describes "JSpec", a Java code specializer.
It looks like Partial Evaluation applied to Java.
The idea is that if you have a general function F(A,B) with two parameters A and B, and (just suppose) every time it is called A is always the same, then you could transform F(A,B) into a new function FA(B) that only takes one parameter, B. This function should be faster because it does not have to process the information in A - it already "knows" it. It can also be smaller, for the same reason.
This is closely related to code generation.
In code generation, you write a code generator G to take input A and write the small, fast specialized function FA. G(A) -> FA.
In specialization, you need three things, the general program F, the specializer S, and the input A: S(F,A) -> FA.
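A rough Java sketch of those three pieces (all names invented for illustration; a closure merely fixes the argument, whereas a real specializer such as Tempo would generate new, optimized code):
import java.util.function.IntUnaryOperator;

public class Specialization {
    // The general program F(A, B): raise a base (B) to a power (A).
    static int power(int exp, int base) {
        int result = 1;
        for (int i = 0; i < exp; i++) result *= base;
        return result;
    }

    // A hand-written specialization F_A for A = 3, the "cube" case:
    // the loop over the exponent has disappeared entirely.
    static int cube(int base) {
        return base * base * base;
    }

    // A crude stand-in for the specializer S(F, A) -> F_A: it only closes
    // over A instead of generating and optimizing new code for it.
    static IntUnaryOperator specializePower(int exp) {
        return base -> power(exp, base);
    }

    public static void main(String[] args) {
        System.out.println(power(3, 5));                      // 125, general
        System.out.println(cube(5));                          // 125, hand-specialized
        System.out.println(specializePower(3).applyAsInt(5)); // 125, via closure
    }
}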
I think it's a case of divide-and-conquer.
In code generation, you only have to write G(A), which is simple because it only has to consider all As, while the generated program considers all the Bs.
In Partial Evaluation, you have to get an S somewhere, and you have to write F(A,B) which is more difficult because it has to consider the cross product of all possible As and Bs.
In personal experience, a program F(A,B) had to be written to bridge real-time changes from an older hierarchical database to a newer relational one. A was the meta-description of how to map the old database to the new, in the form of another database. B was the changes being made to the original database, and F(A,B) computed the corresponding changes to the newer database. Since A changed at low frequency (weekly), F(A,B) did not have to be written. Instead a generator G(A) was written (in C) to generate FA(B) (in C). Time saved was roughly an order of magnitude of development time, and two orders of magnitude of run time.

As a Java programmer learning Python, what should I look out for? [closed]

Much of my programming background is in Java, and I'm still doing most of my programming in Java. However, I'm starting to learn Python for some side projects at work, and I'd like to learn it as independent of my Java background as possible - i.e. I don't want to just program Java in Python. What are some things I should look out for?
A quick example - when looking through the Python tutorial, I came across the fact that defaulted mutable parameters of a function (such as a list) are persisted (remembered from call to call). This was counter-intuitive to me as a Java programmer and hard to get my head around. (See here and here if you don't understand the example.)
Someone also provided me with this list, which I found helpful, but short. Anyone have any other examples of how a Java programmer might tend to misuse Python...? Or things a Java programmer would falsely assume or have trouble understanding?
Edit: Ok, a brief overview of the reasons addressed by the article I linked to to prevent duplicates in the answers (as suggested by Bill the Lizard). (Please let me know if I make a mistake in phrasing, I've only just started with Python so I may not understand all the concepts fully. And a disclaimer - these are going to be very brief, so if you don't understand what it's getting at check out the link.)
A static method in Java does not translate to a Python classmethod
A switch statement in Java translates to a hash table in Python
Don't use XML
Getters and setters are evil (hey, I'm just quoting :) )
Code duplication is often a necessary evil in Java (e.g. method overloading), but not in Python
(And if you find this question at all interesting, check out the link anyway. :) It's quite good.)
Don't put everything into classes. Python's built-in list and dictionaries will take you far.
Don't worry about keeping one class per module. Divide modules by purpose, not by class.
Use inheritance for behavior, not interfaces. Don't create an "Animal" class for "Dog" and "Cat" to inherit from, just so you can have a generic "make_sound" method.
Just do this:
class Dog(object):
    def make_sound(self):
        return "woof!"

class Cat(object):
    def make_sound(self):
        return "meow!"

class LolCat(object):
    def make_sound(self):
        return "i can has cheezburger?"
The referenced article has some good advice that can easily be misquoted and misunderstood. And some bad advice.
Leave Java behind. Start fresh. "do not trust your [Java-based] instincts". Saying things are "counter-intuitive" is a bad habit in any programming discipline. When learning a new language, start fresh, and drop your habits. Your intuition must be wrong.
Languages are different. Otherwise, they'd be the same language with different syntax, and there'd be simple translators. Because there are not simple translators, there's no simple mapping. That means that intuition is unhelpful and dangerous.
"A static method in Java does not translate to a Python classmethod." This kind of thing is really limited and unhelpful. Python has a staticmethod decorator. It also has a classmethod decorator, for which Java has no equivalent.
This point, BTW, also included the much more helpful advice on not needlessly wrapping everything in a class. "The idiomatic translation of a Java static method is usually a module-level function".
The Java switch statement in Java can be implemented several ways. First, and foremost, it's usually an if elif elif elif construct. The article is unhelpful in this respect. If you're absolutely sure this is too slow (and can prove it) you can use a Python dictionary as a slightly faster mapping from value to block of code. Blindly translating switch to dictionary (without thinking) is really bad advice.
Don't use XML. Doesn't make sense when taken out of context. In context it means don't rely on XML to add flexibility. Java relies on describing stuff in XML; WSDL files, for example, repeat information that's obvious from inspecting the code. Python relies on introspection instead of restating everything in XML.
But Python has excellent XML processing libraries. Several.
Getters and setters are not required in Python the way they're required in Java. First, you have better introspection in Python, so you don't need getters and setters to help make dynamic bean objects. (For that, you use collections.namedtuple).
However, you have the property decorator which will bundle getters (and setters) into an attribute-like construct. The point is that Python prefers naked attributes; when necessary, we can bundle getters and setters to appear as if there's a simple attribute.
Also, Python has descriptor classes if properties aren't sophisticated enough.
Code duplication is often a necessary evil in Java (e.g. method overloading), but not in Python. Correct. Python uses optional arguments instead of method overloading.
The bullet point went on to talk about closure; that isn't as helpful as the simple advice to use default argument values wisely.
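To see the Java side of that contrast, here is a hedged sketch of the overloading ladder that a single Python method with default argument values replaces:
// Each overload exists only to supply a default for a missing argument;
// in Python, one method with default parameter values covers all three calls.
public class Greeter {
    String greet() {
        return greet("world");
    }

    String greet(String name) {
        return greet(name, "Hello");
    }

    String greet(String name, String salutation) {
        return salutation + ", " + name + "!";
    }

    public static void main(String[] args) {
        Greeter g = new Greeter();
        System.out.println(g.greet());                  // Hello, world!
        System.out.println(g.greet("Python"));          // Hello, Python!
        System.out.println(g.greet("Python", "Howdy")); // Howdy, Python!
    }
}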
One thing you might be used to in Java that you won't find in Python is strict privacy. This is not so much something to look out for as it is something not to look for (I am embarrassed by how long I searched for a Python equivalent to 'private' when I started out!). Instead, Python has much more transparency and easier introspection than Java. This falls under what is sometimes described as the "we're all consenting adults here" philosophy. There are a few conventions and language mechanisms to help prevent accidental use of "unpublic" methods and so forth, but the whole mindset of information hiding is virtually absent in Python.
The biggest one I can think of is not understanding or not fully utilizing duck typing. In Java you're required to specify very explicit and detailed type information upfront. In Python typing is both dynamic and largely implicit. The philosophy is that you should be thinking about your program at a higher level than nominal types. For example, in Python, you don't use inheritance to model substitutability. Substitutability comes by default as a result of duck typing. Inheritance is only a programmer convenience for reusing implementation.
Similarly, the Pythonic idiom is "beg forgiveness, don't ask permission". Explicit typing is considered evil. Don't check whether a parameter is a certain type upfront. Just try to do whatever you need to do with the parameter. If it doesn't conform to the proper interface, it will throw a very clear exception and you will be able to find the problem very quickly. If someone passes a parameter of a type that was nominally unexpected but has the same interface as what you expected, then you've gained flexibility for free.
The most important thing, from a Java POV, is that it's perfectly ok to not make classes for everything. There are many situations where a procedural approach is simpler and shorter.
The next most important thing is that you will have to get over the notion that the type of an object controls what it may do; rather, the code controls what objects must be able to support at runtime (this is by virtue of duck-typing).
Oh, and use native lists and dicts (not customized descendants) as far as possible.
The way exceptions are treated in Python is different from how they are treated in Java. While in Java the advice is to use exceptions only for exceptional conditions, this is not so with Python.
In Python, things like iterators make use of the exception mechanism to signal that there are no more items. Such a design is not considered good practice in Java.
As Alex Martelli puts it in his book Python in a Nutshell, the exception approach in other languages (and applicable to Java) is LBYL (Look Before You Leap): check in advance, before attempting an operation, for all circumstances that might make the operation invalid.
With Python, by contrast, the approach is EAFP: it's Easier to Ask Forgiveness than Permission.
A corollary to "Don't use classes for everything": callbacks.
The Java way for doing callbacks relies on passing objects that implement the callback interface (for example ActionListener with its actionPerformed() method). Nothing of this sort is necessary in Python, you can directly pass methods or even locally defined functions:
def handler():
    print("click!")

button.onclick(handler)
Or even lambdas:
button.onclick(lambda: print("click!\n"))
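For comparison, a sketch of the Java pattern being contrasted (the Button and ClickListener types are invented for illustration): pre-Java-8 code needs an anonymous class implementing the listener interface, while a Java 8 lambda narrows the gap:
// Hypothetical callback API, used only to illustrate the Java idiom.
interface ClickListener {
    void onClick();
}

class Button {
    private ClickListener listener;

    void setOnClick(ClickListener listener) {
        this.listener = listener;
    }

    void click() {
        if (listener != null) listener.onClick();
    }
}

public class CallbackDemo {
    public static void main(String[] args) {
        Button button = new Button();

        // Pre-Java-8 style: an anonymous class implementing the interface.
        button.setOnClick(new ClickListener() {
            @Override
            public void onClick() {
                System.out.println("click!");
            }
        });
        button.click();

        // Since Java 8, a lambda works because ClickListener has one method.
        button.setOnClick(() -> System.out.println("click!"));
        button.click();
    }
}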
