As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Java is designed to run in a “Sandboxed” environment in contrast to C which is not constrained. Discuss the implications of this from a security point of view.
I have done a bit of research in regards to the concept. What I found was if it's sandboxed where Java runs on, it goes through a more controlled door to retrieve information from different parts of memory. However, in C, it doesn't where it is uncontrolled.
Can someone please explain if there is anything else that I can add?
Edit:
This is what I have added:
Java programming language is an object-oriented language specifically designed to have as few implementation dependencies as possible. Java is write once deploy many times model. It is superficially like C/C++/C# but have different underlying object model. There are many versions of Java such as Java Standard Edition (Java SE), Java Mobile Edition (Java ME), Java Enterprise Edition (Java EE) etc.
The applications written in java are compiled to Java byte code which is an intermediate language independent of any platform. Thus, compiler can work multi platform.
The Virtual Machine model of compilation and execution works by running the compiled code (Java byte code) on Java Virtual Machine (JVM). Java Virtual Machine interprets java byte code and provides an environment in which Java byte code can be executed. The use of the same byte code for all JVMs on all platforms allow Java to be described as a “write once, run anywhere”.
This is better model than the compiler that generates native code due to following reasons:
Mobile devices come in various shape and sizes and with varying processor speed and architecture. Therefore, the virtual machine model of compilation and execution allows all java apps to run on various different mobile platforms once JVM is implemented on a device.
The virtual machine mode of compilation provide a platform-independent programming environment that abstracts away details of the underlying hardware or operating system, and allows a program to execute in the same way on any platform.
Performance comparable to compiled programming languages is achieved by the use of just-in-time compilation (a method to improve the runtime performance of computer program).
Compilation from byte code to machine code is much faster than compiling from source.
The deployed byte code is portable, unlike native code.
Since the runtime has control over the compilation, like interpreted byte code, it can run in a secure sandbox. Compilers from byte code to machine code are easier to write, because the portable byte code compiler has already done much of the work.
Don't mix languages and virtualisation.
Google's Native Client (NaCL) is a sandboxed environment for C/C++ (and everything else) programs, for example.
Here is a 'quick sampling' of the most common security challenges for sand-boxed code, off the top of my head.
In a sand-boxed app. restrictions will be added in the areas of:
Printing.
Cross-domain resource access.
Copy/Paste.
Import/export of resources from/to the local file system.
Applet calling System.exit(n).
Call applet methods from JavaScript..
Warnings attached to free floating elements. Which can even apply to tool-tips of apps. using 3rd party PLAFs.
Access to a variety of properties (e.g. user.*, any number of java.* props)
Load and use natives (that should be obvious).
Start processes.
Redirect output or error streams.
Record sound or video.
Access to the Robot.
..
If this question is about 'security', you have a lot of research to do. ;)
Java is inherently "sandboxable", whereas C++ et al are not. In Java there is no way to get outside of an object -- you can't freely cast an object pointer to char*, for instance, and you can't address beyond the end of an array.
Access to system facilities is limited by in several ways (some of which I forget just now). But basically one can run a Java program that loads secure class loaders and the like, and then any application that is loaded in that environment is "corralled" and cannot do anything that you don't wish to let it do.
(In fact, the "classic" JVM on IBM iSeries made use of these features to prevent even the "main" Java program from running amok. Java code ran in the same address space as the OS, but still the OS was protected by Java's inherent security.)
In C++ et al, in order to accomplish the same thing, you either have to make use of storage protection hardware or else have a compiler that compiles in checks of every access.
Related
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Why isn't more Java software compiled natively?
I know that Java is byte code compiled, but when using the JIT, it will compile the 'hotspots' to native code. Why is there not an option to compile a program to native code?
There is an option to compile Java source code files in binary code it is called GCJ and come from Free Software Foundation.
It is not officially supported by Sun/Oracle but i've used the compiler and it do a great job ;)
Java, as a language, can be implemented, like any language, in many ways. This include full interpretation, bytecode compilation, and native compilation. See Java language specification
The VM specification defines how bytecode should be loaded and executed. It defines the compiled class format, and the execution semantics, e.g. threading issue, and what is called the "memory model". See Java VM specification
While orignally meant to go together, the language and the VM are distinct specifications. You can compile another language to Java bytecode and run it on top of the JVM. And you can implement the Java language in different way, notably without a VM, as long as you follow the expected semantics.
Of course, both are still related, and certain aspects of Java might be hard to support without bytecode interpretation at all, e.g. custom ClassLoader that return bytecode (see ClassLoader.defineClass). I guess the bytecode would need to be JIT'ed immediatly into native code, or maybe are not supported at all.
Which platform native code should it compile to?
Windows, Mac, Linux?
What if the developer works on a different platform than the application is going to run on?
What if the application platform changes, either in the server room or on the desktop?
I don't see the benefit, the JVM's nowadays seem to be to be fast enough for very general purpose needs.
There are several products out there to compile java programs to native code, however they are imperfect, and not at all like the JIT compiler. Some differences include:
Write Once Run Everywhere - it will only work on the target you compile it for.
Dynamic code - you cannot load jars or other Java code at runtime, which is often a feature of application servers, GUI builders and the like.
Runtime profiling - a lot of JIT compiler action involves understanding what the code is doing at runtime, not what it could potentially do under a static analysis, meaning that JIT can outperform a natively compiled application in the right circumstances.
Cannot support all Java features. Things like reflection aren't going to be very meaningful in a compiled program.
Large footprint - when it is compiled to native code, all of the libraries the JVM gives you have to be bundled into the package, causing a very large footprint. It is a tricky problem to figure out what can be left out.
So it is possible, for a certain subset of applications, to compile to native code, but as VMs have gotten faster and faster, and issue #5 above has not really been improved (although project Jigsaw should help with that), it is not a very compelling option for real world applications.
Because it is enough to have byte-code compiled.
If you would compile your own code - you had also compile all libraries.
And it is real problem from two point of view:
1. licensing - most of the code wouldn't be changed
2. you had 'recompile' megatons of code :-)
This was a decision made by Sun to not allow this because they wanted to position Java as being inherently multi-platform. As such, they wanted to ensure that any Java application compiled would run on any platform with a JVM. This prevents there from being Java binaries available on-line which don't run on certain hardware or operating systems.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
After reading the Java White Paper, I have a question on my mind and it might not be a very smart question but here it is anyway:
From what I gathered, Java attempted to improve an a number of fallacies associated with C++ such as redundancy, confusion with pointers, full-object orientation etc.,
If Java managed to overcome these issues, why would it be incorret to say that Java can replace C++.
There are many situations where C++ is a better option than Java. Comparison here.
Specifically:
In addition to running a compiled Java program, computers running Java
applications generally must also run the Java Virtual Machine JVM,
while compiled C++ programs can be run without external applications.
Early versions of Java were significantly outperformed by statically
compiled languages such as C++. This is because the program statements
of these two closely related languages may compile to a few machine
instructions with C++, while compiling into several byte codes
involving several machine instructions each when interpreted by a JVM.
Certain inefficiencies are inherent to the Java language itself,
primarily:
All objects are allocated on the heap. For functions using small objects this can result in performance degradation as stack
allocation, in contrast, costs essentially zero. However, this
advantage is obsoleted by modern JIT compilers utilising escape
analysis or escape detection to allocate objects on the stack. Escape
analysis was introduced in Oracle JDK 6.
Methods are by-default virtual. This slightly increases memory usage by adding a single pointer to a virtual table per each object.
Also, it induces a startup performance penalty, because a JIT compiler
has to do additional optimization passes even for de-virtualization of
small functions.
A lot of casting required even using standard containers induces a performance penalty. However, most of these casts are statically
eliminated by the JIT compiler, and the casts that remain in the code
usually do not cost more than a single CPU cycle on modern processors,
thanks to branch prediction.
Array access must be safe. The compiler is required to put appropriate range checks in the code. Naive approach of guarding each
array access with a range check is not efficient, so most JIT
compilers generate range check instructions only if they cannot
statically prove the array access is safe. Even if all runtime range
checks cannot be statically elided, JIT compilers try to move them out
of inner loops to make the performance degradation as low as possible.
Lack of access to low level details does not allow the developer to better optimize the program where the compiler is unable to do
so.[10]. Programmers can interface with the OS directly by providing
code in C or C++ and calling that code from Java by means of JNI.
Also as an iOS/Mac dev, and strong background in DSP, and lover of many open source C++ and Objective C libraries, I could go on and on as to why Java is not better...
Java didn't have the same goals and so doesn't serve the same function as C++. In other words, Java may have made some improvements, but also some regressions in things that are important for C++ applications.
Therefore Java cannot simply replace C++.
Your question is kind of off-topic for this forum (it's not about a programming problem). Nevertheless, here's my two cents.
Java and C++ serve different needs. For instance, while pointers in C++ (as in C) are certainly complicated, it is precisely because of pointers that one can manipulate specific addresses in memory. This is quite valuable for certain applications and impossible to do in Java (without resorting to native methods—often implemented in (ahem) C++).
C++ is compiled into machine code native to the machine. Java is compiled into machine-independent byte code. Thus C++ tends to have a speed advantage. This is offset somewhat, but not entirely, by just-in-time compilers for Java.
I'm sure others here will post additional differences between the two.
Java is a reasonably good application development platform.
It is not a systems development platform. It can't provide direct access to hardware. It can't be used to implement a Java Virtual Machine (chicken and egg problem).
So you will always need a language that compiles to native code in order to bootstrap your high-level runtime.
Because C++ is often still a factor faster in applications and the JRE hasn't been ported to all platforms/OS's.
Other than that, not everyone agrees that, just from a design perspective, Java is an improvement on C++.
The main answer is probably that the language syntax is not what matters the most. C++ is designed to be compiled as native applications, while Java is designed to be compiled as Java bytecode applications, which run on a Java Virtual Machine.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 12 years ago.
The JIT compiler has been around for some time. Occasionally it comes to my mind that, "hey, why not just compile Java source file directly to machine code?"
Along with compiled library, we can get rid of the cumbersome JVM.
The only barrier I can think of is the garbage collector. How's your thoughts?
PS: Oh, you say portability? WTH is that? Plus, I'm forced to install a JVM in the first place.
Well, my friend uses Ubuntu, and I use Windows XP, and my other friend uses OSX.
When I send them a jar that I compiled they can both run the file without any changes.
That is why you should not get rid of the JVM.
vm can do complex optimizations based on information only available at runtime. Static compile time optimization simply can't compete. Java is fast because it is running on the vm.
watch this
http://www.infoq.com/presentations/Towards-a-Universal-VM
On some platforms (mostly embedded ones), it's just as you say (or else the machines speak java natively). You can also download compilers that do what you are suggesting, but I imagine you lose a lot of the Java API in the process.
Back to your question, the main reason why is that the people who design the languge and specification want to have it. Plain and simple. It offers portability in the consideration that the "hard" part of making portable code supposedly only has to be done once per environment (another poster spoke of 3 different OS's running the JVM) rather than once per each environment per project. Have you ever tried to make even mostly-portable C++ code without the aid of frameworks like Qt or packages like Boost? It gets VERY difficult, and even then you must still re-compile for each architecture.
Beside portability another issue that comes to mind is dynamic classloading which is difficult to handle via machine code. How would servlet containers work in such a scenario? Maybe it works well for embedded Java, but I don't think for J2EE.
Would the .class form just be an intermediate binary that is converted to machine code before execution? Or would you directly compile from Java source to machine code?
Bytecode generation is necessary for platform independence of code.
JVM (JVM is different for all platforms) reads these bytecode and converts these into machine code depending upon which platform its running. This makes Java compiled code platform independent. JVM also does optimizations which makes Java fast.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
Improve this question
I wonder how Java is more portable than C, C++ and .NET and any other language. I have read many times about java being portable due to the interpreter and JVM, but the JVM just hides the architectural differences in the hardware, right? We'd still need different JVMs for different machine architectures. What am I missing here? So if someone writes an abstraction layer for C for the most common architectures, let's say the CVM, then any C program will run on those architectures once CVM is installed, isn't it?
What exactly is this portability? Can .NET be called portable?
Portability isn't a black and white, yes or no kind of thing. Portability is how easily one can I take a program and run it on all of the platforms one cares about.
There are a few things that affect this. One is the language itself. The Java language spec generally leaves much less up to "the implementation". For example, "i = i++" is undefined in C and C++, but has a defined meaning in Java. More practically speaking, types like "int" have a specific size in Java (eg: int is always 32-bits), while in C and C++ the size varies depending on platform and compiler. These differences alone don't prevent you from writing portable code in C and C++, but you need to be a lot more diligent.
Another is the libraries. Java has a bunch of standard libraries that C and C++ don't have. For example, threading, networking and GUI libraries. Libraries of these sorts exist for C and C++, but they aren't part of the standard and the corresponding libraries available can vary widely from platform to platform.
Finally, there's the whole question of whether you can just take an executable and drop it on the other platform and have it work there. This generally works with Java, assuming there's a JVM for the target platform. (and there are JVMs for many/most platforms people care about) This is generally not true with C and C++. You're typically going to at least need a recompile, and that's assuming you've already taken care of the previous two points.
Yes, if a "CVM " existed for multiple platforms, that would make C and C++ more portable -- sort of. You'd still need to write your C code either in a portable way (eg: assuming nothing about the size of an int other than what the standard says) or you'd write to the CVM (assuming it has made a uniform decision for all of these sorts of things across all target platforms). You'd also need to forgo the use of non-standard libraries (no networking, threading or GUI) or write to the CVM-specific libraries for those purposes. So then we're not really talking about making C and C++ more portable, but a special CVM-C/C++ that's portable.
Once again, portability isn't a black and white thing. Even with Java there can still be incompatibilities. The GUI libraries (especially AWT) were kind of notorious for having inconsistent behavior, and anything involving threads can behave differently if you get sloppy. In general, however, it's a lot easier to take a non-trivial Java program written on one platform and run it on another than it is to do the same with a program written in C or C++.
As others have already said, portability is somewhat of a fuzzy concept. From a certain perspective, C is actually more portable than Java. C makes very few assumptions about the underlying hardware. It doesn't even assume that there are 8 bits in a byte, or that negative numbers should be represented using two's complement. Theoretically, as long as you have a Von Neumann based machine and a compiler, you're good to go with C.
In fact, a "Hello world" program written in C is going to work on many more platforms than a "Hello world" program written in Java. You could probably get the same "hello world" program to work on a PDP-11 and an iPhone.
However, the reality is that most real-world programs do a lot more than output "Hello world". Java has a reputation for being more portable than C because in practice, it takes a lot more effort to port real-world C programs to different platforms than real-world Java programs.
This is because the C language is really ANSI-C, which is an extremely general-purpose, bare-bones language. It has no support for network programming, threading, or GUI development. Therefore, as soon as you write a program which includes any of those things, you have to fall back on a less-portable extension to C, like Win32 or POSIX or whatever.
But with Java, network programming, threading, and GUI tools are defined by the language and built into each VM implementation.
That said, I think a lot of programmers also underestimate the progress modern C/C++ has made in regard to portability these days. POSIX goes a long way towards providing cross-platform threading, and when it comes to C++, Boost provides networking and threading libraries which are basically just as portable as anything in Java. These libraries have some platform-specific quirks, but so does Java.
Essentially, Java relies on each platform having a VM implementation which will interpret byte code in a predictable way, and C/C++ relies on libraries which incorporate platform specific code using the preprocessor (#ifdefs). Both strategies allow for cross platform threading, networking, and GUI development. It's simply that Java has made faster progress than C/C++ when it comes to portability. The Java language spec had threading, networking and GUI development almost from day one, whereas the Boost networking library only came out around 2005, and it wasn't until 2011 with C++11 that standard portable threading was included in C++.
When you write a Java program, it runs on all platforms that have JVM written for them - Windows, Linux, MacOS, etc.
If you write a C++ program, you'll have to compile it specifically for each platform.
Now, it is said that the motto of Java "write once, run everywhere" is a myth. It's not quite true for desktop apps, which need interaction with many native resources, but each JavaEE application can be run on any platform. Currently I'm working on windows, and other colleagues are working on Linux - without any problem whatsoever.
(Another thing related to portability is JavaEE (enterprise edition). It is said that applications written with JavaEE technologies run in any JavaEE-certified application server. This, however, is not true at least until JavaEE6. (see here))
Portability is a measure for the amount of effort to make a program run on another environment than where it originated.
Now you can debate if a JVM on Linux is a different environment than on Windows (I would argue yes), but the fact remains that in many cases there is zero effort involved if you take care of avoiding a few gotchas.
The CVM you are talking about is very much what the POSIX libraries and the runtime libraries try to provide, however there are big implementation differences which make the hurdles high to cross. Certainly in the case of Microsoft and Apple these are probably intentionally so in order to keep developers from bringing out products on competing platforms.
On the .net front, if you can stick to what mono provides, an open source .Net implementation, you will enjoy roughly the same kind of portability as Java, but since mono is significantly behind the Windows versions, this is not a popular choice. I do not know how popular this is for server based development where I can imagine it is less of an issue.
Java is portable from the perspective of the developer: code written in Java can be executed in any environment without the need to recompile. C is not portable because not only is it tied to a specific OS in many cases, it is also always tied to a specific hardware architecture once it has been compiled. The same is true for C++. .Net is more portable than C/C++, as it also relies on a virtual machine and is therefore not tied to a specific hardware architecture at compile-time, but it is limited to Windows machines (officially).
You are correct, the JVM is platform-specific (it has to be!), but when you say Java is portable, you are talking about it from a developer standpoint and standard Java developers do not write the JVM, they use it :-).
Edit #Raze2Dust To address your question. Yes, you could. In fact, you could make Java platform-specific by writing a compiler that would generate machine code rather than bytecode. But as some of the other comments suggest, why would you do that? You'd have to create an interpreter that maps the compiled code to operations in the same way the JVM works. So the long and short of it is, absolutely, you definitely could, but why would you?
Java provides three distinct types of portability:
Source code portability: A given Java program should produce identical results regardless of the underlying CPU, operating system, or Java compiler.
CPU architecture portability: the current Java compilers produce object code (called byte-code) for a CPU that does not yet exist. For each real CPU on which Java programs are intended to run, a Java interpreter, or virtual machine, "executes" the J-code. This non-existent CPU allows the same object code to run on any CPU for which a Java interpreter exists.
OS/GUI portability: Java solves this problem by providing a set of library functions (contained in Java-supplied libraries such as awt, util, and lang) that talk to an imaginary OS and imaginary GUI. Just like the JVM presents a virtual CPU, the Java libraries present a virtual OS/GUI. Every Java implementation provides libraries implementing this virtual OS/GUI. Java programs that use these libraries to provide needed OS and GUI functionality port fairly easily.
See this link
You ask if one could write a "C VM". Not exactly. "Java" is a big term used by Sun to mean a lot of things, including both the programming language, and the virtual machine. "C" is just a programming language: it's up to the compiler and OS and CPU to decide what format the resulting binary should be.
C is sometimes said to be portable because it doesn't specify the runtime. The people who wrote your compiler were able to pick things that make sense for that platform. The downside is that C is low-level enough, and platforms are different enough, that it's common for C programs to work fine on one system and not at all on another.
If you combine the C language with a specific ABI, you could define a VM for it, analogous to the JVM. There are a few things like this already, for example:
The "Intel Binary Compatibility Specification" is an example of such an ABI (which almost nobody uses today)
"Microsoft Windows" could also be such an ABI (though a huge and underspecified one), for which Wine is one VM that runs programs written for it
"MS-DOS", for which dosemu is one VM
"Linux" is one of the more popular ones today, whose programs can be run by Linux itself, NetBSD, or FreeBSD
"PA-RISC", for which HP's Dynamo was a JIT-like VM
All of these C VMs are in fact a real machine -- nobody, AFAIK, has ever made a C VM that was purely virtual. This isn't surprising, since C was designed to run efficiently on hardware, so you might as well make it run normally on one system. As HP showed, you can still make a JIT to run the code more efficiently, even on the same platform.
You need the JVM for different architectures, but of course your Java programs run on that JVM. So once you have a JVM for an architecture, then your Java programs are available for that architecture.
So I can write a Java program, compile it down to Java bytecode (which is architecture-agnostic), and that means I can run it on any JVM on any architecture. The JVM abstracts away the underlying architecture and my program runs on a virtual machine.
The idea is that the Java language is portable (or more accurately, the compiled byte-code is portable). You are correct that each VM requires a specific implementation for a given hardware profile. However, once that effort has been made, all java bytecode will run on that platform. You write the java/bytecode once, and it runs on any JVM.
.NET is quite similar, but with a far lower emphasis on the principle. The CLR is analogous to the JVM and it has its own bytecode. Mono exists on *nix, but you are correct that it is not "official".
Portability or as written in Wikipedia, Software Portability is the ability to reuse the same software (code) across multiple environments (OSes). The java JVM is a JVM that can be run on any operating systems it was designed for: Windows, Linux, Mac OSes, etc.
On .NET, it is possible to port your software to different platforms. From Wikipedia:
The design of the .NET Framework
allows it to theoretically be platform
agnostic, and thus cross-platform
compatible. That is, a program written
to use the framework should run
without change on any type of system
for which the framework is
implemented.
And because Microsoft never implemented the .NET framework outside of Windows and seeing that .NET is platform agnostic, Mono has made it possible to run .NET applications and compile code to run in Linux.
For languages such as C++, Pascal, etc. you will have to go to each OS and build it on that platform in order to run it on that platform. The EXE file in Windows isn't the same as the .so in linux (the machine code) since both uses different libraries to talk to the kernel and each OS has its own kernel.
WORE - Write Once Run Everywhere
In reality, this is limited to the platforms that have a JVM, but this covers off the majority of platforms you would wish to deploy to. It is almost a half-way between an interpreted language, and a compiled language, gaining the benefits of both.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
It seems like anything you can do with bytecode you can do just as easily and much faster in native code. In theory, you could even retain platform and language independence by distributing programs and libraries in bytecode then compiling to native code at installation, rather than JITing it.
So in general, when would you want to execute bytecode instead of native?
Hank Shiffman from SGI said (a long time ago, but it's till true):
There are three advantages of Java
using byte code instead of going to
the native code of the system:
Portability: Each kind of computer has its unique instruction
set. While some processors include the
instructions for their predecessors,
it's generally true that a program
that runs on one kind of computer
won't run on any other. Add in the
services provided by the operating
system, which each system describes in
its own unique way, and you have a
compatibility problem. In general, you
can't write and compile a program for
one kind of system and run it on any
other without a lot of work. Java gets
around this limitation by inserting
its virtual machine between the
application and the real environment
(computer + operating system). If an
application is compiled to Java byte
code and that byte code is interpreted
the same way in every environment then
you can write a single program which
will work on all the different
platforms where Java is supported.
(That's the theory, anyway. In
practice there are always small
incompatibilities lying in wait for
the programmer.)
Security: One of Java's virtues is its integration into the Web. Load
a web page that uses Java into your
browser and the Java code is
automatically downloaded and executed.
But what if the code destroys files,
whether through malice or sloppiness
on the programmer's part? Java
prevents downloaded applets from doing
anything destructive by disallowing
potentially dangerous operations.
Before it allows the code to run it
examines it for attempts to bypass
security. It verifies that data is
used consistently: code that
manipulates a data item as an integer
at one stage and then tries to use it
as a pointer later will be caught and
prevented from executing. (The Java
language doesn't allow pointer
arithmetic, so you can't write Java
code to do what we just described.
However, there is nothing to prevent
someone from writing destructive byte
code themselves using a hexadecimal
editor or even building a Java byte
code assembler.) It generally isn't
possible to analyze a program's
machine code before execution and
determine whether it does anything
bad. Tricks like writing
self-modifying code mean that the evil
operations may not even exist until
later. But Java byte code was designed
for this kind of validation: it
doesn't have the instructions a
malicious programmer would use to hide
their assault.
Size: In the microprocessor world RISC is generally preferable
over CISC. It's better to have a small
instruction set and use many fast
instructions to do a job than to have
many complex operations implemented as
single instructions. RISC designs
require fewer gates on the chip to
implement their instructions, allowing
for more room for pipelines and other
techniques to make each instruction
faster. In an interpreter, however,
none of this matters. If you want to
implement a single instruction for the
switch statement with a variable
length depending on the number of case
clauses, there's no reason not to do
so. In fact, a complex instruction set
is an advantage for a web-based
language: it means that the same
program will be smaller (fewer
instructions of greater complexity),
which means less time to transfer
across our speed-limited network.
So when considering byte code vs native, consider which trade-offs you want to make between portability, security, size, and execution speed. If speed is the only important factor, go native. If any of the others are more important, go with bytecode.
I'll also add that maintaining a series of OS and architecture-targeted compilations of the same code base for every release can become very tedious. It's a huge win to use the same Java bytecode on multiple platforms and have it "just work."
The performance of essentially any program will improve if it is compiled, executed with profiling, and the results fed back into the compiler for a second pass. The code paths which are actually used will be more aggressively optimized, loops unrolled to exactly the right degree, and the hot instruction paths arranged to maximize I$ hits.
All good stuff, yet it is almost never done because it is annoying to go through so many steps to build a binary.
This is the advantage of running the bytecode for a while before compiling it to native code: profiling information is automatically available. The result after Just-In-Time compilation is highly optimized native code for the specific data the program is processing.
Being able to run the bytecode also enables more aggressive native optimization than a static compiler could safely use. For example if one of the arguments to a function is noted to always be NULL, all handling for that argument can simply be omitted from the native code. There will be a brief validity check of the arguments in the function prologue, if that argument is not NULL the VM aborts back to the bytecode and starts profiling again.
Bytecode creates an extra level of indirection.
The advantages of this extra level of indirection are:
Platform independence
Can create any number of programming languages (syntax) and have them compile down to the same bytecode.
Could easily create cross language converters
x86, x64, and IA64 no longer need to be compiled as seperate binaries. Only the proper virtual machine needs to be installed.
Each OS simply needs to create a virtual machine and it will have support for the same program.
Just in time compilation allows you to update a program just by replacing a single patched source file. (Very beneficial for web pages)
Some of the disadvantages:
Performance
Easier to decompile
All good answers, but my hot-button has been hit - performance.
If the code being run spends all its time calling library/system routines - file operations, database operations, sending windows messages, then it doesn't matter very much if it's JITted, because most of the clock time is spent waiting for those lower-level operations to complete.
However, if the code contains things we usually call "algorithms", that have to be fast and don't spend much time calling functions, and if those are used often enough to be a performance problem, then JIT is very important.
I think you just answered your own question: platform independence. Platform-independent bytecode is produced and distributed to its target platform. When executed it's quickly compiled to native code either before execution begins, or simultaneously (Just In Time). The Java JVM and presumably the .NET runtimes operate on this principle.
Here: http://slashdot.org/developers/02/01/31/013247.shtml
Go see what the geeks of Slashdot have to say about it! Little dated, but very good comments!
Ideally you would have portable bytecode that compiles Just In Time to native code. I think the reason bytecode interpreters exist without JIT is due primarily to the practical fact that native code compilation adds complexity to a virtual machine. It takes time to build, debug, and maintain that additional component. Not everyone has the time or resources to make that commitment.
A secondary factor is safety. It's much easier to verify an interpreter won't crash than to guarantee the same for native code.
Third is performance. It can often take more time to generate machine code than to interpret bytecode for small pieces of code that only run once.
Portability and platform independence are probably the most notable advantages of bytecode over native code.