I would to pick up a new programming language - Java, having been using Python for some time. But it seems most things that can be done with Java can be done with Python. So I would like to know
What kind of things can be done with Java but not Python?
mobile programming (Android).
POSIX Threads Programming.
Conversely, What kind of things can be done with Python but not Java if any?
clarification:
I hope to get an answer from a practical point of view but not a theoretical point of view and it should be about the current status, not future. So theoretically all programming languages can perform any task, practically each is limited in some way.
Both languages are Turing complete, both have vast libraries, and both support extensions written in C so that you can access low level code if needed. The main difference is where they are currently supported. Java in general has wider support than Python.
Your example of Android is one place where Java is the standard choice, although Python also has some support in the form of Android Scripting Environment. Java is already installed on most home computers. You can write Java applets and expect them to work in most browsers.
One thing you can't easily do in Java is quickly write short scripts that perform useful tasks. Python is more suitable for scripting than Java (although there are of course other alternatives too).
I guess using Jython, you can do anything with Python that you can do in Java.
Conversely, Python has the PyPy compiler, which is pretty cool - a virtual machine with multiple backeds (Java Runtime, LLVM, .net, and Python IIRC), multiple garbage collectors, multiple implementations (Stackless), etc. I know Java has a big choice of virtual machines, but the growth of PyPy is amazing, due to it being written in RPython - a fairly productive language.
Also, can a Java do this, in 1 file and less that 20 lines, with no library imports? Obviously both languages have libraries that can do this, but I'm just talking about the flexibility of the languages.
class Logger(object): # boilerplate code
def log(self,level,msg,*args,**kwargs): # *args, **kwargs = flexible arguments
self._log(level,msg,*args,**kwargs) # call with flexible argments
def _log(self,level,msg,*args,**kwargs):
# override me at runtime :)
# I think Java people call this Dependency Runtime Injection
if level>1:
print msg,args,kwargs
logger = Logger() # boilerplate code
def logged(level): # what pattern do you call this?
def logged_decorator(function): # and this?
def func(*args,**kwars):
name = func.__name__ # look ma, reflective metaprogramming!
logger.log(level,name,*args,**kwargs)
return func(*args,**kwargs)
return func # boilerplate code
return logged_decorator # boilerplate code
Example use:
#logged
def my_func(arg1,arg2):
# your code here
pass
You would surely love reading the comparisons made below between these 2 languages.
Check them :
Java is Dead ! Long live Python
Python-Java : A side-by-side comparison
Python is NOT java
What Python Can do Java Can't -
Nothing.
What Java Can do but Python Can't -
Java is a multithreading boss. So if you are trying to write a web server, where multiple requests come at the same time, java can just spawn multiple threads and CPU swaps them. That's why almost all big companies (Expedia, LinkedIn, Goldman, Amazon, Netflix, CITY, JPMC, VISA almost all) uses the JVM Web server for their main application.
CPython has something called GIL, which prevents itself from using OS-level thread efficiently. How Python Application servers (Gunicorn, Django ..) work then? Well, they fork new Processes instead of threads. Threads are lightweight, so forking new processes instead of threads still work till a threshold, but not a very scalable solution.
Sheer Execution speed - When you take var a = 1. and then do a = 10.07, you just stored float value in a variable that previously-stored an integer. When variables are assigned a new value then internally, Python creates a new object to store the value. This pointer is then assigned to the variable, this is called dynamic binding. Dynamic Binding(Python) is slower than Static Binding(Java) - as it requires object creation. And Java or C/C++ primitive types are an order of magnitude faster because of static binding.
Space used (RAM usage) - The default space python uses to store a variable is huge.
>>> import math, sys
>>> a=0
>>> sys.getsizeof(a)
24
But in Java, you can take a Byte size variable, which will take just a byte. So nobody writes an application in Python if asked to write memory-efficient software(Think about you are asked to create a DataBase like Cassandra, maybe design a compute engine like Spark).
Packaging - In Java, you can create something like a Jar. Which can run on any machine where JVM is installed. and that JAR contains all the dependencies. In python you can't just ship something like a JAR, you will have to write a script to install dependencies in every machine you want to run your code on.
Think about if Android was a Python application, before installing an App from the play store, you had to install the dependencies of that app separately.
Performance . Performance . Performance -
Java’s efficiency largely comes from its Just-In-Time (JIT) compiler and support for concurrency. The JIT compiler is a part of the Java Runtime Environment. It improves performance of Java programs by compiling bytecodes into native machine code “just in time” to run. Java Virtual Machine (JVM) calls the compiled code directly. Since the code is not interpreted, compiling does not require processor time and memory usage. Theoretically, this can make a Java program as fast as a native application.
While Java programs are compiled directly, Python is interpreted which slows down Python programs during runtime. Determining the variable type which occurs during runtime increases the workload of the interpreter. Also, remembering the object type of objects retrieved from container objects contributes to memory usage.
We can go on and on ...
People use python because python is easy to learn, easy to use. A lot of libraries for ML / DataScience etc . that saves you coding. But if you are asked to write a performant, scalable, durable, long term application, People who understand comp sci, will always choose Java or C/C++
CPython has a lot of libraries with bindings to native libraries--not so Java.
Related
Someone told me there was more overhead for Java because you can essentially run it on most operating systems and that C# doesn't have that overhead so then it can execute at near C++ speeds.
So is there more overhead in Java, or does each OS has it's own overhead for it's JVM implementation?
C#, Java (and I'll toss it in there too - JavaScript) are languages. Languages are not fast or slow, they just are specifications for how we humans write things that are to eventually be handled by a computer.
The JVM is the Java Virtual Machine. But there are several different versions of it. There's HotSpot (the original), OpenJDK, And then one can look at JRockit from BEA, Apache Harmony and a bunch more.
For C# there is the CLR, but there's also Mono's runtime. There are also others that have been abandoned over time.
JavaScript (because I'm tossing that in there) has an entire army of runtimes. Some of those runtimes are faster than others.
It is the runtime that is faster or slower than another - even possibly for the same language. But that one is 'cross platform' and another is not is not enough of an indication to say that one is faster than the other. There are a great many other things at work and benchmarks can be constructed that show one combination being faster than another for each one.
Going even further, one can look at languages that span multiple runtimes. You've got Python with CPython as its default implementation - but there's Jython that runs in the JVM and IronPython that runs in the CLR. Similar examples can be found with Ruby, IronRuby, and JRuby or Clojure which can be compiled to JavaScript via ClojureScript and then run on one of the JavaScript runtimes rather than a JVM.
Again, its not the language that is fast or slow - but rather how its implemented in its runtime.
The Java language and the Java Virtual Machine (JVM) are completely separate entities. Oracle has done an excellent job of separating the two, so that other languages (like Scala or even Ruby) can run on the JVM.
The Java language itself is definitely written with the intent of targeting the JVM, but, so far as I know, there is no actual requirement that it must. So far as I know it's completely possible to write a Java compiler that generates native code, rather than Java bytecode. (This is all completely hypothetical. I've never heard of anyone actually doing that - there would be very little point. Current implementations of the JVM tend to be almost as fast as native code, and any benefit gained by this would be greatly outweighed by the loss of portability it would entail.)
The situation is further complicated by the fact that C# doesn't exactly have a VM, as discussed here. So the best comparison you can make is "does this implementation of the JVM run this Java code faster than that implementation of the .NET framework runs that C# code?"
In the end, unless there is a remarkable speed difference for very similar code, the comparison just isn't that compelling because there are too many variables. Use a different JVM, or a different Java compiler, or a different .NET implementation, or a different C# compiler, or run the same code on a different machine, and the numbers change.
After searching for an option to run Java code from Django application(python), I found out that Py4J is the best option for me. I tried Jython, JPype and Python subprocess and each of them have certain limitations:
Jython. My app runs in python.
JPype is buggy. You can start JVM just once after that it fails to start again.
Python subprocess. Cannot pass Java object between Python and Java, because of regular console call.
On Py4J web site is written:
In terms of performance, Py4J has a bigger overhead than both of the previous solutions (Jython and JPype) because it relies on sockets, but if performance is critical to your application, accessing Java objects from Python programs might not be the best idea.
In my application performance is critical, because I'm working with Machine learning framework Mahout. My question is: Will Mahout also run slower because of Py4J gateway server or this overhead just mean that invoking Java methods from Python functions is slower (in latter case performance of Mahout will not be a problem and I can use Py4J).
JPype issue that #HIP_HOP mentioned with JVM getting detached from new threads can be overcome with the following hack (add it before the first call to Java objects in the new thread which does not have JVM yet):
# ensure that current thread is attached to JVM
# (essential to prevent JVM / entire container crashes
# due to "JPJavaEnv::FindClass" errors)
if not jpype.isThreadAttachedToJVM():
jpype.attachThreadToJVM()
PySpark uses Py4J quite successfully. If all the heavylifting is done on Spark (or Mahout in your case) itself, and you just want to return result back to "driver"/Python code, then Py4J might work for you very well as well.
Py4j has slightly bigger overhead for huge results (that's not necessarily the case for Spark workloads, as you only return summaries /aggregates for the dataframes). There is an improvement discussion for py4j to switch to binary serialization to remove that overhead for higher badnwidth requirements too: https://github.com/bartdag/py4j/issues/159
My solutions
java thread/process <-> Pipes <-> py subprocess
Use pipes by java's ProcessBuilder to call py with args "-u" to transfer data via pipes.
Here is a good practice.
https://github.com/JULIELab/java-stdio-ipc
Here is my stupid research result about "java <-> py"
[Jython] Java implement of python.
[Jpype] JPype is designed to allow the user to exercise Java as fluidly as possible from within Python. We can break this down into a few specific design goals.
Unlike Jython, JPype does not achieve this by re-implementing Python, but instead by interfacing both virtual machines at the native level. This shared memory based approach achieves good computing performance while providing the access to the entirety of CPython and Java libraries.
[Runtime] The Runtime class in java (old method).
[Process] Java ProcessBuilder class gives more structure to the arguments.
[Pipes] Named pipes could be the answer for you. Use subprocess. Popen to start the Java process and establish pipes to communicate with it.
Try mkfifo() implementation in python.
https://jj09.net/interprocess-communication-python-java/
-> java<-> Pipes <-> py https://github.com/JULIELab/java-stdio-ipc
[Protobuf] This is the opensource solution Google uses to do IPC between Java and Python. For serializing and deserializing data efficiently in a language-neutral, platform-neutral, extensible way, take a look at Protocol Buffers.
[Socket] CS-arch throgh socket
Server(Python) - Client(Java) communication using sockets
https://jj09.net/interprocess-communication-python-java/
Send File From Python Server to Java Client
[procbridge]
A super-lightweight IPC (Inter-Process Communication) protocol over TCP socket. https://github.com/gongzhang/procbridge
https://github.com/gongzhang/procbridge-python
https://github.com/gongzhang/procbridge-java
[hessian binary web service protocol] using python client and java server.
[Jython]
Jython is a reimplementation of Python in Java. As a result it has much lower costs to share data structures between Java and Python and potentially much higher level of integration. Noted downsides of Jython are that it has lagged well behind the state of the art in Python; it has a limited selection of modules that can be used; and the Python object thrashing is not particularly well fit in Java virtual machine leading to some known performance issues.
[Py4J]
Py4J uses a remote tunnel to operate the JVM. This has the advantage that the remote JVM does not share the same memory space and multiple JVMs can be controlled. It provides a fairly general API, but the overall integration to Python is as one would expect when operating a remote channel operating more like an RPC front-end. It seems well documented and capable. Although I haven’t done benchmarking, a remote access JVM will have a transfer penalty when moving data.
[Jep]
Jep stands for Java embedded Python. It is a mirror image of JPype. Rather that focusing on accessing Java from within Python, this project is geared towards allowing Java to access Python as a sub-interpreter. The syntax for accessing Java resources from within the embedded Python is quite similar to support for imports. Notable downsides are that although Python supports multiple interpreters many Python modules do not, thus some of the advantages of the use of Python may be hard to realize. In addition, the documentation is a bit underwhelming thus it is difficult to see how capable it is from the limited examples.
[PyJnius]
PyJnius is another Python to Java only bridge. Syntax is somewhat similar to JPype in that classes can be loaded in and then have mostly Java native syntax. Like JPype, it provides an ability to customize Java classes so that they appear more like native classes. PyJnius seems to be focused on Android. It is written using Cython .pxi files for speed. It does not include a method to represent primitive arrays, thus Python list must be converted whenever an array needs to be passed as an argument or a return. This seems pretty prohibitive for scientific code. PyJnius appears is still in active development.
[Javabridge]
Javabridge is direct low level JNI control from Python. The integration level is quite low on this, but it does serve the purpose of providing the JNI API to Python rather than attempting to wrap Java in a Python skin. The downside being of course you would really have to know a lot of JNI to make effective use of it.
[jpy]
This is the most similar package to JPype in terms of project goals. They have achieved more capabilities in terms of a Java from Python than JPype which does not support any reverse capabilities. It is currently unclear if this project is still active as the most recent release is dated 2014. The integration level with Python is fairly low currently though what they do provide is a similar API to JPype.
[JCC]
JCC is a C++ code generator that produces a C++ object interface wrapping a Java library via Java’s Native Interface (JNI). JCC also generates C++ wrappers that conform to Python’s C type system making the instances of Java classes directly available to a Python interpreter. This may be handy if your goal is not to make use of all of Java but rather have a specific library exposed to Python.
[VOC] https://beeware.org/project/projects/bridges/voc/_
A transpiler that converts Python bytecode into Java bytecode part of the BeeWare project. This may be useful if getting a smallish piece of Python code hooked into Java. It currently list itself as early development. This is more in the reverse direction as its goals are making Python code available in Java rather providing interaction between the two.
[p2j]
This lists itself as “A (restricted) python to java source translator”. Appears to try to convert Python code into Java. Has not been actively maintained since 2013. Like VOC this is primilarly for code translation rather that bridging.
[GraalVM]
Source: https://github.com/oracle/graal
I don't know Mahout. But think about that: At least with JPype and Py4J you will have performance impact when converting types from Java to Python and vice versa. Try to minimize calls between the languages. Maybe it's an alternative for you to code a thin wrapper in Java that condenses many Javacalls to one python2java call.
Because the performance is also a question about your usage screnario (how often you call the script and how large is the data that is moved) and because the different solutions have their own specific benefits/drawbacks, I have created an API to switch between different implementations without you having to change your python script: https://github.com/subes/invesdwin-context-python
Thus testing what works best or just being flexible about what to deploy to is really easy.
I have been working in java for a while now, and want to learn how c++ works when it comes to compilation and executed.
I was wondering if there is a way to convert compiled c++ class into .class files in java and vice versa. I am interested in a single format that can be used both by java code as well as c++ code that can be directly executed to see the results.
Almost everything can be done if you are ambitious and stubborn enough, however, the question lies usually in time and cost and features..
In its core, Java language is a subset of C++. There are some syntactic sugars added that may make you feel that there is "something more", like anonymous class implementation or hidden pointers to outer classes, but it is just a thin layer of syntax, which is irrelevant once the code gets compiled.
After compilation, C++ code is represented by machine code. Java's bytecode of course is translatable to machine code - simply by the fact that JVM executes it and that the jitter can recompile it on the fly into machine code..
So, roughly speaking, every Java code, compiled or not, is translatable to C++.
However, there are some code constructs in C++ that can be compiled into machine code, but that maybe are representable in Java's bytecode, but that cannot be represented back in the Java language. There are lots of it: from some easy to go around like passing parameters as references, to more complex ones like pointer (TheList->) arithmetics, to some really painful to translate like multiple inheritance, custom memory management (that is, overloaded operators new and delete), or some wicked types like unions.
So, clearly, C++ code is not translatable to Java. Clearly, C++ compiled code is even more not translatable, as C++ compilers often optimize the products thorougly, so that it is very hard to guess what was the 'classes' or 'functions' like..
However, if you limit the C++ language, and restrict yourself to not use any of those hard-to-translate constructs (see 'TheList'), then you can make the C++ code translatable. Again, code, but not binaries.
This is not all though. The 'translatability' is one thing, but the other is: will it run? The most distinctive runtime difference is the GarbageCollector. Let's say you actually managed to translate some Java code into C++, and you lined it up with C++ application. Your Java/C++ code executes and creates some objects. Who will clean them up? Typically, there's no GC in C++. Your Java code will therefore leak -- or you will have to provide/implement some kind of GC for the Java/C++ code.. Not pretty. Of course you can limit Java code to not create any objects, d'oh.
Do not get me wrong: even those hard-to-translate things like pointer arithmetics etc are translatable: you can generate tons of helper/wireup code that will replace them with 'proper things of the second platform'. It will, however, be ridiculously complex and slow. I don't think anyone sane will ever try.
So, the only thing that would seem to be left available is very-limited-C++-code <-> somewhat-limited-Java-code. If we cut down the question to this, then yes, that should be translatable. But..
What does it mean to translate code? How'd you do it? You have to read, process, analyze the source code, and then somehow produce the other code in the other language. Well, ok. Producing code is simple, it's just text. But, have you ever tried to analyze code? Long story short, let me just tell you that reading/parsing Java code is at least an order of magnitude easier that reading/parsing C++ code. Java was partly desined to be easily parsable by relatively simple algorithms. If you drop any attempts to optimize, writing a Java parser/compiler is relatively simple thing. On the other hand, C++ was not. Like Java, to parse C++ properly you'd have to effectively create a custom C++ compiler, but also a preprocessor. To some extent, you might also need to implement some parts of the linker. To make thins a little worse, C++ evolved from older languages and is literally packed with some once-in-your-lifetime-used features that make the syntax really difficult to accurately process (ie. have you ever used alternate token set? ..and this is only beginning:)). Do not get scared too much, though! I just want to give you a feeling what you try to touch. You probably would'nt need to write it. Such already tools exists, both for Java (really many, actually) and for C++ (few, and I bet the reasonable ones are not-fully-for-free.. or maybe you could use the GCC toolset probably..). They produce machine-processable representations of the sourcecode, and if you really want to do some translating job, I'd suggest you start there.
Of course my knowledge can be off by a few years, and maybe someone already has written some moreorless working translator - I'd love to see it!
If not, I think it is not worth it. Try embedding Java's runtime in your C++ app, or talk from Java to C++ DLLs via JNI. It is much simplier!
Compiled C++ code is loaded by the OS. That is, the C++ linker generate OS dependent executable modules. Whereas Java .class is loaded by the JVM. JVM executes Java byte code.
If you want to make a Java byte code loader/runner, you could start from JVM source code.
=> link
If you are aimed to load/execute compiled C++ code in Java envirionment, following are required.
(Assuming 32bit Windows platform)
. PE parser/loader => link
. X86 CPU instruction parser(?) => link
. X86 instruction to Java byte code translator
In short, its almost impossible.
For simple C/C++ vs. Java interoperations, JNI will do. link
I think, JNI will be useful:
"The Java Native Interface (JNI) is a programming framework that enables Java code running in a Java Virtual Machine (JVM) to call, and to be called by, native applications (programs specific to a hardware and operating system platform) and libraries written in other languages such as C, C++ and assembly."
Here are good tips how to program a simple example of using the Java Native Interfac (write a Java application that calls a C function).
Converting already compiled C++ code to JVM bytecode is not an easy task, especially if you want to be able to use compiled C++ code from multiple platforms.
It will probably be easier to make a JVM-backend for clang instead. The drawback here is that people who wants this functionality must use your compiler.
In short, if you want to write code targeting the Java virtual machine, then use a language and compiler already made to do it. Like Java...
Just curious about speed of Python and Java..
Intuitively, Python should be much slower than java, but I want to know more...Could anybody give me more? or introduce some nice post to read?
The current standard implementation of Python (CPython) is slower than Java because the standard CPython implementation doesn't have a powerful JIT compiler. Yet.
There have been several projects with the aim of producing a faster implement of Python:
Psyco
Unladen Swallow
PyPy
From what I've tried some of these projects can give very good speed ups for specific algorithms, but you still won't get it to run as fast as Java for typical application code. Most of the current effort seems now to be directed towards PyPy.
The lack of a JIT mentioned is one reason, but another reason is that Python is dynamic. Yes, that does make the language slower. You can see for yourself by using Cython.
A function written in Python can often be compiled to C with Cython. It makes it faster. But it get's really fast when you start adding type information to the variables and parameters, as both Cython and the C-compiler can start applying various simple optimizations that you can't do when the types are dynamic.
So one part of the difference is the inherent dynamicism of Python.
On the future: Python 3 has function annotations: http://www.python.org/dev/peps/pep-3107/ I expect that in a couple of years time, the JIT compilers like PyPy and UnladenSwallow will use this information, and you'll see Python being just as fast as Java, and with some careful applying of Cython, even faster. :)
I do not have data points to give, but one interesting aspect is that there are Python implementations on JVM (ditto for many other dynamic/scripting languages) -- JPython and Jython for example. This could allow some Python applications to run at speeds comparable to native Java applications, assuming implementation of Python runtime itself (on JVM) is efficient.
There are many great answers here as to why Java is faster than Python, 2 of the most common answers are that Python is dynamically typed, and that Java has several extremely powerful Just-in Time Compilers (2 Production quality, several experimental and not for general use) in it's arsenal, with only the C# Common Language Runtime capable of matching it. This is true, but even so, there is one last reason as to why Java is still faster, and oddly enough, it has to do with differences in how Java's Interpreter is designed, vs Python's Interpreter.
Right now, this is the source code of the Interpreter in production for the OpenJDK/HotSpot JVM, Java's reference implementation (There is actually another legacy Interpreter in the JVM source code which was the old one written by James Gosling himself way back when Java was first created, but that one is outdated and not compiled into the actual binary unless you compile it from source with special flags for debugging purposes. Interestingly enough, this is the interpreter responsible for earning Java the reputation of being horrendously slow back in those days):
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/interpreter/templateInterpreter.cpp
This, in contrast, is the code segment of the CPython interpreter that executes Python opcode:
https://github.com/python/cpython/blob/master/Python/ceval.c#L1847
Notice something different between the 2?
While CPython has a massive for loop with a switch case for every single opcode possible (This is true for almost every other interpreter out there actually, other than Java), there is not a single loop, if else, or switch case in Java's Interpreter. Why is this?
The answer is that Java's Interpreter is a special type called a Template Interpreter, which to date is the only one of it's kind. Unlike most designs, instead of a switch case to evaluate Java bytecode, Java's Interpreter has a big arraylist of bytecode, mapped to native machine language when the application is started. This way, the Java Interpreter doesn't need to evaluate bytecode at all, it just inserts the bytecode as an array index, loads native machine language, and directly runs it on the CPU. This means Java's "Interpreter" is actually a discount Compiler, since it runs your code directly on the hardware. CPython on the other hand, like many other interpreters today, is a run of the mill bytecode interpreter, which processes python opcode in software. This obviously makes Python slower to run than Java, even without JIT.
As to why Java has such a unique Interpreter design not used anywhere else, it's because it needs to run interpreted code directly with JIT Compiled code seamlessly, and the ingenious design of having the Interpreter contain a table with bytecode->machine language pairs rather that directly execute it in software was the best way to achieve this goal.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 5 years ago.
Improve this question
I wonder how Java is more portable than C, C++ and .NET and any other language. I have read many times about java being portable due to the interpreter and JVM, but the JVM just hides the architectural differences in the hardware, right? We'd still need different JVMs for different machine architectures. What am I missing here? So if someone writes an abstraction layer for C for the most common architectures, let's say the CVM, then any C program will run on those architectures once CVM is installed, isn't it?
What exactly is this portability? Can .NET be called portable?
Portability isn't a black and white, yes or no kind of thing. Portability is how easily one can I take a program and run it on all of the platforms one cares about.
There are a few things that affect this. One is the language itself. The Java language spec generally leaves much less up to "the implementation". For example, "i = i++" is undefined in C and C++, but has a defined meaning in Java. More practically speaking, types like "int" have a specific size in Java (eg: int is always 32-bits), while in C and C++ the size varies depending on platform and compiler. These differences alone don't prevent you from writing portable code in C and C++, but you need to be a lot more diligent.
Another is the libraries. Java has a bunch of standard libraries that C and C++ don't have. For example, threading, networking and GUI libraries. Libraries of these sorts exist for C and C++, but they aren't part of the standard and the corresponding libraries available can vary widely from platform to platform.
Finally, there's the whole question of whether you can just take an executable and drop it on the other platform and have it work there. This generally works with Java, assuming there's a JVM for the target platform. (and there are JVMs for many/most platforms people care about) This is generally not true with C and C++. You're typically going to at least need a recompile, and that's assuming you've already taken care of the previous two points.
Yes, if a "CVM " existed for multiple platforms, that would make C and C++ more portable -- sort of. You'd still need to write your C code either in a portable way (eg: assuming nothing about the size of an int other than what the standard says) or you'd write to the CVM (assuming it has made a uniform decision for all of these sorts of things across all target platforms). You'd also need to forgo the use of non-standard libraries (no networking, threading or GUI) or write to the CVM-specific libraries for those purposes. So then we're not really talking about making C and C++ more portable, but a special CVM-C/C++ that's portable.
Once again, portability isn't a black and white thing. Even with Java there can still be incompatibilities. The GUI libraries (especially AWT) were kind of notorious for having inconsistent behavior, and anything involving threads can behave differently if you get sloppy. In general, however, it's a lot easier to take a non-trivial Java program written on one platform and run it on another than it is to do the same with a program written in C or C++.
As others have already said, portability is somewhat of a fuzzy concept. From a certain perspective, C is actually more portable than Java. C makes very few assumptions about the underlying hardware. It doesn't even assume that there are 8 bits in a byte, or that negative numbers should be represented using two's complement. Theoretically, as long as you have a Von Neumann based machine and a compiler, you're good to go with C.
In fact, a "Hello world" program written in C is going to work on many more platforms than a "Hello world" program written in Java. You could probably get the same "hello world" program to work on a PDP-11 and an iPhone.
However, the reality is that most real-world programs do a lot more than output "Hello world". Java has a reputation for being more portable than C because in practice, it takes a lot more effort to port real-world C programs to different platforms than real-world Java programs.
This is because the C language is really ANSI-C, which is an extremely general-purpose, bare-bones language. It has no support for network programming, threading, or GUI development. Therefore, as soon as you write a program which includes any of those things, you have to fall back on a less-portable extension to C, like Win32 or POSIX or whatever.
But with Java, network programming, threading, and GUI tools are defined by the language and built into each VM implementation.
That said, I think a lot of programmers also underestimate the progress modern C/C++ has made in regard to portability these days. POSIX goes a long way towards providing cross-platform threading, and when it comes to C++, Boost provides networking and threading libraries which are basically just as portable as anything in Java. These libraries have some platform-specific quirks, but so does Java.
Essentially, Java relies on each platform having a VM implementation which will interpret byte code in a predictable way, and C/C++ relies on libraries which incorporate platform specific code using the preprocessor (#ifdefs). Both strategies allow for cross platform threading, networking, and GUI development. It's simply that Java has made faster progress than C/C++ when it comes to portability. The Java language spec had threading, networking and GUI development almost from day one, whereas the Boost networking library only came out around 2005, and it wasn't until 2011 with C++11 that standard portable threading was included in C++.
When you write a Java program, it runs on all platforms that have JVM written for them - Windows, Linux, MacOS, etc.
If you write a C++ program, you'll have to compile it specifically for each platform.
Now, it is said that the motto of Java "write once, run everywhere" is a myth. It's not quite true for desktop apps, which need interaction with many native resources, but each JavaEE application can be run on any platform. Currently I'm working on windows, and other colleagues are working on Linux - without any problem whatsoever.
(Another thing related to portability is JavaEE (enterprise edition). It is said that applications written with JavaEE technologies run in any JavaEE-certified application server. This, however, is not true at least until JavaEE6. (see here))
Portability is a measure for the amount of effort to make a program run on another environment than where it originated.
Now you can debate if a JVM on Linux is a different environment than on Windows (I would argue yes), but the fact remains that in many cases there is zero effort involved if you take care of avoiding a few gotchas.
The CVM you are talking about is very much what the POSIX libraries and the runtime libraries try to provide, however there are big implementation differences which make the hurdles high to cross. Certainly in the case of Microsoft and Apple these are probably intentionally so in order to keep developers from bringing out products on competing platforms.
On the .net front, if you can stick to what mono provides, an open source .Net implementation, you will enjoy roughly the same kind of portability as Java, but since mono is significantly behind the Windows versions, this is not a popular choice. I do not know how popular this is for server based development where I can imagine it is less of an issue.
Java is portable from the perspective of the developer: code written in Java can be executed in any environment without the need to recompile. C is not portable because not only is it tied to a specific OS in many cases, it is also always tied to a specific hardware architecture once it has been compiled. The same is true for C++. .Net is more portable than C/C++, as it also relies on a virtual machine and is therefore not tied to a specific hardware architecture at compile-time, but it is limited to Windows machines (officially).
You are correct, the JVM is platform-specific (it has to be!), but when you say Java is portable, you are talking about it from a developer standpoint and standard Java developers do not write the JVM, they use it :-).
Edit #Raze2Dust To address your question. Yes, you could. In fact, you could make Java platform-specific by writing a compiler that would generate machine code rather than bytecode. But as some of the other comments suggest, why would you do that? You'd have to create an interpreter that maps the compiled code to operations in the same way the JVM works. So the long and short of it is, absolutely, you definitely could, but why would you?
Java provides three distinct types of portability:
Source code portability: A given Java program should produce identical results regardless of the underlying CPU, operating system, or Java compiler.
CPU architecture portability: the current Java compilers produce object code (called byte-code) for a CPU that does not yet exist. For each real CPU on which Java programs are intended to run, a Java interpreter, or virtual machine, "executes" the J-code. This non-existent CPU allows the same object code to run on any CPU for which a Java interpreter exists.
OS/GUI portability: Java solves this problem by providing a set of library functions (contained in Java-supplied libraries such as awt, util, and lang) that talk to an imaginary OS and imaginary GUI. Just like the JVM presents a virtual CPU, the Java libraries present a virtual OS/GUI. Every Java implementation provides libraries implementing this virtual OS/GUI. Java programs that use these libraries to provide needed OS and GUI functionality port fairly easily.
See this link
You ask if one could write a "C VM". Not exactly. "Java" is a big term used by Sun to mean a lot of things, including both the programming language, and the virtual machine. "C" is just a programming language: it's up to the compiler and OS and CPU to decide what format the resulting binary should be.
C is sometimes said to be portable because it doesn't specify the runtime. The people who wrote your compiler were able to pick things that make sense for that platform. The downside is that C is low-level enough, and platforms are different enough, that it's common for C programs to work fine on one system and not at all on another.
If you combine the C language with a specific ABI, you could define a VM for it, analogous to the JVM. There are a few things like this already, for example:
The "Intel Binary Compatibility Specification" is an example of such an ABI (which almost nobody uses today)
"Microsoft Windows" could also be such an ABI (though a huge and underspecified one), for which Wine is one VM that runs programs written for it
"MS-DOS", for which dosemu is one VM
"Linux" is one of the more popular ones today, whose programs can be run by Linux itself, NetBSD, or FreeBSD
"PA-RISC", for which HP's Dynamo was a JIT-like VM
All of these C VMs are in fact a real machine -- nobody, AFAIK, has ever made a C VM that was purely virtual. This isn't surprising, since C was designed to run efficiently on hardware, so you might as well make it run normally on one system. As HP showed, you can still make a JIT to run the code more efficiently, even on the same platform.
You need the JVM for different architectures, but of course your Java programs run on that JVM. So once you have a JVM for an architecture, then your Java programs are available for that architecture.
So I can write a Java program, compile it down to Java bytecode (which is architecture-agnostic), and that means I can run it on any JVM on any architecture. The JVM abstracts away the underlying architecture and my program runs on a virtual machine.
The idea is that the Java language is portable (or more accurately, the compiled byte-code is portable). You are correct that each VM requires a specific implementation for a given hardware profile. However, once that effort has been made, all java bytecode will run on that platform. You write the java/bytecode once, and it runs on any JVM.
.NET is quite similar, but with a far lower emphasis on the principle. The CLR is analogous to the JVM and it has its own bytecode. Mono exists on *nix, but you are correct that it is not "official".
Portability or as written in Wikipedia, Software Portability is the ability to reuse the same software (code) across multiple environments (OSes). The java JVM is a JVM that can be run on any operating systems it was designed for: Windows, Linux, Mac OSes, etc.
On .NET, it is possible to port your software to different platforms. From Wikipedia:
The design of the .NET Framework
allows it to theoretically be platform
agnostic, and thus cross-platform
compatible. That is, a program written
to use the framework should run
without change on any type of system
for which the framework is
implemented.
And because Microsoft never implemented the .NET framework outside of Windows and seeing that .NET is platform agnostic, Mono has made it possible to run .NET applications and compile code to run in Linux.
For languages such as C++, Pascal, etc. you will have to go to each OS and build it on that platform in order to run it on that platform. The EXE file in Windows isn't the same as the .so in linux (the machine code) since both uses different libraries to talk to the kernel and each OS has its own kernel.
WORE - Write Once Run Everywhere
In reality, this is limited to the platforms that have a JVM, but this covers off the majority of platforms you would wish to deploy to. It is almost a half-way between an interpreted language, and a compiled language, gaining the benefits of both.