Fog Creek Software
Discussion Board

what's good about having a virtual machine?

As in the JVM and C# CLR... seriously, not intended to start a religious debate.

I am a C/C++ guy and have been wondering about this.  some things I can see:

distribution: to be platform independent, you don't have to distribute multiple binaries.  but other than that, why not just make multiple Java compilers for platform independence?

you can distribute the "libraries" with the runtime environment, no need for messing with all the DLLs...

security: you have a layer of indirection on what can be done to your PC

business standpoint: Sun wanted to make Microsoft's OS irrelevant

why does C# have the CLR if it doesn't need to be platform independent?  I think I must be missing something here.  Isn't it just causing a lot of overhead to have to compile the code before you run it?  There must be some more significant benefits than I see.

Saturday, July 26, 2003

Oh and I guess the CLR supports more than C#, but let's assume that we're talking about just one language.  Why have the runtime environment?

Saturday, July 26, 2003

what about the ability to distribute executable code in only one format?

also, .net will be running on at least 2 platforms officially, x86 and itanic.

Saturday, July 26, 2003

>"distribution: to be platform independent, you don't have to distribute multiple binaries."

That in itself is a major benefit, which would make it worthwhile to many if no other benefits existed.  It's like asking "other than losing weight and maintaining a healthy heart, why should I exercise?"

The ability to "sandbox" code to improve security is also a useful benefit of virtual machines.

T. Norman
Saturday, July 26, 2003

there are occasional posts here about making 'software engineers' more accountable for the systems they create. (usually in threads about using the term 'engineer' having certain legal meanings in some places.)

in order to do that we first need to keep a stable platform.

bridge builders don't need to worry about the laws of physics changing ...

perhaps in the future we'll see vm's that are stable and unchanging over decades allowing people to become much more familiar with their performance and bugs etc.

Saturday, July 26, 2003

There are a good number of advantages.

For example, let's assume that some memory allocation code needs to be fixed, and this fix will improve both speed and performance. And let's also assume that this fix will also improve reliability.

Ok, you simply modify the runtime engine, and ALL EXISTING CODE does not have to be re-compiled. All existing software instantly gets the advantage of this.

True, in the current windows system the API’s can also be modified, and thus all code will also gain benefits. Of course, you can from a point of view consider the windows api a runtime engine anyway. Of course, this assumes that everyone is using the api... and that is the problem!

Regardless,  placing a library between the code and the hardware isolates not only the hardware from the software, but also protects the software investment.

The stakes for a new OS are getting very high indeed. If tomorrow all the rage becomes 128 bit processors then what? I don’t think any company can afford to re-write windows, or whatever they use. The investment is too large, and thus a virtual runtime environment can protect the software.

Years ago one person in a basement could write a whole OS. Not anymore!

Thus, modifications to the hardware such as new chips, new instructions sets, and new types of memory management can be changed at a hardware level, and then the virtual machine can be modified to work with the new hardware. The result is all software on top will gain these benefits, and still function.

The other problem is that parts of the windows API go back many years, and parts of it are really ragged.

MS wants the best, and cleaning the whole mess up was also a goal here.

Of course, there is a performance hit, but it is not too bad, and due to such low cost hardware, it is the software investment that must be protected.

Also, often a runtime is far more efficient from a resources point of view (you lose some processing, but software written for a virtual engine is usually much smaller than native compiled code).

You also get a much more powerful instruction set with a pcode engine.

It is strange, but every development tool I have used has been some type of p-code environment. That speaks very highly of the concept of abstracting hardware from the software. I have always used the most productive tool I can find, and it seems that those tools are usually based on p-code.

So we have:
Pick operating system.
I used that for years, and I have had it run on every platform from Intel, to Motorola, to DEC, to Honeywell computers.

UCSD Pascal system.
Gosh, on a crappy 8bit AppleII+ I could write real nice code, and you got things like type-ahead which the apple did NOT support. This is a perfect example of how well a virtual machine can run on a crappy little appleII. The 6502 did not exactly have a great instruction set. However, the virtual p-code system on an apple had all the trimmings of a VERY good processor design. Using the UCSD Pascal system on an 8bit apple completely transformed the computing experience. All of a sudden the appleII never froze up, and I even got things like type-ahead. It was like going from an econo box to a Mercedes Benz.

VB, ms-access
Again, both of the environments are built around p-code. By VB5, a native compiler option became available, but you can still use the p-code if you want. (the native option does mean that VB speed is the same as c++).

However, ms-access does not give the native option, so it always runs in pcode. I have NEVER experienced a performance problem due to this fact.

Again, a pcode system.

So, it looks like the most bang for the software dollar comes from pcode systems. It seems every piece of software I used is based on pcode!

Having said all the above, can I assume that the CLR for windows .net is only an isolation or API layer, or does it actually have a processor/virtual machine with its own instruction set and registers etc?

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Saturday, July 26, 2003

"the native option does mean that VB speed is the same as c++" The native option means that the limited constructs of the translation from p-code to binary would be done once, however it most certainly doesn't mean that it's "the same as C++".

C++ Love-In
Saturday, July 26, 2003

From a C++/Java guy:

Platform independence:

With client-side Java, there are different issues (memory usage, UI painting) that can cause code that contains "if (isMac)" statements.

Having worked on a product that was written in C and shipped on Solaris/AIX/HP-UX/Digital Unix/Windows, there isn't a whole lot of difference between having to understand how the various virtual machines work and how the various POSIX implementations work.  Either way, if you're going to support multiple platforms, you're going to have to be able to develop and test on all those platforms.

I think one of the reasons the CLR is important is the multiple language support.  That is a very nice thing to have.  Isn't one of the reasons for ActiveX/COM in the first place to enable Visual Basic developers to have a simple interface to components?  It didn't really make things easier for C++ developers.  With a CLR, all the languages can have a common component interface.

Craig Thrall
Saturday, July 26, 2003

>"the native option does mean that VB speed is the same as c++"

>> The native option means that the limited constructs of the translation from p-code to binary would be done once, however it most certainly doesn't mean that it's "the same as C++".

I am not saying it is the same, but I am saying there is no real performance difference between the two languages when you are talking about code that loops, or something like a Sieve of Eratosthenes.

So many people think that stuff like loops etc runs faster in c++ than it does in VB. This is simply not the case.

C++ CAN BE faster due to things like threading, and different DESIGNS that you will (and can) use in code.  However, simple stuff like loops etc is not different, and the rate of instructions is not much different between the two.

Further, people seem to get all emotionally caught up in this simple issue. This loop speed thing in fact is not such a big deal (too many c++ coders are so aghast at my above statement!...and they should not be!)

Simply put, loops and a Sieve of Eratosthenes do not constitute software development. (so, I guess if one does not realize this, then the above statement does get the emotions going). C++ also has better optimizing for dead code, but it doesn't usually help. However, as mentioned, it is the designs used that yield the real difference in performance here.

We had a discussion on this some time ago. You can read the thread here (I don’t care for how the thread ended at all!).

Joel has always said that "my language is better than your language" type arguments were fun in grade school. We as developers can leave those arguments for slashdot.

However, I do think it is a good thing for everyone to know that the idea of c++ running loops faster than VB is a myth.

That thread:

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Saturday, July 26, 2003

Hm... doesn't seem like anyone mentioned anything I didn't.

distributing binaries: already mentioned, and I don't see this as a big advantage unless you're doing web apps.  If you're doing desktop applications, what's the difference?  You also impose upon the user the annoyance of acquiring the VM, if they don't have it.

stable platform: why is a VM a better abstraction for hardware than a compiler and libraries/DLLs?

fixing a memory problem: doesn't really count, that is a benefit of SHARED CODE rather than a VM.  As you mentioned, this already happens with DLLs.  As long as you keep the same interface, you can change the implementation. 
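A rough sketch of the "same interface, different implementation" point, in Python purely for illustration. The `Runtime` class and both join functions are made-up names; `Runtime` stands in for whatever is shared (a DLL or a VM), and the application code never changes when the implementation behind it does.

```python
class Runtime:
    """Hypothetical stand-in for shared code: callers only know join_words()."""

    def __init__(self, join_impl):
        self._join_impl = join_impl  # the swappable implementation

    def join_words(self, words):
        return self._join_impl(words)


def slow_join(words):
    # Original implementation: repeated string concatenation.
    result = ""
    for word in words:
        result = result + word + " "
    return result.strip()


def fast_join(words):
    # "Fixed" implementation shipped later, behind the same interface.
    return " ".join(words)


def application(runtime):
    # Application code: written once, never edited or rebuilt.
    return runtime.join_words(["no", "recompile", "needed"])


print(application(Runtime(slow_join)))
print(application(Runtime(fast_join)))
```

Both calls give the same result; only the shared piece was replaced. Nothing in the sketch depends on the shared piece being a VM, which is the point being made above.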

protecting a software investment: you can already do this, with portable programming languages.  The compatibility is just taken care of at compile time rather than at run time.  Admittedly the VM is a tighter abstraction.

I don't buy the "writing the whole OS" argument.  I bet writing a good VM is not trivial either.  It's not like some company is going to write a JVM for some platform that Sun doesn't support, just to support some application they have in Java.  That would be way too costly.  You have to do a lot of the things you would do in writing an OS anyway, namely interfacing with the hardware.  This is also an extremely hypothetical example.  No one is going to develop a hardware platform without OS support.  i.e. Intel knows that microsoft and linux will back their next gen hardware, and Apple knows that Apple will back their next gen hardware.

new chips/new instruction sets: again, why is a VM a better abstraction than a compiler/libraries.

I think the reason that intermediate code is smaller than native code is because you can pass off a lot of stuff to the VM.  Like if you just compile a hello world in C++ under windows it's probably like 30k-50k.  With C#, though I never tried it, it probably could be like 20 bytes.

The "empirical argument" that Albert mentioned (Fox Pro, VB, MSAccess) is interesting, but I would like to see a logical argument as well.  What about the p-code made those systems good.

I agree with craig, that the abstraction of the VM is not airtight.  Admittedly it is probably tighter than most platform independent libraries you will find, but you still have to test your Java program on every platform.  I think it is possible in C++ to make a tighter layer than most would think.  If you write your own, you have the control to tailor it for your own specific application.  You know how your application allocates memory, Sun or Microsoft do not.  You know how often it accesses disk, and you know what the target hardware level is (high end or low end).

Also, microsoft mentions that if you have m languages and n platforms, you only have to write m+n translators if you have an intermediate language, vs. m*n without one.  This is true, but neither system takes advantage of it.  Microsoft: m languages, 1 platform -> m translators.  Sun: 1 language, n-platforms -> n translators.  OK I guess Microsoft is using itanium, as someone mentioned.  BUT there is no reason that this has to be done at RUNTIME.  This can be done at compile time too.  Why compile the code on the fly when you can do it beforehand?  I still don't get it.  I know there are certain advantages to JIT compilation, because it can optimize stuff at runtime, but I think it is safe to say that overall it is not as fast as pre-compiled code.
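The translator arithmetic can be worked through quickly. This is just a sketch in Python; `translators_needed` is a made-up helper, and the language/platform counts are arbitrary examples, not figures from Microsoft or Sun.

```python
def translators_needed(languages, platforms):
    """Return (direct, via_intermediate) translator counts.

    direct: one translator per (language, platform) pair.
    via_intermediate: one front end per language plus one back end per platform.
    """
    direct = languages * platforms
    via_intermediate = languages + platforms
    return direct, via_intermediate


for m, n in [(1, 4), (4, 1), (4, 4), (10, 5)]:
    direct, via_il = translators_needed(m, n)
    print(f"{m} languages, {n} platforms: {direct} direct vs {via_il} via intermediate")
```

With 4 languages and 4 platforms that is 16 direct translators against 8. With a single language on 4 platforms it is 4 against 5, so the intermediate step only pays off once both m and n are bigger than 1 or so.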

No one mentioned run-time type information or garbage collection?  What advantages, if any, do these have in being placed in a VM?

So there must be a reason it is so popular now.  I would suspect the reasons are as much for business reasons as technical reasons.  That is, Sun wanted to make Microsoft's OS irrelevant, and Microsoft now wants to make everybody else's stuff irrelevant unless they write for the CLR.

Why are we missing a C#/Java type language that is compiled?  That is, something that is more toward the productivity side of the execution speed/development speed tradeoff than C++ is, but not as far as VB or python perhaps.

Saturday, July 26, 2003

In reference to p-code systems, you do save money.

So, while the CLR or the so called VM would be a huge job to re-write, you actually only re-write the interpreter for that VM.

This means ONLY the interpreter needs to be re-written, not the whole VM. If everyone wrote using the CLR, then porting to another platform, or chip design would not be that hard. (we are not at that point yet).

Writing a small interpreter for the VM is not that big of a task (much less than re-writing the compilers, re-writing the development tools etc.). Heck, you don’t even have to re-compile your code to move it to a different system with a different processor! Further, you don’t have to test existing code.

If/when MS comes out with a  new OS, then all of the existing code, compilers, and even all the development tools can simply be moved to the new system. (saves lots of $$)

So, the only part you re-code is the bottom interpreter part that executes the p-code.

Without question this is why the Pick system still exists today. It was easy to port the system to other micro processors, but the compilers/database OS stuff did not have to be re-written. Now pick runs atop windows NT, or Linux. Without a portable system, these ports of the system would not be easy at all, and pick would have died many moons ago.

Also, when I talked about changing memory management, I was speaking in reference to changing the interpreter to take advantage of a new chip or type of memory. Again, this change can be taken advantage of without the need for any high level code to be re-compiled etc. No dlls, or anything else, need to be changed.
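To give a feel for how small that bottom part can be, here is a minimal sketch of a stack-based p-code interpreter, in Python purely for illustration. The opcodes (PUSH, ADD, MUL, HALT) are invented for this example; this is not the instruction set of Pick, UCSD Pascal, the JVM or the CLR.

```python
# Toy p-code program: a list of (opcode, operand) pairs.
PROGRAM = [
    ("PUSH", 6),
    ("PUSH", 7),
    ("MUL", None),
    ("HALT", None),
]


def run(program):
    """Interpret the toy p-code on an operand stack and return the top value."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while True:
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")


print(run(PROGRAM))  # prints 42
```

Porting in this picture means re-writing `run` for the new machine; `PROGRAM`, and whatever compiler produced it, stay as they are. A real VM is of course far larger than this.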

They are cheaper, and more portable.

Albert D. Kallal
Edmonton, Alberta Canada

Albert D. Kallal
Saturday, July 26, 2003

Fact check:

Java VM: "n" languages can run on a Java VM (see for a listing; at least one of the Pragmatic Programmers uses JRuby - Ruby running in a JVM; there was a presentation at JavaOne about using Jython - Python running in a JVM - to script Ant; etc).

The difference between Sun's approach and Microsoft's approach is that Sun emphasizes platform neutrality and Microsoft emphasizes language neutrality.

Walter Rumsby
Saturday, July 26, 2003

The portability reason is largely a myth.

C is available for essentially every platform, Java is available for a select few. e.g. it was not available on Linux for a long time, and there is still no native java for FreeBSD, which is very popular amongst the larger web sites.

Also, who has had the pleasure of upgrading their apps (and 3rd party apps) from 1.2 to 1.3 to 1.4? Stable platform? I think not.

Sunday, July 27, 2003

Both platform neutrality and language neutrality are myths.

If you limit yourself to not using non-portable constructs, then C is the most portable language there is. I don't want to get into the list, but most open source projects run on any Unix flavour, including Linux, the BSDs, OS/X and Cygwin, and the platform dependence is usually isolated into one short "config.h" file which is generated automatically by tools like "autoconf".
To maintain platform neutrality in Java, you have to limit yourself to the lowest common denominator. Often, you need more (e.g., finding out how much free space is available on the file system), and then you have to use the native platform and are not better off than using nonportable C in the first place.
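The isolation idea (one small platform-specific piece, everything else portable) can be sketched using the free-space example. This is Python rather than C, and `free_bytes` is a made-up helper name; the two branches stand in for what a generated "config.h" would select between.

```python
import os
import shutil


def free_bytes(path="."):
    """Return the free space, in bytes, on the file system holding `path`.

    All platform-dependent code lives in this one function.
    """
    if hasattr(os, "statvfs"):
        # Unix-like platforms: ask the file system directly.
        stats = os.statvfs(path)
        return stats.f_bavail * stats.f_frsize
    # Other platforms (e.g. Windows): fall back to the standard library helper.
    return shutil.disk_usage(path).free


# Portable application code: no platform checks from here on.
print(f"free space here: {free_bytes('.')} bytes")
```

The rest of the program only ever calls `free_bytes`, so supporting another platform means touching one function.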

And you've probably noticed that most C++ can't be language neutral - templates and multiple inheritance are not supported by the CLR. The fact that many languages have been ported to the CLR / JVM doesn't mean it's "language neutral" in any way -- CLR Collections are not interchangeable with Lisp cons-cell based lists, Python generators, etc -- so you DON'T get to mix languages freely as advertised. If you expect the CLR to work for multiple languages "as advertised", you're left with choosing a "skin" for your C#, because your C++.NET won't be C++, your VB.NET won't be VB, your Python.NET won't be Python and your Lisp.NET won't be Lisp. You might as well write C# from the beginning.

The CLR is perhaps slightly more general than the JVM, but it isn't significantly so.

There are advantages to a virtual machine -- binary compatibility being the most important, IMHO -- but neither platform neutrality nor language neutrality are among them.

Sunday, July 27, 2003

One of the significant advantages of the Java/VM model is that there is a wealth of libraries and frameworks that "just work" on any platform that has a JVM.  Although many C/C++ apps (particularly open source applications) are buildable on many platforms (with the config.h example described above), you are held at the mercy of the libraries the application depends on.  If an application needs a library that is not available on your platform, you are out of luck.  This almost never happens with Java.

Another nice thing about Java is the selection of "standard" interfaces defined.  For example, most Java code that accesses databases uses JDBC.  That application would be able to use Oracle, Sybase, SQL Server, DB2, etc., with no changes in code.  A cross-platform C/C++ application would need separate code for each database system.  Of course, there is ODBC, but that's not going to help much on non-Microsoft systems.

Joe V
Sunday, July 27, 2003

> I think one of the reasons the CLR is important is the multiple language support. 

Topspeed (later Clarion) had this with native Intel compilers for DOS/Windows in about 1989 or 1990. Multiple language support simply requires a common output format and ability to link the various outputs - no VM required.

Windows or UNIX APIs may be crusty, but in a few years so will VMs become. Crusty is related to length of existence.

Common executable format is a strength of the VM idea - provided the VM is available on all required platforms (in other words all platforms are alike in this fundamental way).

One other point about executable formats, RISC executables can be large!

Personally I think

Java VM = idea to commoditize Windows(1)

Microsoft dot Net VM = could be idea to commoditize Intel(2)

Reason for (1) is self evident to everybody.

Reason for (2) is less obvious, but I would say it's probably in MS's strategic interests (no I'm not some anti-MS conspiracy nut).  Why? Because 2 companies make big profits from PCs (MS and Intel), and 2 companies control direction of PCs (MS and Intel).  Surely MS would prefer to control the direction of PCs themselves entirely, and also to drive down the price of PC hardware (to allow more profits in software area, aka, MS's field of play)

S. Tanna
Sunday, July 27, 2003

I think the main reason is security. The VM can ensure that a program doesn't do anything it isn't supposed to, while this is generally not possible for compiled programs.

Frederik Slijkerman
Monday, July 28, 2003

Frederik, that's the operating system's job. The VM can do that, but it's not a specific benefit of a virtual machine.

In many operating systems, you have very fine granularity with which you can dictate what a program can and cannot do.  chroot jails, capabilities (privileges in WinNT), and kits like systrace and selinux policies all work towards this goal.

Monday, July 28, 2003

So it seems to me that having the VM is not why people use Java for non-web apps...

I think it is just because there was a large set of libraries that were relatively reliable, and the language is a bit cleaner and simpler than C++.

It seems like every advantage of the VM can be gotten somewhere else, but having the VM packages it nicely in a big download with little worry.  It makes things simpler.  The alternative is to pick and choose among all these different technologies, which can be quite bewildering.  For graphics you have OpenGL or DirectX; for GUIs you can go with VB or MFC/ATL; for binary compatibility you have COM.  For other things you have like 100 different C/C++ libraries to do the same task.  It is all pretty intimidating.  With a VM, your choice is limited, which seems to be a positive thing for most people, understandably.

Monday, July 28, 2003

1) People who say that C is the ultimate in portability miss a few things.  The primary of these being that this portability requires distributing source code.  Whatever your philosophical beliefs on open source, it should be obvious that distributing source code is not a good business practice under current conditions because of A) intellectual property questions, and B) usability.

2) Garbage collection, security, etc.  Someone raised the question of why the compiler can't provide this.  The answer to that is pretty simple.  First of all, the compiler can't provide any guarantees, because the executable is out of its hands as soon as it is written to disk.  Secondly, the compiler doesn't have any runtime information.  It can't do bounds checking at compile time, for example.  Now you could say that the compiler could add code to do bounds checking at runtime, but then that code is A) modifiable as mentioned previously, and B) not upgradeable without a recompile.  Someone said "the OS should provide security."  Yes that is true to some extent.  The OS can restrict program access to certain things.  But, that can only be done at the program level.  What if a "trusted" program has a security flaw?  A VM helps to make better guarantees about what code can do.  For example, the CLR can *GUARANTEE* that no managed code can ever have a buffer overflow.  The OS can make no such guarantee.

To some extent, as people mentioned, a VM is a convenient gathering for several different technologies.  Language/platform interoperability.  Garbage collection, memory management, security.  However that's not the whole story.

Your argument was that many of these things can be provided by a compiler.  While true, what advantage does this have?  For example, it is conceivable that garbage collection and memory management could be provided by the compiler via added code.  This code would have to do everything the VM would do, so there is no benefit there.  On the downside, upgrading this code would require a new rev of the compiler, recompiling binaries, and redistribution.  With a VM, if the GC algorithm improves, users can download the new VM and get that benefit for all programs.

VMs allow you to make guarantees about what happens at runtime.  This allows the system to provide security (ie, no buffer overflows), garbage collection, and runtime type information.  This is NOT generally possible with a compiler for 2 reasons.  1) The compiler could lie.  Nobody has exclusive controls over compilers, so anyone could create a compiler that bypasses these security measures.  2) The output produced by a compiler is not secure, it can be modified afterwards.  For a VM, the checking is done at runtime by a secured system that (theoretically) hackers and the like don't have access to.
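A toy illustration of the runtime-check idea, in Python. The `ToyVM` class is invented for this sketch and has nothing to do with how the CLR or JVM are actually built; the assumption is simply that guest code can only reach memory through the VM's `load` and `store`, so every access gets checked no matter what compiler produced the guest code. (Python itself does not enforce that `_memory` stays private, so treat this as a picture of the idea, not a real sandbox.)

```python
class ToyVM:
    """Hypothetical VM that owns all memory; guest code uses load/store only."""

    def __init__(self, size):
        self._memory = [0] * size

    def _check(self, index):
        # Explicit check: Python lists would silently accept negative indexes.
        if not 0 <= index < len(self._memory):
            raise IndexError(f"guest access out of bounds: {index}")

    def store(self, index, value):
        self._check(index)
        self._memory[index] = value

    def load(self, index):
        self._check(index)
        return self._memory[index]


def buggy_guest(vm):
    # Guest code with an off-by-a-lot bug: writes 8 values into a 4-slot buffer.
    for i in range(8):
        vm.store(i, i * i)


vm = ToyVM(4)
try:
    buggy_guest(vm)
except IndexError as err:
    print("VM stopped the guest:", err)
```

The overflow is caught at the fifth write, and the check lives in the VM rather than in the guest program, which is the distinction being drawn above.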

I think the primary benefit of VMs is the above.  VMs are the only way that you can make guarantees about the runtime behavior of programs at this granularity.  The fact that it comes along with many side benefits just accelerates the acceptance of VMs.

Mike McNertney
Monday, July 28, 2003

"1) People who say that C is the ultimate in portability miss a few things.  The primary of these being that this portibility requires distributing source code. "

Um, no we mean that you can recompile the same source on different platforms and distribute a binary for each one.  Portability doesn't imply distributing the source to the end user.  As mentioned, a single binary is an advantage of a VM (but not a big one IMHO except for web apps).

This is very powerful portability.  I was just reading that someone ported the whole Python interpreter to the PS2 and GameCube in a matter of days (don't have the exact figure, but the description was fairly trivial).  Is there a JVM or CLR for PS2?  haha... If Microsoft or Sun does not decide to release a VM for your platform, and you want to move to that platform, then you're kind of screwed.  The platform independence issue becomes a little hazy.  But there are C compilers for *every* platform.

"Now you could say that the compiler could add code to do bounds checking at runtime, but then that code is A) modifiable as mentioned previously, and B) not upgradeable without a recompile."

A) I missed this, what do you mean, someone can overwrite the instructions that do bounds checking??  B) For bounds checking at least, why would you ever want to change it?  For other things, as already mentioned, you can package things in DLLs to avoid recompiling.  I don't see what the big deal is about recompiling anyway.  If you're going to upgrade your program, you're going to have to redistribute something.  If you upgrade the VM it has to be recompiled.  And the user has to download it again.  You're just trading one recompile for the other.  I must be missing something.

Microsoft could have packaged the whole .NET library as .DLLs, I presume, and it would have the same effect.  You just download new .DLLs, and all your programs will benefit from the improvements without a recompile.  So again I think this is something that is not exclusive to VMs.

"For example, the CLR can *GUARANTEE* that no managed code can ever have a buffer overflow.  The OS can make no such guarantee."

I agree that security is a benefit, as mentioned in the first post.  And I think this is definitely a valid reason for VMs.  But I would say that most programs written on VMs don't need that kind of security.  If you're writing an MP3 ID tag editor, why do you need security.

"Your argument was that many of these things can be provided by a compiler.  While true, what advantage does this have?"

Speed.  Why translate code to native instructions at runtime, when you can just store the native instructions themselves.  Honestly I am not sure how much the speed overhead is, and it varies widely I'm sure, but I think the "Java is slow" reputation comes partly from the VM.

Also, control.  I have no idea what the internals of the VM are doing.  Maybe my program doesn't need runtime bounds checking, and the overhead is excessive (which it is for some applications, which is why they don't use Java).  I can turn it off if I want to, if I have the source.  I have much more control in many other areas as well.

"For example, it is conceivable that garbage collection and memory management could be provided by the compiler via added code.  This code would have to do everything the VM would do, so there is no benefit there.  "

I think you're conflating two issues.  I agree that there is no savings really with garbage collection and bounds checking -- you have to do them either way.  But that's because they're in native code as well.  The VM _IS_ written in native code, and it includes GC and bounds checking, and runtime type checking and whatever else.  But the question is, why store the rest of the program in an intermediate byte code.

A good answer is security and binary compatibility, but I don't think that accounts for the fact that Java and C# are becoming so popular.  Most desktop apps DON'T need security and binary compatibility (or they already have it with COM and DLLs).  Does your IDE need these features?  Then why are there whole IDEs written in Java?  Seems like kind of a waste to me.  I agree it is NOT a waste to write it in a higher level language like Java, because higher level languages improve your productivity.  But why not just *compile* Java, and you have your productivity benefit AND a speed improvement.

Monday, July 28, 2003

The direct benefit of VMs goes to the user, not the program developer.  The developer gets an indirect benefit in that the user will be more pleased with the program (provided the VM doesn't create a performance problem).

After you have released the program, the user can get it to run on new operating systems or get improved performance and stability by downloading a better-tuned VM, without you having to do anything to enable that.  But for a compiled executable, they can't put it on another OS unless they can get the source from you or have you compile a new binary for them.  Neither can they get a performance improvement except by upgrading their hardware.

T. Norman
Monday, July 28, 2003

T. Norman, I have to disagree.
If portability is a benefit, it is entirely a benefit for the developer. He can sell his code for many platforms while incurring less of a porting overhead. The end user could not care less: he runs the program on a particular platform. What is more, he is incurring the costs on his platform (least common denominator GUI, slow code,...), and reaps no benefits.

The only exception to this is in the large enterprise world, where the portability benefits can be indirectly reaped by the customer. Since here the customer might want to run the program on multiple platforms (if it is a client app in a client/server setup, and as in  most large enterprises the company failed to standardize on the client platform), the possible overall reduction in development cost (if it is passed on to the customer or if it is an inhouse project) can offset the negatives of the portability coin.

Just me (Sir to you)
Tuesday, July 29, 2003

I was comparing VMs to a cross-platform compiler with libraries that enable you to use the EXACT same source code and compile it for multiple platforms.  In both cases I was also assuming that neither the VM nor the compiler and libraries are built by the application developer.  So with a true cross-platform compiler, there is no porting effort for you the developer; the porting has already been done by the compiler creators and library writers.  You just have to run the compiler against the source for each platform.

Of course, the trouble is that I don't know of any such seamless cross-platform compilers.  Software written for Linux or even Unix on one hardware platform can often be recompiled without code changes to run on the *nix for another type of hardware, but simple recompilation won't make it work on Windows or BeOS.

T. Norman
Tuesday, July 29, 2003

I said:
"You just have to run the compiler against the source for each platform."

What I meant was:
You just have to run the compiler for each platform against the source.

T. Norman
Tuesday, July 29, 2003
