Fog Creek Software
Discussion Board

New Machine anyone ?

I have a design for a new virtual machine that promises to be a lot faster than all current virtual machines.

Do you think the world is ready for a new virtual machine ?

What languages would that VM have to support initially ?

Do you have any suggestions on how to get the VM out there ?


James Ladd
Sunday, June 2, 2002

One possibility is to think "Field of Dreams" -- if you build it, they will come. I'd say write the thing, keep a web page for it somewhere, and drop a link to it in your sig. The rest should take care of itself.

Another possibility is to think Underwear Gnomes from "South Park":
1. Design killer VM
3. Profit!
All you have to do is figure out step two.

Troy King
Sunday, June 2, 2002

I'd say we need another VM about as much as we need another coffee shop.  Of course, I'm sitting here debating about VMs while the guy that put in the coffee shop next door is making a killing. 

those who know me have no need of my name
Sunday, June 2, 2002


Good responses.
I guess the "Field of Dreams" approach is the one I am taking. Otherwise I wouldnt have started it.

I do feel that doing a project for the love of it is fun, but somehow un-rewarding. My main goal is not the money, but the use of the tool by the general community. A quick search of sourceforge/freshmeat and you see a dozen projects offering the same solution. This is great for choice, but dang, I want people to use my "thing".

Im not sure that building the best VM is enough. Do you ?

James Ladd
Sunday, June 2, 2002

Of course it's not. All popular VMs (i.e. Java, CLR's, smalltak's,  TCL's, p-code in the past) are just vehicles for carrying some other virus. It's the popularity of the virus that makes the VM popular. No-one needs just a VM. And to overtake already crowded market you would have to produce something that is order of magnitude better, which is unlikely. No one will switch to gain 10% in speed (assuming your VM will be that much better).

Sunday, June 2, 2002

It should support whatever language you want to use most.  Don't port python though, because it's a measure of success if someone else ports it.  ;-)

I think the best thing is good documentation.  The interesting kind, that makes people feel like they've gained insights after reading it.  I never liked unix a great deal until I read Kernighan & Pike's ancient "The Unix Programing Environment."  Same with programming languages and SICP.

Weblogs are an interesting kind of "documentation," so you can feel free to be a bit unconventional.

To be honest, I think you'll have to find the love internally to yourself.  People won't start using it until there's a running first implementation, though it doesn't have to be great.  Before then, it'll feel to you that no one might be interested.  And truthfully, maybe no one is interested in a generic fast VM; except as a piece of research.  But the more people look at your work, the chances increase that someone sees an interesting slant... and that unpredictable slant is what carries your project to interesting fields.

Sunday, June 2, 2002

More excellent replies. Thanks !
Please keep em coming.

James Ladd
Monday, June 3, 2002

Along, the same lines, be sure to check out

'tis ivan on a visit
Monday, June 3, 2002

Is this a homework assignment?

Monday, June 3, 2002

Its not a homework assignment.

James Ladd
Monday, June 3, 2002

Can you reveal a bit how you can make a VM faster than current ones?

Sam Wong
Monday, June 3, 2002

I am curious. How do you know it's better than the CLR and Java Hotspot virtual machines? Do you know how they are implemented? Have you seen the source code?

I know something about the details of one, and I can assure you that there's some very clever stuff in the implementation.

Nona Myous
Monday, June 3, 2002

I wouldn't stress too much about marketing your idea. If you could make a Java VM (for example) that was consistently at least 2 times faster than existing implementations and was at least as robust then the world would beat a path to your door.

Whatever you do, resist the temptation to design your own programming language to go with you VM :)

Andrew Reid
Tuesday, June 4, 2002

I certainly will resist the temptation to build a new language to go with my new VM. There are too many good languages to add yet another. I would look at supporting Java and Smalltalk initially. No language wars please.

My initial tests show the VM to perform a lot faster then current VM offerings, like the SUN VM and Oberon VM. Of course this isnt always an apples vs apples comparison.

Yes I have looked at the innards of a few VM's, those that are available in source at least. The difference with them and with mine is the instruction set and how those machines are designed around that set. Its like comparing RISC and CISC. Now the real curve ball. My machine doesnt have an instruction set. I know that sounds wierd, if not impossible, but that is how it is. So what do I execute ? Well that is the crux of the design and the reason for the performance increase. I wont go into it here.

Thanks for all your comments and, if you would like to talk further, you can email me directly, or keep this thread going. Id certainly like to talk more about  the VM.


James Ladd
Wednesday, June 5, 2002


The idea of a VM without an instruction set sounds fascinating. Personally I think you should elaborate further on here, I'll certainly keep an eye on the thread!


Adrian Gilby
Thursday, June 6, 2002

> My machine doesnt have an instruction set. I know that
> sounds wierd, if not impossible, but that is how it is.

Interesting idea! Does that mean your VM is  more "compiled" than "interpreted"? And for the benchmarks, did you compare with other VMs on a specific machine running a specific OS, or your VM won on a variety OS's?

Finally, I am not sure if you are planning to sell it or not. I believe a JavaVM with 150%+ speed-up sure will sell well. But if you don't care about money too much, maybe you can include the language of pliant...

Anyway, when can we see your product? Can't wait.

Sam Wong
Thursday, June 6, 2002

Reminds me of MISC (Minimal Instruction Set Computer) machines; They were supposed to achieve great performance gains by doing away with instruction decoding, and wiring bits of the instruction directly to operational units; This requires a large instruction word (that features an extremely sparse effective instruction set), and careful planning or the wrong instruction would physically damage the processor.

They never were commercially viable - I couldn't find any documented reason as to why that was the case, but I suspect that the simpler instruction set required complex sequences and complex optimizations, which greatly reduced the advantage.

Still, I'm sure that VM technology still has a long long way to go, and am very interested in what you're doing.

If you're thinking of going commercial with this, compare it against JIT compillers, not against bytecode interpreters; Whoever uses Java and requires speed will, eventually, prefer a Sun JIT to yours if it is comparable in performance.

And - the most interesting concept of VM/JIT/<don't know how to call it> in a long time is Psyco - The Python Specializing Compiler. It's like a JIT compiler on steroids - where as a JIT compiler optimizes bytecode into machine code, and then lets the machine code execute, Psyco optimizes closures (bytecode + its data in a specific execution) into machine code - Potentially delivering orders of magnitude in performance.  As an example, if you have a routine taking either an 'int' or a 'string' argument, and different cases within it, Psyco will compile it to two different routines, each specialized to its own type. Furthermore, if you keep supplying a non-changing argument to a routine (even if it isn't a constant), Psyco will compile and optimize a version of the routine especially for that value of the argument. Is that cool, or what?

Ori Berger
Thursday, June 6, 2002

"And - the most interesting concept of VM/JIT/<don't know how to call it> in a long time is Psyco - The Python Specializing Compiler. It's like a JIT compiler on steroids - where as a JIT compiler optimizes bytecode into machine code, and then lets the machine code execute, Psyco optimizes closures (bytecode + its data in a specific execution) into machine"

Proving once again that everything old is become new, or that it takes a long time for things to get out of academia, or some such.

"By generalizing the concept of an interpreter, we arrive at a different and unique kind of language processor. A partial evaluator takes a source program together with some - but not all - of its data, performs those elements of the program's computation that can be done as a result of knowing the given subset of the data, and outputs a simpler program that expresses the remainder of the computation."

From "Partial Computation and the Construction of Language Processors", by Frank Pagan, Prentice-Hall, 1991.

Steve Wheeler
Thursday, June 6, 2002


Please post more information here. I'm very interested to read about your implementation. And about an instructionless machine. Very intriguing...

Here are some specific questions that I have:


What language have you used to implement the VM?

Have you worked alone or with a team?

You mention improved benchmarks. How improved are they (give us numbers)? How broad are your benchmarks (what types of different applications have you tested)?

Now that the VM is done (almost done?), how do you plan on marketing it?

Are you going to implement the Java and Smalltalk compilers or areyou looking for someone else to help with those aspects?


Give us as many details as you feel comfortable sharing, and if you don't feel like this is the best forum to present so much information, put up a URL here for a place we can all go to read about your project.

And, good work. I'm very impressed.

Benji Smith
Thursday, June 6, 2002

Wow, thanks for all the interest !

I did send an email to Bill Joy (SUN) and got this response:

"We are aware of an internal form developed >5 years ago
as part of the Oberon project which can be compiled
to machine code very quickly, maybe 2x the speed of bytecodes,
but decided not to support such a small ratio of change.
You may, however, get in contact with
to discuss your idea, provided its better than this work.
Frank can help.
        Bill Joy"

However, subsequent emails to frank.yellin have gone un-replied. Dang.

I would really appreciate some help with the VM. Especially from you, as your interest and obvious skills would be much appreciated and yield a great result.

I will create a project on sourceforge and let all know about it. Is sourceforge ok with everyone ?

Can all please email me directly so I can compile an adequate mailing list ?

Now for a little more detail :

1. The actual name of the VM is ARMA.
2. I am not doing ARMA to make a squillion, however if that is a side effect then I wont complain.
3. ARMA is a new concept, but from what limited info I have on MISC, that would be similar.
4. The work done so far is proof-of-concept
5. I have implemented the VM in 'C'.
6. I have implemented an ARMA assembler so I dont have to fully implement a language to test the machine.
7. I have done my own benchmarks, but feel that they are adequate to prove the concept, but not adequate to publish. This is mainly because benchmarks are rather subjective and for the most part, how do you benchmark an apple against an orange ?
8. I have worked alone, but would love to have others work on the project as well.
9. I have started a Smalltalk compiler some time ago, but put it on hold. Java would be my preference as the first language. For take-up reasons. Again, I would be very very interested in people helping, but ARMA would have to be done first.
10. Have not posted more detail on the internals of ARMA as I would like to have a web site and an initial download before such details were published. However, those that email me directly will get a response with a detailed description of ARMA internals. I would really appreciate your learned opinions.

I did put ARMA on hold while I did some work in Java and JMS/JMX/JCA, as I didnt think anyone would be interested. Hence my initial post. Im rather excited about the interest and really hope that it comes to something.

Ill post this thread to Bill Joy and see if he has anything more to add. Ill let you know.

Thanks again for your interest and I sure hope that we can all continue to talk about ARMA and make something of it.


ps- Im in Melbourne, Victoria,  Australia.

James Ladd
Thursday, June 6, 2002

Consider changing the name; ARMA is already overloaded - I knew it stands for "Auto Regressive Moving Average", but a quick google search doesn't relate to this meaning in any of the top 20 results; Instead it find many other ARMAs.

If you release it under the GPL, "gnarma" would make it share with exactly one page (at the moment) on Google.

If the project really takes off (as I hope it does, and I suppose everyone here does too), the name IS going to be an issue.

Ori Berger
Friday, June 7, 2002

Thankyou to all those who emailed me directly about being involved with ARMA. If you would like to know more, but cant be involved too much, then please email me directly as well.

At this stage I cant see me changing the name, sorry. If you join the list of interested/helpers then you will see why the name should stay.

Thanks again for your interest, I didnt think a new VM would be that popular an idea.

Keep well


James Ladd
Friday, June 7, 2002

Ok public pressure is winning. Later today I will post a description of the internals of ARMA. This should give you all an understanding of the workings of the machine and the reason for the name.

I look forward to your comments on ARMA and any suggestions you have for a new name. Benji you may have a good suggestion as your resoning for wanting a new name is well thought out (you write well).

Keep an eye on this page ! All will be revealed in a few hours.


James Ladd
Saturday, June 8, 2002


Here are some details on ARMA.
If you have comments or suggestions, please email them through.
You opinions will be greatly appreciated.

Regs, James.
ARMA - A Description


ARMA is a new virtual machine design that uses a new appraoch to instruction formatting and execution that is radically different to current virtual machine offerings.
The new instruction format promises to yield significant processing power, whilst also enabling a 'no fixed instruction set' machine.

No discussion is made on the Traditional Approach to Virtual Machines and their instructions as it will be assumed that the reader is familiar with typical approaches.


ARMA is virtual machine derived from the Smalltalk language and it's approach to describing executional logic. The basis for the Smalltalk language is super simple, yet incredibly powerfull. In Smalltalk, everything is described the following way: 

answer = receiver message [arguments].


answer    - the result of the expression.
receiver  - the object to which the message is sent.
message  - the required operation.
arguments - optional arguments to the message.

For more information on Smalltalk, consult the book:

Smalltalk-80 The Language and its Implementation.
Adele Goldberg and David Robson.
ISBN: 0-201-11371-6

ARMA is _not_ a Smalltalk implementation, nor is it using any known Smalltalk Virtual Machine design. However, ARMA does use the simple Smalltalk instruction/expression syntax as the format of an instruction in it's implementation and the Smalltalk notion that everything is an object.

Instruction Format

ARMA considers itself to be a machine executing requests between objects. This is different to traditional approaches that execute atomic operations on known hardware (stacks/registers/accumulator/etc).

ARMA uses the following instruction format, where each item is a 32 bit value.



answer        - is a pointer to the object that is assigned the answer.
receiver      - is a pointer to the object that receives the request.
message      - is a pointer to the object that contains the request.
argumentcount - is a count of the arguments expected by the message.
argument(s)  - is one or more arguments to the message.

Using the above instruction format, there is actually no fixed instruction set. The operations that can be performed by the Virtual Machine is dictated by the available object set. In turn the available objects are implemented in a platform specific manner, whilst the source executable remains platform independant.

ARMA provides fast execution because it does not do any instruction decoding or execution. ARMA simply dispatches requests to the object that receives the request.

James Ladd
Saturday, June 8, 2002

This doesn't sound very fast at all.  Eventually *something* has to do the work, but you have the overhead of a function call every time you want to, say, add two 32-bit integers.  Moreover it seems that every object is in memory, not in registers.

Sunday, June 9, 2002

Well, smalltalk is not known for its speed. An object-oriented VM is interesting, but since most CPU ISAs ain't OO (not that I know of), sooner or later you'll have to implement the instructions in a non-OO way.

Another thing: if there is no unique set of "bytecode", do we need to compile the source for each type of machine to make full use of the "different object sets"?

Correct me if I'm wrong. I thought the main point of building a VM is to trade speed for compatibility.

Maybe you can provide some concrete examples.

Sam Wong
Sunday, June 9, 2002


"smalltalk is not known for its speed", but that is traditional Smalltalk on a traditional machine.

Yes, something at some point would have to execute the message. The code that does that is abstracted from the call. Therefore the source code is portable and the implementation optimized for a particular platform.

There is definately a swings and round-abouts tradoff that has to happen. Adding two numbers would _possibly_ be a function call. However, a call to make a GUI window pop-up on screen would be a single call. So some things would be slowers than conventional methods, and some would be faster.

There is nothing to say that the VM would not know about and act on some messages to receivers directly. So what I have explained is more an instruction format rather than an implementation.

Again, interesting responses, please keep them coming.

James Ladd
Sunday, June 9, 2002

Well it's back to the drawing board for me !

Thankyou all for your responses, suggestions and help.

Next time I post about a new VM ill be sure and have a lot more done and a lot more information on the instruction set readily available. Ill even have an executable and source for people to view/use.

I think a more traditional approach will be taken in the future. If you have a suggestion on the instruction set, or which chip it should be based on, then I would love to hear about it.

Keep well.

James Ladd
Monday, June 10, 2002


Looking at your VM definition, I was struck by it's resemblance to the execution model of Forth. Do you have any familiarity with that language?

The basic idea is that you have a "threaded interpreter", where the interpreter grabs an address for a "word" (a Forth subroutine) and then the word consumes it's parameters in the instruction stream. The main interpreter loop is just "Get next address. Update instruction stream. Jump. Repeat." Seems fairly similar to what you're proposing.

And Forth systems are typically very small and efficient - the language was originally (and is still) used in embedded systems.

Chris Tavares
Tuesday, June 11, 2002


I have no experience with Forth. I certainly have not looked at any code for Forth implementations or the Forth language iteself.

What you are stating is very very similar in concept.

I am doing some work now to see how to remove any instruction decoding and make sure there are no function calls in my implementation. So far so good. However, I do have a few other things to finish so it will be a while before I finish anything to show people like yourself.

Thanks heaps for the reply.


James Ladd
Tuesday, June 11, 2002


I have re-thought some of the approach I have been taking.

So far I have reduced the dispatch loop for ARMA to 3 (three) machine code (8086) instructions. That’s the loop that gets an instruction, decodes it and invokes execution.

If anyone knows how to reduce this loop/instruction count further, then I would love to know.

Whilst using 8086 assembler is not portable, I don’t see this as an issue as I’m not using instructions that cant be found on other machines (as far as I know).

All the parts of an ARMA instruction are loaded into registers. So the answer to any instruction will be found in a register and not on the stack. I am looking at a way of
avoiding function call all together. Well in so far as they appear in a dump of ‘C’ code with all that prolog/epilog code.

All comments welcome.

Keep well.


James Ladd
Wednesday, June 12, 2002

I was also struck with the similarity to Fourth.

The key difference is that Fourth tended to be Stack based.

Ged Byrne
Wednesday, June 12, 2002

Forth is stack based... no tendency about it.

Joe AA.
Wednesday, June 12, 2002

I thought as much, though memories a little vague.  Last machine I used forth on had a rubber keyboard and tape player.

Ged Byrne
Thursday, June 13, 2002

So what do you think are the pro's and con's of using a stack based machine ?


James Ladd
Thursday, June 13, 2002

Stack based is the 'lazy' answer; It's generally easier to write code generators for stack machines; It is much harder to optimize that code, translate it into native register machine code and generally reason about it in useful ways.

Register machines lend themselves to analysis and optimization, especially with  representations such as Static Single Assignment (SSA) and it's variations - the basic idea is that every register in the output code is assigned by exactly one statement; If it needs to be done in two places, the register is renamed so that each location has a distinct register. Might sounds dumb, but it exposes instruction dependencies, dead code, code motion opportunities, common subexpressions and other various pieces of interesting information.

Stack machines, on the other hand, make dependencies implicit, thus generally much harder to analyze.

The complexity of implementing a virtual machine is about the same; Attainable performance of register machines is generally higher, because constraints such as limited register file can be implicitly imposed (e.g., by assigning 6 bits to register number, there will be exactly 64 registers without possibility of overflow), whereas in stack machines, there are many boundary conditions to be checked in a robust implementation (e.g., stack overflow, stack underflow, stack problem in primitive application).

If you want to optimize performance, register machines are superior, but it comes at the price of an implementation more complex.

Or at least, that's how my experience tells me it is.

Ori Berger
Thursday, June 13, 2002

All depends on what para-nickle you want to support.

I have no problem with a stack-based implementation. 

Sure, maybe basic stack maintenance has some overhead, but I don't see how using a bunch of individual registers (and how many would there need to be, he asks) over a stack implementation would offer "lots of more speed".

The assignment and function (object method, whatever) followed by parameter implementation of this suggestion seems awful simplistic to me... especially the example given of the presentation of a GUI window being one statement. 

Some of us realize there has to be something doing the work behind the scene, no matter how "clean" (slash consistent) the high level appearance may look. 

High level "glue" code is usually not where performance bottlenecks lie.

Joe AA.
Thursday, June 13, 2002


I knew I was attempting this for a reason and talking with you all about it seems like reward enough.

ARMA will provide a series of instructions for working with objects and on objects as apposed to just memory/registers. For example, you will find some instructions like the following (of course lower level primitives for math/etc will also be provided):

opcode operand(s):

perform message
performWith message argument
performWithAnd message argument argument
performUsing message array

This is _very_ Smalltalk, however, since the majority of languages today are Object based, it makes sense (to me at least) to have a VM that supports Objects directly.

Again, there will be instructions that support low level memory /register /stack manipulation etc.

I am formulating a document on the instruction set and other VM details at present. Hopefully it wont be too long before I can send it to you.

Please keep up the comments.


James Ladd
Thursday, June 13, 2002

BTW - Whats a "para-nickle" ??

James Ladd
Thursday, June 13, 2002

A cheap paradigm.

Thursday, June 13, 2002

Doh !

James Ladd
Friday, June 14, 2002

Ori, I would suggest that the reason" that it's harder to optimize code for stack machines is that very few people have investigated the subject. The dominant paradigm of computing has for a long time been the register architecture. IIRC, Burroughs had a stack-based mainframe in the 60's, and there have been a few micros such as the Novix 4016 (later the Harris RTX2000), the SH-BOOM, and so on.

Phil Koopman, now at CMU, has written the definitive book on stack computers (available for download from his website), and he and a few others have also done research into optimizing code for stack machines.

One of the main performance hits Forth and similar languages take on today's micros when implemented as threaded interpreters is due to the granularity that is usual in such languages ... since your routines tend to be very small, you have a *lot* of them, which implies a *lot* of branches and subroutine calls. Unless your application is small enough to fit entirely in the cache (which can happen), you can get an enormous number of page faults, with the concomitant slowdowns.

There are benefits to stack computers and stack languages such as Forth, though. In his PhD thesis ("Practical and Theoretical Aspects of Forth Software Development", University of Teesside, 1993), Peter Knaggs notes that due to the possibility of executing multiple actions in a single instruction, the Novix processor provided up to 50 MIPS at a 10MHz clock rate. He also compared a NETBIOS interface written in C with the same one written in Forth. The C version was a linkable library 115k in size, the compiled Forth was 1.2k.

Steve Wheeler
Friday, June 21, 2002

This is difficult to follow. Was browsing through Java webpages and found this forum. Interesting plan. It does sound very similar to Forth(at least from what I remember of Forth) except for the stack/register part. I'm in the "same line of work" I want to build a VM, but not for the market. It's to show a proof of concept, of sorts, for a hardware design I'm planning(a hybrid Intel/Java system allowing native execution of both Java and Intel "mechine" code. Anyway, if anybody has any comments, I'd be happy to here them, however I will probably lose the address to this forum, so if you could, email me directly. Also, how for is the VM at this point? Is the a binary distribution out yet? Well, thanx everybody.

Daniel Goldman
Thursday, August 15, 2002

*  Recent Topics

*  Fog Creek Home