Fog Creek Software
Discussion Board




After $273 million, NASA scraps computer project

http://www.orlandosentinel.com/news/custom/space/orl-asecnasa17091702sep17.story?coll=orl%2Dhome%2Dheadlines

"The CLCS program originally was scheduled to cost $206 million and be completed in time to support its first shuttle mission in December 2000. The new assessment put the price of finishing the system at up to $533 million and the completion date around 2005."

Tough project.

tk
Tuesday, September 17, 2002


Maybe they should try open source.

Zwarm Monkey
Tuesday, September 17, 2002

Zwarm monkey, I thought you were trolling until I realized there are a lot of people who would be interested in NASA if it weren't so closed and insular.  (That's the impression I get; I'm not interested in space travel myself.)

I think NASA could get a lot of good PR from having some openness.  They could publish hardware specs and people could see how those cool software teams work.  They'd have to know going into this that it probably wouldn't make them any more productive, but it would spark renewed interest... and FUNDING.

As well as giving the next generation of spaceballs something to focus on.

Sammy
Tuesday, September 17, 2002

Thanks for the link, Terry. I'm trying to figure out if this is the same project that Fast Company was so excited about a few years back, because it was SEI Level 5 "ooh, aah" and had "no bugs. None." Seems like the same thing -- shuttle launch software, Lockheed Martin -- but maybe someone with a more intimate knowledge of NASA can fill me in.

It would be very validating for my whole way of life if it turns out that the poster child for SEI-5 was a more expensive failure than boo.com.

Joel Spolsky
Tuesday, September 17, 2002

I tried to find the Greenspun quote.  The essence was that we should love 30 year systems because we couldn't rewrite them today.  Or in Joel's terms, "Face it, launch control software takes 25 years."

tk
Tuesday, September 17, 2002

Joel wrote:  "It would be very validating for my whole way of life if it turns out that the poster child for SEI-5 was a more expensive failure than boo.com."

What exactly does this mean -- that the government is incompetent, or SEI is stupid, or...?

Anonymous coward
Tuesday, September 17, 2002

Joel, I appreciate that thinking SEI-5 is great because NASA projects use it is silly. But isn't it equally invalid to dismiss SEI because one project at NASA got canned? I hate the "annoying lectures" on SEI as much as anyone, but I can see that some projects would need that amount of process control.

I don't think anyone here really has much of an idea why the CLCS was cancelled. One article said it was because the existing launch control systems would outlast the shuttle program.

What struck me as unfair was your idea that it would be easy to create some emulators running on modern PCs that would emulate the old hardware/software. These systems _do_ need to be completely, completely crashproof. What if the emulator crashes? Comparing launch control software to mobile 'phone software smacks of wanting to get one side of an argument across. Can I point out that my Nokia has crashed on me a few times?

On the other hand, yes, perhaps they should have realised it wasn't going to work out before pumping in a quarter of a billion dollars.

Adrian Gilby
Tuesday, September 17, 2002

"I'm trying to figure out if this is the same project that Fast Company was so excited about a few years back ..."

The Fast Company article was written in Dec. 1996.  The CLCS project began in 1997. The SEI-5 software is probably the LPS system that they've decided to stick with.

Nick Hebb
Tuesday, September 17, 2002

I heard something a while back about this old system that they now have to use. Is it true that no one today knows how it works?

Did they not re-hire some older retired astronauts to fix this problem? I saw something about some unknown astronauts going back into space? I don’t remember astronauts such as James Garner, Tommy Lee Jones, Donald Sutherland and Clint Eastwood, but apparently they went to space to fix this system.

Check out:

http://us.imdb.com/Title?0186566

Albert D. Kallal
Edmonton, Alberta Canada
Kallal@msn.com

Albert D. Kallal
Tuesday, September 17, 2002

The new system seems to have failed due to insane feature bloat:
http://www.windriver.com/windword/html/shuttle.html
http://spacelink.nasa.gov/NASA.News/NASA.News.Releases/Previous.News.Releases/97.News.Releases/97-03.News.Releases/97-03-28.New.CLCS.Under.Development

"As superb as [the old system] is, it has its drawbacks today. Its age is causing failures. Approximately 25 a day, in a six-day work week."
http://www.nasawatch.com/ksc/09.04.02.clcs.html

95% chance that the projections of the coders' productivity were bullshit (look at the graph):
http://www.nap.edu/html/upgrading/ch4.htm#II

Fun discussion on this article, including flamewar on emulating the old system:
http://slashdot.org/article.pl?sid=02/09/08/2134253&mode=nested&tid=160

Sammy (can't sleep)
Wednesday, September 18, 2002

"monumental task of developing more than 3 million lines of computer software "

3 million lines of code is monumental? Have they seen the latest operating systems, games or ERP packages? I was a project manager who produced a custom enterprise system with ~500,000 lines of code in less than 2 years, with a team that varied in size between 7 and 16 people.

How did we succeed? Release early, keep the design simple, test, refactor, use past experience, etc (XP)

1. How the hell did they spend so much money (hundreds of millions of dollars)? Were they buying Crays for workstations? Using Andersen Consulting partners as programmers? WTF?!

2. I just have to wonder what was the culprit...Over-design? Feature bloat? Too many cooks? etc.  Did not a single one of them go into a bookstore and pick up any modern IT project mgmt books?

AEB
Wednesday, September 18, 2002

Gee, after thinking about this, and reading Joel’s new post (he just posted on his main page), I have to agree with Joel:

<quote>
Recreating those old systems with modern tools would probably be a weekend Visual Basic/Access project. OK, I'm exaggerating a little.
</quote>

Actually, Joel is not far off. We are talking about a GROUND SYSTEM here. If what I have been reading is correct, we are talking about 64K of RAM.

Re-writing this stuff in C with a few developers and good design skills would be a piece of cake. Just how much complexity can you fit in 64K of RAM? What a joke, and an embarrassment.

There are *tons* of emulators on the net today for old computer systems. However, why even bother with emulation...just re-write the stuff in C. You start with a small part of the code and start re-writing and re-designing it in C. Even more amazing is that you have an existing design to work with!! How nice!! You run each new piece side by side with the old during each launch. Work it, test it, perfect it. Piece by piece we go.
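The "run each new piece side by side with the old" idea is often called a shadow or parallel run. Here is a minimal sketch of it in Python, assuming each piece can be treated as a function from an input to a result. The valve-check functions and their 3000 psi threshold are purely hypothetical stand-ins and have nothing to do with the real launch software.

```python
from typing import Any, Callable, Iterable, List, Tuple


def shadow_run(
    legacy: Callable[[Any], Any],
    candidate: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> Tuple[List[Any], List[Tuple[Any, Any, Any]]]:
    """Feed every input to both implementations.

    The legacy result is always the one returned to the caller, so the
    rewrite cannot affect operations while it is still being proven.
    Disagreements (or crashes in the rewrite) are only recorded.
    """
    results = []
    mismatches = []
    for item in inputs:
        expected = legacy(item)
        try:
            actual = candidate(item)
        except Exception as exc:  # a crash in the rewrite is just another mismatch
            mismatches.append((item, expected, repr(exc)))
        else:
            if actual != expected:
                mismatches.append((item, expected, actual))
        results.append(expected)
    return results, mismatches


# Hypothetical example: the old check and a rewrite with an off-by-one bug.
def legacy_valve_check(pressure_psi: int) -> bool:
    return pressure_psi > 3000


def rewritten_valve_check(pressure_psi: int) -> bool:
    return pressure_psi >= 3000  # deliberately wrong at the boundary


results, mismatches = shadow_run(
    legacy_valve_check, rewritten_valve_check, [2999, 3000, 3001]
)
for item, expected, actual in mismatches:
    print(f"input={item}: legacy={expected}, rewrite={actual}")
```

The point of the design is that the old piece stays authoritative until the mismatch log has been empty for long enough to trust the new one.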

This is an insult.  With modern systems and software design, I figure it would be a walk in the park to lower the cost and increase reliability by at least a factor of 10 without breaking a sweat.

Perhaps 50 times more reliability, with systems today costing so little, and if you give me a NASA budget...heck, 50 times...no problem...Redundancy today is SO CHEAP. Even right now, they say the old system is constantly giving them problems.

No doubt the failure of this project was largely due to it being very large, with so many new things being implemented *all* at once.

What to do?

I would approach this project in two steps. Step one would be moving the code/systems to modern PC-type platforms, but ONLY the existing functionality, *with* improved software designs. Not only would this get us off the old system, but you would now have a hardened developer team that understands the current system! The developer teams would now understand all of the intricacies of the current problem, and the difficulty of writing the kind of process control software that needs to be implemented. You must learn to walk before you fly. After all, I can’t place an ad in the paper for a space shuttle software writer...can I!!! Once the teams get a handle on the quality and the demanding rigors of this control software, the next step would be to further improve and expand the system. This process would further result in a battle-hardened team ready for the next phase of the software. I certainly would not re-do all the software in a big bang that will blow up in your face! This kind of stuff must be done in an incremental fashion; there is no other way. Just the testing involved here can kill this type of project real easy. Also, the KISS principle has to be used here.

Darn, why can’t they let me manage this project!

More incredible is this is the 2nd attempt to re-write this software, and the previous time they also flushed 100 mil down the tubes..lots of big bucks.

Man..what a joke.

Albert D. Kallal
Edmonton, Alberta Canada
Kallal@msn.com

Albert D. Kallal
Wednesday, September 18, 2002

More thoughts...

"As of September 1998, approximately $60 million had been spent and about 50 percent of the system software and 10 percent of the applications software had been developed."

I wonder what they mean by "System Software" - did they try to create their own OS, thinking Windows/Unix is not good enough for them?


"could also facilitate future computer-intensive shuttle upgrades, such as an integrated vehicle health management system. "

feature creep.


"CLCS is a large, distributed, heterogeneous computer project involving the development of more than 3 million lines of new software, much of it automatically generated."

Ah ha! Automatically generated code. What a big surprise.


"management believes the predicted level of software productivity can be achieved with the aid of software generation tools. "

Nuff said.

AEB
Wednesday, September 18, 2002

In his article, Joel has overlooked the engineering side of it. I don't think the $273M was spent just on rewriting code. There is a variety of hardware associated with it. Since the system is of 1970s vintage, it used vacuum tubes and the like; now they have to re-engineer it to use silicon-based parts, and with that come new challenges. In short, a great many such things have to be reorganised or re-engineered to fit in with that vintage system.
Plus there is the safety aspect: you have to test all the systems for their unknown problems. Plus this system is going to space, so how do these components react to space? All this increases the complexity. The worst part is that the NASA guys must have seen it coming from day one.
I think their decision must have been based on these reasons. And Joel's talk about emulators, VB and so on does not make any sense. His argument about why they are using this stuff is that it WORKS and gives results. But unfortunately it is not extensible.

Cooler
Wednesday, September 18, 2002

OOPS sorry about the space thing. I must have interpreted the article wrongly.

Cooler
Wednesday, September 18, 2002

Philip Greenspun seems to sum this up best:

<QUOTE>
After three decades of shelling out for magic programming bullets that failed, you'd think that corporate managers would give up. Yet these products proliferate. Hope seems to spring eternal in the breasts of MBAs.

My personal theory requires a little bit of history. Grizzled old hackers tell of going into insurance companies in the 1960s. The typical computer cost at least $500,000 and held data of great value. When Cromwell & Jeeves Insurance needed custom software, they didn't say "maybe we can save a few centimes by hiring a team of guys in India". They hired the best programmers they could find from MIT and didn't balk at paying $10,000 for a week of hard work. Back in those days, $10,000 was enough to hire a manager for a whole year, a fact not lost on managers who found it increasingly irksome.

Managers control companies and hence policies that irk managers tend to be curtailed. Nowadays, companies have large programming staffs earning, in real dollars, one third of what good programmers earned in the 1960s. When even that seems excessive, work is contracted out to code factories in India. Balance has been restored. Managers are once again earning 3-10 times what their technical staff earn. The only problem with this arrangement is that most working programmers today don't know how to program.

Companies turn over projects to their horde of cubicle-dwelling C-programming drones and then are surprised when, two years later, they find only a tangled useless mess of bugs and a bill for $3 million. This does not lead companies to reflect on the fact that all the smart people in their college class went to medical, law, or business school. Instead, they embark on a quest for tools that will make programming simpler. A manager will book an airplane ticket using a reservation system programmed by highly-paid wizards in the 1960s, never thinking that it might fail. Then the flight will be delayed. The new Denver airport isn't open. The horde of C programmers is a couple of years late with the computerized baggage handling system (it was eventually scrapped). The air traffic controllers are still using the old software because the FAA's horde of $50,000 per year programmers has spent 15 years squelching each other's memory allocation bugs. When our manager gets off the delayed flight, she'll happily trundle to the Junkware Systems demo room where their $100,000 per year marketing staff will explain why the Junkware 2000 system, programmed by Junkware's cubicle drones, will enable her cubicle drones to write software 10 times faster and more reliably.
</QUOTE>
http://philip.greenspun.com/wtr/dead-trees/53010.htm

Matthew Lock
Wednesday, September 18, 2002

I'll try to clear this up a bit.  The group in the FastCompany article is responsible for the Shuttle Flight Software.  The group in the most current article was working on the ground software at the Cape.

I work in the Shuttle Flight Software group and I will stand behind our work and our SEI Level 5 rating (big surprise right?).  But, what I will also agree on is that most projects can't afford to pay for our process, and most probably don't need it.

The ground software is in a gray area.  Yes, they use it to make mission-critical decisions, but the only software that has the ability to directly affect the Shuttle is ours.  However, what happens when the ground uplinks a corrupt state that claims the orbiter is somewhere it isn't, and it starts spinning and thrusting all over?

That said, there is a lot of fat that could be trimmed from the Space Program.  For instance, they wouldn't let me bring my own keyboard in to work because things that are not government property couldn't be plugged into things that were.  Instead they went out and bought a Microsoft Natural Elite for $75 (at the time).

cheeto
Wednesday, September 18, 2002

In response to Albert's comment about the movie Space Cowboys:
"Did they not re-hire some older retired astronauts to fix this problem? I saw something about some unknown astronauts going back into space? I don’t remember astronauts such as James Garner, Tommy Lee Jones, Donald Sutherland and Clint Eastwood, but apparently they went to space to fix this system."

Just as a point of clarification, the plot of that movie was fictional, and the system in the movie was the navigational system for a satellite, not a ground system as was discussed in this thread.

The flick was pretty funny in parts, BTW. The nav system was in an orbiting Soviet nuclear missile satellite (can you say treaty violation?). Turns out it was the exact same nav system we had developed for the US Skylab -- the Soviets had stolen the plans from the safe of a NASA project director. Clint Eastwood's character had designed the nav system and he and the NASA Project Director were old rivals from the Bell X-1 days. Lots of sub-plots - old vs. young; digital vs. analog; computer vs. skilled pilot. Pretty entertaining. But fictional, and wrong system type for this thread.

Cheers,

anonQAguy
Wednesday, September 18, 2002

tk,

Thanks for bringing this up. I find this mesmerizing.

zwarm monkey,

open-sourcing such a 'sexy' project is a brilliant idea since there are hordes of bright individuals who would work on it in their spare time for free. I assume the problem is that state secrets are involved that would be of use to anyone looking to build an ICBM, such as to deliver any nukes they might have recently bought from the Ukraine or what-have-you.

Sammy,

*You are the man!* Wow, those links you dragged out explain the mystery, don't they.

The old system gathers single-point measurements from 50,000-60,000 sensors throughout the shuttle and on the launch tower every millisecond. All data must be analyzed in real time so that a launch can be aborted or round-trip action taken in no more than 20 ms, else the shuttle may explode. This is a pretty serious performance requirement. Currently, simple sensors are used and transmit measurements to some really old custom computers in the command center. But it works -- sort of.
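To get a feel for the figures quoted above, here is a back-of-the-envelope calculation in Python. It takes the sensor counts, the 1 ms sampling interval and the 20 ms deadline at face value from this post. The 2 bytes per sample is an assumption made purely for illustration and does not come from any of the linked articles.

```python
SENSORS_LOW = 50_000
SENSORS_HIGH = 60_000
SAMPLES_PER_SENSOR_PER_SECOND = 1_000  # one reading per millisecond
DEADLINE_MS = 20                       # abort / round-trip budget from the post
ASSUMED_BYTES_PER_SAMPLE = 2           # hypothetical, for illustration only


def samples_per_second(sensors: int) -> int:
    """Total readings arriving each second across all sensors."""
    return sensors * SAMPLES_PER_SENSOR_PER_SECOND


def samples_in_deadline(sensors: int, deadline_ms: int = DEADLINE_MS) -> int:
    """Readings that arrive during one deadline window."""
    return sensors * deadline_ms


def megabytes_per_second(
    sensors: int, bytes_per_sample: int = ASSUMED_BYTES_PER_SAMPLE
) -> float:
    """Raw data rate under the assumed sample size."""
    return samples_per_second(sensors) * bytes_per_sample / 1_000_000


for sensors in (SENSORS_LOW, SENSORS_HIGH):
    print(
        f"{sensors} sensors: {samples_per_second(sensors):,} samples/s, "
        f"{samples_in_deadline(sensors):,} samples per {DEADLINE_MS} ms window, "
        f"{megabytes_per_second(sensors):.0f} MB/s at "
        f"{ASSUMED_BYTES_PER_SAMPLE} bytes/sample"
    )
```

If the quoted figures are right, that is tens of millions of readings per second, with about a million of them arriving inside each 20 ms window.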

SITUATION

1. The old system fails 25 times a day due to decrepit hardware that cannot be replaced.

2. Critical safety upgrades that are known to be needed cannot be made because the old system long ago ran out of memory. They have been pulling out capabilities just to put in new stuff that has to be there, but they have long since gone past the point where that is possible anymore. Adding memory is technically not feasible due to the custom computers used and their age.

3. Astronauts *will* die if this system continues to be used. And then the shuttle program will shut down, perhaps permanently.

ANALYSIS

Conclusion: replacing it with a new system is absolutely critical and needs to be done yesterday.

HISTORY

1. First attempt at new replacement system had plug pulled at $100 million.

2. New system still needed and is ordered in 1997, to be delivered in 2000. Project scope: replace old system.

3. Politics kicks in. Sun must have their workstations used and it's critical that the system be written in Java because they plan a big ad campaign based on the fact that the shuttle runs on their hardware and software.

4. MS lobbies -- Windows NT will be used right alongside the Sun boxes.

5. Motorola and IBM check in -- PowerPC singleboard microcomputers must be used to replace all sensors.

6. The solution given the requirements of 3-5: simple sensors will be replaced with scores of PowerPC single-board computers, each acting as a combination DSP data processor/web server that will take the measurements and format them as graphs, GIFs and Java applets sent from the sensor itself to HTML browsers at mission control. The scores of controller/servers will transmit the data through a high-speed network. The data will be viewed and analyzed on Solaris and NT systems, one of each on dual flatscreen monitors at each command control station, by scraping the data from the HTML and analyzing it. If trouble is found, web packets are sent back along the network to PowerPC controllers that then shut down the engines or do whatever is needed.

7. 500+ people working on the project take more than the 3 years allocated, need 2 more years.

8. At end of 5 years, workers are working outlandish overtime; families are falling apart, people are thinking of killing themselves. The project is farther away from being complete than it was the day it started.

9. The project is cancelled. Traumatic for the people involved, but at least they can put back together what is left of their lives.

So there it is.
The only questions left:

1. Do we shut down the shuttle program for safety's sake while building the system again? Or keep running it until someone gets killed because of the old system?

2. Should we bother blaming Sun, MS, IBM, Motorola for pushing their own interests at the cost of millions and risk of lives? Or blame the project managers for incompetence? If the programmers working on the project were competent, which I'm sure they were, I assume they have been screaming bloody feature creep and impossible absurd requirements from day one.

3. What is it about Big Process that didn't prevent this from happening? I thought the whole Miracle of Big Process as opposed to the Wild West was that it gave you accurate, predictable dates and repeatable results. (Actually, these were repeatable results -- the project kept failing, year after year, still as impossible to fulfill as the day it was born.)

4. Is there any possible way that a government with deep pockets can get the contractors to just replace the system with a working modern one (something not hard to do, obviously -- they can even use the old sensors, no need to replace them) without the politics and greed of clearly anti-American astronaut-hating corporations like IBM, Sun and Motorola getting in the way?

I don't have any answers here!

Though I will say that the question of how it cost $273 million is easy -- that's how much a project this big costs.

X. J. Scott
Wednesday, September 18, 2002

"Just how much complexity can you fit in 64K of RAM?"

(64*1024*8)! I believe - it actually gives you a 'this calculation will take a very long time' message in Windows calculator, years...?

Maybe that is what took so long, everyone standing around waiting for windows calculator to tell them how complex it was : ).
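A rough way to put a number on the quip above: 64K of RAM holds 64*1024*8 = 524,288 bits. Strictly speaking, the count of distinct memory states is 2 raised to that number of bits rather than its factorial. The Python sketch below estimates how many decimal digits each figure has, using logarithms so neither number is ever computed in full. The function names are illustrative.

```python
import math

RAM_BITS = 64 * 1024 * 8  # bits in 64K of RAM


def digits_of_power_of_two(exponent: int) -> int:
    """Number of decimal digits in 2**exponent, via log10."""
    return math.floor(exponent * math.log10(2)) + 1


def digits_of_factorial(n: int) -> int:
    """Number of decimal digits in n!, via the log-gamma function."""
    return math.floor(math.lgamma(n + 1) / math.log(10)) + 1


print("bits of RAM:", RAM_BITS)
print("digits in 2**bits (distinct memory states):", digits_of_power_of_two(RAM_BITS))
print("digits in bits! (the factorial in the post):", digits_of_factorial(RAM_BITS))
```

Either way the number is far too large to enumerate, which is presumably why the calculator balked.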

Robin Debreuil
Wednesday, September 18, 2002

XJ--

Hahahahaha.  That was beautiful.  I have not read anything about this debacle except here on JOS, but really, what more needs to be said?

I can't believe those questions are serious, but here are my answers anyway.

1) Of course it should be shut down, at least by your description of its current state.  Would not surprise me if it isn't, though.

2) It's the responsibility of project planners to stand up to salesmen.  Don't blame salesmen for being salesmen.

3) What do I know about Big Process?  Not much.  But I am pretty sure that if you don't really know what you want to do, Big Process will not make up for it.  Sounds like that was part of the problem here.

4) Obviously it is technically possible to do this project right.  Technically it could be done cheaper, faster, more reliably and safer than some 20-odd year old system.

It should be obvious that there can not be umpty-million dollars up for grabs without a lot of companies grabbing for it.  I really don't understand blaming the companies for this, though.  It is NASA's job to know what they want, and what is a workable way to get there.  If they don't know, or don't have the guts to insist on it, then yes the project is doomed.  From reading your #6, I would not be optimistic about future attempts.

Makes me a little scared to think about what the DOD is doing.

Matt Conrad
Thursday, September 19, 2002

Thanks Matt C.,

I agree with your responses to the questions.

I was partially rabble-rousing with my rhetoric about the companies -- you are right that it is ultimately the NASA PM's responsibility to know what they need and shoot down absurd self-serving proposals. Perhaps the PMs are not knowledgeable enough to make such a determination.

But I do think that companies working on these sorts of projects have an ethical responsibility as well. Should Ford be allowed to get away with intentionally putting faulty fuel tanks on Pintos to save money? Should pharmaceutical companies be allowed to get away with substituting drugs that work with cheaper poisons that don't to make more money? Etc. I hope we can agree that the answer is 'no' and that if they do such things, they should be held accountable. Likewise it should be for design of systems like the shuttle, or nuclear missile guidance systems, or nuclear power plant control systems, or things where there is a risk to human life. Obviously, people at Sun knew that the system they proposed would not work -- they just wanted the publicity. Obviously, Motorola/IBM knew that single-board microcontrollers serving web pages is an inane way to build a space-shuttle sensor determining whether to shut down the engines. Because of this, they and the other characters involved in this imbroglio should be hung out to dry, or at least castrated and put in wooden stocks in the public square.

Allow me to correct one of my claims -- the original announcement of the project stated that an immutable spec was that modifications to the shuttle itself were not allowed.  Assuming that spec didn't change, the original sensors on the shuttle *are* still there and the PowerPC-based sensor boards mentioned in the articles must then have been on the launch tower, relaying data to the control room.

X. J. Scott
Thursday, September 19, 2002

'3. Politics kicks in. Sun must have their workstations used and it's critical that the system be written in Java because they plan a big ad campaign based on the fact that the shuttle runs on their hardware and software.'

Java.  They wanted a realtime, critical system written in Java.  I hope I'm misreading this.....

Jason McCullough
Thursday, September 19, 2002

In terms of my earlier comment about the "monumental" task of writing 3 million lines of code...

The following article lists a "stats" sheet at the end of a post-mortem for the game "Black & White"

http://www.gamasutra.com/features/20010613/molyneux_01.htm

-------------

Lionhead Studios
Black & White

Publisher: Electronic Arts

Full-Time Developers: 25

Contractors: 3

Budget: Approx. £4 million (approx. $5.7 million)

Length of Development: 3 years, 1 month, 10 days

Release Date: March 30, 2001

Platforms: Windows 95/98/2000/ME

Hardware Used: 800MHz Pentium IIIs with 256MB RAM, 30GB hard drives, and Nvidia GeForce graphics cards

Software Used: Microsoft Dev Studio, 3D Studio Max

Notable Technologies: Bink for video playback, Immersion touch sense for force-feedback mouse

Project Size: Approx. 2 million lines of code

-------------


While a game cannot compare to a system like this, it just shows that it can be managed (and within a reasonable budget)
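As a crude illustration of the gap, the figures quoted in this thread can be turned into a cost per line of code. This is only a sketch in Python. It treats the CLCS figure of 3 million lines as a planned size, not delivered code, and it ignores that the CLCS budget also covered hardware, so the comparison is rough at best.

```python
def cost_per_line(budget_usd: float, lines_of_code: int) -> float:
    """Budget divided by line count; a blunt metric, used here only for scale."""
    return budget_usd / lines_of_code


# All figures are the ones quoted in this thread.
projects = {
    "Black & White (approx. budget)": (5_700_000, 2_000_000),
    "CLCS (original estimate)": (206_000_000, 3_000_000),
    "CLCS (revised estimate)": (533_000_000, 3_000_000),
}

for name, (budget, lines) in projects.items():
    print(f"{name}: ${cost_per_line(budget, lines):,.2f} per line")
```

Even on the original estimate, the planned cost per line is more than twenty times the game's.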

Also, who decides how many lines a system needs before the project starts??!! I've never heard of someone even attempting to put forth a guess before they start, never mind worrying about trying to hit that target..?

AEB
Thursday, September 19, 2002

<<<Perhaps the PMs are not knowledgeable enough to make such a determination.>>>

What good are they, then?

<<<But I do think that companies working on these sorts of projects have an ethical responsibility as well.>>>

Well, me too, and clunking up a safety system for the sake of pleasing your marketing department is pretty low.  However . . .

<<<Obviously, people at Sun knew that the system they proposed would not work -- they just wanted the publicity.>>>

What, the publicity of having top billing on a five-year $273 million disaster?

I am sure Sun at least intended to deliver something that worked.  Whether they really thought it was the optimal way to meet NASA's needs . . . who knows, but it's kind of hard to believe.  But I also don't believe Sun was planning a failure from the beginning.  They thought they could push their private agenda and still make the system work, and it just didn't. 

<<<Because of this, they and the other characters involved in this imbroglio should be hung out to dry, or at least castrated and put in wooden stocks in the public square.>>>

This mess reflects badly on everyone involved, but NASA was still the one running the show.  Maybe if NASA had turned the whole thing over to Sun and let Sun take full responsibility for getting the job done, I could understand the need for some unpleasant stockyard scene involving Scott McNealy.  But as it was, at the end of the day it was NASA that had to make the decision of ok or nokay, and they were the ones that kept saying yes.

You know, this could tie back into Simon Lucy's thread about outsourcing . . . well, it's late.  Not tonight.

Matt Conrad
Friday, September 20, 2002

"Java. They wanted a realtime, critical system written in Java. I hope I'm misreading this....."

6... 5... 4... We have main engine start... 3...  2... Garbage collection running... Garbage collection running... Garbage collection running...

Adrian Gilby
Friday, September 20, 2002

This is nothing compared to the fiascos of the FAA.

Literally billions of dollars have been spent attempting to upgrade the FAA's flight control system in the past 30 years.  At least 3 efforts have been totally scrapped, including one by IBM.

We're still flying on 60's technology.

One of the big fears for Y2K was the potential failure of the no longer manufactured or supported IBM computers.

I was peripherally involved in assessing one of these efforts in the mid-80s.  Fearlessly and correctly forecast its failure.

The latest attempt was to replace only parts of the system, primarily "monitors".  Flight controllers complained that the pull-down menus obscured the more important parts of the visual display.  Yep, the planes and flight paths!

I remember rejecting UNIX years ago because the file system couldn't guarantee integrity in the event of a power failure (a problem since fixed).

In my early days, I worked a fair number of space and communication programs (TDRSS, MILSTAR, GWEN, etc).  I never saw one that came close to being on-time or on-schedule.  Often, it was due to constantly changing requirements.  The government would come in 6 months before delivery and demand a whole new set of features.  I've even been in a position of negotiating requirements while acceptance testing was going on in the next room!

I doubt things have changed much...

Jeff Jacobs
Friday, September 20, 2002

I remember when I first started using Java there was always this big warning that you 'couldn't use Java for life and death systems like nuclear reactors' etc. Is that not the case anymore, then? I know it was meant to be an embedded language; I guess that was just for early versions?

Robin Debreuil
Sunday, September 22, 2002

I gathered from the articles that the Java/html systems were to be "features" instead of anything important.

In 1997, Java was just too primitive.  Ideally you don't want to touch Sun code but instead a Java implementation.

There's a Fred Brooks lecture here, where among other things he talks about the insanity of working with the gov't and other bureaucratic committees.
http://wean1.ulib.org/cgi-bin/metawin-lectures.pl?target=Lectures/Distinguished%20Lectures/2002

I dunno, what's the difference between this one NASA project and those dotcoms that wasted a gazillion?  It was 1997, some people capitalized on the dotcom hype for funding.  It's just sad, you can see that NASA was very politicized with some people angry at the obvious boondoggle.  The team probably couldn't find many talented people either.

Truly, this is not the Tao.

Sammy
Sunday, September 22, 2002

"what's the difference between this one NASA project and those dotcoms that wasted a gazillion?"

Quite a bit! The two biggest differences that come to mind:

1. With dotcoms, lives are not directly at stake as a result of the blunder (the NASA system genuinely needs to be replaced for safety reasons).
2. With dotcoms, the people funding them know that they are taking a risk for a possible big payoff. In the NASA case, I am being forced to pay for this even though I do not benefit if the program succeeds.

"The team probably couldn't find many talented people either."

Do you mean because the dot-com type attracted 'all the best people'?

While I agree that government defense type contracts during that era tended to attract a lower caliber of engineer, I think that the mystique and excitement of space travel has always allowed NASA to get the best engineers at a bargain, if they chose to take them on.

And the very very best engineers I have known have never chased the money, but always put their talents to work trying to accomplish something worthwhile.

X. J. Scott
Sunday, September 22, 2002

I feel like I'm being offtopic, but... my ladyfriend's mother was over and I feel like shootin the breeze.  I'm not sure about your points 1) and 2).  For 1) neither of us is informed enough, and for 2) we'd need to be economists to actually say if the ripple effects of gov't spending didn't make the actual cost less than we think.  In fact, had the tools been released as open source, the impact would be even less.

About not finding enough talented people, I was thinking of the 400 people laid off.  That suggests something to me, that for a 10 year project that was slated for 5 years, they went on a hiring binge.

Keep in mind that I'm piecing this together, and there are more informed people to reply to.  My intellectual interest here is that I'm in my first job with bona fide management bozos, and I'm happily learning politics while I'm here.  Soon I'll get my best friend into this country and we can have fun finding out what really is fulfilling.

Sammy
Sunday, September 22, 2002

It's a bit interesting that nobody has commented on the original price tag of $206M.

So let me make some observations, based on past experience and educated guesses.  Anybody with more facts, feel free to correct me.

This system is far more than just software.  It also involves interfacing to a huge range of existing, typically obsolete or one-of-a-kind hardware interfaces.  So there's probably also a significant amount of hardware design and manufacturing.

The shuttle and its launch structure have thousands of sensors.  Almost certainly, 10s of thousands.  Maybe even 100s...

These are all inputs to the system.  Not only does the equipment that is being monitored fail, but sensors also fail.  "Failure" can mean inaccurate readings, not just hard failure.

Think about how you would test such a system.  Do you go out and hire a bunch of human testers?

No way.  You have to effectively build a system that will simulate the launch complex and shuttle.  This involves developing complex software, test drivers, test data and scenario generators, test result maintenance, and custom hardware.  All of which has to be of higher quality than the deliverable system.
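To make the idea concrete, here is a toy sketch of what "simulate the sensors, inject failures, and check the monitor against them" could look like.  Everything in it is an assumption made up for illustration (the names `simulate_sensor`, `vote` and `run_scenario`, the three-way redundancy, the tolerance and the drift size); it says nothing about how the real launch system or its test rig worked.

```python
import random
from statistics import median

def simulate_sensor(true_value, fault, rng):
    """Return one reading from a simulated sensor (hypothetical model).

    fault is None (healthy), "dead" (hard failure, no reading) or
    "drift" (still reporting, but inaccurately).
    """
    if fault == "dead":
        return None
    noise = rng.uniform(-0.5, 0.5)               # assumed measurement noise
    offset = 15.0 if fault == "drift" else 0.0   # assumed drift size
    return true_value + noise + offset

def vote(readings, tolerance=2.0):
    """Median-vote across redundant sensors.

    Returns (voted value, indexes of sensors that look failed).
    """
    live = [r for r in readings if r is not None]
    if not live:
        return None, list(range(len(readings)))
    mid = median(live)
    suspects = [i for i, r in enumerate(readings)
                if r is None or abs(r - mid) > tolerance]
    return mid, suspects

def run_scenario(true_value, faults, seed=0):
    """Generate one test scenario and run the monitor over it."""
    rng = random.Random(seed)   # seeded so a scenario can be replayed
    readings = [simulate_sensor(true_value, f, rng) for f in faults]
    return vote(readings)

if __name__ == "__main__":
    # Sensor 1 drifts: the monitor should flag it and still report about 100.
    print(run_scenario(100.0, [None, "drift", None]))
```

Even this toy shows why the test side grows as fast as the deliverable: the simulator needs its own fault models, its own scenarios, and its own notion of the right answer.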

This is just as large an effort as the deliverable software.  A simple example from the TDRS ground station.  There was one PDP-11 mini-computer for processing telemetry from each satellite, plus a hot backup.

There were two equivalently sized PDP-11s for testing each of the telemetry processors.

Given the enormous size and complexity of this kind of system, and the "one of a kind" nature, the overrun isn't surprising.  Even SEI-5 organizations miss sometimes.

And of course there has been a tremendous loss of talent and knowledge over the years.

Jeff Jacobs
Monday, September 23, 2002

A couple more comments.

The original systems probably supported more than 64K RAM.  Even the PDP-11, a widely used 16bit mini-computer at the time, had separate addressing for instructions and data and had paging registers.
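For anyone wondering how a 16-bit machine gets past 64K, here is a rough sketch of the arithmetic.  It is a simplified model of PDP-11-style paging as I understand it (eight pages per address space, each page register holding a base in 64-byte units); it ignores page lengths and access checks, and the function names and register values are made up for the example.

```python
VIRTUAL_ADDRESS_BITS = 16   # what a 16-bit program can address directly
PAGE_OFFSET_BITS = 13       # low 13 bits: offset within a page of up to 8 KB
PAGE_COUNT = 8              # top 3 bits select one of eight page registers
PAR_UNIT = 64               # a page register holds its base in 64-byte units

def virtual_space_bytes(separate_i_and_d=False):
    """Bytes one program can address, with or without separate I and D space."""
    space = 1 << VIRTUAL_ADDRESS_BITS
    return space * 2 if separate_i_and_d else space

def translate(virtual_address, page_registers):
    """Map a 16-bit virtual address to a physical address (simplified)."""
    assert 0 <= virtual_address < (1 << VIRTUAL_ADDRESS_BITS)
    assert len(page_registers) == PAGE_COUNT
    page = virtual_address >> PAGE_OFFSET_BITS
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    return page_registers[page] * PAR_UNIT + offset

if __name__ == "__main__":
    print(virtual_space_bytes())                       # one 64K space
    print(virtual_space_bytes(separate_i_and_d=True))  # 64K code + 64K data
    # Hypothetical register contents: page 1 is mapped at physical 128K.
    registers = [0, 2048, 0, 0, 0, 0, 0, 0]
    print(translate(8192 + 5, registers))
```

So each program still sees only 64K at a time (or 64K of instructions plus 64K of data), but the pages can be pointed anywhere in a larger physical memory.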

More likely, the computers were IBM or UNIVAC mainframes.

Also, vacuum tubes were basically obsolete.  Discrete transistors and integrated circuits were in very widespread use.

Memory was probably magnetic donuts strung on wires, however :-)

Jeff Jacobs
Monday, September 23, 2002

This type of project happens all the time -- think of Apple's Copland or even IBM's kicks at the PC can prior to 'changing' their methodology.

Fifteen years ago I met a fellow who was a programmer on the team that produced the software for the Toronto Stock Exchange, using APL I believe. This group had done this some fifteen years before and the software was still in use, much to the chagrin of the TSE's management.

They had spent some $25M attempting to replace it whereas the first and successful project cost under $1M.

He was both upset and proud. Upset that a batch of baboons (i.e. Waterloo grads) had been paid a small fortune to accomplish nothing, but proud of what he'd helped accomplish on a shoestring budget.

The first problem is that nobody knows specifically what they are trying to accomplish. This ends up as a design-it-as-you-build-it process.

General Orde Wingate once said something to the effect that 'the difference between leadership and good leadership was an accurate imagination'.

Such people are generally kept away from meaningful projects because they interfere with the ambitions of others.

Ken Keller
Tuesday, October 08, 2002
