Fog Creek Software
Discussion Board




Backups comment/question

"so I'm going to try a LaCie Big Disk Drive connected to the backup server over USB 2.0 which is about $1.20/GB."

*a* LaCie drive?

As in "one"?

I think I see the flaw in your cunning plan...

Philo

Philo
Monday, January 19, 2004

I'd think this would do the trick....

http://www.lacie.com/products/product.htm?id=10118

Dan G
Monday, January 19, 2004

Alternatively, just add a couple of IDE controller cards (at about $25 each) and 8 160GB drives (even a quality brand should be obtainable for only about $120 per drive), then do software RAID to get the same capacity for slightly less (about $1 per GB) but with redundancy.  The problem with the LaCie is that they've probably just put whatever is cheapest in that box.
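
For anyone who wants to check the arithmetic, here is a rough sketch using only the prices quoted in this post. The RAID level is not stated, so the RAID 5 and mirroring figures are assumptions for illustration, and "a couple" of controllers is taken to mean two.

```python
# Rough cost-per-GB arithmetic from the prices quoted in the post.
CONTROLLER_PRICE = 25    # dollars per IDE controller card
CONTROLLER_COUNT = 2     # assumption: "a couple" means two
DRIVE_PRICE = 120        # dollars per drive
DRIVE_COUNT = 8
DRIVE_SIZE_GB = 160


def cost_per_gb(total_cost, usable_gb):
    """Dollars per usable gigabyte."""
    return total_cost / usable_gb


total = CONTROLLER_COUNT * CONTROLLER_PRICE + DRIVE_COUNT * DRIVE_PRICE  # 1010
raw_gb = DRIVE_COUNT * DRIVE_SIZE_GB                                     # 1280

# Assumption: RAID 5 gives up one drive's worth of capacity to parity.
raid5_gb = (DRIVE_COUNT - 1) * DRIVE_SIZE_GB                             # 1120
# Assumption: mirroring (RAID 1) halves the usable capacity.
mirror_gb = raw_gb // 2                                                  # 640

print(f"no redundancy: ${cost_per_gb(total, raw_gb):.2f}/GB")     # $0.79/GB
print(f"RAID 5:        ${cost_per_gb(total, raid5_gb):.2f}/GB")   # $0.90/GB
print(f"mirrored:      ${cost_per_gb(total, mirror_gb):.2f}/GB")  # $1.58/GB
```

Under the RAID 5 assumption the figure lands near the "about $1 per GB" estimate; an enclosure or power supply would push it higher.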

r1ch
Monday, January 19, 2004

Dan, the problem with a bigger disk is that it's still one head bounce away from being scrap metal. Rich grokked what I was getting at - esp. considering the concerned tone in Joel's post about having a few *files* not covered by backups, how can he put all his backup eggs in the basket of a single hard drive?

Philo

Philo
Monday, January 19, 2004

Joel, don't forget to make copies of your backups to store off-site. :)

Ian Ashley
Monday, January 19, 2004

r1ch, I would need an enclosure and power supply to do that; I don't have room and power for 8 more drives. But I'm open to other ideas.

Although reliability is important, these *are* for backups, and I think hard drives are sure to be at least as reliable as tape, if not more so. I've heard WAY more stories about the tape failing than the hard drive failing. And if a hard drive fails, you know it right away; if a tape on the shelf fails, you find out the first time you try to restore.

Joel Spolsky
Monday, January 19, 2004

Ok, add on an 8 bay chassis - only an extra $260 or so.

r1ch
Monday, January 19, 2004

The key thing about tape backups is that you can take them offsite easily. Also, tapes are automatically redundant and offer a history. "Last five versions stored"? When I'm coding that's about twenty minutes. Tapes generally give you a week to a month of history.

You mention backing up over the internet - I trust that solves the "Can a single firemain break wipe me out" problem?

Finally, and I really have to admit I'm kinda startled - ever think the reason you hear so many stories about tape failures is that so many people use tapes? You don't hear about hard drive failures because when a hard drive fails, you restore from backups and get a new hard drive on warranty.

I've had four hard drives fail in my computing lifetime and zero tapes (and I've used a LOT of tape drives).

One other thought - just about every "tape failure" story I've heard has been "that's when we found out the backups were no good." Why? Because the backup solution was flaky to start with and nobody ever checked. Obviously your hard drive solution is equally vulnerable to *that* problem, and the solution in both cases is the same:
1) Assume nothing
2) Follow up and check
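
As one way to "follow up and check", here is a minimal sketch that compares a backup copy against the live files by checksum. It assumes the backup is a plain directory tree that mirrors the source (not a tape image or a proprietary archive); the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return (relative path, problem) pairs for files missing or different in the backup."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file():
            problems.append((str(rel), "missing"))
        elif file_digest(src) != file_digest(dst):
            problems.append((str(rel), "differs"))
    return problems
```

An empty list means every source file has a byte-identical copy in the backup. It does not prove that an application-level restore works; that still has to be tried for real.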

Final thought regarding hard drive backups - no solution is viable if every copy is available to the OS at the same time. We haven't seen a widespread destructive virus yet, but think about it. ("Destructive" as in "randomly change a hundred digits in every data file"). If you're going to use a hard drive backup strategy, you MUST have an offline failsafe (as in "no power"). You could solve this with the USB solution by having two at the NOC and having the NOC personnel switch plugs every night (assuming the backup software has a "rebuild in place" methodology). Of course, if you get the destructive virus at 10pm and it runs until 6am, you're still screwed...

Philo

Philo
Monday, January 19, 2004

In my career, I've twice worked in places where a restore from tape backup was necessary. In both cases, the tape backup failed but it was too late to do anything - the software that ran diagnostics to prove the backup was good didn't work in one case, and in the second case the tapes had all degenerated within a few weeks of the backup being made.

Don't forget that the reason NASA has to take longer to get back to the moon is because they have lost ALL their data from the Apollo project - none of the tapes were readable by the time anyone thought to try and get the data off-tape.

Tapes are useless. Anyone using tapes is a fool.

CDRs are also useless for backup.

Some say that CD-RW are better. I read up on that. It does look a bit better.

Ultimately a harddrive is your best bet. Joel's 100% on target with this one.

Dennis Atkins
Monday, January 19, 2004

Another issue is that even if your harddrive is burned in a fire, then thrown off the top of a 100 story building into oncoming traffic, you can get most of the data back for the right price. The same cannot be said for tape.

Dennis Atkins
Monday, January 19, 2004

Okay, even if I give you "hard drive > tape" I'm guessing you'd go with more than one drive? ;-)

Philo

Philo
Monday, January 19, 2004

A bunch of hard disks connected to different computers (at your office, at the home of each developer), each running rsync (http://samba.anu.edu.au/rsync/) or unison (http://www.cis.upenn.edu/~bcpierce/unison/), and you should be all set. I recommend you stick to simple, tried and true solutions, whether it's for hardware or software.

... while keeping in mind that, as told in the preface of "Unix Backup & Recovery" (http://www.oreilly.com/catalog/unixbr/index.html), no one cares if you can back up; only if you can restore, which means spending a lot of time regularly to check you are indeed backing up all the necessary data, and you know how to restore them, since each application requires a specific procedure.
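
A minimal sketch of the rsync part, assuming rsync is installed and reachable over SSH. The host names and paths are hypothetical placeholders, and the wrapper functions are made up for illustration; only the rsync options themselves (`-a`, `-z`, `--delete`) are standard.

```python
import subprocess

# Hypothetical destinations: one office machine and two developers' home machines.
DESTINATIONS = [
    "backup@office.example.com:/backups/project/",
    "backup@dev1.example.com:/backups/project/",
    "backup@dev2.example.com:/backups/project/",
]


def build_rsync_command(source, destination):
    """Build the rsync invocation for one destination."""
    # -a preserves permissions, times, and symlinks; -z compresses in transit;
    # --delete removes destination files that no longer exist at the source.
    return ["rsync", "-az", "--delete", source, destination]


def mirror_everywhere(source, destinations=DESTINATIONS, runner=subprocess.run):
    """Push `source` to every destination and return the destinations that failed."""
    failed = []
    for destination in destinations:
        result = runner(build_rsync_command(source, destination))
        if result.returncode != 0:
            failed.append(destination)
    return failed
```

Note that `--delete` makes each copy an exact mirror, so a file deleted or corrupted at the source is deleted or corrupted everywhere on the next run. Dropping that flag, or keeping dated snapshots on the receiving side, is one way to retain some history.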

FredF
Monday, January 19, 2004

"Ultimately a harddrive is your best bet. Joel's 100% on target with this one."

Technically, I'd argue that magneto-optical is the best, but not nearly as cheap (roughly $8/GB for 9.1GB 5 1/4" media).  It has the advantages of random access and fast reading/writing capability like a hard drive, but can still be easily taken off-site and in comparison to tape and hard drive failure rates, MODs are nearly indestructible.  Unlike tapes and hard drives, they have an archival life of about 50 years.  It's a question of whether the data to be stored is important enough to justify the additional expense.  For most applications, the answer is probably no.

Matt
Monday, January 19, 2004

How many years ago were MODs invented? I seem to remember CDR manufacturers quoting figures like 200 years shelf life when they first came out...

Chris Ormerod
Monday, January 19, 2004

Philo,

Absolutely. Have a HD off site in case of catastrophic failure like fire. Down at Circuit City, they've got these pocket drives that are the size of a floppy disk and hold 10-40GB of data. Switch out these ones periodically.

As for what happens, fire aside, if the backup drive fails: it is not an issue. If it does fail, you replace it. As long as you have decent power regulation, the chance of the main drive failing at the same time is less than the chance of the site getting hit by a meteor. And if you don't have stable power, you'd burn out a tape drive as well during those 10,000 volt spikes. The trouble with tapes is they fail silently, sitting on the shelf, or were never written right in the first place, but the checksum software says it's OK because it's buggy. I'd only consider a tape drive if it came with a $250 million ironclad insurance policy against loss for any reason - a policy they couldn't get out of if stuff went wrong and where the payout was assured even when they go out of business.

So, if I'm not worried about fire or meteor damage, even a one-drive backup I'd consider to be better and more likely to restore data if the main drive goes than a thousand tapes.

Dennis Atkins
Monday, January 19, 2004

Chris,

The claims I recall were 75 years. The claims have not panned out - after 5 years of perfect archival storage you can expect to lose 50% of your CDRs.

The word is that CDRW is better because it stores the data by altering the phasic properties of the material by subjecting it to heat and fields at the same time; the heat somehow enables the fields to affect the molecular structure of the substrate, rather than burning holes in the ink. So I'm backing up to CD-RWs right now and will let you know in 25 years if they are any good.

Be warned that they are very slow compared to CDRs.

Dennis Atkins
Monday, January 19, 2004

Damn, I've survived two hard drive failures.  Now I gotta worry about meteors??!  Man, I wasn't sleeping much as things stood, now this.

veal
Monday, January 19, 2004

You had two drives fail at the exact same time and you have a power regulator?

Stay out of thunderstorms, son.

Dennis Atkins
Monday, January 19, 2004

"How many years ago were MOD invented?"

According to IBM's website, "The basic formulation used in all current magneto-optic disks was invented at the IBM research laboratories in Yorktown in 1971 [1] (for which the inventors were recently awarded the National Medal of Technology by President Clinton)."

"I seem to remember CDR manufactures quoting figures like 200 years shelf life when they first came out..."

CDR is a completely different technology.  CDR depends on photo-sensitive dyes.  Being photo-sensitive, they have a tendency to darken or fade with age and exposure to light.  The archival life is very much dependent upon the dye used.  Not all CDR media is created equal.  MO technology depends on heating a material past its Curie point with a high-power laser, applying a magnetic field to change the polarity of the heated material, and then cooling it to make the polarity change permanent.  The readout process is done with a low-power polarized light beam that is affected by the magnetic polarity of the material.  Basically, to ruin the disk you'd have to heat it well above 400 degrees Fahrenheit or else subject it to a very large magnetic field (on the order of 0.1 Tesla).  Also, MODs are packaged as cartridges, so scratches are generally not a problem like they are for CDRs.  So, as long as you store them in a fire box (as you should do with any archival media) and aren't so silly as to store them in the same room as an MRI or other high-field magnet, they should be just fine.

Matt
Monday, January 19, 2004

Joel also mentions that he is doing "server" backups to an additional offsite hard drive - I'm guessing that the uber important files would be on this server getting backed up remotely as well. That's why I did not find issue with just having a single drive locally.

And in agreement with some other comments above, it is just a backup drive; it's easy to replace if it breaks, and so long as it is replaced promptly if it fails, I see minimal risk here.

Dan G
Monday, January 19, 2004

So Matt, just to clarify, CD-RWs are MODs because they work exactly that way, right?

Dennis Atkins
Tuesday, January 20, 2004

"So Matt, just to clarify, CD-RWs are MODs because they work exactly that way, right?"

Not exactly, but there is some similarity.  AFAIK, there's no magnetic component to the way CD-RWs work.  It's accomplished via a phase change to an amorphous form from a crystalline form or vice versa (which have different reflective properties) by heating with a laser and then either allowing the material to cool rapidly or cooling slowly via an annealing process.  Still, a CD-RW should archive better than a CD-R.

Matt
Tuesday, January 20, 2004

Oh cool! So no magnetic field. If it cools slowly, it lines up phase aligned, and if it cools fast then it's all randomly aligned.

How do they control the rate of cooling when it's spinning so fast?

Dennis Atkins
Tuesday, January 20, 2004

"How do they control the rate of cooling when it's spinning so fast?"

I don't know the specifics all that well, but I would assume that the recording layer is on the surface of the aluminum disc, underneath the layer of plastic.  Plastic, being a pretty decent insulator should hold in the heat so it's not dissipated too quickly via convection.  They probably just keep hitting that spot with the laser as the disc spins and either lower the intensity of the beam or keep the intensity constant and decrease the amount of time that spot is exposed to the laser on subsequent revolutions.

Matt
Tuesday, January 20, 2004

Ah, so that's why they burn at a slower rate - they have to do multiple passes. Makes sense.

Dennis Atkins
Tuesday, January 20, 2004

I thought (I may be wrong), that CD-RW were magneto-optical drives in that when the layer ablates it reverses polarity.  Whereas CD-R was an optical system.

All backup media will degrade, not simply those depending upon magnetic domains. If you etched the data into tungsten bars it would still degrade over time, but it would be a very long time.

Given some recent experience I'd use tape for daily backups where the likelihood of the backup being useful is within a short time of the tape being written in the first place.  But then continual use of the same set of tapes (in the time-honoured grandfather, father, son scheme) is just waiting for failure to occur.

For stuff that needs to be kept longer I'm burning CD-R and DVD-R and now I'm just embarking on re-copying all the archived CDs I have from over the years, some of which have already shown signs of loss of data.

Simon Lucy
Tuesday, January 20, 2004

I wondered why Joel is using the special Dantz stuff to back up SQL Server?
I always use the backup facilities of SQL2K to back up to a local drive and then back up the backup. The way I see it, adding an extra layer of complexity to the process does not give me any advantages, and adds a whole new level of "things that might go wrong" to the mix.
I am not a DBA (more of an all-rounder by necessity and choice), but most of the professional SQL Server DBAs I have heard on this topic seem to agree.

Just me (Sir to you)
Tuesday, January 20, 2004

Dennis, I'd actually beg to differ with you.

NASA not getting back to the moon is primarily because of politics, not the lack of data.  Plans for the Saturn V have not been lost, they are in fact safely stored on Microfiche.  A lot of good it does them, however, because all of the Saturn V facilities are either gone, not available, or currently being used for the space shuttle.

Everything "important" has been saved simply because everybody has copies of it.  Most of the stuff lost was stored on tape because nobody really cared to even look it up.  They've done much better in recent years after people got burned about decaying tapes.

That being said, the jury's out on CD lifetimes.  There are arguments on either side.  The NIST has weighed in with their opinions at: http://www.itl.nist.gov/div895/carefordisc/disc_care/
which I would rate higher than most people's off-the-cuff opinions, which are about as useful as a green CD marker (which was alleged to make audio CDs sound better, for those of you who don't get the reference).  They maintain that CD-RWs are not to be relied upon for archival storage.

I think the big problem is that we are at a transition point between methods of keeping files backed up and around.  Especially given that hard disks are getting so cheap compared to tape.

The thing is, a backup needs to be tested.  If you don't try to do an actual live restore of your backups, you are dumb.  If you don't look at your archive disks every year to make sure that they are all readable, you are dumb.

Flamebait Sr.
Tuesday, January 20, 2004

Well the first recordable CD machine I bought cost $4600 so you can be the judge as to whether or not I know anything about CD lifetimes. I am talking from hard data in the form of bad CDs of every imaginable brand name, all stored in a temperature and humidity controlled dark cabinet designed for archival materials. I am not kidding when I say CDRs don't last and the dozens of people I know who also are seeing their CDRs fail 4-7 years out have had the same experience.

Regarding tape drives, we did check to confirm the backup was made. The software that confirmed the backups were good turned out to be buggy. Actually restoring the data is not practical since the data on the live machines changes too often. Bottom line: tapes don't work, they can't be trusted, and the advocates of tapes advocate absurd procedures for supposedly maintaining the fragile and expensive tape system, none of which is necessary when using the better alternative: hard drive backups.

Dennis Atkins
Tuesday, January 20, 2004

I do the same as Just me (Sir to you): use SQL Server's backup to make backups of live databases (it deletes backups older than 2 weeks for me, nothing super important), then just back up these backup files with my other files.
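
The two-week cleanup here is handled by SQL Server itself. For backup files that land in an ordinary directory, a hypothetical stand-in for the same retention rule could look like the sketch below. The `.bak` pattern and the function name are assumptions, not anything SQL Server requires.

```python
import time
from pathlib import Path

TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60


def prune_old_backups(backup_dir, pattern="*.bak",
                      max_age_seconds=TWO_WEEKS_SECONDS, now=None):
    """Delete files matching `pattern` that were last modified before the cutoff.

    Returns the names of the files that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(backup_dir).glob(pattern)):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return removed
```

Running this before the file-level backup keeps the second copy from growing without bound, at the cost of having no history older than two weeks.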

Ben R
Wednesday, January 21, 2004
