Fog Creek Software
Discussion Board




To Ghost or not to Ghost ...

This was maybe a bit misplaced before, but I would like some opinions on this proposed setup. I have not personally used Ghost before, but I have often heard it commented on favorably.
The idea would be to do overnight Ghosting of the developer machines (every night, every dev machine) on a fully switched network. Are there any gotchas here that I should be aware of? The goal is a very simple off-site backup solution with fast recovery times in case a machine dies.

Here is what I would like:

dev machines (hardware all image-equivalent)
                    |
      switch (100 Mbit ports / 1 Gbit uplink)
                    |
Gbit line to a different building (for off-site storage)
                    |
      switch (100 Mbit ports / 1 Gbit uplink)
                    |
some cheap machines dedicated to storing the images

If something happens, a dev could grab a standby spare and ghost it with last night’s image. They should be up and running again in 2-3 hours max, no?

Does this sound ok? Dumb?

Just me (Sir to you)
Friday, January 24, 2003

Hi, I just found this site and its articles yesterday (while researching functional specifications). It's incredible how much useful information on related subjects is available here. I only wish I had found and started following this site sooner. Anyway, on to the Ghost question.

Ok, I am by no means an expert on this matter (mostly because we don't have the infrastructure to support it), but nonetheless I do have a fair amount of experience with Ghost. To be completely honest, your plan sounds a bit like an oversimplification.

It really depends on how many development machines you are talking about, how much is on their drives (OS, apps, and data), and how big your backup window is.

If you are talking about relatively few machines with maybe a gig on each drive, then you should be fine.

If you are talking about a large number of machines, a significant amount of data, or a very tight time frame to do them in each night, then I would look at a different solution, probably something that allows incremental backups instead of full backups.

Our company has a Ghost image of XP with all of our standard apps that is about 1.2 GB without any user data. So let's say that with data the machines average double that, 2.4 GB, and that I am dealing with a group of only 10 that I need to back up. It has been my experience that Ghost can take up to 10 minutes per 600 MB, so each of these would take up to 40 minutes to complete.

The speed of the storage machines' drives, and how many of them you are using, is the limiting factor on how many backups you can run simultaneously without a significant impact on speed. If I were backing up these 10 systems to 2 desktops with regular IDE drives, I would not try to do more than one backup per storage system at a time. So, doing only two at a time, it might take me as much as 3 to 3 1/2 hours. With servers and striped SCSI drives, you would be able to do a lot more at one time.
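
To make the arithmetic explicit, here is the same back-of-envelope estimate as a quick script (the figures are the assumptions from this post, not measurements):

    # rough backup-window estimate, using the numbers above
    awk 'BEGIN {
        mb_per_machine = 2400    # assumed average image size, in MB
        min_per_600mb  = 10      # worst-case Ghost throughput I have seen
        machines       = 10
        concurrent     = 2       # one backup per storage box at a time
        per_machine = mb_per_machine / 600 * min_per_600mb   # 40 min
        total = machines / concurrent * per_machine          # 5 rounds
        printf "%.0f min per machine, %.0f min (~%.1f h) total\n",
               per_machine, total, total / 60
    }'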

It's really hard to guess (at least for me); I have seen so much variance between similar networks and between various storage systems. Your best bet is to just test it out in your environment: start with one machine and keep re-testing with more simultaneous machines to see what you are able to sustain and how long they take. Once you know your averages, I would stagger the jobs as much as possible, with a fair amount of buffer between the end of the work day and the start of imaging, and between the end of imaging and the next morning. And keep an eye out for the developer who likes to listen to music while they work: all too often you will find that your estimated 2 gig drive has 6 gigs of MP3s on it, for 8 gigs total.

All backup factors aside, if you need to recover a machine that was backed up with Ghost, I would count on more like 30 minutes to 1 hour of recovery time. It's not a cure-all, but it is a reasonable method.
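
For reference, a bare-metal restore from a stored image boils down to a single command run from the Ghost boot disk, something like the line below (the path is made up, and the exact -clone switch syntax varies between Ghost versions, so check your manual):

    ghost.exe -clone,mode=load,src=z:\images\devbox.gho,dst=1 -sure

Here z: would be a network drive mapped from the boot disk, and dst=1 is the first physical disk.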

If you ask me, a better way would be to create a generic Ghost image (or even one per machine, if there aren’t that many of them) and then set up something to do nightly backups of data only. This would make backups significantly faster, easier, and more reliable, and recovery would still only take about an hour for machine-specific images, or two hours for a generic image.
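
As a sketch of the data-only half, assuming a Unix-style toolset is available (rsync also runs on Windows under Cygwin) and using made-up directory names:

    #!/bin/sh
    # nightly data-only backup: copy just the working files, not OS or apps
    # /home/dev/projects and backupserver are hypothetical; adjust to taste
    rsync -az --delete /home/dev/projects/ \
        backupserver:/backups/$(hostname)/projects/

With only data moving over the wire, the nightly-window problem largely disappears.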

Doh! I guess I got a little long-winded, didn’t I? I would be happy to further discuss some of the intricacies of doing Ghost images if you like...

Thanks,
Rich

Richard
Friday, January 24, 2003

It sounds like you're planning to use Ghost as an alternative to doing backups.

There are two problems.

(1) Even if you do a backup every single night, a crashed hard drive will still result in a day's lost work. This can be incredibly frustrating. Just recovering from a day's lost email can take hours.

(2) If a programmer accidentally ruins an important file or deletes something, you don't have a good way to recover it.

For backups, use a real backup program: I suggest NetBackup Pro. It lets anyone get any of the last 5 versions of a file if they mess something up, and it can restore a hard drive from bare metal, although that is a slow and annoying process, as I am learning today. It also has the advantage of only copying files that actually changed over the network, and if multiple users have the same file (as they commonly do: OS, program files, etc.) it is only stored once on the server, which means the backups take up a lot less space.

But I still think you need a way to recover quickly from the most common problem: hard drive failure; and despite all the naysaying I think the best way to do that is to make sure developer machines have mirrored hard drives. This can be done expensively with RAID and SCSI or cheaply with ATA-RAID and commodity IDE drives.
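
(As one concrete analog of the cheap option: on Linux, a software mirror over two plain IDE disks can be set up with mdadm; the device names here are examples only.)

    # mirror two commodity IDE drives into one RAID-1 device
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/hda1 /dev/hdb1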

Joel Spolsky
Friday, January 24, 2003

"Just recovering from a day's lost email can take hours."

How?

Firstly, where is the email stored? For a single user with an internet connection who keeps his email in a .pst file on a hard disk, the answer is simple: make sure messages are kept on the server for three days, and make a backup every night. Then if your hard drive fails, you restore the backup, with the email correct up to the day before, and when you log on to the mail server it will download the lost messages again. Sent messages will be lost, but you can always send a copy to your Yahoo account or whatever.

Presumably you are referring to a setup where you have your own mail server. There you obviously need a backup program specifically designed to back up the mail server you use, and to do so while the program is running.

One possibility is for each user to keep his email in a .pst file on his own drive and leave a copy on the company mail server, as in the previous paragraph. This would cause problems for accessing mail from the web, though the setting could be changed every night.

Ghost was not originally intended as a backup tool. It was intended to enable quick installation of a system disk with all apps and settings ready, and to speed up initial deployment across the enterprise. In fact, Microsoft only accepted it as an installation method with the Windows 2000 Resource Kit.

In the last few months I have been reading about cloning being considered the best way of backing up enterprise servers. Soon there should be enough companies that have tried this to give you feedback.

Now, what you must do is distinguish between data and installation. Use Ghost to back up the system disk (that is to say, the operating system, the applications, and all the utilities you set up), but use a backup system for data. Keep the data on a separate disk, or a separate server.

The reason to use Ghost for the system disk instead of the backup system Joel recommends is that restoring a bare-metal setup from backup software takes a long time; this is what Ghost does very well. But to restore one file, or even all your data, where no installation is involved, there is no advantage to using Ghost, and plenty of disadvantages.

So: a) Use Ghost to back up your system disks.
    b) Keep data on a separate disk and use backup software for it (have a look at Second Copy from http://www.centered.com ).
    c) Get dedicated backup software for your email server and for any databases you have running.

And if it takes you two days to recover from one HD crash, like Joel claims it does, then remember to feed the hamster that's running the generator!

Stephen Jones
Sunday, January 26, 2003

"And if it takes you two days to recover from one HD crash, like Joel claims it does, then remember to feed the hamster that's running the generator!"

I agree with you. The problem could have been fixed without complication if Joel had known what he was doing. Leaving his computer running overnight (twice) to try to fix the hard drives may have been what killed the disks.

In my early days with FreeBSD, I once left my computer on overnight to recompile the kernel. The next morning the hard drive was dead, but I managed to trace the problem: because a driver module had failed to compile, FreeBSD read the same area of the disk over and over again at boot until the drive gave out -- some viruses trash disks the same way. I suspect the same thing happened to Joel's drives. Lesson: never allow unsupervised low-level operations on your data unless you have made a copy of its current state -- as is recommended when you resize partitions, defragment, etc.

Eddy Young
Sunday, January 26, 2003

Just me (something, something)

Why go to such an extreme? As long as you have a good version control program, such as CVS, you should only have to back up the source repository. Using Ghost for this is a waste of resources. Think about it: do you really want to also back up operating system files, knowing they run to hundreds of megabytes these days?

Here is the methodology I personally use.

A source repository is set up on a server, backed by a good version control system (and I am not talking about SourceSafe). Every time a programmer finishes with a file, he checks it back in. At regular intervals, a scheduled job backs up the source repository to another server. If a programmer's laptop dies, his latest changes will have been committed; if the server dies, there is the backup; if the backup dies, there is the server.
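
The developer side is just the usual check-in cycle; for example (the commit message is invented):

    cvs update                             # merge in everyone else's changes
    cvs commit -m "end of day checkin"     # push today's work to the server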

As I am lucky enough to be using FreeBSD, this setup was achieved very easily. My source repository is managed by CVS; my backup is done with rsync (which synchronises only changed files, minimising bandwidth usage and allowing short intervals between jobs); and I use different CVS clients to access the repository, whether under Mac OS X, Windows, or FreeBSD.
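
The scheduled backup can be as small as a single cron entry on the repository server (the paths and host name are examples; note that rsyncing a live repository can catch a commit in progress, so schedule it for a quiet hour):

    # crontab entry: push the CVS repository to the backup box nightly at 2am
    0 2 * * * rsync -az --delete /home/cvsroot/ backuphost:/backups/cvsroot/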

Links:

http://www.cvsup.org
http://www.cvshome.org
http://www.wincvs.org

Eddy Young
Sunday, January 26, 2003

"But I still think you need a way to recover quickly from the most common problem: hard drive failure; and despite all the naysaying I think the best way to do that is to make sure developer machines have mirrored hard drives. This can be done expensively with RAID and SCSI or cheaply with ATA-RAID and commodity IDE drives."

You do not have to use technology just for the sake of it. There are simpler ways to recover a developer machine. For one, you can use a hybrid solution. Use Ghost to mirror a typical developer setup with all the required development tools. Use CVS and rsync to version-control and backup the source code.

If a computer dies, you spend a maximum of 30 minutes getting a basic system back up and running. Then you use CVS to reinstate the source that the developer needs.
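
After reimaging, reinstating the source is a single checkout (the server path and module name are invented for the example):

    # pull a fresh working copy onto the rebuilt machine
    cvs -d :pserver:dev@cvsserver:/home/cvsroot checkout myproject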

In my personal experience, I have never seen RAID live up to its hype of allowing recovery from failure in a minimal amount of time. Certainly never under 30 minutes.

Just because cheap RAID solutions exist does not mean they are the best option.

Eddy Young
Sunday, January 26, 2003

We currently have a backup covering data only, so a restore takes "quite some time," as Joel aptly described.
In this case we are talking about roughly 10 machines with 20 GB drives, and a window of around 8 hours.
You guys are right that maybe I should not substitute Ghosting for the backup; both offer different advantages.
I like Stephen's suggestion best (Ghost system, backup data).
Thanks for the input.

Just me (Sir to you)
Monday, January 27, 2003
