Fog Creek Software
Discussion Board




Server Uptime?

If you ever wondered what server operating system is the most stable, check out this link. Maybe the average time between reboots isn't a perfect indicator of availability and stability, but there's one OS/web server combination on this list that's pretty impressive.

http://uptime.netcraft.com/up/today/top.avg.html

Tom H
Saturday, May 29, 2004

None of those sites are known to be heavy-traffic sites. An OS comparison of heavy-traffic sites is what I'd be interested in.

Bob
Saturday, May 29, 2004

http://uptime.netcraft.com/up/today/requested.html shows a lot of well-known sites. Note that they are not sorted by uptime.
Also, most of these sites are so big they probably run loads of servers, and I don't know exactly what uptime is measured. All Netcraft sees there is probably the load balancers.

Eric Debois
Saturday, May 29, 2004

"I don't know exactly what uptime is measured"

The FAQ explains what they're doing. Although as you say, on a clustered site like Google or Yahoo the numbers have to be suspect.

http://uptime.netcraft.com/up/accuracy.html

Interesting that www.sco.com is hosted on Linux. I guess they figure they own it so why not use it...

Tom H
Saturday, May 29, 2004

That is the reason we are using Apache and BSD.

Junichiro Kawaguchi
Saturday, May 29, 2004

That's a silly reason to use BSDi, as the link states:

Additionally HP-UX, Linux, NetApp NetCache, Solaris and recent releases of FreeBSD cycle back to zero after 497 days, exactly as if the machine had been rebooted at that precise point.
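The 497-day figure is what you would get from a 32-bit tick counter incremented 100 times per second. The sketch below is only that arithmetic; the counter width and tick rate are assumptions used for illustration, and the function names are made up for this example.

```python
# Assumption: uptime is derived from an unsigned 32-bit counter
# that is incremented 100 times per second.
COUNTER_BITS = 32
TICKS_PER_SECOND = 100
SECONDS_PER_DAY = 24 * 60 * 60


def wrap_days(bits=COUNTER_BITS, hz=TICKS_PER_SECOND):
    """Days until an unsigned counter of `bits` bits wraps at `hz` ticks per second."""
    return (2 ** bits) / hz / SECONDS_PER_DAY


def reported_uptime_days(true_uptime_days, bits=COUNTER_BITS, hz=TICKS_PER_SECOND):
    """Uptime an outside observer would infer if the counter silently wraps."""
    return true_uptime_days % wrap_days(bits, hz)


print(f"counter wraps after {wrap_days():.1f} days")
print(f"a box up for 600 days would look like {reported_uptime_days(600):.1f} days")
```

Under those assumptions, a machine that has really been up for 600 days would appear to have rebooted roughly 103 days ago.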

Koz
Saturday, May 29, 2004

I don't think uptime should be separated from high-volume sites.

A high-uptime site should show the same results under high load if it is scaled well.

Somorn
Saturday, May 29, 2004

Server uptime is a good indicator, but for the most sophisticated sites you'll have to manage, the big round numbers mean jack *bleep*. No downtime could possibly be tolerated. Customers care about your service uptime, in terms of response time and availability. If you expect to make them happy, you'll have to handle, say, the downtime of a FreeBSD box that's usually up for 2+ years at a time just as well as you manage a cluster of specialized servers that may reboot multiple times a day and receive service and software updates at any time of the day.

The key thing to remember here is that some people need to reboot their machines or change their service offerings all the time, not because their vendor is Microsoft but because that's part of the service expectation or design. If you dig deep into it, few Linux or Microsoft web-database stacks will ever be asked to handle a genuine live upgrade the way a telecom's specialized billing system requires. The illusion of uptime is only made complete because 1) there are clusters of these PCs, and 2) the illusion of a session and usage is maintained not in the inherently stateless web/application servers but in the backend databases. Blah blah blah blah blah blah blah.. I will go get a life.

Li-fan Chen
Sunday, May 30, 2004

Well if anybody I know has a linux or nt box up for about 500 days, I know they have a serious security issue. Who cares about the uptime of one machine?

Any box should be able to be pulled at any time, for any reason. There should be no human steps in the failover process.

If you want to look at uptime, think of service uptime, which takes into account failures, upgrades, and things going down a lot.
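As a rough sketch of what measuring service uptime could look like: take a reporting window and a list of outage intervals (failures and upgrades alike), merge any overlaps, and report the fraction of the window the service was reachable. The function name, the window and the outage times are all invented for illustration.

```python
from datetime import datetime


def service_availability(window_start, window_end, outages):
    """Fraction of the window during which the service was reachable.

    `outages` is a list of (start, end) datetimes. Overlapping outages
    are merged so the same minute of downtime is not counted twice.
    """
    total = (window_end - window_start).total_seconds()
    # Keep only outages that touch the window, clipped to its edges.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in outages
        if end > window_start and start < window_end
    )
    down = 0.0
    current = None
    for start, end in clipped:
        if current is None:
            current = [start, end]
        elif start <= current[1]:
            # Overlaps the outage being accumulated: extend it.
            current[1] = max(current[1], end)
        else:
            down += (current[1] - current[0]).total_seconds()
            current = [start, end]
    if current is not None:
        down += (current[1] - current[0]).total_seconds()
    return 1.0 - down / total


# Hypothetical 30-day window with one upgrade and one failure.
window = (datetime(2004, 5, 1), datetime(2004, 5, 31))
outages = [
    (datetime(2004, 5, 3, 2, 0), datetime(2004, 5, 3, 4, 0)),       # 2 h upgrade
    (datetime(2004, 5, 17, 14, 0), datetime(2004, 5, 17, 14, 30)),  # 30 min failure
]
print(f"{service_availability(*window, outages):.4%}")
```

Nothing here depends on how long any single box has been up; only the time the service as a whole was unreachable counts.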

Also, it depends on what you're doing: a tiny intranet site serving static HTML and nothing more will be doing less than some Linux or Win2k server running a heavily hit database.

fw
Monday, May 31, 2004

This uptime obsession is one thing I really fail to comprehend. I have known shifty academic admins that refused to install some security patches because it would ruin their "uptime" streak. Gawwwd!!!
Hey, if they're so obsessed, why not write a simple uptime service close to the metal? Think about it: you could completely change the OS running on top, maybe even shut down most of the hardware, and still have your "uptime" going strong.

Hey, maybe I could become a digital snake-oil vendor and try to pass off 99.999% SLAs based on "uptime". "I understand that your application only managed to be online for half a day during the past month, but we cannot give you any compensation, since the officially audited "uptime" service that forms the basis of our SLA indicates a full 100% for the period".
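To put numbers on that, here is a small sketch of how little downtime a 99.999% SLA allows, next to what "online for half a day in a month" works out to. The 30-day month and the function names are assumptions for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def allowed_downtime_seconds(sla_percent, period_days):
    """Downtime budget, in seconds, for a given SLA over a period."""
    return period_days * SECONDS_PER_DAY * (1 - sla_percent / 100)


def measured_availability_percent(online_hours, period_days):
    """Availability actually delivered, as a percentage of the period."""
    return 100 * online_hours / (period_days * 24)


# Assumption: a 30-day month.
budget = allowed_downtime_seconds(99.999, 30)
actual = measured_availability_percent(12, 30)
print(f"99.999% over 30 days allows about {budget:.1f} s of downtime")
print(f"online for half a day in 30 days is about {actual:.2f}% availability")
```

With those inputs the budget comes to roughly 26 seconds for the month, while half a day online is under 2% availability.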

Just me (Sir to you)
Monday, May 31, 2004

"they have a serious security issue"

With Unix you usually don't have to reboot for anything but a kernel upgrade. The other services are run as daemons that can be restarted as needed to apply patches. Apache even has a graceful restart so you don't abort any open connections.

Tom H
Monday, May 31, 2004
