Downtime
Join Date: Jan 2002
Location: Palo Alto, CA
Sorry about the downtime earlier this evening, folks. Somehow, the server that this site runs on dropped out of existence at about 5:00 p.m. CDT. Since many of you are computer-oriented, here's a more in-depth explanation:
I ssh'ed into the router and could ping every box on the LAN except for my server. As the server is 620 miles from me right now, I couldn't physically look at the box without a very long drive. Worse yet, the server is in a locked room to which only I have a key, so nobody closer could check on it either.
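For the curious, a ping sweep like that can be sketched in a few lines of Python. This is only an illustration, not the commands actually run that night: the addresses are placeholders, the function names `ping` and `unreachable` are made up for this example, and it assumes a system `ping` command is available on the machine running it.

```python
import platform
import subprocess


def ping(host, timeout_s=2):
    """Return True if a single ICMP echo request to host gets a reply."""
    # Windows ping uses -n for the packet count; Unix-like systems use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # No reply in time, or no ping binary: treat the host as unreachable.
        return False
    return result.returncode == 0


def unreachable(hosts, probe=ping):
    """Return the hosts that did not answer the probe, in input order."""
    return [h for h in hosts if not probe(h)]


if __name__ == "__main__":
    # Hypothetical LAN addresses, for illustration only.
    lan = ["192.168.1.1", "192.168.1.10", "192.168.1.20"]
    for host in unreachable(lan):
        print(f"{host} is not answering pings")
```

A host that shows up in this list is not necessarily dead. As this story shows, a box can be running fine and still be cut off at the hub or the cable.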
I assumed that a catastrophic hardware failure had occurred, most likely a primary hard disk crash. That, or somebody had knocked the patch cable out of the hub. Figuring that the primary server was dead, I brought up a backup box and fiddled with BIND and Apache so it could take over MX functionality and display a "We crashed" web page.
Skip ahead to 1:00 a.m.: I'm getting ready for bed, still trying to figure out what to do about the server. I fire up a web browser to confirm that the backup server is still alive. Instead of my generic "We crashed" page, I see the regular front page. Hmmm, must be the cache. Reload. Same thing. Browse to the forums. Everything works as it should. Woohoo!
Somehow, the server came back to life. A quick ssh session revealed that the box never crashed. Due to the quirky nature of DNS, and a bit of luck, the first server resumed authority over the domain as soon as it came back. Thus, the real web site resolved instead of the temporary page.
So what happened? I haven't the foggiest idea. The server never crashed or froze, and nobody could have physically touched it. There's a chance that somebody did something at the hub; however, nobody who should have been playing with the hub was around at the time.
In any case, we're back up and open for business.