Another upgrade, another problem

Yep, it happened again … about this time last year I was trying to upgrade my servers to Fedora Core 6 and ran into some problems.

Well, I decided it was time to upgrade to Fedora 8 … and, since I have time off, I figured this was a fine time to do it again.

Bad move.

Of course, in retrospect … there never would have been a good time to do the upgrade, based on the problems I encountered. At least I know my backup procedure is fairly good now.

I had been planning this upgrade for weeks … everything was set. In fact, the first half of the upgrade went smooth as silk. I upgraded the main web server (gondor) to Fedora 8 and it went pretty nicely. Only two issues, both of which were solved after a little research.

This gave me the confidence to proceed to upgrade Rivendell to Fedora 8.

I started the upgrade by booting from the CD so I could install Fedora from the DVD ISO image I had on a USB hard drive. Problem is, the system wouldn’t boot this way.

I tried a few more times with no luck.

Since Gondor was running OK, I decided to bag the upgrade for now and try again at a later time. This is when things started going all sideways.

I removed the CD from the system and rebooted normally … but it hung after displaying just “GRUB” on the console. This is not good. It’s usually an indication that either a) the drive configuration has changed in such a way that GRUB (the Linux boot loader) can’t cope, or b) the drive has failed.

Since I was pretty sure the drive configuration hadn’t changed, I figured that the disk had failed. The machine had been running 24×7 for a LONG time. Oddly enough, the machine had been running perfectly up until I tried the upgrade. And I know no part of the upgrade had actually been performed. I was able to get the upgrade CD to boot properly if I disconnected the IDE disk that was in the machine.

At this point I’m unsure what to do. I’ve got plenty of spare drives, but what’s the best approach? FWIW: My backup is from the previous night.

I decided to do this …

  1. Install a new SATA drive in the machine (the old drive was a PATA drive, due to the original RAID configuration that stopped working) and partition it with one swap and one ext3 partition.
  2. Restore all the data from the last backup to the new drive.
  3. Run the actual Fedora 8 upgrade.
  4. Try to recover what had been lost since the machine was backed up. Maybe I could recover some data from the old drive.
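
Steps 1 and 2 of that plan can be sketched as a dry run in Python. This is only an illustration: the device name, backup path, mount point, swap size, and the choice of parted and rsync are all assumptions, not the tools or values actually used on Rivendell. The function only builds the commands; nothing is executed.

```python
import shlex


def build_restore_plan(device, backup_dir, mount_point, swap_end="2GiB"):
    """Return the commands (as argument lists) for steps 1 and 2.

    Nothing is executed here; the caller decides whether to run them.
    All paths and sizes are hypothetical placeholders.
    """
    swap_part = f"{device}1"
    root_part = f"{device}2"
    return [
        # Step 1: new partition table, then one swap and one ext3 partition.
        ["parted", "-s", device, "mklabel", "msdos"],
        ["parted", "-s", device, "mkpart", "primary", "linux-swap", "1MiB", swap_end],
        ["parted", "-s", device, "mkpart", "primary", "ext3", swap_end, "100%"],
        ["mkswap", swap_part],
        ["mkfs.ext3", root_part],
        # Step 2: mount the new root and copy the backup onto it.
        # -a is rsync's archive mode; a full system restore may need more options.
        ["mount", root_part, mount_point],
        ["rsync", "-a", f"{backup_dir.rstrip('/')}/", f"{mount_point.rstrip('/')}/"],
    ]


if __name__ == "__main__":
    # Hypothetical device and paths, printed for review rather than run.
    plan = build_restore_plan("/dev/sdb", "/backup/rivendell", "/mnt/newroot")
    for command in plan:
        print(shlex.join(command))
```

Printing the plan first makes it easy to double-check the device name before anything destructive touches a disk.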

Well, the SATA drive got installed fine. Got it partitioned, the data restored, and started the upgrade.

The Fedora upgrade disk recognized the partition as a Fedora 6 install and offered to upgrade it. It also recognized the lack of a boot loader and offered to install one.

The upgrade went surprisingly smoothly.

After the upgrade completed, I started tweaking the configuration and made sure everything was working.

Oh yeah, I was able to take that IDE drive and hook it up to the system using an IDE-to-USB adapter … and was able to access the data on the drive without a problem. I’ll have to run that disk through the Seagate diagnostics to see what’s going on.

The system isn’t 100% solid yet … but it appears to be working OK.

Here’s a quick rundown of the problems I had …

  1. Gondor
    1. There was a loop in the upgrader (yum) that caused it to stop at 27% when checking for dependencies. This was resolved by this fix.
    2. The NIS client reported errors trying to bind to Rivendell. There’s no documented fix for this, but I was able to diagnose the problem. Turns out there is a check to see if the ypbind service is registered as an RPC service, and that check assumed there would be a 15 second timeout if it couldn’t find the service. Problem was, the test was failing before it got to the process that had the 15 second timeout. I added a ‘sleep 15s’ to the init script so the ypbind process had a chance to register as an RPC service.
    3. Turns out the HAL daemon is required now … I kept getting ‘Could not initialise connection to hald.’ errors on certain operations. Starting the haldaemon service solved that problem.
  2. Rivendell
    1. Drive wouldn’t boot.
    2. DNS configuration wouldn’t work. Got it working, but not happy with it. Still working on it.
    3. Samba server not working. Still working on this too.
    4. Mailman encountering a ‘raising a string exception is deprecated’ warning. Not going to worry about this, as it’s supposed to be fixed in the next version.
    5. ‘messagebus’ and ‘haldaemon’ services are required. Hard to figure out, but easily fixed.
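
The fixed ‘sleep 15s’ used for the ypbind problem on Gondor could also be written as a poll: keep checking whether ypbind has registered with the portmapper and give up after a timeout. Here is a minimal Python sketch of that idea, assuming `rpcinfo -p` is available and lists ypbind either by name or by its RPC program number (100007). The function names, the 15 second default, and the one second interval are illustrative choices; none of this is taken from the Fedora init script.

```python
import subprocess
import time

YPBIND_PROGRAM = "100007"  # standard RPC program number for ypbind


def ypbind_registered(rpcinfo_output):
    """Return True if the `rpcinfo -p` listing contains a ypbind entry."""
    for line in rpcinfo_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Match on the program number (first column) or the name (last column).
        if fields[0] == YPBIND_PROGRAM or fields[-1] == "ypbind":
            return True
    return False


def query_portmapper():
    """Run `rpcinfo -p` and return its output ('' if it cannot be run)."""
    try:
        result = subprocess.run(
            ["rpcinfo", "-p"], capture_output=True, text=True, check=False
        )
    except OSError:
        return ""
    return result.stdout


def wait_for_ypbind(timeout=15.0, interval=1.0, query=query_portmapper):
    """Poll until ypbind is registered or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if ypbind_registered(query()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    print("ypbind registered" if wait_for_ypbind() else "gave up waiting")
```

Compared with a fixed sleep, the poll returns as soon as the service shows up and only waits the full 15 seconds when something is actually wrong.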

I’ll keep monitoring the system … not sure what I could have done to avoid the drive problem.

Oh, did I mention the tradition of disk problems never happening in singles … but usually in pairs? Yep, another drive started failing … this time it’s Ginny’s laptop drive. I’ll have to address that today.

btw: Happy new year 🙁


Update: I got Samba working … not 100% sure what was wrong, but the Gondor SMB configuration was working, so I copied its configuration over to Rivendell, changed the names, and it worked.

4 thoughts on “Another upgrade, another problem”

  1. david

    I’ve got Spinrite … but I don’t think that would have solved the problem. a) The drive was actually accessible once I installed it on a USB adapter, b) Spinrite is primarily for recovery of marginally damaged disks … it’s not a complete fix, and c) I’ve found that Spinrite isn’t as good as it’s touted to be … I’ve only been able to resurrect 1 drive out of the many I’ve had fail.

  2. Jon Angliss

    One of the reasons I switched off of Gentoo was because of upgrades… While it’s relatively easy to keep up to date, stability is at risk. I now use Debian, which makes it easy to upgrade to the next release.

    I like the new theme, very clean 🙂

  3. david

    “I now use Debian, which makes it easy to upgrade to the next release.”

    The thing is … my problems have rarely been caused by the upgrade itself (true, there have been a few, but usually because I screwed around with the config) … generally the problems I’ve had are unrelated to the actual upgrade.

    Last time my DSL went casters up … this time it was a drive. Any distro’s upgrade would have been bollixed up by those problems.
