I need to figure out how to set up a watchdog timer on my Linux systems.
While Ginny and I were out on a mini-vacation this weekend … Gondor, my main web server, went weird on me.
It would respond to pings … but none of the services would respond. I could telnet to a specific port and it would connect, but the server behind that port would never answer.
So we cut our vacation a little short and came home (Gondor is kind of important). When I looked at the system, it wasn’t locked up (I didn’t expect it to be, since it still responded to some things), but it wasn’t doing anything.
Nothing serious was showing up on syslog, but the system was still hung up.
I ran a bunch of diagnostics on the memory, the mainboard, and the hard drives, and none of them indicated any problems … so I’m pretty much at a loss. I should note that one of the hard drives is making an odd whining noise, which suggests a potential problem to me, even though the diagnostics came back clean.
Once I can figure out how to get a watchdog running, if the system goes weird on me again, it will at least reboot itself. Not a perfect solution, but workable until I can figure out what is going wrong.
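For what it's worth, here is a rough sketch of the kind of thing I have in mind, not something I have running yet. It assumes the kernel exposes a watchdog at /dev/watchdog (from a hardware driver, or the softdog module if the board has no timer of its own), and that checking port 80 on localhost is a fair stand-in for "the web server is alive". The host, port, interval, and function names are all my own placeholders. The idea is to keep writing to the watchdog device only while the server actually answers, not just while the port accepts a connection, since that was exactly the failure I saw.

```python
import socket
import time

WATCHDOG_DEVICE = "/dev/watchdog"  # standard Linux watchdog device node
CHECK_HOST = "127.0.0.1"           # assumption: the web server runs on this box
CHECK_PORT = 80                    # assumption: plain HTTP
CHECK_INTERVAL = 10                # seconds; must be shorter than the watchdog timeout


def service_responds(host, port, timeout=5.0):
    """Return True only if the service accepts a connection AND sends data back."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(b"HEAD / HTTP/1.0\r\n\r\n")
            # A hung server accepts the connection but never answers,
            # so recv() times out and we report failure.
            return len(sock.recv(1)) > 0
    except OSError:
        return False


def run(device=WATCHDOG_DEVICE):
    # Opening the device arms the timer. From here on the machine reboots
    # unless something is written to it before the timeout expires.
    with open(device, "wb", buffering=0) as wd:
        while service_responds(CHECK_HOST, CHECK_PORT):
            wd.write(b"\0")  # any write counts as a keepalive
            time.sleep(CHECK_INTERVAL)
        # Health check failed: stop writing, keep the device open,
        # and wait for the watchdog to reset the machine.
        while True:
            time.sleep(60)


if __name__ == "__main__":
    run()
```

This would have to run as root, and it is a blunt instrument: a single failed check means a reboot, so a real version would probably want a few retries first.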
[tags]Linux, hardware, diagnostics, Dell, watchdog[/tags]