I've spent about 15 hours over the last week setting up VMware Server *just right* so that I could experiment with clustering software with the minimum amount of work per reconfigure.
I've documented almost all of it, but I hadn't gotten 'round to backing up the VMware configs or filesystems yet.
My SSH session to the head node virtual machine died just now. Logging in to the host machine, I found out why:
clusterhead01:~> sudo tcsh
Password:
clusterhead01:~# /etc/init.d/vmware start
/etc/init.d/vmware: Input/output error.
clusterhead01:~# df -h
/bin/df: Input/output error.
clusterhead01:~# dmesg
/bin/dmesg: Input/output error.
clusterhead01:~#
So I figured I'd shut it down:
clusterhead01:~# halt -p
/sbin/halt: Input/output error.
clusterhead01:~#
Well, fortunately, as my co-workers can no doubt attest, I'm religious about updating RTs, so at least I've got my thought process as I reasoned out how to do the last few hours of work.
But I hate redoing work. With a passion. Looks like I'm going to get to. The other disk in this system died as I was installing it. This system is one of about half a dozen that were all purchased at the same time, and every single hard drive (Western Digital 160 GB) has been replaced at least once. I should have taken the hint.
Grrr.
And why, yes, it is 11 on a Friday night, and I'm doing this stuff. So what? I had a few beers earlier, and I got bored.

