How To Replace A Failed Drive



This article assumes that the drive at sdb has failed, you have contacted Rhino to confirm this, and they have shipped you the replacement drive.

Please be aware that not all RAID setups are configured the same. This article assumes Rhino's default of putting /boot in a RAID1 at md0, the root filesystem in a RAID1 at md1, and two swap partitions at sd[ab]2. You are strongly advised to confirm your setup by examining /proc/mdstat to determine which RAID (md) devices contain which partitions.
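
As a rough aid for that check, the sketch below lists each md array together with its member partitions. The helper name md_members is made up for this example, and the script assumes the usual /proc/mdstat layout, where members appear as name[index] on the array's line.

```shell
#!/bin/sh
# md_members is a hypothetical helper, not part of mdadm.
# It prints one line per array, e.g. "md0: sdb1 sda1".
# Reads /proc/mdstat by default; pass a file to inspect a saved copy.
md_members() {
    awk '/^md[0-9]+ :/ {
        printf "%s:", $1
        for (i = 3; i <= NF; i++) {
            # Member fields look like sdb1[1] or sdb1[1](F)
            if ($i ~ /\[[0-9]+\]/) {
                sub(/\[.*/, "", $i)
                printf " %s", $i
            }
        }
        printf "\n"
    }' "${1:-/proc/mdstat}"
}

# Only query the live system when md status is available.
if [ -r /proc/mdstat ]; then
    md_members
fi
```

If the output does not show sdb1 in md0 and sdb3 in md1, adjust the device names in the commands that follow to match your layout.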


The first step is to work out which physical drive has failed. Obtain its serial number so you can match it against the label on the drive itself:
hdparm -iI /dev/sdb
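
The hdparm output is long, so it can help to pull out just the serial number. This is a minimal sketch: serial_of is a hypothetical helper name, and it assumes the "Serial Number:" field label that hdparm -I prints.

```shell
#!/bin/sh
# serial_of is a hypothetical helper: it reads `hdparm -I` output on stdin
# and prints only the value of the "Serial Number:" field.
serial_of() {
    sed -n 's/^[[:space:]]*Serial Number:[[:space:]]*//p'
}

# Example (run as root; the device name is taken from this article):
#   hdparm -I /dev/sdb | serial_of
```

If the drive has failed so badly that it no longer responds, run the same command against /dev/sda instead and remove the drive whose serial number does not match.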

Next, mark the failed drive's partitions as failed and remove them from the RAID arrays:
mdadm --fail /dev/md0 /dev/sdb1
mdadm --fail /dev/md1 /dev/sdb3
mdadm --remove /dev/md0 /dev/sdb1
mdadm --remove /dev/md1 /dev/sdb3

Shut down the machine, remove the failed drive, install the replacement, and boot.

Now copy the partition table from the surviving drive (sda) to the new drive (sdb):
sfdisk -d /dev/sda | sfdisk /dev/sdb

Add the new partitions back into the RAID arrays. Each array starts rebuilding automatically, and progress is shown in /proc/mdstat:
mdadm --add /dev/md0 /dev/sdb1
mdadm --add /dev/md1 /dev/sdb3
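
The rebuild can take a while on large partitions. The following sketch polls /proc/mdstat until no array is degraded or resyncing. The names rebuild_pending and wait_for_rebuild are hypothetical, and the 60 second interval is an arbitrary choice.

```shell
#!/bin/sh
# rebuild_pending is a hypothetical helper. It succeeds (exit 0) while the
# mdstat text shows a missing member ("_" in a status such as [U_]) or an
# active recovery/resync line. Reads /proc/mdstat unless a file is given.
rebuild_pending() {
    grep -Eq '\[[U_]*_[U_]*\]|recovery|resync' "${1:-/proc/mdstat}"
}

# wait_for_rebuild is also hypothetical: it prints the status once a minute
# and returns when every array is healthy again.
wait_for_rebuild() {
    while rebuild_pending; do
        cat /proc/mdstat
        sleep 60
    done
    echo "All arrays are in sync."
}

# Example: call wait_for_rebuild after the mdadm --add commands.
```

A healthy two-disk RAID1 shows [UU] in /proc/mdstat; [U_] or [_U] means one member is still missing or rebuilding.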

For safety, install a bootloader on sdb so the machine can still boot if sda fails later:
grub-install /dev/sdb

Finally, make sure the swap space is usable:
mkswap /dev/sdb2
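
mkswap only formats the partition; it does not enable it. The sketch below checks whether a device is currently active as swap. swap_active is a hypothetical helper name, and it assumes the standard /proc/swaps layout of one header line followed by one line per swap area.

```shell
#!/bin/sh
# swap_active is a hypothetical helper.
#   $1 = device to look for, e.g. /dev/sdb2
#   $2 = swaps table to read (defaults to /proc/swaps)
# Exits 0 if the device is listed as an active swap area, 1 otherwise.
swap_active() {
    awk -v dev="$1" 'NR > 1 && $1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/swaps}"
}

# Example (run as root):
#   swapon /dev/sdb2
#   swap_active /dev/sdb2 && echo "swap on sdb2 is active"
```

Note that mkswap writes a new UUID to the partition. If /etc/fstab refers to the swap partition by UUID rather than by device name, update that entry so the swap is enabled on the next boot.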

Prevention Tips

Drives fail all the time, but good practice will minimize the likelihood of premature drive failure.

Applies To

failed hard drives


hard drive, HDD, failure, raid, mdadm, mdstat, raid1, harddrive
