Deploying Suse Linux Enterprise Server
| Purchase "Deploying Suse Linux Enterprise Server" at Lulu.com |
Installing Suse Linux Enterprise Server
Maintaining Software RAID Array
Once you implement a software RAID array, it pretty much takes care of itself as long as the drives in the array stay functional. The burden then becomes threefold: knowing when a drive has stopped working, replacing a drive that no longer works, and planning ahead for a drive failure. Just remember that the following procedures only work with arrays that have data redundancy; they have no effect on RAID Level 0 arrays.
Monitoring RAID Arrays
First, let us look at how to query the status of a software RAID array. There are a few ways to do this. For instance, since GNU/Linux treats everything as a file, you can simply "cat /proc/mdstat" to see the status of the RAID arrays on the system. Also included with Suse Linux Enterprise Server (as well as most GNU/Linux distributions) is the "mdadm" utility, which lets you query and maintain your software RAID arrays. For instance, for the image shown in this section, I ran a detailed query of a software RAID array using:
mdadm --detail /dev/md1
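As a complement to the two commands above, here is a minimal sketch that scans "/proc/mdstat" and prints the name of every array that is missing a member. It relies on the status brackets the kernel prints for each array, such as "[UU]" for a healthy two-disk mirror and "[U_]" when one member is gone. The function name "degraded_arrays" is my own invention for illustration, not part of mdadm.

```shell
#!/bin/sh
# List md arrays whose status brackets contain "_" (a missing member).
# Reads the file given as $1, defaulting to /proc/mdstat.
# "degraded_arrays" is a hypothetical helper name, not an mdadm feature.
degraded_arrays() {
    awk '
        /^md[^ ]* :/      { name = $1 }                      # "md1 : active raid1 ..."
        /\[[U_]*_[U_]*\]/ { if (name != "") print name }     # "[U_]" means degraded
    ' "${1:-/proc/mdstat}"
}
```

Calling "degraded_arrays" with no argument reads the live "/proc/mdstat"; an empty result means no array reported a missing member.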
Now, of course, you do not want to have to remember to look at all of your software RAID arrays periodically to ensure they are operating without a drive failure. The mdadm program can also be run as a daemon that sends you alerts (usually by email) whenever the status of a software RAID array changes.
To enable this service, edit the "/etc/sysconfig/mdadm" file and make sure the email address is correct (you can use the Yast "/etc/sysconfig" editor module to accomplish this). If your server is not a full-fledged email server, also configure the "Mail Transfer Agent" Yast module so that the alert emails are actually delivered. Once the file is edited and the system can mail alerts correctly, start the service by running:
/etc/init.d/mdadmd start
and to ensure it runs when the computer reboots, issue the command:
chkconfig mdadmd on
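Before trusting the daemon, it is worth confirming that an alert really reaches your inbox. The sketch below wraps mdadm's monitor mode in a small function that sends one test message per array and then exits. The function name "send_test_alert" and the default address "root" are placeholders of mine; substitute the address you configured.

```shell
#!/bin/sh
# Send one test alert for every array mdadm can find, then exit.
# $1 is the address to mail; "root" is only a placeholder default.
# "send_test_alert" is a hypothetical helper name.
send_test_alert() {
    # --oneshot checks the arrays once instead of running forever;
    # --test generates a TestMessage alert for each array found.
    mdadm --monitor --scan --oneshot --test --mail="${1:-root}"
}
```

If no message arrives after running "send_test_alert you@example.com" as root, the mail configuration is the first thing to check.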
[Figure: Viewing the status of a software RAID array]
Replacing a Malfunctioning Drive
If you do have a hard drive failure, the array will enter a "degraded" state. This means that you no longer have "data redundancy", and if another drive fails you will lose the data stored within the array. To prevent this from happening, you must replace the failed drive and add the new drive to the array.
Before a new drive can be added to the array, you must partition it with "Linux RAID" partitions. This is the same procedure used when the software RAID array was first configured. Each partition must be at least as large as the corresponding partition on the failed drive.
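If a healthy member of the array is still installed, one shortcut is to copy its partition layout onto the replacement rather than partitioning by hand. This is a different technique from the partitioning procedure described above, sketched here under several assumptions: both disks use a classic MBR partition table, the new disk is at least as large as the old one, and the device names you pass in are correct. The function name "clone_partition_table" is hypothetical.

```shell
#!/bin/sh
# Copy the partition layout of a healthy member disk onto the replacement.
# WARNING: this overwrites the partition table on the target disk.
# "clone_partition_table" is a hypothetical helper name.
clone_partition_table() {
    src=$1
    dst=$2
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: clone_partition_table SOURCE_DISK TARGET_DISK" >&2
        return 1
    fi
    # sfdisk -d dumps the layout as text that sfdisk can read back on stdin.
    sfdisk -d "$src" | sfdisk "$dst"
}
```

For example, "clone_partition_table /dev/sda /dev/sdb" would copy the layout of sda to sdb, including the "Linux RAID" partition types. Double-check which disk is which before running it, because swapping the arguments destroys the surviving drive's partition table.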
Once you have the new drive partitioned, you then simply need to add the "Linux RAID" partitions into the appropriate software RAID array using the mdadm utility. For instance:
mdadm -a /dev/md2 /dev/sdb3
This command will add the "/dev/sdb3" partition into the "/dev/md2" RAID array. Once the partition is added, the array will start its recovery procedure. How long this takes depends on the hardware involved with the array. Some individuals complain that recovery takes far too long (days, by some accounts). If you are careful in choosing the right hardware for the software RAID array, this delay can be mitigated. For instance, during testing I was using 36GB Western Digital Raptors with a SATA interface, and the average recovery time from a failure was about 18 minutes (or 500MB per minute), which is not too shabby.
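While the array rebuilds, the kernel reports progress in "/proc/mdstat" on a line containing "recovery = NN.N%". The following sketch pulls out just the array name and the percentage so you can check on a rebuild at a glance. The function name "recovery_progress" is hypothetical, and the parsing assumes the usual mdstat layout.

```shell
#!/bin/sh
# Print "<array> <percent>" for every array that is rebuilding or resyncing.
# Reads the file given as $1, defaulting to /proc/mdstat.
# "recovery_progress" is a hypothetical helper name.
recovery_progress() {
    awk '
        /^md[^ ]* :/ { name = $1 }
        {
            # Look for the three fields "recovery", "=", "12.6%" in sequence.
            for (i = 1; i + 2 <= NF; i++)
                if (($i == "recovery" || $i == "resync") && $(i + 1) == "=")
                    print name, $(i + 2)
        }
    ' "${1:-/proc/mdstat}"
}
```

No output means that no array is currently rebuilding.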
If you want to test your hardware, or if you absolutely need to remove a drive that you believe may become faulty in the near future, you can force the RAID array to mark it as failed. To do this simply issue the following command:
mdadm --manage --set-faulty /dev/md0 /dev/sdb1
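Marking a partition as failed does not detach it from the array; if you intend to pull the drive, the partition also has to be removed. A minimal sketch of the two steps together follows. The function name "retire_member" is hypothetical, while "--fail" and "--remove" are mdadm's own manage-mode options.

```shell
#!/bin/sh
# Mark a member as failed, then detach it from the array.
# $1 = array (for example /dev/md0), $2 = member (for example /dev/sdb1).
# "retire_member" is a hypothetical helper name.
retire_member() {
    # Only attempt the removal if marking the member as failed succeeded.
    mdadm --manage "$1" --fail "$2" &&
    mdadm --manage "$1" --remove "$2"
}
```

Running "retire_member /dev/md0 /dev/sdb1" leaves the array degraded until a replacement partition is added, so only do this when a spare or a new drive is at hand.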
Adding a "Hot Spare" Drive
If the data residing on the RAID array is extremely important, or if you are installing a server that will be a long distance away, you can add what is called a "Hot Spare" to the array. What a hot spare will do is nothing... until a failure occurs. If you have a hot spare within your array and a failure occurs, the RAID array will immediately start the recovery procedure using the hot spare. This can be a lifesaver if your server is 200 miles away or if a replacement drive is not readily available.
The procedure for adding a hot spare to the array is basically the same procedure used to replace the drive (without removing an existing drive of course). First you will want to partition the drive with "Linux RAID" partitions, then you will want to add the partitions to the appropriate array. If the array is not in a "degraded" state when you add the partition to it, the new partition will be added as a hot spare.
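To confirm that the new partition really was taken on as a hot spare, you can look at the "Spare Devices" line in the output of "mdadm --detail". The sketch below extracts that number from text fed to it on standard input. The function name "spare_count" is hypothetical.

```shell
#!/bin/sh
# Print the "Spare Devices" count from "mdadm --detail" output on stdin.
# "spare_count" is a hypothetical helper name.
spare_count() {
    # Split each line on the colon (plus surrounding spaces) and print the value.
    awk -F ' *: *' '/Spare Devices/ { print $2 }'
}
```

For example, "mdadm --detail /dev/md2 | spare_count" should print 1 after a single hot spare has been added to a healthy array.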


