Tuesday, August 7, 2012

How To Configure RAID 5 in Linux


Backup Your System First

Software RAID creates the equivalent of a single RAID virtual disk drive made up of all the underlying regular partitions used to create it. You have to format this new RAID device before your Linux system can store files on it. Formatting, however, causes all the old data on the underlying RAID partitions to be lost. It is best to back up the data on these, and on any other partitions on the disk drives on which you want to implement RAID. A mistake could unintentionally corrupt valid data.

Configure RAID In Single User Mode

As you will be modifying the disk structure of your system, you should also consider configuring RAID while your system is running in single-user mode from the VGA console. This ensures that most applications and networking are shut down and that no other users can access the system, reducing the risk of data corruption during the exercise.

[root@linuxhowto.in tmp]# init 1

Once finished, issue the exit command, and your system will return to the default runlevel defined in the /etc/inittab file.
Debian / Ubuntu Differences
This chapter focuses on Fedora / CentOS / RedHat for simplicity of explanation. Whenever there is a difference in the required commands for Debian / Ubuntu variations of Linux it will be noted.

The universal difference is that the commands shown are run as the Fedora / CentOS / RedHat root user. With Debian / Ubuntu you will either have to become root using the "sudo su -" command, or you can temporarily raise your privilege level to root by prefixing a single command with "sudo".

Here is an example of how to permanently become root:

user@ubuntu:~$ sudo su -
[sudo] password for pankaj:
root@ubuntu:~#

Here is an example of how to temporarily become root to run a specific command. The first attempt to get a directory listing fails due to insufficient privileges. The second attempt succeeds when the sudo keyword is inserted before the command.

user@ubuntu:~$  ls -l /var/lib/mysql/mysql
ls: cannot access /var/lib/mysql/mysql: Permission denied
user@ubuntu:~$ sudo ls -l /var/lib/mysql/mysql
[sudo] password for pankaj:
total 964
-rw-rw---- 1 mysql mysql   8820 2010-12-19 23:09 columns_priv.frm
-rw-rw---- 1 mysql mysql      0 2010-12-19 23:09 columns_priv.MYD
-rw-rw---- 1 mysql mysql   4096 2010-12-19 23:09 columns_priv.MYI
-rw-rw---- 1 mysql mysql   9582 2010-12-19 23:09 db.frm
...
...
...
user@ubuntu:~$

Now that you have got this straight, let’s continue with the discussion.

Configuring Software RAID

Configuring RAID using Fedora Linux requires a number of steps that need to be followed carefully. In the tutorial example, you'll be configuring RAID 5 using a system with three pre-partitioned hard disks. The partitions to be used are:

/dev/hde1
/dev/hdf2
/dev/hdg1

Be sure to adapt the various stages outlined below to your particular environment.

RAID Partitioning

You first need to identify the partitions to use, each on a separate disk: at least two for RAID 0 or RAID 1, and at least three for RAID 5, as in this scenario. The partitions should be of approximately the same size, because RAID limits the extent of data access on each partition to an area no larger than that of the smallest partition in the RAID set.
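As a quick capacity check: a RAID 5 set of n members stores the equivalent of (n - 1) members' worth of data, with the smallest member setting the per-member limit. The sketch below uses the member count from this tutorial and the per-member size that mdadm reports later in this example (48064K); both numbers are purely illustrative.

```shell
# RAID 5 usable capacity = (members - 1) x size of the smallest member.
members=3            # /dev/hde1, /dev/hdf2, /dev/hdg1 in this example
smallest_kb=48064    # per-member size mdadm settles on later in this example
usable_kb=$(( (members - 1) * smallest_kb ))
echo "usable capacity: ${usable_kb}K"   # prints "usable capacity: 96128K"
```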

Determining Available Partitions

First use the fdisk -l command to view all the mounted and unmounted filesystems available on your system. You may then also want to use the df -k command, which shows only mounted filesystems but has the big advantage of giving you the mount points too.

These two commands should help you to easily identify the partitions you want to use. Here is some sample output of these commands.

[root@linuxhowto.in tmp]# fdisk -l

Disk /dev/hda: 12.0 GB, 12072517632 bytes
255 heads, 63 sectors/track, 1467 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        13    104391   83  Linux
/dev/hda2            14       144   1052257+  83  Linux
/dev/hda3           145       209    522112+  82  Linux swap
/dev/hda4           210      1467  10104885    5  Extended
/dev/hda5           210       655   3582463+  83  Linux
...
...
/dev/hda15         1455      1467    104391   83  Linux
[root@linuxhowto.in tmp]#

[root@linuxhowto.in tmp]# df -k
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/hda2              1035692    163916    819164  17% /
/dev/hda1               101086      8357     87510   9% /boot
/dev/hda15              101086      4127     91740   5% /data1
...
...
...
/dev/hda7              5336664    464228   4601344  10% /var
[root@linuxhowto.in tmp]#

Unmount the Partitions

You don't want anyone else accessing these partitions while you are creating the RAID set, so you need to make sure they are unmounted.
[root@linuxhowto.in tmp]# umount /dev/hde1
[root@linuxhowto.in tmp]# umount /dev/hdf2
[root@linuxhowto.in tmp]# umount /dev/hdg1
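Before moving on, it is worth confirming that nothing in the set is still mounted. Here is a minimal sketch using this tutorial's example device names; adapt them to your system.

```shell
# Check each member partition against /proc/mounts, which lists one active
# mount per line with the device name in the first column.
still_mounted=""
for part in /dev/hde1 /dev/hdf2 /dev/hdg1; do
    if grep -q "^$part " /proc/mounts 2>/dev/null; then
        still_mounted="$still_mounted $part"
    fi
done

if [ -z "$still_mounted" ]; then
    echo "all member partitions are unmounted"
else
    echo "still mounted:$still_mounted"
fi
```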

Prepare The Partitions With FDISK

You have to change each partition in the RAID set to be of type fd (Linux raid autodetect), and you can do this with fdisk. Here is an example using /dev/hde1.

[root@linuxhowto.in tmp]# fdisk /dev/hde
The number of cylinders for this disk is set to 8355.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
  (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help):

Use FDISK Help

Now use the fdisk m command to get some help:
Command (m for help): m
  ...
  ...
  p   print the partition table
  q   quit without saving changes
  s   create a new empty Sun disklabel
  t   change a partition's system id
  ...
  ...
Command (m for help):

Set The ID Type

Partition /dev/hde1 is the first partition on disk /dev/hde. Modify its type using the t command, specifying the partition number and type code. You can also use the L command to get a full listing of ID types in case you forget. In this case the RAID type is fd; it may differ for your version of Linux.

Command (m for help): t
Partition number (1-5): 1
Hex code (type L to list codes): L


...
...
...
16  Hidden FAT16    61   SpeedStor       f2  DOS secondary
17  Hidden HPFS/NTF 63  GNU HURD or Sys fd  Linux raid auto
18  AST SmartSleep  64  Novell Netware  fe  LANstep
1b  Hidden Win95 FA 65  Novell Netware  ff  BBT
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

Command (m for help):

Make Sure The Change Occurred

Use the p command to get the new proposed partition table:
Command (m for help): p

Disk /dev/hde: 4311 MB, 4311982080 bytes
16 heads, 63 sectors/track, 8355 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hde1             1      4088   2060320+  fd  Linux raid autodetect
/dev/hde2          4089      5713    819000   83  Linux
/dev/hde4          6608      8355    880992    5  Extended
/dev/hde5          6608      7500    450040+  83  Linux
/dev/hde6          7501      8355    430888+  83  Linux

Command (m for help):

Save The Changes

Use the w command to permanently save the changes to disk /dev/hde:
Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.

[root@linuxhowto.in tmp]#

The error above will occur if any of the other partitions on the disk is mounted.
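Once the change has been written (and after a reboot, if the kernel kept the old table), you can verify the partition types without re-entering fdisk by filtering its listing output. To keep the sketch self-contained, the listing below is embedded sample text from this tutorial; on a real system, pipe the output of fdisk -l through the same awk filter.

```shell
# Two sample lines of `fdisk -l` output. Column 5 is the Id type here
# (lines carrying a boot flag "*" have one extra column).
listing='/dev/hde1             1      4088   2060320+  fd  Linux raid autodetect
/dev/hde2          4089      5713    819000   83  Linux'

# Print the device name of every partition whose Id is fd.
echo "$listing" | awk '$5 == "fd" { print $1 }'   # prints "/dev/hde1"
```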

Repeat For The Other Partitions

For the sake of brevity, I won't show the process for the other partitions. It's enough to know that the steps for changing the IDs for /dev/hdf2 and /dev/hdg1 are very similar.

Preparing the RAID Set

Now that the partitions have been prepared, we have to merge them into a new RAID partition that we'll then have to format and mount. Here's how it's done.

Create the RAID Set

You use the mdadm command with the --create option to create the RAID set. In this example we create the raid device /dev/md0 using the --level option to specify RAID 5, and the --raid-devices option to define the number of partitions to use.

[root@linuxhowto.in tmp]# mdadm --create --verbose /dev/md0 --level=5 \
   --raid-devices=3 /dev/hde1 /dev/hdf2 /dev/hdg1

mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 64K
mdadm: /dev/hde1 appears to contain an ext2fs file system
    size=48160K  mtime=Sat Jan 27 23:11:39 2007
mdadm: /dev/hdf2 appears to contain an ext2fs file system
    size=48160K  mtime=Sat Jan 27 23:11:39 2007
mdadm: /dev/hdg1 appears to contain an ext2fs file system
    size=48160K  mtime=Sat Jan 27 23:11:39 2007
mdadm: size set to 48064K
Continue creating array? y
mdadm: array /dev/md0 started.
[root@linuxhowto.in tmp]#

Confirm RAID Is Correctly Initialized

The /proc/mdstat file provides the current status of all RAID devices. Confirm that the initialization is finished by inspecting the file and making sure that there are no initialization-related messages. If there are, wait until initialization completes; you can monitor progress by re-reading the file, for example with the watch cat /proc/mdstat command.

[root@linuxhowto.in tmp]# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 hdg1[2] hde1[1] hdf2[0]
      4120448 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]

unused devices: <none>
[root@linuxhowto.in tmp]#

Notice that the new RAID device is called /dev/md0. This information will be required for the next step.
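If you want to check the status from a script rather than by eye, the [UUU] marker is the key: each U is an Up member, and an underscore (for example [UU_]) marks a failed or missing one. Here is a small sketch, run against an embedded copy of this tutorial's mdstat lines so it is self-contained; on a live system, read /proc/mdstat itself.

```shell
# Embedded sample of the md0 status lines from /proc/mdstat.
mdstat='md0 : active raid5 hdg1[2] hde1[1] hdf2[0]
      4120448 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]'

# "[UUU]" means all three member devices are up.
if echo "$mdstat" | grep -q '\[UUU\]'; then
    md_status="healthy"
else
    md_status="degraded or rebuilding"
fi
echo "md0 is $md_status"   # prints "md0 is healthy"
```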

Format The New RAID Set

Your new RAID partition now has to be formatted. The mkfs.ext4 command is used to do this.

[root@linuxhowto.in tmp]# mkfs.ext4 /dev/md0
mke2fs 1.39 (29-May-2006)
Filesystem label=
OS type: Linux
Block size=1024 (log=0)
Fragment size=1024 (log=0)
36144 inodes, 144192 blocks
7209 blocks (5.00%) reserved for the super user
First data block=1
Maximum filesystem blocks=67371008
18 block groups
8192 blocks per group, 8192 fragments per group
2008 inodes per group
Superblock backups stored on blocks:
        8193, 24577, 40961, 57345, 73729

Writing inode tables: done                           
Creating journal (4096 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 33 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
[root@linuxhowto.in tmp]#

Note: The ext4 filesystem type may not be supported by your version of Linux. In that case, use the mkfs.ext3 command to format the filesystem as ext3.

Create the mdadm.conf Configuration File

Your system doesn't automatically remember all the component partitions of your RAID set. This information has to be kept in the mdadm.conf file located in either the /etc or /etc/mdadm directory. The formatting can be tricky, but fortunately the output of the mdadm --detail --scan command provides you with it. Here we see the output sent to the screen.

[root@linuxhowto.in tmp]# mdadm --detail --scan
ARRAY /dev/md0 metadata=1.2 name=linuxhowto.in:0 UUID=77b695c4:32e5dd46:63dd7d16:17696e09
[root@linuxhowto.in tmp]#

All three partitions were given the UUID label 77b695c4:32e5dd46:63dd7d16:17696e09 when the mdadm command created device /dev/md0. The mdadm.conf file makes sure this mapping is remembered when you reboot.

Here we export the screen output to create the configuration file.

[root@linuxhowto.in tmp]# mdadm --detail --scan > /etc/mdadm.conf

Note: With Debian / Ubuntu systems the configuration file is /etc/mdadm/mdadm.conf, so the command would be:

[root@linuxhowto.in tmp]# mdadm --detail --scan > /etc/mdadm/mdadm.conf
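For reference, the finished file is short. A minimal version might look like the fragment below; the ARRAY line is exactly what the scan printed above, while the DEVICE line is an addition shown here because it is commonly included to tell mdadm to scan all partitions for members.

```
# /etc/mdadm.conf (or /etc/mdadm/mdadm.conf on Debian / Ubuntu)
DEVICE partitions
ARRAY /dev/md0 metadata=1.2 name=linuxhowto.in:0 UUID=77b695c4:32e5dd46:63dd7d16:17696e09
```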

Create A Mount Point For The RAID Set

The next step is to create a mount point for /dev/md0. In this case we'll create one called /mnt/raid.

[root@linuxhowto.in mnt]# mkdir /mnt/raid

Edit The /etc/fstab File

The /etc/fstab file lists all the partitions that need to be mounted when the system boots. Add an entry for the RAID set, the /dev/md0 device.

/dev/md0      /mnt/raid     ext4    defaults    1 2

Note: If you are using an ext3 filesystem, you must replace the ext4 keyword with ext3.

Note: Do not use UUID labels in the /etc/fstab file for RAID devices; use the real device name, such as /dev/md0, because the mount command doesn't recognize RAID UUID labels. In older Linux versions, the /etc/rc.d/rc.sysinit script checked the /etc/fstab file for device entries that matched RAID set names listed in the now unused /etc/raidtab configuration file. If it didn't find a match, the script would not automatically start the RAID driver for that set, and device mounting would then occur later in the boot process. Mounting a RAID device whose driver is not loaded can corrupt your data and produce this error:

Starting up RAID devices: md0(skipped)
Checking filesystems
/raiddata: Superblock has a bad ext3 journal(inode8)
CLEARED.
***journal has been deleted - file system is now ext 2 only***

/raiddata: The filesystem size (according to the superblock) is 2688072 blocks.
The physical size of the device is 8960245 blocks.
Either the superblock or the partition table is likely to be corrupt!
/boot: clean, 41/26104 files, 12755/104391 blocks

/raiddata: UNEXPECTED INCONSISTENCY; Run fsck manually (ie without -a or -p options).

If you are not familiar with the /etc/fstab file use the man fstab command to get a comprehensive explanation of each data column it contains.

The /dev/hde1, /dev/hdf2, and /dev/hdg1 partitions were replaced by the combined /dev/md0 partition. You therefore don't want the old partitions to be mounted again. Make sure that all references to them in this file are commented with a # at the beginning of the line or deleted entirely.

#/dev/hde1       /data1        ext3    defaults        1 2
#/dev/hdf2       /data2        ext3    defaults        1 2
#/dev/hdg1       /data3        ext3    defaults        1 2
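If you prefer to script the edit, sed can comment the old entries out. The sketch below works on a sample copy under /tmp, so it is harmless to try; on the real system, back up /etc/fstab first and point sed at it instead.

```shell
# Write a sample fstab fragment to work on (stand-in for /etc/fstab).
cat > /tmp/fstab.sample <<'EOF'
/dev/hde1       /data1        ext3    defaults        1 2
/dev/hdf2       /data2        ext3    defaults        1 2
/dev/hdg1       /data3        ext3    defaults        1 2
/dev/md0        /mnt/raid     ext4    defaults        1 2
EOF

# Prefix each old member's line with "#"; "&" reinserts the matched text.
sed -i -e 's|^/dev/hde1|#&|' -e 's|^/dev/hdf2|#&|' -e 's|^/dev/hdg1|#&|' /tmp/fstab.sample

grep -c '^#' /tmp/fstab.sample   # prints 3: all three old entries commented
```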

Mount The New RAID Set

Use the mount command to mount the RAID set. You have your choice of methods:
  • The mount command's -a flag causes Linux to mount all the devices in the /etc/fstab file that have automounting enabled (default) and that are also not already mounted.
[root@linuxhowto.in tmp]# mount -a
  • You can also mount the device manually.
[root@linuxhowto.in tmp]# mount /dev/md0 /mnt/raid
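Either way, here is a quick scripted check that the mount actually happened; the mount point is this tutorial's /mnt/raid, so adjust it for your setup.

```shell
# /proc/mounts lists active mounts; the mount point is the second field,
# so match it with surrounding spaces to avoid partial-path hits.
if grep -q ' /mnt/raid ' /proc/mounts; then
    raid_mounted="yes"
else
    raid_mounted="no"
fi
echo "/mnt/raid mounted: $raid_mounted"
```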

Check The Status Of The New RAID

The /proc/mdstat file provides the current status of all the devices.

[root@linuxhowto.in tmp]# cat /proc/mdstat
Personalities : [raid5]
read_ahead 1024 sectors
md0 : active raid5 hdg1[2] hde1[1] hdf2[0]
      4120448 blocks level 5, 32k chunk, algorithm 3 [3/3] [UUU]

unused devices: <none>
[root@linuxhowto.in tmp]#

Conclusion
Linux software RAID provides redundancy across partitions and hard disks, but it tends to be slower and less reliable than RAID provided by a hardware-based RAID disk controller.
Hardware RAID configuration is usually done via the system BIOS when the server boots up, and once configured, it is completely transparent to Linux. Unlike software RAID, hardware RAID requires entire disks to be dedicated to the purpose; combined with the fact that it usually requires faster SCSI hard disks and an additional controller card, this tends to make it expensive.

