Configure RAID on Ubuntu

RAID (originally redundant array of inexpensive disks; now commonly redundant array of independent disks) is a data storage virtualization technology that combines multiple disk drive components into a logical unit for the purposes of data redundancy or performance improvement.

Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the specific level of redundancy and performance required. The different schemes or architectures are named by the word RAID followed by a number (e.g. RAID 0, RAID 1). Each scheme provides a different balance between the key goals: reliability, availability, performance, and capacity. RAID levels greater than RAID 0 provide protection against unrecoverable (sector) read errors, as well as whole disk failure.

RAID 5 consists of block-level striping with distributed parity. Unlike in RAID 4, parity information is distributed among the drives. It requires that all drives but one be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data is lost. RAID 5 requires at least three disks.

In comparison to RAID 4, RAID 5’s distributed parity evens out the stress of a dedicated parity disk among all RAID members. Additionally, read performance is increased since all RAID members participate in serving of the read requests.

1. I am performing these examples in VirtualBox, so the hard drive sizes will be much smaller than what you’ll have in reality, but this serves as a good demonstration of how to perform the steps.

2. RAID-5 requires a minimum of 3 drives, and all should be the same size. It allows one drive to fail without any data loss. Here’s a quick way to calculate how much usable space you’ll end up with:

Usable space = (number of drives - 1) * size of smallest drive
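
For example, four 2TB drives in a RAID-5 array give (4 - 1) * 2TB = 6TB of usable space.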

3. Install the required software

apt-get update && apt-get upgrade -y
apt-get install mdadm ssh parted gdisk

4. Initial Setup

Before we jump into creating the actual RAID array, I suggest you put a partition on each drive that you plan to use in the array. This is not a requirement with mdadm, but I like to have the disks show up in fdisk as partitioned. In the past I would have shown you how to create the partitions with fdisk, but the MBR partition tables that fdisk creates don’t support partitions larger than 2TB, which rules out many modern hard drives.

Instead, I’ll show you how to create the partitions with parted using GPT labels. But first, let’s view a list of our available hard drives and partitions.

fdisk -l

This will output, for each drive you have, something along the lines of:

root@test:~# fdisk -l

Disk /dev/sda: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00030001

Device Boot Start End Blocks Id System
/dev/sda1 * 1 996 7993344 83 Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2 996 1045 392193 5 Extended
/dev/sda5 996 1045 392192 82 Linux swap / Solaris

Disk /dev/sdb: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn’t contain a valid partition table

Disk /dev/sdc: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdc doesn’t contain a valid partition table

Disk /dev/sdd: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdd doesn’t contain a valid partition table

Disk /dev/sde: 1073 MB, 1073741824 bytes
255 heads, 63 sectors/track, 130 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sde doesn’t contain a valid partition table

This shows four 1GB drives (/dev/sdb through /dev/sde) with no partition table, plus /dev/sda, which is where the operating system is installed. The last four drives should be safe to use in our array. Next, let’s partition those four disks.

parted -a optimal /dev/sdb
GNU Parted 2.3
Using /dev/sdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) mkpart primary 1 -1
(parted) align-check
alignment type(min/opt) [optimal]/minimal? optimal
Partition number? 1
1 aligned
(parted) quit

When started with the -a optimal switch shown above, parted aligns every partition it creates to the drive’s reported optimal I/O boundary (typically a multiple of 1MiB), so the partition ends up properly aligned, as the align-check command confirms above.
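
If you prefer to script this step rather than use the interactive prompt, parted can also take the same commands on the command line. A minimal sketch, assuming percentage bounds are acceptable for the start and end of the partition:

parted -s -a optimal /dev/sdb mklabel gpt mkpart primary 0% 100%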

Since we are using GPT, we can use sgdisk to clone the partition table from /dev/sdb to the other three drives. This has the added benefit of leaving you with a backup of your partition table.

sgdisk --backup=table /dev/sdb
sgdisk --load-backup=table /dev/sdc
sgdisk --load-backup=table /dev/sdd
sgdisk --load-backup=table /dev/sde
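
Note that cloning a GPT this way also copies the disk and partition GUIDs, which are normally unique per disk. mdadm itself doesn’t care about them, but if you’d like each disk to have its own GUIDs again, sgdisk can randomize them afterwards, for example:

sgdisk -G /dev/sdc
sgdisk -G /dev/sdd
sgdisk -G /dev/sde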

5. Creating the array

Now that our disks are partitioned correctly, it’s time to start building an mdadm RAID5 array. To create the array, we use the mdadm create flag. We also need to specify what RAID level we want, as well as how many devices and what they are. The following command will use 3 of our newly partitioned disks.

mdadm --create --verbose /dev/md0 --level=5 --raid-devices=3 /dev/sd[bcd]1

The verbose flag tells mdadm to output extra information. In the above command I am creating a RAID-5 array at /dev/md0 using 3 partitions. The number of partitions you are using and their names may be different, so do not just copy and paste the command above without verifying your setup first. Note that a partition name looks like /dev/sdb1, whereas the drive name is /dev/sdb; the 1 refers to the partition number on the disk.

If you want to build a RAID-6 array instead, it’s just as easy. For this example, I’ll throw in a few extra example drives to make the array bigger.

mdadm --create --verbose /dev/md0 --level=6 --raid-devices=8 /dev/sd[bcdefghi]1
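
Keep in mind that RAID 6 uses two drives’ worth of parity, so usable space becomes (number of drives - 2) * size of smallest drive; with the eight equal drives above you get the capacity of six of them.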

While the array is being built you can view its status in the file /proc/mdstat. Here the watch command comes in handy:

watch cat /proc/mdstat

This will output the contents of the file to the screen, refreshing every 2 seconds (by default). While the array is being built it will show how much of the “recovery” has been done, and an estimated time remaining. This is what it looks like when it’s building.

Every 2.0s: cat /proc/mdstat Tue Dec 31 16:48:47 2013

Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[0]
7813770240 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
[=======>.............] recovery = 35.0% (1368089744/3906885120) finish=346.5min speed=122077K/sec

unused devices: <none>

When it has finished syncing, it should look like this. The whole process can take many hours, depending on how big the array you’re assembling is.

Every 2.0s: cat /proc/mdstat Tue Nov 15 13:02:37 2011

Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[0]
2092032 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

Now that we have set up the array, we need to edit the mdadm configuration file so that it knows how to assemble the array when the system reboots.

echo "DEVICE partitions" > /etc/mdadm/mdadm.conf
echo "HOMEHOST fileserver" >> /etc/mdadm/mdadm.conf
echo "MAILADDR youruser@gmail.com" >> /etc/mdadm/mdadm.conf
mdadm --detail --scan | cut -d " " -f 4 --complement >> /etc/mdadm/mdadm.conf
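
The last command appends an ARRAY line describing the array (the cut invocation simply strips the name= field, which isn’t needed). With the example array from this guide it would look something like this:

ARRAY /dev/md0 metadata=1.2 UUID=2e46af23:bca95854:eb8f8d7c:3fb727ef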

Next, update the initramfs, so that the OS has access to your new mdadm array at boot time.

update-initramfs -u

You can view your new array like this.

mdadm --detail /dev/md0

It should look something like this.

/dev/md0:
Version : 1.2
Creation Time : Tue Nov 15 13:01:40 2011
Raid Level : raid5
Array Size : 2092032 (2043.34 MiB 2142.24 MB)
Used Dev Size : 1046016 (1021.67 MiB 1071.12 MB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent

Update Time : Tue Nov 15 13:57:25 2011
State : clean
Active Devices : 3
Working Devices : 3
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Name : test:0
UUID : 2e46af23:bca95854:eb8f8d7c:3fb727ef
Events : 34

Number Major Minor RaidDevice State
0 8 17 0 active sync /dev/sdb1
1 8 33 1 active sync /dev/sdc1
3 8 49 2 active sync /dev/sdd1

You can see that the chunk size is 512K and the metadata version is 1.2.

Verify that email alerts are working like this. You’ll need to set up a mail server (or relay) on the system first. This ensures you’ll be notified if something goes wrong with the array.

mdadm --monitor -m youruser@gmail.com /dev/md0 -t
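
On Ubuntu the mdadm package will typically run a monitoring daemon for you once MAILADDR is set in mdadm.conf, but you can also start one by hand. A sketch, assuming the same address and array as above:

mdadm --monitor --daemonise -m youruser@gmail.com /dev/md0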

6. Creating and mounting the filesystem

Now that the array is built, we need to format it. Which filesystem you choose is up to you, but I recommend ext4, unless your array will be bigger than 16TB, as that’s currently the maximum size supported by stock e2fsprogs (see the note below). I’ll show you the quick way to put a filesystem on the array first; if you want to optimize filesystem performance, follow the optimized example after it.

Unoptimized

mkfs.ext4 /dev/md0

This will take a while, especially if your array is large. If you want to optimize filesystem performance on top of mdadm, you’ll need to do a little math (or use one of the online RAID stride calculators). Here’s how you do it.

Optimized
1. chunk size = 512KB (see the chunk size reported above)
2. block size = 4KB (recommended for large files, and most of the time)
3. stride = chunk / block; in this example 512KB / 4KB = 128
4. stripe-width = stride * ((number of disks in RAID 5) - 1); in this example 128 * (3 - 1) = 256
So, your optimized mkfs command would look like this (a small script to derive these values automatically follows below).

mkfs.ext4 -b 4096 -E stride=128,stripe-width=256 /dev/md0
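
If you’d rather not do the arithmetic by hand, a small shell sketch like the following can derive the values from the running array. It assumes the array is /dev/md0 and that the kernel exposes the chunk size, in bytes, at /sys/block/md0/md/chunk_size:

# Read the chunk size from sysfs (reported in bytes) and convert to KB
CHUNK_KB=$(( $(cat /sys/block/md0/md/chunk_size) / 1024 ))
BLOCK_KB=4      # ext4 block size we plan to use (4KB)
DISKS=3         # total number of disks in the RAID-5 array
STRIDE=$(( CHUNK_KB / BLOCK_KB ))
STRIPE_WIDTH=$(( STRIDE * (DISKS - 1) ))
mkfs.ext4 -b 4096 -E stride=$STRIDE,stripe-width=$STRIPE_WIDTH /dev/md0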

Note: As a caveat for using ext4 with volumes larger than 16TB, you’ll need a new enough version of e2fsprogs to create a filesystem that supports more than 16TB. Ubuntu 12.04 ships e2fsprogs 1.42, and this version can create a 64-bit filesystem (supporting more than 16TB) like this.

mkfs.ext4 -O 64bit /dev/md0

If you chose ext2/3/4, you should also be aware of reserved space. By default, ext2/3/4 reserves 5% of the drive’s space, which only root is able to write to. This is done so an ordinary user cannot fill the drive and prevent critical daemons from writing to it, but on a large RAID array that critical daemons won’t be writing to anyway, 5% is a lot of wasted space. I chose to set the reserved space to 0% using tune2fs:

tune2fs -m 0 /dev/md0
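
You can confirm the change took effect by listing the filesystem parameters and checking the reserved block count:

tune2fs -l /dev/md0 | grep -i "reserved block count"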

Next we should add the array to the fstab, so that it will automatically be mounted when the system boots up. This can be done by editing the file /etc/fstab.

nano /etc/fstab

Your fstab should already contain a few entries (if it doesn’t, something is wrong!). At the bottom, add a line similar to the following:

/dev/md0 /storage ext4 defaults 0 0

Press ctrl+x, then y, and then enter to save and exit the editor.

I chose to mount my array on /storage, but you may well wish to mount it somewhere else. As I said earlier I chose to use ext4, but here you will need to enter whatever filesystem you chose earlier. If the folder you chose doesn’t exist you will need to create it like this.

mkdir /storage
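
As a side note, instead of referring to the array as /dev/md0 in fstab, you can use the filesystem’s UUID, which stays the same even if the md device number ever changes. Look it up with blkid and use a line of the form UUID=<that value> /storage ext4 defaults 0 0.

blkid /dev/md0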

Now, mount the array.

mount -a

This will mount anything mentioned in the fstab that isn’t currently mounted. Hopefully, your array is now available on /storage. Check your available space like this.

df -h /storage

It should look something like this:

Filesystem Size Used Avail Use% Mounted on
/dev/md0 2.0G 35M 2.0G 2% /storage
