Lesson #10 - RAID and File Systems
Over the past two
weeks we dealt with how storage media work, but there are
still a couple of lingering issues about storage media that we
have to work on. One is RAID, a way of combining several hard
drives into a single array, traditionally handled by a
specialized controller for SCSI hard drives. The other issue
is Filing Systems, which govern exactly how a hard drive works
with the operating system.
RAID
Previously we talked
about drive controllers and how they work with your
computer. There is one form of drive controller we did
not talk about, called RAID. RAID stands for Redundant
Array of Independent Disks, and comes in many different
formats. The basic advantages of RAID are its fault
tolerance, its speed, and its data replication and back-up
properties.
There are two possible
approaches to RAID: Hardware RAID and Software
RAID.
Hardware RAID
The hardware-based system
manages the RAID subsystem independently from the host and
presents to the host only a single disk per RAID
array.
An example of a hardware RAID
device would be one that connects to a SCSI controller and
presents each RAID array as a single SCSI drive. An external
RAID system moves all RAID handling "intelligence" into a
controller located in the external disk subsystem. The whole
subsystem is connected to the host via a normal SCSI
controller and appears to the host as a single
disk.
RAID controllers also come in
the form of cards that act like a SCSI
controller to the operating system, but handle all of the
actual drive communications themselves. In these cases, you
plug the drives into the RAID controller just like you would a
SCSI controller, but then you add them to the RAID
controller's configuration, and the operating system never
knows the difference.
Software RAID
Software RAID implements the
various RAID levels in the kernel disk (block device) code. It
also offers the cheapest possible solution: expensive disk
controller cards or hot-swap chassis are not required, and
software RAID works with cheaper IDE disks as well as SCSI
disks. With today's fast CPUs, software RAID performance can
match or even exceed that of hardware RAID.
The MD driver in the Linux
kernel is an example of a RAID solution that is completely
hardware independent. The performance of a software-based
array is dependent on the server CPU performance and
load.
Levels and linear support
Linux software RAID offers levels 0, 1, 4, 5,
and linear support. These RAID types act as
follows:
- Level 0 -- RAID level
0, often called "striping," is a performance-oriented
striped data mapping technique. That means the data being
written to the array is broken down into stripes and written
across the member disks of the array. This allows high I/O
performance at low inherent cost but provides no redundancy.
Storage capacity of the array is equal to the total capacity
of the member disks.
- Level 1 -- RAID level
1, or "mirroring," has been used longer than any other form
of RAID. Level 1 provides redundancy by writing identical
data to each member disk of the array, leaving a "mirrored"
copy on each disk. Mirroring remains popular due to its
simplicity and high level of data availability. Level 1
operates with two or more disks that may use parallel access
for high data-transfer rates when reading, but more commonly
operate independently to provide high I/O transaction rates.
Level 1 provides very good data reliability and improves
performance for read-intensive applications but at a
relatively high cost. Array capacity is equal to the
capacity of one member disk.
- Level 4 -- Level 4 uses
parity concentrated on a single disk drive to protect data.
It's better suited to transaction I/O than to large file
transfers. Because the dedicated parity disk represents an
inherent bottleneck, level 4 is seldom used without
accompanying technologies such as write-back caching.
Although RAID level 4 is an option in some RAID partitioning
schemes, it is not an option allowed in Red Hat Linux RAID
installations. Array capacity is equal to the total capacity
of the member disks, minus the capacity of one member disk.
- Level 5 -- The most
common type of RAID. By distributing parity across some or
all of an array's member disk drives, RAID level 5
eliminates the write bottleneck inherent in level 4. The
only remaining bottleneck is the parity calculation process.
With modern CPUs and software RAID, that isn't a very big
bottleneck. As with level 4, the result is asymmetrical
performance, with reads substantially outperforming writes.
Level 5 is often used with write-back caching to reduce the
asymmetry. Array capacity is equal to the total capacity of
the member disks, minus the capacity of one member disk.
- Linear RAID -- Linear
RAID is a simple grouping of drives to create a larger
virtual drive. In linear RAID, the chunks are allocated
sequentially from one member drive, going to the next drive
only when the first is completely filled. This grouping
provides no performance benefit, as it is unlikely that any
I/O operations will be split between member drives. Linear
RAID also offers no redundancy, and in fact decreases
reliability -- if any one member drive fails, the entire
array cannot be used. The capacity is the total of all member
disks.
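The capacity rules in the list above, and the parity idea behind
levels 4 and 5, can be sketched in a few lines of Python. This is
only an illustration, not how the MD driver is written: the function
names are made up for this lesson, all member disks are assumed to
be the same size, and the parity shown is the byte-by-byte XOR
that levels 4 and 5 conventionally use.

```python
from functools import reduce


def array_capacity(level, disk_count, disk_size):
    """Usable capacity of an array of equal-sized disks (hypothetical helper)."""
    if level in (0, "linear"):
        return disk_count * disk_size        # no redundancy: every disk holds data
    if level == 1:
        return disk_size                     # every disk holds the same copy
    if level in (4, 5):
        return (disk_count - 1) * disk_size  # one disk's worth of space holds parity
    raise ValueError(f"unsupported RAID level: {level!r}")


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four 100 GB disks under each scheme.
for level in (0, 1, 4, 5, "linear"):
    print(level, array_capacity(level, 4, 100))

# Parity is computed from the data blocks of one stripe.
data = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data)

# If the second disk fails, its block can be rebuilt from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR undoes itself, combining the surviving data blocks with
the parity block gives back the missing block, which is why levels
4 and 5 can survive the loss of any one member disk.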
Filing Systems
All the previous
lessons on storage media have covered the hardware aspects of
these systems. Without software, however, these storage
media would be useless. The filing system used by the
operating system defines how a hard drive is used and how
information is stored upon it. There are several filing
systems, including FAT, VFAT, FAT32, and NTFS.
Before a drive can hold
an operating system, it must be formatted. The set of
rules that governs how data is laid out on the hard drive is
called the filing system, and it varies from operating system
to operating system.
The first of these filing
systems was FAT (File Allocation Table). It was
used by DOS as a method of reading and writing information to
the hard drive. It also tracks which clusters belong to each
file, including file fragments, which we will discuss later.
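One way to picture how a File Allocation Table keeps track of a
file's pieces is as a table of "next cluster" entries: each entry
names the cluster that holds the next part of the file. The sketch
below is a simplified model, not the on-disk format. The
`END_OF_CHAIN` marker, the `cluster_chain` function, and the cluster
numbers are all made up for illustration.

```python
END_OF_CHAIN = -1  # hypothetical marker; real FAT variants use reserved values


def cluster_chain(fat, first_cluster):
    """Follow the table from a file's first cluster to its last."""
    chain = []
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        chain.append(cluster)
        cluster = fat[cluster]  # look up where the next piece lives
    return chain


# A toy table: one file starts at cluster 2, another at cluster 4.
fat = {2: 3, 3: 6, 6: END_OF_CHAIN, 4: 5, 5: END_OF_CHAIN}

print(cluster_chain(fat, 2))  # [2, 3, 6]
print(cluster_chain(fat, 4))  # [4, 5]
```

The first file jumps from cluster 3 to cluster 6, so it is stored in
two separate places on the disk, yet the table still lets the
operating system find every piece in order.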
The next progression
of filing systems was VFAT (Virtual File Allocation
Table). It first appeared in the later Windows 3.x releases,
and managed read/write instructions as well as keeping
applications from having direct read/write access to the
drive. The earliest versions of Windows 95 used VFAT, and it
was the first FAT-based filing system to support long file
names.
Next came FAT32
(File Allocation Table, 32-bit). Like VFAT, it controls
read/write instructions as well as the separation of the
application and the physical hard drive. On top of that, it
is a 32-bit filing system, which enables it to use smaller
cluster sizes on larger hard drives and to support hard
drives of up to 2 terabytes.
There is also NTFS
(New Technology File System). This is the filing system
used by Windows NT, and it incorporates several security and
fault-tolerant properties into the filing system. It
allows for transaction logs and the ability to set
file/directory/drive/user permissions on every level. NT
also supports FAT, but without the security and
fault-tolerance.
Clusters, Partitions And Fragmentation
In order to control
the read/write instructions of an operating system, a hard
drive must be broken down into smaller parts. These are
partitions (see below) and clusters. Clusters are the
smallest unit that a filing system writes to, and their size
is generally given in kilobytes. For example,
a common small cluster size under Windows VFAT and FAT32 is
4k. For larger hard drive sizes, the cluster size can be
up to 64k, which means that disk space is always allocated
in whole multiples of 64k.
This means that a
single 1k file on a hard drive with a 64k cluster size takes
up 64k, making it quite an inefficient storage system. That's
why, as hard drive capacities have increased, so has the
operating system's ability to address a larger number of
clusters, and thereby keep the cluster size
smaller.
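The wasted space described above is easy to work out. The following
is a minimal sketch; the function names are made up for this lesson,
and it only models whole-cluster rounding, not any other overhead a
filing system may add.

```python
KB = 1024


def allocated_bytes(file_size, cluster_size):
    """Space a file occupies when it is stored in whole clusters."""
    clusters = -(-file_size // cluster_size)  # integer division, rounded up
    return clusters * cluster_size


def slack_bytes(file_size, cluster_size):
    """Space allocated to the file but not holding any of its data."""
    return allocated_bytes(file_size, cluster_size) - file_size


for cluster_kb in (4, 64):
    used = allocated_bytes(1 * KB, cluster_kb * KB)
    wasted = slack_bytes(1 * KB, cluster_kb * KB)
    print(f"{cluster_kb}k clusters: a 1k file occupies {used // KB}k, "
          f"wasting {wasted // KB}k")
```

With 4k clusters the same 1k file wastes 3k instead of 63k, which is
the reason smaller clusters make better use of the drive.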
In order to set up
the clusters on a hard drive for any Microsoft-based operating
system, you must do two operations. First you must
partition the drive, which involves running the program FDISK
from DOS. Partitioning is the act of separating the
hard drive into sections, which we see as drive C, D, E,
etc. Separating the drive into smaller units
allows cluster sizes to remain small, and lets you set up
different hard drive areas for different operating systems.
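The link between partition size and cluster size can be sketched as
follows. The sketch assumes a filing system that can address at most
2^16 clusters (roughly the limit of 16-bit FAT; the real limit is
slightly lower) and cluster sizes that double from a 512-byte
sector. The function name and both defaults are assumptions made for
illustration.

```python
MB = 1024 * 1024


def min_cluster_size(partition_bytes, max_clusters=2**16, sector_size=512):
    """Smallest power-of-two cluster size that covers the whole partition."""
    size = sector_size
    while partition_bytes > size * max_clusters:
        size *= 2  # too many clusters needed, so each one must be bigger
    return size


for partition_mb in (100, 500, 2000):
    cluster = min_cluster_size(partition_mb * MB)
    print(f"{partition_mb} MB partition -> {cluster // 1024}k clusters")
```

Under these assumptions a 2000 MB partition needs 32k clusters,
while a 500 MB partition gets by with 8k clusters and a 100 MB
partition with 2k clusters.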
The second step in
the process is to format the drive, through the FORMAT
command. Formatting a drive sets up its File Allocation Table
and Filing System, so that the operating system understands
how the drive is used. Remember that the FAT always
resides at a fixed location near the start of the partition
(Track 0 for the first partition, as mentioned in previous
lessons), so that the operating system ALWAYS knows where the
FAT begins.
A low-level
format is a process that is no longer done by field
technicians, but is still done at the factory. It
lays down the basic track and sector structure on the drive
platters. It also checks for defects in the drive
platters, and records a list of bad areas that can't be
used for data storage.
On top of the structure
created by the low-level format, the operating system writes
its own filing system structures, a step called high-level
formatting. Microsoft-based operating systems write the File
Allocation Table at this stage and use it to manage clusters,
and all other modern operating systems tend to have some
equivalent structure of their own.
One of the problems
with operating systems is that they tend to write files into
the next available space as they are added to the hard drive.
This means that if you write a file, then write another file,
and finally add to the first file, the first file ends up
fragmented in two spots. Here is an example:
| File 1 Piece 1 | File 2 Piece 1 | File 1 Piece 2 | File 3 Piece 1 | File 1 Piece 3 | File 3 Piece 2 | File 2 Piece 2 |
The above example
involves three files that are fragmented on the hard drive.
This occurs when edited files grow beyond the clusters
originally allocated to them, and have to be continued in the
next available cluster. In order to combat this, files must be
de-fragmented, or set back in their right order. An
example of the above hard drive de-fragmented is
below:
| File 1 Piece 1 File 1 Piece 2 File 1 Piece 3 | File 2 Piece 1 File 2 Piece 2 | File 3 Piece 1 File 3 Piece 2 |
When files are set
back in their proper order they become faster to read, as the
read head doesn't have to bounce all over the platter looking
for fragments of the file. This can improve disk access
speed and therefore overall system
performance.
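The two disk layouts shown earlier can be modelled as a list of
(file, piece) pairs, one per cluster. The sketch below is a
simplification: a real de-fragmenter has to move data around in
place on the disk, while this one just sorts a list. The function
names are made up for illustration.

```python
def fragment_count(disk, file_id):
    """Number of separate runs of clusters that a file occupies."""
    runs = 0
    previous_owner = None
    for owner, _piece in disk:
        if owner == file_id and previous_owner != file_id:
            runs += 1  # a new run starts wherever the file reappears
        previous_owner = owner
    return runs


def defragment(disk):
    """Regroup clusters so each file's pieces are together and in order."""
    return sorted(disk)


# The fragmented layout from the example: (file number, piece number).
disk = [(1, 1), (2, 1), (1, 2), (3, 1), (1, 3), (3, 2), (2, 2)]
tidy = defragment(disk)

for file_id in (1, 2, 3):
    print(f"File {file_id}: {fragment_count(disk, file_id)} fragments before, "
          f"{fragment_count(tidy, file_id)} after")
```

Before de-fragmenting, File 1 sits in three separate places and
Files 2 and 3 in two each; afterwards every file occupies a single
run of clusters, so the read head can fetch it in one pass.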