In The Case Of Tape,
RAID Stands For
"Redundant Array of Independent Devices"
by Mo Nourmohamadian
As its original name, Redundant Array of Inexpensive Disks, declares, RAID was conceived for use with, well, disks. It made multiple smaller disks operate like a single bigger, faster, and more reliable one.
That was in the ‘80s. Now, in the ‘90s, the RAID concept is being applied to tape arrays as well. This has led to a rather obvious self-contradiction as far as the name is concerned, since there are no disks at all, expensive or otherwise, in a tape-based "RAID" system. The name "Tape RAID" has gained widespread acceptance for this new kind of product, but that name has also contributed to some misconceptions that are worth clearing up.
Most important, disk and tape have different characteristics, characteristics which have led them to be used at different positions in the storage hierarchy. Disk provides quick, random access to a limited amount of online data for multiple users simultaneously. Tape provides very inexpensive offline storage for an unlimited amount of sequentially recorded data which must be accessed by one user at a time.
Therefore, in spite of the name, tape RAID can never substitute for an array of small disks nor for a single larger one. Instead it "substitutes" for a hypothetical ultra-fast tape with extremely large capacity.
Misconception number one about implementing RAID with tape instead of disk is that the original concept is somehow compromised. Not true. The RAID concept survives the transition; only the applications change. In fact, there are half a dozen concepts in this thing we call RAID, and several of them (but not all) can be applied effectively to tape.
With a single-bus controller it becomes more difficult to realize throughput rates much greater than that of a single drive. Trying to optimize around this limitation leads to a Catch-22: increasing the block size lessens parallelism, while decreasing it adds bus overhead and so limits performance. Parallel connection to the tape drives achieves maximum throughput. With the right kind of controller the user could conceptually achieve a reasonable level of concurrency as well, by providing multiple ranks of striped tape drives. Controller costs are higher, but performance can be vastly superior.
Using multiple disks as if they constituted a single, larger, virtual one yields improvements both in performance and in fault tolerance. Generally speaking, performance is improved by making multiple reads and writes in parallel, and fault tolerance is improved by mirroring, error correction, or the use of parity.
Striping
RAID performance is achieved through "striping" which divides user data into consecutive blocks residing on multiple devices. The size of this block is referred to as a "striping unit".
Host system read and write requests larger than one block create parallelism, allowing more than one device to contribute to the data transfer speed. The larger the request, the longer that parallelism is sustained, resulting in throughputs several times that of a single drive. Decreasing the "striping unit" achieves more parallelism but reduces concurrency (the ability to service multiple small requests simultaneously).
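The mapping of striping units onto devices described above can be sketched in a few lines of Python. This is only an illustration, not any vendor's implementation: the function names `stripe` and `devices_touched` and the round-robin placement are assumptions made for the example.

```python
from math import ceil

def stripe(data: bytes, num_devices: int, unit: int) -> list[list[bytes]]:
    """Cut data into striping units and deal them round-robin to the devices."""
    devices = [[] for _ in range(num_devices)]
    for index, offset in enumerate(range(0, len(data), unit)):
        devices[index % num_devices].append(data[offset:offset + unit])
    return devices

def devices_touched(request_size: int, num_devices: int, unit: int) -> int:
    """How many devices take part in one request that starts on a unit boundary."""
    return min(num_devices, ceil(request_size / unit))

# Twelve bytes, three devices, a two-byte striping unit (all values hypothetical).
layout = stripe(b"ABCDEFGHIJKL", num_devices=3, unit=2)
# device 0 holds AB and GH, device 1 holds CD and IJ, device 2 holds EF and KL
```

A request no larger than one striping unit touches a single device, while a request spanning three or more units keeps all three devices busy, which is the parallelism the text refers to.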
Choosing The Optimum Stripe Unit
The higher the degree of parallelism, the greater the throughput; but the lower the parallelism, the greater the concurrency. Disk RAID users are forced to compromise.
This actually causes no problems with tape. The concept of striping works just as well with tape as with disk. Multiple tapes in an array can be read or written for a single request simultaneously, and their data streams merged. The result is a throughput level which can be several times that of a single drive.
And concurrency, measured by the number of simultaneous I/Os per second, does not apply. Tape is a single-user medium. Concurrency doesn’t count. The arrays can be set up to maximize parallelism (and therefore throughput) by choosing the smallest striping unit possible, the byte. That’s what you’ll see in all the newest, highest-performance tape RAID systems.
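As a rough sketch of what byte striping means, the Python below splits one data stream across several tapes one byte at a time and then merges the streams back. The helper names `split_bytes` and `merge_bytes` are hypothetical, and a real controller does this in hardware on the fly rather than on whole buffers.

```python
def split_bytes(data: bytes, num_tapes: int) -> list[bytes]:
    """Byte striping: byte i of the stream goes to tape i modulo num_tapes."""
    return [data[i::num_tapes] for i in range(num_tapes)]

def merge_bytes(streams: list[bytes]) -> bytes:
    """Interleave the per-tape streams back into the original byte order."""
    count = len(streams)
    merged = bytearray(sum(len(s) for s in streams))
    for i, stream in enumerate(streams):
        merged[i::count] = stream
    return bytes(merged)

original = b"HELLO TAPE"
tapes = split_bytes(original, 4)   # four tapes, a hypothetical array size
restored = merge_bytes(tapes)      # identical to the original stream
```

Every tape receives a share of even the smallest request, so all drives stream at once.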
Fault-tolerance
In disk RAID, fault tolerance can be improved by mirroring (maintaining a duplicate), by parity, or by the use of error-correcting codes (Hamming codes). Hamming codes are rarely, if ever, implemented, but the other two methods are very popular.
Mirroring
Mirroring, by definition, provides 100% redundancy. It also takes up twice as many disk spindles. Mirroring in tape RAID is about more than fault tolerance. Tapes are removable media, so users may elect to make two or more backup copies of the same files: one to be kept nearby, another to be shipped to a disaster recovery center, and perhaps more. Tape media is inexpensive, and it takes no longer to record up to five copies than to record one, so why not?
Parity Methods
Two RAID levels (Levels 3 and 4) call for dedicating a separate drive for parity information, stored as individual bytes in the case of Level 3 and as blocks in Level 4. One RAID level (Level 5) calls for interspersing blocks of data and parity information across all drives in a set. All three incorporate striping for data and parity recording.
All three forms could conceivably be implemented with tape RAID, although not to equal effect.
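The difference between a dedicated parity drive and interspersed parity can be pictured with a small Python sketch. The function names are hypothetical, placing the dedicated parity drive last is an assumption, and the rotation shown for Level 5 is just one simple pattern among several possible ones.

```python
def parity_drive(stripe_index: int, num_drives: int, level: int) -> int:
    """Return the index of the drive that holds parity for a given stripe."""
    if level in (3, 4):
        return num_drives - 1  # dedicated parity drive (assumed to be the last)
    if level == 5:
        # Assumed rotation: parity moves one drive to the left on each stripe.
        return (num_drives - 1 - stripe_index) % num_drives
    raise ValueError("only Levels 3, 4 and 5 use parity")

def layout_row(stripe_index: int, num_drives: int, level: int) -> list[str]:
    """Mark each drive in one stripe as data ('D') or parity ('P')."""
    parity = parity_drive(stripe_index, num_drives, level)
    return ["P" if drive == parity else "D" for drive in range(num_drives)]

for stripe_index in range(4):
    print(stripe_index,
          "".join(layout_row(stripe_index, 4, level=4)),
          "".join(layout_row(stripe_index, 4, level=5)))
```

With four drives, the Level 4 column stays the same on every stripe while the Level 5 column shifts the parity position from drive to drive.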
To begin, achieving higher throughput performance in tape RAID, as in disk RAID, demands that you design in more parallelism and get more drives operating at one time. That in turn demands a smaller "striping unit". Byte striping, with or without a dedicated parity drive, simply works better than block striping with tape. Therefore Level 3 tape RAID systems will outperform Level 4 and Level 5.
In real life, the question is moot as no one has introduced a Level 4 tape RAID system. Level 3 systems, on the other hand, are becoming increasingly popular.
Misconception number two about RAID in general and tape RAID in particular is that the dedicated parity drive represents a single point of failure, and therefore extra vulnerability. However, adding a dedicated parity drive makes it possible to recover from the failure of any single drive including the parity drive.
A bad or missing tape can be reconstructed by coupling the data on the remaining good tapes with that of the parity drive. The data from a missing or faulty tape can be regenerated for the host on the fly if the tape controller is fast enough. (In some systems it is even possible to turn off one of the tape drives while the array is operating, without any degradation in performance.)
And if the parity drive is the one that’s bad, who cares? The real data still exists on the other tapes and doesn’t have to be regenerated at all.
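The recovery described above can be illustrated with the usual XOR form of parity. The sketch below is an assumption-laden toy: the tape contents are made up, the streams are assumed to be of equal length, and `xor_streams` is a hypothetical helper rather than part of any product.

```python
from functools import reduce

def xor_streams(streams: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length streams."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*streams))

# Three data tapes (hypothetical contents) and the parity tape derived from them.
data_tapes = [b"TAPE", b"RAID", b"DATA"]
parity_tape = xor_streams(data_tapes)

# The second tape is bad or missing: rebuild it from the survivors plus parity.
survivors = [data_tapes[0], data_tapes[2], parity_tape]
rebuilt = xor_streams(survivors)   # equals the lost tape's contents
```

If the parity tape itself is the one that fails, nothing has to be rebuilt to satisfy a read: the data tapes are still complete.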
What about Level 5, interspersing parity blocks and data blocks across all the drives in a set? Again, it’s one of these things that works better with disk than with tape. And yet Level 5 tape RAID, block striping and all, is very common. Why? One reason is that Level 5 systems can be less expensive than systems requiring parallel tape drive connection. The hardware to support a Level 5 block striping system is also less expensive in general than that for a Level 3 byte-striping system, and it was on the market first.
The difference is in the controllers. (See Figure 1.) SCSI controllers which operate as extensions of the host SCSI bus don’t require as much circuitry as do controllers which provide a separate parallel bus for each drive. On the other hand, the simpler controllers are limited to one-at-a-time reads and writes. Although Level 5 tape arrays are theoretically not limited to single-bus controllers, those currently on the market have been built that way. Level 3 controllers have all been built with multiple buses.
Applications
Disk RAID systems are all used for high performance, secure, online storage. Tape RAID systems have broader application.
Striping is used for fast disk backup and for high speed data acquisition. Mirroring is used for making duplicate backups (which is especially big in banking), and for simple copying operations.
In addition to those modes, it is possible for tape arrays to be used in two others: cascading and pass-through.
Cascading refers to the use of an array as a single tape drive with very large capacity. The first tape is written until full, then the second, and so on. Applications would be in unattended backup.
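Cascading is simple enough to sketch directly. In the Python below, the function name `cascade` and the tiny capacities are hypothetical, chosen only to show the first tape filling before the second is touched.

```python
def cascade(data: bytes, tape_capacity: int, num_tapes: int) -> list[bytes]:
    """Fill the first tape completely, then the second, and so on."""
    if len(data) > tape_capacity * num_tapes:
        raise ValueError("data exceeds the total capacity of the array")
    return [data[i * tape_capacity:(i + 1) * tape_capacity]
            for i in range(num_tapes)]

# Eight bytes onto four tapes that hold three bytes each (toy numbers).
tapes = cascade(b"ABCDEFGH", tape_capacity=3, num_tapes=4)
# tapes 0 and 1 are full, tape 2 is partly used, tape 3 is still empty
```

Unlike striping, only one drive is active at a time, so the gain is capacity rather than speed.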
In the pass-through mode, each tape can operate independently. The array operates as Just a Bunch Of Tapes (and of course we need an acronym for that, so you’ll hear JBOT).
In both cascading and pass-through modes, little use is made of the tape array controller, and the same effects can be achieved using multiple drives and the right software (without the array controller).
However, many arrays can switch between multiple operating modes, providing their users with additional benefits, including cascading and pass-through. Although this author is not aware of Level 5 arrays which can operate in multiple modes, some combinations which include one or more levels are common.
Levels 0, 1 and 3 are all concurrently implemented in systems from Andataco, CoComp, Dynatek Automation Systems, Excitron, LAND-5, Symbios/Metastor, Trimm U. K., and Virtual Technology. Levels 0 and 1 and cascading are implemented together by Hi-Par Systems. Cascading and Level 1 operation are implemented together in systems from Contemporary Cybernetics and Transitional Technology. Level 5 hardware systems are available from Data General and Peripheral Vision. Software implementations of Level 5 are available directly from Netframe or from Cheyenne. There may be others, and, as always, the vendors themselves are the best sources of information about their systems.
Pricing, Performance And Scalability
Current tape technology provides for uncompressed capacity of up to 20 GB per cartridge (DLT 4000) and transfer rates in the range of 1.5 MB/s per drive. 50 GB tapes and 5 MB/s transfers are just over the horizon. Multiplying those capacities and rates in a tape array leads to some potentially mind-boggling systems.
And there’s more. Big misconception number three is that tape arrays aren’t scalable. Of course they are. Users can acquire a two-drive system for as little as
and move up to at least seven drives with other systems already on the market. In addition, arrays can be expanded with stackers and tape libraries.
With stackers or libraries, the volume of offline data can be enormous. And with parallel bus controllers already achieving transfers in the range of 2X to 4X that of a single drive, those large volumes can be brought into memory very rapidly.
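To put rough numbers on those claims, the Python below works through the arithmetic using the cartridge capacity and drive rate quoted earlier. The seven-drive Level 3 configuration (six data drives plus one parity drive) and the convention of 1 GB = 1000 MB are assumptions made for this illustration, and the figures are estimates rather than measurements.

```python
CARTRIDGE_GB = 20       # uncompressed DLT 4000 capacity quoted in the article
DRIVE_MB_PER_S = 1.5    # single-drive transfer rate quoted in the article

def array_capacity_gb(num_drives: int, parity_drives: int = 1) -> int:
    """Usable capacity of one set of cartridges, excluding parity."""
    return (num_drives - parity_drives) * CARTRIDGE_GB

def transfer_hours(gigabytes: float, speedup: float) -> float:
    """Hours to move the data at `speedup` times the single-drive rate."""
    seconds = gigabytes * 1000 / (DRIVE_MB_PER_S * speedup)
    return seconds / 3600

capacity = array_capacity_gb(7)   # 6 data drives x 20 GB = 120 GB
for speedup in (1, 2, 4):
    print(f"{speedup}X: {transfer_hours(capacity, speedup):.1f} hours")
```

Under these assumptions a 120 GB set takes roughly 22 hours at single-drive speed and between about 11 and 5.6 hours at the 2X to 4X rates mentioned above.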