9. PERFORMABILITY

In this chapter, the performability of a disk array subsystem is studied. The focus is on performability alone, because the definition of “cost” in cost-performability is somewhat ambiguous. A simple performance model of the disk array is used, because more accurate models are considered to be outside the scope of this thesis.

9.1 Performability models

Figure 37. Simple Markov model for performability of TMM ($D$ is the number of disks, $\lambda$ is the disk failure rate, $\mu$ is the repair rate, $p_i(t)$ defines the probability of the system being in state $i$ at time $t$, where $i$ defines the number of faulty disks in the disk array, and $w_i(t)$ defines the reward function of state $i$ at time $t$)

Performability (i.e., combined performance and reliability) of a system can be expressed using Markov reward models [Trivedi 1994, Catania 1993, Pattipati 1993, Smith 1988, Furchgott 1984, Meyer 1980, Beaudry 1978]. Figure 37 illustrates a performability model for TMM presented in Chapter 6. This is a typical RAID-5 array with a D+1 redundancy scheme. For each state i (i=0,1, and 2), two parameters are defined: probability and reward.

The first parameter, probability ($p_i(t)$), defines the probability of the system being in state i at a given time t. This is the same probability as used for the reliability analysis in Chapters 6, 7, and 8.

The second parameter, reward ($w_i(t)$), defines the reward that the system earns while in state i at a given time t. The reward function can be, for example, the performance of the system. In a disk array, the performance can be expressed as the number of I/O operations per second. For example, in state 0 of Figure 37, the reward function specifies the number of user I/O operations that the disk array can perform in the fault-free state. In state 1, the reward function specifies the number of user I/O operations that the crippled disk array can perform while it is either waiting for the repair process to start or while the repair process is ongoing. In state 2, the reward function is zero because the data is lost and the disk array has failed.

For simplicity, it is assumed in this thesis that the reward function is constant (i.e., $w_i(t) = w_i$) and depends only on the state i, not on the time.

The performability (or computational availability) in state i at a given time t can then be expressed as the product of the two parameters above (i.e., the reward and the probability of state i) as follows

$P_i(t) = p_i(t)\, w_i(t)$. (106)

The total performability at a given time t can be expressed as the sum of the performabilities of all states i as follows

$P(t) = \sum_i P_i(t) = \sum_i p_i(t)\, w_i(t)$. (107)
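As a minimal sketch of equations (106) and (107), the following Python function sums the products of the state probabilities and the rewards. The function name and the numerical values are hypothetical and only illustrate the arithmetic; they are not taken from the analysis in this thesis.

```python
def performability(probabilities, rewards):
    """Return the sum over all states of probability times reward."""
    if len(probabilities) != len(rewards):
        raise ValueError("one reward is needed for each state")
    return sum(p * w for p, w in zip(probabilities, rewards))


# Hypothetical snapshot of the three-state model at some time t:
# state 0 (fault-free), state 1 (crippled), and state 2 (data lost).
state_probabilities = [0.90, 0.08, 0.02]
state_rewards = [10.0, 4.0, 0.0]  # the reward of state 2 is zero

total = performability(state_probabilities, state_rewards)
print(f"total performability: {total:.2f}")
```

Because the reward of the data-loss state is zero, that state contributes nothing to the sum regardless of its probability.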

Steady state performability

If the Markov model describes a steady state system, the performability can be expressed as follows

$P = \sum_i p_i\, w_i$. (108)

Non-steady state performability

In a non-steady state system, the probability of state i changes with time. Eventually, the system will fail (in Figure 37, $\lim_{t \to \infty} p_2(t) = 1$). The cumulative performability of a system with non-repairable faults can be expressed as

$CP = \sum_i w_i \int_0^\infty p_i(t)\, dt = \sum_i w_i\, CR_i$ (109)

where $CR_i = \int_0^\infty p_i(t)\, dt$ is the cumulative reliability of state i.

9.1.1 Performability of TMM

The performability of a RAID-5 disk array that is modeled with TMM can be expressed using the above equation (109) and the probabilities of states 0, 1, and 2 as expressed in Chapter 6 in equations (12) - (14). The cumulative reliabilities of TMM are:

Undisplayed Graphic, (110)

Undisplayed Graphic, (111)

and the value of $CR_2$ has no effect since $w_2 = 0$. Then, the performability of TMM is

Undisplayed Graphic (112)

where $w_0$ and $w_1$ are the reward functions of states 0 and 1, respectively. The reward functions depend on the type of the operation (read or write).

It should be noted that the performability in equation (112) equals the MTTDL of TMM if $w_0$ and $w_1$ both equal one. As $w_0$ is typically greater than or equal to $w_1$, an upper-bound estimate of the performability is obtained by multiplying the MTTDL of the array by the reward function of the fault-free state. Hence, the approximation of the performability can be expressed as

$CP \approx w_0 \cdot MTTDL$ (113)

where $MTTDL$ is the mean time to data loss of the array.
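The relation between the exact performability and the approximation (113) can be checked numerically. The sketch below assumes the conventional transition rates of a D+1 array, namely (D+1)λ from state 0 to state 1, μ from state 1 back to state 0, and Dλ from state 1 to state 2. These rates, the function names, and the example values are assumptions made for illustration; they are not copied from equations (12) - (14) of Chapter 6.

```python
def tmm_cumulative_reliabilities(data_disks, failure_rate, repair_rate):
    """Expected total time spent in states 0 and 1 before data loss.

    Assumed rates: 0 -> 1 at (D+1)*lambda, 1 -> 0 at mu, 1 -> 2 at D*lambda.
    """
    leave_crippled = data_disks * failure_rate + repair_rate
    # Expected number of visits to state 1 (and to state 0) before data loss.
    visits = leave_crippled / (data_disks * failure_rate)
    cr0 = visits / ((data_disks + 1) * failure_rate)
    cr1 = visits / leave_crippled
    return cr0, cr1


def tmm_performability(data_disks, failure_rate, repair_rate, w0, w1):
    """Cumulative performability: reward-weighted time in states 0 and 1."""
    cr0, cr1 = tmm_cumulative_reliabilities(data_disks, failure_rate, repair_rate)
    return w0 * cr0 + w1 * cr1


# Hypothetical example values (rates per hour, rewards relative to one disk).
D, LAM, MU = 9, 1.0e-5, 0.5
W0, W1 = 10.0, 0.8

cr0, cr1 = tmm_cumulative_reliabilities(D, LAM, MU)
mttdl = cr0 + cr1
exact = tmm_performability(D, LAM, MU, W0, W1)
approximation = W0 * mttdl

print(f"MTTDL          : {mttdl:.4e} h")
print(f"exact          : {exact:.4e}")
print(f"approximation  : {approximation:.4e}")
print(f"relative error : {(approximation - exact) / exact:.2e}")
```

Under these assumptions, the time spent in the crippled state is a very small fraction of the total when the repair rate is much higher than the failure rate, so the approximation exceeds the exact value only slightly.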

9.1.2 Performability of EMM1

The performability of a RAID-5 disk array that is modeled with EMM1 can be expressed using the above equation (109) and the probabilities of states 00, 01, 10, and f as expressed in Chapter 7 in equations (61) - (64). The cumulative reliabilities of EMM1 are:

Undisplayed Graphic, (114)

Undisplayed Graphic, (115)

Undisplayed Graphic, (116)

and the value of $CR_f$ has no effect since $w_f = 0$. Then, the performability of EMM1 is

Undisplayed Graphic (117)

where $w_{00}$, $w_{01}$, and $w_{10}$ are the reward functions of states 00, 01, and 10, respectively. The reward functions depend on the type of the operation (read or write).

9.1.3 Reward functions of disk array subsystems

The performance of disk arrays can be modeled using a simple performance model as in [Hillo 1993, Gibson 1991, Kemppainen 1991]. Here, the reward functions are modeled using the performance of the disk array estimated for either read or write operations, but not for mixed read and write operations. A more accurate performance model of the disk arrays is considered to be outside the scope of this thesis.

RAID-5 performance

In a RAID-5 array, a total of $D+1$ disks is used for building an array of $D$ data disks. There are $D$ disks in the crippled array. If I/O requests are assumed not to span several disks, each request requires the following number of disk accesses:

• 1 disk operation to read from a fault-free disk array;

• $D$ disk operations to read from a crippled disk array (in the worst case);

• 4 disk operations to write to a fault-free disk array;

• $D+1$ disk operations to write to a crippled disk array (in the worst case); and

• Undisplayed Graphic disk operations to reconstruct a faulty disk block.

RAID-1 performance

The above equations (112) and (117) for performability apply to the analysis of RAID-5 arrays. However, later in this chapter, it is shown that a good estimate of the performability can be made using the MTTDL of the array and the reward function of the fault-free state. Hence, reward functions for the RAID-1 array are also included here.

In a RAID-1 array, a total of $2D$ disks is used for building an array of $D$ data disks. There are $2D-1$ disks in the crippled array. If I/O requests are assumed not to span several disks, each request requires the following number of disk accesses:

• 1 disk operation to read from a fault-free disk array;

• 1 disk operation to read from a crippled disk array;

• 2 disk operations to write to a fault-free disk array;

• 2 disk operations to write to a crippled disk array (in the worst case); and

• 2 disk operations to reconstruct a faulty disk block.

Relative performance

In a disk array, the maximum number of I/O operations depends on the array configuration, the type of the operation, and the properties of the disks. The array configuration specifies how many parallel read and write operations can be performed, as illustrated in the introduction in Chapter 1. In this thesis, the performance is expressed relative to a single disk: a relative performance of one corresponds to one fully working disk serving user requests. For example, a fault-free RAID-1 array with two disks has a relative performance of two for read operations and one for write operations.

Effect of the scanning algorithm

The effect of the scanning algorithm is studied by reserving a certain capacity for the scanning algorithm. For every disk, a certain capacity (expressed with the scanning activity, $a_{scan}$) is reserved for scanning and the remaining capacity ($1 - a_{scan}$) is available for other requests (user requests or repair).

Effect of the repair process

The repair process decreases the maximum number of user operations in the crippled array. The degree of degradation depends on the activity of the repair process. When, for example, a disk array with a total of ten disks is being repaired using 20% of the capacity for repair (as expressed with the repair activity, $a_{repair}$), the theoretical remaining capacity is 8 units. This is further reduced if a read or write request needs several disk operations. For example, a write to a crippled RAID-5 array needs 10 disk operations. Hence, the relative performance is only $8/10 = 0.8$. For comparison, the relative write performance of a fault-free array of the same size would be 2.5.
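The arithmetic of this example can be sketched as follows. The helper below is hypothetical; it assumes that the relative performance is the capacity left for user requests, measured in disks, divided by the number of disk operations needed per request.

```python
def relative_performance(disks, operations_per_request, reserved_fraction=0.0):
    """Relative performance compared with one fully working disk.

    disks: number of disks serving requests
    operations_per_request: disk operations needed by one user request
    reserved_fraction: share of capacity reserved for repair or scanning
    """
    if not 0.0 <= reserved_fraction <= 1.0:
        raise ValueError("reserved fraction must be between 0 and 1")
    if operations_per_request < 1:
        raise ValueError("a request needs at least one disk operation")
    available = disks * (1.0 - reserved_fraction)
    return available / operations_per_request


# Ten disks, 20% of the capacity used for repair, 10 operations per write.
crippled_write = relative_performance(10, 10, reserved_fraction=0.2)

# Fault-free array of the same size, 4 operations per write.
fault_free_write = relative_performance(10, 4)

print(f"crippled write:   {crippled_write:.2f}")
print(f"fault-free write: {fault_free_write:.2f}")
```

The two results correspond to the values 0.8 and 2.5 quoted in the text.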

Reward functions of RAID-5 and RAID-1

The relative reward functions of RAID-5 and RAID-1 arrays are given in Table 15. From the point of view of performance, three different states are distinguished:

• all disks working (state 0 in TMM and states 00 and 01 in EMM1);

• one disk unit failed (state 1 in TMM and state 10 in EMM1); and

• data lost (state 2 in TMM and state f in EMM1).

Sector faults are considered not to degrade the performance.

Table 15. Relative reward functions of RAID-5 and RAID-1

RAID-5

State | Reward function for a read operation | Reward function for a write operation
All disks working (state 0 in TMM; states 00 and 01 in EMM1) | Undisplayed Graphic | Undisplayed Graphic
One disk unit failed (state 1 in TMM; state 10 in EMM1) | Undisplayed Graphic | Undisplayed Graphic
Data lost (state 2 in TMM; state f in EMM1) | 0 | 0

RAID-1

State | Reward function for a read operation | Reward function for a write operation
All disks working (state 0 in TMM; states 00 and 01 in EMM1) | Undisplayed Graphic | Undisplayed Graphic
One disk unit failed (state 1 in TMM; state 10 in EMM1) | Undisplayed Graphic | Undisplayed Graphic
Data lost (state 2 in TMM; state f in EMM1) | 0 | 0

In a RAID-5 array, all disks are involved in the repair process. As the worst-case scenario is used here, a read operation to a crippled array requires access to all remaining disks. Hence, the relative performance is one, from which the repair activity is deducted. Similarly, in the worst case, a write operation requires reading all remaining disks once and writing to one disk. From this relative performance, the repair activity is deducted.

In a RAID-1 array, only two disks are involved in the repair process. When the array has $D$ data disks ($2D$ disks in total), $D-1$ data disks are not affected by a disk unit fault. For a read operation, there are $2D-1$ disks available and the performance is further reduced by the repair process in one disk. For a write operation, there are $D-1$ data disks that are not affected by the disk unit fault and one data disk that is affected by the repair process.

9.1.4 Performability comparisons

The performability of a RAID-5 array modeled using the TMM and EMM1 models is illustrated in Figure 38. Here, the same default parameters are used as in Chapter 8. The figure shows that the approximation (performability equals MTTDL multiplied by the reward function of the fault-free state) provides accurate results. Hence, the same approximation principle is used with the RAID-1 array. It is also concluded that both performability models provide similar results that correspond to the reliability results.


Figure 38. Performability of RAID-5 array as a function of the number of disks in the array

Effect of the repair activity

The effect of the repair activity is studied in a configuration where the repair time depends on the number of disks and the repair activity. In Figure 39, the performability of a RAID-5 array is illustrated as a function of the repair activity. Here, a RAID-5 array of 50 data disks is studied in two configurations: hot swap (repair starts 8 hours after the disk failure) and hot spare (repair starts immediately after the disk failure). The read operation provides four times better performability than the write operation because its reward function in state 00 is four times larger. The repair time is assumed to be two hours with 100% repair activity and correspondingly longer if the repair activity is less than 100%. The hot spare configuration provides significantly better performability than the hot swap configuration because the total repair time in the former case is shorter. The performability of state 10 of EMM1 has only a marginal effect (less than 1%) on the total performability of the RAID-5 array. This is because the failure rates in EMM1 are much smaller than the repair rates and therefore the system is mainly in state 00.

The only factor that may limit the disk repair activity is the performance requirement during the repair time. If no minimum requirement for performance during the repair time is set, the repair can and should be done at full speed; otherwise, the repair activity should be limited to guarantee the minimum performance. When the total performability is considered, the reliability increase due to faster repair is much more significant than the minor performance degradation during the repair time.
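The repair time assumption of this comparison can be written as a small helper. This is only a sketch: it assumes that the repair duration is inversely proportional to the repair activity, and the function and parameter names are hypothetical. The two-hour full-speed repair and the 8-hour hot swap delay are the values stated above.

```python
def time_to_repair_hours(repair_activity, start_delay_hours=0.0,
                         full_speed_repair_hours=2.0):
    """Time from the disk failure until the repair has completed.

    Assumes the repair duration scales as 1 / repair_activity.
    """
    if not 0.0 < repair_activity <= 1.0:
        raise ValueError("repair activity must be in the range (0, 1]")
    return start_delay_hours + full_speed_repair_hours / repair_activity


for activity in (0.2, 0.5, 1.0):
    hot_spare = time_to_repair_hours(activity)
    hot_swap = time_to_repair_hours(activity, start_delay_hours=8.0)
    print(f"{activity:.0%} activity: "
          f"hot spare {hot_spare:.1f} h, hot swap {hot_swap:.1f} h")
```

Under this assumption, the fixed start delay of the hot swap configuration dominates at high repair activity, which is why increasing the repair speed benefits the hot spare configuration relatively more.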

Figure 39. Performability of RAID-5 array modeled with EMM1 as a function of repair activity

Effect of the scanning algorithm

The effect of the scanning algorithm on the performability is studied by varying the scanning activity. It is assumed that it takes 24 hours to scan all disks in the array with 5% scanning activity. The performability of a RAID-5 array is presented in Figure 40 as a function of the scanning activity. When the hot swap (hot spare) configuration is used, the optimum performability is achieved in this sample configuration when the scanning algorithm uses 20% (30%) of the capacity for scanning the disk surface. The hot swap configuration reaches its peak performability earlier than the hot spare configuration because its reliability is dominated more by the longer disk repair time, whereas in the hot spare configuration the reliability continues to improve with increased scanning activity and the sector fault detection it provides. Eventually, in both cases, the performability starts decreasing when the scanning activity approaches 100%. This is expected because less and less capacity of the array remains for user disk requests, while the reliability no longer increases because it is limited by the repair time of the disk unit failure.
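The trade-off between scanning speed and user capacity can be made concrete with a short sketch. It assumes that the time needed to scan all disks is inversely proportional to the scanning activity, anchored at the stated 24 hours for 5% activity; the function names are hypothetical.

```python
def scan_cycle_hours(scanning_activity, reference_activity=0.05,
                     reference_hours=24.0):
    """Time to scan all disks once, assuming a 1 / activity scaling."""
    if not 0.0 < scanning_activity <= 1.0:
        raise ValueError("scanning activity must be in the range (0, 1]")
    return reference_hours * reference_activity / scanning_activity


def user_capacity_fraction(scanning_activity):
    """Share of each disk's capacity left for user requests or repair."""
    return 1.0 - scanning_activity


for activity in (0.05, 0.20, 0.30, 0.90):
    print(f"{activity:.0%} scanning: "
          f"cycle {scan_cycle_hours(activity):.1f} h, "
          f"remaining capacity {user_capacity_fraction(activity):.0%}")
```

The scan cycle shortens quickly at first and then only slowly, while the remaining capacity falls linearly, which is consistent with a performability optimum at a moderate scanning activity.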

Figure 40. Performability of RAID-5 array modeled with EMM1 as a function of scanning activity

RAID-5 vs. RAID-1

The performability of RAID-1 and RAID-5 arrays is compared in Figure 41. The performability of the RAID-5 array is obtained using the above equations (112) and (117), while the performability of the RAID-1 array is approximated by multiplying the MTTDL of RAID-1 by the appropriate reward function. The MTTDL of a RAID-1 array is approximated by dividing the MTTDL of a two-disk RAID-1 by the number of data disks in the array, while the reward function is proportional to the number of disks in the array. Hence, the performability of the RAID-1 array is constant (i.e., while the performance of the disk array increases with the number of disks in the array, the reliability decreases at the same rate, thus keeping the performability constant). The same effect can also be found with RAID-0 arrays, where the performability remains constant but at a much lower level because the RAID-0 array has no redundancy. On the other hand, the performance of the RAID-5 array increases almost linearly with the number of disks in the array, but the reliability decreases more rapidly as more and more disks are protected by just a single disk.
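The constancy of the RAID-1 estimate follows directly from the two scaling rules and can be checked with a short sketch. The two-disk MTTDL value used below is a hypothetical placeholder, and the reward functions are assumed to be the fault-free relative performances without scanning (one disk operation per read, two per write).

```python
def raid1_performability_estimate(data_disks, mttdl_pair_hours, operation):
    """Approximate performability: fault-free reward times array MTTDL."""
    disk_operations = {"read": 1, "write": 2}[operation]
    # Fault-free reward relative to one disk: 2 * D disks share the load.
    reward = 2 * data_disks / disk_operations
    # MTTDL of the array: MTTDL of one mirrored pair divided by D.
    mttdl = mttdl_pair_hours / data_disks
    return reward * mttdl


MTTDL_PAIR_HOURS = 1.0e9  # hypothetical MTTDL of a two-disk RAID-1

for data_disks in (1, 10, 50):
    read = raid1_performability_estimate(data_disks, MTTDL_PAIR_HOURS, "read")
    write = raid1_performability_estimate(data_disks, MTTDL_PAIR_HOURS, "write")
    print(f"D = {data_disks:2d}: read {read:.3e}, write {write:.3e}")
```

The number of data disks cancels out of the product, so the estimate depends only on the MTTDL of one mirrored pair and on the type of the operation.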

9.1.5 Conclusions of performability analysis

The conclusions of the performability analysis are gathered in the following list:

• Performability of a disk array can be well approximated by multiplying the MTTDL of the array by the reward function of the fault-free state when the repair rates are much higher than the failure rates.

• Performability of RAID-0 and RAID-1 arrays is constant regardless of the number of disks in the array. Higher performance is achieved with a larger number of disks, but at the expense of reduced reliability.


Figure 41. Performability of RAID-1 and RAID-5 arrays

• Performability of a RAID-5 array decreases as the number of disks increases. This is because the reliability drops faster than the performance increases.

• A RAID-1 array provides better performability than a RAID-5 array with the same number of data disks. The penalty for the higher performability of the RAID-1 array is the larger number of disks in the array and, consequently, a higher number of failed disks.

• A scanning algorithm can improve performability. The scanning algorithm first increases the performability, as the disk array reliability increases while the performance degradation remains moderate. When the scanning activity increases further, the reliability no longer increases because the reliability bottleneck becomes the disk unit faults, while the performance of the array keeps dropping. Thus, the performability also decreases.

• The increased speed of the repair process affects the performability by improving the reliability, while the effect on the average performance is marginal. The only reason to limit the speed of the repair process is to guarantee a certain performance even with a crippled array.
