A Disk RAID array Volume Manager for Disk

of these models is relevant, gaining the benefits of both. In [11], the authors explore the use of a trie to
store sequences of file system events, and their frequencies, thereby establishing multiple contexts to make
prefetching decisions. Griffioen and Appleton in [7] and [8] record file events in a probability graph. Each
node in the graph is a file; directed edges from each node to other nodes represent related accesses, and
edges are weighted by frequency of occurrence. Prefetching decisions are then made by choosing accesses
with the largest probability, conditioned on the fact that the file being prefetched for has just been opened.
Madhyastha and Reed [15, 14] propose a system that optimizes input/output performance by recognizing
file access patterns and choosing appropriate caching policies to match application needs. The authors
also discuss the concept of adaptive steering which uses a feedback system to determine what file system
policy to use. They claim that a priori identification of file system policy is impossible because policies
may depend on data sets or unforeseeable environments, and inaccurate hints may worsen the problem. To
classify each access pattern into a category, each of which corresponds to a different file policies, the authors
used a pre-trained neural network. Their approach is similar to ours in that it uses workload information
to decide between a set of algorithms, but it differs in several important ways. Madhyastha and Reed are
changing caching behavior instead of actually redoing disk layout as explored here. Their system works on
a per-process basis, while our system analyzes global behavior. In addition, they use a pre-trained neural
network, while we propose online learning. Finally, they automatically assign a given type of workload to
an optimization algorithm, while we attempt to learn this association.
To summarize, researchers have explored a variety of heuristics for disk layout optimization, as well
as hand-crafted combinations of heuristics. In other policy decisions they have explored methods to choose
from a set of heuristics. However, this work is the first to explore a general combination method for heuristics
in disk layout.
3 Two-tiered Learning For Layout
This work focuses on the problem of identifying a disk layout that improves performance for a given workload.
At the most general level, this is an optimization problem that takes as input a workload and outputs a
new layout. However, due to the size of the state space, solving this problem using a brute-force optimization
algorithm is intractable; a typical disk has millions of blocks, and workloads often contain millions of
requests.
For this reason most approaches take a much simpler view of the problem, and only implement handcrafted
heuristics. These heuristics will improve performance if the assumptions they make are true, but will
4
....
workload
heuristic 1 heuristic n
constraint generation
adaptive combiner
....
weighted
constraint set
consistent
constraint set
performance update weights
analyzer
layout manager learner
constraints
candidate layout
final layout
Figure 2: The two-tiered learning architecture. This figure illustrates the two components of a two-tiered architecture for
refining disk layout. The constraint generation layer consists of the individual heuristics and their constraints. The adaptive
combiner layer merges the constraints and refines the resulting data layout.
not work in different cases and may miss opportunities for other optimization.
The two-tiered learning architecture takes a middle ground, by using heuristics to build up a smaller,
more directed state space. The system then searches for better performing disk layouts within this space.
This makes the system more flexible than the single heuristic approach, yet keeps the problem tractable,
hopefully leading to a manageable state space.
In effect, the two-tiered architecture allows the system designer to prune the search space with a set
of well-known heuristics. This makes the problem more tractable, while still allowing an adaptive solution.
Finding the balance between flexibility and problem size is a key point of this kind of system. If the field
is too narrow, (e.g. too few heuristics) the system will be unable to find the good solutions because they
will already have been eliminated. However, if the state space is allowed to grow too large (e.g. too many
heuristics), the system will be unable to find a good solution because it is overwhelmed.
4 Design
Figure 2 illustrates the two-tiered learning architecture. The constraint generation layer consists of a collection
of independent heuristics. Each of these heuristics generates a set of constraints, or invariants, based
on the workload. The adaptive combiner consists of three parts: the learner, the layout manager, and the
performance analyzer.
5
Constraint Type Description
place-in-region place blocks into a specific region
place-seq place blocks sequentially
Table 1: Common Constraint Language. This table shows the constraints we use in our implementation.
The learner assigns a value measurement, called a weight, to each of the constraints based on the
performance of previous disk layouts. It then uses the weights to decide which constraints to apply, and then
to resolve any conflicts between them, creating a single set of consistent constraints.
The layout manager takes the constraint set from the learner and builds a new disk layout.
The performance analyzer takes each candidate layout and determines its success based on the target
performance metrics and the given workload. We use the performance metric of average response time in
this paper. The results of the performance analyzer are then passed to the learner for use in updating the
constraint weights.
The adaptive combiner iterates through these steps to refine the layout. In order to search the state
space for the maximum allowed time, the adaptive combiner holds the best observed layout. If at any time
the adaptive combiner must be stopped (e.g., due to computation constraints or diminishing returns), it can
immediately output the best observed layout.
The remainder of this section describes the heuristics, learner, layout manager, and performance analyzer
in more detail.
4.1 Heuristics
The first level of the architecture is the heuristic layer. At this layer, a group of independent heuristics
propose changes to the disk layout to improve performance. A prerequisite to doing so is having a common
language in which to express heuristic suggestions.
4.1.1 Common Constraint Language
The learner uses constraints as the common language used by disk organization heuristics. A constraint is
an invariant on the disk layout that a heuristic considers important. A constraint may be specified on a single
disk block or a set of blocks. The learner combines heuristics by selectively applying their constraints.
Table 1 shows two example constraints. Place-in-region constraints specify in which region of the disk
6
a set of blocks should be placed.1 Place-seq constraints specify a set of blocks that should be placed sequentially.
These two constraint types have been sufficient for many heuristics, but additions are expected for
some future heuristics. For example, heuristics that exploit replication and low-level device characteristics
(e.g. track-aligned extents [26]) may require additional constraints.
By dividing each heuristic’s suggestions into constraints, the learner can apply the pieces of the heuristic
that are helpful and ignore those that are not. This allows for a finer granularity in dealing with the problem,
but also increases the size of the problem. Experimentation into ways to decrease the granularity of the
problem, such as dealing with reorganization at a file level could be very interesting. There is a trade-off
between finding a granularity that effectively captures the patterns of the workload, without exposing the
learner to too many options.
4.1.2 Example Heuristics
This section describes the five heuristics currently used in the prototype constraint generation layer.
Disk Shuffling: The disk shuffling heuristic generates place-in-region constraints to place frequently
accessed data near the middle of the disk and less frequently accessed data towards the edges [23]. Such an
arrangement reduces the average seek distance of requests. This heuristic is done at a large granularity, to
allow for sequential placement of data within the disk shuffling arrangement.
Threading: The threading heuristic generates place-seq constraints for sets of blocks usually accessed
in a sequential manner. Doing so exploits the efficiency of disk streaming. It is implemented using a firstorder
Markov-model [20] that tracks the probability of one block following another.
Run clustering: The run clustering heuristic is similar to the threading heuristic, but it is built to
find sequences in a multi-threaded environment. In this environment, blocks in a sequence may not occur
immediately after one another, but may have other blocks interspersed in the sequence. The run packing
heuristic keeps track of a window of previous requests, weighting requests closer in the sequence more
highly than those farther away [9].
Front loading: This heuristic generates place-in-region constraints to place frequently accessed data
at the beginning of the disk, where the streaming bandwidth of the disk is higher (due to zoned recording.)
Bad: This heuristic is used purely for testing the adaptive combiner. It is built to hurt performance, to
simulate a poorly implemented or poorly fitted heuristic. It generates place-in-region constraints to spread
the most frequently accessed blocks across the disk in an attempt to destroy locality. We do not plot results
for the bad heuristic, because it increases response times by at least an order of magnitude.
1The system currently divides the disk into 11 regions to simplify placement constraints.