Challenges in Building a Two-Tiered Learning Architecture for Disk Layout

Brandon Salmon, Eno Thereska, Craig A.N. Soules, John D. Strunk, Gregory R. Ganger

August 2004
CMU-PDL-04-109

School of Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

Abstract

Choosing the correct settings for large systems can be a daunting task. The performance of the system is often heavily dependent upon these settings, and the “correct” settings are often closely coupled with the workload. System designers usually resort to a set of heuristic approaches that are known to work well in some cases. However, hand-combining these heuristics is painstaking and fragile. We propose a two-tiered architecture that makes this combination transparent and robust, and describe an application of the architecture to the problem of disk layout optimization. The two-tiered architecture consists of a set of independent heuristics and an adaptive method of combining them. However, building such a system has proved more difficult than expected. Each heuristic depends heavily on decisions from other heuristics, making it difficult to break the problem into smaller pieces. This paper outlines our approaches and how they have worked, discusses the biggest challenges in building the system, and mentions additional possible solutions. Whether this problem is solvable remains open to debate, but the experiences reported provide a cautionary tale: system policy automation is complex and difficult.

We thank the members and companies of the PDL Consortium (including EMC, Engenio, Hewlett-Packard, HGST, Hitachi, IBM, Intel, Microsoft, Network Appliance, Oracle, Panasas, Seagate, Sun, and Veritas) for their interest, insights, feedback, and support. This work is partially funded by the National Science Foundation, via grant #CCR-0326453.
Keywords: disk layout, adaptive, self-managing, self-tuning, learning, automated tuning

[Figure 1 shows two block diagrams: (a) a traditional monolithic architecture, in which the workload feeds a hand-built combination of heuristics (disk shuffling, organize based on system hints, name-space locality, organize by age, place track-aligned) that produces the new layout; (b) the two-tiered architecture, in which heuristics 1..n supply constraints to an adaptive combiner that produces the new layout, refined by feedback.]

Figure 1: Two-tiered vs. traditional architecture for adaptive layout software. The traditional architecture combines different heuristics in an ad-hoc fashion, usually using a complicated mesh of if-then-else logic. This combination is difficult to build and tends to be fragile. The two-tiered architecture separates the heuristics from the combiner and uses feedback to refine its decisions and utilize the best parts of each heuristic. This provides a flexible and robust way to combine a set of heuristics.

1 Introduction

Internal system policies, such as on-disk layout and disk prefetching, have been the subject of decades of research. Researchers try to identify algorithms that work well for different workload mixes, developers try to decide which to use and how to configure them, and administrators must decide on values for tunable parameters (e.g., run lengths and prefetch horizons). Unfortunately, this process places a significant burden on developers and administrators, yet still may not perform well in the face of new and changing workloads. To address this, researchers now strive for automated algorithms that learn the right settings for a given system. Of course, different configurations work best for different workloads, meaning that any particular setup will work well for one workload and poorly for others. Worse, most deployed systems support many workloads simultaneously, potentially making any single decision suboptimal for the aggregate.
Devising a composite algorithm for such circumstances can be a daunting task, and updating such an algorithm as workloads change is even more so. This paper describes a two-tiered architecture for such automated self-tuning software, using on-disk data layout as a concrete example [24]. Instead of a single monolithic algorithm, as illustrated in Figure 1a, the decision-making software consists of a set of independent heuristics and an adaptive combiner, used for merging the heuristics’ suggested solutions, as shown in Figure 1b. Each heuristic implements a single policy that hopefully works well in some circumstances, but not necessarily in all. Heuristics provide suggested constraints on the final layout, such as placing a given block in a given region or allocating a set of blocks sequentially. The adaptive combiner uses prediction models and on-line observations to balance and merge conflicting constraints.

This two-tiered architecture would provide at least three benefits. First, heuristic implementations could focus on particular workload characteristics, making local decisions without concern for global consequences. Second, new heuristics could be added easily, without changing the rest of the software. In fact, bad (or poorly implemented) heuristics could even be tolerated, because their constraints would be identified as less desirable and ignored. Third, the adaptive combiner could balance constraints without knowledge of or concern for how they were generated. The overall result would be a simpler and more robust software structure.

However, for this architecture to work, an efficient learning agent must be developed to actually combine the heuristics' suggestions. While the problem seems reasonably solvable at first glance, it is actually quite difficult. This paper describes our experiences in trying to build such a system and the lessons learned. It presents several failed attempts to solve the problem and analyzes why they did not work.
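To make the division of labor concrete, the heuristic/combiner split described above can be sketched as a minimal interface. All class, method, and field names below are illustrative assumptions rather than the prototype's actual code, and the combiner shown resolves conflicts with a simple per-block weight comparison instead of real prediction models:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Constraint:
    """A heuristic's suggestion: co-locate these blocks in a disk region."""
    blocks: List[int]   # logical block numbers to place together
    region: int         # preferred disk region index
    heuristic: str      # name of the heuristic that produced the suggestion

class Heuristic:
    """A single-policy heuristic; knows nothing about the other heuristics."""
    def suggest(self, workload) -> List[Constraint]:
        raise NotImplementedError

class AdaptiveCombiner:
    """Merges conflicting constraints, weighting heuristics by observed benefit."""
    def __init__(self, heuristics: List[Heuristic]):
        self.heuristics = heuristics
        self.weights = {h.__class__.__name__: 1.0 for h in heuristics}

    def combine(self, workload) -> Dict[int, Constraint]:
        # Keep, for each block, the suggestion from the highest-weighted
        # heuristic. A real combiner would also model interactions between
        # constraints instead of deciding block by block.
        best = {}
        for h in self.heuristics:
            w = self.weights[h.__class__.__name__]
            for c in h.suggest(workload):
                for b in c.blocks:
                    if b not in best or w > best[b][0]:
                        best[b] = (w, c)
        return {b: c for b, (w, c) in best.items()}

    def feedback(self, name: str, observed_benefit: float):
        # Multiplicative update in the spirit of weighted-majority schemes:
        # heuristics whose suggestions helped gain influence over time.
        self.weights[name] *= 1.0 + observed_benefit
```

A real combiner would also have to honor multi-block constraints atomically (e.g., sequential runs), which is precisely where the interdependence difficulties described later arise.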
It remains an open question whether the problem is solvable, but our efforts highlight an important point: policy automation is difficult in practice.

The remainder of this paper is organized as follows. Section 2 discusses related work. Sections 3 and 4 discuss the overall design of the system. Section 5 outlines the experimental setup for the paper, and Section 6 describes an evaluation of the system. Section 7 describes some of the challenges in building a functional system, and Section 8 describes various tradeoffs in building the system. Sections 9 and 10 describe future work and conclusions.

2 Related Work

The AI community continues to develop and extend the capabilities of automated learning systems. The systems community is adopting these automated approaches to address hard problems in systems management. This section discusses related work from both the AI and systems perspectives.

AI techniques: The AI community has long recognized the need for self-managing systems. In fact, a whole branch of AI research, machine learning, exists especially to solve real-life problems where human involvement is not practical [17]. One general AI problem of relevance is the n-experts problem, in which a system must choose between the outputs of different experts. The n-experts problem is not an exact match to our problem, because we are merging experts’ suggestions rather than choosing one. Nonetheless, solutions such as the weighted majority algorithm [12] provide valuable insight. Another general challenge for the AI community is the efficient exploration of a large state space (i.e., the set of all possible solutions to an optimization problem). For example, our current prototype explores its state space using a variety of guided hill-climbing algorithms and a method similar to simulated annealing to avoid local maxima. Genetic algorithms [6] are another approach to optimization problems that does not require a detailed understanding of the underlying trade-offs.
The method works by choosing a population of solutions, evaluating their effectiveness, and selecting a new population from the solutions that perform well, along with mutations and combinations of them. While our problem could be adapted to a genetic algorithm, we fear it might require more than the 50-100 iterations that are feasible in our current system, given the approach’s lack of knowledge of the problem specifics. However, exploration of this technique would be interesting.

Adaptive disk layout techniques: A disk layout is the mapping between a system’s logical view of storage and physical disk locations. Useful heuristics have been devised based on block-level access patterns and file-level information. Block-based heuristics arrange the layout of commonly accessed blocks to minimize access latency. For example, Ruemmler and Wilkes [23] explored putting frequently used data in the middle of the disk to minimize seek time. Wang and Hu [29] tuned a log-structured file system [21] by putting active segments in the high-bandwidth areas of the disk. Several researchers [1, 13, 18, 30] have explored replication of disk blocks to minimize seek and rotational latencies. Hsu et al. propose the ALIS system [9], which combines two disk layout heuristics, heat clustering and run packing. The project shows a good hand-crafted combination of two heuristics, but does not attempt to combine an arbitrary set of conflicting heuristics.

File-based heuristics use information about inter- and intra-file relationships to co-locate related blocks. For example, most file systems try to allocate the blocks of a file sequentially. C-FFS [5] allocates adjacently the data blocks of multiple small files named by the same directory. Hummingbird [10] and Cheetah [27] perform similar grouping for related web objects (e.g., an HTML document and its embedded images).
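As a toy illustration of the block-based heuristics cited above, the following sketch places the hottest blocks near the middle of the disk, in the spirit of the Ruemmler and Wilkes placement; it is a simplification for illustration, not the cited algorithm, and the function name is our own:

```python
def organ_pipe_layout(access_counts):
    """Toy heat-based placement: hottest blocks end up mid-disk.

    `access_counts` maps a logical block to its access frequency; the
    returned list is the physical order, with frequency falling off from
    the center toward both edges.
    """
    # Sort blocks from hottest to coldest.
    hot_first = sorted(access_counts, key=access_counts.get, reverse=True)
    layout = []
    # Alternate between appending right and inserting left, so the single
    # hottest block stays centered and colder blocks drift outward.
    for i, block in enumerate(hot_first):
        if i % 2 == 0:
            layout.append(block)
        else:
            layout.insert(0, block)
    return layout
```

For example, frequencies {a: 10, b: 5, c: 3, d: 1} yield the order d, b, a, c, putting the hottest block a at the center.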
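The state-space exploration our prototype uses (guided hill climbing with an annealing-like escape from local maxima, noted in the AI-techniques discussion above) can be sketched generically as follows. The function name, parameter values, scoring function, and move generator are placeholder assumptions, not the prototype's actual models:

```python
import math
import random

def anneal(initial, score, neighbor, iterations=100, temp=1.0, cooling=0.95,
           rng=None):
    """Guided hill climbing with an annealing-style escape from local maxima.

    `score` maps a candidate layout to an estimated benefit (higher is
    better) and `neighbor` proposes a small mutation; both stand in for the
    prototype's prediction models and constraint-level moves.
    """
    rng = rng or random.Random(0)
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    for _ in range(iterations):
        candidate = neighbor(current, rng)
        candidate_score = score(candidate)
        # Accept improvements outright; accept regressions with a
        # probability that shrinks as the temperature cools.
        if (candidate_score >= current_score
                or rng.random() < math.exp((candidate_score - current_score) / temp)):
            current, current_score = candidate, candidate_score
        if current_score > best_score:
            best, best_score = current, current_score
        temp *= cooling
    return best, best_score
```

The early high-temperature phase permits occasional downhill moves, which is what lets the search leave a local maximum that pure hill climbing would be stuck in.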
Other system management policy techniques: Relevant research has also been done on storage system policies, such as caching and prefetching [3, 19]. Similar schemes have been used in other domains as well, such as branch prediction in modern processors [16, 28]. For example, in NOAH [2], Amer et al. extend the first and last successor models for predictive caching. It provides a way to determine which