Challenges in Building a Two-Tiered Learning Architecture for Disk Layout

Brandon Salmon, Eno Thereska, Craig A.N. Soules, John D. Strunk, Gregory R. Ganger

August 2004
CMU-PDL-04-109

School of Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA 15213

Abstract

Choosing the correct settings for large systems can be a daunting task. The performance of the system is often heavily dependent upon these settings, and the “correct” settings are often closely coupled with the workload. System designers usually resort to a set of heuristic approaches that are known to work well in some cases. However, hand-combining these heuristics is painstaking and fragile. We propose a two-tiered architecture that makes this combination transparent and robust, and describe an application of the architecture to the problem of disk layout optimization. The two-tiered architecture consists of a set of independent heuristics and an adaptive method of combining them. However, building such a system has proved more difficult than expected. Each heuristic depends heavily on decisions from other heuristics, making it difficult to break the problem into smaller pieces. This paper outlines our approaches and how they have worked, discusses the biggest challenges in building the system, and mentions additional possible solutions. Whether this problem is solvable remains open to debate, but the experiences reported provide a cautionary tale: system policy automation is complex and difficult.

We thank the members and companies of the PDL Consortium (including EMC, Engenio, Hewlett-Packard, HGST, Hitachi, IBM, Intel, Microsoft, Network Appliance, Oracle, Panasas, Seagate, Sun, and Veritas) for their interest, insights, feedback, and support. This work is partially funded by the National Science Foundation, via grant #CCR-0326453.
Keywords: disk layout, adaptive, self-managing, self-tuning, learning, automated tuning

[Figure 1 shows two block diagrams: (a) a traditional monolithic architecture, in which the workload feeds a hand-built combination of heuristics (disk shuffling, organize based on system hints, name-space locality, organize by age, place track-aligned) that produces the new layout; (b) the two-tiered architecture, in which heuristics 1..n supply constraints to an adaptive combiner that produces the new layout, refined by feedback.]

Figure 1: Two-tiered vs. traditional architecture for adaptive layout software. The traditional architecture combines different heuristics in an ad-hoc fashion, usually using a complicated mesh of if-then-else logic. This combination is difficult to build and tends to be fragile. The two-tiered architecture separates the heuristics from the combiner and uses feedback to refine its decisions and utilize the best parts of each heuristic. This provides a flexible and robust way to combine a set of heuristics.

1 Introduction

Internal system policies, such as on-disk layout and disk prefetching, have been the subject of decades of research. Researchers try to identify algorithms that work well for different workload mixes, developers try to decide which to use and how to configure them, and administrators must decide on values for tunable parameters (e.g., run lengths and prefetch horizons). Unfortunately, this process places a significant burden on developers and administrators, yet still may not perform well in the face of new and changing workloads. To address this, researchers now strive for automated algorithms that learn the right settings for a given system. Of course, different configurations work best for different workloads, meaning that any particular setup will work well for one workload and poorly for others. Worse, most deployed systems support many workloads simultaneously, potentially making any single decision suboptimal for the aggregate.
Devising a composite algorithm for such circumstances can be a daunting task, and updating such an algorithm as workloads change is even more so. This paper describes a two-tiered architecture for such automated self-tuning software, using on-disk data layout as a concrete example [24]. Instead of a single monolithic algorithm, as illustrated in Figure 1a, the decision-making software consists of a set of independent heuristics and an adaptive combiner, used for merging the heuristics’ suggested solutions, as shown in Figure 1b. Each heuristic implements a single policy that hopefully works well in some circumstances, but not necessarily in all. Heuristics provide suggested constraints on the final layout, such as placing a given block in a given region or allocating a set of blocks sequentially. The adaptive combiner uses prediction models and on-line observations to balance and merge conflicting constraints.

This two-tiered architecture would provide at least three benefits. First, heuristic implementations could focus on particular workload characteristics, making local decisions without concern for global consequences. Second, new heuristics could be added easily, without changing the rest of the software. In fact, bad (or poorly implemented) heuristics could even be tolerated, because their constraints would be identified as less desirable and ignored. Third, the adaptive combiner could balance constraints without knowledge of or concern for how they were generated. The overall result would be a simpler and more robust software structure.

However, for this architecture to work, an efficient learning agent must be developed to actually combine the heuristics' suggestions. While the problem seems reasonably solvable at first glance, it is actually quite difficult. This paper describes our experiences in trying to build such a system and the lessons learned. It presents several failed attempts to solve the problem and analyzes why they did not work.
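To make the division of labor concrete, the heuristic/combiner split described above can be sketched as a minimal interface. All class, method, and field names below are illustrative assumptions rather than the prototype's actual code, and the combiner shown resolves conflicts with a simple per-block weight comparison instead of real prediction models:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Constraint:
    """A heuristic's suggestion: co-locate these blocks in a disk region."""
    blocks: List[int]   # logical block numbers to place together
    region: int         # preferred disk region index
    heuristic: str      # name of the heuristic that produced the suggestion

class Heuristic:
    """A single-policy heuristic; knows nothing about the other heuristics."""
    def suggest(self, workload) -> List[Constraint]:
        raise NotImplementedError

class AdaptiveCombiner:
    """Merges conflicting constraints, weighting heuristics by observed benefit."""
    def __init__(self, heuristics: List[Heuristic]):
        self.heuristics = heuristics
        self.weights = {h.__class__.__name__: 1.0 for h in heuristics}

    def combine(self, workload) -> Dict[int, Constraint]:
        # Keep, for each block, the suggestion from the highest-weighted
        # heuristic. A real combiner would also model interactions between
        # constraints instead of deciding block by block.
        best = {}
        for h in self.heuristics:
            w = self.weights[h.__class__.__name__]
            for c in h.suggest(workload):
                for b in c.blocks:
                    if b not in best or w > best[b][0]:
                        best[b] = (w, c)
        return {b: c for b, (w, c) in best.items()}

    def feedback(self, name: str, observed_benefit: float):
        # Multiplicative update in the spirit of weighted-majority schemes:
        # heuristics whose suggestions helped gain influence over time.
        self.weights[name] *= 1.0 + observed_benefit
```

A real combiner would also have to honor multi-block constraints atomically (e.g., sequential runs), which is precisely where the interdependence difficulties described later arise.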
It remains an open question whether the problem is solvable, but our efforts highlight an important point: policy automation is difficult in practice.

The remainder of this paper is organized as follows. Section 2 discusses related work. Sections 3 and 4 discuss the overall design of the system. Section 5 outlines the experimental setup for the paper, and Section 6 describes an evaluation of the system. Section 7 describes some of the challenges in building a functional system, and Section 8 describes various tradeoffs in building the system. Sections 9 and 10 describe future work and conclusions.

2 Related Work

The AI community continues to develop and extend the capabilities of automated learning systems. The systems community is adopting these automated approaches to address hard problems in systems management. This section discusses related work from both the AI and systems perspectives.

AI techniques: The AI community has long recognized the need for self-managing systems. In fact, a whole branch of AI research, machine learning, exists especially to solve real-life problems where human involvement is not practical [17]. One general AI problem of relevance is the n-experts problem, in which a system must choose between the outputs of different experts. The n-experts problem is not an exact match to our problem, because we are merging experts’ suggestions rather than choosing one. Nonetheless, solutions such as the weighted majority algorithm [12] provide valuable insight. Another general challenge for the AI community is the efficient exploration of a large state space (i.e., the set of all possible solutions to an optimization problem). For example, our current prototype explores its state space using a variety of guided hill-climbing algorithms and a method similar to simulated annealing to avoid local maxima. Genetic algorithms [6] are another approach to optimization problems that does not require a detailed understanding of the underlying trade-offs.
The method works by choosing a population of solutions, evaluating their effectiveness, and selecting a new population from the solutions that perform well, along with mutations and combinations of them. While our problem could be adapted to a genetic algorithm, we fear it might require more than the 50-100 iterations that are feasible in our current system, given the approach’s lack of knowledge of the problem specifics. However, exploration of this technique would be interesting.

Adaptive disk layout techniques: A disk layout is the mapping between a system’s logical view of storage and physical disk locations. Useful heuristics have been devised based on block-level access patterns and file-level information. Block-based heuristics arrange the layout of commonly accessed blocks to minimize access latency. For example, Ruemmler and Wilkes [23] explored putting frequently used data in the middle of the disk to minimize seek time. Wang and Hu [29] tuned a log-structured file system [21] by putting active segments in the high-bandwidth areas of the disk. Several researchers [1, 13, 18, 30] have explored replication of disk blocks to minimize seek and rotational latencies. Hsu et al. propose the ALIS system [9], which combines two disk layout heuristics, heat clustering and run packing. The project shows a good hand-crafted combination of two heuristics, but does not attempt to combine an arbitrary set of conflicting heuristics.

File-based heuristics use information about inter- and intra-file relationships to co-locate related blocks. For example, most file systems try to allocate the blocks of a file sequentially. C-FFS [5] allocates adjacently the data blocks of multiple small files named by the same directory. Hummingbird [10] and Cheetah [27] perform similar grouping for related web objects (e.g., an HTML document and its embedded images).
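As a toy illustration of the block-based heuristics cited above, the following sketch places the hottest blocks near the middle of the disk, in the spirit of the Ruemmler and Wilkes placement; it is a simplification for illustration, not the cited algorithm, and the function name is our own:

```python
def organ_pipe_layout(access_counts):
    """Toy heat-based placement: hottest blocks end up mid-disk.

    `access_counts` maps a logical block to its access frequency; the
    returned list is the physical order, with frequency falling off from
    the center toward both edges.
    """
    # Sort blocks from hottest to coldest.
    hot_first = sorted(access_counts, key=access_counts.get, reverse=True)
    layout = []
    # Alternate between appending right and inserting left, so the single
    # hottest block stays centered and colder blocks drift outward.
    for i, block in enumerate(hot_first):
        if i % 2 == 0:
            layout.append(block)
        else:
            layout.insert(0, block)
    return layout
```

For example, frequencies {a: 10, b: 5, c: 3, d: 1} yield the order d, b, a, c, putting the hottest block a at the center.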
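The state-space exploration our prototype uses (guided hill climbing with an annealing-like escape from local maxima, noted in the AI-techniques discussion above) can be sketched generically as follows. The function name, parameter values, scoring function, and move generator are placeholder assumptions, not the prototype's actual models:

```python
import math
import random

def anneal(initial, score, neighbor, iterations=100, temp=1.0, cooling=0.95,
           rng=None):
    """Guided hill climbing with an annealing-style escape from local maxima.

    `score` maps a candidate layout to an estimated benefit (higher is
    better) and `neighbor` proposes a small mutation; both stand in for the
    prototype's prediction models and constraint-level moves.
    """
    rng = rng or random.Random(0)
    current, current_score = initial, score(initial)
    best, best_score = current, current_score
    for _ in range(iterations):
        candidate = neighbor(current, rng)
        candidate_score = score(candidate)
        # Accept improvements outright; accept regressions with a
        # probability that shrinks as the temperature cools.
        if (candidate_score >= current_score
                or rng.random() < math.exp((candidate_score - current_score) / temp)):
            current, current_score = candidate, candidate_score
        if current_score > best_score:
            best, best_score = current, current_score
        temp *= cooling
    return best, best_score
```

The early high-temperature phase permits occasional downhill moves, which is what lets the search leave a local maximum that pure hill climbing would be stuck in.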
Other system management policy techniques: Relevant research has also been done on storage system policies, such as caching and prefetching [3, 19]. Similar schemes have been used in other domains as well, such as branch prediction in modern processors [16, 28]. For example, in NOAH [2], Amer et al. extend the first and last successor models for predictive caching. It provides a way to determine which