15-712: Advanced Operating Systems & Distributed Systems

Papers

For each class meeting, readings are assigned. Usually, the readings will consist of two or three computer systems papers. The papers selected for this course are either classic papers or papers from recent top conferences. You are expected to read these papers thoroughly and summarize them BEFORE arriving at class. For each class meeting, we identify the topic and papers below; for each, we also try to identify good sources for background reading and for further investigation.

Electronic versions are linked where available (access will be denied for IP addresses outside of CMU); for paper copies, visit the 712 drawer of the course file cabinet (on the D level of Hamerschlag Hall, just outside the elevator). We will try to provide paper copies of the assigned readings at least a week in advance.

(NOTE: This schedule is not set in stone. Some changes may be made to this schedule during the term. As well, guest lectures are still being determined.)

November 13: Composable Systems

32. File-System Development with Stackable Layers
J. Heidemann, G. Popek, ACM Transactions on Computer Systems, vol. 12, no. 1, February 1994, pp. 58-89.
33. Scripting: Higher-Level Programming for the 21st Century
J. Ousterhout, IEEE Computer, vol. 31, no. 3, March 1998, pp. 23-30.

As computer systems become more and more complex, it becomes important to be able to compose them of substantial pre-existing components. Far from new, composability has appeared in many aspects of system construction in many different fashions. This class meeting will look at various forms of composability that have arisen in systems, such as module switches (e.g., vnode interfaces), stackable layers (e.g., in file systems and networks), pipelined applications (e.g., the UNIX programming model), tweakable interfaces, and glue-logic via scripting. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two specific examples of composability in systems. The first describes stackable file systems, which allow functionality to be added and removed easily and incrementally. The second discusses the importance of scripting languages (e.g., Perl and Tcl) in the construction and extension of systems.
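To make the stacking idea concrete, here is a minimal sketch in Python. It is not the design from the Heidemann and Popek paper: the class names (MemoryLayer, Layer, CompressionLayer, XorLayer) and the two-operation read/write interface are assumptions invented for illustration. The point is only that every layer exports the same interface it consumes, so layers can be added or removed without changing the ones around them.

```python
import zlib


class MemoryLayer:
    """Bottom layer: keeps file contents in a dict (stand-in for a real disk file system)."""

    def __init__(self):
        self._files = {}

    def write(self, name, data):
        self._files[name] = data

    def read(self, name):
        return self._files[name]


class Layer:
    """Pass-through layer: forwards every operation to the layer below it."""

    def __init__(self, lower):
        self.lower = lower

    def write(self, name, data):
        self.lower.write(name, data)

    def read(self, name):
        return self.lower.read(name)


class CompressionLayer(Layer):
    """Compresses data on the way down and decompresses it on the way up."""

    def write(self, name, data):
        self.lower.write(name, zlib.compress(data))

    def read(self, name):
        return zlib.decompress(self.lower.read(name))


class XorLayer(Layer):
    """Toy scrambling layer (XOR with one byte). Illustrative only, NOT secure."""

    def __init__(self, lower, key=0x5A):
        super().__init__(lower)
        self.key = key

    def _scramble(self, data):
        return bytes(b ^ self.key for b in data)

    def write(self, name, data):
        self.lower.write(name, self._scramble(data))

    def read(self, name):
        return self._scramble(self.lower.read(name))


# Compose a stack: scrambling on top of compression on top of storage.
base = MemoryLayer()
fs = XorLayer(CompressionLayer(base))
original = b"hello, stackable layers! " * 20
fs.write("notes.txt", original)
round_trip = fs.read("notes.txt")
stored = base.read("notes.txt")
```

Dropping a layer is a one-line change (e.g., fs = CompressionLayer(base)), which is the incremental add/subtract property in miniature.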

I don't have a lot to offer in terms of suggestions for background reading at this time. The related work sections of the assigned papers provide some suggestions for further reading.

November 11 : OS Structure and Extensibility

30. Extensibility, Safety, and Performance in the SPIN Operating System
B. Bershad, S. Savage, et al., ACM Symposium on Operating Systems Principles, December 1995, pp. 267-284.
31. Application Performance and Flexibility on Exokernel Systems
M.F. Kaashoek, D. Engler, et al., ACM Symposium on Operating Systems Principles, December 1997, pp. 52-65.

The structure of operating systems has been the topic of much research and debate, since it has such a large impact on complexity, performance, flexibility, robustness, security, etc. This class meeting will look at various options and their trade-offs, including monolithic kernels (e.g., Linux, Windows), microkernels (e.g., Mach), virtual machine monitors (e.g., VM/370), and recent research systems. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two recently researched operating system structures focused on enabling safe extensibility. The first describes Spin, which uses a type-safe programming language (Modula-3) to enable code to be safely loaded into its kernel. The second describes exokernels, which minimize kernel functionality to that required for protection, pushing all high-level abstractions into application libraries where they can be replaced by application writers at will. It is worth noting that not everyone buys into the value of extensible systems -- a fun rebuttal is given in Extensible Systems are Leading OS Researchers Astray (by P. Druschel, V. Pai, W. Zwaenepoel, IEEE Workshop on Hot Topics in Operating Systems (HotOS-6), May 1997, pp. 38-42).

Background material for this topic is just basic operating systems, though specifically relevant discussion can be found in Tanenbaum's Modern Operating Systems. For additional details on the research systems described in the papers, see the Spin and exokernel project pages.

November 6: Privacy and Censorship-Resistance

28. Publius: A robust, tamper-evident, censorship-resistant, web publishing system
M. Waldman, A. Rubin, and L. Cranor, 9th USENIX Security Symposium, August, 2000.
29. The Design, Implementation, and Operation of an Email Pseudonym Server
D. Mazieres and M.F. Kaashoek, 5th ACM Conference on Computer and Communications Security, 1998.

One of the most heated topics for the future will center on issues of privacy, anonymity, and censorship-resistance. Although there are many questions of ethics and law, there are also difficult technical issues related to making these possible at all. Most technological advances (faster computing, more storage, better face recognition, smarter data mining) march against these historical givens, as we are rapidly closing in on the complete set of technologies needed to create Big Brother. During this class period, we will talk about a few of the ideas being explored to help hold back the tide. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe insights from two specific projects. The first describes Publius, which is a decentralized storage service that distributes document storage responsibilities among many independent parties. The second describes experiences with operating an e-mail pseudonym server, providing an interesting look at the practical difficulties faced.

The security books listed earlier provide some background on this sub-topic as well, as do the bibliographies of the papers. One interesting piece of background is Ross Anderson's "The Eternity Service", which focused many people's attention on the value of not losing the anonymous, censorship-resistant publication capability that has historically existed (and been so significant).

November 4: Fault Tolerance and System Structure

26. Recovery Management in Quicksilver
R. Haskin, Y. Malachi, W. Sawdon, G. Chan, ACM Transactions on Computer Systems, vol. 6, no. 1, February 1988, pp. 82-108.
27. Hive: Fault Containment for Shared-Memory Multiprocessors
J. Chapin, M. Rosenblum, et al., ACM Symposium on Operating Systems Principles, December 1995, pp. 12-25.

Fault-tolerance is a broad and important topic in operating systems and distributed systems. This class meeting will discuss a range of issues and approaches, complementing previous discussions. Some topics to be discussed include data redundancy, failover, checkpointing, state machines, N-versioning, fault containment, isolation, and failstop versus byzantine failures. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe fault tolerance strategies for particular systems. The first describes fault management in Quicksilver, which relied heavily on transactions and their recoverability properties. The second describes Hive, a cellular operating system architecture for large multiprocessor systems.

Fault-tolerance is another of the large areas of research and practice. For example, see the IEEE Technical Committee on Fault-Tolerant Computing. For some interesting and scary stories of insufficiently fault-tolerant systems and their consequences, see Neumann's Computer-Related Risks or Peterson's Fatal Defect.

November 1 : Group Communication and State Machines

24. Implementing Fault-Tolerant Services Using the State Machine Approach: A Tutorial
F. Schneider, ACM Computing Surveys, vol. 22, no. 4, December 1990.
25. Practical Byzantine Fault Tolerance
M. Castro, B. Liskov, USENIX Symposium on Operating Systems Design and Implementation (OSDI), February 1999.

As we've discussed, one reason for replacing centralized services with distributed systems is fault tolerance. Dealing with both clean (fail-stop) failures and misbehaving systems (e.g., compromised systems) can be done by having multiple systems doing exactly the same thing in lockstep. This class meeting will look at approaches and technologies for accomplishing such function replication. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe this method in general and in a particular application, distributed file service. The first provides a tutorial-like overview of replicated state machines. The second describes a fault-tolerant NFS replacement that employs some well-tuned algorithms.
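As a rough illustration of the replicated state machine idea, the Python sketch below runs the same ordered command log through several deterministic replicas and takes a majority vote over their replies. Everything here (CounterMachine, FaultyMachine, run_replicas, vote) is a hypothetical name made up for this example, and the hard part, getting all replicas to agree on the order of commands, is simply assumed; that agreement problem is what the assigned papers actually address.

```python
from collections import Counter


class CounterMachine:
    """A deterministic state machine: same commands in the same order give the same state."""

    def __init__(self):
        self.value = 0

    def apply(self, command):
        op, arg = command
        if op == "add":
            self.value += arg
        elif op == "set":
            self.value = arg
        else:
            raise ValueError("unknown operation: %r" % (op,))
        return self.value


class FaultyMachine(CounterMachine):
    """A misbehaving replica that always reports a bogus result."""

    def apply(self, command):
        super().apply(command)
        return -1


def run_replicas(replicas, log):
    """Feed every command, in log order, to every replica; collect the replies per command."""
    return [[replica.apply(command) for replica in replicas] for command in log]


def vote(replies):
    """Return the reply given by a strict majority of replicas, or fail if there is none."""
    value, count = Counter(replies).most_common(1)[0]
    if count <= len(replies) // 2:
        raise RuntimeError("no majority among replies: %r" % (replies,))
    return value


# Assumption: the log below has already been agreed upon by all replicas.
log = [("set", 5), ("add", 3), ("add", -2)]
replicas = [CounterMachine(), CounterMachine(), FaultyMachine()]
results = [vote(replies) for replies in run_replicas(replicas, log)]
```

With three replicas, the one wrong reply per command is outvoted by the two correct ones, so the client still sees the correct results.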

The first paper actually provides some pretty good background discussion, and we encourage those looking for further reading to follow up the Related Work sections.

October 25 : Event Ordering and Multi-Party Consensus

22. Time, Clocks, and the Ordering of Events in a Distributed System
L. Lamport, Communications of the ACM, vol. 21, no. 7, July 1978, pp. 558-565.
23. The Byzantine Generals Problem
L. Lamport, R. Shostak, M. Pease, ACM Transactions on Programming Languages and Systems, vol. 4, no. 3, July 1982, pp. 382-401.

The nature of distributed systems is such that autonomous systems don't see the same things, don't see things at the same time, and can't even easily agree on what time it is. These problems make it difficult for each to reason about what the others are up to and make it difficult for systems to come to consensus. This class meeting will discuss various issues of event ordering and coming to consensus. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today discuss different aspects of this general problem area. The first discusses the problems of event ordering and distributed clock synchronization. The second discusses the problem of coming to consensus when some systems misbehave.
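For those who like to see things in code, here is a small Python sketch of logical clocks in the commonly taught form: each process increments its counter on every event, stamps outgoing messages with the counter, and on receipt advances its counter past the message's timestamp. The Process class and its method names are assumptions for this example rather than anything taken from the paper, and the "max plus one" receive rule is one standard way to satisfy the paper's clock condition.

```python
class Process:
    """A process with a logical clock (a plain integer counter)."""

    def __init__(self, name):
        self.name = name
        self.clock = 0

    def local_event(self):
        # Every event advances the local clock.
        self.clock += 1
        return self.clock

    def send(self):
        # Sending is an event; the returned timestamp travels with the message.
        self.clock += 1
        return self.clock

    def receive(self, message_timestamp):
        # Jump past both the local clock and the sender's timestamp, so the
        # receive event is ordered after the corresponding send event.
        self.clock = max(self.clock, message_timestamp) + 1
        return self.clock


p = Process("P")
q = Process("Q")

p.local_event()                   # P's clock: 1
sent_at = p.send()                # P's clock: 2, message carries 2
q.local_event()                   # Q's clock: 1
received_at = q.receive(sent_at)  # Q's clock: max(1, 2) + 1 = 3
```

Note that the timestamps only guarantee that a send is ordered before its receive; two events with unrelated timestamps may still be concurrent.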

Much of the theoretical side of distributed systems is more or less focused on these two problems. Thus, many good distributed systems books address them thoroughly, including those on the reserved list.

October 23 : Concurrency, Threads, Transactions

20. Using Threads in Interactive Systems: A Case Study
C. Hauser, C. Jacobi, et al., ACM Symposium on Operating Systems Principles, December 1993, pp. 94-105.
21. On Optimistic Methods for Concurrency Control
H.T. Kung, J. Robinson, ACM Transactions on Database Systems, vol. 6, no. 2, June 1981, pp. 213-226.

One of the most basic (and yet persistently complex) aspects of both operating systems and distributed systems is dealing with concurrent threads of control. While your undergraduate OS course undoubtedly spent a lot of time talking about monitors and semaphores, there is a lot more to concurrency than that. This class meeting will discuss a number of such topics, including locking, avoiding deadlock, optimistic concurrency control, avoiding livelock, leases, and transactions. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today differ in what they offer. The first describes experiences with thread programming, offering a number of insights from serious usage. The second describes an optimistic approach to concurrency control, which allows locks to be completely avoided.
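To give a feel for the optimistic approach, the Python sketch below follows the read / validate / write structure, but with a simplified per-key version check rather than the validation algorithm from the Kung and Robinson paper. The Store and Transaction classes are hypothetical names for this example, and the sketch is single-threaded: in a real system the validate-and-write step in commit() would have to run atomically.

```python
class Store:
    """Shared data plus a version number per key (bumped on every committed write)."""

    def __init__(self):
        self.data = {}
        self.version = {}


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # key -> version observed during the read phase
        self.writes = {}         # key -> value, buffered privately until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Remember which version we saw the first time we read this key.
        self.read_versions.setdefault(key, self.store.version.get(key, 0))
        return self.store.data.get(key)

    def write(self, key, value):
        # No locks taken: the write stays in a private buffer.
        self.writes[key] = value

    def commit(self):
        # Validation phase: abort if anything we read has since been overwritten.
        for key, seen in self.read_versions.items():
            if self.store.version.get(key, 0) != seen:
                return False
        # Write phase: publish buffered writes and bump versions.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True


store = Store()
setup = Transaction(store)
setup.write("x", 10)
setup_ok = setup.commit()

# Two transactions race to increment x.
t1 = Transaction(store)
t2 = Transaction(store)
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") + 1)
t1_ok = t1.commit()   # succeeds: x has not changed since t1 read it
t2_ok = t2.commit()   # fails validation: t1 changed x after t2 read it
```

The losing transaction does not block; it simply fails validation and would be retried from the start, which is cheap when conflicts are rare.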

For additional background on thread programming, we suggest An Introduction to Programming with Threads (A. Birrell, DEC Technical Report TR-35, DEC/SRC, January 1989; the instructors will be happy to provide you with a copy), Principles of Concurrent and Distributed Programming (by M. Ben-Ari, Prentice Hall Publishers) or Programming with POSIX Threads (by D. Butenhof, Addison-Wesley Publishers).

October 9 : Security Mechanisms

18. The protection of information in computer systems
J. Saltzer, M. Schroeder, Proceedings of the IEEE, vol. 63, no. 9, September 1975, pp. 1278-1300.
19. Better Security via Smarter Devices
G. Ganger, D. Nagle, HotOS-VIII, May 2001.

Computer system security is a topic of growing importance (and great confusion). This class meeting complements the previous one by discussing mechanisms available to mitigate security dilemmas, such as the role of cryptography, assurance, trust, firewalls, intrusion diagnosis, and blame. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today discuss interesting aspects of computer system security. The first is the classic description of how to build a secure computing system. It is a bit dated (and therefore doesn't include some recent problems/solutions), but is quite comprehensive in the fundamentals. The second (much shorter) discusses some wild ideas regarding a future form of distributed security.

Suggested readings for this class meeting are the same as for the last.

October 7 : Security Dilemmas

15. Reflections on Trusting Trust
K. Thompson, Communications of the ACM, vol. 27, no. 8, August 1984, pp. 761-763.
16. Why Cryptosystems Fail
R. Anderson, Communications of the ACM, vol. 37, no. 11, November 1994, pp. 32-40.
17. Crisis and Aftermath
E. Spafford, Communications of the ACM, vol. 32, no. 6, June 1989, pp. 678-687.

Computer system security is a topic of growing importance (and great confusion). This class meeting and the next will explore the basics of computer system security as it relates to operating systems and distributed systems. Topics for this meeting will include what security is really all about and the dilemmas faced by system designers, implementers, and administrators. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today discuss interesting aspects of computer system security. The first is a brief lecture given by Ken Thompson when accepting his Turing Award; it makes an interesting point about trust and security. The second discusses why cryptography does not solve the security problem, by drawing on real-world examples, and some consequences of this fact. The third describes the notorious Internet Worm and some lessons regarding the real problems in computer security; sadly the lessons haven't changed much 12 years later.

There is a huge and growing literature on security. Two excellent overview books are Secrets and Lies: Digital Security in a Networked World by Bruce Schneier (published in 2000 by John Wiley and Sons, Inc.) and Security Engineering: A Guide to Building Dependable Distributed Systems by Ross Anderson (published in 2001 by John Wiley and Sons, Inc.). The former is a light read, and the latter is a serious textbook.

October 2 : Function Placement

13. Fine-Grained Mobility in the Emerald System
E. Jul, H. Levy, N. Hutchinson, A. Black, ACM Transactions on Computer Systems, vol. 6, no. 1, February 1988, pp. 109-133.
14. Dynamic Function Placement for Data-Intensive Cluster Computing
K. Amiri, D. Petrou, G. Ganger, G. Gibson, Usenix Annual Technical Conference, June 2000, pp. 307-322.

An important topic in distributed systems is function placement, including both deciding what to run where and instantiating these decisions. Issues involved with deciding what should run where include inter-function communication, parallelism, load balancing, and security. Instantiating these decisions involves communication issues discussed in previous class meetings. In some cases, it also includes moving functionality from one place to another, which introduces the topic of mobile code and such issues as execution environment heterogeneity, virtual machine environments, re-binding, and encapsulation/isolation for protection. In examining these issues, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe systems with particular approaches to making and instantiating function placement decisions. The first describes Emerald, a system that realizes fine-grained mobility by encapsulating data and threads into mobile objects. The second describes Abacus, a system that dynamically adjusts function placement decisions based on black-box monitoring of runtime performance.

There is not a lot of background literature about function placement (check out the papers' Related Work sections to dig further), but there is a huge amount of literature on mobile code systems. One interesting place to look is at UMBC's Agent Resources website. Another is the Object Management Group's website.

September 30 : Communication Models

11. Implementing Remote Procedure Call
A. Birrell, B. Nelson, ACM Transactions on Computer Systems, vol. 2, no. 1, February 1984, pp. 39-59.
12. Cluster I/O with River: Making the Fast Case Common
R. Arpaci-Dusseau, E. Anderson, et al., Workshop on Input/Output for Parallel and Distributed Systems (IOPADS), May 1999, pp. 10-22.

Having plowed through a concrete set of advanced OS and basic distributed systems examples, we now start looking at aspects in a more general light. Our first such aspect is the communication mechanism among components of a distributed system. The focus is not on the particular network protocol, but on how the communication is viewed by the parts and how it all hangs together. Some particular topics for today include remote procedure call (RPC), implementing RPC, data streams, load balancing, and work stealing. As always, along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today highlight particular communication models. The first is a classic paper discussing implementation details involved with RPCs, which are the foundation of most distributed systems. The second is a more recent paper that describes a (possibly) more appropriate mechanism for data-intensive cluster applications.
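As a warm-up for the RPC paper, the Python sketch below shows the basic stub structure: a client stub marshals the procedure name and arguments into bytes, a transport carries them, and a server-side dispatcher unmarshals, runs the procedure, and marshals the reply. This is not the Birrell and Nelson implementation. The names (PROCEDURES, server_dispatch, ClientStub) and the JSON wire format are assumptions for illustration, and the "network" is just a direct function call, so binding, retransmission, and failure handling are all left out.

```python
import json


def add(a, b):
    return a + b


# Server side: the table of procedures that remote callers may invoke.
PROCEDURES = {"add": add}


def server_dispatch(request_bytes):
    """Server stub: unmarshal the request, run the procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    try:
        result = PROCEDURES[request["proc"]](*request["args"])
        reply = {"ok": True, "result": result}
    except Exception as exc:
        # Errors are reported back to the caller instead of crashing the server.
        reply = {"ok": False, "error": str(exc)}
    return json.dumps(reply).encode("utf-8")


class ClientStub:
    """Client stub: makes a remote procedure look like a local call."""

    def __init__(self, transport):
        # transport: any callable that takes request bytes and returns reply bytes.
        self._transport = transport

    def call(self, proc, *args):
        request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
        reply = json.loads(self._transport(request).decode("utf-8"))
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply["result"]


# Assumption: the transport is an in-process function call standing in for a network.
stub = ClientStub(server_dispatch)
total = stub.call("add", 2, 3)
```

Swapping server_dispatch for a function that sends the bytes over a socket would leave the caller's code unchanged, which is the appeal of the model.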

For both background reading and further reading, we suggest looking at general distributed systems books, such as Tanenbaum's Distributed Operating Systems, Coulouris et al.'s Distributed Systems, or Mullender's Distributed Systems.

September 27 : Writing Systems Papers

10. An Evaluation of the Ninth SOSP Submissions

Great computer systems researchers must be great writers. For better or worse, presentation quality determines the destiny of a research paper nearly as much as the technical content. If your readers cannot understand (or cannot maintain consciousness while reading) what you have written, how can they be expected to appreciate your brilliance?? Today's class meeting will focus on what makes a good systems paper, from outline to final polish. Along the way, we will discuss several types of papers.

The one paper assigned for today hammers the systems community by telling them what the papers submitted to SOSP in 198X should have looked like. It is focused mainly on issues of content.

There are various web pages dedicated to giving advice about written and spoken communication, including some dedicated to computer systems researchers.

September 25 : Decentralized Storage Services

8. A Hierarchical Object Cache
A. Chankhunthod, P. Danzig, C. Neerdaels, M. Schwartz, K. Worrell, Proceedings of the USENIX Technical Conference, January 1996.
9. Wide-area cooperative storage with CFS
F. Dabek, F. Kaashoek, D. Karger, R. Morris, I. Stoica, ACM Symposium on Operating Systems Principles, October 2001, pp. 202-215.

September 23 : Decentralized Storage Services

7. A Cost-Effective, High-Bandwidth Storage Architecture
G. Gibson, D. Nagle, et al., 8th Conf. on Architectural Support for Programming Languages and Operating Systems, October 1998, pp. 92-103.
8. Serverless Network File Systems
T. Anderson, M. Dahlin, et al., ACM Transactions on Computer Systems, vol. 14, no. 1, February 1996, pp. 41-79.

The fourth class in the file systems series will focus on distributed storage services, which include file systems whose server-side functionality is distributed among several file servers. Topics touched on will include partitioning and coordination, naming and resource discovery, reliability/availability, security, data consistency, and dealing with scale. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe distributed storage service designs. The first describes Network-Attached Secure Disks (NASD), which is an architecture designed to cost-effectively deliver scalable storage bandwidth via a particular partitioning of functionality among clients, servers, and network-attached storage devices. The second describes an ambitious cluster storage system design (xFS) that is meant to avoid having any central point of failure or performance bottleneck.

There isn't a particularly good source of background reading on this topic, but the assigned readings do identify a number of other system designs. We suggest following up on papers referenced therein as a start toward digging deeper. To gain an appreciation for the benefits and difficulties of "multi-device" storage subsystems (even when there is a centralized controller), we suggest RAID: High-Performance, Reliable Secondary Storage (by P. Chen, et al., in ACM Computing Surveys, June 1994).

September 18 : Distributed File Systems

5. Scale and Performance in a Distributed File System
J. Howard, M. Kazar, et al., ACM Transactions on Computer Systems, vol. 6, no. 1, February 1988, pp. 51-81.
6. Caching in the Sprite Network File System
M. Nelson, B. Welch, J. Ousterhout, ACM Transactions on Computer Systems, vol. 6, no. 1, February 1988, pp. 134-154.

The third class in the file systems series will focus on distributed file systems, which are file systems whose functionality is split between client machines and file servers that are connected via some kind of network. Topics touched on will include client/server organization, how it works (RPC), how it differs from the local file system case, cache coherence, concurrency control, data consistency, performance enhancement, and scalability in the number of clients supported. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe distributed file system designs. The first is a landmark paper that describes AFS and how its design enhances scalability. The second looks at performance gains from distributed caching in a networked file system environment.

For additional background on distributed file systems, a couple of good sources are Chapter 5 of Distributed Operating Systems (by A. Tanenbaum, Prentice Hall Publishers) and Chapter 9 of The Design and Implementation of the 4.4BSD Operating System (by M.K. McKusick, K. Bostic, M. Karels and J. Quarterman, Addison-Wesley publishers).

September 16 : File System Integrity

3. The Design and Implementation of a Log-Structured File System
M. Rosenblum, J. Ousterhout, ACM Transactions on Computer Systems, vol. 10, no. 1, February 1992, pp. 26-52.
4. Soft Updates: A Solution to the Metadata Update Problem in File Systems
G. Ganger, M.K. McKusick, C. Soules, Y. Patt, ACM Transactions on Computer Systems, vol. 18, no. 2, pp. 127-153.

The second class in the file systems series will focus on file system integrity, which is a requirement that post-crash file systems be in a recoverable state. The importance of this topic cannot be over-stated, as this is what ensures that storage persistence is actually valuable. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two file system implementation techniques that offer alternate views on this problem. Both assume that a small window of vulnerability for new data is acceptable to users. The first assumes that large caches will capture all reads and thus optimizes for writes; note that, like most big ideas, the merits of LFS have been the source of heated debate. The second eliminates the historical expense of synchronous I/O operations by enabling integrity-maintaining write caching for metadata.

Suggested readings for this class meeting are the same as for the last.

September 13 : Maximizing Disk Performance

1. Track-Aligned Extents: Matching Access Patterns to Disk Characteristics
J. Schindler, J. Griffin, C. Lumb, G. Ganger, Conference on File and Storage Technologies, January 2002, pp. 259-274.
2. Virtual Log Based File Systems for a Programmable Disk
R. Wang, T. Anderson, D. Patterson, Symposium on Operating Systems Design and Implementation, February 1999.

The next four class meetings will use file systems as a concrete example for the various general problems introduced on day one (and covered throughout the course). This first one will focus on performance enhancement in local file systems, particularly related to disk drives. To set the stage, we'll look at the levels of control and translation between application interfaces and disk media. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two storage system implementation techniques specifically focused on maximizing performance given specific disk features. Each describes some of the trends leading to their designs, although both have a "back to the future" feel given that systems used to do such things in "the good old days" when disk drives did not have high-level interfaces like SCSI or IDE. The first identifies the disk track as a sweet spot for mid-sized requests and addresses the difficulties involved with exploiting this insight. The second describes how small, synchronous writes can be completed with minimal mechanical positioning (the dominant cause of disk delays) and quantifies the benefits.

This class topic will move rapidly through a variety of file system and disk management issues, since you should have a solid base already (from your undergraduate OS course). A good source for more background on how disks work is An Introduction to Disk Drive Modeling (by C. Ruemmler and J. Wilkes, in IEEE Computer magazine, March 1994). For more background in file systems, we suggest Practical File System Design with the Be File System (by D. Giampaolo, Morgan Kaufmann Publishers), and chapters 6, 7 and 8 of The Design and Implementation of the 4.4BSD Operating System (by M.K. McKusick, K. Bostic, M. Karels and J. Quarterman, Addison-Wesley publishers).

September 11 : Welcome to 712

0. Hints for Computer System Design
B. Lampson, ACM Symposium on Operating Systems Principles, Dec. 1983, pp. 33-48.
0. "Worse is Better", an excerpt (section 2-2.1, pages 7-10) from LISP: good news, bad news, how to win BIG
R. Gabriel, AI Expert, vol. 6, no. 6, June 1991, pp. 31-39.

Because of the late start to 712, we will start quickly. This first meeting will be more than just organizational in nature. In it, we will discuss how the class is going to work and what will (and won't) be covered. In addition, we will dive into the material. This will include very rapidly recapping stuff you should already know (e.g., stuff covered in 15-412), discussing what defines operating systems and distributed systems and what makes them continue to be interesting after all these years, and overviewing how the various topics in the course fit together.

The papers listed above are fun and insight-filled, talking generally about the construction and history of computer systems. We strongly encourage you to read them, but are not requiring the standard written summary for them. The Lampson paper, in particular, is something that you should read now, read again in a few weeks, and then put into the pile of papers that you re-read every year or so. A book that should go in this same category is The Mythical Man-Month: Essays on Software Engineering (by F. Brooks, Addison-Wesley Publishers).

Last modified: Sun Nov 10 18:34:00 EST 2002