15-712: Advanced Operating Systems & Distributed Systems

Papers

For each class meeting, readings are assigned. Usually, the readings will consist of two or three computer systems papers. The papers selected for this course are either classic papers or papers from recent top conferences. You are expected to read these papers thoroughly and summarize them BEFORE arriving at class. For each class meeting, we identify the topic and papers below; for each, we also try to identify good sources for background reading and for further investigation.

Electronic versions are linked where available (access will be denied for IP addresses outside of CMU; if you have an andrew ID, use the web proxy ); for paper copies, visit the 712 drawer of the course file cabinet (on the D level of Hamerschlag Hall, just outside the elevator). We will try to provide paper copies of the assigned readings at least a week in advance.

A schedule is available here.

November 24: Ubiquitous and Autonomic Computing

Today we will discuss future ideas for computer systems research.

November 12: Privacy and Censorship-Resistence

One of the most heated topics for the future will center on issues of privacy, anonymity, and censorship-resistence. Although there are many questions of ethics and law, there are also difficult technical issues related to making these possible at all. Most technological advances (faster computing, more storage, better face recognition, smarter data mining) march against these historical givens, as we are rapidly closing in on the complete set of technologies needed to create Big Brother. During this class period, we will talk about a few of the ideas being explored to help hold back the tide. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe insights from two specific projects. The first describes Publius, which is a decentralized storage service that distributes document storage responsibilities among many independent parties. The second describes experiences with operating an e-mail pseudonym server, providing an interesting look at the practical difficulties faced.

The security books listed earlier provide some background on this sub-topic as well, as are the bibliographies of the papers. One interesting piece of background is Ross Anderson's "The Eternity Service", which focused many people's attention on the value of not losing the anonymous, censureship-resistent publication capability that has historically existed (and been so significant).

November 10: Fault Tolerance and System Structure

Fault-tolerance is a broad and important topic in operating systems and distributed systems. This class meeting will discuss a range of issues and approaches, complementing previous discussions. Some topics to be discussed include data redundancy, failover, checkpointing, state machines, N-versioning, fault containment, isolation, and failstop versus byzantine failures. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe fault tolerance strategies for particular systems. The first describes fault management in Quicksilver, which relied heavily on transactions and their recoverability properties. The second describes Hive, a cellular operating system architecture for large multiprocessor systems.

Fault-tolerance is another of the large areas of research and practice. For example, see the IEEE Technical Committee on Fault-Tolerant Computing. For some interesting and scary stories of insufficiently fault tolerant systems and their consequences, see Neumann's Computer Related Risks or Peterson's Fatal Defect.

November 5: Composable Systems

As computer systems become more and more complex, it becomes important to be able to compose them of substantial pre-existing components. Far from new, composability has appeared in many aspects of system construction in many different fashions. This class meeting will look at various forms of composability that have arisen in systems, such as module switches (e.g., vnode interfaces), stackable layers (e.g., in file systems and networks), pipelined applications (e.g., the UNIX programming model), tweakable interfaces, and glue-logic via scripting. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two specific examples of composability in systems. The first describes stackable file systems, which allow functionality to easily be added and subtracted incrementally. The second describes Knit, a component definition and linking language for systems code.

I don't have a lot to offer in terms of suggestions for background reading at this time. The related work sections of the assigned papers provide some suggestions for further reading.

November 3 : OS Structure and Extensibility

The structure of operating systems has been the topic of much research and debate, since it has such a large impact on complexity, performance, flexibility, robustness, security, etc. This class meeting will look at various options and their trade-offs, including monolithic (e.g., linux, Windows), microkernels (e.g., Mach), virtual machine monitors (e.g., VM/370), and recent research systems. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two recently researched operating system structures focused on enabling safe extensibility. The first describes Spin, which uses a type-safe programming language (Modula-3) to enable code to be safel loaded into its kernel. The second describes exokernels, which minimize kernel functionality to that required for protection, pushing all high-level abstractions into application libraries wher it can be replaced by application writers at will. It is worth noting that not everyone buys into the value of extensible systems -- a fun rebuttal is given in Extensible Systems are Leading OS Researchers Astray (by P. Druschel, V. Pai, W. Zwaenepoel, IEEE Workshop on Hot Topics in Operating Systems (HotOS-6), May 1997, pp. 38-42).

Background material for this topic is just basic operating systems, though specifically relevant discussion can be found in Tanenbaum's Modern Operating Systems. For additional details on the research systems described in the papers, see the Spin and exokernel project pages.

October 29 : Concurrency, Threads, Transactions

One of the most basic (and yet persistently complex) aspects of both operating systems and distributed systems is dealing with concurrent threads of control. While your undergraduate OS course undoubtedly spent a lot of time talking about monitors and semaphores, there is a lot more to concurrency than that. This class meeting will discuss a number of such topics, including locking, avoiding deadlock, optimistic concurrency control, avoiding livelock, leases, and transactions. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today differ in what they offer. The first describes experiences with thread programming, offering a number of insights from serious usage. The second describes an optimistic approach to concurrency control, which allows locks to be completely avoided.

For additional background on thread programming, we suggest An Introduction to Programming with Threads (A. Birrell, DEC Technical Report TR-35, DEC/SRC, January 1989; the instructors will be happy to provide you with a copy), Principles of Concurrent and Distributed Programming (by M. Ben-Ari, Prentice Hall Publishers) or Programming with POSIX Threads (by D. Butenhof, Addison-Wesley Publishers).

October 27 : Group Communication and State Machines

As we've discussed, one reason for replacing centralized services with distributed systems is fault tolerance. Dealing with both clean (fail-stop) failures and misbehaving systems (e.g., compromised systems) can be done by having multiple systems doing exactly the same thing in lockstep. This class meeting will look at approaches and technologies for accomplishing such function replication. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe this method in general and in a particular application, distributed file service. The first provides a tutorial-like overview of replicated state machines. The second describes a fault-tolerant NFS replacement that employs some well-tuned algorithms.

The first paper actually provides some pretty good background discussion, and we encourage those looking for further reading to follow up the Related Work sections.

October 24 : Event Ordering and Multi-Party Consensus

See also Lamport's summaries of these papers: http://research.microsoft.com/users/lamport/pubs/pubs.html#byz and http://research.microsoft.com/users/lamport/pubs/pubs.html#time-clocks

The nature of distributed systems is such that autonomous systems don't see the same things, don't see things at the same time, and can't even easily agree on what time it is. These problems make it difficult for each to reason about what the others are up to and makes it difficult for systems to come to consensus. This class meeting will discuss various issues of event ordering and coming to consensus. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today discuss different aspects of this general area of problem. The first discusses the problems of event ordering and the distributed clock synchronization. The second discusses the problem of coming to consensus when some systems misbehave.

Much of the theoretical side of distributed systems is more or less focuses on these two problems. Thus, many good distributed systems books address them thoroughly, including those on the reserved list.

October 10 : First Exam!

October 3 : Trusted Computing

The papers assigned for today discuss trusted computing. The first lists many reasons why not to trust trusted computing. It's a light, entertaining read. (Section 3.1 and 3.2 are not very important, and can be skimmed.) The second is an example of a recently proposed tamper-resistant processor. Unfortunately, we do not have solid academic descriptions of Palladium or Lagrande.

Suggested readings for this class meeting are the XOM papers. Though James thinks the original XOM paper is better written than the Aegis paper (fewer acronyms), their design is susceptible to an obvious replay attack. The Aegis papers are also good. They focus more on physical security, such as how to detect the physical tampering Ross Anderson discusses without requiring the processor to be encased in an epoxied mesh of wires (like the IBM 4758).

The gold-standard for secure processors is still the IBM 4758 PCI Cryptographic Coprocessor. Presumably its successor, the PCI-X Cryptographic Coprocessor, will perform similarly. This system has the required sensors Anderson noted. It reached FIPS 140 level 4 validation in 1998, two years before anyone else; to this day, there are still only three vendors who meet this standard of security. (Source)

The motivation for the IBM secure cryptocard was Bennet Yee's thesis here at Carnegie Mellon.

October 1 : Security Mechanisms

Computer system security is a topic of growing importance (and great confusion). This class meeting complements the previous one by discussing mechanisms available to mitigate security dilemmas, such as the role of cryptography, assurrance, trust, firewalls, intrusion diagnosis, and blame. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today discuss interesting aspects of computer system security. The first is the classic description of how to build a secure computing system. It is a bit dated (and therefore doesn't include some recent problems/solutions), but is quite comprehensive in the fundamentals. The second (shorter) outlines various buffer overflow prevention mechanisms, and is provided for your information. You should understand from skimming this paper that most tools guard against known attacks rather than fixing program bugs, there are many possible attacks, and hence new protection against a known attack may just lead to a new program exploit.

Please note: We understand that Saltzer's paper is much longer, but it is also by far the more important paper! Don't spend too much time on the buffer overflow paper; it's included mostly because we think you may find it interesting.

Suggested readings for this class meeting are the same as for the last.

September 29 : Security Dilemmas

Computer system security is a topic of growing importance (and great confusion). This class meeting and the next will explore the basics of computer system security as it relates to operating systems and distributed systems. Topics for this meeting will include what security is really all about and the dilemmas faced by system designers, implementers, and administrators. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today discuss interesting aspects of computer system security. The first is a brief lecture given by Ken Thompson when accepting his Turing Award; it makes an interesting point about trust and security. The second discusses why cryptography does not solve the security problem, by drawing on real-world examples, and some consequences of this fact. The third describes the notorious Internet Worm and some lessons regarding the real problems in computer security; sadly the lessons haven't changed much 12 years later.

There is a huge and growing literature on security. Two excellent overview books are Secrets and Lies: Digital Security in a Networked World by Bruce Schneier (published in 2000 by John Wiley and Sons, Inc.) and Security Engineering: A Guide to Building Dependable Distributed Systems by Ross Anderson (published in 2001 by John Wiley and Sons, Inc.). The former is a light read, and the latter is a serious textbook.

September 26 : Writing Systems Papers

Great computer systems researchers must be great writers. For better or worse, presentation quality determines the destiny of a research paper nearly as much as the technical content. If your readers cannot understand (or cannot maintain consciousness while reading) what you have written, how can they be expected to appreciate your brilliance?? Today's class meeting will focus on what makes a good systems paper, from outline to final polish. Along the way, we will discuss several types of papers.

The one paper assigned for today hammers the systems community by telling them what the papers submitted to SOSP in 198X should have looked like. It is focused mainly on issues of content.

There are various web pages dedicated to giving advice about written and spoken communication, including some dedicated to computer systems researchers.

September 24: Function Placement

An important topic in distributed systems in function placement, including both deciding what to run where and instantiating these decisions. Issues involved with deciding what should run where include inter-function communication, parallelism, load balancing, and security. Instantiating these decisions involves communication issues discussed in previous class meetings. In some cases, it also includes moving functionality from one place to another, which introduces the topic of mobile code and such issues as execution environment heterogeneity, virtual machine environments, re-binding, and encapsulation/isolation for protection. In examining these issues, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe systems with particular approaches to making and instantiating function placement decisions. The first describes Emerald, a system that realizes fine-grained mobility by encapsulating data and threads into mobile objects. The second describes Abacus, a system that dynamically adjusts function plaement decisions based on black-box monitoring of runtime performance.

There is not a lot of background literature about function placement (check out the papers' Related Work sections to dig further), but there is a huge amount of literature on mobile code systems. One interesting place to look is at UMBC's Agent Resources website. Another is the Object Management Group's website.

September 22 : Communication Models

Having plowed through a concrete set of advanced OS and basic distributed systems examples, we now start looking at aspects in a more general light. Our first such aspect is the communication mechanism among components of a distributed system. The focus is not on the particular network protocol, but on how the communication is viewed by the parts and how it all hangs together. Some particular topics for today include remote procedure call (RPC), implementing RPC, data streams, load balancing, and work stealing. As always, along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today highlight particular communication models. The first is a classic paper discussing implementation details involved with RPCs, which are the foundation of most distributed systems. The second is a more recent paper that describes a (possibly) more appropriate mechanism for data-intensive cluster applications.

For both background reading and further reading, we suggest looking at general distributed systems books, such as Tanenbaum's Distributed Operating Systems, Coulouris, et al.,'s Distributed Systems, or Mullender's Distributed Systems.

September 19 : Wide-Area Storage

September 17 : Decentralized Storage Services

The fourth class in the file systems series will focus on distributed storage services, which include file systems whose server-side functionality is distributed among several file servers. Topics touched on will include partitioning and coordination, naming and resource discovery, reliability/availability, security, data consistency, dealing with scale. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe distributed storage service designs. The first describes Network-Attached Secure Disks (NASD), which is an architecture designed to cost-effectively deliver scalable storage bandwidth via a particular partitioning of functionality among clients, servers, and network-attached storage devices. The second describes an ambitious cluster storage system design (xFS) that is meant to avoid having any central point of failure or performance bottleneck.

There isn't a particularly good source of background reading on this topic, but the assigned readings do identify a number of other system designs. We suggest following up on papers referenced therein as a start toward digging deeper. To gain an appreciation for the benefits and difficulties of "multi-device" storage subsystems (even when there is a centralized controller), we suggest RAID: High-Performance, Reliable Secondary Storage (by P. Chen, et. al, in ACM Computing Surveys, June 1994).

September 15 : Distributed File Systems

The third class in the file systems series will focus on distributed file systems, which are file systems whose functionality is split between client machines and file servers that are connected via some kind of network. Topics touched on will include client/server organization, how it works (RPC), how it differs from the local file system case, cache coherence, concurrency control, data consistency, performance enhancement, scalability of clients supported. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe distributed file system designs. The first is a landmark paper that describes AFS and how its design enhances scalability. The second looks at utilizing remote DMA and similar techniques to make a network file system run as fast or faster than the local file system.

For additional background on distributed file systems, a couple of good sources are Chapter 5 of Distributed Operating Systems (by A. Tanenbaum, Prentice Hall Publishers) and Chapter 9 of The Design and Implementation of the 4.4BSD Operating System (by M.K. McKusick, K. Bostic, M. Karels and J. Quarterman, Addison-Wesley publishers).

September 10 : Metadata Integrity

The second class in the file systems series will focus on file system integrity, which is a requirement that post-crash file systems be in a recoverable state. The importance of this topic cannot be over-stated, as this is what ensures that storage persistence is actually valuable. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two file system implementation techniques that offer alternate views on this problem. Both assume that a small window of vulnerability for new data is usually acceptible to users. The first assumes that large caches will capture all reads and thus optimizes for writes; note that, like most big ideas, the merits of LFS have been the source of heated debate. The second is a survey article of database recovery techniques.

Suggested readings for this class meeting are the same as for the last.

September 8 : Maximizing Disk Performance

The next four class meetings will use file systems as a concrete example for the various general problems introduced on day one (and covered throughout the course). This first one will focus on performance enhancement in local file systems, particularly related to disk drives. To set the stage, we'll look at the levels of control and translation between application interfaces and disk media. Along the way, we will look at several case studies and discuss (yes, you too) their pros and cons.

The papers assigned for today describe two storage system implementation techniques specifically focused on maximizing performance given specific disk features. Each describes some of the trends leading to their designs. The first has a "back to the future" feel given that systems used to do such things in "the good old days" when disk drives did not have high-level interfaces like SCSI or IDE. It identifies the disk track as a sweet spot for mid-sized requests and addresses the difficulties involved with exploiting this insight. The second describes how combining "mirroring" (data writes get duplicated on multiple disks) with "eager-writing" (data writes are written to the physically closest free block) to outperform either technique alone. Mirroring speeds reads (more potential sources) at the cost of write performance, while eager-writing speeds writes at the cost of read performance (diminished locality); for certain (database) workloads, these two techniques synergistically eliminate each other's drawbacks.

This class topic will move rapidly through a variety of file system and disk management issues, since you should have a solid base already (from your undergraduate OS course). A good source for more background on how disks work is An Introduction to Disk Drive Modeling (by C. Ruemmler and J. Wilkes, in IEEE Computer magazine, March 1994). For more background in file systems, we suggest Practical File Systems Design with the Be File System (by D. Giampaolo, Morgan Kaufmann Publishers), and chapters 6, 7 and 8 of The Design and Implementation of the 4.4BSD Operating System (by M.K. McKusick, K. Bostic, M. Karels and J. Quarterman, Addison-Wesley publishers).

September 3 : Welcome to 712

This first meeting will be more than just organizational in nature. In it, we will discuss how the class is going to work and what will (and won't) be covered. In addition, we will dive into the material. This will include very rapidly recapping stuff you should already know (e.g., stuff covered in 15-412), discussing what defines operating systems and distributed systems and what makes them continue to be interesting after all these years, and overviewing how the various topics in the course fit together.

The papers listed above are fun and insight-filled, talking generally about the construction and history of computer systems. We strongly encourage you to read them, but are not requiring the standard written summary for them. The Lampson paper, in particular, is something that you should read now, read again in a few weeks, and then put into the pile of papers that you re-read every year or so. A book that should go in this same category is The Mythical Man-Month: Essays on Software Engineering (by F. Brooks, Addison-Wesley Publishers).

Last modified: Sun Nov 2 23:52:41 EST 2003