18-845: Internet Services
Carnegie Mellon University, Spring 2013

Syllabus (pdf) | Critiques | Individual Project (IP) | Group Project (GP)

1. Instructors

Prof. David O'Hallaron, droh@cs.cmu.edu, GHC 9125, (412) 268-8199,
Office hours: Mon 4-6pm. (These are my nominal hours. Drop by any time the door is open.)

TA: Praveen Ramakrishnan, praveenr@andrew.cmu.edu, INI 210 (4616 Henry Street), (412) 613-0974,
Office hours: Wed 4-6pm.

2. Organization

Class times: Mon and Wed, 2:30-3:50, WeH 5409
Web page: www.ece.cmu.edu/~ece845
Discussion Board: Piazza.
Blackboard: We will not be using Blackboard.
Course directory: /afs/ece/class/ece845

3. Reference material

There is no required textbook for 18-845. The following are standard references for Linux programming and network programming:
  • Michael Kerrisk, The Linux Programming Interface: A Linux and UNIX System Programming Handbook, No Starch Press, 2010.
  • W. Richard Stevens, Bill Fenner, Andrew M. Rudoff Unix Network Programming: The Sockets Networking API, Volume 1 (3rd Edition), Prentice Hall, 2003.
The CS:APP2e text, which is available in the campus bookstore and on permanent reserve in the Engineering library, covers system-level programming topics such as dynamic linking, process control, Unix I/O, the sockets interface, writing Web servers, and application level concurrency and synchronization:

4. Cluster resources

  • Andrew cluster: linux.andrew.cmu.edu
    • RHEL 6x, 64-bit, login using your Andrew credentials
  • Gates cluster: ghc{01..81}.ghc.andrew.cmu.edu
    • RHEL 6x, 64-bit, login using your Andrew credentials
  • ECE cluster: ece{000-031}.ece.cmu.edu
    • SuSE 12x, 64 bit, login using your ECE credentials
    • See here for details. Contact help@ece.cmu.edu for help with accounts.
  • Other ECE clusters: "games" and "dead mathematician" clusters
    • See here for details. Contact help@ece.cmu.edu for help with accounts.

5. Course schedule

Legend: IP: individual project, GP: group project

Class Date Day Topic Projects Discussion Leader
1 01/14 Mon Intro and welcome Dave O'Hallaron
2 01/16 Wed System design principles IP out Dave O'Hallaron
3 01/21 Mon No class (MLK day) ---
4 01/23 Wed Server design - Basics Dave O'Hallaron
5 01/28 Mon Server design - Events vs. threads Praveen R.
6 01/30 Wed Comparing server performance Praveen R.
7 02/04 Mon Measuring server capacity Max Buevich
8 02/06 Wed Clustering: Request routing Pradyumna Agrawal
9 02/11 Mon Google search Dylan Swen
10 02/13 Wed Google file system (GFS) IP due, 2:30pm Robert Walzer
11 02/18 Mon Map-reduce Puneet Pruthi
12 02/20 Wed Table-based storage (BigTable) Gouri Joshi
13 02/25 Mon Lock services (Chubby) Chinmay Kamat
14 02/27 Wed Incremental updates (Percolator) Praveen R.
15 03/04 Mon Massive storage (Megastore) Max Buevich
16 03/06 Wed Globally distributed storage (Spanner) GP abstracts due Gouri Joshi
17 03/11 Mon No class - Spring break ---
18 03/13 Wed No class - Spring break ---
19 03/18 Mon Key-value stores (SILT) Lock Thepdusith
20 03/20 Wed Request tracing (Dapper) Puneet Pruthi
21 03/25 Mon Data center overview I Dave O'Hallaron
22 03/27 Wed Data center overview II GP oral mid-term reports Dave O'Hallaron
23 04/01 Mon Energy-efficient computing (FAWN) Robert Walzer
24 04/13 Wed Virtual machine overview Dylan Swen
25 04/08 Mon HW/SW virtualization Chinmay Kamat
26 04/10 Wed VM implementations Praveen R.
27 04/15 Mon Nested virtualization Pradyumna Agarwal
28 04/17 Wed VM live migration Lock Thepdusith
29 04/22 Mon No class - GP prep ---
30 04/24 Wed No class - GP reports due GP reports due, 11:59pm ---
31 04/29 Mon No class - GP reviews due GP reviews due, 2:30pm ---
32 05/01 Wed GP poster session Location: WeH 5409 all
05/05 Sun GP final reports due, 11:59pm

6. Detailed course schedule

Students who are not leading the discussion for a particular class should prepare a single 1-page critique. Unless explictly noted, the critique should cover all papers with a "*".

Bring a hardcopy (no email) of your critique with you to class and give it to the TA before class. He will grade it and return it to you next class.

Class 1: Welcome and intro

Big picture and intellectual overview.

  • Note: No critiques are due today.
  • Eric A. Brewer, Lessons from Giant-Scale Services, IEEE Internet Computing, Vol 5, Num. 4, Aug, 2001. (pdf)

Class 2: System design principles

  • Note: Your critique should list three other examples (not discussed by the authors) of end-to-end arguments in system design.
  • *J. Saltzer, D. Reed, and D. Clark, End-to-End Arguments in System Design, ACM Transactions on Computer Systems, Vol 2, No 4, Nov, 1984. (pdf)
  • Butler Lampson, Hints for Computer System Design ACM Operating Systems Rev. 15, 5 (Oct. 1983), pp 33-48. Reprinted in IEEE Software 1, 1 (Jan. 1984), pp 11-28. (html)

Class 3: No class - MLK day

Class 4: Server design - basics

  • Note: Please write two separate critiques.
  • *V. Pai, P. Druschel, and W. Zwaenepoel, Flash: An efficient and portable Web server, Proceedings of the USENIX 1999 Annual Technical Conference, 1999. (pdf)
  • *Tim Brecht , David Pariag, and Louay Gammo, accept()able Strategies for Improving Web Server Performance, Proceedings of the USENIX 2004 Annual Technical Conference, June, 2004. (pdf)
  • D. Mosberger and T. Jin. httperf: A Tool for Measuring Web Server Performance. Performance Evaluation Review, Volume 26, Number 3, December 1998, 31-37. (Originally appeared in Proceedings of the 1998 Internet Server Performance Workshop, June, 1998.) ( html) )

Class 5: Server design - events vs. threads

  • Note: Please write two separate critiques.
  • *Gaurav Banga, Jeff Mogul and Peter Druschel, A scalable and explicit event delivery mechanism for UNIX, in the Proceedings of the USENIX 1999 Technical Conference, June 1999. (pdf)
  • *R. von Behren, J. Condit and E. Brewer, Why Events Are A Bad Idea (for high-concurrency servers), Proceedings of HotOS IX, Lihue, Kauai, Hawaii, May, 2003. (pdf)
  • J. K. Osterhout, Why Threads Are a Bad Idea (for most purposes), Presentation at 1996 Usenix Annual Technical Conference, January, 1996 (pdf)
  • E. A. Lee, The Problem with Threads, IEEE Computer, May, 2006 (html)

Class 6: Comparing server performance

  • *David Pariag, Tim Brecht, Ashif Harji, Peter Buhr, and Amol Shukla, Comparing the Performance of Web Server Architectures, EuroSys 2007, Lisbon, Portugal, March, 2007. (pdf)

Class 7: Measuring server capacity

  • *G. Banga and P. Druschel, Measuring the Capacity of a Web Server under Realistic Loads, World Wide Web Journal (Special issue on World Wide Web Characterization and Performance Evaluation), 2(1), May 1999. (pdf)

Class 8: Clustering - request routing

  • Note: Please write a single critique covering both papers.
  • *V. Pai, M. Aron, G. Banga, M. Svendsen, P. Drushel, W. Zwaenepoel, E. Nahum, Locality-aware request distribution in cluster-based network services. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, October 1998. (pdf)
  • *M. Aron, D. Sanders, P. Druschel, and W. Zwaenepoel. Scalable Content-aware Request Distribution in Cluster-based Network Servers. In Proceedings of the USENIX 2000 Annual Technical Conference, June, 2000. (pdf)

Class 9: Google search

  • Note: Please write a single critique covering both papers.
  • *Sergey Brin and Larry Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine, Seventh International World Wide Web Conference / Computer Networks 30(1-7): 107-117. 1998. (pdf)
  • *Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd, The PageRank Citation Ranking: Bringing Order to the Web, 1998. (pdf)
  • Ian Rogers, The Google Pagerank Algorithm and How It Works (html) May, 2002.

Class 10: Google file system (GFS)

  • *Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, The Google File System, in Proceedings of the 19th ACM Symposium on Operating Systems Principles, October, 2003. (pdf)
  • Kirk McKusick and Sean Quinlan, GFS: Evolution on Fast-Forward, CACM, March, 2010. (html)

Class 11: Map-reduce

  • *J. Dean, and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, in Proceedings of Sixth Symposium on Operating System Design and Implementation, December, 2004. (pdf)

Class 12: Table-based storage (BigTable)

  • *F. Chang, J. Dean, S. Ghemawat, W.C. Hsieh, D.A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, Bigtable: A Distributed Storage System for Structured Data, in Proceedings of the Seventh Symposium on Operating System Design and Implementation, December, 2006. (pdf)

Class 13: Lock services (Chubby)

  • *M. Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems, in Proceedings of the Seventh Symposium on Operating System Design and Implementation, December, 2006. (pdf)

Class 14: Incremental Web updates (Percolator)

  • *D. Peng and F. Dabek, Large-scale Incremental Processing Using Distributed Transactions and Notifications, OSDI, 2010. (pdf)

Class 15: Massive storage (Megastore)

  • *Jason Baker, Chris Bond, James C. Corbett, JJ Furman, Andrey Khorlin, James Larson, Jean-Michel Leon, Yawei Li, Alexander Lloyd, Vadim Yushprakh, Megastore: Providing Scalable, Highly Available Storage for Interactive Services, CIDR'11, Jan, 2011. (pdf)

Class 16: Globally distributed storage (Spanner)

  • *J. Corbett, J. Dean, M. Epstein, A. Fikes, C. Frost, J. Furman, S. Ghemawat, A. Gubarev, C. Heiser, P. Hochschild, W. Hsieh, S. Kanthak, E. Kogan, H. Li, A. Lloyd, S. Melnik, D. Mwaura, D. Nagle, S. Quinlan, R. Rao, L. Rolig, Y. Saito, M. Szymaniak, C. Taylor, R. Wang, and D. Woodford, Spanner: Google's Globally-Distributed Database, OSDI'12, 2012, Jay Lepreau Best Paper Award. (pdf)

Class 17: No class - Spring break

Class 18: No class - Spring break

Class 19: Key-value stores (SILT)

  • *H. Lim, B. Fan, D. Andersen, M. Kaminsky, SILT: A Memory-Efficient, High-Performance Key-Value Store, SOSP, Oct, 2011. (pdf)

Class 20: Request tracing (Dapper)

  • *Benjamin H. Sigelman, Luiz Andre Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan, Chandan Shanbhag, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure, Google Technical Report dapper-2010-1, April 2010. (pdf)

Class 21: Datacenter Overview (Chapters 1-4)

  • Note: Please write a single critique covering chapters 1-4
  • *Luiz Andre Barrosa and Urs Holzle, The Datacenter as a Computer, Morgan & Claypool, 2009. (pdf)

Class 22: Datacenter Overview (Chapters 5-8)

  • Note: Please write a single critique covering chapters 5-8
  • *Luiz Andre Barrosa and Urs Holzle, The Datacenter as a Computer, Morgan & Claypool, 2009. (pdf)

Class 23: Energy-efficient computing (FAWN)

  • Note: Please write a single critique covering both papers.
  • *David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan, FAWN: A Fast Array of Wimpy Nodes, SOSP 2009, Big Sky, MT. October 2009. Awarded best paper. (pdf)
  • *V. Vasudevan, D. Andersen, M. Kaminsky, L. Tan, J. Franklin, I. Moraru Energy-efficient Cluster Computing with FAWN: Workloads and Implications, Proceedings of e-energy, 2010, invited paper. (pdf)

Class 24: VM overview

  • Note: Please write a single critique covering both papers.
  • *Mendel Rosenblum and Tal Garfinkel, Virtual Machine Monitors: Current Technology and Future Trends, IEEE Computer, May, 2005. (pdf)
  • *G. Neiger, A. Santoni, F. Leung, D. Rodgers, R. Uhlig, "Intel Virtualization Technology: Hardware Support for Efficient Processor Virtualization", Intel Technology Journal, Aug, 2006. (pdf)

Class 25: HW/SW virtualization

  • *K. Adams, and O. Agesen, A Comparison of Software and Hardware Techniques for x86 Virtualization, In Proceedings of the 12th international conference on Architectural support for programming languages and operating systems, 2006. (pdf)
  • Nikhil Bhatia, "Performance Evaluation of Intel EPT Hardware Assist", VWware white paper, 2009 (pdf)

Class 26: VM implementations

  • Note: Please write a single critique covering both papers.
  • *P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, A. Warfiel, Xen and the Art of Virtualization, In Proceedings of the 19th ACM Symposium on Operating Systems Principles, October, 2003. (pdf)
  • *A. Kivity, Y. Kamay, Dor Laor, U. Lublin, A. Liguori, kvm: the Linux Virtual Machine Monitor, Proceedings of the Linux Symposium, Volume 1, pp 225-230, Ottawa, Ontario, Canada, June 2007. (pdf)
  • F. Bellard, QEMU, a Fast and Portable Dynamic Translator, Usenix Annual Technical Conference, 2005. (pdf)

Class 27: Nested virtualization

  • *M. Ben-Yehuda, M. Day, Z. Dubitzky, M. Factor, N. HarEl, A. Gordon, A. Liguori, O. Wasserman, B. Yassour, The Turtles Project: Design and Implementation of Nested Virtualization, OSDI, 2010. Awarded best paper. (pdf)

Class 28: VM live migration

  • *Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hanseny, Eric July, Christian Limpach, Ian Pratt, Andrew Warfield, "Live Migration of Virtual Machines", NSDI '05. (pdf)

Class 29: No class - GP prep

Class 30: No class - GP reports due

Class 31: No class - GP reviews due

Class 32: GP poster session