18-845: Internet Services
Carnegie Mellon University, Spring 2011

Syllabus (pdf) | Critiques | Individual Project (IP) | Group Project (GP)

Instructors

Prof. David O'Hallaron, droh@cs.cmu.edu, GHC 9125, (412) 268-8199
Office hours: Wed 4-6pm. (These are my nominal hours. Drop by any time the door is open.)

TA: Kushal Dalmia, kdalmia@andrew.cmu.edu, WeH 5207, (412)519-9943
Office hours: Tue 4:30-5:30pm

Organization

Class times: Mon and Wed, 2:30-3:50, PH 226A
Administrative Asst: Tracy Farbacher, tracyf@cs.cmu.edu, GHC 9129, 412-268-8824
Web page: www.ece.cmu.edu/~ece845
Newsgroup: cmu.ece.class.ee845 (not monitored by teaching staff)
Course directory: /afs/ece/class/ece845

Reference material

There is no required textbook for 18-845. The following are standard references for Linux programming and network programming:
  • Michael Kerrisk, The Linux Programming Interface: A Linux and UNIX System Programming Handbook, No Starch Press, 2010.
  • W. Richard Stevens, Bill Fenner, Andrew M. Rudoff, Unix Network Programming: The Sockets Networking API, Volume 1 (3rd Edition), Prentice Hall, 2003.
The CS:APP2e text, which is available in the campus bookstore and on reserve in the Engineering library, covers system-level programming topics such as dynamic linking, process control, Unix I/O, the sockets interface, writing Web servers, and application-level concurrency and synchronization.

Course schedule

Legend: IP: individual project, GP: group project

Class | Date | Day | Topic | Projects | Discussion Leader(s)
1 | 01/10 | Mon | Welcome | | Dave O'Hallaron
2 | 01/12 | Wed | A tour of Internet services | IP out | Dave O'Hallaron
3 | 01/17 | Mon | No class - MLK Day | |
4 | 01/19 | Wed | System design | | Dave O'Hallaron
5 | 01/24 | Mon | Server design - basics | | Dave O'Hallaron
6 | 01/26 | Wed | Server design - events/threads | | Kushal Dalmia
7 | 01/31 | Mon | Measuring server capacity | | Dave O'Hallaron
8 | 02/02 | Wed | Clustering - architecture | IP due, 2:30pm, at Intel Pittsburgh | Richard Gass
9 | 02/07 | Mon | Clustering - request routing | | Chris Peplin
10 | 02/09 | Wed | Virtual machine overview | | Joe Greco
11 | 02/14 | Mon | HW/SW virtualization | | Glenn Stroz
12 | 02/16 | Wed | Paravirtualization | | Vishal Patel
13 | 02/21 | Mon | VM live migration | GP abstract due, 2:30pm | Kangyuan Niu
14 | 02/23 | Wed | VM memory management | | Dave O'Hallaron
15 | 02/28 | Mon | VM deployment | | Kushal Dalmia
16 | 03/02 | Wed | FAWN | | Kushal Dalmia
17 | 03/07 | Mon | No class - Spring break | |
18 | 03/09 | Wed | No class - Spring break | |
19 | 03/14 | Mon | Google file system | | Dave O'Hallaron
20 | 03/16 | Wed | Map-reduce | | Chris Peplin
21 | 03/21 | Mon | No class - droh out | |
22 | 03/23 | Wed | Table-based storage | GP Mid-term oral reports due | Glenn Stroz
23 | 03/28 | Mon | Coordination systems | | Vishal Patel
24 | 03/30 | Wed | Google Search | | Kushal Dalmia
25 | 04/04 | Mon | Request tracing | At Google Pittsburgh | All
26 | 04/06 | Wed | Storage management | | Joe Greco
27 | 04/11 | Mon | Request tracing | | Dave O'Hallaron
28 | 04/13 | Wed | Datacenter debugging | | Dave O'Hallaron
29 | 04/18 | Mon | No class - GP prep | |
30 | 04/20 | Wed | No class - GP prep | GP reports due, 2:30pm |
31 | 04/25 | Mon | No class - GP reviews due | GP reviews due, 2:30pm |
32 | 04/27 | Wed | GP Presentations | GP presentations (in class) |
 | 05/01 | Sun | | GP final reports due, 11:59pm |

Detailed course schedule

Students who are not leading the discussion for a particular class should prepare a single 1-page critique. Unless explicitly noted, the critique should cover all papers marked with a "*".

Bring a hardcopy (no email) of your critique with you to class and give it to the TA before class. He will grade it and return it to you next class.

Class 1: Welcome (Mon 1/10)

Discussion leader: Dave O'Hallaron

Welcome and overview of course organization.

Class 2: A tour of Internet services (Wed 1/12)

Discussion leader: Dave O'Hallaron

Big picture and intellectual overview.

  • Note: No critiques are due today.
  • Eric A. Brewer, Lessons from Giant-Scale Services, IEEE Internet Computing, Vol 5, Num. 4, Aug, 2001. (pdf)

Class 3: No class - MLK Day (Mon 1/17)

Class 4: System design principles (Wed 1/19)

Discussion leader: Dave O'Hallaron
  • Note: Please write a single critique covering both papers.
  • Note: Your critique should list three other examples (not discussed by the authors) of end-to-end arguments in system design.
  • *J. Saltzer, D. Reed, and D. Clark, End-to-End Arguments in System Design, ACM Transactions on Computer Systems, Vol 2, No 4, Nov, 1984. (pdf)
  • *Butler Lampson, Hints for Computer System Design, ACM Operating Systems Rev. 15, 5 (Oct. 1983), pp 33-48. Reprinted in IEEE Software 1, 1 (Jan. 1984), pp 11-28. (html)

Class 5: Server design - basics (Mon 1/24)

Discussion leader: Dave O'Hallaron
  • Note: Please write two separate critiques.
  • *V. Pai, P. Druschel, and W. Zwaenepoel, Flash: An efficient and portable Web server, Proceedings of the USENIX 1999 Annual Technical Conference, 1999. (pdf)
  • *Tim Brecht, David Pariag, and Louay Gammo, accept()able Strategies for Improving Web Server Performance, Proceedings of the USENIX 2004 Annual Technical Conference, June, 2004. (pdf)
  • D. Mosberger and T. Jin. httperf: A Tool for Measuring Web Server Performance. Performance Evaluation Review, Volume 26, Number 3, December 1998, 31-37. (Originally appeared in Proceedings of the 1998 Internet Server Performance Workshop, June, 1998.) (pdf, download)

Class 6: Server design - events/threads (Wed 1/26)

Discussion leader: Kushal Dalmia
  • Note: Please write two separate critiques.
  • *Gaurav Banga, Jeff Mogul and Peter Druschel, A scalable and explicit event delivery mechanism for UNIX, in the Proceedings of the USENIX 1999 Technical Conference, June 1999. (pdf)
  • *R. von Behren, J. Condit and E. Brewer, Why Events Are A Bad Idea (for high-concurrency servers), Proceedings of HotOS IX, Lihue, Kauai, Hawaii, May, 2003. (pdf)
  • David Pariag, Tim Brecht, Ashif Harji, Peter Buhr, and Amol Shukla, Comparing the Performance of Web Server Architectures, EuroSys 2007, Lisbon, Portugal, March, 2007. (pdf)

Class 7: Measuring server capacity (Mon 1/31)

Discussion leader: Dave O'Hallaron
  • *G. Banga and P. Druschel, Measuring the Capacity of a Web Server under Realistic Loads, World Wide Web Journal (Special issue on World Wide Web Characterization and Performance Evaluation), 2(1), May 1999. (pdf)

Class 8: Clustering - architecture (Wed 2/2)

Discussion leader: Richard Gass
  • *Luiz Barroso, Jeffrey Dean, and Urs Hoelzle, Web Search for a Planet: The Google Cluster Architecture, IEEE Micro, March-April, 2003. (pdf)

Class 9: Clustering - request routing (Mon 2/7)

Discussion leader: Chris Peplin
  • Note: Please write a single critique covering both papers.
  • *V. Pai, M. Aron, G. Banga, M. Svendsen, P. Druschel, W. Zwaenepoel, E. Nahum, Locality-aware request distribution in cluster-based network services. In Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, October 1998. (pdf)
  • *M. Aron, D. Sanders, P. Druschel, and W. Zwaenepoel. Scalable Content-aware Request Distribution in Cluster-based Network Servers. In Proceedings of the USENIX 2000 Annual Technical Conference, June, 2000. (pdf)

Class 10: VM overview (Wed 2/9)

Discussion leader: Joe Greco
  • Note: Please write a single critique covering both papers.
  • *James E. Smith and Ravi Nair, The Architecture of Virtual Machines, IEEE Computer, May, 2005. (pdf)
  • *Rich Uhlig, Gil Neiger, Dion Rodgers, Amy L. Santoni, Fernando C.M. Martins, Andrew V. Anderson, Steven M. Bennett, Alain Kagi, Felix H. Leung, Larry Smith, Intel Virtualization Technology, IEEE Computer, May, 2005. (pdf)

  • Additional background:
    • G. Neiger, A. Santoni, F. Leung, D. Rodgers, R. Uhlig, "Intel Virtualization Technology: Hardware Support for Efficient Processor Virtualization", Intel Technology Journal, Aug, 2006. (pdf)

Class 11: SW/HW virtualization (Mon 2/14)

Discussion leader: Glenn Stroz
  • *K. Adams and O. Agesen, A Comparison of Software and Hardware Techniques for x86 Virtualization, In Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems, 2006. (pdf)

Class 12: Paravirtualization (Wed 2/16)

Discussion leader: Vishal Patel
  • *P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, A. Warfield, Xen and the Art of Virtualization, In Proceedings of the 19th ACM Symposium on Operating Systems Principles, October, 2003. (pdf)

Class 13: VM live migration (Mon 2/21)

Discussion leader: Kangyuan Niu
  • *Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, Andrew Warfield, "Live Migration of Virtual Machines", NSDI '05. (pdf)

Class 14: VM memory management (Wed 2/23)

Discussion leader: Dave O'Hallaron
  • *D. Gupta, S. Lee, M. Vrable, S. Savage, A. Snoeren, G. Varghese, G. Voelker, and A. Vahdat, Difference Engine: Harnessing Memory Redundancy in Virtual Machines, OSDI '08, Dec, 2008. Awarded best paper. (pdf)

Class 15: VM deployment (Mon 2/28)

Discussion leader: Kushal Dalmia
  • *H. Andres Lagar-Cavilla, Joseph A. Whitney, Adin Scannell, Philip Patchin, Stephen M. Rumble, Eyal de Lara, Michael Brudno, M. Satyanarayanan, SnowFlock: Rapid Virtual Machine Cloning for Cloud Computing, Eurosys '09, Apr, 2009. Awarded best paper. (pdf)

Class 16: FAWN (Wed 3/2)

Discussion leader: Kushal Dalmia
  • *David Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, Vijay Vasudevan, FAWN: A Fast Array of Wimpy Nodes, SOSP 2009, Big Sky, MT. October 2009. Awarded best paper. (pdf)

Class 17: No class - Spring break (Mon 3/7)

Class 18: No class - Spring break (Wed 3/9)

Class 19: Google file system (Mon 3/14)

Discussion leader: Dave O'Hallaron
  • *Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, The Google File System, in Proceedings of the 19th ACM Symposium on Operating Systems Principles, October, 2003. (pdf)

Class 20: Map-reduce (Wed 3/16)

Discussion leader: Chris Peplin
  • *J. Dean and S. Ghemawat, MapReduce: Simplified Data Processing on Large Clusters, in Proceedings of the Sixth Symposium on Operating System Design and Implementation, December, 2004. (pdf)

Class 21: No class - droh out (Mon 3/21)

Class 22: Table-based storage (Wed 3/23)

Discussion leader: Glenn Stroz
  • *F. Chang, J. Dean, S. Ghemawat, W.C. Hsieh, D.A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, Bigtable: A Distributed Storage System for Structured Data, in Proceedings of the Seventh Symposium on Operating System Design and Implementation, December, 2006. (pdf)

Class 23: Coordination services (Mon 3/28)

Discussion leader: Vishal Patel
  • *M. Burrows, The Chubby Lock Service for Loosely-Coupled Distributed Systems, in Proceedings of the Seventh Symposium on Operating System Design and Implementation, December, 2006. (pdf)

Class 24: Google search (Wed 3/30)

Discussion leader: Kushal Dalmia
  • Note: Please write a single critique covering both papers.
  • *Sergey Brin and Larry Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine, Seventh International World Wide Web Conference / Computer Networks 30(1-7): 107-117. 1998. (pdf)
  • *Lawrence Page, Sergey Brin, Rajeev Motwani, and Terry Winograd, The PageRank Citation Ranking: Bringing Order to the Web, 1998. (pdf)
  • Ian Rogers, The Google Pagerank Algorithm and How It Works (html) May, 2002.

Class 25: Request tracing (Mon 4/4 at Google Pittsburgh)

Discussion leader: All
  • *Benjamin H. Sigelman, Luiz Andre Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan, Chandan Shanbhag, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure, Google Technical Report dapper-2010-1, April 2010. (pdf)

Class 26: Cluster storage management (Wed 4/6)

Discussion leader: Joe Greco
  • *Hrishikesh Amur, James Cipar, Varun Gupta, Gregory R. Ganger, Michael A. Kozuch, Karsten Schwan, Robust and Flexible Power-proportional Storage, ACM Symposium on Cloud Computing (SOCC), June 10-11, 2010, Indianapolis, IN. (pdf)

Class 27: Request tracing (Mon 4/11)

Discussion leader: Dave O'Hallaron
  • Note: Please write a single critique covering both papers.
  • *Paul Barham, Austin Donnelly, Rebecca Isaacs, and Richard Mortier, Using Magpie for Request Extraction and Workload Modelling, OSDI'04, Dec, 2004. (pdf)
  • *Mike Chen, Emre Kiciman, Eugene Fratkin, Eric Brewer, and Armando Fox, Pinpoint: Problem Determination in Large, Dynamic, Internet Services, Dependable Systems and Networks (DSN'02), 2002. (pdf)

Class 28: Datacenter debugging (Wed 4/13)

Discussion leader: Dave O'Hallaron
  • *Peter Bodik, Moises Goldszmidt, Armando Fox, Dawn B. Woodard, Hans Andersen, Fingerprinting the Datacenter: Automated Classification of Performance Crises, Eurosys 2010, Paris, France, April, 2010. (pdf)

Class 29: No class - GP prep (Mon 4/18)

Class 30: No class - GP reports due (Wed 4/20)

Class 31: No class - GP reviews due (Mon 4/25)

Class 32: GP presentations (Wed 4/27)


Dave O'Hallaron