18-845 Group Project (GP)

Important dates

  1. Mon, Mar 19 (11:59pm): GP abstracts due
    • Email your abstract to droh@cs.cmu.edu

  2. Wed, Apr 4: GP oral status reports due
    • In class

  3. Thu, Apr 26 (11:59pm): GP reports due
    • Email your completed report (PDF format) to droh@cs.cmu.edu

  4. Sun, Apr 29 (11:59pm): GP reviews due
    • Email your critiques (one PDF file per critique) to droh@cs.cmu.edu

  5. Wed, May 2 (2:30pm): GP poster session, Location TBD
    • Prepare eight or nine individual letter-sized sheets. Your instructors will supply poster boards, pins, and stands.

  6. Sun, May 6 (11:59pm): Final camera-ready GP reports due
    • Email your report (PDF format) to droh@cs.cmu.edu

1. Instructions for Preparing Your GP Abstracts

  • The abstract must contain the following parts:
    • Title
    • Authors
    • One or two paragraphs describing the question you want to answer and what you will do to answer that question.
    • A paragraph describing the expected result (What do you hope to learn? What conclusions do you hope to draw?)

2. Instructions for Delivering Your GP Mid-Term Oral Report

  • Your TA will meet with each group one-on-one to learn about your progress on the project and to give you guidance and encouragement. This is an informal face-to-face meeting. You are not required to prepare any written material. All group members must attend.

3. Instructions for Preparing and Submitting Your GP Reports

  • Reports are limited to 10 pages (this is a hard limit).
  • Font size must be at least 10pt (but 11pt is even better).
  • Reports must follow the official ACM Proceedings format. Use the 10pt LaTeX template provided here, or the 10pt Word template provided here.
  • Reports should include the following content:
    • Abstract - A paragraph that summarizes the problem and the results.
    • Introduction - Sets the context, describes the problem, and describes your solution.
    • Description - One or more sections that describe the problem and your approach to the solution in detail.
    • Evaluation - A section that quantitatively evaluates your ideas.
    • Related work - Compare and contrast related work. Don't just enumerate.
    • Summary and Conclusions - Summarize what you did and what interesting things you learned from the project.
  • Send your reports to droh@cs.cmu.edu.

4. Instructions for Reviewing Your Classmates' Reports

  • Each report will be formally reviewed by four reviewers: your instructor, your TA, and two classmates chosen by your instructor. Thus, every student will receive four reviews of their project. The two student reviews should be anonymous.
  • Your instructors will evaluate the quality of your reviews as part of your overall project score.
  • Use the same review template you used for your critiques.
  • Send each review as a separate PDF file attachment to droh@cs.cmu.edu.

5. Instructions for Preparing Your Poster Presentations

  • The staff will bring 32" x 40" foam core poster boards, easels, and push pins for you to attach your hardcopies to the poster board. Each group will have their own poster board and easel.
  • A poster board can hold 8 letter-size (8 1/2" x 11") pieces of paper in landscape format, or 9 pages in portrait format.
  • You should prepare an 8- or 9-page presentation and bring the hardcopies with you to class. One of these pages must be a title page with the title of your project and your names.
  • We'll order some good food and drinks.

Hints for Coming Up with a Topic

If you are currently working on a master's or Ph.D. thesis, we encourage you to pursue a topic that is directly related to your thesis research. It's OK (ideal in fact!) to use the group project as a way to make progress on your thesis.

There are two basic approaches you can use for your group research projects.

  • Develop a new idea or a new twist on an existing idea, and then do enough evaluation to serve as a proof of concept.
  • Do an extensive evaluation of an existing idea that gives you some insight into the advantages or disadvantages of that idea.

Here are examples of some project ideas from previous years:

Here are some other ideas for topics (in no particular order):

  • Evaluation of performance issues in high-speed non-blocking servers. The idea here is to build a high-speed server that never blocks on I/O (such as the Flash server from Rice) and then do an extensive micro-evaluation of its performance to understand how much performance gain is possible from such non-blocking servers.

  • Threads vs. events in high-speed servers. We've seen a number of conflicting conclusions in our readings. Which approach is better? Compare and contrast the performance implications of kernel-level threads, cooperatively scheduled user-level threads, and event systems based on select().
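
As a starting point for the event-based side of this comparison, here is a minimal sketch of a single-threaded echo server driven by select(). It is written in Python for brevity. The function name serve_echo and the max_clients stopping rule are illustrative assumptions, not part of any server discussed in class, and error handling (resets, timeouts) is omitted.

```python
import select
import socket


def serve_echo(listener, max_clients):
    """Echo bytes back to clients from one thread, multiplexing with select().

    Returns once max_clients connections have been served and closed.
    (max_clients is a hypothetical stopping rule so the sketch terminates.)
    """
    listener.setblocking(False)
    pending = {}         # client socket -> bytes not yet echoed back
    half_closed = set()  # clients that have finished sending
    finished = 0
    while finished < max_clients:
        # Only wait for writability on sockets that have data queued.
        readers = [listener] + [s for s in pending if s not in half_closed]
        writers = [s for s, buf in pending.items() if buf]
        readable, writable, _ = select.select(readers, writers, [], 1.0)
        for sock in readable:
            if sock is listener:
                conn, _addr = listener.accept()
                conn.setblocking(False)
                pending[conn] = b""
            else:
                data = sock.recv(4096)
                if data:
                    pending[sock] += data
                else:
                    half_closed.add(sock)  # peer sent EOF
        for sock in writable:
            sent = sock.send(pending[sock])
            pending[sock] = pending[sock][sent:]
        # Close clients that sent EOF and have nothing left to echo.
        for sock in [s for s in half_closed if not pending[s]]:
            half_closed.discard(sock)
            del pending[sock]
            sock.close()
            finished += 1
```

A thread-per-connection version of the same server would replace the select() loop with one blocking recv/send loop per client, which gives you two comparable baselines to measure.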

  • Monitoring in a non-cooperative environment. Stefan Savage at UCSD has developed a powerful technique for estimating end-to-end bandwidth and packet loss between hosts, where the remote host is not cooperative in the sense that it would be impossible to get an account on the machine (e.g., the Yahoo server). Savage's approach is to exploit the behavior of TCP (which all servers must implement to the specification) to gain information about the effective bandwidth from the server to the client. For this project, you might apply this general idea in some new context, or use Savage's method for estimating packet loss in the context of a larger application. For example, would it be possible to use Savage's technique to build a client-side performance monitoring system that, for a given HTTP transaction, would isolate the network transmission time from the server processing time and determine which is the bottleneck?

  • Attaching geographical locations to IP addresses in the context of a world-wide disaster monitoring system. When natural disasters such as earthquakes occur, it is very difficult to make accurate estimates of the geographical extent and severity of the damage because the communication infrastructure disappears. However, hosts that provide Internet services are always turned on, so the lack of response from those systems contains some information. The idea is to build a system that would sample hosts in earthquake-prone regions on a continual basis. Each sample is a bit vector, one bit per host. Some interesting issues are assigning IP addresses to geographical locations, developing a hierarchical scheme to aggregate response bit vectors, and developing techniques for analyzing the response bit vectors to distinguish transients (e.g., localized power failures or normal host downtime) from real damage.
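
To make the bit-vector idea concrete, here is a small sketch of one possible representation. Everything in it is an assumption for illustration: the function names, the use of a Python integer as the bit vector, the region masks, and the "silent for the last few samples" rule, which is only one naive way to separate transients from lasting outages.

```python
def to_bitvector(responses):
    """Pack a list of booleans (did host i respond?) into an int; bit i = host i."""
    vec = 0
    for i, up in enumerate(responses):
        if up:
            vec |= 1 << i
    return vec


def region_response_rate(sample, region_masks):
    """Aggregate one sample per region.

    region_masks maps a region name to a bitmask of its member hosts
    (a hypothetical stand-in for the IP-to-location assignment).
    Returns the fraction of each region's hosts that responded.
    """
    return {
        name: bin(sample & mask).count("1") / bin(mask).count("1")
        for name, mask in region_masks.items()
    }


def persistently_silent(samples, num_hosts, window):
    """Bitmask of hosts that responded in none of the last `window` samples."""
    any_response = 0
    for sample in samples[-window:]:
        any_response |= sample
    all_hosts = (1 << num_hosts) - 1
    return all_hosts & ~any_response
```

A hierarchical scheme could apply the same per-region aggregation at several levels (city, county, state), with each level's masks built from the level below.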

  • Scalable search engines. Current search engines are not scalable because all of the work is done at the remote server site. As a result, the servers cannot perform much computation per request, typically just a quick lookup in an inverted index. Single-word queries, which directly index the database on the server, therefore work pretty well, but multiple-word queries often give poor results. The idea here is to investigate the following question: Can we improve the performance of search engines such as Google by doing some additional work on the client?
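
For readers unfamiliar with inverted indexes, the sketch below shows the cheap single-word lookup and one way extra work might be moved to the client. The split between server_lookup and client_and_query is purely hypothetical, as are the function names. It illustrates the question being asked and does not describe how any real search engine works.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def server_lookup(index, word):
    """The cheap server-side step: one posting-list lookup for one word."""
    return index.get(word.lower(), set())


def client_and_query(index, query):
    """Hypothetical client-side step: fetch one posting list per word,
    then intersect them locally to answer a multiple-word query."""
    postings = [server_lookup(index, word) for word in query.split()]
    if not postings:
        return set()
    return set.intersection(*postings)
```

For example, with the documents {1: "fast web server", 2: "web search engine", 3: "fast search"}, the query "fast search" matches only document 3. A project could replace the simple intersection with ranking or phrase matching and measure the cost of shipping posting lists to the client.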