18-845 Group Project (GP)
Important dates
- Tue Feb 25: Project Proposal Due
- Tue Mar 18 and Thu Mar 20: Oral Status Reports
- Thu Apr 24 (5 pm): Project Reports Due
(email PDF only to droh@cs)
- Tue Apr 29 (5 pm): Project Reviews Due
(email ASCII text only to droh@cs)
- Thu May 1: Poster Presentations (Location: Hamburg Hall 1510,
12:30pm - 2:00pm)
- Sun May 4 (11:59pm): Final Camera-Ready Project Reports Due
(email PDF only to droh@cs)
Instructions for Preparing and Submitting Your Project Reports
- Reports are limited to 10 pages.
- Reports must follow the official ACM Proceedings format. Use the
Word or LaTeX templates provided by ACM.
- Reports must include the following parts:
- Abstract - A paragraph that summarizes the problem and the results.
- Introduction - Sets the context, describes the problem, and
describes your solution.
- Description - One or more sections that
describe the problem and your approach to the solution in detail.
- Evaluation - A section that quantitatively evaluates your
ideas.
- Related work - Describes prior work related to your project.
- Summary and Conclusions - Summarize what you did and what interesting
things you learned from the project.
- Send your reports (PDF format only please) to droh@cs.cmu.edu.
Instructions for Reviewing Your Classmates' Project Reports
- Each project report will be reviewed (anonymously) by three of your
classmates. Thus, every student will review three reports.
- Your instructors will evaluate the quality of your reviews as part
of your overall project score.
- Use the same review template you used for your critiques.
- Send each review as a separate ASCII text file (no Word or PDF) to
droh@cs.cmu.edu.
Instructions for Preparing Your Poster Presentations
- We'll meet in our usual classroom at 12:30pm.
- I'll bring 32" x 40" foam core poster boards,
easels, and push pins for you to attach your
hardcopies to the poster board. Each group will have
their own poster board and easel.
- A poster board can hold 8 letter-size (8 1/2 x 11)
pieces of paper in landscape format, or 9 pages in
portrait format.
- You should prepare an 8 or 9 page presentation and bring
the hardcopies with you to class. One of these pages must
be a title page with the title of your project and your names.
- I'll order some good food and drinks, and we'll also do the FCEs
before the poster session starts.
Hints for Coming Up with a Topic
You are free to propose any idea you want for your group project. To
help you start thinking, here are some of the projects from previous
years:
- Blake Scholl,
Distributed Computation of Performance-Aware Webmaps with HTTP Proxies.
- Thomas Madden and Christopher Palow,
Denial of Service Detector (DoSD).
- Shaheen Gandhi and Alan Wang,
Effects of Latency on Game State Prediction Methods.
- Asad Samar,
An Implementation of Capture Resilient Devices.
- Glenn Judd,
Improving 802.11 Access Point Selection: A Preliminary Investigation.
- Hen-I Yang and Anupam Dhanuka,
Performance Evaluation of Multiple Fields Matching Scheme.
- Gautaum Garg and Gene Soo,
Space-Time Codes.
- Punitha Manavalan and Michael Wagner,
Robot Telemetry Manager.
- Li-Chiou Chen and Xia Chen,
Evaluating Methods of Defending Distributed Denial of Service
Attacks.
- Pratish Halady, Rahul Mangharam and Vishal Soni,
Location Based Wireless Network Services.
- Nitin Gupta and Sandhya Gupta,
QoS in Web-Servers.
- Aravind Pavuluri and Saumitra Das,
An Active Architecture for User-Profile Based Dynamic Web Caching.
- Vijay Pandurangan and Mehmet Bakkaloglu,
PASISizing the Web.
- Arif Ulaugac and Nawaportn Wisitpongphan,
Micro-Evaluation of the Flash Server.
- David Oleszkiewicz and Ed Neto,
Distributed Anonymous Information Retrieval.
There are two basic approaches you can use for your group research
projects.
- Develop a new idea or a new twist on an existing idea, and then do
enough evaluation to serve as a proof of concept.
- Do an extensive evaluation of an existing idea that gives you
some insight into the advantages or disadvantages of that idea.
Here are some other ideas for topics (in no particular order):
- Internet host counting.
Perform your own Internet Domain Survey and compare it to
the published survey at www.isc.org.
- Congestion-aware routing. Develop a scheme that would allow us
to use BGP to change the routing tables in routers to improve communication
performance between different ASes (ISPs). (Blake Scholl)
- Peer-to-peer systems.
There are a number of interesting unresolved issues for
peer-to-peer systems such as
Freenet. What are the performance bottlenecks? How anonymous are
Freenet objects really? How can Freenet peers be attacked and
defended? How to delete and update documents? How to invalidate
cached copies of updated documents? How to resolve the fundamental
tension between privacy and accountability? How to name objects? How
to search for objects in systems that are designed to make it
impossible to identify the origin server for any particular document?
Are there better approaches for caching and replicating objects
through the network?
- Peer-to-peer distributed keyword search. Develop a
peer-to-peer distributed search engine. Issues: Tradeoffs between
security and convenience, scaling outside the local area, searching on
metadata as well as contents.
- Peer-to-peer content publishing systems.
There are a number of interesting unresolved issues for
anonymous content publishing peer-to-peer systems such as
Freenet. What are the performance bottlenecks? How anonymous are
Freenet objects really? How can Freenet peers be attacked and
defended? How to delete and update documents? How to invalidate
cached copies of updated documents? How to resolve the fundamental
tension between privacy and accountability? How to name objects? How
to search for objects in systems that are designed to make it
impossible to identify the origin server for any particular document?
Are there better approaches for caching and replicating objects
through the network?
- Monitoring in a non-cooperative environment. Stefan
Savage at UCSD has developed a powerful technique for estimating
end-to-end bandwidth and packet loss between hosts, where the remote
host is not cooperative in the sense that it would be impossible to
get an account on the machine (e.g., the Yahoo server). Savage's
approach is to exploit the behavior of TCP (which all servers must
implement to the specification) to gain information about the
effective bandwidth from the server to the client. For this project,
you might apply this general idea in some new context, or use Savage's
method for estimating packet loss in the context of a larger
application. For example, would it be possible to use Savage's
technique to build a client-side performance monitoring system that,
for a given HTTP transaction, would isolate the network transmission
time from the server processing time and determine which is the
bottleneck?
- Attaching geographical locations to IP addresses in the
context of a world-wide disaster monitoring system. When natural
disasters such as earthquakes occur, it is very difficult to make
accurate estimates of the geographical extent and severity of the
damage because the communication infrastructure disappears. However,
hosts that provide Internet services are always turned on, so the lack
of response from those systems contains some information. The idea is
to build a system that would sample hosts in earthquake prone regions
on a continual basis. Each sample is a bit vector, one bit per host.
Some interesting issues are assigning IP addresses to geographical
locations, developing a hierarchical scheme to aggregate response bit
vectors, and developing techniques for analyzing the response bit
vectors to distinguish transients (e.g., localized power failures or
normal host downtime) from real damage.
- Network topology discovery and bandwidth monitoring with
incomplete and incompatible SNMP information. Network monitoring
tools such as the CMU Remos system use
information from SNMP daemons running on routers and hosts to discover
network topologies (bridges, routers, and links) and to predict the
available link bandwidths. However, the SNMP information is sometimes
incomplete (because of misconfigured routers) or unavailable (because
of proprietary and non-compatible SNMP daemons or heavy link
traffic). Our current systems assume perfect information, and thus
fail in the presence of incomplete or incompatible information. The
idea here would be to develop some techniques to improve this
situation and then implement them in Remos.
- Scalable search engines. Current search engines are not
scalable because all of the work is done at the remote server site. The
servers therefore cannot afford much computation per request,
typically just a quick lookup in an inverted index.
As a result, single-word queries, which directly index the database on
the server, typically work pretty well, but multiple-word queries can
fail miserably. The idea here is to investigate the following
question: Can we improve the performance of search engines
such as Google by doing some additional work on the client?
- Performance evaluation of content distribution networks.
Content distribution networks such as Akamai claim to significantly
reduce the latency of Web page downloads. The idea here is to
evaluate and quantify the performance benefits of the Akamai service.
When does it help? When does it not help?
- Defending against DDoS attacks. Distributed denial of
service attacks are somewhat scary and difficult to defend
against. The idea here would be to survey the existing approaches,
identify strengths and weaknesses, and propose and evaluate
some improvement.
- Investigate issues in caching dynamic content. The
unfortunate irony is that the high-volume sites that could benefit the
most from Web caching typically generate dynamic content, which is not
cached by existing Web caches. The idea here is to survey the existing
approaches for dynamic content and develop and evaluate an
alternative.
- Locality and load-balancing tradeoffs in cluster-based servers.
The idea here is to explore the conflicting
tradeoffs between locality and load-balancing in cluster-based servers,
identify weaknesses in existing approaches (e.g., LARD) and propose
and evaluate an alternative.
- Evaluation of performance issues in high-speed non-blocking servers.
The idea here is to build a high-speed server that never blocks on I/O (such
as the Flash server from Rice) and then do extensive micro-evaluation
of its performance in order to understand the extent of the performance
gain that is possible from such non-blocking servers.
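To make the disaster-monitoring topic above a little more concrete, here is a minimal Python sketch of one way the response bit vectors might be summarized, aggregated up a hierarchy, and screened for transients. Everything in it is an assumption for illustration: the (responded, probed) summary format, the function names, and the threshold and persistence values are hypothetical, not part of any existing system.

```python
from typing import List, Tuple

# Hypothetical summary format: (hosts that responded, hosts probed).
Summary = Tuple[int, int]


def summarize(bits: List[int]) -> Summary:
    """Reduce one sample (one bit per host, 1 = responded) to a summary."""
    return (sum(bits), len(bits))


def merge(summaries: List[Summary]) -> Summary:
    """Combine child summaries (e.g., cities) into a parent summary (e.g., a region)."""
    return (sum(s[0] for s in summaries), sum(s[1] for s in summaries))


def is_suspect(history: List[Summary],
               threshold: float = 0.5,
               persistence: int = 3) -> bool:
    """Flag an area only if the responding fraction stays below `threshold`
    for the last `persistence` samples, so that a single bad sample
    (a transient) is ignored. Both parameters are assumed values."""
    if len(history) < persistence:
        return False
    recent = history[-persistence:]
    return all(total > 0 and up / total < threshold for up, total in recent)
```

For example, `merge([summarize([1, 1, 0, 1]), summarize([0, 0, 0, 1])])` gives a parent summary of `(4, 8)`. Requiring several consecutive low samples is only one crude way to separate transients from real damage; finding better analysis techniques is the point of the project.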
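For the scalable search engine topic above, the following Python sketch illustrates the division of labor the question has in mind: the "server" only does cheap single-word inverted-index lookups, and the "client" does the extra work for a multiple-word query (intersecting the results and ranking exact phrase matches first). The function names and the ranking rule are assumptions made up for this example; they do not describe how Google or any real engine works.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(docs: Dict[int, str]) -> Dict[str, Set[int]]:
    """Build an inverted index: word -> set of document ids containing it."""
    index: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def server_lookup(index: Dict[str, Set[int]], word: str) -> Set[int]:
    """The cheap server-side operation: one lookup in the inverted index."""
    return set(index.get(word.lower(), set()))


def contains_phrase(tokens: List[str], words: List[str]) -> bool:
    """True if `words` appears as a contiguous run inside `tokens`."""
    n = len(words)
    return any(tokens[i:i + n] == words for i in range(len(tokens) - n + 1))


def client_multiword_query(index: Dict[str, Set[int]],
                           docs: Dict[int, str],
                           query: str) -> List[int]:
    """Hypothetical client-side work: intersect per-word results, then
    rank documents containing the exact phrase ahead of the rest."""
    words = query.lower().split()
    if not words:
        return []
    candidates = set.intersection(*(server_lookup(index, w) for w in words))
    return sorted(
        candidates,
        key=lambda d: (not contains_phrase(docs[d].lower().split(), words), d),
    )
```

In a real system the client would not hold the documents themselves; it would have to fetch snippets or positional data from the server, and that cost is part of what you would need to evaluate.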
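For the locality versus load-balancing topic above, here is a simplified Python sketch of a locality-aware dispatcher in the spirit of LARD. It keeps sending requests for the same target to the same back-end node (locality) unless that node looks overloaded while another node is nearly idle (load balancing). The class name, the load metric (active requests per node), and the threshold values are assumptions for illustration; consult the LARD paper for the actual policy and parameters.

```python
from typing import Dict, List


class LocalityAwareDispatcher:
    """Simplified, hypothetical front-end dispatcher for a server cluster."""

    def __init__(self, num_nodes: int, t_low: int, t_high: int) -> None:
        self.load: List[int] = [0] * num_nodes   # active requests per node
        self.assigned: Dict[str, int] = {}       # target -> node serving it
        self.t_low = t_low                       # "nearly idle" threshold
        self.t_high = t_high                     # "overloaded" threshold

    def _least_loaded(self) -> int:
        return min(range(len(self.load)), key=lambda n: self.load[n])

    def dispatch(self, target: str) -> int:
        """Pick a node for one request and record the added load."""
        node = self.assigned.get(target)
        if node is None:
            # First request for this target: choose purely by load.
            node = self._least_loaded()
        else:
            imbalance = (self.load[node] > self.t_high
                         and min(self.load) < self.t_low)
            if imbalance or self.load[node] >= 2 * self.t_high:
                # Give up locality for this target and move it.
                node = self._least_loaded()
        self.assigned[target] = node
        self.load[node] += 1
        return node

    def complete(self, node: int) -> None:
        """Call when a request finishes on `node`."""
        self.load[node] -= 1
```

A project in this area might start by asking what such a policy does badly, for example when a single hot target exceeds the capacity of one node.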
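For the non-blocking server topic above, the following Python sketch shows the basic shape of a single-threaded, event-driven server that never blocks on network I/O, using the standard `selectors` module. It is only a toy echo server under stated assumptions: the function names are made up for this example, it does not handle partial writes or disk I/O, and it is not the architecture of the Flash server.

```python
import selectors
import socket
from typing import Tuple


def start(host: str = "127.0.0.1",
          port: int = 0) -> Tuple[selectors.BaseSelector, socket.socket]:
    """Create a non-blocking listening socket and register it for reads."""
    listener = socket.socket()
    listener.bind((host, port))   # port 0 lets the OS pick a free port
    listener.listen()
    listener.setblocking(False)
    sel = selectors.DefaultSelector()
    sel.register(listener, selectors.EVENT_READ, accept)
    return sel, listener


def accept(sel: selectors.BaseSelector, listener: socket.socket) -> None:
    """Handler for the listener: accept one connection, make it non-blocking."""
    conn, _addr = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, echo)


def echo(sel: selectors.BaseSelector, conn: socket.socket) -> None:
    """Handler for a client socket: echo data back, or clean up on EOF."""
    data = conn.recv(4096)
    if data:
        conn.send(data)  # sketch only: a real server must buffer partial writes
    else:
        sel.unregister(conn)
        conn.close()


def run_once(sel: selectors.BaseSelector, timeout: float = 1.0) -> None:
    """One turn of the event loop: dispatch every socket that is ready."""
    for key, _events in sel.select(timeout):
        key.data(sel, key.fileobj)
```

A server would call `run_once` in a loop. The interesting measurements for a project come from instrumenting a loop like this one under load, for example time spent in `select` versus time spent in handlers.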
Last modified: Tue Apr 15 10:08:39 EDT 2003