Team 5, Spring 2006

Team Members

Patty Pun - tpun@andrew.cmu.edu
Kevin Smith - kevinsmith@cmu.edu
Zhang Yi - zhangyi@cmu.edu
Felix Tze-Shun Yip - fty@andrew.cmu.edu

Project Info

Title

Mafia: Online Mobsters

Description

An implementation of the game 'Mafia' that requires instant messaging, character status maintenance, and a graphical user interface.

System Configuration

Middleware - EJB
Operating System - Linux
Programming Language - Java

Third-party Software

Ant (Code Compilation & Deployment)
CVS (Version Control)
Eclipse (Development Environment)
Javadoc (Source Code Documentation)
JBoss (EJB Implementation)
MySQL (Database Management)
Tomcat (Web Server)
XDoclet (Interface File Generation)

Baseline Design

Requirements

Allows a client to communicate with other clients
A client may only belong to one chat session
A server may support multiple chat sessions
Discussions take place using any web browser
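The session requirements above can be illustrated with a minimal sketch. The class name, the String identifiers, and the method signatures are assumptions made for illustration; only the names enterGame and leaveGame come from this page, and the project's actual bean code may differ. A single map from client to session allows many sessions per server while limiting each client to one.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry: many sessions per server, at most one session per client. */
final class ChatSessionRegistrySketch {
    // Client name -> session name. One map entry per client enforces the
    // "only one chat session" rule; distinct session names give multiple sessions.
    private final Map<String, String> sessionByClient = new HashMap<>();

    /** Returns true if the client joined; false if it is already in any session. */
    synchronized boolean enterGame(String client, String session) {
        return sessionByClient.putIfAbsent(client, session) == null;
    }

    /** Removes the client from whatever session it is in, if any. */
    synchronized void leaveGame(String client) {
        sessionByClient.remove(client);
    }

    /** Returns the client's current session, or null if it is in none. */
    synchronized String sessionOf(String client) {
        return sessionByClient.get(client);
    }

    public static void main(String[] args) {
        ChatSessionRegistrySketch registry = new ChatSessionRegistrySketch();
        System.out.println(registry.enterGame("alice", "game1")); // first join
        System.out.println(registry.enterGame("alice", "game2")); // rejected: already in game1
        System.out.println(registry.enterGame("bob", "game2"));   // second session on same server
    }
}
```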

Interfaces

sendMessage();
receiveMessage();
submitVote();
murderVictim();
leaveGame();
enterGame();
gameMsgs();
youAreKilled();

Scenarios & Interactions

Message processing
Status update

Current Status

Creation of a single end-to-end case on the Windows operating system

Downloads

Binary Distribution

Documentation

Source Code Documentation

Fault-Tolerant Design

Scenarios & Interactions

Use of passive replication
Upon failure of the primary server, the replication manager begins redirecting clients to the backup server
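As a rough illustration of the redirect step, the sketch below keeps an ordered list of servers and hands out the head of that list as the primary. The class and method names are hypothetical, the machine name "clue" is made up (only "risk" appears on this page), and failure detection itself is left out. It is not the project's replication manager.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical replication manager for passive replication: one primary, ordered backups. */
final class ReplicationManagerSketch {
    // Head of the deque is the current primary; the rest are backups in promotion order.
    private final Deque<String> servers = new ArrayDeque<>();

    ReplicationManagerSketch(List<String> serverNames) {
        servers.addAll(serverNames);
    }

    /** Name of the server clients should currently use, or null if none is left. */
    synchronized String currentPrimary() {
        return servers.peekFirst();
    }

    /** Called when a server is found to have failed; if it was the primary, the next backup takes over. */
    synchronized void reportFailure(String serverName) {
        servers.remove(serverName);
    }

    public static void main(String[] args) {
        ReplicationManagerSketch manager =
                new ReplicationManagerSketch(Arrays.asList("risk", "clue"));
        System.out.println(manager.currentPrimary()); // clients start on the primary
        manager.reportFailure("risk");
        System.out.println(manager.currentPrimary()); // clients are redirected to the backup
    }
}
```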

Downloads

Binary Distribution

Documentation

Source Code Documentation

Fault-Tolerance Evaluation

Design Proposal (PDF)
Results & Analysis (PDF)
Client Invocations (PDF)
Raw Graph Data (158 Figures)
Raw Graph Data (tar.gz)
Raw Probe Data (tar.gz)

Real-Time, Fault-Tolerant Design

Real-Time Evaluation

Real-Time Evaluation Results (PDF)

High-Performance, Real-Time, Fault-Tolerant Design

High Performance Plan

Our real-time evaluation showed us that failover from the primary server to the backup server made up over 90% of our recovery time for faulty invocations. As a result, our goal in Phase IV will be to improve this failover time by optimizing the failover process.

One bottleneck in our failover mechanism is the replication manager, which takes a considerable amount of time to update its list of servers. As a result, the client spends too much time waiting for the replication manager to provide a new server name. The other bottleneck is the process of creating a new bean once the replication manager has provided a valid server name.

Both of these delays can be greatly reduced by always having a second bean ready on the client. When the client starts, it can ask the replication manager for the name of the next backup server along with the name of the primary server. The client can then create two beans, one pointing to each server. If the client cannot invoke a method on the primary server, it can immediately begin using the secondary bean. It can then continue processing as usual while, in the background, it gets the name of the next backup server from the replication manager and creates a new secondary bean. With this approach, the client always has a secondary bean readily available if the primary server goes down. This assumes that the backup server will not go down before the primary; if it does, the delay would be no worse than in our current setup. By always keeping two beans ready on the client, we can significantly reduce the end-to-end latency in the presence of primary server failure.
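The plan above might look roughly like the following sketch. Everything here is an assumption made for illustration: the Bean and ReplicationManager interfaces, the backupAfter lookup, and the machine names "clue" and "sorry" are hypothetical stand-ins, and real EJB lookup and home/remote interfaces are omitted. The point is only that the lookup and bean creation happen before a failure, so failover is a reference swap followed by one retry.

```java
import java.io.IOException;
import java.util.concurrent.Executor;
import java.util.function.Function;

/** Sketch of a client that keeps a standby bean ready for immediate failover. */
final class DualBeanClientSketch {
    /** Hypothetical stand-in for a remote bean reference. */
    interface Bean {
        String call(String request) throws Exception;
    }

    /** Hypothetical view of the replication manager. */
    interface ReplicationManager {
        String primary();
        String backupAfter(String serverName); // null when no backup is left
    }

    private final ReplicationManager manager;
    private final Function<String, Bean> beanFactory; // the slow bean-creation step
    private final Executor background;
    private String activeName;
    private Bean active;
    private String standbyName;
    private Bean standby;

    DualBeanClientSketch(ReplicationManager manager,
                         Function<String, Bean> beanFactory,
                         Executor background) {
        this.manager = manager;
        this.beanFactory = beanFactory;
        this.background = background;
        this.activeName = manager.primary();
        this.active = beanFactory.apply(activeName);
        prepareStandby(); // pay the lookup and creation cost up front
    }

    /** Looks up the next backup and creates its bean without holding the lock. */
    private void prepareStandby() {
        String current;
        synchronized (this) { current = activeName; }
        String name = manager.backupAfter(current);
        Bean bean = (name == null) ? null : beanFactory.apply(name);
        synchronized (this) {
            if (current.equals(activeName)) { // ignore results that are already stale
                standbyName = name;
                standby = bean;
            }
        }
    }

    synchronized String standbyServer() { return standbyName; }

    /** Invokes on the active bean; on failure, switches to the standby and retries once. */
    synchronized String invoke(String request) throws Exception {
        try {
            return active.call(request);
        } catch (Exception primaryFailure) {
            if (standby == null) throw primaryFailure; // nothing to fail over to
            active = standby;
            activeName = standbyName;
            standby = null;
            standbyName = null;
            background.execute(this::prepareStandby); // refill the standby slot
            return active.call(request);
        }
    }

    /** Demo with in-memory fakes: "risk" is down, "clue" and "sorry" are up. */
    static String demo() {
        ReplicationManager rm = new ReplicationManager() {
            public String primary() { return "risk"; }
            public String backupAfter(String s) {
                return s.equals("risk") ? "clue" : s.equals("clue") ? "sorry" : null;
            }
        };
        Function<String, Bean> factory = name -> request -> {
            if (name.equals("risk")) throw new IOException("risk is down");
            return name + " handled " + request;
        };
        // Runnable::run executes the refill inline, which keeps the demo deterministic.
        DualBeanClientSketch client = new DualBeanClientSketch(rm, factory, Runnable::run);
        try {
            return client.invoke("vote") + "; standby=" + client.standbyServer();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "clue handled vote; standby=sorry"
    }
}
```

A real client would pass a thread-backed Executor so the standby is rebuilt while invocations continue. The sketch also treats every exception as a server failure, which a real client would need to narrow to communication errors.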

Tips

JBoss and Java 5

For system evaluation, you may wish to make use of Java 5's System.nanoTime() method. Unfortunately, JBoss has difficulties working under Java 5. To get around this, you can delete the javax.management.* classes in your Java 5 installation. The following commands should accomplish this for you.

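# Work on a copy of rt.jar in a scratch directory
cd $JAVA_HOME/jre/lib
mkdir temp
cp rt.jar temp
cd temp
# Unpack the jar, then delete the archive copy and the javax.management classes
jar xf rt.jar
rm -rf rt.jar javax/management/*
# Repack what is left and overwrite the original rt.jar
jar cf rt.jar *
cp rt.jar ..
cd ..
rm -rf temp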
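With Java 5 available, a timing probe around an invocation could be as simple as the sketch below. The class and method names are hypothetical, and the sleep only stands in for a real bean call. System.nanoTime() is meant for measuring elapsed time and has no relation to wall-clock time, so only differences between two readings are meaningful.

```java
/** Minimal timing probe built on System.nanoTime(); names are illustrative only. */
final class InvocationTimerSketch {
    /** Runs the given invocation and returns its elapsed time in nanoseconds. */
    static long timeNanos(Runnable invocation) {
        long start = System.nanoTime();
        invocation.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // The sleep is a placeholder for a remote method call on a bean.
        long elapsed = timeNanos(() -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("elapsed ms: " + elapsed / 1000000.0);
    }
}
```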

SSH Environment Variables

SSH provides a convenient way to execute commands remotely. This is very useful for 749 projects that need to start and stop servers and clients remotely. To start the JBoss server on machine risk, for example, you could execute

ssh risk $JBOSS_HOME/bin/run.sh 2>&1 &

Unfortunately, a command run through ssh this way does not see all of the environment variables you would have after logging in to machine risk interactively. You can, however, explicitly specify variable values by adding them to the file ~/.ssh/environment. With OpenSSH, this file is read by the SSH server on the machine you connect to (machine risk in our example), not on the machine you run ssh from, and only if PermitUserEnvironment is enabled in that server's sshd_config. If you're running everything on ECE machines, the distinction doesn't matter, because AFS gives you the same home directory on every machine. Your file might look something like

JAVA_HOME=/usr/local/j2sdk1.4.2_02
JBOSS_HOME=/afs/ece/class/ece749/ejb/jboss-3.2.3

Documents & Downloads

Baseline Design

Binary Distribution
Source Code Documentation

FT Baseline Design

Binary Distribution
Source Code Documentation

FT Baseline Evaluation

Design Proposal (PDF)
Results & Analysis (PDF)
Client Invocations (PDF)
Raw Graph Data
Raw Graph Data (tar.gz)
Raw Probe Data (tar.gz)

Real-Time Evaluation

RT Results (PDF)

High-Performance Evaluation

HP Results (PDF)