Team #: 3
Group Name: COSMOS
18-749: Fault-Tolerant Distributed Systems
Spring 2005

 

FIRST DEPLOYMENT

How to Deploy:


Launch in this order:
1. replication manager
2. webserver
3. application service
REPLICATION MANAGER
(1) Deploy the replication manager on any of the following machines:
a. beta.ece.cmu.edu
b. rho.ece.cmu.edu
c. gamma.ece.cmu.edu
d. lambda.ece.cmu.edu
e. phi.ece.cmu.edu
(2) Deploy ONLY ONE REPLICATION MANAGER!
i. You could deploy to several machines, but only one will be used: the first one found, in the order of lookup.
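
The "only one will be used, in the order of lookup" rule can be illustrated with a small sketch. This is not the COSMOS lookup code. It assumes the lookup order matches the order of the machine list above, and the is_running check is a hypothetical stand-in for whatever lookup mechanism the system actually uses.

```python
from typing import Callable, Iterable, Optional

# Candidate machines from the list above.
# Assumption: the lookup order is the same as the listed order.
CANDIDATE_HOSTS = [
    "beta.ece.cmu.edu",
    "rho.ece.cmu.edu",
    "gamma.ece.cmu.edu",
    "lambda.ece.cmu.edu",
    "phi.ece.cmu.edu",
]


def find_replication_manager(
    hosts: Iterable[str],
    is_running: Callable[[str], bool],
) -> Optional[str]:
    """Return the first host, in lookup order, with a running replication manager.

    is_running is a hypothetical probe supplied by the caller; it stands in
    for the real lookup and is not part of the original system.
    """
    for host in hosts:
        if is_running(host):
            return host  # later hosts are never consulted
    return None


# Illustration: two managers are deployed, but only the first one found is used.
deployed = {"gamma.ece.cmu.edu", "phi.ece.cmu.edu"}
chosen = find_replication_manager(CANDIDATE_HOSTS, lambda h: h in deployed)
```

Under these assumptions, a second replication manager on phi is simply never reached, which is why deploying more than one is pointless.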

WEBSERVER & APPLICATION
(1) Deploy the application and webserver on any computer of your choice.
(2) Launch as many application instances as you want on any computer of your choice.

SERVICE
(1) Currently you can Create an Account, View an Account, Log in to the System, and Edit an Account.

CreateAccount:
Enter a first and last name and a (fake) Social Security Number, then choose a login and password for the system.

ViewAccount:
Enter the login of the user whose account you would like to see.

LoginAccount:
Enter the login and password chosen in CreateAccount to log in to the system.

EditAccount:
Enter a name and password to change the account information for that login name.
 

COSMOS REPLICATION MANAGER
COSMOS WEBSERVER
COSMOS APPLICATION SERVER

CLICK HERE TO USE A RUNNING INSTANCE OF SERVICE

Fault Tolerant Real-time Measurements
Excel Doc

To reduce jitter/latency:
One aspect of fault recovery is that when the backup application server becomes the primary application server, it must go to the back-end tier (e.g. the database) to retrieve all state, which takes time. If an application server fails in the middle of a transaction, the new primary must retrieve all transactional data from the database, work out which transaction was being executed, and then complete it. This adds a large amount of latency to that transaction.

Thus, when the webserver contacts the new primary, it would be faster for the webserver to simply ask the new primary to redo the transaction, instead of having the new primary figure out which transaction (if any) was in progress. Most of our transactions are idempotent anyway, and duplicate transactions are already caught and handled by the application server code, so redoing a transaction would not be a concern.
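
The resend-on-failover idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the COSMOS implementation: the ApplicationServer class, the submit function, the transaction IDs, and the dictionary standing in for the database are all hypothetical. It shows the webserver resending the same transaction to the next server, and duplicate detection keeping a repeated transaction from being applied twice.

```python
class ApplicationServer:
    """Hypothetical application server replica sharing one back-end store."""

    def __init__(self, name, database, alive=True):
        self.name = name
        self.database = database  # stand-in for the database: txn_id -> result
        self.alive = alive

    def execute(self, txn_id, operation):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        # Duplicate detection: a transaction that already completed is not
        # run again; its recorded result is returned instead.
        if txn_id in self.database:
            return self.database[txn_id]
        result = operation()
        self.database[txn_id] = result
        return result


def submit(servers, txn_id, operation):
    """Webserver side: resend the same transaction to the next server on failure."""
    for server in servers:
        try:
            return server.execute(txn_id, operation)
        except ConnectionError:
            continue  # this server failed; ask the next one to redo the transaction
    raise RuntimeError("no application server available")


# Illustration: the primary is down, so the backup redoes the transaction.
database = {}
calls = []


def create_account():
    calls.append("create")
    return "account created"


primary = ApplicationServer("primary", database, alive=False)
backup = ApplicationServer("backup", database)
first = submit([primary, backup], "txn-1", create_account)
# Resending the same transaction is harmless: it is recognized as a duplicate.
second = submit([primary, backup], "txn-1", create_account)
```

In this sketch the new primary never has to reconstruct which transaction was in flight. The webserver already knows what it asked for, so it just asks again.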