FaultToleranceDesign

Replication

Two servers will run on separate machines, and each will be registered with the naming service. The servers keep no state, so the replicas are identical and no server ever needs to refresh its state from another.

Fault Detection

The client detects a failed server when an attempt to communicate with it raises an exception. Independently, the Replication Manager will ping each server by calling isAlive() every five seconds. If a server does not respond, the Replication Manager will kill it.
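
The Replication Manager's ping loop could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the names find_dead_servers, monitor and kill are hypothetical, each server is modeled as a plain callable standing in for a remote isAlive() call, and treating a ping that raises as "no response" is an assumption.

```python
import time
from typing import Callable, Dict, List

PING_INTERVAL_SECONDS = 5  # the design pings each server every five seconds


def find_dead_servers(servers: Dict[str, Callable[[], bool]]) -> List[str]:
    """Call isAlive() once on every server; return the names that did not respond."""
    dead = []
    for name, is_alive in servers.items():
        try:
            responded = is_alive()
        except Exception:
            # Assumption: a ping that raises (refused connection, timeout)
            # is treated the same as no response.
            responded = False
        if not responded:
            dead.append(name)
    return dead


def monitor(servers: Dict[str, Callable[[], bool]],
            kill: Callable[[str], None],
            rounds: int,
            sleep: Callable[[float], None] = time.sleep) -> None:
    """Run a fixed number of ping rounds, killing every unresponsive server."""
    for _ in range(rounds):
        for name in find_dead_servers(servers):
            kill(name)
        sleep(PING_INTERVAL_SECONDS)
```

A real manager would presumably loop indefinitely; the rounds and sleep parameters exist only so the sketch can be exercised without waiting.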

Fail-Over

When the client gets an exception indicating that the server is dead, it will ask the naming service for the next server. Server names will follow the format <server><X>, for example Fred1 and Fred2. The client will then retry the request against the second server. The request does not need to carry a unique identifier to the fail-over server: there is a primary key constraint on (user_id, rank, election_id), so the database will reject any attempt to add the same data twice.
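
The retry logic and the role of the primary key can be illustrated with a small sketch. Everything here is assumed for illustration: the votes table and its candidate column, the ServerDown exception, and the functions make_db, cast_vote and send_with_failover are hypothetical, and an in-memory SQLite database stands in for the real one. Only the key columns (user_id, rank, election_id) and the <server><X> naming come from the design.

```python
import sqlite3
from typing import Callable, Tuple


class ServerDown(Exception):
    """Stands in for the exception the client sees when a server is dead."""


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with the composite primary key from the design."""
    db = sqlite3.connect(":memory:")
    db.execute(
        'CREATE TABLE votes (user_id INTEGER, "rank" INTEGER, '
        'election_id INTEGER, candidate TEXT, '
        'PRIMARY KEY (user_id, "rank", election_id))'
    )
    return db


def cast_vote(db: sqlite3.Connection, user_id: int, rank: int,
              election_id: int, candidate: str) -> str:
    """Insert one row; a repeated key is rejected by the database itself."""
    try:
        with db:  # commits on success, rolls back if the insert fails
            db.execute("INSERT INTO votes VALUES (?, ?, ?, ?)",
                       (user_id, rank, election_id, candidate))
        return "stored"
    except sqlite3.IntegrityError:
        return "already stored"


def send_with_failover(lookup: Callable[[str], Callable[[Tuple], str]],
                       base_name: str, replica_count: int,
                       request: Tuple) -> str:
    """Try <server>1, <server>2, ... in order until one replica answers."""
    for x in range(1, replica_count + 1):
        server = lookup(f"{base_name}{x}")  # ask the naming service
        try:
            return server(request)
        except ServerDown:
            continue  # fail over to the next replica
    raise ServerDown("all replicas are down")
```

The interesting case is a server that writes the row and then dies before replying. The client retries on the next replica, the second insert hits the primary key constraint, and the table still holds exactly one row, which is why no request identifier is needed.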

Recovery

When the Replication Manager detects a dead server (or has to kill a server), it will restart that server. There is no need to refresh state, since the middle tier is stateless.
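
A recovery step along these lines might look like the sketch below. The recover function and the kill and start callbacks are hypothetical names standing in for whatever process control the Replication Manager actually uses.

```python
from typing import Callable


def recover(name: str,
            still_running: bool,
            kill: Callable[[str], None],
            start: Callable[[str], None]) -> None:
    """Bring one failed server back; no state transfer is needed."""
    if still_running:
        # The process is hung rather than gone, so it has to be killed first.
        kill(name)
    # The middle tier is stateless, so a fresh process is a complete replica.
    start(name)
```

Because nothing has to be copied into the restarted server, recovery reduces to process management.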

Checkpointing

Checkpointing is not necessary, because the middle tier is stateless.