General system scenario

  1. A JBoss instance is manually started.
  2. The Global JNDI service is manually started in that JBoss instance, on the yahtzee machine. The startjndi.sh script can be used for this purpose.
  3. The Replication Manager is manually started. It should be started on drlucky.ece.cmu.edu. The runRepMan.sh script can be used for this purpose.
  4. The Replication Manager starts the replicas. The servers where the replicas are started have been previously configured in a file called repman.properties.
  5. Now, the client can be started. The runFTClient.sh script can be used for this purpose.
  6. The client dynamically reads the location (hostname and port number) of the Replication Manager from the file vaultjndi.properties.
  7. The client accesses the Replication Manager using RMI.
  8. If the operation is successful, the client requests the JNDI host name, the number of server instances, and the JNDI name of the server beans.
  9. The client connects to the JNDI server and requests the object for each EJB. The client repeats steps 7 to 9 periodically, in parallel with the rest of its work.
  10. The client is ready for user input.
  11. The user selects an operation.
  12. The client creates as many threads as EJB instances it has. Each thread will call the same operation in a different server instance.
  13. The client waits until a thread comes back with an answer.
  14. The client shows the results.
  15. Optionally, go to step 10.
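
Steps 12 and 13 can be sketched roughly as follows. This is only an illustration, not the project's client code: the Replica interface, the FirstAnswerClient class, and the operation name are hypothetical stand-ins for the real EJB remote interfaces. It assumes at least one replica is passed in and that the first successful answer is the one shown to the user.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the remote interface of a replicated server bean.
interface Replica {
    String invoke(String operation) throws Exception;
}

class FirstAnswerClient {
    // Sends the same operation to every replica, one thread per replica,
    // and returns the first answer that completes without an exception.
    static String callAll(List<Replica> replicas, String operation)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
        try {
            List<Callable<String>> calls = new ArrayList<>();
            for (Replica replica : replicas) {
                calls.add(() -> replica.invoke(operation));
            }
            // invokeAny blocks until one task succeeds and cancels the rest.
            // It throws ExecutionException only if every task fails.
            return pool.invokeAny(calls);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Two fake replicas: one that is down and one that answers.
        Replica dead = op -> { throw new java.rmi.RemoteException("replica down"); };
        Replica alive = op -> "answer to " + op;
        System.out.println(callAll(Arrays.asList(dead, alive), "getBalance"));
    }
}
```

Because invokeAny skips tasks that throw, a replica that fails does not surface as an error to the user as long as at least one other replica answers.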

Fault injection scenario

  1. Repeat steps 1-10 of the general system scenario.
  2. A server bean is killed.
  3. The user selects an operation.
  4. The client creates as many threads as EJB instances it was given by the Replication Manager. Each thread will call the same operation in a different server instance.
  5. The client waits until a thread comes back with an answer. The client does not show any exceptions or timeouts when a replica fails to answer. The Replication Manager updates the list of available replicas after it realizes that a replica is down (by default it checks on the replicas every 100 seconds).
  6. The client shows the results.
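
The periodic check mentioned in step 5 could look roughly like the sketch below. It is an assumption about one possible design, not the actual Replication Manager: the ReplicaMonitor class, the host names, and the isAlive probe (for example a ping over RMI) are all hypothetical. Only the 100-second default period comes from the text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

class ReplicaMonitor {
    private final List<String> configured;      // replicas listed in the configuration
    private final Predicate<String> isAlive;    // hypothetical liveness probe
    private volatile List<String> available;    // snapshot handed out to clients

    ReplicaMonitor(List<String> configured, Predicate<String> isAlive) {
        this.configured = new ArrayList<>(configured);
        this.isAlive = isAlive;
        this.available = Collections.unmodifiableList(new ArrayList<>(configured));
    }

    // One pass over the configured replicas: keep only those that answer the probe.
    // Recomputing from the configured list also picks up replicas that came back.
    void checkOnce() {
        List<String> alive = new ArrayList<>();
        for (String host : configured) {
            if (isAlive.test(host)) {
                alive.add(host);
            }
        }
        available = Collections.unmodifiableList(alive);
    }

    List<String> availableReplicas() {
        return available;
    }

    // Runs checkOnce every periodSeconds (100 would match the default above).
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::checkOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return timer;
    }

    public static void main(String[] args) {
        // Pretend that host2 has been killed.
        ReplicaMonitor monitor = new ReplicaMonitor(
                Arrays.asList("host1", "host2"), host -> !host.equals("host2"));
        monitor.checkOnce();
        System.out.println(monitor.availableReplicas()); // prints [host1]
    }
}
```

With this design a killed replica stays in the list handed to clients until the next check runs, which is why the client has to tolerate replicas that do not answer in the meantime.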

Recovery scenario

  1. Continue from the last step (step 6) of the fault injection scenario.
  2. Start the replica that was killed.
  3. The replica should automatically register with the JNDI service.
  4. Wait until the Replication Manager refreshes the list of replicas it has (by default 100 seconds).
  5. The user selects an operation.
  6. The client creates as many threads as EJB instances it was given by the Replication Manager.
  7. The client waits until a thread comes back with an answer. The client log should show that the replica was recovered.
  8. The client shows the results.
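
One way the client could notice a recovered replica for its log (step 7) is to compare the replica list it had before with the list it receives on the next refresh. The sketch below only illustrates that idea; the ReplicaListDiff class and the host names are hypothetical and the real client may detect recovery differently.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ReplicaListDiff {
    // Returns the replicas present in the refreshed list but absent from the previous one.
    static Set<String> recovered(List<String> previous, List<String> refreshed) {
        Set<String> result = new LinkedHashSet<>(refreshed);
        result.removeAll(previous);
        return result;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("host1");          // list while host2 was down
        List<String> after = Arrays.asList("host1", "host2");  // list after the next refresh
        for (String host : recovered(before, after)) {
            System.out.println("Replica recovered: " + host);
        }
    }
}
```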

Current Status

  Active replication: Ready.
  Fault detection: Ready. The clients detect exceptions from the replicas.
  Fail-over and Checkpointing: N/A.
  Recovery: Ready. The clients discover recovered replicas automatically.
  Fault manager: Not ready.

Exception Handling

  Handled exceptions:
    The naming service, client, or Replication Manager is started in the wrong order.
    The user enters a wrong option.
    Problems connecting the client to the EJBs (javax.ejb.CreateException).
    Middleware timeouts (java.rmi.RemoteException).

  Not handled exceptions:
    Database errors.
    Replication Manager or Global JNDI failures.
    Some wrong parameters in the client property file.
    Common mode failures.