Title: Determination of Replication Degree
Author: Chutika Udomsinn

Replication is a well-known technique for providing fault tolerance. Most of the time, application developers know the characteristics of their replicas' resources (CPU, memory, and network bandwidth), the amount of application state, the workload, and the fault interarrival time. The well-known rule of using f+1 replicas to tolerate f faults is not practical, because the maximum number of faults f is hard to determine in the first place. Developers therefore resort to trial and error to find a replication configuration (replication style, number of replicas, checkpoint period, and fault detection frequency) that achieves their availability goal. This is a time-consuming process and sometimes ends with a suboptimal configuration.

I plan to build the Advisor component of the MEAD middleware (http://www.ece.cmu.edu/~mead/) developed at CMU. Together with a resource monitoring agent, the Advisor will perform a test run to determine the fail-over and recovery times of a given replica and application. Given the workload and fault interarrival time supplied by the user, the Advisor produces a replication configuration for a specified replication style.

The main test application will be an electronic voting system with an adjustable amount of state. The Advisor will also be tested on eight other CORBA or J2EE applications from groups in the 18-749 class, Spring 2006.

The evaluation will cover two cases: validation and optimization. For the validation test, I will show that the suggested configuration incurs no downtime within the user's mission time. For the optimization test, I will show that the number of replicas suggested by the Advisor is optimal: using fewer replicas than suggested results in system failure, and using more yields no additional benefit.

Assumptions:
1) constant workload
2) constant fault interarrival time
3) homogeneous replica nodes

Constants:
1) replica resource availability
2) replication style
3) workload
4) fault interarrival time
5) application (determines the amount of state)

Variables:
1) checkpoint period
2) fault detection frequency

Determine:
1) recovery time: f(restart time, amount of state)
2) fail-over time: f(amount of state, resources, checkpoint period, fault detection time)
3) replication degree
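
As a rough illustration of how a measured recovery time could be turned into a replication degree, the sketch below uses a deliberately simple model that is not part of the proposal. It assumes (hypothetically) that recovery time is the restart time plus the time to transfer the application state at a fixed rate, that each fault takes down exactly one live replica, and that faults arrive at a constant interval, as in the assumptions listed above. Under those assumptions at most ceil(recovery time / fault interarrival time) replicas are down at once, so one more replica than that keeps the service available. The function names and the transfer-rate parameter are illustrative only and are not taken from MEAD.

```python
import math


def recovery_time(restart_s, state_bytes, transfer_bytes_per_s):
    """Hypothetical linear model: restart cost plus state-transfer cost."""
    if transfer_bytes_per_s <= 0:
        raise ValueError("transfer rate must be positive")
    return restart_s + state_bytes / transfer_bytes_per_s


def replication_degree(recovery_s, fault_interarrival_s):
    """Smallest replica count that keeps one replica up at all times.

    Assumes one replica fails every fault_interarrival_s seconds and
    each failed replica is back in service after recovery_s seconds.
    """
    if recovery_s < 0 or fault_interarrival_s <= 0:
        raise ValueError("recovery time must be >= 0 and interarrival > 0")
    # Replicas that can be recovering at the same moment.
    max_down = math.ceil(recovery_s / fault_interarrival_s)
    # One extra replica so the service never loses its last live copy.
    return max_down + 1


if __name__ == "__main__":
    r = recovery_time(restart_s=2, state_bytes=50_000_000,
                      transfer_bytes_per_s=10_000_000)
    print(r, replication_degree(r, fault_interarrival_s=3))
```

In this model, increasing the amount of state lengthens recovery and therefore raises the required replication degree, while a longer fault interarrival time lowers it. Fail-over time, checkpoint period, and fault detection frequency are left out of the sketch; a real Advisor would measure them in the test run rather than derive them from a closed-form expression.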