ClusterSim, a Flexible E-Commerce Cluster Simulation

Andrew Boyer
Department of Electrical and Computer Engineering
Carnegie Mellon University
Pittsburgh, PA
aboyer@ece.cmu.edu

ABSTRACT

This paper describes ClusterSim, an e-commerce cluster simulation tool, and two related tools, TrafficGen and ClusterVis. ClusterSim models a three-tier cluster and supports boot-time adjustment of most relevant parameters. The performance characteristics of the different elements of the system are modeled probabilistically; the format for a probability distribution input file is described. A similar file format is used to model the overload behavior of an individual machine. TrafficGen generates traffic trace files that provide input to the simulation. It is configurable using probability distribution input files. ClusterVis is a Matlab visualization tool that facilitates analysis of the simulation's output trace file. It can be used to calculate performance statistics and to graph timing information. Together, the tools can be used for off-line performance analysis and run-time reconfiguration modeling without any of the costs associated with actual hardware. Analysis of two example cluster configurations is provided under both light load and heavy load. Directions for future research and improvements are provided.