Matrix-Matrix Multiplication (MMM) is a fundamental operation in scientific computing, as it forms the foundation of many scientific libraries and applications. It is one of the few operations that can achieve the theoretical peak floating-point capacity and peak memory bandwidth of a conventional processor. Unfortunately, reaching this floating-point peak requires expert knowledge of linear algebra and computer architecture to craft an implementation tuned for a particular microarchitecture. The search space of possible MMM implementations is large, and experts have found ways to traverse this space with models, optimizations, and fine-tuning. In this talk, we will show how we automated the domain expert in this field using Spiral, a framework for generating high-performance code from formal descriptions. The end result: the performance of the Spiral-generated code is comparable to that of hand-written and hand-tuned code on a variety of computer architectures.
Richard Veras is a third-year PhD student in the Electrical and Computer Engineering Department at Carnegie Mellon University. He is a member of the SPIRAL group and is advised by Dr. Franz Franchetti. He earned his bachelor's degree in Computer Science and Mathematics at the University of Texas at Austin. His research interests are in Automatic Program Generation and Scientific and High Performance Computing.