The goal of this project is to develop an algorithm and implementation of Pearson correlation that maps well to the GPU. We investigate the parallel programming techniques and algorithmic trade-offs needed to minimize shared memory use, avoid bank conflicts, and achieve high memory streaming bandwidth. The project builds on preliminary work done over the summer.
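For reference, the quantity being computed can be sketched on the CPU. The following is a minimal illustration, not the project's GPU code: it assumes the common single-pass formulation, in which the correlation is assembled from five running sums. Because each sum is a simple reduction, this form is often used as a starting point for parallel versions. The function name `pearson` and the error handling are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length signals from five running sums.

    Hypothetical reference helper: each sum is an independent reduction,
    which is the property a parallel implementation could exploit.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length signals with at least 2 samples")
    n = len(x)
    sx = sy = sxx = syy = sxy = 0.0
    for a, b in zip(x, y):
        sx += a
        sy += b
        sxx += a * a
        syy += b * b
        sxy += a * b
    var_x = n * sxx - sx * sx
    var_y = n * syy - sy * sy
    if var_x <= 0.0 or var_y <= 0.0:
        raise ValueError("correlation is undefined for a constant signal")
    return (n * sxy - sx * sy) / math.sqrt(var_x * var_y)
```

The single-pass form reads each sample once, which suits streaming, but it can lose precision when the mean is large relative to the variance. A two-pass, mean-centered form is the usual alternative when accuracy matters more than memory traffic.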
Objectives:
1. Resolve the bank conflict issue and fill the machine with the maximum number of threads.
2. Compare the performance of the version that generates only the unique correlations with the version that generates all combinations.
3. Achieve performance above 200 GFLOPS.
4. Raise GPU occupancy above 50%.
5. Develop an advanced version that generates output for signals shifted over a given range, using an optimal number of registers and threads.
6. Target larger problem sizes and memory streaming.
7. Modify the read-in method and the GUI for the website.
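To make the second and fifth objectives concrete, here is a small CPU sketch of the three output shapes involved: all ordered combinations, unique pairs only, and correlations over a range of shifts. It is an illustration under assumptions, not the project's method: the names `all_pairs`, `unique_pairs`, and `shifted` are hypothetical, and the shifted version is assumed to correlate only the overlapping samples of the two signals.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Mean-centered Pearson correlation (assumes non-constant inputs)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def all_pairs(signals):
    """Correlate every ordered pair (i, j): n * n results, including i == j."""
    idx = range(len(signals))
    return {(i, j): pearson(signals[i], signals[j]) for i, j in product(idx, repeat=2)}


def unique_pairs(signals):
    """Correlate only pairs with i < j: n * (n - 1) / 2 results."""
    idx = range(len(signals))
    return {(i, j): pearson(signals[i], signals[j]) for i, j in combinations(idx, 2)}


def shifted(x, y, max_shift):
    """Correlate x[t] with y[t + s] for each s in [-max_shift, max_shift].

    Assumption: only the overlapping part of the two signals is used.
    """
    n = len(x)
    out = {}
    for s in range(-max_shift, max_shift + 1):
        if s >= 0:
            xs, ys = x[:n - s], y[s:]
        else:
            xs, ys = x[-s:], y[:n + s]
        out[s] = pearson(xs, ys)
    return out
```

Because the correlation is symmetric, the unique-pairs version does less than half the arithmetic of the all-combinations version. Its triangular index space is less regular, though, so it may be harder to map evenly onto threads, which is the trade-off the comparison is meant to measure.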