Sinha, Arnov2018-05-102018-05-102017http://udspace.udel.edu/handle/19716/23150The Sparse Fast Fourier Transform (sFFT) is a recent algorithm developed by Hassanieh et al. at MIT for Discrete Fourier Transforms on signals with a sparse frequency domain. A reference implementation of the algorithm exists and proves that the sFFT can be faster than modern FFT libraries for signals of sparse nature. The algorithm has been parallelized using multiple approaches, such as over multicore by Cheng et al. over GPGPUs using CUDA by Cheng et al. and optimized for serial execution using SSE intrinsics by Schumacher et al. ☐ While the increase in number of cores and memory bandwidth on modern architectures provide an opportunity to improve performance through sophisticated parallel algorithm design, the sFFT is inherently complex, embarrassingly parallel, and numerous challenges need to be addressed to deliver the optimal performance. In this Masters Thesis, we employ a high-level directive-based parallel programming model, OpenACC to create a performance portable sFFT code. We call our implementation, ACCsFFT. Our implementation can target heterogeneous platforms consisting of x86 or Power Processors integrated with GPUs. The performance of our implementation is compared against existing parallel implementations that have used either low-level or proprietary software on CPUs and GPUs. ☐ Several optimizations are proposed and implemented in ACCsFFT. The performance is also compared against the highly optimized parallel FTW library. We also target GPUs from different families, old to the most modern hardware, to verify if the algorithm is performance portable, scalable and reproducible across generations. ☐ We deliver a high-level parallel sparse FFT library capable of running one version of the code in a serial manner on CPU or in parallel on multicore, GPGPUs and other architectures that OpenACC currently supports. A programmer would only need to change input parameters of the algorithm for the different runs.Applied sciencesGPUHPCIrregular algorithmOpenACCParallelSparse Fast Fourier TransformHigh performance sparse Fast Fourier Transform using OpenACCThesis10352147592017-11-10en