STATISTICAL ANALYSIS OF LLVM IR COMPILATION

Date
2024-05
Publisher
University of Delaware
Abstract
LLVM has become an integral part of many compilation pipelines, in both closed-source and open-source compilers across industry and academia. Since the open-source LLVM project was started by Chris Lattner and Vikram Adve, it has proven to be a versatile and efficient language representation that can be used in multiple product environments across multiple system architectures. Because LLVM is mature and in frequent use, developers change it constantly. Users of LLVM expect robust functionality and efficiency from the compilation pipeline without having to write their source code to take advantage of specific parts of that pipeline. More specifically, users expect compiler optimization to preserve the functionality of code while improving its execution time. If compilation takes a significant amount of time to complete, it becomes a notable bottleneck in the development of the source code. Lowering the overall time to complete the LLVM pipeline therefore requires editing the parts of the optimization pipeline that account for a large share of compilation time. Identifying programs that trigger large compilation times in parts of the optimization pipeline can yield insight into which parts contribute most to that time. Furthermore, by considering only the LLVM Intermediate Representation (IR) produced from a given source code by the compiler frontend, the insights obtained apply to other source codes that have LLVM IR representations. Thus, analyzing LLVM IR and the length of time it takes to compile provides a straightforward way of suggesting which portions of the LLVM optimization pipeline (invoked using opt) are responsible for unexpectedly large compilation times.
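As a rough illustration of the kind of measurement the abstract describes, the following Python sketch times repeated runs of opt on LLVM IR files and flags files whose compilation time is far above the typical case. It is not the method used in the thesis: the helper names (opt_command, time_command, flag_slow), the default<O2> pipeline, the repeat count, and the factor-of-3 threshold are all assumptions chosen for the example, and it assumes an opt binary from a recent LLVM release is on the PATH.

```python
import subprocess
import sys
import time
from statistics import median


def opt_command(ir_path, pipeline="default<O2>", opt_bin="opt"):
    """Build an opt invocation that runs a pipeline and discards the output.

    The pipeline string and binary name are assumptions for this sketch.
    """
    return [opt_bin, f"-passes={pipeline}", "-disable-output", str(ir_path)]


def time_command(cmd, repeats=3):
    """Return the wall-clock duration in seconds of each repeated run of cmd."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def flag_slow(timings, factor=3.0):
    """Return the names whose median time exceeds factor times the overall median.

    timings maps a name (for example an IR file path) to a list of durations.
    The factor is a hypothetical threshold, not one taken from the thesis.
    """
    medians = {name: median(ts) for name, ts in timings.items()}
    baseline = median(medians.values())
    return sorted(name for name, m in medians.items() if m > factor * baseline)


if __name__ == "__main__":
    # Usage: python time_opt.py a.ll b.ll c.ll
    results = {path: time_command(opt_command(path)) for path in sys.argv[1:]}
    for path in flag_slow(results):
        print(f"unexpectedly slow: {path}")
```

For a per-pass breakdown rather than a whole-pipeline total, opt also accepts the -time-passes flag, which prints a timing report for each pass; parsing that report is left out of this sketch.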