Methodologies, tools, and techniques for developing and optimizing applications in science and engineering

Date
2024
Publisher
University of Delaware
Abstract
Increasing the quality and efficiency of computational and data-intensive (CDI) research applications remains a central challenge for researchers. This dissertation aims to improve both aspects through new methodologies, tools, and techniques. Higher-quality CDI applications yield more accurate and more reliable research findings, which are essential for building upon previous work and advancing the field. More efficient CDI applications let researchers complete tasks and experiments more quickly, boosting their research progress and overall productivity, and they make more effective use of system resources. By improving quality and efficiency together, this work accelerates the pace of scientific discovery and innovation, leading to more effective solutions for some of the world's most pressing challenges.

This research integrates best practices into the CDI development process to raise application quality. It also identifies parallelization techniques and optimization methodologies, and from them builds tools that streamline and accelerate the optimization process, thereby improving the efficiency of scientific applications. The projects in this work form a cohesive whole, each building upon the previous one.

To enhance application quality, the Xpert Network was initiated to identify and compile a set of guidelines for the development of CDI applications, drawing on the collective experience of professionals in the field.
The ATOM Project followed, implementing these practices in real-world settings to evaluate their effectiveness and practicality; along the way, it identified additional practices that address funding agencies' evolving requirements for software sustainability. Together, these efforts produced a set of guidelines aimed at improving the quality, reliability, and maintainability of CDI applications. To further assess the effectiveness and practicality of these practices, surveys were distributed to a diverse group of researchers to gather a broad range of perspectives. The results confirmed the high impact and usability of the guidelines, demonstrating their value in improving the quality of CDI applications in real-world settings. Findings from these projects resulted in a set of best practices for CDI application development, providing a valuable resource for CDI practitioners.

On the efficiency front, a comparative analysis of auto-parallelized and manually parallelized codes was conducted to identify effective optimization techniques for scientific applications beyond those currently offered by automatic parallelizers. This study uncovered key optimization strategies in which manual parallelization consistently outperformed automatic parallelization, highlighting areas where existing auto-parallelization tools can be improved. These findings open the way to implementing more efficient parallelization techniques in auto-parallelizers. Furthermore, the code sections that revealed these differences will be used in future studies to test and validate newly developed methods and tools.

The comparative study also highlighted the need for faster iterative optimization cycles, particularly for long-running CDI applications. To meet this need, the CaRV (Capture, Replay, and Validate) methodology and tool were developed.
CaRV accelerates the optimization process by running only the specific code sections undergoing optimization, rather than re-running the entire application, while validating these optimized code sections for both performance and correctness. Evaluation results showed that CaRV significantly reduced execution time, demonstrating its effectiveness in optimizing code segments within long-running CDI applications.

To further accelerate the optimization of scientific applications and increase the efficiency of CDI applications, we developed iCetus, an interactive parallelizer designed to support users through the entire optimization cycle. Integrated with CaRV, the tool enables a dual approach: CaRV accelerates the optimization process while validating both performance improvements and correctness, and iCetus automates the optimization workflow by incorporating multiple underlying tools. The integration leverages the strengths of both tools, enabling use cases that were not previously supported, such as applying manual or AI-generated optimizations, while delivering rapid feedback on their effectiveness and correctness. It provided significant performance improvements during the development process across a wide range of application sizes, and it marked the culmination of our efforts to improve the efficiency of scientific applications by enhancing their performance.

By introducing tools, techniques, and methodologies that enhance the efficiency of CDI applications, this research benefits practitioners involved in the development and optimization of these applications. It facilitates faster execution, accelerates research progress, expedites research outcomes, increases researcher productivity, and promotes more resource-efficient solutions.
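
The capture, replay, and validate cycle that the abstract attributes to CaRV can be illustrated with a small Python sketch. This is not CaRV's actual interface or implementation, which the abstract does not describe; the names `capture`, `replay_and_validate`, `original_section`, `optimized_section`, and `broken_section` are hypothetical, and the "code section" is assumed to be a pure function whose inputs and output can be serialized.

```python
import pickle
import time


def capture(section, *args):
    """Run the original section once, recording its inputs, output, and run time.

    Hypothetical helper: inputs are pickled before the call so that later
    replays start from the same state even if a section mutates its arguments.
    """
    frozen_inputs = pickle.dumps(args)
    start = time.perf_counter()
    reference = section(*pickle.loads(frozen_inputs))
    baseline = time.perf_counter() - start
    return {"inputs": frozen_inputs, "reference": reference,
            "baseline_seconds": baseline}


def replay_and_validate(candidate, record):
    """Re-run only the candidate section on the captured inputs.

    Reports whether the output matches the captured reference (correctness)
    and how the run time compares with the captured baseline (performance).
    """
    args = pickle.loads(record["inputs"])
    start = time.perf_counter()
    result = candidate(*args)
    elapsed = time.perf_counter() - start
    speedup = record["baseline_seconds"] / elapsed if elapsed > 0 else float("inf")
    return {"correct": result == record["reference"],
            "seconds": elapsed, "speedup": speedup}


def original_section(values):
    # Stand-in for a code section inside a long-running application.
    total = 0
    for v in values:
        total += v * v
    return total


def optimized_section(values):
    # A candidate rewrite that should produce the same result.
    return sum(v * v for v in values)


def broken_section(values):
    # A faulty rewrite that skips the last element; validation should reject it.
    return sum(v * v for v in values[:-1])


if __name__ == "__main__":
    record = capture(original_section, list(range(1000)))
    for candidate in (optimized_section, broken_section):
        report = replay_and_validate(candidate, record)
        print(candidate.__name__, "correct:", report["correct"])
```

Once the inputs have been captured, each candidate optimization is checked by replaying only the section under study, so the rest of the application does not need to run again for every attempt.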
Keywords
Compiler optimization, Computational and data-intensive applications, Interactive parallelization, Optimization validation, Efficiency