Application of deep-learning to compiler-based graphs

Date
2018
Publisher
University of Delaware
Abstract
Graph-structured data is used in many domains to represent complex objects, such as the molecular structure of chemicals or interactions between members of a social network. However, extracting meaningful information from these graphs is a difficult task, which is often undertaken on a case-by-case basis. Devising automated methods to mine information from graphs has become increasingly important as the use of graphs becomes more prevalent. Techniques have been developed that adapt algorithms, such as support vector machines, to extract information from graphs with minimal preprocessing. Unfortunately, none of these techniques permit the use of deep neural networks (DNNs) to learn from graphs. Given the potential of DNNs to learn from large amounts of data, this has become an important area of interest. Recently, a technique based on graph spectral analysis was proposed to characterize graphs in a way that allows them to be used as input by DNNs.

We used this technique to apply DNNs to two different systems problems: (1) classifying malicious applications based on graph-structured representations of executable code and (2) developing prediction models that assist in iterative compilation to optimize and parallelize scientific code. Our results on malicious application classification show that graph-based characterizations increase the ability of DNNs to distinguish malware from different families. We performed a detailed evaluation of deep learning applied to state-of-the-art and graph-based malware characterizations. The graph-based characterizations are obtained by reverse engineering potentially malicious applications. For performance prediction, the graphs represent versions of optimized code. We use machine learning to rank these versions and inform an iterative compilation process. The models are trained using only five percent of the search space.

Our work shows that graph-structured data can be used to build powerful deep learning models. The techniques developed for this dissertation show great potential in a diverse pair of systems.
Keywords
Applied sciences, Compiler, GPGPU, Graph, Neural network