Zhiyuan Shao

Accelerating Backward Aggregation in GCN Training with Execution Path Preparing on GPUs

Apr 06, 2022

Shaoxian Xu, Zhiyuan Shao, Ci Yang, Xiaofei Liao, Hai Jin

Abstract: The emerging Graph Convolutional Network (GCN) is now widely used in many domains, and it is challenging to improve the efficiency of applications by accelerating GCN training. Because real-world input graphs are sparse and growing rapidly in scale, state-of-the-art GCN training systems (e.g., GNNAdvisor) employ graph processing techniques to accelerate message exchange (i.e., aggregation) among the graph vertices. Nevertheless, these systems treat the aggregation stages of both the forward and backward propagation phases as all-active graph processing procedures that indiscriminately conduct computation on all vertices of an input graph. In this paper, we first point out that in a GCN training problem with a given training set, the aggregation stages of its backward propagation phase (called backward aggregations in this paper) can be converted to partially-active graph processing procedures, which conduct computation on only a subset of the vertices of the input graph. Leveraging this finding, we propose an execution path preparing method that collects and coalesces the data used during the backward propagation of GCN training conducted on GPUs. The experimental results show that, compared with GNNAdvisor, our approach improves the performance of the backward aggregation of GCN training on typical real-world graphs by 1.48x~5.65x. Moreover, execution path preparing can be conducted either before the training (during preprocessing) or on-the-fly with the training. When used during preprocessing, our approach improves overall GCN training by 1.05x~1.37x. When used on-the-fly, it improves overall GCN training by 1.03x~1.35x.
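
The partially-active idea in the abstract above can be illustrated with a small sketch. It is not the authors' GPU implementation. It assumes plain sum aggregation over a directed edge list, a loss gradient that is nonzero only at training vertices, and a single backward aggregation step. The function names (`backward_aggregate`, `prepare_path`) and the toy graph are hypothetical.

```python
# Toy illustration only: names, graph, and gradients are hypothetical,
# and sum aggregation over an edge list is an assumption.

def backward_aggregate(edges, grad, num_vertices):
    """Sum each source vertex's gradient into its destination vertex."""
    dim = len(grad[0])
    out = [[0.0] * dim for _ in range(num_vertices)]
    for src, dst in edges:
        for k in range(dim):
            out[dst][k] += grad[src][k]
    return out


def prepare_path(edges, train_set):
    """Keep only edges whose source can carry a nonzero gradient.

    The training set is fixed across epochs, so this filtered edge list
    can be built once and reused in every backward pass.
    """
    return [(src, dst) for src, dst in edges if src in train_set]


# A 6-vertex chain 0-1-2-3-4-5, stored as directed edges in both directions.
num_vertices = 6
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3),
         (3, 2), (3, 4), (4, 3), (4, 5), (5, 4)]

# Only vertex 0 is a training vertex, so only it has a nonzero loss gradient.
train_set = {0}
grad = [[0.0, 0.0] for _ in range(num_vertices)]
grad[0] = [1.0, -2.0]

# All-active: every edge is visited, most of them adding zeros.
full = backward_aggregate(edges, grad, num_vertices)

# Partially-active: only the prepared edges are visited.
path = prepare_path(edges, train_set)
partial = backward_aggregate(path, grad, num_vertices)

assert full == partial
print(len(edges), "edges visited all-active vs", len(path), "partially-active")
```

Both passes produce the same gradients, but the prepared path touches only the edges leaving training vertices. In a multi-layer network the active set would grow by one hop per layer, which this single-step sketch does not model.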

Cross-Language Binary-Source Code Matching with Intermediate Representations

Jan 19, 2022

Yi Gui, Yao Wan, Hongyu Zhang, Huifang Huang, Yulei Sui, Guandong Xu, Zhiyuan Shao, Hai Jin

Abstract: Binary-source code matching plays an important role in many security and software engineering tasks, such as malware detection, reverse engineering, and vulnerability assessment. Several approaches have been proposed for binary-source code matching that jointly learn the embeddings of binary code and source code in a common vector space. Despite much effort, existing approaches target matching binary code and source code written in a single programming language. In practice, however, software applications are often written in different programming languages to cater for different requirements and computing platforms. Matching binary and source code across programming languages introduces additional challenges when maintaining multi-language and multi-platform applications. To this end, this paper formulates the problem of cross-language binary-source code matching and develops a new dataset for this problem. We present a novel approach, XLIR, a Transformer-based neural network that learns intermediate representations for both binary and source code. To validate the effectiveness of XLIR, comprehensive experiments are conducted on two tasks, cross-language binary-source code matching and cross-language source-source code matching, on top of our curated dataset. Experimental results and analysis show that XLIR with intermediate representations significantly outperforms other state-of-the-art models on both tasks.

* SANER2022
