How to obtain informative representations of transactions and then perform the identification of fraudulent transactions is a crucial part of ensuring financial security. Recent studies apply Graph Neural Networks (GNNs) to the transaction fraud detection problem. Nevertheless, they encounter challenges in effectively learning spatial-temporal information due to structural limitations. Moreover, few prior GNN-based detectors have recognized the significance of incorporating global information, which encompasses similar behavioral patterns and offers valuable insights for discriminative representation learning. Therefore, we propose a novel heterogeneous graph neural network called Spatial-Temporal-Aware Graph Transformer (STA-GT) for transaction fraud detection problems. Specifically, we design a temporal encoding strategy to capture temporal dependencies and incorporate it into the graph neural network framework, enhancing spatial-temporal information modeling and improving expressive ability. Furthermore, we introduce a transformer module to learn local and global information. Pairwise node-node interactions overcome the limitation of the GNN structure and build up the interactions with the target node and long-distance ones. Experimental results on two financial datasets compared to general GNN models and GNN-based fraud detectors demonstrate that our proposed method STA-GT is effective on the transaction fraud detection task.