Abstract: Motivation: RNAs are ubiquitous molecules involved in many regulatory and catalytic processes. Their ability to form complex structures is often key to supporting these functions. Remarkably, RNA 3D structures are articulated around smaller 3D sub-units, referred to as RNA 3D motifs, that can be found in unrelated molecules. The classification of these 3D motifs is thus essential to characterize RNA structures, but current methods can only retrieve motifs with identical base interaction patterns. Results: Here, we relax this constraint by posing the motif finding problem as a graph representation learning and clustering task. This framing takes advantage of the continuous nature of graph representations to model the flexibility of RNA motifs while retaining the convenient encoding of RNAs as graphs. We propose a set of node similarity functions, clustering methods, and motif construction algorithms to recover flexible RNA motifs. We show that our methods are able to retrieve and expand known classes of motifs, and also to identify new motifs. Our tool, VeRNAl, can be easily customized by users to desired levels of motif flexibility, abundance and size. Availability and Implementation: The source code, data, and a webserver are available at vernal.cs.mcgill.ca.
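The idea of relaxing exact pattern matching into a continuous node similarity followed by clustering can be illustrated with a small toy sketch. This is not VeRNAl's implementation: the graph, the edge labels (`B53`, `CWW`, `TSH`), the multiset-Jaccard similarity, and the threshold-based single-linkage clustering are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Toy RNA graph (illustrative only): nodes are nucleotides, each edge carries
# an interaction label. The labels are assumptions, not taken from the paper.
EDGES = [
    (0, 1, "B53"), (1, 2, "B53"), (3, 4, "B53"), (4, 5, "B53"),  # backbone
    (0, 5, "CWW"), (2, 3, "CWW"),                                # canonical pairs
    (1, 4, "TSH"),                                               # non-canonical pair
]

def build_adjacency(edges):
    """Map each node to a list of (neighbor, edge_label) pairs."""
    adj = {}
    for u, v, label in edges:
        adj.setdefault(u, []).append((v, label))
        adj.setdefault(v, []).append((u, label))
    return adj

def profile(adj, node):
    """Multiset of interaction labels on the edges incident to a node."""
    return Counter(label for _, label in adj[node])

def similarity(p, q):
    """Multiset Jaccard similarity in [0, 1]; 1.0 means identical profiles."""
    keys = set(p) | set(q)
    inter = sum(min(p[k], q[k]) for k in keys)
    union = sum(max(p[k], q[k]) for k in keys)
    return inter / union if union else 1.0

def cluster(adj, threshold):
    """Single-linkage clustering: merge nodes whose similarity >= threshold."""
    nodes = sorted(adj)
    profiles = {n: profile(adj, n) for n in nodes}
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(nodes, 2):
        if similarity(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

adj = build_adjacency(EDGES)
strict = cluster(adj, threshold=1.0)    # only identical interaction patterns
relaxed = cluster(adj, threshold=0.25)  # tolerate differing patterns
```

A threshold of 1.0 reproduces the exact-match behavior that the abstract describes as a limitation of current methods, while lowering the threshold groups nodes whose interaction patterns only partially agree. In this toy setting the threshold plays the role of a user-chosen flexibility level.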