Finding Similar RNA Secondary Structures Without Structure Prediction

Sarah Middleton, University of Pennsylvania

Ribonucleic acid (RNA) is a class of abundant biomolecules that play an essential role in decoding and regulating genetic material. Unlike DNA, which forms a double helix through base pairing of two separate complementary strands, RNAs are usually single-stranded and form mostly intra-molecular base pairs. This self-pairing directs an RNA to take on a particular structural form – referred to as secondary structure – which often determines the cellular function of the RNA. Since experimental methods for determining RNA structure and function are generally low-throughput and time-consuming, much effort has been put into developing computational methods for clustering RNAs into functional families based on structural predictions. Although over 2,000 families have been described so far, many RNAs do not fit into any known family and remain functionally and structurally unclassified. This is at least partly due to the limitations of the computational structure prediction and alignment that most current methods rely on. Here we present a novel method for structural clustering of RNAs that makes no explicit structure predictions or sequence alignments. We show that this method is highly useful both for classifying structures into known families and for identifying novel structure families. We apply our method to recently published sequencing data of dendritically localized RNAs and identify recurring structures that potentially mediate the subcellular localization of protein-coding RNAs in neurons.

Abstract Author(s): Sarah A. Middleton & Junhyong Kim