Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure (frequent compensating base changes) is increasingly likely to cause sequence-based alignment methods to misalign, or even refuse to align, homologous ncRNAs, consequently obscuring that structural signal. We have used CMfinder, a structure-oriented local alignment tool, to search the ENCODE regions of vertebrate multiple alignments. In agreement with other studies, we find a large number of potential RNA structures in the ENCODE regions. We report 6587 candidate regions with an estimated false-positive rate of 50%. More intriguingly, many of these candidates may be better represented by alignments taking the RNA secondary structure into account than those based on primary sequence alone, often quite dramatically. For example, approximately one-quarter of our predicted motifs show revisions in >50% of their aligned positions. Furthermore, our results are strongly complementary to those discovered by sequence-alignment-based approaches: 84% of our candidates are not covered by Washietl et al., increasing the number of ncRNA candidates in the ENCODE region by 32%. In a group of 11 ncRNA candidates that were tested by RT-PCR, 10 were confirmed to be present as RNA transcripts in human tissue, and most show evidence of significant differential expression across tissues. Our results broadly suggest caution in any analysis relying on multiple sequence alignments in less well-conserved regions, clearly support growing appreciation for the biological significance of ncRNAs, and strongly support the argument for considering RNA structure directly in any searches for these elements.
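
The point about compensating base changes can be made concrete with a toy example. The sketch below is not CMfinder's method; it only shows how two hairpins can share the same secondary structure while differing at every stem position. The two sequences, the dot-bracket structure, and the helper names (`stem_pairs`, `all_pairs_valid`, `identity`) are invented for illustration and are not taken from the ENCODE data.

```python
# Toy illustration (hypothetical sequences): the same hairpin structure
# survives even though every stem base has changed.

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_pairs(structure):
    """Return (i, j) index pairs encoded by a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def all_pairs_valid(seq, structure):
    """True if every paired position in `structure` holds a valid base pair."""
    return all((seq[i], seq[j]) in PAIRS for i, j in stem_pairs(structure))


def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


structure = "((((....))))"   # 4-bp stem closing a 4-nt loop
seq_a = "GGCAUUCGUGCC"       # made-up sequence
seq_b = "CAGUUUCGACUG"       # made-up sequence; every stem pair is swapped

print(all_pairs_valid(seq_a, structure), all_pairs_valid(seq_b, structure))
print(round(identity(seq_a, seq_b), 2))
```

Both sequences fold into the same stem, yet they agree only in the loop, so a purely sequence-based aligner has little to anchor on. That is the situation in which a structure-aware alignment would be expected to help.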
Source: http://genome.cshlp.org/content/18/2/242.abstract?sid=3910ac8d-5e4c-46f2-b7ef-d374fc02a5e9