Non-coding RNA

A non-coding RNA (ncRNA) is a functional RNA molecule that is not translated into a protein. Less frequently used synonyms are non-protein-coding RNA (npcRNA), non-messenger RNA (nmRNA) and functional RNA (fRNA). The term small RNA (sRNA) is often used for short bacterial ncRNAs. The DNA sequence from which a non-coding RNA is transcribed is often called an RNA gene.

Non-coding RNA genes include highly abundant and functionally important RNAs such as transfer RNA (tRNA) and ribosomal RNA (rRNA), as well as snoRNAs, microRNAs, siRNAs and piRNAs, and long ncRNAs such as Xist and HOTAIR. The number of ncRNAs encoded within the human genome is unknown; however, recent transcriptomic and bioinformatic studies suggest the existence of thousands of ncRNAs. Since many of the newly identified ncRNAs have not been validated for their function, it is possible that many are non-functional.

History and discovery
Nucleic acids were first discovered in 1868 by Friedrich Miescher, and by 1939 RNA had been implicated in protein synthesis. Two decades later, Francis Crick predicted a functional RNA component that mediated translation; he reasoned that RNA is better suited to base-pair with an mRNA transcript than a pure polypeptide. The first non-coding RNA to be characterised was an alanine tRNA found in baker's yeast; its structure was published in 1965. To produce a purified alanine tRNA sample, Robert W. Holley et al. used 140 kg of commercial baker's yeast to give just 1 g of purified tRNA(Ala) for analysis. The 80-nucleotide tRNA was sequenced by first being digested with pancreatic ribonuclease (producing fragments ending in cytosine or uridine) and then with takadiastase ribonuclease T1 (producing fragments ending in guanosine). Chromatography and identification of the 5' and 3' ends then helped arrange the fragments to establish the RNA sequence. Of the three structures originally proposed for this tRNA, the 'cloverleaf' structure was independently proposed in several following publications. The cloverleaf secondary structure was finalised following X-ray crystallography analysis performed by two independent research groups in 1974.
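The two-enzyme digestion described above can be illustrated with a minimal in-silico sketch. It assumes an idealised model in which each ribonuclease cuts after every matching residue; the function name digest and the eight-nucleotide toy sequence are hypothetical and are not taken from Holley's work.

```python
def digest(rna: str, cut_after: str) -> list[str]:
    """Split an RNA string immediately after every residue found in cut_after.

    Idealised model (an assumption): complete digestion, no missed cleavages.
    """
    fragments = []
    current = ""
    for base in rna:
        current += base
        if base in cut_after:
            # The enzyme cleaves on the 3' side of this residue.
            fragments.append(current)
            current = ""
    if current:
        # Whatever remains is the 3'-terminal fragment.
        fragments.append(current)
    return fragments


toy_rna = "GGAUCCAG"  # hypothetical sequence, for illustration only

# Pancreatic ribonuclease: fragments end in C or U.
pancreatic_fragments = digest(toy_rna, "CU")
# Ribonuclease T1: fragments end in G.
t1_fragments = digest(toy_rna, "G")

print(pancreatic_fragments)  # ['GGAU', 'C', 'C', 'AG']
print(t1_fragments)          # ['G', 'G', 'AUCCAG']
```

Because the two fragment sets overlap differently, comparing them constrains the order in which the fragments can be joined, which is the role the end identification and chromatography played in the original analysis.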

Ribosomal RNA was next to be discovered, followed by URNA in the early 1980s. Since then, the discovery of new non-coding RNAs has continued with snoRNAs, Xist, CRISPR and many more. Recent notable additions include riboswitches and miRNA; the discovery of the RNAi mechanism associated with the latter earned Craig C. Mello and Andrew Fire the 2006 Nobel Prize in Physiology or Medicine.

Biological roles of ncRNA
Noncoding RNAs belong to several groups and are involved in many cellular processes. These range from ncRNAs of central importance that are conserved across all or most cellular life through to more transient ncRNAs specific to one or a few closely related species. The more conserved ncRNAs are thought to be molecular fossils or relics from LUCA and the RNA world.

ncRNAs in translation
Many of the conserved, essential and abundant ncRNAs are involved in translation. Ribonucleoprotein (RNP) particles called ribosomes are the 'factories' where translation takes place in the cell. The ribosome consists of more than 60% ribosomal RNA, which is made up of 3 ncRNAs in prokaryotes and 4 ncRNAs in eukaryotes. Ribosomal RNAs catalyse the translation of nucleotide sequences to protein. Another set of ncRNAs, transfer RNAs, form an 'adaptor molecule' between mRNA and protein. The H/ACA box and C/D box snoRNAs are ncRNAs found in archaea and eukaryotes, whereas RNase MRP is restricted to eukaryotes; both groups of ncRNA are involved in the maturation of rRNA. The snoRNAs guide covalent modifications of rRNA, tRNA and snRNAs, while RNase MRP cleaves the internal transcribed spacer 1 between the 18S and 5.8S rRNAs. The ubiquitous ncRNA RNase P is an evolutionary relative of RNase MRP. RNase P generates the mature 5'-ends of tRNAs by cleaving the 5'-leader elements of precursor tRNAs. Another ubiquitous RNP, called SRP, recognizes specific nascent proteins and transports them to the endoplasmic reticulum in eukaryotes and to the plasma membrane in prokaryotes. In bacteria, transfer-messenger RNA (tmRNA) is an RNP involved in rescuing stalled ribosomes, tagging incomplete polypeptides and promoting the degradation of aberrant mRNA.

ncRNAs in RNA splicing
In eukaryotes the spliceosome performs the splicing reactions essential for removing intron sequences; this process is required for the formation of mature mRNA. The spliceosome is another RNP, often also known as the snRNP or tri-snRNP. There are two different forms of the spliceosome, the major and minor forms. The ncRNA components of the major spliceosome are U1, U2, U4, U5 and U6. The ncRNA components of the minor spliceosome are U11, U12, U5, U4atac and U6atac.

Another group of introns can catalyse their own removal from host transcripts; these are called self-splicing RNAs. There are two main groups of self-splicing RNAs: the group I catalytic introns and the group II catalytic introns. These ncRNAs catalyze their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms.

In mammals it has been found that snoRNAs can also regulate the alternative splicing of mRNA; for example, snoRNA HBII-52 regulates the splicing of serotonin receptor 2C.

In nematodes the SmY ncRNA appears to be involved in mRNA trans-splicing.

ncRNAs in gene regulation
The expression of many thousands of genes is regulated by ncRNAs. This regulation can occur in trans or in cis.

trans-acting ncRNAs
In higher eukaryotes microRNAs regulate gene expression. A single miRNA can reduce the expression levels of hundreds of genes. Mature miRNA molecules act through partial complementarity to one or more messenger RNA (mRNA) molecules, generally in 3' UTRs. The main function of miRNAs is to down-regulate gene expression.
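The idea of partial complementarity to a 3' UTR can be sketched as a simple sequence search. The sketch below assumes the common simplification that targeting is driven by perfect pairing of a short "seed" (taken here as miRNA nucleotides 2 to 8); that window, the function names and both toy sequences are illustrative assumptions, and real target prediction uses considerably more information.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(mirna: str, utr: str, seed_start: int = 1, seed_end: int = 8) -> list[int]:
    """Return 0-based UTR positions that pair perfectly with the miRNA seed.

    The default slice [1:8] corresponds to miRNA nucleotides 2-8
    (an assumed seed definition, not taken from the article).
    """
    seed = mirna[seed_start:seed_end]
    site = reverse_complement(seed)  # sequence the seed would pair with
    width = len(site)
    return [i for i in range(len(utr) - width + 1) if utr[i:i + width] == site]


toy_mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # illustrative sequence
toy_utr = "AAAACUACCUCAGGGACUACCUCA"   # hypothetical 3' UTR with two seed matches

print(seed_sites(toy_mirna, toy_utr))  # [4, 16]
```

Only seven nucleotides need to match in this model, which gives some intuition for how a single miRNA could plausibly find sites in hundreds of different transcripts.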

The ncRNA RNase P has also been shown to influence gene expression. In the human nucleus RNase P is required for the normal and efficient transcription of various ncRNAs transcribed by RNA polymerase III. These include tRNA, 5S rRNA, SRP RNA and U6 snRNA genes. RNase P exerts its role in transcription through association with Pol III and chromatin of active tRNA and 5S rRNA genes.

It has been shown that 7SK RNA, a metazoan ncRNA, acts as a negative regulator of the RNA polymerase II elongation factor P-TEFb, and that this activity is influenced by stress response pathways.

The bacterial ncRNA, 6S RNA, specifically associates with RNA polymerase holoenzyme containing the sigma70 specificity factor. This interaction represses expression from a sigma70-dependent promoter during stationary phase.

Another bacterial ncRNA, OxyS RNA, represses translation by binding to Shine-Dalgarno sequences, thereby occluding ribosome binding. OxyS RNA is induced in response to oxidative stress in Escherichia coli.

The B2 RNA is a small noncoding RNA polymerase III transcript that represses mRNA transcription in response to heat shock in mouse cells. B2 RNA inhibits transcription by binding to core Pol II. Through this interaction, B2 RNA assembles into preinitiation complexes at the promoter and blocks RNA synthesis.

A recent study has shown that the act of transcribing an ncRNA sequence can itself influence gene expression. RNA polymerase II transcription of ncRNAs is required for chromatin remodelling in Schizosaccharomyces pombe. Chromatin is progressively converted to an open configuration as several species of ncRNAs are transcribed.

cis-acting ncRNAs
A number of ncRNAs are embedded in the 5' UTRs of protein-coding genes and influence their expression in various ways. For example, a riboswitch can directly bind a small target molecule; the binding of the target affects the gene's activity.

RNA leader sequences are found upstream of the first gene of amino acid biosynthetic operons. These RNA elements form one of two possible structures in regions encoding very short peptide sequences that are rich in the end-product amino acid of the operon. A terminator structure forms when there is an excess of the regulatory amino acid and ribosome movement over the leader transcript is not impeded. When there is a deficiency of the charged tRNA of the regulatory amino acid, the ribosome translating the leader peptide stalls and the antiterminator structure forms. This allows RNA polymerase to transcribe the operon. Known RNA leaders are the histidine operon leader, leucine operon leader, threonine operon leader and tryptophan operon leader.

Iron response elements (IREs) are bound by iron response proteins (IRPs). The IRE is found in the untranslated regions (UTRs) of various mRNAs whose products are involved in iron metabolism. When iron concentration is low, IRPs bind the ferritin mRNA IRE, leading to translation repression.

Internal ribosome entry sites (IRES) are RNA structures that allow translation initiation in the middle of an mRNA sequence as part of the process of protein synthesis.

ncRNAs and genome defense
Piwi-interacting RNAs (piRNAs) are expressed in mammalian testes and somatic cells, where they form RNA-protein complexes with Piwi proteins. These piRNA complexes (piRCs) have been linked to transcriptional gene silencing of retrotransposons and other genetic elements in germ line cells, particularly those in spermatogenesis.

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) are repeats found in the DNA of many bacteria and archaea. The repeats are separated by spacers of similar length. It has been demonstrated that these spacers can be derived from phage and subsequently help protect the cell from infection.
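The repeat-spacer layout described above can be made concrete with a short sketch that pulls the spacers out of an array once the repeat sequence is known. It assumes perfectly identical repeats, which real arrays do not always have; the function name extract_spacers and every sequence below are hypothetical toy values.

```python
def extract_spacers(array: str, repeat: str) -> list[str]:
    """Return the sequences lying between consecutive copies of repeat.

    Assumes exact, non-degenerate repeats (a simplification). Sequence
    before the first repeat and after the last repeat is discarded.
    """
    pieces = array.split(repeat)
    # pieces[0] is the upstream flank and pieces[-1] the downstream flank.
    return [piece for piece in pieces[1:-1] if piece]


# Hypothetical toy array: flank, then repeat/spacer units, then flank.
repeat = "GTTTTAGAGC"
spacer_1 = "ACGTACGTAC"
spacer_2 = "TTGACCATGA"
array = "AAT" + repeat + spacer_1 + repeat + spacer_2 + repeat + "CC"

spacers = extract_spacers(array, repeat)
print(spacers)                    # ['ACGTACGTAC', 'TTGACCATGA']
print([len(s) for s in spacers])  # [10, 10]
```

Each recovered spacer could then be compared against phage sequences to look for the kind of match the text describes.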

ncRNAs and chromosome structure
Telomerase is an RNP enzyme that adds specific DNA sequence repeats ("TTAGGG" in vertebrates) to telomeric regions, which are found at the ends of eukaryotic chromosomes. The telomeres contain condensed DNA material, giving stability to the chromosomes. The enzyme is a reverse transcriptase that carries telomerase RNA, which it uses as a template to elongate telomeres. Telomeres are shortened after each replication cycle.
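Templated repeat addition can be sketched as reverse transcription of a short RNA template followed by appending the resulting DNA to a chromosome end. This is a toy model: the six-nucleotide template CCCUAA is chosen only because it reverse-transcribes to TTAGGG and is not claimed to be the actual template region of telomerase RNA, and the function names are hypothetical.

```python
# Pairing of each RNA template base with the DNA base synthesised opposite it.
RNA_TO_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_transcribe(template_rna: str) -> str:
    """Return the DNA strand (5'->3') copied from an RNA template (5'->3')."""
    return "".join(RNA_TO_DNA[base] for base in reversed(template_rna))


def extend_telomere(chromosome_end: str, template_rna: str, rounds: int) -> str:
    """Append rounds copies of the templated repeat to a DNA 3' end."""
    return chromosome_end + reverse_transcribe(template_rna) * rounds


toy_template = "CCCUAA"  # hypothetical template, written 5'->3'

print(reverse_transcribe(toy_template))             # TTAGGG
print(extend_telomere("GATTACA", toy_template, 2))  # GATTACATTAGGGTTAGGG
```

Keeping the template inside the enzyme lets it add the same repeat over and over, consistent with the description of telomerase as a reverse transcriptase that carries its own RNA.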

Xist (X-inactive-specific transcript) is a long ncRNA gene on the X chromosome of placental mammals that acts as a major effector of the X chromosome inactivation process, forming Barr bodies. An antisense RNA, Tsix, is a negative regulator of Xist. X chromosomes lacking Tsix expression (and thus having high levels of Xist transcription) are inactivated more frequently than normal chromosomes. In drosophilids, which also use an XY sex-determination system, the roX (RNA on the X) RNAs are involved in dosage compensation. Both Xist and roX operate by epigenetic regulation of transcription through the recruitment of histone-modifying enzymes.

Bifunctional RNA
Bifunctional RNAs, also known as dual-function RNAs, are RNAs that have two distinct functions. The majority of the known bifunctional RNAs are both mRNAs that encode a protein and ncRNAs. However, there is also a growing number of ncRNAs that fall into two different ncRNA categories, e.g. H/ACA box snoRNA and miRNA.

Two well-known examples of bifunctional RNAs are SgrS RNA and RNAIII. A handful of other bifunctional RNAs are also known to exist, e.g. SRA (Steroid Receptor Activator), VegT RNA, Oskar RNA and ENOD40.

ncRNAs and disease
As with proteins, mutations or imbalances in the ncRNA repertoire within the body can cause a variety of diseases.

Cancer
Many ncRNAs show abnormal expression patterns in cancerous tissues. These include miRNAs, long mRNA-like ncRNAs, GAS5, SNORD50, telomerase RNA and Y RNAs. The miRNAs are involved in the large-scale regulation of many protein-coding genes, the Y RNAs are important for the initiation of DNA replication, and telomerase RNA serves as the template for telomerase, an RNP that extends telomeric regions at chromosome ends (see telomeres and disease for more information). The direct function of the long mRNA-like ncRNAs is less clear.

Germ-line mutations in miR-16-1 and miR-15 primary precursors have been shown to be much more frequent in patients with chronic lymphocytic leukemia compared to control populations.

A rare SNP (rs11614913) that overlaps hsa-mir-196a2 has been reported to be associated with non-small cell lung carcinoma. Likewise, a screen of 17 miRNAs predicted to regulate a number of breast cancer-associated genes found variations in the microRNAs miR-17 and miR-30c-1 in patients who were noncarriers of BRCA1 or BRCA2 mutations, raising the possibility that familial breast cancer may be caused by variation in these miRNAs.

The p53 tumor suppressor is arguably the most important player in preventing tumor formation and progression. The p53 protein functions as a transcription factor with a crucial role in orchestrating the cellular stress response. In addition to its role in cancer, p53 has been implicated in other diseases including diabetes, cell death after ischemia, and various neurodegenerative diseases such as Huntington's, Parkinson's and Alzheimer's disease. Studies have suggested that p53 expression is subject to regulation by non-coding RNA.

Prader–Willi syndrome
The deletion of the 48 copies of the C/D box snoRNA SNORD116 has been shown to be the primary cause of Prader–Willi syndrome. Prader–Willi is a developmental disorder associated with over-eating and learning difficulties. SNORD116 has potential target sites within a number of protein-coding genes, and could have a role in regulating alternative splicing.

Autism
The chromosomal locus containing the small nucleolar RNA SNORD115 gene cluster has been duplicated in approximately 5% of individuals with autistic traits. A mouse model engineered to have a duplication of the SNORD115 cluster displays autistic-like behaviour.

Cartilage-hair hypoplasia
Mutations within RNase MRP have been shown to cause cartilage-hair hypoplasia (CHH), a disease that is frequent among Amish and Finnish populations and is associated with an array of symptoms such as short stature, sparse hair, skeletal abnormalities and a suppressed immune system. The best characterised variant is an A-to-G transition at nucleotide 70, which lies in a loop region two bases 5' of a conserved pseudoknot. However, many other mutations within RNase MRP also cause CHH.

Alzheimer's disease
The antisense RNA BACE1-AS is transcribed from the opposite strand to BACE1 and is upregulated in patients with Alzheimer's disease. BACE1-AS regulates the expression of BACE1 by increasing BACE1 mRNA stability and generating additional BACE1 through a post-transcriptional feed-forward mechanism. By the same mechanism it also raises concentrations of beta amyloid, the main constituent of senile plaques. BACE1-AS concentrations are elevated in subjects with Alzheimer's disease and in amyloid precursor protein transgenic mice.

miR-96 and hearing loss
Variation within the seed region of mature miR-96 has been associated with autosomal dominant, progressive hearing loss in humans and mice. The homozygous mutant mice were profoundly deaf, showing no cochlear responses. Heterozygous mice and humans progressively lose the ability to hear.

Distinction between functional RNA (fRNA) and ncRNA
Several publications have started using the term functional RNA (fRNA), as opposed to ncRNA, to describe regions functional at the RNA level that may or may not be stand-alone RNA transcripts. Under this usage, every ncRNA is an fRNA, but there exist fRNAs (such as riboswitches, SECIS elements and other cis-regulatory regions) that are not ncRNAs. Yet the term fRNA could also include mRNA, as this is RNA coding for protein and hence is functional. Additionally, artificially evolved RNAs also fall under the fRNA umbrella term. Some publications state that the terms ncRNA and fRNA are nearly synonymous.