Open reading frame

In molecular genetics, an open reading frame (ORF) is a stretch of DNA sequence that contains no stop codon in a given reading frame. Insertions that interrupt the reading frame of the region downstream of the start codon cause a frameshift mutation and displace the stop codons of that sequence.

Significance
One common use of open reading frames is as one piece of evidence to assist in gene prediction. Long ORFs are often used, along with other evidence, to initially identify candidate protein-coding regions in a DNA sequence. The presence of an ORF does not necessarily mean that the region is ever translated. For example, in a randomly generated DNA sequence with an equal percentage of each nucleotide, a stop codon would be expected about once every 21 codons, because 3 of the 64 possible codons are stop codons; short ORFs therefore arise by chance. A simple gene prediction algorithm for prokaryotes might look for a start codon followed by an open reading frame that is long enough to encode a typical protein, where the codon usage of that region matches the frequency characteristic of the given organism's coding regions. By itself, even a long open reading frame is not conclusive evidence for the presence of a gene.
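The two ideas above can be illustrated with a minimal Python sketch. The function names, the use of ATG as the only start codon, and the default threshold of 100 codons are assumptions made for illustration; the sketch scans only the given strand and omits the codon-usage check, so it is not a complete gene predictor.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"  # assumption: ATG is treated as the only start codon


def expected_codons_per_stop():
    """Average spacing between stop codons if all 64 codons are equally likely."""
    all_codons = ["".join(c) for c in product("ACGT", repeat=3)]
    stop_fraction = sum(c in STOP_CODONS for c in all_codons) / len(all_codons)
    return 1 / stop_fraction  # 64 / 3, roughly 21.3


def find_candidate_orfs(seq, min_codons=100):
    """Return (start, end) index pairs of start-to-stop stretches on this strand.

    `end` is exclusive and includes the stop codon. `min_codons` (a
    hypothetical threshold) counts the codons before the stop codon.
    """
    seq = seq.upper()
    candidates = []
    for frame in range(3):
        start = None
        # Walk the frame one complete codon at a time.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first start codon since the last stop
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    candidates.append((start, i + 3))
                start = None  # resume the search after this stop
    return candidates
```

Calling `find_candidate_orfs(genome, min_codons=100)` on a sequence string would return candidate regions only; as the text notes, each would still need support from other evidence.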

Example
If a portion of a genome has been sequenced (e.g. 5'-ATCTAAAATGGGTGCC-3'), ORFs can be located by examining each of the three possible reading frames on each strand. In this sequence, two of the three possible reading frames on the strand shown are entirely open, meaning that they contain no stop codon:


 * 1) ...A TCT  AAA  ATG  GGT  GCC...
 * 2) ...AT CTA  AAA  TGG  GTG  CC...
 * 3) ...ATC TAA  AAT  GGG  TGC  C...

Possible stop codons in DNA are "TGA", "TAA" and "TAG". Thus, the last reading frame in this example contains a stop codon (TAA), unlike the first two.
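
The check on the example sequence can be reproduced with a short Python sketch. The helper names are hypothetical, and only the strand shown is examined. Offsets 1 and 2 correspond to reading frames 1 and 2 in the list above, and offset 0 corresponds to reading frame 3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_into_codons(seq, offset):
    """Return the complete codons of `seq` read starting at `offset` (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def frame_is_open(seq, offset):
    """True if the reading frame at `offset` contains no stop codon."""
    return not any(codon in STOP_CODONS for codon in split_into_codons(seq, offset))


seq = "ATCTAAAATGGGTGCC"
for offset in range(3):
    codons = " ".join(split_into_codons(seq, offset))
    status = "open" if frame_is_open(seq, offset) else "contains a stop codon"
    print(f"offset {offset}: {codons} ({status})")
```

To cover the three frames of the opposite strand as well, the same functions could be applied to the reverse complement of the sequence.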