Recognition sequence

The recognition sequence of a DNA-binding protein motif that exhibits binding specificity is the DNA sequence (or subset thereof) to which the domain specifically binds. It is sometimes also referred to as the recognition site. Many recognition sequences, such as those of numerous restriction endonucleases, are palindromes, but not all are.

The transcription factor Sp1, for example, binds sequences of the form 5'-(G/T)GGGCGG(G/A)(G/A)(C/T)-3', where (G/T) indicates that the domain will bind either a guanine or a thymine at this position.
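The degenerate notation can be unpacked mechanically. The following is a minimal sketch, not a standard tool: the list `SP1_CONSENSUS` and the helper `expand_consensus` are illustrative names introduced here, and the consensus is simply transcribed from the pattern quoted above.

```python
from itertools import product

# Sp1 consensus transcribed from the text, one entry per position.
# Each string lists the bases accepted at that position.
# (SP1_CONSENSUS and expand_consensus are hypothetical names for this sketch.)
SP1_CONSENSUS = ["GT", "G", "G", "G", "C", "G", "G", "GA", "GA", "CT"]


def expand_consensus(positions):
    """Return every concrete sequence matched by a degenerate consensus."""
    return ["".join(bases) for bases in product(*positions)]


sp1_sequences = expand_consensus(SP1_CONSENSUS)

# Four positions accept two bases each, so there are 2 * 2 * 2 * 2 sequences.
print(len(sp1_sequences))
for seq in sp1_sequences:
    print(f"5'-{seq}-3'")
```

Each printed line is one concrete recognition sequence covered by the consensus pattern.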

The restriction endonuclease PstI recognizes, binds, and cleaves the sequence 5'-CTGCAG-3'.
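In the molecular-biology sense, a sequence is palindromic when it reads the same as its own reverse complement. A short sketch of that check follows; the helper names `reverse_complement` and `is_palindromic` are illustrative, and only the four unambiguous bases A, C, G and T are handled.

```python
# Map each base to its Watson-Crick partner (unambiguous bases only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


print(is_palindromic("CTGCAG"))      # PstI sequence: True
print(is_palindromic("GGGGCGGAGC"))  # one Sp1-bound sequence: False
```

The PstI sequence passes the check, whereas this Sp1-bound sequence does not.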

Strictly speaking, however, a recognition sequence and a recognition site are different things. A recognition sequence is a specific sequence of bases, usually very short (fewer than 10 bases). A given recognition sequence can occur once, several times, or not at all on a particular DNA fragment. A recognition site, by contrast, is specified by its position on the fragment.

Depending on its degree of specificity, a DNA-binding protein may bind more than one specific sequence. PstI has a single sequence specificity, 5'-CTGCAG-3'. The following DNA sequence fragment contains two PstI recognition sites, starting at bases 9 and 31 (counting from the 5' end, with the first base numbered 1), and the recognition sequence is the same at both. Sp1 has multiple (16) sequence specificities, as shown above. Its two recognition sites in the same fragment start at bases 17 and 41, and their respective recognition sequences are 5'-GGGGCGGAGC-3' and 5'-TGGGCGGAAC-3'.

5'-AACGTTAGCTGCAGTCGGGGCGGAGCTAGGCTGCAGGAATTGGGCGGAACCT-3'
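The site positions in this fragment can be checked with a short script. This is a sketch under stated assumptions: the degenerate Sp1 pattern is written as a regular expression, only the strand shown is scanned, positions are 1-based from the 5' end, and `find_sites` is a helper name introduced here for illustration.

```python
import re

# Example fragment from the text, written 5' to 3'.
FRAGMENT = "AACGTTAGCTGCAGTCGGGGCGGAGCTAGGCTGCAGGAATTGGGCGGAACCT"

# Recognition sequences expressed as regular expressions.
PSTI = "CTGCAG"
SP1 = "[GT]GGGCGG[GA][GA][CT]"  # (G/T)GGGCGG(G/A)(G/A)(C/T)


def find_sites(pattern, sequence):
    """Return (1-based start, matched sequence) for every occurrence.

    The pattern is wrapped in a lookahead so overlapping sites are reported too.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


print("PstI sites:", find_sites(PSTI, FRAGMENT))
print("Sp1 sites: ", find_sites(SP1, FRAGMENT))
```

Each result pairs a recognition site (the start position) with the recognition sequence found there, which is the distinction the text draws: the PstI sequence is identical at both of its sites, while the two Sp1 sites carry different sequences.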