Ankyrin repeat

The ankyrin repeat is a 33-residue motif in proteins consisting of two alpha helices separated by loops, first discovered in the signaling proteins Cdc10 of yeast and Notch of Drosophila. Ankyrin repeats mediate protein–protein interactions and are among the most common structural motifs in known proteins. They appear in bacterial, archaeal, and eukaryotic proteins, but are far more common in eukaryotes. Ankyrin repeat proteins, though absent from most viruses, are common among poxviruses. Most proteins that contain the motif have four to six repeats, although its namesake, ankyrin, contains 24, and the largest known number of repeats is 34, predicted in a protein expressed by Giardia lamblia.

The ankyrin repeat is one of the most common protein–protein interaction motifs in nature. It occurs in a large number of functionally diverse proteins, mainly from eukaryotes. The few known examples from prokaryotes and viruses may be the result of horizontal gene transfers. The repeat has been found in proteins of diverse function, such as transcriptional initiators, cell cycle regulators, cytoskeletal proteins, ion transporters, and signal transducers. The ankyrin fold appears to be defined by its structure rather than its function, since there is no specific sequence or structure that is universally recognised by it.

Role in protein folding
The ankyrin-repeat sequence motif has been studied using multiple sequence alignment to determine which conserved amino acid residues are critical for folding and stability. The residues that appear on the wide lateral surface of ankyrin repeat structures are variable, often hydrophobic, and involved mainly in mediating protein–protein interactions. An artificial protein design based on a consensus sequence derived from sequence alignment has been synthesized and found to fold stably, representing the first designed protein with multiple identical repeats. More extensive design strategies have used combinatorial sequences to "evolve" ankyrin-repeat motifs that specifically recognize particular protein targets, a technique that has been presented as a possible alternative to antibody design for applications requiring high-affinity binding.
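The consensus idea described above can be illustrated with a minimal sketch. It assumes the repeats have already been aligned to equal length and simply takes the most frequent residue in each column; the function names and the six-residue toy alignment are invented placeholders for illustration, not real ankyrin sequences or the procedure used in the published design work.

```python
from collections import Counter


def consensus(aligned_repeats):
    """Return the most frequent residue in each column of an alignment."""
    lengths = {len(seq) for seq in aligned_repeats}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = zip(*aligned_repeats)
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)


def conservation(aligned_repeats):
    """Return, per column, the fraction of sequences carrying the consensus residue."""
    n = len(aligned_repeats)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*aligned_repeats)]


# Hypothetical toy alignment (placeholder data, not real repeats).
toy_alignment = ["GATPLH", "GATALH", "GRTPLH", "GKSPLM"]

print(consensus(toy_alignment))
print(conservation(toy_alignment))
```

Columns with a conservation score near 1 would be candidates for residues important to folding and stability, while low-scoring columns correspond to the variable surface positions.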

Ankyrin-repeat proteins present an unusual problem in the study of protein folding, which has largely focused on globular proteins that form well-defined tertiary structure stabilized by long-range, nonlocal residue-residue contacts. Ankyrin repeats, by contrast, contain very few such contacts (that is, they have a low contact order). Most studies have found that ankyrin repeats fold in a two-state folding mechanism, suggesting a high degree of folding cooperativity despite the local inter-residue contacts and the evident need for successful folding with varying numbers of repeats. Some evidence, based on synthesis of truncated versions of natural repeat proteins, and on the examination of phi values, suggests that the C-terminus forms the folding nucleation site.
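To make the notion of contact order concrete, the sketch below computes relative contact order using the commonly used definition: the mean sequence separation of contacting residue pairs, divided by the chain length. The contact lists and the chain length of 40 are invented for illustration; a real calculation would derive contacts from a solved structure.

```python
def relative_contact_order(contacts, chain_length):
    """Mean sequence separation of contacting residue pairs, divided by chain length.

    contacts: iterable of (i, j) residue-index pairs that are in contact.
    chain_length: total number of residues in the protein.
    """
    contacts = list(contacts)
    if not contacts or chain_length <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (chain_length * len(contacts))


# Hypothetical contact sets for a 40-residue chain (placeholder data).
repeat_like = [(1, 5), (10, 14), (20, 24), (30, 34)]    # only local contacts
globular_like = [(1, 40), (5, 35), (10, 30), (2, 38)]   # long-range contacts

print(relative_contact_order(repeat_like, 40))
print(relative_contact_order(globular_like, 40))
```

The repeat-like set, whose contacts all lie within a few residues of each other, gives a much lower value than the globular-like set, which is the sense in which repeat proteins have low contact order.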

Clinical significance
Ankyrin-repeat proteins have been associated with a number of human diseases. These proteins include the cell cycle inhibitor p16, which is associated with cancer, and the Notch protein (a key component of cell signalling pathways), which can cause the neurological disorder CADASIL when the repeat domain is disrupted by mutations.

A specialized family of ankyrin proteins known as muscle ankyrin repeat proteins (MARPs) is involved in the repair and regeneration of muscle tissue following damage due to injury and stress.

A natural variation between glutamine and lysine at position 703 in the 11th ankyrin repeat of ANKK1, known as the TaqI A1 allele, has been credited with encouraging addictive behaviours such as obesity, alcoholism, nicotine dependency and the Eros love style while discouraging juvenile delinquency and neuroticism-anxiety. The variation may affect the specificity of protein interactions made by the ANKK1 protein kinase through this repeat.

Human proteins containing this repeat
ABTB1; ABTB2; ACBD6; ACTBL1; ANK1; ANK2; ANK3; ANKAR; ANKDD1A; ANKFY1; ANKHD1; ANKIB1; ANKK1; ANKMY1; ANKMY2; ANKRA2; ANKRD1; ANKRD10; ANKRD11; ANKRD12; ANKRD13; ANKRD13A; ANKRD13B; ANKRD13C; ANKRD13D; ANKRD15; ANKRD16; ANKRD17; ANKRD18A; ANKRD18B; ANKRD19; ANKRD2; ANKRD20A1; ANKRD20A2; ANKRD20A3; ANKRD20A4; ANKRD21; ANKRD22; ANKRD23; ANKRD25; ANKRD26; ANKRD27; ANKRD28; ANKRD30A; ANKRD30B; ANKRD32; ANKRD33; ANKRD35; ANKRD36; ANKRD36B; ANKRD37; ANKRD38; ANKRD39; ANKRD40; ANKRD41; ANKRD42; ANKRD43; ANKRD44; ANKRD45; ANKRD46; ANKRD47; ANKRD49; ANKRD5; ANKRD50; ANKRD52; ANKRD53; ANKRD54; ANKRD55; ANKRD56; ANKRD57; ANKRD58; ANKRD6; ANKRD7; ANKRD9; ANKS1A; ANKS3; ANKS4B; ANKS6; ANKZF1; ASB1; ASB10; ASB11; ASB12; ASB13; ASB14; ASB15; ASB16; ASB2; ASB3; ASB4; ASB5; ASB6; ASB7; ASB8; ASB9; ASZ1; BARD1; BAT4; BAT8; BCL3; BCOR; BCORL1; BTBD11; C20orf12; C20orf86; C21orf99; C7orf7; CAMTA1; CAMTA2; CASKIN1; CASKIN2; CCM1; CDKN2A; CDKN2B; CDKN2C; CDKN2D; CENTB1; CENTB2; CENTB5; CENTG1; CENTG2; CENTG3; CLIP3; CLIP4; CLPB; CTGLF1; CTGLF2; CTGLF3; CTGLF4; CTGLF5; CTTNBP2; DAPK1; DDEF1; DDEF2; DDEFL1; DGKI; DGKZ; DP58; DYSFIP1; EHMT1; EHMT2; ESPN; FANK1; FEM1A; FEM1B; GABPB2; GIT1; GIT2; GLS; GLS2; HACE1; HECTD1; IBTK; ILK; INVS; KIDINS220; KRIT1; LOC348840; LOC554226; LRRK1; MAIL; MGC26718; MGC29891; MIB1; MIB2; MPHOSPH8; MTPN; MYO16; NFKB1; NFKB2; NFKBIA; NFKBIB; NFKBIE; NFKBIL1; NFKBIL2; NOTCH1; NOTCH2; NOTCH3; NOTCH4; NRARP; NUDT12; OSBPL1A; OSTF1; PLA2G6; POTE14; POTE15; POTE8; PPP1R12A; PPP1R12B; PPP1R12C; PPP1R13B; PPP1R13L; PPP1R16A; PPP1R16B; PSMD10; RAI14; RFXANK; RIPK4; RNASEL; SHANK1; SHANK2; SHANK3; SNCAIP; TA-NFKBH; TEX14; TNKS; TNKS2; TNNI3K; TP53BP2; TRP7; TRPA1; TRPC3; TRPC4; TRPC5; TRPC6; TRPC7; TRPV1; TRPV2; TRPV3; TRPV4; TRPV5; TRPV6; UACA; USH1G; ZDHHC13; ZDHHC17