RNA polymerase II

RNA polymerase II (also called RNAP II and Pol II) is an enzyme found in eukaryotic cells. It catalyzes the transcription of DNA to synthesize precursors of mRNA and most snRNA and microRNA. A 550 kDa complex of 12 subunits, RNAP II is the most studied type of RNA polymerase. A wide range of transcription factors are required for it to bind to its promoters and begin transcription.

Subunits
The eukaryotic core RNA polymerase II was first purified using transcription assays. The purified enzyme typically has 10-12 subunits (12 in humans and yeast) and is incapable of specific promoter recognition. Many subunit-subunit interactions are known.

Computer generated image of POLR2A gene colorized subunits: green - RPB1 domain 1, blue - RPB1 domain 2, sand - RPB1 domain 3, light blue - RPB1 domain 4, brown - RPB1 domain 6, and magenta - RPB1 CTD.


 * DNA-directed RNA polymerase II subunit RPB1 - an enzyme that in humans is encoded by the POLR2A gene. RPB1 is the largest subunit of RNA polymerase II. It contains a carboxy terminal domain (CTD) composed of up to 52 heptapeptide repeats (YSPTSPS) that are essential for polymerase activity. In combination with several other polymerase subunits, it forms the DNA binding domain of the polymerase, a groove in which the DNA template is transcribed into RNA. It strongly interacts with RPB8.


 * RPB2 (POLR2B) - the second largest subunit which in combination with at least two other polymerase subunits forms a structure within the polymerase that maintains contact in the active site of the enzyme between the DNA template and the newly synthesized RNA.


 * RPB3 (POLR2C) - the third largest subunit. It exists as a heterodimer with another polymerase subunit, POLR2J, forming a core subassembly. RPB3 strongly interacts with RPB1-5, 7, and 10-12.


 * RNA polymerase II subunit B4 (RPB4) - encoded by the POLR2D gene, it is the fourth largest subunit and may have a stress-protective role.


 * RPB5 - in humans, it is encoded by the POLR2E gene. Two molecules of this subunit are present in each RNA polymerase II. RPB5 strongly interacts with RPB1, RPB3, and RPB6.


 * RPB6 (POLR2F) - forms a structure with at least two other subunits that stabilizes the transcribing polymerase on the DNA template.


 * RPB7 - encoded by POLR2G and may play a role in regulating polymerase function. RPB7 interacts strongly with RPB1 and RPB5.


 * RPB8 (POLR2H) - interacts with subunits RPB1-3, 5, and 7.


 * RPB9 - The groove in which the DNA template is transcribed into RNA is composed of RPB9 (POLR2I) and RPB1.


 * RPB10 - the product of gene POLR2L. It interacts with RPB1-3 and 5, and strongly with RPB3.


 * RPB11 - the RPB11 subunit is itself composed of three subunits in humans: POLR2J (RPB11-a), POLR2J2 (RPB11-b), and POLR2J3 (RPB11-c).


 * RPB12 - Also interacting with RPB3 is RPB12 (POLR2K).
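
The subunit-to-gene assignments in the list above can be collected into a small lookup table. The following is a minimal sketch in Python; the names `RPB_GENES` and `subunit_for_gene` are hypothetical and chosen only for illustration, and the table contains nothing beyond the human gene symbols already given in the list.

```python
# Subunit -> human gene symbol(s), transcribed from the subunit list above.
# RPB11 maps to three genes, as stated in its list entry.
RPB_GENES = {
    "RPB1": ["POLR2A"],
    "RPB2": ["POLR2B"],
    "RPB3": ["POLR2C"],
    "RPB4": ["POLR2D"],
    "RPB5": ["POLR2E"],
    "RPB6": ["POLR2F"],
    "RPB7": ["POLR2G"],
    "RPB8": ["POLR2H"],
    "RPB9": ["POLR2I"],
    "RPB10": ["POLR2L"],
    "RPB11": ["POLR2J", "POLR2J2", "POLR2J3"],
    "RPB12": ["POLR2K"],
}


def subunit_for_gene(gene):
    """Return the subunit encoded by a gene symbol, or None if it is not listed."""
    for subunit, genes in RPB_GENES.items():
        if gene in genes:
            return subunit
    return None


print(len(RPB_GENES))              # 12
print(subunit_for_gene("POLR2L"))  # RPB10
```

Note that the gene letters do not follow the subunit numbers throughout: POLR2J, POLR2K, and POLR2L correspond to RPB11, RPB12, and RPB10, respectively.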

Assembly
RPB3 is involved in RNA polymerase II assembly. A subcomplex of RPB2 and RPB3 appears soon after subunit synthesis. This complex subsequently interacts with RPB1. RPB3, RPB5, and RPB7 interact with themselves to form homodimers, and RPB3 and RPB5 together are able to contact all of the other RPB subunits except RPB9. Only RPB1 strongly binds to RPB5. The RPB1 subunit also contacts RPB7, RPB10, and, more weakly but most efficiently, RPB8. Once RPB1 enters the complex, other subunits such as RPB5 and RPB7 can enter: RPB5 binds to RPB6 and RPB8, and RPB3 brings in RPB10, RPB11, and RPB12. RPB4 and RPB9 may enter once most of the complex is assembled. RPB4 forms a complex with RPB7.

Kinetics
Enzymes can catalyze up to several million reactions per second. Enzyme rates depend on solution conditions and substrate concentration. Like other enzymes, RNA polymerase II has a saturation curve and a maximum velocity (Vmax). It has a Km (the substrate concentration required for one-half Vmax) and a kcat (the number of substrate molecules handled by one active site per second). The specificity constant is given by kcat/Km. The theoretical maximum for the specificity constant is the diffusion limit of about 10⁸ to 10⁹ M⁻¹ s⁻¹, where every collision of the enzyme with its substrate results in catalysis.
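
The relationship between these quantities can be illustrated with the standard Michaelis-Menten rate law, v = Vmax [S] / (Km + [S]). The following is a minimal sketch in Python. The kcat value is the turnover number quoted in this section, but the Km and enzyme concentration are hypothetical placeholders chosen for illustration, not measured values for RNA polymerase II, and the function names are likewise assumptions.

```python
def michaelis_menten_rate(substrate, vmax, km):
    """Reaction velocity v = Vmax * [S] / (Km + [S]) (all concentrations in M)."""
    return vmax * substrate / (km + substrate)


def specificity_constant(kcat, km):
    """Specificity constant kcat / Km, in M^-1 s^-1."""
    return kcat / km


kcat = 0.16      # s^-1, turnover number quoted in the text
km = 10e-6       # M, HYPOTHETICAL value for illustration only
enzyme = 1e-9    # M, HYPOTHETICAL total enzyme concentration
vmax = kcat * enzyme  # Vmax = kcat * [E]total

# At [S] = Km the velocity is half of Vmax, by definition of Km.
print(michaelis_menten_rate(km, vmax, km) / vmax)

# At saturating substrate the velocity approaches Vmax.
print(michaelis_menten_rate(1000 * km, vmax, km) / vmax)

print(specificity_constant(kcat, km))
```

With these placeholder numbers the specificity constant comes out on the order of 10⁴ M⁻¹ s⁻¹, several orders of magnitude below the diffusion limit; a different assumed Km would shift this value proportionally.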

The turnover number for RNA polymerase II is 0.16 s⁻¹, subject to concentration. Bacterial RNA polymerase, a relative of RNA polymerase II, switches between inactivated and activated states by translocating back and forth along the DNA. At concentrations of [NTP]eq = 10 μM GTP, 10 μM UTP, 5 μM ATP, and 2.5 μM CTP, bacterial RNAP shows a mean elongation rate (turnover number) of ~1 bp (NTP)⁻¹.

RNA polymerase II is inhibited by α-amanitin.

Holoenzyme
RNA polymerase II holoenzyme is a form of eukaryotic RNA polymerase II that is recruited to the promoters of protein-coding genes in living cells. It consists of RNA polymerase II, a subset of general transcription factors, and regulatory proteins known as SRB proteins.

Part of the assembly of the holoenzyme is referred to as the preinitiation complex, because its assembly takes place on the gene promoter before the initiation of transcription. The mediator complex acts as a bridge between RNA polymerase II and the transcription factors.

Control by chromatin structure
This is an outline of an example mechanism in yeast cells by which chromatin structure and histone posttranslational modification help regulate and record the transcription of genes by RNA polymerase II.

This pathway gives examples of regulation at these points of transcription:
 * Pre-initiation (promotion by Bre1, histone modification)
 * Initiation (promotion by TFIIH, Pol II modification; and promotion by COMPASS, histone modification)
 * Elongation (promotion by Set2, histone modification)

Note that the various stages of the process are referred to here as regulatory steps. It has not been proven that they are used for regulation, but it is very likely that they are.

RNA Pol II elongation promoters can be summarised in three classes:
 * 1) Factors affecting drug- or sequence-dependent arrest (various interfering proteins).
 * 2) Chromatin structure oriented factors (histone posttranslational modifiers, e.g. HMTs).
 * 3) Factors improving RNA Pol II catalysis (various interfering proteins and Pol II cofactors).

Protein complexes involved
Chromatin structure oriented factors (HMTs, histone methyltransferases):
 * COMPASS (COMplex of Proteins ASsociated with Set1) - methylates lysine 4 of histone H3.
 * Set2 - methylates lysine 36 of histone H3.
 * Dot1 (an HMT not involved in this pathway, listed for comparison) - methylates lysine 79 of histone H3.

Other:
 * Bre1 - ubiquitinates (adds ubiquitin to) lysine 123 of histone H2B. It is associated with pre-initiation and with allowing RNA Pol II binding.

N-terminus
The N-terminus (also known as the amino-terminus, NH2-terminus, N-terminal end or amine-terminus) refers to the start of a protein or polypeptide terminated by an amino acid with a free amine group (-NH2). The convention for writing peptide sequences is to put the N-terminus on the left and write the sequence from N- to C-terminus. When the protein is translated from messenger RNA, it is created from N-terminus to C-terminus.

The N-terminus is the first part of the protein that exits the ribosome during protein biosynthesis. It often contains sequences that act as targeting signals, basically intracellular zip codes, that allow for the protein to be delivered to its designated location within the cell. The targeting signal is usually cleaved off after successful targeting by a processing peptidase. Some proteins are modified posttranslationally.

C-terminus
The C-terminus (also known as the carboxyl-terminus, carboxy-terminus, C-terminal end, or COOH-terminus) of a protein or polypeptide is the end of the amino acid chain terminated by a free carboxyl group (-COOH). The convention for writing peptide sequences is to put the C-terminal end on the right and write the sequence from N- to C-terminus.

Each amino acid has a carboxyl group and an amine group, and amino acids link to one another to form a chain by a dehydration reaction by joining the amine group of one amino acid to the carboxyl group of the next. Thus polypeptide chains have an end with an unbound carboxyl group, the C-terminus, and an end with an amine group, the N-terminus. Proteins are naturally synthesized starting from the N-terminus and ending at the C-terminus.

The C-terminus can contain retention signals for protein sorting. The most common ER retention signal is the amino acid sequence -KDEL (or -HDEL) at the C-terminus, which keeps the protein in the endoplasmic reticulum and prevents it from entering the secretory pathway.
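
Because peptide sequences are written from N- to C-terminus, a C-terminal signal appears as a suffix of the one-letter sequence string. The following is a minimal sketch in Python of such a check; the function name and the example sequences are hypothetical and made up for illustration, not real proteins.

```python
# ER retention motifs named in the text, in one-letter amino acid code.
ER_RETENTION_SIGNALS = ("KDEL", "HDEL")


def has_er_retention_signal(sequence):
    """Return True if the sequence (written N- to C-terminus) ends in KDEL or HDEL."""
    seq = sequence.strip().upper()
    # str.endswith accepts a tuple and matches any of its members.
    return seq.endswith(ER_RETENTION_SIGNALS)


# Made-up example sequences (not real proteins).
print(has_er_retention_signal("MKTAYIAKQRKDEL"))  # True: motif at the C-terminus
print(has_er_retention_signal("KDELMKTAYIAKQR"))  # False: motif at the N-terminus
```

The second call shows that position matters: the same four residues at the N-terminal end do not count as a C-terminal retention signal.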

The C-terminus of proteins can be modified posttranslationally, most commonly by the addition of a lipid anchor that allows the protein to be inserted into a membrane without having a transmembrane domain. In Pol II, the C-terminus of RPB1 is extended to form the C-terminal domain (CTD).

CTD of RNA polymerase
The carboxy-terminal domain of RNA polymerase II typically consists of up to 52 repeats of the sequence Tyr-Ser-Pro-Thr-Ser-Pro-Ser. Other proteins often bind the C-terminal domain of RNA polymerase in order to activate polymerase activity. The CTD is involved in the initiation of DNA transcription, the capping of the RNA transcript, and attachment to the spliceosome for RNA splicing.
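
The heptapeptide repeat can be located in a sequence with a simple pattern match on its one-letter form, YSPTSPS. The following is a minimal sketch in Python; the function names and the toy sequence are hypothetical, and only exact matches to the consensus are counted, so repeats that deviate from it would be missed.

```python
import re

# Consensus heptad Tyr-Ser-Pro-Thr-Ser-Pro-Ser in one-letter code.
CTD_HEPTAD = "YSPTSPS"


def count_consensus_heptads(sequence):
    """Count non-overlapping exact matches of the consensus heptad."""
    return len(re.findall(CTD_HEPTAD, sequence.upper()))


def longest_tandem_run(sequence):
    """Return the number of repeats in the longest uninterrupted tandem run."""
    runs = re.findall(f"(?:{CTD_HEPTAD})+", sequence.upper())
    return max((len(run) // len(CTD_HEPTAD) for run in runs), default=0)


# Toy sequence (not a real protein): three consensus repeats, one variant
# repeat ending in A, then two more consensus repeats.
toy = "MA" + "YSPTSPS" * 3 + "YSPTSPA" + "YSPTSPS" * 2 + "K"

print(count_consensus_heptads(toy))  # 5
print(longest_tandem_run(toy))       # 3
```

Counting exact matches is a deliberate simplification; a tolerant search (for example, allowing one mismatch per heptad) would be needed to count variant repeats as well.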