Structure and genome of HIV

The genome and proteins of HIV have been the subject of extensive research since the discovery of the virus in 1983. The virus was identified two years after the first major cases of AIDS-associated illnesses were reported in 1981.

Structure


HIV differs in structure from other retroviruses. It is roughly spherical, with a diameter of around 120 nm (around 60 times smaller than a red blood cell).
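The size comparison can be checked with simple arithmetic. The sketch below assumes a typical red blood cell diameter of about 7 µm (7000 nm), a figure that is not given in the text above; the constant and function names are illustrative only.

```python
# Rough check of the size comparison between an HIV virion and a red blood cell.
HIV_DIAMETER_NM = 120    # diameter stated in the text
RBC_DIAMETER_NM = 7000   # ASSUMPTION: typical red blood cell diameter of ~7 µm


def diameter_ratio(larger_nm: float, smaller_nm: float) -> float:
    """Return how many times wider the larger object is than the smaller one."""
    return larger_nm / smaller_nm


ratio = diameter_ratio(RBC_DIAMETER_NM, HIV_DIAMETER_NM)
print(f"A red blood cell is roughly {ratio:.0f} times wider than an HIV virion")
```

Under this assumption the ratio comes out at a little under 60, consistent with the approximate figure quoted above.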

HIV-1 is composed of two copies of single-stranded RNA enclosed by a conical capsid made of the viral protein p24, as is typical of lentiviruses (Figure 1). The RNA genome is 9749 nucleotides long. The single-stranded RNA is tightly bound to the nucleocapsid proteins p6 and p7 and to enzymes that are indispensable for the development of the virion, such as reverse transcriptase and integrase. The nucleocapsid proteins associate with the genomic RNA (one molecule per hexamer) and protect it from digestion by nucleases. A matrix formed by the viral protein p17 surrounds the capsid and ensures the integrity of the virion particle. Also enclosed within the virion particle are Vif, Vpr, Nef and the viral protease (Figure 1). The matrix is in turn surrounded by an envelope of host-cell origin, which is formed when the capsid buds from the host cell, taking some of the host-cell membrane with it. The envelope includes the glycoproteins gp120 and gp41.

Because of its role in virus-cell attachment, the structure of the virus envelope spike, consisting of gp120 and gp41, is of particular importance. It is hoped that determining the envelope spike's structure will contribute to scientific understanding of the virus and its replication cycle, and help in the creation of a cure. The first model of its structure was compiled in 2006 using cryo-electron microscopy and suggested that three gp120-gp41 heterodimers form a trimer as the envelope spike. However, evidence published shortly afterwards supported a single-stalk "mushroom" model, with a head consisting of a trimer of gp120s and a gp41 stem that anchors it to the envelope and appears as a compact structure with no obvious separation between the three monomers. There are various possible explanations for this difference, as it is unlikely that the viruses imaged by the two groups were structurally different. More recently, further evidence supporting the heterodimer trimer-based model has been found.

Genome organization


HIV has several major genes coding for structural proteins that are found in all retroviruses, and several nonstructural ("accessory") genes that are unique to HIV. The gag gene provides the basic physical infrastructure of the virus, and pol provides the basic mechanism by which retroviruses reproduce, while the others help HIV to enter the host cell and enhance its reproduction. Though they may be altered by mutation, all of these genes except tev exist in all known variants of HIV; see Genetic variability of HIV.


 * gag (group-specific antigen): codes for the Gag polyprotein, which is processed during maturation to MA (matrix protein, p17); CA (capsid protein, p24); SP1 (spacer peptide 1, p2); NC (nucleocapsid protein, p7); SP2 (spacer peptide 2, p1) and p6.


 * pol: codes for viral enzymes reverse transcriptase, integrase, and HIV protease.


 * env (for "envelope"): codes for gp160, the precursor to gp120 and gp41, proteins embedded in the viral envelope which enable the virus to attach to and fuse with target cells.


 * Transactivators: tat, rev, vpr


 * Other regulators: vif, nef, vpu


 * tev: This gene is present in only a few HIV-1 isolates. It is a fusion of parts of the tat, env, and rev genes, and codes for a protein with some of the properties of Tat, but little or none of the properties of Rev.
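The gene list above can be summarised as a small lookup table. The following is a minimal sketch: the groupings and products are taken from the list, while the dictionary layout, the group label "major" and the helper names `gene_for_product` and `group_of` are hypothetical choices made for illustration.

```python
# Gene groupings as given in the list above (group labels are illustrative).
GENE_GROUPS = {
    "major": ["gag", "pol", "env"],
    "transactivator": ["tat", "rev", "vpr"],
    "other regulator": ["vif", "nef", "vpu"],
    "fusion (few HIV-1 isolates only)": ["tev"],
}

# Mature products of the three major genes, as listed above.
PRODUCTS = {
    "gag": ["MA (p17)", "CA (p24)", "SP1 (p2)", "NC (p7)", "SP2 (p1)", "p6"],
    "pol": ["reverse transcriptase", "integrase", "protease"],
    "env": ["gp120", "gp41"],  # both derived from the gp160 precursor
}


def gene_for_product(name):
    """Return the gene whose product list mentions `name`, or None if not found."""
    needle = name.lower()
    for gene, products in PRODUCTS.items():
        if any(needle in product.lower() for product in products):
            return gene
    return None


def group_of(gene):
    """Return the group label a gene belongs to, or None if it is not listed."""
    for group, genes in GENE_GROUPS.items():
        if gene in genes:
            return group
    return None


print(gene_for_product("p24"))        # gag
print(gene_for_product("integrase"))  # pol
print(group_of("rev"))                # transactivator
```

A flat dictionary is enough here because each listed product comes from exactly one gene; a fusion gene such as tev would need a richer structure if its component parts were to be modelled.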

RNA secondary structure

Several conserved secondary structure elements have been identified within the HIV RNA genome. These include the trans-activation response (TAR) element located at the 5' end of the genome and the HIV Rev response element (RRE) within the env gene. Another RNA structure that has been identified is gag stem loop 3 (GSL3), thought to be involved in viral packaging. RNA secondary structures have been proposed to affect the HIV life cycle by altering the function of HIV protease and reverse transcriptase, although not all elements identified have been assigned a function.

An RNA secondary structure located between the HIV protease and reverse transcriptase genes has been shown, by selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE), to contain three stem loops. This cis-regulatory RNA has been shown to be conserved throughout the HIV family and is thought to influence the viral life cycle.
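RNA secondary structures such as stem loops are commonly written in dot-bracket notation, where matching parentheses mark paired bases and dots mark unpaired ones. The sketch below parses such a string and counts its stem loops. The example string `TOY_STRUCTURE` is entirely made up to show a shape with three stem loops; it is not the experimentally determined HIV structure, and the function names are hypothetical.

```python
def parse_dot_bracket(structure):
    """Return a sorted list of 0-based (i, j) index pairs for paired bases."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def count_stem_loops(structure):
    """Count hairpins: base pairs that enclose only unpaired positions."""
    return sum(
        1
        for i, j in parse_dot_bracket(structure)
        if set(structure[i + 1:j]) <= {"."}
    )


# ASSUMPTION: a toy structure with three stem loops, for illustration only.
TOY_STRUCTURE = "((((....))))..(((...)))..((((.....))))"

pairs = parse_dot_bracket(TOY_STRUCTURE)
print(len(pairs))                       # 11 base pairs
print(count_stem_loops(TOY_STRUCTURE))  # 3 stem loops
```

Counting only the innermost pair of each stem keeps the tally independent of stem length, so a long stem and a short one each count as a single stem loop.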

The complete structure of an HIV-1 genome, extracted from infectious virions, has been solved to single-nucleotide resolution.