Proteinogenic amino acid

Proteinogenic amino acids are amino acids that are precursors to proteins and are incorporated by cellular machinery encoded in the genetic code of an organism. There are 22 standard amino acids, but only 21 are found in eukaryotes. Of the 22, selenocysteine and pyrrolysine are incorporated into proteins by distinctive biosynthetic mechanisms. The other 20 are directly encoded by the universal genetic code. Humans can synthesize 11 of these 20 from each other or from other molecules of intermediary metabolism. The other 9 must be consumed in the diet (usually as their protein derivatives) and are therefore called essential amino acids. The essential amino acids are histidine, isoleucine, leucine, lysine, methionine, phenylalanine, threonine, tryptophan, and valine.

The word proteinogenic means "protein building". Proteinogenic amino acids can be condensed into a polypeptide (the subunit of a protein) through a process called translation (the second stage of protein biosynthesis, part of the overall process of gene expression).

In contrast, non-proteinogenic amino acids are either not incorporated into proteins (like GABA, L-DOPA, or triiodothyronine), or are not produced directly and in isolation by standard cellular machinery (like hydroxyproline and selenomethionine). The latter often result from posttranslational modification of proteins.

The proteinogenic amino acids have been found to be related to the set of amino acids that can be recognized by ribozyme auto-aminoacylation systems. Thus, non-proteinogenic amino acids would have been excluded by the contingent evolutionary success of nucleotide-based life forms. Other reasons have been offered to explain why certain specific non-proteinogenic amino acids are not generally incorporated into proteins: for example, ornithine and homoserine cyclize against the peptide backbone and fragment the protein with relatively short half-lives, while others, such as the arginine analog canavanine, are toxic because they can be mistakenly incorporated into proteins.

Non-proteinogenic amino acids are incorporated in nonribosomal peptides, which are not produced by the ribosome during translation.

Structures
The following illustrates the structures and abbreviations of the 21 amino acids that are directly encoded for protein synthesis by the genetic code of eukaryotes. The structures given below are standard chemical structures, not the typical zwitterion forms that exist in aqueous solutions.

Non-specific abbreviations
Sometimes the specific identity of an amino acid cannot be determined unambiguously. Certain protein sequencing techniques do not distinguish among certain pairs. Thus, the following codes are used:
 * Asx (B) is "asparagine or aspartic acid"
 * Glx (Z) is "glutamic acid or glutamine"
 * Xle (J) is "leucine or isoleucine"
In addition, the symbol X is used to indicate an amino acid that is completely unidentified.
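As a rough illustration of how these ambiguity codes might be handled in software, the Python sketch below maps each code to the residues it can stand for and counts how many concrete sequences an ambiguous sequence could represent. The function names are hypothetical, and expanding X to only the 20 amino acids of the standard genetic code (leaving out selenocysteine and pyrrolysine) is an assumption made for simplicity.

```python
# Ambiguity codes from the list above, mapped to the one-letter
# symbols of the residues they may stand for.
AMBIGUOUS = {
    "B": {"D", "N"},  # Asx: aspartic acid or asparagine
    "Z": {"E", "Q"},  # Glx: glutamic acid or glutamine
    "J": {"L", "I"},  # Xle: leucine or isoleucine
}

# One-letter symbols of the 20 amino acids of the standard genetic code.
# Assumption: X is expanded to these 20 only.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def possible_residues(symbol: str) -> set[str]:
    """Return the set of standard residues a one-letter symbol may denote."""
    symbol = symbol.upper()
    if symbol == "X":
        return set(STANDARD)
    if symbol in AMBIGUOUS:
        return set(AMBIGUOUS[symbol])
    if symbol in STANDARD:
        return {symbol}
    raise ValueError(f"unrecognized symbol: {symbol!r}")


def count_interpretations(sequence: str) -> int:
    """Count the concrete sequences consistent with an ambiguous sequence."""
    total = 1
    for symbol in sequence:
        total *= len(possible_residues(symbol))
    return total
```

For example, `count_interpretations("ABZ")` considers one choice for A and two each for B and Z.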

Chemical properties
Following is a table listing the one-letter symbols, the three-letter symbols, and the chemical properties of the side-chains of the standard amino acids. The masses listed are based on weighted averages of the elemental isotopes at their natural abundances. Note that forming a peptide bond results in elimination of a molecule of water, so the mass of an amino acid unit within a protein chain is reduced by 18.01524 Da.
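The water-loss correction can be sketched in a few lines of Python. The three average masses of the free amino acids below are rounded, commonly quoted values included only for illustration (they are not taken from this article's table), and the function names are hypothetical.

```python
# Average mass of water in Da, as given in the text above.
WATER_AVERAGE_MASS = 18.01524

# Average masses (Da) of a few free amino acids.
# Assumption: rounded, commonly quoted values, used for illustration only.
FREE_AVERAGE_MASS = {"Gly": 75.067, "Ala": 89.094, "Ser": 105.093}


def residue_mass(free_mass: float) -> float:
    """Mass of an amino acid unit inside a chain: free mass minus one water."""
    return free_mass - WATER_AVERAGE_MASS


def linear_peptide_mass(residues: list[str]) -> float:
    """Average mass of a linear peptide built from the named amino acids.

    A chain of n amino acids contains n - 1 peptide bonds, and each bond
    formed eliminates one molecule of water.
    """
    if not residues:
        raise ValueError("a peptide needs at least one amino acid")
    n_bonds = len(residues) - 1
    total_free = sum(FREE_AVERAGE_MASS[name] for name in residues)
    return total_free - n_bonds * WATER_AVERAGE_MASS
```

Calling `linear_peptide_mass(["Gly", "Ala"])` subtracts one water mass from the sum of the two free masses, because a dipeptide has a single peptide bond.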

General chemical properties

Side chain properties
Note: The pKa values of amino acids are typically slightly different when the amino acid is inside a protein. Protein pKa calculations are sometimes used to calculate the change in the pKa value of an amino acid in this situation.

Gene expression and biochemistry
* UAG is normally the amber stop codon, but encodes pyrrolysine if a PYLIS element is present.

** UGA is normally the opal (or umber) stop codon, but encodes selenocysteine if a SECIS element is present.

† The stop codon is not an amino acid, but is included for completeness.

†† UAG and UGA do not always act as stop codons (see above).

‡ An essential amino acid cannot be synthesized in humans and must, therefore, be supplied in the diet. Conditionally essential amino acids are not normally required in the diet, but must be supplied exogenously to specific populations that do not synthesize them in adequate amounts.
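The context-dependent reading of UAG and UGA described in the notes above can be expressed as a small lookup. The Python sketch below is a deliberate simplification: it reduces the presence of a SECIS or PYLIS element to a boolean flag, whereas in a cell these are structural features of the mRNA. The function name and flags are hypothetical. UAA (the ochre stop codon) is not mentioned in the notes and is included only so that all three stop codons are handled.

```python
STOP = "Stop"


def decode_stop_codon(codon: str,
                      has_secis: bool = False,
                      has_pylis: bool = False) -> str:
    """Decode one of the three stop codons, allowing for recoding.

    Assumption: the presence of a SECIS or PYLIS element is modeled as a
    simple flag supplied by the caller.
    """
    codon = codon.upper().replace("T", "U")  # accept DNA or RNA spelling
    if codon == "UAG":  # amber
        return "Pyl" if has_pylis else STOP
    if codon == "UGA":  # opal (umber)
        return "Sec" if has_secis else STOP
    if codon == "UAA":  # ochre; not recoded in this sketch
        return STOP
    raise ValueError(f"{codon!r} is not a stop codon")
```

With no flags set, all three codons decode as stops, which matches the default reading of the genetic code.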

Mass spectrometry
In mass spectrometry of peptides and proteins, it is useful to know the masses of the residues. The mass of the peptide or protein is the sum of the residue masses plus the mass of water.

§ Monoisotopic mass
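The rule that a peptide's mass equals the sum of its residue masses plus one water can be sketched in Python as follows. The monoisotopic residue masses are commonly tabulated values rounded to five decimal places, supplied here as an assumption rather than taken from this article's table, and the function name is hypothetical. Selenocysteine and pyrrolysine are omitted for brevity.

```python
# Monoisotopic mass of water (Da); commonly tabulated value.
WATER_MONOISOTOPIC = 18.01056

# Monoisotopic residue masses (Da) for the 20 amino acids of the standard
# genetic code. Assumption: commonly tabulated values, rounded to 5 decimals.
RESIDUE_MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def peptide_monoisotopic_mass(sequence: str) -> float:
    """Sum of residue masses plus one water, for an unmodified linear peptide."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    unknown = set(sequence) - set(RESIDUE_MONOISOTOPIC)
    if unknown:
        raise ValueError(f"no mass available for: {sorted(unknown)}")
    return sum(RESIDUE_MONOISOTOPIC[aa] for aa in sequence) + WATER_MONOISOTOPIC
```

Because leucine and isoleucine have identical residue masses, sequences that differ only in L versus I give the same result, which is one reason the Xle (J) code exists.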

Stoichiometry and metabolic cost in cell
The following table lists the abundance of amino acids in an E. coli cell and the metabolic cost (in ATP) of synthesizing each amino acid. Negative numbers indicate that the metabolic process is energetically favorable and does not cost the cell net ATP. Note that the abundance includes amino acids both in free form and in polymerized form (proteins).