The Genetic Code



1) The central dogma of molecular biology states that genetic information is stored in DNA, which is transcribed into RNA, which is in turn translated into protein. The key idea is the direction in which information flows (DNA > RNA > Protein).

2) DNA is transcribed into mRNA, which serves as the intermediate molecule between DNA and the protein it ultimately encodes. mRNA carries an RNA copy of the information encoded in DNA.
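
As a rough illustration of what "an RNA version" means, the sketch below converts a DNA sequence into the corresponding mRNA sequence. It assumes the input is the coding (sense) strand, so the mRNA reads the same except that uracil (U) takes the place of thymine (T). The function name transcribe and the example sequence are made up for illustration.

```python
def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence for a DNA coding (sense) strand.

    Simplification: the mRNA has the same sequence as the coding strand,
    with uracil (U) in place of thymine (T).
    """
    return coding_strand.upper().replace("T", "U")


mrna = transcribe("ATGGCCTGGTAA")
print(mrna)  # AUGGCCUGGUAA
```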

3) Genetic information is encoded in a triplet code, meaning that every 3 nucleotides of DNA (and RNA) code for a single amino acid of a protein.

4) Since there are 4 types of nucleotides and the code is based on a triplet codon (3 nucleotides in a row), there are 4 x 4 x 4 = 64 possible combinations to code for 20 different amino acids. This means that, in many cases, the same amino acid can be coded for by more than one triplet.
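
The counting argument can be checked by simply enumerating the combinations. The short sketch below (variable names are illustrative) lists every 2-letter and 3-letter word over the four RNA bases; it shows why a doublet code would be too small for 20 amino acids while a triplet code has room to spare.

```python
from itertools import product

BASES = "UCAG"  # the four RNA nucleotides

# All possible "words" of length 2 and length 3 over the four bases.
doublets = ["".join(p) for p in product(BASES, repeat=2)]
triplets = ["".join(p) for p in product(BASES, repeat=3)]

print(len(doublets))  # 16, too few to cover 20 amino acids
print(len(triplets))  # 64, enough for 20 amino acids plus stop signals
```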

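Standard genetic code (DNA codons; in mRNA, U replaces T)

1st  | 2nd base                                                           | 3rd
base | T                 | C         | A                | G               | base
-----+-------------------+-----------+------------------+-----------------+-----
T    | TTT Phe/F         | TCT Ser/S | TAT Tyr/Y        | TGT Cys/C       | T
     | TTC Phe/F         | TCC Ser/S | TAC Tyr/Y        | TGC Cys/C       | C
     | TTA Leu/L         | TCA Ser/S | TAA Stop (Ochre) | TGA Stop (Opal) | A
     | TTG Leu/L         | TCG Ser/S | TAG Stop (Amber) | TGG Trp/W       | G
-----+-------------------+-----------+------------------+-----------------+-----
C    | CTT Leu/L         | CCT Pro/P | CAT His/H        | CGT Arg/R       | T
     | CTC Leu/L         | CCC Pro/P | CAC His/H        | CGC Arg/R       | C
     | CTA Leu/L         | CCA Pro/P | CAA Gln/Q        | CGA Arg/R       | A
     | CTG Leu/L         | CCG Pro/P | CAG Gln/Q        | CGG Arg/R       | G
-----+-------------------+-----------+------------------+-----------------+-----
A    | ATT Ile/I         | ACT Thr/T | AAT Asn/N        | AGT Ser/S       | T
     | ATC Ile/I         | ACC Thr/T | AAC Asn/N        | AGC Ser/S       | C
     | ATA Ile/I         | ACA Thr/T | AAA Lys/K        | AGA Arg/R       | A
     | ATG Met/M (start) | ACG Thr/T | AAG Lys/K        | AGG Arg/R       | G
-----+-------------------+-----------+------------------+-----------------+-----
G    | GTT Val/V         | GCT Ala/A | GAT Asp/D        | GGT Gly/G       | T
     | GTC Val/V         | GCC Ala/A | GAC Asp/D        | GGC Gly/G       | C
     | GTA Val/V         | GCA Ala/A | GAA Glu/E        | GGA Gly/G       | A
     | GTG Val/V         | GCG Ala/A | GAG Glu/E        | GGG Gly/G       | G

Key: Phe = Phenylalanine, Leu = Leucine, Ile = Isoleucine, Met = Methionine, Val = Valine, Ser = Serine, Pro = Proline, Thr = Threonine, Ala = Alanine, Tyr = Tyrosine, His = Histidine, Gln = Glutamine, Asn = Asparagine, Lys = Lysine, Asp = Aspartic acid, Glu = Glutamic acid, Cys = Cysteine, Trp = Tryptophan, Arg = Arginine, Gly = Glycine.

Table from Wikipedia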

5) There are several non-obvious characteristics to the way information is encoded in mRNA. The genetic code is:

  • linear
  • unambiguous - each triplet specifies only one amino acid (or a stop signal)
  • degenerate - multiple codons can code for the same amino acid
  • commaless - each codon begins directly after the last without any spacer nucleotides
  • non-overlapping - each nucleotide belongs to only one codon
  • nearly universal - the same in nearly all living organisms, with few exceptions
  • contains start and stop signals (start and stop of translation)
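
The "commaless" and "non-overlapping" properties can be pictured as stepping along the mRNA three bases at a time. The sketch below is only an illustration: the helper name codons and its frame parameter are hypothetical. It splits a sequence into consecutive triplets and shows that starting one base later produces a completely different set of codons.

```python
def codons(mrna: str, frame: int = 0) -> list[str]:
    """Split mrna into consecutive, non-overlapping triplets starting at `frame`."""
    seq = mrna[frame:]
    # Step by 3 with no gaps (commaless) and no shared bases (non-overlapping);
    # an incomplete trailing triplet is dropped.
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


print(codons("AUGGCCUAA"))           # ['AUG', 'GCC', 'UAA']
print(codons("AUGGCCUAA", frame=1))  # ['UGG', 'CCU']
```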

6) Each codon of the mRNA pairs with a specific anticodon of a tRNA molecule. The tRNA serves as the adapter between a codon and a specific amino acid. This pairing between codon (of mRNA) and anticodon (of tRNA) occurs on the ribosome, where the mRNA is translated into protein.
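
Codon-anticodon pairing can be sketched as taking the reverse complement of the codon. This is a simplified model that assumes strict A-U and G-C base pairing only; the names PAIR and anticodon are illustrative.

```python
# Assumption: strict A-U / G-C base pairing between codon and anticodon.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon."""
    # The two strands pair antiparallel, so complement each base
    # and reverse the order.
    return "".join(PAIR[base] for base in reversed(codon))


print(anticodon("AUG"))  # CAU
```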

Figure Showing Translation on the Ribosome

7) Only tryptophan and methionine are encoded by a single codon.
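
This can be checked by counting how many codons map to each amino acid. The sketch below rebuilds the standard table from a compact 64-letter string of one-letter amino acid codes (codons ordered with U, C, A, G at each position; '*' marks a stop codon). The variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid (the degree of degeneracy).
counts = Counter(CODON_TABLE.values())
single = sorted(aa for aa, n in counts.items() if n == 1 and aa != "*")
print(single)  # ['M', 'W'], i.e. methionine and tryptophan
```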

8) Again, the code is degenerate, meaning that in many cases multiple codons code for the same amino acid. The codons for the same amino acid (anywhere from 2 to 6 of them) usually differ only slightly, most often at the third nucleotide, which can be thought of as a way to minimize the effects of spontaneous mutations: a mutation in the underlying DNA can sometimes still code for the same amino acid.
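
One way to see this buffering effect is to try every possible single-base substitution in every codon and count how many leave the meaning unchanged, separately for each codon position. The sketch below does this for the standard table; the function name synonymous_counts is hypothetical, and stop codons are treated like any other meaning.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_counts() -> list[int]:
    """Count single-base substitutions that keep the meaning, per codon position."""
    counts = [0, 0, 0]
    for codon, meaning in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == meaning:
                    counts[pos] += 1
    return counts


first, second, third = synonymous_counts()
# Each position has 64 codons x 3 alternative bases = 192 possible substitutions.
print(first, second, third)
```

Under these assumptions, the count for the third position is far larger than for the first or second, which matches the idea that third-position changes are the ones most often tolerated.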

9) There is also some order in that chemically similar amino acids tend to be coded for by similar codons. This can be thought of as an additional mutation buffer.

10) Not every codon codes for an amino acid. UAA, UGA and UAG code for the stop signal (stop translation here) and do not specify any amino acid. AUG codes for methionine and also serves as the start signal (as in start translation here).
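
Putting the start and stop signals together, translation can be sketched as: find the first AUG, read triplets from there, and stop at the first stop codon in that reading frame. This is a deliberately simplified model, and the function name translate and the example sequence are made up for illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for all 64 codons in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first in-frame stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start signal, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: end translation here
        protein.append(amino_acid)
    return "".join(protein)


# Reads AUG GCC UGG UAA: Met, Ala, Trp, then stop.
print(translate("GGAUGGCCUGGUAAGG"))  # MAW
```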