Horse Genetics. Ernest Bailey
20 amino acids found in proteins are presented, as well as the codon signals that start and stop protein synthesis.
Genetic code
DNA is an information molecule. It contains all the information necessary to transform a single cell, specifically a fertilized egg, into a complex, multicellular individual. Scientists were initially surprised that a molecule with only four basic units—A, T, G, and C—could deliver sufficient information. However, the information was found to be contained in the precise order of bases along the molecule. Each DNA molecule is millions of base pairs long with any one of the four bases possible at each position. The number of possible, random permutations of base order for chromosomes exceeds the number of animals that have ever existed! However, this is an information molecule and the order of bases is not random.
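The scale of that comparison can be checked with simple arithmetic: with four possible bases at each of n positions, there are 4^n possible sequences. The following is a minimal Python sketch of the calculation. The function names and the one-million-base example length are illustrative assumptions, not figures taken from the text.

```python
import math

BASES = "ATGC"  # the four DNA bases named in the text


def count_sequences(length: int) -> int:
    """Number of distinct base sequences of the given length (4 ** length)."""
    return len(BASES) ** length


def digits_in_count(length: int) -> int:
    """Decimal digits in 4 ** length, computed without building the huge integer."""
    if length == 0:
        return 1
    return math.floor(length * math.log10(len(BASES))) + 1


if __name__ == "__main__":
    # Three positions give the familiar count of possible codons.
    print(count_sequences(3))
    # Hypothetical molecule of one million bases: size of the count, in digits.
    print(digits_in_count(1_000_000))
```

Even for a molecule far shorter than a real chromosome, the count of possible sequences is a number with hundreds of thousands of digits.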
The function of a protein depends on the precise number and order of the amino acids in its composition. The DNA sequence specifies the order and composition of amino acids in a protein using a three-nucleotide code. Each group of three nucleotides that specifies a particular amino acid is called a “codon.” As ribosomes move along RNA molecules, they begin at a precise point determined by multiple factors (including the start codon, AUG), read the RNA sequence as a set of three nucleotides, then jump to the next set of three. Codons do not overlap; each set of three is read in sequence. The triplet codes of RNA bases for amino acids are shown in Table 4.1.
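The non-overlapping, three-at-a-time reading described above can be sketched in a few lines of Python. This is a simplified illustration only: the text notes that several factors determine the start point, whereas this sketch assumes the first AUG in the sequence is the start. The function name and the example sequence are made up for the illustration.

```python
START_CODON = "AUG"


def split_into_codons(mrna: str) -> list[str]:
    """Return non-overlapping triplets, beginning at the first AUG.

    Simplification: only the first AUG is used to set the reading frame.
    """
    start = mrna.find(START_CODON)
    if start == -1:
        return []  # no start codon, so nothing is read
    reading_frame = mrna[start:]
    # Step three bases at a time; drop any incomplete triplet at the end.
    usable = len(reading_frame) - len(reading_frame) % 3
    return [reading_frame[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    example = "GGAUGGCUGAAUGGUAAC"  # made-up sequence for illustration
    print(split_into_codons(example))
    # ['AUG', 'GCU', 'GAA', 'UGG', 'UAA']
```

The two bases before AUG are skipped, and the single base left over at the end is ignored because it does not form a complete codon.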
Because four possible bases are used to create codons of three bases, there are 64 possible codons (4³, that is, 4 × 4 × 4). As shown in Table 4.1, the 64 codons are used to signal the 20 amino acids (listed in columns 1 and 3) as well as to provide a signal to start (START) or stop (STOP) protein production. As there are 64 possibilities for 20 amino acids and two signals (start and stop), most amino acids can be encoded by more than one codon. This is referred to as “redundancy of the genetic code.” For example, alanine, in the top left corner of Table 4.1, is encoded by four different codons. Only two amino acids, methionine and tryptophan, have a single codon. The amino acids with the largest number of possible codons are arginine, leucine, and serine with six each.
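These counts can be verified by enumerating the code directly. The sketch below builds the standard genetic code from a compact 64-letter string (one-letter amino acid symbols, with "*" for a stop codon, listed in U, C, A, G order) and tallies the codons for each amino acid. The variable names are illustrative and the layout is not that of Table 4.1.

```python
from collections import Counter
from itertools import product

RNA_BASES = "UCAG"
# Standard genetic code, one letter per codon in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...); "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Pair every possible triplet with its one-letter amino acid symbol.
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}

# How many codons specify each amino acid (or stop)?
codons_per_amino_acid = Counter(CODON_TABLE.values())

if __name__ == "__main__":
    print(len(CODON_TABLE))            # 64 codons in total
    print(codons_per_amino_acid["A"])  # alanine: 4 codons
    single = sorted(a for a, n in codons_per_amino_acid.items() if n == 1)
    print(single)                      # ['M', 'W'] (methionine, tryptophan)
    six = sorted(a for a, n in codons_per_amino_acid.items() if n == 6)
    print(six)                         # ['L', 'R', 'S'] (leucine, arginine, serine)
```

Note that AUG appears once in the table: it encodes methionine and also serves as the start signal.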
Identification of genetic variation at the DNA level for horses
One of the most common types of genetic variation detectable at the DNA level is called a single nucleotide polymorphism (SNP, pronounced “SNiP”). Fig. 4.1 illustrates a SNP. Recall that DNA is made up of billions of bases called nucleotides. When just one of them is mutated, it may change from A to C, or G to T, or C to A, etc. When that happens, the DNA can exist in two forms at that site; in other words, the site becomes polymorphic (derived from the Greek for “many forms”).
Fig. 4.1. Comparison of base sequences of DNA strands illustrating a single nucleotide polymorphism (SNP). A hypothetical sequence is shown for five horses. The bold letter denotes the presence of a SNP, a site at which two horses have a T and three horses have a C.
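The comparison shown in Fig. 4.1 can be expressed as a small program that scans aligned sequences column by column and reports any position where more than one base occurs. This is a sketch under stated assumptions: the five sequences below are invented to mirror the caption (two horses with T, three with C) and are not the sequences in the figure, and the function name is hypothetical.

```python
def find_snp_sites(sequences: list[str]) -> dict[int, dict[str, int]]:
    """Map each polymorphic position (0-based) to the count of each base seen there."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = {}
    # zip(*sequences) yields one column of bases per position.
    for position, column in enumerate(zip(*sequences)):
        counts = {}
        for base in column:
            counts[base] = counts.get(base, 0) + 1
        if len(counts) > 1:  # more than one base at this site
            sites[position] = counts
    return sites


if __name__ == "__main__":
    # Invented sequences for five horses; only position 4 differs.
    horses = [
        "AGCTTGAC",
        "AGCTCGAC",
        "AGCTCGAC",
        "AGCTTGAC",
        "AGCTCGAC",
    ]
    print(find_snp_sites(horses))
    # {4: {'T': 2, 'C': 3}}
```

Real SNP discovery also has to handle sequencing errors and alignment, which this sketch ignores.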
Other types of variation include rearrangements of DNA, such as inversions, duplications, and translocations from one site to another, as well as insertions and deletions of DNA, ranging from single bases to large sections with tens of thousands of bases.
Mutations and variants affecting proteins
Where does genetic variation come from? Sometimes DNA is altered. We know that some chemicals, UV light, and radiation can alter DNA. In addition, random errors may occur when DNA strands are being copied. When this happens, the change is called a mutation. Most mutations do not cause problems. Most DNA does not code for proteins, so most changes are of little consequence (Chapter 6). Furthermore, the redundancy of the genetic code prevents many mutations from having an effect on the protein. For example, a mutation changing the codon GAA to GAG would still code for the amino acid glutamic acid (see Table 4.1). These kinds of mutations are called silent mutations, neutral mutations or, more commonly, synonymous mutations, since they do not change the amino acid. When the DNA sequence change does substitute a different amino acid, this may or may not have an impact on the protein. These types of mutations are called non-synonymous mutations.
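The distinction between synonymous and non-synonymous changes reduces to a lookup in the genetic code: translate the codon before and after the change and compare the results. The sketch below uses the standard genetic code in a compact form. The function name is hypothetical, and the second example (GAA to GAU) is added only for contrast.

```python
from itertools import product

RNA_BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(original: str, mutated: str) -> str:
    """Return 'synonymous' if both codons give the same amino acid, else 'non-synonymous'.

    Codons may be written with DNA (T) or RNA (U) letters.
    """
    before = CODON_TABLE[original.upper().replace("T", "U")]
    after = CODON_TABLE[mutated.upper().replace("T", "U")]
    return "synonymous" if before == after else "non-synonymous"


if __name__ == "__main__":
    print(classify_substitution("GAA", "GAG"))  # synonymous (both glutamic acid)
    print(classify_substitution("GAA", "GAU"))  # non-synonymous (aspartic acid)
```

For simplicity, a change to or from a stop codon is reported here as non-synonymous, although its consequences for the protein are usually more severe than a single amino acid substitution.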
Another type of mutation is called a frameshift mutation, caused by the insertion or deletion of one or more bases (in a number not divisible by three) within a protein-coding sequence. As noted above, the genetic code specifies amino acids based on reading the sequence in frames of three bases. If a base is added or deleted, this shifts the reading frame for every codon that follows. As a result, the amino acids in the protein following the frameshift mutation are very likely to be changed, and the shifted frame may introduce a premature stop codon, halting the translation process entirely. An example of a deletion causing a frameshift mutation is the variant responsible for severe combined immunodeficiency (SCID) in Arabian horses (see Chapter 16).
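The effect of a frameshift can be seen by translating a short sequence before and after deleting one base. The sketch below is illustrative only: the sequence is invented (it is not the SCID variant), the helper names are hypothetical, and translation is simplified to start at the first base and end at the first stop codon.

```python
from itertools import product

RNA_BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(RNA_BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first base, three bases at a time, until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon ends translation
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(sequence: str, index: int) -> str:
    """Return the sequence with the base at the given 0-based index removed."""
    return sequence[:index] + sequence[index + 1:]


if __name__ == "__main__":
    original = "AUGGCUAUAAGCUGGUAA"      # invented: AUG GCU AUA AGC UGG UAA
    shifted = delete_base(original, 4)   # remove the C of the second codon
    print(translate(original))           # MAISW
    print(translate(shifted))            # MV (new frame: AUG GUA UAA ...)
```

In this example the deletion changes the second amino acid and brings a stop codon into frame at the third position, so the protein is cut short.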
Effect of amino acid substitutions
The following table lists the chemical properties of the 20 amino acids that commonly appear in proteins. These are the amino acids that are specified by DNA/RNA using the genetic code as defined in Table 4.1. All amino acids have the same backbone, but they have side chains with different chemical properties as described in Table 4.2. These chemical properties are referred to as polar, non-polar, hydrophobic, acidic, basic, aromatic, or having other special properties. The chemical properties of the amino acid are key to performing the function of the protein.
Table 4.2. Chemical properties of amino acids based on their side chains.
Chemical group | Amino acids |
Hydrophobic (non-polar, uncharged) | Alanine, leucine, isoleucine, methionine, phenylalanine, tryptophan, tyrosine, valine |
Polar (uncharged) | Serine, threonine, asparagine, glutamine |
Aromatic | Tryptophan, phenylalanine, tyrosine |
Basic (positively charged) | Lysine, arginine, histidine |
Acidic (negatively charged) | Aspartic acid and glutamic acid |
Special properties | Cysteine, proline and glycine |
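The groupings in Table 4.2 can be written as a lookup structure to ask whether two amino acids share a chemical group. This is a rough sketch: the group labels are shortened from the table, the function names are hypothetical, and sharing a group is only a crude indication of similarity, not a prediction of whether a given substitution alters protein function.

```python
# Groups transcribed from Table 4.2; the aromatic amino acids also
# appear in the hydrophobic group, as in the table.
PROPERTY_GROUPS = {
    "hydrophobic": {"alanine", "leucine", "isoleucine", "methionine",
                    "phenylalanine", "tryptophan", "tyrosine", "valine"},
    "polar": {"serine", "threonine", "asparagine", "glutamine"},
    "aromatic": {"tryptophan", "phenylalanine", "tyrosine"},
    "basic": {"lysine", "arginine", "histidine"},
    "acidic": {"aspartic acid", "glutamic acid"},
    "special": {"cysteine", "proline", "glycine"},
}


def groups_of(amino_acid: str) -> set[str]:
    """Return every Table 4.2 group that lists the given amino acid."""
    name = amino_acid.lower()
    return {group for group, members in PROPERTY_GROUPS.items() if name in members}


def shares_group(first: str, second: str) -> bool:
    """True if the two amino acids appear together in at least one group."""
    return bool(groups_of(first) & groups_of(second))


if __name__ == "__main__":
    print(shares_group("serine", "threonine"))      # True: both polar
    print(shares_group("serine", "phenylalanine"))  # False: polar vs. hydrophobic
    print(sorted(groups_of("tyrosine")))            # ['aromatic', 'hydrophobic']
```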
If the substituted amino acid has profoundly different chemical properties, this can change or even destroy the function of the protein. For example, the MC1R gene (also known as melanocortin 1 receptor) is responsible for pigment production in melanocytes (Chapter 7 on the Extension locus). There are two well-known variants of this gene, one associated with the production of red pigment and another with the production of black pigment. The differences are the result of a substitution of a T for a C in one of the codons (Marklund et al., 1996). The situation is illustrated in Fig. 4.2 taken from the paper of Marklund et al. (