Lesson Explainer: The Genetic Code | Nagwa

Lesson Explainer: The Genetic Code Biology • Third Year of Secondary School

In this explainer, we will learn how to describe the nature of the genetic code and recall how information is transferred from DNA to protein.

The human genome is 3.1 billion base pairs long and is stuffed inside most of the cells in our body. This DNA accounts for all of our physical traits, like the color of our hair, the size of our hands, and even the way we taste certain foods! If you do not like cilantro, it may be because you have a variation in an olfactory receptor gene that makes you perceive the strong soapy-flavored aldehydes in cilantro leaves! All these traits are based on the DNA sequences of the relevant genes; once a gene is expressed, it produces a specific protein that gives rise to that particular trait. Simply put, the gene for red hair will be used to make a red hair protein that will make your hair red!

Key Term: Characteristic (Trait)

A characteristic is an observable, heritable feature of an organism.

The question is this: how are these DNA sequences interpreted to give us all these wonderful proteins and the corresponding traits?

Several steps lead from a DNA sequence to a protein that expresses the associated trait. Together, these steps are called the “central dogma of molecular biology,” which describes how the information in a gene goes from a DNA sequence to the corresponding protein sequence. Let’s look at this process in more detail using the example of insulin, a protein involved in regulating glucose levels in the body.

Key Term: Gene

A gene is a section of DNA that contains the information needed to produce a functional unit, for example, a protein. It is the functional unit of heredity.

The first step is to tell the cell that insulin needs to be made, which can be triggered by a rise in glucose levels. There are tens of thousands of genes in the human genome, and in this example, only the protein for insulin and nothing else is needed. The specific request for insulin to be made is given by signals that the cell receives. The cell then creates a message to make insulin in a process called transcription, where a copy of the gene is made that can leave the nucleus to be turned into a protein. This message is called messenger RNA (mRNA). For simplicity, the mRNA for insulin contains the same information that is in the insulin gene, except it is single-stranded RNA and not double-stranded DNA. Because it is single stranded, RNA is less stable than DNA and can degrade more easily. This is an important feature of mRNA and is used to help regulate its levels in the cell. A simplified overview of transcription is shown in Figure 1.

Key Term: mRNA (Messenger RNA)

mRNA is a message that is transcribed from the DNA of a gene and can be translated to make the corresponding protein.

Key Term: Transcription

Transcription is the process of converting a DNA sequence into mRNA.

Figure 1: A simplified overview showing the process of transcription, where a section of double-stranded DNA is converted into single-stranded mRNA.

Once the insulin mRNA is produced by transcription, the next step is to turn it into the insulin protein. The instructions for building the insulin protein are in the insulin mRNA, and all that needs to happen is for the mRNA to be translated into a linear sequence of amino acids. Translation occurs inside a macromolecular machine called the ribosome. Amino acids are brought in using specialized RNA molecules called transfer RNAs and are linked together to form a polypeptide chain. This is shown in Figure 2.

Key Term: Translation

Translation is the process of converting an mRNA sequence into a polypeptide that can fold into a protein.

Definition: Amino Acid

Amino acids are the individual monomers that together make up a protein.

Definition: Protein

A protein is a complex biological macromolecule, made up of amino acid monomers, that can have a wide variety of forms and functions.

Figure 2: An illustration showing the process of translation. The sequence in mRNA is translated to specific amino acids that are added to a growing polypeptide chain.

Due to the nature of the positive and negative charges within the individual amino acids, the polypeptide chain can then fold into the specific shape that gives the protein its function.

So, “the central dogma of molecular biology,” which is shown in Figure 3, consists of these two concepts: transcription and translation. This process describes how the DNA for a specific gene is transcribed into mRNA and then translated into the corresponding protein for that specific gene.

Figure 3: An illustration showing the “central dogma of molecular biology,” where a section of DNA is transcribed into mRNA that is then translated to an amino acid sequence.

This concept of the “central dogma of molecular biology” might be better understood if you think of a customer at a restaurant who is ordering food. The menu, in this analogy, is like the genome and contains all the genes. The customer who represents the cell can order whatever they require depending on their specific needs. Once they place an order, it is written up (or transcribed to mRNA) and delivered to the kitchen to be converted into the meal (or translated into the corresponding protein for the gene). Because the genome contains so much information, it is important to have a system to specify what genes need to be expressed as protein. And cells, like people ordering from a menu, have specific needs depending on their situation.

Example 1: Understanding the Central Dogma of Molecular Biology

The diagram provided shows a basic outline of the central dogma of molecular biology.

  1. What process has been replaced by label X?
    A. Transcription
    B. Mitosis
    C. Translation
    D. Synthesis
    E. Meiosis
  2. What process has been replaced by label Y?
    A. Synthesis
    B. Mitosis
    C. Transcription
    D. Meiosis
    E. Translation

Answer

The DNA in our cells contains the instructions for building all the proteins that make up our body. These instructions are in the form of genes. In order for a gene in DNA to be converted into protein, first, messenger RNA (mRNA) must be formed in a process called transcription. Transcription produces a sequence of mRNA that carries the same information that is in the DNA of the gene, but as it is RNA, the sequence includes uracil (U) in place of thymine (T). This mRNA sequence can then be translated into amino acids to form a polypeptide in a process called translation. The resulting polypeptide can then fold to form the corresponding protein for the gene.

Part 1

In the diagram, we can see a sequence of DNA being converted into mRNA, which is the first step in the central dogma of molecular biology, called transcription. Therefore, the correct answer is A: transcription.

Part 2

In the diagram, we can see a sequence of mRNA being converted into a polypeptide chain of amino acids, which is the process of translation. Therefore, the correct answer is E: translation.

Relationship: Central Dogma of Molecular Biology

DNA → mRNA → protein

We know that there are 20 standard amino acids that are involved in protein synthesis. We also know that there are 4 nucleotides in either DNA or RNA. So, how is the genetic code of this nucleotide sequence translated into these 20 amino acids?

If only one nucleotide (A, C, G, or U) were required to specify an amino acid, then there would be only 4 different amino acid possibilities. If two nucleotides were used, then we would have $4^2 = 16$ different possibilities, which is still not enough. With three nucleotides, we have $4^3 = 64$ different combinations, which is more than enough for the 20 amino acids observed. Therefore, the minimum size of a unit of the genetic code is 3 nucleotides.
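The counting argument above can be checked with a short sketch. The names `BASES` and `count_codes` are chosen here for illustration and are not part of the lesson; the code simply lists every possible sequence of a given length and compares the count with the 20 amino acids.

```python
from itertools import product

BASES = "ACGU"  # the four RNA nucleotides

def count_codes(length: int) -> int:
    """Count every distinct sequence of `length` bases."""
    return len(list(product(BASES, repeat=length)))

for n in (1, 2, 3):
    total = count_codes(n)
    # A code unit is only long enough if it offers at least 20 combinations.
    print(n, total, total >= 20)
```

Only a length of 3 gives at least 20 combinations, which matches the reasoning in the text.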

By using the genetic code, a DNA sequence like GAATTAGGCAGTGGGATTTAGCCA can easily be turned into amino acids.

The genetic code is a set of rules for how nucleotides are translated into specific amino acids. The genetic code states that three nucleotides, or a codon, determine the specific amino acid to be made into a protein.

So, our previous example of the DNA sequence GAATTAGGCAGTGGGATTTAGCCA can be converted to mRNA by transcription, in which each DNA base is paired with its complementary RNA base (with uracil (U) taking the place of thymine (T)): CUUAAUCCGUCACCCUAAAUCGGU. This mRNA can then be divided into 8 codons: CUU AAU CCG UCA CCC UAA AUC GGU.
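As a minimal sketch, the conversion above can be written in a few lines of Python. It assumes that the DNA sequence given is the strand being copied (the template) and that it is processed in the order written; the names `PAIRING`, `transcribe`, and `split_codons` are illustrative only.

```python
# RNA base that pairs with each DNA base on the strand being copied
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_dna: str) -> str:
    """Build the mRNA by pairing every DNA base with its RNA partner."""
    return "".join(PAIRING[base] for base in template_dna)

def split_codons(mrna: str) -> list[str]:
    """Cut the mRNA into consecutive groups of three bases."""
    return [mrna[i:i + 3] for i in range(0, len(mrna), 3)]

mrna = transcribe("GAATTAGGCAGTGGGATTTAGCCA")
codons = split_codons(mrna)
print(mrna)              # CUUAAUCCGUCACCCUAAAUCGGU
print(" ".join(codons))  # CUU AAU CCG UCA CCC UAA AUC GGU
```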

Each of these codons refers to a specific amino acid that is given by the genetic code.

Key Term: Genetic Code

The genetic code is formed by the sequence of nitrogenous bases in a messenger RNA (mRNA) molecule that is synthesized from DNA; it carries the information a cell needs to synthesize specific proteins.

Definition: Codon

A codon is a sequence of three nucleotides of DNA or RNA that corresponds to a specific amino acid.

You may notice that when the DNA sequence is divided into codons, there is no overlap.

The genetic code is nonoverlapping, meaning that the bases in each codon are only used once. So, an mRNA sequence “AUGGGACCU” is translated as 3 nonoverlapping codons as shown in Figure 4.

Figure 4: A diagram showing the difference between nonoverlapping codons and overlapping codons. The genetic code is nonoverlapping, so the first example at the top shows how codons are correctly read and translated.

Key Term: Nonoverlapping

The genetic code is nonoverlapping, meaning that the genetic code is translated in groups of three, and the same base is not used twice in translating a codon.
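The difference between the two ways of reading can be illustrated with a small sketch. The function names are hypothetical; `nonoverlapping` moves forward three bases at a time, which is how the genetic code is read, while `overlapping` moves forward one base at a time and is shown only for contrast.

```python
def nonoverlapping(mrna: str, size: int = 3) -> list[str]:
    """Read the sequence in steps of `size`, so each base is used once."""
    return [mrna[i:i + size] for i in range(0, len(mrna) - size + 1, size)]

def overlapping(mrna: str, size: int = 3) -> list[str]:
    """Read the sequence in steps of one base, so bases are reused."""
    return [mrna[i:i + size] for i in range(len(mrna) - size + 1)]

print(nonoverlapping("AUGGGACCU"))  # ['AUG', 'GGA', 'CCU']
print(overlapping("AUGGGACCU"))     # ['AUG', 'UGG', 'GGG', 'GGA', 'GAC', 'ACC', 'CCU']
```

The nonoverlapping reading gives 3 codons from 9 bases, whereas an overlapping reading of the same bases would give 7.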

Example 2: Understanding the Rules of the Genetic Code for Reading Codons

A student reads the following sequence of mRNA bases: UACGAGAACCGA. They divide it up into the following codons: UACG AGAA CCGA. What is wrong with this sequence of codons?

  A. The codons overlap.
  B. Codons should be 3 bases long.
  C. Codons should be read as DNA bases.
  D. There is nothing wrong; this sequence is correct.

Answer

In order for a gene in DNA to be converted into protein, first, messenger RNA (mRNA) must be formed in a process called transcription. Transcription produces a sequence of mRNA that carries the same information that is in the DNA of the gene, but it is RNA, so it includes uracil (U) in place of thymine (T). This mRNA sequence can then be translated into amino acids to form a polypeptide in a process called translation. The resulting polypeptide can fold to form the corresponding protein for the gene.

In order to translate mRNA to protein, the sequence of nucleotides in the mRNA is converted to amino acids using the genetic code. Sequences of three bases, called codons, are what correspond to specific amino acids. For example, the codon in mRNA “GGG” codes for the amino acid glycine during translation. These codons are nonoverlapping, meaning the same base is not used twice in a specific sequence. So, GGGUAA would correspond to GGG UAA and would not repeat from the second G (so, not GGG GGU GUA).

The sequence in this example UACGAGAACCGA would therefore be translated as the codons UAC GAG AAC CGA. The example of UACG AGAA CCGA is grouped in codons that are 4 bases long, which is incorrect.

Therefore, the correct answer is B: codons should be 3 bases long.

In the genetic code, each codon identifies a specific amino acid. For example, the DNA sequence “ATG” is a codon that codes for the amino acid methionine. Remember that it is mRNA that is translated, and not DNA, so the “ATG” codon would actually be read as “AUG” during translation, because uracil (U) in RNA replaces thymine (T) in DNA.

In Figure 5, we can see how different codons can be translated to code for 20 amino acids.

Figure 5: A codon wheel that shows the different combinations of nucleotides and the corresponding amino acids. Special “start” and “stop” codons also exist; these are codes used to either begin translation or terminate it.

So, if you have the mRNA sequence AUGGGGUCU, the corresponding codons would be AUG-GGG-UCU, which translates to Met-Gly-Ser.

You will notice in Figure 5 that there is redundancy (also called degeneracy) in the genetic code, meaning that some amino acids are coded by multiple codons. Arginine (Arg) can be coded by 6 different codons: CGU, CGC, CGA, CGG, AGA, and AGG, while methionine (Met) can only be coded by 1: AUG.

Key Term: Degenerate

The genetic code is degenerate, or redundant, because some of the amino acids can be translated from different codons.
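Degeneracy can be made concrete by counting how many codons map to each amino acid. The sketch below builds the standard codon table from a compact 64-letter string listed in U, C, A, G order; that encoding, and the names `CODON_TABLE` and `codons_for`, are choices made for this example rather than anything from the lesson. Amino acids appear as single-letter codes, and `*` marks a stop codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ... GGG ("*" = stop)
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def codons_for(amino_acid: str) -> list[str]:
    """List every codon that codes for the given single-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)

counts = Counter(CODON_TABLE.values())
print(counts["R"], codons_for("R"))  # arginine
print(counts["M"], codons_for("M"))  # methionine
```

Running this shows several codons for arginine (R) but a single codon, AUG, for methionine (M).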

Example 3: Using a Codon Wheel to Determine a Sequence of Amino Acids

A sequence of DNA is transcribed into an RNA sequence. This RNA sequence reads 5′-GCUUUCACGCAC-3′.

Use the codon wheel provided to determine the sequence of amino acids.

  A. Arg, Ser, Thr, Pro
  B. Ser, Leu, Ala, His
  C. Ala, Phe, Thr, His
  D. Ser, Leu, Ala, Gln
  E. Ala, Leu, Thr, Gln

Answer

In order for a gene in DNA to be translated into protein, first, messenger RNA (mRNA) must be formed in a process called transcription. Transcription produces a sequence of mRNA that carries the same information that is in the DNA of the gene, but it is RNA, so it includes uracil (U) in place of thymine (T).

In order to translate mRNA to protein, the sequence of nucleotides in the mRNA is converted to amino acids using the genetic code. Sequences of three bases, called codons, are what correspond to specific amino acids. For example, the codon in mRNA “GGG” codes for the amino acid glycine during translation. These codons are nonoverlapping, meaning the same base is not used twice in a specific sequence. So, GGGUAA would correspond to GGG UAA and would not repeat from the second G (so, not GGG GGU GUA).

A codon wheel, as shown above, displays the genetic code: each codon and its corresponding amino acid. To use it, start from the center of the wheel (at the 5′ end) and work your way out (toward the 3′ end). You will recall that mRNA sequences are read in the 5′→3′ direction by convention. So, for the codon 5′-GAG-3′, you would work through the codon wheel starting from the center and choosing “G,” then “A” in the next area, then “G” in the next area. This gives the amino acid Glu.

The provided sequence 5′-GCUUUCACGCAC-3′ can be broken up into codons after every third nucleotide, which gives the codons GCU UUC ACG CAC. Let’s look at how each codon translates to its amino acid. Using the codon wheel and starting from the center of the wheel for “GCU,” focus in on the G quadrant, then move out to the C, then finally to the U at the outer edge of the circle. You can see this corresponds to Ala (or alanine). For the remaining codons,

  • UUC corresponds to Phe,
  • ACG corresponds to Thr,
  • CAC corresponds to His.

Therefore, the correct answer is C: Ala, Phe, Thr, His.
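Reading a codon wheel from the center outward can be mimicked with a nested lookup, as in the sketch below. The structure `wheel[first][second][third]` and the function `translate` are illustrative names, and the table is the standard genetic code written as a compact string in U, C, A, G order. For simplicity, this sketch labels stop codons as "Stop" rather than ending translation there.

```python
from itertools import product

BASES = "UCAG"
# One letter per codon, ordered UUU, UUC, UUA, UUG, UCU, ... GGG ("*" = stop)
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe", "G": "Gly",
    "H": "His", "I": "Ile", "K": "Lys", "L": "Leu", "M": "Met", "N": "Asn",
    "P": "Pro", "Q": "Gln", "R": "Arg", "S": "Ser", "T": "Thr", "V": "Val",
    "W": "Trp", "Y": "Tyr", "*": "Stop",
}

# wheel[first][second][third]: center ring, middle ring, outer ring
wheel: dict = {}
for (first, second, third), letter in zip(product(BASES, repeat=3), ONE_LETTER):
    wheel.setdefault(first, {}).setdefault(second, {})[third] = THREE_LETTER[letter]

def translate(mrna: str) -> list[str]:
    """Split the mRNA into codons and look each one up, center outward."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]
    return [wheel[c[0]][c[1]][c[2]] for c in codons]

print(translate("GCUUUCACGCAC"))  # ['Ala', 'Phe', 'Thr', 'His']
print(translate("AUGGGGUCU"))     # ['Met', 'Gly', 'Ser']
```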

For the most part, the genetic code is the same code used by all life on Earth. So, “GGG” always codes for glycine in bacteria and in humans—it is universal!

Key Term: Universal

The genetic code is universal, meaning it applies to all organisms that use DNA as genetic material on Earth.

Example 4: Features of the Genetic Code

Which of the following correctly describes the features of the genetic code?

  A. It is degenerate, organism specific, and nonoverlapping.
  B. It is degenerate, universal, and nonoverlapping.
  C. It is universal, organism specific, and nonrepeating.

Answer

In order for a gene in DNA to be converted into protein, first, messenger RNA (mRNA) must be formed in a process called transcription. Transcription produces a sequence of mRNA that carries the same information that is in the DNA of the gene, but it is RNA, so it includes uracil (U) in place of thymine (T). This mRNA sequence can then be translated into amino acids to form a polypeptide in a process called translation. The resulting polypeptide can fold to form the corresponding protein for the gene.

In order to translate mRNA to protein, the sequence of nucleotides in the mRNA is converted to amino acids using the genetic code. Chunks of three bases, called codons, are what correspond to specific amino acids. For example, the codon in mRNA “GGG” codes for the amino acid glycine during translation. There are three features of the genetic code:

  1. It is degenerate or redundant, meaning that one amino acid can be translated from multiple codons.
  2. Codons are nonoverlapping, meaning that the same base is not used twice in a specific sequence. So, GGGUAA would correspond to GGG UAA and would not repeat from the second G (so, not GGG GGU GUA).
  3. It is universal, meaning that the genetic code is what all organisms on Earth use to make protein from mRNA. It is not organism specific.

Therefore, the correct answer is B: It is degenerate, universal, and nonoverlapping.

Since the genetic code is universal, it becomes possible to compare the amino acid sequences of proteins across all forms of life. To compare amino acid sequences between species of organisms, the sequences are aligned. These sequence alignments can then be examined for changes between species.

For example, the protein sequence of insulin can be aligned to show where there are similarities and differences between chickens, humans, and chimpanzees! Areas of the protein sequence that are conserved tend to be in crucial regions for the protein’s function, and this can give us clues about how the protein works. We can even use this information to learn more about how proteins evolve, or change over time, which we can extend to finding evolutionary relationships between organisms.

In Figure 6, the amino acid sequences for insulin in chickens, humans, and chimpanzees are aligned for comparison. By aligning these proteins, we can see where there are differences, and the more differences there are, the more distantly related the organisms are. Notice how similar insulin is between the human and chimpanzee (only 1 amino acid is different) compared to the insulin protein in a chicken (with about 10 differences). From this, scientists would infer that chickens are more distantly related to us than chimpanzees, because there are more differences (mutations) in the chicken insulin compared to the chimpanzee insulin. This is because mutations accumulate gradually over long periods of time.

Figure 6: A protein alignment for a segment of the insulin protein in chickens, humans, and chimpanzees. The letters refer to the standard single-letter convention used for amino acids. Differences are highlighted in red.
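Counting differences between aligned sequences can be sketched as follows. The three sequences here are short, made-up placeholders chosen only to show the counting; they are not real insulin sequences and do not reproduce the data in Figure 6. The function name `count_differences` is also hypothetical.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the aligned positions where two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)

# Invented placeholder sequences in single-letter amino acid code
reference = "MKLVAGERC"
close_relative = "MKLVAGDRC"
distant_relative = "MRLIAGDKC"

print(count_differences(reference, close_relative))    # 1
print(count_differences(reference, distant_relative))  # 4
```

In this toy example, the sequence with fewer differences from the reference would be interpreted as belonging to the more closely related organism.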

Let’s recap some of the key points we have covered in this explainer.

Key Points

  • The central dogma of molecular biology is that DNA is transcribed to mRNA that is then translated to protein.
  • The mRNA sequence is broken up into codons that correspond to specific amino acids based on the genetic code.
  • The genetic code is read in nonoverlapping groups of 3 nucleotides (codons) and is degenerate (redundant).
  • The genetic code is universal and is shared by all life on Earth; because of this, protein sequences can be aligned to show changes across evolutionary time.
