Lecture 4
MOLECULAR GENETICS: DNA, RNA AND PROTEIN
DNA is a double stranded molecule made of four subunits called nucleotides. Each nucleotide is composed of a nitrogenous base (A = adenine, T = thymine, C = cytosine, G = guanine) attached to a sugar called deoxyribose, and the sugar is attached to a phosphate group which is negatively charged. The double stranded DNA helix is like a twisted ladder. The sides of the ladder are repeating sugar-phosphate units and the rungs of the ladder are the bases, A, T, C, G. The rungs are always A paired with T and C paired with G. This specific complementary base pairing is a chemical necessity and is responsible for the faithfulness of DNA replication each time a cell divides. The double strandedness also gives the molecule stability, but the bonds between the bases are weak bonds and can be broken for replication and transcription to occur. For any given gene, the genetic code is read from the sequence of bases in only one of the two strands...only one of the sides of the ladder carries that gene's message.
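The pairing rule can be written out as a small lookup table. The sketch below is only an illustration: the names PAIRS and complement and the example sequence are made up for these notes, and the two strands are simply written side by side, base for base.

```python
# Pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the partner strand, base by base, for one side of the ladder."""
    return "".join(PAIRS[base] for base in strand)

top = "ATGCCGTA"             # hypothetical example sequence
bottom = complement(top)     # the other side of the ladder

print(top)
print(bottom)
```

Applying complement twice gives back the starting strand, which is why either side of the ladder is enough to rebuild the other.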
RNAs are single stranded molecules also composed of four different nucleotides. However, the nucleotides in RNA contain the sugar ribose instead of deoxyribose and they contain uracil instead of thymine. RNA is synthesized on a DNA template by complementary base pairing, so its sequence matches the "sense" strand of the DNA molecule (with U in place of T) and is complementary to the template strand from which it is copied. There are three kinds of RNA in the cell: messenger RNA (mRNA), transfer RNA (tRNA) and ribosomal RNA (rRNA). These will be the subject of a future lecture.
ATP (adenosine triphosphate) is a nucleotide that serves as the cell's energy coenzyme. It is a very important molecule in energy metabolism and works with a large variety of enzymes. It is the common "dollar bill" used by all cells to store and transfer energy in cellular processes.
Polypeptide is the name given to a chain of amino acids synthesized from one mRNA. The term polypeptide refers to the fact that amino acids are linked by what is called a peptide bond and, of course, poly means many. Proteins are usually composed of more than one polypeptide chain. Each polypeptide chain is coded for by a gene.
Proteins have a variety of functions. A very large number are enzymes. Enzymes catalyze reactions and make them go much, much faster than they would if the enzyme were not around. The sugar on your table would eventually break down to CO2 and H2O but it would take a very, very long time. When you put sugar into your body, the sugar is broken down within minutes to provide ATP and to release CO2 and H2O. Enzymes are responsible for all the metabolic reactions that occur in all the cells of our bodies. Other proteins act as carriers; an example is hemoglobin, which carries oxygen in the blood and has four polypeptide chains. Some proteins are cell membrane receptors for protein hormones. Others are antibodies, the molecules that are made by special white blood cells (lymphocytes) to fight off foreign organisms and molecules. Antibodies are proteins that contain four polypeptide chains and they are all similar except for the regions which bind to the foreign cell, virus, or other molecule. Each antibody-producing cell produces only one kind of antibody and once stimulated it "remembers" and will produce its antibodies whenever it meets the same invader. That is why we get booster shots for vaccinations: to help our cells remember to make antibodies to the organism or molecule against which we were inoculated. DNA binding proteins turn genes on and off, so they regulate what each cell is producing. Not all cells make the same proteins and even the same cell may make different proteins at different times. Some proteins like collagen, found in connective tissues, and keratin, found in hair, are purely structural proteins. These examples are not exhaustive of the many functions served by proteins.
Proteins are composed of 20 different amino acids . . . The sequence of amino acids is called its primary structure and the sequence is determined by the genetic code. The final shape of any particular protein depends on the sequence of amino acids it contains. Except for the purely structural proteins, most proteins are globular but they often contain regions of alpha helices and sometimes, beta sheets within them. And, as stated earlier, most functional proteins have more than one polypeptide chain. Although proteins are macromolecules, they are still quite small compared to the size of a cell. A cell will contain many thousands of different proteins that carry out its work.
THE CENTRAL DOGMA
To understand the nucleic acids and proteins, it is important to understand how they are related. The Central Dogma of molecular biology states that: DNA makes DNA (replication); DNA makes RNA (transcription) and RNA directs the synthesis of proteins (translation). DNA is the genetic material in all cells and in eukaryotic cells it stays in the nucleus in the form of chromosomes. RNA molecules are copies of genes, much like a blueprint of a house. The RNA that codes for a protein is appropriately called messenger RNA (mRNA). Messenger RNA goes out to the cytoplasm of the cell where the sequence of nucleotides is translated into a sequence of amino acids to form a unique protein.
DNA (deoxyribonucleic acid) is the genetic material found in all living cells (and many viruses). In cells, DNA is always found as a double helix. In prokaryotic cells (and in mitochondria and chloroplasts), there is typically a single circular DNA molecule. In eukaryotic cells there are several pairs of chromosomes and although they are more complex, each chromosome contains a single, very long molecule of DNA. The genes are within the DNA molecules.
Mendel, the father of genetics, knew nothing of chromosomes or DNA. He deduced the laws of inheritance purely from observations of the progeny of his pea plants. The discovery that DNA is the genetic material came around the middle of the 20th century. Experiments with bacteria and viruses showed that DNA, and not protein as many people at the time believed, was the genetic material. People had reasoned that proteins, with 20 different kinds of subunits, were more complex than nucleic acids, with only 4 different subunits. (They forgot that the Morse code, which consists only of dots and dashes, can code for all 26 letters of our alphabet.) Scientists were able to show that with bacterial viruses, the viral DNA entered the host cell and was able to direct the synthesis of complete viral particles, both the DNA and the protein capsid, while the viral protein remained outside the host cell.
The structure of DNA was deduced in 1953 by Watson and Crick using the accumulated biochemical knowledge of DNA and the X-ray diffraction pictures of Rosalind Franklin. It was already known that DNA was composed of four subunits called nucleotides which were composed of phosphate, sugar and one of four different bases. These bases were adenine, thymine, cytosine and guanine, which are referred to as A, T, C, and G. It was also known that in every DNA molecule, the number of A's always equaled the number of T's, and the number of C's always equaled the number of G's.
Part of Watson and Crick's success was due to their thinking as biologists. They reasoned out the structure from their knowledge of biochemistry, physical chemistry and the role of DNA in the cell. They knew the structure of DNA would have to be able to explain how the molecule replicated and how it coded for proteins. (It had previously been shown by biochemical geneticists that genes code for proteins.) They proposed that the reason A = T and C = G was because they formed complementary base pairs between two strands of a double helix. The bases formed the rungs of the ladder-like molecule. The outside strands of the ladder were repetitions of sugar and phosphate, sugar and phosphate, etc.
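The link between base pairing and the observed equalities (A = T, C = G) can be checked by counting bases over both strands of a paired molecule. This is a minimal sketch, not a model of real DNA: the function name base_counts and the example sequence are made up for these notes.

```python
from collections import Counter

# Pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def base_counts(strand: str) -> Counter:
    """Count the bases in a double stranded molecule built from one strand."""
    partner = "".join(PAIRS[base] for base in strand)
    return Counter(strand + partner)

counts = base_counts("ATGCCGTAAGC")   # hypothetical example sequence
for base in "ATCG":
    print(base, counts[base])
```

A single strand on its own need not have equal numbers of A and T. The equality appears only when both strands are counted, because every A on one strand sits opposite a T on the other.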
It was further proposed that replication was accomplished by the separation of the two parent strands, each acting as a template to attract the complementary bases of new nucleotides to form a new half of the molecule. Thus, DNA molecules rely on complementary base pairing for replication, with each new double stranded molecule having one parent strand and one newly assembled strand. This is called "semi-conservative" replication since one of the parent strands is "conserved" in each new DNA molecule. The bonds between the sugars (deoxyribose) and phosphates are strong covalent bonds, while the hydrogen bonds between the bases are relatively weak (individually) and can be broken to open the molecule for replication and transcription.
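Semi-conservative replication can be sketched in a few lines. The names new_partner and replicate and the example duplex are assumptions made for these notes. A duplex is represented simply as a pair of strings written base for base, and the enzymes that do the real work are left out.

```python
# Pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def new_partner(template: str) -> str:
    """Assemble a new strand by pairing a base with each base of the template."""
    return "".join(PAIRS[base] for base in template)

def replicate(duplex):
    """Separate the two parent strands and build a new partner on each one."""
    strand_1, strand_2 = duplex
    daughter_a = (strand_1, new_partner(strand_1))   # old strand 1 + new strand
    daughter_b = (new_partner(strand_2), strand_2)   # new strand + old strand 2
    return daughter_a, daughter_b

parent = ("ATGCCGTA", "TACGGCAT")    # hypothetical example duplex
daughter_a, daughter_b = replicate(parent)
print(daughter_a)
print(daughter_b)
```

Both daughters come out identical to the parent, and each one keeps exactly one of the two original strands.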
Watson and Crick proposed that the genetic code could be found in the sequence of bases in one of the two strands. (Only one strand carries the genetic message and is read.) A little later, the "code was broken" and found to be a three-letter non-overlapping code. The three "letters" are the bases and each sequence of three bases is called a codon. Codons code for amino acids. If the code is read 3 bases at a time, there are 4 x 4 x 4 = 64 combinations of three bases. There are only 20 amino acids, so this means the code is "redundant": more than one codon can code for the same amino acid. The codon AUG signals the "start" of translation and three different codons signal the termination of translation (UAA, UGA, UAG). [Probably the original code was a two letter code with 4 x 4 = 16 different doublet codons (a number closer to 20) but when a few "fancier" amino acids were added to the cell's repertoire, more codons were needed. When you look at the table of codons you will see that the "third" base is less important: for many amino acids any of the four bases can occupy the third place, and only the first two bases matter.] The linear sequence of bases in the DNA (gene) codes for the linear sequence of amino acids in the polypeptide (protein). Since proteins usually contain hundreds of amino acids, genes can be very long stretches of DNA.
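The arithmetic above can be checked by listing every possible codon. The sketch below is illustrative only: the variable names, the helper split_codons and the short example message are made up for these notes.

```python
from itertools import product

BASES = "UCAG"                          # the four bases as they appear in mRNA
STOP_CODONS = {"UAA", "UAG", "UGA"}     # the three termination codons

# Every possible three-letter and two-letter "word" built from four bases.
triplets = ["".join(letters) for letters in product(BASES, repeat=3)]
doublets = ["".join(letters) for letters in product(BASES, repeat=2)]

# Codons left over to code for amino acids once the stop codons are set aside.
amino_acid_codons = [codon for codon in triplets if codon not in STOP_CODONS]

def split_codons(mrna: str) -> list:
    """Read a message three bases at a time without overlapping."""
    usable = len(mrna) - len(mrna) % 3   # ignore an incomplete final codon
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

print(len(triplets), len(doublets), len(amino_acid_codons))
print(split_codons("AUGGCUUAA"))        # hypothetical example message
```

With 61 codons available for only 20 amino acids, several codons must share the same amino acid, which is the redundancy described above.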
Many enzymes are devoted to the process of replication and the repair of mistakes in the structure of DNA. If these "mistakes" in replication or damage to DNA caused by chemical and physical agents (e.g., UV light, X-rays, tobacco, and other mutagens) are not repaired, the result is a mutation either in your gametes or in your body cells. If a mutation occurs in your body cells, it can result in cancer. Most carcinogens (agents that cause cancer) are, in fact, mutagens.
The Central Dogma of molecular biology says that the flow of information in the cell is:
DNA --(transcription)--> RNA (mRNA, rRNA, tRNA) --(translation)--> Proteins
DNA --(replication)--> DNA
In eukaryotic cells, replication and transcription occur in the nucleus on DNA templates and translation occurs in the cytosol on ribosomes (either free or on the RER). In prokaryotic cells all three occur in the cytosol.
Proteins, often composed of more than one polypeptide chain, are the work horses of the cells. Each polypeptide chain is coded for by a different gene. Not all genes in each cell are "turned on." The cells of the tissues and organs of your body make only the proteins required for their function. Therefore, not all the DNA is transcribed in every cell. There are special regulatory proteins that have the job of controlling which genes (stretches of DNA) will be transcribed in the various tissues.
In eukaryotes, DNA never leaves the nucleus so the genetic messages the DNA contains must be "transcribed" for export to the ribosomes where they will be translated into proteins (polypeptides). Transcription is rather like replication. It also occurs on a DNA template and, as in replication, the DNA opens up, but only one of the two strands is copied into a messenger RNA molecule. The copying process uses complementary base pairing just as replication does. However, the resulting RNA molecule is single stranded and uses uracil (U) instead of T to pair with A.
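Transcription follows the same pairing logic as replication, with U taking the place of T in the new strand. The following is a minimal sketch under stated assumptions: the names DNA_TO_RNA and transcribe and the example template are made up for these notes, and the strands are written base for base with no attention to the direction in which they are read.

```python
# Pairing used when RNA is copied from a DNA template strand:
# A in the template pairs with U in the RNA, T with A, C with G, G with C.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template: str) -> str:
    """Build the single stranded RNA copy of one DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)

template = "TACCGAATT"       # hypothetical template strand
other_strand = "ATGGCTTAA"   # its DNA partner, which is not copied
mrna = transcribe(template)

print(mrna)
print(other_strand.replace("T", "U"))
```

The two printed lines are identical: the RNA is complementary to the strand it was copied from, so it carries the same sequence as the other DNA strand with U in place of T.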
The messenger RNA (mRNA) is thus a copy of a gene. It leaves the nucleus through nuclear pores and travels to the ribosomes. The ribosomes are composed of ribosomal RNAs (synthesized in the nucleolus in eukaryotes) and ribosomal proteins. They are rather complex structures composed of a small and a large subunit. Ribosomal RNA is also transcribed, from rRNA genes. Transfer RNAs (tRNAs) are another type of RNA. They, too, are transcribed, from tRNA genes. They are small molecules compared to rRNAs and mRNAs. They have the very important function of reading the codons of the mRNA and of bringing the correct amino acid, corresponding to that specific codon, into alignment on the ribosome. There are enzymes that specifically attach the correct amino acid to the correct tRNA with the appropriate anticodon. The tRNAs are all about the same length from head to foot. They read the codon with their "head" end and align the amino acids at their "foot" end. The ribosome then zips up the amino acids into the polypeptide chain.
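The reading of codons by tRNAs can be sketched as a pair of lookups. This is an illustration under stated assumptions, not a full model: only four of the amino acid codons are included, the names TRNAS, anticodon and translate and the example message are made up for these notes, and each anticodon is written base for base against its codon.

```python
# Pairing between the bases of an mRNA codon and a tRNA anticodon.
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}

# A small, partial piece of the codon table (four amino acids only).
CODON_TO_AMINO_ACID = {"AUG": "Met", "GCU": "Ala", "UUU": "Phe", "GGC": "Gly"}
STOP_CODONS = {"UAA", "UAG", "UGA"}

def anticodon(codon: str) -> str:
    """Return the three bases at the tRNA "head" that pair with a codon."""
    return "".join(RNA_PAIRS[base] for base in codon)

# Each tRNA is identified by its anticodon and carries one amino acid.
TRNAS = {anticodon(codon): amino_acid
         for codon, amino_acid in CODON_TO_AMINO_ACID.items()}

def translate(mrna: str) -> list:
    """Read codons from the first AUG until a stop codon is reached."""
    start = mrna.find("AUG")
    if start == -1:
        return []                        # no start codon, nothing is made
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            break
        # A codon missing from the partial table above raises KeyError.
        chain.append(TRNAS[anticodon(codon)])
    return chain

print(translate("GGAUGGCUUUUGGCUAAGG"))  # hypothetical example message
```

The bases before the start codon and after the stop codon are not translated, so the example message yields a chain of four amino acids.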