Nucleic Acids

The Essentials of Nucleic Acids

The human genome contains about three billion base pairs, enough information to more than fill a 500,000-page textbook. The DNA is contained in 46 chromosomes (44 autosomes and 2 sex chromosomes), and each chromosome is one continuous DNA molecule complexed with several proteins. The smallest chromosome, 21, has more than 50 million base pairs, whereas chromosome 1, the largest, has over 250 million base pairs. There is enough DNA to form several hundred thousand genes; however, it is estimated that only about 67,000 genes encode a human being. This would indicate that less than 5 percent of the DNA is used to code for protein. The remainder of the DNA provides spacing, structure, regulatory information, and other as yet unknown functions.

DNA consists of four building blocks referred to as nucleotides or merely as bases. A nucleotide consists of a nitrogenous base, a 5-carbon sugar (deoxyribose), and a phosphate group (Fig. 4-1). There are two purine bases (adenine and guanine) and two pyrimidine bases (cytosine and thymine) (Fig. 4-2). The phosphate group is bonded to the 5' carbon of the sugar, and the base is bonded to the 1' carbon of the sugar. Each DNA molecule consists of millions of nucleotides joined together in a linear fashion through the phosphate group, which forms a bond with the hydroxyl group of the 3' carbon of the next sugar. The sugars and phosphate groups form the backbone of the molecule, and because the phosphate groups are water-soluble, they face outward. Attached to the inner side of the sugar is the hydrophobic base, which faces inward and is thereby shielded from the aqueous environment. The molecule forms a right-handed helix with a turn every 10 nucleotides (3.4 nm) and pairs with its complementary strand to form the so-called double helix (Fig. 4-3). The center of the molecule consists of the bases, which face inward and lie opposite each other. This arrangement provides for the hydrogen bonding between the bases that keeps the two strands together. The hydrogen bonds are perpendicular to the helical axis. The directionality of the strands is referred to as 5' to 3' or 3' to 5', which is based on the position of the carbons in the sugar. The end of the molecule with a phosphate or hydroxyl group on the 5' carbon is termed the 5' end, whereas the end with a free hydroxyl group on the terminal 3' carbon is referred to as the 3' end. It is important to distinguish the two ends because the enzyme DNA polymerase always synthesizes new DNA in the 5' to 3' direction, adding nucleotides to the 3' end of the growing strand. There seem to be no constraints on which bases can be adjacent to each other; however, the hydrogen bonding between the bases of the two chains is highly specific, since adenine (A) always pairs with thymine (T), and guanine (G) always pairs with cytosine (C).
The sugars and the phosphate groups are always the same, whereas the sequence of the bases varies and determines the nature of the hereditary information to be passed on to the progeny. The specificity of this "base pairing" is the basis of the ability of DNA to replicate itself and pass on the characteristics of the genotype; it also forms the basis for the specificity of essentially all the procedures used in recombinant DNA technology. During the process of DNA replication, the strands separate, and new strands form complementary to the original strands, resulting in two identical molecules, each containing one original and one new strand.
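The pairing rule described above can be sketched in a few lines of Python. This is only an illustrative model, not a description of the replication machinery: the function names are hypothetical, a strand is represented as a plain string written 5' to 3', and the sketch assumes the standard antiparallel arrangement of the two strands, which is why the paired bases are read in reverse order.

```python
# Watson-Crick pairing as stated in the text: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'.

    The two strands are antiparallel, so the paired bases are read
    in reverse order.
    """
    return "".join(PAIR[base] for base in reversed(strand_5_to_3.upper()))


def replicate(duplex):
    """Model replication: each parental strand templates a new partner."""
    top, bottom = duplex
    daughter_1 = (top, complementary_strand(top))        # old top + new bottom
    daughter_2 = (complementary_strand(bottom), bottom)  # new top + old bottom
    return daughter_1, daughter_2


parent = ("ATGCGT", complementary_strand("ATGCGT"))
daughter_1, daughter_2 = replicate(parent)
print(parent, daughter_1, daughter_2)
```

Because each base has exactly one partner, both daughter duplexes come out identical to the parent, which is the point made in the text about the specificity of base pairing.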

Figure 4-2: The common purine and pyrimidine bases found in DNA. Uracil is substituted for thymine in RNA. (From Mares A Jr, Towbin J, Bies RG, Roberts R. Molecular biology for the cardiologist. Curr Probl Cardiol 1992; 17:9-72. Reproduced with permission from the publisher and authors.)

Figure 4-3: DNA replication conserves the nucleotide sequence. DNA is a double-stranded helical molecule held together by pairing between the nucleotide bases on the two strands. During cell division, two identical copies of the original parental molecule are made by unwinding the DNA and then synthesizing a complementary second strand on each parental strand, giving two identical daughter molecules.

Transcription (from DNA to RNA)

The central dogma of molecular biology is that DNA produces RNA, which in turn produces polypeptides, the molecules that make up the proteins that provide the cell structure and perform the functions of the cell (Fig. 4-4). The genetic information inherited by each individual is encoded by the sequence of the bases of the DNA (the genotype), which is expressed as proteins that provide the observable characteristics of the individual (the phenotype). This overall process from DNA to protein, however, must first go through the intermediary step of RNA. The process whereby mRNA is synthesized using DNA as the template is referred to as transcription (Fig. 4-5). Transcription and the processing of mRNA occur in the nucleus of the cell, separated by the nuclear membrane from the cytoplasm of the cell. The process of transcription is initiated by attachment of the enzyme RNA polymerase II to specific recognition sites on the double-stranded DNA; on activation by the enzyme, the strands selectively unwind and separate (Fig. 4-6). The binding site of RNA polymerase II is always located at the 5' end of the gene, and the enzyme remains attached to a single strand of DNA as it travels in the 3' direction. The DNA immediately in front of it separates into two strands, with just one strand of DNA (the antisense strand) acting as a template for the synthesis of mRNA. Thus, in contrast to DNA, mRNA is a single-stranded polynucleotide. Messenger RNA also differs from DNA in that deoxyribose, the sugar found in DNA, is replaced by ribose. Moreover, uracil (U) replaces thymine (T), and like thymine, uracil pairs exclusively with adenine (A). Thus, by this mechanism, each adenine (A) of DNA pairs with uracil (U) of RNA, each cytosine (C) of DNA pairs with guanine (G) of RNA, each thymine (T) of DNA pairs with adenine (A) of RNA, and each guanine (G) of DNA pairs with cytosine (C) of RNA.
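The DNA-to-RNA pairing rules just listed can be written as a small lookup table. The sketch below is illustrative only; the function names are hypothetical, and the template (antisense) strand is assumed to be supplied in the 3' to 5' orientation so that the resulting mRNA reads 5' to 3'.

```python
# DNA base on the template strand -> RNA base placed opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the mRNA (5' to 3') from the antisense template read 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


def sense_to_mrna(sense_5_to_3):
    """The mRNA matches the sense strand except that U replaces T."""
    return sense_5_to_3.upper().replace("T", "U")


sense = "ATGGCC"      # 5' to 3'
template = "TACCGG"   # 3' to 5', base-paired with the sense strand
mrna = transcribe(template)
print(mrna)
```

The two functions give the same answer for a matched pair of strands, which restates the point that the mRNA is complementary to the antisense strand and identical in sequence to the sense strand (with U in place of T).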

Figure 4-4: Central dogma of molecular biology.

Figure 4-5: Schematic localization of the processes of transcription and translation.

Figure 4-6: Illustration of how RNA polymerase II interacts with DNA and the promoter to generate a single-stranded mRNA. RNA polymerase II attaches to the initiation site promoted by the 5' promoter sequence. mRNA is synthesized in the 5' to 3' direction from just one strand, the antisense strand. The specificity of base pairing between mRNA and the antisense strand provides for an mRNA with sequences complementary to that of the antisense strand and identical to that of the sense strand.

The mRNA, as transcribed from the DNA, is referred to as the primary transcript, or sometimes as immature mRNA, and is a complementary copy of the DNA (Fig. 4-7). Since protein synthesis occurs in the cytoplasm, the mRNA must exit the nucleus, but prior to transport, it undergoes extensive posttranscriptional processing through three main events: (1) addition of a methylated guanosine (7-methylguanosine residue) to the 5' end, referred to as a cap, which is important for the initiation of translation; (2) addition of a long tail of repeated adenine nucleotides, called the poly(A) tail, to the 3' region of the mRNA, which is essential for stability of the message in the cytoplasm; and (3) splicing of the primary transcript, which contains introns and exons, whereby the introns are removed and the exons are joined together prior to exit from the nucleus as mature mRNA. The process of splicing is, in part, performed by molecules referred to as small nuclear ribonucleoproteins (snRNPs), which consist of RNA molecules tightly associated with a group of about 10 different proteins. Exons survive the mRNA processing and exit the nucleus (hence the name) as part of the mature mRNA. The mature mRNA consists of three distinct regions. The exons of the 5' end are not translated into protein but signal the beginning of mRNA translation and contain sequences that direct the mRNA to the ribosome in the cytoplasm for protein synthesis. The exons in the second region, referred to as the coding region, contain the information that determines the amino acid sequence of the protein. The exons of the 3' end do not code for protein but carry signals that terminate translation and direct the addition of the poly(A) tail. Introns are portions of the gene that are included in the primary mRNA transcript but are spliced out of the mature mRNA.
The process of splicing out introns and rejoining exons is an important means of introducing genetic diversity, since one primary transcript may yield several different mature mRNAs that code for different polypeptides (this will be discussed further under gene regulation). The primary transcript undergoes extensive shortening, such that the mature mRNA often represents only 10 percent of the primary transcript. The mature mRNA exits the nucleus through nuclear pores, enters the cytoplasm, and attaches to a ribosome to initiate protein synthesis.
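Splicing can be pictured as keeping some segments of the primary transcript and discarding the rest. The following sketch is a toy model: the segment names and sequences are invented for illustration (introns are written in lowercase only to make them easy to spot), and nothing here represents how snRNPs actually recognize splice sites.

```python
# A made-up primary transcript, stored as ordered (name, sequence) segments.
PRIMARY_TRANSCRIPT = [
    ("exon1", "AUGGCC"),
    ("intron1", "guaaguuuucag"),
    ("exon2", "GAAUUC"),
    ("intron2", "guaagucccuag"),
    ("exon3", "UGGUAA"),
]


def splice(segments, keep):
    """Join, in their original order, only the segments named in `keep`."""
    return "".join(seq for name, seq in segments if name in keep)


primary = "".join(seq for _, seq in PRIMARY_TRANSCRIPT)

# Constitutive splicing: every exon retained, every intron removed.
constitutive = splice(PRIMARY_TRANSCRIPT, {"exon1", "exon2", "exon3"})

# Alternative splicing: exon2 is skipped, giving a different mature mRNA.
alternative = splice(PRIMARY_TRANSCRIPT, {"exon1", "exon3"})

fraction_retained = len(constitutive) / len(primary)
print(constitutive, alternative, round(fraction_retained, 2))
```

The same primary transcript yields two different mature messages depending on which exons are kept, and the mature mRNA is shorter than the primary transcript in both cases.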

Figure 4-7: Transcription. Transcription occurs in the nucleus, producing mRNA that is processed into mature mRNA and transported to the cytoplasm. In the cytoplasm, translation occurs, with the mRNA coding for specific amino acids that are linked together to form a polypeptide and ultimately to form a mature protein. (From Mares A Jr, Towbin J, Bies RG, Roberts R. Molecular biology for the cardiologist. Curr Probl Cardiol 1992; 17:9-72. Reproduced with permission from the publisher and authors.)

Translation

The final process, whereby the nucleotide sequence of the mRNA specifies a particular polypeptide, is referred to as translation. This process is the most complex of the various processes that occur in the flow from genomic DNA (gene) to mature protein. The alphabet of the DNA or its single-stranded complementary mRNA is that of the four nucleotides (bases), whereas that of the protein is the 20 amino acids. Crick, in 1961, while trying to determine the code for translation from DNA to protein, showed that the genetic code was written in triplets of bases, with each amino acid being encoded by three bases referred to as a codon and specific amino acids determined by the sequence of the codon. The mRNA codons dictate which amino acids are to be selected, and the order of the codons dictates the sequence of the amino acids in the protein. Determination of the codons for each amino acid was completed in 1966. There are four different nucleotides to form the triplets; thus the number of combinations (4^3) is 64, but there are only 20 amino acids. There is considerable redundancy, referred to as degeneracy, and this results in most of the amino acids having more than one codon. In addition to codons for each amino acid, there is also the codon AUG, which is the start codon that initiates protein synthesis and also codes for methionine. To stop translation, there are three codons, UAA, UAG, and UGA, that signal the end of a particular polypeptide. Translation into protein requires two other RNA species, ribosomal RNA (rRNA) and transfer RNA (tRNA). The mRNA, after exiting the nucleus, associates with the ribosome, which is the site of protein synthesis. The ribosome moves along an mRNA molecule, translating each of its codons in a 5' to 3' direction to assemble the polypeptide from its amino (N-terminal) to its carboxy (C-terminal) end (Fig. 4-8).
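The arithmetic of the code (four bases taken three at a time, 64 codons for 20 amino acids) and the reading of a message from start codon to stop codon can be checked with a short sketch. The table below is the standard genetic code written with one-letter amino acid abbreviations, with `*` marking a stop codon; the abbreviations and the `translate` function are additions for illustration, and the function simply starts at the first AUG it finds, which is a simplification of real initiation.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, ordered by first, second, then third base (U, C, A, G).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate 5' to 3' from the first AUG until a stop codon is reached."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # UAA, UAG, or UGA ends the polypeptide
        peptide.append(amino_acid)
    return "".join(peptide)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
n_amino_acids = len(set(AMINO_ACIDS) - {"*"})
print(len(CODON_TABLE), n_amino_acids, stop_codons)
print(translate("GGAUGGCCGAAUGGUAAGC"))
```

Counting the entries confirms the degeneracy described in the text: 61 codons are shared among 20 amino acids, and the remaining 3 are stop signals.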

The mRNA does not interact directly with amino acids but rather through adaptor molecules, referred to as transfer RNA (tRNA), to which amino acids are covalently joined by a highly specific enzyme (aminoacyl tRNA synthetase) using ATP. There is at least one tRNA species corresponding to each of the 20 naturally occurring amino acids. The aminoacyl tRNA synthetase performs the special function of activating the amino acids and ensuring that each amino acid is joined to its tRNA and to no other. The structure of tRNA is now known in great detail, and its specificity is attributed to a sequence of three nucleotides, complementary to the codon, that is exposed at one end of the folded tRNA molecule and is referred to as the anticodon. The amino acid acceptor site is exposed at the other end. Amino acids thus are specified at two recognition steps: one in which a specific enzyme joins the amino acid to a specific tRNA and the other in which the tRNA, serving as an adaptor molecule, joins the amino acid to the ribosome-mRNA complex through a specific codon-anticodon base-pairing interaction between the mRNA and the tRNA. Once the process of protein synthesis is initiated, the ribosome moves along the mRNA, joining the amino acids via peptide bonds in the sequence specified by the mRNA to form the mature polypeptide. The process of protein synthesis from this complex of mRNA and ribosome involves over 100 enzymes. The steps involved consist of initiation, elongation, and termination of the polypeptide, with each process having its own enzymes.
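The two recognition steps can be modeled as two separate lookups. This is a deliberately simplified sketch: the three charged tRNAs listed are just examples, the names are hypothetical, and strict Watson-Crick pairing is assumed at all three positions (real tRNAs tolerate some mismatch at the third position, which is ignored here).

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Anticodon (written 5' to 3') that pairs with a codon (written 5' to 3')."""
    return "".join(RNA_PAIR[base] for base in reversed(codon.upper()))


# Step 1: the synthetase attaches one amino acid to one tRNA,
# identified here by its anticodon.
CHARGED_TRNAS = {"CAU": "Met", "GGC": "Ala", "UUC": "Glu"}


def decode(codons):
    """Step 2: each codon selects the charged tRNA with the matching anticodon."""
    return [CHARGED_TRNAS[anticodon_for(codon)] for codon in codons]


print(anticodon_for("AUG"))
print(decode(["AUG", "GCC", "GAA"]))
```

Note that the amino acid never appears in the pairing step itself; it is selected only because the synthetase attached it to the right tRNA beforehand.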

The mature polypeptide consists of amino acids joined together by peptide bonds; the mature protein, however, often consists of multiple covalently bound polypeptides, and many undergo other modifications referred to as posttranslational changes. A more detailed analysis of protein synthesis is given in Chap. 5. Encoded in the polypeptide are other features that have been determined by the mRNA, namely, leader sequences that direct the protein to intracellular membranes, the plasma membrane, or organelles such as the mitochondria. There is also considerable proteolytic activity following entry of the molecule into its organelle or membrane, as the leader sequences are removed. There are also the processes whereby disulfide bonds are formed or glycosylation occurs (in the Golgi apparatus) (see Fig. 4-8). The mRNAs generally are not long-lived because of their rapid degradation by RNases and so may last from only a few minutes to many hours. A single mRNA may code for only a few copies of the polypeptide or several thousand. The average estimate is 1400. In contrast, rRNAs and tRNAs are much less rapidly degraded and therefore have acquired the name stable RNAs. Their relative concentration in the cell, in large part, reflects their stability, with more than 80 percent being rRNAs, 15 percent being tRNAs, and less than 5 percent being mRNAs.

Gene Structure, Expression, and Regulation

The concept that one gene leads to one protein remains basic to the central dogma of molecular biology but does, in some cases, need to be modified slightly in view of recent observations. In the classic sense, a gene consists of a discrete unit of DNA that encodes a specific polypeptide. Two observations must be noted. First, genes have two kinds of end product: ribonucleic acid (RNA) and protein. The RNA end products, namely rRNA, tRNA, and small nuclear RNA (snRNA), do not get translated into protein but rather perform functions during posttranscriptional processing and translation that are pivotal to expression of the mRNA that does code for protein. The polymerases necessary for transcription of these genes are of three types: polymerase I for rRNA, polymerase II for mRNA, and polymerase III for tRNA and some other small RNAs. Second, in part because of snRNA and certain proteins, alternative splicing of the exons in the primary mRNA can lead to different mature mRNAs that each code for a slightly different polypeptide. These generally are isoforms of the same protein, however, such as the multiple forms of tropomyosin from the same gene. The genes that do encode proteins do so only through mRNA. The following discussion will address the regulation of those genes which encode proteins.

A protein-coding gene is composed of introns and exons. The average exon is about 300 base pairs long, whereas introns are much larger and are spliced out of the mature mRNA and, thus, do not code for protein. A typical mRNA has three regions: the 5' untranslated region, which contains the cis-acting sequences that regulate translation; the central portion, referred to as the coding region, which codes for protein; and the 3' untranslated end, which also has regulatory sequences and signals for stability of the mature mRNA. The first nucleotide to be transcribed is given the number +1, and everything 5' to it is referred to as upstream or proximal and is numbered negatively, beginning with -1 for the first base pair upstream. The initiation site for transcription is always upstream from the 5' untranslated region. The 5' regulatory untranslated region has variable sequences, but there are several consistent sequences present in the same position in most human genes. Polymerase II by itself has no affinity for DNA and can bind only after several transcription factors have bound. The site of transcription and its direction are determined by a TATA box, which has a consensus sequence of TATA(A/T)A(A/T) and is found 25 to 30 base pairs upstream from the start site (positions -25 to -30). A large complex of transcription factors (more than 25 proteins) binds to the TATA box in preparation for RNA polymerase II binding and transcription. Collectively, these are referred to as transcription factors for polymerase II (TFII), with letters designating the different factors. TFIID binds first, then TFIIB, followed by RNA polymerase II, followed by several other TFII factors such as E, F, G, H, and J. TFIIH has kinase activity and phosphorylates RNA polymerase II, which now, independent of transcription factors, can initiate transcription. In addition, in many human genes, located at about base pair -200 upstream is the GGGCG box to which SP1 binds, and this is thought to be a regulator of housekeeping genes (Fig. 4-9).
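The numbering convention and the TATA consensus lend themselves to a small search routine. The sketch below is an assumption-laden illustration: it reads the consensus as TATA, then A or T, then A, then A or T; the function name and the test sequence are invented; and the input is taken to be the sense strand written 5' to 3' with its last base at position -1.

```python
import re

# Consensus read as: TATA, (A or T), A, (A or T). This reading is an assumption.
TATA_BOX = re.compile(r"TATA[AT]A[AT]")


def find_tata_boxes(upstream):
    """Locate TATA-like motifs in the sequence upstream of the start site.

    `upstream` is written 5' to 3' and its last base is position -1.
    Returns (first_position, last_position, motif) for each match.
    """
    upstream = upstream.upper()
    n = len(upstream)
    return [
        (m.start() - n, m.end() - 1 - n, m.group())
        for m in TATA_BOX.finditer(upstream)
    ]


# Invented sequence: 20 bases, a TATA box, then 24 bases before position +1.
upstream = "GC" * 10 + "TATAAAA" + "G" * 24
hits = find_tata_boxes(upstream)
print(hits)
```

In this made-up sequence the motif spans positions -31 to -25, close to the location given in the text for the TATA box.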

Figure 4-9: Structure of a gene. These small functional units within the nucleus contain the coding information for the synthesis of a polypeptide and on their 5' ends have regulatory sequences that include silencers, enhancers, and promoters. The coding region, consisting of exons (which code for protein) as well as intervening noncoding sequences (introns), is followed by a 3' noncoding region that is transcribed into the mRNA. The 3' end appears important for exit of the mRNA from the nucleus and its stability in the cytoplasm but does not code for protein. The TATA is the initiation site for polymerase and is present in most eukaryotes at about 10 to 30 base pairs 5' from the start codon (TAC) of the coding region. The AATAAA will become the recognition site on the mRNA to which attaches an enzyme that cleaves the 3' region and replaces the distal portion with a poly(A) tail. (From Mares A Jr, Towbin J, Bies RG, Roberts R. Molecular biology for the cardiologist. Curr Probl Cardiol 1992; 17:9-72. Reproduced with permission from the publisher and authors.)

Gene expression refers to all the processes required to go from DNA to protein, from the initial unfolding of the nuclear chromatin in preparation for transcription to the emergence of the mature protein following completion of posttranslational changes. Regulation of this process occurs at all levels in response to signals both from within the cell and from the environment. The latter mechanism is of particular interest because it represents one of the major areas of research in molecular biology and cardiology, and it is also an area that has great potential for therapeutic intervention. The cell maintains its integrity and responds to external stimuli through signals that activate receptors (generally in the cell membrane). These in turn use signaling proteins to transfer their message to the cytoplasm or nucleus, which in some way modifies gene expression. Delineation of the receptor, the signaling proteins, and where and how gene expression is altered is of prime importance.

The most fundamental level of gene regulation involves cell differentiation (discussed later). The body contains at least 200 different types of cells that have been programmed by their genes to perform highly specialized functions. All cells have the same DNA and the same genes, but only those genes which are expressed determine the cell's phenotype. Cardiac myocytes, for example, are characterized by a set of proteins that specialize in contractile activity, whereas hepatocytes specialize in the synthesis and catabolism of proteins. Selective gene expression is the basis of cell differentiation. Cell growth and replication occur in what is termed the undifferentiated cell, which, through complex mechanisms, gives rise to cells that cease to replicate and are programmed to take on specialized functions (cell differentiation). In the process of cell differentiation, genes concerned with cell proliferation and undifferentiated functions in particular are down-regulated, whereas those genes coding for the proteins that perform the specialized functions are up-regulated. Once cells are differentiated, however, protein synthesis remains a dynamic process to maintain cell integrity. Most of gene regulation is concerned with the maintenance of cellular integrity, and the genes responsible for this basal function are referred to as housekeeping genes. Housekeeping genes are constitutively regulated, as opposed to genes responsible for cell differentiation and growth, which are developmentally regulated. It is estimated that organs use about 10,000 (constitutive) genes to maintain their integrity, with one exception, the brain, which is estimated to use around 20,000 genes. Gene regulation may be classified under the following headings: pretranscription, transcription, posttranscription, translation, and posttranslation.28

Pretranscriptional regulation refers to the decompaction of the DNA and exposure of the region about to undergo transcription. The total DNA of a single cell would measure about 1 m in length, yet in the nucleus it is markedly compacted and is folded around specific proteins, the dominant class being the histones. The coiling of the DNA appears to be in domains that can be exposed when transcription is activated. It is also at this level that methylation plays a part. Heavily methylated genes, which are insensitive to digestion by the enzyme DNase, tend not to be transcribed, whereas other areas that are sensitive to digestion appear to be very active in transcription. The precise mechanisms involved in chromatin conformational changes or exposure of the gene for transcription are, at present, relatively unknown. There is evidence, however, that methylation is involved in regulating cell differentiation.

Transcriptional control is a major rate-limiting step in gene expression. While transcription is catalyzed by the enzyme RNA polymerase II, the enzyme by itself cannot initiate transcription and acts only with the help of additional transcription factors. In addition to the promoter sequences previously described (the TATA box and the GGGCG box), several DNA sequences in conjunction with their DNA-binding proteins act as promoters, enhancers, or silencers of transcription, as defined below (see Fig. 4-9). The 5' upstream region, immediately adjacent to the transcription initiation site and including the area that binds RNA polymerase II, is referred to as the promoter region. This region contains sequences that are specific binding sites for proteins referred to as trans-acting factors, or transcription factors. The protein-binding sites are often referred to as cis-acting sequences because they are on the same DNA molecule on which they act. The transcription factors (also referred to as DNA-binding proteins) are referred to as trans-acting factors (acting at a distance) because they are encoded by genes that may even be on another chromosome. The average promoter binding site consists of several hundred base pairs grouped into motifs of 4 to 10 base pairs.29 It is hypothesized that all the motifs have to be bound by transcription factors of the appropriate nature and in the appropriate sequence for transcription to occur.

The promoter sequences and their corresponding DNA-binding proteins may act ubiquitously or may be tissue-specific. Promoters often increase transcription of a class of genes rather than a single gene. Another type of DNA sequence that increases transcription is referred to as an enhancer (see Fig. 4-9). Enhancers differ from promoter sequences in that they may be upstream or downstream from the coding region, may be separated from it by as many as hundreds of thousands of base pairs, and are effective in either the 5' to 3' or the 3' to 5' orientation. An extreme example is the DNA sequence that enhances expression of the gene for hemoglobin, which is located more than 1 million base pairs from the transcription initiation site. Enhancers, like promoters, consist of several small motifs of 4 to 10 base pairs, and when bound by their corresponding DNA-binding proteins (transcription factors), they have a positive influence on gene transcription. Another regulatory DNA sequence that is similar to enhancers in size and location but exerts a negative influence on transcription is referred to as a silencer or repressor. It is believed that enhancer and silencer sequences, when bound by transcription factors, communicate with promoters by DNA looping that is induced by the binding. This DNA looping, which brings the enhancer, silencer, and promoter into close proximity, is the mechanism responsible for the action-at-a-distance phenomenon seen in human gene regulation.

The genes that encode proteins regulating cardiac growth are many: growth factors, growth factor receptors, intracellular signaling proteins that relay growth signals from the extracellular milieu, and ultimately, transcription factors that regulate RNA polymerase and selectively induce or down-regulate gene expression.30 Several classes of DNA-binding proteins (transcription factors) are recognized (Fig. 4-10), including the zinc-finger, leucine-zipper, helix-loop-helix, MADS domain, and helix-turn-helix proteins. The zinc-finger type of protein is used by developmental genes called GATA factors and by the receptors for circulating hormones, including the glucocorticoids, progesterones, androgens, mineralocorticoids, estrogen, thyroxine, vitamin D3, and retinoic acid. These hormones, which are lipophilic, penetrate the cell membrane and activate an intracellular or nuclear receptor, which, in turn, activates gene expression through the zinc-finger transcription proteins. Many of the growth-related signaling proteins, such as c-fos, jun-B, and c-jun, dimerize through leucine-zipper domains prior to binding to DNA. For example, c-fos dimerizes with c-jun and subsequently binds to DNA.31 Transcription factors such as those of the myo-D gene family, which are the master genes for inducing differentiation of skeletal muscle, contain a helix-loop-helix motif. The MADS domain proteins include myocyte enhancer factor 2 (MEF2) and the serum response factor (SRF). The helix-turn-helix proteins include homeodomain-containing proteins that are important in the development of prokaryotes and eukaryotes.

Another level at which gene expression may be regulated is that of mRNA processing, whereby the introns are removed and the exons spliced together to provide the mature mRNA. In the majority of instances, each exon present in the gene is incorporated into a mature mRNA via ligation of consecutive pairs of exons and removal of all introns. This constitutive splicing process produces a single gene product from each transcriptional unit, even when the coding sequence is split into many separated exons. In other instances, however, nonconsecutive exons are joined in the processing of some gene transcripts, and this alternative pattern of primary mRNA splicing can exclude individual exons from the mature mRNA in some transcripts and include them in others. The use of such differential splicing patterns creates mRNAs that generate a variety of proteins from a single gene. Differential splicing is particularly prevalent in genes of muscle and has been shown to occur in three of the eight major sarcomeric proteins studied thus far: myosin heavy chains, tropomyosin, and troponin T (skeletal and cardiac).

The 3' non-protein-coding region of the mature mRNA contains the poly(A) tail, which is essential for message stability. It is believed that protein synthesis is, in part, regulated on the basis of alterations in message stability. The precise mechanism whereby an mRNA is induced to remain stable and encode several thousand polypeptides as opposed to being extremely unstable and encoding only a few molecules is not well understood. Nevertheless, it is likely to be an important step in regulating the response to cytoplasmic signals that require rapid synthesis of a particular polypeptide. Synthesis of a polypeptide initiated via transcription is estimated to take several minutes, whereas synthesis of a protein initiated through translation requires only seconds. Regulation of gene expression also occurs at the translational and posttranslational levels. Proteins are often translated as precursors that must undergo proteolytic cleavage. Others must undergo cleavage of leader sequences that are attached to direct them to their particular subcellular compartment. Other posttranslational modifications include protein glycosylation, the addition of polysaccharides and lipids, and the formation of disulfide bonds. Finally, polypeptides often associate with similar or different polypeptides to form the complex higher-order structures that make up the mature proteins. The folding of polypeptides into mature proteins is guided by the products of a group of genes that encode so-called chaperone proteins. Regulation of gene expression at the protein synthesis level is more fully discussed in Chap. 5.

Molecular Biology and the Basis for Recombinant DNA Technology

Modern molecular biology, initiated in the 1970s,27,32 was in part due to four pivotal discoveries or inventions: restriction enzymes, reverse transcription, cloning, and DNA sequencing. Since DNA consists simply of four nucleotides joined together, it is a monotonous, repetitive molecule that, at first glance, offers no landmarks to recognize that a particular segment of DNA codes for a particular mRNA. The discovery of the restriction endonucleases provided the genetic scalpel to cut DNA into smaller pieces of predictable size that could be used in a variety of procedures. The unique feature of these enzymes is that each recognizes a specific DNA sequence of 4 to 8 base pairs and cleaves the molecule at that particular site. Thus one knows precisely where the enzyme cuts, and using a number of different enzymes, one can identify the position and number of recognition sites for each enzyme in a fragment of DNA of interest and develop what is referred to as a restriction map. These enzymes also made it possible to cut DNA from different sources in a predictable manner in preparation for ligating them together into a recombinant molecule. Restriction endonucleases are obtained from bacteria, and enzymes have been purified that recognize more than 100 different cleavage sites. A restriction endonuclease is named after the bacterium from which it was isolated, taking the first letter of the genus of the bacterium, the first two letters of the species, and the first letter of the strain. An example of this would be an enzyme from Haemophilus influenzae strain d referred to as HindIII. The III simply indicates that it was the third restriction endonuclease isolated from that particular strain.
Thus the availability of restriction endonucleases made it possible to digest DNA into smaller molecules that could be manipulated and used in a variety of reactions and to develop a restriction map as well as develop chimeric DNA molecules, the latter being the essence of recombinant DNA technology.
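The idea of locating recognition sites and predicting fragment sizes can be sketched in Python. HindIII recognizes AAGCTT and cuts after the first A; the 30-base test sequence and the function names find_sites and digest are hypothetical, and only one strand of a linear molecule is modeled.

```python
def find_sites(dna, site):
    """Return the 0-based start position of every occurrence of a recognition site."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def digest(dna, site, cut_offset):
    """Cut a linear DNA at each site; cut_offset is the cut position within the site."""
    cuts = [p + cut_offset for p in find_sites(dna, site)]
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]


# HindIII: recognition sequence AAGCTT, cut between the first and second base.
HINDIII = ("AAGCTT", 1)

# Hypothetical sequence containing two HindIII sites.
dna = "GGCCAAGCTTACGTACGTAAGCTTGGATCC"

sites = find_sites(dna, HINDIII[0])   # positions that would appear on a restriction map
fragments = digest(dna, *HINDIII)
sizes = [len(f) for f in fragments]   # fragment lengths that would be seen on a gel
```

Repeating the digest with several enzymes and comparing the resulting fragment sizes is, in outline, how a restriction map is assembled.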

The discovery that retroviruses contain an enzyme that catalyzes the formation of DNA from RNA, referred to as reverse transcriptase, revolutionized molecular biology. The resulting so-called complementary DNA (cDNA) (represented by the appropriate complementary bases for the mRNA, except, of course, with thymine replacing uracil) binds to the nucleotide sequences from which the particular mRNA was originally derived (Fig. 4-11). Messenger RNA, as discussed previously, codes for a specific polypeptide and is derived from a discrete, specific unit of DNA referred to as a gene. Reverse transcriptase reverses this process so that a cDNA is generated from an mRNA (the coding part of the gene) and can be used as a gene to express the protein. The cDNA is inserted into the genome of a vector (virus or plasmid) and subsequently replicated in an appropriate host, such as a bacterium, which made possible the first cloning of a gene. Radioactive labeling of a cDNA provides an extraordinarily powerful tool to develop known chromosomal landmarks and to isolate and identify particular genes. The labeled cDNA, referred to as a probe, or indicator molecule, is a routine, essential tool used to identify and isolate DNA or RNA fragments of interest. Development of rapid-sequencing techniques made it possible to sequence several thousand bases per day. It is expected that a by-product of the Human Genome Project will be technology to sequence millions of bases per day.

Figure 4-11: Generation of a complementary DNA (cDNA). Taking advantage of the enzyme reverse transcriptase, mRNA is converted to DNA, referred to as complementary DNA (cDNA). The DNA is single-stranded and complementary to the sequence of RNA, except thymine now replaces uracil. Using DNA polymerase, one can then make the single-stranded DNA into double-stranded cDNA. The cDNA can be used as a probe to identify specific sequences or genes of the genomic DNA, or it can be inserted into vectors to be cloned or expressed in a variety of hosts.
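The two steps shown in Fig. 4-11 can be mimicked at the sequence level with a short Python sketch. The message and the function names are hypothetical; the code models only base pairing, not the enzymology.

```python
# Base pairing used when an RNA template is copied into DNA.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
# Base pairing between two DNA strands.
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna):
    """First-strand cDNA (written 5' to 3') for an mRNA written 5' to 3'."""
    return "".join(RNA_TO_DNA[base] for base in reversed(mrna))


def second_strand(first_strand):
    """Strand that DNA polymerase would build on the first strand (5' to 3')."""
    return "".join(DNA_PAIR[base] for base in reversed(first_strand))


mrna = "AUGGCUUAA"                  # hypothetical message
first = reverse_transcribe(mrna)    # single-stranded cDNA, complementary to the mRNA
second = second_strand(first)       # reads like the mRNA, with T in place of U
```

The second strand carries the same sequence as the original message with thymine substituted for uracil, which is why double-stranded cDNA can stand in for the coding part of the gene.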

Two features essential to all techniques of recombinant DNA technology need to be highlighted. The first is the ability of DNA to denature and anneal, or hybridize. Double-stranded DNA, held together by hydrogen bonding of the corresponding complementary bases, will, on exposure to high temperatures (95°C), separate into two strands, but under appropriate conditions (55°C), the complementary strands will again anneal precisely as originally and return to their normal double-stranded state. The process of separating into single strands is referred to as denaturation, and the recombining process is known as annealing, or hybridization, with the latter term preferred if the two DNA fragments are from different sources. Second, the strands come together identically to the parent molecule because of complementary base pairing, whereby A must bind to T and C to G.
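Complementary base pairing is what makes annealing specific, and that rule is easy to state in code. The following Python sketch checks whether two single strands, both written 5' to 3', could re-form a perfect duplex; the sequences and function names are hypothetical, and temperature, salt, and partial mismatches are deliberately ignored.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Sequence of the partner strand, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(strand))


def can_anneal(strand_a, strand_b):
    """True if the two strands are perfectly complementary over their full length."""
    return strand_b == reverse_complement(strand_a)


strand = "ATGCCG"                      # hypothetical single strand after denaturation
partner = reverse_complement(strand)   # the strand it separated from
```

A single wrong base in the partner is enough for can_anneal to return False in this simplified model; real hybridization tolerates some mismatch depending on the conditions used.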

Unique Features of Recombinant DNA Technology

The techniques of recombinant DNA are unique and are not limited by some of the restrictions imposed on other scientific techniques.3 Some of these are the abilities (1) to perform the structure-function analysis of a selected molecule or a portion thereof in the intact living cell or organism, (2) to isolate and identify genes responsible for hereditary diseases, (3) to unravel the molecular basis for the regulation of growth (including the heart), and (4) to generate large quantities of protein present only in trace amounts that otherwise would not be available, as well as the opportunity to genetically engineer proteins for maximum benefit with the least side effects. The techniques routinely used in molecular biology consist of electrophoresis, Southern and Northern blotting, DNA cloning, polymerase chain reaction (PCR), electrophoretic mobility shift assay, and the development of gene libraries. Techniques related to vessel wall biology and gene transfer are discussed in Chap. 8.

Isolation of DNA

Since the DNA of all human tissues is the same, practically any tissue can be used to obtain a DNA sample. Most procedures require only about a microgram. In humans, lymphocytes are commonly used because they are very accessible and the DNA can be extracted easily. Lymphocytes are also used because they can be transformed by Epstein-Barr virus into an immortal cell line that can provide a continuous, renewable source of DNA. The cells can be grown in culture, frozen for years (from which samples can be obtained), thawed, and regrown, providing a renewable source of DNA for several decades. A sample of 10 to 15 mL of whole blood typically would yield about 50 to 100 µg of genomic DNA. If one's interest is restricted to the DNA sequences that are expressed, one would isolate mRNA and, using it as a template, employ reverse transcriptase to derive its cDNA. cDNA molecules represent the expressed form of a gene and thus can be used as probes to select the specific genomic DNA segments from which the mRNA was transcribed. Myocardial biopsies obtained under appropriate conditions provide adequate tissue for most DNA or RNA analyses.

Digestion and Electrophoretic Separation of DNA

One of the important physical properties of the DNA molecule is that each individual nucleotide possesses a net negative charge resulting from the phosphate group. Thus fragments of different sizes exposed to an electric field tend to migrate toward the positive electrode at differential rates depending on their size, with small fragments migrating faster than larger ones. This process of separation in an electric field is called electrophoresis.33 The DNA sample, after being digested into fragments of different size by a restriction endonuclease, is added to a gel matrix such as agarose or acrylamide. After separation by electrophoresis, the pattern of the DNA can be visualized under an ultraviolet lamp with a fluorescent dye such as ethidium bromide (Fig. 4-12). Agarose gel electrophoresis will separate fragments from 1000 to 60,000 base pairs (60 kb) in size, and polyacrylamide gels effectively separate fragments smaller than 1000 base pairs (1 kb). The recent development of pulsed-field gel electrophoresis (PFGE) made possible the separation of DNA fragments even up to 2000 kb in size. In this technique, the electric field is alternated in different directions, forcing the molecules of DNA to reorient between each pulse of electric current. Thus this technique is particularly suitable for isolating and characterizing large segments of DNA, such as to identify a known gene.
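The size ranges quoted above can be turned into a simple lookup. The Python sketch below uses only the thresholds given in the text; the function names are hypothetical, and in practice the ranges overlap and depend on gel concentration and running conditions.

```python
def choose_gel(size_bp):
    """Pick a separation method from the fragment size in base pairs."""
    if size_bp <= 0:
        raise ValueError("fragment size must be positive")
    if size_bp < 1_000:
        return "polyacrylamide"      # fragments smaller than 1 kb
    if size_bp <= 60_000:
        return "agarose"             # 1 kb to 60 kb
    if size_bp <= 2_000_000:
        return "pulsed-field"        # up to about 2000 kb
    raise ValueError("fragment larger than the ranges quoted in the text")


def migration_order(sizes_bp):
    """Order fragments from farthest migrated (smallest) to nearest the well (largest)."""
    return sorted(sizes_bp)
```

For example, choose_gel(500) selects polyacrylamide, whereas a 500-kb fragment falls in the pulsed-field range.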

Southern Blotting Technique

Figure 4-12: Southern blotting technique. The DNA is cleaved with an appropriately selected restriction endonuclease. The digested fragments are separated by electrophoresis on agarose gel, and the fragments of gene A are located at positions 1, 2, and 3 but cannot be seen against the background of many other randomly occurring DNA fragments. The DNA is denatured and transferred to a membrane in a pattern identical to that on the agarose gel. It is difficult to manipulate anything on a soft gel or to remove it. Once transferred to the membrane (filter), a solid support system, the DNA is much easier to handle. A DNA probe (cDNA) that has been labeled with 32P is hybridized to its complementary DNA and visualized after exposure of the nylon membrane to x-ray film (autoradiography). The transfer of the DNA from the gel to the membrane developed by Southern was a major innovation illustrated in the next figure. (From Mares A Jr, Towbin J, Bies RG, Roberts R. Molecular biology for the cardiologist. Curr Probl Cardiol 1992; 17:9-72. Reproduced with permission from the publisher and authors.)

As noted previously, prior to electrophoresis, the DNA must be digested with one of the restriction endonucleases. The size of the fragments resulting from digestion will depend on the type of restriction endonuclease used, i.e., whether it recognizes a sequence of 4, 5, 6, or 8 base pairs. An enzyme recognizing a 4-base-pair sequence will cut the DNA into much smaller fragments than one that recognizes an 8-base-pair sequence.
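The reason a shorter recognition sequence gives smaller fragments is a matter of probability, which a few lines of Python make concrete. The sketch assumes that the four bases occur with equal frequency and independently of one another, which is only an approximation for real genomes; the function names are hypothetical.

```python
def expected_fragment_length(site_length):
    """Average spacing between sites for an n-base recognition sequence.

    With four equally likely bases, a given n-base sequence is expected
    about once every 4**n base pairs.
    """
    return 4 ** site_length


def expected_site_count(genome_bp, site_length):
    """Rough number of recognition sites expected in a genome of the given size."""
    return genome_bp / expected_fragment_length(site_length)


# Average fragment lengths for 4-, 6-, and 8-base-pair recognition sequences.
averages = {n: expected_fragment_length(n) for n in (4, 6, 8)}
```

Under this assumption a 4-base-pair cutter yields fragments averaging 256 base pairs, a 6-base-pair cutter 4096, and an 8-base-pair cutter 65,536.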

Development of a DNA Probe

A nucleic acid probe is a fragment of nucleic acid to which has been attached a label such as a radioisotope or a fluorescent compound, making it possible to easily detect and recognize the desired fragment among other native DNA molecules. The fragment labeled is usually cDNA or a synthetic oligonucleotide, although it could be RNA. It is now possible to synthesize DNA fragments of up to 30 to 40 bases, referred to as oligonucleotides, that, with an attached label, can be used as probes to identify complementary sequences in the human genome or in mRNA. This takes advantage of the fact that at high temperatures, the double-stranded DNA probe and the native DNA will break into separate strands. On recombining at random, the labeled DNA probe can bind with either its original complementary strand or the native DNA that is complementary to the probe and thus provide a means of isolating a fragment of native genomic DNA. A probe is necessary in most recombinant DNA procedures to detect the molecule of interest following electrophoresis.

Southern, Northern, and Western Blotting

A procedure to separate and detect specific DNA fragments, referred to as Southern blotting, is named after E. M. Southern, who developed it in 1975.34 Genomic DNA is isolated and digested into small fragments with restriction enzymes, and the fragments are separated by gel electrophoresis as described previously. Following separation, DNA fragments are denatured chemically into single-strand fragments. It is very difficult to handle gels and even more impractical to store them. Southern developed a technique whereby these separated single-strand fragments in the gel could be transferred by capillary action to a solid support medium (nylon or nitrocellulose membrane) and fixed permanently by heating. The pattern on the membrane reflects identically the pattern induced by electrophoresis on the gel. The process used to produce a Southern blot is illustrated schematically in Fig. 4-12. The nylon membrane and its attached single-strand DNA fragments are then incubated with a radioactively labeled complementary probe. The hybridized, radioactive double-strand product, on exposure to x-ray film (autoradiography), will exhibit the pattern of the radiolabeled DNA fragments (Fig. 4-13). In summary, the electrophoretic separation of DNA followed by its transfer to a nylon membrane for subsequent identification by radioactive hybridization is referred to as Southern blotting, and the autoradiogram as a Southern blot. The same approach to detect mRNA is referred to as Northern blotting. This procedure also can be used for detection of proteins, in which case it is referred to as Western blotting (Table 4-1). The only significant difference in detecting protein versus nucleic acid by this procedure is the probe, which is an antibody rather than an oligonucleotide or cDNA. However, as in Southern and Northern blotting, the probe may be labeled with a radioactive isotope, a fluorescent tag, or some visual colorimetric substance.
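The logic of a Southern blot, stripped of the chemistry, is a search for the fragments that contain a sequence complementary to the probe. The Python sketch below is a toy model under stated assumptions: the fragments and the probe are hypothetical, each fragment is represented by one strand written 5' to 3', and hybridization is treated as exact matching.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Sequence of the partner strand, written 5' to 3'."""
    return "".join(PAIR[base] for base in reversed(strand))


def southern_blot(fragments, probe):
    """Sizes of the fragments a labeled probe would reveal, smallest first.

    Both strands of every denatured fragment are bound to the membrane, so
    the probe can hybridize to whichever strand carries its complement.
    """
    target = reverse_complement(probe)
    bands = []
    for top in sorted(fragments, key=len):   # order of migration on the gel
        bottom = reverse_complement(top)
        if target in top or target in bottom:
            bands.append(len(top))
    return bands


# Hypothetical restriction fragments and a hypothetical 5-base probe.
fragments = ["GGCCA", "AGCTTACGTACGTA", "AGCTTGGATCC"]
bands = southern_blot(fragments, "GGATC")
```

Only the fragment carrying the probe's complement appears as a band; the other fragments are on the membrane but remain invisible, as described for gene A in Fig. 4-12.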
