Authors: Robert Roberts, Richard Lifton

THE HUMAN GENOME
The term genome refers to all of the DNA, including the genes, that an organism carries. The term proteome refers to all of the proteins that an organism produces. The genes exert all of their influence through the proteins they produce. In general, the dogma still holds that each gene produces a unique protein, although it is preferable to refer to the end product as a polypeptide, since some proteins are made of two or more polypeptides and certain genes, through alternative splicing, may produce more than one polypeptide. The human genome is contained in 23 pairs of chromosomes. Twenty-two of these pairs are homologous chromosomes (one from the father and one from the mother), referred to as autosomes, and the remaining pair consists of the sex chromosomes: an X and a Y chromosome in the male and two X chromosomes in the female. Only a small portion of the X and Y chromosomes is homologous; this portion is referred to as the pseudoautosomal region. Each pair of homologous autosomes carries the same set of genes, with one copy inherited from each parent. Despite their homology and potentially identical function, some genes have a slightly different DNA sequence from that of the corresponding gene on the homologous partner, which may slightly or markedly alter their function. For example, the gene encoding angiotensin-converting enzyme (ACE) has two forms (alleles), I and D, which give three genotypes: DD, DI, and II. Thus, the chromosome from the mother may carry the D form and the homologous chromosome from the father the I form; nevertheless, both genes encode ACE, which converts angiotensin I to angiotensin II. However, increased plasma enzyme activity is associated with the D form, leading to an exaggeration of ACE function.
Studies suggest that individuals who are homozygous for the D allele (the DD genotype) are predisposed to develop cardiac hypertrophy.1,2 These minor differences give rise to an individual's distinguishing genetic features and in some instances predispose to disease.
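The relationship between alleles and genotypes in the ACE example can be sketched in a few lines of Python. This is only an illustration, assuming two alleles labeled I and D as in the text; the function name `genotypes` and the variable names are hypothetical, not part of any genetics library.

```python
from itertools import combinations_with_replacement

def genotypes(alleles):
    """Return every unordered pair of alleles, one allele per homologous chromosome."""
    pairs = combinations_with_replacement(sorted(alleles), 2)
    return ["".join(pair) for pair in pairs]

# Two ACE alleles (assumed labels) give three possible genotypes.
ace_genotypes = genotypes(["I", "D"])

# A genotype is homozygous when both chromosomes carry the same allele.
homozygous = [g for g in ace_genotypes if g[0] == g[1]]

print(ace_genotypes)
print(homozygous)
```

Treating the pair as unordered reflects the fact that DI and ID describe the same genotype, regardless of which parent contributed which allele.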
It is estimated that the DNA sequence differs among humans by about 0.1 percent, which means that 99.9 percent of the DNA sequence is identical. Even so, this amounts to a difference in over 3 million bases of the DNA sequence. Each chromosome is a long molecule made of DNA. DNA is made up of only four bases: adenine (A), guanine (G), cytosine (C), and thymine (T). If one visualizes a chromosome, it consists of repetitions of these four bases and is extremely monotonous. Nevertheless, the sequence of these four bases determines all of one's inherited characteristics. The average length of a chromosome is about 135,000,000 base pairs. The longest chromosome, chromosome 1, has over 250,000,000 base pairs. The smallest, chromosome 21, has only about 50,000,000 base pairs. The 23 chromosomes together contain a total of about 3 billion base pairs (Table 7-1). Genes themselves are discrete units with a start and stop point and vary in size from 10,000 to 2,000,000 base pairs. The estimated average is about 20,000 base pairs. Although the whole of the human genome has 3 billion base pairs, it is estimated that only about 3 percent is used to make genes.3 Genes themselves do not participate directly in specific functions but act through an intermediary, their single-stranded templates, referred to as messenger RNA (mRNA). The mRNA leaves the nucleus and goes to the ribosome in the cytoplasm, where it provides the template for protein synthesis. It is estimated that there are between 50,000 and 100,000 genes.3
Table 7-1: The Human Genome

Base pairs                           3 billion
Estimated number of genes            50,000-100,000
Percent of DNA contained in genes    <3%
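The round figures quoted in the text and in Table 7-1 can be checked with simple arithmetic. The sketch below assumes exactly 3 billion base pairs, 23 chromosomes, a 0.1 percent sequence difference, and 3 percent of the DNA in genes; the constant names are illustrative only.

```python
GENOME_BP = 3_000_000_000   # total base pairs, rounded as in Table 7-1
CHROMOSOMES = 23            # chromosomes in one haploid set
GENE_PERCENT = 3            # percent of DNA used to make genes (approximate)

# 0.1 percent is one base in every thousand.
differing_bases = GENOME_BP // 1000

# Average chromosome length if the genome were divided evenly.
average_chromosome_bp = GENOME_BP / CHROMOSOMES

# Base pairs that fall within genes under the 3 percent estimate.
gene_bases = GENOME_BP * GENE_PERCENT // 100

print(f"Bases differing between individuals: {differing_bases:,}")
print(f"Average chromosome length: {average_chromosome_bp:,.0f} bp")
print(f"Base pairs in genes: {gene_bases:,}")
```

With these rounded inputs the average chromosome length comes out near 130 million base pairs, the same order as the approximately 135 million quoted in the text; the small gap reflects the rounding of the genome size.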
The intervening DNA sequences between the genes that do not exit the nucleus are referred to as introns, and the DNA sequences transcribed into mRNA that exits the nucleus to form the template for protein synthesis are referred to as exons. The function of the introns is largely unknown. A small proportion of the introns has the important regulatory function of determining when and how often the gene makes mRNA. Another function of the introns is, presumably, to maintain the structure and integrity of the DNA molecule. On a simple mathematical basis, the introns also offer some protection of the genes from mutations: since they make up most of the DNA, a random mutation is more likely to fall in an intron than in a gene. The natural mutation rate is 1 every 200,000 years per gene. The mutation rate is higher in the introns, but, since introns are not expressed in the protein, these mutations are benign and do not produce disease. The DNA used to make genes consists of just one copy of each gene per chromosome. However, the introns not infrequently contain many repeating units of the same sequence throughout the genome. The most frequent example is the Alu repeats, which consist of a 300-base-pair repeat with over 500,000 copies scattered throughout the human genome. The role of these repeat sequences is also not known, but they may serve as replication or initiation sites for duplication of DNA. While foreign DNA is usually destroyed, some, such as the genomes of retroviruses, does get incorporated into the human genome. It is estimated that 35 percent of the human genome is composed of DNA from evolutionary relics of mobile DNA elements transposed into the human genome, with no known function.4 Mutations that induce single-gene diseases inherited as Mendelian disorders occur at a frequency of less than 1 percent. In contrast, mutations that induce more subtle changes (genes that predispose to polygenic diseases; e.g., DD versus II) or none at all may be located in exons or introns and occur more frequently, in the range of 10 to 20 percent.
One form of these polymorphisms, single-nucleotide polymorphisms,5,6 which occur about once every 1000 base pairs, is discussed subsequently as the most promising marker for identifying genes responsible for polygenic diseases.
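The repeat and polymorphism figures above imply rough totals that are easy to work out. The sketch below assumes a 3-billion-base-pair genome, 300-base-pair Alu repeats with 500,000 copies (the text gives this as a lower bound), and one single-nucleotide polymorphism per 1000 base pairs; all names are illustrative.

```python
GENOME_BP = 3_000_000_000   # total base pairs, rounded
ALU_LENGTH_BP = 300         # length of one Alu repeat
ALU_COPIES = 500_000        # lower bound on copy number quoted in the text
SNP_SPACING_BP = 1000       # one polymorphic site per this many base pairs

# Total DNA occupied by Alu repeats and its share of the genome.
alu_total_bp = ALU_LENGTH_BP * ALU_COPIES
alu_percent = 100 * alu_total_bp / GENOME_BP

# Expected number of single-nucleotide polymorphisms across the genome.
expected_snps = GENOME_BP // SNP_SPACING_BP

print(f"Alu repeats: {alu_total_bp:,} bp ({alu_percent:.1f}% of the genome)")
print(f"Expected single-nucleotide polymorphisms: {expected_snps:,}")
```

Under these assumptions the Alu repeats alone account for at least 5 percent of the genome, and the expected number of single-nucleotide polymorphisms is about 3 million, the same magnitude as the 0.1 percent sequence difference between individuals.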