
Figure 3.25 Functional analysis of the Arabidopsis genes predicted from the genome sequence, showing the similarities between Arabidopsis functional gene categories and bacterial genomes (E. coli and Synechocystis, a cyanobacterium) and those of yeast, nematode, and fruit fly. The y axis indicates the fraction of Arabidopsis genes in a functional category showing a BLAST match with the respective reference genome. The right to use this figure provided courtesy of members of the Arabidopsis Genome Initiative and Nature magazine. This figure first appeared in Nature 408, 796-815 (2000).


turned out that in the Arabidopsis genome genes may be found that share clear homology with human disease genes! However, the percentage of genes matching those of other species depended greatly on the functional category. Among the genes related to transcription only a small percentage (8-23%) had a match in another species, whereas among the genes related to protein synthesis up to 60% corresponded to a gene in another species. Overall, the similarity with prokaryote genomes was significantly less than with eukaryotes, but in the functional category of energy metabolism more than 30% of the plant genes were similar to a bacterial gene. This is obviously a consequence of the transfer of chloroplast genes to the nuclear genome. Maybe less surprisingly, in the category of cellular communication and signal transduction hardly any match was found between Arabidopsis genes and those of bacteria, but the correspondence with the (unicellular) yeast genome was also relatively low (Fig. 3.25).

Why does the Arabidopsis genome have 87% more genes than Drosophila melanogaster? Two explanations have been given (The Arabidopsis Genome Initiative 2000). First, individual genes have been subjected to wide-scale amplification events, generating large tandem arrays and dispersed gene families; unequal crossing-over may be the predominant mechanism involved. Second, the genome of A. thaliana has undergone a whole-genome duplication after it diverged from most other dicotyledons (Bowers et al. 2003), classifying A. thaliana as a cryptotetraploid species (see Section 3.1). These two genome-enlargement mechanisms have led to a considerable degree of genetic redundancy in the genome; that is, more than one gene has the same function. This is consistent with observations from genetic engineering studies, which show that many genes can be knocked out in Arabidopsis without any phenotypic consequences. The Arabidopsis Genome Initiative speculated that such large-scale duplication events may be needed to generate new functions, and that creating new functions by duplication is more common in plants than in animals, where novelties are more often generated by rearrangements of promoters and alternative splicing.

Figure 3.26 Top: theoretical age distributions of pairs of duplicated genes in a genome. The general decrease of the curve indicates that fewer and fewer genes remain as recognizable duplicate pairs with increasing time since duplication (measured by the number of substitutions per synonymous site, Ks). Peaks in the curve are indicative of 'cohorts' of synchronous duplications. Bottom: distribution of Ks values (a measure of divergence) of paralogous gene pairs in A. thaliana (left) and O. sativa (right). Distributions are shown for genomic gene sequences and for partial, sequenced cDNAs (ESTs). These two approaches result in practically the same pattern. The peak in the Arabidopsis curve around Ks = 0.7-0.8 is indicative of an ancient polyploidy event. In the rice genome the distribution conforms mostly to the theoretical prediction of the top-left panel. After Blanc and Wolfe (2004). Copyright American Society of Plant Biologists.

The possibility of ancient polyploidy in model plants was analysed in more detail by Blanc and Wolfe (2004), using whole-genome data and EST sequences for 14 different species. The authors estimated the sequence divergence between the two genes of a paralogous pair from the number of substitutions that do not alter an amino acid (number of substitutions per synonymous site, Ks; see Section 3.1). The frequency distribution of Ks values over all the genes is a cue to the timing of the duplication process (Fig. 3.26). Arabidopsis has a clear peak in the frequency distribution around a Ks value of 0.8, which is indicative of synchronized duplication of many genes together. The most likely explanation for synchrony is a polyploidization of the whole genome, dated around 25-26.7 million years ago. This was followed by extensive rearrangements and an accelerated loss of genes, with the consequence that the Arabidopsis genome is now relatively small among plant genomes (Table 3.8) and constitutes a complicated mosaic of duplicated genes. In rice, the distribution of Ks values is much more similar to the theoretical expectation following from a continuous process of individual duplications; however, there is a small elevation in the distribution, which according to Blanc and Wolfe (2004) is indicative of a partial chromosomal duplication dated at 70 million years (Fig. 3.26).
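The reasoning behind the Ks distributions can be illustrated with a minimal Python sketch. It is not the procedure of Blanc and Wolfe (2004): the function names, the bin width of 0.1, and the synthetic Ks values (a decaying background plus an optional cohort of duplicates) are assumptions made purely for illustration. The sketch bins Ks values and reports any bin that rises above both of its neighbours, which is how a cohort of synchronous duplications would show up against a steadily declining curve.

```python
def ks_histogram(ks_values, bin_width=0.1, n_bins=20):
    """Count Ks values per bin; values beyond the last bin are ignored."""
    counts = [0] * n_bins
    for ks in ks_values:
        index = int(ks / bin_width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


def secondary_peaks(counts, bin_width=0.1):
    """Return midpoints of bins that exceed both neighbours (first bin excluded)."""
    peaks = []
    for i in range(1, len(counts) - 1):
        if counts[i] > counts[i - 1] and counts[i] > counts[i + 1]:
            peaks.append((i + 0.5) * bin_width)
    return peaks


def synthetic_ks(cohort=None, n_bins=20, bin_width=0.1):
    """Build hypothetical Ks values: a decaying background plus an optional cohort.

    `cohort` maps a bin index to the number of extra duplicate pairs in that bin.
    These numbers are invented for illustration, not taken from real genomes.
    """
    cohort = cohort or {}
    values = []
    for i in range(n_bins):
        n_pairs = round(100 * 0.8 ** i) + cohort.get(i, 0)
        values.extend([(i + 0.5) * bin_width] * n_pairs)
    return values


# Continuous small-scale duplication only: a monotonically declining curve.
background_only = synthetic_ks()
# The same background plus a cohort of duplicates centred on the Ks 0.7-0.8 bin.
with_cohort = synthetic_ks(cohort={6: 15, 7: 40, 8: 15})

print(secondary_peaks(ks_histogram(background_only)))
print(secondary_peaks(ks_histogram(with_cohort)))
```

In this toy setting the background-only data yield no secondary peak, whereas the data with a cohort yield a single peak in the bin spanning Ks 0.7-0.8. Real Ks distributions are noisier, so a simple neighbour comparison like this one would need smoothing or a statistical model before it could be trusted.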

Rice was the second higher plant species with a completely sequenced genome. In fact, two different projects were conducted, one by Syngenta focusing on the japonica subspecies (Goff et al. 2002), and one by the Beijing Genomics Institute, focusing on the most widely cultivated subspecies in China, O. sativa indica (Yu et al. 2002). The indica genome was 466 Mbp in size, with the number of genes estimated to be between 46022 and 55615; the japonica data were similar. Again these counts show that the number of genes in plants can be much higher than in animals. Rice and Arabidopsis belong to two different lineages of angiosperm plants, the monocotyledons and dicotyledons, which diverged around 200 million years ago; however, despite this ancient evolutionary divergence, there appears to be a considerable degree of homology between individual genes. Goff et al. (2002) estimated that 85% of the Arabidopsis predicted proteins had a homologue in the rice genome and that 31% of the proteins shared between Arabidopsis and rice were not found in fruit fly, nematode, or yeast. Almost all genes related to disease resistance in Arabidopsis are also found in rice. These data show that the defence against pathogens is a very basic element of plant biology and is highly conserved between dicotyledons and monocotyledons.

Despite the large number of orthologues shared between Arabidopsis and rice, the degree of synteny between these two species is very limited. There is, however, a great deal of genome synteny (colinearity) between the species of the grass tribe Triticeae, which includes wheat, barley, rye, and some wild plants of the genus Aegilops (goatgrass). Analysing the genetic maps of the Triticeae, Devos and Gale (2000) showed that only two chromosomal rearrangements need to be assumed to achieve colinearity between the genome of Aegilops tauschii (Tausch's goatgrass) and Hordeum vulgare (barley), whereas seven rearrangements can explain the relationship between Ae. tauschii and rye (Secale cereale). Similar syntenic relationships hold for the family Poaceae in general, to which rice also belongs. So the sequence of the relatively small rice genome allows identification of chromosomal segments in other species. However, on a smaller scale (microsynteny), numerous discontinuities in gene order between wheat and rice were identified by Sorrells et al. (2003), so the use of rice as a model for cross-species gene isolation in the Triticeae could prove to be limited.
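One simple way to make the notion of colinearity concrete is to count breakpoints between two marker orders. The Python sketch below is an illustration only and is not the method used by Devos and Gale (2000): the function `count_breakpoints` and the marker names `m1` to `m8` are hypothetical. A breakpoint is counted wherever two markers that are neighbours in one map are not neighbours in the reference map.

```python
def count_breakpoints(reference_order, other_order):
    """Count adjacent marker pairs in other_order that are not adjacent in reference_order.

    Orientation is ignored, so (a, b) and (b, a) count as the same adjacency.
    """
    reference_adjacencies = {
        frozenset(pair) for pair in zip(reference_order, reference_order[1:])
    }
    return sum(
        frozenset(pair) not in reference_adjacencies
        for pair in zip(other_order, other_order[1:])
    )


# Hypothetical marker order along one chromosome of a reference species.
reference = ["m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8"]
# The same markers in a second species after one inversion of the m3-m5 segment.
one_inversion = ["m1", "m2", "m5", "m4", "m3", "m6", "m7", "m8"]

print(count_breakpoints(reference, reference))      # 0: perfectly colinear
print(count_breakpoints(reference, one_inversion))  # 2: one inversion, two breakpoints
```

A single inversion produces two breakpoints, so breakpoint counts are only a rough proxy for the number of rearrangements separating two genomes. With many markers and few breakpoints, as in the comparisons described above, the maps are largely colinear.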

The website for the Arabidopsis Information Resource (TAIR; http://arabidopsis.org) allows researchers to search for genes, proteins, alleles, markers, etc., and provides various analysis tools, such as sequence viewers, map viewers, BLAST protocols, and microarray analysis. There are also a great number of links, for example to the Arabidopsis Biological Resource Center, which has thousands of stocks in the form of clones or seeds, which are shipped around the world. The website includes a search engine for publications on Arabidopsis genomics in the widest sense. A frequently used platform for transcription profiling in Arabidopsis is the Affymetrix Arabidopsis Genome Array ATH1, which has probes for 24000 genes. Specialized software packages for surveying and mining gene-expression data generated with the Affymetrix gene chips have also been developed (Zimmerman et al. 2004).

3.3.5 The deuterostome lineage

Genomes of higher animals are discussed jointly here with reference to the subkingdom Deuterostomia, which includes the phyla Pterobranchia, Echinodermata, Hemichordata, and Chordata. Nielsen (1995) also includes Phoronida and Brachiopoda in the Deuterostomia, although most zoologists rank these two phyla with the protostome Lophotrochozoa group. Within the deuterostomes, genome sequencing is greatly biased towards mammals (see Table 1.1); however, in this section we will pay most attention to the genomes of the basal members of the Chordata. Two species of echinoderm, the California purple sea urchin, Strongylocentrotus purpuratus, and the green sea urchin, Lytechinus variegatus, are presently being sequenced, but an analysis of their genomes has not yet been made, so we will limit ourselves to the chordates.

An interesting view of the origin of chordates and vertebrates is obtained from the genome of a sea squirt, Ci. intestinalis (Dehal et al. 2002; Canestro et al. 2003). This animal belongs to the chordate subphylum Urochordata, also called the Tunicata, after the tunic, a tough fibrous cover, secreted by the skin, in which the animal is contained. The Urochordata comprise three classes, one of them being the Ascidiacea or sea squirts. As an adult, Ci. intestinalis is sessile and attached to an underwater substrate where it filters food particles by pumping water through its elaborate pharynx, a basket-like structure, which fills most of the tunic. The name squirt is due to the regular pulses of water driven out of the exhalent siphon (Fig. 3.27a). Aside from the gill slits in the pharynx, no obvious characters of the adult indicate that the animal is closely related to vertebrates, but the organization of the free-swimming ascidian larva differs greatly from that of the adult and reveals its chordate body plan. In fact the larva of a sea squirt looks very much like a jawless fish, and is equipped with a chorda and a dorsal nerve cord, externally resembling a tadpole (Fig. 3.27b).

Ci. intestinalis is a solitary, small, and relatively short-lived marine animal that colonizes solid substrates in the sublittoral zone, such as protected rocky shores, shipwrecks, and buoys. Due to its rapid colonizing capacity, it is sometimes a conspicuous and abundant representative of the 'fouling' community. With their large filtration capacity, the animals act as filters and so contribute to purification of coastal waters, although by the same mechanism they accumulate chemicals and are used for biomonitoring of coastal sea pollution. Ecological work on Ciona and other tunicates aims at answering questions about settlement in relation to density and intraspecific competition. Local populations seem to be highly dynamic and are characterized by cyclic retreat and recolonization events. Because of this type of population dynamics, ecologists are interested in geographical population genetic structure; microsatellite markers have been developed to support such analyses (Procaccini et al. 2000). The recently generated genomic information on Ciona has, however, not yet penetrated into ecological studies.

Figure 3.27 (a) Adult sea squirts. (b) A group of larvae. David Keys (photo) and Leila Hornick (artistic rendering), courtesy of the U.S. Department of Energy Joint Genome Institute. © 2005 The Regents of the University of California.

The genomes of tunicates are considerably smaller than those of vertebrates, and Ciona's genome measures about 160 Mbp (Dehal et al. 2002). The gene content represents an interesting blend between ancient protostome signatures and chordate innovations, with some tunicate autapomorphisms added. Dehal et al. (2002) found a total of 15852 protein-encoding genes and these were compared with the gene complements of Drosophila, C. elegans, puffer fish, and mammals. It turned out that 60% of the genes shared homology with fruit flies and nematodes, so these represent the core physiological and developmental machinery that is common to all bilaterian animals. A few hundred of these genes even have a stronger similarity to fruit fly or nematode than to any vertebrate, and so these genes represent functions that were present in the invertebrates, but were lost in the vertebrate lineage. Examples are chitin synthase (there is no chitin exoskeleton in chordates), phytochelatin synthase (the role of the zinc-binding molecule phytochelatin was taken over by metallothionein), and haemocyanin (the copper-containing blood pigment of arthropods and bivalves, absent from vertebrates). Another 16% of the genes lacked a homologue in the protostome groups, but had a clear vertebrate counterpart. These genes apparently have arisen on the deuterostome branch before the split between tunicates and vertebrates. Then another 21% of the genes had no clear homologue in fruit fly, nematode, fish, or mammal and represent tunicate-specific genes.
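The percentages quoted above can be turned into approximate gene counts with a few lines of Python. This is a back-of-the-envelope sketch only: the category labels are paraphrased from the text, and because the published percentages are rounded, the resulting counts are approximations and not figures reported by Dehal et al. (2002).

```python
TOTAL_GENES = 15852  # protein-encoding genes reported for Ciona in the text

# Percentages as quoted in the text; labels are paraphrased for readability.
share_percent = {
    "homologue in fruit fly or nematode": 60,
    "vertebrate counterpart, no protostome homologue": 16,
    "no clear homologue (tunicate-specific)": 21,
}

# Approximate number of genes per category (percentages are rounded in the source).
approx_counts = {
    label: round(TOTAL_GENES * percent / 100)
    for label, percent in share_percent.items()
}

# Share of the gene set not covered by the three categories named in the text.
unaccounted_percent = 100 - sum(share_percent.values())

for label, count in approx_counts.items():
    print(f"{label}: about {count} genes")
print(f"not covered by these categories: {unaccounted_percent}%")
```

The three categories add up to 97%, so about 3% of the genes are not assigned to any of them in the summary given here.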

Interestingly, Ciona's genome has genes related to the synthesis and degradation of cellulose (cellulose synthase and several endoglucanases), genes that are otherwise never found in animals, only in plants, nitrogen-fixing bacteria, and bacteria living endosymbiotically with termites and wood-feeding cockroaches. The presence of these genes is related to the composition of the tunic, which is built largely of a cellulose-like carbohydrate called tunicin. How Ciona obtained these genes (a dramatic example of lateral gene transfer?) remains a mystery, but obviously it has been a very significant event in the evolution of this group (Matthysse et al. 2004). Ciona's genome has all the genes related to the innate immune system, as in Anopheles and Drosophila, but genes implicated in adaptive immunity could not be found. This suggests that the adaptive immune system is an apomorphy of the vertebrates, not of the chordates as a whole. Ascidians are also known for their extremely high body concentration of vanadium, several orders of magnitude higher than in any other animal. Vanadium is accumulated in specialized blood cells, vanadocytes, where it is localized in intracellular vacuoles, together with a similarly high concentration of sulphate. Three vanadium-binding proteins, vanabins, have been characterized recently in Ascidia sydneiensis samea (Ueki et al. 2003) and five vanabins are encoded in the genome of Ci. intestinalis (Trivedi et al. 2003). However, a genome-wide analysis of the peculiar vanadium metabolism of ascidians has not yet been conducted.

Turning our attention from urochordates to vertebrates, we note that three species of fish presently serve as genomic models: Takifugu rubripes, Tetraodon nigroviridis (both puffer fish, family Tetraodontidae), and Danio rerio (zebrafish, family Cyprinidae). T. rubripes (also known as Fugu rubripes) was proposed in 1993 by Sydney Brenner as a genomic model because with its small genome (470 Mbp) it would allow a cost-effective way of illuminating the human genome. In the Far East, the fish is known not only for its small genome, but also for containing the extremely toxic compound tetrodotoxin, which, with an oral LD50 to mammals of 15 mg per kg of body weight, is one of the most potent toxins known. Japanese men practice the habit of eating 'fugu' fish in restaurants that have obtained a special licence allowing the cook to separate the flesh from the hypertoxic liver and ovaries. The International Fugu Genome Consortium was formed in the year 2000, coordinated by the Institute of Molecular and Cell Biology in Singapore, in collaboration with groups in the UK and the USA (www.fugu-sg.org). The sequence was released less than 2 years later (Aparicio et al. 2002).

Because the Takifugu genome assembly remained highly fragmented, another team, coordinated by the French sequencing centre Genoscope, started on a related puffer fish, Te. nigroviridis. This species has an even smaller genome and it offered the additional advantage of being a popular aquarium fish, easily maintained in tap water. The name puffer is derived from the fish's habit of inflating itself when it is threatened. The analysis of the genome, published by Jaillon et al. (2004), revealed several interesting trends about gene duplications in the actinopterygian fish lineage (ray-finned fish, as opposed to lobe-finned fish, the Sarcopterygii, such as lung fish and coelacanths). The genome of Tetraodon measured 342 Mbp and had 28918 putative protein-encoding genes, 1.8 times more than in Ciona, but somewhat fewer than in Takifugu (31059). The slightly smaller genome size was ascribed to the absence of transposable elements, which are rather abundant in fugu fish. Careful analysis of the content of each of the 21 Tetraodon chromosomes allowed reconstruction of a duplication event in the actinopterygian lineage (Fig. 3.28). Assuming that the original number of chromosomes of the ancestral gnathostome (jawed fish) was 12, a duplication event, followed by 10 different chromosomal rearrangements (fusions and translocations), can explain the present organization of the 21 chromosomes. The duplication is assumed to have taken place later in the evolution of the Actinopterygii, close to the origin of the Teleostei (modern bony fish), because some early-branching actinopterygian fish (bichirs, Polypteriformes) do not have the duplication. Similar conclusions were reached by Christoffels et al. (2004) in an analysis of the fugu genome.
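The numbers in this comparison can be checked with a short Python sketch. It uses only the genome sizes and gene counts quoted in the text (160 Mbp and 15852 genes for Ciona, 470 Mbp and 31059 genes for Takifugu, 342 Mbp and 28918 genes for Tetraodon); the variable names are arbitrary, and the gene densities are derived here for illustration and are not values reported by the cited authors. The last lines restate the chromosome arithmetic under the assumptions given above.

```python
# Genome size (Mbp) and number of protein-encoding genes, as quoted in the text.
genomes = {
    "Ciona": {"size_mbp": 160, "genes": 15852},
    "Takifugu": {"size_mbp": 470, "genes": 31059},
    "Tetraodon": {"size_mbp": 342, "genes": 28918},
}

# Ratio behind the statement that Tetraodon has about 1.8 times as many genes as Ciona.
gene_ratio = genomes["Tetraodon"]["genes"] / genomes["Ciona"]["genes"]
print(f"Tetraodon / Ciona gene ratio: {gene_ratio:.1f}")

# Derived gene density (genes per Mbp); an illustrative calculation only.
gene_density = {
    name: data["genes"] / data["size_mbp"] for name, data in genomes.items()
}
for name, density in gene_density.items():
    print(f"{name}: {density:.1f} genes per Mbp")

# Chromosome arithmetic: 12 ancestral chromosomes doubled by the duplication,
# compared with the 21 chromosomes of Tetraodon today.
ancestral_chromosomes = 12
after_duplication = 2 * ancestral_chromosomes
net_reduction = after_duplication - 21
print(f"chromosomes after duplication: {after_duplication}, net reduction since: {net_reduction}")
```

Under these assumptions the duplication gives 24 chromosomes, so the subsequent rearrangements must have reduced the chromosome number by a net three to arrive at the present 21.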

The model of Jaillon et al. (2004) is consistent with an earlier analysis of the vertebrate Hox genes by Amores et al. (1998). These authors had
