Many common laboratory models and pathogens Several methanogens and extremophiles

Dictyostelium discoideum (slime mould) Entamoeba histolytica (amoeba causing dysentery) Four Plasmodium and two Microsporidium species Trypanosoma brucei, Leishmania tropica (parasites) Guillardia theta (flagellated unicellular alga) Thalassiosira pseudonana (marine diatom) Cyanidioschyzon merolae (small unicellular red alga) Chlamydomonas reinhardtii (green alga), Populus trichocarpa (black cottonwood), Arabidopsis thaliana (thale cress), Oryza sativa var. japonica, var. indica (rice) Including Saccharomyces cerevisiae (baker's yeast)

Caenorhabditis elegans (free-living roundworm), Caenorhabditis briggsae Bombyx mori (silk worm), Drosophila melanogaster (fruit fly), Anopheles gambiae (mosquito, malaria vector), Apis mellifera (honey bee) Ciona intestinalis (sea squirt)

Takifugu rubripes (puffer or fugu fish), Tetraodon nigroviridis (puffer fish), Danio rerio (zebrafish) Gallus gallus (red jungle fowl)

Rattus norvegicus (brown rat), Mus musculus (house mouse), Canis familiaris (domestic dog), Pan troglodytes (chimpanzee), Homo sapiens (human)

Sources: from,, GenBank Nucleotide Sequence Database, and sundry sources.

qualified this announcement as more a matter of public communication than scientific achievement. At that time the accepted criterion for completion of a genome sequence, namely that only a few gaps or gaps of known size remained to be sequenced and that the error rate was below 1 in 10 000 bp, was still far from being met. The euchromatin part of the genome was not completed until mid-2004, although that milestone was again considered by some to be only the end of the beginning (Stein 2004). Nevertheless, the Human Genome Project can be regarded as one of the most successful scientific endeavours in history and the assembly of the 3.12 billion bp of DNA, requiring some 500 million trillion sequence comparisons, was the most extensive computation that had ever been undertaken in biology.
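To make the completion criterion concrete, the short Python sketch below works out how many residual errors an error rate of 1 in 10 000 bp would still allow in an assembly of 3.12 billion bp. The function and variable names are ours, introduced for illustration only; the two figures are those quoted in the text.

```python
def max_expected_errors(genome_size_bp: int, bp_per_error: int) -> float:
    """Number of erroneous bases if exactly one base in `bp_per_error`
    is wrong, i.e. the upper bound allowed by the criterion."""
    if bp_per_error <= 0:
        raise ValueError("bp_per_error must be positive")
    return genome_size_bp / bp_per_error


HUMAN_ASSEMBLY_BP = 3_120_000_000  # 3.12 billion bp, as quoted in the text
BP_PER_ERROR = 10_000              # criterion: fewer than 1 error per 10 000 bp

errors = max_expected_errors(HUMAN_ASSEMBLY_BP, BP_PER_ERROR)
print(f"Upper bound on residual errors: {errors:,.0f} bp")
```

Under these figures, even a sequence that just meets the criterion could still contain up to about 312 000 incorrect bases, which helps explain why a "finished" genome was regarded by some as only the end of the beginning.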

The number of organisms whose genome has been sequenced completely and published is now approaching 300 (Table 1.1). Bacteria dominate the list, as the small size of their genomes makes these organisms well-suited for whole-genome sequencing. By June 2005, no fewer than 730 prokaryotic organisms and 496 eukaryotes were the subject of ongoing genome sequencing projects. The list in Table 1.1 will certainly be out of date by the time this book goes to press, as new genome projects are being launched or completed every month.

The list of species with completed genome sequences does not represent a random choice from the Earth's biodiversity. From an ecologist's point of view, the absence of reptiles, amphibians, molluscs, and annelids is striking, as also is the scarcity of birds and arthropods other than the insects. How did a species come to be a model in genomics? We review the various arguments below, asking whether they would also apply when selecting model species for ecological studies.

Previously established reputation. This holds for yeast, C. elegans, Drosophila, mouse, and rat. These species had already proven their usefulness as models before the genomics revolution and were adopted by genomicists because so much was known about their genetics and biochemistry, and, perhaps just as important, because a large research community was interested, could support the work, and use the results.

Genome size. One of the first questions asked when a species is considered for whole-genome sequencing is: what is the size of its genome? At least in the beginning, a relatively small genome was a major advantage for a sequencing project. The genome size of living organisms ranges across nine orders of magnitude, from 10^3 bp (0.001 Mbp) in RNA viruses to nearly 10^12 bp (1 000 000 Mbp) in some protists, ferns, and amphibians. The puffer fish, Takifugu rubripes, was indeed chosen because of its relatively small genome (one-eighth of the human genome).

Possibility for genetic manipulation. The possibility of genetic manipulation was an important reason why Arabidopsis, Drosophila, and mouse became such popular genomic models. The ultimate answer about the function of a gene comes from studies in which the genome segment is knocked out, downregulated, or overexpressed against a genetic background that is the same as that of the wild type. Also, the introduction of constructs in the genome that can report activity of certain genes by means of signal molecules is very important. This can only be done if the species is accessible using recombinant-DNA techniques. Foreign DNA can be introduced using transposons; for example, modified P-elements that can 'jump' into the DNA of Drosophila, or bacteria such as Agrobacterium tumefaciens that can transfer a piece of DNA to a host plant. DNA can also be introduced by physical means, especially in cell cultures, using electroporation, microinjection, or bombardment with gold particles. Another popular approach is post-transcriptional gene silencing using RNA interference (RNAi), also called inhibitory RNA expression. Should the possibility for genetic manipulation be an argument for selecting model species in ecological genomics? We think that it should, knowing that the capacity to generate mutants and transgenes of ecologically relevant species is crucial for confirming the function of genes. Ecologists should also use the natural variation in ecologically relevant traits to guide their explorations of the genome (Koornneef 2004, Tonsor et al. 2005). A basic resource for genome investigation can be obtained by using natural varieties of the study species, and developing genetically defined culture stocks.

Medical or agricultural significance. Many bacteria and parasitic protists were chosen because of their pathogenicity to humans (see the many parasites in Table 1.1). Other bacteria and fungi were taken as genomic models because of their potential to cause plant diseases (phytopathogenicity). Obviously, the sequencing of rice was motivated by the huge importance of this species as a staple food for the world population (Adam 2000). Some agriculturally important species have great relevance for ecological questions; for example, the bacterium Sinorhizobium meliloti, a symbiont of leguminous plants, is known for its nitrogen-fixing capacities, but it also makes an excellent model system for the analysis of ecological interactions in nutrient cycling, together with its host Medicago truncatula.

Biotechnological significance. Many bacteria and fungi are important as producers of valuable products, for example antibiotics, medicines, vitamins, soy sauce, cheese, yoghurt, and other foods made from milk. There is considerable interest in analysing the genomes of these microorganisms because such knowledge is expected to benefit production processes (Puhler and Selbitschka 2003). Other bacteria are valuable genomic models because of their capacity to degrade environmental pollutants; for example, the marine bacterium Alcanivorax borkumensis is a genomic model because it produces surfactants and is associated with the biodegradation of hydrocarbons in oil spills (Roling et al. 2004).

Evolutionary position. Whole-genome analysis of organisms at crucial or disputed positions in the tree of life can be expected to contribute significantly to our knowledge of evolution. The sea squirt, Ci. intestinalis, was chosen as a model because it belongs to a group, the Urochordata, with properties similar to the ancestors of vertebrates. The study of this species should provide valuable information about the early evolution of the phylum to which we belong ourselves. Me. jannaschii was chosen for more or less the same reason, because it was the first sequenced representative from the domain of the Archaea. Many other organisms, although not on the list for a genome project to date, have a strong case for being declared as model species for evolutionary arguments. These include the velvet worm, Peripatus, traditionally seen as a missing link between the arthropods and annelids, but now classified as a separate phylum in the Panarthropoda lineage (Nielsen 1995), and the springtail, Folsomia candida, formerly regarded as a primitive insect, but now suggested to have developed the hexapod bodyplan before the insects separated from the crustaceans (Nardi et al. 2003).

Comparative purposes. Over the last few years, genomicists have realized that assigning functions to genes and recognizing promoter sequences in a model genome can greatly benefit from comparison with a set of carefully chosen reference organisms at defined phylogenetic distances. Comparative genomics is developing an increasing array of bioinformatics techniques, such as synteny analysis, phylogenetic footprinting, and phylogenetic shadowing (see Chapter 3), by which it is possible to understand aspects of a model genome from other genomes. One of the main reasons for sequencing the chimpanzee's genome was to illuminate the human genome, and a variety of fungi were sequenced to illuminate the genome of S. cerevisiae.

Ecological significance. It will be clear that ecological arguments have only played a minor role in the selection of species for whole-genome sequencing, but we expect them to become more important in the future. Jackson et al. (2002) have formulated arguments for the selection of ecological model species, and we present them in slightly adapted form.

Biodiversity. The new range of models should embrace diverse phylogenetic lineages, varying in their physiology and life-history strategy. For example, the model plants Arabidopsis and rice both employ the C3 photosynthetic pathway. To complement our genomic knowledge of primary production, new models should be chosen among plants utilizing C4 photosynthesis or crassulacean acid metabolism (CAM). Considering the diversity of life histories, species differing in their mode of reproduction and dispersal capacity should be chosen; for example, hermaphroditism versus gonochorism, parthenogenesis versus bisexual reproduction, etc.

Ecological interactions. Species that take part in critical ecological interactions (mutualisms, antagonisms) are obvious candidates for genomic analysis. One may think of mycorrhizae, nitrogen-fixing symbionts, pollinators, natural enemies of pests, parasites, etc. The most obvious strategy for analysing such interactions would be to sequence the genomes of the players involved and to try and understand interactions between them from mutualisms or antagonisms in gene expression.

Suitability for field studies. The wealth of knowledge from experienced field ecologists should play a role in deciding about new 'ecogenomic' models. Not all species lend themselves to studies of behaviour, foraging strategy, habitat choice, population size, age structure, dispersal, or migration in the field, simply because they are too rare, not easily spotted, difficult to sample quantitatively, impossible to mark and recapture, not easy to distinguish from related species, or inaccessible to invasive techniques. Thus suitability for field research is another important criterion.

Feder and Mitchell-Olds (2003) developed a similar series of criteria for an ideal model species in evolutionary and ecological functional genomics (Fig. 1.5). These authors point out that there is currently a discrepancy between classical model species and many ecologically interesting species. Models such as Drosophila and Arabidopsis are not very suitable for ecological studies, whereas many popular ecological models have a poorly characterized genome and lack a large community of investigators. In some cases a large ecological community is available, but functional genomic studies are difficult for reasons of quite another nature. For example, many ecologists favour wild birds as a study object, but there are ethical objections to genetic manipulation of such species and laboratory experiments are restricted by law.

It is not easy to foresee how the list of genomic model species will develop in the future. Obviously, ecologists taking ecological genomics seriously will need to avail themselves of genomic information on their model species, preferably a whole-genome sequence. This is not to say, however, that all questions in ecological genomics require the full-length DNA sequence of a species before they can be answered. Some issues may prove to be solvable with the use of less extensive genomic investigations, for example a gene hunt followed by multiplex quantitative PCR, rather than transcription profiling with microarrays of the complete genome (see Section 2.3). In addition, microarray studies with part of the expressed genome are possible even in species lacking a complete DNA sequence. Microarrays can be manufactured at costs that are affordable for small research groups if they are limited to genes associated with a specific function or response pathway (Held et al. 2004; see also Section 6.4). Still, the number of species with fully characterized genomes is expected to rise dramatically in the coming years; after a while all the major ecological models will also be genomic models and the saturation point could very well be due to the limited number of molecular ecologists in the worldwide scientific community.

Figure 1.5 Criteria in evolutionary and ecological functional genomics for a model species, according to Feder and Mitchell-Olds (2003). At present few species satisfy all criteria. Reproduced by permission of Nature Publishing Group.

Not all ecological models will enjoy the type of in-depth investigations now dedicated to yeast, fly, worm, and weed. Murray (2000) points out that the development of genome-based tools has a strong element of positive feedback; the rich—that is, widely studied organisms—get richer and the poor get poorer. This development has already been felt in the fields of animal and plant physiology, where many of the species traditionally investigated in comparative physiology and biochemistry have been abandoned in favour of models that can be genetically manipulated to study the function of genes. Murray (2000) predicted that 'the larger its genome and the fewer its students, the more likely work on an organism is to die'. Crawford (2001) has argued, however, that functional genomics should resist this tendency and instead choose species best suited to addressing specific physiological or biochemical processes. For example, the Nobel Prize for Medicine was given to H.A. Krebs for his research on the citric acid cycle, which was conducted on common doves. By modern standards the dove is a nonmodel species, but it was chosen because its breast muscle is very rich in mitochondria. In animal physiology, Krogh's principle assumes that for every physiological problem there is a species uniquely suited for its analysis (Gracey and Cossins 2003). According to this principle, genomic standard species are likely to be suboptimal for at least some problems of physiology, because no model is uniquely suited to answering all questions.

DNA microarrays, with their associated massive generation of data on expression profiles (see Section 2.3), are one of the most tangible features of modern genomics and are often seen as holding the greatest promise for solving problems in ecology. However, not all ecologists are convinced that microarray-based transcription profiling is the best way to advance the genomics revolution into ecology. Thomas and Klaper (2004), for example, argued that commercial microarrays are available only for genomic model species, whereas the interest of ecologists is with species that are important in the environment and amenable to ecological studies; these two interests do not necessarily coincide. This leaves ecologists with two options. One is to develop their own microarrays, starting with spotted cDNAs of unknown sequence, doing a lot of tedious sequencing work, and gradually finding out more about the genome of their study species. Another option is to apply transcriptome samples of non-models to microarrays of model species. In these cross-species hybridizations it is assumed that there is sufficient homology between the non-model and the model to allow differential expression to be assessed reliably. For example, Arabidopsis may function as a model for other species of the Brassicaceae, and Drosophila as a model for other higher insects. Obviously, the usefulness of such an approach will depend on how far the sequences of model and non-model have diverged. Divergence will not be the same for all parts of the genome and therefore there is some doubt about the validity of cross-species hybridization, although there will certainly be situations where it works well.
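The dependence of cross-species hybridization on sequence divergence can be illustrated with a minimal Python sketch. It computes the percentage identity between a probe sequence from the model species and the corresponding, already aligned, transcript fragment from the non-model species. The function names, the toy sequences, and the 80% cut-off are hypothetical assumptions made for illustration, not values from the text; real hybridization behaviour also depends on probe length, base composition, and stringency.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percentage of identical positions between two pre-aligned,
    equal-length nucleotide sequences (gaps are not handled)."""
    if len(probe) != len(target):
        raise ValueError("sequences must be aligned to equal length")
    if not probe:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


def likely_to_hybridize(probe: str, target: str,
                        min_identity: float = 80.0) -> bool:
    """Crude screen: accept a probe when identity reaches the cut-off.
    The 80% default is an illustrative assumption, not an established rule."""
    return percent_identity(probe, target) >= min_identity


# Toy sequences invented for this example (not real genes).
model_probe = "ATGGCCAAGCTGACCGAT"
conserved = "ATGGCTAAGCTGACCGAT"   # 1 mismatch in 18 positions
diverged = "ATGACTAAACTTTCAGAC"    # several substitutions

for name, seq in [("conserved", conserved), ("diverged", diverged)]:
    identity = percent_identity(model_probe, seq)
    print(f"{name}: {identity:.1f}% identity, "
          f"usable={likely_to_hybridize(model_probe, seq)}")
```

Applied gene by gene, a screen of this kind would show which parts of a model-species array are conserved enough to be informative for the non-model, reflecting the point that divergence differs between parts of the genome.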

Other investigators are less hesitant about the prospects of microarrays in ecology. Gibson (2002) emphasized that today it is feasible to establish a 5000-clone microarray resource within 12 months of commencing a project and that neither the estimated expense nor the availability of technology need be a major obstacle to progress. We share this optimism. Given that the number of almost completely sequenced organisms is increasing month by month, we can expect that the genomes of several species of great interest to ecologists may be completed within a few years.

In addition, we expect that almost all ecologically relevant species will have basic genomics databases—for example, an annotated EST library—sufficient to answer a considerable number of ecological questions.
