Eukaryotic diversity and phylogeny using small - Université Paris-Sud
Environmental Microbiology (2009) doi:10.1111/j.1462-2920.2009.02023.x

Eukaryotic diversity and phylogeny using small- and large-subunit ribosomal RNA genes from environmental samples

William Marande, Purificación López-García and David Moreira*
Unité d’Ecologie, Systématique et Evolution, UMR CNRS 8079, Univ. Paris-Sud 11, bâtiment 360, 91405 Orsay Cedex, France.

Summary

The recent introduction of molecular techniques in eukaryotic microbial diversity studies, in particular those based on the amplification and sequencing of small-subunit ribosomal DNA (SSU rDNA), has revealed the existence of an unexpected variety of new phylotypes. The taxonomic ascription of the organisms bearing those sequences is generally deduced from phylogenetic analysis. Unfortunately, the SSU rDNA sequence alone often lacks enough phylogenetic information to resolve the phylogeny of fast-evolving or very divergent sequences, leading to their misclassification. To address this problem, we tried to increase the phylogenetic signal by amplifying the complete eukaryotic rDNA cluster [i.e. the SSU rDNA, the internal transcribed spacers, the 5.8S rDNA and the large-subunit (LSU) rDNA] from environmental samples, and sequencing the SSU and LSU rDNA parts of the clones. Using marine planktonic samples, we showed that surveys based on either SSU or SSU + LSU rDNA retrieved comparable diversity patterns. In addition, phylogenetic trees based on the concatenated SSU + LSU rDNA sequences showed better resolution, yielding good support for major eukaryotic groups such as the Opisthokonta, Rhizaria and Excavata. Finally, highly divergent SSU rDNA sequences, whose phylogenetic position was impossible to determine with the SSU rDNA data alone, could be placed correctly with the SSU + LSU rDNA approach. These results suggest that this method can be useful, in particular for the analysis of eukaryotic microbial communities rich in phylotypes of difficult phylogenetic ascription.
Received 22 August, 2009; accepted 26 June, 2009. *For correspondence. E-mail [email protected]; Tel. (+33) 169157608; Fax (+33) 169154697.
© 2009 Society for Applied Microbiology and Blackwell Publishing Ltd

Introduction

The use of molecular methods has played a major role in our understanding of microbial diversity. During the last two decades, PCR amplification and sequencing of the small-subunit ribosomal RNA gene (SSU rDNA) has been extensively applied to survey the huge bacterial and archaeal diversity found in many common and extreme habitats (Barns et al., 1996; Hugenholtz et al., 1998). Recently, this technique was also adopted for the analysis of eukaryotic diversity and it revealed a variety of new groups, such as the marine Group I alveolates (López-García et al., 2001; Moon-van der Staay et al., 2001) or the putative new algal clade picobiliphytes (Not et al., 2007), as well as a large diversity within many well-known groups such as the prasinophyte algae (Guillou et al., 2004; Viprey et al., 2008). In addition, several SSU rDNA sequences that branched deeply in the eukaryotic phylogeny could not be specifically related to any known group, even at the kingdom level, suggesting the controversial existence of several novel eukaryotic kingdom-level groups (Dawson and Pace, 2002; Edgcomb et al., 2002).

A key component of all the SSU rDNA-based diversity studies is the taxonomic classification of the organisms whose sequences are retrieved, which is commonly inferred by phylogenetic analysis of the novel environmental phylotypes. In addition to artefacts derived from undetected chimeric sequences, single-marker phylogenies have a series of inherent methodological problems, such as the lack of enough informative signal for the resolution of deep nodes, which can lead to artefactual misplacement in phylogenetic trees (Philippe and Adoutte, 1998; Stiller and Hall, 1999; Philippe et al., 2000).
A typical phylogenetic reconstruction problem, the long-branch attraction (LBA) artefact, often causes highly divergent SSU rDNA sequences to be placed at the base of phylogenetic trees, leading them to be considered as potential novel eukaryotic lineages. Recently, it was demonstrated that of 28 published phylotypes representing putative novel high-level eukaryotic taxa, only 11 were actual potential new lineages (Berney et al., 2004; Cavalier-Smith, 2004). For example, the phylotypes DH145-EKD11 and CCW75, putative members of a new eukaryotic clade (López-García et al., 2001; Stoeck and Epstein, 2003), were later recognized to be highly divergent phylotypes related to the very fast-evolving ciliates Myrionecta rubra and Mesodinium pulex (Johnson et al., 2004). This is probably the case for many other divergent eukaryotic phylotypes (Berney et al., 2004; Cavalier-Smith, 2004).

A natural way to alleviate the misplacement of environmental sequences caused by the intrinsic limitation of single-marker data would be to increase the number of phylogenetically informative sites by concatenating large-subunit (LSU) and SSU rDNA sequences. This strategy was applied recently, producing eukaryotic phylogenetic trees with much better resolution than the SSU rDNA alone and better statistical support for the major eukaryotic groups (Moreira et al., 2007). The LSU and SSU rRNA genes are generally adjacent in the genome, so that both markers could also be easily retrieved by PCR amplification from environmental DNA samples. The feasibility of this approach has already been shown for marine planktonic bacteria (Suzuki et al., 2001) but it has never been tried for eukaryotes. In this work, we analysed the results of parallel SSU rDNA and SSU + LSU rDNA surveys to check whether the use of this larger marker retrieves comparable eukaryotic diversity profiles.
In addition, we tested whether the use of the SSU + LSU rDNA concatenations allows the correct placement of fast-evolving phylotypes in eukaryotic phylogenetic trees.

Results and discussion

Diversity analysis and comparison between SSU rDNA and complete rDNA cluster libraries

We successfully amplified the eukaryotic rDNA cluster from DNA extracted from four different surface (25 m) marine planktonic samples (Table 1), obtaining PCR products of 5–6 kbp. To compare the genetic diversity retrieved from the complete rDNA cluster amplification with the classical SSU rDNA analysis, we generated in parallel SSU rDNA libraries from the same four samples. For the DH18, DH114 and Ma131 libraries, we selected 48 positive clones of each type (SSU and SSU + LSU), whereas for the DH141 library, which appeared to show the most even taxonomic diversity without any clearly dominant group, we sequenced 96 additional clones (i.e. 144 clones in total) of each type in order to have a more exhaustively characterized library for comparative purposes. For each SSU or SSU + LSU rDNA clone, we sequenced ~800 bp of the 5′ region of the SSU rRNA gene using the primer Euk-42F. A total of 508 partial sequences were thus determined (see Table 1).

The overall taxonomic diversity retrieved was large, revealing the presence of typical marine planktonic eukaryotic groups in the samples, including alveolates, cryptophytes, streptophytes, chlorophytes, haptophytes and heterokonts, with variable frequencies depending on the sample analysed (Fig. 1). For each sample, we compared the proportions of the different taxonomic groups for the two types of libraries, SSU and SSU + LSU rDNA. We detected the groups that normally dominate surface marine environments (Díez et al., 2001) and observed quite similar taxonomic compositions for the most abundant groups in both types of libraries (Fig. 1).
Table 1. Sampling sites and number of clones sequenced.

Sample   Station                      Coordinates              Depth (m)   Volume filtered (l)   Clones sequenced (SSU/SSU + LSU)
DH18     DHARMA 5, South Atlantic     62°22′11′′ 53°35′56′′    25          19                    46/36
DH114    DHARMA 32, South Atlantic    54°59′44′′ 58°22′17′′    25          14                    47/48
DH141    DHARMA 18b, South Atlantic   59°22′45′′ 55°46′27′′    25          14                    138/114
Ma131    Marmara Sea                  40°50.295 N 28°1.397E    25          4                     42/37

Fig. 1. Eukaryotic diversity identified using SSU and SSU + LSU rDNA libraries. Histograms represent the percentage of clones belonging to each eukaryotic group for a given environmental sample. For the number of clones analysed see Table 1. [Panels: DH18, DH114, DH141 and Ma131; vertical axis: clone frequency (%); bars: SSU rDNA clones and SSU + LSU rDNA clones.]

Differences between SSU and SSU + LSU rDNA libraries from the same sample mostly concerned the less abundant groups (Table 2). This likely reflected, at least in part, that the library surveys were not exhaustive, namely that we had not reached a plateau in the saturation curves for each library (data not shown). However, this was not the only reason. For example, we did not retrieve any haptophyte sequence in the SSU + LSU rDNA libraries, despite the fact that members of this widespread photosynthetic group were identified, although in relatively small proportions, in all the corresponding SSU rDNA libraries. An inspection of the available haptophyte LSU rDNA sequences revealed that the eukaryotic ‘universal’ primer 28S-4R, used for the construction of the SSU + LSU rDNA libraries, showed two mismatches with those haptophyte sequences, which probably explains the bias against the haptophytes observed in the SSU + LSU rDNA libraries. Nevertheless, this was the only clear bias that we could identify in the large-scale taxonomic composition of our libraries. Preferential PCR amplification and cloning of small rDNA clusters is another possible bias that may have escaped our attention. It is known that several eukaryotic groups, such as the foraminifers, have very long rDNA clusters, but in most cases the difficulty of amplifying and cloning their rDNAs is mostly due to the fact that their sequences are not only long but also very divergent, which decreases the probability of primer annealing unless taxon-specific primers are used (Habura et al., 2004). Therefore, we think that primer bias is probably much more important than amplicon-length bias, as is also the case for the classical SSU rDNA libraries.
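The primer-mismatch check described above can be illustrated with a minimal Python sketch. It counts the positions at which a template base is not permitted by the primer's IUPAC code and slides the primer along a template to find the best binding site. The function names are hypothetical, the template is assumed to contain only A, C, G and T, and the toy sites in the usage note are made up for illustration; they are not haptophyte sequences and this is not the procedure the authors used.

```python
# IUPAC nucleotide codes mapped to the bases each code can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement table that also handles the ambiguity codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Return the reverse complement of a (possibly degenerate) sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(
        1 for p, s in zip(primer.upper(), site.upper()) if s not in IUPAC[p]
    )


def best_site(primer, template):
    """Slide the primer along the template; return (fewest mismatches, position)."""
    width = len(primer)
    if len(template) < width:
        raise ValueError("template is shorter than the primer")
    return min(
        (count_mismatches(primer, template[i:i + width]), i)
        for i in range(len(template) - width + 1)
    )


if __name__ == "__main__":
    forward = "CTCAARGAYTAAGCCATGCA"   # forward primer as given in the text
    reverse = "TTCTGACTTAGAGGCGTTCAG"  # primer 28S-4R as given in the text
    # A reverse primer anneals to the reverse complement of its own sequence
    # on the sense strand, so that is what is searched for in a template.
    toy_template = "GG" + reverse_complement(reverse) + "CC"  # made-up template
    print(best_site(reverse_complement(reverse), toy_template))
    print(count_mismatches(forward, "CTCAAAGATTAAGCCATGCA"))
```

In practice one would run `best_site` against each aligned reference LSU rDNA sequence of a group and tabulate the mismatch counts per taxon; mismatches near the 3′ end of the primer usually matter most, which this simple count does not weight.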
To compare the libraries at a lower taxonomic level, we studied the four samples in more detail. We calculated the operational taxonomic unit (OTU) composition of the corresponding SSU and SSU + LSU rDNA libraries for each sample, considering as members of the same OTU those sequences with > 98% identity for the SSU rDNA region (Table 2). In that way, we identified a total of 41 OTUs containing more than one sequence, 20 of which were present in both the SSU and SSU + LSU rDNA libraries. The 41 OTUs represented 85% of all the sequences; the remaining 15% (76 sequences) were singletons (unique sequences with less than 98% identity with the rest of our sequences). With the exception of the OTUs related to the haptophytes (see above), all OTUs found in only one type of library contained very few sequences (2–4), very likely reflecting stochastic differences due to the incomplete exploration of the libraries. Our analysis showed that for each of the most abundant OTUs, identical or very closely related clones were retrieved with both markers (Table 2). This was in agreement with the results already reported for the SSU + LSU rDNA-based analysis of bacterial diversity in marine plankton (Suzuki et al., 2001).

All this suggested that the SSU + LSU rDNA approach is able to retrieve a diversity quite similar to that retrieved with the SSU rDNA. This opened the possibility of applying that approach to improve the phylogenetic analysis of environmental sequences that are difficult to place on the basis of the SSU rDNA alone.

Phylogenetic reconstruction using SSU + LSU rDNA concatenations

After confirming the feasibility of constructing SSU + LSU rDNA libraries for eukaryotic diversity description and their overall similarity to the classical SSU rDNA-based approach, we compared the resolving power of these two data sets in phylogenetic analysis.
Given that the concatenation of SSU + LSU rDNA sequences yielded at least 2000 additional conserved aligned sites for phylogenetic reconstruction, it was expected that trees based on that combination of markers should be significantly better resolved than those retrieved from the SSU rDNA alone (Moreira et al., 2007). To test this idea, we completely sequenced 22 new SSU + LSU rDNA clones from our marine surface libraries and added their sequences to an available 111-taxa SSU + LSU rDNA concatenated alignment containing sequences representing the major eukaryotic groups (Moreira et al., 2007). With those new sequences, we enriched the taxonomic sampling of that data set by incorporating sequences from important missing groups, such as the polycystine and acantharean radiolaria, prasinophyte green algae, non-photosynthetic heterokonts and diverse alveolates. We removed four groups composed exclusively of fast-evolving species for which we did not retrieve representatives in our samples: Microsporidia, Diplomonadida, Parabasalida and Amoebozoa. This allowed us to include ~500 additional aligned sites, giving a total of 3247 conserved positions for phylogenetic reconstruction. Maximum likelihood (ML) and Bayesian inference phylogenetic trees reconstructed from the SSU + LSU rDNA alignment (Fig. 2) recovered a number of well-known eukaryotic groups, in agreement with global eukaryotic phylogenies published recently (Medina et al., 2003; Rodríguez-Ezpeleta et al., 2005; 2007; Burki et al., 2007).

Table 2. OTUs retrieved in the eight marine surface SSU and SSU + LSU rDNA libraries.

OTU   Phylum           Taxonomy (group, genus)a
1     Alveolates       Dinoflagellates
2     Alveolates       Dinoflagellates, Karlodinium
3     Alveolates       Dinoflagellates
4     Alveolates       Dinoflagellates
5     Alveolates       Dinoflagellates, Lepidodinium
6     Alveolates       Dinoflagellates, Gyrodinium
7     Alveolates       Dinoflagellates, Pentapharsodinium
8     Alveolates       Dinoflagellates, Gymnodinium
9     Alveolates       Dinoflagellates, Karlodinium
10    Alveolates       Dinoflagellates, Karlodinium
11    Alveolates       Dinoflagellates
12    Alveolates       Dinoflagellates, Gyrodinium rubrum
13    Alveolates       Ciliates, Strombidium
14    Alveolates       Ciliates, Strombidium
15    Alveolates       Ciliates, Myrionecta
16    Alveolates       Ciliates, Strobilidium
17    Alveolates       Ciliates, Strombidium
18    Alveolates       Ciliates, Strombidium
19    Alveolates       Ciliates
20    Alveolates       Alveolates, Group I
21    Alveolates       Alveolates, Group I
22    Alveolates       Alveolates, Group I
23    Alveolates       Alveolates, Group I
24    Heterokonts      Diatoms
25    Heterokonts      Diatoms, Fragilariopsis
26    Heterokonts      Dictyochophytes, Dictyocha
27    Heterokonts      Mast1
28    Heterokonts      Mast1
29    Heterokonts      Mast1
30    Chlorophytes     Prasinophytes, Pyramimonas
31    Chlorophytes     Prasinophytes, Bathycoccus
32    Chlorophytes     Prasinophytes, Micromonas
33    Cryptophytes     Pyrenomonadales, Geminigera
34    Haptophytes      Phaeocystales, Phaeocystis
35    Streptophytes    Unclassified
36    Cercozoa         Thaumatomonads, Protapsis
37    Picobiliphytes   Picobiliphytes
38    Fungi            Ascomycetes
39    Animals          Crustacea
40    Animals          Crustacea
41    Animals          Crustacea

Clones per library (SSU/SSU + LSU) for DH141, DH18, DH114 and Ma131, with a final Singletons row; values not aligned to OTU rows:
0/2 6/4 1/3 2/0 1/2 1/1 6/1 2/0 1/0 2/0 9/2 1/0 1/1 1/0 2/0 9/4 2/0
2/0 1/0 2/3 1/0 0/1 5/1 3/1 1/0 1/0 12/7 2/0 3/0 4/4 9/17 1/1 1/0 7/2
1/0 1/1 1/0 3/7 0/3 3/9 2/3 3/0 0/2 1/0 2/5 1/2 0/2 0/1 14/11 3/0 67/63 8/0 0/2 0/2 0/2 0/2
0/1 0/2 1/0 3/1 2/0 0/2
0/3 2/5 0/1 2/0 0/1 2/0 2/3 0/2 2/3 0/2 7/7 3/2 0/3 6/9
28/14

a. OTUs are affiliated to a genus only if > 97% identical with a genus representative.
Fig. 2. ML phylogenetic tree constructed with a SSU + LSU rDNA data set (129 taxa, 3247 positions). Several long branches were shortened to one fourth (labelled 1/4). Numbers at nodes are ML bootstrap proportions and Bayesian Inference posterior probabilities (only those higher than 50% and 0.50 are shown respectively). [Scale bar: 0.06.]

We found strong support for the monophyly of the Opisthokonta, uniting Metazoa, Fungi and their unicellular close relatives, the Choanozoa, Ichthyosporea and the Nucleariida [100% ML bootstrap proportion (BP) and Bayesian posterior probability (PP) of 1]. The Plantae, which encompass the three primary photosynthetic eukaryotic groups, Viridiplantae, Rhodophyta and Glaucophyta, branched together with moderate to strong support (59–92% ML BP and PP 0.64–0.92, depending on the taxonomic sampling used). This result was very encouraging as only phylogenomic analyses using large multi-protein concatenations have been able to recover their monophyly when haptophyte and cryptophyte sequences were present in the analysis (Burki et al., 2007), as in our data set. The Excavata, represented in our tree by the Euglenozoa and the Jakobida, received high support (90% ML BP and PP 1), in agreement with recent phylogenetic analyses (Cavalier-Smith, 2002; Simpson, 2003; Hampl et al., 2005; Moreira et al., 2007; Rodríguez-Ezpeleta et al., 2007).
Alveolata, Heterokonta and Rhizaria formed monophyletic groups with 100% ML BP and PP 1. Within the Rhizaria, we retrieved a strong relationship between Foraminifera and Radiolaria, in agreement with their inclusion within the Retaria (Moreira et al., 2007). Interestingly, the sisterhood of Rhizaria and Heterokonta received strong support (88% ML BP and PP 1), corroborating the recent multi-protein phylogenomic analysis (~30 000 amino acid positions) that proposed the new eukaryotic super-assemblage S.A.R. (Stramenopiles + Alveolates + Rhizaria) (Burki et al., 2007). Those three groups, comprising the largest diversity of known unicellular eukaryotes, were among those for which we most significantly improved the taxonomic sampling, including five new basal heterokonts, eight alveolates and four rhizarians, which most likely helped to resolve the relationship between them in comparison with previous analyses (Moreira et al., 2007). In contrast, we did not find support for the Chromalveolata, originally defined as containing all the groups bearing plastids derived from red algae, namely the haptophytes, cryptophytes, heterokonts and alveolates (Cavalier-Smith, 1999). In our analysis, cryptophytes branched with Plantae with moderate support (75% ML BP and PP 0.63), which was most likely an artefactual result, as recent phylogenomic analysis supported cryptophytes as the sister group of haptophytes (Burki et al., 2007). The position of haptophytes and centrohelid heliozoa was unsupported in all our analyses. Finally, we retrieved the monophyly of Apusozoa (Ancyromonas sigmoides and Apusomonas sp.) with relatively good support (75% ML and 1 PP).

Testing the SSU + LSU rDNA sequences to retrieve correct taxonomic affiliations for fast-evolving phylotypes: the Myrionecta case study

The improved SSU + LSU rDNA taxonomic sampling covered all the major eukaryotic groups, and the general topology of the trees based on our data set was in agreement with most of the accepted intra- and inter-group relationships (see above). To test the hypothesis that the SSU + LSU rDNA may be useful to infer the taxonomic affiliation of fast-evolving, problematic environmental sequences, we analysed two phylotypes closely related to the red-tide forming ciliate species M. rubra obtained during our environmental diversity surveys. The very fast-evolving SSU rDNA of this ciliate was sequenced only a few years ago (Johnson et al., 2004), and it was shown to affiliate with a number of divergent environmental sequences that were originally thought to be members of a potential novel eukaryotic phylum (López-García et al., 2001; Edgcomb et al., 2002; Savin et al., 2004). It was shown that the extremely rapid evolutionary rate of those SSU rDNA sequences made it impossible to retrieve their correct ciliate affinity in SSU rDNA analyses because of an LBA artefact. Thanks to taxon-rich SSU rDNA phylogenetic analyses including the SSU rDNA portion of our SSU + LSU phylotypes, we recognized that two of them (DH114-1A22 and DH18-4A88) were very closely related to M. rubra (99% sequence identity). We incorporated the two Myrionecta-like phylotypes into our data set and reconstructed phylogenetic trees with only the SSU rDNA part (1221 positions) and with the concatenated SSU + LSU rDNA sequences (3247 positions). In the SSU rDNA tree (Fig. 3), the Myrionecta-like phylotypes branched together with the other long branches, represented by the excavate species (Euglenozoa and Reclinomonas americana), with moderate support (69% ML BP and PP of 0.52), which indicated an LBA artefact. Interestingly, in the tree reconstructed with the two concatenated markers, the Myrionecta-like sequences correctly branched within the Alveolata with maximal support (100% ML BP and PP of 1), close to the ciliates (57% ML BP, not retrieved with the Bayesian method). In addition, the overall topology of the SSU + LSU rDNA tree was not affected by the addition of these two highly divergent sequences (compare Figs 2 and 3), demonstrating the robustness of the SSU + LSU rDNA approach. Therefore, even if the statistical support remained moderate, the concatenation of the two rDNA markers allowed us to identify these highly divergent phylotypes as ciliates (and certainly as alveolates with strong support), which was impossible when using the SSU rDNA information alone.

Fig. 3. ML phylogenetic trees with Myrionecta-like phylotypes (grey boxes). The taxonomic sampling is the same as in Fig. 2, but large groups have been collapsed. On the left, the tree based on the SSU rDNA alone (1200 positions); on the right, the tree based on the SSU + LSU rDNA concatenation (3247 positions). [Scale bars: 0.06.]

Conclusion

For several years, the SSU rDNA has been used as the preferred marker to explore the diversity of microbial eukaryotic communities in a variety of environments, leading to the discovery of a huge hidden diversity. Consequently, the number of environmental sequences has increased exponentially and the need to place them correctly in phylogenetic trees to make taxonomic inferences about the corresponding organisms has become crucial (Berney et al., 2004; Cavalier-Smith, 2004). We have shown here that amplifying the complete rDNA cluster and using a combination of SSU + LSU rDNA sequences allows the retrieval of quite similar diversity profiles while significantly increasing the phylogenetic signal, which is essential for correct taxonomic affiliation of divergent phylotypes without the need for organism isolation. An additional advantage of complete ribosomal DNA cluster analysis is that the existing large SSU rDNA database remains exploitable (using the corresponding SSU rDNA portion of the cluster), while much more information becomes available, not only for phylogenetic or taxonomic purposes. For example, in a comparable analysis where the complete rDNA cluster was amplified from marine bacterioplankton, the authors used the ITS information to confirm the close relationship among members of several clades and to identify recombination between different ITS regions (Suzuki et al., 2001).
Finally, as we have shown, with relatively modest additional sequencing effort, the combined SSU + LSU rDNA analyses provide more reliable eukaryotic phylogenetic trees and taxonomic affiliations of environmental rDNA sequences.

Experimental procedures

DNA samples and library construction

We amplified rRNA genes from four DNA samples of small-size plankton (0.2–5 µm size fraction) collected in various sea surface (25 m) locations in the South Atlantic (DH18, DH114 and DH141) and in the Marmara Sea (Ma131) that were available in the laboratory from previous studies (López-García et al., 2001; Lara et al., 2009). For each sample, SSU and SSU + LSU rDNA libraries were constructed by PCR amplification and cloning. The SSU rDNA was amplified with eukaryote-specific primers, EK-42F (CTCAARGAYTAAGCCATGCA) and EK-1498R (CACCTACGGAAACCTTGTTA), and the PCR products were cloned with the Topo TA cloning kit (Invitrogen) following the manufacturer’s instructions. The protocol was slightly modified for the construction of the SSU + LSU rDNA libraries because of the much larger size of the fragments to be amplified and cloned. The size of a complete eukaryotic ribosomal RNA gene cluster, i.e. the SSU rDNA, the internal transcribed spacers (ITS1 and ITS2), the 5.8S ribosomal RNA gene and the LSU rDNA, is around 5.5 kbp, so we used a long-run enzyme, the Ex TaKaRa (TaKaRa), to avoid premature termination of elongation during the PCR cycles. PCR reactions were carried out under the following conditions: 35 cycles (denaturation at 94°C for 15 s, annealing at 55°C for 30 s, extension at 72°C for 6 min) preceded by 2 min denaturation at 94°C, and followed by 10 min extension at 72°C, using the SSU rDNA-specific primer Euk-42F and the LSU rDNA primer 28S-4R (TTCTGACTTAGAGGCGTTCAG). The libraries were constructed with the Topo TA XL kit (Invitrogen), designed for high cloning efficiency of long PCR products, following the manufacturer’s instructions.
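As a small worked example of the cycling conditions given above for the SSU + LSU rDNA amplification, the following Python sketch encodes the programme and adds up the programmed hold times. The class and function names are hypothetical, and the total ignores temperature ramping between steps, so an actual run on a thermocycler would take somewhat longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    seconds: int


# Programme for the long (about 5.5 kbp) amplicon, as stated in the text.
INITIAL_DENATURATION = Step("initial denaturation", 94, 2 * 60)
CYCLE = (
    Step("denaturation", 94, 15),
    Step("annealing", 55, 30),
    Step("extension", 72, 6 * 60),
)
N_CYCLES = 35
FINAL_EXTENSION = Step("final extension", 72, 10 * 60)


def total_hold_seconds():
    """Sum of programmed hold times, excluding ramping between temperatures."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return (
        INITIAL_DENATURATION.seconds
        + N_CYCLES * per_cycle
        + FINAL_EXTENSION.seconds
    )


def format_hms(total_seconds):
    """Format a number of seconds as h:mm:ss."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


if __name__ == "__main__":
    total = total_hold_seconds()
    print(total, format_hms(total))  # 14895 4:08:15
    # Extension time available per kbp for a 5.5 kbp cluster.
    print(round(CYCLE[2].seconds / 5.5, 1), "s per kbp")
```

The arithmetic is 120 s + 35 × (15 + 30 + 360) s + 600 s = 14 895 s, i.e. a little over four hours of hold time, with roughly one minute of extension per kbp of target.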
Two additional libraries were constructed for the SSU + LSU rDNA marker only, from samples DH22 (5 m depth, South Atlantic) and Ma121 (500 m depth, Marmara Sea).

Diversity analysis

For each SSU and SSU + LSU rDNA library, we selected 48 positive clones and sequenced the 5′ region of their SSU rRNA gene using the eukaryotic primer Euk-42F. For preliminary phylogenetic affiliation, the sequences were compared by BLAST against the nr GenBank database to identify their closest relatives. Saturation analysis of our libraries was performed with those partial sequences using the software DOTUR (Schloss and Handelsman, 2005). OTUs were identified with CodonCode Aligner (Licor) by clustering sequences with a minimum identity of 98%.

SSU + LSU rDNA sequence data set construction

We sequenced the SSU and the LSU rRNA genes of 22 positive clones, selected to represent all the groups retrieved in our libraries, using a set of internal primers, including 28S-568F (TTGAAACACGGACCAAGGAG), 28S-803R (ACTTCGGAGGGAACCAGCTA), 28S-1909F (GGAGTAACTATGACTCTCT), 28S-1965R (TCTCGTTAATCCATTCATGC) and 28S-4R. In addition, we amplified and sequenced the SSU and LSU rRNA genes from a culture of Cafeteria roenbergensis isolated by Dr E. Lara from marine sediment (1200 m depth) of the Cinarcik Basin in the Marmara Sea (Turkey) (E. Lara, pers. comm.). We updated the available data set of concatenated SSU + LSU rDNA sequences (Moreira et al., 2007) with new sequences available in GenBank, Gymnodinium aureolum (DQ779991), Chromera velia (EU106870) and Hammondia hammondi (AF101070), and with sequences from two complete genome projects, Ostreococcus lucimarinus (Palenik et al., 2007) and Hyaloperonospora parasitica (http://vmd.vbi.vt.edu/browse/cgi-bin/browserUI.cgi). The LSU rDNA sequence of Blastocystis hominis was generously provided by Drs F. Delbac and C. Vivares. This sequence was obtained through the B. hominis genome project conducted by the Genoscope (http://www.genoscope.cns.fr/spip/Blastocystis-hominis-whole-genome.html). We checked manually for chimeric SSU rDNA and SSU + LSU rDNA sequences by independent BLAST searches (Altschul et al., 1997) of the 5′ and the 3′ halves of the SSU rDNA and by comparing the phylogenetic trees obtained with SSU and LSU rDNA independently. We detected two potential chimeric sequences in this way, which were removed from our analyses. Sequences were deposited in GenBank under Accession Numbers FJ032645–FJ032660 and FJ032662–FJ032695.

Phylogenetic analysis

All new sequences were incorporated into the available data set of concatenated SSU + LSU rDNA sequences and aligned using the ED program of the package MUST (Philippe, 1993). Ambiguously aligned regions and gaps were excluded from the phylogenetic analyses. The ML phylogenetic analyses were carried out with the program Treefinder (Jobb et al., 2004) and bootstrap values were calculated from 1000 replicates. Bayesian trees were inferred using MrBayes 3.1. Markov chain Monte Carlo searches were run with four chains for 1 000 000 generations, with trees sampled every 100 generations. The first 5000 trees were discarded as ‘burn-in’, keeping only trees generated well after chain parameters stabilized. For both the ML and the Bayesian methods, we applied a GTR + Γ + I model of nucleotide substitution, taking into account a proportion of invariable sites and a Γ-shaped distribution of the rates of substitution among sites with four rate categories. Fast-evolving groups (Microsporidia, Parabasalida, Diplomonadida and Amoebozoa), for which we did not retrieve closely related phylotypes in our samples, were removed from our analyses. Sequence alignments are available upon request.
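The Bayesian sampling settings described above (one million generations, one tree sampled every 100 generations, the first 5000 trees discarded) determine how many trees contribute to the posterior summary. The short Python sketch below only works through that arithmetic; the function name `mcmc_sampling_summary` is hypothetical, and the count ignores the starting tree that some programs also record, so the actual number in a given output file may differ by one.

```python
def mcmc_sampling_summary(generations, sample_every, burnin_trees):
    """Count sampled and retained trees for a single MCMC run.

    Assumes one tree is stored every `sample_every` generations and that
    the initial (generation 0) tree is not counted.
    """
    sampled = generations // sample_every
    kept = sampled - burnin_trees
    return {
        "sampled": sampled,
        "kept": kept,
        "burnin_fraction": burnin_trees / sampled,
    }


# Settings quoted in the text.
summary = mcmc_sampling_summary(
    generations=1_000_000, sample_every=100, burnin_trees=5000
)
print(summary)
```

Under these assumptions the run stores 10 000 trees, of which the first half is discarded as burn-in and the remaining 5000 are used to build the consensus.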
© 2009 Society for Applied Microbiology and Blackwell Publishing Ltd, Environmental Microbiology

Acknowledgements

This work was supported by a grant from the French Agence Nationale de la Recherche (ANR JC05-44674) to D.M. We thank E. Lara for providing the C. roenbergensis culture, and C. Vivares, F. Delbac and the Genoscope for providing the B. hominis sequences. We thank two anonymous reviewers for constructive suggestions.

References

Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402.
Barns, S.M., Delwiche, C.F., Palmer, J.D., and Pace, N.R. (1996) Perspectives on archaeal diversity, thermophily and monophyly from environmental rRNA sequences. Proc Natl Acad Sci USA 93: 9188–9193.
Berney, C., Fahrni, J., and Pawlowski, J. (2004) How many novel eukaryotic ‘kingdoms’? Pitfalls and limitations of environmental DNA surveys. BMC Biol 2: 13.
Burki, F., Shalchian-Tabrizi, K., Minge, M., Skjaeveland, A., Nikolaev, S.I., Jakobsen, K.S., and Pawlowski, J. (2007) Phylogenomics reshuffles the eukaryotic supergroups. PLoS ONE 2: e790.
Cavalier-Smith, T. (1999) Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origin and the eukaryote family tree. J Eukaryot Microbiol 46: 347–366.
Cavalier-Smith, T. (2002) The phagotrophic origin of eukaryotes and phylogenetic classification of Protozoa. Int J Syst Evol Microbiol 52: 297–354.
Cavalier-Smith, T. (2004) Only six kingdoms of life. Proc R Soc Lond B Biol Sci 271: 1251–1262.
Dawson, S.C., and Pace, N.R. (2002) Novel kingdom-level eukaryotic diversity in anoxic environments. Proc Natl Acad Sci USA 99: 8324–8329.
Díez, B., Pedrós-Alió, C., and Massana, R. (2001) Study of genetic diversity of eukaryotic picoplankton in different oceanic regions by small-subunit rRNA gene cloning and sequencing. Appl Environ Microbiol 67: 2932–2941.
Edgcomb, V.P., Kysela, D.T., Teske, A., De Vera Gomez, A., and Sogin, M.L. (2002) Benthic eukaryotic diversity in the Guaymas Basin hydrothermal vent environment. Proc Natl Acad Sci USA 99: 7658–7662.
Guillou, L., Eikrem, W., Chretiennot-Dinet, M.J., Le Gall, F., Massana, R., Romari, K., et al. (2004) Diversity of picoplanktonic prasinophytes assessed by direct nuclear SSU rDNA sequencing of environmental samples and novel isolates retrieved from oceanic and coastal marine ecosystems. Protist 155: 193–214.
Habura, A., Pawlowski, J., Hanes, S.D., and Bowser, S.S. (2004) Unexpected foraminiferal diversity revealed by small-subunit rDNA analysis of Antarctic sediment. J Eukaryot Microbiol 51: 173–179.
Hampl, V., Horner, D.S., Dyal, P., Kulda, J., Flegr, J., Foster, P.G., and Embley, T.M. (2005) Inference of the phylogenetic position of oxymonads based on nine genes: support for Metamonada and Excavata. Mol Biol Evol 22: 2508–2518.
Hugenholtz, P., Goebel, B.M., and Pace, N.R. (1998) Impact of culture-independent studies on the emerging phylogenetic view of bacterial diversity. J Bacteriol 180: 4765–4774.
Jobb, G., von Haeseler, A., and Strimmer, K. (2004) TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol 4: 18.
Johnson, M.D., Tengs, T., Oldach, D.W., Delwiche, C.F., and Stoecker, D.K. (2004) Highly divergent SSU rRNA genes found in the marine ciliates Myrionecta rubra and Mesodinium pulex. Protist 155: 347–359.
Lara, E., Moreira, D., Vereshchaka, A., and López-García, P. (2009) Pan-oceanic distribution of new highly diverse clades of deep-sea diplonemids. Environ Microbiol 11: 47–55.
López-García, P., Rodríguez-Valera, F., Pedrós-Alió, C., and Moreira, D. (2001) Unexpected diversity of small eukaryotes in deep-sea Antarctic plankton. Nature 409: 603–607.
Medina, M., Collins, A.G., Taylor, J.W., Valentine, J.W., Lipps, J.H., Amaral-Zettler, L., and Sogin, M.L. (2003) Phylogeny of Opisthokonta and the evolution of multicellularity and complexity in Fungi and Metazoa. Int J Astrobiol 2: 203–211.
Moon-van der Staay, S.Y., De Wachter, R., and Vaulot, D. (2001) Oceanic 18S rDNA sequences from picoplankton reveal unsuspected eukaryotic diversity. Nature 409: 607–610.
Moreira, D., von der Heyden, S., Bass, D., López-García, P., Chao, E., and Cavalier-Smith, T. (2007) Global eukaryote phylogeny: combined small- and large-subunit ribosomal DNA trees support monophyly of Rhizaria, Retaria and Excavata. Mol Phylogenet Evol 44: 255–266.
Not, F., Valentin, K., Romari, K., Lovejoy, C., Massana, R., Tobe, K., et al. (2007) Picobiliphytes: a marine picoplanktonic algal group with unknown affinities to other eukaryotes. Science 315: 253–255.
Palenik, B., Grimwood, J., Aerts, A., Rouze, P., Salamov, A., Putnam, N., et al. (2007) The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. Proc Natl Acad Sci USA 104: 7705–7710.
Philippe, H. (1993) MUST, a computer package of management utilities for sequences and trees. Nucleic Acids Res 21: 5264–5272.
Philippe, H., and Adoutte, A. (1998) The molecular phylogeny of Eukaryota: solid facts and uncertainties. In Evolutionary Relationships among Protozoa. Coombs, G., Vickerman, K., Sleigh, M., and Warren, A. (eds). London, UK: Chapman & Hall, pp. 25–56.
Philippe, H., Lopez, P., Brinkmann, H., Budin, K., Germot, A., Laurent, J., et al. (2000) Early-branching or fast-evolving eukaryotes? An answer based on slowly evolving positions. Proc R Soc Lond B Biol Sci 267: 1213–1221.
Rodríguez-Ezpeleta, N., Brinkmann, H., Burey, S.C., Roure, B., Burger, G., Loffelhardt, W., et al. (2005) Monophyly of primary photosynthetic eukaryotes: green plants, red algae, and glaucophytes. Curr Biol 15: 1325–1330.
Rodríguez-Ezpeleta, N., Brinkmann, H., Burger, G., Roger, A.J., Gray, M.W., Philippe, H., and Lang, B.F. (2007) Toward resolving the eukaryotic tree: the phylogenetic positions of jakobids and cercozoans. Curr Biol 17: 1420–1425.
Savin, M.C., Martin, J.L., LeGresley, M., Giewat, M., and Rooney-Varga, J. (2004) Plankton diversity in the Bay of Fundy as measured by morphological and molecular methods. Microb Ecol 48: 51–65.
Schloss, P.D., and Handelsman, J. (2005) Introducing DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl Environ Microbiol 71: 1501–1506.
Simpson, A.G. (2003) Cytoskeletal organization, phylogenetic affinities and systematics in the contentious taxon Excavata (Eukaryota). Int J Syst Evol Microbiol 53: 1759–1777.
Stiller, J., and Hall, B. (1999) Long-branch attraction and the rDNA model of early eukaryotic evolution. Mol Biol Evol 16: 1270–1279.
Stoeck, T., and Epstein, S. (2003) Novel eukaryotic lineages inferred from small-subunit rRNA analyses of oxygen-depleted marine environments. Appl Environ Microbiol 69: 2657–2663.
Suzuki, M.T., Beja, O., Taylor, L.T., and Delong, E.F. (2001) Phylogenetic analysis of ribosomal RNA operons from uncultivated coastal marine bacterioplankton. Environ Microbiol 3: 323–331.
Viprey, M., Guillou, L., Ferreol, M., and Vaulot, D. (2008) Wide genetic diversity of picoplanktonic green algae (Chloroplastida) in the Mediterranean Sea uncovered by a phylum-biased PCR approach. Environ Microbiol 10: 1804–1822.