Category Archives: comparative

NSF postdoc opportunity for research using biological collections

Earlier this year the NSF released a postdoc opportunity for research using biological collections. In particular, these can be strain collections and stock collections. The US Culture Collection Network is a Research Coordination Network that brings together many collaborating culture collections. You can find many of the U.S. living collections there, including fungal centers like the Phaff Yeast Collection and the Fungal Genetics Stock Center. The Gilbertson Mycological Herbarium at U Arizona, under Elizabeth Arnold’s leadership, has developed a rich collection of endophytic fungi and would be another excellent environment for working with these resources. Kyria Boundy-Mills, the curator of the Phaff collection, has also expressed interest in either hosting or helping to work with a postdoc on this. There is tremendous fungal biodiversity available in these and other culture collections, so this seems like a great chance to tap into it.
This would be a great opportunity to link work in the 1000 Fungal Genomes project with sampling from culture collections (not just sequencing, but growing and characterizing growth and carbon source utilization, and integrating that with predictions made from genome comparisons). If this is interesting to you, do get in touch with some of the curators at these collections, and also with my lab; I expect many other labs would be interested in hosting someone to work on questions that take advantage of these living collections of fungi.
Proposals are to be submitted by potential postdocs. The submitter must be a US citizen or US permanent resident. The next deadline is November 3, 2015. Funding total for the program is $8 million, with 40 awards anticipated, each for up to two years. Here’s some key text from the solicitation:

Competitive Area 2. Postdoctoral Research Fellowships Using Biological Collections.

Biological research collections represent the documented scientific history of life on Earth, and the U.S. museum community alone curates over a billion specimens ranging from bacteria to plants, insects and vertebrates, as well as fossils. Across the globe, collections represent critical infrastructure and support essential research activities in biology and its related fields. Scientists, government agencies, industry and citizens utilize collections to document and understand evolution and biodiversity, study global change, formulate advice on conservation planning, educate the general public, improve interactions between sciences, and devise new practical applications from science to every day life. New technologies supported by NSF in digitization, such as the Advancing Digitization of Biodiversity Collections (ADBC) program, are making collections and their associated data, whether they are physical specimens, text, images, sounds, or data tables, searchable in online databases. Despite this clear progress in improving access to physical specimens and their associated metadata, collections remain under-utilized for answering contemporary questions about fundamental aspects of biological processes. Thus, collections are poised to become a critical resource for developing transformative approaches to address key questions in biology and potentially develop applications that extend biology to physical, mathematical, engineering and social sciences. This postdoctoral track seeks transformative approaches that use biological collections in highly innovative ways to address grand challenges in biology. Priority may be given to applicants who integrate biological collections and associated resources with other types of data in an effort to forge new insight into areas traditionally funded by BIO. 
Examples of key questions in biology of interest include, but are not limited to, links between genotype and phenotype, evolutionary developmental biology, comparative approaches in functional and developmental neurobiology, and the biophysics of nanostructures. Using collections as a resource for grand challenge questions in biology is expected to present new opportunities to advance understanding of biological processes and systems, inspiring new discoveries in areas with relevance to other disciplines with overlapping interests in biological systems. Applicants must document access to the selected collection(s) in the research and training plan.

Some recent fungal and oomycete genome papers

A few papers covering some published genomes you should definitely read if you have the chance.

  • Youssef NH, Couger MB, Struchtemeyer CG, Liggenstoffer AS, Prade RA, Najar FZ, Atiyeh HK, Wilkins MR, & Elshahed MS (2013). The Genome of the Anaerobic Fungus Orpinomyces sp. Strain C1A Reveals the Unique Evolutionary History of a Remarkable Plant Biomass Degrader. Applied and environmental microbiology, 79 (15), 4620-34 PMID: 23709508
    Describes the first published genome of a Neocallimastigomycota fungus, which resides in the rumen. Cool findings related to lignocellulose degradation pathways and to the basic biology of early diverging fungi, which have an intact flagellar apparatus.
  • Bushley KE, Raja R, Jaiswal P, Cumbie JS, Nonogaki M, Boyd AE, Owensby CA, Knaus BJ, Elser J, Miller D, Di Y, McPhail KL, & Spatafora JW (2013). The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster. PLoS Genetics, 9 (6) PMID: 23818858
    Describes the genome of a pathogen of beetle larvae (related to Cordyceps). This fungus is important because it produces the immunosuppressive drug cyclosporin as a secondary metabolite. Analysis of the complete secondary metabolite pathways in the genome helps shed light on the origin of this and other secondary metabolite gene clusters.
  • Schardl CL, Young CA, Hesse U, Amyotte SG, Andreeva K, Calie PJ, Fleetwood DJ, Haws DC, Moore N, Oeser B, Panaccione DG, Schweri KK, Voisey CR, Farman ML, Jaromczyk JW, Roe BA, O’Sullivan DM, Scott B, Tudzynski P, An Z, Arnaoudova EG, Bullock CT, Charlton ND, Chen L, Cox M, Dinkins RD, Florea S, Glenn AE, Gordon A, Güldener U, Harris DR, Hollin W, Jaromczyk J, Johnson RD, Khan AK, Leistner E, Leuchtmann A, Li C, Liu J, Liu J, Liu M, Mace W, Machado C, Nagabhyru P, Pan J, Schmid J, Sugawara K, Steiner U, Takach JE, Tanaka E, Webb JS, Wilson EV, Wiseman JL, Yoshida R, & Zeng Z (2013). Plant-symbiotic fungi as chemical engineers: multi-genome analysis of the clavicipitaceae reveals dynamics of alkaloid loci. PLoS Genetics, 9 (2) PMID: 23468653 

    A very rich and detailed paper, this presents a gold mine of complete genome data for 15 species along with secondary metabolite profiling. The data include genomes of 10 epichloae fungi that are endophytes of grasses, three Claviceps species (ergot fungi), a morning-glory symbiont and a bamboo pathogen. By combining pathway analyses of the genomes with profiling of alkaloid production, the authors were able to link clusters to products in many cases. This is a rich and useful paper for anyone working on secondary metabolites, and it sets the standard for how a biological question can be answered by genome sequencing of a clade of related species.

  • Wicker T, Oberhaensli S, Parlange F, Buchmann JP, Shatalina M, Roffler S, Ben-David R, Doležel J, Simková H, Schulze-Lefert P, Spanu PD, Bruggmann R, Amselem J, Quesneville H, van Themaat EV, Paape T, Shimizu KK, & Keller B (2013). The wheat powdery mildew genome shows the unique evolution of an obligate biotroph. Nature Genetics PMID: 23852167

    Genome of the wheat pathogen Blumeria graminis f.sp. tritici. This paper includes identification and analysis of effector genes and dates the emergence of the pathogen relative to the domestication and diversification of wheat.
  • Jiang RH, de Bruijn I, Haas BJ, Belmonte R, Löbach L, Christie J, van den Ackerveken G, Bottin A, Bulone V, Díaz-Moreno SM, Dumas B, Fan L, Gaulin E, Govers F, Grenville-Briggs LJ, Horner NR, Levin JZ, Mammella M, Meijer HJ, Morris P, Nusbaum C, Oome S, Phillips AJ, van Rooyen D, Rzeszutek E, Saraiva M, Secombes CJ, Seidl MF, Snel B, Stassen JH, Sykes S, Tripathy S, van den Berg H, Vega-Arreguin JC, Wawra S, Young SK, Zeng Q, Dieguez-Uribeondo J, Russ C, Tyler BM, & van West P (2013). Distinctive Expansion of Potential Virulence Genes in the Genome of the Oomycete Fish Pathogen Saprolegnia parasitica. PLoS Genetics, 9 (6) PMID: 23785293

    The genome of the fish pathogen and oomycete Saprolegnia provides additional perspective on this diverse group of organisms and on the evolution of metabolism and host-associated lifestyles.
  • Aylward FO, Burnum-Johnson KE, Tringe SG, Teiling C, Tremmel DM, Moeller JA, Scott JJ, Barry KW, Piehowski PD, Nicora CD, Malfatti SA, Monroe ME, Purvine SO, Goodwin LA, Smith RD, Weinstock GM, Gerardo NM, Suen G, Lipton MS, & Currie CR (2013). Leucoagaricus gongylophorus produces diverse enzymes for the degradation of recalcitrant plant polymers in leaf-cutter ant fungus gardens. Applied and environmental microbiology, 79 (12), 3770-8 PMID: 23584789
    Genome of the ant-farmed fungus Leucoagaricus. This paper presents a draft genome assembly, a useful step in understanding the fascinating symbiosis between ants and their cultivated fungi.

Presents for the holidays – Plant pathogen genomes

Though a bit cliché, I think the metaphor of “presents under the tree” for some new plant pathogen genomes summarized in 4 recent publications is still too good to resist. There are 4 papers in this week’s Science that will certainly make a collection of plant pathogen biologists very happy. There are also treats for general purpose genome biologists, with descriptions of next generation/2nd generation sequencing technologies, assembly methods, and comparative genomics. There is much more inside these papers than I am summarizing, so I urge you to take a look if you have access to these pay-for-view articles, or to contact the authors for reprints.


These include the genome of the biotrophic oomycete and Arabidopsis pathogen Hyaloperonospora arabidopsidis (Baxter et al). While preserving the health of Arabidopsis is not a major concern of most researchers, this is an excellent model system for studying plant-microbe interactions. The genome sequence of Hpa provides a look at specialization as a biotroph. The authors found a reduction (relative to other oomycete species) in host-targeted degradative enzymes and also in necrosis factors, suggesting specialization to a biotrophic lifestyle from a necrotrophic ancestor. Hpa also does not make zoospores with flagella like its relatives, and sequence searches for 90 flagella-related genes turned up no identifiable homologs.

While the technical aspects of sequencing are less glamorous now, the authors used Sanger and Illumina sequencing to complete this genome at 45X sequencing coverage, with an estimated genome size of 80 Mb. They ran Velvet on the paired-end Illumina data to produce a 56 Mb assembly and PCAP on the Sanger reads (8X coverage) to produce a 70 Mb assembly; the two assemblies were then merged with an ad hoc procedure that relied on BLAT to scaffold and link contigs across the two assembled datasets. They used CEGMA and several in-house pipelines to annotate the genes in this assembly. Synteny analysis was completed with PHRINGE. A relatively large percentage (17%) of the genome fell into unclassified ‘Unknown repetitive sequence’, larger than in P. sojae (12%), so there remain a lot of mystery elements of unknown function in these genomes. If you jump ahead to the Blumeria genome article you’ll see this is still peanuts compared to Blumeria’s genome (64%). The largest known transposable element family in Hpa was the LTR/Gypsy element. Of interest to those following the oomycete literature is the relative abundance of RXLR-containing proteins, which are typically effectors: there were still quite a few (~150 instead of the ~500 seen in some Phytophthora genomes).
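As a back-of-the-envelope check on the numbers above, coverage is just total sequenced bases divided by genome size. The sketch below uses only the figures quoted in this summary (45X, 80 Mb, and the 56 Mb and 70 Mb assemblies); the function names are mine and nothing here comes from the authors' pipeline.

```python
def total_bases(coverage, genome_size_bp):
    """Raw sequence (in bases) implied by a coverage depth and a genome size."""
    return coverage * genome_size_bp


def fraction_assembled(assembly_bp, genome_size_bp):
    """Share of the estimated genome size represented in an assembly."""
    return assembly_bp / genome_size_bp


GENOME_SIZE = 80_000_000  # estimated Hpa genome size (80 Mb), as quoted above

raw = total_bases(45, GENOME_SIZE)
velvet_fraction = fraction_assembled(56_000_000, GENOME_SIZE)
pcap_fraction = fraction_assembled(70_000_000, GENOME_SIZE)

print(f"45X of an 80 Mb genome is about {raw / 1e9:.1f} Gb of sequence")
print(f"Velvet assembly covers {velvet_fraction:.1%} of the estimated size")
print(f"PCAP assembly covers {pcap_fraction:.1%} of the estimated size")
```

Neither assembly reaches the estimated genome size, which is what you would expect if repeats are being collapsed or left out.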



A second paper covers the genome of the barley powdery mildew Blumeria graminis f.sp. hordei and two close relatives: Erysiphe pisi, a pea pathogen, and Golovinomyces orontii, an Arabidopsis thaliana pathogen (Spanu et al). These are Ascomycetes in the class Leotiomycetes, where there are only a handful of genomes. Overall this paper tells a story about how obligate biotrophy has shaped the genome. What I found most striking is depicted in Figure 1. It shows that the typical genome size for (so far sampled) Pezizomycotina Ascomycetes is in the ~40-50 Mb range, whereas these powdery mildews have significantly larger genomes in the ~120-160 Mb range. These large genomes are primarily composed of transposable elements (TE), which make up ~65% of the genome. However, the protein-coding gene content is still only on the order of ~6000 genes, which is actually quite low for a filamentous Ascomycete, suggesting that despite genome expansion the functional potential shows signs of reduction. The obligate lifestyle of the powdery mildews suggested that the species had lost some autotrophic genes, and the authors further cataloged a set of ~100 genes that are missing in the mildews but found in the core ascomycete genomes. They also document other genome cataloging results, such as only a few secondary metabolite genes, although these are typically in much higher copy numbers in other filamentous ascomycetes (e.g. Aspergillus). I still don’t have a clear picture of how this gene content differs from that of their closest sequenced neighbors, the other Leotiomycetes Botrytis cinerea and Sclerotinia sclerotiorum, which are on the order of 12-14k genes. Since the E. pisi and G. orontii data are not yet available in GenBank or at the MPI site it is hard to figure this out just yet; I presume they will be available soon.

More techie details: the authors used Sanger and second generation technologies and the Celera assembler to build the assemblies from 120X coverage sequence from a hybrid of sequencing technologies. Interestingly, for the E. pisi and G. orontii assemblies the MPI site lists genome sizes closer to 65 Mb in the first drafts of the assembly with 454 data, so I guess you can see what happens when the Newbler assembler overcollapses repeats. They also used a customized automated annotation with some ab initio gene finders (not sure if there was custom training or not for the various gene finders) and estimated the coverage with the CEGMA genes. I do think a fungal-specific set of core conserved genes would be in order here as a better comparison set. Some nice data like this already exist in a few databases, but it would be interesting to see whether CEGMA represents a broad enough core set to estimate genome coverage compared with a fungal-derived CEGMA-like set.
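The completeness estimate mentioned above boils down to asking what fraction of a core gene set is recovered in the annotation. Here is a minimal sketch of that idea with made-up gene identifiers. The real CEGMA works from profile searches against the genome and has its own output format, which is not reproduced here.

```python
def completeness(core_genes, found_genes):
    """Return (fraction of core genes recovered, sorted list of missing ones)."""
    core = set(core_genes)
    if not core:
        raise ValueError("core gene set is empty")
    hits = core & set(found_genes)
    missing = sorted(core - hits)
    return len(hits) / len(core), missing


# Hypothetical identifiers, for illustration only
core_set = ["core001", "core002", "core003", "core004"]
annotated = ["core001", "core003", "core004", "lineage_specific_17"]

frac, missing = completeness(core_set, annotated)
print(f"{frac:.0%} of core genes found; missing: {missing}")
```

Swapping in a fungal-specific core set would only change `core_set`, which is the comparison I am suggesting would be worth making.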


A third paper in this issue covers genome evolution in the massively successful pathogen Phytophthora infestans through resequencing of six genomes of related species to track the recent evolutionary history of the pathogen (Raffaele et al). The authors used high-throughput Illumina sequencing to sequence the genomes of closely related species. They found a variety of differences among genes in the pathogen; among the findings, “genes in repeat-rich regions show[ed] higher rates of structural polymorphisms and positive selection”. They found that 14% of the genes experienced positive selection, and these included many (300 out of ~800) of the annotated effector genes. P. infestans also showed high rates of change in the repeat-rich regions, which is also where a lot of the disease-implicated genes are located, supporting the hypothesis of repeat-driven expansion of the genome (as described in the 2009 genome paper). The paper generates a lot of very nice data for follow-up by helping to prioritize genes with fast rates of evolution, or with profiles suggesting they have been shaped by recent adaptive evolutionary forces, as candidates for the mechanisms of pathogenicity in this devastating plant pathogen.
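To put the effector numbers in perspective, here is a quick calculation using only the approximate figures quoted above (300 of ~800 effectors versus 14% of all genes). It is plain arithmetic, not a statistical test, and the function name is my own.

```python
def fold_enrichment(hits_in_subset, subset_size, background_rate):
    """Rate within a subset and its ratio to a background rate."""
    subset_rate = hits_in_subset / subset_size
    return subset_rate, subset_rate / background_rate


# Approximate counts from the summary above
effector_rate, fold = fold_enrichment(300, 800, 0.14)

print(f"effectors under positive selection: {effector_rate:.1%}")
print(f"fold enrichment over the genome-wide rate: {fold:.1f}x")
```

Note that the genome-wide 14% already includes the effectors, so this slightly understates the contrast with non-effector genes. A proper test (hypergeometric, say) would need the actual gene counts from the paper.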


A fourth paper describes the genome sequencing of Sporisorium reilianum, a biotrophic pathogen that is closely related to the corn smut Ustilago maydis (Schirawski et al). Both species infect maize, but while U. maydis induces tumors in the ears, leaves, and tassels of corn, S. reilianum infection is limited to the tassels and ears. The authors used comparative biology and genome sequencing to try to tease out what genetic components may be responsible for the phenotypic differences. The comparison revealed a relatively syntenic genome but also found 43 regions in U. maydis that represent highly divergent sequence between the species. These regions contained a disproportionate number of secreted proteins, indicating that these secreted proteins have been evolving at a much faster rate and may be important for the distinct differences in biology. The chromosome ends of U. maydis were also found to contain up to 20 additional genes in the sub-telomeric regions that were unique to U. maydis. Another fantastic finding from this sequencing and comparison is more of the history behind the lack of RNAi genes in U. maydis. It was a striking feature of the 2006 genome sequence that the genome lacked a functioning copy of Dicer. However, knocking out this gene in S. reilianum failed to show a developmental or virulence phenotype, suggesting it is dispensable for those functions, so I think there will be some follow-ups to explore (do either of these species make small RNAs, do they produce any that are translocated to the host, etc). The rest of the analyses covered in the manuscript identify the specific loci that differ between the two species. Interestingly, a lot of the identified loci were the same ones found as islands of secreted proteins in the first genome analysis paper, so the comparative approach was another way to get to the genes that may be important for virulence when the two organisms have different phenotypes.
This is certainly the approach that has also been taken in other plant pathogens (e.g. Mycosphaerella, Fusarium) and animal pathogens (Candida, Cryptococcus, Coccidioides), but it requires sampling species at an appropriate distance so that the number of changes hasn’t saturated our ability to reconstruct the history at either the gene order/content or codon level.

Without the comparison of an outgroup species it is impossible to determine whether U. maydis gained functions related to the phenotypes observed here, through these speculated evolutionary changes involving new genes and newly evolved functions, or whether S. reilianum lost functionality that was present in their common ancestor. However, this paper is an example of how a comparative approach can identify testable hypotheses for the origins of pathogenicity genes.
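The outgroup argument can be written down as a tiny parsimony rule. The sketch below is my own illustration, not anything from the paper: given presence or absence of a gene in the two sister species and, optionally, in a single outgroup, it says which change is the simplest explanation. A real analysis would use more than one outgroup and an explicit gain/loss model.

```python
def classify(in_umaydis, in_sreilianum, in_outgroup=None):
    """Simplest explanation for a presence/absence difference between sister species."""
    if in_umaydis == in_sreilianum:
        return "no difference to explain"
    if in_outgroup is None:
        return "ambiguous: gain in one species or loss in the other"
    present_species = "U. maydis" if in_umaydis else "S. reilianum"
    absent_species = "S. reilianum" if in_umaydis else "U. maydis"
    # If the outgroup has the gene, the ancestor probably did too, so the absence is a loss.
    if in_outgroup:
        return f"loss in {absent_species}"
    # If the outgroup lacks it, the presence is most simply a gain.
    return f"gain in {present_species}"


without_outgroup = classify(True, False)
outgroup_lacks_gene = classify(True, False, in_outgroup=False)
outgroup_has_gene = classify(True, False, in_outgroup=True)

print(without_outgroup)
print(outgroup_lacks_gene)
print(outgroup_has_gene)
```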


Hope everyone has a chance to enjoy the holidays and to unwrap and spend some time looking at these and other science gems over the coming weeks.


Baxter, L., Tripathy, S., Ishaque, N., Boot, N., Cabral, A., Kemen, E., Thines, M., Ah-Fong, A., Anderson, R., Badejoko, W., Bittner-Eddy, P., Boore, J., Chibucos, M., Coates, M., Dehal, P., Delehaunty, K., Dong, S., Downton, P., Dumas, B., Fabro, G., Fronick, C., Fuerstenberg, S., Fulton, L., Gaulin, E., Govers, F., Hughes, L., Humphray, S., Jiang, R., Judelson, H., Kamoun, S., Kyung, K., Meijer, H., Minx, P., Morris, P., Nelson, J., Phuntumart, V., Qutob, D., Rehmany, A., Rougon-Cardoso, A., Ryden, P., Torto-Alalibo, T., Studholme, D., Wang, Y., Win, J., Wood, J., Clifton, S., Rogers, J., Van den Ackerveken, G., Jones, J., McDowell, J., Beynon, J., & Tyler, B. (2010). Signatures of Adaptation to Obligate Biotrophy in the Hyaloperonospora arabidopsidis Genome Science, 330 (6010), 1549-1551 DOI: 10.1126/science.1195203

Spanu, P., Abbott, J., Amselem, J., Burgis, T., Soanes, D., Stuber, K., Loren van Themaat, E., Brown, J., Butcher, S., Gurr, S., Lebrun, M., Ridout, C., Schulze-Lefert, P., Talbot, N., Ahmadinejad, N., Ametz, C., Barton, G., Benjdia, M., Bidzinski, P., Bindschedler, L., Both, M., Brewer, M., Cadle-Davidson, L., Cadle-Davidson, M., Collemare, J., Cramer, R., Frenkel, O., Godfrey, D., Harriman, J., Hoede, C., King, B., Klages, S., Kleemann, J., Knoll, D., Koti, P., Kreplak, J., Lopez-Ruiz, F., Lu, X., Maekawa, T., Mahanil, S., Micali, C., Milgroom, M., Montana, G., Noir, S., O’Connell, R., Oberhaensli, S., Parlange, F., Pedersen, C., Quesneville, H., Reinhardt, R., Rott, M., Sacristan, S., Schmidt, S., Schon, M., Skamnioti, P., Sommer, H., Stephens, A., Takahara, H., Thordal-Christensen, H., Vigouroux, M., Wessling, R., Wicker, T., & Panstruga, R. (2010). Genome Expansion and Gene Loss in Powdery Mildew Fungi Reveal Tradeoffs in Extreme Parasitism Science, 330 (6010), 1543-1546 DOI: 10.1126/science.1194573

Raffaele, S., Farrer, R., Cano, L., Studholme, D., MacLean, D., Thines, M., Jiang, R., Zody, M., Kunjeti, S., Donofrio, N., Meyers, B., Nusbaum, C., & Kamoun, S. (2010). Genome Evolution Following Host Jumps in the Irish Potato Famine Pathogen Lineage Science, 330 (6010), 1540-1543 DOI: 10.1126/science.1193070

Schirawski, J., Mannhaupt, G., Munch, K., Brefort, T., Schipper, K., Doehlemann, G., Di Stasio, M., Rossel, N., Mendoza-Mendoza, A., Pester, D., Muller, O., Winterberg, B., Meyer, E., Ghareeb, H., Wollenberg, T., Munsterkotter, M., Wong, P., Walter, M., Stukenbrock, E., Guldener, U., & Kahmann, R. (2010). Pathogenicity Determinants in Smut Fungi Revealed by Genome Comparison Science, 330 (6010), 1546-1548 DOI: 10.1126/science.1195330

A mushroom on the cover

I’ll indulge a bit here and happily point to the cover of this week’s PNAS, with an image of fruiting Coprinopsis cinerea mushrooms, which refers to our article on the genome sequence of this important model fungus. You should also enjoy the commentary article from John Taylor and Chris Ellison, which provides a summary of some of the high points in the paper.

Coprinopsis cover

Stajich, J., Wilke, S., Ahren, D., Au, C., Birren, B., Borodovsky, M., Burns, C., Canback, B., Casselton, L., Cheng, C., Deng, J., Dietrich, F., Fargo, D., Farman, M., Gathman, A., Goldberg, J., Guigo, R., Hoegger, P., Hooker, J., Huggins, A., James, T., Kamada, T., Kilaru, S., Kodira, C., Kues, U., Kupfer, D., Kwan, H., Lomsadze, A., Li, W., Lilly, W., Ma, L., Mackey, A., Manning, G., Martin, F., Muraguchi, H., Natvig, D., Palmerini, H., Ramesh, M., Rehmeyer, C., Roe, B., Shenoy, N., Stanke, M., Ter-Hovhannisyan, V., Tunlid, A., Velagapudi, R., Vision, T., Zeng, Q., Zolan, M., & Pukkila, P. (2010). Insights into evolution of multicellular fungi from the assembled chromosomes of the mushroom Coprinopsis cinerea (Coprinus cinereus) Proceedings of the National Academy of Sciences, 107 (26), 11889-11894 DOI: 10.1073/pnas.1003391107

Where can I get orthologs?

There are several databases that include orthology prediction for fungi. These all have pros and cons. Some are more comprehensive and have many more species. Some provide curated orthology and paralogy assignments, which should be pretty stable. Some are automated, and their groupings and ortholog group IDs change at each iteration.

  • A phylogenetic approach from a Saccharomyces perspective is at PhylomeDB.
  • Fungal Orthogroups is based on the Synergy algorithm from I. Wapinski, formerly of the Regev group at the Broad Institute.
  • Yeast gene order browser (YGOB) for Saccharomyces spp and CGOB for Candida spp.
  • OrthoMCL database based on whole genomes, not a ton of fungi but useful starting set.
  • Ensembl Genomes provides ortholog prediction as part of the Compara pipeline though there is a limited phylogenetic diversity in the current Ensembl Fungal genomes.
  • TreeFam has Saccharomyces cerevisiae and Schizosaccharomyces pombe as the two fungi included in the curated ortholog assignments and phylogenies.
  • SIMAP provides pre-computed similarities among all proteins in UniProt.
  • InParanoid provides a pretty comprehensive set of 100 available whole genomes, including many fungal genomes that I tried to help select.
  • JGI’s Mycocosm attempts to provide a fungal focused paralog/gene family look at clusters of genes based on whole genomes
  • E-Fungi is also an attempt at automated clustering with some fancy webservices logic.
  • Fungal Transcription Factor database focused just on families of transcription factors.

Some of these tools are better than others in terms of providing downloadable tables. Another problem is which identifiers are used. Many biologists use gene names or locus identifiers, not UniProt/GenPept IDs, to identify genes or proteins of interest. So tools that just cluster UniProt data aren’t as useful as those that refer to the gene or locus names. Also, providing a way to download all the data from a comparison is important for further mining and grouping of the data or for cross-referencing local datasets. Plugging in gene IDs one by one is not really a tool that respects the idea that your user wants to ask sophisticated queries.

Also, beware that some approaches produce lists of pairwise comparisons whereas others find orthologous groupings. So if you want to find the Rad59 ortholog from all fungi it may be easier or harder depending on the source.
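To make the pairwise-versus-groups distinction concrete, here is a rough sketch of turning a downloaded table of pairwise ortholog calls into groups, then pulling out the group that contains a gene of interest. It uses plain single-linkage clustering (connected components), which is cruder than what the databases above actually do, and all of the identifiers other than RAD59 are invented for the example.

```python
from collections import defaultdict


def ortholog_groups(pairs):
    """Single-linkage grouping: genes joined by any chain of pairwise calls share a group."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving keeps lookups short
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return list(groups.values())


def group_for(gene, groups):
    """Return the group containing a gene, or an empty set if it is not in any group."""
    return next((g for g in groups if gene in g), set())


# Hypothetical pairwise calls keyed by "species|locus"
pairs = [
    ("Scer|RAD59", "Calb|geneA"),
    ("Calb|geneA", "Ncra|geneB"),
    ("Scer|ACT1", "Spom|act1"),
]

groups = ortholog_groups(pairs)
rad59_group = group_for("Scer|RAD59", groups)
print(sorted(rad59_group))
```

With a pairwise-only source you have to do this chaining yourself, and single linkage will happily merge paralogs into one group, which is one reason the group-based databases are easier to work with when you want a gene from all fungi.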

[I may make this a static page in the future to allow for more detailed updating since I know the available resources wax and wane]

Horizontal gene transfer from Zygo to pea aphid

Another result from the analysis of the recently published genome of the pea aphid, Acyrthosiphon pisum: Nancy Moran and Tyler Jarvik present a study of the origin of the carotenoid production gene in pea aphid. Animals typically cannot make carotenoids, so they sought to discover how this is possible. They find that it is derived from a horizontal gene transfer event of a fungal gene into the aphid lineage. This gene is responsible for the red-green color polymorphism in the aphid. It appears the gene is derived from a ‘zygomycete’ or a relative in the early branching lineage of the fungi. One gene, a carotenoid desaturase, is encoded in a 30kb genomic region that is missing in green aphids but present in the red morphs. The region is apparently maintained in the population by frequency-dependent selection, since each color has an advantage or disadvantage for evading detection by predators in different environments.

Reports of eukaryotic HGT events from fungi to animals are quite rare, so this finding is surprising in that sense, but the authors argue that the important ecological role of carotenoids suggests we might see even more examples if we look harder.

Moran, N., & Jarvik, T. (2010). Lateral Transfer of Genes from Fungi Underlies Carotenoid Production in Aphids Science, 328 (5978), 624-627 DOI: 10.1126/science.1187113

A cacophony of comparative genomics papers

A nice series of comparative genomics articles have been published in the last few weeks. The pace of genome sequencing has accelerated to the point that we have lots of sequencing projects coming from individual labs and small consortia not necessarily from genome centers. We are seeing a preview of what next (2nd) generation sequencing will enable and can start to imagine what happens when even cheaper 3rd generation sequencing technologies are applied. I’m behind in reviewing these papers for you, dear reader, but I hope you’ll click through and take a look at some of these papers if you are interested in the topics.

In the following set of papers we have some nice examples of comparative genomics of closely related species and among a clade of species. The papers mentioned below include our work on the human pathogens Coccidioides and Histoplasma (Sharpton et al) studied at several evolutionary distances, a study on the Saccharomycetaceae clade of yeast species (Souciet et al), and a comparison of two species of Candida (Jackson et al): the commensal and opportunistic fungal pathogen Candida albicans with a very closely related species, Candida dubliniensis. There is also a nice comparison of strains of Saccharomyces cerevisiae looking at effects of domestication and examples of horizontal transfer (Novo et al).

There is also a report of de novo sequencing of a filamentous fungus using several approaches, traditional Sanger sequencing, 454, and Illumina/Solexa (DiGuistini et al).

Finally, a paper from a few months ago (Ma et al), gives a fantastic look at one of the early branches in the fungal tree – the Mucorales (formerly Zygomycota) – via the genome of Rhizopus oryzae.  This paper is a really excellent example of what we can learn about a group of species by contrasting genomic features in the early branches in the tree with the more well studied Ascomycete and Basidiomycete fungi.  More genome sequences will help us build on these findings and clarify if some of the observations are unique to the lineage or universal aspects of the earliest fungi.

I hope you enjoy!

Novo, M., Bigey, F., Beyne, E., Galeote, V., Gavory, F., Mallet, S., Cambon, B., Legras, J., Wincker, P., Casaregola, S., & Dequin, S. (2009). Eukaryote-to-eukaryote gene transfer events revealed by the genome sequence of the wine yeast Saccharomyces cerevisiae EC1118 Proceedings of the National Academy of Sciences DOI: 10.1073/pnas.0904673106 (via J Heitman)

Jackson, A., Gamble, J., Yeomans, T., Moran, G., Saunders, D., Harris, D., Aslett, M., Barrell, J., Butler, G., Citiulo, F., Coleman, D., de Groot, P., Goodwin, T., Quail, M., McQuillan, J., Munro, C., Pain, A., Poulter, R., Rajandream, M., Renauld, H., Spiering, M., Tivey, A., Gow, N., Barrell, B., Sullivan, D., & Berriman, M. (2009). Comparative genomics of the fungal pathogens Candida dubliniensis and C. albicans Genome Research DOI: 10.1101/gr.097501.109

DiGuistini, S., Liao, N., Platt, D., Robertson, G., Seidel, M., Chan, S., Docking, T., Birol, I., Holt, R., Hirst, M., Mardis, E., Marra, M., Hamelin, R., Bohlmann, J., Breuil, C., & Jones, S. (2009). De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data. Genome Biology, 10 (9) DOI: 10.1186/gb-2009-10-9-r94 (open access)

Sharpton, T., Stajich, J., Rounsley, S., Gardner, M., Wortman, J., Jordar, V., Maiti, R., Kodira, C., Neafsey, D., Zeng, Q., Hung, C., McMahan, C., Muszewska, A., Grynberg, M., Mandel, M., Kellner, E., Barker, B., Galgiani, J., Orbach, M., Kirkland, T., Cole, G., Henn, M., Birren, B., & Taylor, J. (2009). Comparative genomic analyses of the human fungal pathogens Coccidioides and their relatives Genome Research DOI: 10.1101/gr.087551.108 (open access)

Souciet, J., Dujon, B., Gaillardin, C., Johnston, M., Baret, P., Cliften, P., Sherman, D., Weissenbach, J., Westhof, E., Wincker, P., Jubin, C., Poulain, J., Barbe, V., Segurens, B., Artiguenave, F., Anthouard, V., Vacherie, B., Val, M., Fulton, R., Minx, P., Wilson, R., Durrens, P., Jean, G., Marck, C., Martin, T., Nikolski, M., Rolland, T., Seret, M., Casaregola, S., Despons, L., Fairhead, C., Fischer, G., Lafontaine, I., Leh, V., Lemaire, M., de Montigny, J., Neuveglise, C., Thierry, A., Blanc-Lenfle, I., Bleykasten, C., Diffels, J., Fritsch, E., Frangeul, L., Goeffon, A., Jauniaux, N., Kachouri-Lafond, R., Payen, C., Potier, S., Pribylova, L., Ozanne, C., Richard, G., Sacerdot, C., Straub, M., & Talla, E. (2009). Comparative genomics of protoploid Saccharomycetaceae Genome Research DOI: 10.1101/gr.091546.109 (open access)

Ma, L., Ibrahim, A., Skory, C., Grabherr, M., Burger, G., Butler, M., Elias, M., Idnurm, A., Lang, B., Sone, T., Abe, A., Calvo, S., Corrochano, L., Engels, R., Fu, J., Hansberg, W., Kim, J., Kodira, C., Koehrsen, M., Liu, B., Miranda-Saavedra, D., O’Leary, S., Ortiz-Castellanos, L., Poulter, R., Rodriguez-Romero, J., Ruiz-Herrera, J., Shen, Y., Zeng, Q., Galagan, J., Birren, B., Cuomo, C., & Wickes, B. (2009). Genomic Analysis of the Basal Lineage Fungus Rhizopus oryzae Reveals a Whole-Genome Duplication PLoS Genetics, 5 (7) DOI: 10.1371/journal.pgen.1000549 (open access)

Yeast population genomics
I have cheered the work of the Sanger-Wellcome SGRP group to generate multiple Saccharomyces cerevisiae and S. paradoxus strain genome sequences. The group had previously submitted a version of the manuscript to Nature Precedings and it is now published in Nature AOP, showing that submitting to a preprint server doesn’t necessarily hurt your manuscript’s chances of getting published… The research groups explored the impact of domestication (as was also recently done for the sake and soy sauce worker fungus, Aspergillus oryzae) on the Saccharomyces genome by comparing S. cerevisiae strains with individuals from wild strains of S. paradoxus.

This paper addressed several challenges, including methodology for light genome sequencing for population genomics. These data represent, in a way, a pilot project for genome resequencing projects that use draft genome sequencing with next-generation sequencing tools. Of course, with the pace of sequencing technology development, any project more than a couple of months old will be using outdated technology, it seems, but this work represents some important progress. Tools like MAQ were also developed and tuned as part of the project. In addition to the methods development, it also provided a new look at the evolutionary dynamics of a well-studied fungus.

Genome assembly
The authors apply several different quality controls and use a new tool called PALAS (Parallel ALignment and ASsembly) to assemble all the strains at the same time, using a graph-based approach that draws on the reference genome sequence for each species. This is different from a full-blown WGA approach like PCAP, Phusion or Arachne because this is a deliberately low-coverage sequencing pass. The authors try to impute missing sequence via Ancestral Recombination Graphs as implemented in the Margarita system. They also use MAQ to align sequence from Illumina/Solexa sequencing to these assemblies made by PALAS.

Since this project was on two species of Saccharomyces, S. cerevisiae and S. paradoxus, they needed good reference assemblies for each of these species. The previously available S. paradoxus assembly wasn’t complete enough for this study, so they added 4.3X coverage with Sanger/ABI sequencing and 80X coverage with Illumina.

Population genomics and domestication

The sequencing data also provided a framework for population genetic investigations. Some simple findings showed that, within each species, isolates from the same geographic region were more genetically similar to each other. The main geographic regions sampled for S. paradoxus included the UK, America, and the Far East, and some of these samples had been analyzed in a very nice study on Chromosome III. The S. cerevisiae samples included individuals from around Europe, at least 10 European wine strains, Malaysian strains, sake brewing strains, and strains from West Africa and North America. From these data it was possible to discover that there are several strains with mosaic genomes, meaning that some pieces of the genome match best with the sake fermentation strains and other parts with the wine/European samples.

Efforts to detect the effects of natural selection that may be linked to domestication of these strains explored two different approaches. The McDonald-Kreitman test did not identify any loci under positive selection, while Tajima’s D was negative in the S. cerevisiae global and wine strain populations, indicating an excess of singleton polymorphisms, though they draw few conclusions from that. The authors also observed a sharper decay of linkage disequilibrium in S. cerevisiae (half maximum of 3kb) than in S. paradoxus (half maximum of 9kb), suggesting that S. cerevisiae is recombining more, either due to increased opportunities or a greater frequency of recombination events when it does.
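To make the Tajima’s D result a bit more concrete, here is a minimal sketch of the statistic in Python, following the standard formula from Tajima (1989). This is not the authors’ pipeline: the function name tajimas_d and the toy alignments are made up for illustration, and the code assumes equal-length aligned sequences with no gaps or missing data.

```python
from collections import Counter
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (toy version, no gap handling).

    Returns None when there are no segregating sites, since D is undefined.
    """
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences for a defined variance")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0  # S: number of polymorphic alignment columns
    pi = 0.0       # average number of pairwise differences
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            same_pairs = sum(c * (c - 1) / 2 for c in counts.values())
            pi += (n_pairs - same_pairs) / n_pairs
    if seg_sites == 0:
        return None

    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two diversity estimates, scaled by its expected spread
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy alignments (invented): every variant is a singleton vs. variants at 50% frequency
singletons = ["AAAA", "TAAA", "ATAA", "AATA"]
balanced = ["AA", "AA", "TT", "TT"]
print(tajimas_d(singletons), tajimas_d(balanced))
```

In the first toy alignment every polymorphism is a singleton, so the pairwise diversity falls below what the number of segregating sites predicts and D comes out negative, which is the pattern reported for the global and wine strain populations. The second alignment has variants at intermediate frequency and gives a positive D.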

In the context of the paper title and the idea of exploring the effects of domestication on the genome, the authors observe that the standard paradigm that ‘domesticated’ species have lower diversity levels is simply not the case in these samples. This isn’t to say there is no evidence of selection for fermentation production in these strains, based on the stress response conditions they were tested on, but there is still ample evidence that diversity is maintained within the populations, presumably through various amounts of outcrossing.

We are also interested in these results as we apply similar questions to population genomics of the human pathogenic fungus Coccidioides, where 14 strains have been sequenced with Sanger sequencing technology. Hopefully some of these lessons will resonate in our analyses, and this era of population genomics will see ever more extensive collections to address aspects of migration, phylogeography, and local adaptation within populations of fungi and other microbes.

Gianni Liti, David M. Carter, Alan M. Moses, Jonas Warringer, Leopold Parts, Stephen A. James, Robert P. Davey, Ian N. Roberts, Austin Burt, Vassiliki Koufopanou, Isheng J. Tsai, Casey M. Bergman, Douda Bensasson, Michael J. T. O’Kelly, Alexander van Oudenaarden, David B. H. Barton, Elizabeth Bailes, Alex N. Nguyen, Matthew Jones, Michael A. Quail, Ian Goodhead, Sarah Sims, Frances Smith, Anders Blomberg, Richard Durbin, Edward J. Louis (2009). Population genomics of domestic and wild yeasts Nature DOI: 10.1038/nature07743

Lichen genome projects and the power shift prompted by next-gen sequencing

Genome Technology highlights the very cool thing about next-gen sequencing: it puts the power in the hands of the researchers to explore genome sequence and doesn’t limit them to projects only funded through sequencing centers. The Genome Technology piece highlights work at Duke to sequence the genome of Cladonia grayi, a lichenized fungus, with 454 technology at Duke’s Institute for Genome Sciences and Policy through their next-gen sequencing program. This is the way of the future, where researchers will be able to generate sequence at core facilities, only having to wait in the queue at their own university rather than in community sequencing project or sequencing center proposal queues.

This isn’t the only lichen being sequenced. Xanthoria parietina is also in the queue at JGI, but has taken a while to get going because of some logistical problems getting the DNA (and any problems are amplified because it takes a long time to get new material since lichens grow very slowly).

This transfer of power, letting researchers do quick exploratory whole-genome sequencing with next-gen and, eventually, produce high-quality genome sequences from next-gen sequencing, is predicted to transform how this kind of science gets done. It means we’ll probably just sequence a mutant strain instead of trying to map the mutation; this is happening already in anecdotal stories in worms and in our work in mushrooms. N.B. this is done after a mutagenized strain has been cleaned up a bit with some crosses to ensure we’re looking for one or only a few mutations, but that is part of standard genetic approaches anyway.

This fast, cheap whole-genome sequencing is also the stuff of personal genomics, but for basic research it will also mean that a first pass exploring the gene repertoire of an organism will be a multi-week instead of a multi-year project. I just hope we’re training enough people who can efficiently extract the information from all these data with solid bioinformatics, computational, data-oriented programming, and statistical skills to support all the labs that will want to take this approach. You’ll need a life-vest to swim in the big data pool for a while until more tools are developed that can be deployed by non-experts.

Papers on our desk

A quick post of some recent comparative genomics papers on our desk that are worth a look.

  • Khaldi N, Wolfe KH (2008) Elusive Origins of the Extra Genes in Aspergillus oryzae. PLoS ONE 3(8): e3036. doi:10.1371/journal.pone.0003036. This was a cool but somewhat controversial finding presented at Fungal Genetics last year.
  • Casselton, LA. Fungal sex genes – searching for the ancestors. doi: 10.1002/bies.20782. A review of recent findings about the Zygomycete MAT locus.
  • Soanes DM, Alam I, Cornell M, Wong HM, Hedeler C, et al. (2008) Comparative Genome Analysis of Filamentous Fungi Reveals Gene Family Expansions Associated with Fungal Pathogenesis. PLoS ONE 3(6): e2300. doi:10.1371/journal.pone.0002300
  • Lee DW, Freitag M, Selker EU, Aramayo R (2008) A Cytosine Methyltransferase Homologue Is Essential for Sexual Development in Aspergillus nidulans. PLoS ONE 3(6): e2531. doi:10.1371/journal.pone.0002531