Category Archives: methods

Fungal tree of life papers

Mycologia (subscription required) has lots of papers this month from different groups analyzing the fine-scale relationships of many fungal clades, using the loads of sequences that were generated as part of the Fungal Tree of Life project.

Here are some highlights, since there are just too many papers in the issue to cover them all. As usual with more detailed molecular studies of clades, we find that morphologically defined groupings aren’t always truly monophyletic, and some species even end up being reclassified. Not that molecular sequence approaches are infallible, but for many fungi the morphological characters are not always stable and can revert (see Hibbett 2004 for a nice treatment of this in mushrooms; subscription required).

  • Meredith Blackwell and others describe the Deep Hypha research coordination network that helped coordinate all the Fungal Tree of Life-rs.
  • John Taylor and Mary Berbee update their previous dating work with new divergence dates for the fungi using as much of the fossil evidence as we have.
  • The early diverging Chytridiomycota, Glomeromycota, and Zygomycota are each described. Tim James and others present updated Chytridiomycota relationships, some of which were only briefly introduced in the kingdom-wide analysis paper published last year.
  • There is a nice overview paper of the major Agaricales clades (mushrooms for the uninitiated) from Brandon Matheny, as well as individual treatments of many of the sub-clades like the cantharelloid clade (mmm chanterelles…).
  • Relationships of the Puccinia clade are also presented – we blogged about the wheat pathogen P. graminis before.
  • A new Saccharomycetales phylogeny is presented by Sung-Oui Suh and others.
  • The validity of the Archiascomycete group (containing the fission yeast Schizosaccharomyces pombe and the mammalian pathogen Pneumocystis) is also tested, and the authors confirm that it is basal to the two sister clades, the euascomycetes (containing Neurospora) and the hemiascomycetes (containing Saccharomyces). However, it doesn’t appear that enough species and genes have been sampled to confirm monophyly of the group. There are, or soon will be, three genome sequences of Schizosaccharomyces plus one or two Pneumocystis genomes, so it will be interesting to see how this story turns out if more species can be identified.

This was a monster effort by a lot of people, and it is really nice to see it all come together in what look like some really nice papers.

More Neurospora genomes

We got word last week from the JGI that our DNA for Neurospora tetrasperma and N. discreta has passed QC and library QC and is on its way to being sequenced. The center also plans to do some EST sequencing to improve gene calling.

Why more Neurospora genomes? The sequencing proposal discussed these species as a model system for evolutionary and ecological genetics. It will allow us and others to test several hypotheses about the molecular evolution of things like genome defense in Neurospora and to understand more about the evolutionary history of the model organism N. crassa.


Orthology detection software

A paper in PLoS One, Assessing Performance of Orthology Detection Strategies Applied to Eukaryotic Genomes, reports a new approach to assess the performance of automated orthology detection. These authors also wrote OrthoMCL (2006 DB paper, 2003 algorithm paper), which uses MCL to build orthologous gene families. The authors discuss the trade-offs between highly sensitive, specific tree-based methods and faster but less sensitive approaches such as Best-Reciprocal-Hits from BLAST or FASTA, as well as some of the hybrid approaches. The authors employ Latent Class Analysis (LCA) to aid in “evaluation and optimization of a comprehensive set of orthology detection methods, providing a guide for selecting methods and appropriate parameters”. LCA is also the statistical basis for feature choice when combining gene predictions into a single set of gene calls in GLEAN, written by many of the same authors including Aaron Mackey.

I’ve been reading a lot of orthology and gene tree-species tree reconciliation papers lately; some are listed by Ian Holmes’s group, and some of the software is listed on the BioPerl site. This also follows on from our Phyloinformatics hackathon work, which we are trying to formalize in some more documentation for phyloinformatics pipelines to support some of the described use cases. I’m also applying some of this to a tutorial I’m teaching at ISMB2007 this summer.
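
For readers who haven’t met the Best-Reciprocal-Hits idea, here is a minimal Python sketch of it. It assumes the BLAST or FASTA results have already been parsed into (query, subject, score) tuples; the function names, gene names, and scores are made up for illustration, and this is not the method from the paper or from OrthoMCL.

```python
def best_hits(hits):
    """Map each query to its single highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples,
    e.g. parsed from tabular BLAST output (parsing not shown).
    """
    best = {}
    for query, subject, score in hits:
        # Keep the first hit seen, or replace it if this one scores higher.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted(
        (a, b) for a, b in best_ab.items() if best_ba.get(b) == a
    )


# Toy data (hypothetical genes and scores): genome A searched against
# genome B, and genome B searched against genome A.
a_vs_b = [
    ("a1", "b1", 250.0),
    ("a1", "b2", 90.0),
    ("a2", "b2", 180.0),
    ("a3", "b2", 200.0),
]
b_vs_a = [
    ("b1", "a1", 245.0),
    ("b2", "a3", 198.0),
    ("b2", "a2", 175.0),
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In the toy data a2 picks b2 as its best hit, but b2 prefers a3, so a2 is left without a partner. That is the sort of case (recent paralogs, for instance) where the fast approach loses sensitivity relative to tree-based or clustering methods.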

That was a lot of work

I’ve never worked with Magnaporthe grisea, the fungus responsible for rice blast, one of the most devastating crop diseases, but I do know that its life cycle is complicated and that knocking out roughly 61% of the genes in the genome and evaluating the mutant phenotype to infer gene function is not trivial. In their recent letter to Nature, Jeon et al did what many of us have dreamed of doing in our fungus of interest: manipulate every gene to find those that contribute to a phenotype of interest.

In their study, the authors looked for pathogenicity genes. Interestingly, defects in appressorium formation and conidiation had the strongest correlation with defects in pathogenicity, suggesting that these two developmental stages are crucial for virulence. Ultimately, the authors identify 203 loci involved in pathogenicity, the majority of which have no homologous hits in the sequence databases and no clearly enriched GO functions. Impressively, this constitutes the largest unbiased list of pathogenicity genes identified for a single species (though some of us, I’m sure, may have a problem with the term “unbiased”).

If you’d like to play with their data, the authors have made it available in their ATMT Database.

Proteins Evolve Differentially in Saccharomyces

Perhaps not a surprise to anyone who has dabbled in evolutionary analysis of proteins, Kawahara and Imanishi (BMC Evolutionary Biology 2007) confirm that not every protein evolves via a molecular clock in Saccharomyces sensu stricto. Using everyone’s favorite evolutionary tool, PAML, the authors run a whole genome scan to identify protein lineages that evolve relatively slowly or quickly compared to the rest of the clade. Some changes even appear to be due to the invisible hand of natural selection and independent of the complications that may have arisen during the whole genome duplication in the ancestor of this clade.

It has been previously speculated that, either upon protein duplication or a change in the selective regime of the environment, a protein may rapidly evolve at speciation and then, upon obtaining a new, important function, slow down its evolutionary rate to a clock-like tempo. One of the black boxes in this hypothesis is whether or not closely related proteins can rapidly diverge. While the authors are not able to identify a mechanism explaining how, their study demonstrates the plausibility of this hypothesis. However, it remains uncertain whether proteins that exhibit rapid divergence will subsequently slow down their evolutionary rate.

It’s good to see evolutionary analysis being applied to fungal genomes. With so many sequenced species spanning a great range of phylogenetic distance, the fungal kingdom is poised to provide great insight into the evolution of eukaryotes.

Fungal Genetics 2007 details

I’m including a recap of as many of the talks as I remember. There were 6 concurrent sessions each afternoon, so you have to miss a lot of talks. The conference was bursting at the seams as it was: at least 140 people had to be turned away beyond the 750 who attended.

If there was any theme in the conference it was “Hey we are all using these genome sequences we’ve been talking about getting”. I found the overview talks that solely describe a genome a little dry compared to those more focused on particular questions. I guess my genome palate is becoming refined.


Genome resources for Candida species

The Candida clade of Hemiascomycete fungi has received much attention from funding bodies, so many genomic and experimental resources are available to address questions of pathogenicity and industrial applications of these species.

The Candida genus

Traditionally, species of yeasts that were thought to be asexual were given the genus name Candida. This has led to Candida being a sort of taxonomic rubbish bin, as this system of classification breaks down when asexuality arises more than once (creating homoplasy). For example, the asexual Candida glabrata is found within the Saccharomyces clade when molecular phylogenetics is applied. The problem is that many of these species appear very similar visually and microscopically, so there had not been enough phylogenetically informative phenotypic characters to easily classify them further. With the use of molecular phylogenetics the classifications have been improved, as shown in several studies. However, we retain the historical genus and species names for these organisms for the time being, even though the phylogenetic diversity of species in the “genus” is much broader than in other genus-level classifications. It will be interesting to see whether taxonomic proposals like PhyloCode or traditional revisions of the species names will provide new names for the group.

The Candida Genome Database (CGD), sister to the Saccharomyces Genome Database (SGD), provides phenotype and sequence resources for the human commensal and dimorphic fungus Candida albicans. A recent paper by Arnaud et al describes the resources that are available through their website. An essentially completed C. albicans diploid genome with curated gene models and annotations provides an essential resource for this model pathogenic system. In addition to the SC5314 strain of C. albicans, there is the white-opaque (WO) strain, which can switch between different colony morphologies: white and smooth or gray and rod shaped.

Six additional species in the Candida clade have had their genomes sequenced, including Pichia stipitis, Debaryomyces hansenii, Candida lusitaniae, Candida tropicalis, Candida guilliermondii, and Lodderomyces elongisporus. These resources will hopefully shed some light on the importance and mechanisms of dimorphic switching in the pathogen C. albicans, the importance and evolution of alternative codon usage in the clade, and better usage of industrial yeasts like P. stipitis and D. hansenii.

Approaching 100% coverage for GO assignments in S. pombe

A paper by Martin Aslett and Val Wood indicates that the fission yeast community is approaching 100% coverage, with a GO annotation for every gene in the S. pombe genome. Only Ashbya gossypii has a smaller genome in the fungi (see a recent paper on the Ashbya annotation database), and it doesn’t yet have complete GO coverage. This is quite remarkable and a great dataset for studies in S. pombe and all fungi.

S. pombe taken from Paul Young’s site

My quick gene predictions for a closely related species, S. japonicus, suggest it has more than twice as many genes as S. pombe (but this may be over-prediction by ab initio predictors). Compared to many other fungi, S. pombe represents a streamlined and reduced genome, and this reduction probably occurred independently from the reduction in the Hemiascomycetes.