Category Archives: gene family

Recent animal-associated fungal genome papers

The genomes of five dermatophyte fungi were sequenced, and analyses of their lifestyles are presented in a new paper in mBio (Martinez et al. 2012). The authors identified gene family changes associated with lifestyle changes, including proteases that can degrade keratin, suggesting how these species have adapted to obtaining nutrients from an animal host. Fungal-specific kinase families continue to turn up in these fungi, extending observations from previous studies of the FunK1 kinase family in Coprinopsis and Paracoccidioides. That makes me hope we will some day get molecular information on the specificity of these families in addition to these copy number observations.
Another paper, published in Genome Research this summer from Emily Troemel’s lab and the Broad Institute, describes the sequencing of two microsporidia species that are natural parasites of Caenorhabditis. The paper reveals some surprising things about Microsporidia evolution, including the presence of a clade-specific nucleoside H+ symporter which is only found in bacteria and some eukaryotes and not in any Fungi. The phyletic distribution suggests it was acquired relatively recently and could be from lateral gene transfer. This acquisition likely helps the microsporidia cells obtain nucleosides from the host, since the parasite cannot synthesize these. There is also evidence for the evolution of microsporidia-specific secretion signals in the hexokinases, which may be a mechanism for delivering these enzymes into host cells to catalyze rapid growth once inside the host. There are many more gems in this paper, including phylogenetic placement of the microsporidia from phylogenomic approaches (also see related recent work from Toni Gabaldon’s lab).

Genome sequence of mushroom Schizophyllum commune

I am excited to announce the publication of another mushroom genome this week. The mushroom Schizophyllum commune is an important model system for mushroom biology and development. Its genome was sequenced as part of efforts at the Joint Genome Institute and a collection of international researchers. The data and analyses from these efforts are presented in a publication appearing in Nature Biotechnology today.

Studies in mushrooms can have an important impact on other research areas. Mushrooms can be useful in biotechnology as protein biosynthesis factories for producing compounds, or even as an edible delivery mechanism for new drugs. What we found in the analysis of this genome includes clues, from the analysis of enzyme families, to how white rot fungi degrade lignin. We also saw evidence for extensive antisense transcription during different developmental stages, which hints at how gene regulation could impact or control developmental progression. Gene expression comparison (by MPSS) showed that a large number of transcription factors are differentially regulated during sexual development. Knockouts of two of these (fst3 and fst4) resulted in changes in the ability to form mushrooms (fst4) or in smaller mushrooms (fst3).

There are several more interesting findings in this work that I hope to add back to this post when there is a little more time.

Ohm, R., de Jong, J., Lugones, L., Aerts, A., Kothe, E., Stajich, J., de Vries, R., Record, E., Levasseur, A., Baker, S., Bartholomew, K., Coutinho, P., Erdmann, S., Fowler, T., Gathman, A., Lombard, V., Henrissat, B., Knabe, N., Kües, U., Lilly, W., Lindquist, E., Lucas, S., Magnuson, J., Piumi, F., Raudaskoski, M., Salamov, A., Schmutz, J., Schwarze, F., vanKuyk, P., Horton, J., Grigoriev, I., & Wösten, H. (2010). Genome sequence of the model mushroom Schizophyllum commune Nature Biotechnology DOI: 10.1038/nbt.1643

A mushroom on the cover

I’ll indulge a bit here and happily point to the cover of this week’s PNAS, which has an image of fruiting Coprinopsis cinerea mushrooms and refers to our article on the genome sequence of this important model fungus. You should also enjoy the commentary article from John Taylor and Chris Ellison, which summarizes some of the high points of the paper.


Stajich, J., Wilke, S., Ahren, D., Au, C., Birren, B., Borodovsky, M., Burns, C., Canback, B., Casselton, L., Cheng, C., Deng, J., Dietrich, F., Fargo, D., Farman, M., Gathman, A., Goldberg, J., Guigo, R., Hoegger, P., Hooker, J., Huggins, A., James, T., Kamada, T., Kilaru, S., Kodira, C., Kues, U., Kupfer, D., Kwan, H., Lomsadze, A., Li, W., Lilly, W., Ma, L., Mackey, A., Manning, G., Martin, F., Muraguchi, H., Natvig, D., Palmerini, H., Ramesh, M., Rehmeyer, C., Roe, B., Shenoy, N., Stanke, M., Ter-Hovhannisyan, V., Tunlid, A., Velagapudi, R., Vision, T., Zeng, Q., Zolan, M., & Pukkila, P. (2010). Insights into evolution of multicellular fungi from the assembled chromosomes of the mushroom Coprinopsis cinerea (Coprinus cinereus) Proceedings of the National Academy of Sciences, 107 (26), 11889-11894 DOI: 10.1073/pnas.1003391107

Where can I get orthologs?

There are several databases that include orthology prediction for fungi. These all have pros and cons. Some are more comprehensive and cover many more species. Some provide curated orthology and paralogy assignments, which should be pretty stable. Some are automated, so the groupings and ortholog group IDs change at each iteration.

  • A phylogenetic approach from a Saccharomyces perspective is at PhylomeDB.
  • Fungal Orthogroups is based on the Synergy algorithm from I. Wapinski, formerly of the Regev group at the Broad Institute.
  • Yeast gene order browser (YGOB) for Saccharomyces spp and CGOB for Candida spp.
  • OrthoMCL database based on whole genomes, not a ton of fungi but useful starting set.
  • Ensembl Genomes provides ortholog prediction as part of the Compara pipeline though there is a limited phylogenetic diversity in the current Ensembl Fungal genomes.
  • TreeFam has Saccharomyces cerevisiae and Schizosaccharomyces pombe as the two fungi included in the curated ortholog assignments and phylogenies.
  • SIMAP provides pre-computed similarities among all proteins in UniProt.
  • InParanoid provides a pretty comprehensive set of about 100 available whole genomes, including many fungal genomes which I tried to help select.
  • JGI’s Mycocosm attempts to provide a fungal-focused paralog/gene family look at clusters of genes based on whole genomes.
  • E-Fungi is also an attempt at automated clustering with some fancy webservices logic.
  • Fungal Transcription Factor database is focused just on families of transcription factors.

Some of these tools are better than others in terms of providing downloadable tables. Another problem is which identifiers are used. Many biologists use gene names or locus identifiers, not UniProt/GenPept IDs, to identify genes or proteins of interest, so tools that just cluster UniProt data aren’t as useful as those which refer to the gene or locus names. Providing a way to download all the data from a comparison is also important for further mining and grouping of the data or for cross-referencing local datasets. Plugging in gene IDs one by one does not respect the idea that your user wants to ask sophisticated queries.

Also, beware that some approaches produce lists of pairwise comparisons whereas others find orthologous groupings. So if you want to find the Rad59 ortholog from all fungi, it may be easier or harder depending on the source.
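To make the difference between pairwise lists and orthologous groupings concrete, here is a minimal sketch of one way to collapse pairwise ortholog calls into groups by single-linkage clustering (connected components via union-find). It is not how any of the databases above build their groups, and the gene identifiers are made up for illustration. Single linkage is a simplifying assumption that will happily merge paralogs into one group if any pair links them.

```python
from collections import defaultdict


def group_orthologs(pairs):
    """Merge pairwise ortholog calls into groups (connected components)."""
    parent = {}

    def find(gene):
        # Register unseen genes as their own root, then walk up to the root.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two components

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    # Sort for a stable, reproducible ordering of groups and members.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical pairwise calls; these IDs are placeholders, not real loci.
pairs = [
    ("Scer|RAD59", "Calb|geneA"),
    ("Calb|geneA", "Ncra|geneB"),
    ("Scer|RAD52", "Spom|geneC"),
]
groups = group_orthologs(pairs)
for members in groups:
    print("\t".join(members))
```

With a pairwise-only resource you would have to do something like this yourself (and then worry about paralogs), whereas a grouping-based resource hands you the clusters directly.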

[I may make this a static page in the future to allow for more detailed updating since I know the available resources wax and wane]

Chlamy genome investigations

This month’s Genetics has a series of articles exploring the genome (published last year and freely available at Science) of the green alga Chlamydomonas reinhardtii. These manuscripts are primarily genome analyses, making for a very bioinformatics-focused issue of Genetics. Some of the highlights include:

Trichoderma reesei genome paper published

The Trichoderma reesei genome paper was recently published in Nature Biotechnology by Diego Martinez at LANL with collaborators at JGI, LBNL, and others. This fungus was chosen for sequencing because it was found on canvas tents eating the cotton material, suggesting it may be a good candidate for degrading cellulosic plant material as part of cellulosic ethanol or other biofuels production. The fungus also has starring roles in industrial processes like making stonewashed jeans, due to its prodigious cellulase production.

The most surprising finding from the paper is that there are so few members of some of the enzyme families, even though this fungus is able to generate enzymes with so much cellulase activity. The authors found that there is not a significantly larger number of glycoside hydrolases, a collection of carbohydrate-degrading enzymes that are great for making simple sugars out of complex ones. In fact, the plant pathogens compared (Fusarium graminearum and Magnaporthe grisea) and the sake-fermenting Aspergillus oryzae all have more members of this family than T. reesei does. T. reesei has almost the fewest copies (36) of carbohydrate-binding modules (CBMs) of any of the filamentous ascomycete fungi. The authors used the CAZy (Carbohydrate-Active enZymes) database, which has done a fantastic job of building up profiles of the different enzymes involved in carbohydrate degradation, binding, and modification.

Whether T. reesei is really the best cellulose-degrading fungus is definitely an open question. That it works well in the industrial culture in which it has been used is important, but there may be other species of fungi with better cellulase activity that may in fact have many more copies of cellulases. So it will be good to add other fungi to the mix, with quantitative information about degradation, to try to glean which combinations of enzymes and activities matter most.

One technical note. The comparison of copy number differences employed in the paper is a simple enough Chi-squared test. Work that I’ve done with Matt Hahn and others includes a gene family size comparison approach that also takes phylogenetic distances into account and assumes a birth-death process of gene family size change. It would be great to evaluate the copy number differences with this approach, or with others that evaluate gene trees for these domains, to see where the differences are significant and whether they can be polarized to a particular branch of the tree.
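For readers who have not run this kind of comparison, here is a rough sketch of a simple Chi-squared test on family copy number between two genomes, set up as a 2x2 table of in-family versus not-in-family gene counts. This is my own illustration and not necessarily the exact table layout used in the paper. The gene totals and the second genome's family count are invented placeholders. No continuity correction is applied, and, as noted above, the test ignores phylogeny entirely.

```python
import math


def chi_squared_2x2(family_a, total_a, family_b, total_b):
    """Pearson chi-squared test (1 d.f.) on a 2x2 table of
    in-family vs. not-in-family gene counts for two genomes."""
    table = [
        [family_a, total_a - family_a],
        [family_b, total_b - family_b],
    ]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    grand_total = sum(row_sums)

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / grand_total
            stat += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-squared variable with 1 d.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 36 family members out of 9000 genes in genome A
# versus 60 out of 12000 genes in genome B (totals are made up).
stat, p_value = chi_squared_2x2(36, 9000, 60, 12000)
print(f"chi2 = {stat:.3f}, p = {p_value:.3f}")
```

A test like this treats each genome as an independent sample, which is exactly the assumption the birth-death approach relaxes by modeling family size change along the branches of the species tree.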

So will this genome sequence lead to cheaper, better biofuel production? Certainly it provides an important toolkit for systematically testing individual cellulase enzymes. It’s hard to say how fast this will make an impact, but JBEI and a host of other research groups and biotech companies will now be able to systematically test the utility of these individual enzymes.

There is also work by other groups on the evolution of these Hypocreales fungi, trying to better define when biotrophic and heterotrophic transitions occurred so that fungi with different lifestyles, which might have cellulase enzymes that have not yet been observed, can be sampled. Defining the relationships of these fungi, and when and how many times lifestyle transitions occurred, in order to choose the most diverse fungi may be an important part of discovering novel enzymes.

Also see

Martinez, D., Berka, R.M., Henrissat, B., Saloheimo, M., Arvas, M., Baker, S.E., Chapman, J., Chertkov, O., Coutinho, P.M., Cullen, D., Danchin, E.G., Grigoriev, I.V., Harris, P., Jackson, M., Kubicek, C.P., Han, C.S., Ho, I., Larrondo, L.F., de Leon, A.L., Magnuson, J.K., Merino, S., Misra, M., Nelson, B., Putnam, N., Robbertse, B., Salamov, A.A., Schmoll, M., Terry, A., Thayer, N., Westerholm-Parvinen, A., Schoch, C.L., Yao, J., Barbote, R., Nelson, M.A., Detter, C., Bruce, D., Kuske, C.R., Xie, G., Richardson, P., Rokhsar, D.S., Lucas, S.M., Rubin, E.M., Dunn-Coleman, N., Ward, M., Brettin, T.S. (2008). Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina). Nature Biotechnology DOI: 10.1038/nbt1403