Recent animal-associated fungal genome papers

The genomes of five dermatophyte fungi were sequenced, and analyses of their lifestyles are presented in a new paper out in mBio (Martinez et al. 2012). The authors identified gene family changes associated with lifestyle changes, including proteases that can degrade keratin, suggesting how these species have adapted to obtaining nutrients from an animal host. The continued finding of fungal-specific kinase families in these fungi, extending observations from previous studies in Coprinopsis and Paracoccidioides on the FunK1 kinase family, makes me hope we will someday get some molecular information on the specificity of these families in addition to these copy number observations.
Another paper, published in Genome Research this summer from Emily Troemel's lab and the Broad Institute, describes the sequencing of two microsporidia species that are natural parasites of Caenorhabditis. The paper reveals some surprising things about Microsporidia evolution, including the presence of a clade-specific nucleoside H+ symporter which is only found in bacteria and some eukaryotes and not in any Fungi. The phyletic distribution suggests it was acquired more recently, possibly by lateral gene transfer. This acquisition likely helps the microsporidia cells obtain nucleosides from the host, since the parasite cannot synthesize these. There is also evidence for the evolution of microsporidia-specific secretion signals in the hexokinases, which may be a mechanism for delivering these enzymes into host cells to catalyze rapid growth once inside the host. There are many more gems in this paper, including phylogenetic placement of the microsporidia from phylogenomic approaches (also see related recent work from Toni Gabaldon's lab).

Schizophyllum genome update

Robin Ohm at the JGI has announced the release of version 2 of the Schizophyllum commune genome. This is great news on the heels of the announcement that one of the funded 2012 CSPs will include detailed functional genomics experiments in this mushroom.

I am pleased to announce the public release of the JGI annotation and portal for the improved assembly of Schizophyllum commune.  Annotations of the assembly are now publicly visible at http://jgi.doe.gov/Scommune2 .  Annotation and editing privileges remain password-protected but all other tools are now available to the general public.

A detailed set of statistics on the assembly and annotation can be found on the Info page of that portal:  http://genome.jgi-psf.org/Schco2/Schco2.info.html

 

Neurospora annotation update (v5)

Here is a message from the Broad Institute about a gene annotation update that was made recently in response to an issue that was revealed in the June 2010 release.  This new version is called V5 and should be on its way to GenBank.

Dear Neurospora scientists,

Recently we discovered an issue with the way locus tags were assigned
to our most recent Neurospora gene set, released publicly on the Broad
website in June of 2010. Many genes in this gene set have mismatched
locus numbers compared to the same genes released in February 2010.
Adding to the confusion, both releases were labeled version 4.

To remedy this we have recalled the June locus numbers and released a
new, version 5 gene set. Genes in this set have been numbered to
preserve historical locus numbers (back to the original genbank
release) as much as possible.

Folks who call their favorite genes by their v1, v2 or v3 numbers can
search for them on our web page, which will map them to v5
automatically and accurately. The same will work for most v4 numbers.
Unfortunately, 863 genes have different locus tags in the two v4
releases. If you search for one of them, you will get two hits - the
v5 gene that the February edition mapped to, and the v5 gene that the
June edition mapped to.

Two examples to clarify:

A. Suppose you search for NCU11713.4 on our web page. This query will
retrieve two genes, NCU11688.5 and NCU11713.5. The gene which in the
February release was called NCU11713.4 is the same as NCU11688.5,
while the gene labeled NCU11713.4 in June is the same as NCU11713.5.

B. Searching for NCU11324.4 yields but one hit because that gene, like
most genes, was consistently numbered between the two releases labeled
4.

If you are not sure when you downloaded your genes, the following may
help. If you see any of these locus numbers in your gene set:

NCU00129.4, NCU00457.4, NCU00499.4, NCU00556.4, NCU00627.4,
NCU00685.4, NCU00768.4, NCU00856.4, NCU00986.4, NCU01064.4,
NCU01065.4, NCU01282.4, NCU01299.4, NCU01300.4, NCU01483.4,
NCU01559.4, NCU01560.4, NCU01610.4, NCU01611.4, NCU01664.4,
NCU01665.4, NCU01871.4, NCU01903.4, NCU02200.4, NCU02259.4,
NCU02666.4, NCU02758.4, NCU02837.4, NCU02998.4, NCU03047.4,
NCU03206.4, NCU03773.4, NCU04239.4, NCU04240.4, NCU04518.4,
NCU04519.4, NCU04710.4, NCU04711.4, NCU05275.4, NCU05512.4,
NCU05776.4, NCU06013.4, NCU06370.4, NCU06732.4, NCU07107.4,
NCU07259.4, NCU07260.4, NCU07301.4, NCU07405.4, NCU07856.4,
NCU07857.4, NCU08090.4, NCU08182.4, NCU08323.4, NCU08332.4,
NCU09085.4, NCU09256.4, NCU09257.4, NCU09998.4, NCU10166.4,
NCU10574.4, NCU11040.4, NCU11240.4, NCU11253.4, NCU11376.4,
NCU11390.4, NCU11393.4

then your genes are from the February 2010 gene set. However, if you see

NCU00082.4, NCU00083.4, NCU00084.4, NCU00085.4, NCU00516.4,
NCU01819.4, NCU04299.4, NCU04300.4, NCU04301.4, NCU04302.4,
NCU04303.4, NCU04304.4, NCU04305.4, NCU05000.4, NCU05111.4,
NCU05112.4, NCU05113.4, NCU05114.4, NCU05115.4, NCU05116.4,
NCU05448.4, NCU05452.4, NCU06667.4, NCU07323.4, NCU09066.4,
NCU10179.4, NCU10301.4, NCU10379.4, NCU10383.4, NCU10753.4,
NCU10866.4, NCU10914.4, NCU11068.4, NCU11182.4, NCU12157.4,
NCU12158.4, NCU12159.4, NCU12160.4, NCU12161.4, NCU12162.4,
NCU12163.4, NCU12164.4, NCU12165.4, NCU12166.4, NCU12167.4,
NCU12168.4, NCU12169.4, NCU12170.4, NCU12171.4, NCU12172.4,
NCU12173.4, NCU12174.4, NCU12175.4, NCU12176.4, NCU12177.4,
NCU12178.4, NCU12179.4, NCU12180.4, NCU12181.4, NCU12182.4,
NCU12183.4, NCU12184.4, NCU12185.4, NCU12186.4, NCU12187.4, NCU12188.4

then your genes are from the June 2010 release.

Attached please find five mapping tables which can be used to migrate
locus numbers from any of the previous releases to the latest version
5 locus tags (linked below).

We apologize for any confusion this may cause.
Love,
The Broad Institute

I’ve also uploaded the locus update files, which map between versions of the annotation.
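To make the two-hit behavior described in the message concrete, here is a minimal Python sketch. It is not the Broad's tooling, and I am assuming the mapping tables can be loaded as simple old-tag to new-tag dictionaries (the real file layout may differ). The diagnostic tags are a small subset copied from the two lists above, the NCU11713.4 entries come from example A, and the v5 tag shown for NCU11324.4 is an assumption for illustration only.

```python
# Diagnostic tags: a small subset copied from the two lists in the message.
FEB_ONLY = {"NCU00129.4", "NCU00457.4", "NCU11393.4"}
JUNE_ONLY = {"NCU00082.4", "NCU05000.4", "NCU12188.4"}


def guess_v4_release(locus_tags):
    """Guess which 2010 'version 4' release a set of locus tags came from."""
    tags = set(locus_tags)
    has_feb = bool(tags & FEB_ONLY)
    has_june = bool(tags & JUNE_ONLY)
    if has_feb and not has_june:
        return "February 2010"
    if has_june and not has_feb:
        return "June 2010"
    if has_feb and has_june:
        return "mixed"
    return "unknown"


def map_to_v5(tag, feb_map, june_map):
    """Return the distinct v5 tags a v4 tag maps to (one or two hits)."""
    hits = {table[tag] for table in (feb_map, june_map) if tag in table}
    return sorted(hits)


# Hypothetical in-memory mapping tables (assumed format: v4 tag -> v5 tag).
feb_map = {"NCU11713.4": "NCU11688.5", "NCU11324.4": "NCU11324.5"}
june_map = {"NCU11713.4": "NCU11713.5", "NCU11324.4": "NCU11324.5"}

print(map_to_v5("NCU11713.4", feb_map, june_map))  # two hits, as in example A
print(map_to_v5("NCU11324.4", feb_map, june_map))  # one hit, as in example B
print(guess_v4_release(["NCU00129.4", "NCU11324.4"]))
```

If the release can be identified first with `guess_v4_release`, only the matching table needs to be consulted, and the ambiguity disappears.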

Microsporidia genomes on the way

New genomes from Microsporidia are on the way from the Broad Institute and other groups, and will be a boon to those working on these fascinating creatures. Microsporidia are obligate intracellular parasites of eukaryotic cells and many can cause serious disease in humans. Some parasitize worms and insects too. The evolutionary placement of these species in the fungi is still debated, with recent evidence placing them as derived members of the Mucormycotina based on shared synteny (conserved gene order), in particular around the mating type locus.  There is still some debate as to where this group belongs in the fungal kingdom, because their highly derived characteristics and long branches make them hard to place.  The synteny-based evidence was another way to find a phylogenetic placement for them, but it would be helpful to have additional support in the form of additional shared derived characteristics that group Mucormycotina and Microsporidia. There is hope that an increased number of genome sequences and phylogenomic approaches can help resolve the placement and further our understanding of the evolution of the group.

For data analysis, a new genome database for comparing these genomes, MicrosporidiaDB, is now online. This project has begun incorporating the available genomes and provides a data mining interface that extends from the EuPathDB project.

Genome sequence of mushroom Schizophyllum commune

I am excited to announce the publication of another mushroom genome this week. The mushroom Schizophyllum commune is an important model system for mushroom biology and development. Its genome was sequenced as part of efforts at the Joint Genome Institute and a collection of international researchers.  The data and analyses from these efforts are presented in a publication appearing in Nature Biotechnology today.

Studies in mushrooms can have an important impact on other research areas.  Mushrooms can be useful in biotechnology as protein biosynthesis factories for producing compounds or even as an edible delivery mechanism for new drugs.  What we found in the analysis of this genome includes clues to the mechanisms by which white rotting fungi degrade lignin, through analysis of enzyme families.  We also saw evidence for extensive antisense transcription during different developmental stages, suggesting some important clues as to how gene regulation could impact or control developmental progression.  Through gene expression comparison (by MPSS), a large number of transcription factors were shown to be differentially regulated during sexual development.  Knockouts of two of these (fst3 and fst4) resulted in changes in the ability to form mushrooms (fst4) or in smaller mushrooms (fst3).

There are several more interesting findings in this work that I hope to add back to this post when there is a little more time.

Ohm, R., de Jong, J., Lugones, L., Aerts, A., Kothe, E., Stajich, J., de Vries, R., Record, E., Levasseur, A., Baker, S., Bartholomew, K., Coutinho, P., Erdmann, S., Fowler, T., Gathman, A., Lombard, V., Henrissat, B., Knabe, N., Kües, U., Lilly, W., Lindquist, E., Lucas, S., Magnuson, J., Piumi, F., Raudaskoski, M., Salamov, A., Schmutz, J., Schwarze, F., vanKuyk, P., Horton, J., Grigoriev, I., & Wösten, H. (2010). Genome sequence of the model mushroom Schizophyllum commune Nature Biotechnology DOI: 10.1038/nbt.1643

A mushroom on the cover

I’ll indulge a bit here and happily point to the cover of this week’s PNAS, which has an image of Coprinopsis cinerea mushrooms fruiting and refers to our article on the genome sequence of this important model fungus.  You should also enjoy the commentary article from John Taylor and Chris Ellison that provides a summary of some of the high points in the paper.

Coprinopsis cover

Stajich, J., Wilke, S., Ahren, D., Au, C., Birren, B., Borodovsky, M., Burns, C., Canback, B., Casselton, L., Cheng, C., Deng, J., Dietrich, F., Fargo, D., Farman, M., Gathman, A., Goldberg, J., Guigo, R., Hoegger, P., Hooker, J., Huggins, A., James, T., Kamada, T., Kilaru, S., Kodira, C., Kues, U., Kupfer, D., Kwan, H., Lomsadze, A., Li, W., Lilly, W., Ma, L., Mackey, A., Manning, G., Martin, F., Muraguchi, H., Natvig, D., Palmerini, H., Ramesh, M., Rehmeyer, C., Roe, B., Shenoy, N., Stanke, M., Ter-Hovhannisyan, V., Tunlid, A., Velagapudi, R., Vision, T., Zeng, Q., Zolan, M., & Pukkila, P. (2010). Insights into evolution of multicellular fungi from the assembled chromosomes of the mushroom Coprinopsis cinerea (Coprinus cinereus) Proceedings of the National Academy of Sciences, 107 (26), 11889-11894 DOI: 10.1073/pnas.1003391107

Where can I get orthologs?

There are several databases that include orthology prediction for fungi. These all have pros and cons. Some are more comprehensive and have many more species. Some provide curated orthology and paralogy assignments, which should be pretty stable. Some are automated, so the groupings and ortholog group IDs change at each iteration.

  • A phylogenetic approach from a Saccharomyces perspective is at PhylomeDB.
  • Fungal Orthogroups is based on the Synergy algorithm from I. Wapinski, formerly of the Regev group at the Broad Institute.
  • Yeast gene order browser (YGOB) for Saccharomyces spp and CGOB for Candida spp.
  • OrthoMCL database based on whole genomes, not a ton of fungi but useful starting set.
  • Ensembl Genomes provides ortholog prediction as part of the Compara pipeline though there is a limited phylogenetic diversity in the current Ensembl Fungal genomes.
  • TreeFam has Saccharomyces cerevisiae and Schizosaccharomyces pombe as the two fungi included in the curated ortholog assignments and phylogenies.
  • SIMAP provides pre-computed similarities among all proteins in UniProt.
  • InParanoid provides a pretty comprehensive set of 100 available whole genomes, including many fungal genomes which I tried to help select.
  • JGI’s Mycocosm attempts to provide a fungal-focused paralog/gene family look at clusters of genes based on whole genomes.
  • E-Fungi is also an attempt at automated clustering with some fancy webservices logic.
  • Fungal Transcription Factor database focuses just on families of transcription factors.

Some of these tools are better than others in terms of providing downloadable tables.  Another problem is which identifiers are used. Many biologists use gene names or locus identifiers, not UniProt/GenPept IDs, to identify genes or proteins of interest.  So tools that just cluster UniProt data aren’t as useful as those which refer to the gene or locus names. Also, providing a way to download all the data from a comparison is important for further mining and grouping of the data or for cross-referencing local datasets.  Plugging in gene IDs one by one is not really an approach that respects the idea that your user wants to ask sophisticated queries.

Also, beware that some approaches produce lists of pairwise comparisons whereas others find orthologous groupings.  So if you want to find the Rad59 ortholog from all fungi, it may be easier or harder depending on the source.
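The difference between a pairwise list and an orthologous grouping can be sketched in a few lines of Python. This is only an illustration: it merges pairs by simple single-linkage (connected components via union-find), which is cruder than what any of the databases above actually do, and the gene identifiers are made up for the example.

```python
from collections import defaultdict


def group_orthologs(pairs):
    """Merge pairwise ortholog calls into groups (single-linkage clustering)."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    # Sort members and groups so the output is deterministic.
    return sorted((sorted(members) for members in groups.values()),
                  key=lambda members: members[0])


# Hypothetical pairwise hits in a "species|gene" naming scheme.
pairs = [
    ("Scer|RAD59", "Ncra|geneA"),
    ("Ncra|geneA", "Anid|geneB"),
    ("Scer|RAD52", "Spom|geneC"),
]

for group in group_orthologs(pairs):
    print(group)
```

Note that single-linkage merging will happily chain paralogs together through one bad pairwise call, which is one reason the dedicated orthogroup methods exist.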

[I may make this a static page in the future to allow for more detailed updating since I know the available resources wax and wane]

An Inky-cap mushroom genome

Francis Martin has written up a delightful summary pointing to our publication of the genome of Coprinopsis cinerea, which appears in the early edition of PNAS and will grace the cover at the end of the month.  I encourage you to take a look at Francis’s post and the paper, available as Open Access from PNAS.  I’ll do my best to post a summary of the paper when I get a free moment.

For now I’ll leave you with a picture of this cute little mushroom fruiting in the lab and a link to many more at Flickr.

Mature Coprinus cinereus (Coprinopsis cinerea)

Methylation to the max!

A new paper from the Zilberman lab at UC Berkeley shows the application of high throughput sequencing to the study of DNA methylation in eukaryotes.  They generate a huge data set of whole genome methylation patterns in several plants, animals, and five fungi, including an early diverging Zygomycete.

The work was performed using bisulfite sequencing (Illumina) to capture methylated DNA, along with RNA-Seq of mRNA to assay gene expression for some of the species. They also performed some ChIP-Seq of H2A.Z in pufferfish to look at nucleosome positioning in that species. For aligning the bisulfite reads to the genome, they used Bowtie (though I’d be curious how a new aligner, BRAT, designed for bisulfite-seq reads would perform).
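For readers new to bisulfite data, here is a small Python sketch of the underlying idea. It is not the authors' pipeline: I am assuming the common trick of comparing reads and reference in a C-to-T converted space, and it handles only a forward-strand, ungapped read at a known offset. The sequences are invented.

```python
def bisulfite_convert(seq):
    """In-silico full conversion: treat every C as T (no methylation retained)."""
    return seq.upper().replace("C", "T")


def call_methylation(reference, read, offset):
    """Call methylation at each reference C covered by an ungapped forward read.

    A C in the read means the cytosine was protected (methylated);
    a T means it was converted (unmethylated). Other bases are skipped.
    """
    calls = {}
    for i, base in enumerate(read.upper()):
        pos = offset + i
        if reference[pos].upper() != "C":
            continue
        if base == "C":
            calls[pos] = True
        elif base == "T":
            calls[pos] = False
    return calls


reference = "ACGTCCGA"
read = "ATGTCTGA"  # hypothetical bisulfite-treated read covering the whole reference

# In converted space the read and reference match, which is what makes alignment possible.
print(bisulfite_convert(read) == bisulfite_convert(reference))
print(call_methylation(reference, read, 0))
```

Real data also needs the reverse strand (G-to-A conversion), mismatches, and coverage thresholds, all of which are left out here.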

They find several interesting patterns in animal and fungal genomes.  I’ll highlight one in the fungi. They find an unexpected pattern in U. reesii: repeats with reduced CGs, which show signatures of a RIP-like process, are also methylated.  This finding is consistent with observations in Coccidioides (Sharpton et al, Genome Res 2009) that showed depleted CG pairs in repeats.  Since the phenomenon is also found in Coccidioides genomes, this methylation of some repeats is likely not unique to U. reesii but may be important in the recent evolution of the Onygenales fungi or the larger Eurotiomycetes.  There are several other interesting findings in this first such study, which shows methylation data for Zygomycete fungi and a basidiomycete close to my heart, Coprinopsis.  It will be interesting to dig deeper into this data and see how the patterns of methylation compare to other genomic features and to the mechanisms regulating the methylation process.
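One simple way to look for CG depletion in a repeat is the observed/expected CpG ratio, sketched below in Python. This is a generic measure and an assumption on my part; the papers mentioned above may use different indices (RIP analyses often use dinucleotide ratios such as TpA/ApT instead). The sequences are toy examples.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Returns None when the sequence has no C or no G, where the ratio is undefined.
    """
    seq = seq.upper()
    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return None
    return seq.count("CG") * len(seq) / (c_count * g_count)


cg_rich = "ACGTACGTCGCG"      # toy sequence with many CG dinucleotides
cg_depleted = "ACTGACTGTCAG"  # toy sequence with C and G but no CG

print(cpg_obs_exp(cg_rich))
print(cpg_obs_exp(cg_depleted))
```

A ratio well below 1 across the repeats, compared with the genomic background, is the kind of signal that would be consistent with the depletion described here.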

Zemach, A., McDaniel, I., Silva, P., & Zilberman, D. (2010). Genome-Wide Evolutionary Analysis of Eukaryotic DNA Methylation Science DOI: 10.1126/science.1186366

I’ll have the truffles and huitlacoche

A couple of papers should have captured your attention lately in the realm of fungal genomics.

One is the publication of the genome of the black truffle Tuber melanosporum. This appears as an advance publication at Nature (OA by virtue of Nature’s agreement on genome papers) along with a NYT writeup and is a tasty exploration of the genome of an ascomycete ectomycorrhizal (ECM) fungus. There are several gems in there, including differences in transposable element content and in the content of gene families related to carbohydrate metabolism. This genome helps open the doorway for exploring the several independent origins of ECM in both ascomycete and basidiomycete fungi.

I’ll also point out that the analysis of the mating type locus found in this genome has applied aspects, suggesting that inoculation of roots with both mating types may increase truffle yields in truffle farms. Evidence for sexual reproduction also comes from this genome analysis, based on the sexual cycle genes present and the structure of the MAT locus.  Much like what was revealed in the genome analysis of the previously ‘asexual’ species Aspergillus fumigatus (and the later reconstitution of a sexual cycle), the Tuber genome has the potential for mating, and Tuber is a heterothallic (outcrossing) fungus based on its mating type locus, just like many other filamentous Ascomycete species.

A second paper I encourage you to take a look at (those with a Science subscription) is from Virginia Walbot’s lab on the formation of tumors by U. maydis in maize. These tumors end up destroying the corn but can produce a delicious (to some) dish, huitlacoche. Her lab has been pursuing the idea that the fungus co-opts the host system by secreting proteins that act in the same way as native proteins, and that it has a tissue- or organ-specific repertoire. U. maydis can grow inside corn without detection, and the formation of tumors seems to be a manipulation of the plant as much as it is the pathogen directly taking resources from the plant.  It reminds me a bit of the production of secondary metabolites that can control plant growth, like the gibberellins produced by fungi.  This kind of manipulation, together with the ability to evade detection, suggests a pretty specific set of controls that prevent the fungus from doing the wrong thing at the wrong time. So they set out to see whether there is a set of organ-specific genes that the fungus uses during infection, which would suggest a very host-specific strategy by this corn smut.

In this paper the authors evaluate the role of fungal genes specifically expressed during infection of different organs and also the role of secreted proteins in colonization of those organs.  In what is impressive and elegant work, the authors show through the use of microarrays and genetics that there is plant tissue-specific gene expression in U. maydis, so infections in leaves express a different set of genes than those in seedlings.  Genetic and phenotypic evaluation of fungal strains with knockouts of sets of the predicted secreted proteins confirmed a role for specific secreted proteins that previously may not have had any discernible phenotype. They infect maize with strains carrying knockouts of sets of genes that encode secreted proteins and compare the virulence when these strains infect individual organs of the host.  They show significantly different virulence in the various tissues for some of the mutants, suggesting an organ-specific role in virulence for secreted proteins. They also go on to show that some of this organ-specific infection requires organ-specific gene expression in the host, by evaluating maize mutants and the ability of the fungus to infect different organs.

Future work will hopefully follow up to see what these secreted proteins are manipulating in the host and how they enable virulence, whether by protecting the pathogen, avoiding detection by turning off host responses, or co-opting host gene networks in some other way.

Martin F, Kohler A, Murat C, Balestrini R, Coutinho PM, Jaillon O, Montanini B, Morin E, Noel B, Percudani R, Porcel B, Rubini A, Amicucci A, Amselem J, Anthouard V, Arcioni S, Artiguenave F, Aury JM, Ballario P, Bolchi A, Brenna A, Brun A, Buée M, Cantarel B, Chevalier G, Couloux A, Da Silva C, Denoeud F, Duplessis S, Ghignone S, Hilselberger B, Iotti M, Marçais B, Mello A, Miranda M, Pacioni G, Quesneville H, Riccioni C, Ruotolo R, Splivallo R, Stocchi V, Tisserant E, Viscomi AR, Zambonelli A, Zampieri E, Henrissat B, Lebrun MH, Paolocci F, Bonfante P, Ottonello S, & Wincker P (2010). Périgord black truffle genome uncovers evolutionary origins and mechanisms of symbiosis. Nature PMID: 20348908

Skibbe DS, Doehlemann G, Fernandes J, & Walbot V (2010). Maize tumors caused by Ustilago maydis require organ-specific genes in host and pathogen. Science (New York, N.Y.), 328 (5974), 89-92 PMID: 20360107