Category Archives: genome

NSF Postdoc opportunity for Research using biological collections

Earlier this year the NSF released a postdoc opportunity for research using Biological Collections. In particular these can be strain collections and stock collections. The US Culture Collection Network is a Research Coordination Network which brings together many collaborating culture collections. You can find many of the U.S. living collections there, including fungal centers like the Phaff Yeast Collection and the Fungal Genetics Stock Center. The Gilbertson Mycological Herbarium at U Arizona, under Elizabeth Arnold’s leadership, has developed a rich collection of endophytic fungi and would be another excellent environment for working with these resources. Kyria Boundy-Mills, the curator of the Phaff collection, has also expressed interest in either hosting or helping to work with a postdoc on this. There is tremendous biodiversity of fungi available in these and other culture collections, so this seems like a great chance to tap into them.
This would be a great opportunity to link work in the 1000 Fungal genomes project with sampling from culture collections (not just sequencing, but growing the strains, characterizing growth and carbon source utilization, and integrating that with predictions made from genome comparisons). If this is something interesting to you, do get in touch with some of the curators at these collections, but also with my lab; I expect many other labs would be interested in hosting someone to work on questions that take advantage of these living collections of fungi.
Proposals are to be submitted by potential postdocs. The submitter must be a US citizen or US permanent resident. The next deadline is November 3, 2015. Funding total for the program is $8 million, with 40 awards anticipated for up to two years. Here’s some key text from the solicitation:

Competitive Area 2. Postdoctoral Research Fellowships Using Biological Collections.

Biological research collections represent the documented scientific history of life on Earth, and the U.S. museum community alone curates over a billion specimens ranging from bacteria to plants, insects and vertebrates, as well as fossils. Across the globe, collections represent critical infrastructure and support essential research activities in biology and its related fields. Scientists, government agencies, industry and citizens utilize collections to document and understand evolution and biodiversity, study global change, formulate advice on conservation planning, educate the general public, improve interactions between sciences, and devise new practical applications from science to every day life. New technologies supported by NSF in digitization, such as the Advancing Digitization of Biodiversity Collections (ADBC) program, are making collections and their associated data, whether they are physical specimens, text, images, sounds, or data tables, searchable in online databases. Despite this clear progress in improving access to physical specimens and their associated metadata, collections remain under-utilized for answering contemporary questions about fundamental aspects of biological processes. Thus, collections are poised to become a critical resource for developing transformative approaches to address key questions in biology and potentially develop applications that extend biology to physical, mathematical, engineering and social sciences. This postdoctoral track seeks transformative approaches that use biological collections in highly innovative ways to address grand challenges in biology. Priority may be given to applicants who integrate biological collections and associated resources with other types of data in an effort to forge new insight into areas traditionally funded by BIO. 
Examples of key questions in biology of interest include, but are not limited to, links between genotype and phenotype, evolutionary developmental biology, comparative approaches in functional and developmental neurobiology, and the biophysics of nanostructures. Using collections as a resource for grand challenge questions in biology is expected to present new opportunities to advance understanding of biological processes and systems, inspiring new discoveries in areas with relevance to other disciplines with overlapping interests in biological systems. Applicants must document access to the selected collection(s) in the research and training plan.

Some recent fungal and oomycete genome papers

A few papers covering some published genomes you should definitely read if you have the chance.

  • Youssef NH, Couger MB, Struchtemeyer CG, Liggenstoffer AS, Prade RA, Najar FZ, Atiyeh HK, Wilkins MR, & Elshahed MS (2013). The Genome of the Anaerobic Fungus Orpinomyces sp. Strain C1A Reveals the Unique Evolutionary History of a Remarkable Plant Biomass Degrader. Applied and environmental microbiology, 79 (15), 4620-34 PMID: 23709508
    Describes first published genome of a Neocallimastigomycota fungus that resides within the rumen gut. Cool findings related to lignocellulolytic degradation pathways and basic biology about early diverging fungi which have intact flagellar apparatus.
  • Bushley KE, Raja R, Jaiswal P, Cumbie JS, Nonogaki M, Boyd AE, Owensby CA, Knaus BJ, Elser J, Miller D, Di Y, McPhail KL, & Spatafora JW (2013). The Genome of Tolypocladium inflatum: Evolution, Organization, and Expression of the Cyclosporin Biosynthetic Gene Cluster. PLoS Genetics, 9 (6) PMID: 23818858
    Describes the genome of a pathogen of beetle larvae (and related to Cordyceps). This fungus is important as it produces the immunosuppressive drug cyclosporin as a secondary metabolite. Analysis of the complete secondary metabolite pathways in the genome helps shed light on the origin of this and other secondary metabolite gene clusters.
  • Schardl CL, Young CA, Hesse U, Amyotte SG, Andreeva K, Calie PJ, Fleetwood DJ, Haws DC, Moore N, Oeser B, Panaccione DG, Schweri KK, Voisey CR, Farman ML, Jaromczyk JW, Roe BA, O’Sullivan DM, Scott B, Tudzynski P, An Z, Arnaoudova EG, Bullock CT, Charlton ND, Chen L, Cox M, Dinkins RD, Florea S, Glenn AE, Gordon A, Güldener U, Harris DR, Hollin W, Jaromczyk J, Johnson RD, Khan AK, Leistner E, Leuchtmann A, Li C, Liu J, Liu J, Liu M, Mace W, Machado C, Nagabhyru P, Pan J, Schmid J, Sugawara K, Steiner U, Takach JE, Tanaka E, Webb JS, Wilson EV, Wiseman JL, Yoshida R, & Zeng Z (2013). Plant-symbiotic fungi as chemical engineers: multi-genome analysis of the clavicipitaceae reveals dynamics of alkaloid loci. PLoS Genetics, 9 (2) PMID: 23468653 

    A very rich and detailed paper, this presents a gold mine of complete genome data for 15 species along with secondary metabolite profiling. The data include genomes of 10 epichloae fungi that are endophytes of grasses, three Claviceps species (ergot fungi), a morning-glory symbiont and a bamboo pathogen. By combining pathway analyses of the genes in these genomes with profiling of alkaloid production, the authors were able to link clusters to products in many cases. This is a rich and useful paper for anyone working on secondary metabolites and sets the standard for how a biological question can be answered by genome sequencing of a clade of related species.

  • Wicker T, Oberhaensli S, Parlange F, Buchmann JP, Shatalina M, Roffler S, Ben-David R, Doležel J, Simková H, Schulze-Lefert P, Spanu PD, Bruggmann R, Amselem J, Quesneville H, van Themaat EV, Paape T, Shimizu KK, & Keller B (2013). The wheat powdery mildew genome shows the unique evolution of an obligate biotroph. Nature Genetics PMID: 23852167

    Genome of the wheat pathogen Blumeria graminis f.sp. tritici. This paper includes identification and analysis of effector genes and dates the emergence of the pathogen relative to the domestication and diversification of wheat.
  • Jiang RH, de Bruijn I, Haas BJ, Belmonte R, Löbach L, Christie J, van den Ackerveken G, Bottin A, Bulone V, Díaz-Moreno SM, Dumas B, Fan L, Gaulin E, Govers F, Grenville-Briggs LJ, Horner NR, Levin JZ, Mammella M, Meijer HJ, Morris P, Nusbaum C, Oome S, Phillips AJ, van Rooyen D, Rzeszutek E, Saraiva M, Secombes CJ, Seidl MF, Snel B, Stassen JH, Sykes S, Tripathy S, van den Berg H, Vega-Arreguin JC, Wawra S, Young SK, Zeng Q, Dieguez-Uribeondo J, Russ C, Tyler BM, & van West P (2013). Distinctive Expansion of Potential Virulence Genes in the Genome of the Oomycete Fish Pathogen Saprolegnia parasitica. PLoS Genetics, 9 (6) PMID: 23785293

    The genome of the fish pathogen and Oomycete Saprolegnia provides additional perspective on this diverse group of organisms, the evolution of metabolism and host-associated lifestyles.
  • Aylward FO, Burnum-Johnson KE, Tringe SG, Teiling C, Tremmel DM, Moeller JA, Scott JJ, Barry KW, Piehowski PD, Nicora CD, Malfatti SA, Monroe ME, Purvine SO, Goodwin LA, Smith RD, Weinstock GM, Gerardo NM, Suen G, Lipton MS, & Currie CR (2013). Leucoagaricus gongylophorus produces diverse enzymes for the degradation of recalcitrant plant polymers in leaf-cutter ant fungus gardens. Applied and environmental microbiology, 79 (12), 3770-8 PMID: 23584789
    Genome of the ant-farmed fungus Leucoagaricus. This paper presents a draft genome assembly, a useful step in understanding the fascinating symbiosis between ants and their cultivated fungi.

2012 Fungal Genomes: a review of mycological genomic accomplishments

2012 was certainly a banner year in genome sequence production and publications. The cost of generating the data keeps dropping and the automation for assembly and annotation continues to improve, making it possible for a range of groups to publish genomes.

I made an NCBI PubMed Collection of these here: Fungal Genomes 2012

Some notable fungal genome publications include

There were also several new insights into the evolution of wood decay fungi derived from new genomes of basidiomycete fungi. This includes

(Now I might have missed a few in my attempt to get this done before holidays overtake me – if so, please post comments or tweets and I’ll be sure to amend the list on pubmed and here.)

A new trend for fungal genome papers can be seen now in the Genome Announcements of Eukaryotic Cell, which aim to get the genome data out quickly with a citable reference. These are short descriptions which I expect will become a more popular way to ensure data made public can also be cited. I only counted about 5 published in 2012 but I expect to see a lot more of these in 2013, either at EC or other journals. I’m sure there will still be some tension between providers making data public as soon as possible and the sponsoring authors’ desire to have first crack at analyzing and publishing interpretations and comparisons of the genome(s). The bacterial community has been doing this with Genome Reports in the SIGS journal and the Journal of Bacteriology, so we will see what happens as these small eukaryotic genomes become even easier to produce.

I look forward to an exciting year as the 1000 Fungal genomes and other JGI projects start to roll out more genomes. I also predict there will be many more resequencing datasets published as functional and population genomics studies. It will also probably be a countdown to the last Sanger sequenced genomes, and we will see how the many flavors of next generation sequencing are optimized for genome generation. I am hopeful that tools for automated annotation and comparison will become even easier for more people to use and that we start to provide a shared repository of gene predictions. I’ve just launched the latter and look forward to engaging more people to contribute to this.

Recent animal-associated fungal genome papers

The genomes of five dermatophyte fungi were sequenced and the analyses of their lifestyles presented in a new paper out in mBio (Martinez et al. 2012). The authors were able to identify gene family changes that associate with lifestyle changes, including proteases that can degrade keratin, suggesting how these species have adapted to obtaining nutrients from an animal host. The continued finding of fungal-specific kinase families in these fungi, extending the observations from previous studies in Coprinopsis and Paracoccidioides on the FunK1 kinase family, makes me hope we will some day get some molecular information on the specificity of these families in addition to these copy number observations.
Another paper, published in Genome Research this summer from Emily Troemel’s lab and the Broad Institute, describes the sequencing of two microsporidia species that are natural parasites of Caenorhabditis. The paper reveals some surprising things about Microsporidia evolution, including the presence of a clade-specific nucleoside H+ symporter which is only found in bacteria and some eukaryotes and not in any Fungi. The phyletic distribution suggests it was acquired more recently and could be from lateral gene transfer. This acquisition likely helps the microsporidia cells obtain nucleosides from the host since the parasite cannot synthesize these. There is also evidence of evolution of microsporidia-specific secretion signals in the hexokinases, which may be a mechanism for delivery of these enzymes into host cells to catalyze rapid growth once inside the host. There are many more gems in this paper, including phylogenetic placement of the microsporidia from phylogenomic approaches (also see related recent work from Toni Gabaldon’s lab).

Schizophyllum genome update

Robin Ohm at the JGI has announced the release of version 2 of the Schizophyllum commune genome. This is great news on the heels of the announcement that one of the funded 2012 CSPs will include detailed functional genomics experiments in this mushroom.

I am pleased to announce the public release of the JGI annotation and portal for the improved assembly of Schizophyllum commune.  Annotations of the assembly are now publicly visible at .  Annotation and editing privileges remain password-protected but all other tools are now available to the general public.

A detailed set of statistics on the assembly and annotation can be found on the Info page of that portal:


Neurospora annotation update (v5)

Here is a message from the Broad Institute about a gene annotation update that was made recently in response to an issue that was revealed in the June 2010 release.  This new version is called V5 and should be on its way to GenBank.

Dear Neurospora scientists,

Recently we discovered an issue with the way locus tags were assigned
to our most recent Neurospora gene set, released publicly on the Broad
website in June of 2010. Many genes in this gene set have mismatched
locus numbers compared to the same genes released in February 2010.
Adding to the confusion, both releases were labeled version 4.

To remedy this we have recalled the June locus numbers and released a
new, version 5 gene set. Genes in this set have been numbered to
preserve historical locus numbers (back to the original genbank
release) as much as possible.

Folks who call their favorite genes by their v1, v2 or v3 numbers can
search for them on our web page, which will map them to v5
automatically and accurately. The same will work for most v4 numbers.
Unfortunately, 863 genes have different locus tags in the two v4
releases. If you search for one of them, you will get two hits - the
v5 gene that the February edition mapped to, and the v5 gene that the
June edition mapped to.

Two examples to clarify:

A. Suppose you search for NCU11713.4 on our web page. This query will
retrieve two genes, NCU11688.5 and NCU11713.5. The gene which in the
February release was called NCU11713.4 is the same as NCU11688.5,
while the gene labeled NCU11713.4 in June is the same as NCU11713.5.

B. Searching for NCU11324.4 yields but one hit because that gene, like
most genes, was consistently numbered between the two releases labeled

If you are not sure when you downloaded your genes, the following may
help. If you see any of these locus numbers in your gene set:

NCU00129.4, NCU00457.4, NCU00499.4, NCU00556.4, NCU00627.4,
NCU00685.4, NCU00768.4, NCU00856.4, NCU00986.4, NCU01064.4,
NCU01065.4, NCU01282.4, NCU01299.4, NCU01300.4, NCU01483.4,
NCU01559.4, NCU01560.4, NCU01610.4, NCU01611.4, NCU01664.4,
NCU01665.4, NCU01871.4, NCU01903.4, NCU02200.4, NCU02259.4,
NCU02666.4, NCU02758.4, NCU02837.4, NCU02998.4, NCU03047.4,
NCU03206.4, NCU03773.4, NCU04239.4, NCU04240.4, NCU04518.4,
NCU04519.4, NCU04710.4, NCU04711.4, NCU05275.4, NCU05512.4,
NCU05776.4, NCU06013.4, NCU06370.4, NCU06732.4, NCU07107.4,
NCU07259.4, NCU07260.4, NCU07301.4, NCU07405.4, NCU07856.4,
NCU07857.4, NCU08090.4, NCU08182.4, NCU08323.4, NCU08332.4,
NCU09085.4, NCU09256.4, NCU09257.4, NCU09998.4, NCU10166.4,
NCU10574.4, NCU11040.4, NCU11240.4, NCU11253.4, NCU11376.4,
NCU11390.4, NCU11393.4

then your genes are from the February 2010 gene set. However, if you see

NCU00082.4, NCU00083.4, NCU00084.4, NCU00085.4, NCU00516.4,
NCU01819.4, NCU04299.4, NCU04300.4, NCU04301.4, NCU04302.4,
NCU04303.4, NCU04304.4, NCU04305.4, NCU05000.4, NCU05111.4,
NCU05112.4, NCU05113.4, NCU05114.4, NCU05115.4, NCU05116.4,
NCU05448.4, NCU05452.4, NCU06667.4, NCU07323.4, NCU09066.4,
NCU10179.4, NCU10301.4, NCU10379.4, NCU10383.4, NCU10753.4,
NCU10866.4, NCU10914.4, NCU11068.4, NCU11182.4, NCU12157.4,
NCU12158.4, NCU12159.4, NCU12160.4, NCU12161.4, NCU12162.4,
NCU12163.4, NCU12164.4, NCU12165.4, NCU12166.4, NCU12167.4,
NCU12168.4, NCU12169.4, NCU12170.4, NCU12171.4, NCU12172.4,
NCU12173.4, NCU12174.4, NCU12175.4, NCU12176.4, NCU12177.4,
NCU12178.4, NCU12179.4, NCU12180.4, NCU12181.4, NCU12182.4,
NCU12183.4, NCU12184.4, NCU12185.4, NCU12186.4, NCU12187.4, NCU12188.4

then your genes are from the June 2010 release.

Attached please find five mapping tables which can be used to migrate
locus numbers from any of the previous releases to the latest version
5 locus tags (linked below).

We apologize for any confusion this may cause.
The Broad Institute

I’ve also uploaded the locus update files which map between versions of the annotation.
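
To make the logic of that message concrete, here is a minimal Python sketch of the two checks it describes: guessing which v4 release a gene list came from, and looking up a v4 locus tag in both mapping tables. The diagnostic tags are a small subset copied from the lists in the message, and the one mapped example is the NCU11713.4 case it gives. The function names and the use of plain dictionaries for the mapping tables are my own assumptions, not the format of the Broad files.

```python
# A few diagnostic v4 locus tags copied from the Broad message above.
# (Subset only; the full lists are in the message.)
FEB_2010_ONLY = {"NCU00129.4", "NCU00457.4", "NCU00499.4", "NCU11393.4"}
JUNE_2010_ONLY = {"NCU00082.4", "NCU00083.4", "NCU05000.4", "NCU12188.4"}


def guess_v4_release(locus_tags):
    """Guess which 2010 release a set of v4 locus tags came from."""
    tags = set(locus_tags)
    has_feb = bool(tags & FEB_2010_ONLY)
    has_june = bool(tags & JUNE_2010_ONLY)
    if has_feb and not has_june:
        return "February 2010"
    if has_june and not has_feb:
        return "June 2010"
    if has_feb and has_june:
        return "mixed"  # tags from both releases: inspect by hand
    return "unknown"    # no diagnostic tag present


def map_to_v5(tag, feb_map, june_map):
    """Return every distinct v5 tag a v4 tag maps to (one or two hits).

    feb_map and june_map are hypothetical dicts standing in for the
    February and June v4-to-v5 mapping tables.
    """
    hits = {table[tag] for table in (feb_map, june_map) if tag in table}
    return sorted(hits)


# The ambiguous example from the message: two different v5 genes.
feb_map = {"NCU11713.4": "NCU11688.5"}
june_map = {"NCU11713.4": "NCU11713.5"}
print(guess_v4_release(["NCU00129.4", "NCU11324.4"]))
print(map_to_v5("NCU11713.4", feb_map, june_map))
```

A gene that was numbered consistently in both releases maps to the same v5 tag in both tables, so the set collapses to a single hit, which matches the behavior described for the web search.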

Microsporidia genomes on the way

New genomes from Microsporidia are on the way from the Broad Institute and other groups, and will be a boon to those working on these fascinating creatures. Microsporidia are obligate intracellular parasites of eukaryotic cells and many can cause serious disease in humans. Some parasitize worms and insects too. The evolutionary placement of these species in the fungi is still debated, with recent evidence placing them as derived members of the Mucormycotina based on shared synteny (conserved gene order), in particular around the mating type locus. There is still some debate as to where this group belongs in the Fungal kingdom, since their highly derived characteristics and long branches make them hard to place. The synteny-based evidence was another way to find a phylogenetic placement for them, but it would be helpful to have additional support in the form of additional shared derived characteristics that group Mucormycotina and Microsporidia. There is hope that an increased number of genome sequences and phylogenomic approaches can help resolve the placement and further our understanding of the evolution of the group.

For data analysis, a new genome database for comparing these genomes is online called MicrosporidiaDB. This project has begun incorporating the available genomes and providing a data mining interface that extends from the EuPathDB project.

Genome sequence of mushroom Schizophyllum commune

I am excited to announce the publication of another mushroom genome this week. The mushroom Schizophyllum commune is an important model system for mushroom biology and development. Its genome was sequenced as part of efforts at the Joint Genome Institute and by a collection of international researchers. The data and analyses from these efforts are presented in a publication appearing in Nature Biotechnology today.

Studies in mushrooms can have important impact on other research areas. They can be useful in biotechnology as protein biosynthesis factories for producing compounds or even as an edible delivery mechanism for new drugs. What we found in the analysis of this genome includes clues to how white rotting fungi degrade lignin, drawn from analysis of enzyme families. We also saw evidence for extensive antisense transcription during different developmental stages, suggesting some important clues as to how gene regulation could impact or control developmental progression. Through gene expression comparison (by MPSS) a large number of transcription factors were shown to be differentially regulated during sexual development. Knocking out two of these (fst3 and fst4) resulted in changes in the ability to form mushrooms (fst4) or in smaller mushrooms (fst3).

There are several more interesting findings in this work that I hope to add back to this post when there is a little more time.

Ohm, R., de Jong, J., Lugones, L., Aerts, A., Kothe, E., Stajich, J., de Vries, R., Record, E., Levasseur, A., Baker, S., Bartholomew, K., Coutinho, P., Erdmann, S., Fowler, T., Gathman, A., Lombard, V., Henrissat, B., Knabe, N., Kües, U., Lilly, W., Lindquist, E., Lucas, S., Magnuson, J., Piumi, F., Raudaskoski, M., Salamov, A., Schmutz, J., Schwarze, F., vanKuyk, P., Horton, J., Grigoriev, I., & Wösten, H. (2010). Genome sequence of the model mushroom Schizophyllum commune Nature Biotechnology DOI: 10.1038/nbt.1643

A mushroom on the cover

I’ll indulge a bit here and happily point to the cover of this week’s PNAS, with an image of Coprinopsis cinerea mushrooms fruiting, referring to our article on the genome sequence of this important model fungus. You should also enjoy the commentary article from John Taylor and Chris Ellison that provides a summary of some of the high points in the paper.


Stajich, J., Wilke, S., Ahren, D., Au, C., Birren, B., Borodovsky, M., Burns, C., Canback, B., Casselton, L., Cheng, C., Deng, J., Dietrich, F., Fargo, D., Farman, M., Gathman, A., Goldberg, J., Guigo, R., Hoegger, P., Hooker, J., Huggins, A., James, T., Kamada, T., Kilaru, S., Kodira, C., Kues, U., Kupfer, D., Kwan, H., Lomsadze, A., Li, W., Lilly, W., Ma, L., Mackey, A., Manning, G., Martin, F., Muraguchi, H., Natvig, D., Palmerini, H., Ramesh, M., Rehmeyer, C., Roe, B., Shenoy, N., Stanke, M., Ter-Hovhannisyan, V., Tunlid, A., Velagapudi, R., Vision, T., Zeng, Q., Zolan, M., & Pukkila, P. (2010). Insights into evolution of multicellular fungi from the assembled chromosomes of the mushroom Coprinopsis cinerea (Coprinus cinereus) Proceedings of the National Academy of Sciences, 107 (26), 11889-11894 DOI: 10.1073/pnas.1003391107

Where can I get orthologs?

There are several databases that include orthology prediction for fungi. These all have pros and cons. Some are more comprehensive and have many more species. Some offer curated orthology and paralogy assignments, which should be pretty stable. Some are automated, so groupings and ortholog group IDs change at each iteration.

  • A phylogenetic approach from a Saccharomyces perspective is at PhylomeDB.
  • Fungal Orthogroups is based on the Synergy algorithm from I. Wapinski, formerly of the Regev group at the Broad Institute.
  • Yeast gene order browser (YGOB) for Saccharomyces spp and CGOB for Candida spp.
  • OrthoMCL database based on whole genomes, not a ton of fungi but useful starting set.
  • Ensembl Genomes provides ortholog prediction as part of the Compara pipeline though there is a limited phylogenetic diversity in the current Ensembl Fungal genomes.
  • TreeFam has Saccharomyces cerevisiae and Schizosaccharomyces pombe as the two fungi included in the curated ortholog assignments and phylogenies.
  • SIMAP provides pre-computed similarities among all proteins in UniProt.
  • InParanoid provides a pretty comprehensive set of 100 available whole genomes, including many fungal genomes which I tried to help select.
  • JGI’s Mycocosm attempts to provide a fungal focused paralog/gene family look at clusters of genes based on whole genomes
  • E-Fungi is also an attempt at automated clustering with some fancy webservices logic.
  • Fungal Transcription Factor database focused just on families of transcription factors.

Some of these tools are better than others in terms of providing downloadable tables. Another problem is which identifiers are used. Many biologists use gene names or locus identifiers, not UniProt/GenPept IDs, to identify genes or proteins of interest. So tools that just cluster UniProt data aren’t as useful as those which refer to the gene or locus names. Also, providing a way to download all the data from a comparison is important for further mining and grouping of the data or for cross-referencing local datasets. Plugging in gene IDs one by one is not really a design that respects the idea that your user wants to ask sophisticated queries.
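
As a small illustration of why a downloadable table keyed by locus names matters, here is a Python sketch that loads such a table once and then answers a whole batch of queries. The three-column layout (group ID, species, locus) and every identifier in it are made up for the example; none of the databases above is guaranteed to export exactly this format.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited download: group ID, species, locus name.
TABLE = (
    "group_id\tspecies\tlocus\n"
    "OG0001\tspeciesA\tA_0001\n"
    "OG0001\tspeciesB\tB_0417\n"
    "OG0002\tspeciesA\tA_0002\n"
    "OG0002\tspeciesB\tB_0020\n"
    "OG0002\tspeciesC\tC_0311\n"
)


def load_groups(handle):
    """Index the table both ways: locus -> group and group -> members."""
    locus_to_group = {}
    group_to_members = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        locus_to_group[row["locus"]] = row["group_id"]
        group_to_members[row["group_id"]].append((row["species"], row["locus"]))
    return locus_to_group, group_to_members


def orthologs_for(loci, locus_to_group, group_to_members):
    """Batch lookup: other members of each query locus's group."""
    result = {}
    for locus in loci:
        group = locus_to_group.get(locus)
        members = group_to_members.get(group, []) if group else []
        result[locus] = [m for m in members if m[1] != locus]
    return result


locus_to_group, group_to_members = load_groups(io.StringIO(TABLE))
hits = orthologs_for(["A_0002", "A_9999"], locus_to_group, group_to_members)
for locus, members in hits.items():
    print(locus, members)
```

With the table in hand, the same index can be cross-referenced against a local expression or phenotype dataset, which is exactly what a one-gene-at-a-time web form makes painful.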

Also, beware that some approaches are very much pairwise comparison lists whereas others find orthologous groupings. So if you want to find the Rad59 ortholog from all fungi it may be easier or harder depending on the source.
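
To show the difference between the two kinds of output, here is a minimal Python sketch that turns a pairwise ortholog list into groups by joining any genes connected through a chain of pairs (single linkage, using union-find). This is a deliberate simplification and not how any of the databases above build their groups; the gene identifiers are hypothetical.

```python
from collections import defaultdict


def group_pairwise_orthologs(pairs):
    """Merge pairwise ortholog calls into groups by single linkage."""
    parent = {}

    def find(gene):
        # Follow parent pointers to the group representative,
        # shortening the path as we go.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical pairwise calls between three species (A, B, C).
pairs = [("A_1", "B_1"), ("B_1", "C_1"), ("A_2", "B_2")]
ortholog_groups = group_pairwise_orthologs(pairs)
print(ortholog_groups)
```

Note that A_1 and C_1 end up in the same group even though no pair lists them together, which is the kind of inference you have to do yourself when a resource only provides pairwise lists. Single linkage can also chain unrelated families together through one spurious pair, so treat groups built this way with some caution.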

[I may make this a static page in the future to allow for more detailed updating since I know the available resources wax and wane]