Next week a group of international scientists with a stake in seeing fungal genome databases evolve to meet the rising tide of genome data being produced and analyzed from fungi will be meeting in DC. I am hopeful we’ll come up with some strategies and principles that can guide how this data can be more effectively managed and provided to researchers. This includes web-based resources and tools, adherence to standardized formats for genome annotations (like GFF3), automated methods for gene ontology associations on newly annotated genomes, and integration of what I expect to be the bulk of the data in the years to come: genomic, ChIP, resequencing, and RNA-sequencing results produced by individual labs. Integrating (and sharing) these lab-produced datasets with the public data will be key, along with cross-species comparisons of this information. Tools like Ensembl and the UCSC browser provide great portals for animal data and some plant data, with a few fungi sprinkled in as outgroups. (Okay, UCSC does have some data for close relatives of Saccharomyces in their “other clade” that provides data from the Phastcons paper, and Ensembl is now serving up a few fungi.) Tools like Phytozome are attempting to integrate some of the plant genomic data in one place as well. However, the resources available to fungal researchers range from highly detailed, manually curated genomes to shotgun-sequenced genomes with automated annotation, and the tools to search, compare, and integrate them are still insufficient for what the community needs.
I expect we will also be discussing how databases that incorporate the data from all the genomes can have some centralized aspects so that comparative analyses are possible and, importantly, how these types of resources can be sustainably funded by public and private money.
Fungi are important in a wide variety of human and ecosystem processes, from pathogens of agricultural crops and agents of human disease to symbiotic partners of plants and industrial agents in food, chemical, and biofuel production. Studying them requires modern tools, including genomic resources for molecular studies of these species. The current tools and data are useful and important in our research, but with the increasing amount of new sequence and phenotype data, our ability to effectively connect data from different experimental, model, and pathogen study systems needs to be much improved.
I hope to provide some updates on the ideas we discuss about “Pan-fungal” genome resources, and I will be interested in helping engage a wider audience on how tools and resources should be built to meet our needs as researchers.