I came across this posting which might shed some light on a microbial explanation for #deflategate and the story of Jock Giblet.
I’ve migrated some of the services from UCB to UCR: the blog and wiki server, as well as the Gbrowse V1 services, now run on a UCR machine. The Gbrowse2 installation and the migration of under-development services are still in progress but will hopefully be done in the next two weeks.
This setup is more flexible and I’ll blog about geeky server details later, perhaps in a different forum. In the future we should be able to support additional project data in a simplified format.
I hope after this and a few more setup issues we can resume development of new tools and services for these data.
Too much on my plate as of late, so I’m woefully behind on posting much on interesting papers or news. Here’s a short list of links and papers that are worth a look though.
- “Evolution of pathogenicity and sexual reproduction in eight Candida genomes” published (Nature)
- NYT Science article sort of summarizing the good, bad, and ugly of fungi and human interactions
- Attempts to save amphibians from chytridiomycosis “Riders of a Modern-Day Ark” (PLoS Biology)
- Looks like Scott Baker and the JGI are in the process of resequencing several classical mutant strains of Phycomyces, Neurospora, Cochliobolus, and Cryphonectria for sequence-based mapping of mutants (i.e. here and here and here).
A new and improved annotation of Cryptococcus neoformans var grubii strain H99 (serotype A) has been made available in GenBank and on the Broad Institute website. This update is a collaboration between several groups providing data and analyses and the genome annotation team at the Broad Institute.
Some changes noted by the Broad Institute include:
“This release of gene predictions for the serotype A isolate Cryptococcus neoformans var. grubii H99 is based on a new genomic assembly provided by Dr. Fred Dietrich at the Duke Center for Genome Technology. The new assembly consists of 14 nuclear chromosomes and a single 21 KB mitochondrial chromosome, and has resulted in a reduction of the estimated genome size from 19.5 to 18.9 Mb. Improvements in the assembly and in our annotation process have resulted in a set of 6,967 predicted protein products, 335 fewer than the previous release.”
NPR’s Science Friday covered fungi with several myco-luminaries on the radiowaves including:
- Kelli Hoover, Penn State University, who recently discovered a new fungus related to Fusarium solani in the gut of the Asian longhorned beetle that can break down lignin (DOI: 10.1073/pnas.0805257105).
- Arturo Casadevall who has been a leader and pioneer in Cryptococcus research including recent work on its ability to use ionizing radiation as an energy source.
- Kathy Hodge, Cornell University and Cornell Mushroom Blog author
- David Fischer, author of many great books on mushroom identification
The show also speaks with Gavin Sherlock on recently published work on the origins of lager yeast.
An outbreak of a fungal infection called “white-nose syndrome” is killing bats in the Northeastern US. This New Scientist article mentions the outbreak briefly, and an NPR story and a recent Boston Globe story also give it some coverage. It sounds like we still don’t know much about the causal agent or how it is killing the bats, but some researchers, including Elizabeth Buckles at Cornell University, Vishnu Chaturvedi at NY State Dept of Health, and Jon Reichard at Boston University, are working on it.
This is of course old news if you read what Hyphoid Logic has been saying.
A previously undescribed cold-loving fungus sounds very interesting. There have been some recent discoveries of psychrophilic fungi like Cryptococcus laurentii and Rhodotorula himalayensis, so it will be interesting to learn more when the researchers publish some of these results.
Some more links
Thanks to Kathyrn B for the reminder about this story.
A comprehensive database for information about fungi, starting at least with systematic collections of genomic and transcript data, is highlighted as a major need. Really, any sort of new database effort should strive to be more comprehensive and include genetic and population data (alleles, strains) and information like protein-protein and protein-nucleic acid interactions (as Pedro mentioned). But on top of that, it needs to be comparative so that information from systems that serve as great models can be transferred to other fungal systems that are being studied for their role as pathogens or for their interactions in the environment.
Affordable next-gen sequencing will allow us to obtain genome and transcript sequence for basically all species or strains of interest. Researchers with no bioinformatics support in their lab will likely be able to outsource this to a company or campus core facility. But how can they easily map the collective information about genes, proteins, and pathways onto this new data? And how can it be a dynamic system that updates as new information is published and curated in other systems?
I think this has to be the future beyond setting up an SGD, CGD, etc. for every system. The individual databases are useful for a large enough community where there are curators (and funding), but we will have to move to a more modular system in the future (aspects of which are in GMOD) that can have both an individual focus on a specific species/clade and a more comprehensive view that is comparable across the kingdom. There are 100+ fungal genomes, but the community size for some of them is in the dozens of labs or less. How can they take advantage of the new resources without an existing infrastructure of curators? Their systems serve an important need in a research aim, but how can discoveries there make their way back into the datastream of other systems?
As I see it, there are several ways one would interact with a system that provided single-genome tools as well as a framework for comparative information. At a gene level, one might be looking for all information about a specific gene, based on sequence similarity searches, or starting with a cloned gene in one species. This would be something akin to Phylofacts or precomputed Orthogroups for defining a gene, but with more information about function linked in from all sources. So a comparative resource, but also one tapping into curated and literature-mined data.
At a genome level, one might want to do whole genome comparisons of gene content, either from evolutionarily defined gene families (gene family size change) or at a functional level. To start out with, each gene/protein would already need a systematic functional mapping. This could be as simple as running InterProScan on every protein, expanded to find Orthogroups (or OrthoMCL orthologs) and transfer function from model systems, and finally, more advanced still, classified further with tools like SIFTER.
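As a rough illustration of the gene family size comparison mentioned above, here is a minimal sketch in Python. It assumes that family assignments (from Orthogroups, OrthoMCL, or similar) have already been reduced to simple (gene, species, family) tuples. The function names, species labels, and family IDs are all made up for the example and are not the output format of any particular tool.

```python
from collections import Counter, defaultdict


def family_sizes(assignments):
    """Count genes per species within each family.

    `assignments` is an iterable of (gene_id, species, family_id) tuples,
    a hypothetical flattened form of ortholog clustering results.
    """
    sizes = defaultdict(Counter)
    for _gene_id, species, family in assignments:
        sizes[family][species] += 1
    return sizes


def size_changes(sizes, reference, target):
    """Return {family: target_count - reference_count} where the sizes differ."""
    changes = {}
    for family, counts in sizes.items():
        # A Counter returns 0 for species that have no genes in this family.
        delta = counts[target] - counts[reference]
        if delta != 0:
            changes[family] = delta
    return changes


# Toy data: hypothetical gene, species, and family identifiers.
assignments = [
    ("g1", "speciesA", "FAM001"),
    ("g2", "speciesA", "FAM001"),
    ("g3", "speciesB", "FAM001"),
    ("g4", "speciesA", "FAM002"),
    ("g5", "speciesB", "FAM002"),
    ("g6", "speciesB", "FAM003"),
]

sizes = family_sizes(assignments)
changes = size_changes(sizes, reference="speciesA", target="speciesB")
print(changes)
```

A real comparison would also need to account for differences in annotation quality between genomes, which raw counts like these cannot distinguish from true expansions or losses.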
Interlinked with these orthologous and paralogous gene sets would be anchors for analyses of chromosomal synteny and even comparative assembly, including tools like Mercator. Certainly tools for all of this exist, but making them more pluggable for different sets of species would be an important additional component.
At a utility level, the gene annotation and functional mapping of all this information should be possible. I would imagine a researcher could upload the sequence assembly they received from the core facility and the system could generate multiple gene predictions, annotate the genes, and link these genes within the known orthogroups of the system (preserving their privacy for these genes if desired). Presumably this sort of thing would be easier as a standalone in-house tool for the researcher, but web services could also be the place for this.
For fungal-sized genomes this amount of data is not too extreme. Things like Genome Browser, BLAST, etc. should all roll out of the box based on the basic builds.
On the DIY and community annotation front, there would also need to be a layer of community-derived annotation that could be layered on all these systems. I would imagine this to be both for gene structure annotation (genome annotation) and functional annotation (protein X does Y based on experiment Z, here is the journal reference). I think aspects of this would be visible and auditable (tracked), but maybe not blessed as official until a curator could oversee these inputs. Whether this is in a Wiki per se or just a new system that allows community input is less important to me than having it be a) structured (not a bunch of free text), b) tracked and versionable, and c) easy for researchers to input so that the knowledge is captured, even if it has to be reorganized later on.
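To make the "structured, tracked, versionable" idea a bit more concrete, here is a small Python sketch of what a community annotation record could look like. It is purely illustrative: the class names, fields, and identifiers are my own assumptions and are not taken from GMOD or any existing system.

```python
from dataclasses import dataclass, field, replace
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class AnnotationVersion:
    """One structured community submission (hypothetical schema)."""
    gene_id: str
    statement: str   # "protein X does Y"
    evidence: str    # "experiment Z"
    reference: str   # journal reference
    submitter: str
    timestamp: datetime
    approved_by: Optional[str] = None  # stays None until a curator signs off


@dataclass
class AnnotationRecord:
    """All submissions for one gene, kept in order so history is auditable."""
    gene_id: str
    versions: List[AnnotationVersion] = field(default_factory=list)

    def submit(self, statement, evidence, reference, submitter):
        version = AnnotationVersion(
            self.gene_id, statement, evidence, reference, submitter,
            datetime.now(timezone.utc),
        )
        self.versions.append(version)
        return version

    def approve(self, index, curator):
        # Store a copy of the submission marked with the curator's name.
        self.versions[index] = replace(self.versions[index], approved_by=curator)

    def official(self):
        """Return the most recent curator-approved version, or None."""
        approved = [v for v in self.versions if v.approved_by is not None]
        return approved[-1] if approved else None


# Hypothetical usage; the gene ID, user names, and citation are placeholders.
record = AnnotationRecord("GENE0001")
record.submit("protein X does Y", "experiment Z", "placeholder citation", "researcher1")
pending = record.official()    # nothing is official before curation
record.approve(0, "curator1")
blessed = record.official()    # now the approved submission
```

The point of the design is that submissions are visible as soon as they are entered, while `official()` only reports what a curator has approved.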
Seems like a lot of work to be done, but really many of these things already exist through what the GMOD project has built. There are many loose ends and software that doesn’t fully measure up to these needs, but I think the important concept is that these are all general solutions that will be of benefit to most communities, not just the fungal ones. One lingering question I always have when approaching genomic data that will be dynamic: what, if any, of this makes its way into GenBank? How is this sort of thing banked so that it can be captured, and does the improved functional or gene structure annotation ever make its way into the repository databases to correct and improve what has already been submitted there?
In particular it is important to establish a community network that will help basidiomycete labs. There is also a strong need for shared approaches for effective use of the genomic data from the more than a dozen basidiomycete genomes currently being sequenced.
Mike’s blog is open for discussion on the topic and I hope you’ll weigh in on suggestions for how the community can better communicate and share ideas.
The American Academy of Microbiology has released a report (PDF and archived on fungalgenomes.org) on the Fungal Kingdom outlining the importance of research in the kingdom and recommending several priority areas for future research.
One recommendation that makes the top of the list is an integrated database for fungal genomes, something we’re keenly interested in seeing happen. This sort of centralized repository of functional annotation, literature links, and genome sequences and annotation is critical given the 150+ genomes that are available or on their way. Systematic re-annotation with consistent tools, comparative analyses and gene predictions, and linking gene sequences by homology and ortholog predictions are critical components of fully utilizing the genomic data that has been produced for the fungi and other organisms.
Looks like the USDA, Mars (the candy company), and IBM are partnering up to sequence the cacao plant’s genome for everyone to use. Here is the article over at BBC News.