Clusters of genomes

As announced at the Fungal Genetics meeting, the FGI at the Broad Institute is focusing on clusters of genomes rather than single ones. Some of the genome projects are already grouped.

  • Coccidioides has 3 strains already, plus the outgroup Uncinocarpus, and conceivably one could include Histoplasma in there. This resource will grow to 14 strains (comprising two species) of Coccidioides contributed by FGI and one from TIGR.
  • Aspergillus currently has 8 species sequenced, with several more in the pipeline at Broad and TIGR.
  • The Fusarium group has 3 species, including the recently released F. oxysporum.
  • The Candida clade also has several sequenced genomes, and of course there are the already well-studied (and, I’ll add, well-utilized) genome resources for the Saccharomyces clade.
  • There are 4 genomes of Cryptococcus (well, 5, but JEC21 and B-3501 are nearly identical).

All in all a very exciting time for comparative genomics and I’m particularly intrigued to see how people will begin to use the resources.

This work to consolidate the clusters of genomes will, I hope, be very powerful. However, I still feel we are not doing a good job of translating information from different related species into a more centralized resource. Lots of money is spent on sequencing, but I don’t know that we have realized the dream of having comparative techniques illuminate the new genomes to the point that we are learning huge new things.

It seems to me that initially there is the lure of gathering low-hanging fruit from a genome analysis (which drives the first genome paper or papers), but there is not always financial support for the longer-term needs of the community to feed the experimental and functional work back into the genome annotation and interpretation. The cycle works really well for Saccharomyces cerevisiae because curators work with the community to ensure that information is deposited and that the literature is gleaned to link genomic and functional information. But this is expensive in terms of funding many curators for many different projects.

It seems that as we add more genomes, there isn’t a very centralized effort for this type of curatorial information, and so we lack the gems of high-quality annotation that are only seen in a few “model” systems. At some point a better meta-database that builds bridges between resource- and literature-rich “model system” communities may help, but maybe something new will have to be created? I like thinking about this as user-driven content via a wiki which also incorporates dynamic (and versioned!) content from automated intelligent systems to map the straightforward things. Tools like SCI-PHY already exist that can do this and generate robust orthology groups (or Books, as the PhyloFacts database organizes them) for further analysis. The SGD wiki for yeast is a start at this, but is mostly an import of SGD data into a MediaWiki framework; I wonder how this can be built upon in a more explicitly comparative environment.

Pyrenophora tritici-repentis

The genome of Pyrenophora tritici-repentis, the fourth sequenced Dothideomycete genome, was released by the FGI at the Broad Institute this spring (March 2007). P. tritici-repentis was sequenced for its role as the cause of tan spot on wheat and as a research model for other Pyrenophora spp. that are pathogens of several grasses.

The 6X assembly contains 37.8 Mb of sequence, similar in size to the other Dothideomycetes such as Stagonospora nodorum (37.2 Mb), Alternaria brassicicola (32 Mb), and Mycosphaerella graminicola (41.8 Mb).