Tag Archives: annotation

Basidiomycete genomes galore


I just finished attending Genetics and Cell Biology of Basidiomycetes in Cape Girardeau, MO, which was an intimate gathering of basidiomycetaphiles. I learned about systems that are used for studying fruiting body development, genetic mapping, pheromone and mating genes, kinesin dynamics, meiotic gene regulation, and a host of other topics. I’m happy I got a chance to meet more folks in the community and learned where informatics and computational approaches are really needed to push along the interpretation of the more than a dozen basidiomycete genomes. In particular it sounds like the Pleurotus, Schizophyllum, Agaricus bisporus, and Serpula genomes are all marching along to completion, with some already in 4X assembly or further.

GCBBVI Group Picture

So we’ll have more samples from key model and some less-model species to assist researchers working on many different mushroom-forming fungi, ranging from brown- and white-rotting saprophytic fungi to mycorrhizal fungi that associate with plants. I’m also excited about the work to make transformation and knockouts more readily available in these systems, to push their genetics and cellular biology even further. The genome sequences will be another tool in these endeavors.

The last day ended with a discussion about genome annotation and future support for curating gene models. Basically everyone is unhappy with computational predictions and wants to be able to go in and fix things. (I think people remember the predictions that were wrong more readily than the ones that were right, but computational prediction definitely performs poorly in some situations.) In this Web 2.0-land we live in, this is still not something easily done with any of the freely available genome browsing tools. The JGI’s browser was lauded for its ability to handle these kinds of requests, but how do we proceed when genomes are not sequenced by that center or when (in the not too distant future) communities are able to sequence a genome themselves in their own lab using 454/Illumina-Solexa/Helicos/Pacific Biosciences approaches? There is still a huge lag in the kinds of tools researchers can use to annotate genomes, fix gene models, and add functions. Hopefully projects like GMOD will continue to develop useful tools for these needs, but there is certainly a need for better support of distributed community annotation of genomes where there is little direct money for supporting curators from a single place.

Lest you think annotation is easy

Ewan Birney and Ensembl (the other/original genome browser, depending on whether you are a UCSC junkie) have started blogging a bit more about what is going on under the proverbial hood over there in Hinxton. There are some great nuggets about some of the current problems. These bite-sized comments should be a great glimpse into what is going on without drowning in the deluge that is ensembl-dev.

This is a recent post on the challenges of coordinating “manual” and “automated” annotation of gene structure between groups at the same institution.

Scale that up across multiple genomes, genome centers, and varying quality of prediction programs and assemblies, and you can see why the fungal genome comparisons could use a little bit more help. It is great to hear what the animal genome annotation groups are doing to solve informatics, data management, and coordination challenges. I’m a big fan of more informatics+science in the open where it is feasible.

Fungal Genetics 2007 details

I’m including a recap of as many of the talks as I remember. There were 6 concurrent sessions each afternoon, so you had to miss a lot of talks. The conference was bursting at the seams as it was: at least 140 people had to be turned away beyond the 750 who attended.

If there was any theme in the conference it was “Hey, we are all using these genome sequences we’ve been talking about getting”. I found only the overview talks that solely describe a genome a little dry compared to those more focused on particular questions. I guess my genome palate is becoming refined.


Whole genome tiling arrays

A recent paper describes the discovery of 9 new introns in Saccharomyces cerevisiae by Ron Davis’s group at Stanford, using high density tiling arrays from Affymetrix. The arrays are designed for both strands, allowing the detection of transcripts transcribed from either strand. The arrays were also put to work by the Davis and Steinmetz labs to create a high density map of transcription in yeast and by the Kruglyak lab for polymorphism mapping.

PNAS Yeast Transcriptional map

Whole genome tiling arrays have also been employed in other fungi. For example, Anita Sil’s group at UCSF constructed a random tiling array for Histoplasma capsulatum and used it to identify genes responding to reactive nitrogen species. A similar approach was used in Cryptococcus neoformans to investigate temperature regulated genes using random sequencing clones.

As the technology becomes cheaper, it may become sensible to use a tiling array rather than ESTs to detect transcripts when attempting to annotate a genome. In the Histoplasma work transcriptional units could be identified from hybridization alone. Some of the algorithms will need some work to correctly incorporate this information, and the sensitivity and density of the array will influence this. These techniques can also be part of a resequencing approach or used for fast genotyping of progeny from QTL experiments when the sequence from both parents is known (or at least enough of the polymorphisms for the genetic map).
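To make the idea of calling transcriptional units from hybridization alone a bit more concrete, here is a minimal sketch in Python. It is not the method used in the Histoplasma or yeast papers; it simply thresholds probe intensities and merges consecutive probes above the threshold into segments. The function name, the threshold, the minimum probe count, and the maximum gap between probes are all assumptions made up for illustration, and real analyses would need normalization and a proper statistical segmentation.

```python
from typing import List, Tuple


def call_transcribed_segments(
    positions: List[int],
    intensities: List[float],
    threshold: float,
    min_probes: int = 3,
    max_gap: int = 50,
) -> List[Tuple[int, int]]:
    """Return (first_probe_pos, last_probe_pos) for runs of probes at or above threshold.

    positions must be sorted probe start coordinates for ONE strand;
    run the function once per strand on a stranded array.
    All parameters are illustrative assumptions, not published values.
    """
    segments: List[Tuple[int, int]] = []
    run: List[int] = []  # positions of the probes in the current run

    def close_run() -> None:
        # Keep the run only if it is supported by enough probes.
        if len(run) >= min_probes:
            segments.append((run[0], run[-1]))
        run.clear()

    for pos, value in zip(positions, intensities):
        if value >= threshold:
            # A large distance between probes also ends the current run.
            if run and pos - run[-1] > max_gap:
                close_run()
            run.append(pos)
        else:
            close_run()
    close_run()
    return segments


if __name__ == "__main__":
    # Toy data: probe start coordinates and made-up log intensities.
    probe_positions = [0, 10, 20, 30, 40, 50, 60, 70]
    probe_signal = [1.0, 5.0, 6.0, 7.0, 1.0, 8.0, 9.0, 1.0]
    print(call_transcribed_segments(probe_positions, probe_signal, threshold=4.0))
```

With the toy data only the first run (three probes) passes the minimum probe count; the later run of two probes is dropped. Raising the array density or lowering `min_probes` changes what gets called, which is the sensitivity and density trade-off mentioned above.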

What is superior about the current Affymetrix yeast tiling array is the inclusion of both strands, which allows strand-specific detection of transcripts. Several anti-sense transcripts in yeast have been discovered recently through more classical approaches, including at the IME4 locus, but perhaps many more await discovery with high resolution transcriptional data from whole genome tiling arrays.