Category Archives: ascomycota

Deeper and Deeper, Down the Transcriptome-hole We Fall

Your eye contains the same genetic content as your fingernail, but these two tissues look nothing alike. One significant cause of this difference is the tissue-specific regulation of the genes in the genome. A gene may be expressed (transcribed) in some tissues in your body while remaining silent in others. A great deal of modern biological research explores how expression is regulated across the transcriptome, the full set of transcripts produced from a genome. Such studies aim, for example, to understand which regulatory events account for the differences between an eye and a fingernail.

However, the effectiveness of this research is predicated upon actually knowing which parts of the genome are capable of being expressed and, subsequently, regulated. Conventionally, researchers extract RNA from an organism grown in various conditions (or, as in our example, from various tissues of an organism), then clone and sequence the RNA to identify at least a subset of the genes that are expressed (Ebbole 2004*). Such Expressed Sequence Tags (ESTs) have proven vital to gene annotation and gene structure annotation, as they frequently provide evidence of intron splice sites. While this method has facilitated a robust understanding of gene regulation, it is expensive and time-consuming, and it provides relatively low coverage of the transcriptome. If our goal is to understand everything that is expressed, then we need a superior tool.

Enter SAGE (serial analysis of gene expression) and MPSS (massively parallel signature sequencing) [Irie 2003*, Harbers 2005*]. Both methods sequence short tags from a transcript’s 3′ end. SAGE uses conventional sequencing technology, while MPSS uses Solexa, Inc.’s novel bead-based hybridization technology. One of the major advantages of these technologies is the number of sequences they provide: large EST databases contain on the order of several tens of thousands of sequences, while SAGE generally provides 100,000 to 200,000 tags and MPSS can provide over a million signatures. That being said, there are still questions about how sensitively and how deeply these methods cover the transcriptome. It may well be that, despite a lower total sequence count, ESTs provide more information about which parts of the genome are expressed.

Fortunately, Gowda et al. put all three methods to work, along with an RNA microarray (which doesn’t provide sequence, but enables its inference through hybridization), in their recent study of the Magnaporthe grisea transcriptome [Gowda 2006]. M. grisea is the causative agent of rice blast, a devastating disease that results in tremendous crop yield loss. The researchers evaluated two tissue types: the non-pathogenic mycelium and the invasive, plant-penetrating appressorium.

Interestingly, 40% of the MPSS tags and 55% of the SAGE tags identified represent novel genes, as they had no matches in the existing M. grisea JGI EST collection. Additionally, the authors found that no one method could identify the majority of the transcripts, but that a two-way combination of array data, MPSS, or SAGE could provide over 80% of the total unique transcripts identified by all of the methods together. One additional surprise was that roughly a quarter of the genes identified also produced an antisense RNA, possibly for siRNA regulation of the gene.
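To make the "two-way combination" comparison concrete, here is a minimal sketch of how such a figure could be computed once each method's detected transcripts have been mapped to gene identifiers. The function name, the gene IDs, and the toy sets are all hypothetical illustrations; none of the numbers come from Gowda et al.

```python
from itertools import combinations


def pairwise_coverage(detected):
    """For each pair of methods, return the fraction of all transcripts
    (the union over every method) that the pair detects together."""
    all_transcripts = set().union(*detected.values())
    coverage = {}
    for a, b in combinations(sorted(detected), 2):
        coverage[(a, b)] = len(detected[a] | detected[b]) / len(all_transcripts)
    return coverage


# Hypothetical toy data: gene IDs detected by each method (not from the paper).
detected = {
    "array": {"g1", "g2", "g3", "g4", "g5"},
    "mpss": {"g3", "g4", "g5", "g6", "g7", "g8"},
    "sage": {"g1", "g5", "g8", "g9", "g10"},
}

coverage = pairwise_coverage(detected)
for pair, fraction in coverage.items():
    print(pair, f"{fraction:.0%}")
```

In this toy example no single method sees more than 60% of the ten genes, yet every pair of methods recovers at least 80% of them, which is the shape of the result described above.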

Long story short, there is as yet no magic-bullet method. To adequately cover the transcriptome, multiple techniques are required.

*These references are, unfortunately, not located in an open access journal.

Whole genome tiling arrays

A recent paper from Ron Davis’s group at Stanford describes the discovery of 9 new introns in Saccharomyces cerevisiae using high-density tiling arrays from Affymetrix. The arrays cover both strands, which allows the detection of transcripts transcribed from either strand. The arrays were also put to work by the Davis and Steinmetz labs to create a high-density map of transcription in yeast, and by the Kruglyak lab for polymorphism mapping.

PNAS Yeast Transcriptional map

Whole genome tiling arrays have also been employed in other fungi. For example, Anita Sil’s group at UCSF constructed a random tiling array for Histoplasma capsulatum and used it to identify genes responding to reactive nitrogen species. A similar approach, using random sequencing clones, was applied in Cryptococcus neoformans to investigate temperature-regulated genes.

As the technology becomes cheaper, it may become sensible to use a tiling array rather than ESTs to detect transcripts when annotating a genome. In the Histoplasma work, transcriptional units could be identified from hybridization alone. Some of the annotation algorithms will need work to incorporate this information correctly, and the sensitivity and density of the array will influence the results. These techniques can also be part of a resequencing approach, or be used for fast genotyping of progeny from QTL experiments when the sequence of both parents is known (or at least enough of the polymorphisms for the genetic map).
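As a rough illustration of what "identifying transcriptional units from hybridization alone" can mean computationally, the sketch below groups runs of adjacent probes whose signal exceeds a cutoff into candidate transcribed segments. This is an assumed, simplified thresholding scheme, not the method used in the Histoplasma or yeast studies; the function name, the parameters (`threshold`, `min_probes`, `max_gap`), and the toy intensities are all hypothetical.

```python
def call_transcribed_segments(positions, intensities, threshold,
                              min_probes=3, max_gap=50):
    """Group consecutive above-threshold probes into candidate segments.

    positions   -- sorted probe start coordinates (bp) on one strand
    intensities -- hybridization signal for each probe, same order
    Returns (first_probe_start, last_probe_start) tuples; probe length
    is not added to the end coordinate.
    """
    segments = []
    run = []  # probe positions in the run currently being extended
    for pos, signal in zip(positions, intensities):
        if signal >= threshold and (not run or pos - run[-1] <= max_gap):
            run.append(pos)
        else:
            # The run ended (low signal or too large a gap): keep it if long enough.
            if len(run) >= min_probes:
                segments.append((run[0], run[-1]))
            run = [pos] if signal >= threshold else []
    if len(run) >= min_probes:
        segments.append((run[0], run[-1]))
    return segments


# Hypothetical toy data: one probe every 25 bp, arbitrary signal units.
positions = list(range(0, 300, 25))
intensities = [1, 1, 8, 9, 7, 8, 1, 1, 9, 1, 7, 8]
segments = call_transcribed_segments(positions, intensities, threshold=5)
print(segments)
```

The `min_probes` requirement is where array density matters: on a sparse array a short transcript may be covered by too few probes to be called, and a noisy array forces a higher `threshold`, which is one way to read the remark above about sensitivity and density.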

What sets the current Affymetrix yeast tiling array apart is the inclusion of both strands, which allows detection of transcripts from either strand. Several antisense transcripts in yeast have been discovered recently through more classical approaches, including at the IME4 locus, but perhaps many more await discovery with high-resolution transcriptional data from whole genome tiling arrays.
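Given strand-specific data, one simple way to flag candidate antisense transcription is to look for transcribed segments on opposite strands whose coordinates overlap. The sketch below assumes segments have already been called separately for each strand and are given as inclusive (start, end) coordinates; the function name, the `min_overlap` parameter, and the coordinates are hypothetical and do not describe any published pipeline.

```python
def find_antisense_pairs(plus_segments, minus_segments, min_overlap=1):
    """Return overlapping segment pairs from opposite strands.

    Each segment is an inclusive (start, end) tuple in bp. The result is a
    list of (plus_segment, minus_segment, overlap_length) tuples.
    """
    pairs = []
    for p_start, p_end in plus_segments:
        for m_start, m_end in minus_segments:
            # Length of the shared interval; zero or negative means no overlap.
            overlap = min(p_end, m_end) - max(p_start, m_start) + 1
            if overlap >= min_overlap:
                pairs.append(((p_start, p_end), (m_start, m_end), overlap))
    return pairs


# Hypothetical toy coordinates for one chromosome.
plus_segments = [(100, 500), (900, 1200)]
minus_segments = [(400, 700), (1500, 1800)]
pairs = find_antisense_pairs(plus_segments, minus_segments)
print(pairs)
```

The nested loop is fine for a handful of segments; a genome-wide comparison would more likely sort both lists and sweep through them once.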

Making the Revolution Work for You

In a recent Microbiology Mini-Review, Meriel Jones catalogs both the potential benefits and problems that arise from fungal genome sequencing. Using the nine genomes (being) sequenced from the Aspergillus clade, Jones addresses several issues tied to a singular theme: if we are to unlock the potential that fungal genome sequencing holds, both academically and entrepreneurially, then a more robust infrastructure that enables comparative and functional annotation of genomes must be established.

Fortunately, like any good awareness advocate, Jones points us in the direction of e-Fungi, a UK-based virtual project aimed at setting up such an infrastructure. Anyone can navigate this database, either to compare the stored genomic information or to evaluate any fungus of interest in light of the e-Fungi genomic data. The data appear to be precomputed, similar to IMG from JGI, so there are inherent limitations on what one can obtain. However, tools such as these put important data in the hands of expert mycologists who can turn the information into something biologically meaningful.

As Jones points out, this is just the beginning. If fungal genomes are to live up to their promise, they must engage more than just experts at reading genomes.

Not one, but two Aspergillus niger genome sequences

[Image: A. niger growing on a plate (this is not the sequenced strain)]

The JGI released the sequence of A. niger strain ATCC 1015 in November 2005. ATCC 1015 is used in the industrial production of citric acid, as it is one of the best citric acid producers. In Nature Biotechnology, a Dutch team has now published the sequence of another strain, CBS 513.88, an early ancestor of strains used in industrial enzyme production.