Your eye contains the same genetic content as your fingernail, but these two tissues look nothing alike. One significant cause of this difference is the tissue-specific regulation of the genes in the genome. A gene may be expressed (transcribed) in some tissues of your body while remaining silent in others. A great deal of modern biological research explores how expression is regulated across all the genes in a genome; the full set of resulting transcripts is known as the transcriptome. Such studies aim, for example, to understand which gene regulation events account for the differences between an eye and a fingernail.
However, the effectiveness of this research depends on actually knowing which parts of the genome are capable of being expressed and, subsequently, regulated. Conventionally, researchers extract RNA from an organism grown in various conditions (or, as in the case of our example, from various tissues of an organism) and clone and sequence the RNA to identify at least a subset of the genes that are expressed [Ebbole 2004*]. Such Expressed Sequence Tags (ESTs) have proven vital to gene annotation and gene structure annotation, as they frequently provide evidence of intron splice sites. While this method has facilitated a robust understanding of gene regulation, it is expensive, time-consuming, and provides relatively low coverage of the transcriptome. If our goal is to understand everything that is expressed, then we need a superior tool.
Enter SAGE (serial analysis of gene expression) and MPSS (massively parallel signature sequencing) [Irie 2003*, Harbers 2005*]. Both methods sequence short tags from a transcript’s 3′ end. SAGE uses conventional sequencing technology, while MPSS uses Solexa, Inc.’s novel bead-based hybridization technology. One of the major advantages of these technologies is the number of sequences they provide: large EST databases are on the order of several tens of thousands of sequences, while SAGE generally provides 100,000 to 200,000 tags and MPSS can provide over a million signatures. That being said, there are still questions about how sensitive these methods are and how deeply they cover the transcriptome. It may well be that, despite a lower total sequence count, ESTs provide more information about which parts of the genome are expressed.
Fortunately, Gowda et al. put all three methods to work, along with an RNA microarray (which doesn’t provide sequence, but enables its inference through hybridization), in their recent study of the Magnaporthe grisea transcriptome [Gowda 2006]. M. grisea is the causative agent of rice blast, a devastating disease that results in tremendous crop yield loss. The researchers evaluated two tissue types: the non-pathogenic mycelium and the invasive, plant-penetrating appressorium.
Interestingly, 40% of the MPSS tags and 55% of the SAGE tags had no matches in the existing M. grisea JGI EST collection and therefore appear to represent novel genes. Additionally, the authors found that no single method could identify the majority of the transcripts, but that a two-way combination of array data, MPSS, or SAGE could recover over 80% of the total unique transcripts identified by all of the methods together. One additional surprise was that roughly a quarter of the genes identified also produced an antisense RNA, possibly for siRNA regulation of the gene.
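To make the "two-way combination" idea concrete, here is a minimal sketch of how one might compute that kind of coverage from lists of detected transcripts. The function name `pairwise_coverage`, the transcript IDs, and the toy numbers are all hypothetical and chosen only for illustration; this is not the authors' analysis pipeline, nor their data.

```python
from itertools import combinations


def pairwise_coverage(detected):
    """For each pair of methods, return the fraction of all transcripts
    (the union across every method) that the pair detects together."""
    all_transcripts = set().union(*detected.values())
    return {
        (a, b): len(detected[a] | detected[b]) / len(all_transcripts)
        for a, b in combinations(sorted(detected), 2)
    }


# Hypothetical toy data: transcript IDs detected by each method.
# These values are invented for illustration and are not from the study.
detected = {
    "array": {"g1", "g2", "g3", "g4", "g5"},
    "MPSS": {"g4", "g5", "g6", "g7", "g8"},
    "SAGE": {"g1", "g5", "g8", "g9", "g10"},
}

coverage = pairwise_coverage(detected)
for pair, fraction in coverage.items():
    print(pair, f"{fraction:.0%}")
```

In this toy setup each method alone detects only half of the ten transcripts, yet every pair of methods reaches 80%, which mirrors the shape of the result described above without reproducing its actual figures.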
Long story short, there is, as of yet, no magic-bullet method. To adequately cover the transcriptome, multiple techniques are required.
*These references are, unfortunately, not published in open access journals.