Category Archives: cell biology

Social Slime Mold

Slime molds are interesting organisms that receive surprisingly little attention. Take the case of Dictyostelium discoideum, a single-celled amoeba that, when starved, will aggregate with other D. discoideum cells in the neighborhood to create a motile, multicellular structure known as a slug. Eventually the slug differentiates into a reproductive structure, with some individuals making a long stalk and others producing spores. In other words, some individuals help others reproduce but do not reproduce themselves.

D. discoideum life cycle

But why form a slug? Why would a single-celled organism decide to cooperate with other, genetically different individuals, particularly when doing so may provide no direct passage of its genes? The evolutionary benefits of kin relationships aside, previous work has shown that slugs do provide multiple benefits to the population as a whole.

Deeper and Deeper, Down the Transcriptome-hole We Fall

Your eye contains the same genetic content as your fingernail, but these two tissues look nothing alike. One significant cause of this difference is the tissue-specific regulation of the genes in the genome. A gene may be expressed (transcribed) in some tissues in your body while that same gene is silent in another tissue type. A great deal of modern biological research explores the regulation of expression of all the genes in a genome, collectively known as the transcriptome. Such studies are, for example, aimed at understanding which genetic regulation events account for the differences between an eye and a fingernail.

However, the effectiveness of this research is predicated upon actually knowing which parts of the genome are capable of being expressed and, subsequently, regulated. Conventionally, researchers extract RNA from an organism grown in various conditions (or, as in the case of our example, various tissues from an organism) and clone and sequence the RNA to identify at least a subset of genes that are expressed (Ebbole 2004*). Such Expressed Sequence Tags (ESTs) have proven vital to gene and gene structure annotation, as they frequently provide evidence of intron splice sites. While this method has facilitated a robust understanding of gene regulation, it is expensive, time-consuming, and provides relatively low coverage of the transcriptome. If our goal is to understand everything that is expressed, then we need a superior tool.

Enter SAGE (serial analysis of gene expression) and MPSS (massively parallel signature sequencing) [Irie 2003*, Harbers 2005*]. Both methods sequence short tags of a transcript’s 3′ end. SAGE uses conventional sequencing technology while MPSS uses Solexa, Inc.’s novel bead-based hybridization technology. One of the massive advantages of these technologies is the number of sequences they provide: large EST databases hold on the order of several tens of thousands of sequences, while SAGE generally provides 100,000 to 200,000 tags and MPSS can provide over a million signatures. That being said, there are still questions about how sensitive these methods are and how deeply they cover the transcriptome. It may well be that despite a lower total sequence count, ESTs provide more information about what parts of the genome are expressed.

Fortunately, Gowda et al. put all three methods to work, along with an RNA microarray (which doesn’t provide sequence, but enables its inference through hybridization), in their recent study of the Magnaporthe grisea transcriptome [Gowda 2006]. M. grisea is the causative agent of rice blast, a devastating disease that results in tremendous crop yield loss. The researchers evaluated two tissue types: the non-pathogenic mycelium and the invasive, plant-penetrating appressorium.

Interestingly, 40% of the MPSS tags and 55% of the SAGE tags identified represent novel genes, as they had no matches in the existing M. grisea JGI EST collection. Additionally, the authors found that no one method could identify the majority of the transcripts, but that a two-way combination of array data, MPSS or SAGE could provide over 80% of the total unique transcripts identified by all of the methods together. One additional surprise was that roughly a quarter of the genes identified also produced an antisense RNA, possibly for siRNA regulation of the gene.
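
The pairwise comparison described above comes down to set arithmetic, and a minimal Python sketch makes the bookkeeping explicit. The transcript identifiers and the contents of each set are made up for illustration; they are not data from the Gowda study, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical transcript IDs detected by each method (illustrative only,
# not data from the study).
detected = {
    "array": {"g1", "g2", "g3", "g4", "g5"},
    "MPSS": {"g2", "g3", "g6", "g7", "g8"},
    "SAGE": {"g1", "g6", "g9", "g10"},
    "EST": {"g1", "g2", "g3"},
}


def coverage(methods, detected):
    """Fraction of all transcripts (union over every method) seen by `methods`."""
    everything = set().union(*detected.values())
    seen = set().union(*(detected[m] for m in methods))
    return len(seen) / len(everything)


def pairwise_coverage(detected):
    """Coverage for each two-way combination of methods."""
    return {
        pair: coverage(pair, detected)
        for pair in combinations(sorted(detected), 2)
    }


if __name__ == "__main__":
    ranked = sorted(pairwise_coverage(detected).items(), key=lambda kv: -kv[1])
    for pair, frac in ranked:
        print(f"{' + '.join(pair)}: {frac:.0%}")
```

With real tag-to-gene assignments in place of the toy sets, the same two functions would show which pair of methods recovers the largest share of the combined transcript list.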

Long story short, there is as yet no magic bullet of a method. To adequately cover the transcriptome, multiple techniques are required.

*These references are, unfortunately, not located in an open access journal.

Splicing machinery and introns

Splicing of pre-messenger RNA is necessary to remove introns and create well-formed and translatable mRNA, but the purpose of introns still remains a mystery. One idea is that they play a role in the error checking machinery, or Nonsense Mediated Decay (NMD), by providing way-points during translation. A protein complex, the exon junction complex (EJC), is deposited at each exon-exon junction and indicates that a splicing event has occurred. During translation, if the ribosome encounters a premature stop (or termination) codon (PTC) and then sees one of these EJC way-points further downstream, it signals the corrupted message for degradation.

NMD_PTC
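
The way-point model described above can be written as a simple positional rule. The Python sketch below is a toy version of that rule only: the function names, the coordinate convention, and the `min_distance` parameter are assumptions introduced for illustration, not part of any published NMD predictor.

```python
def exon_junctions(exon_lengths):
    """Positions (in spliced mRNA coordinates) of the exon-exon junctions."""
    junctions, pos = [], 0
    for length in exon_lengths[:-1]:  # there is no junction after the last exon
        pos += length
        junctions.append(pos)
    return junctions


def is_nmd_target(exon_lengths, stop_pos, min_distance=0):
    """Return True if an EJC way-point remains downstream of the stop codon.

    stop_pos is the 0-based mRNA position of the first base of the stop codon.
    min_distance is a hypothetical tuning parameter: how many nucleotides
    downstream of the stop codon a junction must lie to count as a way-point.
    """
    stop_end = stop_pos + 3
    return any(j - stop_end >= min_distance for j in exon_junctions(exon_lengths))


if __name__ == "__main__":
    exons = [100, 150, 200]  # hypothetical exon lengths in nucleotides
    print(is_nmd_target(exons, stop_pos=400))  # stop codon in the last exon
    print(is_nmd_target(exons, stop_pos=120))  # premature stop in the second exon
```

Under this rule a stop codon in the final exon is never flagged, because no junction lies beyond it, while a premature stop in an earlier exon is flagged. An intronless transcript has no junctions at all, so it can never be flagged either.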

Several predictions come out of these models, including a lack of introns in the 3′ UTR and a correlation between the average length of exons and the window over which the proofreading mechanism can operate. These are discussed in several papers out of Mike Lynch’s lab including (Lynch and Conery, 2003), (Lynch and Kewalramani, 2003), (Lynch and Richardson, 2002) and recently (Scofield et al, 2007).

Efforts to understand the splicing machinery, particularly in S. cerevisiae, have led to the discovery of numerous genes whose products make up the spliceosome. These products include small RNAs as well as proteins. The SR proteins are serine-arginine rich proteins that regulate splicing and are found in almost all eukaryotes including most fungi (even those with few introns, such as S. cerevisiae). SR proteins play a role in splicing and in nuclear export (Masuyama et al, 2004, Sanford et al, 2004), indicating that a coupling of these processes may explain why genes with introns tend to be more highly expressed. The evolution of the spliceosomal family of genes is also interesting because the families appear to diversify in some eukaryotes, perhaps those where splicing and regulation are more elaborate (Barbosa-Morais et al, 2006).

There is some debate as to whether splicing occurs after the pre-mRNA is completely synthesized or while transcription is still under way. Work on this has shown both that spliceosomal assembly can occur alongside the polymerase during transcription and that most splicing (in yeast) is post-transcriptional (Tardiff et al, 2006). It is argued that the two steps occur together to maximize efficiency and fidelity (Das et al, 2006, Moore et al, 2006), but perhaps affinities are species-specific and have evolved to correlate with intron densities?

[Note: This post has links to non-open access journal articles. At this point I am still referring to these even if they are not all readable by everyone, because they contain some data that is only available there. I will strive to focus more narrowly on papers that are available as open access through PubMed Central or directly through open-access journals.]

Whole genome tiling arrays

A recent paper from Ron Davis’s group at Stanford describes the discovery of 9 new introns in Saccharomyces cerevisiae using high density tiling arrays from Affymetrix. The arrays cover both strands, allowing the detection of transcripts transcribed from either strand. The arrays were also put to work by the Davis and Steinmetz labs to create a high density map of transcription in yeast, and by the Kruglyak lab for polymorphism mapping.

PNAS Yeast Transcriptional map

Whole genome tiling arrays have also been employed in other fungi. For example, Anita Sil’s group at UCSF constructed a random tiling array for Histoplasma capsulatum and used it to identify genes responding to reactive nitrogen species. A similar approach, using random sequencing clones, was taken in Cryptococcus neoformans to investigate temperature-regulated genes.

As the technology becomes cheaper, it may become sensible to use a tiling array rather than ESTs to detect transcripts when annotating a genome. In the Histoplasma work, transcriptional units could be identified from hybridization alone. Some of the algorithms will need work to correctly incorporate this information, and the sensitivity and density of the array will influence the results. These techniques can also be part of resequencing approaches or fast genotyping of progeny from QTL experiments when the sequence from both parents is known (or at least enough of the polymorphisms for the genetic map).
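
To make the idea of calling transcriptional units from hybridization alone concrete, here is a deliberately naive Python sketch: it reports runs of consecutive probes whose signal exceeds a threshold, separately for each strand. The intensities, the threshold, and the `min_probes` cutoff are all invented for illustration, and this is not the method used in the Histoplasma or yeast studies, which would need proper normalization and statistics.

```python
def call_transcribed_segments(intensities, threshold, min_probes=3):
    """Return (start, end) probe-index ranges, end exclusive, where at least
    `min_probes` consecutive probes exceed `threshold`."""
    segments, start = [], None
    for i, value in enumerate(intensities):
        if value > threshold:
            if start is None:
                start = i  # a new run of expressed probes begins here
        elif start is not None:
            if i - start >= min_probes:
                segments.append((start, i))
            start = None
    # close a run that extends to the last probe
    if start is not None and len(intensities) - start >= min_probes:
        segments.append((start, len(intensities)))
    return segments


# Hypothetical log-intensities per probe, ordered along the chromosome,
# with one list per strand.
signal = {
    "+": [0.1, 0.2, 2.5, 2.7, 2.9, 3.0, 0.3, 0.2, 2.8, 0.1],
    "-": [0.2, 0.1, 0.2, 0.3, 2.6, 2.4, 2.9, 0.2, 0.1, 0.1],
}
calls = {
    strand: call_transcribed_segments(values, threshold=1.0)
    for strand, values in signal.items()
}

if __name__ == "__main__":
    for strand, segments in calls.items():
        print(strand, segments)
```

Because each strand is segmented independently, overlapping calls on opposite strands would be the first hint of an antisense transcript. The `min_probes` cutoff is one crude way to keep a single noisy probe from being reported as a transcript.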

The advantage of the current Affymetrix yeast tiling array is that it includes both strands, which allows detection of transcripts from either strand. Several anti-sense transcripts in yeast have been discovered recently through more classical approaches, including at the IME4 locus, but perhaps many more await discovery with high resolution transcriptional data from whole genome tiling arrays.