Category Archives: mRNA splicing

Gene prediction without training?

A new paper in Genome Research from the Borodovsky lab at Georgia Tech describes an improved ab initio gene prediction program, GeneMark.hmm ES, that builds on their earlier GeneMark software. It doesn’t require a training set when building models for gene prediction in fungal genomes, and the authors report sensitivity and specificity as good as or better than most of the commonly used ab initio programs. They pick up on previously described insights about fungal gene structures and introns: the lack of a strictly required branch site and the varying degrees of splice-site conservation in most intron-rich fungi (Schwartz et al, 2008), and the observation that intron sizes remain short across the fungi (Stajich et al, 2007).
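For readers who haven't met these accuracy measures, here is a minimal sketch of how exon-level sensitivity and specificity are often computed when comparing gene predictors. It assumes the convention common in the gene-finding literature, where "specificity" means the fraction of predicted exons that are correct (TP / (TP + FP)); the function name `exon_accuracy` and the toy coordinates are made up for illustration and are not taken from the paper or from GeneMark.

```python
from typing import Iterable, Tuple

Exon = Tuple[int, int]  # (start, end) coordinates of one exon


def exon_accuracy(predicted: Iterable[Exon], annotated: Iterable[Exon]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exon level.

    An exon counts as correct only if both boundaries match exactly.
    Assumed convention: specificity = TP / (TP + FP), i.e. the fraction
    of predicted exons that are found in the reference annotation.
    """
    predicted_set = set(predicted)
    annotated_set = set(annotated)
    true_positives = len(predicted_set & annotated_set)
    sensitivity = true_positives / len(annotated_set) if annotated_set else 0.0
    specificity = true_positives / len(predicted_set) if predicted_set else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy data: two of three predicted exons match the four annotated exons.
    predicted = [(1, 100), (200, 300), (400, 500)]
    annotated = [(1, 100), (200, 300), (350, 500), (600, 700)]
    sn, sp = exon_accuracy(predicted, annotated)
    print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```

Exact-boundary matching is a strict choice; nucleotide-level or gene-level versions of the same two ratios are also used, and the numbers can differ quite a bit between levels.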

In practice it should simplify the initial genome annotation protocols and could really streamline the procedures. It doesn’t replace the need to gather EST sequence data, which can also be used to generate a training set in an automated fashion.  EST and other transcriptional evidence is still very important for identifying UTRs and alternatively spliced isoforms.

Hopefully these predictions will be integrated into the Cryptococcus and Coprinus genome annotations that are being updated at the Broad.  We’ll see how well this performs on a couple of the chytrid genome sequences we are working on as well.

Neurospora alternative splicing

A quick link to a Neurospora paper in Genetics today entitled “Alternative Splicing Gives Rise to Different Isoforms of the Neurospora crassa Tob55 Protein That Vary in Their Ability to Insert ß-Barrel Proteins Into the Outer Mitochondrial Membrane”. The authors investigated alternative splicing of a gene whose product is part of the TOB complex on the outside of the mitochondria. They found a reduced growth rate when a strain expressed only the longest of the three isoforms, and confirmed that all three isoforms are expressed as protein using mass spectrometry.

Splicing machinery and introns

Splicing of pre-messenger RNA is necessary to remove introns and create well-formed and translatable mRNA, but the purpose of introns still remains a mystery. One idea is that they play a role in the error-checking machinery, Nonsense Mediated Decay (NMD), by providing way-points during translation. A protein complex, the exon junction complex (EJC), is deposited on the mRNA at each exon-exon junction, which indicates that a splicing event has occurred there. During translation, if the ribosome encounters a premature stop (or termination) codon (PTC) while one of these EJC way-points still lies downstream, the corrupted message is flagged for degradation.
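The way-point idea can be written down as a very simplified rule: a stop codon looks premature if any exon-exon junction (and so, in this model, an EJC) sits downstream of it. The sketch below is only a toy model of that rule, not a description of the real cellular machinery; the function names and the optional `min_distance` parameter are hypothetical and exist only for illustration.

```python
from typing import List, Sequence


def junctions_from_exon_lengths(exon_lengths: Sequence[int]) -> List[int]:
    """Return exon-exon junction positions in spliced mRNA coordinates.

    Position p means the junction lies between bases p-1 and p (0-based),
    so an mRNA built from n exons has n-1 junctions.
    """
    positions = []
    total = 0
    for length in exon_lengths[:-1]:
        total += length
        positions.append(total)
    return positions


def triggers_nmd(stop_codon_start: int, junctions: Sequence[int], min_distance: int = 0) -> bool:
    """Toy NMD rule: True if a junction lies downstream of the stop codon.

    min_distance is a hypothetical knob for requiring the junction to be
    at least that many bases past the end of the stop codon.
    """
    stop_end = stop_codon_start + 3  # first base after the stop codon
    return any(j >= stop_end and j - stop_end >= min_distance for j in junctions)


if __name__ == "__main__":
    # Three exons of 100, 150 and 200 bases give junctions at 100 and 250.
    junctions = junctions_from_exon_lengths([100, 150, 200])
    print(triggers_nmd(447, junctions))  # stop codon in the last exon
    print(triggers_nmd(120, junctions))  # stop codon with a junction downstream
```

Under this rule a normal stop codon in the final exon never has a junction after it, which is one way to see why the model predicts that introns should be rare in the 3′ UTR.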


Several predictions come out of these models, including a lack of introns in the 3′ UTR and a correlation between the average length of exons and the window over which the proofreading mechanism can operate. These are discussed in several papers out of Mike Lynch’s lab, including (Lynch and Conery, 2003), (Lynch and Kewalramani, 2003), (Lynch and Richardson, 2002) and recently (Scofield et al, 2007).

Efforts to understand the splicing machinery, particularly in S. cerevisiae, have led to the discovery of numerous genes whose products make up the spliceosome. These include small RNAs as well as protein-coding genes. The SR proteins are serine-arginine rich proteins that regulate splicing and are found in almost all eukaryotes, including most fungi (even those with few introns, such as S. cerevisiae). SR proteins play a role in both splicing and nuclear export (Masuyama et al, 2004, Sanford et al, 2004), suggesting that a coupling of these processes may explain why genes with introns tend to be more highly expressed. The evolution of the spliceosomal gene families is also interesting because the families appear to diversify in some eukaryotes, perhaps those where splicing and its regulation are more elaborate (Barbosa-Morais et al, 2006).

There is some debate as to whether splicing occurs after the pre-mRNA is completely synthesized or happens while transcription is still under way. Work on this has shown both that spliceosome assembly can occur alongside the polymerase during transcription and that most splicing (in yeast) is post-transcriptional (Tardiff et al, 2006). It is argued that the two steps occur together to maximize efficiency and fidelity (Das et al, 2006, Moore et al, 2006), but perhaps the affinities are species-specific and have evolved to correlate with intron densities?

[Note: This post has links to non-open access journal articles. At this point I am still referring to these even if they are not all readable by everyone, because they contain some data that is only available there. I will strive to focus more narrowly on papers that are available as open access through PubMed Central or directly through open-access journals.]