Though a bit cliché, I think the metaphor of “presents under the tree” for the new plant pathogen genomes summarized in 4 recent publications is still too good to resist. There are 4 papers in this week’s Science that will certainly make a collection of plant pathogen biologists very happy. There are also treats for general-purpose genome biologists, with descriptions of next-generation (2nd generation) sequencing technologies, assembly methods, and comparative genomics. There is much more inside these papers than I am summarizing, so I urge you to take a look if you have access to these pay-per-view articles, or contact the authors for reprints.
These include the genome of the biotrophic oomycete and Arabidopsis pathogen Hyaloperonospora arabidopsidis (Baxter et al). While preserving the health of Arabidopsis is not a major concern of most researchers, this is an excellent model system for studying plant-microbe interactions. The genome sequence of Hpa provides a look at specialization as a biotroph. The authors found a reduction (relative to other oomycete species) in genes related to host-targeted degrading enzymes and also a reduction in necrosis factors, suggesting specialization to a biotrophic lifestyle from a necrotrophic ancestor. Hpa also does not make zoospores with flagella like its relatives, and sequence searches for 90 flagella-related genes turned up no identifiable homologs.
While the technical aspects of sequencing are less glamorous now, the authors used Sanger and Illumina sequencing to complete this genome at 45X sequencing coverage, with an estimated genome size of 80 Mb. They produced two assemblies: Velvet on the paired-end Illumina data gave a 56 Mb assembly, and PCAP on the Sanger reads (8X coverage) gave a 70 Mb assembly. The two were merged with an ad hoc procedure that relied on BLAT to scaffold and link contigs across the two assembled datasets. They used CEGMA and several in-house pipelines to annotate the genes in this assembly. Synteny analysis was completed with PHRINGE. A relatively large percentage (17%) of the genome fell into unclassified ‘Unknown repetitive sequence’, larger than in P. sojae (12%), so there remain a lot of mystery elements of unknown function in these genomes. If you jump ahead to the Blumeria genome article you’ll see this is still peanuts compared to the Blumeria genome (64%). The largest known transposable element family in Hpa was the LTR/Gypsy element. Of interest to those following the oomycete literature is the relative abundance of RXLR-containing proteins, which are typically effectors: there were still quite a few (~150, versus the ~500 seen in some Phytophthora genomes).
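For readers who have not met the RXLR motif (Arg, any residue, Leu, Arg) before, here is a minimal sketch of how one might flag candidate proteins by pattern alone. This is not the pipeline used by Baxter et al; real effector prediction also relies on signal peptide prediction and profile models. The function name, the window limits, and the toy sequences are all my own assumptions for illustration.

```python
import re

# Arg, any residue, Leu, Arg. The lookahead lets overlapping hits be reported.
RXLR = re.compile(r"(?=R.LR)")


def rxlr_positions(protein, window_start=20, window_end=100):
    """Return 1-based positions of RxLR motifs that start inside the window.

    window_start and window_end are illustrative defaults (an assumption),
    not values taken from the paper.
    """
    hits = []
    for match in RXLR.finditer(protein.upper()):
        pos = match.start() + 1  # convert to 1-based residue numbering
        if window_start <= pos <= window_end:
            hits.append(pos)
    return hits


# Toy sequences invented for illustration; these are not real proteins.
toy_proteins = {
    "toy_with_motif": "M" + "A" * 29 + "RSLR" + "G" * 40,
    "toy_without_motif": "M" + "A" * 29 + "KSLK" + "G" * 40,
    "toy_motif_too_early": "MRSLR" + "G" * 60,
}

candidates = {name: rxlr_positions(seq) for name, seq in toy_proteins.items()}

for name, positions in candidates.items():
    print(name, positions)
```

A bare pattern match like this over-predicts badly on a real proteome, which is one reason counts such as ~150 versus ~500 depend heavily on the prediction criteria used.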
A second paper covers the genome of the barley powdery mildew Blumeria graminis f.sp. hordei and two close relatives: Erysiphe pisi, a pea pathogen, and Golovinomyces orontii, an Arabidopsis thaliana pathogen (Spanu et al). These are Ascomycetes in the Leotiomycete class, where there are only a handful of genomes. Overall this paper tells a story about how obligate biotrophy has shaped the genome. What I found most striking is depicted in Figure 1. It shows that the typical genome size for (so far sampled) Pezizomycotina Ascomycetes is in the ~40-50 Mb range, whereas these powdery mildews have significantly larger genomes in the ~120-160 Mb range. These large genomes are primarily composed of transposable elements (TEs), with ~65% of the genome containing TEs. However, the protein-coding gene content is still only on the order of ~6000 genes, which is actually quite low for a filamentous Ascomycete, suggesting that despite genome expansion the functional potential shows signs of reduction. The obligate lifestyle of the powdery mildews suggested that the species had lost some autotrophic genes, and the authors further cataloged a set of ~100 genes which are missing in the mildews but are found in the core ascomycete genomes. They also document other genome cataloging results, like only a few secondary metabolite genes, although these are typically in much higher copy numbers in other filamentous ascomycetes (e.g. Aspergillus). I still don’t have a clear picture of how this gene content differs from their closest sequenced neighbors, the other Leotiomycetes Botrytis cinerea and Sclerotinia sclerotiorum, which are on the order of 12-14k genes. Since the E. pisi and G. orontii data are not yet available in GenBank or the MPI site it is hard to figure this out just yet. I presume they will be available soon.
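To put the rounded numbers above side by side, here is a small back-of-the-envelope sketch. It assumes the ~65% TE figure and the ~6000 gene count apply across the whole 120-160 Mb size range, which is my simplification rather than anything stated in the paper; the function names are hypothetical.

```python
def non_te_megabases(genome_mb, te_fraction):
    """Megabases left after subtracting the transposable-element share."""
    return genome_mb * (1.0 - te_fraction)


def genes_per_mb(gene_count, genome_mb):
    """Average number of protein-coding genes per megabase."""
    return gene_count / genome_mb


TE_FRACTION = 0.65  # ~65% TE content, rounded figure quoted above
GENE_COUNT = 6000   # ~6000 genes, rounded figure quoted above

for genome_mb in (120, 160):
    remainder = non_te_megabases(genome_mb, TE_FRACTION)
    density = genes_per_mb(GENE_COUNT, genome_mb)
    print(f"{genome_mb} Mb genome: {remainder:.0f} Mb non-TE, "
          f"{density:.1f} genes per Mb")
```

Under these assumptions the non-TE portion comes out at roughly 42-56 Mb, which is close to the ~40-50 Mb typical of other Pezizomycotina, so the size difference is largely accounted for by the repeats.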
More techie details: the authors used Sanger and second generation technologies and the Celera assembler to build the assemblies from 120X coverage sequence from a hybrid of sequencing technologies. Interestingly, for the E. pisi and G. orontii assemblies the MPI site lists the genome sizes as closer to 65 Mb in the first drafts of the assembly with 454 data, so I guess you can see what happens when the Newbler assembler overcollapses repeats. They also used a customized automated annotation with some ab initio gene finders (not sure if there was custom training or not for the various gene finders) and estimated the coverage with the CEGMA genes. I do think a fungal-specific set of core conserved genes would be in order here as a better comparison set. Some nice data like this already exist in a few databases, but it would be interesting to see if CEGMA represents a broad enough core set to estimate genome coverage versus a fungal-derived CEGMA-like set.
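The idea behind estimating coverage from a core gene set is simple bookkeeping: what fraction of genes you expect in every genome did you actually find? The sketch below shows only that final step, with made-up gene identifiers and a hypothetical function name. CEGMA itself does the hard part (searching the assembly with profile models), which is not reproduced here.

```python
def core_gene_completeness(core_genes, genes_found):
    """Return (fraction of core genes found, sorted list of missing ones)."""
    core = set(core_genes)
    if not core:
        raise ValueError("core gene set is empty")
    found = core & set(genes_found)
    return len(found) / len(core), sorted(core - found)


# Invented identifiers for illustration only.
eukaryote_core = ["core1", "core2", "core3", "core4"]
fungal_core = ["core1", "core2", "fungal1", "fungal2", "fungal3"]
found_in_assembly = ["core1", "core2", "core3", "fungal1", "other9"]

for label, core in (("eukaryote-wide", eukaryote_core),
                    ("fungal-specific", fungal_core)):
    fraction, missing = core_gene_completeness(core, found_in_assembly)
    print(f"{label}: {fraction:.0%} found, missing {missing}")
```

Running the same assembly against two different core sets, as in the loop above, is the kind of comparison I have in mind when asking whether a fungal-derived set would give a different completeness estimate.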
A third paper in this issue covers genome evolution in the massively successful pathogen Phytophthora infestans through resequencing of six genomes of related species to track the recent evolutionary history of the pathogen (Raffaele et al). The authors used high-throughput Illumina sequencing to sequence genomes of closely related species. They found a variety of differences among genes in the pathogen; among the findings, “genes in repeat-rich regions show[ed] higher rates of structural polymorphisms and positive selection”. They found that 14% of the genes experienced positive selection, and these included many (300 out of ~800) of the annotated effector genes. P. infestans also showed high rates of change in the repeat-rich regions, which is also where a lot of the disease-implicated genes are located, supporting the hypothesis of a repeat-driven expansion of the genome (as described in the 2009 genome paper). The paper generates a lot of very nice data for follow-up, helping to prioritize the genes with fast rates of evolution or profiles that suggest they have been shaped by recent adaptive evolutionary forces; these are candidates for the mechanisms of pathogenicity in this devastating plant pathogen.
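To see why “300 out of ~800” effectors stands out against a 14% background, a quick fold-enrichment calculation helps. This sketch treats the 14% as the genome-wide background rate and the counts as exact, both of which are my assumptions since the numbers above are rounded. A formal test would need the full gene counts, which I have not reproduced here.

```python
def fold_enrichment(hits_in_subset, subset_size, background_fraction):
    """Ratio of the hit rate in a subset to the background hit rate."""
    if subset_size <= 0 or background_fraction <= 0:
        raise ValueError("subset_size and background_fraction must be positive")
    subset_fraction = hits_in_subset / subset_size
    return subset_fraction / background_fraction


# Rounded figures quoted above: 300 of ~800 effectors, 14% of genes overall.
effector_enrichment = fold_enrichment(300, 800, 0.14)
print(f"Effectors under positive selection: {300 / 800:.1%}")
print(f"Fold enrichment over background: {effector_enrichment:.1f}x")
```

So, under these assumptions, effector genes show signs of positive selection at a bit under three times the background rate.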
A fourth paper describes the genome sequencing of Sporisorium reilianum, a biotrophic pathogen that is closely related to the corn smut Ustilago maydis (Schirawski et al). Both species infect maize, but while U. maydis induces tumors in the ears, leaves, and tassels of corn, S. reilianum infection is limited to the tassels and ears. The authors used comparative biology and genome sequencing to try to tease out what genetic components may be responsible for the phenotypic differences. The comparison revealed largely syntenic genomes but also found 43 regions in U. maydis that represent highly divergent sequence between the species. These regions contained a disproportionate number of secreted proteins, indicating that these secreted proteins have been evolving at a much faster rate and that they may be important for the distinct differences in the biology. The chromosome ends of U. maydis were also found to contain up to 20 additional genes in the sub-telomeric regions that were unique to U. maydis. Another fantastic finding that this sequencing and comparison revealed is more about the history of the lack of RNAi genes in U. maydis. It was a striking feature of the 2006 genome sequence that the U. maydis genome lacked a functioning copy of Dicer. However, knocking out this gene in S. reilianum, where it is present, failed to show a developmental or virulence phenotype, suggesting it is dispensable for those functions, so I think there will be some follow-ups to explore (like do either of these species make small RNAs, do they produce any that are translocated to the host, etc). The rest of the analyses covered in the manuscript identify the specific loci that are different between the two species. Interestingly, a lot of the identified loci were the same ones found as islands of secreted proteins in the first genome analysis paper, so the comparative approach was another way to get to the genes which may be important for virulence when the two organisms have different phenotypes.
This is certainly the approach that has also been taken in other plant pathogens (e.g. Mycosphaerella, Fusarium) and animal pathogens (Candida, Cryptococcus, Coccidioides), but it requires sampling species at an appropriate distance, so that the number of changes hasn’t saturated our ability to reconstruct the history at either the gene order/content or codon level.
Without the comparison of an outgroup species it is impossible to determine whether U. maydis gained functions related to the phenotypes observed here, through these speculated evolutionary changes involving new genes and newly evolved functions, or whether S. reilianum lost functionality that was present in their common ancestor. However, this paper is an example of how using a comparative approach can identify testable hypotheses for the origins of pathogenicity genes.
Hope everyone has a chance to enjoy the holidays, and to unwrap and spend some time looking at these and other science gems over the coming weeks.
Baxter, L., Tripathy, S., Ishaque, N., Boot, N., Cabral, A., Kemen, E., Thines, M., Ah-Fong, A., Anderson, R., Badejoko, W., Bittner-Eddy, P., Boore, J., Chibucos, M., Coates, M., Dehal, P., Delehaunty, K., Dong, S., Downton, P., Dumas, B., Fabro, G., Fronick, C., Fuerstenberg, S., Fulton, L., Gaulin, E., Govers, F., Hughes, L., Humphray, S., Jiang, R., Judelson, H., Kamoun, S., Kyung, K., Meijer, H., Minx, P., Morris, P., Nelson, J., Phuntumart, V., Qutob, D., Rehmany, A., Rougon-Cardoso, A., Ryden, P., Torto-Alalibo, T., Studholme, D., Wang, Y., Win, J., Wood, J., Clifton, S., Rogers, J., Van den Ackerveken, G., Jones, J., McDowell, J., Beynon, J., & Tyler, B. (2010). Signatures of Adaptation to Obligate Biotrophy in the Hyaloperonospora arabidopsidis Genome Science, 330 (6010), 1549-1551 DOI: 10.1126/science.1195203
Spanu, P., Abbott, J., Amselem, J., Burgis, T., Soanes, D., Stuber, K., Loren van Themaat, E., Brown, J., Butcher, S., Gurr, S., Lebrun, M., Ridout, C., Schulze-Lefert, P., Talbot, N., Ahmadinejad, N., Ametz, C., Barton, G., Benjdia, M., Bidzinski, P., Bindschedler, L., Both, M., Brewer, M., Cadle-Davidson, L., Cadle-Davidson, M., Collemare, J., Cramer, R., Frenkel, O., Godfrey, D., Harriman, J., Hoede, C., King, B., Klages, S., Kleemann, J., Knoll, D., Koti, P., Kreplak, J., Lopez-Ruiz, F., Lu, X., Maekawa, T., Mahanil, S., Micali, C., Milgroom, M., Montana, G., Noir, S., O’Connell, R., Oberhaensli, S., Parlange, F., Pedersen, C., Quesneville, H., Reinhardt, R., Rott, M., Sacristan, S., Schmidt, S., Schon, M., Skamnioti, P., Sommer, H., Stephens, A., Takahara, H., Thordal-Christensen, H., Vigouroux, M., Wessling, R., Wicker, T., & Panstruga, R. (2010). Genome Expansion and Gene Loss in Powdery Mildew Fungi Reveal Tradeoffs in Extreme Parasitism Science, 330 (6010), 1543-1546 DOI: 10.1126/science.1194573
Raffaele, S., Farrer, R., Cano, L., Studholme, D., MacLean, D., Thines, M., Jiang, R., Zody, M., Kunjeti, S., Donofrio, N., Meyers, B., Nusbaum, C., & Kamoun, S. (2010). Genome Evolution Following Host Jumps in the Irish Potato Famine Pathogen Lineage Science, 330 (6010), 1540-1543 DOI: 10.1126/science.1193070
Schirawski, J., Mannhaupt, G., Munch, K., Brefort, T., Schipper, K., Doehlemann, G., Di Stasio, M., Rossel, N., Mendoza-Mendoza, A., Pester, D., Muller, O., Winterberg, B., Meyer, E., Ghareeb, H., Wollenberg, T., Munsterkotter, M., Wong, P., Walter, M., Stukenbrock, E., Guldener, U., & Kahmann, R. (2010). Pathogenicity Determinants in Smut Fungi Revealed by Genome Comparison Science, 330 (6010), 1546-1548 DOI: 10.1126/science.1195330