Tag Archives: resequencing

Yeast population genomics

I have cheered the work of the Sanger-Wellcome SGRP group to generate genome sequences for multiple Saccharomyces cerevisiae and S. paradoxus strains. The group had previously submitted a version of the manuscript to Nature Precedings, and it is now published in Nature AOP, showing that submitting to a preprint server doesn’t necessarily hurt your chances of getting a manuscript published. The research groups explored the impact of domestication on the Saccharomyces genome (as was also recently done for Aspergillus oryzae, the worker fungus of sake and soy sauce production) by comparing S. cerevisiae strains with individuals from wild strains of S. paradoxus.

This paper addressed several challenges, including methodology for light genome sequencing for population genomics. These data represent, in a way, a pilot for genome resequencing projects and for draft genome sequencing with next-generation sequencing tools. Of course, with the pace of sequencing technology development, it seems any project more than a couple of months old will be using outdated technology, but this work represents some important progress. Tools like MAQ were also developed and tuned as part of the project. In addition to the methods development, it also provided a new look at the evolutionary dynamics of a well-studied fungus.

Genome assembly
The authors apply several different quality controls and use a new tool called PALAS (Parallel ALignment and ASsembly) to assemble all the strains at the same time, using a graph-based approach that relies on the reference genome sequence for each species. This is different from a full-blown whole-genome assembly (WGA) approach like PCAP, Phusion, or Arachne because this is a deliberately low-coverage sequencing pass. The authors try to impute missing sequence via Ancestral Recombination Graphs as implemented in the Margarita system. They also use MAQ to align reads from Illumina/Solexa sequencing to the assemblies made by PALAS.

Since this project covered two species of Saccharomyces, S. cerevisiae and S. paradoxus, they needed good reference assemblies for each species. The previously available S. paradoxus assembly wasn’t complete enough for this study, so they added 4.3X coverage with Sanger/ABI sequencing and 80X coverage with Illumina.
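To put those fold-coverage numbers in perspective, here is a back-of-the-envelope sketch of the usual coverage arithmetic (total sequenced bases divided by genome size). The genome size (~12 Mb) and the read lengths below are my own round-number assumptions for illustration; they are not figures taken from the paper.

```python
from math import ceil


def fold_coverage(num_reads, mean_read_length, genome_size):
    """Expected fold coverage: total sequenced bases / genome size."""
    return num_reads * mean_read_length / genome_size


def reads_needed(target_coverage, mean_read_length, genome_size):
    """Number of reads required to reach a target fold coverage."""
    return ceil(target_coverage * genome_size / mean_read_length)


if __name__ == "__main__":
    GENOME_SIZE = 12_000_000   # assumption: roughly 12 Mb yeast genome
    SANGER_READ_LEN = 700      # assumption: typical Sanger read length
    ILLUMINA_READ_LEN = 36     # assumption: short reads of that era

    print("Sanger reads for 4.3X:", reads_needed(4.3, SANGER_READ_LEN, GENOME_SIZE))
    print("Illumina reads for 80X:", reads_needed(80, ILLUMINA_READ_LEN, GENOME_SIZE))
```

The point is only that a few tens of thousands of long reads give light coverage, while deep short-read coverage takes tens of millions of reads.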

Population genomics and domestication

The sequencing data also provided a framework for population genetic investigations. One simple finding was that, within each species, isolates from the same geographic region were more genetically similar to each other than to isolates from other regions. The main geographic regions sampled for S. paradoxus included the UK, America, and the Far East, and some of these samples had been analyzed in a very nice study on Chromosome III. The S. cerevisiae samples included individuals from around Europe (among them at least 10 European wine strains), Malaysia, sake brewing strains, West Africa, and North America. From these data it was possible to discover that several strains have mosaic genomes, meaning that some pieces of the genome match best with the sake fermentation strains and other parts with the wine/European samples.

Efforts to detect the effects of natural selection that may be linked to domestication of these strains explored two different approaches. The McDonald-Kreitman test did not identify any loci under positive selection, while Tajima’s D was negative in the S. cerevisiae global and wine strain populations, indicating an excess of singleton polymorphisms, though the authors draw few conclusions from that. The authors also observed a sharper decay of linkage disequilibrium in S. cerevisiae (half maximum of 3 kb) than in S. paradoxus (half maximum of 9 kb), suggesting that S. cerevisiae is recombining more, either because it has more opportunities to outcross or because recombination events are more frequent when it does.
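For readers who haven’t computed Tajima’s D before, here is a minimal Python sketch of the standard statistic (Tajima 1989), which compares the mean number of pairwise differences with the number of segregating sites. This is only an illustration and not the authors’ pipeline: the function name and toy sequences are made up, and it assumes equal-length aligned sequences with no gaps or missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for equal-length aligned sequences (no gaps/missing data)."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) alignment columns
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy data: every variant is a singleton, so D comes out negative.
    singletons = ["AAAA", "TAAA", "ATAA", "AATA", "AAAT"]
    # Toy data: variants at intermediate frequency, so D comes out positive.
    balanced = ["AA", "AA", "TT", "TT"]
    print(tajimas_d(singletons), tajimas_d(balanced))
```

An excess of rare variants pulls the pairwise diversity below what the count of segregating sites would predict, which is why the singleton-heavy toy sample gives a negative value.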

In the context of the paper title and the idea of exploring the effects of domestication on the genome, the authors observe that the standard paradigm that ‘domesticated’ species have lower diversity levels is simply not the case in these samples. This isn’t to say there is no evidence of selection for fermentation in these strains, based on the stress response conditions they were tested on, but there is still ample evidence that diversity is maintained within the populations, presumably through various amounts of outcrossing.

We are also interested in these results as we apply similar questions to the population genomics of the human pathogenic fungus Coccidioides, where 14 strains have been sequenced with Sanger sequencing technology. Hopefully some of these lessons will resonate in our analyses, and this era of population genomics will see ever more extensive collections to address aspects of migration, phylogeography, and local adaptation within populations of fungi and other microbes.

Gianni Liti, David M. Carter, Alan M. Moses, Jonas Warringer, Leopold Parts, Stephen A. James, Robert P. Davey, Ian N. Roberts, Austin Burt, Vassiliki Koufopanou, Isheng J. Tsai, Casey M. Bergman, Douda Bensasson, Michael J. T. O’Kelly, Alexander van Oudenaarden, David B. H. Barton, Elizabeth Bailes, Alex N. Nguyen, Matthew Jones, Michael A. Quail, Ian Goodhead, Sarah Sims, Frances Smith, Anders Blomberg, Richard Durbin, Edward J. Louis (2009). Population genomics of domestic and wild yeasts. Nature. DOI: 10.1038/nature07743

New Saccharomyces resequencing assembly

David Carter at the Sanger Centre emailed a message that new assemblies from the Saccharomyces strain resequencing project have been posted, including a new three-way alignment of S. bayanus, S. paradoxus, and S. cerevisiae. This updates the Dec 2007 release.


More updates on Saccharomyces resequencing project at Sanger

I’ve paraphrased an email sent by David Carter to folks interested in the Saccharomyces resequencing project.

The latest version of the SGRP data is on the web site and ftp site. This release is somewhat provisional, and motivated more by the fact that we have a paper deadline coming up than by any claim to finality. It should be quite a bit better than what was there before, but doesn’t have a correct treatment of transposons.

You can get the data by starting here:
http://www.sanger.ac.uk/Teams/Team71/durbin/sgrp/datadoc.shtml

There is also a new version of the browser:
http://www.sanger.ac.uk/Teams/Team71/durbin/sgrp/browser.shtml

There are a few new features in the browser which [David] is going to document over the next couple of days.

Major new features of the data are that there should be much better consistency between alignments; Solexa/Illumina data has been incorporated for the strains that had it; and the S. paradoxus alignments are based on a new assembly that was created a few weeks ago and covers about 95% of the genome. A description is at
http://www.sanger.ac.uk/Teams/Team71/durbin/sgrp/spara_assembly.shtml

Yeast resequencing update

Ed Louis at Nottingham sent out an email today outlining plans for publishing analyses of the Saccharomyces Genome Resequencing Project. They are in the process of analyzing the data and ask that people respect their use of the data, but they also invite collaborations and companion papers.

“If anyone has done or plans on doing a global analysis with a tight clean result which you think should be included in the overview paper, please contact us [Richard Durbin and Ed Louis; emails available through above links]. The analysis would have to be complete by 14 December and you would have to be willing to have the details transparently displayed on the web pages associated with the project.”

Next gen sequencing technology

Nature has an overview of what goes in and out of next generation sequencers, along with an interview with a smiling Chad Nusbaum from the Broad Institute. Most of these machines have been out and about for a while, but it seems that the hayride/bandwagon is starting to pick up more steam: GT’s Genome Scan has several posts about sequencing that reference J. Craig V, George Church, and the Nature news article (not free).

Note that Solexa is no longer the cool name: “Genome Analyzer” is now the name for the machine that was previously called Solexa 1G. I’m holding out hope for funnier names in the future. I do feel that ABI’s choice of SOLiD is more exciting than 310/3700/3730, which are about as inspiring as HAL9000.

But I mean, if your technology is called pyrosequencing, I am hoping Roche will come up with a fiery or at least smoldering play on words if they rename 454 again (GS FLX for now).