As an update to a previous post, the N. crassa annotation has been updated to version 5 on the Broad Institute website. Previously the data was not yet available for this update, but as of 8-Mar-2011 it is. The assembly hasn’t changed, but the annotation is updated and includes some fixes to improperly renamed locus names. On the N. crassa genome site you can see files with the history of each locus across releases, which let you determine whether a locus name was improperly changed in the past. This should be rectified in the currently released annotation, and I definitely encourage you to take it for a spin and report back to the Broad Institute if you have any questions.
Here is a message from the Broad Institute about a gene annotation update that was made recently in response to an issue that was revealed in the June 2010 release. This new version is called V5 and should be on its way to GenBank.
Dear Neurospora scientists,
Recently we discovered an issue with the way locus tags were assigned to our most recent Neurospora gene set, released publicly on the Broad website in June of 2010. Many genes in this gene set have mismatched locus numbers compared to the same genes released in February 2010. Adding to the confusion, both releases were labeled version 4. To remedy this we have recalled the June locus numbers and released a new, version 5 gene set. Genes in this set have been numbered to preserve historical locus numbers (back to the original genbank release) as much as possible.
Folks who call their favorite genes by their v1, v2 or v3 numbers can search for them on our web page, which will map them to v5 automatically and accurately. The same will work for most v4 numbers. Unfortunately, 863 genes have different locus tags in the two v4 releases. If you search for one of them, you will get two hits - the v5 gene that the February edition mapped to, and the v5 gene that the June edition mapped to. Two examples to clarify:
A. Suppose you search for NCU11713.4 on our web page. This query will retrieve two genes, NCU11688.5 and NCU11713.5. The gene which in the February release was called NCU11713.4 is the same as NCU11688.5, while the gene labeled NCU11713.4 in June is the same as NCU11713.5.
B. Searching for NCU11324.4 yields but one hit because that gene, like most genes, was consistently numbered between the two releases labeled 4.
If you are not sure when you downloaded your genes, the following may help.
If you see any of these locus numbers in your gene set: NCU00129.4, NCU00457.4, NCU00499.4, NCU00556.4, NCU00627.4, NCU00685.4, NCU00768.4, NCU00856.4, NCU00986.4, NCU01064.4, NCU01065.4, NCU01282.4, NCU01299.4, NCU01300.4, NCU01483.4, NCU01559.4, NCU01560.4, NCU01610.4, NCU01611.4, NCU01664.4, NCU01665.4, NCU01871.4, NCU01903.4, NCU02200.4, NCU02259.4, NCU02666.4, NCU02758.4, NCU02837.4, NCU02998.4, NCU03047.4, NCU03206.4, NCU03773.4, NCU04239.4, NCU04240.4, NCU04518.4, NCU04519.4, NCU04710.4, NCU04711.4, NCU05275.4, NCU05512.4, NCU05776.4, NCU06013.4, NCU06370.4, NCU06732.4, NCU07107.4, NCU07259.4, NCU07260.4, NCU07301.4, NCU07405.4, NCU07856.4, NCU07857.4, NCU08090.4, NCU08182.4, NCU08323.4, NCU08332.4, NCU09085.4, NCU09256.4, NCU09257.4, NCU09998.4, NCU10166.4, NCU10574.4, NCU11040.4, NCU11240.4, NCU11253.4, NCU11376.4, NCU11390.4, NCU11393.4 then your genes are from the February 2010 gene set. However, if you see NCU00082.4, NCU00083.4, NCU00084.4, NCU00085.4, NCU00516.4, NCU01819.4, NCU04299.4, NCU04300.4, NCU04301.4, NCU04302.4, NCU04303.4, NCU04304.4, NCU04305.4, NCU05000.4, NCU05111.4, NCU05112.4, NCU05113.4, NCU05114.4, NCU05115.4, NCU05116.4, NCU05448.4, NCU05452.4, NCU06667.4, NCU07323.4, NCU09066.4, NCU10179.4, NCU10301.4, NCU10379.4, NCU10383.4, NCU10753.4, NCU10866.4, NCU10914.4, NCU11068.4, NCU11182.4, NCU12157.4, NCU12158.4, NCU12159.4, NCU12160.4, NCU12161.4, NCU12162.4, NCU12163.4, NCU12164.4, NCU12165.4, NCU12166.4, NCU12167.4, NCU12168.4, NCU12169.4, NCU12170.4, NCU12171.4, NCU12172.4, NCU12173.4, NCU12174.4, NCU12175.4, NCU12176.4, NCU12177.4, NCU12178.4, NCU12179.4, NCU12180.4, NCU12181.4, NCU12182.4, NCU12183.4, NCU12184.4, NCU12185.4, NCU12186.4, NCU12187.4, NCU12188.4 then your genes are from the June 2010 release. Attached please find five mapping tables which can be used to migrate locus numbers from any of the previous releases to the latest version 5 locus tags (linked below). We apologize for any confusion this may cause. 
Love, The Broad Institute
I’ve also uploaded the locus update files, which map between versions of the annotation.
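For anyone scripting the migration instead of searching the web page, here is a minimal Python sketch of the two steps described in the message: guessing which v4 release a gene list came from, and looking up v5 tags while allowing for two hits. The layout of the actual mapping tables isn't described here, so the two-column, tab-separated format (old tag, v5 tag) is an assumption, the function names are made up for illustration, and the diagnostic sets hold only the first few tags from each list in the message.

```python
import csv
import io
from collections import defaultdict

# Only the first few diagnostic tags from the Broad message; extend these
# sets with the full lists before relying on the result.
FEB_2010_ONLY = {"NCU00129.4", "NCU00457.4", "NCU00499.4"}
JUNE_2010_ONLY = {"NCU00082.4", "NCU00083.4", "NCU00084.4"}


def guess_v4_release(locus_tags):
    """Guess the v4 release of a gene list from its diagnostic locus tags."""
    tags = set(locus_tags)
    has_feb = bool(tags & FEB_2010_ONLY)
    has_june = bool(tags & JUNE_2010_ONLY)
    if has_feb and has_june:
        return "ambiguous"
    if has_feb:
        return "February 2010"
    if has_june:
        return "June 2010"
    return "unknown"


def load_mapping(handle):
    """Read an ASSUMED two-column TSV (old_tag, v5_tag) into {old: [v5, ...]}.

    An old tag can map to more than one v5 tag, as with the 863 genes
    that were numbered differently in the two v4 releases.
    """
    mapping = defaultdict(list)
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2 or row[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        old_tag, v5_tag = row[0], row[1]
        if v5_tag not in mapping[old_tag]:
            mapping[old_tag].append(v5_tag)
    return dict(mapping)


# Example A from the message, written in the assumed table format.
example_table = "NCU11713.4\tNCU11688.5\nNCU11713.4\tNCU11713.5\n"
mapping = load_mapping(io.StringIO(example_table))
print(mapping["NCU11713.4"])
print(guess_v4_release(["NCU00129.4", "NCU11713.4"]))
```

Keeping a list of v5 tags per old tag means the ambiguous v4 numbers stay visible instead of silently resolving to one gene.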
An article about the Fungal Genetics Stock Center written by the curators provides some insight into the 50-year history of this resource. It is a great summary of how the stock center has grown over the years and shows how essential it is to making research on filamentous fungi possible. The FGSC staff also provide important infrastructure by organizing meetings like the Neurospora and Fungal Genetics meetings, and they are active in pursuing their own research. So don’t forget to cite FGSC in your talks and (very importantly) papers.
McCluskey K, Wiest A, & Plamann M (2010). The Fungal Genetics Stock Center: a repository for 50 years of fungal genetics research. Journal of Biosciences, 35 (1), 119-26 PMID: 20413916
I spy a picture of Neurospora growing on the cover of Genetics this month. The cover highlights results from the lab of Luis Corrochano, who works on light regulation in a variety of systems like Neurospora and Phycomyces. The paper describes their work on the fluffy gene, which regulates conidiation (production of conidia or asexual spores). They show an important interplay between an inducer of the light response, the White Collar Complex (WCC), and the FLD protein in regulating fluffy. The data indicate that FLD represses fluffy in the dark but that this repression is removed in response to light through the action of WCC.
Olmedo, M., Ruger-Herreros, C., & Corrochano, L. (2009). Regulation by Blue Light of the fluffy Gene Encoding a Major Regulator of Conidiation in Neurospora crassa. Genetics, 184 (3), 651-658 DOI: 10.1534/genetics.109.109975
Too much on my plate as of late, so I’m woefully behind on posting much on interesting papers or news. Here’s a short list of links and papers that are worth a look though.
- “Evolution of pathogenicity and sexual reproduction in eight Candida genomes” published (Nature)
- NYT Science article sort of summarizing the good, bad, and ugly of fungi and human interactions
- Attempts to save amphibians from chytridiomycosis “Riders of a Modern-Day Ark” (PLoS Biology)
- Looks like Scott Baker and the JGI are in the process of resequencing several classical mutant strains of Phycomyces, Neurospora, Cochliobolus, and Cryphonectria for sequence-based mapping of mutants (i.e. here and here and here).
The JGI, in collaboration with our lab at Berkeley, has released the Neurospora tetrasperma (mat A) and N. discreta (mat A) genome sequences and annotation after about two years of work. These two species are closely related to the well-studied laboratory workhorse Neurospora crassa.
The N. tetrasperma assembly (8X) has an N50 of 976 kb and is highly colinear with the N. crassa genome. With the JGI, we’ve also done some additional 454 sequencing, which will give an improved assembly at 23X coverage in the next release. We also did some comparative scaffolding and can basically double that N50; most of it looks good when compared to the improved V2 assembly.
The N. discreta assembly (8X) is also quite good, with an N50 of 2.3 Mb. For comparison, V7 of N. crassa has an N50 of 664 kb, although with genetic map information the 250+ contigs can be scaffolded into 7 chromosomes with 146 unmapped contigs.
Both the N. discreta and N. tetrasperma genomes contain about 10k predicted genes, similar to counts in other related species like N. crassa and Podospora anserina.
We’re finalizing several analyses to present at the Asilomar meeting to describe these Neurospora genomes and comparisons with other Sordariomycete species.
This is a research blog, so I thought I’d post some quick numbers we are seeing for de novo assembly of the Neurospora crassa genome using Velvet. The genome of N. crassa is about 40Mb, and several flow cells were sequenced using Solexa/Illumina technology to see what kind of de novo reconstruction we’d get. I knew that this is probably insufficient for a very good assembly given what has been reported in the literature, but sometimes it is helpful to give it a try on local data. This project has mostly been about SNP discovery from the outset. I ran Velvet with a hash size of 21 on an early (2FC) and a later (4FC) dataset; that value was chosen based on some calculations and on running Velvet with different hash sizes to find the optimal N50. Summary contig size numbers come from the following command, using cndtools from Colin Dewey.
faLen < contigs.fa | stats
2 flow cells (~10M reads @36bp/read; or about 10X coverage of a 40Mb genome)
N = 199562 SUM = 25463251 MIN = 49 1ST-QUARTILE = 87 MEDIAN = 107.0 3RD-QUARTILE = 146 MAX = 5371 MEAN = 127.59568956 N50 = 130
4 flow cells (~20M reads @36bp/read; or about 20X coverage of a 40Mb genome)
N = 102437 SUM = 38352075 MIN = 41 1ST-QUARTILE = 77.0 MEDIAN = 153 3RD-QUARTILE = 467 MAX = 7189 MEAN = 374.396702363 N50 = 837
So that’s an N50 of 837bp; for those used to seeing N50 on the order of 1.5Mb this is not great. But it comes from 4 FC worth of sequencing, which was pretty cheap. This is a reasonably repeat-limited genome, so we should get pretty good recovery if the sequence coverage is high enough. Using Maq we can both scaffold the reads and recover a sufficient number of high quality SNPs for the mapping part of the project.
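For readers without cndtools installed, the N50 figure can be reproduced with a few lines of Python. This is a minimal sketch using the common definition of N50 (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the summed contig length); I have not checked that the cndtools stats command uses exactly the same convention, and the function names are just for illustration.

```python
import io


def fasta_lengths(handle):
    """Yield the sequence length of each record in a FASTA file handle."""
    length = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length  # finish the previous record
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length  # finish the last record


def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Tiny in-memory example; for real data, open contigs.fa instead.
toy_fasta = io.StringIO(">a\nACGT\nAC\n>b\nACG\n")
toy_lengths = list(fasta_lengths(toy_fasta))
print(toy_lengths, n50(toy_lengths))
```

On a real assembly this would be called as `n50(fasta_lengths(open("contigs.fa")))`.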
To get a better assembly one would need much deeper coverage, as Daniel and Ewan explain in their Velvet paper and show in Figure 4 (sorry, not open-access for 6 mo). Full credit: this sequence came from unpaired reads from Illumina/Solexa genomic sequencing done at the UCB/QB3 facility on libraries prepared by Charles Hall in the Glass lab.
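The coverage figures quoted for the two datasets are back-of-the-envelope estimates: number of reads times read length, divided by genome size. A short Python sketch of that arithmetic, using the rounded read counts and the ~40Mb genome size given above (the function name is just for illustration):

```python
def expected_coverage(n_reads, read_length, genome_size):
    """Average fold coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


GENOME_SIZE = 40_000_000  # ~40Mb, the approximate figure used in this post

two_fc = expected_coverage(10_000_000, 36, GENOME_SIZE)
four_fc = expected_coverage(20_000_000, 36, GENOME_SIZE)
print(two_fc, four_fc)  # 9.0 and 18.0, i.e. roughly 10X and 20X
```

This counts every read as usable, so the effective coverage after quality filtering will be somewhat lower.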
The genome of the Podospora anserina S mat+ strain was sequenced by Genoscope and CNRS and published recently in Genome Biology. The genome sequence data has been available for several years, but it is great to see a publication describing the findings. The 10X genome assembly with ~10,000 genes provides an important dataset for comparisons among filamentous Sordariomycete fungi. The authors primarily focused on comparative genomics of Podospora and Neurospora crassa, the next closest model filamentous species. Within the Sordariomycetes there is now a very interesting collection of closely related species, which can be useful for applying synteny and phylogenomics approaches.
The analyses in the manuscript focused on the differences between Neurospora and Podospora, identifying some key differences in carbon utilization that contrast the coprophilic (Podospora) and plant saprophyte (Neurospora) lifestyles. There are several observations of gene family expansions in the Podospora genome, which could be interpreted as additional enzyme capacity to break down carbon sources that are present in dung.
The genome of Neurospora has been shaped by genome defense mechanisms like RIP, which has been one interpretation of its reduced number of large gene families and paucity of transposons. The authors report a surprising finding: although Podospora shares orthologs of the genes involved in several genome defense mechanisms, they find fewer repetitive sequences in Podospora than in Neurospora, and yet Podospora still lacks good evidence of RIP.
Overall, these data suggest that P. anserina has experienced a fairly complex history of transposition and duplications, although it has not accumulated as many repeats as N. crassa. P. anserina possesses all the orthologues of N. crassa factors necessary for gene silencing, including RIP, meiotic MSUD and also vegetative quelling, a post transcriptional gene silencing mechanism akin to RNA interference
I think these data and observations interleave nicely with the work our group is doing on genome evolution in several Neurospora species that have different mating systems. The gene components that play a role in MSUD and RIP are found in Podospora, and yet there is little evidence of RIP and no observed meiotic silencing there, which suggests some interesting events on the Neurospora branch to be explored. The potentially different degrees of RIP efficiency and types of mating systems (heterothallic and pseudohomothallic) among the Neurospora spp. may also provide a link to understanding how RIP evolved and its role in N. crassa evolution.
Senescence in Podospora
Another aspect of Podospora biology that isn’t touched on is the use of the fungus as a model for senescence. The fungus exhibits maternal senescence, which involves targeted changes in the mitochondria that lead to cell death. The evolutionary and molecular basis for this process has been of interest to many research groups, and the genome sequence can provide an additional toolkit for identifying the factors involved in the apoptosis process in this filamentous fungus. Whether it will help find a real link for aging research in other eukaryotes remains to be seen, but it is a good model system for some aspects of how aging and damage to mtDNA are linked.
Espagne, E., Lespinet, O., Malagnac, F., Da Silva, C., Jaillon, O., Porcel, B.M., Couloux, A., Aury, J., et al (2008). The genome sequence of the model ascomycete fungus Podospora anserina. Genome Biology, 9(5), R77. DOI: 10.1186/gb-2008-9-5-r77