Tag Archives: pathogen

Postdoc: Population genomics and speciation in fungal pathogens

Post-doc position in genomics of introgressions in fungal pathogens

[from EvolDir]

We invite applications for a postdoctoral position in the Research Institute of Horticulture and Seeds. The position is for 1 year starting as soon as January 2016.

The Postdoc will conduct research in the field of population genomics of secondary contacts and introgression in two fungal pathogens: Venturia inaequalis, an ascomycete responsible for apple scab, and the Scedosporium apiospermum species complex, which is responsible for pulmonary infections in children with cystic fibrosis.

The Postdoc will have to identify genomic regions involved in introgression between divergent populations of Venturia inaequalis and Scedosporium species. Indeed, secondary contacts between divergent genomic pools may favour the creation of new genetic combinations of loci involved in pathogenicity. New hybrids should then exhibit hitherto unseen epidemiological properties. The Postdoc will work in a team involved in several projects of genetics or genomics, functional genomics, and evolutionary epidemiology (IRHS – ECOFUN team).

Using resequenced genomes (89 for V. inaequalis and 23 for the Scedosporium species complex), the Postdoc will be in charge of assembly, genome alignment and SNP calling, prior to population genomics analyses. The Postdoc will have to infer evolutionary histories at the interspecies and species levels for both datasets, and identify and characterise genomic regions involved in introgressions. He or she may collaborate with all the researchers involved in this project: population geneticists, microbiologists, functional genomicists, phytopathologists.

We are looking for a candidate with a keen interest in population genomics and evolutionary history in structured populations. The candidate must hold a PhD in population genomics with strong skills in bioinformatics (manipulation of NGS data, assembly, demographic inferences). Good written communication skills and the ability to work as part of a team are required.

How to apply:
Applicants should submit

  1. a cover letter describing their research interests and background,
  2. a detailed CV (including list of publications), and
  3. the contact details of three references

to bruno.lecam@angers.inra.fr or christophe.lemaire@univ-angers.fr. The cover letter should also include possible starting dates.

Job: Faculty Positions at Institut Pasteur

FACULTY POSITIONS IN MYCOLOGY

The Institut Pasteur in Paris announces an international call for outstanding candidates at all levels to establish independent research groups in the Mycology Department. Preference will be given to studies on human pathogenic filamentous fungi and yeasts, fungal cell biology or population genetics and genomics. Research on model species will also be considered when connected to fungal pathogenesis. Attractive start-up and ongoing support includes salary, equipment, and operating costs. In addition, Institut Pasteur provides access to state-of-the-art technology platforms, and to laboratories and research infrastructure in disease-endemic regions through the Pasteur International Network. Further information on the Institute and on-campus facilities can be found at http://www.pasteur.fr. Further information on the Mycology Department can be found at http://www.pasteur.fr/mycology.

The application should comprise the following (in order) in a single pdf file: i) A brief introductory letter, ii) A Curriculum Vitae, a list of 10 selected publications and a full publication list, iii) A description of past and present research activities (up to 3 pages), iv) The proposed research project (up to 6 pages, including a summary).

Junior candidates [1] should also provide:
v) The names of 3 scientists from whom letters of recommendation can be sought, together with the names of scientists with a potential conflict of interest from whom evaluations should not be requested.

Applications and requests for information should be addressed to myco_call2015@pasteur.fr by February 27, 2015. Short-listed candidates will be invited for interviews in spring 2015 and decisions will be announced by summer 2015.

[1] Institut Pasteur is an equal opportunity employer. Junior group leaders should be less than 8 years after PhD at the time of submission. Women are eligible up to 11 years after their PhD if they have one child and up to 14 years after their PhD if they have two or more children.

Flier from the posting DeptMyco_call2015

Microsporidia genomes on the way

New genomes from Microsporidia are on the way from the Broad Institute and other groups, and will be a boon to those working on these fascinating creatures. Microsporidia are obligate intracellular parasites of eukaryotic cells and many can cause serious disease in humans. Some parasitize worms and insects too. The evolutionary placement of these species in the fungi is still debated, with recent evidence placing them as derived members of the Mucormycotina based on shared synteny (conserved gene order), in particular around the mating type locus.  Their highly derived characteristics and long branches still make them hard to place within the Fungal kingdom.  The synteny-based evidence was another way to find a phylogenetic placement for them, but it would be helpful to have additional support in the form of more shared derived characteristics that group Mucormycotina and Microsporidia. There is hope that an increased number of genome sequences and phylogenomic approaches can help resolve the placement and further our understanding of the evolution of the group.

For data analysis, a new genome database for comparing these genomes, MicrosporidiaDB, is now online. This project has begun incorporating the available genomes and providing a data mining interface that extends from the EuPathDB project.

Presents for the holidays – Plant pathogen genomes

Though a bit cliche, I think the metaphor of “presents under the tree” for some new plant pathogen genomes summarized in 4 recent publications is still too good to resist.  There are 4 papers in this week’s Science that will certainly make a collection of plant pathogen biologists very happy. There are also treats for the general purpose genome biologists, with descriptions of next generation/2nd generation sequencing technologies, assembly methods, and comparative genomics. There is much more inside these papers than I am summarizing, so I urge you to take a look if you have access to these pay-for-view articles, or contact the authors for reprints to get a copy.

Hyaloperonospora

These include the genome of the biotrophic oomycete and Arabidopsis pathogen Hyaloperonospora arabidopsidis (Baxter et al). While preserving the health of Arabidopsis is not a major concern of most researchers, this is an excellent model system for studying plant-microbe interaction.  The genome sequence of Hpa provides a look at specialization as a biotroph. The authors found a reduction (relative to other oomycete species) in factors related to host-targeted degrading enzymes and also a reduction in necrosis factors, suggesting specialization to a biotrophic lifestyle from a necrotrophic ancestor. Hpa also does not make zoospores with flagella like its relatives, and sequence searches for 90 flagella-related genes turned up no identifiable homologs.

While the technical aspects of sequencing are less glamorous now, the authors used Sanger and Illumina sequencing to complete this genome at 45X sequencing coverage, with an estimated genome size of 80 Mb. They produced two assemblies: a 56 Mb assembly built with Velvet from the paired-end Illumina data, and a 70 Mb assembly built with PCAP from the Sanger reads (8X coverage). The two were merged with an ad hoc procedure that relied on BLAT to scaffold and link contigs across the two assembled datasets. They used CEGMA and several in-house pipelines to annotate the genes in this assembly. Synteny analysis was completed with PHRINGE. A relatively large percentage (17%) of the genome fell into ‘Unknown repetitive sequence’ that is unclassified, larger than in P. sojae (12%), so there remain a lot of mystery elements of unknown function in these genomes.  If you jump ahead to the Blumeria genome article you’ll see this is still peanuts compared to Blumeria’s genome (64%). The largest known transposable element family in Hpa was the LTR/Gypsy element. Of interest to some following the oomycete literature is the relative abundance of the RXLR-containing proteins, which are typically effectors: there were still quite a few (~150 instead of the ~500 seen in some Phytophthora genomes).
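The coverage and assembly figures quoted above can be related with a little arithmetic. The following is only a minimal sketch: the genome size (80 Mb), the 45X depth and the 56 Mb and 70 Mb assembly sizes come from the post, while the read count and read length are hypothetical values picked so the numbers work out, not figures from the paper.

```python
def sequencing_coverage(n_reads: int, mean_read_length: float, genome_size: float) -> float:
    """Average depth of coverage: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * mean_read_length / genome_size


def assembled_fraction(assembly_size: float, genome_size: float) -> float:
    """Fraction of the estimated genome that an assembly spans."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return assembly_size / genome_size


GENOME_SIZE = 80e6  # estimated Hpa genome size quoted in the post (80 Mb)

# Hypothetical read set (assumption): chosen only so the arithmetic gives 45X.
coverage = sequencing_coverage(n_reads=48_000_000, mean_read_length=75,
                               genome_size=GENOME_SIZE)

velvet_fraction = assembled_fraction(56e6, GENOME_SIZE)  # Velvet assembly, 56 Mb
pcap_fraction = assembled_fraction(70e6, GENOME_SIZE)    # PCAP assembly, 70 Mb

print(f"coverage: {coverage:.0f}X")
print(f"Velvet assembly spans {velvet_fraction:.1%} of the estimated genome")
print(f"PCAP assembly spans {pcap_fraction:.1%} of the estimated genome")
```

Read this way, neither assembly reaches the estimated genome size, which is consistent with repeats being hard to assemble, though that interpretation is mine rather than something the paper states here.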

Blumeria

 

A second paper covers the genome of the barley powdery mildew Blumeria graminis f.sp. hordei and two close relatives: Erysiphe pisi, a pea pathogen, and Golovinomyces orontii, an Arabidopsis thaliana pathogen (Spanu et al).  These are Ascomycetes in the Leotiomycete class, where there are only a handful of genomes. Overall this paper tells a story about how obligate biotrophy has shaped the genome. What I found most striking is depicted in Figure 1. It shows that the typical genome size for (so far sampled) Pezizomycotina Ascomycetes is in the ~40-50 Mb range, whereas these powdery mildews have significantly larger genomes in the ~120-160 Mb range. These large genomes are primarily composed of Transposable Elements (TE), with ~65% of the genome containing TE. However, the protein coding gene content is still only on the order of ~6000 genes, which is actually quite low for a filamentous Ascomycete, suggesting that despite genome expansion the functional potential shows signs of reduction.  The obligate lifestyle of the powdery mildews suggested that the species had lost some autotrophic genes, and the authors further cataloged a set of ~100 genes which are missing in the mildews but are found in the core ascomycete genomes. They also document other genome cataloging results, such as the presence of only a few secondary metabolite genes, although these are typically in much higher copy numbers in other filamentous ascomycetes (e.g. Aspergillus).  I still don’t have a clear picture of how this gene content differs from that of their closest sequenced neighbors, the other Leotiomycetes Botrytis cinerea and Sclerotinia sclerotiorum, which are on the order of 12-14k genes. Since the E. pisi and G. orontii data are not yet available in GenBank or on the MPI site it is hard to figure this out just yet. I presume they will be available soon.

More techie details: the authors used Sanger and second generation technologies and utilized the Celera assembler to build the assemblies from 120X coverage sequence from a hybrid of sequencing technologies.  Interestingly, for the E. pisi and G. orontii assemblies the MPI site lists the genome sizes as closer to 65 Mb in the first drafts of the assembly with 454 data, so I guess you can see what happens when the Newbler assembler overcollapses repeats. They also used a customized automated annotation with some ab initio gene finders (not sure if there was custom training or not for the various gene finders) and estimated the coverage with the CEGMA genes. I do think a Fungal-specific set of core conserved genes would be in order here as a better comparison set. Some nice data like this already exist in a few databases, but it would be interesting to see if CEGMA represents a broad enough core set to estimate genome coverage vs a Fungal-derived CEGMA-like set.
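To make the idea of estimating coverage from a core gene set concrete, here is a minimal sketch of the bookkeeping. It is not CEGMA itself and does not call any CEGMA code: it assumes you already have a list of core gene identifiers and a list of those detected in an assembly, and all identifiers below are hypothetical.

```python
from typing import Iterable, List, Tuple


def core_gene_completeness(core_genes: Iterable[str],
                           genes_found: Iterable[str]) -> Tuple[float, List[str]]:
    """Return (fraction of core genes detected, sorted list of missing core genes)."""
    core = set(core_genes)
    if not core:
        raise ValueError("core gene set must not be empty")
    found = core & set(genes_found)  # ignore hits that are not in the core set
    missing = sorted(core - found)
    return len(found) / len(core), missing


# Hypothetical identifiers (assumption): stand-ins for a real core gene list.
eukaryote_core = ["CORE001", "CORE002", "CORE003", "CORE004"]
detected_in_assembly = ["CORE001", "CORE003", "CORE004", "OTHER999"]

fraction, missing = core_gene_completeness(eukaryote_core, detected_in_assembly)
print(f"{fraction:.0%} of core genes detected; missing: {missing}")
```

Swapping `eukaryote_core` for a fungal-derived list would give the comparison suggested above; the same assembly could score differently depending on how well the core set matches the lineage.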

 

A third paper in this issue covers genome evolution in the massively successful pathogen Phytophthora infestans through resequencing of six genomes of related species to track the recent evolutionary history of the pathogen (Raffaele et al). The authors used high throughput Illumina sequencing to sequence genomes of closely related species. They found a variety of differences among genes in the pathogen; among the findings, “genes in repeat-rich regions show[ed] higher rates of structural polymorphisms and positive selection”. They found that 14% of the genes experienced positive selection and these included many (300 out of ~800) of the annotated effector genes. P. infestans also showed high rates of change in the repeat-rich regions, which is also where a lot of the disease-implicated genes are located, supporting the hypothesis of repeat-driven expansion of the genome (as described in the 2009 genome paper). The paper generates a lot of very nice data for followup by helping to prioritize the genes with fast rates of evolution or profiles that suggest they have been shaped by recent adaptive evolutionary forces and are candidates for the mechanisms of pathogenicity in this devastating plant pathogen.

 

A fourth paper describes the genome sequencing of Sporisorium reilianum, a biotrophic pathogen that is closely related to the corn smut Ustilago maydis (Schirawski et al). Both species infect maize hosts, but while U. maydis induces tumors in the ears, leaves, and tassels of corn, the S. reilianum infection is limited to the inflorescences (tassels and ears). The authors used comparative biology and genome sequencing to try and tease out what genetic components may be responsible for the phenotypic differences. The comparison revealed largely syntenic genomes but also found 43 regions in U. maydis that represent highly divergent sequence between the species. These regions contained a disproportionate number of secreted proteins, indicating that these secreted proteins have been evolving at a much faster rate and that they may be important for the distinct differences in the biology. The chromosome ends of U. maydis were also found to contain up to 20 additional genes in the sub-telomeric regions that were unique to U. maydis. Another fantastic finding that this sequencing and comparison revealed is more about the history of the lack of RNAi genes in U. maydis. It was a striking feature from the 2006 genome sequence that the genome lacked a functioning copy of Dicer. However, knocking out this gene in S. reilianum failed to show a developmental or virulence phenotype, suggesting it is dispensable for those functions, so I think there will be some followups to explore (like do either of these species make small RNAs, do they produce any that are translocated to the host, etc).  The rest of the analyses covered in the manuscript identify the specific loci that are different between the two species. Interestingly, a lot of the identified loci were the same ones found as islands of secreted proteins in the first genome analysis paper, so the comparative approach was another way to get to the genes which may be important for virulence if the two organisms have different phenotypes.
This is certainly the approach that has also been taken in other plant pathogens (e.g. Mycosphaerella, Fusarium) and animal pathogens (Candida, Cryptococcus, Coccidioides), but it requires sampling species at an appropriate distance so that the number of changes hasn’t saturated our ability to reconstruct the history either at the gene order/content or codon level.

Without the comparison of an outgroup species it is impossible to determine if U. maydis gained functions that relate to the phenotypes observed here, through these speculated evolutionary changes involving new genes and newly evolved functions, or if S. reilianum lost functionality that was present in their common ancestor. However, this paper is an example of how using a comparative approach can identify testable hypotheses for the origins of pathogenicity genes.

 

Hope everyone has a chance to enjoy the holidays and to unwrap and spend some time looking at these and other science gems over the coming weeks.

 

Baxter, L., Tripathy, S., Ishaque, N., Boot, N., Cabral, A., Kemen, E., Thines, M., Ah-Fong, A., Anderson, R., Badejoko, W., Bittner-Eddy, P., Boore, J., Chibucos, M., Coates, M., Dehal, P., Delehaunty, K., Dong, S., Downton, P., Dumas, B., Fabro, G., Fronick, C., Fuerstenberg, S., Fulton, L., Gaulin, E., Govers, F., Hughes, L., Humphray, S., Jiang, R., Judelson, H., Kamoun, S., Kyung, K., Meijer, H., Minx, P., Morris, P., Nelson, J., Phuntumart, V., Qutob, D., Rehmany, A., Rougon-Cardoso, A., Ryden, P., Torto-Alalibo, T., Studholme, D., Wang, Y., Win, J., Wood, J., Clifton, S., Rogers, J., Van den Ackerveken, G., Jones, J., McDowell, J., Beynon, J., & Tyler, B. (2010). Signatures of Adaptation to Obligate Biotrophy in the Hyaloperonospora arabidopsidis Genome. Science, 330 (6010), 1549-1551. DOI: 10.1126/science.1195203

Spanu, P., Abbott, J., Amselem, J., Burgis, T., Soanes, D., Stuber, K., Loren van Themaat, E., Brown, J., Butcher, S., Gurr, S., Lebrun, M., Ridout, C., Schulze-Lefert, P., Talbot, N., Ahmadinejad, N., Ametz, C., Barton, G., Benjdia, M., Bidzinski, P., Bindschedler, L., Both, M., Brewer, M., Cadle-Davidson, L., Cadle-Davidson, M., Collemare, J., Cramer, R., Frenkel, O., Godfrey, D., Harriman, J., Hoede, C., King, B., Klages, S., Kleemann, J., Knoll, D., Koti, P., Kreplak, J., Lopez-Ruiz, F., Lu, X., Maekawa, T., Mahanil, S., Micali, C., Milgroom, M., Montana, G., Noir, S., O’Connell, R., Oberhaensli, S., Parlange, F., Pedersen, C., Quesneville, H., Reinhardt, R., Rott, M., Sacristan, S., Schmidt, S., Schon, M., Skamnioti, P., Sommer, H., Stephens, A., Takahara, H., Thordal-Christensen, H., Vigouroux, M., Wessling, R., Wicker, T., & Panstruga, R. (2010). Genome Expansion and Gene Loss in Powdery Mildew Fungi Reveal Tradeoffs in Extreme Parasitism. Science, 330 (6010), 1543-1546. DOI: 10.1126/science.1194573

Raffaele, S., Farrer, R., Cano, L., Studholme, D., MacLean, D., Thines, M., Jiang, R., Zody, M., Kunjeti, S., Donofrio, N., Meyers, B., Nusbaum, C., & Kamoun, S. (2010). Genome Evolution Following Host Jumps in the Irish Potato Famine Pathogen Lineage. Science, 330 (6010), 1540-1543. DOI: 10.1126/science.1193070

Schirawski, J., Mannhaupt, G., Munch, K., Brefort, T., Schipper, K., Doehlemann, G., Di Stasio, M., Rossel, N., Mendoza-Mendoza, A., Pester, D., Muller, O., Winterberg, B., Meyer, E., Ghareeb, H., Wollenberg, T., Munsterkotter, M., Wong, P., Walter, M., Stukenbrock, E., Guldener, U., & Kahmann, R. (2010). Pathogenicity Determinants in Smut Fungi Revealed by Genome Comparison. Science, 330 (6010), 1546-1548. DOI: 10.1126/science.1195330

Dynamics of amphibian pathogen infection cycles

Two papers out this week on the population dynamics and epidemiology of the chytrid pathogen of amphibians, Batrachochytrium dendrobatidis (Bd). This is work from the Vredenburg and Briggs labs that includes several decade-long studies of frog declines and the prevalence of Bd.

See Vance in action swabbing a frog

In the Briggs et al paper, the authors describe a 5-year study of the fungal load in surviving populations of frogs in Sierra Nevada mountain lakes.  They find that adult frogs that have a low enough fungal load escape chytridiomycosis and can actually lose and regain infection. They propose that fungal load dynamics are the reason behind the differential survival of various populations of mountain frogs. They conclude that:

“Importantly, model results suggest that host persistence versus extinction does not require differences in host susceptibility, pathogen virulence, or environmental conditions, and may be just epidemic and endemic population dynamics of the same host–pathogen system.”

So they propose that differences among the populations that are coming down with the disease are due only to “density-dependent host–pathogen dynamics”, not to some populations being resistant. They go on to provide a detailed model of persistence of the host and pathogen, chance of reinfection, and survival of the host, which is derived from the long-term study data.  There are many more interesting findings and models proposed in the paper. It also further reinforces (for me) the need to know more about the molecular basis of the host-pathogen interactions, how the fungus persists without a host, how it overwinters, and the infection dynamics when zoospores disperse from infected frogs.

The Vredenburg et al paper addresses the dynamics of population decline in the mountain yellow-legged frogs, drawing on studies spanning 1-5 and 9-13 years at 3 different study sites with different sampling intervals.  The authors were able to catalog the species decline and conduct skin swabbing to assess Bd prevalence. They found that the fungus spread quickly, as it could be detected in virtually all the lakes over the course of a year starting with a 2004 survey. The dramatic declines of frog populations in these lakes followed in the years subsequent to the initial detection. This sadly predicts that the frogs will go extinct in most if not all of the mountain lakes as the current tadpoles develop into frogs in the next 3 years and then fall victim to Bd. Based on their sampling work, the authors were also able to identify the fungal burden that predicted a subsequent decline: in populations where more than ~10,000 zoospores were detected in a swab from frog skin, the frog population was about to experience a sharp decline.  The take-home from this work is that finding ways to keep the intensity of fungal infections down could provide a meaningful intervention that could prolong the viability of the population.
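As a rough illustration of how a load threshold like this could be used in monitoring, here is a minimal sketch. The ~10,000 zoospore figure is from the summary above; summarizing each population by the mean of its swabs is my assumption for the example, and the lake names and swab values are hypothetical, not data from the paper.

```python
from statistics import mean
from typing import Dict, List

ZOOSPORE_THRESHOLD = 10_000  # approximate per-swab load quoted above


def populations_at_risk(swab_loads: Dict[str, List[float]],
                        threshold: float = ZOOSPORE_THRESHOLD) -> List[str]:
    """Return populations whose mean swab load exceeds the threshold.

    Using the mean per population is an assumption made for this sketch.
    Populations with no swabs are skipped rather than guessed at.
    """
    at_risk = []
    for population, loads in swab_loads.items():
        if loads and mean(loads) > threshold:
            at_risk.append(population)
    return sorted(at_risk)


# Hypothetical swab results (zoospore equivalents per swab).
example_swabs = {
    "lake_A": [120.0, 950.0, 3_400.0],
    "lake_B": [8_000.0, 15_000.0, 22_000.0],
    "lake_C": [],  # not yet sampled
}

flagged = populations_at_risk(example_swabs)
print("populations above threshold:", flagged)
```

A flag like this would only say where to look first; the papers' models, not a single cutoff, are what describe the dynamics.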

Briggs, C., Knapp, R., & Vredenburg, V. (2010). Enzootic and epizootic dynamics of the chytrid fungal pathogen of amphibians. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.0912886107


Vredenburg, V., Knapp, R., Tunstall, T., & Briggs, C. (2010). Dynamics of an emerging disease drive large-scale amphibian population extinctions. Proceedings of the National Academy of Sciences. DOI: 10.1073/pnas.0914111107

Origins and evolution of pathogens

An article in PLoS Pathogens by Morris et al describes a hypothesis about the evolution and origins of plant pathogens, applying parallel theories from the emergence of medically relevant pathogens. The authors highlight the importance of understanding the evolution of organisms in the context of emerging pathogens like Puccinia Ug99 for our ability to design strategies to protect human health and food supplies.  Both bacterial and fungal pathogens of plants are discussed but I (perhaps unsurprisingly) focus on the fungi here.

Genome survey sequencing of Witches’ Broom

Genome survey sequencing (1.9X coverage) was generated for Moniliophthora perniciosa, the cause of witches’ broom disease on cacao plants. The sequence for this basidiomycete plant pathogen was published in BMC Genomics this week. The authors report a higher number of ROS metabolism and P450 genes. Evaluating whether these copy number differences are significantly different from other basidiomycete fungi and are lineage specific expansions will help determine if these families played a role in the adaptation of this plant pathogen.

This work provides an important stepping stone in understanding and eventually controlling this pathogen, which is devastating cacao plantations. An associated review describes what we have learned and can still learn about Witches’ broom disease.

See related:

Jorge MC Mondego, Marcelo F Carazzolle, Gustavo GL Costa, Eduardo F Formighieri, Lucas P Parizzi, Johana Rincones, Carolina Cotomacci, Dirce M Carraro, Anderson F Cunha, Helaine Carrer, Ramon O Vidal, Raissa C Estrela, Odalys Garcia, Daniela PT Thomazella, Bruno V de Oliveira, Acassia BL Pires, Maria Carolina S Rio, Marcos Renato R Araujo, Marcos H de Moraes, Luis AB Castro, Karina P Gramacho, Marilda S Goncalves, Jose P Moura Neto, Aristoteles Goes Neto, Luciana V Barbosa, Mark J Guiltinan, Bryan A Bailey, Lyndel W Meinhardt, Julio CM Cascardo, Goncalo AG Pereira (2008). A genome survey of Moniliophthora perniciosa gives new insights into Witches’ Broom Disease of cacao. BMC Genomics, 9 (1). DOI: 10.1186/1471-2164-9-548

Bat White-nose syndrome brevia

A Brevia piece in Science today describes efforts to identify the causal agent of white-nose syndrome (WNS) in bats, which appears to be contributing to bat decline. According to the authors, previous work had described an uncharacterized fungus associated with bats that showed signs of being sick with WNS. This is an emerging pathogen, as the samples described in this paper were from Spring 2008. Phylogenetic analysis of the rDNA (and presumably ITS) sequence of fungal isolates from diseased bats placed it as a Geomyces sp. in the Helotiales order (in the Leotiomycetes, if you are wondering what the closest sequenced fungal genomes for this species are). Other Geomyces spp are also psychrophiles and are found colonizing the skin of animals in cold climates (it must be hard to make a living). The authors suggest the finding of this fungal species on bats is consistent with its involvement in disease. The authors also draw the parallel to chytridiomycosis, an emerging disease of amphibians that is contributing to the worldwide amphibian decline.

This is just the first of hopefully several publications studying this phenomenon, as this brief piece sets the stage for additional questions. It has not yet been shown that this fungus is actually causing the disease, i.e. satisfying Koch’s postulates, and isn’t just a canary in the coal mine. So-called opportunistic fungi like Aspergillus fumigatus, Cryptococcus neoformans, and Candida albicans cause infections that emerge after the patient’s immune system has been compromised by something else, such as HIV or immunosuppressant drugs taken as part of an organ transplant regime. It is possible that white-nose syndrome (i.e. white conidia from Geomyces sp) is just a manifestation of an infection by a commensal organism, like thrush or yeast infections of Candida albicans that only emerge when something else has knocked down the host’s immune system. I don’t know if this same Geomyces sp can be cultured from healthy bats from so-far uninfected colonies, which would suggest the fungus is present all the time.

As we track and learn more about natural die-offs in animals from infectious diseases, there is a series of recent fungal-associated diseases of animal populations, including those of honeybees (perhaps from a virus and a microsporidium), frogs and other amphibians via Batrachochytrium dendrobatidis, and white-nose syndrome in bats. Diseases like those caused by Cryptococcus gattii are also examples of pathogens that may be able to infect healthy animals and humans. It seems quite important to track and study how these outbreaks spread, and the evolutionary and ecological basis for the sudden rise in infection and mortality in animal populations, to understand diseases of human relevance as well.

Related links:

D. S. Blehert, A. C. Hicks, M. Behr, C. U. Meteyer, B. M. Berlowski-Zier, E. L. Buckles, J. T. H. Coleman, S. R. Darling, A. Gargas, R. Niver, J. C. Okoniewski, R. J. Rudd, W. B. Stone (2008). Bat White-Nose Syndrome: An Emerging Fungal Pathogen? Science. DOI: 10.1126/science.1163874

A word about databases

Report concludes that a fungal genome database is of “the highest priority”.

This is the title as listed in PubMed for this article from Future Medicine about the AAM report on charting future needs and avenues of research on the fungal kingdom.

The need for a comprehensive database for information about fungi, starting at least with systematic collections of genomic and transcript data, is highlighted as a major need.  Really, any sort of new database effort should strive to be more comprehensive and include genetic and population data (alleles, strains) and information like protein-protein and protein-nucleic acid interactions (as Pedro mentioned). But on top of that, it needs to be comparative, so that information from systems that serve as great models can be transferred to other fungal systems that are being studied for their role as pathogens or for their interactions in the environment.

Affordable next-gen sequencing will allow us to obtain genome and transcript sequence for basically all species or strains of interest.  Researchers with no bioinformatics support in their lab will likely be able to outsource this to a company or campus core facility.  But how can they easily map the collective information about genes, proteins, and pathways onto this new data?  And how can this be a dynamic system that updates as new information is published and curated in other systems?

I think this has to be the future, beyond setting up an SGD, CGD, etc for every system.  The individual databases are useful for a large enough community where there are curators (and funding), but we will have to move to a more modular system in the future (aspects of which are in GMOD) that can have both an individual focus on a specific species/clade and a more comprehensive view that is comparable across the kingdom.  There are 100+ fungal genomes, but the community size for some of them is in the dozens of labs or less. How can they take advantage of the new resources without an existing infrastructure of curators?  Their systems serve an important need in a research aim, but how can discoveries there make their way back into the datastream of other systems?

I see several ways one would interact with a system that provided single-genome tools as well as a framework for comparative information.  At a gene level, one might be looking for all information about a specific gene, based on sequence similarity searches, or starting with a cloned gene in one species. Something akin to Phylofacts or precomputed Orthogroups for defining a Gene, but with more information about function by linking in information from all sources.  So a comparative resource, but also one tapping into curated and literature-mined data.

At a genome level, one might want to do whole genome comparisons of gene content using evolutionarily defined gene families (gene family size change) or at a functional level.  To start out with, each gene/protein would already need a systematic functional mapping.  This could be as simple as running InterProScan on every protein, expanded to find Orthogroups (or OrthoMCL orthologs) and transfer function from model systems, and finally, even more advanced, classified further with tools like SIFTER.
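The gene family size comparison described above comes down to counting family members per genome once genes have been assigned to families. The following is a minimal sketch under that assumption: the (genome, gene, family) triples are hypothetical and would in practice come from whatever orthology or domain-based clustering is used; none of the identifiers are real.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Assignment = Tuple[str, str, str]  # (genome, gene_id, family_id)


def family_sizes(assignments: Iterable[Assignment]) -> Dict[str, Counter]:
    """Count genes per family per genome: {family_id: Counter({genome: n})}."""
    sizes: Dict[str, Counter] = defaultdict(Counter)
    for genome, _gene_id, family_id in assignments:
        sizes[family_id][genome] += 1
    return sizes


def size_changes(sizes: Dict[str, Counter], genome_a: str, genome_b: str) -> Dict[str, int]:
    """Families whose size differs, as (count in genome_b) - (count in genome_a)."""
    return {
        family: counts[genome_b] - counts[genome_a]  # Counter gives 0 if absent
        for family, counts in sizes.items()
        if counts[genome_a] != counts[genome_b]
    }


# Hypothetical assignments for two genomes.
example = [
    ("model_fungus", "m1", "FAM_P450"),
    ("model_fungus", "m2", "FAM_P450"),
    ("model_fungus", "m3", "FAM_kinase"),
    ("pathogen", "p1", "FAM_P450"),
    ("pathogen", "p2", "FAM_kinase"),
    ("pathogen", "p3", "FAM_effector"),
    ("pathogen", "p4", "FAM_effector"),
]

changes = size_changes(family_sizes(example), "model_fungus", "pathogen")
print(changes)
```

Raw differences like these say nothing about significance on their own; deciding whether an expansion is lineage specific would still need more genomes and a proper statistical model.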

Interlinked with these orthologous and paralogous gene sets would be anchors for analyses of chromosomal synteny and even comparative assembly, including tools like Mercator.  Certainly tools for all of this exist, but making them more pluggable for different sets of species would be an important additional component.

At a utility level, the gene annotation and functional mapping of all this information should be possible. I would imagine a researcher could upload the sequence assembly they received from the core facility and the system can generate multiple gene predictions, annotate the genes, and link these genes within the known orthogroups of the system (preserving their privacy for these genes if desired).  Presumably this sort of thing would be easier as a standalone in-house for the researcher, but web services could also be the place for this.

For fungal-sized genomes this amount of data is not too extreme.  Things like Genome Browser, BLAST, etc should all be rolled out of the box based on the basic builds.

On the DIY and community annotation front, there would also need to be a layer of community derived annotation that could be layered on all these systems.  I would imagine this to be both for gene structure annotation (genome annotation) and functional annotation (protein X does Y based on experiment Z, here is the journal reference).  I think aspects of this would be visible and auditable (tracked), but maybe not blessed as official until a curator could oversee these inputs. In my mind, whether or not this is in a Wiki per se or just a new system that allows community input is less important than having it be a) structured (not a bunch of free text), b) tracked and versionable, and c) easy for researchers to input so that the knowledge is captured, even if it has to be reorganized later on.
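To make the "structured, tracked and versionable" requirements a bit more concrete, here is a minimal sketch of what one community annotation record and its history could look like. This is not a GMOD schema or any existing system's data model: the class names, field names and example values are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class CommunityAnnotation:
    """One structured statement: protein X does Y based on experiment Z."""
    gene_id: str
    function: str
    evidence: str
    reference: str
    submitted_by: str
    version: int = 1
    curator_approved: bool = False  # visible but not "official" until approved


class AnnotationLog:
    """Append-only history, so every submitted version stays auditable."""

    def __init__(self) -> None:
        self._versions: List[CommunityAnnotation] = []

    def submit(self, annotation: CommunityAnnotation) -> CommunityAnnotation:
        self._versions.append(annotation)
        return annotation

    def latest(self) -> CommunityAnnotation:
        if not self._versions:
            raise LookupError("no annotation has been submitted yet")
        return self._versions[-1]

    def revise(self, **changes: str) -> CommunityAnnotation:
        """Record a new version; a revision always needs fresh curator approval."""
        current = self.latest()
        updated = replace(current, **changes,
                          version=current.version + 1, curator_approved=False)
        return self.submit(updated)

    def approve(self) -> CommunityAnnotation:
        """Record curator approval of the latest version as a new log entry."""
        return self.submit(replace(self.latest(), curator_approved=True))

    def history(self) -> Tuple[CommunityAnnotation, ...]:
        return tuple(self._versions)


# Hypothetical usage: every value here is a placeholder.
log = AnnotationLog()
log.submit(CommunityAnnotation(
    gene_id="GENE_0001",
    function="putative secreted effector",
    evidence="placeholder: knockout phenotype",
    reference="placeholder: journal reference",
    submitted_by="researcher_1",
))
log.revise(function="secreted effector required for virulence")
log.approve()
print(log.latest())
```

Keeping records immutable and the log append-only is one simple way to get the audit trail described above; a real system would also need identities, timestamps and storage, which are left out here.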

Seems like a lot of work to be done, but really many of these things already exist through what the GMOD project has built.  There are many loose ends and software that doesn’t fully meet these needs, but I think the important concept is that these are all general solutions that will be of benefit to most communities, not just the fungal ones.  One lingering question I always have when approaching genomic datasets that will be dynamic: what, if any, of this makes its way into GenBank?  How is this sort of thing banked so that it can be captured, and does the improved functional or gene structure annotation ever make its way into the repository databases to correct and improve what has already been submitted there?