<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: ISCB BoF on open source and open data</title>
	<atom:link href="http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/feed/" rel="self" type="application/rss+xml" />
	<link>http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/</link>
	<description>Digesting the fungal genomes</description>
	<lastBuildDate>Tue, 23 Feb 2010 18:31:48 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=abc</generator>
	<item>
		<title>By: Open data &#171; Stajichlog</title>
		<link>http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/comment-page-1/#comment-2006</link>
		<dc:creator>Open data &#171; Stajichlog</dc:creator>
		<pubDate>Tue, 18 Dec 2007 16:07:20 +0000</pubDate>
		<guid isPermaLink="false">http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/#comment-2006</guid>
		<description>[...] ISMB open data and open source BoF which never got around to discussing aspects of open data - it was mostly a discussion of different [...]</description>
		<content:encoded><![CDATA[<p>[...] ISMB open data and open source BoF which never got around to discussing aspects of open data &#8211; it was mostly a discussion of different [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Sean Eddy</title>
		<link>http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/comment-page-1/#comment-737</link>
		<dc:creator>Sean Eddy</dc:creator>
		<pubDate>Thu, 02 Aug 2007 13:37:45 +0000</pubDate>
		<guid isPermaLink="false">http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/#comment-737</guid>
		<description>Hi Jason,

My feeling is that &quot;open source&quot; and &quot;open access&quot; tend to cloud an even more important issue of immediate relevance to science. There are many facets of the &quot;open&quot; debates, not all of which are relevant to science, and some of which tend to immediately polarize a debate. But a really important issue to us is what happens with data, software, and materials upon publication.

The scientific publication system was founded in 1665 explicitly as a reward or a quid pro quo mechanism, comparable in some ways to the patent system: the idea is that publication is an incentive to get scientists to disclose their findings to everyone else. The community allots priority and prestige to the author, and the author gives something of value to the community. This was a great improvement over Newton encrypting his discoveries on secret rings stored with his London lawyer.

One might argue whether Sir Henry Oldenburg&#039;s personal motivation in 1665 for starting the Philosophical Transactions of the Royal Society rises to the level of a community ethical standard in 2007, but I think most people would agree that it is indeed the &quot;community standard&quot; by which science operates. A 2003 report from the National Research Council articulated this view at length, in the wake of the Science publications of the Celera human genome and the Syngenta rice genome -- neither of which was deposited at that time in Genbank despite Science&#039;s own policies.  Executive summary of the NAS report is at http://selab.janelia.org/publications/NAS03/NAS03-execsum.pdf.

The key argument in the report is that upon publication, authors are obligated to deliver enough information about the central result in their paper that other scientists can reproduce the result and build on it.

For publications where the central result is too large to fit in a journal&#039;s pages, like a genome sequence or a software package, that data, software, or material must be made readily available to everyone in the community -- whether they work in academia or industry. (&quot;Academic only&quot; distribution is viewed as inconsistent with the principles of the scientific publication system.)

If it&#039;s not readily available, reviewers and editors can and should take that into account -- the usefulness of a result depends on the degree to which it&#039;s being made available to the community.

Suicyte&#039;s point is a good one, and the NAS report covered it in some detail. If the central result is an algorithm, then the algorithm is all that is being given to the community - no need for an implementation or source code. But if the central result is &quot;Foo: a program to align everything&quot;, then the program needs to be available at least as an executable (people can build on it by building pipelines around it) and ideally as source (even easier to build on it then). 

The problem with ISCB&#039;s policy is that it says *nothing* about a computational biologist&#039;s ethical obligations to make data and software available upon publication -- and what it does say is an attack on open source that is widely interpreted as being a defense of those few people who withhold availability of published results.

In my view, ISCB ought to show better leadership, and adopt a policy consistent with the ethical principles of scientific publication that are being articulated by NIH, HHMI, the National Academies, and other funding bodies and scientific organizations. Whether ISCB also chooses to wade into the wider &quot;open source&quot; debate after that is not my concern, really, but I don&#039;t think it&#039;s advisable.

Sean</description>
		<content:encoded><![CDATA[<p>Hi Jason,</p>
<p>My feeling is that &#8220;open source&#8221; and &#8220;open access&#8221; tend to cloud an even more important issue of immediate relevance to science. There are many facets of the &#8220;open&#8221; debates, not all of which are relevant to science, and some of which tend to immediately polarize a debate. But a really important issue to us is what happens with data, software, and materials upon publication.</p>
<p>The scientific publication system was founded in 1665 explicitly as a reward or a quid pro quo mechanism, comparable in some ways to the patent system: the idea is that publication is an incentive to get scientists to disclose their findings to everyone else. The community allots priority and prestige to the author, and the author gives something of value to the community. This was a great improvement over Newton encrypting his discoveries on secret rings stored with his London lawyer.</p>
<p>One might argue whether Sir Henry Oldenburg&#8217;s personal motivation in 1665 for starting the Philosophical Transactions of the Royal Society rises to the level of a community ethical standard in 2007, but I think most people would agree that it is indeed the &#8220;community standard&#8221; by which science operates. A 2003 report from the National Research Council articulated this view at length, in the wake of the Science publications of the Celera human genome and the Syngenta rice genome &#8212; neither of which was deposited at that time in Genbank despite Science&#8217;s own policies.  Executive summary of the NAS report is at <a href="http://selab.janelia.org/publications/NAS03/NAS03-execsum.pdf" rel="nofollow">http://selab.janelia.org/publications/NAS03/NAS03-execsum.pdf</a>.</p>
<p>The key argument in the report is that upon publication, authors are obligated to deliver enough information about the central result in their paper that other scientists can reproduce the result and build on it.</p>
<p>For publications where the central result is too large to fit in a journal&#8217;s pages, like a genome sequence or a software package, that data, software, or material must be made readily available to everyone in the community &#8212; whether they work in academia or industry. (&#8220;Academic only&#8221; distribution is viewed as inconsistent with the principles of the scientific publication system.)</p>
<p>If it&#8217;s not readily available, reviewers and editors can and should take that into account &#8212; the usefulness of a result depends on the degree to which it&#8217;s being made available to the community.</p>
<p>Suicyte&#8217;s point is a good one, and the NAS report covered it in some detail. If the central result is an algorithm, then the algorithm is all that is being given to the community &#8211; no need for an implementation or source code. But if the central result is &#8220;Foo: a program to align everything&#8221;, then the program needs to be available at least as an executable (people can build on it by building pipelines around it) and ideally as source (even easier to build on it then). </p>
<p>The problem with ISCB&#8217;s policy is that it says *nothing* about a computational biologist&#8217;s ethical obligations to make data and software available upon publication &#8212; and what it does say is an attack on open source that is widely interpreted as being a defense of those few people who withhold availability of published results.</p>
<p>In my view, ISCB ought to show better leadership, and adopt a policy consistent with the ethical principles of scientific publication that are being articulated by NIH, HHMI, the National Academies, and other funding bodies and scientific organizations. Whether ISCB also chooses to wade into the wider &#8220;open source&#8221; debate after that is not my concern, really, but I don&#8217;t think it&#8217;s advisable.</p>
<p>Sean</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mark Voorhies</title>
		<link>http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/comment-page-1/#comment-736</link>
		<dc:creator>Mark Voorhies</dc:creator>
		<pubDate>Wed, 01 Aug 2007 17:10:18 +0000</pubDate>
		<guid isPermaLink="false">http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/#comment-736</guid>
		<description>In addition to cost and redundancy of effort, it is worth considering reproducibility.  Any result that depends on private software or software of limited availability (due to price, unportable binaries, etc.) will be less subject to independent verification by other labs.  In cases where the specification of the algorithm is incomplete, or where the results depend on a non-trivial parameterization, even verification by independent implementation of the algorithm may not be possible.</description>
		<content:encoded><![CDATA[<p>In addition to cost and redundancy of effort, it is worth considering reproducibility.  Any result that depends on private software or software of limited availability (due to price, unportable binaries, etc.) will be less subject to independent verification by other labs.  In cases where the specification of the algorithm is incomplete, or where the results depend on a non-trivial parameterization, even verification by independent implementation of the algorithm may not be possible.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: suicyte</title>
		<link>http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/comment-page-1/#comment-734</link>
		<dc:creator>suicyte</dc:creator>
		<pubDate>Tue, 31 Jul 2007 16:37:08 +0000</pubDate>
		<guid isPermaLink="false">http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/#comment-734</guid>
		<description>A probably not very popular remark:

In the old days of bioinformatics, it was mainly the algorithm that was published in a scholarly journal. It was expected to be clever and to solve an important problem, but nobody required that there also be a program implementing the algorithm, not to speak of any requirement for source code. Sometimes there was a program to show that the algorithm actually works, but these were just proofs of concept, not elaborate software packages. In a way I think this is how it should be. For me, devising an algorithm is science; coding is not. (Pleeaase, developers, don&#039;t kill me, not yet.)

Make no mistake, writing good (usable) programs is extremely valuable; what would we do without BLAST, HMMER, TMEV etc.? Bioinformatics software development (even if I don&#039;t consider it science) is often done by scientists, and I understand that they need the credit for this. In science, credit typically means publication. I guess this is why Bioinformatics and other journals have a section like &#039;application notes&#039; that does not deal with algorithms but rather with software.

So, what does this have to do with open software? In my humble opinion, it is still perfectly o.k. to publish an algorithm without an associated application (open source or otherwise). For these types of publications, I consider it silly to demand the submission of any source code. For application notes, matters might be different. Anyway, I feel that a scientist must at one point make the decision whether to become rich OR famous. In the former case, it is o.k. to sell software but don&#039;t expect to get free advertising space in a science journal. In the latter case, one should publish but abandon the hope of making money with the program.

Maybe some disclaimers: I used to work in academia, where I wrote some programs myself (all in the public domain, but mostly useless). Nowadays, I work  for a biotech company, but have to run on a very tight budget. Thus, I rely on free (as in beer) software. I normally don&#039;t care much about source availability, but I can see that this is a big issue for others. What I don&#039;t like is the (nowadays very common) option to make software freely available to people in academia but have industry people pay ridiculous amounts of money. Again, I can understand the motivation for this move, but the assumption that biotechs are swimming in money is not always justified. Also, I feel that I get hit particularly hard, as I mostly tend to use the software for doing basic science that gets published, not really for making money out of other people&#039;s work.</description>
		<content:encoded><![CDATA[<p>A probably not very popular remark:</p>
<p>In the old days of bioinformatics, it was mainly the algorithm that was published in a scholarly journal. It was expected to be clever and to solve an important problem, but nobody required that there also be a program implementing the algorithm, not to speak of any requirement for source code. Sometimes there was a program to show that the algorithm actually works, but these were just proofs of concept, not elaborate software packages. In a way I think this is how it should be. For me, devising an algorithm is science; coding is not. (Pleeaase, developers, don&#8217;t kill me, not yet.)</p>
<p>Make no mistake, writing good (usable) programs is extremely valuable; what would we do without BLAST, HMMER, TMEV etc.? Bioinformatics software development (even if I don&#8217;t consider it science) is often done by scientists, and I understand that they need the credit for this. In science, credit typically means publication. I guess this is why Bioinformatics and other journals have a section like &#8216;application notes&#8217; that does not deal with algorithms but rather with software.</p>
<p>So, what does this have to do with open software? In my humble opinion, it is still perfectly o.k. to publish an algorithm without an associated application (open source or otherwise). For these types of publications, I consider it silly to demand the submission of any source code. For application notes, matters might be different. Anyway, I feel that a scientist must at one point make the decision whether to become rich OR famous. In the former case, it is o.k. to sell software but don&#8217;t expect to get free advertising space in a science journal. In the latter case, one should publish but abandon the hope of making money with the program.</p>
<p>Maybe some disclaimers: I used to work in academia, where I wrote some programs myself (all in the public domain, but mostly useless). Nowadays, I work  for a biotech company, but have to run on a very tight budget. Thus, I rely on free (as in beer) software. I normally don&#8217;t care much about source availability, but I can see that this is a big issue for others. What I don&#8217;t like is the (nowadays very common) option to make software freely available to people in academia but have industry people pay ridiculous amounts of money. Again, I can understand the motivation for this move, but the assumption that biotechs are swimming in money is not always justified. Also, I feel that I get hit particularly hard, as I mostly tend to use the software for doing basic science that gets published, not really for making money out of other people&#8217;s work.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pedro Beltrao</title>
		<link>http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/comment-page-1/#comment-733</link>
		<dc:creator>Pedro Beltrao</dc:creator>
		<pubDate>Tue, 31 Jul 2007 10:02:37 +0000</pubDate>
		<guid isPermaLink="false">http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/#comment-733</guid>
		<description>Another potential reason not to open source the code is simply competition in academia. A closed code base or database can serve as a differentiating factor. It takes time to build some resources, and the current credit system unfortunately values not how much a code or database is used but how many papers get produced in X impact journals. Even if people agree in principle that opening up research is morally correct, the current reward system will tend to push people toward secrecy.</description>
		<content:encoded><![CDATA[<p>Another potential reason not to open source the code is simply competition in academia. A closed code base or database can serve as a differentiating factor. It takes time to build some resources, and the current credit system unfortunately values not how much a code or database is used but how many papers get produced in X impact journals. Even if people agree in principle that opening up research is morally correct, the current reward system will tend to push people toward secrecy.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Impressions from ISMB 2007 &#171; Suicyte Notes</title>
		<link>http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/comment-page-1/#comment-732</link>
		<dc:creator>Impressions from ISMB 2007 &#171; Suicyte Notes</dc:creator>
		<pubDate>Tue, 31 Jul 2007 09:40:16 +0000</pubDate>
		<guid isPermaLink="false">http://fungalgenomes.org/blog/2007/07/iscb-bof-on-open-source-and-open-data/#comment-732</guid>
		<description>[...] Fungal Genomes here and here [...]</description>
		<content:encoded><![CDATA[<p>[...] Fungal Genomes here and here [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
