25

Yesterday I was debugging some things in R trying to get a popular Flow Cytometry tool to work on our data. After a few hours of digging into the package I discovered that our data was hitting an edge case, and it seems like the algorithm wouldn't work correctly under certain circumstances.

This bug is not necessarily the most significant issue, but I'm confident that it has occurred for other users before me, perhaps in a more insidious and less obvious way.

Given that literature utilising this tool has been published since its release, how does the profession handle the discovery of bugs in these widely used packages?

I'm sure that, given the piecemeal nature of a lot of the libraries out there, this is going to happen in a much more significant way at some point (i.e. invalidating a large number of published results).

For some context I'm a programmer who's been working across various other parts of software development, and is quite new to Bioinformatics.

Nic Barker
    Have you reported the bug to the developers? Is the tool still maintained? Really, it depends on the developers how the bug will be handled. Sometimes it will go out in the next point release, and sometimes it will be determined to be a low-risk, low-probability occurrence where the potential fix would introduce greater risk, so it will stay as a known issue. – Bioathlete Nov 14 '17 at 03:27
  • 4
    I think this question might benefit from a more general tag. This might apply to other packages/libraries in C++, python, perl, S, or whatever language. Maybe a good tag would be [tag:software-quality]? – llrs Nov 14 '17 at 08:19
  • 2
    Might also be a relevant question over on Academics.SE – Wayne Werner Nov 14 '17 at 14:29
  • 1
    @WayneWerner there is a relevant question there – llrs Nov 14 '17 at 14:39
  • I'm not sure I understood the question. "it seems like the algorithm wouldn't work correctly under certain circumstances" - do you mean that the algorithm is incorrect from a biologist's point of view, but was implemented correctly in the package, or that the general algorithm would compute everything properly, but its R implementation in the specific package does not meet the specification? – rumtscho Nov 14 '17 at 15:18
  • @Llopis I wanted to create a new general tag, unfortunately I didn't have the reputation at the time. I'll have a think about it and retag if I can. – Nic Barker Nov 15 '17 at 02:22
  • @Bioathlete Yep, I probably should have been more clear - I've reported and also fixed the bug, but I was musing about the larger implications on public literature in a more general sense. – Nic Barker Nov 15 '17 at 02:27
  • @rumtscho What I mean by that statement is that for some cases it works as expected, and for other cases it doesn't. I.e. for certain data sets the results would be erroneous, but for other data sets it works fine. – Nic Barker Nov 15 '17 at 05:40
  • 1
    @NicBarker You seem to have created issues on the package repository, but you haven't raised these concerns on the Bioconductor support site. I would first report your findings there. You can always fork the most updated version and modify the code yourself. – llrs Nov 15 '17 at 07:54
  • Thanks for the tip @Llopis, I'm still finding my feet and figuring out best practice in the field. – Nic Barker Nov 16 '17 at 00:25

5 Answers

21

I prefer to treat software tools and computers in a similar fashion to laboratory equipment, and in some sense biology in general. Biologists are used to unexpected things happening in their experiments, and it's not uncommon for a new discovery to change the way that people look at something. Things break down, cells die off quicker on a Wednesday afternoon, results are inconsistent, and that third reviewer keeps going on about doing that thing that's been done a hundred times before without anything surprising happening (just not this time). It's a good idea to record as much as possible of anything that might influence an experiment, and for software that includes any input data or command line options, and especially software version numbers.
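
Recording version numbers can be made automatic rather than left to memory. As a minimal sketch in Python (the package names are just examples; in R, `sessionInfo()` serves the same purpose), one can dump the exact versions alongside each run's outputs:

```python
import json
import platform
import sys
from importlib import metadata

def record_environment(packages):
    """Return a dict recording interpreter and package versions for a run."""
    record = {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            record["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Still record the name, so missing dependencies are visible too.
            record["packages"][name] = "not installed"
    return record

if __name__ == "__main__":
    # Save this next to the analysis outputs so results stay reproducible.
    print(json.dumps(record_environment(["numpy", "scipy"]), indent=2))
```

Saving this JSON with every analysis means that when a bug is later discovered in a specific version, affected results can be identified mechanically.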

In this sense, a discovered software bug can be treated as a new discovery of how the world works. If the discovery is made public, and other people consider that it's important enough, then some people might revisit old research to see if it changes things.

Of course, the nice thing about software is that bugs can be reported back to the creators of programs, and possibly fixed, resulting in an improved version of the software at a later date. If the bug itself doesn't spark interest and the program gets fixed anyway, people unknowingly use newer versions, and there might be a bit more confusion and discussion about why results don't match similar studies carried out before the software change.

If you want a bit of an idea of the biological equivalent of a major software bug, have a look at the barcode index switching issue, or the cell line contamination issue.

gringer
  • Great answer (+1), except for introducing the IMHO unnecessary and confusing distinction between 'corner' and 'non-corner' case bugs. Code is not like lab equipment in that you do not have a general understanding of the state-space of how the tool is used (e.g. you do not need to label all lab equipment with 'this tool might not work in outer space'). Another way to put this: all bugs are necessarily corner cases to the person that wrote the code. – user189035 Nov 14 '17 at 12:04
  • 1
    One person's bug could be another person's [feature](https://xkcd.com/1172/). A person doesn't need to know (and often can't know) everything about software in order to use it, and developers don't need to fix every known bug in order for a program to be useful. – gringer Nov 14 '17 at 18:48
  • Thanks for those two links @gringer, definitely an interesting read. – Nic Barker Nov 15 '17 at 02:49
  • 4
    "people unknowingly use newer versions, and there might be a bit more confusion and discussion about why results don't match similar studies carried out before the software change." Which is why you should note the versions of the software you used in your publications. However small and however subtle you do it, make sure it's in there. – Mast Nov 15 '17 at 08:18
9

While I agree with gringer's answer I would first report the bug to the maintainers with a test case (if possible) in order to have the bug corrected.

The bug might be an erroneous implementation in a corner case or in a more common use case of the software. But it is a good opportunity to collaborate to increase the quality of the software, either by introducing tests (or expanding them), by re-evaluating the state of the art with other tools, or by communicating with users of other software that depends on the tool.
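
A useful test case for a bug report pairs the smallest failing input with the expected behaviour. As an illustrative sketch (the `normalize` function and its constant-input edge case are hypothetical, standing in for whatever routine is buggy):

```python
# Sketch of a minimal regression test to attach to a bug report.
# `normalize` stands in for the buggy routine; the edge case here
# (an all-constant input) is purely illustrative.

def normalize(values):
    """Scale values linearly to [0, 1]; return zeros for a constant input."""
    lo, hi = min(values), max(values)
    if hi == lo:  # the edge case the bug report targets
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def test_normalize_handles_constant_input():
    # Before the fix, a constant input caused a division by zero.
    assert normalize([5.0, 5.0, 5.0]) == [0.0, 0.0, 0.0]

def test_normalize_ordinary_input():
    # Ordinary behaviour must be preserved by the fix.
    assert normalize([0.0, 5.0, 10.0]) == [0.0, 0.5, 1.0]
```

A test like this both demonstrates the bug unambiguously to the maintainers and, once merged, prevents the same regression from reappearing.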

Depending on the bug and the time one is able to spend, it may also be worth collaborating on a study of the bug's effect on the published work that used the software.

Sometimes the bug doesn't affect much, but sometimes it can have quite an impact, which can be difficult to estimate. For instance, I found that the reactome.db package (in version 1.58.0) didn't have the same annotations as the online database, due to an incorrect mapping. However, it was impossible for me to evaluate the damage of such an error on the published literature, because gene set enrichment tools usually don't report the data they are based on, and the versions of the databases used are often under-reported in the literature.

llrs
  • Don't worry about it! That's the beauty of a collaboratively edited site! And I cheat: I'm a native speaker :) – terdon Nov 14 '17 at 11:26
7

If you have a software development background you probably know about filing bug reports. Your question seems to be more about how to correct the scientific literature which has been affected by the bug.

The traditional way that this has been done is by writing a paper that explains the bug and gives examples of how it may impact published results (see gringer's examples of index switching and HeLa contamination). This way the community will be aware that older results using the implicated method need to be verified before building upon them. Unfortunately, awareness of these issues doesn't always diffuse through the community as quickly as it should, but it's probably still the best tool we have to ensure that science is self-correcting.

heathobrien
7

In extreme cases, it can destroy careers:

One of the most spectacular flameouts in science happened last year. In a short letter (barely over 300 words long) published in Science in the very last issue of 2006, Geoffrey Chang, a crystallographer, retracted 3 Science articles, a Nature article, a PNAS article and a JMB article. The sum of 5 years of work was destroyed, apparently, over a single sign error in a data-processing program. – A Sign, a Flipped Structure, and a Scientific Flameout of Epic Proportions

Most cases are not that extreme. In fact, realistically, I would guess that most papers have data-processing errors of some kind, yet still we muddle on towards truth. And this isn't unique to software – the whole idea of independent replication in science is that we're always probably messing things up in ways we don't realize, so we need to re-do things and hope that the errors are different this time. But software is unusual in how tiny errors can cause very large effects, and often the effects are hard to detect. If you drop a test tube on the floor, you'll probably notice. If a piece of new software gives incorrect results... well, we didn't know what the results were supposed to look like, that's what makes it science! There are things you can do to mitigate this somewhat, but to some extent it's just inherent to the scientific process.

I still like to show that blog post to new grad students to scare them into writing tests. I actually think it's a good habit, whenever you start using a new software package, to start by writing some tests for it – it's a good way to check your assumptions and understanding of what it actually does, and even if it's a popular package by eminent researchers, you'll still sometimes be shocked by what you find.
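
Those assumption-checking tests don't need to be elaborate. A minimal sketch of the idea, here pinning down what a standard-library function actually does before relying on it (the specific assumptions tested are just examples):

```python
import statistics

# Characterisation tests: record what the library actually does, so that a
# misunderstanding (or a behaviour change in a new version) fails loudly.

def test_median_of_even_length_list_interpolates():
    # Assumption to verify: median averages the two middle values
    # rather than picking one of them.
    assert statistics.median([1, 2, 3, 4]) == 2.5

def test_median_ignores_input_order():
    # Assumption to verify: the input does not need to be pre-sorted.
    assert statistics.median([4, 1, 3, 2]) == 2.5
```

Run against a popular analysis package instead of the standard library, tests like these double as a record of which behaviours your results depend on.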

4

It also depends on where the package is available from. If it is from Bioconductor, for example, authors and maintainers are responsible for responding to bug reports. Bugs can be reported on the support site, so others are also aware of them. If the maintainer doesn't fix a bug, the package might lose its spot in Bioconductor. Other sites, such as CRAN or SourceForge, might be less stringent.

benn
  • 1
    I always wonder what the "Response to bug reports" means, but that would be for the bioconductor mailing list. I read it as they don't need to be corrected, but just have a (reasonable) response. And AFAIK it is not enforced, I reported several bugs and I didn't get any response from the maintainers :( – llrs Nov 14 '17 at 10:39
  • 1
    In my opinion an adequate response would be a bug fix, not just a "thanks, you'll hear from us" response on the support site. It's a pity to hear that you didn't even get a response; I thought Bioconductor was stringent in these matters. – benn Nov 14 '17 at 10:43