I have heard several conspiracy theories regarding the origin of the new coronavirus, 2019-nCov. For example, that the virus and/or SARS were produced in a laboratory, or were some variant of Middle East respiratory syndrome (MERS) coronavirus shipped via laboratory workers.

I am well aware that bioinformatics has debunked many conspiracy theories involving infectious diseases, an important one being that polio vaccination programs in Africa were the origin of HIV, for example here. Likewise, numerous conspiracy theories on the man-made origins of HIV were overturned using bioinformatic studies.

Is there bioinformatics evidence that would debunk the current conspiracy theories about coronaviruses?

  • More relevant at: https://skeptics.stackexchange.com/ – zx8754 Feb 05 '20 at 07:53
  • Thanks for editing the question @terdon. There was a minor publication raising the 'spooky' flag in this regard (which got them publicity but was irresponsible) and yes it can be directly analytically debunked if the hypothesis concerns sufficient amino acids (in this case it was). It is a significant amount of work because you need to generate a null distribution - i.e. everything - and that's a lot of work. Secondly there is a concern that a minor publication just gets further publicity. – M__ Mar 16 '22 at 15:00
  • I read your answer back when you first posted it and it is a really great resource! Probably more useful to those of us in the field than laypeople, but great nevertheless! – terdon Mar 16 '22 at 15:40

1 Answer


The scenarios are impossible and would be laughable if they were not so serious. The evidence is in the phylogenetic trees. It's a bit like a crime scene when the forensics team investigates. We've done enough "crime scenes", often going to the site, collecting the pathogen, sequencing it and then analysing it (usually for neglected diseases), without any associated conspiracy theories.

The key technical issue is that coronaviruses are zoonoses, pathogens spread to humans from animal reservoirs, and a phylogenetic tree really helps in understanding how the virus is transmitted.


  1. The key thing about all the trees is bats. Bat lineages are present at every single point of the betacoronavirus phylogeny (tree), both as paraphyletic and monophyletic lineages; one example is this tree of betacoronaviruses here. This means the nodes connecting the branches of the tree to the "master branch" represent common ancestors, and these were almost certainly bat-borne coronaviruses. This is especially true for SARS: here bat viruses are EVERYWHERE.
  2. The tree here also shows that SARS arose independently on two occasions, again surrounded by bat lineages, and that 2019-nCov has emerged separately at least once, again associated with bats.
  3. Finally, the tree below, a figure from the bioRxiv preprint Zhou et al. (2020) "Discovery of a novel coronavirus associated with the recent pneumonia outbreak in humans and its potential bat origin", shows the 2019-nCov lineage is a direct descendant of a very closely related virus isolated from a bat (RaTG13*). This is a really conclusive finding, BTW. [Figure: phylogenetic tree from Zhou et al. (2020)]

Note, I don't normally present inline images, but it is such a nice finding (hint to reviewers) and bioRxiv is open access.

Conspiracy theory 1: laboratory made virus

Literally, it would require someone passaging a new virus, with unknown human pathogenicity, and independently introducing all the earlier passages en masse across the bat populations of China. They would then have to hope each lineage became an independent virus population before introducing the virus to humans. Thus, when field teams of scientists go around using mist nets to trap the bats, or buy them from markets, isolate the virus and sequence it, they would find a beautiful array of natural variation in the bat populations leading up to the human epidemics, one that perfectly matches vast numbers of other viral zoonoses. Moreover, this would have to have happened substantially before SARS and 2019-nCov, because the bat betacoronaviruses were known about prior to both epidemics. In short, it's simply not feasible.

Biological explanation

General: Bats are a reservoir host for vast numbers of pathogens, particularly viruses, including many alphaviruses, flaviviruses and rabies virus. They are believed to be important for ebolavirus (I don't know about this) and are even important to several eukaryotic parasites. It makes sense: they are mammals, so evolutionarily much closer to us than birds, for example, they have large dispersal potential, and they roost in 'overcrowded' sites, which enables rapid transmission between bats.

Technical: The trees show that bat-borne viruses are the common ancestors of betacoronaviruses, in particular for the lineages leading to the emergence of 2019-nCov and SARS; this is seen in this tree, this one and the tree above. The obvious explanation is that the virus circulates endemically in bats and has jumped into humans. For SARS the intermediate host, or possible "vector", was the civet.
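To make the "bat-borne common ancestor" reasoning concrete, here is a minimal sketch of the bottom-up pass of Fitch parsimony, one simple way of inferring the host at internal nodes from the hosts at the tips. The tree topology and host labels below are an invented toy example, not the published betacoronavirus tree, and `fitch_states` is a hypothetical helper name.

```python
# Toy illustration only: the topology and host labels are invented,
# not taken from any published betacoronavirus phylogeny.
# A leaf is a host label (str); an internal node is a pair of subtrees.

def fitch_states(node):
    """Bottom-up Fitch pass: return the set of most-parsimonious
    host states for the ancestor represented by this node."""
    if isinstance(node, str):
        return {node}
    left, right = (fitch_states(child) for child in node)
    shared = left & right
    # Keep the hosts both children agree on; otherwise keep all candidates.
    return shared if shared else left | right

# Human and civet tips nested inside bat lineages (hypothetical).
toy_tree = (
    ("bat", ("bat", "human")),
    ("bat", ("bat", ("civet", "human"))),
)

root_hosts = fitch_states(toy_tree)
print(root_hosts)
```

When the non-bat tips sit inside clades that are otherwise bat-derived, the inferred ancestral host at the deeper nodes comes out as bat, which is the pattern the answer describes. Real analyses use full trees and usually likelihood-based or Bayesian methods rather than this bare parsimony pass.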

The theory and the observations fit into a seamless biological answer.

Conspiracy theory 2: Middle Eastern connection

I heard a very weird conspiracy theory attempting to connect MERS with 2019-nCov. The theory was elaborate and I don't think it is productive to describe it here.

Biological explanation

All the trees of betacoronaviruses show that MERS was one of the earliest viruses to diverge and is very distant from 2019-nCov, to the extent that the theory is completely implausible. The sequence identity between these viruses is about 50%, so a given virus is either MERS or 2019-nCov; it cannot be a variant of the other. It's more extreme than mixing up yellow fever virus (mortality 40-80%) with West Nile virus (mortality <<0.1%): the two viruses are completely different at every level.
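For readers unfamiliar with how a figure like "50%" is obtained, the following is a minimal sketch of pairwise percent identity over two already-aligned sequences. The sequences are short invented strings, not real MERS or 2019-nCov genomes, and `percent_identity` is a hypothetical helper; skipping gapped columns is one common convention among several.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped columns where the two sequences match.
    Both inputs must come from the same alignment (equal length)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Invented toy fragments for illustration only.
toy_a = "ATGCATGC"
toy_b = "ATGAATGT"
print(percent_identity(toy_a, toy_b))
```

In practice the alignment itself would come from a dedicated aligner, and the identity would be computed over whole genomes or individual genes rather than eight bases.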

What about errors?

Phylogeneticists can spot them a mile off. There are tell-tale phylogenetic signatures we pick up on, and we also look for them to assess 'rare' genetic phenomena. There is nothing 'rare' about the coronaviruses. The only anomaly is variation in the poly-A tail, and that is natural variation seen in in vitro time-series experiments. Basically, we've looked at enough viruses and parasites through trees that have no conspiracy theories attached at all (often neglected diseases), and we understand how natural variation operates, so a phylogeneticist can sift the wheat from the chaff without really thinking about it.

Opinion: The conspiracy theories are deeply misplaced, and the only connection I can imagine is that it's China. However, China has loads of viruses, influenza in particular, which causes major pandemics, and that is a consequence of the natural ecology there (smallholder farming) allowing the virus to move between reservoir hosts. I've not visited smallholder farms in China, but I have in other parts of the world, and when you see them, you get it: the pigs, chickens (ducks, in China), dogs, horses and humans all living within 10 meters of each other.

Conclusion: Consider the shipping of large numbers of bats to market, bat soup, and raw meat from arboreal (tree-living) mammals such as civets that are sympatric with bats. Then consider the classical epidemiology in light of the phylogenetic data, which is very consistent: a single picture emerges that the coronavirus is one of many zoonoses that has managed to transmit between patients.

Summary: The fundamental point is that the bioinformatics fits the classical epidemiology of a zoonosis.

* Note: The bat coronavirus RaTG13 predates the 2019-nCov outbreak by roughly six years (the sample was collected in 2013, as the quote below states). It is not even clear whether the virus has been isolated, i.e. it could just be an RNA sequence.

"They have found some 500 novel coronaviruses, about 50 of which fall relatively close to the SARS virus on the family tree, including RaTG13—it was fished out of a bat fecal sample they collected in 2013 from a cave in Moglang in Yunnan province."

Cohen, "Mining coronavirus genomes for clues to the outbreak’s origins", Science Magazine, Feb 2020.

  • What about the "amino acid residues" in "4 inserts" that "have identity or similarity to those in the HIV-1 gp120 or HIV-1 Gag", that are mentioned by this Indian team, and which are probably the same mutations referred to by this Taiwanese professor? Did they make this up? Are their conclusions wrong? – John Slegers Mar 19 '20 at 11:44
  • We examined a very similar question here, https://bioinformatics.stackexchange.com/questions/11283/a-new-paper-suggests-the-corona-virus-has-uncanny-similarity-of-unique-inserts/11289#11289 – M__ Mar 19 '20 at 12:03
  • Interestingly, a more recent paper suggests that the overall similarity with RaTG13 may be a little misleading, pointing to GD Pangolin-CoV instead: "Intriguingly, these same six critical AAs are identical between GD Pangolin-CoV and SARS-CoV-2 [16]. In contrast, although the genomes of SARS-CoV-2 and RaTG13 are more similar overall, only one out of the six functional sites are identical between the two viruses (Fig. 1B). [...]" https://doi.org/10.1093/nsr/nwaa036 – the gods from engineering Mar 19 '20 at 14:46
  • That in turn seems to have caused controversy http://virological.org/t/response-to-on-the-origin-and-continuing-evolution-of-sars-cov-2/418 – the gods from engineering Mar 19 '20 at 14:55