Virginie Courtier (biologist, France), Gilles Demaneuf (data scientist, NZ), François Graner (biophysicist, France), Milton Leitenberg (senior research fellow, USA),
Jamie Metzl (USA), Steven Quay (physician-scientist, USA)


More than two years after the initial outbreak of the COVID-19 pandemic, questions about its origins remain unanswered. Because these questions have significant geopolitical implications, the search for answers has been blocked and compromised. 

This short summary is designed to provide essential background information for journalists seeking to cover the issue of the origin of the pandemic. Although the authors have strived to present the available evidence in as fair and impartial a manner as possible, we recognize that complete impartiality is impossible in this context.

Our goal in providing this information is to facilitate media coverage and public debate on this critically important issue, which in turn can underpin efforts to understand and address the shortcomings that leave our world at unnecessary risk of future pandemics.

Early days

In December 2019, patients with severe respiratory symptoms started to appear in several hospitals in Wuhan. Respiratory samples were analyzed and a new virus was found. On 30 December 2019, Prof. Shi Zheng-Li, who runs a large, world-renowned lab working on coronaviruses in Wuhan, was called by her director while she was in Shanghai. She was told that a novel coronavirus had been detected in pneumonia patients and that she should return immediately. On the overnight train, she wondered whether an accident in her lab could be responsible for the outbreak. She later said that, after a number of sleepless nights, she was relieved to find out that the sequence of the novel virus did not match any of the viruses her team had sampled from bat caves.

An unsolved question

Two years after the beginning of the outbreak, there is still no definitive evidence that the virus jumped naturally from non-human animals to humans. No intermediate host species has been identified so far. More than 80,000 animal samples from the Wuhan area and further afield were collected by Chinese researchers and none tested positive for the virus. 

One possible explanation for why no intermediate species has been found in the wild could be that COVID-19’s origin is research-related. For example, a lab employee or some accompanying non-lab personnel may have been infected at a sampling site or during transport of collected animals or samples. Alternatively, someone may have been infected within a Wuhan laboratory or someone nearby may have been infected via lab waste, an improper air filter or by escaped animals.

In 2020 many scientists from several countries, some with conflicts of interest as prior funders of and collaborators with the Wuhan Institute of Virology, asserted that the virus had a natural origin and labeled those raising questions about a possible lab incident origin as “conspiracy theorists.” Given the lack of available evidence, these assertions were efforts to influence public opinion more than articulations of scientific findings. They nevertheless guided expert and public dialogue about the origin of the pandemic for all of 2020. Today, the question of the origin of the virus remains unsolved. It has led to debate within the scientific community, with several scientists calling for an open debate and thorough investigation.

Among scientists, opinions fall into four groups: some who favor a natural origin, some who favor a lab-related incident, a few who aim for a full and unrestricted investigation including the possibility of a research-related accident, and a majority who do not know and prefer, for various reasons, not to participate in the debate.

30 years ago

SARS-CoV-2, the virus causing the COVID-19 pandemic, has a genome made up of about 30,000 letters. The closest known viruses have been found in bats in Southeast Asia, indicating that the ancestor of SARS-CoV-2 was previously circulating in bats. As time passes, viruses accumulate mutations at a relatively constant rate. Given the number of mutations separating SARS-CoV-2 from its closest known relatives, it is estimated that the ancestor of SARS-CoV-2 began circulating in bats about 30 years ago, around 1990. What happened between 1990 and 2019 is unknown.

What the virus sequences tell us

The SARS-CoV-2 sequences retrieved from December 2019 patients are extremely close to each other (with just a few genomic letter differences), whereas today’s sequences are much more variable. Comparison of all available sequences shows that all the SARS-CoV-2 viruses circulating today originate from the viruses detected in humans in 2019 in China. The sequences indicate that the introduction(s) into the human population (from an intermediate host, a bat or a lab sample) occurred in the second half of 2019 and that no other introduction has occurred since then (except possibly the Omicron variant; work is ongoing to understand its origin).

Early patients

The WHO-China joint report mentions 174 confirmed COVID-19 cases in Wuhan with disease onset in December 2019. Many of them were linked to the Huanan Seafood Market, but the earliest ones were not. In May 2020, Gao Fu, director of China’s Center for Disease Control and Prevention, announced that the Huanan Seafood Market was not the location of the origin and that “the virus came into the market, not from the market.” Generally, in an outbreak investigation, contacts of known patients are traced backward in time so that more and more early cases are identified. In the case of COVID-19, the opposite happened: several early cases from November-December 2019 were presented in scientific journals and newspapers, and yet they were later dismissed as having been mislabelled.

Two main scenarios are currently considered by the scientific community: either a single introduction into the human population, dated around September-November 2019, or two animal-to-human transmission events, probably in different markets at the same time, in November-December 2019.

Coronavirus research in Wuhan

In 2019 and earlier, coronaviruses were being manipulated in four Wuhan laboratories, including the Wuhan Center for Disease Control, which is an 8-minute walk from the Huanan Seafood Market, and the Wuhan Institute of Virology (WIV). The newest WIV site is located south of the city and hosts a Biosafety Level-4 lab (the highest level for working with human pathogens). The original WIV laboratory in downtown Wuhan hosts Biosafety Level-2 and -3 labs. Wuhan researchers previously published scientific papers in which they reported the generation of new viruses by combining pieces of several coronaviruses. Together with American collaborators they also submitted a research grant proposal in which they proposed to insert specific nucleotide sequences (a so-called furin cleavage site) to try to enhance virus pathogenicity. SARS-CoV-2 does contain such a furin cleavage site, but its sequence is different from the ones previously reported for coronaviruses. In addition, Wuhan researchers carried out multiple sampling trips prior to 2019 to collect bats and virus samples across Asia. Finally, Wuhan researchers are known to keep in their laboratories live bats and humanized mice that express a human receptor for SARS-CoV-2, making it possible for coronaviruses to evolve and adapt in the laboratory. Collecting natural viruses from great distances away (over 1000 km), bringing them to a laboratory to grow them in culture flasks and in animals, and changing them genetically are activities that could, in theory, lead to the appearance of a virus like SARS-CoV-2.

Lack of transparency

Three main issues suggest a possible research-related accident as the origin of the pandemic. 

First, when presenting a closely related virus (named “RaTG13”) in their seminal publication on SARS-CoV-2 in the journal Nature, the Wuhan researchers did not mention that it was collected in an abandoned mine in Yunnan province, where, in 2012, six people had suffered from a severe pneumonia with symptoms partly similar to those of COVID-19 after cleaning up bat droppings. This important piece of information was uncovered by a group of internet detectives and was only later confirmed by the Wuhan researchers.

Second, a large virus sequence database, which was created by the Wuhan Institute of Virology for rapidly identifying a pathogen in case of outbreaks and which contains more than 20,000 unpublished viruses, became unavailable to the worldwide research community in late 2019 and has still not been shared with experts. It is thus not possible for experts to verify the claim that no such virus was known to be present in the Wuhan Institute of Virology when the outbreak began. The database is no longer accessible online and even the scientific article describing this database has been removed from the internet. It is also notable that the database became unavailable to international researchers around the time when, according to modeling studies, the disease was first contracted by humans.

Third, the key feature of SARS-CoV-2, the presence of a furin cleavage site, which is known to increase transmission and pathogenicity of coronaviruses, was not mentioned by the Wuhan researchers in their seminal Nature publication. This is odd because the furin site is a distinctive feature of SARS-CoV-2 that can be easily noticed, as it is associated with the insertion of a sequence not found in closely related viruses, and also because the Wuhan researchers were part of a research project in 2018 aiming to insert cleavage sites in existing viruses.

It is unfortunate that Wuhan researchers have not clarified the last two issues.

The investigation by the World Health Organization

A May 2020 World Health Assembly resolution authorized the World Health Organization (WHO) to help organize a joint study with China, not of the origins of the pandemic but of the hypothesis of a natural origin not associated with a laboratory incident. It took six months of intense negotiations between the WHO and the Chinese government for the terms of this study to be hammered out. The Chinese authorities were given veto power over which international experts could join the study as well as the right to provide summaries of evidence rather than sharing the critical raw data itself. Furthermore, it was decided that the research studies would be carried out by Chinese scientists, while the non-Chinese experts would simply review the studies undertaken by Chinese researchers and officials.

In early 2021, a team of 17 independent international experts traveled to Wuhan for a four-week visit. After two weeks of strict quarantine in their hotels, they had approximately ten working days to participate in a highly curated, chaperoned visit to selected Wuhan sites. In their short visit to the Wuhan Institute of Virology, they did not ask to see the missing database. 

In a press conference held in Wuhan on February 9, 2021, the leader of the international team, Dr. Peter Ben Embarek, announced that – together with their Chinese counterparts – they had concluded that a natural origin was likely but that a lab incident origin was “extremely unlikely” and should not be investigated. Dr. Ben Embarek later admitted in an interview with Danish television that he actually believed one possible lab origin scenario was “likely,” that the move of the Wuhan Center for Disease Control laboratory to a new location adjacent to the Huanan Seafood Market should be investigated, and that he had been under pressure from Chinese hosts not to raise this hypothesis.

In July 2021 the WHO created a new panel, the Scientific Advisory Group on the Origins of Novel Pathogens (SAGO), one of whose goals is to study the origin of the SARS-CoV-2 virus. This panel comprises 27 members, including several scientists from the previous joint WHO/China study. No public report has been produced so far by SAGO, and no information is currently available regarding what studies it may be undertaking.

Contradictions between private and official statements of key scientists

Various emails released over the last few months under the Freedom of Information Act show that in the early days of the pandemic some top international scientists considered a research-related accident, and even a lab-enhanced virus, to be a fully possible explanation for the origin of the SARS-CoV-2 virus, if not the most plausible one. These scientists alerted Dr Fauci and Dr Collins to their doubts and shared their concerns in a conference call convened by Dr Farrar and Dr Fauci on February 1st, 2020. Within only a few days of that call, these scientists apparently changed their minds, and several of them immediately began writing a paper later published in Nature Medicine. That paper, which strongly argued for a natural origin of the SARS-CoV-2 virus, became one of the most cited biology papers in 2020.

Why it matters and what needs to be done

In summary, there are still two valid hypotheses for the origin of the COVID-19 pandemic: via a spillover event from one or more animals, or via a research-related event.

Knowing how the outbreak began is essential to deciding which mitigation steps the world needs to take to prevent another pandemic. The actions the world should take are very different under these two scenarios. That is why a comprehensive investigation into the origins of COVID-19, with full access to all relevant records, samples, and personnel in China, and, as appropriate, beyond, is required. In this process, secure whistleblower provisions should be established, making it as safe as possible for scientists and others in China and across the globe to share essential information without fear of retribution.

Without a comprehensive and unrestricted international investigation into COVID-19 origins, the risk is unnecessarily high that everyone on Earth, including future generations, will live through another pandemic in the near future.

