Greetings from an undisclosed location in my apartment.
It has been 284 days since the first documented human case of COVID-19.
Housekeeping note:
Today, the International Liver Congress 2020, which is now a virtual meeting, begins, and I am up at 5 AM to take in the presentations that have been accepted. The meeting was originally scheduled to take place in April 2020, but you can imagine why it was postponed and ultimately converted to a virtual format.
Also today, I am running the second part of my series on why this virus did not emerge from a laboratory, by design or by accident.
Glossary terms are bolded words with links to the running newsletter glossary.
Keep the newsletter growing by sharing it! I love talking about science and explaining important concepts in human health, but I rely on all of you to grow the audience for it.
Now, let’s talk COVID.
Moderna vaccination data in elderly patients
Yesterday, Moderna announced some Phase I trial data showing that their mRNA vaccine elicits an immune response in elderly patients: https://www.cnbc.com/2020/08/26/moderna-says-its-coronavirus-vaccine-shows-promising-results-in-small-trial-of-elderly-patients.html
According to Moderna, in this small trial, patients showed both antibody and T-cell responses.
However, the paper hasn’t been peer-reviewed yet and it was a small study, so we’ll try to revisit this one when it has been through the formal process.
That said, we want to be sure that the vaccine that comes to market works in people who are in high-risk groups. Older people are at higher risk, and also generally have weaker immune responses to vaccines. I have been concerned that a vaccine that does not prevent spread of the virus would not help them much if it also doesn’t generate a good immune response in older people. These results indicate that maybe I don’t need to be so worried about this. We’ll see!
CDC makes a change in recommendations for testing close contacts
Also yesterday, the CDC announced new testing guidelines that say that people who are not showing symptoms should not be tested for COVID-19, even if there is evidence they have been recently exposed: https://www.nytimes.com/2020/08/25/health/covid-19-testing-cdc.html
I have to say in no uncertain terms that this decision is irresponsible and wrong. There is no solid evidentiary reason to make this recommendation. There is good evidence that asymptomatic people are capable of spreading this virus, and it is necessary to test such people for the virus in order to trace, isolate, and prevent future cases.
There is only one reason I could imagine that a recommendation of this type would have been made: political influence.
I must be very clear. I know people at the CDC. I have worked with people at the CDC. I have done experiments that relied on work done at the CDC. I never in my life thought that I would have cause to question a recommendation from the CDC.
But this is where we are. This is the way that the US pandemic response has been run. I am, frankly, incredibly disappointed and depressed by this development.
What am I doing to cope with the pandemic? This:
Working
With the conference on, it’s a busy week. I’m not getting a lot of downtime.
How do we know it didn’t come from a lab, Part 2
Yes, it’s finally here. The continuation of this piece.
In the first installment, which appeared last week, I discussed my career as a virus engineer as well as the extreme difficulty of creating a functional virus that in some way deviates from what would evolve naturally. The argument that this establishes is that it is not really possible for us to do better at shaping a virus than the billions of iterated experiments that are conducted in nature on a regular basis.
This is a logical argument, but it is not a directly evidentiary argument. Today’s piece will focus on arguments from direct evidence.
One of the first things that happens after a new virus is identified is an evolutionary analysis of the virus genome. This is a necessary step because viruses always descend from other viruses; they do not appear from thin air. By the end of January 2020, an analysis of this type had been performed for SARS-CoV-2.
The way these analyses are done is relatively simple, though there are some details that I’ll have to gloss over. Essentially, the sequence of the virus genome is checked against sequences of other virus genomes, accounting for small changes and small insertions. Every stretch of sequence is compared against a database of other known sequences, so that we can tell if the new genome is a composite of different existing sequences or if it is descended from just one related species, but with differentiating individual mutations.
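To make the "every stretch of sequence is compared" idea concrete, here is a minimal sketch in Python. It is not how the published analyses were run. It assumes the sequences are already aligned to equal length (real tools also handle insertions and deletions), and the sequences, the names `ref_A` and `ref_B`, and the window size are all invented for illustration.

```python
def window_identity(a, b):
    """Fraction of positions at which two equal-length strings agree."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def best_match_per_window(query, references, window=8):
    """For each non-overlapping window of the query, name the closest reference."""
    calls = []
    for start in range(0, len(query) - window + 1, window):
        chunk = query[start:start + window]
        best = max(
            references,
            key=lambda name: window_identity(
                chunk, references[name][start:start + window]
            ),
        )
        calls.append(best)
    return calls


# Toy, invented sequences (not real viral data), assumed to be pre-aligned.
references = {
    "ref_A": "ACGTACGTACGTACGTTTGACCAA",
    "ref_B": "TTGACCAATTGACCAAGGCATCGA",
}
single_lineage = "ACGTACGAACGTACGTTTGACCAT"  # ref_A with two point mutations
mosaic = "ACGTACGTTTGACCAAGGCATCGA"          # first window ref_A, rest ref_B

print(best_match_per_window(single_lineage, references))
print(best_match_per_window(mosaic, references))
```

If every window points to the same reference, the query looks like a descendant of one lineage with individual mutations. If different windows point to different references, it looks like a composite.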
A paper that ran in The Lancet in late January (https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)30251-8/fulltext) did exactly this, and its authors determined the following relationships between bat coronaviruses and SARS-CoV-2, represented here in an image not unlike a family tree:
Image is a tree-like diagram with various branches, showing the relationships between human coronaviruses, mouse coronaviruses, bat coronaviruses, and the SARS-CoV viruses. Note that at the time this was published, the name “SARS-CoV-2” had not been established, and it was still being referred to as “2019-nCoV” for “2019 novel coronavirus.” In diagrams like this, each horizontal line is an evolutionary separation between two viruses that have been sequenced. The distance between branches, and the number of branch points along the path connecting two specific labeled viruses, indicate their overall relatedness. So, viruses that are closer together on such a diagram are more closely related. In this diagram, SARS-CoV-2 is sandwiched between two groups of bat viruses, and separated by many branch points from human coronaviruses.
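For readers who like to see the "count the branch points" idea spelled out, here is a small Python sketch. The tree is a made-up toy, not the tree from the Lancet figure, and every name in it is a placeholder. Real phylogenetic trees also carry branch lengths, which this sketch ignores.

```python
# Hypothetical toy tree stored as child -> parent.
# All names are placeholders, not data from the published figure.
PARENT = {
    "bat_virus_1": "node_a",
    "new_virus": "node_a",
    "node_a": "node_b",
    "bat_virus_2": "node_b",
    "node_b": "root",
    "human_cov": "root",
}


def ancestors(name):
    """Branch points from a tip up to the root, nearest first."""
    path = []
    while name in PARENT:
        name = PARENT[name]
        path.append(name)
    return path


def branch_points_between(a, b):
    """Count the branch points on the path connecting two tips."""
    up_a, up_b = ancestors(a), ancestors(b)
    # The first shared ancestor is the most recent common ancestor.
    common = next(node for node in up_a if node in up_b)
    return up_a.index(common) + up_b.index(common) + 1


print(branch_points_between("new_virus", "bat_virus_1"))
print(branch_points_between("new_virus", "human_cov"))
```

In this toy tree the new virus is one branch point from its nearest bat relative and three from the human coronavirus. Fewer branch points means closer relatives, which is the same reading rule applied to the diagram above.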
What this analysis told us was that the SARS-CoV-2 genome has a lot of sequence in common with viruses that are commonly found circulating in bats. This is important, because we knew from this that we were not seeing large amounts of sequence from other origins. Coronaviruses are known to share parts of their genomes relatively freely, a process known as recombination. Recombination can have unexpected results, and can lead to sequences that are a mosaic of different lineages. Most of the sequences that were found to be similar to this new virus, however, seemed at the time to have come from bats. Even so, there was enough variation between the sequences of the bat viruses identified and the sequence of SARS-CoV-2 that researchers thought it might have passed through another species and mutated or recombined there before entering the human population.
One area where there was a great deal of variation, however, was at the sequences that create the virus spike glycoprotein (S), which is involved in the attachment of, and entry of the virus to, target cells. This became the source of our first clear evidence that this is not a virus that was developed in a lab.
We were lucky that we had seen SARS-CoV in the early 2000s, because it allowed the rapid understanding of the S protein and its structure. Early on in the emergence of this virus, we were able to determine that the SARS-CoV-2 S protein binds its receptor, ACE2, by an unusual binding solution that had not been previously documented. Previous work on SARS-CoV had examined the optimal binding solution for the S protein to ACE2, and over time it was observed that SARS-CoV had mutated for better binding affinity to human ACE2 during the outbreaks that it caused. The expectation at the time for future outbreaks of SARS-like viruses was that ACE2-binding would use something closer to this “optimal” solution predicted by our best structural analysis.
Instead, SARS-CoV-2 had a markedly different binding domain with 5 key locations that were different from previous SARS-CoV lineages. This led to a different binding solution than had been previously predicted as optimal by the world’s leading scientists using the best equipment. One of the researchers who predicted that optimal solution is Dr. Ralph Baric, who also happens to be the person who first made the argument that nature is a far better experimenter that has done far more of the work than we will ever do. He seems to have proved himself right by accident.
It is my belief, and the belief of many virologists, that only nature could have created something as novel as this binding solution. Our best researchers did not predict this clever solution. It could only have emerged by random evolution.
However, there are researchers who do a process called “directed evolution.” This is where we use conditions in a lab that encourage the survival of mutations of interest. You can do this sort of experiment with bacteria, viruses, yeast, and other systems where the generation time is fast and the growth conditions are easy to control. While the above evidence is convincing that SARS-CoV-2 emerged by random action, that does not mean that random action necessarily took place in nature. Perhaps it took place in laboratory conditions designed to encourage such a random emergence?
No, it did not, and there is good evidence of that. The first piece of evidence is that coronavirus sequences have been isolated from nature that match the novel ACE2-binding domain, and that could have recombined with a SARS-like virus to create the SARS-CoV-2 sequence that jumped into humans. The sequence was already out there, waiting—the idea that among millions and millions of animals, this preexisting sequence eventually connected with the bat viruses in the SARS-CoV lineage to produce SARS-CoV-2 is a far more likely scenario than the concept that this specific set of changes showed up in a lab, at random, from a few experiments.
That’s not the end of the argument, though. There are certain hallmarks of laboratory passage of viruses that we need to consider. Laboratory-origin viruses often lose important features relative to their wild-type counterparts. Specifically, when viruses are passed through cultured cells many times, as would be necessary in a directed-evolution experiment, they stop having to overcome the robust immune system that is present in a complete organism.
Cultured cells are not a full animal. They don’t have immune systems. They are heavily mutated to survive in plastic dishes in a 20% oxygen environment. The cells inside you do not do these things, and so the cells that are worked with in labs have been seriously messed up so that they can. They are more similar to tumors than to body cells. They are defective in key immune genes, a lot of the time. And there are no T-cells or B-cells to rush in and defend them when they become infected. No antibodies come to their rescue.
These facts take a lot of pressure off of the virus, and there are a few patterns that have been noticed in lab-grown viruses as a result of this. Specifically, lab-grown (or “passaged”) viruses eventually lose sites that are used for the attachment of sugars to their surface proteins. In the wild, the attachment of these sugars puts the sugar molecule physically in the way of antigens that might otherwise generate an effective immune response. These sugars are a way of hiding those virus antigens from the immune system.
When the pressure of that immune system is removed, the sequences that encourage attachment of sugars rapidly mutate away, because they no longer offer any survival advantage. It is typical to see the loss of such sites within viruses that are passaged in tissue culture.
SARS-CoV-2 has a bunch of sites for the attachment of sugars to its surface proteins. These sites have been present in the earliest sequences that were identified and they have been conserved all the way through the pandemic. This is a hallmark of a naturally-occurring virus, and direct evidence in favor of a natural origin.
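As a rough illustration of what "looking for sugar-attachment sites" can mean in practice, here is a short Python sketch. It assumes we are talking about N-linked glycosylation, whose textbook sequence motif (the "sequon") is asparagine, then any amino acid except proline, then serine or threonine. That is an assumption on my part for the example, since other kinds of sugar attachment exist and a motif match is only a candidate site. The two peptides are invented and are not real spike protein sequence.

```python
import re

# N-linked glycosylation sequon: Asn (N), any residue except Pro (P),
# then Ser (S) or Thr (T). The lookahead allows overlapping motifs.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(protein):
    """Return 0-based start positions of candidate N-linked sequons."""
    return [m.start() for m in SEQUON.finditer(protein)]


# Invented toy peptides (hypothetical, for illustration only).
wild_type = "MKNGSLLNPTAANVTQ"
passaged = "MKDGSLLNPTAANVAQ"  # both candidate sites mutated away

print(sequon_positions(wild_type))
print(sequon_positions(passaged))
```

Comparing the candidate sites in an early sequence against later ones is the kind of check that shows whether these sites are being kept or lost over time.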
There are other sequence features that make it appear that SARS-CoV-2 originated in nature, but I personally was convinced by the facts that I’ve presented already. However, I will summarize the additional parts of the argument. The SARS-CoV-2 sequence also contains features that would have required long-term experimentation in a previously-known virus backbone, because there is no documented coronavirus with these sequence features. One example is the “cleavage” site that primes the virus S protein for entry; it’s different from other related coronaviruses, and uses a different mechanism. To introduce that feature would have required experimentation, and there is no evidence that anyone anywhere in the world was doing such experiments. This is true of other features of the virus sequence as well.
I find this less convincing than the retention of the sugar-linking sites, because it’s always possible some foolish government is doing something unwise in secret. However, the presence of these immune-evading sugar-linking sites defeats that argument immediately because it demonstrates the applied evolutionary pressure of an immune system. Taken together, these two pieces of evidence suggest that many tissue culture experiments would have been required to create the new cleavage site, but if many tissue culture experiments had been performed, we would expect the sugar-linking sites to have started to disappear. This is a contradiction.
We know that the specific sequences in this virus didn’t exist together before, and a lab would have had to passage the virus in tissue culture many times to bring them together. We know that passage in tissue culture many times would have removed characteristic features that this virus still has. These facts are incompatible with each other, at least if you’re operating under a lab-origin hypothesis.
Meanwhile, we know that the specific sequences in this virus existed in other viruses of animals that we know interact with each other, allowing the progenitor viruses to recombine and form new sequences. We know that viruses circulating through animals do need to have immune-masking sugars, too. This means that a virus carrying novel sequence features as well as these immune-masking sugars would be entirely possible in nature. There is no contradiction because the combination of features makes sense as something that could exist in nature, and does not make sense as something that could have been created in a laboratory environment.
My conclusion is, therefore, that this virus was not engineered in a lab. This is based both on my personal experience with the difficulty of virus generation and on these evidence-based arguments, which are not, in fact, my own arguments. I have drawn them from a paper that was published early in the pandemic, which appeared in Nature Medicine and can be found here: https://www.nature.com/articles/s41591-020-0820-9. This paper contains citations and references that back up every claim that I have made here. I have read them all, in order to convince myself that the correct interpretations of these references were made, and I have distilled them here for you, but I present the link so that you can check my work if you are so inclined.
So, that puts to bed the possibility of an engineered laboratory origin, either by intentional design or by directed evolution. A final lab-origin hypothesis exists: what if an animal captured for analysis in a lab was carrying this virus, and by some accident it jumped into a human being working in that lab?
To discuss this, I would like to show you the following photo from the EcoHealth Alliance, a nonprofit that is active in surveillance for emerging viruses around the world:
Image shows a bat in the wild being handled by scientists wearing high levels of personal protective equipment, including full-body “bunny suits,” N95 masks, and latex gloves.
This demonstrates that field operations in which scientists interact with bats involve serious protective measures. There are, I should note, a very small number of scientists who do this sort of work, and I’ve met quite a few of them. They do not want to die by exposing themselves to some unknown bat virus. These scientists also collect and sample a very small number of bats.
There is a much larger number of people who like to capture bats for various reasons or who invade bat habitats to harvest other products. These people are not trained virologists and they are not always educated regarding the risks at which they are placing themselves. There is also a much larger number of bats that interact with such people, carrying millions of virus particles with mutations that can easily pass into unprotected airways, uncovered eyes, or other means of entry.
We have to consider the numbers game and the relative odds of a laboratory origin as compared to a random origin through introduction to an unprotected person with no training in laboratory safety. Today, when SARS-CoV-2 is studied in a lab, the work takes place in a high biosafety environment involving lots of protective equipment. When you go outside and sit on a park bench, it is entirely possible for you to catch SARS-CoV-2 because there are minimal biosafety precautions being taken. If a scientist makes a mistake with their protective equipment, they become like you: an average, unprotected person.
So when we think about the large number of average, unprotected people living adjacent to bat habitats, and then consider the number of scientists who know the correct precautions, it seems to me that even if the scientists slip up from time to time, the odds clearly indicate that the average person near a bat habitat is still a more likely candidate for spillover than the trained scientist.
I recognize this isn’t a definitive argument, but let’s contextualize it further. There is evidence that the SARS-CoV-2 virus emerged in the area around Wuhan, and outside the city are rural areas where bats are common. There is evidence of community spread at the end of 2019 that does not cluster in the city and that suggests a typical person in that area was passing the virus to others. Scientists working at the Wuhan Institute of Virology, on the other hand, were not in these areas and were not part of these early clusters. This seems to be evidence that is incompatible with an escape from an urban virus research center.
Again, these are arguments that I have not made myself. Instead, in this case, I have drawn them from an excellent Q&A document written by a graduate school friend of mine, Dr. Jim Duehr, who has gone on from his PhD in emerging viruses to pursue an MD. You can find his document here: https://drive.google.com/file/d/1kAHSEx9-eIyVIahczH8itHaUm9jI9WX7/view. Jim’s work is excellently referenced.
In summary, we are able to demonstrate evidence that suggests:
It is not reasonably possible that the virus was engineered in a laboratory, because it has sequences that we could not have predicted would emerge to behave in the way that this virus behaves
It is not reasonably possible that the virus was generated in a laboratory by directed evolution, because it contains features that are at odds with each other—it has evolved features that would mutate away in tissue culture, and it contains sequences that would have required extensive tissue culture work to introduce
It is not reasonably possible that the virus originated from a laboratory handling accident of a natural sequence; there is evidence of a community origin at a distance away from the proposed site of such an accident, preceding the arrival of the virus in the location where that site is housed
Now, by the very nature of this emerging virus, it is not possible for us to definitively pin down where it entered the human population. We did not know what to look for or where to look for it, and the symptoms in most people are mild enough that the first patient would never have known that they were the first patient. That is one of the reasons that the lead-in to this newsletter says “the first documented case” rather than the “first case.” What we can do is rule out certain scenarios through observation and evidence until only a few possibilities remain, and we can rank those by their likelihood. We do not know what specific bat passed the virus to another animal, and we do not know if that other animal was a human or some intermediate. What we do know is that the list of possibilities, based on the evidence available, can be narrowed to a natural event in which a bat passed the virus to another animal, allowing it to grow and eventually become the pandemic virus that it is today.
Join the conversation, and what you say will impact what I talk about in the next issue.
Also, let me know any other thoughts you might have about the newsletter. I’d like to make sure you’re getting what you want out of this.
This newsletter will contain mistakes. When you find them, tell me about them so that I can fix them. I would rather this newsletter be correct than protect my ego.
Though I can’t correct the emailed version after it has been sent, I do update the online post of the newsletter every time a mistake is brought to my attention.
No corrections since last issue.
See you all next time.
Always,
JS
Also, I'm not finding a lot of information about loss of glycosylation sites as a general occurrence in serial passage through cell culture. Can you provide some links about that? Did it occur in any of the furin cleavage research I linked previously? It's my understanding that at least on the S protein, glycosylation plays an important role in the virus's entry into cells, and not only in evading the immune system, which implies it would be conserved even in culture. Furthermore, the selective pressure of an immune system could be restored by using whole live animals. Transgenic mice expressing the human ACE2 receptor are currently being used to study SARS-CoV-2; perhaps they were also used in its development?
Yuri Deigin's writeup on the lab leak hypothesis links to five different papers from 2006 to 2019 that involve messing with furin cleavage sites of coronaviruses, including SARS-CoV. I don't have the background to understand this research in depth, but from my layman's perspective it seems strange to say that there is no evidence of anyone experimenting in this area. The papers are:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7111780/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2583654/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2519682/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2660061/
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6832359/