W.P. Hanage and Xueting Qiu
Translation in Mandarin available here.
A novel coronavirus is spreading in central China, with exported cases reported in countries around the globe. There is substantial uncertainty surrounding the rate of spread and the mortality associated with the disease, although it is clear that human-to-human transmission has been happening for months. Some authorities are now stating it is ‘likely’ that this will be declared a pandemic. In the face of a churning news cycle and unverified, frightening reporting, the public is understandably anxious.
Recent news reports have included speculation that the novel coronavirus is ‘mutating’ and ‘evolving’ amid increasing concern that the outbreak could present an even more significant threat than it does already. At present there is no evidence to think that this is the case. The following is to help the public understand what scientists are looking for when they examine viral sequences, and what you can and cannot tell from such data.
Is the virus ‘stable’? What does that mean?
The question itself is a bit misleading. The viral genomes that are available so far are very similar to one another. This doesn’t mean they are ‘stable’; it just means that the outbreak is young and the family tree of all the sequenced genomes started somewhere around the end of November or start of December. Over time, as mutations happen, we expect to find more differences between the viral genomes. But this is expected as the population grows and doesn’t necessarily mean anything bad or good about the trend of the outbreak.
Don’t be scared of mutation
Mutations happen all the time, at a relatively constant rate for a specific pathogen, and they are a natural consequence of the process of genome replication. Mutation in this context is a random mistake made when the virus copies its genome, so that the new genome is slightly different. The new mutated genome might be better at transmitting than the old one, or worse, or pretty much the same. We really cannot tell just by looking at new genomes as they come in.
Why do some of the outbreak genomes have more mutations than others?
This might mean they are more distantly related to the rest of the outbreak, although some genomes will have more mutations simply by chance. It might also reflect laboratory errors. At present the genomes are distinguished by a few handfuls of mutations, and so any mistakes in the sequencing will have a proportionately large impact. There are also different sequencing technologies being used, some of which have higher error rates than others. A more complete picture of the genetic relationship between different isolates needs a more representative sample of the viral population.
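The point about chance variation can be illustrated with a small calculation. The sketch below assumes that mutations accumulate as a Poisson process, which is a common simplification and not something stated above, and it uses a purely hypothetical average of 3 mutations per genome. The function name `prob_at_least` is made up for this example.

```python
import math


def prob_at_least(count: int, mean: float) -> float:
    """Probability of seeing `count` or more mutations in one genome,
    assuming the number of mutations is Poisson-distributed with the given mean."""
    # Sum the Poisson probabilities of 0, 1, ..., count - 1 mutations.
    below = sum(
        math.exp(-mean) * mean ** k / math.factorial(k) for k in range(count)
    )
    return 1.0 - below


# Hypothetical numbers: if genomes carry 3 mutations on average,
# how often would one carry 6 or more purely by chance?
print(f"{prob_at_least(6, 3.0):.3f}")
```

Under these assumed numbers, roughly one genome in twelve would carry twice the average number of mutations without anything unusual having happened to it.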
What about evolution?
Not all mutations are alike. Some changes have small or negligible impacts on the virus, while others change the viral proteins and might make the virus fitter, or less fit. Scientists call the latter ‘nonsynonymous’ mutations. Over time, if we see an excess of nonsynonymous mutations becoming more common in the population, we might conclude that this is a sign of the virus adapting. But there is nowhere near enough data at the moment to say anything about whether the virus is adapting to transmit better in humans or become more virulent.
In fact, we may even expect nonsynonymous mutations to be more common over the short term. This is because the great majority of such mutations are damaging to the virus, but not so damaging they kill it outright. It takes time for such damaged variants to die out from the population, and because the virus has not had many limits on its growth so far, there’s not been much competition from the fitter variants.
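For readers who want to see what separates the two kinds of mutation, here is a minimal sketch. It uses the standard genetic code to check whether a change to a single codon (a three-letter unit of the genome) alters the amino acid it encodes. The function name `classify_mutation` is invented for this illustration, and real analyses work on whole aligned genomes, not single codons.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids are listed in the order produced by
# iterating over codons with bases in T, C, A, G order ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_mutation(old_codon: str, new_codon: str) -> str:
    """Return 'synonymous' if both codons encode the same amino acid,
    otherwise 'nonsynonymous'. RNA-style 'U' is accepted in place of 'T'."""
    old_aa = CODON_TABLE[old_codon.upper().replace("U", "T")]
    new_aa = CODON_TABLE[new_codon.upper().replace("U", "T")]
    return "synonymous" if old_aa == new_aa else "nonsynonymous"


print(classify_mutation("GAA", "GAG"))  # both encode glutamate
print(classify_mutation("GAA", "GAU"))  # glutamate becomes aspartate
```

The first change leaves the protein untouched, while the second alters it. Whether an altered protein makes the virus fitter or less fit cannot be read off from the code alone.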
Can you tell where the virus came from, is it a mixture of other viruses?
Once you have a genome you can compare it with others, to see how similar they are. Genomes that are more similar are more closely related. The virus causing the current outbreak is closely related to the SARS virus, and to others that are circulating in bats. Some short regions of the new coronavirus genome are more different from these close relatives than others. If we examine these short regions we can see that they are similar to bits in other viruses, but this doesn’t by itself mean anything. You can find short chunks of DNA similar to the virus in just about any genome, including yours! This happens by chance. There are statistics to handle this, but they have not always been applied appropriately in the rush to share results. As a consequence some early reports have been withdrawn, but not before they led to considerable public misunderstanding.
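A rough calculation shows why short matches turn up by chance. The sketch below assumes that the four bases are equally common and independent of one another, which is a deliberate simplification; real genomes are not random, and proper tools use more careful statistics. The function name `expected_chance_matches` is hypothetical.

```python
def expected_chance_matches(k: int, genome_length: int) -> float:
    """Expected number of exact matches to one specific k-base sequence
    in a random genome of the given length (uniform, independent bases)."""
    # Number of positions at which a k-base window can start.
    positions = max(genome_length - k + 1, 0)
    # Each window matches a given sequence with probability (1/4)^k.
    return positions / 4 ** k


# A 15-base chunk compared against a genome of about three billion bases,
# roughly the size of the human genome.
print(expected_chance_matches(15, 3_000_000_000))
```

Under these assumptions a specific 15-base chunk is expected to appear a little under three times in a human-sized genome by chance alone, so finding such a match says nothing about shared ancestry.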
Can you tell how transmissible the virus is from its genome?
Not from a single genome. However, with a suitable sample of many genomes, collected over time, we can estimate how quickly the virus population is growing or indeed if it is shrinking, for instance after quarantine or other interventions. We’re not at the stage right now where we have enough data to do this properly, although we are getting close.
Will the virus evolve to be more dangerous?
Viruses evolve to transmit. The better they are at transmitting, the more descendants they leave. Whether this produces evolutionary pressures to become more dangerous depends on whether disease helps the virus transmit. At the moment we don’t know, but we should note that the virus appears to be transmitting quite effectively for the time being, at least in Hubei province. The future evolution of the virus could plausibly lead it to become more virulent (exploiting the host and transmitting quickly), or less (a long duration of infection permitting the host to make more contacts), but at this stage we don’t know. Be cautious of people who seem confident one way or the other.
What is the role of speculation in an outbreak?
This is understandably a frightening time. We lack a lot of information, and that goes for the scientists as well as the public. However, we should remember that if there’s something we don’t know, it doesn’t follow that we should treat all possibilities as equally plausible. Speculation in the absence of solid evidence is bad, and extraordinary claims should require extraordinary evidence. Some news reports have not handled this responsibly. Nor have some self-anointed ‘expert’ commentators on social media. Keep calm; we will learn as fast as we can by collecting more data, analyzing them in a rigorous way, and relying on trustworthy sources for information about the outbreak.
Further chewing on the sequence data
- Available 2019-nCoV sequence phylogeny and mutations:
- Situation report:
- Andrew Rambaut is regularly updating his analysis through his blog at virological.org. This is a link to his most recent Phylodynamic Analysis with 56 genomes: