COVID-19 Viral Genome Analysis Pipeline
Enabled by data from GISAID

This website provides analyses and tools for exploring accruing mutations in hCoV-19 (SARS-CoV-2) geographically and over time, with an emphasis on the Spike protein, using data from GISAID.

The SARS-CoV-2 sequence data used for these analyses was updated from GISAID on Aug 3, 2020.

The analyses provided are based on a trimmed full-length SARS-CoV-2 alignment containing 30,826 sequences:
sequence names and ID numbers used for full-length analyses,
or on a Spike alignment containing 59,543 sequences:
sequence names and ID numbers used for spike-only analyses.

The details of the analyses are described in:
Tracking changes in SARS-CoV-2 Spike: evidence that D614G increases infectivity of the COVID-19 virus.
Korber B, Fischer WM, Gnanakaran S, Yoon H, Theiler J, Abfalterer W, Hengartner N, Giorgi EE, Bhattacharya T, Foley B, Hastie KM, Parker MD, Partridge DG, Evans CM, Freeman TM, de Silva TI*, McDanal C, Perez LG, Tang H, Moon-Walker A, Whelan SP, LaBranche CC, Saphire EO, and Montefiori DC.
*on behalf of the Sheffield COVID-19 Genomics Group
In press in Cell, June 2020


May 11, 2020

  1. Our preprint observed that the D614G mutation was associated with higher viral loads in subjects (indicated by fewer PCR cycles needed for detection; Fig. 5C and 5D of Korber et al., based on clinical data from Sheffield), but not with greater disease severity. This observation has recently been repeated by the Bedford lab in Washington. That is some progress on this issue, though of course more work will need to be done.
  2. The global pattern of repeated shifts over time from the D614 to the G614 variant continues to be supported as data accrue, in dozens of regions in parallel around the globe. There is an interesting exception in California, where sequences of the original D614 form are currently dominant. That could be explained by a bolus of available sequences from Santa Clara County in late April; D614 is the local epidemic form in Santa Clara County. To see what is happening in California at the county level, go to the Tracking Mutations tool; here is a summary as of today:
    California, a county breakdown: May 11, 2020
  3. In our original bioRxiv preprint, we noted that a mutation in Spike at position 943 seemed to be accruing. This apparent mutation turned out to be a sequence-processing artifact, which we have corrected in our bioRxiv preprint.
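
The kind of tally behind the D614-to-G614 shift in item 2 can be sketched as follows. This is a minimal illustration, not the pipeline's own code: the function name `g614_fraction_by_month`, the record layout (region, collection date, residue at Spike position 614), and the monthly binning are all assumptions made for the example, and the sample records are invented placeholders rather than GISAID data.

```python
from collections import defaultdict
from datetime import date


def g614_fraction_by_month(records):
    """Return the fraction of G614 among D614/G614 sequences per region and month.

    `records` is an iterable of (region, collection_date, residue) tuples,
    where `residue` is the amino acid observed at Spike position 614.
    This layout is hypothetical, chosen only for illustration.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [G614 count, D614 + G614 count]
    for region, collected, residue in records:
        if residue not in ("D", "G"):
            continue  # skip ambiguous calls and any other residue
        key = (region, collected.year, collected.month)
        counts[key][1] += 1
        if residue == "G":
            counts[key][0] += 1
    return {key: g / total for key, (g, total) in sorted(counts.items())}


# Invented placeholder records (not real sequence data).
toy_records = [
    ("Region A", date(2020, 3, 2), "D"),
    ("Region A", date(2020, 3, 20), "G"),
    ("Region A", date(2020, 4, 5), "G"),
    ("Region A", date(2020, 4, 9), "G"),
]

for (region, year, month), fraction in g614_fraction_by_month(toy_records).items():
    print(f"{region} {year}-{month:02d}: G614 fraction = {fraction:.2f}")
```

A rising G614 fraction across successive months within one region would correspond to the shift described above. A bolus of sequences from a single county, as in the Santa Clara case, can dominate a state-level tally, which is why a county-level breakdown is informative.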



We gratefully acknowledge the authors, originating and submitting laboratories of the sequences from GISAID on which this research is based. The original data are available from

This COVID-19 response analysis pipeline is supported by: The Laboratory Directed Research and Development program of Los Alamos National Laboratory (20200706ER), and by the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Interagency Agreement No. AAI12007-001-00000.

GISAID data provided on this website is subject to GISAID’s Terms and Conditions

