COVID-19 Viral Genome Analysis Pipeline
Enabled by data from GISAID


SHIVER

SARS CoV-2 Historically Identified Variants in Epitope Regions

Last update: Jul 31, 2021


Strategy: Take turns

NOTE: We are NOT tracking insertions in Spike sequences in this output; insertions are still very rare, but are found on occasion.

In particular, we have found them associated with a few rare Pango lineages including:
B.1.621 T95I, insert144T, Y144S, Y145N, R346K, E484K, N501Y, D614G, P681H, D950N
A.2.5.2 del141-143, insert215AGG, D215Y, L452R, D614G
AT.1 P9L, del136-144, D215G, H245P, E484K, D614G, N679K, insert679GIAL, E780K
B.1.214.2 insert214TDR, Q414K, N450K, D614G, T716I

Here 'insert' indicates an insertion at the given position, followed by the list of amino acids added, and 'del' indicates a deletion.
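The notation above is regular enough to parse mechanically. The following is a minimal sketch, assuming only the four forms that appear on this page: a substitution (T95I), a single-site deletion (Y144-), a range deletion (del141-143), and an insertion (insert215AGG). The names parse_mutation and parse_mutation_list are hypothetical helpers for illustration and are not part of the pipeline.

```python
import re

# Patterns for the notations seen on this page (hypothetical helpers, not pipeline code).
_INSERT = re.compile(r"^insert(\d+)([A-Z]+)$")
_DEL_RANGE = re.compile(r"^del(\d+)-(\d+)$")
_SUBST = re.compile(r"^([A-Z])(\d+)([A-Z-])$")


def parse_mutation(token):
    """Return a dict describing one mutation token such as 'T95I' or 'insert144T'."""
    token = token.strip()
    m = _INSERT.match(token)
    if m:
        return {"kind": "insertion", "site": int(m.group(1)), "added": m.group(2)}
    m = _DEL_RANGE.match(token)
    if m:
        return {"kind": "deletion", "start": int(m.group(1)), "end": int(m.group(2))}
    m = _SUBST.match(token)
    if m:
        ref, site, alt = m.group(1), int(m.group(2)), m.group(3)
        if alt == "-":
            # A trailing '-' marks a single-site deletion, e.g. 'Y144-'.
            return {"kind": "deletion", "start": site, "end": site}
        return {"kind": "substitution", "site": site, "ref": ref, "alt": alt}
    raise ValueError(f"unrecognized mutation token: {token!r}")


def parse_mutation_list(text):
    """Split a comma-separated mutation list and parse each token."""
    return [parse_mutation(t) for t in text.split(",") if t.strip()]
```

For example, parse_mutation_list("del141-143, insert215AGG, D215Y, L452R, D614G") yields one range deletion, one insertion, and three substitutions.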


SHIVER: SARS CoV-2 Historically Identified Variants in Epitope Regions

SHIVER identifies sets of variant forms of the SARS CoV-2 virus with a focus on just the NTD and RBD neutralizing antibody epitope regions of the Spike protein, chosen to maximize coverage globally and/or on separate continents[*], depending on which of several strategies is employed.

The first variant in the input alignment is taken as the reference sequence; it should be the ancestral Wuhan variant so that epitope regions are chosen appropriately. The epitope regions in Spike that are featured are defined as follows. The NTD supersite includes Spike positions 13-20, 140-158, and 242-264. Note, however, that site 18 is not included in the analysis: it is so variable that the ancestral L18 form and the common variant L18F are both very often found in significant numbers among Variants of Interest.

The NTD supersite sites selected for inclusion are based on:

Sites 14-20, 140-158, and 245-264: N-terminal domain antigenic mapping reveals a site of vulnerability for SARS-CoV-2. McCallum, M. et al. bioRxiv, doi: 10.1101/2021.01.14.426475

Site 13: SARS-CoV-2 immune evasion by variant B.1.427/B.1.429. McCallum, M. et al. bioRxiv, 2021/04/07, doi: 10.1101/2021.03.31.437925, PMC8020983

Sites 242-244: SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Wibmer, C. et al. bioRxiv, doi: 10.1101/2021.01.18.427166

Sites 330-521: The RBD region includes positions 330-521, based on a synthesis of the literature from early 2020.

All distinct variants found within these boundaries are identified and tallied, and the most common variants are selected. Windows in time can be selected to reflect more recently emerging patterns in variation in key epitope regions.

[*] Note that the UK is treated as a separate continent because so much of the sequencing has been from the UK.

This run uses the T=taketurns strategy for identifying further variants. Each continent, in turn, chooses the next variant: the most common variant in that continent that has not already been chosen. The continents take their turns in order of the number of samples available in each, largest first.
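The take-turns selection can be sketched as a round-robin over per-continent tallies. This is an illustrative sketch under stated assumptions, not the pipeline's implementation: the function name take_turns, its arguments, and the choice to pass the reference pattern as initial are all hypothetical, and tie-breaking between equally common patterns is left to insertion order.

```python
from collections import Counter


def take_turns(patterns_by_continent, n_variants, initial=None):
    """Pick variants round-robin: each continent, in turn, takes its most
    common pattern that has not been chosen yet.

    patterns_by_continent maps a continent name to the list of epitope-region
    patterns observed there (one entry per sequence). All names here are
    illustrative assumptions, not the pipeline's actual interface.
    """
    counts = {c: Counter(p) for c, p in patterns_by_continent.items()}
    # Continents with more samples take their turn first.
    order = sorted(counts, key=lambda c: sum(counts[c].values()), reverse=True)

    chosen = [] if initial is None else [("Initial", initial)]
    taken = {initial} if initial is not None else set()
    exhausted = set()
    while len(chosen) < n_variants and len(exhausted) < len(order):
        for continent in order:
            if len(chosen) >= n_variants:
                break
            # most_common() lists patterns by descending count.
            pick = next((p for p, _ in counts[continent].most_common()
                         if p not in taken), None)
            if pick is None:
                exhausted.add(continent)  # nothing left for this continent
                continue
            taken.add(pick)
            chosen.append((continent, pick))
    return chosen
```

With initial set to the reference pattern, the first entry plays the role of the "1-Initial" row and the remaining entries follow the continent order.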

This run uses sequences sampled from 2021-04-27 to 2021-07-26.
The number of sequences, broken out by continent, is:
Total: 426008, Europe-w/o-United-Kingdom: 163256, United-Kingdom: 118768, North-America: 107152, Asia: 24698, South-America: 10048, Africa: 1578, Oceania: 508.
Note: the focus here is specifically on the epitope region: NTD-18+RBD
Sites: 13-17,19,20,140-158,242-264,330-521
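The site specification above is a comma-separated list of single sites and inclusive ranges. A minimal sketch of expanding it into explicit positions follows; expand_sites and EPITOPE_SITES are hypothetical names introduced here for illustration.

```python
def expand_sites(spec):
    """Expand a site specification like '13-17,19,20' into a sorted list of ints."""
    sites = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = (int(x) for x in part.split("-"))
            sites.update(range(lo, hi + 1))  # ranges are inclusive
        else:
            sites.add(int(part))
    return sorted(sites)


# The NTD-18+RBD region used in this run, as listed above.
EPITOPE_SITES = expand_sites("13-17,19,20,140-158,242-264,330-521")
```

Expanded this way, the region covers 241 Spike positions, and site 18 is absent as intended.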

Table of Variants

In the table below, the first column is the pattern (i.e., the sequence within RBD+NTD) at sites where differences occur relative to the initial (Wuhan) sequence, with site numbers read vertically, top to bottom.
LPM = Local Pattern Matches = # of seqs in continent that match over RBD+NTD
GPM = Global Pattern Matches = # of seqs in world that match over RBD+NTD
GSM = Global Sequence Matches = # of seqs that match over whole Spike protein
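The GSM/GPM column in the table is the ratio of these two counts, expressed as a percentage. A small sketch of that arithmetic, with gsm_over_gpm as a hypothetical helper name:

```python
def gsm_over_gpm(gsm, gpm):
    """Percentage of pattern-matching sequences that also match over the whole Spike."""
    if gpm <= 0:
        raise ValueError("GPM must be positive")
    return 100.0 * gsm / gpm


# Counts taken from the B.1.1.7=Alpha row of the table.
print(f"{gsm_over_gpm(122003, 198115):.1f}%")  # 61.6%
```

A low ratio indicates that many sequences share the epitope-region pattern but differ elsewhere in Spike.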

  111111111222222222223444444444445
12444555555444444455554134445778990
90245245678234678901236790692784041 Name                    LPM    GPM    GSM  GSM/GPM [Mutations] (Lineage)
TTGYYWESEFRLALRSYLTPGDRKNNGYLSTEFSN 1-Initial              2154   2154     32     1.5% [] (Ancestral)
...-..............................Y 2-Europe-1           108509 198115 122003    61.6% [H69-,V70-,Y144-,N501Y,A570D,D614G,P681H,T716I,S982A,D1118H] (B.1.1.7=Alpha)
R.D.....--G.................R.K.... 3-United-Kingdom-1    91502 119757  55453    46.3% [T19R,T95I,G142D,E156-,F157-,R158G,L452R,T478K,D614G,P681R,D950N] (B.1.617.2=Delta)
.N.....................T.......K..Y 4-North-America-1     10906  20995  12759    60.8% [L18F,T20N,P26S,D138Y,R190S,K417T,E484K,N501Y,D614G,H655Y,T1027I,V1176F] (P.1=Gamma)
R.......--G.................R.K.... 5-Asia-1               1846  13715   4792    34.9% [T19R,E156-,F157-,R158G,L452R,T478K,D614G,P681R,D950N] (B.1.617.2=Delta)
..............N-------......Q...S.. 6-South-America-1       706   1110    650    58.6% [G75V,T76I,R246N,S247-,Y248-,L249-,T250-,P251-,G252-,D253-,L452Q,F490S,D614G,T859N] (C.37=Lambda)
...........---.........N.......K..Y 7-Africa-1              232   1378    370    26.9% [L18F,D80A,D215G,L242-,A243-,L244-,K417N,E484K,N501Y,D614G,A701V] (B.1.351=Beta)
..D...K.....................R..Q... 8-Oceania-1              37    417    237    56.8% [T95I,G142D,E154K,L452R,E484Q,D614G,P681R,Q1071H] (B.1.617.1=Kappa)
...-.R............................Y 9-Europe-2             6364   6437   5268    81.8% [H69-,V70-,Y144-,W152R,N501Y,A570D,D614G,P681H,T716I,S982A,D1118H] (Alpha+W152R)
R.D.....--G........L........R.K.... 10-United-Kingdom-2     694    956    528    55.2% [T19R,G142D,E156-,F157-,R158G,P251L,L452R,T478K,D614G,P681R,D950N] (B.1.617.2=Delta)
.....................G.......N..... 11-North-America-2     3543   3553   1458    41.0% [L5F,T95I,D253G,S477N,D614G,Q957R] (B.1.526.2)
.....L.........................K... 12-Asia-2              1284   1350   1038    76.9% [W152L,E484K,D614G,G769V] (R.1)
.N.............................K..Y 13-South-America-2      463    682    423    62.0% [L18F,T20N,P26S,D138Y,R190S,E484K,N501Y,D614G,H655Y,T1027I,V1176F] (P.1=Gamma)
...-...........................K... 14-Africa-2              93   2198    868    39.5% [T95I,Y144-,E484K,D614G,P681H,D796H] (B.1.1.318)
...-...G..........................Y 15-Oceania-2             23     44     37    84.1% [H69-,V70-,Y144-,S155G,N501Y,A570D,D614G,P681H,T716I,S982A,D1118H] (B.1.1.7=Alpha)
..................................Y 16-Europe-3            1337   1815    353    19.4% [H69-,V70-,N501Y,A570D,D614G,P681H,T716I,S982A,D1118H] 
R.D.H...--G.................R.K.... 17-United-Kingdom-3     379    384    299    77.9% [T19R,T95I,G142D,Y145H,E156-,F157-,R158G,A222V,L452R,T478K,D614G,P681R,D950N] (B.1.617.2=Delta+A222V)
.....................G.........K... 18-North-America-3     3530   3859   2838    73.5% [L5F,T95I,D253G,E484K,D614G,A701V] (B.1.526=Iota)
.........................K.....K... 19-Asia-3              1239   1441   1207    83.8% [I210T,N440K,E484K,D614G,D936N,S939F,T1027I] 
...SN.................K........K..Y 20-South-America-3      135    772    525    68.0% [T95I,Y144S,Y145N,R346K,E484K,N501Y,D614G,P681H,D950N] 
...-............................S.Y 21-Africa-3              48   1318    483    36.6% [L5F,H69-,V70-,Y144-,F490S,N501Y,A570D,D614G,P681H,T716I,S982A,D1118H] (Alpha+F490S)
...-.......................N......Y 22-Oceania-3              7     11      6    54.5% [H69-,V70-,Y144-,Y449N,N501Y,A570D,D614G,P681H,T716I,S982A,D1118H] (B.1.1.7=Alpha)
.I.-..............................Y 23-Europe-4             763    969    808    83.4% [T20I,H69-,V70-,Y144-,N501Y,A570D,D614G,P681H,T716I,S982A,D1118H] (Alpha+T20I)
R.D.....--G...............V.R.K.... 24-United-Kingdom-4     315    354    188    53.1% [T19R,T95I,G142D,E156-,F157-,R158G,G446V,L452R,T478K,D614G,P681R,D950N] (B.1.617.2=Delta)
..............................K.... 25-North-America-4     2748   2888   1875    64.9% [T478K,D614G,P681H,T732A] (B.1.1.519)
........................K.......... 26-Asia-4               366    495    237    47.9% [N439K,D614G,P681R] 
.......................T.......K..Y 27-South-America-4       64     95     38    40.0% [L18F,P26S,D138Y,R190S,K417T,E484K,N501Y,D614G,H655Y,T1027I,V1176F] 
R.......--G.......S.........R.K.... 28-Africa-4              46     51     28    54.9% [T19R,E156-,F157-,R158G,A222V,T250S,L452R,T478K,D614G,P681R,D950N] (B.1.617.2=Delta+A222V)
........---....................K.P. 29-Oceania-4              6    143    101    70.6% [E156-,F157-,R158-,F306L,E484K,S494P,D614G,E780A,D839V,T1027I] (B.1.1.523)
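Reading the table by hand is error-prone because the site numbers run vertically. The sketch below shows one way to decode the three header lines into site numbers and to turn a pattern row into a list of differences from the initial sequence. It assumes, as the rows above suggest, that '.' means "same as the initial sequence" and '-' means a deleted residue; the names HEADER, REFERENCE, column_sites, and row_mutations are hypothetical.

```python
# The three header lines and the initial (Wuhan) pattern, copied from the table.
HEADER = [
    "  111111111222222222223444444444445",
    "12444555555444444455554134445778990",
    "90245245678234678901236790692784041",
]
REFERENCE = "TTGYYWESEFRLALRSYLTPGDRKNNGYLSTEFSN"


def column_sites(header_lines):
    """Read site numbers written vertically, one digit per header line."""
    width = max(len(line) for line in header_lines)
    padded = [line.ljust(width) for line in header_lines]
    # Each column, read top to bottom, spells one site number (blank = no digit).
    return [int("".join(col).strip()) for col in zip(*padded)]


def row_mutations(reference, pattern, sites):
    """List differences from the reference; '.' is taken to mean 'unchanged'."""
    return [f"{ref}{site}{alt}"
            for ref, alt, site in zip(reference, pattern, sites)
            if alt != "."]


sites = column_sites(HEADER)
print(row_mutations(REFERENCE, "...-..............................Y", sites))
```

Only mutations inside the epitope region can be recovered this way; changes elsewhere in Spike, such as D614G, appear only in the bracketed mutation list.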

Table of Coverages

In the table below, T-n refers to a batch of the first n variants. Coverage is defined as the fraction of sequences in the continent with an exact match (over the RBD/NTD regions) to one of the first n variants. (Here, 'T' corresponds to the Taketurns strategy.) The coverage table is based on 426008 sequences.
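The coverage definition above amounts to an exact-match count divided by the number of sequences. A minimal sketch, assuming each sequence has already been reduced to its epitope-region pattern; the function name coverage and its arguments are hypothetical:

```python
def coverage(sequence_patterns, chosen_variants, n):
    """Fraction of sequences whose pattern exactly matches one of the first n variants.

    sequence_patterns: one epitope-region pattern per sequence in the continent.
    chosen_variants:   the selected variant patterns, in selection order.
    """
    if not sequence_patterns:
        return 0.0
    first_n = set(chosen_variants[:n])
    hits = sum(1 for p in sequence_patterns if p in first_n)
    return hits / len(sequence_patterns)
```

As a consistency check, the Global T-1 value of 0.0051 agrees with the GPM of 2154 for the initial variant divided by the 426008 sequences in this run.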

                Continent Name Coverage

                   Global T-1    0.0051
Europe-w/o-United-Kingdom T-1    0.0046
           United-Kingdom T-1    0.0002
            North-America T-1    0.0080
                     Asia T-1    0.0193
            South-America T-1    0.0030
                   Africa T-1    0.0095
                  Oceania T-1    0.0039

                   Global T-8    0.8395
Europe-w/o-United-Kingdom T-8    0.8251
           United-Kingdom T-8    0.9471
            North-America T-8    0.7570
                     Asia T-8    0.7950
            South-America T-8    0.8189
                   Africa T-8    0.6749
                  Oceania T-8    0.8209

                   Global T-15   0.8752
Europe-w/o-United-Kingdom T-15   0.8736
           United-Kingdom T-15   0.9542
            North-America T-15   0.7991
                     Asia T-15   0.8495
            South-America T-15   0.8653
                   Africa T-15   0.7338
                  Oceania T-15   0.8898

                   Global T-22   0.8978
Europe-w/o-United-Kingdom T-22   0.8919
           United-Kingdom T-22   0.9581
            North-America T-22   0.8401
                     Asia T-22   0.9029
            South-America T-22   0.9025
                   Africa T-22   0.7706
                  Oceania T-22   0.9075

                   Global T-29   0.9095
Europe-w/o-United-Kingdom T-29   0.8991
           United-Kingdom T-29   0.9614
            North-America T-29   0.8673
                     Asia T-29   0.9186
            South-America T-29   0.9090
                   Africa T-29   0.8010
                  Oceania T-29   0.9252

 

last modified: Wed Jul 14 06:34 2021



GISAID data provided on this website is subject to GISAID's Terms and Conditions

Questions or comments? Contact us at seq-info@lanl.gov.

 
Operated by Triad National Security, LLC for the U.S. Department of Energy's National Nuclear Security Administration
© Copyright Triad National Security, LLC. All Rights Reserved | Disclaimer/Privacy