SHIVER: SARS-CoV-2 Historically Identified Variants in Epitope Regions
SHIVER identifies variant forms of the SARS-CoV-2 virus, focusing on the NTD and RBD neutralizing-antibody epitope regions of the Spike protein and on sites related to furin cleavage. The forms are chosen to maximize coverage globally and/or on separate continents[*], depending on which of several strategies is employed.
In the Table of Variants, below, the first column shows the pattern at sites where differences occur relative to the initial (Wuhan) sequence, with site numbers read down vertically.
Table of Variants
LPM = Local Pattern Matches = # of seqs in continent that match over epitope region
GPM = Global Pattern Matches = # of seqs in world that match over epitope region
GSM = Global Sequence Matches = # of seqs in world that match over whole Spike protein
    111112222223333333444444444444445566
1112444554455664566777001444557777880177
4790146562907013358123054279232589061757  Name                  LPM   GPM   GSM  GSM/GPM  [Mutations] (Lineage)
QMLFDHNSGNGDAADRTLF-FAKSNHDWLFAKK-KFHPKR  1-Initial              49    49    27    55.1%  []
............................S...........  2-North-America-1    4183  5165  2988    57.9%  [L452S]
.---........................S...........  3-Asia-1              130   436   220    50.5%  [M17-,P18-,L19-,F20-,L452S]
........................K...S...........  4-Europe-1             13    15     5    33.3%  [N414K,L452S]
.---.Q.FRHV...ATKI....R..P.LFL..NVAS...H  5-Oceania-1            11    33    15    45.5%  [M17-,P18-,L19-,F20-,T25R,L51S,Q53H,+69HV,V82A,F126V,H144Q,S155F,G156R,Q181E,+208N,I209L,G210E,F213L,N242H,G249V,D261A,V329I,R343T,T353K,L365I,K400R,H442P,W449L,L452F,F453L,K478N,+479V,K480A,F486S,K550E,V566A,S617P,R677H,S700L,F935S,L1139P]
.---.QTFRH.G..ATKI.FS-R..PNL...TNVAS.S.H  6-United-Kingdom-1     15    24    10    41.7%  [M17-,P18-,L19-,F20-,T25R,L51S,+69HV,K76R,V82A,F126V,H144Q,N146T,S155F,G156R,Q181E,N183I,F184-,+208N,I209L,G210E,F213L,N242H,D250G,D261A,V329I,R343T,T353K,L365I,+371F,F372S,A373-,K400R,H442P,D447N,W449L,K475T,K478N,+479V,K480A,F486S,P517S,K550E,V566A,S617P,R677H,F935S,L1139P]
.---...................R....S...........  7-South-America-1       6     8     7    87.5%  [M17-,P18-,L19-,F20-,S405R,L452S]
..................S...R.....S...........  8-Africa-1              4     4     2    50.0%  [F368S,K400R,L452S,I580V,S700L]
...............T............S...........  9-North-America-2     193   209    69    33.0%  [F60S,R343T,L452S,A1083S]
............................S....VE.....  10-Asia-2              78    78    31    39.7%  [+27XXX,L51X,L452S,+479V,K480E]
.......................R....S...........  11-Europe-2             9   140    88    62.9%  [S405R,L452S]
............................SL..........  12-Oceania-2            7   103    50    48.5%  [L452S,F453L]
.---...........T............S...........  13-United-Kingdom-2     3    12     4    33.3%  [M17-,P18-,L19-,F20-,R343T,L452S]
......................R.....S...........  14-South-America-2      3    16     6    37.5%  [K400R,L452S]
.---G..........T........K...S...........  15-Africa-2             3     3     1    33.3%  [M17-,P18-,L19-,F20-,N21X,L22X,I23X,T24X,T25X,T26X,Q27X,+27XXX,D141G,R343T,N414K,L452S]
...............T............SL..........  16-North-America-3     44    50     7    14.0%  [R343T,L452S,F453L]
K---.Q.FRHV...ATKI....R..PNLFL..NVAS...H  17-Asia-3              12    15     9    60.0%  [Q14K,M17-,P18-,L19-,F20-,T25R,L51S,Q53H,+69HV,V82A,F126V,H144Q,S155F,G156R,Q181E,+208N,I209L,G210E,F213L,N242H,G249V,D261A,V329I,R343T,T353K,L365I,K400R,H442P,D447N,W449L,L452F,F453L,K478N,+479V,K480A,F486S,K550E,V566A,S617P,R677H,F935S,L1139P]
.---........................SL..........  18-Europe-3             4    10     3    30.0%  [M17-,P18-,L19-,F20-,L452S,F453L,A684V]
............................S.........R.  19-Oceania-3            2     6     6   100.0%  [L452S,K675R]
............................S.......Y...  20-United-Kingdom-3     3    16     8    50.0%  [L452S,H501Y]
.---.Q.FRHV...ATKI....R..PNLFLV.NVAS...H  21-South-America-3      2    14     4    28.6%  [M17-,P18-,L19-,F20-,T25R,L51S,+69HV,V82A,F126V,H144Q,S155F,G156R,Q181E,+208N,I209L,G210E,F213L,N242H,G249V,D261A,V329I,R343T,T353K,L365I,K400R,H442P,D447N,W449L,L452F,F453L,A472V,K478N,+479V,K480A,F486S,K550E,V566A,S617P,R677H,F935S,L1139P]
...............T......R.....S...........  22-Africa-3             3     3     3   100.0%  [R343T,K400R,L452S]
.....Q......................S...........  23-North-America-4     29    30    24    80.0%  [H144Q,L452S,T568I]
............................SL...VE.....  24-Asia-4              11    11     2    18.2%  [+27XXX,L452S,F453L,+479V,K480E]
.---.Q.FRHV...ATKI....R..PNLFL..NVAS...H  25-Europe-4             4    21     9    42.9%  [M17-,P18-,L19-,F20-,T25R,L51S,Q53H,+69HV,V82A,F126V,H144Q,S155F,G156R,Q181E,+208N,I209L,G210E,F213L,N242H,G249V,D261A,V329I,R343T,T353K,L365I,K400R,H442P,D447N,W449L,L452F,F453L,K478N,+479V,K480A,F486S,K550E,V566A,S617P,R677H,F935S,L1139P]
............TV..............S...........  26-Oceania-4            2     2     2   100.0%  [S32F,A257T,A260V,L452S]
.---.Q.FRHV...ATKI....R..PNL.L.RNVAS...H  27-United-Kingdom-4     2     7     4    57.1%  [M17-,P18-,L19-,F20-,T25R,L51S,+69HV,V82A,F126V,H144Q,S155F,G156R,E178V,K180Q,Q181E,+208N,I209L,G210E,F213L,N242H,G249V,D261A,V329I,R343T,T353K,L365I,K400R,H442P,D447N,W449L,F453L,K475R,K478N,+479V,K480A,F486S,V566A,S617P,R677H,A697V,F935S,L1139P]
.---.K.FRHV...ATKI....R..PNLFLV.NVAS...H  28-South-America-4      1     3     1    33.3%  [M17-,P18-,L19-,F20-,T25R,L51S,+69HV,V82A,F126V,H144K,S155F,G156R,Q181E,+208N,I209L,G210E,F213L,N242H,G249V,D261A,V329I,R343T,T353K,L365I,K400R,H442P,D447N,W449L,L452F,F453L,A472V,K478N,+479V,K480A,F486S,K550G,V566A,S617P,Q673R,R677H,F935S,L1139P]
.---...........T......R.....S...........  29-Africa-4             1     1     1   100.0%  [M17-,P18-,L19-,F20-,N21X,L22X,I23X,T24X,T25X,T26X,Q27X,+27XXX,S28X,L174F,R343T,K400R,L452S,L1230X,C1231X,C1232X]
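The GSM/GPM column can be reproduced from the two counts. The following is a minimal Python sketch, assuming the column is simply GSM divided by GPM, shown as a percentage with one decimal place; the helper name `gsm_over_gpm` is illustrative and is not part of SHIVER.

```python
def gsm_over_gpm(gsm: int, gpm: int) -> str:
    """Format GSM as a percentage of GPM, as in the GSM/GPM column."""
    if gpm <= 0:
        raise ValueError("GPM must be positive")
    return f"{100 * gsm / gpm:.1f}%"


# Counts copied from the Table of Variants.
print(gsm_over_gpm(27, 49))      # row 1-Initial
print(gsm_over_gpm(2988, 5165))  # row 2-North-America-1
```

A low ratio indicates that many sequences share the epitope-region pattern but differ elsewhere in Spike; a ratio of 100% means every pattern match is also a whole-Spike match.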
In the Table of Variants, above, the first sequence in the input alignment is taken as the reference sequence; it is the ancestral Wuhan variant, which ensures that the epitope regions are chosen appropriately. The alignment on the left shows the positions that define the unique common forms that SHIVER searches for. The position numbers are written vertically, and the amino acids in the top row are taken from the ancestral Wuhan variant. The epitope and furin cleavage regions in Spike that are used for this focused search for common Spike variants are defined below, with the full site list given at the end of this document.
The basic NTD supersite (NTDss) sites selected for inclusion are based on:
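Reading the vertical position numbers can be sketched in Python as follows. This assumes that the header consists of one text row per digit place and that a shorter top row is right-aligned (its leading blanks, which belong to two-digit sites, may be lost in extraction); the function name `decode_vertical_sites` is hypothetical.

```python
def decode_vertical_sites(rows: list[str]) -> list[int]:
    """Turn vertically written site numbers into a list of integers.

    Each column of the header spells one site number from top to bottom.
    Rows shorter than the widest row are right-aligned before reading
    (an assumption about how the header was laid out).
    """
    width = max(len(row) for row in rows)
    padded = [row.rjust(width) for row in rows]
    return [int("".join(column).strip()) for column in zip(*padded)]


# Header rows as they appear in the Table of Variants.
header = [
    "111112222223333333444444444444445566",
    "1112444554455664566777001444557777880177",
    "4790146562907013358123054279232589061757",
]
sites = decode_vertical_sites(header)
print(sites[:6])
```

With the sites decoded, each character of a pattern string can be paired with its site, for example with `zip(sites, pattern)`.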
Sites 14-20, 140-158, and 245-264: McCallum, M. et al. N-terminal domain antigenic mapping reveals a site of vulnerability for SARS-CoV-2. Cell 184:9 2332-2347.e16 (2021)
Site 13: Impacts signal peptide cleavage and NTDss antibodies. McCallum, M. et al. SARS-CoV-2 immune evasion by the B.1.427/B.1.429 variant of concern. Science 373:648-654 (2021)
Sites 242-244: Impact NTDss antibody potency. Wibmer, C. et al. SARS-CoV-2 501Y.V2 escapes neutralization by South African COVID-19 donor plasma. Nature Med. 27(4): 622-625 (2021)
Toggling Sites: Site 18 is in the NTDss and toggles frequently between L and F, so we exclude it from the tallies of forms of the regions of interest, as it splits the counts of otherwise distinctive forms. Site 142 poses an analogous problem. Among Delta variants, every common form within the Delta lineages appears with both the ancestral G and with D at site 142. This is because the ARTIC 3 primers can result in an erroneous call of the ancestral G at position 142; G142D is the common form, and the error is resolved by using the ARTIC 4 primers. By excluding both sites 18 and 142 from our NTDss definition, we group together in our tallies the forms of Spike that carry either residue at these sites.
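The effect of excluding the toggling sites can be illustrated with a small Python sketch. It assumes each sequence is available as a mapping from site number to residue; the names `TOGGLING_SITES` and `form_key` and the toy data are hypothetical and are not taken from SHIVER.

```python
from collections import Counter

TOGGLING_SITES = {18, 142}  # excluded from the region of interest


def form_key(residues: dict[int, str], region_sites: list[int]) -> tuple[str, ...]:
    """Residues at the region sites, skipping the toggling sites."""
    return tuple(residues[s] for s in region_sites if s not in TOGGLING_SITES)


# Toy region and sequences (made-up data, for illustration only).
region = [141, 142, 143]
seqs = [
    {141: "L", 142: "G", 143: "V"},  # ancestral G called at site 142
    {141: "L", 142: "D", 143: "V"},  # G142D
    {141: "L", 142: "D", 143: "F"},  # differs at a site that is kept
]

tally = Counter(form_key(s, region) for s in seqs)
print(tally)
```

The first two toy sequences differ only at site 142, so they fall into the same form and are counted together.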
Analysis of the ARTIC version 3 and version 4 SARS-CoV-2 primers and their impact on the detection of the G142D amino acid substitution in the spike protein. Davies et al. bioRxiv 10.1101/2021.09.27.461949 (2021)
Sites 330-521: the RBD region includes positions 330-521, based on a synthesis of the literature from early 2020.
Furin related sites: mutations that add positive charge near the furin cleavage site can enhance Spike cleavage and infectivity. Also, the H655Y change has been shown to impact furin cleavage (Alba et al. 2021), and we include site 950 as it accompanies P681R in Delta and P681H in Mu, two variants that were particularly fast spreading, though it was Delta that became prevalent.
SARS-CoV-2 spike P681R mutation, a hallmark of the Delta variant, enhances viral fusogenicity and pathogenicity. Saito et al. bioRxiv 10.1101/2021.06.17.448820 (2021)
SARS-CoV-2 variants of concern have acquired mutations associated with an increased spike cleavage. Alba et al. bioRxiv 10.1101/2021.08.05.455290 (2021)
Table of Coverages
In the table below, T-n refers to a batch of the first n variants. Coverage is defined as the fraction of sequences in the continent with an exact match (over the region NTDss-18-142+RBD+furin) to one of the first n variants. (Here, 'T' corresponds to the 'Taketurns' strategy.) The coverage table is based on 7173 sequences.
Continent                    Name  Coverage
Global                       T-1   0.0068
North-America                T-1   0.0037
Asia                         T-1   0.0018
Europe-minus-United-Kingdom  T-1   0.0155
Oceania                      T-1   0.0305
United-Kingdom               T-1   0.0368
South-America                T-1   0.0000
Africa                       T-1   0.0000

Global                       T-8   0.7994
North-America                T-8   0.8193
Asia                         T-8   0.6011
Europe-minus-United-Kingdom  T-8   0.7888
Oceania                      T-8   0.8659
United-Kingdom               T-8   0.7831
South-America                T-8   0.8056
Africa                       T-8   0.2667

Global                       T-15  0.8776
North-America                T-15  0.8938
Asia                         T-15  0.7766
Europe-minus-United-Kingdom  T-15  0.8391
Oceania                      T-15  0.8994
United-Kingdom               T-15  0.8309
South-America                T-15  0.8750
Africa                       T-15  0.5333

Global                       T-22  0.8935
North-America                T-22  0.9070
Asia                         T-22  0.8050
Europe-minus-United-Kingdom  T-22  0.8547
Oceania                      T-22  0.9238
United-Kingdom               T-22  0.8529
South-America                T-22  0.9028
Africa                       T-22  0.7333

Global                       T-29  0.9039
North-America                T-29  0.9155
Asia                         T-29  0.8369
Europe-minus-United-Kingdom  T-29  0.8624
Oceania                      T-29  0.9299
United-Kingdom               T-29  0.8640
South-America                T-29  0.9167
Africa                       T-29  0.8000
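The coverage definition used in this table can be written as a short Python sketch. It assumes every sequence has already been reduced to its form over the region of interest and that the chosen variants are held in selection order; the function name `coverage` and the toy forms are hypothetical.

```python
def coverage(sequence_forms: list[str], chosen_forms: list[str], n: int) -> float:
    """Fraction of sequences whose form exactly matches one of the first n chosen forms."""
    if not sequence_forms:
        return 0.0
    first_n = set(chosen_forms[:n])
    matches = sum(form in first_n for form in sequence_forms)
    return matches / len(sequence_forms)


# Toy data: five sequences reduced to one-letter "forms".
forms = ["A", "A", "B", "C", "D"]
chosen = ["A", "B", "C"]
for n in (1, 2, 3):
    print(f"T-{n}", f"{coverage(forms, chosen, n):.4f}")
```

As a consistency check on the tables themselves, the Global T-1 value of 0.0068 agrees with the 49 global pattern matches of 1-Initial out of 7173 sequences.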
This run uses the T=taketurns strategy for identifying further variants. Each continent, in turn, chooses the next variant: the most common variant in that continent that has not already been chosen. The order of the continents is based on the number of samples available in each continent (largest first).
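The take-turns selection can be sketched in Python as below. This is an illustration under stated assumptions and not SHIVER's own code: sequences are assumed to be reduced to hashable forms per continent, ties are broken by first appearance, and a continent with nothing new to offer skips its turn. The sketch also omits the initial reference form, which the Table of Variants lists first, before any continent's turn. The function name `take_turns` is hypothetical.

```python
from collections import Counter


def take_turns(forms_by_continent: dict[str, list[str]], n_variants: int) -> list[str]:
    """Pick forms round-robin: each continent, in turn, adds its most
    common form that has not been chosen yet."""
    # Continents ordered by sample count, largest first.
    order = sorted(forms_by_continent,
                   key=lambda c: len(forms_by_continent[c]), reverse=True)
    counts = {c: Counter(forms_by_continent[c]) for c in order}
    chosen: list[str] = []
    while len(chosen) < n_variants:
        progressed = False
        for continent in order:
            if len(chosen) >= n_variants:
                break
            # most_common() is sorted by count; ties keep first-seen order.
            for form, _ in counts[continent].most_common():
                if form not in chosen:
                    chosen.append(form)
                    progressed = True
                    break
        if not progressed:  # every form everywhere has already been chosen
            break
    return chosen


# Toy data (made up): forms observed per continent.
data = {
    "North-America": ["A", "A", "A", "B", "C"],
    "Asia": ["A", "A", "B"],
    "Africa": ["D"],
}
print(take_turns(data, 4))
```

In the toy data, Asia's most common form has already been taken by North-America, so Asia contributes its second most common form instead; this is how the strategy surfaces forms that are locally important but globally rare.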
Sequence sample dates range from 2024-02-08 to 2024-03-06. The number of sequences, broken out by continent, is: Total: 7173, North-America: 5406, Asia: 564, Europe-minus-United-Kingdom: 516, Oceania: 328, United-Kingdom: 272, South-America: 72, Africa: 15.
The focus here is specifically on the epitope region: NTDss-18-142+RBD+furin
Sites: 13-17,19,20,140,141,143-158,242-264,330-521,655,675,677,679,681,950
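The site specification above is a comma-separated list of single sites and inclusive ranges. A minimal Python sketch for expanding it into individual site numbers follows; the function name `expand_sites` is hypothetical, and treating each range as inclusive at both ends is an assumption based on the region definitions earlier in this document.

```python
def expand_sites(spec: str) -> list[int]:
    """Expand a spec such as '13-17,19,20' into individual site numbers.

    Ranges are treated as inclusive at both ends.
    """
    sites: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            sites.extend(range(int(lo), int(hi) + 1))
        else:
            sites.append(int(part))
    return sites


SPEC = "13-17,19,20,140,141,143-158,242-264,330-521,655,675,677,679,681,950"
region_sites = expand_sites(SPEC)
print(len(region_sites), 18 in region_sites, 142 in region_sites)
```

The expanded list contains 246 sites and, as intended, omits the toggling sites 18 and 142.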
[*] Note that the UK is treated as a separate continent because so much of the sequencing has been from the UK.