'New Variant' Covid 19 - Changes in the SARS-CoV-2 spike protein caused by mutations
As time has passed, a number of changes have been recorded in the amino acid sequences of the virus proteins, especially in the spike protein that projects from the surface.
There are also genetic differences in the virus from mink, reported from Denmark and the Netherlands, as well as developments in a number of other countries, including Mexico and South Africa.
These have been described as mutants, new variants and simply new strains. There is concern about their infectivity and the possibility that they may be less susceptible to control by vaccines.
Attention has been focussed on antibodies to the spike protein as well as the action of T-cells. Groups of mutations have been used to describe and monitor the movement of the virus through populations and geographical areas.
There is a convention that a substitution is written as the original amino acid (as a single-letter code), followed by its location in the sequence and then the replacement amino acid. For example, D614G means that aspartic acid (D) at position 614 has been replaced by glycine (G).
Sometimes it is accompanied by a similar code showing the change in bases (A, C, G, T, because sequence data for this RNA virus are usually written in DNA form) and its location within the genome, which is generally a larger number.
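The naming convention can be sketched in a few lines of Python. This is only an illustration: the names `SpikeChange` and `parse_change` are invented here, and the pattern covers just the two forms used on this page (a substitution such as Y453F and a deletion such as H69del).

```python
import re
from typing import NamedTuple, Optional


class SpikeChange(NamedTuple):
    """One amino acid change (hypothetical helper type for this sketch)."""
    original: str                # one-letter code of the original amino acid
    position: int                # location in the polypeptide chain
    replacement: Optional[str]   # one-letter code, or None for a deletion


# Original letter, position digits, then a replacement letter or "del".
_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z]|del)$")


def parse_change(code: str) -> SpikeChange:
    """Split a code such as 'Y453F' or 'H69del' into its three parts."""
    match = _PATTERN.match(code)
    if match is None:
        raise ValueError(f"unrecognised mutation code: {code!r}")
    original, position, replacement = match.groups()
    return SpikeChange(original, int(position),
                       None if replacement == "del" else replacement)


for code in ["Y453F", "I692V", "M1229I", "H69del", "V70del"]:
    print(code, "->", parse_change(code))
```

A deletion is stored with `replacement` set to `None`, so later code can tell the two kinds of change apart without re-reading the text.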
"Cluster 5" is a mutated variant of the SARS-CoV-2 virus, discovered in Northern Jutland, Denmark. It is believed to have been spread from minks to humans via mink farms.
Its spike protein changes are Y453F, I692V and M1229I, as well as two amino acid deletions, H69del/V70del.
The first three are substitutions, and the deletion of 6 bases from 21765-21770 caused the loss of two amino acids from the polypeptide chain.
Cluster 5 substitutions (and base code changes, using the flexible base notation for triplet degeneracy)
- Y453F is a substitution of phenylalanine (F) for tyrosine (Y) at position 453 [base code switched from UAY to UUY - a transversion]
- I692V is a substitution of valine (V) for isoleucine (I) at position 692 [base code switched from AUY to GUY - a transition]
- M1229I is a substitution of isoleucine (I) for methionine (M) at position 1229 [base code switched from AUG to AUY - a transversion]
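The terms transition and transversion used above depend only on the chemical class of the two bases: A and G are purines, C and U (or T) are pyrimidines. A minimal sketch, with the function name `classify_base_change` invented for this example:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def classify_base_change(old: str, new: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change.

    T is treated as U so that DNA-style and RNA-style letters both work.
    """
    old = old.upper().replace("T", "U")
    new = new.upper().replace("T", "U")
    for base in (old, new):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a single base: {base!r}")
    if old == new:
        raise ValueError("the two bases are identical, so nothing changed")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
    # is a transition; a switch of class is a transversion.
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


print("A -> U:", classify_base_change("A", "U"))  # second base of UAY -> UUY
print("A -> G:", classify_base_change("A", "G"))  # first base of AUY -> GUY
```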
Spike deletion 21765-21770
(Partial) Base and amino acid sequence
A comparison of sections of (DNA) base sequences between reference sequence Wuhan-Hu-1 (at the top, in blue) and deletion 21765-21770 (in green).
The loss of 6 bases leaves a triplet of ATC (bases 21764, 21771 and 21772 in the 'old numbers'), which still codes for isoleucine (the 'same amino acid'), so the chain is simply shortened by 2 amino acids.
What are the two lost amino acids?
> H histidine
> V valine
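The effect of the deletion can be reproduced on a short fragment. In this sketch the 12-base fragment starting at base 21764 is an assumption (written from the reference as I understand it, and consistent with the ATC triplet described above), so check it against Wuhan-Hu-1 before relying on it. The helper names are invented for the example.

```python
# Standard genetic code, written with DNA letters in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate complete triplets, reading from the first base."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def delete_bases(fragment: str, fragment_start: int, first: int, last: int) -> str:
    """Remove genome positions first..last (inclusive) from a fragment."""
    lo = first - fragment_start
    hi = last - fragment_start + 1
    if lo < 0 or hi > len(fragment) or lo >= hi:
        raise ValueError("deletion range lies outside the fragment")
    return fragment[:lo] + fragment[hi:]


FRAGMENT_START = 21764               # first base of spike codon 68
REFERENCE_FRAGMENT = "ATACATGTCTCT"  # ASSUMED bases 21764-21775 (codons 68-71)

shortened = delete_bases(REFERENCE_FRAGMENT, FRAGMENT_START, 21765, 21770)
print(REFERENCE_FRAGMENT, translate(REFERENCE_FRAGMENT))  # ATACATGTCTCT IHVS
print(shortened, translate(shortened))                    # ATCTCT IS
```

Because six bases (two whole triplets' worth) are removed, the reading frame is preserved: only H and V disappear, and the isoleucine before them survives with a different third base.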
I have taken the liberty of annotating base sequence data and amino acid listings to show the effects of that deletion.
- Just look for the lines I have inserted between the original data!
Other mutants - rather a saga
The D614G mutation is an allele causing a modification of the virus' surface spike protein, and it has become increasingly common. The notation means that the 614th amino acid in the polypeptide chain is changed from aspartic acid (aspartate, one-letter amino acid code D) to glycine (G). This is likely to be the result of a change (A to G) in the middle base of a codon in the viral RNA (GAY to GGY). Viruses carrying this mutation are said to show greater transmissibility in humans rather than greater pathogenicity.
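The claim about the middle base can be checked by brute force: try every single-base change in every aspartic acid codon and keep those that give glycine. This is a sketch using the standard genetic code; `single_base_routes` is a name made up for the example.

```python
# Standard genetic code, written with RNA letters in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def single_base_routes(aa_from: str, aa_to: str) -> list:
    """List (old codon, new codon, base position 1-3) for every
    one-base change that turns aa_from into aa_to."""
    routes = []
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != aa_from:
            continue
        for position in range(3):
            for base in BASES:
                if base == codon[position]:
                    continue  # not a change
                mutated = codon[:position] + base + codon[position + 1:]
                if CODON_TABLE[mutated] == aa_to:
                    routes.append((codon, mutated, position + 1))
    return routes


for old, new, position in single_base_routes("D", "G"):
    print(f"{old} -> {new} (base {position} of the codon)")
```

For D to G this prints two routes, GAU to GGU and GAC to GGC, and both alter base 2, which is why the text can say "the middle base" without knowing which of the two codons the virus uses.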
The BBC states that a strain carrying A222V spread across Europe and was linked to summer holidays in Spain.
What is the change in amino acid, and the base sequence? (Use the table above)
Amino acid 222 changed from > alanine to > valine, triplet changed from > GCN to > GUN
More recent versions, a cause of concern because of apparently greater (70%?) transmissibility, include a number of mutations known from other parts of the world.
- N501Y (substitution of tyrosine for asparagine at 501 - triplet change AAY to UAY) changes one of six key contact residues within the receptor-binding domain (RBD) of the spike protein. It has been identified as increasing binding affinity to the target protein ACE2 in lungs.
- The spike deletion 69-70del - See Spike deletion 21765-21770 above - may make the virus more resistant to the human immune response. It has also occurred a number of times in association with other RBD changes.
- Mutation P681H (substitution of histidine for proline at 681 - triplet change CCN to CAY) is immediately adjacent to the furin cleavage site, which enables the virus to enter the host cell easily for infection, thus efficiently aiding its spread throughout the human population.
Another version of this, P681R, also involves a substitution at position 681, in this case proline to arginine.
- Mutation E484K (substitution of lysine for glutamic acid at 484 - triplet change GAR to AAR) is feared because it may defeat some antibodies (formed after vaccination or infection by other strains). This has been followed by E484Q (giving the amino acid glutamine - triplet CAR), which may be of even more concern.
- L452R (substitution of arginine for leucine at 452 - triplet change CUN to CGN) may also increase the effectiveness of some emerging variants.
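The flexible base letters in the triplets above (R, Y, N) follow the IUPAC ambiguity codes: R is A or G, Y is C or U, H is A, C or U, and N is any base. A short sketch that expands each degenerate triplet and reports the amino acid it encodes, which is a quick way to check the direction of a triplet change. The function names are invented for the example, and only the ambiguity letters used on this page are included.

```python
from itertools import product

# Standard genetic code, written with RNA letters in U, C, A, G order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Subset of the IUPAC ambiguity codes (RNA letters).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "H": "ACU",   # not G
    "N": "ACGU",  # any base
}


def expand(code: str) -> list:
    """All concrete codons matched by a degenerate triplet such as 'GAR'."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in code.upper()))]


def encoded_amino_acids(code: str) -> set:
    """The set of one-letter amino acids encoded by a degenerate triplet."""
    return {CODON_TABLE[codon] for codon in expand(code)}


for code in ["AAY", "UAY", "CCN", "CAY", "GAR", "AAR", "CAR", "CUN", "CGN"]:
    print(code, "->", "/".join(sorted(encoded_amino_acids(code))))
```

A triplet code is only a safe shorthand when the set has one member; a code such as AUR would give two amino acids (I and M), so it cannot stand for methionine alone.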
Virus strains with different combinations of mutations have been given descriptions such as 'Variant of interest', 'Variant under investigation', or 'Variant of concern'. These are changing all the time.
Variant of Concern 202012/01 (VOC-202012/01) has 23 mutations: 14 changes to protein-coding codons that alter an amino acid, 3 deletions, and 6 'synonymous' mutations that code for the same amino acids owing to the degeneracy of the genetic code. In other words, 17 mutations change proteins and 6 do not.
Some countries have complained that the names given to virus strains are being used in a critical way. On 31 May 2021 the WHO named the most recent cause for concern ('the Indian strain') the Delta variant; it has effectively replaced a previous version ('the Kent variant'), now known as the Alpha variant.