'New Variant' Covid 19 - Changes in the SARS-CoV-2 spike protein caused by mutations
This virus came to our attention towards the end of 2019 (hence the name Covid 19). The sequence of the virus's RNA, together with the proteins it codes for, was published in early 2020, and this was quickly used in the development of mRNA vaccines.
Since then, a number of changes in the amino acid sequences of the virus's proteins have been recorded in viruses circulating in the population, especially in the spikes projecting from the surface.
Other genetic differences have been reported in the virus from mink in Denmark and the Netherlands, and there have been developments in a number of other countries, including Mexico and South Africa.
These have been described as mutants, new variants and simply new strains. There is concern about their infectivity and the possibility that they may be less susceptible to control by vaccines.
Attention has been focussed on antibodies to the spike protein as well as the action of T-cells. Groups of mutations have been used to describe and monitor the movement of the virus through populations, and geographical areas.
By convention, a substitution is written as the original amino acid (as a single-letter code), followed by its location in the sequence and then the replacement amino acid.
Sometimes it is accompanied by a similar code showing the change in the DNA bases (A, C, G, T) and their location within the genome, which is generally a larger number.
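The convention is regular enough to be read by a few lines of code. The following is only a minimal sketch in Python: the names parse_substitution and describe are my own, and it deals only with simple substitutions such as Y453F, not with deletions such as H69del.

```python
import re
from typing import NamedTuple

# Standard one-letter amino acid codes.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}


class Substitution(NamedTuple):
    original: str      # amino acid in the reference sequence
    position: int      # location in the polypeptide chain
    replacement: str   # amino acid in the variant


# One letter, then digits, then one letter, e.g. "Y453F".
PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_substitution(code: str) -> Substitution:
    """Split a code such as 'Y453F' into its three parts."""
    match = PATTERN.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a simple substitution: {code!r}")
    original, position, replacement = match.groups()
    for letter in (original, replacement):
        if letter not in AMINO_ACIDS:
            raise ValueError(f"unknown amino acid code: {letter!r}")
    return Substitution(original, int(position), replacement)


def describe(code: str) -> str:
    """Spell a substitution out in words."""
    sub = parse_substitution(code)
    return (f"{AMINO_ACIDS[sub.replacement]} ({sub.replacement}) replaces "
            f"{AMINO_ACIDS[sub.original]} ({sub.original}) "
            f"at position {sub.position}")


print(describe("D614G"))
```

The same function can be applied to any of the substitution codes quoted on this page.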
"Cluster 5" is a mutated variant of the SARS-CoV-2 virus, discovered in Northern Jutland, Denmark. It is believed to have been spread from minks to humans via mink farms.
Its spike protein mutations include Y453F, I692V and M1229I, as well as two amino acid deletions, H69del/V70del.
The first three are substitutions, and the deletion of 6 bases from 21765-21770 caused the loss of two amino acids from the polypeptide chain.
Cluster 5 Substitutions (and base code changes - using the flexible base notation for triplet degeneracy)
Y453F is a substitution of phenylalanine (F) for tyrosine (Y) at position 453
[base code switched from UAY to UUY]
I692V is a substitution of valine (V) for isoleucine (I) at position 692
[base code switched from AUY to GUY]
M1229I is a substitution of isoleucine (I) for methionine (M) at position 1229
[base code switched from AUG to AUY - a transversion]
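The flexible base notation can be checked by expanding each degenerate triplet into the ordinary codons it stands for and translating them. This is a rough sketch in Python, not a definitive tool: the function names are my own, and only the ambiguity letters needed on this page (R = A or G, Y = C or U, H = A, C or U, N = any base) are included.

```python
from itertools import product

# Standard genetic code, built from the usual U, C, A, G table order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Ambiguity letters used in the flexible base notation (a subset only).
AMBIGUITY = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG",    # purine
    "Y": "CU",    # pyrimidine
    "H": "ACU",   # not G
    "N": "ACGU",  # any base
}


def expand(triplet: str) -> list:
    """List every ordinary codon covered by a degenerate triplet."""
    return ["".join(p) for p in product(*(AMBIGUITY[b] for b in triplet))]


def amino_acids(triplet: str) -> set:
    """Return the set of amino acids coded for by a degenerate triplet."""
    return {CODON_TABLE[codon] for codon in expand(triplet)}


# Each pair is (degenerate triplet, amino acid it is said to code for).
for triplet, expected in [("UAY", "Y"), ("UUY", "F"), ("AUY", "I"), ("GUY", "V")]:
    print(triplet, expand(triplet), amino_acids(triplet) == {expected})
```

A triplet is a safe shorthand for an amino acid only if amino_acids returns a single letter for it.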
Spike deletion 21765-21770
(Partial) Base and amino acid sequence
A comparison of sections of (DNA) base sequences between reference sequence Wuhan-Hu-1 (at the top, in blue) and deletion 21765-21770 (in green).
The loss of 6 bases gives a triplet of ATC (bases 21764-21771-21772 in 'old numbers') still coding for an isoleucine ('same amino acid'), so the chain is simply shortened by 2 amino acids.
What are the two lost amino acids?
H (histidine) and V (valine)
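The effect of the deletion can also be worked through in a few lines of Python. This is only a sketch under stated assumptions: I have assumed that the spike gene starts at base 21563 in Wuhan-Hu-1 numbering, and that codons 68 to 70 read ATA CAT GTC in the reference sequence. Neither figure is given in the text above, so check them against the reference data before relying on them.

```python
# Standard genetic code, written with DNA letters (T instead of U).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

SPIKE_START = 21563  # ASSUMPTION: first base of the spike gene in Wuhan-Hu-1


def codon_start(codon_number: int) -> int:
    """Genome position of the first base of a spike codon (1-based)."""
    return SPIKE_START + 3 * (codon_number - 1)


def translate(dna: str) -> str:
    """Translate a DNA string, read in triplets from its first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def delete(dna: str, first_position: int, del_from: int, del_to: int) -> str:
    """Remove genome positions del_from..del_to (inclusive) from a fragment
    whose first base sits at genome position first_position."""
    start = del_from - first_position
    end = del_to - first_position + 1
    return dna[:start] + dna[end:]


fragment_start = codon_start(68)   # works out as 21764
reference = "ATACATGTC"            # ASSUMPTION: codons 68-70 of the reference
shortened = delete(reference, fragment_start, 21765, 21770)

print(translate(reference))   # amino acids before the deletion
print(shortened, translate(shortened))   # what is left afterwards
```

Under these assumptions the three reference codons give I, H and V, and the six-base deletion leaves the single triplet ATC, which still codes for isoleucine.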
I have taken the liberty of annotating base sequence data and amino acid listings to show the effects of that deletion - just look for the lines I have inserted between the original data!
In fact this deletion is responsible for an effect known as S-gene dropout or S-gene target failure.
When testing for possible coronavirus infection using PCR, it is conventional to use kits targeting a limited number of virus genes. If some of these are detected but the S-gene is not, this is strongly indicative of a variant of concern, such as the Omicron variant, and full viral genome sequencing is called for.
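The rule for spotting S-gene dropout can be put as a simple test on the list of gene targets. This is an illustrative sketch only: the function name is my own, and the target names "N" and "ORF1ab" are assumptions standing in for whatever other genes a particular kit detects.

```python
from typing import Mapping


def s_gene_target_failure(detected: Mapping[str, bool]) -> bool:
    """Return True if the S-gene target is missing while at least one
    other gene target in the same test is detected."""
    if "S" not in detected:
        # The kit does not target the S-gene, so dropout cannot be judged.
        return False
    others_detected = any(hit for gene, hit in detected.items() if gene != "S")
    return others_detected and not detected["S"]


# Hypothetical results; the gene names other than "S" are assumptions.
print(s_gene_target_failure({"N": True, "ORF1ab": True, "S": False}))
print(s_gene_target_failure({"N": True, "ORF1ab": True, "S": True}))
print(s_gene_target_failure({"N": False, "ORF1ab": False, "S": False}))
```

Only the first of these three results would be flagged for full genome sequencing: the second is an ordinary positive and the third is a negative test.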
Other mutants - rather a saga
The D614G mutation causes a modification of the virus's surface spike protein, and it has become increasingly common. This notation means that the 614th amino acid in its polypeptide chain is altered from being aspartate - aspartic acid - (one-letter amino acid code D) to glycine (G). This is likely to be the result of a change (A to G) in the middle base of a codon in the viral RNA. It is said that viruses carrying this mutation appear to have greater transmissibility in humans rather than greater pathogenicity.
The BBC stated that a strain carrying A222V spread across Europe and was linked to summer holidays in Spain.
What is the change in amino acid, and the base sequence? (Use the table above)
Amino acid 222 changed from alanine to valine; triplet changed from GCN to GUN
A more recent version, a cause of concern because of apparently greater (70% ?) transmissibility, includes a number of mutations known from other parts of the world.
- N501Y (substitution of tyrosine for asparagine at 501 - triplet change AAY to UAY) codes for changes in one of six key contact residues within the receptor-binding domain (RBD) of the spike protein.
It has been identified as increasing binding affinity to the target protein ACE2 in lungs.
- The spike deletion 69-70del - See Spike deletion 21765-21770 above - may make the virus more resistant to the human immune response. It has also occurred a number of times in association with other RBD changes.
- Mutation P681H (substitution of histidine for proline at 681 - triplet change CCN to CAY) is immediately adjacent to the furin cleavage site, which enables the virus to enter the host cell easily for infection, thus aiding its spread throughout the human population.
Another version of this, P681R, also involves a substitution at position 681, in this case proline to arginine.
- Mutation E484K (substitution of lysine for glutamic acid at 484 - triplet change GAR to AAR) is feared because it may defeat some antibodies (formed after vaccination or infection by other strains). This has been followed by E484Q (giving the amino acid glutamine - triplet CAR), which may be of even more concern.
- L452R (substitution of arginine for leucine at 452 - triplet change CUN to CGN) may also increase the effectiveness of some emerging variants.
Virus strains with different combinations of mutations have been given descriptions such as 'Variant of interest', 'Variant under investigation', or 'Variant of concern'. These are changing all the time.
Variant of Concern 202012/01 (VOC-202012/01) has 23 mutations: 14 changes to protein-coding codons, 3 deletions, and 6 'synonymous' mutations that code for the same amino acids due to degeneracy of the genetic code, i.e. there are 17 mutations that change proteins and six that do not.
Some countries have complained about names given to virus strains being used in a critical way. On 31 May 2021 the WHO named the most recent cause for concern ('the Indian strain') the Delta variant. It has effectively replaced a previous version ('the Kent variant'), now known as the Alpha variant.
The Delta Plus variant also has the K417N substitution (asparagine instead of lysine in the spike protein), although this has also been found in other variants.
A further development (19/10/21) of this, AY.4.2, has two mutations: A222V and Y145H are changes to the spike protein which may give it 10% more transmissibility.
These are alanine to valine (triplet change GCN to GUN) and tyrosine to histidine (triplet change UAY to CAY) substitutions.
This has been followed by the Omicron variant - B.1.1.529 (which was first identified in South Africa).
Issues of concern include increased transmissibility and disease severity of infections, potential 'immune escape' (not covered by vaccinations), and diagnostic or therapeutic escape (testing and caring problems).
Additionally, community transmission or multiple COVID-19 clusters are expected.
More recently (March 2023) other variants have been noted:
BQ.1 and BQ.1.1 contain mutations to the spike protein, which projects from the surface of the virus and allows it to attach to and infect cells.
These mutations include changes to amino acids which are fairly close to each other:
K444T - substitution of lysine with threonine - a change to the middle base of the triplet,
N460K - substitution of asparagine with lysine - a change to the third base, from AAY to AAR (a transversion),
L452R - leucine to arginine - another middle base change,
F486V - phenylalanine replaced by valine - a first base change
and
R346T - arginine to threonine - yet another middle base change
These changes have also been associated with significant immune escape and antibody evasion, meaning that they are less well controlled by antibodies from previous infections and vaccination.
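The statements about which base of the triplet has changed in each of these substitutions can be checked by trying every single-base change on every codon for the original amino acid. The following Python sketch does this using the standard genetic code. The function name is my own, and it lists the changes that are possible in principle, not the change actually found in any particular virus genome.

```python
# Standard genetic code, built from the usual U, C, A, G table order.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}


def single_base_changes(aa_from: str, aa_to: str) -> list:
    """List (old codon, new codon, base position 1-3, kind) for every
    single-base change that turns aa_from into aa_to."""
    changes = []
    for codon, amino in CODON_TABLE.items():
        if amino != aa_from:
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if CODON_TABLE[mutant] == aa_to:
                    # Purine <-> purine or pyrimidine <-> pyrimidine is a
                    # transition; anything else is a transversion.
                    same_class = (codon[pos] in PURINES) == (base in PURINES)
                    kind = "transition" if same_class else "transversion"
                    changes.append((codon, mutant, pos + 1, kind))
    return changes


for code in ["K444T", "N460K", "L452R", "F486V", "R346T"]:
    positions = sorted({c[2] for c in single_base_changes(code[0], code[-1])})
    print(code, "- base position(s) that can change:", positions)
```

For the five substitutions listed, this gives the second base for K444T, L452R and R346T, the third base for N460K (always a transversion) and the first base for F486V.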
In fact there are many lists of different 'mutations of interest' - including one (Omicron XBB.1.5) with several changes to amino acids, two of them to adjacent amino acids: L455F and F456L, which surprisingly represent opposite changes (leucine to phenylalanine and phenylalanine to leucine).