Introduction

The present report highlights recent knowledge about the different variants of the causative agent of coronavirus disease 2019 (COVID-19), severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), throughout 2021, with an emphasis on those variants that could further compromise the critical public health scenario in different parts of the globe. The focus of the present report is (1) the multiple structural features of the spike (S) protein, invariably present in the envelope of all viral variants; (2) understanding how variations in this protein can potentially decrease the protection offered by vaccines and therapeutic measures, such as monoclonal antibodies; and (3) highlighting the most recent knowledge about Omicron’s mutations.

COVID-19

COVID-19, a disease caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first reported as a case of pneumonia in December 2019 in Wuhan, China; since then, it has progressed from a local outbreak to an unprecedented pandemic. Two other outbreaks caused by members of the Coronaviridae family have occurred in the last two decades: severe acute respiratory syndrome (SARS), caused by SARS-CoV in 2002 [1], and Middle East respiratory syndrome (MERS), caused by MERS-CoV in 2012 [2, 3]. Even though its mortality rate is lower [4] than that of the previous viruses, the virus that emerged in 2019 showed a major increase in infectivity [2, 4, 5].

Past vaccine knowledge and advances in technology have helped to accelerate the already well-established process of vaccine development [6]. The response to the disease outbreak was similar to that which followed the 2009–2010 H1N1 pandemic, for instance, but the number of vaccine developers was 30–50 times greater [6], and the technology had matured over the intervening 10 years. The funding allocated worldwide for COVID-19 testing and vaccination is unprecedented, amounting to several billion dollars across all stages of development [7]. Additionally, even though cooperation between governmental funders and research institutes or international alliances had been seen with other vaccines, COVID-19 exceeded previous cooperation levels, leading to rapid information sharing among public and private partners, providing risk sharing, and improving the efficiency of resource use [7]. Moreover, at the final stages of development, the high number of infected people helped to demonstrate the effectiveness of the vaccines, which is challenging with diseases of low prevalence. The clinical trials were not scaled down [6], and some efficacy and safety studies for COVID-19 vaccines included more than 40,000 subjects [8,9,10,11,12,13,14,15]. For comparison, trials of other approved vaccines included approximately 5000 subjects for Haemophilus influenzae [16], approximately 1000 for varicella [17], and approximately 40,000 for hepatitis A [18, 19]. Fast regulatory review and approval processes were also important, as the approval criteria in multiple countries were the same, and several documents were issued to clarify the licensure pathway [6].

Fast information sharing has been widely discussed by scientists in recent years, since it helps researchers gather information quickly and disseminates results relevant to health care and public policy at an urgent time. However, it might also spread biased data and questionable results that have not undergone peer review [7]. Even though the scientific community might understand the importance of treating this kind of information with caution and checking for subsequent publication of the results, the mainstream and social media might misinterpret them, generating fake news [20, 21].

The symptoms of this infection may range from inapparent or mild cases to respiratory insufficiency and sepsis, with the major symptoms including fever, dry cough, and difficulty breathing [22,23,24]. Additionally, host genetics might be key to understanding the high dissimilarity of symptoms between patients [25]. The disease is associated with an aggressive immune response that may damage the airways. As a result, treatment of severe cases must consider controlling the infection itself as well as managing the host response, since it might evolve into a cytokine storm, which leads to multiple organ failure and death [26]. Minimizing secondary bacterial or fungal infections is also very important [26]. Disease transmission occurs mainly through contact with infected people. The incubation period can range between 0 and 14 days after virus contact and is usually between 5 and 6 days [27,28,29]. The viral load is usually high at the beginning of symptoms but is similar in patients with or without symptoms, suggesting that even asymptomatic patients can be contagious and emphasizing the relevance of preventive measures [30].

Host genetics may be important for understanding the large range of symptoms between patients, but researchers [2] have also reported several host genes that are critical to viral infection, such as ACE2, RAB7A and four members of the ARP2/3 complex, which are crucial for attachment and endocytosis, or CTSL and 13 vacuolar-ATPase proton pumps, which are important for spike protein cleavage and viral membrane fusion. From invasion to the exit of the infected cell, they are mainly involved in the initial binding, endocytosis and cleavage of the spike protein but also play roles in fusion with the viral membrane, endosome, Golgi apparatus, and transcriptional modulators. These genes are important regardless of viral load and do not strongly depend on the cell type or tissue. Nonetheless, among the main genes found, only the ACE2 receptor demonstrated tissue-specific expression, with increased expression in the testicles, small intestine, kidneys, and heart. This expression might help to explain the viral tropism (as S interacts primarily with ACE2), and more studies should be performed to better understand the potential damage caused by COVID-19 in these tissues.

Some evidence has shown the possibility of long-term, nonproductive persistence of SARS-CoV-2 infection in tissues, such as epithelial, myeloid, and neural cells [31]. The persistence might also be related to mutational events in the virus, as reported by a study analyzing a variant (GZ69) isolated from an asymptomatic individual, which showed an unprecedented capability of replication in Vero E6 cells in the absence of any evident cytopathic effect. This strain might favor cell survival and, eventually, viral persistence due to mutations at residues 203 and 204 of the N protein [32]. This evidence, along with other reports [33,34,35], suggests that prolonged viral shedding and viral persistence might be present in COVID-19.

A nasopharyngeal or oropharyngeal swab followed by quantitative real-time reverse transcriptase-polymerase chain reaction (qRT–PCR) is the standard procedure for diagnosis [36,37,38,39,40]. Some immunochromatography assays and other technologies, such as reverse transcription loop-mediated isothermal amplification (RT-LAMP) [41,42,43], can be used as complements.

According to data published by the World Health Organization (WHO) on February 25, 2022 [44], 146 vaccines are in clinical development. Of these, 33% use a protein subunit platform, and 24% use RNA or mRNA to elicit an immune response; a great number of them rely on the structure or sequence of the spike protein of SARS-CoV-2, including the Pfizer/BioNTech® and Moderna® vaccines [45, 46].

SARS-CoV-2 biology

SARS-CoV-2 is an enveloped virus with a viral particle of 60–140 nm [47, 48] in diameter and a single positive-strand RNA ((+) ssRNA) genome of approximately 26–33 kb in length [49], and it may depend on host factors at all stages of its viral cycle. It belongs to the order Nidovirales, suborder Cornidovirineae, family Coronaviridae, subfamily Orthocoronavirinae, genus Betacoronavirus, and subgenus Sarbecovirus [50]. The coronavirus family is so named because its spiculated glycoprotein envelope resembles a solar corona [51].

The virus has four main structural proteins: spike protein (S), membrane (M), envelope (E), and nucleocapsid protein (N) [52]. The M protein is the most abundant and, in addition to defining the shape of the viral envelope, anchors the envelope to the nucleocapsid [52, 53]. Protein E, which is the smallest structural protein, has viroporin activity and forms an ion channel [52, 54]. Both the E and M proteins are essential to the assembly and release of virions from host cells [52]. The N protein is the only protein that binds directly to viral RNA and has a high diagnostic value [52, 55].

The spike protein (S) shapes trimers in the viral envelope and has a crucial function in fusion with the host cell and viral pathogenicity. Its N-terminal part has an external globular domain called S1, where the receptor-binding domain (RBD) is also located. The C-terminal part, called S2, forms the spicule rod and includes the membrane fusion peptide (FP) [49, 52]. The structure of the spike protein of a D614G mutant at pH 5.5 (PDB: 6XM0) is shown in Fig. 1, constructed with PyMOL 2.5.2® (The PyMOL Molecular Graphics System [56]) and Biorender® [57]. Different from other external fusion proteins, the SARS-CoV-2 spike protein is very flexible and articulated in at least three points, improving the scanning capacity over the human cell and multiple binding sites [58].

Fig. 1

A Side view of the S protein demarcating the S1 and S2 domains and subdomains. B SARS-CoV-2 spike protein structure of the D614G mutant viral strain (PDB: 6XM0) constructed with PyMOL® and BioRender®. The colors refer to the secondary structures: alpha helices in cyan, beta sheets in purple, and turns in magenta. C Top view of the trimeric protein with the receptor binding domain (RBD) demarcated on the surface

The receptor binding domain (RBD), located in the S1 part of protein S, has approximately 220 residues and is stabilized by eight disulfide bridges and two N-glycosylation sites (N331 and N343). It has a potential role in protein folding, dynamics, stability, and accessibility to the receptor and is composed mainly of random structures (35.6%) and β-sheets (33%), followed by turns (19.1%) and alpha helices (12.4%) [59].

The RBD is also subdivided into two subdomains: the core, rich in nonpolar residues, and the receptor binding motif (RBM), corresponding to residues 443–503 of the S protein, which is mostly polar. Specifically, the RBM mediates angiotensin-converting enzyme 2 (ACE2) receptor binding and must maintain sufficient affinity for the receptor for entry into the host cell to be effective [59]; however, this region is highly targeted by neutralizing antibodies, making it a crucial location for mutations that promote viral immune evasion. It has been demonstrated [60] that the RBM might have variability and a high degree of plasticity, tolerating variations in its sequence without losing the ability to bind ACE2. Additionally, A475 within the RBM seems to be a crucial residue for the interaction between S and ACE2 [61].

The viral entry process and the replication mechanism of SARS-CoV-2 will be described in the next paragraphs. Figure 2 provides a visual guide and might be helpful to better understand the process.

Fig. 2

Role of spike protein in the SARS-CoV-2 infection mechanism. 1 The S protein is used by SARS-CoV-2 to interact with the host cell receptor. The host cell has several different receptors and polysaccharides in its membrane. In this step-by-step figure of the cycle, we focus on the major receptor used, which is the ACE2 receptor, but SARS-CoV-2 may also interact with other cell receptors. 2 Protein S attacks its target cells: the RBM (in the RBD) interacts with the ACE2 receptor. In this process, A475 and F486 in the RBM seem to be the key residues for binding. In this step, it is important to note two additional host factors important for viral entry: the proteases furin and TMPRSS2. 3 Interaction between HR1 and HR2: binding causes conformational changes in S2, where the HR1 and HR2 motifs interact to start the formation of the six-helix bundle. 4 Proteolytic activation of spike: after S-ACE2 binding, proteolytic activation of spike occurs when these proteases cleave the protein, exposing the fusion peptide (FP). While furin cleaves the polybasic cleavage site (PRRAR) between S1 and S2, TMPRSS2 cleaves an S2 site. 5 Approximation between viral and host membranes: the FP, together with the six-helix bundle, brings the viral and host membranes together, leading to fusion and viral entry. 6 Fusion and viral entry: fusion of the membranes releases the viral RNA into the cytoplasm of the host cell. 7 Translation of replicase genes in the viral RNA genome: since the RNA of SARS-CoV-2 is a (+) ssRNA, it can be translated directly into polyproteins: ORF1a is translated into pp1a, which is further cleaved into nsps 1 to 10, and, through a −1 ribosomal frameshift, ORF1a and ORF1b are translated into pp1ab, which generates nsps 11 to 16. 8 Replicase-transcriptase complex: the viral RNA-dependent RNA polymerase (RdRp), composed of nsp12, nsp7, and nsp8, along with other nsps, assembles at the rough endoplasmic reticulum (ER) into a membrane-bound ribonucleoprotein complex, the so-called replicase-transcriptase complex (RTC), which directs replication, transcription, and maturation of the viral genome and subgenomic mRNAs. 9 Subgenomic transcription: discontinuous transcription of the viral genome produces several (−) subgenomic RNAs (sgRNAs) from the structural protein genes and ORF3 and ORF6 to 9, each combining a varying length of the 3′ end of the genome with the 5′ leader sequence. 10 Replication: the full-length (−) RNA is also used as a template to replicate the (+) ssRNA of the virus that will form the new virion. 11 Structural protein synthesis: the (−) sgRNAs are used as templates to synthesize subgenomic (+) mRNAs, which are then translated into the M, S, E, and N proteins and processed as needed. 12 Encapsulation: after translation, the structural proteins are inserted into the endoplasmic reticulum and continue to the ER-Golgi intermediate compartment, where the replicated (+) ssRNA interacts with the N protein, forming the nucleocapsid that is then enveloped. 13 Transport and exocytosis: the virions are transported by vesicles and released by exocytosis

Protein S attacks target cells, such as nasal epithelial cells, bronchial cells, and pneumocytes, by binding to ACE2. The RBM-ACE2 interaction leads to proteolytic activation of the S protein, exposing the fusion peptide and leading to the adoption of a more favorable conformation with epistatic regions, which results in viral entry into the host cell. This cleavage is usually performed by host proteases: TMPRSS2, which causes protease-dependent proteolysis of S, or furin, which acts at the furin cleavage site, also called the polybasic cleavage site (PRRAR), between S1 and S2 [23, 25, 49, 52, 62, 63]. Both ACE2 and TMPRSS2 have been studied as molecular markers that confer genetic susceptibility or resistance [25].

This furin site, a single Arg in other coronaviruses, has an unusual sequence of amino acids between S1 and S2 in SARS-CoV-2, composed of Pro-Arg-Arg-Ala-Arg (Pro: proline; Arg: arginine; Ala: alanine). The cleavage of this sequence is important for efficient entry into lung cells [58, 64]. SARS-CoV-2 lacking this site seems to be less infective [62, 65], and some variations in this sequence seem to improve viral transmissibility, as seen in the Alpha and Delta variants [58]. In Alpha, proline is changed to histidine (P681H), and in Delta, proline is changed to arginine (P681R). Both alterations make the sequence more basic, improving recognition by furin. This furin cleavage primes the newly made S protein (and consequently the virion) to be more effective at infecting host cells. For instance, for SARS-CoV, these primed proteins represent only 10% of the total, while for the SARS-CoV-2 Alpha variant, they represent more than 50%, and for the Delta variant, they represent more than 75% [58].
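
As a rough illustration of the statement that both substitutions make the cleavage motif more basic, the sketch below applies P681H and P681R to the PRRAR motif and counts basic residues. It assumes that the motif spans positions 681 to 685 of S (inferred from the P681 label) and that arginine, lysine, and histidine are all counted as basic; the function names are hypothetical, and residue counting is only a crude proxy for furin recognition.

```python
# Residues counted as basic in this sketch (assumption: R, K, and H).
BASIC_RESIDUES = set("RKH")

# Wild-type motif from the text; start position 681 is inferred from "P681".
WILD_TYPE_MOTIF = "PRRAR"
MOTIF_START = 681


def apply_substitution(motif, motif_start, mutation):
    """Apply a substitution such as 'P681H' to a motif numbered from motif_start."""
    ref, position, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - motif_start
    if not 0 <= index < len(motif):
        raise ValueError(f"{mutation} lies outside the motif")
    if motif[index] != ref:
        raise ValueError(f"{mutation} does not match residue {motif[index]}")
    return motif[:index] + alt + motif[index + 1:]


def count_basic(motif):
    """Count residues of the motif that belong to BASIC_RESIDUES."""
    return sum(residue in BASIC_RESIDUES for residue in motif)


alpha_motif = apply_substitution(WILD_TYPE_MOTIF, MOTIF_START, "P681H")
delta_motif = apply_substitution(WILD_TYPE_MOTIF, MOTIF_START, "P681R")

for name, motif in [("wild type", WILD_TYPE_MOTIF),
                    ("Alpha", alpha_motif),
                    ("Delta", delta_motif)]:
    print(f"{name}: {motif} has {count_basic(motif)} basic residues")
```

Under these assumptions, both variant motifs gain one basic residue relative to the wild-type motif, which is consistent with the description above.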

After ACE2 binding, two domains within the S2 subunit interact: heptad repeats 1 and 2 (HR1 and HR2). These domains are structural motifs characterized by seven amino acids in a specific configuration (hydrophobic-polar-polar-hydrophobic-charged-polar-charged) that predisposes them to form alpha helices. They form a six-helix bundle responsible for bringing the viral and host membranes together, leading to fusion and viral entry [66]. Following fusion, the inner contents of the virus are released into the cytoplasm of the host cell. Then, the replicase genes in the viral RNA genome are translated.

As SARS-CoV-2 has a (+) ssRNA genome, it can act as an mRNA and be directly translated by the ribosome. The first two open reading frames (ORFs) in the viral genome, called ORF1a and ORF1b, are immediately translated into two polyproteins that then undergo proteolysis into 16 nonstructural proteins (nsps). Polyprotein 1a (pp1a) originates from the translation of ORF1a and is cleaved into nsps 1 to 10, while polyprotein 1ab (pp1ab) is encoded by ORF1a and ORF1b through a −1 ribosomal frameshift mechanism and gives rise to nsps 11 to 16.
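
The −1 ribosomal frameshift can be pictured with a toy reader that steps back one nucleotide at a chosen position and then continues in the new frame. The sequence, the slip position, and the function name below are invented for illustration and do not correspond to the actual SARS-CoV-2 genome; the sketch only shows how a one-nucleotide slip lets reading continue past an in-frame stop codon.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def read_codons(rna, slip_at=None):
    """Split rna into codons from position 0 until a stop codon or the end.

    If slip_at is given, the reader steps back one nucleotide (a -1
    frameshift) once, when it reaches that position.
    """
    codons = []
    position = 0
    slipped = False
    while position + 3 <= len(rna):
        if slip_at is not None and not slipped and position == slip_at:
            position -= 1  # -1 frameshift: re-read the previous nucleotide
            slipped = True
        codon = rna[position:position + 3]
        if codon in STOP_CODONS:
            break
        codons.append(codon)
        position += 3
    return codons


# Hypothetical toy sequence with an in-frame UAA stop after five codons.
TOY_RNA = "AUGGCUUUUAAACGGUAACCGGGUUGA"

without_slip = read_codons(TOY_RNA)
with_slip = read_codons(TOY_RNA, slip_at=12)

print("no frameshift:", " ".join(without_slip))
print("-1 frameshift:", " ".join(with_slip))
```

In this toy case, the unshifted reader stops at the in-frame stop codon, whereas the shifted reader continues in the new frame, which mirrors how pp1ab extends beyond the end of ORF1a.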

SARS-CoV-2 has two cysteine proteases [67] encoded by ORF1a: a papain-like protease (PLpro or nsp3), which cleaves pp1a at three sites, liberating nsps 1-4, and the main protease (Mpro or nsp5), sometimes referred to as 3CLpro for having a chymotrypsin-like fold, which cleaves both polyproteins, liberating nsps 4-16 [67, 68]. After this process, the viral RNA-dependent RNA polymerase (RdRp), composed of nsp12, nsp7, and nsp8, along with other nsps, assembles at the rough endoplasmic reticulum (ER) into a membrane-bound ribonucleoprotein complex, the so-called replicase-transcriptase complex (RTC), which directs replication, transcription, and maturation of the viral genome and subgenomic mRNAs [52, 67,68,69,70].

One important viral protein in this early step is Nsp1, which blocks host mRNA entry in ribosomes, undermining translation by approximately 70%, and also induces selective endonucleolytic cleavage and degradation of host mRNAs, especially cytosolic mRNAs, by binding to the 40S ribosome subunit. Thus, this protein helps to accelerate cell turnover, as the remaining translation activity is used by the virus. Another important action of this protein is to jam up nuclear exit channels, inhibiting nuclear alerts to the immune system and leading to a lower secretion of interferons [58, 71].

The structural proteins of SARS-CoV-2, however, are produced through a more complex process, performed by the RTC, of discontinuous transcription followed by translation into proteins. In this process, transcription of the (+) ssRNA of the virus produces several (−) subgenomic RNAs (sgRNAs) from the structural protein genes and ORF3 and ORF6 to 9, each combining a varying length of the 3′ end of the genome with the 5′ leader sequence. Then, these (−) sgRNAs are used as templates to synthesize subgenomic (+) mRNAs, which are then translated into the M, S, E, and N proteins and processed as needed [52, 69, 72]. Full-length (−) RNA is also used as a template for replication of the (+) ssRNA of the virus that will form the new virion [69] (Fig. 2).

After replication and synthesis of subgenomic RNA, structural proteins are translated and inserted into the endoplasmic reticulum and continue to an intermediate compartment, where the replicated (+) ssRNA will interact with the N protein. This forms the nucleocapsid that will be enveloped in the ER-Golgi intermediate compartment, and vesicles containing the virion will bind to the membrane surface, resulting in the release of the virus [69]. Protein M directs most of the protein–protein interactions involved in viral assembly; however, the presence of protein E is also essential to produce the viral envelope, in addition to being related to the induction of membrane curvature and to preventing the formation of aggregates by the M protein. After assembly, the virions are transported by vesicles and released by exocytosis [52, 69].

Additionally, infection by SARS-CoV-2 can induce syncytia [73, 74], as seen in MERS-CoV [75] and, with some evidence, in SARS-CoV [76]. In this mechanism, S proteins displayed on the surface of the infected host cell activate a host cell calcium-ion channel, leading to the production and secretion of a fatty coating that promotes fusion with nearby cells expressing ACE2. This multinuclear structure may allow longer survival of the infected cell. SARS-CoV-2 is even able to form this structure with lymphocytes, important immune cells, similar to what is seen in tumor cells, which also helps the virus avoid host immune detection [58].

Most of the sequence variations reported in the new strains are single nucleotide polymorphisms or variants, along with other point mutations. In addition, some reports have suggested that variations in the virus genome might occur following infection, so-called intrahost mutations [25, 77,78,79].

The arrival of new viral variants has worsened the scenario. Viruses change constantly, and these mutations might result in new variants, some of which may have mutations that could allow them to spread more easily or increase disease severity. The RNA replication of SARS-CoV-2 has a moderate intrinsic error rate [60], as shown by the small number of mutations that reach high frequency between sequenced genomes. Nevertheless, the high number of infected individuals and the range of susceptible hosts increase the risk of new variants that could directly impact vaccines and therapeutic tools [60]. Additionally, coronaviruses possess a NSP14 exoribonuclease, which has a repair proofreading function [25].

The spike protein (S) of SARS-CoV-2: clinical interest and new viral variants

Due to its critical role in host invasion, the spike protein (S) of the virus has been the target of numerous studies, many related to possible therapeutic targets, whether by blocking binding with the receptor [60], DNA vaccines [80] or RNA based on the S protein sequence [46, 81], among others. However, there has been an increasing number of variants with divergences in their S protein sequences that can affect the efficiency of therapeutic measures based on transmission rates and disease severity [82, 83].

In addition, some authors [60] have indicated that, depending on the variants, long-term control of the pandemic will require systematic monitoring and complementary preparations for circulating strains. GISAID, which refers to the Global Initiative on Sharing Avian Influenza Data, has also monitored SARS-CoV-2 variants through its database of gene sequences since the beginning of the COVID-19 pandemic [82, 84,85,86]. The WHO also regularly monitors new variants and their potential impact on transmissibility; severity; and resistance to therapeutic tools, vaccines, and diagnosis to assess the risk they impose on public health. Weekly update reports are made available on the organization’s website and aim to assess the overall state of the pandemic worldwide [82, 87, 88].

The CDC-US [89] and WHO[90] have proposed classifying the variants into four classes based on the main strains in circulation in the USA and globally: variants of high consequence (VOHC), variants of concern (VOC), variants of interest (VOI), and variants being monitored (VBM). As it is not restricted to the strains circulating in the USA, the current WHO classification of the variants will be adopted [87, 90].

In the first stage, we include VOHC, a level for lineages with a proven advantage over preventative measures or medical treatments over previously circulating strains [89]. The measurements considered are the proof of failure in diagnosis or a decrease in vaccination effectiveness, a disproportional number of infections in vaccinated individuals, or low vaccine-induced protection for critical conditions, as well as a more severe prognosis and a spike in the proportion of hospitalizations. Neither the CDC-US nor WHO reported any variants in the VOHC category as of March 2022.

The VBM list, in contrast, involves variants that have genetic mutations that are suspected to affect virus characteristics, with some indication that they may pose a future risk, but for which evidence is still unclear and needs further monitoring and assessment [90]. By March 1, 2022, only the lineages B.1.640, C.1.2, and B.1.1.318 are on the VBM list.

Variants of interest (VOIs) have specific genetic markers associated with changes in receptor binding, mainly mediated by S protein, considerable potential for avoidance of neutralizing antibodies generated by previous infection or vaccination and a reduction in the efficacy of available treatments, a potential impact on diagnosis, or the possibility of increased transmissibility or severity of the disease [89]. Only the strains Lambda (C.37) and Mu (B.1.621) are currently (March 1, 2022) classified as VOI by the WHO [90].

The Mu variant seems to be highly resistant to COVID-19 convalescent sera, even more so than Beta or Delta, a property potentially conferred by the YY144-145TSN mutation (substitution of the two tyrosines (Y) at positions 144 and 145 by threonine (T) and serine (S), with insertion of an asparagine (N) at position 146) and the E484K mutation in the spike protein. Additionally, serum from Mu-infected individuals has been shown to be resistant to Mu as well as other variants [91].

Lambda has one deletion (del246-252) and seven nonsynonymous mutations in S (G75V, T76I, D253N, L452Q, F490S, D614G, T859N) [92]. Of these, F490S showed some capacity to evade antibody neutralization in vitro [93,94,95]. The variant also seems to be highly infectious, a characteristic associated mainly with T76I and L452Q [96].
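
The substitution labels used throughout this section follow a simple pattern: reference residue, position in S, new residue. As a minimal sketch, the code below parses the seven Lambda substitutions listed above and flags those that fall within the RBM, using the residue range 443–503 given earlier in this report. The function and variable names are hypothetical, and deletions and insertions (such as del246-252) are deliberately not handled.

```python
import re

# One-letter reference residue, numeric position, one-letter new residue.
SUBSTITUTION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# RBM boundaries as stated in this report (residues 443-503 of S).
RBM_START, RBM_END = 443, 503


def parse_substitution(label):
    """Return (reference, position, alternate) for a label such as 'L452Q'."""
    match = SUBSTITUTION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution: {label!r}")
    reference, position, alternate = match.groups()
    return reference, int(position), alternate


LAMBDA_SUBSTITUTIONS = ["G75V", "T76I", "D253N", "L452Q", "F490S", "D614G", "T859N"]

parsed = [parse_substitution(label) for label in LAMBDA_SUBSTITUTIONS]
in_rbm = [
    f"{reference}{position}{alternate}"
    for reference, position, alternate in parsed
    if RBM_START <= position <= RBM_END
]

print(f"{len(parsed)} substitutions parsed; within the RBM: {', '.join(in_rbm)}")
```

With the range assumed here, only L452Q and F490S fall inside the RBM, the two substitutions that the text associates with infectivity and antibody evasion, respectively.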

On the other hand, VOC characterizes variants with evidence of increased transmissibility and severity of disease course (increase in hospitalization or deaths), a significant reduction in the efficiency of neutralizing antibodies, a decrease in the effectiveness of treatments or vaccination, and failures or interferences in diagnostic targets.

The following lineages are considered VOCs by the WHO as of March 1, 2022, listed by Pango lineage followed by the corresponding WHO label (a summary of the data can be found in Table 1):

  • B.1.1.7 or Alpha: detected in the UK on 20 September 2020, with a high number of mutations, including 8 in the S protein. According to the WHO weekly report of 16 March 2021, the main mutations in protein S are the H69/V70 deletion, N501Y, A570D, and P681H [82, 88]. The strain, already found in 118 countries [88], has a known 50% increase in viral transmission and a potential increase in disease severity but minimal to no impact on monoclonal antibody therapies or on neutralization by convalescent or postvaccination sera [82, 88, 89, 97,98,99]. Recent data suggest that individuals infected by this strain are, on average, 61% more likely to die, based on 1,146,534 patients in community tests [100].

  • P.1 or B.1.1.28.1 or Gamma: detected in Japan in December 2020; based on data from Brazil, it has a moderate impact on therapies based on monoclonal antibodies and shows reduced neutralization by convalescent and postvaccination sera [89]. The main mutations highlighted in the recent WHO report [88] are K417T, E484K, and N501Y. It is currently found in 38 countries.

  • B.1.351 or Beta: the first report of this strain was previously dated to early August 2020 in South Africa, but the WHO now considers that this strain emerged in South Africa in May 2020; it has already been reported in 64 countries [88]. This variant might have a 50% increase in its transmission rate [101], a moderate impact on monoclonal antibody therapies [73, 102], especially on the combination of Bamlanivimab® and Etesevimab® [102], and reduced neutralization by convalescent and postvaccination antibodies [73, 102].

  • B.1.617.2 or Delta: first reported in India, it has at least nine mutations in the S protein [25]. This variant has higher transmissibility and a potential reduction in neutralization by postvaccination sera and by some of the monoclonal antibody treatments currently used in the USA. Two sublineages were initially distinguished, AY.1 and AY.2, but both are now considered under B.1.617.2. One study showed that the sensitivity of this strain to vaccines is only modestly different from that of the Alpha variant after complete vaccination (double- or single-dose vaccines) [103]. Another study, however, showed lower protection when compared with the Alpha strain. In this study, Delta was resistant to some anti-NTD and anti-RBD monoclonal antibodies, the sera of convalescent individuals collected after 12 months were fourfold less potent against this variant, and the sera of individuals immunized with one dose of Pfizer® or AstraZeneca® vaccines were barely inhibitory to the strain. However, the response improved with a second dose, with the variant being neutralized by the sera of 95% of the individuals, but with titers at least threefold lower than those against the Alpha variant [104].

  • B.1.1.529 or Omicron: first reported in South Africa, it has at least 32 mutations in the S protein alone, which makes it markedly different from other strains. Initial data show that this variant has a potential increase in transmissibility and a potential reduction in neutralization by some monoclonal antibody treatments and postvaccination sera [89]. A genome-based phylogeny and mutational analysis, currently in preprint, concluded that this variant shares a common ancestry with the VOI Lambda [105].

Table 1 WHO’s classification of SARS-CoV-2 Variants of Concern (March 2022): WHO label, Pango lineage, place of origin, number and description of spike mutations, and important characteristics highlighted in the literature

Interestingly, most VOCs have the same deletions in nonstructural protein 6 (NSP6), a potential scaffold transmembrane protein, at S106, G107, and F108 [52], and all variants monitored have at least one common mutation: the substitution of a negatively charged aspartate by a glycine at position 614 of the spike protein [8, 83, 86, 125,126,127].

In the following sections, we describe important mutations frequently reported in the literature.

D614G

One month after being identified in March and April 2020, D614G became the most prevalent mutation worldwide due to positive selection [63, 83, 86, 128]. Travelers may have played a major role in introducing the mutation in several locations. Another hypothesis [63] is that genetic variability in ACE2 expression may play a role in the greater success of this mutation, since ACE2 expression is significantly higher in Asian populations than in European, admixed American, and African populations. A correlation analysis showed a significant positive relationship between ACE2 expression and the prevalence of D614, which could explain the higher prevalence of the wild-type strain in the Asian population and the rapid spread of the G614 variant throughout the globe, since it might lead to greater transmission in populations with lower ACE2 expression [22].

Infections by strains with this mutation are associated with a greater amount of viral nucleic acid in the upper respiratory tract of patients, suggesting greater viral load and infectivity, which may in turn indicate the need for higher antibody titers for protection [8, 86, 129]. G614 is also transmitted significantly faster than the wild-type strain between hamsters through aerosols and droplets [130].

However, no evidence has linked this mutation to greater severity of the disease, a finding also supported by other authors [8, 86], nor is there any difference in sensitivity between the two strains (D614 and G614) to monoclonal antibodies [8, 127] or convalescent serum [130].

One study [8] indicated that the mutation decreases affinity for ACE2, the main SARS-CoV-2 receptor, increasing the dissociation rate 4× and decreasing binding affinity 5.7× at 25 °C relative to the D614 strain; thus, the higher G614 infectivity could not be explained by increased protein–receptor affinity. In contrast, Ozono et al. [127] found a 1.5- to 2-fold increase in affinity between the mutated protein and the ACE2 receptor, also based on the dissociation rate, at 30 °C and 37 °C, suggesting that the mutation increased affinity. These authors indicated that the difference between the results could be due to the use of different S proteins and that further studies should be performed.
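To see how a change in dissociation rate relates to a change in affinity, the following sketch applies the standard 1:1 binding relation K_D = k_off / k_on to the fold changes quoted from [8]. It is an illustration only: it assumes that "5.7× lower affinity" means a 5.7× higher K_D, and the variable names are hypothetical. Under that assumption, the implied change in the association rate follows directly.

```python
# Fold changes reported in [8] for G614 relative to D614 at 25 °C.
koff_fold = 4.0  # dissociation rate k_off increased 4x
kd_fold = 5.7    # ASSUMPTION: "5.7x lower affinity" read as K_D 5.7x higher

# For simple 1:1 binding, K_D = k_off / k_on.
# Taking the ratio G614 / D614 gives: kd_fold = koff_fold / kon_fold,
# so the implied association-rate fold change is:
kon_fold = koff_fold / kd_fold

print(f"implied k_on fold change: {kon_fold:.2f}")
```

A value below 1 would mean that, in that experiment, most of the affinity loss came from faster dissociation, with a smaller contribution from slower association.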

Additionally, structural changes seem to favor the G614 mutation: the exchange for glycine at position 614 breaks the usual bond between D614 in the S1 part of the protein and T859 in the S2 part. This leads to an important conformational change, increasing the proportion of proteins in the open conformation (greater interaction with ACE2) relative to the closed conformation, from 18% to 58% open in proteins with the G614 mutation. Other intermediate states were also found in this case: an open-2 conformation (~39% of the observed forms) and a fully open state (~20%), which also favor binding to the ACE2 receptor [8]. Higher efficiency of entry into the host cell by strains with this mutation was also reported in [128, 130] and by Ozono et al. [127].

N439K and N439R

Two recurrent mutations in the RBM region, N439K and N439R, showed a 2× increase in ACE2 binding affinity compared to the wild strain (N439) [60]. The N439K variants seem to be more successful, since they appeared independently in multiple lineages through convergent evolution and rapidly reached numerous countries. This mutation, although unrelated to disease severity, enables immune evasion, conferring resistance to various monoclonal antibodies and some polyclonal responses generated by previous infection or vaccination. Compared with N439, the N439K variant strain showed a twofold lower antibody-mediated immune response in 442 convalescent individuals [60]. This substitution does not seem to be present in any of the strains under monitoring [89]; however, Omicron bears an N440K mutation, close enough to have similar effects, as shown in Table 3.

K417T, E484K, and N501Y

Other RBD mutations are present in most strains considered VOCs, such as K417T (K417N in Omicron), E484K (E484A in Omicron [89]), and N501Y, the latter two directly in the RBM and shared by Gamma and Beta. The Beta strain is reported [124, 131] to be more resistant to some monoclonal antibodies, convalescent plasma, and sera from vaccinated patients due to the E484K mutation, which also emerged independently in more than 50 strains and is currently present in all strains under monitoring. The Gamma strain also has an evasion capacity, but it is apparently less pronounced than that of the Beta strain [132]. It is important to emphasize that the N501Y mutation, in addition to assisting in immune evasion, confers greater affinity for the ACE2 receptor. Several monoclonal antibody treatments under analysis target E484 and K417, positions that are also mutated in both strains, and may not be as effective against these two variants [124, 131].

S477N

The S477N mutation, found in Iota (a former VBM) and Omicron, decreased protein stability in a computer-based analysis that is still under peer review, in contrast to N501Y, which stabilizes the protein structure [133]. This study also showed that, while D614G leads to reduced stability and more disease, the S477N variation is less prone to disease [134]. In the same computational study, the N477 mutant showed lower binding affinity, but other authors found that this mutation could increase ACE2 binding affinity [135]. Additionally, there is some evidence linking this mutation with escape from multiple mAbs [94].

T478K

The T478K mutation, seen in Delta strains and in the Mexican lineages B.1.1.222 and B.1.1.519, is located in the RBD part of the spike protein and forms three additional hydrogen bonds with F486, leading to a stabilization effect in the protein. Furthermore, it is predicted to increase the electrostatic potential of the S protein in a region of ACE2 contact, which might impact the RBD-ACE2 interaction [136, 137]. This mutation was considered unique to the Delta variant and was used in screening tests for this variant. However, T478K is also present in the Omicron variant, which calls for caution and careful study when adopting this target for variant screening [138, 139]. Immunologically, this mutation might also be related to the success of immune evasion seen in the Delta variant against convalescent and, to a lesser extent, vaccine-elicited sera as well as against mAbs, such as imdevimab [140].
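The screening caveat above can be made concrete with a small set comparison: a mutation works as a variant-specific marker only while no other circulating variant carries it. The sketch below uses partial mutation sets limited to what this section states (they are not complete variant definitions), and the function name is a hypothetical helper.

```python
# Partial spike-mutation sets, limited to what this section mentions.
# These are NOT complete variant definitions.
MUTATIONS = {
    "Beta": {"E484K", "N501Y"},
    "Gamma": {"E484K", "N501Y"},
    "Delta": {"T478K"},
    "Omicron": {"K417N", "N440K", "S477N", "T478K", "E484A"},
}


def specific_markers(variant, table):
    """Mutations carried by `variant` and by no other variant in `table`."""
    others = set()
    for name, mutations in table.items():
        if name != variant:
            others |= mutations
    return table[variant] - others


# Before Omicron is considered, T478K singles out Delta.
without_omicron = {k: v for k, v in MUTATIONS.items() if k != "Omicron"}
print(sorted(specific_markers("Delta", without_omicron)))

# With Omicron included, T478K no longer discriminates.
print(sorted(specific_markers("Delta", MUTATIONS)))
```

Within these partial sets, Delta is left with no specific marker once Omicron is added, which mirrors the call for caution when T478K is used as a screening target.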

Effects on mAbs and immune response triggered by natural infection or vaccines

As stated by Moore et al. (2021), it is important to keep in mind that reduced sensitivity to neutralization does not necessarily translate into lower vaccine effectiveness, especially for vaccines that generate high levels of neutralizing antibodies, such as mRNA vaccines. These vaccines also induce specific cellular immune responses, such as CD4+ and CD8+ T lymphocytes, which can aid in the response to the virus [131].

Garcia-Beltran et al. (2021) assessed the neutralizing power of the Pfizer® and Moderna® vaccines against some variants. While Alpha, B.1.1.298, and B.1.429 continued to be neutralized, variants P.2 and Gamma significantly reduced the neutralization capacity of vaccine-induced antibodies. The Beta variant demonstrated a high escape capacity, evading neutralization as efficiently as other, distantly related coronaviruses.

Another study [124] evaluated the neutralizing efficacy of serum from convalescent patients and from patients immunized with the Pfizer vaccine (25 patients, 4–14 days after the second dose) or the AstraZeneca vaccine (25 patients, 14 to 28 days after the second dose), with results consistent with those found by Garcia-Beltran et al. (2021). The authors concluded that Beta was the variant of greatest concern at the time of the study, since the data suggest a large reduction in neutralization and even evidence of complete failure in neutralization in some cases. The neutralizing power against Gamma and Alpha was similar, and few samples failed to demonstrate 100% neutralization in serum dilutions of 1:20, while the neutralization of the Beta variant (B.1.351) was reduced 7.6-fold in patients immunized with the Pfizer vaccine and ninefold with the AstraZeneca vaccine. Another study [118] similarly demonstrated a reduction in the power of neutralizing antibodies of 1.7-fold for Alpha, fivefold for Gamma, and 7.9-fold for Beta.

Another study [123] mentioned a possible failure in the neutralization of the Gamma strain by sera from patients convalescing from natural infection and from individuals 5 months after immunization with CoronaVac®; in the latter case, the data suggest a sixfold reduction in neutralization of the strain.

Decreased neutralization is common in vaccines, and a fourfold reduction in a vaccine for influenza, for example, would signal a need for updating [143]. More robust studies with a higher number of vaccinated individuals should provide more compelling information about the true immunizing power of vaccines and the need for booster doses. Thus, the data seem to suggest the need for caution in discarding preventive measures, individual and collective, even after vaccination.
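As a rough illustration of the comparison implied here, the sketch below lines up the fold reductions quoted in this section against the fourfold influenza benchmark [143]. This is not a regulatory criterion for COVID-19 vaccines, and the studies differ in vaccines, timing, and assays, so the output is only a reading aid. The list layout and function name are hypothetical.

```python
THRESHOLD = 4.0  # influenza benchmark cited in the text [143]

# (source, variant, fold reduction in neutralization) as quoted in this section.
REPORTED = [
    ("[124] Pfizer", "Beta", 7.6),
    ("[124] AstraZeneca", "Beta", 9.0),
    ("[118]", "Alpha", 1.7),
    ("[118]", "Gamma", 5.0),
    ("[118]", "Beta", 7.9),
    ("[123] CoronaVac", "Gamma", 6.0),
]


def at_or_above_benchmark(rows, threshold=THRESHOLD):
    """Keep the rows whose fold reduction reaches the benchmark."""
    return [row for row in rows if row[2] >= threshold]


for source, variant, fold in at_or_above_benchmark(REPORTED):
    print(f"{variant:6s} {fold:4.1f}-fold  {source}")
```

Of the values quoted, only the Alpha result stays below the benchmark, which is consistent with the emphasis on Beta and Gamma in the studies above.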

Epidemiology and reinfection rates

In terms of epidemiology, it is important to note that the distribution of strains in Brazil used to be very unequal between regions, but Delta has been the most prevalent in all states for several months. It is also important to note that testing and genomic surveillance in the country are low despite all of the efforts made by scientists in the area. Even so, the country ranked 11th worldwide in a SARS-CoV-2 sequencing performance ranking published in December 2020 [144]. Brazil has already registered at least 2000 cases of the Omicron variant, and the possible increase in infectivity and reinfection rates with this strain may rapidly change the current situation if preventive measures are not ensured [145].

Regarding reinfection rates, a study conducted in South Africa, still under peer review, did not find any evidence of increased reinfection risk associated with the Beta or Delta variants. In contrast, Omicron showed a marked immune evasion ability at the population level, which increases the risk of reinfection by this variant [146]. These data are supported by a computer-based study that predicted the vaccine escape capability of Omicron to be approximately twice as high as that of Delta and found that its mutations could impact the efficacy of the Eli Lilly antibody cocktail and mAbs from Celltrion® and Rockefeller University® [147].

Omicron

Despite multiple studies seeking to assess the risk of the new Omicron variant, the available data are still preliminary. What is known, and was reinforced in the latest WHO update regarding Omicron, is that all variants of SARS-CoV-2 can cause severe cases and death from COVID-19; thus, taking the right preventive measures is still the best option [148]. Although there are no data to support an increase in disease severity with this variant, the higher number of infections alone, which are doubling every 2 to 3 days, could represent a new wave of contamination [149].
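To give a sense of scale for a doubling time of 2 to 3 days, the sketch below computes the growth factor under idealized, unconstrained exponential growth. This is a simplifying assumption for illustration, not a forecast: real epidemics slow as immunity, behavior, and interventions change. The function names and the 14-day horizon are hypothetical choices.

```python
import math


def growth_factor(days, doubling_time_days):
    """Multiplicative increase in cases after `days` of pure exponential growth."""
    return 2.0 ** (days / doubling_time_days)


def daily_growth_rate(doubling_time_days):
    """Continuous growth rate r (per day) such that cases scale as exp(r * t)."""
    return math.log(2) / doubling_time_days


for doubling_time in (2, 3):
    factor = growth_factor(14, doubling_time)
    rate = daily_growth_rate(doubling_time)
    print(f"doubling every {doubling_time} d: x{factor:.0f} in 14 d, r = {rate:.3f}/d")
```

Even the slower end of the reported range implies a more than twentyfold increase within two weeks if nothing changes, which is why case numbers alone can drive a new wave.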

The concern related to Omicron lies mostly in the fact that it has many mutations, some already studied, as previously discussed, some new, and some still under investigation. It is also important to understand how these mutations work individually and in combination to better assess the effects of the new variant, as some combinations might further increase receptor binding affinity [150]. More than 30 mutations have been described on the S protein alone [89, 151, 152]. What is known about these variations is described in Tables 2, 3, and 4; most of the articles included in these tables are preprint versions, as knowledge about these mutations, especially the ones seen only in Omicron, is very new and was published in recent months. Additionally, a visual demonstration of all the catalogued mutations in the Omicron variant can be seen in Fig. 3.

Table 2 Mutations A67V to K417N in spike protein of the Omicron variant: locations, previous association to other SARS-CoV-2 variants and current knowledge about them (December 2021)
Table 3 Mutations N440K to ins214EPE in spike protein of the Omicron variant: locations, previous association to other SARS-CoV-2 variants and current knowledge about them (December 2021)
Table 4 Mutations Q498R to L981F in spike protein of the Omicron variant: locations, previous association to other SARS-CoV-2 variants and current knowledge about them (December 2021)
Fig. 3 Visual representation of Tables 2, 3, and 4, showing all catalogued mutations of Omicron’s spike proteins. They were constructed using PyMOL 2.5.2® (The PyMOL Molecular Graphics System) and Biorender®

Another computational study comparing Delta and Omicron suggests that the Omicron S protein and its RBD are more hydrophobic, as the content of leucine, phenylalanine, and other hydrophobic amino acids is higher in the Omicron variant. Additionally, alpha-helix structures are more abundant in Omicron than in the Delta variant across the whole S protein, suggesting, according to the authors, a more stable structure [119].

The same study [119] also compared the docking energy of the strains in relation to the Wuhan wild type. Using HEX software, the wild type reached a docking energy of −500.37, whereas Delta reached −529.62 and Omicron −539.81; thus, the Omicron variant S protein seems to have a higher affinity and fitness for the ACE2 receptor. Additionally, among the Omicron RBD mutations, E484A showed the weakest (least negative) interaction energy with the receptor (−478.49), and Q493R showed the strongest (−581.53). The values for the other mutations are shown in Table 2. These data suggest that the Omicron variant transmission rate may be greater than that of the Delta strain, which is currently responsible for more than 90% of new COVID cases in the world.
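The comparison with the wild type can be spelled out as simple differences. The sketch below uses only the three whole-protein docking scores quoted from [119]; the units are not stated in the text, and a more negative score is read here as stronger predicted binding, which is an assumption about how the scores should be interpreted. The dictionary and function names are hypothetical.

```python
# Docking scores quoted from [119] (HEX software); units are not stated.
# ASSUMPTION: more negative = stronger predicted binding to ACE2.
DOCKING = {
    "Wuhan wild type": -500.37,
    "Delta": -529.62,
    "Omicron": -539.81,
}


def shift_vs_reference(scores, reference="Wuhan wild type"):
    """Difference of each score from the reference, rounded to 2 decimals."""
    ref = scores[reference]
    return {
        name: round(value - ref, 2)
        for name, value in scores.items()
        if name != reference
    }


for name, shift in shift_vs_reference(DOCKING).items():
    percent = 100.0 * shift / DOCKING["Wuhan wild type"]
    print(f"{name}: {shift:+.2f} ({percent:.1f}% of the wild-type score)")
```

The shifts are a few percent of the wild-type score, so they indicate a direction rather than a measured change in transmission.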

Some results, one still under peer review, suggest that Omicron is highly effective at evading immune defenses and mAbs [177, 232, 233]. Regarding vaccines, the emergence of Omicron has heated the debate about booster doses. Pfizer®, for example, released statements saying that three doses of the Pfizer-BioNTech COVID-19 vaccine are sufficient to neutralize Omicron, while two doses, which was the initial plan, show significantly reduced neutralization titers [234, 235].

In summary, the new variants of SARS-CoV-2 are a major issue, and mass vaccination is the best way to control the pandemic (Fig. 4). Even so, the appearance of strains capable of evading the immune response is inevitable because vaccination progress can also exert selective pressure favoring these strains, hence the need for maintaining preventive measures. It is also important to guarantee vaccine doses for everybody [236]. Despite efforts, this is not the reality, as African vaccination data indicate that some countries still had fewer than 20 doses per 100 population as of March 3, 2022 [237]. This vaccine inequity [238], which may arise from difficulties in the four dimensions of effective global immunization (development and production, allocation, affordability, and deployment) [239, 240], will probably lead to a more prolonged pandemic and might even culminate in a more resistant variant [239, 241]. Thus far, all of the variants seem to be at least moderately sensitive to the available vaccines after the recommended doses, especially in blocking symptomatic and severe cases, but booster doses will be necessary, as already seen in some countries, especially with the emergence of Omicron.

Fig. 4 Summary of the main information regarding the spike protein and its mutations. The box summarizes data considering common characteristics of the protein and its major mutations, such as D614G, N439R, E484K, K417T, and N501Y

Final remarks

The S protein has a crucial role in viral infectivity, and all strains considered of interest or concern have mutations in the S sequence. These mutations seem to confer immune evasion and higher infectivity properties, especially those linked to conformational changes in its structure. Surprisingly, the mRNA vaccines, such as Pfizer® or Moderna®, which use this protein sequence, seem to trigger a strong immune response [242, 243] that can protect against most variants with multiple mutations in the S sequence [103], even though some variants, such as Gamma [73, 122], Beta [112, 244], or Delta [104], have been demonstrated to be more resistant to postvaccine serum, which could cause more infections among vaccinated people.

The risk posed by the Omicron variant is still under preliminary assessment, but the variant appears to be more effective in evading immune responses, largely because of the many mutations in its S protein, and it also seems to transmit better than Delta. To date, there is no evidence of an increase in the severity of disease, although the number of infected vaccinated people can mask these results, as vaccination tends to protect individuals from more severe cases or progression. Additionally, some preliminary evidence of major outbreaks among vaccinated people, indicating the need for booster doses, is already available. Many of this strain's mutations, highlighted in Table 1, are new, and there is almost no knowledge about them, making these genetic variations good subjects for further research.

This review summarizes the disease process, biology and replication cycle of SARS-CoV-2 and the important mutations carried by the most worrisome variants thus far. Additionally, this study highlights the most recent knowledge about Omicron’s spike mutations.

Genomic sequencing of the different SARS-CoV-2 variants and their hosts has undoubtedly led to the rapid development of vaccines by pharmaceutical companies around the world, allowing people to experience the pandemic in a slightly less catastrophic way. Of course, this happened differently in each country across the globe, mainly because different countries adopted distinct testing procedures, social distancing measures, and vaccination schedules in an uncoordinated way. However, there is a consensus that science has saved lives around the world and has given people the possibility of being better prepared to face similar public health problems that may occur in the future.