Glycosylation and its Role in SARS-CoV-2 Viral Pathogenesis
Author: Meg Baker, PhD
N-Glycosylation and COVID19
Glycobiology
N-linked glycosylation (NLG) is a complex biosynthetic process that regulates the proper folding of proteins and their intracellular transport through the secretory pathway. This co- and post-translational modification occurs through a series of enzymatic reactions that transfer a core glycan from a lipid carrier to a protein substrate and allow further remodeling of the glycan. The enzymes are located on both the cytosolic and the luminal sides of the ER membrane. The study of NLG and the related effects of glycans is part of the field called glycobiology.
NLG takes place at sites specified in the protein sequence itself. N-linked oligosaccharides are attached via a GlcNAc linked to the side-chain nitrogen of an Asn found in the consensus sequence N-X-S/T (where X is any amino acid except proline), known as the ‘glycosylation sequon’. Formation of a precursor branched carbohydrate chain, the lipid-linked oligosaccharide (LLO), takes place in the endoplasmic reticulum. The LLO consists of a Glc3Man9GlcNAc2 molecule (three glucose, nine mannose, and two N-acetylglucosamine sugars) linked to dolichol pyrophosphate. The enzyme oligosaccharyltransferase then transfers the glycan from the lipid carrier to an Asn in the polypeptide.
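The sequon rule described above can be sketched as a short pattern search. This is only an illustration: the function name and the toy peptide are made up for the example, and a matching sequon marks a potential site only, since not every sequon is actually glycosylated in a cell.

```python
import re
from typing import List

# N-X-S/T sequon where X is any residue except proline.
# The lookahead lets overlapping sequons (e.g. "NNTS") both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq: str) -> List[int]:
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Toy peptide invented for illustration; it is not a real protein sequence.
toy_peptide = "MNATGNPSLNNTSK"
print(find_sequons(toy_peptide))  # N6 is skipped because it is followed by proline
```

A real analysis would read the protein sequence from a database record (for example a FASTA file) instead of a hard-coded string.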
The removal of the three glucose sugars from the new N-linked glycan signals that the structure is ready for transport to the Golgi, where mannose is removed, yielding a carbohydrate chain containing five to nine mannose sugars. Further removal of mannose residues can lead to the core structure containing three mannose and two N-acetylglucosamine residues, which may then be elongated with a variety of different monosaccharides including galactose, N-acetylglucosamine (aka NAG or GlcNAc), N-acetylgalactosamine, fucose, and sialic acid, many of which can also exist in sulfated form.
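The trimming steps described above can be followed as simple bookkeeping on monosaccharide counts. The sketch below is a simplification under stated assumptions: it tracks composition only (not branching or linkages), and the helper names `trim` and `formula` are invented for the example.

```python
from collections import Counter

# Precursor glycan carried by the LLO: Glc3Man9GlcNAc2
precursor = Counter({"Glc": 3, "Man": 9, "GlcNAc": 2})


def trim(glycan: Counter, sugar: str, n: int) -> Counter:
    """Return a new composition with n residues of `sugar` removed."""
    if glycan[sugar] < n:
        raise ValueError(f"cannot remove {n} {sugar} from {dict(glycan)}")
    out = Counter(glycan)
    out[sugar] -= n
    return +out  # unary plus drops sugars whose count reached zero


def formula(glycan: Counter) -> str:
    """Write the composition in the Glc/Man/GlcNAc order used in the text."""
    order = ["Glc", "Man", "GlcNAc"]
    return "".join(f"{s}{glycan[s]}" for s in order if glycan[s] > 0)


high_mannose = trim(precursor, "Glc", 3)  # glucose removal in the ER
core = trim(high_mannose, "Man", 6)       # mannose removal down to the core

print(formula(precursor), sum(precursor.values()))
print(formula(high_mannose))
print(formula(core))
```

Elongation with galactose, fucose, or sialic acid could be modeled the same way by adding counts, although real glycan structures also depend on where each sugar is attached.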
The enzymes involved in this essential process are evolutionarily conserved. However, the genes and their specific functions have evolved differently in each organism. Therefore, each organism and each individual cell, depending on genetic background and influenced by factors such as nutritional and disease status, will decorate secreted proteins in a unique manner.
The advent of biologic medicines (protein-based therapeutics) presents the challenge of ensuring not only that the primary protein sequence is specified but also that manufacture of the protein, typically in a eukaryotic host cell capable of glycosylation, takes place with some degree of reproducibility. Many monoclonal antibody therapeutics require glycosylation for proper structural integrity but are generally made in rodent or other nonhuman cells. Thus, the term “biosimilar” rather than “generic” is used to connote the variation that necessarily results from different manufacturing processes, even for the same genetic sequence.
Viral Glycoproteins
Viral genomes are too small to encode the collection of enzymes required for glycosylation of any type, so viral glycoproteins are glycosylated by the host cell in which the virus is replicating. The impact of glycan content and composition on viral infectivity and, more importantly, on vaccine development has been addressed only belatedly, largely because of the technical difficulty of analyzing protein glycan composition and the lack of methods for doing so. However, progress is being made. Raska et al. (Glycosylation Patterns of HIV-1 gp120 Depend on the Type of Expressing Cells and Affect Antibody Recognition. J Biol Chem 2010 Jul 2; 285(27): 20860–20869) were able to perform such an analysis on HIV-1, albeit almost 30 years after its emergence in human populations. The findings of this study may explain, in part, the difficulty in developing a vaccine against HIV.
The SARS-CoV-2 spike protein (UniProt P0DTC2), as so popularly depicted, is a trimer protruding from the lipid envelope that protects the viral genome. Like gp120 in HIV, the spike protein is the virus's point of contact with the host cell; it binds the human ACE2 receptor to gain entry. The spike protein contains two functional external subunits, designated S1 and S2. S1, separated from S2 by a furin cleavage site, forms the apex of the trimeric spike structure and is responsible for attachment to the ACE2 receptor. S2 is responsible for fusion to the cell membrane. (PDB: 6VSB shows a 3D image of the protein structure, including glycan positions.) There are 22 N-linked glycans per polypeptide, or 66 per spike trimer (Watanabe et al. 2020. Site-specific glycan analysis of the SARS-CoV-2 spike. Science 17 Jul 2020: Vol. 369, Issue 6501, pp. 330-333).
Although shielding of receptor binding sites by glycans is a common feature of viral glycoproteins, Watanabe et al. (ibid.) note the low mutation rate of SARS-CoV-2 and that, as yet, no mutations at N-linked glycosylation sites have been observed.
The development of a vaccine, individual antibodies, or antibody cocktails with neutralizing activity (blocking viral entry or virucidal) is also influenced by the presence or absence of glycans and by how well the antibodies target the natural conformation of the spike protein (Papageorgiou et al. The SARS-CoV-2 Spike Glycoprotein as a Drug and Vaccine Target: Structural Insights into Its Complexes with ACE2 and Antibodies. Cells 2020 Oct 22;9(11):2343. doi: 10.3390/cells9112343; see also SARS-CoV-2 Spike, Stanford Coronavirus Antiviral Research Database). It should be noted that mRNA vaccines (or other nucleic acid formats) may obviate these analyses because the immune response is to a spike protein made and glycosylated in the vaccinated person's own cells and is therefore, in some sense, customized to each individual.
Glycans may themselves represent drug targets. Casalino et al. suggest an essential structural role of N-glycans at sites N165 and N234 in modulating the conformational dynamics of the spike’s receptor binding domain (RBD), which is responsible for ACE2 recognition (Casalino et al. 2020. Beyond Shielding: The Roles of Glycans in the SARS-CoV-2 Spike Protein. ACS Cent Sci. 2020 Oct 28; 6(10): 1722–1734).
COVID19 Variants
SARS-CoV-2 lineage B.1.1.7 likely arose in the United Kingdom in September 2020 and is characterized by 17 mutations, including 8 in the spike protein (Rambaut et al., 2020). Other lineages, including B.1.351, initially detected in South Africa (Tegally et al., 2020), and most recently lineage P.1, first documented in the Amazonia region of Brazil (Faria et al., 2020), carry additional mutations. All three lineages are characterized by an N501Y (Asn to Tyr) mutation in the spike protein, while both B.1.351 and P.1 also carry the spike mutation E484K. In addition, both B.1.1.7 and B.1.351, but not P.1, have acquired short sequence deletions in the spike protein. N501Y is in the receptor-binding domain (RBD), but N501 is not a glycosylation site.
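Whether a substitution written in the style of N501Y removes or creates a glycosylation sequon can be checked mechanically. The sketch below is a minimal illustration under stated assumptions: the function names are invented, the sequence is a toy peptide with hypothetical positions, and a real check would use the full spike sequence (for example from the UniProt record P0DTC2 cited earlier).

```python
import re

SEQUON = re.compile(r"N(?=[^P][ST])")          # N-X-S/T with X != P
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")  # e.g. "N501Y"


def sequon_positions(seq):
    """1-based positions of Asn residues that sit in a sequon."""
    return {m.start() + 1 for m in SEQUON.finditer(seq)}


def sequon_effect(seq, mutation):
    """Return (lost, gained) sequon positions caused by one substitution."""
    m = MUTATION.match(mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    ref, pos, alt = m.group(1), int(m.group(2)), m.group(3)
    if pos < 1 or pos > len(seq) or seq[pos - 1] != ref:
        raise ValueError(f"{mutation} does not match the reference sequence")
    mutated = seq[:pos - 1] + alt + seq[pos:]
    before, after = sequon_positions(seq), sequon_positions(mutated)
    return sorted(before - after), sorted(after - before)


# Toy reference peptide invented for illustration (one sequon, at N8).
toy = "GNGYQAANATK"
print(sequon_effect(toy, "N2Y"))   # Asn outside a sequon: nothing lost or gained
print(sequon_effect(toy, "T10A"))  # breaks the N-X-T sequon that starts at N8
print(sequon_effect(toy, "Y4T"))   # creates a new sequon at N2
```

Note that a substitution two residues downstream of an Asn (the S/T position) can affect a sequon just as much as a change to the Asn itself, which is why the comparison is made on the whole sequence.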
Reference
See Emerging SARS-CoV-2 Variants (CDC).