Dataset: 11.1K articles from the COVID-19 Open Research Dataset (PMC Open Access subset)
All articles are made available under a Creative Commons or similar license. Specific licensing information for individual articles can be found in the PMC source and CORD-19 metadata

Post-translational modifications of coronavirus proteins: roles and function

Disulfide bond formation

Disulfide bonding contributes to the folding of MHV S proteins. When MHV-infected cells were briefly exposed to reducing agent dithiothreitol added to culture medium, newly synthesized MHV S protein was completely reduced, as indicated by a shift of mobility in nonreducing gel. Reduction of MHV S protein was associated with a loss of conformation, as the protein could no longer be recognized by a conformation-specific monoclonal antibody. When dithiothreitol was withdrawn, the S protein folded aberrantly into disulfide-linked aggregates, from which properly folded S protein subsequently dissociated. Therefore, disulfide bond formation is essential for the correct folding, trafficking and trimerization of MHV S protein.

In another study, the recombinant S1 domain of SARS-CoV S protein was used to study the redox state of the 20 cysteine residues. Interestingly, four cysteines remained unpaired in mature S1, and chemical reduction using β-mercaptoethanol did not impair the binding of S1 to the cognate receptor ACE2. Furthermore, treatment with the sulfhydryl-blocking agent DTNB or the oxidoreductase inhibitor bacitracin did not inhibit the fusion of SARS-CoV pseudotyped particles, while the fusion of HIV- or MLV-pseudotyped virus was significantly affected. These data suggest that the S1 domain of SARS-CoV S protein exhibits a high level of insensitivity to redox state.

N-linked glycosylation

N-linked glycosylation of coronavirus S protein was first described for MHV in the 1980s. MHV S protein in the rough ER was found to acquire high mannose oligosaccharides. Treatment with the Golgi transport blocker monensin inhibited the transport of MHV S protein from the trans-Golgi network to the cell surface. Later studies demonstrated that the S proteins of IBV, TGEV and bovine coronavirus (BCoV) were also modified by N-linked glycosylation. Using pulse-chase experiments coupled with fractionation, it was found that high mannose glycans were acquired by monomers of the TGEV S protein, followed by the rate-limiting assembly of monomers into a trimeric structure and terminal glycosylation of the newly assembled trimers. Similarly, SARS-CoV S protein was found to acquire high mannose oligosaccharides and trimerize as early as 30 min after entry into the ER, prior to the acquisition of complex glycans in the Golgi complex. The maturation status of SARS-CoV S protein can thus be monitored by its sensitivity to endoglycosidase H (endo H), which hydrolyzes high mannose glycans but not complex glycans. Mass spectrometry showed that the N-linked glycans on SARS-CoV S protein were composed of high mannose, hybrid and complex glycans with and without bisecting N-acetyl-galactosamine (GalNAc) and core fucose.

With the advent of molecular cloning technologies, the coding sequences of S proteins from numerous coronaviruses were cloned and the putative N-linked glycosylation sites were predicted from the sequence information. For instance, 20 or 21 glycosylation sites were predicted in the S protein of MHV, 19 in bovine enteric coronavirus, 30 in HCoV-229E, 33 in TGEV, 20 or 22 in HCoV-OC43, 29 or 27 in porcine epidemic diarrhea virus (PEDV), 33 in feline enteric coronavirus, 29 or 33 in canine coronavirus, 20 or 21 in canine respiratory coronavirus.
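Sequence-based prediction of this kind usually amounts to scanning for the N-X-S/T sequon, where X is any residue except proline. The following is a minimal Python sketch assuming that standard rule. The function name and the toy peptide are illustrative and are not taken from any coronavirus S protein or from the studies cited here.

```python
import re

# Sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTT") each be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def predict_n_glyc_sites(protein_seq):
    """Return 1-based positions of Asn residues that sit in an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein_seq.upper())]

# Toy peptide, hypothetical and for illustration only.
toy = "MNGTANPSQNNTTK"
print(predict_n_glyc_sites(toy))  # N6 is skipped because it is followed by Pro
```

A scan like this only flags candidate sites. Whether a given sequon is actually occupied has to be established experimentally.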

However, it should be noted that not all of the putative glycosylation sites are functional. In fact, among the 23 putative glycosylation sites in the SARS-CoV S protein, only 12 sites were actually glycosylated, as determined by mass spectrometry following peptide:N-glycosidase F (PNGase F) digestion. Recently, we have used in-solution deglycosylation combined with mass spectrometry to determine the N-linked glycosylation sites in the IBV S protein. As deglycosylation was carried out in 18O-labeled water (H₂¹⁸O), incorporation of ¹⁸O into the Asp formed at each deglycosylated site resulted in a mass increment of 2.98 Da, leading to a more robust identification of glycosylated sites by mass spectrometry. Among the 29 predicted N-linked glycosylation sites, only eight sites were confirmed using this method. Therefore, the majority of the predicted N-linked glycosylation sites on coronavirus S protein may not be modified, possibly due to the massive amount of S protein produced during infection and the limited capacities of the cellular glycosylation enzymes. Some sites may be preferentially modified due to their relatively better spatial availability, while some inefficiently and/or partially glycosylated sites may not reach the detection limit for mass spectrometry. Thus, the predicted glycosylation sites are not fully utilized in coronavirus S protein. Preferential glycosylation on certain critical sites, such as those located within or near the RBD, may be of particular importance in the functionality of S protein.
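The mass increment quoted above can be reproduced with simple arithmetic. PNGase F converts the glycosylated Asn to Asp, so the residue loses one N and one H and gains one O. In 18O-labeled water that oxygen is 18O. The sketch below uses standard monoisotopic masses (rounded); the variable and function names are illustrative and do not come from the cited work.

```python
# Monoisotopic masses in Da (standard values, rounded to six decimals).
H = 1.007825
N = 14.003074
O16 = 15.994915
O18 = 17.999160

def deamidation_shift(oxygen_mass):
    """Mass change when Asn (-CONH2) becomes Asp (-COOH): lose N and H, gain one O."""
    return oxygen_mass - N - H

shift_h2o16 = deamidation_shift(O16)  # conversion in ordinary water
shift_h2o18 = deamidation_shift(O18)  # conversion in 18O-labeled water
print(f"{shift_h2o16:.3f} Da, {shift_h2o18:.3f} Da")
```

The heavy-water shift of about 2.99 Da agrees with the reported 2.98 Da. It is also about 2 Da larger than the roughly 0.98 Da shift from an Asn-to-Asp conversion in ordinary water, which is presumably what makes the labeled sites easier to assign.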

N-linked glycosylation contributes significantly to the conformation of coronavirus S protein, and therefore profoundly affects the receptor binding and antigenicity of S protein. For example, early studies showed that the binding of IBV neutralizing antibodies was dependent on the glycosylation of the IBV S protein. Consistently, mutations that introduced new N-linked glycosylation sites in the S1 domain were shown to contribute to antigenic shifting of IBV. Also, when the S1 domain of BCoV S protein was cloned and expressed in insect cells, the mature protein was glycosylated and bound by neutralizing monoclonal antibodies. In contrast, when cells were infected with TGEV in the presence of tunicamycin, an inhibitor of N-linked glycosylation, the antigenicity of both S and M protein was significantly reduced. Similarly, when the overexpressed full-length homotrimeric SARS-CoV S protein was treated with PNGase F under native conditions, the protein was no longer recognized by neutralizing antisera raised against purified virions. This finding suggests that N-linked glycosylation may play an important role in constituting the native structure of coronavirus S protein, thereby affecting its antigenicity. During its maturation in the ER, SARS-CoV S protein binds to the molecular chaperone calnexin. Compared with control, SARS-CoV S-pseudotyped virions produced in calnexin-knockdown cells contained S protein with aberrant N-glycans and exhibited significantly lower infectivity. As for IBV, we recently showed that N-to-D or N-to-Q mutations at the N-linked glycosylation site N212 or N276 abolished the ability of S protein to induce cell–cell fusion and the infectivity of the corresponding recombinant viruses.

Nonetheless, in some instances, the antigenicity of coronavirus S protein does not depend on its glycosylation status. For example, when the S protein of TGEV was expressed by recombinant baculovirus in insect cells, the recombinant S protein acquired high mannose glycans, but the complete processing into complex glycans was not efficient. However, the recombinant TGEV S protein still exhibited antigenic properties and induced a high level of neutralizing antibodies. Similarly, a potent neutralizing monoclonal antibody against the S1 protein of SARS-CoV could bind to the deglycosylated S1 protein, suggesting that the epitope was not glycosylation-dependent. In one early study, the RBD of SARS-CoV S protein was mapped to amino acid residues 319–518, which contained two potential glycosylation sites, N330 and N357. However, mutation of N330 or N357 to either alanine or glutamine did not affect the binding of the RBD-containing fragment to the cognate receptor ACE2. Later, the structure of the RBD of SARS-CoV S protein complexed with human ACE2 was determined, and neither N330 nor N357 was positioned in the interface where the two proteins interacted. It was thus concluded that glycosylation did not always constitute neutralizing epitopes within the RBD. A later study exploring recombinant RBD of SARS-CoV S protein as a vaccine candidate found that yeast-expressed recombinant RBD (spanning amino acid residues 318–536) with glycosylation sites removed indeed induced a higher level of neutralizing antibody in immunized mice, compared with wild-type RBD.

Although not essential for its binding to the cellular receptor ACE2, N-linked glycosylation of SARS-CoV may still contribute to efficient attachment of virions to the host cells. The C-type lectin DC-SIGN was shown to facilitate cell entry of SARS-CoV. The DC-SIGN binding region was mapped to amino acid residues 324–386 of SARS-CoV, and pseudotyped viruses with mutated N-linked glycosylation sites (N330Q or N357Q) had significantly reduced DC-SIGN-binding capacity. In a separate study, seven glycosylation sites (N109, N118, N119, N158, N227, N589 and N699) in SARS-CoV S protein were also shown to be critical for virus entry mediated by DC-SIGN and/or L-SIGN. The interaction between N-linked glycans and lectins can also negatively affect receptor binding of coronavirus. For example, mannose-binding lectin was shown to interact with SARS-CoV S-pseudotyped virus and block viral binding to DC-SIGN, and N-linked glycosylation at N330 was found to be critical for the specific interaction between mannose-binding lectin and SARS-CoV S protein. Since N330 is also critical for DC-SIGN binding, competitive binding between the two lectins to N-linked glycans on SARS-CoV S protein may have some implications in the attachment and entry of virions. Finally, LSECtin, a lectin coexpressed with DC-SIGN on sinusoidal endothelial cells in the liver and lymph node, was also shown to interact with SARS-CoV S-pseudotyped virus.

N-linked glycosylation may also contribute to the activation of innate immune response in coronavirus-infected cells. Pretreatment of TGEV-infected cells with the plant lectin concanavalin A before exposure to porcine peripheral blood mononuclear cells led to a dose-dependent reduction in the induction of IFN-α. Also, inhibition of N-linked glycosylation by tunicamycin or removal of N-linked glycans by PNGase F reduced TGEV-induced IFN-α production. Therefore, N-linked glycans on coronavirus S protein may be a pathogen-associated molecular pattern recognized by host pattern recognition receptors, which in turn activate downstream antiviral innate immune response. However, compared with the parental PEDV strain, the more effective host immune response against the cell attenuated Zhejiang08 strain was associated with the lack of a potential glycosylation site in its S protein. Thus, the effect of S protein glycosylation on the immune response is complex, which may vary depending on the specific coronavirus and host system in question.

Caution should also be taken regarding the biological systems used to express the coronavirus S protein. For example, a recent study evaluated the antigenicity of recombinant IBV S1 protein expressed in mammalian cells. The result showed that the recombinant S1 protein was highly glycosylated and was able to induce the production of antibodies against S1 in immunized chickens. However, these antibodies had lower neutralizing activity compared with those generated by chickens immunized with inactivated IBV. Therefore, the glycosylation pattern of IBV S protein synthesized in mammalian cells may differ from that produced in avian cells, thereby affecting its antigenicity in vivo. Similarly, the glycosylation pattern of other coronavirus proteins may also be differentially affected by the expression systems, thereby changing their behaviors in relevant functional assays.

Interestingly, some of the known cellular receptors for coronavirus have also been shown to be modified by glycosylation. N-linked glycosylation of CEACAM1, the cellular receptor protein of MHV, was found essential for its binding to MHV-A59 virions, although recombinant proteins with mutations in the three N-linked glycosylation sites in the N-terminal domain were still functional. On the other hand, insertion of an N-linked glycosylation site into human APN, the receptor for HCoV-229E, abolished its activity to bind HCoV-229E virions. Similarly, N-linked glycosylation of DPP4, the cognate receptor of MERS-CoV, dramatically affects its binding to MERS-CoV S protein. Normally, mouse DPP4 does not support MERS-CoV entry. However, when the N328 glycosylation site was mutated in the presence of a secondary mutation A288L, the binding affinity of mouse DPP4 to MERS-CoV was significantly increased. Conversely, when the corresponding glycosylation site was introduced to human DPP4, the binding of MERS-CoV was significantly reduced. Therefore, glycosylation of coronavirus receptors contributes significantly to the host tropism of coronavirus infection, although additional sequence and structural determinants of S protein are also involved.


Palmitoylation

Palmitoylation of coronavirus S protein was initially identified in cells infected with MHV-A59, as 3H-palmitate was found to be incorporated in unglycosylated S protein in MHV-infected cells treated with tunicamycin. Treatment with the palmitoyl acyltransferase inhibitor 2-bromopalmitate at a nontoxic dose reduced palmitoylation of MHV S protein and led to a significant reduction in the infectivity of MHV. Reduction of S palmitoylation correlated with a decreased level of S associated with M protein and subsequent exclusion of S from virions. However, underpalmitoylated S protein could still be expressed on the cell surface to induce cell–cell fusion. The C1347F/C1348S mutant virus harboring mutations in the putative palmitoylation sites exhibited reduced infectivity, further supporting the importance of palmitoylation in virion assembly and infectivity. Using antiviral heptad repeat peptides that only bind to folding intermediates of the fusion process, it was found that MHV S mutants lacking the palmitoylated cysteines were trapped in transitional folding states almost ten times longer than wild-type MHV S protein, leading to slower cell entry and reduced infectivity. In a later study using reverse genetics, the nine cytoplasmic cysteines in MHV S protein were singly or doubly substituted to alanine. Interestingly, no single specific cysteine in the MHV S endodomain was essential for viral replication, but a minimum of three cysteines within the motif, independent of position, was required for the recovery of viable recombinant MHV.

The cytoplasmic portion of SARS-CoV S protein contains four cysteine-rich clusters. Mutational analysis showed that cysteine clusters I and II were modified by palmitoylation. Although cell surface expression of SARS-CoV S protein was not significantly affected by mutations in cysteine clusters I and II, S-mediated cell fusion was markedly reduced compared with wild-type protein, suggesting that palmitoylation in the endodomain may be required for the fusogenic activity of SARS-CoV S protein. In a later study, a recombinant nonpalmitoylated SARS-CoV S protein was generated by mutating all nine cytoplasmic cysteines to alanines. Using this nonpalmitoylated mutant, it was shown that, similar to MHV S protein, palmitoylation of the SARS-CoV S protein was required for its partitioning into detergent-resistant membranes and for cell–cell fusion. However, unlike MHV S protein, palmitoylation of SARS-CoV S protein was not required for S–M interaction. Interestingly, treatment with nitric oxide or its derivatives led to a reduction in the palmitoylation of SARS-CoV S protein, which affected its binding to the cognate receptor ACE2.

The S protein of the Alphacoronavirus TGEV is also modified by palmitoylation, and inhibition of palmitoylation by 2-bromopalmitate treatment reduced TGEV replication in cell culture. Although palmitoylation of TGEV S protein was essential for its incorporation into virus-like particles (VLP), the interaction between TGEV S and M proteins was not affected by the lack of palmitoylation. Therefore, dependent on the coronavirus in question, palmitoylation may differentially affect the folding, fusogenic activity and/or protein–protein interaction of S protein. Palmitoylation of S protein has not been characterized for other coronaviruses.


N-linked glycosylation

Based on sequence prediction, SARS-CoV E protein contains two potential N-linked glycosylation sites on N48 and N66, whereas IBV E contains one potential site on N5. Although a topological study demonstrated that IBV E protein spanned the membrane once with a luminal N-terminus and a cytoplasmic C-terminus, the glycosylation site on N5 was not functional. On the other hand, SARS-CoV E protein in transfected cells seemed to adopt two distinct membrane topologies. In one form, both the N- and C-termini were exposed to the cytoplasmic side and the protein was not modified by glycosylation. In an alternative minor form, SARS-CoV E protein was shown to be glycosylated on N66, with the C-terminus exposed to the luminal side. A later study using transfected SARS-CoV E protein with an N-terminal Myc-tag confirmed that SARS-CoV E protein was glycosylated co-translationally. Although the two putative TM domains were required for its interaction with the SARS-CoV M protein, the hydrophilic region (60–76) flanking the N66 glycosylation site was dispensable, as shown by co-immunoprecipitation experiments. The glycosylation of SARS-CoV E protein during actual infection and its biological function remain to be further investigated.


Palmitoylation

All three cysteine residues (C40, C43 and C44) in SARS-CoV E protein are also modified by palmitoylation, which may regulate its subcellular trafficking and association with lipid rafts. In fact, when the homologous cysteine residues in the E protein of MHV-A59 (C40, C44 and C47) were doubly or triply mutated to alanine, its ability to induce VLP formation was significantly reduced. Moreover, MHV E protein carrying triple mutations (C40A/C44A/C47A) was prone to degradation, and the corresponding recombinant MHV had significantly reduced yield compared with wild-type. While wild-type MHV E protein mobilized co-expressed M protein into detergent-soluble secreted forms, in cells expressing the triple C-to-A MHV E protein, the co-expressed M protein accumulated into detergent-insoluble complexes that were not secreted. Therefore, palmitoylation of MHV E protein contributes to its stability and biological activity during assembly of mature virions. On the other hand, palmitoylation of SARS-CoV E protein is not required for its association with N protein and VLP production, and is thus possibly dispensable for SARS-CoV assembly.

O-linked glycosylation

O-linked glycosylation of the MHV M protein was first discovered in 1981. It was found that in the presence of tunicamycin, an inhibitor of N-linked glycosylation, synthesis of the S protein was completely inhibited, but M protein was still normally produced and glycosylated, resulting in the formation of noninfectious virions containing normal amounts of N and M protein, but lacking S completely. When it was expressed from transfected cDNA, M protein of MHV-A59 also underwent O-linked glycosylation and was localized in the Golgi region. The structures of the O-linked glycans of MHV-A59 M protein were characterized, and pulse-chase labeling experiments showed that the O-linked glycans were acquired in a two-step process: GalNAc was added before the addition of galactose and sialic acid. After the sequential acquisition of GalNAc, galactose and sialic acid, the M protein of MHV-A59 was further modified in the trans-Golgi network. Apart from MHV, O-linked glycosylation was also found in the M protein of two other lineage A Betacoronaviruses: BCoV and HCoV-OC43. Since its discovery, O-linked glycosylation has been used as a marker to study the maturation, membrane insertion and intracellular trafficking of MHV M protein. In fact, due to its high expression level in transfected or MHV-infected cells, the M protein of MHV has also been used as a model protein to study O-linked glycosylation and vesicular trafficking between ER and the Golgi compartments.

Initial studies proposed the four highly conserved hydroxyamino acids (S2, S3, T4 and T5) at the extreme N terminus of MHV M protein as the putative O-linked glycosylation sites. Follow-up investigations further pinpointed T5 as the functional acceptor site, and the downstream P8 was also required for efficient O-linked glycosylation. However, the hydroxyamino acid cluster per se was not sufficient, as downstream amino acids must also be included to introduce a functional O-linked glycosylation site into a foreign protein. Interestingly, in the highly virulent strain MHV-2, the S-S-T-T sequence was mutated to N-S-T-T, and N-linked glycosylation was shown to be added to the N2 residue. However, whether the presence of extra sugars affects the function of MHV-2 M protein is not fully understood.
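The appearance of an N-linked glycan in MHV-2 is consistent with the usual N-X-S/T sequon rule (X not proline): replacing S2 with N places an asparagine two residues upstream of T4. The small Python check below assumes that rule and an initiator methionine at position 1, which is implied by the S2 to T5 numbering but not stated above. Only the first five residues are modeled, and the variable names are illustrative.

```python
import re

def sequon_positions(seq):
    """1-based positions of Asn in N-X-S/T sequons (X is not Pro)."""
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]

# N-terminal fragments; the leading "M" is an assumption, not from the text.
s_s_t_t_variant = "MSSTT"  # S-S-T-T cluster (S2, S3, T4, T5)
n_s_t_t_variant = "MNSTT"  # N-S-T-T cluster as described for MHV-2

print(sequon_positions(s_s_t_t_variant), sequon_positions(n_s_t_t_variant))
```

Under these assumptions the S-S-T-T fragment contains no sequon, while the N-S-T-T fragment yields a single candidate site at position 2, matching the N2 residue named in the text.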

O-linked glycosylation is not essential for the assembly of MHV virions, as mutations that abolished the normal O-linked glycosylation site did not inhibit the budding of infectious virions or growth kinetics in cell culture. However, it was found that recombinant MHV containing N-linked glycosylated M protein induced a higher level of type I interferon compared with the wild-type MHV with O-linked glycosylated M protein, whereas MHV with nonglycosylated M protein was a poor interferon inducer in cell culture. The in vitro interferongenic capacity also correlated with the abilities of these viruses to replicate in the liver of infected mice, suggesting that glycosylation status of M protein might affect the induction of innate immune response by MHV infection.

N-linked glycosylation

Distinct from the O-linked glycosylation observed in the M protein of MHV, BCoV and HCoV-OC43, the M proteins of the Alphacoronaviruses TGEV and PEDV, as well as the Gammacoronaviruses IBV and turkey enteric coronavirus, are all modified by N-linked glycosylation, which is sensitive to endo H and can be inhibited by tunicamycin. The N-linked glycosylation sites were mapped to N3 and N6 of IBV (unpublished data from this group). Within the Betacoronavirus genus, M protein of coronaviruses in other lineages is also N-linked glycosylated. For example, SARS-CoV M protein contains a single N-glycosylation site at N4. When transiently transfected as a C-terminally FLAG-tagged protein, SARS-CoV M protein was found to obtain high-mannose N-glycans that were modified into complex N-glycans in the Golgi. However, in a later study using SARS-CoV-infected cells and purified SARS-CoV virions, glycosylated M protein was shown to remain endo H sensitive, suggesting that trimming and maturation of N-linked glycans were inhibited during actual SARS-CoV infection.

Similar to O-linked glycosylation of MHV, N-linked glycosylation of SARS-CoV M protein is not essential for viral replication, as recombinant SARS-CoV with glycosylation-deficient M protein had normal virion morphology and retained its infectivity in cell culture. However, unlike O-linked glycosylation that conferred IFN antagonism to the MHV M protein, the IFN-antagonizing activity of SARS-CoV M protein was independent of N-linked glycosylation and might be mediated through its first TM domain.


Phosphorylation

Phosphorylation of coronavirus N protein was first described in ip60K cells infected with MHV-JHM, where a protein kinase associated with purified virions was shown to transfer the γ-phosphate of ATP to serine residues of the MHV N protein. A later study showed that the MHV-JHM N protein was synthesized initially in a nonphosphorylated 57-kDa form detected exclusively in the cytosol, while the subsequent phosphorylated 60-kDa form was associated with the cellular membrane fraction and mature virion. Similarly, 32P-orthophosphate labeling showed that the phosphorylation level of IBV N protein was significantly higher in the virion than in the infected cell lysates. In sharp contrast, only the phosphatase-insensitive nonphosphorylated form of N protein was detected in extracellular virions of BCoV-infected cells, suggesting that dephosphorylation of BCoV N protein may facilitate its specific assembly. Therefore, the phosphorylation status of N protein may regulate assembly differently in different coronaviruses.

Phosphorylation sites and the corresponding protein kinases have been identified for some coronaviruses. For the Alphacoronavirus TGEV, four phosphorylation sites have been identified in the N protein, namely S9, S156, S254 and S256. Using mass spectrometry, two clusters of phosphorylation sites were identified in IBV, namely amino acid residues S190/S192 and T378/S379. Importantly, although both phosphorylated and nonphosphorylated IBV N protein bound to viral RNA with the same affinity, phosphorylated N protein discriminated more strongly than nonphosphorylated N protein, binding viral RNA with higher affinity than nonviral RNA. This suggests that N phosphorylation may facilitate the differential recognition of viral RNA. Consistently, using a reverse genetic system based on Vaccinia virus, Spencer et al. showed that IBV N protein was essential for the recovery of recombinant IBV, and that phosphorylated IBV N protein was more efficient than partially or nonphosphorylated N protein. Phosphorylation at T378 and S379 of IBV N protein was shown to be dependent on ATR, a kinase activated during IBV replication. However, recombinant IBV harboring alanine substitutions at all four putative phosphorylation sites (S190A/S192A/T378A/S379A) could still be recovered and grew at a rate similar to wild-type IBV, suggesting that ATR-dependent phosphorylation of N protein is not essential for IBV replication in vitro.

As for Betacoronaviruses, the N protein of SARS-CoV can be phosphorylated by multiple host kinases, including cyclin-dependent kinase, glycogen synthase kinase, mitogen-activated protein kinase and casein kinase II. Using mass spectrometry analysis, six phosphorylation sites (S162, S170, T177, S389, S424 and T428) on the MHV-A59 N protein were identified. Phosphorylation of the N protein of SARS-CoV and MHV-JHM by the host protein GSK-3 was precisely mapped to S197 and S177 in the serine-arginine rich region, respectively. Moreover, inhibition of GSK-3 by kenpaullone significantly reduced the phosphorylation level of N protein, as well as the supernatant virus titer and cytopathic effects in SARS-CoV-infected VeroE6 cells. Therefore, phosphorylation of the N protein appears to be essential for the replication of some Betacoronaviruses. In fact, a recent study showed that phosphorylation of the MHV-JHM N protein by GSK-3 allowed the recruitment of RNA helicase DDX1 to facilitate template read-through, enabling the synthesis of genomic RNA and longer sgRNAs. On the other hand, when N protein was not phosphorylated, template switching was favored during transcription, leading to the preferential generation of shorter sgRNAs but not genomic RNA or longer sgRNAs. Therefore, the phosphorylation status of MHV-JHM N protein acts as a switch to regulate the process of genome replication/transcription.

Phosphorylation of the SARS-CoV N protein may also affect its nucleocytoplasmic shuttling, which is mediated by its interaction with the host adapter protein 14-3-3. Additionally, SARS-CoV N protein was shown to translocate to cytoplasmic stress granules in response to cellular stress, while phosphorylation in the serine-arginine rich region inhibited this translocation. Since stress granules play important roles in translation control and antiviral immune response, phosphorylation of N protein may be a strategy used by SARS-CoV to antagonize host antiviral mechanisms. Finally, compared with SARS-CoV N protein expressed in Escherichia coli, recombinant SARS-CoV N protein produced by the baculovirus system in insect cells showed significantly higher immunoreactivity and antigenic specificity. As dephosphorylation by PP1 also reduced the immunoreactivity of SARS-CoV N protein, it was proposed that phosphorylation might also contribute to the antigenicity of SARS-CoV N protein.

Proteolytic cleavage, sumoylation & ADP-ribosylation

One early study showed that the N protein of TGEV was cleaved at D359 during the late stage of infection, presumably by the activated caspase-6 and -7 during TGEV-induced apoptosis. Similarly, the N protein of SARS-CoV was also cleaved at D400 and D403 by caspases during lytic infection in Vero E6 and A549 cells, but not during persistent infection in Caco-2 and N2a cells. Cleavage of the SARS-CoV N protein was mediated by caspase-6 and/or caspase-3, and was dependent on the nuclear localization of the N protein. We have also observed cleavage of the IBV N protein during late-stage IBV infection. Thus, proteolytic cleavage of the N protein may be a common outcome associated with coronavirus-induced apoptosis in the infected cells, although the biological significance is not known. Presumably, coronavirus N protein may compete with other caspase substrates for cleavage, so as to promote cell survival and prolong the duration of virion release.

Yeast two-hybrid screen identified Ubc9, a host protein involved in sumoylation, as an interacting partner of SARS-CoV N protein. Biochemical analysis confirmed that SARS-CoV N protein was modified by sumoylation at lysine 62, which significantly promoted homo-oligomerization of the N protein. The biological significance of this modification on the viral replication and coronavirus–host interactions remains to be investigated.

A novel form of PTM known as ADP-ribosylation was recently recognized, in which single or multiple ADP-ribose moieties are covalently attached to a protein. This process is catalyzed by enzymes called poly-ADP-ribose polymerases and utilizes nicotinamide adenine dinucleotide as the ADP-ribose donor. Interestingly, N proteins of MHV, PEDV, SARS-CoV and MERS-CoV were all shown to be ADP-ribosylated in the infected cells, while ADP-ribosylated MHV N protein was also detected in the purified virions. Notably, MHV N protein expressed from transfected plasmids was only ADP-ribosylated in the context of virus infection, suggesting that enzymes catalyzing this modification are activated by coronavirus infection and additional viral components may be involved.

Glycosylation of nsp3 & nsp4

Among all the coronavirus nsps, three are known to contain TM domains that facilitate their insertion into the ER membrane. Nsp3 and nsp4 have two and four TM domains, respectively, while nsp6 contains six TM domains with a hydrophobic C-terminal cytosolic tail. These three nsps are proposed to reorganize the ER membrane to form DMVs and to facilitate the assembly and anchorage of the replication/transcription complex to the DMVs. In fact, co-expression of SARS-CoV nsp3, nsp4 and nsp6 induced DMV formation in the transfected cells. A more recent study showed that, for both MERS-CoV and SARS-CoV, co-expression of nsp3 and nsp4 was already sufficient to induce DMV formation. On the other hand, overexpression of coronavirus nsp6 induced the formation of autophagosomes, but at the same time restricted their expansion. Therefore, nsp3, nsp4 and nsp6 are closely associated with cellular membrane dynamics in coronavirus-infected cells.

Given their multispanning membrane topology, it is not surprising that some of the luminal domains undergo N-linked glycosylation in the ER (Figure 6). For example, MHV nsp3 is inserted into the ER co-translationally and glycosylated at N1525. Glycosylation of nsp4 was first identified in IBV (Lim et al., 2000). By glycosidase digestion and site-directed mutagenesis, the glycosylation site of IBV nsp4 was confirmed to be at N48. As for the nsp4 of MHV, two glycosylation sites were predicted at N176 and N237. In one early study using reverse genetics, it was found that whereas recombinant MHV harboring the nsp4-N176A mutation replicated identically to the WT control, nsp4-N237A was lethal and no recombinant virus could be recovered. In a later study using the same infectious clone system based on MHV-A59, Gadlage et al. successfully recovered recombinant MHV with the N176A, N237A or N176A/N237A mutation in nsp4. Interestingly, all nsp4 glycosylation mutants exhibited aberrant DMV morphology and were defective in viral RNA synthesis and virus growth, supporting a critical role of N-linked glycosylation in the DMV formation activity of MHV nsp4. In a recent follow-up study, mutations outside the glycosylation sites were introduced in MHV nsp4. Similar to the glycosylation mutants, some of these mutants also exhibited altered DMV morphology. However, only mutations in the nsp4 glycosylation sites resulted in a loss of fitness in the recombinant MHV. Therefore, apart from DMV formation, N-linked glycosylation of MHV nsp4 may serve other critical roles during viral replication.
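
The paragraph above refers to predicted N-linked glycosylation sites. As a rough illustration only, candidate sites are commonly flagged by scanning a protein sequence for the N-X-S/T sequon, where X is any residue except proline. The sketch below assumes this general heuristic; it is not necessarily the method used in the studies cited here, and a sequon match indicates only a candidate site, which still needs experimental confirmation such as glycosidase digestion or mutagenesis. The function name and the toy sequence are hypothetical.

```python
import re
from typing import List

# N-X-S/T sequon, where X is any residue except proline.
# The lookahead consumes only the asparagine, so overlapping
# sequons (for example "NNST") are each reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> List[int]:
    """Return 1-based positions of asparagines inside N-X-S/T sequons."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence, not a real coronavirus protein.
toy = "MANGSAANPTLLNNSTQ"
print(find_sequons(toy))  # [3, 13, 14]; N8 is skipped because X is proline
```

Reporting 1-based positions matches the residue numbering convention used in the text (for example N176), so the output can be compared directly with annotated sites.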

Disulfide bond formation of nsp9 of HCoV-229E

Coronavirus nsp9 has been characterized as an ssRNA-binding protein. The colocalization of nsp9 with other replicase proteins and its interaction with the coronavirus RdRP suggested that the ssRNA-binding activity of nsp9 might play a role during coronavirus genome transcription/replication. Crystallography studies showed that SARS-CoV nsp9 formed a homodimer, and higher oligomers could be observed in solution using glutaraldehyde cross-linking. Surprisingly, in spite of 45% sequence homology, nsp9 of HCoV-229E (but not SARS-CoV) was shown to form a homodimer linked by a disulfide bond (Figure 6). Mutation of the disulfide bond-forming cysteine 69 to either alanine or serine significantly reduced the binding affinity of HCoV-229E nsp9 to ssRNA or ssDNA, as determined by surface plasmon resonance experiments. Although disulfide bonds are rare in cytosolic proteins, a disulfide-bonded form of nsp9 may be correlated with oxidative stress induced by HCoV-229E infection.

Ubiquitination of nsp16 of SARS-CoV

Coronavirus nsp16 has been identified as a nucleoside-2′-O-methyltransferase (2′-O-MTase). By modifying the cap-0 structure at the ribose 2′-O position of the first nucleotide to form cap-1 structures, nsp16 enables the viral RNA to avoid detection by the cytoplasmic pattern recognition receptor MDA5. In fact, compared with the wild-type control, recombinant virus lacking the nsp16 2′-O-MTase activity induced a high level of type I interferon in the infected cells, and viral replication was highly sensitive to the antiviral function of exogenous interferon. Using yeast two-hybrid screening, von Hippel-Lindau (VHL), a component of an E3 ubiquitin ligase, was found to interact with SARS-CoV nsp16. Overexpression of VHL promoted the ubiquitin-proteasomal degradation of SARS-CoV nsp16, while knockdown of VHL increased the protein stability of nsp16. However, the precise ubiquitination site in SARS-CoV nsp16 has not been mapped, and similar modifications of nsp16 in other coronaviruses have not been characterized. Also, the physiological significance of nsp16 ubiquitination remains to be investigated using recombinant viruses in the setting of actual coronavirus infection.

PTMs of coronavirus accessory proteins

Apart from the structural and nonstructural proteins, the coronavirus genome also encodes various accessory proteins, most of which share no homology with any known proteins. These accessory proteins are dispensable for viral replication in cell culture. In fact, when the coding sequences of accessory proteins were deleted by reverse genetics, the resulting recombinant viruses still replicated similarly to wild-type virus. However, some of the coronavirus accessory proteins are incorporated into mature virions, while others have been implicated in the modulation of host immune response and in vivo pathogenesis. Only a few coronavirus accessory proteins are known to be modified by PTMs (Figure 6 & Table 1).

Apart from the S protein, some Betacoronaviruses also encode the HE protein, which forms homodimers and constitutes a second type of shorter projections on the virion surface. Similar to S protein, the HE protein of MHV was also found to be modified by N-linked glycosylation, which was inhibited by tunicamycin but not monensin. The HE protein of BCoV was also shown to be glycosylated when expressed using a human adenovirus vector. The importance of N-linked glycosylation for the function of coronavirus HE protein has not been fully characterized.

Interestingly, although SARS-CoV M protein is N-linked glycosylated, its accessory protein 3a is O-linked glycosylated. The SARS-CoV protein 3a and M share the same N-exo/C-endo membrane topology, and both proteins contain three TM domains. O-linked glycans of the SARS-CoV protein 3a were resistant to PNGase F treatment, and pulse-chase analysis suggested that the oligosaccharides were acquired post-translationally. Protein 3a has been implicated in modulating host immune response, such as upregulating fibrinogen expression and the production of proinflammatory cytokines. However, whether O-linked glycosylation contributes to the immune-modulating activities of SARS-CoV protein 3a is not known.

In animal isolates and early human isolates, the sgRNA8 of SARS-CoV encoded a single protein 8ab. However, in later human isolates during the peak of SARS-CoV epidemic, a 29-nt deletion in the center split ORF8 into two smaller ORFs, encoding proteins 8a and 8b respectively. Whereas protein 8ab is co-translationally imported into the ER and is N-linked glycosylated at N81, protein 8b is synthesized in the cytosol and not modified. Both proteins 8b and 8ab were shown to interact with mono-ubiquitin and polyubiquitin, and both were also modified by ubiquitination. However, whereas glycosylation at N81 stabilized protein 8ab and protected it from proteasomal degradation, protein 8b was highly unstable and underwent rapid proteasomal degradation. The ubiquitinated 8b and 8ab may mediate rapid degradation of IRF3 and regulate host antiviral innate immunity.

The accessory protein 3b of TGEV is encoded between the S and M genes in the Purdue strain, but it is truncated in some lab-passaged strains. TGEV 3b protein is translated via an internal entry mechanism, possibly in conjunction with leaky scanning. In cells infected with the Purdue strain of TGEV, two forms of 3b protein were detected: a 31 kDa N-glycosylated, membrane-associated form and a 20 kDa nonglycosylated soluble form. The TGEV 3b protein was not essential for viral replication and was not incorporated into mature virions. Its role in pathogenesis is not completely understood, although deletion of ORF3b was found in some naturally attenuated TGEV strains such as Miller M60.

Deubiquitinating activity of coronavirus PLPro

Coronaviruses encode one or two PLPro domains in nsp3, which carry out the proteolytic cleavage that releases nsp1, nsp2 and nsp3 from the polyprotein. Apart from its protease activity, SARS-CoV PLPro was also shown to possess deubiquitinating (DUB) activity, which was later identified for PLP2 of HCoV-NL63 and MHV-A59, as well as PLPro of MERS-CoV and IBV. Structural studies revealed that SARS-CoV PLPro shared a similar fold with known DUB enzymes, but exhibited several distinct features. Later studies showed that apart from ubiquitin, the PLPro of SARS-CoV and MERS-CoV also recognized another ubiquitin-like modifier, interferon-stimulated gene 15 (ISG15), and served as a deISGylating enzyme. Interestingly, the DUB/deISGylating activity of coronavirus PLPro could be separated from its protease activity. The crystal structure of SARS-CoV PLPro in complex with a human ubiquitin analog has been determined, and certain mutations in the interacting regions were shown to compromise ubiquitin binding without affecting the protease activity of PLPro. Similarly, using the structure of MERS-CoV PLPro in complex with Ub as a guide, mutations were introduced into PLPro that specifically disrupted the DUB function without affecting its proteolytic activity. Unlike wild-type PLPro, the DUB-deficient variants failed to suppress IFN promoter activation.

In terms of biochemistry, PLPro from different coronaviruses seems to have slightly different substrate specificities and enzyme properties. SARS-CoV PLPro greatly prefers K48-linked to K63-linked ubiquitin chains. The specificity of SARS-CoV PLPro toward polyUb(K48) was proposed to be determined by its extended conformation and binding via two contact sites. In contrast, the PLPro of MERS-CoV cleaves polyUb chains with broad linkage specificity. Also, whereas MERS-CoV PLPro cleaves polyUb chains one Ub at a time, SARS-CoV PLPro cleaves K48-linked polyUb chain in a ‘di-distributive’ manner – that is, removing a di-Ub moiety at a time.
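
The difference between the two cleavage modes described above can be pictured with a small counting exercise. The sketch below is a toy model only: it treats a polyUb chain as a number of Ub units and trims it by a fixed step size, ignoring linkage type, binding sites and kinetics. The function name and the step-size abstraction are assumptions made for illustration and do not come from the cited studies.

```python
from typing import List


def cleavage_products(chain_length: int, step: int) -> List[int]:
    """List the sizes of the fragments produced when a chain of
    `chain_length` Ub units is trimmed `step` units at a time.

    Toy model: step=1 stands for removing one Ub at a time,
    step=2 for the 'di-distributive' mode (one di-Ub at a time).
    The last entry is whatever is left once no further full cut is possible.
    """
    if chain_length < 1 or step < 1:
        raise ValueError("chain_length and step must be positive")
    products = []
    remaining = chain_length
    while remaining > step:
        products.append(step)   # fragment released by this cut
        remaining -= step
    products.append(remaining)  # fragment left after the final cut
    return products


print(cleavage_products(4, 1))  # [1, 1, 1, 1]: four mono-Ub units
print(cleavage_products(4, 2))  # [2, 2]: two di-Ub units
```

In this simplified picture, a tetra-Ub chain yields mono-Ub when trimmed one unit at a time, but only di-Ub when trimmed two units at a time.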

Since ubiquitination and ISGylation are critical for signal transduction in innate immunity, the DUB and deISGylating activities of coronavirus PLPro are well characterized as antagonists of the host antiviral response (Figure 7). Initial studies identified SARS-CoV PLPro as a potent IFN antagonist that interacts with IRF3 and inhibits its phosphorylation and nuclear translocation, thereby blocking type I IFN production. Subsequently, it was found that SARS-CoV PLPro could also inhibit TNFα-induced NF-κB activation and block the production of proinflammatory cytokines and chemokines in activated cells. The IFN antagonist activity of coronavirus PLPro can be mediated by multiple mechanisms, which may or may not involve its protease and DUB activities. PLP2 of MHV-A59 was found to directly deubiquitinate IRF3 and prevent its nuclear translocation. It also deubiquitinated upstream TBK1 and reduced its kinase activity, thereby inhibiting IFN signaling. PLPro of SARS-CoV was shown to remove K63-linked ubiquitin chains from TRAF3 and TRAF6, thereby suppressing the activation of TBK1 in cells treated with a TLR7 agonist. Alternatively, membrane-anchored SARS-CoV PLPro might physically interact with the STING-TRAF3-TBK1 complex to inhibit the phosphorylation and dimerization of IRF3, thereby suppressing the STING/TBK1/IKKε-mediated activation of type I IFN. Finally, using a constitutively active phosphomimetic IRF3, it was recently shown that the DUB activity of PLPro also inhibited IRF3 at a postactivation step.

Other coronavirus proteins that modulate PTMs of host proteins

Apart from the well-characterized DUB/deISGylation activities of coronavirus PLPro, other coronavirus proteins have also been implicated in regulating PTMs of host proteins (Figure 7). For example, in addition to the DUB activity encoded in the nsp3 of SARS-CoV, its SARS-unique domain (SUD) can also enhance a cellular E3 ubiquitin ligase called RING finger and CHY zinc finger domain-containing protein 1 (RCHY1), which leads to proteasomal degradation of p53. Cellular p53 inhibits replication of SARS-CoV and HCoV-NL63, presumably by activating genes involved in innate immunity. Thus, by targeting p53 for RCHY1-mediated ubiquitination and proteasomal degradation, the SUD of nsp3 may contribute to the pathogenesis of SARS-CoV. Similarly, SARS-CoV ORF9b was found to localize to mitochondria and induce ubiquitination and proteasomal degradation of DRP1, leading to the elongation of mitochondria. It might also hijack a ubiquitin E3 ligase called AIP4 to trigger the degradation of MAVS, TRAF3 and TRAF6, thereby significantly suppressing IFN responses.

On the other hand, ubiquitination of some cellular proteins is suppressed by coronavirus proteins. TRIM25 is an E3 ubiquitin ligase that associates with and activates RIG-I by mediating its ubiquitination. The N protein of SARS-CoV was found to bind to the SPRY domain of TRIM25 and inhibit TRIM25-dependent RIG-I activation, thereby suppressing the type I IFN production induced by poly(I:C) or Sendai virus. Similarly, the accessory protein 6 of SARS-CoV was shown to interact with the IFN-signaling pathway-mediating protein Nmi and promote its ubiquitin-dependent proteasomal degradation, thereby potentially modulating the virus-induced innate immune response.

The enzymatic activity of some nonstructural proteins can directly modify some host proteins. For example, porcine deltacoronavirus (PDCoV) nsp5 was shown to mediate the cleavage of NF-κB essential modulator at glutamine 231, thereby significantly inhibiting IFN-β production induced by Sendai virus infection. Later, it was shown that the nsp5 of PDCoV also cleaved STAT2 and impaired its ability to induce the expression of ISGs. Therefore, PDCoV nsp5 mediates the cleavage of key players to inhibit both the production and signaling of type I interferons.

Conclusion

Accumulating evidence suggests that coronavirus proteins are subject to various PTMs by the host cell. Transmembrane structural proteins (S, E and M), nonstructural proteins (nsp3 and nsp4) and accessory proteins (such as SARS-CoV 3a) are modified by glycosylation. Although glycosylation of coronavirus S protein is essentially N-linked, the M proteins of lineage A Betacoronavirus carry O-linked glycans, while the M proteins of other coronaviruses are modified by N-linked glycosylation. Some coronavirus S and E proteins are palmitoylated at cytosolic cysteine residues, while the N protein is mainly phosphorylated by multiple host kinases. The conserved types and sites of PTMs on these proteins suggest that co-option of PTMs to regulate coronavirus proteins has a long evolutionary history, while the diversity of PTMs on numerous coronavirus proteins highlights their importance in viral replication and pathogenesis.

PTMs contribute significantly to the functions of coronavirus proteins. Apart from facilitating the folding and intracellular trafficking of the coronavirus S protein, N-linked glycans also constitute a significant part of the protein mass and profoundly affect the conformation of the mature S protein and its binding to surface receptors. N-linked glycans may play a role in the antigenicity of S protein, and glycosylation may also contribute to the induction of innate immune response, thereby affecting viral pathogenesis. Phosphorylation of coronavirus N protein improves its selective binding to viral RNA and may regulate the uncoating and assembly process during replication. Importantly, the phosphorylation status of MHV-JHM N protein, controlled by the host kinase GSK-3, acts as a switch to regulate genome replication/transcription, although a similar mechanism has not been described for other coronaviruses. Coronaviruses also employ multiple mechanisms to interfere with PTMs of host proteins. In particular, the DUB and deISGylating activities encoded by nsp3 suppress the induction and signaling of type I interferons.

Future perspective

The functional implication of PTMs on many coronavirus proteins has not been fully characterized, and their biological significance requires further investigation combining reverse genetics and suitable in vivo models. However, two obstacles have hindered investigation into the function of a PTM at a specific site of a coronavirus protein in virus replication and pathogenesis. The first is the presence of multiple modification sites on functionally important domains of a protein, as exemplified by the more than 20 predicted N-linked glycosylation sites on various functional domains of coronavirus S protein and the multiple phosphorylation sites in coronavirus N protein. The second is the absence of sensitive and specific methods for detecting individual PTMs in live cells and infectious particles. In addition, it appears that the functional effect of mutation at a canonical site can be compensated by the same PTM at an alternate site. This is especially true for proteins with multiple sites for a certain PTM, such as N-linked glycosylation of S protein and phosphorylation of N protein.

With the advent of innovative labeling techniques (such as H₂¹⁸O labeling) and the ever-growing capacity of mass spectrometry, systematic identification of conventional and novel PTMs will be greatly accelerated over the next decade. Also, as we better understand the detailed molecular mechanisms behind PTMs, functional studies will shift from relying on less specific inhibitors to targeted depletion of key modifying enzymes using gene knockdown/knockout approaches based on CRISPR technologies. Assisted by detailed structural and biochemical investigation of PLPro and other coronavirus proteins, future studies will further reveal how these proteins interfere with host PTMs and modulate viral pathogenesis. Undoubtedly, coronavirus reverse genetics will remain the cornerstone for characterizing the biological significance of PTMs in coronavirus proteins, which will also be facilitated by the recent development of various transgenic in vivo models.

In terms of translational applications, PTMs of coronavirus proteins and the interference with PTMs of host proteins by coronavirus proteins may be attractive targets for therapeutic intervention. For instance, carbohydrate-binding agents that directly interact with glycans on the virion surface may be able to suppress virus attachment and entry. Because more than one N-linked glycosylation site is present, multiple mutations will be required for the virus to develop drug resistance. On the other hand, recombinant coronaviruses with the DUB/deISGylation activity specifically deleted from PLPro may be desirable vaccine candidates, as these viruses will retain the protease activity required for replication, but become substantially attenuated as they are defective in subverting the host innate immune response. Given the importance of coronaviruses in both veterinary settings and public health, a better understanding of the PTMs of coronavirus proteins will provide new insights into the development of more effective vaccines and novel antivirals.