{"id":7730,"date":"2023-10-10T18:00:00","date_gmt":"2023-10-10T18:00:00","guid":{"rendered":"http:\/\/www.cov19longhaulfoundation.org\/?p=7730"},"modified":"2023-10-10T18:00:00","modified_gmt":"2023-10-10T18:00:00","slug":"structural-biology-of-sars-cov-2-and-implications-for-therapeutic-development","status":"publish","type":"post","link":"https:\/\/cov19longhaulfoundation.org\/?p=7730","title":{"rendered":"Structural biology of SARS-CoV-2 and implications for therapeutic development"},"content":{"rendered":"\n<p class=\"has-small-font-size\">Haitao Yang\u00a0&amp;\u00a0Zihe Rao\u00a0 <a href=\"https:\/\/www.nature.com\/nrmicro\"><em>Nature Reviews Microbiology<\/em><\/a>\u00a0<strong>volume\u00a019<\/strong>,\u00a0pages685\u2013700 (2021) <a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#citeas\">Cite this article<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Abs1\">Abstract<\/h2>\n\n\n\n<p>The COVID-19 pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an unprecedented global health crisis. However, therapeutic options for treatment are still very limited. The development of drugs that target vital proteins in the viral life cycle is a feasible approach for treating COVID-19. Belonging to the subfamily&nbsp;<em>Orthocoronavirinae<\/em>&nbsp;with the largest RNA genome, SARS-CoV-2 encodes a total of 29 proteins. These non-structural, structural and accessory proteins participate in entry into host cells, genome replication and transcription, and viral assembly and release. SARS-CoV-2 proteins can individually perform essential physiological roles, be components of the viral replication machinery or interact with numerous host cellular factors. 
In this Review, we delineate the structural features of SARS-CoV-2 from the whole viral particle to the individual viral proteins and discuss their functions as well as their potential as targets for therapeutic interventions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Sec1\">Introduction<\/h2>\n\n\n\n<p>Coronaviruses are enveloped viruses that possess a positive-sense single-stranded RNA genome 26\u201332\u2009kb in length<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR1\">1<\/a><\/sup>. Coronaviruses belong to the&nbsp;<em>Coronaviridae<\/em>&nbsp;subfamily&nbsp;<em>Orthocoronavirinae<\/em>. According to variations in the genome sequence and serological reactions, coronavirus members in the subfamily are classified into four genera:&nbsp;<em>Alphacoronavirus<\/em>,&nbsp;<em>Betacoronavirus<\/em>,&nbsp;<em>Gammacoronavirus<\/em>&nbsp;and&nbsp;<em>Deltacoronavirus<\/em><sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR2\">2<\/a><\/sup>. Among them,&nbsp;<em>Betacoronavirus<\/em>&nbsp;is classified into five subgenera. Although infectious bronchitis virus was the first coronavirus isolated in chicken embryos in 1937 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR3\">3<\/a><\/sup>), it was not until the 1960s that these viruses, particularly the human respiratory coronaviruses<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR4\">4<\/a><\/sup>, were characterized by electron microscopy. This subfamily of viruses has a unique structural feature on their surfaces which resembles a solar corona. This feature arises due to the presence of spike proteins on the virion surface.<\/p>\n\n\n\n<p>Coronaviruses are characterized by high genetic recombination and mutation rates, which result in their ecological diversity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR5\">5<\/a><\/sup>. 
They are able to infect and readily adapt to a wide range of hosts, from birds to whales. Seven coronaviruses have been found to infect humans. Human coronaviruses 229E, OC43, NL63 and HKU1 are responsible for 10\u201330% of upper respiratory tract infections annually, characterized by mild respiratory illnesses, such as the common cold<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR6\">6<\/a><\/sup>. By contrast, severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR7\">7<\/a><\/sup>&nbsp;and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are able to cause severe human respiratory diseases, potentially resulting in high mortality. In 2002\u20132003, SARS-CoV resulted in 8,096 reported cases and 774 deaths (case\u2013fatality rate of ~10%)<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR7\">7<\/a><\/sup>. By the end of January 2020, 2,500 cases of Middle East respiratory syndrome and more than 800 associated deaths (case\u2013fatality rate ~34%) were reported worldwide<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR8\">8<\/a><\/sup>. In late December 2019, clustered cases of a severe pneumonia were reported, and the aetiological agent was isolated and identified as a novel betacoronavirus, named SARS-CoV-2, that shares ~80% similarity in genome sequence with SARS-CoV<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR9\">9<\/a><\/sup>. SARS-CoV-2 causes COVID-19, with symptoms including fever, cough, fatigue, nausea and shortness of breath<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR10\">10<\/a><\/sup>. 
To date, there have been more than 160 million confirmed COVID-19 cases and more than 3 million related deaths worldwide<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR11\">11<\/a><\/sup>.<\/p>\n\n\n\n<p>Effective therapies to treat COVID-19 are still lacking. Due to the rampant and continuous spread of COVID-19, it is a matter of urgency to identify and characterize drug and vaccine targets for SARS-CoV-2. The genome of SARS-CoV-2 is close to 30\u2009kb in size, contains 14 open reading frames (ORFs) and encodes 29 viral proteins. Approximately two thirds of the genome, starting from the 5\u2032 end, encodes two overlapping polyproteins: pp1a and pp1ab<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR12\">12<\/a><\/sup>. These two polyproteins are digested by two viral proteases into 16 non-structural proteins (NSPs), which are essential for viral replication and transcription (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig1\">1a<\/a>). Four ORFs at the 3\u2032 terminus of the viral genome encode a canonical set of structural proteins that include the nucleocapsid (N) protein, spike (S) protein, membrane (M) protein and envelope (E) protein, which are responsible for virion assembly and also participate in suppression of the host immune response. A series of accessory genes, which encode accessory proteins (ORF3a, ORF3b, ORF6, ORF7a, ORF7b, ORF8b, ORF9b and ORF14), lie between these structural genes. 
The accessory proteins are involved in regulating viral infection but may not be incorporated into the virion, except for the structural proteins ORF3a and ORF7a.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized is-style-default\"><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8\/figures\/1\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41579-021-00630-8\/MediaObjects\/41579_2021_630_Fig1_HTML.png\" alt=\"figure 1\" style=\"width:223px;height:366px\" width=\"223\" height=\"366\"\/><\/a><figcaption class=\"wp-element-caption\"><strong>Fig. 1: SARS-CoV-2 genome and life cycle.<\/strong><\/figcaption><\/figure>\n\n\n\n<p>Briefly, in the first step of the SARS-CoV-2 life cycle, the S protein on the outer surface of the virion is responsible for binding to the host receptor or receptors for attachment to the cell membrane, which is followed by viral and host cellular membrane fusion and the release of viral genomic RNA into the cells. Subsequently, host ribosomes are hijacked to produce the two viral replicase polyproteins, which can further be processed into 16 mature NSPs through two virus-encoding proteases: main protease (M<sup>pro<\/sup>) and papain-like protease (PL<sup>pro<\/sup>). These NSPs are able to assemble into the replication and transcription complex (RTC) to initiate viral RNA replication and transcription. The genomic RNA and structural proteins then assemble into mature progeny virions, which are subsequently released through exocytosis to initiate another round of infection<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR10\">10<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig1\">1b<\/a>). 
Viral proteins can individually perform important physiological roles, constitute the viral protein machinery for specific essential events in the viral life cycle or extensively interplay with the cellular factors in the host immune response and pathogenesis<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR13\">13<\/a><\/sup>. In the following sections, we delineate the structural features of SARS-CoV-2 extending from the whole viral particle to individual proteins, including several antiviral drug targets such as the S protein, PL<sup>pro<\/sup>, M<sup>pro<\/sup>&nbsp;and viral RNA-dependent RNA polymerase (RdRP)<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR14\">14<\/a><\/sup>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Sec2\">Structural proteins in the viral life cycle<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec3\">S protein in viral entry<\/h3>\n\n\n\n<p>The S protein is a homotrimer, which protrudes from the virion and extensively decorates the viral surface like a crown. It is heavily glycosylated, belongs to the type I membrane-protein family and is anchored in the viral membrane, where it mediates fusion of the viral membrane with the host cell membrane<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR15\">15<\/a><\/sup>. In the native state, both prefusion and postfusion conformations of S proteins can be observed on the reconstructed virions. The SARS-CoV-2 S protein comprises ~1,200 residues and can be cleaved by a furin-like protease into two functional subunits, S1 and S2, which are responsible for mediating attachment to host cells and membrane fusion, respectively<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR16\">16<\/a><\/sup>. After cleavage during viral entry into the host cells, S1 and S2 remain associated with each other through non-covalent interactions. 
As shown by cryogenic electron microscopy (cryo-EM) (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">2a<\/a>), the S1 subunit of the SARS-CoV-2 S protein wraps around a threefold axis, covering the S2 subunit underneath<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR17\">17<\/a><\/sup>. The S1 subunit contains a receptor-binding domain (RBD) and an amino-terminal (N-terminal) domain (NTD). The RBD has a five-stranded antiparallel \u03b2-sheet core, flanked on either side by a short helix. The receptor-binding motif (RBM) extends out of the core (connecting \u03b24 and \u03b25), taking on a cradle-like structure for receptor binding. The RBM, which is stabilized by a disulfide bond, does not possess a regular secondary structure except for two small \u03b2-sheets. The RBD can adopt two distinct conformational states: the closed \u2018down\u2019 state and the open \u2018up\u2019 state<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR17\">17<\/a><\/sup>. In the \u2018down\u2019 state, the RBD is angled close to the central cavity of the trimer, shielding the receptor-binding regions. In the \u2018up\u2019 state, the RBD undergoes a hinge-like conformational movement that exposes its determinant regions to recognize the human angiotensin-converting enzyme 2 (hACE2) receptor on the host cellular membrane; this state is considered to be less stable than the \u2018down\u2019 state. The NTD of the S protein adopts a galectin-like fold with a sugar-binding pocket and contains a ceiling-like structure on top. The NTD may recognize sugar moieties upon initial attachment and play a significant role in the conformational transition of the S protein. The S2 subunit comprises four conserved structural regions: a fusion peptide, two heptad repeats (HR1 and HR2) and a transmembrane region. 
The HR1 region constitutes the main helical stalk of S2, whereas the HR2 region is temporarily flexible in the prefusion state. The fusion peptide forms a short hydrophobic segment.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized is-style-default\"><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8\/figures\/2\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41579-021-00630-8\/MediaObjects\/41579_2021_630_Fig2_HTML.png\" alt=\"figure 2\" style=\"width:319px;height:416px\" width=\"319\" height=\"416\"\/><\/a><figcaption class=\"wp-element-caption\"><strong>Fig. 2: Structures of the SARS-CoV-2 spike protein in the presence or absence of antibodies.<\/strong><\/figcaption><\/figure>\n\n\n\n<p>Undergoing a substantial structural rearrangement, from the metastable prefusion conformation to the postfusion conformation, the S protein fulfils its function in regulating the fusion of viral membrane with the host cell membrane<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR18\">18<\/a><\/sup>. Fusion is triggered when the S1 subunit binds to hACE2 (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">2b<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">c<\/a>). As observed in the complex structure, the N-terminal helix of hACE2 interacts with the outer surface of the RBM in the S1 subunit<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR19\">19<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR20\">20<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR21\">21<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR22\">22<\/a><\/sup>. 
The interaction involves 16 residues in the RBD and 20 residues in hACE2, which form a network consisting of 14 hydrogen bonds and one salt bridge<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR19\">19<\/a><\/sup>. The binding of hACE2 to the RBD can lock the RBD in the \u2018up\u2019 conformation and trigger S1 shedding, which is mediated by proteolytic cleavage by host TMPRSS2 and cathepsin B or cathepsin L. Following S1 shedding, three HR1 helices of trimeric S2 interact with the pairing HR2 helices and constitute a stable six-helix bundle<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR23\">23<\/a><\/sup>. In this unique helix bundle, three HR2 helices are packed into the hydrophobic grooves of the HR1-trimer core in an antiparallel manner. This conformational arrangement brings viral and host cell membranes into proximity and facilitates subsequent membrane fusion. Because of the indispensable function of the S protein, it is an attractive target for inhibition by neutralizing antibodies (nAbs), and characterization of the S protein structure provides atomic-level information for rational vaccine design.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec4\">S protein-neutralizing antibodies<\/h3>\n\n\n\n<p>nAbs targeting the SARS-CoV-2 S trimer have shown protection from viral infection in animal models and are being evaluated as therapeutics in humans. These antibodies comprise human monoclonal antibodies isolated from COVID-19 convalescent donors and single-domain antibodies (also known as nanobodies) which can bind novel epitopes, including buried cavities that are inaccessible to conventional antibodies. Determination of a number of structures of nAbs in complex with the S trimer has elucidated their modes of neutralization. 
Although some nAbs target the NTD or S2, most nAbs bind to the RBD, the latter of which can be further classified into four distinct classes (classes I, II, III and IV) on the basis of the nAb\u2013RBD binding characteristics.<\/p>\n\n\n\n<p>The nAbs in class I can bind to the RBD only in the \u2018up\u2019 state (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">2e<\/a>). They are expected to bind to the flat area on the top side of the cradle-like surface of the RBD, which extensively overlaps with the binding site for hACE2. Through direct competition with hACE2, nAbs in this class would produce steric hindrance when binding to RBD, blocking hACE2 attachment. CB6 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR24\">24<\/a><\/sup>), C105 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR25\">25<\/a><\/sup>), CV30 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR26\">26<\/a><\/sup>), B38 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR27\">27<\/a><\/sup>), CC12.1, CC12.3 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR28\">28<\/a><\/sup>), PR1077 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR29\">29<\/a><\/sup>) and P4A1 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR30\">30<\/a><\/sup>) nAbs belong to this class. Most contain&nbsp;<em>IGHV3-53<\/em>&#8211; or&nbsp;<em>IGHV3-66<\/em>-encoded heavy chains and utilize residues in complementarity-determining regions 1, 2 and 3.<\/p>\n\n\n\n<p>The nAbs in class II also bind to the RBD in the \u2018up\u2019 state, but exhibit no overlap with hACE2-binding sites (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">2f<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">g<\/a>). 
CR3022 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR31\">31<\/a><\/sup>), EY6A<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR32\">32<\/a><\/sup>&nbsp;and nanobody VHH-72 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR33\">33<\/a><\/sup>) belong to this class. The binding region is located at the bottom of the RBD, and is spatially separated from the hACE2-binding sites. Structural analysis showed that the RBD undergoes a rotation that exposes the epitopes for these nAbs. Such a rearrangement is considered to cause a premature conversion of the S protein from the prefusion state to the postfusion state. The resulting unstable configuration of the S protein consequently inactivates SARS-CoV-2.<\/p>\n\n\n\n<p>The nAbs in class III can bind to RBDs only in the \u2018down\u2019 conformation (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">2h<\/a>). They comprise Fab 2-4, Fab 2-43 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR34\">34<\/a><\/sup>) and BD23 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR35\">35<\/a><\/sup>). The heavy chains of the nAbs reach the RBD and interact with the cradle-like surface or the flexible ridge region. However, the binding pattern between these nAbs and the RBD is different from that for class I nAbs, according to the orientation change in the RBD, and the binding area becomes narrower. Notably,&nbsp;<em>N<\/em>-glycan chains are supposed to play a significant role in stabilizing the binding of class III nAbs to the \u2018down\u2019 RBD. Additionally, epitopes of some nAbs extend to the NTD, which may help to resist dynamic instability. 
Collectively, this binding mode would lock the RBD in the \u2018down\u2019 conformation, which also sterically hinders hACE2 access.<\/p>\n\n\n\n<p>The nAbs in class IV can recognize both the \u2018up\u2019 RBD conformation and the \u2018down\u2019 RBD conformation (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">2i<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">j<\/a>). They comprise H11-D4, H11-H4 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR36\">36<\/a><\/sup>), P2B-2F6 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR37\">37<\/a><\/sup>), Ty1 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR38\">38<\/a><\/sup>), S309 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR39\">39<\/a><\/sup>), REGN10987 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR40\">40<\/a><\/sup>) and P17 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR41\">41<\/a><\/sup>). Structural studies show that these nAbs target different regions. P2B-2F6 and the nanobodies H11-D4 and H11-H4 can bind to the top cradle-like surface in an orientation similar to that of class III nAbs. Their binding can be further reinforced by a protruding loop on the RBD. These three nAb epitopes are largely located on the opposite side of the RBM compared with the epitopes of class I nAbs. By partially overlapping with the hACE2-binding site, these nAbs sterically block hACE2 binding to the RBD as well. S309 targets a region distinct from the RBM. Its epitope comprises the \u03b11 helix, a section of the \u03b21 strand and two loops formed by residues 358\u2013361 and 333\u2013335. REGN10987 is another class IV nAb that binds distal to the hACE2-binding site. 
The binding of this nAb would spatially hinder hACE2 attachment.<\/p>\n\n\n\n<p>4A8 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR42\">42<\/a><\/sup>), COV57 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR25\">25<\/a><\/sup>), 2\u201317, 5\u201324, 4\u20138 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR34\">34<\/a><\/sup>) and FC05 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR43\">43<\/a><\/sup>) are nAbs that target other parts of the S protein. Structural analysis reveals that 4A8, which shows a high level of neutralization of SARS-CoV-2, recognizes the NTD and does not sterically hinder the binding between hACE2 and the S protein (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig2\">2d<\/a>). Regarding the S2 subunit, only a few targeted monoclonal antibodies have been reported. Antibody 1A9 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR44\">44<\/a><\/sup>) has been found to interact with the S2 subunit but fails to neutralize SARS-CoV-2. In a recent report, the nAb CC40.8 was identified and found to neutralize SARS-CoV-2 and specifically recognize the S2 subunit<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR45\">45<\/a><\/sup>. The discovery of non-RBD-targeted nAbs may benefit&nbsp;the strategy of nAb cocktail therapeutics.<\/p>\n\n\n\n<p>Since SARS-CoV and SARS-CoV-2 share the same host cell receptor, hACE2, development of cross-neutralizing antibodies to both coronaviruses seems feasible. H014 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR46\">46<\/a><\/sup>) is a recently reported&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Glos1\">humanized antibody<\/a>&nbsp;which efficiently neutralizes both SARS-CoV and SARS-CoV-2. 
It can recognize and interact with open RBDs, but the binding interface is distinct from the RBM and exhibits no competition with hACE2 attachment. Consistently, other cross-neutralizing antibodies (for example, VHH-72, ADI-56046 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR47\">47<\/a><\/sup>), COV21 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR25\">25<\/a><\/sup>) and CC6.33 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR48\">48<\/a><\/sup>)) also avoid the RBM and prefer to recognize the core domain of the RBD.<\/p>\n\n\n\n<p>It is noteworthy that SARS-CoV-2 has a high mutation rate, and numerous mutant strains (variants) have been reported. Mutations in the S protein, especially in the epitopes for nAbs, would attenuate the potency of nAbs. The D614G mutation is the most commonly reported mutation in the S protein<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR49\">49<\/a><\/sup>, and results in increased infectivity and morbidity. The cryo-EM structure of the trimeric S protein with D614G demonstrated a conformational shift towards the hACE2-binding fusion-competent state<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR49\">49<\/a><\/sup>&nbsp;and exhibited attenuation of efficacy in nAb binding. N501Y is a mutation found in variants that emerged in the United Kingdom, South Africa and Brazil<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR50\">50<\/a><\/sup>. The mutation site is located at the RBD\u2013hACE2 interface and has been experimentally shown to cause an increase in hACE2 affinity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR51\">51<\/a><\/sup>. Other mutations worth noting include K417N and K417T, which appear in the epitopes of class I nAbs and are considered to affect the binding of class I antibodies. 
Mutations at residues in the NTD were also found in the new variants of concern, such as \u0394Y144 and \u0394242\u2013244. They were shown to abrogate neutralization of NTD-specific nAbs<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR52\">52<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR53\">53<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR54\">54<\/a><\/sup>. Additionally, SARS-CoV-2 with the naturally occurring mutations to E484, F490, Q493 or S494 of the S protein was found to escape from potential therapeutic antibodies such as C121 and C144 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR55\">55<\/a><\/sup>). Combination treatment with two or more nAbs targeting distinct epitopes would be a strategy to suppress nAb escape variants.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec5\">E protein<\/h3>\n\n\n\n<p>After a coronavirus enters host cells, the E protein regulates viral lysis and the subsequent viral genome release. The E protein was found to be involved in viral assembly and budding by localizing to endoplasmic reticulum (ER) and Golgi body membranes<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR2\">2<\/a><\/sup>. Moreover, the E protein has been shown to participate in activating the host inflammasome<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR56\">56<\/a><\/sup>.<\/p>\n\n\n\n<p>The structure of the SARS-CoV-2 E protein<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR57\">57<\/a><\/sup>&nbsp;solved by nuclear magnetic resonance spectroscopy shows that it is composed of a five-helix bundle ~35\u2009\u00c5 in length (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig3\">3b<\/a>). 
As the E protein can function as an ion channel, the pore inside the transmembrane region is predominantly occupied by hydrophobic residues except for the N-terminal pore. Owing to non-specific interhelical interactions, the entrance site at the N terminus is a drug target for inhibitor binding. The E protein is recognized topologically to be N<sub>lumen<\/sub>\u2013C<sub>cyto<\/sub>&nbsp;(N-terminal ER\u2013Golgi intermediate compartment lumen and carboxy-terminal (C-terminal) cytoplasm) and involved in regulation of pumping Ca<sup>2+<\/sup>&nbsp;out of the ER, which may lead to activation of the cellular inflammasome, thereby enhancing the host antiviral response.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized is-style-default\"><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8\/figures\/3\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41579-021-00630-8\/MediaObjects\/41579_2021_630_Fig3_HTML.png\" alt=\"figure 3\" style=\"width:452px;height:170px\" width=\"452\" height=\"170\"\/><\/a><figcaption class=\"wp-element-caption\"><strong>Fig. 3: Structures of the SARS-CoV-2 nucleocapsid and envelope proteins.<\/strong><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec6\">N protein<\/h3>\n\n\n\n<p>The N protein serves as the only structural protein inside the virion. It is a crucial component that protects the viral RNA genome and packages it into a&nbsp;ribonucleoprotein complex. A native reconstruction of SARS-CoV-2 using electron cryotomography suggests that a significant number of ribonucleoproteins may be membrane proximal. 
The N protein also plays a role in antagonizing the host immune response<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR58\">58<\/a><\/sup>&nbsp;and has been identified to counter cellular RNAi-mediated antiviral activities through its binding with double-stranded RNA \u2018strings\u2019<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR59\">59<\/a><\/sup>, and can be regarded as a viral suppressor of RNA silencing. The N protein has potential as a target for vaccine development because it induces a strong immune response during infection.<\/p>\n\n\n\n<p>The N protein has two conserved structural domains, the NTD (N-NTD) and the CTD (N-CTD), each of which is independently folded<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR60\">60<\/a><\/sup>. In the crystal structures of the N protein<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR61\">61<\/a><\/sup>, the N\u2010NTD exists as a monomer, whereas the N\u2010CTD exists as a dimer (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig3\">3a<\/a>). The N\u2010NTD has the shape of a right\u2010handed fist and contains a four\u2010stranded antiparallel \u03b2\u2010sheet as a core subdomain. The loops protruding out of the core are positively charged, putatively to allow RNA binding. The N\u2010CTD homodimer forms a rectangular shape, with each protomer displaying a crescent shape. To stabilize the dimer interface, two \u03b2\u2010hairpin structures from each protomer can form four antiparallel \u03b2-strands by inserting themselves into each cavity. Compared with other coronaviruses, the N protein from SARS-CoV-2 displays different charge distributions in the N-terminal loop, the RNA protruding tip, the bottom of the N-NTD core and the N-CTD \u03b2-strand face. 
Hence, the variations in RNA binding to the N protein may further guide inhibitor optimization.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Sec7\">NSPs and inhibitors<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec8\">Host translation shutdown by nsp1<\/h3>\n\n\n\n<p>nsp1 originates from the N-terminal cleavage of polypeptides pp1a and pp1ab by PL<sup>pro<\/sup>. The biological functions of nsp1 manifest themselves mainly in virus\u2013host interactions to suppress host translation<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR62\">62<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR63\">63<\/a><\/sup>, and thus nsp1 can be regarded as a canonical virulence factor. To hinder the host translation process, nsp1 is proposed to function by two mechanisms: the first is to bind the ribosomal 40S subunit during the initiation stage<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR64\">64<\/a><\/sup>&nbsp;and the second is to induce host mRNA degradation<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR65\">65<\/a><\/sup>. Importantly, nsp1 does not impede viral protein expression while it binds to the mRNA 5\u2032 untranslated region, leading to efficient viral translation and replication. The structure of nsp1 and the ribosomal 40S subunit has been determined to show the interactions between them and to explain the potential inhibition mechanism<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR66\">66<\/a><\/sup>. In this cryo-EM structure, the C-terminal domain of nsp1 possesses a short \u03b1-helix which is connected to a longer \u03b1-helix through a short loop (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig4\">4a<\/a>). Thus, the host mRNA entry channel is blocked by nsp1 insertion. This hypothesis is corroborated by the loss of host translation inhibition in the K164A\u2013H165A double mutant. 
The long \u03b1-helix also contributes to the interactions between nsp1 and the ribosome. Through the shutdown of host translation, especially antiviral factors, nsp1 assists in evading immune defences, which suggests that disrupting nsp1\u2013ribosome interactions is a plausible approach for SARS-CoV-2 drug discovery.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized is-style-default\"><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8\/figures\/4\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41579-021-00630-8\/MediaObjects\/41579_2021_630_Fig4_HTML.png\" alt=\"figure 4\" style=\"width:453px;height:283px\" width=\"453\" height=\"283\"\/><\/a><figcaption class=\"wp-element-caption\"><strong>Fig. 4: Structures of the SARS-CoV-2 nsp1 and nsp3 subdomains and PL<sup>pro<\/sup>&nbsp;inhibitors.<\/strong><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec9\">Multidomain protein nsp3<\/h3>\n\n\n\n<p>nsp3 consists of 10\u201316 domains depending on the coronavirus genus. Eight are present in all coronaviruses, including ubiquitin-like domain 1 (Ubl1), a hypervariable region, a macrodomain, ubiquitin-like domain 2 (Ubl2), a PL<sup>pro<\/sup>, a zinc-finger domain, a Y1 domain and a CoV-Y domain<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR67\">67<\/a><\/sup>. Most of the conserved domains perform essential functions in the life cycle of the virus. The macrodomains possess highly conserved structures and similar functions. Macrodomain Mac1 can cleave the phosphate group of ADP-ribose 1-phosphate and reverse protein ADP-ribosylation by hydrolysis. The core structure of Mac1 contains seven \u03b2-strands flanked by six \u03b1-helices (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig4\">4b<\/a>). 
ADP-ribose interacts with the Mac1 hydrophobic cleft through conserved hydrogen bonds<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR68\">68<\/a><\/sup>. This indicates that compounds targeting Mac1 may have broad-spectrum antiviral activities.<\/p>\n\n\n\n<p>The \u2018SARS-unique domain\u2019 (SUD) participates in virus\u2013host interactions. SUD has three subdomains:&nbsp;SUD-N (Mac2), SUD-M (Mac3) and SUD-C (DPUP). SUD-N and SUD-M adopt a macrodomain fold, whereas SUD-C has a frataxin\u2010like fold. Deletion of Mac2 decreases the viral replication rate to 65\u201370%, whereas Mac3 is indispensable for replication activity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR69\">69<\/a><\/sup>. PAIP1, which is a component of the eukaryotic translation machinery, has been identified to interact with SUD. The structure of the Mac2\u2013PAIP1M (middle domain of PAIP1) complex shows that Mac2 displays a typical \u03b1\/\u03b2\/\u03b1 macrodomain fold, whereas PAIP1M adopts a HEAT repeat fold<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR70\">70<\/a><\/sup>. Strong complementarity which enhances complex stability is observed at the interface. This structure also supports the suggestion that Mac2\u2013PAIP1M participates in regulating viral mRNA translation and is thus a good antiviral drug target.<\/p>\n\n\n\n<p>PL<sup>pro<\/sup>&nbsp;is located in nsp3 between SUD and a nucleic acid-binding domain. It cleaves the viral polyprotein precursors pp1a and pp1ab at three sites to produce NSPs nsp1, nsp2 and nsp3 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR71\">71<\/a><\/sup>). Apart from viral polyproteins, PL<sup>pro<\/sup>&nbsp;can also cleave host proteins to antagonize the innate immune response<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR72\">72<\/a><\/sup>. 
It preferentially recognizes and cleaves interferon-stimulated gene product 15 (ISG15) from interferon regulatory factor 3 (IRF3) and attenuates type I interferon responses, facilitating escape of the virus from the immune system<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR73\">73<\/a><\/sup>. PL<sup>pro<\/sup>&nbsp;is a 36-kDa cysteine protease with a catalytic triad<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR71\">71<\/a><\/sup>. It contains an N-terminal ubiquitin-like domain and a catalytic core domain<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR74\">74<\/a><\/sup>. The catalytic core domain comprises three subdomains, the thumb, palm and fingers, which together fold like an open right hand. The thumb subdomain is composed of four \u03b1-helices, whereas the palm is formed by a six-stranded \u03b2-sheet. A four-stranded, twisted, antiparallel \u03b2-sheet makes up the finger subdomain. In the fingertip region, four cysteine residues constitute a zinc-finger motif, which coordinates a zinc ion with tetrahedral geometry. This zinc-finger is essential for structural integrity and protease activity.<\/p>\n\n\n\n<p>The substrate-binding site is located in the solvent-exposed cleft between the thumb subdomain and the palm subdomain, which possess a catalytic triad composed of C111, H272 and D286. The substrate-binding site recognizes the consensus sequence LXGG\u2193X (the amino acid residues of the substrate are numbered P4\u2013P3\u2013P2\u2013P1\u2193P1\u2032\u2013P2\u2032 around the cleavage site, denoted by the downwards arrow). Subsites S1\u2013S4 provide the binding sites for P1\u2013P4, respectively<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR75\">75<\/a><\/sup>. The S1 and S2 subsites are rather narrow, and can accommodate only glycine residues. 
The S3 subsite is partially solvent exposed but prefers positively charged and hydrophobic residues. The S4 subsite is relatively large and accommodates only hydrophobic residues. A flexible \u03b2-hairpin BL2 loop, which contains an unusual \u03b2-turn at Y268 and Q269, is involved in controlling substrate access to the active site. Consideration of the conformation of the BL2 loop may be important for rational drug design.<\/p>\n\n\n\n<p>Besides the catalytic site, PL<sup>pro<\/sup>\u00a0harbors two distinct binding subsites (SUb1 and SUb2) for recognizing diubiquitin chains and ISG15. SUb1 recognizes one ubiquitin molecule of diubiquitin chains and the C-terminal ubiquitin-like domain of ISG15. SUb2 recognizes the other (K48-linked) ubiquitin molecule and the N-terminal ubiquitin-like domain of ISG15 (refs<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR74\">74<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR76\">76<\/a><\/sup>) (Fig.\u00a0<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig4\">4d<\/a>). As shown in the complex structures of PL<sup>pro<\/sup>\u2013ubiquitin and PL<sup>pro<\/sup>\u2013ISG15, SUb1 of SARS-CoV-2 PL<sup>pro<\/sup>\u00a0preferentially binds ISG15 through a binding mode different from that used for ubiquitin. Moreover, PL<sup>pro<\/sup>\u00a0SUb2 provides exquisite specificity for K48-linked diubiquitin chains, which makes diubiquitin a more suitable substrate than monoubiquitin.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec10\">Inhibitors targeting PL<sup>pro<\/sup><\/h3>\n\n\n\n<p>Owing to its substantial role in mediating viral replication and suppressing the host immune response, PL<sup>pro<\/sup>&nbsp;is an attractive target for antiviral drug development. 
Thousands of compounds, including approved drugs and molecules in clinical trials, have been screened against this target, but the hit rate is extremely low compared with that of drug leads that target M<sup>pro<\/sup>, another viral protease encoded by SARS-CoV-2. The peptidomimetic inhibitors VIR250 and VIR251 were the first identified covalent inhibitors of PL<sup>pro<\/sup>&nbsp;(ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR77\">77<\/a><\/sup>) (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig4\">4c<\/a>). A catalytic residue, C111, of PL<sup>pro<\/sup>&nbsp;engages in a&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Glos2\">Michael addition reaction<\/a>&nbsp;with the \u03b2-carbon of the vinyl group of the vinylmethyl ester warheads from VIR250 and VIR251, resulting in the formation of a covalent thioether linkage. Residues at P2, P3 and P4 participate in an extensive network of hydrogen bonds and van der Waals interactions with their corresponding subsites. Similar substrate preferences and catalytic efficiencies are observed for SARS-CoV-2 and SARS-CoV PL<sup>pro<\/sup>, suggesting that inhibitors of SARS-CoV PL<sup>pro<\/sup>&nbsp;are a good starting point for lead compound optimization against SARS-CoV-2. GRL0617, an inhibitor of SARS-CoV PL<sup>pro<\/sup>, also inhibits SARS-CoV-2 PL<sup>pro<\/sup>&nbsp;(ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR78\">78<\/a><\/sup>). Structural studies show that GRL0617 fits in the substrate cleft which was formed between the BL2 loop and the loop connecting \u03b13 and \u03b14, where it occupies the S3 and S4 subsites. The aromatic ring of GRL0617 fits into the S3 subsite, while the&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Glos3\">naphthalene group<\/a>&nbsp;fills the S4 subsite. Thus, the binding of GRL0617 blocks the substrate from gaining access to the active site. 
Inspired by the success of GRL0617, several naphthalene-based compounds were synthesized and also show good inhibition of SARS-CoV-2 PL<sup>pro<\/sup>&nbsp;(ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR79\">79<\/a><\/sup>). YM155, an anticancer drug candidate in clinical trials, has also been shown to inhibit SARS-CoV-2 PL<sup>pro<\/sup>&nbsp;and has potent antiviral activity (<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Glos4\">half-maximal effective concentration<\/a>&nbsp;(EC<sub>50<\/sub>) of 170\u2009nM)<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR80\">80<\/a><\/sup>. YM155 achieves such a strong inhibition by simultaneously recognizing three hotspots in PL<sup>pro<\/sup>. The first binding site is located at the entrance of the substrate-binding pocket and blocks substrate entry to the active site. The second is located on the thumb domain and hampers interactions between PL<sup>pro<\/sup>&nbsp;and ISG15. The third site is located on the zinc-finger motif, and the binding perturbs the stability of the zinc-finger motif and enzyme activity.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec11\">M<sup>pro<\/sup><\/h3>\n\n\n\n<p>M<sup>pro<\/sup>&nbsp;is the major protease encoded by SARS-CoV-2. It cleaves replicase polyproteins at no fewer than 11 sites to release NSPs, allowing the assembly of the viral replication and transcription machinery. The pivotal role that M<sup>pro<\/sup>&nbsp;plays in regulating viral replication and transcription makes it an attractive drug target. Crystal structures show that this 306 amino acid protease comprises three domains (domain I, residues 10\u201399; domain II, residues 100\u2013182; and domain III, residues 198\u2013303) and adopts a chymotrypsin-like fold<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR81\">81<\/a><\/sup>. 
Due to the similar substrate specificity and presence of a cysteine as a catalytic residue, M<sup>pro<\/sup>&nbsp;is classified as a 3C-like protease<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR82\">82<\/a><\/sup>.<\/p>\n\n\n\n<p>Since the first crystal structure of SARS-CoV-2 M<sup>pro<\/sup>\u00a0in complex with a\u00a0<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Glos5\">Michael acceptor<\/a>\u00a0inhibitor N3 (Protein Data Bank accession code 6LU7) (Fig.\u00a0<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig5\">5a<\/a>) was published<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR81\">81<\/a><\/sup>, many structures of M<sup>pro<\/sup>\u00a0in complex with inhibitors have been reported. SARS-CoV-2 M<sup>pro<\/sup>\u00a0functions as an active homodimer, in which the two protomers are nearly perpendicular to each other. The N-terminal finger (residues 1\u20137) of one protomer inserts itself between domains II and III of its neighboring protomer, and promotes the formation of the dimer and the S1 subsite in the neighboring protomer<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR83\">83<\/a><\/sup>. Dimerization is additionally regulated by domain III through a salt-bridge interaction between E290 of one protomer and R4 from its adjacent protomer. In each protomer, a deep cleft between domains I and II forms the substrate-binding site, with a catalytic dyad (H41 and C145) at its centre. Domain III contains five \u03b1-helices that arrange themselves into a large antiparallel globular cluster and exhibit a unique topology in coronaviruses. 
Domains II and III are connected by a long loop (residues 183\u2013198).<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized is-style-default\"><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8\/figures\/5\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41579-021-00630-8\/MediaObjects\/41579_2021_630_Fig5_HTML.png\" alt=\"figure 5\" style=\"width:409px;height:271px\" width=\"409\" height=\"271\"\/><\/a><figcaption class=\"wp-element-caption\"><strong>Fig. 5: Structures of SARS-CoV-2 M<sup>pro<\/sup>&nbsp;and its inhibitors.<\/strong><\/figcaption><\/figure>\n\n\n\n<p>Coronavirus M<sup>pro<\/sup>s recognize the P4\u2013P1\u2032 positions of the substrate<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR84\">84<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR85\">85<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR86\">86<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig5\">5b<\/a>). The S1 subsite has an absolute preference for glutamine at P1. P2 is usually a bulky side chain that can be accommodated by the deep hydrophobic S2 subsite. The P3 side chain is solvent exposed, and the corresponding S3 subsite also shows tolerance to a wide range of functional groups. The hydrophobic S4 subsite is smaller than S2 and thus accommodates residues with small side chains. 
This binding pocket is highly conserved among coronavirus M<sup>pro<\/sup>s, suggesting that antiviral inhibitors targeting this pocket should have broad-spectrum activity against coronaviruses in general<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR87\">87<\/a><\/sup>.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"Sec12\">Inhibitors of SARS-CoV-2 M<sup>pro<\/sup><\/h4>\n\n\n\n<p>Recently, numerous inhibitors of M<sup>pro<\/sup>\u00a0have been identified exhibiting a range of binding mechanisms (Fig.\u00a0<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig5\">5c<\/a>). N3 is the representative peptidomimetic inhibitor, and harbours a Michael acceptor as a warhead and substituents spanning all substrate-binding subsites. The Michael acceptor forms a covalent bond with the active site residue, C145. N3 bears a lactam ring, an aliphatic isobutyl group, an isopropyl group, a methyl group and an isoxazole as the side chain for the P1\u2013P5 sites, respectively. The lactam ring, which replaces glutamine at the P1 site, exhibits favorable binding at the S1 subsite<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR81\">81<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR88\">88<\/a><\/sup>. Studies have shown that N3 displays strong inhibition of M<sup>pro<\/sup>s from different coronaviruses, and it could inhibit SARS-CoV-2 with EC<sub>50<\/sub>\u00a0of 16.77\u2009\u03bcM in a Vero cell-based assay. 
This value may not be truly representative of activity as it is not clear whether the high levels of expression of the efflux transporter P-glycoprotein in Vero cells affected the evaluation of its antiviral efficacy<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR88\">88<\/a><\/sup>.<\/p>\n\n\n\n<p>A recent study reported a series of \u03b1-ketoamides that inhibit SARS-CoV-2 M<sup>pro<\/sup>&nbsp;(ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR89\">89<\/a><\/sup>). Distinct from the previously designed \u03b1-ketoamides, the P2\u2013P3 amide bond is replaced with a pyridone ring, which increases the half-life in plasma. Replacement of the P2 cyclohexyl moiety with smaller cyclopropyl increases the antiviral activity against betacoronaviruses. Approved hepatitis C virus drugs, such as boceprevir, telaprevir and narlaprevir, are \u03b1-ketoamide inhibitors and also exhibit inhibition of SARS-CoV-2 M<sup>pro<\/sup>. The ketone group undergoes a nucleophilic attack by the C145 thiolate to form a hemithioketal. Because boceprevir, telaprevir and narlaprevir are peptidomimetic inhibitors with similar structures, they form very similar interactions with the S1\u2032\u2013S4 subsites<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR90\">90<\/a><\/sup>. Another ketone-based potent inhibitor was discovered in the hydroxymethylketone class<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR91\">91<\/a><\/sup>. 
One of the hydroxymethylketone derivatives demonstrated inhibition of SARS-CoV-2 M<sup>pro<\/sup>&nbsp;and also possesses antiviral activity with EC<sub>50<\/sub>&nbsp;of 4.8\u2009\u03bcM.<\/p>\n\n\n\n<p>Another study presented two peptidomimetic aldehydes (named \u201811a\u2019 and \u201811b\u2019) which bear an indole moiety at the N terminus (P3 site) and an aldehyde warhead at the C terminus<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR92\">92<\/a><\/sup>. The complex structures show that the aldehyde groups covalently bind to C145 of the catalytic dyad to inhibit M<sup>pro<\/sup>&nbsp;activity. Both inhibitors exhibited excellent inhibition of SARS-CoV-2 M<sup>pro<\/sup>&nbsp;with half-maximal inhibitory concentrations of 0.053\u2009\u03bcM and 0.040\u2009\u03bcM, respectively. The inhibitors also exhibited strong anti-SARS-CoV-2 infection activity in Vero cell-based assays and good pharmacokinetic and toxicity properties. A recent study reported another series of aldehyde derivatives with EC<sub>50<\/sub>&nbsp;ranging from 7.6 to 748.5\u2009nM in cell-based assays. In a transgenic mouse model of SARS-CoV-2 infection, oral or intraperitoneal treatment with two compounds, MI-09 or MI-30, significantly reduced lung viral loads and lung lesions. Both also displayed good pharmacokinetic properties and safety in rats<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR93\">93<\/a><\/sup>. GC376, an inhibitor of feline infectious peritonitis virus in preclinical studies, has been found to efficaciously inhibit SARS-CoV-2 in Vero cells by targeting M<sup>pro<\/sup>. It utilizes an aldehyde bisulfite to covalently bind to C145 (refs<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR94\">94<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR95\">95<\/a><\/sup>). 
Based on 11a, 11b and GC376, a number of aldehyde-based dipeptidyl and tripeptidyl inhibitors of M<sup>pro<\/sup>&nbsp;were designed, and the organocatalyst-mediated protein aldol ligation to C145 of the protease occurs<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR96\">96<\/a><\/sup>. A series of M<sup>pro<\/sup>&nbsp;inhibitors that possess an aldehyde group for covalent inhibition have been reported<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR97\">97<\/a><\/sup>. Among them, two compounds inhibited SARS-CoV-2 replication in cultured primary human airway epithelial cells.<\/p>\n\n\n\n<p>The repurposing of approved drugs, drug candidates and pharmacologically active compounds provides an alternative approach to identify potential drug leads that could rapidly be approved as clinical treatments for COVID-19. Through high-throughput screening, one study identified multiple drug leads that target M<sup>pro<\/sup>, including ebselen, disulfiram and carmofur<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR81\">81<\/a><\/sup>. Ebselen exhibited antiviral activity in a plaque-reduction assay (EC<sub>50<\/sub>\u2009=\u20094.67\u2009\u03bcM). As an organoselenium compound, ebselen was previously investigated for treatment of bipolar disorders and hearing loss<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR98\">98<\/a><\/sup>. It has been shown to have low cytotoxicity in humans in clinical trials<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR99\">99<\/a><\/sup>. Ebselen has been approved by the US Food and Drug Administration to enter phase II clinical trials (NCT04484025 and NCT04483973) for COVID-19 treatment. Carmofur, which also exhibited antiviral activity in vitro, is a derivative of 5-fluorouracil. 
It is an approved&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Glos6\">antineoplastic agent<\/a>, and has been investigated as a cancer treatment<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR100\">100<\/a><\/sup>. As observed in the complex structure of M<sup>pro<\/sup>&nbsp;and carmofur, the catalytic C145 residue is covalently bound to the carbonyl reactive group of carmofur and its fatty acid tail extends into the hydrophobic S2 subsite<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR101\">101<\/a><\/sup>. Such a novel inhibitory mode makes carmofur a good lead compound for rational drug design. GRL-1720 and 5h were also identified as covalent inhibitors targeting M<sup>pro<\/sup>&nbsp;through high-throughput screening. Crystal structures show that both GRL-1720 and 5h form extensive interactions with C145 and other residues in the M<sup>pro<\/sup>&nbsp;active site<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR102\">102<\/a><\/sup>.<\/p>\n\n\n\n<p>A recent study performed large-scale fragment screening against M<sup>pro<\/sup>&nbsp;by combining mass spectrometry and X-ray approaches<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR103\">103<\/a><\/sup>. Seventy-one hits were identified to bind at the substrate-binding site, and three hits were found to bind near the dimer interface. These structures provide a starting point to design more elaborate and potent drug leads that target SARS-CoV-2 M<sup>pro<\/sup>. 
Another study performed a high-throughput X-ray crystallographic screening of two drug repurposing libraries (the Fraunhofer IME Repurposing Collection and the Safe-in-Man library from Domp\u00e9 Farmaceutici) against the SARS-CoV-2 M<sup>pro<\/sup>&nbsp;(ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR104\">104<\/a><\/sup>); the study authors identified 37 compounds that bind to M<sup>pro<\/sup>. In subsequent cell-based assays, one peptidomimetic compound (calpeptin) and six non-peptidic compounds showed antiviral activity at non-toxic concentrations. Additionally, two allosteric binding sites representing potential targets against SARS-CoV-2 were identified. The first allosteric site is in the immediate vicinity of the S1 pocket of the adjacent protomer within the native dimer. The second allosteric site is formed by the deep groove between the catalytic domain and the dimerization domain.<\/p>\n\n\n\n<p>Baicalin and baicalein, which are natural products derived from the flowering plant&nbsp;<em>Scutellaria baicalensis<\/em>, have been shown to inhibit SARS-CoV-2 M<sup>pro<\/sup>&nbsp;with half-maximal inhibitory concentrations of 6.41\u2009\u03bcM and 0.94\u2009\u03bcM, respectively<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR105\">105<\/a><\/sup>. The structure of M<sup>pro<\/sup>&nbsp;in complex with baicalein shows that the phenyl ring with three hydroxy groups forms&nbsp;<em>\u03c0<\/em>\u2013S and&nbsp;<em>\u03c0<\/em>\u2013<em>\u03c0<\/em>&nbsp;interactions with C145 and H41 of the catalytic dyad, while the hydroxy groups form multiple hydrogen bonds with the S1 subsite. The distal phenyl ring occupies the S2 subsite. Another example is shikonin<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR106\">106<\/a><\/sup>. The complex structure shows that shikonin forms a hydrogen bond network with C145 of the catalytic dyad and with H164, both located in the S1 subsite. 
The aromatic head groups of shikonin form a&nbsp;<em>\u03c0<\/em>\u2013<em>\u03c0<\/em>&nbsp;interaction with H41 on the S2 subsite. The hydroxy and methyl groups of the isohexenyl side chain of the shikonin tail form hydrogen bonds with R188 and Q189, respectively, in the S3 subsite. Such a unique mode of action expands our knowledge of M<sup>pro<\/sup>&nbsp;inhibition.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Sec13\">Replication and transcription complex<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec14\">Replication mechanism of the central RTC<\/h3>\n\n\n\n<p>In coronavirus infection, replication and transcription is regulated through a multisubunit mechanism<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR107\">107<\/a><\/sup>, where the RdRP nsp12 catalyses viral RNA synthesis and thus acts as the key component of the RTC<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR108\">108<\/a><\/sup>. In addition, the primase nsp8 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR109\">109<\/a><\/sup>) and an auxiliary factor, nsp7, contribute to the activation and continuous production of viral RNA<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR110\">110<\/a><\/sup>. nsp12 along with nsp7 and nsp8 makes up the complete RdRP complex.<\/p>\n\n\n\n<p>SARS-CoV-2 nsp12 is composed of three major domains, a nidovirus RdRP-associated nucleotidyltransferase (NiRAN) domain, an interface domain and a right-handed RdRP domain (finger, palm and thumb)<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR111\">111<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6a<\/a>). 
The active site of SARS-CoV-2 RdRP is located in the palm subdomain, which has a shape like other RNA polymerases, such as those from hepatitis C virus ns5b<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR112\">112<\/a><\/sup>&nbsp;and poliovirus 3Dpol<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR113\">113<\/a><\/sup>. The architecture of the central cavity is shared by other conserved polymerases involving the primer-template entry, nucleoside triphosphate (NTP) entry and nascent strand exit paths. Residues D760 and D761 are involved in the coordination of two Mg<sup>2+<\/sup>&nbsp;ions essential for polymerase activity. One Mg<sup>2+<\/sup>&nbsp;ion coordinates motif C and binds at the 3\u2032 end (\u2018i\u2019 site) of the RNA primer, facilitating the condensation reaction in RNA chain synthesis, while the second Mg<sup>2+<\/sup>&nbsp;positions the incoming NTP and stabilizes the charge environment. Separate from conserved motifs A\u2013E at the active site, motif F and motif G inside the fingers subdomain are conducive to guiding the RNA template. During viral RNA synthesis, notable structural rearrangements occur in this complex to accommodate the RNA<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR114\">114<\/a><\/sup>. Along with the product chain synthesis, the protruding RNA template\u2013product duplex exits through the active site without steric hindrance and extends to two positively charged \u2018sliding poles\u2019 formed by two nsp8 N-terminal helices<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR115\">115<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6b<\/a>). 
Consistent with SARS-CoV nsp8 adopting variable conformations<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR116\">116<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR117\">117<\/a><\/sup>, N-terminal extensions of nsp8-2 (the second copy of nsp8) have two different orientations at the early replicating stage. In one orientation, it is adjacent to the finger subdomain, whereas in the other orientation, it interacts with the RNA duplex, suggesting that nsp8 may have regulatory functions in replication initiation. The complex consisting of nsp12, nsp7, nsp8 and RNA duplex reflects the replicating state in RdRP activity; therefore, it is referred to as the central RTC (C-RTC).<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized is-style-default\"><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8\/figures\/6\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41579-021-00630-8\/MediaObjects\/41579_2021_630_Fig6_HTML.png\" alt=\"figure 6\" style=\"width:449px;height:503px\" width=\"449\" height=\"503\"\/><\/a><figcaption class=\"wp-element-caption\"><strong>Fig. 6: Structures of SARS-CoV-2 replication and transcription complex and its inhibitors.<\/strong><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec15\">RNA elongation, capping and backtracking<\/h3>\n\n\n\n<p>The RTC needs to guarantee processive RNA duplex elongation without template\u2013product dissociation so that viral genome or subgenome synthesis can be rapidly completed inside the host cell<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR118\">118<\/a><\/sup>. For coronaviruses, which have the largest known positive-sense RNA genomes, both replication efficiency and replication fidelity are essential for maintaining genetic integrity. 
The former relies on the functional elongation RTC (E-RTC), whereas the latter depends on proofreading by nsp14. An E-RTC is composed of a C-RTC and two coupled copies of the nsp13 helicase: nsp13-1 and nsp13-2 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR119\">119<\/a><\/sup>) (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6c<\/a>). nsp13 is believed to be crucial in viral replication and the mRNA capping process, which includes unwinding of the RNA duplex into single strands, 5\u2032 to 3\u2032 polarity formation and RNA 5\u2032-triphosphatase activity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR120\">120<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR121\">121<\/a><\/sup>. The unique domains of coronavirus nsp13, such as the zinc-binding domain, the stalk and the 1B domain, are all important for helicase activity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR122\">122<\/a><\/sup>. In the structure of E-RTC, two nsp13 zinc-binding domains form extensive interactions with two nsp8 N-terminal helices. In particular, the zinc-binding domain from nsp13-2 forms additional interactions with the nsp12 thumb subdomain, stabilizing the overall structure during elongation<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR119\">119<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR123\">123<\/a><\/sup>. Before entering the nsp12 active site, the template RNA strand undergoes disruption of RNA secondary structure and guidance between the nsp13-2 RecA domain and the 1B domain to ensure the 5\u2032 to 3\u2032 translocation direction<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR124\">124<\/a><\/sup>. 
Structural characterization of E-RTC not only helps elucidate the RNA elongation mechanisms but also suggests different functional roles that nsp13 may play in this event. In nsp13-2, residues N361 in the domain 1A, S468, T532 and D534 in the domain 2A and R178 and H230 in the domain 1B collectively contribute to template RNA recognition and elongation, demonstrating that nsp13-2 is directly involved in positioning downstream template RNA. Interestingly, the interactions between the nsp13-1 1B domain and the nsp13-2 1B domain have been shown to play a pivotal role in E-RTC helicase activity, even though nsp13-1 is far from nsp13-2 (ref.<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR123\">123<\/a><\/sup>) (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6d<\/a>). Therefore, nsp13-1 is indispensable for RNA elongation in that it is cooperatively coupled with nsp13-2 in the functioning E-RTC.<\/p>\n\n\n\n<p>The capping modification of mRNA, which rigorously follows subgenomic mRNA synthesis, is essential for viral translation and propagation, mRNA protection and escape from host immune response<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR125\">125<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR126\">126<\/a><\/sup>. 
Similarly to the RNA elongation process, multiple NSPs participate in RTC assembly during sequential stages of mRNA capping, which can be divided into four main steps: (1) removal of the \u03b3-phosphate of 5\u2032-pppA by nsp13 with RNA 5\u2032-triphosphatase activity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR120\">120<\/a><\/sup>; (2) transfer of GMP to 5\u2032-ppA by the nsp12 NiRAN domain with guanylyltransferase (GTase) activity, leading to the generation of a GpppA cap structure<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR127\">127<\/a><\/sup>; (3) methylation of N7-guanine by nsp14, which has N7-methyltransferase activity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR128\">128<\/a><\/sup>; and (4) methylation of the ribose 2\u2032-<em>O<\/em>\u00a0position of the first nucleotide to give the final\u00a0<sup>7Me<\/sup>GpppA<sub>2\u2032OMe<\/sub>\u00a0cap structure by nsp16, which has 2\u2032-<em>O<\/em>-methyltransferase activity<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR129\">129<\/a><\/sup>. Multiple NSPs are assembled into the RTC in order according to their functional roles, a process which is accompanied by structural conformational changes. On one hand, the nsp12 NiRAN domain is involved in the second step to catalyze the ppA to GpppA transfer through its newly identified GTase activity. On the other hand, an intermediate state captured by cryo-EM shows that nsp9 can inhibit the GTase activity by tight insertion into the NiRAN catalytic centre in order to terminate the reaction (Fig.\u00a0<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6e<\/a>). nsp9 is an RNA-binding protein, which is characterized by a positively charged groove<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR130\">130<\/a><\/sup>. 
This groove, together with a \u03b2-hairpin at the nsp12 N terminus, provides an exit path for postcatalytic GpppA-RNA. Several hydrophobic interactions and hydrogen bonds enhance nsp9 binding to nsp12, suggesting that nsp9 plays a substantial role in the viral life cycle. Because it has been shown that disruption of the nsp9\u2013nsp10 cleavage site is not lethal<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR131\">131<\/a><\/sup>\u00a0and nsp10 is able to tightly bind to nsp14 or nsp16 (refs<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR132\">132<\/a>,<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR133\">133<\/a><\/sup>), nsp9 may serve as a core regulator in recruiting the nsp10\u2013nsp14 or nsp10\u2013nsp16 complex for the following capping RTC assembly with N7-methyltransferase activity and 2\u2032-<em>O<\/em>-methyltransferase activity.<\/p>\n\n\n\n<p>Another important aspect relating to the RTC is its proofreading mechanism. Most RNA viruses replicate with estimated error rates between 10<sup>\u22123<\/sup>&nbsp;and 10<sup>\u22125<\/sup>, which results in approximately one mutation per genome per round of replication for a typical&nbsp;\u223c10-kb genome<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR134\">134<\/a><\/sup>, a much higher mutation rate than occurs in cellular DNA replication<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR135\">135<\/a><\/sup>. The lower fidelity may largely be due to the lack of proofreading activity in these viruses. By contrast, SARS-CoV-2, which encodes nsp14 (an exonuclease with proofreading activity), can maintain high fidelity during replication of its large genome. Proofreading involves the backtracking of mismatched template\u2013product RNA chains. The single-stranded 3\u2032 segment of the product RNA generated by backtracking extrudes through the RdRP NTP entry tunnel. 
Then a mismatched nucleotide located at the 3\u2032 end of product RNA enters the conserved NTP entry tunnel to initiate backtracking, while nsp13 stimulates RdRP backtracking. The structure of C-RTC in complex with the essential nsp13 helicase and RNA suggests that the helicase can facilitate the backtracking mechanism<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR136\">136<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6f<\/a>).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec16\">RdRP inhibitor discovery<\/h3>\n\n\n\n<p>The RdRP is a prime drug target for SARS-CoV-2 (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig1\">1a<\/a>). Inhibition of RdRP activity will prevent viral replication and can potentially achieve clinical efficacy. Major efforts have been devoted to identifying both nucleotide and non-nucleotide inhibitors, which have also been used as probes to understand the replication cycle of SARS-CoV-2 and to provide a basis for development of broad-spectrum antiviral drugs.<\/p>\n\n\n\n<p>The&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Glos7\">prodrug<\/a>&nbsp;remdesivir, which was initially developed for the treatment of Ebola virus infection, shows good activity against SARS-CoV-2 in in vitro assays<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR137\">137<\/a><\/sup>&nbsp;but limited efficacy in clinical trials. In the cell, remdesivir is phosphorylated to remdesivir triphosphate, enabling it to act as an ATP analogue. The structure of pretranslocated catalytic C-RTC clearly demonstrates the incorporation mode of remdesivir and suggests its inhibition mechanism<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR114\">114<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6g<\/a>). 
Kinetic analysis shows that remdesivir triphosphate is preferred as a substrate over ATP<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR138\">138<\/a><\/sup>&nbsp;and terminates product chain elongation at a delayed position (<em>i<\/em>\u2009+\u20093). Once the inserted remdesivir monophosphate is transferred to the&nbsp;<em>i<\/em>\u2009+\u20093 position, the distance between the serine hydroxy oxygen from S861 and the 1\u2032-cyano nitrogen from remdesivir monophosphate will be close to 2\u2009\u00c5, a steric clash that causes \u2018delayed chain termination\u2019. Further investigations indicate that a remdesivir-induced translocation barrier and RdRP stalling occur after the addition of three nucleotides upon incorporation of remdesivir into the product chain<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR139\">139<\/a><\/sup>. Favipiravir is another nucleoside analogue that has been approved as an anti-influenza virus drug in Japan. Favipiravir mimics ATP and GTP during incorporation into the product RNA, yet it inhibits viral proliferation by increasing the mutation rate of the viral genome rather than causing product chain terminations<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR140\">140<\/a><\/sup>. The structure of the RdRP\u2013favipiravir complex delineates a precatalytic state and identifies the conserved residues for favipiravir recognition (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6h<\/a>).<\/p>\n\n\n\n<p>Although nucleotide inhibitors can be inserted into RNA chains, they can later be cleaved by proofreading activity. Thus, non-nucleotide inhibitors have been considered as an alternative approach for drug development. 
Suramin, a century-old drug used to treat African sleeping sickness and river blindness, can effectively inhibit SARS-CoV-2 polymerase activity with at least 20-fold greater potency than remdesivir triphosphate (RDV-3Pi) in biochemical assays and inhibits viral replication in vitro<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR141\">141<\/a><\/sup>. In the cryo-EM structure, two suramin molecules bind to the active sites of nsp12, with one occupying the template-binding site and the other occupying the primer catalytic active centre, implying that suramin may competitively inhibit protein\u2013RNA binding due to its strong electronegativity (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig6\">6i<\/a>). However, the highly negatively charged suramin has the potential to bind to many positively charged macromolecular surfaces, and thus its specific antiviral activity remains to be further investigated.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Sec17\">Accessory protein\u2013host interactions<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"Sec18\">ORF3a, ORF9b, ORF7a and ORF8<\/h3>\n\n\n\n<p>ORF3a protein, encoded by&nbsp;<em>ORF3a<\/em>, is an ion channel membrane protein with 274 amino acids. It forms a potassium-sensitive channel and may promote virus release. The cryo-EM structure of SARS-CoV-2 ORF3a is&nbsp;the first viroporin family structure determined in coronaviruses<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR142\">142<\/a><\/sup>. The overall structure shows that ORF3a forms a dimer with the ion channel decorated with charged residues for cation conduction (Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig7\">7a<\/a>). 
It is noteworthy that ORF3a has a TRAF-binding domain at the N terminus that can activate NF-\u03baB and the NLRP3 inflammasome<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR143\">143<\/a><\/sup>, suggesting an important role in the host immune response. As ion channels are important therapeutic targets and many ion-channel drugs have already been approved for clinical trials, ORF3a is another good antiviral drug target<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR144\">144<\/a><\/sup>.<\/p>\n\n\n\n<figure class=\"wp-block-image is-resized is-style-default\"><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8\/figures\/7\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/media.springernature.com\/lw685\/springer-static\/image\/art%3A10.1038%2Fs41579-021-00630-8\/MediaObjects\/41579_2021_630_Fig7_HTML.png\" alt=\"figure 7\" style=\"width:392px;height:479px\" width=\"392\" height=\"479\"\/><\/a><figcaption class=\"wp-element-caption\"><strong>Fig. 7: Structures of SARS-CoV-2 accessory proteins.<\/strong><\/figcaption><\/figure>\n\n\n\n<p>ORF9b is encoded by an alternative ORF within the N protein gene. ORF9b suppresses the type I interferon immune response by interacting with the mitochondrial import receptor subunit TOM70. Targeting the interactions between ORF9b and TOM70 has been proposed as a therapeutic option for SARS-CoV-2. The structure of SARS-CoV-2 ORF9b shows that it is dimeric, with each protomer composed mainly of \u03b2-strands<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR145\">145<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig7\">7b<\/a>). 
The centre of the dimer has a hydrophobic environment for accommodating lipid molecules and membrane attachment.<\/p>\n\n\n\n<p>ORF7a is a type I transmembrane protein and is also involved in virus\u2013host interactions and protein trafficking within the ER and Golgi body. Its structure shows that it has a seven-stranded \u03b2-sandwich fold consistent with the immunoglobulin superfamily<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR146\">146<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig7\">7c<\/a>). A deep hydrophobic pocket has been identified for potential inhibitor binding.<\/p>\n\n\n\n<p>ORF8 is an accessory protein that is composed of 121 amino acids. It has an N-terminal signal sequence and adopts an immunoglobulin-like fold<sup><a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#ref-CR147\">147<\/a><\/sup>&nbsp;(Fig.&nbsp;<a href=\"https:\/\/www.nature.com\/articles\/s41579-021-00630-8#Fig7\">7d<\/a>). The structure of ORF8 shows that it can form a dimer, and each protomer of ORF8 contains eight antiparallel \u03b2-strands tied by three disulfide bonds. The covalently bonded dimer structure is stabilized by surface hydrophobic interactions and a series of hydrogen bonds. ORF8 is capable of assembling itself into large-scale homologous complexes; however, the oligomerization mechanism needs to be investigated further.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"Sec19\">Conclusions<\/h2>\n\n\n\n<p>Coronaviruses have the largest genomes among all RNA viruses, encoding structural proteins and NSPs that achieve sustainability in a wide variety of ecological niches and hosts. Evolving viral proteins help coronaviruses to achieve host recognition and entry, genome replication, assembly and release of progeny viruses, and host immune surveillance evasion. 
In response to the COVID-19 pandemic, great efforts have been devoted to structural studies of SARS-CoV-2 proteins and viral\u2013cellular protein complexes using X-ray crystallography and cryo-EM. Among them, the S protein, M<sup>pro<\/sup>, PL<sup>pro<\/sup>&nbsp;and RdRP are the most widely studied drug targets. A multidisciplinary combination of structural virology, \u2019omics technologies, immunology and virology will produce a more effective approach to structure-aided design of vaccines and therapeutics that have the potential for clinical use.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Haitao Yang\u00a0&amp;\u00a0Zihe Rao\u00a0 Nature Reviews Microbiology\u00a0volume\u00a019,\u00a0pages685\u2013700 (2021) Cite this article Abstract The COVID-19 pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is an unprecedented global health crisis. However, [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":7734,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[541],"tags":[],"class_list":["post-7730","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-spike-protein"],"_links":{"self":[{"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=\/wp\/v2\/posts\/7730","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=7730"}],"version-history":[{"count":0,"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=\/wp\/v2\/posts\/7730\/revisions"}],"wp:
featuredmedia":[{"embeddable":true,"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=\/wp\/v2\/media\/7734"}],"wp:attachment":[{"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=7730"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=7730"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cov19longhaulfoundation.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=7730"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}