A bioinformatics method for identification of human proteases active against viral envelope glycoproteins: a case study on the SARS-CoV-2 spike protein

Abstract

Many viruses, including SARS-CoV-2, the coronavirus responsible for the COVID-19 pandemic, enter host cells through fusion of the viral and cell membranes, a process that is activated by proteolytic enzymes. Typically, these enzymes are host cell proteases. Identifying the proteases that activate a virus is not a simple task, but it is important for the development of new antiviral drugs. In this study, we developed a bioinformatics method for identifying proteases that can cleave viral envelope glycoproteins. The proposed approach combines predictive models of the substrate specificity of human proteases with a structural analysis method that predicts the vulnerability of protein regions to proteolysis from their 3D structures. Specificity models were constructed for 169 human proteases using information on their known substrates. A previously developed method for structural analysis of potential proteolysis sites was applied in parallel with the specificity models. The proposed approach was validated on the SARS-CoV-2 spike protein, whose proteolysis sites have been well studied.
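As an illustration of what a substrate specificity model can look like, the following minimal sketch builds a log-odds position weight matrix from aligned cleavage windows (P4 to P4') and scans a sequence for the best-scoring window. The abstract does not state which model type was used, so the matrix approach, the uniform background, the pseudocount, all function names, and the toy substrate windows are assumptions made for illustration only. The sketch covers the specificity side only and does not model structural accessibility.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(substrates, pseudocount=1.0):
    """Build a log-odds matrix from equal-length aligned cleavage windows.

    Assumes a uniform amino acid background (an illustrative simplification).
    """
    width = len(substrates[0])
    if any(len(s) != width for s in substrates):
        raise ValueError("all substrate windows must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(substrates) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(width):
        counts = Counter(s[pos] for s in substrates)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pwm.append(column)
    return pwm


def score_window(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[aa] for column, aa in zip(pwm, window))


def scan(pwm, sequence):
    """Score every window of the matrix width along the sequence."""
    width = len(pwm)
    return [
        (start, score_window(pwm, sequence[start:start + width]))
        for start in range(len(sequence) - width + 1)
    ]


# Hypothetical P4-P4' windows for a protease preferring basic residues
# before the scissile bond. Toy data, not taken from any database.
toy_substrates = ["RRKRSVAG", "RAKRSLGS", "RSRRDVST", "RVRRSAGA", "KRKRSIEG"]

# Illustrative fragment containing a multibasic motif.
fragment = "TNSPRRARSVASQS"

pwm = build_pwm(toy_substrates)
best_start, best_score = max(scan(pwm, fragment), key=lambda item: item[1])
p1_index = best_start + 3  # P1 is the fourth residue of a P4-P4' window
print(best_start, fragment[best_start:best_start + 8], fragment[p1_index])
```

In a setting like the one described above, a score of this kind would be computed per protease and then considered alongside a separate estimate of how accessible the candidate site is in the 3D structure.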

About the authors

E. V. Matveev

Skolkovo Institute of Science and Technology; Kharkevich Institute for Information Transmission Problems; Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology

Email: mkazanov@gmail.com
Russian Federation, Moscow, 121205; Moscow, 127051; Moscow, 117997

G. V. Ponomarev

Skolkovo Institute of Science and Technology; Kharkevich Institute for Information Transmission Problems

Email: mkazanov@gmail.com
Russian Federation, Moscow, 121205; Moscow, 127051

M. D. Kazanov

Skolkovo Institute of Science and Technology; Kharkevich Institute for Information Transmission Problems; Dmitry Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology

Author for correspondence.
Email: mkazanov@gmail.com
Russian Federation, Moscow, 121205; Moscow, 127051; Moscow, 117997

Supplementary files

1. Fig. 1. Assessment of the proteolytic efficiency of human proteases toward SARS-CoV-2 S-glycoprotein sites. Human proteolytic enzymes with the highest CS values at position R815 of the S2’ site (a) and at position R685 of the S1/S2 site (b). Distribution of CS values for known proteolytic sites and randomly selected positions (c).
2. Fig. 2. Assessment of the proteolytic accessibility of SARS-CoV-2 S-glycoprotein sites. Proteolytic accessibility estimates mapped onto the S-glycoprotein structure (PDB ID: 6VXX) in the region of the S2’ site, position R815 (a), and in the region of the S1/S2 site, position R685 (b). The color gradient from blue through red to yellow corresponds to proteolytic accessibility scores from the minimum of 0 to the maximum of 1.
3. Supplementary 1
4. Supplementary 2

Copyright (c) 2024 Russian Academy of Sciences