Applying molecular similarity to evaluate the accuracy of retention index predictions in gas chromatography using deep learning
- Authors: Matyushin D.D. (1), Sholokhova A.Y. (1), Khrisanfov M.D. (1, 2), Borovikova S.A. (1)
- Affiliations:
  1. A. N. Frumkin Institute of Physical Chemistry and Electrochemistry of the Russian Academy of Sciences
  2. M. V. Lomonosov Moscow State University
- Issue: Vol 99, No 1 (2025)
- Pages: 144-152
- Section: PHYSICAL CHEMISTRY OF SEPARATION PROCESSES. CHROMATOGRAPHY
- Submitted: 01.06.2025
- Published: 17.04.2025
- URL: https://innoscience.ru/0044-4537/article/view/681877
- DOI: https://doi.org/10.31857/S0044453725010146
- EDN: https://elibrary.ru/EHWTZH
- ID: 681877
Abstract
When retention indices are predicted using deep learning, there is usually no way to assess the reliability of the prediction for a particular molecule. In this work, using stationary phases based on polyethylene glycol and the NIST 17 database as an example, it is shown that, on average, the closer the most similar molecule in the training data set is to the compound being predicted, the more accurate the prediction. Among the four molecular similarity algorithms considered, Tanimoto similarity of ECFP “molecular fingerprints” is the most appropriate for this problem. It is also shown that for a number of transformation products of unsymmetrical dimethylhydrazine whose structures were established using such predictions, the predictions could be very unreliable.
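To illustrate the similarity measure named in the abstract, the following is a minimal sketch of Tanimoto similarity and of a "similarity to the nearest training molecule" score. It assumes fingerprints are represented as sets of on-bit indices. The function names and the toy bit sets are hypothetical and do not come from the article, and real ECFP fingerprints would normally be generated with a cheminformatics toolkit. Using the maximum similarity over the training set as the reliability indicator is one plausible reading of the abstract, not a confirmed description of the authors' procedure.

```python
from typing import Iterable, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_similarity_to_training(query: Set[int], training: Iterable[Set[int]]) -> float:
    """Similarity of the query to its nearest neighbour in the training set."""
    return max((tanimoto(query, fp) for fp in training), default=0.0)


# Toy fingerprints (hypothetical bit indices, not real ECFP output).
training_fps = [{1, 4, 7, 9}, {2, 4, 8}, {1, 2, 3, 4, 5}]
query_fp = {1, 4, 7, 10}

nearest = max_similarity_to_training(query_fp, training_fps)
print(f"max Tanimoto similarity to training set: {nearest:.2f}")
```

Under this reading, a low score would flag a molecule unlike anything in the training data, for which the predicted retention index deserves less trust.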
About the authors
D. D. Matyushin
A. N. Frumkin Institute of Physical Chemistry and Electrochemistry of the Russian Academy of Sciences
Email: shonastya@yandex.ru
Russian Federation, Moscow, 119071
A. Yu. Sholokhova
A. N. Frumkin Institute of Physical Chemistry and Electrochemistry of the Russian Academy of Sciences
Author for correspondence.
Email: shonastya@yandex.ru
Russian Federation, Moscow, 119071
M. D. Khrisanfov
A. N. Frumkin Institute of Physical Chemistry and Electrochemistry of the Russian Academy of Sciences; M. V. Lomonosov Moscow State University
Email: shonastya@yandex.ru
Russian Federation, Moscow, 119071; Moscow, 119991
S. A. Borovikova
A. N. Frumkin Institute of Physical Chemistry and Electrochemistry of the Russian Academy of Sciences
Email: shonastya@yandex.ru
Russian Federation, Moscow, 119071