Rapid traversal of vast chemical space using machine learning-guided docking screens
| dc.contributor.affiliation | Universidade de Santiago de Compostela. Centro de Investigación en Medicina Molecular e Enfermidades Crónicas (CiMUS) | |
| dc.contributor.affiliation | Universidade de Santiago de Compostela. Departamento de Farmacoloxía, Farmacia e Tecnoloxía Farmacéutica | |
| dc.contributor.author | Luttens, Andreas | |
| dc.contributor.author | Cabeza de Vaca, Israel | |
| dc.contributor.author | Sparring, Leonard | |
| dc.contributor.author | Brea Floriani, José Manuel | |
| dc.contributor.author | Martínez Rodríguez, Antón Leandro | |
| dc.contributor.author | Kahlous, Nour Aldin | |
| dc.contributor.author | Radchenko, Dmytro S. | |
| dc.contributor.author | Moroz, Yurii S. | |
| dc.contributor.author | Loza García, María Isabel | |
| dc.contributor.author | Norinder, Ulf | |
| dc.contributor.author | Carlsson, Jens | |
| dc.date.accessioned | 2026-01-29T13:04:04Z | |
| dc.date.available | 2026-01-29T13:04:04Z | |
| dc.date.issued | 2025-03-13 | |
| dc.description.abstract | The accelerating growth of make-on-demand chemical libraries provides unprecedented opportunities to identify starting points for drug discovery with virtual screening. However, these multi-billion-scale libraries are challenging to screen, even for the fastest structure-based docking methods. Here we explore a strategy that combines machine learning and molecular docking to enable rapid virtual screening of databases containing billions of compounds. In our workflow, a classification algorithm is trained to identify top-scoring compounds based on molecular docking of 1 million compounds to the target protein. The conformal prediction framework is then used to make selections from the multi-billion-scale library, reducing the number of compounds to be scored by docking. The CatBoost classifier showed an optimal balance between speed and accuracy and was used to adapt the workflow for screens of ultralarge libraries. Application to a library of 3.5 billion compounds demonstrated that our protocol can reduce the computational cost of structure-based virtual screening by more than 1,000-fold. Experimental testing of predictions identified ligands of G protein-coupled receptors and demonstrated that our approach enables discovery of compounds with multi-target activity tailored for therapeutic effect. | |
| dc.description.peerreviewed | Yes | |
| dc.description.sponsorship | A.L. was supported by a postdoctoral scholarship from the Knut and Alice Wallenberg Foundation (KAW2022.0347). J.C. received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (grant agreement 715052), the Swedish Cancer Society, the Swedish Research Council and the Olle Engkvist Foundation. This research was partially supported by the project AI4Research at Uppsala University. I.C.d.V. was funded by a postdoctoral fellowship provided by the Sven och Lilly Lawski foundation. The computations were enabled using resources provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) (partially funded by the Swedish Research Council through grant agreement number 2022-06725) and the supercomputing resource Berzelius provided by the National Supercomputer Centre at Linköping University and the Knut and Alice Wallenberg Foundation. J.B., A.L.M. and M.I.L. were funded by Agencia Estatal de Investigación (PID2020-119428RB-I00), Xunta de Galicia (ED431C 2022/20) and European Regional Development Fund (ERDF). A.L., I.C.d.V. and J.C. thank OpenEye Scientific Software for the use of OEToolkits at no cost. We thank J. Zhang for providing the initial deep neural network code. | |
| dc.identifier.citation | Luttens, A., Cabeza de Vaca, I., Sparring, L. et al. Rapid traversal of vast chemical space using machine learning-guided docking screens. Nat Comput Sci 5, 301–312 (2025). https://doi.org/10.1038/s43588-025-00777-x | |
| dc.identifier.doi | 10.1038/s43588-025-00777-x | |
| dc.identifier.essn | 2662-8457 | |
| dc.identifier.uri | https://hdl.handle.net/10347/45594 | |
| dc.issue.number | 5 | |
| dc.journal.title | Nature Computational Science | |
| dc.language.iso | eng | |
| dc.page.final | 312 | |
| dc.page.initial | 301 | |
| dc.publisher | Springer Nature | |
| dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/715052/ | |
| dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-119428RB-I00/ES/NUEVA APROXIMACION EXPERIMENTAL PARA LA IDENTIFICAICON DE ANTIPSICOTICOS ACTIVOS FRENTE AL DEFICIT COGNITIVO EN ESQUIZOFRENIA | |
| dc.relation.publisherversion | https://doi.org/10.1038/s43588-025-00777-x | |
| dc.rights | © The Author(s) 2025. This article is licensed under a Creative Commons Attribution 4.0 International License | |
| dc.rights | Attribution 4.0 International | en |
| dc.rights.accessRights | open access | |
| dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | |
| dc.subject | Cheminformatics | |
| dc.subject | Computational chemistry | |
| dc.subject | Machine learning | |
| dc.subject | Structure-based drug design | |
| dc.subject | Virtual drug screening | |
| dc.title | Rapid traversal of vast chemical space using machine learning-guided docking screens | |
| dc.type | journal article | |
| dc.type.hasVersion | VoR | |
| dspace.entity.type | Publication | |
| relation.isAuthorOfPublication | 67b19be7-64a8-45c8-a6e4-ed48a4410ef8 | |
| relation.isAuthorOfPublication | efe7f464-2f77-4a92-915f-fda4128451fa | |
| relation.isAuthorOfPublication | 7765cb9b-b630-44dc-9477-dd266a62bb3c | |
| relation.isAuthorOfPublication.latestForDiscovery | 67b19be7-64a8-45c8-a6e4-ed48a4410ef8 |
Files
Original bundle
- Name: 2025_NatCompSci_Luttens_Rapid.pdf
- Size: 3.33 MB
- Format: Adobe Portable Document Format