TY - JOUR
T1 - Comparative Data Analysis of Virtual Screening Methodologies for Predicting Urease Inhibitory Activity
AU - Valdés-Muñoz, Elizabeth
AU - Olguín-Orellana, Gabriel J.
AU - Ríos-Rozas, Sofía E.
AU - Alegría-Arcos, Melissa
AU - Morales, Natalia
AU - Rojas-Santander, Vicente
AU - Farías-Abarca, Javier
AU - Palma, Jonathan M.
AU - Hernández-Rodríguez, Erix W.
AU - Suardíaz, Reynier
AU - Bustos, Daniel
N1 - Publisher Copyright: © 2025 The Authors. Published by American Chemical Society
PY - 2025/10/28
Y1 - 2025/10/28
N2 - Structure-based virtual screening (SBVS) is a fundamental approach in drug discovery, yet its predictive accuracy is highly dependent on methodological choices, scoring functions, and data processing strategies. This study systematically evaluates five protocol variants integrating molecular docking, induced-fit docking (IFD), quantum-polarized ligand docking (QPLD), ensemble docking (ED), and molecular mechanics/generalized Born surface area (MM-GBSA) in Helicobacter pylori urease employing four distinct crystallographic structures obtained from the Protein Data Bank (PDB). We assess their predictive performance using statistical correlation metrics (Spearman and Pearson) and error-based measures (mean absolute error, root-mean-squared error, and inlier ratio metric). Additionally, we investigate the influence of data fusion techniques (minimum, median, arithmetic, geometric, harmonic, and Euclidean means) and varying numbers of docking poses (ranging from 1 to 100) on ligand ranking accuracy. Results indicate that MM-GBSA and ED consistently outperform other methods in compound ranking, although MM-GBSA exhibits higher errors in absolute binding energy predictions. While increasing the number of poses generally reduces predictive accuracy, the minimum fusion approach remains robust across all conditions. Comparisons between IC50 and pIC50 as experimental reference values reveal that pIC50 provides higher Pearson correlations, reinforcing its suitability for affinity prediction, while both metrics perform similarly in Spearman rankings. These findings refine SBVS workflows by optimizing scoring and pose aggregation strategies, highlighting the importance of method selection and data fusion techniques. The proposed framework enhances ligand prioritization in virtual screening campaigns and can be adapted to other therapeutic targets. Future research should explore adaptive scoring frameworks and machine-learning approaches to further improve the SBVS predictive reliability.
AB - Structure-based virtual screening (SBVS) is a fundamental approach in drug discovery, yet its predictive accuracy is highly dependent on methodological choices, scoring functions, and data processing strategies. This study systematically evaluates five protocol variants integrating molecular docking, induced-fit docking (IFD), quantum-polarized ligand docking (QPLD), ensemble docking (ED), and molecular mechanics/generalized Born surface area (MM-GBSA) in Helicobacter pylori urease employing four distinct crystallographic structures obtained from the Protein Data Bank (PDB). We assess their predictive performance using statistical correlation metrics (Spearman and Pearson) and error-based measures (mean absolute error, root-mean-squared error, and inlier ratio metric). Additionally, we investigate the influence of data fusion techniques (minimum, median, arithmetic, geometric, harmonic, and Euclidean means) and varying numbers of docking poses (ranging from 1 to 100) on ligand ranking accuracy. Results indicate that MM-GBSA and ED consistently outperform other methods in compound ranking, although MM-GBSA exhibits higher errors in absolute binding energy predictions. While increasing the number of poses generally reduces predictive accuracy, the minimum fusion approach remains robust across all conditions. Comparisons between IC50 and pIC50 as experimental reference values reveal that pIC50 provides higher Pearson correlations, reinforcing its suitability for affinity prediction, while both metrics perform similarly in Spearman rankings. These findings refine SBVS workflows by optimizing scoring and pose aggregation strategies, highlighting the importance of method selection and data fusion techniques. The proposed framework enhances ligand prioritization in virtual screening campaigns and can be adapted to other therapeutic targets. Future research should explore adaptive scoring frameworks and machine-learning approaches to further improve the SBVS predictive reliability.
UR - http://www.scopus.com/inward/record.url?scp=105019975234&partnerID=8YFLogxK
U2 - 10.1021/acsomega.5c04457
DO - 10.1021/acsomega.5c04457
M3 - Article
AN - SCOPUS:105019975234
SN - 2470-1343
VL - 10
SP - 49641
EP - 49658
JO - ACS Omega
JF - ACS Omega
IS - 42
ER -