TY - JOUR
T1 - Supervised Machine Learning Algorithms for Fitness-Based Cardiometabolic Risk Classification in Adolescents
AU - Yáñez-Sepúlveda, Rodrigo
AU - Olivares, Rodrigo
AU - Olivares, Pablo
AU - Zavala-Crichton, Juan Pablo
AU - Hinojosa-Torres, Claudio
AU - Giakoni-Ramírez, Frano
AU - Souza-Lima, Josivaldo de
AU - Monsalves-Álvarez, Matías
AU - Tuesta, Marcelo
AU - Páez-Herrera, Jacqueline
AU - Olivares-Arancibia, Jorge
AU - Reyes-Amigo, Tomás
AU - Cortés-Roco, Guillermo
AU - Hurtado-Almonacid, Juan
AU - Guzmán-Muñoz, Eduardo
AU - Aguilera-Martínez, Nicole
AU - López-Gil, José Francisco
AU - Clemente-Suárez, Vicente Javier
N1 - Publisher Copyright: © 2025 by the authors.
PY - 2025/8
Y1 - 2025/8
N2 - Background: Cardiometabolic risk in adolescents represents a growing public health concern that is closely linked to modifiable factors such as physical fitness. Traditional statistical approaches often fail to capture complex, nonlinear relationships among anthropometric and fitness-related variables. Objective: To develop and evaluate supervised machine learning algorithms, including artificial neural networks and ensemble methods, for classifying cardiometabolic risk levels among Chilean adolescents based on standardized physical fitness assessments. Methods: A cross-sectional analysis was conducted using a large representative sample of school-aged adolescents. Field-based physical fitness tests, such as cardiorespiratory fitness (in terms of estimated maximal oxygen consumption [VO2max]), muscular strength (push-ups), and explosive power (horizontal jump) testing, were used as input variables. A cardiometabolic risk index was derived using international criteria. Various supervised machine learning models were trained and compared regarding accuracy, F1 score, recall, and area under the receiver operating characteristic curve (AUC-ROC). Results: Among all the models tested, the gradient boosting classifier achieved the best overall performance, with an accuracy of 77.0%, an F1 score of 67.3%, and the highest AUC-ROC (0.601). These results indicate a strong balance between sensitivity and specificity in classifying adolescents at cardiometabolic risk. Horizontal jumps and push-ups emerged as the most influential predictive variables. Conclusions: Gradient boosting proved to be the most effective model for predicting cardiometabolic risk based on physical fitness data. This approach offers a practical, data-driven tool for early risk detection in adolescent populations and may support scalable screening efforts in educational and clinical settings.
KW - adolescent
KW - gradient boosting
KW - health
KW - physical fitness
KW - predictive modeling
UR - http://www.scopus.com/inward/record.url?scp=105014467682&partnerID=8YFLogxK
U2 - 10.3390/sports13080273
DO - 10.3390/sports13080273
M3 - Article
AN - SCOPUS:105014467682
SN - 2075-4663
VL - 13
JO - Sports
JF - Sports
IS - 8
M1 - 273
ER -