A Machine Learning Model for Post-Concussion Musculoskeletal Injury Risk in Collegiate Athletes

Abstract
Background: Emerging evidence indicates an elevated risk of post-concussion musculoskeletal injuries in collegiate athletes; however, how to identify the athletes at highest risk remains unclear.
Objective: The purpose of this study was to model post-concussion musculoskeletal injury risk in collegiate athletes by integrating a comprehensive set of variables using machine learning.
Methods: A risk model was developed and tested on a dataset of 194 athletes (155 in the training set and 39 in the test set) with 135 variables entered into the analysis, which included participants' health and athletic history, concussion injury and recovery-specific criteria, and outcomes from a diverse array of concussion assessments. The machine learning approach involved transforming variables by the weight of evidence method, variable selection using L1-penalized logistic regression, model selection via the Akaike Information Criterion, and a final L2-regularized logistic regression fit.
Results: A model with 48 predictive variables yielded significant predictive performance for subsequent musculoskeletal injury, with an area under the curve of 0.82. Top predictors included cognitive, balance, and reaction measures at baseline and acute timepoints. At a specified false-positive rate of 6.67%, the model achieved a true-positive rate (sensitivity) of 79% and a precision (positive predictive value) of 95% for identifying at-risk athletes via a well-calibrated composite risk score.
Conclusions: These results support the development of a sensitive and specific injury risk model using standard data combined with a novel methodological approach that may allow clinicians to target student athletes at high injury risk.
The development and refinement of predictive models that incorporate machine learning and comprehensive datasets could improve identification of the student athletes most at risk for post-concussion musculoskeletal injury and allow targeted injury risk reduction strategies to be implemented.
Key Points
- There is a well-established elevated risk of post-concussion subsequent musculoskeletal injury; however, prior efforts have failed to identify risk factors.
- This study developed a composite risk score model with an area under the curve of 0.82 from common concussion clinical measures and participant demographics.
- By identifying athletes at elevated risk, clinicians may be able to reduce injury risk through targeted injury risk reduction programs.
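The Methods name a weight of evidence transform and model selection by the Akaike Information Criterion without giving formulas. The following is a minimal sketch of those two pieces, not the authors' implementation. It assumes the common definition WoE = ln(share of injured athletes in a bin / share of non-injured athletes in that bin); sign conventions vary between sources. The function names `weight_of_evidence` and `aic`, the `smoothing` pseudo-count, the bin labels, and the toy data are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict


def weight_of_evidence(bins, labels, smoothing=0.5):
    """Map each bin to ln(share of events / share of non-events).

    bins      : bin label per athlete (hypothetical, e.g. "slow" / "fast")
    labels    : 1 if a subsequent injury occurred, else 0
    smoothing : assumed pseudo-count per cell so empty cells avoid log(0)
    """
    events = defaultdict(float)
    non_events = defaultdict(float)
    for b, y in zip(bins, labels):
        if y == 1:
            events[b] += 1
        else:
            non_events[b] += 1

    categories = sorted(set(bins))
    total_events = sum(events[b] + smoothing for b in categories)
    total_non_events = sum(non_events[b] + smoothing for b in categories)

    woe = {}
    for b in categories:
        p_event = (events[b] + smoothing) / total_events
        p_non_event = (non_events[b] + smoothing) / total_non_events
        woe[b] = math.log(p_event / p_non_event)
    return woe


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L); lower is preferred."""
    return 2 * n_params - 2 * log_likelihood


# Toy data (made up): one binned variable for eight athletes.
toy_bins = ["slow"] * 4 + ["fast"] * 4
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]

woe = weight_of_evidence(toy_bins, toy_labels)
# With the default smoothing: slow is about +0.847, fast about -0.847.
print(woe)
```

In this toy example the "slow" bin holds three of the four injured athletes, so its WoE is positive, while the "fast" bin gets the mirror-image negative value. Each athlete's raw value would then be replaced by the WoE of its bin before the penalized logistic regression steps, and `aic` could be used to compare candidate models fitted on different variable subsets.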
Description
This article was originally published in Sports Medicine. The version of record is available at: https://doi.org/10.1007/s40279-025-02196-4. © The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. This research was featured in UDaily on 04/15/2025 at: https://www.udel.edu/udaily/2025/april/kaap-concussion-artificial-intelligence-injury-prediction-athletics/
Citation
Claros, C.C., Anderson, M.N., Qian, W. et al. A Machine Learning Model for Post-Concussion Musculoskeletal Injury Risk in Collegiate Athletes. Sports Med (2025). https://doi.org/10.1007/s40279-025-02196-4