Exploring the Effectiveness of Large-Scale Automated Writing Evaluation Implementation on State Test Performance Using Generalised Boosted Modelling
Date
2025-02-23
Authors
Huang, Y.; Wilson, J.
Journal Title
Journal of Computer Assisted Learning
Publisher
John Wiley & Sons Ltd
Abstract
Background
Automated writing evaluation (AWE) systems, used as formative assessment tools in writing classrooms, show promise for enhancing instruction and improving student performance. Although meta-analytic evidence supports AWE's effectiveness in various contexts, research on its effectiveness in the U.S. K–12 setting has lagged behind its rapid adoption. Further rigorous studies are needed to investigate the effectiveness of AWE within the U.S. K–12 context.
Objectives
This study aims to investigate the usage and effectiveness of the Utah Compose AWE system on students' state test English Language Arts (ELA) performance in its first year of statewide implementation.
Methods
The sample comprised all Utah students in grades 4–11 in the 2015 school year (N = 337,473). Using a quasi-experimental design with generalised boosted modelling for propensity score weighting, the analysis estimated the average treatment effect on the treated (ATT) of the AWE system.
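The article itself does not include code; the sketch below is a minimal illustration of the general technique named above, not the authors' pipeline. It estimates propensity scores with a gradient-boosted classifier and forms ATT weights, in which treated units receive weight 1 and comparison units receive the propensity odds e(x)/(1 − e(x)). All variable names (df, awe_use, ela_score, covariates) are hypothetical.

```python
# Minimal sketch (not the authors' code) of GBM-based propensity score
# weighting for the ATT. The DataFrame `df`, treatment flag `awe_use`,
# outcome `ela_score`, and covariate list are all hypothetical names.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier


def att_weights(df: pd.DataFrame, covariates: list, treatment: str) -> np.ndarray:
    """Fit a gradient-boosted classifier to estimate propensity scores,
    then form ATT weights: 1 for treated units, e/(1 - e) for comparisons."""
    X = df[covariates]
    t = df[treatment].to_numpy()
    gbm = GradientBoostingClassifier(n_estimators=500, max_depth=3,
                                     learning_rate=0.01)
    gbm.fit(X, t)
    e = gbm.predict_proba(X)[:, 1]          # estimated P(treated | covariates)
    e = np.clip(e, 1e-6, 1 - 1e-6)          # guard against extreme scores
    return np.where(t == 1, 1.0, e / (1.0 - e))


def weighted_att(df: pd.DataFrame, covariates: list,
                 treatment: str = "awe_use", outcome: str = "ela_score") -> float:
    """Weighted difference in mean outcomes: an ATT estimate under the
    usual unconfoundedness and overlap assumptions."""
    w = att_weights(df, covariates, treatment)
    t = df[treatment].to_numpy() == 1
    y = df[outcome].to_numpy()
    return y[t].mean() - np.average(y[~t], weights=w[~t])
```

In practice, GBM-based propensity score tools such as the R twang package tune the number of trees against a covariate-balance criterion (e.g., standardised mean differences) rather than classification accuracy; that diagnostic step is omitted here for brevity.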
Results and Conclusions
The results showed that students who used AWE more frequently demonstrated greater ELA performance gains than peers with lower or no usage. Effects varied across student demographic subgroups. By examining a large-scale, naturalistic implementation, this study provides strong and systematic evidence consistent with a causal interpretation of AWE's effects, offering valuable insights for stakeholders seeking to understand the effectiveness of AWE systems.
Summary
- What is currently known about this topic?
○ Automated writing evaluation (AWE) systems have the potential to enhance writing instruction and improve student writing skills.
○ Previous studies show mixed results regarding AWE effectiveness in U.S. K–12 settings.
○ There is a lack of methodologically rigorous studies that explore AWE's effects on high-stakes performance.
- What does this paper add?
○ This study is the first to apply propensity score methods to AWE effectiveness research, providing robust evidence consistent with a causal relationship across diverse demographic groups within a large-scale, naturalistic U.S. implementation.
○ Deeper and more consistent use of AWE significantly enhances students' state test English Language Arts performance.
○ Students from certain demographic subgroups may benefit more from using AWE frequently.
- Implications for practice and/or policy
○ The findings provide evidence-based guidance for policymakers and administrators on adopting AWE systems and their effects on state test performance.
○ AWE shows promise to help address long-standing educational disparities.
Description
This article was originally published in Journal of Computer Assisted Learning. The version of record is available at: https://doi.org/10.1111/jcal.70009.
© 2025 The Author(s). Journal of Computer Assisted Learning published by John Wiley & Sons Ltd.
This is an open access article under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Keywords
automated writing evaluation, educational technology, generalised boosted modelling, propensity score weighting
Citation
Huang, Y. and Wilson, J. (2025), Exploring the Effectiveness of Large-Scale Automated Writing Evaluation Implementation on State Test Performance Using Generalised Boosted Modelling. J Comput Assist Learn, 41: e70009. https://doi.org/10.1111/jcal.70009