Evaluating the classification accuracy of MI Write as a universal writing screener in middle school
Date
2025
Publisher
University of Delaware
Abstract
Writing proficiency plays a key role in academic success, yet many middle school students do not meet the required standards, highlighting the urgent need for effective universal screening tools. Traditional screening methods often fall short because they are subjective and inefficient at capturing the complexity of writing. Automated Writing Evaluation (AWE) systems offer a more efficient and reliable alternative, but their role as universal writing screeners in middle schools, where writing demands are more complex, remains unexplored.

This dissertation investigates the potential of AWE tools, specifically MI Write, as a more efficient and reliable alternative in the middle school setting. It assesses MI Write’s classification accuracy in identifying middle school students at risk of failing to demonstrate grade-level English Language Arts (ELA) proficiency on the Smarter Balanced (SB) ELA assessment across different time points in the academic year (fall, winter, spring) and different risk-status cutpoints. The research also examines whether integrating MI Write with the reading screener i-Ready enhances classification accuracy compared to using each screener individually.

Using a quantitative methodology, the study systematically analyzes ROC curve results to measure MI Write’s classification accuracy, comparing it across time points and grade levels. The findings demonstrate that MI Write consistently achieves AUC values above .75, affirming its potential as an effective standalone screener. MI Write’s classification accuracy varies by grade level, performing strongest in Grade 7 and weakest in Grade 8, suggesting the need for grade-specific scoring and benchmarks. Across all grades and seasons, d-based cutpoints provided the best balance of sensitivity and specificity, with scores below 15.35 generally indicating a high risk of not meeting grade-level ELA proficiency.
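The ROC analysis described above can be sketched in a few lines of Python. This is an illustrative toy, not the dissertation's actual analysis: it computes AUC as the Mann–Whitney rank statistic and selects the cutpoint minimizing the distance d to the perfect-classifier corner (sensitivity = specificity = 1) of the ROC plane, which is one common construction of a "d-based" cutpoint; the sample scores and labels are invented.

```python
import math

def auc(scores, labels):
    """Mann-Whitney AUC: the probability that a randomly chosen at-risk
    student (label 1) scores LOWER than a randomly chosen not-at-risk
    student (label 0), since lower writing scores signal higher risk."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # at risk
    neg = [s for s, y in zip(scores, labels) if y == 0]  # not at risk
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def d_based_cutpoint(scores, labels):
    """Among observed scores, pick the cutpoint c (flag if score <= c)
    that minimizes d = sqrt((1 - sensitivity)^2 + (1 - specificity)^2),
    i.e. the ROC point closest to a perfect classifier."""
    best = (float("inf"), None)
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= c)
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s > c)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s > c)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= c)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        best = min(best, (math.hypot(1 - sens, 1 - spec), c))
    return best[1]
```

With toy data where at-risk students cluster at the low end of the score range, `auc` returns a value well above the .75 benchmark discussed above, and `d_based_cutpoint` returns the score threshold balancing sensitivity and specificity.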
This study also compares the classification accuracy of MI Write, i-Ready, and a combined model across seasons and grade levels. The combined model consistently outperformed the writing-only model and slightly outperformed i-Ready alone, with AUC values ranging from .91 to .94. The 90% sensitivity cutpoint offered the best trade-off by minimizing false negatives, with a predicted probability threshold of .39 for identifying at-risk students. While the combined model maintained consistent accuracy throughout the year, its performance varied across grades, highlighting the need for grade-specific cutpoints and adjustments. In a gated model, MI Write could serve as a valuable second-stage screener, enhancing precision while reducing costs and unnecessary testing.

Additionally, the study examines middle school ELA administrators’ perceptions of MI Write’s trustworthiness as a universal screening tool. In qualitative focus group interviews, administrators acknowledged its accuracy in identifying at-risk students, but they also identified several implementation challenges and emphasized the need for improvements and better integration with existing educational tools.

This dissertation bridges a critical literature gap by providing new insights into the classification accuracy of AWE tools as universal writing screeners in the demanding middle school environment. The findings have broad implications for screening practices, offering stakeholders evidence-based guidance on the role of technology-enhanced assessments and supporting more informed, data-driven decisions in identifying and supporting middle school students at risk.
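The gated model mentioned above can be illustrated with a minimal sketch. Assumptions are labeled in the code: the first-stage i-Ready cutoff (480) is an invented placeholder, and only the 15.35 MI Write cutpoint comes from the findings reported here; a real deployment would use the study's validated, grade-specific thresholds.

```python
def gated_screen(reading_score, writing_score,
                 reading_cut=480, writing_cut=15.35):
    """Two-stage (gated) universal screen.

    Stage 1: everyone takes the broad reading screener; students at or
    above reading_cut are cleared (reading_cut=480 is a hypothetical
    placeholder, NOT a value from the study).
    Stage 2: only students flagged at stage 1 take the writing screen,
    which refines the flag and reduces unnecessary testing; 15.35 is
    the d-based MI Write cutpoint reported in the abstract.
    Returns True if the student is flagged as at risk."""
    if reading_score >= reading_cut:
        return False                      # cleared at stage 1
    return writing_score < writing_cut    # stage 2 refines the flag
```

The design point is that the second-stage screener is administered only to the smaller stage-1 subgroup, which is how a gated model trades a little sensitivity for lower cost and higher precision.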
Keywords
Writing proficiency, Automated Writing Evaluation, English Language Arts, Universal writing screener, Middle schools
