Affect and behavior assessment in autism therapy: a multimodal machine learning approach
Date
2024
Publisher
University of Delaware
Abstract
Autism spectrum disorder (ASD) is a developmental disorder that affects a person's communication and behavior. Over 1 million children in the U.S. are on the spectrum. A prevalent treatment for autism is play therapy, a psychotherapeutic approach that encourages children to communicate with others through play.

Video recordings of therapy interventions provide a rich resource for therapists to analyze and monitor the developmental states of children with autism. However, reviewing the recordings manually is cumbersome and costly. At the same time, there is little research on automated solutions for reviewing videos of children with ASD in play therapy settings, owing to privacy concerns and the scarcity of public datasets. This thesis aims to automatically analyze and evaluate video recordings of children in autism therapy using deep learning techniques. Two main types of behavior are investigated: emotion and movement synchrony.

Emotion recognition identifies human emotion from multiple cues, including facial expressions, speech, and physiological and biological signals. We proposed a two-stage multimodal approach that uses acoustic and visual cues to classify the affect states of children with ASD. It provides a novel way to combine human expertise and machine intelligence for affect recognition by taking into account the distinct behaviors associated with different emotions. For example, children tend to scream and shout more in negative affect states and smile more in positive ones.

Movement synchrony refers to the synchronization and similarity of movements between individuals. It is a crucial metric for understanding children's developmental stages. To estimate it, we introduced a multi-task framework that integrates movement synchrony estimation with auxiliary tasks such as identifying intervention activities and assessing individual action quality. We further restricted our research to privacy-preserving data in order to safeguard the privacy of children with autism. The data modalities comprised skeleton and optical flow data, and we proposed state-of-the-art methods based on Transformer networks, with architectural adjustments tailored to the dyadic, interactive setting. Furthermore, we explored the use of related benchmark datasets, such as activity recognition benchmarks without synchrony labels, by constructing positive/negative data pairs and applying contrastive learning.

The outcomes of our study are significant for automated behavior analysis in play therapy. The findings also have practical applications in social behavior analysis involving human-human interactions, extending beyond intervention assessment to areas such as motor rehabilitation, education, choreography, and sports.
Keywords
Autism, Behavior analysis, Deep learning, Emotion recognition, Movement synchrony estimation, Multimodal machine learning