Towards multi-scale inter-frame attention to improve deep learning tasks

Author(s): Bhattarai, Ashuta
Date Accessioned: 2025-05-13T17:42:38Z
Date Available: 2025-05-13T17:42:38Z
Publication Date: 2025
SWORD Update: 2025-04-28T04:03:51Z
Abstract: Access to specialized medical screening remains a challenge for individuals with sickle cell disease (SCD), particularly those in low-income and rural communities, where advanced diagnostic tools and expert evaluations are limited. In ophthalmology, Sickle Cell Retinopathy (SCR) diagnosis relies on ophthalmologic evaluation, including Optical Coherence Tomography (OCT) scans, but manual interpretation is prone to subjectivity, fatigue-induced errors, and inconsistencies across clinicians. Similarly, video-based event analysis, such as reconstructing crime scenes from fragmented surveillance footage, is a time-intensive process that requires manual ordering and interpretation of unordered clips. These challenges highlight the need for automated solutions that enhance medical diagnostics and video-based decision-making.

To address these issues, we propose Multi-scale Inter-frame Attention (MIA), a novel framework that enhances deep learning models for processing volumetric and video datasets. Our approach leverages spatial and spatio-temporal attention mechanisms to improve feature extraction and representation learning. We integrate MIA into two specialized models: the Cross-Scan Attention Transformer (CSAT) for SCR detection and the Sequential Ordering of Frames in Time (SOFT) for video-based action recognition. Experimental results demonstrate that CSAT+MIA outperforms conventional object detection models in diagnosing SCR, while SOFT+MIA enhances action recognition, particularly in temporally shuffled scenarios.

Beyond domain-specific improvements, our research aims to establish a unified deep-learning method capable of capturing both inter-frame and intra-frame relationships for broader applications in medical imaging, surveillance, and video understanding. By integrating multi-scale inter-frame attention, we advance the field of automated diagnosis and event reconstruction, paving the way for more efficient, reliable, and intelligent decision-making systems.
Advisor: Kambhamettu, Chandra
Degree: Ph.D.
Department: University of Delaware, Department of Computer and Information Sciences
Unique Identifier: 1519583855
URL: https://udspace.udel.edu/handle/19716/36137
Language: en
Publisher: University of Delaware
URI: https://www.proquest.com/pqdtlocal1006271/dissertations-theses/towards-multi-scale-inter-frame-attention-improve/docview/3196068009/sem-2?accountid=10457
Keywords: Cross scan attention transformer; Sickle cell retinopathy; Video understanding; Sickle cell disease
Title: Towards multi-scale inter-frame attention to improve deep learning tasks
Type: Thesis
Files
Original bundle
Name: Bhattarai_udel_0060D_16508.pdf
Size: 48.36 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.22 KB
Format: Item-specific license agreed upon to submission