SwinDAF3D: Pyramid Swin transformers with deep attentive features for automated finger joint segmentation in 3D ultrasound images for rheumatoid arthritis assessment

Training (A) and validation (B) Dice score accuracy curves (one-fold) over 50 epochs for all models in the ablation study: 3D UNet, DAF3D, Swin UNETR, UNETR++, TransUNet, and SwinDAF3D.

Published in: Bioengineering (Basel). 2025 Apr 5;12(4):390. doi: 10.3390/bioengineering12040390.

Abstract

Rheumatoid arthritis (RA) is a chronic autoimmune disease that can cause severe joint damage and functional impairment. Ultrasound imaging has shown promise in providing real-time assessment of synovium inflammation associated with the early stages of RA. Accurate segmentation of the synovium region and quantification of inflammation-specific imaging biomarkers are crucial for assessing and grading RA. However, automatic segmentation of the synovium in 3D ultrasound is challenging due to ambiguous boundaries, variability in synovium shape, and inhomogeneous intensity distribution.

In this work, we introduce a novel network architecture, Swin Transformers with Deep Attentive Features for 3D segmentation (SwinDAF3D), which integrates Swin Transformers into a Deep Attentive Features framework. The developed architecture leverages the hierarchical structure and shifted windows of Swin Transformers to capture rich, multi-scale and attentive contextual information, improving the modeling of long-range dependencies and spatial hierarchies in 3D ultrasound images.
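The shifted-window idea mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the window size, and the half-window cyclic shift are assumptions that follow the general Swin Transformer scheme, in which attention is computed inside non-overlapping local windows and alternate layers shift the window grid so that information crosses window borders.

```python
import numpy as np


def window_partition_3d(volume, window):
    """Split a (D, H, W) volume into non-overlapping 3D windows.

    Returns an array of shape (num_windows, wd, wh, ww).
    Hypothetical helper for illustration only.
    """
    d, h, w = volume.shape
    wd, wh, ww = window
    if d % wd or h % wh or w % ww:
        raise ValueError("volume dimensions must be divisible by the window size")
    # Expose the window grid and the in-window offsets as separate axes.
    x = volume.reshape(d // wd, wd, h // wh, wh, w // ww, ww)
    # Bring the three grid axes to the front, then flatten them.
    x = x.transpose(0, 2, 4, 1, 3, 5)
    return x.reshape(-1, wd, wh, ww)


def shifted_window_partition_3d(volume, window):
    """Cyclically shift the volume by half a window, then partition it.

    The half-window shift is an assumption based on the standard Swin design.
    """
    shift = tuple(-(s // 2) for s in window)
    rolled = np.roll(volume, shift=shift, axis=(0, 1, 2))
    return window_partition_3d(rolled, window)


volume = np.arange(4 * 4 * 4).reshape(4, 4, 4)
regular = window_partition_3d(volume, (2, 2, 2))
shifted = shifted_window_partition_3d(volume, (2, 2, 2))
```

In the regular partition each window covers a fixed block of voxels; in the shifted partition the same window index covers voxels that previously belonged to neighboring windows, which is what lets stacked layers exchange context across window boundaries.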

In a six-fold cross-validation study with 3D ultrasound images of RA patients’ finger joints (n = 72), our SwinDAF3D model achieved the highest performance with a Dice Score (DSC) of 0.838 ± 0.013, an Intersection over Union (IoU) of 0.719 ± 0.019, and Surface Dice Score (SDSC) of 0.852 ± 0.020, compared to 3D UNet (DSC: 0.742 ± 0.025; IoU: 0.589 ± 0.031; SDSC: 0.661 ± 0.029), DAF3D (DSC: 0.813 ± 0.017; IoU: 0.689 ± 0.022; SDSC: 0.817 ± 0.013), Swin UNETR (DSC: 0.808 ± 0.025; IoU: 0.678 ± 0.032; SDSC: 0.822 ± 0.039), UNETR++ (DSC: 0.810 ± 0.014; IoU: 0.684 ± 0.018; SDSC: 0.829 ± 0.027) and TransUNet (DSC: 0.818 ± 0.013; IoU: 0.692 ± 0.017; SDSC: 0.815 ± 0.016) models. This ablation study demonstrates the effectiveness of combining a Swin Transformers feature pyramid with a deep attention mechanism, improving the segmentation accuracy of the synovium in 3D ultrasound. This advancement shows great promise in enabling more efficient and standardized RA screening using ultrasound imaging.
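For readers unfamiliar with the overlap metrics reported above, the following is a minimal sketch of how DSC and IoU are commonly computed for binary 3D masks. The function names and the convention of returning 1.0 when both masks are empty are assumptions made for illustration; the paper's own evaluation code may differ, and the surface-based SDSC is not covered here.

```python
import numpy as np


def dice_score(pred, target):
    """Dice score 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * intersection / total)


def iou_score(pred, target):
    """Intersection over Union |A∩B| / |A∪B| for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(intersection / union)


# Toy 4x4x4 volumes: each mask covers two slices, overlapping in one slice.
pred = np.zeros((4, 4, 4), dtype=bool)
target = np.zeros((4, 4, 4), dtype=bool)
pred[:2] = True
target[1:3] = True

dsc = dice_score(pred, target)
iou = iou_score(pred, target)
```

For a single pair of masks the two metrics are tied by IoU = DSC / (2 − DSC). That identity does not generally hold for values averaged over cases or folds, so averaged scores such as those reported above need not satisfy it exactly.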

FUNDING SOURCE:  The research reported in this publication was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases (NIAMS) of the National Institutes of Health under award number R01AR060350. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
