Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT


BACKGROUND: Stereotactic body radiotherapy of thoracic and abdominal tumors must account for respiratory intrafractional tumor motion. Commonly, an external breathing signal is continuously acquired that serves as a surrogate of the tumor motion and forms the basis of strategies like breathing-guided imaging and gated dose delivery. However, due to inherent system latencies, a temporal lag exists between the acquired respiratory signal and the system response. Respiratory signal prediction models aim to compensate for these delays and thereby improve imaging and dose delivery.

PURPOSE: The present study explores and compares six state-of-the-art machine and deep learning-based prediction models, focusing on real-time and real-world applicability. All models and data are provided as open source to ensure reproducibility of the results and to foster reuse.

METHODS: The study was based on 2502 breathing signals ($t_{total} \approx 90$ h) acquired during clinical routine, split into independent training (50%), validation (20%), and test (30%) sets. Input signal values were sampled from noisy signals, and the target signal values were selected from the corresponding denoised signals. A standard linear prediction model (Linear), two state-of-the-art models in general univariate signal prediction (Dlinear, Xgboost), and three deep learning models (Lstm, Trans-Enc, Trans-TSF) were chosen. The prediction performance was evaluated for three prediction horizons (480, 680, and 920 ms). Moreover, the robustness of the different models when applied to atypical, that is, out-of-distribution (OOD) signals, was analyzed.
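The described setup (noisy input windows, denoised targets, a fixed prediction horizon) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, window length, and the assumed 25 Hz sampling rate (under which a 480 ms horizon corresponds to 12 samples) are illustrative assumptions.

```python
import numpy as np

def make_prediction_pairs(noisy, denoised, window_len, horizon):
    """Build (input window, target) pairs for horizon-ahead prediction.

    Inputs are taken from the noisy signal and targets from the denoised
    signal, mirroring the abstract's description. All parameter values
    here are illustrative assumptions, not those of the study.
    """
    X, y = [], []
    for start in range(len(noisy) - window_len - horizon + 1):
        end = start + window_len
        X.append(noisy[start:end])             # past noisy samples
        y.append(denoised[end + horizon - 1])  # denoised value `horizon` steps ahead
    return np.array(X), np.array(y)

# Example: assume a 25 Hz sampling rate, so a 480 ms horizon is 12 samples.
fs = 25
t = np.arange(0, 10, 1 / fs)                   # 10 s of signal
clean = np.sin(2 * np.pi * 0.25 * t)           # ~4 s breathing cycle
noisy = clean + 0.05 * np.random.randn(len(t))
X, y = make_prediction_pairs(noisy, clean, window_len=50, horizon=12)
print(X.shape, y.shape)                        # (189, 50) (189,)
```

Any of the six compared models can then be trained on such (X, y) pairs; only the amount of usable signal history differs between them.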

RESULTS: The Lstm model achieved the lowest normalized root mean square error for all prediction horizons. The prediction errors only slightly increased for longer horizons. However, a substantial spread of the error values across the test signals was observed. Compared to typical, that is, in-distribution test signals, the prediction accuracy of all models decreased when applied to OOD signals. The more complex deep learning models Lstm and Trans-Enc showed the least performance loss, while the performance of simpler models like Linear dropped the most. Except for Trans-Enc, inference times for the different models allowed for real-time application.
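The reported metric, normalized root mean square error, can be computed as in the sketch below. The normalization by the peak-to-peak range of the ground truth is an assumption for illustration; the paper may normalize differently (e.g., by the signal mean or standard deviation).

```python
import numpy as np

def nrmse(y_true, y_pred):
    """RMSE normalized by the peak-to-peak range of the ground truth.

    Normalization choice is an illustrative assumption."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (np.max(y_true) - np.min(y_true))

y_true = np.array([0.0, 1.0, 0.0, -1.0])
y_pred = np.array([0.1, 0.9, 0.0, -1.1])
print(nrmse(y_true, y_pred))  # ≈ 0.0433
```

Because the metric is normalized per signal, it allows aggregating errors across breathing signals with very different amplitudes, which is what makes the reported spread across test signals directly comparable.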

CONCLUSION: The application of the Lstm model achieved the lowest prediction errors. Simpler prediction filters suffer from limited signal history access, resulting in a drop in performance for OOD signals.

Bibliographic data

Status: Published - 05.2024
PubMed 38536107