Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT


BACKGROUND: Stereotactic body radiotherapy of thoracic and abdominal tumors must account for respiratory intrafractional tumor motion. Commonly, an external breathing signal is continuously acquired that serves as a surrogate of the tumor motion and forms the basis of strategies like breathing-guided imaging and gated dose delivery. However, due to inherent system latencies, a temporal lag exists between the acquired respiratory signal and the system response. Respiratory signal prediction models aim to compensate for these delays and thereby improve imaging and dose delivery.

PURPOSE: The present study explores and compares six state-of-the-art machine and deep learning-based prediction models, focusing on real-time and real-world applicability. All models and data are provided as open source to ensure reproducibility of the results and to foster reuse.

METHODS: The study was based on 2502 breathing signals ($t_{total} \approx 90$ h) acquired during clinical routine, split into independent training (50%), validation (20%), and test (30%) sets. Input signal values were sampled from noisy signals, and the target signal values were selected from the corresponding denoised signals. A standard linear prediction model (Linear), two state-of-the-art models in general univariate signal prediction (Dlinear, Xgboost), and three deep learning models (Lstm, Trans-Enc, Trans-TSF) were chosen. The prediction performance was evaluated for three prediction horizons (480, 680, and 920 ms). Moreover, the robustness of the different models when applied to atypical, that is, out-of-distribution (OOD) signals, was analyzed.
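The described setup (noisy input windows, denoised targets, a fixed prediction horizon) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, window length, and the assumed 25 Hz sampling rate (under which a 480 ms horizon corresponds to 12 samples) are illustrative assumptions.

```python
import numpy as np

def make_prediction_pairs(noisy, denoised, window_len, horizon):
    """Build (input window, target) pairs for horizon-ahead prediction.

    Inputs are taken from the noisy signal and targets from the denoised
    signal, mirroring the abstract's description. All parameter values
    here are illustrative assumptions, not those of the study.
    """
    X, y = [], []
    for start in range(len(noisy) - window_len - horizon + 1):
        end = start + window_len
        X.append(noisy[start:end])             # past noisy samples
        y.append(denoised[end + horizon - 1])  # denoised value `horizon` steps ahead
    return np.array(X), np.array(y)

# Example: assume a 25 Hz sampling rate, so a 480 ms horizon is 12 samples.
fs = 25
t = np.arange(0, 10, 1 / fs)                   # 10 s of signal
clean = np.sin(2 * np.pi * 0.25 * t)           # ~4 s breathing cycle
noisy = clean + 0.05 * np.random.randn(len(t))
X, y = make_prediction_pairs(noisy, clean, window_len=50, horizon=12)
print(X.shape, y.shape)                        # (189, 50) (189,)
```

Any of the six compared models can then be trained on such (X, y) pairs; only the amount of usable signal history differs between them.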

RESULTS: The Lstm model achieved the lowest normalized root mean square error for all prediction horizons. The prediction errors only slightly increased for longer horizons. However, a substantial spread of the error values across the test signals was observed. Compared to typical, that is, in-distribution test signals, the prediction accuracy of all models decreased when applied to OOD signals. The more complex deep learning models Lstm and Trans-Enc showed the least performance loss, while the performance of simpler models like Linear dropped the most. Except for Trans-Enc, inference times for the different models allowed for real-time application.
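The reported metric, normalized root mean square error, can be computed as in the sketch below. The normalization by the peak-to-peak range of the ground truth is an assumption for illustration; the paper may normalize differently (e.g., by the signal mean or standard deviation).

```python
import numpy as np

def nrmse(y_true, y_pred):
    """RMSE normalized by the peak-to-peak range of the ground truth.

    Normalization choice is an illustrative assumption."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (np.max(y_true) - np.min(y_true))

y_true = np.array([0.0, 1.0, 0.0, -1.0])
y_pred = np.array([0.1, 0.9, 0.0, -1.1])
print(nrmse(y_true, y_pred))  # ≈ 0.0433
```

Because the metric is normalized per signal, it allows aggregating errors across breathing signals with very different amplitudes, which is what makes the reported spread across test signals directly comparable.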

CONCLUSION: The application of the Lstm model achieved the lowest prediction errors. Simpler prediction filters suffer from limited signal history access, resulting in a drop in performance for OOD signals.

Bibliographic data

Status: Published - 05.2024
PubMed 38536107