Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT

Standard

Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT. / Wimmert, Lukas; Nielsen, Maximilian; Madesta, Frederic; Gauer, Tobias; Hofmann, Christian; Werner, Rene.

In: MED PHYS, Vol. 51, No. 5, 05.2024, p. 3173-3183.

Research output: Contribution to journal › Journal article › Research › peer-review

Bibtex

@article{bf018921258547e6b22ff44363973171,
title = "Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT",
abstract = "BACKGROUND: Stereotactic body radiotherapy of thoracic and abdominal tumors has to account for respiratory intrafractional tumor motion. Commonly, an external breathing signal is continuously acquired that serves as a surrogate of the tumor motion and forms the basis of strategies like breathing-guided imaging and gated dose delivery. However, due to inherent system latencies, there exists a temporal lag between the acquired respiratory signal and the system response. Respiratory signal prediction models aim to compensate for the time delays and to improve imaging and dose delivery. PURPOSE: The present study explores and compares six state-of-the-art machine and deep learning-based prediction models, focusing on real-time and real-world applicability. All models and data are provided as open source and data to ensure reproducibility of the results and foster reuse. METHODS: The study was based on 2502 breathing signals ($t_{total} \approx 90$ h) acquired during clinical routine, split into independent training (50\%), validation (20\%), and test sets (30\%). Input signal values were sampled from noisy signals, and the target signal values were selected from corresponding denoised signals. A standard linear prediction model (Linear), two state-of-the-art models in general univariate signal prediction (Dlinear, Xgboost), and three deep learning models (Lstm, Trans-Enc, Trans-TSF) were chosen. The prediction performance was evaluated for three different prediction horizons (480, 680, and 920 ms). Moreover, the robustness of the different models when applied to atypical, that is, out-of-distribution (OOD) signals, was analyzed. RESULTS: The Lstm model achieved the lowest normalized root mean square error for all prediction horizons. The prediction errors only slightly increased for longer horizons. However, a substantial spread of the error values across the test signals was observed. Compared to typical, that is, in-distribution test signals, the prediction accuracy of all models decreased when applied to OOD signals. The more complex deep learning models Lstm and Trans-Enc showed the least performance loss, while the performance of simpler models like Linear dropped the most. Except for Trans-Enc, inference times for the different models allowed for real-time application. CONCLUSION: The application of the Lstm model achieved the lowest prediction errors. Simpler prediction filters suffer from limited signal history access, resulting in a drop in performance for OOD signals.",
keywords = "Radiosurgery/methods, Respiration, Machine Learning, Benchmarking, Humans, Time Factors, Deep Learning, Four-Dimensional Computed Tomography",
author = "Lukas Wimmert and Maximilian Nielsen and Frederic Madesta and Tobias Gauer and Christian Hofmann and Rene Werner",
note = "{\textcopyright} 2024 The Authors. Medical Physics published by Wiley Periodicals LLC on behalf of American Association of Physicists in Medicine.",
year = "2024",
month = may,
doi = "10.1002/mp.17038",
language = "English",
volume = "51",
pages = "3173--3183",
journal = "MED PHYS",
issn = "0094-2405",
publisher = "AAPM - American Association of Physicists in Medicine",
number = "5",

}

RIS

TY - JOUR

T1 - Benchmarking machine learning-based real-time respiratory signal predictors in 4D SBRT

AU - Wimmert, Lukas

AU - Nielsen, Maximilian

AU - Madesta, Frederic

AU - Gauer, Tobias

AU - Hofmann, Christian

AU - Werner, Rene

N1 - © 2024 The Authors. Medical Physics published by Wiley Periodicals LLC on behalf of American Association of Physicists in Medicine.

PY - 2024/5

Y1 - 2024/5

N2 - BACKGROUND: Stereotactic body radiotherapy of thoracic and abdominal tumors has to account for respiratory intrafractional tumor motion. Commonly, an external breathing signal is continuously acquired that serves as a surrogate of the tumor motion and forms the basis of strategies like breathing-guided imaging and gated dose delivery. However, due to inherent system latencies, there exists a temporal lag between the acquired respiratory signal and the system response. Respiratory signal prediction models aim to compensate for the time delays and to improve imaging and dose delivery. PURPOSE: The present study explores and compares six state-of-the-art machine and deep learning-based prediction models, focusing on real-time and real-world applicability. All models and data are provided as open source and data to ensure reproducibility of the results and foster reuse. METHODS: The study was based on 2502 breathing signals ($t_{total} \approx 90$ h) acquired during clinical routine, split into independent training (50%), validation (20%), and test sets (30%). Input signal values were sampled from noisy signals, and the target signal values were selected from corresponding denoised signals. A standard linear prediction model (Linear), two state-of-the-art models in general univariate signal prediction (Dlinear, Xgboost), and three deep learning models (Lstm, Trans-Enc, Trans-TSF) were chosen. The prediction performance was evaluated for three different prediction horizons (480, 680, and 920 ms). Moreover, the robustness of the different models when applied to atypical, that is, out-of-distribution (OOD) signals, was analyzed. RESULTS: The Lstm model achieved the lowest normalized root mean square error for all prediction horizons. The prediction errors only slightly increased for longer horizons. However, a substantial spread of the error values across the test signals was observed. Compared to typical, that is, in-distribution test signals, the prediction accuracy of all models decreased when applied to OOD signals. The more complex deep learning models Lstm and Trans-Enc showed the least performance loss, while the performance of simpler models like Linear dropped the most. Except for Trans-Enc, inference times for the different models allowed for real-time application. CONCLUSION: The application of the Lstm model achieved the lowest prediction errors. Simpler prediction filters suffer from limited signal history access, resulting in a drop in performance for OOD signals.

KW - Radiosurgery/methods

KW - Respiration

KW - Machine Learning

KW - Benchmarking

KW - Humans

KW - Time Factors

KW - Deep Learning

KW - Four-Dimensional Computed Tomography

U2 - 10.1002/mp.17038

DO - 10.1002/mp.17038

M3 - Journal article

C2 - 38536107

VL - 51

SP - 3173

EP - 3183

JO - MED PHYS

JF - MED PHYS

SN - 0094-2405

IS - 5

ER -