Translator Disclaimer
30 May 2018 Tomographic breathing detection: a method to noninvasively assess in situ respiratory dynamics
Author Affiliations +
Physiological monitoring is a critical aspect of in vivo experimentation, particularly imaging studies. Physiological monitoring facilitates gated acquisition of imaging data and more robust experimental interpretation but has historically required additional instrumentation that may be cumbersome. As frame rates have increased, imaging methods have been able to capture ever more rapid dynamics, passing the Nyquist sampling rate of most physiological processes and allowing the capture of motion, such as breathing. With this transition, image artifacts have also changed their nature; rather than intraframe motion causing blurring and deteriorating resolution, interframe motion does not affect individual frames and may be recovered as useful information from an image time series. We demonstrate a method that takes advantage of interframe movement for detection of gross physiological motion in real-time image sequences. We further demonstrate the ability of the method, dubbed tomographic breathing detection to quantify the dynamics of respiration, allowing the capture of respiratory information pertinent to anesthetic depth monitoring. Our example uses multispectral optoacoustic tomography, but it will be widely relevant to other technologies.



During in vivo experiments, it is desirable to develop an understanding of the subject’s systemic physiology, to control for experimental aberrations and to better interpret experimental results. Physiological information may be particularly relevant to imaging investigations. In magnetic resonance imaging (MRI), for example, physiological monitoring enables the acquisition of cardiac- and respiratory-gated images, enabling better interimage correlations that are not corrupted by motion. Tomographic imaging is a broad designation given to any method that provides a cross-sectional slice of the imaged subject, whether it is material, human, or animal. As the hardware for tomographic imaging systems has improved, so too has the time resolution of image acquisition, such that interimage movement is often substantially greater than intraimage movement.1 At the same time, computational resources have improved such that real-time image reconstruction can be performed even using model-based methods.2 This capability allows the imaging system access to a train of reconstructed images in real time. In realistic in vivo scenarios, some proportion of these images will be displaced relative to others as a result of motion caused by physiological processes, e.g., breathing, as noted in several previous publications.36 Online averaging of images “smooths out” this displacement7 but compromises resolution in the process. This is due to the fact that averaging is ostensibly intended to improve signal-to-noise ratio, according to the central limit theorem (CLT). However, one of the central assumptions in the CLT is that the distribution is stationary, which is clearly not the case for a dynamic image. If some processing takes place before averaging, however, we may derive useful information regarding physiological motion. This motion information has a benefit of being presynchronized to the image time series. We demonstrate the concept in terms of respiratory motion observed in freely breathing anesthetized mice undergoing photoacoustic imaging during a gas breathing challenge.





The ability to correct for physiological motion is contingent upon identifying it. Identifying differences between image frames is a well-studied topic in the field of motion estimation and video compression.8 We have chosen to implement a filter-based approach (shown in Fig. 1, details in Appendix A) due to its ease of implementation and low computational overhead. In brief, the method bandpass filters two sequential images to remove low-frequency contrast information and noise, giving images primarily weighted toward edge contrast. We then apply a box filter to reduce sensitivity to minor motion and calculate the mean squared error (MSE) of the two images as

where N is the number of pixels in a given image, Ik is the current filtered image, and Ik1 is the previous filtered image. MSE provides a measure of the degree of movement across the entire imaging region. A fundamental assumption is that the image contrast changes relatively slowly from frame to frame, while edge motion is guided by more rapid dynamics. This is reasonable in many scenarios, including contrast agent infusion; as large areas of contrast change do not affect the MSE due to the initial high-pass step, any contrast change will be in a relatively localized area and is unlikely to contribute substantially to the MSE over the entire image plane.

Fig. 1

Cartoon overview of the method: (a) an image series (e.g., Video 1, MP4, 129 kB [URL:]) is bandpass filtered and (b) the MSE between each sequential image is calculated. This error may then be used to infer the presence of motion between a given pair of frames.


There are many varieties of physiological motion which degrade image quality and resolution, including respiratory,5 cardiac,6 and subject4 movement. In most in vivo imaging scenarios, suppression or compensation of this motion is rather challenging. By tracking MSE over time, however, we may take advantage of this motion to derive physiological information without any additional instrumentation. In this study, we use respiratory motion as an example, due to its approximate periodicity and dominance in contributing to overall body motion as compared to the relatively minor effects of cardiac motion.

Given a time course of the MSE between images, we would like to recover information about breathing rates and breathing dynamics. To recover the breathing rate, we must perform some form of time-frequency analysis, to determine the breathing rate at each point in time. This might be achieved by determining the time between individual peaks, but this approach is fundamentally discontinuous and does not reflect our assumption that the modulation of breathing dynamics is a continuous process. We have, therefore, chosen to use wavelet analysis to preserve both time and frequency continuity.9

We first subtract the mean of the pulse train and then apply a synchrosqueezed wavelet transform (SSWT).10 The SSWT effectively performs a standard one-dimensional wavelet transform followed by localization in frequency, giving higher time-frequency resolution than other methods. We may then apply ridge detection to the resultant transform, recovering the primary mode of the time series, which should correspond to the instantaneous breathing rate at each point in time. This instantaneous breathing rate, in turn, is time-continuous and may be analyzed further to examine the stability of the breathing rate, for example, with respect to anesthetic or gas monitoring.11

To accommodate periods of time when there may be lapses in breathing, we note that the absence of a breathing signal will correspond to lower total signal energy. If we threshold based on this energy, then absence of breathing will be detected whenever the total signal energy drops below the threshold. We may improve robustness further by tracking the amount of energy “around” the detected time-frequency ridge, which will provide some amount of noise rejection. The details of this approach are given in Appendix B.

A goal is to achieve a stable baseline from which variation due to experimental interventions may be compared. In experiments performed under anesthesia, it is critical that the physiological state induced by a certain amount of anesthetic is sufficiently stable so as to not introduce artifacts in experiments.12 The monitoring of anesthetic depth is frequently assessed using a combination of signals ranging from reflex response13 to high-order statistical analysis of electroencephalography data in the form of the bispectral index,14,15 but most of these methods require specific instrumentation or contact intervention.

We use an imaging modality known as multispectral optoacoustic tomography (MSOT)16,17 as an example of tomographic imaging. MSOT illuminates an imaging target with wavelength-selectable near-infrared light in pulses of duration on the order of nanoseconds. This achieves stress confinement, inducing local thermal expansion, and causing ultrasound pressure waves to propagate from the point of absorption. These waves are then detected by a circular transducer array surrounding the target, providing data which may then be reconstructed into an image coded by the selected wavelength of light.


Experimental Verification

All animal work was approved by the institutional animal care and use committee at the University of Texas Southwestern in accordance with United States federal guidelines.

MSOT data were acquired using an InVision 256-TF (iThera Gmbh, Munich, Germany). The description of detailed technical specifications is available elsewhere.16 Briefly, the imaged subject is coated in ultrasound gel, wrapped in a polyethylene film, and placed in a holder, which allows the subject to rest in a heated water bath. Wavelength-specific laser pulses then illuminate the subject, causing photoacoustic thermal expansion, which induces pressure waves that are then detected by circumferentially arranged ultrasound transducers.

Specifically, an anesthetized female nude mouse in a supine position was imaged cross sectionally at the level of the kidneys and liver in a water bath at 35°C. This area has both rich structure and substantial diaphragmatic motion resulting from breathing. Data were collected at wavelengths ranging from 700 to 860 nm, in steps of 20 nm. Four laser shots were acquired per wavelength, unaveraged at a rate of 10 Hz. This wavelength set was then acquired repeatedly for the entire imaging session. Images of a 2.5-cm field of view were reconstructed using ViewMSOT® v3.6 software (iThera Gmbh, Munich, Germany) to a resolution of 75  μm. Data were collected from a mouse breathing isoflurane and medical air or 100% oxygen [Fig. 2(a), upper]. After a preliminary imaging session of 20  min to acclimate the mouse to the water bath temperature and warm up the device’s laser, the level of anesthetic was varied to induce changes in breathing rate under several anesthetic conditions (1.5%, 2%, and 2.5% isoflurane) in one continuous scan over a period of 35 min. The breathing gas was switched from air to oxygen at 20 min to observe any substantial variations in breathing rate due to the pO2 of inspired gas.18 Reconstructed image data were then processed using methods available from the Image Processing and Wavelet Toolboxes in MATLAB® (The MathWorks Inc.).

Fig. 2

(a, upper) Experimental protocol for isoflurane challenge. A mouse was induced while breathing 1.5% isoflurane in air at 2  L/min for 20 min and then underwent the isoflurane and oxygen challenge experiment. The animal breathed air for the first 20 min, at which point the breathing gas was switched to oxygen. (a, lower) Breathing rate determined from the isoflurane challenge experiment with both the MSE method and the manually annotated data. (b) Bland–Altman plot showing the variation in detected breathing rate between the MSE and manual breathing traces. The differences between the methods have a mean of 0.0±3.3  bpm. (c–f) Time courses of MSE and recovered breathing rate under (c) 2.0%, (d, e) 2.5%, and (f) 1.5% isoflurane exposure, while breathing (c, d) air and (e, f) oxygen. In (d), we see the capture of a breathing transient in compensation for suspended breathing, whereas (f) shows the increased breathing rate due to switching back to 1.5% isoflurane. Note the change of scale in (e, f).


Objective validation of the quantified breathing rate was performed by manual tagging of breathing events in the image time series. These manually registered tags were then gridded via nearest-neighbor interpolation and smoothed by convolution with a triangular kernel to lessen harmonic interference. The rectified dataset then underwent the same time-frequency analysis as the MSE data.



When applied to the breathing video (Video 1), tomographic breathing detection (TBD) was able to extract respiratory motion with high reliability [Fig. 2(a), lower]. After SSWT analysis, the method was able to quantitatively return the breathing rate over a wide range of respiratory behaviors ranging from highly regular, consistently spaced breathing [Figs. 2(c), 2(e), and 2(f)] to intermittent, irregular breathing [Fig. 2(d)], and at a range of breathing rates. Interestingly, the change to oxygen at 20 min was not seen to dramatically increase respiratory activity, in contrast with previous observations.18 We attribute this to the depth of anesthesia at the time of gas change, though other explanations, such as systemic CO2 accumulation,19 are possible.

Due to the ridge-finding algorithm’s presumption of continuity, instances when the mouse ceased breathing for extended periods introduced artifacts of spuriously high or low breathing rates. In practice, however, these rates would be easily identified by masking the breathing rate based on the energy thresholding approach described in Appendix B.



We have demonstrated a method to recover breathing dynamics from imaging sequences with high temporal resolution, even in scenarios with changing illumination and contrast. The periodic respiratory motion is readily recoverable through a simple algorithm and yields breathing information that strongly recapitulates objectively validated data.

The method, as presented, is quite rudimentary yet captures a substantial amount of information pertaining to respiratory physiology. Extensions are manifold, ranging from more sophisticated edge filtering to online SSWT analysis,20 any of which could substantially improve practicality without necessarily adding to the difficulty of implementation. We implemented the method in the image domain, although most imaging methods acquire their data in some other signal domain. Nevertheless, similar processing may take place in the signal domain by comparing variation between datasets. Additional processing, such as retrospective clustering or dynamic resampling of high-motion frames, could be performed using the MSE signal as a feature, allowing for lower-complexity clustering problems. The MSE signal may also be used as a criterion for motion correction, by removing high-MSE images from the dataset which is then averaged, as shown in Fig. 3.

Fig. 3

Comparison of averaged images (a) without (N=32) and (b) with (N=12) removal of image frames exceeding the first quartile of the MSE distribution. Improved resolution due to decreased blurring may clearly be seen, particularly in regions with rich vascular structure (insets). The images are pooled from the 800-nm channel of the first 300 image frames of Video 1.


TBD is context agnostic and could be applied to MRI, CT, ultrasound, or any other imaging modality that provides tomographic images, including other, three-dimensional implementations of photoacoustic imaging,21 particularly those which record data at near-real-time rates.22 The application of the SSWT for the purposes of breathing detection may be applied to any data that capture the motion of the chest wall, such as laser range-finding or pressure transducers coupled to the imaging water bath.5 Notably, the process is quite flexible and requires relatively little parameter tuning; many different image filters accomplish the task of recovering a suitable MSE signal. Furthermore, the method preserves locality; many other methods correlate images across an entire dataset to provide motion correction,5,6 which is unsuitable for imaging dynamic effects, though some methods have adopted other sophisticated processing methods to improve spatiotemporal resolution.22

The method has additional application to anesthetic depth monitoring. A classification of anesthetic depth was provided by Thomas et al.,13 Bhargava et al.,14 and Guedel et al.23 and separates progression of anesthetic effect into four planes. Of particular note is that plane II is a highly chaotic state providing a poor baseline, characterized by irregular breathing and a generally unstable breathing rate, whereas breathing in plane III is much more stable and regular and, therefore, preferable for experiments, especially those which probe oxygenation status as an experimental covariate.12

The effects of varying anesthesia in the isoflurane challenge as shown in Fig. 2 are notable: the transition from 1.5% to 2.0% isoflurane is marked by a substantial drop in mean respiratory rate, as well as an increase in regularity. This likely corresponds to the animal achieving a deeper plane of anesthesia, transitioning from planes II to III. Even within plane III, variations in breathing rate in response to changes in isoflurane concentration may be noted, particularly at the 25-min mark. The transition from planes III to II, i.e., to a more mild state of anesthesia, is seen at 30 min after switching to 1.5% isoflurane. The ability to track animal respiration, potentially in real time, opens possibilities for more standardized experimental protocols, as well as accounting for varying anesthetic sensitivity between groups and individuals. Moreover, by forming a quantitative signal of respiratory rate, a feedback system could be developed, which varies the level of anesthetic delivered to a subject to achieve a given respiratory rate.


Appendix A:

Details of Image Analysis

Images are first high-pass filtered to remove contrast information, as two consecutive images may be taken with different contrast parameters/wavelengths without having substantial motion.

These edge images are then filtered with a blurring/low-pass kernel to perform rudimentary denoising, as well as to smooth the calculation of the MSE. This filtering is given by the following equation:

where hh and hl are the arbitrary high- and low-pass filters, respectively. These filters may be combined into a bandpass filter or left separate in cases of nonlinearity (e.g., using a median filter as the low-pass filter). In this study, we simply used MATLAB®’s default “unsharp” filter for high pass and the median filter for low pass, though we found a number of different combinations to be effective.

The point-wise difference of the sequential images is taken as


By high-pass filtering the images, we generate a representation that is insensitive to contrast but retains edge definition. We then take the difference of consecutive high-passed images, generating a correspondence image. This correspondence image should be near-zero if the images are well-registered and, similarly, should have some mismatch across its field of view if they are not. From there, we may calculate the MSE of the given image in relation to its predecessor


This process may then be repeated over a dataset or as data arrives from the imaging procedure. Pairs of images with low MSE will correspond to images that are relatively well-aligned, while pairs of images with high MSE will correspond to images where there is some gross motion.

Appendix B:

Energy Thresholding

Given an MSE trace for all time indices k, as in Appendix A, we perform an SSWT to return a time-frequency image with time index k and frequency bin ν


Then applying a ridge-finding algorithm will return the dominant signal trace given as


Locally summing the energy around this trace can be accomplished by taking the sum of absolute values in the time-frequency bins around the ridge

where J is an integer number, J=5 in this study, though the number of local bins depends on the time-frequency resolution chosen in the wavelet transform. This summed value then provides an energy term that may be thresholded to create a mask of “valid” breathing rates

We acknowledge that the precise value of τ is subject to some degree of tuning and is dependent on the nature of the image and signal statistics. Automatic determination of this threshold may be a fruitful direction of future investigation.


The authors have no relevant financial interests in this study and no potential conflicts of interest to disclose.


The research was supported in part by the Cancer Prevention and Research Institute of Texas Grant Nos. RP140399 and RP120753-03, the National Institutes of Health (NIH) Grant No. 1 P50 CA196516, and the assistance of the Southwestern Small Animal Imaging Resource through the National Cancer Institute Cancer Center Support Grant No. 1P30 CA142543. iThera MSOT was purchased under NIH Grant No. 1 S10 OD018094-01A1.



X. L. Dean-Ben, E. Bay and D. Razansky, “Functional optoacoustic imaging of moving objects using microsecond-delay acquisition of multispectral three-dimensional tomographic data,” Sci. Rep., 4 5878 (2014). SRCEC3 2045-2322 Google Scholar


L. Ding, X. L. Dean-Ben and D. Razansky, “Real-time model-based inversion in cross-sectional optoacoustic tomography,” IEEE Trans. Med. Imaging, 35 (8), 1883 –1891 (2016). ITMID4 0278-0062 Google Scholar


J. Chung and L. Nguyen, “Motion estimation and correction in photoacoustic tomographic reconstruction,” SIAM J. Imag. Sci., 10 (1), 216 –242 (2017). Google Scholar


M. Schwarz et al., “Motion correction in optoacoustic mesoscopy,” Sci. Rep., 7 10386 (2017). Google Scholar


J. Xia et al., “Retrospective respiration-gated whole-body photoacoustic computed tomography of mice,” J. Biomed. Opt., 19 (1), 016003 (2014). Google Scholar


A. Taruttis et al., “Motion clustering for deblurring multispectral optoacoustic tomography images of the mouse heart,” J. Biomed. Opt., 17 (1), 016009 (2012). JBOPFO 1083-3668 Google Scholar


J. Joseph et al., “Evaluation of precision in optoacoustic tomography for preclinical imaging in living subjects,” J. Nucl. Med., 58 (5), 807 –814 (2017). JNMEAQ 0161-5505 Google Scholar


D. L. Gall, “MPEG: a video compression standard for multimedia applications,” Commun. ACM, 34 (4), 46 –58 (1991). CACMA2 0001-0782 Google Scholar


H.-T. Wu et al., “Using synchrosqueezing transform to discover breathing dynamics from ECG signals,” Appl. Comput. Harmon. Anal., 36 (2), 354 –359 (2014). ACOHE9 1063-5203 Google Scholar


G. Thakur et al., “The synchrosqueezing algorithm for time-varying spectral analysis: robustness properties and new paleoclimate applications,” Signal Process., 93 (5), 1079 –1094 (2013). Google Scholar


C. L. Herry et al., “Heart beat classification from single-lead ECG using the synchrosqueezing transform,” Physiol. Meas., 38 (2), 171 –187 (2017). Google Scholar


M. R. Tomaszewski et al., “Oxygen enhanced optoacoustic tomography (OE-OT) reveals vascular dynamics in murine models of prostate cancer,” Theranostics, 7 (11), 2900 –2913 (2017). Google Scholar


G. J. Thomas et al., “Summary of stages and signs in anesthesia,” Anesth. Analg., 40 42 –51 (1961). Google Scholar


A. K. Bhargava, R. Setlur and D. Sreevastava, “Correlation of bispectral index and Guedel’s stages of ether anesthesia,” Anesth. Analg., 98 (1), 132 –134 (2004). Google Scholar


I. J. Rampil, “A primer for EEG signal processing in anesthesia,” Anesthesiology, 89 (4), 980 –1002 (1998). ANESAV 0003-3022 Google Scholar


A. Dima, N. C. Burton and V. Ntziachristos, “Multispectral optoacoustic tomography at 64, 128, and 256 channels,” J. Biomed. Opt., 19 (3), 036021 (2014). Google Scholar


L. R. McNally et al., “Current and emerging clinical applications of multispectral optoacoustic tomography (MSOT) in oncology,” Clin. Cancer Res., 22 (14), 3432 –3439 (2016). Google Scholar


H. F. Becker et al., “Effect of different levels of hyperoxia on breathing in healthy subjects,” J. Appl. Physiol. (1985), 81 (4), 1683 –1690 (1996). Google Scholar


P. M. Macey, M. A. Woo and R. M. Harper, “Hyperoxic brain effects are normalized by addition of CO2,” PLoS Med., 4 (5), e173 (2007). 1549-1676 Google Scholar


C. K. Chui, Y.-T. Lin and H.-T. Wu, “Real-time dynamics acquisition from irregular samples—with application to anesthesia evaluation,” Anal. Appl., 14 537 (2016). Google Scholar


H. P. Brecht et al., “Whole-body three-dimensional optoacoustic tomography system for small animals,” J. Biomed. Opt., 14 (6), 064007 (2009). Google Scholar


R. B. Lam et al., “Dynamic optical angiography of mouse anatomy using radial projections,” Proc. SPIE, 7564 756405 (2010). Google Scholar


A. E. Guedel, Inhalation Anesthesia: A Fundamental Guide, XIV 172 The Macmillan Company, New York (1937). Google Scholar


Devin O’Kelly is a PhD candidate at UT Southwestern Medical Center pursuing a degree in biomedical engineering with a focus on computational and systems biology. He received his bachelor of science with high honors in biomedical engineering from the University of Texas at Austin. His research focuses on the development of imaging-based methods to assess the dynamics of biological systems, particularly that of cancer physiology.

Heling Zhou is a postdoctoral fellow at UT Southwestern Medical Center. She received her bachelor’s degree in biomedical engineering from Zhongshan School of Medicine, Sun Yat-Sen University in China, and her PhD in radiological sciences from UT Southwestern Medical Center. Her research focuses on developing biomedical imaging techniques and analytical methods to assess cancer physiology. Her research interest is to apply data science in cancer imaging and therapy.

Ralph P. Mason, a chemist by training, is a professor of radiology with over 25 years of experience in cancer imaging, therapy, and tumor pathophysiology. His primary research interest is prognostic radiology—developing and implementing methods for predicting optimal cancer therapy and assessing early response to treatment (“precision medicine”). His research relates to tumor hypoxia and developing methods for its noninvasive assessment.

© The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported License. Distribution or reproduction of this work in whole or in part requires full attribution of the original publication, including its DOI.
Devin O'Kelly, Heling Zhou, and Ralph P. Mason "Tomographic breathing detection: a method to noninvasively assess in situ respiratory dynamics," Journal of Biomedical Optics 23(5), 056011 (30 May 2018).
Received: 26 March 2018; Accepted: 7 May 2018; Published: 30 May 2018

Back to Top