Fusion of multimodal data can enhance machine learning performance. One of the most common fusion approaches in deep learning is end-to-end training of a neural network on all available modalities. However, training such a network requires paired data from all modalities. Collecting paired multimodal data can be challenging and expensive owing to the need for specialized equipment, favorable atmospheric conditions, the limited ability of individual modalities to probe a scene, the integration of data from modalities with different spatial and spectral resolutions, and the difficulty of annotating ground truth. To address this issue, we present a two-phase multi-stream fusion approach. First, we train the unimodal streams in parallel, each with its own decision layer, loss, and hyper-parameters. We then discard the individual decision layers, concatenate the last feature maps of all unimodal streams, and jointly train a common multimodal decision layer. We evaluated the proposed approach on the NTIRE-21 dataset, and our experiments confirm that in multiple cases the proposed method outperforms the alternatives.
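The two-phase scheme can be illustrated with a minimal sketch. The stream architectures, feature dimensions, and training procedure below are illustrative assumptions, not the authors' implementation: each "stream" is a fixed random projection, and each "decision layer" is a logistic-regression head, so only the overall structure (per-stream heads in phase 1, a joint head on concatenated features in phase 2) mirrors the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired data: two modalities with different dimensionalities,
# binary labels (all shapes are illustrative assumptions).
n, d1, d2, h = 64, 8, 6, 4
x1, x2 = rng.normal(size=(n, d1)), rng.normal(size=(n, d2))
y = (rng.random(n) > 0.5).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_head(feats, y, steps=200, lr=0.1):
    """Train a logistic-regression decision layer on fixed features."""
    w = np.zeros(feats.shape[1])
    for _ in range(steps):
        p = sigmoid(feats @ w)
        w -= lr * feats.T @ (p - y) / len(y)
    return w

# Phase 1: each unimodal stream (here a fixed random projection standing
# in for a trained backbone) gets its own decision layer, trained
# independently with its own loss.
W1, W2 = rng.normal(size=(d1, h)), rng.normal(size=(d2, h))
f1, f2 = np.tanh(x1 @ W1), np.tanh(x2 @ W2)   # last feature maps
head1, head2 = train_head(f1, y), train_head(f2, y)

# Phase 2: discard the unimodal heads, concatenate the last feature maps,
# and train a single common multimodal decision layer on them.
fused = np.concatenate([f1, f2], axis=1)
fusion_head = train_head(fused, y)

accuracy = np.mean((sigmoid(fused @ fusion_head) > 0.5) == y)
```

In a real system the phase-1 backbones would be deep networks trained end to end, but the key property is the same: phase 2 touches only the new joint head, so each stream can be trained when only its own modality is available.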
Ideal treatment of trauma, especially that sustained during military combat, requires rapid management to optimize patient outcomes. Medical transport teams `scoop-and-run' to trauma centers to deliver the patient within the `golden hour', which has been shown to reduce the likelihood of death. During transport, emergency medical technicians (EMTs) perform numerous procedures, from tracheal intubation to CPR, sometimes documenting the procedure on a piece of tape on their leg, or not at all. Understandably, the EMT's focus on the patient precludes real-time documentation; however, this focus limits the completeness and accuracy of information that can be provided to waiting trauma teams. Our aim is to supplement communication that occurs en route between the point of injury and receiving facilities by passively tracking and identifying the actions of EMTs as they care for patients during transport. The present work describes an initial effort to generate a coordinate system relative to the patient's body and track an EMT's hands over the patient as procedures are performed. This `patient space' coordinate system allows the system to identify which areas of the body were the focus of treatment (e.g., time spent over the chest may indicate CPR, while time spent over the face may indicate intubation). Using this patient space and hand motion over time within it, the system can produce heatmaps depicting the parts of the patient's body that are treated most. From these heatmaps and other inputs, the system attempts to construct a sequence of clinical procedures performed over time during transport.
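The heatmap step can be sketched in a few lines. The grid resolution, the normalized `patient space' axes, and the simulated hand track below are all illustrative assumptions, not details from the paper; the sketch only shows how per-frame hand positions in patient coordinates accumulate into a dwell-time heatmap.

```python
import numpy as np

# Assumed 'patient space': a normalized 2D grid over the body,
# x in [0, 1] head-to-toe and y in [0, 1] left-to-right.
GRID = (20, 10)

def accumulate_heatmap(hand_positions, grid=GRID):
    """Bin tracked hand positions (patient coordinates) into a heatmap.

    hand_positions: iterable of (x, y) pairs in [0, 1]^2, one per video
    frame, so bin counts are proportional to dwell time over each region.
    """
    heat = np.zeros(grid)
    for x, y in hand_positions:
        i = min(int(x * grid[0]), grid[0] - 1)
        j = min(int(y * grid[1]), grid[1] - 1)
        heat[i, j] += 1.0
    return heat

# Simulated track: 100 frames with the hands dwelling over the chest
# region (purely illustrative values).
track = [(0.3 + 0.02 * np.sin(t), 0.5) for t in range(100)]
heat = accumulate_heatmap(track)

# The hottest bin indicates where treatment was concentrated; a later
# stage could map such regions to candidate procedures (e.g., chest -> CPR).
hottest_bin = np.unravel_index(np.argmax(heat), heat.shape)
```

A downstream classifier would consume such heatmaps, together with other inputs, to infer the sequence of procedures; here the heatmap itself is the only part sketched.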