Nowadays, a precise video quality assessment (VQA) model is essential to maintain the quality of service (QoS). However, most existing VQA metrics are designed for specific purposes and ignore the spatiotemporal features of nature video. This paper proposes a novel general-purpose no-reference (NR) VQA metric adopting Long Short-Term Memory (LSTM) modules with the masking layer and pre-padding strategy, namely VQA-LSTM, to solve the above issues. First, we divide the distorted video into frames and extract some significant but also universal spatial and temporal features that could effectively reflect the quality of frames. Second, the data preprocessing stage and pre-padding strategy are used to process data to ease the training for our VQA-LSTM. Finally, a three-layer LSTM model incorporated with masking layer is designed to learn the sequence of spatial features as spatiotemporal features and learn the sequence of temporal features as the gradient of temporal features to evaluate the quality of videos. Two widely used VQA database, MCL-V and LIVE, are tested to prove the robustness of our VQA-LSTM, and the experimental results show that our VQA-LSTM has a better correlation with human perception than some state-of-the-art approaches.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users─please
sign in
to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on
SPIE.org.
INSTITUTIONAL Select your institution to access the SPIE Digital Library.
PERSONAL Sign in with your SPIE account to access your personal subscriptions or to use specific features such as save to my library, sign up for alerts, save searches, etc.