In 3D wavelet video coding schemes, which combine a temporal wavelet decomposition of the video data with a spatial wavelet transform, temporal scalability and the reduction of temporal redundancy are often achieved at the expense of delay. The delay increases with the number of video frames that are jointly coded or, in other words, with the temporal wavelet transform depth. Depending on the system delay that a specific application allows, the maximum temporal transform depth may therefore be limited. On the other hand, consecutive temporal lowpass frames at the highest permitted temporal decomposition level may still be strongly correlated, especially for video material with a static background or with low motion that can be accurately compensated. In this case, the remaining temporal correlation should be exploited to improve coding efficiency without introducing additional delay into the overall system. In this paper, we consider a 3D wavelet video coding scheme in which the temporal wavelet decomposition precedes the spatial wavelet decomposition, and we investigate the application of a spatially scalable wavelet video coder with in-band prediction to the temporal lowpass frames at the maximum temporal transform depth.
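The relation between temporal transform depth, the number of jointly coded frames, and the residual correlation between lowpass frames can be illustrated with a minimal sketch. It assumes a plain dyadic Haar filter along the time axis without motion compensation, which is a simplification and not necessarily the temporal filter used in the scheme considered here. The names `temporal_haar` and `frames_per_lowpass`, the frame size, and the noise level are hypothetical and chosen only for illustration.

```python
import numpy as np

def temporal_haar(frames, depth):
    """Dyadic temporal Haar decomposition (assumption: no motion compensation).

    frames: array of shape (T, H, W) with T a multiple of 2**depth.
    Returns the temporal lowpass frames at the given depth and a list
    with the highpass frames produced at each level.
    """
    low = np.asarray(frames, dtype=np.float64)
    if low.shape[0] % (2 ** depth) != 0:
        raise ValueError("frame count must be a multiple of 2**depth")
    highs = []
    for _ in range(depth):
        even, odd = low[0::2], low[1::2]
        highs.append((odd - even) / np.sqrt(2.0))  # temporal highpass
        low = (even + odd) / np.sqrt(2.0)          # temporal lowpass
    return low, highs

def frames_per_lowpass(depth):
    """Input frames that must be available to form one lowpass frame."""
    return 2 ** depth

# Synthetic sequence: a static background plus small frame-to-frame noise.
rng = np.random.default_rng(0)
background = rng.random((8, 8))
frames = np.stack(
    [background + 0.01 * rng.standard_normal((8, 8)) for _ in range(16)]
)

low, highs = temporal_haar(frames, depth=3)

# Correlation between the two consecutive lowpass frames at depth 3.
corr = np.corrcoef(low[0].ravel(), low[1].ravel())[0, 1]
print(low.shape[0], frames_per_lowpass(3), corr)
```

In this sketch, each additional level doubles the number of input frames needed per lowpass frame, which is the source of the delay. For the static synthetic background, the two lowpass frames that remain at depth 3 are still highly correlated, which is the redundancy a predictive coder applied to the lowpass frames would aim to exploit.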