The scene boundary detection is important in the semantic understanding of video data and is usually determined by coherence between shots. To measure the coherence, two approaches have been proposed. One is a discrete approach and the other one is a continuous approach. In this paper, we use the continuous approach and propose some modifications on the causal First-In-First-Out(FIFO) short-term memory-based model. One modification is that we allow dynamic memory size in computing coherence reliably regardless of the size of each shot. Another modification is that some shots can be removed from the memory buffer not by the FIFO rule. These removed shots have no or small foreground objects. Using this model, we detect scene boundaries by computing shot coherence. In computing coherence, we add one new term which is the number of intermediate shots between two comparing shots because the effect of intermediate shots is important in computing shot recall. In addition, we also consider shot activity because this is important to reflect human perception. We experiment our computing model on different genres of videos and have obtained reasonable results.