We propose a method to detect and localize story-related subject captions in news video. Most caption detection and localization algorithms attempt to detect as many captions as possible; however, a news frame may contain many types of captions that are unrelated to the story. Fast and accurate access to news video content therefore requires a method that detects and localizes only the story-related caption. This paper addresses two problems in texture-based caption detection and localization: the time-consuming computation of features and the clutter in detection results. We first identify the subject caption region based on the frequency of text occurrence. We then detect the frame in which each subject caption first appears onscreen. Finally, we perform texture-based caption localization only on the subject caption region of these beginning frames. Restricting the computation in this way decreases computation time significantly, and it filters out unrelated types of text so that only the subject caption is detected and localized. Experimental results show that the proposed method can quickly and robustly detect subject captions in news video.
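The three stages described above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes that a coarse binary text mask is already available for every frame, and it substitutes simple mask statistics for the texture features. The function names, the thresholds `min_freq` and `min_change`, and the mask representation are all hypothetical choices made for illustration.

```python
import numpy as np


def subject_caption_rows(text_masks, min_freq=0.5):
    """Stage 1 (assumed form): find the band of rows where text occurs most often.

    text_masks has shape (frames, height, width). Returns (top, bottom) with
    bottom exclusive, or None if no row reaches min_freq.
    """
    masks = np.asarray(text_masks, dtype=bool)
    row_has_text = masks.any(axis=2)        # (frames, height)
    freq = row_has_text.mean(axis=0)        # fraction of frames with text, per row
    rows = np.flatnonzero(freq >= min_freq)
    if rows.size == 0:
        return None
    # Bounding band of all frequent rows; they are not required to be contiguous.
    return int(rows[0]), int(rows[-1]) + 1


def caption_start_frames(text_masks, region, min_change=0.3):
    """Stage 2 (assumed form): frames where new text appears inside the region."""
    top, bottom = region
    band = np.asarray(text_masks, dtype=bool)[:, top:bottom, :]
    starts = []
    prev = np.zeros_like(band[0])
    for i, cur in enumerate(band):
        # Fraction of region pixels whose text/non-text label changed.
        changed = np.logical_xor(cur, prev).mean()
        if cur.any() and changed >= min_change:
            starts.append(i)
        prev = cur
    return starts


def localize_in_region(text_mask, region):
    """Stage 3 (stand-in): bounding box of text inside the region only.

    Returns (top, left, bottom, right) with exclusive bottom/right, or None.
    """
    top, bottom = region
    band = np.asarray(text_mask, dtype=bool)[top:bottom, :]
    ys, xs = np.nonzero(band)
    if ys.size == 0:
        return None
    return (top + int(ys.min()), int(xs.min()),
            top + int(ys.max()) + 1, int(xs.max()) + 1)
```

In this sketch the expensive localization step runs once per caption, on the start frame and inside the frequent-text band only, which is the source of the computational saving the abstract describes.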