A spectral-filtering-based method for top-down spatiotemporal saliency detection is proposed. The method favors the salient features of the target object so that the target pops out. A feature vector representing these salient features is learned online from the first image in which the target is detected or manually initialized. The scale of the Gaussian kernel used for spectral filtering is selected automatically from the size ratio of the whole image to the target object. Guided by this top-down information, a target-related saliency map is built in each subsequent image. The map focuses attention on the most relevant salient region, and the method can be extended to more complicated computer vision tasks. Experimental results demonstrate the effectiveness of the proposed method.
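As an illustration only, the pipeline summarized above might be sketched in Python as follows. This is not the authors' implementation: the function names, the contrast-based weighting rule in `learn_feature_weights`, the linear ratio-to-scale mapping and the `base` constant in `select_sigma`, and the choice to smooth the amplitude spectrum while keeping the phase are all assumptions made for the example. The sketch is also purely spatial and omits the temporal component.

```python
import numpy as np


def circular_gaussian_blur(img, sigma):
    """Separable Gaussian blur with wrap-around borders (suits periodic spectra)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    out = np.asarray(img, dtype=float)
    for axis in (0, 1):
        acc = np.zeros_like(out)
        for offset, w in zip(x, kernel):
            acc += w * np.roll(out, offset, axis=axis)
        out = acc
    return out


def learn_feature_weights(features, target_mask, eps=1e-12):
    """Hypothetical rule: weight each feature channel by its target/background contrast.

    features: array of shape (C, H, W); target_mask: boolean array of shape (H, W).
    """
    inside = features[:, target_mask].mean(axis=1)
    outside = features[:, ~target_mask].mean(axis=1)
    contrast = np.abs(inside - outside)
    total = contrast.sum()
    if total < eps:
        # No channel separates target from background: fall back to uniform weights.
        return np.full(len(features), 1.0 / len(features))
    return contrast / total


def select_sigma(image_shape, target_shape, base=0.5):
    """Assumed mapping: kernel scale grows with the image-to-target size ratio."""
    ratio = np.sqrt((image_shape[0] * image_shape[1])
                    / (target_shape[0] * target_shape[1]))
    return max(0.5, base * ratio)


def topdown_saliency(features, weights, sigma):
    """Weighted feature map -> smoothed amplitude spectrum -> saliency in [0, 1]."""
    combined = np.tensordot(weights, features, axes=1)  # shape (H, W)
    spectrum = np.fft.fft2(combined)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Spectral filtering step: smooth the amplitude, keep the original phase.
    smoothed = circular_gaussian_blur(amplitude, sigma)
    recon = np.fft.ifft2(smoothed * np.exp(1j * phase))
    sal = circular_gaussian_blur(np.abs(recon) ** 2, 2.0)
    sal -= sal.min()
    return sal / (sal.max() + 1e-12)
```

In this sketch the weights and the kernel scale would be computed once, on the initialization frame, and then reused by `topdown_saliency` on each subsequent frame's feature maps.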