Multiview video is obtained by capturing a three-dimensional scene with multiple cameras simultaneously. Because adjacent view sequences exhibit strong spatial redundancy in addition to temporal redundancy, additional coding gain can be achieved by removing the spatial redundancy. In this paper, we propose view-temporal prediction structures that can be adapted to the varied characteristics of general multiview video by separating each structure into a temporal prediction structure and a view prediction structure. The proposed temporal prediction structure minimizes the average distance between a frame to be coded and its reference frames. The proposed view prediction structure takes into account the position of the I frame among the anchor frames and the number of B frames inserted among the anchor frames. We also propose a quantization parameter selection method that improves the coding efficiency of views whose anchor frames are coded as B frames. Experimental results show that the proposed algorithms achieve a peak signal-to-noise ratio (PSNR) gain of 0.10 to 0.34 dB and reduce the variance of PSNR values across views by 28.26% on average, compared with the reference prediction structure.
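To make the temporal criterion concrete, the following minimal sketch computes an average frame-to-reference distance for two illustrative prediction structures over a group of eight pictures. The metric used here (the mean of |frame index - reference index| over all frame/reference pairs), the function name `average_reference_distance`, and both example structures are assumptions chosen for illustration; the abstract does not give the exact definition or the structures evaluated in the paper.

```python
def average_reference_distance(structure):
    """Mean |frame - reference| over every (frame, reference) pair.

    `structure` maps a frame index to the list of frame indices it
    references. This metric is an assumed reading of "average distance
    between a frame to be coded and its reference frames".
    """
    distances = [
        abs(frame - ref)
        for frame, refs in structure.items()
        for ref in refs
    ]
    return sum(distances) / len(distances)


# Hypothetical hierarchical-B structure: each B frame references the two
# nearest frames of the next-coarser temporal level.
hierarchical_b = {
    8: [0],
    4: [0, 8],
    2: [0, 4],
    6: [4, 8],
    1: [0, 2],
    3: [2, 4],
    5: [4, 6],
    7: [6, 8],
}

# Hypothetical flat structure: every B frame references only the two
# key frames at indices 0 and 8.
flat_b = {8: [0], **{k: [0, 8] for k in range(1, 8)}}

print(average_reference_distance(hierarchical_b))
print(average_reference_distance(flat_b))
```

Under this assumed metric, the hierarchical arrangement yields a smaller average distance than the flat one, which is the kind of comparison a structure search of this sort would rank candidates by.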