In our previous work, we described an adaptive fast playback framework for video summarization in which the playback rate is varied according to the motion activity feature so as to maintain a constant “pace.” This method provides an effective way of skimming through video, especially when the motion is not too complex and the background is mostly still, as in surveillance video. In this paper, we present an extended summarization framework that, in addition to motion activity, uses semantic cues such as face or skin color appearance, speech and music detection, or other domain-dependent semantically significant events to control the playback rate. The semantic features we use are computationally inexpensive and can be computed in the compressed domain, yet they are robust, reliable, and applicable across a wide range of content types. The presented framework also allows summaries to adapt to user preference, for example by favoring dramatic elements over action elements, or vice versa. The user can switch at any time between the skimming and the normal playback modes. Because the framework uses adaptive fast playback instead of skipping over long segments, the continuity of the video is preserved and segments that may be important to the user are never omitted entirely. The rule set and the input parameters can be further modified to fit a particular domain or application. Our framework can be used by itself, or as a subsequent presentation stage for a summary produced by any other summarization technique that relies on generating a subset of the content.
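As a rough illustration of the idea of controlling the playback rate from motion activity and semantic cues, the following minimal sketch may help. It is not the rule set used in this work: the `Segment` type, the `playback_rate` function, the inverse-proportional rate rule, and all parameter values (`target_pace`, `max_rate`, `semantic_cap`) are assumptions introduced only for this example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical per-segment features (names and units are assumptions)."""
    motion_activity: float   # average motion activity of the segment
    has_face: bool = False   # face or skin color detected
    has_speech: bool = False # speech detected


def playback_rate(seg: Segment,
                  target_pace: float = 8.0,
                  max_rate: float = 8.0,
                  semantic_cap: float = 1.5) -> float:
    """Return a playback speed-up factor for one segment.

    Assumed rule: perceived pace ~ motion_activity * rate, so the rate is
    chosen inversely proportional to motion activity to hold the pace near
    target_pace. Semantic cues then cap the rate so that semantically
    significant segments are played close to normal speed.
    """
    if seg.motion_activity <= 0:
        rate = max_rate  # no motion at all: play as fast as allowed
    else:
        rate = target_pace / seg.motion_activity

    # Never play slower than normal speed or faster than max_rate.
    rate = min(max(rate, 1.0), max_rate)

    # Semantic cues slow playback down; they never speed it up.
    if seg.has_face or seg.has_speech:
        rate = min(rate, semantic_cap)
    return rate


if __name__ == "__main__":
    segments = [
        Segment(motion_activity=1.0),                   # quiet scene
        Segment(motion_activity=4.0),                   # moderate motion
        Segment(motion_activity=1.0, has_speech=True),  # quiet, but speech
    ]
    for s in segments:
        print(s, "->", playback_rate(s))
```

In this sketch no segment is ever dropped: every segment receives a finite rate between normal speed and `max_rate`, which mirrors the stated goal of preserving continuity. A preference for dramatic or action content could be modeled by changing `semantic_cap` or `target_pace`, again as an assumption rather than the method described here.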