This paper presents our work on the automatic categorization of broadcast news based on closed-caption text. The multimedia news data under study are first segmented into story units from the video and audio signals using our previously developed algorithms. Using time-stamp information, the closed-caption text is then segmented into text units, one for each story unit. A Bayes network is then trained to automatically classify the story units into fourteen categories. The major contribution of this paper is the idea of a category, which represents a higher level of semantic generalization than a traditional topic. We discuss in detail the administrated bottom-up clustering algorithm used to generate a semantically meaningful category framework, as well as the training procedures used to build the belief network that covers the large broadcast news data set. Using the CSR LM 1996 data set from the LDC (Linguistic Data Consortium), we designed a number of experiments to examine the relationship between categorization design and classification performance.