This paper presents a data extraction and identification method for paper-based Chinese road maps. First, the map title box and the legend index table are extracted by a rule with trained parameter values, and are then identified by distinguishing the characters from the legends using Bayes’s theorem. Second, the gray-level histogram of the large components is constructed and smoothed, and the road images are then extracted using a multilevel thresholding technique. The extracted roads are further vectorized into line segments to reduce storage, and the gaps between line segments are filled by a postprocessing procedure. Third, the characters and the legends are segmented by combining the small-component image with the difference image between the large-component image and the road images. The extracted legends are recognized by the proposed probabilistic template matching method. The proposed system is evaluated on 20 test maps, and the experimental results show that it is effective.
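As an illustration of the second step, the following is a minimal sketch of histogram smoothing followed by multilevel thresholding. It is not the method of this paper: the moving-average smoother, the valley-between-peaks rule, and all names and parameters (`gray_histogram`, `smooth`, `find_peaks`, `valley_thresholds`, `classify`, `radius`, `min_ratio`) are assumptions chosen only to make the idea concrete.

```python
from bisect import bisect_right


def gray_histogram(pixels, levels=256):
    """Count how many pixels fall on each gray level."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


def smooth(hist, radius=2):
    """Moving-average smoothing; the window shrinks at the array ends."""
    n = len(hist)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(hist[lo:hi]) / (hi - lo))
    return out


def find_peaks(s, min_ratio=0.1):
    """Interior local maxima at least min_ratio * max(s) high.

    A flat-topped peak is reported once, at the middle of its plateau.
    Peaks at the first or last gray level are ignored in this sketch.
    """
    min_height = min_ratio * max(s)
    peaks, i, n = [], 1, len(s)
    while i < n - 1:
        if s[i] > s[i - 1] and s[i] >= min_height:
            j = i
            while j < n - 1 and s[j + 1] == s[i]:
                j += 1  # walk along a plateau, if any
            if j < n - 1 and s[j + 1] < s[i]:
                peaks.append((i + j) // 2)
            i = j + 1
        else:
            i += 1
    return peaks


def valley_thresholds(s, peaks):
    """One threshold per pair of adjacent peaks: the deepest valley level.

    If several levels tie for the minimum, the middle one is used.
    """
    thresholds = []
    for a, b in zip(peaks, peaks[1:]):
        lowest = min(s[a:b + 1])
        ties = [k for k in range(a, b + 1) if s[k] == lowest]
        thresholds.append(ties[len(ties) // 2])
    return thresholds


def classify(value, thresholds):
    """Index of the gray-level class that a pixel value falls into."""
    return bisect_right(thresholds, value)
```

With k peaks in the smoothed histogram this yields k - 1 thresholds and k gray-level classes; which class corresponds to roads would have to be decided separately, for example from the legend colors.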