Two sources of information play key roles in a collection of medical images such as computed tomography scans, X-rays and histological slides: (1) textual descriptions relating to the image content, and (2) visual features that can be seen on the image itself. The former are traditionally produced by human specialists (e.g. histopathologists and radiographers) who interpret the image, while the latter are inherent characteristics of the image. This research programme studies the architectural issues of a system that combines and interprets the information carried by these two media to achieve automatic, intelligent browsing of medical images. To give the research practical significance, we applied the architecture to the design of the I-BROWSE system, which is being developed jointly by the City University of Hong Kong and the Clinical School of the University of Cambridge. I-BROWSE aims to support intelligent retrieval and browsing of histological images obtained along the gastrointestinal (GI) tract. Within this architecture, given a query image or an image being added to the collection, a set of low-level image feature measurements is obtained from a Visual Feature Detector; with the help of knowledge bases and reasoning engines, the Semantic Analyzer then derives high-level attributes for the image, using a semantic feature generation and verification paradigm, and automatically generates textual annotations for it. If the input image is accompanied by annotations made by a human specialist, the system also analyzes, combines and verifies these two levels of information, i.e. the iconic and semantic content. In this paper, we present the architectural issues and the strategies needed to support this information-fusion process, as well as the potential of intelligent browsing using this dual-content-based approach.