We describe an architecture that provides online semantic labeling capabilities to field robots operating in urban environments. At the core of our system is the stacked hierarchical classifier developed by Munoz et al., which classifies regions in monocular color images using models derived from hand-labeled training data. The classifier is trained to identify buildings, several kinds of hard surfaces, grass, trees, and sky. Taking this algorithm into the real world raises practical concerns: difficult and varying lighting conditions require careful control of the imaging process. First, camera exposure is controlled in software, which examines all of the image's pixels, to compensate for the simplistic, poorly performing algorithm built into the camera. Second, by merging multiple images taken with different exposure times, we synthesize images with higher dynamic range than those produced by the sensor itself. The sensor's limited dynamic range makes it difficult to properly expose, in a single frame, both areas in shadow and high-albedo surfaces directly illuminated by the sun. Texture is a key feature used by the classifier, and under- or overexposed regions lacking texture are a leading cause of misclassifications. The classifier's results are shared with higher-level elements operating on the unmanned ground vehicle (UGV) to perform tasks such as identifying buildings from a distance and finding traversable surfaces.
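The multi-exposure merging step can be illustrated with a minimal sketch. The abstract does not specify how the images are combined, so the code below uses a common weighted-average scheme and is not necessarily the method used in this system. It assumes a linear sensor response (pixel value proportional to radiance times exposure time) and pixel values normalized to [0, 1]; the names `hat_weight` and `merge_exposures` are hypothetical.

```python
import numpy as np


def hat_weight(pixels):
    """Weight in [0, 1] that peaks at mid-gray and falls to zero at 0 and 1.

    Clipped (under- or overexposed) pixels therefore contribute nothing.
    """
    return 1.0 - np.abs(2.0 * pixels - 1.0)


def merge_exposures(images, exposure_times, eps=1e-6):
    """Estimate relative scene radiance from bracketed frames.

    Assumption (not from the source): linear response, so each frame gives
    the estimate radiance = pixel / exposure_time wherever it is not clipped.
    """
    shape = np.shape(images[0])
    numerator = np.zeros(shape, dtype=np.float64)
    denominator = np.zeros(shape, dtype=np.float64)
    for image, t in zip(images, exposure_times):
        image = np.asarray(image, dtype=np.float64)
        w = hat_weight(image)
        numerator += w * image / t  # weighted per-frame radiance estimate
        denominator += w
    # Pixels clipped in every frame have zero total weight and come out as 0.
    return numerator / np.maximum(denominator, eps)


if __name__ == "__main__":
    # Synthetic scene: a shadowed, a mid-range, and a sunlit pixel.
    radiance = np.array([0.2, 2.0, 8.0])
    times = [0.05, 0.5]
    frames = [np.clip(radiance * t, 0.0, 1.0) for t in times]
    print(merge_exposures(frames, times))
```

In this sketch the short exposure supplies the estimate for the sunlit pixel, which saturates in the long exposure, while the long exposure contributes most of the weight for the shadowed pixel.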