The work presented in this paper demonstrates the automatic extraction of three-dimensional urban objects from very high resolution (VHR) satellite scenes acquired anywhere in the world. Current VHR satellites such as GeoEye, WorldView-2, WorldView-3 or the Pléiades system have ground sampling distances (GSD, “pixel sizes”) of 0.3 to 0.7 metres. All these systems also allow the acquisition of in-orbit stereo images: two or more images of the same location on the ground, acquired in the same orbit of the satellite from different viewing angles, mostly only a few seconds apart. In a first step, a high resolution digital surface model (DSM) with the same GSD as the stereo imagery is extracted from such stereo images or, if more than two images were acquired, multi-stereo images. In the second step, the inevitable errors and holes in the generated DSM are corrected and filled using the multispectral imagery. Besides the very high resolution panchromatic images used for the generation of the DSM, multispectral images of lower resolution (normally about 1/4 of the resolution of the panchromatic band) are also acquired. These contain at least the four visible/NIR (VNIR) bands blue, green, red and near-infrared (NIR). Some sensors have more VNIR bands, like WorldView-2 (coastal, blue, green, yellow, red, red-edge and two NIR bands), or additionally short-wave infrared (SWIR) bands, like WorldView-3. In a third step, a spectral classification is derived from these multispectral bands. This classification is used mainly to discriminate vegetation from non-vegetation areas and to detect water areas. The last step of this pre-processing comprises the correct orthorectification of the DSM and the pan-sharpened multispectral image.
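As an illustration of the third step, a vegetation and water discrimination from the red and NIR bands can be sketched with the normalized difference vegetation index (NDVI). This is one common approach and not necessarily the classification used in this work; the function names and both thresholds below are hypothetical, illustrative choices, and the input values are assumed to be reflectances.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index for a single pixel."""
    total = nir + red
    if total == 0:
        # Avoid division by zero for pixels without signal.
        return 0.0
    return (nir - red) / total


def classify_pixel(red, nir, veg_threshold=0.3, water_threshold=0.0):
    """Label one pixel; both thresholds are assumed, illustrative values."""
    value = ndvi(red, nir)
    if value > veg_threshold:
        return "vegetation"
    if value < water_threshold:
        # Water usually reflects less in the NIR than in the red band.
        return "water"
    return "other"


def classify_image(red_band, nir_band):
    """Classify two equally sized bands given as nested lists of rows."""
    return [
        [classify_pixel(r, n) for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]


if __name__ == "__main__":
    red = [[0.05, 0.05], [0.30, 0.00]]
    nir = [[0.50, 0.02], [0.35, 0.00]]
    for row in classify_image(red, nir):
        print(row)
```

In practice the thresholds would have to be tuned per sensor and scene, and a full classification would typically use more than two bands.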
After this pre-processing of the stereo imagery, urban objects such as buildings, trees, roads and bridges can be detected. In a last step, these objects are modeled to produce a final object model of the satellite scene or parts of it. In this paper the method is described and applied to an example satellite scene.