The number of affordable consumer unmanned aerial vehicles (UAVs) on the market has grown rapidly in recent years. The uncontrolled use of such UAVs at public events, such as sports events or demonstrations, and near sensitive areas, such as airports or correctional facilities, poses a potential security threat. Automatic early detection of UAVs is thus an important task, which can be addressed through multiple modalities, such as visual imagery, radar, audio signals, or UAV control signals. In this work, we present an image processing pipeline that tracks very small point targets in an overview camera, steers a tilting unit with a mounted zoom camera (PTZ system) toward locations of interest, and classifies the spotted object in this more detailed camera view. The overview camera is a high-resolution camera with a wide field of view. Its main purpose is to monitor a wide area and to enable early detection of candidates whose motion or appearance warrants closer investigation. In a subsequent step, these candidates are prioritized and examined one after another by adapting the orientation of the tilting unit and the zoom level of the attached camera lens, so that the target can be observed in detail and appropriate data is provided to the classification stage. The PTZ camera image is then used to classify the object as either a UAV or a distractor. For this task, we apply the popular Single Shot MultiBox Detector (SSD), with several of its parameters adapted to UAV detection and classification. We demonstrate the performance of the full pipeline on imagery collected by the system. The data contains actual UAVs as well as distractors, such as birds.
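The hand-off from the overview camera to the PTZ system can be illustrated with a minimal sketch. It is not the implementation used in this work; it assumes an ideal pinhole overview camera whose optical axis coincides with the zero position of the tilting unit, and it uses a hypothetical priority score (track length times apparent speed). The names `Candidate`, `pixel_to_pan_tilt`, and `prioritize` are introduced here for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A tracked point target in the overview image (hypothetical structure)."""
    x: float            # pixel column in the overview image
    y: float            # pixel row in the overview image
    speed: float        # apparent motion in pixels per frame
    track_length: int   # number of frames the track has persisted


def pixel_to_pan_tilt(x, y, width, height, hfov_deg):
    """Convert an overview pixel to (pan, tilt) angles in degrees.

    Assumes an ideal pinhole camera with square pixels, no lens distortion,
    and a tilting unit co-located and aligned with the overview camera.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    f = (width / 2) / math.tan(math.radians(hfov_deg) / 2)
    dx = x - width / 2
    dy = height / 2 - y  # image rows grow downward; positive tilt points up
    pan = math.degrees(math.atan2(dx, f))
    tilt = math.degrees(math.atan2(dy, math.hypot(dx, f)))
    return pan, tilt


def prioritize(candidates):
    """Order candidates by an assumed score: long-lived, fast tracks first."""
    return sorted(candidates,
                  key=lambda c: c.track_length * c.speed,
                  reverse=True)


if __name__ == "__main__":
    tracks = [
        Candidate(x=3000, y=900, speed=0.5, track_length=10),
        Candidate(x=1200, y=400, speed=2.0, track_length=30),
    ]
    for c in prioritize(tracks):
        pan, tilt = pixel_to_pan_tilt(c.x, c.y, width=4096, height=2160,
                                      hfov_deg=90.0)
        print(f"steer to pan={pan:.2f} deg, tilt={tilt:.2f} deg")
```

In a real system, the mapping from pixels to tilting-unit angles would come from a calibration between the two cameras, and the priority score would reflect whatever motion and appearance cues the tracker provides.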