Automatic object segmentation in highly noisy image sequences, composed of a translating object over a background with a different motion, is achieved through joint motion-texture analysis. Local motion and texture are characterized by the energy of the local spatio-temporal spectrum, since different textures undergoing different translational motions display distinctive features in their 3D (x, y, t) spectra. Measurements of local spectral energy are obtained with a bank of directional 3rd-order Gaussian derivative filters applied in a multiresolution pyramid in space-time (10 directions, 3 resolution levels). These 30 energy measurements form a feature vector describing texture and motion at every pixel of the sequence. To improve discrimination and reduce computational cost, we automatically select the 4 features (channels) that best discriminate the object from the background, under the assumptions that the object is smaller than the background and has a different velocity or texture. In this way we reject features that are irrelevant or dominated by noise and could otherwise yield wrong segmentation results. The method has been successfully applied to sequences with extremely low visibility, including objects that are invisible to the eye in the absence of motion.
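The pipeline (directional spatio-temporal energy features followed by channel selection and thresholding) can be sketched roughly as follows. This is an illustrative approximation, not the paper's implementation: it uses axis-aligned mixed 3rd-order Gaussian derivatives from SciPy in place of the steerable 10-direction filter bank, a single resolution level instead of the 3-level pyramid, and a simple tail-to-median ratio as a stand-in for the actual discriminability criterion (the intuition being that a small object with distinct motion produces a heavy upper tail in a channel that discriminates it from the background).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def directional_energies(volume, sigma=1.5):
    """Per-pixel energies of 3rd-order Gaussian derivative filters on a
    (t, y, x) volume. The 10 mixed-order combinations are an axis-aligned
    stand-in for the paper's 10 steerable directions."""
    orders = [(3, 0, 0), (0, 3, 0), (0, 0, 3), (2, 1, 0), (2, 0, 1),
              (1, 2, 0), (0, 2, 1), (1, 0, 2), (0, 1, 2), (1, 1, 1)]
    return np.stack([gaussian_filter(volume, sigma, order=o) ** 2
                     for o in orders], axis=-1)   # (T, H, W, 10)

def select_channels(feats, k=4):
    """Pick the k channels whose energy distribution has the heaviest upper
    tail relative to its median (proxy criterion: a small object with a
    different velocity/texture lights up only a small fraction of pixels)."""
    flat = feats.reshape(-1, feats.shape[-1])
    score = np.percentile(flat, 99, axis=0) / (np.median(flat, axis=0) + 1e-12)
    return np.argsort(score)[-k:]

# Synthetic demo: static noise background, small bright block translating
# 2 px/frame to the right (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
T, H, W = 8, 32, 32
seq = np.repeat(rng.normal(size=(H, W))[None], T, axis=0).astype(float)
for t in range(T):
    seq[t, 12:16, 4 + 2 * t: 8 + 2 * t] += 3.0

E = directional_energies(seq)
sel = select_channels(E, k=4)                  # indices of the 4 kept channels
combined = E[..., sel].sum(-1)                 # summed energy of kept channels
mask = combined > np.percentile(combined, 95)  # crude object/background split
```

Because the background is static here, the selected channels are those involving a temporal derivative, whose energy is near zero on the background and large along the object's trajectory; thresholding their summed energy yields a rough segmentation mask.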