Successful acoustic signal classification requires choosing an appropriate, problem-adapted signal representation and extracting an invariant feature vector for classification. The two core scientific questions, however, namely what the best signal representation is and what constitutes acoustic resemblance, remain theoretically unanswered. Defining both an optimum time-frequency distribution (TFD) and the correct acoustic invariants requires a priori knowledge about the inherent structure and symmetries of the acoustic time series, as well as knowledge about the differences between the classes to be distinguished. The central parts of this work are a data-driven optimization of the parameterized TFD, which maximizes a distance measure between the different sound classes, and a geometric similarity concept based on dimensional analysis for defining dimensionless acoustic invariants. Starting from a parameterized TFD of the acoustic signals, the joint moments are calculated. Dimensional analysis is then used to define dimensionless quantities that are invariant under geometric transformations of the TFDs. The significance of these invariants is demonstrated for the acoustic class of whistling and booming noise (WBN). Using these geometric invariants, the detection rate of WBN can be improved. However, the detection rate depends heavily on the chosen TFD. By maximizing the distance measure between the sound classes as a function of two TFD kernel parameters, the optimum TFD with respect to the available signal structure can be found; this is exemplified using an industrial WBN dataset. To validate the kernel optimization, acoustic signals representing analytically known asymptotic limit cases with predetermined behavior are given.
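As a rough illustration of the idea of joint moments and dimensionless invariants, the following minimal sketch computes discrete joint moments of a toy TFD and forms one dimensionless ratio from them. The specific invariant used here, the normalized mixed central moment (a time-frequency correlation coefficient), is an assumption chosen for illustration and is not necessarily one of the invariants defined in this work; the function names `joint_moment` and `tf_correlation` and the toy data are likewise hypothetical.

```python
def joint_moment(tfd, times, freqs, p, q, t0=0.0, f0=0.0):
    """Discrete joint moment: sum over (i, j) of (t_i - t0)^p (f_j - f0)^q tfd[i][j]."""
    total = 0.0
    for i, t in enumerate(times):
        for j, f in enumerate(freqs):
            total += (t - t0) ** p * (f - f0) ** q * tfd[i][j]
    return total


def tf_correlation(tfd, times, freqs):
    """Dimensionless ratio mu11 / sqrt(mu20 * mu02) of central joint moments.

    Illustrative assumption: the physical units of time, frequency, and
    energy cancel in this ratio, so it does not change when the axes are
    shifted or rescaled by positive factors.
    """
    m00 = joint_moment(tfd, times, freqs, 0, 0)
    t_mean = joint_moment(tfd, times, freqs, 1, 0) / m00
    f_mean = joint_moment(tfd, times, freqs, 0, 1) / m00
    mu11 = joint_moment(tfd, times, freqs, 1, 1, t_mean, f_mean)
    mu20 = joint_moment(tfd, times, freqs, 2, 0, t_mean, f_mean)
    mu02 = joint_moment(tfd, times, freqs, 0, 2, t_mean, f_mean)
    return mu11 / (mu20 * mu02) ** 0.5


# Toy TFD (hypothetical data): energy concentrated along a rising ridge.
times = [0.0, 1.0, 2.0, 3.0]          # seconds
freqs = [100.0, 200.0, 300.0, 400.0]  # hertz
tfd = [[1.0 if i == j else 0.1 for j in range(4)] for i in range(4)]

r = tf_correlation(tfd, times, freqs)

# Same distribution with rescaled energy, time in milliseconds, frequency in kHz.
tfd_scaled = [[5.0 * v for v in row] for row in tfd]
times_ms = [1e3 * t for t in times]
freqs_khz = [1e-3 * f for f in freqs]
r_scaled = tf_correlation(tfd_scaled, times_ms, freqs_khz)

print(r, r_scaled)
```

Both values agree up to rounding, which is the property a dimensionless invariant is meant to provide: a change of units or a uniform rescaling of the time-frequency plane leaves the feature unchanged.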