Neural networks (NNs) with a small model size find applications in mobile and wearable computing. One well-known example is SqueezeNet, which matches the accuracy of AlexNet with 50x fewer parameters. It has inspired several follow-ups and architectural variants, but these were built upon ad hoc arguments and justified experimentally. Why SqueezeNet works so efficiently remains a mystery. In this work, we attempt to provide a scientific explanation for the superior performance of SqueezeNet. The function of the fire module, a key component of SqueezeNet, is analyzed in detail. We study the evolution of cross-entropy values across layers and use visualization tools to shed light on its behavior with several illustrative examples.
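
To make the parameter savings of the fire module concrete, the following minimal sketch counts the parameters of one fire module (a 1x1 squeeze convolution followed by parallel 1x1 and 3x3 expand convolutions whose outputs are concatenated) and compares it with a plain 3x3 convolution producing the same number of output channels. The helper names `conv_params` and `fire_params` are hypothetical, and the channel sizes (96 input, 16 squeeze, 64 + 64 expand) are assumed here to correspond to the first fire module of the original SqueezeNet; they are illustrative and not taken from this work.

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus biases of a k x k convolution layer."""
    return in_ch * out_ch * k * k + out_ch


def fire_params(in_ch, squeeze, expand1x1, expand3x3):
    """Parameter count of a fire module (hypothetical helper).

    The squeeze layer reduces the channel count with 1x1 filters;
    both expand layers then read from the squeezed feature map.
    """
    s = conv_params(in_ch, squeeze, 1)
    e1 = conv_params(squeeze, expand1x1, 1)
    e3 = conv_params(squeeze, expand3x3, 3)
    return s + e1 + e3


# Assumed channel sizes (illustrative): 96 -> squeeze 16 -> expand 64 + 64.
fire = fire_params(96, 16, 64, 64)
# Baseline: a single 3x3 convolution mapping 96 -> 128 channels directly.
plain = conv_params(96, 128, 3)

print(fire, plain, round(plain / fire, 1))
```

Under these assumptions, most of the saving comes from the squeeze layer: the 3x3 filters operate on 16 channels instead of 96, so the expensive part of the module shrinks roughly in proportion to the squeeze ratio.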