We investigate low-complexity convolutional neural networks (CNNs) for object detection for embedded vision applications. It is well-known that consolidation of an embedded system for CNN-based object detection is more challenging due to computation and memory requirement comparing with problems like image classification. To achieve these requirements, we design and develop an end-to-end TensorFlow (TF)-based fully-convolutional deep neural network for generic object detection task inspired by one of the fastest framework, YOLO.1 The proposed network predicts the localization of every object by regressing the coordinates of the corresponding bounding box as in YOLO. Hence, the network is able to detect any objects without any limitations in the size of the objects. However, unlike YOLO, all the layers in the proposed network is fully-convolutional. Thus, it is able to take input images of any size. We pick face detection as an use case. We evaluate the proposed model for face detection on FDDB dataset and Widerface dataset. As another use case of generic object detection, we evaluate its performance on PASCAL VOC dataset. The experimental results demonstrate that the proposed network can predict object instances of different sizes and poses in a single frame. Moreover, the results show that the proposed method achieves comparative accuracy comparing with the state-of-the-art CNN-based object detection methods while reducing the model size by 3× and memory-BW by 3 − 4× comparing with one of the best real-time CNN-based object detectors, YOLO. Our 8-bit fixed-point TF-model provides additional 4× memory reduction while keeping the accuracy nearly as good as the floating-point model. Moreover, the fixed- point model is capable of achieving 20× faster inference speed comparing with the floating-point model. Thus, the proposed method is promising for embedded implementations.