Reliable vehicle detection and tracking in wide area motion imagery (WAMI), a novel class of imagery captured by airborne sensor arrays and characterized by large ground coverage and low frame rate, are the basis for higher-level image analysis tasks in wide area aerial surveillance. Possible applications include real-time traffic monitoring, driver behavior analysis, and anomaly detection. Most frameworks for detection and tracking in WAMI data rely on motion-based input detections generated by frame differencing or background subtraction. Subsequently employed tracking approaches aim at recovering missing motion detections to enable persistent tracking, i.e. continuous tracking also for vehicles that become stationary. Recently, a moving object detection method based on convolutional neural networks (CNNs) showed promising results on WAMI data. Therefore, in this work we analyze how CNN-based detection methods can improve persistent WAMI tracking compared to detection methods based on difference images. To find detections, we employ a network that uses consecutive frames as input and computes detection heatmaps as output. The high quality of the output heatmaps allows for detection localization by non-maximum suppression without further post processing. For quantitative evaluation, we use several regions of interest defined on the publicly available, annotated WPAFB 2009 dataset. We employ the common metrics precision, recall, and f-score to evaluate detection performance, and additionally consider track identity switches and multiple object tracking accuracy to assess tracking performance. We first evaluate the moving object detection performance of our deep network in comparison to a previous analysis of difference-image based detection methods. Subsequently, we apply a persistent multiple hypothesis tracker with WAMI-specific adaptations to the CNN-based motion detections, and evaluate the tracking results with respect to a persistent tracking ground truth. We yield significant improvement of both the motion-based input detections and the output tracking quality, demonstrating the potential of CNNs in the context of persistent WAMI tracking.