This research presents an in-depth investigation into the application of convolutional neural networks (CNNs) for acoustic remote sensing on multi-rotor UAVs, with a specific focus on detecting large vehicles on the ground. We used a multi-rotor UAV equipped with a custom audio recorder, calibrated microphones, and uniquely designed microphone mounts for data collection. We explored optimal features for training our CNN, experimented with different normalization techniques, and examined how they interact with various activation functions. The study further explores the fine-tuning of model parameters to enhance detection performance and reliability. The outcome was a CNN model, trained on a combination of real-world and synthetic data, that demonstrated a proficient capability in target detection.
A priori estimation of the expected achievable quality for an uncrewed aerial vehicle (UAV) based imaging system can help validate the choice of components for the system's implementation. For uncrewed airborne imaging systems, coupling the sensor to the UAV platform is relatively simple. Quantifying the expected quality of the collected data, on the other hand, can be less clear and often requires trial and error. The central problem for these platforms is blur. The blur produced by the various rotational modalities of the aircraft can range from overwhelming to trivial, but in most cases it can be mitigated. This leaves the combination of the aircraft's linear motion, its altitude, and the imaging device's instantaneous field of view (IFOV) and integration time as the determining factors for the blur produced in the image. In addition, there are significant differences in the speeds obtainable by multi-rotor and fixed-wing UAVs. In this paper we develop mathematical models for predicting blur based on these factors. We then compare these models with field data obtained from cameras mounted on fixed-wing and multi-rotor UAVs. Conclusions regarding the camera characteristics best suited to both types of UAV, as well as the best image acquisition parameters such as altitude and speed, are discussed.
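The abstract does not give the blur models themselves. As a rough illustration of how the named factors combine, the sketch below assumes a nadir-pointing camera, flat terrain, a small-angle IFOV, and purely linear motion, so that the ground footprint of one pixel is altitude times IFOV and the smear is speed times integration time. The function names and the example numbers are hypothetical and are not taken from the paper.

```python
def ground_sample_distance(altitude_m, ifov_rad):
    """Ground footprint of one pixel in metres (small-angle approximation)."""
    if altitude_m <= 0 or ifov_rad <= 0:
        raise ValueError("altitude and IFOV must be positive")
    return altitude_m * ifov_rad


def motion_blur_pixels(speed_m_s, integration_time_s, altitude_m, ifov_rad):
    """Linear-motion blur expressed in pixels.

    Assumed model (not the paper's): distance flown during the integration
    time, divided by the ground footprint of a single pixel.
    """
    ground_smear_m = speed_m_s * integration_time_s
    return ground_smear_m / ground_sample_distance(altitude_m, ifov_rad)


if __name__ == "__main__":
    # Hypothetical platforms sharing one camera: 0.2 mrad IFOV, 1 ms integration.
    for name, speed in (("fixed-wing", 20.0), ("multi-rotor", 5.0)):
        blur = motion_blur_pixels(speed, 1e-3, altitude_m=100.0, ifov_rad=2e-4)
        print(f"{name}: {blur:.2f} px of blur")
```

Under these assumptions blur scales linearly with speed and integration time and inversely with altitude and IFOV, which is why a slower multi-rotor can tolerate a longer integration time than a fixed-wing aircraft at the same altitude.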
Discriminating between cotton and the invasive Palmer amaranth is economically important, as these weeds take resources away from cotton, resulting in diminished crop yield. Previous research into discriminating between plant species, including cotton and Palmer amaranth, has focused on aerial imagery and the derived red, green, and near-infrared (RGN) spectral data, fed into a machine-learning algorithm that classifies the plants based on measurable differences in their spectral characteristics. We believe that this research can be expanded upon by using geometric data derived from the aerial imagery to classify cotton and non-cotton plants based on their physical characteristics. This would also allow for accurate geolocation of the classified weeds for later removal. An autonomous drone carrying a GPS receiver and an RGN camera will fly a predetermined path to scan a crop field, and the resulting videos will be divided into individual frames. From these frames, both the RGN spectral data and a 3D point cloud can be derived. The RGN data and the geometric data will be fed into a machine-learning algorithm to classify cotton and non-cotton plants, and additional processing will then be done to geolocate the weeds. With this additional information for classification, it is hoped that the discrimination between cotton and weeds can be more accurate and the location of the weeds more exact.
Most implementations of Fast Fourier Transform (FFT) algorithms available in software packages and libraries require the number of input points to be an integral power of two. However, most digital images, especially those obtained in PC-based systems, seldom meet this requirement. This paper presents a simple computational technique to adjust image dimensions to an appropriate size. Nonlinear polynomials are used as the basis for the scheme. The derivation of the basic interpolation functions is presented, and a basic three-by-three mask is obtained. Observations regarding properties of the image and the mask are made, which lead to the reduction of the mask to a three-by-one. An optimal procedure for utilizing the mask is also presented.
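The abstract does not spell out the polynomial mask, so it is not reproduced here. The sketch below only illustrates the surrounding idea of resampling an image so that each dimension becomes a power of two. It substitutes plain separable linear interpolation for the paper's polynomial scheme, and all function names are hypothetical.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


def resample_linear(values, new_len):
    """Resample a 1-D sequence to new_len samples by linear interpolation."""
    old_len = len(values)
    if old_len == 0 or new_len < 1:
        raise ValueError("need a non-empty input and a positive output length")
    if old_len == 1 or new_len == 1:
        return [float(values[0])] * new_len
    scale = (old_len - 1) / (new_len - 1)  # maps output index to input position
    out = []
    for i in range(new_len):
        pos = i * scale
        lo = min(int(pos), old_len - 2)  # left neighbour, clamped at the edge
        frac = pos - lo
        out.append(values[lo] * (1.0 - frac) + values[lo + 1] * frac)
    return out


def resize_to_power_of_two(image):
    """Resize a 2-D list of pixel values so both dimensions are powers of two.

    Rows are interpolated first, then columns (a separable scheme).
    """
    new_rows = next_power_of_two(len(image))
    new_cols = next_power_of_two(len(image[0]))
    widened = [resample_linear(row, new_cols) for row in image]
    columns = [
        resample_linear([row[c] for row in widened], new_rows)
        for c in range(new_cols)
    ]
    return [[columns[c][r] for c in range(new_cols)] for r in range(new_rows)]
```

For example, a 480 by 640 image would be resampled to 512 by 1024 under this scheme. Zero-padding to the same size is a common alternative that avoids interpolation but changes the spectrum in a different way.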