In this work, we present the framework surrounding the development of a mmW radar image-based algorithm for wire recognition and classification for rotorcraft operation in degraded visual environments. While a mmW sensor image lacks the optical resolution and perspective of an IR or LIDAR sensor, it currently presents the only true see-through mitigation under the heaviest of degraded vision conditions. Additionally, the mmW sensor produces a high-resolution radar map that has proven to be highly interpretable, especially to a familiar operator. Building on these advantages, the mmW radar image-based algorithm is trained and evaluated against independent mmW imagery collected from a live flight test in a relevant environment. Our approach is founded on image processing and machine learning techniques that exploit radar-based signal properties, together with sensor and platform information, for added robustness. We discuss some of the requirements and practical challenges of standalone algorithm development and, lastly, present preliminary examples using existing development tools and discuss the path for continued advancement and evaluation.
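As a purely hypothetical illustration of the kind of image-processing front end such a pipeline might use (the abstract does not specify the authors' actual method), the sketch below flags line-like returns in a radar intensity image with a probabilistic Hough transform; all thresholds are placeholders.

```python
import cv2
import numpy as np

def detect_wire_candidates(radar_img: np.ndarray):
    """Flag line-like returns (candidate wires) in a single-channel
    mmW radar intensity image. All thresholds are illustrative."""
    # Suppress speckle before edge extraction.
    smoothed = cv2.GaussianBlur(radar_img, (5, 5), 0)
    edges = cv2.Canny(smoothed, 50, 150)
    # Long, thin returns map to line segments under the Hough transform.
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                            threshold=40, minLineLength=60, maxLineGap=10)
    return [] if lines is None else [tuple(l[0]) for l in lines]
```

In a full system of the kind described above, a learned classifier would then score each candidate using radar signal properties and sensor/platform state.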
The outstanding coding performance of H.264 comes at the cost of significantly higher complexity, making it too complex to be applied widely. This work aims at accelerating the H.264 encoder using joint algorithm/code-level optimization techniques so as to make real-time encoding feasible on a commercial personal computer. For the algorithm-level optimization, we propose a fast inter-mode decision scheme based on the spatio-temporal information of neighboring macroblocks. We use a commercial profiling tool to identify the most time-consuming modules and then apply several code-level optimization techniques, including frame-memory rearrangement and single-instruction-multiple-data (SIMD) implementations based on the Intel MMX/SSE2 instruction sets. Search-mode reordering and early termination for variable block-size motion estimation are then applied to speed up these time-critical modules. The simulation results show that our proposed jointly optimized H.264 encoder achieves a speed-up factor of up to 18 compared to the reference encoder without introducing serious quality degradation.
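A hedged sketch of how such a spatio-temporal fast inter-mode decision could work follows; the paper's exact rules are not given in the abstract, so the neighbor set, mode names, and threshold are illustrative.

```python
# If the spatially neighboring macroblocks and the co-located macroblock
# in the previous frame all chose large partitions, and the current 16x16
# residual is small, skip evaluating the expensive small-partition modes.

LARGE_MODES = {"SKIP", "16x16", "16x8", "8x16"}
ALL_MODES = ["SKIP", "16x16", "16x8", "8x16", "8x8", "8x4", "4x8", "4x4"]

def candidate_modes(left_mb, top_mb, colocated_mb, sad16x16, thresh=512):
    neighbors = (left_mb, top_mb, colocated_mb)
    if all(m in LARGE_MODES for m in neighbors) and sad16x16 < thresh:
        return ["SKIP", "16x16"]   # homogeneous, slow-moving region
    return ALL_MODES               # fall back to the full H.264 mode set
```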
H.264/MPEG-4 AVC is the latest international video coding standard. Fidelity Range Extensions (FRExt) has recently been adopted as an amendment to AVC and has demonstrated further coding efficiency improvements. With the newly added coding tools, FRExt encoders are very computationally demanding. In this work, several new algorithms are proposed to optimize the FRExt-enabled features, including a fast rate-distortion optimized intra-mode decision algorithm and a fast variable block-size transform decision algorithm. Simulation results show that our proposed algorithms can reduce the total encoder runtime by about 20%, with performance degradation within 0.1 dB.
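The abstract does not detail the fast intra-mode decision, but a generic early-termination form of a rate-distortion optimized search, of the kind such algorithms typically use, might look like the following (the mode ordering heuristic and exit ratio are assumptions):

```python
def fast_intra_mode_decision(rd_cost, ordered_modes, exit_ratio=1.1):
    """rd_cost(mode) returns the Lagrangian cost J = D + lambda * R.
    ordered_modes lists intra modes from most to least likely, e.g. as
    predicted from the modes of neighboring blocks."""
    best_mode, best_cost = None, float("inf")
    for mode in ordered_modes:
        cost = rd_cost(mode)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        elif cost > exit_ratio * best_cost:
            break  # costs are trending up; stop the search early
    return best_mode, best_cost
```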
In this paper, early-stop and Motion Vector (MV) re-use approaches are proposed for MPEG-2 to H.264 transcoding to reduce the computation of variable block-size motion estimation. By combining the two approaches, the number of MV search points is reduced by more than 80% without significantly affecting the video quality. The proposed approaches can also be used for fast variable block-size motion estimation in H.264 video encoding.
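A minimal sketch of the combined idea, with assumed thresholds and window size (the abstract does not give the actual parameters): the decoded MPEG-2 macroblock vector seeds the H.264 search for each partition, and the refinement stops immediately if the seed already matches well.

```python
def refine_mv(seed_mv, sad, search_range=2, early_stop_thresh=256):
    """seed_mv: MV re-used from the decoded MPEG-2 stream.
    sad(mv) -> matching cost of a candidate vector for this partition."""
    best_mv, best_sad = seed_mv, sad(seed_mv)
    if best_sad < early_stop_thresh:
        return best_mv                    # early-stop: seed is good enough
    sx, sy = seed_mv
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cand = (sx + dx, sy + dy)     # small window around the seed
            cost = sad(cand)
            if cost < best_sad:
                best_mv, best_sad = cand, cost
    return best_mv
```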
Global motion estimation is a powerful tool used in computer vision and video processing. In this paper, we propose an efficient approach to estimating general global motion parameters from coarsely sampled motion vector fields. The approach is based on gradient descent with outlier rejection. Experimental results show the effectiveness of the method in terms of robustness and accuracy.
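To make the idea concrete, here is a minimal sketch of fitting a global affine motion model to a sparse MV field with iterative outlier rejection; it substitutes a closed-form least-squares step for the paper's gradient descent, and the iteration count and rejection threshold are assumptions.

```python
import numpy as np

def fit_global_motion(points, mvs, iters=5, thresh=2.0):
    """points: (N, 2) block centers; mvs: (N, 2) measured motion vectors.
    Fits v = A @ [x, y, 1] and returns the 2x3 affine matrix A."""
    pts = np.asarray(points, float)
    v = np.asarray(mvs, float)
    X = np.hstack([pts, np.ones((len(pts), 1))])
    mask = np.ones(len(pts), bool)
    for _ in range(iters):
        A, *_ = np.linalg.lstsq(X[mask], v[mask], rcond=None)
        residual = np.linalg.norm(X @ A - v, axis=1)
        mask = residual < thresh   # drop vectors dominated by local motion
    return A.T
```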
In video transcoding, useful coding statistics become available when the input video stream is decoded, and it is possible to exploit them to improve the transcoding performance. In this paper, we investigate how to use these statistics to perform picture bit-allocation when transcoding a pre-encoded video bit-stream. We propose a scheme to estimate the complexities of the pictures of the output video from the coding statistics computed from the input video stream. Based on the estimated picture complexity, we present a picture bit-allocation algorithm for the rate-control of the transcoding process. The algorithm is simple to compute, effectively improves the video quality, and reduces picture-quality variations.
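A hedged sketch of the general approach follows; the abstract does not give the authors' estimator, so this uses the common convention that a picture's complexity is its coded size times its mean quantizer, both read while decoding the input stream.

```python
def allocate_bits(input_bits, input_qp, remaining_bits):
    """input_bits[i], input_qp[i]: coded size and mean quantizer of
    picture i from the input stream. Returns a target bit budget per
    output picture, proportional to estimated complexity."""
    complexities = [b * q for b, q in zip(input_bits, input_qp)]
    total = sum(complexities)
    return [remaining_bits * x / total for x in complexities]
```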
Most previous research efforts on video transcoding have focused on changing the bit-rate of one pre-encoded bit-stream to another. However, in many applications (e.g., streaming video over a heterogeneous network), it may be necessary to transcode a pre-encoded video bit-stream into multiple bit-streams with different bit-rates and features to support multiple clients with different requirements. In this paper, we discuss the case of point-to-multipoint transcoding. We compare two transcoder architectures in terms of processing speed for H.263 and MPEG-2 transcoding. We show that for point-to-multipoint transcoding, a cascaded video transcoder is more efficient, since some parts of the transcoder can be shared by the multiple clients.
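The sharing argument can be illustrated with a hypothetical skeleton (names are ours, not the paper's): in a cascaded transcoder the decode stage runs once per input frame regardless of how many clients are served, while only the re-encode stage is replicated.

```python
class CascadedMultipointTranscoder:
    def __init__(self, decoder, encoders):
        self.decoder = decoder    # shared front end, runs once per frame
        self.encoders = encoders  # one per client bit-rate/feature set

    def transcode_frame(self, coded_frame):
        picture = self.decoder.decode(coded_frame)   # amortized work
        return [enc.encode(picture) for enc in self.encoders]
```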
We investigate the scenario of using the Automatic Repeat reQuest (ARQ) retransmission scheme for two-way low bit-rate video communications over wireless Rayleigh fading channels. We show that during the retransmission of error packets, due to the reduced channel throughput, the video encoder buffer may fill up quickly and cause the TMN8 rate-control algorithm to significantly reduce the bits allocated to each video frame. This results in Peak Signal-to-Noise Ratio (PSNR) degradation and many skipped frames. To reduce the number of frames skipped, in this paper we propose a coding scheme that takes into consideration the effects of the video buffer fill-up, an a priori channel model, the channel feedback information, and hybrid ARQ/FEC. The simulation results indicate that our proposed scheme encodes the video sequences with far fewer skipped frames and higher PSNR compared to H.263 TMN8.
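As an assumed illustration of buffer- and channel-aware allocation (the abstract does not reproduce the scheme's equations), the per-frame target can be scaled by the expected goodput under retransmission and corrected for the current buffer occupancy:

```python
def frame_bit_target(channel_rate, frame_rate, p_retransmit,
                     buffer_level, buffer_size):
    goodput = channel_rate * (1.0 - p_retransmit)  # a priori channel model
    base = goodput / frame_rate                    # bits/frame at goodput
    # Spend less when the encoder buffer is filling, more as it drains,
    # instead of letting buffer fill-up force frame skips.
    correction = 1.0 - buffer_level / buffer_size
    return max(base * correction, 0.0)
```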
KEYWORDS: Video, Computer programming, Video coding, Video processing, Quantization, Cameras, Visual communications, Mathematical modeling, Video compression, Semantic video
In streaming video applications, video sequences are encoded off-line and stored in a server. Users may access the server over a constant bit-rate channel such as the Public Switched Telephone Network (PSTN) or the Integrated Services Digital Network (ISDN). Examples of streaming video are video on demand, archived video news, and non-interactive distance learning. Before playback, part of the video bit-stream is pre-loaded into the decoder buffer to ensure that every frame can be decoded at its scheduled time. For these streaming video applications, since delay (latency) is not a critical issue and the whole video sequence is available to the encoder, a more sophisticated bit-allocation scheme can be used to achieve better video quality. During the encoding process for streaming video, two constraints need to be considered: the maximum pre-loading time that video viewers are willing to accept and the physical buffer size at the receiver (decoder) side. In this paper, we propose a rate-control scheme that uses statistical information from the whole video sequence as guidance to generate better video quality for video streaming over constant bit-rate channels. Simulation results show video quality improvements over the regular H.263 TMN8 encoder.
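A minimal sketch of such off-line allocation under the two stated constraints follows (the authors' actual scheme is not detailed in the abstract; the proportional-to-complexity rule is an assumption):

```python
def streaming_allocation(complexity, channel_rate, frame_rate,
                         preload_bits, buffer_size):
    """Allocate bits per frame in proportion to complexity, then clip so
    the decoder buffer never underflows or exceeds its physical size."""
    n = len(complexity)
    per_frame_in = channel_rate / frame_rate   # bits arriving per frame
    total = per_frame_in * n + preload_bits
    targets = [total * c / sum(complexity) for c in complexity]
    fullness = preload_bits                    # set by max preload time
    for i in range(n):
        targets[i] = min(targets[i], fullness)            # no underflow
        fullness = min(fullness - targets[i] + per_frame_in, buffer_size)
    return targets
```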
KEYWORDS: Video, Computer simulations, Visual communications, Video processing, Signal processing, Wireless communications, Control systems, Receivers, Signal to noise ratio, Human-machine interfaces
Wireless channel impairments pose many challenges to real-time visual communications. In this paper, we describe a real-time, software-based wireless visual communications simulation platform that can be used for performance evaluation in real time. The platform consists of two personal computers serving as hosts. The major components of each PC host are a real-time programmable video codec, a wireless channel simulator, and a network interface for data transport between the two hosts. These three components are interfaced in real time to show the interaction of various wireless channels and video coding algorithms. Their programmable features allow users to evaluate the performance of user-controlled wireless channel effects without physically carrying out experiments that are limited in scope, time-consuming, and costly. Using this simulation platform as a testbed, we have experimented with several wireless channel effects, including Rayleigh fading, antenna diversity, channel filtering, symbol timing, modulation, and packet loss.
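For flavor, here is a minimal sketch of the kind of flat Rayleigh fading channel model such a simulator might plug in (structure and parameters are assumptions, not taken from the paper): each symbol is scaled by a complex Gaussian fading gain and corrupted by AWGN.

```python
import numpy as np

def rayleigh_channel(symbols, snr_db, rng=None):
    """Apply flat Rayleigh fading plus AWGN at the given SNR (dB)."""
    rng = rng or np.random.default_rng()
    n = len(symbols)
    # Unit-average-power complex Gaussian gain -> Rayleigh envelope.
    h = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    noise_var = 10.0 ** (-snr_db / 10.0)
    noise = np.sqrt(noise_var / 2) * (rng.standard_normal(n)
                                      + 1j * rng.standard_normal(n))
    return h * np.asarray(symbols) + noise, h
```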
Multi-point videoconferencing provides the full benefits of teleconferencing but also raises more involved technical issues. This paper presents a detailed analysis of a continuous-presence video bridge using the H.261 video coding standard. We first describe the architecture and the required operations of a coded-domain bridge using H.261. We then derive bounds on the bridge delay and the buffer size required to implement the bridge. The delay and the buffer occupancy of the video bridge depend on the order, complexity, and bit-distribution of the input video sources. To investigate a typical case, we simulate the delay and the buffer occupancy of a video bridge, and we also provide a heuristic method to estimate the delay in a typical case. Several techniques to minimize the bridge delay and the buffer size are discussed. Finally, we simulate intra-slice coding and show that the delay and the buffer size can be reduced significantly using this technique.
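As an assumed illustration of the kind of simulation described (the paper's actual bounds and parameters are not reproduced here), one can track the multiplexer buffer of a bridge that merges several coded inputs into one output stream:

```python
def simulate_bridge(frame_bits_per_source, out_rate, frame_rate):
    """frame_bits_per_source: one list of coded bits per frame for each
    input source. Returns the worst queuing delay (seconds) and the
    peak buffer occupancy (bits) observed over the simulation."""
    out_per_frame = out_rate / frame_rate
    buf, peak, worst_delay = 0.0, 0.0, 0.0
    for frame in zip(*frame_bits_per_source):
        buf += sum(frame)                    # all sub-pictures arrive
        worst_delay = max(worst_delay, buf / out_rate)
        buf = max(buf - out_per_frame, 0.0)  # drained at the output rate
        peak = max(peak, buf)
    return worst_delay, peak
```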
The Discrete Cosine Transform (DCT) is considered to be the most effective transform coding technique for image and video compression. In this paper, a new implementation of an experimental prototype multi-function DCT/IDCT (Inverse DCT) chip is reported. The chip is based on a distributed-arithmetic architecture. The main features of the chip include: 1) the DCT and the IDCT are integrated in the same chip; 2) the chip achieves high accuracy, exceeding the stringent requirements of a proposed CCITT standard; 3) it achieves a high operating speed of 27 MHz and is thus applicable to a wide range of real-time image and video applications; 4) the internal clock frequency is the same as the pixel rate; and 5) with an on-chip zigzag scan converter and an adder/subtractor, it is multifunctional and useful in a DPCM configuration. The chip is implemented with standard cells and contains about 156k transistors.
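To illustrate the distributed-arithmetic idea behind such a chip (this is a bit-serial software analogue, not the chip's actual design), an inner product of fixed DCT coefficients with input samples can be computed one input bit-plane per cycle by indexing a precomputed table of coefficient sums:

```python
def da_inner_product(coeffs, xs, bits=8):
    """Compute sum(coeffs[i] * xs[i]) bit-serially; xs are unsigned
    integers of width `bits`, coeffs are fixed integer weights."""
    n = len(coeffs)
    # ROM: for every n-bit address, the sum of the selected coefficients.
    rom = [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
           for addr in range(1 << n)]
    acc = 0
    for bit in range(bits):             # one bit-plane of the inputs
        addr = sum(((x >> bit) & 1) << i for i, x in enumerate(xs))
        acc += rom[addr] << bit         # shift-and-accumulate
    return acc
```

One appeal of distributed arithmetic in hardware is that each cycle needs only a table lookup and an add rather than a full multiplier.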