This PDF file contains the front matter associated with SPIE Proceedings Volume 8067, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
Since the early 1990s, researchers have applied evolutionary algorithms to synthesize electronic circuits. It is now evident that the evolutionary design approach can automatically create efficient electronic circuits in many domains. This paper surveys fundamental concepts of evolutionary hardware design and introduces relevant search algorithms such as Cartesian genetic programming (CGP). Several case studies are presented demonstrating the strengths and weaknesses of the method. Target domains are combinational circuit synthesis, where the goal is to minimize the number of gates; image filter design for field-programmable gate arrays (FPGAs), where the goal is to match the filtering quality of conventional methods at a significantly lower cost on chip; and the evolution of benchmark circuits for evaluating testability analysis methods. Evolved circuits are compared with the best-known conventional designs. FPGAs are presented as accelerators for evolutionary circuit design and circuit adaptation.
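
As background for how CGP encodes a circuit, a minimal Python sketch is given below; the gate set, array size and truth-table fitness are illustrative assumptions, not the survey's exact formulation.

    import random

    # Minimal CGP sketch: a genotype is a list of (function, in1, in2) node
    # genes plus output genes; fitness counts matching truth-table rows.
    FUNCS = [lambda a, b: a & b,   # AND
             lambda a, b: a | b,   # OR
             lambda a, b: a ^ b,   # XOR
             lambda a, b: 1 - a]   # NOT (second input ignored)

    def random_genotype(n_in, n_nodes, n_out):
        nodes = []
        for i in range(n_nodes):
            avail = n_in + i                      # feed-forward: earlier nodes only
            nodes.append((random.randrange(len(FUNCS)),
                          random.randrange(avail), random.randrange(avail)))
        outs = [random.randrange(n_in + n_nodes) for _ in range(n_out)]
        return nodes, outs

    def simulate(nodes, outs, inputs):
        values = list(inputs)                     # values[0..n_in-1] are inputs
        for f, a, b in nodes:
            values.append(FUNCS[f](values[a], values[b]))
        return tuple(values[o] for o in outs)

    def fitness(genotype, truth_table):
        nodes, outs = genotype
        return sum(simulate(nodes, outs, i) == o for i, o in truth_table)

In a (1+lambda) loop, point mutations of these integer genes drive the search; among fully functional candidates, the number of active gates can then be minimized.
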
Multi-dimensional algorithms are hard to implement on classical platforms. Pipelining can exploit instruction-level parallelism, but not in the presence of simultaneous data; threads optimize only within the given restrictions. Tiled architectures add a dimension to the solution space: with a large local register store, data parallelism is handled, but only up to a point. 3-D technologies are meant to add a dimension in the realization. Applied at the device level, they make each computational node smaller; the interconnections become shorter and hence the network is condensed. Such advantages are easily lost at higher implementation levels unless 3-D techniques such as multi-cores or chip stacking are also introduced. 3-D technologies scale in space, whereas (partial) reconfiguration scales in time. The optimal selection across the various implementation levels is algorithm dependent. The paper discusses these principles as applied to the scaling of cellular neural networks (CNNs). It illustrates how stacking of reconfigurable chips supports many algorithmic requirements in a defect-insensitive manner. The paper further explores the potential of chip stacking for multi-modal implementations in a reconfigurable approach to heterogeneous architectures for algorithm domains.
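
To make the computational pattern concrete, below is a hedged sketch of one discrete-time CNN update step (using SciPy for the neighbourhood convolution); the 3x3 templates and the time step are placeholder values, not taken from the paper. The strictly local 3x3 coupling is what keeps wires short when the array is scaled across stacked chips.

    import numpy as np
    from scipy.signal import convolve2d

    def cnn_step(x, u, A, B, z, dt=0.1):
        """One explicit-Euler step of the standard CNN state equation."""
        y = np.clip(x, -1.0, 1.0)                 # piecewise-linear output
        feedback = convolve2d(y, A, mode="same")  # 3x3 neighbourhood only
        control = convolve2d(u, B, mode="same")
        return x + dt * (-x + feedback + control + z)
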
A generic bio-inspired adaptive architecture for image compression suitable for implementation in embedded systems is presented. The architecture allows the system to be tuned during its calibration phase; an evolutionary algorithm is responsible for making the system evolve towards the required performance. A prototype has been implemented in a Xilinx Virtex-5 FPGA featuring an adaptive wavelet transform core directed at improving image compression for specific types of images.
An Evolution Strategy has been chosen as the search algorithm, and its typical genetic operators have been adapted to allow for a hardware-friendly implementation. HW/SW partitioning issues are also considered: a high-level description of the algorithm is profiled, which validates the proposed resource allocation in the device fabric.
To check the robustness of the system and its adaptation capabilities, different types of images have been selected as validation patterns. A direct application of such a system is its deployment in an environment unknown at design time, letting the calibration phase adjust the system parameters so that it performs efficient image compression. This prototype implementation may also serve as an accelerator for the automatic design of evolved transform coefficients, which are later synthesized and implemented in a non-adaptive system in the final implementation device, whether HW- or SW-based.
The architecture has been built in a modular way so that it can be easily extended to adapt other types of image-processing cores. Details of this pluggable-component approach are also given in the paper.
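
The search loop itself is small; a hedged (1+lambda) Evolution Strategy sketch follows. The coefficient encoding, population size and the fitness function (some rate-distortion score computed after the adapted transform) are assumptions for illustration, not the paper's exact design.

    import random

    def evolve(fitness, n_coeffs=16, lam=4, sigma=0.05, generations=200):
        """(1+lambda) ES: mutation-only search, easy to realize in hardware."""
        parent = [random.uniform(-1, 1) for _ in range(n_coeffs)]
        best = fitness(parent)
        for _ in range(generations):
            for _ in range(lam):
                child = [c + random.gauss(0, sigma) for c in parent]
                score = fitness(child)
                if score >= best:          # accept ties so the search can drift
                    parent, best = child, score
        return parent, best

Mutation-only search with a single parent avoids crossover logic, which is one reason this family of operators maps well onto an FPGA datapath of adders and a random-number generator.
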
Systems on Chip (SoCs) are present in a wide range of applications. This diversity, together with the number of critical variables involved in the design process, makes SoC design a challenging topic. FPGAs have become a preferred device for developing and prototyping SoCs, and consequently Partial Reconfiguration (PR) has gained importance in this approach. Through PR it is possible to have one section of the FPGA operating while another section is disabled and partially reconfigured to provide new functionality. In this way hardware resources can be time-multiplexed, making it possible to reduce size, cost and power. Here we focus on the implementation of a SoC on an FPGA-based board, with one of its peripherals being a reconfigurable partition (RP). Inside this RP, different hardware modules defined as reconfigurable modules (RMs) can be configured. Thus, the system can adopt different hardware configurations depending on the application needs and FPGA limitations, while the rest of the system continues working. To this end, a MicroBlaze soft-core processor is used in the system design and a Virtex-5 FPGA board is used for the implementation. A remote sensing application is used to explore the capabilities of this approach. By identifying the section(s) of the application suitable for time-sharing, it is possible to define the RMs to place inside the RP. Different configurations were implemented and area measurements were taken. Preliminary results on performance and area utilization are presented to validate the improvement in flexibility and resource usage.
Modern FPGAs with run-time reconfiguration allow the implementation of complex systems offering the flexibility of software-based solutions combined with the performance of hardware. This combination of characteristics, together with the development of new specific methodologies, makes it feasible to reach new points of the system design space, and embedded systems built on these platforms are acquiring more and more importance. However, the practical exploitation of this technique in fields that have traditionally relied on resource-restricted embedded systems is mainly limited by strict power consumption requirements, cost, and the strong dependence of DPR techniques on the specific features of the underlying device technology.
In this work we tackle these problems by designing a reconfigurable platform based on the low-cost, low-power Spartan-6 FPGA family. The full process of developing the platform from scratch is detailed in the paper. In addition, the implementation of the reconfiguration mechanism, including two profiles, is reported. The first profile is a low-area, low-speed reconfiguration engine based mainly on software functions running on the embedded processor, while the other is a hardware version of the same engine, implemented in the FPGA logic. This reconfiguration hardware block was originally designed for the Virtex-5 family, and its porting process is also described in this work, addressing the interoperability problem among different families.
In recent years, short-range wireless connectivity has grown exponentially. Fast design and verification of wireless network performance is becoming a necessity for the electronics industry to meet increasingly restrictive market demands. A system-level model of the network is indispensable to ensure fast and flexible design and verification. In this work a SystemC model of the IEEE 802.15.4 standard is presented. The model has been used to verify the performance of the 802.15.4 standard in terms of efficiency and channel throughput as a function of the number of nodes in the network, the payload size, and the frequency with which the nodes try to transmit.
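
For orientation, a crude analytic stand-in for such a sweep is sketched below. The PHY-rate and overhead constants are standard 802.15.4 2.4 GHz figures, but the pure-ALOHA contention term is a deliberate simplification of the real slotted CSMA-CA behaviour that the SystemC model captures.

    import math

    PHY_RATE = 250_000            # bit/s, 2.4 GHz O-QPSK PHY
    OVERHEAD = 6 + 9 + 2          # SHR+PHR (6 B), short-address MHR (9 B), FCS (2 B)

    def channel_throughput(payload, n_nodes, attempts_per_s):
        frame_time = 8 * (OVERHEAD + payload) / PHY_RATE       # s per frame
        g = n_nodes * attempts_per_s * frame_time              # offered load
        s = g * math.exp(-2 * g)                               # ALOHA success rate
        return s * payload / (OVERHEAD + payload) * PHY_RATE   # useful bit/s

    for n in (2, 8, 32):
        print(n, round(channel_throughput(payload=100, n_nodes=n, attempts_per_s=5)))
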
A critical issue in Wireless Sensor Network (WSN) circuits is energy management. This work presents a radio-triggered wake-up solution designed and developed for WSN-based systems. The proposed circuit manages, in a simple and efficient way, node switching between sleep mode and the receiving and transmitting active modes. It uses a hardware listening circuit, which lowers power consumption and avoids extra processing on the main microcontroller. The wake-up is selective, based on predefined recognition patterns, and requires no microcontroller intervention. Furthermore, it is tiny, and the whole circuit is suitable for single-chip CMOS integration. The circuit has been tested to demonstrate the worthiness of the wake-up proposal. With a power consumption of only 8.7 μW (at 3.0 Vdc), the system successfully wakes up nodes up to 15 meters away from the transmission source. This performance improves on solutions presented in previous research works.
This paper presents a performance analysis of wireless image sensor networks for video surveillance using the IEEE 802.15.4 wireless standard. The dependence of image quality and network throughput on JPEG image compression parameters and wireless protocol parameters has been investigated. The objective of the work is to give useful guidelines for the design of wireless video-surveillance networks over the low-cost, low-power, low-rate IEEE 802.15.4 wireless protocol.
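
An illustrative calculation (not taken from the paper) shows the kind of trade-off such guidelines quantify: how a compressed image size maps onto 802.15.4 frames and airtime. The frame limits are from the standard; the image size is a made-up example, and full frames are assumed.

    MAX_MPDU = 127                       # bytes per 802.15.4 MAC frame
    MHR_FCS = 9 + 2                      # short-address MAC header + FCS
    APP_PAYLOAD = MAX_MPDU - MHR_FCS     # image bytes per frame (116)
    SHR_PHR = 6                          # PHY preamble/SFD/length bytes
    PHY_RATE = 250_000                   # bit/s

    def image_airtime(jpeg_bytes):
        frames = -(-jpeg_bytes // APP_PAYLOAD)           # ceiling division
        bits = frames * 8 * (SHR_PHR + MAX_MPDU)         # on-air bits
        return frames, bits / PHY_RATE                   # (frames, seconds)

    print(image_airtime(12_000))         # e.g. a 12 kB compressed QVGA image
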
In this paper, a tool for simulating the impulse response of indoor wireless optical channels using 3D computer-aided design (CAD) models is presented. The tool uses a simulation algorithm that relies on ray-tracing techniques and the Monte Carlo method and improves on all previous methods from a computational standpoint. The 3D scene, or simulation environment, can be defined using any CAD software in which the user specifies, in addition to the setting geometry, the reflection characteristics of the surface materials as well as the structures of the emitters and receivers involved in the simulation. In an effort to improve computational efficiency, two optimizations are presented. The first consists of dividing the setting into cubic regions of equal size; these sub-regions allow the program to consider only those object faces and/or surfaces that lie in the ray propagation path. This first optimization provides a calculation improvement of approximately 50%. The second involves the parallelization of the simulation algorithm: the proposed method statically distributes the rays in equal shares among the processors. This optimization results in a calculation speed-up that is essentially proportional to the number of processors used.
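
The core Monte Carlo estimator is easy to sketch. The fragment below traces one-bounce paths in an empty shoebox room and bins received power over time; the geometry, reflectance and power-falloff model are drastic simplifications of what the CAD-driven tool computes.

    import math, random

    C = 3e8                                   # speed of light, m/s

    def impulse_response(n_rays=100_000, rho=0.8, bins=200, dt=1e-9):
        """One-bounce Monte Carlo estimate of h(t) in a 5x5x3 m empty room."""
        tx, rx = (2.5, 2.5, 0.0), (1.0, 4.0, 0.8)
        h = [0.0] * bins
        for _ in range(n_rays):
            p = (random.uniform(0, 5), random.uniform(0, 5), 3.0)   # ceiling hit
            d1, d2 = math.dist(tx, p), math.dist(p, rx)
            k = int((d1 + d2) / C / dt)                             # time bin
            if k < bins:
                h[k] += rho / (n_rays * (d1 * d2) ** 2)  # crude 1/d^2 spreading
        return h

The cubic-region optimization described above would enter in the ray-surface intersection step omitted here: instead of testing every face, only faces registered in the grid cells along each ray would be considered.
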
A fully differential programmable gain amplifier (PGA) with constant transfer characteristic and very low power consumption is proposed and implemented in a 130 nm CMOS technology. The PGA features a gain range of 4 dB to 55 dB with a step size of 6 dB and a constant bandwidth of 10-550 kHz. It employs two stages of variable amplification with an intermediate 2nd-order low-pass channel filter.
The first stage is a capacitive-feedback OTA using current reuse, achieving a low input noise density of 16.7 nV/√Hz. This stage sets the overall high-pass cutoff frequency to approximately 10 kHz. For all gain settings the high-pass cutoff frequency variation is within ±5%.
The low-pass channel filter is merged with a second amplifying stage forming a Sallen-Key structure. In order to maintain a constant transfer characteristic versus gain, the Sallen-Key feedback is taken from different taps of the load resistance. Using this new approach, the low-pass cutoff frequency stays between 440 kHz and 590 kHz for all gain settings (±14%). Finally, an offset cancellation loop reduces the output offset of the PGA to less than 5 mV (3σ).
The PGA occupies an area of approximately 0.06 mm² and achieves a post-layout power consumption of 55 μW from a 1 V supply. For the maximum gain setting the integrated input-referred noise is 14.4 μVrms, while the total harmonic distortion is 0.7% for a differential output amplitude of 0.5 V.
In this paper, an infrared wireless communications system based on time-hopping spread-spectrum (THSS) techniques employing angle-diversity detection is studied via simulation. Although the system is designed to operate at infrared wavelengths, it can also be used for Visible Light Communications (VLC). Time-hopping coding is based on splitting the symbol period into several short slots. To specify which slots are used to transmit and which are not, the use of maximum-length sequences is considered. The remaining time slots can be used by other users so as to provide the system with multiple-access capabilities. A 2-PPM modulation scheme is selected because it yields good results in infrared systems as well as in VLC. Furthermore, the THSS system allows the number of pulses per symbol to be selected and makes use of an optimum maximum-likelihood receiver for AWGN channels with the ability to choose between hard- and soft-decision decoding. The system designed allows the performance to be compared through the computation of the bit error rate (BER) as a function of the pulse-energy-to-noise-power-spectral-density ratio, for different configurations in single-user and multi-user environments. The results show a significant enhancement when angle-diversity receivers are used, compared to receivers using a single-element detector with a wide field of view (FOV). Two angle-diversity structures are compared: conventional and sectored receivers. Although the sectored receiver exhibits a better BER than the conventional receiver, its implementation is more complex.
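
A hedged sketch of the single-user link is given below: a hard-decision energy comparison stands in for the full ML receiver, and multi-user interference and the m-sequence hopping pattern (which matters only when other users are present) are omitted. Pulses per symbol and Eb/N0 are illustrative parameters.

    import math, random

    def ber_2ppm_th(pulses=2, ebn0_db=8.0, n_symbols=20_000):
        """Monte Carlo BER of 2-PPM with repeated pulses over AWGN, hard decision."""
        ebn0 = 10 ** (ebn0_db / 10)
        sigma = math.sqrt(1 / (2 * ebn0))
        errors = 0
        for _ in range(n_symbols):
            bit = random.getrandbits(1)
            e = [0.0, 0.0]                  # accumulated PPM-position energies
            for _ in range(pulses):         # one pulse per hopped slot
                e[bit] += 1.0               # unit-amplitude transmitted pulse
                e[0] += random.gauss(0, sigma)
                e[1] += random.gauss(0, sigma)
            errors += int(e[1] > e[0]) != bit
        return errors / n_symbols

    print(ber_2ppm_th())
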
The growing number of GNSS-R campaigns has put great pressure on the design of high-performance post-processing for space-level instrumentation. Due to the large scale of data acquisition and the intensive computation of the cross-correlation waveform (CC-WAV), the trade-off between processing time and the amount of data to be stored prior to downlink has led us to a real-time parallel processing design on board. In this paper, we focus on the interaction between the chip-level multiprocessing architecture and the applications, showing that the unbalanced workload of transmission and processing can be compensated for on a novel architecture, the Heterogeneous Transmission and Parallel Computing Platform (HTPCP). The intention of HTPCP is to solve the bus congestion and memory allocation issues. The pros and cons of SMP and HTPCP are discussed, and the simulation results prove that HTPCP can greatly improve the throughput of the GOLD-RTR system.
This paper presents a low-power, high-speed, 4-data-path 128-point mixed-radix (radix-2 & radix-2²) FFT processor for MB-OFDM Ultra-WideBand (UWB) systems. The processor employs the single-path delay feedback (SDF) pipelined structure for the proposed algorithm and uses substructure-sharing multiplication units and shift-add structures instead of traditional complex multipliers. Furthermore, the word lengths are properly chosen, so the hardware cost and power consumption of the proposed FFT processor are effectively reduced. The proposed FFT processor is verified and synthesized using 0.13 μm CMOS technology with a supply voltage of 1.32 V. The implementation results indicate that the proposed 128-point mixed-radix FFT architecture supports a throughput rate of 1 Gsample/s with lower power consumption in comparison to existing 128-point FFT architectures.
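
Verification of such a pipeline is usually done against a bit-accurate software golden model; a minimal recursive radix-2 reference that computes the same 128-point transform is sketched below (the paper's datapath itself is a mixed radix-2/2² SDF pipeline, not this recursion).

    import cmath

    def fft_radix2(x):
        """Reference DIT FFT for power-of-two lengths, e.g. len(x) == 128."""
        n = len(x)
        if n == 1:
            return list(x)
        even, odd = fft_radix2(x[0::2]), fft_radix2(x[1::2])
        out = [0j] * n
        for k in range(n // 2):
            tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # twiddle factor
            out[k] = even[k] + tw                # butterfly upper output
            out[k + n // 2] = even[k] - tw       # butterfly lower output
        return out
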
This paper presents the definition in SystemC of wireless channels at different levels of abstraction. The different levels of description of the wireless channel can easily be interchanged, allowing the application and baseband layers to be reused both in a high-level analysis of the network and in a detailed analysis of the communication between the wireless devices.
A new design methodology for parallel and distributed embedded systems is presented using the behavioural hardware compiler ConPro, which provides an imperative programming model based on concurrently communicating sequential processes (CSP) with an extensive set of interprocess-communication primitives and guarded atomic actions. The programming language and the compiler-based synthesis process enable the design of constrained power- and resource-aware embedded systems with pure Register-Transfer Logic (RTL) efficiently mapped to FPGA and ASIC technologies. Concurrency is modelled explicitly at the control- and data-path levels. Additionally, concurrency at the data-path level can be automatically explored and optimized by different schedulers.
The CSP programming model can be synthesized to hardware (SoC) and software (C, ML) models and targets. A common source for both hardware and software implementations with identical functional behaviour is used. Processes and objects of the entire design can be distributed on different hardware and software platforms, for example several FPGA components and software executed on several microprocessors, providing a parallel and distributed system. Intersystem, interprocess, and object communication is automatically implemented with serial links, not visible at the programming level.
The presented design methodology has the benefit of high modularity and freedom in the choice of target technology and system architecture. Algorithms can be well matched to, and distributed on, different suitable execution platforms and implementation technologies, using a single programming model and providing a balance of concurrency and resource complexity.
An extended case study of a communication protocol used in high-density sensor-actuator networks demonstrates and compares the design of hardware and software targets. The communication protocol is suited for high-density intra- and inter-chip networks.
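
To illustrate the CSP programming model itself (ConPro has its own language; this is not ConPro syntax), a Python sketch with threads and a bounded queue emulates two sequential processes that interact only through a blocking channel:

    import queue, threading

    chan = queue.Queue(maxsize=1)        # bounded channel: put/get synchronize

    def producer():
        for value in range(5):
            chan.put(value)              # blocks until the consumer takes it
        chan.put(None)                   # end-of-stream marker

    def consumer():
        while (v := chan.get()) is not None:
            print("received", v)

    for t in (threading.Thread(target=producer), threading.Thread(target=consumer)):
        t.start()

Because processes share no state and communicate only via such channels, the same source can be mapped to RTL handshake links or to serial inter-chip links without changing its functional behaviour.
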
Multiprocessor Systems-on-Chip (MPSoCs) are emerging as one of the technologies supporting the growing design complexity of embedded systems that include several types of cores. The interconnection among the cores of an MPSoC is proposed to be provided by a Network-on-Chip (NoC). In real applications it is usual to find different interconnection needs among cores, so a distinct bandwidth is needed at each node of a NoC. Since larger FIFOs in NoC routers provide larger throughput and smaller latencies, depths are usually sized for the worst case, compromising not only the router area but also power consumption. In this paper, a reconfigurable router with a dynamic sharing mechanism for the buffers at the input channels is proposed to reduce congestion in the network. In this scheme, a channel may dynamically lend or borrow unused buffer units to or from neighboring channels, according to the connection rates. The proposed reconfigurable router architecture was embedded in the Hermes NoC. The main advantages of Hermes are its small size and modular design; this, as well as its open-source approach, has led to the selection of this NoC. The basic element of Hermes is a router with five bidirectional ports employing an XY routing algorithm. FIFO buffering is present only at the input channel, with all channels having the same buffer depth defined at design time. The proposed reconfigurable router has been coded in VHDL at RTL level by adapting the Hermes router to the proposed scheme. Results obtained from simulating the router under scenarios with different traffic characteristics and percentages of shared buffer show that mean latency can be reduced by up to 30% in comparison to the original router.
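
A toy software model of the sharing policy (not the VHDL design; thresholds and unit sizes are illustrative) clarifies the lend/borrow rule: a buffer unit may migrate between neighbouring input channels only while it is entirely unused.

    class Channel:
        """Input channel owning `capacity` buffer units of `unit` flits each."""
        def __init__(self, base_units, unit=4):
            self.capacity, self.occupancy, self.unit = base_units, 0, unit

        def can_lend(self):
            return (self.capacity > 1 and
                    self.occupancy <= (self.capacity - 1) * self.unit)

    def rebalance(busy, neighbours):
        """Borrow one unit for a nearly full channel, if a neighbour can lend."""
        if busy.occupancy >= busy.capacity * busy.unit - 1:
            for n in neighbours:
                if n.can_lend():
                    n.capacity -= 1
                    busy.capacity += 1
                    return True
        return False
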
The growth in complexity and the requirements of on-chip technologies create the need for new architectures whose solutions represent a compromise between complexity, power consumption, and the Quality of Service (QoS) of the communications between the cores of a System-on-Chip (SoC). The Network-on-Chip (NoC) arises as a solution for implementing efficient interconnections in SoCs. Due to its complexity, this new technology creates a need for specialized engineers who can design the intricate circuits that NoCs require. It is possible to reduce this specialization need by using CAD tools. In this paper, one of these tools, Arteris NoC Solution, is used to develop the proposed framework for NoC emulation. This software includes three different tools: NoCexplorer, for high-level simulation of an abstract model of the NoC; NoCcompiler, in which the NoC is defined and generated in an HDL; and NoCverifier, which performs simulations of the HDL code. Furthermore, a validation and characterization infrastructure was developed for the created NoC, which can be completely emulated on an FPGA. This environment is composed of OCP traffic generators and receptors, which can also perform measurements on the generated traffic, and a storage-and-communication module, which stores the results obtained from the emulation of the entire system on the FPGA and sends them to a PC. Once the data are stored in the PC, statistical analyses are performed, including a comparison of the mean latency from high-level simulations, RTL simulations and FPGA emulations. The results are analyzed for three scenarios with different NoC topologies for the same SoC design.
Scalable Video Coding (SVC) is the extension of the H.264/AVC standard proposed by the Joint Video Team (JVT) to provide flexibility and adaptability in video transmission. SVC extends the H.264/AVC codec with layers, which makes it possible to obtain a bit stream from which specific parts can be removed to produce an output video with a lower (temporal or spatial) resolution and/or lower quality/fidelity.
This paper provides a performance analysis of the SVC extension of H.264/AVC for constrained scenarios. For this purpose, the open-source decoder called "Open SVC Decoder" was adapted to obtain a version suitable for implementation on reconfigurable architectures. For each scenario, a set of different sequences was decoded to analyze the performance of each functional block inside the decoder.
From this analysis we conclude that reconfigurable architectures are a suitable solution for an SVC decoder in a constrained device or for a specific range of scalability levels. Our proposal is an SVC decoder architecture that admits different options depending on device requirements, in which certain blocks are customizable to improve the decoder's hardware resource usage and execution time.
One of the most computationally intensive tasks in recent video encoders and decoders is the deblocking filter. Its computational complexity is considerable, and it may take more than 30% of the total computational cost of decoder execution. Nowadays, the main factors limiting real-time operation are related to memory and speed. To deal with these factors, this paper proposes a novel deblocking filter architecture that supports all filtering modes available in both the H.264/AVC and Scalable Video Coding (SVC) standards. It has been implemented as a hardware-scalable architecture that benefits from the parallelism and adaptability of the algorithm and can be adapted dynamically in FPGAs.
Regarding parallelism, the architecture mapping respects data dependencies among macroblocks (MBs) while several functional units (FUs) filter data in parallel. Regarding scalability, the architecture is flexible enough to adapt its performance to diverse environment demands; this is achieved by increasing or decreasing the number of FUs, as in a systolic array. In this sense, the paper presents a comparison between the proposed FU and state-of-the-art work.
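
The dependency pattern that permits this parallelism can be shown in a few lines: under the usual H.264-style rule that an MB needs its left and top neighbours filtered first, all MBs on one anti-diagonal are independent. The scheduler below is a generic wavefront illustration, not the paper's exact mapping.

    def wavefront_schedule(mb_cols, mb_rows, n_fus):
        """Group macroblocks into issue steps of at most n_fus parallel MBs."""
        steps = []
        for diag in range(mb_cols + mb_rows - 1):
            ready = [(x, diag - x) for x in range(mb_cols)
                     if 0 <= diag - x < mb_rows]       # one anti-diagonal
            for i in range(0, len(ready), n_fus):
                steps.append(ready[i:i + n_fus])
        return steps

    for step, mbs in enumerate(wavefront_schedule(6, 4, n_fus=3)):
        print(step, mbs)
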
The additional detail extraction offered by super-resolution image reconstruction (SRIR) algorithms greatly improves the results of spatial image augmentation, leading, where possible, to significant objective image-quality enhancement expressed as an increase in peak signal-to-noise ratio (PSNR). Nevertheless, providing hardware implementations of fusion SRIR algorithms capable of producing satisfactory output quality with real-time performance is still a challenge: to make the hardware implementation feasible, a number of trade-offs that compromise the outcome quality are needed.
In this work we tackle the problem of high resource requirements by using a non-iterative algorithm that facilitates hardware implementation. The algorithm execution flow is presented and described. The algorithm output quality is measured and compared with competitive solutions, including interpolation and iterative SRIR implementations. The tested iterative algorithms use frame-level motion estimation (ME), whereas the proposed algorithm relies on block-matching ME, which performs better. The comparison shows that the proposed non-iterative algorithm offers superior output quality for all tested sequences, while promising an efficient hardware implementation able to match, at least, the software implementations in terms of outcome quality.
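
A compact sketch of non-iterative fusion in the shift-and-add style is given below; it assumes motion vectors already produced by block-matching ME at HR-grid (sub-LR-pixel) precision, and it is a simplification for illustration, not the paper's exact flow.

    import numpy as np

    def shift_and_add(frames, motions, scale=2):
        """frames: list of HxW low-res arrays; motions: (dy, dx) per frame in
        HR-grid units (frame 0 is the reference with motion (0, 0))."""
        h, w = frames[0].shape
        acc = np.zeros((h * scale, w * scale))
        cnt = np.zeros_like(acc)
        for f, (dy, dx) in zip(frames, motions):
            # Integer LR shift is compensated; the sub-pixel remainder selects
            # which phase of the HR lattice this frame's samples populate.
            g = np.roll(f, (-(dy // scale), -(dx // scale)), axis=(0, 1))
            acc[dy % scale::scale, dx % scale::scale] += g
            cnt[dy % scale::scale, dx % scale::scale] += 1
        return acc / np.maximum(cnt, 1)   # average; unfilled phases stay zero
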
Graphics Processing Units (GPUs) have become a booster for the microelectronics industry. However, due to intellectual-property issues, there is a serious lack of information on the implementation details of the hardware architecture behind GPUs. For instance, the way texture is handled and decompressed in a GPU to reduce bandwidth usage has never been dealt with in depth from a hardware point of view. This work presents a comparative study of the hardware implementation of different texture decompression algorithms for both conventional (PCs and video game consoles) and mobile platforms.
Circuit synthesis is performed targeting both a reconfigurable hardware platform and a 90 nm standard-cell library. Area-delay trade-offs have been extensively analyzed, which allows us to compare the complexity of the decompressors and thus determine the suitability of the algorithms for systems with limited hardware resources.
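
As a worked example of what such a decompressor computes, below is a software decode of one S3TC/DXT1 (BC1) block, a representative fixed-rate scheme: two RGB565 endpoints plus sixteen 2-bit indices into a palette interpolated from them. (Whether DXT1 is among the algorithms compared in the paper is not stated here.)

    import struct

    def rgb565(v):
        """Expand a packed 5:6:5 color to 8-bit-per-channel RGB."""
        return ((v >> 11) << 3, ((v >> 5) & 0x3F) << 2, (v & 0x1F) << 3)

    def decode_dxt1_block(block8):
        """Decode one 8-byte DXT1 block into 16 RGB texels (4x4, row-major)."""
        c0, c1, idx = struct.unpack("<HHI", block8)
        p0, p1 = rgb565(c0), rgb565(c1)
        if c0 > c1:   # four-color mode: 1/3 and 2/3 interpolants
            pal = [p0, p1,
                   tuple((2 * a + b) // 3 for a, b in zip(p0, p1)),
                   tuple((a + 2 * b) // 3 for a, b in zip(p0, p1))]
        else:         # three-color mode plus a black/transparent entry
            pal = [p0, p1,
                   tuple((a + b) // 2 for a, b in zip(p0, p1)), (0, 0, 0)]
        return [pal[(idx >> (2 * i)) & 3] for i in range(16)]

In hardware, the per-channel divisions by 3 are the costly part, which is why area-delay comparisons of such decoders often hinge on replacing them with small shift-add approximations.
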
The evaluation of elementary functions can be performed with minimax polynomial approximations requiring simple hardware resources. The general method of computing an elementary function consists of three steps: range reduction, evaluation of the polynomial on the reduced argument, and range reconstruction. This approach allows a low-degree polynomial approximation, but range reduction and reconstruction introduce a computational overhead.
This work proposes an evaluation methodology without the range-reduction and range-reconstruction steps. Applications that need to compute elementary functions may benefit from avoiding these steps if the argument belongs to a sub-domain of the function; particularly in the context of embedded systems, digital-signal-processing applications most often require function evaluation within a specific interval. As a consequence of not doing range reduction, the degree of the approximating polynomials increases to maintain the required precision. Interval segmentation is an effective way to overcome this issue because the approximations are computed over smaller intervals. The proposed methodology uses non-uniform segmentation to mitigate the problem arising from not carrying out range reduction. The benefits of applying interval segmentation to the general evaluation technique are limited by the range-reduction and reconstruction steps, because the segmentation only applies to the approximation step; when used in the proposed methodology, however, it proves more effective.
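
A hedged sketch of the evaluation scheme follows: non-uniform segments over a restricted sub-domain, one low-degree polynomial per segment, evaluated in Horner form. Least-squares fits stand in here for the minimax approximations a real design would use, and the segment edges are illustrative.

    import numpy as np

    def build_table(f, edges, degree=2, samples=128):
        """Fit one polynomial per segment; edges are the segment boundaries."""
        return [np.polyfit(np.linspace(lo, hi, samples),
                           f(np.linspace(lo, hi, samples)), degree)
                for lo, hi in zip(edges[:-1], edges[1:])]

    def evaluate(x, edges, table):
        i = min(max(np.searchsorted(edges, x) - 1, 0), len(table) - 1)
        acc = 0.0
        for c in table[i]:            # Horner form: degree-many mul/add pairs
            acc = acc * x + c
        return acc

    edges = [0.0625, 0.125, 0.25, 0.5, 1.0]   # non-uniform: finer where log curves
    table = build_table(np.log, edges)
    print(evaluate(0.3, edges, table), float(np.log(0.3)))
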
Some elementary functions were implemented with the proposed methodology on an FPGA device. The metrics used to characterize the technique are area occupation and the corresponding latency. The results of each implementation without range reduction were compared with those of the general method using range reduction. They show that latency can be significantly reduced while the area remains approximately the same.
It is widely recognized that the welfare of the most advanced economies is at risk, and that the only way to tackle this situation is by building and managing knowledge economies. To achieve this ambitious goal, we need to improve the performance of each dimension of the "knowledge triangle": education, research and innovation. Indeed, recent findings point to the importance of value-adding and marketing strategies during R+D processes, so as to bridge the gap between the laboratory and the market and thus ensure the successful commercialization of new technology-based products. Moreover, in a global economy in which conventional manufacturing is dominated by developing economies, the future of industry in the most advanced economies must rely on its ability to innovate in those high-tech activities that can offer differential added value, rather than on improving existing technologies and products. It seems quite clear, therefore, that the combination of health (medicine) and nanotechnology in a new biomedical device is well placed to meet these requisites.
This work proposes a generic CMOS front-end self-powered in-vivo implantable biomedical device, based on a three-electrode amperometric biosensor approach, capable of detecting threshold values for targeted concentrations of pathogens, ions, oxygen, etc.
Given the speed with which diabetes can spread (it is the fastest-growing disease in the world), the nano-enabled implantable device for in-vivo biomedical analysis needs to be introduced into the global diabetes-care device market. In the case of glucose monitoring, detecting a threshold decrease in the glucose level is mandatory to avoid critical situations such as hypoglycemia. Although the case study reported in this paper is complex, because it involves multiple organizations and sources of data, it contributes to extending experience with best practices and models in nanotechnology applications and commercialization.
This paper presents a first approach to a multi-pathogen detection system for portable point-of-care applications built with discrete electronics. The main interest is the development of custom-built electronic solutions for bioelectronic applications, from discrete devices to ASIC solutions.
Aggressive CMOS scaling results in a significant increase of leakage current in MOS transistors manufactured in the deep-submicron regime. Consequently, low-power SRAM design becomes an important criterion in the design of VLSI circuits. In this work, a new six-transistor (6T) SRAM cell based on dual-threshold-voltage and dual-power-supply techniques is proposed for low-leakage SRAM design. The proposed cell has been compared to the conventional 6T SRAM cell using a 65 nm technology. Compared to the conventional cell, the new 6T SRAM cell reduces leakage power consumption by 72.6%. Furthermore, the proposed SRAM cell shows no area overhead and comparable read/write speed with respect to the conventional 6T SRAM cell.
Variable capacitors (varactors) are key components in many types of radio-frequency circuits, and high-quality varactors are thus essential to achieve high quality factors in these devices.
This work presents the results of a study on the variation of tuning range and quality factor when varying the depth and separation of N+ diffusions in a PN-junction varactor with a fixed number of cells. For testing, four types of cells with varying N+ and P+ diffusion geometries were designed. The varactors were formed by horizontally and vertically overlapping cells. Based on their implementation structure, the varactors were divided into two groups, each comprising four varactors. The varactors belonging to the first group have all N+ diffusions connected to the buried layer. Varactors from the second group use floating N+ diffusions and a buried N+ diffusion to separate pairs formed by two adjacent cells.
Post-implementation measurements show that the areas of the varactors in the first and second groups are 1795.74 μm² (51.9 × 34.6) and 1288.92 μm² (46.7 × 27.6), respectively. The varactors from the 1st group have a high tuning range, whereas those from the 2nd group have high quality factors and require less area.
Measuring, Detecting and Obscuring Defects and Effects
Verification is still the bottleneck of the complex digital-system design process. Formal techniques have advanced in their capacity to handle more complex descriptions, but they still suffer from memory or time explosion. Simulation-based techniques handle descriptions of any size or complexity, but their efficiency decreases as system complexity grows, because the number of simulation tests necessary to maintain coverage increases exponentially. Semi-formal techniques combine the advantages of simulation and formal techniques, as they increase the efficiency of simulation-based verification. In this area, several research works have introduced techniques that automate the generation of vectors driven by traditional coverage metrics. However, these techniques do not ensure the detection of 100% of faults.
This paper presents a novel technique for the generation of vectors. A major benefit of the technique is that it generates test benches more efficiently than techniques based on structural metrics. The technique is more efficient because it relies on a novel coverage metric that is more directly correlated with functional faults than structural coverage metrics (line, branch, etc.). The proposed coverage metric is based on an abstraction of the system as a set of polynomials, where all system behaviors are described by a set of coefficients. By assuming a finite precision of the coefficients and a maximum degree of the polynomials, all system behaviors, both correct and incorrect, can be modeled. The technique applies mathematical theories (computer algebra and number theory) to calculate the coverage and to generate vectors that maximize it.
Moreover, a tool implementing the technique has been developed. This tool takes a C-based system description and provides the coverage and the generated vectors as output.
System-level energy optimization of battery-powered multimedia embedded systems has recently become a design goal. The poor operational time of multimedia terminals makes computationally demanding applications impractical in real scenarios; for instance, so-called smart-phones are currently unable to remain in operation longer than several hours.
The OMAP3530 processor basically consists of two processing cores, a General Purpose Processor (GPP) and a Digital Signal Processor (DSP). The former, an ARM Cortex-A8 processor, is intended to run a generic Operating System (OS), while the latter, a DSP core based on the C64x+, has an architecture optimized for video processing.
The BeagleBoard, a commercial prototyping board based on the OMAP processor, has been used to test the Android Operating System and measure its performance. The board has 128 MB of external SDRAM, 256 MB of external Flash memory and several interfaces. The clock frequencies of the ARM and DSP OMAP cores are 600 MHz and 430 MHz, respectively.
This paper describes the energy-consumption estimation of the processes and multimedia applications of an Android v1.6 (Donut) OS on the OMAP3530-based BeagleBoard. In addition, tools for communication between the two processing cores have been employed, and a test bench to profile the OS resource usage has been developed.
As far as the energy estimates are concerned, the OMAP processor energy-consumption model provided by the manufacturer has been used. The model is basically divided into two components: the baseline core energy, which describes the energy consumption that is independent of any chip activity, and the module active energy, which describes the energy consumed by the active modules depending on resource usage.
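
A sketch of this two-component model is given below: total energy is the baseline (activity-independent) energy plus the sum of module active energies weighted by profiled resource usage. All power numbers are placeholders, not OMAP3530 datasheet values.

    # Two-component energy estimate: baseline + utilization-weighted module power.
    BASELINE_MW = 60.0                                        # placeholder
    MODULE_ACTIVE_MW = {"arm_core": 250.0, "dsp_core": 180.0, "sdram": 90.0}

    def energy_mj(duration_s, utilization):
        """utilization: fraction of time each module was active, from profiling."""
        active = sum(MODULE_ACTIVE_MW[m] * u for m, u in utilization.items())
        return (BASELINE_MW + active) * duration_s            # mW * s = mJ

    # e.g. a 10 s video-decode trace from the test-bench profiler
    print(energy_mj(10.0, {"arm_core": 0.35, "dsp_core": 0.80, "sdram": 0.50}))
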
Side-channel attacks (SCAs) differ from traditional mathematical attacks: they bypass exhaustive mathematical calculation and precisely target certain points of the cryptographic algorithm to reveal confidential information from running crypto-devices. Since the introduction of SCA by Paul Kocher et al. [1], it has been considered one of the most critical threats to resource-restricted but security-demanding applications such as wireless sensor networks (WSNs). In this paper, we focus on SCA-oriented security verification of WSNs. A detailed setup of the platform and an analysis of the results of differential power analysis (DPA, a power attack) and electromagnetic analysis (EMA, an electromagnetic attack) are presented. The setup shows how effective SCAs can be mounted at low cost, while the weaknesses of WSNs in resisting SCAs, especially EM attacks, are surveyed. Finally, SCA-prevention suggestions based on a differential security strategy for FPGA hardware implementations in WSNs are given, helping to reach an improved compromise between security and cost.
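
For reference, classic first-order DPA in the difference-of-means style is only a few lines: traces are partitioned by a predicted intermediate bit for each key guess, and the correct guess shows the largest mean difference. The S-box table and the trace source are placeholders for the real measurement setup.

    import numpy as np

    def dpa_score(traces, plaintexts, key_guess, sbox, bit=0):
        """traces: (n, samples) power measurements; plaintexts: n known bytes."""
        pred = (np.array([sbox[p ^ key_guess] for p in plaintexts]) >> bit) & 1
        return np.abs(traces[pred == 1].mean(axis=0)
                      - traces[pred == 0].mean(axis=0)).max()

    def recover_key_byte(traces, plaintexts, sbox):
        scores = [dpa_score(traces, plaintexts, k, sbox) for k in range(256)]
        return int(np.argmax(scores))    # guess with the largest DPA peak
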
With aggressive scaling, one of the major barriers that CMOS technology faces is increasing process variations. Variations in process parameters not only affect the performance of the devices but also degrade the parametric yield of the circuits. Adaptive repair techniques such as adaptive body bias have proved effective in mitigating variations in process parameters. In this paper, we evaluate the use of zone-based self-repairing techniques to mitigate the impact of process variations on SRAM cells. Two different techniques were evaluated and analyzed through extensive Monte Carlo simulations using a commercial 65 nm technology. The results demonstrate that improvements of up to 35.7% in the variability factor for leakage power and up to 22.3% in the design margin for leakage power can be achieved with the suggested approach.
We present a glitch-propagation model that can be used to categorize the propagation likelihood of a given noise waveform through a logic gate. This analysis is key to predicting whether a single-event transient (SET) induced within a combinational block can cause a single-event upset (SEU). The model predicts the output glitch characteristics given the input noise waveform for each gate in a 65 nm technology library. These noise transfer curves are fitted to known functions to obtain a simple analytical equation and compute the propagation. Comparison between simulations and the model shows good agreement.
The incorporation of Resonant Tunnel Diodes (RTDs) into III/V transistor technologies has shown improved circuit performance: higher circuit speed, reduced component count, and/or lower power consumption. Currently, the incorporation of these devices into CMOS technologies (RTD-CMOS) is an area of active research. Although some works have focused on evaluating the advantages of this incorporation, additional work in this direction is required. We compare RTD-CMOS and pure CMOS realizations of a network of logic gates that can be operated in a gate-level pipeline. Significantly lower average power is obtained for the RTD-CMOS implementations.