This paper presents adder circuits of various architectures aimed at reducing static power dissipation. Circuit
topologies for basic building blocks were evaluated for fabrication technologies of 65nm down to 32nm, and
simulation results are presented. This work has led to the development of various low-power adder circuits and
provides a comparative analysis leading to the recommendation that a variable-size block carry-select adder is the
best performer, taking into consideration both static and dynamic power dissipation.
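As an illustration of the recommended architecture, the following is a minimal behavioral sketch of a carry-select adder with variable block sizes. It models only the logic function, not power or delay, and the function names and block widths are assumptions chosen for the example rather than values from the paper.

```python
def ripple_add(a_bits, b_bits, carry_in):
    """Ripple-carry add two little-endian bit lists; return (sum_bits, carry_out)."""
    sum_bits = []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_select_add(a, b, block_sizes):
    """Add two unsigned integers with a carry-select structure.

    block_sizes lists the block widths from the least significant block
    upward. The widths are illustrative, not taken from the paper.
    Returns (sum modulo 2**width, carry_out).
    """
    width = sum(block_sizes)
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]

    result_bits = []
    carry = 0
    pos = 0
    for size in block_sizes:
        a_blk = a_bits[pos:pos + size]
        b_blk = b_bits[pos:pos + size]
        # Both candidate results are computed, as the parallel hardware would.
        sum0, cout0 = ripple_add(a_blk, b_blk, 0)
        sum1, cout1 = ripple_add(a_blk, b_blk, 1)
        # The incoming carry acts as the multiplexer select signal.
        if carry:
            result_bits.extend(sum1)
            carry = cout1
        else:
            result_bits.extend(sum0)
            carry = cout0
        pos += size

    total = sum(bit << i for i, bit in enumerate(result_bits))
    return total, carry


# 8-bit adder split into blocks of 2, 2 and 4 bits.
print(carry_select_add(200, 100, [2, 2, 4]))  # prints (44, 1), i.e. 300 = 256 + 44
```

Growing the block widths toward the most significant end is the usual motivation for variable sizing: later blocks have more time to finish before their select carry arrives.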
This paper introduces a new input-pattern-dependent model for total static leakage estimation in ultra-deep-submicron processes. The model integrates gate tunnelling leakage, gate-induced drain leakage (GIDL) and subthreshold leakage into a single leakage estimation framework. Subthreshold estimation is facilitated through the analytical estimation of nodal voltages between OFF transistors, while gate tunnelling leakage and GIDL are calculated from simplified versions of their respective BSIM4 equations. The framework handles all input patterns and accommodates scenarios where the various leakage currents interact. Similar approaches in the literature either rely on lookup tables and do not accommodate transistor stacks with varying widths, or are highly experimental and require detailed knowledge of the transistor device physics. Several approaches also exist for modeling either subthreshold leakage or gate tunnelling leakage separately. Even those rely on lookup tables, fix all widths in a transistor stack and/or limit the stack size to 2-3 transistors. The model proposed in this paper is tractable and almost completely analytical. It accommodates stacks of up to 4 transistors with varying transistor widths. A stack estimator function based on this model was coded in MATLAB for the 65nm, 45nm and 32nm PTM process technologies. Compared with SPICE simulations, the model exhibited average errors of 1.29%, 2.79%, 7.57% and 11.42% for stack sizes of 1, 2, 3 and 4 respectively across all three technologies. The model also exhibits significant runtime savings when compared with SPICE.
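To make the role of the nodal voltage between OFF transistors concrete, here is a rough sketch for a stack of two OFF nMOS transistors. It is not the paper's analytical model: it solves for the intermediate node numerically by bisection, uses a textbook subthreshold expression with a linear DIBL term, and ignores body effect, gate tunnelling and GIDL. All parameter values are placeholders, not PTM values.

```python
import math

VT_THERMAL = 0.0259  # thermal voltage kT/q near 300 K, in volts


def subthreshold_current(vgs, vds, width, vth0=0.3, n=1.5, eta=0.1, i0=1e-6):
    """Simplified subthreshold current with a linear DIBL term.

    The defaults (vth0, n, eta, i0) are placeholder values for
    illustration only; they are not fitted to any process.
    """
    vth = vth0 - eta * vds  # DIBL lowers the threshold as Vds rises
    return (i0 * width
            * math.exp((vgs - vth) / (n * VT_THERMAL))
            * (1.0 - math.exp(-vds / VT_THERMAL)))


def two_stack_leakage(vdd, w_top, w_bottom, tol=1e-12):
    """Return (node_voltage, leakage) for two stacked OFF nMOS devices.

    Both gates are at 0 V, the top drain is at vdd and the bottom source
    is grounded. The intermediate node settles where both currents match.
    """
    lo, hi = 0.0, vdd
    while hi - lo > tol:
        vx = 0.5 * (lo + hi)
        i_top = subthreshold_current(-vx, vdd - vx, w_top)
        i_bottom = subthreshold_current(0.0, vx, w_bottom)
        if i_top > i_bottom:
            lo = vx  # node is still charging up, so the solution lies higher
        else:
            hi = vx
    vx = 0.5 * (lo + hi)
    return vx, subthreshold_current(0.0, vx, w_bottom)


vx, i_stack = two_stack_leakage(vdd=1.0, w_top=2.0, w_bottom=1.0)
i_single = subthreshold_current(0.0, 1.0, 1.0)
print(f"node voltage = {vx:.4f} V, stack/single leakage = {i_stack / i_single:.3f}")
```

Because the widths enter each current separately, the sketch handles unequal widths in the stack, which is the case the abstract says lookup-table methods do not cover.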
This paper presents a novel approach for technology partitioning in a library-free paradigm based on the use of virtual
cells. Previous methods for library-free logic partitioning rely on creating the largest possible partitions under
user-defined criteria, predominantly the stack length of the transistor-level implementation. However, these methods can
cause conflicting structures, defying the AND-OR-INVERT (AOI) and OR-AND-INVERT (OAI) representations that
are used as templates for the virtual cells. The Complementary Logic Partitioning (CLP) algorithm defines a partition
as consisting of only two hierarchical levels of complementary nodes (AND and OR), and uses the logical effort
model to migrate inputs so that the partitions both meet the user-defined limiting criteria and minimize
the delay of the inputs. The CLP algorithm is compared against Synopsys' Design Compiler using the Artisan standard cell
library for a set of MCNC '91 benchmarks. Preliminary simulation results based on TSMC's 0.18 micron CMOS technology show that a reduction of more than 50% in the critical path delay can be achieved with CLP.
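The logical effort model mentioned above estimates each stage delay as d = g·h + p, in units of the technology time constant. The sketch below uses the textbook effort values for inverters, NAND and NOR gates (assuming a 2:1 pMOS-to-nMOS width ratio); the abstract does not give CLP's actual cost function, so the helper names and the example path are assumptions.

```python
from fractions import Fraction

# Each gate is described as (logical effort g, parasitic delay p).
INVERTER = (Fraction(1), 1)


def nand(n):
    """Textbook logical effort and parasitic delay of an n-input NAND."""
    return Fraction(n + 2, 3), n


def nor(n):
    """Textbook logical effort and parasitic delay of an n-input NOR."""
    return Fraction(2 * n + 1, 3), n


def stage_delay(gate, electrical_effort):
    """Delay of one stage, d = g * h + p, in units of tau."""
    g, p = gate
    return g * electrical_effort + p


def path_delay(stages):
    """Sum the stage delays along a path of (gate, electrical_effort) pairs."""
    return sum(stage_delay(gate, h) for gate, h in stages)


# A 4-input NAND driving four times its input capacitance.
print(stage_delay(nand(4), 4))                    # prints 12
# A NAND2 followed by a NOR2, each with electrical effort 2.
print(path_delay([(nand(2), 2), (nor(2), 2)]))    # prints 10
```

Both g and p grow with the number of series inputs, which is why stack length is a natural limiting criterion when forming partitions.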
Library-free logic synthesis is an innovative approach that provides fully customized design performance while avoiding the huge cost of developing and maintaining extensive cell libraries. Its strength comes from the use of a virtual library based on on-the-fly cell generation. However, the flexibility of the virtual library makes it impossible to exploit existing methodologies that are based on pre-characterized standard cell libraries. The authors developed a creative approach to map the design into customized CMOS complex gates using the virtual library technique. This is a timing-driven process consisting of four phases: logic transformation, logic partitioning, gate mapping and transistor re-ordering. The performance of the CMOS complex gates and the logic path derived from the extracted transistor topology are used to guide the synthesis process. The proposed mapping algorithm was used in combination with our topology-based performance estimation model to synthesize some of the MCNC91 benchmarks. The results show that our algorithm can achieve a 42% improvement in area and a 43% improvement in power compared with the same designs synthesized by Synopsys' Design Analyzer.