This paper introduces a defect-tolerant 64-bit Sklansky prefix adder, designed to increase its
reliability and extend its lifetime in the presence of hard faults. We consider defect tolerance for early
transistor wear-out by exploring the design of fine-grained reconfigurable logic. The approach involves enabling
spare processing elements to replace defective elements. Power gating techniques are used to disable faulty logic
blocks and enable spare logic. Minimum-sized transistors are used for spare processing elements to reduce area
overhead and simplify the reconfiguration interconnect.
The design is compared to a baseline, non-repairing design in terms of area overhead, power consumption,
and performance in the fault-free and faulty cases.
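
As an illustration of the carry structure a Sklansky prefix adder computes, the following is a minimal behavioural sketch in Python. It models only the generate/propagate prefix tree; the spare elements, power gating, and reconfiguration described above are not modelled, and the function name `sklansky_add` and its `width` parameter are assumptions made for this example rather than names from the paper.

```python
def sklansky_add(a, b, width=64):
    """Add two unsigned integers using a Sklansky-style prefix carry tree.

    Returns (sum modulo 2**width, carry_out). Hypothetical helper for
    illustration only; carry-in is assumed to be 0.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    # Per-bit generate (g) and propagate (p) signals.
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    half_sum = list(p)  # keep the original propagate bits for the sum

    # At each level, every index in the upper half of a block of size
    # 2**(level + 1) combines with the top index of the lower half.
    level = 0
    while (1 << level) < width:
        for i in range(width):
            if (i >> level) & 1:
                j = ((i >> level) << level) - 1  # top of the lower half
                g[i] = g[i] | (p[i] & g[j])
                p[i] = p[i] & p[j]
        level += 1

    # After the tree, g[i] is the carry out of bit position i.
    total = 0
    carry_in = 0
    for i in range(width):
        total |= (half_sum[i] ^ carry_in) << i
        carry_in = g[i]
    return total, carry_in
```

The in-place update is safe because the index `j` read at a given level always lies in the lower half of its block, which is not modified at that level.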
This report presents an open-source VHDL description of a 64-bit MIPS-based processor. The pipeline can
execute most instructions from the MIPS III instruction set architecture (ISA). The full pipeline is made available
to digital VLSI engineers as a platform to test cell designs as a part of a complete computing system. The pipeline
is an 8-stage RISC design based on the MIPS R4000 series of processors, and includes common arithmetic operations
on 32- and 64-bit operands and full IEEE 754 floating-point support. This report describes the architecture
and components of the MIPS-based processor.
A new design of a hardware accelerator for RSA cryptography is described. The accelerator performs long integer
(1024-bit) modular exponentiation using the Residue Number System (RNS). It is implemented on an FPGA
and interfaced to a host PC via the PCI bus. The accelerator uses the RNS to break the long operands into
short channels that are processed in parallel. The performance of this architecture is evaluated and the potential
for its further improvement is discussed.
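
The idea of breaking long operands into short, independent channels can be sketched as follows. This is a small Python illustration of RNS multiplication with Chinese Remainder Theorem reconstruction, not the accelerator's design: the moduli in `MODULI` are toy values chosen for this example, the helper names are hypothetical, and modular exponentiation itself (which would additionally need a modular reduction scheme inside the RNS) is not shown.

```python
from math import prod

# Toy pairwise-coprime moduli, assumed for illustration only.
MODULI = (251, 253, 255, 256)


def to_rns(x, moduli=MODULI):
    """Convert an integer to its residues, one per channel."""
    return tuple(x % m for m in moduli)


def rns_mul(xs, ys, moduli=MODULI):
    """Multiply channel by channel; no carries cross between channels."""
    return tuple((x * y) % m for x, y, m in zip(xs, ys, moduli))


def from_rns(residues, moduli=MODULI):
    """Reconstruct the integer in [0, prod(moduli)) using the CRT."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return total % M
```

Each channel operates on short residues only, so in hardware the channels can be processed in parallel; the result is exact as long as the true product is smaller than the product of the moduli.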
Low-density parity-check decoders use computation nodes with multioperand adders on their critical path. This
paper describes the design of estimating multioperand adders to reduce the latency, power and area of these
nodes. The new estimating adders occasionally produce inaccurate results. The effect of these errors and the
subsequent trade-off between latency and decoder frame error rate are examined. For the decoder investigated, it
is found that the estimating adders do not degrade the frame error rate.
Although multiplication and addition can be implemented very efficiently in a Residue Number System (RNS), scaling (division by a constant) is much more computationally complex. This limitation has prevented wider adoption of RNS. In this paper, different RNS scaling schemes are surveyed and compared. It is found that scaling in RNS has been performed with the aid of conversions to and from RNS, base extensions between modulus sets, and redundant RNS channels. Recent advances in RNS scaling theory have reduced the overhead of such measures, but RNS scaling still falls short of the ideal: a simple operation performed entirely within the RNS channels.
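
To illustrate why scaling does not stay within the channels, the sketch below scales an RNS value by one of its own moduli. It is a textbook-style Python example under assumed toy moduli, not one of the surveyed schemes, and the function name `scale_by_modulus` is hypothetical. The scaled value is obtained directly in every channel except the one being divided out, which is exactly where a base extension (or a conversion out of RNS) would be needed.

```python
def scale_by_modulus(residues, k, moduli):
    """Compute floor(x / moduli[k]) in every channel except channel k.

    x - x_k is an exact multiple of m_k, so in channel i the quotient is
    (x_i - x_k) * m_k^{-1} mod m_i. Channel k itself cannot be computed
    this way because m_k has no inverse modulo m_k.
    """
    mk = moduli[k]
    xk = residues[k]
    scaled = {}
    for i, (r, m) in enumerate(zip(residues, moduli)):
        if i == k:
            continue  # would require a base extension; not shown here
        scaled[i] = ((r - xk) * pow(mk, -1, m)) % m  # Python 3.8+
    return scaled
```

For example, with the assumed moduli `(7, 11, 13)` and `x = 500`, scaling by 7 gives the residues of 71 in the channels for 11 and 13, while the channel for 7 is left undetermined.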
A new hardware architecture is described to perform multiplication and modular multiplication with a modulus of variable wordlength. It is intended for a microprocessor datapath to support efficient implementation of long wordlength operations using the residue number system.
Reconfigurable Circuit (RC) platforms can be configured to implement complex combinatorial and sequential logic. In this paper we investigate various RC technologies and discuss possible methods to optimise their power, speed and area. To address the drawbacks of existing RC technologies, we propose a generic architecture we call "OFRL" (On-the-Fly Reconfigurable Logic). Our objective is to provide a low-power, high-speed platform for reconfigurable circuit and dynamically reconfigurable logic applications that uses fewer transistors than existing technologies.
Recently, decimal arithmetic has become attractive in the financial and commercial world, including banking, tax calculation, currency conversion, insurance and accounting. Although computers still carry out decimal calculations using software libraries and binary floating-point numbers, it is likely that in the near future all processors will be equipped with units that perform decimal operations directly on decimal operands. One critical building block for some complex decimal operations is the decimal carry-free adder. This paper discusses the mathematical framework of the addition, introduces a new signed-digit format for representing decimal numbers and presents an efficient architectural implementation. Delay estimation analysis shows that the adder offers improved performance over earlier designs.