VLSI Case Study

VLSI | Verilog HDL | Digital Design

This case study analyzes the performance of different 32-bit pipelined CPU designs and compares the delay, power consumption, and processing frequency of several full-bit adder architectures. It also proposes a new ALU architecture and evaluates it against the existing full-adder designs, with the goal of improving the CPU's speed and performance.


Overview

The purpose of this case study is to better understand the performance of different 32-bit pipelined central processing unit (CPU) designs by comparing several full-bit adders in terms of delay, power consumption, and processing frequency. Because the goal is a fast, high-performing CPU, a new arithmetic logic unit (ALU) architecture is also implemented and compared with the existing full-adder designs. A pipelined CPU handles several instructions at once, and this design uses 32-bit words, so the architecture that carries out many computations concurrently is critical to both front-end and back-end design optimization. The CPU is synchronized by an external clock, and other signals direct the operations and operands for the ALU and the instructions that address the memory file. The RTL design was written in Verilog HDL, and the Cadence Virtuoso, Xcelium, Genus, and Innovus environments were used to compile, build, and simulate the design.

Description

  1. Full-Bit Adder Circuits

    • This case study compares four types of full-bit adder circuits: carry ripple adder (CRA), carry lookahead adder (CLA), carry skip adder (CSA), and carry select adder (CSeA). Adders are essential components of a 32-bit CPU, and each design is evaluated on its delay, power consumption, and speed. All four designs are exercised with the same Verilog test bench; they differ in topology and in how they process carries, with the aim of finding the best fit for the ALU (arithmetic logic unit) block.

      1. Carry Ripple Adder (CRA)
        • The Carry Ripple Adder (CRA) is the simplest of the four designs. It consists of n one-bit full adders connected in a cascade, where the carry output of one stage becomes the carry input of the next stage. Its main advantage is simplicity, since it requires minimal hardware. However, the carry must propagate through the stages one after another, so in the worst case a 32-bit addition waits for the carry to pass through all 32 stages. As a result, the CRA may not be suitable for applications that require fast addition.
      2. Carry Lookahead Adder (CLA)
        • The Carry Lookahead Adder (CLA) improves on the CRA by adding logic that computes the carry signals in parallel from per-bit generate and propagate terms, rather than waiting for sequential carry propagation. This significantly reduces the carry propagation delay, so addition is faster than with the CRA. However, the additional logic makes the hardware more complex and increases the area and power requirements of the design, which may also raise manufacturing costs.
      3. Carry Skip Adder (CSA)
        • The Carry Skip Adder (CSA) divides the operands into blocks, each built from a short CRA chain. Each block also computes a block-propagate signal: when every bit position in a block would pass its carry through, skip logic routes the incoming carry directly to the next block instead of letting it ripple through the block. This shortens the worst-case carry path relative to a plain CRA while adding less logic than a CLA. However, the block division and skip logic increase the overall complexity, area, and power consumption of the adder, and the skip logic itself adds delay that can affect critical timing paths.
      4. Carry Select Adder (CSeA)
        • The Carry Select Adder (CSeA) resembles the CSA in that it also divides the operands into blocks. In addition, each block computes its sum twice in parallel, once with a hard-coded carry-in of 0 and once with a hard-coded carry-in of 1, so no block has to wait for the carry from the previous block before starting its addition. A multiplexer layer then selects the correct sum and carry-out once the actual carry-in is known, which shortens the critical path. However, the duplicated adders and the multiplexer layer increase the complexity, area, and power consumption of the CSeA design.
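The four carry schemes described above can be compared side by side in a small behavioral model. The sketch below is written in Python rather than Verilog and is not the RTL used in the case study; it only illustrates how each topology derives its carries. The function names, the helper `blocks`, and the 4-bit block size `k` are assumptions made for illustration.

```python
def bits(x, n):
    """Little-endian list of the low n bits of x."""
    return [(x >> i) & 1 for i in range(n)]

def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ cin, (a & b) | (cin & (a ^ b))

def ripple_add(a, b, cin=0, n=32):
    """CRA: the carry out of stage i is the carry in of stage i + 1."""
    total, carry = 0, cin
    for i, (ai, bi) in enumerate(zip(bits(a, n), bits(b, n))):
        s, carry = full_adder(ai, bi, carry)
        total |= s << i
    return total, carry

def lookahead_add(a, b, cin=0, n=32):
    """CLA: each carry is a flat function of generate, propagate, and cin."""
    g = [x & y for x, y in zip(bits(a, n), bits(b, n))]  # generate
    p = [x ^ y for x, y in zip(bits(a, n), bits(b, n))]  # propagate
    carries = [cin]
    for i in range(1, n + 1):
        # c_i = g[i-1] | p[i-1]g[i-2] | ... | p[i-1]...p[0]cin
        # (no dependence on c_(i-1), so hardware can evaluate all c_i at once)
        c = cin & int(all(p[:i]))
        for j in range(i):
            c |= g[j] & int(all(p[j + 1:i]))
        carries.append(c)
    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[n]

def blocks(a, b, n, k):
    """Yield (offset, width, a_block, b_block) for k-bit slices (k is assumed)."""
    for lo in range(0, n, k):
        w = min(k, n - lo)
        mask = (1 << w) - 1
        yield lo, w, (a >> lo) & mask, (b >> lo) & mask

def skip_add(a, b, cin=0, n=32, k=4):
    """CSA: ripple inside a block; bypass the block if every bit propagates."""
    total, carry = 0, cin
    for lo, w, ab, bb in blocks(a, b, n, k):
        s, ripple_carry = ripple_add(ab, bb, carry, w)
        all_propagate = (ab ^ bb) == (1 << w) - 1
        carry = carry if all_propagate else ripple_carry  # skip multiplexer
        total |= s << lo
    return total, carry

def select_add(a, b, cin=0, n=32, k=4):
    """CSeA: each block precomputes cin=0 and cin=1 results, then selects."""
    total, carry = 0, cin
    for lo, w, ab, bb in blocks(a, b, n, k):
        s0, c0 = ripple_add(ab, bb, 0, w)  # assumes carry-in of 0
        s1, c1 = ripple_add(ab, bb, 1, w)  # assumes carry-in of 1
        s, carry = (s1, c1) if carry else (s0, c0)  # select multiplexer
        total |= s << lo
    return total, carry

if __name__ == "__main__":
    for add in (ripple_add, lookahead_add, skip_add, select_add):
        print(add.__name__, add(0xFFFFFFFF, 0x00000001))
```

Every function returns a `(sum, carry_out)` pair, so adding 0xFFFFFFFF and 1 yields `(0, 1)` for all four. A software model like this shows only that the topologies are functionally equivalent; the delay, area, and power differences discussed above appear only after synthesis and layout.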
  2. New ALU Architecture

    • The new ALU architecture aims to optimize the performance, speed, and delay of operations within a CPU. It introduces a 32-bit comparator and a 4x2 multiplexer to enhance the functionality of the ALU. The primary objective of this architecture is to determine the comparative relationship between two 32-bit words stored in registers A and B.
    • The 32-bit comparator is a key component of the new ALU architecture. It compares each corresponding pair of bits from the two input words (A and B) and generates binary outputs based on the comparison results. The comparator is implemented using a series of 2-bit comparators, allowing for simultaneous comparison of each bit. This approach enables efficient comparison operations within the ALU.
    • The comparator's output provides information about the relative magnitudes of the two input words. It generates the following output cases:
      1. If A = B: The output of the comparator is "10", indicating that the two input words are equal.
      2. If A > B: The output of the comparator is "01", indicating that the value stored in register A is greater than the value stored in register B.
      3. If A < B: The output of the comparator is "00", indicating that the value stored in register A is less than the value stored in register B.
    • The 32-bit comparator supports applications such as sorting, searching, and decision-making. By providing a straightforward and efficient way to compare 32-bit words, it enhances the overall functionality and versatility of the ALU.
    • In addition to the comparator, the new ALU architecture includes a series of multiplexers arranged in five layers. The number of multiplexers decreases from one layer to the next until layer 0, which contains a single multiplexer. The multiplexers select the appropriate inputs and determine the outputs based on the control signals and the comparison results from the 32-bit comparator. This lets the ALU perform arithmetic calculations, logical operations, and data manipulation according to the defined control logic.
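The comparator output codes and the layered multiplexer arrangement described above can be sketched as a behavioral model. As before, this is Python rather than the Verilog RTL and is only an illustration. Several details are assumptions not stated in the case study: the comparison is treated as unsigned, the five layers are taken to hold 16, 8, 4, 2, and 1 two-input multiplexers, and the tree is driven by a hypothetical 5-bit select value. The names `compare32`, `mux2`, and `mux_tree` are also hypothetical.

```python
def compare32(a, b, n=32):
    """Return the 2-bit code: "10" if a == b, "01" if a > b, "00" if a < b.

    Assumes unsigned operands. Bits are examined from the most significant
    end; the first position where the words differ decides the result.
    """
    equal_so_far, greater = 1, 0
    for i in reversed(range(n)):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        greater |= equal_so_far & ai & (bi ^ 1)  # a has 1 where b has 0
        equal_so_far &= (ai ^ bi) ^ 1            # stays 1 while bits match
    if equal_so_far:
        return "10"
    return "01" if greater else "00"

def mux2(d0, d1, sel):
    """Two-input multiplexer: d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

def mux_tree(inputs, sel):
    """Reduce 32 inputs to one through five layers of 2:1 multiplexers.

    Returns (selected_value, multiplexers_per_layer). The layer sizes and
    the 5-bit select are assumptions made for this sketch.
    """
    assert len(inputs) == 32
    layer, sizes = list(inputs), []
    for level in range(5):                      # layers 4, 3, 2, 1, 0
        s = (sel >> level) & 1                  # one select bit per layer
        layer = [mux2(layer[i], layer[i + 1], s)
                 for i in range(0, len(layer), 2)]
        sizes.append(len(layer))
    return layer[0], sizes

if __name__ == "__main__":
    print(compare32(7, 7), compare32(9, 7), compare32(7, 9))
    print(mux_tree(list(range(100, 132)), 19))
```

With these assumptions, the first line prints `10 01 00` and the tree returns input 19 together with the layer sizes `[16, 8, 4, 2, 1]`. How the comparator code actually feeds the select lines depends on the control logic of the real design, which is not modeled here.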