

Applied Mathematics & Information Sciences An International Journal

http://dx.doi.org/10.18576/amis/13S110

# Analysis and Design of Power Optimized Pipelined Processor Using Micrologic Elements

C. Aarthi<sup>1,\*</sup> and R. K. Gnanamurthy<sup>2</sup>

<sup>1</sup> Department of ECE, Sengunthar Engineering College, Tiruchengode, Tamilnadu, India.
 <sup>2</sup> Dhanalakshmi Srinivasan College of Engineering, Coimbatore, Tamilnadu, India.

Received: 2 May 2019, Revised: 22 Jul. 2019, Accepted: 27 Jul. 2019 Published online: 1 Aug. 2019

**Abstract:** In a pipelined processor, a delay buffer consists of gated clocks, a driver tree, and a memory unit, and several techniques are applied to reduce its power consumption. This paper proposes the design and analysis of a low-power delay buffer built from micrologic elements (MLEs). Delay buffers of different lengths are needed in pipelined CPU and FFT architectures. Several implementations of DET flip-flops, clock-gating C elements, shift registers, and buffers are compared with micrologic elements such as the F element, H element, S element, and B element. These micrologic elements have very low propagation delays, which makes them suitable for high-speed systems. A new simulation and optimization approach targeting power consumption is presented. The comparative analysis reveals the sources of performance and power-consumption bottlenecks in the different design styles. The implementation is carried out in standard cells of 0.18 $\mu$m CMOS technology. Both simulation and experimental results show reduced power consumption.

Keywords: Micrologic elements (MLEs), C-element, H-element, S-element, F-element, B-element.

## 1 Introduction

Portable communication devices have experienced explosive growth in recent years. Because consumer products keep adding features, designs must combine high device density, low power, and high speed. Wireless communication transceivers, such as an OFDM baseband receiver IC, need many types of signal-processing circuits with diverse functionalities. In these circuits, low power and low complexity are the major concerns. In many OFDM systems, delay buffers are used in the Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (IFFT) modules, in polar/rectangular coordinate conversion functions, and in Numerically Controlled Oscillators (NCOs) for transforming the signals [1-9].

The FFT and IFFT modules transform signals between the time domain and the frequency domain. They occupy a large portion of the circuit area and are responsible for a large fraction of the power consumption. Pipelined FFTs are a class of parallel algorithms with an amount of parallelism equal to $\log_r N$, where $N$ is the number of FFT points and $r$ is the radix. A pipelined implementation of the radix-2 single-path and multipath FFT consists of a series of computational blocks, each composed of delay lines, coefficient storage, commutators, multipliers, and adders.
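To make the $\log_r N$ relation concrete, the following minimal sketch counts the pipeline stages for a given FFT size and radix. The delay-line lengths assume a radix-2 single-path delay-feedback arrangement, in which stage $k$ uses a feedback delay of $N/2^{k+1}$ samples; the function names are hypothetical and the code is an illustration, not the architecture evaluated in this paper.

```python
def pipeline_stages(n_points, radix=2):
    """Return log_r(N), the number of stages of a radix-r pipelined FFT."""
    if n_points < radix or radix < 2:
        raise ValueError("need n_points >= radix >= 2")
    stages = 0
    remaining = n_points
    while remaining > 1:
        if remaining % radix != 0:
            raise ValueError("n_points must be a power of the radix")
        remaining //= radix
        stages += 1
    return stages


def r2sdf_delay_lengths(n_points):
    """Delay-line length per stage, assuming a radix-2 single-path
    delay-feedback pipeline (an assumption made for this sketch)."""
    stages = pipeline_stages(n_points, radix=2)
    return [n_points >> (k + 1) for k in range(stages)]


if __name__ == "__main__":
    print(pipeline_stages(1024, radix=2))
    print(pipeline_stages(1024, radix=4))
    print(r2sdf_delay_lengths(16))
```

Under this assumption the delay lines halve from stage to stage, which is why delay buffers of several different lengths are needed in one FFT pipeline.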

Previous delay buffer designs for pipelined processors include SRAM/shift-register-based delay buffers, pointer-based delay buffers, and delay buffers with gated clocks and double-edge-triggered flip-flops. Shift registers are too complex per stored bit to be used for long delay buffers, and the SRAM-based delay buffer suffers from high power consumption. The existing delay buffer design, which uses a C-element gated clock with a gated driver tree and look-ahead clock gating, has proved efficient in reducing propagation delay and power consumption, but it does not achieve good throughput because of metastability and differences in threshold voltage between devices. As an alternative, this paper presents the analysis, design, and development of micrologic-element-based FFT blocks [7].

A novel approach is adopted to reduce the power consumption: the H element, the counter-adapter (C) element, and the other micrologic elements are used instead of the Muller C element, shift registers, and DET FFs. The micrologic elements are sets of compatible, integrated logic building blocks. They are manufactured using the Fairchild planar epitaxial process, in which all the necessary transistors and resistors are diffused into a single silicon wafer. These epitaxial micrologic elements have very low propagation delay and are therefore used in high-speed systems [1].

<sup>\*</sup> Corresponding author e-mail: aarthi.c22@yahoo.com

In the proposed delay buffer, clock gating is provided by the H element and the F (flip-flop) element instead of the Muller C element and R-S flip-flop of the existing system. The flip-flop element consists of two basic RTL gate circuits internally cross-connected to form a bistable flip-flop storage unit. Reclocking of data can be done by the counter-adapter element, and the DET FF can be replaced by the milli-watt micrologic type D FF.

The applications of the micrologic elements are as follows:

- The B element is an inverting driver circuit. It is used for large fan-outs, line driving, etc.
- The C element provides gated complementary outputs from a single-valued input. It is used for reclocking data.
- The H element is a two-level AND/OR gate suitable for use as an exclusive OR gate. It gives useful half-adder functions directly.
- The S element is used widely in shift-register and counting applications.
- The G element is the basic NAND/NOR circuit from which all of the micrologic elements are constructed. Its four variations are the three-input gate, the four-input gate, the dual two-input gate, and the dual three-input gate element. These elements allow any logic function to be generated using gate elements exclusively.

These elements are designed for a wide variety of commercial and industrial equipment operating over a temperature range of $+15^{\circ}$C to $+55^{\circ}$C. They provide high fan-out (> 16), low power dissipation (< 3 mW/node), high speed, and high noise immunity. The implementation is carried out in standard cells of 0.18 $\mu$m CMOS technology. The measured results show an improvement over the existing system.

The rest of this paper is organized as follows. Section 2 introduces the micrologic element circuit design used for implementing delay buffers and explains the individual elements in detail. Section 3 describes the proposed delay buffer built from the micrologic elements. The experimental results and comparative analysis are presented in Section 4. Section 5 concludes the paper.

## 2 Micrologic Elements Circuit Design

The micrologic elements are sets of compatible, integrated logic building blocks. They are characterized by very low propagation delays, which makes them suitable for high-speed systems. The typical propagation delay for the basic RTL circuit is 12 ns. Epitaxial micrologic elements are available in two temperature ranges: the full range is $-55^\circ$C to $+125^\circ$C and the mid-range is $0^\circ$C to $+100^\circ$C. The elements provide high fan-out, low power dissipation, high speed, and high noise immunity [12]. In both the development and the use of these building blocks, emphasis is placed on the logic function to be performed. The following micrologic elements comprise the family:

- "B" element: buffer
- "H" element: half adder
- "C" element: counter adapter
- "F" element: flip-flop
- "S" element: half-shift register
- "G" element: gate

These are all the building blocks needed for logic functions [1, 2].

Fig. 1: Schematic illustration for buffer element

#### 2.1 B element (Buffer Element)

The micrologic buffer element is an inverting driver circuit used for large fan-outs, line driving, etc. Because of its very low source impedance, it can drive heavily loaded circuits, and it minimizes the rise-time deterioration caused by capacitive loading. This element is mainly used in multivibrators.

Positive and negative logic: $B_1, B_2 = \overline{A}$.

Because the element is an inverting driver circuit, the output is the complement of the input. Outputs $B_1$ and $B_2$ may not be used concurrently.

#### 2.2 H element (Half-adder Element)

The micrologic half-adder element is a multipurpose combination of three basic RTL circuits. It is a two-level AND/OR gate suitable for use as an exclusive OR gate. It gives the half-adder function directly and is also very useful for gating.

For positive logic: $E = C + D$ and $F = (A + B)(C + D)$.

For negative logic: $E = CD$ and $F = AB + CD$.

Fig. 2: Schematic illustration for half-adder element

Fig. 3: Schematic illustration for counter-adapter element

Fig. 4: Schematic illustration for F element
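The positive-logic and negative-logic expressions of the H element are De Morgan duals of each other, which can be checked exhaustively. The sketch below encodes the four expressions given above and verifies the duality over all input combinations. The exclusive-OR wiring at the end, which assumes that complemented inputs are available, is an illustrative assumption and not a connection taken from this paper.

```python
from itertools import product


def h_positive(a, b, c, d):
    """Positive-logic outputs (E, F): E = C + D, F = (A + B)(C + D)."""
    return bool(c or d), bool((a or b) and (c or d))


def h_negative(a, b, c, d):
    """Negative-logic outputs (E, F): E = CD, F = AB + CD."""
    return bool(c and d), bool((a and b) or (c and d))


def duals_agree():
    """Check that inverting all inputs and outputs of the positive-logic
    form reproduces the negative-logic form for every input pattern."""
    for bits in product([False, True], repeat=4):
        inverted = [not x for x in bits]
        e_pos, f_pos = h_positive(*inverted)
        if (not e_pos, not f_pos) != h_negative(*bits):
            return False
    return True


def xor_via_h(a, b):
    """Hypothetical wiring: F = AB + CD with inputs (a, not b, not a, b)."""
    return h_negative(a, not b, not a, b)[1]


if __name__ == "__main__":
    print(duals_agree())
    print([xor_via_h(a, b) for a, b in product([False, True], repeat=2)])
```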

#### 2.3 C element (Counter-adapter Element)

The micrologic counter-adapter element is a non-inverting gating RTL circuit. It provides gated complementary outputs from a single-valued input. It is mainly used for reclocking data.



Fig. 5: Schematic illustration for S element

#### 2.4 F element (Flip-flop Element)

The micrologic flip-flop element consists of two basic RTL circuits internally cross-connected to form a bistable flip-flop storage unit; it is used in active memory. The inputs and outputs are brought out externally. The internal state is changed by applying a positive input, so the unit should be regarded as a NOR circuit. Concurrent positive signals at both inputs can cause near-ground signals at both outputs.
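The behaviour described for the F element matches that of two cross-coupled NOR gates. The following behavioural sketch, written under that assumption, iterates the two gates until the outputs settle. The names `settle_latch`, `s`, `r`, `q`, and `q_bar` are hypothetical labels chosen for this example, and the model ignores electrical effects such as delay and metastability.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return int(not (a or b))


def settle_latch(s, r, q, q_bar, max_iterations=10):
    """Evaluate two cross-coupled NOR gates until the outputs stop changing.

    s, r       : set and reset inputs (1 = positive signal)
    q, q_bar   : current outputs, used as the starting state
    """
    for _ in range(max_iterations):
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            return q, q_bar
        q, q_bar = new_q, new_q_bar
    raise RuntimeError("latch did not settle")


if __name__ == "__main__":
    state = settle_latch(1, 0, 0, 1)    # positive input on s sets the latch
    print(state)
    state = settle_latch(0, 0, *state)  # both inputs low: state is held
    print(state)
    state = settle_latch(0, 1, *state)  # positive input on r resets the latch
    print(state)
    print(settle_latch(1, 1, *state))   # both inputs high: both outputs low
```

The last call reproduces the remark above that concurrent positive signals at both inputs drive both outputs to the near-ground level.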

#### 2.5 S element (Half-shift Register Element)

The micrologic half-shift register element is a gated input storage element having five basic RTL gate circuits. It is used widely in shift-register and in counting applications. The two cascaded internal logic levels change the state in response to near-ground input signals.

Negative logic: $A_1 = B_1 + A_0 P$ and $B_1 = A_1 + B_0 P$.

Positive logic: $A_1 = B_1(A_0 + P)$ and $B_1 = A_1(B_0 + P)$.

## 3 Proposed Delay Buffer

In the proposed delay buffer, several power-reduction techniques are implemented with micrologic elements. This section presents the delay buffer design based on the H element and the F element. These circuit techniques are mainly used to decrease the loading on high fan-out nets and to reduce the power consumption.

#### 3.1 Clock Gating

Previously, clock gating was provided by an R-S flip-flop and a Muller C element to reduce the loading on active clock drivers. In the proposed system, the Muller C element and the R-S flip-flop are replaced by the H element and the F element. The Muller C element uses more gates and more transistors than the H element, and the extra R-S flip-flops demand additional power. In addition, the DET FF is replaced by the milli-watt micrologic type D FF, which dissipates less power.

Fig. 6: Ring counter with clock gated by H elements

Fig. 7: Milli-watt $\mu$L elements

The proposed ring counter built from different micrologic elements is shown in Fig. 6. Each block contains one H element that controls the delivery of the local clock signal $CLK_{i,j}$ to the DET FFs, and only the gate signals along the path from the global clock source to that local clock signal are active. When the input of the last DET flip-flop in the previous block changes to "1", both inputs of the H element become "0"; the clock signal in the current block is then turned on and the remaining blocks are disabled. Additionally, the output terminal of the H element can drive up to five other micrologic element loads in parallel.

In Fig. 6, every eight DFFs in the ring counter are grouped into one block, and a gate signal is computed for each block. When the input of the last DET flip-flop in the previous block changes to "1", the first input becomes "1" and the second input becomes "0"; the clock signal in the current block is then turned on and the remaining blocks are disabled. Additionally, the output terminal of the F element can drive up to four other micrologic element loads in parallel.
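The effect of block-level clock gating on a one-hot ring counter can be illustrated with a small behavioural model. The sketch below is a simplification and does not model the H-element or F-element circuits: it assumes that a block is clocked while it holds the token, and that the following block is also clocked on the cycle in which the token is handed over. The function name and parameters are hypothetical.

```python
def simulate_ring_counter(num_ffs=32, block_size=8, cycles=32):
    """Shift a single '1' around a ring counter with block-level clock gating.

    Returns the final state plus the number of flip-flop clock events with
    and without gating (a rough proxy for clock power, not a measurement).
    """
    if num_ffs % block_size != 0:
        raise ValueError("num_ffs must be a multiple of block_size")
    num_blocks = num_ffs // block_size
    state = [0] * num_ffs
    state[0] = 1
    gated_events = 0
    ungated_events = 0
    for _ in range(cycles):
        token = state.index(1)
        block = token // block_size
        enabled = {block}
        # Assumed rule: wake the next block when the token reaches the
        # last flip-flop of the current block, so it can capture the token.
        if token % block_size == block_size - 1:
            enabled.add((block + 1) % num_blocks)
        next_state = list(state)
        for i in range(num_ffs):
            if i // block_size in enabled:
                next_state[i] = state[i - 1]  # index -1 wraps around the ring
        state = next_state
        gated_events += len(enabled) * block_size
        ungated_events += num_ffs
    return state, gated_events, ungated_events


if __name__ == "__main__":
    final_state, gated, ungated = simulate_ring_counter()
    print(final_state.index(1), gated, ungated)
```

In this model, flip-flops in disabled blocks simply hold their value, and the token still circulates correctly because those flip-flops would have stored "0" anyway.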

#### 3.2 Energy Dissipation in Flip-flops

The milli-watt micrologic D flip-flop is a complete, general-purpose storage element. It consumes much less power than the DET flip-flop in the existing system.

The state of input 2 is stored when input 1 changes from high to low; a subsequent change of input 2 while input 1 is low has no effect. Its applications are shift registers, counters, and control circuits. The power and delay comparison is summarized in Table 5 [1].
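The storage rule stated above can be written as a short behavioural model. In this sketch, `clk` stands for input 1 and `data` for input 2; these names, the class name, and the assumption that the stored value also stays unchanged while input 1 is high are illustrative choices, not details taken from the element's data sheet.

```python
class FallingEdgeDFlipFlop:
    """Behavioural model: store `data` when `clk` goes from high to low."""

    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def step(self, clk, data):
        """Apply one pair of input levels and return the stored value."""
        if self._prev_clk == 1 and clk == 0:
            self.q = data  # falling edge on input 1 captures input 2
        self._prev_clk = clk
        return self.q


if __name__ == "__main__":
    ff = FallingEdgeDFlipFlop()
    for clk, data in [(1, 1), (0, 1), (0, 0), (1, 0), (0, 0)]:
        print(clk, data, ff.step(clk, data))
```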

## 4 Performance Results

The proposed architecture is described in VHDL. A delay buffer based on the proposed techniques is designed using micrologic elements and implemented in 0.18 $\mu$m CMOS technology. The power consumption of the different micrologic elements is analyzed, and the results are summarized in Table 1.

Micrologic elements operate at bit rates in excess of 1 Mc (megacycle per second), which is a significant advance in the speed of such units. Typical power dissipations of 30 mW per unit permit high-density packaging without extraordinary thermal problems (the elements have a temperature range of $-55^{\circ}$C to $+125^{\circ}$C). The epitaxial micrologic is characterized by very low power consumption and low propagation delays, which makes it suitable for high-speed systems. The typical propagation delay for the basic RTL circuit is 12 ns.



Fig. 8: Micrologic elements versus power consumption



Fig. 9: Micrologic elements versus milli-watt micrologic elements

 Table 1: Comparison of power with different micrologic elements

| Sl.No | Micrologic elements            | Power Consumption |
|-------|--------------------------------|-------------------|
| 1     | H element(Half adder)          | 45 mW             |
| 2     | F element(flip-flop)           | 22 mW             |
| 3     | C element(Counter Adapter)     | 55 mW             |
| 4     | S element(Half Shift register) | 36 mW             |
| 5     | B element(Buffer)              | 24 mW             |
| 6     | A element (Adder)              | 12.5 mW           |
| 7     | G element(Gate)                | 15 mW             |
| 8     | Three input gate element       | 5 mW              |
| 9     | Four input gate element        | 5 mW              |
| 10    | Dual input gate element        | 10 mW             |
| 11    | Dual three input gate element  | 10 mW             |
| 12    | Expander gate element          | -                 |
| 13    | R element(Register)            | 15 mW             |
| 14    | JK flip-flop element           | 54 mW             |



Fig. 10: H-micrologic versus Muller C element

The power consumption of the different milli-watt micrologic elements, for comparison with the micrologic elements, is summarized in Table 2. The milli-watt micrologic (MW $\mu$L) elements are sets of compatible integrated building blocks intended for use under extreme environmental conditions, such as in space vehicles. They have very low power consumption and propagation delay.

Finally, Table 6 compares the proposed delay buffer with the existing delay buffer in terms of power consumption and delay. The milli-watt $\mu$L is characterized by very low propagation delay, and its power dissipation is typically 2 mW.

Table 2: Comparison of power with different milli-watt micrologic elements

| Sl.No | Milli-watt micrologic elements    | Power Consumption |
|-------|-----------------------------------|-------------------|
| 1     | MW $\mu$L B (Buffer) element      | 10 mW             |
| 2     | MW $\mu$L G (Gate) element        | 4 mW              |
| 3     | MW $\mu$L DG (Dual Gate) element  | 4 mW              |
| 4     | MW $\mu$L H (Half adder) element  | 8 mW              |
| 5     | MW $\mu$L D (D flip-flop) element | 15 mW             |
| 6     | MW $\mu$L Gate expander element   | -                 |

Table 3: Comparison of power and delay of H element and Muller C element

| Ring counter structure                                            | Power consumption in mW @ 2.5 V, 0.18 $\mu$ m | Delay in ns |
|-------------------------------------------------------------------|-----------------------------------------------|-------------|
| Gated clock ring counter using <i>H</i> element (Proposed system) | 45 mW                                         | 4.114 ns    |
| Clock gating using Muller C element (Existing system)             | 81 mW                                         | 5.776 ns    |

Table 4: Comparison of power and delay of clock gating S-R flip-flop and F element

| Ring counter structure           | Power consumption in mW @ 2.5 V, 0.18 $\mu$ m | Delay in ns |
|----------------------------------|-----------------------------------------------|-------------|
| Clock gating using F element     | 22 mW                                         | 4.208 ns    |
| Clock gating using S-R Flip flop | 81 mW                                         | 5.161 ns    |



Fig. 11: F-element versus S-R flip-flop



Fig. 12: Power and delay measurement of C element and S element

Table 5: Counter design using C element and S element

| Micrologic elements     | Power consumption in mW | Delay    |
|-------------------------|-------------------------|----------|
| Counter adapter element | 55 mW                   | 5.776 ns |
| S element               | 36 mW                   | 4.04 ns  |

**Table 6:** Comparison of measurement results with existing and proposed delay buffers

|                       | Power consumption in mW | Delay    |
|-----------------------|-------------------------|----------|
| Existing delay buffer | 97 mW                   | 6.897 ns |
| Proposed delay buffer | 82 mW                   | 4.101 ns |

#### 4.1 Comparison of Measurement Results

The measurement results are summarized in Tables 3, 4, 5 and 6. The accompanying figures show the power and delay of the proposed ring counter and of the gated-clock ring counter using the Muller C element and S-R flip-flop. Power consumption and delay are reduced by the clock gating provided by the H element and F element. The proposed designs using the H element and the F element dissipate only 45 mW and 22 mW, respectively, from a 2.5 V supply; Tables 3 and 4 summarize these measurements.
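The relative savings implied by Tables 3, 4 and 6 follow from simple arithmetic on the tabulated values. The sketch below computes the percentage reduction in power and delay for each comparison; the figures are copied from the tables, while the dictionary layout and function names are hypothetical.

```python
def percent_reduction(existing, proposed):
    """Reduction of `proposed` relative to `existing`, in percent."""
    return 100.0 * (existing - proposed) / existing


# (existing, proposed) pairs taken from Tables 3, 4 and 6.
MEASUREMENTS = {
    "Table 3: Muller C element vs H element": {
        "power_mw": (81.0, 45.0),
        "delay_ns": (5.776, 4.114),
    },
    "Table 4: S-R flip-flop vs F element": {
        "power_mw": (81.0, 22.0),
        "delay_ns": (5.161, 4.208),
    },
    "Table 6: existing vs proposed delay buffer": {
        "power_mw": (97.0, 82.0),
        "delay_ns": (6.897, 4.101),
    },
}


def summarize(measurements=MEASUREMENTS):
    """Return the percentage reduction per comparison and metric."""
    return {
        name: {
            metric: round(percent_reduction(*pair), 1)
            for metric, pair in row.items()
        }
        for name, row in measurements.items()
    }


if __name__ == "__main__":
    for name, row in summarize().items():
        print(name, row)
```

For the complete delay buffer (Table 6) this gives a power reduction of about 15.5% and a delay reduction of about 40.5%.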

Table 5 summarizes the power consumption and delay of the C element and S element. In a digital system it is often necessary to reclock data whose correct timing has been altered by propagation delays; a counter-adapter element can be used for this reclocking. The S element, or half-shift register, is a gated flip-flop; connecting these elements in cascade forms a shift register.

Fig. 13: Comparison of existing and proposed delay buffer with power and delay

## 5 Perspective

In this paper, the analysis and design of a pipelined processor using micrologic elements are presented by considering the delay buffer architecture of an FFT processor. Several implementations of DET flip-flops, clock-gating C elements, shift registers, and buffers are compared with micrologic elements (MLEs) such as the F element, H element, S element, and B element. These micrologic elements have very low propagation delays, which makes them suitable for high-speed systems. Clock gating of the ring counter using the H element and F element effectively eliminates excessive data transitions without increasing the loading.

The gated driver tree technique is used to distribute the clock without wasting power: applied to the clock distribution network, it eliminates the power wasted on drivers and decreases the loading of the input and output data buses.

The DET flip-flop can also be replaced with the milli-watt micrologic D flip-flop, which reduces the fan-out and the power consumption. The main advantages of micrologic elements are low production cost, high reliability, and circuit operation that is level-sensitive rather than rise-time- or frequency-sensitive. These devices are suitable for space vehicles under extreme environmental conditions.

A new simulation and optimization approach for the pipelined processor, targeting power consumption, is presented. The measurement results indicate that the proposed architecture consumes about 10% to 15% less power than the existing system (97 mW versus 82 mW in Table 6). The implementation is carried out in standard cells of 0.18 $\mu$m CMOS technology. Both simulation and experimental results show reduced power consumption.

#### References

- [1] G.W.A. Dummer, J. Mackenzie Robertson, American Microelectronics Data Annual, 1964–65 (1964).
- [2] Bo Lojek, History of Semiconductor Engineering, Springer, Berlin, Heidelberg, New York, ISBN-10 3-540-34257-5, (2007).
- [3] R.H. Norman, J.R. Nall, Micrologic elements, Microelectronics Reliability, Elsevier, 1(3), 251–254, (1962).
- [4] N. Gault, J. Nall, Solid state micrologic elements, IEEE Xplore Digital Library, International Electron Devices Meeting Oct 26–29, USA (1961).
- [5] Micrologic Handbook-SGS-Fairchild Ltd., South Ruislip, London (1964).
- [6] Lloyd Thayne (Martin Co., Denver, Colorado), Use of integrated circuitry in a digital system, Western Joint Computer Conference (1964).
- [7] Po-Chun Hsieh, Jing-Siang Jhuang, Pei-Yun Tsai and Tzi-Dar Chiueh, A low-power delay buffer using gated driver tree, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., 17(9), 1212–1219, (2009).
- [8] W. Eberle et al., 80-Mb/s QPSK and 72-Mb/s 64-QAM flexible and scalable digital OFDM transceiver ASICs for wireless local area networks in the 5-GHz band, IEEE J. Solid-State Circuits, 36(11), 1829–1838, (2001).
- [9] M.L. Liou, P.H. Lin, C.J. Jan, S.C. Lin, and T.D. Chiueh, Design of an OFDM baseband receiver with space diversity, IEE Proc. Commun., 153(6), 894–900, (2006).
- [10] W. Li and L. Wanhammar, A pipeline FFT processor, Proc. Workshop Signal Process. Syst. Design, Implement 1999, 654–662 (1999).
- [11] I.E. Sutherland, Micropipelines, Commun. ACM., 32(6), 720–738, (1989).
- [12] P.J. Beneteau, S. Jannazzo, Characteristics and applications of micrologic elements, Solid Circuits and Microminiaturization Proceedings, 81–101, June 1963–1964
- [13] R. Anderson, Testing of micrologic elements, International Workshop on Managing Requirements Knowledge, 75, 1961, doi:10.1109/AFIPS (1961).
- [14] Carl Ingemarsson, Petter Kallstrom, Fahad Qureshi, Oscar Gustafsson, Efficient FPGA mapping of pipeline SDF FFT cores, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 25(9), Sep (2017).





C. Aarthi completed her undergraduate degree (B.E.) in Electronics and Communication Engineering at Sri Ramakrishna Engineering College under Bharathiyar University, Coimbatore, and her postgraduate degree (M.E.) in VLSI Design under Anna University of Technology, Coimbatore. She is currently pursuing a Ph.D. in Electronics and Communication Engineering under Anna University, Chennai. She has more than 12 years of teaching experience and is currently an Associate Professor in the Department of Electronics and Communication Engineering, Sengunthar Engineering College, Tiruchengode, Tamilnadu, India. Her current research interests include low-power VLSI design, physical design of VLSI circuits, DSP, and FPGA-based system design.



R. K. Gnanamurthy completed his undergraduate degree (B.E.) in Electronics and Communication Engineering at Bharathiar University, Coimbatore, and his postgraduate degree (M.E.) in Microwave and Optical Engineering at Madurai Kamaraj University, Madurai. He received his Ph.D. in Information and Communication Engineering from Anna University, Chennai. He has more than 30 years of teaching experience and has worked in several institutions as Senior Lecturer, Assistant Professor, Professor and Head of the Department, and Principal. He is now Professor and Principal at Dhanalakshmi Srinivasan College of Engineering, Coimbatore, Tamil Nadu, India. He is a life member of the Indian Society for Technical Education and the Computer Society of India, and a member of IIIE, India. He is also a student member of the Institute of Electrical and Electronics Engineers (IEEE), USA. He serves as chairman and member of the boards of studies of various universities. His areas of specialization are wireless sensor networks and mobile computing. He has guided 27 PG students, and more than 13 students are doing their research under his guidance.