RTL Guidelines for Static Power Reduction

FACULDADE DE ENGENHARIA DA UNIVERSIDADE DO PORTO


Ciro de Moura Monteiro

Integrated Master in Electrical and Computer Engineering

Synopsys Supervisor: Hélder Silva

FEUP Supervisor: José Carlos Alves


Resumo (Summary)

Nowadays, with the growth of portable devices powered by batteries of limited capacity, good energy management is important to guarantee the longest possible operating time. Dynamic energy consumption was once one of the main considerations in low-power circuit design, but today some dynamic power reduction techniques are applied automatically by the tools.

Integrated circuits can be manufactured with ever more transistors, and even with smaller transistors. However, this also brings the problem of increased leakage currents. One possible approach to reducing the static power consumed by the leakage currents of these circuits is called power gating.

Power gating consists of using transistors as switches to turn the power supply of parts of an integrated circuit on and off. Header transistors or footer transistors can be used for this purpose, each with its own advantages and disadvantages.


Abstract

Today we are witnessing a growth in battery operated portable devices, which require smart power choices because of their limited battery life. Dynamic power consumption has been a major consideration when designing power aware devices, but some dynamic power savings are already introduced automatically by the design tools.

Technology has evolved towards smaller transistors, which enables chips with higher transistor density. With this technology new problems arise, such as higher leakage currents. These may or may not be resolved by power reduction techniques, as not all of them are leakage oriented. One possible solution to this problem is the power gating technique.

Power gating consists of using switching transistors to control the power supply of certain areas of the circuit. This power reduction technique allows the use of header or footer transistors, each with its own benefits and disadvantages.


Acknowledgements

I would like to express my thanks...

In particular to Hélder Silva and Athul Stripad, for following my work every week and helping with some important decisions.

To Professor José Carlos Alves, for supervising this work and helping me make decisions. To Nelson Eira, for his patience in explaining to me how the scripts used by the implementation environment work.

To Synopsys, for the opportunity to carry out this dissertation project in a company environment.

To my family, for helping me grow.

To Susana Carvalho, for helping me choose my specialisation, which I came to enjoy. To Inês Teixeira and Gabriel Ribeiro, for helping me with my English.

To all my other friends, for their patience in putting up with me.

And finally to FEUP and its professors, for helping with my education.

Ciro de Moura Monteiro


“Laugh and the world laughs with you. Snore and you sleep alone”

Anthony Burgess


List of Figures

2.1 High power consumption
2.2 Low power consumption
2.3 Power consumption on a CMOS inverter. Source: [1]
2.4 Inverter
2.5 Clock Gating Cell
2.6 Summary of leakage currents of deep-submicrometer transistors. Source: [2]
2.7 Sub-threshold leakage path in a CMOS inverter
2.8 Header switching
2.9 Footer switching
2.10 Fine grain and cell
2.11 Fine grain and cell with isolation clamp transistor
2.12 Fine grain header switching and cell with isolation clamp transistor
2.13 Companies involved in IEEE P1801 working group. Source: [3]
2.14 Example of power domains
2.15 Isolation cell between power domains
2.16 State Retention Isolation. Source: [4]
2.17 Power domain with heterogeneous fan-out
2.18 Retention register. Source: [4]
2.19 Enable Level Shifter Example
2.20 Header switch cell
2.21 Isolation on input of heterogeneous fan-out
2.22 Redundant isolation
2.23 Output isolation
3.1 UPF tool flow. Source: [5]
3.2 Design flow for multi-voltage, power gated designs. Source: [4]
4.1 Basic representation of the module used
4.2 Function State Machine
4.3 Block representation of the power domains


List of Tables

2.1 Main parameter for the seven-metal-layer 90-nm CMOS technology node. Source: [6]
2.2 Example of PST
4.1 EDMA power state table
4.2 Edma power state table
5.1 Power during activity
5.2 Power consumption related to current implementation, during activity
5.3 Power consumption during full simulation
5.4 Power consumption for a full simulation, write traffic
5.5 Power consumption for a full simulation, read traffic
5.6 Relative power consumption for a full write simulation


Symbols and Abbreviations

CAD: Computer-Aided Design
CPF: Common Power Format
CTS: Clock Tree Synthesis
DC: Design Compiler
DFT Compiler: Design-for-test Compiler
DMA: Direct Memory Access
DRC: Design Rule Check
DVE: Debugging and Visualisation Environment
DVFS: Dynamic Voltage and Frequency Scaling
DVS: Dynamic Voltage Scaling
EDA: Electronic Design Automation
eDMA: Embedded DMA
FET: Field Effect Transistor
GIDL: Gate Induced Drain Leakage
HDL: Hardware Description Language
IC: Integrated Circuit
IEEE: Institute of Electrical and Electronics Engineers
IoE: Internet of Everything
IoT: Internet of Things
IP: Intellectual Property
IR drop: Voltage drop due to energy losses in a resistive path
MOS: Metal–Oxide–Semiconductor
MTCMOS: Multi-Threshold CMOS
MVSIM: Multi-Voltage Simulation
NLP: Native Low Power
NMOS: N channel MOSFET
PG: Power and Ground
PMU: Power Management Unit
PST: Power State Table
PVT: Process, Voltage, Temperature
QoR: Quality of Results
RCE: Regression Control Environment
RTL: Register Transfer Level
SAIF: Switching Activity Interchange Format
SDC: Synopsys Design Constraints
SI: International System of Units (Système international d’unités)
SI2: Silicon Integration Initiative
SoC: System on a Chip
SPEF: Standard Parasitic Exchange Format
TCL: Tool Command Language
UPF: Unified Power Format
UVM: Universal Verification Methodology
VC LP: Verification Compiler Low Power
VCD: IEEE Standard 1364-1995, Value Change Dump
VCS: Verilog Compiled code Simulator
VHDL: VHSIC Hardware Description Language
VHSIC: Very High Speed Integrated Circuit
VIP: Verification IP/Verification Link Partner
VLSI: Very-Large-Scale Integration
VPD: VCD Plus
VTB: Verilog Test Bench
WWW: World Wide Web


Concepts

Clock Gating: Technique to reduce clock activity
Power Density: Power consumed per unit area
Power Gating: Cutting power using a switch
Power Island: Island that is kept ON inside a power domain that is OFF
Shadow Registers: Registers used to store data in sleep mode
VDD: Positive voltage rail
VSS: Negative voltage rail/Reference voltage


Chapter 1

Introduction

Energy efficiency is a very important aspect of electronic circuit design nowadays. Just a few decades ago, designers focused only on having a working chip, and power consumption was not a primary design concern. Requirements for portability, mobility or battery operation were very infrequent, and the early CMOS technologies for digital electronics were sufficiently frugal in terms of power consumption. Then, as technology evolved, transistor size decreased, making it possible to fit more transistors on one die, and so power consumption gained importance. Nonetheless, only dynamic power seemed to matter, because for CMOS technology above 130 nm leakage is negligible [4]. Nowadays, power is more important than ever, especially for battery operated and mobile devices.

Power hungry chips tend to have high power density, which generates a lot of heat for a small dissipation area. This raises power dissipation issues, requires expensive cooling systems and may cause chip lifetime reduction.

Due to environmental concerns, there is an interest in reducing the power consumption of devices, in an effort to reduce pollution and the power wasted without activity. Systems may have most of their modules idling, consuming power without executing any operation. Chips should idle efficiently, to reduce the energy consumed in their operation.

In recent years, portability and mobility have gained a still growing importance, and in that field power consumption is paramount. For example, it is a significant inconvenience for users if portable equipment has to be charged constantly. Not only do batteries have little autonomy without power management, they also have a short lifetime and have to be replaced. Efficient power management makes better use of the batteries' energy, making them last longer and avoiding the hassle of replacing or charging them constantly.

To reduce power consumption in digital integrated VLSI systems, various considerations must be kept in mind. Dynamic power used to be the major concern in power aware designs, but as technology nodes shrink and transistor density increases, static power consumption has gained importance due to transistor leakage. Besides, most circuits today already implement techniques to reduce dynamic power consumption, such as clock gating. Clock gating effectively reduces the activity in the system, consequently reducing dynamic power consumption (as can be deduced from equation 2.4).

Today's techniques to reduce power consumption include using higher threshold transistors in non-critical paths of the design, as well as high-K dielectric gate oxide, well biasing and bigger transistors. These techniques are applied at a very low level, do not save much power and become expensive because they require extra masks for the different transistors. Sometimes it may be essential to use low-Vt or ultra-low-Vt transistors to achieve the very fast speeds that industry demands nowadays. These techniques are also very time consuming, since they may require the designer's attention to single gates.

Before the 90 nm technology node became available, designers used to simply migrate chips to smaller geometries to reduce power. This took advantage of lower supply voltages as well as lower capacitance. Migrating a 180 nm chip to 130 nm would cut power almost in half. Although voltage decreases, current increases. At 90 nm, the increase in current is more significant than the decrease in voltage, and this results in a higher power consumption than expected [7].

The importance of static power has increased significantly as technology size decreases. Once negligible, at the deep sub-micron level static consumption can reach almost 50% of the total power consumption [8]. A very useful and effective technique to reduce static power consumption is power gating, which consists of shutting down inactive parts of a circuit. Implementing this technique is very time consuming, because the power needs of each part of the circuit must be identified and the result must be tested.

Power gating was created to reduce static power at the block level. When power gating is applied, the supply to the module is cut off and its power consumption theoretically drops to zero. Major dynamic and static power reductions can be achieved by addressing power at the RTL (Register Transfer Level) and system level [9] [10]. This provides a higher abstraction level to the designer, and allows the RTL designer to produce power aware circuits that would otherwise only be implemented at the back-end.

1.1 Context

This dissertation is being developed within the scope of the Master in Electrical and Computer Engineering at the Faculty of Engineering of the University of Porto. This work was proposed by Synopsys® following a previous contact between July and September 2015 during a summer job.

1.2 Motivation

Every day, a great number of portable devices are in development. Smart phones and smart watches are two good examples of this technological evolution. They are evidence of the growth of the IoT (Internet of Things), also named IoE (Internet of Everything) by some, given its extent. These devices are small and most of them are battery operated, which requires smart power choices.

Transistor size has been decreasing during the past few years, allowing smaller devices with higher transistor density. This also lowers the operating supply voltage, which translates into a quadratic reduction in power consumption ($P(t) = V(t)^2/R$). On the other hand, leakage power increases with smaller transistors, due to the lower threshold voltage and the increasing number of transistors per die. Besides, smaller transistors also allow faster transitions, which increases dynamic power consumption.

When designing digital circuits, reducing power consumption has been a major concern in the last few years. Designers often implement several methodologies to reduce dynamic power consumption, like clock gating, but as technology improves and gets smaller, static energy consumption has been gaining importance and becoming a bigger slice of the total power consumption.

Even when a circuit is not active, it will consume energy. This is because transistors are not perfect and permit small currents to flow, despite the logical off state. With the increase in the number of transistors in a single chip, the sum of these small currents becomes significant. This problem is known as leakage, because unwanted current is leaking through the transistors.

For this reason, leakage current becomes an important matter as technology advances: the decreasing size of transistors leads to a smaller threshold voltage, which increases leakage. At the same time, switching currents decrease due to smaller capacitance, so static power consumption grows relative to dynamic power consumption. Switching off the inactive parts of a circuit can be a solution for the leakage problem. This method is called power gating.

Power gating is already implemented in several designs and has proved effective, but it is not yet automatic and there are no guidelines defined for high level static power savings.

1.3 Objectives

The goal of this dissertation is to evaluate strategies for implementing power gating mechanisms, applied to a high-speed intellectual property (IP) module, and to characterise the power savings and the impact on the circuit design, resulting in a set of design guidelines for introducing power gating in future designs. Different approaches are studied and their impact on the design is evaluated.

Since implementing power gating on an IP block designed without taking power into account can be very troublesome and slow, the need emerged for guidelines that could be followed to reduce the implementation time. These guidelines could also be useful in the future for automating the process.

Another objective is to evaluate the best way to use the tools to implement power gating in the RTL design as well as checking the functionality before and after power gating implementation.

1.4 Power Gating

Power gating consists of using transistors as switches to control the supply of power to selected parts of a circuit, according to their activity. Its purpose is to reduce static power consumption. As with almost every improvement in digital circuit design, there is a trade-off: power gating decreases power consumption in exchange for a small area increase as well as increased design complexity.

A central concept in power gating is the power domain. Power domains are composed of circuit blocks that share similar activity requirements. Each power domain can be controlled by a different signal, allowing some power domains to be active while others are sleeping, providing functionality while saving power. When a power domain is in sleep mode, its registers lose their values, which may become a problem if state has to be kept during sleep mode. To solve this issue, retention registers can be used. Retention registers save the value of important registers and are discussed in later chapters.

Connecting circuits that are not active to active circuits may cause incorrect reads of values. To avoid this, isolation cells are used to connect inactive and active blocks.
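The interplay between sleep mode, retention registers and isolation cells described above can be illustrated with a toy behavioural model. This is only a sketch under stated assumptions: the class `PowerDomain`, its methods and the clamp-to-zero behaviour are hypothetical names and choices made for illustration, and they do not correspond to any tool or to the implementation described later in this work.

```python
class PowerDomain:
    """Toy behavioural model of a switchable power domain (illustrative only)."""

    def __init__(self, name, retained=(), clamp_value=0):
        self.name = name
        self.powered = True
        self.registers = {}              # register name -> current value
        self.retained = set(retained)    # registers backed by retention cells
        self.clamp_value = clamp_value   # value forced by the isolation cells
        self._shadow = {}                # always-on copy of retained registers

    def write(self, reg, value):
        if not self.powered:
            raise RuntimeError(f"domain {self.name} is powered off")
        self.registers[reg] = value

    def power_off(self):
        # Retention registers save their value before the supply is cut.
        self._shadow = {r: v for r, v in self.registers.items()
                        if r in self.retained}
        # Ordinary registers lose their content in sleep mode.
        self.registers = {}
        self.powered = False

    def power_on(self):
        self.powered = True
        # Only the retained values are restored; the rest must be re-initialised.
        self.registers = dict(self._shadow)

    def output(self, reg):
        """Value seen by an always-on neighbour through an isolation cell."""
        if not self.powered:
            return self.clamp_value      # isolation clamps the floating output
        return self.registers.get(reg)   # None models an uninitialised register


domain = PowerDomain("example", retained=["config"])
domain.write("config", 0xA5)
domain.write("scratch", 7)
domain.power_off()
print("during sleep:", domain.output("config"))
domain.power_on()
print("after wake-up:", domain.output("config"), domain.output("scratch"))
```

In this model the always-on side never observes a floating value, and only the register marked as retained survives the sleep period, which is the behaviour the two paragraphs above attribute to isolation cells and retention registers.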

Static power consumption increases as transistors get smaller, and has been gaining importance in power aware designs. A good solution to this issue is power gating, which effectively cuts the leakage currents. The challenge when adopting a power gating strategy is to decide which modules should be power controlled and when to power them on and off, according to the required functional specifications and the design speed/area trade-offs. Even though there are other techniques, the focus of this work is mainly on power gating.

There are description languages that allow power gating to be integrated in hardware design with the help of EDA (Electronic Design Automation) and verification tools. These languages are known as power intent description languages, because they are used to describe the mechanisms that control power to the modules.

1.5 EDA Team Organisation

The design of an integrated electronic circuit is a very complex task that requires close cooperation among different teams with competences in diverse areas. It is common for EDA teams to be organised in small sub-teams, with the project workload divided between them. Usually there is a front-end team, responsible for RTL design. There is also a team for RTL verification, responsible for writing test-benches and verifying that the RTL implementation matches the logical intent. The back-end team is responsible for place and route as well as layout.

Power gating is usually implemented at the front-end, but has a large impact on the back-end process. Although there are other techniques for static power reduction, implemented at the back-end stage of the process, they are not as effective as cutting the supply voltage to the design. Power gating increases the design effort for both the front-end and back-end teams, so some decisions should be taken together.

Power intent specification is made at the RTL level, allowing verification of the logical operation with RTL simulations. Simulations at the netlist or lower level would greatly increase implementation and verification times. These simulations should also be performed, but only after good results have been obtained from the RTL simulation.


This work involves RTL design and verification, as well as synthesis, which is necessary for power analysis and is usually not done by the front-end team.

1.6 Structure

This document is divided into 6 chapters: Introduction (1), Related Work (2), Design Flow (3), Implementation (4), Results (5) and Conclusion (6).

The Introduction chapter (1), as its name states, is an introduction to the work developed throughout this master’s dissertation.

In the Related Work (2) chapter, a bibliographic review and study of the current state of the art is made, introducing the major concepts used in power gating.

The Design Flow (3) chapter is dedicated to explaining the differences between a traditional digital CMOS circuit design and a power oriented one.

Implementation (4) is a chapter dedicated to explaining the implementation decisions taken during the development as well as the final result.

The Results chapter (5) contains the results obtained from the implementation.

Conclusion (6) is the final chapter, where the conclusions of this work are drawn and possible future improvements are described.


Chapter 2

Related Work

With the growth of the VLSI industry and the portability of devices, the number of gates inside a single IC chip has been increasing. This allows more logic components in the same die, or even smaller dies, but increases power consumption and can consequently raise thermal and energy problems.

One concern, that has been gaining importance, is static power consumption. Static power is consumed when the circuit is in an idle state, that is, when it has no activity.

Power dissipation causes heat, which can be harmful to chips, reducing their lifespan and affecting performance, but reducing the heat may require expensive cooling systems that raise the market cost of products.

As most devices these days are battery operated, improving their battery life is a major concern. Batteries hold limited charge and, given the power consumption of today's electronic devices, they usually do not last very long. As such, new batteries have to be bought constantly. Non-rechargeable batteries have a big environmental impact, and rechargeable ones last a limited number of charging cycles. To reduce the impact of batteries, better energy efficiency is needed.

This chapter presents a study of power related problems on CMOS digital circuits and some techniques to avoid them, focusing mainly on power gating, a technique for static power reduction.

2.1 Power Consumption

With the growth of mobile devices and applications, as well as all the environmental concerns, power consumption is becoming an important criterion in electronic system design. Synopsys' EDA tools provide various solutions for power aware design, some of them automated. On the other hand, new techniques emerge with better trade-offs and better power savings, and these are available in the market. Implementing them therefore becomes a market advantage for both Synopsys and its customers.

Energy loss is converted into heat, which can be harmful to electronic components, so there is a need to maintain a low temperature in these devices. One way of guaranteeing this is to use cooling systems. Reducing power consumption reduces heat dissipation, creating a better system in many respects and making it possible to drop the cooling system, which would otherwise raise the price of the product.

Current technology already handles clock gating automatically, as well as other power saving techniques, but leakage power is gaining importance and a solution to efficiently reduce this problem is needed.

CMOS digital circuits used to have negligible static power losses, as noted by [11] in 2003:

"Historically, complementary metal-oxide semi-conductor technology has dissipated much less power than earlier technologies such as transistor-transistor and emitter-coupled logic. In fact, when not switching, CMOS transistors lost negligible power. However, the power they consume has increased dramatically with increases in device speed and chip density."

The power consumption of digital CMOS circuits is given by equation 2.1. This equation can be divided into dynamic and static power consumption, as seen in the subsections below (2.1.2 and 2.1.3). The first term is the dynamic power consumption and the second one the static power consumption. $P$ represents the total power consumption, $A$ is the fraction of gates switching, $C$ the total capacitive load of all gates and $f$ the clock frequency. $I_{leak}$ is the leakage current and $V$ the supply voltage.

$$P = A C V^2 f + V I_{leak} \tag{2.1}$$

Source: [11]
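As a small worked example of equation 2.1, the sketch below evaluates both terms and the share of the total taken by static power. The function name `total_power` and all numeric values are assumptions chosen only to illustrate the arithmetic; they are not measurements from this work.

```python
def total_power(activity, capacitance_f, voltage_v, freq_hz, i_leak_a):
    """Return (dynamic, static, total) power in watts, following equation 2.1."""
    dynamic = activity * capacitance_f * voltage_v ** 2 * freq_hz  # A*C*V^2*f
    static = voltage_v * i_leak_a                                  # V*I_leak
    return dynamic, static, dynamic + static


# Illustrative values only (assumptions, not data from this dissertation).
dynamic, static, total = total_power(
    activity=0.1,          # fraction of gates switching
    capacitance_f=1e-9,    # total capacitive load, in farads
    voltage_v=1.2,         # supply voltage, in volts
    freq_hz=500e6,         # clock frequency, in hertz
    i_leak_a=20e-3,        # total leakage current, in amperes
)
print(f"dynamic = {dynamic * 1e3:.1f} mW")
print(f"static  = {static * 1e3:.1f} mW")
print(f"static share of total = {static / total:.1%}")
```

Keeping the two terms separate makes it easy to see how the static share grows when the leakage current rises while activity or frequency stay fixed.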

2.1.1 Power and Energy

Energy and power are two important but different concepts, especially for portable devices. For this type of device, battery life is a big concern, as is heat dissipation, because there is usually not enough space to implement an efficient cooling system, if any.

The SI unit of power is the watt (W), which represents the amount of energy transferred per unit of time. Instantaneous power is given by:

$$P(t) = V(t) \times I(t) \tag{2.2}$$

Energy is what a system converts into work or heat. Batteries provide energy for a circuit to execute a given function. The SI unit of energy is the joule (J), and energy is usually measured over a time interval. It can be calculated as the integral of power over that interval:

$$E = \int_0^T P(t)\,dt \tag{2.3}$$


A circuit with a higher power demand drains energy faster than a low power one, which is especially important for battery operated devices: a lower power design results in longer operating times. As can be seen in figures 2.1 and 2.2, both graphs represent the same energy, since the areas under the two curves are the same. If two battery operated circuits had power consumptions similar to those in the figures, the first one would have a shorter operating time due to its higher power consumption.

Figure 2.1: High power consumption (axes: time, power)

Figure 2.2: Low power consumption (axes: time, power)
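The relation between the two figures can be checked numerically with a trapezoidal approximation of equation 2.3. The sketch below is illustrative only: the two rectangular power profiles, the battery capacity and the helper names `energy_joules` and `runtime_s` are assumptions, not data taken from the figures.

```python
def energy_joules(times_s, powers_w):
    """Trapezoidal approximation of E = integral of P(t) dt (equation 2.3)."""
    if len(times_s) != len(powers_w):
        raise ValueError("times and powers must have the same length")
    energy = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        energy += 0.5 * (powers_w[i] + powers_w[i - 1]) * dt
    return energy


def runtime_s(battery_j, average_power_w):
    """Time a battery holding battery_j joules lasts at a constant power draw."""
    return battery_j / average_power_w


# Hypothetical profiles: 2 W for 1 s (high power) versus 0.5 W for 4 s (low power).
high = energy_joules([0.0, 1.0], [2.0, 2.0])
low = energy_joules([0.0, 4.0], [0.5, 0.5])
print("energy, high power profile:", high, "J")
print("energy, low power profile: ", low, "J")

# With the same (assumed) 10 J battery, the high power circuit stops sooner.
print("runtime at 2 W:  ", runtime_s(10.0, 2.0), "s")
print("runtime at 0.5 W:", runtime_s(10.0, 0.5), "s")
```

Both profiles enclose the same area, so they consume the same energy, but the higher power draw exhausts a given battery in a quarter of the time.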

Static CMOS logic cells are made of NMOS and PMOS transistor networks, relying on the transistors' ability to work like digital switches. Transistors are, however, not ideal: their gates are capacitive inputs, which makes the logic gate inputs capacitive. They are also non-ideal switches, since they have a non-zero ON resistance and a finite OFF resistance. These effects are called parasitic impedances. Parasitic impedances in CMOS gates consume dynamic power when switching state, through the charging and discharging of the parasitic capacitors, and static power when not switching, due to the finite OFF resistance and non-zero ON resistance.

In figure 2.3, it is possible to observe the paths of static and dynamic power consumption in a CMOS inverter. Logic gates have input and output capacitance due to transistor parasitic effects. $P_{static}$ is the static power consumption of the inverter, $V$ being the supply voltage and $I_{leak}$ the leakage current. $I_{sc}$ is the short-circuit current during a state transition, and $C$ the capacitance formed by the output of the gate and the input of the next gate. $I_{switch}$ is the switching current, and $f_{switch}$ represents the switching frequency.

Figure 2.3: Power consumption on a CMOS inverter. Source: [1]

2.1.2 Dynamic Power

Dynamic power consumption arises from the constant charging and discharging of parasitic capacitances on the outputs of millions of gates inside an integrated circuit. Transistors are not ideal and have parasitic capacitance. These capacitances are charged and discharged according to the output of the logic gates: if the logical state is one, the capacitance is charged; if the logical state is zero, it is discharged. The transitions between zero and one are what consume dynamic power, which was once the most important factor in power consumption.

Figure 2.4: Inverter (with parasitic capacitances $C_{in}$ and $C_{out}$)

Figure 2.4 shows a representation of an inverter with its parasitic input and output capacitors. These parasitic capacitances are the cause of dynamic power consumption and of gate delays. Another parasitic effect comes from the wire interconnections: the longer the wire, the bigger the capacitance.

In digital CMOS circuits, fan-out is the ability of a logic gate to drive the inputs of other logic gates. Clock distribution generates large networks, because all synchronous logic requires a reference clock signal. The clock signal requires complex routing and buffering because of how far it extends. It is implemented using a complex tree of buffers that can drive all the gate inputs and keep the required timing with reduced skew across the system.

Huge clock trees are normally the largest consumers of dynamic power. That is why the main focus in reducing dynamic power consumption is the reduction of clock frequency, as well as the reduction of active clock cycles delivered to inactive parts of the system through clock gating. Most systems nowadays implement clock gating.

Figure 2.5: Clock Gating Cell (a latch driven by the Enable and Clock signals)

Dynamic power consumption can be calculated from expression 2.4; the power loss is caused by charging and discharging the capacitive loads of the gates. $A$ is the fraction of gates actively switching, $C$ the total capacitive load of the module, $f$ the frequency and $V$ the voltage [11]. As can be observed, dynamic power losses depend on the active gates, capacitance, frequency and voltage.

The number of active gates depends on the system needs, with the clock tree being one of the biggest contributors. Activity can be reduced by removing the clock signal from logic that is not currently needed; this technique is known as clock gating.
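The effect of clock gating on activity can be sketched with a small behavioural simulation. The model below assumes the common structure of a clock gating cell, a latch that is transparent while the clock is low followed by an AND gate; this structure, the function names and the sample waveforms are assumptions made for illustration rather than a description of the cell used in this work.

```python
def gated_clock(clk_samples, enable_samples):
    """Behavioural model of a latch-based clock gating cell.

    The latch is transparent while the clock is low, so the enable can only
    take effect between pulses. The output is clk AND latched enable.
    """
    latched = 0
    out = []
    for clk, en in zip(clk_samples, enable_samples):
        if clk == 0:
            latched = en          # latch follows enable while clock is low
        out.append(clk & latched)
    return out


def rising_edges(samples):
    """Count 0 -> 1 transitions, i.e. the clock edges seen by the registers."""
    return sum(1 for prev, cur in zip(samples, samples[1:])
               if prev == 0 and cur == 1)


clk = [0, 1] * 8                          # eight clock cycles
enable = [1] * 4 + [0] * 8 + [1] * 4      # logic is idle in the middle
gclk = gated_clock(clk, enable)

print("edges on free-running clock:", rising_edges(clk))
print("edges on gated clock:       ", rising_edges(gclk))
```

Fewer edges on the gated clock mean fewer transitions in the downstream logic and clock tree, which lowers the activity factor A in expression 2.4.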

Voltage is the most important factor since it enters quadratically: halving the voltage reduces dynamic power to a quarter. However, the frequency the system is able to achieve also depends on voltage (2.5), so reducing voltage can be harmful for high-speed interfaces. A technique named Multi-Voltage can be useful for keeping different areas of a chip operating at different voltages, according to their needs.
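The trade-off between expressions 2.4 and 2.5 can be made concrete with a short numerical sketch. The threshold voltage of 0.42 V is taken from the n-MOS value in table 2.1, while the exponent `alpha = 1.3`, the other parameter values and the function names are assumptions used purely for illustration.

```python
def dynamic_power(activity, capacitance_f, voltage_v, freq_hz):
    """Dynamic power in watts, following expression 2.4."""
    return activity * capacitance_f * voltage_v ** 2 * freq_hz


def relative_max_frequency(voltage_v, vth_v, alpha):
    """Unitless quantity proportional to (V - Vth)^alpha / V (relation 2.5)."""
    if voltage_v <= vth_v:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return (voltage_v - vth_v) ** alpha / voltage_v


# Halving the supply voltage at a fixed frequency (illustrative values).
full = dynamic_power(0.1, 1e-9, 1.2, 500e6)
half = dynamic_power(0.1, 1e-9, 0.6, 500e6)
power_ratio = half / full
print(f"dynamic power ratio at half voltage: {power_ratio:.2f}")

# The achievable frequency drops as well; alpha = 1.3 is an assumed exponent.
speed_ratio = (relative_max_frequency(0.6, 0.42, 1.3)
               / relative_max_frequency(1.2, 0.42, 1.3))
print(f"achievable frequency ratio at half voltage: {speed_ratio:.2f}")
```

The power ratio is exactly one quarter, but the frequency ratio falls well below one, which is why voltage reduction is applied selectively to the parts of a chip that can tolerate lower speed.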

$$P_{dynamic} = A C V^2 f \tag{2.4}$$

$$f \propto \frac{(V - V_{th})^{\alpha}}{V} \tag{2.5}$$

Source: [11]

2.1.3 Static Power

Transistors are not ideal digital switches and conduct small currents even when the gate voltage is below the threshold voltage. These small currents cause power consumption, more precisely static power: the power consumed when the gates are not switching. Static power, once negligible, has gained a lot of importance because it has increased in the last few years.

Static power consumption can sometimes be up to 50% of the total power consumption of a chip. This comes from the ever increasing number of transistors in each die, as well as the use of lower threshold voltage transistors. In an effort to reduce power consumption, new techniques have been developed, and these can be used together for better efficiency.


In a CMOS gate, there are four main leakage sources: sub-threshold leakage, gate leakage, gate induced drain leakage (GIDL) and reverse bias junction leakage. Sub-threshold leakage ($I_{SUB}$) is the current that flows from drain to source when the transistor is operating in the weak inversion region. Gate induced drain leakage ($I_{GIDL}$) is the current induced by a high field effect in the drain caused by a high $V_{DG}$. Reverse bias junction leakage ($I_{REV}$) is caused by minority carrier drift and generation of electron/hole pairs in the depletion regions. Gate leakage ($I_{GATE}$) is the current that flows through the gate oxide to the substrate layer due to gate oxide tunnelling and hot carrier injection [4]. Gate leakage can be reduced by using materials with a higher dielectric constant for the gate oxide.

Figure 2.6: Summary of leakage currents of deep-submicrometer transistors. Source: [2]

Figure 2.6 shows the paths through which static current leaks in a MOS transistor. $I_1$ is the reverse bias pn junction leakage, $I_2$ the subthreshold leakage, $I_3$ the oxide tunnelling current, $I_4$ the gate current due to hot-carrier injection, $I_5$ the GIDL and $I_6$ the channel punchthrough current.

According to [11], the two major components of static power consumption are gate leakage and sub-threshold leakage. Sub-threshold leakage is a weak inversion current across the device. Some devices can be designed to work in sub-threshold mode, but that is outside the scope of this thesis.

Figure 2.7: Sub-threshold leakage path in a CMOS inverter ($I_{sub}$ flowing from VDD)

Static power consumption depends only on voltage and current (equation 2.6). Reducing either voltage or current is effective in reducing static power consumption. Since reducing voltage can introduce frequency problems, as seen in 2.1.2, reducing current is the preferred approach. To reduce current, one can use transistors with a higher threshold voltage, which have lower leakage currents, or implement power gating, which switches the module power rails on and off.

Pstatic= V Ileak (2.6)

Leakage current can be approximated as a combination of sub-threshold and gate-oxide leakage:

$$I_{leak} = I_{sub} + I_{ox} \qquad (2.7)$$

Gate leakage is the current that flows through the gate oxide, due to the quantum-mechanical tunneling of electrons, as described by [6]:

"For oxide thicknesses below 4 nm, high current leakages through the oxide can occur due to the quantum-mechanical tunneling of electrons. The gate leakage current can not only negatively affect the device performance but also significantly increase the standby power consumption of a chip."

When a transistor works as a digital switch, it either conducts in an active mode or cuts off the signal. More specifically, a MOSFET enters the cut-off state when its gate-source voltage is below the transistor's threshold voltage. Nonetheless, since transistors are not ideal switches, they have a finite resistance in the off state, which means that a small amount of power is still consumed. This effect is named sub-threshold leakage because, as the name suggests, it occurs while the gate voltage is below the threshold.

The drain-bulk and source-bulk junctions contribute their reverse currents. Two important mechanisms contribute to bulk current: gate induced drain leakage (GIDL) and impact ionisation. For advanced technologies, impact ionisation is no longer important because the supply voltage is of the same or lower order than the band-gap of silicon; therefore, the carriers are no longer able to create electron-hole pairs [6].

From table 2.1 one can observe that p-MOS transistors are less leaky than n-MOS transistors of the same size, but cannot carry the same amount of current in saturation.

Table 2.1: Main parameters for the seven-metal-layer 90-nm CMOS technology node. Source: [6]

| Parameter | Logic (low power), n-MOS | Logic (low power), p-MOS |
|---|---|---|
| Supply voltage (V) | 1.2 | 1.2 |
| Drawn gate (nm) | 90 | 90 |
| tox (nm) | 1.5 | 1.5 |
| VT (mV) | 420 | -400 |
| IDsat (mA/µm) | 1.0 | 0.5 |
| Ioff (nA/µm) | 15 | 6 |
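As a rough illustration of equation 2.6 with the off currents of table 2.1, the sketch below estimates the leakage power of a block from its total n-MOS and p-MOS gate width. The function name, the example widths and the assumption that every transistor is off and leaks at its nominal Ioff are illustrative only; in a real block, leakage depends on the logic state of each gate.

```python
# Off-state leakage per micrometre of gate width, from table 2.1 (90 nm, low power).
IOFF_NA_PER_UM = {"nmos": 15.0, "pmos": 6.0}
SUPPLY_V = 1.2  # supply voltage from table 2.1


def static_power_w(nmos_width_um: float, pmos_width_um: float,
                   supply_v: float = SUPPLY_V) -> float:
    """Estimate P_static = V * I_leak, assuming all transistors are off."""
    i_leak_na = (nmos_width_um * IOFF_NA_PER_UM["nmos"]
                 + pmos_width_um * IOFF_NA_PER_UM["pmos"])
    return supply_v * i_leak_na * 1e-9  # convert nA to A


# Hypothetical block: 1e6 um of n-MOS width and 2e6 um of p-MOS width.
power = static_power_w(1e6, 2e6)
print(f"Estimated static power: {power * 1e3:.1f} mW")
```

With these made-up widths the total off current is 27 mA, which at 1.2 V gives roughly 32 mW of static power.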

According to [12], the sub-threshold conduction current, for short channel MOSFETs can be calculated with the following equation:

$$I_D = I_S \, e^{\frac{V_{GS}}{n V_T}} \qquad (2.8)$$

Source: [12]

In equation 2.8, $I_S$ is a constant, $V_T$ is the thermal voltage at room temperature and $n$ is a constant whose value depends on the material and structure of the device. Although small, this power dissipation becomes a problem on chips with billions of transistors. Other authors consider a more complex, parameter-dependent equation (2.9), where $W$ and $L$ are the gate width and length respectively and the other parameters are technology parameters.

$$I_{SUB} = \frac{W}{L} \, \mu \, v_{th}^{2} \, C_{sth} \, e^{\frac{V_{GS} - V_T + \eta V_{DS}}{n v_{th}}} \left(1 - e^{\frac{-V_{DS}}{v_{th}}}\right) \qquad (2.9)$$

Source: [13] [4]

As seen in equation 2.9, sub-threshold leakage depends exponentially on $V_{GS}$ and on the threshold voltage $V_T$. As technology scales down $V_{DD}$ and $V_T$ to lower dynamic power consumption, static power consumption increases.
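The exponential dependence on the threshold voltage can be made concrete with a small numerical sketch of equation 2.9. All parameter values below (n, eta, the lumped mobility and capacitance term, and the two threshold voltages) are placeholders chosen for illustration, not data for a real process; only the ratio between the two results is meaningful.

```python
import math


def subthreshold_current(v_gs, v_t, v_ds, w_over_l=1.0, mu_c_sth=1.0,
                         n=1.5, eta=0.0, v_th=0.0259):
    """Evaluate equation 2.9.

    mu_c_sth lumps the mobility and capacitance terms; n, eta and the
    defaults are placeholders. v_th is the thermal voltage in volts
    (roughly 26 mV at room temperature), v_t is the threshold voltage.
    """
    exponent = (v_gs - v_t + eta * v_ds) / (n * v_th)
    return (w_over_l * mu_c_sth * v_th ** 2 * math.exp(exponent)
            * (1.0 - math.exp(-v_ds / v_th)))


# Off transistor (V_GS = 0) at V_DS = 1.2 V, for two threshold voltages.
high_vt = subthreshold_current(v_gs=0.0, v_t=0.42, v_ds=1.2)
low_vt = subthreshold_current(v_gs=0.0, v_t=0.32, v_ds=1.2)
ratio = low_vt / high_vt
print(f"Lowering V_T by 100 mV multiplies leakage by about {ratio:.1f}")
```

Under these assumptions, a 100 mV reduction of the threshold voltage increases the sub-threshold current by more than an order of magnitude, which is why threshold scaling raises static power.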

2.2 Power Gating

This section presents power gating: its definition, important concepts, how it appeared, the languages used to implement it, who made it happen and how the two existing standards were created.

Power gating consists of placing a switch between the supply rails and the supply ports of the cells. When the module is not in use, the switch is turned off, cutting power to the module and avoiding its static power consumption. With ideal switches, power gating would completely cut off the leakage current consumed by the module. Since the switches are usually implemented in CMOS technology, it instead reduces the leakage current of the whole module to the leakage current of the switching transistors.

Power gating is implemented with transistors connected between the supply and the module, known as header switching (figure 2.8), between the module and ground, known as footer switching (figure 2.9), or both. Each approach has its advantages and disadvantages. Implementing power gating is a trade-off: it increases area as well as dynamic power, due to switching between the powered-on and powered-off states. To implement power gating, one must therefore be aware of this trade-off and make sure the implementation is beneficial. If a module is constantly switching between the on and off states, the increase in dynamic power becomes bigger than the decrease in static power, making the technique harmful rather than beneficial.
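The trade-off described above can be expressed as a break-even idle time: gating only pays off when the module stays off long enough for the leakage energy saved to exceed the energy spent on one power-down and power-up sequence. The sketch below is a minimal model under that assumption; the function names and all numeric values are hypothetical.

```python
def break_even_time_s(leak_power_on_w, leak_power_gated_w, cycle_overhead_j):
    """Minimum off time for which gating a module saves energy.

    cycle_overhead_j is the extra dynamic energy spent on one power-down
    and power-up sequence (switches, isolation, state restore).
    """
    saved_power_w = leak_power_on_w - leak_power_gated_w
    if saved_power_w <= 0:
        raise ValueError("gating must reduce leakage power")
    return cycle_overhead_j / saved_power_w


def gating_is_beneficial(idle_time_s, leak_power_on_w, leak_power_gated_w,
                         cycle_overhead_j):
    """True when the leakage energy saved exceeds the switching overhead."""
    return idle_time_s > break_even_time_s(
        leak_power_on_w, leak_power_gated_w, cycle_overhead_j)


# Hypothetical module: 30 mW leakage when on, 1 mW when gated,
# 2 uJ of overhead per off/on cycle.
t_be = break_even_time_s(30e-3, 1e-3, 2e-6)
print(f"Break-even idle time: {t_be * 1e6:.1f} us")
print(gating_is_beneficial(10e-6, 30e-3, 1e-3, 2e-6))  # short idle period
print(gating_is_beneficial(1e-3, 30e-3, 1e-3, 2e-6))   # long idle period
```

A module whose idle periods are shorter than the break-even time is the "constantly switching" case, where gating costs more energy than it saves.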

Shutting down inactive parts of a system may result in a loss of state. To avoid this problem, less leaky registers are used to retain state. These registers can introduce significant area overhead because they are implemented using bigger transistors.

When applying power gating to a design, special care must be taken to avoid changes to critical paths as well as to avoid creating new ones. Extra logic such as isolation cells and level shifters causes delays in the data-path, so paths that barely meet their timing constraints may violate them once the extra logic is added. Powered-off logic also takes time to power back on, which may cause performance issues.

Power gating can be implemented at the RTL level of abstraction using power intent languages such as Common Power Format (CPF) or Unified Power Format (UPF), which are explained further in section 2.2.4.

Power switching can be driven either by hardware or software. Hardware implementation requires more area for the control circuitry. Software implementation requires software development as well as an interface.

2.2.1 Consideration

Some considerations must be kept in mind when implementing power gating: the dynamic power consumed by the extra circuitry, the static power consumed by always-on logic, and the static power consumed by the power switching transistors themselves. The area cost should also be kept in mind, because higher $V_{th}$ transistors occupy a bigger area. The retention strategy is also important because retention cells can cause a big area overhead.

A module that is switched off can not be directly connected to an always on module due to floating voltages, requiring the use of isolation cells.

A single transistor may not be able to supply a full module, as its width may not be enough to carry all the current the module needs. Waking up the circuit too fast may cause a large inrush current, which could damage some tracks. The voltage drop across the switching fabric must be carefully analysed to ensure proper operation of the module.

When a block is power gated, its registers lose their value. In some cases it is important to keep those values, and always-on retention registers are the solution. These registers are usually implemented with less leaky, lower-voltage retention cells. However, this comes with a time penalty when restoring the values to the main registers at wake-up, raises dynamic power consumption and increases area.

2.2.2 Header vs Footer Switching

The switching transistors used for power gating can be placed between the power supply and the power domain supply pins, or between ground and the power domain ground pin. These are known as header and footer switching respectively. Each of these implementations has its advantages and disadvantages.

A single transistor is not able to carry enough current to power a large power domain. For that reason, several transistors are used in parallel, which are usually staged in time to avoid large inrush currents [7].
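A first-order sizing of the switch fabric follows directly from this: enough parallel switches are needed to deliver the domain current, and their enables can be staggered in groups to limit inrush. The sketch below is illustrative only; the per-switch currents, the group size and the delay are hypothetical, and a real sizing also has to respect a voltage drop budget.

```python
def size_switch_fabric(load_current_ua, ion_per_switch_ua, ioff_per_switch_na):
    """Return (number of parallel switches, total off-state leakage in nA).

    Currents are integers (uA and nA) so that the ceiling division is exact.
    """
    count = -(-load_current_ua // ion_per_switch_ua)  # ceiling division
    return count, count * ioff_per_switch_na


def staged_enable_times_ns(count, group_size, group_delay_ns):
    """Turn-on time of each switch when groups are enabled one delay apart."""
    return [(i // group_size) * group_delay_ns for i in range(count)]


# Hypothetical domain drawing 100 mA, with switches that each deliver
# 500 uA when on and leak 6 nA when off.
count, leak_na = size_switch_fabric(100_000, 500, 6)
times = staged_enable_times_ns(count, group_size=50, group_delay_ns=10)
print(f"{count} switches, {leak_na} nA total off-state leakage")
print(f"last group enabled after {times[-1]} ns")
```

The same numbers show why the ratio between on and off current matters: the leakage of the gated domain is set by the number of switches needed, multiplied by the off current of each one.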

The sleep transistor efficiency is the ratio between the current in the ON state and the current in the OFF state ($I_{ON}/I_{OFF}$). The total leakage of the switching fabric is highly dependent on this efficiency, because enough transistors are needed to deliver the required ON state current [4].

Header switching is typically implemented using PMOS transistors to switch $V_{DD}$. PMOS transistors are less leaky than their NMOS counterparts of the same size, but they provide lower drive current when active. Header switches turn off the supply voltage, which allows outputs to be clamped to "0" for isolation using a single transistor. This type of isolation, however, should only be used when needed to close timing, because if such a clamp fails it causes hard-to-detect stuck-at faults. Since at system level signals are usually referenced to ground ("0"), switching $V_{DD}$ becomes more convenient.

Figure 2.8: Header switching

Footer switching is typically implemented with NMOS transistors, which can drive a larger current than a PMOS transistor of the same size and therefore have a smaller area cost on the design. NMOS transistors typically have higher switching performance than PMOS [4], allowing greater energy savings and lower area impact in the same design. However, footer transistors switch $V_{SS}$, which makes the system more sensitive to noise on the ground reference and may become a problem.

Figure 2.9: Footer switching

Some authors use both header and footer switching. However, the two series switches cause a larger voltage drop, which in turn increases gate delay [4]. This also creates a bigger area overhead, since two series switches perform the work that a single one could do.

2.2.3 Fine Grain vs Coarse Grain

The power switch implementation can be either fine grain or coarse grain. Fine grain power switches are part of the standard cell. It is required that the library contains standard cells with the switches attached to them. Coarse grain power switching on the other hand can be implemented with the addition of some special cells for power gating.

The decision of which implementation to use should be discussed with the back-end team, since the biggest impact will be after synthesis. Usually, the chosen implementation is coarse grain power switching, since it creates less area overhead, making it a better option even with the increased design effort.

2.2.3.1 Fine Grain Switching

Since the switch has to be able to provide the worst-case current necessary for the cell to operate without performance loss, the area overhead can be considerable [4]. Fine grain cells may also include a pull-up or pull-down transistor for isolation. Since the power switch is already inside the standard cell, it is possible to use the traditional design flow.

Cells used for fine grain switching, with the embedded switching transistor, are called Multi-Threshold CMOS (MTCMOS) cells. MTCMOS cells contain the usual supply connections, inputs and outputs, plus an additional input for the sleep signal. MTCMOS cells are usually implemented using footer switching, due to its higher switching performance, which creates less area overhead than header switching. Even with footer switching, the area overhead can get close to four times the size of the original cell [4].

Figure 2.10: Fine grain and cell

2.2.3.2 Fine Grain Advantages

Figure 2.11: Fine grain and cell with isolation clamp transistor

• Not sensitive to ground noise injection because of short virtual power nets;

• Small wake-up latency and in-rush current due to small capacitance of virtual power nets;

• Built-in clamp transistors keep outputs in known states and eliminate wake-up crowbar currents;

• Timing impact of voltage drop across the switch and clamp behaviour are easy to characterise since they are inside the cell;

• Can be easily analysed and synthesised by conventional ASIC tools and flows, since MTCMOS cells are basically normal standard cells;

2.2.3.3 Fine Grain Disadvantages

The authors also name a couple of disadvantages from fine grain switching:

• Considerable area overhead, with increases up to three times the size of the original cell;

• Requires special library with MTCMOS cells;

• Significant buffering and routing resources for sleep control distribution;

2.2.3.4 Coarse Grain Switching

In coarse grain power switching, a collection of switches are used to gate a collection of blocks of cells. Switch network sizing is harder than fine grain switching since the activity can not be estimated. Coarse grain power switching however introduces significantly smaller area overhead [4].

Due to the fact that area penalty for fine grain power switching is not worth the saving on design effort, coarse grain switching became the industry preferred method [4].

Figure 2.12: Fine grain header switching and cell with isolation clamp transistor

2.2.3.5 Coarse Grain Advantages

Coarse grain power switching is more widely accepted in the EDA industry for power gating implementation, mainly because of area constraints. The main advantages of coarse grain switching, as explained in [4], are:

• Since sleep transistors can share charge, they are less sensitive to PVT (process, voltage, temperature) variations and introduce less voltage drop variation;

• Significantly smaller area than fine grain switching;

• The number of sleep transistors can be optimised for voltage drop and speed targets;

• Existing standard cell libraries can be used with a few extra special cells;

2.2.3.6 Coarse Grain Disadvantages

Coarse grain switching also brings some disadvantages, as stated in [4]:

• Requires complex power network;

• Power network is hard to synthesise and requires static and dynamic voltage drop analysis;

• Requires wake-up in-rush current control;

• Bigger wake-up latency;

• Power analysis is more complex;

2.2.4 Power Intent Languages

Power information is not supported by normal Hardware Description Languages (HDL), therefore it has to be described somewhere else. This is where power intent files come in. Written in a different language, they specify how the module should behave in terms of power. These files are written independently of the HDL and later formally verified by the tools. Power intent languages are used to describe power specifications at the RTL level; this high level of abstraction allows better power savings.

Two standards are currently defined for power intent, developed by different companies: they are Unified Power Format (UPF) and Common Power Format (CPF).

CPF is a Silicon Integration Initiative (Si2) standard for low power and has some interoperability with the IEEE 1801 low power standard [14]. UPF, as it is commonly known, is the IEEE 1801 Standard for Design and Verification of Low-Power Integrated Circuits. UPF is based on Tool Command Language (TCL) [5]. Some other languages that are not standards may also be used by some companies. These languages, however, are not widely used, since developers prefer standards in an effort to unify development for faster and easier integration.

CPF 2.0 is a widely adopted low-power intent format, approved as an Si2 standard by the Low Power Coalition. It allows some interoperability with IEEE 1801-2009 (UPF). CPF supports hierarchical low-power flow, output and bidirectional virtual ports, isolation strategies, level-shifting, retention strategies and more.

This work uses UPF because Synopsys tools offer compatibility with it and some power intent is already specified in UPF. Some further UPF explanation can be found in section 2.2.5.

2.2.5 UPF

UPF is the IEEE (Institute of Electrical and Electronics Engineers) standard for design and verification of low power in integrated circuits, under the standard number 1801. It was originally created in an effort to provide an open, portable power specification standard and was approved in 2007 as an Accellera standard. In the same year, Accellera donated it to the IEEE. The first version of IEEE Std 1801, which is the second version of UPF, was released in 2009 [5].

Since IEEE Std 1801 is an open standard, it gives EDA tool providers the ability to implement its latest features. The standard is already supported by a large number of EDA companies. Synopsys tools already support a large subset of the commands in UPF, as well as some UPF-like power intent commands that are not part of the standard [15].

UPF focuses on controlling the voltage and current applied to the transistors. The technology used for the switches is normally assumed to be CMOS, but other technologies can also be used. Due to its abstraction level, UPF can be applied with any of the three HDLs: VHDL, Verilog or SystemVerilog [5].

UPF supports a design hierarchy, which is advisable for reusing power intent across configurations. The UPF hierarchy depends on the hierarchy of the RTL modules, which can be a downside when the RTL was not designed with power gating in mind.

Figure 2.13: Companies involved in IEEE P1801 working group. Source [3].

The current active version of the standard is IEEE Std 1801-2015, approved on 8 December 2015 by the IEEE-SA Standards Board [16].

2.2.5.1 Concepts

When defining power intent with UPF, a few concepts must be learnt for better understanding of its structure. This section explains the major concepts used in UPF for power gating.

Modules are put together in power domains according to their power specifications: if two modules turn off at the same time and use the same voltage, they can be placed in the same power domain.

Ports are connection points between adjacent levels of hierarchy, connected together using nets. UPF assumes a more abstract model of the design hierarchy, using its commands to change the scope within the hierarchy levels. Ports have a HighConn side, visible to the parent instance, and a LowConn side, visible to the instance itself.

Figure 2.14: Example of power domains.

Power domains are collections of instances that are powered in the same way; child instances are included in the same power domain as their parents. A power domain does not need to be contiguous, which means that instances in the same power domain can be placed in different locations. In the example in figure 2.14, both modules A and B have the same power requirements, so they have been put together in power domain PD_A. As for module C, since it has a different power requirement than A and B, it belongs to a different power domain, PD_B.
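The grouping rule can be sketched as a small partitioning step: modules with identical power requirements end up in the same domain. This is only an illustration of the idea, not how UPF or any tool assigns domains; the requirement tuples and the naming scheme are assumptions made for the example.

```python
import string
from collections import defaultdict


def group_into_domains(modules):
    """Group modules that share a supply voltage and a shutdown condition.

    modules maps a module name to (voltage in volts, shutdown condition).
    Domains are named PD_A, PD_B, ... in order of first appearance.
    """
    groups = defaultdict(list)
    for name, requirement in modules.items():
        groups[requirement].append(name)
    return {f"PD_{string.ascii_uppercase[i]}": members
            for i, members in enumerate(groups.values())}


# Modules A and B share their requirements; C differs (values are made up).
modules = {
    "A": (1.2, "off_in_sleep"),
    "B": (1.2, "off_in_sleep"),
    "C": (1.2, "always_on"),
}
domains = group_into_domains(modules)
print(domains)
```

For this input the modules A and B share one domain and C gets its own, mirroring the arrangement of figure 2.14.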

Supply ports are connections for supply nets on hierarchical boundaries. Supply sets represent a collection of supply nets. Supply switches control supply connections between supply ports.

2.2.5.2 Scope

The scope is the level of the design hierarchy where the UPF commands are executed. Defining a scope is particularly useful for reusable power intent. The set_scope command changes the current scope, and signals are then referenced relative to it. It is possible to write UPF in which the current scope is always the root scope, but then small changes in hierarchy imply changing all of the UPF, whereas in a reusable UPF only the scope would need to be changed.

2.2.5.3 Power domains

A concept introduced with power intent is the power domain. When a design is power aware, modules belong to power domains. A power domain defines a set of rules for the modules that belong to it. A design can have several power domains, each of which has its own independent set of rules. A power domain can be switched off, or have a defined voltage. Power domains from the same design can be in different states independently of each other.

This is the power domain definition present in the standard:

"power domain: A collection of instances that are treated as a group for power-management purposes. The instances of a power domain typically, but do not always, share a primary supply set. A power domain can also have additional supplies, including retention and isolation supplies." [16]

The other components defined in the IEEE 1801 standard are usually associated with a power domain. That applies to the retention strategies, isolation strategies, power switches and level shifters.

Power domains are characterised by their power availability. A power domain that is not switchable, and remains always powered, is said to be an always-on power domain. Power domains may also be characterised in relation to other power domains: if power domain PD_A is on when power domain PD_B is off, PD_A is said to be relatively always on in relation to PD_B.

Three supply set handles are usually created with the power domain: primary, default_retention and default_isolation. Extra supply set handles can also be created with the -supply argument. The default_retention and default_isolation handles of a power domain are usually associated with an always-on supply set from the top power domain.

2.2.5.4 Isolation strategies

Outputs of powered-off logic cannot be directly connected to inputs of active logic: since their values are unpredictable, they can cause incorrect readings and lead to unwanted behaviour. Isolation cells exist to address this issue. They are placed on the border of the power domain and are responsible for clamping the cell output. They can also be used together with level shifters in a multi-voltage design.

Figure 2.15: Isolation cell between power domains

There are three types of isolation cells, according to their functionality: they can clamp to "0", to "1" or to the last value. A simple AND gate can be used to clamp the signal to "0", and an OR gate can be used to clamp it to "1". To clamp the signal to the last value before power down, a more complex cell is used, consisting of a latch to keep state and a multiplexer, as can be seen in figure 2.16.
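The three clamp behaviours can be summarised with a small behavioural model. This is a sketch of the logic function only, assuming an active-high isolation enable; the class name and the use of None for an unknown value from a powered-off driver are conventions invented for the example.

```python
class IsolationCell:
    """Behavioural model of an isolation cell with an active-high enable."""

    def __init__(self, clamp):
        if clamp not in ("0", "1", "latch"):
            raise ValueError("clamp must be '0', '1' or 'latch'")
        self.clamp = clamp
        self.held = 0  # last value seen while isolation was inactive

    def output(self, data, iso_enable):
        if not iso_enable:
            self.held = data   # transparent: follow the driving signal
            return data
        if self.clamp == "0":
            return 0           # like an AND gate with the inverted enable
        if self.clamp == "1":
            return 1           # like an OR gate with the enable
        return self.held       # latch keeps the value from before power down


cell = IsolationCell("latch")
cell.output(1, iso_enable=False)            # domain on, value passes through
print(cell.output(None, iso_enable=True))   # domain off, driver unknown
print(IsolationCell("0").output(None, iso_enable=True))
print(IsolationCell("1").output(None, iso_enable=True))
```

In all three cases the receiving logic sees a defined value while the driver is powered off, which is the purpose of the isolation cell.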

Figure 2.16: State Retention Isolation. Source: [4]

Isolation cells are placed using the UPF command set_isolation. Depending on the version of the standard supported by the tools, it may be necessary to define an isolation control separately. This is the case for IEEE Std 1801-2009. In newer versions of the standard, the set_isolation_control command has been superseded and all isolation information can be defined in a single set_isolation command.

Isolation cells can either be inserted at the output of the gated power domain, or at the input of the power domain that is connected to it. In case this second power domain is less on than the first one, there may be no need for the insertion of the isolation cells. Both types of isolation can coexist in the same design.

Isolation is only needed at either the input of the on power domain or the output of the power gated one. Having isolation on both creates redundant isolation, inserting more cells than necessary for correct operation.

Isolating the inputs of a power domain from the outputs of a less active one is a way of ensuring all the signals are isolated. Leaving nets from a power gated module without isolation may cause incorrect behaviour of the system as well as sneak paths for current to leak. When isolating inputs, it is necessary to make sure that the synthesis tools insert no always-on cells before the isolation cells.

Another option is to isolate the outputs from the power domain that is to be powered off. Nevertheless, this would insert isolation cells in all the output ports of that power domain, some of which may be connected to itself or a less on power domain.

Outputs from modules that connect to the same power domain do not need to be isolated; although isolation cells are typically small, they introduce delays in the data-path. Manually selecting each port that should or should not be isolated is possible, but impractical for large designs. UPF already accounts for this: using the -diff_supply_only switch when creating the isolation rule prevents tools from inserting isolation cells on nets connected to the same supply set. This, however, also prevents the insertion of isolation cells for output ports with heterogeneous fan-out, that is, ports that connect both to another power domain and to their own.

As with the -diff_supply_only switch, it is also possible to specify a source and/or sink filter. The isolation rule is then applied only to nets that come from one of the specified source supply sets and enter one of the specified sink supply sets. This is very useful when isolating designs that have several power domains.

Figure 2.17: Power domain with heterogeneous fan-out

Using -diff_supply_only will, however, fail to create isolation cells on a domain port with heterogeneous fan-out, like the one in figure 2.17, resulting in a warning message. This is a good example of a case where isolation could be placed on the input of the ON power domain. It could also be placed on the output of the OFF power domain, but the input of the second OFF power domain does not need to be isolated, as it has the same power needs as the first one.
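The reasoning about where to place isolation for a given fan-out can be sketched as a simple check: a sink needs isolation only if it can be powered in some mode where the source is off. The model below is a simplification for illustration and does not reproduce how UPF tools resolve strategies; the domain names and the power modes "run" and "sleep" are hypothetical.

```python
def sinks_needing_isolation(source, sinks, on_modes):
    """Sinks that can be powered in a mode where the source is off."""
    return [s for s in sinks if on_modes[s] - on_modes[source]]


def suggest_isolation(source, sinks, on_modes):
    """Suggest where to isolate the nets driven by one source port."""
    need = sinks_needing_isolation(source, sinks, on_modes)
    if not need:
        return "no isolation", []
    if len(need) == len(sinks):
        return "output of source", [source]
    return "input of sinks", need


# Fan-out in the style of figure 2.17: a gated domain drives an always-on
# domain and a second domain that is gated together with the source.
on_modes = {
    "PD_OFF1": {"run"},
    "PD_OFF2": {"run"},
    "PD_ON": {"run", "sleep"},
}
print(suggest_isolation("PD_OFF1", ["PD_ON", "PD_OFF2"], on_modes))
```

With heterogeneous fan-out the check selects only the input of the always-on domain, while a port whose sinks all need isolation is better isolated once at the source output.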

2.2.5.5 Supply sets

Supply sets are an aggregation of supply functions that together provide a complete power source [16]. Supply sets provide a higher level of abstraction to the designer, replacing the need to create individual supply nets and supply ports. Supply sets have their implicit supply nets, such as power, ground and well biasing. Supply sets provide the needed supply nets for modules to operate. Explicitly created supply nets can be associated with an existing supply set via the -function argument of the create_supply_set command.

A power domain can have several supply set handles, which are then associated to supply sets. Supply sets are usually associated with power domain’s supply set handles.

2.2.5.6 Retention strategies

When powering off some designs, there may be a need to keep some state. To keep state, some registers need their value to be preserved when the module is turned off. There are several possible approaches to achieve this: retention registers, power islands or external memory. Retention registers are made of two registers: the main register, used in normal operation, and the shadow register. Shadow registers are less leaky but produce a big area overhead.

Figure 2.18: Retention register. Source: [4]

Another way to retain state is to keep the modules containing the registers whose state is needed in a separate always-on power domain. This technique is named power islands, because those modules form an always-on domain inside a powered-off domain. This adds some complexity to the design, since the back-end designers need to route always-on supply rails to a module inside a switchable domain. It causes little or no area increase, but it is not advisable for large areas with low activity, since the opportunity to reduce their leakage would be wasted.

Retention may be one of the power gating components with the biggest impact. Retention registers can create huge area overhead if not planned carefully. The need for full or only partial state retention should be taken into consideration for area optimisation and restore time reduction. If the system is able to recover from a power down with only partial state retention, this becomes an attractive solution given the time overhead and size of the registers.
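The save and restore sequence of a retention register can be modelled in a few lines. This is a behavioural sketch only: the class and method names are invented for the example, and None stands for the unknown content of the main register after its supply has been removed.

```python
class RetentionRegister:
    """Main register on the switchable supply, shadow on the always-on supply."""

    def __init__(self):
        self.main = 0
        self.shadow = 0
        self.powered = True

    def write(self, value):
        if not self.powered:
            raise RuntimeError("cannot write while the domain is off")
        self.main = value

    def save(self):
        self.shadow = self.main   # copy state before power down

    def power_down(self):
        self.powered = False
        self.main = None          # main register loses its value

    def power_up(self):
        self.powered = True       # main register content is still unknown

    def restore(self):
        self.main = self.shadow   # costs extra time at wake-up


reg = RetentionRegister()
reg.write(0xA5)
reg.save()
reg.power_down()
reg.power_up()
reg.restore()
print(hex(reg.main))
```

Partial retention corresponds to using such registers only for the state the system cannot rebuild after wake-up, leaving the remaining registers to be reinitialised.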

Low standby voltage is also a possibility, but this solution increases testing complexity since it will require a multi-voltage design, as well as a library with cells able to operate on the specified voltage range, from standby voltage to normal operation voltage.

2.2.5.7 Level Shifters

In a multi-voltage design, communication between modules that operate at different voltages may cause reading errors or even damage the circuitry. To ensure the correct expected operation, level shifters must be inserted in between those modules. Level shifters are cells responsible for shifting logic signals between different voltages. If two power domains use different voltages, level shifters must be used to ensure the correct functionality of the system. Level shifters have a low voltage and a high voltage side.

Level shifters can be of two types, low to high or high to low. As the name suggests, high to low level shifters, shift from the high voltage to the low voltage and low to high ones shift from low to high voltage.
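Choosing the shifter type for a crossing reduces to comparing the two supply voltages, as in the sketch below. The function name and the optional tolerance parameter are assumptions made for the example; real libraries specify the voltage ranges each level shifter cell supports.

```python
def level_shifter_type(source_v, sink_v, tolerance_v=0.0):
    """Return the shifter needed for a signal going from source to sink.

    Returns None when both domains run at the same voltage (within the
    given tolerance) and no level shifter is needed.
    """
    if abs(source_v - sink_v) <= tolerance_v:
        return None
    return "low_to_high" if source_v < sink_v else "high_to_low"


print(level_shifter_type(0.9, 1.2))
print(level_shifter_type(1.2, 0.9))
print(level_shifter_type(1.2, 1.2))
```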

2.2.5.8 Enable Level Shifters

In multi-voltage designs, ports on the boundary of power domains may need both level shifters and to be isolated. In order to have lower area overhead, a single cell called enable level shifter can be used instead of the isolation cell and level shifter.

Figure 2.19: Enable Level Shifter Example

2.2.5.9 Power Switch

The power switch is usually implemented in CMOS technology and consists in a transistor between the power supply and the standard cells power input pins. The switch can be either NMOS (footer switch) or PMOS (header switch).

Liberty libraries may have several different switch cells. Switch cells in the library may contain several switches and are usually defined by their type. Switch types can be coarse grain or fine grain. DC will select a switch able to carry the needed current for the on state. To force DC to select a specific switch cell, the designer can mark all other switches as dont_use or dont_touch and recompile the library.

Most switch related decisions are made by the back-end designer, so tampering with the library may not be a good option. A single switch will usually not be enough to supply an entire power domain, leaving to the back-end team the decision of selecting coarse grain switching or fine grain switching and grid or array topology.

Switch cells have an acknowledge output port. The acknowledge port is usually connected to the PMU to indicate that the power is now stable, or has been removed. This signal is very important to avoid incorrect behaviour. If the PMU instead changed state based on a timer, small manufacturing process variations, which affect wake-up and shutdown times, could make it enter an operative state before the power domain was actually operational, or spend more time than necessary waiting for power up.
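The difference between waiting for the acknowledge and waiting for a fixed timer can be shown with a toy model of two dies whose supply rails settle at different times. Everything here is hypothetical: the cycle counts, the timer value and the way the acknowledge is modelled are chosen only to illustrate the argument.

```python
def wake_with_ack(rail_stable_cycle, timeout_cycles=10_000):
    """Cycle at which the PMU releases the domain when it waits for the ack."""
    for cycle in range(timeout_cycles):
        ack = cycle >= rail_stable_cycle  # modelled ack from the switch cell
        if ack:
            return cycle
    raise TimeoutError("switch never acknowledged")


def wake_with_timer(timer_cycles):
    """Cycle at which the PMU releases the domain when it uses a fixed timer."""
    return timer_cycles


TIMER = 100  # hypothetical fixed wait, in clock cycles
for name, stable_at in {"fast die": 80, "slow die": 120}.items():
    release = wake_with_timer(TIMER)
    if release < stable_at:
        status = "released too early"
    else:
        status = f"{release - stable_at} cycles wasted"
    print(f"{name}: timer -> {status}, "
          f"ack -> released at cycle {wake_with_ack(stable_at)}")
```

In this model the timer either wastes cycles on the fast die or releases the slow die before its rail is stable, while the acknowledge-based wake-up tracks each die.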

At the back-end phase, the decision on which switch topology goes into the design is also made. Most designs use coarse grain power switching because the reduced implementation complexity of fine grain switching does not compensate for its increase in area.

It is up to the back-end engineer to introduce delays between the switches in order to avoid large inrush currents, since this kind of analysis cannot be performed at the synthesis level.

Figure 2.20: Header switch cell

Figure 2.20 represents a PMOS header switch cell. $V_{DD}$ is the input voltage from the power rail. $V_{VDD}$ is the virtual supply that feeds the power domain to be gated. The sleep signal is responsible for controlling the virtual supply rail. The acknowledge port reports the power state back to the power management unit, with the help of a buffer.

2.2.5.10 Cell Location

UPF provides the option of defining the physical location for cell insertion. This is a somewhat important decision since it affects layout complexity. The decision is taken at the RTL level, but it is important that the power architect is aware of the back-end flow so as not to complicate the implementation. The cell location is defined by the -location argument present in the UPF cell insertion commands.

Cells can be inserted in the power domain they belong to, in the parent domain or even in both. When working on IP to be integrated in other designs, it is useful to place the cells in the power domain they belong to, since putting them outside creates an area overhead in the parent design relative to the predicted area of the IP. If the cells are inside the IP, the area estimation already accounts for them. Cells located inside the IP also provide a more abstract model to the designer who integrates the IP: there is no need to worry about power intent, since everything is already implemented inside the IP, which reduces verification and implementation times.

UPF-related cells inside the power domain may, however, lead to a more complex back-end implementation. Isolation cells inside a gated power domain require an extra PG pin connected to an always-on net to power the cell, since the primary power net will be shut off. This means an extra power rail has to be pulled inside the power domain at the layout stage of the design.


Inserting cells in the parent power domain may be a good option for internal power domains. In that case no extra supply rail needs to be pulled inside the power domain, since the parent domain's supply stays on at least as long as that of the domain the isolated signals come from.

2.2.5.11 Input vs Output Strategies

When creating isolation, level shifter or enable level shifter strategies, it is possible to choose whether the strategy applies to the power domain inputs, outputs or both. This is an important decision, since it may avoid unisolated paths or redundant strategies.

As described in the isolation section (2.2.5.4), using the -diff_supply_only true switch when defining an isolation strategy will not insert cells if the output has heterogeneous fan-out. In that case, it is better to define the strategy on the input port of the active power domain, since it is the only power domain that needs isolation or level shifting for that signal.

Figure 2.21: Isolation on input of heterogeneous fan-out

Figure 2.21 is a good example of a case where the strategy should be applied to the input. Figure 2.22 shows the opposite case. Both receiving power domains require isolation, because they are active when the output of the first power domain is corrupt, so it would be better to isolate only the output of the first power domain. Isolating each input instead is an example of redundant isolation and creates unnecessary cells.

Figure 2.23 displays a situation where output isolation is the best option. The output port connects to two power domains, and using input isolation would create an unnecessary extra cell.
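The choice between input and output strategies described above can be sketched as a small decision rule. The Python snippet below is only an illustration: the function name, the domain names and the representation of a signal as a map of sink domains are assumptions made for this sketch, not part of UPF or of any tool. It assumes the driving domain can be OFF and asks, for each sink, whether that sink can be ON at the same time.

```python
def choose_isolation_side(sinks):
    """Pick where to isolate a signal whose driver domain can be OFF.

    `sinks` maps each sink domain name (hypothetical names) to True if
    that domain can be ON while the driver is OFF, i.e. if it would see
    a corrupt value. Returns (side, cell_locations).
    """
    exposed = [name for name, on_while_driver_off in sinks.items()
               if on_while_driver_off]
    if not exposed:
        # No sink is ever ON while the driver is OFF: nothing to isolate.
        return ("none", [])
    if len(exposed) == len(sinks):
        # Every sink needs protection: one cell on the driver output
        # serves all of them (the situation of Figure 2.23).
        return ("output", ["driver"])
    # Heterogeneous fan-out: isolate only the inputs that need it
    # (the situation of Figure 2.21).
    return ("input", exposed)


fig_2_21 = choose_isolation_side({"PD_ON": True, "PD_OFF": False})
fig_2_23 = choose_isolation_side({"PD_ON_1": True, "PD_ON_2": True})
print(fig_2_21)
print(fig_2_23)
```

With two active sinks, the rule returns a single output cell, whereas input isolation would need one cell per sink, which is the redundancy shown in Figure 2.22.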

Figure 2.22: Redundant isolation

Figure 2.23: Output isolation

2.2.5.12 Power State Table

The power state table is a very important aid to verification. It has no physical implementation: it is only a table that defines all possible voltages that can be applied to the power domains. If, during simulation, a power domain enters a state that is not defined in the power state table, it is said to be in an illegal state and triggers an error, causing the simulation to fail.

The power state table (PST) can contain several states and several supply sets. The power architect should write all possible states of the power domains in the power state table, although it is also possible to have several power state tables in the same design. Having several power state tables allows unrelated power domains to operate independently. All related power domains should be included in the same table, so that bugs in the power intent can be caught.

Values in the power state table are real numbers and define the voltage applied to the supply net. A zero in the PST does not mean the net is off; it means the defined voltage is zero. A ground net defined as 0 is therefore ON. Gated nets are defined in the power state table as "OFF".

Table 2.2 defines the possible states of two power domains, PDA and PDB. This table has three possible states: PS_ALL_ON, PS_ALL_OFF and PS_LP_1.


Table 2.2: Example of PST

             PDA.primary      PDB.primary
State        power   ground   power   ground
PS_ALL_ON    1.0     0.0      0.8     0.0
PS_ALL_OFF   OFF     0.0      OFF     0.0
PS_LP_1      1.0     0.0      OFF     0.0

In the PS_ALL_ON state, both power domains are on, PDA at 1.0 V and PDB at 0.8 V. When analysing the PST, tools will notice this and check whether level shifters have been inserted on connections between the two power domains.

The PS_ALL_OFF state is usually present in every PST; designs without it risk missing states in the power-up or power-down sequence [7]. The table also shows that header switching was chosen for this particular design, since the supply net that is gated is the power net.

The last state, PS_LP_1, has one power domain active, PDA, and the other one power gated. This means isolation cells are needed on signals going from PDB to PDA. As the two power domains operate at different voltages, enable level shifters should be used instead of a separate isolation cell and level shifter.
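The reasoning applied to Table 2.2 can be sketched in a few lines of Python. This is a simplified illustration and not how any particular tool works: the dictionary layout, the OFF marker and the function name are assumptions of this sketch, and only the power nets are modelled because both ground nets are 0.0 in every state.

```python
OFF = "OFF"  # marker for a gated supply, as in Table 2.2

# Power-net voltage of each domain in each state of Table 2.2.
PST = {
    "PS_ALL_ON":  {"PDA": 1.0, "PDB": 0.8},
    "PS_ALL_OFF": {"PDA": OFF, "PDB": OFF},
    "PS_LP_1":    {"PDA": 1.0, "PDB": OFF},
}


def required_cell(pst, source, sink):
    """Cell type needed on a signal driven from `source` into `sink`."""
    # Isolation: some state has the source gated while the sink is on.
    needs_iso = any(s[source] == OFF and s[sink] != OFF
                    for s in pst.values())
    # Level shifting: some state has both on at different voltages.
    needs_ls = any(s[source] != OFF and s[sink] != OFF
                   and s[source] != s[sink]
                   for s in pst.values())
    if needs_iso and needs_ls:
        return "enable level shifter"
    if needs_iso:
        return "isolation"
    if needs_ls:
        return "level shifter"
    return "none"


pdb_to_pda = required_cell(PST, "PDB", "PDA")
pda_to_pdb = required_cell(PST, "PDA", "PDB")
print(pdb_to_pda)
print(pda_to_pdb)
```

Under these assumptions, signals from PDB to PDA need an enable level shifter, while signals from PDA to PDB only need a level shifter, because no state has PDA gated while PDB is on.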

The power state table also shows that PDA cannot be turned OFF while PDB is ON, since no state describes that combination. Reaching it would violate the power state table and cause the simulation to fail with an illegal state.
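The legality check can be illustrated with a minimal Python sketch of Table 2.2. The data layout and the names lookup_state and is_on are assumptions made for this example; a simulator performs the equivalent check internally and reports the error itself.

```python
OFF = "OFF"  # gated supply; not the same as a driven 0.0 V

SUPPLIES = ("PDA.power", "PDA.ground", "PDB.power", "PDB.ground")

# Rows of Table 2.2, in the order given by SUPPLIES.
PST = {
    "PS_ALL_ON":  (1.0, 0.0, 0.8, 0.0),
    "PS_ALL_OFF": (OFF, 0.0, OFF, 0.0),
    "PS_LP_1":    (1.0, 0.0, OFF, 0.0),
}


def is_on(value):
    """A net is ON whenever it has a defined voltage, even 0.0."""
    return value != OFF


def lookup_state(observed):
    """Return the PST state matching the observed supplies, or None."""
    values = tuple(observed[name] for name in SUPPLIES)
    for state, row in PST.items():
        if row == values:
            return state
    return None  # no matching row: illegal state


legal = lookup_state({"PDA.power": 1.0, "PDA.ground": 0.0,
                      "PDB.power": OFF, "PDB.ground": 0.0})
illegal = lookup_state({"PDA.power": OFF, "PDA.ground": 0.0,
                        "PDB.power": 0.8, "PDB.ground": 0.0})
print(legal, illegal)
```

The second lookup corresponds to PDA gated while PDB is on, and it matches no row of the table. The is_on helper reflects the convention that a ground net at 0.0 is ON, while only the "OFF" entry denotes a gated net.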
