International Journal of Electronics and Computer Science Engineering

Available Online at www.ijecse.org ISSN- 2277-1956

ISSN 2277-1956/V2N2-662-670

Modular BiNoC Architecture Design for Network on Chip

Prof. C.N. Bhoyar 1, Ashish Khodwe 2

1,2 Department of Electronics Engineering

1,2 Priyadarshini College of Engineering, RTMNU, Nagpur, India

1 cnbhoyar@yahoo.com, 2 khodweashish@gmail.com

Abstract- The advent of deep sub-micron technology has highlighted the criticality of on-chip interconnects. As diminishing feature sizes have led to increases in global wiring delays, Network-on-Chip (NoC) architectures are viewed as a possible solution to the wiring challenge and have crystallized into a significant research thrust. Both NoC performance and energy budget depend heavily on the routers' buffer resources. An NoC is an efficient on-chip communication architecture for SoC designs. This paper presents a VHDL-based model of a novel bidirectional-channel network-on-chip (BiNoC) architecture to enhance the performance of on-chip communication. The flexibility added by the BiNoC organization promises better bandwidth utilization, lower packet delivery latency, and higher packet consumption rate. A novel on-chip router architecture is developed to support dynamic self-reconfiguration of the bidirectional traffic flow. This area-efficient BiNoC router delivers better performance and requires a smaller buffer size than a conventional NoC router. The flow direction at each channel is controlled by a channel direction control (CDC) algorithm, which is implemented with a pair of finite state machines. The CDC algorithm provides high performance and is free of deadlock and starvation. We implemented a parameterized register transfer level (RTL) design of the NoC architecture elements.

Keywords- Interconnection networks, multiprocessor systems-on-chip (MPSoCs), networks-on-chip (NoCs), on-chip communication, reconfigurable architectures.

I INTRODUCTION

Several multi-core integrated circuit designs, such as a 64-core SoC and an 80-core NoC architecture [1][12], have been proposed recently. Such a many-core system requires high-performance interconnections to transfer data among the cores on the chip. Traditional system components interface with the interconnection backbone via a bus interface. Propagation delay, power dissipation, and reliability will be the serious issues of global wires in deep submicron VLSI technology. Therefore, reusable on-chip bus interconnect templates such as ARM's AMBA [3] and IBM's CoreConnect [4] are commonly used in current MP-SoC and CMP designs. However, an on-chip bus allows only one communication transaction at a time, according to the arbitration result. The scalability and success of switch-based networks and packet-based communication in parallel computing and the Internet have inspired researchers to propose the Network-on-Chip (NoC) architecture as a viable solution to complex on-chip communication problems [5].

Therefore, there is growing interest in NoC research [4][6] because of its impact on next-generation SoC designs. Dally and Towles proposed replacing dedicated, design-specific wires with a general-purpose, packet-switched network [18], marking the beginning of the network-on-chip (NoC) era. Scalable NoCs are needed to provide a high-bandwidth communication infrastructure for SoCs [7][8].


A bidirectional channel network-on-chip (BiNoC) architecture is proposed in this paper to enhance the performance of on-chip communication. In a BiNoC, each communication channel can be dynamically reconfigured to transmit flits in either direction. This added flexibility promises better bandwidth utilization, lower packet delivery latency, and higher packet consumption rate. A novel on-chip router architecture is developed to support dynamic self-reconfiguration of the bidirectional traffic flow. The flow direction at each channel is controlled by a channel-direction-control (CDC) protocol, implemented with a pair of finite state machines. This protocol is shown to be of high performance, free of deadlock, and free of starvation. This paper presents a VHDL-based, cycle-accurate register transfer level model for evaluating the latency, throughput, and dynamic and leakage power consumption of NoC-based interconnection architectures. We implemented a parameterized register transfer level design of the NoC architecture elements.

The key technical contribution of this paper is as follows.

A new area-efficient BiNoC router architecture that uses a smaller buffer size than a conventional unidirectional NoC router while delivering better performance.

The rest of this paper is organized as follows. In Section 2, we discuss background material on NoC architecture and prior related research. A bidirectional network-on-chip (BiNoC) architecture is given in Section 3. Section 4 describes the bidirectional CDC protocol in detail. In Section 5, experimental results comparing the performance of the proposed BiNoC architecture against the conventional NoC architecture are provided. In the last section, brief statements conclude this paper.

II BASELINE NOC ROUTER

NoC designs must be flexible enough to cover a certain range of applications. Most of the state-of-the-art NoC designs use a mesh or torus topology because of its performance benefits and high degree of scalability for two-dimensional systems, yet it may not achieve the best performance for a single application [3][6]. A conventional router design consists of circuit-switched fabrics and an arbitration controller. In each arbitration decision, more than one path can be constructed by the crossbar switch as long as no contention exists between these paths. For most existing switch designs, the virtual-channel flow-control-based router design, which provides better flexibility and channel utilization with smaller buffer size, is a well-known technique from the domain of multiprocessor networks [12][13].

Figure 1. Typical NoC architecture in mesh topology

Motivation

In a conventional NoC architecture, each pair of neighboring routers uses two unidirectional channels in opposite directions to propagate data on the network, as shown in Figure 3(a). In our BiNoC architecture, to enable the most bandwidth utilization, data channels between each pair of routers should be able to transmit data in any direction at each run cycle. That is, four kinds of channel-direction combinations should be allowed for data transmission, as shown in Figure 3(b).

Fig. 3. Channel direction in typical NoC and proposed BiNoC

However, current unidirectional NoC architectures, when facing applications that have [...] cannot achieve the high bandwidth utilization objective. The unidirectional channel structure [...] fully utilize available resources (channel bandwidth) and may cause longer latency. This [...]
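The four channel-direction combinations mentioned above follow from giving each of the two channels between a router pair an independent direction. A minimal sketch of that enumeration is shown here; the router labels `A` and `B` and the helper names are illustrative assumptions, not identifiers from the paper.

```python
from itertools import product

# Two neighboring routers, called A and B here purely for illustration.
# In a BiNoC each physical channel can carry flits in either direction.
DIRECTIONS = ("A->B", "B->A")


def channel_direction_combinations(num_channels: int = 2):
    """Enumerate every assignment of a direction to each channel."""
    return list(product(DIRECTIONS, repeat=num_channels))


def bandwidth_per_direction(combo):
    """Count how many channels currently serve each direction."""
    return {direction: combo.count(direction) for direction in DIRECTIONS}


combos = channel_direction_combinations()
for combo in combos:
    print(combo, bandwidth_per_direction(combo))
```

A conventional NoC is fixed to the single combination with one channel per direction, while the other assignments let both channels serve whichever direction has traffic.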

Fig. 5. Modified four-stage pipelined router architecture for our proposed BiNoC router with VC flow-control technique.

A. Reconfigurable Input/Output Ports

As shown in Fig. 4, one of the input/output ports is designated as a high priority (HP) port, [...] as a low priority (LP) port. Each of the two bidirectional channels between a pair of routers [...] transmission direction based on a distributed channel control protocol. When both [...] transmission direction, two data packets could be sent concurrently, which effectively doubles [...] Fig. 3 shows the detailed schematic of the in-out ports implementation in BiNoC. As long [...] are assigned properly, no conflict and no unpredictable situation will occur. Instead of using [...] output port, within BiNoC, each port can be either an input port or an output port controlled [...] generated from the channel control module.

Fig. 4. Schematic of a bidirectional link implemented in BiNoC architecture.

B. Channel Control Module

The channel control module has two major functions: 1) determine channel direction at run time, and 2) output an arb−req signal to the SA at the router to handle channel allocation. The channel control module is realized with a finite state machine (FSM) consisting of three states. Details of the operations of the channel control module will be discussed in the next section. Each channel is connected to two FSMs, an HP FSM and an LP FSM, corresponding to the HP and LP ports of each router. The two FSMs exchange control information through a pair of doubly buffered handshaking signals: input−req (input request) and output−req (output request). output−req = 1 when the sending-end router has a data packet to transmit. The output−req signal from one router becomes the input−req signal to the FSM of the other router after two clock cycles due to the presence of two buffering flip-flops. Each FSM also receives a channel−req (channel request) signal from the internal RC module. channel−req = 1 when a data packet in the local router is requesting the current channel for forwarding data. However, if the downstream input buffer is full, channel−req will be reset to 0.

C. VC Flow-Control

The WH flow-control-based BiNoC router architecture proposed in [15] needs two separate input buffers in each direction to receive packets simultaneously. In this paper, however, we improve the channel utilization flexibility by intentionally sharing the access authority of two input buffers between the two in-out ports in the same direction, so that two input buffers (or even multiple VCs) can be multiplexed on two physical channels in each direction, as shown in Fig. 3.

D. Virtual Channel Allocator

The VA module matches resource requests from input VCs to available VCs at downstream routers. To conserve hardware cost, a separate allocator is used to perform VC allocation in [14]. Given a routing result calculated by the RC module, VA needs the first-stage arbiter at each input VC to select one requested VC at the downstream router. Then, at the second stage, the request contentions between input VCs are resolved as illustrated in Fig. 6(a). This VC allocator architecture is identical to the conventional VC flow-control-based router [3][14].

E. Switch Allocator

The SA allocates a time slot of the crossbar switch to move flits from an input VC to an output [...] conventional NoC, SA is accomplished with two-stage arbitration, where the arbiter has [...] lines and output lines, and its output is the grant signal as discussed in [14]. As illustrated in [...]
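The handshake timing described for the channel control module can be sketched as follows: a request driven by one router crosses two flip-flops and is seen by the neighboring FSM two clock cycles later, and the local channel request is suppressed while the downstream input buffer is full. The class and function names are hypothetical, and the model is a behavioral illustration in Python rather than the paper's VHDL.

```python
from collections import deque


class BufferedWire:
    """A signal that crosses `stages` flip-flops before reaching the far side."""

    def __init__(self, stages: int = 2):
        # All flip-flops are assumed to reset to 0.
        self._regs = deque([0] * stages, maxlen=stages)

    def clock(self, value_in: int) -> int:
        """Advance one cycle: emit the oldest stored value, capture the new one."""
        value_out = self._regs[0]
        self._regs.append(value_in)
        return value_out


def channel_req(packet_wants_channel: bool, downstream_buffer_full: bool) -> int:
    """Local request for the channel, reset to 0 when the downstream buffer is full."""
    return int(packet_wants_channel and not downstream_buffer_full)


# output_req as driven by the sending router, one value per clock cycle.
sent_output_req = [1, 1, 0, 0, 0]
wire = BufferedWire(stages=2)
# input_req as observed by the FSM in the neighboring router.
seen_input_req = [wire.clock(v) for v in sent_output_req]
print(seen_input_req)
```

The observed sequence is the sent sequence shifted by two cycles, which is the latency any direction-switching decision has to tolerate.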

3) Wait state: an intermediate state preparing the transition from the idle state with an input channel direction to the free state with an output channel direction. The operations of the HP FSM and the LP FSM are discussed below.

Fig. 7. (a) FSM for HP port and (b) FSM for LP port for bidirectional channels between two neighboring routers

Comparison with Existing Architectures

Typical-NoC represents a conventional NoC using fixed unidirectional channels between routers. For each direction, one input channel and one output channel are used to connect two neighboring routers. We also studied Typical-NoC-double, which uses a 64-flit buffer in each input channel for comparison. In our proposed BiNoC architecture, two bidirectional channels are used to connect the two neighboring routers in each direction. In order to inspect the performance trend among different NoC architectures, we studied a Reduced-BiNoC, which uses only one bidirectional channel for each direction (that is, it uses five bidirectional channels in total). For fair comparison in buffer resources, Typical-NoC, BiNoC, and Reduced-BiNoC listed in Table I were all implemented with a total buffer size of 160 flits. Only Typical-NoC-double has doubled its total buffer size to 320 flits. Since the number of input buffers is doubled in our BiNoC, the size of each buffer in BiNoC is reduced to 16 flits only.

Table I. Comparison with existing NoC router architecture [17]

Architecture Resource      Typical NoC   Typical NoC-double   Reduced BiNoC   Normal BiNoC
Total number of buffers    5             5                    5               [...]


IJECSE, Volume 2, Number 2

Prof. C.N. Bhoyar and Ashish khodwe


Area breakdown of different NoC architectures

Table II shows the area breakdown of different NoC architectures [17]

Architecture     Buf. Size/   Buf. Depth   Ch./Dir.   Crossbar   Freq. (MHz)   Normalized Cycle
BiNoC−WH(32)     32 flits     16 flits     2-inout    10 × 10    921           0.81
BiNoC−2VC(32)    32 flits     16 flits     2-inout    10 × 10    666           1.12
BiNoC−3VC(48)    48 flits     16 flits     2-inout    10 × 10    637           1.17
BiNoC−4VC(32)    32 flits     8 flits      2-inout    10 × 10    627           1.19
BiNoC−4VC(64)    64 flits     16 flits     2-inout    10 × 10    561           1.33
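The normalized cycle column appears to be a reference clock frequency divided by each router's frequency. The reference is not stated in this section; the value of 746 MHz used below is an assumption inferred from the listed numbers, since it reproduces every row to two decimals. The sketch checks that relationship.

```python
# (architecture, frequency in MHz, normalized cycle) as listed in Table II.
TABLE_II = [
    ("BiNoC-WH(32)", 921, 0.81),
    ("BiNoC-2VC(32)", 666, 1.12),
    ("BiNoC-3VC(48)", 637, 1.17),
    ("BiNoC-4VC(32)", 627, 1.19),
    ("BiNoC-4VC(64)", 561, 1.33),
]

# Assumption: reference frequency inferred from the table, not given in the text.
ASSUMED_BASELINE_MHZ = 746.0


def normalized_cycle(freq_mhz: float, baseline_mhz: float = ASSUMED_BASELINE_MHZ) -> float:
    """Cycle time relative to the reference: a slower clock gives a value above 1."""
    return baseline_mhz / freq_mhz


for name, freq, listed in TABLE_II:
    print(f"{name}: computed {normalized_cycle(freq):.2f}, listed {listed:.2f}")
```

Under this reading, the wormhole router is the only configuration with a shorter cycle than the reference, and each added virtual channel or deeper buffer lengthens the cycle.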

V EXPERIMENTAL RESULTS

Performance Evaluation

In this section, we present a simulation-based performance evaluation of our architecture, the BiNoC router with VC flow-control technique, in terms of network latency and energy consumption. We describe our experimental methodology and detail the procedure followed in the evaluation of these architectures.

Simulation Platform

A cycle-accurate NoC simulator was developed in order to conduct a detailed evaluation of the router architectures. The simulator operates at the granularity of individual architectural components, accurately emulating the major hardware components. The simulation test-bench models both the routers and the interconnection links, conforming to the implementation of various NoC architectures. The simulator is fully parameterizable, allowing the user to specify parameters such as network size, topology, switching mechanism, routing algorithm, number of VCs per PC, number of PCs, buffer depth, PE injection rate, injection traffic type, flit size, and number of flits per packet. The simulator models each individual component within the router architecture, allowing for detailed analysis of component utilization and flit flow through the network. The activity factor of each component is used for analyzing power consumption within the network. We assume that link propagation happens within a single clock cycle. In addition to the network-specific parameters, our simulator accepts hardware parameters such as power consumption (dynamic and leakage) for each component and the overall clock frequency. These parameters are extracted from hardware synthesis tools and back-annotated into the simulator for power profile analysis of the entire on-chip network.
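The parameter list above can be pictured as a single configuration record. The sketch below is a hypothetical illustration: the class name, field names, and most default values are assumptions chosen for the example (only the 16-flit packet length, the 0.322 injection rate, and the 500 MHz clock echo figures quoted elsewhere in this paper), and it assumes the buffer depth applies per virtual channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NocSimConfig:
    """Hypothetical parameter set mirroring the simulator options listed in the text."""
    network_size: tuple = (4, 4)      # placeholder mesh dimensions
    topology: str = "mesh"
    switching: str = "wormhole"
    routing_algorithm: str = "xy"     # placeholder choice
    vcs_per_pc: int = 4               # virtual channels per physical channel
    num_pcs: int = 5                  # physical channels per router
    buffer_depth_flits: int = 16      # assumed to be the depth of each VC buffer
    injection_rate: float = 0.322     # flits per cycle per PE
    traffic_type: str = "regional"
    flit_size_bits: int = 32          # placeholder flit width
    flits_per_packet: int = 16
    clock_mhz: float = 500.0

    def __post_init__(self):
        if not 0.0 <= self.injection_rate <= 1.0:
            raise ValueError("injection_rate must lie in [0, 1]")
        if min(self.vcs_per_pc, self.num_pcs, self.buffer_depth_flits,
               self.flit_size_bits, self.flits_per_packet) < 1:
            raise ValueError("counts and sizes must be positive")

    @property
    def packet_bits(self) -> int:
        """Payload size of one packet in bits."""
        return self.flit_size_bits * self.flits_per_packet

    @property
    def router_buffer_flits(self) -> int:
        """Total buffer capacity of one router in flits."""
        return self.num_pcs * self.vcs_per_pc * self.buffer_depth_flits


config = NocSimConfig()
print(config.packet_bits, config.router_buffer_flits)
```

Keeping the record immutable makes it straightforward to sweep one parameter at a time while holding the rest of a run fixed.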

Area Measurement

The area of the NoC router architectures is reported in terms of logic gate count and percentage, as calculated by Synopsys Design Compiler [17].

Area breakdown of BiNoC_4VC

Table III shows Area breakdown of BiNoC_4VC [17]

Component buff. BiNoC_4VC(16) 4 flits x 4

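Comparison with existing NoC router architecture [17] (remaining rows of Table I)

Architecture Resource   Typical NoC   Typical NoC-double   Reduced BiNoC   Normal BiNoC
Buffers/Direction       1             1                    1               2
Total Channels          5-in 5-out    5-in 5-out           5-inout         10-inout
Channels/Direction      1-in 1-out    1-in 1-out           1-inout         2-inout
Each Buffer Size        32 flits      64 flits             32 flits        16 flits
Total Buffer Size       160 flits     320 flits            160 flits       160 flits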
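The total buffer sizes in the comparison follow directly from the per-direction figures. The short check below assumes a five-port router (five directions), as implied by the channel counts; the dictionary layout and function name are illustrative.

```python
# Assumption: a five-port router, i.e. five directions per router.
DIRECTIONS_PER_ROUTER = 5

# architecture -> (buffers per direction, flits per buffer), taken from the rows above.
ARCHITECTURES = {
    "Typical NoC": (1, 32),
    "Typical NoC-double": (1, 64),
    "Reduced BiNoC": (1, 32),
    "Normal BiNoC": (2, 16),
}


def total_buffer_flits(buffers_per_direction: int, flits_per_buffer: int) -> int:
    """Total buffer capacity of one router in flits."""
    return DIRECTIONS_PER_ROUTER * buffers_per_direction * flits_per_buffer


totals = {name: total_buffer_flits(*spec) for name, spec in ARCHITECTURES.items()}
for name, flits in totals.items():
    print(f"{name}: {flits} flits")
```

The normal BiNoC doubles the number of buffers per direction but halves each buffer, so its total matches the typical NoC at 160 flits.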


In addition, as illustrated in the tables above, as buffer size increases, the relative (percentage) area overhead due to the crossbar switch decreases. Since buffer memory dominates the cost of the router, we conclude that adopting a smaller buffer size in a BiNoC router than in a typical NoC (T-NoC) router achieves the goal of hardware saving while delivering better performance.

Power Measurement

The power breakdown across the major components of various router architectures (each running at its maximum supported frequency) was calculated by Synopsys PrimePower under regional traffic with a flit injection rate of 0.322, at which the latency results of all selected router architectures still behaved normally before reaching saturation throughput [10]. Packets of 16 flits with random payload and a 50% switching activity factor were used in this experiment. Power measurements were performed for 10,000 cycles, of which the initial 1,000 cycles were used as warm-up time. Power consumption of each router running at 500 MHz
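The measurement procedure above discards a warm-up period before averaging. A minimal sketch of that bookkeeping is given here; the function names and the synthetic power trace are assumptions for illustration, and the 500 MHz clock is taken from the caption fragment above rather than from a stated simulation setting.

```python
def mean_power_after_warmup(samples, warmup_cycles: int = 1_000) -> float:
    """Average per-cycle power over the samples that follow the warm-up period."""
    measured = samples[warmup_cycles:]
    if not measured:
        raise ValueError("no samples remain after discarding the warm-up period")
    return sum(measured) / len(measured)


def measurement_window_seconds(total_cycles: int = 10_000,
                               warmup_cycles: int = 1_000,
                               clock_hz: float = 500e6) -> float:
    """Wall-clock length of the measured window at the given clock frequency."""
    return (total_cycles - warmup_cycles) / clock_hz


# Synthetic trace in arbitrary units: a higher start-up transient, then steady state.
trace = [2.0] * 1_000 + [1.0] * 9_000
steady_state_power = mean_power_after_warmup(trace)
window = measurement_window_seconds()
print(steady_state_power, window)
```

With these figures the measured window covers 9,000 cycles, which corresponds to 18 microseconds at 500 MHz.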

VI CONCLUSIONS

The continuing technology shrinkage into the deep sub-micron era has magnified the delay mismatch between gates and global wires. Wiring will significantly affect design decisions in the forthcoming billion-transistor chips, whether these are complex heterogeneous SoCs or Chip Multi-Processors (CMPs). Networks-on-Chip (NoCs) have surfaced as a possible solution to escalating wiring delays in future multi-core chips. NoC performance is directly related to the routers' buffer size and utilization. In the first part of this paper, we introduced the details of an on-chip interconnection framework, namely the network-on-chip (NoC), used in the design of multiprocessor system-on-chip (MPSoC) and chip multiprocessor (CMP) architectures. We then introduced a self-configurable architecture, called the Dynamic Self-Reconfigurable BiNoC Architecture, in which each communication channel can be dynamically reconfigured to transmit flits in either direction.

REFERENCES

[1] J. D. Owens, W. J. Dally, R. Ho, D. N. Jayasimha, S. W. Keckler, and L. S. Peh, “Research challenges for on chip interconnection networks,” IEEE Micro, vol. 27, no. 5, pp. 96–108, Nov. 2007.

[2] W. J. Dally and B. Towles, “Route packets, not wires: On-chip interconnection networks,” in Proc. DAC, Jun. 2001, pp. 684–689.

[3] W. J. Dally and B. Towles, Principles and Practices of Interconnection Networks. San Mateo, CA: Morgan Kaufmann, 2004.

[4] T. Bjerregaard and S. Mahadevan, “A survey of research and practices of network-on-chip,” ACM Comput. Surveys, vol. 38, no. 1, pp. 1–51, Mar. 2006.

[5] L. Benini and G. DeMicheli, “Networks on chips: A new SoC paradigm,” IEEE Comput., vol. 35, no. 1, pp. 70–78, Jan. 2002.

[6] U. Ogras, J. Hu, and R. Marculescu, “Key research problems in NoC design: A holistic perspective,” in Proc. CODES ISSS, Sep. 2005, pp. 69–74.

[7] R. Marculescu, U. Y. Ogras, L. S. Peh, N. E. Jerger, and Y. Hoskote, “Outstanding research problems in NoC design: System, microarchitecture and circuit perspectives,” IEEE Trans. Comput.-Aided Des., vol. 28, no. 1, pp. 3–21, Jan. 2009.

[8] J. D. Owens, W. J. Dally, R. Ho, D. N. Jayasimha, S. W. Keckler, and L. S. Peh, “Research challenges for on chip interconnection networks,” IEEE Micro, vol. 27, no. 5, pp. 96–108, Nov. 2007.

[9] L. Shang, L. S. Peh, and N. K. Jha, “Powerherd: A distributed scheme for dynamically satisfying peak-power constraints in interconnection networks,” IEEE Trans. Comput.-Aided Des., vol. 25, no. 1, pp. 92–110, Jan. 2006.

[10] A. Banerjee, P. T. Wolkotte, R. D. Mullins, S. W. Moore, and G. J. M. Smit, “An energy and performance exploration of network-on-chip [...]

[11] H. Ito, M. Kimura, K. Miyashita, T. Ishii, K. Okada, and K. Masu, “A bidirectional and multi-drop-transmission-line interconnect for multipoint-to-multipoint on-chip communications,” IEEE J. Solid-State Circuits, vol. 43, no. 4, pp. 1020–1029, Apr. 2008.

[12] C. A. Nicopoulos and D. Park, “ViChaR: A dynamic virtual channel regulator for network-on-chip routers,” in Proc. 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO'06), 2006.

[13] M. Lis, K. S. Shim, M. H. Cho, and S. Devadas, “Guaranteed in-order packet delivery using exclusive dynamic virtual channel allocation,” Massachusetts Inst. Technol., Boston, Tech. Rep. CSAIR-TR-2009–036, Aug. 2009.

[14] J. Lillis and C. Cheng, “Timing optimization for multisource nets: Characterization and optimal repeater insertion,” IEEE Trans. Comput.-Aided Des., vol. 18, no. 3, pp. 322–331, Mar. 1999.

[15] Y. C. Lan, S. H. Lo, Y. C. Lin, Y. H. Hu, and S. J. Chen, “BiNoC: A bidirectional NoC architecture with dynamic self-reconfigurable channel,” in Proc. NOCS, May 2009, pp. 266–275.

[16] W.-C. Tsai, Y.-C. Lan, Y.-H. Hu, and S.-J. Chen, “Networks on chips: Structure and design methodologies,” Journal of Electrical and Computer Engineering, vol. 2012, Article ID 509465, 15 pages, doi:10.1155/2012/509465.

[17] Y.-C. Lan, H.-A. Lin, S.-H. Lo, Y. H. Hu, and S.-J. Chen, “A bidirectional NoC (BiNoC) architecture with dynamic self-reconfigurable channel,” IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., vol. 30, no. 3, pp. 427–440, Mar. 2011.

[18] M. A. Al Faruque, T. Ebi, and J. Henkel, “Configurable links for runtime adaptive on-chip communication,” in Proc. DATE, Apr. 2009, pp.
