CoreConnect FAQ
Related links: CoreConnect Bus Architecture


Questions
Click on a question to go to the answer.

  1. What is the purpose of having the CoreConnect bus architecture segmented into three parts -- PLB, OPB and DCR bus?

  2. Why do the CoreConnect core specifications not always agree with the corresponding architecture specifications? For example, the PLB arbiter core specification indicates that it supports up to eight masters while the PLB architecture specification says that PLB supports up to 16 masters.

  3. What is the difference in the major revision levels of the Processor Local Bus (PLB 2.X, PLB 3.X, PLB 4.X) specifications?

  4. Has the On-chip Peripheral Bus -- OPB specification been enhanced or extended?

  5. How is CoreConnect different from common board / system level bus architectures?

  6. Are there any reference designs for system-on-chip ASICs built around the CoreConnect architecture?

  7. What is the procedure for licensing CoreConnect?

  8. So, PLB allows 32-bit masters and slaves to be connected to 64-bit and 128-bit busses and allows 64-bit masters and slaves to be connected to 128-bit busses. Does that mean that there are no restrictions on mixing masters and slaves on PLB?

  9. How many PLB masters can be attached to a PLB bus complex?

  10. How many PLB slaves can be attached to a PLB bus complex?

  11. How many OPB masters can be attached to an OPB bus complex?

  12. How many OPB slaves can be attached to an OPB bus complex?

  13. Can I have multiple masters on a DCR bus?

  14. On PLB 4, why does the address phase of a bus transaction take three bus clock cycles? This is slower than in previous versions of PLB.

  15. In PLB write transactions, the first data acknowledge can occur in the same cycle as the address acknowledge. But, in a read transaction, the first data acknowledge must wait until two clock cycles after the address acknowledge -- why?

  16. How are the PLB attributes ("PLB_TAttributes") used by the CoreConnect bus elements?

  17. Why don't PLB master and slave cores generally support byte burst and half-word (16-bit) burst transfers?

  18. What is the purpose of the CoreConnect toolkits?

  19. What are the components of the CoreConnect toolkits?

  20. What is Bus Functional Language -- BFL?

  21. Can parameters be substituted for wait(level=31), for instance wait(level=endoftest), where 'endoftest' is a parameter with value=31?

  22. At the end of our tests we need to compare the contents of a PLB Slave's memory with expected results. Is there any way to do this without requiring the use of a PLB master that uses actual bus cycles? We would like to be able to directly query the PLB Slave's memory contents.

  23. We would like to code some of our BFL bus transactions using a base-address + offset. Is there a way, for instance, to write a command like this?
    mem_init(addr=base + 0,data=5a5a5a5a)
    mem_init(addr=base + 4,data=5a5a5a5a)
    mem_init(addr=base + 8,data=5a5a5a5a)
    . . . . .

  24. We've seen a ".bfl_on" and ".bfl_off" construct in an app note, but can't find it in any other toolkit documentation. We're wondering how this can be used, and whether it can be used to embed flow control capability within a BFL file.

  25. We use the PLB toolkit to verify PLB compliance of new IP cores for the library. We would like to know what the PLB monitor actually checks and what it does not check so that we can find solutions to the items not supported by the PLB monitor.

  26. For PLB, what is the difference between fixed and variable length burst transfers?

  27. For PLB, can a fixed length burst transfer be terminated early?

  28. For PLB, what is the difference between burst transfers and line transfers?

  29. For PLB, what is unique about DMA transfer types?

  30. In the PLB arbiter core control register (PACR), what is the function of the high bus utilization (HBU) setting?

  31. The PLB bus interfaces of the PowerPC 405 CPU core are designed to, and compliant with, the PLB 3 architecture specification. How can the core be used in an ASIC using a PLB 4 implementation? There are differences in how the PLB time-out is handled between PLB versions 3 and 4.

  32. Can a CoreConnect based design have multiple instantiations of PLB and / or OPB buses?

  33. What are the required clocking relationships between elements of a CoreConnect bus implementation? Can bus clocks be run at any frequency?
  66.  
Answers

Q_1: What is the purpose of having the CoreConnect bus architecture segmented into three parts -- PLB, OPB and DCR bus?

A_1: Each bus component of the CoreConnect architecture is optimized to achieve specific design goals for an on-chip bus architecture. The prime goal of PLB is to provide a high-bandwidth, low-latency connection between the bus agents that are the main producers and consumers of bus transaction traffic. The prime goal of OPB is to provide a flexible connection path to peripherals and memory of various bus widths and transaction timing requirements while having minimal performance impact on the PLB. The DCR (Device Control Register) bus is a mechanism for removing device configuration slave loads, memory address resource use and configuration transaction traffic from the main system busses. Most traffic on the DCR bus occurs during the system initialization period; however, some elements such as the DMA controller and interrupt controller cores use the DCR bus to access normal functional registers during operation.



Q_2: Why do the CoreConnect core specifications not always agree with the corresponding architecture specifications? For example, the PLB arbiter core specification indicates that it supports up to eight masters while the PLB architecture specification says that PLB supports up to 16 masters.

A_2: As with other architectures, implementations of an architecture can be created that do not invoke all aspects of that architecture. For example, it is acceptable for a PLB master or slave to not generate / respond to all PLB transaction types. The PowerPC 405 CPU core does not generate PLB burst transactions. It uses PLB line transactions for blocks (cache lines) of data.



Q_3: What is the difference in the major revision levels of the Processor Local Bus (PLB 2.X, PLB 3.X, PLB 4.X) specifications?

A_3: Each major version release of the PLB architecture represents a set of architectural enhancements, most of which are oriented towards increasing levels of performance.

  • Version unique characteristics of PLB 2.X are: 32-bit data bus, 32-bit address bus, single cycle address phase, two level address pipelining
  • Version unique characteristics of PLB 3.X are: 64-bit data bus, 32-bit address bus, two cycle address phase, two level address pipelining
  • Version unique characteristics of PLB 4.X are: 128-bit data bus, up to 64-bit address bus, three cycle address phase, "n" level address pipelining, enhanced error reporting



Q_4: Has the On-chip Peripheral Bus -- OPB specification been enhanced or extended?

A_4: Yes, the OPB architecture specification was enhanced in OPB version 2.X to provide support for 64-bit addresses and a 64-bit data bus. It also added support for byte enables as an alternative to dynamic bus sizing for transactions smaller than the data bus width.



Q_5: How is CoreConnect different from common board / system level bus architectures?

A_5: CoreConnect was optimized for on-chip interconnection as opposed to board / system interconnections. Key differences include:

  • Larger bus width -- in general, connections on chip (chip wiring interconnect) are less expensive than connections between chips (package pins) and across board backplanes.
  • Higher clock speeds -- interconnect distances are shorter on chip than between packaged devices.
  • Lower power consumption -- package and board interconnect typically use more power than on-chip interconnect because the parasitic load is greater.
  • Manufacturing costs -- chip wiring is cheaper than board interconnects.
  • Reliability -- chip wiring is less failure prone than package and board interconnects.



Q_6: Are there any reference designs for system on chip ASICs built around CoreConnect architecture?

A_6: IBM World Wide Design Centers have a growing family of PowerPC and CoreConnect based System-on-Chip reference designs that represent a starting point for the functional architecture of a large number of ASIC designs. The first available reference design is the 405 PBD (platform based design). It is composed of a PowerPC 405 core and the 64-bit implementation of CoreConnect PLB. A new design point is in progress based on the PowerPC 440 core and the 128-bit version of CoreConnect PLB. The reference designs are available to IBM ASIC and CSSP customers through Design Center representatives.



Q_7: What is the procedure for licensing CoreConnect?

A_7: CoreConnect licenses are available at no cost to businesses and academic institutions on request through the IBM PowerPC Applications group at ppcsupp@us.ibm.com. Requests will be responded to with an email form to collect the necessary company / institution data.



Q_8: So, PLB allows 32-bit masters and slaves to be connected to 64-bit and 128-bit busses and allows 64-bit masters and slaves to be connected to 128-bit busses. Does that mean that there are no restrictions on mixing masters and slaves on PLB?

A_8: There are restrictions. First, the PLB bus arbiter must be as wide as the widest agent on the bus. Next, the various masters and slaves must be designed to support transactions with elements of other sizes. This adds complexity to a master that supports transactions with a smaller slave, in that the master must perform extra "conversion" cycles to communicate with the smaller slave. For example, a 128-bit master must perform a regular cycle plus three conversion cycles to access a 32-bit slave. This effectively provides some level of dynamic bus sizing. The advantage is that legacy cores may be used on a newer version of the PLB in some cases. A disadvantage would be a case where a significant amount of the traffic on a 128-bit PLB was communication with a 32-bit master or slave; in that case, 75% of the data bandwidth would be wasted.



Q_9: How many PLB masters can be attached to a PLB bus complex?

A_9: The PLB architecture allows up to 16 masters on a PLB bus complex instantiation. PLB arbiter cores to date have typically provided for support of up to eight masters on a single bus segment. Newer cross-bar arbiter cores will provide for up to 12 masters on a two-way cross-bar. In typical applications, putting more than eight masters on a single segment bus would tend to lead to bus saturation and extended latency times with typical bus traffic patterns.



Q_10: How many PLB slaves can be attached to a PLB bus complex?

A_10: With the existing PLB arbiters, there is not a logical limit to the number of "non-pipelining" slaves, however, slaves that support address pipelining are limited to eight. Electrical loading on the bus must also be considered. Excessive loading will require rebuffering logic to be inserted and this will lower the achievable bus clock speed.



Q_11: How many OPB masters can be attached to an OPB bus complex?

A_11: The existing arbiter supports four masters on an OPB segment. Most chip implementations to date have used one or two master slots. For applications requiring many OPB masters, the straightforward option is to have multiple OPB segments instantiated on the chip.



Q_12: How many OPB slaves can be attached to an OPB bus complex?

A_12: There is no logical limit to the number of OPB slaves on a bus segment. The main limit would be due to electrical loading. Bus speed could be traded off for an increase in the number of loads. Multiple OPB segments could also be used to mitigate the loading problem.



Q_13: Can I have multiple masters on a DCR bus?

A_13: The DCR bus has no arbiter and is designed for a single master. Normally this is a PowerPC 4XX CPU core. Access to the DCR bus is provided through a software mechanism separate from memory-space loads and stores in the PowerPC programming model (dedicated DCR access instructions).



Q_14: On PLB 4, why does the address phase of a bus transaction take three bus clock cycles? This is slower than in previous versions of PLB.

A_14: The address request-acknowledge path through the bus architecture is the longest path. In order to run the bus at higher clock rates and meet the synchronous bus timings, it was necessary to add clock cycles to the transaction address phase. The data phase of the transaction is still implemented with single clock cycle timing. For burst and line transactions that have multiple beats (data acknowledges), the impact of a multi-cycle address phase is less significant. Also, address pipelining, commonly used in PLB implementations, effectively hides the multi-cycle address phase for normal traffic mix scenarios.



Q_15: In PLB write transactions, the first data acknowledge can occur in the same cycle as the address acknowledge. But, in a read transaction, the first data acknowledge must wait until two clock cycles after the address acknowledge -- why?

A_15: For write transactions, the master provides the data at the beginning of the cycle and the data is typically posted in a slave's write buffer. This normally allows a slave sufficient time to return the data acknowledge at the same time that the address acknowledge is returned. For read transactions, the slave needs sufficient time to obtain the data from some device and buffer it before returning it to the master with the data acknowledge. The two-cycle latency to the first data acknowledge allows for this access time without degrading the bus clock speed.



Q_16: How are the PLB attributes ("PLB_TAttributes") used by the CoreConnect bus elements?

A_16: The attributes have no meaning within the CoreConnect architecture itself. They only represent information that is carried along with the address and qualifiers during the address phase of the transaction. The meaning of the attributes may have significance to the masters and slaves on PLB. Some of the attribute lines have a "recommended" usage as documented in the PLB architecture specification. These definitions come from common usage in PowerPC and CoreConnect based chips.



Q_17: Why don't PLB master and slave cores generally support byte burst and half-word (16-bit) burst transfers?

A_17: While the PLB architecture has significant flexibility with respect to transaction protocol, not all allowable options make sense to use. An extreme example would be a 128-bit PLB data bus carrying a significant amount of "byte burst" traffic. During the byte bursts, only one of the 16 byte lanes carries data on each beat, so over 93% of the bus bandwidth capability is wasted. This is certainly not an efficient use of chip resources.



Q_18: What is the purpose of the CoreConnect toolkits?

A_18: The CoreConnect toolkits provide a very powerful set of functions for designing and verifying elements of a system-on-chip implementation:

  • Provides for modeling of the chip-top functional bus architecture without requiring the specific functional IP blocks to be completed
  • Provides for developing and verifying functional IP blocks that are compliant to the specific CoreConnect bus segment to which they will attach (e.g. PLB).
  • Provides for faster simulation through the use of the behavioral simulation models when doing architectural performance modelling.



Q_19: What are the components of the CoreConnect toolkits?

A_19: CoreConnect toolkits are an integral part of the CoreConnect license package. The toolkits provide support for PLB, OPB and the DCR Bus. Each toolkit contains:

  • Bus arbiter
  • Behavioral master and slave model
  • Protocol monitor
  • Bus Functional Language compiler
  • Testbench instantiating bus arbiter, master and slave models and protocol monitor
  • Test case scripts
  • User documentation



Q_20: What is Bus Functional Language -- BFL?

A_20: Bus Functional Language is a simple command language that is part of the CoreConnect toolkits and works with the behavioral master and slave models. BFL commands are written to configure the models and generate transaction traffic for the bus testbench. The BFL files are translated to a simulator-specific command file by the Bus Functional Compiler (BFC), a utility program provided with the toolkits.
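As a sketch only (the combination of commands here is illustrative and not verified against any particular toolkit release), a small BFL fragment might preload the slave model's memory with the mem_init command and then issue a wait command, both of which appear elsewhere in this FAQ; the addresses and data values are arbitrary:

mem_init(addr=00000100,data=5a5a5a5a)
mem_init(addr=00000104,data=a5a5a5a5)
wait(level=31)

The BFC translates a file of such commands into the command file for the target simulator.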



Q_21: Can parameters be substituted for wait(level=31), for instance wait(level=endoftest), where 'endoftest' is a parameter with value=31?

A_21: No, the Bus Functional Compiler (BFC) expects the wait command to have a level parameter followed by an integer in the range 0 to 31. BFL code that is not compliant with the syntax rules will be flagged as an error during the compile process.
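For example, the BFC accepts the literal form

wait(level=31)

but flags the parameterized form

wait(level=endoftest)

as a syntax error at compile time.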



Q_22: At the end of our tests we need to compare the contents of a PLB Slave's memory with expected results. Is there any way to do this without requiring the use of a PLB master that uses actual bus cycles? We would like to be able to directly query the PLB Slave's memory contents.

A_22: The "mem_check" command may provide what you need; see section 6.4.5 of the PLB Toolkit User's Manual.
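As an illustrative sketch, assuming mem_check takes addr and data parameters analogous to those of mem_init (consult the manual section cited above for the exact syntax), an end-of-test comparison might look like:

mem_check(addr=00000100,data=5a5a5a5a)
mem_check(addr=00000104,data=5a5a5a5a)

This compares the slave model's memory contents against expected data directly, without driving actual PLB bus cycles.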



Q_23: We would like to code some of our BFL bus transactions using a base-address + offset. Is there a way, for instance, to write a command like this?
mem_init(addr=base + 0,data=5a5a5a5a)
mem_init(addr=base + 4,data=5a5a5a5a)
mem_init(addr=base + 8,data=5a5a5a5a)
. . . . .

A_23: The mem_init command only allows a literal address and data value; see section 6.3.2 of the PLB Toolkit User's Manual. The BFC will produce an error message if you express the command as shown.
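The base + offset arithmetic therefore has to be done before the BFL file is written. Assuming, for illustration, a base address of 00001000 (an arbitrary value), the fragment above would be coded with precomputed literal addresses:

mem_init(addr=00001000,data=5a5a5a5a)
mem_init(addr=00001004,data=5a5a5a5a)
mem_init(addr=00001008,data=5a5a5a5a)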



Q_24: We've seen a ".bfl_on" and ".bfl_off" construct in an app note, but can't find it in any other toolkit documentation. We're wondering how this can be used, and whether it can be used to embed flow control capability within a BFL file.

A_24: The ".bfl_off" and ".bfl_on" constructs are not mentioned in the user's manuals. These constructs allow the user to mask commands, parameters, etc., that are not BFL code from the Bus Functional Compiler (BFC). The .bfl_off and .bfl_on constructs have limited use (described below) and are considered unsupported constructs at this time. However, they are useful, for example, in working with Verilog. If a BFL programmer needed to include some Verilog compiler directives or system tasks in compiled .v files, he could use the constructs in BFL code as follows:
.bfl_off
`timescale 1ns/1ns
.bfl_on

.bfl_off
`define m0
`define m0_MASTER_SIZE 2'b01
.bfl_on

.bfl_off
$dumpvars;
.bfl_on

The BFC ignores any text between the .bfl_off and .bfl_on constructs at compile time but includes that text in the final .v file. The placement of these constructs in the BFL code is critical because the masked text appears at the corresponding location in the compiled .v file.



Q_25: We use the PLB toolkit to verify PLB compliance of new IP cores for the library. We would like to know what the PLB monitor actually checks and what it does not check so that we can find solutions to the items not supported by the PLB monitor.

A_25: The PLB monitor was developed from the PLB architecture specification. The developers have attempted to include as many protocol checks as possible reflecting the PLB architecture requirements; the intent is complete protocol coverage. The documented list of checks is located in the PLB Toolkit User's Manual, chapter eight, titled "PLB Compliance Checks".



Q_26: For PLB, what is the difference between fixed and variable length burst transfers?

A_26: Fixed length burst transfers are a variant of variable length burst transfers. The master burst and slave burst terminate signaling is used just as with the variable length burst, but the length of the burst requested by the master is communicated to the slave via the byte enable signals. The purpose of the fixed length transfer is to give the slave a "hint" of when the transfer will end so that it can assert read/write complete signals sooner. This minimizes the number of read data bus dead cycles between transfers so it helps maximize bus utilization.



Q_27: For PLB, can a fixed length burst transfer be terminated early?

A_27: Yes, either a master or a slave can terminate the burst early, although a master would not normally do this.



Q_28: For PLB, what is the difference between burst transfers and line transfers?

A_28: Both types are block oriented transfers. Line transfers are primarily used to move cache lines of data and instructions. The key characteristics of line transfers are that they support a very limited number of block sizes and the blocks are naturally aligned in the address space. PLB line read transfers support "target word first" mode which means that data items may not be transferred in a linear address sequence. In line transfers, all byte lanes of the PLB segment are used in the transfer. For example, in a 64-bit PLB segment, a 32-byte line transfer takes four data beats while on a 128-bit PLB segment, it takes two data beats.

Burst transfers are used to move variable sized blocks of data in a linear address sequence. Depending on the burst transfer type, some or all of the data byte lanes are used on the PLB segment; it is possible to burst single bytes of data on a 128-bit PLB segment, but this is not recommended from a bandwidth utilization perspective. The number of data beats in the transfer is arbitrary. The only alignment requirement relates to the address alignment of a single transfer item (byte, word, etc.).



Q_29: For PLB, what is unique about DMA transfer types?

A_29: In the interest of bus architecture flexibility and efficient bus utilization, several "special" DMA transfer types were defined. Some of these involve data movement confined to a single PLB slave segment (such as a peripheral device and memory block attached to the same bus controller, which appears as a single PLB slave). Since the data buffering is confined to the bus controller, the data does not appear on the PLB. In addition, since one device is a DMA peripheral, it does not have an associated address. So, whereas a normal memory-to-memory transfer performed by a PLB master (such as the DMA controller or a CPU) would involve read and write transactions, each with both address and data phases, the previously described transfer has only one address phase from a PLB point of view. The other DMA transfer types likewise omit either the data phase or the address phase. Because of the nature of most on-chip bus implementations and the normal data flow in these chips, the PLB DMA transfer types are not normally needed, and newer DMA controller cores do not support them.



Q_30: In the PLB arbiter core control register (PACR), what is the function of the high bus utilization (HBU) setting?

A_30: The purpose of the high bus utilization setting is to allow a lower-priority request of one type (read / write) to bypass a higher-priority request of the opposite type in the interest of more efficient utilization of the PLB read and write data buses. For example, given a higher-priority write request and a lower-priority read request presented to the arbiter, normally the write would win the next access. In the situation where the write data bus is busy but the read data bus is free, it is normally advantageous to let the read transaction through first. Use of the HBU setting can help reduce latencies in a busy multi-master PLB environment.



Q_31: The PLB bus interfaces of the PowerPC 405 CPU core are designed to, and compliant with, the PLB 3 architecture specification. How can the core be used in an ASIC using a PLB 4 implementation? There are differences in how the PLB time-out is handled between PLB versions 3 and 4.

A_31: There are two ways to accomplish this:

  • The easy way is to use the 12-master, 2-way crossbar arbiter core which supports both PLB 3 and PLB 4 master attachment. It has the logic to support PLB 3 time-out protocol.
  • The IBM Design Center engineers have devised a functional chip-top logic solution to support PLB 3 master attachment (specifically, the PPC405 CPU) to a PLB 4 arbiter.


Q_32: Can a CoreConnect based design have multiple instantiations of PLB and / or OPB buses?

A_32: Yes, it is certainly reasonable to have multiple PLB segments and OPB segments in an on-chip bus implementation. These segments may be connected via bridges or they may be totally independent. Use of segmented buses is a routine method of partitioning traffic flows to avoid bus congestion and latency problems. One restriction to be aware of is that the total maximum number of PLB masters on connected PLB segments is 16.



Q_33: What are the required clocking relationships between elements of a CoreConnect bus implementation? Can bus clocks be run at any frequency?

A_33: Architecturally, CoreConnect is defined as a synchronous bus architecture. Almost all on-chip bus implementations have multiple segments, typically one PLB segment, one OPB segment and one DCR segment. Normally, each of these segments is in its own clock domain. The clock domains are typically at different clock frequencies with integer relationships. The bus core implementations require the clocks to be phase synchronous (rising edge alignment) in order to meet chip timing requirements. The bus interfaces of masters and slaves on CoreConnect bus segments must run at the bus clock frequency and be phase aligned.


Revision Date: 09/06/02
