# T9000 - A NEW GENERATION TRANSPUTER (Architecture and Applications) Part I

Zlatko Bele

Original Slovenian title: T9000 - TRANSPUTER NOVE GENERACIJE (Zgradba in uporaba) I. del

KEYWORDS: transputers, T9000, multiprocessing, embedded computer systems, virtual channel processors, programmable memory interfaces, communication subsystems, architecture, application

ABSTRACT: This two-part article describes the new generation T9000 transputer. Part I covers its basic concept and architecture, while Part II describes the main areas of its application.

## INTRODUCTION

The T9000 transputer is the first member of a new generation of high performance transputers designed to give exceptional single processor performance and virtually unlimited multiple processor capability. Advanced CMOS technology has been used to integrate a 32-bit integer processor, a 64-bit floating point unit, 16 kBytes of cache memory, a communications processor and four high bandwidth serial communications links on a single T9000 chip.

The T9000 is completely software compatible with the first generation of transputers, extending the transputer range and making upgrades easy. A new family of communication peripherals supports the construction of T9000 networks and mixed transputer systems.

The T9000 excels in real-time embedded applications, delivering exceptional performance with scalable multiprocessor capability and extensive industry standard software support.

## PERFORMANCE

The T9000 transputer is an exceptionally high performance microprocessor.
It has been designed to achieve maximum integer and scalar floating point performance from a single microprocessor, without compromising multiprocessing capability or ease of use. The T9000 offers:

### \* Exceptional uniprocessor performance

The T9000 transputer delivers exceptional single processor performance: the new superscalar core has a peak performance of 200 MIPS and 25 MFLOPS at 50 MHz.

### \* Real-time performance

The T9000 has been designed for the real-time embedded systems market, with an on-chip kernel giving hardware support for multi-tasking, multiple interrupts, and sub-microsecond interrupt response and context switching.

### \* Unlimited multiprocessor performance

On-chip and off-chip support for high speed multiprocessing allows system scalability: the ability to increase performance by adding more processors. The T9000 architecture makes this form of application acceleration both simple and low cost.

### \* Usable performance

The T9000 microarchitecture allows compilers to be written that fully exploit the superscalar performance. Furthermore, the combination of on-chip cache and programmable memory interface (PMI) allows the maximum performance of the T9000 to be obtained using low cost DRAMs.

## MULTIPROCESSING

For applications that demand performance and functionality which single processors cannot provide, the transputer family has complete hardware and software support for multiprocessing. The T9000 transputer strengthens this position through a new on-chip communications subsystem and off-chip communications peripherals.

### \* On-chip support for multiprocessing

The T9000 serial communications links provide a total of 80 MBytes/s bidirectional bandwidth. This on-chip communications technology, supported by a packet-based link protocol, enables inter-processor communication, high speed data transfer and I/O, and distributed control.
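
To illustrate how a packet-based protocol lets several logical channels share one physical link, here is a minimal Python sketch. It is not the T9000 link protocol: the packet layout, the 32-byte payload limit (`MAX_PAYLOAD`), the round-robin interleaving and all function names are assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import zip_longest

MAX_PAYLOAD = 32  # ASSUMED payload limit per packet, for illustration only


def packetize(channel_id, message):
    """Split one message into (channel_id, is_last, payload) packets."""
    chunks = [message[i:i + MAX_PAYLOAD]
              for i in range(0, len(message), MAX_PAYLOAD)] or [b""]
    return [(channel_id, i == len(chunks) - 1, chunk)
            for i, chunk in enumerate(chunks)]


def multiplex(messages):
    """Interleave the packets of several channels onto one link, round-robin.

    `messages` maps a (hypothetical) channel id to the bytes sent on it.
    """
    queues = [packetize(cid, msg) for cid, msg in messages.items()]
    link = []
    for packets_this_round in zip_longest(*queues):
        # Channels that have run out of packets contribute None; skip them.
        link.extend(p for p in packets_this_round if p is not None)
    return link


def demultiplex(link):
    """Reassemble each channel's message from the interleaved packet stream."""
    buffers = defaultdict(bytearray)
    for channel_id, _is_last, payload in link:
        buffers[channel_id] += payload
    return {cid: bytes(buf) for cid, buf in buffers.items()}


if __name__ == "__main__":
    sent = {1: b"A" * 70, 2: b"short", 3: b"B" * 40}
    link = multiplex(sent)
    # Show which channel each packet on the shared link belongs to.
    print([(cid, len(payload)) for cid, _, payload in link])
    assert demultiplex(link) == sent
```

Because every packet carries its channel id, a short message on one channel is not held up behind a long message on another, and the receiving side can rebuild each message independently.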
The on-chip Virtual Channel Processor (VCP) makes the programming of multiprocessor systems both simple and powerful. This is supported by a range of development tools and industry standard programming languages.

### \* Off-chip support for multiprocessing

The T9000 transputer is supported by a range of communications peripherals that add to the multiprocessing capability of T9000 systems. The C1XX family ensures that a T9000 system of any size can be constructed, connecting first generation and second generation transputers, and providing an interface to the outside world.

The C104 is a complete packet routing switch on a single chip. It connects 32 links to each other via a non-blocking crossbar switch with sub-microsecond latency. This allows communication between T9000 transputers that are not directly connected, emulating a direct connection between each of the devices in a T9000 network. Multiple C104s can easily be connected together to make larger networks, linking any number of T9000 transputers.

The C100 system protocol converter converts between the first generation transputer links and the new T9000 links. This allows mixed transputer networks to be constructed using the optimum combination of transputers to satisfy requirements for processing power, communications bandwidth and system cost.

The C101 link adaptor provides a parallel interface between a T9000 link and external systems such as buses, peripheral devices and even other microprocessors.

## SOFTWARE

The success of any microprocessor is determined as much by the quality of its software tools as by any other feature. The T9000 transputer, like the whole transputer family, is provided with a range of industry standard compilers and powerful development tools to support the embedded systems market.
In addition to the software tools developed specifically for the T9000, instruction set compatibility with the first generation transputer family means that the T9000 inherits an existing range of transputer development and application software. To support the development of T9000 transputer systems, the manufacturer offers the following:

### \* The transputer toolset

The transputer toolset is a set of development tools for programming, configuring and debugging mixed transputer systems. It is available on a variety of host computers including IBM and NEC PC, VAX, and SUN3/SUN4.

### \* Compilers

For fast development, and to satisfy the diverse programming requirements of different application areas, the toolset can be used with a variety of industry standard compilers, all with major support for multiprocessing. The T9000 is supported by a range of compilers including ANSI C, C++, FORTRAN, OCCAM and ADA. These compilers are available for the whole transputer family, and have been optimised for the new T9000 microarchitecture.

### \* System software

System software support for the T9000 reflects the requirements of the embedded systems marketplace. The T9000 is supported by a range of operating systems and real-time kernels, including the C-exec and VRTX real-time kernels and the Chorus distributed UNIX operating system.

This array of development tools, industry standard compilers and system software effectively meets the demands of the embedded systems market.

## THE T9000 ARCHITECTURE

All the members of the transputer family share the same architecture, combining processor, communications links, RAM, and many other features, all on a single chip. The T9000 transputer architecture has been designed to cater for the increasing demands made by today's embedded system applications.

### \* Superscalar processor

At the heart of the T9000 lies the superscalar processor, with the ability to execute up to eight instructions in every clock cycle.
The on-chip 64-bit floating point unit (FPU) has a peak performance of 25 MFLOPS at 50 MHz. The FPU is a scalar processor, ideally suited to high performance numerical applications.

Figure 1: T9000 basic architecture

The T9000 pipeline contains an Instruction Grouper stage, which takes code sequences and organises the instructions into groups to best exploit the functionality of the pipeline. This means that the microarchitecture is transparent to the user, and allows efficient code to be written in industry standard languages to fully utilise the performance capabilities of the T9000.

### \* Communications subsystem

The T9000 communications subsystem comprises four high bandwidth serial links, two control links and a dedicated virtual channel processor (VCP). Each 100 MBaud serial link has a packet-based link protocol supporting a data rate of 10 MBytes/s in each direction, giving the T9000 a total bidirectional communications bandwidth of 80 MBytes/s (four links × 10 MBytes/s × two directions). The links are primarily used as an efficient method of direct communication between T9000 transputers.

Communication between software processes, or tasks, on transputers takes place over software channels. The same machine instructions are used for communication between processes on the same processor as for communication between processes on different processors. Virtual channels for off-chip communications are multiplexed onto each physical link by the VCP. This support for interprocessor communication is unique to the T9000, and makes programming a multiprocessor system as simple as programming a single processor.

Communication between T9000s and peripherals which are not directly connected is achieved by using the C1XX family of communications peripherals. This combination of on-chip and off-chip communications support makes the T9000 transputer the optimum solution for multiprocessing systems.

### \* On-chip cache

To support the high performance processor core, the T9000 has a 16 kByte on-chip cache.
This can also be programmed to function as 16 kBytes of on-chip memory, or as 8 kBytes of on-chip memory and 8 kBytes of cache. This adds flexibility to system design, allowing applications to run with no external memory, and guaranteeing deterministic code behaviour in on-chip memory for applications where this is critical.

### \* Programmable memory interface

The T9000 programmable memory interface (PMI) has been designed to provide maximum bandwidth to support the on-chip cache, system flexibility, and support for low cost mixed memory systems.

Figure 2: T9000 example system

The T9000 can directly address a 4 GByte physical address space, and provides a peak external memory bandwidth of 200 MBytes/s. Four independent banks of external memory are supported, allowing the implementation of mixed memory systems, with support for a combination of DRAM, SRAM, EPROM and VRAM. The data bus of each bank can be configured to be 64, 32, 16 or 8 bits wide depending on the type of memory being used. This feature, combined with an efficient on-chip cache, means that the full performance of the T9000 transputer can be exploited using low cost DRAM. Furthermore, up to 8 MBytes of DRAM can be connected to the PMI with zero external logic, leading to minimum component count and low system cost.
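
The bank rules and the bandwidth figure quoted for the PMI can be captured in a few lines of Python. This is only a sketch: the class and function names are hypothetical and do not correspond to any real configuration register or tool, and the idea that peak bandwidth scales linearly with bus width (with 200 MBytes/s applying to the 64-bit width) is an inference from the quoted figures, not a statement from the data sheet.

```python
from dataclasses import dataclass

MAX_BANKS = 4                     # four independent banks (from the text)
VALID_WIDTHS = (64, 32, 16, 8)    # allowed data bus widths in bits (from the text)
MEMORY_TYPES = ("DRAM", "SRAM", "EPROM", "VRAM")
# Peak figure from the text; ASSUMED here to apply to the 64-bit bus width.
PEAK_BANDWIDTH_64 = 200           # MBytes/s


@dataclass(frozen=True)
class Bank:
    """One external memory bank; a hypothetical model, not a register map."""
    memory_type: str
    width_bits: int

    def __post_init__(self):
        if self.memory_type not in MEMORY_TYPES:
            raise ValueError(f"unsupported memory type: {self.memory_type}")
        if self.width_bits not in VALID_WIDTHS:
            raise ValueError(f"unsupported bus width: {self.width_bits}")


def check_system(banks):
    """Reject configurations with more banks than the PMI supports."""
    banks = list(banks)
    if len(banks) > MAX_BANKS:
        raise ValueError(f"at most {MAX_BANKS} banks are supported")
    return banks


def peak_bandwidth(bank):
    """Peak MBytes/s for a bank, ASSUMING linear scaling with bus width."""
    return PEAK_BANDWIDTH_64 * bank.width_bits / 64


if __name__ == "__main__":
    # A mixed memory system: wide DRAM for code and data, narrow EPROM for boot.
    system = check_system([Bank("DRAM", 64), Bank("EPROM", 8)])
    for bank in system:
        print(bank.memory_type, bank.width_bits, peak_bandwidth(bank))
```

Under the same assumption, a 64-bit bank moves 8 bytes per transfer, so 200 MBytes/s corresponds to 25 million transfers per second.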
### \* High integration for real-time systems

The T9000 has an on-chip hardware kernel which comprises on-chip timers and process schedulers. The T9000's multiple interrupt capability and sub-microsecond interrupt response and context switching make it ideally suited to real-time applications that demand high performance and maximum on-chip functionality.

The consideration for ease of use and low system cost extends to on-chip phase locked loop (PLL) technology. This allows a low frequency 5 MHz input clock to be used, from which the high frequency processor clock is generated on chip, removing the need to route high speed clocks on the PCB.

The original transputer architecture was designed to allow maximum usability and minimum system cost without sacrificing performance and multiprocessing capabilities. The T9000 continues this tradition, giving the added benefits of exceptional integer and floating point performance and enhanced multiprocessing support, combined with careful consideration for system cost and design flexibility.

### \* Transputer product range

Table 1 presents the complete transputer product range in terms of part numbers, speed, on-chip memory and package type.

Table 1: The transputer product range

| Family | Part number | Speed (MHz) | On-chip SRAM | Serial links | Package | Process |
|------------------------------|-------------|-------------|--------------|--------------|----------|--------------|
| T2 - 16 bit CPU | T222-G17M | 17.5 | 4K | 4 | 68 PGA | Mil-Std-883C |
| | T225-G20S | 20 | 4K | 4 | 68 PGA | Commercial |
| | T225-G25S | 25 | 4K | 4 | 68 PGA | Commercial |
| | T225-J20S | 20 | 4K | 4 | 68 PLCC | Commercial |
| | T225-F20S | 20 | 4K | 4 | 100 CQFP | Commercial |
| T4 - 32 bit CPU | T400-G20S | 20 | 2K | 2 | 84 PGA | Commercial |
| | T400-J20S | 20 | 2K | 2 | 84 PLCC | Commercial |
| | T400-X20S | 20 | 2K | 2 | 100 PQFP | Commercial |
| | T425-G20S | 20 | 4K | 4 | 84 PGA | Commercial |
| | T425-G25S | 25 | 4K | 4 | 84 PGA | Commercial |
| | T425-J20S | 20 | 4K | 4 | 84 PLCC | Commercial |
| | T425-F20S | 20 | 4K | 4 | 100 CQFP | Commercial |
| T8 - 32 bit CPU + 64 bit FPU | T800-G17M | 17.5 | 4K | 4 | 84 PGA | Mil-Std-883C |
| | T801-G20S | 20 | 4K | 4 | 100 PGA | Commercial |
| | T801-G25S | 25 | 4K | 4 | 100 PGA | Commercial |
| | T805-G20S | 20 | 4K | 4 | 84 PGA | Commercial |
| | T805-G25S | 25 | 4K | 4 | 84 PGA | Commercial |
| | T805-G30S | 30 | 4K | 4 | 84 PGA | Commercial |
| | T805-J20S | 20 | 4K | 4 | 84 PLCC | Commercial |
| | T805-F20S | 20 | 4K | 4 | 100 CQFP | Commercial |
| T9 - 32 bit CPU + 64 bit FPU | T9000-F40S | 40 | 16K | 4 | 208 CQFP | Commercial |
| | T9000-F50S | 50 | 16K | 4 | 208 CQFP | Commercial |

| Family | Part number | Function | Package | Process |
|----------------------|-------------|----------------------------|----------|--------------|
| T2/T4/T8 peripherals | C011-P20S | Link adaptor to bus or I/O | 28 PDIL | Commercial |
| | C011-E20S | Link adaptor to bus or I/O | 28 SOJ | Commercial |
| | C011-S20M | Link adaptor to bus or I/O | 28 CDIL | Mil-Std-883C |
| | C012-P20S | Link adaptor to bus | 24 PDIL | Commercial |
| | C004-G20S | 32 way crossbar switch | 84 PGA | Commercial |
| | C004-G20M | 32 way crossbar switch | 84 PGA | Mil-Std-883C |
| T9 peripherals | C100-F10S | System protocol converter | 100 CQFP | Commercial |
| | C104-F10S | Packet routing switch | 208 CQFP | Commercial |

Adapted from SGS-THOMSON/INMOS internal documentation by:

Zlatko Bele, MIKROIKS d. o. o., Dunajska 5, 61000 Ljubljana, Slovenia