## MULTI-CHANNEL RECEIVER FOR CDMA USING A DIGITAL SIGNAL PROCESSOR AND FPGA CIRCUIT

Filip Samo Balan, Zmago Brezočnik, University of Maribor, Faculty of Electrical Engineering and Computer Science, Maribor, Slovenia

Keywords: communications, multi-channel receivers, CDMA systems, Code Division Multiple Access systems, correlation, SS, Spread Spectrum, pseudo-random sequences, DSP processors, Digital Signal Processing processors, FPGA, Field-Programmable Gate Arrays, VHSIC Hardware Description Language, Very High Speed Integrated Circuits Hardware Description Language, practical results, WCDMA systems, Wideband Code Division Multiple Access systems

Abstract: A communication system with a multi-channel receiver for code division multiple access (CDMA) is presented in this article. The system allows simultaneous transmission and reception of data from individual transmitters, which are built around RISC microcontrollers and use different pseudo-random (PR) sequences. In the receiver, the input signal is first digitised by fast A/D converters. The hardware that generates the local PR sequences, correlates them with the input signal, and communicates with the processor is described in VHDL and implemented in an FPGA circuit. A digital signal processor performs all other necessary operations.

# Večkanalni sprejemnik za CDMA z uporabo digitalnega signalnega procesorja in vezja FPGA

Ključne besede: komunikacije, sprejemniki večkanalni, CDMA sistemi s sodostopom kodno porazdeljenim, korelacija, SS spekter porazdeljeni, zaporedja psevdonaključna, DSP procesorji obdelave signalov digitalni, FPGA vezja logična s poljem programirljivim, VHDL jezik opisni hardwarski vezij integriranih hitrosti zelo visokih, rezultati praktični, WCDMA sistemi s sodostopom kodno porazdeljenim širokopasovni

Povzetek: Članek opisuje izvedbo komunikacijskega sistema z večkanalnim sprejemnikom s kodno porazdeljenim sodostopom. Sistem omogoča sočasno oddajanje in sprejemanje podatkov, tvorjenih s posameznimi oddajniki, ki so narejeni s pomočjo RISC mikrokrmilnikov in uporabljajo različna psevdonaključna zaporedja. V sprejemnem delu vhodni signal najprej digitaliziramo s hitrimi A/D pretvorniki. Strojna oprema za tvorjenje lokalnih PN zaporedij, korelacijo z vhodnim signalom in komunikacijo s procesorjem je opisana v jeziku VHDL ter izvedena z vezjem FPGA. Vsa druga potrebna opravila naredi digitalni signalni procesor.

#### 1 Introduction

Owing to its increased capacity, improved call quality and cell coverage, enhanced privacy, simplified system planning, and flexibility for multirate systems, Code Division Multiple Access (CDMA) is one of the leading technologies of today's and tomorrow's cellular and wired communication systems. Also referred to as Direct Sequence Spread Spectrum (DSSS), it is the best-known representative of the group of Spread Spectrum (SS) modulation systems, which includes pure CDMA systems such as Frequency Hopping SS (FHSS) and Time Hopping SS (THSS) as well as many kinds of hybrid CDMA systems /1/. The common characteristic of SS systems is the use of unique signature sequences, or spreading codes, which after modulation produce a wideband signal with Multiple Access (MA) capability that allows many users to share the same channel for transmission. CDMA systems use all of the available time-frequency space, far beyond that needed for basic communication. Other systems with MA capability, such as Frequency Division MA (FDMA) and Time Division MA (TDMA), are mostly narrowband systems, limited in frequency band and time duration, respectively.

While the design of a CDMA transmitter, with blocks for source and channel coding, spectrum spreading, and RF modulation, is relatively simple, the design of a receiver with multi-user detection is not. The blocks after RF demodulation and before source and channel decoding (chip-matched filters or correlators, channel parameter estimation, and detection) are especially complex.

Today we can choose between a number of receiver configurations based on matched filters fabricated as charge-coupled devices (CCD) or surface acoustic wave (SAW) devices, or implemented in standard digital signal processors (DSPs) /2/. Although the performance of DSPs is rising rapidly, the computational load imposed by a large number of users still exceeds it. We achieved the required DSP acceleration by using an FPGA as a reconfigurable coprocessor.

To illustrate the topic, a short introduction to the basics of CDMA is given in Section 2. More detailed principles can be found in /3/, the history in /4/, and numerous related topics in /1,5-7/ and elsewhere. In Section 3 we explain some characteristics of the DSP and FPGA circuits and the reasons for using them. Section 4 describes the individual building blocks that we constructed for the communication system. Some results are presented in Section 5. We conclude with guidelines for further study.

#### 2 Code Division Multiple Access

A basic block diagram of a Direct Sequence CDMA system, with spectrum spreading before Binary Phase Shift Keying (BPSK) modulation, is shown in Fig. 1.



Fig. 1: Direct Sequence CDMA system

The baseband signal $d_u(t)$ is first modulated (simply multiplied) by the spreading signal $c(t)$ and then used for BPSK modulation of the carrier with frequency $f_0$. Pseudo-Random (PR) sequences such as m-sequences, Gold, Kasami, Dual-BCH, and many others can be used for the spreading signal $c(t)$ /8/. In the radio channel, the transmitted signal $d_r(t)$ is subject to Multiple-Access Interference (MAI), general interference, multipath, and frequency-selective fading, denoted collectively by $i(t)$. In the receiver, the incoming signal $r_r(t)$ is first BPSK-demodulated and then, with the help of a correlator and a local copy of the spreading signal $c(t)$, de-spread back to the baseband signal.

Fig. 2: Baseband signal modulation and demodulation

Fig. 2 presents an example of baseband signal modulation (spreading) in the transmitter and demodulation (de-spreading) in the receiver for an ideal channel. When the N-chip-long sequences are aligned, as in this case, the output $r_d(t)$ reaches the maximum correlation value of $\pm N$, whose sign determines the currently received bit.
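The spreading and de-spreading steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the 31-chip m-sequence from a 5-stage shift register with taps (5, 2) is chosen for brevity and is not one of the sequences used in the receiver described here.

```python
def m_sequence(degree, taps):
    """Return one period (2**degree - 1 chips) of an LFSR sequence as 0/1 values."""
    state = [1] * degree              # any non-zero start state works
    chips = []
    for _ in range(2 ** degree - 1):
        chips.append(state[-1])       # output the last register stage
        feedback = 0
        for t in taps:                # XOR of the tapped stages (1-based)
            feedback ^= state[t - 1]
        state = [feedback] + state[:-1]
    return chips


def to_bipolar(chips):
    """Map logical 0/1 chips to +1/-1 levels."""
    return [1 - 2 * c for c in chips]


def spread(bits, code):
    """Multiply every bipolar data bit by the whole spreading sequence."""
    return [b * c for b in bits for c in code]


def despread(signal, code, shift=0):
    """Correlate each N-chip block with the local code, cyclically shifted."""
    n = len(code)
    local = code[shift:] + code[:shift]
    return [sum(s * c for s, c in zip(signal[i:i + n], local))
            for i in range(0, len(signal), n)]


code = to_bipolar(m_sequence(5, (5, 2)))   # N = 31 chips (illustrative choice)
tx = spread([+1, -1, +1], code)
print(despread(tx, code))            # aligned local code: [31, -31, 31]
print(despread(tx[:31], code, 1))    # local code off by one chip: [-1]
```

With the local code aligned, every bit yields $\pm N$; a one-chip misalignment collapses the correlation to -1, which is the property the synchronisation in Section 4 relies on.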

All spread spectrum systems share a characteristic called the processing gain $G_p$, defined as

$$G_{p} = \frac{B_{t}}{B_{i}} \approx \frac{T_{b}}{T_{c}}$$

where $B_t$ is the spread spectrum bandwidth and $B_i$ the baseband signal bandwidth, which are approximately determined by the chip period $T_c$ and the symbol (bit) period $T_b$, respectively.
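As a worked illustration, assume that one full PR sequence is sent per data bit, which is what the figures in Section 5 imply (128 chips at 1 Mchip/s giving 7.8125 kbit/s). Under that assumption the processing gain equals the sequence length:

$$G_{p} \approx \frac{T_{b}}{T_{c}} = \frac{128\ \mu\text{s}}{1\ \mu\text{s}} = 128, \qquad 10\log_{10} 128 \approx 21.1\ \text{dB}$$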

#### 3 The DSP and FPGA used

DSPs and FPGAs are proliferating into a broad range of compute-intensive applications, including telecommunications, networking, instrumentation, and computers. Nowadays there are even specialised processors for specific classes of applications. Although DSPs support many on-chip functions and are highly optimised for the demands of high-speed computing, many applications require functions that DSPs are not optimised for, or require higher precision or performance. Solutions include writing more optimised software, using a faster and therefore more expensive processor, or designing a custom gate array. The software option is limited by the DSP's data throughput, which is only a fraction of its clock speed. That leaves custom hardware. A custom Application Specific Integrated Circuit (ASIC), however, results in increased power consumption, added cost for additional design tools, non-recurring engineering expenses, increased design time, design risk, loss of design flexibility and, what dissuades us the most, difficult real-time adaptation to different algorithms. In the gate array option based on register-rich FPGAs, which implement complex datapath logic functions using pipelining and bit-serial processing techniques, the FPGA is used to accelerate specialised functions that are not easily implemented, or not executed quickly enough, in the DSP.



Fig. 3: Symmetrical array surrounded by I/O

We used a Cache-Logic FPGA from Atmel /9/ as a reconfigurable DSP coprocessor /10/. This makes it possible to off-load compute-intensive as well as special tasks from the DSP and execute them in a dynamically reconfigurable FPGA. In this way we obtained a two-chip embedded DSP solution that achieves significantly higher performance and lower power consumption on a relatively small board area. The FPGA architecture consists of thousands of logic cells organised in rows and columns, each containing a D-type register and a few logic gates. The heart of the architecture is a symmetrical array of identical cells. An example of a very small array (we used a 32 x 32 array) is shown in Fig. 3. The regularity is interrupted every four cells by row and column buses and bus repeaters, turn points, clock lines, set/reset lines, and RAM blocks. As shown in Fig. 4, each cell is connected to five horizontal and vertical buses, as well as to its eight nearest neighbours. Fig. 5 shows the internal structure of the 8-sided cell. Independent configuration bits determine the input signals and the different modes of operation, so all permutations are legal. Despite the immense number of possible combinations, and therefore of potential errors, the design process is very well supported by an Integrated Development System (IDS), which lets designers create fast and predictable designs.



Fig. 4: Cell to bus connections (a) and cell to cell connections (b)



Fig. 5: Cell structure

On the other side, we used a DSP from Texas Instruments /11/. The TMS320C542-40 is just one member of a large family of fixed-point DSPs. It has an excellent ratio of performance to power consumption and is for that very reason often used in mobile systems. The block diagram in Fig. 6 shows the main parts of the processor. Features such as an advanced multibus architecture with three separate data buses and one program memory bus, dual-access on-chip RAM, a 40-bit Arithmetic Logic Unit (ALU), a 17 x 17-bit parallel multiplier, units supporting the Viterbi operator, fast return from interrupt, power consumption control, 40 MIPS of performance and, finally, an available C compiler are just some of the reasons for using this DSP.



Fig. 6: TMS320C54x internal hardware

#### 4 Receiver implementation

Fig. 7 shows the block diagram of the whole test CDMA system. The signals $w_i(t)$ from up to k transmitters TXi (i=1,2,...,k) are merged in a common AWGN channel.



Fig. 7: Entire test CDMA system

In a real environment each signal is subject to multipath fading, but in our case we simplified this to a set of time-independent attenuations $\beta$ and delays $\tau$. The combined signals, together with the noise $n(t)$, form the input signal $r(t)$ of the CDMA receiver, which can be written as

$$\begin{split} r(t) &= \sum_{i=1}^{k} \sum_{m=1}^{N_{m}(i)} \beta_{(i,m)} w_{i}(t+\tau_{(i,m)}) + n(t) \\ r(t) &\approx \sum_{i=1}^{k} \beta_{i} w_{i}(t+\tau_{i}) + n(t) \approx \sum_{i=1}^{k} \beta_{i} w_{i}(t) + n(t) \end{split}$$

where $N_m(i)$ is the number of paths for the i-th signal, and $\beta_{(i,m)}$ and $\tau_{(i,m)}$ are the attenuation and delay of the i-th signal on the m-th path.
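The single-path approximation of the received signal can be mimicked in discrete time as follows. This is a minimal sketch, not the test set-up itself: the function name, the treatment of each transmitted signal as periodic, and the restriction of delays to whole samples are all assumptions made for illustration.

```python
import random


def received_signal(signals, betas, delays, noise_sigma=0.0, rng=None):
    """Discrete-time r[n] = sum_i beta_i * w_i[n + tau_i] + noise[n], one path per user.

    signals: list of k equal-length sample lists w_i (treated as periodic)
    betas:   attenuation beta_i of each transmitter
    delays:  delay tau_i of each transmitter, in whole samples
    """
    rng = rng or random.Random()
    length = len(signals[0])
    r = []
    for n in range(length):
        total = 0.0
        for w, beta, tau in zip(signals, betas, delays):
            total += beta * w[(n + tau) % length]
        if noise_sigma > 0:
            total += rng.gauss(0.0, noise_sigma)   # additive white Gaussian noise
        r.append(total)
    return r


w1 = [1, 1, -1, 1]
w2 = [1, -1, -1, 1]
r = received_signal([w1, w2], betas=[1.0, 0.5], delays=[0, 1])
print(r)   # [0.5, 0.5, -0.5, 1.5]
```

Setting `noise_sigma` above zero adds the AWGN term; passing a seeded `random.Random` instance makes such a run repeatable.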

A personal computer (PC) controls the receiver and receives the data stream d(t). A more detailed block diagram of the CDMA receiver is shown in Fig. 8. To support QPSK modulation, two identical and independent branches are provided for the I (*in-phase*) and Q (*quadrature-phase*) channels.



Fig. 8: CDMA receiver

The input signal level is adjusted by an amplifier and the signal is immediately digitised by fast A/D converters. From this point onward, the signal processing is entirely digital. The digitised signals are first pre-processed in the correlation coprocessor and then further processed by the DSP. The raw data stream $d_i(t)$ and the control information are exchanged with the PC over a Host Port Interface (HPI).

Fig. 9 shows the block diagram of the correlation coprocessor.



Fig. 9: Correlation coprocessor

All required clocks, sequences, and control signals are derived from a single system clock in the clock generator module. The A/D interface module takes care of starting the conversion, of a temporary memory six samples deep, and of selecting the proper sample according to the current channel phase. The PR sequence generator module consists of a PR code memory, into which pre-generated PR sequences are loaded, and PR counters that select the chip for the current channel. The central part is the correlator itself. The selected sample from the temporary memory is first multiplied by the bipolar PR bit and then digitally integrated using an adder and an intermediate correlation memory. When the PR counter reaches the last chip, the integrated value representing the final correlation is moved to the final correlation memory and the intermediate memory is cleared. A correlation flag is set to inform the DSP that a correlation is available and, according to priority, of the channel number. As soon as the DSP has read the channel number and then the correlation value across the interface, the corresponding flag is cleared. The DSP interface behaves like an ordinary register-based peripheral device.
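The multiply-accumulate path and the flag handshake described above can be modelled behaviourally in Python. This is a sketch for illustration only, not the VHDL design: the class and method names are hypothetical, and the priority rule (lowest channel number first) is an assumption, since the text does not state how priority is assigned.

```python
class CorrelatorModel:
    """Behavioural model of the correlator for several channels."""

    def __init__(self, codes):
        self.codes = codes                      # PR code memory: one +1/-1 list per channel
        self.counters = [0] * len(codes)        # PR counters (current chip index)
        self.intermediate = [0] * len(codes)    # intermediate correlation memory
        self.final = [0] * len(codes)           # final correlation memory
        self.flags = [False] * len(codes)       # correlation flags

    def clock_chip(self, channel, sample):
        """Process one selected A/D sample for one channel."""
        code = self.codes[channel]
        chip = code[self.counters[channel]]
        self.intermediate[channel] += sample * chip   # multiply and integrate
        self.counters[channel] += 1
        if self.counters[channel] == len(code):       # last chip reached
            self.final[channel] = self.intermediate[channel]
            self.intermediate[channel] = 0
            self.counters[channel] = 0
            self.flags[channel] = True

    def dsp_read(self):
        """Return (channel, correlation) for the first flagged channel, or None."""
        for channel, flag in enumerate(self.flags):   # assumed priority: lowest number first
            if flag:
                self.flags[channel] = False           # reading clears the flag
                return channel, self.final[channel]
        return None


codes = [[1, -1, 1, 1], [1, 1, -1, -1]]
model = CorrelatorModel(codes)
for n in range(4):
    model.clock_chip(0, 5 * codes[0][n])     # channel 0 receives +5 times its code
    model.clock_chip(1, -3 * codes[1][n])    # channel 1 receives -3 times its code
print(model.dsp_read())   # (0, 20)
print(model.dsp_read())   # (1, -12)
print(model.dsp_read())   # None
```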

The correlation coprocessor is described in VHDL (VHSIC (Very High Speed Integrated Circuits) Hardware Description Language), which has been an IEEE standard since 1987. It is a formal notation intended for use in all phases of the creation of electronic systems; it supports the development, verification, synthesis, and testing of hardware designs, the communication of hardware design data, and the simulation of hardware descriptions, as described in /12/. VHDL provides five kinds of design units to model a design. As an example, Fig. 10 shows the VHDL syntax /13/ of the entity declaration, which describes the interface of the design to its external environment. The other design units are the architecture body, configuration declaration, package declaration, and package body. A detailed description of the coprocessor is given in /14/.

```
library ieee;
use ieee.std_logic_1164.all;

entity correlation_coprocessor is
port (
-- Clock
    dsp_clk: in std_logic;
-- A/D converter interface
    ad_clk: out std_logic;
    ad_oe: out std_logic;
    ad_value: in std_logic_vector(7 downto 0);
-- DSP interface
    dsp_data: inout std_logic_vector(15 downto 0);
    dsp_address: in std_logic_vector(7 downto 0);
    dsp_rw: in std_logic;
    dsp_is_n: in std_logic;
    dsp_iostrb_n: in std_logic
);
end correlation_coprocessor;
```

Fig. 10: VHDL entity declaration of the coprocessor

The local copy of the spreading signal $c(t)$ generated in the receiver has to be synchronised with the received signal $r(t)$. This is done in two phases with the help of a Delay Locked Loop (DLL), shown in Fig. 11. In the searching phase, a coarse alignment to within one chip period $T_c$ is performed, which requires a time of $T_{ACK}=N^2T_c$ for each hypothesis. In the subsequent following (tracking) phase, data reception becomes possible and a fine alignment down to a fraction of $T_c$ is initiated. In this phase the results of an early-late correlator are needed to control the individual phases of the local PR sequences with a software DLL running on the DSP. An early or late correlation is a correlation with a sample taken $T_c/3$ before or after the sample for the data correlator, respectively.

Fig. 11: Delay locked loop with data correlators
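The early-late principle can be illustrated with a small Python sketch. Several things here are assumptions rather than details of the receiver: three samples per chip (suggested by the one-third chip spacing of the early and late samples), ideal rectangular chips, a noise-free periodic signal, a 7-chip example sequence, and the hypothetical function names. The actual software DLL on the DSP is not reproduced.

```python
SAMPLES_PER_CHIP = 3   # assumption: implied by the one-third chip early/late spacing


def oversample(code, delay=0):
    """Rectangular chips, SAMPLES_PER_CHIP samples each, cyclically delayed by `delay` samples."""
    samples = [c for c in code for _ in range(SAMPLES_PER_CHIP)]
    delay %= len(samples)
    return samples[-delay:] + samples[:-delay] if delay else samples


def early_prompt_late(samples, code):
    """Correlate the local code with the early, data (prompt) and late sample of every chip."""
    early = sum(samples[SAMPLES_PER_CHIP * n] * c for n, c in enumerate(code))
    prompt = sum(samples[SAMPLES_PER_CHIP * n + 1] * c for n, c in enumerate(code))
    late = sum(samples[SAMPLES_PER_CHIP * n + 2] * c for n, c in enumerate(code))
    return early, prompt, late


def dll_error(early, late):
    """Negative: the received signal lags the local code; positive: it leads."""
    return abs(early) - abs(late)


code = [1, 1, 1, -1, -1, 1, -1]                        # 7-chip example sequence
print(early_prompt_late(oversample(code), code))       # aligned: (7, 7, 7)
print(early_prompt_late(oversample(code, 1), code))    # signal one sample late: (-1, 7, 7)
print(early_prompt_late(oversample(code, -1), code))   # signal one sample early: (7, 7, -1)
```

A non-zero difference between the early and late magnitudes tells the loop in which direction to shift the phase of the local PR sequence.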

#### 5 Results

The constructed and tested multi-channel receiver for direct sequence CDMA, based on an FPGA circuit and a DSP, handles seven channels simultaneously. The maximum PR sequence length was 128 chips at a rate of 1 Mchip/s. With this sequence length we achieve a data transmission rate of 7.8125 kbit/s. To test the resistance of the receiver to noise and interference, we added digitally generated Additive White Gaussian Noise (AWGN) to the received signal. For normal operation, an analogue communication system needs a signal-to-noise ratio of 17 dB or more. A CDMA system can, like all other digital systems, operate at a much lower signal-to-noise ratio. The tolerance to interference is proportional to the processing gain $G_p$. Because of the influence of MAI, we tested the receiver with various numbers of active transmitters. For a bit error probability of $P_b = 10^{-3}$ we needed a signal-to-AWGN ratio of -10.5 dB and -6.0 dB for one and six active transmitters, respectively. This is about 3.7 dB and 7.8 dB short of the theoretical limits. Within the time needed for one hypothesis ($T_{ACK}$ = 16 ms) in the searching phase, synchronisation succeeded in 94% and 47% of cases, respectively. The minimal DSP efficiency that assures real-time system operation is presented in Table 1. The results for a system without a correlation coprocessor are projected, because the performance of the DSP used was not sufficient for even a single channel. The results show that the coprocessor allows a DSP about 40 times less powerful to be used for the same result /15/.

| Number of transmitters        | K=1  | K=6   |
|-------------------------------|------|-------|
| Without coprocessor           |      |       |
| Searching phase (MIPS)        | 16.3 | 113.9 |
| Following phase (MIPS)        | 49.7 | 348.0 |
| Minimal DSP efficiency (MIPS) | 49.7 | 348.0 |
| With coprocessor              |      |       |
| Searching phase (MIPS)        | 1.3  | 8.9   |
| Following phase (MIPS)        | 0.5  | 3.2   |
| Minimal DSP efficiency (MIPS) | 1.3  | 8.9   |

Table 1: Minimal DSP efficiency required
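The figures quoted in this section can be cross-checked with a few lines of Python. The only inputs are numbers stated in the text and in Table 1; the variable names are arbitrary, and the bit rate calculation assumes one full PR sequence per data bit.

```python
CHIP_RATE = 1_000_000      # chips per second (1 Mchip/s)
N = 128                    # chips per PR sequence

bit_rate = CHIP_RATE / N               # assumes one sequence period per bit
t_ack = N ** 2 / CHIP_RATE             # T_ACK = N^2 * T_c, in seconds

# Minimal DSP efficiency from Table 1 in MIPS: (without, with coprocessor)
minimal_mips = {1: (49.7, 1.3), 6: (348.0, 8.9)}
reduction = {k: without / with_ for k, (without, with_) in minimal_mips.items()}

print(f"bit rate = {bit_rate / 1000} kbit/s")        # 7.8125 kbit/s
print(f"T_ACK    = {t_ack * 1000:.3f} ms")           # 16.384 ms
for k, factor in reduction.items():
    print(f"K = {k}: {factor:.1f} times fewer MIPS")   # 38.2 and 39.1
```

The ratios of roughly 38 and 39 are the basis for the statement that the coprocessor lowers the required DSP efficiency about 40 times.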

#### 6 Conclusion

The receiver, implemented mainly as software executed by the DSP, proved its worth especially through the ability to change certain functions without radical hardware modifications. This property extends to FPGAs as well, since the structure of these circuits is no longer defined by the masks used at the time of manufacture but is formed anew at each initialisation. The structure can also easily be changed while the FPGA is running, thus enabling adaptive algorithms. Besides that, replacing the FPGA with a bigger, faster, or even totally different one is no longer a problem, because the design is described in VHDL. Similarly, because the code for the DSP is written in a higher-level programming language such as C, migration to a newer DSP is possible without rewriting the whole software. The digital nature of the receiver allows components with greater tolerances, its sensitivity to temperature and noise is low, and no special trimming is needed in production as it is in an analogue realisation, which makes production cheaper.

Future improvements of this work include the use of a more powerful FPGA and DSP to increase the number and/or data transfer rate of the individual channels /15/ and so ensure applicability in Wideband CDMA (WCDMA) communications /16/. For more realistic measurements and for practical use, an HF section with a digital oscillator such as the COordinate Rotation DIgital Computer Numerically Controlled Oscillator (CORDIC NCO), a QPSK modulator, and a codec should be added. Since power consumption and size have a big influence on the mobility of the receiver, the individual parts should be selected more carefully.

#### References

- /1/ Savo Glišić and Branka Vučetić. Spread spectrum CDMA systems for wireless communications. Artech House, Inc., Norwood, MA 02062, 1997.
- /2/ Keith K. Onodera and Paul R. Gray. A 75-mW 128-MHz DS-CDMA Baseband Demodulator for High-Speed Wireless Applications. *IEEE Journal of Solid-State Circuits*, 33(5):753-761, May 1998.
- /3/ Andrew J. Viterbi. CDMA: Principles of spread spectrum communication. Addison-Wesley, Reading, Massachusetts 01967, 1998.
- /4/ Robert A. Scholtz. The Origins of Spread-Spectrum Communications. *IEEE Transactions on Communications*, COM-30(5):822-854, May 1982.
- /5/ Vijay K. Garg, Kenneth F. Smolnik, and Joseph E. Wilkers. Applications of CDMA in wireless/personal communications. Prentice-Hall PTR, Upper Saddle River, NJ 07458, 1997.
- /6/ Man Young Rhee. CDMA cellular mobile communication & network security. Prentice-Hall PTR, Upper Saddle River, NJ 07458, 1998.
- /7/ Laurence B. Milstein. Wideband Code Division Multiple Access. *IEEE Journal on Selected Areas in Communications*, 18(8):1344-1354, August 2000.
- /8/ Esmael H. Dinan and Bijan Jabbari. Spreading Codes for Direct Sequence CDMA and Wideband CDMA Cellular Networks. *IEEE Communications Magazine*, 36(9):48-54, September 1998.
- /9/ AT40K FPGAs with FreeRAM. Atmel Corporation, Rev. 0896B, January 1999.
- /10/ DSP Acceleration Using a Reconfigurable Coprocessor FPGA. Application Note, Atmel Corporation, Rev. 0724B, September 1999.
- /11/ TMS320C54x DSP Reference Set, Volume 1: CPU and Peripherals. Texas Instruments, August 1998.
- /12/ Roger Lipsett, Carl Schaefer, and Cary Ussery. VHDL: Hardware Description and Design. Kluwer Academic Publishers, Boston, 1989.
- /13/ Jayram Bhasker. A Guide to VHDL Syntax. Prentice-Hall PTR, Englewood Cliffs, NJ 07632, 1995.
- /14/ Peter Vicman. Korelatorski koprocesor za večkanalni sprejemnik CDMA. Diploma work, Maribor, April 2000.
- /15/ Filip S. Balan. Večkanalni sprejemnik za CDMA z uporabo digitalnega signalnega procesorja in vezij FPGA. Master thesis, Maribor, December 1999.
- /16/ Tero Ojanperä and Ramjee Prasad. Wideband CDMA for third generation mobile communications. Artech House, Inc., 685 Canton Street, Norwood, MA 02062, 1998.

mag. Filip Samo Balan, univ. dipl. inž. el.; izr. prof. dr. Zmago Brezočnik, univ. dipl. inž. el.; University of Maribor, Faculty of Electrical Engineering and Computer Science, Smetanova 17, 2000 Maribor, Slovenia; tel.: +386 2 220 72 02, fax: +386 2 251 11 78; e-mail: (balan,brezocnik)@uni-mb.si

Arrived: 10.05.2001; Accepted: 20.08.2001