Design and implementation of an FPGA-based multi-standard software radio receiver
Awan, Mehmood-Ur-Rehman; Alam, Muhammad Mahtab; Koch, Peter; Behjou, Nastaran

Published in:
Norchip, 2007

Publication date:
2007

Document Version
Accepted author manuscript, peer reviewed version


Mehmood-Ur-Rehman Awan, Muhammad Mahtab Alam, Peter Koch, Nastaran Behjou
Department of Electronic Systems, Technology Platform Section
Aalborg University Denmark
Email: (mura, mma, pk, nab)@es.aau.dk

Abstract—The aim of this work is to design and implement an FPGA-based multi-standard software radio receiver. WLAN and UMTS are taken as the case study, and the Xilinx Virtex-IV FPGA is the target platform. Bandpass sampling at 840MHz is used to alias the combined band of WLAN and UMTS. For channelization, a polyphase channelizer is employed instead of a conventional channelizer. The prototype filter designed for WLAN has 50 taps, partitioned into 5 polyphase sub-filters, whereas the prototype filter for UMTS has 2520 taps, partitioned into 210 polyphase sub-filters. For the implementation, a serial polyphase structure with parallel MAC is selected. An implementation analysis of different structures is performed, based on the area requirements for multipliers, adders and registers. For a 16-tap filter, the Parallel Multiply and Accumulate, Distributed Arithmetic (DA), Fast FIR, and frequency domain filtering structures require 2896 (without adders), 3072, 4064, and 5572 slices, respectively. DA is found to be suitable for the implementation because it is resource efficient. The polyphase sub-filter is implemented with the Distributed Arithmetic structure and also with Xilinx DSP48 slices for improved performance.

I. INTRODUCTION

Rapid growth of wireless communications and the emergence of new standards increase the demand for low-cost multi-mode radio receivers. For portable battery-powered receivers, a high level of integration, high flexibility, and low power dissipation are the primary objectives [1]. One approach to multi-mode operation in a receiver is to design hardware that can be reconfigured by software. This approach is related to the concept of software-defined radio (SDR) [2]. In the transition from traditional radio architectures to software-defined radio, most of the signal processing is shifted from the analog to the digital domain. This is made feasible by moving the ADC as close to the antenna as possible.

A software radio receiver architecture is presented in Fig. 2. The idea in this paper is to use an efficient technique called bandpass sampling, which directly samples the Radio Frequency (RF) signal after the Low Noise Amplifier (LNA), so that all signal processing is done in the digital domain. This overcomes problems such as the I/Q imbalance of the analog components in digital radios and even in software-defined radios. Moreover, because the data is processed digitally, the functionality unique to each standard can be set in the programmable digital signal processing parts, using the same concept as software-defined radio. This enables the front-end to process numerous signals in the digital domain without the traditional hardware limitations.

Fig. 1. A scenario of multi-standard, multi-mode "all-in-one" front-end user equipment. It highlights user equipment capable of receiving two standards, i.e. UMTS and WLAN.

<table>
<thead>
<tr>
<th></th>
<th>UMTS</th>
<th>IEEE 802.11g</th>
</tr>
</thead>
<tbody>
<tr>
<td>Duplexing</td>
<td>FDD</td>
<td>TDD</td>
</tr>
<tr>
<td>Frequency Band (GHz)</td>
<td>1.920 - 1.980 (UL)</td>
<td>2.4 - 2.4835</td>
</tr>
<tr>
<td></td>
<td>2.110 - 2.170 (DL)</td>
<td></td>
</tr>
<tr>
<td>Rx Sensitivity</td>
<td>-117 dBm</td>
<td>-82 to -65 dBm</td>
</tr>
<tr>
<td>Tx Power Level</td>
<td>24 dBm (Class 3)</td>
<td>20 dBm (Europe)</td>
</tr>
<tr>
<td>Channel Bandwidth</td>
<td>3.84 MHz</td>
<td>16.6 MHz</td>
</tr>
<tr>
<td>Non-overlapped channels</td>
<td>12</td>
<td>3</td>
</tr>
</tbody>
</table>

TABLE I
SOME SPECIFICATIONS OF UMTS AND WLAN STANDARDS [4]

![Proposed architecture of the software radio](image)

**Fig. 2.** The proposed architecture of the software radio, where sampling is done at RF just after the LNA which is the only analog component in this architecture.

## II. System Design

The polyphase channelizer is the most efficient approach in terms of computation and required hardware resources when compared to the standard channelizer [5]. Because of these features, we have chosen it to design and implement the system.

The relation among the sampling frequency, channel spacing and number of channels for the polyphase channelizer is [6]:

\[
 f_s = N \times \Delta f \tag{1}
\]

where \( f_s \) is the input sampling frequency, \( N \) is the number of channels (the transform size) and \( \Delta f \) is the inter-channel spacing. Two constraints have to be met in the polyphase channelizer [10]:

- The channels to be down-sampled and down-converted to baseband should be centered on multiples of the channel spacing or on multiples of a quarter of the channel spacing.
- The number of channels (\( N \)) must be an integer.

The sampling frequency of 840MHz [5] is selected after examining different sampling frequencies, because it fulfills the two constraints above. The RF spectrum of WLAN and UMTS is aliased down to the lower spectrum range of (36-410) MHz, as shown in Fig. 3.

**Fig. 3.** The combined spectrum of UMTS and WLAN is bandpass sampled at 840MHz; the resulting aliases in the first Nyquist zone are spectrally inverted.
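The aliased range quoted above can be checked with a short calculation. The following is a minimal sketch, not the authors' tooling: the function name `alias_band` is hypothetical, the UMTS downlink band (2110-2170MHz) is assumed to be the received band, and each band is assumed to lie inside a single Nyquist zone of width \( f_s/2 \).

```python
def alias_band(f_lo_mhz, f_hi_mhz, fs_mhz):
    """Return (low, high, inverted) of a band after sampling at fs_mhz.

    Assumes the whole band lies inside one Nyquist zone of width fs/2.
    Zones are counted from 0; odd zones fold back spectrally inverted.
    """
    half = fs_mhz / 2.0
    zone_lo = int(f_lo_mhz // half)
    zone_hi = int((f_hi_mhz - 1e-9) // half)
    if zone_lo != zone_hi:
        raise ValueError("band straddles a Nyquist zone boundary")
    r_lo = f_lo_mhz % fs_mhz
    r_hi = f_hi_mhz % fs_mhz
    if zone_lo % 2 == 0:
        return (r_lo, r_hi, False)
    # Odd zone: the alias is mirrored about fs, so the band edges swap.
    return (fs_mhz - r_hi, fs_mhz - r_lo, True)


if __name__ == "__main__":
    for name, lo, hi in (("UMTS DL", 2110, 2170), ("WLAN", 2400, 2483.5)):
        a_lo, a_hi, inverted = alias_band(lo, hi, 840)
        print(f"{name}: {a_lo}-{a_hi} MHz, inverted={inverted}")
```

Under these assumptions the WLAN band lands at 36.5-120MHz and the UMTS downlink band at 350-410MHz, both inverted, which is consistent with the (36-410) MHz range and the inversion noted in Fig. 3.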

According to Eq. 1, at 840MHz the polyphase channelizer for WLAN has 35 channels of 24MHz and the one for UMTS has 168 channels of 5MHz. However, only 3 WLAN channels and 12 UMTS channels are required. This puts an extra load on the filtering process in terms of a high clock speed requirement and large memory storage for the filter coefficients. One technique to solve this problem is to re-sample the data before the polyphase channelizer, as shown in Fig. 4. If the incoming signal is image free, the sampled signal can be re-sampled by large factors as long as the resulting sampling frequency stays above the total signal bandwidth. The re-sampling process in this case is simply a spectrum translation [5]. Based on this technique, the WLAN and UMTS bandpass filters are made complex, and the resulting image-free signals for WLAN and UMTS are tested with different re-sampling factors to find the minimum possible sampling frequencies, which are listed in Tables II and III.
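As an illustration of Eq. 1 and the integer constraint on \( N \), the channel counts can be computed as follows. This is a sketch only; `channel_count` is a hypothetical helper, and exact rational arithmetic is used so that non-integer results are detected reliably.

```python
from fractions import Fraction


def channel_count(fs_mhz, spacing_mhz):
    """Return N = fs / delta_f from Eq. (1), or None when N is not an integer."""
    n = Fraction(str(fs_mhz)) / Fraction(str(spacing_mhz))
    return int(n) if n.denominator == 1 else None


if __name__ == "__main__":
    # Channel spacing and number of required channels as stated in the text.
    for name, spacing, required in (("WLAN", 24, 3), ("UMTS", 5, 12)):
        n = channel_count(840, spacing)
        print(f"{name}: N = {n} channels at 840 MHz, {required} required")
```

The gap between the computed and the required channel counts is what motivates re-sampling before the channelizer.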

<table>
<thead>
<tr>
<th>Down-sample factor</th>
<th>New Sampling Freq. (MHz)</th>
<th>Channel Status</th>
</tr>
</thead>
<tbody>
<tr>
<td>5</td>
<td>168</td>
<td>non-overlapped, non-integer</td>
</tr>
<tr>
<td>6</td>
<td>140</td>
<td>non-overlapped, non-integer</td>
</tr>
<tr>
<td>7</td>
<td>120</td>
<td>non-overlapped, integer</td>
</tr>
<tr>
<td>8</td>
<td>105</td>
<td>non-overlapped, non-integer</td>
</tr>
<tr>
<td>10</td>
<td>84 &amp; below</td>
<td>&lt; 84.5MHz bandwidth</td>
</tr>
</tbody>
</table>

**TABLE II**

Re-sampling factors for WLAN with complex signal, showing the channel status as overlapped/non-overlapped and resulting number of channels as integer/non-integer.

<table>
<thead>
<tr>
<th>Down-sample factor</th>
<th>New Sampling Freq. (MHz)</th>
<th>Channel Status</th>
</tr>
</thead>
<tbody>
<tr>
<td>4</td>
<td>210</td>
<td>non-overlapped, integer</td>
</tr>
<tr>
<td>7</td>
<td>120</td>
<td>non-overlapped, integer</td>
</tr>
<tr>
<td>8</td>
<td>105</td>
<td>non-overlapped, integer</td>
</tr>
<tr>
<td>10</td>
<td>84</td>
<td>non-overlapped, non-integer</td>
</tr>
<tr>
<td>12</td>
<td>70</td>
<td>non-overlapped, integer</td>
</tr>
<tr>
<td>14</td>
<td>60</td>
<td>non-overlapped, integer</td>
</tr>
<tr>
<td>15 &amp; above</td>
<td>below 60</td>
<td>&lt; 60MHz bandwidth</td>
</tr>
</tbody>
</table>

**TABLE III**

Re-sampling factors for UMTS with complex signal, showing the channel status as overlapped/non-overlapped and resulting number of channels as integer/non-integer.

The desired rates for WLAN and UMTS at baseband are 20MHz and 61.44MHz respectively. Table II shows that the maximum possible re-sampling factor for WLAN is 7, which results in a new sampling frequency of 120MHz with 5 channels of 24MHz and fits the non-overlapped channel criterion. Table III shows that the maximum possible re-sampling factor for UMTS is 14, which results in a new sampling frequency of 60MHz. To obtain the desired UMTS rate of 61.44MHz from 60MHz, an embedded re-sampling factor of 125/128 is required. Similarly, the other two re-sampling factors of 12 and 8 require embedded re-sampling factors of 875/768 and 875/512. With this rational embedded re-sampling, the prototype filter has to be designed at the up-sampled frequency. To have the minimum up-sampling factor, the embedded re-sampling factor of 875/512 is selected and rounded to 17/10. Finally, re-sampling factors of 7 and 8 are selected for WLAN and UMTS, resulting in new sampling frequencies of 120MHz and 105MHz respectively. This is illustrated in Fig. 4 and summarized in Table IV.
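The embedded re-sampling factors quoted above follow from dividing each candidate sampling frequency by the 61.44MHz target rate. A minimal sketch of that arithmetic, with hypothetical names, is:

```python
from fractions import Fraction

TARGET_UMTS_MHZ = Fraction("61.44")  # desired UMTS baseband rate from the text


def embedded_factor(fs_mhz):
    """Exact down-sampling ratio fs / 61.44 MHz as a reduced fraction."""
    return Fraction(fs_mhz) / TARGET_UMTS_MHZ


factors = {fs: embedded_factor(fs) for fs in (60, 70, 105)}
rounded = Fraction(17, 10)            # rounded form of 875/512 used in the design
achieved_mhz = float(105 / rounded)   # output rate obtained with 17/10

if __name__ == "__main__":
    for fs, factor in factors.items():
        print(f"{fs} MHz -> {factor} (about {float(factor):.4f})")
    print(f"105 MHz with 17/10 -> {achieved_mhz:.2f} MHz")
```

The last line shows the cost of the rounding: 17/10 yields roughly 61.76MHz rather than exactly 61.44MHz.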


TABLE IV

<table>
<thead>
<tr>
<th>Cases</th>
<th>Sampling rate (MHz)</th>
<th>Channel Spacing (MHz)</th>
<th>No. of Channels</th>
</tr>
</thead>
<tbody>
<tr>
<td>UMTS</td>
<td>105</td>
<td>5</td>
<td>21</td>
</tr>
<tr>
<td>WLAN</td>
<td>120</td>
<td>24</td>
<td>5</td>
</tr>
</tbody>
</table>

The corresponding band of UMTS channels translates to (37.5 to -12.5) MHz after re-sampling by a factor of 8. With the new sampling frequency of 105MHz and a channel spacing of 5MHz, the number of channels becomes 21, which is also the number of paths in the polyphase decomposition. The prototype filter for UMTS has 2520 taps, partitioned into 210 polyphase sub-filters, which results in 12 taps per sub-filter. The polyphase channelizer for UMTS is shown in Fig. 7.

The down-sampling factor needed to reach the 61.44MHz target rate from the 105MHz sampling frequency is approximately 1.7, or 17/10. This ratio is realized by first up-sampling the input stream by 10 and then down-sampling it by 17. The up-sampling is performed by zero packing the input data, and the down-sampling by serpentine shifting of the data through the filter in strides of length 17 [7]. The process is illustrated for two data load iterations in Fig. 8.

There is no actual zero packing in the final configuration. In the first data load, shown in Fig. 8(A), 2 actual data samples are delivered to the next 17 register addresses, and in the second load, shown in Fig. 8(B), 2 actual data samples are again delivered to the next 17 register addresses. The data loading procedure is periodic in 210 load cycles, so 210 states are required to control the process. The Least Common Multiple (LCM) of 21 and 17 is 357, and since 17 zero-packed inputs are delivered at a time, 21 states are needed. With the up-sampling factor of 10, the LCM of 21 and 10 is 210.
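Why the zero packing never has to be carried out can be seen in a single-channel sketch of 10/17 rational re-sampling. This is not the 21-path channelizer itself. It only compares a reference that really inserts the zeros against an equivalent form that picks one subset of weights per output. The function names, the random test filter and the input length are all assumptions made for the illustration.

```python
import numpy as np


def resample_zero_packed(x, h, up, down):
    """Reference: insert up-1 zeros, filter with h, keep every down-th sample."""
    u = np.zeros(len(x) * up)
    u[::up] = x
    v = np.convolve(u, h)[: len(u)]
    return v[::down]


def resample_polyphase(x, h, up, down):
    """Same outputs without zero packing: each output uses one weight subset."""
    n_out = -(-(len(x) * up) // down)   # ceiling division
    taps = -(-len(h) // up)             # weights per subset
    y = np.zeros(n_out)
    for n in range(n_out):
        t = n * down                    # position in the virtual zero-packed stream
        phase, newest = t % up, t // up
        acc = 0.0
        for i in range(taps):
            k = phase + up * i          # weights h[phase], h[phase + up], ...
            j = newest - i              # actual input sample that meets weight k
            if k < len(h) and j >= 0:
                acc += h[k] * x[j]
        y[n] = acc
    return y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(64)
    h = rng.standard_normal(120)        # placeholder filter, not the prototype
    ref = resample_zero_packed(x, h, 10, 17)
    fast = resample_polyphase(x, h, 10, 17)
    print(np.allclose(ref, fast))
```

In the second function only every tenth weight is touched for a given output, which is the property the final configuration exploits.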

Fig. 4. System block diagram having re-samplers prior to UMTS and WLAN channelizers.

Fig. 5. WLAN channelizer: $k$ and $s$ are tuning parameters. $k$ is the channel number and $s$ is the offset of multiples of quarter of the channel spacing.

The down-sampling factor needed to reach the required 20MHz rate from the 120MHz sampling frequency is 6. It is realized by serpentine shifting of the data through the filter in strides of length 6. The process is illustrated for two data load iterations in Fig. 6.
For the UMTS channelizer, the periodic interval of the data loading is 210 states. Table V lists the memory loading instructions that anchor the data registers and cycle the data load for the UMTS channelizer. Note that over the 210 states, a total of 357 inputs are delivered and 210 outputs are taken from the polyphase engine to realize the desired embedded 17/10 re-sampling. The loading scheme has a constant offset of -10 modulo 21, both within a sequence and in the transition between sequences. The -10 offset is a consequence of the 1-to-10 up-sampling, which is represented by the zero packing but not actually implemented in the process.
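The register loading pattern of the UMTS channelizer can be reproduced with a small model. This is a sketch under assumptions: the starting register R16 is read off state 0 of Table V, the register address is taken to decrease by one per zero-packed slot (which gives the -10 modulo 21 step between actual samples), and the names are hypothetical.

```python
from math import gcd

UP, DOWN, ARMS = 10, 17, 21
START_REG = 16  # assumption: first register, taken from state 0 of Table V


def loading_sequence():
    """Map each state to the registers that receive an actual input sample."""
    n_states = ARMS * UP // gcd(ARMS, UP)    # LCM(21, 10) = 210 states
    n_inputs = n_states * DOWN // UP         # 357 actual samples per period
    states = [[] for _ in range(n_states)]
    for m in range(n_inputs):
        slot = m * UP                        # position in the virtual zero-packed stream
        state = slot // DOWN                 # 17 zero-packed slots per data load
        reg = (START_REG - slot) % ARMS      # serpentine step of -10 modulo 21
        states[state].append(f"R{reg}")
    return states


if __name__ == "__main__":
    seq = loading_sequence()
    for s in (0, 1, 2, 3, 208, 209):
        print(s, len(seq[s]), ", ".join(seq[s]))
```

Under these assumptions the printed rows agree with the entries listed in Table V, and the total number of delivered inputs over the 210 states is 357.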

Because of the 1-to-10 up-sampling represented by the zero packing, only one-tenth of the weights in each stage actually contribute to the sub-filter output. Each stage is therefore further partitioned into 10 subsets of weights, which results in a total of 21 × 10 = 210 filter weight sets. These sets are denoted by C0, C1, ..., C209, where the integer is the starting index in the original non-partitioned prototype filter. Each filter starts at its index and increments in strides of length 210. Table VI lists the filter assignment to the 21 successive data registers for the 210 states of the process. Table VI shows that within a given state the successive filter index increments by 22 modulo 210, and between states the filter index increments by 17 modulo 210. The integer 22 is the offset between two data samples in the zero-packed load in two adjacent rows. The index 17 is the number of zero-packed data points introduced per data load cycle. The prototype filter has to be designed to operate at 10 times f_s, or 1050MHz, because the data is up-sampled by a factor of ten on the way into the filter. Consequently, the filter becomes ten times longer than the standard design, but since only one-tenth of it is used per processing cycle, no processing penalty is paid [7].
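The partition into weight sets can be illustrated by index bookkeeping alone. In the sketch below the prototype filter is replaced by a stand-in array whose values are simply the tap indices, so that each set shows which taps it contains. The variable names are hypothetical.

```python
import numpy as np

N_TAPS, ARMS, UP = 2520, 21, 10
h = np.arange(N_TAPS)                    # stand-in prototype: value = tap index

# 21 stages of 120 weights each: stage r holds taps r, r+21, r+42, ...
stages = [h[r::ARMS] for r in range(ARMS)]

# Each stage is split into 10 subsets; key = starting index, as in C0..C209.
weight_sets = {}
for stage in stages:
    for q in range(UP):
        subset = stage[q::UP]            # every tenth weight of the stage
        weight_sets[int(subset[0])] = subset

if __name__ == "__main__":
    print(len(weight_sets), "weight sets of", len(weight_sets[0]), "taps")
    print("C5 uses taps", weight_sets[5][:3], "...")
```

Every set has 12 taps spaced 210 apart, and the 210 starting indices cover 0 to 209 exactly once, matching the 2520-tap prototype split into 210 sub-filters.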

III. IMPLEMENTATION

In the implementation phase, the polyphase channelizers are analyzed in terms of their required components: a demultiplexer acting as commutator, a filter bank of polyphase filters, and finally the coherent phase summation. Different structural techniques can be used to carry out the implementation. To select the best technique for the designed receiver, the general polyphase structure and several optimized structures are considered: the symmetric-property-based structure, the adder-shared structure, and serial polyphase structures with serial and parallel MAC. Based on the complexity analysis shown in Table VII, the serial polyphase structure with parallel MAC is selected for the final implementation, as shown in Fig. 9.

<table>
<thead>
<tr>
<th>State</th>
<th>No. of Inputs</th>
<th>Loading Sequence</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>2</td>
<td>R16, R6</td>
</tr>
<tr>
<td>1</td>
<td>2</td>
<td>R17, R7</td>
</tr>
<tr>
<td>2</td>
<td>2</td>
<td>R18, R8</td>
</tr>
<tr>
<td>3</td>
<td>1</td>
<td>R19</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>208</td>
<td>2</td>
<td>R4, R15</td>
</tr>
<tr>
<td>209</td>
<td>1</td>
<td>R5</td>
</tr>
</tbody>
</table>

**TABLE V**
UMTS POLYPHASE FILTER’S REGISTER LOADING SEQUENCE WITH THE STATE MACHINE

<table>
<thead>
<tr>
<th>Cases</th>
<th># Mults</th>
<th>#Adders</th>
<th>#Regs</th>
<th>Clock speed</th>
</tr>
</thead>
<tbody>
<tr>
<td>Polyphase General (Transpose form)</td>
<td>N</td>
<td>((N/M)-1)x M</td>
<td>N</td>
<td>f_s/M</td>
</tr>
<tr>
<td>Symmetric form (Shared Multipliers)</td>
<td>N/2</td>
<td>((N/M)-1)x M</td>
<td>N</td>
<td>2f_s/M</td>
</tr>
<tr>
<td>Symmetric form (Shared Multipliers &amp; Adders)</td>
<td>N/2</td>
<td>((N/M)x M)/2</td>
<td>Nx2</td>
<td>2f_s/M</td>
</tr>
<tr>
<td>Serial Polyphase (Serial MAC)</td>
<td>1</td>
<td>1</td>
<td>N</td>
<td>f_s x (N/M)</td>
</tr>
<tr>
<td>Serial Polyphase (Parallel MAC)</td>
<td>N/M</td>
<td>(N/M)-1</td>
<td>N</td>
<td>f_s</td>
</tr>
</tbody>
</table>

**TABLE VII**
COMPLEXITY ANALYSIS FOR POLYPHASE FILTER BANK, IN TERMS OF MULTIPLIERS, ADDERS, REGISTERS, AND CLOCK REQUIREMENTS.
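To make the multiplier and clock columns of Table VII concrete, they can be evaluated for the WLAN prototype filter (50 taps, 5 sub-filters, 120MHz). The sketch below assumes that N in the table is the prototype filter length and M the number of polyphase sub-filters, since the table does not define them. The function name and dictionary keys are hypothetical, and the shared-adder variant is omitted because its multiplier and clock entries equal those of the symmetric form.

```python
def complexity(n_taps, m_branches, fs_mhz):
    """Multipliers and clock (MHz) per structure, following Table VII.

    Assumption: N is the prototype filter length and M the number of
    polyphase sub-filters.
    """
    taps_per_branch = n_taps // m_branches
    return {
        "general (transpose form)": (n_taps, fs_mhz / m_branches),
        "symmetric (shared multipliers)": (n_taps // 2, 2 * fs_mhz / m_branches),
        "serial polyphase, serial MAC": (1, fs_mhz * taps_per_branch),
        "serial polyphase, parallel MAC": (taps_per_branch, fs_mhz),
    }


wlan = complexity(50, 5, 120)

if __name__ == "__main__":
    for name, (mults, clock) in wlan.items():
        print(f"{name:34s} {mults:4d} multipliers, {clock:7.1f} MHz")
```

With these inputs the serial MAC variant needs a single multiplier but a 1200MHz clock, whereas the parallel MAC variant needs 10 multipliers at the 120MHz input rate.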
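For a 16-tap filter, the Parallel Multiply and Accumulate, Distributed Arithmetic (DA), Fast FIR, and frequency domain filtering structures require 2896 (without adders), 3072, 4064, and 5572 slices, respectively. Distributed Arithmetic is found to be suitable for the implementation because it is resource efficient.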

The focus of the above techniques is to use as few multipliers as possible in order to save area. However, modern FPGAs have dedicated multiplier blocks that are more efficient than multipliers built from CLB slices, mainly in terms of operating speed and power consumption. The Xilinx Virtex-IV FPGA has XtremeDSP blocks that can perform multiplications at up to 500MHz, and using these blocks increases the system performance. Each XtremeDSP block has two DSP48 slices [9]. The polyphase filter bank, implemented as a serial polyphase filter structure with parallel MAC, can therefore be built with 10 DSP48 slices for the WLAN channelizer and 12 for the UMTS channelizer. In the fixed-point implementation of the WLAN channelizer, the word lengths of the input data, filter coefficients, complex phasors and complex output are 16 bits (1 sign, 7 integer, 8 fraction), 12 bits (1 sign, 11 fraction), 16 bits (1 sign, 1 integer, 14 fraction), and 30 bits (1 sign, 10 integer, 19 fraction) respectively. The resource utilization of the polyphase channelizer for WLAN is tabulated in Table VIII.
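The word lengths above can be explored with a simple fixed-point model. The following is only a sketch: the rounding and saturation behaviour are assumptions (the text does not state how overflow or rounding is handled), and the names `quantize` and `FORMATS` are hypothetical.

```python
def quantize(value, int_bits, frac_bits):
    """Round to a signed fixed-point grid with saturation (assumed behaviour).

    The format has 1 sign bit, int_bits integer bits and frac_bits fraction
    bits; the represented value is returned as a float.
    """
    scale = 1 << frac_bits
    lo = -(1 << (int_bits + frac_bits))
    hi = (1 << (int_bits + frac_bits)) - 1
    code = max(lo, min(hi, round(value * scale)))
    return code / scale


# (integer bits, fraction bits) for each signal, as listed in the text.
FORMATS = {
    "input": (7, 8),
    "coefficient": (0, 11),
    "phasor": (1, 14),
    "output": (10, 19),
}

if __name__ == "__main__":
    for name, (i, f) in FORMATS.items():
        print(f"{name:12s} {1 + i + f:2d} bits, step {2.0 ** -f:.3e}")
    print(quantize(3.14159, *FORMATS["input"]))
```

One observation from the formats: a product of an input sample (8 fraction bits) and a coefficient (11 fraction bits) has 19 fraction bits, which equals the fraction width of the output format.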

The maximum operating frequency of the design is 134MHz, which exceeds the required 120MHz.

<table>
<thead>
<tr>
<th>Selected Device</th>
<th>xc4vsx35</th>
</tr>
</thead>
<tbody>
<tr>
<td>No. of slices</td>
<td>1178 out of 15360 (7%)</td>
</tr>
<tr>
<td>No. of Slice Flip Flops</td>
<td>1649 out of 30720 (5%)</td>
</tr>
<tr>
<td>No. of DSP48s</td>
<td>14 out of 192 (7%)</td>
</tr>
</tbody>
</table>

**TABLE VIII**

**RESOURCE UTILIZATION OF WLAN CHANNELIZER**

IV. Conclusion

We presented a dual-standard software radio receiver architecture. The system is designed around the resource-efficient polyphase channelizer, which is used to extract the 12 UMTS and 3 WLAN channels at the desired baseband rates. The serial polyphase filter structure with parallel MAC is considered for the FPGA implementation. An analysis of the hardware area shows that Distributed Arithmetic and the dedicated XtremeDSP DSP48 blocks are the most efficient options for the polyphase channelizer.

The sampling frequency is a critical parameter in the whole system design. With multiple bands the complete spectrum is much wider, so a higher sampling frequency is required to fulfill the Nyquist criterion of $f_s \geq 2B$. This restricts the choice of hardware platform to those with high-speed ADCs and technology with higher switching speed.

There is always room for improvement; the following are some directions for future work [5].

1) The polyphase channelizer can be used to its full potential, that is, extracting all of the channels of any standard, by heterodyning at the input of the polyphase channelizer, with the heterodyning carrier selected such that the translated channels have equal channel spacing. All the channels of a standard can then be extracted with the standard polyphase channelizer alone, rather than with the variant that compensates for offsets of multiples of a quarter of the channel spacing.

2) In the polyphase channelizer for UMTS, the required down-sampling factor of 875/512 is rounded to 17/10, which results in an output sampling rate of 61.76MHz instead of 61.44MHz. The arbitrary sampling rate technique [10] can be used along with the polyphase channelizer to obtain the exact required sampling rate of 61.44MHz.

**ACKNOWLEDGMENT**

The research described in this publication was carried out in the Center for Software Defined Radio at Aalborg University. Special thanks go to Prof. Fredric J Harris, San Diego State University (USA), for his valuable guidance in setting up the system design.

**REFERENCES**