## A Bandwidth Efficient Turbo Coding Scheme for VDSL Systems

Sreekanth Marti

A Thesis in The Department of Electrical and Computer Engineering

Presented in Partial Fulfillment of the Requirements for the Degree of Master of Applied Science at Concordia University, Montréal, Québec, Canada

April 2003

© Sreekanth Marti, 2003

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats. The author retains ownership of the copyright in this thesis.
Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

0-612-77974-2

#### ABSTRACT

## A Bandwidth Efficient Turbo Coding Scheme for VDSL Systems

#### Sreekanth Marti

The two important issues presently under consideration by ANSI and ETSI for very high bit-rate digital subscriber lines (VDSL) are the last-mile problem and the home local area network (home-LAN) interference problem. The first issue concerns the maximum distance over which a VDSL system can operate reliably at a given data rate. The second issue is the interference due to home-LAN services sharing the twisted pair lines. The drawback of the existing forward error correction (FEC) scheme for VDSL systems, a 4D Wei-RS scheme, is that further improvement cannot be achieved without a substantial increase in complexity and power penalty. Also, VDSL systems employing the 4D Wei-RS scheme operate far below the channel capacity. On the other hand, several techniques based on iterative decoding have been proposed to solve the home-LAN interference problem. However, these techniques are complex to implement. In order to provide solutions to these problems and to ensure reliable transmission of data over longer loops while providing the end user with a maximal bit rate, a good FEC scheme with a high coding gain is required. In this thesis, the last-mile and home-LAN interference problems of VDSL systems are addressed using turbo codes.
With regard to the last-mile problem, a bandwidth efficient turbo coding scheme is proposed in which turbo codes are combined with bandwidth efficient modulation and soft information is exchanged between the decoder and the demodulator in an iterative manner. The main objective of the proposed scheme is to provide a higher coding gain than that provided by the 4D Wei-RS scheme, resulting in an improved performance of the VDSL modems in terms of bit rate, loop length and transmitting power. The scheme is investigated for various values of transmitting power, signal frequencies and numbers of crosstalkers for a targeted BER of 10<sup>-5</sup>. The effects of various code parameters on the performance of VDSL modems are explored. In order to reduce the latency at the receiver end, a pipelined decoding scheme is proposed. Simulation results are presented and compared with those of the 4D Wei-RS scheme. The results show that the choice of turbo codes not only provides a significant coding gain over the standard FEC scheme but also efficiently maximizes the loop length and bit rate at a very low transmitting power in the presence of dominant far-end crosstalk (FEXT). In order to compare the hardware complexity, the proposed and 4D Wei-RS schemes are synthesized using Synopsys with the Xilinx 4010-3 as the target technology. The Xilinx FPGA statistics of the proposed scheme are compared with those of the 4D Wei-RS scheme.

To mitigate the effect of home-LAN interference, an iterative soft interference cancellation and decoding technique is proposed in which the VDSL and home-LAN signals are jointly detected using a soft interference canceller and a soft-in soft-out demodulator combined with a set of turbo decoders. A symbol estimation algorithm based on the soft information of the coded data is developed to estimate the transmitted symbols. Simulations are carried out to evaluate the BER performance of the proposed technique.
The data rates provided by the VDSL systems employing the proposed technique are evaluated. The results show that by employing the proposed technique in VDSL systems, the home-LAN interference can be successfully minimized, thereby providing higher data rates and maximizing the loop length.

Dedicated to my uncle, aunty, and parents

#### ACKNOWLEDGEMENTS

I would like to express my sincere gratitude to my supervisor Dr. M. O. Ahmad for providing the guidance and support that made this work possible. I am grateful for his extremely careful and thorough review of my thesis, and for his inspiration throughout the course of this work. I feel privileged to have had the opportunity to work with him. I would also like to thank all my teachers, who were of great support throughout my academic life.

I am heartily grateful to my parents, elder uncle and aunt for their love, support and the sacrifices that they made in bringing me up. I would like to express my deepest gratitude to my grandmother, brother, aunts, uncles, cousins and my fiancée Deepa, who supported me with their love and understanding. Finally, I would like to thank all my friends for their help and advice.
## TABLE OF CONTENTS

| Section | Title |
|---------|-------|
| | LIST OF TABLES |
| | LIST OF FIGURES |
| | LIST OF ABBREVIATIONS AND SYMBOLS |
| 1 | General |
| 1.1 | Basic Concepts of VDSL |
| 1.1.1 | VDSL System Architecture |
| 1.1.2 | VDSL System Requirements |
| 1.2 | VDSL Issues |
| 1.3 | Scope and Organization of the Thesis |
| 2 | VDSL Line Impairments and Impact of Coding in VDSL Systems |
| 2.1 | Introduction |
| 2.2 | VDSL Channel Model |
| 2.3 | VDSL Line Impairments |
| 2.3.1 | Background Noise |
| 2.3.2 | Crosstalk Noise |
| 2.3.3 | Impulse Noise |
| 2.3.4 | Radio Frequency Interference |
| 2.4 | Impact of Coding |
| 2.5 | Overview of the Existing Coding Scheme |
| 2.5.1 | Reed-Solomon Coding |
| 2.5.2 | 16 State 4D Wei Coder |
| 2.6 | VDSL Line Codes |
| 2.6.1 | CAP/QAM |
| 2.6.2 | DMT |
| 2.6.3 | Comparison of QAM and DMT |
| 2.7 | Conclusion |
| 3 | A Bandwidth Efficient Turbo Coding Scheme for VDSL |
| 3.1 | Introduction |
| 3.2 | Basic Concepts of Turbo Codes |
| 3.2.1 | Turbo Encoder |
| 3.2.2 | Turbo Decoder |
| 3.3 | Bit Interleaved Turbo Coded Modulation |
| 3.3.1 | Demodulation and Bit LLR Computation |
| 3.3.2 | Binary Turbo Decoder |
| 3.3.3 | Design Criteria for Constituent Codes |
| 3.3.4 | Signal Mapping |
| 3.4 | Pipe-Lined Decoding Scheme |
| 3.5 | Complexity Analysis |
| 3.5.1 | Arithmetic Complexity |
| 3.5.2 | Hardware Complexity |
| 3.6 | Simulation Results |
| 3.6.1 | Performance of Bit Interleaved Coded Modulation Scheme |
| 3.6.2 | Performance Comparison between Different Codes |
| 3.7 | Conclusion |
| 4 | Performance Evaluation of VDSL Employing BICM Scheme |
| 4.1 | Introduction |
| 4.2 | Effect of Transmitting Frequency |
| 4.3 | Effect of Transmitting Power |
| 4.4 | Effect of Numbers of Crosstalkers |
| 4.5 | Realizable Loop Length |
| 4.6 | Conclusion |
| 5 | An Iterative Soft Interference Cancellation and Decoding Technique to Mitigate the Effect of Home-LAN on VDSL |
| 5.1 | Introduction |
| 5.2 | System Model |
| 5.3 | Iterative Turbo Multiuser Receiver Structure |
| 5.4 | Soft Interference Cancellation via Multiuser Detection |
| 5.5 | Simulation Results |
| 5.6 | Conclusion |
| 6 | Conclusion and Future Work |
| 6.1 | Contributions and Concluding Remarks |
| 6.2 | Scope for Further Investigation |
| | Appendix |
| | References |

## LIST OF TABLES

| Table | Caption |
|-------|---------|
| 2.1 | Parameters for a simplified cable model |
| 3.1 | Best rate-1/3 turbo codes |
| 3.2 | Decoder complexity |
| 3.3 | Number of clock cycles required for metric computation |
| 3.4 | Range of parameters |
| 3.5 | Maximum storage requirements |
| 3.6 | Xilinx FPGA statistics |
| 3.7 | Performance comparison between the codes for a targeted BER of $10^{-5}$ for various code memories |
| 3.8 | Performance comparison between the codes for a targeted BER of $10^{-5}$ for various interleaver sizes |
| 3.9 | Required SNR for various codes for a targeted BER of $10^{-5}$ |
| 4.1 | Comparison of data rates for various codes at $f = 14$ MHz and transmitting power = 10 dBm |
| 4.2 | Comparison of data rates for various codes for number of crosstalkers = 40 |
| 4.3 | Realizable loop lengths with various codes for a data rate of 40 Mbps |

## LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| 1.1 | Block diagram of a typical DSL transceiver |
| 1.2 | FTTE architecture [3] |
| 1.3 | FTTC architecture [3] |
| 1.4 | VDSL spectrum |
| 2.1 | Attenuation as a function of frequency of two 26 AWG (0.4 mm) lines and two 24 AWG (0.5 mm) lines |
| 2.2 | Illustration of near-end crosstalk (NEXT) |
| 2.3 | NEXT coupling with 6 dB noise margin |
| 2.4 | Illustration of far-end crosstalk |
| 2.5 | FEXT coupling loss with 6 dB noise margin for $N = 49$ |
| 2.6 | Spectral efficiency as a function of loop length |
| 2.7 | Signal-to-self-FEXT plus white noise ratio for 24-gauge twisted-pair loop |
| 2.8 | Signal-to-self-FEXT plus white noise ratio for 24-gauge twisted-pair loop as a function of loop length |
| 2.9 | 4D Wei-RS coding scheme |
| 2.10 | Wei's 4D 16-state encoder |
| 2.11 | 2-dimensional subsets for Wei's code |
| 2.12 | Wei's 16-state 4D convolutional encoder |
| 2.13 | Decoding Wei's code |
| 2.14 | BER performance of 4D Wei-RS scheme |
| 2.15 | QAM transmitter block diagram |
| 2.16 | DMT transmitter/receiver pair block diagram |
| 3.1 | Turbo encoder |
| 3.2 | Turbo decoder |
| 3.3 | BICM encoder |
| 3.4 | VDSL crosstalk channel model |
| 3.5 | BICM decoder |
| 3.6 | Subset partitions of 16 QAM for two mapping schemes |
| 3.7 | Signal constellation after the feedback |
| 3.8 | Pipe-lined decoding scheme |
| 3.9 | Architecture of ACS unit |
| 3.10 | Performance of BICM with encoder generator (5,7), $v = 2$ |
| 3.11 | Performance of BICM with encoder generator (17,15), $v = 3$ |
| 3.12 | Performance of BICM with encoder generator (27,31), $v = 4$ |
| 3.13 | Performance of BICM with encoder generator (7,5), $N = 4096$ |
| 3.14 | Performance of BICM with encoder generator (7,5), $N = 8192$ |
| 3.15 | Performance of BICM with encoder generator (7,5), $M = 64$ |
| 3.16 | Performance of BICM with encoder generator (7,5), $M = 256$ |
| 4.1 | Effect of frequency as a function of code memory for BICM |
| 4.2 | Effect of frequency as a function of interleaver length for BICM |
| 4.3 | Effect of frequency as a function of modulation for BICM |
| 4.4 | Effect of transmitting power for various code memories for BICM |
| 4.5 | Effect of transmitting power for various interleaver lengths for BICM |
| 4.6 | Effect of transmitting power for various modulation levels for BICM |
| 4.7 | Comparison of power for various codes at $f = 14$ MHz for a data rate of 22 Mbps |
| 4.8 | Effect of numbers of crosstalkers for various code memories |
| 4.9 | Effect of numbers of crosstalkers for various interleaver sizes |
| 4.10 | Effect of numbers of crosstalkers for various levels of modulation |
| 4.11 | Bit rate as a function of loop length for various code memories |
| 4.12 | Bit rate as a function of loop length for various interleaver sizes |
| 4.13 | Bit rate as a function of loop length for various modulation levels |
| 5.1 | Example for soft cancellation |
| 5.2 | Crosstalk system model [12] |
| 5.3 | Interference model |
| 5.4 | Turbo iterative multiuser receiver with SIC |
| 5.5 | BER performance of the iterative decoder |
| 5.6 | Performance comparison of the proposed receiver with and without SIC |
| 5.7 | Convergence curve for the soft interference cancellation technique |
| 5.8 | Achievable data rates with and without SIC |

## LIST OF ABBREVIATIONS

| Abbreviation | Meaning |
|--------------|---------|
| ACS | Add Compare Select |
| ADC | Analog To Digital Converter |
| ADSL | Asymmetrical Digital Subscriber Lines |
| ATM | Asynchronous Transfer Mode |
| AWGN | Additive White Gaussian Noise |
| BER | Bit Error Rate |
| BICM | Bit Interleaved Coded Modulation |
| CC | Code Complexity |
| CLB | Configurable Logic Block |
| CO | Central Office |
| DAC | Digital To Analog Converter |
| De-MUX | Demultiplexer |
| DFT | Discrete Fourier Transform |
| DMT | Discrete Multi-tone Modulation |
| DSL | Digital Subscriber Line |
| FDM | Frequency Division Multiplexing |
| FEC | Forward Error Correction |
| FEQ | Frequency Domain Equalizer |
| FEXT | Far End Crosstalk |
| FPGA | Field Programmable Gate Array |
| FTTC | Fiber To The Curb |
| FTTE | Fiber To The Exchange |
| HDSL | High Bit Rate Digital Subscriber Lines |
| HDTV | High Definition Television |
| home-LAN | Home Local Area Network |
| IDFT | Inverse Discrete Fourier Transform |
| IOB | Input Output Buffer |
| ISDN | Integrated Services Digital Network |
| ISI | Inter Symbol Interference |
| LEx | Local Exchange |
| LLR | Log Likelihood Ratio |
| MAI | Multiple Access Interference |
| MAP | Maximum A Posteriori |
| MSP | Modified Set Partitioning Mapping |
| MUX | Multiplexer |
| NEXT | Near End Crosstalk |
| ONU | Optical Network Unit |
| POTS | Plain Old Telephone Service |
| PSD | Power Spectral Density |
| PSTN | Public Switched Telephone Network |
| QAM | Quadrature Amplitude Modulation |
| RF | Radio Frequency |
| RFI | Radio Frequency Interference |
| RS | Reed Solomon |
| RSC | Recursive Systematic Convolutional |
| SDH | Synchronous Digital Hierarchy |
| SHDSL | Single pair High Bit Rate Digital Subscriber Lines |
| SIC | Soft Interference Canceller |
| SISO | Soft-in Soft-out |
| SNR | Signal To Noise Ratio |
| SONET | Synchronous Optical Network |
| SOVA | Soft Output Viterbi Algorithm |
| TCM | Trellis Coded Modulation |
| TDM | Time Division Multiplexing |
| VDSL | Very High Bit Rate Digital Subscriber Lines |

### LIST OF SYMBOLS

| Symbol | Meaning |
|--------|---------|
| $c$ | Coded bits |
| $d$ | Length of the twisted pair cable |
| $d_{free,eff}$ | Effective free distance of code word |
| $f$ | Frequency of operation |
| $f_i$ | Crosstalk coupling function |
| $g_0(D)$ | Feedback polynomial |
| $g_1(D)$ | Feedforward polynomial |
| $h$ | VDSL channel gain |
| $k_1, k_2$ | Cable parameters |
| $p(r \mid x)$ | Probability density function |
| $r$ | Received noisy symbol |
| $S_t$ | State of the encoder |
| $v$ | Code memory |
| $x$ | Transmitted symbols |
| $z$ | Weight of the code sequence |
| $B$ | Bandwidth of noise |
| $B_{free,eff}$ | Error coefficient related to the code effective free distance |
| $C$ | Channel capacity of twisted pair loop |
| $E$ | Statistical expectation operator |
| $G(D)$ | Generator matrix |
| $H(f)$ | Twisted pair channel transfer function |
| $M$ | Level of modulation |
| $K$ | Number of interferers |
| $N_0$ | Additive white Gaussian noise |
| $N_F(f)$ | FEXT coupling loss |
| $N_N(f)$ | NEXT coupling loss |
| $P$ | Signal power |
| $Pr(b)$ | Probability of symbol |
| $Q(f)$ | Transmit power spectrum density |
| $R$ | Receiver input impedance |
| $S(f)$ | Received signal power spectrum density |
| $S$ | State of the trellis |
| $V$ | Average voltage of noise |
| $W$ | White noise power spectral density |
| $\alpha_t(s',s)$ | Term associated with forward recursion |
| $\beta_t(s',s)$ | Term associated with backward recursion |
| $\gamma_i(s,s)$ | State transition probability |
| $\sigma^2$ | Total power of white noise |
| $\Lambda(c_i)$ | Log-likelihood ratio |
| $\lambda(c_i)$ | Soft metric corresponding to a bit |
| $\chi$ | Signal space of symbols |
| | Decision |

## Chapter 1

## General

There exist many possible solutions to the problem of overloading the public switched telephone network (PSTN) with packetized data and broadband services. Some involve building entirely new systems based on wireless and satellite networks. A more realistic and cost-effective solution, however, is to maximize the reuse of existing analog local loops, or to include provision for some backward compatibility with the existing voice telephony equipment such as analog handsets. Only copper-based solutions satisfy these criteria. Digital subscriber line (DSL) is a copper-based solution that was created to foster a total end-to-end digitization of the PSTN, from user device to user device. Also, the drawbacks of the voice-band modem can be overcome with the DSL technology [1]-[4].
Through a process of bypassing the plain old telephone service (POTS) interface at a local central office (CO), a DSL can utilize the full potential of a copper telephone subscriber loop to deliver a transmission throughput of up to a few hundred times that of a voice-band modem.

Figure 1.1 shows the general structure of a DSL transceiver [5]. The transceiver set consists of an analog part and a digital part. The analog part consists of analog transmit and receive filters, a digital-to-analog converter (DAC), an automatic gain device, and an analog-to-digital converter (ADC). The digital part has three major functions: modulation/demodulation, coding/decoding, and bit packing/unpacking.

Figure 1.1: Block diagram of a typical DSL transceiver.

Integrated services digital network (ISDN) was the first DSL service. New DSL technologies which are more interesting and promising are commonly listed as xDSL, where x represents a number or letter designation. In chronological order, the xDSL technologies are digital subscriber lines, high bit rate digital subscriber lines (HDSL/HDSL2), asymmetrical digital subscriber lines (ADSL), single pair high bit rate digital subscriber lines (SHDSL), and very high bit rate digital subscriber lines (VDSL). The latest member of the xDSL family, VDSL, is capable of providing a data rate of 13 Mbps to 52 Mbps downstream and 1.5 Mbps to 6 Mbps upstream, depending on the actual loop length. The applications supported by VDSL include all those that ADSL was intended for, plus high definition television (HDTV) services.

### 1.1 Basic Concepts of VDSL

VDSL is the latest transmission technology for providing a high speed digital service on twisted pair phone lines, with a range of speeds depending on the actual line length [3]. The basic intention of the VDSL technology is to create a service-independent transmission platform at a much higher transmission throughput than that provided by the ADSL technology.
Therefore, VDSL can be considered a high speed version of ADSL. The higher transmission throughput is achieved by expanding the signal bandwidth to the region of 10 to 30 MHz. At such high frequencies, a usable channel can be realized only on short twisted-pair telephone loops. Because of this short range, it is no longer possible to provide the service simply from the central office. Hybrid structures are required to provide the VDSL service to customers who do not live in direct proximity of a central office.

To operate successfully, the VDSL equipment should overcome the line attenuation, crosstalk, radio-frequency (RF) ingress and other interferences. Of particular importance is the operation on the existing unshielded twisted-pair lines. VDSL modems must also sustain specified data rates over specified distances, suppress RF emissions, and be compatible with the frequency spectra of other services that may be present in the same cable bundle, such as ADSL, ISDN, and HDSL. The high throughput of VDSL makes it easier to achieve compatibility with synchronous optical network (SONET) and asynchronous transfer mode (ATM) based services. VDSL can also be used to interconnect business customers within a concentrated area through leased telephone lines for high-speed intranet use.

#### 1.1.1 VDSL System Architecture

For public telephone networks, there are in general two architectures [6]. In densely populated areas, many customers are within a few kilofeet of the CO or local exchange (LEx). In such cases, VDSL can be deployed directly from the CO or LEx. This configuration is known as fiber-to-the-exchange (FTTE) and is shown in Figure 1.2. When fiber extends deeper into the network, VDSL can be deployed from the optical network unit (ONU) in a configuration known as fiber-to-the-cabinet (FTTC). The FTTC architecture is shown in Figure 1.3. The transmission direction from the CO, LEx or ONU to the customer premises is called downstream.
The direction from the customer to the CO, LEx or ONU is called upstream. In this thesis, we concentrate on the FTTC architecture as it is the most commonly employed.

Figure 1.2: FTTE architecture [3].

Figure 1.3: FTTC architecture [3].

The allowable frequency band for VDSL signals extends from 300 kHz to 30 MHz, as shown in Figure 1.4. The first limit is for long range systems and the second limit is for short range systems. The VDSL channel is separated from the bands used for narrowband services like POTS and ISDN basic rate access (ISDN-BA), thus enabling the service providers to overlay VDSL on the existing services. The ADSL frequency band, however, overlaps with the VDSL signals. Hence, in some circumstances, it may be prudent to place the start of the VDSL band above 1.1 MHz. Normal practice is to locate the downstream channel above the upstream one.

Figure 1.4: VDSL spectrum.

#### 1.1.2 VDSL System Requirements

The standards groups ANSI T1E1.4 and ETSI TM6 have established consistent sets of VDSL system requirements in order to find a common VDSL solution [7]. In the following paragraphs, these requirements are briefly described.

##### a) Data Rates and Ratios

VDSL should consider both symmetric and asymmetric transmissions between the CO and the customer. The downstream speeds range from about 13 Mbps to 55 Mbps, depending on the distance. The upstream data rates start at 1.5 Mbps and end at about 26 Mbps. Downstream data rates derive from the submultiples of the SONET and synchronous digital hierarchy (SDH) canonical speed of 155.52 Mbps. Each rate has a corresponding target range. However, the overall transmission data rate depends on a number of factors such as the loop length, wire gauge, type of cable, presence of bridged taps, and the crosstalk coupled interference. Also, line attenuation tends to increase as the line length increases. Hence, line attenuation must be taken into account during the design stages of a VDSL system.
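
The relation between the quoted data rates and the SONET/SDH canonical rate can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the thesis: the divisors 3, 6 and 12 are assumed here only because they land close to the rates quoted in this chapter, and the function name `submultiple_rates` is hypothetical.

```python
# Canonical SONET/SDH rate quoted in the text, in Mbps.
CANONICAL_MBPS = 155.52


def submultiple_rates(divisors):
    """Return {divisor: rate in Mbps} for integer submultiples of the canonical rate."""
    return {d: round(CANONICAL_MBPS / d, 2) for d in divisors}


# Illustrative divisors (an assumption, not taken from the thesis).
rates = submultiple_rates([3, 6, 12])
for divisor, rate in rates.items():
    print(f"155.52 / {divisor:>2} = {rate:.2f} Mbps")
```

Under these assumed divisors the submultiples are 51.84, 25.92 and 12.96 Mbps, which is consistent with the roughly 13 Mbps lower end of the downstream range and the roughly 26 Mbps upper end of the upstream range mentioned above.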
##### b) Transmit Power and Power Spectral Density

The transmit power spectral density (PSD) describes how the power of an information-bearing signal is distributed in frequency when the signal is applied to the channel at the transmitter output. A transmit PSD mask specifies the maximum allowable transmit PSD, which is by definition a function of frequency. The maximum allowable transmit power specified by both ANSI and ETSI is 11.5 dBm for a VDSL system. Both groups have defined masks specifying the maximum allowable transmit PSD, and require modems to be capable of reducing their transmit PSD to -80 dBm/Hz [7]. The PSD may be selected independently for the downstream and upstream directions of transmission. Regardless of the mask option, the total transmitted power should not exceed 11.5 dBm. As mentioned previously, VDSL modems are deployed from the ONU, which is typically located in a small curbside cabinet with no temperature control mechanisms. Thus, VDSL power consumption must be less than 1.5 W per transceiver, including line drivers.

##### c) Spectral Compatibility

For a VDSL system to be viable, it must be spectrally compatible with other DSL services that may reside in the same cable. The DSL service most vulnerable to interference from VDSL is ADSL. VDSL can harm the performance of ADSL, and vice versa. In the FTTC architecture, VDSL can be detrimental to ADSL performance, since VDSL signals can couple into ADSL signals as near-end crosstalk (NEXT) or far-end crosstalk (FEXT). The effect of VDSL on ADSL performance is quantified in [8]. Simulations in [8] show that if the downstream VDSL transmission is below 1.104 MHz, it adversely affects the ADSL performance in the FTTC configuration. For example, when a 300 meter VDSL line injects FEXT into the ADSL line, the range for a fixed bit rate of 6 Mbps decreases from 3.75 km to only 2.75 km, and for a fixed range of 3.75 km, the bit rate decreases from 6 Mbps to 3.5 Mbps.
To avoid such a substantial degradation in the ADSL performance, VDSL must be restricted from transmitting at -60 dBm/Hz, either upstream or downstream, in the band below 1.104 MHz when ADSL loops reside in the same binder.

### 1.2 VDSL Issues

At present, there is no complete standard defining VDSL; however, the issue of standards is under consideration by both ANSI and ETSI. The most important VDSL question concerns the maximum distance over which a VDSL system can operate reliably at a given data rate, i.e., the last-mile problem. This is a difficult question, since the real line characteristics at the high operating frequencies of a VDSL system are not easy to measure, and items such as short bridged taps or unterminated extension lines in homes may have detrimental effects on VDSL in certain configurations. The disadvantage of the VDSL system compared to other DSL systems is its short copper loops, which shrink the distribution area to a few dozen customers. To make VDSL systems more economical and to increase the distribution area, the design of such a system has to focus on increasing the length of the copper line without a loss in bandwidth efficiency. Ordinarily, increasing the loop length means compromising the bit rate. To ensure reliable transmission of data over longer loops and also to provide the end user with a maximal bit rate at a very low bit error rate (BER), a good forward error correction (FEC) scheme with a high coding gain is required.

The FEC scheme for VDSL systems that is under consideration in the T1.413 proposal is a concatenated coding scheme consisting of an inner trellis code (4-D Wei's code [9]) and an outer Reed-Solomon (RS) code [10]. There are two problems with this approach. First, the system operates far below the channel capacity. Second, the power penalty is higher for multi-dimensional constellations.
Though the 4D Wei-RS scheme has a high spectral efficiency of 6.12 bits/s/Hz, it requires a high signal-to-noise ratio (SNR) of 27 dB to reach the targeted BER. To maintain this high SNR, either the bit rate available to the end customer has to be drastically reduced or the transmitting power has to be increased. Increasing the transmitting power increases the crosstalk, which is detrimental to the VDSL performance in terms of bit rate and loop length. On the other hand, if the bit rate is to be kept constant, the loop length must be decreased drastically in order to maintain the targeted BER. Moreover, because of the concatenation of the 4D Wei code and the RS code, error propagation occurs between the two decoders, thereby degrading the BER performance. Hence, a coding scheme that can achieve the targeted BER at a lower SNR than the 4D Wei-RS scheme can successfully address the last-mile problem.

The second issue is interference due to the existing services associated with the twisted pair lines while providing enough bandwidth to support the required high data rates [11]-[12]. The proposal to use the existing telephone wiring in homes for computer networking (home local area network, home-LAN) avoids laying additional wires in the same premises [11]. Due to the spectral crowding of home-LAN on the twisted pair lines, a severe performance loss occurs in the VDSL services. To provide a better infrastructure for internet services, it is desirable for both the VDSL and home-LAN systems to co-exist on the same twisted pair lines. This can be made possible by advanced receiver architectures and multiuser detection techniques [12]. Cioffi and Zeng proposed a crosstalk cancellation technique in [13], in which the crosstalk is estimated in some frequency bands and cancelled in others. However, the authors considered NEXT to be the major crosstalk, which is not the case in general.
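
The earlier statement that the 4D Wei-RS scheme operates far below the channel capacity can be made concrete with the Shannon bound $\eta = \log_2(1 + \mathrm{SNR})$. The sketch below is an illustration under stated assumptions rather than a result from the thesis: it assumes an ideal band-limited AWGN channel (the VDSL channel is in fact crosstalk-limited), it treats the quoted 27 dB as the channel SNR, and the function name `shannon_min_snr_db` is hypothetical.

```python
import math


def shannon_min_snr_db(bits_per_s_per_hz):
    """Minimum SNR in dB at which an ideal AWGN channel supports the given spectral efficiency."""
    # Invert eta = log2(1 + SNR) to get SNR = 2**eta - 1, then convert to dB.
    return 10.0 * math.log10(2.0 ** bits_per_s_per_hz - 1.0)


SPECTRAL_EFFICIENCY = 6.12  # bits/s/Hz, quoted in the text for the 4D Wei-RS scheme
REQUIRED_SNR_DB = 27.0      # dB, quoted in the text for the targeted BER

limit_db = shannon_min_snr_db(SPECTRAL_EFFICIENCY)
gap_db = REQUIRED_SNR_DB - limit_db

print(f"Shannon-limit SNR at {SPECTRAL_EFFICIENCY} bits/s/Hz: {limit_db:.2f} dB")
print(f"Gap between the required SNR and the limit: {gap_db:.2f} dB")
```

Under these idealized assumptions the limit comes out at roughly 18 dB, so the quoted operating point sits between 8 and 9 dB away from it. That margin is the room a stronger code could, in principle, recover.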
Several authors proposed multiuser detection techniques to identify and cancel the crosstalk via iterative decoding [14], [15]. In [15], a linear soft interference canceller has been proposed to reduce the interference with a small loss of VDSL signal bandwidth. These previously presented receivers for multiuser detection are complex to implement. The complexity involved in multiuser detection can be reduced if iterative decoding methods are used [12]. Hence, a low-complexity soft interference canceller coupled with a good coding scheme is necessary to mitigate the home-LAN interference on VDSL.

From the above discussion, we conclude that a good coding scheme is required to provide a solution for the VDSL issues. The introduction of turbo codes in 1993 by Berrou is perhaps one of the most important contributions to coding theory in this decade [16]. The performance of this coding scheme comes close to the Shannon limit. Hence, it is worth addressing the two VDSL issues discussed above using turbo codes.

### 1.3 Scope and Organization of the Thesis

The objective of this thesis is to address the last-mile and home-LAN interference problems of VDSL systems. Regarding the first problem, we propose a bandwidth efficient turbo coding scheme that is more suitable for VDSL modems. The objective of this scheme is to provide a higher coding gain than that provided by the standard 4D Wei-RS scheme, resulting in an improved performance in terms of the bit rate, loop length and transmitting power. Also, given the large volume of data in VDSL applications, the use of turbo codes can avoid the problem of fixed-length code words. Some design criteria are presented for constructing good constituent codes for VDSL systems. A mapping method which can maximize the inter-signal Euclidean distance is proposed.
The interleaver size, the number of decoder iterations, the code complexity and the level of modulation are the important parameters in determining the coding gain provided by the turbo codes. Hence, the effect of these parameters on the performance of the VDSL modems is explored. The proposed scheme is investigated for different signal frequencies, values of transmitting power, numbers of crosstalkers and loop lengths. Also, since VDSL systems are delay sensitive, a pipelined decoding scheme is proposed to reduce the latency caused by the iterations in the decoder. To compare the hardware complexity, the proposed and 4D Wei-RS schemes are synthesized using the Xilinx synthesizer, and the Xilinx FPGA statistics of the proposed scheme are compared with those of the 4D Wei-RS scheme. A simulation study on the implementation of the proposed scheme is carried out to evaluate the achievable bit rates and loop lengths of the VDSL modems. The results are compared with those of the standard 4D Wei-RS scheme.

To mitigate the effect of home-LAN on VDSL, an iterative soft interference cancellation and decoding technique is proposed. The VDSL and home-LAN signals are jointly detected using a soft interference canceller and a soft-in soft-out demodulator combined with a set of turbo decoders. The soft interference canceller uses the a priori probabilities to perform the soft interference cancellation. The turbo decoder produces a posteriori probabilities, which are fed back to the soft interference canceller as a priori probabilities.

The thesis is organized as follows. In Chapter 2, VDSL line impairments are discussed along with the need for a good coding scheme to improve the VDSL performance in terms of bit rate, loop length and transmitting frequency. A brief overview of the existing 4D Wei-RS scheme is provided.
Two line codes, quadrature amplitude modulation (QAM) and discrete multi-tone modulation (DMT), that are presently under consideration by ANSI and ETSI for VDSL systems are presented and a comparison is provided.

In Chapter 3, the proposed turbo coding scheme for VDSL systems is presented. A complexity analysis is carried out to compare the proposed scheme with the 4D Wei-RS scheme. To compare the hardware complexity, the proposed and the 4D Wei-RS schemes are synthesized using the Xilinx synthesizer. The mapping method used in the proposed scheme is discussed. To reduce the delay generated in the decoder, a pipelined decoding scheme is presented. Simulation results are discussed to evaluate the BER performance of the proposed scheme, and the results are compared with those of the 4D Wei-RS scheme.

In Chapter 4, the performance of VDSL systems employing the proposed scheme is discussed. The effects of the transmitting power, the transmitting frequency and the number of crosstalkers are discussed. The channel capacity and bit rates provided by VDSL systems employing the proposed scheme are evaluated. The results are compared with those of the 4D Wei-RS scheme.

In Chapter 5, an iterative soft interference cancellation and decoding technique is presented to mitigate the effect of home-LAN on VDSL. Simulation results are discussed to evaluate the performance of this technique.

Chapter 6 concludes the thesis by summarizing and highlighting the significant results of this investigation.

## Chapter 2

# VDSL Line Impairments and Impact of Coding in VDSL Systems

## 2.1 Introduction

For a VDSL system, the strength of the received signal is determined by the strength of the signal from the corresponding transmitter and the attenuation of the telephone subscriber loop [17]-[19]. Also, the channel capacity of a telephone subscriber loop is determined by the transmit signal level, the channel attenuation, and the receiver front end noise.
Hence, to design an efficient VDSL system, an analysis has to be performed on the VDSL channel and the transceiver front end noise models. In this chapter, a VDSL channel model is presented and the channel attenuation characteristics are discussed. Various VDSL line impairments are presented along with their models. Using the VDSL channel model and the front end noise models, the SNR of a VDSL signal at the receiver front end is analyzed. Based on this analysis, the impact of a good coding scheme on VDSL systems is explained. We also discuss the existing FEC scheme for VDSL systems and its disadvantages. Finally, two different line codes, QAM and DMT, that are under consideration by ANSI and ETSI to be employed in a VDSL system are presented and a comparison is drawn.

## 2.2 VDSL Channel Model

The twisted pair channel model has been investigated and modeled in [1] and [19]. Since VDSL operates in a high frequency range (f > 1 MHz), the simplified high frequency twisted pair channel transfer function can be formulated as

$$H(d, f) = e^{-d(k_1\sqrt{f} + k_2 f)}, \tag{2.1}$$

where d is the length of the twisted pair cable in miles, f the frequency of operation in Hz, and $k_1$ and $k_2$ the proportionality constants shown in Table 2.1.

| Gauge (AWG) | $k_1$ ($\times 10^{-3}$) | $k_2$ ($\times 10^{-8}$) |
|-------------|--------------------------|--------------------------|
| 22 | 3.0 | 0.035 |
| 24 | 3.8 | -0.541 |
| 26 | 4.8 | -1.709 |

Table 2.1: Parameters for a simplified cable model.

The received signal power spectrum density, S(f), is determined by the transmit power spectrum density, Q(f), and the twisted pair loop channel transfer function, H(f), as shown in the following expression

$$\begin{aligned} S(f) &= Q(f)\,|H(f)|^{2} \\ &= Q(f)\,e^{-2d(k_{1}\sqrt{f}+k_{2}f)} \end{aligned} \tag{2.2}$$

Let us examine the signal attenuation (insertion loss) that can occur in a VDSL channel. In general, signal attenuation increases with an increase in the frequency.
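As a rough numerical illustration of Eqs. (2.1) and (2.2), the following Python sketch evaluates the insertion loss of the simplified cable model using the coefficients of Table 2.1. The function names, the feet-to-miles conversion and the choice of reporting the loss in dB are assumptions made for this example and are not taken from the thesis.

```python
import math

# Coefficients of Table 2.1: gauge (AWG) -> (k1, k2), with the table's
# scale factors (1e-3 for k1, 1e-8 for k2) already applied.
CABLE_PARAMS = {
    22: (3.0e-3, 0.035e-8),
    24: (3.8e-3, -0.541e-8),
    26: (4.8e-3, -1.709e-8),
}

FEET_PER_MILE = 5280.0  # Eq. (2.1) expects the loop length in miles


def channel_gain_sq(d_miles, f_hz, gauge):
    """|H(d, f)|^2 of Eq. (2.1) for a loop of d_miles at frequency f_hz."""
    k1, k2 = CABLE_PARAMS[gauge]
    return math.exp(-2.0 * d_miles * (k1 * math.sqrt(f_hz) + k2 * f_hz))


def attenuation_db(d_feet, f_hz, gauge):
    """Insertion loss in dB (positive number) for a loop given in feet."""
    gain = channel_gain_sq(d_feet / FEET_PER_MILE, f_hz, gauge)
    return -10.0 * math.log10(gain)


def received_psd(q, d_miles, f_hz, gauge):
    """Received PSD S(f) of Eq. (2.2) for a transmit PSD q (linear units)."""
    return q * channel_gain_sq(d_miles, f_hz, gauge)


if __name__ == "__main__":
    for gauge, length_ft in [(26, 3000), (26, 4500), (24, 3000), (24, 4500)]:
        loss = attenuation_db(length_ft, 10e6, gauge)
        print(f"{gauge} AWG, {length_ft} ft, 10 MHz: {loss:.1f} dB")
```

Under these assumptions the loss grows with both frequency and loop length, and the thinner 26 AWG wire attenuates more than the 24 AWG wire of the same length.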
The rate at which the attenuation increases is a function of line length and wire gauge, as shown below

$$|H(f)|^2 = e^{-2d(k_1\sqrt{f} + k_2 f)} \tag{2.3}$$

Figure 2.1: Attenuation as a function of frequency of two 26 AWG (0.4 mm) lines and two 24 AWG (0.5 mm) lines.

Figure 2.1 illustrates the signal attenuation as a function of frequency for four different twisted pair loops: 3000 ft and 4500 ft of 26 AWG (or 0.4 mm) line, and 3000 ft and 4500 ft of 24 AWG (or 0.5 mm) line. The attenuation curves are smooth because the loops are terminated in the appropriate characteristic impedance at both ends. A comparison of the four curves shows the relationship between the signal attenuation, line length, and wire gauge. Signals on longer lines composed of smaller diameter wires are attenuated very rapidly with an increase in the frequency, whereas shorter lines made of larger diameter wires cause a more gentle increase in the attenuation with an increase in the frequency. A single line terminated in an appropriate impedance at both ends is preferred for VDSL transmission since, in this case, the attenuation varies smoothly with frequency. However, many twisted pair lines do not exhibit such smooth attenuation because of bridged tap configurations, in which an unused twisted pair line is connected in shunt to the main cable pair.

## 2.3 VDSL Line Impairments

Besides the limitation of the transceiver hardware noise floor, the other types of noise that affect the performance of a VDSL system are background noise, crosstalk noise, impulse noise, and radio frequency interference (RFI). The severity of a noise is usually measured from its power level or its power density level. The noise power is usually expressed in dBm [5] and is defined as

$$P = 10 \times \log_{10} \frac{v^2}{R \times P_m} = 10 \times \log_{10} \frac{v^2}{100 \times 0.001}, \tag{2.4}$$

where v is the average voltage of the noise, $R = 100\,\Omega$ the receiver input impedance, and $P_m = 0.001$ W the reference power of 1 milliwatt.
The noise power density, usually expressed in units of dBm/Hz, is defined as

$$PSD = 10 \times \log_{10} \frac{v^2}{R \times P_m \times B} = 10 \times \log_{10} \frac{v^2}{0.1 \times B}, \tag{2.5}$$

where B is the bandwidth of the noise in hertz.

### 2.3.1 Background Noise

Background noise in the telephone subscriber loop can be caused by a combination of radio noise and the noise generated by electrical and electronic devices. The probability density of the background noise is very close to the Gaussian distribution. Therefore, background noise can be modelled as Gaussian noise. Based on the results from the Bellcore noise survey, the background noise level in the twisted pair telephone loop has been assumed to be -140 dBm/Hz [20].

### 2.3.2 Crosstalk Noise

Due to the capacitive and inductive coupling within the binder groups, there is crosstalk between the twisted pairs even though the pairs are well insulated. As a result, a local receiver can detect signals transmitted on the other lines, thus increasing the noise power and degrading the received signal quality on that line. For VDSL systems, crosstalk can become a limiting factor for the achievable throughput. There are two different types of crosstalk that can occur in a VDSL loop: Near End Crosstalk (NEXT) and Far End Crosstalk (FEXT) [21]. To adjust the interference level when the number of interferers differs from 49, we use a de-rating factor of $6 \times \log_{10}(n)$. This is referred to as the 6-dB noise margin.

#### 2.3.2.1 Near End Crosstalk

NEXT, as shown in Figure 2.2, occurs when a local receiver detects signals transmitted on the other lines by one or more local transmitters. The level of NEXT detected at a local receiver is primarily dependent on the number of interferers, their proximity to the line of interest, their relative power, the spectral shapes of the interfering signals, and the frequency band over which NEXT occurs.

Figure 2.2: Illustration of near-end crosstalk (NEXT).
In general, NEXT coupling between adjacent lines in a cable is worse than NEXT between lines spaced further apart. Also, NEXT worsens if the transmit power on the interfering lines is increased. The simplified Unger NEXT model generalized for K disturbers [5] is given by

$$N_N = \left(\frac{K}{49}\right)^{0.6} \frac{1}{1.134 \times 10^{13}} f^{\frac{3}{2}} \tag{2.6}$$

Figure 2.3 shows the NEXT coupling loss for 1 and 49 disturbers with a 6 dB noise margin. From the figure we can observe that the loss difference between 1 disturber and 49 disturbers is about 10 dB. However, NEXT can be eliminated by employing the frequency division multiplexing (FDM) approach, in which one-directional transmission is adopted in each frequency band among all the telephone subscriber loops. Hence, in this work the effect of NEXT is neglected.

Figure 2.3: NEXT coupling with 6 dB noise margin.

#### 2.3.2.2 Far End Crosstalk

FEXT, as shown in Figure 2.4, occurs when a local receiver detects signals transmitted in its frequency band by one or more remote transmitters. As in the case of NEXT, the level of FEXT is dependent on the number of interferers, their proximity to the line of interest, their relative power, the spectral shapes of the interfering signals, and the frequency band over which FEXT occurs. A simplified Unger FEXT model generalized to K disturbers [5] can be expressed as

$$N_F = \left(\frac{K}{49}\right)^{0.6} k\, d\, f^2\, |H(f)|^2 \tag{2.7}$$

where $k = 8 \times 10^{-20}$, d is the loop length in feet, f the frequency in Hz, and H(f) the transfer function of the loop. Figure 2.5 shows the FEXT coupling loss due to 49 disturbers for 1500 ft, 4500 ft, and 7500 ft loop lengths. From the graph, it is observed that FEXT decreases with an increase in the loop length, because the FEXT signals are gradually attenuated as the loop length increases. This is not the case with NEXT, which is independent of line length. For this reason, FEXT is a minor impairment in longer loops.
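The two coupling models of Eqs. (2.6) and (2.7) can be checked numerically with the short sketch below. It assumes a 24 AWG loop (coefficients from Table 2.1) and hypothetical function names; note that Eq. (2.7) takes the loop length in feet while the channel model of Eq. (2.1) takes it in miles, so the sketch converts between the two.

```python
import math

# 24 AWG coefficients from Table 2.1 (an assumption; any gauge could be used).
K1, K2 = 3.8e-3, -0.541e-8
FEET_PER_MILE = 5280.0
FEXT_CONSTANT = 8e-20  # k in Eq. (2.7)


def channel_gain_sq(d_feet, f_hz):
    """|H(f)|^2 from Eq. (2.3); the cable model expects length in miles."""
    d_miles = d_feet / FEET_PER_MILE
    return math.exp(-2.0 * d_miles * (K1 * math.sqrt(f_hz) + K2 * f_hz))


def next_coupling(f_hz, disturbers=49):
    """Simplified Unger NEXT coupling of Eq. (2.6), linear scale."""
    return (disturbers / 49.0) ** 0.6 * f_hz ** 1.5 / 1.134e13


def fext_coupling(d_feet, f_hz, disturbers=49):
    """Simplified Unger FEXT coupling of Eq. (2.7), linear scale."""
    return ((disturbers / 49.0) ** 0.6 * FEXT_CONSTANT * d_feet
            * f_hz ** 2 * channel_gain_sq(d_feet, f_hz))


def to_db(x):
    return 10.0 * math.log10(x)


if __name__ == "__main__":
    gap = to_db(next_coupling(1e6, 49)) - to_db(next_coupling(1e6, 1))
    print(f"NEXT, 49 vs 1 disturbers: {gap:.2f} dB")
    for length_ft in (1500, 4500, 7500):
        level = to_db(fext_coupling(length_ft, 10e6))
        print(f"FEXT coupling at 10 MHz, {length_ft} ft: {level:.1f} dB")
```

The $(K/49)^{0.6}$ factor alone accounts for the roughly 10 dB gap between 1 and 49 disturbers, and at 10 MHz the FEXT coupling in this model falls as the loop gets longer, in line with the discussion above.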
As VDSL employs shorter loops, the effect of FEXT is severe. On longer loops, the line attenuation is severe enough to counteract the $f^2$ contribution to the FEXT coupling expression. Thus, the coupling curves decrease rapidly beyond some frequency.

Figure 2.4: Illustration of far-end crosstalk.

Figure 2.5: FEXT coupling loss with 6 dB noise margin for N=49.

### 2.3.3 Impulse Noise

Impulse noise is a short-duration, high-power burst of energy that can temporarily overwhelm information bearing signals. Impulse noise can be caused by electronic and electro-mechanical devices, and by lightning [22]. To mitigate the impulse noise, a well-designed FEC scheme can be used together with data interleaving. The interleaver rearranges the order of the coded bytes so that the bytes corrupted by an impulse are spread out in time once they are de-interleaved. Also, the interleaver reduces the time over which a single impulse harms the signal.

### 2.3.4 Radio Frequency Interference

Radio frequency interference appears at the receivers when over-the-air signals in overlapping frequency bands couple into phone lines. Overhead distribution cables and wires within homes are particularly susceptible to interference from AM radio signals. AM interferers appear in the VDSL frequency spectrum as high-level noise spikes in the band between 525 kHz and 1.61 MHz [6]. However, radio frequency interference can be successfully eliminated by relying on QAM-based adaptive equalization techniques [23].

## 2.4 Impact of Coding

In this section, we discuss the need for a good coding scheme to improve the VDSL performance in terms of bit rate, loop length and transmitting power. The disadvantage of a VDSL system compared to other DSL systems is its short copper loops, which make the distribution area shrink to a few dozen customers.
To make VDSL systems more economical and to increase the distribution area, the design of such a system has to focus on increasing the length of the copper line without a loss in the bandwidth efficiency.

To explain the impact of coding, we assume that FEXT and background noise are the major line impairments. The FEXT noise power spectral density is determined by the transmit power spectral density and the FEXT coupling transfer function as

$$N_F = Q(f)\, |H(f)|^2\, k\, d\, f^2 \tag{2.8}$$

By including the background noise power density, the received SNR becomes

$$\frac{S(f)}{N_F(f) + W} = \frac{1}{kdf^2 + \frac{\sigma^2}{P}e^{2d(k_1\sqrt{f} + k_2f)}}, \tag{2.9}$$

where W is the white noise power density with total power $\sigma^2$, and P the total power of the transmit signal. As discussed in Section 1.1.2.2, the maximum allowable transmit signal PSD is -60 dBm/Hz. Hence, we set the value of P at an acceptable level of -70 dBm/Hz. The differential channel capacity for a VDSL system can be expressed as

$$\frac{dC}{dB} = \log_2\left(1 + \frac{S(f)}{N_F(f) + W}\right) = \log_2\left(1 + \frac{1}{kdf^2 + \frac{\sigma^2}{P}e^{2d(k_1\sqrt{f} + k_2f)}}\right) \tag{2.10}$$

Figure 2.6: Spectral efficiency as a function of loop length.

From Figure 2.6, we can observe the trade-off between the spectral efficiency of a VDSL system and the loop length. Due to this, a user located near the CO receives data at a high bit rate, whereas a user far from the CO receives data at a low bit rate.

In order to explain the necessity for a coding scheme to improve the VDSL performance, we examine the SNR of the received VDSL signal at the receiver. Figure 2.7 shows the SNR as a function of frequency under the FEXT plus white noise condition for a 24-gauge twisted-pair loop with a background noise level of -140 dBm/Hz. From this graph we can see the compromise between the SNR, the loop length, and the frequency of operation.
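A minimal sketch of Eqs. (2.9) and (2.10) is given below for a 24-gauge loop, using the values quoted above (transmit PSD of -70 dBm/Hz, background noise of -140 dBm/Hz). The function names are hypothetical, the ratio $\sigma^2/P$ is taken here as the ratio of the two power densities, and 49 FEXT disturbers are assumed so that the $(K/49)^{0.6}$ factor equals one; none of these choices is prescribed by the thesis.

```python
import math

K1, K2 = 3.8e-3, -0.541e-8   # 24 AWG coefficients from Table 2.1
FEET_PER_MILE = 5280.0
FEXT_CONSTANT = 8e-20        # k in Eq. (2.7), loop length in feet


def snr_linear(d_feet, f_hz, tx_dbm_hz=-70.0, noise_dbm_hz=-140.0):
    """Received SNR of Eq. (2.9) under FEXT plus white background noise."""
    d_miles = d_feet / FEET_PER_MILE
    fext_term = FEXT_CONSTANT * d_feet * f_hz ** 2
    noise_to_signal = 10.0 ** ((noise_dbm_hz - tx_dbm_hz) / 10.0)
    line_loss = math.exp(2.0 * d_miles * (K1 * math.sqrt(f_hz) + K2 * f_hz))
    return 1.0 / (fext_term + noise_to_signal * line_loss)


def snr_db(d_feet, f_hz):
    return 10.0 * math.log10(snr_linear(d_feet, f_hz))


def spectral_efficiency(d_feet, f_hz):
    """Differential capacity dC/dB of Eq. (2.10), in bits/s/Hz."""
    return math.log2(1.0 + snr_linear(d_feet, f_hz))


if __name__ == "__main__":
    for length_ft in (500, 1000, 2000, 3000):
        print(f"{length_ft} ft at 14 MHz: SNR {snr_db(length_ft, 14e6):.1f} dB, "
              f"{spectral_efficiency(length_ft, 14e6):.2f} bits/s/Hz")
```

Both terms in the denominator of Eq. (2.9) grow with the loop length, so in this sketch the SNR and the spectral efficiency fall monotonically as the loop gets longer.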
For a fixed frequency of operation, if the targeted BER can be achieved for a lower SNR, the loop length can be increased. On the other hand, for a fixed loop length, if the targeted BER can be achieved for a lower SNR, the frequency of operation can be increased thereby increasing the bit rate and reducing the line attenuation (refer to Figure 2.1).

Figure 2.7: Signal-to-self-FEXT plus white noise ratio for 24-gauge twisted-pair loop.

Figure 2.8 shows the SNR as a function of loop length for a frequency of operation of 14 MHz. From this figure we can conclude that the lower the required SNR, the longer the loop length that can be realized. Due to the trade-off between the SNR and the bit rate in VDSL loops, a user located near the CO receives data at a high bit rate, whereas a user located far from the CO receives data at a low bit rate. From Figure 2.8, we can observe that the received SNR decreases as the loop length increases, because the signal attenuates more with an increase in the loop length. As the signal attenuation increases, the BER performance of the VDSL system degrades. Hence, if VDSL systems can achieve the targeted BER for a lower SNR, the loop length can be increased. These requirements suggest the necessity of employing a bandwidth- and power-efficient coding scheme in VDSL loops to provide the end user with a maximal bit rate.

Figure 2.8: Signal-to-self-FEXT plus white noise ratio for 24-gauge twisted-pair loop as a function of loop length.

## 2.5 Overview of the Existing Coding Scheme

In this section, a brief overview of the existing coding scheme for VDSL systems is presented. The existing coding scheme for VDSL systems employs a 4D Wei code as the inner code and a Reed-Solomon code as the outer code, as shown in Figure 2.9.

Figure 2.9: 4D Wei-RS coding scheme.

### 2.5.1 Reed-Solomon Coding

Reed-Solomon (RS) codes are cyclic block codes that perform forward error control by using redundancy bits.
The data is partitioned into symbols of m bits and each symbol is processed as one unit by the encoder and the decoder. RS codes are described as (n,k) block codes, where n is the coded data block length and k is the uncoded data block length. The extra (n-k) symbols are called the parity check symbols. The RS code satisfies $n \leq 2^m - 1$ and $n - k \geq 2t$, where t is the number of correctable symbol errors. Under the assumption that the errors are independently distributed, the symbol error rate can be estimated by

$$P_e = \sum_{i=t+1}^{n} \binom{n-1}{i-1} P^i (1-P)^{n-i}, \tag{2.11}$$

where P is the symbol error probability at the decoder input. In a VDSL system, the number of data symbols and the size of the code word vary depending on the VDSL data frame structure. The VDSL RS codes operate in the Galois field GF($2^8$). A popular RS code used in VDSL is RS(255,223) with 8-bit symbols. For this code, each codeword contains 255 bytes, of which 223 bytes are data symbols and 32 bytes are redundant parity symbols [10].

The interleaver is used to rearrange the coded data such that the locations of the errors look random and are distributed over many code words rather than a few code words. A periodic interleaver of depth m reads m code words of length n each and arranges them in a block with m rows and n columns. This block is then read column-wise. In the deinterleaver, the data are rearranged back to their original order. When an erroneous decision is made in the trellis decoder, it takes some sub-symbols to return to the correct trellis path. This makes interleaving useful in trellis coded modulation (TCM) systems, where error bursts occur.

### 2.5.2 16 State 4D Wei Coder

Trellis-coded modulation is an optional coding and modulation scheme for VDSL to either meet the performance requirement for longer loops or increase the transmission throughput under a certain performance margin. In VDSL, TCM uses Wei's 16-state four-dimensional (4D) convolutional encoder.
Wei's 16-state 4D code has a theoretical coding gain of 4.5 dB. Figure 2.12 shows the trellis encoder structure in VDSL. A separate figure shows Wei's 16-state 4D convolutional encoding circuit [9]. The trellis encoder takes a set of bits $u = \{u_1, u_2, ...\}$ as its input. Because of the 4-dimensional nature of the encoder, each word u is encoded into two binary symbols v and w, which are modulated into two constellation points. The encoding process can be summarized as follows.

1) Encode $(u_1, u_2)$ using Wei's 16-state 4D rate 2/3 convolutional encoder to produce $(u_0, u_1, u_2)$, in which $(u_1, u_2)$ are unchanged and $u_0 = S_0$.
2) $(u_0, u_1, u_2)$ is used to select one of the eight 4D subsets.
3) The subset is mapped to two indices that determine the least significant bits (LSBs) of v and w. The mapping method is

$$\begin{aligned} v_0 &= u_3 \\ v_1 &= u_1 \oplus u_3 \\ w_0 &= u_2 \oplus u_3 \\ w_1 &= u_0 \oplus u_1 \oplus u_2 \oplus u_3 \end{aligned} \tag{2.12}$$

4) The remaining bits of u are directly mapped to the most significant bits (MSBs) of v and w.

Figure 2.10: Wei's 4D 16 State encoder.

The subset in Wei's code is the union of two Cartesian products of two 2D subsets. Figure 2.11 shows the 2D subsets used by Wei's 4D code. The numbers in the figure represent the 2D subset indices, which are in fact the decimal values of the two LSBs of v and w.

The TCM decoder takes a pair of constellation points as its input. Soft-decision Viterbi decoding is used to decode the 4D code. The output of the decoder is an estimated sequence of received constellation points. After that, the QAM decoder converts the constellation points into a set of bits. Figure 2.13 shows the Viterbi decoding process. The metric used in the decoding is the Euclidean distance. The 4D metric can be obtained by adding the two 2D subset metrics for the pair of 2D subsets corresponding to that 4D subset. Figure 2.14 shows the BER performance of the 4D Wei-RS scheme.
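Returning to step 3 of the encoding process, the bit conversion of Eq. (2.12) can be sketched in a few lines of Python. The function name is hypothetical, the convolutional encoder that produces $u_0$ is not modelled (its output bit is simply passed in), and the index convention (decimal value of the two LSBs) follows the description of Figure 2.11.

```python
from itertools import product


def wei_bit_convert(u0, u1, u2, u3):
    """Apply Eq. (2.12): return the 2D subset indices of v and w.

    u0 is the convolutional encoder output and u1..u3 are input bits.
    Each index is the decimal value of the two LSBs, e.g. 2*v1 + v0.
    """
    v0 = u3
    v1 = u1 ^ u3
    w0 = u2 ^ u3
    w1 = u0 ^ u1 ^ u2 ^ u3
    return 2 * v1 + v0, 2 * w1 + w0


if __name__ == "__main__":
    for bits in product((0, 1), repeat=4):
        print(bits, "->", wei_bit_convert(*bits))
```

Because the conversion is an invertible linear map over GF(2), the 16 possible values of $(u_0, u_1, u_2, u_3)$ produce 16 distinct pairs of 2D subset indices.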
The asymptotic coding gain of the 4D Wei code is approximately 4-4.5 dB. The drawback of this scheme is that it requires an SNR higher than 27 dB to reach the targeted BER of $10^{-5}$, though it provides a high spectral efficiency of 6.12 bits/s/Hz. In order to maintain this high SNR, the bit rate available to the end user has to be drastically reduced. On the other hand, if a coding scheme can be devised to provide a high coding gain at the expense of some loss in the spectral efficiency, most of the transmitting power can be utilized to achieve a high signal frequency in order to provide a high data rate.

Figure 2.11: 2-dimensional subsets for Wei's code.

Figure 2.12: Wei's 16-state 4D convolutional encoder.

Figure 2.13: Decoding Wei's code.

Figure 2.14: BER Performance of 4D Wei-RS scheme.

## 2.6 VDSL Line Codes

In this section, we briefly discuss two line codes, QAM and DMT, which are under consideration to be employed in a VDSL system, and make a comparison to determine which one can best realize VDSL's potential.

### 2.6.1 CAP/QAM

The CAP/QAM proposal is associated with frequency division multiplexing (FDM) for the upstream and downstream channels. Figure 2.15 [6] shows the general structure of a QAM VDSL transmitter. A bit stream is encoded into symbols by mapping consecutive sets of b bits, where b is generally less than 8, into constellation points. These constellation points are then modulated, filtered, and transmitted within some predetermined channel bandwidth.

Figure 2.15: QAM transmitter block diagram.

### 2.6.2 DMT

DMT is a multi-carrier modulation technique in which the channel is partitioned into a set of orthogonal, independent sub-channels, each of which supports a distinct carrier. Figure 2.16 [6] shows the general transceiver structure of a DMT system. The source bits are encoded into a set of QAM sub-symbols, each of which represents a number of bits determined by the SNR.
The set of sub-symbols is then input as a block to a complex-to-real inverse discrete Fourier transform (IDFT). Following the IDFT, a cyclic prefix is prepended to the output samples to mitigate inter-symbol interference (ISI). The resulting time domain samples are converted from digital to analog format and applied to the channel. At the receiver, after analog-to-digital conversion, the cyclic prefix is stripped, and the noisy samples are transformed back to the frequency domain by a discrete Fourier transform (DFT). Each output value is then scaled by a single complex number to compensate for the magnitude and phase of its sub-channel's frequency response, an operation performed by a frequency-domain equalizer (FEQ). After the FEQ, a memoryless detector decodes the sub-symbols. The DMT proposal is associated with time division multiplexing (TDM).

Figure 2.16: DMT Transmitter/Receiver pair Block Diagram.

### 2.6.3 Comparison of QAM and DMT

Here, a comparison of the QAM and DMT techniques is presented in order to determine which technology can best realize VDSL's potential [24], [25]. The QAM approach relies on time-domain processing of the signal, thus taking into account the serial and analog nature of the signal on the wire. On the other hand, the DMT approach focuses on frequency-domain processing of the signal. It requires a time-to-frequency and serial-to-parallel data stream conversion at each end of the line.

Another important consideration is power. The area of a QAM chip will be about 14 mm$^2$ and its power consumption about 110 mW, whereas the area of a DMT chip will be about 50 mm$^2$ with a power consumption of 138 mW.

Considering the ingress and egress problems, adaptive equalization algorithms in QAM can cope with incoming narrow band disturbers to mitigate ingress, and programmable notch filters can be employed to mitigate egress. The same approach used for QAM is also used for DMT to address the ingress and egress problems.
But spurious, transient ingress interference, such as RFI, is a more significant problem for DMT. This is because, in this case, the noise frequency has to be detected, communicated between the transmitter and the receiver, and then compensated by the DSP circuitry. During the detection and compensation time, the data traffic being processed must be thrown away. On the other hand, QAM-based solutions rely on adaptive equalization techniques in which no detection and communication between the transmitter and the receiver is needed. In summary, QAM technology fits well with the VDSL requirements and the POTS environment.

## 2.7 Conclusion

In this chapter, VDSL line impairments and the necessity of a good coding scheme to improve the VDSL performance have been discussed. A VDSL channel model has been presented, and it has been shown that signals on longer lines composed of smaller diameter wires are attenuated rapidly, whereas shorter lines made of larger diameter wires cause a more gentle increase in the attenuation. The NEXT and FEXT models have been analyzed. The analysis has shown that FEXT is more detrimental to VDSL than NEXT. To explain the disadvantage of the present VDSL systems, the trade-off between the spectral efficiency and the loop length of a VDSL system has been observed. In order to determine the impact of coding, the trade-off between the received SNR, the loop length, and the transmitting frequency has been analyzed, and it has been concluded that, if the targeted BER can be achieved for a lower SNR, either the loop length or the transmitting frequency can be increased. The analysis has shown the necessity of a higher coding gain for the FEC scheme in VDSL systems to provide the end user with a maximal bit rate. The existing FEC scheme (a concatenated 4D Wei-RS scheme) for VDSL systems and its disadvantages have been discussed. The drawback of this scheme is that it is very complex to implement and requires a high SNR of 27 dB to achieve the targeted BER of $10^{-5}$.
Hence, a good FEC scheme that can achieve the targeted BER for a lower SNR and is less complex than the 4D Wei-RS scheme is required. Finally, the two line codes, QAM and DMT, that are under consideration by ANSI and ETSI to be implemented in a VDSL system have been discussed, and it has been concluded that QAM has an advantage over DMT and can be successfully implemented in a VDSL system.

## Chapter 3

# A Bandwidth Efficient Turbo Coding Scheme for VDSL

## 3.1 Introduction

The most important VDSL question concerns the maximum distance over which a VDSL system can operate reliably for a given data rate, i.e., the last-mile problem. In order to provide a solution to the last-mile problem, the design of the VDSL system has to focus on increasing the length of the copper line without a loss in the bandwidth efficiency. As discussed in Chapter 2, to ensure a reliable transmission of data over longer loops and also to provide the end user with a maximal bit rate, a good FEC scheme with a high coding gain is required. Since VDSL generally has to deal with large multimedia applications, the FEC scheme should have a low complexity in order to reduce the decoding delay, and it should also be bandwidth efficient.

The introduction of turbo codes in 1993 by Berrou is perhaps one of the most important contributions in coding theory in this decade [16]. The performance of this coding scheme approaches the Shannon limit. However, conventional turbo codes are low-rate codes. Turbo codes can be combined with bandwidth efficient modulation to develop power efficient coding techniques without sacrificing the bandwidth efficiency. Various approaches have been used to increase the spectral efficiency of turbo codes [26]-[31].

This chapter is organized as follows. In Section 3.2, the basic concepts of turbo codes are explained. In Section 3.3, a bandwidth-efficient turbo coding scheme that is more suitable for VDSL modems is proposed [32].
The objective of the proposed scheme is to provide a higher coding gain than that of the 4D Wei-RS scheme, resulting in an improved performance in terms of bit rate, loop length and transmitting power. Some design criteria are presented for constructing good turbo codes for VDSL systems. In Section 3.4, a pipelined decoding scheme is proposed in order to reduce the decoding delay at the receiver end. In Section 3.5, an analysis is carried out to compare the complexity of the proposed scheme with that of the existing scheme. To compare the hardware complexity, the proposed and the 4D Wei-RS schemes are synthesized using the Xilinx synthesizer, and the FPGA statistics of the proposed scheme are compared with those of the 4D Wei-RS scheme. In Section 3.6, simulation results are presented to evaluate the performance of the proposed scheme, and the results are compared with those of the 4D Wei-RS scheme.

## 3.2 Basic Concepts of Turbo Codes

The turbo codes originally proposed consist of two parallel recursive systematic convolutional (RSC) encoders separated by an interleaver and use an iterative soft-input soft-output (SISO) decoder. In this section, we explain the general idea of turbo codes.

### 3.2.1 Turbo Encoder

A turbo encoder, as shown in Figure 3.1, is formed by two RSC encoders with memory v (so that the number of states is $2^v$), linked through an interleaver of length N. The role of the interleaver is to maximize the minimum weight of the turbo code word at the output of the constituent encoders.

Figure 3.1: Turbo encoder.

The codeword is formed by adding the parity check bits generated by the first and second encoders to the input information bits. The weight of the corresponding code word is given by $d = w + z_1 + z_2$, where w is the weight of the information sequence, and $z_1$ and $z_2$ are the weights of the first and second parity check sequences, respectively.
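The parallel concatenation described above can be illustrated with a small Python sketch. The constituent code used here, a memory-2 RSC encoder with feedback polynomial $1 + D + D^2$ and forward polynomial $1 + D^2$, is only an assumption chosen for the example and is not the constituent code designed in this thesis; the function names and the explicit permutation argument are likewise hypothetical, and trellis termination and puncturing are left out.

```python
def rsc_parity(bits, feedback=(1, 1, 1), forward=(1, 0, 1)):
    """Parity sequence of a rate-1/2 RSC encoder.

    Polynomials are tap tuples (g_0, ..., g_v); the defaults describe the
    illustrative code with feedback 1 + D + D^2 and forward 1 + D^2.
    """
    memory = len(feedback) - 1
    state = [0] * memory                  # state[0] is the newest register
    parity = []
    for b in bits:
        a = b                             # recursive (feedback) bit
        for i in range(1, memory + 1):
            a ^= feedback[i] & state[i - 1]
        p = forward[0] & a                # parity from the forward taps
        for i in range(1, memory + 1):
            p ^= forward[i] & state[i - 1]
        parity.append(p)
        state = [a] + state[:-1]
    return parity


def turbo_encode(bits, permutation):
    """Return (systematic, parity1, parity2) for a given interleaver."""
    interleaved = [bits[i] for i in permutation]
    return list(bits), rsc_parity(bits), rsc_parity(interleaved)


def codeword_weight(bits, permutation):
    """d = w + z1 + z2, the Hamming weight of the turbo code word."""
    systematic, parity1, parity2 = turbo_encode(bits, permutation)
    return sum(systematic) + sum(parity1) + sum(parity2)


if __name__ == "__main__":
    info = [1, 0, 0, 1, 0, 0]             # weight-2 input, 1 + D^3
    identity = list(range(len(info)))     # trivial interleaver for the demo
    print(turbo_encode(info, identity), codeword_weight(info, identity))
```

With the trivial identity interleaver, the weight-2 input $1 + D^3$ drives both encoders back to the zero state and gives a low-weight code word. A well-chosen interleaver is meant to break up such patterns, so that a sequence yielding a low-weight parity in one encoder yields a high-weight parity in the other.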
The effective free distance $d_{free,eff}$ of the codeword [33] is given by

$$d_{free,eff} = 2 + 2z_{min}, \tag{3.1}$$

where $z_{min}$ is the minimum weight of the parity check sequence generated by a constituent encoder for a weight-2 information sequence. For an RSC code with a rate of 1/n and memory v, the upper bound for $z_{min}$ [33] is given by

$$z_{min} \le (n-1)(2^{v}+2) \tag{3.2}$$

The component codes and their rates are not necessarily the same. If the code rates of the two component encoders are denoted by $R_1$ and $R_2$, the overall turbo code rate R can be determined by $\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2} - 1$ [34]. The overall code rate can be increased by suitably puncturing the original code. These punctured codes have a simpler trellis structure than the corresponding non-punctured codes.

### 3.2.2 Turbo Decoder

A simple suboptimal iterative algorithm, in which two soft-in soft-out (SISO) decoders are used in an iterative manner, is employed in the decoder, as shown in Figure 3.2. The *maximum a posteriori* (MAP) criterion is used to provide a soft output. The Log-MAP algorithm minimizes the bit error probability by computing the log-likelihood ratio (LLR) of the bit $b_t$ [16], [35], conditioned on the received sequence r, as given by

$$\Lambda(b_t) = \log \frac{\Pr(b_t = 1 \mid r)}{\Pr(b_t = 0 \mid r)} \tag{3.3}$$

Figure 3.2: Turbo decoder.

The first MAP decoder produces an estimate of the *a priori* probabilities of the information sequence for the second MAP decoder. The second decoder also produces a soft output, which is used to improve the estimate of the *a priori* probabilities of the information sequence at the input of the first MAP decoder. After a certain number of iterations, the soft outputs of both MAP decoders cease to provide further improvements in the performance. Finally, the last stage of decoding makes the hard decision after deinterleaving.

## 3.3 Bit Interleaved Turbo Coded Modulation

In this bit interleaved coded modulation (BICM) design, a single binary turbo code of rate 1/R is used.
The RSC component code has a rate of $1/n_0$. The information bit $b$ is first encoded by the turbo encoder to form the code bits $c = \{c^0, c^1, c^2\}$, where $c^0$ is the first coded bit, which is equal to the information bit $b$, $c^1$ is the second coded bit, produced by the first RSC encoder, and $c^2$ is the third coded bit, produced by the second RSC encoder. The encoder outputs are suitably multiplexed and punctured to obtain $\tilde{m}$ parity symbols and $m-\tilde{m}$ information symbols as shown in Figure 3.3. These encoded symbols are mapped into an M-QAM signal set consisting of $2^m$ points. A set $\{c_i^j\}$ $(i=1,\ldots,m;\ j=0,1,2)$ of $m$ bits is mapped into a complex signal symbol $x$ to be transmitted over the channel. Each symbol $x$ is represented by a pair of real-valued symbols $\{x_I, x_Q\}$. The spectral efficiency of this scheme is $(m-\tilde{m})$ bits/s/Hz.

The conventional turbo decoder [29] for this approach treats the demodulation and the decoding process as two separate entities, thereby degrading the performance with respect to the coding gain. The performance loss can be avoided by treating the demodulation and decoding processes as two stages of a single process. This is done through the so-called "turbo principle", in which "soft" information is exchanged between the demodulation and decoding operations in an iterative manner.

To implement this joint demodulation and decoding of the VDSL signal, some assumptions are made with respect to the signal itself. A VDSL signal can be expressed as

$$r = h \cdot x + \sum_{i=2}^{n} f_i \cdot \eta_i + N_0, \tag{3.4}$$

where $r$ is the received symbol, $x$ is the transmitted symbol, $h$ is the VDSL channel gain, $\eta_i$ is the $i$th crosstalk signal, $f_i$ is the corresponding crosstalk coupling function and $N_0$ is the additive white Gaussian noise (refer to Figure 3.4).
Clearly, $r$ is a scaled version of $x$ contaminated by multiple access interference (MAI) and the channel noise, since we estimate each signal separately. Assuming that no knowledge of the other signals is available, we can treat the MAI as an extra "noise" source. As the MAI is independent of the channel noise, we can combine the MAI and the channel noise and model the combined noise by another Gaussian distribution with its variance given by

$$\sigma_N^2 = \sigma_{MAI}^2 + \sigma_{AWGN}^2. \tag{3.5}$$

Figure 3.5 shows the proposed turbo structure for the joint demodulation and decoding process. Before giving the detailed analysis of this structure, we briefly describe its operation. The received noisy symbols $\{r\}$ are demapped and the log-likelihood ratio associated with each received bit is calculated. The log-likelihood ratio consists of *a priori* and extrinsic information. As the *a priori* information is not available to the demodulator during the first iteration, an equal-likelihood assumption is made on the received symbols. The extrinsic information obtained from the demodulation stage is then demultiplexed and sent as *a priori* information to the binary turbo decoder. In turn, the binary turbo decoder computes the *a posteriori* LLR of each code bit and then excludes the influence of its *a priori* information to obtain the extrinsic information of the decoding stage. This extrinsic information from the decoding stage is again suitably multiplexed and fed back to the demodulator as *a priori* information for the next iteration to improve the estimate of the received symbols. The operations carried out by the demodulator and decoder are repeated in an iterative manner. After the final iteration, the decoding stage makes hard decisions on its *a posteriori* LLRs of the information bits.
The receiver is thus expected to provide an improved performance with this iterative scheme as compared with the scheme in which the demodulation and decoding operations are two separate and distinct operations.

Figure 3.3: BICM encoder.

Figure 3.4: VDSL crosstalk channel model.

Figure 3.5: BICM decoder.

### 3.3.1 Demodulation and Bit LLR Computation

The log-likelihood ratio associated with each bit [14] can be calculated as

$$\begin{aligned}
\Lambda_1(c_i^j) &= \log \frac{P(c_i^j = 1|r)}{P(c_i^j = 0|r)} \\
&= \log \frac{p(r|c_i^j = 1)}{p(r|c_i^j = 0)} + \log \frac{P(c_i^j = 1)}{P(c_i^j = 0)} \\
&= \lambda_1(c_i^j) + \lambda_2^p(c_i^j), \quad i = 1, \ldots, m,
\end{aligned} \tag{3.6}$$

where $\lambda_1(c_i^j)$ denotes the soft metric corresponding to $c_i^j$ delivered by the demodulation stage, and $\lambda_2^p(c_i^j)$ is the *a priori* LLR delivered from the decoding stage in the previous iteration. The soft metric $\lambda_1(c_i^j)$ is the extrinsic information delivered by the demodulation stage, which is then deinterleaved and demultiplexed and sent to the channel decoder for further processing. For the first iteration, all the bits are assumed to be equally probable, and hence the *a priori* term $\lambda_2^p(c_i^j)$ is set to zero. The soft metric can be evaluated as

$$\begin{aligned}
\lambda_1(c_i^j) &= \log \frac{p(r|c_i^j = 1)}{p(r|c_i^j = 0)} \\
&= \log \frac{p(r, c_i^j = 1)\,P(c_i^j = 0)}{p(r, c_i^j = 0)\,P(c_i^j = 1)}.
\end{aligned} \tag{3.7}$$

By expressing the numerator and denominator of the operand in (3.7) as summations over the $M$ symbols of $m$ bits each, we can rewrite the equation as

$$\begin{aligned}
\lambda_1(c_i^j) &= \log \frac{\sum_{x \in \chi(c_i^j = 1)} p(r, x)\,P(c_i^j = 0)}{\sum_{x \in \chi(c_i^j = 0)} p(r, x)\,P(c_i^j = 1)} \\
&= \log \frac{\sum_{x \in \chi(c_i^j = 1)} p(r|x)\,P(x)\,P(c_i^j = 0)}{\sum_{x \in \chi(c_i^j = 0)} p(r|x)\,P(x)\,P(c_i^j = 1)},
\end{aligned} \tag{3.8}$$

where $\chi$ represents the signal space of $M$ symbols. Depending on the value of the specific bit $c_i^j$, the signal space is divided into two sets, $\chi(c_i^j = 1)$ and $\chi(c_i^j = 0)$.
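The partition of the signal space in (3.8) can be sketched numerically for the first iteration, in which all symbols are equally likely and the prior terms cancel. The sketch below assumes a 16-QAM constellation on the levels $\{-3,-1,1,3\}$ with a per-axis Gray labelling and a Gaussian likelihood with the combined noise variance $\sigma_N^2$ of (3.5); the labelling, the function names and the numerical values are illustrative assumptions, not the mapping or parameters used in the thesis.

```python
import math


def qam16_points():
    """Hypothetical labelling: 4-bit label -> complex point.
    The two most significant bits select the I level, the other two the Q level."""
    levels = {0b00: -3, 0b01: -1, 0b11: 1, 0b10: 3}   # Gray code per axis
    return {k: complex(levels[k >> 2], levels[k & 0b11]) for k in range(16)}


def bit_llrs(r, points, sigma2, m=4):
    """First-iteration soft metrics: for each bit position, sum the likelihoods
    p(r|x) over the subset with bit = 1 and over the subset with bit = 0."""
    llrs = []
    for i in range(m):
        num = den = 0.0
        for label, x in points.items():
            likelihood = math.exp(-abs(r - x) ** 2 / (2.0 * sigma2))
            if (label >> (m - 1 - i)) & 1:
                num += likelihood      # x belongs to chi(c_i = 1)
            else:
                den += likelihood      # x belongs to chi(c_i = 0)
        llrs.append(math.log(num / den))
    return llrs


points = qam16_points()
llrs = bit_llrs(3 + 3j, points, sigma2=0.5)   # noise-free point with label 1010
print([round(v, 2) for v in llrs])
```

A positive value favours bit 1 and a negative value favours bit 0, so the signs of the four metrics recover the label of the transmitted point when the noise is small.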
From the one-to-one correspondence between $x$ and $\{c_i^j\}$, $P(x)$ in (3.8) can be replaced by $\prod_{i=1}^m P(c_i^j)$. Therefore, (3.8) can be rewritten as

$$\lambda_1(c_i^j) = \log \frac{\sum_{x \in \chi(c_i^j = 1)} p(r|x) \prod_{k \neq i} P(c_k^j)}{\sum_{x \in \chi(c_i^j = 0)} p(r|x) \prod_{k \neq i} P(c_k^j)}, \tag{3.9}$$

where, in each term of the sums, $c_k^j$ takes the value of the $k$th bit in the label of $x$. The conditional probability density function $p(r|x)$ is the likelihood function for the transmitted signal, and it is calculated as

$$\begin{aligned}
p(r|x) &= p(r_I, r_Q|x_I, x_Q) \\
&= \frac{1}{\sqrt{2\pi}\sigma_N}\exp\left[-\frac{|r-x|^2}{2\sigma_N^2}\right] \\
&= \frac{1}{\sqrt{2\pi}\sigma_N} \exp\left[-\frac{(r_I - x_I)^2 + (r_Q - x_Q)^2}{2\sigma_N^2}\right],
\end{aligned} \tag{3.10}$$

where $r_I$ and $r_Q$ are the noisy versions of $x_I$ and $x_Q$, respectively, and $\sigma_N$ is given by (3.5). Equation (3.10) indicates that the soft information delivered by the demodulation stage depends on the Euclidean distance between the received symbol $r$ and all the symbols $x$ for which $c_i^j = 1$ or $c_i^j = 0$. $P(c_i^j)$ depends on the extrinsic information $\lambda_2^p(c_i^j)$ delivered by the decoding stage, and can be expressed as

$$\begin{aligned}
P(c_i^j) &= \begin{cases} \dfrac{e^{\lambda_2^{p}(c_i^{j})}}{1 + e^{\lambda_2^{p}(c_i^{j})}} & \text{for } c_i^{j} = 1 \\[2ex] \dfrac{1}{1 + e^{\lambda_2^{p}(c_i^{j})}} & \text{for } c_i^{j} = 0 \end{cases} \\
&= \frac{1}{2}\left[1+c_{i}^{j}\tanh\left(\frac{1}{2}\lambda_{2}^{p}\left(c_{i}^{j}\right)\right)\right],
\end{aligned} \tag{3.11}$$

where, in the last expression, $c_i^j$ is taken in its bipolar form ($+1$ for bit 1 and $-1$ for bit 0).

### 3.3.2 Binary Turbo Decoder

The turbo decoder consists of two MAP decoders, DEC1 and DEC2, serially concatenated by an interleaver, and it operates in an iterative manner. Each MAP decoder is modified to produce the *a posteriori* LLRs of both the coded and uncoded bits. The first MAP decoder receives the soft information from the demodulation stage and produces a soft output, which is interleaved and used to produce an improved estimate of the *a priori* probabilities for the second MAP decoder.
The extrinsic information of the second MAP decoder can be used as the estimate of the *a priori* probabilities for the first decoder. After a certain number of iterations, the uncoded bit probabilities are taken out of the second MAP decoder and hard decisions are performed on the information bits. The coded bit probabilities from the outputs of the first and second MAP decoders and the uncoded bit probabilities from the output of the second decoder are suitably multiplexed and sent back as the *a priori* information to the demodulation stage.

Here, we outline a procedure for computing the LLRs of the information and code bits, which is a modification of the algorithm presented in [16]. The RSC component code has a rate $1/n_0$ and an overall constraint length $v$. The state of the trellis at time $t$ can be represented by a $(v-1)$-tuple as $S_t = (s_t^{(1)}, \ldots, s_t^{(v-1)})$. We denote the input information bit that causes the state transition from $S_{t-1} = s'$ to $S_t = s$ by $b(s', s)$ and the corresponding output code by $c(s', s)$. Suppose that the encoder starts in state $S_0 = 0$ and that an information bit stream $\{b_t\}_{t=1}^N$ is input to the RSC encoder, followed by $v$ blocks of all-zero inputs, causing the encoder to end in state $S_\tau = 0$, where $\tau = N + v$. Since we process one bit at a time in the encoder, we drop the index $i$ for simplicity. Let $c_t^j$, $j \in \{0,1,2\}$, denote the output of the RSC encoder at time $t$.
The *a posteriori* LLR of the coded bits obtained from DEC1 is given by

$$\begin{aligned}
\Lambda(c_t^1) &= \log \frac{P(c_t^1 = 1|obs)}{P(c_t^1 = 0|obs)} \\
&= \log \frac{\sum_{(s',s) \in S^1} \alpha_{t-1}(s')\, \gamma_t(s', s)\, \beta_t(s)}{\sum_{(s',s) \in S^0} \alpha_{t-1}(s')\, \gamma_t(s', s)\, \beta_t(s)},
\end{aligned} \tag{3.12}$$

where $obs$ is the observation provided by the soft information from the demodulation stage, $S^1$ is the set of state pairs $(s', s)$ at time $t$ such that the coded bit is 1, $S^0$ is the corresponding set of state pairs such that the coded bit is 0, and $\alpha_t(s)$, $\beta_t(s)$ and $\gamma_t(s', s)$ are defined as

$$\alpha_t(s) = \sum_{s'} \alpha_{t-1}(s')\, \gamma_t(s', s), \quad t = 1, 2, \ldots, \tau, \tag{3.13}$$

with the boundary conditions $\alpha_0(0) = 1$ and $\alpha_0(s) = 0$ for $s \neq 0$, and $\tau$ denoting the length of the input sequence including the terminating zeros,

$$\beta_t(s) = \sum_{s'} \beta_{t+1}(s')\, \gamma_{t+1}(s, s'), \quad t = \tau - 1, \tau - 2, \ldots, 0, \tag{3.14}$$

with the boundary conditions $\beta_{\tau}(0) = 1$ and $\beta_{\tau}(s) = 0$ for $s \neq 0$, and

$$\gamma_t(s', s) = P(S_t = s | S_{t-1} = s') = P(c_t^0(s', s))\, P(c_t^1(s', s)) \quad \text{for DEC1}, \tag{3.15}$$

$$\gamma_t(s', s) = P(S_t = s | S_{t-1} = s') = P(c_t^0(s', s))\, P(c_t^2(s', s)) \quad \text{for DEC2}. \tag{3.16}$$

The code bit distribution $P(c_t^j(s',s))$ can be calculated from (3.11) as

$$P(c_t^0(s', s)) = \frac{1}{2} \left[ 1 + c_t^0(s', s) \tanh\left(\frac{1}{2}\lambda_1^p(c_t^0)\right) \right] \quad \text{for DEC1}, \tag{3.17}$$

$$P(c_t^0(s', s)) = \frac{1}{2} \left[ 1 + c_t^0(s', s) \tanh\left(\frac{1}{2}\tilde{\lambda}_1^{p}(c_t^0)\right) \right] \quad \text{for DEC2}, \tag{3.18}$$

$$P(c_t^j(s',s)) = \frac{1}{2} \left[ 1 + c_t^j(s',s) \tanh\left(\frac{1}{2}\lambda_1^p(c_t^j)\right) \right], \quad j = 1,2, \tag{3.19}$$

where $\lambda_1^p(c_t^j)$ is the prior information provided by the demodulation stage, $\tilde{\lambda}_1^p(c_t^0)$ is its interleaved version for the systematic bit, and the code bit $c_t^j(s',s)$ is taken in its bipolar form ($+1$ for bit 1 and $-1$ for bit 0).
Thus, (3.12) can be rewritten as

$$\begin{aligned}
\Lambda(c_t^1) &= \log \frac{\sum_{(s',s) \in S^1} \alpha_{t-1}(s')\,\beta_t(s)\,P(c_t^0(s',s))\,P(c_t^1=1)}{\sum_{(s',s) \in S^0} \alpha_{t-1}(s')\,\beta_t(s)\,P(c_t^0(s',s))\,P(c_t^1=0)} \\
&= \log \frac{\sum_{(s',s) \in S^1} \alpha_{t-1}(s')\,\beta_{t}(s)\,P(c_{t}^{0}(s',s))}{\sum_{(s',s) \in S^0} \alpha_{t-1}(s')\,\beta_{t}(s)\,P(c_{t}^{0}(s',s))} + \log \frac{P(c_{t}^{1}=1)}{P(c_{t}^{1}=0)} \\
&= \lambda_{2}(c_{t}^{1}) + \lambda_{1}^{p}(c_{t}^{1}).
\end{aligned} \tag{3.20}$$

It is seen from the above equation that the output of DEC1 is the sum of the *a priori* information $\lambda_1^p(c_t^1)$ provided by the demodulation stage and the extrinsic information $\lambda_2(c_t^1)$. The extrinsic information is the information about the code bit $c_t^1$ gleaned from the prior information about the other code bits based on the trellis structure of the code. Following the same procedure as for the coded bits, the LLR of the uncoded bits can be expressed as

$$\Lambda(b_t) = \log \frac{\sum_{(s',s):\, b_t = 1} \alpha_{t-1}(s')\, \beta_t(s)\, \gamma_t(s', s)}{\sum_{(s',s):\, b_t = 0} \alpha_{t-1}(s')\, \beta_t(s)\, \gamma_t(s', s)}, \tag{3.21}$$

where the sums run over the state pairs $(s', s)$ whose transition is caused by the information bit $b_t = 1$ and $b_t = 0$, respectively. Since the first coded bit is equal to the systematic bit, $\Lambda(b_t) = \Lambda(c_t^0)$. Then, only the extrinsic information of $\Lambda(b_t)$ (the information not received from the other decoder, DEC2) is interleaved and sent to DEC2, where it is used as the *a priori* probability. After receiving the soft information about $b_t$ from DEC1, DEC2 uses it to evaluate $\Lambda(b_t)$ and $\Lambda(c_t^2)$.
The LLRs of the coded and uncoded bits for DEC2 follow the same steps as those for DEC1 and are expressed as

$$\begin{aligned}
\Lambda(c_t^2) &= \log \frac{\sum_{(s',s) \in S^1} \alpha_{t-1}(s')\,\beta_t(s)\,P(c_t^0(s',s))}{\sum_{(s',s) \in S^0} \alpha_{t-1}(s')\,\beta_t(s)\,P(c_t^0(s',s))} + \log \frac{P(c_t^2=1)}{P(c_t^2=0)} \\
&= \lambda_2(c_t^2) + \lambda_1^p(c_t^2),
\end{aligned} \tag{3.22}$$

$$\begin{aligned}
\Lambda(b_t) = \Lambda(c_t^0) &= \log \frac{\sum_{(s',s):\, b_t = 1} \alpha_{t-1}(s')\, \beta_t(s)\, \gamma_t(s', s)}{\sum_{(s',s):\, b_t = 0} \alpha_{t-1}(s')\, \beta_t(s)\, \gamma_t(s', s)} \\
&= \lambda_2(c_t^0) + \lambda_1^p(c_t^0),
\end{aligned} \tag{3.23}$$

where, for DEC2, $S^1$ and $S^0$ are the sets of state pairs for which the coded bit $c_t^2$ is 1 and 0, respectively. The LLRs of the coded bits, $\Lambda(c_t^1)$ and $\Lambda(c_t^2)$, and the deinterleaved version of $\Lambda(c_t^0)$ from DEC2 computed by the turbo decoder are then suitably multiplexed to form $\Lambda_2(c_t)$. $\Lambda_2(c_t)$ is used to form the extrinsic information for the demodulation stage by excluding the *a priori* knowledge as shown in Figure 3.5, and it can be written as

$$\lambda_2(c_t) = \Lambda_2(c_t) - \lambda_1^p(c_t). \tag{3.24}$$

Equations (3.12) to (3.16), corresponding to DEC1, can be simplified by using the MAX-LOG-MAP algorithm, in which all the operations are carried out in the logarithmic domain [35].
By replacing $\alpha_t$ with $\overline{\alpha}_t = \log(\alpha_t)$, $\beta_t$ with $\overline{\beta}_t = \log(\beta_t)$, and $\gamma_t$ with $\overline{\gamma}_t = \log(\gamma_t)$, (3.13), (3.14), (3.15), and (3.16) can be respectively rewritten as

$$\overline{\alpha}_{t}(s) = \log \left( \sum_{s'} e^{\overline{\alpha}_{t-1}(s') + \overline{\gamma}_{t}(s',s)} \right), \quad \overline{\alpha}_{0}(0) = 0 \ \text{and} \ \overline{\alpha}_{0}(s) = -\infty \ \text{for} \ s \neq 0, \tag{3.25}$$

$$\overline{\beta}_{t}(s) = \log \left( \sum_{s'} e^{\overline{\beta}_{t+1}(s') + \overline{\gamma}_{t+1}(s,s')} \right), \quad \overline{\beta}_{\tau}(0) = 0 \ \text{and} \ \overline{\beta}_{\tau}(s) = -\infty \ \text{for} \ s \neq 0, \tag{3.26}$$

$$\overline{\gamma}_{t}(s', s) = \log\left(P\left(c_{t}^{0}\right)\right) + \log\left(P\left(c_{t}^{1}\right)\right) \quad \text{for DEC1}, \tag{3.27}$$

$$\overline{\gamma}_t(s', s) = \log\left(P\left(c_t^0\right)\right) + \log\left(P\left(c_t^2\right)\right) \quad \text{for DEC2}. \tag{3.28}$$

By using (3.25)-(3.28), (3.12) can be modified as

$$\Lambda(c_t^1) = \log \frac{\sum_{(s',s) \in S^1} e^{\overline{\alpha}_{t-1}(s') + \overline{\beta}_t(s) + \overline{\gamma}_t(s', s)}}{\sum_{(s',s) \in S^0} e^{\overline{\alpha}_{t-1}(s') + \overline{\beta}_t(s) + \overline{\gamma}_t(s', s)}}. \tag{3.29}$$

By using the approximation

$$\log(e^{\delta_1} + e^{\delta_2} + \dots + e^{\delta_n}) \approx \max_{i \in \{1, 2, \dots, n\}} \delta_i, \tag{3.30}$$

(3.25), (3.26), and (3.29) can be simplified respectively as

$$\overline{\alpha}_{t}(s) = \max_{s'}\left(\overline{\alpha}_{t-1}(s') + \overline{\gamma}_{t}(s', s)\right), \tag{3.31}$$

$$\overline{\beta}_{t}(s) = \max_{s'} \left(\overline{\beta}_{t+1}(s') + \overline{\gamma}_{t+1}(s, s')\right), \tag{3.32}$$

$$\Lambda(c_t^1) = \max_{(s',s) \in S^1} \left(\overline{\alpha}_{t-1}(s') + \overline{\beta}_t(s) + \overline{\gamma}_t(s', s)\right) - \max_{(s',s) \in S^0} \left(\overline{\alpha}_{t-1}(s') + \overline{\beta}_t(s) + \overline{\gamma}_t(s', s)\right). \tag{3.33}$$

The MAX-LOG-MAP algorithm used for DEC1 can also be applied to DEC2.
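The effect of the approximation (3.30) on the forward recursion can be seen in a small sketch. It runs the exact log-domain recursion (3.25) and the MAX-LOG-MAP recursion (3.31) side by side on a toy two-state trellis; the trellis, the branch metric values and the function names are hypothetical and chosen only for illustration, and every state is assumed to have at least one incoming branch.

```python
import math

NEG_INF = float("-inf")


def log_sum_exp(values):
    """Exact log(sum(exp(v))) as used in (3.25), computed in a stable way."""
    peak = max(values)
    if peak == NEG_INF:
        return NEG_INF
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def forward_metrics(log_gammas, n_states, combine):
    """Forward state metrics in the log domain.

    log_gammas: one dict per trellis step, mapping (s_prev, s) to the log
    branch metric. combine is log_sum_exp for (3.25) or max for (3.31)."""
    alpha = [0.0] + [NEG_INF] * (n_states - 1)     # encoder starts in state 0
    history = [alpha]
    for log_gamma in log_gammas:
        alpha = [
            combine([history[-1][sp] + g
                     for (sp, s_next), g in log_gamma.items() if s_next == s])
            for s in range(n_states)
        ]
        history.append(alpha)
    return history


# Toy fully connected two-state trellis with made-up log branch metrics.
log_gamma = {(0, 0): -0.1, (0, 1): -2.0, (1, 0): -1.5, (1, 1): -0.3}
steps = [log_gamma] * 3
exact = forward_metrics(steps, 2, log_sum_exp)
approx = forward_metrics(steps, 2, max)
for t, (e, a) in enumerate(zip(exact, approx)):
    print(t, [round(v, 4) for v in e], [round(v, 4) for v in a])
```

The max-based metrics never exceed the exact ones, and the gap is small when one path dominates, which is the situation at high SNR. In exchange, the recursion needs only additions and comparisons.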
### 3.3.3 Design Criteria for Constituent Codes

We now present some design criteria for constructing good constituent codes for VDSL applications. Since the turbo codes should be spectrally efficient, we choose the rate of the turbo codes to be 1/3. The coding gain achieved by the turbo coding scheme is determined by two factors, the code complexity (CC) and the interleaver gain. The coding gain yielded by increasing the CC is rather large compared to that achieved by increasing the interleaver size ($N$) [33], [36]. Thus, the size of the interleaver should be kept small. The constituent codes should be designed to perform well in the high SNR range, as the targeted BER is $10^{-5}$ or lower. At high SNR, the performance of the code is determined by the effective free distance of the code $(d_{free,eff})$ when the interleaver size $N$ is much larger than the code memory $v$. Hence, maximizing the effective free distance can be used as the design criterion for constructing good turbo codes at high SNR. Maximizing $d_{free,eff}$ is equivalent to maximizing $z_{min}$ of the component RSC codes (refer to (3.1)). This design objective can be achieved if a primitive feedback polynomial is used in the RSC component codes, because it maximizes the minimal length of the output sequences for an input weight of $w=2$. For a rate 1/2 RSC code with memory $v$, the generator matrix is given by

$$G(D) = \left[1, \frac{g_1(D)}{g_0(D)}\right], \tag{3.34}$$

where $g_0(D)$ is the primitive polynomial of degree $v$, and $g_1(D)$ is the feed-forward polynomial. The bound on $z_{min}$ follows from (3.2) with $n = 2$ as

$$z_{min} \le 2^{v-1} + 2. \tag{3.35}$$

Let $1 + D^I$ be the shortest input sequence of weight 2 that generates a finite-length code sequence. Then, the code parity check sequence is given by

$$(1+D^I) \cdot \frac{g_1(D)}{g_0(D)}. \tag{3.36}$$

Since $g_1(D)$ and $g_0(D)$ are relatively prime, the input sequence $1+D^I$ must be a multiple of $g_0(D)$ and periodic with a period of $I$.
Increasing the period $I$ will increase the length of the shortest code sequence with input weight 2. Intuitively, this will result in increasing the weight of the code sequence. For a polynomial $g_0(D)$ with degree $v$, any polynomial divisible by $g_0(D)$ is periodic with period $I \leq 2^v - 1$. The maximum is $2^v - 1$, which is obtained when $g_0(D)$ is a primitive polynomial. The corresponding encoder is generated by a maximal length linear feedback shift register of degree $v$. In this case, the parity check sequence weight depends only on the primitive feedback polynomial and is independent of $g_1(D)$. The code search procedure [33] can be summarized as follows:

1. Choose $g_0(D)$ to be a primitive polynomial of degree $v$.
2. Choose $g_1(D)$ to be a polynomial of degree $v$, where $g_0(D)$ and $g_1(D)$ are relatively prime.
3. Evaluate the average bit error probability bound of the candidate turbo code for a given interleaver size.
4. From all the candidate codes, choose the one with the lowest BER in the desired range of SNRs.

In steps 1 and 2, the candidate code has a maximum $z_{min} = 2^{v-1} + 2$, and thus the maximum $d_{free,eff} = 2^v + 6$. Then, the best code is chosen from all the candidate codes. The BER can be expressed as

$$P_b(e) \approx B_{free,eff}\,Q\left(\sqrt{2\,d_{free,eff}\,R\,\frac{E_b}{N_0}}\right), \tag{3.37}$$

where $B_{free,eff}$ is the error coefficient related to the code effective free distance. For a fixed $d_{free,eff}$, optimizing the BER implies minimization of the error coefficient. Thus, steps 3 and 4 can be replaced by the following steps:

3. Evaluate the error coefficient $B_{free,eff}$ of the candidate code.
4. From all the candidate codes, choose the one with the minimum $B_{free,eff}$.

Based on the search performed on the turbo codes using the above procedure, we select simple rate 1/3 codes as shown in Table 3.1 and implement them in a VDSL system.
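The role of the primitive feedback polynomial can be checked numerically. The sketch below uses hypothetical helper names and represents a polynomial over GF(2) as a Python integer whose bit $k$ is the coefficient of $D^k$ (an assumption about how the octal entries are read; reading them in the reversed order gives the same periods and weights). It finds the shortest weight-2 input $1 + D^I$ divisible by $g_0(D)$, computes the weight $z_{min}$ of the parity sequence in (3.36), and evaluates (3.1) for the generator pairs of Table 3.1.

```python
def gf2_mul(a, b):
    """Multiply two GF(2) polynomials stored as integer bit masks."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_divmod(a, b):
    """Return (quotient, remainder) of a / b over GF(2)."""
    quotient = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        quotient ^= 1 << shift
        a ^= b << shift
    return quotient, a


def weight2_parity(g0, g1):
    """Period I of the shortest input 1 + D^I divisible by g0, and the weight
    of the resulting parity sequence (1 + D^I) * g1 / g0."""
    period = 1
    while gf2_divmod((1 << period) | 1, g0)[1] != 0:
        period += 1
    quotient, _ = gf2_divmod((1 << period) | 1, g0)
    z_min = bin(gf2_mul(quotient, g1)).count("1")
    return period, z_min


# Memory v -> (g0, g1), octal values as listed in Table 3.1.
codes = {1: (0o3, 0o2), 2: (0o7, 0o5), 3: (0o15, 0o17),
         4: (0o31, 0o27), 5: (0o51, 0o67)}
table = {}
for v, (g0, g1) in codes.items():
    period, z_min = weight2_parity(g0, g1)
    table[v] = (period, z_min, 2 + 2 * z_min)     # (I, z_min, d_free,eff)
    print(v, period, z_min, 2 + 2 * z_min)
```

Under these assumptions the periods come out as $2^v - 1$, and the computed values of $d_{free,eff}$ (4, 10, 14, 22 and 38) agree with the last column of Table 3.1.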
In Table 3.1, the generator polynomials $g_0(D)$ and $g_1(D)$ are expressed in octal. The codes not only achieve a high coding gain but also ensure the simplicity of the decoding.

| $v$ | $g_0(D)$ | $g_1(D)$ | $d_{free,eff}$ |
|---|----------|----------|----------------|
| 1 | 3 | 2 | 4 |
| 2 | 7 | 5 | 10 |
| 3 | 15 | 17 | 14 |
| 4 | 31 | 27 | 22 |
| 5 | 51 | 67 | 38 |

Table 3.1: Best rate 1/3 turbo codes.

The bit interleaved coded modulation approach is simple. By modifying the puncturing information and the signal constellation, it is possible to obtain a large family of turbo coded modulation schemes. By making the demodulator and decoder operate in an iterative manner, the performance loss due to the demodulation stage can be avoided to a large extent. This scheme is well suited for VDSL systems in the sense that all the components are independent and can be applied to a wide range of signal constellations and code rates.

### 3.3.4 Signal Mapping

Signal mapping is a crucial part of designing a BICM scheme. An efficient mapping can improve the performance of the BICM scheme. Gray mapping is considered to be an efficient mapping method. However, Gray mapping is not a preferred choice in iterative decoding, because most of the binary signal sets resulting from ideal feedback have the same inter-signal Euclidean distance as the original constellation. Hence, if we can increase the inter-signal Euclidean distance among the signals in the constellation, we can draw more advantage from the iterative decoding. Here, we propose a mapping method called "modified set partitioning (MSP) mapping", as shown in Figure 3.6, which can maximize the inter-signal Euclidean distance [37]. As mentioned in Section 3.3, a set $\{c_i^j\}$ $(i=1,\ldots,m;\ j=0,1,2)$ of $m$ bits is mapped into a complex signal symbol $x$ from a constellation $\chi$ to be transmitted over the channel. To explain the advantage of MSP, let us take the simple case of 16-QAM.
Figure 3.6 illustrates the subset partitioning for each of the four bit positions of the 16-QAM constellation. The shaded regions (only shown inside the unit square) correspond to the decision regions for each bit in $\chi(c_i^j=1)$, while the unshaded regions correspond to $\chi(c_i^j=0)$.

Figure 3.6: Subset partitions of 16-QAM for two mapping schemes (panel b: modified set partitioning mapping).

It is clear that both the Gray and MSP mapping methods have the same minimum Euclidean distance between the subsets $\chi(c_i^j=1)$ and $\chi(c_i^j=0)$, but different numbers of nearest neighbors. At the second pass of decoding, given ideal feedback of all the other bits, the constellation of bit 1 is confined to a pair of points as shown in Figure 3.7 (the figure also illustrates the increase in the minimum Euclidean distance between the subsets).

Figure 3.7: Signal constellation after the feedback.

Therefore, as far as bit 1 is concerned, the 16-QAM constellation is translated to a binary channel with a constellation selected (by the three feedback bits) from the eight possible signal pairs. To optimize the decoding performance from the second pass onwards, one must maximize the Euclidean distance between the two points of all the pairs. However, the overall decoding performance also depends on the first-pass performance and on the robustness of the mapping method to feedback errors. Gray mapping is not the preferred choice because most of the binary signal sets resulting from ideal feedback have the same inter-signal Euclidean distance as the original 16-QAM constellation. In fact, Gray mapping can yield the best performance without feedback. However, its performance gain with feedback is very small. A compromise between optimizing the first-pass decoding performance and maximizing the improvement provided by the iterative decoding leads to MSP. Hence, MSP mapping is a better option for BICM than Gray mapping.
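The statement about Gray mapping can be checked with a short sketch. It assumes a 16-QAM constellation on the levels $\{-3,-1,1,3\}$ with a per-axis Gray labelling (an illustrative labelling chosen here, not necessarily the one drawn in Figure 3.6) and, for every bit position and every setting of the other three bits, measures the distance between the two candidate points that remain after ideal feedback. The MSP labelling is not reproduced, since it is defined only in Figure 3.6; the function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Per-axis Gray code: adjacent amplitude levels differ in exactly one bit.
GRAY_AXIS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}


def gray_16qam(bits):
    """Map 4 bits to a point: bits 0-1 select the I level, bits 2-3 the Q level."""
    return complex(GRAY_AXIS[bits[0], bits[1]], GRAY_AXIS[bits[2], bits[3]])


def feedback_pair_distances(mapper, m=4):
    """Histogram of distances between the two points that differ only in bit i,
    taken over all bit positions i and all values of the other m - 1 bits."""
    histogram = Counter()
    for i in range(m):
        for others in product((0, 1), repeat=m - 1):
            with_zero = others[:i] + (0,) + others[i:]
            with_one = others[:i] + (1,) + others[i:]
            distance = abs(mapper(with_zero) - mapper(with_one))
            histogram[round(distance, 6)] += 1
    return histogram


histogram = feedback_pair_distances(gray_16qam)
print(dict(histogram))
```

With this labelling, 24 of the 32 signal pairs remain at the minimum distance 2 of the original constellation and only 8 pairs are separated by 6, which is consistent with the small iterative gain attributed to Gray mapping above.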
## 3.4 Pipelined Decoding Scheme

As VDSL systems can deal with huge multimedia applications, timing becomes one of the main concerns in applying turbo codes to VDSL systems. Due to the iterations to be carried out in the turbo decoder, a significant latency is introduced in the output of the receiver. However, without the decoding iterations, the performance of the turbo decoder is not good enough, since in this case the decoders would not be able to share information. This latency can be reduced by implementing a pipelined decoding scheme as shown in Figure 3.8. In this scheme, each iteration is carried out by a separate decoder module instead of one decoder module performing all the iterations.

Figure 3.8: Pipelined decoding scheme.

Systematic data is passed to the subsequent modules after a delay equal to the latency of each module (i.e., the latency of each turbo decoder). The input to each module is the systematic information and the extrinsic information delivered by the previous module. Each module consists of a demodulator and a BICM decoder. If a turbo decoder takes $n$ iterations to decode the input code word, then with this pipelined decoding scheme the latency is $n$ iterations for the first code word and 1 iteration for each subsequent code word.

## 3.5 Complexity Analysis

In this section, we compare the complexity of the proposed BICM scheme with that of the 4D Wei-RS scheme. In order to compute the state metric, the MAX-LOG-MAP algorithm in the BICM scheme considers only two paths per step: the best path with bit zero and the best path with bit one. On the other hand, the 4D Wei-RS scheme uses the soft output Viterbi algorithm (SOVA) [9]. The SOVA also considers two paths: the maximum likelihood path, and the best path with the symbol at time $t$ complementary to that of the maximum likelihood path.
The most intensive calculation in the MAX-LOG-MAP algorithm and the SOVA is the computation of the bit metric, which involves the computation of the forward, backward and branch metrics. The other modules, corresponding to the interleaving, deinterleaving and hard decision operations, are much less complex than the bit metric computation. Hence, in this section, in order to compare the complexity of the BICM scheme and the 4D Wei-RS scheme, the bit metric computation module is considered.

| Code | N | M | Add | Mul | Max |
|-------------|------|-----|-----|-----|-----|
| BICM(7,5) | 2048 | 16 | 40 | 16 | 14 |
| BICM(7,5) | 2048 | 64 | 40 | 16 | 14 |
| BICM(7,5) | 2048 | 256 | 40 | 16 | 14 |
| BICM(7,5) | 4096 | 16 | 40 | 16 | 14 |
| BICM(7,5) | 8192 | 16 | 40 | 16 | 14 |
| BICM(15,17) | 2048 | 16 | 72 | 32 | 30 |
| BICM(31,27) | 2048 | 16 | 136 | 64 | 62 |
| 4D Wei-RS | | | 153 | 76 | 35 |

Table 3.2: Decoder complexity.

### 3.5.1 Arithmetic Complexity

In this section, we compare the arithmetic complexity involved in the metric computation of the proposed scheme with that of the 4D Wei-RS scheme. Table 3.2 shows the corresponding statistics. From this table we can observe that the complexity increases approximately by a factor of two when the number of states is doubled, that is, when the code memory is increased from 2 to 3 or from 3 to 4. The complexity of the BICM scheme with code memory 2 is almost one quarter of that of the 4D Wei-RS scheme, since the former has a smaller number of states. Since the 4D Wei decoder has to determine the point in each of the multidimensional subsets which is closest to the received point [9], the complexity of the 4D Wei-RS scheme is slightly higher than that of the proposed scheme with code memory 4, though both schemes have the same number of states.

### 3.5.2 Hardware Complexity

In this section, we compare the hardware complexity of the proposed BICM scheme with that of the 4D Wei-RS scheme.
The hardware complexity is compared in terms of the number of clock cycles required to perform the decoding operation, the maximum storage requirements involved in the decoding process, and the Xilinx FPGA statistics. The computationally intensive module in the BICM scheme is the MAP decoding block, and in the 4D Wei-RS scheme it is the Viterbi decoding block. Thus, in order to compare the hardware complexity, we consider specifically these two blocks. These blocks are designed using VHDL and then synthesized using Synopsys. The target technology is a Xilinx 4010e-3 field programmable gate array (FPGA).

First, we describe the hardware implementation of the computationally intensive module. The most intensive operations in the MAX-LOG-MAP algorithm are the computation of the metrics (forward, backward and branch) and their storage in memory. The computational kernel of the MAX-LOG-MAP algorithm is analogous to the Add-Compare-Select (ACS) operation in the Viterbi algorithm [38]. The architecture of the processing unit that computes the new value of $\bar{\alpha}_t(s)$ according to (3.31) is shown in Figure 3.9. The structure consists of the well-known ACS unit and a register that keeps $\bar{\alpha}_t(s)$ for the next iteration. According to this architecture, the critical path is composed of the propagation delay of one full adder for the addition of the branch metric and the propagation delay of the multiplexer. Thus, the critical path time can be given as

$$t_{acs} = t_{fa} + t_{mux}. \tag{3.38}$$

The backward state metric can be implemented in a manner similar to that of the forward state metric. The last step in the MAX-LOG-MAP algorithm is the computation of the LLR value of the decoded bit. Parallel architectures for the LLR can be derived directly from (3.33). The first stage is composed of $2^v$ adders. The second stage is composed of two $2^{v-1}$-operand MAX operators. Finally, the last operation is the subtraction.
A tree architecture with $(v-1)$ layers can be used for the hardware realization of the $2^{v-1}$-operand MAX operators. The two-operand MAX operator is indicated in Figure 3.9. The whole critical path is $(v-1)$ times the critical path of each layer. Thus, the time for the whole critical path can be given as

$$t_{TMAX} = (v-1) \cdot (t_{fa} + t_{mux}). \tag{3.39}$$

Figure 3.9: Architecture of ACS unit.

Even in the Viterbi algorithm, the same ACS module shown in Figure 3.9 can be used to compute the state metrics. In Table 3.3, we compare the complexity in terms of clock cycles. The number of clock cycles required to compute the LLR depends on the number of states and the block length (the interleaver length has been considered as the block length). From the table, we can observe that the delay involved in the 4D Wei-RS scheme is more than that of the BICM scheme with code memory 2 or 3 and less than that of the BICM scheme with code memory 4. With regard to the variation of the complexity for various code memories and interleaver lengths of the BICM scheme, the number of clock cycles doubles when the number of states or the block length is doubled. However, the code can provide a higher coding gain when the code memory is increased than when the block length is increased.

| Code | N | M | Clock cycles |
|-------------|------|-----|--------------|
| BICM(7,5) | 2048 | 16 | 24576 |
| BICM(7,5) | 2048 | 64 | 24576 |
| BICM(7,5) | 2048 | 256 | 24576 |
| BICM(7,5) | 4096 | 16 | 49152 |
| BICM(7,5) | 8192 | 16 | 98304 |
| BICM(15,17) | 2048 | 16 | 49152 |
| BICM(31,27) | 2048 | 16 | 98304 |
| 4D Wei-RS | | | |

Table 3.3: Number of clock cycles required for metric computation.

| Parameter | Range (bits) |
|----------------------------|--------------|
| Code symbol quantization | 1-8 |
| State metric quantization | 1-16 |
| Branch metric quantization | 1-8 |
| LLR quantization | 1-16 |

Table 3.4: Range of parameters.

| Code | N | M | $\alpha$ (bits) | $\beta$ (bits) | LLR (bits) |
|-------------|------|-----|-----------------|----------------|------------|
| BICM(7,5) | 2048 | 16 | 262144 | 128 | 128 |
| BICM(7,5) | 2048 | 64 | 262144 | 128 | 128 |
| BICM(7,5) | 2048 | 256 | 262144 | 128 | 128 |
| BICM(7,5) | 4096 | 16 | 524288 | 128 | 128 |
| BICM(7,5) | 8192 | 16 | 1048576 | 128 | 128 |
| BICM(15,17) | 2048 | 16 | 524288 | 256 | 256 |
| BICM(31,27) | 2048 | 16 | 1048576 | 512 | 512 |
| 4D Wei-RS | | | 1048576 | 256 | 256 |

Table 3.5: Maximum storage requirements.

| Parameter | BICM(7,5) | BICM(15,17) | BICM(31,27) | 4D Wei-RS |
|-------------------------------|-----------|-------------|-------------|-----------|
| FG function generators | 225 | 497 | 1098 | 692 |
| H function generators | 68 | 146 | 314 | 210 |
| Number of CLB cells | 120 | 264 | 596 | 406 |
| Number of hard macros | 2 | 6 | 8 | 10 |
| Number of CLBs in other cells | 14 | 42 | 64 | 70 |
| Total number of CLBs | 134 | 306 | 660 | 476 |
| Number of ports | 289 | 545 | 1028 | 780 |
| Number of IOBs | 112 | 192 | 348 | 246 |
| Total number of cells | 235 | 463 | 953 | 663 |
| Area | 244 | 496 | 1000 | 712 |

Table 3.6: Xilinx FPGA statistics.

The range of the various parameters used to compute the storage requirements is shown in Table 3.4. Using these ranges, we compare the maximum storage requirements for various codes in Table 3.5. In general, the forward state metric ($\alpha$) needs a large amount of memory for storage, since the computation of the LLR requires the values of $\alpha$ for all the states until the decoding of a block is completed. A much smaller amount of memory is required for the calculation of $\beta$ and the LLR, since in this case we need to store only the corresponding values from the previous time instant.
The storage requirement for the 4D Wei-RS scheme is also large, since it has to store the surviving path metric, the previous state, and the 4D point that corresponds to the surviving path.

The LLR computation module of the different codes is synthesized using SYNOPSYS. The Xilinx FPGA statistics obtained for various codes are shown in Table 3.6. From this table, we can observe that the area increases by a factor of two when the code memory is increased by one (i.e., from 2 to 3 or from 3 to 4). The hardware complexity of the 4D Wei-RS scheme falls between that of the BICM scheme with code memory 3 and that with code memory 4. The proposed scheme with code memory 2 requires less hardware than the codes with larger memories because it has fewer states. We can conclude that, in the proposed scheme, the hardware complexity doubles when the number of states of the code is doubled or when the interleaver size is doubled.

### 3.6 Simulation Results

In this section, we present the results of the simulations performed to evaluate the BER performance of the proposed turbo coding scheme in the VDSL environment.

### 3.6.1 Performance of Bit Interleaved Coded Modulation Scheme

Here we present the BER performance of the BICM scheme. The code memory ($v$), the interleaver size ($N$), and the level of modulation ($M$) are the parameters that affect the code performance. Hence, simulations are carried out to evaluate the effect of the code memory, interleaver size, and level of modulation on the BER performance of the BICM scheme. The targeted BER throughout the simulations is $10^{-5}$. The channel used is a 24-gauge, 3000-ft twisted pair loop with AWGN and FEXT as the main sources of line impairments, modeled as shown by (2.1).

### 3.6.1.1 Effect of Code Memory

To study the effect of the code memory, we implement the turbo codes with the RSC component codes of different memories shown in Table 3.1. The modulation scheme used is QAM.
An MSP mapping is employed so that a symbol detection error results in an error of only one bit. The size of the interleaver is 2048. The turbo codes are appropriately punctured to obtain rate-1/2 codes, thereby achieving a spectral efficiency of 2 bits/s/Hz. Figures 3.10-3.12 show the simulation results for various code memories. From the graphs, we can observe that the performance of the BICM scheme improves with an increase in the number of iterations. However, the improvement becomes negligible after a certain number of iterations, as the data become more correlated. We can also observe that the targeted BER is achieved after 4-5 iterations. Hence, we can use a fixed number of iterations as the stopping criterion.

In Table 3.7, we compare the SNR required by the BICM scheme with different code memories to achieve the targeted BER of $10^{-5}$. From the table, we can observe that as the code memory increases, the code requires a lower SNR to reach the targeted BER because of the increased effective free distance. As the effective free distance determines the performance of the turbo codes at high SNR, increasing the code memory results in an improved error performance for VDSL applications. The coding gain achieved by increasing the code memory from 2 to 3 is 0.6 dB, and the coding gain achieved by increasing the code memory from 3 to 4 is also 0.6 dB.

### 3.6.1.2 Effect of Interleaver Size

The size of the interleaver plays an important role in determining the performance of the turbo codes. Hence, in order to study the performance of the BICM scheme, interleaver sizes of 2048, 4096 and 8192 are considered. The turbo code is a rate-1/2 code with generator polynomial (5,7). Figures 3.10, 3.13, and 3.14 show the performance of the BICM scheme for interleaver sizes of 2048, 4096 and 8192, respectively. The graphs show that the performance is improved by increasing the interleaver size. This improvement can be explained as follows.
As the interleaver enables the information exchange between the two component decoders, increasing the interleaver size has the effect of randomizing the information sequence at the input of the second decoder. Consequently, the inputs to the two component decoders become less correlated with respect to the crosstalk noise, thus improving the performance. From Table 3.8, we can observe that the BICM scheme provides the targeted BER at an SNR of 13.4 dB, 12.7 dB, and 12.5 dB for interleaver sizes of 2048, 4096, and 8192, respectively. The coding gain achieved by increasing the interleaver size from 2048 to 4096 is 0.7 dB, and the gain achieved by increasing the interleaver size from 4096 to 8192 is 0.2 dB. On the other hand, the coding gain achieved by increasing the code memory from 2 to 3 is 0.6 dB, and that achieved by increasing the code memory from 3 to 4 is also 0.6 dB. Hence, increasing the interleaver size beyond a certain point need not necessarily improve the performance. Increasing the code memory by one and doubling the interleaver length increase the code complexity by the same amount (refer to Tables 3.3 and 3.5). Hence, the BICM scheme can provide a better coding gain by increasing the code memory rather than by increasing the interleaver size.

| Code    | SNR (dB) |
|---------|----------|
| (7,5)   | 13.4     |
| (15,17) | 12.8     |
| (31,27) | 12.2     |

Table 3.7: Performance comparison between the codes for a targeted BER of $10^{-5}$ for various code memories.

| N    | SNR (dB) |
|------|----------|
| 2048 | 13.4     |
| 4096 | 12.7     |
| 8192 | 12.5     |

Table 3.8: Performance comparison between the codes for a targeted BER of $10^{-5}$ for various interleaver sizes.

### 3.6.1.3 Effect of Modulation Level

We now examine the effect of increasing the modulation level, i.e., increasing the spectral efficiency of the BICM scheme. The BER performance of the BICM scheme for modulation levels 16, 64 and 256 is shown in Figures 3.10, 3.15 and 3.16, respectively.
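The coding gains quoted from Tables 3.7 and 3.8, and the spectral efficiencies of the three modulation levels considered in this subsection, can be recomputed with a few lines of Python. This is only a sketch: the dictionaries copy the tabulated SNR values, the spectral efficiency is assumed to be the code rate times $\log_2 M$ for the rate-1/2 punctured code, and the function names are illustrative.

```python
import math

# Required SNR (dB) for a BER of 1e-5, copied from Tables 3.7 and 3.8.
SNR_BY_MEMORY = {2: 13.4, 3: 12.8, 4: 12.2}
SNR_BY_INTERLEAVER = {2048: 13.4, 4096: 12.7, 8192: 12.5}


def successive_gains(snr_table: dict) -> list:
    """Coding gain (dB) between consecutive table entries, in key order."""
    keys = sorted(snr_table)
    return [round(snr_table[a] - snr_table[b], 1) for a, b in zip(keys, keys[1:])]


def spectral_efficiency(code_rate: float, modulation_level: int) -> float:
    """Bits/s/Hz, assuming efficiency = code rate * log2(M)."""
    return code_rate * math.log2(modulation_level)


if __name__ == "__main__":
    print("memory gains (dB):", successive_gains(SNR_BY_MEMORY))
    print("interleaver gains (dB):", successive_gains(SNR_BY_INTERLEAVER))
    for m in (16, 64, 256):
        print(m, spectral_efficiency(0.5, m))
```

The gain per step stays constant when the memory grows but shrinks quickly when the interleaver grows, which is the comparison the text relies on.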
The spectral efficiency achieved by the BICM scheme for M=16, 64 and 256 is 2, 3, and 4 bits/s/Hz, respectively. By increasing M, the spectral efficiency of the BICM scheme increases at the expense of an increased SNR per bit. As M increases, the code requires a higher SNR to reach the targeted BER, i.e., the system requires more transmitting power for a reliable transmission. On the other hand, an increase in the required SNR has to be traded against a decrease in the loop length.

Figure 3.10: Performance of BICM with encoder generator (5,7), v=2.

Figure 3.11: Performance of BICM with encoder generator (17,15), v=3.

Figure 3.12: Performance of BICM with encoder generator (27,31), v=4.

Figure 3.13: Performance of BICM with encoder generator (7,5), N=4096.

Figure 3.14: Performance of BICM with encoder generator (7,5), N=8192.

Figure 3.15: Performance of BICM with encoder generator (7,5), M=64.

Figure 3.16: Performance of BICM with encoder generator (7,5), M=256.

### 3.6.2 Performance Comparison between Different Codes

Table 3.9 compares the performance of the different codes. Clearly, the BICM scheme outperforms the 4D Wei-RS scheme. Though the 4D Wei-RS coding scheme has a spectral efficiency of 6.12 bits/s/Hz, it requires a high SNR to achieve the targeted BER. Moreover, as the 4D Wei code is serially concatenated with the RS code, error propagation occurs between the two decoders. This is not the case with the BICM scheme, in which the two encoders are concatenated in parallel. We can observe from Table 3.9 that, as the spectral efficiency of the coding scheme increases, a higher SNR is required to reach the targeted BER. Hence, a higher spectral efficiency comes at the cost of a higher required SNR. As the transmitting power is increased, the crosstalk in the twisted pair loop also increases. Thus, the high transmitting power required by the 4D Wei-RS FEC scheme provides not only a high spectral efficiency but also an increase of crosstalk.
But this is not the case with the BICM scheme. Though this scheme has a smaller spectral efficiency, it achieves the targeted BER at a lower transmitting power, thereby reducing the effect of crosstalk.

| Code        | N    | M   | Efficiency (bits/s/Hz) | SNR (dB) |
|-------------|------|-----|------------------------|----------|
| BICM(7,5)   | 2048 | 16  | 2                      | 13.4     |
| BICM(7,5)   | 2048 | 64  | 3                      | 19.37    |
| BICM(7,5)   | 2048 | 256 | 4                      | 23.8     |
| BICM(7,5)   | 4096 | 16  | 2                      | 12.7     |
| BICM(7,5)   | 8192 | 16  | 2                      | 12.5     |
| BICM(15,17) | 2048 | 16  | 2                      | 12.8     |
| BICM(31,27) | 2048 | 16  | 2                      | 12.2     |
| 4D Wei-RS   |      |     | 6.12                   | 27.8     |

Table 3.9: Required SNR for various codes for a targeted BER of $10^{-5}$.

### 3.7 Conclusion

In this chapter, a bandwidth efficient turbo coding scheme, referred to as BICM, that is suitable for VDSL modems has been proposed. In order to provide a higher coding gain, a joint demodulation and decoding procedure has been developed to avoid the performance degradation due to demodulation. To develop the joint demodulation and decoding procedure, the MAP algorithm has been appropriately modified by treating MAI as an extra source of noise. Based on the formulation developed, we proposed a new mapping method called "mixed set partitioning" to maximize the minimum Euclidean distance between the signal pairs. This mapping method is well suited for iterative decoding schemes and has an advantage over Gray mapping from the second iteration onwards. Some design criteria have been stated for developing good constituent codes. To reduce the delay at the receiver side, we proposed a pipelined decoding scheme.

A detailed complexity analysis of the BICM scheme has been carried out. To analyze the hardware complexity, the BICM scheme has been synthesized using SYNOPSYS. The parameters that have been considered for the complexity analysis are the decoder complexity, the number of clock cycles, the maximum storage requirements, and the Xilinx FPGA statistics.
The analysis has shown that the BICM scheme has a lower complexity than the 4D Wei-RS scheme. However, the decoder complexity and the number of clock cycles required by the BICM scheme with v=4 and N=2048 are greater than those of the 4D Wei-RS scheme. Also, the area required by the BICM scheme with code memory 2 or 3 is less than that of the 4D Wei-RS scheme, whereas the area required by the BICM scheme with code memory 4 is more than that of the 4D Wei-RS scheme.

A detailed simulation study has been performed to analyze the BER performance of the BICM scheme for the code parameters v, N, and M. Simulation results have shown that the BICM scheme outperforms the 4D Wei-RS scheme. Although the BICM scheme is spectrally less efficient than the 4D Wei-RS scheme, it provides a significant coding gain at a very low SNR. The coding gain provided by increasing the code memory is rather large compared to that provided by increasing the interleaver size. A consistent coding gain of 0.6 dB is observed when the code memory is increased from 2 to 3 and from 3 to 4, whereas the increase in the coding gain is 0.7 dB and 0.2 dB for increments in the interleaver size from 2048 to 4096 and from 4096 to 8192, respectively. Thus, increasing the code memory is a better option than increasing the interleaver size. On the other hand, increasing the spectral efficiency of the BICM scheme increases the SNR required to reach the targeted BER.

On the whole, BICM with v=2 or 3 and 16-QAM modulation is best suited for VDSL applications, in the sense that it is less complex and can achieve the targeted BER at a lower SNR than the 4D Wei-RS scheme. Also, the BICM scheme is flexible enough for further modifications, as all the components are independent and can be upgraded easily.

# Chapter 4

# Performance Evaluation of VDSL Employing BICM Scheme

### 4.1 Introduction

In this chapter, we evaluate the performance of VDSL employing the BICM scheme [32], [39].
The parameters that determine the VDSL performance are the transmitting frequency, the transmitting power, and the number of crosstalkers. Hence, we explore the effects of these parameters on the VDSL performance. As mentioned in Chapter 1, the primary VDSL issue is the loop length that can be reliably realized. Hence, we evaluate the loop length that can be realized for a targeted BER of $10^{-5}$ and for a given data rate. The results are compared with those of the 4D Wei-RS scheme. The channel used is a 24-gauge, 3000-ft loop and is modeled according to (2.1). In order to impose the maximum effect of FEXT, we use K=49, the maximum value generally used in practice.

### 4.2 Effect of Transmitting Frequency

Figures 4.1-4.3 show the achievable data rates for various transmitting frequencies for a VDSL system employing the BICM scheme. The transmitting power is fixed at 10 dBm and the loop length at 3000 ft. The results are compared with those of the 4D Wei-RS scheme. A 6-dB noise margin is imposed. As seen from Figure 2.7, the cutoff frequency for a 3000-ft loop is 14 MHz. For the purpose of analysis, however, the results are shown up to 30 MHz. The graphs show that the VDSL system achieves higher data rates by employing the BICM scheme than by employing the 4D Wei-RS scheme. From the graphs, we can observe that the data rates saturate as the frequency of operation is increased. The power spectral density of the crosstalk noise increases dramatically with an increase in the frequency of operation [40], thus limiting the capacity. Because of the poor coding gain of the 4D Wei-RS scheme, VDSL has to maintain lower data rates than those provided by employing the BICM scheme.

Figure 4.1 shows the effect of the code memory on the data rates achieved by the VDSL employing the BICM scheme. As the code memory increases, better data rates are achieved, because the coding gain increases with the code memory (refer to Table 3.7).
The effect of increasing the interleaver size on the data rates is shown in Figure 4.2. From this figure, we can observe that there is a minimal increase in the data rate as the interleaver size increases. As the interleaver size is increased from 2048 to 4096, an increase of about 4% in the data rate is observed, whereas the increase is almost negligible when the interleaver size is changed from 4096 to 8192. This is because a coding gain of 0.7 dB is achieved when the interleaver size is increased from 2048 to 4096, whereas the coding gain is just 0.2 dB for an increase in the interleaver size from 4096 to 8192 (refer to Table 3.8). Hence, in order to obtain better data rates, increasing the code memory is preferred over increasing the interleaver size.

| Code        | N    | M   | Efficiency (bits/s/Hz) | Bit Rate (Mbps) |
|-------------|------|-----|------------------------|-----------------|
| BICM(7,5)   | 2048 | 16  | 2                      | 21.4            |
| BICM(7,5)   | 2048 | 64  | 3                      | 19.7            |
| BICM(7,5)   | 2048 | 256 | 4                      | 16.8            |
| BICM(7,5)   | 4096 | 16  | 2                      | 22.4            |
| BICM(7,5)   | 8192 | 16  | 2                      | 22.44           |
| BICM(15,17) | 2048 | 16  | 2                      | 22              |
| BICM(31,27) | 2048 | 16  | 2                      | 22.9            |
| 4D Wei-RS   |      |     | 6.12                   | 15.6            |

Table 4.1: Comparison of data rates for various codes at f=14 MHz and transmitting power=10 dBm.

The effect of increasing the modulation level on the data rates is shown in Figure 4.3. From the graphs, we can observe that the data rates decrease drastically as we increase the modulation level. Although the spectral efficiency is increased by increasing M from 16 to 64 and 256, the code requires a higher SNR to reach the targeted BER (refer to Table 3.9). As seen from (3.10), the LLR of the transmitted bits depends on the Euclidean distance between the received and transmitted symbols. Hence, by increasing the size of the signal constellation, the distance between the symbols is decreased, thereby reducing the probability of calculating the correct LLR.
Thus, when the frequency of operation is the key parameter, increasing the interleaver size or the modulation level is not recommended. In Table 4.1, the bit rates achieved by the VDSL employing various codes are compared for a 3000-ft loop, a cutoff frequency of 14 MHz, and a transmitting power of 10 dBm.

The bit rate performances of the BICM and the 4D Wei-RS schemes are almost the same in the low frequency range. In this range, the background noise dominates the FEXT. Hence, increasing the spectral efficiency of the code yields better results in the low frequency range. However, in the high frequency range, where the crosstalk is more severe, the SNR of the signal decreases rapidly, thereby decreasing the data rate. Though the performance of the BICM scheme is the same as that of the 4D Wei-RS scheme below 8 MHz, the VDSL system is more power efficient and less complex when it employs the BICM scheme than when it employs the 4D Wei-RS scheme.

Figure 4.1: Effect of Frequency as a function of code memory for BICM.

### 4.3 Effect of Transmitting Power

We now examine the effect of the transmitting power on the data rates provided by the VDSL employing the BICM scheme. To study this effect, we choose the frequency of operation to be 14 MHz. The transmitting power is varied from 5 dBm to 30 dBm. Figures 4.4-4.6 show the effect of the transmitting power on the data rates provided by the VDSL. The graphs show that the data rate increases very little as the transmitting power is increased. The reason is as follows. Assuming that the crosstalkers use the same signaling strategy and power level as the transmitted signal, the crosstalk noise level increases in the same proportion as the transmitting power [41], and the overall signal-to-noise ratio remains nearly the same.
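The saturation argument above can be illustrated numerically. The following Python sketch assumes a deliberately simplified single-frequency model in which the received signal and the FEXT both scale linearly with the transmitting power while the background noise is fixed. The gain and noise values are hypothetical and are not taken from the channel model (2.1); the function names are illustrative.

```python
import math

# Hypothetical linear power gains and noise level (assumptions for illustration).
DIRECT_GAIN = 1e-4   # loop attenuation of the desired signal
FEXT_GAIN = 5e-6     # aggregate FEXT coupling from all crosstalkers
NOISE_MW = 1e-6      # background noise power in mW


def dbm_to_mw(power_dbm: float) -> float:
    """Convert a power in dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def sinr_db(power_dbm: float, direct_gain: float, fext_gain: float, noise_mw: float) -> float:
    """Signal to (FEXT + noise) ratio when crosstalkers transmit at the same power."""
    p = dbm_to_mw(power_dbm)
    return 10 * math.log10(p * direct_gain / (p * fext_gain + noise_mw))


def sinr_ceiling_db(direct_gain: float, fext_gain: float) -> float:
    """Limit of the ratio as the transmitting power grows without bound."""
    return 10 * math.log10(direct_gain / fext_gain)


if __name__ == "__main__":
    for p_dbm in range(5, 31, 5):
        print(p_dbm, round(sinr_db(p_dbm, DIRECT_GAIN, FEXT_GAIN, NOISE_MW), 2))
    print("ceiling:", round(sinr_ceiling_db(DIRECT_GAIN, FEXT_GAIN), 2))
```

With these assumed values, raising the power from 5 dBm to 30 dBm changes the ratio by well under 1 dB, because the FEXT term already dominates the fixed noise. This is the same qualitative behaviour as the nearly flat curves described for Figures 4.4-4.6.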
With regard to the performance of the 4D Wei-RS FEC scheme, the increase in the data rate is almost negligible with an increase in the transmitting power. Although this scheme has a high spectral efficiency, the data rates achieved by VDSL are low compared to those provided by employing the BICM scheme. This can be explained as follows. For the 4D Wei-RS scheme, most of the transmitting power is utilized to provide a high spectral efficiency rather than to combat the crosstalk. Moreover, the 4D Wei-RS scheme requires a high signal power to achieve the targeted BER, thereby increasing the effect of the crosstalk.

Figure 4.2: Effect of Frequency as a function of interleaver length for BICM.

Figure 4.7 (presented as a table) compares the transmitting power required by the VDSL employing various codes to achieve a data rate of 22 Mbps at f=14 MHz. It shows that the VDSL system employing the BICM scheme is more power efficient than that employing the 4D Wei-RS scheme. This can be attributed to the higher coding gain provided by the BICM scheme. In the case of BICM, increasing the modulation level more than doubles the required transmitting power. On the other hand, by increasing the interleaver size and the code memory, a smaller transmitting power is required, since in this case a higher coding gain is achieved. By increasing the interleaver size from 2048 to 4096, the required power is reduced by 5 dB, whereas the change is just 0.5 dB when the interleaver size is increased from 4096 to 8192. Hence, increasing the interleaver size above 8192 will have a negligible effect on the transmitting power. This is not the case with the code memory. A consistent reduction of nearly 3 dB is observed when the code memory is increased from 2 to 3 and from 3 to 4.

Figure 4.3: Effect of Frequency as a function of modulation for BICM.

From the above discussion we can draw the following conclusions.
An increase in the spectral efficiency of the coding scheme at the expense of a higher signal power does not increase the achievable data rates. By maintaining the spectral efficiency of the coding scheme at an acceptable level of 2-3 bits/s/Hz, a higher data rate can be obtained, provided the coding scheme is designed to give a high coding gain.

Figure 4.4: Effect of Transmitting Power for various code memories for BICM.

### 4.4 Effect of Number of Crosstalkers

The performance of the VDSL against the number of crosstalkers in the same wire bundle is shown in Figures 4.8-4.10. The channel is a 3000-ft loop with a frequency of operation of 14 MHz and a transmitting power of 10 dBm. In general, the number of twisted pairs in the same wire bundle varies from 10 to 50, and the amount of crosstalk interference increases as this number increases. The graphs show the advantage of the BICM scheme as the number of crosstalkers grows. As the crosstalk increases, the SNR per bit decreases, which decreases the data rate. Comparing the performance of the BICM scheme with that of the 4D Wei-RS scheme, the latter performs better when the number of crosstalkers is less than 20 (the low crosstalk region). But as this number increases, the BICM scheme outperforms the 4D Wei-RS scheme. This is because, as the crosstalk increases, the 4D Wei-RS scheme has to increase the SNR per bit in order to maintain the targeted BER. Because of this increased SNR, the data rate reduces significantly. Though the BICM scheme is spectrally less efficient than the 4D Wei-RS scheme, it achieves the targeted BER at a lower SNR. Thus, VDSL provides a higher data rate by employing the BICM scheme than by employing the 4D Wei-RS scheme.

Figure 4.5: Effect of Transmitting Power for various interleaver lengths for BICM.
From Table 4.2, we can observe that for the case of the BICM scheme, increasing the interleaver size does not increase the data rate, whereas increasing the code memory increases the data rate. Increasing the spectral efficiency of the BICM scheme reduces the data rate.

Figure 4.6: Effect of Transmitting Power for various modulation levels for BICM.

| Code        | N    | M   | Efficiency (bits/s/Hz) | Power (dBm) |
|-------------|------|-----|------------------------|-------------|
| BICM(7,5)   | 2048 | 16  | 2                      | 13          |
| BICM(7,5)   | 2048 | 64  | 3                      | >30         |
| BICM(7,5)   | 2048 | 256 | 4                      | >30         |
| BICM(7,5)   | 4096 | 16  | 2                      | 8           |
| BICM(7,5)   | 8192 | 16  | 2                      | 8.5         |
| BICM(15,17) | 2048 | 16  | 2                      | 10.4        |
| BICM(31,27) | 2048 | 16  | 2                      | 7           |
| 4D Wei-RS   |      |     | 6.12                   | >30         |

Figure 4.7: Comparison of power for various codes at f=14 MHz for a data rate of 22 Mbps.

Figure 4.8: Effect of numbers of cross talkers for various code memories.

| Code        | N    | M   | Efficiency (bits/s/Hz) | Rate (Mbps) |
|-------------|------|-----|------------------------|-------------|
| BICM(7,5)   | 2048 | 16  | 2                      | 21.8        |
| BICM(7,5)   | 2048 | 64  | 3                      | 20.6        |
| BICM(7,5)   | 2048 | 256 | 4                      | 17.88       |
| BICM(7,5)   | 4096 | 16  | 2                      | 22.7        |
| BICM(7,5)   | 8192 | 16  | 2                      | 22.8        |
| BICM(15,17) | 2048 | 16  | 2                      | 22.3        |
| BICM(31,27) | 2048 | 16  | 2                      | 23.2        |
| 4D Wei-RS   |      |     | 6.64                   | 16.9        |

Table 4.2: Comparison of data rates for various codes for number of cross-talkers=40.

Figure 4.9: Effect of numbers of cross-talkers for various interleaver sizes.

### 4.5 Realizable Loop Length

Figures 4.11-4.13 show the data rates and the corresponding loop lengths that can be realized reliably using the BICM scheme. These graphs show that for a fixed frequency of operation and transmitting power, the data rate decreases as the loop length increases, since the signal attenuates more for longer loops. Hence, for a reliable transmission on longer loops, bit rate has to be compromised.
Table 4.3 lists the loop lengths that can be realized by using various codes at a data rate of 40 Mbps with a 6-dB noise margin. The table shows that, with 16-QAM, the BICM scheme can reliably realize loops that are approximately 3 to 4 times longer than those realized by the 4D Wei-RS scheme.

Figure 4.10: Effect of numbers of cross-talkers for various levels of modulation.

| Code        | N    | M   | Efficiency (bits/s/Hz) | Loop length (ft) |
|-------------|------|-----|------------------------|------------------|
| BICM(7,5)   | 2048 | 16  | 2                      | 1400             |
| BICM(7,5)   | 2048 | 64  | 3                      | 800              |
| BICM(7,5)   | 2048 | 256 | 4                      | 500              |
| BICM(7,5)   | 4096 | 16  | 2                      | 1700             |
| BICM(7,5)   | 8192 | 16  | 2                      | 1700             |
| BICM(15,17) | 2048 | 16  | 2                      | 1600             |
| BICM(31,27) | 2048 | 16  | 2                      | 1800             |
| 4D Wei-RS   |      |     | 6.12                   | 480              |

Table 4.3: Realizable loop lengths with various codes for a data rate of 40 Mbps.

Figure 4.11: Bit rate as a function of loop length for various code memories.

Figure 4.12: Bit rate as a function of loop length for various interleaver sizes.

Figure 4.13: Bit rate as a function of loop length for various modulation levels.

### 4.6 Conclusion

In this chapter, the performance of VDSL employing the BICM scheme has been evaluated. The parameters that have been considered are the transmitting frequency, the transmitting power, and the number of crosstalkers. The effect of these parameters on the bit rate achieved by the VDSL employing the BICM scheme has been evaluated. The effect of v, N, and M on the bit rate provided by the VDSL has been explored. Also, the loop length that can be reliably realized by employing the BICM scheme has been evaluated.

Regarding the achievable data rates, the VDSL modems provide higher data rates by employing the BICM scheme than by employing the 4D Wei-RS code. It has been observed that an increase in the transmitting frequency has a greater impact on the data rates than an increase in the transmitting power.
It has been observed that the data rates saturate beyond a certain frequency. The saturation point for the 4D Wei-RS scheme is nearly 8 MHz, whereas it is approximately 20 MHz for the BICM scheme. Increasing the code memory and the interleaver size increases the data rate, whereas increasing the level of modulation decreases it. Hence, a low level of modulation is more suitable for obtaining higher data rates. The increase in the data rate is minimal when the interleaver size is increased, whereas it is consistent when the code memory is increased. Hence, increasing the code memory is recommended to achieve a higher bit rate.

With regard to the crosstalk, the VDSL employing the BICM scheme operates successfully in a high crosstalk environment and achieves an increase of up to 37% in the bit rate, whereas VDSL employing the 4D Wei-RS scheme can operate successfully only in the low crosstalk environment.

Finally, the loop length that can be reliably realized by employing the proposed BICM scheme has been obtained. Results have shown that, depending on the code configuration, the loop length can be increased by about 3 to 4 times compared to that of the 4D Wei-RS scheme.

# Chapter 5

# An Iterative Soft Interference Cancellation and Decoding Technique to Mitigate the Effect of Home-LAN on VDSL

### 5.1 Introduction

The proposal to use the existing telephone wiring in homes for computer networking (home-LAN) avoids the laying of additional wires in the same premises. Due to the resulting spectral crowding by home-LAN on the twisted pair lines, a severe performance loss occurs in the VDSL services. To provide a better infrastructure for internet services, it is desirable for both the VDSL and home-LAN services to coexist on the same twisted pair lines. The severe performance loss in the VDSL systems due to home-LAN can be eliminated through the use of multiuser detection techniques.
Previous contributions on the maximum-likelihood based joint detection of VDSL and home-LAN signals have been reported in [11]-[15] and [42]-[45]. The methods described therein show that it is possible to mitigate the effect of a home-LAN signal on VDSL if a small fraction (3%) of the VDSL band is silenced. These previously presented receivers for multiuser detection of VDSL and home-LAN signals are complex to implement. The reduction in complexity can be particularly pronounced in multiuser detection if sub-optimum iterative-decoding methods are used instead of joint maximum likelihood detection.

In this chapter, a multiuser detector is proposed to avoid the severe loss in the performance of the VDSL caused by home-LAN. However, this receiver can detect the desired signal from only one of the transmitters, and the signals from the other transmitters are considered to be interference (MAI) to the receiver. The particular concept used here is called "soft cancellation" and is illustrated in Figure 5.1 for the simple case of detecting two independent messages that interfere at the input of a receiver.

Figure 5.1: Example for soft cancellation.

During the first iteration, Decoder1 attempts to compute the equivalent probability for each of the possible values associated with the first message. The resultant distribution may be very easy to decode when it is narrowly centered on one value. However, the presence of the second data message may obscure the first one, and thus the computed probability distribution may not heavily favor a particular message initially. Nevertheless, the initial probability distribution for the first user can be input to Decoder2 of the second user, which in turn attempts to compute the equivalent probability distribution associated with the second message. The distribution associated with the first user's message leads to a better distribution associated with the second user's message.
In turn, the probability distribution of the second message is sent back to the first decoder for the second iteration. The first decoder uses this information to produce a better probability distribution than on its first execution. Hence, we use the soft cancellation technique to mitigate the effect of home-LAN on VDSL.

In order to perform the soft cancellation, an iterative multiuser receiver is proposed to jointly detect the VDSL and home-LAN signals [46]. The received signals are demapped and a set of decoders is used to form the soft estimates of the signals. Based on the soft estimates, the transmitted symbols are estimated and crosstalk cancellation is performed in a SIC. An algorithm is proposed to estimate the transmitted symbols from the soft estimates provided by the decoders. As the performance of turbo codes approaches the Shannon limit, we use turbo codes to compute the soft estimates of the symbols.

This chapter is organized as follows. In Section 5.2, the VDSL system model in an interference environment is described. In Section 5.3, an iterative turbo multiuser receiver structure is described along with its operation. In Section 5.4, the soft interference cancellation technique is described and its complexity analysis is carried out. In Section 5.5, we discuss the simulation results.

### 5.2 System Model

Figure 5.2 shows a VDSL system in a home-LAN interference environment. High speed internet access is provided to the user through VDSL using the unshielded twisted pair cable from the telephone CO to the user. Other applications such as LAN use the in-house telephone wiring for computer networking in order to avoid the laying of additional wiring in the user's premises. By doing so, the signals of these applications interfere with the VDSL signals entering the user's premises. Figure 3.3 shows the transmitter structure for the VDSL signals, whose operation was described in Section 3.3.

Figure 5.2: Crosstalk system model [12].
The multiuser channel model for the VDSL is shown in Figure 5.3. The VDSL signal at the receiver can be modeled as

$$\mathbf{r} = \sum_{k=1}^{n} f_k x_k + N_0, \tag{5.1}$$

where $x_1$ is the desired VDSL signal, $(x_2, \ldots, x_n)$ are the home-LAN signals superimposed on the desired VDSL signal, $(f_1, \ldots, f_n)$ are their corresponding crosstalk coupling functions, and $N_0$ is the AWGN vector. It is assumed that $(f_1, \ldots, f_n)$ are known. Equation (5.1) can be rewritten as

$$\mathbf{r} = FX + N_0, \tag{5.2}$$

where $X = (x_1, x_2, \ldots, x_n)^T$ is the transmitted data vector and $F = (f_1, f_2, \ldots, f_n)$ is the corresponding coupling coefficient matrix. The transmitted data of each signal (both VDSL and home-LAN) consists of in-phase and quadrature-phase components and can be written as

$$x_k = x_{k,I} + jx_{k,Q} \quad (k = 1, \ldots, n), \tag{5.3}$$

where the symbols $x_{k,I}$ and $x_{k,Q}$ take equiprobable values from the set $\{\pm 1, \pm 3, \ldots, \pm(\sqrt{M} - 1)\}$ with $M = 2^m$. Similarly, the received data vector of the desired signal can be written as

$$r_k = r_{k,I} + jr_{k,Q} \tag{5.4}$$

Figure 5.3: Interference model.

The performance of the receiver can be improved significantly by jointly detecting the VDSL and home-LAN signals. Hence, while detecting the VDSL signal, we treat the home-LAN signals as MAI and vice versa.

### 5.3 Iterative Turbo Multiuser Receiver Structure

We perform the soft interference cancellation at the receiver end by jointly detecting the VDSL and home-LAN signals in an iterative manner. The receiver structure is shown in Figure 5.4. It consists of two stages: a soft-in soft-out soft interference canceller with a demodulator, followed by $n$ turbo decoders. From the extrinsic information provided by the decoding stage, the soft interference canceller (SIC) forms an estimate of the interference symbols and performs the interference cancellation.
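The signal model of (5.1)-(5.3) in Section 5.2 can be made concrete with a small simulation sketch. The Python code below draws one square-QAM symbol per user and forms a single received sample as the coupled sum plus complex Gaussian noise. It assumes scalar coupling coefficients and a per-dimension noise standard deviation; the coupling values and function names are hypothetical and only serve the illustration.

```python
import math
import random


def qam_symbol(modulation_level: int, rng: random.Random) -> complex:
    """Draw x = x_I + j x_Q with components from {+-1, +-3, ..., +-(sqrt(M) - 1)}."""
    side = math.isqrt(modulation_level)          # M is assumed to be a perfect square
    levels = [2 * i - (side - 1) for i in range(side)]
    return complex(rng.choice(levels), rng.choice(levels))


def received_sample(symbols, couplings, noise_std: float, rng: random.Random) -> complex:
    """One sample of r = sum_k f_k x_k + N_0, as in (5.1)."""
    noise = complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
    return sum(f * x for f, x in zip(couplings, symbols)) + noise


if __name__ == "__main__":
    rng = random.Random(0)
    # User 1 is the desired VDSL signal; users 2 and 3 stand in for home-LAN interferers.
    couplings = [1.0, 0.3, 0.2]                  # hypothetical coupling coefficients
    symbols = [qam_symbol(16, rng) for _ in couplings]
    print("transmitted:", symbols)
    print("received:", received_sample(symbols, couplings, 0.1, rng))
```

Treating the second and third terms of the sum as MAI, as the receiver of Section 5.3 does, amounts to estimating those symbols and subtracting their coupled contribution from the received sample.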
Then, the received noisy symbols of the desired signal are demapped and the log-likelihood ratio (LLR) associated with each bit is calculated. The LLR consists of a priori and extrinsic information. As the a priori information is not available to the demodulator during the first iteration, an equally likely assumption is made on the received symbols. The LLR associated with each bit can be calculated as

$$\begin{aligned} \Lambda_1(c_{k,i}^j) &= \log \frac{P(c_{k,i}^j = 1 | r_k)}{P(c_{k,i}^j = 0 | r_k)} \\ &= \log \frac{p(r_k|c_{k,i}^j = 1)}{p(r_k|c_{k,i}^j = 0)} + \log \frac{P(c_{k,i}^j = 1)}{P(c_{k,i}^j = 0)} \\ &= \lambda_1(c_{k,i}^j) + \lambda_2^p(c_{k,i}^j), \quad i = 1, \ldots, m;\ k = 1, 2, \ldots, n;\ j = 0, 1, 2, \end{aligned} \tag{5.5}$$

where $r_k$ represents the received symbols of the desired signal, $\lambda_1(c_{k,i}^j)$ is the soft metric corresponding to $c_{k,i}^j$ delivered by the demodulation stage, and the second term, $\lambda_2^p(c_{k,i}^j)$, is the extrinsic information delivered by the decoding stage in the previous iteration. For the first iteration, all the bits are assumed to be equally probable and hence this second term is set to zero. $\lambda_1(c_{k,i}^j)$ is the extrinsic information delivered by the demodulation stage, which is then demultiplexed and sent to the turbo decoder for further processing. In turn, the binary turbo decoder computes the a posteriori LLR of each code bit and then excludes the influence of its a priori information to obtain the extrinsic information as follows:

$$\lambda_2(c_{k,i}^j) = \Lambda_2(c_{k,i}^j) - \lambda_1^p(c_{k,i}^j). \tag{5.6}$$

This extrinsic information from the decoding stage is again suitably multiplexed and fed back to the soft interference canceller and demodulator as a priori information for the next iteration to improve the estimate of the received symbols. The operations carried out by the soft interference canceller, demodulator, and decoder are repeated in an iterative manner.
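The exchange of soft information in (5.5) and (5.6) can be illustrated with a minimal Python sketch. The function names and the numerical LLR values are hypothetical and chosen only for illustration; in particular, the a posteriori values attributed to the turbo decoder are placeholders, not the output of an actual decoder.

```python
import math

def bit_llr(p_one):
    """LLR log(P(c=1)/P(c=0)) of a bit with P(c=1) = p_one."""
    return math.log(p_one / (1.0 - p_one))

def demod_a_posteriori(lambda1, lambda2_prev):
    """Eq. (5.5): a posteriori LLR = demodulator soft metric + a priori LLR."""
    return [l1 + l2 for l1, l2 in zip(lambda1, lambda2_prev)]

def decoder_extrinsic(Lambda2, lambda1_prior):
    """Eq. (5.6): remove the decoder's a priori input from its a posteriori LLR."""
    return [L - l for L, l in zip(Lambda2, lambda1_prior)]

# Hypothetical soft metrics delivered by the demodulator for three code bits.
lambda1 = [1.2, -0.4, 2.5]

# First iteration: bits are equally likely, so the a priori term is zero.
lambda2_prev = [bit_llr(0.5)] * len(lambda1)
Lambda1 = demod_a_posteriori(lambda1, lambda2_prev)

# Placeholder a posteriori LLRs standing in for the turbo decoder output.
Lambda2 = [3.0, -1.1, 4.2]

# Extrinsic information fed back as a priori information for the next iteration.
lambda2 = decoder_extrinsic(Lambda2, lambda1)
```

Only the extrinsic part is fed back, so that each stage does not receive its own previous output as new evidence.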
After the final iteration, the decoding stage makes hard decisions on its a posteriori LLR of the information bits.

Figure 5.4: Turbo iterative multiuser receiver with SIC.

### 5.4 Soft Interference Cancellation via Multiuser Detection

Let us consider the SIC module in the iterative turbo multiuser receiver structure depicted in Figure 5.4. This module performs soft interference cancellation on the received noisy symbols. Here, we describe the soft interference cancellation procedure. The procedure can be applied to the in-phase and quadrature data separately. The soft decision made on the coded data by the channel decoding stage is given, according to (3.11), by

$$P(c_{k,i}^{j}) = \frac{1}{2} \left[ 1 + c_{k,i}^{j} \tanh\left(\frac{1}{2}\lambda_{2}^{p}(c_{k,i}^{j})\right) \right]. \tag{5.7}$$

Based on this soft information of the coded data, an estimate of the transmitted symbol [47] is determined as

$$\begin{aligned} \widetilde{x}_{I,k} &= E\left\{x_{I,k}(\chi_1)/\chi_1, \mathrm{obs}\right\} \\ &= \sum_{\chi_1} x_{I,k} P\left(x_{I,k} = x_{I,k}(\chi_1)/\chi_1, \mathrm{obs}\right) \\ &= \sum_{\chi_1} x_{I,k} \prod_{i=1}^{m/2} P\left(c_{k,i}^{j} = c_{k,i}^{j}(\chi_{1})/\chi_{1}, \mathrm{obs}\right), \end{aligned} \tag{5.8}$$

where $\chi_1 \in \left\{\pm 1, \pm 3, \ldots, \pm(\sqrt{M} - 1)\right\}$, $x_{I,k}(\chi_1)$ denotes the symbol $x_{I,k}$ associated with all the possible realizations of $\chi_1$, $\mathrm{obs}$ denotes the observations provided by the soft information from the channel decoder, and $P(c_{k,i}^j)$ is given by (5.7). Depending on the specific bit $c_{k,i}^j$, the signal space is divided into two sets, $\chi_1(c_{k,i}^j = 1)$ and $\chi_1(c_{k,i}^j = 0)$. Let us explain, by an example, how a transmitted symbol can be estimated based on the soft information.
The in-phase data, $x_{I,k}(\chi_1)$, in 4-QAM modulation is associated with one coded bit $c_{k,1}^j$ with two possible values, 0 or 1, and can be estimated as

$$\widetilde{x}_{I,k} = (1) P\left(c_{k,1}^{j} = 1/\mathrm{obs}\right) + (-1) P\left(c_{k,1}^{j} = 0/\mathrm{obs}\right). \tag{5.9}$$

The probability of $c_{k,i}^{j} = 1$ can be calculated from (5.7) as

$$P(c_{k,i}^{j}=1) = \frac{1}{2} \left[ 1 + \tanh\left(\frac{1}{2}\lambda_{2}^{p}(c_{k,i}^{j})\right) \right], \tag{5.10}$$

and the probability of $c_{k,i}^{j} = 0$ can be calculated as

$$P\left(c_{k,i}^{j}=0\right) = \frac{1}{2} \left[ 1 - \tanh\left(\frac{1}{2}\lambda_{2}^{p}(c_{k,i}^{j})\right) \right]. \tag{5.11}$$

Hence, substituting (5.10) and (5.11) in (5.9), the estimated symbol is given by

$$\widetilde{x}_{I,k} = \tanh\left(\frac{1}{2}\lambda_2^p(c_{k,1}^{j})\right). \tag{5.12}$$

Following the same procedure as for $\widetilde{x}_{I,k}$, $\widetilde{x}_{Q,k}$ can be estimated as

$$\widetilde{x}_{Q,k} = \sum_{\chi_1} x_{Q,k} \prod_{i=(m/2)+1}^m P\left(c_{k,i}^j = c_{k,i}^j(\chi_1)/\chi_1, \mathrm{obs}\right). \tag{5.13}$$

Now, the soft interference cancellation on the desired signal is performed as

$$\begin{aligned} \mathbf{r}_{I,k} &= \mathbf{r}_{I} - F\widetilde{X}_{I,k} \\ &= F\left(X_{I,k} - \widetilde{X}_{I,k}\right) + N_0, \end{aligned} \tag{5.14}$$

where $\widetilde{X}_{I,k} = [\widetilde{x}_{I,1}, \widetilde{x}_{I,2}, \ldots, \widetilde{x}_{I,k-1}, \widetilde{x}_{I,k+1}, \ldots, \widetilde{x}_{I,n}]^T$. We choose the component of $\mathbf{r}_{I,k}$ that has the highest signal-to-noise ratio and denote it as $r_{I,k}$. The same procedure as for $r_{I,k}$ can be applied to $r_{Q,k}$. As the number of iterations is increased, we have better estimates of $x_{I,k}$ and the performance of the SIC improves. After the interference cancellation is performed, the transmitted data is left corrupted only by AWGN and is further processed by the turbo decoders. After the soft interference cancellation, the desired data signal is demodulated as described next.
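Before turning to the demodulation, the 4-QAM soft estimate of (5.9) to (5.12) and the cancellation step of (5.14) can be sketched in Python. This is a minimal illustration under stated assumptions, not the program used in this thesis: the bit in (5.7) is written in bipolar form ($c = +1$ for bit 1, $c = -1$ for bit 0), the cancellation is applied to a single real-valued received sample following (5.1), and the function names, coupling values, and LLR value are hypothetical.

```python
import math

def bit_prob(c, llr):
    """Eq. (5.7) with the bit in bipolar form, c in {+1, -1} (assumption)."""
    return 0.5 * (1.0 + c * math.tanh(0.5 * llr))

def soft_symbol_4qam(llr):
    """Eq. (5.9): expected in-phase symbol, which reduces to tanh(llr/2), Eq. (5.12)."""
    return (+1) * bit_prob(+1, llr) + (-1) * bit_prob(-1, llr)

def cancel(r, f, x_soft, k):
    """Scalar form of Eq. (5.14): subtract the soft estimates of every
    signal except signal k, weighted by the known coupling f, from r."""
    interference = sum(f[u] * x_soft[u] for u in range(len(f)) if u != k)
    return r - interference

# Hypothetical two-signal case: index 0 is VDSL, index 1 is home-LAN.
f = [1.0, 0.3]                      # assumed known coupling coefficients
x = [+1, -1]                        # transmitted in-phase symbols
r = f[0] * x[0] + f[1] * x[1]       # noiseless received sample, Eq. (5.1)

# A strongly negative decoder LLR means the interferer is almost surely -1.
x_soft = [0.0, soft_symbol_4qam(-6.0)]
r_clean = cancel(r, f, x_soft, k=0)  # close to f[0] * x[0]
```

When the decoder is uncertain (LLR near zero), the soft symbol is near zero and little is subtracted, which is what keeps wrong early estimates from being cancelled at full strength.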
The LLR associated with each received bit is calculated according to (5.5). The soft metric $\lambda_1(c_{k,i}^j)$ can be evaluated as

$$\lambda_1(c_{k,i}^j) = \log \frac{\sum_{x_k \in \chi(c_{k,i}^j = 1)} p(r_k | x_k) \prod_{l \neq i}^m P(c_{k,l}^j = 1)}{\sum_{x_k \in \chi(c_{k,i}^j = 0)} p(r_k | x_k) \prod_{l \neq i}^m P(c_{k,l}^j = 0)}. \tag{5.15}$$

Depending on the specific bit $c_{k,i}^j$, the signal space is divided into two sets, $\chi(c_{k,i}^j=1)$ and $\chi(c_{k,i}^j=0)$, corresponding to $c_{k,i}^j=1$ and $c_{k,i}^j=0$, respectively. The soft decision of the coded data, $P(c_{k,l}^j)$, is given by (5.7), and $p(r_k|x_k)$ is the likelihood function for the transmitted signal, calculated as

$$\begin{aligned} p(r_k|x_k) &= p(r_{I,k}, r_{Q,k}|x_{I,k}, x_{Q,k}) \\ &= \frac{1}{\sqrt{2\pi}\sigma_N}\exp\left[-\frac{|r_k-x_k|^2}{2\sigma_N^2}\right] \\ &= \frac{1}{\sqrt{2\pi}\sigma_N} \exp\left[-\frac{(r_{I,k} - x_{I,k})^2 + (r_{Q,k} - x_{Q,k})^2}{2\sigma_N^2}\right]. \end{aligned} \tag{5.16}$$

Equation (5.15) gives the soft information delivered by the demodulation stage. The turbo decoder uses this soft information to compute the a posteriori LLR, $\Lambda_2(c_{k,i}^j)$, of the received bits by the procedure described in Section 3.3.2. The extrinsic information obtained from these a posteriori LLRs through (5.6) is used to compute $P(c_{k,i}^j)$ as given by (5.7).

The iterations can be terminated in two ways. One way is to stop when the changes in the soft symbols are smaller than a threshold. The value of the threshold determines the number of iterations: a small threshold increases the number of iterations. Another possibility is to fix the number of iterations.

The complexity of the multiuser detector is $O(n)$. The complexity of the turbo decoder is $O(2^v)$, where $v$ is the code memory. Thus, the overall complexity of the algorithm is $O(n+2^v)$.

### 5.5 Simulation Results

In this section, we examine the performance of the proposed iterative decoder with SIC. The home-LAN signal is a 4-QAM signal occupying the band from 4 MHz to 10 MHz.
A 16-QAM modulation scheme is used in which the MSP mapping is employed so that a symbol detection error results in an error of only one bit. The turbo code has a rate of 1/3 with a generator polynomial of (1, 5/7). The interleaver size is 2048. It is assumed that the VDSL and the home-LAN signals have equal power.

The BER performance of the proposed iterative decoder with the soft interference cancellation technique is shown in Figure 5.5. The results show that the performance improves as the number of iterations increases. This is because, as the number of iterations is increased, we have a better estimate of the interference. However, the improvement in the performance becomes negligible after a certain number of iterations, as the information exchanged between the decoders becomes more correlated. From the graph, it can be observed that after 4 iterations the improvement in the performance is negligible.

Figure 5.5: BER performance of the iterative decoder.

In Figure 5.6, we compare the performance of the iterative decoder with and without the soft interference canceller at the 5th iteration. The graph shows the improvement in the BER performance obtained by employing the proposed soft interference cancellation technique. The convergence curve for the interference cancellation technique is shown in Figure 5.7. The simulation is done for the worst-case scenario, in which the received signal power is close to the interference power. The curve shows that the algorithm converges in about four to five iterations.

Figure 5.6: Performance comparison of the proposed receiver with and without SIC.

Figure 5.7: Convergence curve for the soft interference cancellation technique.

Figure 5.8: Achievable data rates with and without SIC.

Finally, we examine the data rates provided by the VDSL modem employing the proposed iterative decoder with and without the SIC. Figure 5.8 illustrates that the bit rates decrease as the loop length increases because the signal attenuation
increases as the loop length increases. VDSL modems provide higher data rates with the soft interference canceller than without it. On the other hand, by employing the soft interference canceller, we can increase the loop length for a fixed data rate. For longer loops, however, the performance of the iterative decoder is the same with and without the soft interference canceller. This is because the attenuation of the interfering signal increases as the loop length increases. Hence, on longer loops, the effect of home-LAN interference decreases.

### 5.6 Conclusion

In this chapter, the issue of home-LAN interference into a VDSL signal has been considered. In order to mitigate the effect of home-LAN, an iterative turbo decoder with a soft interference canceller has been proposed. Soft interference cancellation has been performed by jointly detecting the VDSL and home-LAN signals using a soft interference canceller and a soft-in soft-out demodulator combined with a set of turbo decoders. By making the demodulator and decoder operate in an iterative manner, the loss due to demodulation has been avoided. The soft interference canceller estimates the interference signals through the extrinsic information of the home-LAN signals provided by the decoding stage.

Simulation results have shown that, as the number of iterations is increased, the estimate of the interference improves, thereby improving the performance of the system. It has been shown that the algorithm converges after four to five iterations. Finally, the increase in the loop length and data rates provided by VDSL systems employing the proposed SIC technique has been obtained. Results have shown that, on the shorter loops, VDSL systems employing the SIC obtain higher data rates than VDSL systems without SIC.
On the other hand, for a fixed data rate, longer loops can be reliably realized by VDSL systems employing the SIC than by VDSL systems without SIC.

# Chapter 6

## Conclusion and Future Work

### 6.1 Contributions and Concluding Remarks

In this thesis, the last-mile and home-LAN interference problems of VDSL systems have been addressed using turbo codes. With regard to the last-mile problem, a bandwidth efficient turbo coding scheme referred to as the BICM scheme has been proposed. In order to minimize the effect of home-LAN interference on the VDSL systems, an iterative turbo decoder with a soft interference canceller has been proposed.

The last-mile problem concerns the maximum distance for which a VDSL system can operate reliably for a given data rate. The home-LAN interference problem is due to the interference of home-LAN services associated with the twisted pair lines. In addition, a disadvantage of the VDSL system compared to other DSL systems is its short copper loops, which shrink the distribution area to a few dozen customers. The drawback of the existing FEC scheme (a 4D Wei-RS scheme) for the VDSL systems is that further improvement is not possible without a substantial increase in complexity and power penalty. Also, the VDSL systems employing the 4D Wei-RS scheme operate far below the channel capacity. On the other hand, attempts have been made to solve the home-LAN interference problem using iterative-decoding techniques. In [12], a linear soft interference canceller has been proposed to reduce the interference with a small loss of VDSL signal bandwidth. However, these techniques are complex to implement. In order to provide solutions to these problems and to ensure a reliable transmission of data over longer loops, a good FEC scheme with a high coding gain is required.

In the BICM scheme, a joint demodulation and decoding procedure has been developed to avoid the performance degradation due to demodulation.
To develop the joint demodulation and decoding procedure, the MAP algorithm was appropriately modified by treating MAI as an extra source of noise. Some design criteria have been enunciated for developing good constituent codes. To reduce the delay at the receiver end, a pipelined decoding scheme has been proposed.

To analyze the hardware complexity, the BICM scheme has been synthesized using SYNOPSYS. The parameters that have been considered for the complexity analysis are decoder complexity, number of clock cycles, maximum storage requirements, and Xilinx FPGA statistics. The analysis has shown that the BICM scheme has a lower complexity than the 4D Wei-RS scheme. However, the decoder complexity and the number of clock cycles required by the BICM scheme with $v=4$, $N=2048$ are greater than those of the 4D Wei-RS scheme. Also, the area required by the BICM scheme with code memory 2 or 3 is less than that of the 4D Wei-RS scheme, whereas the area required by the BICM scheme with code memory 4 is more than that of the 4D Wei-RS scheme.

A detailed simulation study has been performed to analyze the BER performance of the BICM scheme for the code parameters $v$, $N$, and $M$. Simulation results have shown that the BICM scheme outperforms the 4D Wei-RS scheme. Although the BICM scheme is spectrally less efficient than the 4D Wei-RS scheme, it provides a significant coding gain at a very low SNR. The coding gain provided by increasing the code memory is rather large compared to that provided by increasing the interleaver size. A consistent coding gain of 0.6 dB was observed by increasing the code memory from 2 to 3 and from 3 to 4, whereas the increase in the coding gain was 0.7 dB and 0.2 dB for increments in the interleaver size from 2048 to 4096 and from 4096 to 8192, respectively. Thus, increasing the code memory is a better option than increasing the interleaver size.
On the other hand, increasing the spectral efficiency of the BICM scheme increases the SNR required to reach the targeted BER. Thus, the BICM scheme with $v=2$ or 3 and 16-QAM modulation is best suited for VDSL applications, in the sense that these configurations are less complex and achieve the targeted BER at a lower SNR than the 4D Wei-RS scheme. Also, the BICM scheme is flexible enough for further modifications, as all the components are independent and can be upgraded easily.

With regard to the data rates, the VDSL modems provide higher data rates by employing the BICM scheme than by employing the 4D Wei-RS scheme. It has been observed that an increase in the transmitting frequency has a greater impact on the data rates than an increase in the transmitting power. Increasing the code memory and interleaver size increases the data rate, whereas increasing the level of modulation decreases it. Hence, a low level of modulation is more suitable to obtain higher data rates. Finally, the loop length that can be reliably realized by employing the BICM scheme has been obtained. Results have shown that, depending on the BICM configuration, the loop length can be increased by a factor of at least 3 to 4 for a given data rate compared to that of the 4D Wei-RS scheme.

In the iterative turbo decoder with soft interference canceller, soft interference cancellation has been performed by jointly detecting the VDSL and home-LAN signals using a soft interference canceller and a soft-in soft-out demodulator combined with a set of turbo decoders. By making the demodulator and decoder operate in an iterative manner, the loss due to demodulation has been avoided. The soft interference canceller estimates the interference signals through the extrinsic information of the home-LAN signals provided by the decoding stage.
Simulation results have shown that, as the number of iterations is increased, the estimate of the interference improves, thereby improving the performance of the system. It has been shown that the algorithm converges after four to five iterations. Finally, the increase in the loop length and data rates provided by VDSL systems employing the proposed SIC technique has been obtained. Results have shown that, on the shorter loops, VDSL systems employing the SIC obtain higher data rates than VDSL systems without SIC. On the other hand, for a fixed data rate, longer loops can be reliably realized by VDSL systems employing the SIC than by VDSL systems without SIC.

### 6.2 Scope for Further Investigation

In the present work, a bandwidth efficient turbo coding scheme and a soft interference cancellation technique have been developed to provide solutions to the last-mile and the home-LAN interference problems in VDSL systems. In some VDSL applications, synchronization can be an important issue. In such applications, the proposed bandwidth efficient turbo coding scheme and the soft interference cancellation technique can be modified appropriately to overcome the synchronization problems. An additional investigation would be required to remove the ISI in VDSL systems by developing a receiver scheme in which adaptive equalization and channel decoding are jointly optimized. The effects of bridged taps are not considered in the present work. An investigation can be undertaken to analyze the effects of bridged taps on the performance of the VDSL systems. Finally, it is worthwhile to look into spectrum management issues to mitigate the crosstalk in VDSL systems.

# Appendix

In this appendix, a brief description of the programs developed for this thesis is given. The programs are developed in MATLAB on a Sun Workshop University Edition 5.0 platform. A CD-ROM containing these programs is included in this thesis.
The list of the programs, along with their descriptions as they appear on the CD-ROM, is given below:

- line\_att.m - This function demonstrates the line attenuation for various loop lengths for a VDSL channel.
- fext\_loss.m - This function demonstrates the FEXT coupling loss in a VDSL channel for different twisted pair loops for 49 disturbers. The input to this function is the loop length and transmitting power. The output is a plot demonstrating the FEXT coupling loss for various frequencies.
- next.m - Demonstrates the effect of NEXT in a VDSL channel for various twisted pair loops and various numbers of crosstalkers. The input to this function is the loop length and transmitting power. The output is a plot demonstrating the NEXT coupling loss for various frequencies.
- fext.m - This function demonstrates the effect of FEXT in a VDSL channel for various transmitting powers and frequencies.
- chimpulser.m - This function plots the impulse response of a VDSL channel for various loop lengths. The input to this function is the channel length and the output is the impulse response.
- capacity.m - This function plots the normalized channel capacity for a twisted pair loop in which AWGN and FEXT are line impairments.
- This program plots the differential channel capacity of a VDSL twisted pair loop with AWGN and FEXT as line impairments.
- Plots the frequency response of low-pass filters by taking the filter sampling rate as input.
- Qfunct.m - Program to evaluate the Q-function.
- capmod.m - Demonstrates the mapping scheme in CAP modulation.
- cap\_mod.m - Utility to implement the CAP modulation scheme.
- qamsim1.m - Utility to implement the CAP modulation and demodulation scheme. The program receives a string of bits to be modulated. The mapping scheme is also illustrated in this program. This program also generates home-LAN signals.
- gngauss.m - Program to generate Gaussian noise.
- GrayCoding.m - Returns the systematic Gray code for a symbol.
- GrayDecoding.m - Reverses the systematic Gray coding of a symbol.
- BiStream.m - Generates a sequence of bit streams (either 0 or 1) based upon the length of the sequence.
- BitToSymbolStream.m - Converts a sequence of bits into symbols.
- SymbolToBitStream.m - Converts symbols into a sequence of bits.
- This is the topmost function. This program generates a sequence of bits, performs the encoding process using turbo codes, modulates them using QAM modulation, transmits the signals through a VDSL channel, and performs a joint demodulation and decoding procedure. The interleaver size, number of iterations, code generation matrix, level of modulation, range of SNR, and puncturing matrix can be defined by the user and are given as input to the program. This program can also be used to perform soft interference cancellation of home-LAN signals on VDSL. The change required to perform the SIC is to enable the generation of the home-LAN signals in the qamsim1.m function.
- bin\_state.m - Converts an integer into a vector of binary bits.
- demultiplex.m - Performs serial-to-parallel demultiplexing at the receiver to get the code word of each encoder.
- encode\_bit.m - This function takes as input a single bit to be encoded, as well as the coefficients of the generator polynomials and the current state vector. It returns as output n encoded bits.
- encoderm.m - This function performs the turbo encoding process by determining the encoder memory and constraint length and returns the turbo encoded bits.
- int\_state.m - This function converts a row vector of bits into an integer.
- logmapo.m - Function to demonstrate a LOG-MAP component decoder. An option is provided to the user to simplify the LOG-MAP algorithm by using a MAX-LOG-MAP algorithm. This program calculates the branch metrics, forward state metrics, and backward state metrics. The output of the program is the LLR of the coded and uncoded bits.
- rsc\_encode.m - Encodes a block of data using an RSC code.
- This function implements the SOVA algorithm in trace-back mode and returns the LLR of the coded and uncoded bits.
- This function sets up the trellis for a code. The input to the program is a code generator matrix.
- This function demonstrates a random interleaver. The input to this function is a block of bits and the output is the interleaved version of the input bits.
- weicode.m - This program illustrates the BER performance of the 4D Wei-RS code.
- bitrate\_code.m - This program can be used to calculate the bit rate provided by a VDSL system implementing the proposed scheme. The input to the program is the channel length, transmit frequency, transmitting power, number of crosstalkers, and the coding gain of the coding scheme. The output is the cutoff frequency of the corresponding VDSL channel. The bit rate can be obtained by multiplying the cutoff frequency by the spectral efficiency of the coding scheme.
- Function to perform multiuser detection of the VDSL and home-LAN signals. The input to the program is the VDSL signal corrupted with home-LAN signals and the output is the interference-free VDSL and home-LAN signals.
- bitrates.m - The bit rates provided by the VDSL system for various transmit frequencies, transmitting powers, numbers of crosstalkers, and loop lengths for various code parameters can be plotted using this program.
- FB\_pkg.vhd - VHDL program to analyze the hardware complexity of the BICM scheme. The working model of the MAP decoding block can be checked through this program. The forward, backward, and state metrics can be computed through this program. The code generation matrix can be modified in order to check the functionality of various codes.
- acs01.vhd - VHDL program to check the working model of the add-compare-select block in the MAP decoding module.
- viterbi.vhd - VHDL program to check the working model of the Viterbi decoding block. The SOVA algorithm is implemented in this program.
- acs01.scr - Program to synthesize the add-compare-select block.
FPGA statistics of the add-compare-select block can be obtained through this program.
- llr1.scr - Program to synthesize the MAP decoding block in order to obtain the FPGA statistics.

## References

- [1] S. V. Ahamed, P. Bohn, and N. L. Gottfried, "A Tutorial on Two-Wire Digital Transmission in the Loop Plant," *IEEE Trans. Communications*, Vol. 29, pp. 1554-1564, Nov. 1981.
- [2] R. G. Cornell and D. J. Stelte, "Progress Towards Digital Subscriber Line Services and Signaling," *IEEE Trans. Communications*, Vol. 29, pp. 1589-1594, Nov. 1981.
- [3] J. Cioffi, T. Starr, and P. J. Silverman, *Understanding Digital Subscriber Line Technology*, NJ: Prentice-Hall, 1999.
- [4] ETSI Technical Specification TS 101 270-1 v1.1.1, "Transmission and Multiplexing (TM); Access Transmission Systems on Metallic Access Cables; Very High Speed Digital Subscriber Line (VDSL); Part 1: Functional Requirements," Apr. 1998.
- [5] W. Y. Chen, *DSL: Simulation Techniques and Standards Development for Digital Subscriber Line Systems*, Macmillan Publishing Company, Indianapolis, Indiana, 1998.
- [6] K. S. Jacobsen, "VDSL: The Next Step in the DSL Progression," Texas Instruments, Aug. 1999.
- [7] "Very-high-bit-rate Digital Subscriber Line (VDSL) Metallic Interface Part 1: Functional Requirements and Common Specifications," T1E1.4/2000-009R3.
- [8] K. S. Jacobsen, "Interactions Between ADSL and VDSL in the Same Binder: Simulations and Test Results," Texas Instruments T1E1 Contribution, T1E1.4/99-298, Jun. 1999.
- [9] L. F. Wei, "Trellis Coded Modulation with Multidimensional Constellations," *IEEE Trans. Information Theory*, Vol. IT-33, pp. 483-501, Jul. 1987.
- [10] S. Lin and D. J. Costello, *Error Control Coding: Fundamentals and Applications*, Prentice-Hall, 1983.
- [11] K. W. Cheong and J. M. Cioffi, "Coexistence of 1Mbps HPNA and DMT VDSL via Multiuser Detection and Code Division Multiplexing," T1E1.4: VDSL and Spectral Community, Mar. 1999.
- [12] K. Cheong, J. Choi, J. Fan, R. Negi, and N.
Wu, "Soft Cancellation via Iterative Decoding to Mitigate the Effect of Home-LAN on VDSL," T1E1.4/99-333R1.
- [13] C. Zeng and J. M. Cioffi, "Crosstalk Cancellation in xDSL Systems," *IEEE J. Select. Areas Communications*, Vol. 20, No. 2, pp. 420-428, Feb. 2002.
- [14] H. Dai and H. V. Poor, "Turbo Multiuser Detection for Coded DMT VDSL Systems," *IEEE J. Select. Areas Communications*, Vol. 20, No. 2, pp. 351-362, Feb. 2002.
- [15] K. W. Cheong, W. J. Choi, and J. M. Cioffi, "Multiuser Soft Interference Canceller via Iterative Decoding for DSL Applications," *IEEE J. Select. Areas Communications*, Vol. 20, No. 2, pp. 363-371, Feb. 2002.
- [16] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon Limit Error-Correcting Coding and Decoding: Turbo-Codes," in *Proc. IEEE Int. Conf. Communications*, Geneva, Switzerland, 1993, pp. 1064-1070.
- [17] J. Cioffi, "Very-high-speed Digital Subscriber Lines: System Requirements," T1E1.4/98-043.
- [18] A. Worner, M. Schenk, and D. Schmucking, "Transmission Capacity and Hardware Requirements of VDSL Systems," *IEEE Int. Conf. Communication Systems*, 1996, pp. 864-868.
- [19] I. Kalet and S. Shamai, "On the Capacity of a Twisted-Wire Pair: Gaussian Model," *IEEE Trans. Communications*, Vol. 38, No. 3, pp. 379-383, Mar. 1990.
- [20] J. H. W. Unger, "Near-End Crosstalk Model for Line Code Studies," Bellcore T1D1 Contribution, T1D1.3/85-244.
- [21] D. G. Messerschmitt, "Design Issues in the ISDN-U Interface Transceiver," *IEEE J. Select. Areas Communications*, Vol. 4, No. 8, pp. 1281-1293, Nov. 1986.
- [22] R. A. McDonald, "Report on Bellcore Impulse Noise Study," Bellcore T1D1 Contribution, T1D1.3/87-256.
- [23] W. Y. Chen, "VDSL and Radio Interference Cancelation," T1E1 Contribution, T1E1.4/96-022.
- [24] M. Schenk, "QAM vs. DMT: Which Technology Can Best Realize VDSL's Potential?," Infineon Technologies, Mar. 2001.
- [25] C. S. Modlin and J. S. Chow, "Complexity Comparison of DMT VDSL with QAM Based VDSL," T1E1.4/99-269.
- [26] D.
Divsalar and F. Pollara, "Turbo Codes for PCS Applications," in *Proc. IEEE Int. Conf. on Comm.*, WA, May 1995, pp. 54-59.
- [27] S. L. Goff, A. Glavieux, and C. Berrou, "Turbo-codes and High Spectral Efficiency Modulation," in *Proc. IEEE Int. Conf. Communications*, New Orleans, LA, 1994, pp. 645-649.
- [28] D. J. Costello Jr., A. Banerjee, T. E. Fuja, and P. C. Massey, "Some Reflections on the Design of Bandwidth Efficient Turbo Codes," in *Proc. 39th Annual Allerton Conf. on Communication, Computing and Control*, Oct. 2001, pp. 362-368.
- [29] P. Robertson and T. Worz, "Bandwidth-Efficient Turbo Trellis-Coded Modulation Using Punctured Component Codes," *IEEE J. Select. Areas Communications*, Vol. 16, No. 2, pp. 206-218, Feb. 1998.
- [30] C. Fragouli and R. D. Wesel, "Turbo Encoder Design for Symbol-Interleaved Parallel Concatenated Trellis Coded Modulation," *IEEE Trans. Communications*, Vol. 49, No. 3, pp. 425-434, Mar. 2001.
- [31] S. Benedetto, G. Montorsi, and D. Divsalar, "Bandwidth Efficient Parallel Concatenated Coding Schemes," *Electronics Letters*, Vol. 31, pp. 2067-2069, Nov. 1995.
- [32] Sreekanth Marti and M. O. Ahmad, "A Bandwidth Efficient Turbo Coding Scheme for VDSL Systems," submitted to *Circuits, Systems and Signal Processing Journal*.
- [33] S. Benedetto and G. Montorsi, "Design of Parallel Concatenated Convolutional Codes," *IEEE Trans. Communications*, Vol. 44, No. 5, pp. 591-600, May 1996.
- [34] B. Vucetic and J. Yuan, *Turbo Codes: Principles and Applications*, Kluwer Academic Publishers, Massachusetts, U.S.A., 2000.
- [35] P. Robertson, P. Hoeher, and E. Villebrun, "A Comparison of Optimal and Sub-Optimal MAP Decoding Algorithms Operating in the Log Domain," in *Proc. Int. Conf. on Comm.*, 1995, pp. 1009-1013.
- [36] S. Benedetto and G. Montorsi, "Unveiling Turbo Codes: Some Results on Parallel Concatenated Coding Schemes," *IEEE Trans. Communications*, Vol. 42, No. 2, pp. 409-428, Mar. 1996.
- [37] A. Chindapol and J. A.
Ritcey, "Design, Analysis, and Performance Evaluation for BICM-ID with Square QAM Constellations in Rayleigh Fading Channels," *IEEE J. Select. Areas Communications*, Vol. 19, No. 5, pp. 944-957, May 2001.
- [38] E. Boutillon, W. J. Gross, and P. Glenn Gulak, "VLSI Architectures for MAP Algorithm," submitted to *IEEE Trans. Communications*.
- [39] M. Sreekanth and M. O. Ahmad, "Performance Evaluation of a Turbo Coded VDSL System," in *Proc. The First Annual Northwest Workshop on Circuits and Systems*, Montreal, June 2003.
- [40] J. T. Aslanis and J. M. Cioffi, "Achievable Information Rates on Digital Subscriber Loops: Limiting Information Rates with Crosstalk Noise," *IEEE Trans. Communications*, Vol. 40, pp. 361-367, Feb. 1992.
- [41] P. S. Chow, J. C. Tu, and J. M. Cioffi, "Performance Evaluation of a Multichannel Transceiver System for ADSL and VHDSL Services," *IEEE J. Select. Areas Communications*, Vol. 9, No. 6, pp. 909-919, Aug. 1991.
- [42] S. Galli, C. Valenti, and K. J. Kerpez, "A Frequency Domain Approach to Crosstalk Identification in xDSL Systems," *IEEE J. Select. Areas Communications*, Vol. 19, No. 8, pp. 1497-1506, Aug. 2001.
- [43] C. Zeng, C. Aldana, A. A. Salvekar, and J. M. Cioffi, "Crosstalk Identification in xDSL Systems," *IEEE J. Select. Areas Communications*, Vol. 19, No. 8, pp. 1384-1392, Aug. 2001.
- [44] D. Schmucking, H. Husmann, M. Schenk, and A. Worner, "Interference Cancellation in VDSL Systems," *IEEE Int. Conf. Communication Systems*, 1996, pp. 872-876.
- [45] K. Cheong, J. M. Cioffi, J. Lauer, and A. Salvekar, "Mitigation of DSL Crosstalk via Multiuser Detection," T1E1.4/98-253.
- [46] M. Sreekanth and M. O. Ahmad, "An Iterative Soft Interference Cancellation and Decoding Technique to Mitigate the Effect of Home-LAN on VDSL," in *Proc. IEEE Int. Conf. Communication Systems*, Singapore, Nov. 2002, pp. 1000-1004.
- [47] C. Laot, A. Glavieux, and J.
Labat, "Turbo Equalization: Adaptive Equalization and Channel Decoding Jointly Optimized," *IEEE J. Select. Areas Communications*, Vol. 19, No. 9, pp. 1744-1752, Sep. 2001.
- [48] G. Battail, "A Conceptual Framework for Understanding Turbo Codes," *IEEE J. Select. Areas Communications*, Vol. 16, No. 2, pp. 245-254, Feb. 1998.
- [49] J. Hagenauer, "The Turbo Principle: Tutorial Introduction and State of the Art," *Int. Symposium on Turbo Codes*, Brest, France, 1997.
- [50] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate," *IEEE Trans. Inf. Theory*, Vol. IT-20, pp. 284-287, Mar. 1974.