Login | Register

Data compression using error correcting codes


Data compression using error correcting codes

Haghighat, Javad (2007) Data compression using error correcting codes. PhD thesis, Concordia University.

Text (application/pdf)
NR30135.pdf - Accepted Version


Application of error correcting codes for data compression is first investigated by Shannon where he suggests that there is a duality between source coding and channel coding. This duality implies that good channel codes are likely to be good source codes (and vice versa). Recently the problem of source coding using channel codes is receiving increasing attention. The main application of this problem is when data are transmitted over noisy channels. Since standard data compression techniques are not designed for error correction, compressing data and transmitting over noisy channels may cause corruption of the whole compressed sequence. However, instead of employing standard compression techniques, like Huffman coding, one may compress data using error correcting codes that are suitable for both data compression and error correction purposes. Recently, turbo codes, repeat-accumulate codes, low density parity check codes, and fountain codes have been used as lossless source codes and have achieved compression rates very close to the source entropy. When a near-lossless compression is desired, i.e. a small level of distortion is acceptable, the source encoder generates fixed-length codewords and the encoding complexity is low. Theoretically, random codes could achieve near-lossless compression. In literature, this has been proved by presenting a random binning scheme. Practically, all powerful channel codes, e.g. turbo codes, can follow the same procedure as suggested in random binning and achieve compression rates close to the entropy. On the other hand, if a completely lossless compression is required, i.e. if the distortion must be forced to zero, the source encoding is a complicated iterative procedure that generates variable-length codewords to guarantee zero distortion. However, the large complexity of encoding imposes a large delay to the system. The iterative encoding procedure can be regarded as using a nested code where each codeword of a higher-rate code is formed by adding parities to a codeword of some lower-rate code. This iterative encoding is proposed for practical codes, e.g. turbo codes and low density parity check (LDPC) codes, in the literature. In contrast to near-lossless source coding, in the lossless case no random coding theory is available to support achievability of entropy and specify distribution of the compression rate. We have two main contributions in this thesis. Our first contribution is presenting a tree structured random binning scheme to prove that nested random codes asymptotically achieve the entropy. We derive the probability mass function of the compression rate and show how it varies when increasing the block length. We also consider a more practical tree structured random binning scheme, where parities are generated independently and randomly, but they are biased. Our second contribution is to decrease the delay in turbo source coding. We consider turbo codes for data compression and observe that existing schemes achieve low compression rates; but because of large block length and large number of iterations they impose a large delay to the system. To decrease this delay we look at the problem of source coding using short block length turbo codes. We show how to modify different components of the encoder to achieve low compression rates. Specifically we modify the parity interleaver and use rectangular puncturing arrays. We also replace a single turbo code by a library of turbo codes to further decrease the compression rate. Since the scheme is variable-length and also many codes are used, the codeword length along with the code index (index of the turbo code which is used for compression) are transmitted as an overhead. Transmission of this overhead increases the compression rate. We propose a detection method to detect this overhead from the codeword. Therefore, the overhead is no longer transmitted since it is detected from the codeword at the decoder. This detection method will reduce the compression rate for short block length systems but it becomes less attractive for large block length codes.

Divisions:Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering
Item Type:Thesis (PhD)
Authors:Haghighat, Javad
Pagination:xiii, 86 leaves : ill. ; 29 cm.
Institution:Concordia University
Degree Name:Ph. D.
Program:Electrical and Computer Engineering
Thesis Supervisor(s):Soleymani, M. Reza and Hamouda, Walaa
ID Code:975299
Deposited By: Concordia University Library
Deposited On:22 Jan 2013 16:05
Last Modified:18 Jan 2018 17:40
Related URLs:
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.

Repository Staff Only: item control page

Downloads per month over past year

Back to top Back to top