Mohammad Taheri, Yaser (2017) Improving the Rate-Distortion Performance in Distributed Video Coding. PhD thesis, Concordia University.
Text (application/pdf), 2MB
MohammadTaheri_PhD_F2017.pdf - Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Distributed video coding is a coding paradigm that allows video frames to be encoded at a complexity substantially lower than that of conventional video coding schemes. This feature makes it suitable for emerging applications such as wireless video surveillance and mobile camera phones. In distributed video coding, a subset of frames in the video sequence, known as the key frames, are encoded using a conventional intra-frame encoder, such as H.264/AVC in the intra mode, and then transmitted to the decoder. The remaining frames, known as the Wyner-Ziv frames, are encoded based on the Wyner-Ziv principle using channel codes such as LDPC codes. In transform-domain distributed video coding, each Wyner-Ziv frame undergoes a 4x4 block DCT and the resulting DCT coefficients are grouped into DCT bands. The bitplanes corresponding to each DCT band are encoded one after another by a channel encoder, for example an LDPCA encoder. The resulting error-correcting bits are retained in a buffer at the encoder and transmitted incrementally as needed by the decoder. At the decoder, the key frames are decoded first. The decoded key frames are then used to generate a side information frame as an initial estimate of the corresponding Wyner-Ziv frame, usually by employing an interpolation method. The difference between a DCT band in the side information frame and the corresponding band in the Wyner-Ziv frame, referred to as the correlation noise, is often modeled by a Laplacian distribution. Soft-input information for each bit in a bitplane is obtained using this correlation noise model and the corresponding DCT band of the side information frame. The channel decoder then uses this soft-input information, along with some error-correcting bits sent by the encoder, to decode the bitplanes of each DCT band in each of the Wyner-Ziv frames.
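To make the soft-input step concrete, the following is a minimal sketch, not taken from the thesis, of how a log-likelihood ratio for one bit might be computed from a Laplacian correlation noise model centred on the side information value. The function names, the uniform quantizer over non-negative coefficients (as for a DC band), and the parameters `alpha`, `step` and `num_bits` are all illustrative assumptions; real codecs handle signed AC coefficients and use their own quantizers.

```python
import math

def laplacian_cdf(x, mean, alpha):
    """CDF of a Laplacian with density (alpha / 2) * exp(-alpha * |x - mean|)."""
    if x < mean:
        return 0.5 * math.exp(alpha * (x - mean))
    return 1.0 - 0.5 * math.exp(-alpha * (x - mean))

def bit_llr(side_info, alpha, prefix_bits, num_bits, step):
    """Return log(P(bit = 0) / P(bit = 1)) for the next bitplane of one coefficient.

    Assumptions (hypothetical, for illustration only):
      - the coefficient is non-negative and uniformly quantized with bin
        width `step` into 2**num_bits bins;
      - bitplanes are decoded from the most significant bit down, and
        `prefix_bits` holds the bits already decoded for this coefficient.
    """
    k = len(prefix_bits)
    prefix = 0
    for b in prefix_bits:
        prefix = (prefix << 1) | b
    # Number of quantizer bins covered by each value of the next bit.
    width = 1 << (num_bits - k - 1)
    lo = (2 * prefix) * width * step        # start of the "bit = 0" interval
    mid = (2 * prefix + 1) * width * step   # boundary between the two intervals
    hi = (2 * prefix + 2) * width * step    # end of the "bit = 1" interval
    p0 = laplacian_cdf(mid, side_info, alpha) - laplacian_cdf(lo, side_info, alpha)
    p1 = laplacian_cdf(hi, side_info, alpha) - laplacian_cdf(mid, side_info, alpha)
    eps = 1e-12  # guards against log(0) far out in the tails
    return math.log((p0 + eps) / (p1 + eps))
```

A positive value favours a 0 bit and a negative value favours a 1 bit. The magnitude grows with `alpha`, so a larger Laplacian parameter expresses more confidence in the side information.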
Hence, accurate estimation of the correlation noise model parameter(s) and generation of high-quality side information are required to obtain reliable soft-input information for the bitplanes at the decoder, which in turn leads to more efficient decoding. Consequently, fewer error-correcting bits need to be transmitted from the encoder to the decoder to decode the bitplanes, leading to better compression efficiency and rate-distortion performance.
The correlation noise is not stationary: its statistics vary within each Wyner-Ziv frame and within its corresponding DCT bands. Hence, it is difficult to find an accurate model for the correlation noise and to estimate its parameters precisely at the decoder. Moreover, in existing schemes the correlation noise parameters for each DCT band are estimated before the decoder starts to decode the bitplanes of that band, and they remain unchanged throughout the decoding of those bitplanes. Another concern is that, since the side information frame is generated at the decoder by temporal interpolation between previously decoded frames, its quality is generally poor when the motion between the frames is non-linear. Hence, generating high-quality side information is a challenging problem.
This thesis is concerned with accurately estimating the correlation noise model parameters and improving the quality of the side information, with the aim of improving the rate-distortion performance in distributed video coding.
A new scheme is proposed for the estimation of the correlation noise parameters wherein the decoder simultaneously decodes all the bitplanes of a DCT band in a Wyner-Ziv frame and then iteratively refines the parameters of the correlation noise model of that band. This process is carried out on an augmented factor graph using a new recursive message passing algorithm, with the side information generated and kept unchanged during the decoding of the Wyner-Ziv frame. Extensive simulations show that the proposed decoder yields improved rate-distortion performance in comparison to the original DISCOVER codec and to another DVC codec employing side information frame refinement, particularly for video sequences with high motion content.
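The idea of alternating between decoding and parameter re-estimation can be illustrated with a much simpler loop than the factor-graph algorithm described above. The sketch below is not the thesis's message passing scheme. It assumes a hypothetical `decode_band` callback that returns the decoded coefficients of a band for a given Laplacian parameter, and it re-estimates that parameter with the standard maximum-likelihood formula for a zero-mean Laplacian, namely the reciprocal of the mean absolute residual.

```python
def estimate_alpha(decoded, side_info, floor=1e-6):
    """Maximum-likelihood Laplacian parameter for the residual decoded - side_info.

    For the density (alpha / 2) * exp(-alpha * |d|), the ML estimate of alpha
    is 1 / mean(|d|). `floor` avoids division by zero when the residual vanishes.
    """
    abs_residual = [abs(x - y) for x, y in zip(decoded, side_info)]
    mean_abs = sum(abs_residual) / len(abs_residual)
    return 1.0 / max(mean_abs, floor)

def refine_alpha(side_info, decode_band, alpha0, max_iters=10, tol=1e-3):
    """Alternate between decoding a band and re-estimating alpha.

    `decode_band(side_info, alpha)` is a hypothetical callback standing in for
    the channel decoder; it must return one decoded coefficient per entry of
    `side_info`. Iteration stops once the relative change in alpha is within
    `tol` or after `max_iters` rounds.
    """
    alpha = alpha0
    for _ in range(max_iters):
        decoded = decode_band(side_info, alpha)
        new_alpha = estimate_alpha(decoded, side_info)
        if abs(new_alpha - alpha) <= tol * alpha:
            return new_alpha
        alpha = new_alpha
    return alpha
```

In this simplified form the refinement happens between complete decoding passes, whereas the scheme in the abstract refines the parameters within the decoding of a band itself.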
In the second part of this work, a new algorithm for the generation of the side information is proposed, which refines the initial side information frame using the additional information obtained after decoding the previous DCT bands of a Wyner-Ziv frame. Simulations demonstrate that the proposed algorithm outperforms schemes employing other side information refinement mechanisms. Finally, it is shown that incorporating the proposed side information refinement algorithm into the decoder proposed in the first part of the thesis leads to a further improvement in the rate-distortion performance of the DVC codec.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Mohammad Taheri, Yaser |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Electrical and Computer Engineering |
Date: | August 2017 |
Thesis Supervisor(s): | Ahmad, M.Omair and Swamy, M.N.S |
ID Code: | 983063 |
Deposited By: | YASER MOHAMMAD TAHERI |
Deposited On: | 08 Nov 2017 21:36 |
Last Modified: | 18 Jan 2018 17:56 |