Chen, Yueqin (2003) Pipeline for the quality control of sequencing. Masters thesis, Concordia University.
DNA sequencing technology has been under development for decades, yet sequence quality remains a problem in every method applied. In most cases, the raw DNA sequences obtained from automated sequencing machines are not reliable: the output may contain base-calling errors or vector contamination, the dye-terminator reaction may fail, and fragments may migrate abnormally during gel electrophoresis. To trim low-quality regions and sequences, remove contamination, and build reliable assembled contigs, we constructed a pipeline that combines the Phred, Lucy, and Phrap tools. Phred takes chromatogram data as input, makes base calls, assigns a quality value to each base, and writes sequence and quality files in FASTA format. In the next step, Lucy trims vector and low-quality sequence and marks the clean range of each read. Finally, Phrap assembles the cleaned sequences using pairwise alignment information from the Smith-Waterman algorithm. The consensus sequences of the assemblies built by the pipeline are assumed to be reliable gene sequences that can be used for gene annotation. We also compared two versions of the pipeline, with and without Lucy. The results clearly indicate that including Lucy in the process produces more reliable consensus sequences.
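The abstract names the Smith-Waterman algorithm as the source of the pairwise alignment information used during assembly. As a rough illustration of what that algorithm computes, the following is a minimal sketch of Smith-Waterman local alignment scoring with a linear gap penalty. The function name `smith_waterman_score` and the scoring values (`match=2`, `mismatch=-1`, `gap=-2`) are assumptions chosen for this example; they are not the parameters used by Phrap or by the thesis.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not Phrap's defaults.
    """
    # Only the previous row of the dynamic-programming matrix is kept,
    # because each cell depends on its upper, left, and upper-left neighbours.
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets an alignment restart anywhere,
            # which is what makes the alignment local rather than global.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two hypothetical reads that overlap on the segment "ACGATCGA".
read_a = "GGGTACGATCGA"
read_b = "ACGATCGATTTT"
print(smith_waterman_score(read_a, read_b))
```

A high local score between the end of one read and the start of another is the kind of evidence an assembler can use to decide that two reads overlap; trimming vector and low-quality ends beforehand reduces the chance that such scores are driven by contaminating or miscalled bases.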
|Divisions:||Concordia University > Faculty of Engineering and Computer Science > Computer Science and Software Engineering|
|Item Type:||Thesis (Masters)|
|Pagination:||vii, 73 leaves : ill. ; 29 cm.|
|Degree Name:||Theses (M.Comp.Sc.)|
|Program:||Computer Science and Software Engineering|
|Thesis Supervisor(s):||Butler, Gregory|
|Deposited By:||Concordia University Libraries|
|Deposited On:||27 Aug 2009 17:27|
|Last Modified:||08 Dec 2010 15:26|