The development of a tool for mapping protein mutations to sequence structures

Gurpur, Ashwin Bhat (2005) The development of a tool for mapping protein mutations to sequence structures. Masters thesis, Concordia University.

Text (application/pdf)
MR10286.pdf - Accepted Version

3MB

Abstract

Related work in natural language processing (NLP) extracts protein mutation information directly from PubMed papers and stores it in an XML file. This thesis describes a tool that processes this NLP output so that the mutations can be visualized. The tool takes the NLP output file as input and extracts the details of each protein being discussed, along with its mutation information. These details are used to retrieve the sequence information from the NCBI protein database. Next, for each protein, the tool retrieves the conserved domain information from the NCBI conserved domain database. Each extracted sequence is split into its respective conserved domains, and these are placed sequentially. ClustalW and Alistat are used to remove sequences that fall below a particular threshold. For the remaining sequences, a consensus sequence is generated and a structure that best matches it is selected. Mutations corresponding to the remaining sequences are mapped onto the structure and a reliability score is calculated. All of this information is written to a visualization file, which is the final output of the tool. This file can be uploaded to the PROSAT protein visualization tool to visualize the mutations. Results are presented for three protein families on which the tool was tested: xylanases, dehalogenases and biphenyl dioxygenase.
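
The consensus and mapping steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes the sequences have already been aligned (the thesis uses ClustalW for alignment, which is not invoked here), it assumes a simple majority vote per column as the consensus rule, and it assumes mutations are written in the common "A4G" form. The function names `consensus`, `parse_mutation` and `alignment_column` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def consensus(aligned):
    """Majority-vote consensus over equal-length aligned sequences.

    Assumption: '-' marks a gap, and the most frequent residue wins a column.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    result = []
    for column in zip(*aligned):
        residues = [c for c in column if c != "-"]
        # A column made up only of gaps stays a gap in the consensus.
        result.append(Counter(residues).most_common(1)[0][0] if residues else "-")
    return "".join(result)


def parse_mutation(text):
    """Parse a point mutation such as 'A4G' into (wild_type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", text)
    if match is None:
        raise ValueError(f"not a point mutation: {text!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def alignment_column(aligned_seq, position):
    """Return the 0-based alignment column of the 1-based ungapped residue `position`."""
    seen = 0
    for column, char in enumerate(aligned_seq):
        if char != "-":
            seen += 1
            if seen == position:
                return column
    raise ValueError("position lies beyond the end of the sequence")


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    aligned = ["MKT-A", "MRTGA", "MKTGA"]
    wild_type, position, mutant = parse_mutation("A4G")
    column = alignment_column(aligned[0], position)
    print(consensus(aligned), wild_type, mutant, column)
```

Once a mutation's alignment column is known, the same column can be looked up in the sequence of the selected structure, which is the idea behind mapping mutations from several sequences onto one structure. The reliability score mentioned in the abstract is not modelled in this sketch.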

Divisions:	Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:	Thesis (Masters)
Authors:	Gurpur, Ashwin Bhat
Pagination:	x, 102 leaves ; 29 cm.
Institution:	Concordia University
Degree Name:	M. Comp. Sc.
Program:	Computer Science and Software Engineering
Date:	2005
Thesis Supervisor(s):	Butler, Gregory
Identification Number:	LE 3 C66C67M 2005 G87
ID Code:	8492
Deposited By:	lib-batchimporter
Deposited On:	18 Aug 2011 18:26
Last Modified:	13 Jul 2020 20:04
Related URLs:	https://concordiauniversity.on.worldcat....
