Hina, Manolo Dulva (2003) Keyword-based approaches to improve internet search. Masters thesis, Concordia University.
Text (application/pdf): MQ77713.pdf (5MB)
Abstract
Technology keeps evolving, and so must the science of information retrieval. This thesis presents keyword-based approaches to improve information retrieval from the Internet. Focused and unfocused queries to search engines are considered, and means of obtaining relevant documents are presented. For focused queries, techniques are provided to obtain a high precision score from the hit documents; these documents contain the exact answers to the focused query, which is usually a question. Each user query is subjected to an ambiguity test; if it is ambiguous, the user is given direction so that the intended meaning is the one actually searched. The query is then modified to form a new, clear, and unambiguous query. This query is sent to several search engines at the same time, and the hit documents from each of these search engines are collated and merged. Hit documents for an ambiguous query are analyzed and ranked based on their actual relevance to the query. Term frequency is used, along with a popularity score, to determine the total score of a relevant document. Every relevant hit document is classified based on its academic relevance. A few academic categories are considered: (1) Course Notes, (2) Frequently Asked Questions, (3) Research Paper, (4) Technical Report, (5) Thesis, (6) Tutorial, (7) Review, and (8) Research Paper/Technical Report. Once a search is done, a set of relevant documents is presented, along with each document's academic relevance category (if any).
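The merge-and-score step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not define the popularity score or how it is combined with term frequency. Here popularity is assumed to be the number of engines that returned a document, and the total score is assumed to be a plain sum. The function names and the `results_by_engine` shape are hypothetical.

```python
from collections import Counter
import re


def term_frequency(query, text):
    """Count how often the query's terms occur in the text (case-insensitive)."""
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    terms = set(re.findall(r"[a-z0-9]+", query.lower()))
    return sum(words[t] for t in terms)


def merge_and_rank(query, results_by_engine):
    """Merge hits from several engines and rank them by total score.

    results_by_engine: {engine_name: [(url, text), ...]} (hypothetical shape).
    Assumption: popularity = number of engines returning the URL, and
    total score = term frequency + popularity.
    """
    texts = {}
    popularity = Counter()
    for hits in results_by_engine.values():
        # Count each URL at most once per engine.
        for url in {u for u, _ in hits}:
            popularity[url] += 1
        # Keep the first text seen for each URL, so duplicates are merged.
        for url, text in hits:
            texts.setdefault(url, text)
    scored = [(url, term_frequency(query, texts[url]) + popularity[url])
              for url in texts]
    # Highest score first; ties broken by URL for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored
```

Called with a query string and a dictionary of per-engine hit lists, `merge_and_rank` returns `(url, score)` pairs with duplicate URLs collapsed into one entry. A weighted combination of the two scores would be an equally plausible reading of the abstract.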
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering |
---|---|
Item Type: | Thesis (Masters) |
Authors: | Hina, Manolo Dulva |
Pagination: | xi, 191 leaves : ill. ; 29 cm. |
Institution: | Concordia University |
Degree Name: | M. Comp. Sc. |
Program: | Computer Science and Software Engineering |
Date: | 2003 |
Thesis Supervisor(s): | Jayakumar, R |
Identification Number: | TK 5105.884 H56 2003 |
ID Code: | 2023 |
Deposited By: | Concordia University Library |
Deposited On: | 27 Aug 2009 17:24 |
Last Modified: | 13 Jul 2020 19:51 |