Modeling the Evolving Structure of Social Text for Information Extraction and Topic Detection

Dubuc, Julien (2011) Modeling the Evolving Structure of Social Text for Information Extraction and Topic Detection. Masters thesis, Concordia University.

PDF - Accepted Version
1238Kb

Abstract

The advent of “social media” has enabled millions of people to participate in discussions within communities on a global scale. These conversations take place in a myriad of venues, on or off the web, each with its own approach to implementing what we now call “social media”: blogs, bulletin boards, mailing lists. However, while the software powering these communities varies a great deal and continues to evolve, all of these venues share a common set of features. When a user initiates a discussion, the message is not addressed to a specific person but broadcast to any interested reader; such a message can generate replies from other users, and these replies can then generate their own, forming a network of connections between messages. There is a need for a system that can make connections between related pieces of social text and group information into coherent units. Making use of the structure of the social text helps to determine which elements of the text to consider for a given topic. To do this, a system needs to consider the different contexts in which a piece of text can be understood. A post, that is, text transmitted by a single author at a single point in time, may have a different topic than the whole thread, which is composed of all the posts in the discussion following an initial post. Different passages in a post could also have separate topics. Therefore, it is useful to annotate the text explicitly with information about its social structure, for use in automatic search and text mining.
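
The post and thread structure described in the abstract can be illustrated with a small sketch. This is not taken from the thesis: the `Post` class, its field names, and the `build_thread` and `walk` helpers are hypothetical, and the sketch assumes each thread has exactly one initial post and that every reply names an existing parent.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Post:
    """Text transmitted by a single author at one point in time (hypothetical model)."""
    post_id: str
    author: str
    text: str
    parent_id: Optional[str] = None  # None marks the initial post of a thread
    replies: List["Post"] = field(default_factory=list)


def build_thread(posts: List[Post]) -> Post:
    """Attach each reply to its parent and return the initial post.

    Assumes exactly one post has parent_id=None and that every other
    parent_id refers to a post in the same list.
    """
    by_id = {p.post_id: p for p in posts}
    root = None
    for p in posts:
        if p.parent_id is None:
            root = p
        else:
            by_id[p.parent_id].replies.append(p)
    if root is None:
        raise ValueError("thread has no initial post")
    return root


def walk(post: Post, depth: int = 0) -> Iterator[Tuple[int, Post]]:
    """Yield (depth, post) pairs in depth-first reply order."""
    yield depth, post
    for reply in post.replies:
        yield from walk(reply, depth + 1)


if __name__ == "__main__":
    thread = build_thread([
        Post("1", "alice", "Which parser should I use?"),
        Post("2", "bob", "I have had good results with a chart parser.", "1"),
        Post("3", "alice", "Thanks, does it handle informal text?", "2"),
        Post("4", "carol", "Off topic: is the mailing list archive down?", "1"),
    ])
    for depth, post in walk(thread):
        print("  " * depth + f"{post.author}: {post.text}")
```

Keeping the reply links explicit lets a topic be computed at different granularities: a single post, a reply chain, or the whole thread rooted at the initial post.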

Divisions: Concordia University > Faculty of Engineering and Computer Science > Computer Science and Software Engineering
Item Type: Thesis (Masters)
Authors: Dubuc, Julien
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 15 April 2011
Thesis Supervisor(s): Bergler, Sabine
ID Code: 7291
Deposited By: JULIEN DUBUC
Deposited On: 09 Jun 2011 15:47
Last Modified: 09 Jan 2012 14:25
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
