Triple Viz: A tool to explore document content from a graphical representation of subject-verb-object triples

Dhananjaya, Jahnavi (2016) Triple Viz: A tool to explore document content from a graphical representation of subject-verb-object triples. Masters thesis, Concordia University.

Most of the data available is unstructured. Text mining is the process of automatically extracting information from text. This thesis combines text mining with visualization to develop TripleViz, a lightweight, web-based tool that processes and analyzes documents, extracts subject-verb-object (SVO) triples, and visualizes them as graphs. The SVO triples extracted from documents are visualized using the open-source visualization tools Turtled and Gephi. TripleViz extracts noun phrases and can display them in either full or head format, the latter to avoid overcrowding the screen. For the same reason, TripleViz provides an option to select only those triples that contain words of interest, which the user supplies as a word list. Within TripleViz, the user can also view color-coded output text in which words from a word list are highlighted. This thesis also presents an experiment in classifying newspaper articles and blogs as either "specific event" or "generic", which shows a moderate improvement over a strong baseline.
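
The two options described above for reducing clutter, head format and word-list filtering, can be illustrated with a minimal sketch. This is not TripleViz's implementation. The names `Triple`, `head_of`, `to_head_format`, and `filter_by_word_list` are hypothetical, and the head of a noun phrase is approximated naively as its last token, an assumption that only roughly holds for simple English noun phrases.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    """A subject-verb-object triple (hypothetical representation)."""
    subject: str
    verb: str
    obj: str


def head_of(noun_phrase: str) -> str:
    # Assumption: the head is the last whitespace-separated token.
    tokens = noun_phrase.split()
    return tokens[-1] if tokens else noun_phrase


def to_head_format(triple: Triple) -> Triple:
    # Shorten subject and object to their heads; keep the verb unchanged.
    return Triple(head_of(triple.subject), triple.verb, head_of(triple.obj))


def filter_by_word_list(triples: Iterable[Triple], word_list: Iterable[str]) -> List[Triple]:
    # Keep only triples in which at least one token matches a word of interest
    # (case-insensitive, exact token match).
    wanted = {word.lower() for word in word_list}
    kept = []
    for triple in triples:
        tokens = {tok.lower() for part in triple for tok in part.split()}
        if tokens & wanted:
            kept.append(triple)
    return kept
```

Given the triples ("the city council", "approved", "a new budget") and ("heavy rain", "flooded", "the main road"), filtering with the word list ["budget"] would keep only the first triple, and head format would reduce it to ("council", "approved", "budget").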

Divisions: Concordia University > Faculty of Engineering and Computer Science
Item Type: Thesis (Masters)
Authors: Dhananjaya, Jahnavi
Institution: Concordia University
Degree Name: M. Comp. Sc.
Program: Computer Science
Date: 12 August 2016
Thesis Supervisor(s): Bergler, Sabine
Keywords: Text mining, TripleViz, SVO triples
ID Code: 981517
Deposited On: 08 Nov 2016 16:15
Last Modified: 08 Nov 2016 16:15


