Attention on Attention for Text to Image Synthesis using Mode-Seeking Loss Function

Bhise, Naitik, Krzyżak, Adam and Bui, Tien D. (2021) Attention on Attention for Text to Image Synthesis using Mode-Seeking Loss Function. Masters thesis, Concordia University.

Text (application/pdf)
Bhise_MCompSc_S2021.pdf - Accepted Version
9MB

Abstract

Text-to-image synthesis is a burgeoning field that has emerged in the research community over the last few years. Generative Adversarial Networks (GANs) are the basic component of current research, as many architectures are built around them. This thesis attempts to develop a new technique and architecture to compete with recent state-of-the-art models. The research follows two approaches: first modifying the loss function, then changing the architecture. In the first approach, we sample two noise vectors to generate two different output images. The model is trained with a mode-seeking loss function, which maximizes the ratio of the l2-norm of the difference between the two images to the l2-norm of the difference between the two noise vectors. The second approach increases the attention between the text and the image by introducing the Attention on Attention architecture into the AttnGAN network. We find that combining the two approaches produces good-quality images that attend more closely to their text descriptions. Fréchet Inception Distance and Inception Score are used to evaluate the results. The datasets used in this research are Microsoft COCO and CUB Birds. A comparison of the obtained results with past state-of-the-art models is presented in this thesis.
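
The mode-seeking ratio described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the flat-list representation of images and noise vectors, the `eps` stabilizer, and the choice to minimize the reciprocal of the ratio are all assumptions made here for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean (l2) distance between two equal-length flat sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mode_seeking_ratio(img1, img2, z1, z2, eps=1e-8):
    """Ratio ||img1 - img2||_2 / ||z1 - z2||_2.

    A larger value means the two noise vectors produced more distinct images.
    eps (an assumption, not from the abstract) guards against division by zero.
    """
    return l2_distance(img1, img2) / (l2_distance(z1, z2) + eps)


def mode_seeking_loss(img1, img2, z1, z2, eps=1e-8):
    """Term to minimize: the reciprocal of the ratio (assumed formulation).

    Minimizing this value is equivalent to maximizing the ratio above.
    """
    return 1.0 / (mode_seeking_ratio(img1, img2, z1, z2, eps) + eps)


# Toy values standing in for flattened generator outputs and noise vectors.
z1, z2 = [0.0, 0.0], [3.0, 4.0]        # noise distance: 5
diverse_a = [0.0, 0.0, 0.0, 0.0]
diverse_b = [1.0, 1.0, 1.0, 1.0]       # image distance: 2
collapsed = [0.0, 0.0, 0.0, 0.0]       # identical to diverse_a

loss_diverse = mode_seeking_loss(diverse_a, diverse_b, z1, z2)
loss_collapsed = mode_seeking_loss(diverse_a, collapsed, z1, z2)
```

In this toy setting, two identical images generated from different noise vectors give a ratio of zero and therefore a very large loss, which is the behavior that penalizes mode collapse. In a real training loop the same term would be computed on generator output tensors and added to the adversarial loss with a weighting factor.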

Divisions:Concordia University > Gina Cody School of Engineering and Computer Science > Computer Science and Software Engineering
Item Type:Thesis (Masters)
Authors:Bhise, Naitik and Krzyżak, Adam and Bui, Tien D.
Institution:Concordia University
Degree Name:M. Comp. Sc.
Program:Computer Science
Date:February 2021
Thesis Supervisor(s):Bui, Tien and Krzyżak, Adam
ID Code:988020
Deposited By: Naitik Bhise
Deposited On:29 Jun 2021 20:57
Last Modified:29 Jun 2021 20:57
All items in Spectrum are protected by copyright, with all rights reserved. The use of items is governed by Spectrum's terms of access.
