Farzad, Sabahi (2023) Development of Deep Learning Techniques for Image Retrieval. PhD thesis, Concordia University.
Text (application/pdf): Sabahi_PhD_F2023.pdf (9MB), Accepted Version. Available under License Spectrum Terms of Access.
Abstract
Images are used in many real-world applications, ranging from personal photo repositories to medical imaging systems. Image retrieval is a process in which the images in a database are first ranked by their similarity to a query image, and then a certain number of the most similar images are retrieved from the ranked list. The performance of an image retrieval algorithm is measured in terms of mean average precision. There are numerous applications of image retrieval. For example, face retrieval can help identify a person for security purposes, medical image retrieval can help doctors make more informed medical diagnoses, and commodity image retrieval can help customers find desired commodities. In recent years, image retrieval has gained popularity in view of the emergence of large-capacity storage devices and the availability of low-cost image acquisition equipment. On the other hand, with the size and diversity of image databases continuously growing, the task of image retrieval has become increasingly complex. Recent image retrieval techniques have focused on deep learning because of its exceptional feature extraction capability. However, deep image retrieval methods often employ very complex networks to achieve a desired performance, thus limiting their practicability in applications with limited storage and power capacity. The objective of this thesis is to design high-performance, low-complexity deep networks for the task of image retrieval. This objective is achieved by developing three different low-complexity strategies for generating rich sets of discriminating features.
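As an illustration of the evaluation metric mentioned above, the following is a minimal sketch of mean average precision, not code from the thesis. It assumes each query is represented by a list of booleans marking which retrieved images are relevant, in ranked order, and it normalizes by the number of relevant images that appear in the list (some benchmarks normalize by the total number of relevant images in the database instead). The function names are illustrative.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average precision for one query.

    ranked_relevance[k] is True when the image at rank k+1 is relevant.
    Precision is sampled at every rank that holds a relevant image and
    the samples are averaged (assumption: normalize by retrieved hits).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_query: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-query average precision values."""
    if not per_query:
        return 0.0
    return sum(average_precision(r) for r in per_query) / len(per_query)
```

For a ranked list whose first and third images are relevant, the precision samples are 1/1 and 2/3, so the average precision is 5/6.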
Spatial information contained in images is crucial for providing detailed information about the positioning and interrelation of various elements within an image, and thus plays an important role in distinguishing different images. As a result, designing a network to extract features that characterize this spatial information is beneficial for the task of image retrieval. In light of the importance of spatial information, in our first strategy, we develop two deep convolutional neural networks capable of extracting features with a focus on the spatial information. In the design of the first network, multi-scale dilated convolution operations are used to extract spatial information, whereas in the design of the second network, fusion of feature maps obtained from different hierarchical levels is employed to extract spatial information.
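To make the idea of multi-scale dilated convolution concrete, here is a generic sketch in plain Python. It is not the thesis architecture: the function names, the zero padding, the odd-sized kernel, and the dilation rates (1, 2, 3) are all assumptions chosen for illustration. A 3x3 kernel with dilation rate d samples a (2d+1)x(2d+1) neighbourhood without adding weights, which is why several rates together capture spatial context at several scales.

```python
from typing import List, Sequence

Image = List[List[float]]


def dilated_conv2d(image: Image, kernel: Image, dilation: int) -> Image:
    """Same-size 2D cross-correlation with zero padding.

    Kernel taps are spaced `dilation` pixels apart; an odd-sized kernel
    is assumed so that it has a well-defined centre.
    """
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = kh // 2, kw // 2  # kernel centre
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy = y + (i - ch) * dilation
                    xx = x + (j - cw) * dilation
                    # Taps that fall outside the image contribute zero.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_scale_features(image: Image, kernel: Image,
                         dilations: Sequence[int] = (1, 2, 3)) -> List[Image]:
    """One response map per dilation rate; every map keeps the input size."""
    return [dilated_conv2d(image, kernel, d) for d in dilations]
```

In a real network the kernels would be learned and the per-scale maps would typically be concatenated or summed before the next layer; that part is left out here.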
Textural, structural, and edge information is very important for distinguishing images, and therefore, a network capable of extracting features characterizing this type of information about the images could be very useful for the task of image retrieval. Hence, in our second strategy, we develop a deep convolutional neural network that is guided to extract textural, structural, and edge information contained in an image. Since morphological operations process the texture and structure of the objects within an image based on their geometrical properties and edges are fundamental features of an image, we use morphological operations to guide the network in extracting textural and structural information, and a novel pooling operation for extracting the edge information in an image.
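For readers unfamiliar with morphological operations, the sketch below shows grayscale erosion, dilation, and opening with a square structuring element. It is a textbook illustration only: how the thesis integrates such operations into a network, and its edge-oriented pooling operation, are not described in this abstract and are not reproduced here. The function names and the border handling (windows clipped to the image) are assumptions.

```python
from typing import Callable, Iterable, List

Image = List[List[float]]


def _window_reduce(image: Image, reduce_fn: Callable[[Iterable[float]], float],
                   radius: int) -> Image:
    """Apply reduce_fn over a (2*radius+1)-wide square window at each pixel.

    Windows are clipped at the image border (assumption), so pixels
    outside the image are simply ignored.
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            values = [image[yy][xx]
                      for yy in range(max(0, y - radius), min(h, y + radius + 1))
                      for xx in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(reduce_fn(values))
        out.append(row)
    return out


def erode(image: Image, radius: int = 1) -> Image:
    """Grayscale erosion: local minimum, shrinks bright structures."""
    return _window_reduce(image, min, radius)


def dilate(image: Image, radius: int = 1) -> Image:
    """Grayscale dilation: local maximum, grows bright structures."""
    return _window_reduce(image, max, radius)


def opening(image: Image, radius: int = 1) -> Image:
    """Erosion followed by dilation: removes bright details smaller than the window."""
    return dilate(erode(image, radius), radius)
```

Opening removes an isolated bright pixel but leaves a 3x3 bright block intact, which is the sense in which these operations respond to the geometry of structures in the image.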
Most researchers in the area of image retrieval have focused on developing algorithms that yield good retrieval performance at low computational complexity by outputting a list of a certain number of images ranked in decreasing order of similarity with respect to the query image. However, other researchers have instead sought to improve the results of an already existing image retrieval algorithm through re-ranking. A re-ranking scheme for image retrieval accesses the list of the images retrieved by an image retrieval algorithm and re-ranks them so that the re-ranked list at the output of the scheme has a mean average precision value higher than that of the originally retrieved list.
A re-ranking scheme is an overhead to the process of image retrieval, and therefore, its complexity should be as small as possible. Most of the re-ranking schemes in the literature aim to boost the retrieval performance at the expense of a very high computational complexity. Therefore, in our third strategy, we develop a computationally efficient re-ranking scheme for image retrieval, whose performance is superior to that of the existing re-ranking schemes. Since image hashing offers the dual benefits of computational efficiency and the ability to generate versatile image representation, we adopt it in the proposed re-ranking scheme.
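The reason hashing keeps re-ranking cheap can be sketched as follows. This is a generic illustration and not the proposed scheme: it assumes each image already has a binary hash code stored as a Python integer (how the codes are produced is outside the sketch), and it simply re-orders an initial candidate list by Hamming distance to the query code. All names are hypothetical.

```python
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two binary codes (XOR, then popcount)."""
    return bin(a ^ b).count("1")


def rerank_by_hash(query_code: int, candidates: List[str],
                   codes: Dict[str, int]) -> List[str]:
    """Re-order an initial ranked list by Hamming distance to the query.

    candidates: image IDs in their initial ranking order.
    codes: hypothetical lookup from image ID to its binary hash code.
    Python's sort is stable, so ties keep their initial relative order.
    """
    return sorted(candidates,
                  key=lambda image_id: hamming_distance(query_code, codes[image_id]))
```

Comparing two codes costs one XOR and one bit count, so re-ranking a short candidate list adds little overhead compared with recomputing distances between high-dimensional real-valued features.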
Extensive experiments are performed on benchmark datasets to demonstrate the effectiveness of the proposed strategies in designing low-complexity deep networks for image retrieval.
Divisions: | Concordia University > Gina Cody School of Engineering and Computer Science > Electrical and Computer Engineering |
---|---|
Item Type: | Thesis (PhD) |
Authors: | Farzad, Sabahi |
Institution: | Concordia University |
Degree Name: | Ph. D. |
Program: | Electrical and Computer Engineering |
Date: | 18 July 2023 |
Thesis Supervisor(s): | M. Omair, Ahmad and M.N.S., Swamy |
ID Code: | 992771 |
Deposited By: | FARZAD SABAHI |
Deposited On: | 15 Nov 2023 15:32 |
Last Modified: | 15 Nov 2023 15:32 |