Query-by-Example Image Retrieval using Visual Dependency Representations

Desmond Elliott, Victor Lavrenko, Frank Keller


    Abstract

    Image retrieval models typically represent images as bags-of-terms, a representation that is well-suited to matching images based on the presence or absence of terms. For some information needs, such as searching for images of people performing actions, it may be useful to retain data about how parts of an image relate to each other. If the underlying representation of an image can distinguish images where objects merely co-occur from images where people are interacting with objects, then it should be possible to improve retrieval performance. In this paper we model the spatial relationships between image regions using Visual Dependency Representations, a structured image representation that makes it possible to distinguish between object co-occurrence and interaction. In a query-by-example image retrieval experiment on a data set of people performing actions, we find an 8.8% relative increase in MAP and an 8.6% relative increase in Precision@10 when images are represented using the Visual Dependency Representation compared to a bag-of-terms baseline.
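
    To make the contrast concrete, the sketch below compares a bag-of-terms match against a match over spatial-relation triples. This is an illustrative toy example, not the paper's actual retrieval model: the triple format (governor, relation, dependent), the overlap scoring functions, and the sample data are all assumptions introduced here for clarity.

    ```python
    from collections import Counter

    # Illustrative sketch only: each image is assumed to carry a bag of region
    # labels ("terms") and a set of dependency triples (governor, relation,
    # dependent) derived from its Visual Dependency Representation.

    def bag_of_terms_score(query_terms, image_terms):
        """Overlap of region labels, ignoring how the regions relate."""
        q, d = Counter(query_terms), Counter(image_terms)
        return sum((q & d).values())

    def vdr_score(query_triples, image_triples):
        """Count matching (governor, relation, dependent) triples, so an image
        where a person merely co-occurs with a bicycle scores lower than one
        where the person is actually riding it."""
        return len(set(query_triples) & set(image_triples))

    def rank_images(query, collection, score_fn, key):
        """Query-by-example: rank collection images by similarity to the query."""
        return sorted(collection, key=lambda img: score_fn(query[key], img[key]),
                      reverse=True)

    # Hypothetical toy data for illustration.
    query = {
        "terms": ["person", "bicycle", "road"],
        "triples": [("person", "on", "bicycle"), ("bicycle", "on", "road")],
    }
    images = [
        {"id": "riding", "terms": ["person", "bicycle", "road"],
         "triples": [("person", "on", "bicycle"), ("bicycle", "on", "road")]},
        {"id": "standing", "terms": ["person", "bicycle", "road"],
         "triples": [("person", "beside", "bicycle"), ("bicycle", "on", "road")]},
    ]

    # The two images tie under bag-of-terms, but the triples separate them.
    print([i["id"] for i in rank_images(query, images, bag_of_terms_score, "terms")])
    print([i["id"] for i in rank_images(query, images, vdr_score, "triples")])
    ```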

    Original language: Undefined/Unknown
    Title of host publication: Proceedings of the 25th International Conference on Computational Linguistics
    Number of pages: 12
    Publication date: 2014
    Pages: 109-120
    Publication status: Published - 2014
