Scene graphs are a powerful structured representation of the underlying content of images, and embeddings derived from them have been shown to be useful in multiple downstream tasks. In this work, we employ a graph convolutional network to exploit structure in scene graphs and produce image embeddings useful for semantic image retrieval. We demonstrate that this ranking loss leads to robust representations that outperform well-known contrastive losses on the retrieval task. In addition, we provide qualitative evidence that retrieved results which use structured scene information capture the global context of the scene, unlike visual similarity search.

Author(s) : Paridhi Maheshwari, Ritwick Chaudhry, Vishwa Vinay

Keywords : scene - results - retrieved - structured - addition

