Scene graphs are a powerful structured representation of the underlying content of images, and embeddings derived from them have been shown to be useful in multiple downstream tasks. In this work, we employ a graph convolutional network to exploit structure in scene graphs and produce image embeddings useful for semantic image retrieval. We demonstrate that a ranking loss leads to robust representations that outperform well-known contrastive losses on the retrieval task. In addition, we provide qualitative evidence of how retrieved results that utilize structured scene information capture the global context of the scene, different from visual similarity search.
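
As a rough illustration of what a ranking objective over image embeddings can look like, the sketch below uses a generic margin-based triplet hinge on cosine similarity. This is a stand-in chosen for illustration, not the loss proposed in the paper; the function names (`cosine_similarity`, `margin_ranking_loss`, `retrieve`), the `margin` value, and the toy vectors are all hypothetical, and in practice the embeddings would come from a graph convolutional network over scene graphs rather than being written by hand.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def margin_ranking_loss(anchor, positive, negative, margin=0.2):
    """Generic triplet hinge (an assumed stand-in, not the paper's loss).

    The penalty is zero once the anchor is more similar to the positive
    than to the negative by at least `margin`.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - sim_pos + sim_neg)


def retrieve(query, gallery, k=2):
    """Indices of the k gallery embeddings most similar to the query."""
    scores = [cosine_similarity(query, g) for g in gallery]
    order = sorted(range(len(gallery)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy embeddings (hypothetical values, for illustration only).
anchor = [1.0, 0.0]
similar = [1.0, 0.1]
dissimilar = [0.0, 1.0]

well_ordered = margin_ranking_loss(anchor, similar, dissimilar)
mis_ordered = margin_ranking_loss(anchor, dissimilar, similar)
top_matches = retrieve(anchor, [dissimilar, anchor, [1.0, 1.0]], k=2)
```

With these toy vectors, the correctly ordered triplet incurs no penalty while the swapped one does, and retrieval simply sorts the gallery by similarity to the query embedding.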

Author(s) : Paridhi Maheshwari, Ritwick Chaudhry, Vishwa Vinay

Keywords : scene - results - retrieved - structured - addition

