The objective of this work is person-clustering in videos: grouping characters according to their identity. Previous methods focus on the narrower task of face-clustering, and for the most part ignore other cues such as the person's voice, their overall appearance (hair, clothes, posture), and the editing structure of the videos. We introduce a Multi-Modal High-Precision Clustering algorithm. The accompanying dataset is by far the largest of its kind, and covers films and TV shows representing a wide range of demographics. Finally, we show the effectiveness of using multiple modalities for person-clustering, explore the use of this new broad task for story understanding through character co-occurrences, and achieve a new state of the art on all available datasets.

Author(s) : Andrew Brown, Vicky Kalogeiton, Andrew Zisserman

Links : PDF - Abstract

Code :

Keywords : person - clustering - show - multiple - voice
