The manifold hypothesis suggests that word vectors live on a submanifold within their ambient vector space. Singular points of this space, which arise where points are identified, correspond to polysemous words, i.e. words with multiple meanings. This point of view suggests that monosemous and polysemous words can be distinguished based on the topology of their neighbourhoods. We present two kinds of empirical evidence to support it. First, we introduce a topological measure of polysemy based on persistent homology that correlates well with the actual number of meanings of a word. Second, we propose a simple, topologically motivated solution to the SemEval-2010 task on Word Sense Induction & Disambiguation that produces competitive results.

Author(s) : Alexander Jakubowski, Milica Gašić, Marcus Zibrowius

Keywords : word - words - meanings - topology - based

