Research on knowledge-driven conversational systems is largely limited by the lack of dialog data consisting of multi-turn conversations on multiple topics with knowledge annotations. In this paper, we propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv. Our corpus contains 4.5K conversations from three domains (film, music, and travel), and 86K utterances with an average turn number of 19.0. To facilitate further research on this corpus, which is publicly available, we provide several benchmark models. Comparative results show that the models can be enhanced by introducing background knowledge, yet there is still substantial room for leveraging knowledge to model multi-turn conversations in future research. Results also show clear performance differences between domains.

Keywords: knowledge-driven, multi-turn, conversations
