The identification of structural differences between a music performance and the score is a challenging yet integral step of audio-to-score alignment. We present a novel method to detect such differences between the score and performance for a given piece of music. Our method incorporates varying dilation rates at different layers to capture both short-term and long-term context, and can be employed successfully in the presence of limited annotated data. We conduct experiments on audio recordings of real performances that differ structurally from the score.
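
To illustrate why varying dilation rates across layers can capture both short-term and long-term context, the following minimal sketch computes the receptive field of a stack of stride-1 dilated 1-D convolutions and applies a single dilated convolution to a toy sequence. The kernel size, the dilation schedule `[1, 2, 4, 8]`, and the function names `receptive_field` and `dilated_conv1d` are assumptions chosen for illustration; they are not the configuration or code used in this work.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in input frames, of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation frames.
        rf += (kernel_size - 1) * d
    return rf


def dilated_conv1d(x, kernel, dilation):
    """'Valid' 1-D dilated convolution (cross-correlation form) over a list of numbers."""
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    return [
        sum(w * x[t + i * dilation] for i, w in enumerate(kernel))
        for t in range(len(x) - span)
    ]


# Hypothetical schedules, for illustration only.
growing = [1, 2, 4, 8]   # dilation doubles at each layer
constant = [1, 1, 1, 1]  # ordinary (undilated) convolutions

print(receptive_field(3, growing))   # 31 frames of context
print(receptive_field(3, constant))  # 9 frames of context
print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))  # [4, 6, 8]
```

With the same number of layers and weights, the growing schedule covers 31 input frames against 9 for the undilated stack: the early layers with small dilation respond to local detail, while the later layers with large dilation aggregate over longer spans.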

Author(s): Ruchit Agrawal, Daniel Wolff, Simon Dixon

