A good joint training framework helps improve the performance of weakly supervised audio tagging (AT) and acoustic event detection (AED) simultaneously. In this study, we propose three methods to improve DCASE2019 Task 4 for both AT and AED tasks. A frame-level, target-events based deep feature distillation is first proposed; it aims to leverage the potential of limited strong-labeled data in a weakly supervised framework to learn better intermediate feature maps. Then we propose an adaptive focal loss and a two-stage training strategy to enable effective and more accurate model training, in which the contribution of difficult-to-classify and easy-to-classify acoustic events to the total cost
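
As background for the focal-loss idea mentioned in the abstract, the following is a minimal sketch of the standard binary focal loss, in which well-classified events are down-weighted relative to hard ones. It is not the adaptive variant proposed by the authors, whose exact form is not given here. The function name `binary_focal_loss` and the default `gamma=2.0` are illustrative assumptions.

```python
import math


def binary_focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean standard binary focal loss (illustrative, not the paper's adaptive loss).

    probs   -- predicted probabilities in [0, 1], one per (frame, event) entry
    targets -- ground-truth labels, 0 or 1, same length as probs
    gamma   -- focusing parameter; gamma = 0 reduces to plain cross-entropy
    """
    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        # p_t is the probability the model assigns to the true label.
        p_t = p if y == 1 else 1.0 - p
        # (1 - p_t)^gamma shrinks the loss of easy (high p_t) examples.
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    easy = binary_focal_loss([0.9], [1])  # confident and correct
    hard = binary_focal_loss([0.1], [1])  # confident and wrong
    print(f"easy example loss: {easy:.6f}")
    print(f"hard example loss: {hard:.6f}")
```

With `gamma` greater than zero, an easy example contributes far less to the total cost than a hard one, which is the general effect a focal-style loss relies on.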

Author(s) : Yunhao Liang, Yanhua Long, Yijie Li, Jiaen Liang

Keywords : feature - training - aed - loss - adaptive
