Social networks are widely used for information consumption and dissemination, especially during time-critical events such as natural disasters. Despite its large volume, social media content is often too noisy for direct use in any application. In this paper, we present a new large-scale dataset with ~77K human-labeled tweets, sampled from a pool of ~24 million tweets across 19 disaster events that happened between 2016 and 2019. We propose a data collection and sampling pipeline, which is important for sampling social media data for human annotation. We report multiclass classification results using classic and deep learning (fastText and transformer) based models to set the ground for future studies. The dataset and associated resources are publicly available.

Author(s) : Firoj Alam, Umair Qazi, Muhammad Imran, Ferda Ofli

Keywords : dataset - social - human - large - humaid
