Multi-task Learning of Negation and Speculation for Targeted Sentiment Classification

The majority of work in targeted sentiment analysis has concentrated on finding better methods to improve overall results. In this paper, we show that these models are not robust to linguistic phenomena, specifically negation and speculation. We propose a multi-task learning method that incorporates information from syntactic and semantic auxiliary tasks to create models that are more robust to these phenomena.
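The abstract describes one model serving a main sentiment task plus syntactic and semantic auxiliary tasks. A minimal numpy sketch of that shared-encoder multi-task pattern follows; all shapes, the tanh encoder, and the 0.3 auxiliary weight are illustrative assumptions, not the authors' architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared encoder and two task-specific heads (hypothetical shapes).
W_shared = rng.normal(size=(8, 4))   # maps 8-dim input to a 4-dim shared space
W_sent = rng.normal(size=(4, 3))     # main task: 3 sentiment classes
W_neg = rng.normal(size=(4, 2))      # auxiliary task: negation cue present / absent

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward(x):
    h = np.tanh(x @ W_shared)        # shared representation used by both heads
    return softmax(h @ W_sent), softmax(h @ W_neg)

def multitask_loss(x, y_sent, y_neg, alpha=0.3):
    # Cross-entropy on the main task plus a down-weighted auxiliary loss.
    p_sent, p_neg = forward(x)
    main = -np.log(p_sent[np.arange(len(x)), y_sent]).mean()
    aux = -np.log(p_neg[np.arange(len(x)), y_neg]).mean()
    return main + alpha * aux

x = rng.normal(size=(5, 8))
loss = multitask_loss(x, np.array([0, 1, 2, 0, 1]), np.array([0, 1, 0, 0, 1]))
```

The auxiliary gradient flows only through the shared encoder, which is how the auxiliary signal regularizes the main task.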

NLP-based Feature Extraction for the Detection of COVID-19 Misinformation Videos on YouTube

We present a simple NLP methodology for detecting COVID-19 misinformation videos on YouTube. We use transfer learning with pre-trained models to build a multi-label classifier that can categorize conspiratorial content… We use the percentage of misinformation comments on each video as a new feature for video classification.
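The comment-percentage feature described above is simple to compute; a tiny sketch (the function name and the 0/1 label encoding are assumptions):

```python
def misinformation_comment_ratio(comment_labels):
    """Fraction of a video's comments flagged as misinformation.

    comment_labels: iterable of 0/1 flags, one per comment (1 = flagged
    by the comment classifier). The ratio is used as an extra scalar
    feature for video-level classification.
    """
    labels = list(comment_labels)
    return sum(labels) / len(labels) if labels else 0.0
```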

A General Multi-Task Learning Framework to Leverage Text Data for Speech-to-Text Tasks

Attention-based sequence-to-sequence modeling provides a powerful and elegant solution for applications that need to map one sequence to a different sequence. The proposed method achieves a relative 10–15% word error rate reduction on the English LibriSpeech task and improves speech translation quality on the MuST-C tasks by 4.2–11.1 BLEU.

A Neural Approach to Irony Generation

Much prior research has been conducted on irony detection, but few studies focus on irony generation. Irony can express stronger emotions and show a sense of humor. The main challenges for irony generation are the lack of a large-scale irony dataset and the difficulty of modeling the ironic pattern.

KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text Classification for Kinyarwanda and Kirundi

Kinyarwanda has been studied in Natural Language Processing (NLP) to some extent; this work constitutes the first study on Kirundi. The design of the created datasets allows for wider use in NLP beyond text classification in future studies, such as representation learning, cross-lingual learning with more distant languages, or as a basis for new annotations for tasks such as parsing, POS tagging, and NER.

Adaptive Structured Sparse Network for Efficient CNNs with Feature Regularization

Recent algorithms tend to be too structurally complex to deploy on embedded systems. We propose a feature-regularization method that can generate input-dependent structured sparsity for hidden features. Our method can raise the sparsity level in intermediate features to between 60% and over 95% by pruning along the channel dimension for each pixel, relieving the computational and memory burden.

Movement Pruning: Adaptive Sparsity by Fine-Tuning

Magnitude pruning is a widely used strategy for reducing model size in pure supervised learning. However, it is less effective in the transfer learning regime that has become standard for state-of-the-art natural language processing applications. We propose movement pruning, a simple, deterministic first-order weight pruning method that is more adaptive to pretrained model fine-tuning… We give mathematical foundations for the method and compare it to existing zeroth- and first-order pruning methods.
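The contrast between the two selection rules can be sketched in numpy: magnitude pruning (zeroth-order) keeps the largest weights, while movement pruning (first-order) scores each weight by how far it moves away from zero during fine-tuning, using a score accumulated from gradients. The shapes, the 50% keep ratio, and the post-hoc scoring below are illustrative assumptions; the paper's method learns the scores during fine-tuning rather than computing them afterwards:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 4))           # fine-tuned weights (toy values)
grad_sum = rng.normal(size=(4, 4))    # accumulated dL/dW over fine-tuning (toy values)

def magnitude_mask(W, keep=0.5):
    # Zeroth-order rule: keep the weights with the largest absolute value.
    k = int(W.size * keep)
    thresh = np.sort(np.abs(W), axis=None)[-k]
    return (np.abs(W) >= thresh).astype(float)

def movement_mask(W, grad_sum, keep=0.5):
    # First-order rule: score S = -sum_t (dL/dW) * W; a high score means
    # the optimizer kept pushing the weight away from zero.
    S = -grad_sum * W
    k = int(W.size * keep)
    thresh = np.sort(S, axis=None)[-k]
    return (S >= thresh).astype(float)

m1 = magnitude_mask(W)
m2 = movement_mask(W, grad_sum)
```

A pretrained weight can be large yet shrinking under fine-tuning; magnitude pruning keeps it, movement pruning drops it, which is the paper's core argument for the transfer learning regime.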

Aesthetic Attributes Assessment of Images

Aesthetic attributes assessment of images is a new formulation of image aesthetic assessment that predicts aesthetic attribute captions together with an aesthetic score for each attribute. The AMAN model outperforms the traditional CNN-LSTM model and the modern SCA-CNN model for image captioning.

Cross-Lingual Relation Extraction with Transformers

Relation extraction (RE) is one of the most important tasks in information extraction, as it provides essential information for many NLP applications. In this paper, we propose a cross-lingual RE approach that requires no human annotation in the target language and no cross-lingual resources.

Audio-Based Near-Duplicate Video Retrieval with Audio Similarity Learning

Audio Similarity Learning (AuSiL) effectively captures temporal patterns of audio similarity between video pairs. The proposed approach achieves very competitive results compared to three state-of-the-art methods and, unlike the competing methods, is very robust to the retrieval of audio duplicates generated with speed transformations.

Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking

We propose a neural ranking architecture for this task that uses multilingual BERT representations of the mention and its context… We find that the multilingual ability of BERT leads to robust performance in both monolingual and multilingual settings.

Blind Video Temporal Consistency via Deep Video Prior

Applying image processing algorithms independently to each video frame often leads to temporal inconsistency in the resulting video. Our method is trained directly on a single pair of original and processed videos rather than on a large dataset. We demonstrate the effectiveness of our approach on 7 computer vision tasks on videos.

Investigating the True Performance of Transformers in Low-Resource Languages: A Case Study in Automatic Corpus Creation

Transformers have represented the state of the art in Natural Language Processing (NLP) in recent years, proving effective even in tasks in low-resource languages. While pretrained transformers for these languages can be made, it is challenging to measure their true performance and capacity due to the lack of hard benchmark datasets, as well as the difficulty and cost of producing them… In this paper, we present three contributions: first, we propose a methodology for automatically producing Natural Language Inference (NLI) benchmark datasets for low-resource languages using published news articles.

Cross-Lingual Transfer Learning for Question Answering

Deep-learning-based question answering (QA) on English documents has been successful because a large number of English training examples are available. For most languages, training examples for high-quality QA models are not available. A machine translation (MT)-based approach translates the source language into the target language, or vice versa.

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Knowledge distillation (KD) has been proposed to transfer information learned from one model to another. KD is often characterized by the so-called 'Student-Teacher' (S-T) learning framework and has been broadly applied in model compression and knowledge transfer. We discuss the potential and open challenges of existing methods and consider future directions for KD and S-T learning.
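S-T learning is usually anchored by the classic distillation loss: a KL divergence between temperature-softened teacher and student distributions. A minimal numpy sketch follows; the temperature T=4 and the T² scaling follow the common Hinton-style formulation, and this is a generic illustration rather than any specific surveyed method:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-softened softmax; larger T flattens the distribution.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    # KL(teacher || student) on softened distributions, scaled by T^2 so
    # gradient magnitudes stay comparable across temperatures.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * np.sum(p * (np.log(p) - np.log(q)))

loss = kd_loss([2.0, 1.0, 0.1], [3.0, 0.5, -1.0])
```

In practice this term is combined with the ordinary cross-entropy on hard labels, weighted by a mixing coefficient.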

Dick Preston and Morbo at SemEval-2019 Task 4: Transfer Learning for Hyperpartisan News Detection

A pre-trained language model is fine-tuned on domain-specific data and used to classify news articles. The suggested approach yields accuracy and F1 scores around 0.8, which places the best-performing classifier among the top 5 systems in the competition.

Matching the Clinical Reality: Accurate OCT-Based Diagnosis From Few Labels

Unlabeled data is often abundant in the clinic, making semi-supervised machine learning methods a good match for this setting. The MixMatch and FixMatch algorithms have demonstrated promising results in extracting useful representations while requiring very few labels.

Improved Generalization of Arabic Text Classifiers

Transfer learning for text has been very active for English, but progress in Arabic has been slow. Domain adaptation is used to generalize the performance of a classifier by balancing its accuracy for a particular task across different text domains… In this paper, we propose and evaluate two variants of a domain adaptation technique.

MCGKT-Net: Multi-level Context Gating Knowledge Transfer Network for Single Image Deraining

Rain streak removal in a single image is a very challenging task due to its inherently ill-posed nature. Conventional DCNN-based deraining methods have resorted to deeper and more complex network architectures in pursuit of better performance. This study proposes a novel MCGKT-Net, a naturally multi-scale learning framework, for boosting deraining performance.

Improving significance of binary black hole mergers in Advanced LIGO data using deep learning: Confirmation of GW151216

We present a Machine Learning (ML)-based strategy to search for compact binary coalescences in data from ground-based gravitational wave observatories. This is the first ML-based search that not only recovers all the binary black hole mergers in the first GW transients catalog (GWTC-1) but also makes a clean detection of GW151216, which was not significant enough to be included in the catalog.

Towards Fully Bilingual Deep Language Modeling

Language models based on deep neural networks have enabled great advances in natural language processing and understanding tasks in recent years. However, multilinguality has come at a cost in terms of monolingual performance: the best-performing models on most tasks not involving cross-lingual transfer remain monolingual… In this paper, we consider whether it is possible to pre-train a bilingual model for two remotely related languages without compromising performance in either.

Increasing Shape Bias in ImageNet-Trained Networks Using Transfer Learning and Domain-Adversarial Methods

Convolutional Neural Networks (CNNs) have become the state-of-the-art method for learning from image data. Recent works have attempted to increase the shape bias in CNNs in order to train networks that are more robust and accurate. The results show that the proposed method increases the robustness and shape bias of the CNNs, though it does not provide a gain in accuracy.

What makes multilingual BERT multilingual?

Multilingual BERT works remarkably well on cross-lingual transfer tasks, outperforming static non-contextualized word embeddings. We find that data size and context window size are crucial factors in cross-lingual transferability. We provide an in-depth experimental study to supplement the existing literature.

Inferring symmetry in natural language

A hybrid transfer learning model that integrates linguistic features with contextualized language models most faithfully predicts the empirical data. Our work integrates existing approaches to symmetry in natural language and suggests how symmetry inference can improve systematicity in state-of-the-art language models.

MaLTESE: Large-Scale Simulation-Driven Machine Learning for Transient Driving Cycles

A deep-neural-network-based surrogate model achieves high accuracy for various engine parameters such as exhaust temperature, exhaust pressure, nitric oxide, and engine torque. It requires about 16 µs to predict the engine performance and emissions for a single design configuration, compared with about 0.5 s per configuration with the engine simulator, a speedup of roughly 31,000x.

Instance-Based Inductive Deep Transfer Learning by Cross-Dataset Querying with Locality Sensitive Hashing

The authors propose an inductive transfer learning method that augments learning models by infusing similar instances from different learning tasks in the Natural Language Processing (NLP) domain. They show that this can achieve competitive or better performance than learning from a single dataset.
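Cross-dataset querying with locality sensitive hashing can be sketched with the classic random-hyperplane LSH family for cosine similarity, where similar instance embeddings tend to collide in the same bucket. The toy embeddings, class name, and bucket index below are illustrative assumptions, not the authors' pipeline:

```python
import numpy as np

class HyperplaneLSH:
    # Random-hyperplane LSH for cosine similarity: the hash is the sign
    # pattern of projections onto random hyperplanes, so nearby vectors
    # tend to share a bucket.
    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))

    def hash(self, v):
        bits = (self.planes @ v) > 0
        return bits.astype(int).tobytes()

lsh = HyperplaneLSH(dim=4)

# Index instances from an auxiliary dataset (toy embeddings).
index = {}
for i, emb in enumerate([np.array([1.0, 0.0, 0.0, 0.0]),
                         np.array([0.9, 0.1, 0.0, 0.0]),
                         np.array([-1.0, 0.0, 0.0, 0.0])]):
    index.setdefault(lsh.hash(emb), []).append(i)

# Query with a vector close to the first two instances; colliding
# instances are candidates to infuse into the current task's batch.
candidates = index.get(lsh.hash(np.array([0.95, 0.05, 0.0, 0.0])), [])
```

Retrieval is sublinear because only the query's bucket is inspected, which is what makes querying a large auxiliary dataset per training instance feasible.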

Model-Based and Data-Driven Strategies in Medical Image Computing

Model-based approaches for image reconstruction, analysis, and interpretation have made significant progress over the last decades. With the availability of large amounts of imaging data and machine learning techniques, data-driven approaches have become more widespread. These approaches learn statistical models directly from labelled or unlabeled image data and have been shown to be very powerful for extracting clinically useful information from medical imaging.

Learning Depth from Monocular Videos Using Synthetic Data: A Temporally Consistent Domain Adaptation Approach

The majority of state-of-the-art monocular depth estimation methods are supervised learning approaches. The success of such approaches heavily depends on high-quality depth labels, which are expensive to obtain. Some recent methods try to learn depth networks by leveraging unsupervised cues from monocular videos, which are easier to acquire but less reliable.

Hydrocephalus verification on brain magnetic resonance images with deep convolutional neural networks and transfer learning technique

Deep learning is an evolving technology and part of the broader field of machine learning, and it is currently actively researched in radiology… Hydrocephalus can be either an independent disease or a concomitant symptom of a number of pathologies, making it an urgent issue in present-day clinical practice.

Meta-Learning for Low-Resource Unsupervised Neural Machine Translation

Our meta-learning algorithm trains the model to adapt to another domain using only a small amount of training data. We assume that domain-general knowledge is a significant factor in handling data-scarce domains. Our model surpasses a transfer-learning-based approach by up to 2–4 BLEU points.
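The summary does not spell out the meta-learning algorithm; as a generic illustration of the idea, adapting quickly from a shared initialization using only a little in-domain data, here is a first-order Reptile-style sketch on a toy regression problem. Everything here, from the squared-error loss to the learning rates, is an assumption for illustration, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(3)

def make_domain(true_w, n=20):
    # A toy "domain": linear data generated from domain-specific weights.
    X = rng.normal(size=(n, 2))
    return X, X @ true_w

def sgd_steps(w, data, lr=0.05, steps=10):
    # Inner loop: adapt the parameters to one domain with plain gradient
    # descent on squared error (a stand-in for the translation loss).
    X, y = data
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def reptile(w, domains, meta_lr=0.5, rounds=30):
    # Outer loop (Reptile-style first-order meta-update): nudge the
    # shared initialization toward each domain's adapted parameters.
    for _ in range(rounds):
        for d in domains:
            w = w + meta_lr * (sgd_steps(w, d) - w)
    return w

domains = [make_domain(np.array([1.0, 0.0])), make_domain(np.array([0.0, 1.0]))]
w_meta = reptile(np.zeros(2), domains)
```

The meta-trained initialization sits between the domain optima, so a few inner-loop steps on scarce in-domain data suffice to specialize it.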

A Combinatorial Perspective on Transfer Learning

Human intelligence is characterized not only by the capacity to learn complex skills but also by the ability to rapidly adapt and acquire new skills within an ever-changing environment. In this work we study how learning modular solutions can allow effective generalization to both unseen and potentially differently distributed data.

Multi-dimensional Style Transfer for Partially Annotated Data Using Language Models as Discriminators

Style transfer has been widely explored in natural language generation with non-parallel corpora by extracting a notion of style from source- and target-domain corpora. A common aspect of existing approaches is the prerequisite of joint annotations across all the stylistic dimensions under consideration… The availability of such datasets across a combination of styles is a limiting factor in extending state-of-the-art style transfer setups to multiple style dimensions.

Hardware Conditioned Policies for Multi-Robot Transfer Learning

Deep reinforcement learning can be used to learn dexterous robotic policies, but it is challenging to transfer them to new robots with vastly different hardware properties. It is also prohibitively expensive to learn a new policy from scratch for each robot due to the high sample complexity of modern state-of-the-art algorithms… We propose a novel approach called Hardware Conditioned Policies, in which we train a universal policy conditioned on a vector representation of the robot hardware.

Multi-Stage Pre-training for Low-Resource Domain Adaptation

Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. We apply these approaches incrementally to a pre-trained RoBERTa-large LM and show considerable performance gains on three tasks in the IT domain: extractive reading comprehension, document ranking, and duplicate question detection.

Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout

Gradient Sign Dropout (GradDrop) is a probabilistic masking procedure that samples gradients at an activation layer based on their level of consistency. GradDrop is implemented as a simple deep layer that can be used in any deep network and synergizes with other gradient-balancing approaches.
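A numpy sketch of the masking idea: per activation, measure how consistently the per-task gradients agree in sign, then probabilistically keep either the positive or the negative parts of all task gradients at once. This is an illustrative reimplementation of the idea, not the paper's reference code:

```python
import numpy as np

def graddrop(task_grads, rng):
    """Gradient Sign Dropout over per-task gradients at one activation layer.

    task_grads: list of equally shaped arrays, one gradient per task.
    For each activation, P measures how consistently positive the task
    gradients are; with probability P we keep only the positive parts,
    otherwise only the negative parts, so conflicting signs are dropped.
    """
    G = np.stack(task_grads)
    P = 0.5 * (1.0 + G.sum(axis=0) / (np.abs(G).sum(axis=0) + 1e-12))
    keep_pos = rng.uniform(size=P.shape) < P
    return sum(np.where(keep_pos, np.maximum(g, 0.0), np.minimum(g, 0.0))
               for g in task_grads)

rng = np.random.default_rng(0)
g = np.array([1.0, 2.0, 3.0])
combined = graddrop([g, g], rng)   # fully consistent gradients pass through intact
```

When all tasks agree in sign, P saturates and the full gradient sum passes through; when they conflict, one sign is picked at random, so the tasks never silently cancel each other.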

Multilingual Offensive Language Identification with Cross-lingual Embeddings

Offensive content is pervasive in social media and a cause for concern to companies and government organizations. Several studies have recently been published investigating methods to detect its various forms (e.g. hate speech, cyberbullying, and cyberaggression)… The clear majority of these studies deal with English, in part because most available annotated datasets contain English data.

Land Cover Semantic Segmentation Using ResUNet

In this paper we present our work on developing an automated system for land cover classification. The system takes a multiband satellite image of an area as input and outputs a land cover map of the area at the same resolution as the input.

Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems

Training an end-to-end (E2E) neural network requires large amounts of intent-labeled speech data. When only a tenth of the original data is available, intent classification accuracy degrades by 7.6% absolute. The proposed approaches recover 80% of the performance lost due to using limited intent-labeled speech data.

Multilingual Argument Mining: Datasets and Analysis

The growing interest in argument mining and computational argumentation brings with it a plethora of NLU tasks and corresponding datasets. In this work, we explore the potential of transfer learning using the multilingual BERT model to address argument mining tasks in non-English languages.

Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation

The large viewpoint variations experienced by aerial vehicles still pose major challenges for learning-based mapping approaches. The core of our method is a novel compact network that performs both depth completion and confidence estimation using an image-guided approach. Real-time performance onboard a GPU suitable for small flying robots is achieved by sharing deep features between the two tasks.

Large-scale natural language inference (NLI) datasets such as SNLI and MNLI have been created by asking crowdworkers to read a premise and write three new hypotheses, one for each possible semantic relationship (entailment, contradiction, and neutral). While this protocol has been used to create useful benchmark data, it remains unclear whether the writing-based annotation protocol is optimal for any purpose, since it has not been evaluated directly… Furthermore, there is ample evidence that crowdworker writing can introduce artifacts in the data.

Deep neural network architectures have attained remarkable improvements in scene understanding tasks. Using an efficient model is one of the most important constraints for limited-resource devices… Recently, several compression methods have been proposed to diminish the heavy computational burden and memory consumption.

ChrEn: Cherokee-English Machine Translation for Endangered Language Revitalization

Cherokee is a highly endangered Native American language spoken by the Cherokee people. Only approximately 2,000 fluent first-language Cherokee speakers remain in the world. To help save this endangered language, we introduce ChrEn, a Cherokee-English parallel dataset.

Classification of Epithelial Ovarian Carcinoma Whole-Slide Pathology Images Using Deep Transfer Learning

Ovarian cancer is the most lethal cancer of the female reproductive organs. There are 5 major histological subtypes of epithelial ovarian cancer, each with distinct morphological, genetic, and clinical features… Currently, these histotypes are determined by a pathologist's microscopic examination of tumor whole-slide images (WSI). The proposed algorithm achieved a mean accuracy of 87.54% and a Cohen's kappa of 0.8106 in the slide-level classification of 305 WSIs, performing better than a standard CNN and pathologists without gynecology-specific training.

CNN-Based Approach for Cervical Cancer Classification in Whole-Slide Histopathology Images

Cervical cancer will cause 460,000 deaths per year by 2040, approximately 90% of them among sub-Saharan African women. A constantly increasing incidence in Africa has made cervical cancer a priority for the World Health Organization (WHO) in terms of screening, diagnosis, and treatment.

Collective Wisdom: Improving Low-Resource Neural Machine Translation Using Adaptive Knowledge Distillation

The scarcity of parallel sentence pairs poses a significant hurdle for training high-quality Neural Machine Translation (NMT) models in bilingually low-resource scenarios. Different transferred models may have complementary semantic and/or syntactic strengths, so using only one model may be sub-optimal. We propose an effective adaptive knowledge distillation approach that dynamically adjusts the contribution of the teacher models during the distillation process.
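The summary leaves the weighting scheme unspecified; one plausible instantiation is to weight each teacher's distribution by the probability it assigns to the gold token before mixing the distillation targets. The function below is a hypothetical sketch along those lines, not the paper's actual method:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def adaptive_distill_target(teacher_logits, gold_idx):
    # Hypothetical adaptive weighting: teachers that assign higher
    # probability to the gold token get more weight, and the weighted
    # mixture of teacher distributions becomes the student's target.
    probs = [softmax(t) for t in teacher_logits]
    weights = softmax([np.log(p[gold_idx] + 1e-12) for p in probs])
    return sum(w * p for w, p in zip(weights, probs))

# Teacher 1 prefers the gold token (index 0); teacher 2 does not.
target = adaptive_distill_target([[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]], gold_idx=0)
```

Because the weights are recomputed per example, a teacher with complementary strengths dominates exactly on the examples where it is reliable.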

Convolutional Neural Networks for Classifying Melanoma Images

Many cancer cases are misdiagnosed early on, leading to severe consequences, including the death of the patient… There are also cases in which patients have other conditions that doctors misinterpret as skin cancer. In this work, we address the problem of skin cancer classification using convolutional neural networks.

fairseq S2T: Fast Speech-to-Text Modeling with fairseq

fairseq S2T follows fairseq's careful design for scalability and extensibility… We provide end-to-end workflows from data pre-processing and model training to offline (and online) inference. We implement state-of-the-art RNN-based as well as Transformer-based models and open-source detailed training recipes. fairseq's machine translation models and language models can be seamlessly integrated into workflows for multi-task learning.

Explicit Alignment Objectives for Multilingual Bidirectional Encoders

AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities. AMBER obtains gains of up to 1.1 average F1 on sequence tagging and up to 27.3 average accuracy on retrieval over the XLM-R-large model, which has 4.6x the parameters of AMBER.

Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language (in English: An Analysis of an RNN-based Transfer Learning Technique for a Low-Resource Language)

Low-resource languages such as Filipino suffer from data scarcity, which makes it challenging to develop NLP applications for the Filipino language. Transformer-based models are proven effective in low-resource tasks but face accessibility challenges due to their high compute and memory requirements.