Multi-task Learning of Negation and Speculation for Targeted Sentiment Classification

The majority of work in targeted sentiment analysis has concentrated on finding better methods to improve overall results. In this paper we show that these models are not robust to linguistic phenomena, specifically negation and speculation. We propose a multi-task learning method that incorporates information from syntactic and semantic auxiliary tasks to create models that are more robust to these phenomena. …
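
The abstract names multi-task learning with syntactic and semantic auxiliary tasks but does not spell out the wiring. As a minimal sketch of how such a setup is commonly assembled (the BiLSTM encoder, head names, and loss weights below are illustrative assumptions, not details taken from the paper), a shared encoder can feed a sentiment head plus auxiliary negation- and speculation-scope heads whose losses are summed:

```python
# Minimal multi-task sketch (hypothetical sizes and loss weights, not the paper's exact model):
# a shared encoder feeds a main sentiment head and auxiliary negation/speculation scope heads.
import torch
import torch.nn as nn

class MultiTaskTagger(nn.Module):
    def __init__(self, vocab_size=10000, dim=128, n_sentiment=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.encoder = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.sentiment_head = nn.Linear(2 * dim, n_sentiment)  # main task, per token
        self.negation_head = nn.Linear(2 * dim, 2)             # auxiliary: negation scope
        self.speculation_head = nn.Linear(2 * dim, 2)          # auxiliary: speculation scope

    def forward(self, token_ids):
        h, _ = self.encoder(self.embed(token_ids))
        return self.sentiment_head(h), self.negation_head(h), self.speculation_head(h)

model = MultiTaskTagger()
ce = nn.CrossEntropyLoss()
tokens = torch.randint(0, 10000, (4, 20))                     # toy batch of token ids
sent_y = torch.randint(0, 3, (4, 20))
neg_y, spec_y = torch.randint(0, 2, (4, 20)), torch.randint(0, 2, (4, 20))
sent_logits, neg_logits, spec_logits = model(tokens)
# Joint objective: main loss plus down-weighted auxiliary losses (weights are illustrative).
loss = (ce(sent_logits.transpose(1, 2), sent_y)
        + 0.3 * ce(neg_logits.transpose(1, 2), neg_y)
        + 0.3 * ce(spec_logits.transpose(1, 2), spec_y))
loss.backward()
```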

Hydrocephalus verification on brain magnetic resonance images with deep convolutional neural networks and transfer learning technique

Deep learning is an evolving technology and part of the broader field of machine learning… Deep learning is currently being actively researched in the field of radiology. Hydrocephalus can be either an independent disease or a concomitant symptom of a number of pathologies, and therefore represents a pressing issue in present-day clinical practice. …
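
As a generic illustration of the transfer learning recipe the abstract refers to (the ResNet-18 backbone, layer freezing, and two-class head below are assumptions, not the authors' reported configuration), an ImageNet-pretrained CNN can be adapted to a hydrocephalus-versus-normal decision by swapping its final layer and fine-tuning:

```python
# Hedged sketch: fine-tune an ImageNet-pretrained CNN for binary MRI classification.
# Backbone choice, freezing policy, and optimizer settings are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
for p in model.parameters():                      # freeze the pretrained feature extractor
    p.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 2)     # new head: hydrocephalus vs. normal

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 224, 224)              # stand-in for preprocessed MRI slices
labels = torch.randint(0, 2, (8,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```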

A Neural Approach to Irony Generation

Many prior studies have addressed irony detection, but few focus on irony generation. Irony can express stronger emotions and convey a sense of humor. The main challenges for irony generation are the lack of a large-scale irony dataset and the difficulty of modeling ironic patterns. …

What makes multilingual BERT multilingual?

Multilingual BERT works remarkably well on cross-lingual transfer tasks, outperforming static non-contextualized word embeddings. We find that data size and context window size are crucial factors for cross-lingual transferability. We provide an in-depth experimental study to supplement the existing literature.

Learning Depth from Monocular Videos Using Synthetic Data: A Temporally Consistent Domain Adaptation Approach

The majority of state-of-the-art monocular depth estimation methods are supervised learning approaches. The success of such approaches depends heavily on high-quality depth labels, which are expensive to obtain. Some recent methods instead learn depth networks by leveraging unsupervised cues from monocular videos, which are easier to acquire but less reliable. …

Improved Generalization of Arabic Text Classifiers

Transfer learning for text has been very active for English, but progress for Arabic has been slow. Domain adaptation is used to generalize the performance of a classifier by balancing its accuracy on a particular task across different text domains… In this paper, we propose and evaluate two variants of a domain adaptation technique. …

Increasing Shape Bias in ImageNet Trained Networks Using Transfer Learning and Domain Adversarial Methods

Convolutional neural networks (CNNs) have become the state-of-the-art method for learning from image data. Recent works have attempted to increase the shape bias of CNNs in order to train more robust and accurate networks. Our results show that the proposed method increases the robustness and shape bias of CNNs, but does not provide a gain in accuracy. …
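
The abstract does not detail the domain adversarial component; the standard way such a component is built (this gradient reversal formulation follows the widely used DANN recipe and is an assumption about the general technique, not this paper's exact setup) is to reverse the gradients that a domain classifier sends back into the shared features:

```python
# Sketch of a gradient reversal layer, the usual building block of domain adversarial training.
# The feature extractor, heads, and lambda value are illustrative, not the paper's exact design.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None       # flip the gradient sign for the features

features = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
label_head = nn.Linear(128, 10)                   # e.g. object classes
domain_head = nn.Linear(128, 2)                   # e.g. natural vs. stylized images

x = torch.randn(16, 3, 32, 32)
y = torch.randint(0, 10, (16,))
d = torch.randint(0, 2, (16,))

f = features(x)
ce = nn.CrossEntropyLoss()
loss = ce(label_head(f), y) + ce(domain_head(GradReverse.apply(f, 1.0)), d)
loss.backward()           # the reversed domain loss pushes features toward domain invariance
```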

Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks

Knowledge distillation (KD) has been proposed to transfer information learned from one model to another. KD is often characterized by the so-called 'Student-Teacher' (S-T) learning framework and has been broadly applied in model compression and knowledge transfer. We discuss the potential and open challenges of existing methods and outline future directions for KD and S-T learning. …
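
For readers new to the S-T framework mentioned above, the canonical distillation objective (Hinton-style soft targets; the temperature and mixing weight here are illustrative choices, not values from the survey) combines a hard-label loss with a KL term between softened teacher and student outputs:

```python
# Minimal sketch of the classic soft-target distillation loss used in Student-Teacher learning.
# Temperature T and mixing weight alpha are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)   # rescale to keep gradients comparable
    return alpha * soft + (1.0 - alpha) * hard

student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```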

Adaptive Structured Sparse Network for Efficient CNNs with Feature Regularization

Recent algorithms tend to be too structurally complex to deploy on embedded systems. We propose a feature regularization method that can generate input-dependent structured sparsity for hidden features. Our method can improve the sparsity level of intermediate features by 60% to over 95% through pruning along the channel dimension for each pixel, thus relieving the computational and memory burden. …
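
The abstract describes the goal (input-dependent, per-pixel channel sparsity) rather than the mechanism; one plausible reading, sketched below purely for illustration (the gating branch and L1-style penalty are assumptions, not the paper's exact method), is a lightweight layer that predicts a gate for every channel at every pixel and is pushed toward zero by a regularizer:

```python
# Hedged sketch of input-dependent structured sparsity: a 1x1 conv predicts a gate for each
# channel at each pixel, and an L1-style penalty drives many gates (and features) to zero.
import torch
import torch.nn as nn

class GatedBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.gate = nn.Conv2d(channels, channels, 1)   # per-pixel, per-channel gate

    def forward(self, x):
        h = torch.relu(self.conv(x))
        g = torch.relu(self.gate(x))                   # ReLU lets gates reach exactly zero
        return h * g, g

block = GatedBlock(32)
x = torch.randn(2, 32, 16, 16)
out, gates = block(x)
task_loss = out.mean()                                 # stand-in for the real task loss
sparsity_loss = gates.abs().mean()                     # feature regularization term
(task_loss + 0.1 * sparsity_loss).backward()
print(f"fraction of zeroed gate entries: {(gates == 0).float().mean():.2f}")
```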

Model Based and Data Driven Strategies in Medical Image Computing

Model-based approaches for image reconstruction, analysis and interpretation have made significant progress over the last decades. With the availability of large amounts of imaging data and machine learning techniques, data-driven approaches have become more widespread. These approaches learn statistical models directly from labelled or unlabelled image data and have been shown to be very powerful for extracting clinically useful information from medical imaging. …

A Combinatorial Perspective on Transfer Learning

Human intelligence is characterized not only by the capacity to learn complex skills, but also by the ability to rapidly adapt and acquire new skills within an ever-changing environment. In this work we study how learning modular solutions can allow for effective generalization to both unseen and potentially differently distributed data. …

Inferring symmetry in natural language

A hybrid transfer learning model that integrates linguistic features with contextualized language models most faithfully predicts the empirical data. Our work integrates existing approaches to symmetry in natural language and suggests how symmetry inference can improve systematicity in state-of-the-art language models. …

Multi-dimensional Style Transfer for Partially Annotated Data Using Language Models as Discriminators

Style transfer has been widely explored in natural language generation with non-parallel corpora by extracting a notion of style from the source and target domain corpora. A common aspect of the existing approaches is the prerequisite of joint annotations across all the stylistic dimensions under consideration… The availability of such datasets across a combination of styles is a limiting factor in extending state-of-the-art style transfer setups to multiple style dimensions. …

MCGKT-Net: Multi-level Context Gating Knowledge Transfer Network for Single Image Deraining

Rain streak removal from a single image is a very challenging task due to its inherently ill-posed nature. Conventional DCNN-based deraining methods have struggled to exploit deeper and more complex network architectures in pursuit of better performance. This study proposes a novel MCGKT-Net, a naturally multi-scale learning framework, for boosting deraining performance. …

KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text Classification for Kinyarwanda and Kirundi

Kinyarwanda has been studied in Natural Language Processing (NLP) to some extent. This work constitutes the first study on Kirundi. The design of the created datasets allows for wider use in NLP beyond text classification in future studies, such as representation learning, cross-lingual learning with more distant languages, or as a basis for new annotations for tasks such as parsing, POS tagging, and NER. …

Blind Video Temporal Consistency via Deep Video Prior

Applying image processing algorithms independently to each video frame often leads to temporal inconsistency in the resulting video. Our method is trained directly on a single pair of original and processed videos rather than on a large dataset. We demonstrate the effectiveness of our approach on 7 computer vision tasks on videos. …
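
As a rough sketch of what "trained on a pair of original and processed videos" can look like in practice (the tiny network, L1 loss, and epoch count are generic stand-ins, not the authors' architecture), a single image-to-image network is fitted to reproduce the processed video from the original one, and the network's own prior smooths frame-to-frame flicker:

```python
# Hedged sketch: fit one small image-to-image network on a single (original, processed) video pair.
# The architecture and loss are illustrative; the point is that no external training set is used.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 3, 3, padding=1))
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

original = torch.rand(30, 3, 64, 64)      # stand-in for the input video frames
processed = torch.rand(30, 3, 64, 64)     # stand-in for the flickery per-frame processed result

for epoch in range(5):                    # iterate over the same single video pair
    for t in range(original.shape[0]):
        loss = (net(original[t:t + 1]) - processed[t:t + 1]).abs().mean()
        opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():                     # the network's output is the temporally stabilized video
    stable = torch.cat([net(original[t:t + 1]) for t in range(original.shape[0])])
```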

Cross-Lingual Transfer Learning for Question Answering

Deep learning based question answering (QA) on English documents has achieved success because a large number of English training examples are available. For most other languages, training examples for high-quality QA models are not available. A machine translation (MT) based approach translates the source language into the target language, or vice versa. …

Investigating the True Performance of Transformers in Low-Resource Languages: A Case Study in Automatic Corpus Creation

Transformers have represented the state of the art in Natural Language Processing (NLP) in recent years, proving effective even for tasks in low-resource languages. While pretrained transformers for these languages can be built, it is challenging to measure their true performance and capacity due to the lack of hard benchmark datasets, as well as the difficulty and cost of producing them… In this paper, we present three contributions: first, we propose a methodology for automatically producing Natural Language Inference (NLI) benchmark datasets for low-resource languages using published news articles. …

Towards Fully Bilingual Deep Language Modeling

Language models based on deep neural networks have facilitated great advances in natural language processing and understanding tasks in recent years. However, multilinguality has come at a cost in terms of monolingual performance. The best-performing models at most tasks not involving cross-lingual transfer remain monolingual… In this paper, we consider the question of whether it is possible to pre-train a bilingual model for two remotely related languages without compromising performance in either language. …

Movement Pruning: Adaptive Sparsity by Fine-Tuning

Magnitude pruning is a widely used strategy for reducing model size in pure supervised learning. However, it is less effective in the transfer learning regime that has become standard for state-of-the-art natural language processing applications. We propose movement pruning, a simple, deterministic first-order weight pruning method that is more adaptive to pretrained model fine-tuning… We give mathematical foundations to the method and compare it to existing zeroth- and first-order pruning methods. …
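
To make the "first-order" idea concrete, here is a simplified rendition of the movement pruning score (the accumulation rule follows the paper's minus-weight-times-gradient importance, but the toy model, schedule, and hard thresholding are illustrative simplifications): weights that move toward zero during fine-tuning accumulate low scores and are masked out.

```python
# Simplified movement pruning sketch: accumulate S += -W * dL/dW during fine-tuning and mask
# the lowest-scoring weights. The toy model, data, and 80% sparsity level are illustrative.
import torch
import torch.nn as nn

layer = nn.Linear(64, 64)
scores = torch.zeros_like(layer.weight)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)

for step in range(100):
    x, y = torch.randn(32, 64), torch.randn(32, 64)
    loss = ((layer(x) - y) ** 2).mean()
    opt.zero_grad(); loss.backward()
    scores += -layer.weight.detach() * layer.weight.grad   # rewards weights moving away from zero
    opt.step()

threshold = torch.quantile(scores.flatten(), 0.8)          # prune 80% of the weights
mask = (scores > threshold).float()
with torch.no_grad():
    layer.weight.mul_(mask)                                 # keep only high-movement weights
```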

Aesthetic Attributes Assessment of Images

Aesthetic attributes assessment of images is a new formulation of image aesthetic assessment, which predicts aesthetic attribute captions together with an aesthetic score for each attribute. The AMAN model outperforms the traditional CNN-LSTM model and the modern SCA-CNN model for image captioning. …

Improving significance of binary black hole mergers in Advanced LIGO data using deep learning: Confirmation of GW151216

We present a machine learning (ML) based strategy to search for compact binary coalescences in data from ground-based gravitational wave observatories. This is the first ML-based search that not only recovers all the binary black hole mergers in the first GW transients catalog (GWTC-1) but also makes a clean detection of GW151216, which was not significant enough to be included in the catalog. …

Aerial Single-View Depth Completion with Image-Guided Uncertainty Estimation

The large viewpoint variations experienced by aerial vehicles still pose major challenges for learning-based mapping approaches. The core of our method is a novel compact network that performs both depth completion and confidence estimation using an image-guided approach. Real-time onboard performance on a GPU suitable for small flying robots is achieved by sharing deep features between the two tasks. …

Hardware Conditioned Policies for Multi-Robot Transfer Learning

Deep reinforcement learning can be used to learn dexterous robotic policies, but it is challenging to transfer them to new robots with vastly different hardware properties. It is also prohibitively expensive to learn a new policy from scratch for each robot's hardware due to the high sample complexity of modern state-of-the-art algorithms… We propose a novel approach called Hardware Conditioned Policies, in which we train a universal policy conditioned on a vector representation of the robot hardware. …
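
The core idea, a universal policy that takes a hardware descriptor alongside the state, can be sketched as follows (the descriptor contents, network sizes, and action dimensionality are assumptions made for illustration; the paper also considers learning the hardware embedding, which is omitted here):

```python
# Hedged sketch of a hardware-conditioned policy: the network consumes the state concatenated
# with a vector describing the robot (e.g. link lengths, a joint-type flag); values are toy.
import torch
import torch.nn as nn

class HardwareConditionedPolicy(nn.Module):
    def __init__(self, state_dim, hardware_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + hardware_dim, hidden), nn.Tanh(),
                                 nn.Linear(hidden, hidden), nn.Tanh(),
                                 nn.Linear(hidden, action_dim))

    def forward(self, state, hardware_vec):
        return self.net(torch.cat([state, hardware_vec], dim=-1))

policy = HardwareConditionedPolicy(state_dim=10, hardware_dim=4, action_dim=3)
state = torch.randn(1, 10)
hardware_a = torch.tensor([[0.30, 0.25, 1.0, 0.0]])   # e.g. link lengths plus a joint-type flag
hardware_b = torch.tensor([[0.45, 0.20, 0.0, 1.0]])   # a different robot, same policy weights
print(policy(state, hardware_a), policy(state, hardware_b))
```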

Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options

Large-scale natural language inference (NLI) datasets such as SNLI or MNLI have been created by asking crowdworkers to read a premise and write three new hypotheses, one for each possible semantic relationship (entailment, contradiction, and neutral). While this protocol has been used to create useful benchmark data, it remains unclear whether the writing-based annotation protocol is optimal for any purpose, since it has not been evaluated directly… Furthermore, there is ample evidence that crowdworker writing can introduce artifacts into the data. …

Multilingual Offensive Language Identification with Cross-lingual Embeddings

Offensive content is pervasive in social media and a cause for concern for companies and government organizations. Several studies have recently been published investigating methods to detect the various forms of such content (e.g. hate speech, cyberbullying, and cyberaggression)… The clear majority of these studies deal with English, partly because most available annotated datasets contain English data. …

Multi-Stage Pre-training for Low-Resource Domain Adaptation

Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. We apply these approaches incrementally on a pre-trained RoBERTa-large LM and show considerable performance gains on three tasks in the IT domain: extractive reading comprehension, document ranking, and duplicate question detection. …
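
The vocabulary-extension step mentioned above is mechanically simple; with the Hugging Face transformers library it amounts to adding the new terms to the tokenizer and resizing the embedding matrix (the IT-domain terms listed below are invented placeholders, and continued domain pre-training would follow):

```python
# Hedged sketch of extending an LM's vocabulary with domain-specific terms before continued
# pre-training. The listed IT-domain terms are invented placeholders, not the paper's vocabulary.
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")

domain_terms = ["hypervisor", "kubectl", "stacktrace"]   # illustrative domain vocabulary
num_added = tokenizer.add_tokens(domain_terms)
model.resize_token_embeddings(len(tokenizer))            # grow the embedding matrix for new tokens
print(f"added {num_added} tokens; new vocab size: {len(tokenizer)}")
# Multi-stage domain pre-training on in-domain text and task fine-tuning would follow.
```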

Classification of Epithelial Ovarian Carcinoma Whole Slide Pathology Images Using Deep Transfer Learning

Ovarian cancer is the most lethal cancer of the female reproductive organs. There are 5 major histological subtypes of epithelial ovarian cancer, each with distinct morphological, genetic, and clinical features… Currently, these histotypes are determined by a pathologist's microscopic examination of tumor whole-slide images (WSI). The proposed algorithm achieved a mean accuracy of 87.54% and a Cohen's kappa of 0.8106 in the slide-level classification of 305 WSIs, performing better than a standard CNN and pathologists without gynecology-specific training.

Collective Wisdom: Improving Low-resource Neural Machine Translation using Adaptive Knowledge Distillation

The scarcity of parallel sentence pairs poses a significant hurdle for training high-quality Neural Machine Translation (NMT) models in bilingually low-resource scenarios. Different transferred models may have complementary semantic and/or syntactic strengths, so using only one model may be sub-optimal. We propose an effective adaptive knowledge distillation approach that dynamically adjusts the contribution of the teacher models during the distillation process. …
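
As a rough illustration of dynamically weighting several teachers (the weighting rule below, a softmax over negative per-teacher losses, is an assumption used for illustration rather than the paper's exact scheme), the student's distillation loss can be a per-batch weighted mixture of the teachers' soft targets:

```python
# Hedged sketch of adaptive multi-teacher distillation: weight each teacher per batch by how
# well it fits the data, then distill from the weighted mixture. The softmax-over-negative-loss
# weighting is illustrative, not necessarily the paper's exact formulation.
import torch
import torch.nn.functional as F

def adaptive_kd_loss(student_logits, teacher_logits_list, labels, T=2.0):
    teacher_losses = torch.stack([F.cross_entropy(t, labels) for t in teacher_logits_list])
    weights = F.softmax(-teacher_losses, dim=0)              # better teachers get larger weights
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    kd = sum(w * F.kl_div(log_p_student, F.softmax(t / T, dim=-1), reduction="batchmean")
             for w, t in zip(weights, teacher_logits_list))
    return kd * (T * T) + F.cross_entropy(student_logits, labels)

student = torch.randn(8, 100, requires_grad=True)            # toy target-vocabulary logits
teachers = [torch.randn(8, 100) for _ in range(3)]           # logits from 3 transferred teachers
labels = torch.randint(0, 100, (8,))
adaptive_kd_loss(student, teachers, labels).backward()
```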