A New Benchmark for Evaluation of Cross Domain Few Shot Learning

Recent progress on few-shot learning has largely relied on annotated data for meta-learning, sampled from the same domain as the novel classes. In this paper, we propose the cross-domain few-shot learning (CD-FSL) benchmark, consisting of images from diverse domains with varying similarity to ImageNet, including crop disease images, satellite images, and medical images.… Read the rest

A Simple Language Model for Task Oriented Dialogue

SimpleTOD is a simple approach to task-oriented dialogue. It uses a single causal language model trained on all sub-tasks recast as a single sequence prediction problem. This allows SimpleTOD to fully leverage transfer learning from pre-trained, open domain, causal language models such as GPT-2.… Read the rest
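The idea of recasting all sub-tasks as one sequence can be illustrated with a minimal sketch. The delimiter tokens (`<context>`, `<belief>`, `<action>`, `<response>`), the function name `serialize_turn`, and the example dialogue are assumptions made for illustration; they are not taken from the SimpleTOD code.

```python
from typing import Dict, List


def serialize_turn(context: List[str], belief: Dict[str, str],
                   actions: List[str], response: str) -> str:
    """Flatten one dialogue turn into a single training sequence.

    The delimiter tokens are hypothetical placeholders, not the tokens
    used by the original system.
    """
    context_text = " ".join(context)
    belief_text = ", ".join(f"{slot} {value}" for slot, value in belief.items())
    action_text = ", ".join(actions)
    # Dialogue history, belief state, actions, and response are concatenated,
    # so a causal language model can learn all sub-tasks with one
    # next-token prediction objective.
    return (
        f"<context> {context_text} </context> "
        f"<belief> {belief_text} </belief> "
        f"<action> {action_text} </action> "
        f"<response> {response} </response>"
    )


example = serialize_turn(
    context=["user: I need a cheap hotel in the north."],
    belief={"hotel pricerange": "cheap", "hotel area": "north"},
    actions=["hotel request stars"],
    response="system: How many stars would you like?",
)
print(example)
```

At inference time, a model trained on such sequences would be given only the context segment and asked to generate the remaining segments.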

Adversarial Deep Averaging Networks for Cross Lingual Sentiment Classification

Adversarial Deep Averaging Network (ADAN) can transfer knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists. ADAN has two discriminative branches: a sentiment classifier and an adversarial language discriminator. Both branches take input from a shared feature extractor to learn hidden representations that are simultaneously indicative of the classification task and invariant across languages.… Read the rest
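The two-branch layout described above can be sketched as a forward pass. This is a toy illustration only: the dimensions, the random weights, the two-class sentiment output, and all function names are assumptions, and the adversarial training loop is omitted.

```python
import random
from typing import List, Tuple

random.seed(0)
EMBED_DIM, HIDDEN_DIM = 4, 3  # toy sizes, chosen for illustration


def random_matrix(rows: int, cols: int) -> List[List[float]]:
    return [[random.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


W_shared = random_matrix(HIDDEN_DIM, EMBED_DIM)  # shared feature extractor
W_sentiment = random_matrix(2, HIDDEN_DIM)       # branch 1: sentiment classes (assumed 2)
W_language = random_matrix(1, HIDDEN_DIM)        # branch 2: language discriminator score


def extract_features(word_vectors: List[List[float]]) -> List[float]:
    # Deep averaging: average the word embeddings, then apply a ReLU layer.
    averaged = [sum(column) / len(word_vectors) for column in zip(*word_vectors)]
    return [max(0.0, h) for h in matvec(W_shared, averaged)]


def forward(word_vectors: List[List[float]]) -> Tuple[List[float], float]:
    features = extract_features(word_vectors)
    sentiment_logits = matvec(W_sentiment, features)   # trained on labeled source data
    language_score = matvec(W_language, features)[0]   # trained to tell languages apart
    return sentiment_logits, language_score


sentence = [[0.1, 0.2, 0.3, 0.4], [0.5, -0.1, 0.0, 0.2]]  # two toy word embeddings
logits, score = forward(sentence)
```

In adversarial training, the feature extractor would be updated to help the sentiment branch while making the language score uninformative, which is what pushes the shared features toward language invariance.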

A Survey of Unsupervised Deep Domain Adaptation

Deep learning has produced state-of-the-art results for a variety of tasks. While such approaches for supervised learning have performed well, they assume that training and testing data are drawn from the same distribution. To address this limitation, single-source unsupervised domain adaptation can handle situations where a network is trained on labeled data from a source domain and unlabeled data from a related but different target domain.… Read the rest

Adversarially Regularized Autoencoders

Deep latent variable models are now a key technique for representation learning of continuous structures. Applying similar methods to discrete structures, such as text sequences or discretized images, has proven to be more challenging. The approach presented here builds on the recently proposed Wasserstein autoencoder (WAE), which formalizes the adversarial autoencoder as an optimal transport problem.… Read the rest

Artistic style transfer for videos

In the past, manually re-drawing an image in a certain artistic style required a professional artist and a long time. Nowadays computers provide new possibilities. We present an approach that transfers the style from one image (for example, a painting) to a whole video sequence.… Read the rest

Attention Based Deep Learning Framework for Human Activity Recognition with User Adaptation

Sensor-based human activity recognition (HAR) requires predicting the action of a person based on sensor-generated time series data. The current state of the art is represented by deep learning architectures that automatically obtain high-level representations. We propose a novel deep learning framework, based on a purely attention-based mechanism, that overcomes the limitations of the current state of the art.… Read the rest

Attention Based Fully Convolutional Network for Speech Emotion Recognition

The proposed model outperforms state-of-the-art methods, with a weighted accuracy of 70.4% and an unweighted accuracy of 63.9%. The proposed attention mechanism makes the model aware of which time-frequency regions of the speech spectrogram are more emotion-relevant. Notably, a clear improvement is obtained with a model pre-trained on natural scene images.… Read the rest

Classification of COVID 19 in chest X ray images using DeTraC deep convolutional neural network

Chest X-ray is the first imaging technique that plays an important role in the diagnosis of COVID-19 disease. Due to the limited availability of annotated medical images, the classification of medical images remains the biggest challenge in medical diagnosis. Transfer learning offers a promising solution by transferring knowledge from generic object recognition tasks to domain-specific tasks.… Read the rest

CleanNet Transfer Learning for Scalable Image Classifier Training with Label Noise

Existing approaches that depend on human supervision are generally not scalable, as manually identifying correct or incorrect labels is time-consuming. To reduce the amount of human supervision for label noise cleaning, we introduce CleanNet, a joint neural embedding network. CleanNet can reduce the label noise detection error rate on held-out classes, where no human supervision is available, by 41.5% compared to current weakly supervised methods.… Read the rest

Comixify Transform video into a comics

Comixify uses a neural style algorithm based on Generative Adversarial Networks (GANs). A Web-based working application of video comixification is available at http://comixify.ii.pw.edu.pl. The final contribution of our work is a state-of-the-art keyframes extraction algorithm that selects a subset of frames from the video to provide the most comprehensive video context.… Read the rest

Common Voice A Massively Multilingual Speech Corpus

Common Voice is a massively-multilingual collection of transcribed speech intended for speech technology research and development. The most recent release includes 29 languages, and as of November 2019 there are a total of 38 languages collecting data. Over 50,000 individuals have participated so far, resulting in 2,500 hours of collected audio.… Read the rest

Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network

The proposed ResNet-50 shows improvements in top-1 accuracy from 76.3% to 82.78% on the ILSVRC2012 validation set. With these improvements, inference throughput only decreases from 536 to 312. Our approach achieved 1st place in the iFood Competition Fine-Grained Visual Recognition at CVPR 2019, and the source code and trained models are available at https://github.com/clovaai/assembled-cnn.… Read the rest

Contour Knowledge Transfer for Salient Object Detection

In recent years, deep Convolutional Neural Networks (CNNs) have broken all records in salient object detection. However, training such a deep model requires a large number of manual annotations. Our goal is to overcome this limitation by automatically converting an existing deep contour detection model into a salient object detection model without using any manual salient object masks.… Read the rest

Contrastive Representation Distillation

The RepDistiller is a new way to transfer representational knowledge from one neural network to another. The method sets a new state-of-the-art in many transfer tasks, and sometimes even outperforms the teacher network when combined with knowledge distillation. We formulate this objective as contrastive learning and demonstrate that our resulting new objective outperforms other cutting-edge distillers on a variety of knowledge transfer tasks.
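A contrastive objective between teacher and student representations can be sketched as follows. This uses an InfoNCE-style loss with cosine similarity as a stand-in; the paper's exact objective may differ, and the function names, the temperature value, and the toy vectors are assumptions.

```python
import math
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: List[float]) -> List[float]:
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def contrastive_loss(student: List[List[float]], teacher: List[List[float]],
                     temperature: float = 0.1) -> float:
    """InfoNCE-style loss: the student embedding of input i should match the
    teacher embedding of the same input i, and not those of other inputs."""
    student = [normalize(v) for v in student]
    teacher = [normalize(v) for v in teacher]
    total = 0.0
    for i, s in enumerate(student):
        logits = [dot(s, t) / temperature for t in teacher]
        # Numerically stable log-sum-exp over all teacher embeddings.
        max_logit = max(logits)
        log_norm = max_logit + math.log(sum(math.exp(l - max_logit) for l in logits))
        total += log_norm - logits[i]  # negative log-probability of the positive pair
    return total / len(student)


teacher_repr = [[1.0, 0.0], [0.0, 1.0]]
aligned_student = [[2.0, 0.0], [0.0, 3.0]]   # same directions as the teacher
shuffled_student = [[0.0, 3.0], [2.0, 0.0]]  # paired with the wrong inputs

aligned_loss = contrastive_loss(aligned_student, teacher_repr)
shuffled_loss = contrastive_loss(shuffled_student, teacher_repr)
```

A student whose representations line up with the teacher's for the same inputs gets a loss near zero, while mismatched pairings are penalized heavily.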

Controllable Artistic Text Style Transfer via Shape Matching GAN

Artistic text style transfer is the task of migrating the style from a source image to the target text to create artistic typography. The proposed method demonstrates its superiority over previous state-of-the-art methods in generating diverse, controllable and high-quality stylized text. The proposal is based on a novel bidirectional shape matching framework to establish an effective glyph-style mapping at various deformation levels without paired ground truth.… Read the rest

Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation

Unsupervised text attribute transfer automatically transforms a text to alter a specific attribute (e.g., sentiment) without using any parallel data, while simultaneously preserving its attribute-independent content. The dominant approaches try to model the content-independent attribute separately, e.g., learning different attributes’ representations or using multiple attribute-specific decoders.… Read the rest

Cutting the Error by Half Investigation of Very Deep CNN and Advanced Training Strategies for Document Image Classification

We present an exhaustive investigation of recent Deep Learning architectures, algorithms, and strategies for the task of document image classification to finally reduce the error by more than half. Existing approaches, such as the DeepDocClassifier, apply standard Convolutional Network architectures with transfer learning from the object recognition domain.… Read the rest

Deep learning approaches in food recognition

The contents of food dishes are typically deformable objects, usually including complex semantics. Traditional image analysis approaches have achieved low classification accuracy in the past. Three main lines of solutions are outlined for this task: design from scratch, transfer learning, and platform-based approaches.… Read the rest

Deep Learning for ECG Analysis Benchmarks and Insights from PTB XL

Convolutional neural networks, in particular ResNet- and Inception-based architectures, show the strongest performance across all tasks. These results are complemented by deeper insights into the classification algorithm in terms of hidden stratification, model uncertainty and an exploratory interpretability analysis. With this resource, we aim to establish the PTB-XL dataset as a resource for structured benchmarking of ECG analysis algorithms.… Read the rest

Deep Photo Style Transfer

This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style. We show that this approach successfully suppresses distortion and yields satisfying photorealistic style transfers in a broad variety of scenarios, including transfer of the time of day, weather, season, and artistic edits.… Read the rest

Deformable GANs for Pose based Human Image Generation

In this paper we address the problem of generating person images conditioned on a given pose. Given an image of a person and a target pose, we synthesize a new image of that person in the novel pose. To deal with pixel-to-pixel misalignments caused by the pose differences, we introduce deformable skip connections in the generator of our Generative Adversarial Network.… Read the rest

Dense Intrinsic Appearance Flow for Human Pose Transfer

We present a novel approach for the task of human pose transfer. We address the issues of limited correspondences identified between keypoints only and invisible pixels due to self-occlusion. Unlike existing methods, we propose to estimate dense and intrinsic 3D appearance flow to better guide the transfer of pixels between poses.… Read the rest

DiscoFuse A Large Scale Dataset for Discourse Based Sentence Fusion

Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we propose a method for automatically generating fusion examples from raw text and present DiscoFuse, a large-scale dataset for discourse-based sentence fusion.… Read the rest

Disentangled Person Image Generation

This work aims to generate novel yet realistic images of persons, based on a novel two-stage reconstruction pipeline. A multi-branched reconstruction network is proposed to disentangle and encode the three factors (foreground, background, and pose) into embedding features, which are then combined to re-compose the input image itself.… Read the rest

Diversified Arbitrary Style Transfer via Deep Feature Perturbation

Image style transfer is an underdetermined problem, where a large number of solutions can satisfy the same constraint (the content and style). The key idea of our method is an operation called deep feature perturbation (DFP), which uses an orthogonal random noise matrix to perturb the deep image feature maps while keeping the original style information unchanged.… Read the rest
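Why an orthogonal matrix can perturb features without changing their style statistics can be seen in a small numeric sketch. It assumes that style is summarized by the Gram matrix of the feature map and that the features have been whitened (Gram matrix equal to the identity), as in whitening-and-coloring style transfer. The 2x2 rotation stands in for a general random orthogonal matrix, and all names and values are illustrative.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m: Matrix) -> Matrix:
    return [list(column) for column in zip(*m)]


def random_orthogonal_2x2(rng: random.Random) -> Matrix:
    # A rotation by a random angle is orthogonal: Z Z^T = I.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


# Toy whitened feature map: 2 channels x 4 spatial positions, F F^T = I.
whitened = [[0.5, 0.5, 0.5, 0.5],
            [0.5, -0.5, 0.5, -0.5]]

noise = random_orthogonal_2x2(random.Random(0))
perturbed = matmul(noise, whitened)  # Z F: the feature values change

gram_before = matmul(whitened, transpose(whitened))
gram_after = matmul(perturbed, transpose(perturbed))  # Z F F^T Z^T = Z Z^T = I
```

The perturbed features differ from the originals, yet `gram_after` matches `gram_before` up to rounding, so different random matrices would give different outputs with the same second-order statistics.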

Document Image Classification with Intra Domain Transfer Learning and Stacked Generalization of Deep Convolutional Neural Networks

A region-based Deep Convolutional Neural Network framework is proposed for document structure learning. The proposed method achieves state-of-the-art accuracy of 92.2% on the popular RVL-CDIP document image dataset, exceeding benchmarks set by existing algorithms. The contribution of this work involves efficient training of region-based classifiers and effective ensembling for document image classification. A primary level of 'inter-domain' transfer learning is used by exporting weights from a pre-trained VGG16 architecture on the ImageNet dataset to train a document classifier on whole document images.… Read the rest

Don't Just Scratch the Surface Enhancing Word Representations for Korean with Hanja

We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i.e., Hanja). We employ cross-lingual transfer learning in training word representations by leveraging the fact that Hanja is closely related to Chinese. We evaluate the intrinsic quality of representations learned through our approach using the word analogy and similarity tests.… Read the rest

Double Double Descent On Generalization Errors in Transfer Learning between Linear Regression Tasks

We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task.… Read the rest
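The parameter transfer mechanism can be illustrated with a two-parameter toy problem. This sketch only shows the constraint itself: one coefficient is fixed to a value from a source task and the other is fit by least squares. It does not reproduce the overparameterized, interpolating setting studied in the paper, and the data and names are made up for illustration.

```python
from typing import List, Tuple


def fit_with_transfer(rows: List[Tuple[float, float]], targets: List[float],
                      transferred_weight: float) -> float:
    """Fit y ~ w1*x1 + w2*x2 with w1 fixed to the source-task value.

    Returns the least-squares estimate of the free coefficient w2.
    """
    # Subtract the contribution of the transferred (frozen) parameter.
    residuals = [y - transferred_weight * x1 for (x1, _), y in zip(rows, targets)]
    # One-dimensional least squares on the residuals.
    numerator = sum(x2 * r for (_, x2), r in zip(rows, residuals))
    denominator = sum(x2 * x2 for _, x2 in rows)
    return numerator / denominator


# Toy target task generated by y = 2*x1 - 1*x2 (no noise).
rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
targets = [2.0 * x1 - 1.0 * x2 for x1, x2 in rows]

w2_good_source = fit_with_transfer(rows, targets, transferred_weight=2.0)
w2_poor_source = fit_with_transfer(rows, targets, transferred_weight=1.5)
```

When the source value matches the target's true coefficient, the free parameter is recovered exactly; when the source task is less related, the error in the transferred coefficient leaks into the estimate of the free one.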

DRCD a Chinese Machine Reading Comprehension Dataset

In this paper, we introduce DRCD (Delta Reading Comprehension Dataset), an open domain traditional Chinese machine reading comprehension (MRC) dataset. The dataset contains 10,014 paragraphs from 2,108 Wikipedia articles and 30,000+ questions generated by annotators. We build a baseline model that achieves an F1 score of 89.59%.

Effective Cross lingual Transfer of Neural Machine Translation Models without Shared Vocabularies

Transfer learning or a multilingual model is essential for low-resource neural machine translation (NMT), but its applicability is limited to cognate languages that share vocabularies. This paper shows effective techniques to transfer a pre-trained NMT model to a new, unrelated language. We relieve the vocabulary mismatch by using cross-lingual word embedding, train a more language-agnostic encoder by injecting artificial noise, and generate synthetic data easily from the pre-training data without back-translation.… Read the rest