Instagram has become a great venue for amateur and professional photographers alike to showcase their work. Photographers trying to build a reputation on Instagram have to strike a balance between maximizing their followers’ engagement with their photos and maintaining their artistic style. We used transfer learning to adapt Xception, which is a […]
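As a rough illustration of this kind of adaptation, the sketch below freezes an ImageNet-pretrained Xception backbone and trains a small new head; the binary engagement label and the training data are hypothetical stand-ins, not the paper's actual setup.

```python
# Minimal transfer-learning sketch with Keras (illustrative; the paper's
# exact head, data, and training regime are not specified here).
import tensorflow as tf
from tensorflow.keras import layers, models

# Load Xception pre-trained on ImageNet, without its classification head.
base = tf.keras.applications.Xception(weights="imagenet", include_top=False,
                                      input_shape=(299, 299, 3))
base.trainable = False  # freeze the convolutional backbone

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(1, activation="sigmoid"),  # hypothetical: high vs. low engagement
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, engagement_labels, epochs=5)  # hypothetical data
```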

This paper introduces a new task of politeness transfer, which involves converting non-polite sentences to polite sentences while preserving the meaning. We design a tag-and-generate pipeline that identifies stylistic attributes and generates a sentence in the target style while preserving most of the source content. For politeness as well as five […]
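To make the pipeline idea concrete, here is a deliberately toy sketch of the tag-then-generate flow. In the paper both stages are learned models; the `tag` and `generate` functions and the marker lists below are hypothetical rule-based stand-ins.

```python
# Toy sketch of a tag-and-generate pipeline (hypothetical components).
def tag(sentence, style_markers):
    """Stage 1: replace style-bearing spans with a [TAG] placeholder."""
    for marker in style_markers:
        sentence = sentence.replace(marker, "[TAG]")
    return sentence

def generate(tagged_sentence, target_style_phrases):
    """Stage 2: fill each [TAG] slot with a phrase from the target style."""
    for phrase in target_style_phrases:
        tagged_sentence = tagged_sentence.replace("[TAG]", phrase, 1)
    return tagged_sentence

src = "Send me the report."
tagged = tag(src, style_markers=["Send me"])        # -> "[TAG] the report."
polite = generate(tagged, ["Could you please send me"])
print(polite)  # "Could you please send me the report."
```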

Batch Semi-Supervised Self-Organizing Map (Batch SS-SOM) is an extension of the SOM that incorporates advances that came with the rise of deep learning, such as batch training. It performs well in terms of accuracy and clustering error, even with a small number of labeled samples and when presented with unlabeled data. […]
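For readers unfamiliar with batch training in SOMs, the sketch below shows the standard batch update, in which each prototype is recomputed as a neighborhood-weighted mean of the batch; it covers only the unsupervised part, not the semi-supervised extension of Batch SS-SOM.

```python
# Minimal batch-SOM update sketch with NumPy (unsupervised part only).
import numpy as np

def batch_som_step(X, W, grid, sigma=1.0):
    """One batch update: assign samples to best-matching units (BMUs),
    then recompute each prototype as a neighborhood-weighted mean."""
    # BMU for every sample (Euclidean distance to each prototype).
    d = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2)   # (n, units)
    bmu = d.argmin(axis=1)
    # Gaussian neighborhood weight between every unit and each sample's BMU.
    grid_d = np.linalg.norm(grid[:, None, :] - grid[None, :, :], axis=2)
    h = np.exp(-grid_d[:, bmu] ** 2 / (2 * sigma ** 2))         # (units, n)
    return (h @ X) / h.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                                   # toy data
grid = np.array([(i, j) for i in range(4) for j in range(4)], dtype=float)
W = rng.normal(size=(16, 3))                                    # prototypes
for _ in range(10):
    W = batch_som_step(X, W, grid)
```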

Research on knowledge-driven conversational systems is largely limited by the lack of dialogue data that consists of multi-turn conversations on multiple topics with knowledge annotations. In this paper, we propose a Chinese multi-domain knowledge-driven conversation dataset, KdConv. Our corpus contains 4.5K conversations from three domains (film, music, and travel), […]

Most approaches in few-shot learning rely on costly annotated data related to the goal task domain during (pre-)training. In settings with realistic domain shift, conventional transfer learning has been shown to outperform supervised meta-learning. We propose a transfer learning approach which constructs a metric embedding that clusters unlabeled prototypical samples and their augmentations closely […]

Deep learning has been adopted as an effective technique to aid COVID-19 detection and segmentation from computed tomography (CT) images… The major challenge lies in the inadequacy of public COVID-19 datasets. The results reveal the benefits of transferring knowledge from non-COVID-19 lung lesions, and learning from multiple lung lesion datasets can extract more general […]

Text style transfer aims to change the style of the input text to the target style while preserving the content to some extent. We aim to work on story-level text style transfer to generate stories that preserve the plot of the input story while exhibiting a strong target style. We plan to explore three methods including […]

Transfer learning has fundamentally changed the landscape of natural language processing (NLP) research. Many existing state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks… However, due to limited data resources from downstream tasks and the extremely large capacity of the models, the model often overfits the data […]

The goal of this work is to build conversational Question Answering (QA) interfaces for the large body of domain-specific information available in FAQ sites. We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs… The dialogues are collected from three Stack Exchange sites using the Wizard of Oz method with crowdsourcing. […]

The long-tail distribution in object relationships remains a challenging and persistent issue. Existing methods largely rely on external knowledge or statistical bias information to alleviate this problem. In this paper, we tackle the issue from two further aspects: (1) scene-object interaction aiming at learning specific knowledge from a scene via an additive attention mechanism […]
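As a reference point for the attention mechanism mentioned in (1), here is a generic Bahdanau-style additive attention module in PyTorch; the dimensions and the scene/object roles assigned to query and keys are illustrative assumptions, not the paper's exact design.

```python
# Generic additive (Bahdanau-style) attention over a set of object features.
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    def __init__(self, query_dim, key_dim, hidden_dim):
        super().__init__()
        self.w_q = nn.Linear(query_dim, hidden_dim, bias=False)
        self.w_k = nn.Linear(key_dim, hidden_dim, bias=False)
        self.v = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, query, keys):
        # query: (B, Dq) scene feature; keys: (B, N, Dk) object features.
        scores = self.v(torch.tanh(self.w_q(query).unsqueeze(1) + self.w_k(keys)))
        alpha = torch.softmax(scores.squeeze(-1), dim=1)         # (B, N)
        context = (alpha.unsqueeze(-1) * keys).sum(dim=1)        # (B, Dk)
        return context, alpha

attn = AdditiveAttention(512, 512, 256)
ctx, alpha = attn(torch.randn(2, 512), torch.randn(2, 36, 512))
```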

Text generation has played an important role in various applications of natural language processing (NLP). In recent studies, researchers are paying increasing attention to modeling and manipulating the style of the generated text. In this tutorial, we will provide a comprehensive literature review in this direction. We start from the definition of style […]

A large percentage of the world’s population speaks a language of the Indian subcontinent, comprising languages from both the Indo-Aryan (e.g., Hindi, Punjabi, Gujarati) and Dravidian families. A universal characteristic of Indian languages is their complex morphology, which, when combined with the lack of sufficient quantities of high-quality parallel data, can make developing machine translation […]

An increasing number of people in the world today speak a mixed language as a result of being multilingual. However, building a speech recognition system for code-switching remains difficult due to limited resources and the expense and significant effort required to collect mixed-language data. We propose a new learning method, […]

This is the CMU-LTI submission to the SIGMORPHON 2020 Shared Task 0 on typologically diverse morphological inflection. The (unrestricted) submission uses the cross-lingual approach of last year’s winning submission (Anastasopoulos and Neubig, 2019), but adapted to use specific transfer languages for each test language. Our system, with fixed non-tuned hyperparameters, achieved a macro-averaged accuracy of […]

Top-performing deep architectures are trained on massive amounts of labeled data. Domain adaptation often provides an attractive option given that labeled data of similar nature but from a different domain (e.g., synthetic images) are available… Here, we propose a new approach to domain adaptation in deep architectures. We show that this adaptation behaviour […]
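One well-known realization of this line of deep domain adaptation is the gradient reversal layer; the sketch below shows that mechanism in PyTorch, as an assumption about what the described approach involves, with the usual `lambd` schedule omitted.

```python
# Gradient reversal layer sketch (PyTorch). Identity on the forward pass,
# negated gradient on the backward pass, so a feature extractor trained
# against a domain classifier learns domain-invariant features.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the gradient flowing back into the feature extractor.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

features = torch.randn(8, 128, requires_grad=True)
domain_loss = grad_reverse(features, lambd=0.5).sum()
domain_loss.backward()  # features.grad now carries the *negated* gradient
```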

The only requirement is that there are some shared parameters in the top layers of the multilingual encoder. We show that transfer is possible even when there is no shared vocabulary across the monolingual corpora, and also when the text comes from very different domains. We also show that representations from monolingual BERT […]

Deep learning has been widely adopted in automatic emotion recognition and has led to significant progress in the field. However, due to insufficient annotated emotion datasets, pre-trained models are limited in their generalization capability and thus lead to poor performance on novel test sets… To mitigate this challenge, transfer learning that performs fine-tuning on pre-trained […]

Cross-lingual transfer between typologically related languages has been proven successful for the task of morphological inflection. However, if the languages do not share the same script, current methods yield more modest improvements… We explore the use of transliteration between related languages, as well as grapheme-to-phoneme conversion, as data preprocessing methods. We experimented with […]

We propose a Gaussian-guided latent alignment approach to align the latent feature distributions of the two domains. In this indirect way, distributions over samples are constructed on a common feature space, i.e., the space of the prior, which promotes better feature alignment. Extensive evaluations on eight benchmark datasets validate the superior knowledge transferability […]
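A minimal way to see how a shared Gaussian prior can align two domains indirectly is sketched below: both domains' encoded distributions are pulled toward N(0, I) with a KL term. The encoder architecture and the loss form are illustrative assumptions, not the paper's exact formulation.

```python
# Toy sketch: align source and target domains to a shared Gaussian prior.
import torch
import torch.nn as nn

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, mean over batch.
    return 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=1).mean()

# Hypothetical encoder producing mean and log-variance of a 32-d latent.
encoder = nn.Sequential(nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 2 * 32))

def latent_alignment_loss(x_src, x_tgt):
    loss = 0.0
    for x in (x_src, x_tgt):
        mu, logvar = encoder(x).chunk(2, dim=1)
        loss = loss + kl_to_standard_normal(mu, logvar)
    # Pulling both domains toward the same prior aligns them indirectly,
    # on the common feature space of the prior.
    return loss

print(latent_alignment_loss(torch.randn(16, 256), torch.randn(16, 256)))
```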

The Manifold Embedded Distribution Alignment (MEDA) approach aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain. MEDA learns a domain-invariant classifier in a Grassmann manifold with structural risk minimization. Extensive experiments demonstrate that MEDA shows significant improvements in classification accuracy compared to state-of-the-art traditional and deep methods. […]

The Bottom-Up Clustering (BUC) approach based on hierarchical clustering serves as one promising unsupervised clustering method… One key factor of BUC is the distance measurement strategy. We evaluate our method on large-scale re-ID datasets, including Market-1501, DukeMTMC-reID and MARS. Extensive experiments show that our method obtains significant improvements over the state-of-the-art unsupervised methods, […]
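To illustrate why the distance measurement strategy matters, here is a toy bottom-up merging loop using a minimum pairwise distance criterion between clusters; the features are random stand-ins for re-ID embeddings, and the criterion is one plausible choice rather than necessarily the paper's.

```python
# Toy bottom-up clustering sketch with NumPy (re-ID feature extraction omitted).
import numpy as np

def min_cluster_distance(feats, a, b):
    """Cluster distance = smallest pairwise sample distance (one option;
    swapping this function is exactly the 'distance measurement strategy')."""
    d = np.linalg.norm(feats[a][:, None, :] - feats[b][None, :, :], axis=2)
    return d.min()

def merge_once(feats, clusters):
    """Merge the two closest clusters and return the new partition."""
    pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
    i, j = min(pairs, key=lambda p: min_cluster_distance(feats, clusters[p[0]], clusters[p[1]]))
    merged = clusters[i] + clusters[j]
    return [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]

feats = np.random.default_rng(0).normal(size=(10, 64))  # placeholder embeddings
clusters = [[i] for i in range(10)]      # start: every sample is its own cluster
while len(clusters) > 3:
    clusters = merge_once(feats, clusters)
```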

MobileBERT is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attention and feed-forward networks. It is 4.3x smaller and 5.5x faster than BERT_BASE while achieving competitive results on well-known benchmarks. On the SQuAD v1.1/v2.0 question answering task, MobileBERT achieves a dev F1 score of 90.0/79.2 (1.5/2.1 […]
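The sketch below shows the general shape of a bottleneck transformer layer: the wide hidden state is projected down before the attention and feed-forward body and back up afterwards. The dimensions and layer details are illustrative, not MobileBERT's actual configuration.

```python
# Illustrative bottleneck transformer layer in the spirit of MobileBERT.
import torch
import torch.nn as nn

class BottleneckLayer(nn.Module):
    def __init__(self, wide=512, narrow=128, heads=4):
        super().__init__()
        self.down = nn.Linear(wide, narrow)    # bottleneck entry
        self.attn = nn.MultiheadAttention(narrow, heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(narrow, 4 * narrow), nn.GELU(),
                                 nn.Linear(4 * narrow, narrow))
        self.up = nn.Linear(narrow, wide)      # bottleneck exit

    def forward(self, x):
        h = self.down(x)                       # work in the narrow space
        h = h + self.attn(h, h, h)[0]          # self-attention sub-block
        h = h + self.ffn(h)                    # feed-forward sub-block
        return x + self.up(h)                  # residual in the wide space

layer = BottleneckLayer()
out = layer(torch.randn(2, 16, 512))           # (batch, seq_len, wide)
```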

Deep clustering, compared with other self-supervised learning approaches, is a very important and promising direction for unsupervised visual representation learning, since it requires little domain knowledge to design pretext tasks. However, embedding clustering limits its extension to extremely large-scale datasets because it requires saving the global latent embeddings of the entire dataset… In this […]

Deep neural networks (DNNs) exhibit knowledge transfer, which is critical to improving learning efficiency and learning in domains that lack high-quality training data. In this paper, we aim to turn the existence and pervasiveness of adversarial examples into an advantage. We show that composition with an affine function is sufficient to reduce the difference […]

Cross-lingual transfer learning (CLTL) is a viable method for building NLP models for a low-resource target language by leveraging labeled data from other (source) languages. Our model leverages adversarial networks to learn language-invariant features, and mixture-of-experts models to dynamically exploit the similarity between the target language and each individual source language. This enables our […]
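A minimal mixture-of-experts sketch along these lines: one expert per source language, with a gate that weights experts per input. The architecture below is a hypothetical simplification, not the paper's model.

```python
# Mixture-of-experts sketch (PyTorch): the gate softly weights per-source
# experts, so each prediction leans on the most similar source languages.
import torch
import torch.nn as nn

class MoEClassifier(nn.Module):
    def __init__(self, dim, n_experts, n_classes):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, n_classes)
                                     for _ in range(n_experts))
        self.gate = nn.Linear(dim, n_experts)

    def forward(self, x):
        weights = torch.softmax(self.gate(x), dim=-1)               # (B, E)
        logits = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, C)
        return (weights.unsqueeze(-1) * logits).sum(dim=1)          # (B, C)

model = MoEClassifier(dim=768, n_experts=3, n_classes=2)  # e.g., 3 source languages
out = model(torch.randn(4, 768))  # 4 language-invariant feature vectors
```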

Cross-domain NER is a challenging yet practical problem. We investigate a multi-cell compositional LSTM structure for multi-task learning. Theoretically, the resulting distinct feature distributions for each entity type make it more powerful for cross-domain transfer. Empirically, experiments on four few-shot and zero-shot datasets show our method significantly outperforms a series of multi- […]

jiant enables modular and configuration-driven experimentation with state-of-the-art models. It implements a broad set of tasks for probing, transfer learning, and multitask training experiments… jiant implements over 50 NLU tasks, including all GLUE and SuperGLUE benchmark tasks. It reproduces published performance on a variety of tasks and models, including BERT and RoBERTa.

Style transfer is an important problem in natural language processing (NLP). However, progress in language style transfer lags behind other domains, such as computer vision. In this paper, we propose to learn style transfer with non-parallel data. We explore two models to achieve this goal, and the key idea behind the proposed models is […]

We investigate supervised learning strategies that improve the training of neural network audio classifiers on small annotated collections. We study whether (i) a naive regularization of the solution space, (ii) prototypical networks, (iii) transfer learning, or (iv) their combination can help deep learning models better leverage a small amount of training examples. We evaluate […]
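For context on option (ii), the sketch below shows the core prototypical-network classification rule: class prototypes are mean embeddings of the support examples, and queries are scored by negative distance to each prototype. The audio encoder is abstracted away; the embeddings here are random placeholders.

```python
# Prototypical-network classification rule (PyTorch; encoder omitted).
import torch

def prototypical_logits(support, support_labels, queries, n_classes):
    # support: (Ns, D) embeddings; queries: (Nq, D) embeddings.
    prototypes = torch.stack([support[support_labels == c].mean(dim=0)
                              for c in range(n_classes)])          # (C, D)
    # Negative Euclidean distance acts as the logit for each class.
    return -torch.cdist(queries, prototypes)                       # (Nq, C)

support = torch.randn(20, 64)               # placeholder audio embeddings
labels = torch.arange(4).repeat(5)          # 4 classes, 5 shots each
queries = torch.randn(5, 64)
pred = prototypical_logits(support, labels, queries, n_classes=4).argmax(dim=1)
```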

Semi-supervised learning has lately shown much promise in improving deep learning models when labeled data is scarce. By substituting simple noising operations with advanced data augmentation methods, our method brings substantial improvements across six language and three vision tasks. On the IMDb text classification dataset, with only 20 labeled examples, our method achieves an error […]
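The mechanism behind this kind of augmentation-based semi-supervised learning can be sketched as a consistency loss between predictions on an unlabeled example and its augmented version; the toy `noise` function below merely stands in for the advanced augmentations (e.g., back-translation) the passage mentions.

```python
# Consistency-training sketch (PyTorch): predictions on an unlabeled input
# and its augmented version are pushed to agree via a KL term.
import torch
import torch.nn.functional as F

def consistency_loss(model, x_unlabeled, augment):
    with torch.no_grad():
        p_clean = F.softmax(model(x_unlabeled), dim=-1)   # fixed target
    logp_aug = F.log_softmax(model(augment(x_unlabeled)), dim=-1)
    return F.kl_div(logp_aug, p_clean, reduction="batchmean")

model = torch.nn.Linear(32, 10)                    # placeholder classifier
noise = lambda x: x + 0.1 * torch.randn_like(x)    # toy stand-in augmentation
loss = consistency_loss(model, torch.randn(8, 32), noise)
```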

This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings. Our method integrates multiple word embeddings created from complementary techniques, textual sources, knowledge bases and languages. Existing word vectors are projected to a common semantic space using linear transformations and averaging. We can leverage pre-trained source embeddings from a resource-rich language in order […]
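A bare-bones version of the projection-and-averaging step might look like the following; the projection matrices would normally be learned (e.g., by least squares over a shared vocabulary), and here they are random placeholders.

```python
# Meta-embedding sketch (NumPy): project each source space into a common
# space with a linear map, then average the projected vectors.
import numpy as np

def meta_embedding(vectors, projections):
    """vectors: list of (d_i,) word vectors from different sources;
    projections: list of (d_common, d_i) linear maps into the common space."""
    projected = [P @ v for P, v in zip(projections, vectors)]
    return np.mean(projected, axis=0)

rng = np.random.default_rng(0)
v_a, v_b = rng.normal(size=300), rng.normal(size=100)        # two sources
P_a, P_b = rng.normal(size=(200, 300)), rng.normal(size=(200, 100))
meta = meta_embedding([v_a, v_b], [P_a, P_b])                # (200,) vector
```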

Many pre-trained Teacher models used in transfer learning are publicly available and maintained by public platforms, increasing their vulnerability to backdoor attacks. We demonstrate a backdoor threat to transfer learning tasks on both image and time-series data leveraging the knowledge of publicly accessible Teacher models. We launch effective misclassification attacks on Student models over real-world […]

We propose the first end-to-end network for online video style transfer. It generates temporally coherent stylized video sequences in near real-time. We show that the proposed method clearly outperforms the per-frame baseline both qualitatively and quantitatively. It can achieve visually comparable coherence to optimization-based video style transfer, but is three orders of magnitude faster at runtime. […]

Many data sets in a domain (reviews, forums, news, etc.) exist in parallel languages. They all cover the same content, but the linguistic differences make it impossible to use traditional, bag-of-words-based topic models. Models have to be either single-language or suffer from a huge but extremely sparse vocabulary. Both issues can be addressed by transfer […]

Deep learning algorithms have been considered a methodology of choice for remote-sensing image analysis over the past few years. Thanks to its effective application, deep learning has also been introduced for automatic change detection and has achieved great success. This study will contribute in several ways to our understanding of deep learning for change detection. It […]

This paper digs deeper into factors that influence egocentric gaze. Bottom-up saliency and optical flow are assessed versus strong spatial prior baselines. Task-specific cues such as vanishing point, manipulation point, and hand regions are analyzed as representatives of top-down information. The knowledge transfer works best for cases where the tasks or sequences are similar, and […]

The proposed model uses a novel Dual-Attention unit to disentangle the knowledge of words in the textual representations and visual concepts in the visual representations. This disentangled task-invariant alignment of representations facilitates grounding and knowledge transfer across both tasks. We show that the proposed model outperforms a range of baselines on both tasks in simulated […]

We propose a Generative Transfer Network (GTNet) for zero-shot object detection (ZSD). GTNet consists of an Object Detection Module and a Knowledge Transfer Module. The Object Detection Module can learn large-scale seen-domain knowledge. The Knowledge Transfer Module leverages a feature synthesizer to generate unseen class features, which are applied to train a new […]

Generative autoencoders offer a promising approach for controllable text generation by leveraging their latent sentence representations. However, current models struggle to maintain the coherent latent spaces required to perform meaningful text manipulations via latent vector operations. We prove that this simple modification guides the latent space geometry of the resulting model by encouraging the encoder to map […]