This paper addresses the representational bottleneck in a network and proposes a set of design principles that significantly improve model performance. Slight changes to baseline networks that follow these principles lead to remarkable performance improvements on ImageNet classification. Code and pretrained models are available at http://www.clovai/ […]

This paper studies training from scratch with quantization-aware training (QAT). QAT has been applied to lossless conversion to lower-bit models, especially for INT8 quantization. Due to its training instability, QAT has required full-precision (FP) pre-trained weights for fine-tuning, and its performance is bounded by that of the original FP model with floating-point computations… Here, we […]
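
As a rough illustration of the INT8 quantization step that QAT simulates during training, the sketch below applies symmetric per-tensor "fake quantization": values are rounded onto a signed 8-bit grid and immediately mapped back to floats. This is a generic textbook formulation, not the paper's method; the function name `fake_quantize` and the symmetric per-tensor scaling are assumptions made for the example.

```python
def fake_quantize(values, num_bits=8):
    """Quantize a non-empty list of floats to signed integers, then dequantize.

    Assumption: symmetric per-tensor scaling (one scale for all values).
    Returns the dequantized values and the scale that was used.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values), 1.0  # nothing to scale
    scale = max_abs / qmax
    # Round to the nearest integer level and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    # Map the integer levels back to floats.
    return [q * scale for q in quantized], scale


weights = [0.5, -1.27, 0.003, 1.0]
dequantized, scale = fake_quantize(weights)
# Each value moves by at most half a quantization step.
max_error = max(abs(w - d) for w, d in zip(weights, dequantized))
```

In a typical QAT setup (again an assumption, not a detail taken from the summary), the forward pass would use the dequantized values so that the network learns to tolerate the rounding error.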

StyleSegor is an efficient and easy-to-use strategy to alleviate this inconsistency issue. A neural style transfer algorithm is applied to unlabeled data in order to minimize differences in image properties between labeled and unlabeled data. On a publicly available whole heart segmentation benchmarking dataset from the MICCAI HVSMR 2016 challenge, we have demonstrated an elevated […]

Diffractive networks merge wave optics with deep learning to design task-specific elements that all-optically perform various tasks such as object classification and machine vision. This learning-based diffractive pulse engineering framework can find broad applications in, e.g., communications, ultra-fast imaging, and spectroscopy. The results constitute the first demonstration of direct pulse shaping in the terahertz spectrum, […]

An automated Bayesian inference framework, called AutoBayes, explores different graphical models linking classifier, encoder, decoder, estimator, and adversary network blocks to optimize nuisance-invariant machine learning pipelines. The framework can also be used effectively in semi-supervised multi-class classification and reconstruction tasks for datasets in different domains. We benchmark the framework on a series of […]

This work is devoted to an unresolved problem of Artificial General Intelligence: the inefficiency of transfer learning. We present algorithms for training a Delta Schema Network (DSN), predicting future states of the environment, and planning actions that will lead to positive reward. DSN shows strong transfer learning performance on the classic […]

Deep neural networks are typically trained under a supervised learning framework where a model learns a single task using labeled data. Instead of relying solely on labeled data, practitioners can harness unlabeled or related data to improve model performance. Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art […]

Machine learning techniques have excelled in the automatic semantic analysis of images, reaching human-level performance on challenging benchmarks. Spell-correction systems are crucial for producing readable outputs. We demonstrate that learning purely on softmax inputs in combination with scarce training data yields overfitting, as the network learns the inputs by heart. In contrast, […]

Inspired by recent advances in natural texture synthesis, we train deep neural models to generate textures by non-linearly combining learned noise frequencies… To achieve a highly realistic output conditioned on an exemplar patch, we propose a novel loss function that combines ideas from both style transfer and generative adversarial networks. We train the synthesis […]

Several applications of Internet of Things (IoT) technology involve capturing data from multiple sensors, resulting in multi-sensor time series. Such approaches can struggle in the practical setting where different instances of the same device or equipment, such as mobiles, wearables, and engines, come with different combinations of installed sensors. We propose a novel […]

Fine-tuning pre-trained convolutional neural networks (CNNs) has been shown to work well for skin lesion classification. Skin cancer is among the most common cancer types. Dermoscopic image analysis improves the diagnostic accuracy for detection of malignant melanoma and other pigmented skin lesions when compared to unaided visual inspection… Hence, computer-based methods to support […]

Top-performing deep architectures are trained on massive amounts of labeled data. Domain adaptation often provides an attractive option given that labeled data of a similar nature but from a different domain (e.g., synthetic images) are available… Here, we propose a new approach to domain adaptation in deep architectures. We show that this adaptation behaviour […]

The only requirement is that there are some shared parameters in the top layers of the multi-lingual encoder. We show that transfer is possible even when there is no shared vocabulary across the monolingual corpora and also when the text comes from very different domains. We also show that representations from monolingual BERT […]

Deep learning has been widely adopted in automatic emotion recognition and has led to significant progress in the field. However, due to insufficient annotated emotion datasets, pre-trained models are limited in their generalization capability and thus perform poorly on novel test sets… To mitigate this challenge, transfer learning performing fine-tuning on pre-training […]

Cross-lingual transfer between typologically related languages has proven successful for the task of morphological inflection. However, if the languages do not share the same script, current methods yield more modest improvements… We explore the use of transliteration between related languages, as well as grapheme-to-phoneme conversion, as data preprocessing methods. We experimented with […]

A Gaussian-guided latent alignment approach aligns the latent feature distributions of the two domains. In this indirect way, distributions over samples are constructed on a common feature space, i.e., the space of the prior, which promotes better feature alignment. Extensive evaluations on eight benchmark datasets validate the superior knowledge transferability […]

The Manifold Embedded Distribution Alignment (MEDA) approach aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain. MEDA learns a domain-invariant classifier in the Grassmann manifold with structural risk minimization. Extensive experiments demonstrate that MEDA shows significant improvements in classification accuracy compared to state-of-the-art traditional and deep methods. […]

The Bottom-up Clustering (BUC) approach, based on hierarchical clustering, serves as a promising unsupervised clustering method… One key factor of BUC is the distance measurement strategy. We evaluate our method on large-scale re-ID datasets, including Market-1501, DukeMTMC-reID, and MARS. Extensive experiments show that our method obtains significant improvements over the state-of-the-art unsupervised methods, […]
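
To make the role of the distance measurement concrete, here is a minimal sketch of generic bottom-up (agglomerative) clustering: every sample starts as its own cluster, and the two closest clusters are merged until a target count is reached. The minimum pairwise Euclidean distance used here is only a placeholder assumption, not the strategy proposed in the paper, and the names `min_distance` and `bottom_up_cluster` are hypothetical.

```python
import math


def min_distance(cluster_a, cluster_b):
    """Placeholder cluster distance: smallest Euclidean distance between members."""
    return min(math.dist(p, q) for p in cluster_a for q in cluster_b)


def bottom_up_cluster(points, target_clusters, distance=min_distance):
    """Merge the two closest clusters repeatedly until target_clusters remain."""
    if target_clusters < 1:
        raise ValueError("target_clusters must be at least 1")
    clusters = [[p] for p in points]  # start with one cluster per sample
    while len(clusters) > target_clusters:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
    return clusters


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
clusters = bottom_up_cluster(points, target_clusters=3)
```

Swapping in a different `distance` function changes which clusters merge first, which is why the choice of distance measurement matters for this family of methods.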

MobileBERT is a thin version of BERT_LARGE, equipped with bottleneck structures and a carefully designed balance between self-attentions and feed-forward networks. It is 4.3x smaller and 5.5x faster than BERT_BASE while achieving competitive results on well-known benchmarks. On the SQuAD v1.1/v2.0 question answering task, MobileBERT achieves a dev F1 score of 90.0/79.2 (1.5/2.1 […]

In contrast to self-supervised learning, deep clustering is a very important and promising direction for unsupervised visual representation learning, since it requires little domain knowledge to design pretext tasks. However, embedding clustering is hard to extend to extremely large-scale datasets because it must store the global latent embedding of the entire dataset… In this […]

Deep neural networks (DNNs) exhibit knowledge transfer, which is critical to improving learning efficiency and learning in domains that lack high-quality training data. In this paper, we aim to turn the existence and pervasiveness of adversarial examples into an advantage. We show that composition with an affine function is sufficient to reduce the difference […]

Cross-lingual transfer learning (CLTL) is a viable method for building NLP models for a low-resource target language by leveraging labeled data from other (source) languages. Our model leverages adversarial networks to learn language-invariant features, and mixture-of-experts models to dynamically exploit the similarity between the target language and each individual source language. This enables our […]
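
The mixture-of-experts idea can be sketched as a softmax gate that weights one expert per source language. This is a generic illustration under assumptions, not the paper's architecture: the gate scores stand in for learned similarities between the target input and each source language, and the names `softmax` and `mixture_of_experts` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(expert_outputs, gate_scores):
    """Combine per-expert output vectors using softmax gate weights.

    expert_outputs: one output vector per source-language expert (assumed).
    gate_scores: one unnormalized similarity score per expert (assumed).
    """
    weights = softmax(gate_scores)
    dim = len(expert_outputs[0])
    combined = [
        sum(w * out[k] for w, out in zip(weights, expert_outputs))
        for k in range(dim)
    ]
    return combined, weights


# Two hypothetical experts that disagree; equal scores give an equal blend.
expert_outputs = [[1.0, 0.0], [0.0, 1.0]]
combined, weights = mixture_of_experts(expert_outputs, gate_scores=[0.0, 0.0])
# A much higher score for the first expert makes it dominate the mixture.
skewed, skewed_weights = mixture_of_experts(expert_outputs, gate_scores=[10.0, 0.0])
```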

Cross-domain NER is a challenging yet practical problem. We investigate a multi-cell compositional LSTM structure for multi-task learning. Theoretically, the resulting distinct feature distributions for each entity type make it more powerful for cross-domain transfer. Empirically, experiments on four few-shot and zero-shot datasets show our method significantly outperforms a series of multi- […]

jiant enables modular and configuration-driven experimentation with state-of-the-art models. It implements a broad set of tasks for probing, transfer learning, and multitask training experiments… jiant implements over 50 NLU tasks, including all GLUE and SuperGLUE benchmark tasks. It reproduces published performance on a variety of tasks and models, including BERT and RoBERTa.

Recent progress on few-shot learning largely relies on annotated data for meta-learning: base classes sampled from the same domain as novel classes. However, in many applications, collecting such data is infeasible or impossible… This leads to the cross-domain few-shot learning problem, where there is a large shift between base and novel class […]

We present easy-to-use, retrieval-focused multilingual sentence embedding models, made available on TensorFlow Hub. The models embed text from 16 languages into a shared semantic space using a multi-task trained dual-encoder that learns tied cross-lingual representations via translation bridge tasks (Chidambaram et al., 2018)… The models achieve a new state-of-the-art in performance on monolingual […]
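
Retrieval in a shared semantic space usually comes down to ranking candidates by vector similarity to the query. The sketch below shows that ranking step with cosine similarity. It does not call the released models: the three-dimensional vectors are made-up stand-ins for real sentence embeddings, and `cosine` and `retrieve` are hypothetical helper names.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query_vec, candidates, top_k=1):
    """Return the top_k candidate texts ranked by cosine similarity to the query."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


# Made-up embeddings: in practice these would come from the sentence encoder.
candidates = {
    "Wie ist das Wetter heute?": [0.9, 0.1, 0.0],
    "Où est la gare ?": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for an English query about the weather
best = retrieve(query, candidates)
```

Because all languages share one space, the same ranking code would work whether the query and candidates are in the same language or in different ones.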

The main clinical tool currently in use for the diagnosis of COVID-19 is the reverse transcription polymerase chain reaction (RT-PCR), which is expensive, less sensitive, and requires specialized medical personnel. X-ray imaging is an easily accessible tool that can be an excellent alternative for diagnosis. A public database was created by the authors […]

We propose Deep Distribution Transfer (DDT), a new transfer learning approach to address the problem of zero- and few-shot transfer in the context of facial forgery detection. We examine how well a model (pre-)trained with one forgery creation method generalizes to a previously unseen manipulation technique or a different dataset… To facilitate this transfer, we introduce […]

Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions, which are then used to augment the training of the model for improved robustness. We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples. […]

The proposed sentiment transfer algorithm can transfer the sentiment of images while keeping the content structure of the input image intact. It outperforms existing artistic and photorealistic style transfer algorithms in producing reliable sentiment transfer results with rich and exact details. We propose a global sentiment transfer step, which employs an optimization […]

COVID-CXNet uses deep convolutional neural networks on a large dataset. It is demonstrated that simple models, alongside the majority of pretrained networks in the literature, focus on irrelevant features for decision-making. This powerful model is capable of detecting the novel coronavirus pneumonia based on relevant and meaningful features with precise localization. It’s […]

Current regression-based methods for pose estimation are trained and evaluated scene-wise… They depend on the coordinate frame of the training dataset and show low generalization across scenes and datasets. We develop a deep adaptation network for learning scene-invariant image representations and use adversarial learning to generate such representations for model transfer. We […]