Domain Adaptive Video Segmentation via Temporal Consistency Regularization

DA-VSN is a domain adaptive video segmentation network that addresses domain gaps in videos by temporal consistency regularization (TCR) for consecutive frames of target-domain videos. The network is based on cross-domain TCR that guides the prediction of target frames to have similar temporal consistency as that of source frames (learnt from annotated source data) via adversarial learning.…

SurfaceNet Adversarial SVBRDF Estimation from a Single Image

In this paper we present SurfaceNet, an approach for estimating spatially-varying bidirectional reflectance distribution function (SVBRDF) material properties from a single image. We pose the problem as an image translation task and propose a novel patch-based generative adversarial network that is able to produce high-quality, high-resolution surface reflectance maps.…

RGB Image Classification with Quantum Convolutional Ansaetze

Many quantum (convolutional) circuit ansaetze are proposed for grayscale image classification tasks with promising empirical results. But the intra-channel information that is useful for vision tasks is not extracted effectively. This is the first work of a quantum convolutional circuit to deal with RGB images with a higher test accuracy compared to the purely classical CNNs.…

OLR 2021 Challenge Datasets Rules and Baselines

This paper introduces the sixth Oriental Language Recognition (OLR) 2021 Challenge. It intends to improve the performance of language recognition systems and speech recognition systems within multilingual scenarios. The data profile, four tasks, two baselines, and the evaluation principles are presented.…

High Dimensional Differentially Private Stochastic Optimization with Heavy tailed Data

Differentially Private Stochastic Convex Optimization (DP-SCO) has been extensively studied in recent years. Most of the previous work can only handle either regular data distribution or irregular data in the low dimensional space case. To better understand the challenges arising from irregular data distribution, in this paper, we provide the first study on the problem with heavy-tailed data in the high dimensional space.…

LARGE Latent Based Regression through GAN Semantics

We propose a novel method for solving regression tasks using few-shot or weak supervision. At the core of our method is the fundamental observation that GANs are incredibly successful at encoding semantic information within their latent space, even in a completely unsupervised setting.…

Data driven deep density estimation

Density estimation plays a crucial role in many data analysis tasks. It is used in tasks as diverse as analyzing population data, spatial locations in 2D sensor readings, or reconstructing scenes from 3D scans. In this paper, we introduce a learned, data-driven deep density estimation (DDE) to infer PDFs in an accurate and efficient manner, while being independent of domain dimensionality or sample size.…

MCDAL Maximum Classifier Discrepancy for Active Learning

Recent state-of-the-art active learning methods have mostly leveraged Generative Adversarial Networks (GAN) for sample acquisition. However, GAN is usually known to suffer from instability and sensitivity to hyper-parameters. In contrast to these methods, we propose in this paper a novel active learning framework that we call Maximum Classifier Discrepancy for Active Learning (MCDAL) which takes the prediction discrepancies between multiple classifiers.…
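
As a rough illustration of acquiring samples by classifier disagreement, the sketch below ranks unlabeled samples by how much two classifiers' predictions differ. The discrepancy measure (mean absolute difference of softmax probabilities) and the names `discrepancy` and `select_for_labeling` are assumptions made for this example; the summary above does not say how MCDAL measures or uses the discrepancy.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discrepancy(logits_a, logits_b):
    """Mean absolute difference between two classifiers' class probabilities.

    This particular measure is an assumption chosen for illustration.
    """
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / len(pa)


def select_for_labeling(pool, budget):
    """pool: list of (sample_id, logits_a, logits_b) tuples.

    Returns the ids of the `budget` samples with the largest discrepancy.
    """
    ranked = sorted(pool, key=lambda item: discrepancy(item[1], item[2]), reverse=True)
    return [sample_id for sample_id, _, _ in ranked[:budget]]


# Hypothetical unlabeled pool with logits from two classifiers.
pool = [
    ("x0", [2.0, 0.0], [2.0, 0.0]),  # classifiers agree
    ("x1", [2.0, 0.0], [0.0, 2.0]),  # classifiers disagree strongly
    ("x2", [1.0, 0.0], [0.5, 0.0]),  # mild disagreement
]
print(select_for_labeling(pool, 2))
```

Samples on which the classifiers agree rank last, so the labeling budget goes to the samples the classifiers dispute most.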

3D Radar Velocity Maps for Uncertain Dynamic Environments

Future urban transportation concepts include a mixture of ground and air vehicles with varying degrees of autonomy in a congested environment. Safe and efficient transportation requires reasoning about the 3D flow of traffic and properly modeling uncertainty. This paper explores a Bayesian approach that captures our uncertainty in the map given training data.…

Detail Preserving Residual Feature Pyramid Modules for Optical Flow

Feature pyramids and iterative refinement have recently led to great progress in optical flow estimation. However, downsampling in feature pyramids can cause blending of foreground objects with background. We propose a novel Residual Feature Pyramid Module (RFPM) which retains important details in the feature map without changing the overall iterative design.…

AD GAN End to end Unsupervised Nuclei Segmentation with Aligned Disentangling Training

Aligned Disentangling Generative Adversarial Network (AD-GAN) introduces representation disentanglement to separate content representation from style representation. With this framework, spatial structure can be preserved explicitly, enabling a significant reduction of macro-level lossy transformation. AD-GAN leads to significant improvement over the current best unsupervised methods by an average 17.8% relatively (w.r.t.…

Standardized Max Logits A Simple yet Effective Approach for Identifying Unexpected Road Obstacles in Urban Scene Segmentation

Identifying unexpected objects on roads in semantic segmentation is crucial in safety-critical applications. Existing approaches use images of unexpected objects from external datasets or require additional training. We propose a simple yet effective approach that standardizes the max logits in order to align the different distributions and reflect the relative meanings of each predicted class.…
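
The idea of standardizing max logits per predicted class can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the helper names, the use of population standard deviation, and the small set of reference pixels are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def max_logit(logits):
    """Return (predicted_class, max_logit) for one pixel's logit vector."""
    cls = max(range(len(logits)), key=lambda c: logits[c])
    return cls, logits[cls]


def class_stats(pixel_logits):
    """Per-class mean and standard deviation of max logits over reference pixels."""
    buckets = defaultdict(list)
    for logits in pixel_logits:
        cls, value = max_logit(logits)
        buckets[cls].append(value)
    return {cls: (mean(vals), pstdev(vals)) for cls, vals in buckets.items()}


def standardized_max_logit(logits, stats, eps=1e-8):
    """Standardize a pixel's max logit with the statistics of its predicted class.

    Raises KeyError if the predicted class has no reference statistics.
    """
    cls, value = max_logit(logits)
    mu, sigma = stats[cls]
    return (value - mu) / (sigma + eps)


# Hypothetical reference pixels: class 0 tends to produce larger logits than class 1.
reference = [[10.0, 0.0], [12.0, 1.0], [0.0, 4.0], [1.0, 6.0]]
stats = class_stats(reference)

print(standardized_max_logit([9.0, 1.0], stats))  # low for class 0 despite raw value 9
print(standardized_max_logit([0.0, 5.0], stats))  # typical for class 1 despite raw value 5
```

In this toy data a raw max logit of 9 is unusually low for class 0, while 5 is typical for class 1, so the standardized score ranks the first pixel as more suspicious even though its raw max logit is higher.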

An Adaptive State Aggregation Algorithm for Markov Decision Processes

Value iteration is a well-known method of solving Markov Decision Processes. However, the computational cost of value iteration quickly becomes infeasible as the size of the state space increases. In this paper, we propose an intuitive algorithm for solving MDPs that reduces the cost of updates by dynamically grouping together states with similar cost-to-go values.…
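
For readers unfamiliar with the baseline, the sketch below shows plain value iteration on a tiny cost-minimizing MDP, followed by a simple bucketing of states whose cost-to-go values are close. The bucketing rule (`group_by_value` with a fixed interval width) is an assumption made only to illustrate what grouping states by similar values could look like; the summary does not describe the paper's actual aggregation procedure.

```python
def value_iteration(transitions, costs, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Standard value iteration for a cost-minimizing MDP.

    transitions[s][a] is a list of (probability, next_state) pairs;
    costs[s][a] is the immediate cost of taking action a in state s.
    """
    values = [0.0] * len(transitions)
    for _ in range(max_iters):
        new_values = [
            min(
                costs[s][a] + gamma * sum(p * values[t] for p, t in transitions[s][a])
                for a in range(len(transitions[s]))
            )
            for s in range(len(transitions))
        ]
        if max(abs(n - v) for n, v in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


def group_by_value(values, width):
    """Bucket states whose values fall in the same interval of size `width`."""
    groups = {}
    for state, v in enumerate(values):
        groups.setdefault(int(v // width), []).append(state)
    return list(groups.values())


# Hypothetical 3-state MDP: states 0 and 1 pay cost 1 to reach absorbing state 2.
transitions = [
    [[(1.0, 2)]],  # state 0, single action
    [[(1.0, 2)]],  # state 1, single action
    [[(1.0, 2)]],  # state 2 loops on itself
]
costs = [[1.0], [1.0], [0.0]]

values = value_iteration(transitions, costs)
print(values)
print(group_by_value(values, 0.5))
```

States 0 and 1 end up with the same cost-to-go and land in one group, which is the kind of redundancy an aggregation scheme can exploit to reduce the number of updates.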

Class Incremental Domain Adaptation with Smoothing and Calibration for Surgical Report Generation

Surgical reports aimed at understanding in robot-assisted surgery can contribute to documenting entry tasks and post-operative analysis. Despite the impressive outcome, the deep learning model's performance degrades when applied to different domains encountering domain shifts. In this work, we propose class-incremental domain adaptation (CIDA) with a multi-layer transformer-based model to tackle the new classes and domain shift in the target domain to generate surgical reports.…

Exploring Deep Registration Latent Spaces

Explainability of deep neural networks is one of the most challenging and interesting problems in the field. We show that such an approach can decompose the highly convoluted latent spaces of registration pipelines in an orthogonal space with several interesting properties.…

Human Pose Regression with Residual Log likelihood Estimation

Heatmap-based methods dominate in the field of human pose estimation by modelling the output distribution through likelihood heatmaps. Residual Log-likelihood Estimation (RLE) is effective, efficient and flexible. Compared to the conventional regression paradigm, regression with RLE brings 12.4 mAP improvement on MSCOCO without any test-time overhead.…

Human Pose Estimation from Sparse Inertial Measurements through Recurrent Graph Convolution

The AAGC-LSTM combines spatial and temporal dependency in a single network operation. This is made possible by equipping graph convolutions with adjacency adaptivity, which allows for learning unknown dependencies of the human body joints. To further boost accuracy, we propose longitudinal loss weighting to consider natural movement patterns, as well as body-aware contralateral data augmentation.…

Unsupervised Domain Adaptive 3D Detection with Multi Level Consistency

Deep learning-based 3D object detection has achieved unprecedented success with the advent of large-scale autonomous driving datasets. However, drastic performance degradation remains a critical challenge for cross-domain deployment. We propose a novel and unified framework, Multi-Level Consistency Network (MLC-Net), which employs a teacher-student paradigm to generate adaptive and reliable pseudo-targets.…

Provident Vehicle Detection at Night for Advanced Driver Assistance Systems

Current algorithms share one limitation: They rely on directly visible objects. This is a major drawback compared to human behavior, where indirect visual cues caused by the actual object (e.g., shadows) are already used intuitively to retrieve information. Humans already process light artifacts caused by oncoming vehicles to assume their future appearance, whereas current object detection systems rely on the oncoming vehicle’s direct visibility.…

Developing efficient transfer learning strategies for robust scene recognition in mobile robotics using pre trained convolutional neural networks

We present four different robust transfer learning strategies for robust mobile scene recognition. Fine-tuning in combination with extensive data augmentation improves accuracy and robustness in mobile robot place recognition. We achieved state-of-the-art results using various baseline convolutional neural networks and showed the robustness against lighting and viewpoint changes in challenging mobile robot place recognition.…

Photon Starved Scene Inference using Single Photon Cameras

Single-photon cameras (SPCs) are an emerging sensing modality capable of capturing images with high sensitivity. Despite having minimal read-noise, images captured by SPCs in photon-starved conditions still suffer from strong shot noise. We propose photon scale-space, a collection of high-SNR images spanning a wide range of photons-per-pixel (PPP) levels, as guides to train inference models on low photon flux images.…

Dense Supervision Propagation for Weakly Supervised Semantic Segmentation on 3D Point Clouds

Semantic segmentation on 3D point clouds is an important task for 3D scene understanding. We train a semantic point cloud segmentation network with only a small portion of points being labeled. We argue that we can better utilize the limited supervision information as we densely propagate the supervision signal from the labeled points to other points within and across the input samples.…

Pruning Ternary Quantization

The method significantly compresses neural network weights to a sparse ternary of [-1,0,1]. It can compress a ResNet-18 model from 46 MB to 955 KB and a ResNet-50 model from 99 MB to 3.3 MB (~30x). The top-1 accuracy on ImageNet drops slightly from 69.7% to 65.3% and from 76.15% to 74.47%.…

High order modal Discontinuous Galerkin Implicit Explicit Runge Kutta and Linear Multistep schemes for the Boltzmann model on general polygonal meshes

Discontinuous Galerkin (DG) Implicit-Explicit Runge Kutta schemes and Linear Multistep Methods based on Backward-Finite-Differences. The new methods are validated considering two-dimensional benchmark test cases typically used in the fluid dynamics community. A prototype engineering problem consisting of a supersonic flow around a NACA 0012 airfoil with space-time-dependent boundary conditions is presented for which the pressure coefficients are measured.…

Type based Enforcement of Infinitary Trace Properties for Java

A common approach to improve software quality is to use programming guidelines to avoid common kinds of errors. In this paper, we consider the problem of enforcing guidelines for Featherweight Java (FJ). We formalize guidelines as sets of finite or infinite execution traces and develop a region-based type and effect system for FJ that can enforce such guidelines.…

Reservoir Computing Approach for Gray Images Segmentation

The paper proposes a novel approach for grayscale image segmentation. It is based on extracting multiple features from a single feature per image pixel, using an echo state network. The newly extracted features, reservoir equilibrium states, reveal hidden image characteristics that improve its segmentation via a clustering algorithm.…

LocalGLMnet interpretable deep learning for tabular data

Deep learning models have gained great popularity in statistical modeling. The disadvantage of deep learning models is that their solutions are difficult to interpret and explain. We propose a new network architecture that shares similar features as generalized linear models, but provides superior predictive power benefiting from the art of representation learning.…

Resource Efficient Mountainous Skyline Extraction using Shallow Learning

Skyline plays a pivotal role in mountainous visual geo-localization and localization/navigation of planetary rovers/UAVs and virtual/augmented reality applications. We present a novel mountainous skyline detection approach where we adapt a shallow learning approach to learn a set of filters to discriminate between edges belonging to sky-mountain boundary and others coming from different regions.…

Cardiac CT segmentation based on distance regularized level set

A paper uses distance regularized level set (DRLSE) to explore the segmentation effect of epicardium and endocardium. Five CT images are used to verify the proposed method, and image quality evaluation indexes such as dice score and Hausdorff distance are used. The results showed that the researchers could separate the inner and outer membrane very well (endocardium dice = 0.9253, Hausdorff = 7.8740; epicardium dice = 0.9687).…
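
The two evaluation indexes mentioned above can be computed as in the following minimal sketch. It assumes binary masks given as nested lists and contours given as lists of (row, column) points; the function names and the toy inputs are illustrative and are not taken from the paper.

```python
import math


def dice_score(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    a = {(r, c) for r, row in enumerate(mask_a) for c, v in enumerate(row) if v}
    b = {(r, c) for r, row in enumerate(mask_b) for c, v in enumerate(row) if v}
    if not a and not b:
        return 1.0  # convention: two empty masks match perfectly
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty point sets."""

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))


# Toy example: the prediction misses one of the two ground-truth pixels.
truth = [[1, 1], [0, 0]]
pred = [[1, 0], [0, 0]]
print(dice_score(truth, pred))
print(hausdorff_distance([(0, 0), (0, 1)], [(0, 0)]))
```

A Dice score near 1 indicates strong overlap, while the Hausdorff distance reports the worst-case boundary error, so the two measures complement each other.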