A Requirements Engineering Technology for the IoT Software Systems

RETIoT (Requirements Engineering Technology for the Internet of Things based software systems) aims to provide methodological, technical, and tooling support to produce IoT software system requirements documents. It is composed of an IoT scenario description technique, a checklist to verify IoT scenarios, construction processes, and templates for IoT software systems.…

Improved Initialization of State Space Artificial Neural Networks

The identification of black-box nonlinear state-space models requires a flexible representation of the state and output equation. Artificial neural networks have proven to provide such a representation. A well-chosen initialization of these model parameters can often prevent the nonlinear optimization algorithm from converging to a poorly performing local minimum of the considered cost function.…

RCT Resource Constrained Training for Edge AI

Resource Constrained Training (RCT) only keeps a quantised model throughout the training, so that the memory requirements for model parameters are reduced. It adjusts per-layer bitwidth dynamically in order to save energy when a model can learn effectively with lower precision.…
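As a rough illustration of what keeping a quantised model involves, the sketch below applies symmetric uniform quantisation to a list of weights at a chosen bitwidth. It is not the RCT algorithm: the names `quantize` and `dequantize`, the symmetric scheme, and the `bits` argument are assumptions made for illustration, and the dynamic per-layer bitwidth policy is not modelled.

```python
def quantize(weights, bits):
    """Map float weights to signed integer codes using `bits` bits (bits >= 2).

    Hypothetical helper for illustration only; RCT's actual scheme may differ.
    """
    levels = 2 ** (bits - 1) - 1              # largest code, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return [0] * len(weights), 1.0        # all-zero layer: any scale works
    scale = max_abs / levels                  # float value of one integer step
    codes = [max(-levels, min(levels, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]


layer = [-1.0, 0.5, 0.25, 1.0]
codes8, scale8 = quantize(layer, 8)           # finer grid, more memory
codes4, scale4 = quantize(layer, 4)           # coarser grid, less memory
approx8 = dequantize(codes8, scale8)
```

Lowering `bits` shrinks the integer range and therefore the storage per parameter, at the cost of a larger step `scale`; the rounding error per weight stays within about half a step.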

Palindromic Length and Reduction of Powers

Given a nonempty finite word $v$, let $PL(v)$ be the palindromic length of $v$. Let $x$ be an infinite non-ultimately periodic word with $maxPL(x)=k<\infty$. Less formally, we show how to reduce the powers of $u$ and $u^R$ in $x$ in such a way that the palindromic length remains bounded. We construct an infinite non-ultimately periodic word such that $u …
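The palindromic length of a finite word is the minimal number of palindromes whose concatenation equals that word. A minimal sketch of computing it with a quadratic dynamic programme follows; the function name `palindromic_length` is an assumption for illustration, and the reduction of powers studied in the paper is not reproduced here.

```python
def palindromic_length(word):
    """Return the minimal number of palindromes that concatenate to `word`."""
    n = len(word)
    # is_pal[i][j] is True when word[i..j] (inclusive) is a palindrome.
    is_pal = [[False] * n for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            if word[i] == word[j] and (j - i < 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
    # best[k] is the palindromic length of the prefix word[:k].
    best = [0] + [n + 1] * n
    for k in range(1, n + 1):
        for i in range(k):
            if is_pal[i][k - 1]:
                best[k] = min(best[k], best[i] + 1)
    return best[n]


print(palindromic_length("abaab"))  # "a" + "baab", so this prints 2
```

The table `is_pal` costs quadratic time and memory in the word length, which is fine for short factors but is only a baseline compared with specialised algorithms.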

Contrastive Learning based Hybrid Networks for Long Tailed Image Classification

Learning discriminative image representations plays a vital role in long-tailed image classification because it can ease the classifier learning in imbalanced cases. We explore effective supervised contrastive learning strategies and tailor them to learn better image representations from imbalanced data. We propose a novel hybrid network structure composed of a supervised contrastive loss to learn image representations and a cross-entropy loss.…

Confluent Vessel Trees with Accurate Bifurcations

We are interested in unsupervised reconstruction of complex near-capillary vasculature with thousands of bifurcations. Unsupervised methods can use many structural constraints, e.g. topology, geometry, physics. Common techniques use variants of MST on geodesic tubular graphs minimizing symmetric pairwise costs, i.e. distances.…
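For readers unfamiliar with the MST baseline mentioned here, the following is a minimal sketch of Prim's algorithm on a complete graph whose symmetric pairwise cost is plain Euclidean distance. The Euclidean cost stands in for the geodesic tubular costs and is an assumption, as is the function name `minimum_spanning_tree`; this is the generic textbook technique, not the method of the paper.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over `points`.

    Returns a list of (parent_index, child_index) edges. The cost is the
    Euclidean distance, used here as a stand-in for a geodesic cost.
    """
    n = len(points)
    if n == 0:
        return []
    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known link from the tree to each node
    parent = [-1] * n
    best_cost[0] = 0.0           # grow the tree from node 0
    edges = []
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_cost[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        # Relax the links from the new tree node to all remaining nodes.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_cost[v]:
                    best_cost[v] = d
                    parent[v] = u
    return edges


tree = minimum_spanning_tree([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
```

Because the cost is symmetric, the resulting tree carries no notion of flow direction at a branching point.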

DAGN Discourse Aware Graph Network for Logical Reasoning

Recent QA with logical reasoning questions requires passage-level relations among sentences. We propose a discourse-aware graph network (DAGN) that reasons by relying on the discourse structure of texts. The model encodes discourse information as a graph with elementary discourse units (EDUs) and discourse relations, and learns the features via a graph network for downstream QA tasks.…

Learning to Track with Object Permanence

Tracking by detection, the dominant approach for online multi-object tracking, alternates between localization and re-identification steps. In contrast, tracking in humans is characterized by the notion of object permanence. We build on top of the recent CenterTrack architecture, which takes pairs of frames as input, and extend it to videos of arbitrary length.…

OTA Optimal Transport Assignment for Object Detection

On COCO, a single FCOS-ResNet-50 detector equipped with Optimal Transport Assignment (OTA) can reach 40.7% mAP under 1X scheduler. Extensive experiments conducted on COCO and CrowdHuman further validate the effectiveness of our proposed OTA, especially in crowd scenarios. The code is available at https://github.com/Megvii-BaseDetection/OTA…

Image2Reverb Cross Modal Reverb Impulse Response Synthesis

Image2Reverb is the first work that generates an IR from a single image. This IR is then applied to other signals using convolution, simulating the reverberant characteristics of the space shown in the image. We use an end-to-end neural network architecture to generate plausible audio impulse responses from single images.…
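Applying an impulse response to a signal by convolution can be sketched in a few lines. The example below is a direct-form discrete convolution in plain Python; the function name `convolve` and the toy sample values are assumptions for illustration, and a practical audio pipeline would more likely use an FFT-based routine.

```python
def convolve(signal, impulse_response):
    """Full discrete convolution of two non-empty sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            # Each input sample triggers a scaled, delayed copy of the IR.
            out[i + j] += s * h
    return out


dry = [1.0, 0.0, 0.0]        # a single click
ir = [1.0, 0.5, 0.25]        # toy decaying impulse response
wet = convolve(dry, ir)      # the click followed by its decaying tail
```

Convolving a unit click with the impulse response reproduces the impulse response itself, which is why the IR fully describes the reverberant character of a linear, time-invariant space.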

Computational Model to Quantify Object Innovativeness

The article considers the quantitative assessment approach to the innovativeness of different objects. The proposed assessment model is based on the object data retrieval from various databases including the Internet. We present an object linguistic model, the processing technique for the measurement results including the results retrieved from the different search engines, and the evaluating technique of the source credibility.…

Character Controllers Using Motion VAEs

A fundamental problem in computer animation is that of realizing purposeful and realistic human movement given a sufficiently rich set of motion capture clips. We learn data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs. The latent variables of the learned autoencoder define the action space for the movement and govern its evolution over time.…

Bidirectional Projection Network for Cross Dimension Scene Understanding

The information in the 2D and 3D visual domains is complementary, e.g., 2D images have fine-grained texture while 3D point clouds contain plentiful geometry information. However, most current visual recognition systems process them individually. In this paper, we present a bidirectional projection network (BPNet) for joint 2D and 3D reasoning in an end-to-end manner.…

Mutually Constrained Monotonic Multihead Attention for Online ASR

Monotonic Multihead Attention (MMA) shows comparable performance to offline methods in machine translation and automatic speech recognition (ASR) tasks. However, the latency of MMA is still a major issue in ASR, and MMA should be combined with a technique that can reduce the test latency at inference time, such as head-synchronous beam search decoding, which forces all non-activated heads to activate after a small fixed delay from the first head activation.…

Functorial Language Models

We introduce functorial language models: a principled way to compute probability distributions over word sequences given a monoidal functor from grammar to meaning. This yields a method for training categorical compositional distributional models on raw text data.…

Visionary Vision architecture discovery for robot learning

We propose a vision-based architecture search algorithm for robot manipulation learning. Our approach automatically designs architectures while training on the task, discovering novel ways of combining and attending image feature representations with actions. The obtained new architectures demonstrate better task success rates, in some cases with a large margin, compared to a recent high performing baseline.…

The Complete Affine Automorphism Group of Polar Codes

A permutation-based successive cancellation (PSC) decoding framework for polar codes has attracted much attention. The PSC framework is ineffective for permutations falling into the lower-triangular affine (LTA) automorphism group. However, the block lower-triangular affine (BLTA) group equals the complete group of automorphisms of decreasing polar codes that can be formulated as affine transformations.…

Training a Better Loss Function for Image Restoration

A single natural image is sufficient to train a feature extractor that outperforms state-of-the-art loss functions in single image super resolution, denoising, and JPEG artefact removal. We propose a novel Multi-Scale Discriminative Feature (MDF) loss comprising a series of discriminators, trained to penalize errors introduced by a generator.…

Combating Adversaries with Anti Adversaries

Deep neural networks are vulnerable to small input perturbations known as adversarial attacks. We propose the anti-adversary layer aimed at countering this effect. In particular, our layer generates a perturbation in the opposite direction of the adversarial one, and feeds the classifier a perturbed version of the input.…

Distilling Object Detectors via Decoupled Features

DeFeat improves ResNet50 based Faster R-CNN from 37.4% to 40.9% mAP, and improves RetinaNet from 36.5% to 39.7% on the COCO benchmark. DeFeat is a novel distillation algorithm via decoupled features for learning a better student detector. The code is available at https://github.com/ggjy/deFeat.pytorch.…

Planar Surface Reconstruction from Sparse Views

The paper studies planar surface reconstruction of indoor scenes from two views with unknown camera poses. Previous approaches have successfully created object-centric reconstructions of many scenes. However, they fail to exploit other structures, such as planes, which are typically the dominant components of indoor scenes.…

Loosely self stabilizing Byzantine tolerant Binary Consensus for Signature free Message passing Systems

At PODC 2014, A. Mostéfaoui, H. Moumen, and M. Raynal presented a new and simple randomized signature-free binary consensus algorithm (denoted here MMR) that copes with the net effect of asynchrony and Byzantine behaviors. MMR is optimal in several respects: it deals with up to t Byzantine processes, where t < n/3 and n is the number of processes, and uses O(n^2) messages and O(1) expected time. The proposed algorithm is the first loosely-self-stabilizing Byzantine fault-tolerant binary consensus algorithm suited to asynchronous message-passing systems. Furthermore, it only requires a bounded amount of memory. In addition, the obtained algorithm preserves its properties of optimal resilience and termination (i.e., t

Toward Safety Aware Informative Motion Planning for Legged Robots

This paper reports on developing an integrated framework for safety-aware informative motion planning suitable for legged robots. The information-gathering planner takes a dense stochastic map of the environment into account, while safety constraints are enforced via Control Barrier Functions (CBFs). The planner is based on the Incrementally-exploring Information Gathering (IIG) algorithm and allows closed-loop kinodynamic node expansion using a Model Predictive Control (MPC) formalism.…

Agent with Warm Start and Adaptive Dynamic Termination for Plane Localization in 3D Ultrasound

2D ultrasound (US) requires scanning for each standard plane (SP), which is time-consuming and operator-dependent. Automatically locating SPs in 3D US is very challenging due to the huge search space and large fetal posture variations. Our approach achieves localization error of 2.52mm/10.26 degrees, 2.02mm/11.48 degrees, 3.61mm/9.71 degrees, 1.49mm/7.54 degrees for the transcerebellar, transventricular, transthalamic planes in fetal brain, abdominal plane in fetal abdomen, and mid-sagittal, transverse and coronal planes in uterus, respectively.…

On the hidden treasure of dialog in video question answering

Modern video question answering (VideoQA) systems often use additional human-made sources like plot synopses, scripts, descriptions or knowledge bases. In this work, we present a new approach to understand the whole story without external sources. We treat dialog as a noisy source to be converted into text description via dialog summarization.…

Unsupervised Robust Domain Adaptation without Source Data

This paper aims at answering the question of finding the right strategy to make the target model robust and accurate in the setting of unsupervised domain adaptation without source data. The proposed method of using non-robust pseudo-labels performs surprisingly well on both clean and adversarial samples, for the task of image classification.…

HUGE An Efficient and Scalable Subgraph Enumeration System

HUGE is a system that efficiently processes subgraph enumeration at scale in the distributed context. HUGE features a novel two-stage execution mode with a lock-free and zero-copy cache design. HUGE is generic such that all existing distributed subgraph enumeration algorithms can be plugged in to enjoy automatic speed up and bounded-memory execution.…

Heterogeneous Graph Neural Networks for Multi label Text Classification

Multi-label text classification (MLTC) is an attractive and challenging task. We propose a heterogeneous graph convolutional network model to solve the MLTC problem. We are able to take into account multiple relationships including token-level relationships. We evaluate our method on three real-world datasets and the experimental results show that it achieves significant improvements and outperforms state-of-the-art comparison methods.…

Three dimensional higher order raypath separation in a shallow water waveguide

Separating raypaths in a multipath shallow-water environment is a challenge. The proposed algorithm achieves a higher resolution and a stronger robustness compared to the existing algorithms. Performance tests using simulation data in a multipath environment, real data obtained in an ultrasonic waveguide, and ocean shallow-water data illustrate that the proposed algorithm achieves a higher resolution and a stronger robustness.…

Synthesizing Linked Data Under Cardinality and Integrity Constraints

The generation of synthetic data is useful in multiple aspects, from testing applications to benchmarking to privacy preservation. Generating links between relations subject to cardinality constraints (CCs) and integrity constraints (ICs) is an important aspect of this problem. We provide a novel framework for the problem based on declarative CCs and ICs.…

Composable Learning with Sparse Kernel Representations

We present a reinforcement learning algorithm for learning sparse non-parametric controllers in a Reproducing Kernel Hilbert Space. We improve the sample complexity of this approach by imposing a structure on the state-action function through a normalized advantage function (NAF). This representation of the policy enables efficiently composing multiple learned models without additional training samples or interaction with the environment.…

Unsupervised Document Embedding via Contrastive Augmentation

We present a contrastive learning approach with data augmentation techniques to learn document representations in an unsupervised manner. We hypothesize that high-quality document embedding should be invariant to diverse paraphrases that preserve the semantics of the original document. Our method can decrease the classification error rate by up to 6.4% over the SOTA approaches on the document classification task.…

DBATES DataBase of Audio features Text and visual Expressions in competitive debate Speeches

In this work, we present a database of multimodal communication features extracted from debate speeches in the 2019 North American Universities Debate Championships (NAUDC). Feature sets were extracted from the visual (facial expression, gaze, and head pose), audio (PRAAT), and textual (word sentiment and linguistic category) modalities of raw video recordings of competitive debaters.…