Safety of the Intended Driving Behavior Using Rulebooks

Autonomous Vehicles (AVs) are complex systems that drive in uncertain environments and potentially navigate unforeseeable situations. Safety of these systems requires not only an absence of malfunctions but also high performance of functions in many different scenarios. ISO/PAS 21448 guidance recommends a process to ensure Safety of the Intended Functionality (SOTIF) for road vehicles.…

Continual Learning via Bit Level Information Preserving

Bit-Level Information Preserving (BLIP) preserves the information gain on model parameters by updating the parameters at the bit level. BLIP trains a neural network with weight quantization on the new incoming task and then estimates the information gain. The results show that our method produces better or on-par results compared to previous state-of-the-art methods.…
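
The bit-level idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the restriction to weights in [0, 1), and the clipping rule are assumptions made for illustration. A weight is quantized to an integer code, and an update may only change the bits below the `frozen_bits` most significant ones.

```python
def quantize(w: float, n_bits: int) -> int:
    """Map a weight in [0, 1) to an n_bits-wide unsigned integer code."""
    levels = 1 << n_bits
    code = int(w * levels)
    return max(0, min(levels - 1, code))


def dequantize(code: int, n_bits: int) -> float:
    """Map an integer code back to a weight in [0, 1)."""
    return code / (1 << n_bits)


def update_preserving_bits(code: int, proposed: int, n_bits: int, frozen_bits: int) -> int:
    """Move `code` toward `proposed` while keeping the top `frozen_bits` bits fixed.

    Hypothetical rule: the proposed code is clipped to the range of codes
    that share the frozen most-significant-bit prefix of the current code.
    """
    free_bits = n_bits - frozen_bits
    low = (code >> free_bits) << free_bits   # smallest code with the same prefix
    high = low + (1 << free_bits) - 1        # largest code with the same prefix
    return max(low, min(high, proposed))
```

With `frozen_bits=0` the update is unconstrained, and with `frozen_bits=n_bits` the weight can no longer change. In this sketch, freezing more bits for a parameter corresponds to preserving more of what earlier tasks determined about it.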

Natural Posterior Network Deep Bayesian Predictive Uncertainty for Exponential Family Distributions

We propose the Natural Posterior Network (NatPN) for fast and high-quality uncertainty estimation for any task where the target distribution belongs to the exponential family. Unlike many previous approaches, NatPN does not require out-of-distribution (OOD) data at training time. NatPN leverages Normalizing Flows to fit a single density on a learned low-dimensional and task-dependent latent space.…

Parameter free Gradient Temporal Difference Learning

Reinforcement learning lies at the intersection of several challenges. Many applications involve extremely large state spaces, requiring function approximation to enable tractable computation. The learner has only a single stream of experience with which to evaluate a large number of possible courses of action.…

Improving Fairness of AI Systems with Lossless De-biasing

In today’s society, AI systems are increasingly used to make critical decisions such as credit scoring and patient triage. Mitigating bias in AI systems to increase overall fairness has emerged as an important challenge. In this paper, we present an information-lossless de-biasing technique that targets the scarcity of data in the disadvantaged group.…

Parallel Sandpiles or Spurious Bidirectional Icepiles

In a recent paper, E. Formenti and K. Perrot (FP) introduce a global rule to describe the discrete time dynamics associated with a sandpile model. In the first part we prove that the FP global rule does not describe the dynamics of standard sandpiles, but rather furnishes a description of the quite different situation of height difference between consecutive piles.…

Multi Objective Controller Synthesis with Uncertain Human Preferences

Multi-objective controller synthesis concerns the problem of computing an optimal controller subject to multiple (possibly conflicting) objective properties. The relative importance of objectives is often specified by human decision-makers. However, there is inherent uncertainty in human preferences. In this paper, we formalize the notion of uncertain human preferences and present a novel approach.…

Stochastic Image to Video Synthesis using cINNs

Video understanding calls for a model to learn the characteristic interplay between static scene content and its dynamics. Given an image, the model must be able to predict a future progression of the portrayed scene and, conversely, a video should be explained in terms of its static image content and all the remaining characteristics not present in the initial frame.…

Self Guided Curriculum Learning for Neural Machine Translation

In the field of machine learning, a well-trained model is assumed to be able to recover the training labels, i.e. the synthetic labels predicted by the model should be as close to the ground-truth labels as possible. Inspired by this, we propose a self-guided curriculum strategy to encourage the learning of neural machine translation (NMT) models to follow the above recovery criterion.…

Recent Advances in Deep Learning based Dialogue Systems

Dialogue systems are a popular Natural Language Processing (NLP) task, as they are promising in real-life applications. In this survey, we mainly focus on deep learning-based dialogue systems. We comprehensively review state-of-the-art research outcomes in dialogue systems and analyze them from two angles: model type and system type.…

A Rigorous Information Theoretic Definition of Redundancy and Relevancy in Feature Selection Based on Partial Information Decomposition

Information theory provides a powerful framework for formulating feature selection algorithms. Yet, a rigorous, information-theoretic definition of feature relevancy is still missing. Using partial information decomposition (PID), we clarify why feature selection is a conceptually difficult problem when approached using information theory. We show that the conditional mutual information (CMI) maximizes feature relevancy while minimizing redundancy.…
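
As a small illustration of the quantity named above, the following sketch computes the conditional mutual information I(X; Y | Z) in bits from a discrete joint distribution. The roles assigned here (X as a candidate feature, Y as the target, Z as an already selected feature) and the function name are assumptions for illustration, not the paper's notation.

```python
from collections import defaultdict
from math import log2


def conditional_mutual_information(joint):
    """Return I(X; Y | Z) in bits.

    `joint` maps (x, y, z) tuples to probabilities that sum to 1.
    """
    p_z = defaultdict(float)
    p_xz = defaultdict(float)
    p_yz = defaultdict(float)
    for (x, y, z), p in joint.items():
        p_z[z] += p
        p_xz[(x, z)] += p
        p_yz[(y, z)] += p

    total = 0.0
    for (x, y, z), p in joint.items():
        if p > 0:
            # I(X;Y|Z) = sum p(x,y,z) * log2( p(z) p(x,y,z) / (p(x,z) p(y,z)) )
            total += p * log2(p_z[z] * p / (p_xz[(x, z)] * p_yz[(y, z)]))
    return total


# XOR example: X and Z are independent fair bits and Y = X xor Z.
xor_joint = {(x, x ^ z, z): 0.25 for x in (0, 1) for z in (0, 1)}
# Fully redundant example: X, Y and Z are copies of one fair bit.
copy_joint = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}
```

In the XOR case X alone says nothing about Y, yet it carries one full bit once Z is known. In the copy case X adds nothing beyond Z. The CMI separates these two situations, which is the kind of distinction between relevancy and redundancy that the abstract refers to.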

HiTyper A Hybrid Static Type Inference Framework with Neural Prediction

HiTyper creates a new syntax graph for each program, called type graph, illustrating the type flow among all variables in the program. Based on the type graph, it infers the types of the variables with appropriate static constraints. It then adopts a SOTA DL model to predict types of other variables that cannot be inferred statically, during which process a type correction algorithm is employed to validate and correct the types recommended by the DL model.…

UPC's Speech Translation System for IWSLT 2021

This paper describes the submission to the IWSLT 2021 offline speech translation task by the UPC Machine Translation group. The task consists of building a system capable of translating English audio recordings extracted from TED talks into German text. Submitted systems can be either cascade or end-to-end and use a custom or given segmentation.…

G-Tran Making Distributed Graph Transactions Fast

G-Tran is an RDMA-enabled in-memory graph database with serializable and snapshot isolation support. We propose a graph-native data store to achieve good data locality and fast data access for transactional updates and queries. We also propose a new MV-OCC implementation with two optimizations to address the issue of large read/write sets in graph transactions.…

Meta-Cal Well-controlled Post-hoc Calibration by Ranking

Post-hoc calibration is a technique to recalibrate a model, and its goal is to learn a calibration map. Meta-Cal is built from a base calibrator and a ranking model. It outperforms the state of the art for multi-class classification under constraints, as a calibrator with a low calibration error is not necessarily useful in practice.…
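
To make "calibration error" concrete, here is a minimal sketch of the widely used expected calibration error (ECE) with equal-width confidence bins. Whether the paper uses exactly this metric is an assumption; the function name and the binning choice are illustrative only.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE.

    confidences: predicted top-label probabilities in [0, 1].
    correct:     booleans, True where the top-label prediction was right.
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")

    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((conf, 1.0 if hit else 0.0))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += len(members) / n * abs(accuracy - avg_conf)
    return ece
```

A model that reports the same confidence for every input, equal to its overall accuracy, scores an ECE of zero under this definition even though its confidences cannot rank predictions at all. This is one way to read the abstract's remark that a low calibration error does not by itself make a calibrator useful.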

Rate Distortion Analysis of Minimum Excess Risk in Bayesian Learning

Minimum Excess Risk (MER) in Bayesian learning is defined as the difference between the minimum expected loss achievable when learning from data and the minimum expected loss that could be achieved if the underlying parameter $W$ was observed. We formulate the problem as a (constrained) rate-distortion optimization and show how the solution can be bounded above and below by two other rate-distortion functions that are easier to study.…
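
The verbal definition above can be written as a formula. Only $W$ comes from the abstract; the remaining symbols are assumed notation: $\ell$ is the loss, $(X, Y)$ a fresh input-target pair, $Z^n$ the training data, and $\psi$, $\phi$ range over decision rules.

```latex
\mathrm{MER}
  \;=\; \inf_{\psi}\, \mathbb{E}\big[\ell\big(Y, \psi(X, Z^n)\big)\big]
  \;-\; \inf_{\phi}\, \mathbb{E}\big[\ell\big(Y, \phi(X, W)\big)\big]
```

The first term is the best expected loss achievable from data alone, and the second is the best expected loss when the parameter itself is known.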

A Bregman Learning Framework for Sparse Neural Networks

Using only 3.4% of the parameters of ResNet-18 we achieve 90.2% test accuracy on CIFAR-10, compared to 93.6% using the dense network. The proposed framework also has a huge potential for integrating sparse backpropagation and resource-friendly training. We derive a statistically profound sparse parameter initialization strategy and provide a rigorous stochastic convergence analysis of the loss decay and additional convergence proofs in the convex regime.…

How could Neural Networks understand Programs

Semantic understanding of programs is a fundamental problem for programming language processing (PLP). Recent works that learn representations of code based on pre-training techniques in NLP have pushed the frontiers in this direction. We believe it is difficult to build a model to better understand programs.…

Towards a functorial description of quantum relative entropy

A Bayesian functorial characterization of the classical relative entropy (KL divergence) of finite probabilities was recently obtained by Baez and Fritz. This was then generalized to standard Borel spaces by Gagné and Panangaden. We provide preliminary calculations suggesting that the finite-dimensional quantum (Umegaki) relative entropy might be characterized in a similar way.…
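
For reference, the two quantities named above have the following standard definitions. The notation is assumed here, not taken from the abstract: $p, q$ are probability vectors on a finite set, and $\rho, \sigma$ are density matrices on a finite-dimensional Hilbert space.

```latex
D(p \,\|\, q) \;=\; \sum_{i} p_i \log \frac{p_i}{q_i},
\qquad
S(\rho \,\|\, \sigma) \;=\;
\begin{cases}
  \operatorname{Tr}\!\big[\rho\,(\log \rho - \log \sigma)\big]
    & \text{if } \operatorname{supp}\rho \subseteq \operatorname{supp}\sigma, \\[2pt]
  +\infty & \text{otherwise.}
\end{cases}
```

When $\rho$ and $\sigma$ are diagonal in a common basis, $S(\rho \,\|\, \sigma)$ reduces to $D(p \,\|\, q)$ applied to their eigenvalues, which is why the quantum quantity is regarded as the natural generalization of the classical one.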

Multi modal Conditional Bounding Box Regression for Music Score Following

A conditional neural network architecture is proposed that directly predicts x,y coordinates of the matching positions in a complete score sheet image at each point in time for a given musical performance. The proposed approach achieves new state-of-the-art results and significantly improves the alignment performance on a set of real-world piano recordings by applying Impulse Responses as a data augmentation technique.…

Diversity Analysis of Millimeter Wave OFDM Massive MIMO Systems

We analyze the diversity gain for a distributed antenna subarray employing orthogonal frequency-division multiplexing (OFDM) in millimeter-wave (mm-Wave) massive multiple-input multiple-output (MIMO) systems. We show that the diversity gain depends on the number of transmitted data streams, the number of antenna units, and the number…

ADASYN Random Forest Based Intrusion Detection Model

This paper proposes using the ADASYN oversampling method to balance datasets. In addition, the random forest algorithm is used to train intrusion detection classifiers. Compared with traditional machine learning models, it has better performance, generalization ability and robustness. The proposed method can be applied to intrusion detection with large data, and can effectively improve the classification accuracy of network attack behaviors.…

A practical effective calculation of gamma difference distributions with open data science tools

At present, there is still no officially accepted and extensively verified implementation for computing the gamma difference distribution allowing unequal shape parameters. We explore four ways of computing the gamma difference distribution with different shape parameters resulting from time series kriging. At double (53-bit) precision, our tool outperformed the speed of the analytical computation based on Tricomi’s $U(a, b, z)$ function in CAS software by 1.5-2 orders of magnitude.…

ReLU Deep Neural Networks from the Hierarchical Basis Perspective

We study ReLU deep neural networks (DNNs) by investigating their connections with the hierarchical basis method in finite element methods. We show that ReLU DNNs with this structure can be applied only to approximate quadratic functions. We obtain a geometric interpretation and systematic proof for the approximation result of ReLU DNNs for polynomials.…
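
A well-known construction shows how composed ReLU "hat" functions approximate the quadratic x^2 on [0, 1], with the hats playing the role of hierarchical basis functions. The sketch below follows that textbook construction and may differ in detail from the structure analyzed in the paper; all function names are illustrative.

```python
def relu(x: float) -> float:
    return max(0.0, x)


def hat(x: float) -> float:
    """Hat function on [0, 1] built from three ReLUs: rises to 1 at x = 0.5."""
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def approx_square(x: float, depth: int) -> float:
    """Approximate x**2 on [0, 1] as x - sum_{s=1..depth} g_s(x) / 4**s,
    where g_s is the s-fold composition of `hat` (one layer per level)."""
    result = x
    g = x
    for s in range(1, depth + 1):
        g = hat(g)            # one more composition, i.e. one more layer
        result -= g / 4 ** s  # correction from hierarchical level s
    return result
```

Each extra level halves the mesh width of the piecewise linear interpolant of x^2, so the maximum error on [0, 1] shrinks by a factor of four per level and is bounded by 4^-(depth + 1).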

Spoken Moments Learning Joint Audio Visual Representations from Video Descriptions

The Spoken Moments (S-MiT) dataset consists of 500k spoken captions, each attributed to a unique short video depicting a broad range of different events. Existing caption datasets for video understanding are either small in scale or restricted to a specific domain. We present a novel Adaptive Mean Margin (AMM) approach to contrastive learning and evaluate our models on video/caption retrieval on multiple datasets.…

An Analysis of Phenotypic Diversity in Multi Solution Optimization

More and more, optimization methods are used to find diverse solution sets. We show that multiobjective optimization does not always produce much diversity. We also show that multimodal optimization produces higher fitness solutions. An autoencoder is used to discover phenotypic features automatically, producing an even more diverse solution set with quality diversity.…

Do Concept Bottleneck Models Learn as Intended

Concept bottleneck models map from raw inputs to concepts, and then from concepts to targets. Such models aim to incorporate pre-specified, high-level concepts into the learning procedure. However, we demonstrate that concepts do not correspond to anything semantically meaningful in input space.…

The Typical Non Linear Code over Large Alphabets

We consider the problem of describing the typical (possibly) non-linear code of minimum distance bounded from below over a large alphabet. We concentrate on block codes with the Hamming metric and on subspace codes with the injection metric. In sharp contrast with the behavior of linear block codes, we show that the typical non-linear code is far from having minimum distance $d$, i.e.,…