On the limit of English conversational speech recognition

The study also considers the recently proposed conformer, and more advanced self-attention based language models. Their combination and decoding reaches a new record on Switchboard-300 and 10.0% WER on SWB and CHM parts of Hub5’00 with very simple LSTM models. Overall, the conformer shows similar performance to the LSTM; nevertheless, their combination and decoding with an improved LM reaches a new record.…
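
For readers unfamiliar with the metric, the sketch below shows how a word error rate (WER) figure is usually computed: the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name and the example sentences are illustrative assumptions, not taken from the paper, and official Hub5’00 scoring applies additional text normalization that is not modeled here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts: one deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.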

SmoothI Smooth Rank Indicators for Differentiable IR Metrics

Information retrieval (IR) systems traditionally aim to maximize metrics built on rankings, such as precision or NDCG. However, the non-differentiability of the ranking operation prevents direct optimization of such metrics in state-of-the-art neural IR models. To address this shortcoming, we propose SmoothI, a smooth approximation of rank indicators that serves as a basic building block to devise differentiable approximations of IR metrics.…
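
To make the non-differentiability concrete, here is a minimal sketch of NDCG computed from model scores. It is not the SmoothI approximation itself; it only illustrates the hard sorting step that such approximations replace. The exponential-gain form of DCG is one common variant and is an assumption here, as are the function names and the sample scores.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with exponential gain (one common variant)."""
    return sum(
        (2 ** rel - 1) / math.log2(position + 1)
        for position, rel in enumerate(relevances, start=1)
    )


def ndcg(scores, relevances, k=None):
    """NDCG@k of the ranking induced by sorting documents by score."""
    # The sort is the non-differentiable step: the metric depends on the
    # scores only through the order they induce.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [relevances[i] for i in order][:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    relevances = [3, 2, 0]
    # Nudging a score without changing the order leaves NDCG unchanged,
    # so the gradient with respect to the scores is zero almost everywhere.
    print(ndcg([0.90, 0.80, 0.10], relevances))
    print(ndcg([0.95, 0.80, 0.10], relevances))
    # Reversing the order changes the metric in a jump.
    print(ndcg([0.10, 0.80, 0.90], relevances))
```

The metric is piecewise constant in the scores, which is why gradient-based training needs a smooth surrogate of the rank indicators.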

Fast Multi Step Critiquing for VAE based Recommender Systems

Recent studies have shown that providing explanations alongside recommendations increases trust and perceived quality. M&Ms-VAE is a novel autoencoder for recommendation and explanation that is based on multimodal modeling assumptions. We train the model under a weak supervision scheme to simulate both fully and partially observed variables.…

DeepMPCVS Deep Model Predictive Control for Visual Servoing

The simplicity of the visual servoing approach makes it an attractive option for tasks dealing with vision-based control of robots in many real-world applications. Attaining precise alignment in unseen environments poses a challenge to existing visual servoing approaches. The recent data-driven approaches face issues when generalizing to novel environments.…

Multi agent consensus with heterogeneous time varying input and communication delays in digraphs

This paper investigates the distributed consensus tracking control problem for general linear multi-agent systems (MASs) with external disturbances and heterogeneous time-varying input and communication delays. An extended LMI is proposed which, in conjunction with the rest of the LMIs, results in a solution with a larger upper bound on delays than what would be feasible without it.…

A Rate Splitting Strategy to Enable Joint Radar Sensing and Communication with Partial CSIT

Joint radar and communication (RadCom) systems have attracted increased attention in recent years. A joint RadCom system is designed that marries the capabilities of a Multiple-Input Multiple-Output (MIMO) radar with Rate-Splitting Multiple Access (RSMA). RSMA provides the RadCom with more robustness, flexibility and user rate fairness compared to the baseline joint RadCom system based on Space Division Multiple Access (SDMA). The system is designed in the presence of partial CSIT to maximize the Average Weighted Sum-Rate (AWSR) under QoS rate constraints and minimize the RadCom Beampattern Squared Error (BSE) against an ideal MIMO radar beampattern.…

Goldilocks Just Right Tuning of BERT for Technology Assisted Review

Technology-assisted review (TAR) refers to iterative active learning workflows for document review in high recall retrieval (HRR) tasks. We find that the pre-trained BERT model reduces review volume by 30% in TAR workflows simulated on the RCV1-v2 newswire collection. In contrast, linear models outperform BERT for simulated legal discovery topics on the Jeb Bush e-mail collection.…

Reachability of Black Box Nonlinear Systems after Koopman Operator Linearization

Reachability analysis of nonlinear dynamical systems is a challenging and computationally expensive task. Computing the reachable states for linear systems, in contrast, can often be done efficiently in high dimensions. The Koopman operator links the behaviors of a nonlinear system to a linear system embedded in a higher dimensional space, with an additional set of so-called observable variables.…
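
The lifting idea can be illustrated on a textbook-style toy system. The sketch below assumes a hand-picked discrete-time system (not from the paper) for which a single extra observable, the square of the first state, makes the dynamics exactly linear. For black-box systems the observables and the linear map generally have to be approximated from data, which this example does not attempt. All names and parameter values are hypothetical.

```python
def step_nonlinear(x1, x2, a=0.9, b=0.5, c=0.3):
    """One step of the toy nonlinear system (quadratic coupling in x1)."""
    return a * x1, b * x2 + c * x1 * x1


def lift(x1, x2):
    """Map the state to the observables (x1, x2, x1^2)."""
    return [x1, x2, x1 * x1]


def step_lifted(z, a=0.9, b=0.5, c=0.3):
    """One step of the equivalent linear system in the lifted space."""
    # The third observable evolves as (a*x1)^2 = a^2 * x1^2, so it is linear
    # in itself, and x2 is linear in (x2, x1^2).
    K = [
        [a, 0.0, 0.0],
        [0.0, b, c],
        [0.0, 0.0, a * a],
    ]
    return [sum(K[i][j] * z[j] for j in range(3)) for i in range(3)]


if __name__ == "__main__":
    x1, x2 = 1.5, -0.7
    z = lift(x1, x2)
    for _ in range(10):
        x1, x2 = step_nonlinear(x1, x2)
        z = step_lifted(z)
    # The first two lifted coordinates track the nonlinear state
    # up to floating-point rounding.
    print((x1, x2), (z[0], z[1]))
```

Once the dynamics are linear in the lifted coordinates, linear reachability tools can be applied there, with the initial set mapped through the same lifting.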

Universal Weakly Supervised Segmentation by Pixel to Segment Contrastive Learning

Weakly supervised segmentation requires assigning a label to every pixel based on training instances with partial annotations such as image-level tags, object bounding boxes, labeled points and scribbles. We propose 4 types of contrastive relationships between pixels and segments in the feature space, capturing low-level image similarity, semantic annotation, co-occurrence, and feature affinity.…

Robust Control for Lane Keeping System Using Linear Parameter Varying Approach with Scheduling Variables Reduction

This paper presents a robust controller using a Linear Parameter Varying (LPV) model of the lane-keeping system with parameter reduction. Multiple varying parameters lead to a high number of scheduling variables and cause massive computational complexity. We designed the LPV robust feedback controller using the reduced model by solving a set of Linear Matrix Inequalities (LMIs). The effectiveness of the proposed system is validated with full vehicle dynamics from CarSim on an interchange road.…

Present and Future of Reconfigurable Intelligent Surface Empowered Communications

Signal processing and communication communities have witnessed the rise of exciting communication technologies in recent years. We discuss the recent developments in the field and put forward promising candidates for future research and development. We also envision an ultimate RIS architecture, which is able to adjust its operation modes dynamically, and introduce the new concept of PHY slicing over RISs towards 6G wireless networks.…

Generalized Spatially Coupled Parallel Concatenated Convolutional Codes With Partial Repetition

We introduce generalized spatially coupled parallel concatenated codes (GSC-PCCs) as a class of turbo-like codes. We show that the proposed codes have some nice properties such as threshold saturation and that their decoding thresholds improve with the repetition factor $q$. We also suggest that the codes asymptotically approach the capacity as $q$ tends to infinity with any given constituent convolutional code.…

Lower Bounds on the Time Memory Tradeoff of Function Inversion

We study time/memory tradeoffs of function inversion: an algorithm, i.e., an inverter, equipped with an $s$-bit advice on a randomly chosen function $f : [n] \to [n]$ and using $q$ oracle queries to $f$. We make progress on the above intriguing question, both for the adaptive and non-adaptive case, proving the following lower bounds on restricted families of inverters.…

Audio Transformers Transformer Architectures For Large Scale Audio Understanding Adieu Convolutions

On a standard dataset of Free Sound 50K, comprising 200 categories, our model outperforms convolutional models. This is significant as, unlike in natural language processing and computer vision, we do not perform unsupervised pre-training for outperforming convolutional architectures. On the same training set, with respect to mean average precision benchmarks, we show a significant improvement.…

Russian News Clustering and Headline Selection Shared Task

This paper presents the results of the Russian News Clustering and Headline Selection shared task. We propose tasks of Russian news event detection, headline selection, and headline generation. The presented datasets for event detection and headline selection are the first public Russian datasets for their tasks.…

Multi feature 360 Video Quality Estimation

The proposed method is based on computing multiple spatio-temporal objective quality features on viewports extracted from 360-degree videos. A new model is learnt to properly combine these features into a metric that closely matches subjective quality scores. No individual objective image quality metric always performs the best for all types of visual distortions, while a learned combination of them is able to adapt to different conditions.…

Learning to drive from a world on rails

We learn an interactive vision-based driving policy from pre-recorded driving logs via a model-based approach. A forward model of the world supervises a driving policy that predicts the outcome of any potential driving trajectory. Despite the world-on-rails assumption, the final driving policy acts well in a dynamic and reactive world.…

Prediction of clinical tremor severity using Rank Consistent Ordinal Regression

Tremor is a key diagnostic feature of Parkinson’s Disease (PD), Essential Tremor (ET), and other central nervous system (CNS) disorders. Clinicians or trained raters assess tremor severity with TETRAS scores by observing patients. In this work, we propose to train a deep neural network (DNN) with rank-consistent ordinal regression using 276 clinical videos from 36 essential tremor patients.…
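
As background on the ordinal regression setup, the sketch below shows the extended binary label encoding commonly used by rank-consistent ordinal regression: a label with K ordered levels becomes K-1 binary targets of the form "is the label greater than level k?", and a prediction is decoded by counting how many of those binary outputs exceed a threshold. The number of severity levels (5) and the example probabilities are assumptions for illustration and are not taken from the paper. The network and loss are omitted.

```python
def encode_levels(label, num_classes):
    """Turn an ordinal label in 0..num_classes-1 into num_classes-1 binary targets."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    # Target k answers the question: is the label greater than level k?
    return [1 if label > k else 0 for k in range(num_classes - 1)]


def decode_rank(probabilities, threshold=0.5):
    """Predicted ordinal level: number of binary outputs above the threshold."""
    return sum(1 for p in probabilities if p > threshold)


if __name__ == "__main__":
    num_classes = 5  # hypothetical number of severity levels
    print(encode_levels(2, num_classes))
    # Hypothetical binary-task probabilities from a trained model.
    print(decode_rank([0.9, 0.7, 0.3, 0.1]))
```

Rank consistency means the predicted probabilities are non-increasing across the binary tasks, so the count above corresponds to a single well-defined level.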

Leveraging Deep Representations of Radiology Reports in Survival Analysis for Predicting Heart Failure Patient Mortality

Utilizing clinical texts in survival analysis is difficult because they are largely unstructured. Current automatic extraction models fail to capture textual information comprehensively. They typically require a large amount of data and high-quality annotations for training. In this work, we present a novel method of using BERT-based representations of clinical texts as covariates for proportional hazards models to predict patient survival outcomes.…

Optimal heating of an indoor swimming pool

This work presents the derivation of a model for the heating process of the air in a glass dome, where an indoor swimming pool is located at the bottom of the dome. The problem can be reduced from a three-dimensional to a two-dimensional one.…

UniGNN a Unified Framework for Graph and Hypergraph Neural Networks

UniGNN is a unified framework for interpreting the message passing process in graph and hypergraph neural networks, which can generalize general GNN models to hypergraphs. Extensive experiments have been conducted to demonstrate the effectiveness of UniGNN on multiple real-world datasets, where it outperforms the state-of-the-art approaches by a large margin.…

An Efficient and Secure Location based Alert Protocol using Searchable Encryption and Huffman Codes

Location data are widely used in mobile apps, ranging from location-based recommendations to social media and navigation. But serious privacy concerns arise if users share their location history with the service provider in plaintext. The underlying searchable encryption primitives required to perform the matching on ciphertexts are expensive, and without a proper encoding of locations and search predicates, the performance can degrade significantly.…
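
Since the title mentions Huffman codes as the encoding tool, a generic Huffman construction is sketched below. It shows only the standard algorithm, in which more frequent symbols receive shorter prefix-free bit strings. How the paper maps locations and search predicates onto such codes is not described in this summary, so the cell names and frequencies are purely hypothetical.

```python
import heapq


def huffman_code(frequencies):
    """Build a prefix-free binary code from a {symbol: weight} mapping."""
    if not frequencies:
        raise ValueError("at least one symbol is required")
    if len(frequencies) == 1:
        # A single symbol still needs one bit to be representable.
        return {symbol: "0" for symbol in frequencies}

    # Heap entries are (weight, tie_breaker, partial_code_table); the unique
    # tie breaker keeps the dictionaries from ever being compared.
    heap = [
        (weight, index, {symbol: ""})
        for index, (symbol, weight) in enumerate(frequencies.items())
    ]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        weight_a, _, codes_a = heapq.heappop(heap)
        weight_b, _, codes_b = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {symbol: "0" + code for symbol, code in codes_a.items()}
        merged.update({symbol: "1" + code for symbol, code in codes_b.items()})
        heapq.heappush(heap, (weight_a + weight_b, next_id, merged))
        next_id += 1

    return heap[0][2]


if __name__ == "__main__":
    # Hypothetical grid cells weighted by how often they are expected to match.
    cell_weights = {"cell_A": 8, "cell_B": 4, "cell_C": 2, "cell_D": 2}
    for cell, code in sorted(huffman_code(cell_weights).items()):
        print(cell, code)
```

Shorter codes for likely symbols reduce the average code length, which is the usual motivation for pairing a variable-length encoding with an expensive per-bit primitive.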

Rate Splitting Multiple Access for Enhanced URLLC and eMBB in 6G

Rate-Splitting Multiple Access (RSMA) is a flexible and robust multiple access scheme for downlink multi-antenna wireless networks. RSMA relies on Rate-Splitting (RS) at the transmitter and Successive Interference Cancellation (SIC) at receivers. We present the optimal system designs employing RSMA that target short-packet and low-latency communications as well as robust communications with high throughput under the practical and important setup of imperfect Channel State Information at Transmitter (CSIT) originating from user mobility and feedback latency in the network.…

Dialectica models of type theory

We present two Dialectica-like constructions for models of intensional Martin-Löf type theory. We propose a new semantic notion of finite sum for dependent types, generalizing finitely-complete extensive categories. The second avoids extensivity assumptions using biproducts in a Kleisli category for a fibred additive monad.…

Fast Power Control Adaptation via Meta Learning for Random Edge Graph Neural Networks

Power control in decentralized wireless networks poses a complex stochastic optimization problem when formulated as the maximization of the average sum-rate for arbitrary interference graphs. Recent work has introduced data-driven design methods that leverage graph neural networks (GNNs) to efficiently parametrize the power control policy mapping channel state information (CSI) to the power vector.…

Intelligent Reflecting Surface Assisted Secret Key Generation In Multi antenna Network

Physical-layer key generation (PKG) can generate symmetric keys between two communication ends based on the reciprocal uplink and downlink channels. By smartly reconfiguring the radio signal propagation, an intelligent reflecting surface (IRS) is able to improve the secret key rate of PKG. An IRS-assisted multiple-input single-output (MISO) system aims to maximize the secret key rate by optimally designing the IRS passive beamforming.…

Curious Exploration and Return based Memory Restoration for Deep Reinforcement Learning

The proposed method can be utilized to train agents in environments with fairly complex state and action spaces. The main challenge of using such a reward function is the high sparsity of positive reward signals. To address this problem, we use a simple prediction-based exploration strategy (called Curious Exploration) along with a Return-based Memory Restoration (RMR) technique which tends to remember more valuable memories.…

Model Checking Quantum Continuous Time Markov Chains

Treating QCTMC as a real-time system, we specify its temporal properties by signal temporal logic (STL). To effectively check the atomic propositions in STL, we develop a state-of-the-art real root isolation algorithm under Schanuel’s conjecture. Further, we check the general STL formula by interval operations in a bottom-up fashion, whose query complexity turns out to be linear in the size of the input formula by calling the real root isolation algorithm.…

Lecture Notes on Voting Theory

These lecture notes were developed for the course Computational Social Choice of the Artificial Intelligence MSc programme at the University of Groningen. They cover mathematical and algorithmic aspects of voting theory.…

Child Robot Interaction Studies During COVID 19 Pandemic

The coronavirus disease (COVID-19) pandemic affected our lives deeply. Just like everyone else, children also suffered from the restrictions. The precautions due to COVID-19 also introduced new constraints in social robotics research.…