AGENT A Benchmark for Core Psychological Reasoning

For machine agents to successfully interact with humans in real-world settings, they will need to develop an understanding of human mental life. We present a benchmark consisting of a large dataset of procedurally generated 3D animations, AGENT (Action, Goal, Efficiency, coNstraint, uTility). We validate AGENT with human ratings, propose an evaluation protocol emphasizing generalization, and compare two strong baselines built on Bayesian inverse planning and a Theory of Mind neural network. …

Credit Assignment with Meta Policy Gradient for Multi Agent Reinforcement Learning

Reward decomposition is a critical problem in the centralized training with decentralized execution (CTDE) paradigm for multi-agent reinforcement learning. We propose a general meta-learning-based Mixing Network with Meta Policy Gradient (MNMPG) framework to distill the global hierarchy for delicate reward decomposition. Our method is generally applicable to CTDE methods that use a monotonic mixing network. …

Deep Reinforcement Learning for Safe Landing Site Selection with Concurrent Consideration of Divert Maneuvers

This research proposes a new integrated framework for identifying safe landing locations and planning in-flight divert maneuvers. The proposed framework achieved a 94.8% landing success rate at highly challenging landing sites where over 80% of the area around the initial target landing point is hazardous, by effectively updating the target landing site and feedback control gain during descent. …

Memory based Deep Reinforcement Learning for POMDP

A promising characteristic of Deep Reinforcement Learning (DRL) is its ability to learn an optimal policy in an end-to-end manner without relying on feature engineering. Most approaches assume a fully observable state space, i.e., a fully observable Markov Decision Process (MDP). In real-world robotics, this assumption is impractical because of sensor issues such as limited sensor capacity and sensor noise. …

The Logical Options Framework

The Logical Options Framework (LOF) learns policies that are satisfying, optimal, and composable. LOF efficiently learns policies that satisfy tasks by representing the task as an automaton and integrating it into learning and planning. We evaluate LOF on four tasks in discrete and continuous domains, including a 3D pick-and-place environment. …

Designing Explanations for Group Recommender Systems

Explanations are used in recommender systems for various reasons. Users have to be supported in making (high-quality) decisions more quickly. Explanations are designed to achieve specific goals such as increasing the transparency of a recommendation or increasing a user's trust in the recommender system. …

A CP Net based Qualitative Composition Approach for an IaaS Provider

We propose a novel CP-Net based composition approach to qualitatively select an optimal set of consumers for an IaaS provider. The provider's and consumers' qualitative preferences are captured using CP-Nets. Greedy-based and heuristic-based consumer selection approaches are proposed that effectively reduce the search space of candidates in the composition. …

Directional Bias Amplification

Mitigating bias in machine learning systems requires refining our understanding of bias propagation pathways. A metric for measuring bias amplification was introduced in the seminal work by Zhao et al. We introduce and analyze a new, decoupled metric for measuring bias amplification, $\text{BiasAmp}_{\rightarrow}$ (Directional Bias Amplification). We provide suggestions about its measurement by cautioning against predicting sensitive attributes, encouraging the use of confidence intervals due to fluctuations in the fairness of models across runs, and discussing the limitations of what this metric captures. …

Image Augmentation for Multitask Few Shot Learning Agricultural Domain Use Case

The availability of large datasets is catalyzing a rapid expansion of deep learning in general and computer vision in particular. In many domains, a lack of training data may become an obstacle to the practical application of computer vision techniques. We introduce an image augmentation framework that enables us to enlarge the number of training samples while providing the data for such tasks as object detection, semantic segmentation, instance segmentation, object counting, image denoising, and classification. …

Learning Emergent Discrete Message Communication for Cooperative Reinforcement Learning

Communication is an important factor that enables agents to work cooperatively in multi-agent reinforcement learning (MARL). Most previous work uses continuous communication, whose high representational capacity comes at the expense of interpretability. Allowing agents to learn their own discrete message protocol that emerges from a variety of domains can increase interpretability for human designers and other agents. …

PsiPhi Learning Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

We propose a multi-task inverse reinforcement learning (IRL) algorithm, called inverse temporal difference learning (ITD), that learns shared state features and per-agent successor features. We further show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi\Phi$-learning (pronounced 'Sci-Fi'). We provide empirical evidence for the effectiveness of this method for improving RL, IRL, imitation, and few-shot transfer, and we derive worst-case bounds for its performance in zero-shot transfer to new tasks. …

Relating Reading Visualization and Coding for New Programmers A Neuroimaging Study

Understanding how novices reason about coding at a neurological level has implications for training the next generation of software engineers. All three tasks (coding, prose reading, and mental rotation) are mentally distinct for novice programmers. While those tasks are neurally distinct, we find more significant differences between prose and coding than between mental rotation and coding. …

Safe CPS from Unsafe Controllers

In this paper, we explore using runtime verification to design safe cyber-physical systems (CPS). We build upon the Simplex Architecture, where control authority may switch from an unverified and potentially unsafe advanced controller to a backup baseline controller in order to maintain system safety. …
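The switching idea behind the Simplex Architecture can be illustrated with a toy sketch. This is not the paper's runtime-verification method: the 1D plant, both controllers, the one-step predictor, and the recoverable region `|x| <= 8` are hypothetical choices made only to show how control authority passes from an untrusted controller to a baseline one.

```python
# Toy Simplex-style decision module (all names and numbers are illustrative assumptions).

def advanced(x):
    """Untrusted high-performance controller: always pushes right."""
    return 3.0

def baseline(x):
    """Trusted backup controller: moves toward 0 by at most 2 units."""
    return -max(-2.0, min(2.0, x))

def predict(x, u):
    """One-step model of the plant: position plus control input."""
    return x + u

def is_recoverable(x):
    """Assumed recoverable region from which the baseline can keep |x| <= 10."""
    return abs(x) <= 8.0

def simplex_step(x):
    """Use the advanced command only if the predicted next state is recoverable."""
    u = advanced(x)
    if is_recoverable(predict(x, u)):
        return u, "advanced"
    return baseline(x), "baseline"

def simulate(x0, steps):
    x, trace, modes = x0, [x0], []
    for _ in range(steps):
        u, mode = simplex_step(x)
        x = predict(x, u)
        trace.append(x)
        modes.append(mode)
    return trace, modes

trace, modes = simulate(0.0, 10)
```

In this toy run the advanced controller stays in charge until its next command would leave the assumed recoverable region, at which point the baseline controller takes over for that step.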

Learning Off By One Mistakes An Empirical Study

Mistakes in binary conditions are a source of error in software systems. They happen when developers use, e.g., < instead of <=. These boundary mistakes are hard to find and impose manual, labor-intensive work on software developers. We train different models on approximately 1.6M examples with faults in different boundary conditions. …
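A small illustration of the kind of boundary mistake described above. The function names and the "limit is inclusive" requirement are hypothetical, chosen only to show how a single comparison operator changes behavior at exactly one input.

```python
# Hypothetical requirement: a value is acceptable if it does not exceed the limit
# (the limit itself is allowed).

def within_limit_buggy(value, limit):
    # Off-by-one mistake: '<' wrongly rejects value == limit.
    return value < limit

def within_limit(value, limit):
    # Intended condition: '<=' accepts the boundary value.
    return value <= limit

# The two versions disagree only at the boundary.
disagreements = [v for v in range(0, 12) if within_limit_buggy(v, 10) != within_limit(v, 10)]
```

Because the two versions differ on a single input, tests that never exercise the boundary value will not reveal the fault, which is one reason such mistakes are hard to find.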

Modern Koopman Theory for Dynamical Systems

The field of dynamical systems is being transformed by the mathematical tools emerging from modern computing and data science. Koopman spectral theory has emerged as a dominant perspective over the past decade. This linear representation of nonlinear dynamics has tremendous potential to enable the prediction, estimation, and control of nonlinear systems with standard textbook methods developed for linear systems. …

Hero On the Chaos When PATH Meets Modules

The heterogeneous use of library-referencing modes across Golang projects has caused numerous dependency management issues, incurring reference inconsistencies and even build failures. We reported 280 issues, among which 181 (64.6%) issues have been confirmed, and 160 of them (88.4%) have been fixed or are being fixed. …

Decentralized conjugate gradients with finite step convergence

The decentralized solution of linear systems of equations arises as a subproblem in optimization over networks. Typical examples include the KKT system corresponding to equality constrained quadratic programs in distributed optimization algorithms or in active set methods. This note presents a tailored structure-exploiting decentralized variant of the conjugate gradient method. …
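For orientation, here is a minimal sketch of the standard, centralized conjugate gradient method for a symmetric positive definite system. It is not the decentralized variant from the note; it only shows the iteration that variant builds on and the finite-step property (at most n iterations in exact arithmetic). The matrix and right-hand side are made-up example data.

```python
# Textbook conjugate gradient for A x = b with A symmetric positive definite.
# Pure Python lists are used so the sketch has no dependencies.

def conjugate_gradient(A, b, tol=1e-10):
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the start x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    steps = 0
    for _ in range(n):               # at most n steps in exact arithmetic
        if rs_old ** 0.5 < tol:
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(ri * ri for ri in r)
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
        steps += 1
    return x, steps

A = [[4.0, 1.0], [1.0, 3.0]]         # example SPD matrix (assumption)
b = [1.0, 2.0]
x, steps = conjugate_gradient(A, b)  # exact solution is [1/11, 7/11]
```

Each iteration needs one matrix-vector product and a few inner products, which are the operations a decentralized variant has to distribute across the network.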

Research on False Data Injection Attacks in VSC HVDC Systems

The false data injection (FDI) attack is a crucial form of cyber-physical security problem facing cyber power systems. There is no research revealing the problem of FDI attacks facing voltage source converter based high voltage direct current transmission (VSC-HVDC) systems. Finally, the modified IEEE-14 bus system is used to demonstrate that attackers are capable of disrupting the operation security of converter stations in VSC-HVDC systems by FDI attack strategies. …

Learning to Make Compiler Optimizations More Effective

LoopLearner addresses the problem of compiler instability by predicting which way of writing a loop will lead to efficient compiled code. Applying the transformations that our model deems most favorable prior to compilation yields an average speedup of 1.14x. When trying the top-3 suggested transformations, the average speedup even increases to 1.29x. …

Space Time Codes from Sum Rank Codes

Linearized Reed–Solomon codes can outperform diversity codes based on cyclic division algebras at low SNRs. Simulation results show that the proposed codes outperform full diversity codes. We also provide sequential decoders for these codes and, more generally, space–time codes constructed from finite field codes. …

The INTERSPEECH 2021 Computational Paralinguistics Challenge COVID 19 Cough COVID 19 Speech Escalation Primates

The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four problems for the first time in a research competition under well-defined conditions. We describe the Sub-Challenges, baseline feature extraction, and classifiers based on the ‘usual’ COMPARE and BoAW features as well as deep unsupervised representation learning using the AuDeep toolkit, and deep feature extraction from pre-trained CNNs using the Deep Spectrum toolkit. …

Feature set optimization by clustering univariate association Deep Machine learning omics Wide Association Study DMWAS for Biomarkers discovery as tested on GTEx pilot dataset for death due to heart attack

A clustering-based encoding scheme for structural variations and omics-based analysis. Logistic regression was found to work best for the death due to heart attack (MHHRTATT) phenotypic cause of death. Variant Id P1_M_061510_3_402_P at chromosome 3 and position 192063195 was found to be most highly associated with MHHRTATT. …

A predictive safety filter for learning based racing control

The growing need for high-performance controllers in safety-critical applications like autonomous driving has been motivating the development of formal safety verification techniques. To this end, we provide a principled procedure to compute a safe and invariant set for nonlinear dynamic bicycle models using efficient convex approximation techniques. …

A Quantitative Metric for Privacy Leakage in Federated Learning

In a federated learning system, parameter gradients are shared among participants and the central modulator. The original data never leave the protected source domain. However, the gradient itself might carry enough information for precise inference of the original data. By reporting their parameter gradients to the central server, client datasets are exposed to inference attacks from adversaries. …

The non positive circuit weight problem in parametric graphs a fast solution based on dioid theory

In this paper, we design an algorithm that solves the Non-positive Circuit weight Problem (NCP) on this class of parametric graphs. The proposed algorithm is based on max-plus algebra and formal languages and runs faster than other existing approaches. It achieves strongly polynomial time complexity $\mathcal{O}(n^4)$ (where $n$ is the number of nodes in the graph). …
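To make the max-plus connection concrete, the sketch below checks a fixed (non-parametric) weighted digraph for a positive-weight circuit using max-plus matrix powers. It is a naive illustration, not the paper's algorithm for parametric graphs: in max-plus algebra, entry (i, i) of the k-th power of the weight matrix is the largest weight of a closed walk of length k through node i, so a positive diagonal entry for some k ≤ n reveals a positive circuit. The example matrices are made up.

```python
# Max-plus algebra: "addition" is max, "multiplication" is +, and -inf means "no arc".
NEG_INF = float("-inf")

def maxplus_matmul(A, B):
    """Max-plus product: C[i][j] = max over k of A[i][k] + B[k][j]."""
    n = len(A)
    return [[max(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def has_positive_circuit(W):
    """True if the digraph with arc weights W[i][j] contains a circuit of positive weight."""
    n = len(W)
    P = W                                   # P holds the k-th max-plus power of W
    for _ in range(n):                      # circuits have length at most n
        if any(P[i][i] > 0 for i in range(n)):
            return True
        P = maxplus_matmul(P, W)
    return False

# Two nodes with arcs 0 -> 1 and 1 -> 0 (hypothetical weights).
W_ok = [[NEG_INF, 2.0], [-3.0, NEG_INF]]    # circuit weight 2 - 3 = -1
W_bad = [[NEG_INF, 2.0], [-1.0, NEG_INF]]   # circuit weight 2 - 1 = 1

ok_result = has_positive_circuit(W_ok)
bad_result = has_positive_circuit(W_bad)
```

This naive check uses n matrix products of cost O(n^3) each, so it also takes O(n^4) time, but only for a single fixed choice of weights.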