Video-based cattle identification and action recognition

Deep learning models have been developed and tested with videos acquired on a farm. An accuracy of 84.4% has been achieved for the detection of drinking events, and 94.4% for grazing events. We demonstrate a working prototype for monitoring cow welfare by analysing the behaviours of individual animals to enable automated farm provenance. …

P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts

P-Adapters are lightweight models that sit between the embedding layer and first attention layer of Large Language Models (LLMs). They take LLM embeddings as input and output continuous prompts that are used to query the LLM. They show between 12-26% absolute improvement in precision and 36-50% absolute improvement in consistency over a baseline of only using natural language queries. …

WMDecompose: A Framework for Leveraging the Interpretable Properties of Word Mover's Distance in Sociocultural Analysis

WMDecompose is a model and Python library that decomposes document-level distances into their constituent word-level distances. It then clusters words to induce thematic elements, such that useful lexical information is retained and summarized for analysis. We apply it to a longitudinal social media corpus to explore the interrelationship between conspiracy theories and conservative American discourses. …
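
As a rough illustration of decomposing a document-level distance into word-level contributions, the sketch below uses a relaxed Word Mover's Distance, in which each word moves all of its mass to its nearest neighbour in the other document, rather than the full optimal-transport solution. It does not use the WMDecompose library's API. The toy 2-D embeddings and the function name `relaxed_wmd_contributions` are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical 2-D word embeddings, invented for illustration only.
EMBEDDINGS = {
    "election": (1.0, 0.0),
    "vote": (0.9, 0.1),
    "fraud": (0.0, 1.0),
    "hoax": (0.1, 0.9),
    "media": (0.5, 0.5),
}

def relaxed_wmd_contributions(source, target, embeddings=EMBEDDINGS):
    """Return {word: cost} for moving each source word to its nearest target word.

    Relaxed WMD (an assumption of this sketch): every source word sends all
    of its normalized bag-of-words mass to the closest target word.
    """
    counts = Counter(source)
    total = sum(counts.values())
    contributions = {}
    for word, count in counts.items():
        nearest = min(
            math.dist(embeddings[word], embeddings[t]) for t in set(target)
        )
        contributions[word] = (count / total) * nearest
    return contributions

doc_a = ["election", "fraud", "fraud"]
doc_b = ["vote", "media"]
per_word = relaxed_wmd_contributions(doc_a, doc_b)
# The document-level distance is the sum of the word-level contributions.
document_distance = sum(per_word.values())
```

Words with the largest contributions (here `fraud`) are the ones that drive the two documents apart; clustering such words into themes would be a separate step that this sketch does not cover.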

Compressibility of Distributed Document Representations

We propose CoRe, a straightforward, representation learner-agnostic framework suitable for representation compression. CoRe’s performance was studied on a collection of 17 real-life corpora from biomedical, news, social media, and literary domains. We explored the behavior when considering contextual and non-contextual document representations, different compression levels, and 9 different compression algorithms. …

Fast Data Series Indexing for In-Memory Data

MESSI is the first data series index designed for in-memory operation on modern hardware. It is up to 4x faster at index construction and 11x faster than the state-of-the-art parallel approach. It can be used to answer exact similarity search queries on 100GB datasets in 50 msec (30-75 msec across diverse datasets). It enables real-time, interactive data exploration on very large data series collections, the authors say. …

Possibilistic Fuzzy Local Information C-Means with Automated Feature Selection for Seafloor Segmentation

The Possibilistic Fuzzy Local Information C-Means (PFLICM) method is presented as a technique to segment side-look synthetic aperture sonar (SAS) imagery into distinct regions of the sea-floor. The chosen features and resulting segmentation from the image will be assessed based on a select quantitative clustering validity criterion, and a subset of the features that reach a desired threshold will be used for the segmentation process. …

Symbolic Knowledge Distillation from General Language Models to Commonsense Models

The common practice for training commonsense models has gone from-human-to-corpus-to-machine: humans author commonsense knowledge graphs in order to train models. In this work, we investigate an alternative, from-machine-to-corpus-to-machine: general language models author these commonsense knowledge graphs. We also distill only one aspect, the commonsense of a general language model teacher, allowing the student to be a different type, a commonsense model. …

Learning Temporal 3D Human Pose Estimation with Pseudo-Labels

We present a simple, yet effective, approach for self-supervised 3D human pose estimation. During training, we rely on triangulating 2D body pose estimates of a multiple-view camera system. A temporal convolutional neural network is trained with the generated 3D ground-truth and the geometric multi-view consistency loss, imposing geometrical constraints on the predicted 3D body skeleton. …
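
The triangulation step can be sketched with the simplest two-view case. The example below is a minimal illustration, not the paper's pipeline: it assumes each camera already yields a viewing ray (camera centre plus direction) for a detected joint, and it uses midpoint triangulation as a stand-in for whatever multi-view method the authors use. The function name and the camera positions are hypothetical.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    w = [a - b for a, b in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    # Ray parameters of the two mutually closest points.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(x + y) / 2 for x, y in zip(p1, p2)]

# Two hypothetical cameras on the x-axis, both looking at a joint at (0, 0, 5).
joint = triangulate_midpoint(
    (-1.0, 0.0, 0.0), (1.0, 0.0, 5.0),
    (1.0, 0.0, 0.0), (-1.0, 0.0, 5.0),
)
```

With noisy 2D detections the two rays no longer intersect, and the midpoint serves as a simple estimate of the 3D joint position that could act as a pseudo label.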

DeepSSM: A Blueprint for Image-to-Shape Deep Learning Models

Statistical shape modeling (SSM) characterizes anatomical variations in a population of shapes generated from medical images. SSM requires consistent shape representation across samples in a shape cohort. These shape representations are then used to extract low-dimensional shape descriptors that facilitate subsequent analyses in different applications. …

Rethinking Point Cloud Filtering: A Non-Local Position Based Approach

Existing position based point cloud filtering methods can hardly preserve sharp geometric features. We propose a novel position based approach for feature-preserving point cloud filtering. Unlike normal based techniques, our method does not require normal information. The core idea is to first design a similarity metric to search the non-local similar patches of a queried local patch. …

Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Modern audio-visual datasets contain biases that undermine the real-world performance of models trained on that data. We introduce Spoken ObjectNet to remove some of these biases. This dataset expands upon ObjectNet, which is a bias-controlled image dataset. We detail our data collection pipeline, which features several methods to improve caption quality, including automated language model checks. …

Bugs in our Pockets: The Risks of Client-Side Scanning

Some in industry and government now advocate a new technology to access targeted data: client-side scanning (CSS). CSS would enable on-device analysis of data in the clear. CSS by its nature creates serious security and privacy risks for all society, while the assistance it can provide for law enforcement is at best problematic, the authors say. …

BI-RADS BERT Using Section Tokenization to Understand Radiology Reports

Domain specific contextual word embeddings have been shown to achieve impressive accuracy at natural language processing tasks in medicine. Radiology reports are the main form of communication between radiologists and clinicians, and contain important information for patient care. We then evaluated whether using section tokenization improved the downstream extraction of the following fields: modality/procedure, previous cancer, menopausal status, purpose of exam, breast density and background parenchymal enhancement. …

Improving the Robustness to Variations of Objects and Instructions with a Neuro-Symbolic Approach for Interactive Instruction Following

An interactive instruction following task has been proposed as a benchmark for learning to map natural language instructions and first-person vision into sequences of actions to interact with objects in a 3D simulated environment. We assume that this problem is due to the high sensitivity of neural feature extraction to small changes in vision and language inputs. …

Can Explanations Be Useful for Calibrating Black Box Models?

This work aims to improve a black box model’s performance on a new domain given examples from the new domain. We show that the calibration features transfer to some extent between tasks and shed light on how to effectively use them. We experiment with our method on two tasks, extractive question answering and natural language inference, covering adaptation from several pairs of domains. …

LAGr: Labeling Aligned Graphs for Improving Systematic Generalization in Semantic Parsing

LAGr produces semantic parses by predicting node and edge labels for a complete multi-layer input-aligned graph. The strongly-supervised algorithm requires aligned graphs as inputs, whereas the weakly-supervised algorithm infers alignments for originally unaligned target graphs using an approximate MAP inference procedure. On the COGS and CFQ compositional generalization benchmarks the strongly- and weakly-supervised LAGr algorithms achieve significant improvements upon the baseline seq2seq parsers. …

A Survey on Deep Learning for Skeleton-Based Human Animation

Human character animation is critical in entertainment content production, including video games, virtual reality or fiction films. Deep neural networks drive most recent advances through deep learning and deep reinforcement learning. In this article, we propose a comprehensive survey on the state-of-the-art approaches based on either deep learning or deep reinforcement learning in skeleton-based human character animation. …

Retrieval-guided Counterfactual Generation for QA

Deep NLP models have been shown to learn spurious correlations, leaving them brittle to input perturbations. We develop a Retrieve-Generate-Filter technique to create counterfactual evaluation and training data with minimal human supervision. Using an open-domain QA framework and question generation model trained on original task data, we create counterfactuals that are fluent, semantically diverse, and automatically labeled. …

NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild

NeRS learns a neural shape representation of a closed surface that is diffeomorphic to a sphere, guaranteeing water-tight reconstructions. Surface parameterizations allow NeRS to learn (neural) bidirectional surface reflectance functions (BRDFs) that factorize view-dependent appearance into environmental illumination, diffuse color (albedo), and specular “shininess”. The project page with code and visualizations can be found at https://jasonyzhang.com/ners …

HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media

We introduce HUMAN4D, a large and multimodal 4D dataset that contains a variety of human activities simultaneously captured by a professional marker-based MoCap, a volumetric capture and an audio recording system. By capturing 2 female and 2 male professional actors performing various full-body movements and expressions, we provide a diverse set of motions and poses encountered as part of single- and multi-person daily, physical and social activities. …

Solving Large Break Minimization Problems in a Mirrored Double Round-robin Tournament Using Quantum Annealing

Quantum annealing (QA) has gained considerable attention because it can be applied to combinatorial optimization problems. In recent years, research on solving practical combinatorial optimization problems using QA has accelerated. In our study, we determine that QA demonstrates better performance than the solvers in the break minimization problem in a mirrored double round-robin tournament (MDRRT). We also explain the desirable performance of QA for the sparse interaction between variables and a problem without constraints. …

Semi-supervised Multi-task Learning for Semantics and Depth

Multi-Task Learning (MTL) aims to enhance the model generalization by sharing representations between related tasks for better performance. Typical MTL methods are jointly trained with the complete multitude of ground-truths for all tasks simultaneously. However, one single dataset may not contain the annotations for each task of interest. …

The Irrationality of Neural Rationale Models

Neural rationale models are popular for interpretable predictions of NLP tasks. In these, a selector extracts segments of the input text, called rationales, and passes these segments to a classifier for prediction. We call for more rigorous evaluations of these models to ensure desired properties of interpretability are achieved. …
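
The selector-then-classifier structure described above can be made concrete with a toy sketch. Everything in it is hypothetical: real rationale models use learned neural selectors and classifiers, whereas here both are hand-written word lists, chosen only to show that the classifier sees nothing but the extracted rationale.

```python
# Hypothetical lexicons standing in for learned components.
SALIENT = {"great", "good", "awful", "boring"}
POSITIVE = {"great", "good"}

def select_rationale(tokens):
    """Selector: keep only the tokens judged salient."""
    return [tok for tok in tokens if tok in SALIENT]

def classify(rationale):
    """Classifier: predict a label from the rationale alone."""
    score = sum(1 if tok in POSITIVE else -1 for tok in rationale)
    return "positive" if score > 0 else "negative"

tokens = "the plot was boring but the acting was great great".split()
rationale = select_rationale(tokens)
label = classify(rationale)
```

Because `classify` never receives the full input, any information it uses must pass through the rationale, which is the property that evaluations of such models would need to probe.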

An Empirical Investigation of Multi-bridge Multilingual NMT models

In this paper, we present an extensive investigation of multi-bridge, many-to-many multilingual NMT models (MB-M2M), i.e., models trained on non-English language pairs in addition to English-centric language pairs. In addition to validating previous work which shows that MB-M2M models can overcome zero-shot translation problems, our analysis reveals the following results about multi-bridge models: (1) it is possible to extract a reasonable amount of parallel corpora between non-English languages. …

WebAssembly enables low latency interoperable augmented and virtual reality software

WebAssembly (Wasm) offers a promising developer solution that can bring near-native low latency performance to web-based applications. It enables hardware-agnostic interoperability at scale through portable bytecode that runs on any WiFi or cellular data network-enabled AR/VR device. Wasm resolves critical issues faced with just-in-time (JIT) compilation, slow run-times, large file sizes and big data, among other challenges. …

A Dual-Attention Neural Network for Pun Location and Using Pun-Gloss Pairs for Interpretation

Pun location is the task of identifying the punning word (usually a word or phrase that makes the text ambiguous) in a given short text. Pun interpretation is the task of finding the two different meanings of that word. DANN (Dual-Attentive Neural Network) is proposed for pun location; it effectively integrates word senses and pronunciation with context information to address two kinds of puns at the same time. …

The Neural MMO Platform for Massively Multiagent Research

Neural MMO is a computationally accessible research platform that combines large agent populations, long time horizons, open-ended tasks, and modular game systems. We present Neural MMO as free and open-source software with active support, ongoing development, documentation, and training, logging, and visualization tools to help users adapt to the new setting. …

Shortened Polarization Kernels

A shortening method for large polarization kernels is presented. It uses lower and upper bounds on partial distances for quick elimination of unsuitable shortening patterns. The proposed algorithm is applied to some kernels of sizes 16 and 32 to obtain shortened kernels. …

Brittle interpretations: The Vulnerability of TCAV and Other Concept-based Explainability Tools to Adversarial Attack

Methods for model explainability have become increasingly critical for testing the fairness and soundness of deep learning. In safety-critical applications, there is a need for security around not only the machine learning pipeline but also the model interpretation process. We show that by perturbing the examples of the concept that is being investigated, we can radically change the output of the interpretability method, e.g. …

On the Pitfalls of Analyzing Individual Neurons in Language Models

Many studies have shown that linguistic information is encoded in hidden word representations. Few have studied individual neurons to show how and in which neurons it is encoded. The common approach is to use an external probe to rank neurons according to their relevance to some linguistic attribute, and to evaluate the obtained ranking using the same probe that produced it. …