Investigating Methods to Improve Language Model Integration for Attention based Encoder Decoder ASR Models

Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions . A Bayesian interpretation, as in the hybrid autoregressive transducer (HAT), suggests dividing by the prior of the discriminative acoustic model, which corresponds to this implicit LM . We propose several novel methods to estimate the ILM directly from the AED model .…
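As a rough illustration of what "dividing by the prior" can look like in the log domain, the sketch below combines an AED score with an external LM score and subtracts an ILM estimate. The function names, the weights (`lm_weight`, `ilm_weight`) and the toy scores are assumptions made for this example; they are not taken from the paper, which proposes its own ILM estimators.

```python
def fused_score(log_p_aed, log_p_ext_lm, log_p_ilm, lm_weight=0.3, ilm_weight=0.2):
    """Log-linear combination with ILM correction (illustrative weights).

    Dividing by the ILM prior becomes a subtraction in log space.
    """
    return log_p_aed + lm_weight * log_p_ext_lm - ilm_weight * log_p_ilm


def rescore(hypotheses, lm_weight=0.3, ilm_weight=0.2):
    """Pick the best hypothesis from (text, log_p_aed, log_p_ext_lm, log_p_ilm) tuples."""
    best = max(
        hypotheses,
        key=lambda h: fused_score(h[1], h[2], h[3], lm_weight, ilm_weight),
    )
    return best[0]


# Hypothetical n-best list: the scores are made up for illustration.
nbest = [
    ("hypothesis A", -1.0, -5.0, -0.5),
    ("hypothesis B", -1.2, -2.0, -3.0),
]
print(rescore(nbest))                                 # with LM fusion and ILM correction
print(rescore(nbest, lm_weight=0.0, ilm_weight=0.0))  # AED score only
```

With both weights set to zero the ranking falls back to the AED score alone, which makes it easy to see how much the external LM and the ILM correction change the decision.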

Self Training with Weak Supervision

State-of-the-art deep neural networks require large-scale labeled training data that is often expensive to obtain or not available for many tasks . In this work, we develop a weak supervision framework (ASTRA) that leverages all the available data for a given task .…

Fine Tuning Transformers for Identifying Self Reporting Potential Cases and Symptoms of COVID 19 in Tweets

We describe our straightforward approach for Tasks 5 and 6 of the 2021 Social Media Mining for Health Applications (SMM4H) shared tasks . We explore how much fine-tuning is necessary for classifying tweets as containing self-reported COVID-19 symptoms (Task 5) or whether a tweet related to the virus is self-reporting, non-personal reporting, or a literature/news mention of the virus (Task 6)…

Deep Learning for Prominence Detection in Children’s Read Speech

Expressive reading, considered the defining attribute of oral reading fluency, comprises the prosodic realization of phrasing and prominence . We consider a labeled dataset of children’s reading recordings for the speaker-independent detection of prominent words using acoustic-prosodic and lexico-syntactic features . Deep learning is applied to obtain word-level features from low-level acoustic contours of fundamental frequency, intensity and spectral shape in an end-to-end fashion .…

Machine Translation Decoding beyond Beam Search

Beam search is the go-to method for decoding auto-regressive machine translation models . While it yields consistent improvements in terms of BLEU, it is only concerned with finding outputs with high model likelihood . Our aim is to establish whether beam search can be replaced by a more powerful metric-driven search technique .…
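For readers unfamiliar with the baseline being questioned, here is a minimal sketch of likelihood-driven beam search over a toy auto-regressive model. The scoring function `toy_model`, its vocabulary and its probabilities are invented for this example; a real system would query a translation model instead.

```python
import math


def beam_search(next_log_probs, beam_size=2, max_len=5, eos="</s>"):
    """Return the (sequence, log-probability) pair with the highest model likelihood."""
    beams = [((), 0.0)]   # unfinished hypotheses: (token tuple, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, log_p in next_log_probs(prefix).items():
                candidates.append((prefix + (token,), score + log_p))
        # Keep only the beam_size most likely expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:beam_size]:
            if seq[-1] == eos:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len still compete
    return max(finished, key=lambda c: c[1])


def toy_model(prefix):
    """Hypothetical next-token distribution, given as log-probabilities."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.5, "y": 0.5},
        ("b",): {"z": 0.9, "w": 0.1},
    }
    probs = table.get(prefix, {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy_seq, greedy_score = beam_search(toy_model, beam_size=1)
beam_seq, beam_score = beam_search(toy_model, beam_size=2)
print(greedy_seq, math.exp(greedy_score))
print(beam_seq, math.exp(beam_score))
```

In this toy setup a beam of size 1 (greedy decoding) commits to the locally best first token, while a beam of size 2 recovers a sequence with higher overall likelihood. Note that the search objective is model likelihood only, which is exactly the limitation the abstract points at.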

Better Feature Integration for Named Entity Recognition

Named entity recognition (NER) could benefit from incorporating structured information captured by dependency trees . The Synergized-LSTM model captures how the two types of features interact . The results demonstrate that the proposed model achieves better performance than previous approaches while requiring fewer parameters .…

Building a Swedish Open Domain Conversational Language Model

We present ongoing work on evaluating what is, to our knowledge, the first large generative language model trained to converse in Swedish, using data from the online discussion forum Flashback . We conduct a human evaluation pilot study that indicates the model is often able to respond to conversations in both a human-like and informative manner .…

Factual Probing Is MASK Learning vs Learning to Recall

A novel and efficient method is able to predict an additional 6.4% of facts in the LAMA benchmark . The training data used by these methods contains certain regularities of the underlying fact distribution, and all the existing prompt methods, including ours, are able to exploit them for better fact prediction .…

SuperSim a test set for word similarity and relatedness in Swedish

SuperSim is a large-scale similarity and relatedness test set for Swedish built with expert human judgments . The test set is composed of 1,360 word-pairs independently judged for both relatedness and similarity by five annotators . We evaluate different models (Word2Vec, fastText, and GloVe) trained on two separate datasets (Swedish Gigaword and Swedish Wikipedia dump) to provide a baseline for future comparison .…
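A test set of this kind is typically used by comparing model similarities against the human judgments. The sketch below assumes the common practice of cosine similarity plus Spearman rank correlation; the abstract does not state which measure the authors use, and the toy vectors, words and gold scores are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(embeddings, gold_pairs):
    """Correlate model cosine similarities with human scores for (w1, w2, score) triples."""
    model = [cosine(embeddings[a], embeddings[b]) for a, b, _ in gold_pairs]
    gold = [score for _, _, score in gold_pairs]
    return spearman(model, gold)


# Hypothetical 2-d embeddings and gold scores, for illustration only.
toy_embeddings = {"katt": [1.0, 0.0], "hund": [0.9, 0.1], "bil": [0.0, 1.0], "buss": [0.1, 0.9]}
toy_gold = [("katt", "hund", 9.0), ("katt", "buss", 2.0), ("katt", "bil", 1.0)]
print(evaluate(toy_embeddings, toy_gold))
```

Because similarity and relatedness are judged separately in the test set, the same `evaluate` call would be run once per judgment type.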

Estimating Subjective Crowd Evaluations as an Additional Objective to Improve Natural Language Generation

Human ratings are one of the most prevalent methods to evaluate the performance of natural language processing algorithms . It is common to measure the quality of sentences generated by a natural language generation model using human raters . In this paper, we argue for exploring the use of subjective evaluations within the process of training language generation models in a multi-task learning setting .…

FUDGE Controlled Text Generation With Future Discriminators

Future Discriminators for Generation (FUDGE) is a flexible and modular method for controlled text generation . Given a pre-existing model G for generating text from a distribution of interest, FUDGE enables conditioning on a desired attribute a (for example, formality) while requiring access only to the model’s output logits .…
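One way to read "requiring access only to the model's output logits" is as a Bayes-style reweighting of the next-token distribution: the base log-probabilities from G are shifted by an attribute predictor's log-probability that each candidate continuation will end up having attribute a, then renormalized. The sketch below illustrates that idea under stated assumptions; `fudge_step`, the `strength` parameter and the toy numbers are hypothetical and not the authors' code.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def fudge_step(base_logits, attr_log_probs, strength=1.0):
    """Reweight next-token log-probs by an attribute predictor.

    base_logits:    raw next-token logits from the base model G
    attr_log_probs: log P(a | prefix + token) for each candidate token,
                    produced by a separate (here hypothetical) predictor
    strength:       illustrative knob; 1.0 is the plain Bayes product
    """
    base = log_softmax(base_logits)
    combined = [b + strength * a for b, a in zip(base, attr_log_probs)]
    return log_softmax(combined)  # renormalize to a proper distribution


# Toy example: two candidate tokens, base model undecided,
# attribute predictor strongly prefers the first.
base_logits = [0.0, 0.0]
attr_log_probs = [math.log(0.9), math.log(0.1)]
conditioned = [math.exp(lp) for lp in fudge_step(base_logits, attr_log_probs)]
print(conditioned)
```

The base model is never modified or fine-tuned in this scheme, which is what makes the approach modular: swapping the attribute only requires a different predictor.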

Estimation of Summary to Text Inconsistency by Mismatched Embeddings

The proposed ESTIME, Estimator of Summary-to-Text Inconsistency by Mismatched Embeddings, correlates more strongly with expert scores on the summary-level SummEval dataset than other common evaluation measures . ESTIME is also more sensitive to subtle errors than other common evaluation measures . We also introduce a method of generating subtle factual errors in human summaries .…

Entropoid Based Cryptography

The entropoid Diffie-Hellman problem is hard in Sylow $q$-subquasigroups . We post a conjecture that DEDHP is hard in Sylow $q$-subquasigroups . The entropoid based cryptographic primitives are supposed to be resistant to quantum algorithms . We give a proof-of-concept implementation in SageMath 9.2 for all proposed algorithms and schemes in an appendix .…

Machine checked ZKP for NP relations Formally Verified Security Proofs and Implementations of MPC in the Head

MPC-in-the-Head (MitH) is a general framework that allows constructing efficient Zero Knowledge protocols for general NP-relations from secure multiparty computation (MPC) protocols . In this paper we give the first machine-checked implementation of this transformation . We begin with an EasyCrypt formalization of MitH that preserves the modular structure of MitH and can be instantiated with arbitrary MPC protocols that satisfy standard notions of security .…

Exploring the Attack Surface of WebSocket

WebSocket is a newer type of communications protocol that is faster and more efficient than previous communication protocols . Its security has been discussed, and the security of any technology has always been a challenge . In this article, we examine the structure of WebSocket and the security problems that can occur in it, so that it can be chosen and used as an alternative to HTTP .…

Ethereum Name Service the Good the Bad and the Ugly

DNS has been criticized for its inherent design flaws, making the system vulnerable to various kinds of attacks . DNS domain names are not fully controlled by the users, and can easily be taken down by the authorities and registrars . Since blockchain has its unique properties like immutability and decentralization, it seems to be promising to build a decentralized name service on blockchain .…

Cybersecurity in Smart Farming Canada Market Research

The Cyber Science Lab (CSL) and Smart Cyber-Physical System (SCPS) Lab at the University of Guelph conduct a market study of cybersecurity technology adoption and requirements for smart and precision farming in Canada . We conducted 17 stakeholder/key opinion leader interviews in Canada and the USA, to complete this study .…

LocalViT Bringing Locality to Vision Transformers

We study how to introduce locality mechanisms into vision transformers . Locally-enhanced transformers outperform the baselines DeiT-T and PVT-t by 2.6% and 3.1% with a negligible increase in the number of parameters and effort . The same mechanism was applied to 4 vision transformers, which shows the generalization of the locality concept .…

Towards Efficient Graph Convolutional Networks for Point Cloud Handling

In this paper, we aim at improving the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds . The optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed . Code will be available at https://github.com/ofsoundof/EfficientGCN.git .…

Image Level or Object Level A Tale of Two Resampling Strategies for Long Tailed Detection

Training on datasets with long-tailed distributions has been challenging for major recognition tasks such as classification and detection . We show that image-level and object-level resamplings are both important, and thus unify them with a joint resampling strategy (RIO) . Our method outperforms state-of-the-art long-tailed detection and segmentation methods on LVIS v0.5 across various backbones .…

Action Conditioned 3D Human Motion Synthesis with Transformer VAE

We tackle the problem of action-conditioned generation of realistic and diverse human motion sequences . In contrast to methods that complete, or extend, motion sequences, this task does not require an initial pose or sequence . We learn an action-aware latent representation for human motions by training a generative variational autoencoder (VAE) . We evaluate our approach on NTU RGB+D, HumanAct12 and UESTC datasets and show improvements over the state of the art .…

Learning Robust Visual semantic Mapping for Zero shot Learning

Zero-shot learning (ZSL) aims at recognizing unseen class examples (e.g., images) with knowledge transferred from seen classes . In ZSL, the common practice is to train a mapping function between the visual and semantic feature spaces with labeled seen class examples . We focus on fully empowering the semantic feature space, which is one of the key building blocks of ZSL .…

View Guided Point Cloud Completion

This paper presents a view-guided solution for the task of point cloud completion . ViPC (view-guided point cloud completion) takes the missing crucial global structure information from an extra single-view image . By leveraging a framework that sequentially performs effective cross-modality and cross-level fusions, our method achieves significantly superior results over typical existing solutions .…

GAttANet Global attention agreement for convolutional neural networks

Transformer attention architectures, similar to those developed for natural language processing, have recently proved efficient also in vision . We report experiments with a simple such attention system that can improve the performance of standard convolutional networks, with relatively few additional parameters . We demonstrate the usefulness of this network (GAttANet) for various convolutional backbones (from a simple 5-layer toy model to a standard ResNet50 architecture) and datasets (CIFAR10, CIFAR100, Imagenet-1k) . Each time, our global attention system improves accuracy over the corresponding baseline .…