Automated Graph Learning (AutoGL) is the first library for automated machine learning on graphs. AutoGL is open-source, easy to use, and flexible enough to be extended. We propose a machine learning pipeline for graph data containing four modules: auto feature engineering, model training, hyper-parameter optimization, and auto ensemble.…

## Sublinear Time Nearest Neighbor Search over Generalized Weighted Manhattan Distance

Nearest Neighbor Search (NNS) over generalized weighted distance is fundamental to a wide range of applications. The Manhattan distance could be more practical than the Euclidean distance for high-dimensional NNS. We propose two novel sublinear time hashing schemes ($d_w^{l_1},l_2$)-ALSH…
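For orientation, the search problem can be sketched as a linear scan. This is a minimal baseline, assuming the generalized weighted Manhattan distance has the form $\sum_i |w_i (o_i - q_i)|$ with a weight vector supplied at query time. The function names are illustrative, and this is the brute-force approach that sublinear schemes aim to beat, not the ALSH schemes themselves.

```python
from typing import Sequence, Tuple


def weighted_manhattan(o: Sequence[float], q: Sequence[float], w: Sequence[float]) -> float:
    """Assumed form of the distance: sum over dimensions of |w_i * (o_i - q_i)|."""
    if not (len(o) == len(q) == len(w)):
        raise ValueError("point, query, and weight vectors must have equal length")
    return sum(abs(wi * (oi - qi)) for oi, qi, wi in zip(o, q, w))


def linear_scan_nn(points: Sequence[Sequence[float]], q: Sequence[float],
                   w: Sequence[float]) -> Tuple[int, float]:
    """Return (index, distance) of the nearest point; cost is linear in len(points)."""
    if not points:
        raise ValueError("points must be non-empty")
    best = min(range(len(points)), key=lambda i: weighted_manhattan(points[i], q, w))
    return best, weighted_manhattan(points[best], q, w)
```

Because the weights arrive with the query, an index built for one fixed metric does not apply directly, which is the motivation for query-aware hashing.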

## Supervised Feature Selection Techniques in Network Intrusion Detection a Critical Review

Machine Learning (ML) techniques are becoming an invaluable support for network intrusion detection, especially in revealing anomalous flows, which often hide cyber-threats. Feature Selection (FS) is a crucial pre-processing step in network management and, specifically, for the purposes of network intrusion detection.…

## ODT FLOW A Scalable Platform for Extracting Analyzing and Sharing Multi source Multi scale Human Mobility

In response to the soaring need for human mobility data, we develop a scalable online platform for extracting, analyzing, and sharing multi-source multi-scale human mobility flows. Within the platform, an origin-destination-time (ODT) data model is proposed to work with scalable query engines to handle heterogeneous mobility data in large volumes.…

## Shuffler A Large Scale Data Management Tool for ML in Computer Vision

Shuffler is an open source tool that makes it easy to manage large computer vision datasets. It stores annotations in a relational, human-readable database. It defines over 40 data handling operations with annotations that are commonly useful in supervised learning applied to computer vision.…

## Simple Optimal Algorithms for Random Sampling Without Replacement

We construct algorithms that are even simpler, easier to implement, and have optimal space and time complexity. Consider the fundamental problem of drawing a simple random sample of size k without replacement from [n] := {1, ..., n}.…
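To make the problem concrete, here is a minimal sketch of one classical solution, Floyd's sampling algorithm, which draws a uniform k-subset of {1, ..., n} using k random draws and O(k) space. It is shown as a well-known reference point, not as the construction proposed in the paper; the function name and the `rng` parameter are illustrative.

```python
import random
from typing import Optional, Set


def floyd_sample(n: int, k: int, rng: Optional[random.Random] = None) -> Set[int]:
    """Return a uniformly random k-subset of {1, ..., n} (Floyd's algorithm)."""
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    rng = rng or random.Random()
    chosen: Set[int] = set()
    for j in range(n - k + 1, n + 1):
        t = rng.randint(1, j)  # uniform on {1, ..., j}, both ends inclusive
        # If t was already picked, take j instead; j cannot be in the set yet.
        chosen.add(j if t in chosen else t)
    return chosen
```

Each loop iteration adds exactly one new element, so the result always has size k without any rejection loop.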

## A Novel Spatial Temporal Specification Based Monitoring System for Smart Cities

With the development of the Internet of Things, millions of sensors are being deployed in cities to collect real-time data. This leads to a need for checking city states against city requirements at runtime. In this paper, we develop a novel spatial-temporal specification-based monitoring system for smart cities.…

## Velocity Skinning for Real time Stylized Skeletal Animation

We propose a simple, real-time solution for adding secondary animation effects on top of standard skinning. Our method takes a standard skeleton animation as input, along with skin mesh and rig weights. It then derives per-vertex deformations from the different linear and angular velocities along the skeletal hierarchy.…

## Compressive Neural Representations of Volumetric Scalar Fields

We present an approach for compressing volumetric scalar fields using implicit neural representations. Our approach represents a scalar field as a learned function, wherein a neural network maps a point in the domain to an output scalar value. By setting the number of weights of the neural network to be smaller than the input size, we achieve compressed representations of scalar fields.…

## Dissecting the square into seven congruent parts

We give a computer-based proof of the following fact: If a square is tiled by seven convex tiles which are congruent among themselves, then the tiles are rectangles. This confirms a new case of a conjecture posed by Yuen, Zamfirescu.…

## Disentangling Semantics and Syntax in Sentence Embeddings with Pre trained Language Models

Paraphrase pairs offer an effective way of learning the distinction between semantics and syntax, as they naturally share semantics and often vary in syntax. ParaBART is trained to perform syntax-guided paraphrasing, based on a source sentence that shares semantics with the target paraphrase, and a parse tree that specifies the target syntax.…

## A Deep Learning Based Cost Model for Automatic Code Optimization

A novel deep learning based cost model for automatic code optimization has been proposed in the Tiramisu compiler. The proposed model has a mean absolute percentage error of only 16% in predicting speedups on full programs. Unlike previous models, the proposed one does not rely on any heavy feature engineering.…
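The accuracy figure above is a mean absolute percentage error (MAPE). As a minimal sketch of how that metric is computed over measured and predicted speedups (the function name and the sample numbers are illustrative, not data from the paper):

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent, of predictions against measurements."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when a measured value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)
```

For example, measured speedups of 2.0 and 4.0 with predictions of 1.0 and 5.0 give relative errors of 50% and 25%, so the MAPE is 37.5%.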

## MIPT NSU UTMN at SemEval 2021 Task 5 Ensembling Learning with Pre trained Language Models for Toxic Spans Detection

This paper describes our system for SemEval-2021 Task 5 on Toxic Spans Detection. We developed ensemble models using BERT-based neural architectures and post-processing to combine tokens into spans. Our system obtained an F1-score of 67.55% on test data.…

## A Preliminary Model for the Design of Music Visualizations

Music Visualization is basically the transformation of data from the aural to the visual space. There are a variety of music visualizations, across applications, present on the web. Models of Visualization include conceptual frameworks helpful for designing, understanding and making sense of visualizations.…

## Jamming Resilient Path Planning for Multiple UAVs via Deep Reinforcement Learning

Unmanned aerial vehicles (UAVs) are expected to be an integral part of wireless networks. In this paper, we aim to find collision-free paths for multiple cellular-connected UAVs, while satisfying requirements of connectivity with ground base stations in the presence of a dynamic jammer.…

## Unsupervised Learning of Explainable Parse Trees for Improved Generalisation

Recent RvNN-based models fail to learn simple grammar and meaningful semantics in their intermediate tree representation. In this work, we propose an attention mechanism over Tree-LSTMs to learn more meaningful and explainable parse tree structures. We also demonstrate the superior performance of our proposed model on natural language inference, semantic relatedness, and sentiment analysis tasks.…

## Multiple Run Ensemble Learning with Low Dimensional Knowledge Graph Embeddings

Link prediction using knowledge graph embedding (KGE) models has gained significant attention for knowledge graph completion. In this paper, we propose a simple but effective performance boosting strategy for KGE models by using multiple low dimensions in different rounds of the same model.…

## NorDial A Preliminary Corpus of Written Norwegian Dialect Use

Norway has a large amount of dialectal variation, as well as a general tolerance to its use in the public sphere. There are, however, few available resources to study this variation and its change over time and in more informal areas, e.g., on social media.…

## The structure of online social networks modulates the rate of lexical change

New words are regularly introduced to communities, yet not all of these words remain in a community’s lexicon. Dense connections, the lack of local clusters and more external contacts promote lexical innovation and retention. Unlike offline communities, topic-based communities do not experience strong lexical levelling despite increased contact but accommodate more niche words.…

## PPT Multicore Performance Prediction of OpenMP applications using Reuse Profiles and Analytical Modeling

PPT-Multicore builds upon our previous work towards a multicore cache model. We extract an LLVM basic-block-labeled memory trace using an architecture-independent LLVM-based instrumentation tool only once in an application’s lifetime. The model uses the memory trace and other parameters from an instrumented sequentially executed binary.…

## The algebraic structure of the densification and the sparsification tasks for CSPs

The densification and the sparsification of CSPs were formally defined as computational tasks. We show that in the Boolean case, $\Sigma$ is of polynomial size if and only if $\Gamma$ is of bounded width. We give a complete classification of constraint languages over the Boolean domain for which the densification problem is tractable.…

## WEC Deriving a Large scale Cross document Event Coreference dataset from Wikipedia

Cross-document event coreference resolution is a foundational task for NLP applications involving multi-text processing. Existing corpora for this task are scarce and relatively small, while annotating only modest-size clusters of documents belonging to the same topic. We present an efficient methodology for gathering a large-scale dataset for cross-document coreference from Wikipedia, where coreference links are restricted within predefined topics.…

## A Graph Convolutional Neural Network based Framework for Estimating Future Citations Count of Research Articles

Scientific publications play a vital role in the career of a researcher. Some articles become more popular than others among the research community. One of the signs of popular articles is the number of citations an article receives.…

## TedNet A Pytorch Toolkit for Tensor Decomposition Networks

TedNet is based on the PyTorch framework, to give more researchers a flexible way to exploit TDNs. TedNet implements 5 kinds of tensor decomposition (i.e., CANDECOMP/PARAFAC (CP), Block-Term Tucker (BT), Tucker-2, Tensor Train (TT), and Tensor Ring (TR)) on traditional deep neural layers, the convolutional layer and the fully-connected layer.…

## Classical quantum network coding a story about tensor

Kobayashi et al. showed how to convert any network coding protocol into a quantum coding protocol. They left open whether the existence of a quantum network coding protocol implies the existence of a classical one. We characterize the set of distribution tasks achievable with non-zero probability for both classical and quantum networks.…

## Analyzing Thermal Buckling in Curvilinearly Stiffened Composite Plates with Arbitrary Shaped Cutouts Using Isogeometric Level Set Method

In this paper we develop a new simple and effective isogeometric analysis for modeling thermal buckling of stiffened laminated composite plates with cutouts. We employ a first order shear deformation theory to approximate the displacement field of the stiffeners and the plate.…

## A Hybrid Parallelization Approach for Distributed and Scalable Deep Learning

Deep Neural Networks (DNNs) have recorded great success in handling medical and other complex classification tasks. As the sizes of a DNN model and the available dataset increase, the training process becomes more computationally intensive. We have proposed a generic full end-to-end hybrid parallelization approach combining both model and data parallelism for efficiently distributed and scalable training of DNN models.…

## Learning representations with end to end models for improved remaining useful life prognostics

The Remaining Useful Life (RUL) of equipment is defined as the duration between the current time and its failure. An accurate and reliable prognostic of the remaining useful life provides decision-makers with valuable information. We propose an end-to-end deep learning model based on multi-layer perceptron and long short-term memory (LSTM) layers to predict the RUL.…

## Graph Streaming Lower Bounds for Parameter Estimation and Property Testing via a Streaming XOR Lemma

We study space-pass tradeoffs in graph streaming algorithms for parameter estimation and property testing problems. For many problems of interest, including all the above, obtaining a $(1+\epsilon)$-approximation requires either $n^{\Omega(1)}$ space or $o(\log(1/\epsilon))$ passes. This bound matches those of existing algorithms and is thus (asymptotically) optimal.…

## Research on Optimization Method of Multi scale Fish Target Fast Detection Network

The fish target detection algorithm lacks a good quality data set, and the algorithm achieves real-time detection with lower power consumption on embedded devices. The experiment uses depthwise convolution to redesign the backbone of the YOLOv4 network, which reduces the amount of calculation by 94.1%, and the test accuracy is 92.34%.…

## The Many Faces of 1 Lipschitz Neural Networks

Lipschitz constrained models have been used to solve specific deep learning problems such as the estimation of Wasserstein distance for GAN, or the training of neural networks robust to adversarial attacks. Despite being empirically harder to train, they are theoretically better grounded than unconstrained ones when it comes to classification.…

## ALT MAS A Data Efficient Framework for Active Testing of Machine Learning Algorithms

Machine learning models are being used extensively in many important areas, but there is no guarantee a model will always perform well. Understanding the correctness of a model is crucial to prevent potential failures that may have significant detrimental impact in critical application areas.…

## Q matrix Unaware Double JPEG Detection using DCT Domain Deep BiLSTM Network

Double JPEG compression detection has received much attention in recent years due to its applicability as a forensic tool for the most widely used JPEG file format. Existing state-of-the-art CNN-based methods either use histograms of all the frequencies or rely on heuristics to select histograms of specific low frequencies to classify single and double compressed images.…

## Memory Capacity of Neural Turing Machines with Matrix Representation

Matrix neural networks feature matrix representation which preserves the spatial structure of data. MatNTMs have the potential to provide better memory structures when compared to canonical neural networks that use vector representation. The upper bound on memory capacity is shown to be $N^2$ for an $N\times N$ state matrix.…

## Robust Image Watermarking in Wavelet Domain using GBT DWT SVD and Whale Optimization Algorithm

Because digital content can be copied easily, copyright infringement has become a concern nowadays. One of the most common methods to solve this problem is watermarking. In this method, a logo belonging to the owner of the media is embedded in the media.…

## Simple Majority Consensus in Networks with Unreliable Communication

In this work, we analyze the performance of a simple majority-rule protocol solving a fundamental coordination problem in distributed systems. We prove that the Simple Majority Protocol (SMP) reaches consensus in only three communication rounds with probability approaching $1$ as $n$ grows to infinity.…

## The Cardan grille approach to the Voynich MS taken to the next level

The Voynich MS is an illustrated 15th century manuscript, whose text is written in an unknown alphabet. In 2004 Gordon Rugg published a paper in which he proposed that this text is likely to be meaningless, and could have been composed by an alternative application of a so-called Cardan Grille.…

## Fast Linking of Mathematical Wikidata Entities in Wikipedia Articles Using Annotation Recommendation

Mathematical information retrieval (MathIR) applications such as semantic formula search and question answering systems rely on knowledge-bases that link mathematical expressions to their natural language names. For database population, mathematical formulae need to be annotated and linked to semantic concepts, which is very time-consuming.…

## Application specific dataflow machine construction for programming FPGAs via Lucent

Field Programmable Gate Arrays (FPGAs) have the potential to accelerate specific HPC codes. With the advent of High Level Synthesis (HLS), FPGA programmers can write code in C or C++. We argue that languages built upon dataflow principles should be exploited to enable fast-by-construction codes for FPGAs.…

## Fine Grained Attention for Weakly Supervised Object Localization

Recent advances in deep learning accelerated an improvement in the supervised object localization task. We propose a novel residual fine-grained attention (RFGA) module that autonomously excites the less activated regions of an object. Unlike other attention-based WSOL methods that learn a coarse attention map, our proposed RFGA learns fine-grained values in an attention map by assigning different attention values for each of the elements.…

## How Should Network Slice Instances be Provided to Multiple Use Cases of a Single Vertical Industry

There are a large number of vertical industries, such as automobile, manufacturing, and power grid, implementing multiple use cases, each use case characterized by diverging service, network, and connectivity requirements. Such heterogeneity cannot be effectively managed and efficiently mapped onto a single type of network slice instance (NSI). Both approaches tackle the same technical issue of provisioning, management, and orchestration of per vertical per use case NSIs in order to improve resource allocation and enhance network performance.…

## Secure Cognitive Radio Communication via Intelligent Reflecting Surface

In this paper, an intelligent reflecting surface (IRS) assisted spectrum sharing underlay cognitive radio (CR) wiretap channel (WTC) is studied. We aim at enhancing the secrecy rate of the secondary user in this channel subject to a total power constraint at the secondary transmitter (ST), an interference power constraint (IPC) at the primary receiver (PR) and a unit modulus constraint at the IRS.…

## Load Balancing with Dynamic Set of Balls and Bins

In dynamic load balancing, we wish to distribute balls into bins in an environment where both balls and bins can be added and removed. We want to minimize the number of balls and bins affected when adding or removing a ball or a bin.…
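The goal of limiting how many balls move when the set of bins changes can be illustrated with rendezvous (highest-random-weight) hashing, a well-known technique shown here only as a point of comparison, not as the scheme proposed in the paper. The sketch below assumes balls and bins are identified by strings; the function names are illustrative.

```python
import hashlib
from typing import Iterable


def _score(ball: str, bin_id: str) -> int:
    """Deterministic pseudo-random weight for a (bin, ball) pair."""
    digest = hashlib.sha256(f"{bin_id}|{ball}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign(ball: str, bins: Iterable[str]) -> str:
    """Place the ball in the bin with the highest weight (ties broken by bin id)."""
    bins = list(bins)
    if not bins:
        raise ValueError("at least one bin is required")
    return max(bins, key=lambda b: (_score(ball, b), b))
```

Removing a bin relocates only the balls that were stored in it, and adding a bin moves balls only into the new bin, because every other ball's highest-weight bin is unchanged.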

## Learning the CSI Denoising and Feedback Without Supervision

The biggest challenge is the overhead incurred when the mobile terminal has to send the downlink channel state information or corresponding partial information to the base station. We propose a novel learning-based framework for denoising and compression of channel estimates.…

## On the Accuracy of Deterministic Models for Viral Spread on Networks

We consider the emergent behavior of viral spread when agents interact with each other over a contact network. When the number of agents is large and the contact network is a complete graph, the population behavior converges to the solution of an ordinary differential equation known as the classical SIR model.…
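For reference, the classical SIR model is commonly written as the system below, where $s$, $i$, and $r$ are the susceptible, infected, and recovered fractions of the population. The symbols $\beta$ (infection rate) and $\gamma$ (recovery rate) are notation assumed here, not taken from the paper.

$$
\frac{ds}{dt} = -\beta\, s\, i, \qquad
\frac{di}{dt} = \beta\, s\, i - \gamma\, i, \qquad
\frac{dr}{dt} = \gamma\, i, \qquad
s + i + r = 1 .
$$

The three right-hand sides sum to zero, so the total population fraction is conserved along any solution.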

## Conversational Semantic Role Labeling

Semantic role labeling (SRL) aims to extract arguments for each predicate in an input sentence. Traditional SRL can fail to analyze dialogues because it only works on every single sentence, while ellipsis and anaphora frequently occur in dialogues. To address this problem, we propose the conversational SRL task, where an argument can be the dialogue participants, a phrase in the dialogue history or the current sentence.…

## NOMA for Next generation Massive IoT Performance Potential and Technology Directions

Broader applications of the Internet of Things (IoT) are expected in the forthcoming 6G system. Massive IoT is already a key scenario in 5G, relying on physical layer solutions inherited from 4G LTE and using orthogonal multiple access (OMA). In 6G IoT, supporting a massive number of connections will be required for diverse services of the vertical sectors.…

## On Probabilistic Termination of Functional Programs with Continuous Distributions

We study termination of higher-order probabilistic functional programs with recursion, stochastic conditioning and sampling from continuous distributions. We present a new operational semantics based on traces of intervals. We obtain the first proof that deciding almost-sure termination (AST) for programs with continuous distributions is $\Pi^0_2$-complete.…

## A tight negative example for MMS fair allocations

Kurokawa, Procaccia and Wang [JACM, 2018] present instances for which every allocation gives some agent less than her maximin share. For three agents and nine items, we design an instance in which at least one agent does not get more than a $1 - \frac{1}{n^4}$ fraction of her maximin share.…
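As a reminder of the quantity involved, an agent's maximin share is the best value she can guarantee herself by splitting the items into n bundles and then receiving the least valuable one. A minimal brute-force sketch follows, assuming additive, non-negative valuations; the function name is illustrative, and the exhaustive search is practical only for small instances (nine items and three bundles give $3^9 = 19683$ assignments).

```python
from itertools import product
from typing import Sequence


def maximin_share(values: Sequence[float], n: int) -> float:
    """Max over all splits of the items into n bundles of the minimum bundle value."""
    if n < 1:
        raise ValueError("n must be at least 1")
    best = 0.0  # valid lower bound because valuations are assumed non-negative
    for assignment in product(range(n), repeat=len(values)):
        totals = [0.0] * n
        for value, bundle in zip(values, assignment):
            totals[bundle] += value  # additive valuation of each bundle
        best = max(best, min(totals))
    return best
```

An allocation is MMS-fair when every agent's bundle is worth at least her own maximin share; the negative examples above are instances where no allocation achieves this for all agents at once.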

## Constructing Contrastive samples via Summarization for Text Classification with limited annotations

Contrastive Learning has emerged as a powerful representation learning method. How to construct efficient contrastive samples through data augmentation is key to its success. Unlike vision tasks, the data augmentation method for contrastive learning has not been investigated sufficiently in language tasks.…