Machine Learning for Performance Prediction of Spark Cloud Applications

Machine Learning (ML) provides black-box solutions to model the relationship between application performance and system configuration without requiring detailed knowledge of the system. We investigate the cost-benefits of using supervised ML models for predicting the performance of applications on Spark, one of today’s most widely used frameworks for big data analysis.…

Representation and Processing of Instantaneous and Durative Temporal Phenomena

In this paper, we propose a new logic-based temporal phenomena definition language specifically tailored for Complex Event Processing. We demonstrate the expressiveness of our proposed language by employing a maritime use case where we define maritime events of interest. Finally, we analyse the execution semantics of our proposed language for stream processing and introduce the ‘Phenesthe’ implementation prototype.…

Injecting Text in Self-Supervised Speech Pretraining

The proposed method, tts4pretrain, complements the power of contrastive learning in self-supervision with linguistic/lexical representations derived from synthesized speech. Lexical learning in the speech encoder is enforced through an additional sequence loss term that is coupled with contrastive loss during pretraining. We demonstrate that this novel pretraining method yields Word Error Rate (WER) reductions of 10% relative on the well-benchmarked Librispeech task over a state-of-the-art baseline pretrained with wav2vec2.0 only.…

An explicit vector algorithm for high girth MaxCut

We give an approximation algorithm for MaxCut and provide guarantees on the average fraction of edges cut on $d$-regular graphs of girth . Our guarantees are better than those of all other classical and quantum algorithms known to the authors. Our algorithm constructs an explicit vector solution to the standard semidefinite relaxation of MaxCut.…

Revising Ontologies via Models: The ALC-formula Case

Most approaches for repairing description logic (DL) ontologies aim at changing axioms as little as possible. Instead, the input for the update is given by a model which we want to add or remove. This new setting is motivated by scenarios where an ontology is built automatically and needs to be refined or updated.…

GLocal-K: Global and Local Kernels for Recommender Systems

Recommender systems typically operate on high-dimensional sparse user-item matrices. We propose a Global-Local Kernel-based matrix completion framework, named GLocal-K. Our model outperforms the state-of-the-art baselines on three collaborative filtering benchmarks: ML-100K, ML-1M, and Douban. We apply our model under the extreme low-resource setting, which includes only a user-item rating matrix with no side information.…

Automata Linear Dynamic Logic on Finite Traces

Automata Linear Dynamic Logic on Finite Traces (ALDL$_f$) combines propositional logic with nondeterministic finite automata (NFA) to express temporal constraints. This is a gain in expressiveness over LTL at no cost. ALDL$_f$ is equivalent to Monadic Second-Order Logic. This is an improvement in satisfiability of LTL.…

4-bit Quantization of LSTM-based Speech Recognition Models

We investigate the impact of aggressive low-precision representations of weights and activations in large LSTM-based architectures for Automatic Speech Recognition (ASR). Using a 4-bit integer representation, a quantization approach results in significant Word Error Rate (WER) degradation. We show that minimal accuracy loss is achievable with an appropriate choice of quantizers and initializations.…
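
To make the idea of a 4-bit integer representation concrete, the following is a minimal sketch of symmetric uniform quantization. It is not the paper's quantizer: the function names `quantize_4bit` and `dequantize`, the single per-tensor scale, and the signed range [-8, 7] are all assumptions chosen for illustration.

```python
def quantize_4bit(values):
    """Map floats to signed 4-bit integer codes in [-8, 7] (illustrative only)."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole list; fall back to 1.0 when every value is zero.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable 4-bit range.
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


weights = [1.4, -0.6, 0.0, 0.2]
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)
# Each restored value differs from the original by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

With only 16 levels, the step size `scale` is coarse, which is one way to see why the choice of quantizer matters at this precision.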

Optimizing the hybrid parallelization of BHAC

We present our experience with the modernization of the GR-MHD code BHAC, aimed at improving its novel hybrid (MPI+OpenMP) parallelization scheme. We showcase the use of performance profiling tools usable on x86 (Intel-based) architectures. Our performance characterization and threading analysis provided guidance in improving the concurrency and thus the efficiency of the OpenMP parallel regions.…

Quantum Sub-Gaussian Mean Estimator

We present a new quantum algorithm for estimating the mean of a real-valued random variable obtained as the output of a quantum computation. Our estimator achieves a nearly-optimal quadratic speedup over the number of classical i.i.d. samples needed. We obtain new quantum algorithms for the .…

A High-Fidelity Flow Solver for Unstructured Meshes on Field-Programmable Gate Arrays

In this work, we design a custom FPGA-based accelerator for a computational fluid dynamics (CFD) code. We target the entire unstructured Poisson solver. We propose a novel data-movement-reducing technique where we compute geometric factors on the fly, which yields significant (700+ GFlop/s) single-precision performance and an upwards of 2x reduction in runtime for the local evaluation of the Laplace operator.…

Rule-based Adaptations to Control Cybersickness in Social Virtual Reality Learning Environments

Social virtual reality learning environments (VRLEs) provide immersive experience to users with increased accessibility to remote learning. Lack of high-performance and secured data delivery in critical VRLE domains (e.g., military training, manufacturing) can disrupt application functionality and induce cybersickness. In the event of an anomaly, the framework features rule-based adaptations that are triggered by using various decision metrics.…

A denotational semantics for PROMELA addressing arbitrary jumps

PROMELA (Process Meta Language) is a high-level specification language designed for modeling interactions in distributed systems. It is used as an input language for the model checker SPIN (Simple Promela INterpreter). The main characteristics are non-determinism and process communication through synchronous as well as asynchronous channels.…

FAST-PCA: A Fast and Exact Algorithm for Distributed Principal Component Analysis

Principal Component Analysis (PCA) is a fundamental data preprocessing tool in machine learning. The purpose of PCA is two-fold: dimension reduction and feature learning. This paper proposes a distributed PCA algorithm called FAST-PCA (Fast and exAct diSTributed PCA). The proposed algorithm is efficient in terms of communication and can be proved to converge linearly and exactly to the principal components that lead to dimension reduction as well as uncorrelated features.…
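
As background for what "converging to the principal components" means, here is a minimal centralized sketch that finds the leading principal component by power iteration on the sample covariance matrix. It is not FAST-PCA and involves no distribution or communication; the function name `top_principal_component` and the fixed iteration count are assumptions made for illustration.

```python
import math


def top_principal_component(rows, iterations=200):
    """Return a unit vector along the direction of largest variance."""
    n, d = len(rows), len(rows[0])
    # Center the data so the covariance is taken about the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly apply the covariance and renormalize.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # degenerate case: no variance along the current direction
        v = [x / norm for x in w]
    return v


# Points lying on the line y = 2x: the leading direction is (1, 2) normalized.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
print(top_principal_component(data))
```

Projecting the centered data onto this vector gives a one-dimensional representation, which is the dimension-reduction use of PCA mentioned above.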

A functional skeleton transfer

The animation community has spent significant effort trying to ease rigging procedures. The increasing availability of 3D data makes manual rigging infeasible. However, object animations involve understanding elaborate geometry and dynamics. This paper proposes a functional approach for skeleton transfer that uses limited information and does not require a complete match between the geometries.…

SAUCE: Truncated Sparse Document Signature Bit-Vectors for Fast Web-Scale Corpus Expansion

When a sufficient amount of within-domain text may not be available, expanding a seed corpus of relevant documents from large-scale web data poses several challenges. The authors propose a novel truncated sparse document bit-vector representation, termed Signature Assisted Unsupervised Corpus Expansion (SAUCE). SAUCE can reduce the computational burden while ensuring high within-domain lexical coverage, especially under limited seed corpora scenarios.…

SPARROW: A Novel Covert Communication Scheme Exploiting Broadcast Signals in LTE, 5G & Beyond

This work proposes a novel framework to identify and exploit vulnerable MAC layer procedures in commercial wireless technologies for covert communication. Examples of covert communication include data exfiltration, remote command-and-control (CnC) and espionage. In this framework, the SPARROW schemes use the broadcast power of incumbent wireless networks to covertly relay messages across a long distance without connecting to them.…

Developer-Centric Test Amplification: The Interplay Between Automatic Generation and Human Exploration

Automatically generating test cases for software has been an active research topic for many years. While current tools can generate powerful regression or crash-reproducing test cases, these are often kept separately from the maintained test suite. In this paper, we leverage the developer’s familiarity with test cases amplified from existing, manually written developer tests.…

Enel: Context-Aware Dynamic Scaling of Distributed Dataflow Jobs using Graph Propagation

Enel is a novel dynamic scaling approach that uses message propagation on an attributed graph to model dataflow jobs and, thus, allows for deriving effective rescaling decisions. Enel incorporates descriptive properties that capture the respective execution context, considers statistics from individual dataflow tasks, and propagates predictions through the job graph to eventually find an optimized new scale-out.…

CharmFL: A Fault Localization Tool for Python

Fault localization is one of the most time-consuming and error-prone parts of software debugging. The tool employs spectrum-based fault localization (SBFL) to help Python developers analyze their programs and generate useful data at run-time, which is then used to produce a ranked list of potentially faulty program elements.…
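
The ranking step of SBFL can be sketched as follows. This is not CharmFL's code or API: the function name `ochiai_ranking` and the input format are hypothetical, and Ochiai is used here as one commonly cited suspiciousness formula; the summary above does not say which formulas the tool implements.

```python
import math


def ochiai_ranking(coverage, outcomes):
    """Rank program elements by Ochiai suspiciousness (illustrative only).

    coverage: dict mapping test name -> set of program elements it executed
    outcomes: dict mapping test name -> True if the test passed
    """
    total_failed = sum(1 for passed in outcomes.values() if not passed)
    elements = set().union(*coverage.values())
    scores = {}
    for element in elements:
        # Count failing and passing tests that executed this element.
        failed_cov = sum(1 for t, cov in coverage.items()
                         if element in cov and not outcomes[t])
        passed_cov = sum(1 for t, cov in coverage.items()
                         if element in cov and outcomes[t])
        denom = math.sqrt(total_failed * (failed_cov + passed_cov))
        scores[element] = failed_cov / denom if denom > 0 else 0.0
    # Most suspicious first; ties are broken by element name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


coverage = {"t1": {"a", "b"}, "t2": {"a", "c"}, "t3": {"b", "c"}}
outcomes = {"t1": True, "t2": False, "t3": False}
for element, score in ochiai_ranking(coverage, outcomes):
    print(element, score)
```

In this toy spectrum, element `c` is executed by both failing tests and by no passing test, so it ends up at the top of the ranked list.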