Scientific datasets are known for their challenging storage demands and for the processing pipelines that transform their information. In this paper, we present an infrastructure for the HDF5 file format that enables dataset values to be populated on the fly. Task-related scripts can be attached to HDF5 files and execute only when the dataset is read by an application. We provide details on the software architecture that supports user-defined functions (UDFs) and how it integrates with hardware accelerators and computational storage. Moreover, we describe the built-in security model that limits the system resources a UDF can access. We present several use cases that show how UDFs can be used to extend scientific datasets.
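The idea of populating dataset values only at read time can be illustrated with a minimal sketch in plain Python. This is not the paper's implementation and does not use the HDF5 library: the class `LazyDataset`, its `read` method, and the function `celsius_to_kelvin` are hypothetical names chosen for illustration, and the sketch leaves out the security model, accelerators, and computational storage entirely.

```python
from typing import Callable, List


class LazyDataset:
    """Hypothetical stand-in for a dataset whose values come from a UDF."""

    def __init__(self, name: str, size: int, udf: Callable[[int], List[float]]):
        self.name = name
        self.size = size
        self._udf = udf   # stored with the dataset, not executed yet
        self.calls = 0    # number of times the UDF has been executed

    def read(self) -> List[float]:
        # The UDF runs only here, when an application requests the values.
        self.calls += 1
        values = self._udf(self.size)
        if len(values) != self.size:
            raise ValueError("UDF returned the wrong number of values")
        return values


def celsius_to_kelvin(n: int) -> List[float]:
    # Hypothetical task-related script: derive values from stored readings.
    celsius = [0.0, 25.0, 100.0][:n]
    return [c + 273.15 for c in celsius]


dataset = LazyDataset("temperature_kelvin", 3, celsius_to_kelvin)
values = dataset.read()  # the conversion is computed at this point
```

Creating the `LazyDataset` object stores the function without running it; the values exist only after `read` is called, which mirrors the on-the-fly behavior the abstract describes.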

Author(s) : Lucas C. Villa Real, Maximilien de Bayser

Links : PDF - Abstract

Keywords : datasets - scientific - storage - user
