Declarative Data Collections for Portable Parallelism

Plain Schwarz

Li, Zhibo

Formal Metadata

Title

Series Title

Berlin Buzzwords 2023

Number of Parts

Author

Li, Zhibo

Contributors

N. N. (moderation)

License

CC Attribution 3.0 Unported:
You may use and modify the work or content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.

Identifiers

10.5446/66664 (DOI)

Publisher

Plain Schwarz

Year of Publication

2023

Language

English

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

I would like to introduce Declarative Abstractions for Data Collections, a novel, declarative approach to data collections for convenient, portable, and efficient parallel computation. Modern programming languages provide programmers with rich abstractions for data collections as part of their standard libraries, e.g., containers in the C++ STL, the Java Collections Framework, or the Scala Collections API. Typically, these collections frameworks are organized as hierarchies that provide common abstract data types (ADTs) such as lists, queues, and stacks. While convenient, this approach leads users to over-specify collection data types, which limits implementation flexibility and ultimately hurts application performance. With the introduced framework, programmers instead explicitly select properties for their collections, thereby decoupling specification from implementation. Making collection properties explicit reduces the risk of over-specification and increases implementation flexibility. In terms of computational performance, the framework shields the application developer from parallel implementation details: a property-based data collection can be ported to multiple platforms, including GPUs and FPGAs, without modifying its property declaration. The framework takes a data-centric approach to high-performance computation, in which users focus on what properties the container (collection) should have and do not need to deal with implementation details. It is built on C++ metaprogramming and provides a modern C++ API. The framework offers the community a convenient, high-performance programming model for parallel data processing in heterogeneous environments.
The audience will get to know a practical programming model for data-centric parallelism that is useful in everyday work on parallel data analysis, data storage, filtering, and similar tasks.
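To make the idea of selecting properties rather than a concrete container type more tangible, here is a minimal C++17 sketch. It is not the API of the framework presented in the talk: the tag names `unique_elements` and `sorted_iteration`, the `select_backend` trait, and the `collection` class are all hypothetical, and the sketch only chooses among standard-library containers on the CPU, whereas the abstract describes backends for GPUs and FPGAs as well.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <type_traits>
#include <unordered_set>
#include <vector>

// Hypothetical property tags; the real framework's names are not given in the abstract.
struct unique_elements {};   // no duplicate values are stored
struct sorted_iteration {};  // iteration visits values in ascending order

// True if tag P appears in the property pack Props (false for an empty pack).
template <typename P, typename... Props>
inline constexpr bool has_property = (std::is_same_v<P, Props> || ...);

// Map the declared properties to a backing container at compile time.
template <typename T, typename... Props>
struct select_backend {
    static constexpr bool uniq = has_property<unique_elements, Props...>;
    static constexpr bool sorted = has_property<sorted_iteration, Props...>;
    using type =
        std::conditional_t<uniq && sorted, std::set<T>,
        std::conditional_t<uniq, std::unordered_set<T>,
        std::conditional_t<sorted, std::multiset<T>,
                           std::vector<T>>>>;
};

// The user-facing collection: callers state properties, never a container type.
template <typename T, typename... Props>
class collection {
public:
    using backend = typename select_backend<T, Props...>::type;

    void add(const T& value) {
        if constexpr (std::is_same_v<backend, std::vector<T>>) {
            data_.push_back(value);
        } else {
            data_.insert(value);
        }
    }

    std::size_t size() const { return data_.size(); }
    auto begin() const { return data_.begin(); }
    auto end() const { return data_.end(); }

private:
    backend data_;
};

// The declaration alone decides the implementation.
static_assert(std::is_same_v<collection<int>::backend, std::vector<int>>);
static_assert(std::is_same_v<collection<int, unique_elements>::backend,
                             std::unordered_set<int>>);

// Small usage example: duplicates are dropped only when uniqueness is declared.
inline void demo() {
    collection<int> bag;
    bag.add(7);
    bag.add(7);
    assert(bag.size() == 2);

    collection<int, unique_elements> ids;
    ids.add(7);
    ids.add(7);
    assert(ids.size() == 1);
}
```

In this sketch the calling code uses only `add`, `size`, and iteration, so changing the declared properties swaps the backing implementation without touching the call sites. This is the decoupling of specification from implementation that the abstract describes, shown here on a much smaller scale.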