
The LDBC benchmark suite


Formal Metadata

Title
The LDBC benchmark suite
Series title
Number of parts
542
Author
Contributors
License
CC Attribution 2.0 Belgium:
You may use, modify and reproduce, distribute and make the work or its content publicly available in unaltered or altered form for any legal purpose, provided you credit the author/rights holder in the manner specified by them.
Identifiers
Publisher
Publication year
Language

Content Metadata

Subject area
Genre
Abstract
We motivate and present an open-source benchmark suite for graph processing, created and maintained by the Linked Data Benchmark Council (LDBC). We first define common graph workloads and the pitfalls of benchmarking systems that support them, then explain our guiding principles that allow for conducting meaningful benchmarks. We outline our open-source ecosystem that consists of a scalable graph generator (capable of producing property graphs with 100B+ edges) and benchmark drivers with several reference implementations. Finally, we highlight the results of recent audited benchmark runs. Data processing pipelines frequently involve graph computations: running complex path queries in graph databases, evaluating metrics for network science, training graph neural networks for classification, and so on. While graph technology has received significant attention in academia and industry, the performance of graph processing systems is often lacklustre, which hinders their adoption for large-scale problems. The Linked Data Benchmark Council (LDBC) was founded in 2012 by vendors and academic researchers with the aim of making graph processing performance measurable and comparable. To this end, LDBC provides open-source benchmark suites with openly available data sets starting at 1 GB and scaling up to 30 TB. Additionally, it allows vendors to submit their benchmark implementations to LDBC-certified auditors who ensure that the benchmark executions are reproducible and comply with the specification. In this talk, we describe three LDBC benchmarks: (1) the Graphalytics benchmark for offline graph analytics, (2) the Social Network Benchmark's Interactive workload for transactional graph database systems, and (3) the Business Intelligence workload for analytical graph data systems. For each benchmark, we explain how it ensures meaningful and interpretable results.
Then, we summarize the main features of the benchmark drivers and list the current reference implementations (maintained by vendors and community members). Finally, we highlight recent audited benchmark results.

Information on the talk:
- Expected prior knowledge: specialized prior knowledge is not needed
- Intended audience: developers and users of graph processing frameworks
Transcript: English (automatically generated)
Hello HPC room. My name is Gábor Szárnyas. I work at CWI Amsterdam as a researcher and today I'm here on behalf of the LDBC. The LDBC stands for the Linked Data Benchmark Council. We are a non-profit company founded in 2012 and we design
graph benchmarks and govern their use. Additionally, we do research on graph schemas and modern graph query languages and everything we do is available under the Apache v2 license. Organizationally, LDBC consists of more than 20 companies. These are companies interested in graph data management. We have financial service providers, database vendors, cloud
vendors, hardware vendors and consultancy companies as well as individual contributors like me. So we design benchmarks, the first one being the LDBC social network benchmark which targets database systems. Let's go through this benchmark by a series of examples. I will touch on
data sets, queries and updates that we use in this benchmark. As the name social network benchmark suggests, we have a social network that consists of person nodes who know each other via a distribution that mimics Facebook's real social network. The content that these people create is messages. These form little tree-shaped subgraphs and are
connected via author edges to the people. On this graph, we can run queries like the following. Let's take a given person, enumerate their friends and their friends of friends, get the messages that these people created and then filter them based on some condition on their dates. So a
potential substitution could be, on this graph, that we are interested in this query for Bob and the date set to Saturday. And if we evaluate this query, we start with Bob, we traverse the knows edges to Ada and Carl, then continue to Finn, Eve and Dan. We move along the author edges and then finally we apply the filter condition, which will cut message 3 and
will leave us messages 1, 2 and 4. So obviously a social network is not a static environment, there are always changes. For example, people become friends. Eve and Gia may add each other as a friend. That will result in a new knows edge. That's simple enough. Gia can decide to create a message. This message will be a reply to message 3. So we add a new node and connect it to the existing graph via two edges. The heavy-hitting updates are the deletes. A person may decide to delete their account and that will result in a cascade of deletes. For example, if we remove the node Eve, that will result in the removal of their direct edges and the
messages they created. And in some social networks, this will even trigger the deletion of all the message trees and of course, all the edges that point to those messages. So this is quite a hard operation for systems to execute. It stresses their garbage collectors and it disallows certain append-only data structures. So if we want to view these three
components together, the dataset, the queries and the updates, we need a benchmark driver that schedules the operations to be executed. It runs the updates and the queries concurrently and of course, it collects the results. The benchmark is designed together with our members, who are the database vendors, and we go to great lengths to allow as many candidate systems as possible. So graph databases, triple stores and relational databases can all compete on this benchmark. Speaking of relational databases, some of you may wonder: is SQL sufficient to express these queries? And the answer is that in most cases it is. So the
query that we have just seen can be formulated in a reasonably simple SQL query. It is a bit unwieldy, but it is certainly doable and the performance will be okay. However, this being a graph benchmark, it lends itself quite naturally to other query languages. There are two new query
languages that are going to be coming out and both of them adopted a visual graph syntax inspired by Neo4j's Cypher language. The first one is called SQL/PGQ, where PGQ stands for Property Graph Queries. This will be released this summer and as you can see, it's an extension to SQL. So you can use SELECT and FROM, but it adds the GRAPH_TABLE construct and the
query can be formulated in a very concise and readable manner. There is GQL, the Graph Query Language, which is a standalone language that is going to be released next year and it shares the same pattern matching language as SQL/PGQ. So the social network benchmark has multiple workloads to cover the diverse challenges that are created
by graph workloads. The first one, the older one, is the social network benchmark interactive workload. This is transactional in nature and it has queries like the one I have shown before. These queries typically start from one or two person nodes. They are not very heavy-hitting. They only touch a limited amount of data. They have concurrent reads and updates and
systems are competing on achieving high throughputs. So this benchmark has been around for a few years and we have seen actually very good results. In the last three years, we witnessed an exponential increase in throughput, starting from a little above 5,000 operations per second to almost 17,000 operations per second this year. Our newer benchmark is the
social network benchmark business intelligence workload. This is analytical in nature and it has queries that touch on large portions of the data. For example, the query on this slide enumerates all triangles of friendships in a given country, which can potentially reach billions of edges and is a very difficult computational problem.
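The core of that triangle query can be sketched in a few lines of Python. This is only a toy illustration on a handful of invented friendship edges, not the benchmark's reference implementation; the country filter and the billion-edge scale of the real workload are omitted.

```python
from itertools import combinations

# invented friendship edges; the real query runs on billions of them
edges = {("Ada", "Bob"), ("Bob", "Carl"), ("Ada", "Carl"), ("Carl", "Dan")}

# build an undirected adjacency structure
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

# a triangle is any node triple in which all three pairs are connected
triangles = sum(
    1
    for u, v, w in combinations(sorted(adj), 3)
    if v in adj[u] and w in adj[u] and w in adj[v]
)
print(triangles)  # 1 triangle: Ada-Bob-Carl
```

Even this naive triple enumeration is cubic in the number of nodes, which hints at why the full query over a country's friendship graph is such a demanding computational problem.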
Systems here are allowed to do either a bulk or a concurrent update approach, but they should strive to get both a high throughput and low individual query runtimes. This benchmark being relatively new, we only have a single result, so it's a bit difficult to put it into context, but it allows me to highlight one
thing. Our benchmark results use many different CPUs. We actually have quite a healthy diversity in the CPUs. We have results with the AMD EPYC Genoa, like this one achieved by TigerGraph. We have results using Intel Xeon Ice Lakes and the Yitian 710s, which use an ARM architecture. We have more and larger
scale results expected this year and we are also quite interested in some graph and machine learning accelerators that are going to be released soon. So our benchmark process is quite involved. For each workload, we release a specification, we have an academic paper that motivates the benchmark, we have data generators, pre-generated data sets, as well as a benchmark driver and at least two
reference implementations. We do this because we have an auditing process that allows the vendors who implement this benchmark to actually go through a rigorous test and if they do so, they can claim that they have an official benchmark result. So we trademarked the term LDBC such
that the vendors have to go through these hoops of auditing and we still allow researchers and developers to do unofficial benchmarks, but they have to say that this is not an official LDBC benchmark result. One other benchmark I would like to touch upon briefly is the Graphalytics benchmark. This casts a wide net, so it targets graph databases, graph processing frameworks, embedded
graph libraries like NetworkX and so on. This uses untyped, unattributed graphs, so it only uses the person-knows-person graphs of the social network benchmark or other well-known graphs like Graph500. We have six algorithms, many of these are textbook algorithms like BFS, which just traverses
the graph from a given source node, or we have PageRank, which selects the most important nodes in the network. We also have clustering coefficient, community detection, connected components and shortest paths. This benchmark is a bit simpler to implement, and a new version is going to come out in spring 2023, so talk to us if
you're interested. So wrapping up, you should consider becoming an LDBC member because members can participate in the benchmark design and have a say in where we go. They can commission audits of their benchmarks and they can also gain early access to the ISO standard drafts, SQL/PGQ and GQL, that I have shown. It's free for individuals and has a yearly fee for
companies. So to sum up, these are our three main benchmarks. We have other benchmarks and many future ideas. If you're interested, please reach out. Okay, again, we have time for one question. Any
questions for Gábor? This is a newbie question. I'm not into graphs. Apart from advertisement optimisation, mass surveillance, and perhaps content distribution, which I don't know if those are the major applications, it's just what my naive mind comes up with. What other applications are those benchmarks meant to optimise?
So the big one this year is supply chain optimisation, like strengthening supply chains, ensuring that they are ethical, ensuring that they are not passing through conflict zones. It's something that is very important these days. You can also track CO2 emissions and other aspects of labour and
manufacturing. So that's certainly a big one. And that's something that we have seen. And there are of course, all the classic graph problems like power grids, a lot of e-commerce problems and financial fraud detection, which is going to be part of our financial benchmark this year.