We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

Ariadne: Online Provenance for Big Graph Analytics

Formal Metadata

Title
Ariadne: Online Provenance for Big Graph Analytics
Title of Series
Number of Parts
155
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Data provenance is a powerful tool for debugging large-scale analytics on batch processing systems. This paper presents Ariadne, a system for capturing and querying provenance from Vertex-Centric graph processing systems. While the size of provenance from map-reduce-style workflows is often a fraction of the input data size, graph algorithms iterate over the input graph many times, producing provenance much larger than the input graph. And though current provenance tracing procedures support explicit debugging scenarios, like crash-culprit determination, developers are increasingly interested in the behavior of analytics when a crash or exception does not occur. To address this challenge, Ariadne offers developers a concise declarative query language to capture and query graph analytics provenance. Exploiting the formal semantics of this datalog-based language, we identify useful query classes that can run while an analytic computes. Experiments with various analytics and real-world datasets show the overhead of online querying is 1.3x over the baseline vs. 8x for the traditional approach. These experiments also illustrate how Ariadne's query language supports execution monitoring and performance optimization for graph analytics.