
From Shopping Baskets to Structural Patterns


Formal Metadata

Title: From Shopping Baskets to Structural Patterns
Number of Parts: 611
License: CC Attribution 2.0 Belgium
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Production Year: 2017

Content Metadata

Abstract
Mining frequent itemsets is an established approach to data mining and supported by productive data mining solutions. For example, one can get insights about buyers' behavior by analyzing frequent co-occurrences of products in shopping baskets. In contrast, frequent subgraph mining (FSM), the graph variant of frequent itemset mining, not only evaluates entity co-occurrence but also relationships among entities, i.e., structural patterns. However, existing implementations are all research prototypes which are tailored to textbook problems.

In our talk, we want to give an introduction to the FSM problem on distributed collections of graphs and our implementation in Gradoop, an open source system for scalable graph analytics based on Apache Flink. In contrast to other iterative graph algorithms like PageRank, in FSM not only the final state but also the intermediate results of iterations form the desired result. Here, the major technical challenge is the respective usage of Flink's distributed iterations. We will explain different implementation approaches, discuss implementation details which influence scalability, and show benchmark results.

Intended audience and goal of the talk: developers and analysts interested in relationship-centric data mining techniques.
Transcript: English (auto-generated)
Yeah. Hello, everyone. Thank you. So, hello. I'm Andrew from the University of Leipzig, and today I'm going to talk about frequent subgraph mining on Apache Flink, which I hid behind the title of
From Shopping Baskets to Structural Patterns, because it's related to something some of you might know under the term shopping basket analysis, or association rules. So, first of all, my question: who of you regularly works with graphs? And with data mining?
Or with data mining? Yeah. And graphs and data mining? All right. So, I'll slow down a little bit, and I'll try to make the talk not too scientific, and rather give you an introduction to the problem and the challenges in the implementation on Apache Flink, or in distributed data flows in general.
So, the contents are as follows. First, I introduce the problem and explain details about the current state of these algorithms. Then I present a solution which we built on top of Apache Flink in the context of the Gradoop framework. Another question: who was attending the talk before?
So, all right. Quite a lot. So, yeah. It's in the context of an open source framework for distributed graph analytics called Gradoop. And finally, I'll talk a little bit about some optimizations of our solution if I find the time, and of course, conclude. So, part one is the problem.
So, the general class of data mining problems is called frequent pattern mining. We are talking about the transactional setting, which means we have a collection of things, for example shopping baskets, click streams, XML documents, chemical compounds, whatever. It just depends on the data structure,
but we have a collection of, let's say, datasets which might have different structures. And in frequent pattern mining, we're interested in the patterns that, for example, occur in 80 percent of these things. Frequent pattern mining can be categorized, starting with the basic one, frequent itemset mining, for example shopping baskets, where we're just interested in
which things frequently co-occur together. If we add the constraint that the order of items matters, then we have frequent sequence mining, for example a click stream: we're interested in which paths users regularly take on a website, or something like that. We can, again, extend it:
if we also consider branching of paths, then we are at frequent subtree mining, which is a little bit more difficult than sequence mining, but still very closely related. And at the end, if we are also interested in all the relationships, so, unlike trees, we may have closing cycles and stuff like that,
then we have the problem of frequent subgraph mining, and an example are chemical compounds. So, let's first go back to the simplest variation of this problem. Some of you might know association rules. This is something like: people who bought
bread and butter often also bought cheese and wine, so we can use this for recommendations. That's what some web shops do, I guess. For example, if you already added two items to your shopping basket on Amazon, then one way to recommend other things is using association rules, which are based on the results
of frequent pattern mining on past data sets. Frequent subgraph mining can be used for similar things. For example, if we have an unknown chemical substance which contains a substructure
that is known to be often part of illegal stimulants, we can have a look at this substance; maybe it's a candidate for a new illegal stimulant. I don't know if this example makes sense, but I hope you got the idea. So, the problem definition, just to conclude the introduction,
is: our input is a graph collection, so our collection of things are, in this case, graphs. And we have a min support threshold, which specifies the minimum percentage of the collection's elements which have to contain a pattern for it to be considered frequent. And the output is the complete set of frequent subgraph patterns
which are supported above the given threshold, so the min support. Now let me give you a toy example. On the bottom, I hope it's visible for most of you guys. No? So, on the bottom, just believe me,
there are three graphs. This is often called a graph database. And, yeah. Fair for you?
It's smaller, but now more people can enjoy the fun of seeing the graph. So, this is the input, a graph collection. First, I need to tell you: in our problem scenario, in the context of Gradoop, we are talking about directed multigraphs, which is a difference to typical solutions.
This means our edges have a direction, like in graph databases such as Neo4j, and we support multiple edges between any pair of vertices, which changes the problem a little bit from the algorithmic point of view. So, now we have three graphs, and we're interested in subgraphs
which we can describe by patterns. A subgraph, just to not confuse you, is a part of a graph. And we're interested in subgraphs which occur in at least 50%, so in two of three. Typical algorithms use the abstraction of a frequent pattern lattice,
which contains different levels. In the first level, there are all zero-edge subgraphs, which means just vertices. And we see we have a vertex A here, and A occurs in this graph, in this graph, and in that one. Thus, we have a frequency of three. The same for B, frequency of three. We have C, frequency of one,
because it is only contained in this one. And thus we can say, all right, it's not above the threshold of 50%, which would be two, and thus we can prune it. So now we continue and grow children from that. We have, for example, AAB, ABB, or this loop pattern.
And they are children of the zero-edge patterns. And we still see, okay, this one has a frequency of three, one and two; one is below the threshold, that's why we can prune it. And finally, the largest frequent subgraph contained is this two-edge subgraph, which is contained in two graphs at level k = 2.
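The counting and pruning on the first lattice level can be sketched in a few lines of Python. The graphs are reduced to their vertex-label sets here, which is enough to show graph-based support counting; the real algorithm of course tracks structure, not just labels, and `frequent_labels` is a made-up name for this sketch.

```python
# Toy sketch of the first lattice level: count label frequencies per graph
# and prune everything below the min support threshold. The real FSM
# algorithm counts structural patterns, not just labels.

MIN_SUPPORT = 0.5  # a pattern must be contained in at least 50% of the graphs

# three "graphs", reduced to the vertex labels they contain
graph_db = [{"A", "B"}, {"A", "B"}, {"A", "B", "C"}]

def frequent_labels(db, min_support):
    threshold = min_support * len(db)  # here: 1.5, so at least 2 graphs
    counts = {}
    for graph in db:
        for label in graph:  # graph-based support: count once per graph
            counts[label] = counts.get(label, 0) + 1
    # prune, as on the first level of the pattern lattice
    return {l: c for l, c in counts.items() if c >= threshold}

print(frequent_labels(graph_db, MIN_SUPPORT))  # A and B survive; C is pruned
```

This mirrors the toy example from the talk: A and B have frequency three, C has frequency one and falls below the threshold of two.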
This is actually the desired result, but we do not implement it as a data structure; it's just the theoretical model behind these algorithms. From the data structure point of view, of course, we have graphs and we have patterns, which we need to store somehow. But the most important thing are embeddings. It's very similar to the previous talk
about pattern matching: we need embeddings which map the patterns to actual subgraphs. And in this case we see that, due to automorphism, a graph might contain a pattern multiple times, which means this pattern is contained in this graph two times.
So once is the green line, so this vertex is this vertex and this vertex is this one, and because these vertices cannot be distinguished by their pattern, we create a second embedding. So this is everything you need to know before I start explaining the algorithms. So we have graphs, which are the input elements.
We have subgraphs, which are parts of the input elements, and patterns, which are nodes in our pattern lattice and isomorphic to subgraphs. Isomorphic means there is a one-to-one mapping between vertices and edges, and this is the tricky point of these algorithms, the same as in pattern matching.
And we have embeddings, which are mappings between patterns and subgraphs. The challenge in implementing such algorithms, especially in the context of shared-nothing clusters, is that we need to meet the dataflow programming model, in this case of Apache Flink, but this also applies to Apache Spark or actually even MapReduce.
And what we need to find is an efficient algorithm which does not rely on shared memory. I don't want to become too scientific now, so I'll just explain what makes these algorithms efficient.
The problem of these algorithms is that they contain the so-called subgraph isomorphism problem, which is NP-complete. We need to enumerate all isomorphisms between graphs, which is very expensive, and we cannot avoid this completely. If we want the complete set of frequent subgraphs, it's not possible to avoid this problem, but what efficient algorithms
can do is minimize the isomorphism testing effort. So let me explain: most of this work has been done in the early 2000s, but not in the context of distributed dataflow systems; it was done for single machines. First, there were a priori approaches.
These are based on a so-called breadth-first search (BFS) in the lattice, which means we first process all frequent subgraphs of level one, then generate candidates, do pattern matching in the graph database, find the frequent ones and join them to children. And then we again do the pattern matching
and repeat this until there are no frequent subgraphs left. The good thing about these algorithms is that they only need one iteration per level, and this allows effective frequency pruning. But the problem of these algorithms is that they contain subgraph isomorphism testing in the pattern matching.
They additionally contain subgraph isomorphism testing in the candidate generation: when you have two frequent patterns and you want to generate the child, you need to enumerate all subgraphs of a certain level to generate children, which is also very expensive. You may generate candidates that don't even occur in the database, because they're just candidates.
There's no guarantee for them to occur. And the candidate generation itself is hard to parallelize without shared memory, because you actually need to build the cross product of all combinations. This is why the more efficient single-machine approaches
are based on a depth-first search (DFS) in the lattice. They go from, let's say, this root pattern to this pattern, then to this one, until no more frequent patterns are found. Then they jump back to a frequent parent, and so on.
And the idea behind these algorithms is that we can skip some of the links in the lattice due to specific tricks, which I will not explain now because this really takes a lot of time. But with these algorithms it's possible to avoid checking all the combinations. The idea is an edge-wise extension
of frequent patterns, where growth rules reduce the search to a tree. For example, there's the gSpan algorithm, a very popular example of this category. Where the isomorphism problem still occurs is that sometimes we don't want to visit, for example, this edge or link in the lattice,
but we accidentally visit it. This is why we need to check the pattern's graph representation to see if it's really on the tree that we wanted to traverse. So, if we look at these algorithms: they automatically find canonical labels, which means we can compare
graph patterns by a simple string comparison. The isomorphism testing is reduced to a minimum: we just need to validate these labels in the end, and we only check patterns that actually exist. So we solved a lot of the problems of the a priori algorithms.
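The effect of canonical labels can be illustrated with a deliberately simplified Python sketch: instead of isomorphism testing, patterns are compared as strings. Note this is not gSpan's real canonical form (which is the minimum DFS code); a sorted edge list only works for toy inputs like this one and is not a correct canonical form in general.

```python
# Simplified illustration of canonical labels: two patterns are compared by a
# canonical string instead of explicit isomorphism testing. gSpan actually
# uses minimum DFS codes; the sorted edge list below ignores vertex identity,
# so it is NOT a correct canonical form in general, just a toy stand-in.

def canonical(edges):
    # edges: list of (source_label, edge_label, target_label) triples
    return "|".join(",".join(triple) for triple in sorted(edges))

p1 = [("A", "x", "B"), ("B", "y", "A")]
p2 = [("B", "y", "A"), ("A", "x", "B")]  # same pattern, listed differently

# a simple string comparison now suffices to detect equal patterns
assert canonical(p1) == canonical(p2)
```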
But the problem is that we need a large number of iterations for that, and in a shared-nothing cluster with many, many workers it can lead to unbalanced parallelization, because we can maybe parallelize single paths, but not a whole level. The main idea of the new dataflow systems
is to bring the computation to the data, which means processing, for example, all the embeddings of the same edge count at the same time. And this is not easily possible with this DFS approach. Coming back to our toy example: this is why I used dotted lines for some of the links in the lattice.
The a priori approaches would go along all the links, or there are even some optimizations which maybe skip some, but still, this is the basic difference. In a pattern growth approach, we can just leave out the dotted ones. Okay, so now this is everything you need to know about frequent subgraph mining
and the research of the past. Now we are facing these problems in the context of distributed dataflow systems on shared-nothing clusters. Our solution is something we call level-wise pattern growth, which means we do a parallel depth-first search in the lattice.
We process the lattice level-wise, but still skip the links we do not want to visit, and still avoid the isomorphism problem due to candidate generation, for example. This means we only need k_max iterations, where k_max is the edge count of the largest frequent pattern, and we can skip these links and still minimize isomorphism testing. This approach fits the dataflow programming model very well. I will show you, using some pseudocode, how we can implement this on Apache Flink.
So, for all who don't know Apache Flink: the basic abstraction is that you have data sets, and then you have transformations on these data sets, where a higher-order function is a parameter; it's similar to lambda expressions, but for data sets. The idea is that the function is executed for each element of a data set,
or for a group of elements, for example. The functions are executed concurrently, and for everything which requires global knowledge exchange, we need to create barriers in the data flow
to exchange global knowledge. That's the problem in the shared-nothing context. Knowing these constraints, what you would naively program in a usual programming language would be: there is a data set of graphs, a data set of all frequent patterns, and a data set of k-edge frequent patterns.
Those are all the data sets we need. And then we have a loop which terminates when we do not find any k-edge frequent patterns anymore. And at the end, we are interested in all frequent patterns. That's what our algorithm should do. What do we do? For example, we do a pattern growth at the beginning,
for which we use something called broadcasting in Flink. It's like a very efficient cross product in a distributed data flow: we just send the last iteration's frequent patterns to all machines, and then we can access them in the pattern growth step.
Initially, this set is empty; this is the dummy root of the lattice. We apply the pattern growth using a map function, which means we update the graphs, because the graph data set does not only contain graphs, it also contains a map between patterns and embeddings.
This means in the pattern growth step we just check the map: all the keys which are also contained in the frequent pattern set we need to grow. In the next step, we report the patterns. So after we grew patterns, we report them using flatMap. FlatMap is a map function,
but with arbitrary output cardinality, so it could output zero or many patterns. Now it becomes a little relational: we group by pattern, so the pattern is the key; it's very similar to the reduce function in MapReduce. We group by key, sum the frequency, filter out the frequent ones, and we additionally need to filter the valid ones,
because sometimes there are some false positives. And we found out in our evaluations that it's far more efficient to group and then filter invalid patterns instead of validating them in the growth step, because there we would have to validate for each graph and each pattern, and here we only have to validate
each pattern once in the whole system. Which means: imagine we have an input size of 10 million graphs. We do not repeat the same operation 10 million times; we do it here, and in comparison to the complexity of counting on a large data set, it's basically free.
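The counting part of this dataflow can be modeled with plain Python instead of Flink operators; `report_patterns` and `is_valid` are made-up stand-ins for the real functions, and the "graphs" here are already just lists of pattern strings.

```python
# One "round" of the dataflow, modeled with plain Python instead of Flink's
# DataSet API: flatMap-style reporting, groupBy + sum, then the two filters.
from collections import Counter

MIN_FREQUENCY = 2

def report_patterns(graph):
    # flatMap: emit a (pattern, 1) pair for every pattern the graph supports
    return [(pattern, 1) for pattern in graph]

def is_valid(pattern):
    # stand-in for the (expensive) validation done once per pattern
    return not pattern.startswith("invalid:")

graphs = [["A->B", "A->B->B"], ["A->B"], ["invalid:X"]]

counts = Counter()
for g in graphs:                      # groupBy(pattern) + sum(frequency)
    for pattern, one in report_patterns(g):
        counts[pattern] += one

frequent = {p: c for p, c in counts.items()
            if c >= MIN_FREQUENCY and is_valid(p)}  # filter frequent & valid
print(frequent)                       # {'A->B': 2}
```

In the real Flink program these steps roughly correspond to a `flatMap`, a `groupBy` with a sum aggregation, and filter transformations, executed concurrently on all workers.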
Finally, we add the current round's frequent patterns to the all-frequent-patterns data set using a union operator, and that's how everybody would naively program it. But the problem is that Flink's bulk iteration, which is a dataflow iteration
and not a typical while loop (it's part of the batch API of Apache Flink), allows only exactly one data set to be passed along iterations. And the problem is that here, we actually have three data sets which we manipulate during our iterations,
but there's the constraint that from the iteration body we cannot access the other data sets. So this doesn't work. For the first one, the k-edge frequent patterns, there's a quite simple solution: we just pull it out of the iteration, repeat a little bit of work, and instantiate the set of k-edge frequent patterns inside the loop, so we can avoid the external access.
But this does not work for the union operation and our data set collecting the frequent patterns of all iterations. This is where we used a very hacky approach. At the beginning of the data flow, we added to the graph data set
a new empty graph element, which is actually an empty graph with an empty map. And because it's an empty graph, and we ensure that it's the only empty graph in the whole data set, we can identify it. Then we extend the map function: we do not only apply pattern growth, but pattern growth or store.
We just check: if the graph is empty, we use its pattern-embeddings map to store frequent patterns. Because we reuse the data structure we already have, we don't need any additional one. And at the end, we filter out this single dummy graph, which was collecting all the frequent patterns,
and use a flatMap which creates a new pattern element in the output data set, including its frequency (the frequency we can also encode in the mappings), and we have the same result. This really works; that was the solution to our problem.
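The dummy-graph trick can be illustrated with a small Python model: dicts stand in for Flink data sets, and all names (`grow_or_store`, the `edges`/`patterns` fields) are invented for this sketch.

```python
# Sketch of the "dummy graph" workaround: Flink's bulk iteration passes
# exactly one data set between iterations, so an empty graph element travels
# with the real graphs, and its pattern->embeddings map doubles as the
# accumulator for all frequent patterns found so far.

graphs = [
    {"edges": 2, "patterns": {}},   # real graphs
    {"edges": 3, "patterns": {}},
    {"edges": 0, "patterns": {}},   # the single empty dummy graph
]

def grow_or_store(graph, frequent_this_round):
    if graph["edges"] == 0:                       # dummy: store the results
        graph["patterns"].update(frequent_this_round)
    else:                                         # real graph: grow patterns
        pass  # pattern growth would happen here
    return graph

for round_result in [{"A->B": 2}, {"A->B->B": 2}]:   # two fake iterations
    graphs = [grow_or_store(g, round_result) for g in graphs]

# at the end, filter out the dummy and read the collected patterns
all_frequent = next(g for g in graphs if g["edges"] == 0)["patterns"]
print(all_frequent)   # {'A->B': 2, 'A->B->B': 2}
```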
Okay, so this was the basic idea; I hope you understood the problem and how we implemented this on a distributed dataflow system like Apache Flink. Now, we did some optimizations. Actually there are four of them, and because two of them are very difficult to explain without becoming too scientific,
we just skip those. One is: it's even more efficient to do the validation step in the combine function, because it matters to have fewer tuples before network traffic occurs; for everyone programming with distributed dataflow systems, this seems obvious. And we use a merge join instead of a cross
during the pattern growth process, but this would need a lot of explanation now. What we also do is a pre-processing, where we use dictionary encoding and label pruning to reduce the input size, and, most effectively, we do a pattern encoding with a fast and effective compression technique.
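The label pruning and dictionary encoding just mentioned can be sketched in plain Python; the actual Gradoop implementation is Java on Flink, and the data layout here (graphs as pairs of label sets) is made up for illustration.

```python
# Sketch of the pre-processing: keep only frequent vertex and edge labels,
# then build string->integer dictionaries for the survivors.
from collections import Counter

MIN_FREQUENCY = 2

# graphs simplified to (vertex_labels, edge_labels) pairs
graphs = [({"A", "B"}, {"x"}), ({"A", "B"}, {"x"}), ({"A", "C"}, {"y"})]

# 1) count vertex labels once per graph and keep only the frequent ones
vertex_counts = Counter(l for vlabels, _ in graphs for l in vlabels)
frequent_vertices = {l for l, c in vertex_counts.items() if c >= MIN_FREQUENCY}

# 2) same for edge labels (in the real pipeline this happens only after the
#    edges incident to removed vertices are gone, which can change the counts)
edge_counts = Counter(l for _, elabels in graphs for l in elabels)
frequent_edges = {l for l, c in edge_counts.items() if c >= MIN_FREQUENCY}

# 3) dictionary encoding: integers compare and hash much faster than strings
vertex_dict = {label: i for i, label in enumerate(sorted(frequent_vertices))}
edge_dict = {label: i for i, label in enumerate(sorted(frequent_edges))}
print(vertex_dict, edge_dict)   # {'A': 0, 'B': 1} {'x': 0}
```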
The pre-processing's idea is basically that we keep only vertices with frequent labels. If we remove vertices, we also remove some of their edges. Then we consider only the remaining edges, those which occur together with frequent vertices,
and also count their frequencies. And we can drop all the edges with infrequent labels. Based on these steps, we have the frequencies of all the labels, so we can just use them to create dictionaries between strings and integers. And because frequent subgraph mining involves a lot of comparisons, equality tests,
hash code building and whatever, it's much more efficient to use integer labels instead of strings. The workflow is then as follows: we first encode the graphs, we run the actual mining, and we decode the patterns, because in many scenarios we do not want patterns consisting of numbers; we really want our patterns,
like RDF patterns, for example. Just to illustrate this approach: we see here there's a vertex C which is only contained in a single graph. And we see here's an edge labeled B which, if we only consider the edge label, is still frequent, because it is contained here and here.
But if we first remove this C vertex, then, to preserve consistency, this B edge goes away too, and B is not frequent anymore. If we do this in a pre-processing, such patterns will never be discovered. And the second optimization technique
is the pattern encoding and compression. We do not store the graphs in a typical way, as you would naively program graphs using Java objects and so on, because we found out that avoiding Java objects is one of the most effective tuning techniques you can do in Apache Flink.
This is at least the case for this scenario. We use multiplexed integer arrays to represent patterns, graphs and embeddings. We represent everything in multiplexed integer arrays, which require six positions in the array for each edge.
And we know that we only have positive values with very low upper bounds: the upper bound is the edge count or the size of the label dictionaries. Typically, we do not exceed four to eight bits per value. I say just typically; of course, you can create data sets where it's different, but in most use cases that's enough.
This is much less than the 32 bits usually required for a single integer, so we can use an integer compression. In our data sets, we can roughly encode a complete edge from six integers down to one, and we use Simple16 from a nice open source package for this.
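The bit-packing idea can be illustrated in Python; note this is a fixed-width toy, not the actual Simple16 codec, which selects one of 16 layouts per 32-bit word depending on the values.

```python
# Toy fixed-width bit packing in the spirit of Simple16: every value of an
# encoded edge is a small non-negative integer, so several values fit into
# one 32-bit word. Here we simply pack fixed 5-bit fields (values below 32).

BITS = 5                 # bits per value; real codecs choose this adaptively
PER_WORD = 32 // BITS    # 6 values per 32-bit word

def pack(values):
    words = []
    for i in range(0, len(values), PER_WORD):
        word = 0
        for j, v in enumerate(values[i:i + PER_WORD]):
            assert 0 <= v < (1 << BITS), "value does not fit into 5 bits"
            word |= v << (j * BITS)
        words.append(word)
    return words

def unpack(words, n):
    values = []
    for word in words:
        for j in range(PER_WORD):
            values.append((word >> (j * BITS)) & ((1 << BITS) - 1))
    return values[:n]  # drop the padding of the last word

edge = [3, 0, 7, 1, 2, 4]  # six small values describing one encoded edge
packed = pack(edge)
assert len(packed) == 1            # six integers compressed into one word
assert unpack(packed, len(edge)) == edge
```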
The effect of this is that we decrease network traffic, because the patterns traveling over the network are smaller. We have a smaller grouping key in the group-by operation, which makes it notably faster, and also the map access, when we access the embeddings of our patterns,
uses much smaller map keys, so this is more efficient as well. Additionally, we experimented with graph and embedding compression, mainly to decrease memory usage and to allow faster serialization between transformations. This is also something we have learned really matters in Apache Flink.
Decompression we execute only on demand; for example, embeddings of infrequent patterns will never be decompressed, and so on. There are a lot of detailed optimizations. Then we ran some experiments. I will not talk about absolute runtimes now,
because it doesn't make sense if you do not have any background knowledge about the data sets and what the numbers mean. But what we did in these experiments: we have two data sets. One is synthetic; we have one million graphs and 90% support, but this data set already gives a lot of patterns, and it contains directed multigraphs. And the classical scenario, which is used by all other work
in the field of frequent subgraph mining, is a molecular data set with undirected simple graphs, one million graphs and 10% support. So it has a quite low support, and it's one million molecules. I mean, with other types of data you can have big data problems with much larger input sizes,
but here the problem is the intermediate results and the computational complexity. We used these data sets and scaled up from 6 to 96 threads, which is roughly the same as 1 to 16 machines in a cluster. And we see that, especially on the synthetic data set,
for which we optimized our algorithm for directed multigraphs, it shows a quite good scale-up. And even for the molecular database it's not bad; it's not linear, but still quite good. Then we also evaluated the impact
of our optimizations. It depends on the data set: in this synthetic data set we deliberately created many, many infrequent labels, so of course this technique is very efficient there, but there are real-world data sets, such as RDF, which have these characteristics, and which we considered in the design
of this synthetic data set. We see that, over all parallelism levels, so from 6 to 96 parallel slots, the dictionary encoding in this case can improve the total runtime by up to 400%. Then we compared against avoiding pattern compression,
so only using the integer representation, which is already 10 times faster than using strings. This is a notable effect, especially on the molecular database. The embedding compression still has a notable impact, probably due to the serialization time,
but graph compression is a little tuning effect; it's not too much. All right. So, in conclusion: what we did is the first approach to graph-collection frequent subgraph mining using an in-memory dataflow system. There are MapReduce approaches, but for this kind of system,
to the best of our knowledge, we're the first, and we support directed multigraphs from arbitrary string-labeled inputs. So you can just enter a string-labeled input file and you get string-labeled patterns; we do not need any external pre-processing. It works, but using Flink requires some workarounds, as you have seen,
and we think it's a much better choice than MapReduce, because in each iteration you have to update three data sets, and for MapReduce you also need workarounds to solve these problems, but those MapReduce workarounds are by far more expensive than the workarounds you need for the Apache-Flink-based approach.
Regarding availability: it's part of the Gradoop framework, the distributed framework for graph processing, and it fits its algorithm API. The current version you will find in the master branch is not optimized; it's still using Java objects and strings and stuff like that, so it's not very efficient,
and the optimized version is still a messy research prototype, but it will be merged to master soon. So, thank you. If you have any questions?
Audience question: did you try this on a real-world data set, so evaluate this project with a business partner, on real systems and data? No, the problem is the availability of such data sets. We have a data generator which creates this type of data, and this is why we use the synthetic data set, which contains loops, parallel edges
and many automorphisms. Yeah, you're welcome. Audience: I happen to know that MSD in Prague has a data set like that. Excuse me? I happen to know that MSD in Prague has a data set like that. Yeah? Yes, because they're working on it. What is in there?
Molecular data. Yeah, molecular data, of course. There are tons of molecular data sets, but the thing is that in molecular data sets you have only two types of edges: single bond and double bond. You have a very limited set of vertex labels, and there is a very low number
of specific patterns that occur, like some cycles. So, of course, you can tune the algorithm to molecular databases, but we wanted to make a general approach fitting all data sets, and for that we had no real data set.