We're sorry but this page doesn't work properly without JavaScript enabled. Please enable it to continue.
Feedback

CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching

Formal Metadata

Title
CECI: Compact Embedding Cluster Index for Scalable Subgraph Matching
Title of Series
Number of Parts
155
Author
License
CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Subgraph matching finds all distinct isomorphic embeddings of a query graph on a data graph. For large graphs, current solutions face the scalability challenge due to expensive joins, excessive false candidates, and workload imbalance. In this paper, we propose a novel framework for subgraph listing based on Compact Embedding Cluster Index (idx), which divides the data graph into multiple embedding clusters for parallel processing. The sub has three unique techniques: utilizing the BFS-based filtering and reverse-BFS-based refinement to prune the unpromising candidates early on, replacing the edge verification with set intersection to speed up the candidate verification, and using search cardinality based cost estimation for detecting and dividing large embedding clusters in advance. The experiments performed on several real and synthetic datasets show that the sub outperforms state-of-the-art solutions on average by 20.4times for listing all embeddings and by 2.6times for enumerating the first 1,024 embeddings.