
Graph Analytics on Massively Parallel Processing Databases

Formal Metadata

Title
Graph Analytics on Massively Parallel Processing Databases
Series Title
Number of Parts
611
Author
License
CC Attribution 2.0 Belgium:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release Year
Language
Production Year
2017

Content Metadata

Subject Area
Genre
Abstract
As graph processing moves to the mainstream, a large number of specialized graph engines have emerged. However, for many enterprises, much of their important data resides in relational databases and SQL is the most common workload. So is it reasonable to suggest that relational data processing engines can be used to solve graph problems in a productive and performant manner? The answer to this question is: “Yes!”

In this talk, we will address the use of massively parallel processing (MPP) databases for graph analytics workloads. We will share some recent findings from the Apache MADlib (incubating) project, including design of graph data structures, implementation of common graph algorithms, and performance results.

Graph analytics is becoming an important part of enterprise computing. With roots in academia going back many decades, the last 10-15 years have seen a huge surge of interest in this topic to address a wide range of modern use cases, from cybersecurity to social networks to supply distribution chains.

Enterprises have made significant investments in infrastructure, software, and training of their employees, all centered around SQL. So how can an enterprise add graph analytics to their business without the cost and complexity of moving to specialized graph processing engines? And, what are the tradeoffs?

Graph analytics is a new area of innovation in Apache MADlib, which is a SQL-based open source library for scalable in-database analytics. It provides parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data.

Many existing analytics products do not scale in a way that makes it convenient and economical to operate on large data sets. The graph methods in Apache MADlib have been designed to take advantage of the shared-nothing, scale-out parallelism offered by modern parallel database engines.
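
The abstract's central claim, that a relational engine can solve graph problems, can be illustrated with a minimal sketch. It is not MADlib's API or implementation. It assumes a graph stored as a hypothetical `edge(src, dest, weight)` table with non-negative weights, uses SQLite from the Python standard library as a stand-in for an MPP database, and computes single-source shortest paths by repeating one join-and-aggregate relaxation step in the style of Bellman-Ford. The table names, column names, and the function `shortest_paths` are all illustrative assumptions.

```python
import sqlite3


def shortest_paths(edges, source):
    """Single-source shortest paths using only relational operations.

    edges:  iterable of (src, dest, weight) tuples (hypothetical schema).
    source: id of the start vertex.
    Returns {vertex_id: distance} for every vertex reachable from source.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src INTEGER, dest INTEGER, weight REAL)")
    conn.executemany("INSERT INTO edge VALUES (?, ?, ?)", edges)

    # dist holds the best distance found so far for each reached vertex.
    conn.execute("CREATE TABLE dist (id INTEGER PRIMARY KEY, d REAL)")
    conn.execute("INSERT INTO dist VALUES (?, 0.0)", (source,))

    n_vertices = conn.execute(
        "SELECT COUNT(*) FROM (SELECT src FROM edge UNION SELECT dest FROM edge)"
    ).fetchone()[0]

    # At most |V| - 1 relaxation rounds are needed (Bellman-Ford bound).
    for _ in range(max(n_vertices - 1, 0)):
        before = conn.total_changes
        # One round: join current distances with outgoing edges, keep the
        # cheapest candidate per destination, and write it back only when
        # it is new or strictly better than the stored distance.
        conn.execute(
            """
            INSERT OR REPLACE INTO dist (id, d)
            SELECT c.id, c.d
            FROM (
                SELECT e.dest AS id, MIN(p.d + e.weight) AS d
                FROM dist AS p
                JOIN edge AS e ON e.src = p.id
                GROUP BY e.dest
            ) AS c
            LEFT JOIN dist AS cur ON cur.id = c.id
            WHERE cur.id IS NULL OR c.d < cur.d
            """
        )
        if conn.total_changes == before:
            break  # no distance improved, so the result has converged

    result = dict(conn.execute("SELECT id, d FROM dist"))
    conn.close()
    return result
```

For example, `shortest_paths([(1, 2, 1.0), (2, 3, 2.0), (1, 3, 5.0), (3, 4, 1.0)], 1)` returns `{1: 0.0, 2: 1.0, 3: 3.0, 4: 4.0}`. Each round is a plain join plus `GROUP BY`, which is the kind of operation a shared-nothing database can in principle spread across its nodes; how MADlib actually structures its graph algorithms is not shown here.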