Generating distributed plan for PostgreSQL

Formal Metadata

Title
Generating distributed plan for PostgreSQL
Title of Series
Number of Parts
35
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Currently, the query planner in PostgreSQL generates plans that are meant to be executed on a single PostgreSQL node. What if we want to run queries in the MPP (massively parallel processing) way? Greenplum Database is an open source MPP database built on the PostgreSQL kernel. In this talk, I will introduce how Greenplum Database generates distributed plans for PostgreSQL. Greenplum Database is essentially several disk-oriented PostgreSQL database instances acting together as one cohesive database management system (DBMS). In particular, the optimizer in Greenplum Database is based on PostgreSQL’s optimizer, which takes a query tree as input, examines each of the possible execution plans, and ultimately selects the plan that is expected to run the fastest. To suit the MPP environment, the optimizer in Greenplum Database has been modified and enhanced to support Greenplum Database's parallel architecture. It produces plans that can be executed simultaneously across all of the parallel PostgreSQL database instances. As a result, the optimizer implementation in Greenplum Database differs from that in PostgreSQL in several respects. This talk will introduce these differences and illustrate how Greenplum Database generates and optimizes a plan for a parallel environment.
Keywords