
DiscoPoP: A tool to identify parallelization opportunities in sequential programs and suggest OpenMP constructs and clauses

Formal Metadata

Title
DiscoPoP: A tool to identify parallelization opportunities in sequential programs and suggest OpenMP constructs and clauses
Number of Parts
637
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
This talk introduces DiscoPoP, a tool that identifies parallelization opportunities in sequential programs and suggests to programmers how to parallelize them using OpenMP. The tool first identifies computational units (CUs), which, in our terminology, are the atoms of parallelization. Then, it profiles memory accesses in the source code to detect data dependencies. By mapping dependencies onto CUs, we create a data structure that we call the program execution tree (PET). DiscoPoP then inspects the PET of a program to find parallel design patterns and derive parallelization suggestions in terms of OpenMP constructs and clauses. So far, DiscoPoP detects doall, reduction, pipeline, task parallelism, and geometric decomposition in a program. We used DiscoPoP to create OpenMP versions of 49 sequential benchmarks and compared them with the code produced by three state-of-the-art parallelization tools: our versions are faster in most cases, with average speedups relative to any of the three ranging from 1.8 to 2.7. Moreover, we analyzed the LULESH program and an astrophysics simulation code with DiscoPoP. In LULESH, we identified most of the parallelization opportunities that expert programmers had exploited manually. In the astrophysics code, DiscoPoP found unexploited parallelism and achieved a speedup of up to 35%.