
Utilizing AMD GPUs: Tuning, programming models, and roadmap


Formal Metadata

Title
Utilizing AMD GPUs: Tuning, programming models, and roadmap
Series title
Number of parts
287
Author
Contributors
License
CC Attribution 2.0 Belgium:
You may use, modify, and reproduce, distribute, and make publicly available the work or content, in unchanged or modified form, for any legal purpose, provided you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Publication year
Language

Content Metadata

Subject area
Genre
Abstract
At FOSDEM 2021 we presented the LUMI supercomputer and discussed AMD's open software platform for GPU-accelerated computing (ROCm), how to port CUDA codes to the Heterogeneous Interface for Portability (HIP), and some performance results obtained on the NVIDIA V100 GPU. In this talk we assume the audience is familiar with that presentation. One year later, we have run many codes on the AMD MI100 GPU, tuned the performance of various codes and benchmarks, and used and tuned several programming models, such as HIP, OpenMP offloading, Kokkos, and hipSYCL, on the MI100, comparing their performance additionally with the NVIDIA V100 and NVIDIA A100 (including CUDA). Furthermore, AMD has released a new open-source tool called GPUFort, which ports Fortran+CUDA and Fortran+OpenACC codes to Fortran+HIP for AMD GPUs. In this talk we present what we learned through this experience, how we tune codes for the MI100 and how we expect to tune them in the future for the LUMI GPU, the AMD MI250X; we compare the aforementioned programming models on some kernels across the GPUs, present a performance comparison for a single-precision benchmark, discuss the updated software roadmap, and give a brief update on the porting workflow.
Transcript: English (automatically generated)
Hello everybody, my name is George Markomanolis. I'm a Lead HPC Scientist at CSC - IT Center for Science, and I'm excited to be here to present some work on utilizing AMD GPUs: tuning, programming models, and roadmap.
Just before I start, let me mention really quickly what LUMI is: the supercomputer that is going to be installed in Finland. For more details, see my last year's presentation; here I'll just say that the motivation for this talk is the AMD GPUs, as LUMI will get almost half an exaflop from AMD Instinct GPUs. The CPU partition has already been installed and is running some pilot programs, it is opening to the public soon, and there is a storage system as well, but let me continue to the next slide.
Now, this is a mock-up of the architecture of the MI100. This is not the GPU of LUMI; this is the GPU that we have access to. You can see here the asynchronous compute engines, the hardware scheduler, and 8 shader engines (1, 2, 3, 4 and another 4), each with 16 compute units, of which 8 in total are disabled, and each compute unit has 64 stream processors. It also has the fabric links and all those things, but I'm not going into more detail; the key number to keep in mind is the 120 compute units.
Really one slide, just an introduction to HIP; you can find more details in last year's presentation. HIP is the Heterogeneous Interface for Portability, developed by AMD, and it can be executed on both platforms, AMD and NVIDIA. Many well-known libraries are supported in HIP, there is minimal overhead, and new projects, or ports from CUDA, can be developed directly in HIP. The supported CUDA API calls get a hip prefix, so cudaMalloc becomes hipMalloc, etc., and you can find HIP at this link.
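To make the prefix renaming concrete, here is a tiny HIP program of my own (an illustration, not code from the talk). The kernel syntax and the launch are the same as in CUDA; only the runtime calls are renamed:

```cpp
// Compile with hipcc; runs on AMD GPUs and, via the CUDA back end, on NVIDIA.
#include <hip/hip_runtime.h>
#include <cstdio>

__global__ void scale(float* x, float a, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;  // same kernel syntax as CUDA
  if (i < n) x[i] *= a;
}

int main() {
  const int n = 1 << 20;
  float* d_x = nullptr;
  hipMalloc(&d_x, n * sizeof(float));    // cudaMalloc  -> hipMalloc
  hipMemset(d_x, 0, n * sizeof(float));  // cudaMemset  -> hipMemset
  scale<<<(n + 255) / 256, 256>>>(d_x, 2.0f, n);
  hipDeviceSynchronize();                // cudaDeviceSynchronize -> hip...
  hipFree(d_x);                          // cudaFree    -> hipFree
  printf("done\n");
  return 0;
}
```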
Now, a benchmark on matrix multiplication. The code uses cuBLAS and was converted to hipBLAS; about the conversion, again, see last year's presentation. In this small example it's a matrix multiplication of size 2000 in single precision; all the CUDA calls were converted and it was linked against hipBLAS. What you see here on the y-axis is gigaflops: the V100 achieved around 12-13 teraflops in this case, while the MI100 achieved close to 22 teraflops, so the MI100 came closer to its theoretical peak than the V100,
and it was quite straightforward and efficient. In the next benchmark we have an N-body simulation; I give you where I found the code. All the CUDA calls were ported, and we have almost 33,000 particles and 2,000 time steps.
Now, what's interesting here: the y-axis is seconds. The V100 was close to 70 seconds, and when I ported the code and ran it on the MI100, it was close to 95 seconds, something like that. So the MI100 had worse performance than the V100. Then I checked the code and, of course, the ROCm changelog: in ROCm 4.1 they decided to use 1000 threads per block by default instead of 256. So I changed this back manually to 256 threads per block.
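This is roughly what pinning the launch configuration looks like in HIP; the kernel name, body, and arguments are hypothetical stand-ins for the benchmark's own code, not the actual N-body source:

```cpp
#include <hip/hip_runtime.h>

// Hypothetical stand-in for the benchmark's force kernel.
__global__ void bodyForce(float4* pos, float4* vel, float dt, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) pos[i].x += vel[i].x * dt;  // placeholder work
}

void launchTuned(float4* d_pos, float4* d_vel, float dt, int nBodies) {
  const int threadsPerBlock = 256;  // the tuned value, not the new default
  const int blocks = (nBodies + threadsPerBlock - 1) / threadsPerBlock;
  bodyForce<<<blocks, threadsPerBlock>>>(d_pos, d_vel, dt, nBodies);
  hipDeviceSynchronize();
}
```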
With that change the MI100 achieved faster performance than the V100, at around 50-something seconds. It's not a huge speed-up, but it shows that if we only compare default configurations, without knowing how to tune, we might say, oh, the MI100 is worse than the V100, and so on.
So, some tuning is required. BabelStream is a memory-bandwidth benchmark from the University of Bristol; it's well known for benchmarking memory. It has five kernels (add, multiply, copy, triad, and dot), and you can see the formulas on the slide. Dot is interesting because it is basically a reduction, and triad performs similarly to the other kernels; I will mention some results later. These are the benchmarks I mainly use in some cases.
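For reference, here are the five kernels as plain loops, in my paraphrase of the standard definitions (not the benchmark source):

```cpp
// a, b, c are arrays of length N; `scalar` is a constant.
void babelstream_kernels(int N, double scalar,
                         double* a, double* b, double* c, double& sum) {
  for (int i = 0; i < N; i++) c[i] = a[i];                  // copy
  for (int i = 0; i < N; i++) b[i] = scalar * c[i];         // mul
  for (int i = 0; i < N; i++) c[i] = a[i] + b[i];           // add
  for (int i = 0; i < N; i++) a[i] = b[i] + scalar * c[i];  // triad
  sum = 0.0;
  for (int i = 0; i < N; i++) sum += a[i] * b[i];           // dot (a reduction)
}
```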
Now, about improving the OpenMP offloading performance of BabelStream on the MI100. The original code uses omp target teams distribute parallel for simd: target to reach the GPU, teams to create the teams, distribute to distribute the workload, and parallel for to make it parallel. Now, the simd clause: if you use AMD's LLVM compiler, up to at least LLVM 12, it doesn't do anything; whether you have it or not doesn't matter. But with the Cray compiler it's mandatory: you will see a warning that the loop will not be parallelized correctly, so you need to keep the simd. Now, since I know I want 256 threads per block, and twice the number of compute units as teams in order to cover the hardware, I add thread_limit(256) and num_teams(240). For the dot kernel it was trial and error: we achieved the best performance with 720 teams.
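Here is a hedged reconstruction of what those tuned directives look like, in the style of the BabelStream triad and dot kernels rather than the exact source:

```cpp
// Triad: 256 threads per block, 240 teams (2 x 120 compute units on MI100).
#pragma omp target teams distribute parallel for simd \
        thread_limit(256) num_teams(240)
for (int i = 0; i < N; i++)
  a[i] = b[i] + scalar * c[i];

// Dot: a reduction; 720 teams was found best by trial and error.
#pragma omp target teams distribute parallel for simd \
        thread_limit(256) num_teams(720) map(tofrom: sum) reduction(+: sum)
for (int i = 0; i < N; i++)
  sum += a[i] * b[i];
```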
So, this is how you can manually tune the performance of OpenMP offloading on the MI100, usually per kernel. Now, let's discuss mixbench. It's a benchmark tool to evaluate the performance bounds of GPUs
on mixed operational intensity kernels. The executed kernel is customized over a range of operational intensity values; the supported programming models are CUDA, HIP, OpenCL, and SYCL, and we use CUDA and HIP. We run three types of experiments, combined with global memory accesses: single precision, double precision, and half precision, with multiply and addition, and the results are presented for peak performance only. The source of the benchmark is on GitHub. So, the y-axis is gigaflops, and we see here the NVIDIA V100:
double precision, single precision, and half precision. From double precision at around 7.5 teraflops, single precision increases to roughly double, 14-15 teraflops, and then at half precision it goes a bit down, to around 10-11 teraflops. So, what happens next? We compare the A100, which is close to 9.5 teraflops in double precision, goes to more than 19 for single precision,
and, if I remember correctly, close to 54-55 for half precision: a significant improvement at half, and of course for the other precisions too. Then we compare with the MI100, which, I repeat, is one generation before the LUMI GPU. What we see here: double precision increases to just above 10 teraflops, and single precision is close to 22 teraflops, again better than the A100, but half precision is lower, close to 43-44 teraflops, so lower than the A100. Half-precision operations did not perform as well compared to the A100. Now, programming models: we have used with success at least the following programming models
on the MI100: HIP, of course, OpenMP offloading, hipSYCL, Kokkos, and Alpaka. I have run benchmarks with all of them, and I will present results for all except Alpaka. Now, let's talk about SYCL. We're using hipSYCL to target the AMD GPUs.
SYCL is a C++ single-source heterogeneous programming model for acceleration offload, with generic programming via templates and lambda functions. It has big momentum lately; for example, NERSC, Argonne, and Codeplay have a contract to support it on upcoming systems.
The SYCL 2020 specification was announced last year. There is a lot of terminology: unified shared memory; buffers, with accessors to access the buffers and drive data movement; queues, to send requests to the GPUs; etc. hipSYCL supports CPUs, AMD and NVIDIA GPUs, and Intel GPUs experimentally. Now, something important to mention: the NVIDIA GPU support in the framework has changed, so code that is written with hipSYCL can be executed on an NVIDIA system
without having HIP installed. So, despite what the name suggests, hipSYCL doesn't really have a dependency on the HIP framework anymore; that's just how the name started. What does this mean? On the EuroHPC systems, something you wrote on LUMI with hipSYCL to run on AMD GPUs will run on Leonardo without really having to request any software installation from Leonardo's support. Of course, this depends on external libraries and so on, but not on SYCL itself.
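Here is a minimal SYCL sketch of my own (an illustration, not code from the talk) showing the queue/buffer/accessor terminology on a triad kernel; it compiles with hipSYCL (syclcc) as well as other SYCL implementations:

```cpp
#include <CL/sycl.hpp>
#include <vector>
namespace sycl = cl::sycl;

int main() {
  const size_t n = 1 << 20;
  const float scalar = 0.4f;
  std::vector<float> a(n, 0.0f), b(n, 1.0f), c(n, 2.0f);

  sycl::queue q;  // requests go to the default device (a GPU if available)
  {
    sycl::buffer<float> A(a.data(), sycl::range<1>(n));
    sycl::buffer<float> B(b.data(), sycl::range<1>(n));
    sycl::buffer<float> C(c.data(), sycl::range<1>(n));
    q.submit([&](sycl::handler& h) {
      auto ra = A.get_access<sycl::access::mode::write>(h);
      auto rb = B.get_access<sycl::access::mode::read>(h);
      auto rc = C.get_access<sycl::access::mode::read>(h);
      h.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
        ra[i] = rb[i] + scalar * rc[i];  // triad
      });
    });
  }  // buffers go out of scope: results synchronize back to the host vectors
  return 0;
}
```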
Kokkos implements a programming model in C++ for writing performance-portable applications targeting major HPC platforms. It provides abstractions for both the parallel execution of code and data management. The funding comes from the ECP and the NNSA. There is a lot of terminology: views; execution spaces, such as serial, threads, OpenMP, GPU, etc.;
memory spaces, such as DRAM, NVRAM, and others; patterns, such as the parallel_for pattern, the reduction, the scan, etc.; and execution policies, with static or dynamic scheduling, thread teams, and so on. It supports CPUs, AMD and NVIDIA GPUs, Intel KNL, etc.
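Again a minimal sketch of my own to ground the terminology (views, a parallel_for pattern, and a parallel_reduce), not code from the talk:

```cpp
#include <Kokkos_Core.hpp>
#include <cstdio>

int main(int argc, char* argv[]) {
  Kokkos::initialize(argc, argv);
  {
    const int n = 1 << 20;
    const double scalar = 0.4;
    // Views allocate in the default execution space's memory
    // (GPU memory when built with the HIP back end).
    Kokkos::View<double*> a("a", n), b("b", n), c("c", n);

    Kokkos::parallel_for("triad", n, KOKKOS_LAMBDA(const int i) {
      a(i) = b(i) + scalar * c(i);
    });

    double sum = 0.0;  // dot product expressed as a reduction pattern
    Kokkos::parallel_reduce("dot", n,
        KOKKOS_LAMBDA(const int i, double& local) { local += a(i) * b(i); },
        sum);
    printf("dot = %f\n", sum);
  }
  Kokkos::finalize();
  return 0;
}
```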
The Kokkos link is well known, with many tutorials online. So, let's discuss Alpaka. It's an abstraction library for parallel kernel acceleration: a header-only C++14 abstraction library for accelerator development, developed by HZDR in Germany. The terminology is similar to CUDA (grid, block, thread), plus an element level that goes, let's say, even deeper than a thread, so you can do vectorization; you can control everything. The platform is decided at compile time, so it is a single-source interface. It's easy to port CUDA codes through cupla; the interface is really similar, quite easy I would say. Some terminology here as well: queues, blocking or non-blocking; buffers; and work division, where you define a few things. It supports HIP, CUDA, TBB, OpenMP on CPU and GPU, etc.,
and there is a GitHub link. We have also collaborated with the developers and ported BabelStream to Alpaka, but those results are in a paper still to be presented. Now, let's discuss the BabelStream results. Here, the left three groups of bars
are for the triad kernel, and the right three are for the dot kernel. The colors are CUDA or HIP (depending on the device), hipSYCL, Kokkos, and OpenMP offloading; the platforms are the V100, A100, and AMD MI100. What you see is that, for NVIDIA, the V100 is close to 800 gigabytes per second, the A100 is close to 1.4 terabytes per second, and the MI100 is close to 1 terabyte per second, a bit less.
I would say that the A100 is really close to its peak, and similarly the V100. Now, I should mention that I used the Kokkos version from BabelStream, and it's not really optimized for the GPU; I would have had to change it to a two-level reduction for the dot kernel. So, Kokkos here is not bad, but it's not optimized,
whereas the others I could tune manually. For the V100, we see that the reduction performs similarly to the triad, except for Kokkos, but Kokkos is not optimized. For the A100 we see the same again, with hipSYCL and Kokkos a bit lower. For AMD we see the same trend, hipSYCL and Kokkos a bit lower, but OpenMP drops to quite low performance. This is a known issue with OpenMP offloading: the reduction is not that efficient. We keep watching the performance over time, across the new versions, but the reduction is something OpenMP does not yet handle efficiently.
So, if your code has a lot of reductions, maybe OpenMP is not the right approach. Now, the AMD Instinct MI250X. Here we are with the LUMI GPU card,
and there are many similarities and differences. It has two graphics compute dies (GCDs); I will show in the next slide what I mean by that. Each GCD has 64 gigabytes of HBM2e memory, 128 gigabytes in total, so we have something like two GPUs in one GPU. The peak performance is 26.5 teraflops per GCD, which means the peak performance per device, if I may put it like that, is around 53 teraflops.
The memory bandwidth is 1.6 terabytes per second, again per GCD, so per GPU it's 3.2 terabytes per second. We have 110 compute units per GCD, 220 in total, and the GCDs are interconnected at 200 gigabytes per second per direction, so there is quite high bandwidth between them. Also, the network interconnect is attached to the GPU, not to the CPU. So, on LUMI, if you do ping-pong tests from the CPU, the traffic will go through the GPU, and you may have to pay some overhead; I have not tested this, we don't have access to these devices yet, so this is an assumption. What you have to understand is that on LUMI the interconnect is on the GPU, so your MPI send buffers and so on should be allocated in GPU memory to be more efficient.
Now, the MI250X; this is again a mock-up designed by myself. What I want to show you: there are now four shader engines per die (one, two, three, four), where before the MI250X there were eight. One GCD is this thing here; the second is this one. Each GCD has 110 compute units, and there is the 200-gigabytes-per-second-per-direction interconnect between them. There is Infinity Fabric around them that goes out of the GPU, and there are memory controllers for the HBM2e memory. There are two of these dies, and this is one GPU: you don't see two GPUs here, it's one. Now, someone could say this is two GPUs, but in terms of terminology it's one device with, let's say, two GPUs inside, if you want to put it like that; GCD is the correct term, I would say. So, how do you use this device, and what do we miss here?
I mean, if I have one MPI process, what happens? How do you use these things? Utilize Cray MPICH with GPU support, and you have to export an environment variable to enable the GPU support. I'm not sure whether this will change, so don't take it as a given. Use one MPI process per GCD, so two MPI processes per GPU, and eight MPI processes per node if you plan to utilize all four GPUs. What do I mean? If your program is a single MPI process, okay, there is no other solution: you will use one GCD, and with one GCD you have 110 compute units and 1.6 terabytes per second of peak bandwidth, not 3.2. That's why you use two MPI processes per GPU, one per GCD, in order to use all the bandwidth and all the compute; see the sketch below.
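A hedged sketch (my illustration, not LUMI documentation) of how each MPI rank can pick its own GCD; every GCD appears as a separate HIP device, so binding one rank per device gives one rank per GCD:

```cpp
#include <mpi.h>
#include <hip/hip_runtime.h>
#include <cstdio>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  // Ranks sharing a node get consecutive local ranks 0..k-1.
  MPI_Comm local;
  MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                      MPI_INFO_NULL, &local);
  int localRank = 0;
  MPI_Comm_rank(local, &localRank);

  int deviceCount = 0;
  hipGetDeviceCount(&deviceCount);        // e.g. 8 GCDs on a 4-GPU node
  hipSetDevice(localRank % deviceCount);  // one rank drives one GCD

  printf("local rank %d -> HIP device %d of %d\n",
         localRank, localRank % deviceCount, deviceCount);

  MPI_Comm_free(&local);
  MPI_Finalize();
  return 0;
}
```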
And if you want to use more GPUs on the system, you can go from two MPI processes to four, six, or eight; usually six doesn't sound right, but one to four GPUs is typical, and it depends on your application, of course. The MI250X can have multiple contexts on the same GPU, so it supports many MPI processes per GPU by default; this is what on NVIDIA we would call CUDA MPS. It's active by default, so you can run many MPI processes on the same GPU, but be careful about contention and all these things. It's not a free ride, okay? So, you have to be careful with this.
Now, if the application requires a different number of MPI processes, I suppose you will be free to use that; we have not tested it yet, but we hope it works efficiently as it is. Now, OpenACC. GCC will provide OpenACC through the Mentor Graphics contract (now Siemens EDA), but it is focused on functionality. What does that mean? The performance will not really be comparable to the other compilers. It supports OpenACC version 2.6, maybe 2.7, for Fortran. This is quite an old OpenACC version, but maybe it's enough,
and they have announced that they will not support OpenACC for C++. What does that mean? If you have C++, don't use OpenACC for LUMI; that's the general message. Now, there's Clacc from ORNL. This is OpenACC for LLVM, currently only for C,
with Fortran and C++ probably coming in the future. It translates OpenACC code to OpenMP offloading, and if the code is in Fortran we could also use GPUFort, which I will mention later. So here is Clacc from ORNL: I use the clang that I got from ORNL on a Jacobi file, with some options that don't matter for the moment, to convert the OpenACC and print the resulting OpenMP. The original code is this one; you see acc parallel loop, reduction, private, etc. The new code, where I did nothing except run this clang, uses omp target teams with map(tofrom:), firstprivate, reduction, and distribute, although it misses a parallel for here, plus the loop.
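Here is a hedged reconstruction of that kind of rewrite for one Jacobi update loop; this is my paraphrase, not Clacc's literal output (note the missing parallel for that I just mentioned):

```cpp
double jacobi_step(const double* A, double* Anew, int n) {
  double err = 0.0;
  // Original OpenACC: #pragma acc parallel loop reduction(max:err)
  #pragma omp target teams distribute map(to: A[0:n*n]) \
          map(from: Anew[0:n*n]) map(tofrom: err) firstprivate(n) \
          reduction(max: err)
  for (int i = 1; i < n - 1; i++)
    for (int j = 1; j < n - 1; j++) {
      Anew[i*n + j] = 0.25 * (A[i*n + (j+1)] + A[i*n + (j-1)]
                            + A[(i-1)*n + j] + A[(i+1)*n + j]);
      double d = Anew[i*n + j] - A[i*n + j];
      if (d < 0.0) d = -d;       // |Anew - A|
      if (d > err) err = d;      // body of the max reduction
    }
  return err;
}
```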
So, it converted it. Maybe it's not the most efficient, but it's still under development. Now, I want to show you some results; I'm sorry, but this is for the V100 only.
What I want to show you here is OpenACC from PGI (still called PGI at the time), Clang 12, and BabelStream with the five classic kernels, plus OpenACC with GCC 10 and with OG10. OG10 is the development branch of version 10, with the latest OpenACC/OpenMP development work and other improvements, and you can see how OpenACC improves when I use the latest GCC from OG10; the crucial part was the GCC support. The dot performance in particular was significantly improved, more than all the other kernels, through this newer GCC. So, GCC first did functionality only; now they are making some performance improvements, but you can see they are still behind the other compilers.
Now, GPUFort is a new tool from AMD with which you can basically convert CUDA Fortran codes or OpenACC Fortran codes to Fortran plus OpenMP offloading or HIP. Some things that you would otherwise have to do manually for Fortran it does automatically for you: it creates the interfaces to call the kernels from C++ files, et cetera. It's quite a complicated workflow, but it works for some examples, and it's still under development; it's developed, as you can see here, by Dominic and colleagues at AMD. Afterwards you can use AMD's OpenMP compiler for the OpenMP
offloading and all these situations. Just to show you a simple example: on the left here I have a code with OpenACC; you see in the box the main OpenACC part, and it creates the right-hand part, with an ifdef around the original file.
What it says is: ifdef GPUFORT, then start calling some GPUFort routines, the ACC initialization, copies and other aspects, and how to launch the kernel and so on; the else branch is the original code, exactly the same thing. So, this path is used only if you build with GPUFort. But it's not only this.
Automatically, it also creates this part, the extern C routine that includes the launcher for the kernel. It has many, many options that maybe you don't use in the original code, but it defines them either way, and here is the kernel. So, all of this code was created from the code on the previous slide,
and here you can see how the kernel was created automatically; all of this is automatic. Now, I showed you the porting diagram last year,
and here is what has changed. If you have OpenACC, you have Cray for Fortran, Clacc or Flacc if they reach production, and GCC; and if the code is Fortran, you can use GPUFort. Basically, as I also said already, HPE supports OpenACC only for Fortran, and the Clacc and Flacc OpenACC compilers are research projects; we have no contract with them, so I don't know what will be released. GCC still lacks performance, but they're still in the game here.
If you use GPUFort and the performance is good, perfect; if not, you can profile and tune the OpenMP and HIP calls, improve data transfers, and see what else you can achieve. So, tuning. Use multiple wavefronts per compute unit; it's important for hiding latency and for instruction throughput. Tune the number of threads per block and, as I said, the number of teams for OpenMP offloading; other programming models support similar controls. Memory coalescing increases bandwidth; unrolling loops allows the compiler to prefetch data; very small kernels can cause launch-latency overhead, depending on the workload. Use the local data share (LDS), a small memory that is really close to the compute units and has really high bandwidth; see the sketch below. And profile, which can be a bit difficult without proper tools.
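As an illustration of the LDS point, here is a hedged sketch of a dot-product kernel in HIP (my own, not from the talk): each block stages partial sums in LDS, declared with __shared__, and writes one partial result for the host or a second kernel to finish:

```cpp
#include <hip/hip_runtime.h>

__global__ void dotKernel(const float* a, const float* b,
                          float* partial, int n) {
  __shared__ float lds[256];  // lives in the compute unit's LDS
  const int tid = threadIdx.x;
  float sum = 0.0f;
  // Grid-strided loop keeps the global loads coalesced.
  for (int i = blockIdx.x * blockDim.x + tid; i < n;
       i += gridDim.x * blockDim.x)
    sum += a[i] * b[i];
  lds[tid] = sum;
  __syncthreads();
  // Tree reduction within the block, entirely in LDS.
  for (int s = blockDim.x / 2; s > 0; s >>= 1) {
    if (tid < s) lds[tid] += lds[tid + s];
    __syncthreads();
  }
  if (tid == 0) partial[blockIdx.x] = lds[0];
}
```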
Conclusions and future work. A code written in C++ with MPI plus OpenMP is a bit easier to port to OpenMP offloading compared to other approaches. hipSYCL, Kokkos, and Alpaka could be good options, considering that the code is in C++. There can be challenges, depending on the code, on which GPU functionalities are integrated into an application, and on how new they are. You will be required to tune the code for high occupancy and to track performance across new compiler versions; this is what I'm doing, and you also have to do it whenever you use a new compiler, because it can be worse for some things. The same goes for OpenACC and OpenMP offloading. For AMD GPUs, the tools can be tricky to install and I have hit many issues; I will keep tracking how the profiling tools work on AMD GPUs and will try to test rocprof, TAU, Score-P, and HPCToolkit. Also, we have an accepted paper, "Evaluating GPU Programming Models for the LUMI Supercomputer"; it will be presented at Supercomputing Asia in March and will show more results, with Alpaka as well.
So that's it for me. Thanks a lot for any questions. Thank you.
Less than a minute for live Q&A. Maybe very quickly, the question by Chris: do you have any experience of the time and effort involved in porting existing codes to AMD GPUs? So, basically, it totally depends on the programming model used. For example, I had a case where I ported GROMACS for testing, which is a huge code. It took me almost one and a half days while I didn't know the errors exactly, but once I knew the procedure, in less than five hours I was able to port it and have the binary; not the performance, but the binary.
Okay, yeah.