Let There Be Topology-Awareness in Kube-Scheduler!

Formal Metadata

Title
Let There Be Topology-Awareness in Kube-Scheduler!
Subtitle
Enhancing Kubernetes Scheduler
Series Title
Number of Parts
637
Author
License
CC Attribution 2.0 Belgium:
You may use, adapt, copy, distribute, and make the work or its content publicly available, in unchanged or adapted form, for any legal purpose, provided that you credit the author/rights holder in the manner they have specified.
Identifiers
Publisher
Publication Year
Language

Content Metadata

Subject Area
Genre
Abstract
As Kubernetes gains popularity for performance-critical workloads such as 5G, Edge, IoT, Telco, and AI/ML, it is becoming increasingly important to meet the stringent networking and resource management requirements of these use cases. Such workloads need topology information in order to use co-located CPU cores and devices. Topology Manager successfully aligns the topology of requested resources, yet the native scheduler does not take that topology into account when it selects a node. It's time to solve this problem! We will introduce the audience to hardware topology, the current state of Topology Manager, gaps in the current scheduling process, and prior out-of-tree solutions. We'll explain the workarounds available right now: writing custom schedulers, creating scheduling extensions, using node selectors, or assigning resources manually or semi-automatically. All of these methods have drawbacks. Finally, we will explain how we plan to improve the native scheduler to work with Topology Manager. Attendees will learn about both the current workarounds and the future of topology-aware scheduling in Kubernetes.

Kubernetes has taken the world by storm, attracting unconventional workloads such as HPC, Edge, IoT, Telco and communication service providers, 5G, AI/ML, and NFV solutions. This talk will benefit users, engineers, and cluster admins who deploy performance-sensitive workloads on Kubernetes. Adding newer nodes alongside older ones in data centers results in hardware heterogeneity. To save physical space in data centers, newer nodes are packed with more CPUs and enhanced hardware capabilities. Exposing fine-grained topology information for optimised workload placement would help service providers and VNF vendors too. We'll explain the numerous challenges encountered in deploying workloads efficiently when the hardware topology of the underlying bare-metal infrastructure is not understood and cannot be used for scheduling.
The scheduler's lack of knowledge of resource topology can lead to unpredictable application performance, generally to under-performance, and in the worst case to a complete mismatch between resource requests and kubelet policies: a pod is scheduled onto a node where it is destined to fail, potentially entering a failure loop. Exposing cluster-level topology to the scheduler empowers it to make intelligent NUMA-aware placement decisions that optimize the cluster-wide performance of workloads. This would benefit the Telco User Group in Kubernetes, Kubernetes itself, and the overall CNCF ecosystem by enabling improved application performance without impacting user experience.
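
The mismatch described in the abstract can be illustrated with a small sketch. It is not taken from the talk and does not use any Kubernetes API: the `NumaZone` and `Node` types, the resource names, and both predicate functions are hypothetical, chosen only to contrast a check based on node-wide totals with a check that requires one NUMA zone to satisfy the whole request (roughly what a strict kubelet alignment policy such as `single-numa-node` would demand).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class NumaZone:
    """Free resources in one NUMA zone (hypothetical model, not a Kubernetes type)."""
    free: Dict[str, int]


@dataclass
class Node:
    """A worker node described by its NUMA zones (hypothetical model)."""
    name: str
    zones: List[NumaZone]


def fits_by_node_totals(node: Node, request: Dict[str, int]) -> bool:
    """Topology-unaware check: compare the request with node-wide free totals."""
    return all(
        sum(zone.free.get(res, 0) for zone in node.zones) >= qty
        for res, qty in request.items()
    )


def fits_single_numa_zone(node: Node, request: Dict[str, int]) -> bool:
    """Topology-aware check: at least one zone must satisfy the whole request."""
    return any(
        all(zone.free.get(res, 0) >= qty for res, qty in request.items())
        for zone in node.zones
    )


def filter_nodes(
    nodes: List[Node],
    request: Dict[str, int],
    predicate: Callable[[Node, Dict[str, int]], bool],
) -> List[str]:
    """Return the names of the nodes that pass the given predicate."""
    return [node.name for node in nodes if predicate(node, request)]


# Assumed example data: both nodes have 4 free CPUs and 1 free NIC in total,
# but only node-b has them co-located in a single NUMA zone.
nodes = [
    Node("node-a", [NumaZone({"cpu": 2, "nic": 1}), NumaZone({"cpu": 2, "nic": 0})]),
    Node("node-b", [NumaZone({"cpu": 4, "nic": 1}), NumaZone({"cpu": 0, "nic": 0})]),
]
request = {"cpu": 4, "nic": 1}

print(filter_nodes(nodes, request, fits_by_node_totals))    # ['node-a', 'node-b']
print(filter_nodes(nodes, request, fits_single_numa_zone))  # ['node-b']
```

Under these assumptions, a scheduler that only sees totals may pick node-a, where a strict alignment policy on the node would then reject the pod. A scheduler that sees per-zone data can rule out node-a before placement.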