Titus: Adventures in Multi-tenant Scheduling

Formal Metadata

Title
Titus: Adventures in Multi-tenant Scheduling
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Titus is a multi-tenant scheduler that runs a variety of workloads, ranging from online workloads that serve customer traffic to big data workloads that perform machine learning. Getting all of these workloads to cooperate on a shared pool of resources is a challenge. To add a bit of complexity to the mix, these workloads all run on the cloud, on a shared storage and network fabric. Come to this talk to learn how our approach to multi-tenancy works, as well as some of the challenges we faced along the way.
Titus is a system that allows users to submit arbitrary container workloads to the cloud and get their workloads running across many thousands of cores or more. This comes with a variety of challenges. We attacked this problem with a three-pronged approach.
Our approach to scheduling is multi-tenant first. Our scheduler understands different workloads and the fact that different workloads have different Service Level Objectives. In addition, it understands the cloud and the fact that it is a shared control plane. Lastly, we’ve had to teach our scheduler to handle situations such as failover, where scaling up is key, in contrast to traditional scheduling.
Our approach to systems is evolving. Historically, our fleet was many single-tenant VMs. We’ve attacked systems-level multi-tenancy from multiple perspectives. The first of these involved giving our users APIs that were as close as possible to what they had on the VM. Subsequently, we’ve tried to enable security mechanisms like seccomp and AppArmor that allow us to run nearly any workload on Titus. Lastly, we’re still figuring out resource isolation. Cgroups have come a long way, but there is still a long way to go before we can be as good as VMs.
All of our infrastructure runs on the cloud. We decided that our approach to scheduling and systems multi-tenancy should be cloud native and leverage as many mechanisms as possible that already exist in the cloud rather than invent our own. Although this gave us a massive head start, it didn’t come for free. We had to solve problems like coordination-free optimistic interactions with our SDN, and find solutions for shared storage.
Transcript: English (auto-generated)