
Being compliant with Open Container Initiative Spec


Formal Metadata

Title
Being compliant with Open Container Initiative Spec
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
Open Container Initiative (OCI) started in 2015 to make different implementations of container runtimes and images compliant with well-defined specifications. Together with other folks at Kinvolk, I have been involved in various OCI projects for months, and have encountered various issues that occur in the runtime spec and in runtime-tools for verification. Since we live in the real world, not everything works as well as expected. I'm going to talk about practical issues and possible ways to improve them. Open Container Initiative (OCI) defines the container runtime spec (https://github.com/opencontainers/runtime-spec) as well as the container image spec (https://github.com/opencontainers/image-spec) and the distribution spec (https://github.com/opencontainers/distribution-spec). There is also runtime-tools (https://github.com/opencontainers/runtime-tools), which helps container runtimes verify compliance with the runtime specification. The standard container runtime is runc (https://github.com/opencontainers/runc), which is included in multiple high-level container managers like Docker or containerd. Most of the practical issues arise when the specification is not clearly defined in the first place, when container runtimes have their own reasons for not being compliant with the specs, or when there is no consensus in the community on how to proceed. On the other hand, container orchestration systems like Kubernetes have defined their own interfaces, such as the Container Runtime Interface (CRI). The different interfaces (OCI runtime and CRI) exist at different layers in the software stack. I'll show how CRI depends on OCI and some mismatches between them. In this talk I want to introduce such practical issues and suggest how we should proceed regarding spec compliance.
Transcript: English (auto-generated)
So I'm going to talk about being compliant with OCI, the Open Container Initiative. My name is Dongsu, and I'm a software engineer working at Kinvolk.
I love to work across different spaces, like the Linux kernel and user space. I'm also working on different projects, for example Flatcar Linux and kube-spawn. So if you are interested in the open-source projects
we are doing at Kinvolk, you can visit our website, blog, Twitter, and so on. I'm going to talk mainly about container runtimes today. For that, I need to go through
the history and background. Before OCI was born, there were two different runtimes, Docker and rkt. The two container runtimes were competing with each other, and Docker was, of course, the more prominent one.
On the other hand, rkt was favored by many people because of its layered architecture, better isolation, and so on. So people started to think about a common interface: how we could solve the common problems with better communication, specifications, and tools.
They started to build an open-source open-container project. That was 2015. After that, Open Container Initiative became the name,
and the runtime spec and the image spec were started. Kubernetes started to support both runtimes in 2016, and both container runtimes joined CNCF last year. That led to releases of both the runtime spec
and the image spec. This year the distribution spec was also started, and it is still in progress. I took this picture from the CNCF landscape GitHub repository.
It is only a small part of the landscape, just the part about container runtimes. In this picture you can already see that this is not only about one specific container runtime, but about all the members involved in this area:
for example, CRI-O, written by Red Hat, or Kata Containers, or the fresh member PouchContainer, written by Alibaba. So making a common spec and validating it
is really an important topic. I'm going to show an example. Think about a scenario where you download a Docker image by running docker pull. First of all, the image should be available
on an HTTP server; that is the image distribution endpoint. Then, on the container runtime side, after pulling the image, you can run the container runtime.
The image spec is about how images are built on the server side, while the runtime spec is about how those downloaded images can actually be run on the client side.
And the distribution spec is about transferring the data, so HTTP protocols and so on. The runtime spec is actually a collection of requirements that every OCI-compliant container runtime
should conform to. It defines a lot of requirements about how to run a filesystem bundle on the local disk. So what is a filesystem bundle? It is a set of files, not only metadata
but also the data itself, which compliant runtimes need in order to run the standard operations. It consists of two parts: config.json, the configuration data, which I cover on the next slide,
and the container's root filesystem. If it's a Linux container, you will see the usual structure there, like /dev, /proc, /sys, and so on. The whole spec is available in the GitHub repo, which you can visit.
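As a quick illustration, a bundle on disk looks roughly like this (a sketch; the spec mandates config.json and a root filesystem, while the directory names inside rootfs are just the usual Linux layout):

    mybundle/
    ├── config.json        # the runtime configuration, covered on the next slide
    └── rootfs/            # the container's root filesystem
        ├── bin/
        ├── dev/
        ├── etc/
        ├── proc/
        └── sys/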
Let me show an example, so you can see it. For example, if you want to specify a UID or GID, you can do that in the config file, or specify what you would like to run in the container,
in this case a simple shell. Also the current working directory, capabilities, the rootfs, and mount points. If it's a Linux container, you can even specify Linux namespaces or seccomp rules to filter specific syscalls.
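To make that concrete, here is a rough sketch in Go of how such a config.json could be produced with the types from the runtime-spec repository (the field values are made up for illustration; this is not the exact config from the slide):

    package main

    import (
    	"encoding/json"
    	"os"

    	specs "github.com/opencontainers/runtime-spec/specs-go"
    )

    func main() {
    	spec := specs.Spec{
    		Version: specs.Version, // the spec version this config conforms to
    		Process: &specs.Process{
    			User: specs.User{UID: 0, GID: 0}, // UID/GID inside the container
    			Args: []string{"sh"},             // run a simple shell
    			Cwd:  "/",                        // current working directory
    			Env:  []string{"PATH=/usr/bin:/bin"},
    			Capabilities: &specs.LinuxCapabilities{
    				Bounding: []string{"CAP_NET_BIND_SERVICE"},
    			},
    		},
    		Root: &specs.Root{Path: "rootfs", Readonly: true},
    		Mounts: []specs.Mount{
    			{Destination: "/proc", Type: "proc", Source: "proc"},
    		},
    		Linux: &specs.Linux{
    			// Linux namespaces for the container
    			Namespaces: []specs.LinuxNamespace{
    				{Type: specs.PIDNamespace},
    				{Type: specs.MountNamespace},
    			},
    			// a seccomp profile that rejects one specific syscall
    			Seccomp: &specs.LinuxSeccomp{
    				DefaultAction: specs.ActAllow,
    				Syscalls: []specs.LinuxSyscall{
    					{Names: []string{"keyctl"}, Action: specs.ActErrno},
    				},
    			},
    		},
    	}
    	enc := json.NewEncoder(os.Stdout)
    	enc.SetIndent("", "\t")
    	if err := enc.Encode(&spec); err != nil { // emit the config.json
    		panic(err)
    	}
    }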
There are also other specs. The image spec is about how a specific container image is actually composed.
It should contain sufficient information to launch the application on the target platform, so it covers the image manifest and the image config, and also the filesystem serialization. For example, if you look into a Docker image, you can see that it is not a single layer; multiple layers are there.
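The manifest side can be sketched with the Go types from the image-spec repository as well (the digests and sizes below are placeholders, not real values):

    package main

    import (
    	"encoding/json"
    	"os"

    	imgspecs "github.com/opencontainers/image-spec/specs-go"
    	v1 "github.com/opencontainers/image-spec/specs-go/v1"
    )

    func main() {
    	manifest := v1.Manifest{
    		Versioned: imgspecs.Versioned{SchemaVersion: 2},
    		// the image config, referenced by content digest
    		Config: v1.Descriptor{
    			MediaType: v1.MediaTypeImageConfig,
    			Digest:    "sha256:00000000...", // placeholder digest
    			Size:      1024,
    		},
    		// one descriptor per filesystem layer
    		Layers: []v1.Descriptor{{
    			MediaType: v1.MediaTypeImageLayerGzip,
    			Digest:    "sha256:11111111...", // placeholder digest
    			Size:      4096,
    		}},
    	}
    	if err := json.NewEncoder(os.Stdout).Encode(&manifest); err != nil {
    		panic(err)
    	}
    }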
You can see the image spec in its GitHub repo. Then there is the distribution spec, which is still in progress. It's about the requirements
for every compliant image distribution endpoint, for example the URL layout and the protocols; this is mainly about HTTP. It's pre-1.0, and it already contains quite a lot of information, but there is also some room for improvement,
and it is a bit too Docker-specific, I would say. You can also find this spec on GitHub. So what is runc? It is the standard container runtime that implements the OCI runtime spec.
It is quite widely used, of course. If you are running Docker on your laptop, for example, then you are already running runc, because Docker runs runc by default. runc is also used
by other high-level container runtimes, like CRI-O. runc is mainly written in Go; only a small part of it is written in C. The runc project has a long history,
because it was derived from Docker, and it has still not been released as a stable 1.0 version. It has been in the RC phase for a couple of years.
So maybe, hopefully, there will be a stable release this year. There are also alternatives you can use instead of runc: for example crun, written in C, as well as railcar and runsc. The last one is a fresh one,
part of Google's gVisor, which was released a couple of months ago. I would say crun is quite mature, but the other two projects are not that mature yet.
There is also runtime-tools, which is actually the main part I have been working on. runtime-tools is a collection of a runtime spec generator and validator. It has a generator that creates a runtime spec, the config.json
I showed on the previous slide. So you don't have to write the config file by hand; you can just run a simple command line, oci-runtime-tool generate, and you get a standard config file.
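For example (a sketch of the command-line usage; the exact flag names may differ between runtime-tools versions):

    # generate a config.json instead of writing it by hand
    oci-runtime-tool generate --output config.json

    # validate an existing bundle without starting a container
    oci-runtime-tool validate --path ./mybundle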
As for the validation part, validation can be done in two different ways. One is validating only an OCI bundle, without starting a container; that finishes very fast. The more interesting part is the second one:
validation that actually starts a container. It spawns a real container with a specific container runtime and runs every test case inside that container. That takes a little more time, but it is good for validating
each spec requirement. You can also see all of this in the GitHub repository. For example, in this diagram you can see that the generator generates a config.json using the spec, and the validation tool
reads that config and runs a specific runtime, runc. Each test case is run in that container, and you can see the result at the end: 3,000 or more tests passed, and only 15 failed.
Yeah, that's it. It's quite a good architecture, and it works mostly well. Its downside is that it runs only as a command-line tool;
it has no fancy GUI. So far everything looks pretty good: no issues at all, nothing to worry about. So what's the deal?
I have been working on runtime-tools and the specifications since the beginning of the year, and the main problems are listed here. For example, unclear descriptions in the spec: some parts of the spec are written in an ambiguous way.
I want to, for example, add some clarifying sentences, but then some people don't like that because it has too much technical or implementation detail, so it's not possible to change the spec.
Or something is required by the spec, but runc, the reference runtime, does not support it. That's pretty unfortunate; actually it should support it, and it should work. So theory does not always match reality. There are also long-running discussions
in the GitHub issues. Some GitHub issues are really old, more than two or three years, with never-ending discussions. It takes far too much time until everyone agrees. I can introduce several cases here.
For example, mount options. This is the simplest one. Just run a bind mount on the command line, and afterwards look at the mount table:
you will see that the mount type is represented as none. The spec, on the other hand, showed bind as the mount type in its example, even though bind is really a mount option, not a type. So there were long discussions about
what the correct behavior should be. In my opinion the "type": "bind" line simply had to be removed, and the fix was effectively just that simple removal.
But it took years until that could really be changed; several people tried, gave up, and tried again. Recently it was finally merged. So it's not an ideal situation.
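To illustrate the contested example (a sketch using the runtime-spec Go types; the paths are made up):

    package main

    import (
    	"fmt"

    	specs "github.com/opencontainers/runtime-spec/specs-go"
    )

    func main() {
    	// roughly how the spec's bind-mount example used to look:
    	m := specs.Mount{
    		Destination: "/data",
    		Type:        "bind", // the contested line: "bind" is not a real filesystem type
    		Source:      "/volumes/data",
    		Options:     []string{"bind", "rw"}, // "bind" really belongs here, as a mount option
    	}
    	// after performing such a mount, the kernel's mount table
    	// (/proc/mounts) reports the type as "none", not "bind"
    	fmt.Printf("%+v\n", m)
    }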
Another case is a little tricky: OCI hooks in runc are a bit of a mess. The spec requires that prestart and poststart hooks be handled when the container is started.
That's pretty obvious, but runc still deals with the hooks when creating a container. I'm not quite sure why that is,
probably for historical reasons. I actually tried to fix that in runc; you can see the pull request. It's still pending, not merged, not reviewed by the maintainers. And after several months, someone who is not a maintainer
came to me and said: hey, if your PR gets merged, my product would break, because it depends on the current runc implementation instead of the spec itself. So that has been quite an issue,
and maybe there are ways to work around it, but the PR is still pending. There is a similar issue as well: the spec requires the container state to be passed to each hook over stdin, and runc should do that too, but it actually doesn't. That PR has also been around for several months with no review.
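For reference, a spec-compliant hook would consume that state roughly like this (a minimal hypothetical hook binary, assuming the runtime writes the state JSON to the hook's stdin as the spec requires):

    package main

    import (
    	"encoding/json"
    	"fmt"
    	"os"

    	specs "github.com/opencontainers/runtime-spec/specs-go"
    )

    func main() {
    	// the runtime is supposed to pass the container state on stdin
    	var state specs.State
    	if err := json.NewDecoder(os.Stdin).Decode(&state); err != nil {
    		fmt.Fprintln(os.Stderr, "no valid state on stdin:", err)
    		os.Exit(1)
    	}
    	fmt.Printf("hook invoked for container %s (pid %d, bundle %s)\n",
    		state.ID, state.Pid, state.Bundle)
    }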
There is also a really difficult issue with cgroup v2.
The kernel itself decided to migrate to cgroup v2 a couple of years ago, but not every controller from cgroup v1 can be migrated to v2 because of hierarchy issues.
systemd also decided to give up several v1 controllers and only support specific ones in v2, and it introduced a hybrid mode to support both v1 and v2.
That was maybe last year, or maybe two years ago. The runtime spec actually does not specify cgroup v2 at all; it relies only on v1. Obviously, runc and other container runtimes
also still rely only on v1. For example, the freezer controller is needed by Docker, but it's only available in v1. The kernel will not support the freezer controller
in v2 in the near future, and systemd will not either. So how to proceed is still under discussion. In my opinion, we should proceed by writing cgroup v2 into the spec and then implementing it in runc and so on, but that will probably take more time.
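As a side note, you can quickly check which cgroup hierarchy a host is running with something like this (a sketch; the paths assume a typical systemd host):

    # cgroup2fs means the unified v2 hierarchy; tmpfs means the v1 layout
    stat -fc %T /sys/fs/cgroup

    # on v1, each controller (freezer, cpu, memory, ...) has its own directory
    ls /sys/fs/cgroup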
and writing that in the RunC and so on, but it will take more time, probably. So that was the OCI level stories. So there are other scope of these issues.
So Kubernetes itself has its own interface for running multiple container runtimes. So it was mainly written for Docker and Rocket, but after that, they tried to convert it to a CRI. So common interface, which is basically a Unix socket,
so you can specify a specific runtime endpoint to the kubelet. So Kubernetes itself has its own specs for this container runtime, and also its validation test CRI tools.
So it has some similarities to OCI specs and validation tools, but not exactly the same. So it's not possible to just run these CRI tools to specify both, validate both CRI and OCI.
Simply speaking, the kubelet talks to a CRI shim over gRPC, with the interface defined in protobuf.
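To give a feeling for its shape, here is an abridged sketch of that interface (not the exact upstream .proto file; only a few of the RPCs are shown):

    service RuntimeService {
        // query the runtime name and version
        rpc Version(VersionRequest) returns (VersionResponse) {}
        // create and start the sandbox (pod) environment
        rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}
        // create a container inside a sandbox, then start it
        rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}
        rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}
    }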
CRI applies only to the communication between the kubelet and the container runtime, not to the lower level, while OCI applies to that lower level. So even though there are some similarities,
for example container configs and Linux namespaces appearing in both CRI and OCI, it's quite hard to make a common layer. So what should we do about it?
One idea was, for example, another layer to consolidate OCI and CRI, but that is too complicated: what's the point of making another layer only for this validation? Too much maintenance burden, I guess. A more flexible way would be, for example,
just to describe how to run integration tests using a specific lower-level runtime. CRI-O, for example, has its own integration tests for doing that. Then we can see that, oh, it's running fine,
so maybe you are certified. That would be similar to the existing Kubernetes conformance tests, but other high-level container runtimes should also be able to support this kind of integration test.
The discussion is on hold right now; you can visit the GitHub issue about it. Okay, I can skip this part, and that's the end of my talk. Thanks, Dongsu. And so we have a couple of minutes left for questions.
So I just wanted to know, since there is the Kubernetes specification and the open container specification,
do you think that, since Kubernetes is so popular, the industry is going to move to the Kubernetes container specification? How do you see this developing? Do you think maybe they have figured out some of the things that the OCI hasn't figured out,
you know? Because if an issue or a discussion is on hold for two or three years, maybe the Kubernetes specification figures it out, and in the end it beats the OCI. What do you think? How do you think it's going to play out in the end?
Good question, but I actually have no idea. Kubernetes has developed its own interface, CRI, and it has been implemented as part of that
low-level container runtime interface story. Probably in the future we could see a case where Kubernetes CRI becomes wider
than what OCI defines. But that's just a guess; I don't know how that will turn out. That's it. So we have time for one more. Otherwise, thanks again, Dongsu.
Thanks.