
Optimizing Containers for Security and Scaling


Formal Metadata

Title
Optimizing Containers for Security and Scaling
Number of Parts
56
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
This talk is about creating minimal containers. The author started to dive into Kubernetes and container security some years ago. Minimizing the size and the attack vectors are just two sides of the same coin. As a reward, you get much faster deployment pipelines, enabling more automated testing and higher scalability. A speed-up by a factor of 10 or 20 is not unusual; sometimes the size of a container shrinks by a factor of 100. - 12factor IX: disposability - bad examples - optimizing the size of a container - building minimal containers from scratch - a small step in a Dockerfile, a big leap for container size - debugging minimal containers - speed up - security measured by Trivy
Transcript: English (auto-generated)
I have been active in Kubernetes for five, six or seven years now, from the very beginning.
I'm doing mostly Kubernetes security and critical infrastructure. I founded two companies, but that is not the topic here. I have several pro bono memberships, for example in AG KRITIS, dealing with critical infrastructure, and I contribute a little bit as a consultant to Gaia-X, which is kind
of European answer to the big hyperscaler clouds. But the main topic here is: how is it evolving? We have DevOps; we have continuous security monitoring, which has been mentioned but not implemented yet in our environment; we have SecDevOps, which appeared around 2016, and also something like security chaos engineering.
And then later came DevSecOps as a buzzword. GitOps is very popular, so we have all these things. Now let's look at how containers and security fit into this picture.
What I normally recommend to my customers in critical infrastructure is not to use the German versions of the documents; there is a very good document by the Cloud Native
Computing Foundation, and I'm not really pro-military, but it's also by the US Department of Defense. They have all the documents you need to implement a DevSecOps process.
So what you see here is they took the DevOps process and added security to every step of it. That's part of the general strategy, add security to every step of your pipeline, which
is hard, but effectively it works sometimes. This is Kubernetes nowadays: we have Kubernetes everywhere. We started in trading, and Pokémon was the first application, but you also have
it in ships, you have it in banks, you have it in healthcare, you have it in transmission grid and electrical systems, you have it even in F-16 airplanes. So all these systems are critical in terms of yes, we need security here, and this
cannot necessarily be connected directly to the cloud. This is my playground, I'm in projects with the European and the German transmission grids, so this is anything above 110 kilovolt, and it's about dispatching the load in the grid
according to the production in power plants and the consumption in households and factories. We have a lot of problems here; it's more or less an IoT system, an Internet of Things,
but it's very old, it has protocols from the 90s, SCADA, and these all must be protected, because here failure of the system on European or German scale is not an option.
So if we want to deploy Kubernetes here, we must care about security. And that's our security model, as you might know it. It's propagated by the German Bundesamt für Sicherheit in der Informationstechnik, the Federal Office for Information Security; the picture is from the University of British Columbia,
but what is this, effectively? We have several zones: a very high security zone, a high security zone, clients and so on. Effectively, this is that kind of security: a medieval castle design, with a moat, two walls, towers, bridges for access,
and by the way, this is Belvoir Castle, which influenced castle design for the following centuries. So this is our security model in information technology today. But Kubernetes is a battleground: everything is moving, and you don't have a clear separation
where you can put a moat or a wall; the pods are coming and going, so we have to make sure everything is secured in a more dynamic way. The closest approach to that is the carrier strike group from the US, sorry for all the
military pictures here, but the carrier is the central thing, and everything else around it is just there to protect the carrier, which is a kind of medieval castle floating
on the ocean. And to get there, what we want to do and what we must do is implement defense in depth. You have several layers of defense, each layer can fail, and the idea behind this
is that as long as at least one layer is able to protect you, the entire system stays secure. Let's look at how deep the security in Kubernetes is, and effectively you see, and I've done a
hack on that, so in the slides there will be a link to another talk where I hacked an OpenShift cluster from the outside, you see the effective depth in Kubernetes is only
three. It could be four, but Kubernetes has a design flaw: everybody is using service account tokens inside containers, which is only necessary if you have an operator running in that container, but it is more or less the default. So what can happen?
First thing is your application; the example here is ImageTragick, but it could also be a JNDI application or anything else reachable from the outside world. This application can have a flaw at any time, because there is Log4Shell in it, or you have
some library which is vulnerable. The next layer, and you are the creator of these images, is the installation of the software. The image creator has their own responsibility, and this is what I want to talk about:
you can get a more secure, smaller and therefore also faster image. The next thing is role bindings in Kubernetes: it's a mess, nobody really understands role-based access control, because it has been designed for machines, and if
you copy some advice from the internet, you might connect the operator which runs this application, via its service account token, to cluster-admin. A cluster admin can start anything in your cluster, even take over the host system,
and in the clouds if you take over the host system, in the host file system you might find JSON files which give you access to the cloud account. So let's talk about your responsibility as a developer to secure your containers.
We have had examples in the past: Log4j was one victim, Spring had something, JIRA had something, and this is the kind of attack pattern you see.
If you can get something in as a parameter that reaches java.lang.Runtime.getRuntime().exec(), you can execute any command from Java. The ImageTragick example was a little bit different: you could hide commands in pictures
in that version, so it could be exploited from the outside. So consider your application insecure as long as there is a java.lang.Runtime exec call in the code, which is probably true all the time. Then you have the nightmare of security: a remote code execution vulnerability.
And defense in depth needs to catch that. With this vulnerability you can execute arbitrary commands in the container from the internet, and you have a connection to the internal network if you don't protect it.
The full exploit is available here; all my examples are on GitHub in my security trainings, so you can read them, and the output is there too, so you can follow what I'm telling
you here. And it has been tested on OpenShift, because this was a challenge: Red Hat always says OpenShift is multi-tenant and more secure than the others. Yes, it's more secure, but only a little bit.
And in standard images you nearly always find a cURL, and with a cURL you can download things as a script kiddie; you don't need sophisticated knowledge about Linux or containers. You can download a kubectl, and with kubectl you can get the service account token and
execute arbitrary commands. Okay, what's in my container? How can I know what's in my container? This was the outcome of the Log4Shell problem. Everybody asked me: are we affected by Log4Shell?
It was last November, I think. And my answer was: yes, if you look into your software bill of materials, you will immediately find the answer. But what's a software bill of materials? We don't have a software bill of materials. The software bill of materials covers all licenses and versions of my software and
is a necessary output of your CI/CD system, because otherwise you cannot give this code to somebody else. Normally you think you only have Apache licenses, but with an Affero General Public
License you could also be obliged to give the code to everybody outside who is using your code as a service on the internet. So please be careful: you need that software bill of materials. Whether you are building proprietary or open source software, you need it anyway.
And it must be produced by everybody who gives software to somebody else. This is a requirement; otherwise you are in big trouble. Especially: I'm in a German government project, and there it is a must.
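As a sketch of how such an SBOM can be produced as a CI step (assumptions: Trivy is on the PATH of the build job, the image name is illustrative, and the CycloneDX flags may differ between Trivy versions):

```shell
# dry-run stub so the sketch runs anywhere; replace the echo with "$@" to execute
run() { echo "$@"; }
# emit a CycloneDX SBOM for the image the pipeline just built
run trivy image --format cyclonedx --output sbom.cdx.json registry.example.com/shop:1.4.2
```

Tools like Syft can produce the same kind of output; the key point from the talk is that the SBOM falls out of the pipeline automatically instead of being compiled by hand.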
Okay, so this is about license compliance: are we legally clean? And of course it is about security: which version of which library do we run, and is it vulnerable or not? And then you see here, this is a scan with Trivy in Harbor of a Tomcat container from last Saturday.
You see these are the criticals, the red ones, but if you look here, all these packages where you have vulnerabilities are not connected to Tomcat.
This is something which is definitely hard for security people: they are completely blinded by red critical vulnerabilities. And effectively Tomcat itself has no vulnerability, as far as known.
So if I scanned it today it could look different, but at the moment this version of Tomcat itself had no critical vulnerability; only the tools around it in the toolchain did. What do we do? We have lots of false positives. We have a Tomcat container, and effectively most Java applications are not interested
in cURL or Python, and not in their security. And the tools themselves are attack tools: it doesn't matter whether I have a cURL or a Python, because if I can execute Python code, I can also download things from the internet.
And we are blinding security. To get this propagated through our CI/CD pipeline, we would have to create a big CVE ignore maintenance list.
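Such an ignore list is, in Trivy's case, a `.trivyignore` file; the entries below are illustrative (CVE-2021-44228 is Log4Shell), and every line has to be justified and revisited by hand:

```
# .trivyignore: one suppressed CVE per line, Trivy skips these findings
# Log4Shell, suppressed here only as an example ("not reachable from this image")
CVE-2021-44228
# illustrative placeholder entry; real lists grow to hundreds of lines
CVE-2022-0001
```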
This is effort for who knows how many full-time security people, and it is a waste of time. The answer was (I have been in an open source project where we had a similar problem): just do the user hardening and kick out everything which is not needed.
How can we know what is not needed? It can be done by a script at build time. So the first answer is: use a positive list of files. This is a hardening script, simply using Alpine as the original, Nginx based on Alpine
as the original application, and then you only need that: the Nginx itself, some files in /etc, some logging, some PID files. Why do we need PID files?
Because we are in a container and the PID is 1 anyway, so it's not really clean. We might need a cache; we need an /etc/passwd, because Nginx must look up that it is running as user 110 or something like that, so it needs /etc/passwd and /etc/group; it needs
/usr/share/nginx with some libraries; you need the licenses, /var/run, /var/lock, /var/cache, and so on. And that is all you need to run an Nginx. This has doubled, I don't want to confuse you, no, sorry.
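The positive-list idea can be sketched as a small shell script; this is not the speaker's actual hardening script, and the nginx-specific entries are only indicated in a comment. It copies one binary plus the shared libraries `ldd` reports into an empty root directory:

```shell
#!/bin/sh
# Sketch: build a minimal root containing one binary and its linked libraries.
set -eu
BIN=${1:-/bin/ls}            # any dynamically linked binary; nginx in the talk
NEWROOT=${2:-/tmp/newroot}

mkdir -p "$NEWROOT"
cp --parents "$BIN" "$NEWROOT"

# ldd prints the shared libraries the binary is linked against;
# keep every field that looks like an absolute path and copy it over
ldd "$BIN" | awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i }' |
while IFS= read -r lib; do
    cp --parents "$lib" "$NEWROOT"
done

# for nginx, the positive list continues here: /etc/passwd, /etc/group,
# /usr/share/nginx, /var/run, /var/lock, /var/cache, the licenses, ...
ls -R "$NEWROOT"
```

The resulting directory is exactly the payload for the from-scratch container described next.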
The next example, some cut and paste has failed, so effectively what you create
next is a container from scratch: you take the output of the hardening script from that container, put it into your container from scratch, and run only the Nginx without anything else.
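That step can be sketched as a multi-stage Dockerfile; the script name and paths are illustrative, not the speaker's actual build:

```dockerfile
# stage 1: a full image where the hardening script can run as root
FROM nginx:alpine AS build
COPY harden.sh /harden.sh
# harden.sh collects the positive list of files plus linked libraries into /newroot
RUN sh /harden.sh /newroot

# stage 2: an empty image; only the collected files are copied in
FROM scratch
COPY --from=build /newroot /
USER 101
ENTRYPOINT ["/usr/sbin/nginx", "-g", "daemon off;"]
```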
So we have no shell, no BusyBox (Alpine is normally full of BusyBox), and this means the container is much smaller; I will tell you how much smaller. Effectively you only have the binary you want. What you also see here is the -D option, which means: please look up the linked libraries,
so it also attaches the shared libraries which are needed to run this binary. Another example, not as trivial, because I'm in a project where many people use .NET.
.NET runs best on Linux; we tried, and we can really do the same. We use the original example from Microsoft, the ASP.NET app, and add our script from the internet (by the way, never do this, now you are running my code in your build system, but okay, I don't
care because it's my code). Then you need to be root at build time, you can apply the hardening script, some obscure magic with "do I need libssl and another library?",
and then you can do the same trick: you create a container from scratch, and the entry point is the ASP.NET application. So this works.
ASP.NET is harder because there is a lot of magic in the DLL dependencies; I totally hate this and did not go deeper into it, but if you have knowledge about .NET hardening, you can kick out even more of the original Microsoft DLLs.
Now let's look at the sizes: nginx:latest is 142 MB, on Alpine it's 22.6 MB, and the hardened version is only 8 MB. So if we compare latest with our hardened version,
it's 17 times smaller, and with the ASP.NET example we at least optimized by a factor of 2. This means that with this step in our deployment pipeline, the volume of a container shrinks by a factor of 10 or more, which is a big step towards creating
faster pipelines. If you look at what we have lost: we cannot debug anymore. There is no shell in the container, I cannot go into the container, I cannot do a kubectl exec
and get a Bourne shell; this is the price we have to pay. This is a hack, which might be helpful for you if you have everything under control. The more structured approach is using a tool like apko, from the people around Alpine, but
at the moment it only really works for Alpine. What they have done is change the package format of the internal APK packages to OCI. OCI is the official container image format from the Linux Foundation, so they are using container images as packages
in their distribution, which is a brilliant idea, and they changed the way they create these images in general. The result is that you now have sub-second image builds;
a Docker build normally takes 20 seconds, minutes, or even longer. And this means you don't even need a registry, because you can create these images on the fly if you want.
This is a typical apko build file. You say: I want to have this repository, I want to have these packages, that's the entry point, and here you see the environment, and that's all. So this is building Docker or OCI images in seconds.
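An apko build file is a small YAML document along these lines; the exact schema may differ between apko versions, and the repository URL and package list are illustrative:

```yaml
contents:
  repositories:
    - https://dl-cdn.alpinelinux.org/alpine/edge/main
  packages:
    - alpine-baselayout
    - nginx
entrypoint:
  command: /usr/sbin/nginx -g "daemon off;"
environment:
  PATH: /usr/sbin:/usr/bin:/sbin:/bin
```

A command shaped like `apko build nginx.yaml nginx:test nginx.tar` (an assumption about current apko) then produces an OCI image tarball in seconds.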
And now we are at a point where we can reconsider the entire way we run our CI/CD pipelines and integrate this into the steps, because what they also did is implement signing.
Signing for containers, signing for certain programming-language packages (Python is already on the list), signing for images and signing for Helm charts. So you can sign every artifact in your environment, and you can configure your pipeline
so that only signed code can pass. Code, artifacts and configurations must be signed now; there is no way of adding something from the side. You can then check if
everything is correct. For example, the container runtimes can do some checks on signatures. In Kubernetes you normally run something like CRI-O, and CRI-O checks, outside of Kubernetes, whether the image is correctly signed.
You can check whether your Git commits are signed. Harbor, the image registry, can be configured to pass only signed images to your environment, and CI/CD tools like Flux and Argo CD can be configured to propagate only signed Helm charts and signed
configurations. This means you have not only GitOps, you have a signed version of GitOps, and you can now create continuous security monitoring: you know that your cluster only runs images and configurations which come out of your deployment pipeline,
and your deployment pipeline only passes signed configurations and images. And this means you are a step closer to controlling and securing your images.
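Signing and verification at this stage can be sketched with Sigstore's Cosign; assumptions: a key pair created with `cosign generate-key-pair`, an illustrative registry path, and a dry-run stub so the sketch runs anywhere:

```shell
run() { echo "$@"; }   # dry-run stub; replace the echo with "$@" to execute
# sign the image after the build, verify it before deployment
run cosign sign --key cosign.key registry.example.com/shop:1.4.2
run cosign verify --key cosign.pub registry.example.com/shop:1.4.2
```

The verify step is what an admission controller or the registry can enforce, so unsigned artifacts never reach the cluster.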
So how is the adoption? Cosign is a tool to sign, and you can already use it in GitHub, in GitLab or in Kubernetes. And the entire philosophy has been described as SLSA, Supply-chain Levels for Software Artifacts. You can define
the level on which you want to have security in your supply chain. In critical infrastructure projects you can simply say: I want only signed artifacts and configurations. In legacy systems you can step down a little bit and say: we allow
certain things which are not as secure. But if you set up a new project or a new cluster, or you are in critical infrastructure, you can definitely insist on signing everything. This is a big step forward for supply chain security, and now you are at a stage where you know
what you are running, and you can even prove it, because you can show that all your configurations require signing. So this is the result: if you run this build
process, you get this. And as a side effect, you can also generate a software bill of materials, to be compliant and secure. So, conclusion: hardening is possible. I have shown you how
to do it with a script. As a systematic approach, apko is knocking at the door; it's not ready yet, and it might be that you need other base images than Alpine. Alpine is going very far at
the moment, but they are also implementing it for Debian. It extremely reduces the size and the attack vector of an image, you get SBOMs for free, and continuous security monitoring becomes possible, and it is easy. That's all I wanted to tell you. If you have
questions, here are the links. They will all be in the PDF, and if you need more information on that or other things, or if I should hack systems for you, then just contact me.
So feel free to ask questions now. You will probably hate me, but have you tried
already to harden Java applications? Java is a little bit different. What I would try is to apply something like GraalVM or Quarkus to the binaries and get statically linked
Java executables, and then you can have only those executables in the container. It should be possible. I've tried Quarkus; there is no reason why you should not do it this way, but it takes time in your CI/CD pipeline. The Quarkus build step
for a hello world is now 10 minutes on a really fast laptop, so this is something developers probably don't like as much, but in the end it's possible, just a little bit different. Thank you. More questions? Thanks for your presentation.
I would like to ask: what is your experience with hardening systems that collect logs from the clusters? Collect logs from the cluster? Especially for debugging,
and also in production, of course. If you want logs, you need Fluentd or something like it in Kubernetes, and what I'm thinking about is whether it would make sense, for example, to add hashes to every line. Normally, collecting logs is completely
okay, you can use Fluentd, but to prove that you haven't removed any line of the logs you would need a chained hash list, chained hashes: not a blockchain,
but simply a ledger, and then you could prove it. But I've not seen it so far. I would like to do it for audit logs, because audit logs have the same problem. It should be done,
but that's a little bit of a different approach. It should work, if you can imagine signing, or hashing and chaining, log entries. Thank you. More questions? Yes. Thank you very much. Maybe you have some experience to share: how do you sell
developers on containers without bash or a terminal or something like that? How do you convince developers that it's worth it to harden while they cannot debug?
I'm not convincing them: security is demanding that, and then I give the developer advice on how to do it as smoothly as possible. The entire security must be as smooth as possible, because developers already have 10 or even 100 times more code on their disks, so
blocking developers with security is not an option. It must be as easy as possible, and for a long time you can work with two versions of your container: the debuggable one, and then, okay, we don't need to debug anymore, we use the hardened version, and you turn it on. Effectively, it's not really blocking you, and if you have the
guarantee that everything in your cluster runs as expected, it might even have value for developers. But debugging in a hardened container might be very hard.
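One hedged workaround for that missing shell is Kubernetes ephemeral debug containers via `kubectl debug` (available in recent Kubernetes versions); the pod and container names here are illustrative, and the dry-run stub keeps the sketch runnable anywhere:

```shell
run() { echo "$@"; }   # dry-run stub; replace the echo with "$@" to execute
# attach a throwaway busybox that shares the process namespace of the
# hardened container "app", giving you a shell next to the shell-less image
run kubectl debug -it shop-pod --image=busybox:1.36 --target=app
```

The hardened image stays shell-free in production, and tooling is only attached on demand while debugging.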
Are there any tools to scan a hardened container for vulnerabilities? You can use the same tools, like Trivy, and normally you don't see anything in them.
I'm not absolutely sure whether Trivy is doing it right, but it looks like it, because it detects vulnerabilities which are not in the metadata. Many scanners, like Clair, only look at metadata in the container, but Trivy also detects vulnerable
libraries in static Go applications. So they are doing more than simple metadata scanning, and I would expect Trivy is the tool which can be used for that. Thank you. Any more questions?
Thank you for your talk, it was very interesting. I'm wondering if you could tell us:
what is the impact of the programming language on security? Statically typed languages versus other languages, for example? So yeah, the impact is that every serious application pulls in a lot of libraries,
and there might be a vulnerability in a library. With Log4Shell it was not the fault, not even the fault, of Log4j itself; it was in JNDI, and it was effectively not a bug, it was a feature, but everybody had forgotten the feature. And you see the same in nearly every language.
You see it in Python, where you have typosquatting attacks. So I would say every language is susceptible to this kind of attack to a certain level. But unfortunately you need a different strategy to protect each language, because the library organization is different in every
language. For Python, you need to scan for typosquatting attacks; for Java, you have to integrate something into the Maven build, and the same for Ruby. And I would even say for Go, if you create a static binary, it's the same as for Node.js with its download of the entire internet, except that the step is done at
compile time. And especially the Node.js developers are very aware now, because they had several issues, and you can scan everything. In my opinion, every language is vulnerable
to a certain degree, and you have to adapt the strategy. More questions? Otherwise,
I will be around until the end of the conference. Thank you very much, Thomas Fricke.