
Operate First community cloud

Formal metadata

Title
Operate First community cloud
Subtitle
A blueprint for a sovereign cloud?
Series title
Number of parts
542
Author
License
CC Attribution 2.0 Belgium:
You may use and modify the work or content for any legal purpose, and reproduce, distribute, and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify.
Identifiers
Publisher
Release year
Language

Content metadata

Subject area
Genre
Abstract
Open source has become the defining way of developing software. But how do we open-source the operation of software? The Operate First Community Cloud is a peer-to-peer mentoring environment for running software in production, as well as a community for Cloud Native SREs to share knowledge about production practices. It uses the same community-building process as open source projects, but extended to ops procedures and data. Experienced SREs find an outlet for sharing their knowledge, and new talent gets a chance to grow into an SRE role and get their hands on cloud-native projects in a production environment. Could this be a blueprint for a sovereign cloud? We'll discuss how opening operations can free up a cloud deployment and lead to the same independence that open source brought to software.
Transcript: English (automatically generated)
OK. Thank you for showing up in person, and so many of you. It's actually my first time at FOSDEM, and I think it's super exciting to see so many people
coming back to conferences, and at such a crowded conference. The previous talks were completely full. Now it's about half full or half empty, depending on how you look at it. And even for such supposedly boring topics
like a sovereign cloud. I mean, that immediately sparks associations with the state and GDPR, and all the cookie banners that you have to click away. So it sounds boring at first, but I think there's also some value in it. And I think there's a journey where
open source can help bring a sovereign cloud to life. Let's look at some aspects of it. This is me. I go by the name Durandom on GitHub and on social media. Three days ago, I changed roles, so now I'm in sales again. Yikes. As a managed OpenShift black belt.
So I'm still looking a little bit at the cloud topic. OpenShift is like a cloud on a cloud. Before that, I worked for quite some time on AI, on AIOps, in the office of the CTO. And the last thing that I did, for two years now, is reimagining or revisiting open source
in the age of cloud and seeing how open source principles can also apply to operations. So we're going to look at the Operate First community cloud. Why do we need a community of practice
around operations? What does this community cloud look like? And also, where is it? It's not really a hands-on talk, but you can take things away. If the Wi-Fi works for you,
you can log into the cloud right now. And I hope to see you at some of the meetups or in the community after that, because it's really open to anybody who wants to learn something about operations or wants to teach something about operations. So when I first heard the term sovereign cloud,
like I said, it made me think of the sovereign, the king who has now also occupied the cloud. I put it into my favorite search engine, and it immediately came up with a lot of definitions of sovereign cloud. There was one from VMware, some from telecom,
and they all looked at different aspects. And these days, unless you're living under a rock, everybody's talking about ChatGPT. So I thought, let's talk to this AI, which has already read all the definitions for me, and ask it about sovereign cloud. This is just the end of my conversation. I wanted to highlight the differences
between the noun and the adjective sovereign. The noun sovereign refers to a person or entity that holds supreme power and authority, while the adjective sovereign describes something that is supreme or superior in rank. That still doesn't sound really friendly to me.
Do I really want something that holds supreme power over me? And why should I care about this then? But there is also a notion of independence in that adjective, and that's what I always had in mind when thinking about it a little bit.
And ChatGPT came up with this: there's a notion of independence in the adjective. Sovereign means to be independent, not subject to control by any other person or entity. If you think about it, that is also implied: if you have supreme power, then you can also move away.
By having the highest degree of power and authority, the term emphasizes the idea of self-governance and supreme power within a given context. And the context here is cloud. So when I look at sovereign cloud, at least in my small world view,
it means I have the power to move away, I have the power to control stuff, and I have the largest possible amount of independence. And that seems at first to contradict the business model that we saw in previous talks. A nice definition of cloud is:
I'm running stuff on somebody else's computers. That doesn't sound like a lot of freedom, because I have some lock-in. But actually, open source showed a path away from lock-in. So I think it's important that we apply these open source principles
also to operations. And if you look at a cloud these days, is it really open? I mean, it's built on open source software. You get your RDS or some other product, and underneath, yes, it's running MySQL, it's running Elastic, it's running all that open source stuff,
but you're still tied to the experience that the cloud provider imposes on you. If you want to rebuild that with open source solutions, you can do it. Well, it looks pretty complicated.
You need to master a lot of these technologies and stitch them together. And there's a reason why people defer to the cloud: they are interested in the workload. They just want to swipe their credit card, consume, and build the application. But I think the last speaker put it really,
or the previous speaker put it really nicely: lock-in can be defined as the product of cost and the likelihood that something is going away. So you have to deal with that stuff. But open source somehow, if you go to this slide here,
open source actually showed a way out of this. The left side of the funnel here is the traditional open source funnel, as in the software contribution funnel, which we have all known for decades and which we all love.
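The previous speaker's definition of lock-in, quoted a moment ago as the product of cost and the likelihood that something goes away, can be sketched in a few lines of Python. This is only an illustration: the function name, the parameter names, and the numbers are assumptions, not figures from the talk.

```python
def lock_in_risk(switching_cost: float, probability_of_loss: float) -> float:
    """Lock-in exposure as cost of moving away times the chance you have to."""
    if not 0.0 <= probability_of_loss <= 1.0:
        raise ValueError("probability_of_loss must be between 0 and 1")
    return switching_cost * probability_of_loss


# Hypothetical numbers, for illustration only: the same chance that a
# service goes away, but very different costs of moving off it.
proprietary_service = lock_in_risk(switching_cost=200_000, probability_of_loss=0.05)
open_stack = lock_in_risk(switching_cost=20_000, probability_of_loss=0.05)

print(proprietary_service > open_stack)
```

Under this reading, lowering either factor reduces lock-in, which is where openly shared operational knowledge could help by lowering the cost of moving.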
So you find a project and you use it. There might be 100 users of it, and at some point, something breaks. So you might file an issue. Great, you have already contributed because you filed that issue.
And then eventually somebody fixes it, or maybe you fix it yourself. So there's really a funnel: 100 users, then 10% reporting issues and making up the community, and 1% actively contributing to the project. If I'm using something as a service, I'm essentially truncating this funnel.
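The funnel arithmetic can be written down as a small Python sketch. The 10% and 1% rates are the ones mentioned in the talk; the function name, the `as_a_service` flag, and the choice to model a consumed service as having zero active contributors are assumptions made for illustration.

```python
def contribution_funnel(users: int, as_a_service: bool = False) -> dict[str, int]:
    """Estimate funnel stages from the rough rates quoted in the talk."""
    reporters = round(users * 0.10)      # about 10% report issues
    contributors = round(users * 0.01)   # about 1% actively contribute
    if as_a_service:
        # Assumption: behind an API, users can file support cases
        # but cannot contribute fixes to the running service.
        contributors = 0
    return {
        "users": users,
        "issue_reporters": reporters,
        "active_contributors": contributors,
    }


print(contribution_funnel(100))
print(contribution_funnel(100, as_a_service=True))
```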
I'm stopping at the API layer. I might contribute to the underlying open source software that runs this service, but in terms of contribution, I'm usually stuck with maybe filing a support case,
and maybe the provider comes back to me. But I have no way to actively contribute and maybe fix that API outage. And maybe I'm the only person having that problem, so the cloud provider doesn't even care about it.
This was what our team had in mind when thinking about open source in the age of cloud, where there's apparently more value in running and providing the software than in the software code itself,
or at least they're on an equal scale. And as we see with many enterprise distributions or business models, you can get the source code of that database or that service, but you don't get the,
sometimes you don't even get the build tools. You don't get the tools that actually operate the service, the SLIs, the metrics that you need to run it, the runbooks, et cetera. So every deployment is either behind a paywall, because that's the differentiating factor for that company,
or you have to learn it yourself. And it's actually quite hard to open something up, also because of legal constraints, right? You might have PII, personally identifiable information, in there.
You have logs, so you need to make sure that you don't expose any secrets. So it's a delicate balance, and that's why most companies or most projects default to closed. And even for communities that run their own infrastructure, like the Fedora infrastructure,
that's somewhat open, but you still need to jump through a lot of hoops to contribute and to do something. So it's not really open by default, and it's also not meant to be a blueprint for something. Only 10 minutes left, but I think I can go to the next part
of this presentation. So this is the concept, right? We need to shift left, we need to open up operations and practice this in the open, so that we build up a community and don't have to build our operational deployments from scratch. And while that is the concept of this Operate First idea,
we also thought you need to have something physical, something hands-on where people can actually contribute, because otherwise it would be just a talk show. Somebody needs to lead the way and implement that stuff. So we tried to build a hybrid cloud with full visibility into the operations center.
And hybrid cloud these days is, for a lot of people, Kubernetes. So we have two bare-metal Kubernetes clusters running at Boston University with 34 nodes and 1,200 cores,
so it's not a small setup. Then there's one larger cluster running in AWS from the OS-Climate project, which is also managed with these Operate First community cloud ideas. And we also work with a German superscaler, which is the layer below a hyperscaler.
IONOS donated some hardware, and we deployed some clusters there as well. So my vision actually is to have a really resilient, distributed cloud setup operated under these principles
at as many hardware or cloud providers as possible. 626 individuals have logged into these clusters, and there are about 200 namespaces.
So we do a lot of stuff, and most of it is happening on GitHub. We have 150 people in the Operate First community. There are something like 1,000 issues filed. In terms of diversity, since it's a project bootstrapped by Red Hat,
about one third to one half of the people are Red Hat employees, but there are also a lot of contributions from American universities, and a lot of open source projects are already contributing. This is just a highlight of some of the more noteworthy projects using this infrastructure,
like OKD, the upstream of OpenShift, or OS-Climate, or the Janus IDP, which is a project for some Backstage plugins. Backstage is currently one of the more hyped tools
for a developer portal, by Spotify. These are some of the services that are running there, the usual stuff that you would expect from a cloud setup.
We have Argo CD for doing GitOps. We have Grafana for monitoring. We have Tekton pipelines for building things. There's a Prow instance running for doing CI/CD, and a lot of other things. And that's all deployed by the community in a GitHub repository where you can integrate
with these other services. And I think that's where the actual value comes from. So let's get real. We love hands-on-keyboard. And as I said, it's all done on GitHub. We really follow a GitOps, SRE, no, a Git-first approach.
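The Git-first idea, where the repository holds the desired state and tooling reconciles the cluster against it, can be illustrated with a minimal Python sketch. This is not how Argo CD is implemented; it only shows the general comparison of desired state against live state. The service names and fields are hypothetical.

```python
def diff_state(desired: dict[str, dict], live: dict[str, dict]) -> dict[str, list[str]]:
    """Compare desired state (from Git) with live state (from the cluster)."""
    create = sorted(desired.keys() - live.keys())
    delete = sorted(live.keys() - desired.keys())
    update = sorted(
        name
        for name in desired.keys() & live.keys()
        if desired[name] != live[name]
    )
    return {"create": create, "update": update, "delete": delete}


# Hypothetical example: what Git declares versus what is actually running.
desired = {"grafana": {"replicas": 2}, "tekton-pipelines": {"replicas": 1}}
live = {"grafana": {"replicas": 1}, "old-service": {"replicas": 1}}

print(diff_state(desired, live))
```

Because the desired state lives in a public repository, anyone can propose a change with a pull request, which is what makes this style of operations open to contribution.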
The current entry point for you would be going to operate-first.cloud, and then, after clicking through some hoops, you end up at service-catalog.operate-first.cloud, which is a Backstage instance where we, for one,
showcase the services. You go to the catalog, you see all these services with all their dependencies, and you see all the managed clusters there. You click on one of these clusters, and you are presented with a single sign-on login screen.
If you choose the second option, operate-first, you can log in with your GitHub account, so it authenticates against GitHub. And without even signing up for an account, you get a read-only view of the cluster, which is pretty awesome. So you see how these services are being deployed
and how other community services are being deployed. So you get a hello-world example, a live hello-world example, of a full production cloud environment like the one you would see at your site,
either for your project or for your customer or whatever. And we documented how and why we came to certain decisions. In this case, it's application monitoring, and there's also how to store credentials in a cloud. These are the questions that you will face
if you're setting up your own local cloud. We documented these to bootstrap other deployments, or so that people can contribute back, so that you don't have to read through so many blog posts and so much documentation to make your own choice.
There are some dashboards here, and these are the dashboards that we, or rather the community, use for troubleshooting. You would see Kafka or Open Data Hub and Prometheus, live dashboards, and here's one dashboard for our clusters.
And that's the GitHub org where you would start talking to us, or to the community. The main entry repository is the support repository. You can ask questions, or you can start with one of these templates,
and one of the coolest templates, or processes, is onboarding to a cluster. You get a form. It looks like a form, but it's a GitHub template. You choose which cluster you want to be onboarded to and the team name, and then we have some automation in place
that automatically creates a pull request against our GitOps repository. Someone who is part of the operations team only has to say "looks good to me", and they are onboarded. That also gives you some sense of how you could automate your own local cloud deployments.
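The onboarding flow just described, from form to pull request, might be sketched roughly as follows in Python. This is not the project's actual automation. The field labels, the file path layout, and the label on the namespace are hypothetical, and the parser assumes the form arrives as `### Label` headings followed by the value, which is how GitHub issue forms typically appear in an issue body.

```python
import json


def parse_issue_form(body: str) -> dict[str, str]:
    """Collect the text under each '### Label' heading of an issue body."""
    fields: dict[str, str] = {}
    label = None
    for line in body.splitlines():
        if line.startswith("### "):
            label = line[4:].strip()
            fields[label] = ""
        elif label is not None and line.strip():
            fields[label] = (fields[label] + "\n" + line.strip()).strip()
    return fields


def build_onboarding_change(fields: dict[str, str]) -> dict[str, str]:
    """Turn form answers into the file a pull request would add."""
    cluster = fields["Target cluster"]  # hypothetical field label
    team = fields["Team name"]          # hypothetical field label
    manifest = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": team, "labels": {"onboarded-via": "issue-form"}},
    }
    return {
        "branch": f"onboard-{team}",
        # Hypothetical repository layout, not the real GitOps repo structure.
        "path": f"clusters/{cluster}/namespaces/{team}.json",
        "content": json.dumps(manifest, indent=2),
    }


issue_body = "### Target cluster\n\nexample-cluster\n\n### Team name\n\ndemo-team\n"
change = build_onboarding_change(parse_issue_form(issue_body))
print(change["path"])
```

The human review step stays in place: the automation only proposes the change, and a member of the operations team approves the pull request.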
You don't need to do it that way, but it's a way to bootstrap you. And there are a lot of other issues going on, and as I said, it's a community, so things will eventually also break.
Right now, we have problems with our object storage. It's broken. If you are an expert in NooBaa or in object storage, and you want to get your hands dirty rebuilding some of that stuff, this is the issue. Tomito here, who will give another awesome talk at the end of this track,
left some comments on how to get started. Nobody has worked on it yet, so it's up for grabs. Thank you. One question. Thanks, Marcel.
Question? Oh. Hey, you provided a good definition of sovereign cloud. Who are the customers for a sovereign cloud? I don't know, to be honest. I'm looking at it really from a technical perspective,
and I think my key takeaway is that everybody who wants to build their own cloud probably wants to be sovereign in running their cloud, and then you have to focus on stuff like minimizing vendor lock-in and being able to move to another cloud provider or move your data across clouds.
Just to jump onto the question: Elia, the transmission system operator, is one of the customers that would like a sovereign cloud, because of the resiliency that we need to have, because we run critical infrastructure. I think that was a statement, not a question, right?
All right, take out your smartphone, snap that QR code, and there's a biweekly community meetup where you can meet all the wonderful people who are involved in this community.