Intro to Ceph on Kubernetes using Rook
Formal Metadata
Title: Intro to Ceph on Kubernetes using Rook
Number of Parts: 542
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/61648 (DOI)
Transcript (English, auto-generated)
00:05
So, everyone, now we'll hear about Rook. Let's welcome Alex and Gaurav. Hi, everyone. Hope you're doing well,
00:21
and not feeling too sleepy after the lunch break. We're here to talk about an introduction to Ceph on Kubernetes using Rook. Here's Alexander; he'll introduce himself in a moment. I'm Gaurav, Cloud Storage Engineer at Koor Technologies, and I'm also a Community Ambassador
00:40
for the Ceph project from the Indian region. I've been working with the Ceph and Rook projects for a long time and now contribute to the Rook project. Alexander. Hi, I'm Alexander Trost, Founding Engineer at Koor Technologies. I'm a maintainer of the Rook project as well,
01:01
and we want to talk about Rook for everyone who doesn't know it. I want to get you started with storage. Who doesn't need fast, reliable storage nowadays with cloud-native applications? We're obviously talking about somewhat more performant storage, I guess, depending on who you're asking.
01:24
Well, the point of Rook in the end is that, with Kubernetes being kind of like the container ship here, you have your Kubernetes that abstracts everything,
01:43
tries to, well, provide you just one API for most, if not all, things, depending on how far you want to go with it. And I guess for most people running Kubernetes, it kind of looks like this: you have a big giant ship running your production applications, and you have your automation and CI/CD processes that kind of
02:01
just try to keep it running. And that's where the question of storage comes into the frame for more and more people, especially with local storage, which already, I think a year or two ago,
02:20
started being better supported in Kubernetes in a native way, rather than just having things around Kubernetes trying to make it an easier endeavor. We have the question of how I can, for example, get my storage talking with Kubernetes
02:41
so that I have storage for my applications. And well, there's a fairly simple answer nowadays: there is mainly one interface there, called CSI, the Container Storage Interface, which for storage vendors basically means
03:00
they only need to implement one interface, and likewise for Kubernetes, slash you as a user, you have one interface, one way to get storage. For example, if you want storage on Kubernetes, you use persistent volume claims.
03:20
We basically, from an application perspective, claim storage, and Kubernetes will take care of, for example, talking with the Ceph storage and provisioning the volume. Subsequently, the Ceph CSI driver will take care of mapping the volume and mounting the volume,
03:41
so that it's completely transparent to your application. And the whole thing with the CSI interface there is that it's just one way for any storage vendor to also get their storage running. There's obviously more than Ceph, but, well, with Rook Ceph we're obviously going to focus on Ceph here.
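To make that concrete, a persistent volume claim for Ceph block storage looks roughly like this (a sketch; the storage class name rook-ceph-block is just the one used in Rook's example manifests, yours may differ):

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: my-app-data
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: rook-ceph-block   # a storage class backed by the Ceph RBD CSI driver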
04:01
And that's exactly the kind of connector in between there. So if you want storage, it doesn't really matter whether it's Ceph or something else. The point for Ceph, though, is that we have Ceph CSI as the connecting bit between Kubernetes, the application container side, and your storage. And that's already the point where Rook comes in,
04:22
we're starting to talk about Rook here: you can run your Ceph storage cluster, well, on almost any hardware. We could run it on a Raspberry Pi as well, right? Yes. Easily. Well, I think I've heard of people running it on
04:42
some Android phones or something even as well, but it's like, well, you know, just because you can doesn't necessarily mean you should, but that's a whole other discussion. The point being, you can technically have your Ceph storage anywhere. So it doesn't really matter if it's, well, if it's on the metal in your own data center,
05:03
or if it's just a few laptops thrown together; it doesn't matter, and that's the thing with Ceph in general. You don't need the best hardware. You don't need to buy that big box from the one storage hardware vendor, or to explicitly go in that direction, to have storage.
05:21
And that's where the combination of using Kubernetes and Ceph might come into play, not just for having storage for your applications, but also as a point for, how should I put it,
05:43
for running Ceph. That's what Rook is about. It's about running Ceph, and obviously the connecting part, setting that connection up between Ceph and Kubernetes as well. But the idea is that Rook runs Ceph in Kubernetes,
06:02
in containers. I think I mainly saw that from cephadm the last time we deployed a cluster on bare metal directly; cephadm is one other way, maybe to put it like that, to install, deploy and even configure Ceph. Easily manage it. It's easily manageable. It's one way to just install and run it.
06:22
It's kind of the same point for Rook, where Rook is basically a Ceph operator for Kubernetes. I'm going to go a little bit more into what an operator does, because that's one of the vital points in general for running certain applications on Kubernetes.
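For reference, getting the operator itself running usually means applying the manifests from the Rook repository, roughly like this (a sketch based on the upstream examples; file locations and the Helm chart alternative vary between releases):

    kubectl create -f crds.yaml
    kubectl create -f common.yaml
    kubectl create -f operator.yaml
    kubectl create -f cluster.yaml   # the CephCluster definition discussed below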
06:41
And again, as we had it, running Ceph on Kubernetes means that, with the operator pattern we have in Kubernetes, we can easily handle most of the things that cause quite some pain, depending on how big you scale your storage cluster.
07:01
Deployment, bootstrapping, configuration, upgrades and everything. Those are all processes for which there are probably five million Ansible playbooks to install Ceph. There's obviously cephadm. There's, what was it called? ceph-deploy was there earlier, which has been superseded by cephadm now, I think, right?
07:22
ceph-deploy came earlier; cephadm is now the more advanced, I mean, the latest tool that everyone is using these days. And there are more; I can think of five more tools for deploying Ceph off the top of my head. And ironically, for the people
07:40
that have looked into Kubernetes a bit more already, it's kind of the same story for deploying Kubernetes. But because of Kubernetes being kind of like this abstraction layer on top of hardware to some degree, not like abstracting everything away,
08:00
but let me skip this slide. It allows the Rook operator, and that's exactly where this image comes in, to orchestrate a cluster. It's not just a deployment, obviously; it's about using the Kubernetes APIs to easily take care of everything, so to say.
08:23
You want to add a new node to your storage cluster. What do you do, technically speaking? You just add it to Kubernetes. And well, if everything goes well, ten seconds later the operator will be like, oh, new node, gotta do my job, run the prepare job and everything,
08:41
get the node ready. And a few seconds later still, the new Ceph components, the OSDs on the disks, depending on what disks there are obviously, are taken care of. And, to make this full circle with the Kubernetes side, that is what the operator pattern,
09:00
as it's mostly called, is about. It's about observing. The operator is observing a status, or in Kubernetes' case, custom resources. These are just, think of them as YAML. Let's just give it a definition: a YAML object in Kubernetes,
09:21
which the operator can watch. I as a user either make a change, or my automated CI/CD process makes a change to it, like, oh, a new node has been added or something, or I want to tweak something in the configuration of the cluster.
09:46
So the operator is observing that. And when there's a change, or even when there's a change in the Kubernetes cluster itself, like a node missing or something, it analyzes that change. For example, if a node is gone,
10:01
or it's just not ready in Kubernetes terms anymore, let's say network outage for like two of your nodes, your operator would analyze, well, observe it first of all, analyze that and start acting upon that. For example, in Kubernetes terms, it would take care of setting certain so-called,
10:22
just to have that term out there, pod disruption budgets, which try to prevent the remaining components of the Ceph storage cluster on other nodes from being stopped as well. The main point is really that it's this observe, analyze, act kind of loop, because in the end it just repeats itself all over again.
10:42
That's the whole deal with Kubernetes operators. Again, for, I guess, the people already more into Ceph: if you want to scale up to some more Ceph monitors, or well, Ceph mons, you just edit the object in the Kubernetes API
11:02
and just crank the mon count from three to five or something. And again, this change is detected by the operator, which analyzes it and acts upon it. And that makes it quite convenient as well. Here, for example, we have the YAML. Sorry, I have to turn around here for a little bit of clarification;
11:21
I don't have it mirrored on my screen, so it's a bit hard. But that's exactly the YAML that we talked about. As an example, I have my cluster running, and let's say there's a new Ceph release. What I would need to do to upgrade my cluster:
11:40
I would basically just go ahead and change the image to be, well, not 17.2.3 but, let's say, 17.2.5, that's the latest. As an example, yeah, 17.2.5 as an example. And again, the operator would detect it, analyze whether every component is up to date or not, and then even start a, well, I don't wanna say complicated upgrade process,
12:01
but especially with something like Ceph, there's more to it than just, ah, let me just restart it. There are checks before every component is restarted, Ceph-native ways; it's basically the "ok-to-stop" commands, they're literally called like that. And that's the whole idea there, that the operator helps you with that
12:20
and in the end just fully takes care of it, so that for the main part of your work you can just sit back, change it in the YAML, and in a few minutes, or it can even be hours depending on how big the cluster is, the operator will take care of that.
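As a rough sketch of what that YAML looks like (field names follow Rook's CephCluster CRD; the image tag is only an example):

    apiVersion: ceph.rook.io/v1
    kind: CephCluster
    metadata:
      name: rook-ceph
      namespace: rook-ceph
    spec:
      cephVersion:
        image: quay.io/ceph/ceph:v17.2.5   # bump this tag and the operator performs the upgrade
      mon:
        count: 3                           # the monitor count mentioned below
      dataDirHostPath: /var/lib/rook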
12:41
As I mentioned before, like the example with the monitor count: if you wanna change that, change it. A few seconds later, the operator will pick it up and start making the changes necessary. Or even if you want to scale it down from, like, five to three, or three to one, which is not recommended, we need high availability there.
13:00
Or, another option, the operator again watches it and takes care of doing it. Or even if you want to specifically say: on this one node, please use this one device, or even, for this disk or NVMe, for example, run more than one OSD on it.
13:20
These things are possible, and quite easily, just by writing a few lines of YAML in the end. According to your workload, you can easily customize your YAMLs. That will make your life easier.
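A sketch of that kind of per-node, per-device configuration in the CephCluster storage section (node and device names here are made up):

    spec:
      storage:
        useAllNodes: false
        useAllDevices: false
        nodes:
          - name: node-a            # hypothetical node name
            devices:
              - name: nvme0n1       # hypothetical NVMe device on that node
            config:
              osdsPerDevice: "2"    # run more than one OSD on each listed device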
13:42
So far we've mainly talked about having the cluster running, or even setting up the cluster, with the YAML definition of a CephCluster object. But if you would, for example, want to, well, run some Prometheus in your Kubernetes cluster and need storage for it: to be able to use storage in Ceph,
14:01
you need to have a storage pool, for example RBD storage, block storage basically. We again just go ahead and create a CephBlockPool object, which simply contains the information of, if we go from here, the failure domain, where, well, you basically tell Ceph
14:20
to only store replicas on different hosts, to keep it simple for now. The replicated size, meaning there will be three total replicas, or three copies of data, in your cluster; then requireSafeReplicaSize, let's just skip that, it's about enforcing a safe replica size.
14:40
And you could even set the compression mode for this pool. The point is, again, we can just write this in YAML, apply it against the Kubernetes API, and a few seconds later it's there. The other objects work the same way: if you need a Ceph file system or Ceph object storage, same way.
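A sketch of such a pool object, matching the fields just described (the values are only examples):

    apiVersion: ceph.rook.io/v1
    kind: CephBlockPool
    metadata:
      name: replicapool
      namespace: rook-ceph
    spec:
      failureDomain: host          # spread replicas across different hosts
      replicated:
        size: 3                    # three copies of the data
        requireSafeReplicaSize: true
      parameters:
        compression_mode: none     # compression can be set per pool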
15:00
The operator takes care of creating all the components: for example, the MDS for a file system; the, well, standard components like the manager, the monitors, the OSDs; and even, for the object store, the RGW components. The operator simply takes care of that, and again, if you change the Ceph version,
15:22
a few seconds to maybe a minute or two later, depending on what the state of your cluster is, the operator will take care of doing the update. We've talked about deploying a Rook Ceph cluster,
15:41
mainly, so far. We also want to highlight at this point the krew plugin that Rook Ceph is building and, yeah, well, providing. It allows you to have certain processes automated;
16:02
even certain disaster recovery cases are easier to handle with it. And Gaurav will talk a bit about that. Here. So what's krew, right?
16:21
krew is basically a package manager for kubectl plugins. I mean, it makes the management of Kubernetes easier, and that's how the core developers and maintainers came together and thought, we can definitely write a plugin to make the lives of our developers and administrators easier.
16:43
krew was the way to go, since it's the de facto package manager for kubectl plugins. So, I mean, you can just do a kubectl krew install rook-ceph. That's how the plugin gets installed.
17:01
And just, if you can see, we just ran the help command. It shows a bunch of things that you could do. You can just run a whole bunch of Ceph commands, RBD commands right now. Also check the health of your cluster. You could just do a bunch of things, like even if you want to remove an OSD.
17:21
So the need actually arose because, for example, you sometimes want to use underlying tools, like ceph-objectstore-tool or something like that, to debug core troubleshooting issues and issues at the OSD level. I mean, the krew plugin is definitely a great way to go, as it provides common management
17:41
and troubleshooting tools for Ceph. Currently, I mean, a lot of things already work. Like I mentioned, you just need to run kubectl krew install rook-ceph, and it goes ahead and quickly installs the plugin. It's, I mean, way easier.
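Roughly, that looks like the following (a sketch; the exact set of subcommands depends on the plugin version):

    kubectl krew install rook-ceph        # install the plugin
    kubectl rook-ceph --help              # list the available subcommands
    kubectl rook-ceph health              # check the health of the Rook Ceph cluster
    kubectl rook-ceph ceph status         # run a Ceph command against the cluster
    kubectl rook-ceph rbd ls replicapool  # run an RBD command against a pool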
18:01
Earlier, I mean, you had to go inside the toolbox pod to debug and troubleshoot every issue; with the krew plugin you don't need to. It provides such ease of access that it makes, I mean, lives easier, and troubleshooting is definitely easier. You can just override the cluster configuration
18:20
right at runtime. And some of the disaster recovery scenarios are also addressed. One of the troubleshooting scenarios that was addressed is mon recovery. Suppose, let's say, you have the default three mons in the cluster and a majority of them lose quorum. I mean, recovering mons from mon maps, doing that whole bunch of tasks could,
18:42
if not done carefully, lead to more disasters; but certainly, with more automation in place, when things are working, this is also made easier with the krew plugin. And even if you want to troubleshoot CSI issues, it makes that easier, for sure.
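For that mon-quorum case, the plugin has a restore-quorum helper, roughly along these lines (hedged; check the plugin documentation for the exact form in your version):

    # assuming mons a, b and c existed and only mon a is still healthy
    kubectl rook-ceph mons restore-quorum a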
19:05
So, yeah. I mean, likewise if you want to restore the mons from the OSDs, and even if the Rook Ceph cluster custom resources get accidentally deleted, that can also be restored.
19:24
And one of the common goals on the roadmap is also automating core dump collection, because let's say there's an issue with a Ceph daemon, and we want to collect the core dump of the process for further investigation, to share it with the developers and with the community to understand
19:43
what issues we are facing; it should be able to do that easily. If you want to do performance profiling of a process with GDB, that could be made easier as well. So these are some of the goals. The current plugin is written in Bash, but there's work going on to rewrite the whole plugin
20:01
in Go so that it's definitely more scalable and much easier to manage, even for contributors. So yeah. Just like that.
20:26
So, I guess the point we're more or less trying to make is that if you have Kubernetes, or even run a distribution of it, like, what is it, Rancher,
20:42
or OpenShift, obviously, as well, on your hardware, and, I would even put it to some degree as, you're confident enough with Kubernetes to run it, then you can have it quite easy,
21:00
running a Ceph cluster as well on top of that. Obviously, to some degree, you need some Ceph knowledge, but that's the case with everything if you want to run it in production. It's just that this abstraction layer, again, with Kubernetes, makes it easier for you. It's more that you're kind of pushed in general there
21:22
to think more of, like, oh, well, I have some nodes, and they're simply there to take care of the components that you need to run for the Ceph cluster. And especially with the Rook Ceph operator, obviously, it makes the process easier with, well, a
21:41
GitOps approach, for example, where you can just throw your YAMLs into, well, into Git most of the time, and have that automatic mechanism basically take care of this deployment process so that, again, the operator just takes this YAML, takes care of it, and makes the changes necessary.
22:03
And with the Rook Ceph krew plugin, just to summarize that real quick again, it's a way for, yeah, for us to put certain automated processes in the hands of admins when they need them, and not just as, like, a, hey,
22:21
here's a 100-line Bash script, please run it one command at a time. And it's possible, again, because we have this access to Kubernetes, where we can just ask Kubernetes, hey, where's the monitor running? Oh, it's on node A. And all of that because, well, we have an API that can tell us most of this information,
22:40
and also for recovery scenarios there, we can just ask Kubernetes to run a new pod or to, well, get a new monitor running with the old information from the other monitors, to have the quorum recovered that is required there. And regarding Rook Ceph, as a general outlook
23:02
for the future: some of the major points we're currently looking at are that we wanna improve cluster manageability even more than where we already have it. We wanna make it easier using the Rook Ceph plugin. Right now, you still need to do quite a lot of manual YAML editing
23:20
of the objects that we have in the API, but we would like to have, well, some more krew plugin commands there to extend that functionality and make it simply easier. As well as improve security by having the operator and other components that are running in the cluster
23:41
use separate access credentials to the cluster, just to have a bit more, well, I guess to some degree even transparency, if you look at the audit logging of the Ceph cluster. And there's also encryption support for CephFS and for OSDs on partitions.
24:02
And as with everything, there's more. Feel free to check out the ROADMAP.md file on GitHub, github.com/rook. The link will be shown as well. If you wanna get involved, if you wanna contribute, if you have questions or anything, we have, well, obviously GitHub. There are even the GitHub discussions open.
24:20
If you have any, well, any more questions, I guess, then that might not fit on Slack. Well, we have a Twitter account, obviously. We also have community meetings if you have any more pressing concerns to talk about.
24:41
And well, just kind of from that side as well: as Gaurav and I mentioned, we're from Koor Technologies. We're building a company that wants to create a product around Rook Ceph and, just in general, try to help the community out there. Feel free to talk with us or contact us as well. And for now, thank you for listening,
25:01
and we'll gladly take some questions, and can simply take the remaining time, I think it said 15 minutes, for questions, or even just talking a bit about certain scenarios here with everyone, I guess. I would just like to add one more thing before we go. All right, yeah, I'll just take it; it's not a good idea that there are two of us talking at once. Yeah, I would just like to add one last thing.
25:21
If you want to see a demo and more troubleshooting scenarios, we did a talk at Ceph Virtual Summit 2022. It's already there on YouTube, where we demoed a couple of troubleshooting scenarios and the krew plugin. I'll definitely share a reference and add it here, but that'll be good to check out as well
25:41
if you want to check out a live demo. Cool, yeah. Thanks. Thanks. Thank you. Thank you. Any questions? Yeah.
26:06
What are the downsides of using Rook with Ceph? Because Ceph is known to be really hard to configure and get right. So if I summarize correctly,
26:21
the question is what the downsides are, I would more or less maybe put it as the advantages and disadvantages, of using Rook to run Ceph on Kubernetes, especially with Ceph being quite complex. What loss of control might there be on the Ceph side?
26:42
If there is any? Is there a big loss of control on the Ceph side? Yeah, because Ceph has a lot of knobs and things to configure. Are there some that we lose when we use Rook? Oh, I see. Okay, so added to that question is whether there's anything that,
27:01
well, you lose when you use Rook Ceph. I guess the major downside that most people see is that you have an additional layer, with Kubernetes being that layer. Maybe to address that a little bit more
27:21
from what I know at least: cephadm, for example. I think cephadm, as a remark, also uses Docker to run containers, or Podman, yeah. cephadm, right? Yeah, cephadm uses Podman. So cephadm, for example, also kind of introduces another layer, so to say, with Docker slash Podman.
27:41
Well, insert your container runtime here. In regards to installing Ceph, for example, in my eyes, but I'm very biased towards containers obviously, it has this aspect of: here's the Ceph image, and it should work unless you have something weird
28:03
with the host OS going on. The downside is, again, like if Kubernetes just goes completely crazy, the Ceph cluster is probably also gonna have a bad time, but that's kind of like the weighing of,
28:21
are you confident enough, I guess, to, well, run Kubernetes, and even run a Kubernetes cluster long-term. Especially with Kubernetes, there's more of this talk about, once again, pets versus cattle. So instead of just having a cluster for every application or something, and just, oh, we're done, throwing it away,
28:42
versus, well, obviously something as persistent and important as a Ceph cluster, which you can't just throw away. From experience so far, I can tell that it is possible to run a Rook Ceph cluster over multiple years. When did I start mine?
29:01
I think I had it running for two years, and the only reason I shut it down was because I had gotten new hardware in another location, and I kind of just said it was like, do I migrate it or do I not migrate it, and it was just, okay, let's just start from scratch, but that's also because that cluster I'm talking about there had like 50 other applications running where it's just like, okay,
29:22
let's start from scratch anyway, so to say. In regards to losing control: it's not necessarily that. You don't really have a manual "use this disk" kind of way
29:40
besides putting it in the YAML and, fingers crossed, the operator takes care of preparing and then deploying an OSD to that disk or even partition. But again, I think as with most tools that take away certain aspects, at least in regards to installation or configuration,
30:04
those points are taken away, but in regards to configuring Ceph or even certain other aspects, you can do everything as normal. And at least from experience with Ceph,
30:21
I guess to put it like that, it has gotten a lot better with the, what's it called, the config store. The config store, as the name basically says: you have a config store in the monitors where you can easily set options for certain components, instead of always having to manually make changes
30:41
to config files on the servers, on your storage nodes themselves. And it has gotten better, that's awesome. I would just like to say that in a lot of places it gives you control as well. Because, I mean, the operator is responsible
31:00
for reconciliation and taking charge of, I mean, automated scenarios where we want recovery to happen, right? And with Ceph, the goal is to improve recovery, and in production you don't want any unexpected loss of control either, right? We want to give admins a certain level of control.
31:22
We don't want them to have to go ahead and play around with OSDs. So I think, I mean, in many of the production scenarios you need a certain set of controls as well, which Rook actually provides. So, I mean, at that point, I would certainly
31:40
recommend it and consider it an advantage as well. You want to take the next one? Yeah, okay, sure, please. Yeah, please go ahead.
32:00
We need quorum, yeah, exactly, yeah, please go ahead. It's a similar question, but do you expect much of a performance hit from running Ceph this way? The question is whether there's going to be a performance hit in regards to running Ceph in Kubernetes.
32:26
Depends on how you run it. I'm personally preferring to run my Rook Ceph clusters always with host networking, but you can also, depending on how far along you are with containers or Kubernetes, have it go over the overlay network instead of the host network.
32:42
Some like that, some don't. I personally just do it because I don't want the traffic to go over the overlay network; you have some plugin, some CNI, container network interface, for anyone who wants to look into that, that takes care of the network between your nodes. So it more or less depends. There are a lot of people
33:01
well, just having Rook Ceph talk over the overlay network as well. That works fine too. It's just a preference, I would more or less put it at that, and it depends on what your network looks like. If you have 10G or something and your overlay network, in an iperf test at least, maybe brings it down to like 9 point something, you know,
33:23
like, is it worth exposing that traffic to the host network then, versus just having it go over the overlay network? But again, if you think about it, it's just another layer to consider. If you want that, fine; and if you don't want that, there are also options like Multus
33:42
to allow you some more fine-grained network connections, or config in regards to the interfaces you want to pass in, like different VLANs or something. But that's, again, it depends.
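For reference, the host-networking choice is a single setting in the CephCluster spec (per Rook's CRD):

    spec:
      network:
        provider: host   # Ceph daemons use the node's network instead of the pod/overlay network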
34:00
Yes, yeah, over there. Can you still manage your Ceph cluster via the Ceph dashboard, or is it another dashboard, or do you need two dashboards? The question was whether you can still use the Ceph dashboard, maybe just to expand on that, the Ceph Manager dashboard, to manage your Ceph cluster. To some degree: there is currently no functionality
34:21
to add new OSDs from it, I think, if I'm remembering correctly. That's one thing as well from the future roadmap part, with the better manageability, where I also looked at the dashboard and was like, wait, there's a create button, why don't we support that? But then it's the typical, oh, there are some roadblocks
34:41
that we just need to get out of the way and make sure that we are all, especially with the operator and even cephadm and others out there, aligned in the same way, or that there's a manager interface, because there is even one, and I think, if I heard correctly from the meetings, they're even looking into improving that interface further.
35:00
It will hopefully become easier, and thankfully also faster, to have the dashboard as this point of contact as well. Yeah, there's a lot of work that is currently going on. I'll just keep it short. I'll just say that there's a lot of work going on
35:20
currently in the usability space, from the recent discussions upstream that we have had, to improve the dashboard as well, from both the Kubernetes and the standalone Ceph perspective, to make sure that, I mean, you can easily manage and monitor Ceph
35:41
even in the CNCF world. There have been recent discussions about improving it from the Rook side as well. So a lot of work is going on in the usability space, but if you have any ideas, they will be most welcome, and it would really be great to have them;
36:00
I mean, usability is one thing that really matters a lot, right? User experience is one thing that we would certainly want to improve in Rook. Do we have time for one more question? Mm-hmm, yep. Is your main use case to provide storage within the cluster, or could, for example,
36:22
I use Kubernetes as an orchestrator for Ceph and then attach that storage to the cluster where my main applications are running, consuming it the same way, but not from within the same cluster? So the question is,
36:42
could I maybe rephrase it a bit more in the direction of: in which ways can you run Rook, I guess? I think that plays into it as well. You can run Rook Ceph in a way that you connect it to an existing Ceph cluster; it doesn't even matter if it's a Rook Ceph cluster,
37:02
any Ceph cluster works as well. In that case it mainly takes care of just setting up the CSI driver. I know people use that to some degree as well, if they have an existing Ceph cluster, or even an existing Rook Ceph cluster, that they wanna share with others.
37:21
In this external mode there's also the possibility for the Rook Ceph operator to manage certain components, so that, for example, if you want a file system, you could run the MDS daemons that you need for the file system in the cluster that your Kubernetes is running on. That works as well. So it's kind of those two main external modes,
37:41
and obviously the case of running it in the same cluster. It's like this: either you just share what you have, or you share and also run the Ceph file system or Ceph object daemons in the Kubernetes cluster. Both work with the operator. Does that answer that?
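For reference, the external mode boils down to a CephCluster object with the external connection enabled, roughly like this (a sketch; the full setup also imports the existing cluster's connection details, as described in the Rook docs):

    apiVersion: ceph.rook.io/v1
    kind: CephCluster
    metadata:
      name: rook-ceph-external
      namespace: rook-ceph-external
    spec:
      external:
        enable: true        # connect to an existing Ceph cluster instead of creating one
      crashCollector:
        disable: true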
38:02
Thank you. Done. Any other question? There are no questions. There are a bunch of stickers here for everyone. Yeah, stickers and do we have some more? And if you've asked a question just now, just come see me, you'll get a T-shirt too. Maybe there's some left over after that.