Magic Castle: Terraforming the Cloud for HPC
Formal Metadata
Title: Magic Castle: Terraforming the Cloud for HPC
Title of Series: FOSDEM 2020
Number of Parts: 490
Author: Fortin, Félix-Antoine
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/47324 (DOI)
Language: English
FOSDEM 2020, 481 / 490
Transcript: English (auto-generated)
00:06
All right, time for the next talk. Félix-Antoine, who flew in from Canada, to talk about Magic Castle. All right, so good morning everyone. My name is Félix-Antoine Fortin. I don't have an exact title at Université Laval,
00:22
which is in Quebec City in Canada. So I guess I'm some sort of research software engineer working at my university. And today I'm going to talk to you about terraforming the cloud for HPC, mostly for teaching HPC, but I have a greater vision for what Magic Castle can be.
00:40
But first, I'd like to start this talk with a question for you, in order to get you involved and maybe wake you up a bit. Why do you think there are more wizards in Harry Potter than in The Lord of the Rings? I don't want you to answer it right now. I'm going to provide you some context and maybe give you some ideas of what the answer is,
01:00
and we'll come back to it. It will make sense at some point, I assure you. All right, so some context. In Canada, we have a national organization that coordinates advanced research computing across Canada, which means, at the moment, we currently have five major HPC sites across Canada,
01:23
but all of those sites have the exact same software, they run the same scheduler, and the researchers who use those systems for free can be helped by anyone in Canada. So if you are in BC and you speak French,
01:40
you can get support from Quebec. There's no issue there. So this is our infrastructure, and we also coordinate workshops and training. So across Canada, at the moment, we do around 150 workshops per year. All of those workshops necessarily try to use some form of the HPC software environment
02:03
we provide, but in order to get access to our HPC systems, you need an account with Compute Canada, and generally you need like two or three days in order to get those accounts. But if you are a new user or you're just getting your feet wet with HPC
02:21
and don't necessarily have an account, could we do HPC somewhere other than our HPC systems when it comes to training and development? Could we replicate our HPC environment somewhere else, since it's all the same across Canada?
02:41
Which gets us back to the difference between Harry Potter and The Lord of the Rings. Does someone have a clue why it could be? Yes. Okay. Academic sector. We're getting close. I'm not going to take too many answers,
03:02
but my take on it is: it's wizardry schools. If you had wizardry schools in The Lord of the Rings, you would get many more wizards. How do you train wizards? You need to get them in school. You need to train them. But you need schools for that, right? So my proposal is to move away
03:22
from the Tower of Sauron of HPC and more toward multiple toy HPC clusters, made out of Legos, that look like Hogwarts. Now, how do you do that concretely? I'm going to do a demo.
03:41
This is a bit of a reckless demo, because I'm going to do it with my phone. I'm going to create an HPC cluster in the cloud with my phone. Now, disclaimer, the original idea for that demo is from Kelsey Hightower from Google. He does it very well. If you haven't seen the demo, go look at it on YouTube after this.
04:02
But if everything goes well, I should be able to create an HPC cluster in around 20 minutes with my phone. So let me talk to Google. Talk to Compute Canada wizard. Get the test version of Compute Canada wizard.
04:22
Greetings. How can I assist? I want to build a cluster. What is the name of your cluster? Superman 50.
04:52
You want a two-node cluster named Superman. It will come with 50 guest accounts. Is this correct? Yes.
05:09
Your cluster will be available in around 20 minutes at superman.calculquebec.cloud. Thank you for your patience.
05:22
Don't get excited. This is just it. Wait, all right, so this might just be a recording, right? Something could have failed along the way. I'm not even sure yet if it has truly created a cluster. So we'll go look at it. So in Compute Canada, we also have multiple OpenStack clouds
05:43
that are part of our federation, and one of them is on Cedar. So I can go look at it; this was my project in the Cedar cloud just before the talk, so I'm going to refresh it, and this is where I created my cluster. So if I refresh it,
06:02
if everything went well, we should see some instances being created at the moment. So yeah, it worked. So in around 20 minutes, if we have time during the questions, I could show you the cluster. All right, so let's get back to the presentation. So what did I do just now?
06:21
I talked to my Google Assistant, which talked to Dialogflow. So I have a few intents; a few of these questions were pre-canned with Google. Those intents then eventually get some answers from me, and those answers were fed through a REST API in Flask,
06:41
which was then feeding some of these answers, so just variables, parameters, to Magic Castle that I'm going to present to you, and Magic Castle actually eventually talked with the OpenStack API to create the instances. So we're going to just focus on Magic Castle for today. All of that is just fireworks
07:03
in order to make Magic Castle shine. So what is Magic Castle? Magic Castle is an open source project that instantiates a replica of a Compute Canada cluster in any major cloud. So I just did it in OpenStack, but I could have done it in Google Cloud, Amazon, Azure, or OVH, which runs OpenStack.
07:25
So it creates instances: a management node, login nodes, compute nodes. So if I had enough resources, I could have 400 compute nodes. No issue there, it scales. It creates volumes, networks, security groups. It all really starts from scratch:
07:41
as long as you have the quota, it creates a new cluster from scratch and provisions it all together in around 20 minutes. It is available on GitHub, if you want to look it up, and my slides should be on the FOSDEM website at some point. So Magic Castle is based on two major open source projects,
08:04
Terraform for creating the infrastructure and Puppet to do the provisioning. If you don't know Terraform and Puppet, you can look them up, but those are very powerful tools, and they each have their own specific language for what they do. So first we use Terraform to create the instances
08:21
and then Puppet to do the actual provisioning of the instances. So when you get Magic Castle, you have to select whichever cloud you want it to run on. And Magic Castle's architecture is composed of those files, which are mainly Terraform files
08:41
and a cloud-init file that will eventually bootstrap Puppet. So we're going to focus for now mostly on the infrastructure, which would be the infrastructure Terraform file. As I said, what it creates is a whole HPC-style cluster
09:02
that our HPC users can access. So when my Google Assistant was asking me how many accounts I needed, it was actually creating guest accounts with a single password that was pre-entered. So our users can connect on a login node through,
09:21
yes, the classical SSH, but also through JupyterHub. In Canada, when I'm not working on Magic Castle, I'm trying to push to have JupyterHub on all of our systems, and I'm using Magic Castle as a form of Trojan horse in order to get our HPC admins to know it, work with it, and
09:40
get their feet wet with JupyterHub. The login node also has Globus as an endpoint, so if we want to train our users on how to exchange data between clusters, they can connect with Globus. All of those services are provisioned by Puppet later.
10:01
At this stage, what is being done is the creation of the instances, the firewall, the router, and the access for the users. So while the login node is actually accessible from the internet, the management node hosts all of the classical administrative services.
10:22
So we have LDAP, DNS, and Slurm (slurmctld and slurmdbd) all running on a single management node at the moment. It might not scale to too big a cluster, but again, at first Magic Castle was meant for training. When it comes to storage, what we do is
10:40
we simply mount volumes directly on the management node, which are then exported with NFS. Again, we are thinking of different file systems at some point, but for now, for training, that was enough. And the actual compute nodes, the ones on which the users are going to run their jobs,
11:01
are simply running Puppet, slurmd, and Consul for provisioning (I'll get back to this later), and JupyterHub single-user. So when a user starts a notebook using the JupyterHub interface on the login node, they eventually get their notebook
11:21
on the compute node. Now, in order to spawn a cluster, I meant this to be reusable by any research analyst in Compute Canada who doesn't necessarily know about Terraform. So I wanted to have an interface that is as simple as possible. So we are going to go through that interface.
11:41
So when you interact with Magic Castle, you normally just interact with a single main file that is decomposed into four components. First, you select your provider, whichever cloud provider you want to run on; then you specify your infrastructure customization.
12:01
And if your cloud provider has some specific parameters, for example you run Magic Castle on Google Cloud but you would like your compute nodes to have GPUs, you need to specify that specifically for Google Cloud. And then Magic Castle also takes care of the DNS configuration
12:22
if you have a domain name. So in my case, when I talked with my Google Assistant, it also registered superman.calculquebec.cloud in Cloudflare DNS and created all of the SSL certificates required. So when we log in on JupyterHub, it's perfectly secure.
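To give a feel for that main file, here is a rough sketch of its four parts. The variable names here are illustrative; the exact ones are defined by the Magic Castle release you download.

    # 1. Provider selection: the module source picks the cloud.
    module "openstack" {
      source = "./openstack"

      # 2. Infrastructure customization: cluster identity and size.
      cluster_name = "superman"
      domain       = "calculquebec.cloud"
      nb_users     = 50

      # 3. Cloud-specific parameters (e.g. GPUs on Google Cloud)
      #    would go here when the chosen provider supports them.
    }

    # 4. Optional DNS configuration, e.g. through Cloudflare, so that
    #    JupyterHub gets a proper hostname and SSL certificates.
    module "dns" {
      source = "./dns/cloudflare"
      name   = "superman"
      domain = "calculquebec.cloud"
    }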
12:42
So first step, you select your provider. Very simple. In the main.tf, you have a source parameter, and depending on which release you get, this is going to point to Azure, GCP, AWS, or OpenStack.
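As a sketch (the exact module paths depend on the release you downloaded), switching clouds is just a matter of changing that source line:

    module "cluster" {
      # Pick one; the rest of the file stays the same.
      source = "./openstack"
      # source = "./gcp"
      # source = "./aws"
      # source = "./azure"
    }

The next step is your cluster customization.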
13:02
So when I said Superman to my Google Assistant, what it actually input as the cluster name is Superman. The domain name was already selected. The image is going to be your image on your cloud. So Magic Castle is meant, for now, to run only on CentOS 7.
13:20
But if you want to customize your own image, you can specify it through that parameter. And then the number of users. Again, this is meant for training at first, so in this case we're going to get 100 guest accounts that can log in with a password that gets specified. And finally, you can specify your public keys. So in order to admin that system,
13:42
you can connect with the CentOS account and your public keys, so you can manage and administer your own cluster.
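Putting those customization parameters together, a sketch could look like this; again, the exact variable names come from the release:

    module "openstack" {
      source = "./openstack"

      cluster_name = "superman"
      domain       = "calculquebec.cloud"
      image        = "CentOS-7-x64-2019-07"  # CentOS 7 only, for now

      nb_users     = 100                     # guest accounts for training
      # Admins connect with the centos account and these keys.
      public_keys  = [file("~/.ssh/id_rsa.pub")]
    }

Then you can define the different instance types.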
14:00
When you download a release, there are already predefined instance types, but if you'd like to get bigger compute nodes, you can change the type and increase the counts. And all of those parameters can be changed at any point in the life of the actual cluster. So if at some point you need 100 nodes and you at first created your Magic
14:20
Castle with just one, you could just reapply your current plan; it's going to add 99 nodes, and the cluster is going to adjust and scale by itself.
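Scaling is therefore just a matter of editing a count and reapplying. A sketch, with illustrative type names:

    module "openstack" {
      source = "./openstack"
      # ...other parameters as before...

      instances = {
        mgmt  = { type = "p4-6gb", count = 1 }
        login = { type = "p2-3gb", count = 1 }
        node  = { type = "p2-3gb", count = 100 }  # was 1; reapplying adds 99
      }
    }

Then you can define your storage.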
14:43
So for now we only support NFS, and the different sizes are for the different volumes that copy the layout we have on our file systems at Compute Canada. So the users have their own home, but they also have a shared group project space and a scratch folder.
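A sketch of that storage section, with illustrative sizes in gigabytes:

    module "openstack" {
      source = "./openstack"
      # ...other parameters as before...

      storage = {
        type         = "nfs"
        home_size    = 100  # users' home directories
        project_size = 50   # shared group project space
        scratch_size = 50   # temporary job data
      }
    }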
15:00
Then eventually, as I said, you can input some parameters for cloud-specific things. So if you'd like to have GPUs and your cloud supports it, and at some point the Puppet provisioning detects some GPUs, it's going to install the CUDA drivers automatically. And as I said,
15:20
we can support DNS automatically, based on the different parameters that were defined for your cluster at first. In this example, it's going to be registered in Cloudflare, if you have a domain registered with Cloudflare.
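A sketch of that DNS part, with an illustrative module path and arguments; the release defines which outputs of the cluster module (public IP and so on) it consumes to create the records and certificates:

    module "dns" {
      source = "./dns/cloudflare"
      name   = "superman"
      domain = "calculquebec.cloud"
    }

Then, once you have entered all of your parameters in that single file, you just type terraform apply and press Enter, and this is what, again, my Google Assistant did.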
15:41
Eventually, it's going to apply a plan and output the different parameters for your cluster: the actual password for your guest accounts, the IP address on which you can connect to the login node, etc.
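Those values come out as ordinary Terraform outputs. A sketch of what such definitions could look like (output names are illustrative):

    output "public_ip" {
      value = module.openstack.ip            # login node address
    }

    output "guest_passwd" {
      value = module.openstack.guest_passwd  # shared guest password
    }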
16:02
One of the challenges that I found when designing this specific Terraform project, if you have no experience with Terraform, was not repeating myself: since we are supporting around four major clouds, it was easy to just copy stuff, but we managed
16:21
to share as much of the Terraform code as possible between the different clouds. As for provisioning, Terraform is just meant to build the instances. When they are built, no actual software is provisioned. All of that
16:41
is missing. So all of the node provisioning is done with Puppet, but we first need to actually bootstrap Puppet, because we are starting a whole new Puppet master: all of our nodes are running an agent, but we are also running a Puppet master on the management node. So we are using the user
17:02
data and cloud-init, and all of these different steps, in order to bootstrap a Puppet master on the management node. So this was quite a challenge in order to make it all sync.
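The general idea, as a very rough sketch; the real cloud-init templates in Magic Castle do much more, and every name below is illustrative:

    # The management node's user data carries a cloud-init payload that
    # turns the fresh instance into a Puppet master the agents can reach.
    resource "openstack_compute_instance_v2" "mgmt" {
      name        = "mgmt01"
      image_name  = "CentOS-7-x64-2019-07"  # placeholder image
      flavor_name = "p4-6gb"                # placeholder flavor

      user_data = <<-EOF
        #cloud-config
        package_update: true
        packages:
          - puppetserver
        runcmd:
          - systemctl enable --now puppetserver
      EOF
    }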
17:20
But once this is provisioned, the management node acts as the conductor, managing the provisioning of all of the nodes, and everything can be kept in sync. One of the other challenges we face when it comes to provisioning is that this whole cluster
17:40
can be put in the hands of any research software analyst in Compute Canada who is not necessarily a sysadmin. Once this cluster is provisioned, it needs to be self-sustained, and people shouldn't have to do any sysadmin work by hand.
18:02
It is quite difficult to actually build the Puppet code in order to make sure that, once provisioned, everything works fine. It's a day-to-day challenge to maintain that infrastructure. You might ask, well,
18:21
you have a cluster, but what makes it an actual HPC cluster? In Canada, we have software that is normally found on our clusters, but the main point is that across Canada we share
18:41
the same scientific software through a file system that was developed at CERN, called CVMFS. All of our HPC systems share the exact same scientific software through CVMFS, which is a file system mounted through HTTP. Since all of our systems
19:00
can get access to that file system, my Magic Castle cluster can also get access to it. All of our scientific software is installed on that file system. When you spawn Magic Castle, we also mount the CVMFS
19:20
volume, which provides access to over 4000 different scientific software packages that were pre-installed there. You get the exact same scientific software environment that you would get on our HPC systems. There is a paper that was presented at PEARC last year, and Bart Oldeman was also at FOSDEM last year to present
19:42
CVMFS, if you have interest, because anyone in the world can currently mount CVMFS and get access to our open source software that was compiled and made available through it. So the key takeaways: none of this would have been possible without infrastructure as code. You could probably
20:00
cross out Terraform and just replace it with another infrastructure-as-code tool; if you'd like to build an equivalent with Pulumi, that would probably be possible. But infrastructure as code is what made us able to actually build something as complex as an HPC cluster
20:20
in a few thousand lines. And finally, Magic Castle is a teaching and development, I call it, meta-platform, because it creates HPC platforms for you to teach or develop new stuff. So again, Magic Castle can
20:40
replicate a Compute Canada cluster in around 20 minutes, and I can take questions. Any questions for Félix-Antoine?
21:02
Did you manage to sell any of this back to the traditional cluster admins? So you mean the users or the admins? The admins. Yeah, actually, we chose Puppet because we already,
21:21
in Canada, we already use Puppet to provision our clusters. My idea was to be able to reuse some of our modules. We're not there yet, because so far Magic Castle is quite self-contained, but I'm hoping that at some point they might go and grab some of my modules.
21:41
Yeah, we are getting there. Thank you for the talk. Did you have any specific reason to choose Puppet, or was it just one option? Yeah, two reasons. The first one I
22:01
already mentioned: we were already using Puppet in Compute Canada, so it was an easy choice. The other thing is we thought about maybe using Ansible, but the fact that we have an agent on the nodes is actually of value, because if at any point my research analyst switches to
22:22
root on that cluster and deletes a file by mistake, the agent, within around 30 minutes, will find that the file has been deleted and put it back. So it's self-sustained, and with Puppet I can manage that aspect. So again, I'm putting that Magic Castle cluster in the hands of people who are not necessarily
22:42
sysadmins, so Puppet is kind of doing the sysadmin work for me. Thank you for a very interesting presentation. My question is around the Superman cluster now.
23:03
Identity management: would I need to have a Canadian identity to be able to log in, and how does that work? And also, the lifetime of a cluster after the workshop, does it disappear? Alright, so the Superman cluster stays alive
23:21
as long as I want it to; once I did the apply, I could do the inverse, which is terraform destroy. So in the case of our training, depending on the duration, if we do a single-day workshop, we keep the cluster open for like two or three days for people to maybe download
23:42
their files or keep on playing with it. We have Magic Castle clusters that I've been running for multiple months just for development, for example. Identities for logging in? Alright, so for logging in, in the Superman case it
24:01
created 50 guest accounts, from user01 to user50, and the password for superman.calculquebec.cloud is FOSDEM, lowercase, 20, exclamation mark. You can try it if you like. You can break it, you can
24:21
hack into it, I don't care, because at the end of the day I'm just going to destroy it. It doesn't matter. Fantastic, thank you. Do you see a tension between your original use case of supporting training and workshops and also supporting,
24:42
for example, if I wanted to use it, I was the one who asked you about Lustre earlier in the week, because I would want it to be running real work, maybe on an OpenStack, and do you think that it might make your job as maintainer too complicated? No. We are already getting there:
25:01
we started with training, and then people started asking questions: what is that? Is it not a real HPC cluster? Can I use it for my own needs? And so I don't know as a maintainer where it's going to get me. At first it was a pet project, and now it's almost a full-time job just
25:22
for Compute Canada, and I'm curious where it's going to get me now that it's fully open source. I don't know, maybe my actual employer is going to say you cannot do that anymore. We'll see, but yeah, I'm really curious how far we can get that thing, and
25:42
we are actually getting interns this summer looking to implement maybe Lustre, and to work on different capabilities that are only provided by some commercial cloud providers, like for example Lustre on AWS, or different networks or different architectures
26:01
too. So far we're just running the x86-64 kind of architecture, but we could do ARM too. OK, that's all we have time for. Thank you very much, Félix-Antoine. Thank you.