HPC devroom welcome, introduction to HPC-UGent and VSC
Formal Metadata

Title of Series: FOSDEM 2014
Part: 187 of 199
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/32544 (DOI)
Language: English
Transcript: English (auto-generated)
00:00
Hi everyone. The people here got the video thing fixed right in time, so we're good to go. I'm Kenneth Hoste. I'm part of the HPC team at Ghent University, and I'm, let's say, the organizer of this devroom, so I'm going to be running around like crazy. I'm just going to give the word to Ewald, who is the team lead of the HPC team at Ghent University.
00:24
He's going to do a quick introduction before we go to the real first talk. Welcome everyone; you're quite numerous.
00:40
In the first few slides, I'm just going to give you an overview of what we do at Ghent University with the HPC team, and also in the Flemish Supercomputer Center. I gather most of you will know about high-performance computing. The HPC team in Ghent was started in 2008.
01:02
It's quite interesting: it's a single point of contact for researchers with questions related to high-performance computing. And so we basically said: our mission is the centralization of HPC with respect to infrastructure, expertise, support, and training.
01:22
How many are we? How many people are there? We're eight. I'm supposed to be the team leader. Stijn De Weirdt is the technical team lead. Kenneth Hoste, who is organizing the devroom, is what you can call the head of user support, and is also involved in the EasyBuild project,
01:43
which he will elaborate on further (see the short sketch below). Wouter is a sysadmin. Jens is also a sysadmin and does some user support. Kenneth is our storage expert. Andy is the low-level, or deep-level, whatever you want to call it, user support and sysadmin. And then Ewan is our newest addition.
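As a quick aside on EasyBuild: it is a tool for building and installing scientific software and exposing it through environment modules. A minimal usage sketch, in which the package, version and toolchain names are purely illustrative, could look like this:

    # build and install zlib, plus any missing dependencies, from an easyconfig file
    eb zlib-1.2.8-GCC-4.8.2.eb --robot
    # afterwards the software is available via the modules system
    module load zlib/1.2.8-GCC-4.8.2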
02:06
So that's the HPC team. We take care of all the users at Ghent University. But we also go a bit beyond. You might have seen the banner outside the door. There's also something called the Flemish Supercomputer Center.
02:23
That basically brings together all the supercomputer, well, the individual HPC teams of Flanders into one virtual organization. And so far we've maybe been a bit too virtual, but we're getting more and more real,
02:41
especially since the advent of our new HPC manager, Russell Slapp. He's located at Helplast, and he's kindly providing us with a lot of funding for new HPC infrastructure.
03:01
And he's managing all the HPC teams of Flanders, from all five main university associations. So, to sketch where we are in the global picture and the European picture:
03:24
you can categorize high-performance computing, you can categorize supercomputers, in three tiers. At the lowest tier, you have tier two. Well, if you go even below that, tier three is like your laptop or the server you have at home. At the tier two level, we're talking about, let's say, 500k.
03:46
For 500k you have a decent-sized supercomputer, and that's an amount that's doable for individual universities in Flanders.
04:03
And you can handle, let's say, the entire tender and award process yourself, and so on. One level up is tier one. That's everything times ten: scale times ten, but also money times ten. Actually, a bit more. Tier one facilities happen to be located in Ghent as well,
04:25
but they're managed by the Flemish Supercomputer Center. So they're not really ours; they're really at the Flemish Supercomputer Center level. And then, of course, at the top level, there's PRACE, and the facilities that PRACE offers.
04:42
Those are the tier zero supercomputers; that's just a shitload of money we're talking about. So tier two is basically capacity computing at the university level. That's what we're involved in. That's the primary mission: getting that infrastructure there and maintaining it. And it's, unfortunately, spread out all over Ghent,
05:03
because we started in 2008, when the data center was not there yet. So the first cluster was installed in the basement of the rectorate building, which is not really a good location for that, not a very good idea. Then we moved to another university campus and placed clusters there, but now we're happily consolidating in our new data center.
05:23
And just to give you a very brief overview of what kind of infrastructure we have, because this is always a bit of showing off, comparing sizes and stuff like that: we have seven clusters, ranging between 30 nodes and, what is it, 200 nodes.
05:43
Quite a few cores, and all the details are on here, which I don't know by heart. Officially, our cluster is named after Simon Stevin, who is a famous, well, Flemish-born scientist and mathematician who then moved to Holland, and, well,
06:04
homo universalis, as it's called. But unofficially, well, we had to name our clusters something. We were planning on having a lot of clusters, and then you need something of which there are a lot of names, and then Pokemon spring to mind. So we decided to name all our clusters after Pokemon.
06:23
So unofficially, we're called the Pokemon cluster, which makes your life a little bit more pleasant. No one is talking about Simon Stevin; actually, everyone is talking about the Pokemon names. And if you happen to have been blessed with my age, you don't know what Pokemon are,
06:41
so the first time you're on such a machine, you wonder what those names are supposed to mean. It's a generation thing, I guess. Very importantly, so that's tier two. That's what is really owned by Ghent University itself; that's what we, the team, manage.
07:01
It's a lot of work, and we only have a handful of people for all of that. We do training, user support, and so on. On top of that, we are also responsible for this machine, the tier one supercomputer of Flanders. There's only one at the moment, that we know of. It's called muk, because it's our prerogative.
07:20
We named it after a Pokemon. But as I said, it's owned and operated by the Flemish Supercomputer Center. It's pretty powerful, in the sense that we landed in the Top500 of supercomputers when it was inaugurated in June 2012, at position 118.
07:40
And we're now gradually dropping down the list, of course, which makes sense. We're still doing quite well. Obviously, the economic crisis has nothing to do with that. Currently, we're at place 306, which is likely to be 400 or something the next time around. So all in all, this is about 8,500 cores,
08:03
which is the largest supercomputer in Belgium that we know of. Well, no longer, actually; the honor goes to the French-speaking part, of course. Compute time on this supercomputer is available to researchers.
08:22
But you have to submit an application. All information is available either on the website of the Flemish Supercomputer Center or on our website, and you're free to ask questions about that. Like I said, we're only eight people at Ghent University.
08:40
We're taking care of our tier two systems, and we're also taking care of tier one, because we handle its exploitation. So no wonder we're always interested in getting interesting profiles. So if you are interested in a job (I'm not saying we're hiring at the moment), just send us your CV, and we'll have a look.
09:01
Thanks for your attention. Any questions? In general about HPC or our infrastructure? What does it take?
09:27
You have to fill in a boring form. That's basically it. Well, you might need some consulting, maybe from the local university that supports you. But if not, you're free to send us emails about it.
09:44
If you're not from Belgium, that would mean that we have to discuss it, but it's not impossible. You basically have to send me an email. What problems are you working on? What problems?
10:01
We're not working on the problems ourselves; it's always the scientists. We're basically technical staff: we're just maintaining the systems and helping the scientists. But what problems are scientists working on? We have computational chemists who are looking for new materials to improve, for instance, the reactor walls in nuclear reactors,
10:20
to improve the mix there. We have computational fluid dynamics people who want to improve chemical reactor design or want to make a better atmospheric model of, for instance, pollution above Antwerp.
10:42
Just saying something off the top of my head now. What else? What's fancy? Bioinformatics, maybe that's fun. So they're looking for interesting sequences to apply in enzyme engineering and so on.
11:01
So, I mean, basically the whole variety of research in bioinformatics, that's what's going on. You said there are eight separate clusters, and there are eight people? Yes. Do they come from different vendors? Yes. And does that kind of management increase
11:22
your workload, because they're all different? Yes. I'm not an informatician, so I'm not allowed to work on the systems myself. But there are seven. Do you find that increases your workload, because they're not all the same? Yes, it does. But it's a cost issue as well.
11:41
We shop with different vendors because, depending on the time frame you're buying in, one vendor is more expensive than the other. Or maybe IBM is cheaper for storage, and Dell is cheaper for compute nodes. We also use existing tools, and we have developed some tools ourselves, to make our life easier in that respect, so that we can do the same thing once and apply it to all the machines (see the sketch below).
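A purely hypothetical illustration of that idea, assuming password-less SSH access to each cluster's head node; the host names and the command are made up:

    #!/bin/bash
    # run the same maintenance action once on every cluster head node
    for host in cluster1 cluster2 cluster3; do
        ssh "$host" "uptime"   # replace 'uptime' with the actual admin command
    done

The point is that the action is defined once and rolled out identically, regardless of which vendor a given cluster came from.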
12:01
The last procurement, to chip in on Andy's point: basically we just bought hardware, just pizza boxes, and the team did the installation themselves, which actually made our life a lot easier.
12:20
If we order it, well, maybe we're a bit picky in that respect, but if we order it from a vendor, we usually end up spending as much time reinstalling or reconfiguring it to our needs. What distribution do you use? Do you want to take that one? It's active.
12:41
That's what we do, but on some clusters, for example, we're still using something else. We're trying to get to a more homogeneous situation. Even if it's never fully homogeneous, it's always Linux-based at least. Do you have a common batch system? Yeah, PBS, Torque, the open source one, and all that stuff.
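For those who have never used it, a minimal sketch of a Torque/PBS job script and how it gets submitted; the resource values, module name and program below are just placeholders:

    #!/bin/bash
    #PBS -N example_job          # job name
    #PBS -l nodes=1:ppn=4        # 1 node, 4 cores per node
    #PBS -l walltime=01:00:00    # maximum run time
    cd $PBS_O_WORKDIR            # start from the directory the job was submitted from
    module load SomeApplication  # placeholder module name
    ./my_program input.dat       # placeholder program and input file

Such a script would be handed to the batch system with 'qsub jobscript.sh', and 'qstat' shows where it sits in the queue.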
13:01
Maybe one more, one last question. Is it CPU only, or do you have accelerators like GPUs or FPGAs? In Ghent we don't have GPUs, or only very limited. Leuven has GPUs.
13:20
So that's the idea within the Flemish supercomputer system: to diversify. I mean, Leuven takes care of GPUs, and we don't want to do that. We want to do compute and data; that's what our mission statement is. Leuven is more specialized in GPUs.
13:40
Okay, sorry we have to wrap it up now. Let's go. Thanks for that.