Introducing HPC with a Raspberry Pi cluster
Formal Metadata

Title: Introducing HPC with a Raspberry Pi cluster
Number of Parts: 490
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/47290 (DOI)
Transcript: English (auto-generated)
00:05
Just in time for the first talk, we have Colin who will be talking about building clusters with Raspberry Pis.
00:23
Is that working? Yes. Right. Hello. My name is Colin Sauzé. I'm a research software engineer at Aberystwyth University in Wales. And I'm going to talk today about my experience of building Raspberry Pi clusters and using them to teach people beginning their first journey into high-performance computing.
00:45
Just a bit about me. I'm working on a project called Supercomputing Wales. We work across four universities in Wales, ourselves Bangor University in the north and Cardiff and Swansea in the south. Between us, we share two physical high-performance computing systems,
01:01
one in Swansea and one in Cardiff. And across the project, we've employed 15 research software engineers such as myself whose role is not just to optimise software for any users, but also to enable training and to help people become familiar with using the system.
01:20
My background is actually a bit different to this. I did a PhD in robotics and in doing that, I spent a lot of time playing with single-board Linux computers and various other embedded Linux devices. The Raspberry Pi came out just after I finished that and I kind of converted to it very quickly because very suddenly I realised that the computers I had been paying $200 or $300 a board for
01:41
were now $20 or $30 with pretty much the same capabilities. And I read a lot in the early days about people building Raspberry Pi clusters and I thought, that sounds fun, but I can't see why you'd do it apart from for the sake of doing it. The performance of a Raspberry Pi is so small that any real computation you want to run is better off done on your desktop computer than it is done on a Raspberry Pi cluster.
02:04
And then more recently I started to see people doing slightly more serious things, that Los Alamos National Laboratory in America built a 750-node Raspberry Pi cluster in this giant, I think it's a 4U rack-mount case, and they actually use it as a test system for anyone who wants to run software on their real cluster
02:22
to take the load off the real system. So you can run a test case, do it on the Raspberry Pi, you prove it works on there, then you're allowed access to the real system. At the opposite end of the scale, EPCC in Edinburgh in Scotland built a system called Wee Archie, named after their ARCHER supercomputer,
02:40
which is the largest public supercomputer in the UK. And they've also got an even smaller version that they call Wee Archlet, which is what you see in this picture, which is just five Raspberry Pis in Lego cases. And they take these out for hundreds of outreach demos to schools, science events, all sorts. And very helpfully they put some instructions on their website about how to build one.
03:03
And this is what I started following as the basis for mine, but I found these were a bit out of date, and they also didn't go beyond doing a basic MPI run. So there was no job scheduling system, there was no real software environment, there was just basic MPI and a few demos. Since I first looked at it though, they've actually come back and built about 20 or 30 really good demo programs.
03:26
And they're something I'd like to reevaluate in a bit. My final inspiration for this was my colleagues at Swansea University actually got some money together to build a 16-node cluster that they demoed at the Swansea Festival of Science in 2018.
03:41
And they had a demonstration running with CFD, Computational Fluid Dynamics, using a Microsoft Kinect sensor modeling the airflow of how air would move over a person. So they basically took a 3D picture of a person and then tried to show how the airflow would move around them. And they've done a lot of work with this sort of thing in real research. They are behind the CFD modeling for Bloodhound, the supersonic car that's currently in South Africa
04:04
attempting some record-breaking high-speed runs. So why should we actually teach with a Raspberry Pi cluster instead of a real system? Well, the first thing is that the real system is busy doing research. We often see 85, 90, 95% load on our real systems.
04:21
The last thing you actually want is a class of students coming in and using it and disrupting the real research. There's also, I found, a fear amongst learners that they are going to break it. They are using this multimillion-pound research equipment for the first time. A lot of the people coming into it for the first time don't have huge amounts of computational background. And it's a bit scary suddenly being told, here, go and play with this very expensive system.
04:44
Whereas being told, here, go and play with this very cheap system that's right in front of you seems a bit less daunting to them. Also, when you make mistakes with resource allocation of a big system, because it's so big and you're doing sort of simple demonstration programs, it's not always immediately obvious that you've done something wrong.
05:02
When you've got a much smaller system with much less memory, much less disk, much less compute, it becomes much more obvious much more quickly that you've actually made a mistake in what you're doing. It's also nice to have control over our environment. On the real system, we are at the behest of our system administrators
05:20
and to some extent our suppliers as to what we can change and how things go. So if we need a particular piece of software installing for a lesson, it might take a few days to get it installed, and we've got to wait on other people to install it. Also, the hardware is abstract. It's away in a remote data center. For us, it's even in another university that's two hours' drive away.
05:42
And even for the people in that university, they're never going to see the real system. It's hidden in a basement somewhere that they are never going to see. Having a system physically there in front of them makes it a lot less abstract, even if it doesn't quite look the same. And finally, they don't need accounts. It does take us a few days to get people accounts. They need to already be registered on our system as students or staff, and that can cause problems.
06:03
You always get one person who shows up to a lesson without their account ready. You go and create an account there and then, and it's at least an hour until they can log on and do anything. Having a Raspberry Pi system there in front of them means we've got something we can immediately make an account for. No one cares that we made an account for them on it, and they can log straight in.
06:21
Also added to that is that the Raspberry Pi cluster presents as its own Wi-Fi access point, so anybody with a laptop can connect straight to it without having to get on the rest of the university network. So my little cluster I've called Tweety Pi, following a theme of naming all of our clusters after birds. This was literally begged, borrowed, and stolen from parts I found lying around the university.
06:45
So it is 10 of the oldest original Raspberry Pi Model 1s, plus one Raspberry Pi 3, which is I think a 2017 or 2018 model, to act as the head node and login node, running the latest version of Raspbian, which is the Debian derivative for the Raspberry Pi,
07:02
and the head node acts as a Wi-Fi access point. The whole thing actually needs internet access to work because you have to have a synchronized clock across all the systems. The Raspberry Pi has no built-in real-time clock module. The easiest way to get that is to get on the internet and pull it off an NTP server. For that, I then plug in a laptop which is on the main Wi-Fi or a mobile phone,
07:24
and then run the entire system through that. Demo software is something that's still a bit early days. I very quickly wrote a demo program for National Science Week in March last year, which was calculating pi using a Monte Carlo method.
07:43
So here you put a load of random dots on a square and draw a quarter of a circle. Those that fall inside the circle are counted against those that fall outside, and the fraction that land inside, multiplied by four, gives you an approximation of pi. The more points you use, the better your approximation gets, more or less, up to probably 15,000 or 20,000 points.
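For anyone wanting to try something similar, here is a minimal sketch of that kind of Monte Carlo estimate parallelised with MPI. It assumes Python with the mpi4py package; the speaker's actual demo code (on his GitHub) may be structured differently.

```python
# Minimal Monte Carlo pi estimate with MPI (assumes mpi4py is installed).
# Each rank samples its own points; rank 0 combines the counts.
import random
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

total_points = 20000                 # points across all ranks
points_per_rank = total_points // size

inside = 0
for _ in range(points_per_rank):
    x, y = random.random(), random.random()
    if x * x + y * y <= 1.0:         # the point falls inside the quarter circle
        inside += 1

# Sum the per-rank counts onto rank 0 and print the estimate there.
total_inside = comm.reduce(inside, op=MPI.SUM, root=0)
if rank == 0:
    print("pi is approximately", 4.0 * total_inside / (points_per_rank * size))
```

Run it with something like mpirun -n 4 python3 pi_estimate.py and the four ranks each do a quarter of the sampling.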
08:02
The version I've written uses MPI, so you can choose to distribute it across all the compute nodes, and you can choose how many compute nodes you'd like to use. There's a very simple graphical interface to it that lets the user choose how many points they want to calculate and how many nodes they want to use, and then a very simple illustration of the queue. So they get to see how job queuing works.
08:21
So one person can go and submit something to all 10 nodes, then the next person can't run anything. But obviously those 10 compute faster than if you just submitted it to three or four nodes. I mentioned earlier that my colleagues in Swansea have done a more impressive graphical demo using CFD. It's a bit more power-hungry, though.
08:40
They've got 16 Raspberry Pi 3s, which are quad-core. I've got 10 Raspberry Pi 1s, which are single-core and a much older processor design. I probably wouldn't be able to run their thing in a sensible amount of time. And even on their cluster, it's taking three or four minutes per person to run. If you're at a science event trying to pull through many people in one go, that doesn't really work very well.
09:02
Only in recent months did I come across the demos from EPCC for Wee Archie and Wee Archlet. I think there are 10 or 20 demos on their website, and that's certainly one of my things to evaluate for the future. Now, I wanted to make the environment as realistic as possible,
09:20
and most of the cluster designs I'd seen out there didn't do this. They just went for running basic MPI, and that was it. So I went for first installing MPICH, and then I installed the Slurm job manager, so you can run all the standard Slurm commands, and all our main real clusters run Slurm as well. I put quotas on home directories to stop people creating too much data on there.
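To give an idea of what running the standard Slurm commands looks like in practice, a job on a setup like this would typically be submitted with a batch script along these lines; the module name, script name and resource numbers are illustrative guesses, not values taken from the talk.

```bash
#!/bin/bash
# Illustrative Slurm batch script (names and limits are made up).
#SBATCH --job-name=pi-demo
#SBATCH --nodes=4              # ask for 4 of the 10 compute nodes
#SBATCH --ntasks-per-node=1    # one MPI rank per single-core Pi
#SBATCH --time=00:10:00        # wall-time limit

module load python             # the locally built Python module mentioned below
srun python3 pi_estimate.py    # srun launches one MPI rank per task
```

You would submit this with sbatch, check on it with squeue (or keep watch squeue running, as the talk describes doing on a projector later), and cancel it with scancel.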
09:44
All home directories are NFS-mounted across the entire cluster. There are software modules through the TCL modules command, of which there's only one at the moment, which is a version of Python that's not available through the standard repositories. And all the nodes network boot off one image, so I don't have to go and replicate SD cards with gigabytes of data
10:04
and keep updating them every time I install a new piece of software. Now, that causes a slight problem. The Raspberry Pi 1 didn't support network booting. It only came in in the Raspberry Pi 3. There's a slight hack to this. What I did was I placed the kernel and the bootloader on the SD card,
10:21
and then I NFS-mount the root file system. So now there's about a 50-megabyte image on the SD card, which is identical for every machine. That image doesn't update very often, only when there's a kernel update, and all the other software is then pulled over the network via NFS. If I ever do get Raspberry Pi 3s, then moving this up to PXE booting
10:45
on the Raspberry Pi 3 should be a trivial change of basically copying that SD card image to a TFTP boot server. For teaching materials, I had already written a lesson based around the principles of Software Carpentry.
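Going back to that network-boot arrangement for a moment, the usual way to do this kind of NFS-root setup on a Pi looks roughly like the following; the IP addresses and paths are invented for illustration, not the values used on Tweety Pi.

```
# /boot/cmdline.txt on each compute node's SD card (all on one line):
console=tty1 root=/dev/nfs nfsroot=192.168.10.1:/srv/pi-root,vers=3 rw ip=dhcp rootwait

# /etc/exports on the head node, sharing the root image and the home directories:
/srv/pi-root  192.168.10.0/24(rw,sync,no_root_squash,no_subtree_check)
/home         192.168.10.0/24(rw,sync,no_subtree_check)
```

After editing /etc/exports you re-export with exportfs -ra, and from then on the SD cards only need touching when the kernel changes, as described above.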
11:00
I don't know how many people here are familiar with Software Carpentry, but okay, one, two. They are an organization that goes out and builds, sort of, computational training for researchers, and they focus very much on trying to teach enough skills to be useful, not trying to teach an entire theoretical background and the whole of computer science.
11:21
So I'd already written a lesson loosely based on another lesson called HPC in a Day, in which we talk about what an HPC is, how you log into it, what file systems are available and how you can transfer data, and submitting and monitoring jobs with Slurm, and then some slightly more advanced stuff on how to do parallel processing. So a bit on profiling and measuring where performance bottlenecks exist,
11:44
parallelizing code and looking at Amdahl's law for limitations, then taking that parallel code and moving it to MPI, and finally a bit on HPC best practice. And what I did was I adapted this lesson to be entirely on the Raspberry Pi cluster. I removed any bits that didn't work for the Raspberry Pi,
12:01
and it probably shrunk down to what should have been a three-hour, three-and-a-half-hour lesson. I was then asked if I could teach something for a bunch of summer school students who were coming to visit us, and this unfortunately got cut down to about an hour and a half, so I ran a very rapid version of this. These PhD students were new students doing solar physics,
12:22
all funded by STFC, the UK Science and Technology Facilities Council. So every PhD student who was starting a PhD in solar physics that year came to us. And one feature of this was none of them had actually registered for their courses yet. They were all just about to start, so none of them actually had a university account, which meant getting them onto any real HPC would have been an utter pain.
12:43
Having something there on a desk that they could all just connect to from their laptops certainly made life a lot easier for that. For most of them, it was the first time that they'd ever used an HPC system, but most of them had some Unix experience already. And they seemed to really enjoy it. They seemed to really enjoy actually playing around with just SSH into a system,
13:02
and the idea that they were sharing a system over SSH, I don't know what it was, but they really, really took to that idea, but also the kind of idea that they didn't care about the system, they couldn't break it, but also that they could kind of mess around and see what each other was up to and see what was in each other's home directories, because I hadn't locked down the permissions and home directories were world-readable.
13:24
That seemed to really grab them for some reason. And unfortunately the main complaint with that was just lack of time, so we ran through the whole thing in one and a half hours, and we got as far as just doing basic job submission. But one thing that was really obvious, with 15 people and a 10-node cluster, is that queuing became apparent really easily,
13:41
because one person would submit a five- or a 10-node job, and instantly the queue is full. And what I did was I actually left up on the projector the output of the watch squeue command, so you saw an update of the queue every second, and you could see who was in the queue, and they could go and blame each other and see how long jobs were taking. A couple of issues came up with just running that:
14:02
I accidentally overwrote some system accounts when I created all the user accounts, and Slurm wouldn't run correctly on the first day; and all the Wi-Fi was going through one laptop that was actually on our guest Wi-Fi, because I hadn't registered the laptop properly, and that caused a few Internet problems. And because they were all connecting to the cluster with their laptops
14:22
for their Wi-Fi, that meant that all of their Internet traffic was also going through that one laptop, and that was where they were getting the course notes from, and it meant that they were often flipping backwards and forwards between other networks or their phones and that main Wi-Fi for the cluster.
14:40
I then ran it again for a second time, this time with a slightly different group. This was approximately 10 people who were actually doing proper training on how to use our real research cluster. It was about 10 people with a mix of levels, mix of staff and students, and this time I simultaneously ran the real HPC alongside the Raspberry Pi cluster, so people could choose to use one or the other or both.
15:02
And it proved to be a really good backup for those who had locked their accounts, of which I think there were two. It helped make it a bit more tangible, but a lot of them tended to pick one system or the other and would submit all their commands on one or the other, but I managed to get the environments close enough that it didn't matter. They were able to run everything that was in the material on either system.
15:23
So in future work, what I would like to do is have a bit better configuration management for this. Right now I do have a set of scripts that should theoretically deploy a cluster. Testing these takes ages because you've got to start from the beginning and build software and go through the whole deployment process. I think on the Raspberry Pi ones,
15:40
compiling the software that I've put in just for that one module takes about a day. So it's not a quick process to iterate through the development of this. The other big difference at the moment between this and any real system is that this runs on Raspbian, which is a Debian derivative. Pretty much every HPC system I've ever come across runs on some kind of CentOS or Red Hat Enterprise Linux derivative.
16:03
And it would be nice to try and get CentOS to run on there instead of Debian. As far as I understand, there is an ARM port that will run on the Raspberry Pi, but I haven't got around to trying it yet. Raspbian is kind of the default that the Raspberry Pi Foundation pushes. And I'd also like to focus a bit better on some demos.
16:21
The two I really want to do is we have a large group who analyze satellite imagery. I'd like to write something around there where someone can type in an address and then see some satellite imagery of that address and then do some analysis of it to look at how that land use in that area has changed over time. The other big area of research for us is in genomics.
16:41
And we had two PhD students a few years ago develop something called MonsterLab. And here you put a set of Lego bricks together, and the different colors of the Lego bricks determine what gene you're looking at. You run this through a sequencer, which looks at the colors of the bricks and then determines what the genetic sequence is and therefore what attributes your monster has.
17:01
So for instance, the first Lego brick is the number of heads, the second one is the number of legs, the third one is the number of arms, the fourth one is the number of eyes. And by having different colors, you then get different attributes to your monster. Right now, all the system does is you read through the brick and it tells you what the attributes are and the kids then go and draw their monsters. What I'd really like to do is simulate them and have maybe some kind of physics simulation
17:22
that goes and simulates them. Or to do some gene sequence pattern searching and look for similar monsters that came up previously, which I suppose is the more realistic scenario to real genomics. So finally, if you're interested in building your own cluster, I have put up on my GitHub all of the scripts that I have used to create this.
17:44
Those scripts are still under development somewhat. I've basically done one full iteration through them. The code for copying things over for network booting is not fully tested, but you should at least be able to get an SD card deployed for the head node and for a compute node.
18:01
Also up on GitHub, this time under my organizational account, is the teaching material that I used. And if you want to know any more about this, then please feel free to email me. My email address is there at the bottom of the slide. James, please thank the speaker.
18:24
Before you leave, please keep seated while we do questions. Any questions for Colin? Thank you for a very interesting presentation. I was wondering about support for containers.
18:40
I see that coming more and more with Singularity. Could that be added? I don't know if Singularity works on ARM. I know Docker definitely doesn't, but I'm not sure if Singularity can be made to work on an ARM system. Given that we're seeing some real ARM clusters, in the UK there's one called Isambard at Bristol University, maybe that is something that's in the pipeline for Singularity. But yeah, it's definitely a feature that would be nice to add.
19:03
Thank you. Is it essential to boot a Raspberry Pi from an image
19:22
rather than install it in a conventional way? And if so, is there any news on an image for CentOS for Raspberry Pi? You put an image on an SD card. You have to have an SD card in the older Pis to boot from. My problem is if I want to put 10 images that are identical onto 10 machines,
19:41
I've got to get the image right, then copy it, and then if I make a change, I've got to either redo those changes to every machine or rebuild the SD card image and then put a new SD card in every one of them. So by having the network boot, that problem goes away. And sorry, what was the second bit about? Is there any news on a CentOS image for the Raspberry Pi?
20:01
I think there is a CentOS image, but I'm not certain. Right. There's definitely an ARM image, but... Put it back, please. Was NFS your first choice,
20:20
or did you also try other file systems, networking file systems, like Lustre, GFS, whatever? I think Lustre might be a bit too much for a Raspberry Pi. Also, given the size of the cluster, just NFS is fine. There's not a performance issue, really. And I'd also need multiple disks if I want to use a parallel file system. Having a single disk being served over a parallel file system
20:42
won't achieve anything. I'm not fully aware of HPCs that much, so it might be a newbie question, but as a user, for the PhD students who are
21:00
logging on to the Raspberry Pis, do they see any difference compared with just SSHing to a regular node instead of an HPC? Do they experience anything different...? Not really. That's part of the point: the experience is very similar to the real system. The main difference is obviously that it's got less memory,
21:21
less disk space, and it's slower. But that's not immediately obvious when you're just typing commands at an SSH prompt. So why would that be an advantage for them to train on...