
Deploying Containerized Applications on Secure Large Scale HPC Production Systems.


Formal Metadata

Title
Deploying Containerized Applications on Secure Large Scale HPC Production Systems.
Number of Parts
637
Author
David Brayford (Leibniz Supercomputing Centre)
License
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Content Metadata

Abstract
The need to easily customize, reproduce, and migrate applications and workflows is steadily increasing amongst the High Performance Computing (HPC) community, as non-traditional HPC software environments and applications are starting to require HPC resources to tackle "real world" scientific problems. In addition, traditional HPC software is becoming more complex and is often deployed on multiple different architectures. In this talk, we discuss the issues associated with the deployment of containerized environments on secure HPC systems and how we successfully deployed traditional and non-traditional HPC applications on a secure large scale HPC production system with HPC specific containers.

Transcript: English (auto-generated)
My name is David Brayford and I'm a senior scientist at the Leibniz Supercomputing Centre (LRZ) in Munich, and I'm also a member of the technical steering committee for the OpenHPC project. Today I'm talking about how to deploy containerized applications and workflows on large scale, secure HPC systems. Basically, what we're looking to do is to transition your workflows and your applications from your development environments, which for a lot of applications will be laptops and desktops, to a supercomputer with as little effort as possible. It just needs to work. This is going to become more and more important as we look at things like AI and other new, non-traditional HPC applications. So the ability to make your workflow portable inside a container
is really, really important. It's not just building and compiling your application, but your entire workflow, because generally that's what matters to you; your workflows are getting significantly more complicated. So you need the ability to transition workflows from your development environment to supercomputers, because a lot of scientists don't just want to run on a single supercomputer, they want to run on different centres, and they want to be able to do this with as little effort as possible and make it as portable as possible. Unfortunately, containers themselves are not portable,
and we'll explain this in the next section. Some of the key challenges you're going to see when running on HPC systems are that you can have different instruction set architectures (ISAs). On the CPU side you have the x86 family, the Arm family and the POWER family. Then you have various different GPUs from NVIDIA and AMD, and also Intel in the future, and you can have accelerators on novel systems. You can also have different memory, interconnect and node configurations. Those are also going to be important to you, because if you change them,
you might have to change your workflows and what's installed in terms of drivers. Also, one of the big concerns we have at LRZ is that we don't have a direct connection to the internet, so building software on our HPC system is not possible if it requires an internet connection. If you're using things like Python or Julia, or you're pulling down applications or software from the internet, that's not possible. And finally, things like system software, file systems, hardware drivers for GPUs and distributed processing software such as MPI also need to be specific to your particular HPC system, so your container will have to be modified to match that one system. The same goes for I/O patterns: parallel file systems have specific I/O patterns which they handle really well, and when you move away from them you can cause significant problems on the system. The key challenges for the actual user
are that if you're looking at AI, ML and DL workflows such as TensorFlow, you've got a rapid update cycle, so you're constantly rebuilding things, and if you can do that in a container it really helps. Also dependencies: are you running Python 3 or Python 2, different versions of glibc, different versions of compilers, different versions of libraries, et cetera. Reproducibility: this is what people talk about when they say containers are reproducible. Actually, they're not, because the underlying HPC host system changes. You might get a change to your parallel file system, such as GPFS, or a change to the OS in terms of patches. So even though the binaries inside the container are the same, the binaries they interface with on the host system are not, and that can mean your results are no longer reproducible. Then we're looking at performance and stability. Basically, you need to use the optimized libraries and software which have been set up for that particular HPC system, especially when you're looking at things like MPI, because if you don't, you can run on a small number of nodes,
but as soon as you scale to hundreds of nodes and thousands of MPI ranks, you're going to have a problem. And finally, something people don't realize: the container size. If you're doing something like machine learning, you might have terabytes of data, and you're not going to be able to create a container of multiple terabytes on your laptop. So you basically want to store your data in a location outside your container, because you need to move that container from your development environment to the HPC centre.
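As a minimal sketch of that idea (the paths, image name and script are placeholders, not anything from the talk), you would keep the large dataset on the host or a shared file system and bind-mount it into the container at run time instead of baking it into the image:

    # Build the image without the training data inside it
    docker build -t myapp:latest .
    # At run time, mount the dataset from the host into the container (read-only)
    docker run --rm -v /path/to/dataset:/data:ro myapp:latest \
        python train.py --data-dir /data

The same pattern carries over to the HPC container runtimes discussed later, which have their own bind-mount options.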
The next thing I want to talk about is OpenHPC. It's an open source HPC software stack under the Linux Foundation; that's the mission statement. OpenHPC supports container technologies such as Singularity and Charliecloud, which are used there, and we had a tutorial at SC20; I'll provide links at the end of the presentation.

This is the simple workflow which we use to create containers at LRZ: we create a Docker image, we modify it to make it HPC specific, and we copy in the instruction and execution scripts. The first step is that we try to do all of this inside a Dockerfile rather than building the container interactively, and the reason is that you then have the recipe in front of you: you don't have to go back and ask what you did six or twelve months ago to create that container. Then, after you've created your Docker image, test it to make sure it works, convert your Docker image to the HPC container image of your choice, test again to verify it works, and copy it to the HPC system. If the system has a module system with the container technology available as a module,
you load the module and then execute it via Slurm. For the system administrators on the HPC system, a lot of the time it will look exactly like a traditional MPI job if you're running MPI software.
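As a rough sketch of that end-to-end workflow with Charliecloud (the image name, module name, paths and job script are placeholders, and the exact conversion commands depend on your Charliecloud version; newer releases use ch-convert instead of the older ch-builder2tar/ch-tar2dir pair):

    # 1. Build and test the Docker image locally, where you have root
    docker build -t myapp .
    docker run --rm myapp ./run_tests.sh

    # 2. Convert the Docker image to a Charliecloud image and test it again
    ch-builder2tar myapp .            # flatten the Docker image to myapp.tar.gz
    ch-tar2dir myapp.tar.gz /var/tmp  # unpack it to /var/tmp/myapp
    ch-run /var/tmp/myapp -- ./run_tests.sh

    # 3. Copy the image to the HPC system, then on the login node:
    module load charliecloud          # module name is site specific
    sbatch job.slurm                  # the batch script wraps ch-run in srun/mpiexec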
These are the takeaways which, after a lot of pain, we've found to work really well at LRZ. Do all your conversions from Docker and keep a Dockerfile, because generally the HPC specific containers are not portable between each other: you cannot translate from Singularity to Charliecloud, or vice versa, or to Shifter or to any other HPC container technology. But they all support a conversion from Docker, so if you start off with Docker you can move to the HPC container of your choice, or to the container which the HPC system allows, because, for example, at LRZ we do not allow Singularity for policy reasons.
So when people come and say, we have a Singularity container, we ask them, can you create a Charliecloud container? And it's a pain if they've created it directly as a Singularity container and not from Docker, because it's a lot more work for them to redo it.
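For example, starting from the same locally built Docker image, you can produce either format (the image and file names here are illustrative):

    docker build -t myapp .

    # Singularity: build a SIF image from the local Docker daemon
    singularity build myapp.sif docker-daemon://myapp:latest

    # Charliecloud: flatten the same Docker image, as in the workflow above
    ch-builder2tar myapp .
    ch-tar2dir myapp.tar.gz /var/tmp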
You also want to have everything in the Dockerfile so you can change it for different architectures. Obviously, if you've built a container with an x86 binary, it's not going to run on an Arm system, but you can modify the Dockerfile and rebuild it with the specific Arm or NVIDIA libraries.
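One way to keep a single recipe that can be rebuilt for different targets is to parameterize the base image with a build argument; this is just a sketch, the base images and tags are examples, and cross-architecture builds additionally need emulation or docker buildx, which is glossed over here:

    # The Dockerfile begins with:
    #   ARG BASE_IMAGE=ubuntu:20.04
    #   FROM ${BASE_IMAGE}
    # so the same recipe can be rebuilt on a different base image:
    docker build -t myapp:x86 .
    docker build --build-arg BASE_IMAGE=arm64v8/ubuntu:20.04 -t myapp:arm64 .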
Also, test the container workflow at each stage: when you've built your Docker image and again when you've built your specific HPC container, because that saves you a lot of work compared with moving it to the HPC centre and only finding the problems there. And do as much work on your local system or development VM as possible, because you can be root on your own system but you're not going to be root on the HPC system, and it just makes life a lot easier for you.

So these are examples of what we've run at LRZ. To give a bit more background, the reason we started to use containers was to get AI, and specifically TensorFlow,
to run on the system. This was an application from CERN, the 3D GAN model for detecting high energy particles from the colliders. As you can see from the tables, we've scaled up to a single island on SuperMUC-NG, which is 768 nodes, and each node contains 48 Intel Skylake cores. On the next slide you can see the measured performance in petaflops on the small matrix multiplication algorithm, so we're not measuring the full TensorFlow workload but the matrix multiplication, which we'd expect to perform well, and we are getting good results on that side. If you look at the execution line, it says ch-run -b, and this -b allows you to bind directories from the host system to an equivalent directory inside the container. The reason we do this is to allow the container to access and use software from the module system, specifically MPI. The only reason we got such good performance is that we're using the system MPI at runtime in the container
rather than the standard MPICH. With the container's own MPI we were only able to scale up to, I think, 256 nodes, the performance was about four times slower, the scaling efficiency was bad, and it wasn't as stable: it was crashing a lot. But when we used the system MPI, which has been tuned for the HPC system, it worked very, very well.
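A sketch of that kind of launch from inside a Slurm job (the module names, host paths, image location and script are placeholders for whatever your system actually provides, and the MPI inside the container must be ABI compatible with the host MPI you substitute in):

    # Load the host's tuned MPI and the container runtime from the module system
    module load intel-mpi charliecloud

    # -b bind-mounts the host MPI installation into the same path inside the
    # container, so the application picks up the system-tuned MPI at run time
    mpiexec -n $SLURM_NTASKS \
        ch-run -b /opt/intel/mpi:/opt/intel/mpi \
               /var/tmp/tf_image -- python train_3dgan.py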
This is another example, from the EU PROCESS project, where they wanted to run some containerized applications. Initially they created a container and had MPI problems, and the reason was that inside the container they hadn't set the libfabric parameters. They set those and got it working, but the performance was really poor because they were actually running MPI over TCP. To resolve that, we asked them to install the Omni-Path software and configure it to take advantage of the high-speed interconnect, because obviously the HPC centres have spent a lot of money on interconnects
and you want to use them as much as possible. The outcome was that the instability issues were no longer observed, and they were able to run their application on hundreds of nodes of the Skylake system with up to 8,000 MPI ranks.
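As a hedged sketch of the kind of setting involved (the exact provider and variables depend on your MPI library and fabric stack; these are libfabric and Intel MPI examples, and the image and application names are placeholders):

    export FI_PROVIDER=psm2        # libfabric provider for Omni-Path, not TCP sockets
    export I_MPI_FABRICS=shm:ofi   # Intel MPI: shared memory on-node, OFI between nodes
    mpiexec -n $SLURM_NTASKS ch-run /var/tmp/app_image -- ./app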
Another example we came across at LRZ was a set of researchers who wanted to do fuzz testing on the HPC system, and they caused all sorts of problems. When they tried to run it they were bringing down the parallel file system, because they were creating up to 6 million very, very small files in a very short amount of time, and it basically brought down the entire data centre's infrastructure. So initially we said, okay, can we switch to a higher performance directory inside the HPC centre's file system by using a bind mount, so they could write to the work file system instead of running in home. It still crashed. Then we asked how big the files they were creating were, and they were really, really small and constantly being created and destroyed, so they could actually live in a RAM disk. So we mounted a RAM disk inside the container
to store these temporary files, and the outcome was that the application no longer crashed the HPC centre, which was good for everybody. That's the sort of thing you need to be careful of when you're running on these HPC systems.
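A minimal sketch of that fix, assuming a Charliecloud version whose -b option accepts a source:destination pair and using /dev/shm (a RAM-backed tmpfs on Linux) as the scratch space; the image path and fuzzing script are placeholders:

    # Bind the host's RAM-backed tmpfs over the directory the fuzzer writes to,
    # so the millions of tiny temporary files never touch the parallel file system
    ch-run -b /dev/shm:/tmp /var/tmp/fuzz_image -- ./run_fuzzer.sh --tmpdir /tmp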
Another example we've been working on is a development with ICHEC, the HPC centre in Ireland, on a quantum computing software package called QuantEx, and they are developing all their software in Julia. One of the initial things we found out was that Julia installs its packages under .julia in the user's home directory. However, with the HPC container software we use, which is Charliecloud, the default is to map the user's home directory on the host into the home directory in the container, so when you go to the home directory inside your container, you see your home directory on the host. That caused a huge amount of problems, because the packages were no longer found. So we went in and changed the environment, and you can also set the environment when you start the Charliecloud container. In the ch-run -b example you saw earlier, there's another parameter called --set-env, and you can point it at a file containing the environment settings that was created when you built the Docker image. The default behaviour of Charliecloud is to take the host's environment, but we overcame that by using the --set-env option
and setting it to the file with the Docker environment.
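A sketch of that invocation; in recent Charliecloud releases the Docker image's environment is typically stored in a file at /ch/environment inside the unpacked image, and the image path and Julia script here are placeholders:

    # Apply the environment file recorded when the Docker image was built,
    # overriding the host values that were hiding Julia's packages
    ch-run --set-env=/var/tmp/quantex_image/ch/environment \
           /var/tmp/quantex_image -- julia run_simulation.jl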
We also wanted to profile the application using LIKWID, and again we used the -b bind option to the host system to get the modules, and we were able to profile the Julia application with LIKWID inside the container. That's been really useful for us. It's also going to be useful where you want to run, say, profiling software which is available on the HPC system
and which either has a licence that the HPC system holds and you don't, or is software which you simply don't want to install in your container. It allows you to create a smaller, more minimal container: you don't need to install software inside your container if it's available on the host system, and that reduces the size of your container.
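For instance (the module name, paths, performance group and image are purely illustrative), the host's profiling tool can be bind-mounted into the container instead of being installed in the image; in practice the tool's own library dependencies also have to be visible inside the container:

    module load likwid                      # host-side installation
    # Mount the host's LIKWID installation into the container and run the
    # containerized Julia application under the host profiler
    ch-run -b /apps/likwid:/apps/likwid /var/tmp/quantex_image -- \
        /apps/likwid/bin/likwid-perfctr -g FLOPS_DP julia run_simulation.jl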
The next part is about how this has been used. At Supercomputing 2020 we gave a virtual tutorial using the AWS cloud infrastructure. There's a link to the OpenHPC GitHub, and we have all the tutorials on there, including how to run containers. So if you want to run just the container section, or if you want to set up an open source HPC software stack in the cloud, you can go through the entire tutorial; it will take you about four hours.
The next link is to Charliecloud and how to use that. This is the software we use at LRZ, because that's our policy. We also know that other HPC centres use Singularity, which, I admit, is the most popular HPC container software, but if your HPC system doesn't allow it, you need to use a different one. So we try to make these workflows not tied to any particular HPC container technology. You want to be able to use different HPC containers, because you don't want to move to an HPC centre and have to change your entire workflow because you've written it for a Singularity container, a Charliecloud container, a Shifter container, or some other workflow which requires a particular container. We want to keep it as generic as possible. It's a little bit more work at the start, but it saves you a lot more time when you have to run on different HPC systems. And again, we want to use Dockerfiles because that gives you the recipe: you don't have to remember what you did to install everything to get things running. You can look at the recipe in the Dockerfile and say, okay, I'm not using OpenBLAS, I need to use cuBLAS, or I need to use the Arm equivalent of those libraries, and you can go in, change them and rebuild. Then obviously you need to test it before you start running on the production system, so hopefully the HPC centres have a test system or VM with their architecture to test this on.
There's another example of running the OpenHPC software stack inside a container using Podman. Again, there are examples of this on the GitHub; it starts from a Docker image and runs that as well. That's also starting to gain traction with users.
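A minimal sketch of that Podman usage (the image name is a placeholder; the OpenHPC GitHub has the actual recipes):

    # Build from the same Dockerfile recipe and run it rootless with Podman
    podman build -t ohpc-node .
    podman run --rm -it ohpc-node /bin/bash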
Some people say, okay, we'll install that. Obviously HPC centres are a lot more conservative about what they'll install, but maybe eventually they'll start using that technology; at the moment you're probably not going to find an HPC centre using it.

The OpenHPC community, which is an open source software community, is open to everyone to join and get involved in. If you're an organization in academia or a government lab, it's free to join; there's a cost associated if you're a company, and the contact is Neal Caidin, shown below. It's a good community, the mailing lists are actively used, and people are using this software on real HPC systems. The system at LRZ, SuperMUC-NG, is partly based on OpenHPC, only a small amount, but there are other systems in the top 50, and I think even in the top 20, which are entirely OpenHPC software systems. So you are already seeing OpenHPC and open source software being used on real world, large scale HPC systems which people can use. I guess now I'll just open it up to questions.