HPCBIOS: Getting Your Software, Users & Documentation in Sync
Formal Metadata
Number of Parts | 199
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/32545 (DOI)
FOSDEM 2014 | 186 / 199
Transcript: English (auto-generated)
00:00
and synchronization, and it's a methodology for matching expectations between different entities, let's say. I am this guy. This email may be given his living reputation. I am signed, but he will just give lax and locality. And at the end of this month, I will start a freelancing activity.
00:26
So this is the index of the presentation. I'll just go through the slides. So what is HPCBIOS? It is a network concerning the users
00:42
have been retrieved from the task of computational platforms, be it HPC, grid, cloud, and so on and so forth, in a uniform and painless manner, as much as that is technically feasible. Some things are possible, some others not. Here we care about the things that are possible. It is defined at three levels.
01:03
One is structured documentation. The second is policy information; there is this concept of bundles that we are going to see. And then there is wrapping code to satisfy the above and provide the build sets, which is another concept that we are going to visit.
01:21
I hope you managed to pay attention to the previous two presentations about modules and EasyBuild, because they are kind of prerequisites for what's being presented here. And at present, HPCBIOS is apparently more of open source documentation, because there is a big synergy with EasyBuild. EasyBuild provides the language
01:40
for describing many of the things that we see here. And so the resulting easyconfigs, for example, end up as part of the EasyBuild repository. And the advantage of this that I like, like Taylor, is that we end up writing this in Python rather than shell code,
02:01
which is what I would be doing if EasyBuild were not there. So who may care about HPCBIOS? It could be users involved in scientific computing, and experts and sysadmins involved in the support of scientific codes. I expect these people to come to understand faster
02:21
what this HPCBIOS system is about. And then we have developers and users of EasyBuild, which is the right tool of choice for the objectives that concern HPC and software development. So here's an interesting histogram from a survey.
02:42
That was done maybe a year before I even met EasyBuild. We were polling a group of HPC users from a number of countries about the foundation software libraries that they depend on. And we came up with this kind of histogram,
03:00
and you can see here some names that you have probably met elsewhere. And the interesting stuff about it, and how it connects with the previous presentation, is that you will quickly notice that a good bunch of these parts of the histogram relate to this toolchain
03:21
that you can find in EasyBuild called goalf. And if you check there, this is LAPACK, this is FFTW, this is ATLAS. And you have a few more linear algebra-related pieces that have been covered here: BLACS, ScaLAPACK, elephant, and so on.
03:43
And there is also an alternative toolchain, goolf, which is basically the same as this, but without this piece of software, the ATLAS component; it instead has OpenBLAS. That's what this O stands for.
04:02
And for people that haven't heard of OpenBLAS, it is in effect a replacement of GotoBLAS. Here imagine that it is better than what I prefer to hear what people comment about. So the summary is that we have some user needs
04:20
that on the substrate level can be served quite well from two toolchains. And interestingly, this part here corresponds to the Intel vendor-provided set of libraries for math operations that can replace these toolchains.
04:42
I kind of simplified my talk because I didn't mention anything about compilers and MPIs that are part of the toolchains. But the bottom line is that you can have selections of components that can replace each other and, on top of them, build the applications that you really care about, like for example this one, which is the GNU Scientific Library.
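As a rough illustration of that interchangeability, the sketch below writes the parameters of a GSL build in the Python syntax that easyconfigs use and derives one build variant per toolchain. The software version, the toolchain versions and the `module_name` helper are illustrative assumptions, not taken from the talk; the Intel-based option is left out because the talk does not name it.

```python
# Sketch of easyconfig-style parameters for GSL (plain Python assignments).
# All version strings below are illustrative assumptions.
easyblock = 'ConfigureMake'   # generic configure / make / make install procedure

name = 'GSL'
version = '1.16'              # assumed version, for illustration only

homepage = 'https://www.gnu.org/software/gsl/'
description = "The GNU Scientific Library (GSL) is a numerical library for C and C++."

sources = ['gsl-%s.tar.gz' % version]
source_urls = ['https://ftp.gnu.org/gnu/gsl/']

# The toolchain is the swappable part: goalf ships ATLAS, goolf ships OpenBLAS.
toolchain = {'name': 'goolf', 'version': '1.4.10'}  # assumed toolchain version

moduleclass = 'numlib'


def module_name(name, version, toolchain):
    """Hypothetical helper: name/version-toolchain label for one build variant."""
    return '%s/%s-%s-%s' % (name, version, toolchain['name'], toolchain['version'])


# Same software, different substrate: one build variant per toolchain.
candidate_toolchains = [
    {'name': 'goalf', 'version': '1.1.0'},   # assumed version
    {'name': 'goolf', 'version': '1.4.10'},  # assumed version
]
variants = [module_name(name, version, tc) for tc in candidate_toolchains]

for variant in variants:
    print(variant)
```

Only the `toolchain` entry changes between variants; everything else about the build description stays the same, which is what makes it practical to serve users who care about different substrates.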
05:01
So you could build GSL with any of these three and, if you have a large enough user base, you will have people that care about all three different variations of this. Before I leave this slide, I will just mention that everything that you see here, as you can read,
05:20
is built and provided automatically, which is a way to demonstrate that there is good progress with the substrate. And our common struggling blocks for users are basically the following.
05:42
We have a need for common tools for handling software, from the very classic ones like tar, compressors, Autoconf, Automake, Bison, CMake, whatever; we have seen this many times. Then we need the libraries and the software that is common for math operations, and other foundational libraries in general.
06:02
It could be for graph handling, for instance. Then we often have common needs for software popular with entities, like R, Python, components of these languages and so on.
06:22
And finally, on an HPC platform, it's an advantage to have not a single version of a given software component, but multiple, because that can help very much with differential debugging. So I think that most people who have been playing with software at some point find out
06:42
that having two or three versions of a given package is a treasure, and that's because you can hit a bug that is apparent in only one of them, and then you can go via the present button to place pause. And one more point that regards
07:01
the context and motivation of this. About two years ago, I landed in Luxembourg, in the middle of the bioinformatics community, and I realized that all these people were actually providing little pieces of all these tools, without really much central work or information,
07:22
and there was not a further lack of work. Eventually we can provide this as modules, but then you arrive at another problem. These are three times 20, that is 60 pieces. If you really want to use modules, you load all these things individually. That's not really a great way to manage a scientific environment.
07:40
So maybe we can do something by packing these things together, so that we can automate a lot of this in a different quality of the world. So the objectives that we can think of: first of all, we have to organize the software environment at different levels,
08:02
at the level of the system, which is what most sysadmins and scientists care about, and the level of the group, because users collaborate in projects, and they often have common needs, and they don't want to build a particular piece of software 10 times because there are 10 users,
08:21
they don't necessarily customize the build, and then the level of the individual users. And actually that is feasible by employing modules, because modules allow you to have this variable called MODULEPATH that can have many components, and as soon as you have many pieces in this MODULEPATH, you can actually implement this kind of organization.
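A minimal sketch of that layered organization, assuming three module trees (system, group, user). The directory paths and the `build_modulepath` helper are hypothetical; only the `MODULEPATH` variable itself comes from the talk.

```python
import os

# Hypothetical module trees, one per level of organization.
SYSTEM_MODULES = '/opt/apps/modules/all'
GROUP_MODULES = '/work/projects/mygroup/modules/all'
USER_MODULES = os.path.join(os.path.expanduser('~'), '.local', 'modules', 'all')


def build_modulepath(*levels):
    """Join module directories into a MODULEPATH value, dropping duplicates."""
    seen = []
    for path in levels:
        if path and path not in seen:
            seen.append(path)
    return os.pathsep.join(seen)


# Most specific level first (an assumed choice: user, then group, then system).
modulepath = build_modulepath(USER_MODULES, GROUP_MODULES, SYSTEM_MODULES)

# This only affects the current process and the processes it starts.
os.environ['MODULEPATH'] = modulepath
print(modulepath)
```

Each level keeps its own tree, so a group can add builds for a project without touching the system-wide tree, and a single user can do the same on top of both.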
08:45
Currently what we have in production in Luxembourg, and it works relatively well, is the combination of this and this. This is still possible if users are willing to implement it, but they haven't seen it yet, and get interaction. Now, as regards the usability of software builds,
09:03
this could be feasible by using this event. And there is an apparent conflict between agility, moving versions forward, and stability and backward compatibility, so there is a challenge there: how do we handle that?
09:21
The way we are organized in Luxembourg is to only change the default versions when we pass through a maintenance window, so that users don't get surprised and their jobs don't break in the middle of a run. But once we pass a maintenance window, then we have all the incentives
09:41
to promote the software versions. So how can you let users have both agility and stability? We will see a concept of build sets that will allow you to perform a rollback, and even a roll forward, so I might even have a default version right now.
10:01
I will have a maintenance window in two months, but I can still provide to my users, in advance, the build set that will become the default, so that they can test it when they need to. Now, HPCBIOS is meant to improve the software environment. It's used to minimize the time that people spend with individual sites' system configuration.
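The build set idea described above can be sketched as follows, under stated assumptions: the build set names, the dates and both helper functions are made up for illustration and are not HPCBIOS code. The default only moves at a maintenance window, while an explicit selection can point at an older set (rollback) or at one that has not been promoted yet (roll forward).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BuildSet:
    name: str            # hypothetical label, e.g. a release tag
    default_from: date   # maintenance window at which it becomes the default


# Hypothetical build sets; names and dates are made up for illustration.
BUILD_SETS = [
    BuildSet('buildset-2013a', date(2013, 3, 1)),
    BuildSet('buildset-2013b', date(2013, 9, 1)),
    BuildSet('buildset-2014a', date(2014, 4, 1)),  # window still ahead in Feb 2014
]


def default_build_set(today, build_sets=BUILD_SETS):
    """Return the newest build set whose maintenance window has already passed."""
    promoted = [b for b in build_sets if b.default_from <= today]
    return max(promoted, key=lambda b: b.default_from)


def select_build_set(name, build_sets=BUILD_SETS):
    """Explicit choice by the user: works for older and for upcoming sets."""
    for b in build_sets:
        if b.name == name:
            return b
    raise KeyError(name)


today = date(2014, 2, 1)
print(default_build_set(today).name)             # stable between windows
print(select_build_set('buildset-2013a').name)   # rollback
print(select_build_set('buildset-2014a').name)   # roll forward, ahead of the window
```

Keeping the default tied to a date rather than to "the newest thing built" is the design choice that gives stability; letting users name a set explicitly is what gives agility.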
10:23
The most well-served domains currently are these: math, bioinformatics, life sciences, but more scientific domains are considered. It's about standardization and producing user experience across sites, systems, and so on. It could even be applied in a lab.
10:42
The ingredients, as we said, are these. Most of these will be built, and we'll see now the concept of build sets and bundles. Sometimes I use the word policy for the word bundle; keep that in mind so that you understand the context. Let's have a look at this.
11:01
So I ask for modules that have this kind of pattern, and I get this list. I think it's important to realize straight away on this slide that these are build sets, collections of software, and the section that you see up here is just a part, a small part of,
11:24
basically, I think that this slide, I see this one. It's a by-category view of the builds of this particular set. And what is going on here? These are different, big collections of builds
11:40
and modules that have happened over the last year. And inside them, we have many, many modules that can be collectively organized in this package. For example, if I have a user that asks me, give me back the software that we provided last year on mathematics, he can very easily do a module load on this one
12:03
and this one, and he will get the software that runs the mathematics of that time. We currently have a nominal lifetime of 12 months for these build sets, but how long you keep them is a site-specific choice. What matters is that people can go back,
12:21
can do a rollback, or can even go forward, regardless of what your default build set is at a given moment in time. And they can select the same environment here. So the two fields you take from this slide, this one too, and let's see how this correlates
12:43
with the scientific activity. Open source math library, it's there, it's five hundredths per month. It's this collection. It's this set of popular libraries that we showed with arrows, the scripts that we brought.
13:03
Then we have policies about the test-specific rules, performance testing for file and tools, and so on. You may wonder where these numbers come from. As we'll see in the next slide, the same effort has occurred elsewhere, in the US, between six Department of Defense HPC sites, and they have defined
13:21
these kinds of policies and objectives, and they assigned some numbers. So because of backward compatibility, let's say, we have these passing numbers over there. So this is exactly what I described before.
13:40
There is this concept of baseline configuration. Let's have a look at this, because it's easier to understand by seeing the page-specific sites. The user base is shared between them,
14:01
and they want to present what features they give to the users in a structured manner, so that the user can know in advance where he has a good chance of running his software before applying for access. And here you see, on these sub-topics, multiple version software, scratch space,
14:21
and then some time, care about those in the cloud, and barium, most of the names, looking cells. I will go briefly, as you told, the buyers, users notices, common change control robots, and the iOS suite. And so, there are a few things that we want to talk about as approachable sites.
14:41
One of them is that when you have this, then the users can quickly know exactly in which place they have a good chance of getting their job done, and you don't get support tickets for a long time. And the same idea has been replicated
15:00
with domain-specific policies, over here. The numbers that you see in this table are, let's say, backward compatible with the list that we showed before, while these ones are new policies that have been invented. The particular way of delivering is to describe here.
15:22
There are a few more, about bioinformatics, JPGs, and so on. They are in development. The concept, actually, is quite common when you have collaborations. For example, if you have heard of PRACE,
15:41
DEISA, they have defined a common production environment. But it is not always documented all that much on a per-site basis. And here is a network to go through all this. How do we contribute back?
16:01
This type of software called Kerpsin-type objectives, the answer is modules, EasyBuild; these are the things that you have to look at. And then if you manage to automate some of the objectives, then you should turn it into a component
16:22
of HPCBIOS, and this sits on GitHub. So a classic pull request is the way to handle this. And things are relevant for HPCBIOS when they are applicable across two sites. So if you have rough consensus
16:40
and something that the people at two sites believe is a common denominator about how to organize the environment, I think that could be a good candidate for this collection. Of course, different people have different needs. And here are the challenges. It's actually, I know very well, which are policy objectives matter.
17:03
At the moment, we are actually abusing EasyBuild toolchains for what are the bundles. That has a side effect: it constrains functionality to homogeneous targets. For instance, if you want to bring up debuggers, you may have a difficulty if you want both Intel- and GCC-compiled debuggers.
17:23
There are some constraints that have to be taken into account. This approach is more of a hack, and it may have to be revisited. Then another challenge that is more of daily practice, when you have a site with some nodes that should be used,
17:41
for example, some nodes without, or other architectural features, you can see some there, if you want to have the shortest makespan, the shortest amount of time when you release a new build set, how do you schedule this optimally? At the moment, I do this manually. We may have tools eventually to get this done quickly.
18:05
There are also other things that have to be taken into account. Actually, this relates to HPC, regardless of whether you care about HPCBIOS specifically. If you want to build ATLAS, for instance, that will do some kind of auto-tuning.
18:22
So in that case, you will care about how you run your build, because you may end up in a non-optimal configuration. Finally, there are some discrepancies about how the ordering of modules is done. LooseVersion, modules and Lmod
18:42
have a different understanding of which version is the bigger one. And this can lead to some surprises for the user when they load the default. Most of the responses are posted in the label. By the label, it's not exactly what people expect all the time. I'm on the last slide, let's say.
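To see how two ordering rules can disagree about the highest version, here is a small sketch. It does not reproduce the algorithm of any particular tool mentioned in the talk; both key functions and the version strings are assumptions chosen to show the effect.

```python
import re


def lexicographic_key(version):
    """Plain string comparison: '1.9' sorts after '1.10'."""
    return version


def numeric_aware_key(version):
    """Split into digit and letter runs, comparing digit runs as integers."""
    parts = re.findall(r'\d+|[A-Za-z]+', version)
    # Tag each part so integers and strings are never compared with each other.
    return [(0, int(p)) if p.isdigit() else (1, p) for p in parts]


versions = ['1.8', '1.9', '1.10']   # hypothetical versions of one module

print(max(versions, key=lexicographic_key))   # highest under string ordering
print(max(versions, key=numeric_aware_key))   # highest under numeric-aware ordering
```

If the default module is simply "the highest version", two tools applying different rules like these would pick different defaults from the same directory, which is the kind of surprise described above.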
19:03
So, there is a list of references, some activity on GitHub, how to use it. Basically, you download this, you can try this with EasyBuild. Theoretically, you get much of what you saw in that slide built in one shot. Of course, it takes some time, but that's the way you do it.
19:22
Read the HPCBIOS objectives, and process them, and so on. That's it. I'm not done. This HPCBIOS handful, it's a historic, let's say.
19:55
Bank of Madison, so the bar goes.