A Method for Distributing Applications Independent from the Distro
Formal Metadata
Number of Parts | 199
License | CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/32678 (DOI)
Transcript: English (auto-generated)
00:31
Smarter than me. There we go. I wonder why I didn't echo. So, as I said, I was a camp counselor for a long time, so
00:40
big parade fields, etc. And what I do is I work for Red Hat. I'm this kind of new thing called a developer advocate or a developer evangelist. And so I've spent about 15 years mostly as a software consultant. So a lot of different projects, mostly kind of big enterprise sort of things.
01:02
And now what I do is I talk to developers who use Red Hat products, in particular Enterprise Linux, talk to them about what they hate about it, and then I go and talk to engineering about how we can fix that stuff. So that's mostly what I do. And that's a little bit about me.
01:22
So this talk is actually it's kind of essentially about this concept called software collections, but it also, I come definitely from the developer background rather than coming from a packager's background. So I think I have a lot of biases towards being a developer and not needing things like QA or
01:40
any of that, any release stuff. We'll just send it out there and everything will be fine. No security updates, etc. So just a little bit on me. So here's a little bit of my argument. Over the last, I don't know, whatever, 15 years, Linux distributions,
02:02
they kind of started in this kind of world where the open source community was relatively small. Even proprietary software and open source software, neither one could you trust basically at all. It was all run by developers
02:21
who didn't really know or care what they were doing, so things like security updates would never happen, QA if you were lucky, etc. So early on I argue that the distributions kind of formed to try to basically solve some of that ease of use problem and kind of redistribution problem
02:41
around security patches and software and kind of things like that so that you knew where you could get stuff from and you could also redistribute things so that they would stay secure. It was also kind of over the arguably earlier than that, but over the same 15 years or so
03:00
is where kind of the dynamic linking concept really took off and caught on. There were definitely pockets before that, but it wasn't until I don't know, like 96, 97, where it was like it started to become the way to do things. So, and then the last thing was actually license assurance.
03:21
This was much less of a problem with proprietary software, obviously, although in fact they were hugely violating licenses anyway, but you couldn't tell, but with open source it was a big concern, right? So you wanted to make sure, particularly as open source started to be adopted by the enterprise, that you knew what kind of licenses you were getting into.
03:41
Also, feel free to interrupt me, so you can tell me I'm crazy. If it's just blanket crazy though, let's wait until the end. So, particularly now in the last maybe five years, there's been and I couldn't find a good graph for this, I've actually seen one before, I was looking at I was talking about it with somebody earlier, that kind of just the package count in Fedora
04:03
each version, it's just scaling like crazy. Obviously in something like Enterprise Linux, we have significantly less packages, but even that, it's still a nice strong curve. And so, between that, now we also have a massive plethora of actual Linux distributions
04:23
that are reasonably popular, so it's not three big ones and then everybody else. There's a good chunk at the top. And then Linux in general is also becoming hugely adopted, particularly in the Enterprise, kind of in server rooms in particular, but then even to some extent in the Enterprise
04:43
on desktops. I don't think we're ever going to see the year of the desktop, but it is in some places. I think it's largely going to be the only people who use kind of desktops, and by desktops I mean laptop half the time, are going to be people like you guys. It's going to be just developers, just people who
05:03
actually make software, who use these things at all. Everything else is going to be tablets and that kind of tool chain. So, we really need to focus on making sure that we can distribute things well. So, here's the extent of the math I can do anymore. So we have a whole lot of projects
05:24
and then we have versions of each project, then we have it across distributions and then we have versions of the distributions. So now we've just made the thing massively wide. So, kind of typically, we just do this kind of
05:42
take a tarball, right, and then kind of have to figure out the distro's packaging rules and then kind of figure out how to build the thing itself, and then start to put that into a package. We've got to follow these rules and make sure it goes in the right places, etc. Then on top of that, inside an individual organization, there may be kind of a second set
06:03
or third or fifth or forty-seventh set of packaging rules within that organization. So you kind of have to solve the problem over and over again. So, what would be a lot nicer is if we could kind of say, okay, I have this package, and if you saw I don't know how much of Donny's talk covered this earlier, I know he's covered it in the past,
06:25
but whatever that means, so a package, but essentially, if we had a blob of thing that we could run, say, on any distro, that was OS-independent version-wise as well, so you could kind of go this way for the package
06:40
or you could drop on each one, or you could go this way, so version two, version three, version four of the OS. And then the other thing is can you also start to trust, essentially, the developers to have packaging rules that make sense for their
07:00
application, rather than it being a distribution problem. So, lots and lots and lots of people have tried to solve this problem. Here are whatever, six, five that I thought of off the top of my head, and ones that I could find random, easy links for. Actually, I had Linux apps in there too, but I couldn't find a good place to
07:23
point people at off the top of my head. So, some of them solve the exact same problem, some of them kind of take it from a slightly different perspective, so that's what I think is kind of interesting, is to every piece of software anybody writes has trade-offs. And these people clearly were taking
07:42
different trade-offs, right? So, kind of the environment modules, SEP, and to some extent Gentoo slots kind of all take roughly the same set of trade-offs, very similar actually to software collections, where it's kind of like I want to have multiple versions of the same thing installed on the OS at a time, and I want to rely on the user to
08:02
choose between them when they're running applications. OSTree, a relatively new thing that is really actually about, it's really about supporting arguably supporting debugging, so you can take an entire set of binaries and kind of swap them out, in and out,
08:21
and it's for, you know, doing debugging of GNOME, right? So it's not just one application. Yeah? Using OSTree? Yeah. So, mind you, GNOME I think has three or four different ones that all try
08:41
to accomplish this goal. There's Glick 2, Glick, another one too. Then you have update-alternatives, which is also trying to solve the same problem, but in a very different way, which is I want to be able to switch for the entire OS what the version of something is. So instead of per application, which is what these guys do,
09:01
it's actually for the whole OS, right? And that can kind of cause some problems, particularly with something like Python, right? So you switch the version of Python in your OS on Linux, right, and bad things are going to happen. And like I said, there's lots and lots more. So taking some of that thought, right, and taking
09:22
some of the, so Red Hat has a little bit of a problem where particularly for Enterprise Linux, a lot of our customers want to use whatever version of Enterprise Linux that they're using forever. And I mean forever, right? Because, well, one of the reasons is because
09:42
the application that they're running on it is working just fine, right? So why would they want to invest, normally what you have to do is you have to invest development effort, QA effort, release engineering effort into modifying the application to be able to run on some future version of the operating system. If it's a junk application that just
10:03
does what it does, why would you make that investment? So as a result, they instead yell at companies like ours, right, and lots of other ones. Microsoft has this problem a huge way as well, as you all have, I'm sure, seen with their big EOL announcements. Every time they announce an EOL, you know, then it'll, I think it's like five times they EOL it, and then it'll actually die.
10:24
So they have this huge problem as well. They do some other interesting things. Look up the Civilization fix, you know, the video game Civilization, the fix they did in Windows sometime. So what a software collection tries to do is let's, instead of
10:43
having the application kind of natively touch the OS itself, let's put a layer in between it, put a layer of indirection in there, so that we can now take this blob that's in between, it might be Ruby, right, or it might be kind of a full stack, so it might be a web server plus Rails and Ruby and everything else, but the idea is
11:03
there is a blob that you depend on that's kind of above generally above the actual OS and below your application. So that's kind of what a software collection is trying to do, is saying, okay, let's identify that blob and let's try to make that blob portable. So now what you can do is you can say, okay, I have this janky old
11:23
Ruby app, let's say Puppet, that is running on Ruby 1.8, right? Why is it running on 1.8? The way I hear it is because Enterprise Linux 5 runs Ruby 1.8, and why is Enterprise Linux 5 running 1.8? Because Puppet runs
11:40
so... But any which way, the point is you take any old application or older application and what you can do is you can kind of say, here's this blob, I'm going to guarantee the API on the top of it, and then I'll change how it delivers that API underneath. So that it can be portable across multiple versions of the OS. You also have the problem the other way, right? So there's
12:03
a lot of push for people to move to Python 3, however not all the distributions are ready to make all the changes to be able to move to Python 3. So how do you start writing applications in Python 3 right now without the, you know, kind of wrecking your Linux? Well, one of the ways you can do it is with a software collection.
12:24
So it's kind of, you know, by the two examples I'm kind of giving, you can kind of see it's OS independent. The OS can continue on its merry way, and your application can decide to upgrade when it's ready. Or you can be aggressive and say, hey, I'm writing something bleeding edge
12:42
brand new, and I know I want to use the most cutting edge version of whatever, because I'm not going to change it again for ten years. So, let me get that right now and use it right now to write my application. So, kind of going into it a little bit more
13:00
so, it really is, as it says it's kind of defining a process a little bit more than a product, you know, so it's kind of a conceptual thing. You know, there is an implementation, but that's kind of what it does. The other big thing that it does is it packages outside of the normal kind of Linux file system. And that's sometimes a problem
13:21
for some of the people, I think, in the room. It allows multiple versions to be co-installed. And then per application you can essentially start, or you can say, this application gets this blob, rather than that blob. So, the way
13:40
we kind of defined it is it installs under /opt with, you know, the L acronym that is the Linux naming that I can't actually say. You go register for a Linux name, so that you can get kind of a short code there, and then you put your application name there. And that's kind of where it installs, and then you have kind of some scriptlets that
14:04
know where that is and can provide that information to the application. And the way it basically does it is by fancy pathing. So, the application doesn't have to know, this is the other, sorry, I didn't mention this part. One of the big differences between the ones that I was talking about earlier and this one, is that the application doesn't
14:23
have to know that it's not the base OS version. Which is, I think, hugely important, because you don't want to have to rely on your average developer to figure out where it's going to be. And, again, to allow for that OS independence, someday all the Linux distributions will be on Python 3, and that will be
14:43
what's at /usr/bin/python. So, what you want to do is be able to write the application with, say, /usr/bin/python at the top, and then it just goes to the right place, rather than having to, you know, do fancy symlinks, or make a mistake and say, or have /usr/bin/python3, and then find out now you have to do engineering
15:03
effort to port it. So, in a little more detail, here is kind of the way the one for Ruby works, that we have built. So, here's kind of regular
15:20
Ruby, right? You know, assuming that you installed Ruby, the other nice thing about this is that you don't actually have to have, I can't remember if Ruby is installed by default, but you don't have to have Ruby installed except for this particular application. So, in other words, there doesn't have to be anything at /usr/bin/ruby. It can just be down here. So, I think it's
15:40
pretty obvious, but basically here's the normal path, right? And here is our special path. You know, we ship a Ruby 1.9.3 variant, you know, one of these blobs, it sits under rh because that's our name, and that's pretty much all there is to it. This little script here is what does all of the magic, such as it is,
16:01
which it basically kind of provides those paths so that the command to run these, which I think is down here, so scl enable ruby193 bash, so what you do is you kind of say, here's the command, right? scl, I want to turn it on, and then which one do I want? ruby193. And then what do I want to execute with it?
16:20
So, if you just want to get a new shell back where Ruby 1.9.3 is available, you just run bash. But it could just as easily be some app, right? So, if you're a sysadmin, it would be like an app, and then you put this in a shell script or a start script or a something script. For a developer, he's going to
16:40
drop in a bash most of the time. You can add this to your .profile, and so it's just always there if you want as well. Then we talk about you know, there's basically some tools out there that kind of help make this stuff work, and so there's some links.
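The "fancy pathing" described above can be illustrated without a real collection installed. What follows is a minimal sketch, not the actual enable script: it builds a scratch directory with two stand-in `ruby` executables, one in a fake system location and one in a fake collection location, and shows that putting the collection's directory first on the search path changes which one is found. The directory layout, the version strings, and the helper names `make_fake_binary` and `enabled_path` are all assumptions made for illustration.

```python
import os
import shutil
import stat
import tempfile


def make_fake_binary(directory, name, version):
    """Create a tiny executable script that stands in for a real interpreter."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as handle:
        handle.write(f"#!/bin/sh\necho {version}\n")
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


def enabled_path(collection_bin, base_path):
    """Return a search path with the collection's bin directory listed first."""
    return os.pathsep.join([collection_bin, base_path])


# Hypothetical layout inside a scratch directory: a "normal" location and a
# "special" location under opt/<provider>/<collection>.
scratch = tempfile.mkdtemp()
system_bin = os.path.join(scratch, "usr", "bin")
collection_bin = os.path.join(scratch, "opt", "rh", "ruby193", "root", "usr", "bin")

system_ruby = make_fake_binary(system_bin, "ruby", "1.8.7")
collection_ruby = make_fake_binary(collection_bin, "ruby", "1.9.3")

# Look up "ruby" first with only the system directory, then with the
# collection directory placed in front of it.
before = shutil.which("ruby", path=system_bin)
after = shutil.which("ruby", path=enabled_path(collection_bin, system_bin))

print("without the collection:", before)
print("with the collection:   ", after)
```

The application only ever asks for `ruby`; which file that resolves to depends entirely on the order of the search path, which is why the application does not need to know it is running against a collection.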
17:02
I think all these slides are available somewhere. Joe would know better than me. I will email them to Joe. He will make them go. This is the ones that we've done already. Actually, it's some of the ones we've done already. We actually have some that are out there for like HTTPD
17:21
2.4. I can't remember what else is up there that's just kind of public. There are some other distributions that are related to ours that also have all of these. There's a guy I think who's talking next, too, who had a hand in those. So,
17:41
I think they're well represented now actually in CentOS, Scientific Linux, and Fedora. So Fedora just approved them, whatever it was, last week? Or was it the week before? I can always pray. It's getting there. There is some of this stuff
18:03
built for Fedora, but you can't get it in the main line distro repos yet. So, that's basically most of my slides. I was expecting more interruptions. I invited hecklers, and they're not delivering.
18:23
Exactly. That's true, that's true. So, here's kind of my argument. I'm probably crazy, but I think vendors in general are a lot more trustworthy. When I say vendor, I mean proprietary software, I mean open source projects,
18:41
I mean everything. Any kind of application. I think the distribution doesn't have to take on as much of a responsibility for making sure that all the stuff in the world runs. Which is not a high bar. Let's say are reasonably
19:02
trustworthy, so they're even more than that. I think a lot of this stuff, most people understand, most people who do this kind of work, they understand what's important, how to break their things up. Yeah, they're still false, but I think a lot of them are starting to get it. There's well known
19:25
libraries for particularly security related things. So, people mostly use those libraries. I think Bruce Schneier is finally getting through to people and saying stop writing your own encryption libraries. So, same kind of thing. I think in general they're considerably more trustworthy, so
19:44
that I don't necessarily think that every distribution needs to package everything under the sun. I also think another big reason is redistribution. One of the things that software collections has that fails and goes backwards in time a little bit, is a lot of things are considered to be statically linked, because
20:03
the only way you can be portable across OS's is if you carry the bits you need. So that means static linking, sort of. Or at least it means that it's got to be carried in the bundle. So it can still be dynamically linked, however it's got to be dynamically linked from within the software collection. So, one of the things that I was certainly confused about when I first
20:23
looked at this stuff, so if you think about an RPM, one software collection does not equal one RPM. A software collection is more like 70, 1,000 RPMs, so it's broken up just like you would do a normal package, it's just that it's all kind of in this one context to say, hey I'm part of this one software collection.
20:44
So you do, it is better than straight out statically linking, but you still have the problem of redistribution. So one of the reasons that dynamic libraries are nice, is that you only have to copy that one file onto every machine. With a software collection, let's say it's OpenSSL
21:03
and you have 37 software collections on every machine all using OpenSSL, that means you have at least 37 different copies of that SSL RPM. So you need to distribute that all the way through. However, and judging by the overflowing-ness of the config management talk,
21:23
config management is really, has really finally taken you know, I remember the fits and starts, right, I mean there was a huge push for it, I think in around 2001 and just never quite took, but it's really kind of taken now. So I think redistribution, particularly in big data centers
21:43
has gotten a lot easier, so I think that's why we can kind of take this risk. The other thing, the other risk you have is that now you have a copy, you have 37 copies of OpenSSL you know, disk space is not that big a deal anymore so does it really matter that you have lots of different copies of the same
22:04
stuff on one machine. And de-duping file systems someday, I pray, will be default and so it won't matter anyway. The other thing is that I think users are more sophisticated. So I think that they can, they understand the context that they're in,
22:23
and when I say user here, I mean a developer, right, so somebody who's going to consume one of these software collections. They're way more sophisticated about what is going on underneath and can usually figure this stuff out.
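The redistribution trade-off from the OpenSSL example a moment ago comes down to simple multiplication. The sketch below counts how many library files have to be replaced when one library is patched, comparing a single shared copy per machine with one copy carried inside every collection. The figure of 37 collections is from the talk; the fleet size of 1000 machines and the function name `redistribution_cost` are made up for illustration.

```python
def redistribution_cost(collections_per_machine, machines, bundled):
    """Count library copies that must be replaced when one library is patched.

    bundled=True models every collection carrying its own copy;
    bundled=False models one shared system copy per machine.
    """
    copies_per_machine = collections_per_machine if bundled else 1
    return copies_per_machine * machines


# 37 collections per machine is the talk's figure; 1000 machines is hypothetical.
shared = redistribution_cost(37, 1000, bundled=False)
carried = redistribution_cost(37, 1000, bundled=True)

print(f"shared library: {shared} files to replace")
print(f"bundled copies: {carried} files to replace")
```

The factor between the two is exactly the number of collections per machine, which is the cost that config management is being relied on to absorb.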
22:50
Right. I think
23:05
dependent collections are a very real and very important aspect of this that we're still working through some of the, like there's a, I don't know where it's published right now, but we've been working on trying to get a good guideline for how to do dependent collections.
23:21
So do you really want to do a Rails software collection? Well, not really. You want to do a Ruby software collection and then have a Rails software collection that depends on it. So, and what's the right size of a software collection? Should it be a full-on kind of web server with Ruby and Rails or should it be piecemeal
23:44
and I think a lot of those questions are yet to be known. The other thing I think that is going to be interesting about this is kind of the move towards containerization. So containerization and I'm using that term to mean to be as far away as possible
24:04
from any individual implementation. So you have things like Docker, you have LXC, you have this combination that we use in OpenShift, which is like cgroups and SELinux, you have actually much older and much more sophisticated
24:20
versions in other operating systems that are not Linux. But this containerization concept where now we kind of say hey this whole application is now a blob and it's portable and we can run it in certain kinds of conditions and it can be updated on its own. I think that's going to be the next step for some of this stuff and I think a lot of that will rely on something
24:44
either exactly like software collections or something very similar. It allows you to kind of bring that bundle inside your container. Are you ready for questions? Yeah. Alright. I've got a mic so if you have questions raise your hand we'll bring the mic around so it turns up on the video later.
25:06
One thing I didn't follow: if you've got two software collections installed for two different versions of something like Ruby, what's the mechanism by which the application gets to use the appropriate version? So this versus update-alternatives, so the question was basically how do you say
25:24
I want this application to use Ruby 1.9 and that application to use Ruby 2. So this of course is only at the very bottom of the screen here so you probably can't see it. I should have probably put it at the top somewhere too. This, unlike say update-alternatives, you tell
25:40
the application which one it's using. The application doesn't actually know but you say scl enable, whatever collection you want to use, and then the application you want to run. So that it's basically running it inside a shell of its own that has the correct paths so that it finds it as if it was
26:00
at its normal locations. Is this changing the environment? Essentially, yeah. Try to organize your questions by row.
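The answer above, that each application is started in its own environment rather than the whole OS being switched, can be sketched with two child processes. This is an illustration under stated assumptions, not how scl itself is implemented: the helper `run_with_collection` and both directory names are hypothetical, and the child command simply reports the first entry of the search path it was given.

```python
import os
import subprocess
import sys


def run_with_collection(collection_bin, command):
    """Run command in a child process whose PATH starts with collection_bin.

    Only the child's environment is changed; the caller's os.environ is
    left untouched, which is the per-application behaviour described above.
    """
    env = dict(os.environ)
    env["PATH"] = collection_bin + os.pathsep + env.get("PATH", "")
    result = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Child command: print the first PATH entry the child process sees.
report = [
    sys.executable,
    "-c",
    "import os; print(os.environ['PATH'].split(os.pathsep)[0])",
]

path_before = os.environ.get("PATH")

# Hypothetical collection directories; they do not need to exist for this demo.
app_one = run_with_collection("/opt/rh/ruby193/root/usr/bin", report)
app_two = run_with_collection("/opt/example/ruby200/root/usr/bin", report)

path_after = os.environ.get("PATH")

print("first application sees: ", app_one)
print("second application sees:", app_two)
print("parent PATH unchanged:  ", path_before == path_after)
```

Two applications on the same machine each get a different view, and nothing outside those two processes is affected, which is the contrast with switching the version for the entire OS.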
26:23
I don't even know why I got up at 6.30 this morning to go to the fitness center at the hotel. I'd like to ask how security updates are propagated to containerized applications in software collections. So say I have a Ruby software collection
26:40
and Ruby depends on some library that got the security update upstream. Does it mean that the maintainer of the collection will have to update it manually? And if I have like five collections which depend on this library, does it mean that five different people will have to update their collections to include the update?
27:00
Exactly. Yeah, that's the problem. Not if you do dependencies. Like I said, let's say you had a Rails 3 and a Rails 4 both dependent on the Ruby 1.9 software collection and the security update happened in the Ruby one,
27:21
you wouldn't have to update the Rails ones. In theory. One of the things that I like about software collections is it also lets you, well, like, dislike, I don't know. It also lets you defer your QA problem. So one of the reasons that the enterprise doesn't want to upgrade
27:40
often it may be that they expect the code to completely work, but they don't want to test it. QA resources are often the ones that you have the least of in any organization. It's often the constraint at any engineering organization I've ever worked with that keeps features from happening. And so when you're doing, again, kind of talking about that janky old app that is just running
28:04
and doing its thing, why would you want to waste that very limited resource of your QA people to test the upgrade? I also would hope that the, you know, if I guess I kind of expect that in a particular enterprise, much like
28:24
I would kind of expect with OpenShift, is like in a particular enterprise, I would kind of expect them to build their own software collections for use in-house. Probably, hopefully, right, dependent on, you know, for a large percentage, 90, 95 percent of the functionality on somebody else. So like on Red Hat
28:43
or on the community in general or on something. But that they would have their own one so that they could be confident that the API layer at the top is what they expect completely for the applications that they're building. You know, it does two things. One, it gives them more, you know, kind of tolerance for moving across versions
29:03
but then it also does that kind of opposite effect, which then you can still control your developers by saying, here is the approved Ruby software collection, right? Here is the approved Python software collection. Don't go installing random other stuff unless it's made it into the software collection.
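Going back to the security-update question a little earlier, the difference dependent collections make can be sketched as a small dependency lookup. Everything here is hypothetical: the collection names, the idea that the Ruby collection carries a library called `libyaml`, and the helper names `needs_rebuild` and `picks_up_fix`. The sketch compares the layered layout from the answer (two Rails collections depending on one Ruby collection) with a flat layout in which each collection carries its own copy.

```python
# Hypothetical collections: what each one carries itself and what it depends on.
layered = {
    "ruby193": {"carries": {"ruby", "libyaml"}, "depends_on": set()},
    "rails3": {"carries": {"rails-3"}, "depends_on": {"ruby193"}},
    "rails4": {"carries": {"rails-4"}, "depends_on": {"ruby193"}},
}

# The same three collections with no dependencies, each bundling the library.
flat = {
    name: {"carries": info["carries"] | {"libyaml"}, "depends_on": set()}
    for name, info in layered.items()
}


def needs_rebuild(collections, patched_library):
    """Collections that carry the patched library and must ship an update."""
    return sorted(
        name
        for name, info in collections.items()
        if patched_library in info["carries"]
    )


def picks_up_fix(collections, patched_library):
    """Collections that get the fix, directly or through a dependency."""
    affected = set(needs_rebuild(collections, patched_library))
    changed = True
    while changed:
        changed = False
        for name, info in collections.items():
            if name not in affected and info["depends_on"] & affected:
                affected.add(name)
                changed = True
    return sorted(affected)


print("layered, rebuild:", needs_rebuild(layered, "libyaml"))
print("layered, fixed:  ", picks_up_fix(layered, "libyaml"))
print("flat, rebuild:   ", needs_rebuild(flat, "libyaml"))
```

In the layered layout one maintainer ships one update and both Rails collections pick it up; in the flat layout every maintainer has to act, which is the situation the questioner was worried about.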
29:26
One back there. I have a question on how this scl enable works. So, can I use multiple of these environments at the same time? And so will it only map this for this one process with something like PRoot or
29:42
mount --bind or how does it get into the environment? Yeah, so it's separated by space, right? But you can just kind of list out multiple ones that you want. And it's not that cool. It's just paths, right? So, you know, this is not, you know, chrooting into
30:02
some brand new file system or something and running in that context. There's no security here, right? None of that stuff. This is just to provide the environment that is kind of so when you make a function call, you get the right binary. That's it. Yeah, so what is like the lowest level
30:26
of libraries that you include? For example, these packages, would they include libc or do you still expect that from the distribution itself? The package maintainer needs to decide that. And everything that you
30:42
don't include makes you less portable, right? But makes it so that you have less of the problem that he was talking about back there of massive redistribution updates. But I think it's hard to know like all the pieces that you need. So I wouldn't even know if I need Apache, I wouldn't even know all the libraries that it uses, so.
31:02
Well, I mean, there's the one aspect of you need to know what libraries you care about, right? But then there's the other aspect of where's this fine line of how much stuff to include. I think we're still struggling with that too. I think, you know, I don't think there is, well, I don't think there will be a silver bullet kind of answer. What we are trying to do is do a better job
31:21
documenting how to think about it. And there are starting to be a lot of examples, right? So now we have whatever, you know, ten or twelve that we're shipping you know, there's at least another five or ten that I know about kind of in the software world. So there's starting to be examples that
31:41
are making decisions. I think we're also going to run into problems. You know, it's really nice actually that RHEL 7 is kind of on its way out the door because that will be the first test of some of this stuff. So, you know, I'm running the beta here. And so what I'm but it's only been like a couple of weeks. So what I want to do is start to try
32:00
to run, take an application that I wrote on the software collection for Ruby on RHEL 6, run that hopefully unchanged on RHEL 7, and then I actually want to be able to run the same thing unchanged on OpenShift. Because OpenShift actually uses software collections to, that's how it delivers those blobs
32:21
as well. Do I see a question over here? Yeah. No, no, he's first then you, because that way he can keep walking in the right direction. So I'm no advanced coder, I'm not a coder at all, but I'm aware of distributions
32:42
and dependencies and versions and stuff, and I just want to understand correctly what this means. So you say I can make a package of software and include my own dependencies, what I need
33:01
and do it all into the /opt system. And all I need to do to make this work on any distribution is to insert this scl trick. Is that about correct? So it's a little more complicated than that.
33:21
The first part is you have to create the thing. So you have to have your application and all its dependencies, and then you have to shove that all into some sort of package. In the case that I'm talking about, that's all RPM. Then on deployment, so on install of the application is where the opt stuff comes in. So it's where it's going to show up. And then
33:42
this kind of scl stuff works, because basically now you have all of the pieces that you need. Some of it might be coming from the operating system itself, some of it might be coming from this set of packages called the software collection to actually deliver the API for Ruby
34:01
to another application. We expect, and I know of a couple of companies that are doing this, but we expect that some companies will, whether it's open source or proprietary, they're going to actually create their own application as a software collection
34:21
which will then have a dependent collection that they also carry, or a set of dependent collections, so that then the whole thing will kind of get installed there, but then you'll have that, so when you install X, Y, and Z software you actually get that whole collection from them, and including any dependent collections that they may get from elsewhere. So every package brings its own environment completely?
34:47
Right, so it may rely on some stuff that's in the OS too or it may not, it's up to the package or the collection. My question is, are you aware of Nix and Guix
35:02
package managers, because they are doing similar job, but I believe slightly better, and without RPM? So, as I said on that earlier slide, everybody has had this problem a ton, all the time. Microsoft has a
35:22
solution for this problem as well. I don't know if you guys know anything about .NET, but one of the major changes you may have remembered the advertising was .NET solved DLL hell, and the way it did that was having this thing called a global cache, and then a per application cache, and what ended up happening, they had this magical, mystical idea that in the global cache would be
35:42
things like glibc. What in fact happened is that every single application that got installed picked up their own. Mac has always and to the best of my knowledge, and I don't know a whole lot about Mac packaging, but Mac has pretty much always worked this way, in that every application basically carries its entire dependency tree every time you install it.
36:02
So, it's not limited to Linux, number one, and it's certainly not limited to the examples I gave. I don't actually know the one you mentioned, so I'll go ask Google. Okay, and another thing, I didn't understand that, what are the new demands for
36:22
software vendors to be able to use this? Well, so, it's really just a style. It's a way of saying, hey, this is a way you could deliver your application. This happens to be RPM, but the same concept
36:42
could work in anything. It doesn't really matter that much. You just need something like Yum to manage where they're coming from and getting updates. But I think a software vendor of some kind could say, hey, here's a way to distribute it, and it might be easier for you to build
37:01
the packages for, let's say, Enterprise Linux 4, 5, and 6, they might be able to find a way to create a software collection that will install on all of them, so that they don't have to re-certify for every minor version. They can also deploy on Fedora, even though Fedora is way ahead of RHEL.
37:23
Does that make sense? Arguably? I've got a question on, I suppose this goes back to how big the software collections are, but say you have a software collection for different versions of Rails and a software collection for different versions of Ruby, is this to suggest
37:41
that for every permutation you want, you would then need to build another independent software collection on top of that? Theoretically, right. And that's the struggle, right, is where do you want to draw that line? So right now, for example, the one we built, this Ruby 1.9.3, actually includes Rails 3 as well. So even though it's called Ruby 1.9.3
38:01
there's actually, I don't know, 40, 30, some number of gems in there, I can't remember how many, but a whole bunch. Which, if I read Ruby 1.9.3, I don't hear gems. So I'm not sure that we're entirely happy with that choice, that maybe Rails should have been separate.
38:22
Because then it would have been easier for us to say, okay, now we have a Rails 3 software collection and a Rails 4 software collection, both dependent on the Ruby 1.9.3 software collection. Maybe I missed it when I was just looking at my phone during your talk or whatever. Sorry about that. Face down. I'm curious about some of the details.
38:43
How do people actually create these collections? How do they deliver them to users? How do users install them and how do they manage them? What do the tools look like in practice? So I was going to include some of the macros, but I decided not to. But basically when you're writing an RPM, you have this big thing called a spec file
39:03
which kind of describes what's in the RPM. In that spec file we just add some new macros that indicate that this is a software collection and that it's going to go in a special place. And that's about it as far as the difference. One thing that was kind of a design goal was that the same spec file, if it knew about
39:23
software collections, could also be built to not be a software collection. So hopefully what will happen is that all spec files everywhere will have kind of the support for a software collection, so you could build it that way, or decide to not build it that way depending on your use case. And then as far as the tools are concerned, it's just
39:43
yum install x, y, and z software collection dot RPM. Whatever it happens to be called. In Enterprise Linux you have to add what's called a channel, which don't get me started on that. But basically you have to add another repo coming from Red Hat and it's got
40:02
a bunch of them in there. You just say yum install blah blah blah. I think CentOS, I think they're mainlined. Did they make it to mainline or are they still in test? They are, ok. So in CentOS you can actually just log in, yum install Ruby 1.9.3.
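The install-then-use flow described here can be sketched in a few lines of shell. This is only an illustration under assumptions: the collection name `ruby193` and the `/opt/rh/<collection>/enable` location follow the layout of Red Hat's own collections and may differ for other builders, and the sketch only prints a hint when the collection is missing rather than installing anything.

```shell
#!/usr/bin/env bash
# Assumed collection/package name for the Ruby 1.9.3 software collection.
collection="ruby193"

# Only call scl if the utility and the collection's enable script are present.
# The /opt/rh path is an assumption based on how Red Hat lays out its collections.
if command -v scl >/dev/null 2>&1 && [ -r "/opt/rh/$collection/enable" ]; then
    # Run a single command with the collection's binaries on the PATH.
    scl enable "$collection" 'ruby --version'
    status="enabled"
else
    # Nothing is installed here; just say what the usual next step would be.
    echo "Collection $collection not found; try: sudo yum install $collection"
    status="missing"
fi
```

Outside the `scl enable` call the system's own Ruby, if any, stays the default, which is the point of keeping the collection under /opt.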
40:20
So it's all the tools you're already familiar with, for the most part. They'll work just like RPM does with Puppet. So it has also all the failings of RPM, but it's easy. Do you have an example yet of an application that's packaged in the normal places? Like where a user would expect them, but it depends on an SCL. So for example
40:42
the user would run the application where it normally is, has never seen the scl enable utility, but the application then enables the SCL it requires itself. Sort of. Basically so I've done that with like httpd, right? So to get the web server to use the Ruby 1.9.3 stuff, I need
41:03
to kind of modify its start script. And for that particular example I was using the native Apache web server, rather than a software collection. I think we want to be mildly careful of that because that's one of my fears, is that
41:23
if you have something that is distributed by the normal OS that's dependent on a software collection, that might be weird. You know, it kind of goes out of the user's, when I say user here, right? The system administrator or developer's expectation of where stuff is going to show up. So my assumption is
41:43
it's distributed separately in a separate yum repo, even maybe with an SCL, but when a user installs it, the application appears in its normal places but it's using an SCL behind the scenes. Exactly. So that's exactly what I did for like httpd, you know, if you want to set up a bash profile that it just runs, right? So
42:03
however, in general, if I had kind of a real top level application you know, so my cool website that runs in your office, I would rather that people package that as a software collection and then it's dependent on software collection. I think user expectation
42:24
so, you know, I think having a shell script or something that's kind of off in a more normal place that points at it is fine, but I don't know, I guess I don't like, if an application installs under /opt then it shouldn't touch the rest of the OS. If I choose to make it
42:42
touch the rest of the OS, so Calibre, if you install it from the Python script, installs under /opt. It doesn't touch the normal operating system basically at all, but I can put a symlink to it from /usr/bin, right?
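The "shell script in a normal place that points at it" idea can be sketched as below. Everything here is a stand-in built inside a scratch directory so it is safe to run: `myapp` is a hypothetical application name, the fake `ruby` and the fake `enable` script only imitate a collection laid out like Red Hat's under /opt/rh, and a real collection's enable script sets more than PATH.

```shell
#!/usr/bin/env bash
# Scratch directory standing in for the filesystem root, so nothing real is touched.
root="$(mktemp -d)"
mkdir -p "$root/opt/rh/ruby193/root/usr/bin" "$root/usr/bin"

# Stand-in for the interpreter shipped inside the collection (hypothetical).
printf '#!/bin/sh\necho "ruby from the collection"\n' \
    > "$root/opt/rh/ruby193/root/usr/bin/ruby"
chmod +x "$root/opt/rh/ruby193/root/usr/bin/ruby"

# Stand-in for the collection's enable script: prepend the collection to PATH.
cat > "$root/opt/rh/ruby193/enable" <<EOF
export PATH="$root/opt/rh/ruby193/root/usr/bin:\$PATH"
EOF

# Wrapper in the "normal" place: enable the collection, then hand over to it.
cat > "$root/usr/bin/myapp" <<EOF
#!/bin/sh
. "$root/opt/rh/ruby193/enable"
exec ruby "\$@"
EOF
chmod +x "$root/usr/bin/myapp"

# The user only ever runs the wrapper and never types scl enable.
"$root/usr/bin/myapp"
```

The wrapper is the only file outside /opt, so removing the collection and that one script leaves the rest of the system as it was. A plain symlink from /usr/bin works the same way when the target needs no environment set up first.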
43:04
Sorry, just one technical add there's no reason you can't, it's just my preference. Well, I don't know if you have already answered my question but it's another twist on the situation. Let's say we have a Ruby collection or a Python collection and I want to add a particular
43:23
gem or Python module. I know how to create a package for a native environment. Can it be easily added to the existing collection? Do I have to create a new collection and overwrite or remove the previous one? Right, so that's exactly the problem
43:42
in some ways, we didn't really talk about it here the other thing that a dependent collection could or should be able to do we're talking about, we have the Ruby collection this one happens to include Rails, right? So let's say I want to write a dependent collection that swaps Rails 3 and puts in
44:02
Rails 4. Well, we don't want you to have to rebuild this one. So one of the things that we're working on, I think it's more of a documentation problem, hopefully it's not a bug problem, but there's a documentation problem to say, okay, how do you write a dependent collection that swaps part of a collection?
44:22
And does that make sense, or is that completely insane? I've got a question, does the software collection packages run independently from each other, means if I run two at the same time which have got a dependency inside which breaks if they run at the same time
44:43
do they run independent of each other? So I was kind of saying that earlier, you can kind of so I had an example where I was doing, I think it was Postgres so we have a Postgres software collection, so I had an app that used Postgres whatever, 9 something that we're shipping, and Ruby 1.9
45:02
So what I do here is I kind of say, scl enable Ruby 1.9 space Postgres bash, so that gives them both there. If there's a dependency between the two then I think you should have a software collection that is dependent on the two and then you use that software collection, if you follow me. So in the Ruby and Rails example, I wouldn't kind of run them both
45:24
I would have built the Rails one such that it's dependent and then I would scl enable Rails 4 bash. Yes, and it's just that one depends on the other rather than using both. The only time
45:43
I would use both is because for my application I need two, three, four different pieces of software: Postgres, memcached, and Ruby. We have time for this one and one more, so the next question goes to whoever
46:00
is going to go get me a beer after the question In the Red Hat developer blog, there was this topic one year ago and compared to this, what's the news? Not a lot Like I said, what I do think we've found
46:22
software collections have been now around, what, almost three years? So what I think we're finding, particularly since we started shipping some stuff a year and a change ago is the problems people run into, like this dependent collection problem and so getting real world use starts to
46:43
poke some of the holes in it. I think the other big stuff that's changing is containerization is already hugely important and becoming way more important, and how that integrates with packaging is also going to be a really interesting ride. Is there any project page or something?
47:05
Yeah, let's just say I would really, really like there to be a proper upstream for software collections as a concept. It is all open source but it's all over the place. So while it's open source
47:21
there's no practical means of finding it all, unless you work with the CentOS guys and have this magic ability to find things So one of the things that we have in the works is to try to get kind of a landing place for software collections as a whole and we've heard from people about
47:41
the software collections that they've built that they would like other people to be able to use, and so kind of allow for a community to form around it. Everybody's got lots of things to do. Any other questions? Apparently no one wants to give me a beer. Alright, thanks Langdon