Transactionally Protected Package Management
Formal Metadata

Title: Transactionally Protected Package Management
Title of Series: FOSDEM 2010
Number of Parts: 97
Part: 10 / 97
License: CC Attribution 2.0 Belgium: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/45765 (DOI)
Transcript: English (auto-generated)
00:00
I'm Jeff Johnson, and I'm the lead developer at RPM5.org, and also part of the Mancoosi project. You just saw me meeting many of the other members, you know, essentially for the first time. I'm a remotee since I live in the U.S., but Mancoosi is the best
00:22
research I've seen on package management, you know, since forever, okay, with actually analyzing and modeling and looking at the details of package management and trying to scale package management into, you know, hundreds of
00:44
repositories and thousands of packages, and trying to make sure that the fundamental goal of package management, of software installation, is actually functional, okay. So I recommend Mancoosi not just because, you know, I'm
01:01
part of it, but it's the best research I'm aware of, and they're the only attempt that I know of to approach Debian and RPM packaging, you know, neutrally, right. For the last decade there's been an insane amount of effort wasted just because of packaging war differences, okay. Both of them are successful, RPM and
01:26
dpkg, and they're just different, okay, and they're more alike than different these days. You know, the same problems are being solved, it's just one person says banana and the other person says tomato, but it's the
01:41
same intrinsic problem. In any case my talk is pretty much ad hoc, you know, I have no slides and so on. As a developer I tend to have too many notes and not enough music, and I promised certain members that I wouldn't be
02:01
boring. This is my third talk here at FOSDEM. In my first one I made the mistake of providing too many details, I put a whole room to sleep, all right, but in my second talk I found that it was easier to find essentially three to five talking points and try and convey an overview of a very complex subject. The
02:26
subject here is transactionally protected package management, and those highfalutin words basically just mean trying to add ACID behavior to make packaging more reliable. ACID is well known, and if you are adding transactions
02:46
to a database the essential components are well analyzed. ACID for those who don't know stands for atomicity, consistency, isolation, and durability,
03:01
all right, and atomicity is usually achieved by locks, almost everything has it. Consistency, if you're inconsistent, well you die fairly quickly, so those are the rather easy aspects of ACID. The parts that need to be added to package management are isolation and durability. Durability is perhaps the
03:26
easier one because it just adds persistence, a persistent state machine, beyond just the action of an install. Package management is different from archivers because it tries to manage software over its entire
03:44
lifecycle, okay, from building, into distribution, into installation, into, you know, retiring it and legacy compatibility, and so durability is trying to add persistence to the store of the metadata on the
04:07
installed machine so that it's known to be reliable and it can be maintained in a consistent fashion. So isolation is rather more complicated because of the
04:25
mixed problem case that a package manager has. There are basically three different types of information that need to be preserved when a package is
04:43
installed. The easiest one, because it's an already solved problem, is database ACID. Databases have ACID, we all know this. The second one is file systems, and file systems is where isolation becomes very tricky. There
05:05
is no file system that really has any credible approach to isolation. What isolation means is that two processes shall not know about each other; there shall be no side effects that are shared, but that isn't
05:25
how file systems work, all right. File systems try and provide a consistent viewpoint onto a data store, and so another entity will see a
05:43
file as soon as it's created. Since a package manager is installing on a file system, isolation is rather tricky to achieve. There have been several attempts to do this with, let's say, file snapshotting or copy-on-write and
06:03
aspects like this, but ACID will never be fully achieved on a file system until the file systems themselves start providing some sort of transactional protection. Actually, and this is probably an
06:22
appropriate place to point it out, as part of doing due diligence on whether transactionally protected package management was feasible, one of the things I looked at is the two attempts that I'm aware of to try and do transactionally protected system calls, and one of them is the Valor file
06:46
system, and this is the third implementation that is starting to perform nearly as well as ext3. It's only three times slower, but when you
07:02
add the logging necessary for transactions and transactional rollbacks and two-phase commits, you know, you're going to write more information, and so actually a factor of three slower is not too bad, all right, but the Valor
07:22
file system has transactions built within a kernel driver, and the benefit for a file system is that all sorts of race conditions are avoided if you can provide the I in ACID, isolation, all right. The other approach,
07:45
which I'll just mention offhand, is, I think, out of the University of Texas, and it tries to provide a begin and commit on a transaction around sets of operations. That's a second approach to essentially the same thing. Unlike a
08:07
file system, okay, for a package manager speed is not as critical. RPM is not going to be compared to ext3. It will be compared to dpkg, okay, but
08:22
dpkg is slow enough that it's possible to try and provide some isolation on file system operations, all right, which is the second component. If I recover all my parentheticals, you know, there's ACID for
08:41
databases, there's ACID for file systems, and the third part is the part that John Thompson talked about this morning, which is ACID for scriptlets, and that's a really hard problem to solve, but the second component where isolation is very tricky is how do you preserve all the system calls that
09:05
are done by a package manager as it's installing content on a file system, all right. And just to summarize a rather complicated and roundabout way of talking about it: until a file system provides isolation, RPM can't,
09:29
all right, but RPM does do things like installing files in a temp name and then renaming them in place. If the temp name is unique enough, it's unlikely
09:42
that there will be an interaction between two processes, but that's about the best you can do, you know, with transactionally protected package management: install into a unique name and then try and move it into place as easily as possible.
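A minimal sketch of that install-to-temp-then-rename idiom in C (a hypothetical helper, not RPM's actual code; on POSIX systems the final rename(2) is atomic, so other processes see either the old file or the new one, never a half-written file):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write the payload to a unique temporary name next to the target,
 * then rename(2) it into place. */
static int install_file(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    snprintf(tmp, sizeof tmp, "%s.tmp.XXXXXX", path);

    int fd = mkstemp(tmp);          /* unique temp name */
    if (fd < 0)
        return -1;

    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);                /* roll back: discard the partial file */
        return -1;
    }
    close(fd);

    if (rename(tmp, path) != 0) {   /* atomic move into place */
        unlink(tmp);
        return -1;
    }
    return 0;
}
```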
10:03
So the three components for transactionally protected package management are database ACID, system call ACID, and scriptlet ACID. The underlying mechanism is quite complicated, but it's not hard to understand. If you want a transactional
10:25
log, you have to write a log record before you perform any operation, and if you have a log record, then you can also put a marker in the log that indicates that you completed successfully, and if there is no marker,
10:43
you can infer you must go back to the previous one, and that's essentially what happens with a rollback. Once you have that basic state machine for transactional logging, then the rest of it's rather easy. It's just moving between different,
11:01
you know, consistent points, and anything in between can be inverted and discarded, which removes side effects.
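To make that mechanism concrete, here is a toy write-ahead log with commit markers (an illustrative sketch only; RPM5 reuses Berkeley DB's logging rather than anything like this):

```c
#include <stdio.h>
#include <unistd.h>

/* Every operation is recorded before it is performed, and a COMMIT
 * marker is appended only once the whole transaction succeeds.  On
 * recovery, records after the last COMMIT are undone in reverse. */

enum rec_type { REC_OP, REC_COMMIT };

struct rec {
    enum rec_type type;
    char op[32];       /* e.g. "mkdir" */
    char inverse[32];  /* e.g. "rmdir" -- every op carries its inverse */
    char arg[256];
};

/* Write the record BEFORE performing the operation it describes. */
static void log_append(FILE *log, const struct rec *r)
{
    fwrite(r, sizeof *r, 1, log);
    fflush(log);       /* must be durable before the op runs */
}

/* Rollback: undo, in reverse order, everything after the last COMMIT. */
static void recover(FILE *log)
{
    struct rec r;
    long last_commit = 0, end = 0;

    /* Forward scan: find the offset just past the last COMMIT marker. */
    while (fread(&r, sizeof r, 1, log) == 1) {
        end += (long)sizeof r;
        if (r.type == REC_COMMIT)
            last_commit = end;
    }

    /* Backward scan: apply the inverse of each uncommitted operation. */
    for (long pos = end - (long)sizeof r; pos >= last_commit;
         pos -= (long)sizeof r) {
        fseek(log, pos, SEEK_SET);
        if (fread(&r, sizeof r, 1, log) == 1)
            printf("undo: %s %s\n", r.inverse, r.arg); /* would execute it */
    }
    ftruncate(fileno(log), last_commit);  /* discard the undone records */
}
```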
11:22
This will be one of the major benefits. I know it was one of the very first things I was asked: can RPM5 truly guarantee that, if there are two headers in there, the stale one will disappear? Okay, and this is one of the side effects. If there's an abnormal exit in the package manager, then there will be residual state that's left, and this often shows up: if you do a query, you'll have both the old and the new, because you
11:40
failed before the old could be removed. All right, that, you know, will now be transactionally protected, and if that install/erase pair, okay, does not achieve a commit, then the intervening information can be discarded reliably,
12:00
so that the underlying store, okay, is known to be moving through time by definite checkpoints, all right, with known good behavior. Most of the rest of what I'll talk about, and the part that's complete now, I believe, although
12:21
the really hard part about transactionally protected package management, is to know that you have full coverage over all possible cases. That will only come with sufficient testing and usage, but what is there now is using Berkeley DB, all right, RPM already uses Berkeley DB,
12:41
since all that transactionally protected package management relies on is ACID behavior, any database would do, but Berkeley DB already has all this, and so there's no reason to look at alternative forms, but any SQL database
13:00
these days provides some form of ACID behavior. The choice at RPM 5 was to use Berkeley DB. The underlying reason for this was Berkeley DB already has sample code that shows how to extend the database logs into other abstract
13:24
events. For other abstract events, the sample case that's in there is: if you have two system calls, mkdir and rmdir, they form an invertible pair, okay. While you're installing, you make a directory, but if you want to
13:42
roll it back, you remove a directory, and so it's very important that every action has its own inverse. But with sample code and an awk script that generates all the glue layers, it's possible to extend Berkeley DB to handle system calls. RPM itself, during package installation, performs about 20 system
14:09
call operations, and there are things like mkdir and rmdir, and of course writing a file. To extend Berkeley DB ACID to file system ACID, if you're
14:27
going to remove a file as part of installing a package, what needs to be done is to write the previous content into the log and then remove the file. So
14:41
if you have to put it back, the content can be put directly back out of the log. Nothing else is needed except the log and something which can read the log.
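For instance, a logged, invertible "remove file" might look like this in C (hypothetical helper names; RPM5 generates the real glue code from Berkeley DB's logging templates):

```c
#include <stdio.h>
#include <unistd.h>

/* Before unlinking, append the file's current bytes to the transaction
 * log, so the inverse operation can restore them verbatim. */
static int logged_unlink(FILE *log, const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    fseek(f, 0, SEEK_END);
    long len = ftell(f);
    rewind(f);

    /* Log record: operation, path, payload length, then the payload. */
    fprintf(log, "unlink %s %ld\n", path, len);
    char buf[8192];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        fwrite(buf, 1, n, log);      /* old content goes into the log */
    fclose(f);
    fflush(log);                     /* durable before the operation */

    return unlink(path);  /* inverse: re-create the file from the log */
}
```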
15:02
So my goal with transactionally protected package management, and I'm not there yet because there's a large number of pieces that still need to be implemented and tested, is to take a machine, take the log off it, take the executable which reads the log, remove everything else, and put the machine back, the entire machine, all state on the machine, every single file, exactly
15:21
the same, using only the log. That's the goal of this, because RPM is expected to run in empty chroots and on bare metal machines, and there is no stronger test of any implementation than: can you
15:42
recreate anything, everything that's on the system, and pass a diff test to make sure all the information is identical. I think that is feasible, but I won't know until I actually get there. Yes, sure. [Audience] For the camera: the goal is
16:12
remarkable. There is a lot of good behind everything you're saying, but in order to do all this in production, how much overhead do I put on my file
16:21
system? If I'm installing 20,000 packages that takes 40 gigabytes and then I need to go and restore from the logs, it means that I need to have at least 40 gigabytes of logs, depending on how many times I've been updating in between. So there must be some kind of optimization you have in mind to avoid that kind of overhead of the log on the machine. You've given me a
16:41
perfect entry point to describe what I've been working on for the last six months. There's a number of problems with the schema that has been used by RPM for years, and I've never really been able to tune Berkeley
17:00
DB performance to try and minimize the overhead that you're mentioning. So a large part of the work was preparing the infrastructure to bring in Berkeley DB's transactional machinery, but in order to succeed at that, I have to be no
17:25
slower than I was before. So I had to fix a lot of the flaws in the current schema. I've achieved at least a tenfold improvement in performance,
17:41
which, since for logging operations you essentially have to do everything twice, once to the log and once to the file system, well, ten is bigger than two. The way that this has been done is: Berkeley DB
18:02
will use mmap to move information from the database. RPM also does mmap'd I/O, so the second copy is not that expensive, all right. Almost all of the
18:22
overhead of reading a database, once the resource cap is changed to permit, let's say, up to a quarter of the machine to be memory mapped, the I/O
18:41
overhead on a database largely disappears. Writing to a second location from a memory map is also not that significant in overhead. There is still
19:01
the cost of the log itself, and writing content in the log, you know, if you actually try and preserve those logs over time, becomes quite significant.
19:20
At the moment what I'm doing is every successful RPM operation creates a checkpoint. By creating a checkpoint I'm establishing a base that determines how far back in history I want to go, and clearly
19:44
nobody cares what was done on a machine one year ago. I mean people care but not most users, all right, and you don't need gigabytes of logs you know going back forever in most cases, okay. The cases where you do care I mean
20:01
it's important and you'll allocate resources to maintain those logs but not for an end-user machine. What I'm seeing is that by paying attention and changing the schema and I'll describe some of the details in a moment because
20:20
that is the most important point. Things like --rebuilddb are not the right thing to do if you have a transactionally protected database, and are no longer necessary either. What I'm actually measuring is that it's quite feasible to switch from the concurrent access model that RPM is
20:46
using now to a transactionally protected model if the schema changes at the same time. There's measurable increased lock overhead, but you know the overall wall clock measurements are about the same, and the
21:06
cost of saving the previous transaction is probably tolerable, so that being able to restore the metadata store after an interrupted action and take out the side effects of a broken install is probably
21:24
worth the effort. But that's not for me to say; this is my opinion, okay. I mean, I do have deep mixed feelings about logging and transactional protection. There's basically two approaches to trying to improve
21:48
reliability. One of them is what I'm talking about, where you have a log on the local machine. The other one I call provisioning, and that's basically: you take the state off the machine and put it someplace. This is
22:04
basically what Red Hat Network does, this is the sort of Google model, you know, and I call it provisioning because you can always recreate the machine. Provisioning as an approach to the same problem has two
22:20
major benefits. First of all, it's always off the machine, so you never have to worry about hardware failures and things like that, and second, it's usually centrally located, which minimizes the maintenance cost of it. The disadvantage of a provisioning approach is you often don't have a
22:43
sufficient number of details to track all this detailed state; that is better done on a target machine, on a client machine, and a log sits someplace between those two, all right. But RPM and package management in
23:03
general is client only, it's not tied to a server, so I'm solving that problem, but the other one exists, and there's nothing, you know, better or worse with it. Does that answer the question? Yes, absolutely. Okay, since I
23:24
described the two sort of general alternatives, provisioning versus transactional protection, I probably should also describe some of the other approaches. One of the approaches has been from Nexenta with ZFS, and
23:44
yum in Fedora is attempting to use Btrfs and file system snapshots. These are definitely viable approaches. The problem is that if
24:03
you're going to use something like a ZFS or Btrfs snapshot, you first are stuck with some of the artifacts of being a file system. ZFS, you know, can
24:20
perform snapshots per mount point, okay; a package can install on several mount points, and so you have the snarl of package side effects across multiple mount points, which makes it more complex than otherwise. The intrinsic flaw, if you're going to use a file system approach, you know, to let's
24:42
minimize, you know, the number of writes or overhead or something like that, is that you must have that file system in the kernel, and neither Btrfs nor ZFS is sufficiently widely deployed. You know, RPM5 runs on every Unix in sight; they're not in Mac OS X and lots of other places, and I'm finding myself
25:04
talking to people about even, you know, doing RPM on Windows, okay. You can't bet on things like Btrfs or ZFS, you know, as a general approach. That's part of the reason why I decided to just use Berkeley DB rather than to
25:21
rely on a snapshot in a file system. You know, a snapshot is basically cheap; it's essentially a checkpoint on a file system with a copy-on-write attached, so at the checkpoint the current state is known, and anything that
25:44
changes from that point, you know, is tracked separately with some sort of overlay, if you will, and that's the way I picture a file system snapshot. A log internal to RPM is essentially the same operation. The real
26:13
difference in the performance measurement is basically what you choose
26:22
to count, or how you choose to count the resource usage, like clock ticks or, you know, number of blocks of I/O, but all logging approaches are going to be the same: you have to write a copy before actually performing some operation. But the point I was trying to make is that transactionally
26:44
protected package management is not the same as a file system snapshot; these are complementary approaches to essentially the same problem. Another thing that transactionally protected package management is not about is, let's
27:05
say, a representation of the log. You know, if you're trying something like the modeling that Mancoosi is doing, you often have to try and come up with a reasonable format. CUDF is one, and I'm quite sure that there's
27:24
another one coming from the Mancoosi work, Work Package 3, that describes the elements of a rollback. Each one of these is a different format, each one of these is a different representation of the same information, but describing
27:40
things differently, in a different type of markup for different usage cases, is a different meaning of "log", okay, and that's not what I'm talking about with transactionally protected package
28:01
management. The state machine for rolling back a transaction is actually quite different from a package state machine, which, you know, does erase the old package and update with the newer content. So the log is pretty much
28:22
internal, and the representation can always be retrofitted later. Since I'm reaching about 15 minutes, let me go through some of the details that have already been implemented with the database rather shortly. One of the important elements is that there are tools that are going to be needed that will
28:45
be external to RPM. The most important, if you bring nothing else away from this talk: if you're trying to use a transaction model, --rebuilddb won't help you; the alternative is db_recover -ev, all right. Since
29:04
most of the world is fixing RPM side effects by attempting --rebuilddb, it won't hurt you, but it's not going to save you; db_recover will. For the log management, and particularly the checkpointing, which is the most
29:21
important element at reducing log size, the tool needed is db_checkpoint, and that puts in a boundary in time that says this is the last known good. To get rid of older logs, the tool is db_archive. You will need
29:46
tools to manage your database.
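For reference, the same recovery, checkpoint, and log-pruning operations those utilities perform are reachable through Berkeley DB's C API. A minimal sketch (error handling elided; /var/lib/rpm is just the conventional rpmdb location):

```c
#include <stdio.h>
#include <stdlib.h>
#include <db.h>   /* Berkeley DB */

int main(void)
{
    DB_ENV *env;
    char **list;

    db_env_create(&env, 0);

    /* DB_RECOVER runs the same recovery pass as the db_recover utility. */
    env->open(env, "/var/lib/rpm",
              DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
              DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER, 0);

    /* Equivalent of db_checkpoint: force a checkpoint right now,
     * establishing the "last known good" boundary in the log. */
    env->txn_checkpoint(env, 0, 0, DB_FORCE);

    /* Equivalent of db_archive: list log files no longer needed for
     * recovery, which can be backed up and then removed. */
    if (env->log_archive(env, &list, DB_ARCH_ABS) == 0 && list != NULL) {
        for (char **p = list; *p != NULL; p++)
            printf("removable log: %s\n", *p);
        free(list);
    }

    env->close(env, 0);
    return 0;
}
```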
30:06
I haven't yet even considered the question of compatibility with older versions of RPM. Compatibility is going to be rather tricky because essentially lots of the elements that used to be in RPM no longer exist. Most specifically, there are no join keys; these are integer identifiers for packages that were monotonically increasing, and since those don't exist, you know, I'm not sure how compatibility can be
30:22
meaningfully defined. I hope to do it by providing conversion; you know, conversion from one to the other is an acceptable way of achieving compatibility, but that's not the same thing as interoperable compatibility. All integers are now saved in network order. This means cross-platform
30:45
is now meaningful; previously RPM saved in native order all the time, but network order permits binary strings and integers to be dealt with equivalently using a memcmp, which is important for B-tree access, and that is also a performance increase from switching to network order.
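A small illustration of why big-endian ("network order") keys matter for byte-wise comparison (a standalone example of mine, not RPM code):

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl */

/* Integers stored big-endian sort correctly under a plain byte-wise
 * memcmp, so one comparator serves both strings and integers, and the
 * stored bytes are identical across architectures. */
int main(void)
{
    uint32_t a = htonl(256);   /* big-endian bytes: 00 00 01 00 */
    uint32_t b = htonl(1);     /* big-endian bytes: 00 00 00 01 */

    /* memcmp > 0 here, matching 256 > 1.  With native little-endian
     * storage (00 01 00 00 vs 01 00 00 00) memcmp would compare the
     * first bytes and wrongly conclude 256 < 1. */
    printf("memcmp = %d\n", memcmp(&a, &b, sizeof a));
    return 0;
}
```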
31:04
One of the other incompatibilities that's been introduced is that Berkeley DB provides a counter that monotonically increases;
31:26
previously this was done with a weird hack, using record number zero in the primary store as if it were an instance counter. By using a persistent number,
31:42
there is now an identifier that is durable over time. Among the other performance improvements: I'm now able to do pattern retrieves on keys, which is much higher performing than what RPM has done in
32:05
the past, which is a secondary lookup with a header load and marshaling and signature checking, the actual database side effects of the retrieval. This is where most of the performance improvements that I've achieved reside.
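A prefix ("pattern") retrieval against a Berkeley DB B-tree can be sketched with a DB_SET_RANGE cursor. This is my illustration of the technique, not RPM5's code, and it assumes an already-open B-tree handle db:

```c
#include <stdio.h>
#include <string.h>
#include <db.h>   /* Berkeley DB */

/* Position a cursor at the first key >= prefix and walk forward while
 * the prefix still matches; because keys compare byte-wise (see the
 * network-order note above), this works for packed integers too. */
static void prefix_scan(DB *db, const char *prefix)
{
    DBC *cur;
    DBT key, val;
    size_t plen = strlen(prefix);

    db->cursor(db, NULL, &cur, 0);

    memset(&key, 0, sizeof key);
    memset(&val, 0, sizeof val);
    key.data = (void *)prefix;
    key.size = (u_int32_t)plen;

    int rc = cur->c_get(cur, &key, &val, DB_SET_RANGE);
    while (rc == 0 &&
           key.size >= plen && memcmp(key.data, prefix, plen) == 0) {
        printf("match: %.*s\n", (int)key.size, (char *)key.data);
        rc = cur->c_get(cur, &key, &val, DB_NEXT);
    }
    cur->c_close(cur);
}
```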
32:26
One of the other sort of deep, fundamental infrastructure changes is that signatures and digests are no longer verified. There hasn't been a problem detected with a
32:44
signature or digest check on a database blob for quite some time, and so what I'm doing instead now is using mmap with PROT_READ to protect the blob when it's read, okay.
33:03
There's pretty clearly no reason to verify the digest and signature on a blob of binary data that is hardware protected, and that removed a very large amount of overhead reading headers from a database; it's a big chunk of the tenfold performance improvement that I've gotten.
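The read-only mapping idea looks roughly like this in C (a sketch with hypothetical names; a stray write through the mapping faults instead of silently corrupting the blob, which is the "hardware protected" property referred to above):

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Map a header blob read-only: PROT_READ means any write through the
 * mapping raises SIGSEGV rather than altering the data. */
static const void *map_blob_readonly(const char *path, size_t *lenp)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    fstat(fd, &st);
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* mapping stays valid after close */

    if (p == MAP_FAILED)
        return NULL;
    *lenp = (size_t)st.st_size;
    return p;
}
```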
33:25
That's about it; you know, because of time, I'm much more interested in questions. If you want me to continue, I can go into some of the way that scriptlet ACID will be
33:48
achieved, okay. Having heard John Thompson's talk this morning, okay, that rapidly got changed since 10 o'clock this morning, but I can try and flesh
34:00
out some of the details of where I think the connection between the work that John Thompson and Paulo and Caixa Mágica are doing is going to fit into this, or I can also just try and field some questions. You know, it's
34:24
up to you. Other questions? [Audience] Yes, yeah, I have another question: how important is backward compatibility for you? You mentioned that the new version of RPM you are developing now breaks compared to the old one, or you need to provide a means to transition from one to the other. Isn't it sometimes
34:41
necessary to just break and say now we start clean, so to speak, and move on? Well, compatibility depends on what's being talked about, and so does breakage, all right, and I'm not trying to break things without reason, all right. Most of the problems with the RPM schema have been known for a long time,
35:09
and I've lived with that, but yes, there are times to break. I've just been notified: five minutes. But any other questions? Okay, there you go. [Audience] So I might be a bit
35:45
off here because I'm not really involved with RPM distribution, but as I understand it there is also an option within RPM to do scriptlets at installation, or maybe you call it something else. So how do you plan to roll back those scriptlets? They may be creating files, they may be doing stuff. Okay, the talk by
36:05
John Thompson this morning was about using a DSL to break a scriptlet into smaller pieces. All right, the fundamental problem with scripts in
36:23
packaging is that they're too coarse grained and they're opaque. You know, you have the general cases, you have side effects, and the side effects need to be captured somehow. The benefit that I see from the DSL is that it will be
36:49
possible not just to look at the exit code from a script, which is all the information that is being returned. It was even in one
37:00
of John's examples that he illustrated that there were two operations within a single scriptlet, and that's fundamentally the problem. If you have, like, a configuration change and a daemon restart and other side effects, you know, it's possible to have one of them fail and it being
37:24
unimportant, while the other ones are absolutely essential to the operation. So by having finer grain, it becomes possible to come up with partial failures and partially invertible failures. The other benefit will be in
37:42
simulation modeling, okay. Because scriptlets are often in shell, it's painful to do anything other than de facto testing of the scripts, by simulating and modeling other properties such as ensuring that
38:01
prerequisites that aren't specified in the ordering of the package are actually met. What I hope to do is to use that DSL, breaking it up into smaller pieces, and map those pieces into a log as a starting event.
38:22
The other, more traditional approach is: scriptlets run in an environment, and that environment needs to be captured in the log, okay; what the environment variables were, what the prerequisites are. To some extent that provides a level of ACID and invertibility, all right. The problem
38:46
is that this is a general programmatic language, and so things like typos, you know, cannot be tolerated, all right. But these are not very interesting problems; they are important problems that have to be solved, but it is a very hard problem to get ACID underneath scriptlet behavior.
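Capturing the environment a scriptlet will see, before it runs, might look like this (my sketch of the idea, not an RPM5 interface):

```c
#include <stdio.h>

extern char **environ;

/* Record, ahead of execution, the environment a scriptlet runs in, so
 * a rollback or a simulation can reconstruct the conditions it saw. */
static void log_scriptlet_env(FILE *log, const char *scriptlet)
{
    fprintf(log, "BEGIN scriptlet %s\n", scriptlet);
    for (char **e = environ; *e != NULL; e++)
        fprintf(log, "env %s\n", *e);
    fflush(log);    /* durable before the scriptlet is executed */
}
```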
39:01
Does that answer the question? [Audience] Almost, there's just one more thing. Actually, what I basically meant is: if one of those scriptlets creates a file or updates a configuration file or whatever, and you figure out later that you need to undo the upgrade for some reason, then you should basically maybe also
39:21
undo what the scriptlet did. Scriptlets need to be invertible, yeah, all right, and there needs to be an undo, that's what I'm saying. The same invertibility happens when you install a file, I mean, if you're writing content or you're removing a file. [Audience] Yeah, but there's probably some way for the scriptlet then to communicate to RPM that, hey, I've modified this file, or is
39:40
that the way? I don't know. A better communication will be the means by which ACID is achieved under scriptlet behavior, all right. Okay, right, and it's the same basic effect: you log any operation before it's performed. The complexity comes solely from using Bourne shell, which is painful, all right.
40:02
Right, and embedding is another approach that could be used, like embedded Lua, and then RPM is more aware of what the actual operations are. But the general approach is just to use the 55 operations in debhelper and to parameterize those, and that's a perfectly sound starting point,
40:22
because that's the order of magnitude of the side effects of package management, and this isn't a large corpus, okay. Gotcha. Any other questions? [Audience] Yes, what do you think about the NixOS package management system, which
40:47
is purely functional and avoids most of the side effects? I like it a lot. I've been talking to Eelco; I started about a year ago, and from a package manager
41:05
point of view, the single biggest failure mode at the moment in packaging is scripts, buggy scripts, you know, and not only buggy, but scripts that aren't written generally enough, all right. And my private view is that I
41:22
need to remove all scripts from RPM; if I say that in public... and I didn't say it here, okay. If I actually remove all the scripts, I'm going to also appear to remove all programmatic means to change packaging, okay. This is the same
41:42
decision point that everybody faces switching to a functional model; you know, you have to think about the problem space very differently, all right. But from a package manager point of view I have two alternatives: I go the NixOS route and I say no scripts, and everybody hates me, and I retrofit the
42:01
same actions, okay, through other means; or I take a slow and gradual approach to a functional model, okay, where the first step is to use debhelper and then to break debhelper into smaller pieces, and, you know, moving gradually towards refactoring code, fewer side effects, and eventually
42:24
achieving a functional model. I don't know what the answer is, but I like NixOS a lot. [Audience] Is it something that you are going to try in the Mancoosi project? I can't speak for Mancoosi, but I have tried to catch the NixOS
42:43
presentation, you know; I haven't yet found it. But I have considered this, you know. The approach that I will probably take will be to embed Haskell and try and do NixOS-style package management directly from RPM, and to see
43:04
whether that approach is viable, all right. But I'm well aware of what NixOS is, and I like functional programming; I don't do it, okay, because it's just too hard to switch from one to the other, you know. I have, I live in the