
Re-thinking system and distro development


Formal Metadata

Title: Re-thinking system and distro development
Number of parts: 84
License: CC Attribution 2.0 Belgium: You may use, modify, and reproduce, distribute, and make the work or its content publicly available, in unmodified or modified form, for any legal purpose, provided you credit the author/rights holder in the manner they specify.
Production year: 2012

Content Metadata

Abstract: FOSDEM (Free and Open Source Development European Meeting) is a European event centered around Free and Open Source software development. It is aimed at developers and anyone interested in Free and Open Source software news. Its goals are to enable developers to meet and to promote the awareness and use of free and open source software.
Transcript: English (automatically generated)
Ladies and gentlemen, may I have your attention for our next speaker, Lars Wirzenius. How many of you are involved in the development of a Linux or BSD or Hurd distribution?
Or has ever been? I want to apologize beforehand because I might say something stupid about you.
However, I would also like everyone else to give these people an applause because they're doing an awesome job. Without people who develop distributions, most of us wouldn't be running Linux or BSD
or Hurd, et cetera. I'm going to be mentioning Linux a lot in this speech, or talk, but I believe most of the things I say will apply to all of those, regardless of kernel.
The current way we develop distributions is okay. The end product that we create works. People successfully run Linux for all sorts of purposes.
So there's no huge problem there that immediately requires everyone to drop what they're doing and fix things. However, I believe we can do much better. And if we can do much better, I believe the end product will be a whole lot better as well.
In 1991, about August, if I remember correctly, my computer, a 386 PC, was the first one in the world to get Linux installed on it. Linus had been growing his Linux system on top of an existing MINIX install and he
was about ready to start making a release for other people to use and requiring them to first buy MINIX and then install MINIX and then fiddle with it until it runs Linux seemed like a bad idea. So he borrowed my computer in order to create an installation method which he ingeniously
named the boot floppy. The boot floppy was the default way of installing Linux for some years until real Linux distributions came along. With the boot floppy, installation was quite easy and smooth.
You booted off the floppy and then you fiddle with your text editor in hex mode if possible in order to modify your master boot record and specify where the root file system is and then you copied files here and there and you did a lot of things, some of which
were documented and if you were really lucky and you knew exactly what to do after you had gone through all of this work, what you could do with your system was to compile more software and in order to get the software on the machine, you might have to
fiddle some more because there was no networking. So you went to somewhere that had networking, downloaded some files, put them on floppies, copied them over to your machine, unpacked them and, well, if you were lucky, you ran
./configure and make and make install, but most of that stuff was invented later. So the process was not as smooth as it should have been; however, for those of us who were living through that period, it was smoother than we expected.
Management of expectations is very important or so marketing people tell me. After a while, some real distributions happened.
There was SLS and MCC and all sorts of things. SLS became Slackware, and in 1993 two distros that you may have heard of started. One is called Debian, the other is called Red Hat, and these distros both eventually
developed a package manager of their own (they couldn't share code, of course), and that made things better. The difference between something like dpkg and apt versus rpm and yum or other front
ends for rpm isn't big, it's small, it's tiny. All the mainstream Linux distros these days essentially use equivalent technology
to produce something, and the differences between the end results aren't very big either. What this means is that all the mainstream distros may look completely different, but they're not so different. It's a black horse and a white horse.
So I became a Debian developer in 1996. I worked for Ubuntu for two years and I'm fiddling with bits here and there elsewhere as well so I have some background in how distros are created and I'm not going to be
telling everyone what we are doing wrong; instead, I'm going to talk about two things that we could do better. The point of distributions is sometimes hard to get, especially for people who have no
involvement in developing distributions. It seems like a distribution is there so that end users don't have to compile anything but actually distributions do a little bit more.
They provide the tools to do an initial install and, trust me, if you have to use a text editor in order to modify a master boot record, then you want a tool to do that for you. It's not fun. Distributions also provide tools for managing and installing and upgrading more software.
A distribution is not just the initial install, it's also installing additional software and making sure that everything goes well in that process. Is there anyone here who has not used any distro's package manager?
Right, so you know the feeling of it being very nice: you hear about a new program, so you just install it and it's there. This is crucial for distributions. If you had to go to the internet and find a random website that may have a binary for
your distribution, things would fall apart. We might want that as well, but it shouldn't be the primary way of doing things. Distributions make sure that upgrades work. So if you install something on a new PC and later on the distribution has a new version,
it should be possible to upgrade. If upgrades are difficult then you have to do a reinstall all the time and that gets tedious. But that's the simple technical part of distributions.
The next thing I want to talk about is not so technical. Distributions need to choose the software they include. The choice might be based on license which is basically what Debian does. If it's free software Debian includes it or wants to include it.
If you have a distribution that is more specialized, it might choose only software for a particular purpose. It might be a distribution that's meant for one corporation only, so they only include the software that corporation needs.
And there's all sorts of different kinds of criteria you might use but the point is that you have to choose. There's too much software that you might include, there's no point in including everything. And then the big one, all this software that's in a distribution needs to be integrated so it works together.
The integration might be simple like deciding that all manual pages go into the same directory which is surprisingly difficult sometimes. And it might be more complicated, it might be that you want to provide an integration
method by which a web application has access to a web server and a database engine and possibly some other things so that the people who create the packages for web applications don't all need to invent the same things. And the people who install things don't all need to go then and configure 47 different
configuration files in order to get Apache to start. All of this work is something that tends to be hidden. Upstreams don't see it because it happens within the distro and users don't see much
of it, or they don't realize that it's there, because it just works. And all of these things are things that I think distributions as they currently exist do reasonably well.
But I think we can be truly awesome and we should be. The big problem I see with modern mainstream distributions is that they are a little bit too big, meaning they are a little bit too complex.
Debian's apt tool has a command which can produce a graph which shows the dependencies and interconnections between different packages. This is not the whole graph. The whole graph is too big for my laptop to compute.
This graph is for the set of packages that belong to the Debian build-essential set: namely, basically, GCC for C and C++, Make, and a few other things. This is only enough to compile a few simple programs.
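A graph like the one described can be reproduced with apt's own tooling. As a minimal sketch, assuming a Debian-style system with apt and Graphviz installed:

```python
# Sketch: render the dependency graph for build-essential using apt-cache's
# dotty subcommand (which emits Graphviz "dot" output) and the dot renderer.
import subprocess

def render_dependency_graph(package: str, outfile: str) -> None:
    # apt-cache dotty prints the dependency graph of the given package(s).
    dot_source = subprocess.run(
        ["apt-cache", "dotty", package],
        check=True, capture_output=True, text=True,
    ).stdout
    # Feed the dot source to Graphviz to produce a viewable image.
    subprocess.run(
        ["dot", "-Tsvg", "-o", outfile],
        input=dot_source, check=True, text=True,
    )

if __name__ == "__main__":
    render_dependency_graph("build-essential", "build-essential-deps.svg")
```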
Although, I'm not sure if you can see it, but there's different kinds of symbols there that correspond to packages and then there are arrows between them that correspond
to dependencies and so on. Horrible. This is horrible. I don't think anyone here is able to manage this manually. Much of that is for shared libraries.
For shared libraries, we have tools for managing things automatically. If you have an executable, a binary program, the ELF headers tell you which shared libraries it needs and then you can write a tool to find out which packages you need to have
those libraries. Shared libraries are easy in this way. However, it's not possible to write tools for every case. For example, suppose you have a package that includes a shell script, and your distribution happens to provide two different implementations of the shell.
Debian has dash, a tiny POSIX-compliant implementation, and bash, which is not tiny. How do you decide which one you want to depend on? If possible, you want to depend on the smaller one because this makes it possible
for people who do embedded systems based on Debian to use the smaller shell. If you want to be safe, you can always just depend on bash, but then the end result is worse. So you want to choose. And if you can write a program that can reliably decide whether dash
is sufficient, the small one is sufficient, then I think there are people who would like to give you a PhD. Possibly during a war. The halting problem is not fun.
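For the automatable shared-library case described above, Debian's real tool is dpkg-shlibdeps. Purely as an illustrative sketch of the idea, one could read the NEEDED entries from a binary's ELF dynamic section and ask dpkg which package ships each library (the parsing here is simplified and the binary path is just an example):

```python
# Sketch: map a binary's shared-library needs to Debian packages.
# Real packaging uses dpkg-shlibdeps; this is only an illustration.
import subprocess

def needed_libraries(binary: str) -> list[str]:
    # readelf -d prints lines like:
    #   0x0000000000000001 (NEEDED)  Shared library: [libc.so.6]
    out = subprocess.run(["readelf", "-d", binary],
                         check=True, capture_output=True, text=True).stdout
    libs = []
    for line in out.splitlines():
        if "(NEEDED)" in line and "[" in line:
            libs.append(line.split("[", 1)[1].split("]", 1)[0])
    return libs

def owning_package(library: str) -> str:
    # dpkg -S prints "package:arch: /path/to/file" for each matching file.
    result = subprocess.run(["dpkg", "-S", library],
                            capture_output=True, text=True)
    if result.returncode != 0 or not result.stdout:
        return "unknown"
    return result.stdout.splitlines()[0].split(":")[0]

if __name__ == "__main__":
    for lib in needed_libraries("/bin/ls"):
        print(lib, "->", owning_package(lib))
```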
Debian currently has about 35,000 packages, binary packages. About 17,000 source packages. A graph like this, if I could compute it, would have so much information in it that
it's useless. Nobody can actually make use of it. In order to be able to manage all of this, it would be nice if things were simpler. I'm going to come back to that in a moment. But it's not just about the complexity of the depends fields or requires fields
for RPM. If you have 35,000 binary packages and you're a user, what do you do when you need to install something? If someone gives you the name of the package to install, it's easy because you don't need to decide anything.
But if you hear that there's a nice program for finding duplicate files somewhere in Debian, what do you do? Well, you can do some searches and stuff and most people, what they do is they enter a state called decision paralysis.
I get this in restaurants with large menus. However, geeks who love their computers tend to be able to overcome this because it's a familiar ground for them. If I go to a restaurant that I've been to before, it's easy for me to choose something.
New restaurants are a bit harder. Decision paralysis is not a joke. It's something that actually happens to people and it's one of the fundamental things you have to know when you make user interface designs. Not that I'm very good at that, but I hear it's important.
Presenting the user with too many options is not good. It's difficult to choose. The case of duplicate file finders in Debian, we have so many tools for that now that people are wanting to write tools for finding duplicate file finders.
This was probably a joke by the person who suggested it, but I think it's a viable tool to make.
The complexity graph isn't so big yet that we can't manage. We are doing a reasonable job of making sure that everyone has the right dependencies, but it's only getting worse and worse over time. When I became a Debian developer, I knew by heart what every package did, what the
purpose of every package was. I had installed most of them manually. Well, all 200 of them, manually, at some point or another. Today, that's not possible. And it's not just dependencies that are getting
too big. It also affects things like transitions. Say you have 35,000 packages and some of them, say the GTK libraries (there's a bunch of them), need to be upgraded to a new version.
This basically means you have to rebuild everything that depends on GTK. So all of GNOME and all of XFCE and probably a lot of other stuff. Some of that will fail. Sometimes it's a build failure. Sometimes it's a runtime failure that you will find out about later.
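To get a feeling for the size of such a transition, one could ask apt for the reverse dependencies of the library being upgraded. A rough sketch, where the package name is only an example:

```python
# Sketch: count how many packages would be affected by a library transition,
# using apt-cache rdepends. The package name here is only an example.
import subprocess

def reverse_dependencies(package: str) -> set[str]:
    out = subprocess.run(["apt-cache", "rdepends", package],
                         check=True, capture_output=True, text=True).stdout
    deps = set()
    for line in out.splitlines():
        line = line.strip()
        # Skip the header lines apt-cache prints before the list itself.
        if line and line not in (package, "Reverse Depends:"):
            deps.add(line.lstrip("|"))
    return deps

if __name__ == "__main__":
    affected = reverse_dependencies("libgtk-3-0")
    print(f"{len(affected)} packages depend on libgtk-3-0")
```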
While you're doing this transition, somebody starts a different one. What you get is two groups of people working on different sets of packages that sometimes overlap. And in that case, what you get is what I would like to use the technical term chaos
for. You have people who make changes in one set of packages in order to fix their problems and then cause problems for other people who then fix that and cause problems back to the first group of people.
And while they're doing this ping pong, you get a third group of people who want to touch the same packages. So in order to avoid this, you have to do something. And one way of doing this is to serialize things. You do one transition at a time with 17,000 source packages.
If you do one transition per source package, your releases are going to get very long. Debian has been doing less than two years per release for the past three releases, which
is very, very lucky and hard to achieve. We don't want to make that harder. We want to make it slower. Sorry, you want to make it easier, of course. It's also not just about development. It's also about things like testing and support.
There's hundreds of people in this room. How many of you are running Debian? Quite a number. I would suspect that each of you has a unique set of packages installed.
Nobody else in this room has the same set of packages installed. Even if there's two people having the same packages, they're very likely to have different versions. How do you test when you're developing a distribution? How do you test so that all these combinations of packages and versions actually work?
I think the term combinatorial explosion applies here. It also applies when your grandfather calls you and tells you that, oh, my Linux laptop doesn't boot anymore, or the browser crashes all the time.
How do you fix this? Well, you might try to reproduce it on your own laptop, which is running a different version of everything and doesn't have some of the packages installed, because he didn't tell you that he installed Flash himself.
So you get a situation where testing is not meaningless, but almost, and support is unnecessarily hard. For things like transitions, basically what we want is to be able to branch and merge
an entire distribution. And we want to do it with Git, not with RCS. If you serialize transitions, that's essentially using RCS at the distro level. RCS has a global lock that everyone honors.
You have to have that lock before you can make a change. Exactly like serializing a transition. However, distributed version control systems like Git have shown that if you have good merging, then it's okay for people to do parallel development.
And if you can branch an entire distribution in order to make all sorts of wild changes, like replacing one version of GNOME with another, or replacing GTK with QT, if that's
what you prefer, then you can. And you can do that without hurting anyone else. All of this leads me to think that in 2012, binary packages, the way we know them
currently, are a bad idea. I think we should do away with them. We should, instead of having a separate tiny binary package for every upstream project, we should collect these packages into larger sets of packages.
A collection of software, if you wish. The purpose of this would be then to have a set of packages that you know work together, so that if you're doing web services, you know that you installed this set of software.
And this set of software has been known and proven to work, or shown to work together. It's not a mathematical proof, it's just a software developer proof. All sorts of things become easier. Then you have larger collections of software.
You don't need to depend individually on every little bit, every tiny little binary package. You can just depend on a larger collection of software, making this graph manageable by humans again.
It also means that people who do testing have an actually meaningful job, because they're testing something that people will actually be running. All the people who want GNOME have the same set of packages, same set of software installed, and this leads to all sorts of simplifications.
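What such a collection would look like concretely is left open here. Purely as a hypothetical sketch, with invented names and fields, a collection manifest might be as simple as:

```python
# Purely hypothetical sketch of a "software collection" manifest: a named,
# versioned set of packages that has been built and tested together.
# Nothing like this exists in Debian today; names and fields are invented.
from dataclasses import dataclass, field

@dataclass
class Collection:
    name: str
    version: str
    members: list[str]                                 # upstream projects built into this collection
    depends: list[str] = field(default_factory=list)   # other collections, not individual packages

WEB_STACK = Collection(
    name="web-services",
    version="2012.2",
    members=["apache2", "postgresql", "php5", "memcached"],
    depends=["core-runtime"],   # depending on whole collections keeps the graph small
)

if __name__ == "__main__":
    print(f"{WEB_STACK.name} {WEB_STACK.version}: {len(WEB_STACK.members)} members")
```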
Everyone went very quiet. I must have said something that nobody agrees with. In 1993, 1994, Debian had, and I believe Red Hat also had, a few hundred packages.
We have grown a hundred times since then, and with the scale of growth, all sorts of problems appear. Things that were small problems in the 90s have grown big.
It's like if you go on a walk and you have a small pebble in your shoe. If you're walking 100 meters, you don't care, because you're about to stop and you can take off your shoe then. If you're walking 10 kilometers, 100 times longer, then if you don't take the pebble out, your feet are going to be bloody,
and that's not enjoyable. If there's one thing the Finnish army taught me, it's that bloody feet are not fun. As an example of a pebble, in Debian, a package that includes shared libraries
needs to arrange for ldconfig to be run after the libraries have been installed. If it doesn't do this, then things break. So everyone who packages a shared library for Debian needs to arrange for this to happen. That's thousands of packages, hundreds of people.
If instead we had the package manager automatically do this, things would be simpler. It would be one tiny pebble less. And it's not like it's difficult to call ldconfig in your postinst script, but it's unnecessary work and should go away.
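As a hypothetical sketch of that idea (the hook interface is invented, not a real dpkg API), the package manager could run ldconfig once per transaction:

```python
# Hypothetical sketch: the package manager itself refreshes the shared-library
# cache once after a transaction that installed shared libraries, instead of
# every package's postinst doing it. The hook interface here is invented.
import subprocess

def post_transaction_hook(installed_files: list[str]) -> None:
    # If any installed file looks like a shared library, run ldconfig once.
    # (Would need root privileges in practice.)
    if any(".so" in path for path in installed_files):
        subprocess.run(["ldconfig"], check=True)

if __name__ == "__main__":
    post_transaction_hook([
        "/usr/lib/x86_64-linux-gnu/libexample.so.1.0.0",
        "/usr/share/doc/libexample1/changelog.gz",
    ])
```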
Collecting software into bigger collections, of course, means that we reduce flexibility a little bit.
When you're installing a machine, you don't get to pick and choose all the packages yourself. You don't get to choose that I want these bits of KDE, but not these other bits of KDE, I want to use GNOME bits instead.
If you decide to do this, then with some ingenuity, we would be able to still make this possible. With some extra work. But for the majority of people who don't want to do this, things would be massively simpler.
Does anyone else think this is a good idea? Does anyone else... Does anyone think that is a bad idea? I have a Debian release manager in the front row who keeps scowling at me.
Anyway, I think this would make the life of distro developers much easier and should be experimented with. I'm not saying this is something that will make life better.
Nobody knows until it has been tried. So we need to find a way to try that. The other thing I think distro developers should embrace
and possibly extend is continuous integration and delivery. The software development world in the past about 15 years has learned a lot of new things.
Not everyone agrees that they're all good, but the group of methodologies known as agile development have increased the quality of software products quite a lot. Few people talk anymore about the software crisis
unless they want you to pay them more money. One thing that especially has started to happen in the past 15 years a lot is that if you are writing a new piece of software you're expected to provide an automated test suite
which would consist of things like unit tests for individual functions and classes and integration tests for the entire software and deployment tests for the installed software when it's actually running
and possibly all sorts of other kinds of testing. Agile development, and extreme programming before that, did not invent automated testing. However, they have gone in for automated testing in a way that nobody else did before
and I think this is a good idea. We should have more of it, and we should have it at the distro level. We should be able to verify that when we upload a new package, or a change to a package, the end result, the distro, actually still works
and it passes all its automated tests. Some of this exists; some developers, especially embedded developers, are using continuous integration. I don't think the big distros do much.
Or not systematically. The idea of automated testing is separate and independent from the idea of abandoning binary packages as we know them. But I think they would work together well.
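As a minimal illustration of the unit-test end of the test suites mentioned above, in pytest style, with an invented example function under test:

```python
# Minimal sketch of the unit-test end of an automated suite (pytest style).
# find_duplicates is an invented example function, echoing the duplicate-file
# finders mentioned earlier in the talk.
def find_duplicates(items: list[str]) -> set[str]:
    seen, dupes = set(), set()
    for item in items:
        (dupes if item in seen else seen).add(item)
    return dupes

def test_no_duplicates() -> None:
    assert find_duplicates(["a", "b", "c"]) == set()

def test_reports_each_duplicate_once() -> None:
    assert find_duplicates(["a", "b", "a", "a"]) == {"a"}
```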
So the flowchart I would like to have is that everything starts from a developer committing some source code into a version control system which they could choose freely as long as it's called Git. Because I've decided that Git wins.
But it doesn't actually matter. After this, they do a push and then some automated build daemons create the binaries. It might be binary packages as we know them or some other form.
And then these binaries are joined into system images which can be tested. You don't just have a binary package for NetworkManager. You create a system image which includes NetworkManager and then you verify that it works.
So you might have a system image. You might need many many flavors of this and then have a system image which you boot under a virtual machine and verify that it gets network up.
And if it doesn't, you fail the test and you fail the upload. And people do not get to use this version of the software because it obviously doesn't work. The benefit of this is that the developers get
quick feedback that something went wrong. And by quick, I mean minutes, possibly hours at worst. We do not want to make the feedback loop so long that you do an upload into Debian Unstable. I'm using Debian as an example because that's what I know best.
And then you wait at least 10 days before the upload gets into testing and lots more people get to use it. And then you wait for a while because people are slow to upgrade. And then a few weeks later, you get a bug report saying that NetworkManager crashes on startup,
which would of course never happen, but hypothetically. And by the time those three or four weeks have gone, you've entirely forgotten what you did. You might have done an upload of NetworkManager and entirely forgotten that you have ever touched it.
And this slows down development quite a lot. If you can make it minutes, possibly an hour or two, all the things that you did are still fresh in your memory and it's easier to fix the problem. If necessary, you can back out of the change and then figure out what to do instead.
So the development process becomes more efficient. It also gives a lot more confidence to the developers because the developers know that, okay, I made a change, all automatic tests still run and pass.
So I probably didn't break anything much. And this becomes especially important if you want to make big changes. If you want to replace or upgrade the KDE libraries to a newer version, in a few hours, you will know if you broke anything really bad.
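A rough sketch of the kind of functional system test described here: boot the freshly built image under QEMU and fail the pipeline if an expected marker never appears on the serial console. The image path, marker string, and timeout are all assumptions:

```python
# Rough sketch of a boot smoke test: boot a system image under QEMU and fail
# if an expected marker never appears on the serial console. The image path
# and marker string are invented; a real pipeline would test much more.
import subprocess
import time

def boot_test(image: str, marker: str = "network: up", timeout: float = 300.0) -> bool:
    # Boot the image with the console on stdio so boot messages can be watched.
    proc = subprocess.Popen(
        ["qemu-system-x86_64", "-m", "512", "-nographic",
         "-drive", f"file={image},format=raw"],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
    )
    deadline = time.monotonic() + timeout
    try:
        for line in proc.stdout:
            if marker in line:
                return True        # the image booted far enough
            if time.monotonic() > deadline:
                return False       # took too long: treat as a failed boot
        return False               # QEMU exited before the marker appeared
    finally:
        proc.kill()

if __name__ == "__main__":
    if not boot_test("candidate.img"):
        raise SystemExit("boot test failed, rejecting this upload")
```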
So that would be the functional system test. If we do this, we can then also start adding more non-functional tests, if you wish. Tests, for example, to verify that boot time hasn't suddenly increased a lot.
Or that the size of your system image isn't suddenly several gigabytes larger than you wanted it to be. Or that your Apache web application still performs well under stress testing.
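Such non-functional checks can be as blunt as hard budget assertions. A sketch, with arbitrary thresholds:

```python
# Sketch of blunt non-functional checks: hard budgets on image size and boot
# time. The thresholds and file name are arbitrary assumptions.
import os

MAX_IMAGE_BYTES = 2 * 1024**3     # fail if the image grows past 2 GiB
MAX_BOOT_SECONDS = 30.0           # fail if boot gets noticeably slower

def check_image_size(image: str) -> None:
    size = os.path.getsize(image)
    assert size <= MAX_IMAGE_BYTES, f"image is {size} bytes, over budget"

def check_boot_time(measured_seconds: float) -> None:
    # measured_seconds would come from the boot test, e.g. time to the marker.
    assert measured_seconds <= MAX_BOOT_SECONDS, "boot time regression"

if __name__ == "__main__":
    check_image_size("candidate.img")
    check_boot_time(12.3)         # placeholder measurement
```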
We should also never forget that automated tests don't automatically find everything. You need human testing as well. But human testing becomes much more efficient if you can concentrate on things when testing
that a computer wouldn't find anyway. At least not until we have working artificial intelligence. Anyone here who has worked with a professional tester during software development? A few people. I have worked with a few and it's an eye-opening experience
to see them find problems in your code that you didn't know could exist theoretically. A good tester might not be a programmer at all, but has an excellent, intuitive way of figuring out the areas of a program where bugs will exist.
And I don't know how they do this, because if I did, I would fix my code. But as a simple example, I would not have expected
a tester to test the input form of a program I once wrote with his forehead. He kept banging the keyboard with his forehead until my program crashed.
It would actually be possible to automate this test, but you have to have someone who can tell you that. Yeah, simulate head banging. So we will always need manual testing, but we shouldn't make people do work that computers do well.
The goal of this entire flowchart is to get to the bottom box, the release. And if we have automated testing and a group of people doing testing who say that they can't find anything too bad,
then the release can happen by pushing a button in an ideal world. There will always be some complications. However, the confidence of making that release will go up if we have a continuous integration and delivery pipeline.
An executive summary for those who came in late. We have a very complicated Linux system currently,
and it would be nice to be able to simplify that. I have one idea for doing that. And continuous integration and automated testing will save the world, and then we'll have bunnies and puppies and ponies and everything.
Any questions? Yes, there's three microphones. One in front, one at the staircases over there if anyone has a question.
So what happens with your continuous integration on the scale of a distro? How are you going to do that without actually serializing development in exactly the same way that you've already worried about?
In order to avoid serializing development with continuous integration, you have to have a continuous integration system that works really fast and can react to enough changes per day that it can be done continuously.
One way for this is to break this box down into smaller pieces, where the first box runs very fast.
It's a set of smoke tests, and this can be run, well, in Debian terms, for every dinstall run. And then you have larger tests, a sequence of tests that are larger and larger
and take longer and longer to do until you get to the end. Running a stress test, for example, might take a day or two. It's not possible to do for every package upload. But you can do smaller tests, faster tests, for much smaller groups of packages.
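One way to picture that tiering, as a purely hypothetical configuration with invented names and timings:

```python
# Hypothetical sketch of a tiered test pipeline: fast smoke tests for every
# upload, progressively slower suites run less often. Names and timings invented.
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    runs_on: str          # when this tier is triggered
    budget_minutes: int   # how long the tier is allowed to take

PIPELINE = [
    Tier("smoke",       runs_on="every upload",       budget_minutes=5),
    Tier("system boot", runs_on="every few hours",    budget_minutes=60),
    Tier("full stress", runs_on="nightly or weekly",  budget_minutes=2 * 24 * 60),
]

if __name__ == "__main__":
    for tier in PIPELINE:
        print(f"{tier.name}: {tier.runs_on}, up to {tier.budget_minutes} min")
```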
You also are going to be needing a lot of hardware. And for some architectures, they might not be able to keep up. This is not necessarily a really bad thing, because most programs work reasonably well,
regardless of where you run them. There will always be architecture-specific complications, but you get most of the benefit just by running these tests on a fast architecture. Next question.
Lars, thank you for that. Do you realize that there is an existing system that has been doing this for years now, including the testing, including the package management, including the contained package management, the rollbacks? It's called Nixpkgs?
Yes, there are non-mainstream distributions that do at least some of these things. Yeah, the Debian people have looked at Nix, but since it doesn't align with the, what's it, the FHS? They decided not to go for it, but actually it solves a lot of the problems.
Yes. I did not mention Nix and other alternatives by name, not because I don't think they have any value. I think they have great value and should be explored, but because I wanted to concentrate on these two things.
And I hope Nix will help Debian see the light. So, did I understand you correctly that you were proposing another layer on top of, for example, tasksel, or making tasksel more fine-grained,
or do you want to make the binary package go away completely? I want to make binary packages go away completely. Okay, and what speaks against having a layered tasksel architecture where you just say, okay, we have some tasks which are,
or binary packages which are combined in a way, or? We pretty much have that in Debian with tasks and meta-packages, and it's not working very well in my opinion. Well, but probably these tasks are not fine-grained enough because they just install a really large set,
so you just have something like desktop and server and stuff like that, and I think if this were a little bit more fine-grained, it would probably also solve a lot of the problems, where you can just say, okay, we focus on these tasks
and use our existing binary package-creating system and so on. You're right, it's possible that it would work. My intuition is that it doesn't, but until I have working code, I'm full of error. Okay.
Hi, I'm just curious, what do you think about Debian rolling release, and in the future, how much do you see this principle being used in Linux distributions, and how successful could it actually be? The Debian rolling release is not something I have much experience with.
I believe I personally would not like to run a system that changes every day. Right, no, as a concept, what do you think? I want to hear your opinion. If my ideas and further ideas get implemented properly,
then I think a rolling release will work much better. Instead of just grabbing every day's new packages, you can grab those packages that don't break automated testing, and I think that would be something that a lot of people would like to run.
Yeah, because I see a lot of potential in what you're saying here for that kind of thing, rolling release. But what do you think about rolling releases in the future? Will Linux distributions head towards that goal, or are they going to spread to something else? I think the future of Linux distributions
has at least three different directions simultaneously. One is to have something like rolling releases or extremely frequent releases like once per month, because this lets people run the latest software in a reasonably stable environment.
The other direction is yearly or biyearly stable releases, the way, for example, Debian is doing them or Ubuntu is doing them or Fedora is doing them. Basically, continue the existing practice. And the third one is long-term releases that happen,
how should I put this, that happen quite infrequently because people who want to run a decade per release get very anxious about this five-year very short release period.
But these really long release cycles for this would include updates like RHEL does for hardware-dependent stuff so that you can continue to run your five-year-old distro release
on modern hardware. Okay, thanks. Anyone else? In that case, I think we are done. And I thank you. Have a good lunch.