Continuous Integration for Commitfests
Formal Metadata
Title: Continuous Integration for Commitfests
Author: Thomas Munro
Title of Series: PGCon 2018
Number of Parts: 37
License: CC Attribution 3.0 Unported — You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers: 10.5446/49131 (DOI)
Language: English
Transcript: English (auto-generated)
00:15
I'll start. So hi, everyone. My name is Thomas Munro. I work for Enterprise DB.
00:21
And I'm going to be talking about the continuous integration project that I've been doing to test patches in Postgres. Yeah, hi. OK, so I'm going to get started. So I work for Enterprise DB. I've been there for three years now. And I guess the biggest thing I've done so far
00:42
is parallel hash joins, which are coming into Postgres 11 and some other stuff. So I made this thing that's a website. And it shows you a list of all the patches that are currently proposed for Postgres and answers three very simple questions.
01:00
Does the patch apply? Do the tests pass on Linux? And do the tests pass on Windows? It's just these little things here, right? And there's a view of those that can be filtered by author. So if you're a patch author — I've put Michael Paquier here, because he's like the patch author superstar.
01:21
He has more patches than anyone else by far. And if you click on those little things there — if it's a fail, you click on it and you can go and read the build log; or you can always read the build log, not just for fails. And there's one for Windows and one for Linux. And I find that pretty useful, because I don't have Windows.
01:41
I don't work on Windows, so in this case, you can see someone here also doesn't test on Windows and something broke. And that's quite useful to know, right? So the other thing is that core file backtraces are shown if something crashes, and regression test output is shown.
02:01
OK, so that's what it is. I thought I'd just come straight out and show that. I mean, it's very simple. And now I'm going to talk about the motivation behind that, and what I'm trying to achieve, and where I'm trying to go with this thing. So basically, the PGSQL hackers mailing list has around about 140 people contributing code.
02:22
There's slightly more people contributing code than people contributing to commit fests, because not everything gets registered. But it's that kind of order of number. We have about 500 people contributing to discussions. And there's currently around about 250 patches in consideration being proposed at any given time, which
02:41
is a lot of code, right? Now, we have the commitfest website, where these patches are proposed — we run four, well, five commitfests a year. So all the actual work of proposing patches
03:03
and discussing them and getting them committed all happens through the mailing lists. But we have this website where we just register the patch. You want someone to review it. So the website is an important part of the process. But we're not actually really using the website for the review work, as some other projects do. And that's the case for many larger projects,
03:21
famously Linux, but also some others. Using a mailing list centric system seems to work quite well for things that involve a lot of discussion. So putting things into the commit fest website is our equivalent of making a pull request.
03:40
Now, over the past few years, this is the complete history of the current commit fest application. There was another commit fest before that, I think, made by Robert. I didn't try and pull data from there, because I don't know where it is. But if you go back to just four years ago, we had around about 80 patches in each commit fest.
04:02
And now we're up to about 250. I think that's pretty impressive. And that's actually happening because we have more contributors. Or depressing. Or depressing, yeah. And if you look at those two charts, you can see that they're highly correlated. Basically, we have more people showing up, and more people write more code, which is cool.
04:20
It feels better than before, when it seemed like we just kept punting things forward. Right — and that's actually a really interesting question: do we just keep punting things forward? If you look at this, I've shown in blue here when things are being moved forward, which raises the question, well, how far do things get punted? So I tried to capture that with another chart here. I took all the patches that reached a final status
04:44
in the last commit fest. Now, that was a final commit fest, so maybe the statistics would be different for a different one. I didn't look into that. But you can actually see that around about half of our proposed patches are committed within one commit fest, so basically slam dunks. Half as many, again, take two commit fests to get in.
05:02
And then half as many, again, take four commit fests to get in. And then we have some outliers over here. And we have some stuff, of course, which has never actually reached a final state. It's been bouncing around for years. So that's quite a lot of code that you have to keep working for a long period of time and keep up with changes. And my kind of central idea here is that wasting people's time is a really bad idea.
05:24
It's not going to help us deal with all this code. So there's a whole series of different things that you can discover automatically. The most trivial one, usually it's trivial, is just bit rot, because the master branch keeps changing, like seven or eight or whatever commits per day.
05:41
And your code's going to bit rot — a typical patch seems to last a few weeks, and then it stops applying, right? But sometimes the bit rot's more serious: it's actually telling you that the code is changing in an incompatible way, and you'd better keep up with that. It's an important change, rather than just whitespace or some other crap like that.
06:01
The other thing is that other compilers can be pickier than yours, or have more warnings enabled, or something like that. Tests fail, because people don't always test their stuff. They might run make check, but they might not run the full make check-world, or they may not have all the obscure build options or TAP tests enabled.
06:21
That happens all the time, actually. Pretty good percentage of patches fail in some obscure test that no one ever really runs manually. Then we have portability bugs, endianness, word size, OS libraries, all that kind of stuff. So Windows is a great example. Loads of patches just don't build on Windows the first time anyone posts the patch,
06:41
and eventually we get that fixed. There's also things that are non-deterministic, like just running stuff on a different computer can find race conditions or uninitialized data, which worked out OK on your computer, but doesn't work on some other computers. So just running things on more computers is a good idea, right? And very frequently, patches have broken documentation, because people just jump in there and edit the text
07:02
and then send the patch, and they never actually build it, because it's kind of painful to install the tools and everything. So we do have a build farm, which is an incredibly valuable thing that tests everything in the master branch on a ton of different computers. But that happens after commit, so if you break the build
07:21
farm, you're wasting a lot of people's time. Not only are you requiring people to do things to fix it, it's also a single shared resource, the build farm, right? If you turn it red, then no one else is getting their stuff tested until you turn it green again, right?
07:40
So that's kind of like, we need to have that, but I'd like to push some of the automation earlier in the pipeline and do it at a point before you've actually reached the shared resource, right? So how does this thing work? Well, exactly a year ago at this very conference, someone was complaining to me about my patch not applying.
08:03
It's quite a large, complicated thing, which is still not committed, and there's a kind of mental load involved in getting people to look at this particular patch. And I felt really bad that someone had taken an interest in the thing and gone and tried to apply it, and then I'd kind of wasted their time, right?
08:22
So my first step was that I set up a cron job that would just email me in the morning when my own patches don't apply — roughly like the little sketch below.
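A minimal sketch of that first cron job, assuming hypothetical paths and addresses — an illustration, not the script I actually used:

    #!/usr/bin/env python3
    # Rough sketch: check whether my own patches still apply to current
    # master, and email me if any don't.  REPO, PATCH_DIR and the email
    # address are made up for illustration.
    import glob
    import smtplib
    import subprocess
    from email.message import EmailMessage

    REPO = "/home/me/postgresql"       # local clone tracking master
    PATCH_DIR = "/home/me/my-patches"  # latest version of each of my patches

    def still_applies(patch_file):
        # "git apply --check" changes nothing; it only reports whether
        # the patch would apply cleanly.
        return subprocess.run(["git", "-C", REPO, "apply", "--check", patch_file],
                              capture_output=True).returncode == 0

    subprocess.run(["git", "-C", REPO, "pull", "--ff-only"], check=True)
    bit_rotted = [p for p in sorted(glob.glob(PATCH_DIR + "/*.patch"))
                  if not still_applies(p)]
    if bit_rotted:
        msg = EmailMessage()
        msg["Subject"] = "bit rot: %d patch(es) no longer apply" % len(bit_rotted)
        msg["From"] = msg["To"] = "me@example.com"
        msg.set_content("\n".join(bit_rotted))
        smtplib.SMTP("localhost").send_message(msg)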
08:42
And it's, you know. So then I stopped doing that. And I thought, well, what's the right way to do this? So before you even get to actually running the code, the first thing you have to do is apply the patch. And if you go and look at the history of patch,
09:05
I mean, recently there have been a few headlines because of some recent CVEs there. Patch is an amazing piece of software that's enabled us to do what we do — but, man, that software is just full of holes. It was designed in a different time.
09:20
There are ways to shell out and make it run shell scripts. It didn't really stay inside the directory it was supposed to be targeting — you could escape from it. There are all these different things you could do with it, and they keep patching it and improving it. There's more than one strain of patch — I don't know if they're related or not — there's the BSD one and there's the GNU one, and there may be more.
09:42
Some of these CVEs actually apply to all of them, which I think means they must have a common ancestor, but they seem to be a bit different by now. Whatever. You can't trust it. So what I came up with is: I run this thing in a jail. I use FreeBSD for this.
10:01
I have this pristine source tree with nothing from the internet in it. Then I clone that, and I just do all the work inside the clone. The whole process takes less than a second to start up a jail, which is like a container. I think FreeBSD jails might be better than containers on Linux in terms of security,
10:21
but I'm not entirely sure. But it's like a Solaris zone or whatever. It can't access the network. There's resource limits, and you just nuke it, and it's gone, right? So that way I can finish up with a way to take all these patches and push them out to GitHub. Once they get to GitHub, you see all these branches
10:40
are created. That's the commitfest ID number. You can go and look at that on GitHub — I've got this user, postgresql-cfbot. So that's the end of the involvement of my server that's churning through these patches. Once it gets there, if you look at the individual branches on GitHub,
11:03
you can actually see the patch that's been applied. It's got a machine generated commit message that just tells you who really wrote this, and here's a link to the mailing list, and so on. So you can see where the actual patch came from and so on. So the next step is to actually build and test the thing. I didn't really want to own that problem myself
11:21
for the reasons described before. I just don't really, it's code from the internet. I mean, I don't think anyone's really going to send malicious code to the Postgres mailing list, right? But still, as a matter of principle. Now we know what you want. Yeah. So fortunately, there's this whole industry now of companies providing continuous integration
11:41
as a service, which basically means build boxes, right? And they're just like, I think these things might, I think Travis ones are on EC2. I don't know, but basically it's just like a really easy way where they'll monitor your GitHub repo or any other public Git repository. And if they see a certain control file in there, they'll just run the stuff that's in there, and that's really convenient for this.
12:04
AppVeyor is a similar company that does it for Windows, which I think is incredibly valuable, because I don't want to touch Windows, and they'll do it for me. It's fantastic. Fantastic. So, anyone here using these kinds of tools themselves? OK. So almost half the people in the room.
12:22
That's pretty cool. So it's very easy to set up. You just stick one of these control files into the tree. I think in some cases, you can also configure it to get configuration from outside the tree. But the really lazy way is to just like, I actually just have a patch sitting in my home directory, and I just apply that patch, push it to some branch name,
12:42
that I might just make up, and then push it on to GitHub, and then these things will build it for me. That's all you need to do, right? It's very simple. So using that yourself is quite easy — I use it myself when I want to test Windows stuff. But what this talk is about is how to plug our mailing list into that. So yeah, that's involved a bit of extra machinery.
13:03
So what I finished up building is this: I just have a five-bucks-a-month kind of VPS, a tiny virtual machine, that simply polls these things — not too frequently, I hope. I haven't had any complaints from the pginfra guys yet.
13:20
And it simply pushes that stuff out to GitHub, where it goes on to Travis and AppVeyor. If you go directly onto the Travis website, you can actually see where it's detected those branches, and they'll build, I think, three or four at a time. And you can see the build history for each individual branch. So yeah, that's all relatively simple — in outline, it's something like the sketch below.
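A simplified sketch of what happens for each Commitfest entry — the branch naming, remote name, commit message fields and helper steps here are assumptions for illustration, not the actual cfbot code, and the real thing applies the patches inside the throwaway FreeBSD jail described earlier:

    # Illustrative sketch only: build one Commitfest entry as a GitHub branch.
    import subprocess

    def push_entry_branch(cf_id, entry_id, patch_files, author, thread_url,
                          workdir="/tmp/clone/postgresql"):
        branch = "commitfest/%s/%s" % (cf_id, entry_id)   # hypothetical scheme
        run = lambda *cmd: subprocess.run(cmd, check=True)
        # Start from a clean copy of current master (in reality, inside a
        # fresh jail on a ZFS clone of a pristine tree).
        run("git", "-C", workdir, "checkout", "-B", branch, "origin/master")
        for p in patch_files:                     # apply in sorted order
            run("patch", "-d", workdir, "-p1", "-i", p)
        run("git", "-C", workdir, "add", "-A")
        run("git", "-C", workdir, "commit", "-m",
            "[CF %s/%s] automatic patch application\n\n"
            "Written by: %s\nDiscussion: %s" % (cf_id, entry_id, author, thread_url))
        # Once the branch lands on GitHub, Travis and AppVeyor take over.
        run("git", "-C", workdir, "push", "-f", "github", branch)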
13:41
And then I collect the results up. I actually only just started doing that — earlier I had it so my website was just showing the badges, which were just an image sourced from their website. But then I didn't really have the data, and that led to some interesting problems, because I couldn't really do the thing I'll talk about on the next slide, what I'm planning to do with that data.
14:00
So it doesn't work perfectly. I've still got some ongoing battles. In fact, I was hoping, by talking about this here, to see if anyone knows how to fix some of these problems. Windows was a major part of my goal here: to test on an operating system that I don't have, and that I think maybe 1% of Postgres hackers actually work on. I don't know — who works on Windows here?
14:20
Anyone? OK, so we've got — who likes it? So to get Windows working, I asked Andrew Dunstan, who's over here — well, I just asked on the mailing list, and Andrew very generously
14:41
extracted some bits of the build from the Perl scripts that he maintains, to get this thing going over there, and I've been tweaking and hacking on that ever since. The thing that it can't run is the tablespace test. And this is what I'm hoping, by talking about this problem, someone can help with: the tablespace test —
15:02
if you actually run as administrator, you get a permissions failure when you try and run the tablespace test. Did you know about this? I think you mentioned it. OK. That's the test that we have the most trouble with. So as a Unix guy, when I see this thing where if you have the most privileges, then you get told you can't do it because you
15:21
don't have enough privileges. So you just drop to a lower-privileged user and then it works. I don't know what's going on with that. So I'm hoping that someone can help us figure that out. It doesn't run check-world yet. It really should do that, but that's just because I haven't got around to stealing those bits of the build from Andrew's scripts yet.
15:41
And I'll get there. OK. There's some other occasional transient failures which are kind of annoying because they make the whole thing less reliable. Mainly it's that source forge is really flaky. And I don't know what they did. Actually, I mean, it's kind of a venerable thing that's
16:02
kind of — anyway, I don't want to talk about SourceForge. But basically they get DDoSed all the time, because for some reason people on the internet hate them and attack their servers. And they moved their data center recently, and that's the reason why everyone who runs make docs on a computer that doesn't
16:22
have the XSL stuff installed found that it stopped working — Robert and I discussed this on the mailing list a couple of months ago or something. Several bits of the build actually go out to the internet, so it breaks if that's not working — SourceForge in particular, which hosts some of the XSL files that we need.
16:41
So that's a source of transient failures. Probably like one in 1,000 builds or maybe even less. But it's kind of annoying because you see the red flag, you go and investigate, and you just wasted your time, right? There's also one particular tap test that I haven't really figured out. It occasionally fails the crash restart one. Sometimes there's like a 60 second timeout in there and it reaches the timeout.
17:01
But I can't figure out why it reaches the timeout. Is it because the virtual machine is just being extremely slow? I'm looking over here because this is his test — but I think there may actually be a bug, I don't know.
17:21
Yeah. That one is probably telling us there's a bug — I don't know what it is. There are some weird timeouts we see in the regular build farm occasionally, and those are on quite slow machines. Well, this would have to be so slow that I don't think that could explain it, because the timeout in question is 60 seconds.
17:40
It's an immediate shutdown: wait 60 seconds, and it hasn't shut down. So maybe the test script's busted. It happens every day or two, so I suspect it's just exposing some weakness in the test script. Hang on a second — couldn't it be that it's stuck, with the signal handlers being
18:00
busted and doing non-reentrant stuff? No, because we do kill -9 in the background. All right. OK. The timer could be the problem.
18:31
So yeah, that's probably not a false negative, but I don't know. I can't prove that. So yeah, what am I going to do to fix those problems with the network? I probably need to figure out how
18:41
to get more of the stuff to be local, so that it doesn't have to go to the network. I haven't really got to the bottom of that yet. I actually proposed that we might change our docs build so that it doesn't point to SourceForge anymore and instead points to — I forget the URL they use now — the site run by the DocBook people.
19:00
They've actually moved off SourceForge, but our stuff still points to the old location. So yeah, that might be an improvement. Some of the plans for the future. Well, the main thing that I want to get done, and I'm in discussions with Magnus about this, is just move the stuff onto the actual commit fest and then shut down my website, because I don't really want to run a website.
19:23
That's a really terrible mock-up — you can see why I don't want to be in the website business. And basically the way this is going to work — well, firstly, it needs to be reliable enough that the information is worth putting onto the main commitfest application.
19:42
This thing is supposed to save people's time; if it wastes people's time, then it wouldn't be a success. I think it's getting pretty close to being reliable enough. So what we're discussing is a way to make it so that the commitfest app would just have an endpoint, an HTTP endpoint, and my cfbot would just hammer it with results, and they'd show up there — something like the little sketch below.
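Purely as an illustration of the idea — the endpoint URL, field names and authentication here are made up, since that side doesn't exist yet:

    # Hypothetical sketch: the cfbot reporting one result to a commitfest
    # app endpoint.  URL, payload fields and auth scheme are invented.
    import json
    import urllib.request

    def report_result(cf_id, entry_id, platform, status, build_url, token):
        payload = json.dumps({
            "commitfest": cf_id,
            "entry": entry_id,
            "platform": platform,   # e.g. "linux" or "windows"
            "status": status,       # e.g. "passing", "failing", "apply-failed"
            "url": build_url,       # link back to the full build log
        }).encode()
        req = urllib.request.Request(
            "https://commitfest.postgresql.org/api/ci-result",  # made up
            data=payload,
            headers={"Content-Type": "application/json",
                     "Authorization": "Bearer " + token})
        urllib.request.urlopen(req)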
20:00
Very simple. I guess you'd probably want to be able to filter them. That's it. I guess one thing is, it shouldn't show results that are stale — where the last test was against a patch version that's since been obsoleted by a newer one. That's a good point. It's hard to do that in your existing app.
20:22
I would think on the main website we'd be able to know which version was tested last. Yes, that's a good idea. How do you keep it updated when new commits go into master — do you re-run all these tests against the main project? So, what I actually do:
20:41
up until about two weeks ago, it was really quite stupid. I just used to just build something every five minutes and just go around in commit fest ID order. But that was actually using too much of the generous resources of travisci.org, for example. I didn't want to just waste their CPUs
21:01
with that level of abandon. So what I do now is I actually poll this page every minute. And if that latest mail date changes, then I go and read the thread. And if I see a new patch that I haven't seen before, I create a new branch. And then I also rebuild every single patch
21:23
once every 48 hours. Those are the two different triggers. Yeah — and for that I pull from master, like an update; I do a fresh rebase each time. Roughly, the decision looks like the sketch below.
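A sketch of those two rebuild triggers — the entry and state shapes and the fetch/rebuild helpers are assumptions for illustration:

    # A new patch arriving in a thread triggers an immediate rebuild; a slow
    # 48-hour sweep rebases everything onto fresh master.
    import time

    POLL_INTERVAL = 60                  # poll the activity page once a minute
    FULL_REBUILD_INTERVAL = 48 * 3600   # rebuild everything every two days

    def needs_rebuild(entry, state, now):
        last = state.get(entry["id"])
        if last is None:
            return True                                  # never built
        if entry["latest_mail"] > last["latest_mail"]:
            return True                                  # new patch in thread
        return now - last["built_at"] > FULL_REBUILD_INTERVAL

    def main_loop(fetch_entries, rebuild, state):
        while True:
            now = time.time()
            for entry in fetch_entries():                # poll commitfest app
                if needs_rebuild(entry, state, now):
                    rebuild(entry)                       # rebase, push, etc.
                    state[entry["id"]] = {"latest_mail": entry["latest_mail"],
                                          "built_at": now}
            time.sleep(POLL_INTERVAL)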
21:40
So your stuff gets tested every two days, minimum, but when you send in your patch, I try to pick it up straight away. With the earlier version, people would email me and say, hey, your thing doesn't work — but actually it would just take a really long time to get around to that ID, because it was too stupid. So what I do now is keep polling this page, and I have discovered that it's not that reliable sometimes.
22:01
Sometimes there's some inconsistency. That's a bit boring. I'm not going to go into that. But I've, hm, OK. So ideas for the future. One thing that I looked into is running Coverity, which is a static analysis tool that can tell you really clever things about your code
22:20
that can be very valuable. And Coverity offers a free service to open source projects. And it does run on Postgres — I don't know who, is it Michael who looks at it? Somebody occasionally reports to us on it. The security team. OK, the security team, right. So the security team has access to the Coverity instance that's run. We do that every Sunday night.
22:41
We have a script that takes current master and all the branches, pushes them and builds them all, and sends the results to the security team. Cool, OK. So it's not public, but it's being looked at. So yeah, that's cool. So anyway, Coverity will let any open source project use their stuff for free. But I looked into doing this, and I figured out
23:01
that for projects over a million lines of source code, Coverity will only let you run one test per day. So it would take about nine months to get through each commit fest. Yeah, that's why we do it once a week. So if you wanted to use Coverity, you'd probably just have to buy it, I think. If you wanted to test everybody's patches with Coverity, like earlier in the pipeline, as I'm doing here. So that's probably not.
23:20
The problem with doing it for Postgres is that there are a lot of false positives. Yeah, there are a lot — a lot of those. Uh-huh. I mean, it's an effort to keep it maintained. Right. So that one's probably not worth pursuing. Running all patches with Valgrind — however you say it — that would probably be a good idea.
23:47
I don't have to do it for every patch, you know — I could just do that once a week, right? I mean, it doesn't have to be every build. OK. Yeah. So maybe it's OK to just leave that to be done on somebody else's machines.
24:01
Yeah. The isolation tests take about four hours, at least on one of my test machines. That's amazing. I run it for unit tests only, for pgBackRest. So Valgrind is run for all the unit tests,
24:21
but not for the integration test, because it's just too expensive. It's painful. And it doesn't really have value after the unit tests. If something does die, then I can just go in and look at it with Valgrind, or I can look at a backtrace or whatever. But the value seems marginal, and the cost is extremely high. So OK.
24:40
OK. So that's another one where it seems like it's best to just leave it to the build farm, where there are apparently two or three machines running it — Andres runs at least one, and Andrew as well. And the Clang static analyzer? No good? There's just not much to it. You know, they're adding stuff, but right now it's just really —
25:02
there's not much to it, let me put it that way. Anything that it reports should definitely be fixed, or excluded, or looked at, or something. It just doesn't have that many classes of issues that it's looking at currently. I mean, over time they can make it a lot better. But we do run it anyway on pgBackRest for every
25:21
regression run, just to make sure we're not losing anything. And it's quick — relatively. So another idea is to add one more platform. I don't want to have a whole build farm's worth of different operating systems, because firstly, no one will give them to me for free. That's the primary reason.
25:42
Also because it's just a pain to maintain all these things, right? So I think one really well-judged system — I was thinking maybe a 32-bit non-Linux system with a big-endian architecture — might be useful, because you'd hit several things at once with one extra build. But I'd have to run that myself. I mean, I'd have to own the problem of running the untrusted code,
26:01
because those companies — no one has weird systems like that; they only have very mainstream systems. I'm partway towards getting that working on Travis with an emulator. Like QEMU or something? This is a constant problem for me, because in Debian we support 32-bit big-endian systems in general, and I have nowhere to test them.
26:22
So I just get bug reports — it doesn't build, or the tests don't run — and I'm like, let me make something up and send it off, and I usually get it right. So this is with QEMU, and it's usually the tests that are failing; the code is fine,
26:40
but the tests are constructed in a way that only works on a little-endian system. So I'm going to have to construct the tests in such a way that they calculate the value they need by other means and compare it to the value the application is generating, and it's a big pain in the butt. So I've been kind of working on that. I'd love to hear about that,
27:01
but I'd be happy to send it to you. That'd be great. So is that using QEMU? And which architecture — PPC? Is it PowerPC? How do you get the big-endian part? It was PowerPC, I believe. It's been a while since I've looked at it. I managed to run Postgres once
27:21
in exactly that, in QEMU. It was so slow — I guess it would probably take hours to run. Yeah, in my case I'm just trying to get the unit tests to run. If I can get them to run on big-endian systems, then I feel pretty good. I actually looked into
27:42
PowerPC VPSs that you can rent on the internet. Whereas you can get an Intel whatever from DigitalOcean or Amazon EC2 for a few bucks a month, Power hardware starts at about a thousand bucks a month, so that's not going to happen.
28:10
But I guess the big question actually comes down to: presumably we've already got buildfarm members to cover this, right? So maybe it's not worth worrying about that. No, I don't think
28:23
you should do that. Yeah. I mean, we've had this discussion before, but is there some way to have somebody do a review of the code, or at least only run it when somebody who is trusted submits the patch? Yeah, but do you trust the pipeline, though? I mean,
28:40
even if you trust the person whose name is on the patch, do you trust the whole pipeline? It could be a forged email, right? No, no, no — in the commitfest app you could have somebody who has a community account, who actually logs in, mark it as okay to run. Another idea is that, if I'm doing all this, I could also actually display
29:01
the built documentation, so that people who are reviewing patches just have the documentation built for them. I guess you'd want to somehow highlight the bits that changed — I don't really know how to do that — or at least take you directly to the pages that were affected. Most of the problems
29:21
I've seen were just markup problems. Yeah. I don't know. Well, it's quick, and we need all this stuff anyway. Yeah, yeah, but that's still the build farm — they own that thing.
29:40
Yeah, yeah. So another idea I had: at the moment I've got that thing that I showed earlier, where I have a FreeBSD jail and I use a ZFS clone and all that kind of stuff, but I could actually get the patch applied inside Travis and AppVeyor themselves. The reason for doing that would be —
30:02
one reason for doing that would be — and this is just a proposal, right — if the goal of this thing is to hand it over to pginfra, then they probably don't want a FreeBSD box. I don't know. We really don't want to be running that. Yeah, yeah, yeah. So even though the —
30:22
That was actually the question I was going to ask earlier: why don't you just do that? Yeah, that would totally work. I'd be able to make that work very easily for Travis. Making the Windows thing do that — you've got to be able to unzip tarballs and all that kind of stuff.
30:42
The tree arrives as some sort of tarball. Do you trust tar? A lot more than patch. Okay. So there could be a halfway option there, where you do some of the work, but actually applying the patch happens inside a computer that we don't own.
31:01
My main reason for that is to make this thing more acceptable to pginfra — we could get to a point where pginfra doesn't have to run any of the untrusted code. I would think we would definitely be open to running it; we'd only need to find a way to actually run that on pginfra as well, just not straight up. Yeah, so —
31:20
the thing that I'm doing here is also something I've been working on, just to make my
31:41
testing pipeline for patches easier. So I can just feed everything to Travis and say, okay, here are all the patches — build and test each one. No matter how many of the patches people apply — the order obviously has to be the same — they can apply patch 1, patches 1 and 2, patches 1, 2 and 3, and I know that they'll work. Because I find that more and more
32:01
I'm spending unreasonable amounts of time just generating and testing patches for things that I want to push up. Even when I've made a very modest number of changes, I then spend an hour, an hour and a half, maybe goofing off and doing some other stuff, waiting for test runs to finish, starting the next test run, and doing whatever. I've been starting to push that work up to Travis,
32:20
because — so, my use case is a little different, but I am able to apply patches from various branches in Travis: you know, create the patch, apply it, test it, then apply the next patch, build again, test, apply the next patch, build again, test.
32:40
Yeah. The issue isn't the testing part. Yeah — yeah, I mean, whatever shell scripting wizardry is required on Linux is not going to be a problem. It's the Windows side. Yeah. So —
33:01
I mean, another question which I guess is going to come up is: some people just want to propose code in a public Git repository. I mean, do we want to keep doing mailing list emails? I'm not here to propose we change anything — I was trying to build something that works with what we have. Right, but I guess eventually people might want to do things differently.
33:21
I don't know. We tell people: if you want, you can include a reference to your Git repo — in the commitfest app there's a field for it. One patch entry ever has filled that out. Literally one. OK, but if that starts happening,
33:40
people can do that, and then that solves the patch-applying problem. It does. And then, if you want your stuff tested on Windows, you use this — I think that would be fantastic. I don't mind that, because then the security issue — I don't care about the security there; we're not the ones running it, it's not our problem. It doesn't solve the problem entirely,
34:01
because somebody has to rebase that branch on top of master to get this continuous testing thing. And you can automate that — if it doesn't conflict. We can't really automate that; our bot doesn't necessarily have write permission. No, we can't. You can just merge in origin/master. Yeah, yeah — then what do you do?
34:20
You can do something that's fairly trivial. I mean, that's not that bad; it should be straightforward. So, there's an area I didn't talk about earlier, because I guess it was too boring to have a slide about it, but now that we're talking about this, it's sort of relevant. One of the jobs that this thing has to do is deal with all the different ways that people post patches.
34:41
And when I first started Doing this, like I started out Just looking for bit rot in my own patches So I was just looking for the way that I spelled them Like just look for files called something.patch And then apply them in alphabetical order, right Or in sorted order But then when I started doing all the patches You know, some people send patch.gz Some people send
35:00
tarballs, some people send bz2, whatever. There are like seven or eight or maybe more different ways that people do it. Somebody sends their patches in .txt files — I don't try to apply those ones. So I actually originally thought
35:20
that was going to be the hard bit, but it turned out to be trivial, because there are only so many ways people post patches (see the sketch below). It's a challenge. CSV? Yeah.
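A rough sketch of that attachment-handling step — the format handling and helpers here are illustrative, not the real code:

    # Turn a directory of downloaded attachments into an ordered list of
    # patch files to apply.  Error handling and archive safety are omitted.
    import bz2
    import gzip
    import pathlib
    import shutil
    import tarfile

    def collect_patches(download_dir):
        d = pathlib.Path(download_dir)
        for f in list(d.iterdir()):
            name = f.name
            if name.endswith((".tgz", ".tar.gz", ".tar.bz2")):
                tarfile.open(f).extractall(d)       # tarball full of patches
            elif name.endswith(".gz"):
                with gzip.open(f) as src, open(f.with_suffix(""), "wb") as dst:
                    shutil.copyfileobj(src, dst)    # foo.patch.gz -> foo.patch
            elif name.endswith(".bz2"):
                with bz2.open(f) as src, open(f.with_suffix(""), "wb") as dst:
                    shutil.copyfileobj(src, dst)
        # Apply .patch/.diff files in sorted (usually numbered) order; ignore
        # .txt and anything else that doesn't look like a patch.
        return sorted(p for p in d.iterdir() if p.suffix in (".patch", ".diff"))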
35:41
I don't know whether it's because people have started looking at my thing or if it's just a coincidence, but I've noticed that people have started sending complete patch sets. In the past, people used to send just one patch — oh, by the way, you also need to apply that one over there, and that one over there — pointing to other threads and so on, and then this thing would just break, because it doesn't speak English. I've noticed no one does that anymore, so I think people must be looking at it. I don't know, maybe.
36:00
For people who do the opposite — there is a certain amount of adapting to the committers whose attention you're hoping to get.
36:23
Yeah, yeah — I don't mind that; I just apply all the patches. You just apply them. But people who work a lot with me know that if they give me six things, I'll get to number three before I say something that I'm unhappy with,
36:41
and then one or two will go away, and they're like, okay, I like that. So there's a bit of gaming in the stack. So there's the SQLsmith thing — Andreas Seltenreich's — which is really incredible. Oh, yeah, sorry, I didn't do that on purpose.
37:01
Yes, that is a thing. So if we had those files in the tree, then everybody's public Git repos could just be turned on, and it would be really convenient — that would be quite a cool idea. The only question is — well, first of all — which external providers do we want to actually, specifically, have stuff in our tree for? They aren't the only games in town.
37:24
So yeah, I think we'd have — you'd probably have to — ah, they're called dot-something, you won't even see them. Just one? Just one. One file at the top level for each; they're all dot files, I think.
37:40
Yeah, one for each provider. Does anyone know, was Travis the first, or are they just the most famous? The first. The first. Only if we change our recipe for what we want to do. Yes.
38:04
Or maybe there's a way to — well, yeah, that's what I'm doing. That's what I do. It's called distributing the cost to everyone. Yeah, and that's what I do with my own work: when I want to get my stuff tested on Windows, I just have this patch I apply, and then I push it.
38:25
Yeah. It's just a little annoying when you're setting things up. Yes.
38:53
It's really simple. It's just a YAML file — it's like name, colon, value.
39:02
You've got 'script' and the commands it runs, and that's about it — that's a minimal file. You'd probably want to have a few extra things, because you want to apt-get install some packages first, like libreadline-dev and all that kind of stuff — something like the sketch below.
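Sticking with the Python sketches above, the kind of control file being described might get dropped into the tree like this — the package list and build commands are illustrative guesses, not the file actually in use:

    # Sketch only: write a minimal Travis CI control file into a checked-out
    # tree before pushing the branch.
    MINIMAL_TRAVIS_YML = """\
    language: c
    addons:
      apt:
        packages:
          - libreadline-dev
          - bison
          - flex
    script:
      - ./configure --enable-cassert --enable-tap-tests --enable-debug
      - make -j4
      - make check
    """

    def add_ci_control_file(tree="postgresql"):
        # Travis picks this up automatically once the branch is pushed to a
        # GitHub repository it has been told to watch.
        with open(tree + "/.travis.yml", "w") as f:
            f.write(MINIMAL_TRAVIS_YML)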
39:21
So there are a few extra lines for that, but that's it — very little. Yeah, I guess we could get a configuration set up that would build with all of the configure options anybody's likely to care about. Maybe. It seems like this is getting too specific to the testing somebody wants to do. Yeah, it does,
39:41
but it's hard to know how else to do it. How many people need new configure options for their new patches? I mean, like 80% of the patches are just core code, and you rarely need a special configure option for them. Right, but if your patches are being tested without LLVM, then we won't really know whether they work,
40:01
because at least some of your patches will not get exercised unless the testing is done with LLVM. How many times have we added stuff like that? And I wouldn't know that. No, what I'm trying to say is, if we're going to have it, there should probably be a reasonably comprehensive configuration, built with most of the options,
40:20
because otherwise we don't know whether we're really testing the patch — then you don't know whether the build still works if you don't have an option enabled. True. You could make it random. The build farm will tell you that. Yeah, but the whole point here is to try —
40:40
The real concern I have with these files in the central repo is that they're going to enforce a uniform testing configuration. You can still change them in your own branch. It's not necessarily a bad thing if we come up with a set — like, I have a set here that I use for testing: cassert, TAP tests, coverage, OpenSSL,
41:01
and those are the things I turn on when I'm doing my patch testing. Well, let me give you an example: we routinely have complaints that some patch causes a problem in a non-cassert build, because nobody ever tested it that way. Right. Yeah, that's a problem. I mean, we've got the build farm. Do we expect every contributor
41:22
to test with every option? So here's a different approach: what if we had — I don't know if we want to do this in the main tree or whatever, but with what you're doing already — a bunch of different branches that have different build options, that we automatically apply this to? Yeah, that's what I'm talking about. I wish these config files were separate
41:41
from the source trees. Right — well, I guess what I'm saying is that you keep them that way. Is there any way of telling GitHub or Travis or AppVeyor to go and find the file somewhere else? Yeah, you can go in there and configure it to find a file somewhere else — at least you can with AppVeyor, I'm not sure about Travis.
42:00
Okay. Yeah. But it kind of removes the convenience of it, though. Like, what's cool about this is that if we had it in the tree — like a lot of software these days has these things in the tree, you'll see it everywhere — it just means that casual contributors can benefit from it without having to figure out how to set it up, because they won't figure out how to set it up, right? So —
42:22
and if you go to Travis, if you're logged into GitHub — somehow, that single sign-on thing that they do — you just turn it on and then that's it. And it says, do you mind if I do anything I want to your repository? Sure, no problem. No, I won't.
42:40
Yeah — okay. So that's an open question. I think we should keep considering it; I don't know the answer, and I don't feel strongly. I don't actually think that the files I have right now are good enough, so I'm not proposing them right now — certainly the Windows one is not good enough, it's not even running all the tests. But eventually, when we have really good ones, we might want to consider that.
43:04
Code coverage reports — that's something I actually had earlier, but it was causing trouble, and I haven't got back to getting it working again. It was actually pretty cool: there's a website called codecov.io, and if you set your Travis build up so that it exports a file to those guys, then that website will show you the code coverage
43:23
for the lines that changed in your patch. Which is really cool — somebody posts a hundred lines of code, and the tests either exercise them or they don't, and you can see it very clearly. Unfortunately, actually doing the code coverage stuff was causing false failures on Travis, because of that version of GCC —
43:40
a bleeding-edge version of GCC has fixed this problem. The basic idea is that — what are they, .gcda files? — if you build with --coverage, they didn't think about a multi-process program hammering those files. So basically it corrupts itself, which doesn't cause a problem for actually running, but it causes it to print little warnings to standard error from time to time, and then
44:02
that breaks our regression tests. And this is actually at run time — this is when it's running the code, when it's writing those files.
44:22
So it was happening maybe one in a hundred runs. Yeah. We've had success running everything within Docker —
44:47
so your script actually runs a Docker image, which is much more reusable. Interesting. That's what we do too — everything runs in Docker. So we use Travis,
45:01
but then we build Docker containers on top of Travis and do all the testing for RHEL 6, RHEL 7, Ubuntu, Debian. Really interesting. It gives you a lot of flexibility — anything you can run in Docker. Windows is still going to be this outlier, but you can increase your test coverage enormously and have environments that are much more reproducible.
45:20
You can also pre-build those containers, put them up on Docker Hub, and just pull them when the Travis test runs, so that you don't have to spend all that time building. It's a lot faster to pull that pre-built container. And that would also give you the opportunity to run different settings in parallel — different GCCs,
45:41
different settings, different kernels, etc. Interesting idea.
46:00
So the final thing I wanted to mention — it's probably not really part of this project, but it's an idea that kind of suggests itself once you start looking at this kind of thing — is automatic performance testing. I know David's very interested in this subject. Can we come up with a set of
46:21
tests that would actually be valuable? Personally, for my recent work on parallel hash joins, I was using TPC-H, because that's what we selected to validate our stuff. So I started building scripts — you know, when you've got a test where you've got to actually do something for an hour
46:41
to build the large data set, and people keep committing code, and then you have to keep rebuilding your database because you're doing this on master — I got sick of doing that. So I built some scripts to do that while I was asleep, and then I would have TPC-H fully loaded with 30 gigabytes of data or whatever it is, working with the latest master, and then I could test my stuff.
47:02
Which started me thinking: why don't I at least test the master branch for every commit, with a suite of different performance tests? But could you come up with good enough performance tests to actually tell you something valuable? I don't know — I think that's the difficult question. So there's a Google Summer of Code project,
47:23
essentially a performance farm thing — like the build farm — that's going to more consistently try to do performance tests, and one of the parts of that is going to be collecting tests to run, right? Over and over, on every commit, and report the results back.
47:41
I would encourage you to chat with Mark. That sounds really interesting — to move that along, or we can give feedback. That sounds good. I'd like to do something similar, to provide resources — we've got computers. Yeah, we can talk about that. Well, that would be really cool.
48:02
You need a quiet environment. We have performance tests that we run in pgBackRest, and we run them as part of CI just to make sure that they don't break. But we set all the limits down really low, just so the performance tests will run and we know they run and haven't broken. If we actually want to do a real run, then we need a quiet system that we go to manually, and then we bump up all the
48:21
limits so we're running a real performance run, and then we go do that performance run. I mean, on EC2, unless you get dedicated instances, the performance varies between the instances. Yeah. We tried running on Travis, but the numbers were all over the place. So there's no consistency on Travis. No — yeah, there's no point in putting it on there. You probably need
48:43
dedicated CPUs or whatever, I don't know. If you're looking to spot — which is a very valid thing to do — that 1% regression, that, like, gee, I added an L2 miss into this, that's hard to do
49:01
in a cloud environment, to get the reproducibility for that. You can still validate that over time you're staying within, like, a plus-or-minus-5% window. It's tough. You need to make sure that you're not on a slow slope down
49:21
where all of a sudden you're out of the band, but it's not the last commit that pushed you out. I think that definitely happens. So it seems to me there are like two different kinds of things you could be doing. One is looking out for unintentional regressions — what you're describing — and the other is validating some new piece of code that's supposed to make things faster. I think for that second thing, you kind of need to design special tests for the occasion.
49:42
I guess so, anyway. But at least for watching out for regressions, that could be interesting. I don't know if it's worth doing something like that for every patch that's proposed, or whether it's better to just do it on master. I mean, just for example,
50:06
we have pgbench and sysbench and some other stuff that just runs 24/7 — hour-long runs at a time, on different instance sizes and all the regions — and then we just report out the numbers, and we alarm.
50:21
We're not going to alarm if it loses a little bit, but we do if it's outside of that band. Each run gives a slightly different number, but they tend to be within that range. I mean, are you doing this against head, with different commits, or are you doing this constantly against your own head? Constantly against our own head.
50:40
So we do it both on software that we ship — in production environments that customers would be able to create, those versions just get pounded — and also on our own head pipeline. Yeah. So, that's it. I also wanted to point out that several people have provided loads of really great input
51:00
to make this possible. So thanks to those people. Cool.