
CheriBSD: A research fork of FreeBSD


Formal Metadata

Title: CheriBSD: A research fork of FreeBSD
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Abstract
CheriBSD is a fork of FreeBSD to support the CHERI research CPU. We have extended the kernel to provide support for CHERI memory capabilities as well as modifying applications and libraries including tcpdump, libmagic, and libz to take advantage of these capabilities for improved memory safety and compartmentalization. We have also developed custom demo applications and deployment infrastructure for our table demo platform. In this talk I will discuss the challenges facing a long running, public fork of FreeBSD. The challenges I discuss will include keeping up with current, our migration from Perforce to Git and the difficulty--and value--of upstreaming improvements. I will also cover our internal and external release process and the products we produce. CheriBSD targets a research environment, but lessons learned will apply to many environments building products or services on customized versions of FreeBSD.
I'm Brooks Davis. I'm with SRI International and part of a team of researchers working on the CHERI project, which I'll get into in a moment. I'm here to talk to you today about CheriBSD, which is our fork of FreeBSD to support the CHERI processor. It's in many ways similar to projects like the TrustedBSD project or the HardenedBSD project, in that we're going off in a direction that FreeBSD isn't ready for, for one reason or another. In the case of TrustedBSD, it was that there were large swaths of fairly disruptive technology that had to be proved out at scale before they could be merged into the tree, before you were going to make changes to thousands of places in the code. CheriBSD is a little different in that it is about porting FreeBSD to a new CPU with new instructions and a new C compiler, with some significant changes to the C language, and so a wide variety of things that obviously we can't just dump into the FreeBSD tree. If nothing else, you can't buy the CPU, so some people might object to large-scale changes for something you can't even use. First, a little background.
Hardly a week goes by without reading or hearing about some new breach or malware problem.
There's the loss of 18 million customer records.
There are banks losing hundreds of millions of dollars in things like the Target breach, or, I think in this case, this is actually a reference to banks simply failing to notice hundreds of millions of dollars in fraudulent transactions in which the money was moved to another bank and then disappeared.
Or, one of the most recent ones, the Office of Personnel Management in the US, which is basically HR for the civilian part of the US government: they lost 3.2 million HR records, so basically everything you need to know to steal someone's identity. This is the daily reality of computing and the Internet, so we decided it's time to do something about it.
The approach we're taking is one that's been taken for quite a while, which is application privilege separation and compartmentalization: you decompose software into isolated components, and each sandbox runs with only the rights it needs. One common example of this that you probably use every day is OpenSSH. It has privilege separation, so that the bits that must run as root are kept separate from the most risky bits as much as possible. The goal here is that you can take an application, like gzip in this example, and cut it up into multiple pieces. The compression logic is what people screw up, because they write tight C code designed to be fast, designed to trick the compiler into generating the best code it could, at least as of 20 years ago or whenever they made those decisions. You put that off in a process which has only limited rights. This example maps pretty well to Capsicum, which we already have in FreeBSD.
Capsicum is a process-based framework where we have capabilities, which are unforgeable tokens of authority, built on file handles, a very fundamental thing in UNIX. You can enter a capability mode where you cannot open new file handles through arbitrary namespaces; you must obtain them via other capabilities, and you can restrict the rights on these capabilities. Take the gzip example: in its most basic mode it opens one file, opens another file, does a bunch of work, and moves data from one side to the other. You obviously don't want the input file to be writable, and you don't want the output file to be readable. That's probably fairly harmless, but in principle you don't really want it. Capsicum works great for something like gzip, and it works great for OpenSSH, where the existing privilege separation was already based around UNIX principles, but it doesn't scale to the kind of things you might want to do. For example, right now a web browser like Chromium has a separate process for each tab, at least until you open too many of them. The problem is that if you have too many processes you start running out of TLB entries on your CPU. In addition to running out of TLB entries, on some systems you run out of address space identifiers, so even though the processes are often quite small, if you have too many of them you still get TLB problems simply because you've had to reuse address space identifiers. So there are serious scaling problems, and that's even just with tabs. What you'd actually like is to render every image in your browser in a separate sandbox, so that when something does go wrong, the e-mail someone sent you with a bad image embedded in it can't read your bank statements or hijack your password resets, all of which are in your e-mail, and all of which were in that tab and in that process.
We want to make this scale, so we're doing that in hardware.
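The gzip example above can be sketched as a toy model. This is not Capsicum itself (the real interface is C calls such as `cap_rights_limit` and `cap_enter` on FreeBSD); the `Handle` class, the `"read"`/`"write"` right names, and `compress_sandboxed` are all hypothetical names invented for illustration. The sketch only shows the idea that rights on a handle can be dropped but never regained, and that the risky compression code sees nothing but two limited handles.

```python
import io
import zlib


class Handle:
    """Toy stand-in for a file descriptor that carries a set of rights."""

    def __init__(self, stream, rights):
        self._stream = stream
        self.rights = frozenset(rights)

    def limit(self, rights):
        """Return a handle with fewer rights; rights can never be added back."""
        rights = frozenset(rights)
        if not rights <= self.rights:
            raise PermissionError("cannot add rights to a handle")
        return Handle(self._stream, rights)

    def read(self):
        if "read" not in self.rights:
            raise PermissionError("handle is not readable")
        return self._stream.read()

    def write(self, data):
        if "write" not in self.rights:
            raise PermissionError("handle is not writable")
        return self._stream.write(data)


def compress_sandboxed(src, dst):
    """The risky compression logic: it only ever sees the two limited handles."""
    dst.write(zlib.compress(src.read()))


source = io.BytesIO(b"hello capsicum " * 4)
sink = io.BytesIO()

# Before handing control to the risky code, drop every right it does not need:
# the input becomes read-only and the output becomes write-only.
src = Handle(source, {"read", "write"}).limit({"read"})
dst = Handle(sink, {"read", "write"}).limit({"write"})
compress_sandboxed(src, dst)
```

In this model a bug in `compress_sandboxed` can at worst produce bad output; it cannot write to the input or read back the output, which is the property the talk describes for the sandboxed gzip process.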
With process separation you can avoid this problem: if you have one process here and a pointer to a buffer in some other part of the application that lives in another process, this arrow doesn't work. We'd like to do that at the application level, in a single application, and have that mechanism for every pointer, so that every C object is a separate thing and you cannot manufacture pointers.
With CHERI we're doing that with capabilities. What we've created are fat pointers: these are 256-bit pointers. Yes, that's big. Each pointer has an offset, which in practice is the pointer: it is where you're pointing in memory. You also have a base and a length, which are relative to your virtual address space, and guarded manipulation of these, which means that you can only create a capability by deriving it from another capability. The only derivations that are allowed shrink it: you can increment the base, moving up the address space, or you can shrink the length. You can also reduce the permissions of the pointer. One important thing here is the reason we have an offset. In the initial design we did not have one, and it turned out that for real-world C code that's not OK, because you can't just keep shrinking your data. What you see in a program like ffmpeg, inside the video codec, is code that passes a pointer to the middle of a piece of data to be decoded, because the compiler generates better code that way: it can use small positive and negative immediates. So we have this offset, which allows our capabilities to be near-perfect replacements for C pointers. The reason this works is that the offset is only checked against the base and length at dereference time.
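The fat-pointer rules described above (base, length, offset, permissions, shrink-only derivation, bounds checked only at dereference) can be sketched in a small Python model. This is a teaching sketch under stated assumptions, not the CHERI ISA: the class and method names (`Capability`, `restrict`, `add`, `address`) and the permission strings are hypothetical, and a `bytearray` stands in for the virtual address space.

```python
from dataclasses import dataclass


class CapabilityFault(Exception):
    """Raised where real hardware would trap."""


@dataclass(frozen=True)
class Capability:
    base: int
    length: int
    offset: int = 0
    perms: frozenset = frozenset({"load", "store"})

    def restrict(self, new_base, new_length, perms=None):
        """Derive a new capability; bounds and permissions may only shrink."""
        perms = self.perms if perms is None else frozenset(perms)
        if (new_length < 0 or new_base < self.base
                or new_base + new_length > self.base + self.length):
            raise CapabilityFault("bounds may only shrink")
        if not perms <= self.perms:
            raise CapabilityFault("permissions may only be removed")
        return Capability(new_base, new_length, 0, perms)

    def add(self, delta):
        """Pointer arithmetic: moves the offset and is NOT bounds checked."""
        return Capability(self.base, self.length, self.offset + delta, self.perms)

    def address(self, perm):
        """Bounds and permission check, performed only at dereference time."""
        if perm not in self.perms:
            raise CapabilityFault(f"missing permission: {perm}")
        if not 0 <= self.offset < self.length:
            raise CapabilityFault("offset outside [base, base + length)")
        return self.base + self.offset


memory = bytearray(range(64))          # stand-in for the address space


def load(cap):
    return memory[cap.address("load")]


def store(cap, value):
    memory[cap.address("store")] = value


root = Capability(0, len(memory))      # default capability: everything
buf = root.restrict(16, 8)             # an 8-byte object at address 16
p = buf.add(6)                         # advance into the buffer...
q = p.add(-4)                          # ...and step back again: still legal
```

Because the offset moves freely and only the dereference is checked, the "advance, then back up" idiom from real C code works, while `buf.add(8)` can be formed but faults as soon as it is loaded through.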
As I alluded, we want these to be C pointers, so we've done that, and we have two modes in which you can use them. One is a hybrid mode, where you have conventional (in our case MIPS64) code and you annotate some of the pointers in your code as being capabilities. That means the compiler takes more space for them and uses the correct manipulation instructions to access them, and that works. You can use this in code where you want to protect just a few buffers that are very, very important, but it is a lot of work. We made some changes to tcpdump, which I'll talk a little more about later, where I added bounds to the packet buffer that was being dissected, and that work was unpleasant: unpleasant changes to the code, and thousands of them. So we ended up adding a pure-capability mode where every pointer is a capability. In the translation units, or files, compiled in this mode, all pointers are 256 bits, they're all bounds checked, and almost all bounds are correctly inferred. If you malloc something, you get a pointer that's exactly the size of the object you allocated. If you try to go outside the bounds, things go boom in a nice predictable way rather than randomly corrupting your memory.
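The difference between a conventional pointer and a pure-capability `malloc` result can be illustrated with a toy allocator. This is a sketch under assumptions, not the CheriBSD allocator: `Arena`, `BoundedPointer`, `raw_store`, and `BoundsFault` are hypothetical names, and a Python exception stands in for the hardware trap.

```python
class BoundsFault(Exception):
    """Stands in for the trap taken on an out-of-bounds access."""


class BoundedPointer:
    """What a pure-capability malloc hands back: exactly the object's bounds."""

    def __init__(self, arena, base, length):
        self.arena, self.base, self.length = arena, base, length

    def _check(self, index):
        if not 0 <= index < self.length:
            raise BoundsFault(f"index {index} outside object of {self.length} bytes")

    def load(self, index):
        self._check(index)
        return self.arena.memory[self.base + index]

    def store(self, index, value):
        self._check(index)
        self.arena.memory[self.base + index] = value


class Arena:
    """Trivial bump allocator over a flat byte array."""

    def __init__(self, size):
        self.memory = bytearray(size)
        self.next_free = 0

    def malloc(self, size):
        base = self.next_free
        self.next_free += size
        return BoundedPointer(self, base, size)


def raw_store(arena, address, value):
    """Conventional pointer: just an address, with no bounds attached."""
    arena.memory[address] = value


arena = Arena(64)
a = arena.malloc(8)
b = arena.malloc(8)
b.store(0, 0x42)

# One byte past the end of `a` through a conventional pointer:
# the write succeeds and silently corrupts the neighbouring object `b`.
raw_store(arena, a.base + 8, 0xFF)
corrupted = b.load(0)

# The same off-by-one through the bounded pointer fails predictably instead.
try:
    a.store(8, 0xFF)
    trapped = False
except BoundsFault:
    trapped = True
```

The point of the model is the contrast between silent corruption and a predictable failure at the faulting access.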
As I said, we have a range of ABIs. We have pure MIPS64: conventional FreeBSD runs on this processor without modification. If you use this mode you get no benefits, but it also doesn't require any work, and the important part is that you can work your way through from there. There's a bunch of different approaches we can take, but we have a lot of hybrid code today and some pure-capability code, and we're increasingly moving toward pure-capability code.
Here is an example of the sort of things we can support in a process. This is our current situation for the most part. We have a conventional kernel running standard MIPS64 code, paired with a conventional application also running standard MIPS64 code, and we have zlib in a compartment. It was chosen because it has a small API and is easy to compartmentalize. It has pure-capability code inside and then a little hybrid wrapper around the outside that lets us call it from the application. In one example that we've talked about in one of our papers, libpng doesn't know that the library is protecting it, and in fact there is essentially no performance impact in this test. But if there's a bug in zlib it will fail-stop, and even if it somehow doesn't manage to fail-stop, it is very difficult to gain control of the application. You can also see a world, which we're actively working on right now, where we add pure-capability applications, which means I need yet another syscall ABI, and pure-capability libraries. One of the interesting things is that we can take conventional MIPS64 code, say a binary library that we don't have the source code to, because we bought it from someone who doesn't want to give us source code, or because we've lost the source code (which I've heard of some large Internet companies doing), and put it in a sandbox with a wrapper around it. So our library can be pure-capability even though we have this bit that we don't have any control over. It can't get out of its sandbox. It's less likely to fail-stop if something goes horribly wrong, but what it can do is very little. If it's a proprietary video codec, all it can do is write the wrong video frames and read video frames, which is what it was supposed to be doing anyway. As you go farther along, you could of course start writing microkernels,
having single-address-space applications, and other things, but we're quite a ways from that. If nothing else, LLVM doesn't yet compile MIPS64 in a way that's usable for a kernel.
Just a little bit more on the CPU. Our prototype CPU is a 64-bit MIPS CPU, sort of an R4K, so nicely out of patent, and it has the CHERI ISA extensions that I've been talking about. It runs at 100 MHz, which is a bit slow, but we lived with that in the nineties, and we actually have quite a bit more RAM, which helps: we have a gig of RAM on the board and could have four gigs, which would have been unimaginable when I was in college. It mostly works, except occasionally it gets a little exciting. One of the early bugs we had was in the SD card controller that we got from the FPGA manufacturer: if you wrote a byte to the buffer within two cycles of the previous one, it threw the second one away. There was literally code in there to do that. We commented that out and it works better. One interesting thing is that we only found this problem after a reboot, because we had so much RAM that the entire file system fit in the buffer cache. It was only when we flushed the buffer cache by rebooting the machine that we discovered we had corrupted the on-disk data structures, and fsck would say, "Oh, what have you done to this poor file system?" So that's one of the interesting challenges here. But the thing is that we have this CPU, we have this operating system, which I'll talk more about, and we have a modified LLVM, so we can run real software. That's where things get exciting, and we can really test things out. CheriBSD is, of course, our fork of FreeBSD to support CHERI. It's a mix of platform support, which is to say the drivers for things on the FPGA and peripherals, and support for the BERI CPU. You can compile the CPU without the CHERI extensions, and it's open source, so if you want to do things like run a hardware class you can use that fairly simple CPU, and things like a better branch predictor are an exercise that people can do on it. There is also support for the ISA features.
I'll give you a rundown of how much change that was in a bit. There is infrastructure to support the compartmentalization of libraries, or whatever it is you want to stick in a compartment. There are some custom applications; I'll talk more about tcpdump and a few others. And then there are a bunch of build system improvements, because we're doing some slightly weird things. For example, we have the ability to build FreeBSD and install it to a directory without any privilege, which is now being used in parts of the release-building infrastructure. We did it because we wanted to have grad students building FreeBSD, and I didn't want to give them root and have them mess up our machines.
Here is a snapshot of our page on GitHub. There are almost 6,000 changes here relative to FreeBSD, and it's actually quite a few more than 6,000 now; that was a week ago. We're a bit behind; we merge periodically, and I'll talk quite a bit more about merging. But the main thing, and what I think makes our project interesting, is that we want this to be publicly accessible because we have collaborators, which means we can't do things like rebase, but we also merge probably 200 or 300 changes. So we have this huge set of changes that we have to maintain over time, which leads to some challenges in version control.
Here is a breakdown of the kernel changes; this was in our paper at IEEE Security and Privacy. We have a bunch of headers for various things, with lots of accessors for assembly functions. There's some setup of CHERI in the kernel, basically saying "turn it on and give me the full capability": at boot there's a default capability which is the whole address space, and you can start chopping it up later. There's context switching, because our new capabilities have to go into special registers, so we need to save and restore those registers. There's exception handling, because we've created all sorts of new ways to cause exceptions: now if you have an out-of-bounds dereference, it's a trap. There's a little memory management and memory copying, and there's some support for the actual compartmentalization, some for memory safety, and some for system calls and signal delivery. In total it's a few thousand lines of code, which is not too bad distributed over several million lines of kernel code. We've actually written a very tiny little microkernel before; it's not very big, and I don't think it works anymore.
In addition to kernel changes, we had to make some tweaks to the runtime. The biggest thing is that memcpy and memmove need to be capability aware, even in capability-oblivious code. One of the interesting things here, and one of the reasons why it's really important to have a team as large as we do, is that our first version of the ISA didn't allow you to implement memcpy efficiently. You would have had to check every 256-bit chunk of memory to see whether or not it was a capability, and then copy it either with capability instructions or with regular instructions, which would be insane. So you can now use capability instructions to copy non-capability memory. In fact, it turns out that FreeBSD's generic memcpy and memmove implementations in C required almost no changes. We just had to say, "use the right size as the basic word size when doing copies", and it generates the right code; I was actually pleasantly surprised. We of course have assembly versions as well, which are a little better, but that's the basic idea. We also have explicit versions for hybrid code, explicit versions of things like string and memory manipulation functions, so that we can have a mix of arguments. I'm not sure that's going to stay, but with our current infrastructure that's how it has to be. Then there's a bunch of interesting cases in current C code. The strlen function, for example, assumes that once it's aligned to the size of a word it can always read a whole word. But we have byte-granularity bounds, so the fact that you can definitely read the next byte doesn't mean you can definitely read the rest of the word. That had to be fixed; it was easy to do, but a required change.
One thing that we're working on right now is taking the syscall implementation in libc and moving it into a libsyscalls. So yes, I'll be asking for exp-runs on ports at some point here; that's going to be exciting. But it would help us, because inside a sandbox we may want to attempt to make syscalls and have the kernel mediate based on some new configuration that's Capsicum-like, or we may want to have something that simply proxies the syscalls back to the privileged part of the application so that it can make the decisions. It would also be useful for people who are, say, building Android with FreeBSD, where the syscall layer is obviously totally different, so the separation could be generally useful.
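The memcpy point made earlier in this segment can be sketched with a toy tagged-memory model. Everything here is an assumption for illustration: the talk does not describe the tag mechanism in detail, so the model simply gives each aligned 256-bit (32-byte) granule one validity tag that ordinary data stores clear and capability stores carry along. `TaggedMemory`, `memcpy_oblivious`, and `memcpy_cap_aware` are hypothetical names.

```python
CAP_SIZE = 32  # bytes in a 256-bit capability


class TaggedMemory:
    """Flat memory with one validity tag per capability-sized granule."""

    def __init__(self, granules):
        self.data = bytearray(granules * CAP_SIZE)
        self.tags = [False] * granules

    def load_bytes(self, addr, n):
        return bytes(self.data[addr:addr + n])

    def store_bytes(self, addr, payload):
        """Ordinary data store: clears the tag of every granule it touches."""
        if not payload:
            return
        self.data[addr:addr + len(payload)] = payload
        first = addr // CAP_SIZE
        last = (addr + len(payload) - 1) // CAP_SIZE
        for granule in range(first, last + 1):
            self.tags[granule] = False

    def load_cap(self, addr):
        """Capability-width load: returns the bytes plus whatever the tag is."""
        assert addr % CAP_SIZE == 0
        return self.load_bytes(addr, CAP_SIZE), self.tags[addr // CAP_SIZE]

    def store_cap(self, addr, payload, tag):
        """Capability-width store: writes the bytes and the tag together."""
        assert addr % CAP_SIZE == 0 and len(payload) == CAP_SIZE
        self.data[addr:addr + CAP_SIZE] = payload
        self.tags[addr // CAP_SIZE] = tag


def memcpy_oblivious(mem, dst, src, n):
    """Byte-at-a-time copy: the data arrives, but every tag is lost."""
    for i in range(n):
        mem.store_bytes(dst + i, mem.load_bytes(src + i, 1))


def memcpy_cap_aware(mem, dst, src, n):
    """Use capability-width copies as the basic word size when aligned."""
    done = 0
    if dst % CAP_SIZE == 0 and src % CAP_SIZE == 0:
        while n - done >= CAP_SIZE:
            payload, tag = mem.load_cap(src + done)
            mem.store_cap(dst + done, payload, tag)  # works for plain data too
            done += CAP_SIZE
    if done < n:  # unaligned copy or leftover tail: plain bytes
        mem.store_bytes(dst + done, mem.load_bytes(src + done, n - done))


mem = TaggedMemory(4)
mem.store_cap(0, b"C" * CAP_SIZE, True)        # granule 0 holds a capability
mem.store_bytes(CAP_SIZE, b"d" * CAP_SIZE)     # granule 1 holds plain data
memcpy_cap_aware(mem, 2 * CAP_SIZE, 0, 2 * CAP_SIZE)
```

The copy loop never asks whether a granule holds a capability; it moves data and tag together, which mirrors the talk's point that capability instructions can copy non-capability memory, so the generic C implementation only needed the right basic word size.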
Next, the bits for compartmentalization. libcheri is the library for creating sandboxed libraries, for creating sandboxes. You can instantiate objects, you can give them types, and then you can allocate copies of the object, do resets, that sort of thing. That's the core functionality for implementing compartmentalized libraries, or a compartmentalized tcpdump for instance. It also has a little run-time linker, since objects have a new calling convention and we don't want to have to write horrible little RPC-like things where you say, "I pass eight integer arguments and eight capability arguments", and try to remember which ones go where; we did that for a while. It also includes system call implementation bits, so compartments can call out to an object in the privileged part.
Then there is something with a confusingly similar name: we have a libcheri directory in the tree. This is where pure-capability versions of libraries live. It's somewhat like the 32-bit support for 64-bit machines, and in fact I copied and pasted the stuff in Makefile.inc1. That's one reason why merges are sort of painful: there's lots of churn in that file. Right now we're only using those libraries to build little compartmentalized libraries that have a hybrid outside interface. Longer term, the plan is to have pure-capability libraries, so that as part of your transition from a conventional ISA to CHERI you can pick points along the way, not only at the library level but at the application level. You can have your most important, or your easiest-to-modify, applications be pure capability, and other applications can remain as they are. You can still have your binary-only whatever that you have.
we also have demo applications. This is a screenshot, or rather a picture, of a sort of very weak PowerPoint-like program that I wrote as a demo. We actually gave a talk at the principal investigators' meeting on a tablet, which Robert is holding here, and when we got to the end we said: and we have one more thing. Not only are we running on our own platform, the slide deck has an exploit in it; we triggered the Trojan and CHERI successfully defeated the exploitation. It was very contrived, since we wrote all of this, but nonetheless the technology did work. So we have a bunch of little custom applications. Those don't present particular challenges for us in terms of CheriBSD maintenance. The thing
that does represent bigger challenges is that we're also modifying existing applications. At one point we took a look at compartmentalizing Wireshark. Wireshark's full of vulnerabilities. We spent quite a bit of time on it and said: well, that was insane. It's 3 million lines of code and it's all very complicated. So we decided to do tcpdump instead, which has the advantage of being in base, which is kind of good and bad. Our first version, over here, was very simple: we just compartmentalized dissection, and it was standard code on both sides, with just a little hybrid code to get in and out of the sandbox. We later added memory safety, which I alluded to before, and that was, you know, 6 thousand lines of changes and a whole lot of not fun. Part of that was where we learned an interesting lesson. In tcpdump there's a modest amount of code that advances the buffer, advances the buffer, and then says: oh, I forgot, I want something a little bit behind my pointer. That's fine in C; it's a perfectly legitimate, reasonable thing to do. However, before we had offsets, that didn't work, because when you incremented a pointer we incremented the base, and then you can't go back; that's not allowed. So we added offsets, and that helped quite a bit. We added per-protocol dissectors, which is sort of the most protection you could get from tcpdump: every protocol lives in its own sandbox. So as you're dissecting, IP goes fine, TCP goes fine, and then HTTP maybe has a problem, or more realistically something like SNMP, where the ASN.1 parser is broken, as it always is, and gets exploited. You can still trust the TCP and IP dissection, because the failure happened in a deeper sandbox. That required a modest number of changes, to all the call sites, which is pretty reasonable. We went to pure-capability mode, which got rid of tons of annotations; that was nice. And
then the move to pure capability and removing annotations was actually driven by the fact that FreeBSD got a new version of tcpdump. This is one of the things that's really interesting about working on a real-world project and keeping your tree up to date: we learned some lessons about maintainability the hard way. The trend is important here. We were at 3,600 conflicts in git, and it seemed like a bit much, so we made a bunch of changes and got back to here. When we added the linker support, that reduced the amount of junk. This has actually dropped dramatically because, somewhat surprisingly, the tcpdump maintainers accepted my change to shuffle a thousand lines of code around within the tree and put it in different places, so that the interface between the dissectors and the front end is both fairly narrow, only five or six functions, and fairly simple. That should help build a better Capsicum version of tcpdump, but it also simplifies my life. We'll see how my next merge goes, because what went upstream is a bit different from the approach I took. There's also a bit of
infrastructure work we've done to support the builds. We've added some hacks to let us change the compiler on a per-program and per-file basis. That's because early on our custom LLVM was not robust enough to actually compile all the code, so we wanted to focus on the code where we could do something interesting and then expand out over time. It's getting close to being able to compile everything, but I've spent quite a bit of time in the last few weeks trying to compile things and sending David bug reports when the compiler crashes. There are some other hacks too, which will help some upcoming changes: we strip binaries during the build rather than using the install program, things like that.
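The tcpdump lesson about offsets mentioned earlier can be illustrated with a toy model. This is only a sketch of the idea, not CHERI's actual capability encoding: the class names `BaseOnlyCap` and `OffsetCap` and their fields are invented for illustration, and the bounds check is simplified to happen only when an address is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaseOnlyCap:
    """Toy model: pointer arithmetic moves the base and shrinks the region."""
    base: int
    length: int

    def advance(self, n: int) -> "BaseOnlyCap":
        # Going backwards would re-grant memory that was already given up.
        if n < 0:
            raise ValueError("cannot move the base backwards")
        if n > self.length:
            raise ValueError("advance past end of region")
        return BaseOnlyCap(self.base + n, self.length - n)


@dataclass(frozen=True)
class OffsetCap:
    """Toy model: bounds stay fixed, a separate offset acts as the cursor."""
    base: int
    length: int
    offset: int = 0

    def advance(self, n: int) -> "OffsetCap":
        # The cursor may move in either direction; the bounds never change.
        return OffsetCap(self.base, self.length, self.offset + n)

    def address(self) -> int:
        # The check happens when the pointer is used, not when it is moved.
        if not 0 <= self.offset < self.length:
            raise IndexError("offset outside capability bounds")
        return self.base + self.offset
```

With the offset model, the "advance, advance, then look a little behind" pattern works: `OffsetCap(0x1000, 64).advance(20).advance(-4).address()` is `0x1010`, while the same sequence on `BaseOnlyCap` raises on the backwards step.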
For more information on what's in CheriBSD generally, and particularly the CHERI-specific stuff, I suggest reading the journal article from a few months ago. Now, a bit of a diversion to
revision control, which is one of the places where we've had a lot of challenges. We started off in Perforce, which was the conventional way that people did forks of FreeBSD in the past, supported by the FreeBSD project. Perforce's merging is very good; it really is a good way to maintain a fork, to abuse the term. I've done it at previous jobs as well. It's easy to maintain what I'm calling here stacked branches. For a while we had BERI BSD, which was the platform bits, and it sat in between CheriBSD and FreeBSD so that we could try to maintain some separation there. But we merged everything that wasn't too weird into the FreeBSD branch and then got rid of it at some point. That was quite helpful. And our team already knew Perforce, so that was a good reason to stick with it. The downsides: Perforce sucks for public access. You have to give people accounts and give them access to the system. Every checkout involves adding server state, so even if you're willing to hand out a lot of accounts, eventually the project will run out of resources; it's very easy to get into a situation where your Perforce server needs to have half a terabyte of RAM. We probably could have gotten by for this project, adding users, but it's annoying. And then the offline support isn't very good; that's not too big a deal, but it's not great. So in
October 2013 we decided that we needed to have public access. There were people in academia, at MIT Lincoln Labs and some other places, who wanted to start using CheriBSD, so we wanted to give them direct access to the repository rather than having to take the time to package up snapshots and push them out and all that. So we moved over to git. We lost a little history granularity in the process, because many of the commits couldn't just be applied one at a time since things had moved around, but really it wasn't too bad. The conversion was a bit of a trial by fire. FreeBSD at that scale is a bit on the big side for git, and the export has some weird features that we'll get to. Also, Robert and I, the main CheriBSD developers, were not experienced git users at the time, so: lots of excitement. It's not clear that our model was the right model, but it's the one we've got, so we're kind of stuck with it. What we ended up doing is we forked the FreeBSD repo on GitHub. One thing that I found kind of weird about GitHub's forking model is that if you want to fork CheriBSD, it seems that you get a copy of the FreeBSD repo; at least that's what happened when I tried to do it recently. That might simply be because I already have a copy of the FreeBSD repo, so if someone tries it and makes a fork themselves on GitHub, let me know. That was a bit odd. We do all the commits to the master branch. I'm not sure that's the right solution, but that's what we've done, and at this point we're stuck: we're not going to do a forced rebase and mess everything up. Then we merge changes from FreeBSD upstream periodically. The typical working model is that we merge changes when we need something, or after a demo, or after a big deadline has passed and we realize we're behind; then we'll do a merge just to catch up while we have some breathing room. Our first attempt was sort of the
basic, obvious thing you might think to do: fetch upstream and merge its master into your current tree, in a branch. It mostly works. The first few times it went pretty smoothly; there were some strange-looking conflicts that I didn't understand at the time, but it worked and we got past them. One annoying thing that we still haven't resolved is that rebase goes horribly wrong, so if somebody else pushes while you're doing a merge, you just have to throw the merge away; rebase has never worked out in this case. However, after
a few times, something came along: we started doing work integrating the vt console stuff into our tablet platform. I did the merge, everything compiled, everything seemed to work, I pushed it, and something was broken in vt, who knows where. It wasn't due to a merge conflict; it turned out to be an API change. Now, the problem is that this is sort of a notional model of how it works: you're going along on your own, you merge upstream, it's all good. The problem is, this is more
like it, and this is actually much simpler than reality. The reality is you're going along on your own while thousands of changes occur upstream, and then you pull in another three months of development, which is typically several thousand changes. If you try to bisect, well, in our case it was an API that changed, so all of these commits are fine because they don't include any of our code, and all of these are fine because the problem was here; well, here, I guess. So there's nothing to look for, and that was really annoying. I found it eventually. So I wrote a
tool that I whimsically named mergify. It merges one commit at a time, because from the perspective of a consumer of FreeBSD every change is a feature. That's not perfectly accurate, because sometimes there's a commit that's broken and then another commit that fixes it, and if a commit is immediately followed by a commit that fixes the previous one there's no way to deal with that case in a sensible way, so I just punted. But merging one commit at a time does help, in that you now have commits you can bisect; each of those merge commits is useful. A feature I haven't added to the tool yet is something to knock all of the child commits out of the bisect, so that you only consider the commits that actually change your branch. It would be pretty easy to do; I just haven't written it yet. One of the key things that I figured out over time is that, from our perspective, it's only those merges that matter; everything else is just history, and who cares. So in the first attempt we
just merged every commit. Then tcpdump came along, there was an update to it in the tree, and things went really strange. We got merge results that were completely nuts: mostly, the top-level Makefile would get something from contrib squished into it. It turns out that in the FreeBSD export, things in the vendor branch have a common parent with the FreeBSD tree: the empty repository. So there's a /Makefile in both of them, git says they've got common blank lines, squishes them together, and it goes really badly. In fact you don't care about those commits; they're not important. What you care about is the commit that merged them into the FreeBSD tree. So I changed the code to pick only the direct commits to the branch and merge each one of those, one at a time. I was going to
do a demo here, but I'm getting a little short on time, so I think I will skip the demo and come back to it at the end, particularly since it takes a bit of fussing with the projector to make it work.
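The approach just described (take only the direct commits to the upstream branch and merge each one separately) can be sketched as follows. This is not the speaker's tool, just a minimal illustration under stated assumptions: the function names are invented, `last_merged` is assumed to be the upstream commit that was merged most recently, and the direct commits are approximated with git's `--first-parent` history walk. A conflicting merge makes `git merge` exit non-zero, which stops the loop so the conflict can be resolved by hand.

```python
import subprocess
from typing import Callable, List


def first_parent_range_cmd(last_merged: str, upstream: str) -> List[str]:
    # Oldest first, following only first parents, so commits that arrived
    # through a vendor-branch merge are skipped but the merge itself is kept.
    return ["git", "rev-list", "--reverse", "--first-parent",
            f"{last_merged}..{upstream}"]


def merge_cmd(commit: str) -> List[str]:
    # One merge commit per upstream commit gives bisect something to land on.
    return ["git", "merge", "--no-edit", commit]


def run_git(cmd: List[str]) -> str:
    # check=True raises CalledProcessError on failure (for example, a conflict).
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


def merge_one_at_a_time(last_merged: str, upstream: str,
                        run: Callable[[List[str]], str] = run_git) -> List[str]:
    """Merge each direct upstream commit separately; return the commits merged."""
    commits = run(first_parent_range_cmd(last_merged, upstream)).split()
    for commit in commits:
        run(merge_cmd(commit))
    return commits
```

The `run` parameter exists only so the command sequence can be inspected without touching a repository; in real use the default would run git in the current working tree.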
As I alluded to, rebase is broken, and I think it's again because rebase is applying every commit individually rather than those individual merge commits. So it's on my list to attempt to change that: to try applying the merge commits one at a time, basically reimplementing rebase. It should be doable, but I haven't done it yet. I have an upcoming merge that looks like it's going to be exciting, so maybe I'll fix it then.
So, the to-dos for mergify: one is the rebase; another is the bisect mode I talked about before, where you can skip all the commits that don't make any sense to look at. One thing I would like to do eventually is to check periodically, say every 10 commits or every 100 commits, that things build. Right now I don't get to do that, which is sometimes a little frustrating: I get to the end and discover that git botched a merge somewhere, or I botched a merge. My current workflow is that any time I do anything by hand, I assume I screwed it up and do a full build just to make sure that it's right. I'd really like to do it on every commit, but that would be really slow. Right now it takes several seconds to merge each change, because git is fast but, even if it's all in RAM on a fast machine, the FreeBSD repo's big. So this would take quite a bit longer, but hopefully in time meta mode will make it fast enough that I could try every 10 commits or so and then bisect within that once I knew something had gone wrong. So, on to another topic:
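Before moving on, the "build every N commits, then bisect within that window" idea can be sketched like this. It is a hypothetical illustration, not something the tool does today according to the talk: the function name `first_bad`, the `builds` callback, and the `stride` parameter are all invented here, and the sketch assumes that once a commit breaks the build, every later commit in the same window stays broken.

```python
from typing import Callable, Optional, Sequence


def first_bad(commits: Sequence[str],
              builds: Callable[[str], bool],
              stride: int = 10) -> Optional[int]:
    """Return the index of the first commit that fails to build, or None.

    Only every `stride`-th commit (plus the last one) is built up front.
    When a checkpoint fails, a binary search runs between the previous
    good checkpoint and the failing one.
    """
    if not commits:
        return None
    checkpoints = list(range(stride - 1, len(commits), stride))
    if not checkpoints or checkpoints[-1] != len(commits) - 1:
        checkpoints.append(len(commits) - 1)

    good = -1  # index of the last known-good commit; -1 means "before the range"
    for cp in checkpoints:
        if builds(commits[cp]):
            good = cp
            continue
        lo, hi = good, cp  # invariant: lo builds (or is -1), hi does not
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if builds(commits[mid]):
                lo = mid
            else:
                hi = mid
        return hi
    return None
```

The trade-off is the one described above: a full build per commit is too slow, so most commits are never built unless a checkpoint fails, at which point only a handful of extra builds are needed to find the culprit.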
upstreaming. The best way to remove merge conflicts is to upstream your changes; that way they become everyone else's merge conflicts, not mine. If that section of the Makefile is the same on both sides, then things are going to go fine. The top-level makefiles in particular: I have made so many changes to them that I get conflicts all the time. So there's the question of what to upstream. I think there are some philosophical questions here, and the answers will vary depending on your project and the nature of your work. Obviously, drivers for things that people can use should be upstream. Drivers are one of the better things to upstream as the owner of the driver, because it means that when someone changes the infrastructure under that driver, you don't have to come along three months later and say: what the heck happened, why does my driver not work anymore, why doesn't it compile, that sort of thing. Also general infrastructure: we built quite a lot of infrastructure along the way, so we've been upstreaming that as we can, as we have time. Things that are shared by multiple external consumers I like to try to upstream: things that are useful to multiple people even if they're not useful to most consumers of the project and not entirely useful in the base system. If you have a cleanup that is critical to you but doesn't matter to the base system, and other people are using it, I think that's a good thing to upstream. That's a little tricky, in that this is the sort of thing that gets broken, and we've had some issues with that. And also things that are low impact and are likely to generate conflicts. We actually have a new signal that we've been meaning to upstream, because Capsicum might use it eventually. It won't immediately, but every time someone adds something new around the signals, that creates a ton of conflicts, because in every single case it's adding something in exactly the spots we changed. So I think we'll upstream the signal part too. So these are the things we have
upstreamed: FDT support for MIPS, which was sort of in the tree but didn't really work, and we fixed that. A bunch of drivers and a bunch of driver improvements. We added a way to turn on the floating-point support that was in the kernel, and we've made it actually work on MIPS. We made additions for threads, and the unprivileged build stuff, and a bunch of other stuff. We've been at it for four years now, so: lots of things.
We also have some related upstreaming that we've been doing, not to FreeBSD but to other projects. If you saw the talk earlier today on the QEMU user-mode work that's letting us build ARM and arm64 packages, which will soon be official: that's work we did because we needed packages for MIPS. We don't actually use them very much right now, but we knew we were going to want to do demos of out-of-tree code eventually, so we needed some way to build them other than trying to build on something running at 100 megahertz. That didn't seem like a good idea; mean time to power failure would be an issue. We've done a lot of improvements to clang and LLVM on MIPS64. The Imagination Technologies people's focus is definitely on little-endian 32-bit, and we're big-endian and 64-bit, so we had to fix a lot of things. And I've upstreamed changes to tcpdump, mostly to make my life easier, but they seem to like the compartmentalization direction, so that's nice. We've also been doing some releases. We've done them internally for a long time, mostly snapshots. Initially, every so often I would do a release build using a little build system and put the output on the wiki. We did a couple of restricted releases to partners. One of the problems we have is that the FPGA license agreements mean you can't share compiled bit files, so we gave them out to a few people who were in the program and who have licenses; it's sort of the Unix problem that was alluded to earlier. And we've started doing public releases: we did our first public release three weeks ago, I think.
So go to cheri-cpu.org and you can download the CPU and compile it. I wouldn't run the bit file that comes out without some careful testing, because at least in the ones that our Jenkins builder is producing, the fan doesn't work. It runs OK as long as you don't do too much, but it gets hot and things start failing. So, another small change of
gears: since my focus in this is as a FreeBSD developer, here are some tips for developers, many of which center around the idea that, as you develop new things like new ways to build the kernel or new ways to build the OS, you spend an awful lot of time compiling. So, some suggestions for ways to cope with
this. The first and probably most important one: use a big enough machine. Seriously, if somebody is paying you to wait for compiles, they can buy quite a lot of hardware fairly quickly for what they're paying you. My view is you want enough RAM to hold all the source and all the output. We know 128 gigs is enough; John Anderson has one that's working well for him. The machine we use is a 256-gig machine with fast processors and half a terabyte of SSD, and it works pretty well. I would say 128 gigs should be enough, and 256 gigs definitely should be enough for anyone, except last week, when there was a compiler bug going around and we had a dozen clang processes using 70 gigabytes apiece. We ran out of swap; it was exciting. It turns out they ran out of swap and then started dumping core, and you can't kill processes that are dumping core, so that was unfortunate. But usually it's big enough, and we literally could not do the work we're doing without this machine. It's only about 5 thousand dollars; it's definitely worth it. Another thing that I've found really helpful is a little push-notification service that has a REST interface. The one I use is pushover.net. There is a web browser-based client that's 10 bucks, and there's Android and iOS, which are also 10 bucks apiece to activate; you get the service for that, so it's totally worth it. I have a little command-line wrapper, so I run a command through it and then I get something like this on my screen, and my phone buzzes. It's really great at reducing the latency between my compile finishing and my noticing that it's finished, so that's been really helpful.
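A wrapper of the kind described might look like the sketch below. It is an illustration, not the speaker's script: the function names are invented, and the token and user key are placeholders you would obtain from your own Pushover account. The only external detail relied on is Pushover's message endpoint, which accepts a form-encoded POST with `token`, `user`, and `message` (plus an optional `title`).

```python
import subprocess
import time
import urllib.parse
import urllib.request
from typing import Callable, List

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def build_request(token: str, user: str, message: str,
                  title: str = "build") -> urllib.request.Request:
    # Form-encode the fields; nothing is sent until the request is opened.
    data = urllib.parse.urlencode(
        {"token": token, "user": user, "title": title, "message": message}
    ).encode()
    return urllib.request.Request(PUSHOVER_URL, data=data, method="POST")


def summarize(argv: List[str], returncode: int, seconds: float) -> str:
    status = "ok" if returncode == 0 else f"FAILED ({returncode})"
    return f"{' '.join(argv)}: {status} after {seconds:.0f}s"


def run_and_notify(argv: List[str], token: str, user: str,
                   send: Callable = urllib.request.urlopen) -> int:
    """Run a command, then push a one-line summary; return its exit status."""
    start = time.monotonic()
    returncode = subprocess.call(argv)
    send(build_request(token, user,
                       summarize(argv, returncode, time.monotonic() - start)))
    return returncode
```

Used as `run_and_notify(["make", "buildworld"], token, user)`, the notification arrives whether the build succeeded or failed, which is the point: the wait between the compile finishing and someone noticing goes away.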
Another thing: I build everything in tmux sessions. Actually, one of the things that I discovered part way through this, and which is really important, is to switch away from the build: do not send the output over the network. Even on the local network, enough stuff gets buffered that it significantly delays when you get back to the prompt. So it's better not to render it at all. That was slightly surprising. I knew it was an issue over the Atlantic, and I would switch away when I was tethering because it was wasteful of bandwidth, but it turns out you should always switch away.
I have one final tip: continuous integration is really great. We do full builds after every compiler or OS change, and they take about 20 minutes. We also do full releases at least a couple of times a day. It keeps everything working. One of the issues for us is that our architecture's weird and we're making strange changes, so we build MIPS64 and amd64 all the time. amd64 is actually a really good architecture to build because it also builds the i386 libraries, so that means we build a whole lot of stuff, and it just keeps us honest. We're a research project, but we really want this stuff to be used for real, so we don't want to veer down some blind alley and then come back and say: OK, I have three months of work to get this thing functional again and in shape to merge. So we do that all the time. Also, when working on a release, we now create a separate set of Jenkins jobs to build the release daily on the release branch, just to make sure everything stays stable. I'll also
mention some of the papers that got published. We had three top-tier papers in the last year on CHERI. The hardware paper is "The CHERI capability model: revisiting RISC in an age of risk", and that's mostly what I talked about earlier. We have the Beyond the PDP-11 paper, on processor support for a memory-safe C abstract machine, which was at ASPLOS. If you're a C geek it's a great paper and I definitely recommend it; I learned all sorts of weird things about C writing that paper. And then we had a compartmentalization paper at IEEE Security and Privacy. There's also the ISA document and whatnot on the website. The future work
here: we are working on a pure-capability FreeBSD. It'll probably be a very long time, even once hardware exists, before you'd switch over to a FreeBSD that only uses pure-capability code, but it's pretty likely that you'd want to run a fair bit of pure-capability code. We also realized recently that the best way to get a ton of code running is to be able to have a pure-capability build within FreeBSD, so we can just compile everything, try to use it, run it through the tests, and see what happens. That will no doubt be exciting. We'd also like to use CHERI in the kernel. The code we have in there is all assembly, or macros around inline assembly, and it works for what we're doing, but we'd like to be able to do things like protect and bound storage buffers in the kernel, and we're hoping to get there eventually. One of the other things: as I said early on, 256 bits is a pretty big pointer. There's lots of overhead in terms of cache footprint and whatnot. In extraordinarily pointer-heavy benchmarks it's about 20 percent performance overhead, which is not acceptable. So we're working on 128-bit compressed capabilities. There are some interesting tradeoffs we're exploring; we're pretty confident that will work, modulo the details, and in our simulations it should get the overhead down to about 3 percent in a benchmark that's basically data structures that are all pointers. We're also looking at other architectures. So I'm happy to answer
any questions. We're about out of time. The project's been running about five years now, and I'm guessing we've got over 50 years of work into it, and we've got quite a bit to go. But it's a fun project, and I think we've done a lot of interesting development that's generally useful for FreeBSD along the way. Questions? Yes, we're actually using Bluespec SystemVerilog, which is a Haskell-derived HDL. It has a mode to compile a cycle-accurate C simulator, and we did that for a long time. More recently we have actually started synthesizing the bit files, loading them onto the FPGAs, and running the operating system, and that's in Jenkins. The Jenkins cluster has grown quite a lot over the last year. On the hardware side, one of our people is amazingly patient and willing to deal with broken junk. No, I don't think so. Parts of it might be in the release; I'd have to go look. Ask me afterwards and I can take a poke at the git repo to see what we've actually put out and what's buried in Jenkins. No, this is not the CloudABI work, and I probably should have looked at CloudABI before I started on it. What we're doing here is adding a new syscall ABI, like the 32-bit FreeBSD emulation; we're adding a CHERI ABI. It's not clear what the long-term right answer is, but that's one we know how to do. So I did a first pass at syscalls.master. The great thing is you take all the ones that say COMPAT and clear them out, since we never shipped with those versions and we never will. Thank you all for coming.