CloudABI Cloud computing meets fine-grained capabilities

Video in TIB AV-Portal: CloudABI Cloud computing meets fine-grained capabilities


Formal Metadata

CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

CloudABI is a new runtime environment that attempts to make it easier to use UNIX-like operating systems at the core of a cloud computing platform. Instead of offering full machine virtualization (e.g., bhyve) or requiring the use of intrusive OS-level virtualization techniques (e.g., Jails), end users can simply provide a set of binaries that communicate with the operating system over a secure and compact POSIX-like interface. Advantages include ease of maintenance and increased security.

Over the last couple of years, we've seen the use of Capsicum increase. It's already being used to harden services like hastd and sshd, but also in interactive tools like tcpdump. CloudABI attempts to extend the scope of Capsicum by providing a lightweight POSIX-like binary interface that is purely based on the principles of Capsicum.

CloudABI can be used at the core of a cloud computing service. Instead of using full machine virtualization (Xen, bhyve, KVM) or techniques that attempt to virtualize namespaces (FreeBSD Jails, Linux cgroups), CloudABI makes it possible to safely run user-provided executables with very low CPU/memory overhead, but also without any complex system configuration.

Compared to other UNIX ABIs (Linux, FreeBSD, etc.), CloudABI is relatively compact. The number of system calls is low (~60) and all data types and structures have been decoupled from the public C runtime environment, meaning that it is relatively straightforward to add support for CloudABI to other operating systems. Implementations for FreeBSD and NetBSD already exist. An implementation for the Linux kernel is being worked on. This allows users of such computing platforms to run the same executables without targeting a specific operating system. There is no need to recompile.

CloudABI uses Clang as its C/C++ compiler. It ships with a modern C library that is specifically designed to work in a capabilities-centric environment. Interfaces that typically tend to break when using Capsicum on FreeBSD (e.g., locales, timezones, DNS) may still operate correctly in this environment. The C library is almost entirely thread-safe and has high testing coverage.

CloudABI attempts to abstract away traditional UNIX concepts that are not applicable to pure cloud computing environments, such as UNIX process credentials management (local users and groups), file system access control management and terminal handling.
Can everyone hear me, even the people sitting in the back? OK, good. Maybe we should try to see if we can turn down the lights; I could press all these random buttons over here. Perfect, thanks.

So good morning everyone, and thanks for showing up. It's the first talk of the day, so I guess some of you had a hard time getting out of bed. I'm here to talk about CloudABI, a project I've been working on for the last half year. I spoke about CloudABI at another conference last month, but this is only the second time, and now I have more of an international crowd sitting here, so I really wonder what you think of it.

Before I explain what CloudABI is and what it does, some background about me. When I was making this slide I discovered that I've been using FreeBSD and hacking on it for 10 years already, quite a long time. The first thing I did, together with a friend of mine from my studies, was port FreeBSD to the original Microsoft Xbox back in 2005. That was quite a project. After that I started work on some more serious projects. The first big project I worked on was making the TTY layer in the FreeBSD kernel MPSAFE. After that I moved on to a couple of other projects, for example the vt driver that a lot of FreeBSD users use nowadays. I was the first person who started hacking on it, so it's nice to see that it has become popular.

After that I started work on ClangBSD, which was a branch of FreeBSD where we tried to integrate Clang into the system and build without GCC. Most of that work also ended up in FreeBSD, and I'm quite happy with that, especially seeing that Clang is becoming popular outside of FreeBSD as well. After that I worked on C11 support, because
in December 2011 the C11 standard was released, which offered a lot of new interesting features. So I added support for C11 atomics and for some of the Unicode functions.

Unfortunately, when I graduated and started working, I ended up at a company where we didn't really work with FreeBSD a lot, so I never had the time anymore and couldn't hack on FreeBSD as much as I'd like. But late last year I decided to quit and start my own company. What I'm going to present here is an open source product that you can use in any way you like, but keep in mind that my company also offers commercial support for it. So please get in touch if you're thinking about using it; I'm very interested to hear what interesting use cases you come up with. So before
I explain what CloudABI is, I'm going to give two introductions. The first one is called "What's wrong with UNIX?". This may sound overly harsh, and people have different opinions about it, but this is what's wrong with UNIX in my opinion. After that I'm going to give an introduction to Capsicum, which most of you have probably heard of, and then I'll get to CloudABI.

So UNIX, in my opinion, is a really awesome and powerful operating system. I've been using different flavors of UNIX for quite a long time now, but what I've realized over the last couple of years is that most UNIX systems share a common set of problems. One of the largest problems is that UNIX doesn't make it easy to run software securely; I'm going to explain that in the next couple of slides. It also doesn't really stimulate you to write software that can be reused easily and tested easily. There are certain ongoing projects within the open source community that have been dragging on for a decade, and in my opinion that can be blamed on this. And finally, I think that systems administration on UNIX is far from perfect: we spend too much time maintaining the UNIX system and not the programs on top.

So what's wrong with UNIX security? There are two problems in my opinion. The first one is that if you want to run a service, the impact of a security exploit is far too big. Say you want to run a web service. The only things it needs to do are handle incoming HTTP requests, access a couple of data files that are stored on disk, send some RPCs to some kind of database server, and send back responses. In practice this process can do a lot more. So if there's a security bug, an attacker can already do quite a lot of interesting things, even without being root. First of all, it could create a tarball of the entire system and send it back over HTTP, and it could also
register a cron job. So even if you as a systems administrator notice that there's a security bug, detect it and fix it, the attacker could easily have installed a binary somewhere on the system, in /tmp or wherever, and relaunch from that. You really have a hard time figuring things out, because most people don't actually look at the contents of the system to see whether it's still in a consistent state. Also, while you're logged in and trying to figure out what's going on with the system, the attacker can invoke the write command line tool to spam your TTY. Fortunately there's a way to turn this off, but it is enabled by default. And even if the attacker decides not to steal any data from the system or do anything harmful, it could still turn the system into a Bitcoin miner or something like that: compute stuff and send it back to the other side.

There are a couple of things that have been designed over time to mitigate this, Capsicum being one of them, but a lot of these systems are not really that effective, for example AppArmor, or using SELinux policies to try to make things more secure.

The second problem, which doesn't get enough attention in my opinion, is that it's not that easy to run third-party applications directly on top of the UNIX kernel without sacrificing the integrity of the system. Say you have a third-party application from a random customer, or from some page on the internet, it doesn't really matter, and you just type ./program. That's actually quite unsafe, especially if you run it as your own user: it could just delete all the files in your home directory, or whatever. Even running it as a separate user is still not that secure. Over time we invented technologies like jails, Linux cgroups and nowadays Docker to try to make this more secure, but
in practice you see that the API that's exposed by the kernel is so incredibly large that it's really hard to get something like cgroups and jails to work correctly. You can already notice that if you're inside of a jail, you can still get quite a lot of information from the host system, and it's questionable whether that information should be exposed. So most people nowadays use VMs to create security. You can see this, for example, in cloud computing, where parties like Amazon and Google have an offering for virtual machines: you just get your own system and have to install it completely. That's a shame, because it doesn't really exploit the cloud to its full potential. In my opinion it would have been a lot nicer if you could just say to Amazon or Google: here's a program, just run it for me, automatically scale it up and load-balance it and do whatever you like. Instead they give you individual UNIX virtual machines and you have to build something robust on top of that.

I already mentioned a couple of slides earlier that I think it's really hard to write reusable and testable software on UNIX. When making the slides I discovered that it was actually hard to explain why. It's like someone who can see trying to explain to a blind person what the color green is; it's pretty hard. So what I decided was to first take a look at another environment where testing is actually a solved problem. There's this programming language called Java, which allows you to build all these really nice applications. Sometimes Java programmers write lots of code that you can't read anymore, but there are some things in Java that are actually quite smart, and we can learn from them.

Say I write a very simple web server in Java. This is of course not complete and needs more member functions, but a simple web server would look something like this. It will contain some state: namely, we need to keep track of
some kind of network socket, and some kind of string that stores the root directory of our web server. Whenever you get a GET request for /file, you just concatenate /file to the root directory and you know which file to open.

A really poor web server would be initialized like this: when the class is constructed, you just create a network socket bound to port 80 and set the root directory to a certain path. We all agree that this web server is not really reusable, right? If I wanted to run it on port 81, or run two web servers at the same time, or use a different root directory, I can't do it if I use this model.

So the usual approach is to extend the class: you add a couple of arguments to the constructor, a port number and a directory, and that is a lot better to use. But still, in practice you see that a truly reusable web server is not implemented like this. Instead of letting the web server construct a TCP socket on a certain port number, you pass in a socket. The advantage of this approach is that it supports a number of different network protocols. It not only works with TCP, it could work with UNIX sockets or whatever. You can even have a virtual network socket that's not attached to a physical network stack in the operating system, to be used for testing, like a mock socket. The same holds for file system access: instead of literally opening files within the rest of the class, you pass in a directory object with a couple of member functions like getFileContents. The nice thing is that you can now test this web server completely from within Java code, with no interaction with the actual operating system. So my
observation is that the way we actually write UNIX applications is similar to the first two examples. They either hardcode quite a lot of assumptions, or they hardcode pathnames to configuration files that specify the configuration of the application. So applications acquire resources on your behalf, instead of you providing resources to the application. For example, with Apache you specify an IP address and port number you want to listen on in a configuration file, and Apache opens a network socket for you. A disadvantage of this model is that every time you want to teach the web server a special trick, for example binding to a TCP socket with different TCP timeout parameters, you can't do it; you need to add explicit support for it to the web server. You can't just provide it a network socket where everything is configured the way you like. Everything needs to be implemented in Apache itself. In my opinion there's a double standard here: when writing object-oriented software we really appreciate writing testable code, but for UNIX applications we don't care a lot about it.

So here's an example of a simple UNIX web server that is in fact testable. It's actually really easy. It assumes that on file descriptor 0 there's already a network socket present, and when it gets an incoming request it just replies "Hello, world". A proper one would be more complex, of course. The nice thing about this web server is that I haven't had to add a single line of code to get IPv6 support, right? Think of how many years it took before we had proper IPv6 support in most of our UNIX software. All of that could have been prevented if the network sockets were provided to the application. And the nice thing is that without writing a single line of code I also get support for concurrency, because what I can do is just spawn this web server process 10 times
and give each of them the same network socket. There is no need to implement a thread pool inside the web server anymore. And this web server is also testable, because I can just provide a UNIX socket, programmatically inject requests and check whether I get the desired response. So you write a lot less code and you get more features. That's what it boils down to.
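The descriptor-passing web server described above might look roughly like the following in C. This is only a sketch, not the code from the slides: the names serve_connection and serve_forever, the fixed HTTP reply and the 4096-byte request buffer are assumptions made for illustration. The point it tries to show is that the server never creates a socket itself, so it does not care whether it was handed a TCP, IPv6 or UNIX domain socket.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Fixed reply; a real server would parse the request first. */
static const char REPLY[] =
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/plain\r\n"
    "Content-Length: 13\r\n"
    "Connection: close\r\n"
    "\r\n"
    "Hello, world\n";

/* Write the whole buffer, retrying on short writes. Returns 0 or -1. */
static int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0)
            return -1;
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Handle one connected socket: read a request, send the fixed reply.
 * The socket family (IPv4, IPv6, UNIX domain) is irrelevant here. */
int serve_connection(int conn) {
    char req[4096];
    assert(conn >= 0);
    if (read(conn, req, sizeof(req)) <= 0)
        return -1;
    return write_all(conn, REPLY, sizeof(REPLY) - 1);
}

/* Accept loop on a listening socket that somebody else created,
 * for example one inherited as file descriptor 0. */
int serve_forever(int listen_fd) {
    for (;;) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0)
            return -1;
        serve_connection(conn);
        close(conn);
    }
}
```

A main() for this sketch could be as small as a call to serve_forever(0). For testing, one end of a socketpair(2) can be handed to serve_connection while the test writes a request to the other end and reads the reply back, without touching the network at all.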
So, Capsicum. I'm switching over to a different topic, but later on you'll see how it relates to the rest. Who here has heard of Capsicum? OK. And who hasn't heard of Capsicum before? OK, well, then let me give a quick introduction.

Capsicum is a sandboxing technique that's present in FreeBSD and allows you to harden applications and make them more secure. How it works is actually quite easy. Your program starts up like a regular UNIX process. There's no configuration file installed in /etc that describes how this program should be sandboxed; it's just a regular executable. At some point in time the program calls a special system call, cap_enter. With that system call it tells the kernel: from now on I'm not going to open any new resources anymore. I'm not going to open any random files on disk and I'm not going to create any new network sockets. You can lock me up; I only need to interact with the file descriptors that I have so far.

So system calls like read, write and accept still work, because they operate on the file descriptors the process already has. System calls like reboot and unlink simply don't work anymore, because they depend on global namespaces. What's interesting is that it still allows you to access the file system, because POSIX 2008 added support for opening files relative to a file descriptor. That feature was added to make race-free access to the file system possible, but Capsicum uses it like so: if you have a file descriptor to a directory, you can call the openat system call to open the files underneath it. This is really powerful. Compare it to chroot, for example, where you lock up the application in a single directory. With this approach you can open as many directories as you like, then call cap_enter, and still have access to those directories.

This is already used by quite a lot of programs in FreeBSD: dhclient, ping. I
mean, ping only opens a single network socket and after that it can just call cap_enter. It has a file descriptor to the raw network socket used for the ICMP traffic, and it has a file descriptor to your terminal, so it can still send ICMP packets, receive them and write statistics to your terminal. That's all it needs to do. sshd also uses it nowadays on FreeBSD. Another interesting service is hastd, which was written by pjd. It's like a network storage service: the only thing it needs is a single network socket on which it gets incoming requests ("give me this piece of data"), and a file descriptor to the disk partition that it exports over the network. So this is quite a powerful mechanism for sandboxing applications.
Late last year I started to use Capsicum in another piece of software that I was writing. I wanted to harden it and make it more secure. I noticed that Capsicum is really awesome: it really works as advertised and does what it's supposed to. In my opinion other operating systems should add support for Capsicum as well. It would be nice if NetBSD and OpenBSD also added support for it, and there is someone working for Google who is trying to resurrect Capsicum support for Linux and get it upstreamed. That's going a bit slower than I'd want, but we're getting there.

The only thing that I noticed about Capsicum, and this sounds weird, is that Capsicum doesn't scale. With that I mean that it's really easy to sandbox really simple programs. You see all these changes flash by in FreeBSD where people add Capsicum to utilities like cat, sort and uniq. Those are really easy to patch up: you just open the input files and then call cap_enter. But as soon as you have a program that becomes more complex, for example one with a large amount of code that also uses third-party libraries, it actually becomes pretty hard.

The reason for that is that if you want to sandbox an application, what you typically do is come up with a point in the code where you want to start sandboxing. Before you actually receive any network requests or do anything that might be sensitive, you put a cap_enter call there. You start the application for the first time and it breaks miserably, because it turns out that it needs to open all these random files, and you iterate on that until it works. That's not easy. Applications might break in really nontrivial ways: you might get really obscure error messages, or no error messages at all, and it really takes some time before you can port a particular application. And what I've noticed is that even the standard FreeBSD libraries don't work well with
Capsicum. I can give you a couple of examples.

The top line of code is localtime_r. It's a function that takes a UNIX timestamp and converts it to hours, minutes, seconds, day of the month, that kind of stuff. The translation depends on your time zone, of course. If you're in the Netherlands, for example, you first add or subtract something like 3600 seconds before doing the conversion. What I noticed is that if you use this function before you call cap_enter, it uses CEST in my case. If you call cap_enter first, it uses UTC. The problem is that the first time this function is called, it needs to open /usr/share/zoneinfo/Europe/Amsterdam, in my case, before it can do the conversion properly. So you need to make sure that you have converted at least one timestamp before calling cap_enter if you want to use the proper time zone. That's not easy. Well, if you know it, it's easy to keep in mind, but in a larger application this is really annoying, of course.

Another thing that I noticed is the following. POSIX 2008 added support for creating so-called locale handles. There is a function called newlocale with which you can get, say, a handle to the Chinese locale using UTF-8, which you can then pass on to functions like wcstombs_l. I'll explain what the name means: it literally means wide character string to multibyte string. You pass in a Unicode string, which means 4 bytes per character,
and it actually generates a UTF-8 string, or any other character set as well. If you call cap_enter first, this breaks completely: you can't create new locale handles anymore, because every time you call this function it needs to open /usr/share/locale/ whatever.

And the piece of code at the bottom is actually the worst example that I encountered. I won't give you the name of this library, but it was a really horrible crypto library, not OpenSSL. What happens is that it contains this piece of code: the first time you start using it, it tries to open /dev/urandom, and if that fails, it falls back to this piece of code. So if I would call cap_enter over here, I would have some pretty decent entropy, right? And if I call cap_enter over here, the entropy is not that good anymore. And this is impossible to figure out, right? I only discovered this by luck, because I was running truss over my application and saw that the open of /dev/urandom returned minus one, followed by gettimeofday, getpid and so on. I thought: no, this can't be real. I looked at the library source code and indeed, it really was just horrible. So is there
a way we can solve this problem? I thought about this quite hard and I don't think it's that easy, because we sort of have conflicting requirements between environments. Normally on FreeBSD we don't want to put all sorts of data in the C library that might change. For example, time zone information changes quite a lot, so you don't want it hardcoded into applications; you just want to put it in this directory. But if you want to build this cluster / cloud computing environment where you want applications to be more self-sustaining and not really depend on the environment, you might want it compiled into the binary.

Also, functions like open are functions that simply don't work with Capsicum at all, but you do want them to be present because you need them before you call cap_enter. After you call cap_enter, though, it would have been a lot nicer if you actually got compiler errors for that piece of code, so for all the code that runs after cap_enter, if it calls open. So what I thought about was not writing some kind of really complex linting tool or something like that, which you would run over your code and which might give you an indication of what kind of work needs to be done before it works. That wouldn't work well; it would give thousands of false positives, etc.
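The time zone pitfall from the first example can be sketched in C as follows. This is only an illustration of the "convert one timestamp before entering the sandbox" workaround the speaker describes, not code from the talk. The function names `start_service`, `format_local` and `enter_sandbox` are hypothetical, and `enter_sandbox` is a stand-in that only calls `cap_enter()` when built on FreeBSD.

```c
#include <stddef.h>
#include <time.h>
#ifdef __FreeBSD__
#include <sys/capsicum.h>
#endif

/* Enter capability mode on FreeBSD. Elsewhere this is a no-op stand-in
 * (an assumption so the sketch also builds on systems without Capsicum). */
static int enter_sandbox(void)
{
#ifdef __FreeBSD__
    return cap_enter();
#else
    return 0;
#endif
}

/* Hypothetical startup routine: touch the time zone machinery once while
 * the file system is still reachable, then enter the sandbox. */
int start_service(void)
{
    time_t now = time(NULL);
    struct tm warmup;

    /* The first conversion gives the C library a chance to load the
     * time zone data before file system access is cut off. */
    if (localtime_r(&now, &warmup) == NULL)
        return -1;
    return enter_sandbox();
}

/* Format a timestamp as local time; meant to be called after
 * start_service(). Returns 0 on success, -1 on failure. */
int format_local(time_t t, char *buf, size_t len)
{
    struct tm tm;

    if (localtime_r(&t, &tm) == NULL)
        return -1;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S %Z", &tm) > 0 ? 0 : -1;
}
```

Swapping the two steps in `start_service` is the mistake described above: nothing fails visibly, the output is simply in UTC.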
I started to think about what it would be like if you had a pure capability-based runtime environment. By that I mean that when the program starts up, it's already in the sandbox. So it's not as if you need to call cap_enter anymore; cap_enter has already been called before main runs, or even before that: before the first instruction of the program runs, cap_enter was already called by the kernel. So what you can do is just remove all Capsicum-unsafe functions entirely. All administrative interfaces, everything that depends on global namespaces: just throw it out. Of course a lot of existing code will fail to build, but the nice thing is that you then know exactly what you need to fix. This piece of code I showed earlier, open /dev/urandom and otherwise fall back to other functions, would fail to build because it calls open. What you could also do is go in the other direction: all the functions that are part of this runtime environment that deal with time zones, protocol databases, services databases, you just build into the C library instead of depending on the environment.

So after giving it a lot of thought, I realized that this is actually not a really bad idea, because what it means is that you can safely execute arbitrary third-party binaries, as long as you don't give them file descriptors to things that you don't want to give them access to. You can run different programs directly on top of the Unix kernel without any virtualization or sandboxing. So if you're a cloud provider selling to customers, instead of giving them virtual machines you can just say: I'm going to start your application, and I'll make sure that file descriptor 0 is a network socket, file descriptor 1 is a log file, and file descriptor 2 is a connection to a database back end. This program can't access any other part of the system, and you don't need to do any traditional Unix systems administration. That would be a really interesting approach.
It also means that software is reusable and testable by default, right? You just start it up with a different set of file descriptors and you can let it talk to different database back ends, you can inject requests and capture responses, you can let it use a different directory, all that kind of stuff. You can also much more easily migrate software around, because the application cannot hard-code a single path name; you really provide all the directories up front. So if you move it around, you just say: next time I'm starting this application, I give it a file descriptor to this new directory, and it's migrated. So migrating processes between servers might be a lot easier with this model.

One of the things I also thought about is what happens if you look at a larger scale. Say you have a much larger setup: a couple of database nodes, some kind of MapReduce running, batch jobs, web front ends, a whole set of different applications. The nice thing is that all the dependencies between all those tasks are known up front. So you can do some really interesting tricks when it comes to cluster management. You could have some kind of cluster suite that automatically knows: if the database back ends are down, there is no need for me to actually start the front ends, because I can't give them a network connection to the database back ends anyway. What it could also do, for example, is this: the cluster management realizes that all of the database back ends are running in a completely different data center than the front ends. It looks at the dependency graph of all the services that it's running, observes that the locality is not perfect, and starts to reshuffle processes across data centers to actually improve locality. That's stuff that's almost impossible
to do right now with traditional Unix systems, right? Those are just autonomous systems, completely on their own, and the cluster management system doesn't know anything about them at all. So when I realized these kinds of
things, that's when I started working on CloudABI. CloudABI is sort of a really low-level building block. There is no high-level cluster management yet, but you need to start somewhere. It's basically this pure
capability-based runtime environment that I've been talking about. You could think of it as POSIX plus Capsicum, minus the stuff that doesn't work with Capsicum anyway. And what I realized is that as soon as you start doing this, Unix becomes incredibly small. There are so many things in FreeBSD that don't make any sense anymore if you use Capsicum. If you just remove them, what's left is extremely small: I ended up with just 57 system calls that implement most of the POSIX 2008 features. If you compare that to FreeBSD: FreeBSD has something like four or five hundred system calls, so it's just one-tenth the size of FreeBSD, and about one-seventh the size of Linux. It's really small.

So what I started to do is come up with a specification of what a safe Unix runtime environment looks like. And instead of just using the traditional approach of writing a kernel and writing a libc, I decided to first standardize the binary interface. So for example,
here, this is a file that just contains all the system calls: a list of all the system calls and their arguments. You can reuse it in any way you like. If you're writing your own kernel and you think "I want to support CloudABI processes", you can just use this file and derive your own system call table. Alternatively, you can also automatically generate language bindings from this definition. So if you're writing your own libc or something like that, you could use this listing to generate the system call wrappers. If you don't want to run C programs but want a Go, Rust, Ruby or Java runtime, then instead of building on top of libc you could also consider writing the software directly against the kernel. That might be easier for some runtime environments, because for example the threading model that Go has is nothing like the threads that you have in C, and Go would have been a lot simpler if they actually had an API like this for low-level programming. So the
constants, types and data structures are defined completely separately; they're not part of an operating system or part of an implementation. And the idea is that we're not going to write our very own operating system, because the world's not waiting for yet another operating system. We can just add support for CloudABI to existing operating systems. It's similar to how compat_linux works on FreeBSD: FreeBSD can run Linux binaries as well, or at least a subset of them. We can just do the same trick and add compat_cloudabi to FreeBSD and to all the other operating systems. And this is going to be quite compact actually, because compat_linux has to emulate hundreds of system calls, but in this case we only need to implement 57.

What this means is that you end up with an environment where you can compile programs once and run them on multiple operating systems. This is exactly what you want for a cluster / cloud computing environment, right? It would be really annoying if a large cluster provider like Amazon would say: we offer CloudABI support, but you do need to make sure that you use the Linux CloudABI and not the FreeBSD CloudABI. It just needs to be CloudABI. So what does this low-level API
look like? It actually looks quite a lot like the traditional POSIX API, except that some system calls are missing or are a bit different from what they usually are, and they also don't depend on any global state. What I wrote is a couple of header files, and for every system call in the system call table they generate a static inline function that uses inline assembly. You can just copy that file into your source tree and invoke system calls. These functions don't depend on any global state, so they don't keep track of the error number in a global variable; they really stand on their own. If you look at the disassembly that is generated, it's just a couple of instructions, and they give you the return value directly.

Here are a couple of more traditional Unix system calls. At the top there's mmap, trying to allocate some memory. Below that there is a write call. There is no traditional write call where you just provide a buffer and a length; it uses I/O vectors by default. So I first define a vector where I say this buffer is six bytes long, I initialize the I/O vector for that, and then I say: write out a single vector. That's what's used for scatter/gather I/O. Then there is the traditional exit system call.

On the next slide I have a couple of system calls that are a bit less traditional. An example is raising signals. If your application detects something wrong and you want to terminate, traditionally you would write kill(getpid(), SIGABRT) or whatever. There's a libc function called raise that does exactly that: kill with getpid and the signal. Because we don't want processes to have access to the global process table, instead of providing the traditional kill system call, a raise system call was added. There's also, for example, a separate system call for acquiring random data, because you can't
open /dev/urandom anymore. And this is also an interesting system call. It's like the traditional openat, but there are two things that are sort of out of the ordinary. My guess is that most people will be writing C programs against this, but I want to make it future-proof. C is one of the few programming languages out there that uses null-terminated strings; most other programming languages just keep track of an explicit length. So all of the system calls that take path names expect an explicit length to be passed. This makes it a lot cleaner if you want to write, say, a Go runtime on top of this, because instead of copying the path name over into a buffer that's one byte longer and adding an explicit null byte, you can now just say: here's this piece of memory and it's this long. And if you have a really long path name consisting of multiple components, you can say: take the first 5 bytes, then the first 20 bytes, 50 bytes, whatever, depending on the components of the path name.

Something else that's different compared to traditional Unix is that normally this system call would have one additional parameter, namely the permission bits. In the environment I'm writing, user credentials are not that important anymore. Your user ID is something that doesn't make any sense on this cloud computing platform, right? You just run in your own isolated environment. So permission bits are simply non-existent. If you call this function on one of the existing implementations, it will just use 0777 or something like that. So of course people
shouldn't use this API if they just want to write more traditional applications; it's only interesting for people working on their own runtime environment. Most people will use CloudABI through cloudlibc, which you can just fetch from GitHub. It's a C library that I wrote that only implements the parts of POSIX that make sense. So the goal is not to achieve 100 percent POSIX compliance; it's about 90 percent POSIX compliant. Also, stuff that doesn't work is simply not there, so it raises compiler errors, and this makes it a lot easier to actually Capsicumize a program. And using this library doesn't actually cause a lot of vendor lock-in, because there are also not that many extensions; it's really just Unix plus Capsicum minus incompatible features. So if you write software against this, you can still use it on Linux and the BSDs. There simply aren't that many extensions.

The nice thing is that cloudlibc has a lot of unit tests, about 650 of them right now, and these unit tests are used for two different things. First of all, they ensure that the library itself works correctly, but they can also be used to check whether the operating system you're running on implements all the necessary features. And I can show you what a unit test looks like.
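The slide itself is not part of the transcript. As a rough stand-in, here is a minimal sketch of a Google Test-style test in plain C, with automatic registration and type-aware failure output through C11 `_Generic`. It is not cloudlibc's framework: the `TEST`, `ASSERT_EQ` and `run_all_tests` names are chosen for illustration, registration relies on the GCC/Clang `constructor` attribute (an assumption about the compiler), and the per-test temporary directory descriptor of the real framework is left out.

```c
#include <stdio.h>

typedef void test_fn(int *failures);

struct test_case {
    const char *suite;
    const char *name;
    test_fn *fn;
    struct test_case *next;
};

/* Linked list of all registered tests, filled in before main() runs. */
static struct test_case *all_tests;

static void register_test(struct test_case *t)
{
    t->next = all_tests;
    all_tests = t;
}

/* Declares a test and registers it automatically, so no central list of
 * tests has to be maintained by hand. */
#define TEST(suite, name)                                                \
    static void test_##suite##_##name(int *failures_);                   \
    static struct test_case case_##suite##_##name = {                    \
        #suite, #name, test_##suite##_##name, NULL};                     \
    __attribute__((constructor)) static void reg_##suite##_##name(void)  \
    {                                                                    \
        register_test(&case_##suite##_##name);                           \
    }                                                                    \
    static void test_##suite##_##name(int *failures_)

static void print_int(long long v) { printf("%lld", v); }
static void print_dbl(double v) { printf("%g", v); }

/* Pick a printer based on the static type of the expression. */
#define PRINT_VALUE(x) \
    _Generic((x), float: print_dbl, double: print_dbl, default: print_int)(x)

/* Note: both arguments are evaluated more than once in this sketch. */
#define ASSERT_EQ(expected, actual)                                      \
    do {                                                                 \
        if ((expected) != (actual)) {                                    \
            printf("%s:%d: expected ", __FILE__, __LINE__);              \
            PRINT_VALUE(expected);                                       \
            printf(", got ");                                            \
            PRINT_VALUE(actual);                                         \
            printf("\n");                                                \
            ++*failures_;                                                \
            return;                                                      \
        }                                                                \
    } while (0)

TEST(arith, add) { ASSERT_EQ(4, 2 + 2); }
TEST(arith, half) { ASSERT_EQ(0.5, 1.0 / 2); }

/* Runs every registered test and returns the number of failed tests. */
int run_all_tests(void)
{
    int failures = 0;

    for (struct test_case *t = all_tests; t != NULL; t = t->next) {
        printf("%s.%s\n", t->suite, t->name);
        t->fn(&failures);
    }
    return failures;
}
```

A program would call `run_all_tests()` from `main` and use the result as its exit status. Adding a test means writing one more `TEST(...)` block and nothing else.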
So I started looking around for C unit testing frameworks all over the internet, and I looked at Check and CUnit and what have you, but I didn't like any of them. For example, I think it was CUnit or Check where, in addition to writing the tests, you have to write your own main function where you sum up all the unit tests. In my case, with 650 unit tests, you don't want to have one main function where you manually register them, right? That just has to be done automatically.

So is there anyone who recognizes this syntax? Has anyone ever used it? Yes, what is it? It looks a lot like Google Test, exactly. So this is a test framework that I wrote myself, and I can explain why, but it uses the same syntax as Google Test. I really like Google Test; it's a really nice testing framework for C++, and this is Google Test for C, essentially, so it doesn't use any C++. There were two reasons why I came up with my own unit testing framework. First of all, when I started to write tests I couldn't run any C++ code yet, of course. But the most important thing is that this unit testing framework has a couple of other aces up its sleeve. For example, each test gets its own file descriptor to a temporary directory, dedicated to that test, which it can use to store files. That's something that is not important with Google Test, which just uses the global namespace, but with this I want to write tests for creating Unix sockets, and I want to test all the file system system calls. That's why I needed to extend it to actually provide support for passing file descriptors in. So yes, Google Test is really
awesome. It's the C++ unit testing framework, of course, and I mean, do use it; it's so much better than all the other stuff out there. Yes? Next slide. So yes, I did
write it myself. You can use it outside of cloudlibc. Right now it's integrated into cloudlibc, but it could just be pulled out and used elsewhere. And yes, I think it's completely independent of cloudlibc. If you just grab a copy of the source files, that would work. Maybe you need to adjust the first couple of lines of the header file to include the right dependencies, but it would be quite easy to use. Another advantage of this: what I really like about Google Test
is that its assertions are type-generic. So if you assert equality on two floating point numbers and it fails, they're printed as floating point numbers; if it's an integer, it's printed as an integer. You can also achieve this in C nowadays by using C11 _Generic, so this uses C11 _Generic quite a lot to make sure each type is printed in the right way. So yes, there
are quite a lot of components that I'm reusing from other systems out there. For example, malloc: writing your own malloc is not a fun thing to do, and it also doesn't make sense, because there are implementations out there that are so much better than what I could possibly write. So I decided to go for jemalloc, which is also part of FreeBSD. Also, math functions and complex number support are hard to get right, so I decided to use FreeBSD's, or sorry, OpenLibm. OpenLibm is a project that you can also find on GitHub. It's a mixture of all the BSD libms out there, but it's pretty good, it's portable, and I could easily integrate it into cloudlibc.

Also, parsing and printing floating point numbers is quite a tricky job, so instead of writing my own algorithms I decided to go for a package called double-conversion. You can also find it on GitHub. It's written by a guy called Florian Loitsch and it uses an algorithm called Grisu. You can find a really nice paper about it on the internet if you search for it. It was written at Google, because what they discovered was that when printing and parsing floating point numbers in Chrome's V8,
they noticed that implementations differ wildly between operating systems. Certain operating systems did it really quickly but inaccurately; other operating systems ran very slowly but with the precision you would expect. This is supposedly faster than the traditional dtoa algorithm that we use on FreeBSD, but it's also supposed to be correct; it has some proven correctness properties. So I thought: this is good.

As for datasets, for example time zones: I am using the official time zone database of course, because coming up with your own time zone information is a whole lot of work, and the official data is just pretty good. The only thing that I don't use is the tzcode package. The reason for this is that tzcode is a package that contains implementations of, for example, localtime and mktime, but what they do is actually open files in /usr/share/zoneinfo and parse these binary files. What I did with my implementation, and I can show it to you, it's quite
funny actually: I wrote a really ugly Python script. I'm not going to show the Python script to you, but
this Python script just converts the time zone data into this huge structure initializer. This entire file is 6,000 lines long. And yes, the script is ugly; it has to be quite hacky to work correctly, because there are quite a lot of annoying things in the time zone data files. They use a couple of constructs only in a couple of places, they're really annoying to parse, etc.

But the nice thing about this approach is the following. Normally time zone data is sort of expanded. The official time zone data files are sort of like a macro language. They state: before 1991 the Netherlands had its own rules, but after 1991 it follows the EU, and there is a list of rules that apply to the EU. Normally all those rules are expanded, so /usr/share/zoneinfo is typically quite a large directory on Unix systems: one file per time zone, with all rules expanded. Here they aren't expanded. And the funny thing is that I managed to augment the data structure in a couple of places, adding a couple of key numbers that were missing by precomputing them, and that still allows you to apply a linear-time algorithm to convert the timestamps.

I can also show you how I test it. There's a tool called zdump. You can just run it on your existing system, and it prints two lines for every time the time became discontinuous. So for example for 2015 you can see the moment, in UTC, at which the local time jumped from 2 o'clock to 3 o'clock. What I did is write a funny parser that parses this output again and then creates test vectors for my localtime and mktime tests. So I just use the output of this zdump tool to create unit tests for my own implementation. The implementation works really well and I'm pretty sure it's quite correct. And so what
works, and what doesn't work? There are quite a lot of header files; I'm not going to explain all those things in detail of course. But multithreading works; that's one of the most important things, right? Locale support is also quite decent. There's still some minor stuff that needs to be done to get, for example, string collation working, you know, comparing strings in a culture-specific way. Wide-character support is there; it just needs some fixes here and there. But most of the stuff that's in POSIX is there.

Then there are a couple of things that I don't have working yet: asynchronous I/O and regular expression support. I still need to find a decent regular expressions implementation. Apple has a pretty good one and I'm thinking about importing that one into the library. And there's stuff that I'm really not going to integrate into the system. Support for reading the password database, for example, doesn't make sense in a cloud computing environment. So, on to
hardware support. Right now I only support the x86-64 hardware platform. I only add support for platforms that are actually being used, so it's unlikely that I'm going to add support for SPARC64 or Itanium systems, but if people are interested in seeing support for ARM, especially ARMv6 and higher, then of course that can be added.

Operating system support reads like an SLA. On FreeBSD, 99.9% of the tests pass. I'm not getting exactly 100 percent on NetBSD; there is still a handful of tests that don't work, and I'm still trying to figure out what I need to do to get those to work. Linux support I only started working on two weeks ago, I think, so only 90 percent of the tests pass, but the last 10% is the hardest. And other systems: 0% of course.

How to use CloudABI? It shouldn't be that hard right now. What you do is go to the site, figure out the Subversion URL for Clang, check it out and install it; no extra patches needed. The same goes for Binutils: just check out the latest Binutils sources, build with ./configure --target=x86_64-unknown-cloudabi, make, install, and you already have a properly functioning cross compiler toolchain. No patches needed at all. After that you install cloudlibc, and after that you probably want to install some more libraries, like a compiler runtime and a C++ runtime. After that you need to patch your operating system kernel to support CloudABI executables, and after that you're done: you just compile software and run it directly on top of FreeBSD, Linux or NetBSD. There are no other steps. So the future work of the
Yes? Exactly, right, exactly. So the people from, well, they were kind enough to actually allocate... wait, let me first repeat the question. The question was: how does the kernel know whether it's a CloudABI executable that needs to be executed in a different way? The answer is: I contacted the people who maintain the ELF ABI registry and they were kind enough to actually allocate an operating system number for me. So with this cross compiler toolchain based on Clang and Binutils, if you compile software and take a look at the first couple of bytes of the executable, somewhere the number 17 is in there, and that's an indicator that it's a CloudABI executable. Any other questions?

[Question about access to directories.] Yes, it does. So you can just open a directory, and you obtain a file descriptor with open and O_DIRECTORY or something like that. You get a file descriptor and you can pass that along; you can even pass it through Unix sockets to different processes on the system, and then this other process can just call openat to open a file underneath it. Exactly, yes: so the entire subtree rooted at that file descriptor.

I have a couple more slides and I'll quickly finish. My idea for what I want to work on in the near future is, first, to upstream FreeBSD and NetBSD support. I'm trying to find someone at the conference who wants to do code reviews, and then I can get that pushed. All the components that I'm using have been patched up to support CloudABI, and those patches are also upstream, except for a couple of patches for libc++, so C++ still doesn't build out of the box correctly; it needs some fixes. I also want to get those upstream of course. I think it's really important, because C++ support should just work out of the box. After that I want to focus on Linux support, and then I want to work on a couple of other
projects; I don't know the order yet. It would be really nice if we had FreeBSD packages for the toolchain, so you just run make install clean and you have a nice development environment for CloudABI. What I also want to work on in the future is having a package manager specifically for CloudABI, so you can install the same package manager on FreeBSD, NetBSD, Linux and Mac OS X and say: I need libevent, just give me this library for CloudABI. Then people can really build some high-level applications using a lot of third-party libraries, and all of those libraries are safe for CloudABI.

And in a very far future, and this is where it gets really uncertain, I would want to develop a cluster management / orchestration system built on top of this model. So you take FreeBSD systems and Linux systems, you run a separate process on them, and the only thing it does is accept incoming RPCs. What you can do is send an RPC saying: start up this executable and provide it these resources. And if that system then goes down, the cluster management or the master node notices it and starts the process somewhere else. This could be used, for example, by a larger company or a cloud provider to run untrusted software. So for more
information, just go to GitHub. Nuxi is the name of the company, based in the Netherlands. If you have any other questions you can ask them in a minute, or you can talk to me here at the conference, or alternatively send me an e-mail, especially if you're thinking about using something like this in a commercial product. I'd love to hear how we can help out. That said, are there any questions?

[Question about locale data and binary size.] Yes, so unfortunately all these components are part of the C library, so it does bloat the binary size quite a bit. How I make sure that it doesn't become really bad is this: normally on most UNIX systems there's a cross product between character sets and countries, right? So there's a Dutch locale for UTF-8 as well as for ISO 8859-1. What I'm doing right now is that I just implement them once, so all of the locale strings are stored in Unicode internally and they get converted on the fly. So binary size is still bigger than when loading it from disk, but it shouldn't be that bad.

[Suggestion from the audience.] Yeah, exactly, like locales as a service. The idea is that a separate process contains all the data and all of the processes on the system can just use that. I think that's a good idea as long as people still have the liberty of running that service themselves. Say it's provided by the cloud provider: it's a good idea, but it disallows people from making modifications to locales. Furthermore, it means that if they migrate their software to a different cloud provider that has a different dataset, it behaves differently. So that's why for now I'm still putting it in all of the executables, but we can change it in the future.

Yes, I saw another hand. [Question about porting large applications.] Yes, so I do have to disappoint you: I haven't really tried running applications that are extremely large, for the reason that I only started working on
this, I think, half a year ago, and getting it working already consumed a lot of time. So I've been experimenting with libraries instead of compiling whole programs, for example libevent and some video and audio transcoding libraries, and I noticed that this actually works in that specific case: you compile them and you know exactly what needs to be patched up. But unfortunately I haven't tried running larger applications. Yes, it would be the ultimate test, and if you got an interesting piece of software like nginx or MySQL working, then that would be something that people could actually start using for doing business, just using it day to day. Right now it's really nice to use for new software that you're developing, but I haven't tried it on existing systems that much, I have to confess.

[Question about how to start programs with the right file descriptors.] Yes, that's a really good question. If you want to do it really primitively, for example in the shell, you could always just use a redirection like 4< followed by some directory, and then that directory will be opened as file descriptor number 4. You can write these insanely long shell commands where you pass in this directory, and so on. That's also another really good question. Eventually what I want to do is come up with some kind of, how should I explain this. So applications of course want to have a configuration
file. The web server wants a config file saying: I have these virtual hosts. It would be nice if there were some kind of configuration language where you have a single configuration file, like you would normally have: this is a web server listening on these IP addresses, and it has this virtual host that is rooted at this path. But then some preprocessor runs over the configuration file, sees all those IP addresses and path names, and replaces them by file descriptor numbers. Then the configuration is passed on to the application, and it can just parse that; it knows which file descriptors to use. That's something that would be interesting to work on, but I have had no time to work on that yet. Does that answer your question? OK, good. Any other questions?

[Question about inter-process communication.] So you can use pipes, Unix sockets, shared memory, all that kind of stuff. There is no uniform framework in this, so that's a bit annoying, but there's one thing that is actually pretty awesome: this implementation does support shared locking. So two processes can map a piece of shared memory and place a mutex or condition variable in that piece of shared memory, so you can do some really fast user space IPC with this. I don't have any nice abstraction on top of that, but you can write it right now; it works out of the box. More questions? Well, I guess that's it. Thank you for attending at this time of

