OSv, a New Operating System Designed for the Cloud

Video in TIB AV-Portal: OSv, a New Operating System Designed for the Cloud

Formal Metadata

OSv, a New Operating System Designed for the Cloud
Title of Series
CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date

Content Metadata

Subject Area
OSv is a new open source operating system for the cloud. It is designed to run a single application per virtual machine, and it is tuned for applications running under the Java virtual machine. In this talk, we will introduce OSv, showcase its architecture, and explain performance and application management improvements. We will also talk about OSv-specific improvements to the JVM that improve application performance in virtualized environments. Operating system developers, as well as application developers who deploy to the cloud, may enjoy the talk. No special expertise is required.
* Why do we need a Cloud Operating System?
* OSv design overview
* JVM optimizations for OSv and the cloud
* Management
* Performance
* Future
Hello, I'm here to talk about OSv, a new operating system for the cloud. This talk is divided into two parts: I will first talk about the OSv core and where we're at right now, and then my co-presenter will join me on stage to talk about configuration and management, and a little bit about the future.

I'm the core maintainer, so if you send patches, I'm probably the one to merge them. A few words about Cloudius Systems: it was founded in December 2012 by Avi Kivity and Dor Laor, who are best known for KVM. We are 14 people in 7 different countries. All the code is open source and available on GitHub, and we joined the Linux Foundation last December. Our mission is to build the best operating system to power virtual machines in the cloud. That means we expect to run on top of a hypervisor.
So what is OSv? It's a cloud operating system, as we like to call it. OSv is written from scratch: no, we did not fork Linux or any other operating system. We wrote all the core stuff from scratch. We did take networking from BSD, like everyone else, and we also imported ZFS. We are modifying the networking stack quite heavily to support network channels, which I'm going to talk about a bit later, but ZFS we obviously didn't change, and we are hoping to switch to OpenZFS at some point. We use C++11 extensively.

So why did we decide to write a completely new operating system instead of using one of the existing ones?
If you look at the typical cloud stack, what you see is a lot of duplication between the different layers. The hypervisor, the operating system (Linux, for example) and the JVM all provide protection and abstractions such as sandboxing. I think this is one of the big reasons why people are looking into container-based virtualization solutions, because obviously this causes some amount of overhead, and it's actually not that useful for the people at the top of the stack who are just trying to put their application on the cloud.

So OSv uses a slightly different approach from traditional operating systems. The library-OS-like design is not a new idea; it was pioneered in the nineties with exokernels. But it has become a viable solution now, when you're running on top of a hypervisor, so you don't need that many device drivers. What it means is a single application per virtual machine. There's no kernel/user-space separation, so everything is basically running in ring 0. No fork(), obviously. But because we're running directly on top of the hypervisor, you can access things like the MMU. As for APIs, we do support POSIX APIs for compatibility with Linux. So effectively the layers are collapsed: in the cloud-specific case, the JVM and the operating system are one, so we have one layer less.
Features: what can you expect from OSv when you're running your application? We support all the operating system services you would expect from a modern operating system. We have a scheduler, which schedules threads, not processes, because there's only one process. Obviously there's memory management, specifically demand paging and memory mapping. It turns out that memory mapping is really important for certain kinds of applications like Cassandra, a really popular NoSQL database, which basically bypasses the JVM and relies on operating system memory maps for performance. We obviously support networking, and we have a full file system: we took ZFS, which basically means that you have a full-fledged, production-quality file system in there.

As for the API, we actually support the Linux system calls, although system calls don't exist in the traditional sense: they are normal function calls. But we go to great lengths to actually look like Linux, and this is for compatibility reasons, because we run an unmodified OpenJDK. This basically means that the application doesn't even know that it's running on top of OSv. The C library is actually quite complete. The implementation itself is based on something called musl, which we took and have extended quite a lot. There are some OSv-specific APIs. We're not trying to build our own API just for the sake of it, but for things like the MMU, for which there are no POSIX APIs, we have to come up with some of our own.
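To make the memory-mapping point concrete, here is a minimal sketch of an application reading a record through a memory map rather than through explicit read calls. It uses only Python's standard `mmap` module on a temporary file; the file layout, the `read_record` helper and the `row-42` payload are illustrative assumptions, not anything taken from Cassandra or OSv.

```python
import mmap
import os
import tempfile


def read_record(path: str, offset: int, length: int) -> bytes:
    """Read `length` bytes at `offset` through a memory map instead of read()."""
    with open(path, "rb") as f:
        # Map the whole file read-only. The operating system loads pages
        # lazily on first access (demand paging), so only the pages that
        # are actually touched need to be brought into memory.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]


def demo() -> bytes:
    # Hypothetical data file standing in for a database segment.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x00" * 8192 + b"row-42" + b"\x00" * 8192)
        return read_record(path, 8192, 6)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The application only indexes into the mapping; paging and caching are left entirely to the operating system, which is why the quality of the OS memory-mapping implementation matters so much for this kind of workload.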
Architectures: OSv is a pure 64-bit operating system, obviously, since we only started writing it about a year ago. We currently run on 64-bit x86 on top of KVM and Xen. I was about to say Xen HVM, but someone told me that's not quite the right name for it. In any case, that is what runs on Amazon EC2, and the public cloud is really what's driving our efforts at this point. We have some more Xen support, but it's not really complete, so if there are Xen people in the audience, talk to us if you want to help. We're doing work to support VMware and VirtualBox. We are planning a 64-bit ARM port, and I think some people are already looking into it. We would want to support other architectures as well, so patches are welcome.
A little bit about the status. Like I said, we run an unmodified OpenJDK 7. We have tested most of the major JVM languages: Java, JRuby, Scala, Groovy, Clojure, and the JavaScript implementation in OpenJDK 7; I think it was changed to a new implementation in OpenJDK 8. In case we forgot your favorite language, just test it and let us know if you find any problems, and we will fix them.

Two JVM applications that we are using for testing and performance tuning all the time are Cassandra and Tomcat, and they are actually quite different from a workload perspective. Like I mentioned, I have personally worked on Cassandra quite a bit, and it seems to be mostly operating system memory management that needs work to get it running well. More on that on the next slides.

Although OSv has a really strong JVM focus, the core has nothing JVM-specific about it. Someone actually ported mruby, which is a minimal Ruby implementation, on top of OSv, and we would love to see other runtimes ported as well. We support native applications, of course. You don't get the same kind of sandboxing there. We worked quite hard to get memcached to work really, really well on top of OSv. Someone is using OSv as a proxy, I think even in production.

I was planning to do a demo, but apparently my laptop's resolution isn't good for that. You can just go and download OSv yourself and see the sub-second boot times. I was just planning to show you how quickly it boots, and if you're interested, I can show it to you on my laptop afterwards.
A few words about performance. I didn't include any numbers here because they change all the time, but if you're interested, just send me an e-mail and I'll share the results with you. We are actually running performance tests all the time.

Performance is obviously a major concern for us, and we do outperform Linux in some quite interesting benchmarks. When I say Linux, I mean an out-of-the-box Linux running in a virtual machine; we haven't tuned it at all. Obviously people can tune Linux more, but one of the key points of OSv is that it tunes itself, and the out-of-the-box experience and performance need to be there. So in some sense the comparison is maybe a little bit unfair, but that's what people are running in production.

SPECjvm is really interesting because it's mostly about JVM performance, and we're seeing something like a 2-3% improvement across all the different tests; it's basically a collection of different workloads. For memcached, I think it's even 50 percent faster in the single-CPU case, and we're actually doing work to reimplement parts of memcached to show what you can do with OSv when you completely abandon the traditional interfaces. We have really good networking results, and I'll talk a bit more about them later. In other workloads we are roughly in the same ballpark as Linux.

Boot time is less than one second; in one case I think it takes 1.1 seconds to bring up the whole thing. The final number is taken from a colleague's slides, and it is what you would expect because there's no overhead: we are four times faster than Linux.
A little bit about image size. The minimal OSv image, which includes the kernel services and libc, is 17 megabytes, and that actually includes ZFS as well. The [unclear] image is 29 megabytes. I was really hoping to show you nice default OpenJDK numbers, but they're really horrible. The core of it is that OpenJDK itself is quite big, and we have some sort of an issue with ZFS generating a lot of unused data by default. Anyway, we're working on fixing this, and it shouldn't be more than 127 megabytes plus the 17 of the minimal image.

So what kind of things can we do with OSv, now that we have thrown out the design assumptions of traditional general-purpose operating systems?
First, network channels. People tell me that there's something similar in BSD called netmap, or something like that. In any case, this was proposed by Van Jacobson, the father of TCP/IP, in 2006 for Linux. I don't know what happened, but it was never merged. He was able to show really nice overhead reductions with his proof-of-concept implementation: 25% for the one-CPU case and 20% for the two-CPU case. Basically, the whole idea is that you want to redesign the networking stack to avoid locking, queuing, and touching a lot of memory.

We have a work-in-progress network channels implementation in OSv, which we haven't merged yet. One key point in the net channel work was to move protocol processing to user space. We have obviously moved user space into kernel space, so we are able to do that: the net channel is directly connected to the application. It's really interesting that we're already seeing a 30 percent throughput improvement in, I think, the netperf TCP test, jumping from 36 to 47. The setup here is that Linux is running on the host, generating the load, and OSv is in the guest, receiving it.
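The division of labour described here can be sketched as a toy model: a classifier on the receive path only looks up the connection and enqueues the raw packet, while the thread that consumes the data performs the protocol work. All names below (`Packet`, `NetChannel`, `Classifier`) are hypothetical and chosen for illustration; the in-order check merely stands in for real TCP processing, and none of this reflects the actual OSv or Linux code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    flow: tuple      # (src_ip, src_port, dst_ip, dst_port), illustrative only
    seq: int         # byte offset of this segment within the stream
    payload: bytes


@dataclass
class NetChannel:
    """Per-connection queue: the producer enqueues, the consumer does the protocol work."""
    queue: deque = field(default_factory=deque)
    next_seq: int = 0

    def push(self, pkt: Packet) -> None:
        # Producer side (driver context in a real stack):
        # no protocol processing here, just enqueue the packet.
        self.queue.append(pkt)

    def receive(self) -> bytes:
        # Consumer side: runs in the application's own thread, so the
        # connection state is only ever touched by its consumer.
        data = bytearray()
        while self.queue:
            pkt = self.queue.popleft()
            if pkt.seq != self.next_seq:
                continue  # toy stand-in for TCP: ignore out-of-order segments
            data += pkt.payload
            self.next_seq += len(pkt.payload)
        return bytes(data)


class Classifier:
    """Maps an incoming packet to the channel of its connection."""

    def __init__(self) -> None:
        self.channels = {}

    def register(self, flow: tuple) -> NetChannel:
        channel = NetChannel()
        self.channels[flow] = channel
        return channel

    def deliver(self, pkt: Packet) -> bool:
        channel = self.channels.get(pkt.flow)
        if channel is None:
            return False  # a real stack would fall back to its regular path
        channel.push(pkt)
        return True


if __name__ == "__main__":
    classifier = Classifier()
    flow = ("10.0.0.1", 1234, "10.0.0.2", 80)
    channel = classifier.register(flow)
    classifier.deliver(Packet(flow, 0, b"hel"))
    classifier.deliver(Packet(flow, 3, b"lo"))
    print(channel.receive())
```

Because protocol state is only touched by the consuming thread, the receive path needs no shared lock around it, which is the property the talk attributes to the design.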
And now for something quite different: the JVM stuff. Memory ballooning is a really typical technique used with virtual machines, and we're extending it to the JVM. The idea is that we can hook into the GC heap and give all the memory to the JVM, so you don't need to limit the amount of JVM memory to something like 80 percent of whatever is available. We're then able to steal memory from the JVM when the operating system encounters memory pressure and needs it. This is running on top of an unmodified JVM, and it's actually quite fascinating, because the generational garbage collector moves stuff around, so as we steal memory from the JVM, things can be moved somewhere else. There's actually a really nice presentation by Glauber Costa on YouTube if you're interested in the details.
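The accounting behind ballooning can be shown with a small toy model: the heap is given all of the memory up front, and a balloon inside it grows when the operating system needs memory back and shrinks when the pressure is gone. The `ToyHeap` class and its method names are assumptions made for this sketch; it models only the bookkeeping, not how OSv actually hooks into the JVM heap.

```python
class ToyHeap:
    """Toy model of a managed heap that shares its space with a balloon."""

    def __init__(self, size_mb: int) -> None:
        self.size_mb = size_mb    # the heap is sized to all available memory
        self.app_mb = 0           # memory used by application objects
        self.balloon_mb = 0       # memory handed back to the operating system

    def free_mb(self) -> int:
        return self.size_mb - self.app_mb - self.balloon_mb

    def allocate(self, mb: int) -> bool:
        """Application allocation; fails when the heap has no room left."""
        if mb > self.free_mb():
            return False
        self.app_mb += mb
        return True

    def inflate(self, wanted_mb: int) -> int:
        """OS under memory pressure: take up to `wanted_mb` of unused heap."""
        taken = min(wanted_mb, self.free_mb())
        self.balloon_mb += taken
        return taken

    def deflate(self, mb: int) -> int:
        """Pressure is gone: give memory back to the heap."""
        returned = min(mb, self.balloon_mb)
        self.balloon_mb -= returned
        return returned


if __name__ == "__main__":
    heap = ToyHeap(1000)
    heap.allocate(300)
    print(heap.inflate(800))   # only the unused part can be reclaimed
    print(heap.deflate(200))
    print(heap.free_mb())
```

In this model the operating system can never take memory that application objects occupy, only the slack, which is why the heap can safely be sized to the whole machine instead of a conservative fraction.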
The final thing I want to mention is a feature that we're able to do since we have direct access to the MMU. One thing we're trying to optimize is to modify the JVM to replace the GC card tables with memory-mapping tricks. What is a GC card table? It's a data structure for keeping track of references from the old generation to the young generation. This is something we are working on; we haven't published anything on it yet. But it's not really a new idea either: Azul, with their pauseless GC, C4, does really similar tricks to reduce GC pause times, but they require some kernel extensions. I think they tried to submit those for Linux, but it's basically doing really crazy things, so it's really hard to get in.
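For readers unfamiliar with card tables, the following sketch shows the conventional software version that the talk proposes to replace: every reference store into the old generation marks a small "card" as dirty, and a young-generation collection only scans dirty cards. The `CardTable` class, its method names and the 512-byte card size are illustrative assumptions; the MMU-based replacement described in the talk is not shown, since no details of it are given.

```python
CARD_SIZE = 512  # bytes of old-generation memory covered by one card (illustrative)


class CardTable:
    """One dirty byte per card of old-generation memory."""

    def __init__(self, old_gen_size: int) -> None:
        n_cards = (old_gen_size + CARD_SIZE - 1) // CARD_SIZE
        self.cards = bytearray(n_cards)

    def write_barrier(self, address: int) -> None:
        # Executed on every reference store into the old generation:
        # mark the card that contains the written field as dirty.
        self.cards[address // CARD_SIZE] = 1

    def dirty_ranges(self) -> list:
        # A young-generation collection scans only these address ranges
        # for references into the young generation.
        return [
            (i * CARD_SIZE, (i + 1) * CARD_SIZE)
            for i, dirty in enumerate(self.cards)
            if dirty
        ]

    def clear(self) -> None:
        # Reset all cards once the collection has processed them.
        for i in range(len(self.cards)):
            self.cards[i] = 0


if __name__ == "__main__":
    table = CardTable(4096)
    for address in (10, 600, 20):
        table.write_barrier(address)
    print(table.dirty_ranges())
```

The cost of this scheme is the extra store on every reference write, which is the overhead that page-level tracking through the MMU would aim to avoid.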
That's my part; my co-presenter will continue with configuration and management. Thank you.

Unlike the first part, which showed things that already work, what I'm showing here in the next five minutes is, to some extent, more about plans that we have. I want to look at what makes a cloud OS, in this case OSv, different from a traditional, general-purpose OS from a management and administration perspective.

If you look at any operating system like Linux, it is very much oriented toward a command-line interface, the CLI. Other operating systems focus on a GUI, but let's put that aside. A CLI is basically for humans: it's made for human interaction. You can interact with it, and it's fine when you have a standalone system. When you have a hundred or a thousand servers on the cloud that you want to administer, a CLI will not do. So what we're doing with OSv is orienting it toward an API and toward massive numbers of servers.

Another aspect is configuration, for example: we basically take out everything that requires human interaction. If you look at the configuration of an operating system, you have in many cases multiple files spread across the file system, sometimes each of them with a different text format for the configuration, and you need to go and manually update a lot of them to make any useful configuration change. As you know, there are a lot of tools that help to solve this problem: Chef, Puppet, etc. They help you do a lot of this stuff, but these tools actually work very hard to convert the human interface into an automated one, because they have to parse the responses the machine gives back and try to extract information from them, and that's pretty challenging. What we're trying to do in OSv is take this redundancy away and take the complexity away by doing everything automatically through an API. So we chose to have a REST API for basically doing everything in our system, and I will touch on that a little bit later.

Let's look at what exactly interacts with the cloud OS. By the way, I guess this diagram can apply to any operating system, not just a cloud OS specifically. There is a bunch of services that interact with the operating system and with the application running on it. You have configuration and you have packaging; I will touch on those two in a second. You have monitoring, tracing and logging, which collect information from the cloud instances. And you have what I call operations, maybe not the best name: basically an API that allows you to do things on the system, such as reboot the system, change configuration, install software, whatever you need to do with the system.
Let's go to some screen captures. As I mentioned, we chose to automate everything in OSv with an API, and we chose a REST API for it. Every operation that you can do on the system will eventually be exposed through the API. It's not there yet: right now you still do a lot of stuff manually or through the CLI. But that is our focus, and I welcome people who want to help us with that to join the project. We will have everything over the API.

We chose to define the REST API with a tool called Swagger. Is everyone here familiar with it? It's not directly related to OSv, but I recommend checking it out. It's a really cool tool that lets you define a REST API, and as a side effect it gives you a GUI that interacts with this REST API. You can see a screen capture here, or you can go to the Swagger site and find more examples. So everything that you want to do with OSv, you can do with the REST API. The CLI is just a layer on top of it, and you don't necessarily need to use it.

One issue in performing operations on an operating system is that not everything is synchronous. If you do something like probing the CPU or probing the disk, you can send a request and get a response back. But if you're doing something like running a test, or anything that takes more than a trivial amount of time, I guess more than a few nanoseconds or milliseconds, you can't hold the response until the answer is ready. So we treat it as asynchronous: we respond immediately with an HTTP response, but the result is dumped to a log file, and you can collect the result from the log file. We are also planning to use, and this is not there at present, an asynchronous token, a reference number, which will allow you to follow a specific request as it executes.
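The asynchronous pattern described here (answer immediately, hand back a reference token, let the caller collect the result later) can be sketched in-process without any HTTP server. Everything below is hypothetical: the `AsyncOperations` class, the field names, and the use of status code 202 (the standard HTTP "Accepted" code, chosen here as a reasonable assumption) are not taken from the OSv API, which the talk says did not yet have this feature.

```python
import threading
import uuid


class AsyncOperations:
    """Toy dispatcher: returns a token at once and runs the work in the background."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state = {}     # token -> {"status": ..., "result": ...}
        self._threads = {}   # token -> worker thread

    def submit(self, func, *args) -> dict:
        token = uuid.uuid4().hex
        with self._lock:
            self._state[token] = {"status": "running", "result": None}

        def run() -> None:
            try:
                outcome = {"status": "done", "result": func(*args)}
            except Exception as exc:  # record the failure instead of losing it
                outcome = {"status": "failed", "result": repr(exc)}
            with self._lock:
                self._state[token] = outcome

        worker = threading.Thread(target=run)
        self._threads[token] = worker
        worker.start()
        # Immediate reply: the request was accepted, not yet completed.
        return {"status": 202, "token": token}

    def status(self, token: str) -> dict:
        """Follow a specific request using its reference token."""
        with self._lock:
            return dict(self._state[token])

    def wait(self, token: str, timeout: float = None) -> dict:
        self._threads[token].join(timeout)
        return self.status(token)


if __name__ == "__main__":
    ops = AsyncOperations()
    reply = ops.submit(lambda x: x * 2, 21)
    print(reply["status"])
    print(ops.wait(reply["token"]))
```

A client built on this pattern never blocks on a long-running operation; it keeps the token and polls, or collects the outcome from wherever the service records it.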
Next, configuration services. I don't want to go over everything in the interest of time, since we don't have much of it, but basically we have configuration as an API: all configuration changes will be done through the API that I mentioned earlier. With the API, of course, comes authorization, comes tracing, come transactions, etc. You will not be able to make a configuration change directly in a file or anything like that. It's under control, it's good for automation, and at least I find it much easier to maintain.
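As an illustration of what "configuration as an API" buys over editing files, the sketch below applies a batch of changes as one transaction: either every change passes validation and is applied, or nothing changes, and each accepted change is recorded in a trace. The `ConfigStore` class, its methods and the example keys are invented for this sketch and do not describe the actual OSv interface.

```python
class ConfigStore:
    """Toy configuration service: validated, all-or-nothing updates with a trace."""

    def __init__(self, initial: dict = None) -> None:
        self._data = dict(initial or {})
        self.trace = []  # (user, changes) for every accepted transaction

    def get(self, key: str):
        return self._data[key]

    def apply(self, changes: dict, user: str, validate=None) -> None:
        # Build the candidate configuration without touching the live one.
        candidate = dict(self._data)
        candidate.update(changes)
        if validate is not None and not validate(candidate):
            raise ValueError("configuration change rejected")
        # Commit: swap in the new configuration and record who changed what.
        self._data = candidate
        self.trace.append((user, dict(changes)))


def ports_are_valid(config: dict) -> bool:
    # Hypothetical rule used only for this example.
    return 0 < config.get("http_port", 80) < 65536


if __name__ == "__main__":
    store = ConfigStore({"http_port": 80, "hostname": "osv-1"})
    store.apply({"http_port": 8080}, user="admin", validate=ports_are_valid)
    print(store.get("http_port"), store.trace)
```

Because every change goes through one entry point, checks such as authorization and validation have a single place to live, which is much harder to guarantee when tools edit configuration files directly.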
For some of you this might look familiar, and you can tell me what it reminds you of. To help people get into OSv and to let people use OSv easily, we are planning a public repository of OSv images that are ready to use.

Today, if you want to install your own software on OSv, you basically have two ways to do that. One way is to add your module into the build system, and if you look at GitHub there are instructions on how you can do that. The other way is to just copy the JAR file or the binary file directly into the system itself. To make it even easier, what we plan to have is a public repository of images, which you can pull and run immediately, or modify and upload again to this public repository. What does it remind you of? Yes, Docker. It's very much inspired by the way Docker handles images, so we took the good things out of it. These are some of the operations that would be supported by this kind of mechanism. By the way, we will touch on containers later; nothing that I'm showing here is specific to containers, but we were inspired by the easy management of the images that they're using there, which is great.

Log tracking is something that we are only starting with, so you don't see an actual implementation yet. But for anything that is installed in the cloud, it doesn't make sense to keep all the logs locally; they will be collected by a central service and handled there.
A little bit on the roadmap. Right now I think we are about here in this graph. It mostly shows what we did already, but also things that we are planning to do in the next few weeks, or even the next few months. We have an upcoming release, which will be an alpha release and will include a lot of features. I will not go through each of these features because they're quite technical, but from a hypervisor perspective, we are planning to support Google Compute Engine, VMware and VirtualBox very soon. For the rest, just join the mailing list and see everything that's going on there.

So, back to containers. One final thing I wanted to talk about, since I think we have the time, is containers. Oftentimes when I mention OSv, people say, "Hey, we already have containers, so what's the point of OSv?" And to be honest, containers, and especially Docker, have really done a lot to make the container stuff usable for all of us. Having also been involved in Linux kernel development, I can tell you that there are control groups underneath all of that.

With containers, what you get is really fast boot time (basically zero boot time), fast provisioning, and raw performance. Under the hood it's quite different from what we're doing in OSv. Containers really reject the idea of a hypervisor, and they're built on a shared kernel, which can be a problem when you're upgrading the kernel or when you're doing hardware maintenance. You basically don't have any of the virtualization goodies like live migration. There's a lot of complexity in the kernel; sometimes I'm surprised it even works. And it uses a copy-on-write user space, which is really great for saving image space, but you're still basically tied to a normal Linux user space.

So what I tend to think is that with OSv we're trying to combine the best of both worlds. We are also able to provide fast boot times, fast provisioning and raw performance, because we're cutting down layers. On top of that, you get all the virtualization stack features that you would expect, and, like I mentioned earlier, you basically have hardware access, specifically MMU access, which is interesting for certain applications. On this as well, there's a great blog post by none other than Glauber Costa, who is actually in the audience today. So that's it. Thank you.
We have five minutes for questions.

So the question is whether we are planning to add things to support more runtimes. We do support OpenJDK, and that's our focus, but we do run the mruby runtime as well. It's basically about adding the POSIX APIs that are required by the runtime. We will almost certainly be working on different runtimes in the near future, and if anyone is interested in porting, send patches. What we are doing is adding POSIX APIs and other interfaces to make OSv compatible, so I guess the answer is yes.

We obviously cannot guarantee full compatibility, because we don't have fork(). So if your application needs fork(), you need to do something with the application to run it on top of OSv. But the threading model is supported: from the application's point of view, it looks like you're running a single process on Linux.

Yes, the numbers were all run on top of KVM. For us it's mostly a convenience, because obviously everybody's laptop has KVM installed. For net channels, we don't have numbers for other hypervisors.

The question was how we do runtime debugging if we only have one process. Actually, GDB is really well integrated with QEMU, so, being a Linux kernel developer, I find it actually easier to debug the whole thing that way. And yes, memory corruption in the kernel caused by the application is obviously possible, so it can happen, but for us it hasn't really been an issue at all.

Any more questions?