Cloudlets: universal server images for the cloud - TIB AV-Portal

Cloudlets: universal server images for the cloud

Formal Metadata

Title

Cloudlets: universal server images for the cloud

Title of Series

Number of Parts

97

Author

License

CC Attribution 2.0 Belgium:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.

Identifiers

10.5446/45686 (DOI)

Publisher

Release Date

Language

Content Metadata

Subject Area

Computer Science

Genre

Conference/Talk

Abstract

In this talk we will discuss the issue of server images, and how it affects inter-cloud portability. We will describe our vision for a universal format which can be shared and improved as easily as a Git repository, and how we're implementing it with cloudlets. Cloudlets are universal server images for the cloud. They're lightweight, version-controlled, and you can export them to any bootable format known to man: Xen, KVM, Amazon EC2, or just a plain bootable CD. They can be shared and distributed with the semantics of tools such as Git and Mercurial. Our goal is to build the foundations for truly cloud-independent infrastructures. Our roadmap includes:
* Multi-image stacks
* Auto-scaling
* Automated integration tests
* In-place image editing
* Integration with existing VM generation and configuration management tools

FOSDEM 2010, 89 / 97



Transcript: English (auto-generated)

00:07

Alright, thank you. Everybody hear me alright? So I'm Solomon Hykes, and let's talk about Cloudlets. Before I begin, let me just say this is a conversation. I'm here to talk about a problem that we're trying to solve.

00:22

It's not a problem that has been solved, so if you relate to the problem at all, if you have anything to contribute, please join the conversation. On IRC, Twitter, come see us at the bar, whatever you like. So the problem is sharing. Specifically, how do I share my code when I want it to run in the cloud?

00:43

So of course the cloud is just a fancy word to say anything server-side. So I'm writing an application, I want it to run on people's servers, how do I share that efficiently? So of course, as open source developers, we know how to share, we know how to share our code, we know how to collaborate with people

01:02

who want to contribute to our code, and we know how to package our code so that people can install it on their favorite OS. However, if we're writing applications for the cloud, in addition to all that, you also need to provide configurations. Typically, as an example, let's say I'm writing Apache,

01:24

people want to see the Apache code, other people want to install Apache on their systems, but there's a lot of interest in a great working configuration of Apache and all the necessary stuff to run Python apps or Ruby apps, a great stack that just works.

01:42

And if I want to share that, then that's when things get complicated. I can see three options here. Either I write a tutorial, write it in a blog post, and people comment, edit it, or I package a virtual machine and put it up for people to download, or I use dynamic configuration engines, things like Chef, Puppet, etc.

02:05

We think that all three of these approaches have drawbacks, and the summary would be that we're good at sharing code, good at sharing packages, but we're still pretty bad at sharing configurations that just work. So what we tried to do is improve that situation

02:22

and provide a format to package a configuration and share it. So a Cloudlet is an image that you can use like a virtual machine. Basically, that means it's self-contained. You don't reference anything else. If I give you an image in the Cloudlets format, you have everything you need to boot it,

02:41

and you have everything you need to boot it anywhere. From a Cloudlet image, you can convert that into a Xen image, a VMware image. You can get a raw disk image and then boot it on a physical machine. We've done it with OpenVZ containers, things like that. So the interesting part compared to virtual machines,

03:03

I don't know if anybody has ever tried to share a VMware virtual machine, something like that, or a Xen image, with people who don't actually use that virtualization technique. That's really tricky. And what we really like is the tools that everybody here, I think, knows, something like Git, Mercurial, SVN.

03:23

I can clone someone's code. I can change it. I can commit the changes. I can merge. I can see the differences, all these cool operations that we're used to. We'd like to be able to do that on images as well, and that's basically what the format is made for. You can take someone else's Cloudlet image, fork it,

03:40

make changes, and contribute the changes back, things like that. Now, that's a quick example. If anybody here uses Mercurial, you'll recognize the actual Mercurial server interface. This is not our web code. This is an image that a friend of ours made. It's a bare WSGI server, so it's an Apache configuration to serve Python apps.

04:02

You can see the complete history of the image, and you can inspect a specific change. Let's say, oh, here I see changes in the Apache configuration, things like that. Here, this is a change set where, obviously, as the comment says, he installed the module, and you can see all the files that changed.
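Editor's sketch: the change set described above (the list of files that differ between two versions of an image) can be illustrated at the file-system level in a few lines of Python. This is not how Cloudlets or Mercurial compute change sets internally. The function names `snapshot` and `change_set` are hypothetical, and hashing every file is just an assumed, simple way to detect modifications.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each regular file under root (by relative path) to a SHA-256 digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        # Skip directories and symlinks; only file contents are compared here.
        if path.is_file() and not path.is_symlink():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def change_set(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list the added, removed and modified files."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in old.keys() & new.keys() if old[p] != new[p]),
    }
```

Taking a snapshot of an image tree before and after installing an Apache module, then calling `change_set(before, after)`, would list the touched configuration files, which is roughly the view the Mercurial web interface shows for a commit.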

04:21

The interesting thing is we didn't replicate the work of Mercurial. We use it. We take the image and put it in Mercurial. You can actually do hg pull on the image. So that's the workflow we envision. The same way we say, here's the repository for my code,

04:42

you can say, here's the repository for my image. Do whatever you want with it. And I think that's a really interesting way of seeing things, because then you can start building on top of other people's work, just like you did with code. And I think that might accelerate the progress we make

05:03

in not just writing one piece of code here, another one there, but actually plugging them together and sharing entire cloud stacks. So I think that's as far as I'm going to go in terms of lecturing. I'm not going into any details on how it works.

05:21

I was worried I wouldn't have enough time. But I would love to either answer your questions on email, Twitter, IRC, whatever you'd like. Join us at the bar, like I said. We're going to go there right after this talk, and we'll be there again at five. Who here would like to see a demonstration later,

05:41

just to see it happen? All right, so either right after this at the bar or at five. Is that right? And I think I have time for a few questions.

06:00

Yes? Binaries and everything. Oh, I'm sorry. The question was, do you actually store the entire image in the repository? The answer is yes, we store everything. However, we don't work at the block storage level like VMware or Xen, for example. We don't store that. We work at the file system level.

06:21

So basically, we take a file system tree, and we say, OK, I'd love to put that in a repository, but what should I take out? So we add metadata so that the image is smart enough to say, if this file changes, you want to record that change. If this changes, you're not really interested in that.

06:41

Things like that. But yes. Any other question? All right. Well, this is it. Thank you. I guess I'm ahead of time. Yeah, right.

07:08

So the question was, you save the file system, but do you save any hardware information? The answer is no. We think in terms of moving from one cloud to the other, let's say you have a Xen image running,

07:21

and you want to share the behavior of that Xen image with someone who's using VMware. Any hardware information you share with him will be useless. So the interesting part is to look at all the information, all the bits you have in that image, and only take what's worth moving.

07:49

The question is, can you actually do that? Can you actually take information from a VMware image, move that to a physical machine, and is it enough? Will it work?

08:01

The answer is yes. So the way it works is you start from the file system. Let's say you don't have a Cloudlet image. You have a normal VMware image, and you want to use Cloudlets to move that. The way you do it is you start with the raw file system, and then you have to add the metadata the first time. So using the Cloudlets format,

08:22

you'll add information saying, this is persistent, this is volatile, this is a template, things like that. And then once you've done that, then you can use the metadata to move the image. So that means there's an initial step of authoring, adding manual information,

08:40

and we've worked really hard to make that as simple and short as possible. So there's no giant XML file, no scripts to write, anything like that. Typically, the metadata is a 10-line JSON file. No, you don't have to describe the services.

09:01

You don't have to give any knowledge of how the code works inside, or how to construct the image, or what are the relationships inside the image. You just focus on the file system, on the final result. Because whatever the configuration logic is, the end result is always a certain file system.
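Editor's sketch: the talk does not show the actual Cloudlets metadata file, so the manifest below is purely hypothetical. It only illustrates the idea of a short JSON file that marks paths as persistent, volatile, or template, and of using it to decide which file changes are worth recording. The key names, the glob patterns, and the `classify` and `paths_to_record` helpers are all assumptions made for illustration.

```python
import json
from fnmatch import fnmatchcase

# Hypothetical manifest; the real Cloudlets format may look quite different.
MANIFEST_JSON = """
{
  "persistent": ["/etc/*", "/var/www/*"],
  "volatile": ["/tmp/*", "/var/log/*", "/var/run/*"],
  "template": ["/etc/hostname", "/etc/network/interfaces"]
}
"""

# Checked in this order, so a template under /etc wins over the broader /etc/* rule.
KINDS = ("template", "volatile", "persistent")


def classify(path: str, manifest: dict) -> str:
    """Return the first kind whose patterns match the path, or 'unlisted'."""
    for kind in KINDS:
        if any(fnmatchcase(path, pattern) for pattern in manifest.get(kind, [])):
            return kind
    return "unlisted"


def paths_to_record(changed_paths: list[str], manifest: dict) -> list[str]:
    """Keep changes to persistent and template files; drop volatile and unlisted ones."""
    return [p for p in changed_paths if classify(p, manifest) in ("persistent", "template")]


manifest = json.loads(MANIFEST_JSON)
```

With this manifest, a change to `/var/log/syslog` would be ignored, while a change to `/etc/apache2/httpd.conf` would be recorded. Treating unlisted paths as not recorded is a choice made for this sketch, not something stated in the talk.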

09:22

So that's the big difference with Puppet or Chef, for example. We don't describe how to get to a certain behavior. We let you do that any way you want, and once it's done, we let you snapshot that. And it's actually a great combination with tools like Puppet or Chef, because who here uses something like Puppet, Chef, CFEngine,

09:43

or the equivalent homemade scripts? Typically, you describe steps to take from one state to the final state you want, but you still have the problem of the initial state. A Puppet script describes how to change a base image, but how do you get that base image in the first place?

10:02

So the typical scenario is, here's a base image in the Cloudlet format, and then here's a Puppet or Chef script to make it change its behavior dynamically once it's booted. So we do before boot; Chef, Puppet, CFEngine do after boot.

10:20

Great match. Oh, yeah, the question is, does it only work on public Mercurial repositories? I'm guessing you mean things like Bitbucket or open source?

10:48

Of course, yeah. Obviously, here, the focus is on sharing for open source developers, but the technology works for anybody who wants to share images with anyone. It's the same workflow options you have

11:01

as with Mercurial in general. Sorry. Not on this computer, but yeah, I'd love to... This is basically why I'm here speaking, is I would like to interact with people

11:23

who are interested in this problem. So this is really just to get the conversation started. Any other question? Okay, thank you very much.

11:41

Thank you, too. Thanks.