
Managing the Cloud with a Few Lines of Python


Formal Metadata

Title
Managing the Cloud with a Few Lines of Python
Title of Series
Part Number
20
Number of Parts
119
Author
License
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers
Publisher
Release Date
Language
Production Place
Berlin

Content Metadata

Subject Area
Genre
Abstract
Frank - Managing the Cloud with a Few Lines of Python. One of the advantages of cloud computing is that resources can be enabled or disabled dynamically. For example, if a distributed application is short on compute power, one can easily add more. But who wants to do that by hand? Python is a perfect fit to control the cloud. The talk introduces the package Boto, which offers an easy API to manage most of the Amazon Web Services (AWS), as well as a number of command line tools. First, some usage examples are shown to introduce the concepts behind Boto. For that, a few virtual instances with different configurations are launched, and the use of the storage service S3 is briefly introduced. Based on that, a scalable continuous integration system controlled by Boto is developed to show how easily all the required services can be used from Python. Most of the examples will be demonstrated during the talk. They should be easily adaptable to similar use cases or serve as a starting point for different ones.
Transcript: English (auto-generated)
So, the second speaker of the session is going to be Frank Becker.
Frank Becker has been developing Linux systems for more than 10 years. He specializes in the field of testing and benchmarking. Since 2013, he has been working for Amazon Web Services, and I see a couple of Amazon t-shirts in the room.
So, wow. So, Frank also has held tutorials and talks at the German PyCon about IPython and Django. In his spare time, which by his own accord is very rare, he is producing the German speaking podcast, Import This.
Today, Frank is going to introduce the library Boto, which makes it very easy to use AWS from Python. Please welcome Frank Becker with Managing the Cloud with a Few Lines of Python. Hi, everyone.
Thanks for your interest in this talk. We already lost four minutes, so we have to skip through the stuff here. As I was already kindly introduced, I have been working for AWS since 2013, so that's one and a half years now. I'm with the operating system group, which is based in Dresden,
200 kilometers south of where we are right now. We're desperately looking for talent, so in case you're interested in something like this, please talk to us. We have a booth downstairs, but we also have, I mean, all over Europe, locations, development centers, where we have a lot of very interesting work.
And yeah, if you especially like to work at scale, that's probably a good opportunity. I've been working with Python since about version 2.4. Yes, I, as already mentioned,
we actually put some effort into this podcast here. A friend of mine, Marcus, who gave a talk earlier about swing localization, already promised that, yeah, in the next couple of weeks, there will be some new episodes,
but back to the talk. The idea for the talk I actually got at a little local Linux conference, where I was talking to a couple of Debian developers, and they said, look, we get those AWS credits, and actually, those guys at least didn't have any idea what to do with them, which is sad, because AWS is giving them out
for them to improve Debian, and I think they can use that. So we had a chat, and well, sure, as many of you probably also, they didn't fancy clicking through web applications to launch instances and whatnot, so they asked, I mean, how can we automate that?
And they also had some Python background, so I introduced Boto to them, but I had the feeling like this is something that's also helpful for others. Before we actually talk about Boto, two other little things, the first one is,
humans like abstractions, so this word cloud is kind of a buzzword, but let's say at least different people have different opinions what it really means, so I define it for this talk only, maybe in half an hour I will have a different opinion. What I mean by that is that you have dynamic,
or in AWS speak, elastic IT resources that can be something like storage, that can be something like compute, so virtual hosts, that can be networking, so if you need a content delivery network, it's just there waiting for you, but it also could be some routing stuff, packet filters, what we call security groups,
you can have easily databases, as I will show you in a sec, messaging systems, and the key here is that you can scale up and down those things, and while you scale up, sure, you have to pay more, but when you scale down, the idea is you don't pay anything for all this stuff you don't use.
And as mentioned previously, that has to be scriptable because nobody really wants to do that all by hand, and Python, I believe, is a perfect language to do so. If you now think, well, I want to write the hundredth
S3 uploader, there's a tool for that already; actually, Boto comes with a command line tool for that, but I would recommend the AWS CLI tool, which is also written in Python, and yeah, there's a different talk for that.
So, you know, if all you really need actually fits in a simple shell script, then maybe Boto isn't the thing, and you might be much faster using that one. So, Boto, actually, Boto was started by this guy,
Mitch Garnaat, he also used to work for AWS, but he left the company, unfortunately, so now the project is managed by AWS, which means we make sure that the code is up to date,
but actually, we also are very happy about contributions, so I checked on GitHub last week, so we had nearly 400 contributors to the project. We had over 6,000 commits, and that is just the GitHub history from somewhere in 2010.
Yeah, and the name actually comes from this little dolphin here. Which brings me to the first example I want to show you, so maybe many of you are familiar with the storage service called S3,
Simple Storage Service. The idea is that you can dump stuff in what we call buckets, you also could think of it like a namespace or a directory, and there you have to create a key so that you can get your objects back,
and attached to this key, you have an object, and this can be a stream of whatever, so it can be files or, well, as I said, whatever. What we make sure is that you actually get your data back, so the term for that is durability, and what AWS guarantees you is that you get 99.999,
and that's really nine nines, percent chance that you really get your data back, and if you look up what your hard drive gives you, and then you do some calculations with some RAID arrays, you will see that it's hard to reach this number.
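The bucket, key, and object workflow just described can be sketched with the boto library roughly as follows. This is a minimal sketch, not the exact demo code: the bucket name and file name are made-up examples, and it assumes AWS credentials are already configured, so it only runs against a live account.

```python
import boto
from boto.exception import S3CreateError
from boto.s3.key import Key

# Connect to S3 using credentials from the boto config or environment.
conn = boto.connect_s3()

bucket_name = 'europython-demo-bucket'  # example name, must be globally unique
try:
    bucket = conn.create_bucket(bucket_name)
except S3CreateError:
    # The bucket already exists, so just connect to it.
    bucket = conn.get_bucket(bucket_name)

# Create a key and upload a local file's content under it.
key = Key(bucket)
key.key = 'dolphin.jpg'
key.set_contents_from_filename('dolphin.jpg')

# Generate a signed URL that is only valid for a limited time,
# e.g. 120 seconds, as in the demo described below.
print(key.generate_url(expires_in=120))
```
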
I dare to do live demos. I have a couple of IPython notebooks prepared, I just have to see if, because we had a little problem setting up the display here, if that works,
so what you basically do is, in the first place, you import boto, can you see the mouse pointer? Yeah? Good, so I execute that. Done. I have a little file on my hard drive called dolphin.jpg, so what we do here is we first try to create a bucket,
if that fails, we just connect to it, and then, as I mentioned, we have to create a key for the object we want to upload, and this key is called dolphin.jpg, and then we just upload the content,
so let's do that, this little star here turned into a two, so it's done, and yeah, that's some IPython magic, so we generate, actually, what we do here is we get this bucket again,
we go through the list of keys, which is not really so much relevant here, but actually this line here then generates a URL that is valid only for 120 seconds, so what you can do is, of course, you can generate URLs that are valid forever, but sometimes you just want to share a file
and you do not want that others downloaded it too, so you just want to have this link being valid for a certain amount of time, and that is what this thing does, so if all goes well, and it does,
you actually see the signature attached to this URL, and the file will be downloaded. Who of you knows how to create torrent files with AWS? Torrent, you know, BitTorrent, the thing that the music industry got a little wrong,
but actually a very helpful protocol, wait and let me do this bigger, I prepared that already, you may not remember
that I gave you this link here, down here for this presentation, and this, of course, also comes from S3, so the only thing you do actually is, you attach to your S3 link, question mark, torrent,
and what you get is the torrent file, so with the limitation of the wireless we have here, I'm not sure if clients here can talk to each other, but that would improve downloads like this, a lot. All right, okay, next example,
I talked about this message queues, the service is called SQS, what it basically does is, and there are many other implementations of that, you just dump a message in a queue, and somewhere else you take it out,
that's the basic concept, very useful in distributed systems, and as I mentioned, there are many open source projects that this is kind of the same thing, but if you want to have that scalable, if you want to have that in a high availability, you will find out it's not so simple anymore
to set it up, and actually it also can be quite costly if you have to distribute the service and stuff, so with Boto, it's quite easy to do that, and let me go to the IPython notebook, again, I'll try to make it a little bigger,
so this time we use the SQS module out of Boto, we create a connection, this is always per region, so we go to the European region from AWS, there we use the service,
we create a queue which we label EuroPython14, we set the timeout, I'll come to that in a bit, and wait, just let me execute that, first one, second one, and in the next block,
we actually add a message, so we import the message class, we instantiate one, we set the body to "message in the queue", and we write that to the queue we created before,
so now let's assume we are somewhere else on a totally different system, we create a remote queue, we get all the messages, we print the message body, and we print the queue count, so I execute it again, I mean I tested that before,
that's why I started there, yeah, of course we get the message, "message in the queue", and we also get a queue count of zero, so now we wait a little, actually this timeout, and we see again how many messages we have in this queue,
and big surprise, now it's one, so the idea there is that if for whatever reason your service that was actually dealing with the message and receiving I don't know a chunk of JSON and doing something that failed for whatever reason, crashed, then you do not actually want your message to be gone forever,
you want actually the service, once it has dealt with the message, to delete it, and that's actually the last block, so you just say, again you get the messages, you iterate over them and you delete them, and you get True for one deleted message,
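The SQS round trip just walked through — create a queue, write a message, read it somewhere else, and delete it only after successful processing — might look roughly like this with boto. The region and queue name are examples, and the code assumes configured AWS credentials.

```python
import boto.sqs
from boto.sqs.message import Message

# Connect to the EU (Ireland) region; SQS connections are per region.
conn = boto.sqs.connect_to_region('eu-west-1')
queue = conn.create_queue('EuroPython14', visibility_timeout=30)

# Producer side: put a message into the queue.
msg = Message()
msg.set_body('message in the queue')
queue.write(msg)

# Consumer side, possibly on a completely different machine.
remote = conn.get_queue('EuroPython14')
for received in remote.get_messages():
    print(received.get_body())
    # Delete after successful processing; otherwise the message
    # becomes visible again once the visibility timeout expires.
    remote.delete_message(received)
```
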
and next time it's empty, that was example two SQS, now let's launch a virtual instance,
that brings me to the next IPython notebook, again this time we import the EC2 module out of Boto, let's actually do it, so that you also see what kind of help Boto is, so that you know you do not have to generate all this API XML stuff and URLs yourself,
this time we enable logging, so all I basically do is import logging and set the log level to debug, which Boto will pick up, I again create a connection,
now I have debugging on, and Boto actually tells me that it found the config, I didn't touch that, there are several ways to put in your AWS keys, and in case some of that should show up, I have temporary keys for this presentation,
so in case you want to reuse them, don't try, so, and actually all it takes to actually run this instance now is this command, or this line, I have to explain this parameter,
the first parameter is the actual image we want to launch, so the term there is AMI, Amazon Machine Image, and that really defines what you actually get, if you get a Linux system, what Linux system, what's being installed there, so as I mentioned earlier, oh, did I forget it, probably I forgot it,
so we actually have our own Linux distribution where we make sure that it runs best on AWS, it's called Amazon Linux, but you also can have Red Hat, SUSE, Debian, or whatnot. So, the thing you have to notice is,
that's actually all the stuff Boto now generates for us to launch it, so you will have the, I come to availability zones in a bit, but we have a kernel ID, and security groups and all that, the architecture, root devices,
we don't really have the time to get into that, but the thing now is since that thing is launched in really a couple of seconds, and then actually the system boots, and that is kind of the handoff from us to you as the customer,
when we do not touch it anymore, and when we don't really know if your instance really boots up or not, so there is a service for that to monitor that of course, but it's not default, so what you get back in that case is a so-called reservation ID, which you can use actually then later on to see if your instance is turned to the state running,
and which instance ID it got, so every virtual host or instance, same thing, gets an ID of course, that you can find it back, and that is what will change here now, just logging really,
right, we can, yeah, that's why I don't do live demos, but then I show it in the slides. In that example here I actually get back four of those instances, and yeah, and once I have the object for this instance, I have a couple of methods, like I want to have the public DNS name,
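A minimal sketch of the single-instance launch just described, using boto with configured credentials; the AMI ID is a placeholder you would replace with a current Amazon Linux image ID for your region.

```python
import time
import logging

import boto.ec2

# Debug logging makes boto show the API requests it generates for us.
logging.basicConfig(level=logging.DEBUG)

conn = boto.ec2.connect_to_region('eu-west-1')

# 'ami-xxxxxxxx' is a placeholder AMI ID, not a real image.
reservation = conn.run_instances('ami-xxxxxxxx')
instance = reservation.instances[0]

# Poll until the instance reaches the "running" state.
while instance.update() != 'running':
    time.sleep(5)

print(instance.id, instance.public_dns_name)

# Instances can later be terminated through the same object.
instance.terminate()
```
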
as you see later on, I also can terminate those instances, then check them or whatever, but for this,
before I actually can start with the next example, I have to introduce a few concepts, the so-called virtual private cloud, which roughly you could say is a LAN, but just in the cloud, so I already talked about regions,
so a region is really data centers at geographical point, that is divided into so-called availability zones, which means if you want to have a available service, you will launch in different availability zones, and if one goes down,
then you are at least ensured that the other one will still be up and running, that's the idea behind that, and then you want to have, at least in a VPC, you want to have your own network in there, you do not want to see traffic from our management
or from other customers or whatever, and therefore you basically launch with private IP addresses, you have subnets in there, the subnets are per availability zone, and if you want to have those instances exposed to the internet,
you can always attach a public IP address, and they're visible again, or you can go through this internet gateway and route and there are different things like load balancers and stuff you can use here. It's important for what I want to do now, so this example actually shows
how to just launch 10 of those hosts, instances, install distcc and all the stuff you need to build a Linux kernel, set up distcc, and then distcc has a functionality where it actually can broadcast
and find other nodes, and then you can compile. I'm afraid we do not have the time to really show that, because the launches and all that takes a little. So let me just show you quickly, oh great, now Firefox doesn't want to work,
I'll have it back up. What you basically do is, that here is important, this time you say I don't want to have just one instance, I want to have 10. That's just a thing. You say explicitly which instance type you want to have, so c3.xlarge has a little bit of compute power.
You have to give a subnet, I talked about it earlier. We want to have monitoring this time, and yeah, well, that's done, you get a couple of instances, I used Fabric to SSH into them, install the stuff.
Start distcc, again, done with Fabric, I actually kick off the compile, and after a little less than two minutes, the whole thing was over, and I actually can shut down everything and done.
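Launching the ten-node compile cluster described above might look like this in boto; the AMI and subnet IDs are placeholders for your own setup, and provisioning the nodes afterwards (done with Fabric in the talk) is not shown.

```python
import boto.ec2

conn = boto.ec2.connect_to_region('eu-west-1')

# Launch ten c3.xlarge instances into a VPC subnet, with detailed
# CloudWatch monitoring enabled; IDs below are placeholders.
reservation = conn.run_instances(
    'ami-xxxxxxxx',
    min_count=10,
    max_count=10,
    instance_type='c3.xlarge',
    subnet_id='subnet-xxxxxxxx',
    monitoring_enabled=True,
)

print([instance.id for instance in reservation.instances])
```
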
I very quickly skimmed through the last example, where you could say, well, but maybe 10 instances is a little too much, so I want to have this more flexible. Let's say you have a compile service or something. The key there is another tool called Autoscale.
What it basically is, you have a so-called launch configuration, where you define the instance size of your virtual host, which AMI you want to use and all that. You need a so-called autoscaling group,
which defines the availability zones, for instance, the minimal size of your cluster, the maximal size, and yeah, the launch configuration with which that is all started. When you kick this off, instances are being launched,
and that is exactly what you see on the bottom of the slide here, with the get-activities method. Then you have to have a scaling policy for scaling up and down. You have to define triggers, or alarms,
for that, and that's done here with those alarms. You give it a threshold for CPU utilization, which has to trigger for a certain amount of time, so that's twice 60 seconds, and if this triggers, then it actually scales up by,
that's this parameter up here, by one instance, and this would go on and on until you reach the eight instances. All right, and well, to shut this whole thing down again, three lines of Python. Well, I'm through my slides.
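The autoscaling setup sketched on the slides — launch configuration, autoscaling group, scaling policy, and a CloudWatch alarm that triggers it — might look roughly like this in boto. All names, the AMI ID, and the thresholds are illustrative assumptions, not the talk's exact values.

```python
import boto.ec2.autoscale
import boto.ec2.cloudwatch
from boto.ec2.autoscale import (AutoScalingGroup, LaunchConfiguration,
                                ScalingPolicy)
from boto.ec2.cloudwatch import MetricAlarm

asg_conn = boto.ec2.autoscale.connect_to_region('eu-west-1')

# Launch configuration: which AMI and instance size to use (placeholders).
lc = LaunchConfiguration(name='compile-lc', image_id='ami-xxxxxxxx',
                         instance_type='c3.xlarge')
asg_conn.create_launch_configuration(lc)

# Autoscaling group: availability zones plus the cluster's min/max size.
group = AutoScalingGroup(group_name='compile-asg',
                         availability_zones=['eu-west-1a', 'eu-west-1b'],
                         launch_config=lc, min_size=1, max_size=8)
asg_conn.create_auto_scaling_group(group)

# Scaling policy: add one instance whenever it fires.
scale_up = ScalingPolicy(name='scale-up', adjustment_type='ChangeInCapacity',
                         as_name='compile-asg', scaling_adjustment=1,
                         cooldown=120)
asg_conn.create_scaling_policy(scale_up)
policy = asg_conn.get_all_policies(as_group='compile-asg',
                                   policy_names=['scale-up'])[0]

# Alarm: average CPU above 80% for two 60-second periods triggers the policy.
cw = boto.ec2.cloudwatch.connect_to_region('eu-west-1')
alarm = MetricAlarm(name='cpu-high', namespace='AWS/EC2',
                    metric='CPUUtilization', statistic='Average',
                    comparison='>', threshold=80,
                    period=60, evaluation_periods=2,
                    alarm_actions=[policy.policy_arn],
                    dimensions={'AutoScalingGroupName': 'compile-asg'})
cw.create_alarm(alarm)
```

Scaling down works the same way with a second policy (negative `scaling_adjustment`) and a low-CPU alarm; deleting the alarm, group, and launch configuration shuts the whole thing down again.
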
Actually, for all the services you see here, Boto is the API, or is the tool to use. Thank you. We got, thank you, thank you a lot, Frank.
We got time for one very quick question. Okay, then I give you, I give you the answer. May I ask a quick question? How long would it take me to set up a service running on Boto from scratch?
Well, it's really, well, you have to create an AWS account, which means some verification that you are who you claim to be, and from there, you get a key, so you have to configure Boto, that's just two strings, or two keys. You can give them as shell variables or whatever,
and then, yeah, you use one of those lines here, and you're up and running. Okay, thank you a lot. Let's thank the speaker again and prepare for the next talk. Thank you very much.