Using python, LXC and linux to create a mass VM hosting, managed by django and angularjs
Formal Metadata
Part Number | 8
Number of Parts | 119
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/20044 (DOI)
Production Place | Berlin
EuroPython 2014, 8 / 119
Transcript: English(auto-generated)
00:15
Hi everyone, thanks for being here. I'll talk about using Python, LXC (which is Linux Containers) and Linux
00:23
to create a mass VM hosting managed by Django and AngularJS. We have a schedule: part one is my part, back end and architecture, and part two is Oliver's part, the front end in Django and AngularJS. Now, first part, back end and architecture.
00:42
First about me: I am Daniel Kraft, from D90, and my Twitter account is wam-dam-dam. I have been doing computers since 1985 and have been online since 1987, obviously not on the internet at that time.
01:01
Who are we? We are creating a service for preconfigured, ready-to-run virtual servers with root access for many open source web apps. Think of it like a one-click installer. All this is hosted in Germany, by the way with 100% renewable energy.
01:22
And this is what it looks like. You basically choose a template, like Django in this example, choose a version of it, give it a name, and click on add container. Is this viewable? Yes. Now you have this container in red because it's turned off.
01:42
Then you turn it on and it becomes green, then you click on the HTTP URL under "reachable at", and there's a Django. And here's how it works. We have two parts in this architecture. The back end is called CON, short for container management.
02:01
The front end is called site. These are just our names. CON has two modes. In the first mode it runs as a daemon; then it is an XML-RPC server. Otherwise, it's an XML-RPC client to its own daemon.
02:21
So basically you start it as a server, as a daemon, first, and then you can use it like a console script which connects to its own XML-RPC server. This is what it looks like if you call it not as a daemon but as a shell script. I don't know if you can read that; that's not so important. The point is that this is everything you need to manage virtual machines on a host.
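The two modes can be sketched with Python's standard XML-RPC modules. This is only an illustration under assumptions: `ContainerManager`, its methods, and the helper names `serve`, `call` and `demo` are hypothetical and not taken from CON itself.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


class ContainerManager:
    """Hypothetical stand-in for the methods the daemon exposes."""

    def __init__(self):
        self.containers = {}

    def create(self, name, template):
        self.containers[name] = {"template": template, "running": False}
        return True

    def start(self, name):
        self.containers[name]["running"] = True
        return True

    def list(self):
        return sorted(self.containers)


def serve(host="127.0.0.1", port=0):
    """Daemon mode: expose a ContainerManager over XML-RPC (port 0 = any free port)."""
    server = SimpleXMLRPCServer((host, port), allow_none=True, logRequests=False)
    server.register_instance(ContainerManager())
    return server


def call(url, method, *args):
    """Client mode: forward a single command to the running daemon."""
    with ServerProxy(url) as proxy:
        return getattr(proxy, method)(*args)


def demo():
    """Start a daemon in a background thread, then talk to it as a client."""
    server = serve()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d" % server.server_address[1]
    call(url, "create", "demo", "django")
    call(url, "start", "demo")
    names = call(url, "list")
    server.shutdown()
    server.server_close()
    return names
```

Calling `demo()` runs both halves in one process; in a real setup the daemon and the console script would be separate invocations of the same program.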
02:46
You can, I can't read it here, I have to look there. You can build (I'll show an example shortly), remove templates, create containers, duplicate containers, start and stop them, and so on.
03:01
So CON calls itself; it eats its own dog food because it calls its own XML-RPC methods. Just as the site does, as you'll see shortly, others can call it too. It contains everything needed to work with virtual machines.
03:24
So a very important point is that CON works completely without site. That means we can use the server part, test it individually and run it locally, all without any user management or anything. It is just a virtualization layer.
03:44
The site, on the other hand, is based on Django and AngularJS and calls CON, of which there can be many, via XML-RPC. It does accounting and payment, it creates PDF invoices, it manages user accounts and registration, and so on.
04:01
And also the site works without CON. This is also important to test it and to run it locally. Of course, when CON doesn't run or isn't available, you won't see any containers. Now back to CON. We re-implemented an existing solution for repeatable builds.
04:22
It looks like that, maybe someone knows what that is. Yeah, a Dockerfile more or less. So we essentially do the same. This is because of the history of CON. We first ran on Docker, but Docker didn't have the features we required.
04:43
So we added the features and ended up with about 80% of our code on top of Docker and just 20% managing Docker. At some point we threw away the Docker part and re-implemented that 20% ourselves, and this is part of it. So this is a very, very simple language that essentially starts up an LXC container
05:05
and runs a command inside it, like apt-get update, and closes that container again. For the next command, it creates a new snapshot of this template, which has now been configured with that command, and runs, for example, apt-get upgrade -y.
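The one-snapshot-per-command idea can be sketched in a few lines of Python. This is a simulation under assumptions, not CON's code: the names `snapshot_name` and `build`, the `run` hook, and in particular the reuse of already-built steps by hashing are all hypothetical.

```python
import hashlib


def snapshot_name(parent, command):
    """Derive a stable name for the snapshot that `command` produces on `parent`."""
    digest = hashlib.sha1((parent + "\n" + command).encode()).hexdigest()
    return digest[:12]


def build(base, commands, existing, run):
    """Apply each command on top of a snapshot of the previous step.

    `existing` maps snapshot name -> parent name and doubles as a build cache.
    `run(parent, name, command)` is a caller-supplied hook that would clone
    `parent` into `name`, start the container, run `command` and stop it again.
    """
    parent = base
    for command in commands:
        name = snapshot_name(parent, command)
        if name not in existing:  # assumption: steps built earlier are reused
            run(parent, name, command)
            existing[name] = parent
        parent = name
    return parent  # the last snapshot is the final, usable template
```

Two build files that start with the same commands then share the first snapshots and only diverge where their lines differ, which is what produces a tree of templates.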
05:25
This results in a large template tree, because each line you see here is a snapshot of one command you saw earlier. So one file, we actually call it a CON file,
05:41
one line in this file results in one line here in the hierarchy. And the longer lines are the final templates that you can use in TUBOOP. So CON is using LXC for virtualization, Shell In A Box for the web console,
06:00
iptables for network accounting, all Linux tools. Here are some rules for that, if you didn't know. And it's using a lot of the cgroup magic from the Linux kernel for accounting, like the cpuacct group, where cpuacct.usage counts the nanoseconds per second
06:24
which are used on the CPU. And the same for memory, which gives RSS, active and inactive memory, file memory, caches and so on. It's using AUFS for storage. This is a layered file system. You can mount nearly any number of ordinary directories
06:46
on top of one another and they derive from each other. That means if you have a file A in the lower directory and a file B in the higher directory and you mount both, then you see file A and file B.
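The visible result of stacking directories can be modelled with plain dictionaries. This is only a toy model of the lookup rule (higher layers shadow lower ones), with hypothetical names; it does not touch a real file system and ignores deleted files.

```python
def merged_view(layers):
    """Return the files visible when `layers` are stacked, lowest layer first.

    Each layer is a dict of path -> content; a later (higher) layer
    shadows the same path in the layers below it.
    """
    view = {}
    for layer in layers:
        view.update(layer)
    return view


lower = {"A": "from the template"}
upper = {"B": "written in the container"}
print(sorted(merged_view([lower, upper])))  # ['A', 'B']
```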
07:02
And they do some magic with deleted files and so on. This is a very stable solution. So that leads me to the failures we had. Yes? Many. Let's talk about Btrfs. We first chose Btrfs instead of AUFS because it's fast.
07:24
It's even fast for millions of files. At first it worked very well. It has writable snapshots. That means you can at any point take any subvolume in a Btrfs, make a snapshot of it, write to both, and they diverge.
07:41
It has live quota with subvolumes. That means you have at any point the disk usage of this subvolume, both in sum with all the snapshots below it and as just the difference from where it was snapshotted.
08:01
And it has instant creation of snapshots. It's like a tenth of a second. But maybe you have seen that. This is the Linux IO stack or a diagram of it. You can basically stack anything in Linux on top of another
08:22
like block devices, then a file system, then a file system image, then a partition inside it, and so on, without knowing exactly what that does. It's not needed. We used a device manager for RAID 10. This is on the physical hard drives.
08:42
On top we used LVM, the logical volume manager. On top we used VirtIO; this is the KVM virtual disk I/O layer. On top we used a partition. On top of that, for one partition, we used ext4. On top of that we used an image file, which we mounted as a loopback device, and put Btrfs on it.
09:06
This was a test setup because the image file is quite nice for handling, and for backup you can just turn it off, copy the image file somewhere and run it again. However, then the Btrfs cleaner died.
09:21
Btrfs is a lazy file system. It does what it needs to and cleans up later. That means the Btrfs cleaner has to run to clean up later, and it died during its job. And we lost data. That's not meant to be. There's a thing in the Linux I/O stack that's called barriers.
09:45
This is copied from an article on LWN: in a sense, a barrier forbids the writing of any blocks after the barrier until all blocks written before the barrier are committed to the media. That makes sure that the journal of the file system is consistent.
10:02
Looks like the barrier didn't find its way through these layers. So at some point in this stack, as we saw after debugging it, barriers didn't work. So we tried again with the same basic setup: we used the RAID 10, LVM on top, and VirtIO with KVM, because this is our default setup.
10:22
We didn't want to throw that away. It helps us very much with backups and things like that. So on top of that we used the partition and directly a Btrfs. And hooray! We tried to crash it again. It didn't. So Btrfs looked stable from that point.
10:42
Barrier standing. Well, there's another thing about the Btrfs cleaner. It produces a lot of memory fragmentation. If you have never heard about memory fragmentation: yes, it exists, and Linux has a table of it that you will see when you get a kernel traceback in dmesg.
11:02
And one line of that is a page allocation failure, in this case order 4. The order is the power of two of the block size, counted in pages, that couldn't be allocated. With 4 kilobyte pages, this means that a 64 kilobyte block of contiguous memory wasn't available. This is pretty bad because that's not much.
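The arithmetic behind "order 4 means 64 kilobytes" is short enough to write out. A minimal sketch, assuming the common 4096-byte page size; the function name is made up for illustration.

```python
PAGE_SIZE = 4096  # bytes; assumption: the usual page size on x86


def allocation_size(order, page_size=PAGE_SIZE):
    """Size in bytes of a contiguous block of 2**order pages."""
    return (2 ** order) * page_size


print(allocation_size(4) // 1024)  # 64
```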
11:24
And there's no defragmentation tool in the Linux kernel. If you have this state, it will never run again, except when memory is freed. Okay, so we threw Btrfs away and used AUFS, which is a bit slower but much more stable
11:43
and we are happy with it. Next failure: XML-RPC. First we used ZeroRPC, a really excellent tool. It's pretty fast and has good serialization. You can basically just fire off messages, they will arrive somewhere,
12:03
and it's a lot faster than XML-RPC. But it was leaking file descriptors when not using gevent, which we can't because we are currently bound to threads. So then we used XML-RPC. It was very nice, but we were a little bit blue-eyed, as we say in German.
12:26
We used bytes for anything that was transferred, like for memory usage, network traffic, disk space and so on. But there's a 2 to the power of 31 limit in XML-RPC.
12:48
And we couldn't use bytes anymore, so we had to serialize all large numbers to strings or move to megabytes where possible.
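The limit is easy to reproduce with Python's standard `xmlrpc.client`, which refuses to marshal integers outside the signed 32-bit range. The helpers `fits` and `encode_count` below are hypothetical names for the string workaround described above, not code from the project.

```python
import xmlrpc.client

XMLRPC_MAX_INT = 2**31 - 1
XMLRPC_MIN_INT = -2**31


def fits(value):
    """True if xmlrpc.client can marshal `value` as a parameter."""
    try:
        xmlrpc.client.dumps((value,))
        return True
    except OverflowError:  # raised for ints outside the 32-bit range
        return False


def encode_count(value):
    """Pass small counters through; send large ones as decimal strings."""
    if XMLRPC_MIN_INT <= value <= XMLRPC_MAX_INT:
        return value
    return str(value)
```

The receiving side then has to call `int()` on such fields, so the convention must be agreed on by both ends.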
13:01
It's running for now, until we hit the 4 gigabyte limit again and have problems with megabytes too. Okay, that's it for my part. I would have a lot more failures, but time is running and I'll hand over to Oliver. For questions, I'll be there directly after his part.
13:28
Hi, I'm working on the front end of 2Woop. My name is Oliver Rock. I chose Django and AngularJS to get the web UI started pretty fast.
13:42
I use Django for user accounting, invoicing and management, and as a mediator from Daniel's XML-RPC API to a JSON API I can digest with AngularJS. First of all, Django is using CSRF protection for all of your views
14:01
if you activate the middleware. So we have to tell AngularJS to take the token from the cookie and send it with every asynchronous request. The next problem: AngularJS templates will collide with the Django template language
14:21
because they both use double curly braces, so we have to tell Angular to use, for example, a curly brace and a dollar sign. For internationalization, Django uses PO files, pretty much like Zope or Plone do,
14:42
and to have a consistent state between the Django views and the JavaScript views, you can use the Django view django.views.i18n.javascript_catalog, which takes the PO files and generates a JavaScript file you can include in your site, and you have a function like gettext for internationalization.
15:02
You wouldn't use document.write in AngularJS; that's just an example. Next, we have a lot of requests that depend on user permissions, so we have a permission denied exception that is delivered by Django, but the standard $http service of AngularJS doesn't handle 403,
15:25
so you have to create a custom interceptor for that. So we have a factory, a permission denied interceptor. You can handle request, request error, response and response error; in this case it's a response error 403, so we set the location to a slash;
15:41
it's the front page and the registration and login page. Another good product is for double-entry bookkeeping: it's Django account balances. You have a full audit trail: there is always a debit entry for a credit entry and a credit entry for a debit entry, so you won't lose any money.
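The double-entry idea can be sketched independently of any library. The `Ledger` class below is a hypothetical illustration, not the API of the Django package mentioned in the talk: every transfer writes one debit and one credit, so the sum over all entries stays zero.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class Ledger:
    # Each entry is (account, signed amount, authorising user).
    entries: list = field(default_factory=list)

    def transfer(self, source, destination, amount, authorised_by):
        """Record a debit on `source` and a matching credit on `destination`."""
        amount = Decimal(amount)
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.entries.append((source, -amount, authorised_by))
        self.entries.append((destination, amount, authorised_by))

    def balance(self, account):
        """Sum of all entries booked on `account`."""
        return sum((a for acc, a, _ in self.entries if acc == account), Decimal(0))

    def total(self):
        """Sum over all accounts; stays zero if no entry was lost."""
        return sum((a for _, a, _ in self.entries), Decimal(0))
```

Because entries are only ever appended in pairs, a non-zero `total()` would point to a missing or corrupted entry.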
16:01
It's pretty simple. You have to define a source (that's our bank account), the destination account (the user), the amount, and the user that is privileged to transfer the money. The most important thing is keeping the DOM. On the front page of 2Woop you have a list of your running containers
16:21
that is updated every two or three seconds. If you forget to track by an ID, AngularJS will replace the whole DOM every two or three seconds, so you won't be able to interact with your containers, because just as you click, the whole DOM is gone and replaced by another one.
16:41
Yeah, it's all pretty much simple but it's due to the fact that Django and Angular are simple to program. Any questions? Thank you Daniel, thank you Oliver.
17:02
Any questions, come to the microphones and we'll ask you in turn. So let's start with this chap here. Which version of Btrfs did you use for the tests? We were starting with Ubuntu 13.10
17:23
and were testing it again on 14.04. I think it's one point something in the latest version, where the Btrfs cleaner also did these memory fragmentation things.
17:43
You might be the wrong person to ask this but if you're using JSON for the front end why use XML RPC in the backend? It was a decision for development speed. The XML RPC module in Python is well tested
18:00
and really complete and all you have to do is derive from the XML RPC class and it makes the server automatically. You have no effort with it. You just define methods that can be called from outside. This is just for development simplicity.
18:20
So you switched away from Docker because features were missing. Docker is still in current development and do you think they will catch up with the features you need soon? It wasn't just about features. It was inconsistency too. There were a lot of accounting things
18:40
that Docker returned when calling it that were inconsistent in themselves, and we had to work around that. I don't exactly remember what that was. We had to do all the cgroups magic I was briefly talking about ourselves, and much more.
19:00
We included all that, and network accounting like the iptables stuff and so on, completely in our own product, and that was most of the code. The pure LXC virtualization isn't much. I think Docker is improving on monitoring and instrumentation. Another question.
19:21
You have Django and Angular. The other guy. You now have Django and Angular. Does your site support something like progressive enhancement, or how do you handle that? Like if you go to a URL directly, is that still working? Sorry, I didn't get it. Progressive enhancement, do you use it?
19:41
What's that? So when you load your site and then you don't expect any JavaScript to be running and the site still works. Graceful degradation. Non-client rendering, no, no. We've got lots of time for questions so come up to the microphone.
20:00
I have a question. Is your container format compatible with Docker 1? Because if you made a so-called fork of Docker and if I want to use your hosting infrastructure
20:20
and didn't want vendor lock-in, can I move to the Docker hub or something? So your question is whether we have a compatible layout of directories. Yes. No, we don't. You could manually copy the things around, and I think it would work, and write some configuration files for Docker
20:41
for each container but you can't use it directly in Docker. Okay. Do you do any kind of network isolation between containers of the same customer and how do you do that?
21:02
We have network isolation. All containers have private IP addresses, and we only forward the ports that you configure yourself for each container, so it's like a web firewall management thing, but we don't support private networks between containers right now.
21:22
We are here to learn, and we already implemented a few things we heard about from people we talked to downstairs, and one of these things is private networking. We heard a reasonable use case for it now,
21:40
because our aim isn't to orchestrate applications together but to have one container that contains anything you need like a Postgres and a Django and whatever. We have now a good use case where private networking is needed and I think it will come in a short time.
22:01
Hey, congratulations on a really cool project. Do you have any plans for an API at the moment? Can I script these containers? Not yet in a very beautiful way. You can inspect what the site does with the browser and use that, but it's still based on session cookies
22:21
and stuff like that. You can of course use that but we are making that a lot more beautiful and documented especially. Yes. How long did it take to reach this point? We started in August last year.
22:46
I think that's it. Okay, any final questions? We have got time if anyone has a final question. Otherwise, put your hands together and thank Daniel and Oliver for a very interesting talk.