Cloud Native GIS
Formal Metadata
Number of Parts | 52
License | CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Identifiers | 10.5446/44691 (DOI)
FOSS4G SotM Oceania 2019, 39 / 52
Transcript: English (auto-generated)
00:01
Hello everyone. Today we're going to talk a little bit about deploying GIS applications on cloud native stacks, in this case Kubernetes. To begin with, who am I? I was introduced as a sysadmin; I've kind of moved into this DevOps role.
00:23
I have no formal GIS training. I sort of got asked to do this project and had no idea what a coordinate system was. I had my first experience of changing from one coordinate system to another about three weeks ago, so yeah, that's about the level of knowledge.
00:44
So, just a small bit of history about how we normally build our technological stacks. We tend to have these quote-unquote pet VMs, where there's a lot of manual process put into building them. People end up becoming owners of them, so when something goes wrong with a particular server,
01:04
someone else will say, "Get the XYZ person," and it's their problem, even though realistically we should have everybody know how to work with most of our systems. We are slowly moving towards DevOps and cloud native. The
01:23
problem there is, as with any large company (we're approximately 350 employees), there's a lot of legacy systems to consider and a lot of people who are very hesitant to change, because people don't like what they don't know.
01:43
We're also normally deploying things with more traditional tools, so Puppet, Ansible, system configuration, and we tend to have traditional ops in one team and developers in another. Operations say things about, you know, developers wanting to throw something over the wall that they shouldn't, or
02:04
developers complain that operations are being a little too strict on what they're trying to do. So, just to compare some more traditional approaches that you will see, there's the all-in-one virtual machine.
02:20
I think someone spoke about those last night in the lightning talks. The problem with those is you tend to just install things on there. There's not usually a lot of configuration management. They're difficult to scale up to multiple machines, because you might be running four or five applications on a single machine.
02:42
The alternative to that is the handcrafted pet VMs I just mentioned. When we create one of those, there's still a lot of manual configuration to install a piece of configuration software such as Ansible or Puppet. We still have to think about how many resources we're going to allocate, and that's a very static value.
03:05
It's very difficult to change. Like I said, we treat them like pets; they have owners. People tend to be very scared to work on a system that someone else built. While they don't have the scaling issue, it's still a time-consuming task to stand up a new VM, set it up,
03:26
set up your configuration management and so forth. So with these approaches, nothing is wrong. For most companies these tend to work, most
03:41
organizations. As I mentioned, they lack flexibility and... God, something's happened to the font. Sorry, about 20 minutes before this presentation I had to convert this from a web-based
04:02
presentation to a PDF file. Yeah, I clearly didn't read the email properly. Anyway, the configuration tends to be tightly coupled to that project, meaning that any
04:20
gains from trying to have a generalized approach don't always work out, and we also tend to have a lot of deviation between environments. So the machine built on a developer's laptop may be an all-in-one Vagrant, you know, Linux virtual machine box,
04:46
whereas the development stack, sorry, the production stack is probably seven or eight different boxes, each with their own role in a singular application. And obviously, who's heard the phrase "it worked on my machine"?
05:05
We've been thinking about getting a swear jar for that one. So, enter cloud native. These aren't everything that defines cloud native; these are just a handful that have been picked from the Cloud Native Computing Foundation's list.
05:25
If you're not aware of who the Cloud Native Computing Foundation are, they're a subset of the Linux Foundation that has been brought together to handle projects like Kubernetes and Prometheus, which is a Kubernetes
05:41
native monitoring solution. But anyway, yeah, they're containerized, so as in Docker, which I imagine by now most people are aware of. Orchestration means you don't worry about where your application lives. You let the software, in this case Kubernetes,
06:04
handle where your application lives in your infrastructure. You just worry that the application is in there. And I put brackets around "micro" because it's more service oriented. You don't look at an application as
06:21
the whole thing; rather, the application is comprised of services. And flexible, in that you can dynamically scale both your infrastructure and your services. So
06:42
if you need more compute power, it's a really simple one-liner; I can add three servers of compute power. It's highly available, so if one machine goes down the whole thing doesn't die, and that helps with things like disaster recovery.
07:00
It makes better use of resources because we pool the resources and, like I said, the dynamic orchestration will allocate your containers or your applications where there's resources available. And flexible as in portability, in that
07:23
your configuration can be quite easily moved between cloud providers, meaning you aren't necessarily locked into a certain vendor. If you want to have your application on one cloud and your disaster recovery on another one, it's entirely possible.
07:43
So, looking at Kubernetes: it's one example of how to do cloud native. It's right now probably the most popular. There's a few others, such as Docker Swarm, but this one seems to be the one that's taken over.
08:03
The configuration is all simple YAML syntax. I don't know how many people have opened up something like Puppet, looked at the horrifically difficult at times syntax and gone, "I don't want to do that. Give me back, you know, my manually built bash scripts."
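To illustrate what this declarative configuration looks like, here is a minimal sketch of a Kubernetes Deployment manifest built as a plain Python dict and serialised to JSON (Kubernetes accepts JSON as well as YAML). The talk does not show its actual manifests, so the name `geoserver`, the image `example/geoserver:latest` and the replica count are hypothetical values chosen for illustration.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build a minimal Kubernetes Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            # Scaling the service means changing this one number.
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Hypothetical name and image, for illustration only.
manifest = deployment_manifest("geoserver", "example/geoserver:latest", replicas=3)
manifest_json = json.dumps(manifest, indent=2)
```

Because the whole desired state lives in one small document like this, keeping it in version control is one plausible way to make reverting a bad change a matter of re-applying the previous version.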
08:21
Reverting a bad configuration is quite simple because of this, and I've already explained that. So basically how Kubernetes works is, and this paradigm is pretty common amongst cloud native technologies,
08:43
we have master, manager, leader and so on nodes. Each of these boxes represents just a virtual machine. And then we have minions, which basically carry out whatever the master nodes say to do. So the master nodes schedule containers and create and maintain your resources. They ensure that everything is healthy.
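As a rough illustration of what "the master nodes schedule containers" means, the sketch below places containers on whichever worker has the most free CPU. This is a toy model, not the real Kubernetes scheduler (which filters and scores nodes on many more criteria); the `Node` class, the `schedule` function and the node and container names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu_free: float  # free CPU cores on this worker
    containers: list = field(default_factory=list)


def schedule(container: str, cpu_needed: float, nodes: list) -> str:
    """Place a container on the node with the most free CPU that can fit it."""
    candidates = [n for n in nodes if n.cpu_free >= cpu_needed]
    if not candidates:
        raise RuntimeError(f"no node has {cpu_needed} CPU free for {container}")
    target = max(candidates, key=lambda n: n.cpu_free)
    target.cpu_free -= cpu_needed
    target.containers.append(container)
    return target.name


# Two hypothetical workers and three containers needing 1.5 cores each.
workers = [Node("worker-1", 2.0), Node("worker-2", 4.0)]
placements = [schedule(c, 1.5, workers) for c in ("geoserver", "geowebcache", "proxy")]
```

The person deploying never names a machine; they only state what the container needs, and the placement falls out of what is free at the time.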
09:05
Your minion, worker, follower nodes literally just run your application. You don't have to worry about them doing anything; they are just about bringing in compute resource. So how does this help us? Less overheads. I think in the last talk it was a lot about how cloud
09:25
helps us with that. There's overheads in cost: wasted compute time is very expensive. If I have to stand up eight servers when I really only need three, that costs a lot of money. Time: like I said, Puppet is significantly more
09:44
fiddly than containers and YAML syntax. And portability: as I mentioned, when we initially built this, we built it on a local Kubernetes cluster. We dropped it into the cloud and it worked seamlessly. And the other overhead is
10:03
you don't have to fight your IT team, because they set up your cluster and you deploy your application to it. What else have we got? Oh yeah, and it gives you more time to make maps, which I imagine for this crowd will be quite important.
10:22
So in our experience, there is quite a steep learning curve initially. You have to learn a lot of jargon, a lot of verbs, a lot of nouns. I didn't know the difference between a DaemonSet and a Deployment when we started, and I still don't fully.
10:40
There's a lot of nuance to a lot of these things, and it is well documented, but there's also quite a steep learning curve initially. It becomes very easy to make changes once something new is learnt, though. So once you've got a handle on the basics, if you turn around and decide, "I don't like the way we did that,"
11:02
it's not a particularly difficult or time-consuming job to go through and adjust it to the new thing you've learned, as opposed to the past, where state actually matters on your virtual machines and it's quite difficult to go in and revert your state or modify it and so forth.
11:24
And yeah, some applications aren't architected for the cloud. There's nothing wrong with that. It just makes things a little bit more difficult to configure, which we'll go through a bit later on. Well, here, apparently.
11:41
So the first thing we decided to do, which half the internet will tell you you shouldn't, is running a database in Kubernetes. Now, there's a lot of fear mongering around this, because people are scared: scared that they'll lose their data, scared they'll lose their database.
12:02
On the other hand, we tried a couple of things. We settled on a project called KubeDB, by a company called AppsCode, and they make it very simple to set up a master-replica Postgres database. They
12:20
basically take care of all the heavy lifting for you. It's an open source project. You can inspect the source code; you can see what's actually being run in your cluster. Unfortunately, when we first started using KubeDB, it used a Linux distribution called Alpine that is very small but didn't have support for a lot of GIS
12:44
type tools. In particular, I think it was GDAL we had a lot of trouble with. We tried a lot of things, compiling it manually, and didn't really get anywhere, so we ended up going with a Debian base image instead. If it works, it works, right?
13:05
We also had a few challenges around safely exposing the database for external use. We eventually settled on using an external proxy virtual machine. It's not ideal, but it works. It's not quote-unquote cloud native,
13:22
but sometimes security is a little more important than ease of use. So the next challenge we faced was trying to create a scalable GeoServer cluster.
13:40
The GeoServer configuration is entirely stored on disk as files, which poses an interesting challenge, because getting files into your system that aren't in the form of, say, a database can be a somewhat challenging thing to do. And also maintaining those things: you have to actually spin up block storage or
14:05
object storage, connect those things, maintain them. This posed another challenge, in that if you want to scale GeoServer out, every instance of GeoServer you're running has to have access to the same copy of the configuration.
14:23
So basically, in conclusion, it needed to have the same configuration and data, and we had to know when a file on disk has changed. And the PDF thing has broken my presentation again. Our simple solution was we wrote a small microservice that
14:42
piggybacks off of the GeoServer notification module. It goes out, queries the Kubernetes API and says, "How many endpoints do you have that are GeoServers?" and then reloads all of those in turn. And as I mentioned, we ended up using a shared file system. We initially tried using an
15:02
object storage based solution, but found that for certain files and certain parts of configuration it was just a little too slow. And this is the last large challenge we faced.
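The reload microservice described above (ask Kubernetes which endpoints are GeoServers, then reload each in turn) could look roughly like the following sketch. The talk does not show the real code, so everything here is an assumption: the function names are hypothetical, the sample payload only mimics the shape of a Kubernetes Endpoints object, and the HTTP call is injected as a plain callable so the example runs without a cluster. The URL uses GeoServer's REST reload operation under the default `/geoserver` context path; a real call would also need credentials.

```python
from typing import Callable, Iterable


def geoserver_addresses(endpoints: dict) -> list:
    """Collect "ip:port" strings from a Kubernetes Endpoints object (as a dict)."""
    found = []
    for subset in endpoints.get("subsets") or []:
        ports = [p["port"] for p in subset.get("ports") or []]
        for address in subset.get("addresses") or []:
            for port in ports:
                found.append(f"{address['ip']}:{port}")
    return found


def reload_all(addresses: Iterable, post: Callable) -> list:
    """Ask each GeoServer instance, one after another, to reload from disk."""
    reloaded = []
    for addr in addresses:
        # POST /rest/reload makes GeoServer re-read its catalog and configuration.
        post(f"http://{addr}/geoserver/rest/reload")
        reloaded.append(addr)
    return reloaded


# Hypothetical Endpoints payload, shaped like the Kubernetes API response.
sample = {
    "subsets": [
        {
            "addresses": [{"ip": "10.0.0.11"}, {"ip": "10.0.0.12"}],
            "ports": [{"port": 8080}],
        }
    ]
}
calls = []  # stands in for real HTTP requests
done = reload_all(geoserver_addresses(sample), calls.append)
```

Reloading the instances one after another, rather than all at once, means at least some of them should still be serving requests while the others re-read the shared configuration.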
15:27
By default GeoServer ships with a built-in GeoWebCache. For our needs that really didn't suit. We wanted to have a tile cache be a separate service from the WMS server, from
15:42
the Postgres database, so we ended up using the standalone version, pulling all of the GeoWebCache component out of GeoServer and redirecting all of the tile requests, or every WMS request, through GeoWebCache initially,
16:03
which sounds a little insane, but we solved this by adding a pretty simple proxy in front of GeoWebCache and GeoServer that just says: does GeoWebCache actually have that tile? If not, go to GeoServer, get it, return it,
16:22
which is fairly straightforward. And in this we've learned that these free and open source GIS projects are very mature pieces of software. They're very reliable. Some already have integrations, and long term that's something we'd like to change: we would like to
16:41
have these applications less reliant on just S3 and more cloud native. You know, containers and cloud native software take care of a lot of the boring things, the day-to-day management that you don't need to worry about. We haven't had any major incidents that I'm aware of where the whole thing has fallen over,
17:03
short of problems with the cloud provider. And the other thing is, this is a huge ecosystem. You know, Docker and Kubernetes are a very, very small part of a very large ecosystem, and some of these projects move very slowly at times because people lose interest, or,
17:25
you know, we've had a little bit of trouble with that. And if you're interested in knowing any more or want to contact me about this, that's my email address there. Questions?
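The tile proxy described in the talk (check whether GeoWebCache has the tile, otherwise fetch it from GeoServer) boils down to a cache-then-fallback lookup. The sketch below is a minimal illustration under assumptions, not the speaker's implementation: `fetch_tile`, the tile keys and the in-memory stand-ins for the two services are all hypothetical, and whether the real proxy also writes rendered tiles back to the cache is not stated in the talk.

```python
from typing import Callable


def fetch_tile(key: str, cache_lookup: Callable, render: Callable) -> tuple:
    """Return (tile_bytes, source): try the cache first, then the renderer."""
    tile = cache_lookup(key)
    if tile is not None:
        return tile, "cache"
    return render(key), "geoserver"


# Hypothetical in-memory stand-ins for GeoWebCache and GeoServer.
cached_tiles = {"roads/12/4036/2564": b"cached-png"}
first = fetch_tile("roads/12/4036/2564", cached_tiles.get, lambda k: b"rendered-png")
second = fetch_tile("roads/12/1/1", cached_tiles.get, lambda k: b"rendered-png")
```

Keeping the decision in a small separate proxy lets the tile cache and the WMS server stay independent services that can be scaled separately, which matches the one-job-per-container principle mentioned in the questions.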
17:45
That's a really good presentation. GeoWebCache is able to have a shared cache directory. Yes. Did you try that? Yeah, so we're caching directly to object storage.
18:01
At this stage it was more about if we wanted more tile caches available for, say, rendering a bunch of tiles in advance. It's also more about making things as atomic as possible. One of the big principles of containerization is each application should try to do one thing, and we just wanted GeoServer to be our WMS server.
18:23
We wanted a tile cache. They work together. Fair enough. Yeah, cool. Questions from the audience? Dave. So I'm terrified of running a database in Kubernetes. Could you go into a little more around the decision to do that rather than, say, RDS? So,
18:44
first of all, it terrified me myself to begin with. I had many a conversation with one of my colleagues about, should we really be doing this? In particular, there are quite a few quite good projects out there. KubeDB is one. There's another one for Postgres; I think the company name is Crunchy Data.
19:08
These providers do a lot of things around that, so automated object storage backups, and you can also automatically rebuild from a snapshot. For our use case it definitely works. I can see there being use cases where
19:25
potentially having some other type of database service would be much better, but I think as long as you're always backed by physical storage, solid backups and a solid backup plan, you shouldn't be scared. It's not really that far different from just running a Postgres container with a block storage thing behind it.
19:46
Or did you have access to a DB as a service? And that is also true. So we're using Catalyst Cloud and we don't yet have database as a service. So yeah, that did... So you couldn't even if you wanted? Yeah.
20:02
Is Catalyst going to be building a database as a service? I believe so. Stay tuned on when; I don't want to make any promises. For a Kubernetes, Docker-backed one, right? Who knows. Any other questions? Really just a comment: a lot of what you said
20:26
Yeah, I understand. Yeah, I wanted to make this easy. I didn't want to get too bogged down in nitty-gritty details, because that scares people. Yeah.
20:40
Cool. So, Chris. Thanks. Do you care to comment on, I don't mean cost financially, but the cost of effectively making things cloud native? I mean, how hard... There is work involved in that, right? And there is a business decision about whether to do that or not,
21:00
because it's a lot of work, right? Personally, that's not a business decision I make. I mean, I'm just the developer. We do have a team lead, and that isn't me. There is an overhead, especially considering we're very, very new to the space. We've done bits and pieces of GIS before, but nothing like this.
21:21
We're also, sorry, very new to the cloud native space. Our cloud is just getting started with the Kubernetes as a service offering, which is what we're using for this. Yeah, there's overhead in building, like I said, those extra microservices, having to test
21:41
multiple competing solutions, things like that. The splitting GeoServer one with the notifications, you'd count that in hours, like a couple of hours of dev time. The GeoWebCache proxying
22:02
part, that was probably the better part of a week of our time. We're quite a small team; there's only three of us. So yeah, that took quite a bit of our time just to get right, make it reliable, stable, that sort of thing. So yeah, there is a time component. It does get shifted around.
22:22
Cool. All right, we've got to call it there. Thank you very much again, Alistair.