
Stress Testing and Analysing MapServer Performance



Formal Metadata

Title Stress Testing and Analysing MapServer Performance
Title of Series FOSS4G Bonn 2016
Part Number 77
Number of Parts 193
Author Girvin, Seth
License CC Attribution 3.0 Germany:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI 10.5446/20311
Publisher FOSS4G, Open Source Geospatial Foundation (OSGeo)
Release Date 2016
Language English

Content Metadata

Subject Area Computer Science
Abstract After 5 years in production the open source based Pavement Management System for Ireland has amassed over 15 years of road-related data. The back-end mapping engine is powered by MapServer and we are looking to improve performance when dealing with more and more data. The talk will focus on how to set up Locust, an open source Python load testing tool, to automatically get average load times for each WMS and WFS layer from MapServer, and how many users MapServer can handle concurrently. A small open source project is currently being written to help this process. Whilst MapServer is the focus of the talk, any OGC-compliant server can be tested in the same way. The talk will then briefly run through a series of experiments to see how changing various components affects performance. These are: running MapServer on Linux as compared to Windows; using the MapServer native SQL Server driver as compared to the OGR driver; and map file size. Seth Girvin (Compass Informatics)
Keywords Compass Informatics
Good. Well, we're going to continue now with the last session of the afternoon, before we head off this evening, for which I understand we have to be ready to leave. I'd like to introduce Seth Girvin from Compass Informatics, who is going to talk about MapServer performance and stress testing. OK, good afternoon everyone. I'm Seth Girvin,
from Compass Informatics in Dublin, Ireland. I'm a MapServer user rather than one of the developers here, but we rely on all their work to keep the project going, so thanks to them. We've been using MapServer in a system that's been in production for about six years now. So it's
Ireland's pavement management system. Pavement is, in effect, roads. Every local authority in Ireland uses the system to manage their road works, so they can plan them and record where they are done. It also captures things such as accidents and speed limits, referenced against the road network. All the local authorities access this. There's a browser front end, and at the back end there's MapServer, which serves WMS and WFS. Users mostly come in through the browser front end, and they also bring the same data into their desktop GIS for printing and further analysis. MapServer has been one of the stable pieces of the system. Other pieces have changed quite a lot, like the JavaScript, which is rewritten every couple of years, but MapServer has been nice and easy to upgrade through the different versions.
So yes, it has run for about six years. At the beginning there was very little data, just about half a gigabyte of people's road schedules, and in six years we've gone up to 10 gigs of data. Some of the questions that came up this year were from the users: when we had a new release they said that the system seemed slow, that sometimes a map layer wouldn't load any more for them, or that when they were trying to print at a high scale it wasn't fast enough. We're also going to be moving to a new hosting environment, so that it can cater for the increased data and the increased number of people using the system. The systems administrator for the new hosting environment asked us what we needed, and we didn't really have any idea beyond "give us as much as you can", so we couldn't really specify the memory and the number of machines needed. And then from a developer's point of view, looking at the performance, I always wondered which bits we could change in MapServer that might speed up the response times for our users. The whole idea is to make users happy, so that when they click on the map and the map layers, they come back quickly and they're not waiting seconds. So
before we could try to improve the performance we had to measure the usage. We'd done some ad hoc queries on the number of users, and this is an intranet system, so we had a rough idea of how many road engineers and planners were using it. There's also a login, because people connecting to the system have to log in, but we also had the server logs that had been building up through the years. So I took these, and then using some simple Python scripts I imported all the web requests that were made to the whole system. There's a .NET API that does some parts, and MapServer does all the mapping parts. After importing that into a database I had captured a million MapServer requests made by real users. So these weren't fake requests; they were the actions of real people, rather than a benchmark against a public WMS. So once it was in the database it became a lot
easier to see what users were doing, how many users there were, and how many were connected at the same time. In terms of stress testing, we're not looking at 60,000 users; we have about 60 users. But because it's their day job, and they're using it all day to enter data and make reports, it's important that it runs fast. We have about 700 distinct IP addresses, which roughly translates to one user per IP address. They all come in on the Irish government VPN, so we know who's connecting from which local authorities. A few things became apparent when we went through the logs. The critical one was that we have over 100 layers configured, 100 different datasets, so roadworks, speed limits, accidents, and nearly all the users used a handful of layers. I had to change the chart scale to a logarithmic one because the road network layer was used far more than any of the others. So for about 90 of the layers there was not much point in trying to optimise them, because hardly anyone looked at them, and it made sense to optimise just the handful of layers that 95% of the users were looking at. Something else that became apparent was the GetLegendGraphic requests. Every time you zoom to a different zoom level, our front end hits MapServer to ask for new legends, and only a couple of layers' legends change at different scales, so these were wasted requests. We kind of knew it from the JavaScript development, but when we saw the stats of how many requests were made for GetLegendGraphic, it became something that we're going to look at, to try and reduce the number of hits that MapServer is getting. Once we had collected all the data together on the real users, I turned to a library called Locust, which is a Python user load testing tool. You might be more familiar with JMeter; I think there was a workshop on JMeter earlier in the week. But I'm mostly writing stuff in Python.
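The log analysis described above (which request types and which layers real users actually hit) can be sketched in a few lines. This is a minimal illustration, not the scripts used in the talk: the function names and the sample URLs are hypothetical, and it assumes the logged requests are available as URL strings with standard OWS query parameters.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def ows_params(url):
    """Return the query parameters of a URL with upper-cased keys.

    OWS parameter names are case-insensitive, so keys are normalised.
    Only the first value of a repeated parameter is kept.
    """
    query = parse_qs(urlsplit(url).query)
    return {key.upper(): values[0] for key, values in query.items()}


def usage_counts(urls):
    """Count requests per OWS request type and per layer name."""
    by_request = Counter()
    by_layer = Counter()
    for url in urls:
        params = ows_params(url)
        by_request[params.get("REQUEST", "unknown")] += 1
        # GetMap uses LAYERS, GetLegendGraphic uses LAYER, WFS 1.x uses TYPENAME.
        layers = params.get("LAYERS") or params.get("LAYER") or params.get("TYPENAME", "")
        for layer in filter(None, layers.split(",")):
            by_layer[layer] += 1
    return by_request, by_layer


# Hypothetical logged requests, not taken from the real system.
sample = [
    "/mapserv?map=roads.map&SERVICE=WMS&REQUEST=GetMap&LAYERS=roads,speedlimits",
    "/mapserv?map=roads.map&service=WMS&request=GetMap&layers=roads",
    "/mapserv?map=roads.map&SERVICE=WMS&REQUEST=GetLegendGraphic&LAYER=roads",
]
request_counts, layer_counts = usage_counts(sample)
print(request_counts.most_common())
print(layer_counts.most_common())
```

Sorting the layer counter with `most_common()` gives the kind of ranking that shows a handful of layers taking nearly all the traffic.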
Locust is written in Python, so I turned to this rather than the UI and the XML files of JMeter. Locust gives you a few basic components, and then you can write Python code to simulate your users. A couple of other benefits: because lightweight processes are used for each user, rather than a thread for each user, you can have one machine simulating hundreds of users, so one laptop can spawn hundreds of users and hit MapServer. And it's open source as well, which makes it easier to work with. So here's some
Python code. The Locust library comes with a few basic concepts. There's a task set, which is a set of tasks that a user might do. On the Locust page they talk about a user logging in, connecting to a forum and posting something in the forum. I made my own custom OWS task set, which would make a series of WMS and WFS type requests. There are a few things in there that are specific to MapServer, such as checking the content type for errors, and at the end there's a link to a gist with this code, which could be reused for other OGC services.
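The content-type check mentioned here matters because an OGC service can answer a failed request with an XML exception document instead of the expected image or feature data. The talk does not show the check itself, so the following is only a sketch under assumptions: the function name is hypothetical, and it relies on the WMS exception MIME type `application/vnd.ogc.se_xml` and on the `ExceptionReport` element names used in OGC exception documents.

```python
def looks_like_ows_error(content_type, body):
    """Guess whether an OWS response is an exception report rather than data.

    content_type is the Content-Type header value (or None), body is bytes.
    """
    content_type = (content_type or "").lower()
    # WMS service exceptions are commonly returned with this MIME type.
    if "application/vnd.ogc.se_xml" in content_type:
        return True
    if "xml" in content_type:
        # Matches both <ServiceExceptionReport> and <ows:ExceptionReport>.
        head = body[:1000].decode("utf-8", errors="replace")
        return "ExceptionReport" in head
    return False


# Hypothetical responses for illustration.
print(looks_like_ows_error("image/png", b"\x89PNG"))
print(looks_like_ows_error("application/vnd.ogc.se_xml", b"<ServiceExceptionReport/>"))
```

In a load test, a response flagged this way would be recorded as a failure even when the HTTP status code is 200.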
A task set would have a collection of tasks. In this example it's a WFS feature count, so it just gets the number of features in a vector layer. I have a few other ones, such as getting a legend graphic, and GetMap requests, which are the most common ones for the WMS services. Once you've got all your tasks in a task set, you add them to the Locust class, which is kind of a user, although if you want to compare a user to a plague-forming insect... I'm not sure about the choice of names; it's really the client. In this you add the task set, and then you have minimum and maximum wait times. It will choose a task at random from the task set and execute it, and then it will wait a random amount of time between each call. There's also a concept of weightings, so you can give some tasks a higher weighting than others. The feature count in this case has a weighting of 1, but you can give your GetMap requests a weighting of 10, so they are ten times more likely to be called. So I'm trying to use these Locust classes to simulate real user actions. It's not so much benchmarking, where you just want to ramp up the users; it's to try and simulate the real usage pattern and how the system is being used.
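The scheduling behaviour described here (pick a task at random according to its weighting, then wait a random time between a minimum and a maximum) can be illustrated without Locust. The sketch below is plain standard-library Python, not Locust code: the task names, the wait bounds and the `next_action` helper are all assumptions for illustration, with the 1:10 weighting taken from the example in the talk.

```python
import random

# Hypothetical task names; the weights follow the 1:10 example from the talk.
TASK_WEIGHTS = {"wfs_feature_count": 1, "wms_get_map": 10}

# Assumed wait bounds in milliseconds; the talk does not state the values used.
MIN_WAIT_MS = 1000
MAX_WAIT_MS = 5000


def next_action(rng):
    """Pick the next task by weight and a random wait before running it."""
    names = list(TASK_WEIGHTS)
    weights = [TASK_WEIGHTS[name] for name in names]
    task = rng.choices(names, weights=weights, k=1)[0]
    wait_ms = rng.uniform(MIN_WAIT_MS, MAX_WAIT_MS)
    return task, wait_ms


rng = random.Random(42)  # seeded so repeated runs give the same sequence
picks = [next_action(rng)[0] for _ in range(11000)]
ratio = picks.count("wms_get_map") / picks.count("wfs_feature_count")
print(f"GetMap was chosen about {ratio:.1f} times as often as the feature count")
```

Over many picks the ratio settles near 10, which is what a weighting of 10 against 1 means in practice: relative likelihood, not a fixed sequence.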
Once you've set up your class, there's a command-line tool that spins up a Python web server and gives you a web front end, so you can mess around with this; you can also run it all from the command line. You choose the number of users you want to create and the hatch rate, which is how many come online each second. In this example it's 10, but you could create hundreds or thousands of them and send them all at your MapServer. Once that's running you get a kind of live table of the requests coming in, with the mean, median and max of all the response times, and you can categorise them as you want based on your tasks and task names.
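The per-task statistics in that table are simple to reproduce from raw timings, which is useful when comparing runs outside the web front end. This is a small sketch, not Locust's own implementation; the `summarise` function and the sample timings are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def summarise(samples):
    """Group (task_name, response_ms) pairs and compute summary statistics."""
    grouped = defaultdict(list)
    for name, response_ms in samples:
        grouped[name].append(response_ms)
    return {
        name: {
            "count": len(times),
            "mean": mean(times),
            "median": median(times),
            "max": max(times),
        }
        for name, times in grouped.items()
    }


# Hypothetical timings in milliseconds, for illustration only.
timings = [
    ("GetMap", 100),
    ("GetMap", 300),
    ("GetMap", 200),
    ("GetLegendGraphic", 50),
]
for name, stats in summarise(timings).items():
    print(name, stats)
```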
Something that I was interested in was getting the heaviest load on our servers. Google Analytics defines concurrent users as hourly sessions multiplied by average session duration, divided by the number of seconds in an hour. I cheated a bit, because it's a bit more complicated than this, but I just wanted to find the maximum usage of our system. That was this year, on a Monday afternoon after lunch: we had 19 users online, and between them they made almost twelve and a half thousand MapServer requests.
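The concurrent-user estimate quoted above is a one-line calculation. The sketch below simply restates the formula as given in the talk; the function name and the example figures are hypothetical and are not the system's real numbers.

```python
SECONDS_PER_HOUR = 3600


def concurrent_users(hourly_sessions, avg_session_seconds):
    """Estimate concurrent users from sessions per hour and mean session length."""
    return hourly_sessions * avg_session_seconds / SECONDS_PER_HOUR


# Hypothetical figures: 40 sessions in the hour, each lasting 15 minutes on average.
print(concurrent_users(40, 15 * 60))
```

With those example inputs, 40 sessions of 900 seconds give 36,000 session-seconds, which spread over a 3,600-second hour is 10 users online at once on average.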
So from this data I took their actual requests and, again using Python, saved them into pickle files, which is the Python way of caching objects, so basically it's just like creating JSON files of the users' requests. I then modified the Locust class again so that it will pick some of these pickle files to load up and simulate the real users' actions. Then I set it to hit the MapServer instance. With 10 users everything was really quick; with 100 it was the same. When I set it up to a thousand users things started to slow down: responses were still coming back, but they were taking 12 or 13 seconds to get a map image back. Then I upped it to 10,000, and Python came back with an error that you can only create 1,024 sockets. So my conclusion was that MapServer is uncrashable, which is good. Locust does have a way to coordinate several machines at the same time, so you could have several machines each with a thousand users, to get more users hitting the system. As a colleague pointed out, you could use it for a denial-of-service attack, which is pretty much what stress testing is. I ran all these tests on a staging environment, because the Irish government network has been crashing over the past six months, but that's nothing to do with this.
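Storing each real user's request list in its own pickle file, then loading one at random per simulated user, might look like the sketch below. It is an illustration only: the file layout, function names and sample requests are assumptions, and the talk does not describe how its files were organised.

```python
import pickle
import random
import tempfile
from pathlib import Path


def save_user_requests(folder, requests_by_user):
    """Write one pickle file per user, each holding that user's list of URLs."""
    paths = []
    for user, urls in requests_by_user.items():
        path = Path(folder) / f"{user}.pickle"
        with path.open("wb") as handle:
            pickle.dump(urls, handle)
        paths.append(path)
    return paths


def load_random_user(folder, rng=random):
    """Load the request list of one randomly chosen recorded user.

    Only unpickle files you created yourself: pickle can run code on load.
    """
    path = rng.choice(sorted(Path(folder).glob("*.pickle")))
    with path.open("rb") as handle:
        return pickle.load(handle)


# Hypothetical recorded sessions, keyed by an anonymised user id.
recorded = {
    "user_a": ["/mapserv?REQUEST=GetMap&LAYERS=roads"],
    "user_b": ["/mapserv?REQUEST=GetLegendGraphic&LAYER=roads"],
}
with tempfile.TemporaryDirectory() as folder:
    save_user_requests(folder, recorded)
    print(load_random_user(folder))
```

Each simulated user can then replay the loaded list in order, so the mix of requests follows what a real person did rather than a synthetic pattern.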
So I'd set up my locusts with simulated users, and I could then change things in the back end and see if they had any effect on the response times. There are three experiments that I went through in the last couple of weeks, just to see if changing things sped things up and basically would lead to happier users. So the first one,
and something that I'd been interested in for a long time but had never really tested, was whether the map file size, so the MapServer configuration file size, had a noticeable impact on performance. There's a handy website that generates lorem ipsum text, and you can choose how many bytes you want to create. I used this to fill out my map files with lots of comments, just to increase the size. I'm sure comments are quickly parsed by MapServer, and four or five thousand bytes of configuration for a layer would possibly take a lot longer to go through, so this might be a slightly fake test. But I created MapServer map files of different sizes.
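Padding a map file with comments up to a chosen size can also be scripted rather than pasted in from a text generator. The sketch below is an assumption about how that might be done, not the method used in the talk: the `pad_mapfile` helper and the tiny sample map file are hypothetical, and the filler is added as `#` comment lines at the top of the file.

```python
def pad_mapfile(text, target_bytes, filler="lorem ipsum dolor sit amet"):
    """Prepend '#' comment lines until the text is at least target_bytes long."""
    comment = "# " + filler + "\n"
    missing = target_bytes - len(text.encode("utf-8"))
    if missing <= 0:
        return text
    # Ceiling division: enough whole comment lines to cover the shortfall.
    repeats = -(-missing // len(comment.encode("utf-8")))
    return comment * repeats + text


# A tiny hypothetical map file, used only to show the padding.
original = 'MAP\n  NAME "demo"\nEND\n'
padded = pad_mapfile(original, 1024)
print(len(original.encode("utf-8")), "->", len(padded.encode("utf-8")), "bytes")
```

Because only comments are added, the padded file defines exactly the same map, so any change in response time comes from reading and parsing the larger file.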
Then I set the locusts against them, and these were the results. Our current map file size is 218 kilobytes, so we have all 100 layers in one file, which may not be the best way of doing things. I just wondered, if we added another 20 layers, which would be about another 50 kilobytes, would it make a difference? You can see that up to another 100 kilobytes wouldn't really make any noticeable difference to the user. But if you went crazy and went up to map files of one and a half megabytes, then I would say, if you're requesting a lot of tiles at the same time, it would be noticeable. Going the other way, if we shrink down the map file, so if I took our most used layer and put it in its own map file, which would only be 7 kilobytes, then it pretty much halved the median response time for GetMap requests. So this is something that I'll look at implementing: moving our top three layers into their own map files, so that those tiles, which are the most used ones, are more quickly generated. The second
experiment: our back end is using Microsoft SQL Server, and that's what all the local governments use, so that's what we've stuck with. There are two ways of connecting to SQL Server in MapServer. There's the MapServer plugin, which is part of the MapServer project, but it's a separate plugin, and the layer has a plugin set up for it. The other way is to use OGR on the layer, and then you can use the OGR library, which is part of the GDAL/OGR libraries, to connect to SQL Server. We've been using the MapServer one for the past six years, and we had a few issues with date and time filters in the new version of MapServer, and the OGR driver seems to be updated more frequently. I'd assumed that because the OGR one is more generic it would be slower, and I was kind of right,
but it doesn't really make much difference. The critical ones for us are really the GetMap requests; the GetFeature WFS ones aren't that important, as not that many people are using the WFS side of things. The OGR and the SQL Server drivers had pretty much identical response times, so that might make us consider moving to the OGR driver. That brings me to the last experiment,
which was Windows versus Linux. At the moment the Irish government is all Microsoft Windows based, so all the servers are Windows servers. It's changing now: when we started, Linux wasn't even an option, but in the last couple of years there's been a push for more open source software within the Irish government. From forums and things I'd read previously, it was mentioned that MapServer on Linux was running about 10 to 20 times faster than on Windows. These are quite old forum posts; I kind of believed them myself, but I wanted to test it. So we set up a staging environment. On Windows we're still using Windows Server 2008. The Linux one is just an out-of-the-box Apache setup, and Windows is using IIS as the web server. So again I sent the locusts against that to see the response times, and
yes, I wasn't sure whether I should show this slide in case I get booed, but the Windows server was actually quicker for the median response times and the average. The only thing it was slower for was the maximum response times, which most of the time you can ignore, as there could be network issues or something that cause the one maximum. After speaking to a few of the MapServer people, apparently the Windows C runtimes have come on a long way in the last few years, so there might not be that much difference in performance between the two systems. But Linux is still an option in terms of pricing: if we want to get multiple servers, as at the moment it's all run from one server, then it might still be an option for us. I only tested on one layer, as I had to rewrite the map file for Linux and rewrite the layer to use the OGR driver. It's the most commonly used layer, so the most important one, but there could be differences; other setups might be very different. So those are the results anyway, and
so just to sum up: after going through this exercise, I guess the key things I learned were these. You can't really decide on performance until you start monitoring and reviewing it. I see there are a few products now that offer analysis of WMS and WFS usage; we just had the server logs, which were good enough for us, just parsing out the requests and seeing which layers were being used. Optimise only what is necessary. There's not much point in spending all the time on every layer, optimising the SQL and the database calls, if it only affects a small number of the users. You can't assume that because you've read that something should make things faster, it will; if you start measuring it you can actually make better judgements. The other thing is that this kind of approach, stress testing and monitoring performance, tells you what's wrong but it doesn't tell you why. It's more the integration test equivalent rather than unit tests. If you want to find out why a layer is loading slowly, then you need to go down to the database profilers and the MapServer logs. So
here are some appetising photos of some locusts, and there are links to the Locust project, which has had a lot of development this year and is nice and easy to work with. There's also a link to the gist, which is the OWS-specific Locust class. So thanks for your time, and if you have any questions I'll be happy to answer. Thank you. So, do we have any questions? Are you going to start taking other approaches to performance now, looking at your data and at how many vertices you have for line segments, those kinds of things, complex labelling, or how many features you're actually drawing at a time, things like that,
to improve your performance? Yes, so we'll firstly start looking
at our most used layer, and looking at which scales we show things at. With the road network at the moment we're showing everything at every scale, but we could just show major roads at the top level. When we started we didn't have many users; that has been growing, and now it's used everywhere, so it's a nice problem to have. At the beginning we just made MapServer display everything. So there's definitely room for lots of performance improvements, and layer-specific ones. Any other
questions? Excuse me, in my opinion [unclear] this is not stress testing; it is performance testing. Stress tests and performance tests are distinct issues. There are
special frameworks, for example [unclear], a special framework that provides the 90th, 95th and 99th percentiles in a simple table. It also provides a benchmark approach [unclear]. OK, so when I wrote the
abstract I was planning to do stress testing, and then when I found the Locust library it was more about load testing. So yes, trying to crash MapServer was my aim, to find out at what level it would crash for the users we have. So rather than stress testing it is user load testing. I wasn't particularly interested in the benchmarks; I was more interested in the speed coming back to the users. So I couldn't provide the 90th or 95th percentiles, but I didn't really need them; I just wanted to know the speed at which a user gets a map back. The abstract was written before I started working on the project. So also, for example,
bolsters distant and for some instances non-normal tables famous uh can to be a school OK 20 at school OK and so it is with for you yeah it is this the
Locust library does have the requests per second, so you can get those stats, and there's an interface where you can download them for graphs. I was just interested in the response times in the table. Any other questions?
Then thanks to all of this session's speakers, and have a good evening.

