
Beyond grep: Practical Logging and Metrics



Formal Metadata

Title Beyond grep: Practical Logging and Metrics
Title of Series EuroPython 2015
Part Number 172
Number of Parts 173
Author Schlawack, Hynek
License CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
DOI 10.5446/20128
Publisher EuroPython
Release Date 2015
Language English
Production Place Bilbao, Euskadi, Spain

Content Metadata

Subject Area Computer Science
Abstract Hynek Schlawack - Beyond grep: Practical Logging and Metrics Knowing that your application is up and running is great. However in order to make informed decisions about the future, you also need to know in what state your application currently is and how its state is developing over time. This talk combines two topics that are usually discussed separately. However I do believe that they have a lot of overlap and ultimately a similar goal: giving you vital insights about your system in production. We'll have a look at their commonalities, differences, popular tools, and how to apply everything in your own systems while avoiding some common pitfalls.
Keywords EuroPython Conference
EP 2015
EuroPython 2015
Hi everybody, I'm Hynek. You may know me from the Internet: I do quite a bit of writing and coding and whatnot. But I have little time and I'm not that interesting. More interesting is my employer, a very small web hosting company. The reason that's interesting to you is that we are big enough that we need proper metrics and logging systems in place to be able to function, but on the other hand we are small enough that we don't have a team that does this for us. We have to do it on the side, and it is just one part of our work. I think that makes us kind of relatable to you, so you may be able to learn something from it. To make it more convenient for you, I made a page with all the links, all the concepts, and everything I mention here, so just relax and listen.

The agenda is three things, basically. I'm going to talk about errors and how to get notified of them. I'm going to talk about metrics and how to know what the hell is going on in your servers. And I'm going to talk about logging and centralizing it. One question: who is happy with their own logging and metrics infrastructure? Liar. I'm not promising happiness; it's computers. But maybe I can provide you with functional unhappiness, which is nice sometimes. So, errors.
I'm starting with errors right away because you have to deal with them anyway and they are the quickest win to make, so I'm starting with them while everyone is still fully awake. Again I have three expectations of my error notification system. I want to be notified in real time, right away when something happens, but only once. That matters because of what happens to people who just use an exception-to-e-mail handler; I have had my share of e-mails from such a thing. I would also like to have some useful context with my errors, because monitoring may tell you that something is broken, but that is not really helpful if you have no idea what is broken and what's going on. Obviously there's a huge market of solutions.
I'm going to talk about only one of them, which is Sentry. There are a lot of things going for it. Most importantly, it is open-source software and it's written in Python using Django, so if you're deploying Python services you may already know how to run it. If you don't want to do that, there's a paid hosted solution. The plans are pretty affordable, I think, and there's also a free trial and a free plan, so you can be up and running within seconds. So if you don't have any error notification yet, you should try it out.

What you get is instant, useful notifications by e-mail, but also by Slack or wherever you want them. They contain fragments of metadata and, most interestingly, a link to Sentry. One nice touch, I find, is that those e-mails have a Reply-To header set to your whole team. So maybe you're on a train and you see something: you can just hit reply and give them some hints on how to fix it.

The web interface offers much, much more. My favorite feature is just this button. I told you I want to be notified only once. So once you fix this exception, you mark it as resolved, and if it happens again it gets marked as a regression and you get a notification again. So basically it does exactly what you want it to do.
As you can see, there's a lot more going on. There's a lot of metadata, much of it collected automatically. You can think of it like the Django stack-trace view that many people are still serving to their customers, except it's just for you.

So how do you get your data in there? The answer is JSON over HTTP, so you can use it from any language or framework. There are also nicer clients for various languages; the Python one is called raven. It supports multiple transports, which is how the data gets sent to the server (requests and so on), and it also has integrations, which is basically how errors are collected automatically without you doing it explicitly. For example logging: you install a logging handler and every logged exception is forwarded to Sentry without you having to change anything in your code. For Django there's great support, and there are more integrations than fit on the slides.

Let's start simple. Without integrations it looks like this: you instantiate a client using the URL you get from Sentry first, and then you capture the exception. This is how you capture errors and report them to Sentry. A slightly different approach is a decorator, which you may or may not have seen: every exception that happens in the function is forwarded to Sentry. So you don't even have to change your functions; you can just add a decorator to them. The Django integration makes it even easier. I mentioned that Sentry is built on Django, so the authors know a thing or two about Django, and that support is the best as far as I know. You add a single line, you get all of your errors reported, and you can import the client from anywhere.
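The decorator idea described above can be sketched with the standard library alone. This is not raven's API: `capture_exceptions`, `report`, and the payload fields are hypothetical names chosen for illustration, and `report` stands in for whatever would send the JSON to an error tracker over HTTP.

```python
import functools
import json
import traceback


def capture_exceptions(report):
    """Return a decorator that reports exceptions via `report`, then re-raises.

    `report` is a hypothetical callable taking one JSON string; in a real
    setup it would post the payload to the error tracker.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                # Collect some context automatically, then let the error propagate.
                report(json.dumps({
                    "function": func.__name__,
                    "type": type(exc).__name__,
                    "message": str(exc),
                    "stacktrace": traceback.format_exc(),
                }))
                raise
        return wrapper
    return decorator


reported = []  # stand-in for the network: payloads just pile up in a list


@capture_exceptions(reported.append)
def divide(a, b):
    return a / b
```

Calling `divide(1, 0)` still raises `ZeroDivisionError`, but one JSON payload describing the failure ends up in `reported` first. The function body itself did not have to change, which is the point of the decorator approach.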
And then we are already done. Deploy it yourself or give them a few bucks, and you can be up and running within a few minutes for your projects. If you don't have error notifications, I really have to stress that you are missing out: your customers are seeing errors that you are not, and you are losing customers. Get something done. To make it even easier, they were nice enough to issue me a promo code, which is worth 100 bucks, I think. I'm not getting anything out of it. So if you want it, there you go. And now on to metrics.
Metrics are numbers, which makes them time series data, because they are associated with a timestamp. They are basically the difference between guessing and knowing. If you make decisions without facts, you may spend weeks or months building something that's useless or even harmful.

I distinguish between system metrics and application metrics. System metrics are something you observe on the server, like the load or how much traffic is going through. They are very important and should be collected using something like collectd, but they are not really the topic of my talk. Application metrics are something you measure within your app. The simplest metric is a counter: something happens and you increment an integer, which is pretty fast. Then there are timers: you want to know how long something takes, for example how long your requests take. Finally there are gauges, which I find undervalued, because they're really useful: they're just numbers that you want to track. That can be the number of customers online or the number of connections in a connection pool, things like that. There are many more types, but these are the important ones.

So what can you do with metrics? Since they are time series, you can basically plot them, and such a graph gives you a lot of information that bare numbers don't.
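The three metric types described above (counter, timer, gauge) can be sketched in a few lines of standard-library Python. The `Metrics` class and its method names are hypothetical and only meant to show the shape of each type, not any particular library's API.

```python
import time
from contextlib import contextmanager


class Metrics:
    """Tiny in-process registry for the three basic metric types."""

    def __init__(self):
        self.counters = {}  # name -> running total
        self.timings = {}   # name -> list of durations in seconds
        self.gauges = {}    # name -> most recent value

    def incr(self, name, amount=1):
        # Counter: something happened, so bump an integer.
        self.counters[name] = self.counters.get(name, 0) + amount

    @contextmanager
    def timer(self, name):
        # Timer: record how long the wrapped block took.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timings.setdefault(name, []).append(time.perf_counter() - start)

    def gauge(self, name, value):
        # Gauge: just a number you want to track; the last value wins.
        self.gauges[name] = value
```

A usage sketch: `metrics.incr("requests")` on every request, `with metrics.timer("db_query"): ...` around a query, and `metrics.gauge("pool_connections", 7)` whenever the pool size changes.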
A graph shows you, for example, usage development over time. You can tell that you're running at 99 percent capacity every day at 4 pm, and that if you don't do anything, things might fall over when you get a few more customers. You also see trends, so you can tell whether you will have to scale today, tomorrow, next week, or maybe never, because you're losing customers because you don't have proper error notification.

You can also graph several metrics together and correlate them, so you can see requests per second versus latency, which can be useful. And since they're just numbers, you can do math on them. For example, a counter on its own is just a rising line, which is not really interesting, but its rate of change is.

Then there are percentiles. If you have timers, taking the average is not very useful, but some percentiles are very interesting. For example: what is the request time for the slowest 0.01 percent of your customers? What if a few requests take one minute? You wouldn't know from the average. It is like saying that the average human has one ovary and one testicle, which is true, but it's not very useful information. You can make the same mistake with your system's metrics. So unless you know what exponentially decaying reservoirs are, you probably do not want to implement this yourself.

One more thing you can do with metrics is monitoring, of course. You can set a hard limit for acceptable latency and alert when the threshold is exceeded. You can also alert on error rates: if you have a busy application, you always have some kind of errors, but if they go through the roof, something is going on. And you can alert on any kind of anomaly: for example, if the 404s suddenly go away, there's something going on that you should investigate.
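The point about averages versus percentiles can be made concrete with a small worked example. The request times below are invented for illustration, and `percentile` is a plain nearest-rank implementation, not the exponentially decaying reservoir mentioned above.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the data is less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request times in milliseconds: 980 fast requests, 20 very slow ones.
request_times_ms = [100] * 980 + [60_000] * 20

average = sum(request_times_ms) / len(request_times_ms)
print(average)                           # 1298.0
print(percentile(request_times_ms, 50))  # 100
print(percentile(request_times_ms, 99))  # 60000
```

The average of about 1.3 seconds describes no request that actually happened. The median says most users are fine, and the 99th percentile exposes the one-minute requests that the average hides.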
There's actually a whole stack, Kale by Etsy, made just for finding anomalies like that.

So where do metrics live? In a database. We're looking for so-called time series databases, which have various features like special querying. One of the most important ones is rollups of your data, which means keeping various resolutions of your data for different time spans. You probably don't have enough storage to keep one-second resolution of all your metrics for the past years; that might get expensive pretty fast. Old data is usually smoothed out anyway: you want to know what the average load per day was a year ago, but you want to know it very precisely for the past hour.
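A rollup as described above boils down to averaging fine-grained points into coarser time buckets. The following is a minimal sketch of that idea; the `rollup` function and the sample data are illustrative assumptions, not how any particular time series database implements it.

```python
def rollup(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-size time buckets."""
    buckets = {}
    for timestamp, value in points:
        # Align each point to the start of the bucket it falls into.
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical samples taken every 30 seconds, rolled up to one point per minute.
samples = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]
print(rollup(samples, 60))  # [(0, 2.0), (60, 6.0)]
```

Applying the same function with a bucket of one day to year-old data gives the "average load per day" resolution, while recent data stays at full precision.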
I'm going to introduce three of them. The first one is paid and hosted, and it's really, really nice. You can try it out immediately: using collectd, you have an overview of your system in no time. We started with it. The graphs are beautiful and it's a lot of fun to work with.

If you want to host it yourself, the current standard is still Graphite. It consists of a web front end written in Django and a daemon called carbon that you send your metrics to. You can say that it's a widely supported standard nowadays: the carbon protocol is supported by other applications too, just for compatibility.
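Part of why carbon's protocol is so widely supported is that its plaintext form is a single line per data point: metric path, value, and Unix timestamp, separated by spaces. A minimal sketch follows; the function names, the metric path, and the `localhost` default are assumptions for illustration, while 2003 is carbon's default plaintext port.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one data point in carbon's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n".encode("ascii")


def send_to_carbon(lines, host="localhost", port=2003):
    """Send pre-formatted lines to a carbon daemon over TCP.

    Defined for completeness; calling it requires a running carbon instance.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(b"".join(lines))


print(graphite_line("web01.requests.count", 42, 1437000000))
# b'web01.requests.count 42 1437000000\n'
```

Note how the server name ends up inside the metric path (`web01.requests.count`); that is the naming convention the tagging feature of newer databases is meant to replace.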
The thing is that Graphite is a little bit older. The storage configuration, with its rollups and retention limits, is a bit of a pain, and it might not be the prettiest interface you've seen today: it is ExtJS, if you ever had the pleasure to work with it. It's open source, though, so I'm not complaining.

But there is one thing: Grafana, which is really just for building pretty dashboards for Graphite. Once you install it, you will probably lose a few hours, because it's so much fun to play with. Grafana also supports InfluxDB, which is the next-generation time series database, written in Go. It has a company behind it that also offers hosting, so let's hope they don't pull a FoundationDB on us. It is used by Heroku, so although it's not at 1.0 yet, it is in production. It looks better, it's easier to manage storage, and you can tag values, which anyone will appreciate who ever put server names into their metric names, like you saw on the slide before. Now you don't have to: you can put a tag on a value and you're fine. It offers an SQL-like query language for those metrics and a Graphite front end, which means that if you're running Graphite right now, you can just point your tools at it and it should work. It's computers, so it might not work, but it should. So today I would recommend it as the first choice. If you already run Graphite and are functionally unhappy with it, I would not abandon ship so quickly; it's not that big of a deal.
Collecting: how do you get the data into those databases? There are basically two approaches.

Number one: you aggregate externally. Something happens and you send off a UDP packet to StatsD, or a protocol buffer to Riemann. StatsD also comes from Etsy, has a big ecosystem, and is simple to use. Riemann is built by a smart person for smart persons, and it's configured in Clojure, so you probably have to be smart too to use it. The good thing about this approach is that your app has almost no state and it's super simple to set up and use. The bad thing is that you have no direct introspection: you need at least one more service before you can see what metrics are coming out of your system. In the case of StatsD you need two, because StatsD only does aggregation and then forwards to Graphite. With Riemann you at least get some kind of dashboard.

The second approach is that you aggregate your metrics within your application and then deliver them to your metrics database. This approach was popularized by Coda Hale in his talk "Metrics, Metrics Everywhere". If you want to get into metrics, it's super interesting and super funny to watch. This gives you immediate insight into your own application, because you get some kind of dashboard in your application, which is useful both in development and in production. Of course you now have state, and state is bad; state means bugs. But I personally prefer the second approach because it's more practical.

So the question is how to do it in Python. For StatsD there are plenty of clients on PyPI; pick one. You instantiate the client as a global, increment your counters, cross your fingers, and everything's going to be OK. Or not, because if the system is burning, UDP might not be the best way to message your state. The only working solution known to me for aggregating metrics within the app is scales. I mostly use meters, which are for something that happens per second, so basically a derived counter, and the PmfStat, which is a timer and then some.

So how do you use it? For metering you just call mark() on it, and for timing it is a context manager: you do something inside of it and you're done. By doing this alone you get a nice web dashboard out of it. For the meter you get 1-, 5-, and 15-minute averages. Even nicer is what you get out of the timer: percentiles, hassle-free, plus some more nice statistics. You can also get the data as JSON and collect it with collectd. I personally use the Graphite periodic pusher that comes with scales: you just define a period, it sends all the metrics, and you're done.
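For the first approach described above (external aggregation), the wire format is simple enough to show without a client library. A StatsD datagram is `name:value|type`, where the type is `c` for a counter, `ms` for a timer in milliseconds, and `g` for a gauge. The helper names and metric names below are hypothetical; 8125 is StatsD's default UDP port.

```python
import socket


def statsd_packet(name, value, metric_type):
    """Format one StatsD datagram: c = counter, ms = timer, g = gauge."""
    return f"{name}:{value}|{metric_type}".encode("ascii")


def send_statsd(packet, host="127.0.0.1", port=8125):
    """Fire-and-forget over UDP: no reply, no delivery guarantee."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(packet, (host, port))
    finally:
        sock.close()


print(statsd_packet("requests", 1, "c"))          # b'requests:1|c'
print(statsd_packet("request_time", 320, "ms"))   # b'request_time:320|ms'
print(statsd_packet("pool_connections", 7, "g"))  # b'pool_connections:7|g'
```

The fire-and-forget nature of `send_statsd` is exactly the trade-off mentioned in the talk: the app keeps almost no state, but nothing tells you whether the packet arrived.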
That's the metrics part of the story. Now we come to logging. In an ideal world we wouldn't be logging at all, because you want to know about errors, for which you now have Sentry, and you want to know the state of your system, which is what metrics are for. There are people who just refuse to log anything. I personally cannot get away with that, simply because we need logs for some kind of bookkeeping. When customers call us, they always claim that they did not log in to the server and did not change any files, and we need a way to check that. This has to work for support too, and those people usually don't have SSH access to the servers, so this data should be searchable in some central place. So we're talking about centralized logging.

I can't talk about centralized logging without mentioning Splunk. You might see the money bag next to the name on the slide; it's there for a reason. Splunk is enterprise software, and it's not just a web interface for search: it's a platform. It works both on-premises and in the cloud. It's great if you can afford it, but it is pricey. Their website is full of PDF white papers and there are a lot of webinars for you to attend, if you're into that kind of thing.

More down to earth are Papertrail and Loggly, both of which I have worked with. There are good and bad things about both, so it's a matter of taste. I'm sure you're going to be reasonably happy with either of them, if you choose to send your logs to a third-party service, which I personally don't.
If you're running things yourself, you have probably heard about ELK, right? It's currently the most popular stack and consists of Elasticsearch, Logstash, and Kibana. Let me quickly show you how it works together. We have servers generating logs, which somehow get into Logstash, which parses them and gives them meaning, and saves them into Elasticsearch, which makes them easily searchable and easily analyzable. From there you can view them using Kibana, which is a web interface to all these things.

There is also a similar solution called Graylog. It also uses Elasticsearch for storage and search, but it is not only a view on Elasticsearch. Graylog does more because, and I'm quoting here, "Elasticsearch is not a log management system." So overall it's a bit more integrated. I'm personally not particularly fond of having a MongoDB in my infrastructure, so I haven't found a compelling reason to switch from ELK, but I'm sure there are some.

If you have any questions about ELK, someone who works for Elastic, the company, is probably somewhere in the crowd. He will be happy to answer your questions, and he is also the maintainer of the Python client for Elasticsearch. One more thing: this is much more than just grep over logs. They have a lot of things going on, so there is a lot of knowledge to be found in your logs.
The hard part is how you get the data in there, and how you produce it. I'm going to say this should be the goal for you: a timestamp and something machine-readable with as much useful context as possible. Free-form text makes the Logstash configuration gruesome; this way you only tell Logstash that there's a timestamp and then JSON, and it figures it out. Of course it is really just one line, but I thought you might find it more readable at that size.

So how do you get there? It is a matter of context and format: you want to log everything important, and you want to format it in a machine-readable way. If you try to achieve that with the standard tools, you may find, like I did, that it's rather tedious. So I wrote something of my own: structlog. Does anyone know structlog? OK, let's change that.

structlog is not a logging system, and it's not a replacement for the logging system you already use. Instead, it gives you a bound logger that wraps your original logger. So if you're going to ask me whether structlog works with X, the answer is yes. It also gives you a context: you can bind key-value pairs, and once you decide to log an event out, the context you saved before is combined with the new key-value pairs into one event dictionary. This event dictionary is run through a chain of processors, which are just callables that get passed a dictionary and return a dictionary. The return value of the last processor is passed into the original logger. So if you're using the standard library's logging, you would return a string, for example a JSON string or whatever format you want. You could return XML; structlog doesn't care at all. structlog comes with JSON and key-value renderers.

Processors are really powerful, because they are really just hooks where anyone can pull data out: you can collect metrics from your log entries, you can report errors to Sentry from them, and you can enrich them with the context you collected. So structlog takes care of both context and format. Let me give you a few examples, because this is rather abstract.

The simple case: you get a logger (everything is pretty much configurable) and you can now log using key-value pairs instead of writing prose. I hate writing prose for logs, but what's even worse is parsing prose. The output is completely configurable; this is the default, which is just key-value pairs and is human-readable in development. I find this already a huge progress over the standard library, but you can do more.

This is incremental binding. Again you get a logger, and you can just start binding key-value pairs to it. As a result you get a new object every single time: it is immutable data, which is a great property to have. Everything you bound to the logger is logged out along with the event at the end. The nice thing is that you don't care at all how it is represented within your business code, because that is something somebody else cares about: the processors. In your business code you just bind values and log them out.

Now maybe something more practical. I use Pyramid here, in a very simple example, but it would probably work the same with any other framework. In the beginning you bind the request object to your logger and then you log something out. How do you do something useful with that object? You write a processor that extracts the data for you. You remove the request from the event dictionary, you add some data from the request, like the IP address of the client or the user agent, and you return the new dictionary. And this is what you get out of it in case you have a JSON renderer installed. Again, in your view you did not care about what gets logged out; that is something you decide somewhere else. If you have questions about structlog, talk to me; I'm pretty proud of that one.
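The mechanics described above (bind context, merge it with the event, run it through a processor chain, hand the result to the wrapped logger) fit into a short standard-library sketch. This is a simplified illustration of the concept, not structlog's actual implementation or API: `BoundLogger`, `add_client_ip`, `render_json`, and the dictionary standing in for a request object are all hypothetical.

```python
import json


class BoundLogger:
    """Immutable logger: bind() returns a new logger with extra context."""

    def __init__(self, sink, processors, context=None):
        self._sink = sink              # wrapped logger: any callable taking the final result
        self._processors = processors  # chain of callables, each fed the previous result
        self._context = dict(context or {})

    def bind(self, **new_values):
        # Never mutate: build a new logger carrying the merged context.
        return BoundLogger(self._sink, self._processors, {**self._context, **new_values})

    def info(self, event, **event_values):
        result = {**self._context, **event_values, "event": event}
        for processor in self._processors:
            result = processor(result)
        self._sink(result)


def add_client_ip(event_dict):
    """Processor: replace a bound request with the data worth logging."""
    request = event_dict.pop("request", None)
    if request is not None:
        event_dict["client_ip"] = request["client_ip"]
    return event_dict


def render_json(event_dict):
    """Last processor: turn the event dictionary into a JSON string."""
    return json.dumps(event_dict, sort_keys=True)
```

A usage sketch: `log = BoundLogger(print, [add_client_ip, render_json])`, then `log.bind(request={"client_ip": "192.0.2.1"}).info("user_logged_in", user="alice")` prints one JSON line containing `client_ip`, `event`, and `user`. The business code only binds and logs; what ends up in the output is decided by the processors.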
As for the standard library's logging, I say this is all you should do, and you can ignore all the rest: just log to standard out and hand it off from there. Unix has had almost 40 years to develop solid logging tools, and there's absolutely no need for us Python people to reinvent the wheel. If you are doing log rotation within your application, stop doing it and just log to standard out. Also, I heard that the logging module is not that much fun to use, but if you need it, it is there.

No matter how you produce the data, you can next send it into a file, or to syslog, or to Kafka, or into a logging agent; whatever you want. It's just standard out.

I'm a bit paranoid, because logs are important to me, so I don't want to lose any log entry, and no logging system is as reliable as ext4. So I save everything in a file, but with a strict rotation of 48 hours: after two days those entries are deleted. I ship the logs off from this file. So while I don't want to have to use grep, I still want to retain the reliability of grepping through files on a file system.

How do we do that? I use structlog to bind and log things out, and it makes a JSON string, which goes either into logging or straight to standard out; I just want to run my processors, so it doesn't really matter. Then a process supervisor (runit, in my case) takes the lines, adds a timestamp to them, and writes them to a file. This file is watched by logstash-forwarder, formerly known as lumberjack, which sends the entries to Logstash, which parses them and saves them in Elasticsearch.
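The "one JSON object per line on standard out" part of this pipeline can be sketched with the standard library's logging module alone. `JSONFormatter`, `make_json_logger`, and the `context` key passed via `extra` are names assumed for this example. In line with the setup described above, the sketch deliberately adds no timestamp and does no file handling, leaving both to whatever supervises the process.

```python
import json
import logging
import sys


class JSONFormatter(logging.Formatter):
    """Render each record as a single JSON object."""

    def format(self, record):
        entry = {"level": record.levelname, "event": record.getMessage()}
        # Key-value context arrives via logging's `extra` mechanism.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry, sort_keys=True)


def make_json_logger(name, stream=sys.stdout):
    """Return a logger that writes one JSON line per event to `stream`."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JSONFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # no file handlers, no rotation in the app
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A usage sketch: `make_json_logger("app").info("user_logged_in", extra={"context": {"user": "alice"}})` writes `{"event": "user_logged_in", "level": "INFO", "user": "alice"}` to standard out, ready to be timestamped, written to a file, and shipped by external tools.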
Now for the pragmatic part: how do you put those three things together without making it gross? Because this is what you usually end up with: you can barely see the logic hidden in the jungle of reporting, measuring, counting, and whatnot. I want it to look like this, which is much nicer: something happens and I log it out, and that's it. Of course that is not always possible, but I really like to try hard to get there.

With errors it's pretty easy. You can use one of the integrations that come with raven, like logging or Django, or you can use structlog. That's what I do when I'm using Pyramid: I just plug my error reporting into the logging stream, and I can again drop entries if something isn't interesting. There is another way to report errors that is really, really cool. Again in Pyramid, you write a view that gets the exception and a request object, and in it you use the error ID that you get from Sentry. Now when your customers call you complaining, they can tell you that ID, you can look it up, and you have the exact exception that the customer saw. This is so great that we have seen something very rare, which is a happy customer. Take that with a grain of salt, but still.

On to metrics. Most metrics can be observed from the outside and from the inside. By outside I mean outside of your app. If you look at WSGI containers, the two major ones both help you with that. Gunicorn first: it has StatsD integration right there. You add one command-line option and you have things like request times in your StatsD, and thus in your Graphite; you don't have to change your code at all. uWSGI, as usual, goes far beyond that. Of course they have direct carbon support, and they have a whole metrics subsystem. So you get your stuff done with that, and you get a picture of the state of your application without even touching your app. So go for it.

Then you can go in the middle. Pyramid again: this is a tween, which is a very awkward contraction of "between". It is called on every request and gets the request object. In this case we are just measuring time, but you can of course look at the data within the request object and start splitting up your data depending on the URL or some argument you're passing. You probably don't have to write this yourself, because there are packages that do it for you, so read up on it. But you always have the possibility to do things from within your app but outside your business logic.

And of course you can extract data from logs, because if you log something out, you shouldn't have to also count or measure it. Logstash does that for you; it supports all major metrics back ends. The problem is that you have to change the configuration of Logstash, which may or may not be a problem for you. The real problem to me is friction, which I don't like: I don't want to annoy the people responsible for Logstash to change it for me just because I added a metric. With structlog, what I do is just count events by their names, and you already have something useful.
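The in-between approach described above, measuring from within the app but outside the business logic, can be illustrated with plain WSGI middleware. This is a generic sketch, not Pyramid's tween API: `TimingMiddleware` and the `record` callback are hypothetical names, and `record` stands in for whatever timer metric you use.

```python
import time


class TimingMiddleware:
    """Wrap a WSGI app and report how long each call took."""

    def __init__(self, app, record):
        self.app = app
        self.record = record  # callable(path, seconds), e.g. feeding a timer metric

    def __call__(self, environ, start_response):
        start = time.perf_counter()
        try:
            # Measures the time until the app returns its response iterable,
            # not the time spent streaming the body afterwards.
            return self.app(environ, start_response)
        finally:
            self.record(environ.get("PATH_INFO", ""), time.perf_counter() - start)
```

Because the path is passed to `record`, the timings can be split up per URL, as suggested in the talk, without any view function knowing that it is being measured.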
Finally, you can also leverage your monitoring, which you have anyway. Any monitoring system has some support for metrics. In the simplest case you just measure the time it takes to execute the check, and so you get a really external view of the behavior of your app. It is not very precise, of course, but sometimes it's useful to see how a system feels from the outside, and not from within your availability zone or your computing center.

So what is left for you to do yourself? If you want to measure code paths, you probably have to instrument them yourself. For example, if you have views that sometimes use only cached data and sometimes hit the database, it's not very useful to average those two numbers; the result is completely useless, so you may want to split them up. And of course gauges: if you want to expose numbers from within your application, you will probably have to instrument your application in some way.

And now we're really done. So what did you learn? Proper error handling is important, and Sentry is awesome. Metrics are important: InfluxDB is probably the future, Graphite is the present; pick whichever of the two you want. Centralized logging is sensible, structlog will help you with it, and now you know how to use all of them with Python without the worst complications. I hope everyone learned something today.

Go forth and measure. Thank you.

I'm sorry, I'm not taking any questions, because whenever I did, I completely misunderstood the question and said something very stupid. So if you have any questions, I will be outside, and through Sunday I will be at the sprints. Just hit me up; I'm happy to answer any questions. Thank you.
Computer animation

