Where is the bottleneck?
Formal Metadata

Title: Where is the bottleneck?
Title of Series: EuroPython 2016
Part Number: 20
Number of Parts: 169
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/21180 (DOI)
Transcript: English (auto-generated)
00:00
I'm Manuel Miranda, a software engineer at Skyscanner. I've been working there since November of last year, and previously I was working in research, on algorithm optimization and simulations. So that's basically where all my experience comes from. I hope you find the talk interesting,
00:21
so that I don't have to see people with their phones throwing pokeballs at me. So let's start. First of all, I put the basic level tag on the talk because it's my first one and I wanted to try. I'll go first through the strategy,
00:41
what I usually check before trying to optimize a program, key points I've learned through all this experience. Then I'll show some tools I like to use that for me are really useful. I'll go from operating system tools, which I'll skip through very fast.
01:02
Then I'll show some resource tools, for memory, CPU, and so on, and then some fancier ones that I call advanced because they do more than just check one resource. Don't worry, this is the slide with the most text in the presentation,
01:23
so I promise it will be just this one. So these are the main key points I like to check when I start with a new program. The first one is focus. When I start checking a new program,
01:41
I like to think about what exactly I want to achieve, the degree of improvement I want to reach by optimizing this program. Because sometimes you are so happy about optimizing your problem: oh yeah, I'm going to speed it up, I'm going to make it faster, and blah blah blah.
02:01
And you end up so deep in, sorry for the word, shit, that you don't know where you are. You start optimizing the program on Monday, and then your teammate says to you, good weekend, man? And, oh shit, I didn't do anything, I'm still struggling with it. So the next key point is cost.
02:22
Is the optimization we are doing worth it? The company is paying you money, or are you wasting your time? For example, if the speedup is just winning some minutes for each execution, and the execution takes hours,
02:40
then maybe it isn't worth it. So cost and focus are a bit related. The third one is code knowledge. If some of you have been through legacy code, sometimes you see this bunch of code and think, I mean, why is this guy programming in Python
03:02
and writing these big 1000-line files when you have built-in functions like all, or list comprehensions and this kind of stuff? So before starting to optimize the code, if it's not your code,
03:20
you should ask why it is that way. Because in legacy code, believe me, you change one line, one print, and bad stuff starts to happen. Then there is also context awareness. This is more about,
03:40
do you control the whole environment? Is the thing you are trying to optimize just code, or does it depend on random stuff going on around it, like network issues, or MySQL queries that sometimes take a random amount of time?
04:02
So before trying to optimize with this noise, you should isolate yourself from it. Optimizing those queries or network performance is a different scope; you should go into that separately. And the last one is local context.
04:21
Sometimes you just start and say, I'm going to optimize that, I do a git clone, I start coding and everything, and you spend like two days coding, and you get your code down to half the execution time locally,
04:40
then you move it to production and it takes more time. Why? You don't know. Maybe because of virtual machine stuff: the resources are different, the operating system, the kernel version, whatever. So before starting, set up a nice environment, try to reproduce production as closely as you can,
05:01
and if that's not possible, don't just wait two days before moving your code to production; do it iteratively, so you get feedback soon enough. So that's after three years of working with this kind of stuff. I'm still not applying all of them sometimes,
05:22
but I think they are really important points that can save you lots of time if it's a big job you have to do. So here you can check my design skills. Usually I try to approach this from the outside in.
05:42
Usually when I start with new code, I just check: how long does it take to execute? Once we know how long, we also want to know the resources it consumes, like memory, CPU, and this kind of stuff. And once you have these two things, the thing you have to do next
06:00
is understand why. Why is it taking that time, and if it's taking this time, is it normal for it to consume all these resources? And to do that, you need the code knowledge we talked about before, because you have to know what the program is doing; this way you will know whether it's reasonable or not for it to be consuming all this.
06:23
So once you understand the code, you can start rewriting, or going deeper inside, and this kind of stuff. One thing I want to comment on is that when you take code you don't know, you usually apply this measuring of time, measuring of resources,
06:40
and all that to the whole code. But if it's your code, you usually know which part of the code is the problematic one, so you just monitor that part. Of course, once you change that part, you just have to execute again, so you don't mess up the whole execution. But that's basically the flow I usually use.
07:04
Oh, I lost my, this. So, let's start with very basic tools. I know you all know them. But for me, the most interesting ones, I mean, the first thing I do, maybe in the first five minutes, is use time and htop to check how it goes.
07:24
The thing I usually like time for is checking whether my program is really CPU-bound or IO-bound. Oh, how many of you don't know the time tool? Okay, so basically the output it gives you
07:42
is the total wall-clock time, the CPU time spent in user code, and the CPU time spent in the kernel. So if you have many blocking things, like network or MySQL queries, it will show you the difference between them. So it can give you an idea of the things the program is doing.
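As a sketch of that idea (the script name is a placeholder for your own program), running it under the shell's time builtin splits wall-clock time from CPU time:

```shell
# Run any program under the shell's time builtin; "my_program.py" is a placeholder.
time python my_program.py

# Typical shape of the output (bash):
#   real    0m12.48s   wall-clock time
#   user    0m3.15s    CPU time spent in your own code
#   sys     0m0.22s    CPU time spent in the kernel
# If real is much larger than user + sys, the program is mostly waiting
# on IO (network, disk, databases); if they are close, it is CPU-bound.
```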
08:03
Then htop. Basically, I started using it because, well, they made me start using a Mac. And I don't know if you have checked the output of top, but, oh, wait, I think I have to move the presenter view,
08:22
I don't know how to get out of here. So if you check the output of top on a Mac, I mean, if someone can read anything in that, I'll give you a rare Pokemon, because it's just messy. In Linux it's much better. So if you go with htop,
08:42
I think, I mean, at least it orders by CPU, it shows you the four different processors, memory, and blah, blah, blah. But as I said, this is really basic and you already know all that. So let's go to some more interesting stuff.
09:07
Oh, really? So the first one is memory profiler. This one is quite interesting, because it allows you
09:21
to check the whole flow of the program, the memory consumption for the whole program, function-oriented, by lines, and all this. And there is another interesting feature: it can drop you into a debugger
09:40
once you've reached a maximum memory capacity. So you can say, when executing, if the memory I'm using is more than one gigabyte, or 100 megabytes, just drop me into the debugger console. Then you can check the status of the program, the objects you have initialized, and all this.
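memory profiler itself is a third-party package, but the underlying idea of snapshotting memory around a suspect function can be sketched with the standard library's tracemalloc (the function name here is just an illustration, not the talk's actual code):

```python
import tracemalloc

def costly_function():
    # Allocate a noticeably large list (~8 MB of pointers on CPython)
    return [0] * 1_000_000

tracemalloc.start()
before = tracemalloc.take_snapshot()
data = costly_function()
after = tracemalloc.take_snapshot()

# Diff the snapshots per source line; biggest allocation diffs come first
stats = after.compare_to(before, "lineno")
top = stats[0]
print(top.size_diff)  # bytes attributed to the most allocating line
```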
10:01
So these are some examples of the output. Let's start with that one, for example. Here, if you can see it from the back, I don't know, it shows you, for example, that we have main, which took like 17 seconds, and then we have first costly function and second costly function.
10:20
And it marks where each function starts. So it gives you an idea, over the whole program, of which function or functions you should focus on to really improve the memory consumption, which sometimes can be a problem. And as another example, here we have the terminal output for checking the memory consumption per line,
10:44
which is kind of interesting too. This one is really slow; it takes like eight times longer to execute, but I mean, for a one-off it's good enough. So let me show you the,
11:03
does anyone know how to move the terminal to the presentation part easily? Not just because I don't want to get out and back in, but well.
11:22
So the program I used for that profile, can you see it from the back? I don't think so. So the program I used as a test is that one. I mean, you've already seen the graph, but when I was first trying it, I thought that is prime would take much more memory
11:41
than trial division, because of the big number and stuff, but it turned out the other way. So with just this basic tool, you can extract useful information; something interesting to have in your tool set. So the next one is for the other resource,
12:03
which is called line profiler. You know, we Python people usually have really cool names for our tools; we are really original. So this one is an advanced version of cProfile. cProfile is the built-in profiling tool for Python.
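Since cProfile ships with Python, here is a minimal sketch of using it directly (the two costly functions are hypothetical stand-ins for the talk's demo program):

```python
import cProfile
import io
import pstats

def first_costly_function():
    return sum(i * i for i in range(200_000))

def second_costly_function():
    return [i ** 2 for i in range(200_000)]

profiler = cProfile.Profile()
profiler.enable()
first_costly_function()
second_costly_function()
profiler.disable()

# Render the per-function report, sorted by cumulative time
out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats()
print(out.getvalue())
```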
12:21
I'm not presenting that one because it's the one that's always used, but basically it's useful for knowing the CPU time consumption of your programs. It shows you, per line and per function, the average percentage of time consumed by your lines, your functions, and everything.
12:41
It's compatible with cProfile output. And that's something interesting I found out: sometimes you start profiling, and those profiling tools multiply the execution time, and then you're profiling and you get pissed off and say fuck it, and you hit control-C,
13:00
and with some tools you lose all the progress, because the report is generated at the end. This one just generates the report from whatever it has calculated up to that point. So it's pretty cool. So that's the example of the output for this one.
13:21
I mean, it's the same code, and for example here you can see the execution time, the percentage of time, the number of hits for each line, the total time, and the time per hit. So for example, here we can see which the problematic function is:
13:42
this one is an order of magnitude above the first costly function, so if we wanted to check the contents and decide which one to optimize, we would definitely go for that one. So that's pretty cool. It also takes some time to execute over the full code,
14:04
and I mean, if you have code that takes hours to execute, these tools take too long to use. And they take too long because usually
14:20
you have to monitor the whole code. And if you have done this kind of stuff, sometimes it's a bit difficult because you have to modify your source code. The profile decorator was there, so if you want to monitor functions you have to use that decorator, which means changing your source code,
14:42
and then you execute, and you realize that wasn't the function you wanted to profile, so you go to another function and have to execute again. So it's kind of messy. And that's where our super IPython comes to help. Line profiler and memory profiler
15:02
are supported as plugins for IPython, and that's really cool, because it allows you to profile any function in your code, or in the source code of any other library, interactively, so you don't have to execute the full program. So just let me show you one of the example outputs,
15:24
but now we will play a bit with it. So, if you look at the top of the screen, we are just using load_ext line_profiler, then I'm importing my program, and then I'm using the magic command lprun with the function I want to profile,
15:41
which is the minus f, and then the statement I'm calling, which is run_mocks.second_costly_function, yeah? So here we have the report from line profiler, which is one level deeper than we had previously,
16:01
when we were just monitoring the outer function. So here we can see why the second costly function is more costly: it's because it's actually calling this Eratosthenes method, which, I don't know what it does, I think it's something with prime numbers, but that's the one that's actually taking lots of time.
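The workflow just described, condensed into a sketch of the IPython session (module and function names are reconstructed from the talk, and the line_profiler extension must be installed):

```text
In [1]: %load_ext line_profiler
In [2]: import run_mocks
In [3]: %lprun -f run_mocks.second_costly_function run_mocks.second_costly_function()
Timer unit: 1e-06 s
Line #   Hits   Time   Per Hit   % Time   Line Contents
...                                       (per-line timings for second_costly_function)
```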
16:21
So here we also have the hits, the time, the per-hit time, and the amount of time it's consuming, which is what we actually want to work with. So just to show you that I'm not lying,
16:43
if we go to, I don't know which, I think it was that one, IPython, and then, let me put that on top, then I'm doing the load_ext
17:02
line_profiler. Then, from my program, I'm importing the second costly function, and then from algorithms, now we are going to introspect the Eratosthenes function without touching our source code,
17:21
so you can see how easy it is to check how much time each line of the Eratosthenes function is spending. So, from algorithms.math.sieve; well, I knew that already, but sometimes you will have to check in the source code where each function is, but you will know that.
17:40
So we are importing this function, and now we are running lprun: we want to profile the Eratosthenes function, and we are going to run the second costly function again. So now we are running the second costly function,
18:02
monitoring the time that Eratosthenes is taking. So, right now it's taking only like 20 seconds or so, but what if the function took two hours? Do we really have to spend two hours just to check the cost of the Eratosthenes function?
18:21
That's another cool thing we can do: just hit control-C, you know, and I can see the report now. We can also call the Eratosthenes function by itself, calling it with random args, and it will show you the report as soon as it has calculated just one iteration,
18:41
because in the original program it was being called like 800,000 times, which is not necessary for a basic profiling of the function. So we can see again here which functions and operations are the most time-consuming. If that were a real problem we wanted to optimize,
19:01
we would check: are those loops really needed, are these operations the most optimal ones, and so on. We are not going to go into that detail, but at least now we have spotted where we want to do all our work. And I mean, for me, IPython is just pretty handy
19:21
for doing this interactively. So back to the presentation. I mean, if some of you have worked with this already, I mean, come on: if you search for profiling in Python,
19:40
line profiler, memory profiler, and these kinds of tools are the first ones you find. So this second part of the presentation, which I call the advanced part, is about cooler, fancier tools that display visual graphs, so you can play with them more interactively. Which, I mean, is cool.
20:02
At least it looks cool. So the first one, which I kind of love and hate at the same time, you will see why, is this one, plop. It doesn't have many comments, and it's not that well maintained, but it does its work.
20:21
So the features it has: it's a really low-overhead profiler. Why is that? Because it uses the strace and ltrace tools from the operating system, which basically just read the stack of the program being executed, so it doesn't interfere with your program,
20:41
like we've seen with memory profiler and such, which put stuff in the middle of your program. So it does this stack analysis, and it basically displays a call graph of all your functions, how they call each other and the time spent in them, and it also displays a flame graph,
21:00
which, I don't know if you know what that is, but it's something pretty fancy; for example, Netflix is using it. I don't know if you follow their blog, but it's pretty useful. It also has a server running with tornado, and with a decent setup you can feed this viewer so you can update the visualization in real time.
21:26
So to show you these things, for example here we have the profile view. This is why I hate the program: the call graph is a bit messy,
21:40
and if you want to move things around, you have to play with it a bit, like, well, move it a bit so I can see it better, so it's kind of shitty. But here we can see that the Eratosthenes function is indeed the one consuming the most time, and main is the one calling second costly function and first costly function, so we can see the flow of it,
22:03
and then we also see that the width of the arrows is the time spent calling these functions, so it's pretty useful. And also the flame graph. I know you don't see anything here, but let me explain it to you. So here at the base we have the main function;
22:22
the main function calls second costly function and first costly function, so you have to read it from bottom to top, and the width of a block tells you how long that function took to execute. So here at the bottom it tells you that main took 99%,
22:40
and this Eratosthenes call took 89%. So that's pretty cool. And let me show you why it's called a flame graph: because with a real program it looks like that, which is lots of stuff, but you have to iterate a bit and learn how to isolate things there. So the next one,
23:04
it's called PyFormance. I'm running out of time, so I'm going to skip the code I had prepared, but you can reach out to me later. So, PyFormance doesn't have fancy graphs, but it's pretty cool, because I bet you have done this a lot:
23:23
the typical from time import time, start_time = time.time(), then we do some stuff, then end_time = time.time(), and then we do the subtraction and then we log it and so on. So basically PyFormance is a set of tools
23:40
that, with context managers and this kind of stuff, gives you how many times a function has been called, how long a function took to process, and, this one is really interesting, the measured rate of events over time. So it tells you how many times your function was called during the last second, during the last minute, during the last 15 minutes,
24:01
so this is really cool for generating metrics for an API or that kind of thing. So it's not exactly an interactive or optimization tool for checking a profile; it's more for generating metrics, so you can monitor how your application is doing.
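The manual start/stop pattern just mentioned can be wrapped once in a small context manager; this is a stdlib-only sketch of the idea that PyFormance's timers generalize (with call counts and 1/5/15-minute rates on top):

```python
import time
from contextlib import contextmanager

timings = {}  # shared dict: name -> list of durations in seconds

@contextmanager
def timer(name):
    # Measure the wall-clock duration of the "with" body
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(name, []).append(time.perf_counter() - start)

with timer("squares"):
    sum(i * i for i in range(100_000))

runs = timings["squares"]
print(len(runs), min(runs) > 0)
```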
24:20
Something you have to take into account is that these timers, the ones measuring the rate of events, are shared variables, so you have to be aware that if you are using threads and this kind of stuff, it uses locks internally, and it can end up being a mess if you don't use it properly. So that's what I had prepared,
24:41
but I'll upload the slides, because I want to present this last one, which is the one I like, basically. So this last one is called KCachegrind. It was originally for profiling C and C++. In the research job I had, I was doing C++ stuff, and that's the tool I was using
25:01
for optimizing all those pointers and operations and things. It's a really old tool; I think it started around 2002 or so. So it displays the call graph, it displays the execution time, it displays a block view of the time spent in the functions.
25:25
Also the time cost per line, which doesn't work in Python, but it did in C++. I haven't been able to make it work properly in Python, but I want to. And it also displays assembly code,
25:42
which, of course, I use every day. And it reads from cProfile output using this tool, pyprof2calltree. So just to check the output, this is the call graph we have from the same program. So instead of showing you an image,
26:03
let's open the program so you can see how it goes. So I have this profile already generated,
26:22
but basically what you have to do is call your Python program with the cProfile module, which will output the profile information, and then you have to convert it to callgrind format so KCachegrind, well, QCachegrind, can understand it.
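As a sketch, those two steps look like this ("my_program.py" is a placeholder; pyprof2calltree is a third-party package, so that step is shown as a comment):

```shell
# 1. Run the program under cProfile and dump the stats to a file (stdlib only)
python3 -m cProfile -o program.prof my_program.py

# 2. Convert the dump to callgrind format and open it in KCachegrind/QCachegrind
#    (requires: pip install pyprof2calltree)
#    pyprof2calltree -i program.prof -k
```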
26:42
On Mac it's called QCachegrind, because the K one doesn't work there; you have to use QCachegrind. Yeah, it's just like, well. So here we have this. So basically here we have all the functions that have been called in our program.
27:02
We just want to check the ones that we actually control. We are not going to optimize Python's math square root, or len, or that. We just want to check main and Eratosthenes and the ones we actually programmed. So in that view, we have them ordered
27:21
by the inclusive time they take. So main has taken 100% of it, obviously, and then we go from the most time consumed to the least time consumed. So if we order by self time, we will see again that Eratosthenes is the one that consumed the most time.
27:42
So we can go, for example, here, and here we want to select main. Okay, so the first one is fast: rather than ordering here
28:03
and all that, in the top right part we have a block view, which just lets us easily check which function or functions are taking the most time to execute. I mean, with just one look, you can check: okay, here is the problem.
28:21
And in the bottom right part, we have the call graph of our program flow. So we called main first, then main called first costly function and second costly function, and then first costly function called is prime, second costly function called trial division,
28:41
and both of them called Eratosthenes. And again, like in the plop tool, we have the size of the arrow and this thing here that tells you how much time was spent calling that function, which is also the same here. And then this "N x" label is how many times we called that function.
29:05
So that's a really interesting tool. Another thing I really liked about it when I was programming C++, which now obviously doesn't work,
29:20
is that it was showing you a report like the one we've seen in line profiler. So for any function of your code that you have called, it would display a CPU time cost figure next to each line, so you know how long every line took to execute.
29:41
I'm still working on it, so I'll update it if I make it work correctly. So just to finish: how are we on time? Is it five minutes with questions or without questions? Oh, cool.
30:02
Then let me show you the code I skipped from the PyFormance part, which I think is interesting to have here. So now we are going to go to the terminal. It looks better.
30:25
So this is code that is using PyFormance. So basically I'm importing timer, which is a tool provided by PyFormance, so you can do with timer("test").time(). That's the shared variable I was talking about.
30:42
So in any other part of your code, you can just access that timer and print get_min, get_max, get_var, which give you the minimum time it has taken to execute that part of the code, the variance of the executions, the mean rate, which is the number of executions per second,
31:03
the one-minute rate, and this kind of stuff. It's pretty handy because, imagine you have some code and you want to keep an eye on it: you know it runs in around 10 seconds or so. You can just use this tool to trigger an alarm or a log entry that tells you:
31:22
oh God, this function took like 20 seconds now, something is going wrong. So you can use that for your performance tests and this kind of stuff. For a test, also, let me show you. This is basically telling me that
31:43
there was a slow execution, because, I don't know if you saw the threshold value I had here: threshold 0.21. If get_min is above the threshold, it tells me slow execution, something wrong happened. And why is that? Because I'm doing a sleep(random.uniform(0.1, 0.3)),
32:03
which, not exactly, but on average should take about 0.2 seconds; because of internal stuff it takes a bit more. So that's handy to have. So now, yeah, to finish the presentation,
32:27
I know I've presented a bunch of tools; some of them won't fit in your tool set, some of them will. So one thing that is really important before
32:41
starting with this kind of stuff is building your tool set. Ask yourself what exactly you need: do I need this kind of tool, or do I need something specific to my framework? If I'm using Django, maybe the Django Debug Toolbar; or GreenletProfiler if I'm using greenlets, because cProfile doesn't work well with them. So just try to do some research
33:03
before using whatever tool you find first on Stack Overflow or in the Google results. So that's pretty much all. I hope you found it interesting, or at least you learned something new.
33:22
So if you have any questions, or if we don't have enough time for questions, you can just reach me outside; come talk about this or any other random stuff. I'll be happy to talk with you. So, questions?
33:59
Well first of all, thank you, very nice.
34:03
Just a simple question: which one of these do you use the most? So, as I said, my favorite one is KCachegrind, because, I mean, the execution is really slow, but with just one execution you have the full view of what's going on in your program.
34:20
The thing is that since I don't have the per-line CPU time after that execution, usually I end up using line_profiler for that, interactively with IPython. Once I have the big picture, I say: okay, that's the function that's pissing me off. I go into IPython and call it manually, so I know exactly which lines
34:41
inside that function are the problematic ones. So that's basically the two of them. But I also like to use the other ones, because by using just one tool, sometimes you have a narrow view of what's going on. By having different tools, sometimes with the flame graph or the Plop stuff
35:02
or memory profiling, for example, you get a bigger picture of what's going on, because with just one tool you don't see all of it. And sadly, there is no tool that does everything, so you have to play a bit with them.
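For reference, the interactive workflow described above looks roughly like this. The %lprun magic does require the line_profiler package, and trial_division here is a made-up stand-in for whichever function KCachegrind flagged:

```python
# After KCachegrind points at the costly function, line_profiler gives
# per-line timings for it. In an IPython session (assumes line_profiler
# is installed):
#
#   %load_ext line_profiler
#   %lprun -f trial_division trial_division(100003)
#
# A deliberately naive hot spot to inspect, in the spirit of the talk's
# prime-checking demo:
def trial_division(n):
    """Primality test by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True
```

The -f flag names the function whose lines should be timed, and the rest of the %lprun line is the ordinary call that exercises it.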
35:23
Okay, thank you. We have more time. We have some minutes for questions. Very interesting talk. Do you have any advice on measuring the performance of an ongoing process like a long-running process
35:41
or something like that? So you mean something like an API, or? Yeah, something that is running in the background. Streaming things? So, I mean, there are two kinds of things here. For me, long-running processes can be simulations used in research, which can take like one week or, I don't know, a lot of time.
36:03
So for those, logging is a really important one. Logging with, for example, this pyformance tool I've shown you is really interesting, because then you get feedback on different metrics inside your program: how many times have you called
36:22
this function you know is the big one? Is that number of calls the one you were expecting or not? So that's for one side. And on the other side, for example, API calls and this kind of stuff: as I said, pyformance is also interesting there, because it can tell you how many times
36:42
this endpoint has been called, and this kind of stuff. And the Plop one, I'm not using it in live services and in production services, but I know people are. They are using it just to see how this call graph builds up during the execution.
37:02
You have to set it up, because the commands I'm showing in the presentation profile the whole process: you start the monitor, then it ends and creates a file you can open. But it has context managers and this kind of stuff, so you can open the file, register the activity,
37:20
close the file, and then call another function. So the file evolves dynamically, and you can check the visualization from time to time to see how it goes. But for me, basically, the most interesting part is the logging. For the long-running ones, logging.
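The register/close/repeat pattern can be sketched with the standard library's cProfile. This is a hedged illustration of the idea, not Plop's actual API: enable the profiler around a chunk of work, snapshot the stats, and repeat, so a long-running process can be inspected from time to time.

```python
import cProfile
import io
import pstats


def work():
    """Stand-in for one iteration of a long-running process."""
    return sum(i * i for i in range(10_000))


prof = cProfile.Profile()

# Register activity for one chunk of work...
prof.enable()
work()
prof.disable()

# ...then snapshot what has been collected so far. In a real service you
# would periodically prof.dump_stats("snapshot.prof") and open the file
# in a viewer from time to time instead of printing it.
buf = io.StringIO()
pstats.Stats(prof, stream=buf).sort_stats("cumulative").print_stats(5)
print(buf.getvalue())
```

Because enable/disable can be called repeatedly on the same Profile object, the stats accumulate across chunks, which is exactly the "file that evolves dynamically" behaviour described above.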
37:44
Oh, I don't know if it was you or... Hi, interesting talk, thank you. The line profiler or the memory profiler, do they also work with argparse, or with command-line parameters? With command-line parameters?
38:01
Yes, if my script works on the command line and it expects parameters, do they mix up? I saw that you called it with a minus. Oh, because you mean the -m memory_profiler thing. So you can call it that way; then it will just monitor the decorated functions,
38:23
which takes less time. But you can also call it with normal Python: you do python -m memory_profiler and then your program with all the arguments you are using for your script, and it will work anyway. So there are two ways of calling it.
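A minimal sketch of a script that supports both invocation styles. The script name my_script.py and the --n flag are made up for the example, and memory_profiler is assumed to be installed for the profiled runs; the fallback decorator lets the sketch run anywhere:

```python
# Two ways to run this, as described above:
#   python -m memory_profiler my_script.py --n 100000
# or, by importing the decorator explicitly, as normal Python:
#   python my_script.py --n 100000
try:
    from memory_profiler import profile  # prints a line-by-line memory report
except ImportError:
    def profile(func):
        # No-op fallback so this sketch also runs without memory_profiler.
        return func


@profile
def build_list(n):
    """Allocates a list so the profiler has something to report."""
    return [i * 2 for i in range(n)]


if __name__ == "__main__":
    import argparse  # command-line parameters mix fine with the profiler

    parser = argparse.ArgumentParser()
    parser.add_argument("--n", type=int, default=100_000)
    args = parser.parse_args()
    print(len(build_list(args.n)))
```

Either way, the script's own arguments are passed through untouched; with python -m memory_profiler, everything after the script name belongs to the script, so argparse sees exactly what it expects.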
38:42
So, yeah. Okay, thank you very much. Thank you.