Programming in Parallel with Threads

Video in TIB AV-Portal: Programming in Parallel with Threads

Formal Metadata

Programming in Parallel with Threads
Title of Series
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Release Date

Content Metadata

Subject Area
Programming in Parallel with Threads [EuroPython 2017 - Talk - 2017-07-11 - PyCharm Room] [Rimini, Italy] Threads are typically not the way to take advantage of multiple CPUs for CPU-bound problems. The Global Interpreter Lock (GIL) allows the use of only one CPU at a time when using threads. However, the GIL is released for IO tasks. The use case is a scientific simulation model that has to run 18,000 different simulations. The input data for these simulations need to be extracted from a common database, re-assembled and translated into several input files per simulation. After each simulation, which is run with an external, standalone executable, the output data need to be gathered and rearranged in an output database. The implementation scaled up to 50 threads. On an eight-core machine more than 90 % usage of all CPUs could be achieved, bringing the total run time down to about two hours from about 15 hours. Depending on the use case, threading can help to speed up a program and even take advantage of multiple CPUs. This talk presents such a use case. The approach can be translated to problems from other domains if the sub-tasks can be turned into IO tasks. Asynchronous programming could have been used here. However, using a thread per task and a class that represents a task is likely conceptually simpler for programmers not used to asynchronous programming.
I think that's much better. I had some trouble getting the slides up: it worked properly on my screen, but connecting the projector changed the resolution. So, this is "Programming in Parallel with Threads", with a scientific use case. I would like to give you a use case where you can do this with threads. I will put the focus on the use case and then show you one solution in which threads turned out to be pretty useful.

Let me go a little bit into the scientific application and what it is supposed to do. It is about micro-contaminants, very small concentrations of chemicals that come from agriculture and domestic housing. You apply a lot of chemicals in agriculture, and on a house you have paint, and the rain washes off some of the paint, so you end up with very small concentrations in the water. The use case is the River Rhine, which is one of the bigger European rivers; I will show it in a minute. The slides here are about the best I could get in time.

Micro-contamination means very small concentrations, from nanograms to micrograms per litre in the river. These substances have biochemical effects on aquatic organisms, and the effects can be strong. For instance, herbicides inhibit photosynthesis, and this has a very big impact on the whole ecosystem, because photosynthesis is where everything starts: this is where organic substances are produced, and if this is inhibited, something might go wrong in the ecosystem. Likewise, chemicals can inhibit the reproduction of fish. There are currently more than 100,000 registered chemicals in Europe, so that is a lot of them, and this simulation model should be able to model most of them.

OK, some of the sources. First, we have diffuse herbicide contamination from agriculture. You see here we have agriculture in different places, and
you apply lots of substances like herbicides there, and not everything is actually used for its purpose. Depending on the precipitation, the rain, some of it is washed into the river. This is called diffuse because it is spread over a larger area. That is one source: agricultural herbicides.

The second source would be diffuse and point-source contamination from buildings. You have buildings, and there are chemicals used in constructing the building, in paint or plaster for instance, and the rain carries away many very small amounts of this stuff, which finally end up in the river. Then there is contamination from households: things like medication and radio-opaque substances. If you go to the hospital and are given something to get a better picture for an X-ray or something like this, then eventually it will end up in the river. The same goes for cosmetics, household chemicals and artificial sweeteners. The interesting part about those is that they don't have any calories, so they are not attractive at all to any microorganisms that could break them down. They don't break down, so they can accumulate in the environment. And there is also input from sewage treatment plants. They have pretty good technology, but they don't remove everything, especially at these small concentrations; some kinds of stuff they won't be able to get out.

OK, the catchment area of the River Rhine. You see here, the river is more than 1,200 kilometres long, and on more than 800 kilometres of it you can use a ship. The area is about 185,000 square kilometres, the discharge varies between 1,000 and 2,000 cubic metres per second, and about 58 million people live in this area. This green thing is called the catchment: the area from which the water finally ends up in the river. All excess water from precipitation will end up in this river.

This is the big catchment, but it consists of about 18,000 sub-catchments, small areas. You can see here the small rivers that eventually flow into the Rhine. There are 18,000 of them, so you can resolve things in space: going in this direction or in that direction, you have different areas, different sub-catchments.

Now let's have a short look at how these numbers come about. There are a lot of statistics: you know roughly how much medication an average person consumes per year and how much of this actually ends up in the wastewater, and you know how much of it can be removed in the sewage treatment plants. Typically this is just a function of the population density. This is the population: dark colours mean a lot of people, white means very few people. You can see how people live around Zurich, and here in this area. In big cities you have a lot of people, and of course more people means more input into the river.

For agriculture it is a bit different. You know roughly how much herbicide is usually used, you know what crops are growing there, and there is a loss rate. The loss rate depends on the precipitation: the more rain, the more is lost. It also depends on how the rain falls, for example whether there is a very strong rain. If you look at this over time, you see it is different, and of course it is higher in agricultural areas. So you know how much is applied, and all of this goes into the model.

What other data are available? We have the local discharge as a time series, with all the data for one year. We have the land use, so we know whether there is agriculture or there are cities, and how many people live in each of these catchment areas. We know when these substances are applied over time, and we also know the loss rates and decay rates of the substances. Some of them decay, some of them don't.
A few words on this project: the model itself was developed by an external programmer and is closed source, so we only have the executable. It calculates concentrations and loads for one catchment and for one substance. So we have this one catchment, and typically the program reads an XML file with all this information, then it reads a CSV file with time-varying data, which I will show in a minute, and it spits out the result as a CSV file.

This is how it looks. We have this XML file; it is just a typical XML file with stuff inside. You see it describes the substances and some of the features of the substance. In it you specify the name of the input file, some units and some of these features, and also some of the things the model is supposed to do.

OK, the CSV file is pretty simple. We have this time stamp at the start, and every time step stands for one hour, so there are about 9,000 time steps for one year. We have the temperature in degrees Celsius, the precipitation, which is zero here, and the discharge, so we know how much water flows every hour from this sub-catchment into the river.

The result file looks like this. Again we have the time stamp here, it echoes back the input, and then, most importantly, there is the concentration of one of the substances. That is what comes out of it.

Then we had to prepare the data, because we have 18,000 catchments and we get the data in just the wrong format: a file with 18,000 columns, but we need 18,000 rows, so we need to transpose it. I won't talk about the details of how to do this here; that would be a different topic. The land use data we get from a database and read from there, as well as the sub-catchment connections, which will be used in the next step.

So what do we need to do? First we have to get the data from this big file and, as I said, rearrange it by sub-catchment. Then, and this red box here is what I actually want to talk about, we calculate loads and concentrations for each of these 18,000 sub-catchments. Finally we have to rearrange the data again, which is another post-processing step that I am not talking about now. So we want to calculate 18,000 catchments: for each catchment we get the input data, then we run the model and get some output. Of course the program is implemented in Python, Python 3.5 at the time, and everything is running with the usual scientific libraries:
NumPy, pandas and PyTables. We store the data in HDF5, a scientific format that can hold very large amounts of data; you can have terabytes if you like. In this case it is not that big: the output is about 6 gigabytes.

OK, so to calculate loads and concentrations for one catchment, this is what we need to do. We get the data from this HDF5 file. We generate the input files: for the XML file I use a template and just have to fill in the missing things, which is not much, and we also have to generate the time series for the temperature, the precipitation and the discharge. Then we start the external process and wait until it has finished. And then we read the output from the external process and process it. So basically there are four steps.

And now we come to the actual topic of this talk: this is a case for threads. You might know that standard Python, CPython, has what is called the global interpreter lock, the GIL. CPython has supported threads from the very beginning, and it uses operating system threads, so all these threads are operating system threads. But because of the GIL they do not run in parallel in terms of CPU: even if you have multiple threads, only one of them runs at a time; they are never executed at the same time.

So if you have CPU-bound problems, threads make no sense, because it actually gets slower: you have the overhead of switching from one thread to the other. If you have I/O-bound tasks, input/output-bound tasks, like reading and writing files or starting external processes, then threads make sense, because whenever there is input/output going on, Python releases the global interpreter lock and you can take advantage of a bunch of threads. That is very important: use threads when you have work that is input/output bound; otherwise they will not make things faster. Unless you just want concurrency and don't care about making things faster, which is also fine. That is a very important fact.
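To illustrate the point about I/O-bound work, here is a minimal sketch that is not code from the talk. It assumes `time.sleep` as a stand-in for waiting on a file or an external process (blocking calls of this kind release the GIL); the function names are made up for this illustration.

```python
import threading
import time


def io_task(seconds):
    # Stand-in for an I/O-bound step such as waiting for an external
    # process. time.sleep releases the GIL, as blocking I/O does.
    time.sleep(seconds)


def run_sequential(n_tasks, seconds):
    # Run the tasks one after the other and return the elapsed time.
    start = time.perf_counter()
    for _ in range(n_tasks):
        io_task(seconds)
    return time.perf_counter() - start


def run_threaded(n_tasks, seconds):
    # Run every task in its own thread and return the elapsed time.
    start = time.perf_counter()
    threads = [threading.Thread(target=io_task, args=(seconds,))
               for _ in range(n_tasks)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", run_sequential(10, 0.1))
    print("threaded:  ", run_threaded(10, 0.1))
```

The threaded run should take roughly as long as a single task rather than the sum of all tasks. A CPU-bound function in place of `io_task` would not be expected to show that gain while the GIL is in effect.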
If we go through the four steps we have here, we see that all of them are pretty much dominated by input/output. We generate files and write files. We start the external process, which is where by far the most time is spent, and we wait for the results. And then we read the output. So all of these tasks are input/output dominated, I/O-bound, and we have a use case for threads.

OK, this is the big picture of what happens. You have a main thread and you have workers; these are the actors, so to speak, in our process. The main thread reads the data from this big HDF5 file and also saves the results there, and it distributes the work to the workers: it starts a new worker for each calculation, as we will see in a minute. The worker does the actual calculation. The worker does not return anything; in this case the worker puts its result in a queue. Whenever you can, when you work with threads, you should use queues. Don't try to do locking; a queue does all of these things for you. If you want to share data between different threads, you can do locking, but it is really not recommended, because these things are potentially very, very difficult to debug.
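As a small illustration of why a queue removes the need for explicit locks, the following sketch (hypothetical names, not the speaker's code) lets several threads put items into one `queue.Queue` while the main thread is the only consumer. `queue.Queue` does its own locking internally.

```python
import queue
import threading


def worker(worker_id, n_items, results):
    # Each worker hands its results to the main thread through the queue.
    # No explicit lock is needed around put().
    for i in range(n_items):
        results.put((worker_id, i))


def collect(n_workers=8, n_items=1000):
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(wid, n_items, results))
               for wid in range(n_workers)]
    for thread in threads:
        thread.start()
    # The main thread is the only consumer; get() blocks until an item
    # is available, so no polling or locking is required here either.
    collected = [results.get() for _ in range(n_workers * n_items)]
    for thread in threads:
        thread.join()
    return collected
```

Every item put by a worker arrives exactly once, although the order in which items from different workers interleave is not deterministic.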
So I use queues: each worker does something, and when it is done it puts its result in the queue, and the main thread takes the result from the queue. That is the conceptual picture.

I actually start a new thread for each unit of work. The question is: is this inefficient, and would it be better to use a thread pool? Maybe, maybe not. Again, the use case is what counts. To look at efficiency I did a short test: starting 18,000 threads and joining them takes about a second, more or less, on my machine. But the total run time of the program is about two hours, and one second versus two hours really doesn't matter for this use case. If you start that many threads and kill them afterwards, of course you could potentially gain half a second or up to a second by not doing this, but with two hours of run time you can forget it. Sometimes it makes sense to use a thread pool; in this case, in terms of time, it doesn't give you any advantage. OK, so let's have a look at the code.
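The short test described above could look roughly like the following sketch. It is an assumption about how such a measurement might be set up, not the speaker's test; the threads do nothing, so only the start and join overhead is measured.

```python
import threading
import time


def noop():
    # The thread body is empty, so only start/join overhead is measured.
    pass


def time_start_join(n_threads):
    # Start n_threads trivial threads, join them all, return elapsed seconds.
    start = time.perf_counter()
    threads = [threading.Thread(target=noop) for _ in range(n_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"{time_start_join(18_000):.2f} s for 18,000 threads")
```

The result depends on the machine and operating system; the figure of about one second is the speaker's measurement, not something this sketch guarantees.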
This is the source code of the worker thread. I import threading and the class Thread from the threading module, which is in the Python standard library, and I inherit from Thread to make it a thread. This is the worker that is doing all the work. It has a lot of methods, but I only show those that are needed to understand the threading. The worker has to generate the input files, which are first the XML file and then the time-varying file. Then it does some work: it executes the external program and gets the result, and then it reads the output. These are the tasks the worker has to do. But the only thing that concerns the thread is run: I have to override a method called run, and when you start the thread, this is the method that will be executed. So these are just the things, one, two, three, four, that are supposed to be done inside one of these workers, and the main thing is run, which will be executed. If we look at run, this is how the whole of run looks: it is just creating the inputs, calling the execution of the external program that is doing the scientific calculation, and reading the output back. That is what one of the threads is doing. Of course run will be called 18,000 times. I just gave all the method names here; it doesn't really matter what is inside them, that is just a detail that has nothing to do with threads. Only run is important.

Then there is the boss, the main thread. The main thread is just a normal class. Here it inherits from object, which in Python 3 you don't have to do. There is a lot of preparation for everything that has to be done once: I generate the templates first, which are filled in later on; I generate temporary files and directories, where the workers put their files and read the output from; and I open the HDF5 file. You have to do something special with HDF5 to make it work here, in this case to keep it open, and then close the HDF5 file. Then I read the global parameters and so on. The interesting part is actually this run_all method. It is doing the main work, and it is the only method that is concerned with the threads; all the other ones are not. It is always a good idea, if you work with threads, to isolate whatever concerns the threads in one method if possible. Then you know that if something goes wrong with the threads, you only have to look at one or two methods. In this case it is possible.

So again, the boss is doing everything that concerns the global things, everything that concerns all the 18,000 calculations. It reads the HDF5 input and writes the output. Theoretically you could read from the file in the threads, but I tried it and it didn't work: having all the workers access the HDF5 file to read separately did not work. And I think it doesn't matter much, because when you read from the HDF5 file the hard drive is the limiting factor anyway. If 8 or 10 or 50 threads read from one file, it won't be any faster, because the data doesn't come out any faster than the hard drive delivers it. So I have this main thread reading the data and writing the data, and I don't think it is a bottleneck. HDF5 is a very fast way to read and write data because it uses a binary format, so typically the hard drive is the limit. So this is not a problem.

OK, and now how the communication works: I use queues. Whenever you use threads, try to use queues for communication between the workers and the main thread. This has a big advantage: if you later want to go to multiprocessing, you can typically move your code over, because multiprocessing uses a very similar model with queues for communication. So it makes your life much easier. Whenever you can, avoid shared data structures where one thread puts something in and another takes something out; use a queue, and getting the synchronisation right is somebody else's problem. So the boss creates the workers, typically
as many as there are CPUs, but you can use more, as I tried. Then I start the threads, feeding in the initial parameters. The important thing is that the worker does not have a return value; rather, it puts its output in the result queue. That is how it looks. This is the worker again, and I show you only the parts that are relevant for this. I get a queue; every thread gets the queue. Then see the run method again. The important thing here is that it reads the result, you see these are the column names, using the pandas CSV reader, which is very convenient. And then comes the important line: I put my result into the queue. I put in an ID, since every thread gets a new ID so I can tell them apart, I just count up a number, and I put in my output, which is a pandas DataFrame. And that's it. So there is no return; I just put it there and I am done. That is how this communication works.

And this is the main thread. The main thread makes one queue; in this case I use one queue. You can use different models, you could make a different queue for each thread, that is also possible, but I use one queue. And then I have a big loop here.
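The worker described above can be sketched as follows. This is an illustration under stated assumptions, not the speaker's code: the real input is an XML file plus time series and the real model is a closed-source executable, so `INPUT_TEMPLATE` and `FAKE_MODEL` are hypothetical stand-ins, the model is simulated by a tiny Python script run through `sys.executable`, and the standard-library `csv` module replaces the pandas reader to keep the example self-contained.

```python
import csv
import queue
import subprocess
import sys
import tempfile
import threading
from pathlib import Path
from string import Template

# Hypothetical stand-ins for the real XML template and external executable.
INPUT_TEMPLATE = Template("catchment,$catchment_id\nfactor,$factor\n")
FAKE_MODEL = """
import sys
with open(sys.argv[1]) as fobj:
    params = dict(line.strip().split(",") for line in fobj)
with open(sys.argv[2], "w") as fobj:
    fobj.write("step,load\\n0," + params["factor"] + "\\n")
"""


class Worker(threading.Thread):
    """One thread per simulation; the result goes into a queue."""

    def __init__(self, sim_id, params, result_queue, work_dir):
        super().__init__()
        self.sim_id = sim_id
        self.params = params
        self.result_queue = result_queue
        self.input_path = Path(work_dir) / f"input_{sim_id}.txt"
        self.output_path = Path(work_dir) / f"output_{sim_id}.csv"

    def generate_input(self):
        # Fill the template with the parameters of this simulation.
        self.input_path.write_text(INPUT_TEMPLATE.substitute(self.params))

    def run_model(self):
        # Waiting for the child process blocks without holding the GIL,
        # so other threads can proceed in the meantime.
        subprocess.run(
            [sys.executable, "-c", FAKE_MODEL,
             str(self.input_path), str(self.output_path)],
            check=True,
        )

    def read_output(self):
        with open(self.output_path, newline="") as fobj:
            return list(csv.DictReader(fobj))

    def run(self):
        # run() is the only method executed in the new thread.
        self.generate_input()
        self.run_model()
        # No return value: hand (id, result) back through the queue.
        self.result_queue.put((self.sim_id, self.read_output()))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        results = queue.Queue()
        worker = Worker(1, {"catchment_id": 1, "factor": 0.5}, results, tmp)
        worker.start()
        worker.join()
        print(results.get())
```

Note that if anything in `run` raises an exception, nothing is put into the queue, so a consumer that waits with a plain `get()` would block; a production version would need to handle that case.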
It is a bit long, but I will go through it. You see I have IDs: I make as many IDs as I need simulations. I try to get the next ID, and after a while all IDs will be used up and I get a StopIteration exception. Then I break out of the loop and stop the whole thing. Later on we will see another important part.

Here I make an instance of the worker and give it the queue; that is where the worker can put its result. Then I do a few checks down here, and then, the important part, you see the get on the queue, which gives me the result. The queue actually waits until there is something in it, and that is what I am doing: as soon as there is something in the queue, I get it out and get the next result. Since I attached the ID to each of them and the worker gives back the ID, I know which result it is, so I use this ID. And that is pretty much it.
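The loop just described can be sketched as below. The stub `Worker` and the function `run_all` with its `max_workers` limit are assumptions made for this illustration; the talk does not show exactly which checks the real loop performs. The sketch also drains the results that are still in flight once the IDs run out and joins the threads at the end.

```python
import queue
import threading


class Worker(threading.Thread):
    """Stub worker; a real one would run the external model."""

    def __init__(self, sim_id, result_queue):
        super().__init__()
        self.sim_id = sim_id
        self.result_queue = result_queue

    def run(self):
        self.result_queue.put((self.sim_id, self.sim_id ** 2))


def run_all(n_simulations, max_workers=50):
    ids = iter(range(n_simulations))
    result_queue = queue.Queue()
    results = {}
    workers = []
    active = 0
    while True:
        # Start new workers until the limit is reached or the ids run out.
        try:
            while active < max_workers:
                sim_id = next(ids)
                worker = Worker(sim_id, result_queue)
                worker.start()
                workers.append(worker)
                active += 1
        except StopIteration:
            break
        # Block until one worker has delivered, which frees a slot.
        sim_id, result = result_queue.get()
        results[sim_id] = result
        active -= 1
    # Collect what is still in flight after the ids ran out.
    while active:
        sim_id, result = result_queue.get()
        results[sim_id] = result
        active -= 1
    for worker in workers:
        worker.join()
    return results
```

Because each result carries its ID, the order in which workers finish does not matter; `results` ends up keyed by simulation ID.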
Then here at the end I have to join the workers. I also have a second place where I get results: one get is in the main loop, and this one is just for the ones at the very end, when the threads get fewer and fewer. That is why I have a second get. But this is the communication: the put here puts the result in, and the main thread calls get and gets the result out. So there is no return value, just put and get. That is the importance of queues; a queue is actually the way to go here.

OK, now the gain: how did it work out, compared with doing the calculation with just one thread, one program doing it one after the other? The machine shows eight CPUs, and I keep all eight CPUs busy at around 90 percent, which is pretty good; you won't get much faster than that. And, amazingly, it actually scales up to 50 threads. At first I thought that with eight cores, maybe four or eight threads would be a good number, but I tried more, and it even got faster using more. The operating system distributes the external processes, and 95 percent or so of the time, or more, is spent in the external processes. So my program is actually doing very little computation compared with the external process; I just need to keep feeding them and get them running, and all the heavy lifting is done by the operating system. That is what you want with threads: you put in little effort and get other people to do the hard work. In this case it is the operating system that is scheduling all the threads. I ran this on a Windows machine, because the executable is a Windows program, but it actually also runs on Linux, because I can use Wine. It works with Wine because it is a simple program. OK, so that is what I did. There could be other ways of
solving this; threads are my approach. You can use concurrent.futures, which is part of Python 3. I didn't use it here, but you can use it. It gives you a high-level approach: it gives you a thread pool, and you don't have to handle the threads yourself. You could use multiprocessing, but I don't think that helps, since most of the time is spent in the external process: with multiprocessing I would start an external process from a process rather than from a thread, so it doesn't gain anything here. But eventually you could move it to something like a cluster. If you have a cluster and 10,000 independent processes, you could theoretically scale it across a cluster with a thousand cores or something like this, and you could use something like Pyro or some other technology for this. If you use queues you are pretty close; you might have to change the workflow a bit, but you could use such technologies. For our project, though, it was OK: two hours, on a single computer, and the scientists are very happy with the result. They are typically used to something running overnight, and everything that takes less than overnight is OK, fast enough.

Conclusions, the take-home message: use threads whenever you have input/output-bound problems. Most of the code looks very much like ordinary sequential code; there is nothing special about threads. There are only a few places where the threads and the queue appear; try to isolate them in separate methods if possible. And wherever possible, use queues for communication, which makes my life much simpler than using shared data structures. OK, thank you very much, and there are a few minutes for questions.
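For comparison, the concurrent.futures alternative mentioned in the talk might look like the following sketch. The speaker did not use this approach, and `simulate` is a hypothetical placeholder for the three per-simulation steps (write input files, run the external program, read its output).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def simulate(sim_id):
    # Placeholder for: write input files, run the external program,
    # read its output. Here it only returns a dummy value.
    return sim_id * 2


def run_all(sim_ids, max_workers=50):
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to the simulation id it belongs to.
        futures = {executor.submit(simulate, sim_id): sim_id
                   for sim_id in sim_ids}
        for future in as_completed(futures):
            # result() re-raises any exception raised in the worker.
            results[futures[future]] = future.result()
    return results
```

Here the pool reuses a fixed number of threads and the futures take over the role of the result queue, so exceptions in a worker surface in the main thread instead of leaving it waiting.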
Thanks for the talk. My question is about this: you mentioned that there might be some issues with [unclear] lines of your code. What wrong things can happen?

Yeah, the main thing is that if the subprocess doesn't finish, then the thread, this one worker, is blocked. That could be a problem here, so you need to have some timeout, which I don't have right now: if this one doesn't come back after two or three minutes, then you need to either restart it or at least raise a red flag. It works in my case. Every single one finished, so all the calculations finished, because they are very simple, not sophisticated, calculations; there are no numerical problems that can happen. So I don't have too much error handling around the subprocess. In terms of threads, the workers are totally independent and don't interfere with each other, so there shouldn't be a problem like a deadlock, because nobody is really waiting. The only one waiting is the main thread, for the worker to write a result, and the worker writes its result [unclear]. So in my case it didn't happen. I tried different [unclear] and it works. If it changed, you would notice: OK, out of 5 thousand, 50 didn't finish. If it is that many, I would look at what went wrong. Otherwise, if you have 18 thousand and you lose two or three, it might be OK. Good, any more questions?

My question is: how can you make it possible for multiple threads to access the same data? [unclear]

If you want to have multiple threads access the same data structure, it can be done. But because threads are not deterministic, you don't know at which time which thread is active [unclear]. Typically I would recommend trying to avoid it. It's possible, but it increases the level of complexity, possibly by an order of magnitude. Try to make it as
simple as possible. If you want to do this, for example with an array [unclear], have each thread work on a separate part of the data structure, if possible. Otherwise you need to lock: "I'm doing something now." And if you lock something, you have the potential for a deadlock. That is very difficult to debug, because it might never happen, or happen once in a thousand times, and it is hard because you can't reproduce it, and if you can't reproduce it you can't debug it. So try to avoid shared data structures if at all possible. And if you need something in parallel and you do some computation-heavy things, you might want to look at other technologies, like OpenMP or something like this, or other parallel technologies; then Python threads might not be the way to go. That's my recommendation.

My question is regarding the bottleneck: is there a specific reason why you use [unclear] instead of a normal [unclear]? Isn't the bottleneck writing to one file? [unclear]

Actually, HDF5 is very close to a database. It's designed for scientific data, and in a lot of benchmarks [unclear] HDF5 is faster than databases for reading and writing. But the thing is, if you look at it, 95 % of the time is spent in the external processes; the reading and writing is maybe a percent of the time. So even if I make it twice as fast, I go from one percent to half a percent. It doesn't make sense to spend time on that. At the moment most of the time is spent in the external process, which I cannot change anyway. The performance gain only comes because I don't wait for one to finish before starting the other; I run 50 of them in parallel, and that is where the time comes from. Writing and reading the file is not the bottleneck. A database also has to write to the hard drive eventually, so the physical writing would be the limiting factor. But in this case it doesn't really matter, because it is dominated by the external process by far. OK, any more
questions? There's one more here.

Hello, and thank you for the talk. [unclear] reading the file [unclear] at the same time [unclear]?

No. I have one big file with all the data, an HDF5 file, and only the main thread accesses that file. The workers never access this main file. They do generate intermediate files for the external program: there is an XML file and there are CSV files, generated by each worker, because the external program needs physical files. The generated input is pretty small, nine thousand lines or so. Then the external program runs and writes an output, and the worker reads the output, turns it into a data frame and hands the data frame back to the main thread, which eventually writes it into the big HDF5 file. I end up with an HDF5 file of 132 million lines (figure unclear in the recording), and then I need to rearrange them, which HDF5 does for me, because you can index HDF5 and it does all the sorting. So I don't have to hold everything in memory; I can index and sort it, and it takes about half an hour to sort those lines, which is pretty good.

Thank you.

OK, anyone else?

[unclear] I wanted to ask what you think about using greenlets instead of threads for the same problem.

Can you repeat that?

What do you think of using greenlets instead of threads?

I think there could be many approaches. You could use asynchronous programming if you want, and you can use greenlets. But if you use greenlets or async, you need to totally change the way you write the code. Here you can mainly use serial code, as you would write normal Python. There are only two things you have to do: inherit from Thread and write the run method, and have a queue to get the input. That's pretty much it. So it's a different thing. Asynchronous programming
can be much faster if you have a hundred thousand things going on at the same time; then threads are probably not the way to go. But here [unclear] I don't have more than 8 things going on at the same time, so threads are perfectly fine. Having a few thousand or ten thousand threads on these computers would be a problem. If you want to run a million, async can probably easily handle a million things at the same time, but that is a bit of a different task. So you could use greenlets, you could use asyncio; you can use many different approaches. It just depends on your preferences, on what you know already and on what's available. With threads you don't need an external library, and I think it is pretty easy to follow what happens if you have threads doing the work. But it's certainly possible.

OK, anyone else? No? OK, then thanks again [unclear].
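The code shown in the talk is not part of this transcript. The pattern described in the answers (inherit from Thread, write the run method, get input from a queue, hand results back through a second queue) can be sketched as follows, together with the subprocess timeout that the speaker says is still missing. This is an illustration under stated assumptions, not the speaker's implementation: `Worker`, `run_tasks`, `STOP` and the status strings are hypothetical names.

```python
import queue
import subprocess
import threading

STOP = None  # hypothetical sentinel that tells a worker to exit


class Worker(threading.Thread):
    """Takes (task_id, command) pairs from one queue, puts results on another."""

    def __init__(self, tasks, results, timeout):
        super().__init__()
        self.tasks = tasks
        self.results = results
        self.timeout = timeout  # seconds before a hanging process is killed

    def run(self):
        while True:
            task = self.tasks.get()
            if task is STOP:
                break
            task_id, cmd = task
            try:
                done = subprocess.run(cmd, capture_output=True, text=True,
                                      timeout=self.timeout)
                status = "ok" if done.returncode == 0 else "failed"
                self.results.put((task_id, status, done.stdout.strip()))
            except subprocess.TimeoutExpired:
                # subprocess.run has already killed the child; flag the task
                # instead of blocking this worker forever.
                self.results.put((task_id, "timeout", ""))
            except OSError as exc:
                # For example: the executable does not exist.
                self.results.put((task_id, "error", str(exc)))


def run_tasks(commands, n_workers=4, timeout=180):
    """Run {task_id: command} on worker threads; return {task_id: (status, output)}."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [Worker(tasks, results, timeout) for _ in range(n_workers)]
    for worker in workers:
        worker.start()
    for item in commands.items():
        tasks.put(item)
    for _ in workers:
        tasks.put(STOP)  # one sentinel per worker
    collected = {}
    for _ in range(len(commands)):
        # Only the main thread reads results, so no shared structure is locked.
        task_id, status, output = results.get()
        collected[task_id] = (status, output)
    for worker in workers:
        worker.join()
    return collected
```

Calling `run_tasks({"a": ["some_program", "input.xml"]}, n_workers=50)` would start 50 workers (the program name and file are placeholders). Tasks that come back with the status "timeout" or "error" are the ones to restart or flag, as suggested in the first answer.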