
Test-driven code search and reuse coming to Python with pytest-nodev

Video in TIB AV-Portal: Test-driven code search and reuse coming to Python with pytest-nodev

Formal Metadata

Test-driven code search and reuse coming to Python with pytest-nodev
CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Alessandro Amici - Test-driven code search and reuse coming to Python with pytest-nodev. We will present the test-driven reuse (TDR) development strategy, a natural extension of test-driven development (TDD), and how to execute it with pytest-nodev, an open-source test-driven search engine for Python code. When developing new functionality, developers spend significant effort searching for code to reuse, mainly via keyword-based searches, e.g. on Stack Overflow and Google. Keyword-based search is effective in finding code that is explicitly designed and documented to be reused, e.g. libraries and frameworks, but it typically fails to identify reusable functions and classes in the large corpus of auxiliary code of software projects. TDR aims to address the limits of keyword-based search with test-driven code search, which focuses instead on code behaviour and semantics. Developing a new feature in TDR starts with the developer writing the tests that will validate candidate implementations of the desired functionality. Before any functional code is written, the tests are run against all functions and classes of the available projects. Any code passing the tests is presented to the developer as a candidate implementation of the target feature. pytest-nodev and the other nodev tools that help implement TDR for Python are newer than their Java counterparts; in spite of that, we will present several applications of the technique to increasingly complex examples.
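The workflow described in the abstract can be sketched without pytest-nodev itself. What follows is a minimal illustration under stated assumptions, not the plugin's implementation: the names `spec` and `search` are hypothetical, and the search space is deliberately restricted to the public callables of the standard-library module `posixpath`, which only read from the system when called with a string. The test plays the role of the search query, and every callable that passes it is reported as a candidate.

```python
import posixpath


def spec(candidate):
    """The test that acts as the search query (hypothetical example)."""
    assert candidate("a/b/c.txt") == "c.txt"
    assert candidate("/usr/bin/env") == "env"


def search(module, test):
    """Run `test` against every public callable of `module`.

    Returns the names of the callables that pass. Illustration only:
    this is not how pytest-nodev collects or runs its candidates.
    """
    passed = []
    for name, obj in sorted(vars(module).items()):
        if name.startswith("_") or not callable(obj):
            continue
        try:
            test(obj)
        except Exception:
            # Most candidates fail or raise: that is the expected outcome.
            continue
        passed.append(name)
    return passed


if __name__ == "__main__":
    print(search(posixpath, spec))
```

With this query, `basename` is among the reported candidates. Running arbitrary callables with arbitrary arguments is only reasonable here because the search space is a single, well-known module.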
Welcome [...]. This is Alessandro Amici [...], and he will explain some stuff about tests.

So, all of this talk is about test-driven code search, and this is a rather new technique. Not so new, because someone already tried it 2 years ago with Java, but it's the first time that I see it with Python. It's pretty simple. What we have produced is a very basic search engine [...] that enables you to search for code inside your machine, in the packages that you have installed on your local machine. The special thing about this test-driven search is that you use a test as part of the search query. You may also use some keywords to refine your search a little, but at the core, what you are looking for is described with a test, with a specification: something that tries to specify the feature you are looking for. We are not going too much into the details of how it is implemented. Once you run your search engine you will get some results: a list of functions or classes, or whatever object actually, that satisfy the specification. The name of the collection of tools is nodev [...]. There are a few other tools that I will show during the presentation. Since this is something new, at the beginning I organized the talk as [...], but then I completely rewrote it yesterday, because examples make people understand much faster. So, how many people here know pytest and pytest fixtures? [...] The basic
implementation of the idea is that the plugin provides a special fixture called `candidate`, and you need to use this fixture when you write a test that you want to use for the search. What will happen is that this fixture will effectively parametrize your test with all the possible objects that it manages to find in your environment. So if you install the plugin in your virtual environment, it will collect all the objects, all the callable objects, in the standard library and in all the packages that you have installed. Then obviously, since this is a parametrized test, the test will be run thousands of times, once for every object, and the object will be passed by reference into the `candidate` variable. So you basically use this candidate as if it were the function that you are looking for, and then the search engine will just tell you which functions, classes, or objects in general actually behave exactly as you intend.

So let's do our first search. You want to search for a function that, given the name of an executable, returns the path to it. This is not just a nice example; it is actually the first real case that we had. We had exactly this need, we started searching for it on the web, we didn't like the results, and we said: OK, now this is the perfect test case, because nothing is easier than writing a test for it, and maybe there is something somewhere in the environment that already does it. Obviously you could just write something like a subprocess call to `which` and then parse the result, except that that would be hacky and would not work on Windows, so it's not the best. So what have I done? I write a standard test function for pytest, I use the `candidate` fixture, and then, just to have the test more readable, I
basically rename the candidate to `which`, which is more or less the idea: I want something that works like the `which` command. Then I say what I expect: if I pass `sh`, I want the function I'm looking for to return `/bin/sh`, and if I pass `env` it should return `/usr/bin/env`. These two are very common Unix commands, and they are among the most stable, because other commands can be in `/usr/bin` or in `/bin` [...], but these two are more constant. So I write the test in a file and then I just run pytest as usual; I just need to add `--candidates-from-all`. This means that the test function with the `candidate` fixture will be parametrized with everything it finds in my environment. So it goes through the collection, and I usually get something like 5,000 or 6,000 objects. It depends very much on how many packages you have installed; if you have many, it's easy to get to 30,000 or 50,000. Then it just runs for a while, as we will see in a minute. Since the test is expected to fail, pytest prints a small x for an expected failure, which is what you get most of the time. Then you have capital X's, which mean that the test unexpectedly passed. At the end of the run you have many, many x's, and what you want to know is whether there is a result. In this case we found three objects that pass [...]. Let's see how it works and how much time it takes. I'm not running this on my desktop; I'm running the search inside a Docker container, because when you throw random arguments at
random functions, anything can happen. If you try to do it on your machine you will find files with pretty strange names, [...] connections opened, or whatever. So you prefer to do it in a container and, at the end of the run, throw it away. So what happens is that right now it's collecting all the objects. Now I have slightly fewer objects than when I did the test, because I have to blacklist objects all the time, because they crash your environment; some of them, for example, open up the browser [...], and this is not what I want. Now the test is running on all the functions. We see mostly small x's, which means that we did not find a match. Here we have one capital X, so we found at least one function that actually works. It takes approximately 60 seconds, and everything goes fine. Now we should also have some garbage on the screen, because since you are using
functions in unexpected ways, always calling them with random arguments, you end up discovering a lot of bugs in the packages that you have. In most cases what is printed out is an exception [...] printed to standard error. So this is what happens. Once it finishes you get the results: three objects that passed the test. Very easy, very basic. And what do we do now? Well, since I have a manageable number of results, I can just have a look and decide if this is really what I want. So, is this useful? `find_executable`: the name looks like what we are interested in, and it is inside the standard library, so it's very useful. Maybe I don't need to write any code for my `which` function, because I can just use this. When you look at the code, it does more than what I thought: it builds the path in an OS-independent way, and it has some Windows checks that I didn't even think were needed, because I don't usually use Windows, but they might be useful. Then it just tries to see if the file is there. It's not really the best: it only checks that it is a file, it doesn't check whether what it finds is executable. So it's not really perfect, but at least I have a template if I want to write my own. The pexpect `which`: I don't care too much about that, because I already have a function in the standard library, so I don't need to add other dependencies to my project. But then there is `shutil.which`, and this is even more standard: it is in the
standard library, and this is the code. If you look at the code, it's much, much more complex. It does a real access check, which means it checks that you can read the file and that you can execute it, and it does several other things that I would not have thought of; it would have taken me one year of production use to get them right. Unfortunately, if you go into the documentation you learn that this is Python 3 only, actually Python 3.3 and later. So if your code has to support Python 2 [...], `find_executable` from the standard library is OK, or maybe you can just copy this code and get something better. If you are Python 3 only, you have the luxury of using `shutil.which`. How many of you already knew the `which` function that was added to solve this problem? OK, a few. Right, I mean, it's in the standard library, but I did not know that, and this was the fastest way [...] to look for it. OK, let's go
back. This is a very simple example, but it shows how things work. One of the points is that here the input and output of the function were really simple. When there is only one reasonable implementation, it's easy to write a test. But as soon as you look for more complex stuff, writing a test that doesn't depend on implementation details, that doesn't make too many assumptions about the representation, is more complicated. Actually, Python is a really great tool for writing tests that are not too tied to implementation details [...]. For example, using the `in` operator you are not forced to guess the exact structure of the result. In Python the `in` operator is extremely powerful, and a lot of classes work nicely with it. So instead of checking that the result of your function is a list and that the first element of the list is what you were looking for, you just use the `in` operator to see if somewhere inside your result there is what you expected. And then you may write specific helpers. In particular we wrote nodev.specs, which helps you go even deeper into the assertion that your result actually contains what you expected it to contain, in crazy ways.

So let's see how you would write a test that tries to be more independent from the implementation. Here I want to parse an RFC 3986 URL, which is also a real test [...]. OK, so I use the `candidate` fixture, and just for naming, to make it nicer, I give it a better name. Then every function we get will be passed the URL and is expected to return some kind of tokens. Then here I say that the scheme and the other parts that were in the URL are in the tokens. Since I already know that I would get a lot of
false positives, functions that just return a string containing the same thing as the input, I check that the return value of my function is not a string. [...] I really want these things to be tokens, so I don't want one string, I want some kind of list. So here is how it goes. This is the naive implementation, in the sense that I didn't use any special helper, except the standard `in` operator, which can be overloaded. Now, there are different command-line options that can be passed, and they are mostly needed to restrict the search space. If you already know that some part of your environment, some packages, are not useful, you want to restrict the search so that it goes faster. But if you use `--candidates-from-all`, it's the most powerful: it just searches everything and anything in your environment. So this is the one that I tested just before [...] on the other
side [...]. Since it takes a little bit of time to run, I will also start the second example now. It is the same parse function, but this is instead a second version of the test, using some of the advanced functionality. What the container helper does is take an object and wrap it in a proxy object that, when you use the `in` operator on it, tries really hard to see if the item that you are looking for is somewhere in the object. For example, it looks into the attributes of the object, and if the object is iterable it looks inside every item [...]. So it's extremely flexible. Let's
see if we manage to get a result [...]. So, on the screen
I have this nice output. [...] Before, there was some kind of race condition [...]. Now let's see what the results are. From the results, the first three [...] really look like false positives; that is, they are not trying to do anything with RFC URL parsing, they are just packaging somehow the string that you are giving them. But then you have [...], which looks very nice, and also `urlparse` [...]: functions that do URL parsing, both in a dedicated package and in the standard library. Now, what is interesting is that in both cases, both the URL parser inside the RFC-specific package and the one from the standard library, they don't return a list, they return custom classes. So how exactly does this work with my test? The point is that a lot of people are quite smart, and they provide some way to access stuff in an implementation-independent way. That is, both implementations [...] `__contains__` [...], so the test finds the pieces even though what they return is not a simple tuple but a class that behaves like a tuple. Very nice. So you use the standard library one for most of your needs, but if you need more features you look at the code and use the special package, which has more features; for example, it is able to recognize a user name [...]. Now there is something even more interesting in the other test, the one that uses the container proxy object to do the containment test. This time there is one more object that matches, and this is a class [...] that actually does the right thing but doesn't provide the `in` operator functionality. We managed to get it anyway because
the proxy tried hard to find the scheme and the path and the other parts inside the class. So this is a way to test results in an implementation-independent way. But then I also want to pass the arguments in an implementation-independent way. In this case what helps me is pytest itself, with its parametrize marker. For example, in this case I am looking for a function that just removes comments from a text stream, and the main point is: how do I represent the stream? This is my text, the content of my configuration file for example, and I want to remove this comment here. So how do I do it? I use the parametrize marker to parametrize the argument, so that I can say: OK, different functions may take this text with these comments in different shapes. I can pass it as it is; I can pass it as a list of individual lines; or I can pass it as a list of individual lines with line numbers [...]. In this case,
since I have a lot of parametrizations, the test would run not just 5,000 times but something like 20,000 times, so I prefer to restrict the search with an option that includes only functions whose name matches a regular expression. I want something that has to do with comments. It makes everything much, much faster. And here I find the `ignore_comments` function in pip, which is very good, because pip is something that I might already have as a dependency. It takes the stream in the third form, lines numbered 0, 1, 2 and so on, which was exactly the way I preferred. It could not work with the other forms, but this means that I don't even need to change my application to use it. The code of `ignore_comments` is very simple, and there is also a feature that skips the lines that are empty, while it keeps the line numbers [...].

Look also at the other function just below: it takes an options argument, which is a special class, and this class must have a `skip_requirements_regex` attribute, otherwise it fails. Even if I needed this, I would never ever have managed to pass the correct parameters, because these parameters are extremely tied to the implementation. So my tests are not coupled with the implementation [...], but I only find functions, callables, and classes that are good code, in the sense that they don't leak implementation details into their interface. If it had just taken `skip_requirements_regex` as a keyword, like what we have seen before, the function would have been just as useful and I would have been able to search for it and use it. So when you search you may
get only relevant results, which means great, just perfect, or you may have to refine your query. If you do not get any results at all, which happens very often, it means that your test is too strict, and you probably need to remove test cases and edge cases and just use a smaller number of normal cases. If you find a lot of results but they are not relevant, it means that your test is too weak, not strict enough, so you need to add more cases that describe your feature better, and probably also corner cases. If you go directly from no results at all to only irrelevant results, so that you never find anything useful, most probably you are looking for a function that is not in your environment.

Now, this is the basis of test-driven reuse, which is something that has been studied a little bit in the Java community [...]. Test-driven reuse is just like test-driven development: you start with your test. Maybe you try to write it in a more independent way than you would if you already knew which implementation you were going to write. Then you search. If no function passes your test, the test is written anyway and you go on developing from scratch. If any function passes, you have three options. You may just import it, which means you get the dependency; or you may fork it, that is, you take exactly the same code, check the license, and copy it into your project; or you may just have a look at it to see how many cases you didn't think of. Another use is test validation, which is a useful tool by itself: if you wrote a test that you think is a good test, and then you run a search with it and you find a couple of completely unrelated functions, it means that the test is too weak, because it gives false positives. So, in conclusion, future work. The main
pain point right now is performance. Then you may do other things, like extending the search space and making more tools, but then you get even more work to do. So: performance, performance, performance, parallelization, etc. It would be very nice if this was done not on your machine but on the web, so what we are trying to do with all these tools is to make a kind of search engine on the web. If you want to know when things are starting to roll, write an e-mail to me at the end; we are looking for people willing to test. Conclusions: if you start using it you will recognize much better what a good test is and what good code is, and you will tend, at least this is what we noticed, to write your code so that all the implementation details are as far away as possible, and the interface is as simple and as intuitive as possible. Thank you for your attention. OK, do you have questions?

Do you filter candidates already on, say, the number of arguments that can be passed and similar things? Because if a function doesn't take any arguments and you need something that takes one, it is not a candidate. [...]

Not right now, no. This is one of the reasons why a web search engine with a curated index of objects would be nice, but it's very difficult to do it on your machine. [...] You may know how many arguments a function takes, but not much more, because with duck typing you might not actually want to be too strict. So the idea behind having a web search engine is that you have a curated index of what kind of function each object is [...].

Is there anything for timing out functions?

Yes, every test has a timeout of one second, so a lot of the stuff that is tried out times out. [...] The only real problem is
when you call C extensions and they just [...] a long list or a long vector. [...] Other questions?

How do you deal with multi-argument functions, where I don't know what the order of the arguments is going to be? And what is the time complexity of that?

So, this is something I tried at the beginning: [...] permuting the positional arguments. I removed it; right now getting things correct is more important than having a larger search space. But as I was writing the slides I noticed that you can easily
parametrize the test yourself and just switch the arguments, so it can be done right now with pytest's parametrize. The complexity is very bad: if you have two arguments it is still fine, but it is factorial. With four arguments you already have a lot of orderings, so it's very easy to get into the hundreds of thousands of tests, even with really small environments. [...]
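The argument-order idea from the last answer can be sketched as follows. This is a minimal illustration and not a pytest-nodev feature: the helper name `first_passing_order` is hypothetical, and the two candidates are ordinary standard-library functions chosen because they expect their arguments in opposite orders.

```python
import itertools
import math
import re


def first_passing_order(candidate, args, expected):
    """Return the first ordering of `args` for which
    candidate(*ordering) == expected, or None if no ordering works.

    Hypothetical helper, for illustration only.
    """
    for ordering in itertools.permutations(args):
        try:
            if candidate(*ordering) == expected:
                return ordering
        except Exception:
            # A wrong ordering usually raises or returns something else.
            continue
    return None


# The query is written as (text, separator); re.split expects (pattern, string).
print(first_passing_order(str.split, ("a,b", ","), ["a", "b"]))
print(first_passing_order(re.split, ("a,b", ","), ["a", "b"]))

# The number of orderings to try grows factorially with the argument count.
print([math.factorial(n) for n in range(1, 6)])
```

Each candidate is called once per ordering, so a search over thousands of objects with four positional arguments multiplies the number of test runs by 24, which is the cost the answer refers to.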