00:00

so don't think very much I have come into this thought I appreciate that I kind of this title there and is a very boring title and maybe you think gonna talk about how to output data to a spreadsheet with Python is not want to talk about and talk about how to build a spreadsheet application with Python how to build an alternative to accept essentially monomers Harry went to the hand was the agent like

00:27

voltage relics I'm not uh you know more 4 was practically so the followers on my website is a beta testing go don't come right about testing so that here and talk about the

00:39

magical Python spreadsheet as so as we do that anybody know through here no power spreadsheet works how a spreadsheet calculates you know

00:48

the functions you put into this you did anyone else good alright so you then can tell me whether I give a good description and everybody else my plan is to try and the making a spreadsheet is a little bit easier than you think we're trying to build 1 up step by by step so 1st I like to take you back to let's say 2004 it was a simpler time are there was no will face but just started out just going out of out of universities in america and MySpace was still rage I have a MySpace page Britney Spears is a just released test seminal comeback album toxic articles a great great year from the beginning of the North American ragged jungle Renaissance if you like for instance so in this time series of programs got together and they had this idea that everyone loves using spreadsheets but actually working with them so the stocks because you have to use the DA and when it be great if instead of having to use the VA when you wanna scripture spreadsheet you can use Python when anyway if there's a Python expression and so these guys together and had this crazy idea and wealth they built a Pythonic spreadsheet was a carrier and then for 5 years later I joined the company as we were just re implementing this out in a web based form coming out later conferences like the receivable when so we get a build then apply Python spreadsheet with our colleagues here back in in 2002 and this can be a web based tool and well actually take you step by step through how can we build a working spreadsheet with Python starting from scratch

02:23

and its chemical the like this so 1st of all we will assume that the jury is a solved problem and we can make a 2 dimensional grid like this little clever JavaScript with you allow the user to interact with the wood is going to build the back the engine for recalculating a spreadsheet user does so here's a spreadsheet I want you to do things like type into yes at the top with alive demos is going to go wrong and Monotype and

02:45

things like there's but let's say would take into and what typing and 3 5 OK so where my got so far is I've got a sort of a 2 dimensional grid and told the state-run unity of say OK well if anyone is 1 of the 3 is the string and 3 so will start

03:06

off mice easily that might find in a proposed to use something like this we have a dictionary is gonna be indexed on the tuple of

03:14

row number and column number and as they contain a certain object and that's an object is this 0 1 1 1 2 and 3 so a good or a member blogs guys guys guys because the problem is the grass that has lots of cheap and but I'm cheap 1 and

03:34

when you get alright so and that's a

03:36

pretty useless spreadsheet and then see if we can't make it do something better than that like over here when we get equals to plus 2 a spreadsheet allows use of mass OK so what do you think is afterward presented here I know it's a work of art and a diverse to so what's vessel too bad and notice what introduced here is that there's a difference between the formula the formulas equals 2 plus 2 and then the result of the value of the cell it used for so now I've introduced the cell is not just uh detects also has a distinction between former and value

04:13

and that might look a little bit like this income the and have a cell class is gonna have a formula value which we initialize the semantical magical undefined special variable I and we want to calculate what she we just go find all of the cells in the worksheet that the back slides azimuth because that's really through all the cells in which you me as the heart does this double equals in that case and is a former i have to some special otherwise the value is just what the user entered so what's special thing can I do to get like

04:42

2 plus 2 to 10 into 4 I can basically just to the valve so the formula romances school

04:52

that a 2 plus 2 is intended for use in article evil encourage you all statements the less thing about pike and they never go wrong do they

04:59

are get now they want to keep people who would like to change the formula

05:03

put into the cell any suggestions yes literally and I could do that yes you can and put in a string how they do this and wants to see this right the error all handmade for led to somehow we have to handle this we

05:21

have to notice when the user does something stupid we have to catch some errors may be given nice to trace back and little warning there's the would be wiser to have really that OK well we

05:31

put tri-accept classic clinical e-mail so formula and then when we catch exception we again populate that error on the cell instead of calculating its new about fair enough OK well

05:42

separated well and I've got a spreadsheet is basically a two-dimensional calculator no itself and talk to each other this is still a spreadsheet is still no better than a calculator and so it really would do that is maybe be able to refer to other cells in the spreadsheet right in my things I want you to go something like a 1 plus 2 and think this is a way yes or and was 2 OK so what we've

06:09

done that we've taken something that looks like a 1 plus the 2 and we need to somehow turned into something Python's can understand and so in a way

06:18

we're taking a 1 a 2 and we've got a worksheet objectives a dictionary and a contains all of the other cell objects and basically what translates the string a 1 in the string a 2 2 becomes invalid Python so I nice 1081 into watching 1 1 dollar value in which you want to don't value and then I could call evil that and so the way we do this we need to transform things that like Excel formula things includes cell references into things that look like the a python can understand we've already got all worksheet object to review no refer itself doesn't make sense of all right so what we can do that OK fine we have no formal sector for a formula and we as a haven starts with an equal the the user's input and otherwise and you know even in the actually that you might get a full neurosyphilis a syntax error in the formulation pop back their arms and meanwhile you can transform uses formula into a Python 1 that makes sense now would you like to see some of the magic of to Python formula enthusiastic yes but I this is the 1st little bit of a recursive fundamental represented using some test right through idea say equals 1 to turn into 1 equals 1 prostitution times 1 plus 2 equals anyone should become worksheet 1 1 dollar value and then you have like you can have crazy form in your things you can have X times a ones that x in range 5 that is a valid formulae and enter into this Pythonic spreadsheets and that's it turned into this so single happened I'm not gonna go into the details of parser parser parses is the special 1 that knows how to understand excellent Python but our but you know you're going to have some workers of finding and look at a node you say hey if it's a cell range we rewrite that if it's a cell reference rewrite that and we call the parsing we write

08:14

function on each of the nodes inside so if if you got a cell reference of so many books and children in that's the 1st the recursive from past things annuity ones and and you've transforms and they wanted to use into into valid and OK so that makes sense but we already finished that job case we've got because they wanted to know what if we have something

08:38

and universal version of this 1st of I what will

08:43

understand this 1st instead

08:45

of OK so that equals a 1 plus a 2 and that's fine no if over here I have another 1 which says i equals B 3 well known I can't just evaluate these cells in any old order because before I can calculate this what I need to know that depends on this 1 and this 1 depends on these 2 so introduces a kind of dependency graphs that's the

09:09

fundamental structure underlies a spreadsheet and have a series of cell references cells point to each other and there's the dependencies in order to actually do my calculation my spreadsheet and

09:18

you need to know what that graph looks like so I'm calculating things that have no dependencies 1st and I can things and amendment project things among them and I do in the same order and so we had a little bit recursion our parser and other real recursive from history begins recursion is the only truly fun thing about programming and we get something that like that composite tendencies out reviews of positive no recognizing one's anyway so we can also say OK that 1 I can find for any cell formula what other cells that depend on so I can have no function to calculate a cell and you go OK you've ourselves Python formula and now I'm going to build dependency graph and this against a wall or reminding him to re-evaluate my cells in if I build graph finally there's really us and want to find all the leaves in a graph I start with the leaves problem officer of q and each time I calculate a particular cell I can then remove it from its parents and does that make sense so we going to assay that are but when you calculate the cell the things that depend on it doesn't depend on anymore you remove yourself from the parents that means that we don't have a graph that was a one-way graph we knew what we sell depended on but now we know we need to know who and so we have a cells children and we also need show the cells parents so we need an algorithm for parsing a one-way graph and turning into a two-way graph and then see I can explain that so build on the graph we generate a cell subgraph for each of the cells and we are passing these items in the 1 we keep track of the current graph which is the two-way graph instead of a one-way graph we keep track of the current location we keep track of what

11:01

things we completed already so like any recursive algorithm the very 1st thing we think about is what's the exit condition it's a condition is the thing that you come across a node that you've already done in which case you can exit out

11:12

fantastic next we look at ourselves and we know this children also again and then into the graph and then for each of those children we recursively call the same function to go find you know do the same things we do children and recurse down into each of children once we've done that we've completed that cell and adding the children involves creating a node for the parent and freeze the children we created them we say that you are in the parent child relationship with it does that make sense so far where now that's quite a lot of hard work you have to wrap your head around it and pretty soon you'll be thinking hang on a minute and if I have a 1 depends only 2 native every 3 days we depend back on a 1 and I've got a circular dependency like can't quite do it like that I'm going to have to track the current path that I've taken through the graph as I'm recursing down into it and if I spot the complication that is already in the path I came to in the current stack of the recursion raises cycle error and that means that I need to catch the cycle errors when I make the recursive call down into the algorithm is looking at each some cells and so that I can be sure that I can also catch cycle errors still OK everyone yet another fantastic who knows a better way to do this tries not sure about that but you can do with thing called network X which is a network analysis package and you can get network x of wonderment 1 graph nodes give you back to graphs in like 2 lines of so on on a wasted effort and hard work on the part of the of the dirigible spreadsheet developers evidence character-building we enjoy recursion natives joyful so fantastic on alright so so far so good migrated now little

12:58

thing that can go OK the 3 a 1 a 2 a 3 now what if I want to go into my in a spreadsheet and actually start having

13:08

some proper Pythonic fine so if I want to define some custom functions and now start using maybe more in of Python in my actual spreadsheets authority to death through no of like say x and we return and x + 42 alright great it be lovely time they have to

13:29

use this function through inside of all but which we call it a spreadsheet identify use of 3 more interesting that could worked this or alright how you get Alex we've now got 2 sets

13:45

the user inputs the all the values that

13:47

will form as they put into the spreadsheet and we've got also some custom user codes they put on the right hand side sigh

13:55

are what we need to do then is we got 2 sets of evaluations 1st we need to evaluate the user code and 2nd we need to evaluate each the cell we do that is we start isolating or evil calls from the global context which is probably something we should on ages ago we call the values the code that's a populated through function in this context so context is just a dictionary or license namespaces adjust dictionaries and namespaces on the great and then applies that same context the developed in context of each 1 myself or

14:24

I am not custom functions but I'm sure I hear you ask what if we want to write a function that actually can access things that already in the spreadsheet and so

14:33

supposing wrong having a function through and I have a function that say and we're going and is gonna like some everything in Rome and I want to say OK we'll look at things that are already in the spreadsheet so I can't evaluate my user code before say I've already done a cycle of loading some of the constants in the spreadsheet but always the constants if they've got

14:56

find yes is a constant and need to evaluate its evaluate and before I read the but whole cells like evaluate some of these functions and then you have something like

15:06

that and at at the door all so I'm Motorola constant and then he vowed my user code so the user can have access to the constant I have a function that looks at what rate constant spreadsheet and then evaluate the formula but as I'm sure your wondering what if I wanted to write a custom function that can access the results of evaluating the cell will then I'm going to have to say maybe younger let the user input 2 types of user code 1 to be 1 after we load the constants but before we evaluated formally want to be run after we evaluate the formatting and sure enough that is what 1 can do so I didn't do something a little bit like that so you know about the constant function you got a and even all of the user code preform the evaluation and we evaluate formally and he values occurred post the evaluation who is still with me and all almost everyone says there are enough

16:04

and so here's the real as that would mean that I would have to put to little toad panels runs the right hand side and hopefully

16:12

this is a bit of presentation with something like a magical happens if we have a look at this carrier we're doing OK let's fill the context of the worksheet to it and we took all load constant function we call the evaluated formally function and we've got this sort of user code that's kind of like before and afterwards what if instead and you have a thing

16:32

always eventually but you can now if you put things into the tree former evaluation functions you can actually put formula into the spreadsheet and from the user code panel before the formula evaluated and this is a certain in this amazing patterns which is a sure-fire set of guns pointed at both feet at the same time it's brilliant project so what about the building right now and and then

16:58

what the user actually sees when they log into a dirt or spreadsheet and our sovereignty

17:02

114 and they see this the

17:07

user panel is pre-populated with a function called load constants which noticing is exactly the same name as the function I used in the actual evaluations spreadsheet and it's got a function called evaluated formally which you might remember with the same name as the function we gonna use in the real evaluation and so here we're going to do is I'm basically going to

17:30

turn the whole spreadsheet on its head and say that the spreadsheet is the use of and I have my load constants evaluated form my calculate function I take my context for the worksheet into that for the low constant function to the evaluated formally in context which takes a worksheet curries it into a normal evaluate function for value for many functions and then all I do is exactly user code and it's just your user code panel that lowers the constants and evaluates the formulae and from all of that and that means you totally crazy things in

18:04

your user code like you can evaluate formally multiple times or you can put a nested recursive calls to the spreadsheet itself you can make spreadsheets the call of a spreadsheet you can populate formulas time programmatically all that sort of find the spreadsheet is the user code the user code is the spreadsheet this is the most triphonic spreadsheet available at my telling you that you should use the spreadsheet absolutely not if you wanna work with spreadsheet-type data in Python just use IPython notebook compounders and this is just a little bit of fun I thought you might be interested in from the dirigible source code is all available on our get how that could have for such Python anywhere for sectors will

18:42

spreadsheets which events taking a look at it and now given you talk about it and hands up please if you found that easier to understand than you had thought it was going to be to understand how a spreadsheet wake hands up if you now understand how a spreadsheet works that is like a 90 % I'd say her and who thought that was easier than they thought it was gonna be before arrived into the room that they thought that they thought that the we evaluated within the effect that I just think that viruses held up a sign saying all you to think about that I that has a beautiful thought I was easier and simpler than about 1 person AAAI that's good enough there please hands up if you like January authority and if you thought that was like really confusing data as of if you think I should never do this talk again by implication that 2 people put in the hands of incidentally any program a working groups that considering I told submissions future conferences to deal with it and up which means about 200 people in think should be the talk again thank you very much everyone not fj

19:54

last fall minutes full question for I think you think about security yes we did so the uh obviously that's animated beings we're giving is out the random strings and instead saying I will be valuable I think there are also has the and so yes we have a source sandboxing model that allows you make sure initially users can only access to restricted part of our system and then basically the whole containerization story that you've had to death uh at this conference already out so but that's the story I have like code out all of the dirigible was published online we find in the history of you're interested and because it uses gender and easy to understand without security of yes you have the question in 1 walking what in data for the talk it's more common than the question of well to to issue is 1 of my colleagues the love with time with Excel and it could be some kind of killer application was serious because of there there is that we can work with parameters and this level but the during our recall from people really want to change single cells still among thousands cells because of some special things they just or whatever and if this is easily tractable and and expendable with with Python functions I think it's really it has potential for a really serious loss events which was there I emulated good especially wonderful tool wrote thesis about my muscles and and get a hold of instinctive two-dimensional I can see the numbers like changing did you represent relationships so wonderful stuff and it turns out doesn't want that we want to use a spreadsheet of itself and is you try and build a compact space elements like 0 I want every single shortcut key cell has to also work and you also get an image and it's quite a lot of work catching up like 20 years and so on have any IPython developers IPython over developers who want to integrate some sort of spreadsheet component and we happen to have point here and there is no basis in this and in end use they what time we have to finish iris given it to mess around 5 another question correct the idea of work someone radiation announces they have long before we can get maybe there's a question for a Python conference that did you think about implementing this injustice and of am so this trails that no but I'm thinking about it now and this is 1 interesting things about it is in the wrong things on the service so I remember when I read executed that we you to believes in the spreadsheet can secure various itself paralyzing that and then you can do things like recalculate spreadsheet across a cluster of machines because once you've got a leaf node that's totally independent from all the others and you can save so thinking like maybe massive parallelization interesting what what did you come into jobs that was the answer to all of you know and suggesting the existence and evaluating maybe that you have all the elements tyrannical locally and get on the browser here I show you could rewrite all jobs that's pretty easily run notes of Helsinki for us so that you see in fact even in the match and and talk the more Harry fast and the idea of an adult thanks very much like I