The Boring Python Office Talk
This is a modal window.
The media could not be loaded, either because the server or network failed or because the format is not supported.
Formal Metadata
Title |
| |
Subtitle |
| |
Title of Series | ||
Number of Parts | 132 | |
Author | ||
License | CC Attribution - NonCommercial - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this | |
Identifiers | 10.5446/44883 (DOI) | |
Publisher | ||
Release Date | ||
Language |
Content Metadata
Subject Area | ||
Genre | ||
Abstract |
|
00:00
Office suiteProbability density functionWordAutomationCodeTask (computing)Pointer (computer programming)Modul <Datentyp>Sample (statistics)Presentation of a groupPowerPointFisher informationComputer filePivot elementoutputAsynchronous Transfer ModeFingerprintTask (computing)Projective planeFisher informationCodeOffice suiteWordComputer fileDatabaseScripting languageSweep line algorithmPresentation of a groupFile formatModule (mathematics)Line (geometry)Information technology consultingStandard deviationNumberMultiplication signSlide ruleParameter (computer programming)Computer programmingPoint (geometry)Software testingSoftware frameworkBitGoodness of fitPower (physics)CalculationBoss CorporationMedical imagingSystem callMenu (computing)PhysicalismCodeSelf-organizationTouchscreenSource codeWeb applicationSpacetimeWeb crawlerExtension (kinesiology)Amsterdam Ordnance DatumOpen setSoftware engineeringData managementPixelTwitterControl flowInternetworkingMultiplicationDifferent (Kate Ryan album)Intelligent NetworkComputer animation
07:57
Probability density functionAsynchronous Transfer ModeMaxima and minimaFingerprintComputer-generated imageryTable (information)Computer filePivot elementInterior (topology)Rule of inferenceTransformation (genetics)Data typeDrum memoryPrice indexComa BerenicesComputer clusterManufacturing execution systemInclusion mapTrigonometric functionsWritingTraffic reportingFrame problemMassWorkstation <Musikinstrument>Asynchronous Transfer ModeObject (grammar)Computer fileBitMereologyCASE <Informatik>Pointer (computer programming)Process (computing)File formatBit rateLatent heatFactory (trading post)Goodness of fitRow (database)Different (Kate Ryan album)Category of beingChainNumberForm (programming)Function (mathematics)Software repositoryMaterialization (paranormal)Revision controlShared memoryPoint (geometry)CodePower (physics)Airy functionGame controllerData managementType theoryStandard deviationTable (information)CodeProjective planeCalculationView (database)Task (computing)Line (geometry)WordPivot elementPosition operatorMultiplication signServer (computing)Proof theoryMedical imagingMultiplicationTerm (mathematics)Probability density functionOffice suiteClique-widthMathematicsOvalData structureLipschitz-StetigkeitPiLibrary (computing)Data conversionReading (process)PixelSet (mathematics)Single-precision floating-point formatCoordinate systemArithmetic meanOpen setComputer animation
15:54
File formatForm (programming)Data typeTable (information)Conditional probabilityShape (magazine)Parity (mathematics)Computer filePresentation of a groupSlide ruleFree variables and bound variablesClique-widthCloud computingTelephone number mappingProbability density functionCommon Language InfrastructureContent (media)Web pageDigital filterFreewareAutomationProcess (computing)Bookmark (World Wide Web)Moment (mathematics)Computer fileCategory of beingRule of inferenceOrder (biology)PlotterMereologyNumberCASE <Informatik>Lipschitz-StetigkeitFisher informationLatent heatTerm (mathematics)File formatGraph coloringModule (mathematics)Form (programming)Pattern languageCondition numberTable (information)WritingFigurate numberObject (grammar)Line (geometry)SpreadsheetScaling (geometry)Template (C++)Different (Kate Ryan album)Web pageFunction (mathematics)Multiplication signBitGame controllerCodeAsynchronous Transfer ModeRow (database)Revision controlEmailSlide rulePhysical system1 (number)Position operatorComputer configurationProcess (computing)Functional (mathematics)Raw image formatElement (mathematics)Clique-widthMathematicsComputer fontProbability density functionSynchronizationProgrammschleifeMonster groupLimit (category theory)Office suiteSeries (mathematics)Right angleComputer animation
23:51
FreewareAutomationProbability density functionProcess (computing)Web pageTraffic reportingTable (information)Computer fileNeuroinformatikPixelTemplate (C++)Complex (psychology)Scripting languageSign (mathematics)Product (business)Web 2.0WordPivot elementMultiplication signSoftware maintenanceOpen setModule (mathematics)CASE <Informatik>SpreadsheetLevel (video gaming)Instance (computer science)Computer animation
Transcript: English(auto-generated)
00:05
Thank you everybody say Well, I am impressed. So it is Friday on the conference It is after the coffee break and you actually came as volunteers to talk. That's is boring now There may be some confusions in restroom. If you want to have a nap is over there
00:21
It's the one day before at one door before Yeah, why is a boring office talk? So well, she's been told a couple of years ago a switch to the dark side So I'm mostly to project management nowadays and there as you know Well, there's a lot of excellent PowerPoint involved and so I'm on my little quest to automate that so I looked for
00:40
Available modules because there are always well a couple of modules available in Python and in this space They're quite a lot and what I would like to do is first of all Well motivate you while boring things can be interesting and then to give you a let's say short introduction And what's available and how it could look like hoping that you will then start to do your own well boring stuff
01:03
With these modules and maybe with some insight from this talk So motivation some things should be boring. I like boring So I don't have to think at all about the well equipment in this and conference center I don't have to think about the plumbing at all. It just works always when I have to think about the plumbing
01:22
There's a problem so and I would like to have the same situation for Well office files because they are to a certain extent of plumbing in most organizations that I know they get sent around Somebody looks at them annotate them So well a way to communicate and I want to be make this boring. I don't want to think okay
01:42
I have to copy and paste a lot of files a lot of images a lot of pictures a lot of PowerPoint slides at Friday 8 p.m. Just before just before I hopefully can go into the weekend That's something that we should do automatically which Allows me to introduce my three characters. They're
02:04
Totally um, well archetypes. So we have the boss the boss as always once an Excel file Hopefully with nice numbers, but well in doubt an Excel file with the right numbers have will have to do then we have Well this little guy here, so he's working in office. He's the only one with a tie here
02:25
And he mostly works copy pasting combining doing Excel files so we learn what he does later and he has one co-worker and His co-worker is a Python known pragmatist. So maybe she studied physics learned about Python now started a job and
02:42
While he knows the all the files that has to be produced. She knows how to automate it well, and he knows that he wants a files and Again, what do I hope to accomplish with this talk? I'll have to warn you the code on these slides if you compare with what you've seen as conference
03:03
It's not pretty its API code and its API codes It is pretty closely aligned to how the file formats work So you will see lots of calls that well enter some data manipulate some data do Highlighting this is not something that is really exciting. It's just useful. Well as a talk applied boring
03:25
so What you should get out of this talk is module names There are a lot of different modules that are available and a rough understanding how much effort you would need to invest to say Have a nicely highlighted Excel file have a PowerPoint with chart have a PDF
03:43
file that looks like your own PowerPoint file and With that you can maybe just convince People at your workplace to let you play with pandas because you can do calculations in a repeatable way and then create some Let's say business conformed fights from that that your boss can then take and add it around
04:06
So ultimately what I want you to see and to do is when you maybe sit down on Monday when you Talk to somebody and say okay. I could send you an Excel file and Whether with a CSV file, you know You always have to think on how to really open that
04:21
next of all is something that you know, and I can even do a PowerPoint for you that Once you have automated it will be easy with Python On all the code which is not much is on github I will also there's also the slides on github right now as soon as I have figured out how to upload the slides for the
04:43
Conference you'll also have some there So you can take this and hopefully well find something interesting Another word about the philosophy about how I do things for this talk We have learned a lot about good and best practices in
05:03
using Python This is I tend to follow especially when I start more or less a quick and dirty approach So if you think about Automating things in your work, you may be if you have a software engineering background. You may be tempted to think okay, I Will need a full testing framework
05:21
I will need to have this in Docker and it should be a web application and there some extra parameters Personally for things like that. I would advise against this What you're trying to do is just save you some work to get started Especially this is a newbie talk if you haven't done any Large Python and programs before I think this is a very good way to get started because you can do small things
05:43
And then you can check whether they are correct And just to see well Basic idea. So what I do for example is just to with exactly these API's. It's just to combine some time sheets So we have multiple consultants working on a project that each have the time sheets I basically copy them together and see whether the numbers fit as I'm put I'll put them in a different format
06:04
It is not especially challenging it's I think 50 lines of code The good thing is I don't have to think about what I've done every month I just want the same script and things are pretty much. Okay. So what do we want to do?
06:23
basically have just three tasks in this talk First of all, we have multiple Excel files the Excel files here stand ins for any data You might accomplish it might be a CSV file. You may read something from the database. You may have your little web crawler that
06:40
Checks pricing information on the internet. So you have multiple data sources that you want to combine. That's fine then you want to Transform them a little bit and write them out to some other Excel file Okay, so far so good. You can do all this with the let's say pandas standard API
07:01
And then you want to make it Somewhat prettier some highlighting some specific extra features that you or your organization might like Then we go to them. I'll take more presentation focused stuff. We build some power points Once we have two power points, we will return those into PDFs
07:21
That's it. It's the interesting thing about this particular task that there are many many good tools available So the personal favorite I think of all its pandas Reads almost everything it does almost all kinds of calculations and it exports also quite a lot. It has a nice
07:42
Excel export for example Then you have with Excel writer and open pixel to API are two modules that are very good at working with Excel files This one hence the name M can only write them. This is the one that we are going to use here Mostly for let's say taste reasons. I like the RP API slightly better
08:05
Open pixel can also read excel file so you can read something in change just a few numbers write it out again For PDFs, so we have PDF read write which we just used to combine some PDFs. There is an
08:22
Gigantic number of PDFs library available for Python depending on what you want to do. So you'll probably need to invest some time What is quite nice but quite big if you want to do custom reporting is rapid lab again something you won't see here, but just to give you a point of this is well, if you want to do PDFs with Python and
08:44
Want them to be more complex and want to see here, that's fun um, I will use slipper office basically in headless mode to do some file conversations and Yes, then we have Python PowerPoint Pbtx which as the name suggests does Pbtx files
09:04
Okay The overall structure is if you look at the GitHub repo, it's just the main method. So we read data we do the extra forms then we do some nice output then we output the
09:21
Powerpoints and then we do as a PDFs So if you check out the codes that should basically all work It should have a couple of files. If not, let me know in which case most likely something went wrong with the share Okay, what is our task These are extremely made up and boring files. So the idea is we have
09:46
some project team this project team has Expenses for materials. They want to build a mass station and they have Their working times and well, everybody has to be paid so they have some expense per hour
10:01
So these are three files that you've got maybe from your accounting department maybe from your project team themselves and Now you want to combine them and do some reporting because you want to know what actually my expenses So first of all you work with pandas to load in all these files so you have three data frames and
10:22
Well, we're not really much to see here so with this excel files Each of these excel files only has one worksheet Okay, now you have three data Then you want to do something that you would normally done by copy pasting an excel So if you work with actual well, we can do it who loves pivot tables and Excel
10:47
Okay, that's we have four people who uses pivot tables and apple somewhat grumbling Okay, so I will assume from that that you would enjoy to use Python slightly more and well there is
11:01
practically So what we do is the same thing that we most likely do in an excel file and if you can't follow this I Don't think that you should be concerned just know that everything that you could do in Actually, you can also do in pandas and Thankfully, especially from the empire data people. There are a lot of very good and pandas tutorials available
11:24
So I just walk you through what we are doing here so that you see the AP is quite well beginner friendly And the documentation is excellent. So whatever you want to do, I think you can google to it yourself to success okay, so
11:43
What we're basically want to do is we want to combine these two files So We update we just Start from some I'm starting in some in town. We calculate overall cost by rate
12:02
So we combine two or we combine the rate and the hours per Day table. This is mainly this part here Then we add some extra column just to say what coast type and cost type that would be and then we combine them So this is I think a pretty standard copy and paste job and
12:23
As you can see, it's not so much code So if you wrote this not for demonstration purposes, it's most likely the same code up here into that copier So maybe ten lines of code, so that's nice We have to calculation and the calculation is now copy and paste proof you if you have to come have to do this multiple
12:40
Times you can more do it multiple times can even do it on a server. I think it's nicer than doing manual So this is when we end up it with so we have Lifting cost positions. Where does it come from? Is it working time or are these expensive who is responsible for it now? We have well two projects teamed in a mass colony. So it's two rather
13:02
Let's say ambitious things and we want to output this to an Excel file So this is almost the unpunded standard way of doing it So in pandas, you can do two things. You can either go to Excel and write out a single Excel file or you can
13:20
Initiate your own Excel writer and the actual writer will make it possible to have slightly more control particularly here you will Create Different worksheets So these three worksheets all going to the writer and you can give them a cheat name and now you have
13:40
Some control over your Excel, but it's basically still all data. So I Think it's not really management friendly quite now So first of all, what we want to do is we want to have some Introduction a sheet into our Excel. We want a company logo version number and maybe some formatting and
14:03
to do this You take a workbook Object you can get the workbook object from the writer object and the workbook we presents. Well, basically your Excel sheet And you can then define with Excel writer some formats And there's a large number of different properties that you may want to use in this particular case
14:24
I just want to say I want to sing in blue and slightly bold just ends up here So I have now created a new format and now I write it. This zero zero means Row zero and line zero the API also has these
14:43
Exospecific and coordinate. So if you want to go a one is it where you can also do this Or write my text and bold in this case is this Format object so write this out. I can also go unformatted. So just over a costing is here and
15:01
I can play around a little bit with the column width so set column Takes which column it should apply to so first column to first column. We could also go multiple columns and says how Which which what do I want to have in terms of characters?
15:21
Now write this out. So I have my first formatting and I can include an image well, and then conclude typos, but That's not the same Okay, so what we have done so far we have output an Excel straight from pandas and we have created a new cheat Into in this Excel
15:44
Okay, but Well, what do we want to do if we need slightly more control? So for example If my data is not in a pandas data set or if I have some other data that I want Mike to add in a arbitrary position Then you have two methods. We see the first one here. Well, you have multiple but you can write whole rows
16:03
Into the system. So what we do here is just we take our columns Add them here. So we have a header and then we add The raw information in the different rows So now I have control over how upset is So if you have something in a dict or wherever you got it from you don't need to
16:22
Assume that you do an expertise from pandas Now this Looked rather drap. So where we want to end up is something little bit nicer So what you you see here is more manual control over what you're doing. So first of all
16:41
We build a new worksheet and then we add two formats this time And you see the different properties that are available think bold. We already used we have different font sizes and different colors Then we want to write the column information out again, we apply one of these
17:02
Font information one of these styles and then the next thing that we want to do is to Change the column widths again. So this happens here Then we want to write out our data and this time we do it really data element by data elements that we don't use
17:21
The raw function book just go over all our data and what we want to accomplish is we want to change everything that is higher than the 75th percentile in the data in red. So we just use pandas to get this particular value And then we go over every row in our data
17:41
We look which column we are because the first column is data is a number the rest is text and if we are above the Percentile value we just apply red bold. So it's the other style that we defined Then it looks like this. So it
18:00
Gets slightly more well interesting. So you don't have these Desert of numbers anymore, but you can highlight numbers that are of specific interest Now those of you who like Excel too much like me Might notice okay, but there are conditional formats in Excel So I could just if I do this by hand I could do all that in Excel and in terms of highlighting I could maybe do it in as a table
18:24
There's also possible so You can just say okay, I want a specific number form it's like Excel wants them and define those and I Can define part of my Excel sheet as a table so
18:46
Okay, so in order to do that you just Go for this data. And then what you can also do is say I want Conditional highlighting conditional highlighting if you can know an Excel you work
19:01
You can put defines custom rules and there are quite a lot of complicated properties that you can use But my old-time favorite is just this sweet color scale. So it's gave basically gives you a traffic light And you can apply those again to a specific part in your spreadsheet in this case it's the same part that we used for the table and
19:23
You have a nice colorful Excel sheet Okay We'll have to speed up a little bit. So I'll just speak though You can also create charts charts are the most ugly part in the API and mostly because of this part here Because what you do is just here you add all the data to the table You need the data on the Excel sheet in order to create a chart excellence to find them somewhere
19:45
And then you create two serious. Each serious has a name in this case could be Well the expense information as a category with the person who is responsible for for the price and well
20:00
Once you have this you can just in create a chart. You could also go from a plot lip at this moment So it's a we already saw earlier that we could use pictures as you could the advantage of this one is that? It is an Excel chart. So it's a digital you can change it Okay, so much for Excel now, we'll get just go over the PowerPoint thing
20:23
the PowerPoint module as a thing is Wonderful. So basically everything you can do by hand you can do with a module for Excel There are still some limitations. So I didn't find a way to do pivot tables in the form of that X will expect them So The thing about PowerPoint is one M difference
20:44
You should probably prepare some template that you want to change because you have a lot of formating information that is will not necessarily Be required for the extra file and then you can just go and fill out placeholders and Write out well Ugly tables in this case on purpose that you see that I did something myself
21:03
so that I'm not it what I'm not a designer and Same thing you can add pictures to it You can add tables to it here. You will see that we use this module pandas to PowerPoint. It's I'm not pip installable, but you'll find it on github. It's a couple of hundred lines
21:22
and Again, you can add charts. It's basically well the RP is Close to it There are a lot of magic numbers in here and you're probably best of starting to copy the code and then change it So it's not the nicest code, but it is useful So you I want again get a native PowerPoint object
21:44
One things I didn't figure out so far is how to take the charts out of my Excel into my PowerPoint Like I would to do with copy and pasting Anybody has a hint there that will be helpful Okay Now we have also the PowerPoint If you want to change this into PDF, you need something that's able to change and to render powerpoints. I
22:05
Didn't find a pattern module to do that. Luckily. There's LibreOffice and LibreOffice has in command line mode So what we can do is just automate the command line mode Some of you may be familiar with that process. So what we do here is just build a command line Look for the executable at some options
22:23
These are ones other ones that are interesting and then well execute it and at least on the Mac The version that comes out of the PowerPoint in the version isn't the PDF look 90% the same so that's quite practical what you can also do this PDF read-write
22:43
It's just read an existing PDF file, so let's say you have a title page an outro and some pages that you build so you just read those in And then you create a PDF writer Then you can just say take this page edit to the output to take this page edit to the output and many things more
23:06
Okay, and that's basically all we've done So we have now seen that it is possible to create nice looking Excel files Some people like this we have seen that it is possible to create PowerPoint slides so my personal dreams in our you is that in some kind of the future will be able to create
23:24
Gun charts from Excel files, that would be great So you're working on that and you can even create and create PDFs for archiving or whatever you want to do from those files There are some things that you can look into learn more
23:41
Especially if you are new with Python this book is nice It's nicely written and the HTML version is free If you want to deep down have a deep dive into the wonderful world of PDFs Then you could also do this on the report lab page. I think this is the Sign that I should shut up now. Yeah, and the documentation for each of these modules is excellent
24:05
I mean you in excellent, so the people who did those modules. It's it's almost heroic It's not really the nicest work to do that, but they did it really well Okay
24:48
Yeah Honestly honestly no and so I think what you'd like things you can do it via come
25:03
I think there's also now a product I think it's Excel wings where you basically embed Python into Excel This is also available for the Mac and the reason that I did is that way that I did is there's us I just wanted to have a script that runs without opening Excel because it for me that makes it debugging so much easier
25:27
Pivot tables will be preserved in open pixel So if you have existing Complex Excel files with lots of pivot tables and you just want to change the data use open pixel. Yes
25:41
There are some things that you have to keep in mind so they it changes it and the file This for example means that if you have some computations in the file that Excel would trigger Automatically once you have for example included a pivot table. You will need to trigger in Excel again, so you cannot Assume that everything that would happen in Excel will also happen if you create the file outside of Excel
26:06
That makes things you can Yeah, you can the question is and does Excel do it so with open pixel Yeah, you can change the data and you mean you can force the pivot tables to update. Okay, and
26:20
you should be able to copy and paste a Excel chart into pivot into PowerPoint. Okay, in this case we can talk about this because this would be extremely interesting It's not directly possible, but you should be able to take the spreadsheet drawing dump the XML and Attach it into a PowerPoint file. It's not that difficult. Okay, that's slightly lower level
26:44
Well, I went with this if you want to work with templates It's it's it should all be doable Okay, I'm the open pixel maintainer. So I know this stuff. Thank you. We have time for one more question. Sorry
27:01
Someone here for the question Was basically also the same question about how to Interact it with an open instance of excellent PowerPoint. And yes, there is this open Excel wings Which does this for Excel and you always can do it via comm but that's quite cumbersome
27:24
I'm not aware of anything Which can do the same with PowerPoint for example, and if someone knows something just talk to me And if you have any experience with with words and templates
27:45
and if you put text in it if The text is holding the template styles and also the the talk From heading if you some experience with it that it does it work well with Python
28:02
Sorry hardly did any work anything with word files Okay, I want you to think again seven for a good talk