
DjangoCon US 2018 - Lightning Talks Day 1

Formal Metadata

Title
DjangoCon US 2018 - Lightning Talks Day 1
Title of Series
Number of Parts
50
Author
Contributors
License
CC Attribution - ShareAlike 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this
Identifiers
Publisher
Release Date
Language

Content Metadata

Subject Area
Genre
Abstract
Lightning Talks Day 1
Heim, Sarah
Allen, Tim: Change just one thing
Michel, Raphael: Automatic Screenshooting with Python, Django, py.test, and Selenium
McLaughlin, Katie: That's not my emoji
Willison, Simon: Cryptozoology
Rees, Griffith: Collab Canvas
Durbin III, Ernest W.
Juvenal, Flávio: 1+1=1 or Record Deduplication with Python
Ximens, Filipe: Parks, Buildings and Patterns
Transcript: English (auto-generated)
I'm coming up here and I feel like I'm Mr. Swag here. I've got my new little cell phone holder and water and shades, so I'm stoked on all the free stuff. I don't know if a lot of you guys are here for the free stuff, but I am too. So next week I'm going home. I live here, so I'm kind of on vacation all the time, but next week I'm going home to Minnesota. And yeah, Skol, right? Anybody here from Minnesota? So next week, going home: my cousin is in stand-up and I want to do that with him, so this is my first kind of public speaking, a little practice. But meanwhile I'm a programmer, so I feel like I have some funny things to say about programming.

All right, so I'd like to thank Chloe, because as a programmer who's locked in a padded cell all day, it's nice to laugh, and I think we don't laugh enough as programmers. So here I am, trying to make us laugh. And you know, my boss shoves me away in that padded cell and says, "Go do XYZ." A little inside joke, because I just did a plot, an XYZ. I work at Scripps Institution of Oceanography. My goal: you know, save the oceans. So that's what I do on my day job, and meanwhile I crack myself up all the time, so I thought, hey, might as well give it a try, cracking some of y'all up.

So we're in the land of the hippies, a.k.a. San Diego, and we like labels. I actually don't have a button, because I don't really care for labels, but I respect other people's labels. Except for food; I'm always like, "Is this local, organic, shade-grown, fair trade?" You know, I like my food labels, right? Well, another thing: land of the one. We are one. We like to be one with each other, one with nature, one with ourselves. Let me tell you what I wish we were one with: time. There's just never enough time, right? And anybody program with time? Time zones. Time zones, oh, those are fun, aren't they? Cuz, oh man, there's more than just 24. And I'm talking to normal people, like, "Oh, I know a lot about time zones, man, there's more than 24. Some of them are 30 minutes or 45 minutes apart." And then there's daylight saving time. I am like, please, can we get rid of this? I've been looking up the history of this, because I'm like, this is just archaic, this needs to go. Programming, you're just like, oh my god, can we not just all be one with time? It's called UTC. You know, I wish. That'd just make my programming life so much easier.

So, new to social media: I just started my Twitter. Thank you. I started my Twitter today too, because I was like, it's in a spreadsheet, and I love to fill up spreadsheets; I've been doing spreadsheets since I was 10. So, Twitter. All right, been meaning to do this. I procrastinate sometimes; just a little push. All right, made a Twitter. Twitter is pico de loco. There's a one on there, cuz there was already a pico de loco. I just made an Instagram account like a month ago, only a few posts, a lot of cats, dogs, you know, and I skateboard. So anyways, it's me, right? Well, I was pico de loco. I do Twitter, there's already one, so I'm pico de loco one. Well, okay, all right, so get on Instagram, change my name now, and I am now pico de loco one. Cuz guess what? I am crazy. I was crazy once. They locked me in a room with bugs. Bugs make me crazy. I was crazy once. They locked me in a room with bugs. Oh yes, cuz programming bugs, programming bugs make me crazy, especially when you're all alone trying to figure out a bug. It makes me crazy. So I'm crazy, I embrace the crazy, I am proud of my crazy label. Some people think, oh, crazy's derogatory. No, I'm crazy and I'm proud.

So, oh, I got one minute. That's sick. I'm going fast cuz I talk fast, so try to listen fast, cuz I ain't slowing down. Can't stop, won't stop. Social media: I'm on social media, so I'm new to things. I got this 360. I want to get a YouTube; haven't done a YouTube video, going to. I see you. And focus, focus, focus, it's what I need to do. I have a little bit of ADD. I'm having a hard time right now staying at this microphone because I like to move. And my boss... meanwhile I'm just in an office all day trying to sit on that chair. I'm trying to be a good employee, to sit on that chair, sit all day. Sometimes you get tunnel vision, code mode, you're just going, going, going, going, going, right? Meanwhile, oh dang it, I gotta go to the bathroom. Boom: that's when you figure something out, when you've stepped away, right? You step away, you realize, I need to step away a little bit more often. You know, cuz we all need to step away.

Anyways, last thought, trying to save the oceans: sometimes there's labels. Respect those labels, including recycle labels. Some of y'all are using recycled cups or whatever. Hey, try to put it in a cup, try to put it in a bottle, and if it's a recycle, make sure it goes in the recycle. Thank you.

Thanks. Hi everybody, I'm Tim. I write code for the Wharton
School, various open source projects, and, most importantly, for myself. I help organize DjangoCon US here and also the Philadelphia Python Users Group back home. I started writing code when I was six years old, because my mom won a raffle, and have been doing it ever since and absolutely love it. I consider myself very lucky to get paid to do what I love, with colleagues and friends that are close enough to be considered like a second, non-DNA family.

So let me show you my GitHub chart from 2014. Does that look like the chart of somebody who likes to write code a lot? The truth is, in 2014 I wasn't doing what I loved. I had been slipping further and further away from doing what I loved for a long time. And, you know, while I still worked hard and had a lovely home, squaring who I thought I was and who I should be with how I acted was getting harder and harder every day, and looking in the mirror was getting harder and harder every day. It was because of this that I had to change just one thing.

So what happened here? For many, many years I had been trying to change just one thing on my own, and failing. Finally, in April, I asked for help with my alcohol addiction problem. In May I got out of rehab, and you can see a sudden uptick in activity there, and I've been clean and sober, one day at a time, ever since. So thank you. So what happened here, in July and August? I made a terrible, terrible, tragic mistake and reopened my World of Warcraft account.

So 2016 looks like it might be a bit of a lighter year, but I was contributing to several private repositories, and me and my friends from Wharton were also a little busy helping host this conference that you may have heard of, the year it was in Philadelphia. But when we take a look at 2017, this is where I really started to hit my stride, having gotten more involved with a bunch of open source projects, and from Wharton we also started to put some of the packages we had been developing internally on GitHub and PyPI. And if you take a look at 2018, it's up to... so the first year was 21 contributions in the year; 1,053 in the last year. So as you can see, now I'm pretty active again and doing what I love: writing code with a community of friends old and new. It's been such a wonderful way to connect, and just such a better life.

So what I just really want to do here is ask everybody: if there is one thing that you could change about yourself, one thing that you don't like, what would it be? And how much better could your life look if you could change just that one thing? It doesn't have to be an addiction like I had. Only you will really know what it is when you look in the mirror at night. But I want to encourage everybody here: if you do have that one thing that you want to change about yourself, it is worth doing, and it is worth asking for help for. I couldn't do it until I asked for help, and I've had support not just from the rehab center I went to and the recovery communities I've been in. My colleagues at work have been incredibly supportive. The Django community, which I have immersed myself in ever since, has been incredibly supportive. It's been an amazing amount of support everywhere I've looked, to try to keep this going one day at a time. And the same thing could happen for you, because if you need help to make that change, absolutely ask for it, because it really did save my life. And, you know, as Gandhi said, we must be the change we want to see in the world. So if you have a problem, sometimes changing one thing can change everything, and it really did for me. Thank you very much.

Hello everyone. I'm a software developer and
I write software, but occasionally I do not only need to write software; I also need to write documentation or put up a website for the software that I've written. Both things usually include screenshots, and doing screenshots is cumbersome. You want to do them at different screen resolutions, you might want to do them with different locales, and once you're done creating them, they are outdated and you have to start over again. So let's automate this, and let's automate it with tools that we already know, or that probably most of you have already seen.

The first tool that we want to use is Selenium. You might know Selenium as something to write front-end tests for your web application; basically, it's a way to remotely control your browser from a program. The next ingredient is Chrome headless. Chrome headless is a way to start Chrome without requiring a display or anything attached, so you can run it on a server without needing a full-blown desktop operating system. And then we use pytest. pytest is a test runner, but if you look at it closely, it's not only a test runner that makes it easy to define tests and run them in specific ways; basically, it's a way to run functions in a specific way. Usually you have a test that is just a Python function prefixed with test_ that checks for something, and then you run pytest and it finds all of those functions in your project, runs them, and informs you about the result. But pytest can do so much more: it has parameters for input, fixtures, and so on. Fixtures, for example, are things that should be there before the test is run.
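The slide discussed at this point is not captured in the transcript. As a rough illustration only, a pytest fixture and a test that requests it could look like the sketch below. `FakeSMTPConnection` and `make_connection` are hypothetical stand-ins (so that no mail server is needed); only `pytest.fixture` and `pytest.mark.parametrize` are real pytest APIs.

```python
import pytest


class FakeSMTPConnection:
    """Hypothetical stand-in for an SMTP connection; no network needed."""

    def __init__(self, host):
        self.host = host

    def ehlo(self):
        # Mimics the (status code, message) pair that smtplib returns;
        # 250 is the SMTP status code for success.
        return 250, f"{self.host} ready".encode()


def make_connection(host="mail.example.com"):
    """Plain helper so the setup logic can also be used outside pytest."""
    return FakeSMTPConnection(host)


@pytest.fixture
def smtp_connection():
    # pytest calls this before any test that lists `smtp_connection`
    # as an argument and passes the return value in.
    return make_connection()


def test_ehlo(smtp_connection):
    status, _message = smtp_connection.ehlo()
    assert status == 250


@pytest.mark.parametrize("host", ["a.example.com", "b.example.com"])
def test_ehlo_message_names_host(host):
    # Parameters run the same test body once per listed value.
    _status, message = make_connection(host).ehlo()
    assert host.encode() in message
```

Running `pytest` on a file containing this code would collect three test runs: one for `test_ehlo` and one per parameter value for the second test.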
In this code snippet, we are testing the SMTP library: we have a fixture that creates an SMTP connection object, and then we have a test that declares that it needs this fixture passed as an input. And for Django there is pytest-django, and in Django itself there's LiveServerTestCase, which allows for creating a test case that accesses an actual runserver-like version of your app. So with all these ingredients, we already know them; they're usually used for testing. In this case we will use them for screenshotting.

To make it a bit nicer, we use pytest's configuration to redefine some strings: for example, we want to define our screenshots in scene files with shot functions; we don't want to call them tests. Then we can use fixtures to create objects, because nobody's interested in screenshots of an empty application; we want to populate the database in some way beforehand. As another fixture, we define the Selenium client that is already logged in to our system. We can also define our screen resolution in a fixture, as well as any other options that you want to pass to Chrome. And then we can define every one of our screenshots by simply writing something like a pytest test, specifying what fixtures we want to be run before, and then just calling a utility function that does a screenshot with Selenium. We can run it by just calling pytest on our folder, and we're done. And then there are pytest parameters.
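The pieces described so far can be sketched roughly as follows. This is not the speaker's code: the folder name, the URL path, and the helpers `screenshot_path` and `take_screenshot` are hypothetical, and the fixtures for populating data and logging in are left out. The sketch assumes Selenium, Chrome with a matching driver, and pytest-django (which provides the `live_server` fixture) are installed.

```python
from pathlib import Path

import pytest

SCREENSHOT_DIR = Path("screenshots")  # hypothetical output folder


def screenshot_path(name, size):
    """Build the target file, e.g. screenshots/1280x800/dashboard.png."""
    width, height = size
    return SCREENSHOT_DIR / f"{width}x{height}" / f"{name}.png"


@pytest.fixture
def chrome():
    # Imported lazily so this module can be imported without Selenium.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")  # run Chrome without a display
    driver = webdriver.Chrome(options=options)
    yield driver
    driver.quit()  # teardown: runs after the shot function returns


def take_screenshot(driver, name, size):
    """Utility: save the current browser window as a PNG file."""
    target = screenshot_path(name, size)
    target.parent.mkdir(parents=True, exist_ok=True)
    driver.save_screenshot(str(target))
    return target


# With pytest's default settings this function would have to be named
# test_*. Collecting shot_* functions assumes `python_functions = shot_*`
# (and a matching `python_files` pattern) in the pytest configuration.
@pytest.mark.parametrize("size", [(1280, 800), (375, 667)])
def shot_dashboard(live_server, chrome, size):
    # `live_server` comes from pytest-django and serves the app on a
    # local port; "/dashboard/" is a hypothetical URL of that app.
    chrome.set_window_size(*size)
    chrome.get(live_server.url + "/dashboard/")
    take_screenshot(chrome, "dashboard", size)
```

Here the parameter is the window size, which yields one screenshot per resolution; the same mechanism could in principle be used for other dimensions, depending on how the application selects them.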
You could do this for every language that your project supports, or for every theme that your project supports, or whatever, and you end up with a bunch of screenshots. For the application I'm working on, you can find the repository with the screenshot definitions, and also the utility code, which is a bit more complicated than I've shown here, in this GitHub repository. Thank you very much. If you have any questions on that, feel free to talk to me in the hallway. Thank you.

It's time for our story. Today's story is "That's Not My Emoji".

That's not my emoji: its head is too shiny. This is two emoji put together; this is not an emoji.

This is not my emoji: its face is far too animated. Can you make a unicorn sound? Emojis do not have sound.

This is not my emoji: its existence as a character in a movie is distressing. You see, when a studio loves a marketing opportunity too much, they can make a terrible movie out of it, where a bunch of the main characters aren't actually emoji.

That's not my emoji: it is partying far too hard. You see the pretty bird? He's dancing. Emoji do not dance.

That's not my emoji: its permutations are under-documented. Can you wave at the Chi emoji?

That's not my emoji: its ligature is only vendor-implemented. You see the cat drinking coffee? Cats don't drink coffee, and this only appears on Windows 10 operating systems.

That is my emoji: its code point is so standardized.

And that's the end of our story. Sorry about that.

So yeah, this is a talk about crypto, but it's not a talk about that stupid new thing. This is a talk about
the original crypto, which of course is cryptozoology: the study of cryptids, or legendary/mysterious creatures. You know, we're talking Bigfoots and chupacabras and the Michigan dog-faced man and all of these wonderful things. And the reason I'm interested in cryptids at the moment is that last week I was in Ohio and I went back to texting with my wife Natalie in the woods in the dark, and it was dark and it was the woods in Ohio, and we realized that we hadn't really done our research. This is America; there are weird creatures out here. We didn't know if we were within the range of any of these mythical creatures, and I'd like to know, so that I can greet them and say hi. So I obviously hit Twitter and asked where I can obtain range maps of cryptozoological creatures like yetis and chupacabras and so forth, and then I asked a question about it on Ask MetaFilter, and I realized that actually, no, the internet does not have a conclusive source of range maps for different cryptozoological creatures.

So I started a GitHub repository, and this is a GitHub repository which I would actively encourage people to contribute to, where I am trying to get a directory of information on cryptids and their ranges. The way I'm doing this is using a file format called GeoJSON. It's a brilliant format, a way of representing geographical shapes in JSON. So let's take a look at the Loveland Frogman. This is my GeoJSON file for the Loveland Frogman. A great thing about GitHub is that GitHub knows how to render GeoJSON, so it's rendering this shape for me, but if I click on that shape I can see that it's got a Wikipedia URL, the name, it was first sighted in 1955, last seen in 1972, and it's a humanoid frog described as standing roughly four feet tall. So that was my first cryptid. I have been... you know what, I've got some pull requests. Russell has sent me a pull request adding the drop bear. I'm going to merge this right now, and if I'm lucky it will deploy by the end of my lightning talk. I should have done this a couple of minutes ago. So we now have a drop bear; it's very exciting. So you get the idea.

So then what am I doing with this data? Well, I've been working on this open source project called Datasette, which is an application that takes a SQLite database and gives you a UI for exploring it, and it gives you an API for getting the data back out as well. And it's the perfect application for cryptozoology. So what I've done here is this GitHub repository has Travis set up to run a script every time I commit anything, and the script reads in the GeoJSON and writes it into a SQLite database. It's 123 lines of code; there was not a lot to this. And so it builds a SQLite database with all these cryptids, it deploys it, and then based on this I can build out an API. So here is the nice thing about Datasette: everything's SQLite, so the API is just a SQL query. Here is a SQL query that selects details of cryptids where the geometry overlaps a point.
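Neither the build script nor the query shown in the talk is part of the transcript. The following is only a minimal sketch of the same idea using Python's standard library: load GeoJSON features into SQLite, then answer "which cryptids cover this point?" with a SQL query. The polygon coordinates are made up for illustration, and the spatial test is swapped for a pure-Python ray-casting function registered as a custom SQL function (`geometry_contains` is a hypothetical name), rather than whatever spatial SQL functions the real project uses.

```python
import json
import sqlite3

# Hypothetical GeoJSON feature. The rectangle is invented for illustration;
# the property values are the ones read out in the talk.
FROGMAN = {
    "type": "Feature",
    "properties": {
        "name": "Loveland Frogman",
        "first_sighted": 1955,
        "last_sighted": 1972,
    },
    "geometry": {
        "type": "Polygon",
        # GeoJSON positions are [longitude, latitude]; the ring is closed.
        "coordinates": [[[-84.4, 39.1], [-84.1, 39.1], [-84.1, 39.4],
                         [-84.4, 39.4], [-84.4, 39.1]]],
    },
}


def point_in_ring(lon, lat, ring):
    """Ray casting: count how many edges a ray heading east crosses."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > lat) != (y2 > lat):  # edge spans the ray's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def geometry_contains(geometry_json, lon, lat):
    """SQL-callable wrapper; only checks the outer ring of a Polygon."""
    geometry = json.loads(geometry_json)
    return int(point_in_ring(lon, lat, geometry["coordinates"][0]))


def build_database(features):
    """Write GeoJSON features into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("geometry_contains", 3, geometry_contains)
    conn.execute(
        "CREATE TABLE cryptids (name TEXT, first_sighted INTEGER,"
        " last_sighted INTEGER, geometry TEXT)"
    )
    for feature in features:
        props = feature["properties"]
        conn.execute(
            "INSERT INTO cryptids VALUES (?, ?, ?, ?)",
            (props["name"], props["first_sighted"], props["last_sighted"],
             json.dumps(feature["geometry"])),
        )
    return conn


def cryptids_at(conn, lon, lat):
    """Return the names of all cryptids whose range covers the point."""
    rows = conn.execute(
        "SELECT name FROM cryptids WHERE geometry_contains(geometry, ?, ?)",
        (lon, lat),
    )
    return [name for (name,) in rows]
```

For example, `cryptids_at(build_database([FROGMAN]), -84.26, 39.27)` looks up a point inside the invented rectangle, while a point in San Diego falls outside it.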
The WHERE clause is within GeomFromText, so where the geometry overlaps a point. Here's our current latitude and longitude right now, and if I run that query I get back the Bigfoot. It turns out America has a lot of Bigfoot sightings. So this is now a JSON API which you can feed latitudes and longitudes to, and it will tell you which cryptozoological creatures you are within range of. Here's Travis. Oh look, Travis is building the drop bear right now, so if we're lucky that'll be built and deployed in just a moment.

And I'll talk about Bigfoot quickly. It turns out there is already a database of Bigfoot sightings, run by an organization who are banned from Twitter: if you try and tweet a link, you get an error message. So clearly Twitter are part of a cover-up conspiracy trying to hide the existence of Bigfoot. But they've got 3,000 sightings. They publish a KML file; a KML file has latitudes and longitudes in it. I happen to know that the range of Bigfoot is 15 miles, from a conversation I had at the Bigfoot Discovery Museum outside of Santa Cruz (thoroughly recommended). If you take those 3,000 points and put a 15 mile radius on them, you can see that Bigfoot have been sighted across much of the United States. Everywhere in Florida; it turns out everyone in Florida's seen a Bigfoot. And here in San Diego, so when I ran that query with our San Diego coordinates, I did get back the Bigfoot. I've got an iOS shortcut that isn't working at the moment, so I can say, "Hey Siri, check for cryptids." That's very useful. And if you want something a bit more useful, I do have a version of this that works for time zones instead. So please take a look at the project, draw maps of cryptozoological creatures (since they probably don't exist, you don't have to be very accurate), and let me know how it goes. And before I get hugged, I'm just gonna quickly see if Russ's drop bear has deployed yet. Unfortunately, it looks like the drop bear did not quite make it, but in a couple of minutes there will be a drop bear. So thank you very much.

All right, thank you. Okay, this is the first
presentation I've given in a very long time, right? This was from something I did literally eight years ago, so I'm gonna try and do this really quickly. In 2010 I was at a place called the Santa Fe Institute for a summer program, and they asked us to create various research projects. So me and some friends stuck something together in three weeks, which was a Django app, which you'll see. Four years... that should have said, yeah, four years later I got diagnosed with a really complicated situation, which is why I've had three surgeries and why I can't have this part of my glasses on that side of my face. Long story. Still managed to get a PhD in sociology; now I'm trying to figure out what to do. It went so well, my most recent surgery in June, that I'm here, and I'm really, really happy to be here. My GitHub name is spool. If any theater nerds who know Krapp's Last Tape get that reference, please see me after, because very few people do.

Right, okay, so the project was to try to study how people collaborate creatively and how they might respond to each other in making a design. I'm gonna skip over that, because that was a lot of, again, some things I had before; you can ask me stuff about it. But basically, if we're working together on making a design, how do people respond to each other's things? People may be aware of the whole... this came before that, technically, in 2010, and you'll see some differences in the user interface. So the idea was we printed a t-shirt at the end. It was gonna be a black t-shirt by default, and then we were gonna put some designs on it. We wrote it in Django and a language called Processing, which is really cool for doing some user interface and sound experiment stuff. There was a Processing.js back then, which was really slow; there's now p5, which I really recommend people try if they like. What we did is we created a grid with 64 cells, or squares, that people could do their own designs on, and that whole thing became the canvas, and you could only see your Moore neighbors.

Can I see a pair of hands, or hands, for anyone who knows what a Moore neighborhood is? Okay, do you want to tell everyone what that is? Fine, okay, that's cool, you'll see it in a second. So this is just an example of how we ran it in a museum recently. You get assigned a square in this big grid and you can only see the edges of your neighbors. So there's the ones to the left, right, top, and bottom, and then there's the ones on the... so those eight are the Moore neighborhood. So you can press spacebar, do a design, and then press spacebar at the end, and you see how your design fits alongside the designs of everyone else. Okay, so that was that.

So that was the... you could log back in. When we ran it at a museum, you could only do one design and then see the whole canvas. The way we ran it as a website is you could log in, you could only see your neighbors, and if you logged in the next day you might see how your neighbors changed, and then you could respond. So we had a couple of problems. If someone logged in and then left it for an hour and then logged back in, their neighbors might have changed, and they would accidentally not respond to the most recent changes of their neighbors. We should have done that with WebSockets. We also didn't take into account time zones, so someone's laptop was on a different time zone and they kept overwriting everyone else's, and we didn't know why. That was really frustrating.

So this is what it looked like. Each of those cubes are individual people's designs. The music... that was when the dude tried to do his own and then accidentally overwrote everyone else's. And it's a torus, so the people on each side can actually see each other, and the people on the tops can see the bottoms.
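To make the Moore neighborhood on a torus concrete, here is a small sketch. It is not the project's code: it assumes the 64 cells are laid out as an 8 x 8 grid and numbered row by row, neither of which is stated in the talk.

```python
GRID_WIDTH = 8   # assumption: the 64 cells form an 8 x 8 grid
GRID_HEIGHT = 8


def moore_neighbors(cell, width=GRID_WIDTH, height=GRID_HEIGHT):
    """Return the eight cells surrounding `cell` on a wrap-around grid.

    Cells are numbered row by row, starting at 0 in the top-left corner.
    """
    row, col = divmod(cell, width)
    neighbors = []
    for d_row in (-1, 0, 1):
        for d_col in (-1, 0, 1):
            if d_row == 0 and d_col == 0:
                continue  # skip the cell itself
            # The modulo turns the grid into a torus: stepping off one
            # edge re-enters from the opposite edge.
            n_row = (row + d_row) % height
            n_col = (col + d_col) % width
            neighbors.append(n_row * width + n_col)
    return neighbors
```

With these assumptions, the top-left cell 0 has neighbors on the right-hand column and the bottom row as well, which is the "sides can see each other, tops can see the bottoms" behaviour described above.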
Good job. So that was the final thing. If anyone's interested in the music you just heard, that's by a really incredible composer named Julius Eastman, who sadly died in poverty in 1990. Thank you.
I'm earnest I really like to tweet I'm you might think that tweeting is my full-time job based on this incredible brand engagement but in reality I work for the Python software foundation I also tweet a lot and so if you want to follow me that's where you do it on Twitter but yeah
the Python software foundation so the Python software foundation if you're not aware is a non-profit that controls the intellectual property copyright trademark etc for Python the language also it is a grant-giving nonprofit and so we raise money and we pour that money back into the community in order to support events like this meetups and smaller events as well there's a
larger event associated with the Python software foundation called PyCon if you're not familiar with PyCon it was in Cleveland in 2019 it's gonna be in Cleveland in 2018 it's gonna be in Cleveland in 2019 and I want to talk briefly about that currently the call for put for call for proposals for
PyCon is open you can check that URL out to go submit a talk tutorial education summit presentation poster or a charless or a talk in Spanish speaking of talks in Spanish the PyCon charless came out of the PyCon hatchery program and so this is kind of what I really want to pitch to you
the PyCon hatchery program is a way for your ideas to be realized in PyCon so if you've ever been to PyCon and not seen something that you wanted to see this would be the way that you would tell us what you want to see and perhaps even propose to do the work to make it so so please check out
the hatchery program and read more about that so you might be saying okay like this is DjangoCon so what about Django the Python software foundation loves Django we actually so I'm the director of infrastructure and so that's like a bunch of services that run on Python org out of that seven of
them are currently written in Django the PyCon website is written in Django and I love Django admission I haven't always loved Django but it but it's gotten much easier in the past few years and so being here and being among
people who are interested in or experts in Django has been really exciting and so I'm also gonna come up here and ask for your help I frequently tweet and sometimes I tweet about asking for help in Python in Django more often
than not, when I'm tweeting about Django, it's asking for help. So: help, please. And when I say this, I mean it sincerely. If you are in this room, it is probably because Django has either made you feel like you have superpowers, or you want to feel what it feels like to have superpowers, and I
want you to be involved in Django and the PSF and PyCon if any of that's true. You might just be starting out, and I'd love to work with you to get you started. You might be an expert, and I would love to work with you to get a little bit more of that expertise, so that we can all share and grow. There's also a bonus. It's not Django, but PyPI is a piece of software many of
you in the room might be familiar with, and if you're at all interested in contributing to PyPI, we have stickers. You can check out a micro-sprint that's going to be occurring at lunch with myself and Dustin Ingram, and
that's it. I'm Ernest, and you can follow me on Twitter there.

So I'm going to talk to you folks about 1+1=1, or record deduplication with Python.
This is a 45-minute talk; I will make it in five minutes. Not sure if it will work. So, real-world data is a mess. You've probably dealt with data like this before. Those are restaurant records, and you see here that clearly those four records, from zero to three,
are duplicates, because the name is quite similar, the address is similar, and the city can vary. So we have duplicates here. Real-world data is a mess; we don't have unique identifiers, and the solution is to perform deduplication, also known as record linkage, to join records in a fuzzy way using data like names and
addresses. Mostly we will deal with those kinds of things, but it can be other kinds of data. To solve that, we should do some fuzzy comparison of strings. We can use algorithms like Jaro-Winkler similarity. If I compute the Jaro-Winkler similarity between those two similar strings, I get this high
number, and with those different strings I get this lower number. So I can use that as a tip, as an indication of similarity between records. I can also do fuzzy comparison of addresses, and the trick is to geocode them to latitude and longitude. That will allow us to clean irrelevant
address variations, like a small variation in the street number or something like that, and to enable the calculation of geometric distance using latitude and longitude, because we want to group and match things that are close together. If I geocode those two addresses here, you can see that
although they have variations and even typos, the latitude and longitude are the same, and the zip code is also the same. I can grab this from Google's geocoder, for example. Okay, now into the process of deduplicating a dataset. First we need to pre-process it. We will use the restaurant dataset. It
contains 881 restaurant records from the Fodor's and Zagat's guides, and it contains 150 duplicates, and we want to find those duplicates. The dataset looks like this. It comes with the cluster column, which is the ground truth about this data. Of course we will remove that, and we will also remove the phone
column, because it would make things very easy for us. We are left with only name, address, and city, and we want to deduplicate using only that. First we clean, just using some regexes to clean the name. Then we'll geocode all the
addresses, so we get the postal code, latitude, and longitude for the addresses. Then we can move to the next step of the record linkage process, which is indexing. We will use the library recordlinkage, also known as the Python Record Linkage Toolkit. We have the cleaned records; now we want the
pairs to compare to find matches. To produce the pairs we could do a full index: we could compare all records against all records. Of course that's slow, but we don't have enough time to think about a smarter way to index, so we just pair all records against all records. The pairs look like this: we compare 0 with 1, 0 with 2, and so on. Now, running
the comparisons. We want to compare the pairs to get a comparison vector for each pair. The comparison vector looks like this: the name similarity is 0.5, the address similarity is 0.8, and that's our comparison vector. And to
compute the comparison vectors, we define similarity functions for each column. We can do that with the Record Linkage Toolkit. We use Jaro-Winkler for name, address, and postal code, and an exponential-decay geometric similarity between latitude and longitude, and that's what we get if we
run that. Now, with the vectors, we want to explore different ways to classify them as matches and non-matches. We can do some simple threshold-based classification: we can compute a weighted average over those vectors, and
by looking at the data we see that those three are matches, so we just consider anything with a score of more than 0.9 a match. If we do that and check against the ground truth for this data, we see that we got 128 true positives, only two false positives, and 22 false negatives. So
it's quite good performance, but there are smarter ways to solve that problem. Make sure to check out active learning classification. It will help you a lot, because it will allow you to build a training set for your data.
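
The steps described in this talk (full indexing, per-column similarity, weighted average, 0.9 threshold) can be sketched roughly as follows. This is a plain-Python illustration only, not the speaker's code and not the recordlinkage API: the three sample records are made up, and the equal weights, the 1 km decay scale, and all function names are assumptions chosen for the example.

```python
import math
from itertools import combinations


def jaro(s1, s2):
    """Jaro similarity between two strings, in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    hit1, hit2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not hit2[j] and s2[j] == ch:
                hit1[i] = hit2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as transpositions.
    m1 = [c for c, h in zip(s1, hit1) if h]
    m2 = [c for c, h in zip(s2, hit2) if h]
    transpositions = sum(a != b for a, b in zip(m1, m2)) // 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, a)))


def geo_similarity(a, b, scale_km=1.0):
    """Exponential decay of distance: 1.0 at the same point (scale is assumed)."""
    return math.exp(-haversine_km(a["lat"], a["lng"], b["lat"], b["lng"]) / scale_km)


def compare(a, b):
    """Comparison vector for one pair of records."""
    return {
        "name": jaro_winkler(a["name"], b["name"]),
        "address": jaro_winkler(a["address"], b["address"]),
        "postal": jaro_winkler(a["postal"], b["postal"]),
        "latlng": geo_similarity(a, b),
    }


def score(vector, weights=None):
    """Weighted average of a comparison vector (equal weights are assumed)."""
    weights = weights or {key: 1.0 for key in vector}
    return sum(vector[k] * weights[k] for k in vector) / sum(weights.values())


def find_matches(records, threshold=0.9):
    """Full index: compare every record with every other, keep pairs above threshold."""
    pairs = combinations(range(len(records)), 2)  # (0, 1), (0, 2), ...
    return [(i, j) for i, j in pairs
            if score(compare(records[i], records[j])) > threshold]


# Hypothetical, already cleaned and geocoded records (not from the restaurant dataset).
RECORDS = [
    {"name": "blue door cafe", "address": "100 main st", "postal": "12345", "lat": 40.0, "lng": -75.0},
    {"name": "blue door caffe", "address": "100 main street", "postal": "12345", "lat": 40.0, "lng": -75.0},
    {"name": "green lantern grill", "address": "987 ocean ave", "postal": "54321", "lat": 41.0, "lng": -74.0},
]

if __name__ == "__main__":
    print(find_matches(RECORDS))  # [(0, 1)]
```

A full index grows quadratically with the number of records, which is why the talk calls it slow; a real pipeline would typically restrict the candidate pairs before comparing them.
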

Okay, so I'm Filipe, a partner at Vinta, and let's talk about parks. So I
have an image here of two layouts for a park trail or something like that, and I want you to think about which one of those you think is the more pleasant layout for a park. To be honest, this is a trick question. I didn't give you enough information to answer that, and the reason
is you don't know enough. You don't know anything about the terrain, you don't know if there are trees around, you have no idea of what's going on in that area. So to answer that kind of question you need something like this book, which is called A Pattern Language, by Christopher
Alexander, and it's from 1977. He defined a series of patterns and things to help architects build and design and do good architecture. So for example, you can take pattern 120, which is
about paths and goals, and it says: the layout of paths will seem right and comfortable only when it is compatible with the process of walking, and the process of walking is far more subtle than one might imagine. This is very interesting, especially because it can be visualized through this thing called
desire paths. Desire paths are like in this image: they show how people interact with the place they live, through patterns like this in the terrain. So for instance, here we are seeing there are two paths: one
that goes straight to the door, but there is a fence, and another one that goes around. So probably at some point the fence didn't exist there, and people just changed the way they went inside the house after that. There's also this other example. In this case people are going around the
trees, and here it's very interesting because they took the desire path and actually made it a fixed path, a proper path. So one thing to remember, and this comes from this idea of patterns and
paths, is that when you see a big street, mainly when it's the one main avenue in a city, probably at some point there was a bare road in the woods, and maybe before that there was just a path where people and animals passed by.
And so those are patterns. This is from Christopher, the author of the book, and he says this idea comes simply from the observation that most of the wonderful places of the world were not made by architects but by people. So the
book is really very much based on how architecture is made by people and constructed naturally. The book has other patterns, and for example the path shape one, there is six spots, so when you... Okay, so there's
another part that's missing here. Well, what about language? It's a pattern language; we talked about pattern, and language. It's just like when you have words: each of them has a separate meaning, and when you group them together you get a lot more meaning through language. So
language, that's when you group words to convey a lot more meaning, and that's where we get to design patterns. Design patterns, the software book we know, was actually created based on Christopher's book. I don't have a lot of time to talk about design
patterns here, but that's where the idea comes from. So let's just jump to some takeaways. First thing: patterns are not created, they emerge. Just like Christopher's patterns, our design patterns are just observations of how people code. Design patterns are a tool for
communication between programmers, through speech and code. They are not to brag about; they are to help you communicate with your teammates. And the last thing I want to leave with you is that quote that says good architecture is about improving people's lives, and most of the time this means what feels
more natural or more pleasant to us. And this applies both to architecture and to programming. Thank you.