Things I wish I knew before starting using Python for Data Processing

Speech transcript
[Session chair] The next talk of the session is "Things I wish I knew before starting using Python for Data Processing". Please welcome our next speaker, Miguel Cabrera.
[Speaker] My talk has a really long title. I am going to talk about some things I learned in the last few years working with Python that I would have preferred to know before I started using it. A quick introduction: I am originally from Colombia, and I work for a company in Munich [company name unclear] where we do data processing for hotels. I have been doing Python for just about two years, so this is more of a beginner-to-beginner talk. However, I think that if you are starting with Python in the data science area, you can get some good stuff from this talk. That's my contact information.
So, the priors for this talk, that is, who I assume you are: you are relatively new to Python; you mostly use Python through the scientific stack; you work, or want to work, as a data engineer or in machine learning; and you are not necessarily a trained software engineer. If you are one and already have experience, you are probably going to get bored, so feel free to walk away [unclear].
Who here is, or wants to be, a data scientist? Please raise your hand. OK. Data analyst? Data engineer? Machine learning developer? Good. This talk is going to be really basic and really high level, so if you already have experience you might be bored.
The agenda: we are going to talk about some basic concepts and good practices, a bit about object-oriented programming, then some good uses of the collections module, and then something about iterators and iterables. This is a collection of things. I wanted to show many more, but because of time I had to pick the ones I like the most. They sit at different levels of abstraction, so we are going to switch from really high level to some code, and we are not going to go deep into any of them. However, I will give you some pointers to where you can get more information.
Let's start with a story. This talk is based on my experience, but it is also based on the experience of my colleagues and my interactions with them. So let's talk about Bob. Bob just graduated from university with, say, a math Ph.D.
He was used to R and MATLAB, and he comes to work for a company where the code is mostly Python. He is asked to build a classifier, to write code that classifies some documents, for example. He uses [unclear] and scikit-learn, which I assume you are somewhat familiar with. He writes a really nice IPython, sorry, Jupyter notebook with the code and some graphs. (The one on the slide is a random image; you can't get anything from it.) Then his manager tells him that he has to integrate the code into the product. So he has to go from the IPython notebook to a really big project with a lot of [unclear], and of course he is lost. He does the best he can without knowing what is going on, and he starts writing what some people call spaghetti data science code, which is the same as spaghetti code but for data science.
If he has to integrate the code, it is going to be really bad for him, and if someone else has to integrate the code, it is going to be bad for that person as well. So how do we prevent that from happening when we start with data science and want to actually integrate data science and machine learning code into a product? I think the answer is going back to the basics. In a nutshell, I think data scientists have to become more like software engineers and get to a middle point.
How do you do that? First, we have to make a distinction between code and software. I took this from a talk [unclear] this year, and I really liked how the speaker put it. Code is something that runs on a computer. When you write a script, when you write Python, you are probably writing code. It might not follow any conventions or recommendations; code just does the job. Software, on the other hand, is more. Some people think it is just the [unclear] inside the deliverable, but some people think it is the whole thing, including [unclear], deployment scripts, documentation, and even customer support and technical support. And you want software that is maintainable, testable, deployable, and all the other adjectives you can put in there. So the question is: what is the way to jump from coding to software? I think the answer is to go back to the basics of Python.
One important thing, and I got this from the Python documentation, is that Python is an object-oriented programming language. So as a data scientist you should know what object-oriented programming is and how to use it. This is the first piece of advice I am going to give you as a Python data scientist: learn and use object-oriented programming. For that I am going to give you a really quick and dirty introduction to it. There are a few main concepts. Objects have data, called attributes, and they have operations on that data, called methods. [unclear]
Before going into that, I want to use an analogy with cookies. A class is like the template that you use to create objects. In the case of cookies, the class is the cookie cutter shown here (sorry for the quality of the screen). You take the cookie cutter, create many, many cookies, and you eat them, hopefully after baking them [unclear].
In Python, this is how it looks. You have a class, which is the template for creating cookies, the cookie cutter; in this case it is the heart. You have a constructor, a function that is called every time you create an instance, and the class has some attributes and methods. Right now you are already an expert in object orientation. If you want to create a cookie, you instantiate the class [unclear].
One of the key concepts in object-oriented programming is that you can extend your classes: you can extend one class from another to make something more specialized. In this case I add a type of cookie that is eaten in Spain and South America [name unclear]. In this example I extend the cookie class and just add something on top of it.
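The cookie analogy above can be sketched roughly as follows. This is a minimal illustration, not the code from the speaker's slides: the class names `Cookie` and `FilledCookie`, their attributes, and the `eat` method are all hypothetical.

```python
class Cookie:
    """The "cookie cutter": a template for creating cookie objects."""

    def __init__(self, shape, flavor="plain"):
        # The constructor runs every time a new instance is created.
        self.shape = shape      # attributes: data stored on the object
        self.flavor = flavor
        self.eaten = False

    def eat(self):
        """A method: an operation on the object's data."""
        self.eaten = True
        return f"Ate a {self.flavor} {self.shape} cookie"


class FilledCookie(Cookie):
    """A more specialized cookie that extends Cookie (hypothetical example)."""

    def __init__(self, shape, filling, flavor="plain"):
        super().__init__(shape, flavor)  # reuse the parent constructor
        self.filling = filling

    def eat(self):
        # Extend the inherited behavior instead of repeating it.
        return super().eat() + f" filled with {self.filling}"


heart = Cookie("heart")                              # one instance
special = FilledCookie("round", filling="caramel")   # a specialized instance
```

Calling `heart.eat()` or `special.eat()` uses the same interface, while the subclass only adds what is different about it.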
Now scikit-learn: raise your hand if you use scikit-learn. OK, not so many. When you are working in IPython or Jupyter and you are copying scikit-learn examples, you are writing statements, and many of those statements look like this. What you are actually doing there is creating objects and interacting with them. So you have to be aware of what you are doing and how you can use that in your code.
So how do I write good object-oriented code? This is a really tough question that I don't plan to answer today, but I will give you some basic ideas, from object-oriented programming and from programming in general, that you want to know. One is DRY: don't repeat yourself. If you are writing scripts and you feel you are repeating code from one file to another, you might want to create an object that encapsulates it; that is one of the key ideas of DRY and of object-oriented programming. Another is KISS: always keep things simple, and don't try to put a lot of things into your objects. Also use the SOLID principles. They are really abstract principles and I am not going to go into details, but basically one class has to do only one job. I am going to skip the rest because of time, but my recommendation is to check them out [unclear]; it is really important to know these principles.
So the first thing I think is important, if you are going to start doing serious data science, is to learn object-oriented programming [unclear].
The next thing you have to learn, once you know how to organize your code in objects, is coding conventions. Conventions are like table manners for developers: when you are sitting at the table, you don't want to annoy other people. Here the other people are the ones reading your program, in this case when data scientists try to integrate their code. One of the things that annoys people the most is that data scientists have no idea about formatting, so you always have to return the code to them to fix it, or you fix it yourself.
So why conventions? Conventions are important because [unclear]. They cover small things, like whether you use spaces or tabs, the indentation rules, and how to organize the code in a file. PEP 8 is the de facto standard, so start by learning it. There are some resources online you can check; this one is a nice, user-friendly way of learning it, with examples of the right and the wrong way to do things.
There are many details in these conventions, and you might think there are so many things to learn. But you can get help from your editor. In this case it is Emacs, the one I use (sorry to the users of other editors). IDEs and other editors can probably be configured to help you, not only by checking that you follow the conventions of your company or of the project you want to contribute to, but also by detecting things that might go wrong. For example, in this case a variable is never used, and your editor can help you detect such things.
There are other things I would have loved to go into in more detail, but I don't have the time because I want to show you more cool stuff: project structure; testing (notebooks from data scientists generally don't come with tests); versioning and branching, namely learning how to use your source control; and code reviews. In general, just be aware of them.
There are some books that I recommend you read. They are really general and not tied to one language, but if you want to get closer to the software side, whatever your area, those are good books to start with.
Also, I was reading the talk descriptions on the EuroPython website, and there are some talks that I think are relevant and that probably cover these topics. If you go to one of them, please tell the speaker that I sent you there; they will probably thank you for that [unclear].
So let's go on. So far I have been really theoretical and boring, and you want to see some code.
The next thing I would have loved to know before starting to write code is, in a nutshell, the collections module. It is incredible how, when you start using Python from a data science perspective, you don't know about many of the things that are in the standard library, and one of those things is the collections module.
Let's talk about counting. Counts are the basic building block for many statistical algorithms; starting from naive Bayes [unclear], they are based on counts. However, I don't think data scientists know how to count properly in Python [unclear]. At first you use dictionaries. Who has written code like this to count words? [unclear] There is actually a more Pythonic way: it is easier to ask for forgiveness than permission, I think it goes something like that, and that means writing something like this.
Then there is collections.defaultdict. Who is familiar with it? Some of you. It is basically the same, but you use a defaultdict, to which you pass a default factory function, in this case int, so the default value will be zero [unclear].
But there is also Counter. Who is familiar with Counter? A few of you. Counter is really cool, but it is just a dictionary that is already prepared for counting, and it comes for free. This is how you use it: I just pass the list of items, or any iterable, and I get the counts. It has some utilities: you can get the most common items, sum values, and do some set-like operations, and I find that really cool.
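The three ways of counting described above can be sketched side by side. This is an illustrative example with made-up input data, not the code from the slides; only standard-library calls are used.

```python
from collections import Counter, defaultdict

words = "the quick brown fox jumps over the lazy dog the end".split()

# 1. Plain dictionary, "easier to ask forgiveness than permission" style.
counts_dict = {}
for word in words:
    try:
        counts_dict[word] += 1
    except KeyError:
        counts_dict[word] = 1

# 2. defaultdict: int() is the default factory, so missing keys start at 0.
counts_default = defaultdict(int)
for word in words:
    counts_default[word] += 1

# 3. Counter: pass any iterable and get the counts directly.
counts = Counter(words)

top = counts.most_common(1)                    # most frequent item(s)
combined = counts + Counter(["fox", "fox"])    # counters can be added
```

All three approaches produce the same counts; `Counter` simply removes the bookkeeping and adds helpers such as `most_common`.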
Remember that Counter is a class, and as I just mentioned, you can take classes and extend them to add your own behavior. For example, if you want to calculate probabilities for some items, you can extend the class and add a normalize function, and you already have a probability mass function. If I wanted to, I could override the initializer and call normalize as soon as I have all the items in the counter. So when you are counting things in Python, or when you are doing statistics and counting things like features [unclear], you should use Counter.
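Extending Counter with a normalize step, as described above, might look like the sketch below. The class name `ProbabilityCounter` and the choice to return a new dictionary (rather than modify the counts in place) are assumptions made for this illustration.

```python
from collections import Counter


class ProbabilityCounter(Counter):
    """Counter with a helper that turns counts into relative frequencies."""

    def normalize(self):
        # Return a new dict so the raw counts stay available.
        total = sum(self.values())
        if total == 0:
            return {}
        return {item: count / total for item, count in self.items()}


letter_counts = ProbabilityCounter("aabbbc")
pmf = letter_counts.normalize()
```

Because `ProbabilityCounter` is still a `Counter`, everything else (`most_common`, addition, and so on) keeps working on the raw counts.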
There is a really nice article by [author unclear] about how the ways of counting have evolved in Python; it is a really good read.
Named tuples are another thing I discovered recently. Who is familiar with named tuples? Some of you. The thing is that when people write code they use a lot of dictionaries, lists and tuples, and when you start integrating that into a large code base you see tuple indices and dictionary keys and you don't know what to expect, which really makes the code hard to read. In the example, if I remove this, you have no idea what I am talking about. What is this value in this case? You might guess from the context, maybe. Just by using named tuples you make it clearer.
A namedtuple is basically a sort of class generator, on the fly, with the particularity that the attributes are read-only. So named tuples are basically like a struct in C, if you are familiar with that. You can create classes on the fly, and they come with helpful methods. Also, if you really need to, you can transform one into a dictionary [unclear].
I think it is a nice way, when you are writing code, to organize it and to create lightweight classes that represent things in your problem, in your ontology. In this case, since we work a lot with hotels, I created a hotel base class out of a named tuple, and I actually inherit from it and add a method to do something. Then I pass instances of this class around my code, and that makes it, in my opinion, much more readable.
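A minimal sketch of the named tuple pattern described above. The field names (`name`, `city`, `stars`) and the `is_luxury` method are hypothetical and chosen only for illustration; the speaker's actual hotel class is not recoverable from the transcript.

```python
from collections import namedtuple

# namedtuple generates a small class with read-only, named fields.
HotelBase = namedtuple("HotelBase", ["name", "city", "stars"])


class Hotel(HotelBase):
    """Inherit from the generated class to add behavior."""

    __slots__ = ()  # keep instances as lightweight as plain tuples

    def is_luxury(self):
        return self.stars >= 4


hotel = Hotel(name="Example Inn", city="Munich", stars=4)

by_name = hotel.city      # readable attribute access
by_index = hotel[1]       # still works like a regular tuple
as_dict = hotel._asdict() # convert to a dictionary when needed
```

Compared with `hotel[1]` scattered through a code base, `hotel.city` documents itself, which is the readability gain described above.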
Now to the meatier part. It is really interesting [unclear]. Actually, I remember that in my first interview for the company I am working for right now, I was asked something about iterators and iterables. I think I answered correctly, but it was out of luck: I had the right answer, but I didn't know why. So let's talk about that.
When you see something like this, you are probably familiar with it: a for loop iterating through a list. But what is happening under the hood? Why can you do this, why does it work, and how can you write your own classes that have the same behavior? I was confused. I was looking for the difference between iterables, iterators, generators and so on, and I found this nice article by [author unclear]. I am going to use this graph from him, and we are going to explore the concepts using it.
The two concepts that may be most relevant for you right now are iterable and iterator. An iterable is something on which you can call the iter function, and it will return an iterator. An iterator is something that produces a value each time you call next. That is abstract, so let's go into more detail.
First, containers. A container can be, for example, a list, a dictionary, a tuple, or a set. A container is something for which you can check whether an item is inside it, with the in keyword. In this case I check that a number is in a list, and in this case in a set. A container is typically also an iterable, so you can go through all the elements one by one.
In this case I have a list, and I call the iter function on it. I can print the types of both: one is a list and the other is an iterator. I can call next on the iterators and obtain the items from the list. When you write a for loop, that is what happens: Python gets the iterator from the list and starts obtaining the values. So the for loop is syntactic sugar, in some ways.
In a nutshell, an iterable is any object that can return an iterator. That includes containers like lists and dictionaries, and also files. If you want your own object to behave like that, you have to implement the __iter__ method.
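The relationship between containers, iterables, iterators and the for loop described above can be sketched as follows. The variable names are illustrative; the behavior shown is standard Python.

```python
numbers = [1, 2, 3]

# A container supports membership tests.
is_member = 2 in numbers

# A list is an iterable: iter() returns a fresh, independent iterator.
y = iter(numbers)
z = iter(numbers)

first = next(y)         # each next() call produces the next item
second = next(y)
other_first = next(z)   # z has its own position in the list

# Roughly what a for loop does under the hood.
collected = []
iterator = iter(numbers)
while True:
    try:
        item = next(iterator)
    except StopIteration:   # raised when the iterator is exhausted
        break
    collected.append(item)
```

The list itself never changes; only the iterators keep track of how far they have advanced.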
Some of those things might not be finite: an iterator can generate values forever [unclear]. There is a module in Python called itertools that has a lot of functionality for working with iterables and iterators.
So how do I implement my own iterator? For pragmatic reasons you can implement both the iterable and the iterator in one class: you have __iter__ return self, and then you implement the next method, which in Python 3 is the __next__ method. In this case it is just a class that reads a file and then iterates over it in inverse order, starting from the last line. When there are no more lines to return, it raises StopIteration, which is the exception that signals the end. Therefore you can use it easily in a for loop, the same as if it were a list.
Now we go over to the green part of the graph. We know that we can get iterables from things like lists, dictionaries and files, and from them we can get iterators.
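A class that is both iterable and iterator, iterating over lines in reverse order as described above, might be sketched like this. The class name `ReverseLines` and the use of an in-memory `io.StringIO` instead of a real file are assumptions made to keep the example self-contained.

```python
import io


class ReverseLines:
    """Iterate over the lines of a file-like object, last line first."""

    def __init__(self, fileobj):
        self._lines = fileobj.read().splitlines()
        self._index = len(self._lines)

    def __iter__(self):
        # The object is its own iterator.
        return self

    def __next__(self):
        if self._index == 0:
            # No more lines: signal the end of the iteration.
            raise StopIteration
        self._index -= 1
        return self._lines[self._index]


# io.StringIO stands in for an opened text file here.
reversed_lines = list(ReverseLines(io.StringIO("first\nsecond\nthird\n")))
```

One consequence of returning `self` from `__iter__` is that such an object can only be traversed once; a second loop over the same instance yields nothing.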
But there is another way to get an iterator, and that is by using generators. Who knows what a generator is? A few of you. You get a generator from a generator expression or from a generator function; both, as I said, produce generators.
Let's start with generator expressions. This first example is basically a list comprehension: I create a list of numbers, and then I create a list of the squares of those numbers. If I check the types, both are lists. But what if this is a billion numbers? Will I have enough memory to store them in my RAM? It might be an issue.
You can do the same with a generator expression. This is not a tuple, although it looks like one: it creates a generator object that will produce the squares of the numbers in the list in a lazy way. Each time I call next, it computes the square and returns it. Think of it as a factory of items, where the factory uses, in this case, the square function [unclear]. If I want to do the same as I did with the list, I can generate the squares in a lazy way and print the items; they are generated as the for loop internally calls the next function [unclear].
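The difference between the list comprehension and the generator expression described above, as a small sketch with illustrative data:

```python
numbers = [1, 2, 3, 4, 5]

# List comprehension: all squares are computed and stored immediately.
squares_list = [n * n for n in numbers]

# Generator expression: parentheses, not a tuple; nothing is computed yet.
squares_lazy = (n * n for n in numbers)

first = next(squares_lazy)   # computes only the first square

rest = []
for square in squares_lazy:  # the for loop calls next() internally
    rest.append(square)
```

With five numbers the difference is negligible, but for very large inputs the generator expression only ever holds one value at a time.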
A generator function is the same idea, but it uses a magic word called yield, which works in a nice way. When you call the function, you also obtain a generator. Then, when you call next, the code is executed up to the yield, which returns the value back to the program, and execution continues only when next is called again. This one calculates the Fibonacci sequence, which I am sure you are familiar with, and I can just call next, next, and so on.
Be aware that this is an infinite generator, using a while True: if I put it into a for loop, it will go on forever, generating, generating, generating the numbers of the sequence. I can use the islice function from itertools to obtain just a subset of it; in this case I just get the first three.
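A sketch of an infinite generator function combined with `itertools.islice`, along the lines described above. The function name `fib` and the starting values are assumptions for this illustration.

```python
from itertools import islice


def fib():
    """Infinite generator of Fibonacci numbers, starting 1, 1, 2, ..."""
    prev, curr = 0, 1
    while True:
        yield curr                        # pause here and hand back a value
        prev, curr = curr, prev + curr    # resumes here on the next call


f = fib()                      # calling the function returns a generator
first_two = [next(f), next(f)]

# islice takes a bounded slice of an unbounded iterator.
first_three = list(islice(fib(), 0, 3))
```

Without `islice` (or some other stopping condition), looping over `fib()` would never terminate.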
You can also implement your own iterable using yield: you basically replace the __iter__ method so that, instead of returning self, it is a generator function. In this case I am reading a file, for example from HDFS in a distributed system, or it might just be on a server located somewhere. Imagine that I have a source that has an open method. I open it and start iterating through it; this open method might itself return a generator or an iterator. I do something with the lines and then I yield them back, and I can consume the data with a plain for loop.
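An iterable whose `__iter__` method is a generator function, as described above, might look like the following sketch. The class name `LineSource` and the `opener` callable are hypothetical; `io.StringIO` stands in for a remote file or server connection.

```python
import io


class LineSource:
    """Iterable over cleaned lines from some source.

    `opener` is any zero-argument callable returning a file-like object
    (a hypothetical stand-in for opening a file on HDFS or a server).
    """

    def __init__(self, opener):
        self._opener = opener

    def __iter__(self):
        # __iter__ is a generator function: each call opens the source
        # again and yields the lines one at a time.
        with self._opener() as handle:
            for line in handle:
                yield line.rstrip("\n")


source = LineSource(lambda: io.StringIO("alpha\nbeta\n"))
first_pass = list(source)
second_pass = list(source)   # a new generator is created for each pass
```

Unlike a class that returns `self` from `__iter__`, this kind of iterable can be traversed more than once, because every loop gets a fresh generator.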
So that is more or less the idea of iterables and iterators. You might be thinking: this talk is supposed to be related to data science and data processing, so what is all this about iteration? Well, sometimes you cannot load all your data into memory, and if you are working in the big data field that is probably your situation: you might not have enough memory to store all the data you want. In such cases you use data streaming, and you can get it through lazy evaluation, which is what I just showed: generating or processing things as they become available or are needed. The result is a memory-efficient data processing pipeline, built by chaining iterators.
Some examples. I showed you the class that obtains lines from the server and does some processing, maybe splitting something. I can create another one based on it that checks whether the line is a Python comment or some random comment and passes it on. So it is kind of a stream that you process and then send forward, and you can chain the steps: first you create the source, then you create one object and pass it to the next one, for example, and then you just consume it in a for loop, and inside, the processing happens in a streaming fashion. Think about Inception, the movie: you go through various levels, the first level does something, the second level does something, and they are passing data along.
But you don't have to write it with objects: you can write a generator function and replace the whole thing with that function. This is just an example of how I like to do it. There is another talk at EuroPython that will go into more detail [unclear]. If you didn't get the whole idea here, you can for sure get more information in that talk.
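The chained, streaming pipeline described above can be sketched with plain generator functions. The stage names, the comma-separated input format, and the "mentions Python" filter are all assumptions made for this illustration, not the speaker's actual pipeline.

```python
def read_lines(lines):
    """Source stage: strip the trailing newline from each raw line."""
    for line in lines:
        yield line.rstrip("\n")


def split_fields(lines, sep=","):
    """Split each line into an (id, comment) pair of fields."""
    for line in lines:
        yield line.split(sep, 1)


def keep_python(rows):
    """Pass on only the rows whose comment mentions Python."""
    for row in rows:
        if "python" in row[-1].lower():
            yield row


raw = ["1,Python is great\n", "2,random comment\n", "3,I use python daily\n"]

# Chain the stages; nothing runs until the pipeline is consumed.
pipeline = keep_python(split_fields(read_lines(raw)))

result = []
for row in pipeline:   # each row flows through all stages one at a time
    result.append(row)
```

Here `raw` is a small list, but any iterable works as the source, including a file object or a class that streams lines from a server, so the whole data set never has to be in memory at once.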
So, to finish, the closing remarks for data scientists, engineers, developers, you name it. Start with the collections and itertools modules; they are your best friends. Build your data processing pipelines with iterables and iterators. Use object-oriented programming to organize your code: that not only makes it more maintainable, but when you get to integration and you are working in a large code base, you will have a better time getting your code into production. And finally, you are going to have to start moving towards being more of a software engineer instead of just a scientist; that becomes part of the job when you want to get your solutions into production.
Credits: the images that I used are listed here. I based most of my talk on blog posts and articles. The main ideas come from Radim Řehůřek, the creator of gensim, and I really like how, in the article I am linking here, he talks about data processing pipelines using iterators and iterables.
[unclear] We are hiring. If you want to know more about what it is like to work for us, we have a booth where you can get some goodies; just drop by and talk to us. If you want to talk to me, come and find me after the Q&A session [unclear]. So: questions, comments, remarks? [unclear] Thank you.
[Session chair] Everyone is hungry, so I don't think there will be many questions. [unclear] If you really don't have questions, we thank you again.

Metadata

Formal metadata

Title: Things I wish I knew before starting using Python for Data Processing
Series title: EuroPython 2016
Part: 55
Number of parts: 169
Author: Cabrera, Miguel
License: CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use and modify the work or its content for any legal, non-commercial purpose, and copy, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner they specify and that you pass on the work or this content, including in modified form, only under the terms of this license.
DOI: 10.5446/21189
Publisher: EuroPython
Release year: 2016
Language: English

Content metadata

Subject area: Computer science
Abstract: Miguel Cabrera - Things I wish I knew before starting using Python for Data Processing. In recent years one of the ways people get introduced to Python is through its scientific stack. Although this is not bad, it may lead to learning solely one aspect of the language, while overlooking other idioms and functionality included in Python as well as some basic good practices of software development. I will share some useful tricks, tools and techniques, and software design and development principles that I find beneficial when working on a data processing / science project.
-----
In recent years one of the ways people get introduced to Python is through its scientific stack. Most people who learned Python this way are not trained software developers, and many times it is their first contact with a programming language. Although this is not bad, it may lead to learning solely one aspect of the language while overlooking other idioms and the standard and common libraries included in Python, as well as some basic good practices of software development. This may become a problem when a data science project is moved from an experimentation phase to integration with a technical environment. In this talk I share some useful tricks, tools and techniques, as well as some software design and development principles that I find beneficial when working on a data processing / science project. The talk is divided into two parts. One is Python centered, where I will talk about some powerful Python constructs that are useful in data processing tasks. These include parts of the collections module, generators and iterators, among others. In the other I will describe some general software development concepts, including SOLID, DRY, and KISS, that are important for understanding the rationale behind software design decisions.
