Combining the powerful worlds of Python and R

Speech transcript
OK, we are starting again. Now we have Ralph Heinkel here, who started using Python in 1998 [...] and he is going to talk about how we can use Python and R for statistical analysis.
Hello, good morning everybody, and welcome to this talk about how to build a bridge between Python and R. How many of you know R or are using it? Can you just give a quick show of hands? OK, so most of you are, as far as I can see.
For those who don't know what R is: R is basically a huge package and tool for doing statistical analysis and creating graphical representations of your data. It's open source and runs on all major platforms like Windows, Linux and Mac. The language itself is, I think, not so great; Python is a much better language to program in. But the real power of R, I think, comes from the huge library of packages that is available around it, and you can download most of those packages from the CRAN network.
The situation we faced when I started this project was that Python and R are basically completely separate ecosystems. When we wanted to do statistical analysis of data from Python, we basically had to pack it into CSV files, transport them over to R, do the analysis and put the results back into Python, and that was not really very convenient for us. There are packages which solve that problem: at that time there was RPy, now there is rpy2. These are basically extension packages for R in Python, so you compile R into a module, import it, and then this rpy module provides functionality to access R, do the analysis and get your results directly in Python. It has a slight disadvantage, which was a disadvantage for us: R runs in the same process as Python, and therefore also on the same machine. In our case Python was running a web application server, and we wanted to do analysis in R, and that was heavy analysis that was really slowing down our server. So we had to spread R out onto different machines, and that was the approach we were taking. That's what I'm going to talk about.
We wanted to build a bridge between Python and R and to be able to run R on a different computer or on a farm of computers. The first piece of that bridge, the first socket, is Rserve. Rserve is a TCP/IP server for R developed by Simon Urbanek. It allows multiple simultaneous connections from an arbitrary number of clients (arbitrary as long as the machine can take it, of course), and every client that connects to that Rserve server via TCP/IP has its own namespace, so all calculations are really done without side effects. Besides the Python one there are clients for C++, Java, C# and so on, and there is a growing number of clients. Some of them come with the package directly; other clients can be downloaded as third-party packages.
So the second piece of that bridge is pyRserve; that's the part that I have been writing.
It's a pure Python client adapter for connecting via TCP/IP to Rserve. What it does: it serializes Python data objects and sends them over the network to R, R can do some calculation with them, and the result data is deserialized and parsed on the Python side, where native Python result objects are created from it. It allows you to evaluate arbitrary R commands on the R side, you can trigger function calls in R, and you can set and get variables in the R namespace. The latest addition to pyRserve is that it allows R code to trigger commands in your Python interpreter from the R side; I will show that later.
The missing piece of that bridge is the protocol with which these two sockets are talking to each other. That's the QAP1 protocol, the "Quad Attributes Protocol", which sounds much bigger than it is. I think it was invented by Simon just for the purpose of letting R clients talk to the Rserve server. It's a binary protocol, a little bit like pickle in Python, but it allows us to exchange serialized objects between R and Python, not just within the Python ecosystem. It doesn't only contain serialized data, it also contains commands, so that the R side knows what to do with the data you are sending over. It's a synchronous protocol: when you send a command and some data to Rserve, you have to wait until you get the result back. Even if it's just a None object, you have to wait until R has really finished the calculation before you can send a second command over the same connection. If you want to do parallel computing you have to open multiple connections to the Rserve server, which is possible from the Python side.
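Because each connection is synchronous, running jobs in parallel means one connection per job. The following is only a sketch of that idea, not code from the talk: `eval_in_parallel` is a hypothetical helper, and it assumes that the `connect` callable returns an object with `eval()` and `close()` methods, as a pyRserve connection does.

```python
from concurrent.futures import ThreadPoolExecutor


def eval_in_parallel(expressions, connect):
    """Evaluate each R expression on its own connection.

    `connect` is any zero-argument callable returning a connection object
    with `eval()` and `close()` (for example `pyRserve.connect`).
    """
    def run(expression):
        conn = connect()              # one connection, one R namespace per job
        try:
            return conn.eval(expression)   # blocks until R has finished
        finally:
            conn.close()

    workers = max(1, len(expressions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the order of `expressions`
        return list(pool.map(run, expressions))


# Usage against a running Rserve (assumption: server on localhost, default port):
#   import pyRserve
#   results = eval_in_parallel(["sum(1:10)", "mean(c(1, 2, 3))"], pyRserve.connect)
```

Threads are sufficient here because each worker spends its time waiting on the network while R does the computation.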
Installation is quite easy. If you download R from the R project server, it's not possible to use the precompiled packages, because for running Rserve you need to compile and link R with a special flag, --enable-R-shlib; otherwise Rserve cannot be loaded [...]. Rserve itself can be compiled directly by R: R brings its own compiler for packages, R CMD INSTALL, followed by the name of the package which you downloaded before. And finally the missing piece on the Python side is just a Python package, downloadable from the PyPI server. It runs on all major modern Python versions from 2.6 to 3.4. It needs NumPy; if that is already installed, fine, otherwise it will be installed on the fly.
Starting is also easy. The server side is just started with R CMD Rserve. It opens a connection on the network, and by default it only listens on localhost. That's a security feature, because in the olden times Rserve didn't have a way to protect access to it, so there was no login or password check. That is now built in on the Rserve side; it's just not built in on the pyRserve side yet. So that's why by default it only listens on localhost. When you connect to an Rserve server running on your local machine, it's enough to call pyRserve.connect(), which goes to localhost by default. If you want to connect to a remote machine, just provide a hostname, and provide a port if you're running on a non-default port on the server side.
The connection itself has some attributes, so you can see where you are really connected to. That's especially interesting if you have multiple parallel connections open on your Python side, so you can see which connection is connecting to where. You can close the connection, and you can see whether the connection is closed.
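As a rough illustration of connecting and inspecting a connection: the sketch below assumes that a pyRserve connection exposes `host`, `port` and `isClosed` attributes, and that 6311 is the Rserve default port. `describe` and `open_connection` are hypothetical helper names, not part of pyRserve.

```python
def describe(conn):
    """Return a short summary of where a connection points and its state."""
    state = "closed" if conn.isClosed else "open"
    return f"{conn.host}:{conn.port} ({state})"


def open_connection(host="localhost", port=6311):
    """Open a connection to an Rserve process (requires pyRserve and a server)."""
    import pyRserve  # imported lazily so `describe` works without pyRserve
    return pyRserve.connect(host=host, port=port)


# Usage against a running Rserve:
#   conn = open_connection()            # local server, default port
#   print(describe(conn))
#   conn.close()
#   print(describe(conn))
```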
Now we come to the first real steps: what can you do with such a connection to Rserve? The connector itself provides a method called eval that allows you to send arbitrary R expressions or R commands as a string to the R side, lets R evaluate that string expression, and receives the result back as a native Python object on the Python side. So here I'm just summing up two numbers. You can also call functions in R: the c operator creates an array, a numeric array, on the R side, and since R returns the result of that expression, what you get (there's something popping up all the time) is that array as a NumPy array on the Python side.
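The two evaluations described above might look like the following sketch. The slide code itself is not in the transcript, so the concrete expressions are illustrative; `basic_eval_demo` is a hypothetical helper that works with any object offering a pyRserve-style `eval()`.

```python
def basic_eval_demo(conn):
    """Send two R expressions as strings and return the converted results."""
    total = conn.eval("3 + 5")            # R adds the numbers, Python gets a number
    vector = conn.eval("c(1, 2.5, 4)")    # R's c() builds a numeric vector
    return total, vector


# Usage against a running Rserve:
#   import pyRserve
#   conn = pyRserve.connect()
#   total, vector = basic_eval_demo(conn)   # vector arrives as a NumPy array
#   conn.close()
```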
You don't always want the result returned back from R. When you assign a very complex data structure to a variable on the R side, that's nothing you want to see on the Python side, because it would have to be serialized by R, passed through the network and deserialized on the Python side. If you just want to assign to a variable, you want to avoid that. For that case there is a variant of the eval command called voidEval, which just executes the expression on the R side and doesn't return anything to Python. If you still want to see the value of the variable afterwards, you can just use the eval command.
Some more examples of string evaluation: you can even define a function on the R side. Here I create a function called times2 which takes one argument, and the second command just executes the function, and the result is returned back to Python. You can even evaluate whole scripts: you can define them in Python, store them in a string or wherever, send them over, execute them and get the result. I think that's really quite straightforward.
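A small sketch of the difference between `voidEval` and `eval`, and of sending a multi-line script. The R script content and the helper `run_script` are made up for illustration; only `voidEval` and `eval` are taken from the talk.

```python
# A small R script kept in a Python string (illustrative content).
R_SCRIPT = """
times2 <- function(x) { x * 2 }
squares <- sapply(1:5, function(i) i^2)
"""


def run_script(conn, script=R_SCRIPT):
    """Run a script without fetching its value, then pull selected results."""
    conn.voidEval(script)                 # executed in R, nothing is sent back
    doubled = conn.eval("times2(21)")     # call the function defined by the script
    squares = conn.eval("squares")        # fetch a variable only when it is needed
    return doubled, squares
```

Using `voidEval` for the definitions avoids serializing a return value that nobody reads; only the two explicit `eval` calls transfer data back to Python.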
So string evaluation is sort of the basic usage of communicating with R. The connector provides a much more interesting attribute called r, which represents the namespace of the R interpreter running on the remote side. Via this r attribute you can access and set variables in the R interpreter and make function calls. You have to watch out: namespaces are kept separate for every connection, as I said before, but they also get deleted once your connection is closed, so you have to make sure you save your workspace in R, or you just lose whatever is in there.
Just to see what the difference is between string evaluation and the real namespace approach: these two commands do basically the same thing. A variable is instantiated on the R side with the string "abc". The first approach is the string evaluation; the second one does exactly the same thing, just in a very Pythonic way. It looks like "abc" is assigned to a local variable in Python, but it is actually serialized, sent over to R and set in that namespace. It's even possible to send much more complex data. Here is an example where I create a NumPy array in Python, give it a shape and assign that array to a variable in the R namespace. Again that NumPy array is serialized and sent to R, and a native R array is created on the R side. The last command, an eval of dim() on that matrix, shows you that you can access the array in R and get its dimensions as a result.
But how are functions called in that Pythonic way using the R namespace? Just to demonstrate that, I'm creating three simple functions here. The first one doesn't take an argument and just returns a static string, the second one takes one argument and doubles the value, and the last one takes keyword arguments; that's something you can do in R, which is very Pythonic already. And this is the way you call them, just using the R namespace: call the first function and it returns the string, provide an argument to the second, and provide keyword values to the last one and get a list back. I think that's very easy to see and understand.
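The namespace-style access described above could be sketched as follows. The R function names (`hello`, `double_it`, `describe`) and the helper `namespace_demo` are invented for this example; the sketch assumes that attribute access on `conn.r` maps to R variables and functions, and that keyword arguments are passed through to R as named arguments.

```python
def namespace_demo(conn):
    """Set a variable and call three R functions through the `r` namespace."""
    # Same effect as conn.voidEval('greeting <- "abc"'), but written as Python.
    conn.r.greeting = "abc"

    # Three small R functions: no argument, one argument, named arguments.
    conn.voidEval("""
        hello <- function() { "hello from R" }
        double_it <- function(x) { x * 2 }
        describe <- function(name, size = 1) { list(name = name, size = size) }
    """)

    return {
        "greeting": conn.r.greeting,                     # read the variable back
        "hello": conn.r.hello(),
        "doubled": conn.r.double_it(21),
        "described": conn.r.describe(name="sample", size=3),
    }
```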
A more complex thing: some functions accept another function as an argument, like the map function in Python, which accepts a function and a data structure you can map it against. sapply in R is basically the same thing, the arguments are just in a different order: it takes an array and allows you to pass a function in R to be applied to it. That's also possible: from Python you can refer to a function that is sitting on the R side, like conn.r.times2. It's important not to pass references to Python functions; that doesn't make sense. Here I define a function double in Python and try to refer to that function. Of course it's not possible to serialize functions from Python into R; you can serialize data but not functions. So that gives you a name error, because double is not defined on the R side. This example also shows you that pyRserve can handle errors that are raised on the R side. When an expression is evaluated I look at the result, I can see whether an error was raised, and I can carry the error message over from R into Python and raise an exception providing the message that R sent to me. So "name double is not defined" is basically what R tells me.
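Errors raised in R surface as Python exceptions, so ordinary `try`/`except` applies. The sketch below uses hypothetical helpers (`safe_eval`, `pyrserve_error_types`); it assumes that pyRserve's evaluation errors are of type `pyRserve.rexceptions.REvalError`.

```python
def safe_eval(conn, expression, error_types=(Exception,)):
    """Evaluate an R expression and return (ok, result_or_error_message)."""
    try:
        return True, conn.eval(expression)
    except error_types as exc:
        # The exception text carries the message produced on the R side.
        return False, str(exc)


def pyrserve_error_types():
    """Exception types to catch when talking to a real Rserve (assumption)."""
    from pyRserve.rexceptions import REvalError  # lazy import, needs pyRserve
    return (REvalError,)


# Usage against a running Rserve:
#   ok, value = safe_eval(conn, "undefined_function(1)", pyrserve_error_types())
#   if not ok:
#       print("R reported:", value)
```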
This example shows you that things can be rather inefficient if you don't do it right. What I'm doing here is creating a NumPy array and assigning it to a variable on the R side. Then I make a function call to sapply, providing the array as an argument and referring to the times2 function, which is applied to every element of the array. So why is that inefficient? What that really does is: the first line assigns the array on the R side, then I'm pulling the array back over the network into Python, and the last line pushes the array back to R. So the array is sent back and forth three times. To avoid that, there is an additional attribute besides the r namespace, called ref (for reference), which allows you to reference a data object in R without actually pulling it over; it just provides a proxy to it. In this example I use that to reference an array which exists in R and supply it as an argument to the sapply function. That avoids the data being sent back and forth three times.
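A sketch of the proxy idea, under the assumption that `conn.ref.<name>` returns a lightweight reference to an R object and `conn.r.<function>` a reference to an R function. The variable name `arr` and the helper `apply_remote` are hypothetical.

```python
def apply_remote(conn, values):
    """Apply an R function to data that stays on the R side after upload."""
    conn.r.arr = values                                   # data crosses the network once
    conn.voidEval("times2 <- function(x) { x * 2 }")      # define the R function

    # Inefficient variant: conn.r.arr would pull the array back into Python
    # and then send it to R again as an argument:
    #   conn.r.sapply(conn.r.arr, conn.r.times2)

    # Proxy variant: only a reference to `arr` is sent, the data stays in R.
    return conn.r.sapply(conn.ref.arr, conn.r.times2)
```

Only the final result travels back to Python, which matters once the array is large.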
Out-of-band messages: that's one of the latest additions. It allows R code to send messages into the Python interpreter, which on the Python side trigger the call of a callback function that you can define. In order to make that work you need to start Rserve with a special flag enabled in the config file; that's the "oob enable" setting you can see in that example. You have to start Rserve so that it uses that config file, with the corresponding command line option. That starts Rserve with the additional code for callback messages.
The way it's set up is actually simple: you define a callback function in Python that takes two arguments. The message is basically what you want to receive from R, the payload of the actual callback, and the message code is an additional qualifier that helps you to interpret what you have received as a message; it can be defined when the callback is triggered, you will see that in a moment. In order to make that callback active, it has to be assigned to a special attribute on the connector called oobCallback. So you just assign it to that, and whenever pyRserve receives a callback message, that function will be called.
Here are two simple examples. To trigger a callback from R you have to call self.oobSend. This self has nothing to do with Python's self; it's just a namespace in R. I don't really understand why Simon has implemented it that way, but it's done, so that's the way to call it. In the first call I just send the message, no message code, and when I print out what I received in the callback you see that the message code is 0 by default. In the second call I override the 0 and send a more qualified message code; one of the next examples will show you why you would want to do that.
One possible application of callbacks is to provide feedback messages on the progress. Here you see a fake bigJob function in R which has intermittent callbacks: it calls self.oobSend whenever part of the calculation has been done. I'm setting up a primitive callback function in Python which just prints the message in that case. Then when I call the big job, you see the callback messages are printed out while the R function is still running, and at the end you get the result back and can do anything with it.
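The progress-reporting pattern might be sketched like this. It assumes an Rserve started with out-of-band messages enabled, that R code can call `self.oobSend(message, code)`, and that the Python callback is registered via `conn.oobCallback`. The R function body and the helper `run_with_progress` are illustrative, not the slide code.

```python
# R side: a fake long-running job that reports after every step.
BIG_JOB_R = """
bigJob <- function(steps) {
    for (i in 1:steps) {
        Sys.sleep(0.1)   # stand-in for real work
        self.oobSend(paste("step", i, "of", steps))
    }
    "done"
}
"""


def run_with_progress(conn, steps=3, report=print):
    """Run the R job and forward its out-of-band messages to `report`."""
    def on_message(message, code=0):
        # Called by the connection while the R function is still running.
        report(f"[R, code {code}] {message}")

    conn.oobCallback = on_message
    conn.voidEval(BIG_JOB_R)
    return conn.r.bigJob(steps)     # returns only after the last callback
```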
Another realistic application is to have a method dispatcher, so you can make a callback from R and control which kind of callback method is then actually called. For doing that I'm defining three constants on the R side and on the Python side. I'm setting up a dictionary in Python and assigning three different functions to be called depending on what kind of message code I receive. The very small dispatcher is defined as the callback function: it just accepts the message and the code, looks up the appropriate function in the function dictionary and calls it with the message I received. Here you can see it: I make a callback providing the argument "foo" and I want the store method to be called, which just appends the message it received to a list called store, and when I print the list, it has one element. So that's a very nice feature [...].
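The dispatcher idea needs no R at all to illustrate. In this sketch the constant names and values (`STORE`, `PRINT`, `IGNORE`) and the helper `make_dispatcher` are hypothetical; in a real setup the same numeric codes would have to be defined on the R side and passed as the second argument of `self.oobSend`.

```python
# Hypothetical message codes; the R side must use the same numbers.
STORE, PRINT, IGNORE = 1, 2, 3


def make_dispatcher(handlers):
    """Build a callback that routes a message to a handler chosen by its code."""
    def dispatch(message, code=0):
        handler = handlers.get(code)
        if handler is None:
            raise KeyError(f"no handler for message code {code}")
        return handler(message)
    return dispatch


stored = []
dispatcher = make_dispatcher({
    STORE: stored.append,            # keep the message for later
    PRINT: print,                    # show it immediately
    IGNORE: lambda message: None,    # drop it
})

# Simulates what would happen when R sends "foo" with the STORE code.
dispatcher("foo", STORE)
```

With a real connection the dispatcher would be registered as `conn.oobCallback = dispatcher`.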
I'm coming to the end: a small discussion of this network approach. The good thing about it, compared to the rpy approaches: when you work together with people in your group or your team, and they are all doing calculations, and you want to make sure everybody runs the exact same R version and the exact same versions of all the R packages you are using, then having one single installation on a server is much easier to maintain than when every team member has to maintain their own and ensure that all run the same versions. And, as I said, if you have really compute-intensive stuff you can set up a real R compute farm and have a load balancer which distributes CPU-intensive jobs to different R servers. On the con side, of course, you have to serialize all the data that you are sending back and forth. If you are using a huge amount of data, that can really be a bottleneck for you, so that's a thing you have to balance for yourself. Security aspects are the last thing: as I said before, the Rserve server side nowadays allows credentials, so you can log in, but pyRserve doesn't have that yet, so at the moment it's best to just use it [...].
And that's about it. Thank you for your attention.
Are there any questions?
Thanks for the talk, very interesting approach. First question: did I get it right that you have one session per connection, which keeps the state?
Yes, there is one namespace, one session per connection.
OK, so it's suitable for multiple users?
Exactly.
And can you say anything more about the serialization of the data, the protocol you are using [...]?
OK, it's a binary format that was invented by Simon, and there is very elaborate documentation on his website. It basically works the same way as pickle: it recurses into your data and goes down the tree, for nested dictionaries, lists and tuples; all of that can be serialized. It basically has the same approach, just that this serialization protocol is not Python-specific but Rserve-specific, so all the clients of Rserve can interact with it. Going into the technical details would just be too much for this talk, and everything can be looked up on the website.
Have you considered implementing any other clients, for languages other than Python? [...] Since you are serializing data, it doesn't matter what you use as a client [...], and that means we could exchange binary data between different languages, so have a network of connections.
OK, that's an interesting approach; I haven't thought about that. It could be useful as a general way to exchange binary data between different languages and different systems. That's an interesting idea.
OK, thank you very much.
Are there any other questions? [...]

Metadata

Formal metadata

Title: Combining the powerful worlds of Python and R
Series title: EuroPython 2014
Part: 54
Number of parts: 120
Author: Heinkel, Ralph
License: CC Attribution 3.0 Unported:
You may use and modify the work or its content for any legal purpose, and reproduce, distribute and make it publicly available in unchanged or modified form, provided that you credit the author/rights holder in the manner specified by them.
DOI: 10.5446/20022
Publisher: EuroPython
Release year: 2014
Language: English
Production place: Berlin

Content metadata

Subject area: Computer science
Abstract: Ralph Heinkel - Combining the powerful worlds of Python and R Although maybe not very well known in the Python community there exists a powerful statistical open-source ecosystem called R. Mostly used in scientific contexts it provides lots of functionality for doing statistical analysis, generation of various kinds of plots and graphs, and much, much more. The triplet R, Rserve, and pyRserve allows the building up of a network bridge from Python to R: Now R-functions can be called from Python as if they were implemented in Python, and even complete R scripts can be executed through this connection. ----- pyRserve is a small open source project originally developed to fulfill the needs of a German biotech company to do statistical analysis in a large Python-based Lab Information Management System (LIMS). In contrast to other R-related libraries like RPy where Python and R run on the same host, pyRserve allows the distribution of complex operations and calculations over multiple R servers across the network. The aim of this talk is to show how easily Python can be connected to R, and to present a number of selected (simple) code examples which demonstrate the power of this setup.
Keywords: EuroPython Conference
EP 2014
EuroPython 2014
