Combining the powerful worlds of Python and R

Speech transcript
Session chair: OK, we are starting now. We have Ralph Heinkel here. He has been using Python since 1998 [inaudible], and he is going to talk about how we can use Python and R for statistical analysis.
Speaker: Hello, good morning everybody, and welcome to this talk about how to build a bridge between Python and R. How many of you know R or are using it? Can you just give a quick show of hands? OK, so most of you, as far as I can see.
For those who don't know what R is: R is basically a huge tool for doing statistical analysis and creating graphical representations of your data. It's open source and runs on all major platforms like Windows, Linux and Mac. The language itself is, I would say, not so great; Python is a much better language to program in. But the real power of R, I think, comes from the huge library of packages that is available around it, and you can download most of those packages from the CRAN network.
The situation we faced when I started this project was that Python and R were basically completely separate ecosystems. When we wanted to do statistical analysis of data from Python, we basically had to pack the data into CSV files, transport them over to R, do the analysis, and put the results back into Python, and that was not really convenient for us.
There are packages which solve that problem: at that time there was rpy, now there is rpy2. These are basically extension packages for R in Python: you compile R into a module, import it, and this module then provides functions to access R and get your results directly in Python. It has a slight disadvantage, or what was a disadvantage for us: R runs in the same process as Python, and therefore on the same machine. In our case Python was running a web application server, and when we wanted to do heavy analysis in R, that was really slowing down our server. So we had to spread R out onto different machines, and that was the approach we took; that's what I'm going to talk about.
We wanted to build a bridge between Python and R and be able to run R on a different computer or on a farm of computers. The first piece of that bridge, the first socket, is Rserve, which is a TCP/IP server for R developed by Simon Urbanek. It allows for multiple simultaneous connections from an arbitrary number of clients; arbitrary as long as the machine can take it, of course. Every client that connects to Rserve via TCP/IP has its own namespace, so all calculations are done without side effects on other clients. Besides Python, clients are available for Java, C++, C# and so on, and there's a growing number of clients. Some of them come with the package directly; other clients can be downloaded as third-party packages.
The second piece of that bridge is pyRserve; that's the part that I have been writing. It's a pure Python adapter for connecting via TCP/IP to Rserve. What it does: it serializes Python data objects and sends them over the network to R; R does some calculation with them and returns some result data, which is deserialized on the Python side, and native Python result objects are created from it. It allows you to evaluate arbitrary R commands on the R side, you can trigger function calls in R, and you can set and get variables in the R namespace. The latest addition to pyRserve is that it allows R to trigger commands in your Python interpreter from the R side; I will show that later.
The missing piece of that bridge is the protocol with which these two sockets talk to each other. That's the QAP1 protocol, the Quad Attributes Protocol, which sounds much bigger than it is. I think it was invented by Simon just for the purpose of letting R clients talk to Rserve. It's a small protocol, a bit like pickle in Python, but it allows us to exchange serialized objects between R and Python, not just within the Python sphere. And it doesn't only contain serialized data; it also contains commands, so that the R side knows what to do with the data you are sending over.
It's a synchronous protocol: when you send a command and your data to Rserve, you have to wait until you get a result back. Even if it's just a None object, you have to wait until the R connection has really finished the calculation before you can send a second command over the same connection. If you want to do parallel computing, you have to open multiple connections to Rserve, which is possible from the Python side.
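Because one connection handles one command at a time, running jobs in parallel means one connection per job. The following is a minimal sketch of that idea, not code from the talk: the helper name eval_in_parallel and the use of a thread pool are assumptions for illustration, and the connection factory is passed in so that the sketch does not depend on a running Rserve. With pyRserve installed, the factory could be something like lambda: pyRserve.connect(host="r-host"), where the hostname is made up.

```python
from concurrent.futures import ThreadPoolExecutor


def eval_in_parallel(connect, expressions, max_workers=4):
    """Evaluate each R expression on a connection of its own.

    `connect` is a zero-argument callable returning an object that offers
    eval(expression) and close(), as a pyRserve connection does.
    """
    def run(expression):
        conn = connect()  # fresh connection, so jobs do not wait for each other
        try:
            return conn.eval(expression)
        finally:
            conn.close()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the order of `expressions`
        return list(pool.map(run, expressions))
```

Since every connection has its own namespace on the R side, variables defined by one job are not visible to the others.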
Installation is quite easy. You download R itself from the CRAN server. It's not possible to use the precompiled packages, because for running Rserve you need to compile and link R with a special flag, --enable-R-shlib; otherwise Rserve cannot be loaded. Rserve can then be compiled directly by R: R brings its own installer for packages, R CMD INSTALL followed by the name of the package which you downloaded before. And finally, the missing piece on the Python side is just a Python package, downloadable from the PyPI server. It runs on all major modern Python versions from 2.6 to 3.4. It needs NumPy; if that is already installed, that's fine, otherwise it will be installed on the fly.
Starting is also easy. On the server side, Rserve is just started with the command R CMD Rserve. It opens a connection on the network, and by default it only listens on localhost. That's a security feature, because in the olden times Rserve didn't have a way to protect access to it; there was no password check. That is now built in on the Rserve side, but it's not yet built into pyRserve. So that's why you should only listen on private IP addresses and localhost.
When you connect to an Rserve running on your local machine, it's enough to call pyRserve.connect(); it goes to localhost by default. If you want to connect to a remote machine, just provide a hostname, and provide a port if you are running on a non-default port on the server side. The connection itself has some attributes, so you can see where you are really connected to. That's especially interesting if you have multiple parallel connections open on your Python side, so you can see which connection goes where. You can close the connection, and you can check whether the connection is closed.
So now we come to the first real steps: what can you do with such a connection to Rserve? The connector itself provides a method called eval that allows you to send arbitrary R expressions or commands to the R side, let R evaluate that string expression, and receive the result back as a native Python object on the Python side. So here I'm just summing up two numbers. You can also call functions in R: this c operator creates a numeric array on the R side and returns the result of that expression, and what you get is a NumPy array. That's something that is popping up all the time: you get NumPy arrays on the Python side.
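The connect, eval and close steps described above can be sketched as follows. This is an illustration under assumptions rather than the code shown on the slides: the helper names r_connection and first_steps are made up, and the connect parameter exists only so the sketch can be tried with a stand-in object when no Rserve is running. The calls pyRserve.connect(host=..., port=...), conn.eval(...) and conn.close() are the ones named in the talk; 6311 is Rserve's default port.

```python
from contextlib import contextmanager


@contextmanager
def r_connection(host="localhost", port=6311, connect=None):
    """Open a connection to Rserve and close it again on exit."""
    if connect is None:
        import pyRserve  # imported lazily; needs `pip install pyRserve`
        connect = pyRserve.connect
    conn = connect(host=host, port=port)
    try:
        yield conn
    finally:
        conn.close()


def first_steps(conn):
    """Send two R expressions as strings and return both results."""
    total = conn.eval("3 + 5")        # plain arithmetic, evaluated by R
    vector = conn.eval("c(1, 2, 3)")  # c() builds a numeric vector in R
    return total, vector
```

With Rserve started via R CMD Rserve on the same machine, usage would look like `with r_connection() as conn: print(first_steps(conn))`.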
Sometimes you don't want the result returned from R. When you assign a very complex data structure to a variable on the R side, that's nothing you want to see on the Python side, because it would have to be serialized by R, passed through the network, and deserialized on the Python side. If you just want to assign to a variable, you want to avoid that. For that case there's a variant of the eval command called voidEval, which just executes the expression on the R side and doesn't return anything to Python. If you still want to see the value of the variable, you can just use the eval command.
Some more examples of string evaluation: you can even define a function on the R side. Here I create a function called times2, which takes one argument, and the second eval command just executes the function, and the result is returned back to Python. You can even develop R scripts that you define in Python: you store them in a string or wherever, send them over, execute them, and get the result. I think that's really straightforward.
So using eval was sort of the basic usage of connecting to R and communicating with R. The connector provides a much more interesting attribute called r, which represents the namespace of the R that is running on the remote side. Via this r you can access and set variables in the R interpreter, and you can make function calls. You have to watch out: namespaces are separate for every connection, as I said before, but they also get deleted once your connection is closed. So you have to make sure you save your workspace in R if you want to use whatever is in there later.
Just to see what the difference between string evaluation and the real namespace approach is: these commands do basically the same thing. A variable aVar is instantiated on the R side with the string "abc". The first approach is the string evaluation; the second one does exactly the same thing, just very Pythonic. It looks like "abc" is assigned to a local variable in Python, but it's actually serialized, sent over to R and set in that namespace.
It's even possible to handle much more complex data. That's an example where I create a NumPy array in Python, give it a shape, and assign that array to a variable for a matrix. That NumPy array is serialized and sent to R, and an R array is created on the R side. The last command, calling dim on the matrix through the r namespace, shows you that you can access that array in R and get the dimensions as a result.
But how about function calls in that Pythonic way, using the r namespace? Here, just to demonstrate that, I'm creating three simple functions. The first one doesn't take an argument and just returns a static string; the second one takes one argument and doubles the value; and the last one takes keyword arguments. So that's what you can do in R, and it's very Pythonic already. And that's the way you call them, just using the r namespace: you call the functions with zero arguments, with a provided argument, or provide a keyword value to the last one and get the list back. I think that's very easy to see and understand.
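To make the contrast between string evaluation and the r namespace concrete, here is a small sketch. It assumes a connection object that offers voidEval and the r attribute as described in the talk; the function names set_variable_two_ways, call_r_functions and the R function describe are made up for illustration, and the quoting in the string-based variant is deliberately naive.

```python
def set_variable_two_ways(conn, value="abc"):
    """Set the same R variable via string evaluation and via conn.r."""
    # 1) String evaluation: R parses the text; voidEval returns nothing.
    conn.voidEval('aVar <- "%s"' % value)
    # 2) Namespace access: the Python object is serialized and sent to R.
    conn.r.aVar = value
    # Reading the attribute fetches the value back from the R namespace.
    return conn.r.aVar


def call_r_functions(conn):
    """Define functions in R as text, then call them like Python functions."""
    conn.voidEval("""
        times2 <- function(x) x * 2
        describe <- function(name, value=0) paste(name, value)
    """)
    doubled = conn.r.times2(21)
    # Keyword arguments are passed through to R as named arguments.
    described = conn.r.describe("x", value=5)
    return doubled, described
```

The second variant avoids building R source text by hand, which is why it scales better to arrays and nested data.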
A more complex thing is that some functions accept another function as an argument, like the map function in Python: it accepts a data structure and a function that is mapped over it. There is sapply in R, and that's basically the same thing, just with the arguments in a different order: it takes an array and allows you to pass a function in R to be applied to it. That's also possible: you can refer from Python to the function that is sitting on the R side, here times2. It's important not to pass references to Python functions; that doesn't make sense. So here I define a function double in Python, and if you try to refer to that function, of course it's not possible to serialize functions from Python into R. You can serialize data but not functions. So that gives you a NameError, because double is just not defined on the R side.
This example also shows you that pyRserve can handle errors that are raised on the R side. When an expression is evaluated I'm looking at the results, I can see if an error was raised, and I can pull the error message over from R into Python and raise an exception providing the message that R sends to me. So "name double is not defined" is basically what R tells me.
This example shows you that things can be rather inefficient if you don't do it right. What I'm doing here: I create a NumPy array and assign it to a variable arr on the R side, and then I make a function call with sapply, providing that array as an argument and referring to the times2 function to be applied to every element of the array. Why is that inefficient? What it really does: the first line assigns the array on the R side; then I'm pulling the array back over the network into Python; and the last line pushes the array back to R. So the array is sent back and forth three times. To avoid that, there's this additional attribute, ref, a different namespace, which allows you to reference a data object in R without actually pulling it over; it just provides a proxy to it. In this example I now use that to reference an array which exists in R and supply it as an argument to the sapply function, so that avoids the data being sent back and forth three times.
Out-of-band messages: that's one of the latest additions. It allows R code to send messages into the Python interpreter, which on the Python side trigger the call of a callback function that you can define. In order to make that work, you need to start Rserve with a special setting enabled in the config file; that's the "oob enable" line that you can see in that example. And you have to start Rserve so that it uses that config file, with the corresponding command-line option. That also brings in the additional code in Rserve for callback messages.
The way it's set up: you define a callback function in Python that takes two arguments. The message is basically what you receive from R, the payload of the actual callback, and the message code is an additional qualifier that helps you interpret what you have received; it can be defined when the callback is triggered, as you will see in a moment. In order to make that callback active, it has to be assigned to a special attribute of the connector called oobCallback. Just assign it, and whenever pyRserve receives a callback message, that method will be called.
So here are two simple examples. To trigger a callback from R you have to call self.oobSend. The self has nothing to do with Python's self; it's just a namespace in R. I don't really understand why Simon implemented it that way, but it's done, so that's the way to call it. In the first call I just send a message without a message code, and when I print out what I received in the callback, you see the message code is always 0 by default. In the second call I also send a qualifying message code, and one of the next examples will show you why you would want to do that.
One possible application of these callbacks is providing feedback messages about progress. Here you see a fake big-job function which has intermittent callbacks, calling oobSend to say how much of the calculation has been done. I set up a primitive callback method in Python that just prints in that case. Then when I call the big job, you see the callback messages are printed out while the R function is still running, and at the end you get the result back and can do anything with it.
Another real-life application is a method dispatcher, so you can make a callback from R and control which kind of callback method is then actually called. For doing that I'm defining three constants on the R side and on the Python side. I'm setting up a dictionary in Python and assigning three different functions to be called depending on what kind of message code I receive. Then a small dispatcher method is created as the callback function, which just accepts the message and the code, looks up the appropriate function in the function dictionary, and calls it with the message I received. Here you can see it: I make a callback providing the argument "foo", and I want the store method to be called, which just appends the message it received to a list called store. When I print the list, it has one element. So that's a very nice feature; try it out if you haven't seen it.
I'm coming to the end, with a small discussion of this network approach. The good thing compared to the rpy and rpy2 approach: when you work together with people in your group or your team, and they are all doing calculations, and you want to make sure everybody runs the exact same R version and the exact same versions of all the R packages being used, having one single installation on the server is much easier to maintain than when every team member has to maintain an installation and ensure that all run the same versions. What you can also do, if you have really compute-intensive stuff, is set up a real R compute farm and have a load balancer which distributes CPU-intensive jobs to different R servers. On the con side, of course, you have to serialize all the data that you are sending back and forth. If you are using a huge amount of data, that can really be a bottleneck for you, so that's always a thing you have to balance yourself. Security aspects are the last thing: as I said before, the Rserve side nowadays allows credentials, so you can log in, but pyRserve doesn't have that, so at the moment it is best to use it only on a trusted network.
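The method dispatcher for out-of-band messages described earlier in the talk can be sketched in plain Python. This is a reconstruction under assumptions, not the slide code: the constant names, their values and the make_dispatcher helper are illustrative, and printing is replaced by appending to a list so that the effect can be inspected. Only the callback shape (message, code with a default of 0), the oobCallback attribute and self.oobSend on the R side come from the talk.

```python
# Message codes; the same constants would have to be defined on the R side.
C_PRINT, C_ECHO, C_STORE = 0, 1, 2  # illustrative names and values

store = []    # filled by the C_STORE handler
printed = []  # stands in for printing to the console


def make_dispatcher(handlers):
    """Return a callback(message, code) that routes by message code."""
    def dispatch(message, code=0):
        try:
            handler = handlers[code]
        except KeyError:
            raise ValueError("no handler for message code %r" % code)
        return handler(message)
    return dispatch


handlers = {
    C_PRINT: printed.append,
    C_ECHO: lambda message: message,
    C_STORE: store.append,
}
dispatch = make_dispatcher(handlers)

# With a live connection the callback would be registered as
#     conn.oobCallback = dispatch
# and triggered from R code with, for example, self.oobSend("foo", 2).
```

Keeping the handlers in a dictionary means new message codes can be added without touching the dispatcher itself.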
And that's it from me. Thank you for your attention. If you have any questions, please ask.
Audience: Thanks for the talk, very interesting approach. First question, to get it right: you have one session per connection, which keeps the state?
Speaker: Yes, there is one namespace, one session, per connection.
Audience: OK, so it is suitable for multiple users?
Speaker: Exactly.
Audience: Can you say anything more about the serialization of the data, about the protocol you are using?
Speaker: It's a binary format which was invented by Simon, as I said, and there is very elaborate documentation on his website. It basically works the same way as pickle: it recurses into your data and goes down the tree. For simple types it's simple, but nested dictionaries, lists and tuples can all be serialized as well. It basically has the same approach, just that this serialization protocol is not Python-specific but Rserve-specific, so all clients of Rserve can interact with it. Going into the technical details would just be too much for this talk, and everything can be looked up on the website.
Audience: Have you considered clients for other languages than Python? Since you are serializing data, it doesn't matter what you use as a client. Many other clients already exist, and that means we could exchange binary data between different languages and have a network of connections.
Speaker: OK, that's an interesting approach. I haven't thought about that. It could be useful as a general way to exchange binary data between different languages and different systems. That's an interesting idea.
Audience: OK, thank you very much.
Session chair: OK, are there any other questions? [inaudible] Thank you very much.

Metadata

Formal Metadata

Title: Combining the powerful worlds of Python and R
Title of Series: EuroPython 2014
Part Number: 54
Number of Parts: 120
Author: Heinkel, Ralph
License: CC Attribution 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI: 10.5446/20022
Publisher: EuroPython
Release Date: 2014
Language: English
Production Place: Berlin

Content Metadata

Subject Area: Information technology
Abstract: Ralph Heinkel - Combining the powerful worlds of Python and R. Although maybe not very well known in the Python community, there exists a powerful statistical open-source ecosystem called R. Mostly used in scientific contexts, it provides lots of functionality for doing statistical analysis, generation of various kinds of plots and graphs, and much, much more. The triplet R, Rserve, and pyRserve allows the building up of a network bridge from Python to R: now R functions can be called from Python as if they were implemented in Python, and even complete R scripts can be executed through this connection.
pyRserve is a small open source project originally developed to fulfill the needs of a German biotech company to do statistical analysis in a large Python-based Lab Information Management System (LIMS). In contrast to other R-related libraries like RPy, where Python and R run on the same host, pyRserve allows the distribution of complex operations and calculations over multiple R servers across the network. The aim of this talk is to show how easily Python can be connected to R, and to present a number of selected (simple) code examples which demonstrate the power of this setup.
Keywords: EuroPython Conference, EP 2014, EuroPython 2014
