
Python's Role in Big Data Analytics: Past, Present, and Future

Speech transcript
Good morning, everybody. I would like to announce our keynote speaker, Travis Oliphant, who is up front here. He is a good friend of mine, and he is also the author of NumPy, the scientific Python package that pretty much everybody uses who works with arrays in Python. He is the one who wrote it. He is also the CEO of Continuum Analytics, the company behind Anaconda, conda, and a lot of other open source packages, so they are supporting the Python community. He is also a founding member of NumFOCUS, the nonprofit behind PyData. Speaking of PyData: the conference is starting with tutorials, so if you are interested in Python and data, there will be a lot of talks, tutorials, and much more. You are encouraged to join; there are still tickets available, so please go and buy tickets. Now I would like to give the word to Travis, who is talking about the past, present, and future of this field.
So this seems to be a common pattern, the usual display problems. There we are: you see a very messy desktop. Excellent, maybe we will come back to this. I don't have full control over my system here. OK, all right. So now you can give me your full attention, right? That's good.
I'll try to keep your attention so you don't get too bored. A little about me, but not too much; mostly I want to talk about the technologies I have been involved with. Just to give you a little background, since I know not everybody here is familiar with NumPy or SciPy: I am a scientist by training. I started out in satellite scatterometry, which is used to measure wind speed over the ocean. That is really what got me into large-scale data analysis: processing the radar backscatter from the ocean surface, big satellite data sets that came on tape drives, using VMS machines. Those systems were really awesome: they had a floating-point format that was different from IEEE 754, which is something I had to learn about when I started. We made really nice pictures.
But when I did my PhD program at the Mayo Clinic I got into a different kind of wave. I was shaking people: we would put speakers on people and shake them, and when you shake people, waves propagate inside. Then with either MRI or ultrasound you can actually see those waves propagating, and from that you end up with a big inverse problem. That is the equation there, which I like to scare people with. It's not that hard an equation; it's a simple linear equation. My goal was to invert that, and to invert it I had to find the derivatives of high-dimensional data series. It was a very large cube of data, too big to fit in memory for MATLAB: MATLAB only had double, there wasn't a MATLAB float at the time. I really liked working at a high level. I could program in C, but when I was thinking about my data problems I didn't want to have to think about pointer arithmetic and figure out where my memory leaks were. So I really liked the high level, I searched around, I found Python, and the rest is history.
Basically I found Python and started to do a lot with Python. I did finish my PhD, although it was somewhat delayed.
This is just a little context for Python's origins. I was not a Python user in the early years, back in the 0.9.0 days; I started using Python in 1997, with version 1.4. The 1.0 series has actually been packaged for conda, so if you would like to try it you can make a new conda environment, install the 1.0 version of Python just for fun, and see how it worked in a test environment. So I started using it in '97, and I have a little gold highlight on the slide; keep that in mind, because 1994 is such an important date with respect to data analysis in Python.
I show this slide because this is really what got me going in writing extensions for Python. If I have done anything, it is write a whole bunch of extensions for Python, much to the regret of the PyPy crowd. I am probably their number one enemy, because there are so many C extensions that make their job harder in moving people off of the CPython platform. But I'm not the only one: a lot of people write C extensions to Python because it's so easy to and because it's so extensible. This slide is here because Mike Miller released a package called TableIO, and that's how I learned to program extensions: I grabbed his package and studied the source. Guido had also written an essay about reference counting, and I had to figure out reference counting for extensions.
I don't do that anymore, but the hard thing really is the reference counting. This opened my eyes to the power of open source: I could basically look at the code, understand it, and learn something. I learned a tremendous amount reading this code, and then I started experimenting with my own modules. In 1998 I wrote my first extension module for Python; it was called numpyio. It is sort of embedded in other parts of the stack these days, and there are better ways to do it, but that was my first module, in 1998, and I kind of got hooked. That's when my career as a scientist pivoted, that's the right word, into building tools for scientists.
So in 1998 and 1999 I started to get really addicted. I think there is a chemical compound for it, open source addiction; it's somewhat related to our addiction to Facebook. Back at that time I started releasing wrappers: in 1998 first FFTW, then other modules, and stats, which came with help from Gary Strangman when he put something out in 1998 and 1999.
I looked at the mailing list. It's nice to go back and look at the history of what you said in the past. You cringe a little at how stupid you were, but we all make mistakes. You can also see how motivated and excited you were about something new. I was very excited about Python in 1999: we could use Python to build a data analysis environment, we could do all of our calculations, all the things I loved about using a higher-level language like MATLAB; I could do that in Python. That year, when I went back and looked at the mailing list, I was releasing a package every month. It wasn't very pretty, and the web page for it was very ugly. I'm not a web designer: I can put up content but not really pretty pictures. Nobody else had anything better back then either, so it was a really spartan website. I would announce releases, and then Pearu Peterson came along and said, hey, it is really stupid of you to be hand-writing Fortran wrappers for ODEPACK and all the other libraries, and that led him to write a tool to do that. That is when I first learned the difference between me and a real computer scientist.
Me, I'll do this manually because I just want it done; a real computer scientist goes and automates it. You know, a real computer scientist will spend more time automating than it would have taken to do it manually. That was not the case here: f2py is a tremendous tool, and we worked together through the last part of 1999. That collection was called Multipack at the time. I put it on the web and people started downloading it. I got to know people from all over the world, like Pearu Peterson, the one who wrote f2py. He could be here, but he likes to water ski on lakes in Estonia. It was just a tremendous rush to coordinate and communicate with people all over the world, see them use your stuff, contribute back, and make it better. It was an amazing thing. It was my first taste of open source communities, and I have seen that grow and grow ever since.
Here is a glimpse of data analysis back in 2000. NumPy came from an array tradition that started in 1994, and in 2000 I could use it to publish the pictures in my thesis, using a Python interface to something called DISLIN. DISLIN was actually a pretty sophisticated tool, a little bit tedious to use, but I used it to publish all my pictures. There are about 200 different images in my thesis, all made with DISLIN and Python.
In 2001, that's when Eric Jones contacted me, and together with Pearu Peterson we pulled stuff together. We spent a whole bunch of time building for Windows and going through the pain of building a package, and we came out with what we called the SciPy library. It really was a SciPy distribution; we didn't realize it at the time, but that's really what it was: a collection of all these tools together with a single installer to get everything up and running quickly. A lot of people contributed to getting something out the door. There was a lot of work just in compiling everything. I was not a Windows programmer at the time; since then I have learned to make my peace with the Windows platform and found some interesting things about it.
But as I said before, NumPy inherited a long legacy of great minds who had gone and tried to build a numeric array object for Python. Jim Fulton, he's embarrassed when I show this, wrote a Python matrix object, and that was really part of the early discussions that caused Jim Hugunin, a graduate student at MIT, to get really excited and postpone his graduation in order to write Numeric, which became the foundation of it all. I could write SciPy because Numeric existed, so I'm really grateful for that. Then in 2001 some features were desired for Numeric. Perry Greenfield, Rick White, and Todd Miller at the Space Telescope Science Institute, the folks who operate Hubble and process the Hubble images, needed some changes, most particularly the ability to memory-map larger and more diverse data. So they wrote Numarray.
At that time it turned out that I did not have a class to teach. I usually had a class to teach, but nobody signed up for this one; only one person signed up. The students were smart, actually, not to sign up for the class. So I ended up without a class to teach, and I probably should have been publishing papers in order to keep my position as a professor.
At the time, some modules were being built on Numeric and some modules were being built on Numarray. I saw the split in the community as harmful: people were still struggling just to support one thing. I felt like I should do something about it. It was just a really strong feeling that somebody should do something about this, and I didn't know who was going to, because nobody else knew the code base very well, and I had time. So I said OK, threw caution to the wind, and started on what I thought would be about a three-month project and became an 18-month project, in order to put the first version of NumPy out there.
It took a while for the community to get excited about it and to get more contributors, but 2007 came and it really started to take off, and lots more people joined. There is still room for more people, especially at the low level: the number of people who understand the CPython C API and can help maintain the NumPy code base in C is shrinking, and that becomes a challenge. But NumPy and SciPy are both now very impressive community efforts, and I'm really grateful for that, because they would not be where they are without all the people who are contributing to make it possible.
I don't know exactly how many NumPy users there are. It is estimated to be 3 million, on the basis of hits to the web page and download numbers, but it's always hard to tell, because the software doesn't phone home and users don't send in a postcard. So you never really know how many users NumPy has.
But I want to pause a little bit and, maybe to motivate some of you who are building your own software packages and communities, talk about the things I've learned about what it takes to do this, because Python's role in big data analytics is going to take the work and effort of a lot of people, just as it has to date. One of the most important things is to recognize that it is hard work. Initially it's not that easy, and it's quite lonely, actually, to start off a new venture. Initially nobody really believes your idea except you, and that's really the way it should be: others will need some proof that this is actually going to work before they dive in. When I started NumPy and said I would do it on a Numeric base, a lot of people said, that's great, you go do that, that's fine. It was really not until they started seeing results, seeing that this was actually going to work, that they said OK, we'll dive in, and then they very gratefully did that as well. You have to dive in and do something yourself. In fact, the more complicated the thing you're doing is, the lonelier it's going to be, because fewer people will understand enough to be able to make trade-offs to help you. SciPy was another example: I started SciPy, and it took a while before I got more and more help. My wife is in the room, so maybe I should not say this, but I was a poor, starving grad student with three kids, making 18 thousand a year in Minnesota. We did great, but I probably should have been finishing my degree and getting a job. Pearu put tremendous work into f2py and SciPy; I was just flabbergasted by the work he put in. My favorite anecdote about Pearu is that when I sent out my Multipack, he basically submitted a Makefile. I still can't read it; I have no idea what it said, but it was amazing: it built everything, including I think the coffee in the morning. He put a tremendous amount of work in. It takes that kind of work to get things off the ground, it really does. It's not that you can just show up and hope that things happen; you really have to put in the work.
But the thing I think is important is that you do what's right. In other words, you put aside thoughts of making a lot of money or doing something really cool. Doing what's right means that you have your information, your knowledge of the environment, the things you know, and those things come together for you to feel that this is what is right. You're going to have that feeling, and you need to own it and do something that's right. Timing is everything. Sometimes you're the right person for the job; sometimes it's the right time for that to happen. It's hard to know when the right time is, but when the time is right, act with urgency. You can't wait: people are impatient, and if things don't move in a direction they wander off and do something else after a little bit of time.
You have to strive for excellence. It really takes work, and you need to give the best you have. It'll never be enough, but give your best anyway. There is a whole series of these kinds of aphorisms: the good you do today will often be forgotten; do good anyway. It's actually true, it's essential; it's one of those life qualities that I would love to grow into.
Another key to success: initially it's lonely and you're doing things on your own, but you have to build a community if you're going to have long-term success. You have to get others involved. That means interacting with people, and that means having your ideas put up and shot down. It basically means you need the help of other people, people who are different from you, and you have to give up some of your hard-won ideas. Someone will point out that you have failings, and you probably do in some places, and that's OK; listen to that. The only way to make progress in communities is for people to come together and have some empathy toward each other. You have to treat other people the right way, you have to care about them, and that does expose you, because you can get hurt. But you do it anyway, and then you keep moving forward. It's a great thing. There is much more to research on this topic.
I think the key is to build a community. There are a lot of lessons to be learned from all kinds of projects, and I'd love to talk to some of you about your lessons of how you build communities. Then patience: good things take time. It doesn't happen right away. NumPy took a long time to get real adoption: it came out in 2006, and it was 2009 before we saw a lot of adoption, years later. The right factors have to come together. For example, the NumPy and SciPy communities developed for a long time and the number of contributors crept along; then the ease with which you could contribute jumped, and all of a sudden we got more contributors.
All right, so that's a little bit of history of what I did in the past with NumPy and SciPy. So what is this NumPy, anyway? Raise your hands: who uses NumPy and knows what it is? OK, a fair number. Excellent. The Europeans are always smarter: when I ask that question in the US I don't quite get the same response. What NumPy is, essentially, is an array object, an N-dimensional array object, and fast operations on arrays. It's very simple, actually, at its core.
Here's a simple example: build up a two-dimensional array and do some computations on it. You can sum along different axes. Here again is a three-dimensional array. With an N-dimensional array you have a way of organizing data together so that it can be operated over quickly, and by quickly we mean you hand it over to a precompiled loop or precompiled engine that does the computation, so that it's not happening in Python.
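The slide itself is not in the transcript, so the following is only a minimal sketch of the kind of example described: a two-dimensional array summed along each axis, then a three-dimensional one. The shapes and values are assumptions chosen for illustration.

```python
import numpy as np

# A 2-D array with 3 rows and 4 columns (illustrative values 0..11).
a = np.arange(12).reshape(3, 4)

# Summing along axis 0 collapses the rows and leaves one total per column;
# summing along axis 1 collapses the columns and leaves one total per row.
col_sums = a.sum(axis=0)
row_sums = a.sum(axis=1)

# The same call works unchanged on a 3-D array: the named axis disappears.
b = np.arange(24).reshape(2, 3, 4)
plane_sum = b.sum(axis=0)        # shape (3, 4)
middle_only = b.sum(axis=(0, 2)) # shape (3,)

print(col_sums, row_sums, plane_sum.shape, middle_only.shape)
```

In each call the loop over elements runs inside NumPy's compiled code, which is the "precompiled loop" the talk refers to.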
When you work with NumPy arrays you can release the GIL, and once you release it you don't have the same problems that happen if you're not using NumPy. Here's a diagram of what it looks like as a Python object.
Each element of the array has to be exactly the same type; that's one restriction on arrays. Every element has the same number of bytes; that is the basic restriction. What those bytes mean can vary: it can be a pointer to a Python object, or it can be a structure with an int, a float, and 10 bytes of string that can be decoded. Unicode is always UTF-32 in NumPy. And then there are the array scalars, things that also creep up and bite you on occasion if you're trying to understand what comes out of a NumPy array.
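As a hedged illustration of that element model, here is a small sketch of a structured dtype with an int, a float, and a 10-byte string. The field names (`count`, `value`, `label`) are invented for this example and do not come from the talk.

```python
import numpy as np

# One element = a 4-byte int, an 8-byte float, and 10 bytes of string.
# Field names are hypothetical, chosen only for this sketch.
record = np.dtype([("count", "<i4"), ("value", "<f8"), ("label", "S10")])

table = np.zeros(3, dtype=record)
table["count"] = [1, 2, 3]
table["value"] = [0.5, 1.5, 2.5]
table["label"] = [b"alpha", b"beta", b"gamma"]

# Every element occupies the same number of bytes: 4 + 8 + 10.
print(record.itemsize, table.strides)

# Fixed-width unicode strings use 4 bytes per character (UTF-32).
print(np.dtype("U5").itemsize)

# Indexing returns an array scalar, not a built-in Python int.
first_count = table["count"][0]
print(type(first_count), isinstance(first_count, int))

# The byte-string field can be decoded explicitly.
print(table["label"][1].decode("ascii"))
```

The `isinstance` check shows one way array scalars can surprise: the value prints like an integer but is a NumPy type.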
It is fairly straightforward. Some years ago I wrote the Zen of NumPy. If you "import this" you get Tim Peters's aphorisms, which are nice. This is definitely not of the same caliber, but it gives a little bit of the flavor of how I think about NumPy: strided is better than scattered; contiguous is better than strided; descriptive is better than imperative; array-oriented is better than object-oriented; broadcasting is a great idea; vectorized is better than an explicit loop, unless it's too complicated, and then you can use Cython or Numba; and think in higher dimensions to solve your problems.
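Two of those lines, "broadcasting is a great idea" and "vectorized is better than an explicit loop", can be sketched as follows. The arrays and the arithmetic are arbitrary choices for illustration, not taken from the talk.

```python
import numpy as np

# A column of shape (3, 1) and a row of shape (1, 4).
col = np.arange(3).reshape(3, 1)
row = np.arange(4).reshape(1, 4)

# Broadcasting stretches both operands to a common (3, 4) shape
# without writing a loop and without copying the inputs first.
vectorized = col * 10 + row

# The explicit-loop equivalent, element by element in Python.
looped = np.empty((3, 4), dtype=vectorized.dtype)
for i in range(3):
    for j in range(4):
        looped[i, j] = i * 10 + j

# "Contiguous is better than strided": a freshly computed array is
# contiguous, while a view that skips every other column is only strided.
every_other = vectorized[:, ::2]
print(vectorized.flags["C_CONTIGUOUS"], every_other.flags["C_CONTIGUOUS"])
```

Both versions produce the same table; the broadcast form simply states the computation once for the whole array.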
A real quick example of array computing. It's a little bit of an easy example, but Fibonacci numbers are so common in Python that we like to show them.
First are the pure Python implementations, one bad one and one better one in terms of performance. Then you can compare the Python approach to Fibonacci with versions that are array-oriented, using NumPy.
Why do it that way? Well, of course, you reach for the closed-form solution: find the roots of the discrete difference equation and just compute the numbers directly. That's the vectorized computation here. I generate a vector of numbers n, then I calculate the roots r1 and r2, using the roots command to take the roots of the polynomial, raise those roots to the power n, and subtract. These are all array expressions; the for loops are happening under the covers. That's the concept of array computing: gather the data together and hand off computations on all of it at once.
And if you're really clever, you understand that the sequence is basically the output of an unstable filter, so I can use the linear filter tool inside SciPy to generate at least the first part of the Fibonacci sequence. Those who have done this know that it will also overflow if I use the floating point of the machine. I can do this fine with NumPy, and you can get better performance.
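The slide code is not preserved in the transcript, so this is a reconstruction sketch under stated assumptions: the function names `fib_loop` and `fib_vectorized` are invented here, and the vectorized version uses the closed form built from the roots of x² − x − 1, as the talk describes. The SciPy linear-filter variant mentioned above is not shown.

```python
import numpy as np


def fib_loop(count):
    """Return the first `count` Fibonacci numbers using a plain Python loop."""
    out = []
    a, b = 0, 1
    for _ in range(count):
        out.append(a)
        a, b = b, a + b
    return out


def fib_vectorized(count):
    """Return the first `count` Fibonacci numbers from the closed form.

    The recurrence F(n) = F(n-1) + F(n-2) has characteristic polynomial
    x**2 - x - 1; with roots r1 and r2, F(n) = (r1**n - r2**n) / (r1 - r2).
    The expression is symmetric in r1 and r2, so root order does not matter.
    """
    n = np.arange(count)
    r1, r2 = np.roots([1, -1, -1]).real
    values = (r1 ** n - r2 ** n) / (r1 - r2)
    # Floating point limits this to modest n; rounding removes tiny errors.
    return np.rint(values).astype(np.int64)


print(fib_loop(10))
print(fib_vectorized(10))
```

The closed form evaluates every index in one array expression, but because it runs in double precision it is only exact for the first several dozen terms, which matches the overflow and precision caveat in the talk.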
One of the benefits of array computing is that you typically get fast performance immediately. That's usually what people reach for, and why they reach for array computing: to get the performance they're looking for. There are other reasons to do it, however.
APL really was the father of array languages; it has been around since 1964. But it was cryptic: the hieroglyphics of APL are still being decoded, and we have not found a Rosetta Stone to understand what people actually said in all these wonderful APL codes. Just kidding; actually, some people can read it. There are other, more English-like versions of the same concept that brought in a lot of the same ideas, and NumPy is a descendant of APL.
Another simple idea of array-oriented thinking is to gather data together. A lot of object-oriented approaches end up scattering your memory all over the place, with objects full of attributes. If you gather that all together and make objects essentially rows in a table of attributes, that is what I call array processing: the data is all together, and modern processors can stream through it. Array computing is perfectly suited to matching the vector computers of today, multi-core and multi-CPU. So whenever you can do that, you get the added benefit of being able to take advantage of that hardware, which is otherwise not exposed very well in the languages of today.
So I've talked about those benefits.
I'll move on and skip this example, but briefly, just so you get a feel for the kind of code: someone once tweeted "here's a problem", and I thought, I think I can solve that, rather than spend time on company stuff. This is what I came up with, to basically find a circle in a roughly circle-like image.
The second thing: NumPy has had a story for data analytics for a long time because of structured arrays. As I said briefly before, every element of a NumPy array can be an arbitrary structure, with integers, floats, or whatever, and so I can think of a one-dimensional NumPy array as a table, like an Excel table. It's a really nice mapping. However, it's an array of structures, which is not always the optimal data structure when you're trying to, say, add new columns quickly or do computations down the columns. So even though it works, it is not as flexible as we'd like, and pandas has emerged.
Pandas uses a more generic structure where there are just pointers to arrays under the covers. It sits on top of NumPy, and it provides a lot more user-friendly tools for people doing data analysis. Whereas in the past, when using NumPy for data analysis, you might have to write 5 or 10 lines of code, with pandas it's one method call, and it's quite a bit simpler. So a lot of people are coming to the Python community because of pandas. This list basically comes from a user of pandas who wrote up a few of these reasons, and I modified it slightly.
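The layout trade-off described here can be sketched with NumPy alone. This is not pandas' implementation: the plain dictionary of arrays below is only a stand-in, assumed for illustration, for "pointers to arrays under the covers", and the field names are invented.

```python
import numpy as np

# Array of structures: each row keeps its fields side by side in memory.
rows = np.array([(1, 2.0), (2, 4.0), (3, 6.0)],
                dtype=[("key", "i8"), ("x", "f8")])

# Adding a column to an array of structures means defining a wider dtype
# and copying every existing field into a newly allocated array.
wider = np.empty(rows.shape, dtype=rows.dtype.descr + [("y", "f8")])
for name in rows.dtype.names:
    wider[name] = rows[name]
wider["y"] = rows["x"] ** 2

# Column-oriented stand-in: one independent, contiguous array per column.
columns = {"key": rows["key"].copy(), "x": rows["x"].copy()}
columns["y"] = columns["x"] ** 2  # a new column is just a new entry

# A field of the structured array is strided (it skips over the other
# fields), while a standalone column is packed tightly.
print(rows["x"].strides, columns["x"].strides)
```

The stride difference is the reason column-wise computations and column additions tend to be cheaper in the column-oriented layout.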
So currently, today, in big data analytics with Python, these are the basic key libraries. I might be missing a few here, but these are sort of the basic ones: NumPy, SciPy, pandas, matplotlib, IPython. The list goes on. It becomes quite a task: when you're sitting there and want to use Python for analysis, you have to get a bunch of stuff together to make that happen. That's really why we created Anaconda and conda.
You may be interested to note that whereas a lot of people have been using R for data science, Python is growing rapidly and creeping up to an equal footing with R for data analytics. This is a recent survey done by O'Reilly of the people who attended Strata in 2012 and 2013. It was a revelation to O'Reilly as well, I think, because they have been really searching for people to write books on this type of analysis. I think Wes McKinney's book was also successful, and that really opened the floodgates: hey, there is a market here, we should get books. I have had no time to write a book so far.
We also see articles like this one that set the languages against each other. I actually think we can work together with the R community, but it just goes to show that it becomes kind of a choice. You can do pretty much everything you need to do in Python; occasionally you call R, and you can do that from Python.
Python is also the top teaching language in schools. You may have seen this. At a Python conference we should celebrate the fact that Python is being used in a lot of places: at top US universities, Python is the language of the introductory class being taught. Some will say that's where languages go to die, so maybe it's not such good news, and we will have a whole lot of people using Python who don't know what they're doing. But I trust that our community is both vibrant enough and robust enough to welcome them and then train them up, help them unlearn the things they might have learned wrong in school, and help move this new community forward.
I'm running a little short on time. I had plenty of slides and plenty of things to talk about, but I wanted to talk a little bit about why I think Python is fantastic for technical computing. As I said, I was a domain expert, a scientist, coming into Python, and I had reasons for it. Some of those are the same as they are today. One is the syntax: it gets out of your way. I don't have to learn a lot of jargon and concepts; it basically leverages my English-language centers. Perhaps another language will leverage Mandarin language centers in the future; I won't benefit from it, but others will. This one leverages Latin characters.
And there's white space. I love the white space, the fact that it conveys intention. I'll tell you why: my field of view is limited. I have limited horizontal and vertical real estate; I can only see and understand something in a limited space. So if I'm using it up on braces and brackets and things that are unnecessary, it's just waste for me. It's also why, if I have long dotted paths through things and my variable name takes up the whole screen, I'm in trouble.
I'm not a big fan of that. Complex numbers were built in early, and overloading was built in early; that was just made for scientists. Scientists need complex numbers. The FFT is the reason why a lot of engineers have jobs, and if you have an FFT you've got to have a complex number, or else you'll have 200 of them and nobody will agree on what it should be. There is also language support for arrays: the brackets, and the ability to have commas in an index build tuples, so you don't have to have funny indexing. These things were added at a critical time. Jim Hugunin worked with Guido and others to make sure they got into the language at an early time, and it's been fantastic. The occasional programmer can understand it. Occasional programmers are people like I was: I had to solve a high-dimensional differential equation and didn't want to spend time chasing pointers. Haskell or Clojure is too much to put in my head and remember, so Python fits my brain. I used to say that packaging was a problem with Python, and I don't say that anymore, because with Anaconda and conda that problem goes away. It's fantastic; we get that feedback from users all the time. I don't say that just because we put it out; I say it because of the feedback from people: "I use it and I love it; it solved exactly the problem I've had with not being able to get everything installed easily and quickly." So there are a lot of great things about Python: it is simple and extensible,
has multiple implementations, is general purpose, and allows flexible programming styles. All these things you know. But a critical one is that it does have a critical mass, because you could have the ideal language, but if no other people are using it you'd be stuck; you could not build a community. That's the hard thing, and it's a bit of a chaotic question when that will happen.
I can't give you answers; it's sort of one of those emergent phenomena.
There are things I don't like about Python, and we could all probably write a list together, but some of these are being addressed. I would love to see anonymous blocks: the ability to have anonymous chunks of code that you could then send around to places. Deferred evaluation is the most common use case for me. I would love to allow slice syntax outside of brackets. Please let us; the slice operator is used a lot in array-oriented programming. Then there is the CPython runtime, with the GIL and global variables inside the interpreter. There's some work to be done there, and that's a really hard one. I would also love to see some language extension ability beyond the import hook. There are a lot of very creative uses of import hooks to have kinds of DSLs imported in Python, and that's kind of cool, actually. It can be hard being a general-purpose language,
because the developers of that language don't always understand the use cases of scientists, and I'll have a story about that in a little bit, about PEP 3118.
So NumPy: I like it, it's good, it does a lot of things, but there are some problems with it. The dtype system, the data type system that allows structured arrays, is too limited and difficult to extend. It really grew out of Numeric's data structure; that was the beginning, and it was extended just far enough that it now needs an overhaul. Immediate mode creates huge temporaries all the time: if you have an equation to evaluate, it's almost always faster to use a tool such as numexpr for that purpose. We can't lazily evaluate operations as we would like.
There are a lot of unoptimized parts, some embarrassingly unoptimized parts that I actually wrote. Hopefully the blame doesn't point at me, because someone has changed the comments or something. The code base is organic and hard to extend.
reflect back on the history, one of the most important pieces of work I did was in 2005 and 2006, when I sat down with Guido. I remember flying out to meet him [unclear]. I went there with Paul [unclear]; we had lunch and talked about what we would do to get NumPy, or Numeric as we were calling it at the time, into Python. He cautioned us: well, you know, if you get into Python, the release cycle means you can't update it very quickly; there are some downsides too.
The outcome was: let's not do that. What we really want is to get the structure of an array into Python. So I spent time writing this PEP, PEP 3118, which is really the extended buffer protocol. How many of you know what PEP 3118 is, or the buffer protocol? I should see a few hands, because this is one of those sort of underbelly capabilities of Python. At the lowest level, it is the ability for arbitrary objects to share data; that's what the buffer protocol is. Initially it only allowed you to share a single pointer to data, with no metadata about the data being shared. That was really it; it had no kind of data type in it. So the extended protocol was really all about getting more metadata around the pointer to memory that was being shared. It makes possible a world of NumPy-like objects, so it isn't really necessary for NumPy to be the only one. There really could be a lot of NumPy-like objects that share memory, operate independently, and coexist. I think adding multiple dispatch to the language, or at least a multiple-dispatch library, would actually help make that happen; at this point dispatch is mainly single dispatch [unclear]. So that's what I think of as the future of NumPy: a world with a lot of NumPy-like things working together. The buffer protocol exposes this idea. I don't have time to talk about it today; someday I may talk about this.
I'm really not always qualified to do this. I'm a scientist by training, always kind of approaching computer science by effort and trade and by learning from other people. But there's something real about this dual of encapsulation: the buffer protocol exposes these data types. It sort of inverts the idea of having data hidden and having methods attached to the data. The data is exposed, and you talk about it in terms of its schema [unclear]. I'll talk a little bit about that.
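The extra metadata described above (element format, shape, strides) can be seen from pure Python through `memoryview`, which is the standard-library consumer of the PEP 3118 buffer protocol. The following is only a minimal sketch, not code from the talk; the variable names are made up for illustration, and it assumes a platform where a C double is 8 bytes.

```python
import array

# A typed buffer of four C doubles; array.array exports the buffer protocol.
values = array.array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview consumes the protocol and surfaces the metadata that travels
# with the pointer: element format, item size, shape, and strides.
view = memoryview(values)
print(view.format, view.itemsize, view.shape, view.strides)

# Reinterpret the very same memory as a 2x2 array. No data is copied:
# casting to bytes and back only changes the metadata.
grid = view.cast("B").cast("d", shape=[2, 2])
print(grid.shape, grid.strides)

# A write through one view is visible through every object sharing the memory.
view[3] = 40.0
print(values[3], grid.tolist())
```

Nothing here depends on NumPy, which is the point of the passage: any object that exports or consumes the protocol can take part in the sharing.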
So, what of the future? What is Python's role in the future? We talked about today, and we talked about the past, which was what happened. The future: well, I [unclear], so I think I know the future, and I'm going to tell you what's going to happen. Really, this is what I would like to see, along with the principles that I think will guide the future, whatever may
happen. One of the things that is very real, and is based in physics, is the idea that data has mass. Data is growing faster than the speed of light can carry that data from one point to another, and that means you are going to leave your data set where it is. That's true whether it's in
a GPU, whether it's in a memory cache, or whether it's on a cluster somewhere: you don't want to be moving data. That means a lot of our systems that were built around the idea of encapsulation and serialization are actually wrong. They don't work very well when we do that, and so we have to come to think differently about how we manage this.
There's a blog about this as well; it's a well-known observation, data gravity. Someone has even invented, to talk
about it, a formula for how data and applications attract each other. I don't know;
maybe that's useful. But I think it is useful to think differently, in a relativistic sense.
Normally we think of ourselves as standing on the platform while all the data moves past us, because we work with objects, and they can be serialized and deserialized along the way. But when data has mass, that is
really expensive computationally: the machines are doing conversions all along the
pipeline. So how do we invert our thinking about this and develop a data-centric perspective, where the code comes to the data and runs close to the data? That's one thing I think about, and PEP 3118, the buffer protocol, has some of the answers, I think. Some of you, I'm sure, think even more deeply and better about that; I'd love to get your feedback and your inspiration. I
think fundamentally the future of big data in Python is really heterogeneous. Before, we had this notion of "here's NumPy, use NumPy" as the single common structure, when NumPy really is just a description of a protocol. There was a talk on Tuesday by Pieter Hintjens, the ZeroMQ author, in which he talked about a decentralized world in the future, where contracts are the most important thing. PEP 3118 is an example of that kind of protocol or contract between objects, and
it is a beautiful thing and an important thing. I think that's what the future looks like as well: much more of that, rather than "here's a single library that everything revolves around". So then, what
is Python's role? Doing what it has always done really well. It has played this tremendous glue role, with this tremendous ability to stick things together quickly. That's the advantage of not having static types: you can pull things together from all sorts of places in an ad hoc, iterative fashion, and so quickly find solutions and move forward. So
at Continuum, what we're doing is basically asking, after the experience of NumPy and
SciPy and watching them deployed: what would I do differently? Some of that is expressed in what I described before about data having mass. The work that encompasses that reality is really three projects: conda, Blaze, and Numba. SciPy really was a distribution masquerading as a library, and so, to distribute other packages, we had to come up with a cross-platform, Python-independent package manager, called conda. Blaze is the way forward [unclear], and all of these are open source. Numba is about making sure
your code is as fast as possible. Just a little blurb on conda: someone I met yesterday after my talk said, "You've got to talk about conda; keep talking about conda, we love it, it's awesome." So try conda if you're not using it. You can download Miniconda [unclear]; you don't have to get an account, you can just pip install conda if you want, and you can do conda-based management of packages without even
needing anything from our side. It's completely open source and free. We communicate with the Python Packaging Authority, with Nick and others, to try to understand how to integrate this even better. Blaze and Numba
are our two open source projects, and they have a lot of dependencies. Numba especially has the LLVM dependency, one that is really hard to get installed. So the
idea around this: Blaze is motivated by generalizing NumPy and pandas [unclear] to all languages and data sets, really letting Python glue things together in a marvelous way. What you have currently, when you do data processing, is this: whether you store your data in SQL, in HDFS, or in Postgres, that choice defines how you query it, and your query becomes some kind of convoluted version of the query language they created for
you. That's how you have to work with it. A lot of us in data analysis love to use NumPy or pandas expressions, because they fit our brain and that's how we want to think about the problem. Currently, the way to use those expressions is to pull the data to us. Blaze is about deferring that: creating expressions that are then moved to the data, in multiple ways. Numba is motivated by the desire not to have people write extension modules anymore: you write high-level code that is as fast, or can be as fast, as Fortran, so that array-oriented computing can be done at full speed on modern processors with very little effort. That's the goal. I'm not really going to cover all the slides here; I'll post them online and you can see
them. Blaze: its goal is to deal with that pain.
Its architecture is divided up: the API, with deferred expressions at the heart, plus data adapters and compute interpreters, basically a compute interpreter for each of the different back ends. That's the architecture, and it's a flexible one, so you can easily add new pieces: new data adapters and new compute back ends.
The data descriptors are a data-format approach that lets you have a uniform array-like interface to whatever data there is: directories of CSV files, a SQL database, HDF5 files, JSON, and directories of JSON files.
Then the compute layer is a uniform interface to DyND, which is a kind of next-generation NumPy, a C++ library with Python bindings, to pandas, and even to plain Python. You can actually run a computation on just Python lists of lists, or a list of tuples, just to see that everything is working. For Spark and PyTables you have support as well. Spark is a member of the Hadoop family; it allows you to run in memory across machines. There are still things I don't like about Hadoop, but Spark and Impala I'm finally warming up to; they kind of save the Hadoop ecosystem, from my perspective.
Blaze expressions are deferred evaluation. You basically create an expression that builds up a DAG, a directed acyclic graph, that describes the computation. The leaves of that graph can then be bound to various data adapters, and that gets sent to a compute interpreter. We've separated out compute from data from code, so you can reuse those components independently and bring them together for the actual computation. There's a simple example coming.
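The idea of a deferred expression graph that is kept separate from its data can be sketched in a few lines of plain Python. This is not Blaze's API; the classes `Expr`, `Symbol`, `Const`, `BinOp` and the function `compute` are hypothetical names chosen for illustration, and the "interpreter" is a simple recursive walk.

```python
import operator


class Expr:
    """Node in a deferred expression graph; building it computes nothing."""

    def __add__(self, other):
        return BinOp(operator.add, self, wrap(other))

    def __mul__(self, other):
        return BinOp(operator.mul, self, wrap(other))


class Symbol(Expr):
    """Leaf that names an input without holding any data."""

    def __init__(self, name):
        self.name = name


class Const(Expr):
    """Leaf that holds a literal value."""

    def __init__(self, value):
        self.value = value


class BinOp(Expr):
    """Interior node: a binary operator applied to two sub-expressions."""

    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right


def wrap(value):
    """Turn plain Python values into Const leaves; pass expressions through."""
    return value if isinstance(value, Expr) else Const(value)


def compute(expr, data):
    """Walk the graph, looking each Symbol up by name in the `data` mapping."""
    if isinstance(expr, Symbol):
        return data[expr.name]
    if isinstance(expr, Const):
        return expr.value
    return expr.op(compute(expr.left, data), compute(expr.right, data))


x, y = Symbol("x"), Symbol("y")
expr = (x + y) * 2  # only builds the graph; no arithmetic happens here

# The same expression is evaluated against two different bindings.
result_a = compute(expr, {"x": 1, "y": 2})
result_b = compute(expr, {"x": 1.5, "y": 0.5})
print(result_a, result_b)
```

A real system would dispatch each node to a back end that runs next to the data instead of evaluating it in the local process; the sketch only shows that the expression, the data binding, and the evaluation are three separate things.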
At the heart is what was missing from PEP 3118, which is a really good data description language. I remember those discussions back then: we argued about whether it should be NumPy dtypes or ctypes specifications for the data, and in the end we said it would just be a string in struct-module syntax. So that's the data declaration language in the buffer protocol, and it's not quite good enough. With datashape we spent a lot of time trying to figure out how to encompass all kinds of data, so we'd love your feedback on it. It's a separate project; it's independent, it can be downloaded and installed independently, and it is basically a small language that the various back ends can interpret.
So you construct a table symbol, which in this case is a simple two-column table, with a name and a node ID, and the columns have different types [unclear]. Then you have these two table objects, and here's the expression: joining these two tables together, as a deferred evaluation, and doing a group-by and a count. The load-data step is what differs: how you load the data is different for these different back ends and for wherever the data is. But when I compute, I just pass compute a dictionary that maps the symbols to the actual data; the variables in the expression are looked up in that dictionary, and the computation happens when you bring the data and the code together in a compute context. The load-data step will be different depending on whether the data is in Spark, in HDFS, or maybe in pandas locally; you write that step to load your data. The expression is completely separate, and you can have a very complex expression that looks like pandas or NumPy, wherever your data is stored. No longer do you have to write code differently because your data is stored differently. Our goal is to end data silos and let your data be stored where it optimally fits, where you get the most performance, without having to change your code so much in order to use the best-performing data store.
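The pattern described here, one expression over table symbols with the data bound only at compute time, can be imitated with the standard library. The sketch below is an assumption-laden illustration, not Blaze code: `TableSymbol`, `JoinCount`, `compute`, `load_csv`, and the `users` and `events` tables with their columns are all invented for the example, and the two "back ends" are just Python lists and CSV text.

```python
import csv
import io
from collections import Counter


class TableSymbol:
    """Named placeholder for a table; it records columns but holds no data."""

    def __init__(self, name, columns):
        self.name, self.columns = name, columns


class JoinCount:
    """Deferred 'join left and right on `on`, then count rows per `by`'."""

    def __init__(self, left, right, on, by):
        self.left, self.right, self.on, self.by = left, right, on, by


def compute(expr, data):
    """Bind each symbol to rows (lists of dicts) and evaluate the expression."""
    left_rows, right_rows = data[expr.left], data[expr.right]
    counts = Counter()
    for lrow in left_rows:
        for rrow in right_rows:
            if lrow[expr.on] == rrow[expr.on]:
                joined = {**lrow, **rrow}
                counts[joined[expr.by]] += 1
    return dict(counts)


def load_csv(text):
    """Hypothetical loader: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


# The expression is written once, against symbols only.
users = TableSymbol("users", ["name", "node_id"])
events = TableSymbol("events", ["node_id", "kind"])
expr = JoinCount(users, events, on="node_id", by="name")

# Loading step, variant 1: rows that already live in memory.
in_memory = {
    users: [{"name": "alice", "node_id": "1"}, {"name": "bob", "node_id": "2"}],
    events: [
        {"node_id": "1", "kind": "read"},
        {"node_id": "1", "kind": "write"},
        {"node_id": "2", "kind": "read"},
    ],
}

# Loading step, variant 2: the same rows stored as CSV text.
from_csv = {
    users: load_csv("name,node_id\nalice,1\nbob,2\n"),
    events: load_csv("node_id,kind\n1,read\n1,write\n2,read\n"),
}

result_memory = compute(expr, in_memory)
result_csv = compute(expr, from_csv)
print(result_memory, result_csv)
```

Only the loading step differs between the two variants; the expression and the call to `compute` stay the same, which is the separation the passage argues for.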
There is an ecosystem around that. We did a lot of experimentation, exploring the space; currently DyND and datashape are the key pieces, along with the Blaze library itself, which has its different components. Datashape is the general data description language that I think was missing from PEP 3118, and I'm excited about it [unclear]. You can use it now: we use it in Blaze, and DyND uses it as its description language. DyND is a
Python wrapper around a C++ equivalent of NumPy. The nice thing about that is you can bind it to Ruby, bind it to JavaScript, bind it wherever you like [unclear], and have that multidimensional array concept across the board. It can also help with the gluing again [unclear].
All right, I'm out of time, so I'm not going to talk about Numba. I've talked about Numba quite a bit at other venues. I love it. It's still
growing, it's still pre-1.0, and we still need help. We're looking for
people to help us with it. But I will
just show you that Python code works in Numba. With Numba today you can actually target the GPU, if you have a GPU, very easily; basically, Python code goes in. With
Numba we're working on making interfaces that are more Blaze-like, to make it much easier and less GPU-specific. All right, so Python has a
long and fruitful history, and it will have a long, bright future with your help. Join the community and help make the world a better place.
I dedicate my talk to Amy Oliphant, my wife; she's here, she made it over. Thank you for all you've done. Nothing I've ever done would be possible without you. Thank you very much.
We have little time, but I think there is time for two questions. Please take the microphone. Thank
you for the talk. I have a question about PyPy, since you mentioned PyPy. They're trying really hard to reimplement NumPy, but do they contribute back? You mentioned that some parts of NumPy are unoptimized, as you said, embarrassingly so. Do they contribute back, so that the optimized parts can be used in NumPy?
It's really hard, because the stack is different, right? The code they write is quite different from what you would write in NumPy. I am really excited about what's going on now, though: the array object we're writing in Numba has a lot in common with PyPy's, so I finally see a way to collaborate with them. I'm really excited about that, because I always love to collaborate; it's just challenging.
Good, thank you. Yes?
Thanks for the awesome talk and the insights about the history of NumPy and the ecosystem all around it. My question is about packaging. I looked at the Python Package Index, and NumPy still doesn't provide wheels, that is, wheel packages for Windows. That would be great, because on a Windows system it's always hard, and compiling is not as convenient as on Linux. Are there any plans to provide precompiled packages for Windows?
I think I've heard plans like that, yes; I think people are talking about it. From my perspective, "conda install numpy" solves the problem, so I'm less motivated myself to worry about that, but I think there are some people who are trying to produce wheels.
Great, thank you.

Metadata

Formal Metadata

Title: Python's Role in Big Data Analytics: Past, Present, and Future
Series title: EuroPython 2014
Part: 31
Number of parts: 120
Author: Oliphant, Travis
License: CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
DOI: 10.5446/20043
Publisher: EuroPython
Year of publication: 2014
Language: English
Production place: Berlin

Content Metadata

Subject area: Computer Science
Abstract: Travis Oliphant - Python's Role in Big Data Analytics: Past, Present, and Future. Python has had a long history in Scientific Computing, which means it has had the fundamental building blocks necessary for doing Data Analysis for many years. As a result, Python has long played a role in scientific problems with the largest data sets. Lately, it has also grown in traction as a tool for doing rapid Data Analysis. As a result, Python is the center of an emerging trend that is unifying traditional High Performance Computing with "Big Data" applications. In this talk I will discuss the features of Python and its popular libraries that have promoted its use in data analytics. I will also discuss the features that are still missing to enable Python to remain competitive and useful for data scientists and other domain experts. Finally, I will describe open source projects that are currently occupying my attention which can assist in keeping Python relevant and even essential in Data Analytics for many years to come.
Keywords: EuroPython Conference
EP 2014
EuroPython 2014
