Scalable Realtime Architectures in Python


Formal Metadata

Scalable Realtime Architectures in Python
Title of Series
Part Number
Number of Parts
Baker, Jim
CC Attribution 3.0 Unported:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal purpose as long as the work is attributed to the author in the manner specified by the author or licensor.
Release Date
Production Place

Content Metadata

Subject Area
Jim Baker - Scalable Realtime Architectures in Python

This talk will focus on how you can readily implement highly scalable and fault tolerant realtime architectures, such as dashboards, using Python and tools like Storm, Kafka, and ZooKeeper. We will focus on two related aspects: composing reliable systems using at-least-once and idempotence semantics, and how to partition for locality.

-----

Increasingly we are interested in implementing highly scalable and fault tolerant realtime architectures such as the following:

* Realtime aggregation. This is the realtime analogue of working with batched map-reduce in systems like Hadoop.
* Realtime dashboards. Continuously updated views on all your customers, systems, and the like, without breaking a sweat.
* Realtime decision making. Given a set of input streams, policy on what you would like to do, and models learned by machine learning, optimize a business process. One example is autoscaling a set of servers.

Obvious tooling for such implementations includes Storm (for event processing), Kafka (for queueing), and ZooKeeper (for tracking and configuration). Such components, written respectively in Clojure (Storm), Scala (Kafka), and Java (ZooKeeper), provide the desired scalability and reliability. But what may not be so obvious at first glance is that we can work with other languages, including Python, for the application level of such architectures. (If so inclined, you can also try reimplementing such components in Python, but why not use something that's been proven to be robust?) In fact Python is likely a better language for the app level, given that it is concise, high level, dynamically typed, and has great libraries. Not to mention fun to write code in! This is especially true when we consider the types of tasks we need to write: they are very much like the data transformations and analyses we would have written for, say, a standard Unix pipeline.
And no one is going to argue that writing such a filter in, say, Java is fun, concise, or even considerably faster in running time. So let's look at how you might solve larger problems. Given that it was straightforward to solve a small problem, we might approach it as follows. Simply divide larger problems into small ones; for example, perhaps work with one customer at a time. And if failure is an ever-present reality, then simply ensure your code retries, just as you might have re-run your pipeline against some input files. Unfortunately, both require distributed coordination at scale. And distributed coordination is challenging, especially for real systems, which will break at scale. Just putting a box in your architecture labeled **"ZooKeeper"** doesn't magically solve things, even if ZooKeeper can be a very helpful part of an actual solution.

Enter the Storm framework. While Storm certainly doesn't solve all problems in this space, it can support many different types of realtime architectures and works well with Python. In particular, Storm solves two key problems for you.

**Partitioning**. Storm lets you partition streams, so you can break down the size of your problem. If a node running your code fails, Storm will restart it. Storm also ensures such topology invariants as the number of nodes (spouts and bolts in Storm's lingo) that are running, making it very easy to recover from such failures. This is where the cleverness really begins. What can you do if you can ensure that **all the data** you need for a given continuously updated computation - what is the state of this customer's account? - can be put in **exactly one place**, then flow the supporting data through it over time? We will look at how you can readily use such locality in your own Python code.

**Retries**. Storm tracks success and failure of events being processed efficiently through a batching scheme and other cleverness. Your code can then choose to retry as necessary.
Although Storm also supports exactly-once event processing semantics, we will focus on the simpler model of at-least-once semantics. This means your code must tolerate retries - in a word, it must be idempotent. But this is straightforward. We have often written code like the following:

```python
seen = set()
for record in stream:
    k = uniquifier(record)
    if k not in seen:
        seen.add(k)
        process(record)
```
EuroPython Conference
EP 2014
EuroPython 2014
Thanks, everyone. Let's get started; this is still just the introductory material. Scalable real-time architectures in Python: what do I mean by that?
I want you to walk away from this session with a couple of key ideas, specifically around partitioning and fault tolerance, and how we can achieve them in building such scalable real-time architectures. The focus will be on Storm, though the ideas are applicable to other tools as well, such as Spark Streaming, or tools that you might write yourself. The reason we want to be doing this is that of course we are given more data, but we also want to be more responsive to it; I will show more of that at the end. There was also an enlightening talk yesterday that touched on related ideas.
If you are interested in this area, this is a great book that covers the right set of issues. I have had a chance to work in distributed computing for a while, and that usually means failures at scale. I occasionally teach the principles of programming languages course at the University of Colorado. I also work with users of Storm, and we are going to be moving this toward real-time streaming in the near future. So what real-time architectures would you like to build? I can think of a few, and I will just mention a few since we are time constrained.

Maybe real-time aggregation: this is the classic thing you might otherwise do in batch, but here you are streaming, so you are not looking at the data on a batch-by-batch basis; instead, as events come in, you are updating the counts or other aggregates. Perhaps you are building a dashboard, which is an extension of real-time aggregation. I am particularly interested in the idea of real-time decision making, where you respond to information in the environment and take some action.
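To make the aggregation contrast concrete, here is a minimal sketch (not code from the talk; the event shape and names are invented) of batch-style counting versus counts that are updated continuously as events arrive:

```python
from collections import Counter

def aggregate(events):
    """Batch style: one pass over a complete data set."""
    return Counter(e["customer"] for e in events)

class RealtimeAggregator:
    """Streaming style: counts are updated as each event arrives,
    so a dashboard can read current totals at any time."""
    def __init__(self):
        self.counts = Counter()

    def on_event(self, event):
        self.counts[event["customer"]] += 1

agg = RealtimeAggregator()
for e in [{"customer": "a"}, {"customer": "b"}, {"customer": "a"}]:
    agg.on_event(e)
print(agg.counts["a"])  # 2
```

The streaming version never needs the whole data set in hand, which is exactly what makes it suitable for a continuously updated dashboard.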
What are the common real-time characteristics of such systems? You are consuming streams, you are event oriented, and as an event occurs you may take some action or send something downstream. You want to minimize the latency from the arrival of an event to that potential computation, ideally getting down to seconds or below. Such systems are often called complex event processing, or you could call it stream processing; the name doesn't matter, the concepts are the same. The one thing we are not doing here is building a hard real-time system. One last thing: you might have written one of these systems yourself in the past. You might have written something like this - does it look familiar? It should, because it is about as generic as you get: a Unix pipeline. What I wanted to show here is that you have composition, being able to build a pipeline of reusable filters. Another nice property is that you can re-run this pipeline if any intermediate step fails. But there are some problems, and I obviously glossed over some of the details.
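The composition idea being described can be mimicked in Python with generator filters. The `grep` and `cut` analogues here are hypothetical stand-ins for Unix tools, just to show the pipeline shape:

```python
import re

def grep(pattern, lines):
    """Analogue of `grep`: pass through only matching lines."""
    rx = re.compile(pattern)
    return (line for line in lines if rx.search(line))

def cut(field, lines, sep=" "):
    """Analogue of `cut`: project out one field per line."""
    return (line.split(sep)[field] for line in lines)

log = [
    "2014-07-23 ERROR disk full",
    "2014-07-23 INFO all good",
    "2014-07-23 ERROR net down",
]
# Compose the filters exactly as you would pipe commands together.
errors = list(cut(2, grep("ERROR", log)))
print(errors)  # ['disk', 'net']
```

Each stage is reusable on its own, and re-running the whole pipeline after a failure is just calling it again on the same input.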
There are also these aspects of how you would implement things such as joins and windows, but most importantly, how would you scale this up? I think we can all imagine how that could be done: you might describe it in some way so that one set of files is processed by this machine and other files by some other machine, but then you have to actually manage that partitioning yourself. In an earlier project I was using a framework for complex event processing which has very much the same properties as this homegrown code, and here is the problem: you need to ensure all relevant events about a given customer are in one place. If I am going to be able to know something about a given customer, I need to bring all the relevant information together in one place, and for that to happen we have to have some form of lookup. In this example we are implementing global alarms: we want to allow customers some degree of customization, so we do not want to hard code it too much, and then run some computation. The classic complex event processing systems can handle simple sharding quite readily - say, dividing up all the customers - but that does not necessarily get us as far as we would like, because a customer's events might end up spread across multiple machines. So the observation is this: small problems are easy. How do you make a large problem small?
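Here is a rough sketch of the locality requirement just described: every event about a customer is evaluated in one place against a per-customer alarm rule. The threshold rule and all names are invented for illustration:

```python
from collections import defaultdict

class AlarmEvaluator:
    """Keep everything we know about each customer in one place,
    so a per-customer rule can be evaluated on every new event."""
    def __init__(self, threshold):
        self.threshold = threshold        # hypothetical customizable rule
        self.errors = defaultdict(int)
        self.alarms = set()

    def on_event(self, customer, level):
        if level == "error":
            self.errors[customer] += 1
            if self.errors[customer] >= self.threshold:
                self.alarms.add(customer)

ev = AlarmEvaluator(threshold=2)
events = [("acme", "error"), ("acme", "ok"),
          ("acme", "error"), ("initech", "error")]
for cust, lvl in events:
    ev.on_event(cust, lvl)
print(sorted(ev.alarms))  # ['acme']
```

This only works if all of "acme"'s events actually reach this one evaluator, which is exactly the partitioning problem the rest of the talk addresses.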
To make a large problem smaller, you divide and conquer, and to divide and conquer you need horizontal scaling. We are no longer building systems such that they always require just a larger machine; instead we build them so that they can scale out to a cluster of such machines. But what do we know? The more machines we add to the mix, the more likely we are to see a failure, especially since we like to use commodity systems. And once we have a failure, we have to coordinate. I have seen - and maybe you have seen in your own environments - that sometimes people will propose ZooKeeper whenever coordination issues come up, and that just does not make the problem go away. Even with ZooKeeper in place, you still need to consider how you manage failure. Yes, ZooKeeper can give you distributed locking, and that is fantastic as far as it goes, but simply assuming that you have distributed locking in your environment does not tell you how to recover from a failure and then release those distributed locks. The solution we want is one where it is OK for a given node in the cluster to fail; rebuilding the whole cluster defeats the purpose of running a cluster. So let's look at Storm. It has some terminology, which I will introduce as we go.
There is this idea of event sources, called spouts, and event processing nodes, called bolts, and a topology that ties them together into a directed acyclic graph. Storm has strong support for partitioning and fault tolerance, those key elements we started with when thinking about how to divide and conquer a problem. Storm is written in Clojure, but it exposes a Java API - hence my use of the Java icon here - although of course you could use some IPC mechanism to talk to it from, say, CPython. There was actually a talk on Tuesday where a team was looking at Storm as a system to support their users, and you can use it as a resource for your own organization. Storm is part of the Apache Incubator, and there are some other Apache projects competing in this space, probably most notably Spark Streaming. Spark Streaming actually looks great in terms of its support - especially with its Python integration, I think it is the top contender - and it has some nice properties, particularly around being functional, that you do not see in Storm. As for the others, I do not think they are nearly as competitive, although there is some interesting work in Europe as well that you could potentially look at. So: Storm gives you partitioned streams, so you can break down the size of the problem, returning to that idea of divide and conquer.
If a node fails, Storm will restart it. But here is an even more important aspect of what that means. Often when you are thinking about building systems, especially distributed systems, you want to think about what invariants are provided by the underlying framework: what does it give us? This helps explain why you have the distinction between an event source and an event processor in Storm. The spouts are the things that produce events, and those events eventually have to be acknowledged in Storm, so that you can ensure all events are in fact eventually processed - and when I say eventually, it really is an eventually consistent scheme. The bolts are distinguished because you can always replace them. Then there is the topology, which describes how the cluster is being split up in terms of resources. And here is one more important invariant that is provided: the number of nodes for a given spout - maybe a Kafka event source - and the number of nodes for a given bolt - maybe something doing some interesting processing of events - is held constant. So during the lifetime of this topology, you know that you always have, say, seven nodes for your Kafka ingest. What does that mean for your ability to know exactly what is going on in your environment, in order to maintain appropriate counters and other state? It is very strong. One extremely important aspect of how you divide the problem up is that during the lifetime of the topology, you always know that someone is handling a given piece of the problem - much as, with a fixed team of people, you always know who to go to for a given task, and you do not have to think about what happens if additional people are added. Being able to scale the problem is a separate issue: scaling up the size of the topology is done outside of the given real-time computation. If that is insufficient motivation, I will give you an example in just a moment - I think I am running a little slow, so I will try to speed up - but these are the most important concepts, the takeaways. You have computational locality: you know that the events at a given node are the ones that are supposed to be there, because that is how you defined your routing. You can route on some sort of field grouping - again, an example is your customer or tenant, or a region of some kind - some way of breaking up the problem space in terms of your events. And think about the possibilities you have if you knew that all of the information about a given customer was in one place: it changes how you think about things. Normally when you write your queries - say, against a relational database - the data is over there, and you bring it to you for a given computation. Instead, in systems like Storm, you move the computation to the data. That is the fundamental idea in MapReduce systems like Hadoop too, but unlike Hadoop, this is in real time: you are able to keep all of that data in memory and compute on it in real time. I should of course mention that there are mappings here - you might have multiple customers on one node, and you have to consider that - but that is an easy distinction to make. So again, you will know that a given customer will be on this node and only on this node.
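The routing idea can be sketched as a simple hash-based field grouping. This is not Storm's actual partitioning code - just the shape of the invariant: with the task count held constant by the topology, a given key always lands in the same place:

```python
import zlib

def field_grouping(key, num_tasks):
    """Route on a field (e.g. the customer id): the same key always
    maps to the same bolt instance, which is what gives locality."""
    return zlib.crc32(key.encode()) % num_tasks

# With the number of bolt instances held constant by the topology,
# every event for customer "acme" goes to exactly one place.
tasks = [field_grouping("acme", 8) for _ in range(3)]
print(tasks[0] == tasks[1] == tasks[2])  # True
```

Note that this stability is exactly why the fixed-topology-size invariant matters: if `num_tasks` changed mid-flight, keys would be rerouted and locality would be lost.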
There are other ways of routing, such as random shuffling, and global grouping, which means there is just one node and everything goes there - obviously not scalable, but useful for computing totals. Storm will track success; all you have to do is consider what your retries look like. There are ways of going further and doing exactly-once event processing, but knowing that all events are at least once successfully observed and computed on is already pretty fantastic. Again, think back to that Unix pipeline I showed you: it is the same idea. If the pipeline goes down, I can re-run it. You do have to handle retry, though - what does that look like? Here is about the easiest code you will see: if you have already seen the record, you can ignore it; otherwise, process it. If you retry the computation and it has not yet been seen successfully, you can do that retry. The merge function can be that simple, but it will depend on the nature of the problem. For instance, if I am doing something transactional in nature - say you are wiring me a thousand dollars - you could retry that many times over and I would be very happy about it, but you would not. So do not assume that a retry is always going to be safe; that is the nature of eventually consistent systems. Sometimes you actually need strong consistency, at some cost, or you could have some sort of balancing compensation - saying, give me back that thousand dollars - and actually re-wiring the funds.
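Here is a runnable sketch of that idempotence point, with a made-up `uniquifier` keyed on record ids: under at-least-once delivery, a redelivered record must not be double-counted:

```python
processed = {}          # uniquifier -> result; survives retries

def uniquifier(record):
    return record["id"]

def process_once(record):
    """Idempotent: a redelivered record is merged, not double-counted."""
    k = uniquifier(record)
    if k not in processed:
        processed[k] = record["amount"]
    return processed[k]

# Simulate at-least-once delivery: record 1 is delivered twice.
stream = [{"id": 1, "amount": 100},
          {"id": 1, "amount": 100},   # retry of the same record
          {"id": 2, "amount": 50}]
for rec in stream:
    process_once(rec)
print(sum(processed.values()))  # 150
```

If `process_once` instead blindly accumulated every delivery, the retry would inflate the total to 250 - the wired-money problem from the talk.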
Another thing with these event processing systems: your streaming system should not try to capture everything that comes through. In order to bound your computations, you cannot keep all the data; you have to window it, and that is done with appropriate windowing in your system. Another caveat revolves around query languages: here you have to build your own, which is actually kind of interesting - other systems have some approaches, and I expect we will see these things emerge over time. I think of these as building blocks that give you these capabilities, and again, these capabilities are a lot like a Unix pipeline, and we know we can build those - so you may have to do it yourself, or maybe use some of the libraries that are out there. With that little excursion done, I will say one more thing.
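The windowing caveat can be sketched with a bounded buffer. This is a generic sliding window, not any particular system's API:

```python
from collections import deque

class SlidingWindow:
    """Keep only the last `size` events; a streaming system cannot
    retain everything, so computations run over a window."""
    def __init__(self, size):
        self.events = deque(maxlen=size)  # old events fall off the end

    def add(self, value):
        self.events.append(value)

    def mean(self):
        return sum(self.events) / len(self.events)

w = SlidingWindow(size=3)
for v in [10, 20, 30, 40]:
    w.add(v)
print(w.mean())  # (20 + 30 + 40) / 3 == 30.0
```

Time-based rather than count-based windows work the same way, evicting by timestamp instead of by count.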
ZooKeeper is an important aspect of most systems that you would build here, so you probably will be interacting with it in some way. Your spouts are ultimately responsible: the event sources are responsible for ensuring that all events are tracked, eventually played through successfully, and eventually acknowledged, and you need to ensure the appropriate handshaking - so as you are consuming from Kafka, you are updating your offset accordingly. From experience, one thing I did not mention at the very beginning: people run a million events per second through a Storm node, and you can have many of those nodes. But if you try to do a million events or more per second against ZooKeeper, that will not work; you have to do some sort of batching, which is just what Storm does for its tracking. So you get into things like this in terms of the Kafka handshaking and what that would look like, and I will come back to this in a moment.
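The batching point can be sketched as follows: commit the consumer offset to the coordination store once per batch rather than once per event, so the store sees a tiny fraction of the load. The `commit` callable is a stand-in for, say, a ZooKeeper write; none of this is Storm's or Kafka's real API:

```python
class BatchedOffsetTracker:
    """Commit the consumer offset once per batch instead of once per
    event, sparing the coordination store from per-event traffic."""
    def __init__(self, commit, batch_size=1000):
        self.commit = commit          # stand-in for a ZooKeeper write
        self.batch_size = batch_size
        self.offset = 0

    def ack(self, offset):
        self.offset = offset
        if offset % self.batch_size == 0:
            self.commit(offset)

commits = []
tracker = BatchedOffsetTracker(commits.append, batch_size=1000)
for off in range(1, 3001):          # 3000 acked events...
    tracker.ack(off)
print(commits)  # [1000, 2000, 3000] -- only 3 coordination writes
```

The trade-off: after a crash, replay resumes from the last committed offset, so some events are redelivered - which is fine, because the processing is idempotent.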
Spouts like these have already been written, mostly in Java, and Storm is multilingual in this sense. So you can run something like Twitter data through it, or you can pull in data from Cassandra. The important thing to know is that it is not that difficult to implement one, as I will show in just a moment. Another thing is that you can push or pull events into the system. Storm is not saying it will only accept events at certain times: you can always push events at it, but it will also ask you when it wants events, so you can use that to help balance things. And again, you are going to be using that topology-size invariant to know how the work is being split up.
You might have something like a firehose, perhaps using Cloud Files or some other cloud provider, sending things through. You might have some sort of real-time decision making - I am interested in autoscaling an environment, where you often have contradictory information about how things are working and need to converge on an action; that is exactly why you bring all that data into one place. Or you might be doing some sort of real-time navigation along these lines. In the couple of minutes left, I will describe Python on Storm using Clamp, which I also briefly described in my talk yesterday. Let's face it: Python is a great fit for writing your Storm code, and we have this system called Clamp that allows you to readily wrap your Python classes so that they can be readily used from Java, in some cases with just one extra line. I discussed this yesterday.
Here we have some clamped class - this is sort of a hello world example - and you can readily, in a couple of lines, use Clamp: you just go in and add it. You can then construct a jar and run against it: you get one jar, a single jar to build that has everything end to end, and Storm can use it in this fashion. But here is what it really looks like in practice; here is some code you can readily follow.
You are going to be opening up your connection to Kafka. In the talk I gave at Rackspace, this actually used an internal system that we are able to talk about publicly, called Atom Hopper. If you like Twitter, you can go and read from it in the same fashion. Storm asks the spout for events via next tuple, but again, if you have something that is pushing events, you can emit at any time. You are then responsible for managing these callbacks, ack and fail, and that is what you do under those circumstances.
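The spout contract just described - hand out events on request, re-queue on fail, forget on ack - can be sketched in plain Python. This is illustrative only, not Storm's actual multilang API:

```python
class QueueSpout:
    """Sketch of the spout contract under at-least-once semantics:
    events handed out stay 'in flight' until acked; a fail puts the
    event back on the queue for retry."""
    def __init__(self, events):
        self.pending = list(events)
        self.inflight = {}

    def next_tuple(self):
        if not self.pending:
            return None
        event = self.pending.pop(0)
        self.inflight[event["id"]] = event
        return event

    def ack(self, event_id):
        self.inflight.pop(event_id, None)

    def fail(self, event_id):
        # Put a failed event back so it will be retried.
        self.pending.append(self.inflight.pop(event_id))

spout = QueueSpout([{"id": 1}, {"id": 2}])
first = spout.next_tuple()
spout.fail(first["id"])            # failure: event 1 goes back on the queue
ids = [spout.next_tuple()["id"], spout.next_tuple()["id"]]
print(ids)  # [2, 1]
```

A real spout would also persist its position (e.g. a Kafka offset) so that a restart of the spout itself does not lose unacked events.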
Perhaps you want to do some sort of computation on these events - what would that look like? This is the pseudocode associated with it, and this is all that you need to do: implement these three methods and you are pretty much done. How complicated is that? It can be like a Unix pipeline - it could be something really simple, or it could be more complicated: if I am trying to weigh events that are telling me it is going this way or that way, I have to figure out what is really happening. But that is the advantage of having it all in one place.
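A plain-Python sketch of such a bolt - the method names mirror the shape described in the talk (set up once, process each tuple, emit results), not Storm's exact API:

```python
class WordCountBolt:
    """Sketch of a bolt: prepare once, process each incoming tuple,
    emit downstream tuples. Names are illustrative only."""
    def prepare(self):
        self.counts = {}
        self.emitted = []

    def process(self, word):
        # Locality means this instance sees every occurrence of `word`,
        # so an in-memory count is correct.
        self.counts[word] = self.counts.get(word, 0) + 1
        self.emit((word, self.counts[word]))

    def emit(self, tup):
        self.emitted.append(tup)   # a real bolt hands this to the framework

bolt = WordCountBolt()
bolt.prepare()
for w in ["storm", "python", "storm"]:
    bolt.process(w)
print(bolt.emitted[-1])  # ('storm', 2)
```

The in-memory count is only safe because field grouping guarantees every event for a given key reaches this one instance.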
So, to conclude: Storm lets you horizontally scale real-time architectures. You have to consider partitioning and fault tolerance - to me these are the key questions to answer; this is how you actually think about what it means to divide and conquer your large problems so that they work on a cluster of machines. Answer these questions and you get to go from what we were previously doing with simple Unix pipelines - analyzing log files and the like - to something very similar that is scalable and, again, in real time. And you can choose your favorite language - I know your favorite language here is Python - and use some of the mechanisms that are out there for communicating with Storm, or obviously use Jython to tie directly into this big-data system, which is what I would advocate. Your usual strategies will work with Storm too, such as test-driven development. The slides for this talk are available. Any questions?
[Audience question, largely inaudible]

That is a great question. The question is along the lines of: we currently have multiple datacenters, run a Storm cluster in each of them, and need to consolidate that information in one place. The first thing to note is that you do not want Storm to span datacenters - there are very few systems that will span multiple datacenters, and Storm is not one of them. You are going to be running a Storm cluster in one data center and another Storm cluster in another data center. The way you deal with the spanning problem is to ensure that data is pushed to some central place - you may of course need to consider what happens on failure, and it gets more complicated - but basically you move that problem to your queues: use something like Kafka, for example. And no, this is not something you can solve inside Storm itself, because again it depends on ZooKeeper underneath, and ZooKeeper does not span datacenters either.
Yes, in a theoretical sense you could, and it might even be great - but don't do that; use systems that are actually proven to work, rather than doing some interesting innovation along those lines which you will find is not so fantastic. I don't know if there are any other questions - this is something I can take up afterwards, and I would obviously love to spend lots of time on it. Anything else? Then I guess we are done. Please come ask me those questions that you might have.

