A deep dive into the Pymongo MongoDB driver

Speech transcript
[Introduction by the session chair: "A deep dive into the PyMongo MongoDB driver".] Thank you. A quick poll, so I know how many slides I can skip: who here is new to MongoDB? OK, we'll do a little bit. Those experts amongst us who have crashed through replica sets and sharding, just bear with me for a couple of minutes. My name is Joe Drumgoole. I run the developer advocacy organization for MongoDB in Europe, so my job is to get you to start using more MongoDB and to ditch, you know, those other databases. We've been doing this for about nine years, so it's a pretty mature database, and it was designed from its outset to take away the pain of programming. Anybody who has ever built a program with an SQL database knows about object-relational mapping and the horror of taking beautifully written object-oriented or functional code and casting it into SQL statements.
So from its outset MongoDB was designed with a set of native drivers in mind. Drivers are essentially the client API for talking to the database, and it was purpose-built so that you wouldn't even have to know much about JavaScript Object Notation, JSON, which is the coin of the realm of the database. That's what we store in MongoDB: JSON documents, not Word documents. But when you interact with a driver, like the Python driver, the Java driver or the Node.js driver, you are using the types and the objects of the language you are programming in. So in Python you are using dicts and tuples, arrays and datetimes, all the objects you know and love. Underneath the covers it is all JSON, and actually a binary version called BSON, which I'll talk about, but you don't need to worry about that; we handle all of that complexity for you.
Beneath that we've got the document model, which is JSON-based, and underneath that you effectively go straight to one of the storage engines. The default operating mode, if you are downloading MongoDB for the first time, is a single node on a laptop. In that instance you are running directly with what we call the WiredTiger storage engine, the default storage engine since version 3.2. WiredTiger is designed to be high performance and to run very efficiently on clusters with high-memory instances with lots of cores. We also have a legacy engine called MMAP, which we used up to version 2.6, and if you are running legacy applications built on 2.6 versions of MongoDB you can start off using that and then do the conversion. We have an in-memory storage engine, in beta at the moment, that will be generally available in 3.4, which comes at the end of this year. And we have an encrypted storage engine, so if you are storing credit card information, doing e-commerce, or you've got medical data or anything you want to make sure is secured at multiple levels, you can put it in the encrypted storage engine.
But the key thing is that these layers are essentially insulated from operational decisions. The driver doesn't care; it works independently of all of those layers. So the great thing about MongoDB is that you can start running a single node, then move to a replica set, to a sharded cluster, change storage engines, and only change a single line of code. That's the real beauty: we've separated the concerns of operational deployment from development. We have a range of security and management frameworks and technologies that wrap around this for production. I'm not going to talk about any of that today, but I can direct anybody who's interested to lots of information on it, and we're doing lots more talks. We'll be at the MongoDB Europe conference on November the 15th in London. I do recommend you sign up for that, and I have a discount code if you're interested as well.
The drivers support just about any programming language you can think of using: obviously Python, which is why we're here today, C++, C, C#, Node.js, Java of course, Ruby, and frameworks. So we're ubiquitous. We're no longer in a kind of monoculture where there's just one programming language that rules them all. A modern, distributed, mature development organization uses the most appropriate language for the most appropriate time and place, so being able to support all of these languages is important. One of the key tenets of the driver philosophy at MongoDB is that drivers should have effectively the same semantics, so when you move from Java to Python to Ruby you don't have to radically relearn what you're doing. A lot of the semantics we'll talk about today are essentially cross-driver in their application, although I'm going to talk specifically about PyMongo.
Underneath the covers we don't use standard JSON. If you were encoding JSON text into the database it would be horrendously slow, parsing it all the time. So BSON is essentially a binary encoding of JSON. It's a public spec that's available, and it adds essentially two things that we don't get from JSON. One is type information, so I can tell ahead of time that this is an embedded document, or an array, or a datetime. The other is length, so I can skip objects efficiently: I can run down a collection very efficiently, I can tell that there are this many objects of this size and skip to the end, and I can seek into it. Every driver uses this, so every driver has a BSON library included inside it, and this is usable whether you use MongoDB or not. Lots of people take these capabilities and just use them as an encoding mechanism for JSON; in fact a company in Germany is doing exactly this. It's analogous to things like Google Protocol Buffers, Thrift, or, if you're as ancient as I am, Abstract Syntax Notation One, ASN.1. So it's essentially a binary encoding of JSON.
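
To make the "type plus length" idea concrete, here is a minimal sketch of BSON encoding for just two value types (UTF-8 strings and 32-bit integers), written from the public BSON specification. The function names are invented for this illustration; this is not PyMongo's `bson` package, which handles every type. The length prefix on each document is what lets a reader hop over documents without parsing them.

```python
import struct


def encode_element(name, value):
    """Encode one key/value pair: type byte, NUL-terminated key, then the value."""
    key = name.encode("utf-8") + b"\x00"
    if type(value) is str:
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02 (string): int32 byte length (including the NUL), then the bytes.
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if type(value) is int:
        # Type 0x10 (int32): four little-endian bytes.
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError("this sketch only handles str and int32 values")


def encode_document(doc):
    """Encode a flat dict: int32 total length, the elements, a trailing NUL."""
    body = b"".join(encode_element(k, v) for k, v in doc.items())
    # Total length counts the 4 length bytes, the body and the trailing NUL.
    return struct.pack("<i", len(body) + 5) + body + b"\x00"


def document_length(buffer, offset=0):
    """Read the length prefix of the document starting at `offset`."""
    (length,) = struct.unpack_from("<i", buffer, offset)
    return length


def count_documents(buffer):
    """Walk back-to-back documents using only their length prefixes."""
    count, offset = 0, 0
    while offset < len(buffer):
        offset += document_length(buffer, offset)
        count += 1
    return count
```

Encoding `{"hello": "world"}` this way yields the 22-byte example given in the BSON specification.
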
So these are really the standard topologies, the layouts of servers, that you're going to see with MongoDB, and it's important to understand these for what follows, because the way in which we interact with these clusters is the way in which we will understand the driver. The standard one we start off with: we download a developer kit from MongoDB, we get a driver, in this case PyMongo, we write our application code, which links to the driver, and we connect. I've done this on my laptop a billion times. It's just a single mongod process which points, generally, at /data/db for the data files. It's great for development, not recommended for production or anything other than development, but very simple to get up and running.
The next and most common deployment of MongoDB is a replica set. A replica set is the way in which we establish durability guarantees over the data. A replica set essentially allows us to say how important this data is, and therefore how many copies I want and how they are distributed across a particular class of systems. If you look at our MongoDB Atlas service, which is a database in the cloud, we mandate that each replica member is in a separate AWS region. That means we have isolation in terms of network, power supply and geography, so that we can guarantee that even if a region goes down the replica set will survive. So replica sets are designed to give us data durability. They are constructed as a primary, this node at the top here, which is the only node designed to accept writes (you can't write to any node in a replica set except the primary), and a set of secondaries. In this case it is the canonical three-node replica set, which gives us fault tolerance for any single node failing; you can go up to about 50 nodes.
When we write to the primary, the primary's job in the replica set, its whole job, is to ensure those writes are replicated to the secondaries. We can specify something called a write concern that tells us how important those writes are. If I'm writing log data I might set my write concern to be just acknowledged, which essentially says I get a round trip to the server and everything is fine. If on the other hand I'm writing transaction data for a bank, I need to be sure that every write is durable. In this case I would specify a write concern of majority, and that will ensure the write is only returned as successful if a majority of the nodes accept the write. Write majority is the way most of our production clusters run.
So, replica sets fail. What happens when the primary fails? Now I have nowhere to write to. Well, replica sets are designed to automatically recover from this, and the way they recover is by holding an election. The remaining nodes agree on the most stable, most up-to-date member, and they elect that member as the new primary. So that secondary now becomes a primary, we have a stable replica set again, and we can accept writes again. That typically happens in two to three seconds, and can be faster with a faster network and faster clusters. Recovery is when we add the other node back: all it will essentially do is sync with the primary and come back up with a complete copy of the data. We're now back to where we were, though the primary has moved. The whole idea is that the primary can rotate around the replica set. We don't really need to care where it is; the replica set will manage where the primary is, and part of what the driver allows us to do is to establish the connection to the primary when it moves or fails. We'll talk about the protocols it uses to do that in the next couple of slides.
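
As a rough illustration of the two write concerns just described, the sketch below computes how many members must accept a write before it is reported as successful. It is a simplification with made-up function names, not server or driver code: the real server counts voting, data-bearing members and applies further rules. In PyMongo the same choice would be expressed with `pymongo.write_concern.WriteConcern`, for example `WriteConcern(w="majority")`.

```python
def required_acknowledgements(w, members):
    """How many replica set members must accept a write for write concern `w`.

    `w` is either the string "majority" or an integer count
    (w=1 means "acknowledged by the primary only").
    """
    if w == "majority":
        return members // 2 + 1
    if isinstance(w, int) and 0 <= w <= members:
        return w
    raise ValueError("unsupported write concern in this sketch: %r" % (w,))


def write_succeeded(w, members, accepted_by):
    """True if enough members accepted the write to satisfy `w`."""
    return accepted_by >= required_acknowledgements(w, members)
```

With a three-node set, a write accepted only by the primary satisfies `w=1` but not `w="majority"`, which needs two members.
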
The last cluster environment is the sharded cluster. The sharp-eyed amongst you will realize that there is a finite limit to how much traffic you can send to a single node in any computer cluster, so a primary gives you an effective cap on the amount of bandwidth, memory or CPU that a single node can handle. A sharded cluster essentially removes that limit by allowing you to run a unified cluster in which there are multiple replica sets, which give you multiple primaries. That scenario involves an additional daemon called mongos, which is a shard router: it routes queries, writes and reads to the appropriate shard. I'm not going to talk much more about sharded clusters today, although you should be aware that very similar code to what happens in the driver is used in the mongos to establish and manage connections. The key thing to remember is that your driver code doesn't change: start with a single node, move to a replica set, move to a sharded cluster. There are some changes we suggest you make, and we'll talk about those, to handle failures that happen at the replica set level.
The driver does a bunch of things. It's not just a kind of passive API library; it's more than that, like a lot of libraries in Python and Java. It's a multithreaded library, so it starts additional threads. It handles authentication and security. It does the Python-to-BSON mapping that I talked about. It handles error handling and recovery in the event we see failures. It manages the wire protocol, so how BSON objects are put onto a socket and sent to the server, and how responses are handled. It also has a connection pool, so we can essentially manage a set of connections over a set of servers and we don't have as many connections as we have clients. And finally there is topology management, and that's a very specific choice of words, because there is a specification, which I'll show you again, which defines what topology is from a driver's perspective.
Topology is essentially the driver's understanding of what a cluster looks like. A lot of what we talk about for the rest of this talk is how the library gathers the topology, understands it, uses it to allow the client to interact with the cluster, and recovers when the topology changes. If you look inside the PyMongo code there is a topology object that encapsulates all of this. When you start reading the spec, if you ever get to that point (it's the Server Discovery and Monitoring spec, and there's a link at the end), I would always have the PyMongo code open alongside it in your favorite editor, because it makes it much easier to see how this stuff is implemented. I've got to say the PyMongo code is some of the easiest code to read that I've come across; they've put a lot of hard work and effort into making it a manageable, easy-to-understand project.
I'll skip over this, since most people here seem to understand MongoDB, but this is just a standard set of Python calls we might make. We call MongoClient with a host and port, for a standalone instance, connecting to localhost. We create a database and a collection, and then we insert, we find, we update, we delete. These all take JSON documents; in fact, obviously, dicts in Python. So what happens when we're actually connecting to a replica set? When I want to connect to a replica set, I can give it some number of nodes; one node will work. In this instance we're giving it two hosts, and we're giving it the replica set name. "replset" is the name used by mtools, which I used to try out some of this stuff. mtools is a package provided by a MongoDB employee, and it's a great way of building clusters very fast and very easily. I do recommend you look at the mtools repository.
So when we create the client, what happens? MongoClient actually returns immediately. When you called MongoClient in PyMongo 2 it would stall, because it would be establishing connections to the server explicitly. In PyMongo 3, which I recommend everybody uses if you're not using it already, it returns immediately. What it's actually doing is spinning up some threads in the background, and those threads are monitoring threads that are designed to establish the topology of the cluster. We've given it two hosts, so the way the spec operates is that it spins up one monitoring thread per server. Right now we only know about two servers, because we only told MongoClient about two servers, host 1 and host 2, so we start monitor thread 1 and monitor thread 2.
What these monitor threads do is send a standard call called isMaster. isMaster is an anachronistic name for the function; it should be called getStatus. It predates clusters and replica sets, from when we only had a master-slave deployment model, but effectively it gets status. So isMaster will be sent to each one of these nodes. The isMaster response that returns first, in this case, goes to monitor thread 2, and it says two things. ismaster is false, which means "I'm not a primary", and, just to double-check that, secondary is true, which tells us explicitly that it's a secondary. But it also returns a list of hosts. Remember, when a replica set is constructed, each member knows about all the other members, so every member of the replica set is aware of all the other members of the set and continues to be aware through a heartbeat. So monitor thread 2 gives you a bunch of data.
I ran this earlier on. It actually gives you quite a substantial dictionary of data, but the parts we're interested in for the purpose of this talk are these: it tells you the list of hosts (in this case I've run them all on the same box, which I don't recommend you do in production), ismaster is false, secondary is true, and the replica set name is replset.
So what we do is take the current topology, which right now is the two nodes I know about, the seed nodes I passed to MongoClient, and combine it with the new information that I gathered by running isMaster on the first node to reply. That isMaster has told us that there are three nodes in its cluster: host 1, host 2 and host 3. Right now we've got monitor 1, which hasn't got a response to its isMaster yet, and monitor thread 2, which has got a response. So we now know for sure that the topology has one real node in the cluster, that it's a replica set, and that the node is a secondary; that's a status field inside the topology. We also know there are three hosts in this cluster, so we're going to spin up a third monitor.
Now we have three threads all monitoring the servers. They've all sent isMaster, and we've had one isMaster reply. Remember, these threads are running in parallel, but with all the constraints of the global interpreter lock and the threading issues in Python itself, so they're not insanely fast. So monitor 2 has returned, and monitors 1 and 3 are still waiting for isMaster. In the meantime, what's your client doing? You called MongoClient and it returned immediately. Well, I'm going to do something with the database at some point, and obviously I might call a write.
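
The bookkeeping described here, seeding the topology and then folding in each isMaster reply, can be sketched in a few lines of plain Python. This is a deliberately simplified model, not PyMongo's topology object: the function names and dictionary layout are invented for illustration, while the reply fields (`ismaster`, `secondary`, `setName`, `hosts`) and the server type names follow the isMaster command and the Server Discovery and Monitoring spec. The real spec distinguishes more server types and handles many more edge cases.

```python
def new_topology(seeds):
    """Start with only the seed hosts passed to the client, all of unknown type."""
    return {"set_name": None, "servers": {host: "Unknown" for host in seeds}}


def apply_ismaster(topology, host, reply):
    """Fold one isMaster reply into the topology.

    Returns the newly discovered hosts, i.e. those that now need a monitor.
    """
    if reply.get("ismaster"):
        server_type = "RSPrimary"
    elif reply.get("secondary"):
        server_type = "RSSecondary"
    else:
        server_type = "Unknown"
    topology["servers"][host] = server_type
    topology["set_name"] = reply.get("setName", topology["set_name"])

    # Every member reports the full host list, so one reply reveals the set.
    discovered = [h for h in reply.get("hosts", []) if h not in topology["servers"]]
    for h in discovered:
        topology["servers"][h] = "Unknown"
    return discovered


def primary(topology):
    """Return the host currently known to be primary, or None."""
    for host, server_type in topology["servers"].items():
        if server_type == "RSPrimary":
            return host
    return None
```

Feeding in the secondary's reply from the talk adds the third host as "Unknown" and still leaves `primary(...)` returning `None`, which is why a write issued at that moment has to wait.
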
So I'm going to insert one simple document, a write. It's the same replica set, so the client knows it's inserting into a replica set. This insert is going to block, and it's going to block because I can't write to a replica set unless I have a primary. Why? Because you can only write to primaries in a replica set; that's one of the constraints of the cluster. So MongoClient will hold that write, because it knows it doesn't have a complete picture of the replica set at this point; it only knows about one secondary.
Now we get an isMaster response from host 1, and the isMaster response from host 1 tells me this is the primary node for this cluster. There's a server selection protocol, a whole separate spec that I'll talk about later, that tells us how to pick a server for particular operations, but the one thing that server selection knows is that I can only write to a primary. So now the client doesn't have to wait. It doesn't need to know about the rest of the cluster, because the cluster will manage itself. The client knows that if it writes to the primary, the primary will either satisfy the write concern, writing to a majority of nodes, or return an error to me, in which case I can return an error to the caller. So I can go right ahead. This write will succeed: the primary accepts the write, responds with an OK, we send the response back, and the insert is satisfied. In most cases this whole round trip is only a matter of a couple of hundred milliseconds.
So we have writes succeeding, and we have a cluster topology that's got three monitor threads running, but we only know the state of two nodes. At some point later host 3 is going to respond. At that point we update the topology again: we now have three monitors with three isMaster responses and a complete view of the topology. This is what we call steady state, the standard operating mode for a cluster. What happens now is that these threads essentially wake up every 10 seconds and run isMaster, isMaster says everything is A-OK, they update the topology and go back to sleep. They may run in parallel, and there may be some skew in their scheduling; we don't mind about that, because as long as our inserts are succeeding we're fine.
Unfortunately life has a nasty habit of intervening. You know, the saddest words in IT I've ever heard are "we have never experienced a failure". I remember I was one of the first people to use Amazon's S3 service in 2006, as an online backup system, and everybody said this stuff never goes down. Sure enough, for two, two and a half years it didn't go down, and then it went down and took out a whole eastern seaboard of applications. That was the first time everybody was told you need to be multi-region to be fault-tolerant and resilient to the level we want to be, but everybody was piling their nodes into one AWS region, US East in Virginia, and it took out most of the startups in North America. So "we've never had a failure": I guarantee you next week there is going to be a failure, and then you can never say that again.
So life intervenes: we kill the primary, host 1. Now, the threads wake up every 10 seconds, so you could kill a primary host and the cluster could recover, and in a lightly loaded cluster where very little write activity is happening you might never know. The cluster and the topology would recover; within 10 seconds the monitors would find the new primary, the topology would update, and the client would never see it. But of course in a heavily loaded cluster (and why would you use MongoDB unless you have a heavy write load? The whole point of NoSQL is to handle thousands to millions to hundreds of millions of writes per second), it's likely that before a monitor wakes up, the client is going to send an insert to the dead primary. And now it's unpredictable what will happen to the client. It might send the request but never receive a response, it might be halfway through sending the request, it might get a socket close; we don't know. But all those errors are wrapped up by the driver and sent back to the client as a connection failure. You don't need to worry about sockets, timeouts, all that crazy stuff; we do all of that for you. All you need to know is that at the point when you were making the write, the primary died and in some way interrupted your write.
So what do you need to do in this situation? Well, you retry the write. I'll talk about how many times you need to retry in a couple of minutes, but you retry. So what happens when you retry? Well, the primary is down, and when you retry, MongoClient now knows that the primary isn't there; it's no longer a surprise to it. So it's going to hold your write, in the same way it held writes at startup, and put that write in a queue. And it's going to essentially put out an all-points bulletin, a wake-up call to its threads. It's going to say: hey guys, hey girls, there's no primary in this cluster, that's a big deal. Instead of waiting 10 seconds, it calls those threads every half a second until a primary returns. Meanwhile the insert is waiting.
Elections, as I said, happen pretty quickly, in two to three seconds, so within a couple of seconds of putting out the all-points bulletin and having all of these threads calling isMaster, the cluster will elect a new primary, in this case host 2. Host 2, if you've followed the guidelines, is the same server size, with the same bandwidth, network and memory as host 1, so we shouldn't see a drop in performance. We'll have slightly less fault tolerance, because we're only replicating to two nodes rather than three, but we've still got some replication. Remember, the client says: once I have a primary, I'm good to go. So now I can send the write straight to the server. Then the third node comes back: someone goes into the data center and replaces the power supply, or we install a new node in a different AWS region, or someone takes the cup of coffee out of the network connector, and the system recovers.
Now we're back in steady state. In many cases we may not have the same primary, and that's fine: the topology is comfortable with the primary moving around within the replica set. You can change that: you can set a priority, which essentially says "whenever host 1 is available, I always want it to become the primary". What happens there is that you get a second election: the minute host 1 comes back you have a new election, and every time you have an election you're likely to get a retry in your code or a connection failure, so we don't really want to promote lots of elections. For managing a cluster, what we say is: if you're doing maintenance, say you're upgrading the operating system on all of the cluster members, take out each of the secondaries first; taking out the secondaries does not interrupt the ability of the driver to write to the cluster. Then take out the primary last, and take it out at a time when your traffic is quietest. We all know there are services that are global in scope and never turn off, but there's always a window when you have the least amount of traffic. That's the time to take out the primary, and the drivers will handle it, as long as you don't keep the primary down for long.
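
The two rules at work in this part of the talk, "a write needs a primary" and "poll faster while something is waiting for a server", can be sketched as follows. The function names and the host-to-state mapping are invented for this illustration and are not PyMongo internals; the 10-second and half-second intervals are the ones quoted in the talk.

```python
HEARTBEAT_SECONDS = 10.0      # steady-state monitor interval quoted in the talk
MIN_HEARTBEAT_SECONDS = 0.5   # interval while an operation is waiting for a server


def select_writable_server(servers):
    """Return a host that can take a write, or None if no primary is known.

    `servers` maps host -> "primary", "secondary" or "unknown".
    """
    for host, state in servers.items():
        if state == "primary":
            return host
    return None


def next_check_delay(servers, write_waiting):
    """How long a monitor should sleep before its next isMaster call."""
    if write_waiting and select_writable_server(servers) is None:
        # The "all-points bulletin": re-check quickly until a primary appears.
        return MIN_HEARTBEAT_SECONDS
    return HEARTBEAT_SECONDS
```

A caller that gets `None` back from `select_writable_server` would hold the write and try again after the next topology update, up to whatever overall timeout it is prepared to tolerate.
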
So what does this mean for you as a programmer? Well, if you want to establish connectivity to a server, if you want to know "do I have connectivity to a particular server?", calling MongoClient is meaningless. MongoClient will wait for 30 to 40 seconds before throwing an error, because it's got those threads and they're waking up every 10 seconds just saying "gee, I wonder if this server is up". Getting a client object back doesn't give you any state information about the server. What you need to do is run isMaster yourself, explicitly. So you can run client.admin.command: admin is a database that's always available, so you don't create a new database by accident. You run isMaster and wait for it to return. If it returns, you've got a connection; if it throws an exception, you know you still don't have a server that's available. isMaster has a couple of other properties that are interesting. One: if you've got authenticated servers that require users, isMaster does not need authentication. It's effectively a server ping, so you can very quickly run it on a server. It's also efficient, and of course it gives round-trip information to the driver, which the driver can use to establish the best server to use for a range of queries. So don't use MongoClient as a connectivity check.
What about meaningful queries? Well, find behaves the same way. Remember what happens when we have a failure: the query is going to fail, and then on retry the driver is going to queue your query until it re-establishes connectivity with the cluster. MongoDB is not a panacea. MongoDB will manage MongoDB, and MongoDB will stay up for as long as you want; we've got clusters that are enormously stable. But we can't manage your network, or the individual who walks into your data center and kinks a fiber cable, as happened in Bangalore a couple of years ago, or other external influences like that. At a corporation I worked at way back, the number one thing that used to take out the whole data center, and this was a site of 5,000 people, was a JCB digging up a cable; it happened all the time. With those kinds of outages your driver is eventually going to time out and tell you something is wrong.
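
The explicit connectivity check described above might look like the sketch below. `client.admin.command("ismaster")` is the call named in the talk; the helper name and the `connection_errors` parameter are inventions for this example. With PyMongo one would pass a `MongoClient` and `pymongo.errors.ConnectionFailure`; the default of catching any `Exception` is only there so the sketch runs without PyMongo installed.

```python
def server_is_available(client, connection_errors=(Exception,)):
    """Return True if an isMaster round trip succeeds, False if it raises.

    `client` is expected to expose `client.admin.command(...)`, as a
    PyMongo MongoClient does.
    """
    try:
        # isMaster needs no authentication, so it works as a cheap server ping.
        client.admin.command("ismaster")
        return True
    except connection_errors:
        return False
```

How long a failing check takes to return depends on the client's timeout settings, so a health check would normally be run against a client configured with a short server selection timeout.
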
So if the replica set doesn't return, it means something else is up. If you get a second connection failure, you need to propagate the error up through your application, because that replica set isn't coming back any time soon, not in a meaningful time frame that makes a difference to you. So the guidance for drivers is: we're not going to automatically resolve your issues. You must participate in your own salvation, but you only need to retry once. These retry loops that you see in a lot of code, "for i in one to five: retry, retry, retry": don't do that. One retry is enough. If it doesn't work once, it's never going to work in a time frame that matters to you.
What does that one retry mean for inserts? Well, for inserts we must be careful, because an insert might succeed but the response might not come back. We might send the BSON object and the insert command to the server, the primary gets it, then we lose the network and the response doesn't come back. So we want to establish that we don't do a double insert. We do that by explicitly adding the ObjectId to the document before we insert. The driver does this anyway, but if there is already a document _id it doesn't add one. Now we do the insert. Either it succeeds and everything goes on as before, or it fails and we retry. When we retry, if the first insert actually succeeded, we get a duplicate key error. Remember, _id has a unique index on it, so you can't add more than one object with the same _id. So the duplicate key error essentially tells you that you've already done the insert. Now you're essentially back to square one and you can carry on as before.
Updates are trickier. Updates change a document in place, and there are three ways to think of this. The first way is: it doesn't matter if I under-count. The update doesn't succeed, so what? Carry on; my counter is missing a couple, it's not the end of the world. Or it doesn't matter if I over-count, in which case I can retry; I always retry everything. But if you can't tolerate being over or under, you need to think about creating an idempotent protocol for the database, and that's beyond the scope of the driver to help you with. You need to think about essentially turning updates into a series of writes. Those writes may be duplicated, and you can use the client to do this, more often than not with an event sourcing model. I also do a talk on event sourcing, so maybe you can catch me again and catch that talk. Using an event sourcing model, you turn all your updates into straight writes. You use the same kind of _id model to add a unique ID to each write, and then look for that unique ID, either as a separate field or using the _id field itself, when the duplicate key error is sent back to you. Then you can accumulate or aggregate those values into the updates on the server side with the aggregation framework.
The whole point is that these operations have been designed, from a driver's perspective, to be recoverable, and that's just the nature of a distributed, clustered database system. It's one of the trade-offs that we make. MySQL, Postgres and Oracle will tell you "we've never had a failure", until you do have a failure, and then the failure is catastrophic because you can't write to the database at all. We're trying to avoid that catastrophic failure.
There are two more choices for you, two parameters that you can pass to the client. connectTimeoutMS is how long the monitoring thread will wait to establish a connection to a server when it's essentially pinging with isMaster. This is something, again, that we can't tell you. If you're a high-frequency trader and you've put a cage of servers right next to the New York Stock Exchange, well, connectTimeoutMS might be two milliseconds: if you can't reach the cluster in two milliseconds you're losing billions of dollars of transactions. On the other hand, if we ever, I don't know, have servers running MongoDB on the International Space Station, I'm sure the latency of a satellite link to the space station is going to be slightly worse than between ourselves and a data center in Amazon. In that case you might want to extend connectTimeoutMS to something greater than 30 seconds.
The other one is serverSelectionTimeoutMS. connectTimeoutMS is on the monitor thread; serverSelectionTimeoutMS is on the operation. We used to use the same one, connectTimeoutMS, and serverSelectionTimeoutMS is something we added because you essentially want to say "how long am I willing to wait for this write to complete?" If you've got a system where you can afford to queue writes, where you've got back pressure (imagine your writes are coming off a Kafka queue, and that back pressure can be propagated through the Kafka queue back to the clients), then you can afford to set this timeout very high. If on the other hand you've got IoT events coming in that you need to process in real time, you want to set it much lower, in which case you'll get an error back to the client; you'll get a timeout. So these again are areas where you can choose, for your environment, to control what happens in your application.
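
Pulling together the retry guidance from earlier in the talk, here is a minimal sketch of "assign the `_id` up front, retry exactly once, and treat a duplicate key on the retry as success". How long each attempt can block before failing is bounded by the timeouts just described. The helper is written against plain callables and exception classes so it runs without a server: `insert_with_one_retry` and its parameters are hypothetical names, and with PyMongo one would pass `collection.insert_one`, `pymongo.errors.AutoReconnect` and `pymongo.errors.DuplicateKeyError`, and would normally use a `bson.ObjectId` rather than the UUID used here.

```python
import uuid


def insert_with_one_retry(insert_one, document, transient_errors, duplicate_key_errors):
    """Insert `document`, retrying exactly once after a transient failure.

    Returns the document's _id. A second transient failure is not caught,
    so it propagates to the caller, as the talk recommends.
    """
    # Assign the key on the client so both attempts carry the same _id.
    document.setdefault("_id", uuid.uuid4().hex)
    try:
        insert_one(document)
    except transient_errors:
        try:
            insert_one(document)  # the single retry
        except duplicate_key_errors:
            # The first attempt reached the server and only the reply was lost.
            pass
    return document["_id"]
```

Note that the retry sits inside its own `try` block, so a duplicate key error raised by the second attempt is handled rather than escaping to the caller.
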
although this is documented in in great document written by an engineer in the US called Jesse Julie Davis you told them among the year-ago so I'm kind of channeling his talk and he has a set of links and you can read out of the of put all the index up on slideshare afterward so should be able to get it you might account Mangeot removed during remove the full martyring inspect server discoveries models that is on the hot air please read download them we accept the change requests and that something you don't understand the common that's this but the key thing is this has established a consistent model for how drivers interoperate with clusters both shot and represents a lowly told that represents today from old drivers so all drivers among Latino comply with respect to now you're writing code of more or less works in primal know that same cold path should work in Ruby and Java in the same way it may work slightly differently see is there the semantics for a single threaded drivers different from multivariate drivers we typically don't do multi clients for languages that don't have explicit training based into language otherwise including the many third-party libraries and so the cluster will manage itself but the driver requires your house so when you're writing longer client code don't forget that you're in a distributed network system you know the 7 fallacies of building industry assistance is always the there but it's still true and so what we do is try and help you as much as possible but you must participate in your own salvation so you must engage readers enough to become an expert from this my talk but reading suspect will really help you to understand have topologies managed inside the driver lost that's pretty much it for me there are those of you who read tools summation of gone what about certain queries and updates have that they work well when I started to this talk I hadn't really got head inside the respect and I used to run cells from only that they 
wouldn't let me write code in sales, so I moved to developer advocacy, and now I can write code at least half the time; I have to talk about it the other half.

So all the operations in MongoDB on the client side, apart from topology management, which is the most complicated and interesting, run the same model. Validate the parameters, and there's lots of parameter validation. Convert the data into BSON. PyMongo actually has an intermediate format called SON, which is essentially an ordered dictionary, so SON objects have the property of being able to convert themselves to and from BSON. Take that BSON object, and then essentially we look for a socket, and for that socket, a server: we ask for a server that's suitable for this operation. This is the server selection protocol I talked about; there is another spec for that as well. Fundamentally, the most simple server selection is: I need a primary to do writes; for reads I can use secondaries. Once I've found a server that fits the selection, I take the BSON object and inject it down the server's socket, a straightforward TCP/IP socket. I wait for a response object, which comes back in BSON. I unbundle that into a SON object, and then turn that into essentially the dicts or arrays or whatever the client expects to see.

The only special thing with queries is that we don't actually return anything when you do a query. A query returns a cursor, which is client-side only. It's only when you ask for elements that the iterator goes and gets a list of elements, and the first fetch gets about 100 elements.

So that's the MongoDB client. The time on the clock is about twenty to, so some questions, if people want to ask questions. Otherwise, thank you very much for listening. If you have a question, please use the microphone, otherwise I have to repeat it, and I'm so lazy.

The first question is: you mentioned that people often use the BSON serializing logic, which is bundled with the MongoDB
client. So maybe it would be a good idea to factor out a module for serialization and deserialization outside the client, because, if I remember correctly, I wanted to use this serialization logic just this summer, and I had to install the whole MongoDB client package. Then other colleagues came to me asking, "why do we need Mongo here?", because I just wanted to use BSON.

You've probably got a point, but it's a trade-off. We're here to support MongoDB developers primarily, and it's easier if they just install one package which is the totality of what they need. If we were to separate that package out, it makes their lives more difficult. So the trade-off is that you have some pain when you just want BSON, but the PyMongo developers have a much easier life with everything bundled together, and I think that's a good deal.

Referring to the slides: I believe there is a problem in the logic of the retried insert to the database. You insert the first time, let's assume that fails, and then you insert again. The slide says that this can throw a duplicate key error, but you won't catch this error, because the second insert isn't in the try block; it's in the except block, so it would need a second except to be caught.

I will amend that code. Thanks. And also, if I'm allowed to make announcements, I wanted to say very quickly: if any of you use MongoDB with scientific Python, you should talk to me, because they're developing a new package and I would love to get your ideas.

Thank you for the talk. I'd like to know about the Motor driver: what are the plans for Python 3 and asyncio in the async driver?

I don't know about it; I'd have to ask the driver team. Come to me afterward and give me your question, and I will send an e-mail to the driver engineers.

Do you know about libraries like
MongoEngine, and what do people think about them? It's an ORM; it makes MongoDB look like an SQL database.

I have a personal opinion; this is not MongoDB's opinion, it's my opinion, no matter what people say. Why would you use something that makes MongoDB look like an SQL engine? The whole purpose of using MongoDB is to make the things you put in the database very similar to the things you interact with in your programming language. I know one case, which is: I already have SQL code that I want to run alongside, right? If you have a compelling reason and you need to write SQL queries, then use that kind of layer. One of the things that we have in the commercial offering is a BI Connector that layers on top of MongoDB, with which you can run SQL queries against MongoDB. So we support the idea of querying MongoDB using SQL, and it has a range of mapping capabilities to unwrap arrays and nested objects into tables, so you get a tabular view of your data. But you can't write with it, and a tool like the one you mention may allow you to write. In which case, yes, if you actually have to use SQL, then use the tools you need to use.

Just to complete the point on MongoEngine: it could be useful for validating documents.

I didn't talk about it today, but we have document validation.

Since what version would that be?
Yes, it validates documents.

So MongoEngine could be [unclear]?

I was at a conference recently and somebody quoted an engineer from their organization, which is: "we can build it today, or we could wait six months and Amazon will build it for us." Well, there is a little of that with MongoDB: things that you build today, we're probably going to build tomorrow. If there is a community of people who need a capability, we listen; we spend a lot of time with the community. I can't make any forward-looking statements on what we will or won't do, but, surprise, surprise, having some kind of SQL capability against a database is something that people want, and we introduced that in the last release, MongoDB 3.2, and we'll continue to enhance those capabilities.

We rarely drop features. It's like isMaster: it's still around, right? isMaster was the original status call in the very original versions of MongoDB. One of the good things, one of our strengths, is backwards compatibility: we tend not to break stuff in the past in order to bring stuff forward. Lots of our customers are still running MongoDB 2.2. Shame on them, but they are.

I have a question about automatic recovery of the cluster. You told us that when you have a cluster of three nodes and the primary fails, then after the election a new primary is available. I had a situation not long ago where I had a five-node cluster, and when three of the nodes failed, one of them the primary, I ended up with two secondaries who couldn't elect a new primary. This could only be fixed by forcing a reconfiguration of the cluster manually, or by waiting until at least one of the failed nodes started again. Is this normal behavior?

That is normal, documented behavior. That's the way the cluster is meant to respond. The cluster will only
recover a primary if a majority of the nodes survive a failure and a majority of the nodes are connectable, which is to say they're all in the same network partition. So if you lose the majority of the nodes, the two nodes that are left will come up as secondaries. The driver will say, "well, I can't write", but server selection says, "I can still query", so I can still direct queries to the secondaries, but we know that any write will fail.

That's what I mean by catastrophic failure: catastrophic failure is where I can't elect a primary because I don't have a majority of the nodes. Whether that catastrophic failure is caused by a failure to isolate nodes or by a data center failure doesn't really matter. One of the things that I say is that neither MongoDB nor any other distributed database is a panacea, right? People ask for 99.9999999% uptime because they don't know what that means. Then I say to them, "you know that your power provider, the National Grid, doesn't have that level of uptime. What are you going to do when the National Grid goes down?" They say something dumb like, "but we have a generator", and I say, "who's it for? Nobody has a network, the whole grid is down. Yes, you're running, but the rest of the world can't connect to you."

So we're responsible for ourselves; the rest of the world has to worry about the rest of the world. We'll make sure the cluster recovers. If you set it up so that the nodes are independent of any single point of failure within your domain of control, then you shouldn't lose three nodes at once. If that happens, it means one of two things happened: you joined them into a single point of failure, or, more often than not, you poisoned your own cluster, so you destroyed the cluster with bad code. That happens all the time.

I think that's the end. OK, if you have questions, come and find me. Thank you.
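The majority rule and the read/write split described in that last answer can be sketched in a few lines of plain Python. This is only an illustrative model, not PyMongo's or the server's actual implementation: the `Server` class, `can_elect_primary`, `select_server`, and the `allow_secondary_reads` flag are all hypothetical names invented for this sketch, and real server selection also weighs latency, tags, and staleness.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Server:
    """Hypothetical stand-in for a replica set member as seen by a driver."""
    name: str
    role: str  # "primary" or "secondary"


def can_elect_primary(total_members: int, reachable_members: int) -> bool:
    """A primary can only be elected by a strict majority of all members."""
    return reachable_members > total_members // 2


def select_server(servers: List[Server], operation: str,
                  allow_secondary_reads: bool = False) -> Optional[Server]:
    """Simplified selection: prefer a primary; fall back to a secondary
    only for reads, and only when the caller allows it."""
    for server in servers:
        if server.role == "primary":
            return server
    if operation == "read" and allow_secondary_reads:
        for server in servers:
            if server.role == "secondary":
                return server
    return None  # no suitable server: the operation cannot proceed


# The scenario from the question: five members, three fail, two secondaries remain.
survivors = [Server("node4", "secondary"), Server("node5", "secondary")]
election_possible = can_elect_primary(total_members=5, reachable_members=2)
write_target = select_server(survivors, "write")
read_target = select_server(survivors, "read", allow_secondary_reads=True)
```

With these assumptions, `election_possible` is `False` and `write_target` is `None`, while `read_target` is one of the surviving secondaries, which matches the behavior described: queries can still be served, but every write fails until a majority is restored.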

Metadata

Formal Metadata

Title A deep dive into the Pymongo MongoDB driver
Series title EuroPython 2016
Part 86
Number of parts 169
Author Drumgoole, Joe
License CC Attribution - NonCommercial - ShareAlike 3.0 Unported:
You may use, modify, and reproduce, distribute, and make publicly available the work or its content, in unchanged or modified form, for any legal and non-commercial purpose, provided that you credit the author/rights holder in the manner they specify and pass on the work or content, including in modified form, only under the terms of this license.
DOI 10.5446/21163
Publisher EuroPython
Year of publication 2016
Language English

Content Metadata

Subject area Computer Science
Abstract Joe Drumgoole - A deep dive into the Pymongo MongoDB driver. The Pymongo driver is one of MongoDB's most popular driver interfaces for connecting to MongoDB. But developers rarely look under the cover to see what's happening inside the driver. By having a deeper insight into how the driver constructs server requests and responds, developers will be able to write more effective MongoDB applications in Python. We will look at: initial connection; a query; a simple write operation; a bulk write operation; how the driver responds when we have a node failure. We will also give insight into the driver's approach to server selection when connecting to a replica set (a multi-node instance of MongoDB).
